Anthropic 5h-Window Multiplier

Tracks Anthropic’s hidden multiplier M per 5-hour window. M = actual %p ÷ predicted %p, where predicted %p is the sum of β × tokens over every (model family, token type) pair, using the baseline β values (the M = 1 calibration) stored in MATRIX.md. M = 1 means baseline; M > 1 means Anthropic tightened; M < 1 means Anthropic loosened.

Generated: 2026-05-27 09:25 UTC
Current Window
5-Hour Window: ×1.009 multiplier (×M), 13.0%p utilized
7-Day (Anthropic): 4.0% of quota burned
7-Day (Sonnet): 2.0% of quota burned
What is M? Anthropic reports %p burned per window but not the rate at which each model and token type contributes. We hold the relative β ratios between (model, token type) pairs fixed from the 2026-05-01 burn-test calibration and infer Anthropic’s overall multiplier from the gap between actual %p and what those β values predict. This page reports only %p and the multiplier, never token counts.
Reading the chart: Each 5h-window data point is one M value. A flat line at M ≈ 1 means Anthropic hasn’t adjusted their hidden multiplier. A step change (for example mid-day, or on a particular day of the week) means they have. Dashed vertical lines on the utilization chart mark 7-day window rollovers.
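One way to spot such a step change in a series of per-window M values is to compare each point against the median of the few windows before it. The sketch below is only an illustration: the function name, the lookback `k`, and the `threshold` are assumptions, not the method used to draw the chart.

```python
from statistics import median


def step_changes(ms, k=3, threshold=0.1):
    """Return indices where M jumps away from the recent level.

    ms        -- list of M values, one per 5h window, oldest first
    k         -- hypothetical lookback: how many prior windows form the reference
    threshold -- hypothetical minimum jump in M that counts as a step
    """
    hits = []
    level_start = 0  # first index of the current level
    for i in range(1, len(ms)):
        # Reference level: median of up to k prior windows on the current level.
        ref = median(ms[max(level_start, i - k):i])
        if abs(ms[i] - ref) > threshold:
            hits.append(i)
            level_start = i  # restart the reference at the new level
    return hits


flat = [1.00, 1.01, 0.99, 1.00]
stepped = [1.00, 1.01, 0.99, 1.00, 1.30, 1.31, 1.29, 1.30]
print(step_changes(flat))     # no step on a flat line
print(step_changes(stepped))  # one step, at the first window on the new level
```

A single outlier window is reported twice by this sketch (once leaving the level and once returning), so a real detector would probably want a persistence check as well.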
5h Multiplier (×M) — 30 Days
M = actual %p ÷ predicted %p, where predicted %p = Σ β × tokens using the baseline β values from MATRIX.md. M = 1 means baseline; M > 1 means Anthropic tightened; M < 1 means Anthropic loosened.

Utilization (%p)
5-Hour Window: 13.0%p, ×1.009 multiplier, as of 2026-05-11 13:34 UTC
7-Day (Anthropic): 4.0%p, as of 2026-05-11 13:34 UTC
7-Day (Sonnet): 2.0%p, as of 2026-05-11 13:34 UTC
Utilization (%p) — 30 Days
Quota burned per window. 5h windows reset every 5h; 7-day windows reset weekly. Dashed vertical lines = 7-day window rollovers.
Window             Utilized  Multiplier  As Of
5-Hour Window      13.0%     ×1.009      2026-05-11 13:34 UTC
7-Day (Anthropic)  4.0%      -           2026-05-11 13:34 UTC
7-Day (Sonnet)     2.0%      -           2026-05-11 13:34 UTC

Anthropic doesn’t publish per-token rate-limit weights. Every API response gives %p burned in the current window, but not how much of that came from input tokens, cache reads, or output tokens for each model family.

We pre-measured the relative β ratios between every (model family, token type) pair on a paused-daemon Max account on 2026-05-01 and 2026-05-02. That gives 12 cells: haiku, sonnet, opus × input, output, cache_read, cache_create. These β values are stored in MATRIX.md and embedded in tokenomics.py.

Per 5h window, we sum up every recorded Claude invocation: predicted %p = Σ β(family,ttype) × tokens. We then divide Anthropic’s reported actual %p by the predicted total: M = actual %p / predicted %p. The β ratios are assumed stable; M absorbs whatever overall scaling Anthropic applies behind their 5h/7d limit math.
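The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not the contents of tokenomics.py: the `BETA` values are made-up placeholders (the calibrated ones live in MATRIX.md and are not reproduced here), and `Invocation`, `predicted_pp`, and `multiplier` are hypothetical names.

```python
from dataclasses import dataclass

# Placeholder betas in %p per token, for illustration only.
# The real calibrated values are in MATRIX.md.
BETA = {
    ("sonnet", "input"): 2.0e-6,
    ("sonnet", "output"): 1.0e-5,
    ("opus", "output"): 5.0e-5,
}


@dataclass(frozen=True)
class Invocation:
    family: str  # "haiku", "sonnet" or "opus"
    ttype: str   # "input", "output", "cache_read" or "cache_create"
    tokens: int


def predicted_pp(invocations, beta=BETA):
    """Predicted %p for one window: sum of beta(family, ttype) * tokens."""
    return sum(beta[(inv.family, inv.ttype)] * inv.tokens for inv in invocations)


def multiplier(actual_pp, invocations, beta=BETA):
    """M = actual %p / predicted %p, or None when nothing was predicted."""
    predicted = predicted_pp(invocations, beta)
    if predicted <= 0:
        return None  # empty window: M is undefined
    return actual_pp / predicted


# One hypothetical 5h window; the placeholder betas predict about 9.0 %p.
window = [
    Invocation("sonnet", "input", 1_000_000),
    Invocation("sonnet", "output", 200_000),
    Invocation("opus", "output", 100_000),
]
m = multiplier(9.9, window)  # reported 9.9 %p against ~9.0 predicted, so M is about 1.1
print(f"M = {m:.3f}")
```

Returning `None` for an empty window avoids a division by zero and keeps windows with no recorded invocations out of the chart rather than plotting a misleading value.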

M = 1 would mean today’s burn matches the baseline calibration exactly. M > 1 means each token now burns more %p than baseline (Anthropic tightened). M < 1 means each token burns less %p than baseline (Anthropic loosened). The same math powers the 7:tokenomics tab, the 8:api_usage tab, and the per-session summary in our internal dashboards.
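Because a measured M is rarely exactly 1 (the current window reads ×1.009), a reader applying these rules needs some tolerance band around the baseline. The sketch below shows one way to do that; the `classify` helper and its 5% `tolerance` are assumptions for illustration, not thresholds taken from this page.

```python
def classify(m, tolerance=0.05):
    """Label an M value relative to the M = 1 baseline.

    tolerance is a hypothetical dead band: values within
    1 +/- tolerance are treated as measurement noise.
    """
    if m > 1 + tolerance:
        return "tightened"  # each token burns more %p than baseline
    if m < 1 - tolerance:
        return "loosened"   # each token burns less %p than baseline
    return "baseline"


for m in (1.009, 1.20, 0.80):
    print(f"x{m:.3f} -> {classify(m)}")
```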

vmfarms operates on an Anthropic 20× Max plan, so M reflects multiplier shifts on our quota. Customers on other pricing tiers will see a different M baseline.
