Anthropic's Hidden Token Cap
by 5-Hour Window

Each data point is one 5-hour Anthropic window — exposing how quota caps vary throughout the day. The implied cap = tokens burned ÷ utilization. A sudden jump = Anthropic changed your limit.

Generated: 2026-04-17 04:30 UTC (the live page auto-refreshes every 60 seconds)
Current Implied Cap
5-Hour Window: 88.99M (9.59M burned, 11.0% utilized)
7-Day (All Models): 179.13M burned, 2.0% utilized
7-Day (Sonnet): 171.43M burned, 2.0% utilized
What is the implied cap? Anthropic publishes utilization percentages but not the underlying token limits. Dividing the tokens burned in a rate-limit window by that window's utilization fraction back-calculates the effective quota cap. A sudden jump in the chart below suggests that Anthropic changed the limit.
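The back-calculation can be sketched in a few lines of Python. This is a minimal illustration, not the dashboard's actual code: the function name `implied_cap` is hypothetical, and the inputs are the rounded 5-hour figures shown on this page.

```python
from typing import Optional


def implied_cap(tokens_burned: float, utilization_pct: float) -> Optional[float]:
    """Back-calculate a token cap from tokens burned and utilization percent.

    Hypothetical helper for illustration; returns None when utilization is
    zero or negative, because the cap cannot be inferred in that case.
    """
    if utilization_pct <= 0:
        return None
    return tokens_burned / (utilization_pct / 100.0)


# Rounded 5-hour figures from this page: 9.59M burned at 11.0% utilization.
cap = implied_cap(9.59e6, 11.0)
print(f"{cap / 1e6:.2f}M")  # prints 87.18M
```

With these rounded inputs the result is about 87.18M rather than the 88.99M displayed above. The gap is presumably because the page works from unrounded values or from a slightly different sample time; that explanation is an assumption, not something the page states.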
Reading this chart: Multiple data points on one calendar day are multiple 5h windows. A drop between the 10:00 UTC window and the 15:00 UTC window would mean that the implied cap fell mid-day. Night windows (20:00+ UTC) often show higher caps than business-hour windows (14:00–20:00 UTC). The 5h series (orange) is plotted against the right axis; the 7-day series are plotted against the left axis.
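The business-hours versus night comparison described above could be checked along these lines. This is only a sketch: the function names are hypothetical, the sample caps are made-up illustrative values (not data from this page), and treating hours before 14:00 UTC as a separate "other" bucket is an assumption, since the page names only two ranges.

```python
from collections import defaultdict
from datetime import datetime, timezone
from statistics import mean


def bucket(window_start: datetime) -> str:
    """Classify a window by its start hour in UTC (ranges taken from the text)."""
    hour = window_start.astimezone(timezone.utc).hour
    if 14 <= hour < 20:
        return "business"
    if hour >= 20:
        return "night"
    return "other"  # assumption: hours before 14:00 UTC are kept separate


def mean_cap_by_bucket(windows):
    """Average the implied cap per bucket for (start, implied_cap) pairs."""
    groups = defaultdict(list)
    for start, cap in windows:
        groups[bucket(start)].append(cap)
    return {name: mean(caps) for name, caps in groups.items()}


# Made-up illustrative values, NOT measurements from the dashboard.
sample = [
    (datetime(2026, 4, 16, 15, 0, tzinfo=timezone.utc), 80e6),
    (datetime(2026, 4, 16, 20, 0, tzinfo=timezone.utc), 90e6),
    (datetime(2026, 4, 17, 15, 0, tzinfo=timezone.utc), 82e6),
    (datetime(2026, 4, 17, 1, 0, tzinfo=timezone.utc), 88e6),
]
averages = mean_cap_by_bucket(sample)
print(averages)
```

Timestamps should be timezone-aware; a naive `datetime` passed to `astimezone` is interpreted as local time, which would shift the buckets.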
Implied Token Cap — 5h Windows (30 Days)
Each data point is one 5-hour Anthropic window. Reveals intraday variation: business hours vs. evening vs. night caps.
Token Burn per 5h Window (30 Days)
Tokens burned within each 5-hour Anthropic window. A tall bar = heavy burn; when Anthropic adjusts the cap, burn patterns shift visibly.

Utilization
5-Hour Window: 11.0% (9.59M burned / ~88.99M cap), as of 2026-04-17 04:10 UTC
7-Day (All Models): 2.0% (179.13M burned, implied cap not shown), as of 2026-04-17 04:10 UTC
7-Day (Sonnet): 2.0% (171.43M burned, implied cap not shown), as of 2026-04-17 04:10 UTC
Utilization % — 5h Windows (30 Days)
How much of the quota is being consumed per 5-hour window.
Window              Utilized  Burned    Implied Cap  As Of
5-Hour Window       11.0%     9.59M     88.99M       2026-04-17 04:10 UTC
7-Day (All Models)  2.0%      179.13M   n/a          2026-04-17 04:10 UTC
7-Day (Sonnet)      2.0%      171.43M   n/a          2026-04-17 04:10 UTC

Anthropic doesn't publish rate limits directly. Every API response includes two values: tokens consumed in the current window and what percentage of the limit that represents. Dividing the tokens consumed by that percentage, expressed as a fraction, gives the implied cap: a real-time estimate of the actual limit.

A step-change in the chart means the underlying limit changed. A sudden jump indicates an increase; a drop indicates a reduction or a change in model mix. Three windows are tracked independently: a 5-hour rolling window, a 7-day window for all models, and a 7-day window for Sonnet specifically. Each resets on its own schedule.
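One simple way to flag such step-changes is to compare each window's implied cap with the previous one and report relative moves above a threshold. The sketch below assumes a hypothetical `step_changes` helper and an arbitrary 10% threshold; the sample series is made up for illustration and is not the dashboard's data or its detection method.

```python
def step_changes(caps, threshold=0.10):
    """Return (index, direction, relative_change) for each jump or drop.

    `caps` is a chronological list of implied caps for one window type.
    The 10% default threshold is an arbitrary illustrative choice.
    """
    changes = []
    for i in range(1, len(caps)):
        prev, cur = caps[i - 1], caps[i]
        if prev <= 0:
            continue  # skip windows where no cap could be inferred
        rel = (cur - prev) / prev
        if abs(rel) >= threshold:
            direction = "increase" if rel > 0 else "decrease"
            changes.append((i, direction, rel))
    return changes


# Made-up series of 5h implied caps, NOT measurements from the dashboard.
series = [88e6, 89e6, 70e6, 71e6, 90e6]
for index, direction, rel in step_changes(series):
    print(f"window {index}: {direction} of {rel:+.1%}")
```

A flagged decrease is not proof that the limit was lowered: as noted above, a change in model mix can produce the same drop, so each window type is best checked separately.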

vmfarms operates on an Anthropic 20× Max plan — the caps shown here reflect our actual quota, which is substantially higher than the standard API tier. Your own limits will differ based on your plan.
