LLM Pricing Tracker: API and Subscription Costs
A tracker for leading LLM API token prices and consumer subscriptions, with official links, repo snapshot refreshes, price-history charts, and live benchmark/provider snapshots.
Note
Latest repo snapshot in this build: May 7, 2026. Click "Check for newer snapshot" below to query GitHub for a fresher snapshot; each browser is limited to one remote check per day.
This page tracks public pricing from official provider pages for major frontier-model vendors I regularly compare: OpenAI, Google (Gemini and Gemma), Anthropic, xAI, DeepSeek, Qwen, Moonshot/Kimi, Xiaomi/MiMo, MiniMax, Together AI (including GLM-5 and public Llama endpoints), Meta/Llama references, and GitHub Copilot.
A few quick cautions before using the numbers:
- API pricing and consumer subscription pricing are different products.
- Some vendors publish tiered pricing by context length, region, or prompt type.
- When a vendor does not publicly expose a comparable token-billing number, I mark that clearly instead of guessing.
- The charts below use snapshots stored in this repo, including daily-refreshable Artificial Analysis benchmark/provider snapshots and manually curated pricing rows from official vendor pages.
API Token Fees
Representative API pricing for major frontier-model vendors. Numbers are taken from official vendor pricing pages; mixed currencies are kept in the vendor’s quoted currency.
| Vendor | Model / Product | Input | Cached input | Output | Notes | Official |
|---|---|---|---|---|---|---|
| OpenAI | GPT-5.5 | $5.00 | $0.50 | $30.00 | OpenAI's pricing page now lists GPT-5.5 as the latest flagship model, labels it "coming soon," and notes higher long-context multipliers above 270K input tokens. | OpenAI API pricing |
| Google | Gemini 2.5 Pro | $1.25 (<=200K prompts) | $0.125 (<=200K prompts) | $10.00 (<=200K prompts) | Tiered pricing. For prompts above 200K tokens, Google lists $2.50 input, $0.25 cached input, and $15.00 output. | Gemini API pricing |
| Google | Gemma 4 | Free of charge | Free of charge | Free of charge | Google's current Gemini Developer API pricing page lists Gemma 4 with free input, output, and context caching, while the paid tier remains unavailable. | Google Gemini Developer API pricing |
| Anthropic | Claude Sonnet 4 / Claude Code backend | $3.00 | $0.30 (cache hits) | $15.00 | Claude Code team/API usage is billed from Claude API token consumption. Anthropic also lists separate cache-write prices. | Anthropic model pricing |
| DeepSeek | deepseek-v4-flash | $0.14 (<=128K cache miss) | $0.028 (<=128K cache hit) | $0.28 (<=128K) | Official DeepSeek V4 Flash overseas pricing. For prompts above 128K tokens, DeepSeek lists $0.28 input, $0.056 cached input, and $0.56 output per 1M tokens. | DeepSeek models & pricing |
| DeepSeek | deepseek-v4-pro | $1.74 (<=128K cache miss) | $0.145 (<=128K cache hit) | $3.48 (<=128K) | Official DeepSeek V4 Pro overseas pricing. For prompts above 128K tokens, DeepSeek lists $3.48 input, $0.29 cached input, and $6.96 output per 1M tokens. | DeepSeek models & pricing |
| Qwen / Alibaba Cloud | qwen-max-latest | $1.60 | - | $6.40 | Alibaba Cloud lists qwen-max-latest with non-thinking pricing and no tiered pricing on the current pricing page. | Alibaba Cloud Model Studio pricing |
| Together AI / Z AI | GLM-5 | $1.00 | - | $3.20 | Together AI's public serverless price for Z AI's GLM-5. The public model page does not list a separate cached-input rate. | Together AI GLM-5 pricing |
| Meta / Llama via Together AI | Llama 4 Maverick | $0.27 | - | $0.85 | Meta's Llama 4 Maverick served through Together AI's public serverless API. Meta's public developer docs describe the model family, while Together AI exposes a comparable public token price. | Together AI Llama 4 Maverick pricing |
| Moonshot AI / Kimi | kimi-latest | ¥2 / ¥5 / ¥10 (8k/32k/128k) | ¥1.00 (auto cache hit) | ¥10 / ¥20 / ¥30 (8k/32k/128k) | Moonshot's official April 7, 2025 pricing notice shows kimi-latest auto-selects the 8K / 32K / 128K tier at ¥2 / ¥5 / ¥10 input and ¥10 / ¥20 / ¥30 output per 1M tokens. Automatic cache-hit billing remains ¥1 / 1M tokens. | Moonshot official pricing update |
| Xiaomi / MiMo | MiMo-V2.5-Pro | $1.00 (<=256K cache miss) | $0.20 (<=256K cache hit) | $3.00 (<=256K) | Official overseas pricing for Xiaomi's flagship MiMo-V2.5-Pro / MiMo-V2-Pro tier. For prompts above 256K tokens, Xiaomi lists $2.00 input, $0.40 cached input, and $6.00 output per 1M tokens. | Xiaomi MiMo pricing |
| MiniMax | MiniMax-M2.5 | ¥2.10 | ¥0.21 (cache read) | ¥8.40 | Current MiniMax pay-as-you-go text pricing. Cache writes are listed separately at ¥2.625 / 1M tokens. | MiniMax pay-as-you-go pricing |
| xAI | grok-4.20-beta-0309-reasoning | $2.00 | - | $6.00 | xAI lists Grok 4.20 reasoning and non-reasoning variants with the same token pricing on the public API page. | xAI API pricing |
| GitHub Copilot | Copilot product pricing | N/A | N/A | N/A | GitHub Copilot is sold as a subscription product. GitHub does not publicly publish a Copilot per-token API price comparable to the other vendors here. | GitHub Copilot plans |
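Several rows above bill by tier (by total prompt length) and discount cached input. A minimal sketch of how such a price card turns into a per-request cost, using the DeepSeek V4 Flash numbers from the table; the assumption that the whole request is billed at the single tier its total prompt length falls into (rather than split across tiers) is mine, so check each vendor's billing docs before relying on it:

```python
def request_cost_usd(input_tokens, cached_tokens, output_tokens,
                     prompt_tokens_total, per_million, tier_limit=128_000):
    """Cost of one request under a two-tier, cache-aware price card.

    per_million maps tier -> (input, cached_input, output) in USD per 1M tokens.
    Assumption: the full request is billed at the tier its total prompt
    length falls in, not split across tiers.
    """
    tier = "low" if prompt_tokens_total <= tier_limit else "high"
    p_in, p_cache, p_out = per_million[tier]
    cache_miss = input_tokens - cached_tokens
    return (cache_miss * p_in
            + cached_tokens * p_cache
            + output_tokens * p_out) / 1_000_000

# DeepSeek V4 Flash rates from the table above (USD per 1M tokens).
DEEPSEEK_V4_FLASH = {"low": (0.14, 0.028, 0.28), "high": (0.28, 0.056, 0.56)}

# 100K-token prompt, 60K of it served from cache, 2K output -> low tier.
cost = request_cost_usd(100_000, 60_000, 2_000, 100_000, DEEPSEEK_V4_FLASH)
# cost is roughly $0.0078 for this request
```

The same function covers the Gemini, Xiaomi, and DeepSeek rows by swapping in each vendor's tier limit and rate triple.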
Subscription Plans
Publicly listed consumer or team subscriptions from the official provider pages I checked for this tracker.
| Vendor | Plan | Price | Notes | Official |
|---|---|---|---|---|
| OpenAI | ChatGPT Plus | $20 / month | Consumer plan. API usage is billed separately. | ChatGPT pricing |
| OpenAI | ChatGPT Pro | $200 / month | Highest individual-access plan. | ChatGPT pricing |
| OpenAI | ChatGPT Business | $25 / user / month billed annually or $30 monthly | Shared workspace for teams and growing businesses. | ChatGPT pricing |
| Anthropic | Claude Pro | $20 / month | Individual paid plan for Claude. | Anthropic pricing |
| Anthropic | Claude Max 5x | $100 / month | 5x more usage than Pro. Includes Claude Code. | Claude Max |
| Anthropic | Claude Max 20x | $200 / month | 20x more usage than Pro. Includes Claude Code. | Claude Max |
| Anthropic | Claude Team | $25 / seat / month billed annually, $30 billed monthly | Standard team seats. Anthropic also lists premium seats at $150 / member / month, including Claude Code and higher usage. | Claude Team billing |
| Google | Google AI Pro | $19.99 / month | Formerly AI Premium. Consumer plan with the Gemini app, Flow, NotebookLM, and more. | Google AI plans |
| Google | Google AI Ultra | $249.99 / month | Highest Google AI subscription tier for the Gemini app and related tools. | Google AI plans |
| Xiaomi / MiMo | Token Plan (monthly) | $6 / $16 / $50 / $100 per month | Official monthly Token Plan tiers: Lite, Standard, Pro, and Max. The package covers MiMo-V2.5-Pro, MiMo-V2.5, and the rest of Xiaomi's current V2 / V2.5 programming-tool lineup. | MiMo Token Plan subscription |
| Xiaomi / MiMo | Token Plan (annual) | $63.36 / $168.96 / $528.00 / $1056.00 per year | Official annual Token Plan tiers: Lite, Standard, Pro, and Max. | MiMo Token Plan subscription |
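Where a vendor lists both monthly and annual prices, the effective annual discount is easy to derive. A quick check on the Xiaomi Token Plan tiers from the two rows above (the tier names and prices are taken from those rows; the discount calculation itself is just arithmetic, not an official figure):

```python
# Xiaomi Token Plan prices from the rows above (USD).
monthly = {"Lite": 6, "Standard": 16, "Pro": 50, "Max": 100}
annual = {"Lite": 63.36, "Standard": 168.96, "Pro": 528.00, "Max": 1056.00}

def annual_discount_pct(tier):
    """Percent saved by paying the annual price instead of 12 monthly payments."""
    return round((1 - annual[tier] / (12 * monthly[tier])) * 100, 2)

discounts = {tier: annual_discount_pct(tier) for tier in monthly}
# Every tier works out to the same 12% discount for paying annually.
```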
| MiniMax | Coding Plan Starter | ¥29 / month | 40 prompts every 5 hours. | MiniMax Coding Plan |
| MiniMax | Coding Plan Plus | ¥49 / month | 100 prompts every 5 hours. | MiniMax Coding Plan |
| MiniMax | Coding Plan Max | ¥119 / month | 300 prompts every 5 hours. | MiniMax Coding Plan |
| xAI / Grok | X Premium+ | $40 / month in the US web pricing table | Help.x.com ties Premium+ to expanded Grok access. Regional prices vary. | X Premium pricing |
| GitHub Copilot | Copilot Pro | $10 / month or $100 / year | Individual developer plan. | GitHub Copilot plans |
| GitHub Copilot | Copilot Pro+ | $39 / month or $390 / year | Higher premium-request limits and broader model access. | GitHub Copilot plans |
| GitHub Copilot | Copilot Business | $19 / seat / month | Organization-managed plan from GitHub Docs. | GitHub Docs |
| GitHub Copilot | Copilot Enterprise | $39 / seat / month | Enterprise-managed plan from GitHub Docs. | GitHub Docs |
Some vendors in the API table are omitted here because I could not find a public official subscription plan for them in the sources checked for this snapshot: DeepSeek, Qwen / Alibaba Cloud, Moonshot AI / Kimi.
Price History
Switch provider, metric, and time grain to compare the official pricing checkpoints I have stored so far.
History lines use official pricing snapshots that are stored in this repo. Some providers only have one official snapshot recorded so far, while others mix successive flagship or promo models.
Artificial Analysis Benchmark Snapshot
The current top 10 models on the Artificial Analysis Intelligence Index. This list is intentionally model-level, so repeated vendors can appear more than once when they occupy multiple top-10 slots.
Prices and speeds below come from Artificial Analysis' benchmark snapshot, so they may differ from the manual API rows above when multiple deployments or reasoning modes exist. Source: Artificial Analysis models leaderboard (checked May 7, 2026).
| Vendor | Benchmark model | Intelligence | Speed | Blended price | Prompt pricing | Details |
|---|---|---|---|---|---|---|
| OpenAI | GPT-5.5 (xhigh) | 60.24 | 64.83 t/s | $ | Input $5 · Output $30 | Model details |
| OpenAI | GPT-5.5 (high) | 58.87 | 66.23 t/s | $ | Input $5 · Output $30 | Model details |
| Anthropic | Claude Opus 4.7 (max) | 57.28 | 45.24 t/s | $ | Input $6.25 · Output $25 | Model details |
| Google | Gemini 3.1 Pro Preview | 57.18 | 125.55 t/s | $ | Input $2 · Output $12 | Model details |
| OpenAI | GPT-5.5 (medium) | 56.71 | 60.44 t/s | $ | Input $5 · Output $30 | Model details |
| Kimi | Kimi K2.6 | 53.9 | 30.06 t/s | $ | Input $0.95 · Output $4 | Model details |
| Xiaomi | MiMo-V2.5-Pro | 53.83 | 57.27 t/s | $ | Input $1 · Output $3 | Model details |
| OpenAI | GPT-5.3 Codex (xhigh) | 53.56 | 78.73 t/s | $ | Input $1.75 · Output $14 | Model details |
| xAI | Grok 4.3 | 53.2 | 80.14 t/s | $ | Input $1.25 · Output $2.5 | Model details |
| Meta | Muse Spark | 52.15 | 0 t/s | $ | Input $0 · Output $0 | Model details |
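The blended-price column can be reproduced from each row's input and output prices once a token mix is fixed. A minimal sketch, assuming the 3:1 input-to-output token ratio that Artificial Analysis describes for its blended-price figures; verify the ratio against their current methodology page before relying on these numbers:

```python
def blended_price(input_usd_per_m, output_usd_per_m, input_to_output_ratio=3):
    """Blended USD per 1M tokens for a given input:output token mix.

    Assumption: a 3:1 input:output ratio, matching the mix Artificial
    Analysis describes for its blended-price column.
    """
    r = input_to_output_ratio
    return (r * input_usd_per_m + output_usd_per_m) / (r + 1)

# GPT-5.5 row above: $5 input, $30 output per 1M tokens.
gpt55_blended = blended_price(5, 30)  # (3*5 + 30) / 4 = 11.25
```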
Top-25 API Provider Leaderboard
Artificial Analysis provider leaderboard snapshot, aggregated to the best currently benchmarked endpoint for each provider. Switch metrics to compare intelligence, blended price, latency, and context window separately.
Each provider rank uses that provider's best currently benchmarked endpoint for the selected metric, so the representative model can change across metrics. Source: Artificial Analysis provider leaderboard (checked May 7, 2026).
Scale and Price Frontier
A Star History-style yearly line for open-weight model sizes and the highest reviewed output-token prices since 2021.
Lines start in 2021 and combine curated source-backed historical records with the latest Artificial Analysis model metadata. Model size tracks the largest open-weight/open-access LLM by total disclosed parameters for each year, so sparse MoE and dense models are not quality-equivalent. Output-token price uses public USD text-token API prices and excludes tool-call, image, audio, video, and subscription pricing. Source: Artificial Analysis models (checked May 7, 2026).