LLM Pricing Tracker: API and Subscription Costs
A tracker for leading LLM API token prices and consumer subscriptions, with official links, repo snapshot refreshes, price-history charts, and live benchmark/provider snapshots.
Note
Latest repo snapshot in this build: June 18, 2026. Click Check for newer snapshot below to query GitHub for a fresher snapshot. Each browser is limited to one remote check per day.
This page tracks public pricing from official provider pages for major frontier-model vendors I regularly compare: OpenAI, Google (Gemini and Gemma), Anthropic, xAI, DeepSeek, Qwen, Moonshot/Kimi, Xiaomi/MiMo, MiniMax, Together AI (including GLM-5 and public Llama endpoints), Meta/Llama references, and GitHub Copilot.
A few quick cautions before using the numbers:
- API pricing and consumer subscription pricing are different products.
- Some vendors publish tiered pricing by context length, region, or prompt type.
- When a vendor does not publicly expose a comparable token-billing number, I mark that clearly instead of guessing.
- The charts below use snapshots stored in this repo, including daily-refreshable Artificial Analysis benchmark/provider snapshots and manually curated pricing rows from official vendor pages.
API Token Fees
Representative API pricing for major frontier-model vendors. Numbers are taken from official vendor pricing pages; mixed currencies are kept in the vendor’s quoted currency.
| Vendor | Model / Product | Input | Cached input | Output | Notes | Official |
|---|---|---|---|---|---|---|
| OpenAI | GPT-5.5 (<272K context) | $5.00 | $0.50 | $30.00 | OpenAI developer pricing lists GPT-5.5 as a flagship model at the standard <272K-context tier; regional processing endpoints carry a 10% uplift. | OpenAI developer pricing |
| Gemini 3.5 Flash | $1.50 | $0.15 | $9.00 | Current Gemini API paid-tier standard pricing for Gemini 3.5 Flash. Google describes it as its most intelligent model built for speed; context-cache storage is listed separately at $1.00 / 1M tokens per hour. | Gemini API pricing | |
| Gemma 4 | Free of charge | Free of charge | Free of charge | Google's current Gemini Developer API pricing page lists Gemma 4 with free input, output, and context caching, while the paid tier remains unavailable. | Google Gemini Developer API pricing | |
| Anthropic | Claude Opus 4.8 | $5.00 | $0.50 (cache read) | $25.00 | Anthropic pricing lists Opus 4.8 as ideal for complex agentic coding and enterprise work. Prompt-cache writes are $6.25 / 1M tokens. Fable 5 is listed at $10 / $50 but is unavailable, so it is not used as the active API row. | Anthropic pricing |
| DeepSeek | deepseek-v4-flash | $0.14 (cache miss) | $0.0028 (cache hit) | $0.28 | Official DeepSeek V4 Flash overseas pricing. The compatibility names deepseek-chat and deepseek-reasoner are scheduled for deprecation on 2026-07-24 15:59 UTC. | DeepSeek models & pricing |
| DeepSeek | deepseek-v4-pro | $0.435 (cache miss) | $0.003625 (cache hit) | $0.87 | Official DeepSeek V4 Pro overseas pricing as listed on the current models and pricing page. | DeepSeek models & pricing |
| Qwen / Alibaba Cloud | qwen3-max | $1.20 (<=32K, International) | Context Cache discount available | $6.00 (<=32K, International) | Alibaba Cloud Model Studio now lists qwen3-max above qwen-max. International tiered pricing is $1.20/$6.00 up to 32K tokens, $2.40/$12.00 from 32K to 128K, and $3.00/$15.00 from 128K to 252K. | Alibaba Cloud Model Studio pricing |
| Together AI / Z AI | GLM-5 | $1.00 | - | $3.20 | Together AI's public serverless price for Z AI's GLM-5. The public model page does not list a separate cached-input rate. | Together AI GLM-5 pricing |
| Meta / Llama via Together AI | Llama 4 Maverick | $0.27 | - | $0.85 | Meta's Llama 4 Maverick served through Together AI's public serverless API. Meta's public developer docs describe the model family, while Together AI exposes a comparable public token price. | Together AI Llama 4 Maverick pricing |
| Moonshot AI / Kimi | kimi-k2.7-code | $0.95 (cache miss) | $0.19 (cache hit) | $4.00 | Kimi K2.7 Code is Kimi’s current most capable coding model. The high-speed variant is $1.90 cache-miss input, $0.38 cache-hit input, and $8.00 output per 1M tokens. | Kimi K2.7 Code pricing |
| Xiaomi / MiMo | MiMo-V2.5-Pro | $1.00 (<=256K cache miss) | $0.20 (<=256K cache hit) | $3.00 (<=256K) | Official overseas pricing for Xiaomi's flagship MiMo-V2.5-Pro / MiMo-V2-Pro tier. For prompts above 256K tokens, Xiaomi lists $2.00 input, $0.40 cached input, and $6.00 output per 1M tokens. | Xiaomi MiMo pricing |
| MiniMax | MiniMax-M3 | ¥2.10 (<=512K, 50% off) | ¥0.42 (cache read, <=512K) | ¥8.40 (<=512K, 50% off) | Current MiniMax pay-as-you-go text pricing for MiniMax-M3 standard service after the listed permanent 50% discount. Prompts above 512K are listed at ¥4.20 input, ¥0.84 cache read, and ¥16.80 output while limited supply applies; priority service is 1.5x standard pricing. | MiniMax pay-as-you-go pricing |
| xAI | Grok 4.3 | $1.25 | $0.20 | $2.50 | xAI docs list Grok 4.3 at $1.25 input, $0.20 cached input, and $2.50 output per 1M tokens. | xAI Grok 4.3 pricing |
| GitHub Copilot | Copilot product pricing | N/A | N/A | N/A | GitHub Copilot is sold as a subscription product. GitHub does not publicly publish a Copilot per-token API price comparable to the other vendors here. | GitHub Copilot plans |
Subscription Plans
Publicly listed consumer or team subscriptions from the official provider pages I checked for this tracker.
| Vendor | Plan | Price | Notes | Official |
|---|---|---|---|---|
| OpenAI | ChatGPT Plus | $20 / month | Consumer plan. API usage is billed separately. | ChatGPT pricing |
| OpenAI | ChatGPT Pro | $200 / month | Highest individual-access plan. | ChatGPT pricing |
| OpenAI | ChatGPT Business | $25 / user / month billed annually or $30 monthly | Shared workspace for teams and growing businesses. | ChatGPT pricing |
| Anthropic | Claude Pro | $20 / month | Individual paid plan for Claude. | Anthropic pricing |
| Anthropic | Claude Max 5x | $100 / month | 5x more usage than Pro. Includes Claude Code. | Claude Max |
| Anthropic | Claude Max 20x | $200 / month | 20x more usage than Pro. Includes Claude Code. | Claude Max |
| Anthropic | Claude Team | $25 / seat / month billed annually, $30 billed monthly | Standard team seats. Anthropic also lists premium seats at $150 / member / month, including Claude Code and higher usage. | Claude Team billing |
| Google AI Pro | $19.99 / month | Formerly AI Premium. Consumer plan with Gemini app, Flow, NotebookLM and more. | Google AI plans | |
| Google AI Ultra | $249.99 / month | Highest Google AI subscription tier for the Gemini app and related tools. | Google AI plans | |
| Xiaomi / MiMo | Token Plan (monthly) | $6 / $16 / $50 / $100 per month | Official monthly Token Plan tiers: Lite, Standard, Pro, and Max. The package covers MiMo-V2.5-Pro, MiMo-V2.5, and the rest of Xiaomi's current V2 / V2.5 programming-tool lineup. | MiMo Token Plan subscription |
| Xiaomi / MiMo | Token Plan (annual) | $63.36 / $168.96 / $528.00 / $1056.00 per year | Official annual Token Plan tiers: Lite, Standard, Pro, and Max. | MiMo Token Plan subscription |
| MiniMax | Coding Plan Starter | ¥29 / month | 40 prompts every 5 hours. | MiniMax Coding Plan |
| MiniMax | Coding Plan Plus | ¥49 / month | 100 prompts every 5 hours. | MiniMax Coding Plan |
| MiniMax | Coding Plan Max | ¥119 / month | 300 prompts every 5 hours. | MiniMax Coding Plan |
| xAI / Grok | X Premium+ | $40 / month in the US web pricing table | Help.x.com ties Premium+ to expanded Grok access. Regional prices vary. | X Premium pricing |
| GitHub Copilot | Copilot Pro | $10 / month or $100 / year | Individual developer plan. | GitHub Copilot plans |
| GitHub Copilot | Copilot Pro+ | $39 / month or $390 / year | Higher premium-request limits and broader model access. | GitHub Copilot plans |
| GitHub Copilot | Copilot Business | $19 / seat / month | Organization-managed plan from GitHub Docs. | GitHub Docs |
| GitHub Copilot | Copilot Enterprise | $39 / seat / month | Enterprise-managed plan from GitHub Docs. | GitHub Docs |
Some vendors in the API table are omitted here because I could not find a public official subscription plan for them in the sources checked for this snapshot: DeepSeek, Qwen / Alibaba Cloud, Moonshot AI / Kimi.
Price History
Switch provider, metric, and time grain to compare the official pricing checkpoints I have stored so far.
History lines use official pricing snapshots that are stored in this repo. Some providers only have one official snapshot recorded so far, while others mix successive flagship or promo models.
Artificial Analysis Benchmark Snapshot
The current top 10 models on the Artificial Analysis Intelligence Index. This list is intentionally model-level, so repeated vendors can appear more than once when they occupy multiple top-10 slots.
Top 10 current models by Artificial Analysis Intelligence Index. Prices and speeds below come from Artificial Analysis' benchmark snapshot, so they may differ from the manual API rows above when multiple deployments or reasoning modes exist. Source: Artificial Analysis models leaderboard (checked June 18, 2026).
| Vendor | Benchmark model | Intelligence | Speed | Blended price | Prompt pricing | Details |
|---|---|---|---|---|---|---|
| Anthropic | Claude Fable 5 (with fallback) | 59.86 | 0 t/s | $ | Input $10 · Output $50 | Model details |
| Anthropic | Claude Opus 4.8 (max) | 55.69 | 60.3 t/s | $ | Input $5 · Output $25 | Model details |
| OpenAI | GPT-5.5 (xhigh) | 54.84 | 56.35 t/s | $ | Input $5 · Output $30 | Model details |
| Anthropic | Claude Opus 4.7 (max) | 53.53 | 50.25 t/s | $ | Input $5 · Output $25 | Model details |
| OpenAI | GPT-5.5 (high) | 53.13 | 48.82 t/s | $ | Input $5 · Output $30 | Model details |
| Z AI | GLM-5.2 (max) | 50.67 | 104.83 t/s | $ | Input $1.4 · Output $4.4 | Model details |
| Gemini 3.5 Flash | 50.2 | 148.2 t/s | $ | Input $1.5 · Output $9 | Model details | |
| Anthropic | Claude Sonnet 4.6 (max) | 47.21 | 50.48 t/s | $ | Input $3 · Output $15 | Model details |
| OpenAI | GPT-5.5 (medium) | 47.14 | 52.65 t/s | $ | Input $5 · Output $30 | Model details |
| Gemini 3.1 Pro Preview | 46.46 | 122.4 t/s | $ | Input $2 · Output $12 | Model details |
Top-25 API Provider Leaderboard
Artificial Analysis provider leaderboard snapshot, aggregated to the best currently benchmarked endpoint for each provider. Switch metrics to compare intelligence, blended price, latency, and context window separately.
Each provider rank uses that provider's best currently benchmarked endpoint for the selected metric, so the representative model can change across metrics. Source: Artificial Analysis provider leaderboard (checked June 18, 2026).
Scale and Price Frontier
A Star History-style yearly line for open-weight model sizes and the highest reviewed output-token prices since 2021.
Lines start in 2021 and combine curated source-backed historical records with the latest Artificial Analysis model metadata. Model size tracks the largest open-weight/open-access LLM by total disclosed parameters for each year, so sparse MoE and dense models are not quality-equivalent. Output-token price uses public USD text-token API prices and excludes tool-call, image, audio, video, and subscription pricing. Source: Artificial Analysis models (checked June 18, 2026).