# AI Model Pricing

> aimodelpricing.com is a public lookup service for current AI model token prices.

Use this file when an agent needs to call the public pricing API without reading the source code. Prices are listed in USD per 1,000,000 tokens.

For a human-browsable catalog with copyable API URLs, visit https://aimodelpricing.com/prices.

## API

`GET /api/cost` returns pricing rows. Without query parameters, it returns every published row.

Query parameters:
- `model` (optional): a model id or a compatible longer model name. Exact matches are tried first; if none is found, hyphen-boundary prefix matching is used.
- `provider` (optional): provider name such as `anthropic`, `google`, or `openai`. Matching is case-insensitive.

Filtering behavior:
- No query parameters: returns all rows.
- `provider` without `model`: returns all rows for that provider.
- `model` without `provider`: exact-then-prefix model lookup across all providers.
- `model` with `provider`: exact-then-prefix model lookup scoped to that provider.
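
The four filtering combinations above map directly onto the query string. A minimal sketch of building the request URL, assuming Python's standard library; the helper name `build_cost_url` is hypothetical and not part of the API:

```python
from typing import Optional
from urllib.parse import urlencode

# Endpoint taken from the examples in this file.
BASE_URL = "https://aimodelpricing.com/api/cost"


def build_cost_url(model: Optional[str] = None, provider: Optional[str] = None) -> str:
    """Build a GET /api/cost URL (hypothetical helper, not part of the API).

    Omitting both arguments yields the "list every row" URL.
    """
    params = {}
    if model is not None:
        params["model"] = model
    if provider is not None:
        params["provider"] = provider
    if not params:
        return BASE_URL
    # urlencode percent-escapes any characters that are unsafe in a query string.
    return f"{BASE_URL}?{urlencode(params)}"
```

For example, `build_cost_url(provider="openai")` produces the same URL as the "List OpenAI rows" example in this file.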

Response shape:
- Successful responses are JSON objects shaped as `{ "matches": [...] }`.
- Each row has `id`, `model_id`, `provider`, `input_cost_per_mtok`, `cached_input_cost_per_mtok`, `output_cost_per_mtok`, `long_context_threshold_tokens`, `long_context_input_multiplier`, `long_context_output_multiplier`, and `updated_at`.
- `input_cost_per_mtok`, `cached_input_cost_per_mtok`, and `output_cost_per_mtok` are USD per 1,000,000 tokens. `cached_input_cost_per_mtok` is `null` when the provider has no cached-input price.
- `long_context_threshold_tokens` is the prompt-token count where a higher long-context tier starts. When it is `null`, the row has no long-context tier.
- `long_context_input_multiplier` and `long_context_output_multiplier` multiply the base input/output prices after the prompt exceeds the threshold. Example: `effective_input = input_cost_per_mtok × long_context_input_multiplier` when prompt size exceeds `long_context_threshold_tokens`.
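
The long-context fields above can be turned into a cost estimate along these lines. This is a sketch under stated assumptions, not the service's own logic: it assumes the multipliers apply to the whole request once the prompt strictly exceeds the threshold, it treats a `null` multiplier as 1, and it ignores cached-input pricing. The function names are hypothetical, and the sample row carries only the fields the calculation needs, with values copied from the `gemini-2.5-pro` entry in the Coverage list.

```python
from typing import Tuple


def effective_prices(row: dict, prompt_tokens: int) -> Tuple[float, float]:
    """Return (input, output) USD per 1,000,000 tokens for this prompt size."""
    input_price = row["input_cost_per_mtok"]
    output_price = row["output_cost_per_mtok"]
    threshold = row.get("long_context_threshold_tokens")
    # Assumption: "exceeds" means strictly greater than the threshold.
    if threshold is not None and prompt_tokens > threshold:
        in_mult = row.get("long_context_input_multiplier")
        out_mult = row.get("long_context_output_multiplier")
        # Assumption: a missing or null multiplier leaves the base price unchanged.
        if in_mult is not None:
            input_price *= in_mult
        if out_mult is not None:
            output_price *= out_mult
    return input_price, output_price


def estimate_cost(row: dict, prompt_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request, ignoring cached input."""
    input_price, output_price = effective_prices(row, prompt_tokens)
    return (prompt_tokens * input_price + output_tokens * output_price) / 1_000_000


# Partial row with values from the Coverage list entry for gemini-2.5-pro.
GEMINI_2_5_PRO = {
    "model_id": "gemini-2.5-pro",
    "input_cost_per_mtok": 1.25,
    "output_cost_per_mtok": 10,
    "long_context_threshold_tokens": 200_000,
    "long_context_input_multiplier": 2,
    "long_context_output_multiplier": 1.5,
}
```

With this row, a 100,000-token prompt is billed at the base prices, while a 300,000-token prompt is billed at $2.50 input and $15 output per 1,000,000 tokens.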

Status codes:
- 200: success. List responses may contain an empty `matches` array.
- 400: invalid query parameter shape. None of the current public parameters produces this status.
- 404: a `model` filter was provided and no rows matched.
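
A client might handle these status codes as follows. This is a rough sketch using only Python's standard library; `rows_from_response` and `fetch_rows` are hypothetical names. Mapping 404 to an empty list is a design choice made here, not something the API prescribes, and the body of non-200 responses is left unparsed because its shape is not documented in this file.

```python
import json
import urllib.error
import urllib.request


def rows_from_response(status: int, body: str) -> list:
    """Turn an HTTP status and response body into a list of pricing rows."""
    if status == 200:
        # Successful responses are shaped as {"matches": [...]}.
        return json.loads(body)["matches"]
    if status == 404:
        # A model filter was given and nothing matched (choice: report no rows).
        return []
    if status == 400:
        raise ValueError("the API rejected the query parameters")
    raise RuntimeError(f"unexpected status {status}")


def fetch_rows(url: str) -> list:
    """Fetch a /api/cost URL and return its rows."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return rows_from_response(resp.status, resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx and 5xx responses.
        return rows_from_response(err.code, err.read().decode("utf-8", "replace"))
```

Callers that need to distinguish "unknown model" from "provider with no rows" could raise `LookupError` on 404 instead of returning an empty list.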

Prefix matching: callers may pass the short canonical alias, such as `claude-opus-4-5`, or a longer form, such as the dated id `claude-opus-4-5-20251022`, a provider-prefixed form, or a similar vendor id. The API resolves each of these to the canonical row. A canonical id matches as a prefix only when it is followed by `-` or the end of the string, so `gpt-5` does not match `gpt-50`.
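
The boundary rule can be illustrated with a small client-side sketch. This is not the server's implementation. It assumes case-sensitive comparison and, when several canonical ids qualify as prefixes, picks the longest one; neither behavior is stated in this file. Provider-prefixed forms are not handled here, and `resolve_model` is a hypothetical name.

```python
from typing import Iterable, Optional


def resolve_model(query: str, canonical_ids: Iterable[str]) -> Optional[str]:
    """Resolve a model name to a canonical id: exact match first, then prefix."""
    ids = list(canonical_ids)
    # Step 1: an exact match wins (the "end-of-string" boundary case).
    if query in ids:
        return query
    # Step 2: a canonical id qualifies only when the query continues with "-".
    candidates = [cid for cid in ids if query.startswith(cid + "-")]
    if not candidates:
        return None
    # Assumption: prefer the longest (most specific) canonical id.
    return max(candidates, key=len)


# A few canonical ids taken from the Coverage list in this file.
IDS = ["claude-opus-4-5", "gpt-5", "gpt-5-mini", "gpt-5.1"]
```

Under these assumptions, `resolve_model("claude-opus-4-5-20251022", IDS)` returns `"claude-opus-4-5"`, while `resolve_model("gpt-50", IDS)` returns `None` because `0` is not a valid boundary character.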

CORS: `Access-Control-Allow-Origin: *`. No authentication is required for `GET /api/cost`.

## Examples

- List all rows: `curl https://aimodelpricing.com/api/cost`
- List OpenAI rows: `curl 'https://aimodelpricing.com/api/cost?provider=openai'`
- Exact match: `curl 'https://aimodelpricing.com/api/cost?model=claude-opus-4-5&provider=anthropic'`
- Date-suffix prefix match: `curl 'https://aimodelpricing.com/api/cost?model=claude-opus-4-5-20251022&provider=anthropic'`
- Long-context lookup: `curl 'https://aimodelpricing.com/api/cost?model=gemini-2.5-pro&provider=google'`
- No-provider lookup: `curl 'https://aimodelpricing.com/api/cost?model=gpt-4o'`

## Coverage

The list below is generated from the live AgentDB pricing table. Rows are grouped by provider and sorted by provider, then model id.

### anthropic
- `claude-haiku-3-5`: input $0.8, cached input $0.08, output $4 per 1,000,000 tokens.
- `claude-haiku-4-5`: input $1, cached input $0.1, output $5 per 1,000,000 tokens.
- `claude-opus-4-1`: input $15, cached input $1.5, output $75 per 1,000,000 tokens.
- `claude-opus-4-5`: input $5, cached input $0.5, output $25 per 1,000,000 tokens.
- `claude-opus-4-6`: input $5, cached input $0.5, output $25 per 1,000,000 tokens.
- `claude-opus-4-7`: input $5, cached input $0.5, output $25 per 1,000,000 tokens.
- `claude-sonnet-4-5`: input $3, cached input $0.3, output $15 per 1,000,000 tokens.
- `claude-sonnet-4-6`: input $3, cached input $0.3, output $15 per 1,000,000 tokens.

### google
- `gemini-2.0-flash`: input $0.1, cached input $0.025, output $0.4 per 1,000,000 tokens.
- `gemini-2.0-flash-lite`: input $0.075, cached input n/a, output $0.3 per 1,000,000 tokens.
- `gemini-2.5-flash`: input $0.1, cached input $0.01, output $0.4 per 1,000,000 tokens.
- `gemini-2.5-flash-lite`: input $0.1, cached input $0.01, output $0.4 per 1,000,000 tokens.
- `gemini-2.5-pro`: input $1.25, cached input $0.125, output $10 per 1,000,000 tokens; long-context tier after 200,000 tokens: input ×2, output ×1.5.
- `gemini-3-flash-preview`: input $0.5, cached input $0.05, output $3 per 1,000,000 tokens.
- `gemini-3.1-pro-preview`: input $2, cached input $0.2, output $12 per 1,000,000 tokens; long-context tier after 200,000 tokens: input ×2, output ×1.5.

### openai
- `gpt-4.1`: input $2, cached input $0.5, output $8 per 1,000,000 tokens.
- `gpt-4.1-mini`: input $0.4, cached input $0.1, output $1.6 per 1,000,000 tokens.
- `gpt-4.1-nano`: input $0.1, cached input $0.025, output $0.4 per 1,000,000 tokens.
- `gpt-4o`: input $2.5, cached input $1.25, output $10 per 1,000,000 tokens.
- `gpt-4o-mini`: input $0.15, cached input $0.075, output $0.6 per 1,000,000 tokens.
- `gpt-5`: input $1.25, cached input $0.125, output $10 per 1,000,000 tokens.
- `gpt-5-mini`: input $0.25, cached input $0.025, output $2 per 1,000,000 tokens.
- `gpt-5-nano`: input $0.05, cached input n/a, output $0.4 per 1,000,000 tokens.
- `gpt-5-pro`: input $15, cached input n/a, output $120 per 1,000,000 tokens.
- `gpt-5.1`: input $1.25, cached input $0.125, output $10 per 1,000,000 tokens.
- `gpt-5.2`: input $1.75, cached input $0.175, output $14 per 1,000,000 tokens.
- `gpt-5.2-pro`: input $21, cached input n/a, output $168 per 1,000,000 tokens.
- `gpt-5.4`: input $2.5, cached input $0.25, output $15 per 1,000,000 tokens; long-context tier after 200,000 tokens: input ×2, output ×1.5.
- `gpt-5.4-mini`: input $0.75, cached input $0.075, output $4.5 per 1,000,000 tokens.
- `gpt-5.5`: input $5, cached input $0.5, output $30 per 1,000,000 tokens.
- `o1`: input $15, cached input $7.5, output $60 per 1,000,000 tokens.
- `o3`: input $2, cached input $0.5, output $8 per 1,000,000 tokens.
- `o3-mini`: input $1.1, cached input $0.55, output $4.4 per 1,000,000 tokens.
- `o4-mini`: input $1.1, cached input $0.275, output $4.4 per 1,000,000 tokens.