OpenAI: GPT-4o-mini
openai/gpt-4o-mini
GPT-4o mini is OpenAI's newest model after [GPT-4 Omni](/models/openai/gpt-4o), supporting both text and image inputs with text outputs. As their most advanced small model, it is many multiples more affordable than other recent frontier models, and more than 60% cheaper than [GPT-3.5 Turbo](/models/openai/gpt-3.5-turbo). It maintains SOTA intelligence, while being significantly more cost-effective. GPT-4o mini achieves an 82% score on MMLU and presently ranks higher than GPT-4 on chat preferences [common leaderboards](https://arena.lmsys.org/). #multimodal
ByopenaiInput typeOutput type
Recent activity on GPT-4o-mini
Tokens processed per day
Thoughput
(tokens/s)
ProvidersMin (tokens/s)Max (tokens/s)Avg (tokens/s)
Azure6.1782.6324.64
OpenAI8.344.2613.64
First Token Latency
(ms)
ProvidersMin (ms)Max (ms)Avg (ms)
Azure129927421547.46
OpenAI642814719.82
Providers for GPT-4o-mini
ZenMux Provider to the best providers that are able to handle your prompt size and parameters, with fallbacks to maximize uptime.
Latency
-
Throughput
-
Uptime
100.00
%
Recent uptime
Oct 10,2025 - 3 PM100.00%
Price
Input
$ 0.15
/ M tokens
Output
$ 0.6
/ M tokens
Cache read
$ 0.075
/ M tokens
Cache write 5m
-
Cache write 1h
-
Cache write
-
Web search
-
Model limitation
Context
128.00K
Max output
16.38K
Supported Parameters
max_completion_tokens
-
temperature
top_p
frequency_penalty
presence_penalty
seed
logit_bias
logprobs
top_logprobs
response_format
stop
tools
-
tool_choice
-
parallel_tool_calls
-
Model Protocol Compatibility
openai
anthropic
-
Data policy
Prompt training
false
Prompt Logging
30 day retention
Moderation
Responsibility of developer
Status Page
status page
Latency
0.74
s
Throughput
9.56
tps
Uptime
100.00
%
Recent uptime
Oct 10,2025 - 3 PM100.00%
Price
Input
$ 0.15
/ M tokens
Output
$ 0.6
/ M tokens
Cache read
$ 0.075
/ M tokens
Cache write 5m
-
Cache write 1h
-
Cache write
-
Web search
-
Model limitation
Context
128.00K
Max output
16.38K
Supported Parameters
max_completion_tokens
temperature
top_p
frequency_penalty
presence_penalty
seed
logit_bias
logprobs
top_logprobs
response_format
stop
tools
tool_choice
parallel_tool_calls
-
Model Protocol Compatibility
openai
anthropic
-
Data policy
Prompt training
false
Prompt Logging
Zero retention
Moderation
Responsibility of developer
Status Page
status page
Sample code and API for GPT-4o-mini
ZenMux normalizes requests and responses across providers for you.
OpenAI-PythonPythonTypeScriptOpenAI-TypeScriptcURL
python
from openai import OpenAI

client = OpenAI(
  base_url="https://zenmux.ai/api/v1",
  api_key="<ZenMux_API_KEY>",
)

completion = client.chat.completions.create(
  model="openai/gpt-4o-mini",
  messages=[
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "What is in this image?"
        }
      ]
    }
  ]
)
print(completion.choices[0].message.content)