Anthropic: Claude Sonnet 4
anthropic/claude-sonnet-4
Claude Sonnet 4 significantly enhances the capabilities of its predecessor, Sonnet 3.7, excelling in both coding and reasoning tasks with improved precision and controllability. Achieving state-of-the-art performance on SWE-bench (72.7%), Sonnet 4 balances capability and computational efficiency, making it suitable for a broad range of applications from routine coding tasks to complex software development projects. Key enhancements include improved autonomous codebase navigation, reduced error rates in agent-driven workflows, and increased reliability in following intricate instructions. Sonnet 4 is optimized for practical everyday use, providing advanced reasoning capabilities while maintaining efficiency and responsiveness in diverse internal and external scenarios.
By anthropic
Recent activity on Claude Sonnet 4
Tokens processed per day
Throughput (tokens/s)
Provider         Min     Max     Avg
Anthropic        22.74   41.57   30.66
Vertex AI        11.61   48.95   37.03
Amazon Bedrock   30.47   67.2    53.06
First Token Latency (ms)
Provider         Min     Max     Avg
Anthropic        2125    3055    2763.24
Vertex AI        1003    47249   7280.65
Amazon Bedrock   1674    9852    3240.29
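The two metrics above combine into a rough end-to-end estimate: wait for the first token, then stream the remaining output at the measured rate. The sketch below assumes that simple additive model (it ignores queueing, network jitter, and prompt length) and uses the Anthropic averages from the tables, 2763.24 ms and 30.66 tokens/s. The function name is illustrative and not part of any ZenMux API.

```python
def estimate_response_seconds(first_token_ms: float,
                              tokens_per_s: float,
                              output_tokens: int) -> float:
    """Rough total time: first-token latency plus streaming time for the output."""
    if tokens_per_s <= 0:
        raise ValueError("tokens_per_s must be positive")
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return first_token_ms / 1000.0 + output_tokens / tokens_per_s


# Anthropic averages from the tables above (assumed representative for this sketch).
ANTHROPIC_AVG_FIRST_TOKEN_MS = 2763.24
ANTHROPIC_AVG_TOKENS_PER_S = 30.66

seconds = estimate_response_seconds(
    ANTHROPIC_AVG_FIRST_TOKEN_MS, ANTHROPIC_AVG_TOKENS_PER_S, output_tokens=500
)
print(f"Estimated time for 500 output tokens: {seconds:.1f} s")
```

For long outputs the streaming term dominates, so throughput matters more than first-token latency; for short replies the opposite holds.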
Providers for Claude Sonnet 4
ZenMux routes requests to the best providers that can handle your prompt size and parameters, with fallbacks to maximize uptime.
Latency: 2.50 s
Throughput: 40.68 tps
Uptime: 100.00%
Recent uptime (Oct 10, 2025, 3 PM): 100.00%
Price (tiered pricing, tier 0 <= Input < 200k)
Input: $3 / M tokens
Output: $15 / M tokens
Cache read: $0.3 / M tokens
Cache write 5m: $3.75 / M tokens
Cache write 1h: $6 / M tokens
Cache write: -
Web search: $0.01 / request
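As a worked example of the price list above, the following sketch estimates the cost of one request. It is a minimal calculation under stated assumptions: only the listed tier (input below 200k tokens) is covered, the tier check is applied to `input_tokens` alone, and cached tokens are billed separately from regular input. The function and constant names are hypothetical.

```python
# Prices in USD per million tokens, copied from the listing above.
PRICE_PER_MILLION = {
    "input": 3.0,
    "output": 15.0,
    "cache_read": 0.3,
    "cache_write_5m": 3.75,
    "cache_write_1h": 6.0,
}
WEB_SEARCH_PER_REQUEST = 0.01
TIER_INPUT_LIMIT = 200_000  # the listed tier covers 0 <= input < 200k


def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      cache_read_tokens: int = 0,
                      cache_write_5m_tokens: int = 0,
                      cache_write_1h_tokens: int = 0,
                      web_searches: int = 0) -> float:
    """Estimate the cost of one request within the listed pricing tier."""
    if not 0 <= input_tokens < TIER_INPUT_LIMIT:
        # Prices for larger inputs are not given in the listing, so refuse to guess.
        raise ValueError("input_tokens is outside the listed pricing tier")
    usage = {
        "input": input_tokens,
        "output": output_tokens,
        "cache_read": cache_read_tokens,
        "cache_write_5m": cache_write_5m_tokens,
        "cache_write_1h": cache_write_1h_tokens,
    }
    token_cost = sum(PRICE_PER_MILLION[k] * n / 1_000_000 for k, n in usage.items())
    return token_cost + WEB_SEARCH_PER_REQUEST * web_searches


print(f"${estimate_cost_usd(10_000, 2_000):.4f}")
```

With these prices, output tokens cost five times as much as input tokens, and a cache read costs one tenth of a regular input token.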
Model limitation
Context: 1000.00K
Max output: 64.00K
Supported Parameters
max_completion_tokens
temperature
top_p
frequency_penalty: -
presence_penalty: -
seed: -
logit_bias: -
logprobs: -
top_logprobs: -
response_format: -
stop
tools
tool_choice
parallel_tool_calls
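The parameter list above can be used to validate request options before sending them. This sketch assumes that a "-" next to a parameter means the provider does not support it; the listing does not say what happens when an unsupported parameter is sent, so the helper simply separates the two groups. `split_params` is a hypothetical helper, not a ZenMux function.

```python
from typing import Any, Dict, List, Tuple

# Parameters listed without a "-" marker above (assumed to be supported).
SUPPORTED_PARAMS = {
    "max_completion_tokens", "temperature", "top_p", "stop",
    "tools", "tool_choice", "parallel_tool_calls",
}


def split_params(params: Dict[str, Any]) -> Tuple[Dict[str, Any], List[str]]:
    """Return (accepted params, sorted names of params not in the supported set)."""
    accepted = {k: v for k, v in params.items() if k in SUPPORTED_PARAMS}
    dropped = sorted(k for k in params if k not in SUPPORTED_PARAMS)
    return accepted, dropped


accepted, dropped = split_params(
    {"temperature": 0.2, "seed": 7, "stop": ["\n\n"], "max_completion_tokens": 512}
)
print("send:", accepted)
print("omit:", dropped)
```

Reporting the dropped names, rather than discarding them silently, makes it easier to notice when code written for another model relies on options such as `seed` or `response_format`.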
Model Protocol Compatibility: openai, anthropic
Data policy
Prompt training: false
Prompt logging: 30 day retention
Moderation: Responsibility of developer
Status page
Latency: -
Throughput: -
Uptime: 100.00%
Recent uptime (Oct 10, 2025, 3 PM): 100.00%
Price
Input: $3 / M tokens
Output: $15 / M tokens
Cache read: $0.3 / M tokens
Cache write 5m: $3.75 / M tokens
Cache write 1h: -
Cache write: -
Web search: -
Model limitation
Context: 200.00K
Max output: 64.00K
Supported Parameters
max_completion_tokens
temperature
top_p
frequency_penalty: -
presence_penalty: -
seed: -
logit_bias: -
logprobs: -
top_logprobs: -
response_format: -
stop
tools
tool_choice
parallel_tool_calls
Model Protocol Compatibility: openai, anthropic
Data policy
Prompt training: false
Prompt logging: Zero retention
Moderation: Responsibility of developer
Status page
Latency: -
Throughput: -
Uptime: 100.00%
Recent uptime (Oct 10, 2025, 3 PM): 100.00%
Price
Input: $3 / M tokens
Output: $15 / M tokens
Cache read: $0.3 / M tokens
Cache write 5m: $3.75 / M tokens
Cache write 1h: -
Cache write: -
Web search: -
Model limitation
Context: 1000.00K
Max output: 64.00K
Supported Parameters
max_completion_tokens
temperature
top_p
frequency_penalty: -
presence_penalty: -
seed: -
logit_bias: -
logprobs: -
top_logprobs: -
response_format: -
stop
tools
tool_choice
parallel_tool_calls
Model Protocol Compatibility: openai, anthropic
Data policy
Prompt training: false
Prompt logging: Zero retention
Moderation: Responsibility of developer
Status page
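The three provider entries above differ mainly in context limit (1000.00K, 200.00K, 1000.00K), which is what "able to handle your prompt size" depends on. The sketch below illustrates one way such selection with fallback could work. It is not ZenMux's actual routing logic: the entry labels are hypothetical (the listing does not name the entries), "K" is taken to mean 1,000 tokens, and the rule that prompt plus requested output must fit in the context window is an assumption.

```python
from typing import Callable, Dict, List

# Hypothetical labels for the three entries above, with their listed limits.
PROVIDER_LIMITS: Dict[str, Dict[str, int]] = {
    "entry_1": {"context": 1_000_000, "max_output": 64_000},
    "entry_2": {"context": 200_000, "max_output": 64_000},
    "entry_3": {"context": 1_000_000, "max_output": 64_000},
}


def eligible_providers(prompt_tokens: int, max_output_tokens: int) -> List[str]:
    """Entries whose limits fit the request, in listing order."""
    return [
        name for name, limits in PROVIDER_LIMITS.items()
        if max_output_tokens <= limits["max_output"]
        and prompt_tokens + max_output_tokens <= limits["context"]
    ]


def call_with_fallback(candidates: List[str], send: Callable[[str], str]) -> str:
    """Try each candidate in order and return the first successful result."""
    last_error = None
    for name in candidates:
        try:
            return send(name)
        except Exception as exc:  # any failure moves on to the next candidate
            last_error = exc
    raise RuntimeError("no eligible provider succeeded") from last_error


# A 300k-token prompt rules out the 200K-context entry.
print(eligible_providers(prompt_tokens=300_000, max_output_tokens=4_000))
```

Here `send` stands in for whatever function performs the request against a given provider; a production router would also weigh latency, throughput, and price.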
Sample code and API for Claude Sonnet 4
ZenMux normalizes requests and responses across providers for you.
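```python
from openai import OpenAI

# Point the OpenAI SDK at the ZenMux endpoint instead of api.openai.com.
client = OpenAI(
    base_url="https://zenmux.ai/api/v1",
    api_key="<ZenMux_API_KEY>",
)

completion = client.chat.completions.create(
    model="anthropic/claude-sonnet-4",
    messages=[
        {
            "role": "user",
            # Text-only message; no image part is attached in this sample.
            "content": [
                {
                    "type": "text",
                    "text": "What is in this image?"
                }
            ]
        }
    ]
)
# The reply text is in the first choice.
print(completion.choices[0].message.content)
```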
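For environments without the OpenAI SDK, the same call can be expressed as a plain HTTP request. The sketch below builds the request that the SDK would send for `chat.completions.create` with this `base_url`: a POST to `{base_url}/chat/completions` with a Bearer token and a JSON body. It uses only the standard library and sets only parameters listed as supported above. `build_chat_request` is a hypothetical helper, and sending is left commented out because it needs a real key and network access.

```python
import json
import urllib.request

BASE_URL = "https://zenmux.ai/api/v1"


def build_chat_request(api_key: str, prompt: str,
                       max_completion_tokens: int = 256) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    payload = {
        "model": "anthropic/claude-sonnet-4",
        "messages": [{"role": "user", "content": prompt}],
        "max_completion_tokens": max_completion_tokens,
        "temperature": 0.2,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_chat_request("<ZenMux_API_KEY>", "Explain what a context window is.")
print(req.get_method(), req.full_url)

# To send it:
# with urllib.request.urlopen(req) as resp:
#     body = json.load(resp)
#     print(body["choices"][0]["message"]["content"])
```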