Anthropic: Claude Sonnet 4

anthropic/claude-sonnet-4

Claude Sonnet 4 significantly enhances the capabilities of its predecessor, Sonnet 3.7, excelling in both coding and reasoning tasks with improved precision and controllability. Achieving state-of-the-art performance on SWE-bench (72.7%), Sonnet 4 balances capability and computational efficiency, making it suitable for a broad range of applications from routine coding tasks to complex software development projects. Key enhancements include improved autonomous codebase navigation, reduced error rates in agent-driven workflows, and increased reliability in following intricate instructions. Sonnet 4 is optimized for practical everyday use, providing advanced reasoning capabilities while maintaining efficiency and responsiveness in diverse internal and external scenarios.

By anthropic

Recent activity on Claude Sonnet 4

Tokens processed per day

Throughput (tokens/s)

Providers	Min (tokens/s)	Max (tokens/s)	Avg (tokens/s)
Anthropic	22.74	41.57	30.66
Vertex AI	11.61	48.95	37.03
Amazon Bedrock	30.47	67.2	53.06

First Token Latency (ms)

Providers	Min (ms)	Max (ms)	Avg (ms)
Anthropic	2125	3055	2763.24
Vertex AI	1003	47249	7280.65
Amazon Bedrock	1674	9852	3240.29
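
As a rough way to read the two tables together, the sketch below estimates end-to-end response time as first-token latency plus output tokens divided by throughput. The formula is a simplifying assumption (it ignores queueing, variance, and prompt size), the name `estimate_seconds` is hypothetical, and the figures are the averages from the tables above.

```python
# Average figures copied from the throughput and first-token latency tables.
PROVIDERS = {
    "Anthropic": {"avg_tps": 30.66, "avg_ttft_ms": 2763.24},
    "Vertex AI": {"avg_tps": 37.03, "avg_ttft_ms": 7280.65},
    "Amazon Bedrock": {"avg_tps": 53.06, "avg_ttft_ms": 3240.29},
}


def estimate_seconds(provider: str, output_tokens: int) -> float:
    """Assumed model: time to first token + time to generate the remaining output."""
    stats = PROVIDERS[provider]
    return stats["avg_ttft_ms"] / 1000 + output_tokens / stats["avg_tps"]


if __name__ == "__main__":
    for name in PROVIDERS:
        seconds = estimate_seconds(name, 1000)
        print(f"{name}: ~{seconds:.1f} s for 1000 output tokens")
```

Because the minimum and maximum values in the tables differ widely, an estimate like this is only useful for comparing providers, not for predicting any single request.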

Providers for Claude Sonnet 4

ZenMux routes requests to the best providers that can handle your prompt size and parameters, with fallbacks to maximize uptime.
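
To illustrate what filtering by prompt size with fallbacks could look like, here is a minimal sketch. It is not ZenMux's actual routing logic, which this page does not describe. The `Provider` class, `candidates`, and `call_with_fallback` are hypothetical names, the ordering by average first-token latency is an assumption, and the context limits are the ones listed on this page, reading "K" as 1,000 tokens.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Provider:
    name: str
    context_tokens: int   # context limit listed on this page
    avg_ttft_ms: float    # average first-token latency listed on this page


PROVIDERS = [
    Provider("Anthropic", 1_000_000, 2763.24),
    Provider("Vertex AI", 200_000, 7280.65),
    Provider("Amazon Bedrock", 1_000_000, 3240.29),
]


def candidates(prompt_tokens: int, providers: Sequence[Provider] = PROVIDERS) -> List[Provider]:
    """Keep providers whose context fits the prompt, fastest first (assumed ordering)."""
    eligible = [p for p in providers if prompt_tokens <= p.context_tokens]
    return sorted(eligible, key=lambda p: p.avg_ttft_ms)


def call_with_fallback(prompt_tokens: int, send: Callable[[Provider], str]) -> str:
    """Try each eligible provider in order and return the first successful result."""
    last_error = None
    for provider in candidates(prompt_tokens):
        try:
            return send(provider)
        except Exception as exc:  # a real client would catch narrower error types
            last_error = exc
    raise RuntimeError("no provider could serve the request") from last_error
```

Here `send` stands in for whatever function actually issues the request to a given provider; passing it in keeps the sketch independent of any particular client library.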

Anthropic

Latency: 2.50
Throughput: 40.68 tps
Uptime: 100.00
Recent uptime (Oct 10, 2025 - 3 PM): 100.00%

Price

Tiered pricing (0 <= Input < 200k)

Item	Price
Input	$3 / M tokens
Output	$15 / M tokens
Cache read	$0.3 / M tokens
Cache write 5m	$3.75 / M tokens
Cache write 1h	$6 / M tokens
Cache write	
Web search	$0.01 / request
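
The per-token prices above translate into a request cost by simple arithmetic. The sketch below is an illustration only: `estimate_cost_usd` is a hypothetical helper, "200k" is read as 200,000 tokens, and how cached tokens count toward the input tier is an assumption, since this page does not say. Prices for inputs of 200k tokens or more are not listed here, so the function refuses them rather than guessing.

```python
# USD per million tokens, copied from the Anthropic price list above.
PRICE_PER_M = {
    "input": 3.00,
    "output": 15.00,
    "cache_read": 0.30,
    "cache_write_5m": 3.75,
}
TIER_LIMIT = 200_000  # assumed meaning of "0 <= Input < 200k"


def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    cache_read_tokens: int = 0,
    cache_write_5m_tokens: int = 0,
) -> float:
    """Estimate the cost of one request in the listed pricing tier."""
    if not 0 <= input_tokens < TIER_LIMIT:
        raise ValueError("prices for this input size are not listed on the page")
    usage = {
        "input": input_tokens,
        "output": output_tokens,
        "cache_read": cache_read_tokens,
        "cache_write_5m": cache_write_5m_tokens,
    }
    return sum(usage[kind] * PRICE_PER_M[kind] / 1_000_000 for kind in usage)


if __name__ == "__main__":
    # 10,000 input tokens and 2,000 output tokens.
    print(f"${estimate_cost_usd(10_000, 2_000):.4f}")
```

For example, 10,000 input tokens cost 10,000 x $3 / 1,000,000 = $0.03 and 2,000 output tokens cost 2,000 x $15 / 1,000,000 = $0.03, for about $0.06 in total.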

Model limitation

Context: 1000.00K
Max output: 64.00K

Supported Parameters

max_completion_tokens, temperature, top_p, frequency_penalty, presence_penalty, seed, logit_bias, logprobs, top_logprobs, response_format, stop, tools, tool_choice, parallel_tool_calls
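
One way to use this list is to check request options against it before sending. The sketch below is a hypothetical helper (`build_request` is not part of any SDK) that rejects parameter names not listed above and returns keyword arguments in the shape of an OpenAI-style chat completion request. It makes no network call.

```python
# Parameter names copied from the Supported Parameters list above.
SUPPORTED = frozenset({
    "max_completion_tokens", "temperature", "top_p", "frequency_penalty",
    "presence_penalty", "seed", "logit_bias", "logprobs", "top_logprobs",
    "response_format", "stop", "tools", "tool_choice", "parallel_tool_calls",
})


def build_request(messages: list, **params) -> dict:
    """Return request kwargs, rejecting any parameter not in the supported list."""
    unknown = sorted(set(params) - SUPPORTED)
    if unknown:
        raise ValueError(f"unsupported parameters: {', '.join(unknown)}")
    return {"model": "anthropic/claude-sonnet-4", "messages": messages, **params}


if __name__ == "__main__":
    request = build_request(
        [{"role": "user", "content": "Summarize this changelog."}],
        temperature=0.2,
        max_completion_tokens=256,
        stop=["\n\n"],
    )
    print(sorted(request))
```

The resulting dictionary could be passed to an OpenAI-compatible client as `client.chat.completions.create(**request)`, assuming the client version in use accepts these parameter names.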

Model Protocol Compatibility

openai, anthropic

Data policy

Prompt training: false
Prompt Logging: 30 day retention
Moderation: Responsibility of developer

Status Page

status page

Vertex AI

Latency

Throughput

Uptime: 100.00
Recent uptime (Oct 10, 2025 - 3 PM): 100.00%

Price

Item	Price
Input	$3 / M tokens
Output	$15 / M tokens
Cache read	$0.3 / M tokens
Cache write 5m	$3.75 / M tokens
Cache write 1h	
Cache write	
Web search	

Model limitation

Context: 200.00K
Max output: 64.00K

Supported Parameters

max_completion_tokens, temperature, top_p, frequency_penalty, presence_penalty, seed, logit_bias, logprobs, top_logprobs, response_format, stop, tools, tool_choice, parallel_tool_calls

Model Protocol Compatibility

openai, anthropic

Data policy

Prompt training: false
Prompt Logging: Zero retention
Moderation: Responsibility of developer

Status Page

status page

Amazon Bedrock

Latency

Throughput

Uptime: 100.00
Recent uptime (Oct 10, 2025 - 3 PM): 100.00%

Price

Item	Price
Input	$3 / M tokens
Output	$15 / M tokens
Cache read	$0.3 / M tokens
Cache write 5m	$3.75 / M tokens
Cache write 1h	
Cache write	
Web search	

Model limitation

Context: 1000.00K
Max output: 64.00K

Supported Parameters

max_completion_tokens, temperature, top_p, frequency_penalty, presence_penalty, seed, logit_bias, logprobs, top_logprobs, response_format, stop, tools, tool_choice, parallel_tool_calls

Model Protocol Compatibility

openai, anthropic

Data policy

Prompt training: false
Prompt Logging: Zero retention
Moderation: Responsibility of developer

Status Page

status page

Sample code and API for Claude Sonnet 4

ZenMux normalizes requests and responses across providers for you.

python
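from openai import OpenAI

# Point the OpenAI SDK at ZenMux's OpenAI-compatible endpoint.
client = OpenAI(
  base_url="https://zenmux.ai/api/v1",
  api_key="<ZenMux_API_KEY>",  # replace with your ZenMux API key
)

# Send one user message made of a single text content part.
completion = client.chat.completions.create(
  model="anthropic/claude-sonnet-4",
  messages=[
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "What is in this image?"
        }
      ]
    }
  ]
)
# Print the text of the first returned choice.
print(completion.choices[0].message.content)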