Anthropic: Claude 3.5 Haiku
anthropic/claude-3.5-haiku
Claude 3.5 Haiku offers enhanced capabilities in speed, coding accuracy, and tool use. Engineered to excel in real-time applications, it delivers quick response times that are essential for dynamic tasks such as chat interactions and immediate coding suggestions. This makes it highly suitable for environments that demand both speed and precision, such as software development, customer service bots, and data management systems.
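Because tool use is one of the model's headline strengths, here is a minimal sketch of a tool-calling request through ZenMux's OpenAI-compatible endpoint. The base URL and model ID are taken from the sample code at the bottom of this page; the get_weather tool and its schema are purely illustrative.

from openai import OpenAI

client = OpenAI(
    base_url="https://zenmux.ai/api/v1",  # ZenMux OpenAI-compatible endpoint (from the sample below)
    api_key="<ZenMux_API_KEY>",
)

# A hypothetical tool definition, used only to illustrate the request shape.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

completion = client.chat.completions.create(
    model="anthropic/claude-3.5-haiku",
    messages=[{"role": "user", "content": "What's the weather in Paris right now?"}],
    tools=tools,
    tool_choice="auto",
)

# If the model decides to call the tool, the call arrives as structured arguments.
for call in completion.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)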
Recent activity on Claude 3.5 Haiku
Tokens processed per day (chart)
Throughput (tokens/s)
Provider        | Min   | Max   | Avg
Anthropic       | 9.35  | 45.32 | 12.55
Amazon Bedrock  | 31    | 40.13 | 36.23
Vertex AI       | 5.11  | 40.76 | 25.68
First Token Latency (ms)
Provider        | Min  | Max   | Avg
Anthropic       | 672  | 780   | 724.24
Amazon Bedrock  | 1419 | 3444  | 2056.50
Vertex AI       | 1766 | 31222 | 6114.00
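Both metrics above can be reproduced on your own traffic with streaming: first-token latency is the delay before the first streamed content chunk arrives, and throughput is generated tokens divided by generation time. The sketch below assumes the ZenMux endpoint and model ID shown later on this page and approximates token count by the number of streamed chunks.

import time
from openai import OpenAI

client = OpenAI(base_url="https://zenmux.ai/api/v1", api_key="<ZenMux_API_KEY>")

start = time.perf_counter()
first_token_at = None
chunks = 0

stream = client.chat.completions.create(
    model="anthropic/claude-3.5-haiku",
    messages=[{"role": "user", "content": "List five prime numbers."}],
    stream=True,
)
for chunk in stream:
    # Only count chunks that actually carry generated text.
    if chunk.choices and chunk.choices[0].delta.content:
        if first_token_at is None:
            first_token_at = time.perf_counter()
        chunks += 1
end = time.perf_counter()

if first_token_at is not None:
    print(f"first-token latency: {(first_token_at - start) * 1000:.0f} ms")
    # Chunk count only approximates token count, so this is a rough throughput figure.
    print(f"throughput: {chunks / (end - first_token_at):.1f} chunks/s")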
Providers for Claude 3.5 Haiku
ZenMux routes requests to the best providers that are able to handle your prompt size and parameters, with fallbacks to maximize uptime.
Latency: 0.76 s
Throughput: 10.62 tps
Uptime: 100.00%
Recent uptime: Oct 10, 2025, 3 PM: 100.00%

Price
Input: $0.80 / M tokens
Output: $4.00 / M tokens
Cache read: $0.08 / M tokens
Cache write (5m): $1.00 / M tokens
Cache write (1h): $1.60 / M tokens
Cache write: -
Web search: $0.01 / request
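As a quick sanity check on the prices above, per-request cost is just tokens divided by one million times the listed rate. The rates below come from this price list; the token counts are a hypothetical request with 10,000 input tokens (2,000 of them served from cache) and 1,000 output tokens.

# Rates from the price list above, in USD per million tokens.
INPUT, OUTPUT, CACHE_READ = 0.80, 4.00, 0.08

# Hypothetical request: 8,000 fresh input tokens, 2,000 cached input tokens, 1,000 output tokens.
fresh_input, cached_input, output = 8_000, 2_000, 1_000

cost = (fresh_input * INPUT + cached_input * CACHE_READ + output * OUTPUT) / 1_000_000
print(f"${cost:.6f}")  # $0.010560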
Model limitations
Context: 200.00K tokens
Max output: 8.19K tokens
Supported Parameters
Supported: max_completion_tokens, temperature, top_p, stop, tools, tool_choice
Not supported: frequency_penalty, presence_penalty, seed, logit_bias, logprobs, top_logprobs, response_format, parallel_tool_calls
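To make the list concrete, the sketch below sends a request that uses only the parameters marked as supported, through the same OpenAI-compatible endpoint as the sample code at the bottom of this page. The prompt and the stop sequence are illustrative.

from openai import OpenAI

client = OpenAI(base_url="https://zenmux.ai/api/v1", api_key="<ZenMux_API_KEY>")

completion = client.chat.completions.create(
    model="anthropic/claude-3.5-haiku",
    messages=[{"role": "user", "content": "List three uses of a paperclip, numbered 1-3."}],
    # Parameters listed as supported above:
    max_completion_tokens=256,
    temperature=0.7,
    top_p=0.9,
    stop=["4."],
)
print(completion.choices[0].message.content)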
Model Protocol Compatibility: openai, anthropic
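Since the card lists both openai and anthropic protocol compatibility, the same model should also be reachable from the official anthropic SDK by overriding its base URL. Whether ZenMux serves the Anthropic Messages protocol at the exact path below is an assumption; verify the endpoint in the ZenMux docs.

import anthropic

# Assumption: ZenMux exposes an Anthropic-compatible Messages endpoint at this base URL.
client = anthropic.Anthropic(
    base_url="https://zenmux.ai/api/v1",
    api_key="<ZenMux_API_KEY>",
)

message = client.messages.create(
    model="anthropic/claude-3.5-haiku",
    max_tokens=256,
    messages=[{"role": "user", "content": "Say hello in three languages."}],
)
print(message.content[0].text)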
Data policy
Prompt training: false
Prompt logging: Zero retention
Moderation: Responsibility of developer
Status page: (link)
Latency: -
Throughput: -
Uptime: 100.00%
Recent uptime: Oct 10, 2025, 3 PM: 100.00%

Price
Input: $0.80 / M tokens
Output: $4.00 / M tokens
Cache read: $0.08 / M tokens
Cache write (5m): $1.00 / M tokens
Cache write (1h): -
Cache write: -
Web search: -

Model limitations
Context: 200.00K tokens
Max output: 8.19K tokens

Supported Parameters
Supported: max_completion_tokens, temperature, top_p, stop, tools, tool_choice
Not supported: frequency_penalty, presence_penalty, seed, logit_bias, logprobs, top_logprobs, response_format, parallel_tool_calls

Model Protocol Compatibility: openai, anthropic

Data policy
Prompt training: false
Prompt logging: Zero retention
Moderation: Responsibility of developer
Status page: (link)
Latency: -
Throughput: -
Uptime: 100.00%
Recent uptime: Oct 10, 2025, 3 PM: 100.00%

Price
Input: $0.80 / M tokens
Output: $4.00 / M tokens
Cache read: $0.08 / M tokens
Cache write (5m): $1.00 / M tokens
Cache write (1h): -
Cache write: -
Web search: -

Model limitations
Context: 200.00K tokens
Max output: 8.19K tokens

Supported Parameters
Supported: max_completion_tokens, temperature, top_p, stop, tools, tool_choice
Not supported: frequency_penalty, presence_penalty, seed, logit_bias, logprobs, top_logprobs, response_format, parallel_tool_calls

Model Protocol Compatibility: openai, anthropic

Data policy
Prompt training: false
Prompt logging: Zero retention
Moderation: Responsibility of developer
Status page: (link)
Sample code and API for Claude 3.5 Haiku
ZenMux normalizes requests and responses across providers for you.
OpenAI Python SDK example:
from openai import OpenAI

# Point the OpenAI SDK at the ZenMux OpenAI-compatible endpoint.
client = OpenAI(
  base_url="https://zenmux.ai/api/v1",
  api_key="<ZenMux_API_KEY>",
)

completion = client.chat.completions.create(
  model="anthropic/claude-3.5-haiku",
  messages=[
    {
      "role": "user",
      # Content may be a plain string or a list of typed parts.
      "content": [
        {
          "type": "text",
          "text": "Write a haiku about fast language models."
        }
      ]
    }
  ],
)

# Print the assistant's reply.
print(completion.choices[0].message.content)
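Because quick response times are the model's main selling point, you will usually want to stream. Below is a streaming variant of the same call using the standard OpenAI SDK streaming interface against the ZenMux endpoint; the prompt is illustrative.

from openai import OpenAI

client = OpenAI(base_url="https://zenmux.ai/api/v1", api_key="<ZenMux_API_KEY>")

# stream=True yields chunks as they are generated instead of one final message.
stream = client.chat.completions.create(
    model="anthropic/claude-3.5-haiku",
    messages=[{"role": "user", "content": "Explain prompt caching in one paragraph."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()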