Qwen: Qwen3-Max
qwen/qwen3-max
Qwen3-Max is an updated release built on the Qwen3 series, offering major improvements in reasoning, instruction following, multilingual support, and long-tail knowledge coverage compared to the January 2025 version. It delivers higher accuracy in math, coding, logic, and science tasks, follows complex instructions in Chinese and English more reliably, reduces hallucinations, and produces higher-quality responses for open-ended Q&A, writing, and conversation. The model supports over 100 languages with stronger translation and commonsense reasoning, and is optimized for retrieval-augmented generation (RAG) and tool calling, though it does not include a dedicated “thinking” mode.
By qwen
Recent activity on Qwen3-Max
Tokens processed per day
Throughput (tokens/s)
Provider        Min     Max     Avg
Alibaba Cloud   6.97    31.09   12.74
First Token Latency (ms)
Provider        Min   Max   Avg
Alibaba Cloud   631   998   854.53
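As a rough illustration of how the two tables combine, the sketch below estimates wall-clock time for a response as first-token latency plus generation time at a steady throughput. The function name `estimate_response_seconds` is hypothetical, and the model is deliberately simple: it assumes constant throughput after the first token and ignores network and queueing effects.

```python
def estimate_response_seconds(
    output_tokens: int,
    first_token_latency_ms: float = 854.53,  # Alibaba Cloud average from the table
    throughput_tps: float = 12.74,           # Alibaba Cloud average from the table
) -> float:
    """Rough estimate: time to first token plus time to generate the output."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if throughput_tps <= 0:
        raise ValueError("throughput_tps must be positive")
    return first_token_latency_ms / 1000.0 + output_tokens / throughput_tps


# Compare the listed best, average, and worst figures for a 500-token answer.
scenarios = [
    ("best case", 631, 31.09),
    ("average", 854.53, 12.74),
    ("worst case", 998, 6.97),
]
for label, latency_ms, tps in scenarios:
    seconds = estimate_response_seconds(500, latency_ms, tps)
    print(f"{label}: about {seconds:.1f} s")
```

The spread between the best and worst rows suggests budgeting timeouts against the minimum throughput rather than the average.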
Providers for Qwen3-Max
ZenMux routes requests to the best providers that are able to handle your prompt size and parameters, with fallbacks to maximize uptime.
Latency: 0.94 s
Throughput: 24.23 tps
Uptime: 100.00%
Recent uptime (Oct 10, 2025, 3 PM): 100.00%
Price
Tiered pricing (0 <= Input < 32k)
Input: $1.2 / M tokens
Output: $6 / M tokens
Cache read: $0.24 / M tokens
Cache write 5m: $1.5 / M tokens
Cache write 1h: -
Cache write: -
Web search: -
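To make the per-million-token prices concrete, here is a minimal cost-estimate sketch for the one tier listed above. The function name `estimate_cost_usd` is hypothetical. The sketch also makes assumptions the listing does not confirm: that "32k" means 32,000 tokens, that the tier is selected by the total prompt size, and that cached tokens are billed at the cache rates instead of the input rate.

```python
# Prices in USD per million tokens, copied from the 0 <= Input < 32k tier.
PRICE_PER_MILLION_USD = {
    "input": 1.2,
    "output": 6.0,
    "cache_read": 0.24,
    "cache_write_5m": 1.5,
}

# Assumption: "32k" is read as 32,000 tokens (it could also mean 32,768).
TIER_INPUT_LIMIT = 32_000


def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    cache_read_tokens: int = 0,
    cache_write_5m_tokens: int = 0,
) -> float:
    """Estimate the cost of one request within the listed pricing tier."""
    counts = {
        "input": input_tokens,
        "output": output_tokens,
        "cache_read": cache_read_tokens,
        "cache_write_5m": cache_write_5m_tokens,
    }
    if any(n < 0 for n in counts.values()):
        raise ValueError("token counts must be non-negative")
    # Assumption: the tier is chosen by the total number of prompt tokens.
    prompt_total = input_tokens + cache_read_tokens + cache_write_5m_tokens
    if prompt_total >= TIER_INPUT_LIMIT:
        raise ValueError("prompt is outside the 0 <= Input < 32k tier")
    total = sum(counts[k] * PRICE_PER_MILLION_USD[k] for k in counts)
    return total / 1_000_000


print(f"${estimate_cost_usd(10_000, 2_000):.4f}")
```

Prompts of 32k tokens or more fall into tiers not shown here, so the sketch rejects them rather than guessing a price.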
Model limitation
Context: 256.00K
Max output: 32.00K
Supported Parameters
max_completion_tokens
temperature
top_p
frequency_penalty: -
presence_penalty
seed
logit_bias: -
logprobs
top_logprobs
response_format
stop
tools
tool_choice
parallel_tool_calls: -
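One way to use this list is to validate request options before sending them. The sketch below builds keyword arguments for an OpenAI-style chat completion call and rejects any option outside the list. The helper name `build_request` is hypothetical, and the parameters that the listing pairs with a dash are left out on the assumption that the dash marks them as unavailable.

```python
# Parameter names taken from the list above; entries paired with "-"
# (frequency_penalty, logit_bias, parallel_tool_calls) are omitted on the
# assumption that the dash means "not supported".
LISTED_PARAMS = {
    "max_completion_tokens",
    "temperature",
    "top_p",
    "presence_penalty",
    "seed",
    "logprobs",
    "top_logprobs",
    "response_format",
    "stop",
    "tools",
    "tool_choice",
}


def build_request(prompt: str, **params) -> dict:
    """Return chat-completion keyword arguments, rejecting unlisted options."""
    unknown = sorted(set(params) - LISTED_PARAMS)
    if unknown:
        raise ValueError(f"parameters not in the supported list: {unknown}")
    return {
        "model": "qwen/qwen3-max",
        "messages": [{"role": "user", "content": prompt}],
        **params,
    }


request = build_request(
    "List three prime numbers.",
    temperature=0.2,
    seed=7,
    max_completion_tokens=256,
)
print(sorted(request))
```

With an OpenAI-compatible client already configured, the result could be passed along as `client.chat.completions.create(**request)`.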
Model Protocol Compatibility
openai
anthropic: -
Data policy
Prompt training: false
Prompt logging: Zero retention
Moderation: Responsibility of developer
Sample code and API for Qwen3-Max
ZenMux normalizes requests and responses across providers for you.
python
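from openai import OpenAI

# Point the OpenAI SDK at the ZenMux endpoint.
client = OpenAI(
  base_url="https://zenmux.ai/api/v1",
  api_key="<ZenMux_API_KEY>",  # replace with your own key
)

# Send a single-turn, text-only chat request to Qwen3-Max.
completion = client.chat.completions.create(
  model="qwen/qwen3-max",
  messages=[
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Explain retrieval-augmented generation in two sentences."
        }
      ]
    }
  ]
)

# Print the text of the first (and only) choice.
print(completion.choices[0].message.content)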
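Since the listing mentions tool calling and includes `tools` and `tool_choice` among the supported parameters, the following sketch shows the general shape of a tool definition and of the reply message sent back after a tool call, in the OpenAI chat-completions format. The `get_weather` function, the `REGISTRY` mapping, and `run_tool_call` are hypothetical names for illustration, and nothing here has been verified against Qwen3-Max itself.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical local tool; a real one would call a weather service."""
    return f"Sunny in {city}"


# Tool description passed as the `tools` request parameter.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

# Maps tool names the model may request to local Python callables.
REGISTRY = {"get_weather": get_weather}


def run_tool_call(tool_call: dict) -> dict:
    """Execute one tool call and build the `tool` message to send back."""
    function = tool_call["function"]
    arguments = json.loads(function["arguments"])  # arguments arrive as a JSON string
    result = REGISTRY[function["name"]](**arguments)
    return {
        "role": "tool",
        "tool_call_id": tool_call["id"],
        "content": result,
    }


# Example tool call shaped like an entry of `message.tool_calls`.
example_call = {
    "id": "call_1",
    "type": "function",
    "function": {"name": "get_weather", "arguments": json.dumps({"city": "Paris"})},
}
print(run_tool_call(example_call))
```

In a full exchange, `TOOLS` would be passed as `tools=TOOLS` in the request, and the returned `tool` message would be appended to `messages`, after the assistant message that requested the call, before asking the model for its final answer.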