Z.AI: GLM 4.5 Air
z-ai/glm-4.5-air
GLM-4.5-Air is the lightweight variant of our latest flagship model family, also purpose-built for agent-centric applications. Like GLM-4.5, it adopts the Mixture-of-Experts (MoE) architecture but with a more compact parameter size. GLM-4.5-Air also supports hybrid inference modes, offering a "thinking mode" for advanced reasoning and tool use, and a "non-thinking mode" for real-time interaction. Users can control the reasoning behaviour via the `enabled` boolean in the `reasoning` parameter. [Learn more in our docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#enable-reasoning-with-default-config)
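The sample code later on this page uses the OpenAI-compatible endpoint, so a minimal sketch of toggling the thinking mode might look like the following. The exact shape of the `reasoning` field (a top-level object with an `enabled` boolean, passed via `extra_body`) is an assumption based on the linked docs, not a confirmed ZenMux schema.

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://zenmux.ai/api/v1",
    api_key="<ZenMux_API_KEY>",
)

# Assumed request shape: a top-level `reasoning` object with an `enabled` boolean,
# passed through `extra_body` because the OpenAI SDK does not model this field natively.
completion = client.chat.completions.create(
    model="z-ai/glm-4.5-air",
    messages=[{"role": "user", "content": "Plan a three-step approach to debug a flaky test."}],
    extra_body={"reasoning": {"enabled": True}},  # set False for the faster non-thinking mode
)
print(completion.choices[0].message.content)
```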
Recent activity on GLM 4.5 Air
Tokens processed per day
Throughput (tokens/s)

| Provider | Min (tokens/s) | Max (tokens/s) | Avg (tokens/s) |
| --- | --- | --- | --- |
| Z.AI | 49.38 | 80.04 | 60.74 |
First Token Latency (ms)

| Provider | Min (ms) | Max (ms) | Avg (ms) |
| --- | --- | --- | --- |
| Z.AI | 386 | 703 | 576.03 |
Providers for GLM 4.5 Air
ZenMux routes requests to the best providers that can handle your prompt size and parameters, with fallbacks to maximize uptime.
Latency: 0.73 s
Throughput: 57.01 tps
Uptime: 100.00%
Recent uptime (Oct 10, 2025 - 3 PM): 100.00%
Price

Tiered pricing (0 <= Input < 32k):

| Item | Price |
| --- | --- |
| Input | $0.11 / M tokens |
| Output | $0.56 / M tokens |
| Cache read | $0.022 / M tokens |
| Cache write (5m) | - |
| Cache write (1h) | - |
| Cache write | - |
| Web search | - |
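To make the per-million-token rates concrete, here is a small, hypothetical cost estimate in Python for one request in the 0 <= Input < 32k tier. The token counts are invented for illustration, and the assumption that cached prompt tokens are billed at the cache-read rate instead of the input rate is a reading of the table above, not a statement of ZenMux's billing rules.

```python
# Hypothetical token counts for one request (illustrative only).
input_tokens = 8_000
cached_input_tokens = 2_000   # portion of the prompt assumed to be served from cache
output_tokens = 1_500

# Tier rates for 0 <= Input < 32k, in USD per million tokens.
INPUT_RATE = 0.11
OUTPUT_RATE = 0.56
CACHE_READ_RATE = 0.022

cost = (
    (input_tokens - cached_input_tokens) * INPUT_RATE
    + cached_input_tokens * CACHE_READ_RATE
    + output_tokens * OUTPUT_RATE
) / 1_000_000
print(f"Estimated cost: ${cost:.6f}")  # ≈ $0.001544 under these made-up counts
```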
Model limitations

Context: 128.00K tokens
Max output: 96.00K tokens
Supported Parameters
| Parameter | Supported |
| --- | --- |
| max_completion_tokens | Yes |
| temperature | Yes |
| top_p | Yes |
| frequency_penalty | - |
| presence_penalty | - |
| seed | - |
| logit_bias | - |
| logprobs | - |
| top_logprobs | - |
| response_format | - |
| stop | Yes |
| tools | Yes |
| tool_choice | Yes |
| parallel_tool_calls | - |
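As a hedged illustration of how the parameters marked as supported combine in a single request (the names follow the OpenAI Chat Completions schema; this is a sketch, not a guarantee that every value shown is accepted by this provider):

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://zenmux.ai/api/v1",
    api_key="<ZenMux_API_KEY>",
)

# Only parameters listed as supported in the table above are used here.
completion = client.chat.completions.create(
    model="z-ai/glm-4.5-air",
    messages=[{"role": "user", "content": "List three uses of a Mixture-of-Experts model."}],
    max_completion_tokens=512,
    temperature=0.6,
    top_p=0.95,
    stop=["\n\n\n"],
)
print(completion.choices[0].message.content)
```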
Model Protocol Compatibility
openai: Yes
anthropic: -
Data policy
Prompt training: false
Prompt logging: Zero retention
Moderation: Responsibility of developer
Sample code and API for GLM 4.5 Air
ZenMux normalizes requests and responses across providers for you.
OpenAI-Python:
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://zenmux.ai/api/v1",
    api_key="<ZenMux_API_KEY>",
)

completion = client.chat.completions.create(
    model="z-ai/glm-4.5-air",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    # Text-only prompt: GLM 4.5 Air is a text model, so no image parts are attached.
                    "text": "Summarize the trade-offs of Mixture-of-Experts models in two sentences."
                }
            ]
        }
    ]
)
print(completion.choices[0].message.content)
```