
xAI: Grok 4 Fast

x-ai/grok-4-fast

Grok 4 Fast is xAI's latest multimodal model with SOTA cost-efficiency and a 2M token context window. It comes in two flavors: non-reasoning and reasoning. Read more about the model in xAI's [news post](http://x.ai/news/grok-4-fast). Reasoning can be enabled using the `reasoning` `enabled` parameter in the API. [Learn more in our docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#controlling-reasoning-tokens). Prompts and completions may be used by xAI or OpenRouter to improve future models.
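As a rough illustration of the `reasoning` `enabled` parameter described above, the sketch below builds a chat-completion request body with reasoning switched on or off. The `build_request` helper is hypothetical, and the nested `{"reasoning": {"enabled": ...}}` shape follows the linked reasoning-tokens docs; whether a particular gateway accepts it unchanged is an assumption.

```python
from typing import Any


def build_request(prompt: str, reasoning_enabled: bool) -> dict[str, Any]:
    """Build a chat-completion request body (hypothetical helper)."""
    return {
        "model": "x-ai/grok-4-fast",
        "messages": [{"role": "user", "content": prompt}],
        # Assumed shape, taken from the linked reasoning-tokens docs.
        "reasoning": {"enabled": reasoning_enabled},
    }


body = build_request("Why is the sky blue?", reasoning_enabled=True)
print(body["reasoning"])
```

With the OpenAI Python SDK, a non-standard field like this is usually passed through the `extra_body` argument of `client.chat.completions.create(...)`, for example `extra_body={"reasoning": {"enabled": True}}`.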

By x-ai

Recent activity on Grok 4 Fast

Tokens processed per day

Throughput (tokens/s)

Providers	Min (tokens/s)	Max (tokens/s)	Avg (tokens/s)
OpenAIChatCompletionAdapter	0.37	4.73	1.04

First Token Latency (ms)

Providers	Min (ms)	Max (ms)	Avg (ms)
OpenAIChatCompletionAdapter	1210	1453	1295.00
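The two tables above can be combined into a back-of-the-envelope response-time estimate: time to first token plus generation time at the measured throughput. This is only a sketch; `estimate_response_seconds` is a hypothetical helper, and using the listed averages as defaults assumes they are representative of future requests.

```python
def estimate_response_seconds(
    output_tokens: int,
    first_token_latency_ms: float = 1295.0,  # average from the latency table
    tokens_per_second: float = 1.04,         # average from the throughput table
) -> float:
    """Rough wall-clock estimate: first-token latency plus generation time."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return first_token_latency_ms / 1000.0 + output_tokens / tokens_per_second


print(f"{estimate_response_seconds(104):.1f} s")
```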

Providers for Grok 4 Fast

ZenMux routes requests to the best providers that are able to handle your prompt size and parameters, with fallbacks to maximize uptime.

OpenAIChatCompletionAdapter

Latency: 1.30

Throughput: 0.42 tps

Uptime: 100.00%

Recent uptime: Oct 10, 2025, 3 PM: 100.00%

Price

Tiered pricing: 0 <= Input < 128k

Item	Price
Input	$0.2 / M tokens
Output	$0.5 / M tokens
Cache read	$0.05 / M tokens
Cache write 5m	not listed
Cache write 1h	not listed
Cache write	not listed
Web search	$0.025 / request
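To make the pricing above concrete, here is a minimal cost-estimation sketch for the listed tier. `estimate_cost_usd` is a hypothetical helper; it assumes that "128k" means 128,000 tokens and that cached input tokens are billed at the cache-read rate instead of the input rate.

```python
INPUT_USD_PER_M = 0.2
OUTPUT_USD_PER_M = 0.5
CACHE_READ_USD_PER_M = 0.05
WEB_SEARCH_USD_PER_REQUEST = 0.025
TIER_INPUT_LIMIT = 128_000  # assumption: "128k" means 128,000 tokens


def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    cached_input_tokens: int = 0,
    web_searches: int = 0,
) -> float:
    """Estimate the cost of one request in the 0 <= input < 128k tier."""
    if not 0 <= input_tokens < TIER_INPUT_LIMIT:
        raise ValueError("input_tokens is outside the listed pricing tier")
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    # Assumption: cached tokens replace, rather than add to, regular input billing.
    uncached_tokens = input_tokens - cached_input_tokens
    token_cost = (
        uncached_tokens * INPUT_USD_PER_M
        + cached_input_tokens * CACHE_READ_USD_PER_M
        + output_tokens * OUTPUT_USD_PER_M
    ) / 1_000_000
    return token_cost + web_searches * WEB_SEARCH_USD_PER_REQUEST


# 100k input tokens and 10k output tokens, no caching or web search.
print(f"${estimate_cost_usd(100_000, 10_000):.4f}")
```

Requests with 128k or more input tokens raise an error here because the page lists no price for that tier.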

Model limitation

Context: 2.00M

Max output: 30.00K

Supported Parameters

max_completion_tokens

temperature

top_p

frequency_penalty

presence_penalty

seed

logit_bias

logprobs

top_logprobs

response_format

stop

tools

tool_choice

parallel_tool_calls
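One way to use the parameter list above is to validate request options before sending them. The sketch below rejects names that are not listed and caps `max_completion_tokens` at the stated maximum output. `sampling_kwargs` is a hypothetical helper, and reading "30.00K" as 30,000 tokens is an assumption.

```python
SUPPORTED_PARAMETERS = {
    "max_completion_tokens", "temperature", "top_p", "frequency_penalty",
    "presence_penalty", "seed", "logit_bias", "logprobs", "top_logprobs",
    "response_format", "stop", "tools", "tool_choice", "parallel_tool_calls",
}
MAX_OUTPUT_TOKENS = 30_000  # assumption: "30.00K" means 30,000 tokens


def sampling_kwargs(**params):
    """Reject unlisted parameters and cap max_completion_tokens."""
    unknown = set(params) - SUPPORTED_PARAMETERS
    if unknown:
        raise ValueError(f"unsupported parameters: {sorted(unknown)}")
    if "max_completion_tokens" in params:
        params["max_completion_tokens"] = min(
            params["max_completion_tokens"], MAX_OUTPUT_TOKENS
        )
    return params


kwargs = sampling_kwargs(
    temperature=0.2, seed=7, max_completion_tokens=50_000, stop=["\n\n"]
)
print(kwargs["max_completion_tokens"])
```

The resulting dictionary can then be unpacked into an OpenAI-style call, for example `client.chat.completions.create(model="x-ai/grok-4-fast", messages=messages, **kwargs)`.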

Model Protocol Compatibility

openai

anthropic

Data policy

Prompt training: false

Prompt Logging: Zero retention

Moderation: Responsibility of developer

Sample code and API for Grok 4 Fast

ZenMux normalizes requests and responses across providers for you.


```python
from openai import OpenAI

# Point the OpenAI SDK at the ZenMux endpoint.
client = OpenAI(
    base_url="https://zenmux.ai/api/v1",
    api_key="<ZenMux_API_KEY>",
)

# Send a single user message made of one text content part.
completion = client.chat.completions.create(
    model="x-ai/grok-4-fast",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What is in this image?"
                }
            ]
        }
    ]
)

# Print the text of the first returned choice.
print(completion.choices[0].message.content)
```
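The sample prompt asks about an image but sends only a text part. Since the model is described as multimodal, a request would typically carry an image part as well. The sketch below builds such a message using OpenAI-style `image_url` content parts; that the gateway accepts this format for this model is an assumption, `image_question` is a hypothetical helper, and the URL is a placeholder.

```python
def image_question(question: str, image_url: str) -> list[dict]:
    """Build one user message with a text part and an image part."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                # OpenAI-style image part; support here is assumed, not confirmed.
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }
    ]


# Placeholder URL for illustration only.
messages = image_question("What is in this image?", "https://example.com/cat.png")
print([part["type"] for part in messages[0]["content"]])
```

The returned list can be passed as the `messages` argument of `client.chat.completions.create(...)`.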