MoonshotAI: Kimi K2 0905
moonshotai/kimi-k2-0905
Kimi K2 0905 is the September update of [Kimi K2 0711](moonshotai/kimi-k2). It is a large-scale Mixture-of-Experts (MoE) language model developed by Moonshot AI, featuring 1 trillion total parameters with 32 billion active per forward pass. It supports long-context inference up to 256k tokens, extended from the previous 128k. This update improves agentic coding with higher accuracy and better generalization across scaffolds, and enhances frontend coding with more aesthetic and functional outputs for web, 3D, and related tasks. Kimi K2 is optimized for agentic capabilities, including advanced tool use, reasoning, and code synthesis. It excels across coding (LiveCodeBench, SWE-bench), reasoning (ZebraLogic, GPQA), and tool-use (Tau2, AceBench) benchmarks. The model is trained with a novel stack incorporating the MuonClip optimizer for stable large-scale MoE training.
By moonshotai
Recent activity on Kimi K2 0905
Tokens processed per day
Throughput (tokens/s)
Providers    | Min (tokens/s) | Max (tokens/s) | Avg (tokens/s)
Moonshot AI  | 4.59           | 13.61          | 5.67
First Token Latency (ms)
Providers    | Min (ms) | Max (ms) | Avg (ms)
Moonshot AI  | 1411     | 2665     | 1631.52
Providers for Kimi K2 0905
ZenMux routes requests to the best providers that can handle your prompt size and parameters, with fallbacks to maximize uptime.
Latency: 1.59 s
Throughput: 5.26 tps
Uptime: 100.00%
Recent uptime (Oct 10, 2025, 3 PM): 100.00%
Price
Input: $0.60 / M tokens
Output: $2.50 / M tokens
Cache read: $0.15 / M tokens
Cache write (5m): -
Cache write (1h): -
Cache write: -
Web search: -
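As a quick reference, here is a minimal sketch of how the listed per-million-token rates translate into the cost of a single request. The estimate_cost helper and the token counts in the example are illustrative, not part of the ZenMux API.

python
# Rates from the price list above, in USD per 1M tokens.
INPUT_PER_M = 0.60
OUTPUT_PER_M = 2.50
CACHE_READ_PER_M = 0.15

def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimated USD cost of one request; cached prompt tokens are billed at the cache-read rate."""
    uncached = input_tokens - cached_tokens
    return (
        uncached * INPUT_PER_M / 1_000_000
        + cached_tokens * CACHE_READ_PER_M / 1_000_000
        + output_tokens * OUTPUT_PER_M / 1_000_000
    )

# Example: 20k prompt tokens (half served from cache) and 1k completion tokens.
print(f"${estimate_cost(20_000, 1_000, cached_tokens=10_000):.4f}")  # ~ $0.0100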
Model limitations
Context: 262.10K tokens
Max output: 262.10K tokens
Supported Parameters
max_completion_tokens: -
temperature
top_p
frequency_penalty
presence_penalty
seed: -
logit_bias: -
logprobs: -
top_logprobs: -
response_format
stop
tools
tool_choice: -
parallel_tool_calls: -
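As an illustration, the request below exercises several of the parameters listed above; the specific values are arbitrary examples, not tuning recommendations.

python
from openai import OpenAI

client = OpenAI(
    base_url="https://zenmux.ai/api/v1",
    api_key="<ZenMux_API_KEY>",
)

completion = client.chat.completions.create(
    model="moonshotai/kimi-k2-0905",
    messages=[{"role": "user", "content": "Summarize the trade-offs of Mixture-of-Experts models in three bullet points."}],
    temperature=0.6,        # sampling temperature
    top_p=0.95,             # nucleus sampling cutoff
    frequency_penalty=0.2,  # discourage verbatim repetition
    presence_penalty=0.0,
    stop=["<END>"],         # optional stop sequence
)
print(completion.choices[0].message.content)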
Model Protocol Compatibility
openai
anthropic
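Since the model is listed as compatible with both the OpenAI and Anthropic protocols, it should also be reachable through the Anthropic SDK. The sketch below assumes an Anthropic-compatible base URL of https://zenmux.ai/api/anthropic, which is an unverified guess; check the ZenMux documentation for the actual path.

python
# Hypothetical sketch only: the base_url below is assumed, not confirmed.
import anthropic

client = anthropic.Anthropic(
    base_url="https://zenmux.ai/api/anthropic",  # assumed endpoint
    api_key="<ZenMux_API_KEY>",
)

message = client.messages.create(
    model="moonshotai/kimi-k2-0905",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Explain what a Mixture-of-Experts model is."}],
)
print(message.content[0].text)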
Data policy
Prompt training: false
Prompt logging: Zero retention
Moderation: Responsibility of the developer
Sample code and API for Kimi K2 0905
ZenMux normalizes requests and responses across providers for you.
Examples are available for OpenAI-Python, Python, TypeScript, OpenAI-TypeScript, and cURL; the OpenAI Python SDK example is shown below.
python
from openai import OpenAI

# Point the OpenAI SDK at ZenMux's OpenAI-compatible endpoint.
client = OpenAI(
  base_url="https://zenmux.ai/api/v1",
  api_key="<ZenMux_API_KEY>",
)

# Text-only chat completion; content can also be passed as a plain string.
completion = client.chat.completions.create(
  model="moonshotai/kimi-k2-0905",
  messages=[
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Give a one-paragraph overview of Mixture-of-Experts language models."
        }
      ]
    }
  ]
)
print(completion.choices[0].message.content)
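Because the model is positioned for agentic tool use and tools is among the supported parameters, here is a minimal tool-calling sketch over the same OpenAI-compatible endpoint. The get_weather function is a hypothetical example tool, not part of ZenMux.

python
import json
from openai import OpenAI

client = OpenAI(
    base_url="https://zenmux.ai/api/v1",
    api_key="<ZenMux_API_KEY>",
)

# One hypothetical tool the model may choose to call.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

completion = client.chat.completions.create(
    model="moonshotai/kimi-k2-0905",
    messages=[{"role": "user", "content": "What's the weather in Tokyo right now?"}],
    tools=tools,
)

# If the model decided to call the tool, its arguments arrive as a JSON string.
for call in completion.choices[0].message.tool_calls or []:
    print(call.function.name, json.loads(call.function.arguments))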