Google: Gemini 2.5 Pro
google/gemini-2.5-pro
Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks. It employs “thinking” capabilities, enabling it to reason through responses with enhanced accuracy and nuanced context handling. Gemini 2.5 Pro achieves top-tier performance on multiple benchmarks, including first-place positioning on the LMArena leaderboard, reflecting superior human-preference alignment and complex problem-solving abilities.
Recent activity on Gemini 2.5 Pro (chart: tokens processed per day)
Throughput (tokens/s)
Provider         Min      Max      Avg
Google Vertex    18.92    114.77   88.49
SkyRouter        28.68    114.53   72.14
First Token Latency (ms)
Provider         Min     Max     Avg
Google Vertex    2334    7429    4467.03
SkyRouter        2353    6269    3847.47
Providers for Gemini 2.5 Pro
ZenMux routes your requests to the best providers that can handle your prompt size and parameters, with fallbacks to maximize uptime.
Latency: -
Throughput: -
Uptime: 100.00%
Recent uptime (Oct 10, 2025, 3 PM): 100.00%
Price
Tiered pricing: 0 <= Input < 200k
Input: $1.25 / M tokens
Output: $10 / M tokens
Cache read: $0.31 / M tokens
Cache write (5m): -
Cache write (1h): $4.50 / M tokens
Cache write: -
Web search: $0.035 / request
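As a rough illustration of how the first pricing tier above translates into a per-request cost, the sketch below multiplies token counts by the listed rates. It assumes the request stays under the 200k-input tier boundary, that cached input tokens are billed at the cache-read rate instead of the normal input rate, and it ignores tiers and fees not shown here.

# Rough cost estimate for a single request under the first tier shown above
# (0 <= input < 200k). Rates are USD per 1M tokens; higher tiers, cache
# writes, and web search fees are not covered here.
INPUT_PER_M = 1.25        # $ per 1M input tokens
OUTPUT_PER_M = 10.00      # $ per 1M output tokens
CACHE_READ_PER_M = 0.31   # $ per 1M cached input tokens (assumed billed at cache-read rate)

def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate USD cost for one request under the first pricing tier."""
    billable_input = input_tokens - cached_tokens
    return (
        billable_input * INPUT_PER_M / 1_000_000
        + cached_tokens * CACHE_READ_PER_M / 1_000_000
        + output_tokens * OUTPUT_PER_M / 1_000_000
    )

# Example: 120k input tokens (30k of them cached) and 4k output tokens
print(f"${estimate_cost(120_000, 4_000, cached_tokens=30_000):.4f}")  # -> $0.1618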
Model limitations
Context: 1.05M tokens
Max output: 65.53K tokens
Supported Parameters
max_completion_tokens
temperature
top_p
frequency_penalty: -
presence_penalty: -
seed
logit_bias: -
logprobs: -
top_logprobs: -
response_format: -
stop
tools
tool_choice
parallel_tool_calls: -
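As a sketch of how the parameters listed above (those without a "-" marker) map onto an OpenAI-style request through ZenMux, the example below sets several of them. The base URL and model slug come from the sample code later on this page; the specific values, the prompt, and the get_weather tool are illustrative only, and max_completion_tokens should stay within the 65.53K output limit noted above.

from openai import OpenAI

client = OpenAI(
    base_url="https://zenmux.ai/api/v1",
    api_key="<ZenMux_API_KEY>",
)

# Illustrative use of the supported sampling, stop, and tool parameters.
completion = client.chat.completions.create(
    model="google/gemini-2.5-pro",
    messages=[{"role": "user", "content": "What's the weather like in Lisbon today?"}],
    max_completion_tokens=2048,   # must not exceed the model's max output
    temperature=0.7,
    top_p=0.95,
    seed=42,
    stop=["\n\n###"],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool for illustration
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
    tool_choice="auto",
)

message = completion.choices[0].message
# The model either answers directly or asks to call the tool.
print(message.tool_calls or message.content)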
Model Protocol Compatibility
openai
anthropic
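The compatibility list above indicates this provider can be reached over both the OpenAI and the Anthropic wire protocols. Below is a minimal sketch using the Anthropic Python SDK; the base_url path is an assumption (ZenMux's documentation defines the actual Anthropic-compatible endpoint), while the model slug matches the one used elsewhere on this page.

from anthropic import Anthropic

# Assumed Anthropic-compatible endpoint; check ZenMux's docs for the real path.
client = Anthropic(
    base_url="https://zenmux.ai/api/anthropic",
    api_key="<ZenMux_API_KEY>",
)

message = client.messages.create(
    model="google/gemini-2.5-pro",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Explain chain-of-thought prompting in one paragraph."}],
)

# Print only the text blocks of the response.
for block in message.content:
    if block.type == "text":
        print(block.text)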
Data policy
Prompt training: false
Prompt logging: Zero retention
Moderation: Responsibility of developer
Status page
Latency: 2.95 s
Throughput: 61.97 tps
Uptime: 100.00%
Recent uptime (Oct 10, 2025, 3 PM): 100.00%
Price
Tiered pricing: 0 <= Input < 200k
Input: $1.25 / M tokens
Output: $10 / M tokens
Cache read: $0.31 / M tokens
Cache write (5m): -
Cache write (1h): $4.50 / M tokens
Cache write: -
Web search: -
Model limitations
Context: 1.05M tokens
Max output: 65.53K tokens
Supported Parameters
max_completion_tokens
temperature
top_p
frequency_penalty: -
presence_penalty: -
seed
logit_bias: -
logprobs: -
top_logprobs: -
response_format: -
stop
tools
tool_choice
parallel_tool_calls: -
Model Protocol Compatibility
openai
anthropic: -
Data policy
Prompt training: false
Prompt logging: Zero retention
Moderation: Responsibility of developer
Sample code and API for Gemini 2.5 Pro
ZenMux normalizes requests and responses across providers for you.
Example (OpenAI Python SDK):
from openai import OpenAI

client = OpenAI(
    base_url="https://zenmux.ai/api/v1",
    api_key="<ZenMux_API_KEY>",
)

# The prompt asks about an image, so the request includes an image part
# alongside the text part; the URL below is a placeholder for your own image.
completion = client.chat.completions.create(
    model="google/gemini-2.5-pro",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What is in this image?"
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://example.com/your-image.jpg"
                    }
                }
            ]
        }
    ]
)
print(completion.choices[0].message.content)
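For long responses, the same endpoint can also be consumed incrementally with the OpenAI SDK's streaming mode; the sketch below reuses the base URL and model slug from the sample above, assumes ZenMux forwards OpenAI-style streaming, and uses an illustrative prompt.

from openai import OpenAI

client = OpenAI(
    base_url="https://zenmux.ai/api/v1",
    api_key="<ZenMux_API_KEY>",
)

# Stream tokens as they arrive instead of waiting for the full completion.
stream = client.chat.completions.create(
    model="google/gemini-2.5-pro",
    messages=[{"role": "user", "content": "Write a haiku about request routing."}],
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()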