PiazBench

Overall

Combined rankings of AI models across all benchmark categories. Scores are Arena Elo ratings from LMSYS Chatbot Arena.

Rank | Provider | Model | Model ID | Score
1 | Anthropic | Claude Opus 4.6 | anthropic/claude-opus-4-6-thinking | 1499.0
2 | Anthropic | Claude Opus 4.7 | anthropic/claude-opus-4-7-thinking | 1486.0
3 | Google | Gemini 3.5 Flash | google/gemini-3.5-flash | 1482.0
4 | Google | Gemini 3.1 Pro | google/gemini-3.1-pro-preview | 1481.0
5 | Google | Gemini 3 Pro | google/gemini-3-pro | 1480.0
6 | Alibaba | Qwen 3.7 Max | alibaba/qwen3.7-max-preview | 1475.0
7 | Meta | Muse Spark | meta-llama/muse-spark | 1474.0
8 | OpenAI | GPT-5.4 | openai/gpt-5.4-high | 1472.0
9 | Alibaba | Qwen 3.5 Max | alibaba/qwen3.5-max-preview | 1471.0
10 | Baidu | Ernie 5.1 | baidu/ernie-5.1 | 1470.0
11 | Z.ai | GLM 5.1 | zai/glm-5.1 | 1469.0
12 | OpenAI | GPT-5.5 | openai/gpt-5.5-high | 1469.0
13 | Google | Gemini 3 Flash | google/gemini-3-flash | 1466.0
14 | Xiaomi | Xiaomi: MiMo V2.5 Pro | xiaomi/mimo-v2.5-pro | 1461.0
15 | Google | Gemini 2.5 Pro | google/gemini-2.5-pro | 1457.0
16 | Moonshot | kimi k2.6 | moonshot/kimi-k2.6 | 1456.0
17 | Anthropic | Claude Sonnet 4.6 | anthropic/claude-sonnet-4-6 | 1454.0
18 | xAI | Grok 4.20 | xai/grok-4.20-beta-0309-reasoning | 1454.0
19 | xAI | Grok 4.20 | xai/grok-4.20-multi-agent-beta-0309 | 1451.0
20 | Anthropic | Claude Opus 4.5 | anthropic/claude-opus-4-5-20251101 | 1449.0
21 | ByteDance | dola seed 2.0 pro | bytedance/dola-seed-2.0-pro | 1449.0
22 | Amazon | amazon nova chat 26 02 10 | amazon/amazon-nova-experimental-chat-26-02-10 | 1448.0
23 | DeepSeek | DeepSeek V4 Pro | deepseek/deepseek-v4-pro-thinking | 1446.0
24 | Google | gemini 3 flash | google/gemini-3-flash (thinking-minimal) | 1446.0
25 | Baidu | ernie 5.0 0110 | baidu/ernie-5.0-0110 | 1446.0
26 | xAI | Grok 4.20 | xai/grok-4.20-beta1 | 1446.0
27 | Z.ai | glm 5 | zai/glm-5 | 1445.0
28 | Moonshot | kimi k2.5 | moonshot/kimi-k2.5-thinking | 1445.0
29 | Alibaba | qwen3.6 max | alibaba/qwen3.6-max-preview | 1444.0
30 | Google | Gemma 4 31B | google/gemma-4-31b | 1442.0

312 models tested
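
To give a rough sense of what the score gaps above mean, the sketch below converts a rating difference into an expected head-to-head win rate. It assumes the scores sit on the conventional Elo logistic scale (base 10, 400 points), which this page does not state; the function name `expected_win_rate` is hypothetical and is not part of PiazBench or Chatbot Arena.

```python
def expected_win_rate(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the conventional Elo formula.

    Assumption: ratings use the standard base-10, 400-point logistic scale.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Scores copied from the leaderboard rows for ranks 1, 2 and 30.
rank_1 = 1499.0
rank_2 = 1486.0
rank_30 = 1442.0

print(f"rank 1 vs rank 2:  {expected_win_rate(rank_1, rank_2):.3f}")
print(f"rank 1 vs rank 30: {expected_win_rate(rank_1, rank_30):.3f}")
```

Under this assumption, a 13-point gap corresponds to an expected win rate only slightly above one half, so adjacent ranks in the table are close to even matchups.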