PiazBench

Vision

How well can each AI model understand and analyze images? Ranked by vision-specific Arena Elo score.

| Rank | Provider | Model | Model ID | Score |
|---|---|---|---|---|
| 1 | Anthropic | Claude Opus 4.7 | anthropic/claude-opus-4-7 | 1319.0 |
| 2 | Anthropic | Claude Opus 4.6 | anthropic/claude-opus-4-6-thinking | 1317.0 |
| 3 | Google | Gemini 3 Pro | google/gemini-3-pro | 1303.0 |
| 4 | Meta | Muse Spark | meta-llama/muse-spark | 1303.0 |
| 5 | Google | Gemini 3.1 Pro | google/gemini-3.1-pro-preview | 1297.0 |
| 6 | OpenAI | GPT-5.5 | openai/gpt-5.5 | 1296.0 |
| 7 | OpenAI | GPT-5.4 | openai/gpt-5.4-high | 1292.0 |
| 8 | Anthropic | Claude Sonnet 4.6 | anthropic/claude-sonnet-4-6 | 1289.0 |
| 9 | OpenAI | GPT-5.2 | openai/gpt-5.2-chat-latest-20260210 | 1289.0 |
| 10 | Google | Gemini 3 Flash | google/gemini-3-flash | 1286.0 |
| 11 | Moonshot | kimi k2.6 | moonshot/kimi-k2.6 | 1275.0 |
| 12 | ByteDance | dola seed 2.0 pro | bytedance/dola-seed-2.0-pro | 1273.0 |
| 13 | Google | gemini 3 flash | google/gemini-3-flash (thinking-minimal) | 1273.0 |
| 14 | Alibaba | qwen3.7 plus | alibaba/qwen3.7-plus-preview | 1273.0 |
| 15 | Alibaba | qwen3.5 397b a17b | alibaba/qwen3.5-397b-a17b | 1264.0 |
| 16 | Moonshot | kimi k2.5 | moonshot/kimi-k2.5-thinking | 1263.0 |
| 17 | Google | Gemma 4 31B | google/gemma-4-31b | 1261.0 |
| 18 | OpenAI | GPT-5.2 | openai/gpt-5.2-high | 1260.0 |
| 19 | xAI | Grok 4.20 | xai/grok-4.20-beta-0309-reasoning | 1259.0 |
| 20 | OpenAI | GPT-5 | openai/gpt-5.1-high | 1259.0 |
| 21 | xAI | Grok 4 | xai/grok-4.3 | 1259.0 |
| 22 | Google | Gemma 4 26B A4B | google/gemma-4-26b-a4b | 1257.0 |
| 23 | Google | Gemini 2.5 Pro | google/gemini-2.5-pro | 1256.0 |
| 24 | xAI | Grok 4.20 | xai/grok-4.20-multi-agent-beta-0309 | 1254.0 |
| 25 | Google | Gemini 3.1 Flash Lite | google/gemini-3.1-flash-lite-preview | 1252.0 |
| 26 | OpenAI | chatgpt 4o | openai/chatgpt-4o-latest-20250326 | 1249.0 |
| 27 | Moonshot | kimi k2.5 instant | moonshot/kimi-k2.5-instant | 1248.0 |
| 28 | Xiaomi | Xiaomi: MiMo V2.5 | xiaomi/mimo-v2.5 | 1245.0 |
| 29 | OpenAI | GPT-5 | openai/gpt-5-chat | 1245.0 |
| 30 | Z.ai | glm 5v | zai/glm-5v-turbo | 1241.0 |

108 models tested
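
To get a feel for what a gap between two scores means, the sketch below converts a score difference into an expected head-to-head preference rate. It assumes the conventional Elo logistic curve with a 400-point scale; the page does not state how its Arena Elo scores are fitted, so treat the formula, the `scale` default, and the function name `expected_win_rate` as illustrative assumptions rather than the site's actual method.

```python
def expected_win_rate(score_a: float, score_b: float, scale: float = 400.0) -> float:
    """Expected rate at which A is preferred over B.

    Assumption: the conventional Elo logistic curve with a 400-point scale.
    The leaderboard does not document its own fitting procedure.
    """
    return 1.0 / (1.0 + 10.0 ** ((score_b - score_a) / scale))


# Scores copied from the leaderboard.
rank_1 = 1319.0   # Claude Opus 4.7
rank_2 = 1317.0   # Claude Opus 4.6
rank_30 = 1241.0  # glm 5v

print(f"rank 1 vs rank 2:  {expected_win_rate(rank_1, rank_2):.3f}")
print(f"rank 1 vs rank 30: {expected_win_rate(rank_1, rank_30):.3f}")
```

Under that assumption, the 2-point gap at the top is close to a coin flip, while the 78-point gap between rank 1 and rank 30 corresponds to the higher-ranked model being preferred in roughly 61% of comparisons.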