Chat · Vision · Diagram Leaderboard

Ranking for Vision / Diagram, based on public preference data.

Selection guide

Diagram model ranking guide

Top five: claude-opus-4-7, claude-opus-4-7-thinking, claude-opus-4-6-thinking, gpt-5.5, claude-opus-4-6
Models: 82
Published: 2026/05/18
Source: Arena public preference evaluation
Original leaderboard: Vision / Diagram
Leaderboard dataset: LMArena latest parquet

| Rank | Model | Provider | Score | Samples | Context | Price (Input / Output) |
|---|---|---|---|---|---|---|
| 1 | claude-opus-4-7 | Anthropic | 100.0 | 1.8K | 1M | ¥36 / ¥180 |
| 2 | claude-opus-4-7-thinking | Anthropic | 98.8 | 1.8K | 1M | ¥36 / ¥180 |
| 3 | claude-opus-4-6-thinking | Anthropic | 97.5 | 1.8K | 1M | ¥36 / ¥180 |
| 4 | gpt-5.5 | OpenAI | 96.3 | 1.2K | 1.05M | ¥36 / ¥216 |
| 5 | claude-opus-4-6 | Anthropic | 95.1 | 2.2K | 1M | ¥36 / ¥180 |
| 6 | gpt-5.4-high | OpenAI | 93.8 | 1.6K | 1.05M | ¥18 / ¥108 |
| 7 | muse-spark | Meta | 92.6 | 1.2K | - | - |
| 8 | gemini-3.1-pro-preview | Google | 91.4 | 4.2K | 1.05M | ¥14.4 / ¥86.4 |
| 9 | claude-sonnet-4-6 | Anthropic | 90.1 | 2.4K | 1M | ¥21.6 / ¥108 |
| 10 | gpt-5.4 | OpenAI | 88.9 | 1.5K | 1.05M | ¥18 / ¥108 |
| 11 | gpt-5.5-high | OpenAI | 87.7 | 1.2K | 1.05M | ¥36 / ¥216 |
| 12 | gemini-3-pro | Google | 86.4 | 2.9K | 1.05M | ¥14.4 / ¥86.4 |
| 13 | kimi-k2.6 | Moonshot | 85.2 | 1.5K | 262K | ¥6.84 / ¥28.8 |
| 14 | qwen3.7-plus-preview | Alibaba | 84.0 | 977 | 131K | ¥3.6 / ¥21.6 |
| 15 | gemini-3-flash | Google | 82.7 | 5.1K | 1.05M | ¥3.6 / ¥21.6 |
| 16 | dola-seed-2.0-pro | ByteDance | 81.5 | 2.3K | - | - |
| 17 | gpt-5.2-chat-latest-20260210 | OpenAI | 80.2 | 3.1K | 400K | ¥12.6 / ¥101 |
| 18 | kimi-k2.5-thinking | Moonshot | 79.0 | 3.6K | 262K | ¥4.32 / ¥21.6 |
| 19 | qwen3.5-397b-a17b | Alibaba | 77.8 | 3K | 262K | ¥3.1 / ¥18.6 |
| 20 | gemma-4-31b | Google | 76.5 | 4.4K | 262K | ¥3.24 / ¥7.2 |
| 21 | glm-5v-turbo | Z.ai | 75.3 | 1.9K | 200K | ¥0 / ¥0 |
| 22 | gemini-3-flash (thinking-minimal) | Google | 74.1 | 4.9K | 1.05M | ¥3.6 / ¥21.6 |
| 23 | gemini-2.5-pro | Google | 72.8 | 8.9K | 1.05M | ¥9 / ¥72 |
| 24 | gpt-5.5-instant | OpenAI | 71.6 | 1.1K | 400K | ¥9 / ¥72 |
| 25 | qwen-vl-max-2025-08-13 | Alibaba | 70.4 | 356 | 131K | ¥1.66 / ¥4.13 |
| 26 | gpt-5.2-high | OpenAI | 69.1 | 3.8K | 400K | ¥12.6 / ¥101 |
| 27 | gpt-5.1-high | OpenAI | 67.9 | 2.2K | 400K | ¥9 / ¥72 |
| 28 | gemma-4-26b-a4b | Google | 66.7 | 2.7K | 262K | ¥0.94 / ¥2.88 |
| 29 | mimo-v2.5 | Xiaomi | 65.4 | 1.9K | 1.05M | ¥2.88 / ¥14.4 |
| 30 | kimi-k2.5-instant | Moonshot | 64.2 | 911 | 262K | ¥4.32 / ¥21.6 |
| 31 | chatgpt-4o-latest-20250326 | OpenAI | 63.0 | 3.7K | 128K | ¥18 / ¥72 |
| 32 | qwen3-vl-235b-a22b-instruct | Alibaba | 61.7 | 2.7K | 128K | ¥2.16 / ¥8.64 |
| 33 | grok-4.20-multi-agent-beta-0309 | xAI | 60.5 | 2.2K | 2M | ¥14.4 / ¥43.2 |
| 34 | gpt-5.4-mini-high | OpenAI | 59.3 | 2.2K | 400K | ¥5.4 / ¥32.4 |
| 35 | gpt-5-chat | OpenAI | 58.0 | 2.6K | 400K | ¥9 / ¥72 |
| 36 | grok-4.20-beta-0309-reasoning | xAI | 56.8 | 2.5K | 2M | ¥14.4 / ¥43.2 |
| 37 | gpt-5.2 | OpenAI | 55.6 | 4K | 400K | ¥12.6 / ¥101 |
| 38 | gemini-2.5-flash-preview-09-2025 | Google | 54.3 | 973 | 1M | ¥2.16 / ¥18 |
| 39 | mimo-v2-omni | Xiaomi | 53.1 | 2K | 262K | ¥2.88 / ¥14.4 |
| 40 | grok-4.3 | xAI | 51.9 | 1.1K | 1M | ¥9 / ¥18 |
| 41 | gemini-2.5-flash | Google | 50.6 | 7.7K | 1.05M | ¥2.16 / ¥18 |
| 42 | gpt-5.1 | OpenAI | 49.4 | 2.4K | 400K | ¥9 / ¥72 |
| 43 | gemini-3.1-flash-lite-preview | Google | 48.1 | 3.5K | 1.05M | ¥1.8 / ¥10.8 |
| 44 | qwen3.5-122b-a10b | Alibaba | 46.9 | 2.6K | 262K | ¥2.88 / ¥23 |
| 45 | qwen3.5-27b | Alibaba | 45.7 | 2.4K | 262K | ¥2.16 / ¥17.3 |
| 46 | ernie-5.0-preview-1220 | Baidu | 44.4 | 704 | 128K | ¥7.92 / ¥14.4 |
| 47 | gpt-5-high | OpenAI | 43.2 | 3K | 400K | ¥9 / ¥72 |
| 48 | o3-2025-04-16 | OpenAI | 42.0 | 3.6K | 200K | ¥14.4 / ¥57.6 |
| 49 | gpt-4.1-2025-04-14 | OpenAI | 40.7 | 2.8K | 1.05M | ¥14.4 / ¥57.6 |
| 50 | qwen3-vl-235b-a22b-thinking | Alibaba | 39.5 | 479 | 131K | ¥2.06 / ¥8.26 |
| 51 | gpt-5-mini-high | OpenAI | 38.3 | 2K | 400K | ¥1.8 / ¥14.4 |
| 52 | grok-4-0709 | xAI | 37.0 | 2.5K | 256K | ¥21.6 / ¥108 |
| 53 | grok-4-1-fast-reasoning | xAI | 35.8 | 2.7K | 2M | ¥1.44 / ¥3.6 |
| 54 | gpt-5.4-nano-high | OpenAI | 34.6 | 2.2K | 400K | ¥1.44 / ¥9 |
| 55 | gemini-2.5-flash-lite-preview-09-2025-no-thinking | Google | 33.3 | 986 | 1.05M | ¥0.72 / ¥2.88 |
| 56 | o4-mini-2025-04-16 | OpenAI | 32.1 | 2.9K | 200K | ¥7.92 / ¥31.7 |
| 57 | claude-sonnet-4-20250514-thinking-32k | Anthropic | 30.9 | 230 | 200K | ¥21.6 / ¥108 |
| 58 | hunyuan-vision-1.5-thinking | Tencent | 29.6 | 497 | - | - |
| 59 | gemini-2.5-flash-lite-preview-06-17-thinking | Google | 28.4 | 2.5K | 65.5K | ¥0.72 / ¥2.88 |
| 60 | step-1o-turbo-202506 | StepFun | 27.2 | 394 | - | - |
| 61 | gpt-4.1-mini-2025-04-14 | OpenAI | 25.9 | 2.4K | 1.05M | ¥2.88 / ¥11.5 |
| 62 | claude-sonnet-4-20250514 | Anthropic | 24.7 | 397 | 200K | ¥21.6 / ¥108 |
| 63 | glm-4.6v | Z.ai | 23.5 | 551 | 128K | ¥2.16 / ¥6.48 |
| 64 | claude-3-7-sonnet-20250219-thinking-32k | Anthropic | 22.2 | 304 | - | - |
| 65 | claude-opus-4-20250514 | Anthropic | 21.0 | 471 | 200K | ¥108 / ¥540 |
| 66 | mistral-medium-2508 | Mistral | 19.8 | 3.7K | 262K | ¥2.88 / ¥14.4 |
| 67 | claude-opus-4-20250514-thinking-16k | Anthropic | 18.5 | 285 | 200K | ¥108 / ¥540 |
| 68 | gemma-3-27b-it | Google | 17.3 | 1.8K | 128K | ¥2.15 / ¥2.15 |
| 69 | mistral-medium-2505 | Mistral | 16.0 | 1.4K | 262K | ¥2.88 / ¥14.4 |
| 70 | glm-4.5v | Z.ai | 14.8 | 376 | 64K | ¥4.32 / ¥13 |
| 71 | step-3 | StepFun | 13.6 | 364 | 65.5K | ¥1.8 / ¥4.68 |
| 72 | claude-3-7-sonnet-20250219 | Anthropic | 12.3 | 306 | 200K | ¥21.6 / ¥108 |
| 73 | gemini-2.0-flash-001 | Google | 11.1 | 1.1K | 1.05M | ¥1.08 / ¥4.32 |
| 74 | gpt-5-nano-high | OpenAI | 9.9 | 527 | 400K | ¥0.36 / ¥2.88 |
| 75 | hunyuan-large-vision | Tencent | 8.6 | 263 | - | - |
| 76 | llama-4-maverick-17b-128e-instruct | Meta | 7.4 | 970 | 1M | ¥1.8 / ¥6.26 |
| 77 | claude-3-5-sonnet-20241022 | Anthropic | 6.2 | 318 | 200K | ¥21.6 / ¥108 |
| 78 | mistral-small-2506 | Mistral | 4.9 | 1.2K | 262K | ¥2.88 / ¥14.4 |
| 79 | llama-4-scout-17b-16e-instruct | Meta | 3.7 | 867 | 128K | ¥1.44 / ¥5.62 |
| 80 | mistral-small-3.1-24b-instruct-2503 | Mistral | 2.5 | 1.7K | 262K | ¥2.88 / ¥14.4 |
| 81 | claude-3-5-haiku-20241022 | Anthropic | 1.2 | 325 | 200K | ¥5.76 / ¥28.8 |
| 82 | molmo-2-8b | AllenAI | 0.0 | 281 | - | - |

Top model analysis

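Why claude-opus-4-7 ranks first

claude-opus-4-7 ranks first with a percent score of 100.0, based on 1.8K samples. The runner-up, claude-opus-4-7-thinking, follows at 98.8 with the same sample count. Treat claude-opus-4-7 as the first candidate for this leaderboard, then weigh its price (¥36 / ¥180 input/output), context length (1M), and availability against the alternatives.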

How to choose

Look beyond rank #1

Start with the leaderboard closest to your task. Compare the top models by score and sample size, then check price, context length, open or closed access, and provider availability.
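
The selection steps above can be sketched as a simple filter. This is a minimal illustration, not part of the leaderboard itself: the `Entry` type, the `shortlist` function, and the thresholds are hypothetical, the rows are a handful copied from the table, and abbreviated figures such as "1.8K" or "262K" are expanded to approximate integers. The page does not state the unit behind the prices, so they are compared as plain numbers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entry:
    model: str
    score: float                # percent score within this leaderboard
    samples: int                # approximate sample count ("1.8K" -> 1800)
    context: Optional[int]      # context length in tokens, None if listed as "-"
    price_out: Optional[float]  # output price in yen, None if listed as "-"


# A few rows copied from the leaderboard table (assumed expansions of K/M).
ENTRIES = [
    Entry("claude-opus-4-7", 100.0, 1800, 1_000_000, 180.0),
    Entry("gpt-5.5", 96.3, 1200, 1_050_000, 216.0),
    Entry("muse-spark", 92.6, 1200, None, None),
    Entry("gemini-3.1-pro-preview", 91.4, 4200, 1_050_000, 86.4),
    Entry("kimi-k2.6", 85.2, 1500, 262_000, 28.8),
    Entry("qwen3.7-plus-preview", 84.0, 977, 131_000, 21.6),
    Entry("gemini-3-flash", 82.7, 5100, 1_050_000, 21.6),
]


def shortlist(entries, min_samples, max_price_out, min_context):
    """Keep entries that meet every constraint, highest score first."""
    kept = [
        e for e in entries
        if e.samples >= min_samples
        and e.price_out is not None and e.price_out <= max_price_out
        and e.context is not None and e.context >= min_context
    ]
    return sorted(kept, key=lambda e: e.score, reverse=True)


# Illustrative thresholds only; pick values that match your own budget and task.
picks = shortlist(ENTRIES, min_samples=1000, max_price_out=100.0, min_context=200_000)
print([e.model for e in picks])
```

With these example thresholds the top-ranked model drops out on price, which is the point of the guide: rank is only the first filter. Models with no listed price or context are excluded here; whether to exclude them or look them up elsewhere is a judgment call.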

FAQ

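Which metrics matter on the diagram understanding leaderboard?

Look mainly at rank, the score out of 100, sample size, and source. The score gives a quick comparison of models within the same leaderboard; the sample size indicates how stable the result is.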
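
The scores listed on this page appear consistent with a linear rescaling of rank, where rank 1 maps to 100.0 and the last rank maps to 0.0. This is an observation from the published values, not a formula documented by the source; the function name `percent_score` below is hypothetical.

```python
def percent_score(rank: int, n_models: int) -> float:
    """Map rank 1..n_models linearly onto 100..0, rounded to one decimal.

    Assumed reading of the page's scores, inferred from the listed values.
    """
    return round(100 * (n_models - rank) / (n_models - 1), 1)


# Compare against a few rows of the 82-model table.
for rank in (1, 2, 41, 82):
    print(rank, percent_score(rank, 82))
```

If this reading holds, adjacent ranks always differ by about 1.2 points, so the size of a score gap says nothing about how strongly voters preferred one model over another. Sample size remains the better signal of how stable a position is.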

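Why can't different leaderboards be merged directly into one overall score?

Different leaderboards use different tasks, samples, and evaluation criteria. By default, 模力榜 ranks models only within a single leaderboard, to avoid forcing writing, code, image, and other capabilities into one number.

How should I choose a diagram understanding model?

Start with the leaderboard closest to your task, then factor in price, context length, open versus closed source, and provider availability. A high rank does not mean a model suits every budget or deployment method.

How often is the leaderboard updated?

The page shows the most recent successfully collected public leaderboard data. It currently prefers the LMArena leaderboard dataset and keeps the original link in the page's source section.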