Chat · Vision · Chinese Leaderboard

Ranking for Vision / Chinese, based on public preference data.

Selection guide

Chinese model ranking guide

Top 5: claude-opus-4-6-thinking, claude-opus-4-6, gemini-3.1-pro-preview, gemini-3-pro, claude-opus-4-7-thinking
Current directory: Chat · Vision · Chinese
Models: 93
Published: 2026/05/18
Source: Arena public preference evaluation (original leaderboard: Vision / Chinese)
Leaderboard dataset: LMArena latest parquet
| Rank | Model | Provider | Score | Samples | Context | Price (input / output) |
|---|---|---|---|---|---|---|
| 1 | claude-opus-4-6-thinking | Anthropic | 100.0 | 435 | 1M | ¥36 / ¥180 |
| 2 | claude-opus-4-6 | Anthropic | 98.9 | 466 | 1M | ¥36 / ¥180 |
| 3 | gemini-3.1-pro-preview | Google | 97.8 | 1K | 1.05M | ¥14.4 / ¥86.4 |
| 4 | gemini-3-pro | Google | 96.7 | 1.4K | 1.05M | ¥14.4 / ¥86.4 |
| 5 | claude-opus-4-7-thinking | Anthropic | 95.7 | 386 | 1M | ¥36 / ¥180 |
| 6 | muse-spark | Meta | 94.6 | 325 | - | - |
| 7 | kimi-k2.6 | Moonshot | 93.5 | 368 | 262K | ¥6.84 / ¥28.8 |
| 8 | gpt-5.5 | OpenAI | 92.4 | 315 | 1.05M | ¥36 / ¥216 |
| 9 | gpt-5.4-high | OpenAI | 91.3 | 357 | 1.05M | ¥18 / ¥108 |
| 10 | claude-opus-4-7 | Anthropic | 90.2 | 493 | 1M | ¥36 / ¥180 |
| 11 | kimi-k2.5-instant | Moonshot | 89.1 | 170 | 262K | ¥4.32 / ¥21.6 |
| 12 | gpt-5.5-high | OpenAI | 88.0 | 256 | 1.05M | ¥36 / ¥216 |
| 13 | gemini-3-flash | Google | 87.0 | 1.1K | 1.05M | ¥3.6 / ¥21.6 |
| 14 | gemma-4-26b-a4b | Google | 85.9 | 668 | 262K | ¥0.94 / ¥2.88 |
| 15 | glm-5v-turbo | Z.ai | 84.8 | 488 | 200K | ¥0 / ¥0 |
| 16 | gpt-5.4 | OpenAI | 83.7 | 322 | 1.05M | ¥18 / ¥108 |
| 17 | gemma-4-31b | Google | 82.6 | 1.1K | 262K | ¥3.24 / ¥7.2 |
| 18 | dola-seed-2.0-pro | ByteDance | 81.5 | 495 | - | - |
| 19 | qwen3.5-397b-a17b | Alibaba | 80.4 | 570 | 262K | ¥3.1 / ¥18.6 |
| 20 | qwen3.5-27b | Alibaba | 79.3 | 530 | 262K | ¥2.16 / ¥17.3 |
| 21 | gemini-3-flash (thinking-minimal) | Google | 78.3 | 989 | 1.05M | ¥3.6 / ¥21.6 |
| 22 | kimi-k2.5-thinking | Moonshot | 77.2 | 744 | 262K | ¥4.32 / ¥21.6 |
| 23 | grok-4.20-multi-agent-beta-0309 | xAI | 76.1 | 525 | 2M | ¥14.4 / ¥43.2 |
| 24 | mimo-v2.5 | Xiaomi | 75.0 | 379 | 1.05M | ¥2.88 / ¥14.4 |
| 25 | qwen3.7-plus-preview | Alibaba | 73.9 | 216 | 131K | ¥3.6 / ¥21.6 |
| 26 | qwen3.5-122b-a10b | Alibaba | 72.8 | 562 | 262K | ¥2.88 / ¥23 |
| 27 | claude-sonnet-4-6 | Anthropic | 71.7 | 524 | 1M | ¥21.6 / ¥108 |
| 28 | gpt-5.4-mini-high | OpenAI | 70.7 | 517 | 400K | ¥5.4 / ¥32.4 |
| 29 | gpt-5.2-high | OpenAI | 69.6 | 778 | 400K | ¥12.6 / ¥101 |
| 30 | grok-4.20-beta-0309-reasoning | xAI | 68.5 | 560 | 2M | ¥14.4 / ¥43.2 |
| 31 | gemini-2.5-pro | Google | 67.4 | 2.9K | 1.05M | ¥9 / ¥72 |
| 32 | gemini-2.5-flash-preview-09-2025 | Google | 66.3 | 391 | 1M | ¥2.16 / ¥18 |
| 33 | qwen3-vl-235b-a22b-instruct | Alibaba | 65.2 | 719 | 128K | ¥2.16 / ¥8.64 |
| 34 | gemini-2.5-flash | Google | 64.1 | 2.4K | 1.05M | ¥2.16 / ¥18 |
| 35 | gemini-3.1-flash-lite-preview | Google | 63.0 | 812 | 1.05M | ¥1.8 / ¥10.8 |
| 36 | mimo-v2-omni | Xiaomi | 62.0 | 397 | 262K | ¥2.88 / ¥14.4 |
| 37 | gpt-5.1-high | OpenAI | 60.9 | 453 | 400K | ¥9 / ¥72 |
| 38 | gpt-5.2 | OpenAI | 59.8 | 826 | 400K | ¥12.6 / ¥101 |
| 39 | gpt-5.2-chat-latest-20260210 | OpenAI | 58.7 | 636 | 400K | ¥12.6 / ¥101 |
| 40 | ernie-5.0-preview-1220 | Baidu | 57.6 | 173 | 128K | ¥7.92 / ¥14.4 |
| 41 | gpt-5.1 | OpenAI | 56.5 | 558 | 400K | ¥9 / ¥72 |
| 42 | gpt-5-chat | OpenAI | 55.4 | 1.3K | 400K | ¥9 / ¥72 |
| 43 | gpt-5.4-nano-high | OpenAI | 54.3 | 537 | 400K | ¥1.44 / ¥9 |
| 44 | grok-4-0709 | xAI | 53.3 | 1.2K | 256K | ¥21.6 / ¥108 |
| 45 | chatgpt-4o-latest-20250326 | OpenAI | 52.2 | 1.1K | 128K | ¥18 / ¥72 |
| 46 | gpt-5-high | OpenAI | 51.1 | 1.4K | 400K | ¥9 / ¥72 |
| 47 | gemini-2.5-flash-lite-preview-09-2025-no-thinking | Google | 50.0 | 405 | 1.05M | ¥0.72 / ¥2.88 |
| 48 | o3-2025-04-16 | OpenAI | 48.9 | 1.6K | 200K | ¥14.4 / ¥57.6 |
| 49 | o1-2024-12-17 | OpenAI | 47.8 | 99 | 128K | ¥108 / ¥432 |
| 50 | gpt-5.5-instant | OpenAI | 46.7 | 231 | 400K | ¥9 / ¥72 |
| 51 | gpt-4.1-2025-04-14 | OpenAI | 45.7 | 1.4K | 1.05M | ¥14.4 / ¥57.6 |
| 52 | grok-4-1-fast-reasoning | xAI | 44.6 | 632 | 2M | ¥1.44 / ¥3.6 |
| 53 | gpt-5-mini-high | OpenAI | 43.5 | 1K | 400K | ¥1.8 / ¥14.4 |
| 54 | o4-mini-2025-04-16 | OpenAI | 42.4 | 1.3K | 200K | ¥7.92 / ¥31.7 |
| 55 | gemini-2.5-flash-lite-preview-06-17-thinking | Google | 41.3 | 1K | 65.5K | ¥0.72 / ¥2.88 |
| 56 | gpt-4.1-mini-2025-04-14 | OpenAI | 40.2 | 1.1K | 1.05M | ¥2.88 / ¥11.5 |
| 57 | gemini-1.5-pro-002 | Google | 39.1 | 395 | - | - |
| 58 | mistral-medium-2508 | Mistral | 38.0 | 1.4K | 262K | ¥2.88 / ¥14.4 |
| 59 | gpt-4.5-preview-2025-02-27 | OpenAI | 37.0 | 77 | 8.19K | ¥216 / ¥432 |
| 60 | gemma-3-27b-it | Google | 35.9 | 734 | 128K | ¥2.15 / ¥2.15 |
| 61 | claude-3-7-sonnet-20250219 | Anthropic | 34.8 | 153 | 200K | ¥21.6 / ¥108 |
| 62 | mistral-medium-2505 | Mistral | 33.7 | 477 | 262K | ¥2.88 / ¥14.4 |
| 63 | mistral-small-2506 | Mistral | 32.6 | 481 | 262K | ¥2.88 / ¥14.4 |
| 64 | gemini-2.0-flash-001 | Google | 31.5 | 339 | 1.05M | ¥1.08 / ¥4.32 |
| 65 | mistral-small-3.1-24b-instruct-2503 | Mistral | 30.4 | 718 | 262K | ¥2.88 / ¥14.4 |
| 66 | gemini-1.5-flash-002 | Google | 29.3 | 370 | 2M | ¥0.54 / ¥2.2 |
| 67 | gpt-4o-2024-05-13 | OpenAI | 28.3 | 1.5K | 128K | ¥36 / ¥108 |
| 68 | llama-4-maverick-17b-128e-instruct | Meta | 27.2 | 232 | 1M | ¥1.8 / ¥6.26 |
| 69 | claude-3-5-sonnet-20241022 | Anthropic | 26.1 | 513 | 200K | ¥21.6 / ¥108 |
| 70 | qwen2.5-vl-72b-instruct | Alibaba | 25.0 | 119 | 131K | ¥16.5 / ¥49.5 |
| 71 | claude-3-5-sonnet-20240620 | Anthropic | 23.9 | 1.6K | 200K | ¥21.6 / ¥108 |
| 72 | pixtral-large-2411 | Mistral | 22.8 | 156 | 128K | ¥14.4 / ¥43.2 |
| 73 | llama-4-scout-17b-16e-instruct | Meta | 21.7 | 237 | 128K | ¥1.44 / ¥5.62 |
| 74 | qwen2-vl-72b | Alibaba | 20.7 | 291 | - | - |
| 75 | internvl2-26b | - | 19.6 | 287 | - | - |
| 76 | gpt-4-turbo-2024-04-09 | OpenAI | 18.5 | 937 | 128K | ¥72 / ¥216 |
| 77 | gpt-4o-mini-2024-07-18 | OpenAI | 17.4 | 952 | 128K | ¥1.08 / ¥4.32 |
| 78 | gemini-1.5-pro-001 | Google | 16.3 | 1.2K | - | - |
| 79 | gemini-2.0-flash-lite-preview-02-05 | Google | 15.2 | 98 | 1.05M | ¥0.54 / ¥2.16 |
| 80 | gpt-4o-2024-08-06 | OpenAI | 14.1 | 186 | 128K | ¥18 / ¥72 |
| 81 | qwen2-vl-7b-instruct | Alibaba | 13.0 | 298 | 131K | ¥2.07 / ¥5.16 |
| 82 | claude-3-opus-20240229 | Anthropic | 12.0 | 1K | 200K | ¥108 / ¥540 |
| 83 | gemini-1.5-flash-8b-001 | Google | 10.9 | 332 | 2M | ¥0.54 / ¥2.2 |
| 84 | gemini-1.5-flash-001 | Google | 9.8 | 950 | 2M | ¥0.54 / ¥2.2 |
| 85 | llama-3.2-vision-90b-instruct | Meta | 8.7 | 393 | 131K | ¥2.48 / ¥2.48 |
| 86 | pixtral-12b-2409 | Mistral | 7.6 | 335 | 128K | ¥1.08 / ¥1.08 |
| 87 | internvl2-4b | - | 6.5 | 177 | - | - |
| 88 | claude-3-sonnet-20240229 | Anthropic | 5.4 | 915 | 200K | ¥21.6 / ¥108 |
| 89 | molmo-72b-0924 | AllenAI | 4.3 | 182 | - | - |
| 90 | claude-3-haiku-20240307 | Anthropic | 3.3 | 1K | 200K | ¥1.8 / ¥9 |
| 91 | llama-3.2-vision-11b-instruct | Meta | 2.2 | 266 | 131K | ¥2.48 / ¥2.48 |
| 92 | llava-v1.6-34b | - | 1.1 | 415 | - | - |
| 93 | molmo-7b-d-0924 | AllenAI | 0.0 | 171 | - | - |
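
To work with the leaderboard values programmatically, the abbreviated sample counts, context sizes, and price pairs need converting to numbers first. The following is a minimal sketch, not part of the page itself: `parse_count` and `parse_price` are hypothetical helpers, and treating K and M as decimal multipliers is an assumption. The page rounds its figures (for example 262K or 1.05M), so parsed values are approximations of whatever the underlying numbers are.

```python
import re

# Assumption: K and M are decimal multipliers; the page shows rounded values.
_MULTIPLIER = {"": 1, "K": 1_000, "M": 1_000_000}


def parse_count(text):
    """Convert '435', '1.4K' or '1.05M' to an int; '-' (missing) becomes None."""
    text = text.strip()
    if text == "-":
        return None
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([KM]?)", text)
    if match is None:
        raise ValueError(f"unrecognised count: {text!r}")
    return round(float(match.group(1)) * _MULTIPLIER[match.group(2)])


def parse_price(text):
    """Convert '¥36 / ¥180' to (36.0, 180.0); '-' (missing) becomes None.

    A trailing 'Input/Output' label, as in the raw page text, is tolerated.
    """
    text = text.strip()
    if text == "-":
        return None
    match = re.fullmatch(
        r"¥(\d+(?:\.\d+)?)\s*/\s*¥(\d+(?:\.\d+)?)(?:\s*Input/Output)?", text
    )
    if match is None:
        raise ValueError(f"unrecognised price: {text!r}")
    return float(match.group(1)), float(match.group(2))


print(parse_count("1.4K"), parse_count("1.05M"), parse_price("¥36 / ¥180"))
```

Returning `None` for "-" keeps missing context and price data distinct from a genuine zero such as the ¥0 / ¥0 entry.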
Top model analysis

Why claude-opus-4-6-thinking ranks first

claude-opus-4-6-thinking ranks first with a percent score of 100.0 from 435 samples. Treat it as the default starting point for this leaderboard, then compare its price (¥36 / ¥180 input/output), context length (1M), and availability against the alternatives.
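
The listed percent scores appear consistent with a linear rescaling of rank across the 93 entries, which would explain why rank 1 scores exactly 100.0 and rank 93 scores 0.0. This is an observation from the published values, not a formula the page documents. The sketch below uses a hypothetical helper, `percent_score`, to check a few listed values under that assumption.

```python
def percent_score(rank: int, total: int) -> float:
    """Assumed linear rank rescaling: rank 1 -> 100.0, last rank -> 0.0."""
    if total < 2 or not 1 <= rank <= total:
        raise ValueError("need total >= 2 and 1 <= rank <= total")
    return round(100 * (total - rank) / (total - 1), 1)


# (rank, listed score) pairs copied from the leaderboard.
listed = {1: 100.0, 2: 98.9, 5: 95.7, 47: 50.0, 93: 0.0}

for rank, score in listed.items():
    print(rank, percent_score(rank, 93), score)
```

If that reading is right, the gap between two scores reflects only their distance in rank, not the size of the underlying preference margin.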

How to choose

Do not look only at rank #1

Start with the leaderboard closest to your task. Compare the top models by score and sample size, then check price, context length, open or closed access, and provider availability.
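
The selection steps above can be sketched as a simple filter. This is an illustration under stated assumptions rather than a recommended procedure: the `Entry` type, the `shortlist` function, and the threshold values are hypothetical, and the sample counts and context sizes are approximations of the abbreviated figures on the page (1.4K taken as 1400, 262K as 262000). Prices are in ¥, in whatever unit the page uses.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Entry:
    rank: int
    model: str
    score: float
    samples: int                # approximate where the page abbreviates
    context: Optional[int]      # tokens; None where the page shows "-"
    price_out: Optional[float]  # ¥ output price; None where the page shows "-"


# A few rows copied from the leaderboard.
ENTRIES = [
    Entry(1, "claude-opus-4-6-thinking", 100.0, 435, 1_000_000, 180.0),
    Entry(4, "gemini-3-pro", 96.7, 1_400, 1_050_000, 86.4),
    Entry(6, "muse-spark", 94.6, 325, None, None),
    Entry(7, "kimi-k2.6", 93.5, 368, 262_000, 28.8),
    Entry(11, "kimi-k2.5-instant", 89.1, 170, 262_000, 21.6),
    Entry(13, "gemini-3-flash", 87.0, 1_100, 1_050_000, 21.6),
]


def shortlist(entries: List[Entry], min_samples: int,
              max_price_out: float, min_context: int) -> List[Entry]:
    """Keep entries that meet every constraint, ordered by rank.

    Entries with missing price or context are dropped, since they
    cannot be checked against the constraints.
    """
    keep = [
        e for e in entries
        if e.samples >= min_samples
        and e.price_out is not None and e.price_out <= max_price_out
        and e.context is not None and e.context >= min_context
    ]
    return sorted(keep, key=lambda e: e.rank)


picks = shortlist(ENTRIES, min_samples=300, max_price_out=100.0,
                  min_context=200_000)
for e in picks:
    print(e.rank, e.model, e.score)
```

With these example thresholds the top-ranked model drops out on price and kimi-k2.5-instant drops out on sample size, which is the point of not stopping at rank #1.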

FAQ

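Which metrics matter on the Chinese leaderboard?

Look mainly at rank, percent score, sample size, and source. The score is for quickly comparing model performance within a single leaderboard; the sample size indicates how stable the result is.

Why can't different leaderboards be merged directly into one overall score?

Different leaderboards use different tasks, samples, and evaluation criteria. By default, 模力榜 ranks models only within the same leaderboard, so that abilities such as writing, coding, and image handling are not forced into a single number.

How should I choose a model for Chinese?

Start with the leaderboard closest to your task, then weigh price, context length, open versus closed source, and provider availability. A high rank does not mean a model suits every budget or deployment setup.

How often is the leaderboard updated?

The page shows the most recent successfully collected public leaderboard data. It currently gives priority to the LMArena leaderboard dataset and keeps the original links in the page's source section.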