Chat · Vision · English Leaderboard

Ranking for Vision / English, based on public preference data.

Selection guide

English model ranking guide

Top models: claude-opus-4-7 · claude-opus-4-6-thinking · claude-opus-4-7-thinking · claude-opus-4-6 · muse-spark

Current directory: Chat · Vision · English
Models: 126
Published: 2026/05/18
Source: Arena public preference evaluation
Original leaderboard: Vision / English
Leaderboard dataset: LMArena latest parquet

Rank | Model | Provider | Score | Samples | Context | Price (Input / Output)
1 | claude-opus-4-7 | Anthropic | 100.0 | 3K | 1M | ¥36 / ¥180
2 | claude-opus-4-6-thinking | Anthropic | 99.2 | 3.1K | 1M | ¥36 / ¥180
3 | claude-opus-4-7-thinking | Anthropic | 98.4 | 3K | 1M | ¥36 / ¥180
4 | claude-opus-4-6 | Anthropic | 97.6 | 3.7K | 1M | ¥36 / ¥180
5 | muse-spark | Meta | 96.8 | 1.9K | - | -
6 | claude-sonnet-4-6 | Anthropic | 96.0 | 3.8K | 1M | ¥21.6 / ¥108
7 | gemini-3-pro | Google | 95.2 | 5.4K | 1.05M | ¥14.4 / ¥86.4
8 | dola-seed-2.0-pro | ByteDance | 94.4 | 3.5K | - | -
9 | gpt-5.5 | OpenAI | 93.6 | 2K | 1.05M | ¥36 / ¥216
10 | gemini-3.1-pro-preview | Google | 92.8 | 6.8K | 1.05M | ¥14.4 / ¥86.4
11 | gpt-5.4-high | OpenAI | 92.0 | 2.5K | 1.05M | ¥18 / ¥108
12 | gpt-5.4 | OpenAI | 91.2 | 2.4K | 1.05M | ¥18 / ¥108
13 | kimi-k2.6 | Moonshot | 90.4 | 2.3K | 262K | ¥6.84 / ¥28.8
14 | gpt-5.5-high | OpenAI | 89.6 | 1.8K | 1.05M | ¥36 / ¥216
15 | gemini-3-flash | Google | 88.8 | 8.8K | 1.05M | ¥3.6 / ¥21.6
16 | kimi-k2.5-thinking | Moonshot | 88.0 | 5.9K | 262K | ¥4.32 / ¥21.6
17 | gpt-5.2-chat-latest-20260210 | OpenAI | 87.2 | 5.1K | 400K | ¥12.6 / ¥101
18 | glm-5v-turbo | Z.ai | 86.4 | 3.3K | 200K | ¥0 / ¥0
19 | qwen3.7-plus-preview | Alibaba | 85.6 | 1.6K | 131K | ¥3.6 / ¥21.6
20 | qwen3.5-397b-a17b | Alibaba | 84.8 | 4.9K | 262K | ¥3.1 / ¥18.6
21 | gemini-2.5-pro | Google | 84.0 | 41.7K | 1.05M | ¥9 / ¥72
22 | grok-4.20-beta-0309-reasoning | xAI | 83.2 | 3.9K | 2M | ¥14.4 / ¥43.2
23 | gemma-4-31b | Google | 82.4 | 7.1K | 262K | ¥3.24 / ¥7.2
24 | kimi-k2.5-instant | Moonshot | 81.6 | 1.6K | 262K | ¥4.32 / ¥21.6
25 | gemini-2.5-flash-preview-09-2025 | Google | 80.8 | 2.1K | 1M | ¥2.16 / ¥18
26 | gemini-3-flash (thinking-minimal) | Google | 80.0 | 8.1K | 1.05M | ¥3.6 / ¥21.6
27 | gpt-5.5-instant | OpenAI | 79.2 | 1.6K | 400K | ¥9 / ¥72
28 | grok-4.20-multi-agent-beta-0309 | xAI | 78.4 | 3.7K | 2M | ¥14.4 / ¥43.2
29 | gpt-5.1-high | OpenAI | 77.6 | 3.9K | 400K | ¥9 / ¥72
30 | qwen3-vl-235b-a22b-instruct | Alibaba | 76.8 | 5.1K | 128K | ¥2.16 / ¥8.64
31 | ernie-5.0-preview-1220 | Baidu | 76.0 | 1.3K | 128K | ¥7.92 / ¥14.4
32 | gemma-4-26b-a4b | Google | 75.2 | 4.4K | 262K | ¥0.94 / ¥2.88
33 | mimo-v2.5 | Xiaomi | 74.4 | 2.8K | 1.05M | ¥2.88 / ¥14.4
34 | gpt-5.4-mini-high | OpenAI | 73.6 | 3.6K | 400K | ¥5.4 / ¥32.4
35 | gpt-5.2-high | OpenAI | 72.8 | 6.3K | 400K | ¥12.6 / ¥101
36 | chatgpt-4o-latest-20250326 | OpenAI | 72.0 | 11.1K | 128K | ¥18 / ¥72
37 | qwen3.5-27b | Alibaba | 71.2 | 3.9K | 262K | ¥2.16 / ¥17.3
38 | qwen3.5-122b-a10b | Alibaba | 70.4 | 4.2K | 262K | ¥2.88 / ¥23
39 | gemini-2.5-flash | Google | 69.6 | 25.8K | 1.05M | ¥2.16 / ¥18
40 | mimo-v2-omni | Xiaomi | 68.8 | 3.1K | 262K | ¥2.88 / ¥14.4
41 | grok-4.3 | xAI | 68.0 | 1.7K | 1M | ¥9 / ¥18
42 | gemini-3.1-flash-lite-preview | Google | 67.2 | 5.7K | 1.05M | ¥1.8 / ¥10.8
43 | gpt-5-chat | OpenAI | 66.4 | 19.7K | 400K | ¥9 / ¥72
44 | gpt-5.1 | OpenAI | 65.6 | 4.1K | 400K | ¥9 / ¥72
45 | gpt-5.2 | OpenAI | 64.8 | 6.6K | 400K | ¥12.6 / ¥101
46 | qwen-vl-max-2025-08-13 | Alibaba | 64.0 | 1.6K | 131K | ¥1.66 / ¥4.13
47 | o3-2025-04-16 | OpenAI | 63.2 | 22.9K | 200K | ¥14.4 / ¥57.6
48 | gpt-5-high | OpenAI | 62.4 | 16.9K | 400K | ¥9 / ¥72
49 | qwen3-vl-235b-a22b-thinking | Alibaba | 61.6 | 1.1K | 131K | ¥2.06 / ¥8.26
50 | grok-4-0709 | xAI | 60.8 | 16.1K | 256K | ¥21.6 / ¥108
51 | gpt-5-mini-high | OpenAI | 60.0 | 14.1K | 400K | ¥1.8 / ¥14.4
52 | gpt-4.1-2025-04-14 | OpenAI | 59.2 | 20.1K | 1.05M | ¥14.4 / ¥57.6
53 | step-1o-turbo-202506 | StepFun | 58.4 | 931 | - | -
54 | grok-4-1-fast-reasoning | xAI | 57.6 | 4.3K | 2M | ¥1.44 / ¥3.6
55 | gemini-2.5-flash-lite-preview-09-2025-no-thinking | Google | 56.8 | 2.2K | 1.05M | ¥0.72 / ¥2.88
56 | o4-mini-2025-04-16 | OpenAI | 56.0 | 20.8K | 200K | ¥7.92 / ¥31.7
57 | hunyuan-large-vision | Tencent | 55.2 | 646 | - | -
58 | step-3 | StepFun | 54.4 | 1.7K | 65.5K | ¥1.8 / ¥4.68
59 | gpt-5.4-nano-high | OpenAI | 53.6 | 3.5K | 400K | ¥1.44 / ¥9
60 | claude-sonnet-4-20250514-thinking-32k | Anthropic | 52.8 | 556 | 200K | ¥21.6 / ¥108
61 | hunyuan-vision-1.5-thinking | Tencent | 52.0 | 1.2K | - | -
62 | gpt-4.5-preview-2025-02-27 | OpenAI | 51.2 | 1.4K | 8.19K | ¥216 / ¥432
63 | gpt-4.1-mini-2025-04-14 | OpenAI | 50.4 | 20.1K | 1.05M | ¥2.88 / ¥11.5
64 | gemini-2.5-flash-lite-preview-06-17-thinking | Google | 49.6 | 17.5K | 65.5K | ¥0.72 / ¥2.88
65 | gemma-3-27b-it | Google | 48.8 | 8.4K | 128K | ¥2.15 / ¥2.15
66 | claude-opus-4-20250514 | Anthropic | 48.0 | 1K | 200K | ¥108 / ¥540
67 | mistral-medium-2508 | Mistral | 47.2 | 19.9K | 262K | ¥2.88 / ¥14.4
68 | claude-opus-4-20250514-thinking-16k | Anthropic | 46.4 | 596 | 200K | ¥108 / ¥540
69 | glm-4.5v | Z.ai | 45.6 | 1.7K | 64K | ¥4.32 / ¥13
70 | glm-4.6v | Z.ai | 44.8 | 965 | 128K | ¥2.16 / ¥6.48
71 | mistral-medium-2505 | Mistral | 44.0 | 5.7K | 262K | ¥2.88 / ¥14.4
72 | gemini-2.0-flash-001 | Google | 43.2 | 5.2K | 1.05M | ¥1.08 / ¥4.32
73 | gpt-5-nano-high | OpenAI | 42.4 | 2K | 400K | ¥0.36 / ¥2.88
74 | claude-3-7-sonnet-20250219-thinking-32k | Anthropic | 41.6 | 689 | - | -
75 | claude-sonnet-4-20250514 | Anthropic | 40.8 | 820 | 200K | ¥21.6 / ¥108
76 | o1-2024-12-17 | OpenAI | 40.0 | 1.9K | 128K | ¥108 / ¥432
77 | mistral-small-2506 | Mistral | 39.2 | 5.8K | 262K | ¥2.88 / ¥14.4
78 | gemini-1.5-pro-002 | Google | 38.4 | 4.8K | - | -
79 | mistral-small-3.1-24b-instruct-2503 | Mistral | 37.6 | 15.4K | 262K | ¥2.88 / ¥14.4
80 | qwen2.5-vl-32b-instruct | Alibaba | 36.8 | 870 | 16.4K | ¥0.39 / ¥1.57
81 | gemini-1.5-flash-002 | Google | 36.0 | 3.9K | 2M | ¥0.54 / ¥2.2
82 | llama-4-maverick-17b-128e-instruct | Meta | 35.2 | 3.7K | 1M | ¥1.8 / ¥6.26
83 | gpt-4o-2024-05-13 | OpenAI | 34.4 | 14.1K | 128K | ¥36 / ¥108
84 | step-1o-vision-32k-highres | StepFun | 33.6 | 1.6K | - | -
85 | llama-4-scout-17b-16e-instruct | Meta | 32.8 | 3.3K | 128K | ¥1.44 / ¥5.62
86 | claude-3-7-sonnet-20250219 | Anthropic | 32.0 | 2.2K | 200K | ¥21.6 / ¥108
87 | claude-3-5-sonnet-20241022 | Anthropic | 31.2 | 5.4K | 200K | ¥21.6 / ¥108
88 | gemini-2.0-flash-lite-preview-02-05 | Google | 30.4 | 2.3K | 1.05M | ¥0.54 / ¥2.16
89 | claude-3-5-sonnet-20240620 | Anthropic | 29.6 | 13.3K | 200K | ¥21.6 / ¥108
90 | qwen2.5-vl-72b-instruct | Alibaba | 28.8 | 2.2K | 131K | ¥16.5 / ¥49.5
91 | gpt-4.1-nano-2025-04-14 | OpenAI | 28.0 | 461 | 1.05M | ¥14.4 / ¥57.6
92 | pixtral-large-2411 | Mistral | 27.2 | 2.9K | 128K | ¥14.4 / ¥43.2
93 | claude-3-5-haiku-20241022 | Anthropic | 26.4 | 685 | 200K | ¥5.76 / ¥28.8
94 | gpt-4-turbo-2024-04-09 | OpenAI | 25.6 | 8.4K | 128K | ¥72 / ¥216
95 | gemini-1.5-pro-001 | Google | 24.8 | 10.6K | - | -
96 | molmo-2-8b | AllenAI | 24.0 | 483 | - | -
97 | gpt-4o-mini-2024-07-18 | OpenAI | 23.2 | 10.7K | 128K | ¥1.08 / ¥4.32
98 | qwen-vl-max-1119 | Alibaba | 22.4 | 755 | 131K | ¥1.66 / ¥4.13
99 | gpt-4o-2024-08-06 | OpenAI | 21.6 | 1.7K | 128K | ¥18 / ¥72
100 | step-1v-32k | StepFun | 20.8 | 839 | 32.8K | ¥14.8 / ¥69
101 | gemini-1.5-flash-8b-001 | Google | 20.0 | 3.3K | 2M | ¥0.54 / ¥2.2
102 | qwen2-vl-72b | Alibaba | 19.2 | 3.2K | - | -
103 | molmo-72b-0924 | AllenAI | 18.4 | 1.6K | - | -
104 | internvl2-26b | - | 17.6 | 3.3K | - | -
105 | gemini-1.5-flash-001 | Google | 16.8 | 8.4K | 2M | ¥0.54 / ¥2.2
106 | pixtral-12b-2409 | Mistral | 16.0 | 4.1K | 128K | ¥1.08 / ¥1.08
107 | hunyuan-standard-vision-2024-12-31 | Tencent | 15.2 | 450 | - | -
108 | claude-3-opus-20240229 | Anthropic | 14.4 | 9.7K | 200K | ¥108 / ¥540
109 | llama-3.2-vision-90b-instruct | Meta | 13.6 | 4.6K | 131K | ¥2.48 / ¥2.48
110 | molmo-7b-d-0924 | AllenAI | 12.8 | 1.5K | - | -
111 | yi-vision | - | 12.0 | 785 | - | -
112 | qwen2-vl-7b-instruct | Alibaba | 11.2 | 3K | 131K | ¥2.07 / ¥5.16
113 | amazon-nova-lite-v1.0 | Amazon | 10.4 | 1K | 300K | ¥0.43 / ¥1.73
114 | c4ai-aya-vision-32b | Cohere | 9.6 | 447 | - | -
115 | llama-3.2-vision-11b-instruct | Meta | 8.8 | 2.6K | 131K | ¥2.48 / ¥2.48
116 | amazon-nova-pro-v1.0 | Amazon | 8.0 | 1.3K | 300K | ¥5.76 / ¥23
117 | internvl2-4b | - | 7.2 | 2.2K | - | -
118 | claude-3-sonnet-20240229 | Anthropic | 6.4 | 7.8K | 200K | ¥21.6 / ¥108
119 | llava-v1.6-34b | - | 5.6 | 2.8K | - | -
120 | cogvlm2-llama3-chat-19b | Z.ai | 4.8 | 1.2K | - | -
121 | llava-onevision-qwen2-72b-ov | - | 4.0 | 841 | - | -
122 | nvila-internal-15b-v1 | NVIDIA | 3.2 | 619 | - | -
123 | claude-3-haiku-20240307 | Anthropic | 2.4 | 8.4K | 200K | ¥1.8 / ¥9
124 | minicpm-v-2_6 | - | 1.6 | 1.3K | - | -
125 | phi-3.5-vision-instruct | Microsoft | 0.8 | 1.6K | 128K | ¥1.15 / ¥4.61
126 | phi-3-vision-128k-instruct | Microsoft | 0.0 | 880 | 128K | ¥1.08 / ¥4.32

Top model analysis

Why claude-opus-4-7 ranks first

claude-opus-4-7 ranks first with a percent score of 100.0, based on about 3K samples. Treat it as the default first candidate on this leaderboard, then weigh its price (¥36 / ¥180 input/output), 1M context, and availability against the models ranked just below it.
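
The published scores appear to fall in even steps of 0.8 from 100.0 (rank 1) down to 0.0 (rank 126), which suggests they encode rank position rather than the size of the preference gap between models. The sketch below reproduces that pattern under the assumption of a simple linear rescaling of rank. The relationship is inferred from the table, not documented by the page, and `percent_score` is a hypothetical helper name.

```python
def percent_score(rank: int, total: int = 126) -> float:
    """Rank-based score: 100.0 for rank 1, 0.0 for the last rank.

    Assumes a linear rescaling of rank, inferred from the table above.
    """
    if total < 2:
        raise ValueError("total must be at least 2")
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return round(100.0 * (total - rank) / (total - 1), 1)


print(percent_score(1))    # 100.0
print(percent_score(2))    # 99.2
print(percent_score(13))   # 90.4
print(percent_score(126))  # 0.0
```

If that reading is right, a 0.8-point gap between neighbours says only that one model is listed one place higher, not how much more often it was preferred.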

How to choose

Look beyond rank #1

Start with the leaderboard closest to your task. Compare the top models by score and sample size, then check price, context length, open or closed access, and provider availability.
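
The selection steps above can be sketched as a simple filter. The rows are copied from the table; the `Entry` type, the `shortlist` function, and the threshold values are all hypothetical, chosen only to illustrate the idea. Prices are kept in yen exactly as listed, since the page does not state the billing unit.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    rank: int
    model: str
    score: float
    samples: int
    context: Optional[int]      # tokens; None where the table shows "-"
    price_in: Optional[float]   # yen, as listed on the page
    price_out: Optional[float]  # yen, as listed on the page


# A few rows copied from the leaderboard table.
ROWS = [
    Entry(1, "claude-opus-4-7", 100.0, 3_000, 1_000_000, 36.0, 180.0),
    Entry(5, "muse-spark", 96.8, 1_900, None, None, None),
    Entry(7, "gemini-3-pro", 95.2, 5_400, 1_050_000, 14.4, 86.4),
    Entry(13, "kimi-k2.6", 90.4, 2_300, 262_000, 6.84, 28.8),
    Entry(15, "gemini-3-flash", 88.8, 8_800, 1_050_000, 3.6, 21.6),
]


def shortlist(rows: List[Entry], max_price_out: float,
              min_context: int, min_samples: int) -> List[Entry]:
    """Keep models that fit the budget, context, and sample-size limits."""
    keep = [
        r for r in rows
        if r.price_out is not None and r.price_out <= max_price_out
        and r.context is not None and r.context >= min_context
        and r.samples >= min_samples
    ]
    # Best-ranked candidates first.
    return sorted(keep, key=lambda r: r.rank)


picks = shortlist(ROWS, max_price_out=100.0,
                  min_context=500_000, min_samples=2_000)
print([r.model for r in picks])  # ['gemini-3-pro', 'gemini-3-flash']
```

Models with no listed price or context are dropped here rather than guessed at. A real workflow might instead flag them for a manual check.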

FAQ

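Which metrics matter on the English leaderboard?

Focus on rank, percent score, sample size, and source. The score gives a quick comparison of model performance within a single leaderboard; the sample size indicates how stable that result is.

Why can't different leaderboards be merged into one overall score?

Different leaderboards use different tasks, samples, and evaluation criteria. By default, 模力榜 ranks models only within a single leaderboard, so that abilities such as writing, coding, and image tasks are not forced into one combined number.

How should I choose an English model?

Start with the leaderboard closest to your task, then factor in price, context length, open-source or closed-source status, and provider availability. A high rank does not mean a model suits every budget or deployment setup.

How often is the leaderboard updated?

The page shows the most recent successfully collected public leaderboard data. It currently prioritizes the LMArena leaderboard dataset and keeps a link to the original in the page's source section.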