Chat · Vision · OCR Leaderboard

Ranking for Vision / OCR, based on public preference data.

Selection guide

OCR model ranking guide

Models: 82
Published: 2026/05/18
Source: Arena public preference evaluation
Original leaderboard: Vision / OCR
Leaderboard dataset: LMArena latest parquet
Rank | Model | Provider | Score | Samples | Context | Price (input / output)
1 | claude-opus-4-7 | Anthropic | 100.0 | 4.8K | 1M | ¥36 / ¥180
2 | claude-opus-4-6-thinking | Anthropic | 98.8 | 4.9K | 1M | ¥36 / ¥180
3 | claude-opus-4-7-thinking | Anthropic | 97.5 | 4.5K | 1M | ¥36 / ¥180
4 | claude-opus-4-6 | Anthropic | 96.3 | 6.1K | 1M | ¥36 / ¥180
5 | gemini-3-pro | Google | 95.1 | 8.1K | 1.05M | ¥14.4 / ¥86.4
6 | muse-spark | Meta | 93.8 | 3.2K | - | -
7 | gemini-3.1-pro-preview | Google | 92.6 | 11.4K | 1.05M | ¥14.4 / ¥86.4
8 | gpt-5.4-high | OpenAI | 91.4 | 4.2K | 1.05M | ¥18 / ¥108
9 | gpt-5.5 | OpenAI | 90.1 | 3.3K | 1.05M | ¥36 / ¥216
10 | gpt-5.5-high | OpenAI | 88.9 | 3K | 1.05M | ¥36 / ¥216
11 | claude-sonnet-4-6 | Anthropic | 87.7 | 6.3K | 1M | ¥21.6 / ¥108
12 | gpt-5.4 | OpenAI | 86.4 | 3.9K | 1.05M | ¥18 / ¥108
13 | gemini-3-flash | Google | 85.2 | 14.2K | 1.05M | ¥3.6 / ¥21.6
14 | kimi-k2.6 | Moonshot | 84.0 | 4K | 262K | ¥6.84 / ¥28.8
15 | dola-seed-2.0-pro | ByteDance | 82.7 | 6.2K | - | -
16 | qwen3.7-plus-preview | Alibaba | 81.5 | 2.7K | 131K | ¥3.6 / ¥21.6
17 | gpt-5.2-chat-latest-20260210 | OpenAI | 80.2 | 8.5K | 400K | ¥12.6 / ¥101
18 | kimi-k2.5-thinking | Moonshot | 79.0 | 9.7K | 262K | ¥4.32 / ¥21.6
19 | qwen3.5-397b-a17b | Alibaba | 77.8 | 8K | 262K | ¥3.1 / ¥18.6
20 | gemini-3-flash (thinking-minimal) | Google | 76.5 | 12.9K | 1.05M | ¥3.6 / ¥21.6
21 | gemini-2.5-pro | Google | 75.3 | 30.8K | 1.05M | ¥9 / ¥72
22 | gemma-4-31b | Google | 74.1 | 11.9K | 262K | ¥3.24 / ¥7.2
23 | gemma-4-26b-a4b | Google | 72.8 | 7.3K | 262K | ¥0.94 / ¥2.88
24 | glm-5v-turbo | Z.ai | 71.6 | 5.2K | 200K | ¥0 / ¥0
25 | kimi-k2.5-instant | Moonshot | 70.4 | 2.6K | 262K | ¥4.32 / ¥21.6
26 | grok-4.20-beta-0309-reasoning | xAI | 69.1 | 6.6K | 2M | ¥14.4 / ¥43.2
27 | gemini-2.5-flash-preview-09-2025 | Google | 67.9 | 2.8K | 1M | ¥2.16 / ¥18
28 | gpt-5.2-high | OpenAI | 66.7 | 10.2K | 400K | ¥12.6 / ¥101
29 | gpt-5.4-mini-high | OpenAI | 65.4 | 5.9K | 400K | ¥5.4 / ¥32.4
30 | qwen3-vl-235b-a22b-instruct | Alibaba | 64.2 | 7.6K | 128K | ¥2.16 / ¥8.64
31 | grok-4.20-multi-agent-beta-0309 | xAI | 63.0 | 6.1K | 2M | ¥14.4 / ¥43.2
32 | gpt-5.5-instant | OpenAI | 61.7 | 2.7K | 400K | ¥9 / ¥72
33 | mimo-v2.5 | Xiaomi | 60.5 | 4.9K | 1.05M | ¥2.88 / ¥14.4
34 | gpt-5.1-high | OpenAI | 59.3 | 5.7K | 400K | ¥9 / ¥72
35 | ernie-5.0-preview-1220 | Baidu | 58.0 | 1.9K | 128K | ¥7.92 / ¥14.4
36 | chatgpt-4o-latest-20250326 | OpenAI | 56.8 | 11.2K | 128K | ¥18 / ¥72
37 | grok-4.3 | xAI | 55.6 | 2.8K | 1M | ¥9 / ¥18
38 | gemini-3.1-flash-lite-preview | Google | 54.3 | 9.5K | 1.05M | ¥1.8 / ¥10.8
39 | qwen3.5-122b-a10b | Alibaba | 53.1 | 6.9K | 262K | ¥2.88 / ¥23
40 | gpt-5-chat | OpenAI | 51.9 | 10.8K | 400K | ¥9 / ¥72
41 | qwen3.5-27b | Alibaba | 50.6 | 6.5K | 262K | ¥2.16 / ¥17.3
42 | gpt-5.1 | OpenAI | 49.4 | 6.5K | 400K | ¥9 / ¥72
43 | gemini-2.5-flash | Google | 48.1 | 25K | 1.05M | ¥2.16 / ¥18
44 | qwen-vl-max-2025-08-13 | Alibaba | 46.9 | 1.2K | 131K | ¥1.66 / ¥4.13
45 | gpt-5.2 | OpenAI | 45.7 | 10.9K | 400K | ¥12.6 / ¥101
46 | mimo-v2-omni | Xiaomi | 44.4 | 5.2K | 262K | ¥2.88 / ¥14.4
47 | o3-2025-04-16 | OpenAI | 43.2 | 14.7K | 200K | ¥14.4 / ¥57.6
48 | gpt-5-high | OpenAI | 42.0 | 10.9K | 400K | ¥9 / ¥72
49 | gpt-4.1-2025-04-14 | OpenAI | 40.7 | 11.6K | 1.05M | ¥14.4 / ¥57.6
50 | qwen3-vl-235b-a22b-thinking | Alibaba | 39.5 | 1.4K | 131K | ¥2.06 / ¥8.26
51 | gpt-5.4-nano-high | OpenAI | 38.3 | 5.9K | 400K | ¥1.44 / ¥9
52 | gemini-2.5-flash-lite-preview-09-2025-no-thinking | Google | 37.0 | 2.8K | 1.05M | ¥0.72 / ¥2.88
53 | gpt-5-mini-high | OpenAI | 35.8 | 8.2K | 400K | ¥1.8 / ¥14.4
54 | o4-mini-2025-04-16 | OpenAI | 34.6 | 12.1K | 200K | ¥7.92 / ¥31.7
55 | claude-sonnet-4-20250514-thinking-32k | Anthropic | 33.3 | 744 | 200K | ¥21.6 / ¥108
56 | claude-opus-4-20250514-thinking-16k | Anthropic | 32.1 | 852 | 200K | ¥108 / ¥540
57 | grok-4-0709 | xAI | 30.9 | 10K | 256K | ¥21.6 / ¥108
58 | gpt-4.1-mini-2025-04-14 | OpenAI | 29.6 | 10.7K | 1.05M | ¥2.88 / ¥11.5
59 | gemini-2.5-flash-lite-preview-06-17-thinking | Google | 28.4 | 10.7K | 65.5K | ¥0.72 / ¥2.88
60 | grok-4-1-fast-reasoning | xAI | 27.2 | 7.3K | 2M | ¥1.44 / ¥3.6
61 | claude-3-7-sonnet-20250219-thinking-32k | Anthropic | 25.9 | 887 | - | -
62 | hunyuan-vision-1.5-thinking | Tencent | 24.7 | 1.4K | - | -
63 | claude-opus-4-20250514 | Anthropic | 23.5 | 1.4K | 200K | ¥108 / ¥540
64 | step-1o-turbo-202506 | StepFun | 22.2 | 1.3K | - | -
65 | mistral-medium-2508 | Mistral | 21.0 | 13.5K | 262K | ¥2.88 / ¥14.4
66 | claude-sonnet-4-20250514 | Anthropic | 19.8 | 1.1K | 200K | ¥21.6 / ¥108
67 | glm-4.6v | Z.ai | 18.5 | 1.4K | 128K | ¥2.16 / ¥6.48
68 | step-3 | StepFun | 17.3 | 1.2K | 65.5K | ¥1.8 / ¥4.68
69 | hunyuan-large-vision | Tencent | 16.0 | 815 | - | -
70 | gemma-3-27b-it | Google | 14.8 | 6.4K | 128K | ¥2.15 / ¥2.15
71 | claude-3-7-sonnet-20250219 | Anthropic | 13.6 | 915 | 200K | ¥21.6 / ¥108
72 | gpt-5-nano-high | OpenAI | 12.3 | 1.7K | 400K | ¥0.36 / ¥2.88
73 | mistral-medium-2505 | Mistral | 11.1 | 4.7K | 262K | ¥2.88 / ¥14.4
74 | glm-4.5v | Z.ai | 9.9 | 1.2K | 64K | ¥4.32 / ¥13
75 | gemini-2.0-flash-001 | Google | 8.6 | 3.8K | 1.05M | ¥1.08 / ¥4.32
76 | llama-4-maverick-17b-128e-instruct | Meta | 7.4 | 3.2K | 1M | ¥1.8 / ¥6.26
77 | claude-3-5-sonnet-20241022 | Anthropic | 6.2 | 967 | 200K | ¥21.6 / ¥108
78 | mistral-small-2506 | Mistral | 4.9 | 4.2K | 262K | ¥2.88 / ¥14.4
79 | mistral-small-3.1-24b-instruct-2503 | Mistral | 3.7 | 7.4K | 262K | ¥2.88 / ¥14.4
80 | llama-4-scout-17b-16e-instruct | Meta | 2.5 | 2.9K | 128K | ¥1.44 / ¥5.62
81 | claude-3-5-haiku-20241022 | Anthropic | 1.2 | 934 | 200K | ¥5.76 / ¥28.8
82 | molmo-2-8b | AllenAI | 0.0 | 791 | - | -

Top model analysis

Why claude-opus-4-7 ranks first

claude-opus-4-7 ranks first with a percent score of 100.0 across 4.8K samples. Treat it as the default first candidate on this leaderboard, then weigh its price (¥36 / ¥180 input/output), context length (1M), and availability against your requirements.
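
The percent scores published on this page appear consistent with a linear rescaling of rank across the 82 listed models: rank 1 maps to 100.0, rank 82 to 0.0, and each step is worth about 1.2 points. This is an observation from the listed figures, not a documented formula, and the function name `percent_score` is hypothetical. A minimal Python sketch under that assumption:

```python
def percent_score(rank: int, total: int) -> float:
    """Map a 1-based rank to a 0-100 score by linear rescaling.

    Assumption: this mirrors the pattern seen in the published scores;
    the page itself does not state how the score is computed.
    """
    if total < 2 or not 1 <= rank <= total:
        raise ValueError("rank must be in 1..total and total must be at least 2")
    return round(100 * (total - rank) / (total - 1), 1)


# Reproduce a few published values for the 82-model leaderboard.
for rank in (1, 2, 5, 82):
    print(rank, percent_score(rank, 82))
```

If the assumption holds, the gap between adjacent scores reflects only rank order, so a 1.2-point difference says nothing about how close two models are in the underlying preference data.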

How to choose

Look beyond rank #1

Start with the leaderboard closest to your task. Compare the top models by score and sample size, then check price, context length, open or closed access, and provider availability.
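
The selection steps above can be sketched as a filter followed by a sort. The rows below are copied from this leaderboard; the `Model` class, the `shortlist` function, and all three thresholds are hypothetical and only illustrate the idea of applying your own constraints before ranking by score.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Model:
    name: str
    score: float                    # percent score within this leaderboard
    samples: int                    # number of preference samples
    context_tokens: Optional[int]   # None where the page shows "-"
    output_price: Optional[float]   # listed output price in ¥, None if unlisted


def shortlist(models: List[Model], max_output_price: float,
              min_context: int, min_samples: int) -> List[Model]:
    """Keep models that meet every constraint, best score first."""
    keep = [
        m for m in models
        if m.samples >= min_samples
        and m.context_tokens is not None and m.context_tokens >= min_context
        and m.output_price is not None and m.output_price <= max_output_price
    ]
    return sorted(keep, key=lambda m: m.score, reverse=True)


MODELS = [
    Model("claude-opus-4-7", 100.0, 4_800, 1_000_000, 180.0),
    Model("gemini-3-pro", 95.1, 8_100, 1_050_000, 86.4),
    Model("muse-spark", 93.8, 3_200, None, None),
    Model("gemini-3-flash", 85.2, 14_200, 1_050_000, 21.6),
    Model("kimi-k2.6", 84.0, 4_000, 262_000, 28.8),
]

# Hypothetical constraints: output price at most ¥100, at least 500K context,
# and at least 4,000 samples behind the score.
picks = shortlist(MODELS, max_output_price=100.0,
                  min_context=500_000, min_samples=4_000)
print([m.name for m in picks])  # ['gemini-3-pro', 'gemini-3-flash']
```

Models with unlisted price or context are dropped here rather than guessed at; depending on your needs, you might prefer to keep them and check the provider directly.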

FAQ

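Which metrics matter on the OCR leaderboard?

Focus on rank, percent score, sample size, and source. The score gives a quick comparison of models within the same leaderboard; the sample size indicates how stable the result is.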
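
To use sample size as a stability check, the abbreviated counts shown on the page (for example `4.8K` or `744`) first need to be turned into numbers. The sketch below is one way to do that in Python; `parse_count`, the suffix table, and the 1,000-sample cutoff are all hypothetical choices, not part of the leaderboard.

```python
from decimal import Decimal
from typing import Optional

# Multipliers for the suffixes that appear on this page.
SUFFIXES = {"K": 1_000, "M": 1_000_000}


def parse_count(text: str) -> Optional[int]:
    """Convert strings such as '4.8K', '1.05M' or '744' to an integer.

    Returns None for '-' (value not listed). Decimal is used so that
    values like 4.8 * 1000 stay exact.
    """
    text = text.strip()
    if text in ("", "-"):
        return None
    multiplier = SUFFIXES.get(text[-1].upper())
    if multiplier is None:
        return int(Decimal(text))
    return int(Decimal(text[:-1]) * multiplier)


# Sample counts copied from the leaderboard.
samples = {
    "claude-opus-4-7": "4.8K",
    "gemini-2.5-pro": "30.8K",
    "claude-sonnet-4-20250514-thinking-32k": "744",
    "molmo-2-8b": "791",
}

# Flag models whose score rests on fewer than 1,000 samples (arbitrary cutoff).
low_sample = [name for name, raw in samples.items() if parse_count(raw) < 1_000]
print(low_sample)
```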

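Why can't different leaderboards be merged into a single overall score?

Different leaderboards use different tasks, samples, and evaluation criteria. By default, 模力榜 ranks models only within a single leaderboard, so that writing, coding, image, and other abilities are not forced into one number.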

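How should I choose an OCR model?

Start with the leaderboard closest to your task, then factor in price, context length, open or closed source, and provider availability. A high rank does not mean a model fits every budget or deployment setup.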

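How often is the leaderboard updated?

The page shows the most recent successfully collected public leaderboard data. The LMArena leaderboard dataset is currently the preferred source, and the original link is kept in the page's source section.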