Chat · Text · Expert Leaderboard

Ranking for Text / Expert, based on public preference data.

Selection guide

Expert model ranking guide

Top 5 models: claude-opus-4-6, claude-opus-4-6-thinking, gpt-5.5-high, gpt-5.4-high, claude-opus-4-7
Current directory: Chat · Text · Expert
Models: 309
Published: 2026/05/27
Source: Arena public preference evaluation
Original leaderboard: Text / Expert
Leaderboard dataset: LMArena latest parquet
Rank | Model | Provider | Score | Samples | Context | Price (input / output)
1 | claude-opus-4-6 | Anthropic | 100.0 | 3.1K | 1M | ¥36 / ¥180
2 | claude-opus-4-6-thinking | Anthropic | 99.7 | 2.6K | 1M | ¥36 / ¥180
3 | gpt-5.5-high | Openai | 99.4 | 1.5K | 1.05M | ¥36 / ¥216
4 | gpt-5.4-high | Openai | 99.0 | 2.4K | 1.05M | ¥18 / ¥108
5 | claude-opus-4-7 | Anthropic | 98.7 | 2K | 1M | ¥36 / ¥180
6 | mimo-v2.5-pro | Xiaomi | 98.4 | 1.4K | 1.05M | ¥7.2 / ¥21.6
7 | gemini-3.5-flash | Google | 98.1 | 899 | 1.05M | ¥10.8 / ¥64.8
8 | claude-opus-4-7-thinking | Anthropic | 97.7 | 1.9K | 1M | ¥36 / ¥180
9 | amazon-nova-experimental-chat-26-02-10 | Amazon | 97.4 | 270 | - | -
10 | gpt-5.5 | Openai | 97.1 | 1.6K | 1.05M | ¥36 / ¥216
11 | claude-sonnet-4-6 | Anthropic | 96.8 | 2.5K | 1M | ¥21.6 / ¥108
12 | gemini-3.1-pro-preview | Google | 96.4 | 3.8K | 1.05M | ¥14.4 / ¥86.4
13 | qwen3.5-max-preview | Alibaba | 96.1 | 1.9K | - | -
14 | qwen3.7-max-preview | Alibaba | 95.8 | 345 | 1M | ¥18 / ¥54
15 | gpt-5.4 | Openai | 95.5 | 2.5K | 1.05M | ¥18 / ¥108
16 | kimi-k2.6 | Moonshot | 95.1 | 1.3K | 262K | ¥6.84 / ¥28.8
17 | glm-5.1 | Zai | 94.8 | 1.2K | 200K | ¥0 / ¥0
18 | claude-opus-4-5-20251101 | Anthropic | 94.5 | 4.6K | 200K | ¥36 / ¥180
19 | claude-sonnet-4-5-20250929-thinking-32k | Anthropic | 94.2 | 5K | 200K | ¥21.6 / ¥108
20 | claude-opus-4-5-20251101-thinking-32k | Anthropic | 93.8 | 2.2K | 200K | ¥108 / ¥540
21 | gemini-3-pro | Google | 93.5 | 2.5K | 1.05M | ¥14.4 / ¥86.4
22 | mimo-v2-pro | Xiaomi | 93.2 | 1.9K | 1.05M | ¥7.2 / ¥21.6
23 | claude-sonnet-4-5-20250929 | Anthropic | 92.9 | 5.1K | 200K | ¥21.6 / ¥108
24 | qwen3.6-max-preview | Alibaba | 92.5 | 469 | 246K | ¥9.5 / ¥56.9
25 | ernie-5.1 | Baidu | 92.2 | 1.2K | 119K | ¥5.4 / ¥21.6
26 | gpt-5.1-high | Openai | 91.9 | 2.3K | 400K | ¥9 / ¥72
27 | qwen3.5-397b-a17b | Alibaba | 91.6 | 2.8K | 262K | ¥3.1 / ¥18.6
28 | kimi-k2.5-thinking | Moonshot | 91.2 | 2.8K | 262K | ¥4.32 / ¥21.6
29 | gemini-3-flash | Google | 90.9 | 1.9K | 1.05M | ¥3.6 / ¥21.6
30 | mimo-v2.5 | Xiaomi | 90.6 | 1.5K | 1.05M | ¥2.88 / ¥14.4
31 | glm-5 | Zai | 90.3 | 1.7K | 205K | ¥7.2 / ¥23
32 | qwen3-max-preview | Alibaba | 89.9 | 1.3K | 262K | ¥6.2 / ¥24.8
33 | qwen3-235b-a22b-thinking-2507 | Alibaba | 89.6 | 411 | 131K | ¥2.07 / ¥8.26
34 | gemini-2.5-pro | Google | 89.3 | 7.5K | 1.05M | ¥9 / ¥72
35 | qwen3.6-plus | Alibaba | 89.0 | 1.6K | 1M | ¥3.6 / ¥21.6
36 | deepseek-v4-pro | Deepseek | 88.6 | 1.6K | 1M | ¥3.13 / ¥6.26
37 | muse-spark | Meta | 88.3 | 1.1K | - | -
38 | longcat-flash-chat-2602-exp | Meituan | 88.0 | 2.1K | 128K | ¥1.08 / ¥10.8
39 | gemma-4-31b | Google | 87.7 | 437 | 262K | ¥3.24 / ¥7.2
40 | grok-4.20-multi-agent-beta-0309 | Xai | 87.3 | 2.5K | 2M | ¥14.4 / ¥43.2
41 | gemma-4-26b-a4b | Google | 87.0 | 406 | 262K | ¥0.94 / ¥2.88
42 | deepseek-v4-pro-thinking | Deepseek | 86.7 | 1.4K | 1M | ¥3.13 / ¥6.26
43 | gpt-5.2-high | Openai | 86.4 | 3.5K | 400K | ¥12.6 / ¥101
44 | qwen3-vl-235b-a22b-instruct | Alibaba | 86.0 | 563 | 128K | ¥2.16 / ¥8.64
45 | amazon-nova-experimental-chat-26-01-10 | Amazon | 85.7 | 253 | - | -
46 | qwen3-235b-a22b-instruct-2507 | Alibaba | 85.4 | 5.7K | 128K | ¥2.09 / ¥8.23
47 | dola-seed-2.0-pro | Bytedance | 85.1 | 3.2K | - | -
48 | gpt-5.4-mini-high | Openai | 84.7 | 2.3K | 400K | ¥5.4 / ¥32.4
49 | grok-4.20-beta-0309-reasoning | Xai | 84.4 | 2.6K | 2M | ¥14.4 / ¥43.2
50 | gpt-5.1 | Openai | 84.1 | 2.7K | 400K | ¥9 / ¥72
51 | kimi-k2.5-instant | Moonshot | 83.8 | 565 | 262K | ¥4.32 / ¥21.6
52 | minimax-m2.1-preview | Minimax | 83.4 | 1.1K | 205K | ¥0 / ¥0
53 | gpt-5.2-chat-latest-20260210 | Openai | 83.1 | 2.7K | 400K | ¥12.6 / ¥101
54 | claude-haiku-4-5-20251001 | Anthropic | 82.8 | 5.4K | 200K | ¥7.2 / ¥36
55 | claude-opus-4-1-20250805-thinking-16k | Anthropic | 82.5 | 2.3K | 200K | ¥108 / ¥540
56 | deepseek-v4-flash | Deepseek | 82.1 | 1.5K | 1M | ¥1.01 / ¥2.02
57 | kimi-k2-thinking-turbo | Moonshot | 81.8 | 4.1K | 262K | ¥17.3 / ¥72
58 | longcat-flash-chat | Meituan | 81.5 | 515 | 128K | ¥1.08 / ¥10.8
59 | amazon-nova-experimental-chat-11-10 | Amazon | 81.2 | 1.5K | - | -
60 | deepseek-v4-flash-thinking | Deepseek | 80.8 | 1.6K | 1M | ¥1.01 / ¥2.02
61 | deepseek-v3.2 | Deepseek | 80.5 | 3K | 128K | ¥2.09 / ¥3.1
62 | glm-4.6 | Zai | 80.2 | 1.9K | 205K | ¥4.32 / ¥15.8
63 | minimax-m2.7 | Minimax | 79.9 | 1.9K | 205K | ¥0 / ¥0
64 | qwen3.5-122b-a10b | Alibaba | 79.5 | 2.3K | 262K | ¥2.88 / ¥23
65 | gpt-5.2 | Openai | 79.2 | 3.7K | 400K | ¥12.6 / ¥101
66 | deepseek-v3.2-thinking | Deepseek | 78.9 | 2.5K | 128K | ¥2.09 / ¥3.1
67 | hunyuan-hy3-preview | Tencent | 78.6 | 575 | 256K | ¥0 / ¥0
68 | glm-4.5 | Zai | 78.2 | 1.1K | 131K | ¥4.32 / ¥15.8
69 | ernie-5.0-0110 | Baidu | 77.9 | 2.6K | 128K | ¥7.92 / ¥14.4
70 | gemini-2.5-flash | Google | 77.6 | 7.6K | 1.05M | ¥2.16 / ¥18
71 | qwen3.5-27b | Alibaba | 77.3 | 2.2K | 262K | ¥2.16 / ¥17.3
72 | grok-4.20-beta1 | Xai | 76.9 | 2K | 2M | ¥14.4 / ¥43.2
73 | gemini-3-flash (thinking-minimal) | Google | 76.6 | 4.2K | 1.05M | ¥3.6 / ¥21.6
74 | claude-opus-4-1-20250805 | Anthropic | 76.3 | 3.9K | 200K | ¥108 / ¥540
75 | mimo-v2-flash (non-thinking) | Xiaomi | 76.0 | 3.4K | 262K | ¥0.72 / ¥2.16
76 | gemini-2.5-flash-preview-09-2025 | Google | 75.6 | 1.6K | 1M | ¥2.16 / ¥18
77 | ernie-5.0-preview-1022 | Baidu | 75.3 | 276 | 128K | ¥7.92 / ¥14.4
78 | grok-4.1-thinking | Xai | 75.0 | 4.4K | 200K | ¥14.4 / ¥72
79 | qwen3-vl-235b-a22b-thinking | Alibaba | 74.7 | 378 | 131K | ¥2.06 / ¥8.26
80 | ernie-5.0-preview-1203 | Baidu | 74.4 | 672 | 128K | ¥7.92 / ¥14.4
81 | mimo-v2-omni | Xiaomi | 74.0 | 283 | 262K | ¥2.88 / ¥14.4
82 | gpt-5-high | Openai | 73.7 | 1.6K | 400K | ¥9 / ¥72
83 | glm-4.7 | Zai | 73.4 | 717 | 205K | ¥0 / ¥0
84 | step-3.5-flash | Stepfun | 73.1 | 2.7K | 256K | ¥0.69 / ¥2.07
85 | grok-3-preview-02-24 | Xai | 72.7 | 1.5K | 1M | ¥9 / ¥18
86 | grok-4-0709 | Xai | 72.4 | 2K | 256K | ¥21.6 / ¥108
87 | deepseek-v3.2-exp-thinking | Deepseek | 72.1 | 396 | 128K | ¥0 / ¥0
88 | mistral-large-3 | Mistral | 71.8 | 3K | 262K | ¥3.6 / ¥10.8
89 | grok-4.1 | Xai | 71.4 | 4.4K | 200K | ¥14.4 / ¥72
90 | amazon-nova-experimental-chat-12-10 | Amazon | 71.1 | 242 | - | -
91 | gpt-5.5-instant | Openai | 70.8 | 2.4K | 400K | ¥9 / ¥72
92 | mistral-medium-2508 | Mistral | 70.5 | 5.8K | 262K | ¥2.88 / ¥14.4
93 | qwen3.5-flash | Alibaba | 70.1 | 2.4K | 1M | ¥1.24 / ¥12.4
94 | qwen3-next-80b-a3b-instruct | Alibaba | 69.8 | 1K | 131K | ¥1.04 / ¥4.13
95 | qwen3.5-35b-a3b | Alibaba | 69.5 | 2.4K | 262K | ¥1.8 / ¥14.4
96 | grok-4-fast-reasoning | Xai | 69.2 | 862 | 2M | ¥1.44 / ¥3.6
97 | gpt-5-chat | Openai | 68.8 | 1.5K | 400K | ¥9 / ¥72
98 | grok-4-fast-chat | Xai | 68.5 | 298 | 2M | ¥1.44 / ¥3.6
99 | mimo-v2-flash (thinking) | Xiaomi | 68.2 | 711 | 262K | ¥0.72 / ¥2.16
100 | deepseek-v3.1 | Deepseek | 67.9 | 726 | 128K | ¥1.44 / ¥5.04
101 | grok-4-1-fast-reasoning | Xai | 67.5 | 3.8K | 2M | ¥1.44 / ¥3.6
102 | gemini-3.1-flash-lite-preview | Google | 67.2 | 3K | 1.05M | ¥1.8 / ¥10.8
103 | deepseek-v3.1-thinking | Deepseek | 66.9 | 529 | 128K | ¥1.44 / ¥5.04
104 | nvidia-nemotron-3-super-120b-a12b | Nvidia | 66.6 | 594 | 262K | ¥1.44 / ¥5.76
105 | deepseek-v3.2-exp | Deepseek | 66.2 | 610 | 128K | ¥0 / ¥0
106 | o3-2025-04-16 | Openai | 65.9 | 3K | 200K | ¥14.4 / ¥57.6
107 | chatgpt-4o-latest-20250326 | Openai | 65.6 | 4.3K | 128K | ¥18 / ¥72
108 | qwen3-max-2025-09-23 | Alibaba | 65.3 | 449 | 258K | ¥6.19 / ¥24.7
109 | deepseek-r1-0528 | Deepseek | 64.9 | 881 | 164K | ¥3.6 / ¥15.5
110 | gpt-5.3-chat-latest | Openai | 64.6 | 2.6K | 128K | ¥12.6 / ¥101
111 | gpt-5.4-nano-high | Openai | 64.3 | 2.3K | 400K | ¥1.44 / ¥9
112 | gpt-4.5-preview-2025-02-27 | Openai | 64.0 | 608 | 8.19K | ¥216 / ¥432
113 | qwen3-30b-a3b-instruct-2507 | Alibaba | 63.6 | 1.1K | 262K | ¥2.16 / ¥3.6
114 | grok-3-mini-high | Xai | 63.3 | 921 | 128K | ¥0 / ¥0
115 | hunyuan-t1-20250711 | Tencent | 63.0 | 215 | 131K | ¥0 / ¥0
116 | amazon-nova-experimental-chat-10-20 | Amazon | 62.7 | 652 | - | -
117 | grok-4.3 | Xai | 62.3 | 1.5K | 1M | ¥9 / ¥18
118 | claude-opus-4-20250514-thinking-16k | Anthropic | 62.0 | 1.7K | 200K | ¥108 / ¥540
119 | gpt-5-mini-high | Openai | 61.7 | 1.1K | 400K | ¥1.8 / ¥14.4
120 | minimax-m2.5 | Minimax | 61.4 | 3K | 205K | ¥0 / ¥0
121 | gemini-2.5-flash-lite-preview-09-2025-no-thinking | Google | 61.0 | 2.5K | 1.05M | ¥0.72 / ¥2.88
122 | claude-sonnet-4-20250514-thinking-32k | Anthropic | 60.7 | 1.7K | 200K | ¥21.6 / ¥108
123 | claude-opus-4-20250514 | Anthropic | 60.4 | 2.2K | 200K | ¥108 / ¥540
124 | glm-4.5-air | Zai | 60.1 | 1.4K | 131K | ¥0 / ¥0
125 | qwen3-235b-a22b-no-thinking | Alibaba | 59.7 | 1.9K | 131K | ¥2.07 / ¥8.26
126 | qwen3-next-80b-a3b-thinking | Alibaba | 59.4 | 620 | 131K | ¥1.04 / ¥10.3
127 | nvidia-nemotron-3-nano-30b-a3b-bf16 | Nvidia | 59.1 | 944 | 131K | ¥0 / ¥0
128 | nova-2-lite | Amazon | 58.8 | 713 | 128K | ¥2.38 / ¥19.8
129 | kimi-k2-0905-preview | Moonshot | 58.4 | 555 | 262K | ¥4.32 / ¥18
130 | gemini-2.5-flash-lite-preview-06-17-thinking | Google | 58.1 | 1.6K | 65.5K | ¥0.72 / ¥2.88
131 | gpt-4.1-2025-04-14 | Openai | 57.8 | 2.5K | 1.05M | ¥14.4 / ¥57.6
132 | o3-mini-high | Openai | 57.5 | 847 | 200K | ¥7.92 / ¥31.7
133 | qwen3-32b | Alibaba | 57.1 | 236 | 131K | ¥2.07 / ¥8.26
134 | glm-4.7-flash | Zai | 56.8 | 795 | 200K | ¥0 / ¥0
135 | o1-2024-12-17 | Openai | 56.5 | 1.3K | 128K | ¥108 / ¥432
136 | trinity-large-thinking | - | 56.2 | 2.1K | 262K | ¥1.8 / ¥6.48
137 | grok-3-mini-beta | Xai | 55.8 | 1.2K | 1M | ¥9 / ¥18
138 | gpt-oss-120b | Openai | 55.5 | 1.3K | 131K | ¥1.08 / ¥4.32
139 | glm-4.5v | Zai | 55.2 | 196 | 64K | ¥4.32 / ¥13
140 | mercury-2 | Inception Ai | 54.9 | 229 | 128K | ¥1.8 / ¥5.4
141 | ling-flash-2.0 | Ant Group | 54.5 | 341 | 131K | ¥1.01 / ¥4.1
142 | deepseek-v3-0324 | Deepseek | 54.2 | 2.3K | 75K | ¥1.44 / ¥5.76
143 | trinity-large-preview | - | 53.9 | 2.4K | 262K | ¥1.8 / ¥6.48
144 | qwen3-235b-a22b | Alibaba | 53.6 | 1.3K | 131K | ¥2.07 / ¥8.26
145 | kimi-k2-0711-preview | Moonshot | 53.2 | 1.5K | 131K | ¥4.32 / ¥18
146 | o4-mini-2025-04-16 | Openai | 52.9 | 2.3K | 200K | ¥7.92 / ¥31.7
147 | mistral-medium-2505 | Mistral | 52.6 | 1.8K | 262K | ¥2.88 / ¥14.4
148 | gemini-2.0-flash-001 | Google | 52.3 | 2.2K | 1.05M | ¥1.08 / ¥4.32
149 | ring-flash-2.0 | Ant Group | 51.9 | 331 | 131K | ¥1.01 / ¥4.1
150 | hunyuan-turbos-20250416 | Tencent | 51.6 | 581 | 131K | ¥0 / ¥0
151 | gpt-4.1-mini-2025-04-14 | Openai | 51.3 | 2K | 1.05M | ¥2.88 / ¥11.5
152 | o1-preview | Openai | 51.0 | 2K | 128K | ¥108 / ¥432
153 | deepseek-r1 | Deepseek | 50.6 | 848 | 164K | ¥5.04 / ¥18
154 | qwen3-coder-480b-a35b-instruct | Alibaba | 50.3 | 1.3K | 262K | ¥6.2 / ¥24.8
155 | qwen2.5-max | Alibaba | 50.0 | 1.7K | 32K | ¥11.5 / ¥46
156 | claude-sonnet-4-20250514 | Anthropic | 49.7 | 2K | 200K | ¥21.6 / ¥108
157 | o3-mini | Openai | 49.4 | 2.9K | 200K | ¥7.92 / ¥31.7
158 | step-3 | Stepfun | 49.0 | 259 | 65.5K | ¥1.8 / ¥4.68
159 | qwen-plus-0125 | Alibaba | 48.7 | 358 | 1M | ¥0.83 / ¥2.07
160 | qwq-32b | Alibaba | 48.4 | 1.2K | 131K | ¥2.07 / ¥6.2
161 | minimax-m2 | Minimax | 48.1 | 258 | 197K | ¥0 / ¥0
162 | gpt-5-nano-high | Openai | 47.7 | 324 | 400K | ¥0.36 / ¥2.88
163 | nvidia-llama-3.3-nemotron-super-49b-v1.5 | Nvidia | 47.4 | 180 | 131K | ¥2.88 / ¥2.88
164 | claude-3-7-sonnet-20250219-thinking-32k | Anthropic | 47.1 | 1.9K | - | -
165 | intellect-3 | - | 46.8 | 309 | 131K | ¥1.44 / ¥7.92
166 | minimax-m1 | Minimax | 46.4 | 1.6K | 1M | ¥0.95 / ¥9.03
167 | o1-mini | Openai | 46.1 | 3.2K | 128K | ¥7.92 / ¥31.7
168 | qwen3-30b-a3b | Alibaba | 45.8 | 1.3K | 128K | ¥0.79 / ¥7.78
169 | olmo-3.1-32b-instruct | Allenai | 45.5 | 752 | 200K | ¥14.4 / ¥57.6
170 | granite-4.1-8b | Ibm | 45.1 | 353 | 131K | ¥0.36 / ¥0.72
171 | deepseek-v3 | Deepseek | 44.8 | 1.2K | 128K | ¥0 / ¥0
172 | gemini-2.0-flash-lite-preview-02-05 | Google | 44.5 | 1.2K | 1.05M | ¥0.54 / ¥2.16
173 | gemma-3-27b-it | Google | 44.2 | 2.2K | 128K | ¥2.15 / ¥2.15
174 | claude-3-7-sonnet-20250219 | Anthropic | 43.8 | 2.1K | 200K | ¥21.6 / ¥108
175 | step-1o-turbo-202506 | Stepfun | 43.5 | 472 | - | -
176 | mistral-small-2506 | Mistral | 43.2 | 840 | 262K | ¥2.88 / ¥14.4
177 | command-a-03-2025 | Cohere | 42.9 | 2.8K | 256K | ¥18 / ¥72
178 | qwen2.5-plus-1127 | Alibaba | 42.5 | 664 | - | -
179 | olmo-3-32b-think | Allenai | 42.2 | 275 | 128K | ¥2.16 / ¥3.24
180 | yi-lightning | - | 41.9 | 1.5K | 12K | ¥1.44 / ¥1.44
181 | olmo-3.1-32b-think | Allenai | 41.6 | 503 | 200K | ¥14.4 / ¥57.6
182 | glm-4-plus-0111 | Zai | 41.2 | 354 | 128K | ¥72 / ¥72
183 | step-2-16k-exp-202412 | Stepfun | 40.9 | 310 | 16.4K | ¥37.5 / ¥118
184 | gemini-1.5-pro-002 | Google | 40.6 | 3.3K | - | -
185 | hunyuan-large-2025-02-10 | Tencent | 40.3 | 228 | - | -
186 | gpt-4.1-nano-2025-04-14 | Openai | 39.9 | 328 | 1.05M | ¥14.4 / ¥57.6
187 | athene-v2-chat | - | 39.6 | 1.5K | - | -
188 | deepseek-v2.5-1210 | Deepseek | 39.3 | 441 | 1M | ¥1.01 / ¥2.02
189 | claude-3-5-sonnet-20241022 | Anthropic | 39.0 | 5K | 200K | ¥21.6 / ¥108
190 | gpt-oss-20b | Openai | 38.6 | 489 | 131K | ¥0.32 / ¥1.3
191 | llama-4-maverick-17b-128e-instruct | Meta | 38.3 | 2K | 1M | ¥1.8 / ¥6.26
192 | mistral-small-3.1-24b-instruct-2503 | Mistral | 38.0 | 1.6K | 262K | ¥2.88 / ¥14.4
193 | grok-2-2024-08-13 | Xai | 37.7 | 3.5K | 1M | ¥9 / ¥18
194 | hunyuan-large-vision | Tencent | 37.3 | 287 | - | -
195 | glm-4-plus | Zai | 37.0 | 1.6K | 128K | ¥54 / ¥54
196 | gpt-4o-2024-05-13 | Openai | 36.7 | 5.9K | 128K | ¥36 / ¥108
197 | gemma-3-12b-it | Google | 36.4 | 186 | 128K | ¥1.96 / ¥1.96
198 | qwen-max-0919 | Alibaba | 36.0 | 1K | 131K | ¥2.48 / ¥9.91
199 | hunyuan-standard-2025-02-10 | Tencent | 35.7 | 207 | - | -
200 | gemma-3n-e4b-it | Google | 35.4 | 1.1K | 128K | ¥0 / ¥0
201 | claude-3-5-sonnet-20240620 | Anthropic | 35.1 | 4.3K | 200K | ¥21.6 / ¥108
202 | qwen2.5-72b-instruct | Alibaba | 34.7 | 2.4K | 131K | ¥4.13 / ¥12.4
203 | gemini-1.5-pro-001 | Google | 34.4 | 3.9K | - | -
204 | llama-3.1-405b-instruct-fp8 | Meta | 34.1 | 3.1K | 128K | ¥0 / ¥0
205 | llama-3.1-nemotron-70b-instruct | Nvidia | 33.8 | 453 | 128K | ¥0 / ¥0
206 | ibm-granite-h-small | Ibm | 33.4 | 283 | - | -
207 | gpt-4o-2024-08-06 | Openai | 33.1 | 2.3K | 128K | ¥18 / ¥72
208 | deepseek-v2.5 | Deepseek | 32.8 | 1.5K | 1M | ¥1.01 / ¥2.02
209 | llama-4-scout-17b-16e-instruct | Meta | 32.5 | 1.5K | 128K | ¥1.44 / ¥5.62
210 | grok-2-mini-2024-08-13 | Xai | 32.1 | 2.8K | 1M | ¥9 / ¥18
211 | gpt-4o-mini-2024-07-18 | Openai | 31.8 | 3.5K | 128K | ¥1.08 / ¥4.32
212 | gemini-1.5-flash-002 | Google | 31.5 | 2.1K | 2M | ¥0.54 / ¥2.2
213 | mistral-large-2407 | Mistral | 31.2 | 2.5K | 131K | ¥14.4 / ¥43.2
214 | llama-3.1-405b-instruct-bf16 | Meta | 30.8 | 2.1K | 128K | ¥0 / ¥0
215 | athene-70b-0725 | - | 30.5 | 867 | - | -
216 | llama-3.3-70b-instruct | Meta | 30.2 | 2.9K | 128K | ¥0 / ¥0
217 | gemma-3-4b-it | Google | 29.9 | 208 | 128K | ¥1.44 / ¥1.44
218 | gpt-4-turbo-2024-04-09 | Openai | 29.5 | 5.2K | 128K | ¥72 / ¥216
219 | claude-3-opus-20240229 | Anthropic | 29.2 | 10.4K | 200K | ¥108 / ¥540
220 | qwen2.5-coder-32b-instruct | Alibaba | 28.9 | 267 | 131K | ¥2.07 / ¥6.2
221 | magistral-medium-2506 | Mistral | 28.6 | 553 | 128K | ¥14.4 / ¥36
222 | jamba-1.5-large | - | 28.2 | 331 | 256K | ¥0 / ¥0
223 | gemini-advanced-0514 | Google | 27.9 | 2.4K | - | -
224 | reka-core-20240904 | - | 27.6 | 458 | - | -
225 | gpt-4-1106-preview | Openai | 27.3 | 4.2K | 8.19K | ¥216 / ¥432
226 | amazon-nova-pro-v1.0 | Amazon | 26.9 | 1.4K | 300K | ¥5.76 / ¥23
227 | llama-3.1-70b-instruct | Meta | 26.6 | 2.9K | 131K | ¥2.88 / ¥2.88
228 | mistral-large-2411 | Mistral | 26.3 | 1.5K | 128K | ¥14.4 / ¥43.2
229 | claude-3-5-haiku-20241022 | Anthropic | 26.0 | 3.5K | 200K | ¥5.76 / ¥28.8
230 | gpt-4-0125-preview | Openai | 25.6 | 4.5K | 8.19K | ¥216 / ¥432
231 | phi-4 | Microsoft | 25.3 | 1.1K | 128K | ¥0.9 / ¥3.6
232 | mistral-small-24b-instruct-2501 | Mistral | 25.0 | 754 | 262K | ¥2.88 / ¥14.4
233 | amazon-nova-lite-v1.0 | Amazon | 24.7 | 1.1K | 300K | ¥0.43 / ¥1.73
234 | gemini-1.5-flash-001 | Google | 24.4 | 3.2K | 2M | ¥0.54 / ¥2.2
235 | gemini-1.5-flash-8b-001 | Google | 24.0 | 2.1K | 2M | ¥0.54 / ¥2.2
236 | amazon-nova-micro-v1.0 | Amazon | 23.7 | 1.1K | 128K | ¥0.25 / ¥1.01
237 | c4ai-aya-expanse-32b | Cohere | 23.4 | 1.8K | - | -
238 | reka-flash-20240904 | - | 23.1 | 493 | 65.5K | ¥0.72 / ¥1.44
239 | deepseek-coder-v2 | Deepseek | 22.7 | 769 | 1M | ¥1.01 / ¥2.02
240 | glm-4-0520 | Zai | 22.4 | 514 | 128K | ¥108 / ¥108
241 | command-r-plus-08-2024 | Cohere | 22.1 | 524 | 128K | ¥18 / ¥72
242 | claude-3-sonnet-20240229 | Anthropic | 21.8 | 5.6K | 200K | ¥21.6 / ¥108
243 | gemma-2-27b-it | Google | 21.4 | 4K | 8.19K | ¥0.58 / ¥0.58
244 | nemotron-4-340b-instruct | Nvidia | 21.1 | 1K | - | -
245 | qwen2-72b-instruct | Alibaba | 20.8 | 1.8K | 131K | ¥4.13 / ¥12.4
246 | ministral-8b-2410 | Mistral | 20.5 | 332 | 128K | ¥0.72 / ¥0.72
247 | llama-3.1-nemotron-51b-instruct | Nvidia | 20.1 | 265 | 128K | ¥0 / ¥0
248 | gemma-2-9b-it-simpo | - | 19.8 | 369 | 8.19K | ¥1.44 / ¥1.44
249 | command-r-plus | Cohere | 19.5 | 4K | 128K | ¥18 / ¥72
250 | c4ai-aya-expanse-8b | Cohere | 19.2 | 600 | - | -
251 | internlm2_5-20b-chat | - | 18.8 | 604 | - | -
252 | llama-3-70b-instruct | Meta | 18.5 | 8K | 8.19K | ¥3.67 / ¥5.33
253 | gpt-4-0314 | Openai | 18.2 | 2.2K | 8.19K | ¥216 / ¥432
254 | claude-3-haiku-20240307 | Anthropic | 17.9 | 6.3K | 200K | ¥1.8 / ¥9
255 | gemma-2-9b-it | Google | 17.5 | 2.8K | 8.19K | ¥1.44 / ¥1.44
256 | qwen1.5-110b-chat | Alibaba | 17.2 | 1.4K | - | -
257 | yi-1.5-34b-chat | - | 16.9 | 1K | - | -
258 | llama-3.1-8b-instruct | Meta | 16.6 | 2.6K | 131K | ¥0.79 / ¥0.79
259 | granite-3.1-8b-instruct | Ibm | 16.2 | 237 | - | -
260 | command-r-08-2024 | Cohere | 15.9 | 604 | 128K | ¥18 / ¥72
261 | qwen1.5-72b-chat | Alibaba | 15.6 | 1.8K | - | -
262 | jamba-1.5-mini | - | 15.3 | 332 | 256K | ¥0 / ¥0
263 | qwq-32b-preview | Alibaba | 14.9 | 232 | 131K | ¥2.07 / ¥6.2
264 | granite-3.1-2b-instruct | Ibm | 14.6 | 225 | - | -
265 | gpt-4-0613 | Openai | 14.3 | 3.6K | 8.19K | ¥216 / ¥432
266 | qwen1.5-32b-chat | Alibaba | 14.0 | 1.2K | - | -
267 | mistral-medium | Mistral | 13.6 | 1.4K | 262K | ¥2.88 / ¥14.4
268 | mistral-large-2402 | Mistral | 13.3 | 2.9K | 262K | ¥2.88 / ¥14.4
269 | reka-flash-21b-20240226-online | - | 13.0 | 830 | - | -
270 | llama-3-8b-instruct | Meta | 12.7 | 5.4K | 8.19K | ¥0.29 / ¥0.29
271 | mixtral-8x22b-instruct-v0.1 | Mistral | 12.3 | 2.6K | 64K | ¥14.4 / ¥43.2
272 | phi-3-medium-4k-instruct | Microsoft | 12.0 | 1.1K | 4.1K | ¥1.22 / ¥4.9
273 | command-r | Cohere | 11.7 | 2.8K | 128K | ¥18 / ¥72
274 | reka-flash-21b-20240226 | - | 11.4 | 1.3K | - | -
275 | gemma-2-2b-it | Google | 11.0 | 2.5K | 128K | ¥0 / ¥0
276 | qwen1.5-14b-chat | Alibaba | 10.7 | 944 | - | -
277 | llama-3.2-3b-instruct | Meta | 10.4 | 499 | 131K | ¥0.22 / ¥0.35
278 | mixtral-8x7b-instruct-v0.1 | Mistral | 10.1 | 3.2K | 32K | ¥5.04 / ¥5.04
279 | granite-3.0-8b-instruct | Ibm | 9.7 | 345 | - | -
280 | starling-lm-7b-beta | - | 9.4 | 952 | 200K | ¥5.4 / ¥18.7
281 | dbrx-instruct-preview | - | 9.1 | 1.7K | - | -
282 | gpt-3.5-turbo-1106 | Openai | 8.8 | 437 | 16.4K | ¥7.2 / ¥14.4
283 | phi-3-small-8k-instruct | Microsoft | 8.4 | 895 | 8.19K | ¥1.08 / ¥4.32
284 | gpt-3.5-turbo-0125 | Openai | 8.1 | 3.2K | 16.4K | ¥3.6 / ¥10.8
285 | granite-3.0-2b-instruct | Ibm | 7.8 | 407 | - | -
286 | zephyr-orpo-141b-A35b-v0.1 | - | 7.5 | 219 | 200K | ¥108 / ¥432
287 | yi-34b-chat | - | 7.1 | 559 | - | -
288 | gemini-pro-dev-api | Google | 6.8 | 694 | 1.05M | ¥14.4 / ¥86.4
289 | qwen1.5-7b-chat | Alibaba | 6.5 | 231 | - | -
290 | openchat-3.5-0106 | - | 6.2 | 611 | - | -
291 | phi-3-mini-4k-instruct-june-2024 | Microsoft | 5.8 | 486 | 4.1K | ¥0.94 / ¥3.74
292 | phi-3-mini-4k-instruct | Microsoft | 5.5 | 936 | 4.1K | ¥0.94 / ¥3.74
293 | openchat-3.5 | - | 5.2 | 194 | - | -
294 | gemma-1.1-7b-it | Google | 4.9 | 1.2K | - | -
295 | llama-2-70b-chat | Meta | 4.5 | 1.4K | - | -
296 | llama-2-7b-chat | Meta | 4.2 | 411 | 128K | ¥4.03 / ¥48
297 | mistral-7b-instruct-v0.2 | Mistral | 3.9 | 798 | 262K | ¥2.88 / ¥14.4
298 | starling-lm-7b-alpha | - | 3.6 | 318 | 200K | ¥5.4 / ¥18.7
299 | llama-2-13b-chat | Meta | 3.2 | 539 | - | -
300 | vicuna-33b | - | 2.9 | 484 | - | -
301 | snowflake-arctic-instruct | - | 2.6 | 1.7K | - | -
302 | llama-3.2-1b-instruct | Meta | 2.3 | 487 | 16.4K | ¥0.07 / ¥0.08
303 | zephyr-7b-beta | - | 1.9 | 201 | - | -
304 | gemma-7b-it | Google | 1.6 | 340 | - | -
305 | vicuna-13b | - | 1.3 | 321 | - | -
306 | phi-3-mini-128k-instruct | Microsoft | 1.0 | 1.1K | 128K | ¥0.94 / ¥3.74
307 | qwen1.5-4b-chat | Alibaba | 0.6 | 356 | - | -
308 | gemma-1.1-2b-it | Google | 0.3 | 578 | - | -
309 | mistral-7b-instruct | Mistral | 0.0 | 183 | 262K | ¥2.88 / ¥14.4

Top model analysis

Why claude-opus-4-6 ranks first

claude-opus-4-6 ranks first with a percent score of 100.0 from 3.1K samples. Treat it as the default starting point for this leaderboard, then compare price, context length, and availability against the other top models.
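
The percent scores in the table look consistent with an even spacing by rank: with 309 models, rank 1 shows 100.0, rank 155 shows 50.0, and rank 309 shows 0.0. That is an observation from the published numbers, not a method the source documents. A minimal Python sketch under that assumption (the function name `percent_score` is hypothetical):

```python
def percent_score(rank: int, total: int) -> float:
    """Map a 1-based rank to a 0-100 scale, rounded to one decimal.

    Assumption: scores are spaced evenly between the first rank (100.0)
    and the last rank (0.0). This matches the listed values but is not
    confirmed by the source.
    """
    if total < 2 or not 1 <= rank <= total:
        raise ValueError("rank must be in 1..total and total must be >= 2")
    return round((total - rank) / (total - 1) * 100, 1)


if __name__ == "__main__":
    # Compare a few ranks against the scores shown in the table.
    for rank in (1, 2, 18, 155, 309):
        print(rank, percent_score(rank, 309))
```

If this reading is right, the gap between two adjacent scores reflects rank position only, so it does not say how close two models are in the underlying preference data.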

How to choose

Look beyond rank #1

Start with the leaderboard closest to your task. Compare the top models by score and sample size, then check price, context length, open or closed access, and provider availability.
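
The selection steps above can be sketched as a filter followed by a sort. The example below is illustrative only: the `Entry` class, the `shortlist` function, and the thresholds are hypothetical, the rows are a small subset copied from the table, and the abbreviated counts (for example 3.1K, 1.05M) are expanded to approximate integers. The source does not state the billing unit behind the ¥ prices, so they are compared as listed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    rank: int
    model: str
    score: float
    samples: int
    context_tokens: Optional[int]  # None where the table shows "-"
    price_in: Optional[float]      # input price in ¥, as listed
    price_out: Optional[float]     # output price in ¥, as listed


# A few rows copied from the table (counts expanded from 3.1K, 1.05M, ...).
ENTRIES = [
    Entry(1, "claude-opus-4-6", 100.0, 3100, 1_000_000, 36.0, 180.0),
    Entry(4, "gpt-5.4-high", 99.0, 2400, 1_050_000, 18.0, 108.0),
    Entry(6, "mimo-v2.5-pro", 98.4, 1400, 1_050_000, 7.2, 21.6),
    Entry(9, "amazon-nova-experimental-chat-26-02-10", 97.4, 270, None, None, None),
    Entry(16, "kimi-k2.6", 95.1, 1300, 262_000, 6.84, 28.8),
]


def shortlist(entries: List[Entry], min_samples: int,
              min_context: int, max_price_out: float) -> List[Entry]:
    """Keep entries that meet every threshold, best score first.

    Entries with unknown context or price are dropped because the
    constraint cannot be checked for them.
    """
    kept = [
        e for e in entries
        if e.samples >= min_samples
        and e.context_tokens is not None and e.context_tokens >= min_context
        and e.price_out is not None and e.price_out <= max_price_out
    ]
    return sorted(kept, key=lambda e: e.score, reverse=True)


if __name__ == "__main__":
    picks = shortlist(ENTRIES, min_samples=1000,
                      min_context=200_000, max_price_out=120.0)
    for e in picks:
        print(e.rank, e.model, e.score)
```

With these hypothetical thresholds the top-ranked model is excluded on output price, which is the point of the guidance: rank alone does not settle the choice.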

FAQ

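Which metrics matter on the Expert leaderboard?

Look mainly at rank, percent score (0 to 100), sample size, and source. The score gives a quick comparison of models within the same leaderboard; the sample size indicates how stable the result is.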
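
Because sample size is the stability signal, it can help to convert the abbreviated values in the listing into numbers before comparing them. The sketch below is an illustration, not part of the site: the helper names and the 500-sample threshold are hypothetical, and the rows are copied from the listing.

```python
import re


def parse_count(text):
    """Convert '3.1K', '1.05M' or '899' to an int; return None for '-'."""
    text = text.strip()
    if text == "-":
        return None
    multipliers = {"K": 1_000, "M": 1_000_000}
    suffix = text[-1].upper()
    if suffix in multipliers:
        return round(float(text[:-1]) * multipliers[suffix])
    return int(text)


def parse_price(text):
    """Extract (input, output) prices from '¥36 / ¥180'; None if absent."""
    amounts = re.findall(r"¥\s*(\d+(?:\.\d+)?)", text)
    if len(amounts) != 2:
        return None
    return float(amounts[0]), float(amounts[1])


def low_sample_models(rows, threshold=500):
    """Return model names whose sample count is below the threshold.

    The threshold is an arbitrary illustration, not a value from the source.
    """
    flagged = []
    for model, samples in rows:
        count = parse_count(samples)
        if count is not None and count < threshold:
            flagged.append(model)
    return flagged


# (model, samples) pairs copied from the listing.
ROWS = [
    ("claude-opus-4-6", "3.1K"),
    ("gemini-3.5-flash", "899"),
    ("amazon-nova-experimental-chat-26-02-10", "270"),
    ("qwen3.7-max-preview", "345"),
]

if __name__ == "__main__":
    print(low_sample_models(ROWS))
    print(parse_price("¥36 / ¥180"))
```

A high score backed by a few hundred samples deserves more caution than a similar score backed by several thousand.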

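Why can't different leaderboards be merged into a single overall score?

Different leaderboards use different tasks, samples, and evaluation criteria. By default, 模力榜 ranks models only within a single leaderboard, to avoid forcing abilities such as writing, coding, and image work into one number.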

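How should I choose a model from the Expert leaderboard?

Start with the leaderboard closest to your task, then weigh price, context length, open-source versus closed-source status, and vendor availability. A high rank does not mean a model fits every budget or deployment method.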

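How often is the leaderboard updated?

The page shows the most recent successfully collected public leaderboard data. It currently prefers the LMArena leaderboard dataset and keeps the original links in the page's source section.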