Chat · Text · Math Leaderboard

Ranking for Text / Math, based on public preference data.

Selection guide

Math model ranking guide

Top five models: gemini-3.5-flash, claude-opus-4-6-thinking, claude-opus-4-6, gpt-5.4-high, qwen3.7-max-preview
Models: 349
Published: 2026/05/27
Source: Arena public preference evaluation (original leaderboard: Text / Math)
Leaderboard dataset: LMArena latest parquet

Rank | Model | Provider | Score | Samples | Context | Price (input / output)
--- | --- | --- | --- | --- | --- | ---
1 | gemini-3.5-flash | Google | 100.0 | 526 | 1.05M | ¥10.8 / ¥64.8
2 | claude-opus-4-6-thinking | Anthropic | 99.7 | 2K | 1M | ¥36 / ¥180
3 | claude-opus-4-6 | Anthropic | 99.4 | 2.2K | 1M | ¥36 / ¥180
4 | gpt-5.4-high | OpenAI | 99.1 | 1.7K | 1.05M | ¥18 / ¥108
5 | qwen3.7-max-preview | Alibaba | 98.9 | 220 | 1M | ¥18 / ¥54
6 | gemini-3.1-pro-preview | Google | 98.6 | 2.6K | 1.05M | ¥14.4 / ¥86.4
7 | claude-opus-4-7-thinking | Anthropic | 98.3 | 1.2K | 1M | ¥36 / ¥180
8 | claude-opus-4-7 | Anthropic | 98.0 | 1.2K | 1M | ¥36 / ¥180
9 | mimo-v2.5-pro | Xiaomi | 97.7 | 866 | 1.05M | ¥7.2 / ¥21.6
10 | ernie-5.1 | Baidu | 97.4 | 836 | 119K | ¥5.4 / ¥21.6
11 | gpt-5.5 | OpenAI | 97.1 | 1K | 1.05M | ¥36 / ¥216
12 | gpt-5.5-high | OpenAI | 96.8 | 1K | 1.05M | ¥36 / ¥216
13 | qwen3.6-max-preview | Alibaba | 96.6 | 327 | 246K | ¥9.5 / ¥56.9
14 | glm-5.1 | Z.ai | 96.3 | 860 | 200K | ¥0 / ¥0
15 | qwen3.5-max-preview | Alibaba | 96.0 | 1.3K | - | -
16 | gemini-3-pro | Google | 95.7 | 2.7K | 1.05M | ¥14.4 / ¥86.4
17 | gemini-3-flash | Google | 95.4 | 2K | 1.05M | ¥3.6 / ¥21.6
18 | kimi-k2.6 | Moonshot | 95.1 | 887 | 262K | ¥6.84 / ¥28.8
19 | kimi-k2.5-thinking | Moonshot | 94.8 | 2.3K | 262K | ¥4.32 / ¥21.6
20 | gemma-4-26b-a4b | Google | 94.5 | 369 | 262K | ¥0.94 / ¥2.88
21 | deepseek-v4-pro-thinking | DeepSeek | 94.3 | 886 | 1M | ¥3.13 / ¥6.26
22 | gemma-4-31b | Google | 94.0 | 398 | 262K | ¥3.24 / ¥7.2
23 | grok-4.20-beta-0309-reasoning | xAI | 93.7 | 1.7K | 2M | ¥14.4 / ¥43.2
24 | claude-opus-4-5-20251101 | Anthropic | 93.4 | 4.1K | 200K | ¥36 / ¥180
25 | claude-opus-4-5-20251101-thinking-32k | Anthropic | 93.1 | 2.3K | 200K | ¥108 / ¥540
26 | claude-sonnet-4-6 | Anthropic | 92.8 | 1.7K | 1M | ¥21.6 / ¥108
27 | muse-spark | Meta | 92.5 | 795 | - | -
28 | gpt-5.4 | OpenAI | 92.2 | 1.8K | 1.05M | ¥18 / ¥108
29 | qwen3.6-plus | Alibaba | 92.0 | 1.1K | 1M | ¥3.6 / ¥21.6
30 | gemini-2.5-pro | Google | 91.7 | 7.5K | 1.05M | ¥9 / ¥72
31 | qwen3-max-preview | Alibaba | 91.4 | 1.5K | 262K | ¥6.2 / ¥24.8
32 | gemini-3-flash (thinking-minimal) | Google | 91.1 | 3.2K | 1.05M | ¥3.6 / ¥21.6
33 | mimo-v2-pro | Xiaomi | 90.8 | 1.5K | 1.05M | ¥7.2 / ¥21.6
34 | qwen3.5-397b-a17b | Alibaba | 90.5 | 2K | 262K | ¥3.1 / ¥18.6
35 | claude-sonnet-4-5-20250929-thinking-32k | Anthropic | 90.2 | 4.7K | 200K | ¥21.6 / ¥108
36 | deepseek-v4-flash | DeepSeek | 89.9 | 992 | 1M | ¥1.01 / ¥2.02
37 | grok-4.20-multi-agent-beta-0309 | xAI | 89.7 | 1.7K | 2M | ¥14.4 / ¥43.2
38 | gpt-5.1-high | OpenAI | 89.4 | 2.5K | 400K | ¥9 / ¥72
39 | gpt-5.2-high | OpenAI | 89.1 | 2.9K | 400K | ¥12.6 / ¥101
40 | qwen3-next-80b-a3b-instruct | Alibaba | 88.8 | 1.2K | 131K | ¥1.04 / ¥4.13
41 | kimi-k2.5-instant | Moonshot | 88.5 | 515 | 262K | ¥4.32 / ¥21.6
42 | longcat-flash-chat | Meituan | 88.2 | 689 | 128K | ¥1.08 / ¥10.8
43 | amazon-nova-experimental-chat-26-02-10 | Amazon | 87.9 | 207 | - | -
44 | ernie-5.0-0110 | Baidu | 87.6 | 2.1K | 128K | ¥7.92 / ¥14.4
45 | qwen3-max-2025-09-23 | Alibaba | 87.4 | 584 | 258K | ¥6.19 / ¥24.7
46 | mimo-v2.5 | Xiaomi | 87.1 | 914 | 1.05M | ¥2.88 / ¥14.4
47 | dola-seed-2.0-pro | ByteDance | 86.8 | 2.3K | - | -
48 | gpt-5.2-chat-latest-20260210 | OpenAI | 86.5 | 2K | 400K | ¥12.6 / ¥101
49 | deepseek-v3.2 | DeepSeek | 86.2 | 3K | 128K | ¥2.09 / ¥3.1
50 | grok-4.20-beta1 | xAI | 85.9 | 1.5K | 2M | ¥14.4 / ¥43.2
51 | glm-5 | Z.ai | 85.6 | 1.4K | 205K | ¥7.2 / ¥23
52 | longcat-flash-chat-2602-exp | Meituan | 85.3 | 1.5K | 128K | ¥1.08 / ¥10.8
53 | glm-4.6 | Z.ai | 85.1 | 2.1K | 205K | ¥4.32 / ¥15.8
54 | kimi-k2-thinking-turbo | Moonshot | 84.8 | 3.7K | 262K | ¥17.3 / ¥72
55 | qwen3.5-27b | Alibaba | 84.5 | 1.6K | 262K | ¥2.16 / ¥17.3
56 | deepseek-v4-pro | DeepSeek | 84.2 | 1K | 1M | ¥3.13 / ¥6.26
57 | amazon-nova-experimental-chat-11-10 | Amazon | 83.9 | 1.6K | - | -
58 | qwen3-235b-a22b-instruct-2507 | Alibaba | 83.6 | 5.8K | 128K | ¥2.09 / ¥8.23
59 | claude-opus-4-1-20250805-thinking-16k | Anthropic | 83.3 | 3K | 200K | ¥108 / ¥540
60 | gemini-3.1-flash-lite-preview | Google | 83.0 | 2.2K | 1.05M | ¥1.8 / ¥10.8
61 | amazon-nova-experimental-chat-10-20 | Amazon | 82.8 | 805 | - | -
62 | glm-4.5 | Z.ai | 82.5 | 1.4K | 131K | ¥4.32 / ¥15.8
63 | gpt-5.5-instant | OpenAI | 82.2 | 1.4K | 400K | ¥9 / ¥72
64 | qwen3.5-122b-a10b | Alibaba | 81.9 | 1.7K | 262K | ¥2.88 / ¥23
65 | deepseek-v3.2-exp-thinking | DeepSeek | 81.6 | 481 | 128K | ¥0 / ¥0
66 | deepseek-v4-flash-thinking | DeepSeek | 81.3 | 948 | 1M | ¥1.01 / ¥2.02
67 | o3-2025-04-16 | OpenAI | 81.0 | 3.7K | 200K | ¥14.4 / ¥57.6
68 | grok-4-0709 | xAI | 80.7 | 2.3K | 256K | ¥21.6 / ¥108
69 | qwen3-vl-235b-a22b-instruct | Alibaba | 80.5 | 704 | 128K | ¥2.16 / ¥8.64
70 | grok-4.1-thinking | xAI | 80.2 | 3.7K | 200K | ¥14.4 / ¥72
71 | glm-4.7 | Z.ai | 79.9 | 711 | 205K | ¥0 / ¥0
72 | deepseek-v3.2-exp | DeepSeek | 79.6 | 775 | 128K | ¥0 / ¥0
73 | claude-opus-4-1-20250805 | Anthropic | 79.3 | 4.7K | 200K | ¥108 / ¥540
74 | hunyuan-hy3-preview | Tencent | 79.0 | 378 | 256K | ¥0 / ¥0
75 | amazon-nova-experimental-chat-12-10 | Amazon | 78.7 | 234 | - | -
76 | deepseek-v3.1 | DeepSeek | 78.4 | 992 | 128K | ¥1.44 / ¥5.04
77 | claude-sonnet-4-5-20250929 | Anthropic | 78.2 | 4.7K | 200K | ¥21.6 / ¥108
78 | grok-4.1 | xAI | 77.9 | 4.1K | 200K | ¥14.4 / ¥72
79 | gpt-5.2 | OpenAI | 77.6 | 2.8K | 400K | ¥12.6 / ¥101
80 | deepseek-v3.2-thinking | DeepSeek | 77.3 | 2.5K | 128K | ¥2.09 / ¥3.1
81 | gpt-5.4-mini-high | OpenAI | 77.0 | 1.6K | 400K | ¥5.4 / ¥32.4
82 | gpt-5.4-nano-high | OpenAI | 76.7 | 1.5K | 400K | ¥1.44 / ¥9
83 | grok-4-fast-chat | xAI | 76.4 | 399 | 2M | ¥1.44 / ¥3.6
84 | gemini-2.5-flash-preview-09-2025 | Google | 76.1 | 1.9K | 1M | ¥2.16 / ¥18
85 | mistral-large-3 | Mistral | 75.9 | 2.7K | 262K | ¥3.6 / ¥10.8
86 | qwen3-vl-235b-a22b-thinking | Alibaba | 75.6 | 428 | 131K | ¥2.06 / ¥8.26
87 | deepseek-v3.1-thinking | DeepSeek | 75.3 | 665 | 128K | ¥1.44 / ¥5.04
88 | qwen3-235b-a22b-thinking-2507 | Alibaba | 75.0 | 490 | 131K | ¥2.07 / ¥8.26
89 | gpt-4.5-preview-2025-02-27 | OpenAI | 74.7 | 1.4K | 8.19K | ¥216 / ¥432
90 | gemini-2.5-flash | Google | 74.4 | 7.8K | 1.05M | ¥2.16 / ¥18
91 | minimax-m2.7 | MiniMax | 74.1 | 1.4K | 205K | ¥0 / ¥0
92 | mistral-medium-2508 | Mistral | 73.9 | 5.7K | 262K | ¥2.88 / ¥14.4
93 | ernie-5.0-preview-1022 | Baidu | 73.6 | 268 | 128K | ¥7.92 / ¥14.4
94 | gpt-5.1 | OpenAI | 73.3 | 2.9K | 400K | ¥9 / ¥72
95 | hunyuan-t1-20250711 | Tencent | 73.0 | 236 | 131K | ¥0 / ¥0
96 | gpt-5-chat | OpenAI | 72.7 | 1.8K | 400K | ¥9 / ¥72
97 | deepseek-v3.1-terminus-thinking | DeepSeek | 72.4 | 200 | 128K | ¥1.8 / ¥5.04
98 | qwen3.5-flash | Alibaba | 72.1 | 1.9K | 1M | ¥1.24 / ¥12.4
99 | grok-4-1-fast-reasoning | xAI | 71.8 | 3.4K | 2M | ¥1.44 / ¥3.6
100 | qwen3.5-35b-a3b | Alibaba | 71.6 | 1.7K | 262K | ¥1.8 / ¥14.4
101 | chatgpt-4o-latest-20250326 | OpenAI | 71.3 | 5.7K | 128K | ¥18 / ¥72
102 | ernie-5.0-preview-1203 | Baidu | 71.0 | 618 | 128K | ¥7.92 / ¥14.4
103 | step-3.5-flash | StepFun | 70.7 | 2.1K | 256K | ¥0.69 / ¥2.07
104 | grok-4-fast-reasoning | xAI | 70.4 | 1.1K | 2M | ¥1.44 / ¥3.6
105 | deepseek-r1-0528 | DeepSeek | 70.1 | 869 | 164K | ¥3.6 / ¥15.5
106 | amazon-nova-experimental-chat-26-01-10 | Amazon | 69.8 | 263 | - | -
107 | deepseek-v3.1-terminus | DeepSeek | 69.5 | 218 | 128K | ¥1.8 / ¥5.04
108 | qwen3-235b-a22b-no-thinking | Alibaba | 69.3 | 2.4K | 131K | ¥2.07 / ¥8.26
109 | grok-4.3 | xAI | 69.0 | 846 | 1M | ¥9 / ¥18
110 | qwen3-32b | Alibaba | 68.7 | 316 | 131K | ¥2.07 / ¥8.26
111 | gpt-5-high | OpenAI | 68.4 | 1.9K | 400K | ¥9 / ¥72
112 | glm-4.5-air | Z.ai | 68.1 | 1.5K | 131K | ¥0 / ¥0
113 | kimi-k2-0905-preview | Moonshot | 67.8 | 759 | 262K | ¥4.32 / ¥18
114 | mimo-v2-flash (non-thinking) | Xiaomi | 67.5 | 2.7K | 262K | ¥0.72 / ¥2.16
115 | o3-mini-high | OpenAI | 67.2 | 1.9K | 200K | ¥7.92 / ¥31.7
116 | qwen3-235b-a22b | Alibaba | 67.0 | 1.6K | 131K | ¥2.07 / ¥8.26
117 | minimax-m2.1-preview | MiniMax | 66.7 | 1K | 205K | ¥0 / ¥0
118 | qwen3-next-80b-a3b-thinking | Alibaba | 66.4 | 829 | 131K | ¥1.04 / ¥10.3
119 | qwen3-30b-a3b-instruct-2507 | Alibaba | 66.1 | 1.4K | 262K | ¥2.16 / ¥3.6
120 | nvidia-llama-3.3-nemotron-super-49b-v1.5 | NVIDIA | 65.8 | 194 | 131K | ¥2.88 / ¥2.88
121 | deepseek-r1 | DeepSeek | 65.5 | 1.6K | 164K | ¥5.04 / ¥18
122 | claude-opus-4-20250514-thinking-16k | Anthropic | 65.2 | 2.2K | 200K | ¥108 / ¥540
123 | grok-3-preview-02-24 | xAI | 64.9 | 2.7K | 1M | ¥9 / ¥18
124 | claude-haiku-4-5-20251001 | Anthropic | 64.7 | 4.7K | 200K | ¥7.2 / ¥36
125 | o1-2024-12-17 | OpenAI | 64.4 | 3K | 128K | ¥108 / ¥432
126 | gpt-oss-120b | OpenAI | 64.1 | 1.8K | 131K | ¥1.08 / ¥4.32
127 | o4-mini-2025-04-16 | OpenAI | 63.8 | 2.9K | 200K | ¥7.92 / ¥31.7
128 | gpt-5.3-chat-latest | OpenAI | 63.5 | 1.9K | 128K | ¥12.6 / ¥101
129 | intellect-3 | - | 63.2 | 332 | 131K | ¥1.44 / ¥7.92
130 | grok-3-mini-high | xAI | 62.9 | 977 | 128K | ¥0 / ¥0
131 | minimax-m2.5 | MiniMax | 62.6 | 2.2K | 205K | ¥0 / ¥0
132 | nvidia-nemotron-3-super-120b-a12b | NVIDIA | 62.4 | 511 | 262K | ¥1.44 / ¥5.76
133 | mimo-v2-flash (thinking) | Xiaomi | 62.1 | 633 | 262K | ¥0.72 / ¥2.16
134 | gpt-5-mini-high | OpenAI | 61.8 | 1.5K | 400K | ¥1.8 / ¥14.4
135 | claude-sonnet-4-20250514-thinking-32k | Anthropic | 61.5 | 2K | 200K | ¥21.6 / ¥108
136 | deepseek-v3-0324 | DeepSeek | 61.2 | 3.2K | 75K | ¥1.44 / ¥5.76
137 | nvidia-nemotron-3-nano-30b-a3b-bf16 | NVIDIA | 60.9 | 987 | 131K | ¥0 / ¥0
138 | gemini-2.5-flash-lite-preview-09-2025-no-thinking | Google | 60.6 | 2.9K | 1.05M | ¥0.72 / ¥2.88
139 | o3-mini | OpenAI | 60.3 | 4.7K | 200K | ¥7.92 / ¥31.7
140 | claude-opus-4-20250514 | Anthropic | 60.1 | 2.8K | 200K | ¥108 / ¥540
141 | o1-preview | OpenAI | 59.8 | 4.6K | 128K | ¥108 / ¥432
142 | trinity-large-thinking | - | 59.5 | 1.4K | 262K | ¥1.8 / ¥6.48
143 | ling-flash-2.0 | Ant Group | 59.2 | 461 | 131K | ¥1.01 / ¥4.1
144 | grok-3-mini-beta | xAI | 58.9 | 1.5K | 1M | ¥9 / ¥18
145 | qwen2.5-max | Alibaba | 58.6 | 3.3K | 32K | ¥11.5 / ¥46
146 | gpt-4.1-2025-04-14 | OpenAI | 58.3 | 3.2K | 1.05M | ¥14.4 / ¥57.6
147 | kimi-k2-0711-preview | Moonshot | 58.0 | 1.7K | 131K | ¥4.32 / ¥18
148 | step-3 | StepFun | 57.8 | 353 | 65.5K | ¥1.8 / ¥4.68
149 | qwen3-coder-480b-a35b-instruct | Alibaba | 57.5 | 1.6K | 262K | ¥6.2 / ¥24.8
150 | gemini-2.5-flash-lite-preview-06-17-thinking | Google | 57.2 | 2.1K | 65.5K | ¥0.72 / ¥2.88
151 | minimax-m1 | MiniMax | 56.9 | 1.8K | 1M | ¥0.95 / ¥9.03
152 | nova-2-lite | Amazon | 56.6 | 825 | 128K | ¥2.38 / ¥19.8
153 | llama-3.1-nemotron-ultra-253b-v1 | NVIDIA | 56.3 | 209 | 128K | ¥4.32 / ¥13
154 | qwq-32b | Alibaba | 56.0 | 1.7K | 131K | ¥2.07 / ¥6.2
155 | hunyuan-turbos-20250416 | Tencent | 55.7 | 845 | 131K | ¥0 / ¥0
156 | glm-4.7-flash | Z.ai | 55.5 | 718 | 200K | ¥0 / ¥0
157 | o1-mini | OpenAI | 55.2 | 7.5K | 128K | ¥7.92 / ¥31.7
158 | claude-sonnet-4-20250514 | Anthropic | 54.9 | 2.5K | 200K | ¥21.6 / ¥108
159 | qwen3-30b-a3b | Alibaba | 54.6 | 1.7K | 128K | ¥0.79 / ¥7.78
160 | minimax-m2 | MiniMax | 54.3 | 318 | 197K | ¥0 / ¥0
161 | mistral-medium-2505 | Mistral | 54.0 | 2.2K | 262K | ¥2.88 / ¥14.4
162 | gemini-2.0-flash-001 | Google | 53.7 | 4.1K | 1.05M | ¥1.08 / ¥4.32
163 | glm-4.5v | Z.ai | 53.4 | 276 | 64K | ¥4.32 / ¥13
164 | ring-flash-2.0 | Ant Group | 53.2 | 453 | 131K | ¥1.01 / ¥4.1
165 | gpt-4.1-mini-2025-04-14 | OpenAI | 52.9 | 2.7K | 1.05M | ¥2.88 / ¥11.5
166 | mistral-small-2506 | Mistral | 52.6 | 1K | 262K | ¥2.88 / ¥14.4
167 | claude-3-7-sonnet-20250219-thinking-32k | Anthropic | 52.3 | 2.8K | - | -
168 | trinity-large-preview | - | 52.0 | 1.8K | 262K | ¥1.8 / ¥6.48
169 | qwen-plus-0125 | Alibaba | 51.7 | 732 | 1M | ¥0.83 / ¥2.07
170 | claude-3-7-sonnet-20250219 | Anthropic | 51.4 | 3.4K | 200K | ¥21.6 / ¥108
171 | step-1o-turbo-202506 | StepFun | 51.1 | 564 | - | -
172 | gpt-oss-20b | OpenAI | 50.9 | 680 | 131K | ¥0.32 / ¥1.3
173 | gpt-5-nano-high | OpenAI | 50.6 | 494 | 400K | ¥0.36 / ¥2.88
174 | olmo-3-32b-think | AllenAI | 50.3 | 314 | 128K | ¥2.16 / ¥3.24
175 | gemini-1.5-pro-002 | Google | 50.0 | 7.6K | - | -
176 | gemma-3-27b-it | Google | 49.7 | 3.6K | 128K | ¥2.15 / ¥2.15
177 | olmo-3.1-32b-instruct | AllenAI | 49.4 | 696 | 200K | ¥14.4 / ¥57.6
178 | deepseek-v3 | DeepSeek | 49.1 | 2.7K | 128K | ¥0 / ¥0
179 | gemini-2.0-flash-lite-preview-02-05 | Google | 48.9 | 2.8K | 1.05M | ¥0.54 / ¥2.16
180 | granite-4.1-8b | IBM | 48.6 | 218 | 131K | ¥0.36 / ¥0.72
181 | gemma-3-12b-it | Google | 48.3 | 389 | 128K | ¥1.96 / ¥1.96
182 | claude-3-5-sonnet-20241022 | Anthropic | 48.0 | 10K | 200K | ¥21.6 / ¥108
183 | step-2-16k-exp-202412 | StepFun | 47.7 | 642 | 16.4K | ¥37.5 / ¥118
184 | claude-3-5-sonnet-20240620 | Anthropic | 47.4 | 11.4K | 200K | ¥21.6 / ¥108
185 | athene-v2-chat | - | 47.1 | 3.4K | - | -
186 | llama-4-maverick-17b-128e-instruct | Meta | 46.8 | 2.8K | 1M | ¥1.8 / ¥6.26
187 | yi-lightning | - | 46.6 | 3.9K | 12K | ¥1.44 / ¥1.44
188 | command-a-03-2025 | Cohere | 46.3 | 4K | 256K | ¥18 / ¥72
189 | olmo-3.1-32b-think | AllenAI | 46.0 | 473 | 200K | ¥14.4 / ¥57.6
190 | qwen2.5-plus-1127 | Alibaba | 45.7 | 1.4K | - | -
191 | hunyuan-turbos-20250226 | Tencent | 45.4 | 238 | 131K | ¥0 / ¥0
192 | deepseek-v2.5-1210 | DeepSeek | 45.1 | 1K | 1M | ¥1.01 / ¥2.02
193 | glm-4-plus-0111 | Z.ai | 44.8 | 721 | 128K | ¥72 / ¥72
194 | llama-4-scout-17b-16e-instruct | Meta | 44.5 | 1.9K | 128K | ¥1.44 / ¥5.62
195 | gpt-4o-2024-08-06 | OpenAI | 44.3 | 6.8K | 128K | ¥18 / ¥72
196 | gpt-4o-2024-05-13 | OpenAI | 44.0 | 15.1K | 128K | ¥36 / ¥108
197 | grok-2-2024-08-13 | xAI | 43.7 | 9K | 1M | ¥9 / ¥18
198 | qwen2.5-72b-instruct | Alibaba | 43.4 | 5.4K | 131K | ¥4.13 / ¥12.4
199 | llama-3.1-405b-instruct-fp8 | Meta | 43.1 | 8.5K | 128K | ¥0 / ¥0
200 | hunyuan-large-2025-02-10 | Tencent | 42.8 | 497 | - | -
201 | llama-3.1-405b-instruct-bf16 | Meta | 42.5 | 5.2K | 128K | ¥0 / ¥0
202 | qwen-max-0919 | Alibaba | 42.2 | 2.2K | 131K | ¥2.48 / ¥9.91
203 | glm-4-plus | Z.ai | 42.0 | 3.6K | 128K | ¥54 / ¥54
204 | gpt-4.1-nano-2025-04-14 | OpenAI | 41.7 | 582 | 1.05M | ¥14.4 / ¥57.6
205 | hunyuan-standard-2025-02-10 | Tencent | 41.4 | 499 | - | -
206 | hunyuan-turbo-0110 | Tencent | 41.1 | 243 | - | -
207 | claude-3-opus-20240229 | Anthropic | 40.8 | 25.8K | 200K | ¥108 / ¥540
208 | gemini-advanced-0514 | Google | 40.5 | 6.4K | - | -
209 | gpt-4-turbo-2024-04-09 | OpenAI | 40.2 | 13.2K | 128K | ¥72 / ¥216
210 | llama-3.1-nemotron-70b-instruct | NVIDIA | 39.9 | 1K | 128K | ¥0 / ¥0
211 | deepseek-v2.5 | DeepSeek | 39.7 | 3.6K | 1M | ¥1.01 / ¥2.02
212 | gemini-1.5-pro-001 | Google | 39.4 | 10.5K | - | -
213 | gpt-4-1106-preview | OpenAI | 39.1 | 13.3K | 8.19K | ¥216 / ¥432
214 | gemini-1.5-flash-002 | Google | 38.8 | 4.8K | 2M | ¥0.54 / ¥2.2
215 | hunyuan-large-vision | Tencent | 38.5 | 351 | - | -
216 | gpt-4-0125-preview | OpenAI | 38.2 | 12.4K | 8.19K | ¥216 / ¥432
217 | gpt-4o-mini-2024-07-18 | OpenAI | 37.9 | 9.3K | 128K | ¥1.08 / ¥4.32
218 | llama-3.3-70b-instruct | Meta | 37.6 | 5.8K | 128K | ¥0 / ¥0
219 | grok-2-mini-2024-08-13 | xAI | 37.4 | 7.3K | 1M | ¥9 / ¥18
220 | mistral-large-2407 | Mistral | 37.1 | 6.7K | 131K | ¥14.4 / ¥43.2
221 | mistral-small-3.1-24b-instruct-2503 | Mistral | 36.8 | 2.1K | 262K | ¥2.88 / ¥14.4
222 | mistral-large-2411 | Mistral | 36.5 | 3.6K | 128K | ¥14.4 / ¥43.2
223 | llama-3.1-70b-instruct | Meta | 36.2 | 7.7K | 131K | ¥2.88 / ¥2.88
224 | amazon-nova-pro-v1.0 | Amazon | 35.9 | 3K | 300K | ¥5.76 / ¥23
225 | ibm-granite-h-small | IBM | 35.6 | 358 | - | -
226 | gemma-3n-e4b-it | Google | 35.3 | 1.6K | 128K | ¥0 / ¥0
227 | qwen2.5-coder-32b-instruct | Alibaba | 35.1 | 725 | 131K | ¥2.07 / ¥6.2
228 | magistral-medium-2506 | Mistral | 34.8 | 553 | 128K | ¥14.4 / ¥36
229 | phi-4 | Microsoft | 34.5 | 2.8K | 128K | ¥0.9 / ¥3.6
230 | claude-3-5-haiku-20241022 | Anthropic | 34.2 | 6.4K | 200K | ¥5.76 / ¥28.8
231 | llama-3.1-tulu-3-70b | AllenAI | 33.9 | 397 | - | -
232 | deepseek-coder-v2 | DeepSeek | 33.6 | 1.9K | 1M | ¥1.01 / ¥2.02
233 | mistral-small-24b-instruct-2501 | Mistral | 33.3 | 1.7K | 262K | ¥2.88 / ¥14.4
234 | gemma-3-4b-it | Google | 33.0 | 423 | 128K | ¥1.44 / ¥1.44
235 | qwen2-72b-instruct | Alibaba | 32.8 | 4.8K | 131K | ¥4.13 / ¥12.4
236 | hunyuan-standard-256k | Tencent | 32.5 | 361 | - | -
237 | athene-70b-0725 | - | 32.2 | 2.9K | - | -
238 | gpt-4-0314 | OpenAI | 31.9 | 7.1K | 8.19K | ¥216 / ¥432
239 | llama-3.1-nemotron-51b-instruct | NVIDIA | 31.6 | 507 | 128K | ¥0 / ¥0
240 | gemini-1.5-flash-001 | Google | 31.3 | 8.4K | 2M | ¥0.54 / ¥2.2
241 | amazon-nova-lite-v1.0 | Amazon | 31.0 | 2.5K | 300K | ¥0.43 / ¥1.73
242 | reka-core-20240904 | - | 30.7 | 1.2K | - | -
243 | jamba-1.5-large | - | 30.5 | 1.1K | 256K | ¥0 / ¥0
244 | glm-4-0520 | Z.ai | 30.2 | 1.2K | 128K | ¥108 / ¥108
245 | llama-3-70b-instruct | Meta | 29.9 | 20.9K | 8.19K | ¥3.67 / ¥5.33
246 | gpt-4-0613 | OpenAI | 29.6 | 11.2K | 8.19K | ¥216 / ¥432
247 | nemotron-4-340b-instruct | NVIDIA | 29.3 | 2.4K | - | -
248 | qwq-32b-preview | Alibaba | 29.0 | 480 | 131K | ¥2.07 / ¥6.2
249 | claude-3-sonnet-20240229 | Anthropic | 28.7 | 13.8K | 200K | ¥21.6 / ¥108
250 | gemma-2-27b-it | Google | 28.4 | 10.2K | 8.19K | ¥0.58 / ¥0.58
251 | olmo-2-0325-32b-instruct | AllenAI | 28.2 | 375 | - | -
252 | gemini-1.5-flash-8b-001 | Google | 27.9 | 5K | 2M | ¥0.54 / ¥2.2
253 | amazon-nova-micro-v1.0 | Amazon | 27.6 | 2.5K | 128K | ¥0.25 / ¥1.01
254 | mistral-large-2402 | Mistral | 27.3 | 8K | 262K | ¥2.88 / ¥14.4
255 | c4ai-aya-expanse-32b | Cohere | 27.0 | 3.9K | - | -
256 | reka-flash-20240904 | - | 26.7 | 1.3K | 65.5K | ¥0.72 / ¥1.44
257 | llama-3.1-tulu-3-8b | AllenAI | 26.4 | 363 | - | -
258 | ministral-8b-2410 | Mistral | 26.1 | 683 | 128K | ¥0.72 / ¥0.72
259 | claude-3-haiku-20240307 | Anthropic | 25.9 | 15K | 200K | ¥1.8 / ¥9
260 | command-r-plus-08-2024 | Cohere | 25.6 | 1.5K | 128K | ¥18 / ¥72
261 | qwen1.5-110b-chat | Alibaba | 25.3 | 3.2K | - | -
262 | mixtral-8x22b-instruct-v0.1 | Mistral | 25.0 | 6.8K | 64K | ¥14.4 / ¥43.2
263 | gemma-2-9b-it | Google | 24.7 | 7.1K | 8.19K | ¥1.44 / ¥1.44
264 | yi-1.5-34b-chat | - | 24.4 | 3K | - | -
265 | mistral-medium | Mistral | 24.1 | 4.4K | 262K | ¥2.88 / ¥14.4
266 | internlm2_5-20b-chat | - | 23.9 | 1.4K | - | -
267 | llama-3.1-8b-instruct | Meta | 23.6 | 7.1K | 131K | ¥0.79 / ¥0.79
268 | phi-3-medium-4k-instruct | Microsoft | 23.3 | 3.2K | 4.1K | ¥1.22 / ¥4.9
269 | gemma-2-9b-it-simpo | - | 23.0 | 1.3K | 8.19K | ¥1.44 / ¥1.44
270 | c4ai-aya-expanse-8b | Cohere | 22.7 | 1.3K | - | -
271 | reka-flash-21b-20240226-online | - | 22.4 | 2K | - | -
272 | command-r-plus | Cohere | 22.1 | 9.8K | 128K | ¥18 / ¥72
273 | qwen1.5-72b-chat | Alibaba | 21.8 | 5.3K | - | -
274 | jamba-1.5-mini | - | 21.6 | 1.1K | 256K | ¥0 / ¥0
275 | granite-3.1-2b-instruct | IBM | 21.3 | 391 | - | -
276 | reka-flash-21b-20240226 | - | 21.0 | 3.4K | - | -
277 | qwen1.5-32b-chat | Alibaba | 20.7 | 2.6K | - | -
278 | command-r-08-2024 | Cohere | 20.4 | 1.6K | 128K | ¥18 / ¥72
279 | phi-3-mini-4k-instruct-june-2024 | Microsoft | 20.1 | 1.6K | 4.1K | ¥0.94 / ¥3.74
280 | granite-3.1-8b-instruct | IBM | 19.8 | 382 | - | -
281 | llama-3-8b-instruct | Meta | 19.5 | 14.3K | 8.19K | ¥0.29 / ¥0.29
282 | phi-3-small-8k-instruct | Microsoft | 19.3 | 2.1K | 8.19K | ¥1.08 / ¥4.32
283 | zephyr-orpo-141b-A35b-v0.1 | - | 19.0 | 589 | 200K | ¥108 / ¥432
284 | mixtral-8x7b-instruct-v0.1 | Mistral | 18.7 | 9.7K | 32K | ¥5.04 / ¥5.04
285 | dbrx-instruct-preview | - | 18.4 | 4K | - | -
286 | granite-3.0-8b-instruct | IBM | 18.1 | 873 | - | -
287 | gpt-3.5-turbo-0125 | OpenAI | 17.8 | 8.6K | 16.4K | ¥3.6 / ¥10.8
288 | gpt-3.5-turbo-1106 | OpenAI | 17.5 | 2.1K | 16.4K | ¥7.2 / ¥14.4
289 | gemma-2-2b-it | Google | 17.2 | 6.6K | 128K | ¥0 / ¥0
290 | gemini-pro-dev-api | Google | 17.0 | 2.3K | 1.05M | ¥14.4 / ¥86.4
291 | gemini-pro | Google | 16.7 | 993 | 1.05M | ¥14.4 / ¥86.4
292 | llama-3.2-3b-instruct | Meta | 16.4 | 1.1K | 131K | ¥0.22 / ¥0.35
293 | qwen1.5-14b-chat | Alibaba | 16.1 | 2.2K | - | -
294 | starling-lm-7b-beta | - | 15.8 | 2K | 200K | ¥5.4 / ¥18.7
295 | command-r | Cohere | 15.5 | 6.7K | 128K | ¥18 / ¥72
296 | granite-3.0-2b-instruct | IBM | 15.2 | 908 | - | -
297 | wizardlm-70b | Microsoft | 14.9 | 903 | - | -
298 | yi-34b-chat | - | 14.7 | 2K | - | -
299 | phi-3-mini-4k-instruct | Microsoft | 14.4 | 2.6K | 4.1K | ¥0.94 / ¥3.74
300 | snowflake-arctic-instruct | - | 14.1 | 4.8K | - | -
301 | deepseek-llm-67b-chat | DeepSeek | 13.8 | 576 | 1M | ¥1.01 / ¥2.02
302 | tulu-2-dpo-70b | - | 13.5 | 888 | - | -
303 | gemma-1.1-7b-it | Google | 13.2 | 3K | - | -
304 | openchat-3.5-0106 | - | 12.9 | 1.7K | - | -
305 | smollm2-1.7b-instruct | - | 12.6 | 271 | - | -
306 | openhermes-2.5-mistral-7b | - | 12.4 | 697 | 1M | ¥36 / ¥180
307 | llama-2-70b-chat | Meta | 12.1 | 4.7K | - | -
308 | phi-3-mini-128k-instruct | Microsoft | 11.8 | 2.8K | 128K | ¥0.94 / ¥3.74
309 | llama-3.2-1b-instruct | Meta | 11.5 | 1.2K | 16.4K | ¥0.07 / ¥0.08
310 | mistral-7b-instruct-v0.2 | Mistral | 11.2 | 2.6K | 262K | ¥2.88 / ¥14.4
311 | starling-lm-7b-alpha | - | 10.9 | 1.3K | 200K | ¥5.4 / ¥18.7
312 | qwen1.5-7b-chat | Alibaba | 10.6 | 690 | - | -
313 | dolphin-2.2.1-mistral-7b | - | 10.3 | 219 | 262K | ¥2.88 / ¥14.4
314 | llama2-70b-steerlm-chat | NVIDIA | 10.1 | 440 | - | -
315 | openchat-3.5 | - | 9.8 | 945 | - | -
316 | vicuna-33b | - | 9.5 | 2.7K | - | -
317 | qwen-14b-chat | Alibaba | 9.2 | 534 | 32.8K | ¥1.04 / ¥3.1
318 | gemma-7b-it | Google | 8.9 | 1.1K | - | -
319 | llama-2-13b-chat | Meta | 8.6 | 2.2K | - | -
320 | solar-10.7b-instruct-v1.0 | - | 8.3 | 604 | 128K | ¥0 / ¥0
321 | nous-hermes-2-mixtral-8x7b-dpo | - | 8.0 | 628 | 1M | ¥36 / ¥180
322 | codellama-34b-instruct | Meta | 7.8 | 770 | - | -
323 | palm-2 | Google | 7.5 | 901 | - | -
324 | gemma-1.1-2b-it | Google | 7.2 | 1.4K | - | -
325 | mpt-30b-chat | - | 6.9 | 242 | - | -
326 | llama-2-7b-chat | Meta | 6.6 | 1.7K | 128K | ¥4.03 / ¥48
327 | zephyr-7b-beta | - | 6.3 | 1.3K | - | -
328 | stripedhyena-nous-7b | - | 6.0 | 676 | - | -
329 | guanaco-33b | - | 5.7 | 280 | 200K | ¥14.4 / ¥57.6
330 | vicuna-13b | - | 5.5 | 2.1K | - | -
331 | mistral-7b-instruct | Mistral | 5.2 | 974 | 262K | ¥2.88 / ¥14.4
332 | qwen1.5-4b-chat | Alibaba | 4.9 | 988 | - | -
333 | olmo-7b-instruct | AllenAI | 4.6 | 848 | - | -
334 | wizardlm-13b | Microsoft | 4.3 | 669 | - | -
335 | gemma-2b-it | Google | 4.0 | 597 | - | -
336 | vicuna-7b | - | 3.7 | 658 | - | -
337 | chatglm3-6b | - | 3.4 | 576 | 200K | ¥5.4 / ¥18.7
338 | gpt4all-13b-snoozy | - | 3.2 | 211 | 1M | ¥36 / ¥216
339 | koala-13b | - | 2.9 | 751 | - | -
340 | chatglm-6b | - | 2.6 | 525 | 200K | ¥5.4 / ¥18.7
341 | RWKV-4-Raven-14B | - | 2.3 | 544 | - | -
342 | mpt-7b-chat | - | 2.0 | 471 | - | -
343 | chatglm2-6b | - | 1.7 | 227 | 200K | ¥5.4 / ¥18.7
344 | alpaca-13b | - | 1.4 | 652 | - | -
345 | oasst-pythia-12b | - | 1.1 | 687 | - | -
346 | dolly-v2-12b | - | 0.9 | 370 | - | -
347 | fastchat-t5-3b | - | 0.6 | 462 | - | -
348 | stablelm-tuned-alpha-7b | - | 0.3 | 353 | - | -
349 | llama-13b | Meta | 0.0 | 252 | - | -

Top model analysis

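Why gemini-3.5-flash ranks first

gemini-3.5-flash ranks first with a percent score of 100.0 from 526 samples. That sample count is well below the 1.7K to 2.2K behind ranks 2 through 4, so treat its lead as less settled than theirs. Start with it for this leaderboard, then compare price, context length, and availability before committing.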
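
The published scores fall in even steps from 100.0 at rank 1 to 0.0 at rank 349, which is consistent with a score derived linearly from rank rather than from a raw rating. The sketch below reproduces that pattern. The formula and the name `percent_score` are assumptions inferred from the numbers in the ranking above, not a method documented by the leaderboard.

```python
def percent_score(rank: int, total: int) -> float:
    """Map a 1-based rank to a 0-100 score, assuming a linear rank-to-score scale.

    Assumption: score = 100 * (total - rank) / (total - 1), rounded to one decimal.
    """
    if total < 2 or not 1 <= rank <= total:
        raise ValueError("rank must be in 1..total and total must be at least 2")
    return round(100 * (total - rank) / (total - 1), 1)


# Spot-check against rows published in the ranking (349 models in total).
published = {1: 100.0, 2: 99.7, 5: 98.9, 175: 50.0, 349: 0.0}
for rank, score in published.items():
    assert percent_score(rank, 349) == score
```

If this reading holds, a gap of about 0.3 points between neighbors reflects one rank position rather than a measured difference in quality.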

How to choose

Do not look only at rank #1

Start with the leaderboard closest to your task. Compare the top models by score and sample size, then check price, context length, open or closed access, and provider availability.
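
One way to apply this advice is to filter the top entries by your own constraints before comparing scores. The sketch below uses the first five rows of the ranking, with the displayed values converted to numbers by hand. The `Entry` type, the `shortlist` function, and the thresholds are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    model: str
    score: float      # percent score as published
    samples: int      # displayed value expanded, e.g. "2K" -> 2000 (rounded)
    context: int      # displayed context length expanded, e.g. "1.05M" -> 1_050_000
    price_in: float   # ¥, input price as listed
    price_out: float  # ¥, output price as listed


TOP_FIVE = [
    Entry("gemini-3.5-flash", 100.0, 526, 1_050_000, 10.8, 64.8),
    Entry("claude-opus-4-6-thinking", 99.7, 2000, 1_000_000, 36.0, 180.0),
    Entry("claude-opus-4-6", 99.4, 2200, 1_000_000, 36.0, 180.0),
    Entry("gpt-5.4-high", 99.1, 1700, 1_050_000, 18.0, 108.0),
    Entry("qwen3.7-max-preview", 98.9, 220, 1_000_000, 18.0, 54.0),
]


def shortlist(entries, min_samples, max_price_out, min_context=0):
    """Keep entries that meet every constraint, best score first."""
    keep = [
        e for e in entries
        if e.samples >= min_samples
        and e.price_out <= max_price_out
        and e.context >= min_context
    ]
    return sorted(keep, key=lambda e: e.score, reverse=True)


# Hypothetical constraints: at least 1000 samples, output price at most ¥150.
picks = shortlist(TOP_FIVE, min_samples=1000, max_price_out=150)
print([e.model for e in picks])  # ['gpt-5.4-high']
```

With these example thresholds the rank #1 model drops out on sample size and the two claude-opus-4-6 variants drop out on price, which illustrates why rank alone is not enough.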

FAQ

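Which metrics matter on the math leaderboard?

Look at rank, percent score, sample size, and source. The score gives a quick comparison of model performance within a single leaderboard; the sample size indicates how stable the result is.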
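
Sample counts on this page are displayed in abbreviated form ("526", "2.2K"), so they need converting before they can be compared. The sketch below shows one way to do that and to flag entries with few samples. The helper `parse_count` and the cutoff `MIN_SAMPLES` are hypothetical; the page does not define a stability threshold, and values shown with a K or M suffix are rounded, so the results are approximate.

```python
from typing import Optional


def parse_count(text: str) -> Optional[int]:
    """Convert a displayed count such as '526', '2.2K' or '1.05M' to an integer.

    Returns None for the '-' placeholder used when a value is missing.
    """
    text = text.strip()
    if text in ("", "-"):
        return None
    multipliers = {"K": 1_000, "M": 1_000_000}
    suffix = text[-1].upper()
    if suffix in multipliers:
        return round(float(text[:-1]) * multipliers[suffix])
    return int(text)


# Hypothetical cutoff below which a ranking is treated as less stable.
MIN_SAMPLES = 500

rows = [
    ("gemini-3.5-flash", "526"),
    ("claude-opus-4-6-thinking", "2K"),
    ("qwen3.7-max-preview", "220"),
]
low_sample = [model for model, shown in rows if parse_count(shown) < MIN_SAMPLES]
print(low_sample)  # ['qwen3.7-max-preview']
```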

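Why can't different leaderboards be merged into a single overall score?

Leaderboards differ in their tasks, samples, and evaluation criteria. By default, 模力榜 ranks models only within a single leaderboard, to avoid forcing abilities such as writing, coding, and image generation into one number.

How should I choose a math model?

Start with the leaderboard closest to your task, then weigh price, context length, open versus closed source, and provider availability. A high rank does not mean a model fits every budget or deployment setup.

How often is the leaderboard updated?

The page shows the most recent successfully collected public leaderboard data. It currently prefers the LMArena leaderboard dataset and keeps the original link in the page's source section.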