Chat · Text · Legal & Government Leaderboard

Ranking for Text / Legal & Government, based on public preference data.

Selection guide

Legal & Government model ranking guide

claude-opus-4-6-thinking, claude-opus-4-6, claude-opus-4-7-thinking, muse-spark, gemini-3-pro
Models: 333
Published: 2026/05/27
Arena public preference evaluation
Original leaderboard: Text / Industry Legal And Government
Leaderboard dataset: LMArena latest parquet
Rank | Model | Provider | Score | Samples | Context | Price (Input / Output)
1 | claude-opus-4-6-thinking | Anthropic | 100.0 | 2.6K | 1M | ¥36 / ¥180
2 | claude-opus-4-6 | Anthropic | 99.7 | 2.7K | 1M | ¥36 / ¥180
3 | claude-opus-4-7-thinking | Anthropic | 99.4 | 1.6K | 1M | ¥36 / ¥180
4 | muse-spark | Meta | 99.1 | 876 | - | -
5 | gemini-3-pro | Google | 98.8 | 2.9K | 1.05M | ¥14.4 / ¥86.4
6 | gemini-3.1-pro-preview | Google | 98.5 | 3.3K | 1.05M | ¥14.4 / ¥86.4
7 | gemini-3.5-flash | Google | 98.2 | 689 | 1.05M | ¥10.8 / ¥64.8
8 | gpt-5.4-high | OpenAI | 97.9 | 2.2K | 1.05M | ¥18 / ¥108
9 | qwen3.5-max-preview | Alibaba | 97.6 | 1.6K | - | -
10 | gemini-2.5-pro | Google | 97.3 | 8.7K | 1.05M | ¥9 / ¥72
11 | gpt-5.5-high | OpenAI | 97.0 | 1.3K | 1.05M | ¥36 / ¥216
12 | claude-opus-4-7 | Anthropic | 96.7 | 1.6K | 1M | ¥36 / ¥180
13 | gemini-3-flash | Google | 96.4 | 2.3K | 1.05M | ¥3.6 / ¥21.6
14 | qwen3.7-max-preview | Alibaba | 96.1 | 292 | 1M | ¥18 / ¥54
15 | ernie-5.1 | Baidu | 95.8 | 1.1K | 119K | ¥5.4 / ¥21.6
16 | mimo-v2.5-pro | Xiaomi | 95.5 | 1.2K | 1.05M | ¥7.2 / ¥21.6
17 | glm-5.1 | Zai | 95.2 | 997 | 200K | ¥0 / ¥0
18 | gpt-5.4 | OpenAI | 94.9 | 2.2K | 1.05M | ¥18 / ¥108
19 | gpt-5.5 | OpenAI | 94.6 | 1.2K | 1.05M | ¥36 / ¥216
20 | claude-sonnet-4-6 | Anthropic | 94.3 | 2.1K | 1M | ¥21.6 / ¥108
21 | deepseek-v4-pro-thinking | DeepSeek | 94.0 | 1.2K | 1M | ¥3.13 / ¥6.26
22 | amazon-nova-experimental-chat-26-02-10 | Amazon | 93.7 | 247 | - | -
23 | ernie-5.0-preview-1022 | Baidu | 93.4 | 322 | 128K | ¥7.92 / ¥14.4
24 | deepseek-v3.1-terminus-thinking | DeepSeek | 93.1 | 262 | 128K | ¥1.8 / ¥5.04
25 | dola-seed-2.0-pro | ByteDance | 92.8 | 2.7K | - | -
26 | gpt-5.1-high | OpenAI | 92.5 | 2.9K | 400K | ¥9 / ¥72
27 | claude-opus-4-5-20251101 | Anthropic | 92.2 | 4.8K | 200K | ¥36 / ¥180
28 | deepseek-v4-pro | DeepSeek | 91.9 | 1.2K | 1M | ¥3.13 / ¥6.26
29 | kimi-k2.6 | Moonshot | 91.6 | 1.1K | 262K | ¥6.84 / ¥28.8
30 | gemini-3-flash (thinking-minimal) | Google | 91.3 | 4.1K | 1.05M | ¥3.6 / ¥21.6
31 | grok-4.20-multi-agent-beta-0309 | xAI | 91.0 | 2.2K | 2M | ¥14.4 / ¥43.2
32 | gemma-4-26b-a4b | Google | 90.7 | 386 | 262K | ¥0.94 / ¥2.88
33 | claude-opus-4-5-20251101-thinking-32k | Anthropic | 90.4 | 2.7K | 200K | ¥108 / ¥540
34 | glm-4.6 | Zai | 90.1 | 2.4K | 205K | ¥4.32 / ¥15.8
35 | glm-5 | Zai | 89.8 | 1.7K | 205K | ¥7.2 / ¥23
36 | ernie-5.0-preview-1203 | Baidu | 89.5 | 763 | 128K | ¥7.92 / ¥14.4
37 | claude-sonnet-4-5-20250929 | Anthropic | 89.2 | 5.4K | 200K | ¥21.6 / ¥108
38 | mistral-large-3 | Mistral | 88.9 | 3K | 262K | ¥3.6 / ¥10.8
39 | deepseek-v4-flash-thinking | DeepSeek | 88.6 | 1.2K | 1M | ¥1.01 / ¥2.02
40 | gemma-4-31b | Google | 88.3 | 428 | 262K | ¥3.24 / ¥7.2
41 | grok-4.1 | xAI | 88.0 | 4.7K | 200K | ¥14.4 / ¥72
42 | glm-4.7 | Zai | 87.7 | 958 | 205K | ¥0 / ¥0
43 | qwen3-vl-235b-a22b-instruct | Alibaba | 87.3 | 753 | 128K | ¥2.16 / ¥8.64
44 | grok-4.20-beta-0309-reasoning | xAI | 87.0 | 2.2K | 2M | ¥14.4 / ¥43.2
45 | mimo-v2-pro | Xiaomi | 86.7 | 1.6K | 1.05M | ¥7.2 / ¥21.6
46 | chatgpt-4o-latest-20250326 | OpenAI | 86.4 | 5.3K | 128K | ¥18 / ¥72
47 | qwen3.6-max-preview | Alibaba | 86.1 | 330 | 246K | ¥9.5 / ¥56.9
48 | grok-3-preview-02-24 | xAI | 85.8 | 1.9K | 1M | ¥9 / ¥18
49 | grok-4.1-thinking | xAI | 85.5 | 4.6K | 200K | ¥14.4 / ¥72
50 | mistral-medium-2508 | Mistral | 85.2 | 6.4K | 262K | ¥2.88 / ¥14.4
51 | gpt-5.1 | OpenAI | 84.9 | 3.1K | 400K | ¥9 / ¥72
52 | ernie-5.0-0110 | Baidu | 84.6 | 2.5K | 128K | ¥7.92 / ¥14.4
53 | grok-4.20-beta1 | xAI | 84.3 | 1.8K | 2M | ¥14.4 / ¥43.2
54 | qwen3.6-plus | Alibaba | 84.0 | 1.3K | 1M | ¥3.6 / ¥21.6
55 | deepseek-v4-flash | DeepSeek | 83.7 | 1.2K | 1M | ¥1.01 / ¥2.02
56 | glm-4.5 | Zai | 83.4 | 1.5K | 131K | ¥4.32 / ¥15.8
57 | kimi-k2.5-thinking | Moonshot | 83.1 | 2.7K | 262K | ¥4.32 / ¥21.6
58 | claude-sonnet-4-5-20250929-thinking-32k | Anthropic | 82.8 | 5.3K | 200K | ¥21.6 / ¥108
59 | qwen3.5-397b-a17b | Alibaba | 82.5 | 2.3K | 262K | ¥3.1 / ¥18.6
60 | gpt-5.2-chat-latest-20260210 | OpenAI | 82.2 | 2.4K | 400K | ¥12.6 / ¥101
61 | qwen3-next-80b-a3b-instruct | Alibaba | 81.9 | 1.4K | 131K | ¥1.04 / ¥4.13
62 | qwen3-max-preview | Alibaba | 81.6 | 1.7K | 262K | ¥6.2 / ¥24.8
63 | amazon-nova-experimental-chat-12-10 | Amazon | 81.3 | 256 | - | -
64 | gpt-4.5-preview-2025-02-27 | OpenAI | 81.0 | 701 | 8.19K | ¥216 / ¥432
65 | gemini-2.5-flash | Google | 80.7 | 8.5K | 1.05M | ¥2.16 / ¥18
66 | kimi-k2.5-instant | Moonshot | 80.4 | 517 | 262K | ¥4.32 / ¥21.6
67 | amazon-nova-experimental-chat-11-10 | Amazon | 80.1 | 1.8K | - | -
68 | gpt-5-chat | OpenAI | 79.8 | 2K | 400K | ¥9 / ¥72
69 | qwen3-235b-a22b-instruct-2507 | Alibaba | 79.5 | 6.4K | 128K | ¥2.09 / ¥8.23
70 | gpt-5.2 | OpenAI | 79.2 | 3.5K | 400K | ¥12.6 / ¥101
71 | grok-4-0709 | xAI | 78.9 | 2.7K | 256K | ¥21.6 / ¥108
72 | o3-2025-04-16 | OpenAI | 78.6 | 3.8K | 200K | ¥14.4 / ¥57.6
73 | gemini-2.5-flash-preview-09-2025 | Google | 78.3 | 2.2K | 1M | ¥2.16 / ¥18
74 | grok-4-fast-reasoning | xAI | 78.0 | 1.2K | 2M | ¥1.44 / ¥3.6
75 | gpt-5.2-high | OpenAI | 77.7 | 3.4K | 400K | ¥12.6 / ¥101
76 | hunyuan-hy3-preview | Tencent | 77.4 | 441 | 256K | ¥0 / ¥0
77 | deepseek-v3.2 | DeepSeek | 77.1 | 3.3K | 128K | ¥2.09 / ¥3.1
78 | qwen3.5-122b-a10b | Alibaba | 76.8 | 2.1K | 262K | ¥2.88 / ¥23
79 | deepseek-v3.2-thinking | DeepSeek | 76.5 | 2.9K | 128K | ¥2.09 / ¥3.1
80 | gpt-5.4-mini-high | OpenAI | 76.2 | 2K | 400K | ¥5.4 / ¥32.4
81 | gpt-5-high | OpenAI | 75.9 | 1.9K | 400K | ¥9 / ¥72
82 | deepseek-v3.2-exp-thinking | DeepSeek | 75.6 | 521 | 128K | ¥0 / ¥0
83 | deepseek-r1-0528 | DeepSeek | 75.3 | 1.4K | 164K | ¥3.6 / ¥15.5
84 | hunyuan-t1-20250711 | Tencent | 75.0 | 273 | 131K | ¥0 / ¥0
85 | gpt-5.5-instant | OpenAI | 74.7 | 2.1K | 400K | ¥9 / ¥72
86 | deepseek-v3.1 | DeepSeek | 74.4 | 1K | 128K | ¥1.44 / ¥5.04
87 | claude-opus-4-1-20250805-thinking-16k | Anthropic | 74.1 | 3.2K | 200K | ¥108 / ¥540
88 | deepseek-v3.1-thinking | DeepSeek | 73.8 | 780 | 128K | ¥1.44 / ¥5.04
89 | kimi-k2-thinking-turbo | Moonshot | 73.5 | 4.3K | 262K | ¥17.3 / ¥72
90 | claude-opus-4-1-20250805 | Anthropic | 73.2 | 5.3K | 200K | ¥108 / ¥540
91 | longcat-flash-chat | Meituan | 72.9 | 778 | 128K | ¥1.08 / ¥10.8
92 | gemini-3.1-flash-lite-preview | Google | 72.6 | 2.7K | 1.05M | ¥1.8 / ¥10.8
93 | mimo-v2-flash (non-thinking) | Xiaomi | 72.3 | 3.1K | 262K | ¥0.72 / ¥2.16
94 | longcat-flash-chat-2602-exp | Meituan | 72.0 | 1.7K | 128K | ¥1.08 / ¥10.8
95 | minimax-m2.1-preview | MiniMax | 71.7 | 1.2K | 205K | ¥0 / ¥0
96 | grok-4-fast-chat | xAI | 71.4 | 436 | 2M | ¥1.44 / ¥3.6
97 | qwen3-vl-235b-a22b-thinking | Alibaba | 71.1 | 511 | 131K | ¥2.06 / ¥8.26
98 | deepseek-v3.2-exp | DeepSeek | 70.8 | 782 | 128K | ¥0 / ¥0
99 | mimo-v2.5 | Xiaomi | 70.5 | 1.2K | 1.05M | ¥2.88 / ¥14.4
100 | step-3.5-flash | StepFun | 70.2 | 2.5K | 256K | ¥0.69 / ¥2.07
101 | qwen3-235b-a22b-thinking-2507 | Alibaba | 69.9 | 539 | 131K | ¥2.07 / ¥8.26
102 | deepseek-v3.1-terminus | DeepSeek | 69.6 | 271 | 128K | ¥1.8 / ¥5.04
103 | qwen3.5-27b | Alibaba | 69.3 | 2K | 262K | ¥2.16 / ¥17.3
104 | qwen3-max-2025-09-23 | Alibaba | 69.0 | 570 | 258K | ¥6.19 / ¥24.7
105 | mimo-v2-flash (thinking) | Xiaomi | 68.7 | 856 | 262K | ¥0.72 / ¥2.16
106 | kimi-k2-0905-preview | Moonshot | 68.4 | 781 | 262K | ¥4.32 / ¥18
107 | grok-4-1-fast-reasoning | xAI | 68.1 | 3.9K | 2M | ¥1.44 / ¥3.6
108 | minimax-m2.7 | MiniMax | 67.8 | 1.7K | 205K | ¥0 / ¥0
109 | gpt-4.1-2025-04-14 | OpenAI | 67.5 | 3.3K | 1.05M | ¥14.4 / ¥57.6
110 | qwen3.5-35b-a3b | Alibaba | 67.2 | 2.1K | 262K | ¥1.8 / ¥14.4
111 | glm-4.6v | Zai | 66.9 | 203 | 128K | ¥2.16 / ¥6.48
112 | grok-4.3 | xAI | 66.6 | 1.2K | 1M | ¥9 / ¥18
113 | qwen3-235b-a22b-no-thinking | Alibaba | 66.3 | 2.6K | 131K | ¥2.07 / ¥8.26
114 | qwen3.5-flash | Alibaba | 66.0 | 2.2K | 1M | ¥1.24 / ¥12.4
115 | mimo-v2-omni | Xiaomi | 65.7 | 199 | 262K | ¥2.88 / ¥14.4
116 | claude-haiku-4-5-20251001 | Anthropic | 65.4 | 5.5K | 200K | ¥7.2 / ¥36
117 | gemini-2.5-flash-lite-preview-09-2025-no-thinking | Google | 65.1 | 3.2K | 1.05M | ¥0.72 / ¥2.88
118 | nvidia-nemotron-3-super-120b-a12b | NVIDIA | 64.8 | 512 | 262K | ¥1.44 / ¥5.76
119 | gpt-5.3-chat-latest | OpenAI | 64.5 | 2.4K | 128K | ¥12.6 / ¥101
120 | amazon-nova-experimental-chat-10-20 | Amazon | 64.2 | 814 | - | -
121 | grok-3-mini-high | xAI | 63.9 | 1.2K | 128K | ¥0 / ¥0
122 | glm-4.5-air | Zai | 63.6 | 1.9K | 131K | ¥0 / ¥0
123 | qwen3-30b-a3b-instruct-2507 | Alibaba | 63.3 | 1.5K | 262K | ¥2.16 / ¥3.6
124 | mistral-medium-2505 | Mistral | 63.0 | 2.1K | 262K | ¥2.88 / ¥14.4
125 | amazon-nova-experimental-chat-26-01-10 | Amazon | 62.7 | 264 | - | -
126 | gemma-3-12b-it | Google | 62.3 | 217 | 128K | ¥1.96 / ¥1.96
127 | deepseek-v3-0324 | DeepSeek | 62.0 | 2.9K | 75K | ¥1.44 / ¥5.76
128 | gpt-5-mini-high | OpenAI | 61.7 | 1.7K | 400K | ¥1.8 / ¥14.4
129 | gemma-3-27b-it | Google | 61.4 | 2.7K | 128K | ¥2.15 / ¥2.15
130 | qwen2.5-max | Alibaba | 61.1 | 1.9K | 32K | ¥11.5 / ¥46
131 | claude-opus-4-20250514 | Anthropic | 60.8 | 2.9K | 200K | ¥108 / ¥540
132 | gemini-2.5-flash-lite-preview-06-17-thinking | Google | 60.5 | 2.1K | 65.5K | ¥0.72 / ¥2.88
133 | gemini-2.0-flash-001 | Google | 60.2 | 2.6K | 1.05M | ¥1.08 / ¥4.32
134 | hunyuan-turbos-20250416 | Tencent | 59.9 | 741 | 131K | ¥0 / ¥0
135 | deepseek-r1 | DeepSeek | 59.6 | 1.1K | 164K | ¥5.04 / ¥18
136 | ling-flash-2.0 | Ant Group | 59.3 | 437 | 131K | ¥1.01 / ¥4.1
137 | o1-2024-12-17 | OpenAI | 59.0 | 1.5K | 128K | ¥108 / ¥432
138 | grok-3-mini-beta | xAI | 58.7 | 1.5K | 1M | ¥9 / ¥18
139 | amazon-nova-experimental-chat-10-09 | Amazon | 58.4 | 208 | - | -
140 | minimax-m2.5 | MiniMax | 58.1 | 2.8K | 205K | ¥0 / ¥0
141 | claude-opus-4-20250514-thinking-16k | Anthropic | 57.8 | 2.5K | 200K | ¥108 / ¥540
142 | qwen3-next-80b-a3b-thinking | Alibaba | 57.5 | 868 | 131K | ¥1.04 / ¥10.3
143 | gpt-5.4-nano-high | OpenAI | 57.2 | 2K | 400K | ¥1.44 / ¥9
144 | glm-4-plus-0111 | Zai | 56.9 | 367 | 128K | ¥72 / ¥72
145 | glm-4.7-flash | Zai | 56.6 | 825 | 200K | ¥0 / ¥0
146 | qwen3-coder-480b-a35b-instruct | Alibaba | 56.3 | 1.6K | 262K | ¥6.2 / ¥24.8
147 | gpt-oss-120b | OpenAI | 56.0 | 1.9K | 131K | ¥1.08 / ¥4.32
148 | kimi-k2-0711-preview | Moonshot | 55.7 | 1.8K | 131K | ¥4.32 / ¥18
149 | qwen3-235b-a22b | Alibaba | 55.4 | 1.7K | 131K | ¥2.07 / ¥8.26
150 | nova-2-lite | Amazon | 55.1 | 891 | 128K | ¥2.38 / ¥19.8
151 | step-1o-turbo-202506 | StepFun | 54.8 | 657 | - | -
152 | o4-mini-2025-04-16 | OpenAI | 54.5 | 2.8K | 200K | ¥7.92 / ¥31.7
153 | qwen-plus-0125 | Alibaba | 54.2 | 392 | 1M | ¥0.83 / ¥2.07
154 | nvidia-nemotron-3-nano-30b-a3b-bf16 | NVIDIA | 53.9 | 1.2K | 131K | ¥0 / ¥0
155 | command-a-03-2025 | Cohere | 53.6 | 3.5K | 256K | ¥18 / ¥72
156 | o1-preview | OpenAI | 53.3 | 1.9K | 128K | ¥108 / ¥432
157 | granite-4.1-8b | IBM | 53.0 | 206 | 131K | ¥0.36 / ¥0.72
158 | minimax-m2 | MiniMax | 52.7 | 429 | 197K | ¥0 / ¥0
159 | nvidia-llama-3.3-nemotron-super-49b-v1.5 | NVIDIA | 52.4 | 233 | 131K | ¥2.88 / ¥2.88
160 | gpt-4.1-mini-2025-04-14 | OpenAI | 52.1 | 2.5K | 1.05M | ¥2.88 / ¥11.5
161 | gemini-2.0-flash-lite-preview-02-05 | Google | 51.8 | 1.4K | 1.05M | ¥0.54 / ¥2.16
162 | trinity-large-thinking | - | 51.5 | 1.8K | 262K | ¥1.8 / ¥6.48
163 | claude-sonnet-4-20250514 | Anthropic | 51.2 | 2.7K | 200K | ¥21.6 / ¥108
164 | deepseek-v3 | DeepSeek | 50.9 | 1.3K | 128K | ¥0 / ¥0
165 | minimax-m1 | MiniMax | 50.6 | 2.4K | 1M | ¥0.95 / ¥9.03
166 | trinity-large-preview | - | 50.3 | 2K | 262K | ¥1.8 / ¥6.48
167 | mistral-small-2506 | Mistral | 50.0 | 1.2K | 262K | ¥2.88 / ¥14.4
168 | step-3 | StepFun | 49.7 | 416 | 65.5K | ¥1.8 / ¥4.68
169 | glm-4.5v | Zai | 49.4 | 277 | 64K | ¥4.32 / ¥13
170 | qwq-32b | Alibaba | 49.1 | 1.6K | 131K | ¥2.07 / ¥6.2
171 | step-2-16k-exp-202412 | StepFun | 48.8 | 278 | 16.4K | ¥37.5 / ¥118
172 | claude-sonnet-4-20250514-thinking-32k | Anthropic | 48.5 | 2.3K | 200K | ¥21.6 / ¥108
173 | olmo-3.1-32b-instruct | Allenai | 48.2 | 908 | 200K | ¥14.4 / ¥57.6
174 | o3-mini-high | OpenAI | 47.9 | 1K | 200K | ¥7.92 / ¥31.7
175 | gemini-1.5-pro-002 | Google | 47.6 | 3.3K | - | -
176 | yi-lightning | - | 47.3 | 1.7K | 12K | ¥1.44 / ¥1.44
177 | gemma-3-4b-it | Google | 47.0 | 245 | 128K | ¥1.44 / ¥1.44
178 | grok-2-2024-08-13 | xAI | 46.7 | 3.8K | 1M | ¥9 / ¥18
179 | qwen3-32b | Alibaba | 46.4 | 285 | 131K | ¥2.07 / ¥8.26
180 | intellect-3 | - | 46.1 | 398 | 131K | ¥1.44 / ¥7.92
181 | o1-mini | OpenAI | 45.8 | 3.1K | 128K | ¥7.92 / ¥31.7
182 | gpt-4o-2024-05-13 | OpenAI | 45.5 | 6.7K | 128K | ¥36 / ¥108
183 | qwen3-30b-a3b | Alibaba | 45.2 | 1.8K | 128K | ¥0.79 / ¥7.78
184 | gemma-3n-e4b-it | Google | 44.9 | 1.4K | 128K | ¥0 / ¥0
185 | glm-4-plus | Zai | 44.6 | 1.7K | 128K | ¥54 / ¥54
186 | ring-flash-2.0 | Ant Group | 44.3 | 444 | 131K | ¥1.01 / ¥4.1
187 | qwen2.5-plus-1127 | Alibaba | 44.0 | 588 | - | -
188 | o3-mini | OpenAI | 43.7 | 3.6K | 200K | ¥7.92 / ¥31.7
189 | gpt-4o-mini-2024-07-18 | OpenAI | 43.4 | 3.9K | 128K | ¥1.08 / ¥4.32
190 | gemini-advanced-0514 | Google | 43.1 | 3K | - | -
191 | llama-3.1-405b-instruct-fp8 | Meta | 42.8 | 3.4K | 128K | ¥0 / ¥0
192 | llama-3.1-nemotron-70b-instruct | NVIDIA | 42.5 | 489 | 128K | ¥0 / ¥0
193 | gemini-1.5-flash-002 | Google | 42.2 | 2.2K | 2M | ¥0.54 / ¥2.2
194 | claude-3-7-sonnet-20250219-thinking-32k | Anthropic | 41.9 | 2.5K | - | -
195 | llama-3.1-405b-instruct-bf16 | Meta | 41.6 | 2.3K | 128K | ¥0 / ¥0
196 | gpt-5-nano-high | OpenAI | 41.3 | 471 | 400K | ¥0.36 / ¥2.88
197 | gpt-4.1-nano-2025-04-14 | OpenAI | 41.0 | 382 | 1.05M | ¥14.4 / ¥57.6
198 | claude-3-7-sonnet-20250219 | Anthropic | 40.7 | 2.5K | 200K | ¥21.6 / ¥108
199 | athene-v2-chat | - | 40.4 | 1.4K | - | -
200 | olmo-3-32b-think | Allenai | 40.1 | 430 | 128K | ¥2.16 / ¥3.24
201 | athene-70b-0725 | - | 39.8 | 1K | - | -
202 | grok-2-mini-2024-08-13 | xAI | 39.5 | 3K | 1M | ¥9 / ¥18
203 | hunyuan-standard-2025-02-10 | Tencent | 39.2 | 288 | - | -
204 | qwen2.5-72b-instruct | Alibaba | 38.9 | 2.4K | 131K | ¥4.13 / ¥12.4
205 | llama-4-scout-17b-16e-instruct | Meta | 38.6 | 2K | 128K | ¥1.44 / ¥5.62
206 | llama-4-maverick-17b-128e-instruct | Meta | 38.3 | 2.6K | 1M | ¥1.8 / ¥6.26
207 | llama-3.3-70b-instruct | Meta | 38.0 | 3.2K | 128K | ¥0 / ¥0
208 | llama-3.1-70b-instruct | Meta | 37.7 | 3.2K | 131K | ¥2.88 / ¥2.88
209 | qwen-max-0919 | Alibaba | 37.3 | 1.1K | 131K | ¥2.48 / ¥9.91
210 | deepseek-v2.5 | DeepSeek | 37.0 | 1.5K | 1M | ¥1.01 / ¥2.02
211 | amazon-nova-pro-v1.0 | Amazon | 36.7 | 1.3K | 300K | ¥5.76 / ¥23
212 | hunyuan-large-2025-02-10 | Tencent | 36.4 | 252 | - | -
213 | gpt-oss-20b | OpenAI | 36.1 | 694 | 131K | ¥0.32 / ¥1.3
214 | olmo-3.1-32b-think | Allenai | 35.8 | 636 | 200K | ¥14.4 / ¥57.6
215 | deepseek-v2.5-1210 | DeepSeek | 35.5 | 375 | 1M | ¥1.01 / ¥2.02
216 | claude-3-5-sonnet-20241022 | Anthropic | 35.2 | 5.5K | 200K | ¥21.6 / ¥108
217 | mistral-large-2407 | Mistral | 34.9 | 2.7K | 131K | ¥14.4 / ¥43.2
218 | gpt-4o-2024-08-06 | OpenAI | 34.6 | 2.7K | 128K | ¥18 / ¥72
219 | mistral-large-2411 | Mistral | 34.3 | 1.6K | 128K | ¥14.4 / ¥43.2
220 | reka-core-20240904 | - | 34.0 | 371 | - | -
221 | mistral-small-3.1-24b-instruct-2503 | Mistral | 33.7 | 2.2K | 262K | ¥2.88 / ¥14.4
222 | gemini-1.5-pro-001 | Google | 33.4 | 4.8K | - | -
223 | gemma-2-9b-it-simpo | - | 33.1 | 543 | 8.19K | ¥1.44 / ¥1.44
224 | command-r-plus-08-2024 | Cohere | 32.8 | 592 | 128K | ¥18 / ¥72
225 | claude-3-5-sonnet-20240620 | Anthropic | 32.5 | 4.8K | 200K | ¥21.6 / ¥108
226 | gpt-4-turbo-2024-04-09 | OpenAI | 32.2 | 5.4K | 128K | ¥72 / ¥216
227 | claude-3-opus-20240229 | Anthropic | 31.9 | 11.1K | 200K | ¥108 / ¥540
228 | jamba-1.5-large | - | 31.6 | 448 | 256K | ¥0 / ¥0
229 | gpt-4-0125-preview | OpenAI | 31.3 | 5K | 8.19K | ¥216 / ¥432
230 | claude-3-5-haiku-20241022 | Anthropic | 31.0 | 4.3K | 200K | ¥5.76 / ¥28.8
231 | qwen2.5-coder-32b-instruct | Alibaba | 30.7 | 273 | 131K | ¥2.07 / ¥6.2
232 | llama-3.1-nemotron-51b-instruct | NVIDIA | 30.4 | 230 | 128K | ¥0 / ¥0
233 | gpt-4-1106-preview | OpenAI | 30.1 | 5.6K | 8.19K | ¥216 / ¥432
234 | c4ai-aya-expanse-32b | Cohere | 29.8 | 1.7K | - | -
235 | gemini-1.5-flash-8b-001 | Google | 29.5 | 2.1K | 2M | ¥0.54 / ¥2.2
236 | gemini-1.5-flash-001 | Google | 29.2 | 3.8K | 2M | ¥0.54 / ¥2.2
237 | hunyuan-large-vision | Tencent | 28.9 | 397 | - | -
238 | nemotron-4-340b-instruct | NVIDIA | 28.6 | 1.2K | - | -
239 | command-r-plus | Cohere | 28.3 | 4.4K | 128K | ¥18 / ¥72
240 | glm-4-0520 | Zai | 28.0 | 556 | 128K | ¥108 / ¥108
241 | gemma-2-27b-it | Google | 27.7 | 4.5K | 8.19K | ¥0.58 / ¥0.58
242 | phi-4 | Microsoft | 27.4 | 1.3K | 128K | ¥0.9 / ¥3.6
243 | amazon-nova-lite-v1.0 | Amazon | 27.1 | 1.1K | 300K | ¥0.43 / ¥1.73
244 | magistral-medium-2506 | Mistral | 26.8 | 853 | 128K | ¥14.4 / ¥36
245 | ibm-granite-h-small | IBM | 26.5 | 323 | - | -
246 | command-r-08-2024 | Cohere | 26.2 | 569 | 128K | ¥18 / ¥72
247 | olmo-2-0325-32b-instruct | Allenai | 25.9 | 202 | - | -
248 | llama-3-70b-instruct | Meta | 25.6 | 8.3K | 8.19K | ¥3.67 / ¥5.33
249 | c4ai-aya-expanse-8b | Cohere | 25.3 | 551 | - | -
250 | jamba-1.5-mini | - | 25.0 | 531 | 256K | ¥0 / ¥0
251 | amazon-nova-micro-v1.0 | Amazon | 24.7 | 1.1K | 128K | ¥0.25 / ¥1.01
252 | mistral-small-24b-instruct-2501 | Mistral | 24.4 | 869 | 262K | ¥2.88 / ¥14.4
253 | reka-flash-20240904 | - | 24.1 | 414 | 65.5K | ¥0.72 / ¥1.44
254 | gemma-2-9b-it | Google | 23.8 | 3.2K | 8.19K | ¥1.44 / ¥1.44
255 | claude-3-sonnet-20240229 | Anthropic | 23.5 | 6.1K | 200K | ¥21.6 / ¥108
256 | ministral-8b-2410 | Mistral | 23.2 | 324 | 128K | ¥0.72 / ¥0.72
257 | command-r | Cohere | 22.9 | 3.1K | 128K | ¥18 / ¥72
258 | llama-3.1-8b-instruct | Meta | 22.6 | 2.9K | 131K | ¥0.79 / ¥0.79
259 | yi-1.5-34b-chat | - | 22.3 | 1.5K | - | -
260 | reka-flash-21b-20240226-online | - | 22.0 | 876 | - | -
261 | qwen2-72b-instruct | Alibaba | 21.7 | 2.3K | 131K | ¥4.13 / ¥12.4
262 | gpt-4-0314 | OpenAI | 21.4 | 2.9K | 8.19K | ¥216 / ¥432
263 | claude-3-haiku-20240307 | Anthropic | 21.1 | 6.8K | 200K | ¥1.8 / ¥9
264 | qwen1.5-110b-chat | Alibaba | 20.8 | 1.5K | - | -
265 | mistral-large-2402 | Mistral | 20.5 | 3.3K | 262K | ¥2.88 / ¥14.4
266 | reka-flash-21b-20240226 | - | 20.2 | 1.4K | - | -
267 | qwen1.5-72b-chat | Alibaba | 19.9 | 2.1K | - | -
268 | granite-3.1-8b-instruct | IBM | 19.6 | 215 | - | -
269 | mistral-medium | Mistral | 19.3 | 1.8K | 262K | ¥2.88 / ¥14.4
270 | gemma-2-2b-it | Google | 19.0 | 2.7K | 128K | ¥0 / ¥0
271 | llama-3-8b-instruct | Meta | 18.7 | 5.6K | 8.19K | ¥0.29 / ¥0.29
272 | gemini-pro-dev-api | Google | 18.4 | 970 | 1.05M | ¥14.4 / ¥86.4
273 | mixtral-8x22b-instruct-v0.1 | Mistral | 18.1 | 2.8K | 64K | ¥14.4 / ¥43.2
274 | deepseek-coder-v2 | DeepSeek | 17.8 | 888 | 1M | ¥1.01 / ¥2.02
275 | gpt-4-0613 | OpenAI | 17.5 | 4.7K | 8.19K | ¥216 / ¥432
276 | internlm2_5-20b-chat | - | 17.2 | 685 | - | -
277 | starling-lm-7b-beta | - | 16.9 | 878 | 200K | ¥5.4 / ¥18.7
278 | qwq-32b-preview | Alibaba | 16.6 | 214 | 131K | ¥2.07 / ¥6.2
279 | qwen1.5-32b-chat | Alibaba | 16.3 | 1.2K | - | -
280 | yi-34b-chat | - | 16.0 | 857 | - | -
281 | tulu-2-dpo-70b | - | 15.7 | 377 | - | -
282 | zephyr-orpo-141b-A35b-v0.1 | - | 15.4 | 240 | 200K | ¥108 / ¥432
283 | mixtral-8x7b-instruct-v0.1 | Mistral | 15.1 | 4K | 32K | ¥5.04 / ¥5.04
284 | phi-3-medium-4k-instruct | Microsoft | 14.8 | 1.5K | 4.1K | ¥1.22 / ¥4.9
285 | wizardlm-70b | Microsoft | 14.5 | 434 | - | -
286 | starling-lm-7b-alpha | - | 14.2 | 504 | 200K | ¥5.4 / ¥18.7
287 | qwen1.5-14b-chat | Alibaba | 13.9 | 1K | - | -
288 | gpt-3.5-turbo-0125 | OpenAI | 13.6 | 3.6K | 16.4K | ¥3.6 / ¥10.8
289 | llama-3.2-3b-instruct | Meta | 13.3 | 506 | 131K | ¥0.22 / ¥0.35
290 | llama-2-70b-chat | Meta | 13.0 | 2.1K | - | -
291 | openchat-3.5 | - | 12.7 | 467 | - | -
292 | gemini-pro | Google | 12.3 | 309 | 1.05M | ¥14.4 / ¥86.4
293 | openchat-3.5-0106 | - | 12.0 | 637 | - | -
294 | dbrx-instruct-preview | - | 11.7 | 1.7K | - | -
295 | qwen1.5-7b-chat | Alibaba | 11.4 | 230 | - | -
296 | phi-3-small-8k-instruct | Microsoft | 11.1 | 1.2K | 8.19K | ¥1.08 / ¥4.32
297 | wizardlm-13b | Microsoft | 10.8 | 365 | - | -
298 | openhermes-2.5-mistral-7b | - | 10.5 | 275 | 1M | ¥36 / ¥180
299 | vicuna-33b | - | 10.2 | 1.2K | - | -
300 | mistral-7b-instruct-v0.2 | Mistral | 9.9 | 976 | 262K | ¥2.88 / ¥14.4
301 | zephyr-7b-beta | - | 9.6 | 621 | - | -
302 | gemma-1.1-7b-it | Google | 9.3 | 1.4K | - | -
303 | snowflake-arctic-instruct | - | 9.0 | 1.5K | - | -
304 | deepseek-llm-67b-chat | DeepSeek | 8.7 | 300 | 1M | ¥1.01 / ¥2.02
305 | granite-3.0-8b-instruct | IBM | 8.4 | 428 | - | -
306 | granite-3.0-2b-instruct | IBM | 8.1 | 435 | - | -
307 | solar-10.7b-instruct-v1.0 | - | 7.8 | 219 | 128K | ¥0 / ¥0
308 | llama-2-13b-chat | Meta | 7.5 | 1K | - | -
309 | phi-3-mini-4k-instruct | Microsoft | 7.2 | 1.1K | 4.1K | ¥0.94 / ¥3.74
310 | llama-3.2-1b-instruct | Meta | 6.9 | 489 | 16.4K | ¥0.07 / ¥0.08
311 | phi-3-mini-4k-instruct-june-2024 | Microsoft | 6.6 | 706 | 4.1K | ¥0.94 / ¥3.74
312 | llama-2-7b-chat | Meta | 6.3 | 760 | 128K | ¥4.03 / ¥48
313 | codellama-34b-instruct | Meta | 6.0 | 383 | - | -
314 | vicuna-13b | - | 5.7 | 899 | - | -
315 | gpt-3.5-turbo-1106 | OpenAI | 5.4 | 944 | 16.4K | ¥7.2 / ¥14.4
316 | qwen-14b-chat | Alibaba | 5.1 | 274 | 32.8K | ¥1.04 / ¥3.1
317 | olmo-7b-instruct | Allenai | 4.8 | 301 | - | -
318 | gemma-7b-it | Google | 4.5 | 462 | - | -
319 | vicuna-7b | - | 4.2 | 305 | - | -
320 | mistral-7b-instruct | Mistral | 3.9 | 487 | 262K | ¥2.88 / ¥14.4
321 | gemma-2b-it | Google | 3.6 | 273 | - | -
322 | qwen1.5-4b-chat | Alibaba | 3.3 | 390 | - | -
323 | phi-3-mini-128k-instruct | Microsoft | 3.0 | 1K | 128K | ¥0.94 / ¥3.74
324 | palm-2 | Google | 2.7 | 428 | - | -
325 | stripedhyena-nous-7b | - | 2.4 | 309 | - | -
326 | gemma-1.1-2b-it | Google | 2.1 | 578 | - | -
327 | RWKV-4-Raven-14B | - | 1.8 | 190 | - | -
328 | chatglm3-6b | - | 1.5 | 268 | 200K | ¥5.4 / ¥18.7
329 | koala-13b | - | 1.2 | 248 | - | -
330 | oasst-pythia-12b | - | 0.9 | 266 | - | -
331 | alpaca-13b | - | 0.6 | 230 | - | -
332 | fastchat-t5-3b | - | 0.3 | 184 | - | -
333 | chatglm-6b | - | 0.0 | 195 | 200K | ¥5.4 / ¥18.7

Top model analysis

Why claude-opus-4-6-thinking ranks first

claude-opus-4-6-thinking ranks first with a percent score of 100.0 across 2.6K samples. Treat it as the default starting point for this leaderboard, then weigh its price, context length, and availability against the models ranked just below it.
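
The percent scores listed on this page appear to track rank position rather than a raw rating: with 333 models, each step down the list lowers the score by roughly 0.3. Below is a minimal sketch of that relationship, assuming a linear mapping from rank onto a 0-100 scale. The formula is inferred from the listed values and is not stated by the page, and `percent_score` is a hypothetical helper name.

```python
def percent_score(rank: int, total: int) -> float:
    """Map a 1-based rank to a 0-100 scale, rounded to one decimal.

    Assumption: linear spacing, with rank 1 -> 100.0 and the last rank -> 0.0.
    """
    if total < 2 or not 1 <= rank <= total:
        raise ValueError("rank must be within 1..total and total must be >= 2")
    return round(100 * (total - rank) / (total - 1), 1)


# Spot checks against ranks listed on this page (333 models in total).
for rank in (1, 2, 43, 167, 333):
    print(rank, percent_score(rank, 333))
```

If this reading is right, a 0.3-point gap between neighbouring models reflects one rank position only and should not be read as a measured performance margin.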

How to choose

Do not look only at rank #1

Start with the leaderboard closest to your task. Compare the top models by score and sample size, then check price, context length, open or closed access, and provider availability.
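
The selection steps above can be sketched as a small filter over leaderboard rows. This is only an illustration: the `Entry` type, the helper names, and the threshold values are assumptions made for the example, and the six sample rows are copied from the ranking on this page. Prices are kept in the units the page shows.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Entry:
    model: str
    score: float
    samples: str   # as printed on the page, e.g. "2.6K"
    context: str   # e.g. "1.05M", or "-" when not listed
    price: str     # e.g. "¥36 / ¥180", or "-" when not listed


def parse_count(text: str) -> Optional[float]:
    """Convert '876', '2.6K' or '1.05M' to a number; '-' becomes None."""
    text = text.strip()
    if text == "-":
        return None
    scale = {"K": 1_000, "M": 1_000_000}.get(text[-1])
    return float(text[:-1]) * scale if scale else float(text)


def parse_price(text: str) -> Optional[Tuple[float, float]]:
    """Convert '¥36 / ¥180' to (36.0, 180.0); '-' becomes None."""
    if text.strip() == "-":
        return None
    inp, out = (float(part.strip().lstrip("¥")) for part in text.split("/"))
    return inp, out


def shortlist(entries: List[Entry], min_samples: float,
              max_output_price: float, min_context: float) -> List[str]:
    """Return models meeting every threshold, highest score first.

    Rows with an unlisted price or context are dropped because the
    thresholds cannot be checked for them (a choice made for this sketch).
    """
    kept = []
    for e in entries:
        samples = parse_count(e.samples)
        context = parse_count(e.context)
        price = parse_price(e.price)
        if samples is None or context is None or price is None:
            continue
        if (samples >= min_samples and context >= min_context
                and price[1] <= max_output_price):
            kept.append(e)
    return [e.model for e in sorted(kept, key=lambda e: e.score, reverse=True)]


rows = [
    Entry("claude-opus-4-6-thinking", 100.0, "2.6K", "1M", "¥36 / ¥180"),
    Entry("muse-spark", 99.1, "876", "-", "-"),
    Entry("gemini-3-pro", 98.8, "2.9K", "1.05M", "¥14.4 / ¥86.4"),
    Entry("gemini-3-flash", 96.4, "2.3K", "1.05M", "¥3.6 / ¥21.6"),
    Entry("qwen3.7-max-preview", 96.1, "292", "1M", "¥18 / ¥54"),
    Entry("deepseek-v4-pro-thinking", 94.0, "1.2K", "1M", "¥3.13 / ¥6.26"),
]

# Illustrative thresholds: at least 1,000 samples, output price at most 100,
# and a context window of at least 500K.
picks = shortlist(rows, min_samples=1000, max_output_price=100, min_context=500_000)
print(picks)
```

With these example thresholds the top-ranked model is filtered out on output price, which is the point of the advice: the best-scoring entry is not always the one that fits a given budget.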

FAQ

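Which metrics matter on the Legal & Government leaderboard?

Look mainly at rank, percent score, sample size, and source. The score gives a quick comparison of models within the same leaderboard; the sample size indicates how stable the result is.

Why can't different leaderboards be merged into one overall score?

Different leaderboards use different tasks, samples, and evaluation criteria. 模力榜 ranks models only within a single leaderboard by default, to avoid forcing writing, coding, image, and other capabilities into one number.

How should I choose a Legal & Government model?

Start with the leaderboard closest to your task, then factor in price, context length, open versus closed source, and provider availability. A high rank does not mean a model suits every budget or deployment method.

How often is the leaderboard updated?

The page shows the most recent successfully collected public leaderboard data. It currently prefers the LMArena leaderboard dataset and keeps the original links in the page's source section.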