Chat · Text · German Leaderboard

Ranking for Text / German, based on public preference data.

Models: 270
Published: 2026/05/27
Source: Arena public preference evaluation
Original leaderboard: Text / German
Leaderboard dataset: LMArena latest parquet

Rank | Model | Provider | Score | Samples | Context | Price (Input / Output)
1 | claude-opus-4-6-thinking | Anthropic | 100.0 | 518 | 1M | ¥36 / ¥180
2 | gemini-3-pro | Google | 99.6 | 760 | 1.05M | ¥14.4 / ¥86.4
3 | gemini-3.1-pro-preview | Google | 99.3 | 765 | 1.05M | ¥14.4 / ¥86.4
4 | claude-opus-4-6 | Anthropic | 98.9 | 576 | 1M | ¥36 / ¥180
5 | gemini-3-flash | Google | 98.5 | 527 | 1.05M | ¥3.6 / ¥21.6
6 | muse-spark | Meta | 98.1 | 226 | - | -
7 | gemini-2.5-pro | Google | 97.8 | 2.5K | 1.05M | ¥9 / ¥72
8 | claude-opus-4-7-thinking | Anthropic | 97.4 | 273 | 1M | ¥36 / ¥180
9 | gpt-5.4-high | Openai | 97.0 | 488 | 1.05M | ¥18 / ¥108
10 | qwen3.5-max-preview | Alibaba | 96.7 | 368 | - | -
11 | ernie-5.1 | Baidu | 96.3 | 282 | 119K | ¥5.4 / ¥21.6
12 | claude-opus-4-7 | Anthropic | 95.9 | 266 | 1M | ¥36 / ¥180
13 | qwen3-max-preview | Alibaba | 95.5 | 567 | 262K | ¥6.2 / ¥24.8
14 | glm-5.1 | Zai | 95.2 | 211 | 200K | ¥0 / ¥0
15 | gemini-3-flash (thinking-minimal) | Google | 94.8 | 869 | 1.05M | ¥3.6 / ¥21.6
16 | grok-4.20-beta1 | Xai | 94.4 | 427 | 2M | ¥14.4 / ¥43.2
17 | ernie-5.0-preview-1203 | Baidu | 94.1 | 215 | 128K | ¥7.92 / ¥14.4
18 | gpt-4.5-preview-2025-02-27 | Openai | 93.7 | 319 | 8.19K | ¥216 / ¥432
19 | deepseek-v4-pro | Deepseek | 93.3 | 286 | 1M | ¥3.13 / ¥6.26
20 | qwen3.6-plus | Alibaba | 92.9 | 290 | 1M | ¥3.6 / ¥21.6
21 | glm-4.6 | Zai | 92.6 | 642 | 205K | ¥4.32 / ¥15.8
22 | gpt-5.5 | Openai | 92.2 | 226 | 1.05M | ¥36 / ¥216
23 | gpt-5.2-chat-latest-20260210 | Openai | 91.8 | 539 | 400K | ¥12.6 / ¥101
24 | gpt-5.2-high | Openai | 91.4 | 773 | 400K | ¥12.6 / ¥101
25 | gpt-5.4 | Openai | 91.1 | 431 | 1.05M | ¥18 / ¥108
26 | grok-4.1 | Xai | 90.7 | 1.2K | 200K | ¥14.4 / ¥72
27 | kimi-k2.5-thinking | Moonshot | 90.3 | 639 | 262K | ¥4.32 / ¥21.6
28 | grok-4.20-multi-agent-beta-0309 | Xai | 90.0 | 483 | 2M | ¥14.4 / ¥43.2
29 | kimi-k2.6 | Moonshot | 89.6 | 314 | 262K | ¥6.84 / ¥28.8
30 | claude-opus-4-5-20251101-thinking-32k | Anthropic | 89.2 | 614 | 200K | ¥108 / ¥540
31 | gemini-3.1-flash-lite-preview | Google | 88.8 | 571 | 1.05M | ¥1.8 / ¥10.8
32 | qwen3.5-397b-a17b | Alibaba | 88.5 | 531 | 262K | ¥3.1 / ¥18.6
33 | claude-opus-4-5-20251101 | Anthropic | 88.1 | 1K | 200K | ¥36 / ¥180
34 | glm-5 | Zai | 87.7 | 401 | 205K | ¥7.2 / ¥23
35 | qwen3-max-2025-09-23 | Alibaba | 87.4 | 193 | 258K | ¥6.19 / ¥24.7
36 | gpt-5.5-high | Openai | 87.0 | 208 | 1.05M | ¥36 / ¥216
37 | gpt-5.1-high | Openai | 86.6 | 662 | 400K | ¥9 / ¥72
38 | ernie-5.0-0110 | Baidu | 86.2 | 630 | 128K | ¥7.92 / ¥14.4
39 | qwen3-235b-a22b-instruct-2507 | Alibaba | 85.9 | 1.7K | 128K | ¥2.09 / ¥8.23
40 | grok-4.1-thinking | Xai | 85.5 | 1.1K | 200K | ¥14.4 / ¥72
41 | claude-sonnet-4-6 | Anthropic | 85.1 | 467 | 1M | ¥21.6 / ¥108
42 | deepseek-v3.2-exp-thinking | Deepseek | 84.8 | 172 | 128K | ¥0 / ¥0
43 | mistral-large-3 | Mistral | 84.4 | 688 | 262K | ¥3.6 / ¥10.8
44 | deepseek-v3.2-exp | Deepseek | 84.0 | 210 | 128K | ¥0 / ¥0
45 | dola-seed-2.0-pro | Bytedance | 83.6 | 635 | - | -
46 | mistral-medium-2508 | Mistral | 83.3 | 1.6K | 262K | ¥2.88 / ¥14.4
47 | grok-3-preview-02-24 | Xai | 82.9 | 755 | 1M | ¥9 / ¥18
48 | grok-4.20-beta-0309-reasoning | Xai | 82.5 | 453 | 2M | ¥14.4 / ¥43.2
49 | grok-4-0709 | Xai | 82.2 | 853 | 256K | ¥21.6 / ¥108
50 | deepseek-v4-flash | Deepseek | 81.8 | 267 | 1M | ¥1.01 / ¥2.02
51 | claude-sonnet-4-5-20250929-thinking-32k | Anthropic | 81.4 | 1.2K | 200K | ¥21.6 / ¥108
52 | chatgpt-4o-latest-20250326 | Openai | 81.0 | 1.7K | 128K | ¥18 / ¥72
53 | gpt-5.1 | Openai | 80.7 | 707 | 400K | ¥9 / ¥72
54 | qwen3-vl-235b-a22b-instruct | Alibaba | 80.3 | 259 | 128K | ¥2.16 / ¥8.64
55 | claude-sonnet-4-5-20250929 | Anthropic | 79.9 | 1.2K | 200K | ¥21.6 / ¥108
56 | mimo-v2.5-pro | Xiaomi | 79.6 | 234 | 1.05M | ¥7.2 / ¥21.6
57 | gemini-2.5-flash-preview-09-2025 | Google | 79.2 | 601 | 1M | ¥2.16 / ¥18
58 | qwen3-vl-235b-a22b-thinking | Alibaba | 78.8 | 170 | 131K | ¥2.06 / ¥8.26
59 | mimo-v2-pro | Xiaomi | 78.4 | 419 | 1.05M | ¥7.2 / ¥21.6
60 | qwen3-next-80b-a3b-instruct | Alibaba | 78.1 | 459 | 131K | ¥1.04 / ¥4.13
61 | o3-2025-04-16 | Openai | 77.7 | 1.4K | 200K | ¥14.4 / ¥57.6
62 | glm-4.7 | Zai | 77.3 | 219 | 205K | ¥0 / ¥0
63 | gemini-2.5-flash | Google | 77.0 | 2.5K | 1.05M | ¥2.16 / ¥18
64 | gpt-5.2 | Openai | 76.6 | 690 | 400K | ¥12.6 / ¥101
65 | deepseek-v4-flash-thinking | Deepseek | 76.2 | 264 | 1M | ¥1.01 / ¥2.02
66 | longcat-flash-chat | Meituan | 75.8 | 273 | 128K | ¥1.08 / ¥10.8
67 | kimi-k2.5-instant | Moonshot | 75.5 | 133 | 262K | ¥4.32 / ¥21.6
68 | qwen3.5-122b-a10b | Alibaba | 75.1 | 408 | 262K | ¥2.88 / ¥23
69 | claude-opus-4-1-20250805-thinking-16k | Anthropic | 74.7 | 985 | 200K | ¥108 / ¥540
70 | deepseek-v4-pro-thinking | Deepseek | 74.3 | 244 | 1M | ¥3.13 / ¥6.26
71 | gpt-5-high | Openai | 74.0 | 702 | 400K | ¥9 / ¥72
72 | deepseek-v3.1-thinking | Deepseek | 73.6 | 314 | 128K | ¥1.44 / ¥5.04
73 | deepseek-v3.1 | Deepseek | 73.2 | 374 | 128K | ¥1.44 / ¥5.04
74 | deepseek-v3.2-thinking | Deepseek | 72.9 | 707 | 128K | ¥2.09 / ¥3.1
75 | gpt-5.4-mini-high | Openai | 72.5 | 377 | 400K | ¥5.4 / ¥32.4
76 | gpt-5-chat | Openai | 72.1 | 661 | 400K | ¥9 / ¥72
77 | glm-4.5 | Zai | 71.7 | 519 | 131K | ¥4.32 / ¥15.8
78 | deepseek-v3.2 | Deepseek | 71.4 | 855 | 128K | ¥2.09 / ¥3.1
79 | qwen3.5-27b | Alibaba | 71.0 | 424 | 262K | ¥2.16 / ¥17.3
80 | qwen3-235b-a22b-thinking-2507 | Alibaba | 70.6 | 203 | 131K | ¥2.07 / ¥8.26
81 | claude-opus-4-1-20250805 | Anthropic | 70.3 | 1.5K | 200K | ¥108 / ¥540
82 | gpt-5.5-instant | Openai | 69.9 | 404 | 400K | ¥9 / ¥72
83 | mimo-v2.5 | Xiaomi | 69.5 | 232 | 1.05M | ¥2.88 / ¥14.4
84 | grok-4.3 | Xai | 69.1 | 211 | 1M | ¥9 / ¥18
85 | deepseek-r1-0528 | Deepseek | 68.8 | 560 | 164K | ¥3.6 / ¥15.5
86 | grok-4-1-fast-reasoning | Xai | 68.4 | 875 | 2M | ¥1.44 / ¥3.6
87 | amazon-nova-experimental-chat-11-10 | Amazon | 68.0 | 455 | - | -
88 | step-3.5-flash | Stepfun | 67.7 | 607 | 256K | ¥0.69 / ¥2.07
89 | kimi-k2-thinking-turbo | Moonshot | 67.3 | 1.1K | 262K | ¥17.3 / ¥72
90 | mimo-v2-flash (thinking) | Xiaomi | 66.9 | 242 | 262K | ¥0.72 / ¥2.16
91 | amazon-nova-experimental-chat-10-20 | Amazon | 66.5 | 274 | - | -
92 | gemini-2.5-flash-lite-preview-09-2025-no-thinking | Google | 66.2 | 779 | 1.05M | ¥0.72 / ¥2.88
93 | grok-4-fast-reasoning | Xai | 65.8 | 350 | 2M | ¥1.44 / ¥3.6
94 | qwen3.5-flash | Alibaba | 65.4 | 512 | 1M | ¥1.24 / ¥12.4
95 | qwen3-235b-a22b | Alibaba | 65.1 | 725 | 131K | ¥2.07 / ¥8.26
96 | deepseek-r1 | Deepseek | 64.7 | 436 | 164K | ¥5.04 / ¥18
97 | kimi-k2-0905-preview | Moonshot | 64.3 | 260 | 262K | ¥4.32 / ¥18
98 | claude-opus-4-20250514-thinking-16k | Anthropic | 63.9 | 900 | 200K | ¥108 / ¥540
99 | mimo-v2-flash (non-thinking) | Xiaomi | 63.6 | 736 | 262K | ¥0.72 / ¥2.16
100 | gpt-4.1-2025-04-14 | Openai | 63.2 | 1.2K | 1.05M | ¥14.4 / ¥57.6
101 | qwen3-235b-a22b-no-thinking | Alibaba | 62.8 | 1K | 131K | ¥2.07 / ¥8.26
102 | deepseek-v3-0324 | Deepseek | 62.5 | 1.2K | 75K | ¥1.44 / ¥5.76
103 | step-3 | Stepfun | 62.1 | 145 | 65.5K | ¥1.8 / ¥4.68
104 | minimax-m2.7 | Minimax | 61.7 | 393 | 205K | ¥0 / ¥0
105 | mistral-medium-2505 | Mistral | 61.3 | 942 | 262K | ¥2.88 / ¥14.4
106 | longcat-flash-chat-2602-exp | Meituan | 61.0 | 380 | 128K | ¥1.08 / ¥10.8
107 | minimax-m2.1-preview | Minimax | 60.6 | 266 | 205K | ¥0 / ¥0
108 | gpt-5.3-chat-latest | Openai | 60.2 | 500 | 128K | ¥12.6 / ¥101
109 | kimi-k2-0711-preview | Moonshot | 59.9 | 621 | 131K | ¥4.32 / ¥18
110 | claude-opus-4-20250514 | Anthropic | 59.5 | 1.2K | 200K | ¥108 / ¥540
111 | gemini-2.5-flash-lite-preview-06-17-thinking | Google | 59.1 | 784 | 65.5K | ¥0.72 / ¥2.88
112 | gemma-3-12b-it | Google | 58.7 | 166 | 128K | ¥1.96 / ¥1.96
113 | glm-4.5-air | Zai | 58.4 | 673 | 131K | ¥0 / ¥0
114 | qwen3-30b-a3b-instruct-2507 | Alibaba | 58.0 | 486 | 262K | ¥2.16 / ¥3.6
115 | claude-haiku-4-5-20251001 | Anthropic | 57.6 | 1.3K | 200K | ¥7.2 / ¥36
116 | gpt-5-mini-high | Openai | 57.2 | 554 | 400K | ¥1.8 / ¥14.4
117 | gpt-5.4-nano-high | Openai | 56.9 | 356 | 400K | ¥1.44 / ¥9
118 | gemma-3-27b-it | Google | 56.5 | 1.2K | 128K | ¥2.15 / ¥2.15
119 | qwen3-next-80b-a3b-thinking | Alibaba | 56.1 | 287 | 131K | ¥1.04 / ¥10.3
120 | qwen3.5-35b-a3b | Alibaba | 55.8 | 463 | 262K | ¥1.8 / ¥14.4
121 | minimax-m2.5 | Minimax | 55.4 | 563 | 205K | ¥0 / ¥0
122 | grok-3-mini-high | Xai | 55.0 | 424 | 128K | ¥0 / ¥0
123 | nova-2-lite | Amazon | 54.6 | 263 | 128K | ¥2.38 / ¥19.8
124 | gemini-2.0-flash-001 | Google | 54.3 | 1K | 1.05M | ¥1.08 / ¥4.32
125 | hunyuan-turbos-20250416 | Tencent | 53.9 | 322 | 131K | ¥0 / ¥0
126 | gpt-oss-120b | Openai | 53.5 | 619 | 131K | ¥1.08 / ¥4.32
127 | nvidia-nemotron-3-nano-30b-a3b-bf16 | Nvidia | 53.2 | 281 | 131K | ¥0 / ¥0
128 | qwen2.5-max | Alibaba | 52.8 | 761 | 32K | ¥11.5 / ¥46
129 | trinity-large-thinking | - | 52.4 | 347 | 262K | ¥1.8 / ¥6.48
130 | mistral-small-2506 | Mistral | 52.0 | 403 | 262K | ¥2.88 / ¥14.4
131 | gpt-4.1-mini-2025-04-14 | Openai | 51.7 | 1K | 1.05M | ¥2.88 / ¥11.5
132 | minimax-m1 | Minimax | 51.3 | 828 | 1M | ¥0.95 / ¥9.03
133 | grok-3-mini-beta | Xai | 50.9 | 546 | 1M | ¥9 / ¥18
134 | command-a-03-2025 | Cohere | 50.6 | 1.3K | 256K | ¥18 / ¥72
135 | qwen3-32b | Alibaba | 50.2 | 172 | 131K | ¥2.07 / ¥8.26
136 | nvidia-nemotron-3-super-120b-a12b | Nvidia | 49.8 | 196 | 262K | ¥1.44 / ¥5.76
137 | glm-4-plus-0111 | Zai | 49.4 | 207 | 128K | ¥72 / ¥72
138 | o1-2024-12-17 | Openai | 49.1 | 556 | 128K | ¥108 / ¥432
139 | o4-mini-2025-04-16 | Openai | 48.7 | 1.1K | 200K | ¥7.92 / ¥31.7
140 | claude-sonnet-4-20250514 | Anthropic | 48.3 | 1K | 200K | ¥21.6 / ¥108
141 | glm-4.7-flash | Zai | 48.0 | 209 | 200K | ¥0 / ¥0
142 | gpt-5-nano-high | Openai | 47.6 | 202 | 400K | ¥0.36 / ¥2.88
143 | trinity-large-preview | - | 47.2 | 451 | 262K | ¥1.8 / ¥6.48
144 | deepseek-v3 | Deepseek | 46.8 | 493 | 128K | ¥0 / ¥0
145 | qwen3-coder-480b-a35b-instruct | Alibaba | 46.5 | 543 | 262K | ¥6.2 / ¥24.8
146 | step-1o-turbo-202506 | Stepfun | 46.1 | 323 | - | -
147 | claude-sonnet-4-20250514-thinking-32k | Anthropic | 45.7 | 815 | 200K | ¥21.6 / ¥108
148 | qwq-32b | Alibaba | 45.4 | 646 | 131K | ¥2.07 / ¥6.2
149 | o1-preview | Openai | 45.0 | 774 | 128K | ¥108 / ¥432
150 | gemma-3n-e4b-it | Google | 44.6 | 692 | 128K | ¥0 / ¥0
151 | qwen3-30b-a3b | Alibaba | 44.2 | 727 | 128K | ¥0.79 / ¥7.78
152 | gemini-2.0-flash-lite-preview-02-05 | Google | 43.9 | 627 | 1.05M | ¥0.54 / ¥2.16
153 | o3-mini-high | Openai | 43.5 | 377 | 200K | ¥7.92 / ¥31.7
154 | o3-mini | Openai | 43.1 | 1.3K | 200K | ¥7.92 / ¥31.7
155 | claude-3-7-sonnet-20250219-thinking-32k | Anthropic | 42.8 | 999 | - | -
156 | claude-3-5-sonnet-20241022 | Anthropic | 42.4 | 2K | 200K | ¥21.6 / ¥108
157 | llama-4-maverick-17b-128e-instruct | Meta | 42.0 | 990 | 1M | ¥1.8 / ¥6.26
158 | gpt-4.1-nano-2025-04-14 | Openai | 41.6 | 222 | 1.05M | ¥14.4 / ¥57.6
159 | grok-2-2024-08-13 | Xai | 41.3 | 1.6K | 1M | ¥9 / ¥18
160 | gemini-1.5-pro-002 | Google | 40.9 | 1.3K | - | -
161 | olmo-3-32b-think | Allenai | 40.5 | 128 | 128K | ¥2.16 / ¥3.24
162 | gpt-4o-2024-05-13 | Openai | 40.1 | 3.6K | 128K | ¥36 / ¥108
163 | claude-3-7-sonnet-20250219 | Anthropic | 39.8 | 1K | 200K | ¥21.6 / ¥108
164 | gemma-3-4b-it | Google | 39.4 | 162 | 128K | ¥1.44 / ¥1.44
165 | glm-4-plus | Zai | 39.0 | 659 | 128K | ¥54 / ¥54
166 | o1-mini | Openai | 38.7 | 1.2K | 128K | ¥7.92 / ¥31.7
167 | gemini-advanced-0514 | Google | 38.3 | 1.8K | - | -
168 | olmo-3.1-32b-instruct | Allenai | 37.9 | 224 | 200K | ¥14.4 / ¥57.6
169 | grok-2-mini-2024-08-13 | Xai | 37.5 | 1.3K | 1M | ¥9 / ¥18
170 | gpt-4o-mini-2024-07-18 | Openai | 37.2 | 1.6K | 128K | ¥1.08 / ¥4.32
171 | llama-4-scout-17b-16e-instruct | Meta | 36.8 | 766 | 128K | ¥1.44 / ¥5.62
172 | claude-3-5-sonnet-20240620 | Anthropic | 36.4 | 2.5K | 200K | ¥21.6 / ¥108
173 | yi-lightning | - | 36.1 | 700 | 12K | ¥1.44 / ¥1.44
174 | mistral-small-3.1-24b-instruct-2503 | Mistral | 35.7 | 830 | 262K | ¥2.88 / ¥14.4
175 | gemini-1.5-flash-002 | Google | 35.3 | 739 | 2M | ¥0.54 / ¥2.2
176 | athene-v2-chat | - | 34.9 | 626 | - | -
177 | gpt-4-turbo-2024-04-09 | Openai | 34.6 | 3.4K | 128K | ¥72 / ¥216
178 | claude-3-opus-20240229 | Anthropic | 34.2 | 5.7K | 200K | ¥108 / ¥540
179 | deepseek-v2.5-1210 | Deepseek | 33.8 | 140 | 1M | ¥1.01 / ¥2.02
180 | gpt-4o-2024-08-06 | Openai | 33.5 | 1.2K | 128K | ¥18 / ¥72
181 | mistral-large-2407 | Mistral | 33.1 | 1.2K | 131K | ¥14.4 / ¥43.2
182 | qwen-max-0919 | Alibaba | 32.7 | 438 | 131K | ¥2.48 / ¥9.91
183 | llama-3.1-405b-instruct-bf16 | Meta | 32.3 | 894 | 128K | ¥0 / ¥0
184 | athene-70b-0725 | - | 32.0 | 585 | - | -
185 | llama-3.3-70b-instruct | Meta | 31.6 | 1.2K | 128K | ¥0 / ¥0
186 | gpt-oss-20b | Openai | 31.2 | 208 | 131K | ¥0.32 / ¥1.3
187 | gpt-4-1106-preview | Openai | 30.9 | 2.5K | 8.19K | ¥216 / ¥432
188 | llama-3.1-405b-instruct-fp8 | Meta | 30.5 | 1.6K | 128K | ¥0 / ¥0
189 | gemini-1.5-pro-001 | Google | 30.1 | 2.7K | - | -
190 | olmo-3.1-32b-think | Allenai | 29.7 | 159 | 200K | ¥14.4 / ¥57.6
191 | gpt-4-0125-preview | Openai | 29.4 | 2.7K | 8.19K | ¥216 / ¥432
192 | magistral-medium-2506 | Mistral | 29.0 | 341 | 128K | ¥14.4 / ¥36
193 | reka-core-20240904 | - | 28.6 | 183 | - | -
194 | amazon-nova-pro-v1.0 | Amazon | 28.3 | 553 | 300K | ¥5.76 / ¥23
195 | mistral-large-2411 | Mistral | 27.9 | 648 | 128K | ¥14.4 / ¥43.2
196 | claude-3-5-haiku-20241022 | Anthropic | 27.5 | 1.7K | 200K | ¥5.76 / ¥28.8
197 | qwen2.5-72b-instruct | Alibaba | 27.1 | 926 | 131K | ¥4.13 / ¥12.4
198 | qwen2.5-plus-1127 | Alibaba | 26.8 | 257 | - | -
199 | amazon-nova-lite-v1.0 | Amazon | 26.4 | 490 | 300K | ¥0.43 / ¥1.73
200 | deepseek-v2.5 | Deepseek | 26.0 | 563 | 1M | ¥1.01 / ¥2.02
201 | phi-4 | Microsoft | 25.7 | 523 | 128K | ¥0.9 / ¥3.6
202 | llama-3.1-70b-instruct | Meta | 25.3 | 1.4K | 131K | ¥2.88 / ¥2.88
203 | gemma-2-9b-it-simpo | - | 24.9 | 318 | 8.19K | ¥1.44 / ¥1.44
204 | gemini-1.5-flash-001 | Google | 24.5 | 2.2K | 2M | ¥0.54 / ¥2.2
205 | command-r-plus-08-2024 | Cohere | 24.2 | 267 | 128K | ¥18 / ¥72
206 | mistral-small-24b-instruct-2501 | Mistral | 23.8 | 301 | 262K | ¥2.88 / ¥14.4
207 | gemma-2-27b-it | Google | 23.4 | 2K | 8.19K | ¥0.58 / ¥0.58
208 | gemini-1.5-flash-8b-001 | Google | 23.0 | 798 | 2M | ¥0.54 / ¥2.2
209 | claude-3-sonnet-20240229 | Anthropic | 22.7 | 3.1K | 200K | ¥21.6 / ¥108
210 | jamba-1.5-large | - | 22.3 | 234 | 256K | ¥0 / ¥0
211 | c4ai-aya-expanse-32b | Cohere | 21.9 | 609 | - | -
212 | gpt-4-0314 | Openai | 21.6 | 1.3K | 8.19K | ¥216 / ¥432
213 | command-r-plus | Cohere | 21.2 | 2.9K | 128K | ¥18 / ¥72
214 | amazon-nova-micro-v1.0 | Amazon | 20.8 | 468 | 128K | ¥0.25 / ¥1.01
215 | nemotron-4-340b-instruct | Nvidia | 20.4 | 703 | - | -
216 | c4ai-aya-expanse-8b | Cohere | 20.1 | 240 | - | -
217 | gemma-2-9b-it | Google | 19.7 | 1.6K | 8.19K | ¥1.44 / ¥1.44
218 | glm-4-0520 | Zai | 19.3 | 371 | 128K | ¥108 / ¥108
219 | reka-flash-20240904 | - | 19.0 | 207 | 65.5K | ¥0.72 / ¥1.44
220 | mistral-large-2402 | Mistral | 18.6 | 1.7K | 262K | ¥2.88 / ¥14.4
221 | command-r-08-2024 | Cohere | 18.2 | 275 | 128K | ¥18 / ¥72
222 | claude-3-haiku-20240307 | Anthropic | 17.8 | 3.5K | 200K | ¥1.8 / ¥9
223 | llama-3-70b-instruct | Meta | 17.5 | 5K | 8.19K | ¥3.67 / ¥5.33
224 | deepseek-coder-v2 | Deepseek | 17.1 | 516 | 1M | ¥1.01 / ¥2.02
225 | jamba-1.5-mini | - | 16.7 | 248 | 256K | ¥0 / ¥0
226 | gpt-4-0613 | Openai | 16.4 | 2.2K | 8.19K | ¥216 / ¥432
227 | mistral-medium | Mistral | 16.0 | 852 | 262K | ¥2.88 / ¥14.4
228 | reka-flash-21b-20240226 | - | 15.6 | 798 | - | -
229 | reka-flash-21b-20240226-online | - | 15.2 | 501 | - | -
230 | qwen2-72b-instruct | Alibaba | 14.9 | 1.3K | 131K | ¥4.13 / ¥12.4
231 | llama-3.1-8b-instruct | Meta | 14.5 | 1.3K | 131K | ¥0.79 / ¥0.79
232 | mixtral-8x22b-instruct-v0.1 | Mistral | 14.1 | 1.7K | 64K | ¥14.4 / ¥43.2
233 | command-r | Cohere | 13.8 | 1.4K | 128K | ¥18 / ¥72
234 | gemini-pro-dev-api | Google | 13.4 | 450 | 1.05M | ¥14.4 / ¥86.4
235 | qwen1.5-110b-chat | Alibaba | 13.0 | 928 | - | -
236 | gemma-2-2b-it | Google | 12.6 | 1.2K | 128K | ¥0 / ¥0
237 | mixtral-8x7b-instruct-v0.1 | Mistral | 12.3 | 2K | 32K | ¥5.04 / ¥5.04
238 | yi-1.5-34b-chat | - | 11.9 | 871 | - | -
239 | llama-3-8b-instruct | Meta | 11.5 | 3.4K | 8.19K | ¥0.29 / ¥0.29
240 | phi-3-medium-4k-instruct | Microsoft | 11.2 | 851 | 4.1K | ¥1.22 / ¥4.9
241 | gpt-3.5-turbo-0125 | Openai | 10.8 | 1.9K | 16.4K | ¥3.6 / ¥10.8
242 | qwen1.5-72b-chat | Alibaba | 10.4 | 1.1K | - | -
243 | wizardlm-70b | Microsoft | 10.0 | 193 | - | -
244 | starling-lm-7b-beta | - | 9.7 | 367 | 200K | ¥5.4 / ¥18.7
245 | phi-3-small-8k-instruct | Microsoft | 9.3 | 620 | 8.19K | ¥1.08 / ¥4.32
246 | snowflake-arctic-instruct | - | 8.9 | 947 | - | -
247 | internlm2_5-20b-chat | - | 8.6 | 265 | - | -
248 | openchat-3.5-0106 | - | 8.2 | 268 | - | -
249 | vicuna-33b | - | 7.8 | 428 | - | -
250 | qwen1.5-32b-chat | Alibaba | 7.4 | 777 | - | -
251 | dbrx-instruct-preview | - | 7.1 | 742 | - | -
252 | llama-3.2-3b-instruct | Meta | 6.7 | 232 | 131K | ¥0.22 / ¥0.35
253 | gpt-3.5-turbo-1106 | Openai | 6.3 | 325 | 16.4K | ¥7.2 / ¥14.4
254 | gemma-1.1-7b-it | Google | 5.9 | 827 | - | -
255 | phi-3-mini-4k-instruct | Microsoft | 5.6 | 699 | 4.1K | ¥0.94 / ¥3.74
256 | qwen1.5-14b-chat | Alibaba | 5.2 | 540 | - | -
257 | yi-34b-chat | - | 4.8 | 381 | - | -
258 | llama-2-70b-chat | Meta | 4.5 | 878 | - | -
259 | phi-3-mini-4k-instruct-june-2024 | Microsoft | 4.1 | 397 | 4.1K | ¥0.94 / ¥3.74
260 | starling-lm-7b-alpha | - | 3.7 | 199 | 200K | ¥5.4 / ¥18.7
261 | llama-3.2-1b-instruct | Meta | 3.3 | 247 | 16.4K | ¥0.07 / ¥0.08
262 | llama-2-13b-chat | Meta | 3.0 | 454 | - | -
263 | phi-3-mini-128k-instruct | Microsoft | 2.6 | 727 | 128K | ¥0.94 / ¥3.74
264 | vicuna-13b | - | 2.2 | 283 | - | -
265 | zephyr-7b-beta | - | 1.9 | 152 | - | -
266 | mistral-7b-instruct-v0.2 | Mistral | 1.5 | 564 | 262K | ¥2.88 / ¥14.4
267 | llama-2-7b-chat | Meta | 1.1 | 327 | 128K | ¥4.03 / ¥48
268 | gemma-1.1-2b-it | Google | 0.7 | 365 | - | -
269 | mistral-7b-instruct | Mistral | 0.4 | 174 | 262K | ¥2.88 / ¥14.4
270 | qwen1.5-4b-chat | Alibaba | 0.0 | 211 | - | -

Top model analysis

Why claude-opus-4-6-thinking ranks first

claude-opus-4-6-thinking ranks first with a score of 100.0 on the leaderboard's 0-100 scale, based on 518 samples. Treat it as the first candidate for German text tasks, then compare price, context length, and availability before committing.
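The scores listed on this page look consistent with a simple linear mapping from rank to a 0-100 scale (rank 1 gets 100.0, rank 270 gets 0.0). This is an inference from the published rows, not a formula stated by the source; the function name `percent_score` is hypothetical. A minimal Python sketch of that assumed mapping:

```python
def percent_score(rank: int, total: int) -> float:
    """Assumed mapping: rank 1 -> 100.0, rank `total` -> 0.0, linear in between."""
    if total < 2 or not 1 <= rank <= total:
        raise ValueError("rank must be within 1..total and total must be >= 2")
    return round(100 * (total - rank) / (total - 1), 1)


# Spot checks against scores published on this page (270 models).
published = {1: 100.0, 2: 99.6, 9: 97.0, 135: 50.2, 270: 0.0}
for rank, listed in published.items():
    print(rank, percent_score(rank, 270), listed)
```

If the mapping holds, the score carries the same information as the rank, so a small score gap between neighbours says nothing about how close their underlying preference results are.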

How to choose

Look beyond rank #1

Start with the leaderboard closest to your task. Compare the top models by score and sample size, then check price, context length, open or closed access, and provider availability.
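The selection steps above can be sketched as a small filter. This is only an illustration: the `Entry` type, the helper names, and the thresholds (at least 500 samples, output price at most ¥100) are hypothetical choices, and the rows are a handful copied from this leaderboard.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    rank: int
    model: str
    score: float
    samples: int
    output_price: Optional[float]  # output figure from the price column; None when listed as "-"


def parse_count(text: str) -> int:
    """Convert leaderboard counts such as '518' or '2.5K' to integers."""
    text = text.strip().upper()
    if text.endswith("K"):
        return int(round(float(text[:-1]) * 1_000))
    return int(text)


# A few rows copied from this leaderboard.
ENTRIES = [
    Entry(1, "claude-opus-4-6-thinking", 100.0, parse_count("518"), 180.0),
    Entry(2, "gemini-3-pro", 99.6, parse_count("760"), 86.4),
    Entry(5, "gemini-3-flash", 98.5, parse_count("527"), 21.6),
    Entry(6, "muse-spark", 98.1, parse_count("226"), None),
    Entry(7, "gemini-2.5-pro", 97.8, parse_count("2.5K"), 72.0),
]


def shortlist(entries: List[Entry], min_samples: int, max_output_price: float) -> List[Entry]:
    """Keep entries with enough samples and a known output price within budget, best rank first."""
    kept = [
        e for e in entries
        if e.samples >= min_samples
        and e.output_price is not None
        and e.output_price <= max_output_price
    ]
    return sorted(kept, key=lambda e: e.rank)


# Hypothetical thresholds: at least 500 samples, output price at most 100.
for e in shortlist(ENTRIES, min_samples=500, max_output_price=100.0):
    print(e.rank, e.model, e.score, e.samples, e.output_price)
```

Context length and provider availability could be added as further filter conditions in the same way.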

FAQ

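Which metrics matter on the German leaderboard?

Look mainly at rank, the 0-100 score, sample size, and source. The score gives a quick comparison of models within the same leaderboard; the sample size indicates how stable the result is.

Why can't different leaderboards be merged into one overall score?

Leaderboards differ in their tasks, samples, and evaluation criteria. By default, 模力榜 ranks models only within a single leaderboard, to avoid forcing capabilities such as writing, coding, and image generation into one number.

How should I choose a German model?

Start with the leaderboard closest to your task, then weigh price, context length, open versus closed source, and provider availability. A high rank does not mean a model suits every budget or deployment method.

How often is the leaderboard updated?

The page shows the most recent successfully collected public leaderboard data. The LMArena leaderboard dataset is currently the preferred source, and the original link is kept in the page's source section.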