Code · Image-to-WebDev · Image-to-WebDev Leaderboard

Ranking of models on the Image-to-WebDev task, based on public preference data.

Selection guide

Image-to-WebDev model ranking guide

Models: 23
Published: 2026/05/14
Arena public preference evaluation
Original leaderboard: WebDev / Image To Webdev
Leaderboard dataset: LMArena latest parquet
Rank  Model                              Provider   Score  Samples  Context  Price (Input / Output)
1     claude-opus-4-7-thinking           Anthropic  100.0  2.1K     1M       ¥36 / ¥180
2     claude-sonnet-4-6                  Anthropic  95.5   3.2K     1M       ¥21.6 / ¥108
3     claude-opus-4-7                    Anthropic  90.9   2.4K     1M       ¥36 / ¥180
4     claude-opus-4-6-thinking           Anthropic  86.4   3K       1M       ¥36 / ¥180
5     gpt-5.5-xhigh (codex-harness)      OpenAI     81.8   1.8K     400K     ¥9 / ¥72
6     claude-opus-4-6                    Anthropic  77.3   3K       1M       ¥36 / ¥180
7     kimi-k2.6                          Moonshot   72.7   1.5K     262K     ¥6.84 / ¥28.8
8     gpt-5.5-high (codex-harness)       OpenAI     68.2   2K       400K     ¥9 / ¥72
9     gemini-3.1-pro-preview             Google     63.6   3.6K     1.05M    ¥14.4 / ¥86.4
10    gpt-5.5 (codex-harness)            OpenAI     59.1   1.9K     400K     ¥9 / ¥72
11    qwen3.6-plus                       Alibaba    54.5   2.6K     1M       ¥3.6 / ¥21.6
12    gemini-3-pro                       Google     50.0   1.1K     1.05M    ¥14.4 / ¥86.4
13    gemini-3-flash                     Google     45.5   4.4K     1.05M    ¥3.6 / ¥21.6
14    gpt-5.3-codex (codex-harness)      OpenAI     40.9   2.5K     400K     ¥9 / ¥72
15    kimi-k2.5-thinking                 Moonshot   36.4   1.7K     262K     ¥4.32 / ¥21.6
16    gpt-5.4                            OpenAI     31.8   1.2K     1.05M    ¥18 / ¥108
17    gemini-3-flash (thinking-minimal)  Google     27.3   4.4K     1.05M    ¥3.6 / ¥21.6
18    gpt-5.1-high                       OpenAI     22.7   1.1K     400K     ¥9 / ¥72
19    kimi-k2.5-instant                  Moonshot   18.2   1.1K     262K     ¥4.32 / ¥21.6
20    grok-4.3                           xAI        13.6   965      1M       ¥9 / ¥18
21    gpt-5.1                            OpenAI     9.1    1.3K     400K     ¥9 / ¥72
22    gemini-3.1-flash-lite-preview      Google     4.5    3.7K     1.05M    ¥1.8 / ¥10.8
23    gemini-2.5-pro                     Google     0.0    1.2K     1.05M    ¥9 / ¥72

Top model analysis

Why claude-opus-4-7-thinking ranks first

claude-opus-4-7-thinking ranks first with a score of 100.0 on the leaderboard's 0 to 100 scale, based on 2.1K samples. Treat it as the default candidate for this task, then weigh price, context length, and availability against the alternatives.
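
The scores listed on this page appear to be consistent with a linear mapping from rank onto a 0 to 100 scale, so a score of 100.0 marks the top position rather than a perfect result. This is an observation from the listed values, not a formula documented by the source; `rank_to_score` is a hypothetical helper written only to illustrate that reading.

```python
def rank_to_score(rank: int, total: int) -> float:
    """Map rank 1..total linearly onto 100..0, rounded to one decimal.

    Assumption: inferred from the values shown on this page, not confirmed
    by the leaderboard's publisher.
    """
    if total < 2 or not 1 <= rank <= total:
        raise ValueError("need total >= 2 and 1 <= rank <= total")
    return round(100 * (total - rank) / (total - 1), 1)


# Spot-check against rows listed on this page (23 models).
listed = {1: 100.0, 2: 95.5, 5: 81.8, 12: 50.0, 23: 0.0}
for rank, score in listed.items():
    assert rank_to_score(rank, 23) == score
```

If this reading holds, the gap between two adjacent ranks is always the same, so the score alone says nothing about how close two models actually are in preference votes.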

How to choose

Look beyond rank #1

Start with the leaderboard closest to your task. Compare the top models by score and sample size, then check price, context length, open or closed access, and provider availability.
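
As a rough sketch of that process, the snippet below filters a handful of rows copied from this page by output price, context length, and sample count, then sorts what remains by score. The `Entry` type, the `shortlist` function, and the threshold values are illustrative assumptions, not part of the leaderboard; prices are taken as listed, in the page's own unit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    model: str
    score: float         # score within this leaderboard (0 to 100)
    samples: int         # approximate sample count as listed
    context_tokens: int  # context length as listed
    price_in: float      # listed input price, in yen
    price_out: float     # listed output price, in yen


# A subset of rows copied from the table on this page.
ROWS = [
    Entry("claude-opus-4-7-thinking", 100.0, 2100, 1_000_000, 36.0, 180.0),
    Entry("claude-sonnet-4-6", 95.5, 3200, 1_000_000, 21.6, 108.0),
    Entry("gpt-5.5-xhigh (codex-harness)", 81.8, 1800, 400_000, 9.0, 72.0),
    Entry("kimi-k2.6", 72.7, 1500, 262_000, 6.84, 28.8),
    Entry("qwen3.6-plus", 54.5, 2600, 1_000_000, 3.6, 21.6),
    Entry("grok-4.3", 13.6, 965, 1_000_000, 9.0, 18.0),
]


def shortlist(rows, max_price_out, min_context, min_samples=1000):
    """Keep rows that satisfy every constraint, best score first."""
    kept = [
        r for r in rows
        if r.price_out <= max_price_out
        and r.context_tokens >= min_context
        and r.samples >= min_samples
    ]
    return sorted(kept, key=lambda r: r.score, reverse=True)


# Hypothetical budget: output price at most 80, context of at least 400K.
picks = shortlist(ROWS, max_price_out=80.0, min_context=400_000)
print([p.model for p in picks])
```

With these example thresholds, the top-ranked models drop out on price, which is the point of checking constraints before committing to rank #1.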

FAQ

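Which metrics matter on the Image-to-WebDev leaderboard?

Look mainly at rank, the score on a 100-point scale, sample size, and source. The score is for quickly comparing models within the same leaderboard; the sample size indicates how stable the result is.

Why can't different leaderboards be combined into a single total score?

Different leaderboards use different tasks, samples, and evaluation criteria. By default, 模力榜 ranks models only within a single leaderboard, to avoid forcing capabilities such as writing, code, and image into one number.

How should I choose an Image-to-WebDev model?

Start with the leaderboard closest to your task, then factor in price, context length, open versus closed source, and provider availability. A high rank does not mean a model suits every budget or deployment setup.

How often is the leaderboard updated?

The page shows the most recent successfully collected public leaderboard data. It currently prioritizes the LMArena leaderboard dataset and keeps the original links in the page's source section.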