OpenAI
o3
A closed-source model built for complex reasoning tasks.
Use Case Fit
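Suitable application scenarios
Shows, for each application task, why this model is recommended and how many pieces of evidence support the recommendation.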
Cross-Benchmark Results
Recorded results: 6
| Domain | Benchmark | Rank | Score | Metric | Source | Last updated |
|---|---|---|---|---|---|---|
| math | LMArena Math | #1 | 1392 Elo | Arena Elo | LMArena | 2026/05/30 |
| math | MMLU-Pro Mathematics | #1 | 87.4% | Accuracy | TIGER-Lab / MMLU-Pro | 2026/05/20 |
| physics | MMLU-Pro Physics | #2 | 82.4% | Accuracy | TIGER-Lab / MMLU-Pro | 2026/05/20 |
| chemistry | ChemBench | #3 | 77.3 pts | Normalized Score | ChemBench | 2026/05/28 |
| economics | MMLU-Pro Economics | #1 | 85.9% | Accuracy | TIGER-Lab / MMLU-Pro | 2026/05/20 |
| medicine | MedHELM | #4 | 79.8 pts | Overall Score | MedHELM | 2026/05/27 |