FRONTIER MODEL SPECS: JAN 2026
SOURCE: Aggregated Benchmarks (SWE-bench Verified / AIME 2025 / GPQA Diamond)
⚠️ Disclaimer: Some specifications listed (e.g., GPT-5 Orion, Gemini 3.0 Pro) are projected or based on limited public information. Benchmark scores and pricing are subject to change. This page is for research and comparison purposes only. Always verify specifications with official sources before making production decisions.
Benchmark Glossary
SWE-bench Verified: Real-world software engineering tasks from GitHub issues
AIME 2025: American Invitational Mathematics Examination - advanced math reasoning
GPQA Diamond: Graduate-level science questions (physics, chemistry, biology)
The Trade-Off: Context Memory vs. Reasoning Power
[Figure: Context Window (log scale; ticks at 128k, 500k, 2M, 10M) plotted against Coding & Math Reasoning Score (SWE + AIME; ticks at 50%, 65%, 75%, 98%)]