
Best LLMs for Coding

The LLMs below are ranked by their scores on coding benchmarks that measure code generation, bug fixing, and software-engineering ability. A higher score means stronger performance on programming tasks. When choosing a coding model, weigh benchmark strength against context window (important for large codebases) and price.
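One way to combine score, context window, and price is to filter on the hard constraints first and then sort what remains by score. The sketch below is illustrative only: the `Model` class, the `shortlist` helper, and the thresholds are hypothetical names and values, the five rows are copied from the ranking, and context sizes assume "K" means 1,000 tokens and "M" means 1,000,000.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    score: float         # coding score from the ranking
    input_per_m: float   # USD per 1M input tokens
    output_per_m: float  # USD per 1M output tokens
    context: int         # context window in tokens (assumes K = 1,000)


# A few rows copied from the ranked list; not the full table.
MODELS = [
    Model("OpenAI o3-pro (2025-06-10)", 84.9, 10.00, 19.99, 200_000),
    Model("Google: Gemini 2.5 Pro Preview 06-05", 83.1, 1.25, 10.00, 1_000_000),
    Model("Doubao-Seed-Code", 78.8, 0.17, 1.12, 256_000),
    Model("GPT 4.1", 74.6, 2.00, 8.00, 1_000_000),
    Model("DeepSeek: DeepSeek V3.2", 74.2, 0.23, 0.34, 131_000),
]


def shortlist(models, min_context, max_output_per_m):
    """Keep models that meet both constraints, highest score first."""
    keep = [
        m for m in models
        if m.context >= min_context and m.output_per_m <= max_output_per_m
    ]
    return sorted(keep, key=lambda m: m.score, reverse=True)


# Hypothetical requirements: at least 250K context, at most $10 per 1M output tokens.
picks = shortlist(MODELS, min_context=250_000, max_output_per_m=10.00)
for m in picks:
    print(f"{m.name}: score {m.score}, context {m.context:,}")
```

With these example thresholds the top-ranked model drops out because its context window and output price both miss the constraints, which is the kind of trade-off the ranking alone does not show.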

Ranked list (top 25)

| # | Model | Provider | Coding score | Input / 1M | Output / 1M | Context |
|---|-------|----------|--------------|------------|-------------|---------|
| 1 | OpenAI o3-pro (2025-06-10) | NanoGPT | 84.9 | $10.00 | $19.99 | 200K |
| 2 | Google: Gemini 2.5 Pro Preview 06-05 | Google | 83.1 | $1.25 | $10.00 | 1M |
| 3 | Doubao-Seed-Code | ZenMux | 78.8 | $0.17 | $1.12 | 256K |
| 4 | Google: Gemini 2.5 Pro Preview 05-06 | Google | 76.9 | $1.25 | $10.00 | 1M |
| 5 | Claude 4 Sonnet | NanoGPT | 76.8 | $2.99 | $14.99 | 200K |
| 6 | Gemini 3 Flash Thinking | NanoGPT | 75.8 | $0.50 | $3.00 | 1M |
| 7 | MiniMax M2.5 | NanoGPT | 75.8 | $0.30 | $1.20 | 205K |
| 8 | Anthropic: Claude Opus 4.6 (Fast) | Anthropic | 75.6 | $30.00 | $150.00 | 1M |
| 9 | GPT 4.1 | NanoGPT | 74.6 | $2.00 | $8.00 | 1M |
| 10 | OpenAI o4-mini high | NanoGPT | 74.4 | $1.10 | $4.40 | 200K |
| 11 | DeepSeek: DeepSeek V3.2 | DeepSeek | 74.2 | $0.23 | $0.34 | 131K |
| 12 | Claude 4 Opus Thinking (1K) | NanoGPT | 73.2 | $14.99 | $75.00 | 200K |
| 13 | GPT-5.1 | Poe | 66.0 | $1.10 | $9.00 | 400K |
| 14 | Qwen: Qwen3 Coder 30B A3B Instruct | Qwen | 60.4 | $0.07 | $0.27 | 160K |
| 15 | Devstral Small | Mistral (open weights) | 56.4 | $0.10 | $0.30 | 128K |
| 16 | gemini-2.5-flash-preview-05-20 | Jiekou.AI | 55.1 | $0.14 | $3.15 | 1M |
| 17 | Devstral 2 | Mistral (open weights) | 53.8 | $0.40 | $2.00 | 262K |
| 18 | Grok 3 Mini | GitHub Models | 49.3 | $0.00 | $0.00 | 128K |
| 19 | Mistral Devstral Small 2505 | NanoGPT | 46.8 | $0.06 | $0.06 | 33K |
| 20 | chatgpt-4o-latest | 302.AI | 45.3 | $5.00 | $15.00 | 128K |
| 21 | Amazon: Nova Premier 1.0 | Amazon | 42.4 | $2.50 | $12.50 | 1M |
| 22 | Qwen: Qwen3 32B | Qwen | 40.0 | $0.08 | $0.28 | 131K |
| 23 | GPT 4.1 Mini | NanoGPT | 32.4 | $0.40 | $1.60 | 1M |
| 24 | Gemini 2.5 Flash Preview | NanoGPT | 28.7 | $0.15 | $0.60 | 1M |
| 25 | Qwen Max | Alibaba (China) | 21.8 | $0.35 | $1.38 | 131K |

Prices are per 1M tokens (USD); confirm with the provider. Updated regularly.
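Because prices are quoted per 1M tokens, the cost of a single request is each token count multiplied by its rate and divided by one million. A minimal sketch of that arithmetic follows; the function name and the token counts are hypothetical, and the rates are the $1.25 input and $10.00 output prices listed for the Gemini 2.5 Pro Preview entries.

```python
def request_cost(input_tokens, output_tokens, input_per_m, output_per_m):
    """Estimate the USD cost of one request from per-1M-token prices."""
    return (input_tokens * input_per_m + output_tokens * output_per_m) / 1_000_000


# Hypothetical request: 50,000 input tokens and 2,000 output tokens.
cost = request_cost(50_000, 2_000, input_per_m=1.25, output_per_m=10.00)
print(f"${cost:.4f}")  # prints $0.0825
```

This ignores anything a provider may bill separately (for example cached or reasoning tokens), so treat the result as a rough estimate.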
