Best LLM for Every Task

There is no single "best" language model; the right pick depends on the workload. Each guide below ranks models for one specific task using benchmark scores, capability fit, and current list pricing, with a clear rationale for every rank. The guides cover 20 tasks and are updated as new models ship.

Pick by task

How these rankings are built

Each task weights its inputs differently: "code generation" leans on coding benchmarks and price, while "customer support" weights price and latency over peak quality. Composite scores combine those weighted inputs with capability fit (tool use, vision, function calling). The output is an ordered list with reasons, not a single verdict, so you can override it when you have constraints the ranking can't model.
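To make the idea concrete, here is a minimal sketch of a per-task weighted composite. It is not the exact formula behind these rankings: the model names, score values, weights, and the choice to multiply by a capability-fit fraction are all illustrative assumptions. Scores are assumed to be normalized to 0..1 with higher meaning better, so a cheaper or faster model gets a higher "price" or "latency" score.

```python
from dataclasses import dataclass, field


@dataclass
class Model:
    name: str
    scores: dict[str, float]  # assumed normalized to 0..1, higher is better
    capabilities: set[str] = field(default_factory=set)


def composite_score(model: Model, weights: dict[str, float], required_caps: set[str]) -> float:
    """Weighted average of the task's inputs, scaled by capability fit (an assumed rule)."""
    total_weight = sum(weights.values())
    weighted = sum(w * model.scores.get(key, 0.0) for key, w in weights.items()) / total_weight
    # Capability fit: fraction of required capabilities the model supports.
    fit = len(required_caps & model.capabilities) / len(required_caps) if required_caps else 1.0
    return weighted * fit


def rank(models: list[Model], weights: dict[str, float], required_caps: set[str]) -> list[str]:
    """Return model names ordered from highest to lowest composite score."""
    scored = [(composite_score(m, weights, required_caps), m.name) for m in models]
    return [name for _, name in sorted(scored, key=lambda pair: pair[0], reverse=True)]


# Hypothetical models and scores, for illustration only.
MODELS = [
    Model("model-a", {"coding": 0.9, "price": 0.3, "latency": 0.4}, {"function_calling"}),
    Model("model-b", {"coding": 0.6, "price": 0.9, "latency": 0.8}, {"function_calling", "vision"}),
]

# Hypothetical task weightings mirroring the two examples in the text.
CODE_GEN_WEIGHTS = {"coding": 0.7, "price": 0.3}
SUPPORT_WEIGHTS = {"price": 0.5, "latency": 0.4, "coding": 0.1}

if __name__ == "__main__":
    print("code generation:", rank(MODELS, CODE_GEN_WEIGHTS, {"function_calling"}))
    print("customer support:", rank(MODELS, SUPPORT_WEIGHTS, {"function_calling"}))
```

With these made-up numbers the order flips between the two tasks, which is the point of task-specific weighting: the same models, scored on the same inputs, rank differently once the weights change. Swapping in your own weights or required capabilities is the simplest way to test how sensitive a ranking is to your constraints.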

Reproduce the math with your own assumptions: compare any two models with the comparison tool, browse the full model database, or estimate spend with the cost calculator.