Meta LLMs

Publisher of the open-weight Llama family.

Meta does not operate a hosted API at the scale of OpenAI, Anthropic, or Google. Llama models are distributed under a permissive license that is not OSI-approved open source, and they are typically run through third-party inference providers such as Together, Groq, Fireworks, and Cerebras. Prices below reflect typical hosted pricing on Together AI; self-hosting is roughly 30-70% cheaper, depending on hardware utilization.

Founded: 2004 · HQ: Menlo Park, USA · Docs: llama.com

All Meta models

Model	Family	Context (tokens)	Input ($ per 1M tokens)	Output ($ per 1M tokens)	Released	Status
Llama 3.3 70B	Llama 3	128K	$0.59	$0.79	2024-12-06	active
Llama 3.1 405B	Llama 3	128K	$3.50	$3.50	2024-07-23	active
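
To make the per-million-token prices concrete, here is a minimal sketch that turns the two rows above into a per-request cost estimate. The prices are copied from the table; the names Price, PRICES, hosted_cost, and self_hosted_range are hypothetical helpers introduced only for illustration, and the self-hosted range simply applies the page's rough "30-70% cheaper" figure rather than any measured hardware cost.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Price:
    input_per_m: float   # USD per 1M input tokens
    output_per_m: float  # USD per 1M output tokens


# Prices copied from the table above (typical hosted pricing on Together AI).
PRICES = {
    "Llama 3.3 70B": Price(input_per_m=0.59, output_per_m=0.79),
    "Llama 3.1 405B": Price(input_per_m=3.50, output_per_m=3.50),
}


def hosted_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated hosted cost in USD for one request."""
    price = PRICES[model]
    total = input_tokens * price.input_per_m + output_tokens * price.output_per_m
    return total / 1_000_000


def self_hosted_range(hosted_usd: float) -> Tuple[float, float]:
    """Rough (low, high) self-hosted cost, assuming it is 30-70% cheaper.

    70% cheaper means paying 30% of the hosted price; 30% cheaper means 70%.
    """
    return hosted_usd * 0.30, hosted_usd * 0.70


if __name__ == "__main__":
    # Example request: 2,000 input tokens and 500 output tokens.
    for model in PRICES:
        cost = hosted_cost(model, input_tokens=2_000, output_tokens=500)
        low, high = self_hosted_range(cost)
        print(f"{model}: hosted ${cost:.6f}, self-hosted ${low:.6f} to ${high:.6f}")
```

Input and output tokens are priced separately because the 70B row charges different rates for each; for the 405B row the two rates happen to be equal, so only the total token count matters.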

Working with the Meta API

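Documentation lives at https://llama.com. Because hosted calls are billed per token, count input tokens before paying for any call. The counters listed here are built for other providers' tokenizers, so treat their results as an approximation for Llama models: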

OpenAI Token Counter — for GPT-* and o-series models
Claude Token Counter — for Anthropic Claude models
Gemini Token Counter — for Google Gemini models