Meta LLMs
Publisher of the open-weight Llama family.
Meta does not run a hosted API at the same scale as OpenAI/Anthropic/Google. Llama models are distributed under a permissive license that is not OSI-approved, and most usage runs through third-party inference providers such as Together, Groq, Fireworks, and Cerebras. Prices below are typical hosted rates on Together AI, in USD per million tokens; self-hosting is roughly 30-70% cheaper depending on hardware utilization.
Founded: 2004 · HQ: Menlo Park, USA · Docs: llama.com
All Meta models
| Model | Family | Context (tokens) | Input ($/M tokens) | Output ($/M tokens) | Released | Status |
|---|---|---|---|---|---|---|
| Llama 3.3 70B | Llama 3 | 128K | $0.59 | $0.79 | 2024-12-06 | active |
| Llama 3.1 405B | Llama 3 | 128K | $3.50 | $3.50 | 2024-07-23 | active |
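To make the per-million pricing concrete, here is a minimal sketch that turns the table's rates into a per-request cost estimate. The names `PRICES_PER_MILLION`, `estimate_cost`, and `self_hosted_range` are hypothetical and chosen for illustration; the rates are copied from the table above, and the self-hosted range simply applies this page's rough 30-70% figure rather than any measured number.

```python
from decimal import Decimal

# USD per million tokens, copied from the table above (typical Together AI rates).
PRICES_PER_MILLION = {
    "Llama 3.3 70B": {"input": Decimal("0.59"), "output": Decimal("0.79")},
    "Llama 3.1 405B": {"input": Decimal("3.50"), "output": Decimal("3.50")},
}

MILLION = Decimal(1_000_000)


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the hosted cost in USD of one request."""
    price = PRICES_PER_MILLION[model]
    total = price["input"] * input_tokens + price["output"] * output_tokens
    return total / MILLION


def self_hosted_range(hosted_cost: Decimal) -> tuple[Decimal, Decimal]:
    """Apply the page's rough "30-70% cheaper" figure (an assumption, not a measurement).

    Returns (low, high): the cost at a 70% saving and at a 30% saving.
    """
    low = hosted_cost * (Decimal(1) - Decimal("0.70"))
    high = hosted_cost * (Decimal(1) - Decimal("0.30"))
    return low, high


if __name__ == "__main__":
    # Example: a 10,000-token prompt with a 2,000-token completion.
    hosted = estimate_cost("Llama 3.3 70B", 10_000, 2_000)
    low, high = self_hosted_range(hosted)
    print(f"hosted: ${hosted:.5f}, self-hosted: ${low:.5f} to ${high:.5f}")
```

`Decimal` is used instead of `float` so that small per-request amounts add up without binary rounding drift when summed over many calls. Actual provider rates change over time, so the dictionary should be refreshed from the provider's current price list before relying on the output.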
Comparisons with other providers
The most-searched comparisons involving Meta models:
- Llama 3.3 70B vs GPT-5 Nano
- Llama 3.3 70B vs GPT-4o Mini
- Llama 3.1 405B vs GPT-5 Nano
- Llama 3.1 405B vs GPT-4o Mini
Working with the Meta API
Documentation lives at https://llama.com. Before paying for any call, count input tokens. The counters below are built for other providers' tokenizers, so treat their results as estimates when sizing Llama prompts:
- OpenAI Token Counter — for GPT-* and o-series models
- Claude Token Counter — for Anthropic Claude models
- Gemini Token Counter — for Google Gemini models