Embedding Cost Calculator

Estimate embedding generation cost for a corpus: per-document token estimate × number of documents × provider price per token (providers usually quote this per million tokens). Covers OpenAI, Voyage, Cohere, Gemini, and the open-weight families served on Together / Fireworks.
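The estimate above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual implementation: the function names are hypothetical, the 4-characters-per-token ratio is a rough rule of thumb for English text rather than a real tokenizer, and the price is a placeholder that should be replaced with the provider's current published rate.

```python
import math

# Assumption: roughly 4 characters per token for English text.
# A real tokenizer will give different counts, especially for code
# or non-English text.
CHARS_PER_TOKEN = 4.0


def estimate_tokens(text: str, chars_per_token: float = CHARS_PER_TOKEN) -> int:
    """Rough token count from character length (heuristic, not a tokenizer)."""
    return math.ceil(len(text) / chars_per_token)


def embedding_cost_usd(
    tokens_per_doc: float,
    num_docs: int,
    price_per_million_tokens: float,
) -> float:
    """Tokens per document x number of documents x price per token."""
    total_tokens = tokens_per_doc * num_docs
    return total_tokens / 1_000_000 * price_per_million_tokens


if __name__ == "__main__":
    sample_doc = "word " * 400  # a 2,000-character stand-in document
    tokens = estimate_tokens(sample_doc)
    # Placeholder price in USD per 1M tokens, not a quoted provider rate.
    cost = embedding_cost_usd(tokens, num_docs=100_000, price_per_million_tokens=0.10)
    print(f"{tokens} tokens/doc, estimated corpus cost: ${cost:.2f}")
```

For a tighter estimate, run the provider's own tokenizer over a sample of documents and use the measured average instead of the character heuristic.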

Embedding cost economics

Embeddings are usually a one-time cost per document, plus a per-query cost at retrieval. A 100K-document corpus averaging 500 tokens per document comes to 50M tokens, so embedding the full corpus once costs a few dollars at the cheap end (text-embedding-3-small, voyage-3-lite) and ~$10-15 at the high end (voyage-3-large). Query embedding is similarly cheap: 1M queries/month at 50 tokens each is another 50M tokens, so a month of queries costs about as much as embedding the corpus once. The difference is that the query cost recurs every month, while the corpus cost is paid once.
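To compare the one-time corpus cost with the recurring query cost, the figures from the paragraph above can be plugged into a short script. This is a sketch under stated assumptions: the Workload class and helper names are hypothetical, and the price is an illustrative placeholder rather than any provider's actual rate.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    corpus_docs: int        # number of documents embedded once
    avg_doc_tokens: int     # average tokens per document
    queries_per_month: int  # retrieval queries embedded each month
    avg_query_tokens: int   # average tokens per query


def corpus_tokens(w: Workload) -> int:
    """Total tokens for one full pass over the corpus."""
    return w.corpus_docs * w.avg_doc_tokens


def monthly_query_tokens(w: Workload) -> int:
    """Total tokens embedded per month for queries."""
    return w.queries_per_month * w.avg_query_tokens


def cost_usd(tokens: int, price_per_million_tokens: float) -> float:
    """Convert a token count to dollars at a per-million-token price."""
    return tokens / 1_000_000 * price_per_million_tokens


if __name__ == "__main__":
    # Figures from the text: 100K docs x 500 tokens, 1M queries x 50 tokens.
    w = Workload(
        corpus_docs=100_000,
        avg_doc_tokens=500,
        queries_per_month=1_000_000,
        avg_query_tokens=50,
    )
    price = 0.10  # placeholder USD per 1M tokens; check current provider pricing

    one_time = cost_usd(corpus_tokens(w), price)
    monthly = cost_usd(monthly_query_tokens(w), price)
    print(f"Corpus:  {corpus_tokens(w):,} tokens, ${one_time:.2f} one-time")
    print(f"Queries: {monthly_query_tokens(w):,} tokens/month, ${monthly:.2f}/month")
```

Printing token volume next to cost makes it easy to see which side dominates for a given workload before any provider price is chosen.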

The expensive part of embeddings isn't the embedding API call — it's storage and retrieval infrastructure (Pinecone, Qdrant, Postgres pgvector, etc.). For most teams, embedding API spend is less than 10% of total RAG infrastructure cost.

Picking an embedding model

Voyage-3 (regular) and OpenAI's text-embedding-3-large are among the stronger hosted models on the MTEB leaderboard, though rankings there shift often. For most RAG workloads, the quality differences between top-5 models are smaller than the differences caused by chunking strategy, retrieval top-k tuning, and reranker choice. Start with the cheapest reasonable option (text-embedding-3-small or voyage-3-lite); invest in upgrading only after the rest of the pipeline is tight.