pgvector Index Sizing Calculator
Calculate how much disk and RAM your pgvector deployment will need before you provision hardware. Set the number of vectors, dimensions, element type (float32 / halfvec / int8), and index type (flat / IVFFlat / HNSW) to get humanised size estimates and a recommended RAM target for keeping the index hot.
How to use the pgvector Index Sizing Calculator
Set the vector count, dimensionality, element type, and index type, then click Calculate:
- N: total number of vectors you plan to store. 1 million is a typical starting point; RAG pipelines often scale to 10-100M.
- D: dimensionality of each vector. Common values: 384 (MiniLM), 768 (BERT), 1536 (text-embedding-3-small), 3072 (text-embedding-3-large).
- Element type: float32 is the default and gives full precision. halfvec (pgvector 0.7+) halves storage at a minor recall cost. int8 (1 byte per element) corresponds to scalar quantisation, and bit vectors are for binary quantisation.
- Index type: flat uses a sequential scan (no index overhead, good for <100K vectors); IVFFlat clusters vectors into lists and probes a subset at query time; HNSW builds a navigable small-world graph and typically gives the best recall/speed trade-off.
- HNSW m: number of graph links per node. A higher m gives better recall but a larger index. The default is 16; use 32-64 for high-recall workloads and 8 for memory-constrained deployments.
The calculator outputs raw vector storage, table overhead, index size, total disk, and recommended RAM (table + index × 1.2 so the working set fits in memory).
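The raw vector storage figure follows directly from N, D, and the element width. The Python sketch below is a minimal illustration, not the calculator's actual code. It assumes 4, 2, and 1 bytes per element for float32, halfvec, and int8, and it ignores per-row and per-page overhead. The function names are hypothetical.

```python
# Assumed element widths in bytes. Per-vector headers and PostgreSQL
# row/page overhead are deliberately ignored in this sketch.
BYTES_PER_ELEMENT = {"float32": 4, "halfvec": 2, "int8": 1}


def raw_vector_bytes(n: int, d: int, element_type: str = "float32") -> int:
    """Raw storage for n vectors of d dimensions, in bytes."""
    return n * d * BYTES_PER_ELEMENT[element_type]


def humanise(num_bytes: float) -> str:
    """Format a byte count using binary units (1 KiB = 1024 bytes)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB"]
    value = float(num_bytes)
    for unit in units:
        # Stop at the first unit that fits, or at the largest unit.
        if value < 1024 or unit == units[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024


if __name__ == "__main__":
    # 1 million 1536-dimensional vectors in each element type.
    for element_type in BYTES_PER_ELEMENT:
        size = raw_vector_bytes(1_000_000, 1536, element_type)
        print(f"{element_type:8s} {humanise(size)}")
```

For 1 million float32 vectors at 1536 dimensions this gives 6,144,000,000 bytes, or about 5.72 GiB, before any table or index overhead.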
pgvector index types: flat, IVFFlat, HNSW
pgvector is a PostgreSQL extension that stores and indexes floating-point vectors alongside relational data. It supports three query strategies, each with different storage and recall characteristics.
The flat (sequential scan) strategy computes exact nearest-neighbour distances by scanning every vector: 100% recall, zero index overhead, but O(N) per query. At a million vectors with 1536 dimensions this is typically too slow for interactive use.
The IVFFlat (Inverted File with Flat lists) index partitions vectors into k lists using k-means at build time, then probes only a subset of lists at query time. The index adds roughly 5% overhead to the raw vector storage. Build time is fast (minutes), but recall drops noticeably if the number of probes is too low.
The HNSW (Hierarchical Navigable Small World) index builds a layered proximity graph. It delivers the best recall/QPS trade-off in most benchmarks, at the cost of significantly more RAM: the graph links add approximately N × m × 8.5 bytes on top of the raw vectors. Build time is also slower than IVFFlat.
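The rules of thumb above can be turned into a rough index-size estimate. The following Python sketch is illustrative only. It assumes that both IVFFlat and HNSW indexes hold their own copy of the vectors, so the 5% and N × m × 8.5 figures are added to the raw vector size. The function name and parameters are hypothetical.

```python
def index_bytes(n: int, d: int, index_type: str,
                bytes_per_element: int = 4, m: int = 16) -> float:
    """Approximate index size in bytes from the rules of thumb above."""
    raw = n * d * bytes_per_element
    if index_type == "flat":
        # Sequential scan: there is no index at all.
        return 0.0
    if index_type == "ivfflat":
        # Assumption: a copy of the vectors plus roughly 5% overhead.
        return raw * 1.05
    if index_type == "hnsw":
        # A copy of the vectors plus graph links at about n * m * 8.5 bytes.
        return raw + n * m * 8.5
    raise ValueError(f"unknown index type: {index_type}")


if __name__ == "__main__":
    for kind in ("flat", "ivfflat", "hnsw"):
        size_gib = index_bytes(1_000_000, 1536, kind) / 1024**3
        print(f"{kind:8s} {size_gib:.2f} GiB")
```

These are order-of-magnitude estimates; real index sizes also depend on page fill and build parameters, which this sketch does not model.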
The recommended RAM figure (table + index × 1.2) ensures the index fits in the OS page cache. If the index exceeds available RAM, queries will hit disk and latency will spike dramatically — usually 10-100x. For HNSW on large datasets, provisioning enough RAM is the single most important deployment decision. If RAM is constrained, use halfvec (2 bytes/element) to cut vector storage in half, or reduce m from 16 to 8, which roughly halves the graph link overhead.
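To see how halfvec and a lower m affect the RAM target, the sketch below compares three configurations. It rests on several assumptions that are not confirmed by the calculator: the rule is read as (table + index) × 1.2, the table is approximated by the raw vectors alone, and the HNSW index holds a second copy of the vectors plus the graph links. The names are hypothetical.

```python
HEADROOM = 1.2  # headroom factor from the recommended-RAM rule of thumb


def hnsw_link_bytes(n: int, m: int) -> float:
    """Approximate HNSW graph link overhead: n * m * 8.5 bytes."""
    return n * m * 8.5


def hnsw_ram_bytes(n: int, d: int, bytes_per_element: int, m: int) -> float:
    """Recommended RAM for a table plus its HNSW index.

    Assumptions: the table is approximated by the raw vectors only, the
    index holds a second copy of the vectors plus the graph links, and the
    rule is read as (table + index) * 1.2.
    """
    table = n * d * bytes_per_element
    index = table + hnsw_link_bytes(n, m)
    return (table + index) * HEADROOM


if __name__ == "__main__":
    n, d = 1_000_000, 1536
    for label, width, m in [("float32, m=16", 4, 16),
                            ("halfvec, m=16", 2, 16),
                            ("float32, m=8", 4, 8)]:
        gib = hnsw_ram_bytes(n, d, width, m) / 1024**3
        print(f"{label}: {gib:.2f} GiB")
```

Under these assumptions the vector data dominates at 1536 dimensions: the graph links for 1 million vectors at m = 16 come to about 136 MB against roughly 6.1 GB of float32 vectors, so switching to halfvec saves far more RAM than lowering m.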
Common use cases
- RAG pipeline sizing: estimate Postgres RAM before loading your document corpus for a retrieval-augmented generation system.
- Instance right-sizing: compare float32 vs halfvec storage costs to decide whether to upgrade to a larger AWS RDS or Supabase plan.
- HNSW m tuning: see the RAM impact of changing m from 16 to 32 or 64 before modifying a production index.
- Flat vs HNSW decision: for small datasets (<200K vectors) the table may fit comfortably in RAM, and a flat scan adds zero index maintenance overhead.
- Capacity planning: project disk usage for 6 months of vector ingestion at a known rate (e.g. 50K new documents/day).
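The capacity-planning case can be sketched as simple arithmetic. This Python example assumes one vector per document and treats 6 months as 180 days; both are assumptions, and the `vectors_per_doc` parameter is a hypothetical knob for pipelines that split documents into several chunks.

```python
def projected_vector_count(current: int, docs_per_day: int, days: int,
                           vectors_per_doc: int = 1) -> int:
    """Vector count after `days` of ingestion at a constant daily rate."""
    return current + docs_per_day * days * vectors_per_doc


def projected_raw_bytes(vector_count: int, d: int,
                        bytes_per_element: int = 4) -> int:
    """Raw vector storage in bytes, ignoring table and index overhead."""
    return vector_count * d * bytes_per_element


if __name__ == "__main__":
    # 50K new documents/day for roughly 6 months, starting from empty.
    count = projected_vector_count(0, 50_000, 180)
    size = projected_raw_bytes(count, 1536)
    print(f"{count:,} vectors, {size / 1024**3:.1f} GiB raw (float32)")
```

With these inputs the projection is 9,000,000 vectors and 55,296,000,000 bytes of raw float32 storage, to which table overhead and any index must still be added.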
Frequently asked questions
What is the HNSW m parameter and how do I choose it?
m is the number of graph links per node. A higher m gives better recall at the cost of a larger index. The default is 16; use 32-64 for high-recall workloads and 8 for memory-constrained deployments.
Does pgvector support binary vectors?
Yes. pgvector supports the bit type for binary vectors (1 bit per dimension, packed into bytes). The effective size is 1/8 byte per element. Use "1 byte" in this calculator as a conservative approximation, keeping in mind that it overestimates by about 8×; actual storage is approximately D/8 bytes per vector.
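As a quick check on the D/8 figure, the sketch below compares packed bit storage with float32 storage for one vector. It ignores any per-value header, and the function name is hypothetical.

```python
def bit_vector_bytes(d: int) -> int:
    """Bytes needed to pack d one-bit dimensions, rounded up to whole bytes."""
    return (d + 7) // 8


if __name__ == "__main__":
    d = 1536
    packed = bit_vector_bytes(d)
    full = d * 4  # float32 uses 4 bytes per element
    print(f"bit: {packed} bytes, float32: {full} bytes, ratio: {full // packed}x")
```

For D = 1536 this is 192 bytes per vector, compared with 6,144 bytes for float32.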