**DRAFT — pending editorial expansion.** This article is a working draft published as scaffolding for the NINtec content programme. The current version covers the substantive perspective in compressed form; the published version will expand each section to the 2,000+ word depth the topic warrants. Editorial review is required before promotion.
The right vector database for a production Claude RAG system depends on the workload. Pinecone, Weaviate, Qdrant, pgvector, and Chroma each fit a different operational profile. This piece covers the decision framework NINtec applies in Discovery.
pgvector — operational simplicity for moderate scale
For corpora up to ~10M vectors with a moderate query rate, pgvector on Postgres is the operationally simplest option. Existing Postgres operations cover the storage, and approximate nearest neighbor (ANN) indexing via HNSW or IVFFlat is well supported. The fit is strongest for teams already running Postgres.
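As a rough illustration of how little setup this involves, the sketch below builds the SQL a team might run to add a vector column, an ANN index, and a top-k query. The table name `chunks`, the column names, and the 1024-dimension default are assumptions made for the example. The index parameters shown are pgvector's documented defaults and would need tuning against a real workload.

```python
# Minimal sketch: generate pgvector DDL and a similarity query as SQL strings.
# Table name, column names, and dimension are hypothetical placeholders.

def pgvector_ddl(table: str = "chunks", dim: int = 1024, index: str = "hnsw") -> list[str]:
    """Return SQL statements that create a vector table and an ANN index."""
    if index == "hnsw":
        # m and ef_construction are shown at pgvector's default values.
        index_sql = (
            f"CREATE INDEX ON {table} USING hnsw (embedding vector_cosine_ops) "
            "WITH (m = 16, ef_construction = 64);"
        )
    elif index == "ivfflat":
        # lists controls the number of IVFFlat clusters; 100 is the default.
        index_sql = (
            f"CREATE INDEX ON {table} USING ivfflat (embedding vector_cosine_ops) "
            "WITH (lists = 100);"
        )
    else:
        raise ValueError(f"unsupported index type: {index}")
    return [
        "CREATE EXTENSION IF NOT EXISTS vector;",
        f"CREATE TABLE {table} (id bigserial PRIMARY KEY, content text, "
        f"embedding vector({dim}));",
        index_sql,
    ]


def top_k_query(table: str = "chunks", k: int = 5) -> str:
    """Return a parameterised query; <=> is pgvector's cosine distance operator."""
    return f"SELECT id, content FROM {table} ORDER BY embedding <=> %s::vector LIMIT {k};"


if __name__ == "__main__":
    for statement in pgvector_ddl():
        print(statement)
    print(top_k_query())
```

The query embedding is passed as a bound parameter by whichever Postgres driver the team already uses, so retrieval stays inside the existing database connection and transaction model.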
Pinecone — hosted simplicity at scale
Pinecone abstracts away operational concerns at the cost of vendor lock-in to a hosted service. It suits teams that want to outsource vector-database operations and are willing to pay for the abstraction. It scales to billions of vectors.
Weaviate — hybrid search first-class
Weaviate's hybrid search (vector + keyword) is built in. For workloads where keyword retrieval rescues dense-retrieval misses, the integrated approach reduces engineering effort. Weaviate is open source, with a hosted offering.
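To make "keyword retrieval rescues dense-retrieval misses" concrete, here is a generic sketch of reciprocal rank fusion (RRF), one common way to merge a dense ranking with a keyword ranking. This is not Weaviate's API or its exact fusion implementation. It illustrates the kind of merging logic a team would otherwise build and maintain themselves. The document IDs and the constant `k = 60` are illustrative assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked lists of document IDs; each list contributes 1 / (k + rank)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for deterministic output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results: the dense retriever misses doc_d, the keyword retriever finds it.
dense = ["doc_a", "doc_b", "doc_c"]
keyword = ["doc_d", "doc_b", "doc_a"]
fused = reciprocal_rank_fusion([dense, keyword])
print(fused)  # ['doc_a', 'doc_b', 'doc_d', 'doc_c']
```

In this toy case `doc_d` never appears in the dense results, yet it still lands ahead of `doc_c` after fusion because the keyword ranking placed it first.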
Qdrant — performance at high query load
Qdrant's performance characteristics shine under heavy concurrent query load. It suits high-throughput production workloads where p99 latency matters. Qdrant is open source, with a hosted offering.
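Since the recommendation hinges on p99 latency, it helps to be precise about what that number is. The sketch below computes a nearest-rank percentile over a list of per-query latencies. The sample values are made up for illustration, and in practice they would come from a load test against each candidate database.

```python
import math


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds: most queries are fast, a few are slow.
latencies_ms = [12.0] * 97 + [40.0, 250.0, 400.0]
mean_ms = sum(latencies_ms) / len(latencies_ms)
p50_ms = percentile(latencies_ms, 50)
p99_ms = percentile(latencies_ms, 99)
print(f"mean={mean_ms:.2f} ms  p50={p50_ms} ms  p99={p99_ms} ms")
```

With these samples the mean is under 20 ms while the p99 is 250 ms, which is why averages alone can hide the tail behaviour that matters under concurrent load.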
Chroma — prototyping and smaller corpora
Chroma is lightweight and fast to set up. It suits prototyping and smaller production corpora. Its operational footprint is small, but it is less suited to very high scale.
The vector database choice is one of several decisions in RAG architecture, alongside embedding selection, chunking strategy, retrieval pattern, and re-ranking, and these decisions interact. Discovery makes the call with your specific numbers.
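The rules of thumb in this piece can be written down as a small shortlist function. This is a toy encoding of the guidance above and not NINtec's actual Discovery framework. The `Workload` fields and the boolean simplifications are assumptions made for the sketch, and the only threshold taken from the text is the ~10M vector figure for pgvector.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical summary of the numbers gathered for a RAG workload."""
    vector_count: int
    already_on_postgres: bool = False
    needs_hybrid_search: bool = False
    high_concurrent_query_load: bool = False
    wants_managed_service: bool = False
    prototype: bool = False


def shortlist(w: Workload) -> list[str]:
    """Return candidate databases suggested by the rules of thumb, in no strict order."""
    candidates: list[str] = []
    if w.prototype:
        candidates.append("Chroma")
    # pgvector: up to ~10M vectors, moderate query rate, team already on Postgres.
    if (w.vector_count <= 10_000_000 and w.already_on_postgres
            and not w.high_concurrent_query_load):
        candidates.append("pgvector")
    if w.needs_hybrid_search:
        candidates.append("Weaviate")
    if w.high_concurrent_query_load:
        candidates.append("Qdrant")
    if w.wants_managed_service:
        candidates.append("Pinecone")
    return candidates


print(shortlist(Workload(vector_count=2_000_000, already_on_postgres=True)))
print(shortlist(Workload(vector_count=500_000_000,
                         high_concurrent_query_load=True,
                         wants_managed_service=True)))
```

An empty or multi-entry result is expected for many workloads. The function only narrows the field, and the remaining choice still depends on the interacting decisions listed above.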