Why are engineering teams moving off standalone vector databases?

Beyond per-query billing that consumes 40-50% of application cost, running relational and vector data as two systems creates asynchronous consistency gaps, zombie vectors, and broken access control. Converged options like pgvector keep vectors next to the source of truth and eliminate that dual-state overhead.

What is a zombie vector and why does it matter?

A zombie vector is a stale embedding left behind after its source record is updated or deleted. In in-memory databases it wastes expensive RAM, but the bigger risk is compliance: the embedding stays fully searchable after the underlying record is supposed to be gone.

When do you still need a standalone vector database?

Mainly at billion-scale and beyond, where in-memory indexes become financially unsustainable and specialized techniques like DiskANN, HNSW centroid routing, and RaBitQ rotation earn their keep. Most enterprise use cases never reach that scale.

Why isn't pure semantic search enough for enterprise retrieval?

Embedding models weight conceptual meaning over literal matches, so they fail silently on precise, high-stakes queries like SKUs, policy IDs, cancellation policies, and dates. Hybrid search combining BM25 lexical matching with semantic embeddings is now a production requirement.

How Two Databases Quietly Break Your RBAC and GDPR Story

The $200 Migration That Exposed a $50M Assumption

A mid-sized fintech engineering team processing 50 million daily transactions ran the same retrieval workload on a standalone vector database and on pgvector. Same queries. Same results. The standalone bill collapsed to roughly $200 a month after migration.¹ That number tells the story everyone already knows: standalone vector databases are expensive at scale. But it buries a more dangerous reality. The migration was inevitable because the two-database architecture was quietly breaking things that never showed up on an invoice.

The cost story is well documented. Vector databases can consume 40 to 50 percent of total application cost in production, second only to LLM API calls.² Metadata filtering silently multiplies read-unit consumption, turning a one-unit semantic query into five or ten.¹ Per-query billing works at prototype scale and fails at production scale.³ What no one prices is the compliance blast radius of running relational data and vector data as two independent stateful systems.
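To make the filtering multiplier concrete, here is a back-of-the-envelope sketch. The daily query volume is a hypothetical figure chosen for illustration, not a number from the cited sources; only the one-unit versus five-to-ten-unit range comes from the text above.

```python
def monthly_read_units(queries_per_day: int, units_per_query: int, days: int = 30) -> int:
    """Total read units billed in a month at a fixed per-query cost."""
    return queries_per_day * units_per_query * days


# Hypothetical workload (an assumption for illustration only).
QUERIES_PER_DAY = 1_000_000

unfiltered = monthly_read_units(QUERIES_PER_DAY, units_per_query=1)
filtered_low = monthly_read_units(QUERIES_PER_DAY, units_per_query=5)
filtered_high = monthly_read_units(QUERIES_PER_DAY, units_per_query=10)

print(f"no metadata filter:   {unfiltered:>13,} read units/month")
print(f"filtered, 5 units:    {filtered_low:>13,} read units/month")
print(f"filtered, 10 units:   {filtered_high:>13,} read units/month")
```

The query count never changes between the three lines; only the per-query unit cost does, which is why the increase is easy to miss until the invoice arrives.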

Call it Dual-State Drag: the compounding tax on operations, security, and velocity paid by teams that keep source-of-truth records and their vector representations in two databases. The two systems must be kept consistent, yet by construction they never fully are. Dual-State Drag is the hidden second-order cost that margin-cannibalization coverage misses, and it is what drives the shift toward converged architectures.

Why Two Databases Can Never Stay Consistent

Dual-state architectures fail because consistency between a relational store and a vector store is asynchronous by construction. Every write to the primary database triggers a separate, delayed operation to regenerate and reinsert an embedding. That gap is where correctness dies.

When a new product lands in your primary database, you face what one Google Cloud engineer described as a "frantic, asynchronous scramble to create its vector fingerprint and stuff it into the vector database."⁴ That scramble is not an edge case. It runs on every database mutation forever. Teams start with a standalone vector database for retrieval. Then they discover that embeddings and permissions must stay perfectly synced with operational records.⁵ The moment they discover it, they own a distributed-consistency problem they never chose to build.
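The gap described above can be reproduced in a few lines. This is a minimal in-memory simulation, not a real database client: `primary_db`, `vector_store`, `sync_queue`, and `fake_embed` are all hypothetical stand-ins, and the "worker" is just a function call where production would have a separate, delayed process.

```python
from collections import deque

primary_db = {}       # source of truth: record_id -> text
vector_store = {}     # second system: record_id -> (embedding, text that was embedded)
sync_queue = deque()  # embedding jobs waiting to be processed


def fake_embed(text: str) -> list[float]:
    """Hypothetical stand-in for a call to an embedding model."""
    return [float(len(text)), float(sum(map(ord, text)) % 97)]


def write_record(record_id: str, text: str) -> None:
    """Commit to the primary store, then only enqueue the vector update."""
    primary_db[record_id] = text
    sync_queue.append(record_id)


def run_sync_worker() -> None:
    """Drain the queue. In production this happens later, in another process."""
    while sync_queue:
        record_id = sync_queue.popleft()
        text = primary_db.get(record_id)
        if text is None:
            vector_store.pop(record_id, None)
        else:
            vector_store[record_id] = (fake_embed(text), text)


def is_consistent(record_id: str) -> bool:
    """True when the vector store reflects the current state of the primary."""
    text = primary_db.get(record_id)
    entry = vector_store.get(record_id)
    if text is None:
        return entry is None
    return entry is not None and entry[1] == text


write_record("sku-1", "Blue widget, 10 cm")
run_sync_worker()
write_record("sku-1", "Blue widget, 12 cm")  # update is committed in the primary
stale_window = not is_consistent("sku-1")    # vector store still describes 10 cm
run_sync_worker()
caught_up = is_consistent("sku-1")

print("stale before worker ran:", stale_window)
print("consistent after worker ran:", caught_up)
```

Between the second `write_record` and the next `run_sync_worker`, any search would be answered from the old embedding. Making the worker faster shrinks that window; it does not remove it.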

[Figure: two mismatched clock faces wired together by a tangle of cables, one showing a slightly later time than the other. Caption: Dual-state architectures run on two clocks that never quite agree.]

The drag compounds beyond runtime. Coordinating schema migrations across two independent stateful databases turns routine CI/CD into a choreography problem. Running a separate database alongside operational data requires extra infrastructure and forces additional coordination when AI workloads need application context.⁶ Costs compound as the dataset scales. Every coordination point is a place where a deploy can leave the vector store describing a version of reality that no longer exists.

The Zombie Vector Trap Is a Compliance Problem, Not a RAM Problem

Stale embeddings left behind after operational data changes are a live compliance liability. When the primary record is updated or deleted but its embedding survives, you get a zombie vector. In an in-memory database, that zombie keeps occupying expensive RAM indefinitely.⁷ The RAM waste is the part practitioners built CLI tools to measure.⁷ The compliance exposure is the part no dashboard shows.

A zombie vector is a searchable copy of data that your system of record believes is gone. If a user exercises deletion rights and the relational row disappears, the embedding derived from that row can persist in the vector store. It remains fully retrievable, ready to be surfaced by an AI agent as though the deletion never happened. The asynchronous scramble that creates zombies on every update is the exact mechanism leaving deleted data alive in a second system. The insidious part is silence. A zombie vector produces no error. Instead, it returns a confident answer built on a deleted record, accumulating undetected as the dataset grows.
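A zombie audit can be as simple as reconciling the two stores. The sketch below assumes you can export the live record IDs and texts from the primary database, and that each vector was stored with a hash of the text it was embedded from. That stored hash, and the names `audit_vectors` and `content_hash`, are assumptions for illustration, not features of any particular vector database.

```python
import hashlib


def content_hash(text: str) -> str:
    """Stable fingerprint of the text an embedding was generated from."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def audit_vectors(primary: dict[str, str], vector_meta: dict[str, str]) -> dict[str, list[str]]:
    """Compare the vector store against the system of record.

    primary:     record_id -> current text in the primary database
    vector_meta: record_id -> hash of the text that was embedded
    """
    orphaned = sorted(vid for vid in vector_meta if vid not in primary)
    stale = sorted(
        vid
        for vid, embedded_hash in vector_meta.items()
        if vid in primary and embedded_hash != content_hash(primary[vid])
    )
    return {"orphaned": orphaned, "stale": stale}


# Hypothetical data: u2 was deleted and u3 was edited after embedding.
primary = {"u1": "Alice, Berlin", "u3": "Carol, Lisbon"}
vector_meta = {
    "u1": content_hash("Alice, Berlin"),
    "u2": content_hash("Bob, Paris"),
    "u3": content_hash("Carol, Madrid"),
}

report = audit_vectors(primary, vector_meta)
print(report)
```

Orphaned IDs are the deletion-compliance exposure: embeddings whose source row is gone. Stale IDs are the quieter variant, where the row still exists but the embedding describes an older version of it.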

RBAC and Retrieval Quality: Where Standalone Databases Break Enterprises

Role-based access control is the clearest example of Dual-State Drag. Permissions are nearly impossible to keep coherent across two separate databases. Your operational database enforces who can see what. Your vector database, sitting beside it, has its own idea. The two drift.

Permissions live as relationships and rules in the relational system. Embeddings in the vector store carry only whatever metadata you remembered to copy across, subject to the same asynchronous lag as everything else. Update a user's access in the primary system and there is a window where the vector store still answers as if the old permissions held. Enterprises are consolidating to fix this. Running pgvector inside enterprise PostgreSQL eliminates the separate vector database, killing the duplicated management overhead while keeping access control in one enforcement layer.⁸
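The drift window is easy to demonstrate with plain data structures. In this sketch, `live_acl` stands in for permissions in the system of record and `copied_acl` for the metadata snapshot attached to embeddings. Both names, the role names, and the document IDs are hypothetical.

```python
# Live permissions in the system of record: doc_id -> roles allowed to read it.
live_acl = {"doc-1": {"finance", "audit"}, "doc-2": {"finance"}}

# Snapshot copied into the vector store's metadata when the embeddings were written.
copied_acl = {doc_id: set(roles) for doc_id, roles in live_acl.items()}


def filter_hits(hits: list[str], role: str, acl: dict[str, set[str]]) -> list[str]:
    """Keep only the hits that the role may read according to the given ACL."""
    return [doc_id for doc_id in hits if role in acl.get(doc_id, set())]


# Access is revoked in the primary system; the copy has not been re-synced yet.
live_acl["doc-1"].discard("audit")

hits = ["doc-1", "doc-2"]  # pretend these came back from a similarity search

leaked = filter_hits(hits, "audit", copied_acl)  # filtered on the stale snapshot
enforced = filter_hits(hits, "audit", live_acl)  # filtered on the live permissions

print("filtered on copied metadata:", leaked)
print("filtered on live permissions:", enforced)
```

With a single enforcement layer there is no `copied_acl` to fall behind: the similarity search and the permission check read the same state.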

A retrieval-quality trap sits layered on top. Pure semantic search fails silently on the exact queries enterprises care most about. For precise, high-stakes lookups such as cancellation policies, SKUs, policy IDs, and dates, embedding models weight conceptual meaning over literal matches and miss the answer entirely.⁹ Hybrid search combining BM25 lexical matching with semantic embeddings is a production requirement for handling proper nouns and exact matches.¹¹ A standalone vector database optimized purely for similarity fights you on the queries where being wrong is most expensive.
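One common way to combine a lexical ranking with a semantic one is reciprocal rank fusion (RRF). The source does not say which fusion method to use, so treat this as one reasonable option rather than the prescribed one. The document IDs and both input rankings below are made up for illustration, and `k=60` is the conventional default constant.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists: each list contributes 1 / (k + rank) per document."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results for a query that mentions the exact policy ID "POL-4471".
lexical_ranking = ["POL-4471", "POL-4417"]                       # e.g. from BM25
semantic_ranking = ["FAQ-cancel", "GUIDE-refunds", "POL-4471"]   # e.g. from embeddings

fused = reciprocal_rank_fusion([lexical_ranking, semantic_ranking])
print("semantic only, top hit:", semantic_ranking[0])
print("hybrid, top hit:       ", fused[0])
```

The exact-match document ranks third semantically but first after fusion, because it is the only document both retrievers agree on.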

The Team You Split When You Split the Database

Dual-State Drag has a human cost the invoices never capture. Two databases bifurcate the engineering org that maintains them. Someone owns the relational schema and its migrations. Someone else owns the vector index and the pipeline required to scale it. The seam between them becomes a permanent coordination tax on feature velocity. Every feature that touches retrieval now requires both owners in the room. Every incident requires deciding which database lied.

The operational overhead of managing separate licensing and infrastructure negates the specialized performance benefits for teams operating under a billion vectors.⁶ That is not a throwaway line. The raw retrieval speed you bought the standalone database for gets eaten by the cost of keeping two systems in step. The escape hatch is often simpler than the category admits. For some AI agent architectures, plain Markdown files as the primary memory system beat managed vector databases on unit economics and latency.¹² When the cheapest correct answer is a text file, the burden of proof shifts onto the specialized store.

[Figure: a sleek specialized tool sealed inside a shrinking glass display case while a much larger general-purpose toolbox sits open beside it. Caption: Vector search is being absorbed as a feature; the standalone product retreats into a shrinking niche.]

The Billion-Scale Niche Is Real, and It Is Small

Standalone vector databases survive where the scale genuinely demands them. That territory is far narrower than three years of adoption implied. The honest boundary is billion-scale and beyond, where general-purpose databases still struggle and specialized indexing earns its keep. As datasets grow from millions to billions of records, purely in-memory indexes become financially unsustainable, forcing on-disk techniques like DiskANN.¹⁴ Pushing to 10 billion vectors requires distributed indexing and machinery the vast majority of enterprise use cases will never touch.¹⁵
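The arithmetic behind "financially unsustainable" is short. The sketch below counts only the raw vector payload, assuming 768-dimensional float32 embeddings. The dimension is an assumption, and real indexes add graph or list overhead on top of this figure.

```python
def raw_vector_bytes(num_vectors: int, dims: int, bytes_per_value: int = 4) -> int:
    """Bytes needed to hold the vectors alone (float32 by default), excluding index overhead."""
    return num_vectors * dims * bytes_per_value


DIMS = 768  # assumed embedding dimension, for illustration

for n in (1_000_000, 100_000_000, 1_000_000_000, 10_000_000_000):
    gigabytes = raw_vector_bytes(n, DIMS) / 1e9
    print(f"{n:>14,} vectors -> {gigabytes:>10,.1f} GB of raw float32 data")
```

At a million vectors the payload is about 3 GB and fits comfortably in memory on one machine. At a billion it is about 3 TB before any index structure, which is the point where disk-based and distributed designs stop being optional.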

The incumbents are annexing even this frontier. Amazon OpenSearch now builds billion-scale vector databases in under an hour, indexing up to 10 times faster at a quarter of the cost using GPU acceleration.¹⁶ AWS S3 Vectors targets up to 90 percent cost reduction versus standalone solutions.¹⁷ Vector search has become a checkbox feature in cloud data platforms, not a standalone moat.¹⁸

"The standalone vector database isn't dead. But it's being cornered into the billion-scale niche where most teams will never operate." Source: Vector Databases Are Dying. Here's the Production Evidence.

What to Do Before Your Next Retrieval Decision

Start from converged and force the standalone case to justify itself. Default to keeping vectors next to the data in pgvector or an equivalent. The pgvectorscale extension already delivers roughly 75 percent lower cost than Pinecone on comparable workloads while closing the performance gap.¹⁹ Reserve a standalone vector database for the genuine billion-scale regime, and be ruthless about whether you actually live there.
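What "next to the data" buys you can be shown with any transactional database. The sketch below uses Python's built-in SQLite purely as a stand-in so it runs anywhere. It is not pgvector, and the embedding is stored as JSON text where a PostgreSQL schema would use a vector column. Table and function names are hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces cascades only when enabled
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, notes TEXT NOT NULL)")
conn.execute(
    """
    CREATE TABLE customer_embeddings (
        customer_id INTEGER PRIMARY KEY
            REFERENCES customers(id) ON DELETE CASCADE,
        embedding TEXT NOT NULL  -- JSON array here; a vector column in a real setup
    )
    """
)


def add_customer(customer_id: int, notes: str, embedding: list[float]) -> None:
    """Write the record and its embedding in one transaction: both land or neither does."""
    with conn:
        conn.execute("INSERT INTO customers (id, notes) VALUES (?, ?)", (customer_id, notes))
        conn.execute(
            "INSERT INTO customer_embeddings (customer_id, embedding) VALUES (?, ?)",
            (customer_id, json.dumps(embedding)),
        )


def delete_customer(customer_id: int) -> None:
    """Delete the record; the cascade removes its embedding in the same transaction."""
    with conn:
        conn.execute("DELETE FROM customers WHERE id = ?", (customer_id,))


def embedding_count() -> int:
    return conn.execute("SELECT COUNT(*) FROM customer_embeddings").fetchone()[0]


add_customer(1, "Prefers email contact", [0.1, 0.2, 0.3])
add_customer(2, "Requested account closure", [0.4, 0.5, 0.6])
before = embedding_count()
delete_customer(2)
after = embedding_count()
print(f"embeddings before delete: {before}, after delete: {after}")
```

There is no sync worker and no reconciliation job in this version. The deletion and the disappearance of the embedding are the same event, which is the property the converged default is meant to preserve.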

Before committing, run three checks:

Check	What to look for	Why it matters
Zombie vector audit	Embeddings of deleted records still retrievable	GDPR/deletion compliance, not just RAM
Access control layer	One enforcement point vs. two drifting copies	RBAC coherence across updates
Hybrid retrieval	BM25 + semantic, not semantic alone	Exact matches on SKUs and policy IDs fail silently without it

The larger shift is a reversal of the default. For three years the question was "which vector database?" The better question is whether the vector belongs in a separate database at all. For most teams, Dual-State Drag is the reason the answer is no. The standalone vector database is rapidly becoming a solution in search of a problem, waiting for a billion-scale dataset that most companies will never build.

Key Takeaways

1. Vector databases can eat 40-50% of total production application cost, second only to LLM API calls.
2. A single query with metadata filtering can consume 5 to 10 read units instead of one, and that cost stays hidden during sales calls.
3. Zombie vectors keep deleted records searchable in a second store, so a data-hygiene chore becomes a compliance exposure.
4. pgvectorscale runs comparable workloads at roughly 75% less cost than Pinecone while closing the performance gap.
5. Amazon OpenSearch builds billion-scale vector databases in under an hour with GPU acceleration, which commoditizes the last niche.

Keywords

Vector Databases, pgvector, RAG, Hybrid Search, Data Architecture, AI Infrastructure
