Search and Vector Index-Aware Design

Learn Database Design and Architect - Part 043

Search and vector index-aware design for production systems: inverted index, full-text search, semantic retrieval, hybrid search, filtered vector search, freshness, security, rebuild, and operational failure modes.

2026-07-05 · 22 min read · 4339 words

Part 043 — Search and Vector Index-Aware Design

Search is not “just another query”.

A relational query usually asks:

“Which rows match these exact predicates?”

Search asks:

“Which documents are most relevant to this user intent, under these filters, permissions, freshness constraints, and ranking rules?”

Vector search asks an even less exact question:

“Which items are close in embedding space to this query representation?”

That difference changes the database architecture.

A normal B-Tree index optimizes exact lookup, range lookup, ordering, and joins. A search index optimizes token lookup and ranking. A vector index optimizes nearest-neighbor retrieval in high-dimensional space. They have different data structures, freshness semantics, failure modes, and correctness risks.

The core mental model:

Search/vector systems are usually retrieval projections, not the canonical system of truth.

They are built from authoritative operational data, transformed into searchable documents or vectors, indexed, queried, ranked, filtered, and periodically rebuilt.

A top-level engineer does not start by asking, “Should we use Elasticsearch, OpenSearch, PostgreSQL full-text search, pgvector, MongoDB Atlas Vector Search, Pinecone, Weaviate, or Qdrant?”

They start by asking:

What is the user trying to retrieve?
What is the authoritative source?
What filters are mandatory for correctness/security?
What ranking signal matters?
How fresh must the result be?
What recall/latency tradeoff is acceptable?
How do we rebuild and verify the index?
What happens when retrieval is wrong, stale, incomplete, or unauthorized?

That is search and vector index-aware design.

1. What This Part Covers

This part focuses on database design and architecture around:

full-text search;
inverted index mental model;
search document projection;
semantic/vector search;
HNSW and IVFFlat design intuition;
exact vs approximate nearest-neighbor retrieval;
hybrid lexical + vector search;
filtered vector search;
tenant/security-aware retrieval;
freshness and eventual consistency;
indexing pipelines;
blue/green index rebuild;
embedding version migration;
search correctness testing;
operational failure modes.

We will not repeat general indexing or B-Tree internals from previous parts. Here, the emphasis is different: retrieval quality, ranking, projection, and operational control.

2. Three Retrieval Modes

Most production systems mix three retrieval modes.

Mode	Typical Question	Main Index Type	Correctness Shape
Exact lookup	“Find case `CASE-2026-001`.”	B-Tree / hash / primary key	deterministic
Lexical search	“Find cases mentioning illegal import permit.”	inverted index / full-text index	relevance-ranked
Semantic search	“Find cases similar to this complaint narrative.”	vector index / ANN index	approximate, similarity-ranked

A system becomes brittle when engineers confuse these modes.

Example mistakes:

using vector search for exact regulatory identifiers;
using text search as the only authorization filter;
using a search index as canonical storage;
expecting semantic search to return deterministic legal/evidence results;
treating approximate nearest-neighbor recall as correctness instead of a tunable tradeoff.

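A safe design usually combines modes: exact lookup for identifiers, lexical search for keywords and phrases, and semantic search for similarity, all behind the same mandatory filters.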

3. Search Index Is Usually a Projection

A search index should usually be treated like a read model.

It is derived from one or more authoritative sources.

This gives several design consequences:

The operational database remains the source of truth.
Search documents are optimized for retrieval, not normalization.
Indexing is asynchronous unless strong freshness is explicitly required.
Search results may be stale.
Rebuild must be possible from authoritative data.
Security filters must survive projection.
Deletion and privacy rules must propagate into the index.

Do not design a search index as a random dump of tables. Design it as a deliberate retrieval contract.

4. The Search Document Contract

A search document is not merely “the row as JSON”.

It is the shape optimized for query, filter, ranking, display, and security.

Example search document for regulatory case search:

{
  "document_id": "case:9b2f5a6e",
  "source_type": "case",
  "source_id": "9b2f5a6e",
  "tenant_id": "tenant-a",
  "case_number": "ENF-2026-000184",
  "title": "Import permit irregularity investigation",
  "summary": "Investigation into suspected misuse of import permit documents.",
  "body": "...flattened searchable narrative...",
  "status": "UNDER_REVIEW",
  "risk_level": "HIGH",
  "assigned_unit_id": "unit-enforcement-1",
  "security_labels": ["ENFORCEMENT", "RESTRICTED"],
  "visible_to_actor_ids": ["user-123", "group-investigator"],
  "jurisdiction": "ID-JK",
  "created_at": "2026-07-01T09:20:00Z",
  "updated_at": "2026-07-05T02:11:00Z",
  "source_version": 17,
  "index_schema_version": 3,
  "embedding_model": "text-embedding-model-x",
  "embedding_version": 2,
  "content_vector": [0.013, -0.204, 0.771]
}

A good search document has these groups:

Group	Purpose
Identity fields	stable reference back to source
Display fields	title, snippet, badges, status
Lexical fields	text analyzed for keyword search
Filter fields	tenant, status, type, date, unit, lifecycle
Security fields	access scopes, labels, groups, visibility rules
Ranking fields	popularity, recency, risk, quality score
Freshness fields	source version, updated time, index time
Vector fields	embeddings and model version
Operational fields	index schema version, replay offset, error flags

The key design question:

Can this document answer the search query without accidentally leaking, hiding, duplicating, or misranking critical data?
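The projection step can be sketched as a small function. This is a minimal illustration, not a definitive implementation: `case_row`, `visible_to`, and the `version` column are hypothetical inputs assumed to be loaded from the source database, and the output field names simply mirror the example document above.

```python
from datetime import datetime, timezone

INDEX_SCHEMA_VERSION = 3  # assumption: bumped whenever the document shape changes


def build_search_document(case_row: dict, visible_to: list[str]) -> dict:
    """Project one authoritative case row into a search document."""
    # Mandatory filter/security fields must never be silently missing.
    for required in ("id", "tenant_id", "status", "security_labels", "version"):
        if case_row.get(required) is None:
            raise ValueError(f"cannot index case without {required!r}")

    return {
        # Identity: stable reference back to the source row.
        "document_id": f"case:{case_row['id']}",
        "source_type": "case",
        "source_id": case_row["id"],
        # Filter and security fields survive the projection.
        "tenant_id": case_row["tenant_id"],
        "status": case_row["status"],
        "security_labels": sorted(case_row["security_labels"]),
        "visible_to_actor_ids": sorted(visible_to),
        # Lexical and display fields.
        "case_number": case_row.get("case_number", ""),
        "title": case_row.get("title", ""),
        "summary": case_row.get("summary") or "",
        # Freshness and operational fields.
        "source_version": case_row["version"],
        "index_schema_version": INDEX_SCHEMA_VERSION,
        "indexed_at": datetime.now(timezone.utc).isoformat(),
    }
```

Failing loudly on a missing tenant or security field is a deliberate choice in this sketch: a document that cannot be filtered is safer unindexed than indexed.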

5. Inverted Index Mental Model

Full-text search is usually powered by an inverted index.

Instead of mapping:

Document -> Terms

it maps:

Term -> Documents containing that term

Example:

case:1 = "illegal import permit"
case:2 = "permit renewal rejected"
case:3 = "illegal warehouse operation"

Inverted index:

illegal   -> case:1, case:3
import    -> case:1
permit    -> case:1, case:2
renewal   -> case:2
rejected  -> case:2
warehouse -> case:3
operation -> case:3

Search engines then add:

tokenization;
lowercasing;
stemming;
stop-word removal;
synonym expansion;
phrase positions;
term frequency;
inverse document frequency;
field weighting;
ranking algorithms.

This means full-text search is not just “contains string”.

The index normalizes text (tokens, stems, synonyms) before matching, and the query must go through the same analysis for terms to line up.
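To make the term-to-documents mapping concrete, here is a toy inverted index over the three example cases above. It is only a sketch: the `analyze` function and its tiny stop-word list are illustrative assumptions, far simpler than a real analyzer, and there is no ranking.

```python
import re
from collections import defaultdict

STOP_WORDS = {"a", "an", "of", "the"}  # tiny illustrative list, not a real analyzer


def analyze(text: str) -> list[str]:
    """Lowercase, split on non-alphanumerics, drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def build_inverted_index(docs: dict[str, str]) -> dict[str, set[str]]:
    """Map each term to the set of document IDs that contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index


def search_all_terms(index: dict[str, set[str]], query: str) -> set[str]:
    """Return documents containing every query term (boolean AND)."""
    terms = analyze(query)
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


docs = {
    "case:1": "illegal import permit",
    "case:2": "permit renewal rejected",
    "case:3": "illegal warehouse operation",
}
index = build_inverted_index(docs)
print(sorted(index["permit"]))                            # ['case:1', 'case:2']
print(sorted(search_all_terms(index, "Illegal PERMIT")))  # ['case:1']
```

The query is analyzed with the same function as the documents, which is why the mixed-case query still matches.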

6. Full-Text Search Design Decisions

When designing lexical search, decide these explicitly.

6.1 Which fields are searchable?

Do not index everything blindly.

Common field classes:

Field	Search Mode
title	high-weight lexical
summary	medium-weight lexical
body/content	broad lexical
case number	exact keyword
external reference	exact keyword
person/company name	analyzed + exact subfield
status	filter only
tenant	mandatory filter
security label	mandatory filter

A regulatory case number should not be tokenized like normal prose.

ENF-2026-000184 should be searchable exactly, and maybe with normalized variants, but not treated like a narrative paragraph.

6.2 Which fields are filters?

Filters decide eligibility.

Ranking decides order.

Never rely on ranking to enforce access.

Mandatory filters usually include:

tenant;
actor permission;
lifecycle visibility;
jurisdiction;
classification/security label;
deleted/archived status;
document type;
valid time window.

6.3 Which fields affect ranking?

Ranking signals may include:

textual relevance;
recency;
authority;
popularity;
status priority;
risk level;
exact field match boost;
user context;
business-specific priority.

Ranking is product behavior, not only database behavior.

Document it like a business rule.

7. PostgreSQL Full-Text Search vs Dedicated Search Engine

PostgreSQL can support full-text search with tsvector, tsquery, and GIN/GiST indexes.

This is often good enough when:

data is already in PostgreSQL;
search volume is moderate;
ranking needs are simple;
freshness must be transactionally close to source data;
operational simplicity matters;
you need SQL joins and filters around search.

Dedicated search engines like Elasticsearch/OpenSearch become stronger when:

search is a primary product feature;
ranking and analyzers are complex;
scale is high;
indexing pipeline is independent;
documents combine multiple source systems;
autocomplete, faceting, highlighting, synonyms, and relevance tuning matter;
search cluster operations are acceptable.

A good default rule:

Start with the simplest engine that satisfies retrieval semantics, then move search to a dedicated projection when ranking, scale, isolation, or operational ownership demands it.

8. PostgreSQL Full-Text Example

Example source table:

CREATE TABLE enforcement_case (
    id uuid PRIMARY KEY,
    tenant_id uuid NOT NULL,
    case_number text NOT NULL,
    title text NOT NULL,
    summary text,
    status text NOT NULL,
    risk_level text NOT NULL,
    created_at timestamptz NOT NULL DEFAULT now(),
    updated_at timestamptz NOT NULL DEFAULT now(),
    deleted_at timestamptz,
    search_vector tsvector GENERATED ALWAYS AS (
        setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
        setweight(to_tsvector('english', coalesce(summary, '')), 'B') ||
        setweight(to_tsvector('simple', coalesce(case_number, '')), 'A')
    ) STORED
);

CREATE INDEX enforcement_case_search_gin
ON enforcement_case
USING gin (search_vector);

CREATE INDEX enforcement_case_active_tenant_status_idx
ON enforcement_case (tenant_id, status, updated_at DESC)
WHERE deleted_at IS NULL;

Query:

SELECT
    id,
    case_number,
    title,
    status,
    risk_level,
    ts_rank_cd(search_vector, plainto_tsquery('english', :query)) AS rank
FROM enforcement_case
WHERE tenant_id = :tenant_id
  AND deleted_at IS NULL
  AND search_vector @@ plainto_tsquery('english', :query)
ORDER BY rank DESC, updated_at DESC
LIMIT 20;

Important observations:

The full-text index accelerates lexical matching.
The tenant and deleted predicates still matter; add a status predicate when the use case needs it, which the partial B-Tree index already supports.
Ranking is explicit.
Case number is included with a simpler configuration.
Exact case-number lookup may still deserve its own unique/B-Tree index.

Search does not replace normal schema design.

9. Dedicated Search Index Example

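A dedicated search engine needs an explicit index definition for the flattened document. An Elasticsearch/OpenSearch-style settings and mappings body might look like this: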

{
  "settings": {
    "analysis": {
      "analyzer": {
        "case_text_analyzer": {
          "type": "standard",
          "stopwords": "_english_"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "tenant_id": { "type": "keyword" },
      "source_type": { "type": "keyword" },
      "source_id": { "type": "keyword" },
      "case_number": { "type": "keyword" },
      "title": { "type": "text", "analyzer": "case_text_analyzer" },
      "summary": { "type": "text", "analyzer": "case_text_analyzer" },
      "status": { "type": "keyword" },
      "risk_level": { "type": "keyword" },
      "security_labels": { "type": "keyword" },
      "visible_group_ids": { "type": "keyword" },
      "updated_at": { "type": "date" },
      "source_version": { "type": "long" }
    }
  }
}

The important part is not the syntax. It is the classification:

keyword fields are exact/filterable;
text fields are analyzed/searchable;
dates/numbers support filtering/sorting;
security fields are preserved as filterable fields;
source_version supports idempotent updates and freshness checks.
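A query against such a mapping can keep scoring and eligibility separate. The sketch below builds an Elasticsearch/OpenSearch-style `bool` query body as a plain dictionary; the function name, the title boost of 3, and the rule that every caller must have at least one group are assumptions made for illustration.

```python
def build_case_query(tenant_id: str, group_ids: list[str], text: str, size: int = 20) -> dict:
    """Build a bool query: scored text match plus non-scoring mandatory filters."""
    if not tenant_id:
        raise ValueError("tenant_id is mandatory for every search")
    if not group_ids:
        raise ValueError("caller must belong to at least one group")
    return {
        "size": size,
        "query": {
            "bool": {
                # `must` clauses contribute to the relevance score.
                "must": [
                    {"multi_match": {"query": text, "fields": ["title^3", "summary"]}}
                ],
                # `filter` clauses decide eligibility and do not affect scoring.
                "filter": [
                    {"term": {"tenant_id": tenant_id}},
                    {"terms": {"visible_group_ids": sorted(group_ids)}},
                ],
            }
        },
    }
```

Building the body in one place, and refusing to build it without a tenant, makes it harder for an individual endpoint to forget a mandatory filter.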

10. Vector Search Mental Model

Vector search converts content into vectors.

A vector is an array of numbers representing semantic features.

Example:

"suspected misuse of import permit" -> [0.013, -0.204, 0.771, ...]

Similar concepts should be close in embedding space.

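A vector query usually follows this path: embed the query text with the same model used at index time, search the vector index for the nearest neighbors, apply filters, and return ranked candidates.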

Vector search is powerful for:

semantic search;
similarity matching;
recommendations;
deduplication assistance;
RAG retrieval;
clustering;
anomaly discovery.

But it is dangerous for:

exact identifiers;
legal truth;
authorization;
deterministic audit evidence;
financial balance correctness;
unique constraint enforcement.

A vector index finds “similar”, not “true”.

11. Exact kNN vs Approximate ANN

Exact nearest-neighbor search compares the query vector against all candidate vectors.

That gives perfect recall for the chosen distance metric, but cost grows linearly with the number of candidates.

Approximate nearest-neighbor search uses an index to trade some recall for latency.

Search Type	Behavior	Tradeoff
Exact kNN	checks all candidates	perfect recall, cost linear in candidate count
Approximate ANN	searches index graph/list	lower latency, tunable recall
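Exact search is also the baseline for measuring an approximate index. The following sketch, assuming small in-memory vectors with non-zero norm and cosine distance, shows a brute-force scan and the recall of an approximate result against it; `ann_recall` is a name chosen here, not a library function.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def exact_knn(query: list[float], vectors: dict[str, list[float]], k: int) -> list[str]:
    """Brute force: score every candidate, keep the k closest."""
    ranked = sorted(vectors, key=lambda vid: cosine_distance(query, vectors[vid]))
    return ranked[:k]


def ann_recall(approx_ids: list[str], exact_ids: list[str]) -> float:
    """Fraction of the true top-k that the approximate result recovered."""
    if not exact_ids:
        return 1.0
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


vectors = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [-1.0, 0.0]}
truth = exact_knn([1.0, 0.0], vectors, k=2)
print(truth)                          # ['a', 'b']
print(ann_recall(["a", "c"], truth))  # 0.5
```

Running the exact scan on a sample of real queries is a practical way to put a number on the recall an ANN configuration actually delivers.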

Common ANN index families:

Index	Mental Model	Strength	Risk
HNSW	navigable graph of vectors	strong recall/latency	memory, build cost, filter complexity
IVFFlat	vectors partitioned into lists	simpler, lower memory/build cost	recall depends on probes/lists/training
Product quantization variants	compressed vector representation	scale/cost reduction	accuracy loss and tuning complexity

The architect-level question:

What recall, latency, memory, freshness, and filter behavior does the business require?

Not:

Which vector database is fashionable this month?

12. HNSW Design Intuition

HNSW, or Hierarchical Navigable Small World, is graph-based.

Each vector becomes a node. Edges connect nearby vectors. Search navigates the graph toward closer neighbors instead of scanning everything.

Important knobs usually include:

Parameter	Meaning	Impact
`m`	graph connectivity	higher recall, more memory
`ef_construction`	build-time candidate breadth	better graph, slower build
`ef_search`	query-time search breadth	higher recall, higher latency

Production implications:

HNSW often wants memory.
Recall is not automatic.
Insert order and data distribution can affect quality.
Filtering can reduce effective recall.
Rebuild may be required after large data/model changes.
Latency tuning must be measured on real data.

13. IVFFlat Design Intuition

IVFFlat divides vectors into clusters/lists.

At query time, the engine searches only some nearby lists.

Important knobs:

Parameter	Meaning	Impact
`lists`	number of partitions	more lists mean fewer vectors scanned per probe, but more probes needed for the same recall
`probes`	number of searched lists	higher recall, higher latency

Production implications:

Data distribution matters.
Training/build strategy matters.
Low probes can miss relevant vectors.
Setting probes equal to lists scans everything, which is exact search with no index speedup.
It can be simpler and cheaper than HNSW for some workloads.
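The effect of probes is easy to see in a toy model. This sketch is not how any particular engine implements IVFFlat: the centroids are given by hand rather than trained with k-means, distance is plain Euclidean, and the function names are illustrative.

```python
import math


def build_ivf(vectors: dict[str, list[float]], centroids: list[list[float]]) -> list[list[str]]:
    """Assign every vector to the list of its nearest centroid."""
    lists: list[list[str]] = [[] for _ in centroids]
    for vid, vec in vectors.items():
        nearest = min(range(len(centroids)), key=lambda i: math.dist(vec, centroids[i]))
        lists[nearest].append(vid)
    return lists


def ivf_search(query, vectors, centroids, lists, k: int, probes: int) -> list[str]:
    """Scan only the `probes` lists whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda i: math.dist(query, centroids[i]))
    candidates = [vid for i in order[:probes] for vid in lists[i]]
    return sorted(candidates, key=lambda vid: math.dist(query, vectors[vid]))[:k]


centroids = [[0.0, 0.0], [10.0, 0.0]]
vectors = {"a": [4.0, 0.0], "b": [6.0, 0.0], "c": [0.0, 1.0], "d": [10.0, 1.0]}
lists = build_ivf(vectors, centroids)
query = [5.2, 0.0]

# With one probe, only the list around [10, 0] is scanned, so "a" is missed.
print(ivf_search(query, vectors, centroids, lists, k=2, probes=1))  # ['b', 'd']
# Probing every list is equivalent to an exhaustive scan.
print(ivf_search(query, vectors, centroids, lists, k=2, probes=2))  # ['b', 'a']
```

The vector `a` is the second-closest item to the query but sits just across a partition boundary, which is the typical way low probes lose recall.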

14. Vector Schema Design

A vector record should not be just (id, embedding).

Example relational design:

CREATE TABLE searchable_content (
    id uuid PRIMARY KEY,
    tenant_id uuid NOT NULL,
    source_type text NOT NULL,
    source_id uuid NOT NULL,
    content_hash text NOT NULL,
    content_text text NOT NULL,
    embedding_model text NOT NULL,
    embedding_version integer NOT NULL,
    embedding vector(1536) NOT NULL,
    security_scope text[] NOT NULL,
    lifecycle_status text NOT NULL,
    source_updated_at timestamptz NOT NULL,
    indexed_at timestamptz NOT NULL DEFAULT now(),
    UNIQUE (source_type, source_id, embedding_model, embedding_version)
);

Fields you usually need:

Field	Why it matters
`source_type`, `source_id`	traceability
`content_hash`	idempotent re-embedding
`embedding_model`	model provenance
`embedding_version`	migration/versioning
`tenant_id`	isolation/filtering
`security_scope`	retrieval authorization
`lifecycle_status`	exclude deleted/archived content
`source_updated_at`	freshness comparison
`indexed_at`	pipeline lag measurement

Without these fields, vector search becomes an opaque retrieval toy instead of a production database capability.

15. Hybrid Search

Hybrid search combines lexical search and vector search.

Why?

Lexical search is good for:

exact terms;
identifiers;
rare names;
legal/regulatory phrases;
strict keyword match.

Vector search is good for:

semantic similarity;
paraphrases;
fuzzy intent;
concept-level retrieval.

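Hybrid design: run the lexical and vector retrievers for the same query, then fuse their candidate lists into one ranking.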

Common fusion methods:

Method	Idea
Weighted score	combine normalized lexical/vector scores
Reciprocal Rank Fusion	combine based on rank positions
Learning-to-rank	model-based ranking using features
Reranking model	expensive second-pass semantic ranking

A practical architecture:

Generate lexical candidates.
Generate vector candidates.
Apply mandatory filters.
Merge and deduplicate.
Rerank top candidates.
Fetch authoritative source data.
Return result with reason/snippet.
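The merge step can be sketched with Reciprocal Rank Fusion, which needs only rank positions and so avoids normalizing lexical and vector scores against each other. The constant `k = 60` is the value commonly used for RRF; the candidate lists are made up for illustration and are assumed to have passed the mandatory filters already.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked ID lists: score(d) = sum over lists of 1 / (k + rank)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score, breaking ties by ID for deterministic output.
    return sorted(scores, key=lambda d: (-scores[d], d))


lexical = ["case:1", "case:2", "case:3"]
semantic = ["case:3", "case:1", "case:4"]
print(reciprocal_rank_fusion([lexical, semantic]))
# ['case:1', 'case:3', 'case:2', 'case:4']
```

Documents found by both retrievers rise to the top, and keying on the document ID deduplicates them as a side effect.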

16. Filtered Vector Search

Filtered vector search is hard.

Example query:

“Find semantically similar cases, but only within tenant A, visible to investigator X, not archived, in jurisdiction Y, created in the last 2 years.”

That query has two parts:

similarity search;
mandatory structured filters.

There are two common approaches.

16.1 Pre-filter

Filter candidates first, then vector search within the allowed subset.

Pros:

safer for security;
avoids unauthorized candidate leakage;
better for highly selective filters if engine supports it well.

Cons:

can be slow if subset handling is poor;
may reduce ANN index efficiency.

16.2 Post-filter

Vector search first, then filter results.

Pros:

simple;
often fast for broad queries.

Cons:

can return too few results after filtering;
dangerous if not carefully isolated;
poor for highly selective tenant/security filters;
recall becomes unpredictable.

Post-filter example failure:

Search top 20 globally.
Filter to tenant A.
Only 1 result remains.
But there were 50 good tenant A results outside global top 20.

This is not a small bug. It is a retrieval correctness failure.

Design rule:

Mandatory security and tenant filters must be part of the retrieval contract, not an afterthought after ranking.
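The post-filter failure described above can be reproduced with a few lines. The corpus and similarity scores here are synthetic, and a sorted list stands in for the vector index, so this only illustrates the ordering problem, not engine behavior.

```python
def top_k(items, k):
    """items: (doc_id, tenant, similarity); higher similarity ranks first."""
    return sorted(items, key=lambda item: item[2], reverse=True)[:k]


def post_filter(items, tenant, k):
    """Retrieve the global top-k first, then drop other tenants."""
    return [item for item in top_k(items, k) if item[1] == tenant]


def pre_filter(items, tenant, k):
    """Restrict to the tenant first, then take the top-k of what is allowed."""
    return top_k([item for item in items if item[1] == tenant], k)


corpus = (
    [("a-best", "tenant-a", 0.95)]
    + [(f"a-{i}", "tenant-a", 0.80 + i * 0.001) for i in range(50)]
    + [(f"b-{i}", "tenant-b", 0.90 + i * 0.001) for i in range(30)]
)
print(len(post_filter(corpus, "tenant-a", 20)))  # 1
print(len(pre_filter(corpus, "tenant-a", 20)))   # 20
```

Tenant B's documents crowd tenant A out of the global top 20, so the post-filtered page is nearly empty even though fifty eligible results exist.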

17. Search Authorization Boundary

Search is one of the easiest ways to leak data.

Common leak paths:

result title from unauthorized document;
autocomplete suggestions from restricted data;
facet counts revealing hidden records;
snippets exposing sensitive text;
vector similarity returning restricted documents;
cache keys missing tenant/user dimension;
logs storing raw query or retrieved restricted text;
offline embedding pipeline indexing data that should be excluded.

Search authorization must apply to:

Layer	Requirement
indexing	only index allowed content or index security metadata
query	apply tenant/security/lifecycle filters
ranking	rank only eligible candidates
snippet	generate snippets from authorized fields only
facets	count only authorized documents
cache	key by tenant/user/security context
logging	redact sensitive query/result data
rebuild	preserve security policy during reindex

Never rely only on UI-side filtering.
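The cache row in the table above is a frequent source of leaks, so here is one possible way to derive a cache key. It is a sketch under assumptions: the function name, the `search:` prefix, and the choice of tenant plus scopes as the security context are all illustrative.

```python
import hashlib
import json


def search_cache_key(tenant_id: str, actor_scopes: list[str], query: str, filters: dict) -> str:
    """Derive a cache key that changes whenever the security context changes."""
    payload = {
        "tenant": tenant_id,
        "scopes": sorted(set(actor_scopes)),  # order-insensitive
        "query": query.strip().lower(),
        "filters": filters,
    }
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return "search:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


same_a = search_cache_key("tenant-a", ["g1", "g2"], "Permit", {"status": "OPEN"})
same_b = search_cache_key("tenant-a", ["g2", "g1"], " permit ", {"status": "OPEN"})
other = search_cache_key("tenant-b", ["g1", "g2"], "Permit", {"status": "OPEN"})
print(same_a == same_b, same_a == other)  # True False
```

Two callers share a cached result only when their tenant, scopes, query, and filters all match.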

18. Freshness Contract

Search indexes are often eventually consistent.

That is acceptable only if the freshness contract is explicit.

Examples:

Use Case	Freshness Requirement
exact case lookup after create	immediate or primary DB fallback
public documentation search	seconds/minutes may be fine
compliance deletion	must disappear quickly and provably
evidence search	freshness must be disclosed or bounded
authorization change	must be reflected before access is granted
vector recommendation	stale results may be acceptable

Represent freshness explicitly:

{
  "source_version": 17,
  "indexed_source_version": 17,
  "indexed_at": "2026-07-05T02:11:30Z",
  "pipeline_lag_ms": 850
}

For critical operations, search should often return IDs only, then the API rechecks authority/source state in the primary database.
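That recheck can be sketched as a thin layer between the index and the response. `load_source` is a hypothetical callable standing in for a primary-database lookup that already applies the caller's permissions; the hit and source shapes are assumptions.

```python
def recheck_against_source(hits: list[dict], load_source) -> list[dict]:
    """Keep only hits whose source is still visible, and flag stale ones.

    `load_source` maps a source_id to the current source record, or None
    when the record is deleted or the caller may no longer see it.
    """
    results = []
    for hit in hits:
        source = load_source(hit["source_id"])
        if source is None:
            continue  # never surface what the source of truth no longer allows
        results.append({
            "source_id": hit["source_id"],
            "title": source["title"],  # display fields come from the source
            "stale": hit["indexed_source_version"] < source["version"],
        })
    return results


primary = {"1": {"title": "Current title", "version": 18}}
hits = [
    {"source_id": "1", "indexed_source_version": 17},
    {"source_id": "2", "indexed_source_version": 3},  # deleted at the source
]
print(recheck_against_source(hits, primary.get))
# [{'source_id': '1', 'title': 'Current title', 'stale': True}]
```

The index decides which IDs are candidates; the primary database decides what is returned.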

19. Indexing Pipeline Design

Avoid direct best-effort indexing inside the same application request unless the search index is non-critical and failure is acceptable.

A robust pattern:

Write source data in the operational transaction.
Write an outbox event in the same transaction.
Relay outbox events to indexing pipeline.
Transform source data into search document.
Upsert index document idempotently.
Store indexing offset/version.
Retry failures.
Send poison records to DLQ.
Monitor lag and error rate.

Important: the indexer should usually load the source snapshot instead of trusting event payloads blindly. Event payloads may be partial or schema-versioned.
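The first steps of that pattern can be sketched with SQLite from the standard library. Everything here is a stand-in chosen for a runnable example: the simplified tables, the function names, and a plain dictionary playing the role of the search index. Retries, the DLQ, and monitoring are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE enforcement_case (id TEXT PRIMARY KEY, title TEXT NOT NULL, version INTEGER NOT NULL);
CREATE TABLE search_outbox (
    seq INTEGER PRIMARY KEY AUTOINCREMENT,
    source_id TEXT NOT NULL,
    source_version INTEGER NOT NULL,
    processed_at TEXT
);
""")


def save_case(conn, case_id: str, title: str) -> int:
    """Write the source row and the outbox event in one transaction."""
    with conn:  # both writes commit together or not at all
        cur = conn.execute(
            "UPDATE enforcement_case SET title = ?, version = version + 1 WHERE id = ?",
            (title, case_id),
        )
        if cur.rowcount == 0:
            conn.execute(
                "INSERT INTO enforcement_case (id, title, version) VALUES (?, ?, 1)",
                (case_id, title),
            )
        (version,) = conn.execute(
            "SELECT version FROM enforcement_case WHERE id = ?", (case_id,)
        ).fetchone()
        conn.execute(
            "INSERT INTO search_outbox (source_id, source_version) VALUES (?, ?)",
            (case_id, version),
        )
    return version


def relay_once(conn, index: dict) -> int:
    """Drain pending events, loading the current source snapshot for each."""
    pending = conn.execute(
        "SELECT seq, source_id FROM search_outbox WHERE processed_at IS NULL ORDER BY seq"
    ).fetchall()
    for seq, source_id in pending:
        row = conn.execute(
            "SELECT id, title, version FROM enforcement_case WHERE id = ?", (source_id,)
        ).fetchone()
        if row is not None:
            index[f"case:{row[0]}"] = {"title": row[1], "source_version": row[2]}
        with conn:
            conn.execute(
                "UPDATE search_outbox SET processed_at = datetime('now') WHERE seq = ?",
                (seq,),
            )
    return len(pending)
```

Because the relay reloads the row instead of trusting the event, two queued events for the same case both converge on the latest title.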

20. Idempotent Indexing

Indexing must tolerate duplicates, retries, reordering, and partial failures.

Use a deterministic document ID:

document_id = source_type + ':' + source_id

Use source version checks:

if incoming.source_version < indexed.source_version:
    ignore stale indexing event
else:
    upsert document

For multi-document projections:

case:123:main
case:123:evidence:456
case:123:note:789

Never rely on “event delivered once”. Treat exactly-once as an end-to-end property you simulate with idempotency.
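A runnable version of the version check, using a dictionary as a stand-in for the index, might look like the sketch below. The event shape is an assumption; a real engine would typically enforce the same rule with its own versioned or conditional write.

```python
def apply_index_event(index: dict, event: dict) -> bool:
    """Upsert the event's document unless a newer version is already indexed.

    Returns True when the document was written, False when the event was
    ignored as stale. Re-applying the same version simply overwrites the
    document with identical content, so duplicates are harmless.
    """
    document_id = f"{event['source_type']}:{event['source_id']}"
    current = index.get(document_id)
    if current is not None and event["source_version"] < current["source_version"]:
        return False  # out-of-order delivery: keep the newer document
    index[document_id] = {**event["document"], "source_version": event["source_version"]}
    return True


index = {}
v2 = {"source_type": "case", "source_id": "123", "source_version": 2,
      "document": {"title": "Updated title"}}
v1 = {"source_type": "case", "source_id": "123", "source_version": 1,
      "document": {"title": "Original title"}}

print(apply_index_event(index, v2))  # True
print(apply_index_event(index, v1))  # False, late arrival of the older version
print(apply_index_event(index, v2))  # True, duplicate delivery is a no-op in effect
print(index["case:123"]["title"])    # Updated title
```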

21. Deletion and Retention in Search/Vector Indexes

Deletion must propagate to all derived retrieval stores.

This includes:

full-text index;
vector index;
autocomplete index;
embedding cache;
reranker cache;
recommendation index;
analytics/search logs if policy requires;
backups according to retention rules.

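Deletion architecture: the source delete emits a delete event, every derived store listed above consumes it, and an audit verifies that nothing still returns the record.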

A common failure mode:

Source row is hidden, but search index still returns the old title/snippet.

This is often a security incident, not just stale search.

22. Embedding Version Migration

Embedding models change.

When they do, you cannot blindly mix vector spaces.

Different embedding models may produce incomparable vectors.

Design for versioning from day one:

CREATE TABLE content_embedding (
    source_type text NOT NULL,
    source_id uuid NOT NULL,
    chunk_id text NOT NULL,
    embedding_model text NOT NULL,
    embedding_version integer NOT NULL,
    vector vector(1536) NOT NULL,
    content_hash text NOT NULL,
    created_at timestamptz NOT NULL DEFAULT now(),
    PRIMARY KEY (source_type, source_id, chunk_id, embedding_model, embedding_version)
);

Migration strategy:

Keep old embedding index active.
Generate new embeddings in parallel.
Build new index.
Evaluate retrieval quality.
Route small traffic percentage to new index.
Compare result overlap and quality metrics.
Cut over.
Retire old embeddings after retention window.

This is blue/green indexing.
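The "compare result overlap" step can be made measurable. The sketch below uses Jaccard overlap of the top-k IDs from the old and new index over a shared query set; the function names and the choice of Jaccard are assumptions, and low overlap is only a signal to investigate, since the new model may legitimately rank better.

```python
def overlap_at_k(old_ids: list[str], new_ids: list[str], k: int) -> float:
    """Jaccard overlap between the top-k results of two indexes."""
    old_top, new_top = set(old_ids[:k]), set(new_ids[:k])
    if not old_top and not new_top:
        return 1.0
    return len(old_top & new_top) / len(old_top | new_top)


def mean_overlap(results_old: dict, results_new: dict, k: int = 10) -> float:
    """Average overlap across the evaluation queries present in both runs."""
    queries = sorted(set(results_old) & set(results_new))
    if not queries:
        raise ValueError("no shared evaluation queries")
    total = sum(overlap_at_k(results_old[q], results_new[q], k) for q in queries)
    return total / len(queries)


old = {"permit misuse": ["a", "b", "c"], "warehouse": ["x", "y"]}
new = {"permit misuse": ["b", "c", "d"], "warehouse": ["x", "y"]}
print(mean_overlap(old, new, k=3))  # 0.75
```

Pairing this number with judged relevance on the same queries separates "different" from "worse".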

23. Chunking for Vector Search

For long documents, embedding the whole document can be poor.

Chunking splits content into smaller retrievable units.

Chunk design choices:

Decision	Why it matters
chunk size	affects semantic precision and context coverage
overlap	helps avoid boundary loss
chunk identity	enables traceability
parent document link	enables final result grouping
section metadata	improves filtering and explanation
security metadata	prevents unauthorized chunk retrieval
version/hash	supports rebuild and dedup

Example chunk key:

case:123:evidence:456:chunk:0007

Good chunk metadata:

{
  "chunk_id": "case:123:evidence:456:chunk:0007",
  "parent_id": "case:123",
  "source_type": "evidence_document",
  "source_id": "456",
  "section": "findings",
  "page_start": 4,
  "page_end": 5,
  "tenant_id": "tenant-a",
  "security_labels": ["RESTRICTED"],
  "content_hash": "sha256:...",
  "embedding_model": "...",
  "embedding_version": 2
}

Chunking is a database design problem because it affects identity, authorization, lineage, retention, and rebuild.
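A simple chunker shows how identity, overlap, and inherited metadata fit together. This is a sketch with assumptions: windows are counted in whitespace-separated words (real pipelines often count model tokens or split on sections), and `inherited` is a hypothetical dictionary of parent metadata copied onto every chunk.

```python
import hashlib


def chunk_text(parent_key: str, text: str, inherited: dict,
               size: int = 200, overlap: int = 40) -> list[dict]:
    """Split text into overlapping word windows with deterministic chunk IDs."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for n, start in enumerate(range(0, len(words), step)):
        content = " ".join(words[start:start + size])
        chunks.append({
            **inherited,  # tenant and security labels travel with every chunk
            "chunk_id": f"{parent_key}:chunk:{n:04d}",
            "parent_id": parent_key,
            "content": content,
            "content_hash": "sha256:" + hashlib.sha256(content.encode("utf-8")).hexdigest(),
        })
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks


text = " ".join(f"w{i}" for i in range(10))
chunks = chunk_text("case:123:evidence:456", text,
                    {"tenant_id": "tenant-a", "security_labels": ["RESTRICTED"]},
                    size=4, overlap=1)
print([c["chunk_id"][-4:] for c in chunks])  # ['0000', '0001', '0002']
print(chunks[1]["content"])                  # w3 w4 w5 w6
```

Deterministic IDs and content hashes mean a rebuild produces the same keys, and unchanged chunks can skip re-embedding.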

24. Search Result Explanation

For serious systems, search results should be explainable enough for users to trust them.

Not necessarily full algorithm disclosure, but enough signal:

matched exact case number;
matched title phrase;
matched evidence text;
similar to selected case;
boosted because high risk;
limited to visible cases;
result may be stale as of timestamp;
hidden records excluded due to access policy.

For regulated workflows, this matters.

A user must distinguish:

“no matching records exist”; from
“no matching records visible to you”; from
“search index is delayed”; from
“query was too broad/narrow”; from
“semantic search found similar but not exact records”.

25. Search Quality Metrics

Database engineers often measure only latency.

Search systems also need quality metrics.

Metric	Meaning
precision@k	fraction of the top-k results that are relevant
recall@k	fraction of all relevant results that appear in the top-k
MRR	mean over queries of the reciprocal rank of the first relevant result
NDCG	ranking quality with graded relevance, normalized against the ideal ordering
zero-result rate	queries returning no results
reformulation rate	users changing query after bad result
click-through rate	weak signal of usefulness
abandonment	users leave without selecting result
freshness lag	source update to index visibility
unauthorized-result count	must be zero

For vector search, also measure:

exact-vs-approx recall;
recall under filters;
latency under concurrent load;
index memory size;
build time;
quality by tenant/domain/category;
degradation after embedding model changes.
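The ranking metrics in the table are small enough to compute directly from a judged query set. The sketch below assumes binary relevance for precision, recall, and reciprocal rank, and uses the linear-gain form of DCG for NDCG (some tools use an exponential gain instead). MRR is the mean of `reciprocal_rank` over all evaluation queries.

```python
import math


def precision_at_k(ranked: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the top-k results that are relevant."""
    return sum(1 for d in ranked[:k] if d in relevant) / k


def recall_at_k(ranked: list[str], relevant: set[str], k: int) -> float:
    """Fraction of all relevant documents that appear in the top-k."""
    return sum(1 for d in ranked[:k] if d in relevant) / len(relevant) if relevant else 0.0


def reciprocal_rank(ranked: list[str], relevant: set[str]) -> float:
    """1 / rank of the first relevant result, or 0 when none is found."""
    for rank, d in enumerate(ranked, start=1):
        if d in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked: list[str], gains: dict[str, float], k: int) -> float:
    """NDCG with graded gains; documents missing from `gains` count as 0."""
    def dcg(values):
        return sum(g / math.log2(i + 1) for i, g in enumerate(values, start=1))
    actual = dcg(gains.get(d, 0.0) for d in ranked[:k])
    ideal = dcg(sorted(gains.values(), reverse=True)[:k])
    return actual / ideal if ideal > 0 else 0.0


ranked = ["a", "b", "c", "d"]
relevant = {"b", "d", "e"}
print(precision_at_k(ranked, relevant, 2))  # 0.5
print(reciprocal_rank(ranked, relevant))    # 0.5
```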

26. Search Performance Design

Performance is not only index type.

Key dimensions:

Dimension	Design Question
corpus size	how many documents/chunks?
update rate	how often do documents change?
query rate	how many searches per second?
filter selectivity	are filters broad or narrow?
top-k size	how many candidates are needed?
reranking cost	can expensive reranker run per query?
latency SLO	p50/p95/p99 target?
freshness SLO	max index lag?
memory budget	can vector index fit in memory?
rebuild time	can index be rebuilt within operational window?

A safe search architecture uses budgets:

Total p95 target: 800 ms
- request validation: 20 ms
- query embedding: 100 ms
- lexical retrieval: 120 ms
- vector retrieval: 180 ms
- merge/filter: 50 ms
- rerank: 200 ms
- source fetch: 100 ms
- response serialization: 30 ms

Without budgets, search becomes unbounded product magic.
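The budget above can be checked mechanically. The sketch below adds up the stages as listed and, as an assumption not stated in the budget itself, also computes the critical path if lexical retrieval ran in parallel with query embedding plus vector retrieval. Adding per-stage p95 values is only a planning approximation of the end-to-end p95.

```python
TARGET_MS = 800
BUDGET_MS = {
    "request validation": 20,
    "query embedding": 100,
    "lexical retrieval": 120,
    "vector retrieval": 180,
    "merge/filter": 50,
    "rerank": 200,
    "source fetch": 100,
    "response serialization": 30,
}


def sequential_total(budget: dict[str, int]) -> int:
    """Every stage runs one after another."""
    return sum(budget.values())


def parallel_total(budget: dict[str, int]) -> int:
    """Lexical branch runs alongside the embedding + vector branch."""
    lexical_branch = budget["lexical retrieval"]
    vector_branch = budget["query embedding"] + budget["vector retrieval"]
    branch_stages = {"lexical retrieval", "query embedding", "vector retrieval"}
    shared = sum(ms for stage, ms in budget.items() if stage not in branch_stages)
    return shared + max(lexical_branch, vector_branch)


print(sequential_total(BUDGET_MS))  # 800
print(parallel_total(BUDGET_MS))    # 680
```

Keeping the budget in code lets a test fail when someone adds a stage without taking time from another.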

27. Multi-Tenant Search Index Design

Common options:

Model	Description	Strength	Risk
shared index	all tenants in one index with `tenant_id` filter	simple, cost-efficient	filter mistakes, noisy tenants
index per tenant	separate index per tenant	strong isolation	operational explosion
index per tenant tier/cell	grouped by shard/cell	balanced	routing complexity
dedicated index for regulated tenants	special isolation for high-risk tenants	compliance	higher cost

Default for SaaS:

shared index for small/medium tenants;
cell/index split for large tenants;
dedicated index for high-compliance tenants;
mandatory tenant filter in every query;
automated tests that attempt cross-tenant retrieval and fail the build if anything comes back.

Search tenant isolation must be tested like database row-level security.

28. Blue/Green Index Rebuild

Indexes must be rebuildable without downtime.

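Pattern: build the new index alongside the live one, backfill it from authoritative data, validate it, switch reads to it atomically (for example through an alias), and keep the old index until rollback is no longer needed.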

Validation checklist:

document count by type;
count by tenant;
count by lifecycle status;
sample source-to-index equality;
unauthorized search test;
known-query relevance test;
freshness lag;
vector dimension/model version;
duplicate document IDs;
missing delete propagation;
query latency under load.

Never treat reindex as a manual hero operation.
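A few of the checklist items can be automated as a gate in front of the cutover. This sketch uses dictionaries as stand-ins for the authoritative snapshot, the candidate index, and the alias table; the names `validate` and `cut_over` and the index names are hypothetical, and only count, missing-document, and version checks are shown.

```python
def validate(source_docs: dict, candidate: dict) -> list[str]:
    """Return human-readable validation failures; an empty list means pass."""
    problems = []
    if len(candidate) != len(source_docs):
        problems.append(f"count mismatch: source={len(source_docs)} index={len(candidate)}")
    missing = sorted(set(source_docs) - set(candidate))
    if missing:
        problems.append(f"missing documents: {missing[:5]}")
    drifted = sorted(
        doc_id for doc_id in candidate
        if doc_id in source_docs
        and candidate[doc_id]["source_version"] != source_docs[doc_id]["source_version"]
    )
    if drifted:
        problems.append(f"version drift: {drifted[:5]}")
    return problems


def cut_over(aliases: dict, alias: str, new_index: str, problems: list[str]) -> str:
    """Repoint the read alias only when validation passed; return the active index."""
    if problems:
        return aliases[alias]  # keep serving from the current index
    aliases[alias] = new_index
    return new_index


source = {"case:1": {"source_version": 2}, "case:2": {"source_version": 1}}
incomplete = {"case:1": {"source_version": 1}}
aliases = {"cases": "cases_v1"}

print(cut_over(aliases, "cases", "cases_v2", validate(source, incomplete)))    # cases_v1
print(cut_over(aliases, "cases", "cases_v2", validate(source, dict(source))))  # cases_v2
```

Because readers only ever see the alias, a failed validation leaves production untouched and a successful one is a single switch.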

29. Search/Vector Failure Modes

Failure Mode	Symptom	Root Cause	Mitigation
stale result	user sees old status	async lag	freshness metadata, primary recheck
unauthorized result	hidden item appears	missing filter/index metadata	mandatory filter tests, source recheck
zero results after filter	vector top-k filtered away	post-filtering too late	prefilter, larger candidate set, filter-aware engine
duplicate result	same source appears multiple times	multiple projections not grouped	canonical source ID, dedup/grouping
poor relevance	irrelevant top results	analyzer/ranking/vector issue	query logs, judged dataset, reranking
index drift	index differs from DB	failed events/retries	reconciliation job
bad embedding migration	quality drops	mixed vector spaces	versioned embeddings, shadow eval
rebuild overload	source DB impacted	unthrottled scan	snapshot, chunking, rate limits, replica use
vector memory pressure	p99 spikes/OOM	large HNSW index	quantization, sharding, capacity planning
privacy deletion leak	deleted content still searchable	derived store not purged	delete propagation audit
facet leak	hidden counts visible	facets computed pre-auth	authorized-only aggregation

30. Case Study: Regulatory Case Search

Requirement:

Investigators must search cases by keyword and semantic similarity. Search must respect tenant, jurisdiction, role, confidentiality label, lifecycle status, and deletion/retention policy. Recent case updates should appear within 10 seconds. Exact case-number lookup must be immediate.

Design:

Case DB remains source of truth.
Exact case-number lookup uses primary DB/index.
Search document includes tenant, jurisdiction, status, security labels, visible groups.
Vector chunks represent case summary, allegations, evidence summaries, and decision text.
Search API applies mandatory filters before ranking.
API fetches current case state from DB before returning restricted fields.
Outbox pipeline indexes within 10-second SLO.
Reconciliation job compares source count to index count.
Blue/green index rebuild supports analyzer and embedding upgrades.
Deletion event purges lexical and vector records.

This is not overengineering. This is what makes retrieval safe in a serious domain.

31. Implementation Checklist

Before approving a search/vector design, answer these.

Source of Truth

What is the canonical source table/service?
Is search a projection or source of truth?
How are search documents rebuilt?
How do results link back to authoritative data?

Query Semantics

Which fields are exact?
Which fields are lexical?
Which fields are semantic/vector?
Which filters are mandatory?
Which ranking signals are business rules?

Security

Are tenant/security filters indexed and mandatory?
Can autocomplete leak hidden terms?
Can facets leak hidden counts?
Are snippets generated only from authorized content?
Is cache keyed by security context?

Freshness

What is the max index lag?
Which operations need immediate source lookup?
Is indexed source version visible?
Is pipeline lag monitored?

Vector

Which embedding model/version is used?
How are chunks identified?
Are vectors comparable across versions?
What recall/latency target exists?
Are filters applied before/inside retrieval?

Operations

Can we rebuild index without downtime?
Can we replay missed events?
Is there a DLQ?
Are delete events audited?
Are quality tests automated?

32. Engineering Heuristics

Use these as practical rules.

Search is a projection until proven otherwise.
Exact identifiers deserve exact indexes.
Authorization is a filter, not a ranking feature.
Vector search is approximate unless explicitly exact.
Do not mix embedding versions blindly.
Search freshness must be part of the product contract.
Every derived index needs rebuild and reconciliation.
Deletion must propagate to every retrieval surface.
Hybrid search usually beats pure vector search for enterprise systems.
Relevance must be tested with judged queries, not vibes.

33. Final Mental Model

A search/vector architecture has four truths:

Data truth — what the source system says.
Retrieval truth — what the index can find.
Security truth — what the user may see.
Ranking truth — what the system chooses to show first.

The hard part is keeping these aligned.

When they diverge:

stale results appear;
unauthorized data leaks;
relevant records disappear;
users lose trust;
audits fail;
product behavior becomes unexplainable.

Design search like a database subsystem, not a UI feature.

That is the difference between basic implementation and production-grade architecture.

References

PostgreSQL Documentation — Full Text Search Indexes: https://www.postgresql.org/docs/current/textsearch-indexes.html
PostgreSQL Documentation — Full Text Search: https://www.postgresql.org/docs/current/textsearch.html
OpenSearch Documentation — k-NN Vector Field: https://docs.opensearch.org/latest/mappings/supported-field-types/knn-vector/
OpenSearch Documentation — Approximate k-NN: https://docs.opensearch.org/latest/vector-search/vector-search-techniques/approximate-knn/
pgvector README — HNSW and IVFFlat: https://github.com/pgvector/pgvector
MongoDB Documentation — Vector Search: https://www.mongodb.com/docs/vector-search/
