How do you fuse semantic and keyword rankings?

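Reciprocal Rank Fusion (RRF) is the standard approach: it combines ranks rather than raw scores, which avoids having to reconcile BM25 scores and embedding similarities that sit on different scales. Weighted score combination is the alternative. Defaults usually work; tune for specific use cases only if quality data shows a benefit.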
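As a rough illustration of rank-based fusion, here is a minimal RRF sketch. The function name, the document IDs, and the two input lists are hypothetical; k=60 is a commonly used default constant, not a value taken from this article.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document earns 1 / (k + rank) from every list it appears in;
    the contributions are summed across lists.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical result lists from the two retrievers, best match first.
semantic = ["doc_a", "doc_b", "doc_c"]
keyword = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([semantic, keyword])
print(fused)
```

Documents that appear near the top of both lists rise to the top of the fused list, and no score normalization is needed because only positions are used.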

Do all vector databases support hybrid search?

Most modern ones do (Qdrant, Weaviate, Pinecone, and pgvector when combined with PostgreSQL full-text search or extensions). Some legacy or simple vector stores require manual implementation. Check vendor documentation for the current state of support.

What's the implementation overhead?

Typically 30-40 lines of additional code to add BM25 keyword search alongside semantic search and fuse the rankings. That is marginal compared to the complexity of a typical retrieval pipeline.
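To give a sense of scale for the keyword side, the following is a minimal in-memory BM25 sketch. It is an illustration under assumptions, not a production index: the class name `BM25Index`, whitespace tokenization, the sample documents, and the parameters k1=1.5 and b=0.75 are all choices made here. The IDF term uses the non-negative `log(1 + ...)` variant.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: lowercase whitespace tokenization; real systems use analyzers.
    return text.lower().split()

class BM25Index:
    def __init__(self, docs, k1=1.5, b=0.75):
        self.k1, self.b = k1, b
        self.doc_tokens = [tokenize(d) for d in docs]
        self.doc_freqs = [Counter(toks) for toks in self.doc_tokens]
        self.avg_len = sum(len(t) for t in self.doc_tokens) / len(docs)
        # Document frequency: how many documents contain each term.
        df = Counter(term for toks in self.doc_tokens for term in set(toks))
        n = len(docs)
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5)) for t, f in df.items()}

    def score(self, query, i):
        freqs, length = self.doc_freqs[i], len(self.doc_tokens[i])
        total = 0.0
        for term in tokenize(query):
            tf = freqs.get(term, 0)
            if tf == 0:
                continue
            # Length-normalized term frequency saturation.
            norm = tf + self.k1 * (1 - self.b + self.b * length / self.avg_len)
            total += self.idf[term] * tf * (self.k1 + 1) / norm
        return total

    def search(self, query, top_k=10):
        scored = [(self.score(query, i), i) for i in range(len(self.doc_tokens))]
        # Keep only documents that matched at least one query term.
        hits = [(s, i) for s, i in scored if s > 0]
        hits.sort(key=lambda pair: -pair[0])
        return [i for _, i in hits[:top_k]]

# Hypothetical sample corpus.
docs = [
    "error code E4021 on checkout",
    "how to cancel a subscription",
    "checkout page loads slowly",
]
index = BM25Index(docs)
results = index.search("E4021 checkout")
print(results)
```

The ranked list of document indexes this returns is what would be fused with the semantic ranking.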

Should we use sparse embeddings (SPLADE) instead of BM25?

Sparse learned embeddings (SPLADE, etc.) are an emerging alternative to traditional BM25 keyword search. They often perform better than BM25 but are more complex to implement and serve. For most production cases, hybrid search with traditional BM25 plus dense embeddings is sufficient.

Does adding keyword search hurt performance?

It adds latency (parallel queries to two indexes plus fusion), typically an additional 20-50 ms. Storage cost increases slightly because a keyword index is kept alongside the vector index. The quality improvement (10-30%) typically outweighs both.
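The parallel querying mentioned above can be sketched with a thread pool. This is one possible arrangement, not a prescribed one: `hybrid_retrieve` and the two `fake_*` retrievers are hypothetical stand-ins, and the sleeps only simulate index round trips.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def hybrid_retrieve(query, semantic_search, keyword_search, top_k=50):
    """Run both retrievers concurrently and return their ranked lists."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        sem_future = pool.submit(semantic_search, query, top_k)
        kw_future = pool.submit(keyword_search, query, top_k)
        return sem_future.result(), kw_future.result()

# Hypothetical stand-ins for real index clients; the sleeps mimic network latency.
def fake_semantic_search(query, top_k):
    time.sleep(0.05)
    return ["doc_a", "doc_b"][:top_k]

def fake_keyword_search(query, top_k):
    time.sleep(0.05)
    return ["doc_b", "doc_c"][:top_k]

start = time.perf_counter()
semantic_hits, keyword_hits = hybrid_retrieve(
    "error E4021", fake_semantic_search, fake_keyword_search
)
elapsed_ms = (time.perf_counter() - start) * 1000
print(f"both retrievers finished in {elapsed_ms:.0f} ms")
```

Because the two calls overlap, the added wall-clock time is governed by the slower retriever plus fusion rather than by the sum of the two.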

Should we add reranking to hybrid search?

Yes, for production RAG. The standard production pipeline is hybrid retrieval (top 50-100 candidates), then reranking (down to the top 5-10), then the LLM. Reranking provides an additional quality improvement on top of hybrid retrieval.
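The shape of that two-stage pipeline can be sketched as follows. Everything here is illustrative: `retrieve_and_rerank`, `toy_retrieve`, and `toy_rerank_score` are hypothetical names, and the word-overlap scorer merely stands in for a real reranking model such as a cross-encoder.

```python
def retrieve_and_rerank(query, retrieve, rerank_score, candidates=50, final_k=5):
    """Stage 1: fetch a wide candidate set. Stage 2: rescore and keep the best."""
    docs = retrieve(query, candidates)
    # sorted() is stable, so equally scored docs keep their retrieval order.
    reranked = sorted(docs, key=lambda d: rerank_score(query, d), reverse=True)
    return reranked[:final_k]

# Hypothetical corpus and stand-in components.
corpus = [
    "reset your password from the login page",
    "billing and invoices overview",
    "password rules and requirements",
    "contact support",
]

def toy_retrieve(query, limit):
    # Stand-in for hybrid retrieval: just returns the first `limit` documents.
    return corpus[:limit]

def toy_rerank_score(query, doc):
    # Stand-in for a reranking model: counts words shared by query and document.
    return len(set(query.lower().split()) & set(doc.lower().split()))

top = retrieve_and_rerank("reset password", toy_retrieve, toy_rerank_score, final_k=2)
print(top)
```

Keeping the retriever and the reranker as injected callables means either stage can be swapped out without touching the pipeline itself.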

Semantic vs Hybrid Search: Which Retrieval Approach to Choose

TL;DR

Use hybrid search (semantic + keyword) for almost every production RAG and search use case: it combines the meaning understanding of semantic search with the exact-match precision of keyword search. Use pure semantic search only where content has few proper nouns and keyword precision doesn't matter. Use pure keyword search rarely, only when meaning understanding is genuinely irrelevant. Modern production retrieval is hybrid by default; adding keyword search to a semantic pipeline takes roughly 30-40 lines of code and is one of the highest-ROI improvements available.

Side-by-side comparison

Dimension	Semantic Search	Hybrid Search (Semantic + Keyword)
Captures meaning	Yes	Yes
Captures exact matches	No	Yes
Performance on conceptual queries	Strong	Strong
Performance on specific lookups	Weak	Strong
Cross-lingual capability	Yes (with multilingual embeddings)	Yes (handled by the semantic component)
Implementation complexity	Lower	Slightly higher
Storage cost	Lower	Slightly higher (sparse + dense indexes)
Typical benchmark improvement	Baseline	10-30% over semantic alone
Vector DB support	All	Modern (Qdrant, Weaviate, Pinecone, others)
Best for	Conceptual queries, minimal proper nouns	Production retrieval (almost all cases)

Semantic Search

Vector embeddings match by meaning. Strong on conceptual queries.

Semantic search converts queries and documents to vector embeddings and finds matches by similarity in vector space. It captures meaning beyond exact terms ('cancel my subscription' matches 'how do I unsubscribe'), so it is strong on conceptual queries where users describe intent in different words than the documents use. It is weak on exact-match signals: proper nouns, product codes, technical terms, and specific numbers that should rank highly when present. Modern embedding models (OpenAI text-embedding-3, Cohere Embed, Voyage) make semantic search practical at scale.
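The similarity lookup itself is simple once embeddings exist. The sketch below uses cosine similarity over hand-written three-dimensional vectors; these toy vectors are assumptions made for illustration and do not come from any real embedding model, which would produce hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def semantic_search(query_vec, doc_vecs, top_k=3):
    """Rank documents by cosine similarity to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), doc) for doc, vec in doc_vecs.items()]
    scored.sort(reverse=True)
    return [doc for _, doc in scored[:top_k]]

# Toy vectors standing in for real embeddings (hypothetical values).
doc_vecs = {
    "how do I unsubscribe": [0.9, 0.1, 0.0],
    "update billing address": [0.1, 0.9, 0.1],
    "reset my password": [0.0, 0.2, 0.9],
}
# Stand-in for the embedding of the query 'cancel my subscription'.
query_vec = [0.8, 0.2, 0.1]

results = semantic_search(query_vec, doc_vecs)
print(results[0])
```

The query shares no words with its best match; the ranking comes entirely from the vectors pointing in a similar direction, which is also why an exact product code gets no special treatment.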

Pros

Captures meaning beyond exact terms
Handles synonym and paraphrase queries well
Cross-lingual capability with multilingual embeddings
Strong on conceptual / intent queries
Works without complex query parsing

Cons

Misses exact-match signals (proper nouns, codes, technical terms)
Can return semantically similar but irrelevant results
Near matches can outrank the exact match, so ranking on specific lookups can be worse than keyword search
Embedding model choice matters significantly

Best for

→ Conceptual queries where meaning matters more than exact terms
→ Cross-lingual retrieval
→ Use cases with minimal proper noun / code content

Worst for

→ Use cases with significant proper noun, code, or technical term content
→ Specific lookup queries where exact match matters
→ Use cases where keyword precision is critical

Cost model

Embedding cost (per token) + vector store storage / queries.

Time to value

Days for production semantic search.

Hybrid Search (Semantic + Keyword)

Best of both worlds. The production default for modern retrieval.

Hybrid search runs semantic search and keyword search (typically BM25) in parallel, then fuses the two rankings with Reciprocal Rank Fusion or a weighted combination. This captures both meaning (semantic) and exact-match precision (keyword). Benchmarks consistently show hybrid outperforming either approach alone, typically by 10-30% on suites like BEIR. Adding keyword search to a semantic pipeline takes roughly 30-40 lines of code and is one of the highest-ROI improvements available. Modern vector databases (Qdrant, Weaviate, Pinecone) all support hybrid search natively.
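For the weighted-combination option, one plausible sketch is to min-max normalize each retriever's scores and blend them with a weight. The function names, the `alpha` parameter, and the sample scores are all hypothetical; min-max normalization is one of several reasonable choices, used here because cosine similarities and BM25 scores live on different scales.

```python
def min_max(scores):
    """Rescale a {doc_id: score} mapping to the 0..1 range."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}

def weighted_fusion(semantic_scores, keyword_scores, alpha=0.5):
    """Blend normalized scores; alpha is the weight on the semantic side."""
    sem, kw = min_max(semantic_scores), min_max(keyword_scores)
    fused = {
        doc: alpha * sem.get(doc, 0.0) + (1 - alpha) * kw.get(doc, 0.0)
        for doc in set(sem) | set(kw)
    }
    # Highest blended score first; ties broken by doc ID for determinism.
    return sorted(fused, key=lambda d: (-fused[d], d))

# Hypothetical raw scores: cosine similarities and BM25 scores.
semantic_scores = {"a": 0.82, "b": 0.79, "c": 0.40}
keyword_scores = {"c": 12.0, "a": 6.0, "d": 3.0}

ranking = weighted_fusion(semantic_scores, keyword_scores, alpha=0.5)
print(ranking)
```

Raising `alpha` favors conceptual matches and lowering it favors exact-term matches, which makes it the natural knob to tune against quality data.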