
Vector Databases & Hybrid Search: A Learning Guide

🎯 Overview

This guide explains how our FastAPI-based FastOpp application implements hybrid search using vector databases, embeddings, and traditional text search. You'll learn about FAISS, embeddings, FTS5, and how to extend the system for real-world applications.

FastOpp is an open-source learning tool for AI applications. It provides pre-built UI and admin components so students can spend more time on core AI concepts instead of configuring a full FastAPI stack, with authentication and UI, from scratch. Learn more at https://github.com/Oppkey/fastopp.

Hybrid search combines two different search approaches to get the best of both worlds:

  1. Keyword Search (FTS5 + BM25): Finds exact matches and phrases
  2. Semantic Search (FAISS + Embeddings): Understands meaning and context

Why Hybrid? Sometimes you want exact matches ("social media strategy"), sometimes you want related concepts ("marketing approach" → finds "brand strategy").

🔍 Understanding the Components

1. Embeddings: Converting Text to Numbers

What are embeddings?

  • Embeddings convert text into a list of numbers (vectors)
  • Similar text gets similar numbers
  • Our app uses OpenAI's text-embedding-3-small model (1536 dimensions)

Example:

# Text: "marketing strategy"
# Embedding: [0.1, -0.3, 0.8, 0.2, ...] (1536 numbers)

# Text: "brand approach" 
# Embedding: [0.2, -0.2, 0.7, 0.3, ...] (similar numbers = similar meaning)
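
Similarity between two embeddings is typically measured with cosine similarity. A minimal sketch using shortened 4-number vectors in place of the full 1536:

# Minimal sketch: cosine similarity between two (shortened) embeddings
import numpy as np

a = np.array([0.1, -0.3, 0.8, 0.2])   # "marketing strategy"
b = np.array([0.2, -0.2, 0.7, 0.3])   # "brand approach"

similarity = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
print(similarity)  # Values close to 1.0 mean similar meaning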

How it works in our app:

# services/embedding_service.py
async def generate_embedding(self, text: str) -> List[float]:
    response = await self.client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )
    return response.data[0].embedding  # Returns 1536 numbers

2. FAISS: Vector Database

What is FAISS?

  • Facebook AI Similarity Search
  • Stores and searches through vectors efficiently
  • Finds similar vectors quickly (cosine similarity)

Why FAISS over alternatives?

The vector database can be replaced with other solutions depending on your infrastructure preferences and requirements:

| Feature | FAISS | ChromaDB | Pinecone | PostgreSQL + pgvector |
|---|---|---|---|---|
| Setup | Simple, local | Easy, local/cloud | Cloud service | Moderate, local/cloud |
| Cost | Free | Free/paid | Paid | Free/paid |
| Performance | Very fast | Good | Good | Good |
| Scalability | Self-managed | Self-managed | Managed | Self-managed |
| Learning Value | High (understand internals) | Medium | Low (black box) | High (SQL knowledge) |
| Integration | Python library | Python library | API-based | Native SQL |
| Persistence | File-based | File/cloud | Cloud | Database |
| ACID Compliance | No | No | No | Yes |

How to Change the Vector Database:

The vector database implementation is abstracted in our HybridSearchService class, making it straightforward to swap implementations:

  1. Create a new vector database adapter (e.g., PostgreSQLVectorAdapter)
  2. Implement the same interface as the current FAISS implementation
  3. Update the service initialization to use the new adapter
  4. Migrate existing vector data to the new database

Example: Switching to PostgreSQL + pgvector

# New adapter class (sketch; the chunk_vectors table and embedding column are illustrative names)
import psycopg2

class PostgreSQLVectorAdapter:
    def __init__(self, connection_string: str):
        self.conn = psycopg2.connect(connection_string)

    def add_vectors(self, vectors: List[List[float]], ids: List[int]):
        # Insert vectors into PostgreSQL with pgvector
        with self.conn.cursor() as cur:
            for chunk_id, vector in zip(ids, vectors):
                cur.execute(
                    "INSERT INTO chunk_vectors (chunk_id, embedding) VALUES (%s, %s::vector)",
                    (chunk_id, str(vector)),
                )
        self.conn.commit()

    def search(self, query_vector: List[float], k: int = 10):
        # Use pgvector's cosine-distance operator (<=>) for similarity search
        with self.conn.cursor() as cur:
            cur.execute(
                "SELECT chunk_id FROM chunk_vectors ORDER BY embedding <=> %s::vector LIMIT %s",
                (str(query_vector), k),
            )
            return [row[0] for row in cur.fetchall()]

# Update HybridSearchService
class HybridSearchService:
    def __init__(self, vector_db_type: str = "faiss"):
        if vector_db_type == "postgresql":
            self.vector_db = PostgreSQLVectorAdapter(DATABASE_URL)
        else:
            self.vector_db = FAISSAdapter()  # Current implementation

FAISS in our app:

# services/hybrid_search_service.py
class HybridSearchService:
    def __init__(self):
        self.faiss_index = faiss.IndexFlatIP(1536)  # Inner product index
        self.embedding_dimension = 1536
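
IndexFlatIP ranks by inner product, so normalizing vectors to unit length makes the score equal to cosine similarity. A minimal sketch of adding and searching vectors (the random arrays stand in for real embeddings):

# Minimal sketch: populate and query an IndexFlatIP index
import faiss
import numpy as np

index = faiss.IndexFlatIP(1536)

vectors = np.random.rand(100, 1536).astype("float32")  # stand-in for real embeddings
faiss.normalize_L2(vectors)   # unit length, so inner product == cosine similarity
index.add(vectors)

query = np.random.rand(1, 1536).astype("float32")
faiss.normalize_L2(query)
scores, ids = index.search(query, 5)  # top-5 most similar vectors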

3. FTS5: Full-Text Keyword Search

What is FTS5?

  • SQLite's Full-Text Search extension
  • Searches through text content efficiently
  • Uses the BM25 ranking algorithm

FTS5 in our app:

-- Virtual table for searchable content
CREATE VIRTUAL TABLE chunks_fts USING fts5(
    content,
    chunk_type,
    content='chunks',
    content_rowid='rowid'
);
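
Once the virtual table is populated, a keyword query can be ranked with BM25 straight from Python's built-in sqlite3 module. A minimal sketch (the database path is an assumption):

# Minimal sketch: BM25-ranked keyword query against chunks_fts
import sqlite3

conn = sqlite3.connect("app.db")  # path is illustrative
rows = conn.execute(
    """
    SELECT rowid, content, bm25(chunks_fts) AS score
    FROM chunks_fts
    WHERE chunks_fts MATCH ?
    ORDER BY score          -- lower (more negative) bm25() values rank higher
    LIMIT 10
    """,
    ("social media strategy",),
).fetchall()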

⚖️ Understanding Weight Modification

The Hybrid Score Formula:

hybrid_score = (bm25_weight × BM25_score) + (cosine_weight × cosine_similarity_score)

Default Weights:

  • bm25_weight = 0.35 (35% keyword search)
  • cosine_weight = 0.65 (65% semantic search)
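
With the default weights, a strong semantic match outweighs a moderate keyword match. A minimal sketch, assuming both scores are already normalized to a 0-1 range before weighting:

# Minimal sketch: combining normalized BM25 and cosine scores
def hybrid_score(bm25_score: float, cosine_score: float,
                 bm25_weight: float = 0.35, cosine_weight: float = 0.65) -> float:
    return bm25_weight * bm25_score + cosine_weight * cosine_score

hybrid_score(0.4, 0.9)  # 0.725: the semantic score dominates under the default weights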

How to Modify Weights:

Scenario 1: More Keyword Focus

# For exact term matching (legal documents, code)
bm25_weight = 0.7    # 70% keyword
cosine_weight = 0.3  # 30% semantic

Scenario 2: More Semantic Focus

# For creative content (marketing, research)
bm25_weight = 0.2    # 20% keyword
cosine_weight = 0.8  # 80% semantic

Scenario 3: Balanced Approach

# For general content
bm25_weight = 0.5    # 50% keyword
cosine_weight = 0.5  # 50% semantic

Testing Different Weights:

# API call with custom weights
GET /search/hybrid?q=marketing&bm25_weight=0.2&cosine_weight=0.8
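
The same call from Python, for example with httpx (the host and port assume a local development server):

# Minimal sketch: calling the hybrid search endpoint with custom weights
import httpx

response = httpx.get(
    "http://localhost:8000/search/hybrid",  # local dev server assumed
    params={"q": "marketing", "bm25_weight": 0.2, "cosine_weight": 0.8},
)
print(response.json())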

🛠️ Learning Path

Phase 1: Understanding the Basics

  1. Study the embedding service (services/embedding_service.py)
  2. Examine chunk creation (services/chunking_service.py)
  3. Test different search methods via the UI at /search

Phase 2: Experimenting with Weights

  1. Modify weights in the search UI
  2. Compare results between keyword, semantic, and hybrid
  3. Test with different content types

Phase 3: Extending the System

  1. Add new content types (PDFs, images)
  2. Implement custom scoring algorithms
  3. Add search filters and faceted search

Phase 4: Advanced Features

  1. Add search suggestions and autocomplete
  2. Implement search analytics and user behavior tracking
  3. Add real-time search updates

🏗️ Architecture Deep Dive

Data Flow

  1. User types search query
  2. Query gets embedded (text → numbers)
  3. FAISS finds similar vectors
  4. FTS5 finds keyword matches
  5. Results get combined with weights
  6. Final ranked results returned
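
The flow above corresponds roughly to a search method like the following sketch (method and attribute names are illustrative, not the exact FastOpp implementation):

# Rough sketch of the data flow; helper names are illustrative
async def hybrid_search(self, query: str, k: int = 10):
    query_vector = await self.embedding_service.generate_embedding(query)  # step 2
    semantic_hits = self.search_faiss(query_vector, k)                      # step 3
    keyword_hits = self.search_fts5(query, k)                               # step 4
    combined = self.combine_scores(keyword_hits, semantic_hits,             # step 5
                                   self.bm25_weight, self.cosine_weight)
    return sorted(combined, key=lambda r: r.score, reverse=True)[:k]        # step 6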

Key Files to Study

  • services/hybrid_search_service.py - Core hybrid search logic
  • services/embedding_service.py - OpenAI integration
  • services/chunking_service.py - Content processing
  • routes/marketing.py - Search API endpoints
  • templates/search.html - Frontend interface

🚀 Extending for Real-World Use

Adding New Content Types

# Example: Add PDF support (sketch; assumes the pypdf package for text extraction)
from pypdf import PdfReader

class PDFChunkingService:
    def chunk_pdf(self, pdf_path: str) -> List[Chunk]:
        # Extract text from every page of the PDF
        reader = PdfReader(pdf_path)
        text = "\n".join(page.extract_text() or "" for page in reader.pages)
        # Split the text into chunks and generate embeddings with the
        # existing chunking and embedding services (omitted here)
        ...

Custom Scoring Algorithms

# Example: Boost recent content
def custom_score(self, chunk, base_score):
    recency_boost = self.calculate_recency_boost(chunk.created_at)
    return base_score * recency_boost
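
One possible implementation of the recency boost referenced above (the 90-day half-life is an arbitrary choice for illustration, and created_at is assumed to be timezone-aware):

# Illustrative recency boost: newer chunks score higher, decaying over ~90 days
from datetime import datetime, timezone

def calculate_recency_boost(self, created_at: datetime, half_life_days: float = 90.0) -> float:
    age_days = (datetime.now(timezone.utc) - created_at).total_seconds() / 86400
    return 1.0 + 0.5 ** (age_days / half_life_days)  # ranges from ~1.0 (old) to 2.0 (brand new)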

Performance Optimization

# Example: Caching frequent embeddings
# (functools.lru_cache is not async-aware, so a simple dict cache is used instead)
async def cached_embedding(self, text: str) -> List[float]:
    if text not in self._embedding_cache:  # e.g. self._embedding_cache = {} in __init__
        self._embedding_cache[text] = await self.generate_embedding(text)
    return self._embedding_cache[text]

📊 Monitoring and Analytics

Search Metrics to Track

  • Query performance (response time)
  • Result relevance (click-through rates)
  • Popular queries and search patterns
  • Search method effectiveness (hybrid vs keyword vs semantic)

Implementation Example

# Add to search endpoints
import logging

async def track_search(query: str, method: str, results_count: int):
    # Log search metrics (these could also be written to an analytics table)
    logging.info("search query=%r method=%s results=%d", query, method, results_count)

🔧 Troubleshooting Common Issues

Issue 1: No Search Results

  • Check: Are chunks being created?
  • Check: Are embeddings being generated?
  • Check: Is FAISS index built?
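
A few quick checks from a Python shell can narrow this down (the search_service instance name and database path are assumptions):

# Quick diagnostics for empty search results
import sqlite3

conn = sqlite3.connect("app.db")
print("chunks:", conn.execute("SELECT COUNT(*) FROM chunks").fetchone()[0])
print("fts rows:", conn.execute("SELECT COUNT(*) FROM chunks_fts").fetchone()[0])
print("faiss vectors:", search_service.faiss_index.ntotal)  # 0 means the index was never built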

Issue 2: Poor Search Quality

  • Try: Adjusting weights
  • Try: Different chunk sizes
  • Try: Different embedding models

Issue 3: Slow Performance

  • Check: FAISS index size
  • Check: Database query performance
  • Consider: Adding caching

📚 Further Learning Resources

  • FAISS Documentation: https://faiss.ai/
  • OpenAI Embeddings Guide: https://platform.openai.com/docs/guides/embeddings
  • SQLite FTS5: https://www.sqlite.org/fts5.html
  • Vector Search Best Practices: https://weaviate.io/blog/vector-search-best-practices

🎯 Next Steps

  1. Experiment with the search interface
  2. Modify weights and see how results change
  3. Add your own content and test search quality
  4. Extend the system with new features
  5. Deploy to a real environment

Remember: The best way to learn is by doing! Start with small modifications and gradually build more complex features.


🏢 Corporate Database Use Case: 10,000 Customer Analysis

This architecture is well suited to analyzing a corporate database of 10,000 customers/stakeholders for cross-selling opportunities. Here's how:

Why This Architecture Works for Customer Analysis

  1. Semantic Understanding: Find customers with similar interests even if they used different words
  2. Hybrid Search: Combine exact matches (company names, industries) with semantic matches (similar business needs)
  3. Scalable: FAISS handles 10,000+ records efficiently
  4. Real-time: Updated customer profiles become immediately searchable

Implementation for Customer Analysis

1. Customer Data Structure

from dataclasses import dataclass
from typing import List

@dataclass
class CustomerProfile:
    company_name: str
    industry: str
    business_needs: str          # Rich text description
    current_products: List[str]
    pain_points: str
    company_size: str
    location: str
    engagement_level: str        # webinar, whitepaper, trial, etc.
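
To make profiles searchable, each one can be flattened into a single text passage before embedding. A minimal sketch (the field selection and phrasing are illustrative):

# Minimal sketch: turn a profile into text for embedding and indexing
def profile_to_text(profile: CustomerProfile) -> str:
    return (
        f"{profile.company_name} ({profile.industry}, {profile.company_size}, {profile.location}). "
        f"Needs: {profile.business_needs} Pain points: {profile.pain_points} "
        f"Current products: {', '.join(profile.current_products)}."
    )

# embedding = await embedding_service.generate_embedding(profile_to_text(profile))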

2. Search Use Cases

  • "Find companies like Acme Corp" → Semantic search finds similar business profiles
  • "Manufacturing companies needing automation" → Hybrid search finds exact + related matches
  • "SaaS companies with 50-200 employees" → Filtered semantic search

3. Cross-Selling Opportunities

# Example search queries for sales teams:
queries = [
    "companies using our CRM who might need marketing automation",
    "enterprise clients who haven't tried our analytics product",
    "SMB customers ready for enterprise features",
    "companies in healthcare needing compliance tools"
]

4. Enhanced Features for Sales

  • Customer Similarity: "Show me 10 companies most similar to [current customer]"
  • Gap Analysis: "What products do similar companies use that this customer doesn't?"
  • Engagement Scoring: Boost customers with high engagement in search results
  • Timing Analysis: Find customers who engaged 6+ months ago but haven't converted
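
The first feature, customer similarity, follows directly from the FAISS search described earlier. A minimal sketch, reusing profile_to_text from above (service and attribute names are illustrative):

# Illustrative "companies most similar to this customer" lookup
import faiss
import numpy as np

async def similar_customers(self, customer: CustomerProfile, k: int = 10):
    text = profile_to_text(customer)
    query = np.array([await self.embedding_service.generate_embedding(text)], dtype="float32")
    faiss.normalize_L2(query)                             # cosine similarity on IndexFlatIP
    scores, ids = self.faiss_index.search(query, k + 1)   # +1 because the customer matches itself
    return [self.customers[i] for i in ids[0] if self.customers[i] is not customer][:k]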

Performance at Scale

  • 10,000 customers: FAISS handles this easily (can scale to millions)
  • Real-time updates: New customer data immediately searchable
  • Fast queries: Sub-second response times even with complex searches

Business Value

  • Increase Revenue: Find cross-selling opportunities automatically
  • Improve Targeting: Better customer segmentation and personalization
  • Save Time: Sales teams find relevant prospects instantly
  • Data-Driven: Use actual customer behavior and needs, not just demographics

This architecture transforms your customer database from a static list into an intelligent, searchable knowledge base that actively helps identify sales opportunities.