AI Chat UI: Beyond ChatGPT

Building Advanced Chat Interfaces with FastAPI

Streaming, Context Management, and Business Integration

Module Overview

What You'll Learn

Streaming responses for real-time interactivity
Context management with conversation history
Internet search integration with smart triggering
Database connectivity for business context
Advanced features for production applications

The Problem with Basic Chat

Current Limitations

No real-time feedback - Users wait for complete responses
Poor context management - Limited conversation memory
No external data - Can't access current information
Isolated conversations - No business context
Basic functionality - Missing advanced features

Our Solution: Advanced Chat UI

Key Features

Streaming responses - Real-time feedback
Smart context management - Last 20 messages
Internet search integration - Current information access
Business context - Client and project integration
Extensible architecture - Easy to enhance

1. Streaming Responses

Why Streaming?

Real-time feedback - Users see progress
Reduced perceived latency - Feels faster
Better user experience - More interactive
Professional feel - Like ChatGPT/Claude

Streaming Implementation

Technical Approach

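# EventSourceResponse comes from the sse-starlette package, not FastAPI itself
from fastapi import APIRouter, Request
from sse_starlette.sse import EventSourceResponse

router = APIRouter()

@router.post("/chat/stream")
async def chat_with_llama_stream(request: Request):
    async def generate():
        # openai_stream: the async token stream from the LLM client (created elsewhere)
        async for chunk in openai_stream:
            yield chunk  # EventSourceResponse adds the "data: " framing itself

    return EventSourceResponse(generate())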

Frontend processes data: events for real-time updates
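To make the data: framing concrete, here is a minimal sketch of how one event is written on the wire and read back. The helper names format_sse and parse_sse are hypothetical, and the sketch assumes plain "\n" line endings (real servers may send "\r\n"); a browser frontend would normally rely on its own SSE client rather than code like this.

```python
def format_sse(data: str) -> str:
    """Frame a payload as one Server-Sent Event (one data: line per payload line)."""
    lines = data.split("\n")
    return "".join(f"data: {line}\n" for line in lines) + "\n"


def parse_sse(stream_text: str) -> list[str]:
    """Recover payloads from SSE text; events are separated by a blank line."""
    events = []
    for block in stream_text.split("\n\n"):
        data_lines = [
            line[5:].removeprefix(" ")  # drop "data:" and the single optional space
            for line in block.split("\n")
            if line.startswith("data:")
        ]
        if data_lines:
            events.append("\n".join(data_lines))
    return events


wire = format_sse("Hel") + format_sse("lo")
chunks = parse_sse(wire)
```

Joining the parsed chunks in arrival order rebuilds the assistant message as it streams in.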

2. Context Management

Smart Context Strategy

Last 20 messages - Balances continuity with performance
Chronological order - Maintains conversation flow
Token management - Stays within LLM limits
Database storage - Persistent conversation history

Context Implementation

Database Design

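from datetime import datetime, timezone
from uuid import UUID, uuid4
from sqlmodel import Field, SQLModel

class Message(SQLModel, table=True):
    id: UUID = Field(default_factory=uuid4, primary_key=True)  # table models need a primary key
    conversation_id: UUID = Field(index=True)  # indexed: history is fetched per conversation
    role: str  # user, assistant, system
    content: str
    created_at: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))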

Smart retrieval with get_conversation_context()
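The deck names get_conversation_context() without showing its body. The following is one possible sketch, not the project's actual implementation: it works on an in-memory list instead of the database, StoredMessage is a hypothetical stand-in for the Message model, and max_chars is an assumed character budget used as a rough proxy for a token limit.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class StoredMessage:
    role: str
    content: str
    created_at: datetime


def get_conversation_context(messages, max_messages=20, max_chars=8000):
    """Return the most recent messages, oldest first, within both limits."""
    newest_first = sorted(messages, key=lambda m: m.created_at, reverse=True)
    selected, used = [], 0
    for message in newest_first[:max_messages]:
        if used + len(message.content) > max_chars:
            break  # character budget as a rough stand-in for a token budget
        selected.append(message)
        used += len(message.content)
    selected.reverse()  # restore chronological order for the LLM
    return [{"role": m.role, "content": m.content} for m in selected]
```

Walking from newest to oldest means that when the budget runs out, it is the oldest messages that get dropped.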

3. Internet Search Integration

Current Implementation

Keyword-Based Triggering

Detects current events keywords
Keywords: "latest", "current", "2024", "news", "price"
Uses Tavily API for structured results
LLM-optimized search responses

Search Triggering Strategies

5 Approaches

Keyword matching (current) - Simple and reliable
Intent classification - ML-based detection
LLM-based decisions - AI determines need
Hybrid approach - Multiple signals
User-controlled - Manual search button

Strategy 1: Keyword Matching

Current Implementation

SEARCH_KEYWORDS = ["latest", "current", "2024", "news", "price", "weather"]

def should_search_web(user_message: str) -> bool:
    return any(keyword in user_message.lower() 
               for keyword in SEARCH_KEYWORDS)

Simple and fast, but plain substring matching can misfire (for example, "news" also matches "newsletter")
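If substring false positives become a problem, one option is to match whole words only. This is a sketch of that variant using the standard re module; it keeps the same keyword list, and _KEYWORD_PATTERN is a name introduced here for illustration.

```python
import re

SEARCH_KEYWORDS = ["latest", "current", "2024", "news", "price", "weather"]

# \b anchors each keyword to word boundaries; the pattern is compiled once at import time.
_KEYWORD_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in SEARCH_KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def should_search_web(user_message: str) -> bool:
    return _KEYWORD_PATTERN.search(user_message) is not None
```

The trade-off is that inflected forms such as "prices" no longer match unless they are added to the list.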

Strategy 2: Intent Classification

ML-Based Approach

def classify_search_intent(user_message: str) -> bool:
    # Train a model to detect when users need current information
    # More sophisticated than keyword matching
    pass

More accurate but requires training data
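As an illustration of what "train a model" could mean at the smallest scale, here is a sketch of a bag-of-words naive Bayes scorer in pure Python. The training examples are invented for this sketch, and a real system would need far more labeled data and probably a proper ML library.

```python
import math
import re
from collections import Counter

# Hypothetical labeled examples: (message, needs_current_information)
TRAINING_DATA = [
    ("what is the latest news about the election", True),
    ("current weather in london today", True),
    ("bitcoin price right now", True),
    ("who won the game last night", True),
    ("explain how recursion works", False),
    ("write a poem about autumn", False),
    ("summarize this paragraph for me", False),
    ("what is the capital of france", False),
]


def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def train(examples):
    """Count how often each token appears under each label."""
    counts = {True: Counter(), False: Counter()}
    for text, label in examples:
        counts[label].update(tokenize(text))
    return counts


def search_intent_score(message: str, counts) -> float:
    """Log-odds that the message needs current information (Laplace-smoothed)."""
    vocab = set(counts[True]) | set(counts[False])
    totals = {label: sum(c.values()) + len(vocab) for label, c in counts.items()}
    score = 0.0
    for token in tokenize(message):
        if token not in vocab:
            continue  # unseen tokens carry no evidence either way
        p_true = (counts[True][token] + 1) / totals[True]
        p_false = (counts[False][token] + 1) / totals[False]
        score += math.log(p_true / p_false)
    return score


MODEL = train(TRAINING_DATA)


def classify_search_intent(user_message: str) -> bool:
    # Class priors are equal in this toy data set, so they are omitted.
    return search_intent_score(user_message, MODEL) > 0
```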

Strategy 3: LLM-Based Decisions

AI-Powered Detection

def should_search_web(user_message: str, llm_response: str) -> bool:
    # LLM analyzes if it needs current data to answer properly
    # More context-aware than keyword detection
    pass

Most context-aware, but adds an extra LLM call (latency and cost) to each message
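One way to fill in this stub is a small routing prompt that asks the model for a YES/NO verdict before answering. The sketch below is an assumption, not the project's code: llm_wants_search, ROUTER_PROMPT and the ask_llm callable (any function that takes a prompt and returns the model's text) are hypothetical names.

```python
from typing import Callable

ROUTER_PROMPT = (
    "Decide whether answering the user's message requires up-to-date "
    "information from the web. Reply with exactly YES or NO.\n\n"
    "Message: {message}"
)


def llm_wants_search(user_message: str, ask_llm: Callable[[str], str]) -> bool:
    """Ask the model whether it needs a web search; anything but YES means no."""
    reply = ask_llm(ROUTER_PROMPT.format(message=user_message))
    return reply.strip().upper().startswith("YES")
```

Passing the LLM client in as a callable keeps the decision logic testable with a fake model.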

Strategy 4: Hybrid Approach

Multiple Signals

def should_search(user_message: str, conversation_history: list) -> bool:
    # Cheapest signal first: a keyword hit skips the slower checks
    text = user_message.lower()
    if any(keyword in text for keyword in SEARCH_KEYWORDS):
        return True
    # classify_intent and get_llm_confidence are placeholders returning scores in [0, 1]
    if classify_intent(user_message) > 0.7:
        return True
    return get_llm_confidence(user_message) < 0.5

Broadest coverage, but three signals and two thresholds to tune

Strategy 5: User-Controlled

Manual Search Button

"Search Web" button in UI
User explicitly requests web search
Most reliable but requires interaction
Good for power users

4. Database Integration

Hybrid Search Integration

Current Capabilities

Vector database (FAISS) + Full-text search (FTS5)
BM25 ranking algorithm
Available in search page
Can be integrated into chat
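The deck does not say how the vector and full-text rankings are merged. Reciprocal rank fusion is one common option, sketched here as an assumption rather than as the project's actual method; the document ids and the constant k=60 are illustrative.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists of document ids; k dampens the weight of top ranks."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


vector_hits = ["doc_a", "doc_b", "doc_c"]   # e.g. nearest neighbours from FAISS
keyword_hits = ["doc_b", "doc_d", "doc_a"]  # e.g. BM25-ordered rows from FTS5
merged = reciprocal_rank_fusion([vector_hits, keyword_hits])
```

Because it only uses rank positions, this approach avoids having to put cosine similarities and BM25 scores on a common scale.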

Database Integration Strategies

3 Approaches

Automatic context retrieval - AI finds relevant data
User-triggered search - Manual knowledge base search
Smart context injection - Domain-specific context

Strategy A: Automatic Context Retrieval

AI-Powered Data Access

def get_relevant_context(user_message: str, conversation_id: str) -> str:
    # Search for relevant documents/conversations
    search_results = hybrid_search_service.search(user_message, limit=5)

    # Format results for LLM context
    context = format_search_results_for_llm(search_results)
    return context

Seamless but can be unpredictable
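The formatting helper is referenced above but not shown. A minimal sketch might look like the following, assuming each search result is a dict with "title" and "content" keys (the real result shape may differ) and that long hits should be truncated to protect the prompt budget.

```python
def format_search_results_for_llm(search_results, max_chars_per_result=500):
    """Render results as a numbered block the LLM can cite by index."""
    if not search_results:
        return ""
    lines = ["Relevant context from the knowledge base:"]
    for index, result in enumerate(search_results, start=1):
        snippet = result["content"][:max_chars_per_result].strip()
        lines.append(f"[{index}] {result['title']}: {snippet}")
    return "\n".join(lines)
```

Returning an empty string when nothing matches lets the caller skip the context section entirely.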

Strategy B: User-Triggered Search

Manual Knowledge Base Access

"Search Knowledge Base" button
User explicitly searches their data
Most reliable but requires interaction
Good for specific queries

Strategy C: Smart Context Injection

Domain-Specific Context

def smart_context_injection(user_message: str, conversation_history: list) -> str:
    if is_asking_about_customers(user_message):
        return get_customer_context(user_message)
    elif is_asking_about_projects(user_message):
        return get_project_context(user_message)
    # ... other domain-specific contexts
    return ""  # no domain matched: inject nothing

Intelligent and contextual
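The domain checks used above are left undefined in the deck. The simplest possible version is a vocabulary lookup, sketched below; the term sets are invented for illustration, and an intent classifier or an LLM call could replace them later.

```python
import re

# Hypothetical vocabularies; tune these to the business domain.
CUSTOMER_TERMS = {"customer", "customers", "client", "clients", "account"}
PROJECT_TERMS = {"project", "projects", "milestone", "deadline"}


def _mentions(user_message: str, terms: set) -> bool:
    """True if any whole word of the message is in the given vocabulary."""
    words = set(re.findall(r"[a-z]+", user_message.lower()))
    return not words.isdisjoint(terms)


def is_asking_about_customers(user_message: str) -> bool:
    return _mentions(user_message, CUSTOMER_TERMS)


def is_asking_about_projects(user_message: str) -> bool:
    return _mentions(user_message, PROJECT_TERMS)
```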

Customer Database Integration

Business Use Cases

"Find customers similar to Acme Corp"
"What products do our enterprise clients use?"
"Show me customers who haven't been contacted in 6 months"

Customer Integration Implementation

Data Flow

Data Preparation - Convert customer data to embeddings
Search Integration - Use hybrid search to find relevant customers
Context Formatting - Format results for LLM consumption
Response Enhancement - LLM provides human-like responses with data

Customer Service Example

Implementation

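class CustomerContextService:
    def __init__(self, hybrid_search):
        # hybrid_search: any object exposing search(query, content_type=...)
        self.hybrid_search = hybrid_search

    def get_customer_context(self, query: str) -> str:
        # Search customer database using hybrid search
        results = self.hybrid_search.search(query, content_type="customer")
        if not results:
            return "No matching customers found."

        # Format for LLM: one line per customer
        lines = [f"- {customer.name}: {customer.summary}" for customer in results]
        return "Relevant customers:\n" + "\n".join(lines) + "\n"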

Document Integration

Supported Document Types

PDFs - Contracts, reports, manuals
Word documents - Proposals, specifications
Text files - Notes, procedures
Web pages - Company knowledge base

Document Processing Pipeline

Step-by-Step

from typing import List

class DocumentProcessor:
    # Chunk, extract_text, chunk_text, generate_embeddings and store_chunks
    # are project helpers defined elsewhere
    def process_document(self, file_path: str) -> List[Chunk]:
        # Extract text from document
        text = extract_text(file_path)

        # Split into chunks
        chunks = chunk_text(text)

        # Generate embeddings
        embeddings = generate_embeddings(chunks)

        # Store in database
        store_chunks(chunks, embeddings)
        return chunks
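The chunking step is where most of the tuning happens. As a sketch only, here is a fixed-size character splitter with overlap; the deck does not specify the chunking strategy, so the chunk_size and overlap defaults are assumptions, and splitting on sentences or headings is a common alternative.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap slightly."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks
```

The overlap keeps a sentence that straddles a boundary retrievable from at least one chunk.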

Learning Path

Phase 1: Core Functionality

Study streaming implementation (services/chat_service.py)
Understand context management (services/chat_history_service.py)
Test different search triggering strategies

Phase 2: Database Integration

Integrate hybrid search into chat responses
Add customer database connectivity
Implement document processing pipeline

Phase 3: Advanced Features

Add intent classification for smarter search triggering
Implement multi-modal document support
Create domain-specific context injection

Phase 4: Production Features

Add conversation analytics and user behavior tracking
Implement conversation summarization
Add conversation export and sharing

Architecture Overview

Current Architecture

User Input → Chat Service → OpenRouter API → Streaming Response
                ↓
        Chat History Service → Database Storage
                ↓
        Web Search Service → Tavily API (when triggered)

Enhanced Architecture

With Database Integration

User Input → Chat Service → OpenRouter API → Streaming Response
                ↓
        Chat History Service → Database Storage
                ↓
        Hybrid Search Service → Vector DB + FTS5
                ↓
        Document Service → Document Processing
                ↓
        Customer Service → Customer Database

Key Services

Service Architecture

ChatService - Core chat functionality and LLM integration
WebSearchService - Internet search with smart triggering
ChatHistoryService - Conversation management and context
HybridSearchService - Vector and text search capabilities
DocumentService - Document processing and retrieval
CustomerService - Customer data integration

Implementation Strategies

1. Gradual Enhancement

Start with existing functionality
Add one feature at a time
Test and validate each addition
Build on successful patterns

2. User-Centric Design

Focus on user experience
Make features discoverable
Provide clear feedback
Allow user control

3. Performance Considerations

Cache frequently accessed data
Optimize database queries
Use async/await patterns
Implement proper error handling
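To illustrate the async and error-handling points together, here is a sketch that queries two context sources concurrently and tolerates one of them failing. The two fetch functions are hypothetical stand-ins (they only sleep and return a label); in the real application they would call the web search and hybrid search services.

```python
import asyncio


async def fetch_web_results(query: str) -> list[str]:
    await asyncio.sleep(0.01)  # stands in for a network call
    return [f"web:{query}"]


async def fetch_kb_results(query: str) -> list[str]:
    await asyncio.sleep(0.01)  # stands in for a database search
    return [f"kb:{query}"]


async def gather_context(query: str, timeout: float = 2.0) -> list[str]:
    """Run both lookups concurrently; a slow or failing source is skipped."""
    results = await asyncio.gather(
        asyncio.wait_for(fetch_web_results(query), timeout),
        asyncio.wait_for(fetch_kb_results(query), timeout),
        return_exceptions=True,  # exceptions are returned instead of raised
    )
    return [
        item
        for result in results
        if not isinstance(result, BaseException)
        for item in result
    ]
```

With this shape, total latency is bounded by the slowest source (or the timeout) rather than the sum of both calls.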

Advanced Concepts

Multi-Agent Systems

Different agents for different tasks
Specialized search agents
Document analysis agents
Customer service agents

Context-Aware Responses

Understand conversation history
Maintain user preferences
Adapt to user behavior
Provide personalized responses

Real-Time Collaboration

Multiple users in same conversation
Real-time updates
Conflict resolution
Shared context

Next Steps

Immediate Actions

Experiment with different search triggering strategies
Integrate hybrid search into chat responses
Add document processing capabilities
Implement customer database connectivity
Test and optimize performance
Deploy to production environment

Key Takeaways

What Makes This Superior

Real-time streaming - Better user experience
Smart context management - Maintains conversation flow
External data integration - Access to current information
Business context - Connects to your data
Extensible architecture - Easy to enhance

Ready to Build?

Start Learning

The best way to learn is by building!

Start with small enhancements and gradually add more sophisticated features.

Let's create something amazing! 🚀