AI Chat UI: A Learning Guide
🎯 Overview
This guide explains how our FastAPI-based FastOpp application implements a ChatGPT/Claude-like AI chat interface with advanced features like streaming responses, internet search integration, and database connectivity. You'll learn about the technical architecture, implementation strategies, and potential enhancements for building production-ready chat applications.
FastOpp is an open-source learning tool for AI applications. It provides pre-built UI and admin components so students can spend more time on core AI concepts instead of configuring a full FastAPI stack, with authentication and UI, from scratch. Learn more at https://github.com/Oppkey/fastopp.
🧠 Core Chat Functionality
1. Streaming Responses for Interactivity
Why Streaming?

- Instead of waiting for complete responses, streaming provides real-time feedback
- Makes the chat feel more interactive and responsive
- Reduces perceived latency and improves user experience
Technical Implementation:
- Uses FastAPI's `EventSourceResponse` with OpenRouter's streaming API
- Sends chunks as they arrive from the LLM
- Frontend processes `data:` events for real-time UI updates
Code Flow:
`chat_with_llama_stream()` → `EventSourceResponse` → Frontend processes `data:` events → Real-time UI updates
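For orientation, here is a minimal sketch of what a streaming endpoint along these lines can look like. It assumes the `sse-starlette` package (which provides `EventSourceResponse`) and the `openai` client pointed at OpenRouter's OpenAI-compatible endpoint; the route path, function names, and model id are illustrative rather than FastOpp's actual code:

```python
from fastapi import FastAPI
from sse_starlette.sse import EventSourceResponse
from openai import AsyncOpenAI

app = FastAPI()

# Assumption: OpenRouter exposes an OpenAI-compatible API at this base URL.
client = AsyncOpenAI(base_url="https://openrouter.ai/api/v1", api_key="YOUR_KEY")

@app.get("/chat/stream")
async def chat_stream(message: str):
    async def event_generator():
        # Request a streamed completion; chunks arrive as they are generated.
        stream = await client.chat.completions.create(
            model="meta-llama/llama-3.3-70b-instruct",  # illustrative model id
            messages=[{"role": "user", "content": message}],
            stream=True,
        )
        async for chunk in stream:
            delta = chunk.choices[0].delta.content
            if delta:
                # Each yield becomes one SSE `data:` event on the wire.
                yield {"data": delta}

    return EventSourceResponse(event_generator())
```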
2. Chat History & Context Management
Context Window Strategy:

- This app sends the last 20 messages to the LLM for context
- Balances conversation continuity with token limits
- Llama 3.3 70B has a 128k-token context window, but we limit to 20 messages for fast responses and cost management
Smart Retrieval:
- Uses `get_conversation_context()` to fetch recent messages in chronological order (sketched below)
- Maintains conversation flow while managing token usage
- Database design: messages stored with `conversation_id`, `role`, `content`, and `created_at`
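A sketch of how `get_conversation_context()` might be implemented with SQLAlchemy's async 2.0 API. The `Message` model below simply mirrors the fields named above; the app's real schema and query may differ:

```python
from datetime import datetime
from sqlalchemy import select, String, DateTime
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column

class Base(DeclarativeBase):
    pass

# Hypothetical model mirroring the fields named above.
class Message(Base):
    __tablename__ = "messages"
    id: Mapped[int] = mapped_column(primary_key=True)
    conversation_id: Mapped[str] = mapped_column(String)
    role: Mapped[str] = mapped_column(String)
    content: Mapped[str] = mapped_column(String)
    created_at: Mapped[datetime] = mapped_column(DateTime)

async def get_conversation_context(
    session: AsyncSession, conversation_id: str, limit: int = 20
) -> list[dict]:
    # Fetch the most recent messages, then reverse into chronological order.
    result = await session.execute(
        select(Message)
        .where(Message.conversation_id == conversation_id)
        .order_by(Message.created_at.desc())
        .limit(limit)
    )
    messages = list(result.scalars())
    messages.reverse()
    # Shape matches what chat-completion APIs expect.
    return [{"role": m.role, "content": m.content} for m in messages]
```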
3. Internet Search Integration
Current Implementation: Keyword-Based Triggering

- Automatically detects when users ask about current events using keywords
- Keywords: "latest", "current", "2024", "news", "price", "weather", etc. (a minimal check is sketched below)
- Uses Tavily API for structured, LLM-optimized search results
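The check itself can be a few lines; a minimal sketch, with an abbreviated keyword list and an illustrative function name:

```python
SEARCH_KEYWORDS = {"latest", "current", "2024", "news", "price", "weather"}

def needs_web_search(user_message: str) -> bool:
    # Trigger a web search when any time-sensitive keyword appears.
    lowered = user_message.lower()
    return any(keyword in lowered for keyword in SEARCH_KEYWORDS)
```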
Alternative Search Triggering Strategies:
Strategy 1: Intent Classification
```python
def classify_search_intent(user_message: str) -> bool:
    # Use a lightweight ML model to classify user intent.
    # Train a model to detect when users need current information;
    # more sophisticated than keyword matching.
    pass
```
Strategy 2: LLM-Based Decision Making
```python
def should_search_web(user_message: str, llm_response: str) -> bool:
    # Ask the LLM itself whether it needs current data to answer properly.
    # More context-aware than keyword detection.
    pass
```
Strategy 3: Hybrid Approach
```python
def should_search(user_message: str, conversation_history: list) -> bool:
    # Combine multiple signals: keywords, intent score, and LLM confidence.
    keyword_match = any(keyword in user_message.lower() for keyword in SEARCH_KEYWORDS)
    intent_score = classify_intent(user_message)
    llm_confidence = get_llm_confidence(user_message)
    return keyword_match or (intent_score > 0.7) or (llm_confidence < 0.5)
```
Strategy 4: User-Controlled Search
- Add a "Search Web" button to the UI
- Let users explicitly request web search
- Most reliable but requires user interaction
Strategy 5: Time-Based Triggers
- Automatically search for topics that are likely to change over time
- Use embeddings to detect when users ask about "current" topics (see the sketch below)
- More sophisticated than simple keyword matching
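For the embedding idea in Strategy 5, one option is to compare the user's message against a few prototype time-sensitive queries. This sketch assumes the `sentence-transformers` package, which is not necessarily part of FastOpp, and an arbitrary similarity threshold:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

# Prototype queries that typically require fresh information.
TIME_SENSITIVE_EXAMPLES = [
    "what is the latest news about",
    "current stock price of",
    "weather forecast for today",
]
prototype_embeddings = model.encode(TIME_SENSITIVE_EXAMPLES)

def is_time_sensitive(user_message: str, threshold: float = 0.5) -> bool:
    # Trigger a search when the message is semantically close to any prototype.
    query_embedding = model.encode(user_message)
    scores = util.cos_sim(query_embedding, prototype_embeddings)
    return bool(scores.max() > threshold)
```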
🔍 Advanced Database Integration
1. Hybrid Search Integration
Current Implementation:

- Vector database (FAISS) + full-text search (FTS5) + BM25 ranking
- Available in the search page for document retrieval
- Can be integrated into chat for context-aware responses
Integration Strategies:
Strategy A: Automatic Context Retrieval
```python
def get_relevant_context(user_message: str, conversation_id: str) -> str:
    # Search for relevant documents/conversations.
    search_results = hybrid_search_service.search(user_message, limit=5)
    # Format results for LLM context.
    context = format_search_results_for_llm(search_results)
    return context
```
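Once retrieved, the context is typically prepended to the system prompt ahead of the conversation history. A sketch building on the (assumed) helpers above; the prompt wording is illustrative:

```python
def build_messages(user_message: str, conversation_id: str, history: list[dict]) -> list[dict]:
    context = get_relevant_context(user_message, conversation_id)
    system_prompt = (
        "You are a helpful assistant. Use the following retrieved context "
        "when it is relevant, and say so when it is not:\n\n" + context
    )
    # System prompt first, then prior turns, then the new user message.
    return (
        [{"role": "system", "content": system_prompt}]
        + history
        + [{"role": "user", "content": user_message}]
    )
```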
Strategy B: User-Triggered Search
- Add a "Search Knowledge Base" button
- Let users explicitly search their data
- Most reliable but requires user interaction
Strategy C: Smart Context Injection
```python
def smart_context_injection(user_message: str, conversation_history: list) -> str:
    # Analyze whether the user is asking about specific domains.
    if is_asking_about_customers(user_message):
        return get_customer_context(user_message)
    elif is_asking_about_projects(user_message):
        return get_project_context(user_message)
    # ... other domain-specific contexts
    return ""  # no domain-specific context found
```
2. Customer Database Integration
Use Cases:

- "Find customers similar to Acme Corp"
- "What products do our enterprise clients use?"
- "Show me customers who haven't been contacted in 6 months"
Implementation Approach:

1. Data Preparation: Convert customer data to embeddings
2. Search Integration: Use hybrid search to find relevant customers
3. Context Formatting: Format results for LLM consumption
4. Response Enhancement: LLM provides human-like responses with data
Example Implementation:
```python
class CustomerContextService:
    def get_customer_context(self, query: str) -> str:
        # Search the customer database using hybrid search.
        results = self.hybrid_search.search(query, content_type="customer")
        # Format for the LLM.
        context = "Relevant customers:\n"
        for customer in results:
            context += f"- {customer.name}: {customer.summary}\n"
        return context
```
3. Document Integration
Document Types to Support:

- PDFs (contracts, reports, manuals)
- Word documents (proposals, specifications)
- Text files (notes, procedures)
- Web pages (company knowledge base)
Implementation Steps:
Step 1: Document Processing Pipeline
```python
class DocumentProcessor:
    def process_document(self, file_path: str) -> List[Chunk]:
        # Extract text from the document.
        text = extract_text(file_path)
        # Split into chunks.
        chunks = chunk_text(text)
        # Generate embeddings.
        embeddings = generate_embeddings(chunks)
        # Store in the database.
        store_chunks(chunks, embeddings)
        return chunks
```
Step 2: Document Search Integration
```python
def search_documents(user_message: str) -> str:
    # Use hybrid search to find relevant documents.
    results = hybrid_search.search(user_message, content_type="document")
    # Format for LLM context.
    context = format_document_results(results)
    return context
```
Step 3: Smart Document Retrieval
```python
def get_relevant_documents(user_message: str, conversation_history: list) -> str:
    # Analyze the conversation context.
    context_keywords = extract_keywords(conversation_history)
    # Search with the added context.
    search_query = f"{user_message} {context_keywords}"
    results = search_documents(search_query)
    return results
```
🛠️ Learning Path: Building Advanced Chat Features
Phase 1: Core Functionality
- Study the streaming implementation (`services/chat_service.py`)
- Understand context management (`services/chat_history_service.py`)
- Test different search triggering strategies
Phase 2: Database Integration
- Integrate hybrid search into chat responses
- Add customer database connectivity
- Implement document processing pipeline
Phase 3: Advanced Features
- Add intent classification for smarter search triggering
- Implement multi-modal document support
- Create domain-specific context injection
Phase 4: Production Features
- Add conversation analytics and user behavior tracking
- Implement conversation summarization
- Add conversation export and sharing
🏗️ Architecture Deep Dive
Current Architecture
```
User Input → Chat Service → OpenRouter API → Streaming Response
                 ↓
        Chat History Service → Database Storage
                 ↓
        Web Search Service → Tavily API (when triggered)
```
Enhanced Architecture
```
User Input → Chat Service → OpenRouter API → Streaming Response
                 ↓
        Chat History Service → Database Storage
                 ↓
        Hybrid Search Service → Vector DB + FTS5
                 ↓
        Document Service → Document Processing
                 ↓
        Customer Service → Customer Database
```
Key Services
- ChatService: Core chat functionality and LLM integration
- WebSearchService: Internet search with smart triggering
- ChatHistoryService: Conversation management and context
- HybridSearchService: Vector and text search capabilities
- DocumentService: Document processing and retrieval
- CustomerService: Customer data integration
📊 Implementation Strategies
1. Gradual Enhancement Approach
- Start with existing functionality
- Add one feature at a time
- Test and validate each addition
- Build on successful patterns
2. User-Centric Design
- Focus on user experience
- Make features discoverable
- Provide clear feedback
- Allow user control
3. Performance Considerations
- Cache frequently accessed data (see the sketch after this list)
- Optimize database queries
- Use async/await patterns
- Implement proper error handling
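As a concrete example of the caching point, here is a minimal sketch of an async TTL cache around context retrieval; the TTL value and the `fetch_context` helper are assumptions for illustration, not FastOpp code:

```python
import time

_cache: dict[str, tuple[float, str]] = {}
_CACHE_TTL_SECONDS = 60.0

async def fetch_context(conversation_id: str) -> str:
    # Placeholder for the real async database query.
    return f"context for {conversation_id}"

async def get_cached_context(conversation_id: str) -> str:
    # Serve from cache while the entry is fresh; otherwise refetch.
    now = time.monotonic()
    hit = _cache.get(conversation_id)
    if hit and now - hit[0] < _CACHE_TTL_SECONDS:
        return hit[1]
    context = await fetch_context(conversation_id)
    _cache[conversation_id] = (now, context)
    return context
```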
🎯 Next Steps for Learning
- Experiment with different search triggering strategies
- Integrate hybrid search into chat responses
- Add document processing capabilities
- Implement customer database connectivity
- Test and optimize performance
- Deploy to production environment
💡 Advanced Concepts
1. Multi-Agent Systems
- Different agents for different tasks
- Specialized search agents
- Document analysis agents
- Customer service agents
2. Context-Aware Responses
- Understand conversation history
- Maintain user preferences
- Adapt to user behavior
- Provide personalized responses
3. Real-Time Collaboration
- Multiple users in same conversation
- Real-time updates
- Conflict resolution
- Shared context
Remember: The best way to learn is by building! Start with small enhancements and gradually add more sophisticated features.