🔄 Conversational Routing Architecture

Intelligently Determining When to Fetch New Data vs. Continue Dialogue

🎯 Intent Classification 🧠 Context Awareness ⚡ Real-Time Decision 🔄 State Management

🎯 The Architectural Challenge

🔴 The Core Problem

Context: PeopleWorksGPT is evolving from simple grid/pivot output to rich conversational experiences like Claude, ChatGPT, and Gemini.

Current Flow:

  1. Step 1 (Data Fetch): User question → Schema + Hints → SQL Generation → Execute → ExpandedObject
  2. Step 2 (Rich Output): Data + New Prompt → LLM → Enriched text with charts, explanations, insights

The Decision Point: When a user asks a follow-up question, how do we determine:

  • Execute Step 1 again? (New data query needed)
  • Continue with existing data? (Step 2 only with current context)
Key Targets:

  • Processing Steps: 2
  • Classification Accuracy Needed: 85%
  • Decision Latency Target: < 200ms

💡 The Recommended Solution: Hybrid Intent Classifier

✅ Multi-Layer Routing Architecture

A three-tier decision system combining heuristics, context analysis, and LLM classification:

  1. Tier 1 - Fast Heuristics: Pattern matching for obvious cases (~70% of scenarios)
  2. Tier 2 - Context Analysis: Evaluate conversation state and data availability
  3. Tier 3 - LLM Classifier: Intelligent routing for ambiguous cases
graph TD
    A[User Follow-up Question] --> B{Tier 1: Fast Heuristics}
    
    B -->|Contains Data Keywords| C[NEW_DATA Required]
    B -->|Contains Visualization Keywords| D[CONTINUE with Data]
    B -->|Ambiguous| E{Tier 2: Context Analysis}
    
    E -->|Data Still Valid| F{Covers Current Scope}
    E -->|Data Expired/Invalid| C
    
    F -->|Yes Within Scope| D
    F -->|No Different Scope| C
    F -->|Uncertain| G{Tier 3: LLM Classifier}
    
    G -->|Analyze Intent| H[Classification Prompt]
    H --> I{Decision}
    I -->|NEW_DATA| C
    I -->|CONTINUE| D
    
    C --> J[Execute Step 1: SQL Generation]
    D --> K[Execute Step 2: Rich Output Only]
    
    J --> L[Return with Fresh Data]
    K --> M[Return with Existing Data]
    
    style C fill:#fc8181,stroke:#e53e3e,stroke-width:3px
    style D fill:#9ae6b4,stroke:#38a169,stroke-width:3px
    style G fill:#fef08a,stroke:#d69e2e,stroke-width:3px
    style J fill:#fecaca,stroke:#f56565,stroke-width:2px
    style K fill:#bbf7d0,stroke:#48bb78,stroke-width:2px
                    

🚀 Tier 1: Fast Heuristic Patterns (< 10ms)

Pattern Detection Rules

| Pattern Type | Keywords/Indicators | Decision | Confidence | Example Questions |
|---|---|---|---|---|
| NEW_DATA Signals | "show", "get", "fetch"; "who", "what", "when", "where"; "find", "search", "list"; "compare", "analyze"; entity names (tables/columns) | NEW_DATA | 95% | "Show me sales from 2024"; "What are the top 10 customers?"; "Get employees in Marketing" |
| CONTINUE Signals | "explain", "why", "how"; "visualize", "chart", "graph"; "format as", "create a"; "summarize", "interpret"; "this data", "these results" | CONTINUE | 92% | "Explain why this happened"; "Create a pie chart with this"; "Summarize these findings" |
| Temporal Changes | different time period; "now show", "what about"; "instead of", "rather than" | NEW_DATA | 88% | "Now show 2023 instead"; "What about last quarter?"; "Change date range to May" |
| Drill-Down Requests | "more details on"; "breakdown of", "drill into"; specific record reference | NEW_DATA | 75% | "Show details for customer #123"; "Drill into that department"; "Breakdown by region" |
| Refinement Requests | "also include", "add"; "filter by", "exclude"; "sort by", "order by" | DEPENDS | 60% | "Also include inactive users"; "Sort by revenue DESC"; "Filter out cancelled orders" |
💡 Performance Insight: Fast heuristics handle ~70% of all routing decisions with minimal latency and high accuracy. This tier is crucial for maintaining responsive UX.
// C# Implementation: Tier 1 Heuristic Classifier
using System.Linq;

public enum RoutingDecision { NewData, Continue, Uncertain }

public class HeuristicClassifier
{
    private static readonly string[] NEW_DATA_KEYWORDS =
    {
        "show", "get", "fetch", "find", "search", "list",
        "who", "what", "when", "where", "which",
        "compare", "analyze", "calculate"
    };

    private static readonly string[] CONTINUE_KEYWORDS =
    {
        "explain", "why", "how", "visualize", "chart", "graph",
        "summarize", "format", "create a"
    };

    public RoutingDecision Classify(string userQuestion, string[] tableNames)
    {
        var normalized = userQuestion.ToLowerInvariant();

        // Check for entity names (tables/columns)
        if (tableNames.Any(t => normalized.Contains(t.ToLowerInvariant())))
        {
            return RoutingDecision.NewData;
        }

        // Count keyword matches
        var newDataScore = NEW_DATA_KEYWORDS.Count(k => normalized.Contains(k));
        var continueScore = CONTINUE_KEYWORDS.Count(k => normalized.Contains(k));

        if (newDataScore > continueScore && newDataScore >= 2)
        {
            return RoutingDecision.NewData;
        }

        if (continueScore > newDataScore && continueScore >= 1)
        {
            return RoutingDecision.Continue;
        }

        return RoutingDecision.Uncertain;
    }
}

🔍 Tier 2: Context Analysis (< 50ms)

Conversational State Evaluation

📊 Data Context State

Maintain a conversation state object tracking:

  • Last Query Metadata: Original question, generated SQL, execution timestamp
  • Data Scope: Tables queried, date ranges, filters applied
  • Result Summary: Row count, columns returned, data freshness
  • User Intent History: Last 3-5 questions to understand trajectory
// Conversation State Model
using System;
using System.Collections.Generic;
using System.Linq;

public class ConversationState
{
    public string OriginalQuestion { get; set; }
    public string GeneratedSQL { get; set; }
    public DateTime ExecutedAt { get; set; }
    public DataScope Scope { get; set; }
    public ResultSummary Results { get; set; }
    public List<string> IntentHistory { get; set; }
}

public class DataScope
{
    public string[] TablesQueried { get; set; }
    public DateRange? DateFilter { get; set; }
    public Dictionary<string, string> AppliedFilters { get; set; }
}

public class ContextAnalyzer
{
    public RoutingDecision AnalyzeContext(string newQuestion, ConversationState currentState)
    {
        // 1. Data Freshness Check
        var dataAge = DateTime.Now - currentState.ExecutedAt;
        if (dataAge.TotalMinutes > 15)
        {
            return RoutingDecision.NewData; // Data too old
        }

        // 2. Scope Compatibility Check
        var mentionedEntities = ExtractEntityNames(newQuestion);
        var isWithinScope = mentionedEntities
            .All(e => currentState.Scope.TablesQueried.Contains(e));
        if (!isWithinScope)
        {
            return RoutingDecision.NewData; // Different data needed
        }

        // 3. Intent Trajectory Analysis
        var lastIntent = currentState.IntentHistory.LastOrDefault();
        if (lastIntent == "data_fetch" && IsVisualizationRequest(newQuestion))
        {
            return RoutingDecision.Continue; // Logical follow-up
        }

        return RoutingDecision.Uncertain; // Need LLM classification
    }

    // ExtractEntityNames and IsVisualizationRequest are assumed to be implemented elsewhere
    // (e.g. schema-aware entity matching and simple keyword checks).
}

🌳 Context Decision Tree

  1. Is data expired? (> 15 min) → NEW_DATA
  2. Does question reference new entities? → NEW_DATA
  3. Are date/time filters different? → NEW_DATA (see the sketch after this list)
  4. Is it a visualization/formatting request? → CONTINUE
  5. Is it a clarification/explanation? → CONTINUE
  6. Still uncertain? → Escalate to Tier 3
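The date-filter comparison (check 3) is not covered by the ContextAnalyzer sketch above. A minimal version might look like the following; the regex-based year extraction and the assumption that DateRange is a reference type exposing Start and End dates are purely illustrative.

// Sketch for check 3: does the follow-up mention a time window different from the cached filter?
// Assumes DateRange is a reference type exposing Start/End DateTime properties (not defined above).
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class DateFilterCheck
{
    public static bool RequestsDifferentDateRange(string question, DataScope scope)
    {
        // Very rough signal: explicit four-digit years mentioned in the follow-up.
        var years = Regex.Matches(question, @"\b(19|20)\d{2}\b")
                         .Cast<Match>()
                         .Select(m => int.Parse(m.Value))
                         .ToList();

        if (scope.DateFilter == null)
            return years.Any();   // no cached filter, but the user now names a year

        // Any year outside the cached window suggests a new query is needed.
        return years.Any(y => y < scope.DateFilter.Start.Year || y > scope.DateFilter.End.Year);
    }
}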

🧠 Tier 3: LLM Intent Classifier (< 200ms)

The Classification Prompt

When Tiers 1 & 2 are uncertain, delegate to the LLM with a specialized classification prompt that returns structured output:

// LLM Classification Prompt Template
var classificationPrompt = $@"
You are a routing classifier for a business intelligence system.

CONTEXT:
- Previous Question: {currentState.OriginalQuestion}
- Data Scope: Tables: [{string.Join(", ", currentState.Scope.TablesQueried)}]
- Date Range: {currentState.Scope.DateFilter}
- Rows Returned: {currentState.Results.RowCount}
- Data Age: {dataAgeMinutes} minutes old

NEW USER QUESTION:
{newUserQuestion}

TASK:
Determine if this question requires:
1. NEW_DATA: Fetching new/different data from database
2. CONTINUE: Using existing data with new presentation/analysis

CLASSIFICATION RULES:
- NEW_DATA if: Different entities, time periods, filters, aggregations, or scope
- CONTINUE if: Visualization, formatting, explanation, or analysis of current data
- Consider: Data freshness, scope compatibility, logical conversation flow

RESPOND ONLY WITH VALID JSON:
{{
  ""decision"": ""NEW_DATA"" | ""CONTINUE"",
  ""confidence"": 0.0-1.0,
  ""reasoning"": ""brief explanation""
}}
";
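The structured JSON reply can then be deserialized into a typed result before the router acts on it. A minimal sketch using System.Text.Json; the LlmClassificationResult DTO and parser names are assumptions, and the parser defaults to NEW_DATA on anything unexpected (the safer choice).

// Sketch: parse the classifier's JSON reply (assumed DTO, not the production parser).
using System.Text.Json;
using System.Text.Json.Serialization;

public class LlmClassificationResult
{
    [JsonPropertyName("decision")]   public string Decision { get; set; }    // "NEW_DATA" | "CONTINUE"
    [JsonPropertyName("confidence")] public double Confidence { get; set; }
    [JsonPropertyName("reasoning")]  public string Reasoning { get; set; }
}

public static class LlmResponseParser
{
    public static (RoutingDecision Decision, double Confidence) Parse(string llmJson)
    {
        var result = JsonSerializer.Deserialize<LlmClassificationResult>(llmJson);

        // Anything other than an explicit CONTINUE falls back to NEW_DATA.
        var decision = result?.Decision == "CONTINUE"
            ? RoutingDecision.Continue
            : RoutingDecision.NewData;

        return (decision, result?.Confidence ?? 0.0);
    }
}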
sequenceDiagram
    participant User
    participant Router
    participant Tier1 as Tier 1
    participant Tier2 as Tier 2
    participant LLM as Tier 3 LLM
    participant Engine

    User->>Router: Follow-up Question
    Router->>Tier1: Fast Heuristics
    
    alt Clear Pattern Match
        Tier1-->>Router: NEW_DATA or CONTINUE
        Router->>Engine: Execute Decision
    else Uncertain
        Router->>Tier2: Context Analysis
        
        alt Context Determines
            Tier2-->>Router: NEW_DATA or CONTINUE
            Router->>Engine: Execute Decision
        else Still Uncertain
            Router->>LLM: Classification Prompt
            LLM-->>Router: JSON Response
            Router->>Engine: Execute Decision
        end
    end
    
    Engine-->>User: Result
                    

⚡ Performance Optimization

  • Caching: Cache classification decisions for similar questions via embedding similarity (see the sketch after this list)
  • Parallel Processing: Run Tier 2 and Tier 3 concurrently, use fastest result
  • Model Selection: Use fast model (Claude Haiku, GPT-4o-mini) for classification
  • Streaming: Start response generation while classification completes
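One way the caching idea could look in practice (a sketch under assumptions, not an existing PeopleWorksGPT component): keep embeddings of recently routed questions and reuse the prior decision when cosine similarity clears a threshold. Producing the embedding itself is left to whatever embedding endpoint the system already calls.

// Hypothetical embedding-similarity cache for routing decisions (sketch only).
using System;
using System.Collections.Generic;
using System.Linq;

public class RoutingDecisionCache
{
    private readonly List<(float[] Embedding, RoutingDecision Decision)> _entries = new();
    private const double SimilarityThreshold = 0.92;   // tune via telemetry

    public RoutingDecision? TryGet(float[] questionEmbedding)
    {
        // Reuse the decision of the most similar previous question, if it is close enough.
        var best = _entries
            .Select(e => (e.Decision, Score: CosineSimilarity(e.Embedding, questionEmbedding)))
            .OrderByDescending(x => x.Score)
            .FirstOrDefault();

        return best.Score >= SimilarityThreshold ? (RoutingDecision?)best.Decision : null;
    }

    public void Add(float[] questionEmbedding, RoutingDecision decision) =>
        _entries.Add((questionEmbedding, decision));

    private static double CosineSimilarity(float[] a, float[] b)
    {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB) + 1e-9);
    }
}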

🛠️ Complete Implementation Strategy

Step 1: Build State Manager

Create a ConversationStateManager that persists the following (a minimal sketch follows the list):

  • Last N queries with metadata
  • Current data context (tables, filters, results)
  • User interaction history
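A minimal in-memory sketch of such a manager, keyed by conversation ID and capped at the last five queries. The shape is an assumption for illustration; a production version would presumably persist to a cache or database.

// Minimal in-memory ConversationStateManager sketch (assumed shape, not production code).
using System.Collections.Concurrent;
using System.Collections.Generic;

public class ConversationStateManager
{
    private const int MaxHistory = 5;   // keep the last N intents per conversation
    private readonly ConcurrentDictionary<string, ConversationState> _states = new();

    public ConversationState Get(string conversationId) =>
        _states.TryGetValue(conversationId, out var state) ? state : null;

    // Store the state produced by the latest Step 1 execution and append to the intent trajectory.
    public void RecordQuery(string conversationId, ConversationState latest, string intent)
    {
        var history = Get(conversationId)?.IntentHistory ?? new List<string>();
        history.Add(intent);
        if (history.Count > MaxHistory)
            history.RemoveAt(0);

        latest.IntentHistory = history;
        _states[conversationId] = latest;
    }
}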

Step 2: Implement Routing Pipeline

using System.Threading.Tasks;

public class ConversationalRouter
{
    // _heuristicClassifier, _contextAnalyzer, and _llmClassifier are the injected Tier 1-3 components.

    public async Task<RoutingResult> RouteQuestion(string question, ConversationState state)
    {
        // Tier 1: Fast Heuristics
        var heuristicResult = _heuristicClassifier.Classify(question, state.Scope.TablesQueried);
        if (heuristicResult != RoutingDecision.Uncertain)
        {
            return new RoutingResult { Decision = heuristicResult, Method = "Heuristic", Latency = 5 };
        }

        // Tier 2: Context Analysis
        var contextResult = _contextAnalyzer.AnalyzeContext(question, state);
        if (contextResult != RoutingDecision.Uncertain)
        {
            return new RoutingResult { Decision = contextResult, Method = "Context", Latency = 35 };
        }

        // Tier 3: LLM Classification
        var llmResult = await _llmClassifier.ClassifyAsync(question, state);
        return new RoutingResult
        {
            Decision = llmResult.Decision,
            Method = "LLM",
            Latency = llmResult.Duration,
            Confidence = llmResult.Confidence
        };
    }
}

Step 3: Add Telemetry & Learning

Track routing decisions to improve over time (a telemetry sketch follows this list):

  • Log decision accuracy (did user correct us?)
  • Measure latency per tier
  • Identify patterns that should move from Tier 3 → Tier 1
  • A/B test threshold values
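The telemetry record itself can be small. The sketch below uses assumed names; the sink would be whatever logging/metrics stack PeopleWorksGPT already uses (Application Insights, Serilog, etc.).

// Hypothetical telemetry record for routing decisions (names are illustrative).
using System;

public class RoutingTelemetryEvent
{
    public DateTime Timestamp { get; set; }
    public string Question { get; set; }
    public RoutingDecision Decision { get; set; }
    public string Method { get; set; }          // "Heuristic" | "Context" | "LLM"
    public int LatencyMs { get; set; }
    public double? Confidence { get; set; }
    public bool? UserCorrected { get; set; }    // filled in later if the user re-asks or overrides
}

public interface IRoutingTelemetry
{
    // Implementations forward to the existing logging/metrics pipeline.
    void Record(RoutingTelemetryEvent routingEvent);
}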

Step 4: Handle Edge Cases

⚠️ Edge Case Strategies:
  • Ambiguous Refinements: "Add region column" → Try CONTINUE first, fall back to NEW_DATA if the data doesn't support it (see the sketch after this list)
  • User Corrections: "No, I meant..." → Always NEW_DATA
  • Multi-Intent Questions: "Show sales AND explain the trend" → NEW_DATA (primary intent)
  • Timeout/Failure: Default to NEW_DATA (safer choice)
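The "try CONTINUE first" fallback could look roughly like this sketch. The column check is an assumption: it presumes ResultSummary exposes the column names of the cached result, which is not shown above.

// Sketch of the ambiguous-refinement fallback: try CONTINUE first, fall back to NEW_DATA.
using System.Linq;

public class RefinementResolver
{
    public RoutingDecision Resolve(string question, ConversationState state)
    {
        // If the cached result already contains what the refinement refers to, stay on CONTINUE.
        if (CanAnswerFromCurrentData(question, state))
            return RoutingDecision.Continue;

        // Otherwise (e.g. "Add region column" when region was never selected),
        // regenerate the SQL with the new requirement.
        return RoutingDecision.NewData;
    }

    // Rough check: does the refinement mention a column that was already returned?
    // Assumes ResultSummary exposes a Columns collection (hypothetical).
    private static bool CanAnswerFromCurrentData(string question, ConversationState state)
    {
        var normalized = question.ToLowerInvariant();
        return state.Results.Columns.Any(c => normalized.Contains(c.ToLowerInvariant()));
    }
}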

📊 Strategy Comparison: Alternative Approaches

| Strategy | Accuracy | Latency | Cost | Complexity | Pros | Cons |
|---|---|---|---|---|---|---|
| Always NEW_DATA | 100% | High | High | Low | Simple, no errors | Wasteful, slow, expensive |
| Heuristics Only | 65-75% | < 10ms | Free | Low | Fast, cheap | Misses nuanced cases |
| LLM Only | 90-95% | 150-300ms | $0.002/call | Medium | Accurate, flexible | Slower, costs add up |
| Hybrid (Recommended) | 88-93% | 20-100ms | $0.0003/call avg | Medium | Best balance of speed, accuracy, cost | More code to maintain |
| User Selection | 100% | 0ms | Free | Low | Perfect accuracy, user control | Requires UI elements, friction |

✅ Hybrid Approach Advantages

  • Performance: 70% of requests route in < 10ms via heuristics
  • Accuracy: 88-93% correct routing with graceful degradation
  • Cost-Effective: LLM calls only for 15-20% of difficult cases
  • Self-Improving: Telemetry identifies patterns → migrate to heuristics
  • User Experience: Fast responses without decision friction

❌ Potential Challenges

  • Complexity: Three-tier system requires more engineering
  • Maintenance: Heuristic rules need periodic updates
  • Edge Cases: 7-12% misclassifications need fallback handling
  • State Management: Must persist conversation context reliably
  • Testing: Requires comprehensive test coverage across tiers

🎨 User Experience Enhancement Options

Option 1: Invisible Routing (Recommended)

System makes decision transparently. User never sees routing logic.

  • ✅ Seamless conversational flow
  • ✅ No cognitive load on user
  • ❌ Occasional misrouting (handle with "Not what you meant? Try rephrasing")

Option 2: Smart Suggestions

When uncertain, show user quick-action buttons:

Your question: "Show regional breakdown" [🔄 Fetch New Data] [📊 Visualize Current Data]
  • ✅ User has final say
  • ✅ Doubles as feedback mechanism
  • ❌ Adds UI friction for ambiguous cases

Option 3: Hybrid with Override

Auto-route but provide a subtle "Using current data" indicator with an option to refresh (a response-contract sketch follows this option):

Using data from your last query (2 min ago) [Refresh Data]
  • ✅ Best of both worlds
  • ✅ Transparency without friction
  • ✅ Easy corrections
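One way to carry this through the API (an assumed response contract, not an existing PeopleWorksGPT type): let the routed response expose data-provenance fields that the UI renders as the indicator, with the refresh action simply re-submitting the question with NEW_DATA forced.

// Sketch: response fields the UI can render as the "Using current data" indicator (assumed contract).
using System;

public class RoutedResponse
{
    public string Content { get; set; }             // the rich output shown to the user
    public bool UsedExistingData { get; set; }      // true when the router chose CONTINUE
    public DateTime DataFetchedAt { get; set; }     // when the underlying query actually ran (UTC)

    public string ProvenanceLabel =>
        UsedExistingData
            ? $"Using data from your last query ({(int)(DateTime.UtcNow - DataFetchedAt).TotalMinutes} min ago)"
            : "Using freshly fetched data";
}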

🎯 Real-World Routing Scenarios

| Conversation Flow | Routing Decision | Reasoning | Tier Used |
|---|---|---|---|
| Q1: "Show 2024 sales by region" → Q2: "Create a bar chart" | CONTINUE | Visualization request, data still valid | Tier 1 (Heuristic) |
| Q1: "Show top 10 customers" → Q2: "Now show bottom 10" | NEW_DATA | Different data subset required | Tier 1 (Heuristic) |
| Q1: "Analyze Q3 performance" → Q2: "Why did Region A underperform?" | CONTINUE | Explanation within existing scope | Tier 2 (Context) |
| Q1: "Show employee count" → Q2: "Also include their salaries" | NEW_DATA | Additional columns = new SQL needed | Tier 2 (Context) |
| Q1: "Show this month's orders" → Q2: "Compare with last month" | NEW_DATA | New time range = new query | Tier 1 (Heuristic) |
| Q1: "Get product inventory" → Q2: "How is this calculated?" | CONTINUE | Meta-question about data/process | Tier 1 (Heuristic) |
| Q1: "Show sales performance" → Q2: "Break down by product category" | AMBIGUOUS | Could be in current data OR need new query | Tier 3 (LLM) |
🎓 Learning Insight: Notice how ambiguous cases (~15% of queries) require LLM classification, while clear patterns (~70%) route instantly. This distribution justifies the hybrid architecture's cost-benefit tradeoff.
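These scenarios translate directly into routing regression tests. Below is a hedged xUnit-style sketch: it assumes the ConversationalRouter from the pipeline above can be constructed with its tiers wired up, and the StateAfter helper that fabricates the post-Q1 conversation state is hypothetical.

// xUnit-style routing regression tests for the scenarios above (sketch; helper names assumed).
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Xunit;

public class RoutingScenarioTests
{
    [Theory]
    [InlineData("Show 2024 sales by region", "Create a bar chart", RoutingDecision.Continue)]
    [InlineData("Show top 10 customers", "Now show bottom 10", RoutingDecision.NewData)]
    [InlineData("Show this month's orders", "Compare with last month", RoutingDecision.NewData)]
    public async Task FollowUp_RoutesAsExpected(string q1, string q2, RoutingDecision expected)
    {
        var router = new ConversationalRouter();       // assumes tier components are wired up internally
        var state = StateAfter(q1);                    // hypothetical helper: state after Q1 executed

        var result = await router.RouteQuestion(q2, state);

        Assert.Equal(expected, result.Decision);
    }

    // Hypothetical: build a ConversationState as it would look after executing q1.
    private static ConversationState StateAfter(string question) =>
        new ConversationState
        {
            OriginalQuestion = question,
            ExecutedAt = DateTime.Now,
            IntentHistory = new List<string> { "data_fetch" },
            Scope = new DataScope { TablesQueried = new[] { "Sales", "Customers", "Orders" } }
        };
}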

🏆 Final Recommendation: Phased Implementation

gantt
    title PeopleWorksGPT Routing Implementation Roadmap
    dateFormat YYYY-MM-DD
    section Phase 1 Foundation
    State Manager Heuristics :done, p1, 2025-01-01, 2w
    Basic Routing Pipeline :done, p2, after p1, 1w
    Testing Infrastructure :active, p3, after p2, 1w
    
    section Phase 2 Intelligence
    Context Analysis :p4, after p3, 2w
    LLM Classifier Integration :p5, after p4, 1w
    Performance Optimization :p6, after p5, 1w
    
    section Phase 3 Refinement
    Telemetry Analytics :p7, after p6, 1w
    User Feedback Loop :p8, after p7, 2w
    AB Testing Framework :p9, after p8, 1w
    
    section Phase 4 Scale
    Pattern Learning System :p10, after p9, 2w
    Cost Optimization :p11, after p10, 1w
    Production Launch :milestone, p12, after p11, 0d
                    

✅ Recommended Implementation Path

Phase 1: MVP (Week 1-2)

  • ✅ Implement Tier 1 heuristics only
  • ✅ Default to NEW_DATA for uncertain cases
  • ✅ Add basic state management
  • ✅ Deploy with telemetry

Phase 2: Intelligence (Week 3-5)

  • ✅ Add Tier 2 context analysis
  • ✅ Integrate LLM classifier (Tier 3)
  • ✅ Optimize heuristic rules based on data
  • ✅ A/B test against always-NEW_DATA baseline

Phase 3: Optimization (Week 6-8)

  • ✅ Implement caching and parallel processing
  • ✅ Add user override options (smart suggestions)
  • ✅ Build automated pattern learning
  • ✅ Cost and performance optimization
🎯 Success Criteria:
  • Accuracy: ≥ 85% correct routing decisions
  • Performance: Average routing latency < 100ms
  • Cost: < $0.0005 per conversation on average
  • User Satisfaction: < 10% regeneration rate
  • Coverage: ≥ 70% routed via Tier 1 (heuristics)