
AI Conversation System: Audit and Redesign — January 2026

Date: January 27, 2026
Scope: Comprehensive audit of AI conversation flow, context management, and memory systems
Status: COMPLETE — Recommendations ready for implementation


Executive Summary

MyStoryFlow is a Next.js application that helps seniors create life stories through AI-assisted conversations. This audit was conducted to assess the current state of the AI conversation system and align it with 2025 best practices for conversational AI.

Key Findings:

  1. DB Foundation (Phase 1 from Nov 2024) is fully intact and ready — pgvector, user_memory table, HNSW indexes, and search functions all verified working
  2. Critical gap: Context manager uses regex-based extraction — This is the weakest link in the system, missing semantic meaning and family relationships
  3. No cross-session memory — AI does not remember users across conversations
  4. Fixed 6-message context window — Long conversations lose important earlier details

Recommendation: Build custom pgvector solution (scored 4.10/5 in weighted decision matrix) rather than adopting external memory services.


Current State Assessment

What Works

| Component | Location | Status |
|---|---|---|
| Conversation API route | apps/web-app/app/api/conversation/route.ts | Basic flow working |
| AI service orchestration | apps/web-app/lib/ai/enhanced-server-ai-service.ts | Multi-provider support |
| Gemini provider | apps/web-app/lib/ai/providers/gemini-provider.ts | Primary provider, working |
| Book context service | apps/web-app/lib/ai/book-context-service.ts | Fetches story previews |
| Conversation tagging | AI service | Working |
| Quality analysis | AI service | Working |
| Usage tracking | Rate limits | Working |
| Prompt templates | Database | Admin-configurable |

What’s Weak (Critical Gaps)

| Gap | Impact | Severity |
|---|---|---|
| No cross-session memory | AI doesn’t remember user across conversations | Critical |
| Regex-based context extraction | Misses semantic meaning, family relationships | Critical |
| 6-message fixed context window | Long conversations lose important earlier details | High |
| No story/recording retrieval | Conversations can’t reference user’s existing content | High |
| No user profiling | AI doesn’t adapt to communication style over time | Medium |
| No conversation compaction | No strategy for token limits | Medium |
| No memory transparency | Users can’t see/edit what AI remembers | Low (MVP) |

Current Context Manager Analysis

File: apps/web-app/lib/conversation/context-manager.ts

Current Flow: User Message → Regex extraction → JSON summary → LLM

Problems:

  • Regex patterns miss semantic meaning
  • Only last 6 messages in context
  • No cross-session memory
  • No story/recording retrieval
  • Max 2,000 characters (loses detail)
  • Every API call sends full context (inefficient)

DB Foundation Verification (Phase 1 — COMPLETE)

Verified against live Supabase project qrlygafaejovxxlnkpxa on January 27, 2026:

| Component | Status | Notes |
|---|---|---|
| pgvector extension | Enabled | Version 0.8.0, working |
| stories.content_embedding vector(1536) | Column exists | No data yet — pending embedding pipeline |
| recordings.transcript_embedding vector(1536) | Column exists | No data yet — pending embedding pipeline |
| user_memory table | Exists | 14 columns including metadata jsonb |
| HNSW index on stories | idx_stories_embedding | Created, ready |
| HNSW index on recordings | idx_recordings_embedding | Created, ready |
| HNSW index on user_memory | idx_user_memory_embedding | Created, ready |
| search_unified_context() SQL function | Exists | Ready to use |
| RLS policies on user_memory | Configured | Users can only access own memories |
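As a sketch of how the application layer might call the verified search_unified_context() function: the snippet below assumes a Supabase-style client exposing rpc(), and the parameter names (p_user_id, p_query_embedding, p_match_count) are illustrative assumptions, not the verified function signature.

```typescript
// Minimal client interface so the sketch stays self-contained;
// in the app this would be the Supabase client.
interface RpcClient {
  rpc(
    fn: string,
    args: Record<string, unknown>
  ): Promise<{ data: unknown; error: { message: string } | null }>
}

// Hypothetical wrapper around the search_unified_context() SQL function.
// Parameter names are assumptions for illustration.
async function searchUnifiedContext(
  client: RpcClient,
  userId: string,
  queryEmbedding: number[],
  matchCount = 5
) {
  const { data, error } = await client.rpc('search_unified_context', {
    p_user_id: userId,
    p_query_embedding: queryEmbedding,
    p_match_count: matchCount,
  })
  if (error) throw new Error(`search_unified_context failed: ${error.message}`)
  return data
}
```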

user_memory Table Schema (Verified)

```sql
CREATE TABLE user_memory (
  id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id uuid NOT NULL REFERENCES auth.users(id) ON DELETE CASCADE,
  memory_type text NOT NULL CHECK (memory_type IN ('fact', 'preference', 'relationship', 'event', 'theme')),
  content text NOT NULL,
  embedding vector(1536),
  source_type text, -- 'conversation', 'story', 'recording', 'explicit'
  source_id uuid,
  importance_score numeric(3,2) DEFAULT 0.5,
  confidence numeric(3,2) DEFAULT 0.8,
  metadata jsonb DEFAULT '{}',
  last_referenced_at timestamptz DEFAULT now(),
  reference_count int DEFAULT 0,
  created_at timestamptz DEFAULT now(),
  updated_at timestamptz DEFAULT now()
);
```
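On the application side, an insert payload mirroring this schema might look like the sketch below. The buildMemoryRow helper is hypothetical (no such module exists yet); the field names and defaults come directly from the verified schema.

```typescript
// Memory types allowed by the CHECK constraint on user_memory
type MemoryType = 'fact' | 'preference' | 'relationship' | 'event' | 'theme'

// Shape of an insert payload for the user_memory table
interface UserMemoryInsert {
  user_id: string
  memory_type: MemoryType
  content: string
  embedding?: number[] // vector(1536)
  source_type?: 'conversation' | 'story' | 'recording' | 'explicit'
  source_id?: string
  importance_score?: number // numeric(3,2), schema default 0.5
  confidence?: number // numeric(3,2), schema default 0.8
  metadata?: Record<string, unknown>
}

// Hypothetical helper: builds a row with the same defaults the schema uses,
// so inserts made outside SQL stay consistent with the table definition.
function buildMemoryRow(
  userId: string,
  memoryType: MemoryType,
  content: string,
  overrides: Partial<UserMemoryInsert> = {}
): UserMemoryInsert {
  return {
    user_id: userId,
    memory_type: memoryType,
    content,
    importance_score: 0.5,
    confidence: 0.8,
    metadata: {},
    ...overrides,
  }
}
```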

Memory Layer Decision (Weighted Analysis)

Evaluated three approaches for the memory layer:

| Criteria | Weight | Custom pgvector | Mem0 Self-Hosted | Mem0 Cloud |
|---|---|---|---|---|
| Cost at MVP | 30% | 5 ($0 infra) | 3 (needs Qdrant) | 1 (per-op cost) |
| Implementation effort | 20% | 3 | 4 | 5 |
| Control and customization | 15% | 5 | 3 | 2 |
| Performance | 10% | 4 | 5 | 5 |
| Maintenance burden | 10% | 2 | 3 | 5 |
| Migration risk | 10% | 5 | 3 | 2 |
| Domain fit (stories) | 5% | 5 | 3 | 3 |
| Weighted Score | 100% | 4.10 | 3.30 | 2.95 |

Decision: Custom pgvector solution

Rationale:

  • Already available in Supabase (zero additional infrastructure cost)
  • Data stays in PostgreSQL (no sync issues, GDPR compliance easier)
  • Full control over memory extraction logic (story-domain customization)
  • Sufficient performance for projected scale (<1M vectors)
  • Easy migration path to dedicated vector DB if needed later

2025 Best Practices Applied

Based on research from Anthropic, Mem0, LangChain, and industry standards:

1. Three-Tier Context Strategy

A three-tier strategy for managing conversation context:

Tier 1: Recent Messages (full fidelity)

  • Last 8-10 conversation turns
  • Full message content preserved

Tier 2: Session Summaries (compressed)

  • Compressed summaries of older session content
  • Created after every 8 turns
  • Stores: key topics, emotional tone, decisions made

Tier 3: Cross-Session Memories (persistent)

  • Extracted facts, preferences, relationships
  • Stored in user_memory table with embeddings
  • Retrieved via semantic similarity

Implementation: CompactionService with configurable thresholds
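A minimal sketch of that compaction decision, assuming the configurable thresholds described above: keep the last N messages verbatim (Tier 1) and compact older ones into a Tier 2 summary once enough turns have accumulated. The class and field names here are illustrative, not the actual compaction-service.ts API.

```typescript
// Message shape assumed for the sketch
interface Msg {
  role: 'user' | 'assistant'
  content: string
}

// Hypothetical compaction policy: Tier 1 keeps `recentTurns` messages;
// older messages are summarized in batches of `compactEvery`.
class CompactionService {
  constructor(
    private recentTurns = 10, // Tier 1 size (full fidelity)
    private compactEvery = 8 // summarize after every 8 overflow turns
  ) {}

  // Compact only once at least one full batch has aged out of Tier 1
  shouldCompact(history: Msg[]): boolean {
    const overflow = history.length - this.recentTurns
    return overflow >= this.compactEvery
  }

  // Split history into the span to summarize and the span to keep verbatim
  split(history: Msg[]): { toSummarize: Msg[]; recent: Msg[] } {
    const cut = Math.max(0, history.length - this.recentTurns)
    return { toSummarize: history.slice(0, cut), recent: history.slice(cut) }
  }
}
```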

2. LLM-Based Memory Extraction (Replaces Regex)

The current regex-based extraction in context-manager.ts will be replaced:

Current (Regex):

```typescript
// Misses semantic meaning
const familyPattern = /\b(mother|father|grandmother|grandfather|sister|brother)\b/gi
```

New (LLM Extraction):

```typescript
// Uses a cheap model for structured extraction
const extraction = await gemini.extractStructured(message, {
  model: 'gemini-2.0-flash-lite', // Fastest, cheapest
  schema: {
    people: [{ name: string, relationship: string, details: string }],
    events: [{ description: string, timeframe: string, emotions: string[] }],
    preferences: [{ category: string, preference: string }],
    facts: [{ subject: string, fact: string }]
  }
})
```

3. Adaptive Context Window

Replace fixed 6-message limit with dynamic sizing:

```typescript
class AdaptiveContextWindow {
  private tokenBudget = 4000 // Adjustable per conversation

  buildContext(systemPrompt: Message, history: Message[]): Message[] {
    // Always account for the system prompt
    let tokens = this.countTokens(systemPrompt)
    const result: Message[] = []

    // Include recent messages (newest first) until the budget is exhausted;
    // copy before reversing so the caller's array is not mutated
    for (const msg of [...history].reverse()) {
      const msgTokens = this.countTokens(msg)
      if (tokens + msgTokens > this.tokenBudget) break
      result.unshift(msg)
      tokens += msgTokens
    }

    // Prepend a compressed summary of the excluded older messages
    if (result.length < history.length) {
      result.unshift(this.getSummary(history.slice(0, history.length - result.length)))
    }
    return result
  }
}
```

4. Memory Transparency

User-facing memory dashboard for trust and GDPR compliance:

  • “What do you know about me?” query handler in conversation
  • Memory dashboard in user settings
  • Edit/delete individual memories
  • Export all memories (GDPR data portability)
  • Clear all memories option
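The transparency operations above can be sketched against an abstract store, so each GDPR action (list, delete one, export, clear) is explicit. The MemoryStore interface is an assumption for illustration; the real implementation would be backed by the user_memory table.

```typescript
// Minimal record shape for the sketch
interface MemoryRecord {
  id: string
  memory_type: string
  content: string
}

// Hypothetical storage interface covering the dashboard's operations
interface MemoryStore {
  listByUser(userId: string): Promise<MemoryRecord[]>
  deleteOne(userId: string, memoryId: string): Promise<boolean>
  clearAll(userId: string): Promise<number> // returns count removed
}

// GDPR data portability: hand the user a machine-readable copy
// of everything the AI remembers about them.
async function exportMemories(store: MemoryStore, userId: string): Promise<string> {
  const memories = await store.listByUser(userId)
  return JSON.stringify({ exported_at: new Date().toISOString(), memories }, null, 2)
}
```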

5. Session Continuity

When resuming a conversation:

async resumeConversation(sessionId: string, userId: string) { // 1. Load session checkpoint (compressed summary) const checkpoint = await loadCheckpoint(sessionId) // 2. Retrieve relevant memories for current topic const memories = await searchMemories(userId, checkpoint.lastTopic) // 3. Find related stories/recordings const relatedContent = await searchUnifiedContext(userId, checkpoint.lastTopic) // 4. Build resumption context return { systemPrompt: buildResumptionPrompt(checkpoint, memories, relatedContent), suggestedFollowUp: generateFollowUp(checkpoint) } }

6. Anthropic Constitution Alignment

Following Anthropic’s guidance on AI assistants:

“Brilliant Friend” Model for Seniors:

  • Patient, unhurried responses
  • Gentle prompts, never pressuring
  • Validates emotions before asking follow-ups
  • Explains what it remembers and why

Anti-Sycophancy in Memory Recall:

  • Admits uncertainty about remembered facts
  • Asks for confirmation before using old memories
  • Never fabricates details to seem helpful
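One way to make the confirmation behavior concrete: gate recall phrasing on the stored confidence score, asserting only high-confidence memories and turning shaky ones into questions. The threshold and phrasing below are assumptions, not the shipped logic.

```typescript
// A memory as retrieved for recall, carrying its stored confidence
interface RecalledMemory {
  content: string
  confidence: number // 0..1, from the user_memory.confidence column
}

// Anti-sycophantic recall: below the threshold, ask for confirmation
// instead of asserting the memory as fact. Threshold is illustrative.
function phraseMemory(memory: RecalledMemory, confirmBelow = 0.7): string {
  if (memory.confidence < confirmBelow) {
    return `I may be misremembering — ${memory.content}. Is that right?`
  }
  return memory.content
}
```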

Emotional Safety Layer:

```typescript
const emotionalSafetyCheck = async (message: string) => {
  const indicators = await detectDistressIndicators(message)
  if (indicators.grief || indicators.trauma) {
    return {
      responseModifier: 'gentle_acknowledgment',
      avoidFollowUps: true,
      suggestPause: indicators.severity > 0.7
    }
  }
  return null // No distress detected; respond normally
}
```

Vulnerable User Safeguards:

  • Detect confusion or frustration
  • Offer to simplify or take a break
  • Never rush through difficult topics
  • Provide clear exit options
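A rough heuristic for the first safeguard might look like the sketch below: flag explicit confusion phrases or a run of very short replies so the assistant can offer to simplify or pause. The phrase list and thresholds are assumptions; a production version would likely combine this with an LLM-based check.

```typescript
// Heuristic confusion detector over the user's recent messages.
// Phrase list and thresholds are illustrative assumptions.
function detectConfusion(recentUserMessages: string[]): boolean {
  const confusionPhrases = [
    "i don't understand",
    'what do you mean',
    "i'm confused",
    'too fast',
  ]
  const hasPhrase = recentUserMessages.some(m =>
    confusionPhrases.some(p => m.toLowerCase().includes(p))
  )
  // Several very short replies in a row can also signal disengagement
  const shortReplies = recentUserMessages.filter(
    m => m.trim().split(/\s+/).length <= 2
  )
  return hasPhrase || shortReplies.length >= 3
}
```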

Implementation Phases

| Phase | Status | Description |
|---|---|---|
| Phase 1 | COMPLETE | DB foundation (pgvector, user_memory, indexes, search function) |
| Phase 2 | COMPLETE | Embedding Pipeline (EmbeddingService, background jobs, save hooks) |
| Phase 3 | COMPLETE | Enhanced Context Manager (replaces regex context-manager.ts) |
| Phase 4 | COMPLETE | Memory System (lifecycle, transparency API, GDPR compliance) |
| Phase 5 | COMPLETE | Quality and Analytics (memory-aware tagging, adaptive pace) |

Implementation Summary (January 2026)

All phases have been implemented. Key files created:

Phase 2 — Embedding Pipeline:

  • apps/web-app/lib/ai/embedding-service.ts — OpenAI embedding generation
  • apps/web-app/lib/jobs/embedding-pipeline.ts — Batch processing
  • apps/web-app/lib/hooks/use-embedding-trigger.ts — React hooks
  • scripts/backfill-embeddings.ts — CLI backfill script
  • apps/web-app/app/api/embeddings/generate/route.ts — API endpoint
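The batch-processing step of this pipeline can be sketched as below: chunk pending texts into fixed-size batches and make one embeddings call per batch. The batch size and the embedBatch callback are assumptions for illustration; the real logic lives in embedding-service.ts and embedding-pipeline.ts.

```typescript
// Generic chunking helper: split items into batches of at most `size`
function chunk<T>(items: T[], size: number): T[][] {
  const batches: T[][] = []
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size))
  }
  return batches
}

// Hypothetical batched embedding loop. `embedBatch` stands in for the
// actual OpenAI call; batch size 100 is an illustrative assumption.
async function embedInBatches(
  texts: string[],
  embedBatch: (batch: string[]) => Promise<number[][]>,
  batchSize = 100
): Promise<number[][]> {
  const out: number[][] = []
  for (const batch of chunk(texts, batchSize)) {
    out.push(...(await embedBatch(batch))) // one API call per batch
  }
  return out
}
```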

Phase 3 — Enhanced Context Manager:

  • apps/web-app/lib/conversation/enhanced-context-manager.ts — pgvector semantic search
  • apps/web-app/lib/conversation/compaction-service.ts — Conversation summarization
  • apps/web-app/lib/conversation/memory-extractor.ts — LLM-based memory extraction
  • supabase/migrations/20260127000000_conversation_context_enhancements.sql — DB migration

Phase 4 — Memory System:

  • apps/web-app/lib/conversation/memory-manager.ts — Memory lifecycle management
  • apps/web-app/app/api/user/memories/route.ts — Memory transparency API
  • apps/web-app/app/api/user/memories/[id]/route.ts — Individual memory operations

Phase 5 — Quality and Analytics:

  • apps/web-app/lib/ai/conversation-quality-service.ts — Quality tracking
  • apps/web-app/lib/ai/conversation-tagging-service.ts — Enhanced with memory context
  • supabase/migrations/20260127100000_quality_tracking.sql — Quality tables

Key Files to Modify/Create

Existing Files to Modify

| File | Change |
|---|---|
| apps/web-app/app/api/conversation/route.ts | Integrate EnhancedContextManager |
| apps/web-app/lib/ai/enhanced-server-ai-service.ts | Add memory extraction hooks |
| apps/web-app/lib/conversation/context-manager.ts | Deprecate (replace with EnhancedContextManager) |

New Files to Create

| File | Purpose |
|---|---|
| apps/web-app/lib/ai/embedding-service.ts | Generate embeddings via OpenAI |
| apps/web-app/lib/conversation/enhanced-context-manager.ts | New context manager with memory retrieval |
| apps/web-app/lib/conversation/memory-extractor.ts | LLM-based memory extraction |
| apps/web-app/lib/conversation/compaction-service.ts | Conversation summary generation |
| apps/web-app/lib/jobs/embedding-backfill.ts | Backfill existing content |

Cost Projections

Embedding Costs (text-embedding-3-small)

| Scale | Stories | Recordings | Memories | Queries/mo | Monthly Cost |
|---|---|---|---|---|---|
| 100 users | 5K | 2K | 10K | 10K | ~$0.50 |
| 1,000 users | 50K | 20K | 100K | 100K | ~$5 |
| 10,000 users | 500K | 200K | 1M | 1M | ~$50 |
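As a back-of-envelope check of the 1,000-user row: assuming roughly 1,000 tokens per embedded item or query (an assumption, not a measured figure) and text-embedding-3-small priced at $0.02 per 1M tokens, the arithmetic lands near the table's estimate.

```typescript
// Rough embedding cost model. Both avgTokens and the price per 1M
// tokens are assumptions; the table's figures are estimates.
function embeddingCostUSD(
  items: number,
  avgTokens = 1000,
  pricePerMTokens = 0.02
): number {
  const totalTokens = items * avgTokens
  return (totalTokens / 1_000_000) * pricePerMTokens
}

// 1,000 users: 50K stories + 20K recordings + 100K memories + 100K queries/mo
const monthly = embeddingCostUSD(50_000 + 20_000 + 100_000 + 100_000)
// ≈ $5.40/month, in line with the ~$5 figure in the table
```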

Memory Extraction Costs (gemini-2.0-flash-lite)

| Scale | Extractions/mo | Avg Tokens | Monthly Cost |
|---|---|---|---|
| 100 users | 1K | 500 | ~$0.10 |
| 1,000 users | 10K | 500 | ~$1 |
| 10,000 users | 100K | 500 | ~$10 |

Total projected cost at 1,000 users: ~$6/month


References

External

  • Anthropic’s Claude Constitution — Guidance on AI assistant behavior
  • Anthropic: Effective Context Engineering for AI Agents (2025)
  • Anthropic: Protecting Well-Being of Users (2025)
  • Mem0 Research Paper (arXiv:2504.19413) — Memory layer architecture
  • Supabase pgvector Documentation


Audit Sign-Off

| Role | Name | Date |
|---|---|---|
| Audit Lead | AI System Review | January 27, 2026 |
| Architecture Review | Pending | — |
| Implementation Lead | Pending | — |

This audit document should be referenced for all AI conversation system changes. Update implementation phases as work progresses.