📚 MyStoryFlow Docs — Your guide to preserving family stories

Conversation Testing Dashboard — Test Guide

Location: /dashboard/admin/conversation-testing
Access: Admin users only (requires user_profiles.role = 'admin')
Created: January 28, 2026

Prerequisites

  1. Dev server running: run npm run dev from the web-app workspace
  2. Logged in as admin — Your account must have role = 'admin' in user_profiles
  3. OpenAI API key — The persona simulator uses GPT-4o for generating persona responses and conversation analysis. Ensure OPENAI_API_KEY is set in .env.local
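
The last two prerequisites can be checked up front with a small preflight sketch. The helper name missingPrerequisites is hypothetical and not part of the codebase; it assumes you pass in the environment variables and the role value read from user_profiles yourself.

```typescript
// Hypothetical preflight helper (not part of the MyStoryFlow codebase).
type Env = Record<string, string | undefined>;

function missingPrerequisites(env: Env, role: string | null): string[] {
  const problems: string[] = [];
  // Prerequisite 3: the persona simulator needs an OpenAI key.
  const key = env["OPENAI_API_KEY"];
  if (!key || key.trim() === "") {
    problems.push("OPENAI_API_KEY is not set in .env.local");
  }
  // Prerequisite 2: the dashboard requires role = 'admin' in user_profiles.
  if (role !== "admin") {
    problems.push("user_profiles.role must be 'admin'");
  }
  return problems;
}

// Usage: an empty array means both checks passed.
console.log(missingPrerequisites({ OPENAI_API_KEY: "sk-test" }, "admin"));
```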

How to Access

Navigate to: http://localhost:3000/dashboard/admin/conversation-testing

No nav link exists yet, so enter the URL manually, even when you are already on another admin page.


Feature Overview

The dashboard lets you:

  1. Configure tests — Pick personas, book types, turn count, duration
  2. Start a test run — Creates a run with N persona x book-type combinations
  3. View results — See pass/fail, quality/empathy scores per test
  4. View transcripts — Read full Elena <-> persona conversations
  5. Track history — Compare runs over time
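
A run is the cross product of the selected personas and book types. The sketch below illustrates that idea only; buildTestMatrix and TestCase are hypothetical names, and the real orchestration in enhanced-conversation-tester.ts may build its combinations differently.

```typescript
// Hypothetical illustration of "N persona x book-type combinations".
interface TestCase {
  personaId: string;
  bookType: string;
}

function buildTestMatrix(personaIds: string[], bookTypes: string[]): TestCase[] {
  const cases: TestCase[] = [];
  // One test per (persona, book type) pair.
  for (const personaId of personaIds) {
    for (const bookType of bookTypes) {
      cases.push({ personaId, bookType });
    }
  }
  return cases;
}

// Three personas against one book type yields three tests.
const matrix = buildTestMatrix(
  ["grandma_margaret", "veteran_ray", "young_mom_priya"],
  ["family-history"],
);
console.log(matrix.length); // 3
```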

Test Scenarios

Scenario 1: Smoke Test (Single Persona, Single Book Type)

Goal: Verify the full pipeline works end-to-end.

Setting | Value
Personas | Uncheck “All”, select Margaret O’Sullivan only
Book Types | Uncheck “All”, select Memoir only
Max Turns | 4
Duration | 3 min

Expected: 1 test runs. After completion, you should see:

  • Status changes from pending -> running -> passed/failed
  • Quality and empathy scores appear (0-100)
  • Clicking “View” shows the transcript with persona and Elena messages

What to verify:

  • Test run appears in the Run History sidebar
  • Summary cards show 1 total test
  • Results table shows Margaret with Memoir book type
  • Transcript viewer opens with conversation bubbles
  • Scores bar chart renders correctly

Scenario 2: Persona Variety (3 Personas, 1 Book Type)

Goal: Compare how Elena adapts to different personality types.

Setting | Value
Personas | Select: Margaret O’Sullivan, Ray Washington, Priya Sharma
Book Types | Select Family History only
Max Turns | 6
Duration | 5 min

Expected: 3 tests. Compare:

  • Margaret (82, warm Irish grandmother) — should get high empathy, slower pacing
  • Ray (78, reserved veteran) — Elena should be patient, not push too hard
  • Priya (34, young mom) — more energetic tone, different topics

What to verify:

  • All 3 results appear in the table
  • Quality/empathy scores differ per persona
  • Filter buttons work (All, Passed, Failed)
  • Sorting by quality score orders correctly
  • Each transcript reflects the persona’s personality

Scenario 3: Book Type Variety (1 Persona, 3 Book Types)

Goal: Verify Elena adapts conversation style to different book types.

Setting | Value
Personas | Select Carlos Rivera only
Book Types | Select: Cookbook, Military Memoir, Poetry Collection
Max Turns | 6
Duration | 5 min

Expected: 3 tests. Compare:

  • Cookbook — Elena should ask about recipes, ingredients, kitchen memories, sensory details
  • Military Memoir — Elena should handle sensitive topics, ask about service timeline, honor experiences
  • Poetry Collection — Elena should explore emotional expression, creative process, life moments that inspired poems

What to verify:

  • Book type label shows correctly in results table
  • Transcript viewer shows book type badge
  • Elena’s questions differ meaningfully across book types
  • Information extracted section shows book-type-relevant data

Scenario 4: Emotional Safety (Sensitive Personas)

Goal: Verify Elena handles emotionally sensitive conversations appropriately.

Setting | Value
Personas | Select: Linda Chen (cancer survivor), Frank Kowalski (widower)
Book Types | Select Legacy Book only
Max Turns | 8
Duration | 5 min

Expected: 2 tests. Watch for:

  • Linda — conversations may touch health struggles, mortality; Elena should acknowledge without pushing
  • Frank — conversations may reference late wife; Elena should be compassionate, not redirect too quickly

What to verify:

  • Empathy scores are high (80+) if Elena handles emotions well
  • Elena Performance section notes emotional handling
  • Transcript shows Elena acknowledging difficult moments, offering choice to continue or pivot
  • No dismissive or rushed responses in transcript

Scenario 5: Full Matrix (All Personas, All Book Types)

Goal: Comprehensive quality assessment across all 130 combinations.

Setting | Value
Personas | Check “All 10 personas”
Book Types | Check “All 13 book types”
Max Turns | 8
Duration | 5 min

Expected: 130 tests. This will take significant time and API calls.

Warning: This consumes substantial OpenAI API credits (GPT-4o for persona simulation + analysis per test). Estimate ~260 API calls minimum.
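
The numbers in the warning follow from simple arithmetic, sketched below. The default of two calls per test is an assumption inferred from the "~260 API calls minimum" figure for 130 tests; actual usage is likely higher, since each conversation turn may need its own call.

```typescript
// Hypothetical estimator; callsPerTest = 2 is an assumed lower bound.
function estimateRun(personaCount: number, bookTypeCount: number, callsPerTest = 2) {
  const tests = personaCount * bookTypeCount;
  return { tests, minApiCalls: tests * callsPerTest };
}

console.log(estimateRun(10, 13)); // { tests: 130, minApiCalls: 260 }
```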

What to verify:

  • Progress polling updates every 5 seconds
  • Run History shows running status with remaining count
  • Results table populates incrementally
  • Filter by status shows running/pending/passed/failed counts
  • Pass rate card updates in real-time
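
To reason about what the polling should show, here is a minimal sketch of a 5-second poll loop. The names summarizeProgress and pollUntilDone are hypothetical, the fetcher is injected rather than tied to a real endpoint, and the pass rate is assumed to be passed tests divided by finished tests; the dashboard may compute it differently.

```typescript
type ResultStatus = "pending" | "running" | "passed" | "failed" | "error";

interface Progress {
  total: number;
  remaining: number;       // pending + running
  passRate: number | null; // percent of finished tests that passed (assumed definition)
  done: boolean;
}

function summarizeProgress(statuses: ResultStatus[]): Progress {
  const remaining = statuses.filter((s) => s === "pending" || s === "running").length;
  const passed = statuses.filter((s) => s === "passed").length;
  const finished = statuses.length - remaining;
  return {
    total: statuses.length,
    remaining,
    passRate: finished === 0 ? null : Math.round((passed / finished) * 100),
    done: remaining === 0,
  };
}

// Polls until no test is pending or running. The caller supplies fetchStatuses.
async function pollUntilDone(
  fetchStatuses: () => Promise<ResultStatus[]>,
  onUpdate: (progress: Progress) => void,
  intervalMs = 5000,
): Promise<Progress> {
  for (;;) {
    const progress = summarizeProgress(await fetchStatuses());
    onUpdate(progress);
    if (progress.done) return progress;
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

While a run is in progress, remaining should fall and the pass rate should change between polls; if neither moves for several intervals, check the server logs.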

Scenario 6: Error Handling

Goal: Verify the UI handles failures gracefully.

Test | How to Trigger | Expected
No API key | Remove OPENAI_API_KEY from env | Error status on results, error message visible in transcript viewer
Rate limiting | Run multiple tests quickly | Some tests show error status with rate limit message
Network failure | Disconnect during run | Run stays in running state, can refresh later
No admin access | Log in as non-admin user | 403 error, page should show error state

UI Walkthrough

1. Test Configuration Panel (Left Side)

+-------------------------------+
| Test Configuration            |
|                               |
| Personas:                     |
| [x] All 10 personas           |
| [Select] to pick specific     |
|                               |
| Book Types:                   |
| [x] All 13 book types         |
| [Select] to pick specific     |
|                               |
| Max Turns: [8 v]              |
| Duration: [5 min v]           |
|                               |
| 130 tests will run            |
| 10 personas x 13 book types   |
|                               |
| [ Run 130 Tests ]             |
+-------------------------------+

2. Run History (Left Side, Below Config)

Shows previous test runs with:

  • Status icon (spinning = running, check = completed, X = failed)
  • Test count and date
  • Pass/fail counts
  • Click to select and view results

3. Summary Cards (Right Side, Top)

4 cards showing: Total Tests | Passed | Failed | Pass Rate %

4. Results Table (Right Side, Main Area)

  • Filter tabs: All, Passed, Failed, Error, Pending
  • Sortable columns: Status, Persona, Book Type, Quality, Empathy, Turns
  • Actions: Click “View” to open transcript

5. Transcript Viewer (Modal)

Opens as an overlay showing:

  • Header: Persona name, book type badge, turn count, duration, status
  • Scores: Bar charts for quality, empathy, narrative depth, question quality, emotional safety, pacing
  • Elena Performance: Strengths and areas for improvement
  • Information Extracted: Characters, places, events, themes found
  • Conversation: Chat bubble view with persona (blue) and Elena (amber)
  • Raw Analysis: Collapsible JSON of full analysis data

Database Tables

conversation_test_runs

Column | Type | Description
id | UUID | Primary key
status | TEXT | running, completed, failed
total_tests | INT | Total test combinations
passed_tests | INT | Tests that passed
failed_tests | INT | Tests that failed or errored
config | JSONB | { personaIds, bookTypes, maxTurns, conversationDuration }
triggered_by | UUID | Admin user who started the run
started_at | TIMESTAMPTZ | When run began
completed_at | TIMESTAMPTZ | When run finished

conversation_test_results

Column | Type | Description
id | UUID | Primary key
test_run_id | UUID | FK to test run
persona_id | TEXT | e.g. grandma_margaret
persona_name | TEXT | e.g. Margaret O'Sullivan
book_type | TEXT | e.g. cookbook
status | TEXT | pending, running, passed, failed, error
overall_quality | FLOAT | 0-100 score
empathy_score | FLOAT | 0-100 score
turn_count | INT | Number of conversation turns
duration_seconds | INT | Time taken
conversation_transcript | JSONB | Array of { role, content } turns
analysis_scores | JSONB | Detailed scoring breakdown
elena_performance | JSONB | Strengths, improvements, pacing notes
information_extracted | JSONB | Characters, places, events, themes
error_message | TEXT | Error details if status is error
conversation_metadata | JSONB | { compositeId, displayLabel, bookTypeLabel }
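
When scripting against these rows, a TypeScript shape for conversation_test_results can help. The interface below is a sketch derived from the column list above; which columns are nullable is an assumption (placeholder rows presumably have no scores yet), and meanScore is a hypothetical helper.

```typescript
type ResultStatus = "pending" | "running" | "passed" | "failed" | "error";

interface TranscriptTurn {
  role: string;
  content: string;
}

// Mirrors conversation_test_results; nullability is assumed, not confirmed.
interface ConversationTestResult {
  id: string;
  test_run_id: string;
  persona_id: string;
  persona_name: string;
  book_type: string;
  status: ResultStatus;
  overall_quality: number | null;
  empathy_score: number | null;
  turn_count: number | null;
  duration_seconds: number | null;
  conversation_transcript: TranscriptTurn[] | null;
  analysis_scores: Record<string, unknown> | null;
  elena_performance: Record<string, unknown> | null;
  information_extracted: Record<string, unknown> | null;
  error_message: string | null;
  conversation_metadata: { compositeId: string; displayLabel: string; bookTypeLabel: string } | null;
}

type ScoreKey = "overall_quality" | "empathy_score";

// Averages one score column, skipping rows that have no score yet.
function meanScore(rows: Pick<ConversationTestResult, ScoreKey>[], key: ScoreKey): number | null {
  const scores = rows.map((row) => row[key]).filter((s): s is number => s !== null);
  if (scores.length === 0) return null;
  return scores.reduce((sum, s) => sum + s, 0) / scores.length;
}
```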

API Endpoints

Method | Endpoint | Description
GET | /api/admin/conversation-tests | List all test runs
POST | /api/admin/conversation-tests | Start a new test run
GET | /api/admin/conversation-tests/[runId] | Get run details
PATCH | /api/admin/conversation-tests/[runId] | Update run status
GET | /api/admin/conversation-tests/[runId]/results | Get results for a run
PATCH | /api/admin/conversation-tests/[runId]/results | Update a specific result

Example: Start a test run via API

curl -X POST http://localhost:3000/api/admin/conversation-tests \
  -H "Content-Type: application/json" \
  -H "Cookie: <your-session-cookie>" \
  -d '{
    "personaIds": ["grandma_margaret"],
    "bookTypes": ["cookbook"],
    "maxTurns": 4,
    "conversationDuration": 3
  }'

Example: Fetch results

curl http://localhost:3000/api/admin/conversation-tests/<run-id>/results \
  -H "Cookie: <your-session-cookie>"
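
The same two calls can be scripted from Node. The sketch below only builds the requests, mirroring the curl examples; startRunRequest and resultsUrl are hypothetical helper names, and the response shape is not documented here, so it is left untyped. Setting the Cookie header manually works in Node's fetch but not in a browser, where the session cookie is sent automatically.

```typescript
interface StartRunConfig {
  personaIds: string[];
  bookTypes: string[];
  maxTurns: number;
  conversationDuration: number; // minutes, as in the dashboard's Duration setting
}

const BASE_URL = "http://localhost:3000/api/admin/conversation-tests";

// Builds the POST request that starts a run (same payload as the curl example).
function startRunRequest(config: StartRunConfig, sessionCookie: string) {
  return {
    url: BASE_URL,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json", Cookie: sessionCookie },
      body: JSON.stringify(config),
    },
  };
}

// Builds the URL for fetching a run's results.
function resultsUrl(runId: string): string {
  return `${BASE_URL}/${encodeURIComponent(runId)}/results`;
}

// Usage (Node 18+):
//   const { url, init } = startRunRequest(config, "<your-session-cookie>");
//   const response = await fetch(url, init);
```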

Available Personas

ID | Name | Age | Description
grandma_margaret | Margaret O’Sullivan | 82 | Irish immigrant grandmother, warm storyteller
veteran_ray | Ray Washington | 78 | Korean War veteran, reserved
young_mom_priya | Priya Sharma | 34 | Indian-American mom recording for baby
struggling_artist_carlos | Carlos Rivera | 45 | Mexican-American artist, creative
college_student_malik | Malik Johnson | 20 | College student interviewing grandparents
cancer_survivor_linda | Linda Chen | 62 | Cancer survivor documenting journey
widower_frank | Frank Kowalski | 75 | Widower recording late wife’s memories
entrepreneur_sarah | Sarah Mitchell | 55 | Entrepreneur writing business memoir
rural_farmer_tom | Tom Erikson | 70 | Rural farmer, quiet, heritage focus
nurse_patricia | Patricia Williams | 58 | Nurse with healthcare stories

Available Book Types

Key | Label | Focus
memoir | Memoir | Personal life experiences, turning points
autobiography | Autobiography | Chronological life story
family-history | Family History | Genealogy, family traditions, heritage
cookbook | Family Cookbook | Recipes with stories, kitchen memories
recipe-collection | Recipe Collection | Focused recipe documentation
travel-journal | Travel Journal | Travel experiences, cultural encounters
childrens-book | Children’s Book | Stories for grandchildren, lessons
poetry-collection | Poetry Collection | Emotional expression, creative writing
business-biography | Business Biography | Career journey, business lessons
military-memoir | Military Memoir | Service experiences, comrades
spiritual-journey | Spiritual Journey | Faith, beliefs, spiritual growth
photo-book | Photo Book | Stories behind photos, visual memories
legacy-book | Legacy Book | Wisdom, life lessons, values to pass on

Scoring Guide

Score Range | Color | Meaning
80-100 | Green | Excellent: Elena performed well
60-79 | Amber | Acceptable: room for improvement
0-59 | Red | Poor: needs attention

Quality Score measures: narrative depth, question relevance, story extraction effectiveness, conversation flow

Empathy Score measures: emotional acknowledgment, pacing sensitivity, gentle handling of difficult topics, avoiding dismissiveness
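
The color bands in the scoring table can be expressed as a small function. This is a sketch, not the dashboard's code: scoreBand is a hypothetical name, and because scores are stored as FLOAT, it assumes fractional values such as 79.5 fall into the lower band.

```typescript
type Band = "green" | "amber" | "red";

// Maps a 0-100 score to the color band from the scoring table.
function scoreBand(score: number): Band {
  if (score >= 80) return "green"; // Excellent
  if (score >= 60) return "amber"; // Acceptable
  return "red";                    // Poor
}

console.log(scoreBand(85)); // "green"
console.log(scoreBand(60)); // "amber"
console.log(scoreBand(59)); // "red"
```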


Troubleshooting

Issue | Cause | Fix
“Not authenticated” error | Session expired | Log out and back in
“Admin access required” | Account isn’t admin | Check user_profiles.role in Supabase
Tests stuck in “pending” | Test execution hasn’t started | The POST creates placeholder rows; execution is separate
No scores showing | Analysis hasn’t run yet | Check if OpenAI API key is valid
Empty transcript | Test errored before conversation started | Check error_message in transcript viewer
Run stays “running” forever | Server crashed mid-run | Manually update run status in Supabase to completed
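
Before manually marking a run as completed, it helps to confirm it is actually stale. The sketch below flags runs that are still "running" after a cutoff; findStaleRuns is a hypothetical helper and the 60-minute default is an arbitrary assumption, so pick a threshold that fits your largest run.

```typescript
interface RunSummary {
  id: string;
  status: "running" | "completed" | "failed";
  started_at: string; // ISO timestamp, as stored in conversation_test_runs
}

// Returns runs that have been in the running state longer than maxAgeMinutes.
function findStaleRuns(runs: RunSummary[], now: Date, maxAgeMinutes = 60): RunSummary[] {
  const cutoff = now.getTime() - maxAgeMinutes * 60 * 1000;
  return runs.filter(
    (run) => run.status === "running" && Date.parse(run.started_at) < cutoff,
  );
}
```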

File Locations

apps/web-app/
├── app/
│   ├── (dashboard)/dashboard/admin/
│   │   └── conversation-testing/page.tsx     # Main dashboard page
│   ├── api/admin/conversation-tests/
│   │   ├── route.ts                          # POST (create run), GET (list runs)
│   │   └── [runId]/
│   │       ├── route.ts                      # GET (run details), PATCH (update run)
│   │       └── results/route.ts              # GET (results), PATCH (update result)
│   └── components/admin/conversation-testing/
│       ├── TestConfigPanel.tsx               # Config UI
│       ├── TestResultsTable.tsx              # Results grid
│       └── TranscriptViewer.tsx              # Transcript modal
└── lib/testing/
    ├── realistic-personas.ts                 # 10 persona definitions
    ├── book-type-modifiers.ts                # 13 book-type modifiers
    ├── persona-composer.ts                   # Persona + modifier composition
    ├── ai-persona-simulator.ts               # GPT-4o persona simulation
    └── enhanced-conversation-tester.ts       # Test orchestration