F007 - AI Genre Classification

Objective

Automatically detect manuscript genre using AI analysis, targeting 80%+ classification accuracy, to enable genre-specific analysis templates and market positioning.

Quick Implementation

Using MyStoryFlow Components

AI service integration from F004 via @mystoryflow/manuscript-analysis
Database models in analyzer schema
API route patterns in apps/analyzer-app
Error handling utilities from @mystoryflow/shared

New Requirements

Genre taxonomy definition
Multi-genre support logic
Confidence scoring algorithm
Genre-specific keywords/patterns
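
The requirements above could be prototyped without the AI service as a keyword baseline. The following is a minimal sketch, assuming a hypothetical `GenreDefinition` shape that mirrors the `code` and `keywords` columns of `analyzer.genres`. The names `scoreGenres`, `pickGenres`, and the 0.25 secondary threshold are illustrative assumptions, not part of any MyStoryFlow package.

```typescript
// Hypothetical shape mirroring the code/keywords columns of analyzer.genres.
interface GenreDefinition {
  code: string
  keywords: string[]
}

interface GenreScore {
  code: string
  confidence: number // share of all keyword hits, 0 to 1
}

// Count keyword occurrences per genre and normalise so the scores sum to 1.
// This is a naive baseline, not the AI-based scoring used by the service.
function scoreGenres(text: string, genres: GenreDefinition[]): GenreScore[] {
  const lower = text.toLowerCase()
  const hits = genres.map(g => {
    let count = 0
    for (const kw of g.keywords) {
      if (!kw) continue // skip empty keywords
      // split() yields one more piece than there are occurrences of kw
      count += lower.split(kw.toLowerCase()).length - 1
    }
    return { code: g.code, count }
  })
  const total = hits.reduce((sum, h) => sum + h.count, 0)
  return hits
    .map(h => ({ code: h.code, confidence: total === 0 ? 0 : h.count / total }))
    .sort((a, b) => b.confidence - a.confidence)
}

// Multi-genre support: the top entry is the primary genre; the runner-up is
// reported as secondary only if it clears an assumed threshold.
function pickGenres(scores: GenreScore[], secondaryThreshold = 0.25) {
  const [primary, runnerUp] = scores
  const secondary =
    runnerUp && runnerUp.confidence >= secondaryThreshold ? runnerUp : undefined
  return { primary, secondary }
}
```

A baseline like this could also serve as a sanity check against the AI result, for example by flagging manuscripts where the two disagree.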

MVP Implementation

1. Database Schema


-- Genre detection results (in analyzer schema)
-- Note: uuid_generate_v4() requires the uuid-ossp extension to be enabled
CREATE TABLE analyzer.genre_detections (
  id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
  manuscript_id UUID REFERENCES analyzer.manuscripts(id),
  primary_genre VARCHAR(50) NOT NULL,
  primary_confidence DECIMAL(3,2) NOT NULL, -- 0.00 to 1.00
  secondary_genre VARCHAR(50),
  secondary_confidence DECIMAL(3,2),
  subgenres JSONB DEFAULT '[]', -- Array of subgenre matches
  genre_markers JSONB DEFAULT '{}', -- Detected genre-specific elements
  analysis_version VARCHAR(20) DEFAULT '1.0',
  created_at TIMESTAMP DEFAULT NOW()
);
 
-- Genre definitions (in analyzer schema)
CREATE TABLE analyzer.genres (
  id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
  code VARCHAR(50) UNIQUE NOT NULL,
  name VARCHAR(100) NOT NULL,
  parent_genre VARCHAR(50),
  description TEXT,
  keywords JSONB DEFAULT '[]',
  typical_elements JSONB DEFAULT '[]',
  is_active BOOLEAN DEFAULT true,
  created_at TIMESTAMP DEFAULT NOW()
);
 
-- Seed core genres
INSERT INTO analyzer.genres (code, name, description, keywords, typical_elements) VALUES
('romance', 'Romance', 'Focus on romantic relationships', 
  '["love", "heart", "kiss", "embrace", "passion"]',
  '["meet-cute", "conflict", "emotional connection", "HEA/HFN"]'),
('mystery', 'Mystery', 'Crime solving and investigation',
  '["detective", "clue", "murder", "investigation", "suspect"]',
  '["crime", "investigation", "red herrings", "revelation"]'),
('fantasy', 'Fantasy', 'Magical and supernatural elements',
  '["magic", "wizard", "dragon", "quest", "realm"]',
  '["world-building", "magic system", "quest", "chosen one"]'),
('scifi', 'Science Fiction', 'Futuristic and technological themes',
  '["space", "technology", "future", "alien", "robot"]',
  '["technology", "space travel", "dystopia", "time travel"]'),
('thriller', 'Thriller', 'High stakes and suspense',
  '["danger", "chase", "conspiracy", "survive", "escape"]',
  '["pacing", "plot twists", "danger", "time pressure"]'),
('literary', 'Literary Fiction', 'Character-driven narratives',
  '["human condition", "society", "meaning", "existence"]',
  '["character development", "themes", "prose style", "symbolism"]');
 
CREATE INDEX idx_genre_detections_manuscript_id ON analyzer.genre_detections(manuscript_id);

2. Genre Detection Service


// packages/manuscript-analysis/src/services/genre-detector.ts
import { AIService } from '@mystoryflow/manuscript-analysis'
import { getSupabaseBrowserClient } from '@mystoryflow/database'
import { trackAIUsage } from '@mystoryflow/analytics'
 
interface GenreDetectionResult {
  primaryGenre: string
  primaryConfidence: number
  secondaryGenre?: string
  secondaryConfidence?: number
  subgenres: string[]
  genreMarkers: {
    themes: string[]
    plotElements: string[]
    characterTypes: string[]
    settings: string[]
    tone: string[]
  }
}
 
export class GenreDetector {
  private aiService: AIService
  private supabase = getSupabaseBrowserClient()
 
  constructor() {
    this.aiService = new AIService()
  }
 
  async detectGenre(manuscriptId: string): Promise<GenreDetectionResult> {
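    // Load the full manuscript text; it is trimmed to a sample below.
    // Non-public schemas are addressed via .schema(); a dotted name passed to
    // .from() would be treated as a literal table name.
    const { data: content, error: contentError } = await this.supabase
      .schema('analyzer')
      .from('manuscript_content')
      .select('raw_text')
      .eq('manuscript_id', manuscriptId)
      .single()

    if (contentError || !content) throw new Error('Content not found')

    // Use the first ~10k words for analysis
    const sampleText = this.extractSample(content.raw_text, 10000)

    // Get available genres from reference data
    const { data: genres, error: genresError } = await this.supabase
      .schema('analyzer')
      .from('genres')
      .select('*')
      .eq('is_active', true)

    if (genresError || !genres?.length) throw new Error('No active genres found')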
 
    // Build genre detection prompt
    const prompt = this.buildGenreDetectionPrompt(sampleText, genres)
    
    // Get AI analysis
    const aiResult = await this.aiService.analyzeWithPrimary(
      prompt,
      'genre_detection',
      3000
    )
 
    // Process and validate results
    const result = this.processGenreResult(aiResult, genres)
    
    // Save to database
    await this.saveGenreDetection(manuscriptId, result)
    
    // Track AI usage for analytics
    await trackAIUsage({
      userId: (await this.getManuscriptUserId(manuscriptId)),
      model: aiResult.modelUsed,
      tokens: aiResult.tokensUsed,
      operation: 'genre-detection',
      metadata: { manuscriptId, genre: result.primaryGenre }
    })
    
    return result
  }
 
  private extractSample(text: string, wordLimit: number): string {
    // Trim first so leading whitespace does not produce an empty first "word"
    const words = text.trim().split(/\s+/)
    return words.slice(0, wordLimit).join(' ')
  }
 
  private buildGenreDetectionPrompt(text: string, genres: any[]): string {
    const genreList = genres.map(g => `${g.code}: ${g.name} - ${g.description}`).join('\n')
    
    return `You are a literary genre expert analyzing a manuscript sample.
 
AVAILABLE GENRES:
${genreList}
 
ANALYSIS TASK:
1. Identify the PRIMARY genre with confidence score (0.0-1.0)
2. Identify SECONDARY genre if applicable with confidence
3. List any relevant SUBGENRES
4. Identify specific genre markers:
   - Key themes present
   - Plot elements and structure
   - Character archetypes
   - Settings and world-building
   - Tone and mood
 
Consider these genre indicators:
- Romance: emotional connections, relationship development, romantic tension
- Mystery: crimes, investigations, clues, suspects, revelations
- Fantasy: magic, supernatural, world-building, quests
- Sci-Fi: technology, future settings, scientific concepts
- Thriller: suspense, danger, fast pacing, high stakes
- Literary: character depth, thematic exploration, prose style
 
MANUSCRIPT SAMPLE:
${text}
 
Provide analysis in this JSON format:
{
  "primary_genre": "genre_code",
  "primary_confidence": 0.95,
  "secondary_genre": "genre_code or null",
  "secondary_confidence": 0.65,
  "subgenres": ["subgenre1", "subgenre2"],
  "genre_markers": {
    "themes": ["theme1", "theme2"],
    "plot_elements": ["element1", "element2"],
    "character_types": ["type1", "type2"],
    "settings": ["setting1", "setting2"],
    "tone": ["tone1", "tone2"]
  },
  "reasoning": "Brief explanation of genre classification"
}`
  }
 
  private processGenreResult(aiResult: any, validGenres: any[]): GenreDetectionResult {
    const validGenreCodes = validGenres.map(g => g.code)

    // Clamp numeric scores to [0, 1]; anything non-numeric becomes undefined
    const clamp = (value: unknown): number | undefined =>
      typeof value === 'number' && Number.isFinite(value)
        ? Math.max(0, Math.min(1, value))
        : undefined

    // Fall back to literary fiction at 0.5 if the primary genre is not recognized
    const primaryKnown = validGenreCodes.includes(aiResult.primary_genre)
    const primaryGenre = primaryKnown ? aiResult.primary_genre : 'literary'
    const primaryConfidence = primaryKnown
      ? clamp(aiResult.primary_confidence) ?? 0.5
      : 0.5

    // Keep the secondary genre only if it is a known code distinct from the primary
    // (this also drops a literal "null" string returned by the model)
    const secondaryGenre =
      validGenreCodes.includes(aiResult.secondary_genre) &&
      aiResult.secondary_genre !== primaryGenre
        ? aiResult.secondary_genre
        : undefined
    const secondaryConfidence = secondaryGenre
      ? clamp(aiResult.secondary_confidence)
      : undefined

    return {
      primaryGenre,
      primaryConfidence,
      secondaryGenre,
      secondaryConfidence,
      subgenres: Array.isArray(aiResult.subgenres) ? aiResult.subgenres : [],
      genreMarkers: {
        themes: aiResult.genre_markers?.themes || [],
        plotElements: aiResult.genre_markers?.plot_elements || [],
        characterTypes: aiResult.genre_markers?.character_types || [],
        settings: aiResult.genre_markers?.settings || [],
        tone: aiResult.genre_markers?.tone || []
      }
    }
  }
 
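  private async saveGenreDetection(
    manuscriptId: string,
    result: GenreDetectionResult
  ): Promise<void> {
    const { error: insertError } = await this.supabase
      .schema('analyzer')
      .from('genre_detections')
      .insert({
        manuscript_id: manuscriptId,
        primary_genre: result.primaryGenre,
        primary_confidence: result.primaryConfidence,
        secondary_genre: result.secondaryGenre,
        secondary_confidence: result.secondaryConfidence,
        subgenres: result.subgenres,
        genre_markers: result.genreMarkers
      })
    if (insertError) throw insertError

    // Read the existing metadata so the update below does not overwrite other keys.
    // Note: this read-then-write is not atomic.
    const { data: manuscript } = await this.supabase
      .schema('analyzer')
      .from('manuscripts')
      .select('metadata')
      .eq('id', manuscriptId)
      .single()

    // Update manuscript with detected genre
    const { error: updateError } = await this.supabase
      .schema('analyzer')
      .from('manuscripts')
      .update({
        metadata: {
          ...(manuscript?.metadata ?? {}),
          detected_genre: result.primaryGenre,
          genre_confidence: result.primaryConfidence
        }
      })
      .eq('id', manuscriptId)
    if (updateError) throw updateError
  }

  private async getManuscriptUserId(manuscriptId: string): Promise<string> {
    const { data, error } = await this.supabase
      .schema('analyzer')
      .from('manuscripts')
      .select('user_id')
      .eq('id', manuscriptId)
      .single()
    if (error || !data?.user_id) throw new Error('Manuscript owner not found')
    return data.user_id
  }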
}
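
The service reads fields such as `primary_genre` straight off `aiResult`, which presumes the AI service has already parsed the model's JSON. If it returns raw text instead, a defensive parser could sit between the two steps. The following is a sketch, assuming the raw response is a string that may wrap the JSON in surrounding prose; `parseGenreJson` is a hypothetical helper and not part of `AIService`.

```typescript
// Hypothetical helper: take the span from the first '{' to the last '}'
// in the model output and try to parse it as a JSON object.
function parseGenreJson(raw: string): Record<string, unknown> | null {
  const start = raw.indexOf('{')
  const end = raw.lastIndexOf('}')
  if (start === -1 || end <= start) return null
  try {
    const parsed: unknown = JSON.parse(raw.slice(start, end + 1))
    // Only accept plain objects; arrays and primitives are rejected.
    if (typeof parsed === 'object' && parsed !== null && !Array.isArray(parsed)) {
      return parsed as Record<string, unknown>
    }
    return null
  } catch {
    return null // malformed JSON: let the caller decide on a fallback
  }
}
```

A `null` result could be routed to the same literary-fiction fallback that `processGenreResult` applies to unrecognized genre codes.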

3. API Endpoint


// apps/analyzer-app/src/app/api/manuscripts/[id]/detect-genre/route.ts
import { NextRequest, NextResponse } from 'next/server'
import { GenreDetector } from '@mystoryflow/manuscript-analysis'
import { withAuth } from '@/lib/auth'
 
export async function POST(
  req: NextRequest,
  { params }: { params: { id: string } }
) {
  const session = await withAuth(req)
  if (!session) {
    return NextResponse.json({ error: 'Unauthorized' }, { status: 401 })
  }
 
  try {
    const detector = new GenreDetector()
    const result = await detector.detectGenre(params.id)
    
    return NextResponse.json({ 
      success: true,
      genre: result 
    })
  } catch (error) {
    console.error('Genre detection error:', error)
    return NextResponse.json(
      { error: 'Genre detection failed' },
      { status: 500 }
    )
  }
}
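
The handler passes `params.id` straight to the detector, so a malformed id would surface as a 500. One option is to validate the id first and answer with a 400. The following is a sketch, assuming manuscript ids are canonical UUID strings as the `UUID` column type suggests; `isUuid` is a hypothetical helper.

```typescript
// Matches the canonical 8-4-4-4-12 hex form of a UUID, case-insensitively.
const UUID_PATTERN =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i

function isUuid(value: string): boolean {
  return UUID_PATTERN.test(value)
}
```

In the route, a check such as `if (!isUuid(params.id))` returning `NextResponse.json({ error: 'Invalid manuscript id' }, { status: 400 })` would run before the detector is constructed.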

4. React Hook for Genre Display


// apps/analyzer-app/src/hooks/useGenreDetection.ts
import { useQuery } from '@tanstack/react-query'
import { getSupabaseBrowserClient } from '@mystoryflow/database'
 
export function useGenreDetection(manuscriptId: string) {
  const supabase = getSupabaseBrowserClient()
  
  return useQuery({
    queryKey: ['genre-detection', manuscriptId],
    queryFn: async () => {
      const { data, error } = await supabase
        .schema('analyzer')
        .from('genre_detections')
        .select('*')
        .eq('manuscript_id', manuscriptId)
        .order('created_at', { ascending: false })
        .limit(1)
        .maybeSingle() // resolves to null instead of erroring when no detection exists yet
        
      if (error) throw error
      return data
    },
    enabled: !!manuscriptId
  })
}
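
The hook returns the raw database row, whose snake_case columns differ from the camelCase `GenreDetectionResult` the service produces. A small mapper can isolate that difference from the UI. The following is a sketch, assuming the row shape follows the `analyzer.genre_detections` columns; `GenreDetectionRow`, `GenreDisplayModel`, and `toDisplayModel` are hypothetical names.

```typescript
// Assumed subset of the analyzer.genre_detections row used for display.
interface GenreDetectionRow {
  primary_genre: string
  primary_confidence: number
  secondary_genre: string | null
  secondary_confidence: number | null
  subgenres: string[] | null
}

interface GenreDisplayModel {
  primaryGenre: string
  primaryPercent: number
  secondary?: { genre: string; percent: number }
  subgenres: string[]
}

function toDisplayModel(row: GenreDetectionRow): GenreDisplayModel {
  const model: GenreDisplayModel = {
    primaryGenre: row.primary_genre,
    primaryPercent: Math.round(row.primary_confidence * 100),
    subgenres: row.subgenres ?? []
  }
  // Only expose a secondary genre when both the code and its score are present.
  if (row.secondary_genre && row.secondary_confidence !== null) {
    model.secondary = {
      genre: row.secondary_genre,
      percent: Math.round(row.secondary_confidence * 100)
    }
  }
  return model
}
```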

5. Genre Display Component


// apps/analyzer-app/src/components/manuscript/GenreDisplay.tsx
import { Badge } from '@mystoryflow/ui'
import { useGenreDetection } from '@/hooks/useGenreDetection'
 
export function GenreDisplay({ manuscriptId }: { manuscriptId: string }) {
  const { data: detection, isLoading } = useGenreDetection(manuscriptId)
  
  if (isLoading) return <div>Detecting genre...</div>
  if (!detection) return null
  
  const confidenceColor = detection.primary_confidence > 0.8 
    ? 'success' 
    : detection.primary_confidence > 0.6 
    ? 'warning' 
    : 'default'
  
  return (
    <div className="space-y-2">
      <div className="flex items-center gap-2">
        <Badge variant={confidenceColor}>
          {detection.primary_genre} ({Math.round(detection.primary_confidence * 100)}%)
        </Badge>
        {detection.secondary_genre && detection.secondary_confidence != null && (
          <Badge variant="secondary">
            {detection.secondary_genre} ({Math.round(detection.secondary_confidence * 100)}%)
          </Badge>
        )}
      </div>
      
      {detection.subgenres.length > 0 && (
        <div className="flex gap-1">
          {detection.subgenres.map(subgenre => (
            <Badge key={subgenre} variant="outline" size="sm">
              {subgenre}
            </Badge>
          ))}
        </div>
      )}
    </div>
  )
}
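
The nested ternary in the component can be pulled into a small pure function, which also makes the thresholds testable in isolation. The following is a sketch using the same 0.8 and 0.6 cut-offs; `confidenceVariant` is a hypothetical name, and the variant strings are the ones the component already passes to `Badge`.

```typescript
type BadgeVariant = 'success' | 'warning' | 'default'

// Map a 0-1 confidence score to the badge variant used by GenreDisplay.
// Both thresholds are exclusive, matching the `>` comparisons in the component.
function confidenceVariant(confidence: number): BadgeVariant {
  if (confidence > 0.8) return 'success'
  if (confidence > 0.6) return 'warning'
  return 'default'
}
```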

MVP Acceptance Criteria

Detect primary genre with confidence score
Support for 6 core genres (romance, mystery, fantasy, sci-fi, thriller, literary)
Secondary genre detection when applicable
Subgenre identification
Genre marker extraction (themes, plot elements, etc.)
80%+ accuracy on genre classification
Results saved to database
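
The 80%+ accuracy criterion needs a labelled evaluation set to be checkable. The following is a sketch of the measurement, assuming a hypothetical list of samples that pair an expected genre code with the detected primary genre; how the labelled manuscripts are sourced is left open, and `genreAccuracy` and `meetsMvpTarget` are illustrative names.

```typescript
interface LabelledSample {
  expected: string // genre code assigned by a human reviewer
  detected: string // primary genre returned by the detector
}

// Fraction of samples whose detected primary genre matches the label.
function genreAccuracy(samples: LabelledSample[]): number {
  if (samples.length === 0) return 0
  const correct = samples.filter(s => s.expected === s.detected).length
  return correct / samples.length
}

// True when the accuracy reaches the target from the acceptance criteria.
function meetsMvpTarget(samples: LabelledSample[], target = 0.8): boolean {
  return genreAccuracy(samples) >= target
}
```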

Post-MVP Enhancements

Support for 20+ genres and subgenres
Multi-label classification for genre blends
Historical fiction period detection
Non-fiction category detection
Genre trend analysis
Market positioning recommendations
Comparative genre analysis
User genre override option

Implementation Time

Development: 1 day
Testing: 0.5 days
Total: 1.5 days

Dependencies

F004-AI-SERVICES (AI integration)
F006-CONTENT-EXTRACTION (text content needed)

Next Feature

After completion, proceed to F008-TEXT-CHUNKING for optimal AI processing.