F007 - AI Genre Classification
Objective
Automatically detect a manuscript's genre using AI analysis, targeting 80%+ classification accuracy, so that genre-specific analysis templates and market positioning can be applied.
Quick Implementation
Using MyStoryFlow Components
- AI service integration from F004 via @mystoryflow/manuscript-analysis
- Database models in analyzer schema
- API route patterns in apps/analyzer-app
- Error handling utilities from @mystoryflow/shared
New Requirements
- Genre taxonomy definition
- Multi-genre support logic
- Confidence scoring algorithm
- Genre-specific keywords/patterns
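The confidence scoring and multi-genre requirements above can be sketched with a simple keyword tally. This is a minimal illustration, not the planned algorithm: the `GenreKeywords` and `GenreScore` shapes, both function names, and the 0.25 secondary threshold are assumptions made for the example; only the `code` and `keywords` fields mirror the genre reference data in this spec. Matching is by substring, so "kiss" also counts "kissed".

```typescript
// Hypothetical shapes; only `code` and `keywords` mirror fields named in this spec.
interface GenreKeywords {
  code: string
  keywords: string[]
}

interface GenreScore {
  code: string
  confidence: number // this genre's share of all keyword hits, 0.0 to 1.0
}

// Count case-insensitive substring occurrences of each keyword, then
// normalize the counts so the confidences across genres sum to 1.
export function scoreGenresByKeywords(text: string, genres: GenreKeywords[]): GenreScore[] {
  const haystack = text.toLowerCase()
  const hits = genres.map(g => ({
    code: g.code,
    count: g.keywords.reduce(
      // Skip empty keywords; splitting on '' would count every character.
      (sum, kw) => (kw ? sum + (haystack.split(kw.toLowerCase()).length - 1) : sum),
      0
    )
  }))
  const total = hits.reduce((sum, h) => sum + h.count, 0)
  return hits
    .map(h => ({ code: h.code, confidence: total === 0 ? 0 : h.count / total }))
    .sort((a, b) => b.confidence - a.confidence)
}

// Pick a primary genre, plus a secondary one when it clears the
// (assumed) threshold. Expects `scores` sorted by descending confidence.
export function pickGenres(scores: GenreScore[], secondaryThreshold = 0.25) {
  const [primary, secondary] = scores
  return {
    primary: primary && primary.confidence > 0 ? primary : undefined,
    secondary:
      secondary && secondary.confidence >= secondaryThreshold ? secondary : undefined
  }
}
```

A tally like this could serve as a cheap sanity check on the AI result or as a fallback when the AI call fails; it is not a substitute for the prompt-based classification described in this feature.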
MVP Implementation
1. Database Schema
-- Genre detection results (in analyzer schema)
CREATE TABLE analyzer.genre_detections (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
manuscript_id UUID REFERENCES analyzer.manuscripts(id),
primary_genre VARCHAR(50) NOT NULL,
primary_confidence DECIMAL(3,2) NOT NULL, -- 0.00 to 1.00
secondary_genre VARCHAR(50),
secondary_confidence DECIMAL(3,2),
subgenres JSONB DEFAULT '[]', -- Array of subgenre matches
genre_markers JSONB DEFAULT '{}', -- Detected genre-specific elements
analysis_version VARCHAR(20) DEFAULT '1.0',
created_at TIMESTAMP DEFAULT NOW()
);
-- Genre definitions (in analyzer schema)
CREATE TABLE analyzer.genres (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
code VARCHAR(50) UNIQUE NOT NULL,
name VARCHAR(100) NOT NULL,
parent_genre VARCHAR(50),
description TEXT,
keywords JSONB DEFAULT '[]',
typical_elements JSONB DEFAULT '[]',
is_active BOOLEAN DEFAULT true,
created_at TIMESTAMP DEFAULT NOW()
);
-- Seed core genres
INSERT INTO analyzer.genres (code, name, description, keywords, typical_elements) VALUES
('romance', 'Romance', 'Focus on romantic relationships',
'["love", "heart", "kiss", "embrace", "passion"]',
'["meet-cute", "conflict", "emotional connection", "HEA/HFN"]'),
('mystery', 'Mystery', 'Crime solving and investigation',
'["detective", "clue", "murder", "investigation", "suspect"]',
'["crime", "investigation", "red herrings", "revelation"]'),
('fantasy', 'Fantasy', 'Magical and supernatural elements',
'["magic", "wizard", "dragon", "quest", "realm"]',
'["world-building", "magic system", "quest", "chosen one"]'),
('scifi', 'Science Fiction', 'Futuristic and technological themes',
'["space", "technology", "future", "alien", "robot"]',
'["technology", "space travel", "dystopia", "time travel"]'),
('thriller', 'Thriller', 'High stakes and suspense',
'["danger", "chase", "conspiracy", "survive", "escape"]',
'["pacing", "plot twists", "danger", "time pressure"]'),
('literary', 'Literary Fiction', 'Character-driven narratives',
'["human condition", "society", "meaning", "existence"]',
'["character development", "themes", "prose style", "symbolism"]');
CREATE INDEX idx_genre_detections_manuscript_id ON analyzer.genre_detections(manuscript_id);
2. Genre Detection Service
// packages/manuscript-analysis/src/services/genre-detector.ts
import { AIService } from '@mystoryflow/manuscript-analysis'
import { getSupabaseBrowserClient } from '@mystoryflow/database'
import { trackAIUsage } from '@mystoryflow/analytics'
interface GenreDetectionResult {
primaryGenre: string
primaryConfidence: number
secondaryGenre?: string
secondaryConfidence?: number
subgenres: string[]
genreMarkers: {
themes: string[]
plotElements: string[]
characterTypes: string[]
settings: string[]
tone: string[]
}
}
export class GenreDetector {
private aiService: AIService
private supabase = getSupabaseBrowserClient()
constructor() {
this.aiService = new AIService()
}
async detectGenre(manuscriptId: string): Promise<GenreDetectionResult> {
// Get manuscript content (first 10k words for genre detection)
const { data: content } = await this.supabase
.from('analyzer.manuscript_content')
.select('raw_text')
.eq('manuscript_id', manuscriptId)
.single()
if (!content) throw new Error('Content not found')
// Get first ~10k words for analysis
const sampleText = this.extractSample(content.raw_text, 10000)
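// Get available genres from reference data
const { data: genres, error: genresError } = await this.supabase
.from('analyzer.genres')
.select('*')
.eq('is_active', true)
// Fail fast: the prompt and the result validation below both need the genre list
if (genresError || !genres || genres.length === 0) throw new Error('No active genres found')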
// Build genre detection prompt
const prompt = this.buildGenreDetectionPrompt(sampleText, genres)
// Get AI analysis
const aiResult = await this.aiService.analyzeWithPrimary(
prompt,
'genre_detection',
3000
)
// Process and validate results
const result = this.processGenreResult(aiResult, genres)
// Save to database
await this.saveGenreDetection(manuscriptId, result)
// Track AI usage for analytics
await trackAIUsage({
userId: (await this.getManuscriptUserId(manuscriptId)),
model: aiResult.modelUsed,
tokens: aiResult.tokensUsed,
operation: 'genre-detection',
metadata: { manuscriptId, genre: result.primaryGenre }
})
return result
}
private extractSample(text: string, wordLimit: number): string {
const words = text.split(/\s+/)
const sampleWords = words.slice(0, wordLimit)
return sampleWords.join(' ')
}
private buildGenreDetectionPrompt(text: string, genres: any[]): string {
const genreList = genres.map(g => `${g.code}: ${g.name} - ${g.description}`).join('\n')
return `You are a literary genre expert analyzing a manuscript sample.
AVAILABLE GENRES:
${genreList}
ANALYSIS TASK:
1. Identify the PRIMARY genre with confidence score (0.0-1.0)
2. Identify SECONDARY genre if applicable with confidence
3. List any relevant SUBGENRES
4. Identify specific genre markers:
- Key themes present
- Plot elements and structure
- Character archetypes
- Settings and world-building
- Tone and mood
Consider these genre indicators:
- Romance: emotional connections, relationship development, romantic tension
- Mystery: crimes, investigations, clues, suspects, revelations
- Fantasy: magic, supernatural, world-building, quests
- Sci-Fi: technology, future settings, scientific concepts
- Thriller: suspense, danger, fast pacing, high stakes
- Literary: character depth, thematic exploration, prose style
MANUSCRIPT SAMPLE:
${text}
Provide analysis in this JSON format:
{
"primary_genre": "genre_code",
"primary_confidence": 0.95,
"secondary_genre": "genre_code or null",
"secondary_confidence": 0.65,
"subgenres": ["subgenre1", "subgenre2"],
"genre_markers": {
"themes": ["theme1", "theme2"],
"plot_elements": ["element1", "element2"],
"character_types": ["type1", "type2"],
"settings": ["setting1", "setting2"],
"tone": ["tone1", "tone2"]
},
"reasoning": "Brief explanation of genre classification"
}`
}
private processGenreResult(aiResult: any, validGenres: any[]): GenreDetectionResult {
// Validate primary genre exists
const validGenreCodes = validGenres.map(g => g.code)
if (!validGenreCodes.includes(aiResult.primary_genre)) {
// Fallback to literary fiction if genre not recognized
aiResult.primary_genre = 'literary'
aiResult.primary_confidence = 0.5
}
// Ensure confidence scores are valid
const primaryConfidence = Math.max(0, Math.min(1, aiResult.primary_confidence || 0.5))
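// Keep the secondary genre only if it is a known code (the model may return "null" or free text)
const secondaryGenre = validGenreCodes.includes(aiResult.secondary_genre)
? aiResult.secondary_genre
: undefined
// Keep a secondary confidence only alongside a valid secondary genre
const secondaryConfidence =
secondaryGenre && typeof aiResult.secondary_confidence === 'number'
? Math.max(0, Math.min(1, aiResult.secondary_confidence))
: undefined
return {
primaryGenre: aiResult.primary_genre,
primaryConfidence,
secondaryGenre,
secondaryConfidence,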
subgenres: Array.isArray(aiResult.subgenres) ? aiResult.subgenres : [],
genreMarkers: {
themes: aiResult.genre_markers?.themes || [],
plotElements: aiResult.genre_markers?.plot_elements || [],
characterTypes: aiResult.genre_markers?.character_types || [],
settings: aiResult.genre_markers?.settings || [],
tone: aiResult.genre_markers?.tone || []
}
}
}
private async saveGenreDetection(
manuscriptId: string,
result: GenreDetectionResult
): Promise<void> {
const { error: insertError } = await this.supabase
.from('analyzer.genre_detections')
.insert({
manuscript_id: manuscriptId,
primary_genre: result.primaryGenre,
primary_confidence: result.primaryConfidence,
secondary_genre: result.secondaryGenre,
secondary_confidence: result.secondaryConfidence,
subgenres: result.subgenres,
genre_markers: result.genreMarkers
})
if (insertError) throw insertError
// Update manuscript with detected genre
// Note: this replaces the whole metadata object; merge with the existing
// value first if other features store keys there
const { error: updateError } = await this.supabase
.from('analyzer.manuscripts')
.update({
metadata: {
detected_genre: result.primaryGenre,
genre_confidence: result.primaryConfidence
}
})
.eq('id', manuscriptId)
if (updateError) throw updateError
}
private async getManuscriptUserId(manuscriptId: string): Promise<string> {
const { data } = await this.supabase
.from('analyzer.manuscripts')
.select('user_id')
.eq('id', manuscriptId)
.single()
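// The declared return type is string, so do not let undefined through
if (!data?.user_id) throw new Error(`No owner found for manuscript ${manuscriptId}`)
return data.user_id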
}
}
3. API Endpoint
// apps/analyzer-app/src/app/api/manuscripts/[id]/detect-genre/route.ts
import { NextRequest, NextResponse } from 'next/server'
import { GenreDetector } from '@mystoryflow/manuscript-analysis'
import { withAuth } from '@/lib/auth'
export async function POST(
req: NextRequest,
{ params }: { params: { id: string } }
) {
const session = await withAuth(req)
if (!session) {
return NextResponse.json({ error: 'Unauthorized' }, { status: 401 })
}
try {
const detector = new GenreDetector()
const result = await detector.detectGenre(params.id)
return NextResponse.json({
success: true,
genre: result
})
} catch (error) {
console.error('Genre detection error:', error)
return NextResponse.json(
{ error: 'Genre detection failed' },
{ status: 500 }
)
}
}
4. React Hook for Genre Display
// apps/analyzer-app/src/hooks/useGenreDetection.ts
import { useQuery } from '@tanstack/react-query'
import { getSupabaseBrowserClient } from '@mystoryflow/database'
export function useGenreDetection(manuscriptId: string) {
const supabase = getSupabaseBrowserClient()
return useQuery({
queryKey: ['genre-detection', manuscriptId],
queryFn: async () => {
const { data, error } = await supabase
.from('analyzer.genre_detections')
.select('*')
.eq('manuscript_id', manuscriptId)
.order('created_at', { ascending: false })
.limit(1)
// maybeSingle() yields null instead of an error when no detection exists yet
.maybeSingle()
if (error) throw error
return data
},
enabled: !!manuscriptId
})
}
5. Genre Display Component
// apps/analyzer-app/src/components/manuscript/GenreDisplay.tsx
import { Badge } from '@mystoryflow/ui'
import { useGenreDetection } from '@/hooks/useGenreDetection'
export function GenreDisplay({ manuscriptId }: { manuscriptId: string }) {
const { data: detection, isLoading } = useGenreDetection(manuscriptId)
if (isLoading) return <div>Detecting genre...</div>
if (!detection) return null
const confidenceColor = detection.primary_confidence > 0.8
? 'success'
: detection.primary_confidence > 0.6
? 'warning'
: 'default'
return (
<div className="space-y-2">
<div className="flex items-center gap-2">
<Badge variant={confidenceColor}>
{detection.primary_genre} ({Math.round(detection.primary_confidence * 100)}%)
</Badge>
{detection.secondary_genre && detection.secondary_confidence != null && (
<Badge variant="secondary">
{detection.secondary_genre} ({Math.round(detection.secondary_confidence * 100)}%)
</Badge>
)}
</div>
{detection.subgenres.length > 0 && (
<div className="flex gap-1">
{detection.subgenres.map(subgenre => (
<Badge key={subgenre} variant="outline" size="sm">
{subgenre}
</Badge>
))}
</div>
)}
</div>
)
}
MVP Acceptance Criteria
- Detect primary genre with confidence score
- Support for 6 core genres (romance, mystery, fantasy, sci-fi, thriller, literary)
- Secondary genre detection when applicable
- Subgenre identification
- Genre marker extraction (themes, plot elements, etc.)
- 80%+ accuracy on genre classification
- Results saved to database
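The 80%+ accuracy criterion needs a labeled evaluation set to be checkable. The sketch below shows one way to compute it, assuming a set of manuscripts with human-assigned genres; the `LabeledPrediction` shape and all three function names are hypothetical and do not exist in the codebase described here.

```typescript
// Hypothetical evaluation record: a human-assigned genre code and the
// primary genre the detector returned for the same manuscript.
interface LabeledPrediction {
  expectedGenre: string
  predictedGenre: string
}

// Fraction of samples whose predicted primary genre matches the label.
export function classificationAccuracy(samples: LabeledPrediction[]): number {
  if (samples.length === 0) return 0
  const correct = samples.filter(s => s.expectedGenre === s.predictedGenre).length
  return correct / samples.length
}

// Accuracy per expected genre, which shows whether one genre is
// dragging the overall number down.
export function accuracyByGenre(samples: LabeledPrediction[]): Record<string, number> {
  const totals: Record<string, { correct: number; count: number }> = {}
  for (const s of samples) {
    const entry = totals[s.expectedGenre] ?? { correct: 0, count: 0 }
    entry.count += 1
    if (s.expectedGenre === s.predictedGenre) entry.correct += 1
    totals[s.expectedGenre] = entry
  }
  const result: Record<string, number> = {}
  for (const genre of Object.keys(totals)) {
    result[genre] = totals[genre].correct / totals[genre].count
  }
  return result
}

// True when overall accuracy reaches the target (0.8 per the criterion above).
export function meetsAccuracyTarget(samples: LabeledPrediction[], target = 0.8): boolean {
  return classificationAccuracy(samples) >= target
}
```

Reporting the per-genre breakdown alongside the overall figure is a design choice worth considering: with only six core genres, a single weak genre can hide behind a passing average.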
Post-MVP Enhancements
- Support for 20+ genres and subgenres
- Multi-label classification for genre blends
- Historical fiction period detection
- Non-fiction category detection
- Genre trend analysis
- Market positioning recommendations
- Comparative genre analysis
- User genre override option
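Two of the enhancements above, multi-label classification and the user genre override, can be illustrated with small pure functions. This is only a sketch under assumed names and defaults: `GenreCandidate`, `selectGenreBlend`, `resolveGenre`, the 0.4 confidence floor, and the cap of three labels are all hypothetical.

```typescript
// Hypothetical candidate shape: a genre code with its confidence (0.0 to 1.0).
interface GenreCandidate {
  code: string
  confidence: number
}

// Multi-label selection: keep every genre at or above the confidence floor,
// strongest first, capped at maxLabels. filter() returns a new array, so the
// caller's input is not reordered by the sort.
export function selectGenreBlend(
  candidates: GenreCandidate[],
  minConfidence = 0.4,
  maxLabels = 3
): GenreCandidate[] {
  return candidates
    .filter(c => c.confidence >= minConfidence)
    .sort((a, b) => b.confidence - a.confidence)
    .slice(0, maxLabels)
}

// User override: a non-blank user-chosen genre wins over the detected one.
export function resolveGenre(detected: string, userOverride?: string | null): string {
  const trimmed = userOverride ? userOverride.trim() : ''
  return trimmed.length > 0 ? trimmed : detected
}
```

Keeping the detected genre stored separately from any override would let accuracy still be measured against what the detector actually produced.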
Implementation Time
- Development: 1 day
- Testing: 0.5 days
- Total: 1.5 days
Dependencies
- F004-AI-SERVICES (AI integration)
- F006-CONTENT-EXTRACTION (text content needed)
Next Feature
After completion, proceed to F008-TEXT-CHUNKING for optimal AI processing.