F009 - Core Analysis Engine
Objective
Build the core AI-powered manuscript analysis framework that evaluates 200+ criteria across 12 categories to provide comprehensive feedback on manuscripts.
Requirements
Functional Requirements
- Analyze manuscripts across 12 core categories with 200+ evaluation points
- Generate detailed scores and feedback for each category
- Provide genre-specific analysis using appropriate templates
- Support parallel processing to keep total analysis time under 5 minutes
- Generate actionable improvement suggestions
- Create market readiness assessments
Technical Requirements
- Integrate with OpenAI GPT-4 and Claude AI models
- Handle manuscripts up to 150,000 words
- Process text in optimized chunks for AI consumption
- Cache analysis results for performance
- Support retry logic and error handling
- Generate structured analysis data for reports (see the type sketch below)
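The requirements above call for structured improvement suggestions and a market readiness assessment. The sketch below shows one plausible shape for those outputs, mirroring the fields the engine code later in this section produces; the field names and types are assumptions, not a fixed schema.
// Hedged sketch of the structured outputs referenced above; names and types are assumptions.
export interface ImprovementSuggestion {
  category: string
  priority: number            // derived from category score and weight (assumption)
  title: string
  description: string
  actionItems: string[]
  estimatedImpact: string
  resources: string[]
}

export interface PublishingReadiness {
  level: 'ready' | 'needs_revision' | 'major_revision' | 'not_ready'
  description: string
  recommendations: string[]
  timeToMarket: string        // e.g. '2-4 weeks'
}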
Analysis Framework Architecture
1. Core Analysis Categories (200+ Evaluation Points)
// Core analysis categories with weights and evaluation points
export const ANALYSIS_CATEGORIES = {
structure: {
weight: 0.15,
evaluationPoints: 25,
subcategories: [
'opening_strength',
'three_act_structure',
'chapter_organization',
'pacing_variation',
'tension_progression'
]
},
character_development: {
weight: 0.20,
evaluationPoints: 30,
subcategories: [
'protagonist_development',
'supporting_characters',
'antagonist_strength',
'character_consistency',
'relationship_dynamics'
]
},
plot_and_conflict: {
weight: 0.15,
evaluationPoints: 25,
subcategories: [
'central_conflict',
'subplot_integration',
'conflict_escalation',
'resolution_satisfaction',
'plot_holes_detection'
]
},
writing_craft: {
weight: 0.15,
evaluationPoints: 20,
subcategories: [
'prose_quality',
'voice_consistency',
'show_vs_tell',
'description_balance',
'technical_proficiency'
]
},
dialogue: {
weight: 0.10,
evaluationPoints: 15,
subcategories: [
'character_voice_distinction',
'dialogue_naturalness',
'subtext_usage',
'dialogue_tags',
'conversation_flow'
]
},
pacing: {
weight: 0.10,
evaluationPoints: 12,
subcategories: [
'scene_pacing',
'chapter_rhythm',
'action_sequence_flow',
'quiet_moment_balance'
]
},
world_building: {
weight: 0.05,
evaluationPoints: 10,
subcategories: [
'setting_development',
'world_consistency',
'sensory_details',
'cultural_authenticity'
]
},
theme_and_meaning: {
weight: 0.05,
evaluationPoints: 8,
subcategories: [
'thematic_clarity',
'message_integration',
'symbolic_usage',
'emotional_resonance'
]
},
market_readiness: {
weight: 0.05,
evaluationPoints: 10,
subcategories: [
'genre_compliance',
'target_audience_fit',
'commercial_viability',
'market_positioning'
]
}
}
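Each category entry shares the same shape. A hedged typing sketch (the CategoryConfig name is an assumption) keeps indexed access such as ANALYSIS_CATEGORIES[category] type-safe when categories are iterated by key:
// Assumed shape of each entry in ANALYSIS_CATEGORIES above.
export interface CategoryConfig {
  weight: number              // contribution to the overall score; the weights above sum to 1.0
  evaluationPoints: number
  subcategories: string[]
}

// Keys of the category map, useful for typed iteration:
export type AnalysisCategory = keyof typeof ANALYSIS_CATEGORIES
// e.g. const dialogueConfig: CategoryConfig = ANALYSIS_CATEGORIES['dialogue']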
2. Analysis Engine Core Class
// packages/manuscript-analysis/src/engine/manuscript-analyzer.ts
export class ManuscriptAnalysisEngine {
private aiService: AIAnalysisService
private genreTemplates: GenreTemplateService
private cache: AnalysisCache
constructor(
aiService: AIAnalysisService,
genreTemplates: GenreTemplateService,
cache: AnalysisCache
) {
this.aiService = aiService
this.genreTemplates = genreTemplates
this.cache = cache
}
async analyzeManuscript(
manuscriptId: string,
content: ExtractedContent,
genre: GenreDetection,
sessionId: string
): Promise<AnalysisResult> {
const startTime = Date.now()
// Update session progress
await this.updateSessionProgress(sessionId, 'starting_analysis', 10)
// Get genre-specific template
const template = await this.genreTemplates.getTemplate(genre.primaryGenre)
// Split content into analyzable chunks
const chunks = this.prepareContentChunks(content, template.chunkingStrategy)
await this.updateSessionProgress(sessionId, 'content_prepared', 20)
// Run parallel analysis across all categories
const categoryPromises = Object.keys(ANALYSIS_CATEGORIES).map(category =>
this.analyzeCategoryWithRetry(
category,
chunks,
content,
genre,
template,
sessionId
)
)
const categoryResults = await Promise.allSettled(categoryPromises)
await this.updateSessionProgress(sessionId, 'analysis_complete', 80)
// Process results and handle any failures
const processedResults = this.processCategoryResults(categoryResults, template)
// Calculate overall scores
const scores = this.calculateOverallScores(processedResults, template.weights)
// Generate improvement suggestions
const suggestions = await this.generateImprovementSuggestions(
processedResults,
content,
genre,
template
)
// Create market insights
const marketInsights = await this.generateMarketInsights(
processedResults,
genre,
template
)
await this.updateSessionProgress(sessionId, 'generating_insights', 90)
const result: AnalysisResult = {
manuscriptId,
overallScore: scores.overall,
categoryScores: scores.categories,
detailedFeedback: processedResults.feedback,
strengths: this.identifyStrengths(processedResults),
weaknesses: this.identifyWeaknesses(processedResults),
improvementSuggestions: suggestions,
marketInsights,
genreCompliance: this.assessGenreCompliance(processedResults, template),
publishingReadiness: this.assessPublishingReadiness(scores),
processingMetadata: {
aiModel: this.aiService.getModelVersion(),
processingTime: Date.now() - startTime,
chunksProcessed: chunks.length,
templateVersion: template.version
}
}
// Cache the result
await this.cache.store(manuscriptId, result)
await this.updateSessionProgress(sessionId, 'completed', 100)
return result
}
private prepareContentChunks(
content: ExtractedContent,
strategy: ChunkingStrategy
): ContentChunk[] {
const chunks: ContentChunk[] = []
const maxChunkSize = strategy.maxTokens || 3000
// Split by chapters first
for (const [chapterIndex, chapter] of content.structure.chapters.entries()) {
const chapterText = chapter.content.join('\n')
if (chapterText.length <= maxChunkSize) {
chunks.push({
type: 'chapter',
content: chapterText,
metadata: {
chapterTitle: chapter.title,
chapterNumber: chapterIndex + 1,
wordCount: this.countWords(chapterText)
}
})
} else {
// Split large chapters into smaller chunks
const subChunks = this.splitLargeText(chapterText, maxChunkSize)
subChunks.forEach((subChunk, index) => {
chunks.push({
type: 'chapter_section',
content: subChunk,
metadata: {
chapterTitle: chapter.title,
sectionIndex: index,
wordCount: this.countWords(subChunk)
}
})
})
}
}
return chunks
}
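// Hedged sketches of the chunking helpers used above; the production versions
// might split on sentence or scene boundaries rather than plain whitespace.
private splitLargeText(text: string, maxChunkSize: number): string[] {
  const words = text.split(/\s+/)
  const chunks: string[] = []
  let current: string[] = []
  let currentLength = 0
  for (const word of words) {
    if (currentLength + word.length + 1 > maxChunkSize && current.length > 0) {
      chunks.push(current.join(' '))
      current = []
      currentLength = 0
    }
    current.push(word)
    currentLength += word.length + 1
  }
  if (current.length > 0) {
    chunks.push(current.join(' '))
  }
  return chunks
}

private countWords(text: string): number {
  return text.split(/\s+/).filter(Boolean).length
}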
private async analyzeCategoryWithRetry(
category: string,
chunks: ContentChunk[],
fullContent: ExtractedContent,
genre: GenreDetection,
template: GenreTemplate,
sessionId: string,
maxRetries: number = 3
): Promise<CategoryAnalysisResult> {
let lastError: Error | null = null
for (let attempt = 1; attempt <= maxRetries; attempt++) {
try {
return await this.analyzeCategory(
category,
chunks,
fullContent,
genre,
template,
sessionId
)
} catch (error) {
lastError = error as Error
if (attempt < maxRetries) {
// Exponential backoff
await new Promise(resolve =>
setTimeout(resolve, Math.pow(2, attempt) * 1000)
)
}
}
}
// If all retries failed, return partial result
return this.createPartialCategoryResult(category, lastError)
}
private async analyzeCategory(
category: string,
chunks: ContentChunk[],
fullContent: ExtractedContent,
genre: GenreDetection,
template: GenreTemplate,
sessionId: string
): Promise<CategoryAnalysisResult> {
const categoryConfig = ANALYSIS_CATEGORIES[category]
const genreSpecificCriteria = template.categories[category]
// Build analysis prompt
const prompt = this.buildCategoryPrompt(
category,
categoryConfig,
genreSpecificCriteria,
genre,
chunks.slice(0, 10) // Analyze first 10 chunks for detailed analysis
)
// Get AI analysis
const aiResult = await this.aiService.analyzeWithAI(
prompt,
category,
{
genre: genre.primaryGenre,
wordCount: fullContent.wordCount,
chunkCount: chunks.length
}
)
// Process and validate result
const processedResult = this.processCategoryResult(aiResult, categoryConfig)
// Update progress
const progressUpdate = Math.floor(20 + (60 * this.getCategoryProgress(category)))
await this.updateSessionProgress(
sessionId,
`analyzing_${category}`,
progressUpdate
)
return processedResult
}
private buildCategoryPrompt(
category: string,
config: CategoryConfig,
genreCriteria: GenreSpecificCriteria,
genre: GenreDetection,
chunks: ContentChunk[]
): string {
const basePrompt = `You are a professional manuscript editor specializing in ${genre.primaryGenre} fiction.
Analyze the following manuscript excerpt for ${category.replace(/_/g, ' ')} quality.
GENRE: ${genre.primaryGenre} (${Math.round(genre.confidence * 100)}% confidence)
ANALYSIS CRITERIA for ${category}:
${this.formatAnalysisCriteria(config.subcategories, genreCriteria)}
EVALUATION POINTS (${config.evaluationPoints} total):
${this.formatEvaluationPoints(config.subcategories)}
MANUSCRIPT CONTENT:
${chunks.map(chunk => `[${chunk.type.toUpperCase()}] ${chunk.content.substring(0, 2000)}`).join('\n\n')}
Provide your analysis in the following JSON format:
{
"overall_score": number (0-100),
"subcategory_scores": {
${config.subcategories.map(sub => `"${sub}": number (0-100)`).join(',\n ')}
},
"detailed_feedback": {
${config.subcategories.map(sub => `"${sub}": "specific feedback for ${sub}"`).join(',\n ')}
},
"strengths": ["list of specific strengths found"],
"weaknesses": ["list of specific weaknesses found"],
"specific_examples": {
"good_examples": ["quotes from text showing strengths"],
"improvement_areas": ["quotes from text needing improvement"]
},
"improvement_suggestions": ["specific, actionable suggestions"]
}`
return basePrompt
}
private formatAnalysisCriteria(
subcategories: string[],
genreCriteria: GenreSpecificCriteria
): string {
return subcategories.map(sub => {
const criteria = genreCriteria[sub]
return `- ${sub.replace(/_/g, ' ').toUpperCase()}: ${criteria.description}\n Key aspects: ${criteria.keyAspects.join(', ')}`
}).join('\n')
}
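// Hedged sketch of formatEvaluationPoints (referenced in buildCategoryPrompt but not
// shown above): the prompt only needs a readable listing of the scored subcategories.
private formatEvaluationPoints(subcategories: string[]): string {
  return subcategories
    .map((sub, index) => `${index + 1}. ${sub.replace(/_/g, ' ')}`)
    .join('\n')
}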
private calculateOverallScores(
results: ProcessedCategoryResults,
weights: CategoryWeights
): OverallScores {
let weightedSum = 0
let totalWeight = 0
const categoryScores: Record<string, number> = {}
for (const [category, result] of Object.entries(results)) {
if (result.success && result.score !== null) {
const weight = weights[category] || ANALYSIS_CATEGORIES[category].weight
weightedSum += result.score * weight
totalWeight += weight
categoryScores[category] = result.score
}
}
const overall = totalWeight > 0 ? Math.round(weightedSum / totalWeight) : 0
return {
overall,
categories: categoryScores
}
}
private async generateImprovementSuggestions(
results: ProcessedCategoryResults,
content: ExtractedContent,
genre: GenreDetection,
template: GenreTemplate
): Promise<ImprovementSuggestion[]> {
const suggestions: ImprovementSuggestion[] = []
// Identify the lowest-scoring categories for focused improvement
const sortedCategories = Object.entries(results)
.filter(([_, result]) => result.success)
.sort(([_, a], [__, b]) => a.score - b.score)
.slice(0, 5) // Top 5 areas for improvement
for (const [category, result] of sortedCategories) {
if (result.score < 70) { // Only suggest improvements for low scores
const categoryConfig = ANALYSIS_CATEGORIES[category]
const suggestion = await this.generateCategorySuggestion(
category,
result,
content,
genre,
template
)
suggestions.push({
category,
priority: this.calculateSuggestionPriority(result.score, categoryConfig.weight),
title: suggestion.title,
description: suggestion.description,
actionItems: suggestion.actionItems,
estimatedImpact: suggestion.estimatedImpact,
resources: suggestion.resources
})
}
}
return suggestions
}
private identifyStrengths(results: ProcessedCategoryResults): string[] {
const strengths: string[] = []
for (const [category, result] of Object.entries(results)) {
if (result.success && result.score >= 80) {
strengths.push(...result.strengths)
}
}
return strengths.slice(0, 10) // Top 10 strengths
}
private identifyWeaknesses(results: ProcessedCategoryResults): string[] {
const weaknesses: string[] = []
for (const [category, result] of Object.entries(results)) {
if (result.success && result.score < 60) {
weaknesses.push(...result.weaknesses)
}
}
return weaknesses.slice(0, 10) // Top 10 weaknesses
}
private assessPublishingReadiness(scores: OverallScores): PublishingReadiness {
const overall = scores.overall
if (overall >= 85) {
return {
level: 'ready',
description: 'Manuscript is ready for publication with minimal revisions',
recommendations: ['Consider final proofreading', 'Review market positioning'],
timeToMarket: '2-4 weeks'
}
} else if (overall >= 70) {
return {
level: 'needs_revision',
description: 'Manuscript needs moderate revisions before publication',
recommendations: ['Address major weaknesses', 'Consider developmental editing'],
timeToMarket: '2-3 months'
}
} else if (overall >= 50) {
return {
level: 'major_revision',
description: 'Manuscript requires significant revision and development',
recommendations: ['Extensive rewriting needed', 'Consider professional editing'],
timeToMarket: '6-12 months'
}
} else {
return {
level: 'not_ready',
description: 'Manuscript needs fundamental restructuring and development',
recommendations: ['Major rewrite required', 'Consider writing courses or coaching'],
timeToMarket: '12+ months'
}
}
}
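// Hedged sketches of the progress helpers used throughout this class. The real
// updateSessionProgress would persist progress to the analysis_sessions table (see
// Database Integration below) and push a real-time update to the client.
private async updateSessionProgress(
  sessionId: string,
  stage: string,
  percentComplete: number
): Promise<void> {
  console.log(`[analysis:${sessionId}] ${stage} (${percentComplete}%)`)
}

// Returns how far through the category list a category sits (0-1), so per-category
// progress can be spread across the 20-80% band of the overall session.
private getCategoryProgress(category: string): number {
  const categories = Object.keys(ANALYSIS_CATEGORIES)
  return (categories.indexOf(category) + 1) / categories.length
}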
}
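A brief usage sketch of the engine, assuming it runs inside an async request handler; the wiring and identifiers below are illustrative assumptions:
// Illustrative wiring only; constructors and IDs are assumptions.
const engine = new ManuscriptAnalysisEngine(aiService, genreTemplates, cache)

const result = await engine.analyzeManuscript(
  manuscriptId,       // manuscript being analyzed
  extractedContent,   // produced by F008-TEXT-CHUNKING
  genreDetection,     // produced by F007-GENRE-DETECTION
  sessionId           // analysis_sessions row created when the analyze endpoint is called
)
console.log(result.overallScore, result.publishingReadiness.level)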
3. AI Service Integration
// packages/manuscript-analysis/src/ai/ai-analysis-service.ts
import OpenAI from 'openai'
import Anthropic from '@anthropic-ai/sdk'
export class AIAnalysisService {
private openAIClient: OpenAI
private claudeClient: Anthropic
private modelRouter: ModelRouter
constructor(
openAIClient: OpenAI,
claudeClient: Anthropic,
modelRouter: ModelRouter
) {
this.openAIClient = openAIClient
this.claudeClient = claudeClient
this.modelRouter = modelRouter
}
async analyzeWithAI(
prompt: string,
analysisType: string,
context: AnalysisContext
): Promise<AIAnalysisResult> {
const model = this.modelRouter.selectModel(analysisType)
try {
let result: any
if (model.startsWith('gpt')) {
result = await this.analyzeWithOpenAI(prompt, model, context)
} else if (model.startsWith('claude')) {
result = await this.analyzeWithClaude(prompt, model, context)
} else {
throw new Error(`Unsupported model: ${model}`)
}
return this.parseAIResponse(result, analysisType)
} catch (error) {
// Fallback to alternative model
const fallbackModel = this.modelRouter.getFallback(model)
return this.analyzeWithFallback(prompt, fallbackModel, analysisType, context)
}
}
private async analyzeWithOpenAI(
prompt: string,
model: string,
context: AnalysisContext
): Promise<any> {
const response = await this.openAIClient.chat.completions.create({
model,
messages: [
{
role: 'system',
content: 'You are a professional manuscript editor and literary analyst.'
},
{
role: 'user',
content: prompt
}
],
temperature: 0.3, // Low temperature for consistent analysis
max_tokens: 2000,
response_format: { type: 'json_object' }
})
return response.choices[0].message.content
}
private async analyzeWithClaude(
prompt: string,
model: string,
context: AnalysisContext
): Promise<any> {
const response = await this.claudeClient.messages.create({
model,
max_tokens: 2000,
temperature: 0.3,
messages: [
{
role: 'user',
content: prompt
}
]
})
return response.content[0].text
}
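// Hedged sketch of parseAIResponse (used in analyzeWithAI above): both providers are
// prompted to return JSON, but the raw text is still validated before use. A production
// version would also validate the schema and clamp scores to the 0-100 range.
private parseAIResponse(raw: string, analysisType: string): AIAnalysisResult {
  try {
    const parsed = JSON.parse(raw)
    return { analysisType, ...parsed } as AIAnalysisResult
  } catch (error) {
    throw new Error(`Failed to parse AI response for ${analysisType}: ${(error as Error).message}`)
  }
}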
}
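The ModelRouter used above is not defined by this feature (it is likely provided by the F004-AI-SERVICES dependency); below is a minimal sketch of the behavior the engine expects from it, with model names and the category-to-model mapping as assumptions:
// Minimal sketch of the routing contract; model names and the mapping are assumptions.
export class ModelRouter {
  // Categories that may benefit from a specific model can be mapped explicitly.
  private preferred: Record<string, string> = {
    writing_craft: 'claude-3-opus-20240229',
    dialogue: 'claude-3-opus-20240229'
  }

  selectModel(analysisType: string): string {
    return this.preferred[analysisType] ?? 'gpt-4-turbo'
  }

  getFallback(model: string): string {
    // Fall back to the other provider so a single outage does not block the analysis.
    return model.startsWith('gpt') ? 'claude-3-opus-20240229' : 'gpt-4-turbo'
  }
}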
Database Integration
Updates the analysis_sessions and manuscript_analyses tables to track progress and store results.
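A hedged sketch of the two tables as TypeScript row types; the actual columns live in the shared database schema, so the names below are assumptions:
// Assumed row shapes for the two tables named above.
export interface AnalysisSessionRow {
  id: string
  manuscriptId: string
  stage: string                          // e.g. 'content_prepared', 'analyzing_dialogue', 'completed'
  progress: number                       // 0-100
  status: 'pending' | 'running' | 'completed' | 'failed'
  createdAt: Date
  updatedAt: Date
}

export interface ManuscriptAnalysisRow {
  id: string
  manuscriptId: string
  sessionId: string
  overallScore: number
  categoryScores: Record<string, number>
  result: unknown                        // full AnalysisResult payload stored as JSON
  aiModel: string
  processingTimeMs: number
  createdAt: Date
}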
API Endpoints
// POST /api/manuscripts/{id}/analyze
// Start new analysis session
// GET /api/manuscripts/{id}/analysis/{sessionId}
// Get analysis progress and results
// POST /api/manuscripts/{id}/analysis/{sessionId}/retry
// Retry failed analysis categories
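A sketch of how a client might drive these endpoints, assuming it runs inside an async function; the response fields shown are assumptions that mirror the session and AnalysisResult shapes above:
// Illustrative client flow; paths match the endpoints above, payload fields are assumptions.
const start = await fetch(`/api/manuscripts/${manuscriptId}/analyze`, { method: 'POST' })
const { sessionId } = await start.json()

// Poll for progress until the analysis completes (real-time updates could replace polling).
const progress = await fetch(`/api/manuscripts/${manuscriptId}/analysis/${sessionId}`)
const status = await progress.json()
// e.g. { stage: 'analyzing_dialogue', progress: 55 } while running,
//      { progress: 100, result: { overallScore: 78, ... } } once complete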
Testing Requirements
Unit Tests
- Each analysis category function
- Score calculation algorithms
- Improvement suggestion generation
- AI response parsing and validation
Integration Tests
- Complete analysis workflow
- AI service integration with both OpenAI and Claude
- Error handling and retry logic
- Results caching and retrieval
E2E Tests
- Author uploads manuscript and gets analysis
- Progress tracking works correctly
- Results match expected quality standards
- Analysis completes within the 5-minute target
Acceptance Criteria
Must Have
- Analyze 200+ evaluation points across 12 categories
- Complete analysis in under 5 minutes for 150,000-word manuscripts
- Generate structured feedback and scores
- Support both OpenAI and Claude AI models
- Provide genre-specific analysis
- Create actionable improvement suggestions
Should Have
- Progress tracking with real-time updates
- Retry logic for failed analysis components
- Results caching for performance
- Market readiness assessment
- Publishing timeline estimates
Could Have
- Competitive manuscript analysis
- Trend analysis across multiple manuscripts
- Collaborative analysis with multiple reviewers
- Advanced visualization of results
Dependencies
- F004-AI-SERVICES (AI service setup)
- F007-GENRE-DETECTION (genre classification)
- F008-TEXT-CHUNKING (content preparation)
- F010-GENRE-TEMPLATES (analysis templates)
Estimated Effort
- Development: 6 days
- Testing: 3 days
- AI Model Fine-tuning: 2 days
- Documentation: 1 day
- Total: 12 days
Next Feature
After completion, proceed to F010-GENRE-TEMPLATES to implement genre-specific analysis templates and criteria.