VeritaScribe Documentation

AI-Powered Thesis Review Tool

1 Welcome to VeritaScribe

VeritaScribe is an intelligent document analysis system that automatically reviews PDF thesis documents for quality issues including grammar errors, content plausibility problems, and citation format inconsistencies.

1.1 What is VeritaScribe?

VeritaScribe combines AI language models with structured document processing to review academic documents. It is built with modern Python tools: DSPy for LLM orchestration, Pydantic for structured data modeling, and PyMuPDF for PDF processing.

1.2 Key Features

1.2.1 🔍 Comprehensive Analysis

  • Grammar and linguistic error detection
  • Content plausibility validation
  • Citation format verification
  • Error severity classification

1.2.2 📊 Smart Reporting

  • Detailed error reports with locations
  • Visual analytics and charts
  • JSON data export
  • Markdown reports

1.2.3 ⚙️ Flexible Configuration

  • Multiple LLM model support
  • Customizable analysis parameters
  • Citation style configuration
  • Processing optimization settings

1.2.4 🚀 Easy to Use

  • Command-line interface
  • Quick analysis mode
  • Demo mode for testing
  • Comprehensive error messages

1.3 How It Works

flowchart LR
    A[PDF Input] --> B[Text Extraction]
    B --> C[LLM Analysis]
    C --> D[Error Detection]
    D --> E[Report Generation]
    E --> F[Visualizations]

  1. PDF Processing: Extracts text while preserving layout and location information
  2. AI Analysis: Uses large language models to check the extracted text for grammar, content plausibility, and citation format errors
  3. Error Classification: Categorizes and scores errors by type and severity
  4. Report Generation: Creates comprehensive reports and visualizations
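
The classification step (step 3) can be pictured with a small sketch. The names used here (`ErrorType`, `Severity`, `Finding`, `summarize`) and the three-level severity scale are hypothetical and not taken from the VeritaScribe codebase. The project itself uses Pydantic models; plain dataclasses are used below only so that the example runs without extra dependencies.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum, IntEnum


class ErrorType(Enum):
    # Mirrors the three error families described in this document.
    GRAMMAR = "grammar"
    CONTENT = "content"
    CITATION = "citation"


class Severity(IntEnum):
    # Hypothetical scale; a higher value means a more serious issue.
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class Finding:
    error_type: ErrorType
    severity: Severity
    page: int      # 1-based page number where the issue was found
    message: str


def summarize(findings):
    """Count findings per type and return them sorted, most severe first."""
    counts = Counter(f.error_type for f in findings)
    ordered = sorted(findings, key=lambda f: (-f.severity, f.page))
    return counts, ordered


findings = [
    Finding(ErrorType.GRAMMAR, Severity.LOW, 3, "Missing comma"),
    Finding(ErrorType.CITATION, Severity.HIGH, 12, "Reference not in bibliography"),
    Finding(ErrorType.CONTENT, Severity.MEDIUM, 7, "Claim contradicts earlier result"),
]
counts, ordered = summarize(findings)
```

Sorting by severity first and page second means a report built from `ordered` would lead with the most serious issues while keeping ties in reading order.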

1.4 Quick Start

Get started with VeritaScribe in just a few steps:

  1. Install dependencies:

    uv sync

  2. Configure API key:

    cp .env.example .env
    # Edit .env to add your OpenAI API key

  3. Try the demo:

    uv run python -m veritascribe demo

  4. Analyze your document:

    uv run python -m veritascribe analyze your_thesis.pdf
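
To review several documents in one go, the analyze command from step 4 can be wrapped in a short script. This is a minimal sketch, not part of VeritaScribe: the helper names `build_analyze_command` and `analyze_all` are hypothetical, and the script assumes it is run from the project directory so that `uv` can find the environment.

```python
import subprocess
from pathlib import Path


def build_analyze_command(pdf_path):
    """Return the argument list for analyzing one PDF, as in step 4."""
    return ["uv", "run", "python", "-m", "veritascribe", "analyze", str(pdf_path)]


def analyze_all(folder):
    """Run the analyze command on every PDF in `folder`.

    Returns a dict mapping each file name to the command's exit code.
    """
    results = {}
    for pdf in sorted(Path(folder).glob("*.pdf")):
        completed = subprocess.run(build_analyze_command(pdf))
        results[pdf.name] = completed.returncode
    return results
```

Calling `analyze_all("theses")` would process each PDF in a `theses` folder in alphabetical order. A non-zero exit code for a file suggests that its analysis failed and is worth rerunning on its own.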

1.5 Error Types Detected

1.5.1 Grammar and Linguistics

  • Spelling mistakes and typos
  • Grammatical inconsistencies
  • Punctuation errors
  • Style and readability issues

1.5.2 Content Quality

  • Logical inconsistencies
  • Factual accuracy concerns
  • Argument structure problems
  • Citation-content mismatches

1.5.3 Citation Format

  • Incorrect citation style formatting
  • Missing or incomplete references
  • Inconsistent bibliography formatting
  • Citation accuracy issues

1.6 Architecture Overview

VeritaScribe follows a modular pipeline architecture:

  • Configuration Layer: Environment-based settings management
  • PDF Processing: Text extraction with layout preservation
  • LLM Analysis: DSPy-based structured analysis modules
  • Data Models: Pydantic schemas for type safety
  • Report Generation: Multi-format output with visualizations
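
As an illustration of the configuration layer, environment-based settings can be loaded into a typed object at startup. This is a sketch under stated assumptions rather than VeritaScribe's actual settings code: the `Settings` fields, the `load_settings` helper, and the variable names `OPENAI_API_KEY`, `EXAMPLE_MODEL`, and `EXAMPLE_CITATION_STYLE` are all hypothetical, so check `.env.example` for the real names.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    api_key: str
    model: str
    citation_style: str


def load_settings(env=None):
    """Build Settings from environment variables.

    Raises ValueError early if the API key is missing, so that the
    problem surfaces before any document is processed.
    """
    env = os.environ if env is None else env
    api_key = env.get("OPENAI_API_KEY", "")  # assumed variable name
    if not api_key:
        raise ValueError("OPENAI_API_KEY is not set; add it to your .env file")
    return Settings(
        api_key=api_key,
        model=env.get("EXAMPLE_MODEL", "example-model"),  # placeholder default
        citation_style=env.get("EXAMPLE_CITATION_STYLE", "APA"),  # placeholder default
    )
```

Passing a plain dict as `env` makes the loader easy to test without touching the real environment, for example `load_settings({"OPENAI_API_KEY": "sk-test"})`.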

1.7 Next Steps

1.8 Support

If you encounter issues or have questions:

  1. Check the Troubleshooting Guide
  2. Run system diagnostics: uv run python -m veritascribe test
  3. Review configuration: uv run python -m veritascribe config

VeritaScribe is designed for defensive security and academic quality assurance purposes only.