1 Configuration Reference

VeritaScribe uses environment variables for configuration, providing flexibility and security for different deployment scenarios.

1.1 Configuration Overview

Configuration is managed through:

  1. Environment variables (highest priority)
  2. .env file (for local development)
  3. Default values (built into the application)
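The precedence above can be sketched as follows. This is a minimal illustration of the lookup order, not VeritaScribe's actual loading code; the helper names and default values are hypothetical.

```python
import os

# Hypothetical built-in defaults, for illustration only.
DEFAULTS = {"MAX_TOKENS": "2000", "TEMPERATURE": "0.1"}

def load_dotenv_file(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file, if present."""
    values = {}
    try:
        with open(path) as f:
            for line in f:
                line = line.strip()
                if line and not line.startswith("#") and "=" in line:
                    key, _, value = line.partition("=")
                    values[key.strip()] = value.strip()
    except FileNotFoundError:
        pass
    return values

def resolve(key, dotenv):
    """Environment variable > .env entry > built-in default."""
    if key in os.environ:
        return os.environ[key]
    if key in dotenv:
        return dotenv[key]
    return DEFAULTS.get(key)
```

A process-level environment variable always wins, so production deployments can override a checked-in `.env` without editing files.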

1.2 LLM Provider Configuration

VeritaScribe supports multiple LLM providers for flexibility, cost optimization, and access to different models.

1.2.1 Provider Selection

# Choose your LLM provider (default: openai)
LLM_PROVIDER=openai  # or: openrouter, anthropic, custom

Available Providers:

  • openai: Direct OpenAI API access
  • openrouter: Access to 100+ models through OpenRouter
  • anthropic: Direct Anthropic Claude API access
  • custom: OpenAI-compatible endpoints (Ollama, Azure, etc.)
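As a rough sketch of how the provider setting maps to credentials, the table below mirrors the environment variables documented in the following subsections; the helper name is hypothetical, not part of VeritaScribe's API.

```python
# Which environment variable supplies the API key for each provider.
# "custom" reuses OPENAI_API_KEY together with OPENAI_BASE_URL.
PROVIDER_KEY_VARS = {
    "openai": "OPENAI_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "custom": "OPENAI_API_KEY",
}

def required_key_var(provider: str) -> str:
    """Return the env var that must be set for the chosen LLM_PROVIDER."""
    try:
        return PROVIDER_KEY_VARS[provider]
    except KeyError:
        raise ValueError(f"Unknown LLM_PROVIDER: {provider!r}")
```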

1.2.2 OpenAI Configuration

For LLM_PROVIDER=openai:

OPENAI_API_KEY=your_openai_api_key_here

Getting an OpenAI API Key:

  1. Visit OpenAI Platform
  2. Sign in or create an account
  3. Click “Create new secret key”
  4. Copy the key and add it to your .env file

1.2.3 OpenRouter Configuration

For LLM_PROVIDER=openrouter (access to 100+ models):

OPENROUTER_API_KEY=sk-or-your_openrouter_api_key_here

Getting an OpenRouter API Key:

  1. Visit OpenRouter
  2. Sign up and verify your account
  3. Create a new API key
  4. Add credits or set up billing

1.2.4 Anthropic Configuration

For LLM_PROVIDER=anthropic (direct Claude access):

ANTHROPIC_API_KEY=sk-ant-your_anthropic_api_key_here

Getting an Anthropic API Key:

  1. Visit Anthropic Console
  2. Sign up and verify your account
  3. Navigate to the API Keys section
  4. Create a new key and add credits

1.2.5 Custom Provider Configuration

For LLM_PROVIDER=custom (Ollama, Azure OpenAI, etc.):

OPENAI_API_KEY=your_custom_api_key
OPENAI_BASE_URL=https://your-endpoint.com/v1

Common Custom Endpoints:

  • Ollama: http://localhost:11434/v1
  • Azure OpenAI: https://your-resource.openai.azure.com/
  • Other providers: check the provider's documentation

API Key Security
  • Never commit API keys to version control
  • Use environment variables in production
  • The .env file is already in .gitignore
  • Rotate keys regularly for security
  • Different providers have different key formats

1.3 Model Configuration

1.3.1 Provider-Specific Model Selection

Each provider uses different model naming conventions:

OpenAI Provider Models:

LLM_PROVIDER=openai
DEFAULT_MODEL=gpt-4  # or: gpt-4-turbo, gpt-4o, gpt-3.5-turbo

OpenRouter Provider Models:

LLM_PROVIDER=openrouter
# Note: OpenRouter models are automatically prefixed with 'openrouter/'
DEFAULT_MODEL=anthropic/claude-3.5-sonnet    # Claude models
# DEFAULT_MODEL=openai/gpt-4                 # OpenAI via OpenRouter
# DEFAULT_MODEL=meta-llama/llama-3.1-70b-instruct  # Open source
# DEFAULT_MODEL=z-ai/glm-4.5-air:free       # Free models

Anthropic Provider Models:

LLM_PROVIDER=anthropic
DEFAULT_MODEL=claude-3-5-sonnet-20241022  # Latest Sonnet
# DEFAULT_MODEL=claude-3-5-haiku-20241022  # Fast and cost-effective
# DEFAULT_MODEL=claude-3-opus-20240229     # Highest quality

Custom Provider Models:

LLM_PROVIDER=custom
DEFAULT_MODEL=llama3.1:8b  # Ollama format
# DEFAULT_MODEL=gpt-4       # Azure OpenAI deployment name

1.3.2 Model Recommendations by Provider

OpenAI:

  • Quality: gpt-4 - Best analysis quality
  • Speed: gpt-4o-mini - Fast and cost-effective
  • Cost: gpt-3.5-turbo - Most economical

OpenRouter:

  • Quality: anthropic/claude-3-opus - Highest quality
  • Speed: anthropic/claude-3-haiku - Fast Claude model
  • Cost: z-ai/glm-4.5-air:free - Free model

Anthropic:

  • Quality: claude-3-opus-20240229 - Best Claude model
  • Speed: claude-3-5-haiku-20241022 - Fast and efficient
  • Cost: claude-3-haiku-20240307 - Most economical

Provider Comparison
  • OpenAI: Most mature, reliable API, extensive model selection
  • OpenRouter: Access to 100+ models, competitive pricing, free options
  • Anthropic: Excellent reasoning, safety-focused, direct API access
  • Custom: Use local models (Ollama), private deployments, cost control

1.3.3 Request Parameters

# Maximum tokens per LLM request (default: 2000)
MAX_TOKENS=2000

# LLM temperature for consistency (default: 0.1)
# Range: 0.0 (deterministic) to 1.0 (creative)
TEMPERATURE=0.1

Optimizing Token Usage
  • Lower MAX_TOKENS reduces costs but may truncate analysis
  • Higher values allow more detailed analysis
  • Monitor usage with uv run python -m veritascribe config

1.4 Analysis Feature Configuration

1.4.1 Enable/Disable Analysis Types

# Grammar and linguistic analysis (default: true)
GRAMMAR_ANALYSIS_ENABLED=true

# Content plausibility checking (default: true)
CONTENT_ANALYSIS_ENABLED=true

# Citation format validation (default: true)
CITATION_ANALYSIS_ENABLED=true

Use Cases:

  • Disable expensive analysis types for cost optimization
  • Focus on specific error types during review phases
  • Customize analysis for different document types

1.4.2 Error Severity Thresholds

# Threshold for high severity classification (default: 0.8)
HIGH_SEVERITY_THRESHOLD=0.8

# Threshold for medium severity classification (default: 0.5)
MEDIUM_SEVERITY_THRESHOLD=0.5

Errors are classified as:

  • High: score ≥ HIGH_SEVERITY_THRESHOLD
  • Medium: MEDIUM_SEVERITY_THRESHOLD ≤ score < HIGH_SEVERITY_THRESHOLD
  • Low: score < MEDIUM_SEVERITY_THRESHOLD
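This classification can be sketched as a small function; the name is illustrative, not VeritaScribe's actual API.

```python
def classify_severity(score: float,
                      high_threshold: float = 0.8,
                      medium_threshold: float = 0.5) -> str:
    """Map an error score in [0, 1] to a severity bucket
    using the two configured thresholds."""
    if score >= high_threshold:
        return "high"
    if score >= medium_threshold:
        return "medium"
    return "low"
```

Raising HIGH_SEVERITY_THRESHOLD shifts borderline errors from High to Medium; lowering MEDIUM_SEVERITY_THRESHOLD promotes more errors out of Low.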

1.5 Processing Configuration

1.5.1 Text Block Processing

# Maximum characters per analysis block (default: 2000)
MAX_TEXT_BLOCK_SIZE=2000

# Minimum characters for analysis (default: 50)
MIN_TEXT_BLOCK_SIZE=50

Optimization Guidelines:

Document Size            Recommended MAX_TEXT_BLOCK_SIZE
Small (< 50 pages)       2000-3000
Medium (50-100 pages)    1500-2000
Large (> 100 pages)      1000-1500
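The two size limits can be pictured with a simple chunker. This is an illustrative sketch under assumed behavior (fixed-width splitting, short fragments discarded), not the actual extraction code.

```python
def split_into_blocks(text: str,
                      max_size: int = 2000,
                      min_size: int = 50) -> list[str]:
    """Split text into analysis blocks of at most max_size characters,
    discarding fragments shorter than min_size."""
    blocks = [text[i:i + max_size] for i in range(0, len(text), max_size)]
    return [b for b in blocks if len(b) >= min_size]
```

Smaller blocks mean more requests with less context each; the minimum size keeps trivial fragments (page numbers, stray lines) from being sent for analysis.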

1.5.2 Parallel Processing

# Enable parallel LLM requests (default: true)
PARALLEL_PROCESSING=true

# Maximum concurrent requests (default: 5)
MAX_CONCURRENT_REQUESTS=5

Rate Limiting

Setting MAX_CONCURRENT_REQUESTS too high may trigger API rate limits. Start with 3-5 and increase gradually based on your API tier.
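The concurrency cap behaves like a semaphore around each request, as in this sketch; `analyze_block` is a stand-in for the real LLM call, and the names are assumptions.

```python
import asyncio

async def analyze_block(block: str) -> str:
    # Stand-in for a real LLM request.
    await asyncio.sleep(0)
    return f"analyzed: {block}"

async def analyze_all(blocks, max_concurrent=5):
    """Run analyses in parallel, never more than max_concurrent at once."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded(block):
        async with semaphore:
            return await analyze_block(block)

    return await asyncio.gather(*(bounded(b) for b in blocks))
```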

1.5.3 Retry Configuration

# Maximum retry attempts for failed requests (default: 3)
MAX_RETRIES=3

# Delay between retries in seconds (default: 1.0)
RETRY_DELAY=1.0
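Together these two settings describe a simple retry loop, sketched below. The function name and the blanket exception handling are illustrative assumptions, not VeritaScribe's actual retry implementation.

```python
import time

def with_retries(request, max_retries=3, retry_delay=1.0):
    """Call request(), retrying up to max_retries additional times on
    failure, sleeping retry_delay seconds between attempts."""
    last_error = None
    for attempt in range(max_retries + 1):
        try:
            return request()
        except Exception as exc:
            last_error = exc
            if attempt < max_retries:
                time.sleep(retry_delay)
    raise last_error
```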

1.6 Output Configuration

1.6.1 File Output Settings

# Default output directory (default: ./analysis_output)
OUTPUT_DIRECTORY=./analysis_output

# Generate error visualization charts (default: true)
GENERATE_VISUALIZATIONS=true

# Save detailed text reports (default: true)
SAVE_DETAILED_REPORTS=true

1.7 Environment-Specific Configuration

1.7.1 Development Environment

Create a .env.dev file for development settings:

# Development-specific settings
DEFAULT_MODEL=gpt-3.5-turbo
MAX_TOKENS=1500
PARALLEL_PROCESSING=false
MAX_CONCURRENT_REQUESTS=2
GENERATE_VISUALIZATIONS=true

1.7.2 Production Environment

Set environment variables directly:

export OPENAI_API_KEY="your-production-key"
export DEFAULT_MODEL="gpt-4"
export MAX_TOKENS=2000
export PARALLEL_PROCESSING=true
export MAX_CONCURRENT_REQUESTS=10
export OUTPUT_DIRECTORY="/app/analysis_output"

1.7.3 Docker Configuration

Example docker-compose.yml:

version: '3.8'
services:
  veritascribe:
    build: .
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - DEFAULT_MODEL=gpt-4-turbo
      - MAX_TOKENS=2000
      - PARALLEL_PROCESSING=true
      - MAX_CONCURRENT_REQUESTS=5
    volumes:
      - ./analysis_output:/app/analysis_output

1.8 Configuration Validation

1.8.1 View Current Configuration

Check your current settings:

uv run python -m veritascribe config

1.8.2 View Available Providers

See all supported providers and their models:

uv run python -m veritascribe providers

1.8.3 Test Configuration

Validate your configuration with system tests:

uv run python -m veritascribe test

1.9 Advanced Configuration

1.9.1 Multi-Provider Setup Examples

Example 1: Standard OpenAI

LLM_PROVIDER=openai
OPENAI_API_KEY=sk-your-key-here
DEFAULT_MODEL=gpt-4

Example 2: OpenRouter with Claude

LLM_PROVIDER=openrouter
OPENROUTER_API_KEY=sk-or-your-key-here
DEFAULT_MODEL=anthropic/claude-3.5-sonnet

Example 3: Direct Anthropic Claude

LLM_PROVIDER=anthropic
ANTHROPIC_API_KEY=sk-ant-your-key-here
DEFAULT_MODEL=claude-3-5-sonnet-20241022

Example 4: Local Ollama

LLM_PROVIDER=custom
OPENAI_API_KEY=ollama  # Can be any value for local models
OPENAI_BASE_URL=http://localhost:11434/v1
DEFAULT_MODEL=llama3.1:8b

Example 5: Azure OpenAI

LLM_PROVIDER=custom
OPENAI_API_KEY=your-azure-key
OPENAI_BASE_URL=https://your-resource.openai.azure.com/
DEFAULT_MODEL=gpt-4  # Your Azure deployment name

For more advanced configuration scenarios, see the Architecture Guide.