1 Configuration Reference
VeritaScribe uses environment variables for configuration, providing flexibility and security for different deployment scenarios.
1.1 Configuration Overview
Configuration is managed through:
- Environment variables (highest priority)
- .env file (for local development)
- Default values (built into the application)
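The precedence order above can be sketched as a small lookup helper. This is an illustrative sketch, not VeritaScribe's actual loader; the `resolve_setting` function and the `dotenv_values` dictionary are hypothetical stand-ins for whatever .env parser the application uses.

```python
import os

# Hypothetical sketch of the precedence order: a process environment
# variable wins over a value loaded from .env, which wins over the default.
def resolve_setting(name: str, dotenv_values: dict, default: str) -> str:
    if name in os.environ:          # 1. environment variable (highest priority)
        return os.environ[name]
    if name in dotenv_values:       # 2. .env file
        return dotenv_values[name]
    return default                  # 3. built-in default

os.environ.pop("DEFAULT_MODEL", None)      # ensure a clean environment for the demo
dotenv = {"DEFAULT_MODEL": "gpt-4o-mini"}  # pretend this came from .env
print(resolve_setting("DEFAULT_MODEL", dotenv, "gpt-4"))  # → gpt-4o-mini
```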
1.2 LLM Provider Configuration
VeritaScribe supports multiple LLM providers for flexibility, cost optimization, and access to different models.
1.2.1 Provider Selection
# Choose your LLM provider (default: openai)
LLM_PROVIDER=openai # or: openrouter, anthropic, custom
Available Providers:
- openai: Direct OpenAI API access
- openrouter: Access to 100+ models through OpenRouter
- anthropic: Direct Anthropic Claude API access
- custom: OpenAI-compatible endpoints (Ollama, Azure, etc.)
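Each provider requires different credentials, as the sections below describe. A minimal validation sketch of that mapping (the variable names mirror this page; the `missing_vars` helper itself is hypothetical, not part of VeritaScribe):

```python
# Hypothetical sketch: map each LLM_PROVIDER value to the environment
# variables it requires, matching the configuration sections below.
REQUIRED_VARS = {
    "openai": ["OPENAI_API_KEY"],
    "openrouter": ["OPENROUTER_API_KEY"],
    "anthropic": ["ANTHROPIC_API_KEY"],
    "custom": ["OPENAI_API_KEY", "OPENAI_BASE_URL"],
}

def missing_vars(provider: str, env: dict) -> list:
    """Return the required variables that are absent or empty for a provider."""
    if provider not in REQUIRED_VARS:
        raise ValueError(f"unknown LLM_PROVIDER: {provider}")
    return [v for v in REQUIRED_VARS[provider] if not env.get(v)]

print(missing_vars("custom", {"OPENAI_API_KEY": "x"}))  # → ['OPENAI_BASE_URL']
```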
1.2.2 OpenAI Configuration
For LLM_PROVIDER=openai:
OPENAI_API_KEY=your_openai_api_key_here
Getting an OpenAI API Key:
1. Visit OpenAI Platform
2. Sign in or create an account
3. Click "Create new secret key"
4. Copy the key and add it to your .env file
1.2.3 OpenRouter Configuration
For LLM_PROVIDER=openrouter (access to 100+ models):
OPENROUTER_API_KEY=sk-or-your_openrouter_api_key_here
Getting an OpenRouter API Key:
1. Visit OpenRouter
2. Sign up and verify your account
3. Create a new API key
4. Add credits or set up billing
1.2.4 Anthropic Configuration
For LLM_PROVIDER=anthropic (direct Claude access):
ANTHROPIC_API_KEY=sk-ant-your_anthropic_api_key_here
Getting an Anthropic API Key:
1. Visit Anthropic Console
2. Sign up and verify your account
3. Navigate to API Keys section
4. Create a new key and add credits
1.2.5 Custom Provider Configuration
For LLM_PROVIDER=custom (Ollama, Azure OpenAI, etc.):
OPENAI_API_KEY=your_custom_api_key
OPENAI_BASE_URL=https://your-endpoint.com/v1
Common Custom Endpoints:
- Ollama: http://localhost:11434/v1
- Azure OpenAI: https://your-resource.openai.azure.com/
- Other providers: Check provider documentation
API Key Security:
- Never commit API keys to version control
- Use environment variables in production
- The .env file is already in .gitignore
- Rotate keys regularly for security
- Different providers have different key formats
1.3 Model Configuration
1.3.1 Provider-Specific Model Selection
Each provider uses different model naming conventions:
OpenAI Provider Models:
LLM_PROVIDER=openai
DEFAULT_MODEL=gpt-4 # or: gpt-4-turbo, gpt-4o, gpt-3.5-turbo
OpenRouter Provider Models:
LLM_PROVIDER=openrouter
# Note: OpenRouter models are automatically prefixed with 'openrouter/'
DEFAULT_MODEL=anthropic/claude-3.5-sonnet # Claude models
# DEFAULT_MODEL=openai/gpt-4 # OpenAI via OpenRouter
# DEFAULT_MODEL=meta-llama/llama-3.1-70b-instruct # Open source
# DEFAULT_MODEL=z-ai/glm-4.5-air:free # Free models
Anthropic Provider Models:
LLM_PROVIDER=anthropic
DEFAULT_MODEL=claude-3-5-sonnet-20241022 # Latest Sonnet
# DEFAULT_MODEL=claude-3-5-haiku-20241022 # Fast and cost-effective
# DEFAULT_MODEL=claude-3-opus-20240229 # Highest quality
Custom Provider Models:
LLM_PROVIDER=custom
DEFAULT_MODEL=llama3.1:8b # Ollama format
# DEFAULT_MODEL=gpt-4 # Azure OpenAI deployment name
1.3.2 Model Recommendations by Provider
OpenAI:
- Quality: gpt-4 - Best analysis quality
- Speed: gpt-4o-mini - Fast and cost-effective
- Cost: gpt-3.5-turbo - Most economical

OpenRouter:
- Quality: anthropic/claude-3-opus - Highest quality
- Speed: anthropic/claude-3-haiku - Fast Claude model
- Cost: z-ai/glm-4.5-air:free - Free model

Anthropic:
- Quality: claude-3-opus-20240229 - Best Claude model
- Speed: claude-3-5-haiku-20241022 - Fast and efficient
- Cost: claude-3-haiku-20240307 - Most economical
Choosing a Provider:
- OpenAI: Most mature, reliable API, extensive model selection
- OpenRouter: Access to 100+ models, competitive pricing, free options
- Anthropic: Excellent reasoning, safety-focused, direct API access
- Custom: Use local models (Ollama), private deployments, cost control
1.3.3 Request Parameters
# Maximum tokens per LLM request (default: 2000)
MAX_TOKENS=2000
# LLM temperature for consistency (default: 0.1)
# Range: 0.0 (deterministic) to 1.0 (creative)
TEMPERATURE=0.1
- Lower MAX_TOKENS reduces costs but may truncate analysis
- Higher values allow more detailed analysis
- Monitor usage with uv run python -m veritascribe config
1.4 Analysis Feature Configuration
1.4.1 Enable/Disable Analysis Types
# Grammar and linguistic analysis (default: true)
GRAMMAR_ANALYSIS_ENABLED=true
# Content plausibility checking (default: true)
CONTENT_ANALYSIS_ENABLED=true
# Citation format validation (default: true)
CITATION_ANALYSIS_ENABLED=true
Use Cases:
- Disable expensive analysis types for cost optimization
- Focus on specific error types during review phases
- Customize analysis for different document types
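Because these flags arrive as strings like "true" or "false", they must be interpreted rather than compared directly. A sketch of one common way to do this (the `env_flag` helper is illustrative, not VeritaScribe's actual parser):

```python
# Hypothetical sketch: interpret the true/false strings used by flags
# such as GRAMMAR_ANALYSIS_ENABLED. Unset values fall back to the default.
def env_flag(value, default: bool = True) -> bool:
    if value is None:
        return default
    return value.strip().lower() in ("1", "true", "yes", "on")

print(env_flag("true"), env_flag("FALSE"), env_flag(None))  # → True False True
```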
1.4.2 Error Severity Thresholds
# Threshold for high severity classification (default: 0.8)
HIGH_SEVERITY_THRESHOLD=0.8
# Threshold for medium severity classification (default: 0.5)
MEDIUM_SEVERITY_THRESHOLD=0.5
Errors are classified as:
- High: score ≥ HIGH_SEVERITY_THRESHOLD
- Medium: MEDIUM_SEVERITY_THRESHOLD ≤ score < HIGH_SEVERITY_THRESHOLD
- Low: score < MEDIUM_SEVERITY_THRESHOLD
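The classification rule above amounts to a two-threshold bucketing; a minimal sketch with the default thresholds (the `classify_severity` function is illustrative, not VeritaScribe's internal name):

```python
# Sketch of the severity buckets, using the defaults
# HIGH_SEVERITY_THRESHOLD=0.8 and MEDIUM_SEVERITY_THRESHOLD=0.5.
def classify_severity(score: float, high: float = 0.8, medium: float = 0.5) -> str:
    """Map an error score in [0, 1] to a severity bucket."""
    if score >= high:
        return "high"
    if score >= medium:
        return "medium"
    return "low"

print(classify_severity(0.9), classify_severity(0.6), classify_severity(0.2))  # → high medium low
```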
1.5 Processing Configuration
1.5.1 Text Block Processing
# Maximum characters per analysis block (default: 2000)
MAX_TEXT_BLOCK_SIZE=2000
# Minimum characters for analysis (default: 50)
MIN_TEXT_BLOCK_SIZE=50
Optimization Guidelines:
| Document Size | Recommended MAX_TEXT_BLOCK_SIZE |
|---|---|
| Small (< 50 pages) | 2000-3000 |
| Medium (50-100 pages) | 1500-2000 |
| Large (> 100 pages) | 1000-1500 |
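The two size limits interact as a simple chunking rule: text is cut into blocks of at most MAX_TEXT_BLOCK_SIZE characters, and fragments below MIN_TEXT_BLOCK_SIZE are skipped. A sketch of that behavior (VeritaScribe's actual splitter may respect sentence or paragraph boundaries instead of cutting at fixed offsets):

```python
# Illustrative sketch of the block-size limits: greedy fixed-width chunking,
# dropping trailing fragments shorter than min_size.
def split_into_blocks(text: str, max_size: int = 2000, min_size: int = 50) -> list:
    blocks = [text[i:i + max_size] for i in range(0, len(text), max_size)]
    return [b for b in blocks if len(b) >= min_size]

print([len(b) for b in split_into_blocks("x" * 4500)])  # → [2000, 2000, 500]
```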
1.5.2 Parallel Processing
# Enable parallel LLM requests (default: true)
PARALLEL_PROCESSING=true
# Maximum concurrent requests (default: 5)
MAX_CONCURRENT_REQUESTS=5
Setting MAX_CONCURRENT_REQUESTS too high may trigger API rate limits. Start with 3-5 and increase gradually based on your API tier.
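Bounding concurrency like this is typically done with a semaphore. A minimal sketch of the pattern, assuming asyncio; `fetch` is a placeholder for a real LLM request, and none of these names are VeritaScribe's own:

```python
import asyncio

# Sketch: cap the number of in-flight requests with a semaphore,
# as MAX_CONCURRENT_REQUESTS does.
async def run_all(coros, max_concurrent: int = 5):
    sem = asyncio.Semaphore(max_concurrent)

    async def bounded(coro):
        async with sem:              # at most max_concurrent run at once
            return await coro

    return await asyncio.gather(*(bounded(c) for c in coros))

async def fetch(i):
    await asyncio.sleep(0.01)        # placeholder for an LLM API call
    return i

results = asyncio.run(run_all([fetch(i) for i in range(10)], max_concurrent=3))
print(results)
```

asyncio.gather preserves input order, so results line up with the submitted requests even though only three run concurrently.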
1.5.3 Retry Configuration
# Maximum retry attempts for failed requests (default: 3)
MAX_RETRIES=3
# Delay between retries in seconds (default: 1.0)
RETRY_DELAY=1.0
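These two settings describe a fixed-delay retry loop. A sketch of that semantics (illustrative only; the real implementation may differ, e.g. use exponential backoff):

```python
import time

# Sketch of MAX_RETRIES / RETRY_DELAY semantics: retry a failing call
# up to max_retries extra times, sleeping retry_delay seconds between attempts.
def with_retries(func, max_retries: int = 3, retry_delay: float = 1.0):
    last_err = None
    for attempt in range(max_retries + 1):   # 1 initial try + max_retries retries
        try:
            return func()
        except Exception as err:
            last_err = err
            if attempt < max_retries:
                time.sleep(retry_delay)
    raise last_err
```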
1.6 Output Configuration
1.6.1 File Output Settings
# Default output directory (default: ./analysis_output)
OUTPUT_DIRECTORY=./analysis_output
# Generate error visualization charts (default: true)
GENERATE_VISUALIZATIONS=true
# Save detailed text reports (default: true)
SAVE_DETAILED_REPORTS=true
1.7 Environment-Specific Configuration
1.7.1 Development Environment
Create a .env.dev file for development settings:
# Development-specific settings
DEFAULT_MODEL=gpt-3.5-turbo
MAX_TOKENS=1500
PARALLEL_PROCESSING=false
MAX_CONCURRENT_REQUESTS=2
GENERATE_VISUALIZATIONS=true
1.7.2 Production Environment
Set environment variables directly:
export OPENAI_API_KEY="your-production-key"
export DEFAULT_MODEL="gpt-4"
export MAX_TOKENS=2000
export PARALLEL_PROCESSING=true
export MAX_CONCURRENT_REQUESTS=10
export OUTPUT_DIRECTORY="/app/analysis_output"
1.7.3 Docker Configuration
Example docker-compose.yml:
version: '3.8'
services:
  veritascribe:
    build: .
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - DEFAULT_MODEL=gpt-4-turbo
      - MAX_TOKENS=2000
      - PARALLEL_PROCESSING=true
      - MAX_CONCURRENT_REQUESTS=5
    volumes:
      - ./analysis_output:/app/analysis_output
1.8 Configuration Validation
1.8.1 View Current Configuration
Check your current settings:
uv run python -m veritascribe config
1.8.2 View Available Providers
See all supported providers and their models:
uv run python -m veritascribe providers
1.8.3 Test Configuration
Validate your configuration with system tests:
uv run python -m veritascribe test
1.9 Advanced Configuration
1.9.1 Multi-Provider Setup Examples
Example 1: Standard OpenAI
LLM_PROVIDER=openai
OPENAI_API_KEY=sk-your-key-here
DEFAULT_MODEL=gpt-4
Example 2: OpenRouter with Claude
LLM_PROVIDER=openrouter
OPENROUTER_API_KEY=sk-or-your-key-here
DEFAULT_MODEL=anthropic/claude-3.5-sonnet
Example 3: Direct Anthropic Claude
LLM_PROVIDER=anthropic
ANTHROPIC_API_KEY=sk-ant-your-key-here
DEFAULT_MODEL=claude-3-5-sonnet-20241022
Example 4: Local Ollama
LLM_PROVIDER=custom
OPENAI_API_KEY=ollama # Can be any value for local models
OPENAI_BASE_URL=http://localhost:11434/v1
DEFAULT_MODEL=llama3.1:8b
Example 5: Azure OpenAI
LLM_PROVIDER=custom
OPENAI_API_KEY=your-azure-key
OPENAI_BASE_URL=https://your-resource.openai.azure.com/
DEFAULT_MODEL=gpt-4 # Your Azure deployment name
For more advanced configuration scenarios, see the Architecture Guide.