Config Imports¶
The config imports feature allows you to split your Duckalog configuration across multiple files, making it easier to organize, maintain, and reuse configuration components.
Why Use Config Imports?¶
Config imports are useful when you want to:
- Split configuration by domain - Separate views, settings, and secrets into different files
- Share common configuration - Reuse the same settings across multiple projects
- Organize by team or module - Different teams can own their own config files
- Environment-specific configs - Easily switch between dev/staging/prod configurations
- Version control friendly - Smaller files result in clearer diffs
Basic Usage¶
Simple Import List¶
Create a main catalog file that imports other config files using a simple list:
# settings.yaml
version: 1
duckdb:
  database: imported.duckdb
  install_extensions:
    - httpfs
  pragmas:
    - "SET threads = 4"
views:
  - name: imported_view
    sql: "SELECT 1"
When you load catalog.yaml, Duckalog will:
1. Load the main config
2. Import settings.yaml and views.yaml
3. Merge them together (views are concatenated, scalar values are overridden)
4. Return a single, merged configuration
Advanced Import Options¶
Duckalog supports advanced import features through SelectiveImports and ImportEntry objects.
Section-Specific Imports¶
Import different files for different configuration sections:
# catalog.yaml
version: 1
imports:
  duckdb:
    - ./database-settings.yaml
  views:
    - ./user-views.yaml
    - ./product-views.yaml
  attachments:
    - ./external-databases.yaml
duckdb:
  database: main.duckdb  # This can be overridden by database-settings.yaml
Import Override Control¶
Control whether imports can override existing values:
# catalog.yaml
version: 1
imports:
  - path: ./base-settings.yaml
    override: true   # Can override existing values (default)
  - path: ./optional-settings.yaml
    override: false  # Only fills missing fields, won't override
duckdb:
  database: main.duckdb
Combined Section-Specific with Override Control¶
# catalog.yaml
version: 1
imports:
  duckdb:
    - path: ./database.yaml
      override: true
    - path: ./optional-db-settings.yaml
      override: false
  views:
    - path: ./base-views.yaml
      override: true
  semantic_models:
    - path: ./shared-models.yaml
      override: false  # Won't override existing models
duckdb:
  database: catalog.duckdb
Import Entry Format¶
Each import can be specified as either:
- Simple string: "./settings.yaml" (uses default override=true)
- ImportEntry object: {path: "./settings.yaml", override: false}
# All equivalent:
imports:
  - ./settings.yaml        # Simple string
  - path: ./settings.yaml  # ImportEntry with defaults
  - path: ./settings.yaml
    override: true         # Explicit override
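The equivalence of the two entry forms can be illustrated with a small Python sketch. The `ImportEntry` dataclass and the `normalize_entry` helper below are hypothetical stand-ins written for this example, not Duckalog's actual classes; the only behavior taken from the text above is that a bare string means `override=true`.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class ImportEntry:
    """Hypothetical stand-in: an import path plus its override flag."""
    path: str
    override: bool = True  # documented default


def normalize_entry(raw: Union[str, dict]) -> ImportEntry:
    """Accept either a plain string or a {path, override} mapping."""
    if isinstance(raw, str):
        return ImportEntry(path=raw)
    if isinstance(raw, dict) and "path" in raw:
        return ImportEntry(path=raw["path"], override=raw.get("override", True))
    raise ValueError(f"Unsupported import entry: {raw!r}")


# The three spellings from the YAML example, as they would look once parsed.
entries = [
    "./settings.yaml",
    {"path": "./settings.yaml"},
    {"path": "./settings.yaml", "override": True},
]
normalized = [normalize_entry(e) for e in entries]
print(normalized)
```

Because the dataclass is frozen, the three normalized entries compare equal, which is one way to check that the forms really are interchangeable.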
Import Resolution Algorithm¶
Duckalog follows a precise algorithm when resolving imports:
1. Path Resolution¶
- Remote URIs (s3://, gs://, https://, etc.): Used as-is
- Absolute paths: Used as-is without modification
- Relative paths: Resolved relative to the importing file's directory
- Environment variables: Expanded before resolution (${env:VAR})
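The path rules above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not Duckalog's resolver: the function name `resolve_import_path`, the regular expressions, and the choice to raise `KeyError` for an unset variable are all invented for this example.

```python
import os
import re
from pathlib import Path

# Assumption: anything of the form "scheme://..." counts as a remote URI.
REMOTE_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*://")
# Matches the ${env:VAR} placeholder syntax shown in this document.
ENV_PATTERN = re.compile(r"\$\{env:([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_import_path(raw: str, importing_file: str, env=None) -> str:
    """Expand ${env:VAR}, then apply the remote/absolute/relative rules."""
    env = os.environ if env is None else env

    def expand(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"Environment variable {name!r} is not set")
        return env[name]

    expanded = ENV_PATTERN.sub(expand, raw)
    if REMOTE_SCHEME.match(expanded):
        return expanded  # remote URIs are used as-is
    path = Path(expanded)
    if path.is_absolute():
        return str(path)  # absolute paths are used as-is
    # Relative paths are anchored at the importing file's directory.
    return str(Path(importing_file).parent / path)


print(resolve_import_path("./views.yaml", "/cfg/catalog.yaml"))
print(resolve_import_path("s3://bucket/base.yaml", "/cfg/catalog.yaml"))
```

Expanding variables first matters: a placeholder may itself expand to an absolute path or a remote URI, so the classification has to run on the expanded string.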
2. Import Processing Order¶
- Load main configuration file
- Process imports in the order they appear
- For SelectiveImports: process each section separately
- Apply override behavior based on the ImportEntry.override setting
- Validate final merged configuration
3. Caching¶
- Imported files are cached to avoid duplicate loading
- Circular imports are detected during processing
- Remote imports use authentication from environment or filesystem context
Merge Behavior¶
Config imports use a deep merge strategy with override control:
Scalar Values¶
Later imports override earlier ones:
Result:
Dictionaries¶
Dictionaries are merged recursively:
Result:
Lists with Override Control¶
List behavior depends on the override setting:
Default Behavior (override=true)¶
Lists are concatenated (items from all imports are included):
Result:
Override Control (override=false)¶
When override=false, lists are only merged if the target list is empty:
# main.yaml
version: 1
imports:
  - path: ./additional-views.yaml
    override: false  # Won't override existing views
views:
  - name: main_view
    sql: "SELECT 1"
Result: the views from additional-views.yaml are ignored because main.yaml already has views.
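The merge rules in this section (scalars, dictionaries, and lists, each with and without override) can be sketched as one recursive Python function. This is an illustrative sketch, not Duckalog's implementation: `merge_config` is a hypothetical helper, and it only models merging one incoming config into a target, not how Duckalog orders the main file relative to its imports.

```python
from copy import deepcopy


def merge_config(target: dict, incoming: dict, override: bool = True) -> dict:
    """Merge `incoming` into a copy of `target` following the rules above."""
    result = deepcopy(target)
    for key, value in incoming.items():
        if key not in result:
            # Missing fields are always filled, with or without override.
            result[key] = deepcopy(value)
        elif isinstance(result[key], dict) and isinstance(value, dict):
            # Dictionaries are merged recursively.
            result[key] = merge_config(result[key], value, override)
        elif isinstance(result[key], list) and isinstance(value, list):
            # override=True concatenates; override=False only fills an empty list.
            if override or not result[key]:
                result[key] = result[key] + deepcopy(value)
        elif override:
            # Scalars: the incoming value wins only when override is enabled.
            result[key] = deepcopy(value)
    return result


base = {
    "duckdb": {"database": "base.duckdb", "pragmas": ["SET threads = 4"]},
    "views": [{"name": "main_view", "sql": "SELECT 1"}],
}
incoming = {
    "duckdb": {"database": "imported.duckdb", "install_extensions": ["httpfs"]},
    "views": [{"name": "imported_view", "sql": "SELECT 1"}],
}

merged = merge_config(base, incoming, override=True)
filled = merge_config(base, incoming, override=False)
print(merged["duckdb"]["database"], len(merged["views"]))
print(filled["duckdb"]["database"], len(filled["views"]))
```

With `override=True` the imported database name replaces the base one and both views are kept. With `override=False` the base database name and view list survive, and only the missing `install_extensions` key is filled in.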
Environment Variables¶
You can use environment variables in import paths:
# catalog.yaml
version: 1
imports:
  - ${env:CONFIG_DIR}/settings.yaml  # Uses CONFIG_DIR environment variable
duckdb:
  database: main.duckdb
Set the environment variable:
Import Order and Precedence¶
Imports are processed in the order they appear, and later imports override earlier ones. The main config file always has the final say:
# catalog.yaml
imports:
  - ./base.yaml
  - ./production.yaml  # This can override settings from base.yaml
# Settings here override both imported files
duckdb:
  database: catalog.duckdb
Uniqueness Validation¶
After merging, Duckalog validates that certain items are unique:
- Views: Must have unique (schema, name) tuples
- Semantic Models: Must have unique names
- Iceberg Catalogs: Must have unique names
- Attachments: Must have unique aliases
If duplicates are found, validation fails with an error.
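The uniqueness checks listed above can be sketched in Python. This is an assumption-laden illustration rather than Duckalog's validator: the helper names are invented, each section is assumed to be a flat list of mappings, and the `schema`, `name`, and `alias` keys are assumed field names.

```python
from collections import Counter


def find_duplicates(keys):
    """Return the keys that occur more than once, in first-seen order."""
    counts = Counter(keys)
    return [key for key, count in counts.items() if count > 1]


def validate_unique(config: dict) -> None:
    """Raise ValueError if any section of the merged config has duplicate keys."""
    checks = {
        # Views are identified by the (schema, name) pair; schema may be absent.
        "views": [(v.get("schema"), v["name"]) for v in config.get("views", [])],
        "semantic_models": [m["name"] for m in config.get("semantic_models", [])],
        "iceberg_catalogs": [c["name"] for c in config.get("iceberg_catalogs", [])],
        "attachments": [a["alias"] for a in config.get("attachments", [])],
    }
    problems = {}
    for section, keys in checks.items():
        duplicates = find_duplicates(keys)
        if duplicates:
            problems[section] = duplicates
    if problems:
        raise ValueError(f"Duplicate entries after merge: {problems}")


# Same name in different schemas is fine; the same (schema, name) twice is not.
ok_config = {"views": [{"name": "users"}, {"schema": "analytics", "name": "users"}]}
validate_unique(ok_config)

bad_config = {"views": [{"name": "users"}, {"name": "users"}]}
try:
    validate_unique(bad_config)
except ValueError as exc:
    print(exc)
```

Running the check on the merged result, rather than per file, is what lets it catch two different imported files defining the same view.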
Circular Import Detection¶
Duckalog automatically detects and prevents circular imports:
# file_a.yaml
imports:
  - ./file_b.yaml

# file_b.yaml
imports:
  - ./file_a.yaml  # This creates a circular reference!
Loading either file fails with a circular import error.
Remote Imports¶
You can import configuration files from remote storage systems such as S3, GCS, Azure Blob Storage, or HTTPS endpoints. This allows you to share common configuration across multiple projects or load configuration from centralized locations.
Supported URI Schemes¶
Duckalog supports the following remote URI schemes:
- S3: s3://bucket/path/config.yaml
- Google Cloud Storage: gs://bucket/path/config.yaml or gcs://bucket/path/config.yaml
- Azure Blob Storage: abfs://account@container/path/config.yaml
- SFTP: sftp://user@host/path/config.yaml
- HTTPS: https://example.com/config.yaml
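As a rough illustration of how an import string could be mapped to one of the schemes above, the following Python sketch uses the standard library's `urlparse`. The `SCHEME_TO_BACKEND` table and `classify_import` function are hypothetical names for this example and say nothing about how Duckalog dispatches internally.

```python
from urllib.parse import urlparse

# Scheme names are taken from the list above; the labels are only for display.
SCHEME_TO_BACKEND = {
    "s3": "S3",
    "gs": "Google Cloud Storage",
    "gcs": "Google Cloud Storage",
    "abfs": "Azure Blob Storage",
    "sftp": "SFTP",
    "https": "HTTPS",
}


def classify_import(uri: str) -> str:
    """Return the storage backend for a remote URI, or 'local file' otherwise."""
    scheme = urlparse(uri).scheme.lower()
    return SCHEME_TO_BACKEND.get(scheme, "local file")


for candidate in [
    "s3://bucket/path/config.yaml",
    "gcs://bucket/path/config.yaml",
    "abfs://account@container/path/config.yaml",
    "./local-config.yaml",
]:
    print(candidate, "->", classify_import(candidate))
```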
Example: Import from S3¶
# catalog.yaml
version: 1
imports:
  - s3://my-bucket/shared/base-config.yaml
  - ./local-config.yaml
duckdb:
  database: my-project.duckdb
Example: Import from HTTPS¶
# catalog.yaml
version: 1
imports:
  - https://raw.githubusercontent.com/company/config-templates/main/base.yaml
duckdb:
  database: main.duckdb
Authentication¶
Remote imports use the same authentication mechanisms as regular remote config loading:
- S3: AWS credentials via environment variables, ~/.aws/credentials, or IAM role
- GCS: Google Cloud credentials via GOOGLE_APPLICATION_CREDENTIALS or ADC
- Azure: Azure credentials via environment variables or managed identity
- SFTP: SSH credentials via SSH config or environment variables
- HTTPS: No authentication required for public URLs
For S3, you can use environment variables:
Environment Variables in Remote Paths¶
You can use environment variables in remote import paths:
# catalog.yaml
version: 1
imports:
  - https://${env:COMPANY}.s3.amazonaws.com/configs/base.yaml
duckdb:
  database: main.duckdb
Error Handling¶
Remote import failures include clear error messages with the URI and operation:
Security Notes¶
- Remote imports follow the same security rules as remote config loading
- Credentials should be provided via environment variables or secure credential stores
- For HTTPS URLs, ensure you're using trusted endpoints
- Consider using signed URLs for temporary access to private resources
Best Practices¶
1. Organize by Domain¶
config/
├── catalog.yaml          # Main file with imports
├── settings.yaml         # Database settings, extensions
├── views/
│   ├── users.yaml
│   ├── products.yaml
│   └── orders.yaml
└── environments/
    ├── dev.yaml
    └── prod.yaml
2. Use Empty Lists for Optional Sections¶
If an imported file contains only settings and no views, add an empty views list:
# settings-only.yaml
version: 1
duckdb:
  database: db.duckdb
views: []  # Explicitly empty to avoid required field errors
3. Document Your Imports¶
Add comments explaining why files are imported:
# catalog.yaml
imports:
  - ./settings.yaml        # Base database configuration
  - ./views/users.yaml     # User-related views
  - ./views/products.yaml  # Product catalog views
4. Version Control¶
- Keep import files in version control
- Use consistent naming conventions
- Add .gitignore for generated databases (.duckdb files)
Common Patterns¶
Environment-Specific Configuration¶
# catalog.yaml
version: 1
imports:
  - ./base.yaml
  - ./environments/${env:ENVIRONMENT}.yaml  # dev.yaml, prod.yaml, etc.
Layered Configuration with Override Control¶
# catalog.yaml
version: 1
imports:
  # Base layer - always applied
  - path: ./base.yaml
    override: true
  # Environment-specific - can override base
  - path: ./environments/${env:ENVIRONMENT}.yaml
    override: true
  # Local overrides - only fills gaps, doesn't override
  - path: ./local-overrides.yaml
    override: false
duckdb:
  database: catalog.duckdb
Section-Specific Team Ownership¶
# catalog.yaml
version: 1
imports:
  # Infrastructure team owns database settings
  duckdb:
    - path: ./infrastructure/database.yaml
      override: true
  # Data team owns views
  views:
    - path: ./analytics/core-views.yaml
      override: true
    - path: ./analytics/dimension-views.yaml
      override: true
  # BI team owns semantic models
  semantic_models:
    - path: ./bi/shared-models.yaml
      override: false  # Won't override local models
duckdb:
  database: analytics.duckdb
Modular View Organization¶
# catalog.yaml
version: 1
imports:
  views:
    - ./domains/users/views.yaml      # User domain views
    - ./domains/products/views.yaml   # Product domain views
    - ./domains/orders/views.yaml     # Order domain views
    - ./domains/analytics/views.yaml  # Analytics views
duckdb:
  database: multi_domain.duckdb
Safe Configuration Updates¶
# catalog.yaml
version: 1
imports:
  # Production config - can be overridden
  - path: ./production.yaml
    override: true
  # Emergency fixes - only fills gaps, safe to apply
  - path: ./emergency-fixes.yaml
    override: false
  # Local development - never overrides production
  - path: ./local-dev.yaml
    override: false
duckdb:
  database: catalog.duckdb
Shared Base Configuration¶
Then import this in all your projects:
Team Ownership¶
config/
├── catalog.yaml
├── infrastructure/       # Ops team owns this
│   └── database.yaml
├── analytics/            # Data team owns this
│   └── views.yaml
└── reporting/            # BI team owns this
    └── reports.yaml
Migration from Single File¶
To migrate an existing single-file configuration:
1. Create a new main file
2. Rename your existing config
3. Test that it works
4. Gradually split migrated-config.yaml into smaller files
Troubleshooting¶
Import Resolution Details¶
Path Resolution Rules¶
- Remote URIs: Used as-is without modification
- Absolute Paths: Used as-is
- Relative Paths: Resolved relative to importing file
- Environment Variables: Expanded before resolution
Import Processing Order¶
Imports are processed in this order:
1. Main configuration file loaded
2. imports section processed top-to-bottom
3. For SelectiveImports: each section processed in order (duckdb → views → attachments → iceberg_catalogs → semantic_models)
4. Within each section: imports processed in order listed
5. Override behavior applied based on ImportEntry.override
6. Final validation performed
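The ordering in steps 2 to 4 can be sketched as a small Python generator. The name `plan_imports` and the tuple shape it yields are assumptions made for this example; only the section order and the top-to-bottom rule come from the list above.

```python
# Section order as listed in step 3 above.
SECTION_ORDER = ["duckdb", "views", "attachments", "iceberg_catalogs", "semantic_models"]


def plan_imports(imports):
    """Yield (section, path, override) tuples in processing order.

    `section` is None for a plain import list and a section name for
    section-specific (SelectiveImports-style) mappings.
    """
    def normalize(entry):
        if isinstance(entry, str):
            return entry, True  # bare strings default to override=True
        return entry["path"], entry.get("override", True)

    if isinstance(imports, list):
        for entry in imports:
            path, override = normalize(entry)
            yield None, path, override
    else:
        # Sections follow the fixed order, regardless of how the mapping was written.
        for section in SECTION_ORDER:
            for entry in imports.get(section, []):
                path, override = normalize(entry)
                yield section, path, override


selective = {
    "views": ["./user-views.yaml", "./product-views.yaml"],
    "duckdb": [{"path": "./database.yaml", "override": False}],
}
for step in plan_imports(selective):
    print(step)
```

Even though `views` is written first in the mapping, the `duckdb` entry is planned first, and the two view files keep their listed order.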
Caching and Performance¶
- File Caching: Each unique file path loaded only once
- Remote Caching: Remote files cached per import operation
- Circular Detection: Tracks import chains to prevent cycles
- Validation: Merged config validated after all imports processed
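File caching and cycle tracking can be combined in one recursive loader, sketched below in Python. This is not Duckalog's loader: `load_with_imports` and the injected `read_config` callable are hypothetical, paths are assumed to be already resolved, only plain list imports are handled, and merging is left out to keep the focus on the cache and the import chain.

```python
def load_with_imports(path, read_config, cache=None, chain=()):
    """Load `path` and everything it imports, reading each file at most once.

    `chain` holds the files currently being loaded, so seeing `path` in it
    again means the imports form a cycle.
    """
    cache = {} if cache is None else cache
    if path in chain:
        cycle = " -> ".join(chain + (path,))
        raise ValueError(f"Circular import detected: {cycle}")
    if path in cache:
        return cache[path]  # already loaded via another import
    config = read_config(path)
    for child in config.get("imports", []):
        load_with_imports(child, read_config, cache, chain + (path,))
    cache[path] = config
    return config


# In-memory stand-ins for parsed YAML files.
files = {
    "catalog.yaml": {"imports": ["users.yaml", "orders.yaml"]},
    "users.yaml": {"imports": ["base.yaml"]},
    "orders.yaml": {"imports": ["base.yaml"]},
    "base.yaml": {},
}
reads = []


def read_config(path):
    reads.append(path)  # record every actual read
    return files[path]


load_with_imports("catalog.yaml", read_config)
print(reads)

cyclic = {
    "file_a.yaml": {"imports": ["file_b.yaml"]},
    "file_b.yaml": {"imports": ["file_a.yaml"]},
}
try:
    load_with_imports("file_a.yaml", cyclic.__getitem__)
except ValueError as exc:
    print(exc)
```

The cycle check runs before the cache lookup, and a file is cached only after its imports have been processed, so a file that is shared by two branches is read once while a true cycle is still reported.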
Debugging Import Issues¶
Use the show-imports command to visualize and debug your import graph:
# View the import tree
duckalog show-imports catalog.yaml
# Show import diagnostics (depth, file counts, duplicates)
duckalog show-imports catalog.yaml --diagnostics
# Export import graph as JSON for programmatic analysis
duckalog show-imports catalog.yaml --format json
# Preview the fully merged configuration
duckalog show-imports catalog.yaml --show-merged
This helps you:
- Visualize the import structure - See which files import which others
- Detect circular imports - Identify problematic import chains
- Count total files - Understand the complexity of your configuration
- Find duplicate imports - Catch redundant file references
- Preview merged config - Verify the final configuration before building
- See override behavior - Understand which imports override others
- Validate section-specific imports - Check SelectiveImports structure
Example Output¶
$ duckalog show-imports catalog.yaml --diagnostics
Import Graph:
catalog.yaml
├── ./database-settings.yaml
├── ./views/
│   ├── users.yaml
│   └── products.yaml
└── ../shared/base.yaml
Diagnostics:
- Total files: 5
- Import depth: 3 levels
- No circular imports detected
- No duplicate imports found
- Selective imports: 1 file with section-specific imports
File Not Found¶
If you see "Imported file not found", check:
- The path is correct and relative to the importing file
- The file exists and is readable
- Environment variables in paths are set correctly
- Use duckalog show-imports catalog.yaml to see the resolved paths
Validation Errors¶
If you see "Field required" errors:
- Make sure all imported files have required fields (version, duckdb, views)
- Add empty lists for sections that don't apply: views: []
- Use duckalog show-imports catalog.yaml --show-merged to see the merged config
Duplicate Names¶
If you see duplicate name errors:
- Check that view names are unique across all imported files
- Use schema-qualified names if needed: schema.view_name
- Use duckalog show-imports catalog.yaml --diagnostics to identify where duplicates come from
Circular Imports¶
If you see a circular import error:
- Duckalog will show you the import chain where the cycle occurs
- Use duckalog show-imports catalog.yaml to visualize the structure
- Refactor your imports to eliminate the cycle (e.g., move common config to a separate file)