Config Imports

The config imports feature allows you to split your Duckalog configuration across multiple files, making it easier to organize, maintain, and reuse configuration components.

Why Use Config Imports?

Config imports are useful when you want to:

  • Split configuration by domain - Separate views, settings, and secrets into different files
  • Share common configuration - Reuse the same settings across multiple projects
  • Organize by team or module - Different teams can own their own config files
  • Environment-specific configs - Easily switch between dev/staging/prod configurations
  • Version control friendly - Smaller files result in clearer diffs

Basic Usage

Simple Import List

Create a main catalog file that imports other config files using a simple list:

# catalog.yaml
version: 1
imports:
  - ./settings.yaml
  - ./views.yaml

duckdb:
  database: main.duckdb

# settings.yaml
version: 1
duckdb:
  database: imported.duckdb
  install_extensions:
    - httpfs
  pragmas:
    - "SET threads = 4"

views:
  - name: imported_view
    sql: "SELECT 1"

# views.yaml
version: 1
views:
  - name: another_view
    sql: "SELECT 2"

When you load catalog.yaml, Duckalog will:

  1. Load the main config
  2. Import settings.yaml and views.yaml
  3. Merge them together (views are concatenated, scalar values are overridden)
  4. Return a single, merged configuration

Advanced Import Options

Duckalog supports advanced import features through SelectiveImports and ImportEntry objects.

Section-Specific Imports

Import different files for different configuration sections:

# catalog.yaml
version: 1
imports:
  duckdb:
    - ./database-settings.yaml
  views:
    - ./user-views.yaml
    - ./product-views.yaml
  attachments:
    - ./external-databases.yaml

duckdb:
  database: main.duckdb  # This can be overridden by database-settings.yaml

Import Override Control

Control whether imports can override existing values:

# catalog.yaml
version: 1
imports:
  - path: ./base-settings.yaml
    override: true    # Can override existing values (default)
  - path: ./optional-settings.yaml
    override: false   # Only fills missing fields, won't override

duckdb:
  database: main.duckdb

Combined Section-Specific with Override Control

# catalog.yaml
version: 1
imports:
  duckdb:
    - path: ./database.yaml
      override: true
    - path: ./optional-db-settings.yaml
      override: false
  views:
    - path: ./base-views.yaml
      override: true
  semantic_models:
    - path: ./shared-models.yaml
      override: false  # Won't override existing models

duckdb:
  database: catalog.duckdb

Import Entry Format

Each import can be specified as either:

  • Simple string: "./settings.yaml" (uses default override=true)
  • ImportEntry object: {path: "./settings.yaml", override: false}

# All equivalent:
imports:
  - ./settings.yaml                    # Simple string
  - path: ./settings.yaml              # ImportEntry with defaults
  - path: ./settings.yaml
    override: true                     # Explicit override
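
The equivalence of these forms can be illustrated with a small Python sketch. The ImportEntry dataclass and normalize_entry function below are hypothetical stand-ins written for this example, not Duckalog's actual classes; the only behavior taken from this page is that a bare string means override=true.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImportEntry:
    """Illustrative stand-in for an import entry (not Duckalog's real class)."""
    path: str
    override: bool = True  # default documented on this page


def normalize_entry(raw):
    """Turn a string or mapping import into a uniform ImportEntry."""
    if isinstance(raw, str):
        return ImportEntry(path=raw)
    return ImportEntry(path=raw["path"], override=raw.get("override", True))


entries = [
    "./settings.yaml",
    {"path": "./settings.yaml"},
    {"path": "./settings.yaml", "override": True},
]
normalized = [normalize_entry(e) for e in entries]

# All three spellings collapse to one distinct entry.
print(len(set(normalized)))
```

Normalizing early like this means the rest of a loader only ever deals with one shape of entry.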

Import Resolution Algorithm

Duckalog follows a precise algorithm when resolving imports:

1. Path Resolution

  • Remote URIs (s3://, gs://, https://, etc.): Used as-is
  • Absolute paths: Used as-is without modification
  • Relative paths: Resolved relative to the importing file's directory
  • Environment variables: Expanded before resolution (${env:VAR})
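
As a rough illustration of these four rules, the Python sketch below resolves an import path the way the list describes. It is not Duckalog's implementation: resolve_import_path is a hypothetical name, the remote prefixes are simply the schemes listed on this page, and raising KeyError for an unset variable is an assumption.

```python
import os
import re
from pathlib import Path

# Matches the ${env:VAR} syntax shown on this page.
_ENV_PATTERN = re.compile(r"\$\{env:([A-Za-z_][A-Za-z0-9_]*)\}")
# Assumed prefix list, copied from the URI schemes this page mentions.
_REMOTE_PREFIXES = ("s3://", "gs://", "gcs://", "abfs://", "sftp://", "https://")


def resolve_import_path(raw, importing_file):
    """Resolve one import path relative to the file that imports it."""
    # Environment variables are expanded before anything else.
    # An unset variable raises KeyError here (an assumption of this sketch).
    expanded = _ENV_PATTERN.sub(lambda m: os.environ[m.group(1)], raw)
    if expanded.startswith(_REMOTE_PREFIXES):
        return expanded  # remote URIs are used as-is
    path = Path(expanded)
    if path.is_absolute():
        return str(path)  # absolute paths are used as-is
    # Relative paths are anchored at the importing file's directory.
    return os.path.normpath(Path(importing_file).parent / path)


os.environ["CONFIG_DIR"] = "/path/to/configs"  # set only for this demo
print(resolve_import_path("./settings.yaml", "config/catalog.yaml"))
print(resolve_import_path("../shared/base.yaml", "config/catalog.yaml"))
print(resolve_import_path("${env:CONFIG_DIR}/settings.yaml", "config/catalog.yaml"))
print(resolve_import_path("s3://bucket/config.yaml", "config/catalog.yaml"))
```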

2. Import Processing Order

  1. Load main configuration file
  2. Process imports in the order they appear
  3. For SelectiveImports: process each section separately
  4. Apply override behavior based on ImportEntry.override setting
  5. Validate final merged configuration

3. Caching

  • Imported files are cached to avoid duplicate loading
  • Circular imports are detected during processing
  • Remote imports use authentication from environment or filesystem context

Merge Behavior

Config imports use a deep merge strategy with override control:

Scalar Values

Later imports override earlier ones:

# file1.yaml
version: 1
duckdb:
  database: file1.duckdb
  threads: 2

# file2.yaml
version: 1
duckdb:
  threads: 4  # This overrides the value from file1

Result:

duckdb:
  database: file1.duckdb  # From file1
  threads: 4              # From file2 (overrides file1)

Dictionaries

Dictionaries are merged recursively:

# base.yaml
duckdb:
  database: base.duckdb
  install_extensions:
    - httpfs

# override.yaml
duckdb:
  extensions:
    - json  # New key: added alongside install_extensions, nothing is replaced

Result:

duckdb:
  database: base.duckdb
  install_extensions:
    - httpfs
  extensions:
    - json

Lists with Override Control

List behavior depends on the override setting:

Default Behavior (override=true)

Lists are concatenated (items from all imports are included):

# file1.yaml
views:
  - name: view1
    sql: "SELECT 1"

# file2.yaml
views:
  - name: view2
    sql: "SELECT 2"

Result:

views:
  - name: view1  # From file1
  - name: view2  # From file2
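
The default merge rules (scalars replaced, dictionaries merged recursively, lists concatenated) can be approximated with the Python sketch below. deep_merge is a hypothetical function written for illustration, not Duckalog's merge routine, and the input dicts mirror the file1.yaml and file2.yaml examples above.

```python
from copy import deepcopy


def deep_merge(base, incoming):
    """Return a new dict: nested dicts merge, lists concatenate, scalars are replaced."""
    result = deepcopy(base)
    for key, value in incoming.items():
        current = result.get(key)
        if isinstance(current, dict) and isinstance(value, dict):
            result[key] = deep_merge(current, value)  # recurse into sections
        elif isinstance(current, list) and isinstance(value, list):
            result[key] = current + deepcopy(value)   # keep items from both
        else:
            result[key] = deepcopy(value)             # later value wins
    return result


file1 = {
    "duckdb": {"database": "file1.duckdb", "threads": 2},
    "views": [{"name": "view1", "sql": "SELECT 1"}],
}
file2 = {
    "duckdb": {"threads": 4},
    "views": [{"name": "view2", "sql": "SELECT 2"}],
}

merged = deep_merge(file1, file2)
print(merged["duckdb"])
print([v["name"] for v in merged["views"]])
```

Working on copies keeps the inputs unchanged, which makes it easy to merge the same base into several catalogs.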

Override Control (override=false)

When override=false, lists are only merged if the target list is empty:

# main.yaml
version: 1
imports:
  - path: ./additional-views.yaml
    override: false  # Won't override existing views

views:
  - name: main_view
    sql: "SELECT 1"

# additional-views.yaml
version: 1
views:
  - name: extra_view
    sql: "SELECT 2"

Result: (additional-views.yaml views are ignored because main.yaml already has views)

views:
  - name: main_view
    sql: "SELECT 1"
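
A gap-filling merge of this kind might look like the following Python sketch. fill_missing is a hypothetical function, not Duckalog's code, and the duckdb key in the second dict is added here only to show a missing field being filled.

```python
from copy import deepcopy


def fill_missing(target, incoming):
    """Copy values from incoming only where target has nothing yet."""
    result = deepcopy(target)
    for key, value in incoming.items():
        if key not in result:
            result[key] = deepcopy(value)                 # missing field: fill it
        elif isinstance(result[key], dict) and isinstance(value, dict):
            result[key] = fill_missing(result[key], value)  # fill gaps one level down
        elif isinstance(result[key], list) and isinstance(value, list) and not result[key]:
            result[key] = deepcopy(value)                 # empty list counts as a gap
        # Otherwise the existing scalar or non-empty list is kept.
    return result


main = {"views": [{"name": "main_view", "sql": "SELECT 1"}]}
additional = {
    "views": [{"name": "extra_view", "sql": "SELECT 2"}],
    "duckdb": {"database": "extra.duckdb"},  # illustrative extra section
}

result = fill_missing(main, additional)
print([v["name"] for v in result["views"]])
print(result["duckdb"])
```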

Environment Variables

You can use environment variables in import paths:

# catalog.yaml
version: 1
imports:
  - ${env:CONFIG_DIR}/settings.yaml  # Uses CONFIG_DIR environment variable

duckdb:
  database: main.duckdb

Set the environment variable:

export CONFIG_DIR=/path/to/configs

Import Order and Precedence

Imports are processed in the order they appear, and later imports override earlier ones. The main config file always has the final say:

# catalog.yaml
imports:
  - ./base.yaml
  - ./production.yaml  # This can override settings from base.yaml

# Settings here override both imported files
duckdb:
  database: catalog.duckdb
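
For scalar settings only, this precedence can be pictured with Python's standard collections.ChainMap. The layer contents below are invented for illustration, and the sketch deliberately ignores list concatenation and nested merging.

```python
from collections import ChainMap

# Flattened duckdb settings from each layer (illustrative values).
base = {"database": "base.duckdb", "threads": 2}
production = {"threads": 8}
main = {"database": "catalog.duckdb"}

# ChainMap searches left to right, so the highest-precedence layer goes first:
# main file, then the later import, then the earlier import.
effective = ChainMap(main, production, base)

print(effective["database"])  # taken from the main file
print(effective["threads"])   # taken from production.yaml, the later import
```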

Uniqueness Validation

After merging, Duckalog validates that certain items are unique:

  • Views: Must have unique (schema, name) tuples
  • Semantic Models: Must have unique names
  • Iceberg Catalogs: Must have unique names
  • Attachments: Must have unique aliases

If duplicates are found, you'll get an error:

Duplicate view name(s) found: users
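
A check for the first rule could be sketched in Python as follows. find_duplicate_views is a hypothetical helper, and treating a missing schema as None is an assumption of this example rather than documented Duckalog behavior.

```python
from collections import Counter


def find_duplicate_views(views):
    """Return every (schema, name) pair that occurs more than once."""
    counts = Counter((v.get("schema"), v["name"]) for v in views)
    return [key for key, n in counts.items() if n > 1]


views = [
    {"name": "users", "sql": "SELECT 1"},
    {"schema": "staging", "name": "users", "sql": "SELECT 2"},  # different schema: fine
    {"name": "users", "sql": "SELECT 3"},                       # clashes with the first
]

print(find_duplicate_views(views))
```

Because the key is the (schema, name) pair, the same view name can appear in two schemas without triggering the check.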

Circular Import Detection

Duckalog automatically detects and prevents circular imports:

# file_a.yaml
imports:
  - ./file_b.yaml

# file_b.yaml
imports:
  - ./file_a.yaml  # This creates a circular reference!

This will fail with an error:

Circular import detected in import chain: file_a.yaml -> file_b.yaml -> file_a.yaml
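
One common way to detect such cycles is a depth-first walk that remembers the current import chain. The Python sketch below works on an in-memory graph instead of real files; check_cycles and the use of ValueError are assumptions of this example, while the message format follows the error shown above.

```python
def check_cycles(graph, start):
    """Walk imports depth-first and raise if a file re-enters its own chain."""

    def visit(node, chain):
        if node in chain:
            cycle = chain[chain.index(node):] + [node]
            raise ValueError(
                "Circular import detected in import chain: " + " -> ".join(cycle)
            )
        for child in graph.get(node, []):
            visit(child, chain + [node])

    visit(start, [])


# file -> list of files it imports, mirroring the example above
graph = {
    "file_a.yaml": ["file_b.yaml"],
    "file_b.yaml": ["file_a.yaml"],
}

try:
    check_cycles(graph, "file_a.yaml")
except ValueError as exc:
    print(exc)
```

Tracking the chain per path, not a global visited set, means two files importing the same shared file is not mistaken for a cycle.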

Remote Imports

You can import configuration files from remote storage systems such as S3, GCS, Azure Blob Storage, or HTTPS endpoints. This allows you to share common configuration across multiple projects or load configuration from centralized locations.

Supported URI Schemes

Duckalog supports the following remote URI schemes:

  • S3: s3://bucket/path/config.yaml
  • Google Cloud Storage: gs://bucket/path/config.yaml or gcs://bucket/path/config.yaml
  • Azure Blob Storage: abfs://account@container/path/config.yaml
  • SFTP: sftp://user@host/path/config.yaml
  • HTTPS: https://example.com/config.yaml

Example: Import from S3

# catalog.yaml
version: 1
imports:
  - s3://my-bucket/shared/base-config.yaml
  - ./local-config.yaml

duckdb:
  database: my-project.duckdb

Example: Import from HTTPS

# catalog.yaml
version: 1
imports:
  - https://raw.githubusercontent.com/company/config-templates/main/base.yaml

duckdb:
  database: main.duckdb

Authentication

Remote imports use the same authentication mechanisms as regular remote config loading:

  • S3: AWS credentials via environment variables, ~/.aws/credentials, or IAM role
  • GCS: Google Cloud credentials via GOOGLE_APPLICATION_CREDENTIALS or ADC
  • Azure: Azure credentials via environment variables or managed identity
  • SFTP: SSH credentials via SSH config or environment variables
  • HTTPS: No authentication required for public URLs

For S3, you can use environment variables:

export AWS_ACCESS_KEY_ID=your_access_key
export AWS_SECRET_ACCESS_KEY=your_secret_key

Environment Variables in Remote Paths

You can use environment variables in remote import paths:

# catalog.yaml
version: 1
imports:
  - https://${env:COMPANY}.s3.amazonaws.com/configs/base.yaml

duckdb:
  database: main.duckdb

Error Handling

Remote import failures include clear error messages with the URI and operation:

Failed to load remote config 's3://bucket/config.yaml': NoSuchKey: The specified key does not exist

Security Notes

  • Remote imports follow the same security rules as remote config loading
  • Credentials should be provided via environment variables or secure credential stores
  • For HTTPS URLs, ensure you're using trusted endpoints
  • Consider using signed URLs for temporary access to private resources

Best Practices

1. Organize by Domain

config/
├── catalog.yaml          # Main file with imports
├── settings.yaml         # Database settings, extensions
├── views/
│   ├── users.yaml
│   ├── products.yaml
│   └── orders.yaml
└── environments/
    ├── dev.yaml
    └── prod.yaml

2. Use Empty Lists for Optional Sections

If a file contains only settings and no views, add an empty views list:

# settings-only.yaml
version: 1
duckdb:
  database: db.duckdb
views: []  # Explicitly empty to avoid required field errors

3. Document Your Imports

Add comments explaining why files are imported:

# catalog.yaml
imports:
  - ./settings.yaml        # Base database configuration
  - ./views/users.yaml     # User-related views
  - ./views/products.yaml  # Product catalog views

4. Version Control

  • Keep import files in version control
  • Use consistent naming conventions
  • Add .gitignore for generated databases (.duckdb files)

Common Patterns

Environment-Specific Configuration

# catalog.yaml
version: 1
imports:
  - ./base.yaml
  - ./environments/${env:ENVIRONMENT}.yaml  # dev.yaml, prod.yaml, etc.

Layered Configuration with Override Control

# catalog.yaml
version: 1
imports:
  # Base layer - always applied
  - path: ./base.yaml
    override: true

  # Environment-specific - can override base
  - path: ./environments/${env:ENVIRONMENT}.yaml
    override: true

  # Local overrides - only fills gaps, doesn't override
  - path: ./local-overrides.yaml
    override: false

duckdb:
  database: catalog.duckdb

Section-Specific Team Ownership

# catalog.yaml
version: 1
imports:
  # Infrastructure team owns database settings
  duckdb:
    - path: ./infrastructure/database.yaml
      override: true

  # Data team owns views
  views:
    - path: ./analytics/core-views.yaml
      override: true
    - path: ./analytics/dimension-views.yaml
      override: true

  # BI team owns semantic models
  semantic_models:
    - path: ./bi/shared-models.yaml
      override: false  # Won't override local models

duckdb:
  database: analytics.duckdb

Modular View Organization

# catalog.yaml
version: 1
imports:
  views:
    - ./domains/users/views.yaml      # User domain views
    - ./domains/products/views.yaml   # Product domain views
    - ./domains/orders/views.yaml     # Order domain views
    - ./domains/analytics/views.yaml  # Analytics views

duckdb:
  database: multi_domain.duckdb

Safe Configuration Updates

# catalog.yaml
version: 1
imports:
  # Production config - can be overridden
  - path: ./production.yaml
    override: true

  # Emergency fixes - only fills gaps, safe to apply
  - path: ./emergency-fixes.yaml
    override: false

  # Local development - never overrides production
  - path: ./local-dev.yaml
    override: false

duckdb:
  database: catalog.duckdb

Shared Base Configuration

# base.yaml
version: 1
duckdb:
  database: analytics.duckdb
  install_extensions:
    - httpfs
    - iceberg

Then import this in all your projects:

# project1.yaml
imports:
  - ../shared/base.yaml

Team Ownership

config/
├── catalog.yaml
├── infrastructure/        # Ops team owns this
│   └── database.yaml
├── analytics/             # Data team owns this
│   └── views.yaml
└── reporting/             # BI team owns this
    └── reports.yaml

Migration from Single File

To migrate an existing single-file configuration:

  1. Create a new main file:

    # catalog.yaml
    version: 1
    imports:
      - ./migrated-config.yaml
    

  2. Rename your existing config:

    mv old-catalog.yaml migrated-config.yaml
    

  3. Test that it works:

    duckalog run catalog.yaml
    

  4. Gradually split migrated-config.yaml into smaller files

Troubleshooting

Import Resolution Details

Path Resolution Rules

  1. Remote URIs: Used as-is without modification

    imports:
      - s3://bucket/config.yaml
      - https://example.com/config.yaml
    

  2. Absolute Paths: Used as-is

    imports:
      - /etc/duckalog/base.yaml
      - C:\Configs\base.yaml
    

  3. Relative Paths: Resolved relative to importing file

    # config/catalog.yaml imports:
    - ./settings.yaml        # → config/settings.yaml
    - ../shared/base.yaml     # → shared/base.yaml
    - ./views/users.yaml      # → config/views/users.yaml
    

  4. Environment Variables: Expanded before resolution

    imports:
      - ${env:CONFIG_DIR}/settings.yaml
      - ${env:SHARED_CONFIGS}/base.yaml
    

Import Processing Order

Imports are processed in this order:

  1. Main configuration file loaded
  2. imports section processed top-to-bottom
  3. For SelectiveImports: each section processed in order (duckdb → views → attachments → iceberg_catalogs → semantic_models)
  4. Within each section: imports processed in order listed
  5. Override behavior applied based on ImportEntry.override
  6. Final validation performed
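
The section ordering in step 3 can be sketched in Python as a function that flattens an imports value into an ordered plan. plan_imports and SECTION_ORDER are hypothetical names for this example; only the section order itself comes from the text above.

```python
# Fixed section order, as listed in step 3.
SECTION_ORDER = ("duckdb", "views", "attachments", "iceberg_catalogs", "semantic_models")


def plan_imports(imports):
    """Return (section, entry) pairs in the order they would be processed."""
    if isinstance(imports, list):
        # Simple list form: no section, processed top to bottom.
        return [(None, entry) for entry in imports]
    # Section-specific form: fixed section order, listed order within a section.
    return [
        (section, entry)
        for section in SECTION_ORDER
        for entry in imports.get(section, [])
    ]


selective = {
    "views": ["./user-views.yaml", "./product-views.yaml"],
    "duckdb": ["./database-settings.yaml"],
}

for section, entry in plan_imports(selective):
    print(section, entry)
```

In this sketch, views comes first in the mapping, yet the duckdb import is still planned ahead of it.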

Caching and Performance

  • File Caching: Each unique file path loaded only once
  • Remote Caching: Remote files cached per import operation
  • Circular Detection: Tracks import chains to prevent cycles
  • Validation: Merged config validated after all imports processed

Debugging Import Issues

Use the show-imports command to visualize and debug your import graph:

# View the import tree
duckalog show-imports catalog.yaml

# Show import diagnostics (depth, file counts, duplicates)
duckalog show-imports catalog.yaml --diagnostics

# Export import graph as JSON for programmatic analysis
duckalog show-imports catalog.yaml --format json

# Preview the fully merged configuration
duckalog show-imports catalog.yaml --show-merged

This helps you:

  • Visualize the import structure - See which files import which others
  • Detect circular imports - Identify problematic import chains
  • Count total files - Understand the complexity of your configuration
  • Find duplicate imports - Catch redundant file references
  • Preview merged config - Verify the final configuration before building
  • See override behavior - Understand which imports override others
  • Validate section-specific imports - Check SelectiveImports structure

Example Output

$ duckalog show-imports catalog.yaml --diagnostics

Import Graph:
catalog.yaml
├── ./database-settings.yaml
├── ./views/
│   ├── users.yaml
│   └── products.yaml
└── ../shared/base.yaml

Diagnostics:
- Total files: 5
- Import depth: 3 levels
- No circular imports detected
- No duplicate imports found
- Selective imports: 1 file with section-specific imports

File Not Found

If you see "Imported file not found", check:

  • The path is correct and relative to the importing file
  • The file exists and is readable
  • Environment variables in paths are set correctly
  • Use duckalog show-imports catalog.yaml to see the resolved paths

Validation Errors

If you see "Field required" errors:

  • Make sure all imported files have required fields (version, duckdb, views)
  • Add empty lists for sections that don't apply: views: []
  • Use duckalog show-imports catalog.yaml --show-merged to see the merged config

Duplicate Names

If you see duplicate name errors:

  • Check that view names are unique across all imported files
  • Use schema-qualified names if needed: schema.view_name
  • Use duckalog show-imports catalog.yaml --diagnostics to identify where duplicates come from

Circular Imports

If you see a circular import error:

  • Duckalog will show you the import chain where the cycle occurs
  • Use duckalog show-imports catalog.yaml to visualize the structure
  • Refactor your imports to eliminate the cycle (e.g., move common config to a separate file)

See Also