API Reference¶

This section provides comprehensive API documentation for the Duckalog library, including both the traditional API and new modular architecture patterns with dependency injection support.

Core API¶

The main Duckalog API provides convenience functions for common use cases:

`duckalog` ¶

Duckalog public API.

`AttachmentsConfig` ¶

Bases: BaseModel

Collection of attachment configurations.

Attributes:

Name	Type	Description
`duckdb`	`list[DuckDBAttachment]`	DuckDB attachment entries.
`sqlite`	`list[SQLiteAttachment]`	SQLite attachment entries.
`postgres`	`list[PostgresAttachment]`	Postgres attachment entries.
`duckalog`	`list[DuckalogAttachment]`	Duckalog config attachment entries.

`CatalogConnection` ¶

Manages DuckDB connections with session state restoration and lazy initialization.

This class ensures that every connection obtained through it has the correct catalog state (pragmas, settings, extensions, attachments, and secrets) applied, regardless of whether it's a new or existing connection.

Attributes:

Name	Type	Description
`config_path`		Path to the Duckalog configuration file.
`database_path`		Optional override for the DuckDB database path.
`read_only`		Whether to open the connection in read-only mode.
`force_rebuild`		If True, all views will be recreated even if they exist.
`config`	`Optional[Config]`	The loaded and validated configuration.
`conn`	`Optional[DuckDBPyConnection]`	The active DuckDB connection, or None if not initialized.

`__enter__()` ¶

Context manager support: returns the CatalogConnection instance.

`__exit__(exc_type, exc_val, exc_tb)` ¶

Context manager support: ensures connection cleanup.

`__init__(config_path, database_path=None, read_only=False, force_rebuild=False, filesystem=None, load_dotenv=True)` ¶

Initialize the catalog connection manager.

Parameters:

Name	Type	Description	Default
`config_path`	`str`	Path to the Duckalog configuration file.	required
`database_path`	`Optional[str]`	Optional override for the DuckDB database path.	`None`
`read_only`	`bool`	Whether to open the connection in read-only mode.	`False`
`force_rebuild`	`bool`	If True, all views will be recreated even if they exist.	`False`
`filesystem`	`Optional[Any]`	Optional fsspec filesystem object for remote file access.	`None`
`load_dotenv`	`bool`	If True, automatically load and process .env files.	`True`

`close()` ¶

Clean up the DuckDB connection and resources.

`get_connection()` ¶

Get the DuckDB connection, initializing it if necessary.

This method establishes the connection lazily on the first call, restores the session state, and performs incremental updates. Subsequent calls return the same connection instance.

Returns:

Type	Description
`DuckDBPyConnection`	An active DuckDB connection with catalog state restored.

Raises:

Type	Description
`ConfigError`	If loading the configuration fails.
`FileNotFoundError`	If the database path is invalid or missing (for existing catalogs).
`EngineError`	If connecting to DuckDB or restoring state fails.
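
The lazy-initialization behaviour described for `get_connection()` can be illustrated with a minimal sketch. This is not Duckalog's implementation: `LazyConnectionSketch` is a hypothetical class, and the standard-library `sqlite3` module stands in for DuckDB so the example runs without extra dependencies.

```python
import sqlite3
from typing import Optional


class LazyConnectionSketch:
    """Hypothetical stand-in showing the lazy pattern; not Duckalog's class."""

    def __init__(self, database_path: str = ":memory:") -> None:
        self.database_path = database_path
        # Nothing is opened at construction time.
        self.conn: Optional[sqlite3.Connection] = None

    def get_connection(self) -> sqlite3.Connection:
        # Open on first use; later calls reuse the same object.
        if self.conn is None:
            self.conn = sqlite3.connect(self.database_path)
            # A real manager would restore pragmas, attachments, etc. here.
        return self.conn

    def close(self) -> None:
        if self.conn is not None:
            self.conn.close()
            self.conn = None

    def __enter__(self) -> "LazyConnectionSketch":
        return self

    def __exit__(self, exc_type, exc_val, exc_tb) -> None:
        # Cleanup runs whether or not the block raised.
        self.close()


with LazyConnectionSketch() as manager:
    first = manager.get_connection()
    second = manager.get_connection()
    same_object = first is second

closed_after_exit = manager.conn is None
```

The point of the pattern is that constructing the manager is cheap, and callers never need to know whether the connection already exists.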

`Config` ¶

Bases: BaseModel

Top-level Duckalog configuration.

Attributes:

Name	Type	Description
`version`	`int`	Positive integer describing the config schema version.
`duckdb`	`DuckDBConfig`	DuckDB session and connection settings.
`views`	`list[ViewConfig]`	List of view definitions to create in the catalog.
`attachments`	`AttachmentsConfig`	Optional attachments to external databases.
`iceberg_catalogs`	`list[IcebergCatalogConfig]`	Optional Iceberg catalog definitions.
`semantic_models`	`list[SemanticModelConfig]`	Optional semantic model definitions for business metadata.
`imports`	`Union[list[Union[str, ImportEntry]], SelectiveImports]`	Optional list of additional config files to import and merge. Can be a simple list of paths (backward compatible) or a SelectiveImports object for advanced options like section-specific imports, override behavior, and glob patterns.
`env_files`	`list[str]`	Optional list of custom .env file patterns to load. Supports patterns like ['.env', '.env.local', '.env.production']. Files are loaded in order with later files overriding earlier ones. Defaults to ['.env'] for backward compatibility.
`loader_settings`	`LoaderSettings`	Optional settings for the configuration loader.
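
The `env_files` ordering rule (later files override earlier ones) can be sketched as a simple layered merge. The parser below is a deliberately simplified assumption that handles only `KEY=VALUE` lines and comments; `parse_env_text` and `merge_env_layers` are hypothetical names, and the sketch works on in-memory strings rather than real files.

```python
from typing import Dict, List


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def merge_env_layers(layers: List[str]) -> Dict[str, str]:
    """Merge layers in order so that later layers win on conflicts."""
    merged: Dict[str, str] = {}
    for text in layers:
        merged.update(parse_env_text(text))
    return merged


# Stand-ins for the contents of '.env' and '.env.local'.
base_env = "DB_HOST=localhost\nDB_PORT=5432\n"
local_env = "# local overrides\nDB_HOST=db.internal\n"

merged = merge_env_layers([base_env, local_env])
```

Here `DB_HOST` takes the value from the second layer while `DB_PORT` survives from the first.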

`ConfigError` ¶

Bases: DuckalogError

Configuration-related errors.

This exception is raised when a catalog configuration cannot be read, parsed, interpolated, or validated according to the Duckalog schema.

Typical error conditions include:

The config file does not exist or cannot be read.
The file is not valid YAML/JSON.
Required fields are missing or invalid.
An environment variable placeholder cannot be resolved.

`DuckDBAttachment` ¶

Bases: BaseModel

Configuration for attaching another DuckDB database.

Attributes:

Name	Type	Description
`alias`	`str`	Alias under which the database will be attached.
`path`	`str`	Filesystem path to the DuckDB database file.
`read_only`	`bool`	Whether the attachment should be opened in read-only mode. Defaults to `True` for safety.

`DuckDBConfig` ¶

Bases: BaseModel

DuckDB connection and session settings.

Attributes:

Name	Type	Description
`database`	`str`	Path to the DuckDB database file. Defaults to `":memory:"`.
`install_extensions`	`list[str]`	Names of extensions to install before use.
`load_extensions`	`list[str]`	Names of extensions to load in the session.
`pragmas`	`list[str]`	SQL statements (typically `SET` pragmas) executed after connecting and loading extensions.
`settings`	`Optional[Union[str, list[str]]]`	DuckDB SET statements executed after pragmas. Can be a single string or list of strings.
`secrets`	`list[SecretConfig]`	List of secret definitions for external services and databases.

`EngineError` ¶

Bases: DuckalogError

Engine-level error raised during catalog builds.

This exception wraps lower-level DuckDB errors, such as failures to connect to the database, attach external systems, or execute generated SQL statements.

`IcebergCatalogConfig` ¶

Bases: BaseModel

Configuration for an Iceberg catalog.

Attributes:

Name	Type	Description
`name`	`str`	Catalog name referenced by Iceberg views.
`catalog_type`	`str`	Backend type (for example, `rest`, `hive`, `glue`).
`uri`	`Optional[str]`	Optional URI used by certain catalog types.
`warehouse`	`Optional[str]`	Optional warehouse location for catalog data.
`options`	`dict[str, Any]`	Additional catalog-specific options.

`PostgresAttachment` ¶

Bases: BaseModel

Configuration for attaching a Postgres database.

Attributes:

Name	Type	Description
`alias`	`str`	Alias used inside DuckDB to reference the Postgres database.
`host`	`str`	Hostname or IP address of the Postgres server.
`port`	`int`	TCP port of the Postgres server.
`database`	`str`	Database name to connect to.
`user`	`str`	Username for authentication.
`password`	`str`	Password for authentication.
`sslmode`	`Optional[str]`	Optional SSL mode (for example, `require`).
`options`	`dict[str, Any]`	Extra key/value options passed to the attachment clause.

`SQLFileEncodingError` ¶

Bases: SQLFileError

Raised when a SQL file has invalid encoding.

`SQLFileError` ¶

Bases: ConfigError

Base exception for SQL file-related errors.

This exception is raised when SQL file operations fail, such as when a referenced SQL file cannot be found, read, or processed.

`SQLFileLoader` ¶

Loads SQL content from external files and processes templates.

`__init__()` ¶

Initialize the SQL file loader.

`load_sql_file(file_path, config_file_path, variables=None, as_template=False, filesystem=None)` ¶

Load SQL content from a file and optionally process as a template.

Parameters:

Name	Type	Description	Default
`file_path`	`str`	Path to the SQL file (can be relative or absolute)	required
`config_file_path`	`str`	Path to the config file for resolving relative paths	required
`variables`	`Optional[dict[str, Any]]`	Dictionary of variables for template substitution	`None`
`as_template`	`bool`	Whether to process the file content as a template	`False`
`filesystem`	`Optional[Any]`	Optional filesystem object for file I/O operations	`None`

Returns:

Type	Description
`str`	The loaded SQL content (processed if template, raw otherwise)

Raises:

Type	Description
`SQLFileError`	If the file cannot be loaded or processed

`SQLFileNotFoundError` ¶

Bases: SQLFileError

Raised when a referenced SQL file does not exist.

`SQLFilePermissionError` ¶

Bases: SQLFileError

Raised when a SQL file cannot be read due to permissions.

`SQLFileReference` ¶

Bases: BaseModel

Reference to SQL content in an external file.

Attributes:

Name	Type	Description
`path`	`str`	Path to the SQL file (relative or absolute).
`variables`	`Optional[dict[str, Any]]`	Dictionary of variables for template substitution.
`as_template`	`bool`	Whether to process the file content as a template.

`SQLFileSizeError` ¶

Bases: SQLFileError

Raised when a SQL file exceeds size limits.

`SQLGroup` ¶

Unified access to all SQL-related functionality.

`SQLTemplateError` ¶

Bases: SQLFileError

Raised when template processing fails.
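
The "Bases" entries for the SQL file exceptions imply that a handler for `ConfigError` also catches every SQL file error. The sketch below mirrors the documented hierarchy with local stand-in classes (they are redefined here so the example runs without the library installed); `describe_failure` is a hypothetical helper.

```python
class DuckalogError(Exception):
    """Local stand-in for the documented base class."""


class ConfigError(DuckalogError):
    pass


class SQLFileError(ConfigError):
    pass


class SQLFileNotFoundError(SQLFileError):
    pass


class SQLTemplateError(SQLFileError):
    pass


def describe_failure(exc: Exception) -> str:
    """Check the most specific classes first, then fall back to broader ones."""
    if isinstance(exc, SQLFileNotFoundError):
        return "missing SQL file"
    if isinstance(exc, SQLFileError):
        return "SQL file problem"
    if isinstance(exc, ConfigError):
        return "configuration problem"
    return "other error"


try:
    raise SQLTemplateError("unresolved variable")
except ConfigError as exc:  # the broad handler also sees SQL file errors
    caught_as = describe_failure(exc)
```

Ordering handlers from most to least specific keeps the broad `ConfigError` branch from swallowing cases that deserve their own message.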

`SQLiteAttachment` ¶

Bases: BaseModel

Configuration for attaching a SQLite database.

Attributes:

Name	Type	Description
`alias`	`str`	Alias under which the SQLite database will be attached.
`path`	`str`	Filesystem path to the SQLite `.db` file.

`SecretConfig` ¶

Bases: BaseModel

Configuration for a DuckDB secret.

Attributes:

Name	Type	Description
`type`	`SecretType`	Secret type (s3, azure, gcs, http, postgres, mysql).
`name`	`Optional[str]`	Optional name for the secret (defaults to type if not provided).
`provider`	`SecretProvider`	Secret provider (config or credential_chain).
`persistent`	`bool`	Whether to create a persistent secret. Defaults to False.
`scope`	`Optional[str]`	Optional scope prefix for the secret.
`key_id`	`Optional[str]`	Access key ID or username for authentication.
`secret`	`Optional[str]`	Secret key or password for authentication.
`region`	`Optional[str]`	Geographic region for cloud services.
`endpoint`	`Optional[str]`	Custom endpoint URL for cloud services.
`connection_string`	`Optional[str]`	Full connection string for databases.
`tenant_id`	`Optional[str]`	Azure tenant ID for authentication.
`account_name`	`Optional[str]`	Azure storage account name.
`client_id`	`Optional[str]`	Azure client ID for authentication.
`client_secret`	`Optional[str]`	Azure client secret for authentication.
`service_account_key`	`Optional[str]`	GCS service account key.
`json_key`	`Optional[str]`	GCS JSON key.
`bearer_token`	`Optional[str]`	HTTP bearer token for authentication.
`header`	`Optional[str]`	HTTP header for authentication.
`database`	`Optional[str]`	Database name for database secrets.
`host`	`Optional[str]`	Database host for database secrets.
`port`	`Optional[int]`	Database port for database secrets.
`user`	`Optional[str]`	Database username (alternative to key_id for database types).
`password`	`Optional[str]`	Database password (alternative to secret for database types).
`options`	`dict[str, Any]`	Additional key-value options for the secret.

`SemanticDefaultsConfig` ¶

Bases: BaseModel

Default configuration for a semantic model.

Provides default settings for query builders and dashboards, such as the primary time dimension and default measures.

Attributes:

Name	Type	Description
`time_dimension`	`Optional[str]`	Default time dimension name.
`primary_measure`	`Optional[str]`	Default primary measure name.
`default_filters`	`list[dict[str, Any]]`	Optional list of default filters.

`SemanticDimensionConfig` ¶

Bases: BaseModel

Definition of a semantic dimension.

A dimension represents a business attribute that maps to an expression over the base view of a semantic model.

Attributes:

Name	Type	Description
`name`	`str`	Unique dimension name within the semantic model.
`expression`	`str`	SQL expression referencing columns from the base view.
`label`	`Optional[str]`	Human-readable display name.
`description`	`Optional[str]`	Optional detailed description.
`type`	`Optional[str]`	Optional data type hint (time, number, string, boolean, date).
`time_grains`	`list[str]`	Optional list of time grains for time dimensions.

`SemanticJoinConfig` ¶

Bases: BaseModel

Definition of a semantic join.

A join defines a relationship to another view for enriching the semantic model with additional data, typically dimension tables.

Attributes:

Name	Type	Description
`to_view`	`str`	Name of an existing view in the views section to join to.
`type`	`str`	Join type (inner, left, right, full).
`on_condition`	`str`	SQL join condition expression.

`SemanticMeasureConfig` ¶

Bases: BaseModel

Definition of a semantic measure.

A measure represents a business metric that typically involves aggregation or calculation over the base view of a semantic model.

Attributes:

Name	Type	Description
`name`	`str`	Unique measure name within the semantic model.
`expression`	`str`	SQL expression (often aggregated) over the base view.
`label`	`Optional[str]`	Human-readable display name.
`description`	`Optional[str]`	Optional detailed description.
`type`	`Optional[str]`	Optional data type hint.

`SemanticModelConfig` ¶

Bases: BaseModel

Definition of a semantic model.

A semantic model provides business-friendly metadata on top of an existing Duckalog view, defining dimensions and measures for analytics and BI use cases.

Attributes:

Name	Type	Description
`name`	`str`	Unique semantic model name within the config.
`base_view`	`str`	Name of an existing view in the views section.
`dimensions`	`list[SemanticDimensionConfig]`	Optional list of dimension definitions.
`measures`	`list[SemanticMeasureConfig]`	Optional list of measure definitions.
`joins`	`list[SemanticJoinConfig]`	Optional list of join definitions to other views.
`defaults`	`Optional[SemanticDefaultsConfig]`	Optional default configuration for query builders.
`label`	`Optional[str]`	Human-readable display name.
`description`	`Optional[str]`	Optional detailed description.
`tags`	`list[str]`	Optional list of classification tags.

`ViewConfig` ¶

Bases: BaseModel

Definition of a single catalog view.

A view can be defined in several ways:

1. Inline SQL: Using the sql field with raw SQL text
2. SQL File: Using sql_file to reference external SQL files
3. SQL Template: Using sql_template for parameterized SQL files
4. Data Source: Using source + required fields for direct data access
5. Source + SQL: Using source for data access plus sql for transformations

For data sources, the required fields depend on the source type:

Parquet/Delta: uri field is required
Iceberg: Either uri OR both catalog and table
DuckDB/SQLite/Postgres: Both database and table are required

When using SQL with a data source, the SQL will be applied as a transformation over the data from the specified source.

Additional metadata fields such as description and tags do not affect SQL generation but are preserved for documentation and tooling.

Attributes:

Name	Type	Description
`name`	`str`	Unique view name within the config.
`schema`	`str`	Optional schema name for organizing views in DuckDB schemas.
`sql`	`Optional[str]`	Raw SQL text defining the view body.
`sql_file`	`Optional[SQLFileReference]`	Direct reference to a SQL file.
`sql_template`	`Optional[SQLFileReference]`	Reference to a SQL template file with variable substitution.
`source`	`Optional[EnvSource]`	Source type (e.g. `"parquet"`, `"iceberg"`, `"duckdb"`).
`uri`	`Optional[str]`	URI for file- or table-based sources (Parquet/Delta/Iceberg).
`database`	`Optional[str]`	Attachment alias for attached-database sources.
`table`	`Optional[str]`	Table name (optionally schema-qualified) for attached sources.
`catalog`	`Optional[str]`	Iceberg catalog name for catalog-based Iceberg views.
`options`	`dict[str, Any]`	Source-specific options passed to scan functions.
`description`	`Optional[str]`	Optional human-readable description of the view.
`tags`	`list[str]`	Optional list of tags for classification.
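
The per-source field requirements described for `ViewConfig` can be expressed as a small check. This is a sketch of the rule only, not the library's validator: `missing_source_fields` is a hypothetical helper, views are plain dictionaries, and the lowercase source names `"delta"`, `"sqlite"`, and `"postgres"` are assumed by analogy with the documented `"parquet"`, `"iceberg"`, and `"duckdb"`.

```python
from typing import Any, List, Mapping


def missing_source_fields(view: Mapping[str, Any]) -> List[str]:
    """Return the names of required fields that the view does not provide."""
    source = view.get("source")

    def has(name: str) -> bool:
        return bool(view.get(name))

    if source in ("parquet", "delta"):
        return [] if has("uri") else ["uri"]
    if source == "iceberg":
        # Either a direct URI or a catalog/table pair is acceptable.
        if has("uri") or (has("catalog") and has("table")):
            return []
        return ["uri or (catalog and table)"]
    if source in ("duckdb", "sqlite", "postgres"):
        return [name for name in ("database", "table") if not has(name)]
    return []  # no source given, or a source type this sketch does not model


parquet_missing = missing_source_fields({"name": "events", "source": "parquet"})
iceberg_missing = missing_source_fields(
    {"name": "orders", "source": "iceberg", "catalog": "main", "table": "orders"}
)
postgres_missing = missing_source_fields(
    {"name": "users", "source": "postgres", "database": "pg"}
)
```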

`connect_to_catalog(config_path, database_path=None, read_only=False, force_rebuild=False)` ¶

Create a CatalogConnection instance that manages DuckDB connections with state restoration.

This is the primary entry point for working with Duckalog catalogs in Python. It returns a `CatalogConnection` instance, which lazily establishes a DuckDB connection, automatically restores session state (pragmas, attachments, etc.), and performs incremental view updates.

Parameters:

Name	Type	Description	Default
`config_path`	`str`	Path to the YAML/JSON configuration file.	required
`database_path`	`str \| None`	Optional database path override.	`None`
`read_only`	`bool`	Open the connection in read-only mode for safety.	`False`
`force_rebuild`	`bool`	If True, all views will be recreated even if they exist.	`False`

Returns:

Type	Description
`CatalogConnection`	A `CatalogConnection` instance.

Example

Using as a context manager:

from duckalog import connect_to_catalog
with connect_to_catalog("catalog.yaml") as catalog:
    conn = catalog.get_connection()
    result = conn.execute("SELECT * FROM my_view").fetchall()

Using for persistent state management:

catalog = connect_to_catalog("catalog.yaml")
conn1 = catalog.get_connection()
# ... later ...
conn2 = catalog.get_connection()  # Returns the same connection
catalog.close()

`connect_to_catalog_cm(config_path, database_path=None, read_only=False, force_rebuild=False)` ¶

Context manager that yields an active, state-restored DuckDB connection.

This provides the same state restoration and incremental update benefits as `CatalogConnection`, but yields the raw DuckDB connection object for convenience in simple scripts.

Usage:

from duckalog import connect_to_catalog_cm
with connect_to_catalog_cm("catalog.yaml") as conn:
    data = conn.execute("SELECT * FROM users").fetchall()
    print(f"Found {len(data)} records")
# Connection automatically closed here

Parameters:

Name	Type	Description	Default
`config_path`	`str`	Path to the YAML/JSON configuration file.	required
`database_path`	`str \| None`	Optional database path override.	`None`
`read_only`	`bool`	Open the connection in read-only mode for safety.	`False`
`force_rebuild`	`bool`	If True, all views will be recreated.	`False`

Yields:

Type	Description
`Generator[DuckDBPyConnection]`	An active DuckDB connection with catalog state restored.

`create_config_template(format='yaml', output_path=None, database_name='analytics_catalog.duckdb', project_name='my_analytics_project')` ¶

Generate a basic, valid Duckalog configuration template.

This function creates a configuration template with sensible defaults and educational example content that demonstrates key Duckalog features.

Parameters:

Name	Type	Description	Default
`format`	`ConfigFormat`	Output format for the configuration ('yaml' or 'json'). Defaults to 'yaml'.	`'yaml'`
`output_path`	`str \| None`	Optional path to write the configuration file. If provided, the template is written to this path and the content is also returned as a string.	`None`
`database_name`	`str`	Name for the DuckDB database file.	`'analytics_catalog.duckdb'`
`project_name`	`str`	Name used in comments to personalize the template.	`'my_analytics_project'`

Returns:

Type	Description
`str`	The generated configuration as a string.

Raises:

Type	Description
`ValueError`	If format is not 'yaml' or 'json'.
`ConfigError`	If the generated template fails validation.
`OSError`	If writing to output_path fails.

Example

Generate a YAML template:

template = create_config_template(format='yaml')
print(template)

Generate and save a JSON template:

template = create_config_template(
    format='json',
    output_path='my_config.json'
)

`generate_all_views_sql(config, include_secrets=False)` ¶

Generate SQL for all views in a configuration.

The output includes a descriptive header with the config version followed by a CREATE OR REPLACE VIEW statement for each view in the order they appear in the configuration.

Parameters:

Name	Type	Description	Default
`config`	`Config`	The validated `Config` instance to render.	required
`include_secrets`	`bool`	Whether to include CREATE SECRET statements for secrets.	`False`

Returns:

Type	Description
`str`	A multi-statement SQL script suitable for use as a catalog definition.
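
The output shape described above (a header line followed by one `CREATE OR REPLACE VIEW` statement per view, in configuration order) can be sketched as follows. The header wording, the `(name, sql)` tuple shape, and the function names are assumptions made for illustration; the library's actual output format may differ.

```python
from typing import List, Tuple


def quote_ident_sketch(value: str) -> str:
    """Wrap an identifier in double quotes, doubling embedded quotes."""
    return '"' + value.replace('"', '""') + '"'


def all_views_sql_sketch(version: int, views: List[Tuple[str, str]]) -> str:
    """Render a header plus one statement per (name, sql) pair, in order."""
    blocks = [f"-- Catalog SQL generated from config version {version}"]
    for name, body in views:
        blocks.append(
            f"CREATE OR REPLACE VIEW {quote_ident_sketch(name)} AS\n{body};"
        )
    return "\n\n".join(blocks)


script = all_views_sql_sketch(
    1,
    [
        ("users", "SELECT * FROM raw_users"),
        ("orders", "SELECT * FROM raw_orders"),
    ],
)
```

Preserving configuration order matters when a later view selects from an earlier one.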

`generate_secret_sql(secret)` ¶

Generate CREATE SECRET statement for a DuckDB secret.

Parameters:

Name	Type	Description	Default
`secret`	`SecretConfig`	Secret configuration object.	required

Returns:

Type	Description
`str`	SQL CREATE SECRET statement.
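
As a rough illustration of what a generated statement might look like, the sketch below builds a DuckDB `CREATE SECRET` statement from a plain dictionary. It is not the library's generator: `secret_sql_sketch` is a hypothetical function, only a handful of the documented fields are handled, and the mapping from field names to DuckDB parameter names is an assumption.

```python
from typing import Any, Mapping


def quote_literal_sketch(value: str) -> str:
    """Wrap a string in single quotes, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def secret_sql_sketch(secret: Mapping[str, Any]) -> str:
    secret_type = secret["type"]
    # The name falls back to the type, as the SecretConfig table describes.
    name = secret.get("name") or secret_type
    keyword = "CREATE PERSISTENT SECRET" if secret.get("persistent") else "CREATE SECRET"
    params = [f"TYPE {secret_type.upper()}"]
    # Assumed mapping: each field becomes an upper-cased parameter name.
    for field in ("key_id", "secret", "region", "endpoint", "scope"):
        value = secret.get(field)
        if value is not None:
            params.append(f"{field.upper()} {quote_literal_sketch(str(value))}")
    return f"{keyword} {name} ({', '.join(params)});"


statement = secret_sql_sketch(
    {
        "type": "s3",
        "name": "my_s3",
        "key_id": "EXAMPLE_KEY",
        "secret": "it's-a-placeholder",
        "region": "eu-west-1",
    }
)
```

Every value passes through literal quoting, so a credential containing a single quote cannot terminate the string early.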

`generate_sql(config_path)` ¶

Generate a full SQL script from a config file.

This is a convenience wrapper around `load_config` and `generate_all_views_sql` that does not connect to DuckDB.

Parameters:

Name	Type	Description	Default
`config_path`	`str`	Path to the YAML/JSON configuration file.	required

Returns:

Type	Description
`str`	A multi-statement SQL script containing `CREATE OR REPLACE VIEW` statements for all configured views.

Raises:

Type	Description
`ConfigError`	If the configuration file is invalid.

Example

from duckalog import generate_sql
sql = generate_sql("catalog.yaml")
print("CREATE OR REPLACE VIEW" in sql)  # True when the config defines at least one view

`generate_view_sql(view)` ¶

Generate a CREATE OR REPLACE VIEW statement for a single view.

Parameters:

Name	Type	Description	Default
`view`	`ViewConfig`	The `ViewConfig` to generate SQL for.	required

Returns:

Type	Description
`str`	A single SQL statement that creates or replaces the view.

`load_config(path, load_sql_files=True, sql_file_loader=None, resolve_paths=True, filesystem=None, load_dotenv=True)` ¶

Load, interpolate, and validate a Duckalog configuration file.

`quote_ident(value)` ¶

Quote a SQL identifier using double quotes.

This helper wraps a string in double quotes and escapes any embedded double quotes according to SQL rules.

Parameters:

Name	Type	Description	Default
`value`	`str`	Identifier to quote (for example, a view or column name).	required

Returns:

Type	Description
`str`	The identifier wrapped in double quotes.

Example

quote_ident("events")  # returns '"events"'

`quote_literal(value)` ¶

Quote a SQL string literal using single quotes.

This helper wraps a string in single quotes and escapes any embedded single quotes according to SQL rules.

Parameters:

Name	Type	Description	Default
`value`	`str`	String literal to quote (for example, a file path, secret, or connection string).	required

Returns:

Type	Description
`str`	The string wrapped in single quotes with proper escaping.

Example

quote_literal("path/to/file.parquet")  # returns "'path/to/file.parquet'"
quote_literal("user's data")  # returns "'user''s data'"
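
Both quoting helpers follow the same doubling rule: wrap the value in the quote character and double any embedded occurrence of it. The sketch below shows that rule with two stand-in functions (the `_sketch` names are hypothetical and are not the library's code).

```python
def quote_ident_sketch(value: str) -> str:
    """Identifiers: wrap in double quotes, double embedded double quotes."""
    return '"' + value.replace('"', '""') + '"'


def quote_literal_sketch(value: str) -> str:
    """String literals: wrap in single quotes, double embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


plain_ident = quote_ident_sketch("events")
odd_ident = quote_ident_sketch('my "odd" view')
plain_literal = quote_literal_sketch("path/to/file.parquet")
apostrophe_literal = quote_literal_sketch("user's data")
```

Doubling the embedded quote keeps the value inside the quoted region, which is what prevents a stray quote from ending the identifier or literal early.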

`render_options(options)` ¶

Render a mapping of options into scan-function arguments.

The resulting string is suitable for appending to a *_scan function call. Keys are sorted alphabetically to keep output deterministic.

Parameters:

Name	Type	Description	Default
`options`	`dict[str, Any]`	Mapping of option name to value (str, bool, int, or float).	required

Returns:

Type	Description
`str`	A string that starts with `,` when options are present (for example, `", hive_partitioning=TRUE"`) or an empty string when no options are provided.

Raises:

Type	Description
`TypeError`	If a value has a type that cannot be rendered safely.
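
The rendering rules described for `render_options` (sorted keys, a leading comma, and a `TypeError` for unsupported values) can be sketched as below. `render_options_sketch` is a hypothetical re-implementation for illustration; the exact `key=value` formatting of strings and numbers is an assumption beyond the documented `hive_partitioning=TRUE` example.

```python
from typing import Any, List, Mapping


def render_options_sketch(options: Mapping[str, Any]) -> str:
    parts: List[str] = []
    for key in sorted(options):  # alphabetical order keeps output deterministic
        value = options[key]
        if isinstance(value, bool):
            # bool must be checked before int because bool subclasses int.
            rendered = "TRUE" if value else "FALSE"
        elif isinstance(value, (int, float)):
            rendered = str(value)
        elif isinstance(value, str):
            rendered = "'" + value.replace("'", "''") + "'"
        else:
            raise TypeError(
                f"Unsupported option type for {key!r}: {type(value).__name__}"
            )
        parts.append(f", {key}={rendered}")
    return "".join(parts)


single = render_options_sketch({"hive_partitioning": True})
ordered = render_options_sketch({"b": 1, "a": "x"})
empty = render_options_sketch({})
```

Because the result is either empty or starts with a comma, it can be appended directly after the first argument of a scan function call.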

`validate_config(config_path)` ¶

Validate a configuration file without touching DuckDB.

Parameters:

Name	Type	Description	Default
`config_path`	`str`	Path to the YAML/JSON configuration file.	required

Raises:

Type	Description
`ConfigError`	If the configuration file is missing, malformed, or does not satisfy the schema and interpolation rules.

Example

from duckalog import validate_config
validate_config("catalog.yaml")  # raises on invalid config

`validate_generated_config(content, format='yaml')` ¶

Validate that generated configuration content can be loaded successfully.

Parameters:

Name	Type	Description	Default
`content`	`str`	Configuration content as string.	required
`format`	`ConfigFormat`	Format of the content ('yaml' or 'json').	`'yaml'`

Raises:

Type	Description
`ConfigError`	If the configuration cannot be loaded or is invalid.

New Architecture Patterns¶

Configuration Loading API¶

The new duckalog.config.api module provides enhanced configuration loading with dependency injection support:

Key Features:

Dependency Injection: Custom resolvers and processors
Request-Scoped Caching: Performance optimization for batch operations
Enhanced Error Handling: Better error context and recovery
Modular Design: Clean separation of concerns

Main Functions:

load_config(): Enhanced configuration loader with DI support
_load_config_from_local_file(): Core local file loading logic

Import Resolution Interfaces¶

The duckalog.config.resolution package provides extensible import resolution:

Key Components:

ImportResolver: Interface for custom import resolution logic
ImportContext: Tracks state during configuration loading
RequestContext: Aggregates caches for single config load operations
request_cache_scope(): Context manager for performance optimization

Environment Processing¶

The duckalog.config.resolution.env module provides environment variable processing:

Key Components:

EnvProcessor: Interface for custom environment processing
DefaultEnvProcessor: Standard environment variable resolution
EnvCache: Caching for environment variable lookups

Configuration Models¶

Core Models¶

These models define the structure of Duckalog configuration files:

Config: Root configuration container
DuckDBConfig: Database settings and pragmas
AttachmentsConfig: External database attachments
ViewConfig: Individual view definitions
SecretConfig: Credential and secret management
IcebergCatalogConfig: Iceberg catalog connections

Advanced Models¶

Extended models for complex configurations:

SemanticModelConfig: Semantic layer definitions
SQLFileReference: External SQL file references
SemanticDimensionConfig: Semantic dimension definitions
SemanticMeasureConfig: Semantic measure definitions

Import Patterns¶

Recommended Modern Imports¶

# New modular imports (recommended)
from duckalog.config.api import load_config
from duckalog.config.resolution.imports import request_cache_scope
from duckalog.config.resolution.base import ImportResolver, ImportContext
from duckalog.config.resolution.env import EnvProcessor

# Convenience functions (unchanged)
from duckalog import generate_sql, validate_config, connect_to_catalog

# Configuration models
from duckalog.config import Config, DuckDBConfig, ViewConfig

Backward Compatibility¶

All existing import patterns continue to work without modification:

# Legacy imports (still supported)
from duckalog import load_config  # Re-exports from new location
from duckalog.config import load_config  # Re-exports from new location

Usage Patterns¶

Basic Usage (Unchanged)¶

from duckalog import load_config, build_catalog

# Load configuration
config = load_config("catalog.yaml")

# Build catalog
build_catalog("catalog.yaml")

Advanced Usage with Dependency Injection¶

from duckalog.config.api import load_config
from duckalog.config.resolution.imports import request_cache_scope
import fsspec

# Custom filesystem
filesystem = fsspec.filesystem("s3", key="...", secret="...")

# Load with custom dependencies
config = load_config(
    "s3://bucket/config.yaml",
    filesystem=filesystem,
    load_dotenv=False
)

# Batch loading with caching
config_files = ["catalog_a.yaml", "catalog_b.yaml"]  # hypothetical paths
with request_cache_scope() as context:
    configs = [load_config(f) for f in config_files]

Custom Implementation¶

from duckalog.config.resolution.base import ImportResolver, ImportContext

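class CustomResolver(ImportResolver):
    def resolve(self, config_data: dict, context: ImportContext) -> dict:
        # Custom import resolution logic goes here. This placeholder
        # returns a shallow copy of the input unchanged.
        resolved_config = dict(config_data)
        return resolved_config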

Utility Functions¶

Core Functions¶

load_config: Enhanced configuration loading with DI support
build_catalog: Build DuckDB catalog from configuration
validate_config: Validate configuration without building
generate_sql: Generate SQL statements from configuration

High-Level Convenience Functions¶

connect_to_catalog: Return a CatalogConnection that lazily builds or updates the catalog and connects to it
connect_to_catalog_cm: Context manager that yields a state-restored DuckDB connection

Command Line Interface¶

Available Commands¶

duckalog run: Build a DuckDB catalog from a configuration file
duckalog validate: Validate a configuration file
duckalog generate-sql: Generate SQL statements from a configuration file
duckalog init: Initialize new configuration from templates

Enhanced CLI Features¶

Shared Filesystem Options: Centralized remote access configuration
Context Management: Automatic filesystem object lifecycle management
Error Reporting: Improved error messages and context

Error Handling¶

Core Exceptions¶

ConfigError: Configuration validation or loading errors
EngineError: Catalog building or execution errors

Import Resolution Exceptions¶

ImportError: General import resolution failures
ImportFileNotFoundError: Imported file not found
CircularImportError: Circular dependency detected
ImportValidationError: Import content validation failures
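
Circular import detection is commonly implemented by tracking the chain of files currently being resolved. The sketch below shows that approach on an in-memory import graph; it is an assumption about the general technique, not a description of Duckalog's resolver, and `CircularImportSketchError` and `resolve_imports_sketch` are hypothetical names.

```python
from typing import Dict, List, Tuple


class CircularImportSketchError(Exception):
    """Raised by this sketch when a file appears twice in the active chain."""


def resolve_imports_sketch(
    name: str,
    graph: Dict[str, List[str]],
    chain: Tuple[str, ...] = (),
) -> List[str]:
    """Return files in load order (dependencies first), detecting cycles."""
    if name in chain:
        cycle = " -> ".join(chain + (name,))
        raise CircularImportSketchError(cycle)
    order: List[str] = []
    for child in graph.get(name, []):
        for item in resolve_imports_sketch(child, graph, chain + (name,)):
            if item not in order:  # a shared import is listed only once
                order.append(item)
    order.append(name)
    return order


acyclic = {
    "main.yaml": ["base.yaml", "views.yaml"],
    "views.yaml": ["base.yaml"],
}
load_order = resolve_imports_sketch("main.yaml", acyclic)

cyclic = {"a.yaml": ["b.yaml"], "b.yaml": ["a.yaml"]}
try:
    resolve_imports_sketch("a.yaml", cyclic)
    cycle_message = ""
except CircularImportSketchError as exc:
    cycle_message = str(exc)
```

Note that a file imported from two places (a diamond) is not a cycle; only a file that reappears in its own active chain is.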

Path Security Exceptions¶

PathResolutionError: Path security validation failures
SQLFileError: SQL file processing errors
SQLFileNotFoundError: SQL file not found
SQLFilePermissionError: Permission denied for SQL file

Performance Features¶

Caching Architecture¶

Request-Scoped Caching: Shared resolution across multiple loads
Import Resolution Caching: Avoid re-processing imports
Environment Variable Caching: Reuse resolved values
Path Resolution Caching: Cache normalized paths

Memory Management¶

Automatic Cache Cleanup: Context-managed cache clearing
Import Chain Tracking: Efficient circular dependency detection
Lightweight Context Objects: Minimal overhead for tracking

Security Features¶

Enhanced Path Security¶

Rooted Resolution: All paths relative to config location
Traversal Protection: Prevent directory traversal attacks
Cross-Platform Support: Consistent behavior across systems
Remote URI Support: Secure handling of remote configurations
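
Rooted resolution with traversal protection can be sketched using `pathlib`: resolve the candidate against the config file's directory, then reject anything that ends up outside that directory. This is a minimal sketch under the assumption that the config directory is the only allowed root; `resolve_within_root` and `PathOutsideRootSketchError` are hypothetical names, and the paths are illustrative.

```python
from pathlib import Path


class PathOutsideRootSketchError(ValueError):
    """Raised by this sketch when a path escapes the config directory."""


def resolve_within_root(config_path: str, candidate: str) -> Path:
    root = Path(config_path).resolve().parent
    # resolve() normalizes '..' segments (and symlinks where they exist).
    resolved = (root / candidate).resolve()
    try:
        resolved.relative_to(root)
    except ValueError:
        raise PathOutsideRootSketchError(
            f"{candidate!r} resolves outside {root}"
        ) from None
    return resolved


inside = resolve_within_root("/srv/app/catalog.yaml", "sql/users.sql")

try:
    resolve_within_root("/srv/app/catalog.yaml", "../../etc/passwd")
    traversal_blocked = False
except PathOutsideRootSketchError:
    traversal_blocked = True
```

Checking the resolved path rather than the raw string is what catches inputs such as `sql/../../secret`, which look harmless until normalized.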

Secret Management¶

Environment Variable Integration: Secure credential handling
Automatic Redaction: Sensitive data protection in logs
DuckDB Integration: Direct mapping to CREATE SECRET statements

Advanced Usage¶

For detailed examples and advanced usage patterns, see:¶

User Guide: Comprehensive usage documentation
Examples: Real-world configuration examples
Migration Guide: Migrate from legacy patterns
Architecture Documentation: Deep dive into system design

Extensibility¶

Custom Import Resolvers: Implement custom import loading logic
Custom Environment Processors: Add specialized environment handling
Custom Filesystem Integration: Support new protocols and backends
Plugin Architecture: Extensible design for new features
