API Reference¶
This section provides comprehensive API documentation for the Duckalog library, including both the traditional API and new modular architecture patterns with dependency injection support.
Core API¶
The main Duckalog API provides convenience functions for common use cases:
duckalog
¶
Duckalog public API.
AttachmentsConfig
¶
Bases: BaseModel
Collection of attachment configurations.
Attributes:
| Name | Type | Description |
|---|---|---|
| `duckdb` | `list[DuckDBAttachment]` | DuckDB attachment entries. |
| `sqlite` | `list[SQLiteAttachment]` | SQLite attachment entries. |
| `postgres` | `list[PostgresAttachment]` | Postgres attachment entries. |
| `duckalog` | `list[DuckalogAttachment]` | Duckalog config attachment entries. |
CatalogConnection
¶
Manages DuckDB connections with session state restoration and lazy initialization.
This class ensures that every connection obtained through it has the correct catalog state (pragmas, settings, extensions, attachments, and secrets) applied, regardless of whether it's a new or existing connection.
Attributes:
| Name | Type | Description |
|---|---|---|
| `config_path` | | Path to the Duckalog configuration file. |
| `database_path` | | Optional override for the DuckDB database path. |
| `read_only` | | Whether to open the connection in read-only mode. |
| `force_rebuild` | | If True, all views will be recreated even if they exist. |
| `config` | `Optional[Config]` | The loaded and validated configuration. |
| `conn` | `Optional[DuckDBPyConnection]` | The active DuckDB connection, or None if not initialized. |
__enter__()
¶
Context manager support: returns the CatalogConnection instance.
__exit__(exc_type, exc_val, exc_tb)
¶
Context manager support: ensures connection cleanup.
__init__(config_path, database_path=None, read_only=False, force_rebuild=False, filesystem=None, load_dotenv=True)
¶
Initialize the catalog connection manager.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `config_path` | `str` | Path to the Duckalog configuration file. | required |
| `database_path` | `Optional[str]` | Optional override for the DuckDB database path. | `None` |
| `read_only` | `bool` | Whether to open the connection in read-only mode. | `False` |
| `force_rebuild` | `bool` | If True, all views will be recreated even if they exist. | `False` |
| `filesystem` | `Optional[Any]` | Optional fsspec filesystem object for remote file access. | `None` |
| `load_dotenv` | `bool` | If True, automatically load and process .env files. | `True` |
close()
¶
Clean up the DuckDB connection and resources.
get_connection()
¶
Get the DuckDB connection, initializing it if necessary.
This method establishes the connection lazily on the first call, restores the session state, and performs incremental updates. Subsequent calls return the same connection instance.
Returns:
| Type | Description |
|---|---|
| `DuckDBPyConnection` | An active DuckDB connection with catalog state restored. |
Raises:
| Type | Description |
|---|---|
| `ConfigError` | If loading the configuration fails. |
| `FileNotFoundError` | If the database path is invalid or missing (for existing catalogs). |
| `EngineError` | If connecting to DuckDB or restoring state fails. |
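The lazy-initialization and cleanup behavior described for `CatalogConnection` can be illustrated with a small stand-in. The sketch below is not Duckalog code: `LazyConnection` is a hypothetical class, and it uses the standard-library `sqlite3` module in place of DuckDB so that it runs without extra dependencies.

```python
import sqlite3
from typing import Optional


class LazyConnection:
    """Hypothetical stand-in showing lazy initialization plus cleanup."""

    def __init__(self, database_path: str = ":memory:") -> None:
        self.database_path = database_path
        self.conn: Optional[sqlite3.Connection] = None  # nothing opened yet

    def get_connection(self) -> sqlite3.Connection:
        # Open the connection on first use only; later calls reuse it.
        if self.conn is None:
            self.conn = sqlite3.connect(self.database_path)
            # Stand-in for restoring session state (pragmas, attachments, ...).
            self.conn.execute("PRAGMA foreign_keys = ON")
        return self.conn

    def close(self) -> None:
        if self.conn is not None:
            self.conn.close()
            self.conn = None

    def __enter__(self) -> "LazyConnection":
        return self

    def __exit__(self, exc_type, exc_val, exc_tb) -> None:
        self.close()


with LazyConnection() as catalog:
    first = catalog.get_connection()
    second = catalog.get_connection()
    same_object = first is second  # both calls return one connection
```

Leaving the `with` block calls `close()`, so the stand-in's `conn` attribute is reset to `None`, mirroring the context-manager cleanup described above.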
Config
¶
Bases: BaseModel
Top-level Duckalog configuration.
Attributes:
| Name | Type | Description |
|---|---|---|
| `version` | `int` | Positive integer describing the config schema version. |
| `duckdb` | `DuckDBConfig` | DuckDB session and connection settings. |
| `views` | `list[ViewConfig]` | List of view definitions to create in the catalog. |
| `attachments` | `AttachmentsConfig` | Optional attachments to external databases. |
| `iceberg_catalogs` | `list[IcebergCatalogConfig]` | Optional Iceberg catalog definitions. |
| `semantic_models` | `list[SemanticModelConfig]` | Optional semantic model definitions for business metadata. |
| `imports` | `Union[list[Union[str, ImportEntry]], SelectiveImports]` | Optional list of additional config files to import and merge. Can be a simple list of paths (backward compatible) or a SelectiveImports object for advanced options like section-specific imports, override behavior, and glob patterns. |
| `env_files` | `list[str]` | Optional list of custom .env file patterns to load. Supports patterns like ['.env', '.env.local', '.env.production']. Files are loaded in order with later files overriding earlier ones. Defaults to ['.env'] for backward compatibility. |
| `loader_settings` | `LoaderSettings` | Optional settings for the configuration loader. |
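The `env_files` ordering rule (later files override earlier ones) can be pictured with a minimal sketch. The helper names `parse_env_text` and `merge_env_files` are hypothetical and the parser handles only plain `KEY=VALUE` lines; Duckalog's actual loader may behave differently.

```python
def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def merge_env_files(contents_in_order: list[str]) -> dict[str, str]:
    """Merge parsed files in order so that later files win on conflicts."""
    merged: dict[str, str] = {}
    for text in contents_in_order:
        merged.update(parse_env_text(text))
    return merged


base_env = "DB_HOST=localhost\nDB_PORT=5432\n"   # e.g. contents of .env
local_env = "# local override\nDB_HOST=db.internal\n"  # e.g. .env.local

merged = merge_env_files([base_env, local_env])
print(merged)
```

Here `DB_HOST` takes the value from the second file, while `DB_PORT` survives from the first.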
ConfigError
¶
Bases: DuckalogError
Configuration-related errors.
This exception is raised when a catalog configuration cannot be read, parsed, interpolated, or validated according to the Duckalog schema.
Typical error conditions include:
- The config file does not exist or cannot be read.
- The file is not valid YAML/JSON.
- Required fields are missing or invalid.
- An environment variable placeholder cannot be resolved.
DuckDBAttachment
¶
Bases: BaseModel
Configuration for attaching another DuckDB database.
Attributes:
| Name | Type | Description |
|---|---|---|
| `alias` | `str` | Alias under which the database will be attached. |
| `path` | `str` | Filesystem path to the DuckDB database file. |
| `read_only` | `bool` | Whether the attachment should be opened in read-only mode. Defaults to |
DuckDBConfig
¶
Bases: BaseModel
DuckDB connection and session settings.
Attributes:
| Name | Type | Description |
|---|---|---|
| `database` | `str` | Path to the DuckDB database file. Defaults to |
| `install_extensions` | `list[str]` | Names of extensions to install before use. |
| `load_extensions` | `list[str]` | Names of extensions to load in the session. |
| `pragmas` | `list[str]` | SQL statements (typically |
| `settings` | `Optional[Union[str, list[str]]]` | DuckDB SET statements executed after pragmas. Can be a single string or list of strings. |
| `secrets` | `list[SecretConfig]` | List of secret definitions for external services and databases. |
EngineError
¶
Bases: DuckalogError
Engine-level error raised during catalog builds.
This exception wraps lower-level DuckDB errors, such as failures to connect to the database, attach external systems, or execute generated SQL statements.
IcebergCatalogConfig
¶
Bases: BaseModel
Configuration for an Iceberg catalog.
Attributes:
| Name | Type | Description |
|---|---|---|
| `name` | `str` | Catalog name referenced by Iceberg views. |
| `catalog_type` | `str` | Backend type (for example, |
| `uri` | `Optional[str]` | Optional URI used by certain catalog types. |
| `warehouse` | `Optional[str]` | Optional warehouse location for catalog data. |
| `options` | `dict[str, Any]` | Additional catalog-specific options. |
PostgresAttachment
¶
Bases: BaseModel
Configuration for attaching a Postgres database.
Attributes:
| Name | Type | Description |
|---|---|---|
| `alias` | `str` | Alias used inside DuckDB to reference the Postgres database. |
| `host` | `str` | Hostname or IP address of the Postgres server. |
| `port` | `int` | TCP port of the Postgres server. |
| `database` | `str` | Database name to connect to. |
| `user` | `str` | Username for authentication. |
| `password` | `str` | Password for authentication. |
| `sslmode` | `Optional[str]` | Optional SSL mode (for example, |
| `options` | `dict[str, Any]` | Extra key/value options passed to the attachment clause. |
SQLFileEncodingError
¶
SQLFileError
¶
Bases: ConfigError
Base exception for SQL file-related errors.
This exception is raised when SQL file operations fail, such as when a referenced SQL file cannot be found, read, or processed.
SQLFileLoader
¶
Loads SQL content from external files and processes templates.
__init__()
¶
Initialize the SQL file loader.
load_sql_file(file_path, config_file_path, variables=None, as_template=False, filesystem=None)
¶
Load SQL content from a file and optionally process as a template.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `file_path` | `str` | Path to the SQL file (can be relative or absolute). | required |
| `config_file_path` | `str` | Path to the config file for resolving relative paths. | required |
| `variables` | `Optional[dict[str, Any]]` | Dictionary of variables for template substitution. | `None` |
| `as_template` | `bool` | Whether to process the file content as a template. | `False` |
| `filesystem` | `Optional[Any]` | Optional filesystem object for file I/O operations. | `None` |
Returns:
| Type | Description |
|---|---|
| `str` | The loaded SQL content (processed if template, raw otherwise). |
Raises:
| Type | Description |
|---|---|
| `SQLFileError` | If the file cannot be loaded or processed. |
SQLFileNotFoundError
¶
SQLFilePermissionError
¶
SQLFileReference
¶
Bases: BaseModel
Reference to SQL content in an external file.
Attributes:
| Name | Type | Description |
|---|---|---|
| `path` | `str` | Path to the SQL file (relative or absolute). |
| `variables` | `Optional[dict[str, Any]]` | Dictionary of variables for template substitution. |
| `as_template` | `bool` | Whether to process the file content as a template. |
SQLFileSizeError
¶
SQLGroup
¶
Unified access to all SQL-related functionality.
SQLTemplateError
¶
SQLiteAttachment
¶
Bases: BaseModel
Configuration for attaching a SQLite database.
Attributes:
| Name | Type | Description |
|---|---|---|
| `alias` | `str` | Alias under which the SQLite database will be attached. |
| `path` | `str` | Filesystem path to the SQLite |
SecretConfig
¶
Bases: BaseModel
Configuration for a DuckDB secret.
Attributes:
| Name | Type | Description |
|---|---|---|
| `type` | `SecretType` | Secret type (s3, azure, gcs, http, postgres, mysql). |
| `name` | `Optional[str]` | Optional name for the secret (defaults to type if not provided). |
| `provider` | `SecretProvider` | Secret provider (config or credential_chain). |
| `persistent` | `bool` | Whether to create a persistent secret. Defaults to False. |
| `scope` | `Optional[str]` | Optional scope prefix for the secret. |
| `key_id` | `Optional[str]` | Access key ID or username for authentication. |
| `secret` | `Optional[str]` | Secret key or password for authentication. |
| `region` | `Optional[str]` | Geographic region for cloud services. |
| `endpoint` | `Optional[str]` | Custom endpoint URL for cloud services. |
| `connection_string` | `Optional[str]` | Full connection string for databases. |
| `tenant_id` | `Optional[str]` | Azure tenant ID for authentication. |
| `account_name` | `Optional[str]` | Azure storage account name. |
| `client_id` | `Optional[str]` | Azure client ID for authentication. |
| `client_secret` | `Optional[str]` | Azure client secret for authentication. |
| `service_account_key` | `Optional[str]` | GCS service account key. |
| `json_key` | `Optional[str]` | GCS JSON key. |
| `bearer_token` | `Optional[str]` | HTTP bearer token for authentication. |
| `header` | `Optional[str]` | HTTP header for authentication. |
| `database` | `Optional[str]` | Database name for database secrets. |
| `host` | `Optional[str]` | Database host for database secrets. |
| `port` | `Optional[int]` | Database port for database secrets. |
| `user` | `Optional[str]` | Database username (alternative to key_id for database types). |
| `password` | `Optional[str]` | Database password (alternative to secret for database types). |
| `options` | `dict[str, Any]` | Additional key-value options for the secret. |
SemanticDefaultsConfig
¶
Bases: BaseModel
Default configuration for a semantic model.
Provides default settings for query builders and dashboards, such as the primary time dimension and default measures.
Attributes:
| Name | Type | Description |
|---|---|---|
| `time_dimension` | `Optional[str]` | Default time dimension name. |
| `primary_measure` | `Optional[str]` | Default primary measure name. |
| `default_filters` | `list[dict[str, Any]]` | Optional list of default filters. |
SemanticDimensionConfig
¶
Bases: BaseModel
Definition of a semantic dimension.
A dimension represents a business attribute that maps to an expression over the base view of a semantic model.
Attributes:
| Name | Type | Description |
|---|---|---|
| `name` | `str` | Unique dimension name within the semantic model. |
| `expression` | `str` | SQL expression referencing columns from the base view. |
| `label` | `Optional[str]` | Human-readable display name. |
| `description` | `Optional[str]` | Optional detailed description. |
| `type` | `Optional[str]` | Optional data type hint (time, number, string, boolean, date). |
| `time_grains` | `list[str]` | Optional list of time grains for time dimensions. |
SemanticJoinConfig
¶
Bases: BaseModel
Definition of a semantic join.
A join defines a relationship to another view for enriching the semantic model with additional data, typically dimension tables.
Attributes:
| Name | Type | Description |
|---|---|---|
| `to_view` | `str` | Name of an existing view in the views section to join to. |
| `type` | `str` | Join type (inner, left, right, full). |
| `on_condition` | `str` | SQL join condition expression. |
SemanticMeasureConfig
¶
Bases: BaseModel
Definition of a semantic measure.
A measure represents a business metric that typically involves aggregation or calculation over the base view of a semantic model.
Attributes:
| Name | Type | Description |
|---|---|---|
| `name` | `str` | Unique measure name within the semantic model. |
| `expression` | `str` | SQL expression (often aggregated) over the base view. |
| `label` | `Optional[str]` | Human-readable display name. |
| `description` | `Optional[str]` | Optional detailed description. |
| `type` | `Optional[str]` | Optional data type hint. |
SemanticModelConfig
¶
Bases: BaseModel
Definition of a semantic model.
A semantic model provides business-friendly metadata on top of an existing Duckalog view, defining dimensions and measures for analytics and BI use cases.
Attributes:
| Name | Type | Description |
|---|---|---|
| `name` | `str` | Unique semantic model name within the config. |
| `base_view` | `str` | Name of an existing view in the views section. |
| `dimensions` | `list[SemanticDimensionConfig]` | Optional list of dimension definitions. |
| `measures` | `list[SemanticMeasureConfig]` | Optional list of measure definitions. |
| `joins` | `list[SemanticJoinConfig]` | Optional list of join definitions to other views. |
| `defaults` | `Optional[SemanticDefaultsConfig]` | Optional default configuration for query builders. |
| `label` | `Optional[str]` | Human-readable display name. |
| `description` | `Optional[str]` | Optional detailed description. |
| `tags` | `list[str]` | Optional list of classification tags. |
ViewConfig
¶
Bases: BaseModel
Definition of a single catalog view.
A view can be defined in several ways:
1. Inline SQL: Using the sql field with raw SQL text
2. SQL File: Using sql_file to reference external SQL files
3. SQL Template: Using sql_template for parameterized SQL files
4. Data Source: Using source + required fields for direct data access
5. Source + SQL: Using source for data access plus sql for transformations
For data sources, the required fields depend on the source type:
- Parquet/Delta: uri field is required
- Iceberg: Either uri OR both catalog and table
- DuckDB/SQLite/Postgres: Both database and table are required
When using SQL with a data source, the SQL will be applied as a transformation over the data from the specified source.
Additional metadata fields such as description and tags do not affect
SQL generation but are preserved for documentation and tooling.
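The per-source field requirements listed above can be expressed as a small check. This is a sketch under stated assumptions, not Duckalog's validator: the function name `missing_source_fields` is hypothetical, and the lowercase source-type strings are assumed spellings of the source types named in the list.

```python
from typing import Optional


def missing_source_fields(
    source: str,
    uri: Optional[str] = None,
    database: Optional[str] = None,
    table: Optional[str] = None,
    catalog: Optional[str] = None,
) -> list[str]:
    """Return the names of required fields that are missing for a source type."""
    if source in ("parquet", "delta"):
        # File-based sources need a uri.
        return [] if uri else ["uri"]
    if source == "iceberg":
        # Either a uri, or both catalog and table.
        if uri or (catalog and table):
            return []
        return ["uri or catalog+table"]
    if source in ("duckdb", "sqlite", "postgres"):
        # Attached-database sources need both database and table.
        required = (("database", database), ("table", table))
        return [name for name, value in required if not value]
    raise ValueError(f"Unknown source type: {source!r}")


print(missing_source_fields("parquet"))
print(missing_source_fields("iceberg", catalog="lake", table="events"))
print(missing_source_fields("postgres", database="pg"))
```

An empty list means the combination satisfies the rules; a non-empty list names what still has to be supplied.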
Attributes:
| Name | Type | Description |
|---|---|---|
| `name` | `str` | Unique view name within the config. |
| `schema` | `str` | Optional schema name for organizing views in DuckDB schemas. |
| `sql` | `Optional[str]` | Raw SQL text defining the view body. |
| `sql_file` | `Optional[SQLFileReference]` | Direct reference to a SQL file. |
| `sql_template` | `Optional[SQLFileReference]` | Reference to a SQL template file with variable substitution. |
| `source` | `Optional[EnvSource]` | Source type (e.g. |
| `uri` | `Optional[str]` | URI for file- or table-based sources (Parquet/Delta/Iceberg). |
| `database` | `Optional[str]` | Attachment alias for attached-database sources. |
| `table` | `Optional[str]` | Table name (optionally schema-qualified) for attached sources. |
| `catalog` | `Optional[str]` | Iceberg catalog name for catalog-based Iceberg views. |
| `options` | `dict[str, Any]` | Source-specific options passed to scan functions. |
| `description` | `Optional[str]` | Optional human-readable description of the view. |
| `tags` | `list[str]` | Optional list of tags for classification. |
connect_to_catalog(config_path, database_path=None, read_only=False, force_rebuild=False)
¶
Create a CatalogConnection instance that manages DuckDB connections with state restoration.
This is the primary entry point for working with Duckalog catalogs in Python.
It returns a `CatalogConnection` instance, which lazily establishes a DuckDB connection, automatically restores session state (pragmas, attachments, etc.), and performs incremental view updates.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `config_path` | `str` | Path to the YAML/JSON configuration file. | required |
| `database_path` | `str \| None` | Optional database path override. | `None` |
| `read_only` | `bool` | Open the connection in read-only mode for safety. | `False` |
| `force_rebuild` | `bool` | If True, all views will be recreated even if they exist. | `False` |
Returns:
| Type | Description |
|---|---|
| `CatalogConnection` | A `CatalogConnection` instance. |
Example

Using as a context manager:

    from duckalog import connect_to_catalog

    with connect_to_catalog("catalog.yaml") as catalog:
        conn = catalog.get_connection()
        result = conn.execute("SELECT * FROM my_view").fetchall()

Using for persistent state management:

    catalog = connect_to_catalog("catalog.yaml")
    conn1 = catalog.get_connection()
    # ... later ...
    conn2 = catalog.get_connection()  # Returns the same connection
    catalog.close()

connect_to_catalog_cm(config_path, database_path=None, read_only=False, force_rebuild=False)
¶
Context manager that yields an active, state-restored DuckDB connection.
This provides the same state restoration and incremental update benefits as `CatalogConnection`, but yields the raw DuckDB connection object for convenience in simple scripts.

Usage:

    from duckalog import connect_to_catalog_cm

    with connect_to_catalog_cm("catalog.yaml") as conn:
        data = conn.execute("SELECT * FROM users").fetchall()
        print(f"Found {len(data)} records")
    # Connection automatically closed here

Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `config_path` | `str` | Path to the YAML/JSON configuration file. | required |
| `database_path` | `str \| None` | Optional database path override. | `None` |
| `read_only` | `bool` | Open the connection in read-only mode for safety. | `False` |
| `force_rebuild` | `bool` | If True, all views will be recreated. | `False` |
Yields:
| Type | Description |
|---|---|
| `Generator[DuckDBPyConnection]` | An active DuckDB connection with catalog state restored. |
create_config_template(format='yaml', output_path=None, database_name='analytics_catalog.duckdb', project_name='my_analytics_project')
¶
Generate a basic, valid Duckalog configuration template.
This function creates a configuration template with sensible defaults and educational example content that demonstrates key Duckalog features.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `format` | `ConfigFormat` | Output format for the configuration ('yaml' or 'json'). Defaults to 'yaml'. | `'yaml'` |
| `output_path` | `str \| None` | Optional path to write the configuration file. If provided, the template is written to this path and the content is also returned as a string. | `None` |
| `database_name` | `str` | Name for the DuckDB database file. | `'analytics_catalog.duckdb'` |
| `project_name` | `str` | Name used in comments to personalize the template. | `'my_analytics_project'` |
Returns:
| Type | Description |
|---|---|
| `str` | The generated configuration as a string. |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If format is not 'yaml' or 'json'. |
| `ConfigError` | If the generated template fails validation. |
| `OSError` | If writing to output_path fails. |
Example

Generate a YAML template:

    template = create_config_template(format='yaml')
    print(template)

Generate and save a JSON template:

    template = create_config_template(
        format='json',
        output_path='my_config.json'
    )

generate_all_views_sql(config, include_secrets=False)
¶
Generate SQL for all views in a configuration.
The output includes a descriptive header with the config version followed
by a CREATE OR REPLACE VIEW statement for each view in the order they
appear in the configuration.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `config` | `Config` | The validated :class: | required |
| `include_secrets` | `bool` | Whether to include CREATE SECRET statements for secrets. | `False` |
Returns:
| Type | Description |
|---|---|
| `str` | A multi-statement SQL script suitable for use as a catalog definition. |
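The shape of such a script (a header followed by one `CREATE OR REPLACE VIEW` statement per view, in configuration order) can be sketched as follows. `generate_script_sketch` is a hypothetical helper that works on plain `(name, sql)` pairs rather than `Config` objects, and the header wording is an assumption.

```python
def generate_script_sketch(version: int, views: list[tuple[str, str]]) -> str:
    """Build a multi-statement script from (view_name, select_sql) pairs."""
    # Header line mentioning the config version (exact wording is assumed).
    lines = [f"-- Generated catalog script (config version {version})", ""]
    for name, body in views:
        # Quote the view name, doubling any embedded double quotes.
        quoted = '"' + name.replace('"', '""') + '"'
        lines.append(f"CREATE OR REPLACE VIEW {quoted} AS")
        # Normalize the body so each statement ends with exactly one semicolon.
        lines.append(body.rstrip().rstrip(";") + ";")
        lines.append("")
    return "\n".join(lines)


script = generate_script_sketch(
    1,
    [
        ("events", "SELECT * FROM read_parquet('data/events.parquet')"),
        ("users", "SELECT id, name FROM raw_users;"),
    ],
)
print(script)
```

Views are emitted in the order given, so a view that selects from another view should appear after it in the input list.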
generate_secret_sql(secret)
¶
Generate CREATE SECRET statement for a DuckDB secret.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `secret` | `SecretConfig` | Secret configuration object. | required |
Returns:
| Type | Description |
|---|---|
| `str` | SQL CREATE SECRET statement. |
generate_sql(config_path)
¶
Generate a full SQL script from a config file.
This is a convenience wrapper around `load_config` and `generate_all_views_sql` that does not connect to DuckDB.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `config_path` | `str` | Path to the YAML/JSON configuration file. | required |
Returns:
| Type | Description |
|---|---|
| `str` | A multi-statement SQL script containing statements for all configured views. |
Raises:
| Type | Description |
|---|---|
| `ConfigError` | If the configuration file is invalid. |
Example

    >>> from duckalog import generate_sql
    >>> sql = generate_sql("catalog.yaml")
    >>> print("CREATE VIEW" in sql)
    True

generate_view_sql(view)
¶
Generate a CREATE OR REPLACE VIEW statement for a single view.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `view` | `ViewConfig` | The :class: | required |
Returns:
| Type | Description |
|---|---|
| `str` | A single SQL statement that creates or replaces the view. |
load_config(path, load_sql_files=True, sql_file_loader=None, resolve_paths=True, filesystem=None, load_dotenv=True)
¶
Load, interpolate, and validate a Duckalog configuration file.
quote_ident(value)
¶
Quote a SQL identifier using double quotes.
This helper wraps a string in double quotes and escapes any embedded double quotes according to SQL rules.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `value` | `str` | Identifier to quote (for example, a view or column name). | required |
Returns:
| Type | Description |
|---|---|
| `str` | The identifier wrapped in double quotes. |
Example

    >>> quote_ident("events")
    '"events"'

quote_literal(value)
¶
Quote a SQL string literal using single quotes.
This helper wraps a string in single quotes and escapes any embedded single quotes according to SQL rules.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `value` | `str` | String literal to quote (for example, a file path, secret, or connection string). | required |
Returns:
| Type | Description |
|---|---|
| `str` | The string wrapped in single quotes with proper escaping. |
Example

    >>> quote_literal("path/to/file.parquet")
    "'path/to/file.parquet'"
    >>> quote_literal("user's data")
    "'user''s data'"

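The escaping rules described for `quote_ident` and `quote_literal` (wrap in quotes, double any embedded quote character) are standard SQL. The sketch below reimplements them under hypothetical `_sketch` names so it runs without Duckalog installed; it is an illustration of the rule, not the library's source. `read_parquet` is DuckDB's Parquet scan function.

```python
def quote_ident_sketch(value: str) -> str:
    """Wrap an identifier in double quotes, doubling embedded double quotes."""
    return '"' + value.replace('"', '""') + '"'


def quote_literal_sketch(value: str) -> str:
    """Wrap a string in single quotes, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


# Compose both helpers into one statement: the view name is an identifier,
# the file path is a string literal.
view_name = quote_ident_sketch('daily "raw" events')
path = quote_literal_sketch("data/user's files/events.parquet")
statement = (
    f"CREATE OR REPLACE VIEW {view_name} AS "
    f"SELECT * FROM read_parquet({path});"
)
print(statement)
```

Keeping identifier quoting and literal quoting separate matters because the two use different quote characters and are not interchangeable in SQL.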
render_options(options)
¶
Render a mapping of options into scan-function arguments.
The resulting string is suitable for appending to a *_scan function
call. Keys are sorted alphabetically to keep output deterministic.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `options` | `dict[str, Any]` | Mapping of option name to value (str, bool, int, or float). | required |
Returns:
| Type | Description |
|---|---|
| `str` | A string that starts with … are provided. |
Raises:
| Type | Description |
|---|---|
| `TypeError` | If a value has a type that cannot be rendered safely. |
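One way to picture the behavior described for `render_options` (alphabetical key order, type-checked values, `TypeError` for anything else) is the sketch below. The output format is an assumption: the leading `", "`, the `key=value` form, and the `TRUE`/`FALSE` spelling are illustrative choices, and `render_options_sketch` is a hypothetical name.

```python
from typing import Any


def render_options_sketch(options: dict[str, Any]) -> str:
    """Render options as ', key=value, ...' with keys in sorted order."""
    if not options:
        return ""
    parts = []
    for key in sorted(options):  # sorted keys keep the output deterministic
        value = options[key]
        if isinstance(value, bool):  # check bool first: bool is a subclass of int
            rendered = "TRUE" if value else "FALSE"
        elif isinstance(value, (int, float)):
            rendered = str(value)
        elif isinstance(value, str):
            rendered = "'" + value.replace("'", "''") + "'"
        else:
            raise TypeError(
                f"Unsupported option type for {key!r}: {type(value).__name__}"
            )
        parts.append(f"{key}={rendered}")
    return ", " + ", ".join(parts)


rendered = render_options_sketch({"union_by_name": True, "hive_partitioning": False})
print(rendered)
```

Rejecting unknown value types, rather than calling `str()` on them, avoids emitting SQL fragments whose quoting was never checked.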
validate_config(config_path)
¶
Validate a configuration file without touching DuckDB.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `config_path` | `str` | Path to the YAML/JSON configuration file. | required |
Raises:
| Type | Description |
|---|---|
| `ConfigError` | If the configuration file is missing, malformed, or does not satisfy the schema and interpolation rules. |
Example

    >>> from duckalog import validate_config
    >>> validate_config("catalog.yaml")  # raises on invalid config

validate_generated_config(content, format='yaml')
¶
Validate that generated configuration content can be loaded successfully.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `content` | `str` | Configuration content as string. | required |
| `format` | `ConfigFormat` | Format of the content ('yaml' or 'json'). | `'yaml'` |
Raises:
| Type | Description |
|---|---|
| `ConfigError` | If the configuration cannot be loaded or is invalid. |
New Architecture Patterns¶
Configuration Loading API¶
The new duckalog.config.api module provides enhanced configuration loading with dependency injection support:
Key Features:
- Dependency Injection: Custom resolvers and processors
- Request-Scoped Caching: Performance optimization for batch operations
- Enhanced Error Handling: Better error context and recovery
- Modular Design: Clean separation of concerns
Main Functions:
- load_config(): Enhanced configuration loader with DI support
- _load_config_from_local_file(): Core local file loading logic
Import Resolution Interfaces¶
The duckalog.config.resolution package provides extensible import resolution:
Key Components:
- ImportResolver: Interface for custom import resolution logic
- ImportContext: Tracks state during configuration loading
- RequestContext: Aggregates caches for single config load operations
- request_cache_scope(): Context manager for performance optimization
Environment Processing¶
The duckalog.config.resolution.env module provides environment variable processing:
Key Components:
- EnvProcessor: Interface for custom environment processing
- DefaultEnvProcessor: Standard environment variable resolution
- EnvCache: Caching for environment variable lookups
Configuration Models¶
Core Models¶
These models define the structure of Duckalog configuration files:
- Config: Root configuration container
- DuckDBConfig: Database settings and pragmas
- AttachmentsConfig: External database attachments
- ViewConfig: Individual view definitions
- SecretConfig: Credential and secret management
- IcebergCatalogConfig: Iceberg catalog connections
Advanced Models¶
Extended models for complex configurations:
- SemanticModelConfig: Semantic layer definitions
- SQLFileReference: External SQL file references
- SemanticDimensionConfig: Semantic dimension definitions
- SemanticMeasureConfig: Semantic measure definitions
Import Patterns¶
Recommended Modern Imports¶
    # New modular imports (recommended)
    from duckalog.config.api import load_config
    from duckalog.config.resolution.imports import request_cache_scope
    from duckalog.config.resolution.base import ImportResolver, ImportContext
    from duckalog.config.resolution.env import EnvProcessor

    # Convenience functions (unchanged)
    from duckalog import generate_sql, validate_config, connect_to_catalog

    # Configuration models
    from duckalog.config import Config, DuckDBConfig, ViewConfig

Backward Compatibility¶
All existing import patterns continue to work without modification:
    # Legacy imports (still supported)
    from duckalog import load_config         # Re-exports from new location
    from duckalog.config import load_config  # Re-exports from new location

Usage Patterns¶
Basic Usage (Unchanged)¶
    from duckalog import load_config, build_catalog

    # Load configuration
    config = load_config("catalog.yaml")

    # Build catalog
    build_catalog("catalog.yaml")

Advanced Usage with Dependency Injection¶
    import fsspec

    from duckalog.config.api import load_config
    from duckalog.config.resolution.imports import request_cache_scope

    # Custom filesystem (credentials elided)
    filesystem = fsspec.filesystem("s3", key="...", secret="...")

    # Load with custom dependencies
    config = load_config(
        "s3://bucket/config.yaml",
        filesystem=filesystem,
        load_dotenv=False,
    )

    # Batch loading with caching; config_files is a placeholder list of paths
    config_files = ["catalog_a.yaml", "catalog_b.yaml"]
    with request_cache_scope() as context:
        configs = [load_config(f) for f in config_files]

Custom Implementation¶
    from duckalog.config.resolution.base import ImportResolver, ImportContext


    class CustomResolver(ImportResolver):
        def resolve(self, config_data: dict, context: ImportContext) -> dict:
            # Custom import resolution logic goes here. This placeholder
            # returns a shallow copy of the input unchanged.
            resolved_config = dict(config_data)
            return resolved_config

Utility Functions¶
Core Functions¶
- load_config: Enhanced configuration loading with DI support
- build_catalog: Build DuckDB catalog from configuration
- validate_config: Validate configuration without building
- generate_sql: Generate SQL statements from configuration
High-Level Convenience Functions¶
- connect_to_catalog: Connect to a DuckDB catalog, building and connecting in one operation
- connect_to_catalog_cm: Context manager for catalog connections
Command Line Interface¶
Available Commands¶
- duckalog run: Build a DuckDB catalog from a configuration file
- duckalog validate: Validate a configuration file
- duckalog generate-sql: Generate SQL statements from a configuration file
- duckalog init: Initialize new configuration from templates
Enhanced CLI Features¶
- Shared Filesystem Options: Centralized remote access configuration
- Context Management: Automatic filesystem object lifecycle management
- Error Reporting: Improved error messages and context
Error Handling¶
Core Exceptions¶
- ConfigError: Configuration validation or loading errors
- EngineError: Catalog building or execution errors
Import Resolution Exceptions¶
- ImportError: General import resolution failures
- ImportFileNotFoundError: Imported file not found
- CircularImportError: Circular dependency detected
- ImportValidationError: Import content validation failures
Path Security Exceptions¶
- PathResolutionError: Path security validation failures
- SQLFileError: SQL file processing errors
- SQLFileNotFoundError: SQL file not found
- SQLFilePermissionError: Permission denied for SQL file
Performance Features¶
Caching Architecture¶
- Request-Scoped Caching: Shared resolution across multiple loads
- Import Resolution Caching: Avoid re-processing imports
- Environment Variable Caching: Reuse resolved values
- Path Resolution Caching: Cache normalized paths
Memory Management¶
- Automatic Cache Cleanup: Context-managed cache clearing
- Import Chain Tracking: Efficient circular dependency detection
- Lightweight Context Objects: Minimal overhead for tracking
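Import chain tracking for circular dependency detection can be sketched as a depth-first walk that carries the current chain of files. The names below (`resolve_order`, `CircularImportSketchError`) are hypothetical, and the import graph is a plain dictionary standing in for parsed config files.

```python
from typing import Optional


class CircularImportSketchError(Exception):
    """Hypothetical error type used only in this sketch."""


def resolve_order(
    path: str,
    import_graph: dict[str, list[str]],
    chain: tuple[str, ...] = (),
    resolved: Optional[list[str]] = None,
) -> list[str]:
    """Return files in load order (imports first), raising on cycles."""
    if resolved is None:
        resolved = []
    if path in chain:
        # The file is already being processed further up the chain.
        cycle = " -> ".join(chain + (path,))
        raise CircularImportSketchError(f"Circular import: {cycle}")
    if path in resolved:
        return resolved  # shared import already handled once
    for child in import_graph.get(path, []):
        resolve_order(child, import_graph, chain + (path,), resolved)
    resolved.append(path)
    return resolved


graph = {
    "root.yaml": ["a.yaml", "b.yaml"],
    "a.yaml": ["shared.yaml"],
    "b.yaml": ["shared.yaml"],
}
order = resolve_order("root.yaml", graph)
print(order)
```

A file imported from two places (a diamond) is fine and is processed once; only a file that appears in its own chain is reported as circular.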
Security Features¶
Enhanced Path Security¶
- Rooted Resolution: All paths relative to config location
- Traversal Protection: Prevent directory traversal attacks
- Cross-Platform Support: Consistent behavior across systems
- Remote URI Support: Secure handling of remote configurations
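Rooted resolution with traversal protection can be sketched with `pathlib`. This is a minimal illustration that assumes the allowed root is the directory containing the config file; `resolve_within_root` is a hypothetical helper, and Duckalog's actual rules (for example, which roots are permitted) may be broader. `Path.is_relative_to` requires Python 3.9 or later.

```python
from pathlib import Path


def resolve_within_root(config_path: str, relative_path: str) -> Path:
    """Resolve a path against the config's directory and reject escapes."""
    root = Path(config_path).resolve().parent
    # resolve() normalizes ".." segments, so traversal attempts become visible.
    candidate = (root / relative_path).resolve()
    if not candidate.is_relative_to(root):
        raise ValueError(f"Path escapes config directory: {relative_path!r}")
    return candidate


inside = resolve_within_root("/tmp/project/catalog.yaml", "sql/view.sql")
print(inside.name)

try:
    resolve_within_root("/tmp/project/catalog.yaml", "../../etc/passwd")
except ValueError as exc:
    print(f"rejected: {exc}")
```

Comparing fully resolved paths, rather than scanning the string for `..`, also catches absolute paths that point outside the root.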
Secret Management¶
- Environment Variable Integration: Secure credential handling
- Automatic Redaction: Sensitive data protection in logs
- DuckDB Integration: Direct mapping to CREATE SECRET statements
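Automatic redaction can be pictured as masking known sensitive fields before values reach a log line. The sketch below is an assumption-laden illustration: the field list is drawn from the `SecretConfig` attributes documented above, while `redact_secret_fields` and the `***REDACTED***` marker are hypothetical.

```python
# Field names taken from the SecretConfig attribute table; treating exactly
# these as sensitive is an assumption made for this sketch.
SENSITIVE_FIELDS = frozenset(
    {
        "secret",
        "password",
        "client_secret",
        "bearer_token",
        "service_account_key",
        "json_key",
        "connection_string",
    }
)
REDACTED = "***REDACTED***"


def redact_secret_fields(values: dict[str, object]) -> dict[str, object]:
    """Return a copy with sensitive, non-empty fields replaced by a marker."""
    return {
        key: REDACTED if key in SENSITIVE_FIELDS and value is not None else value
        for key, value in values.items()
    }


secret_values = {
    "type": "s3",
    "key_id": "AKIAEXAMPLE",
    "secret": "not-a-real-secret",
    "region": "eu-west-1",
}
safe = redact_secret_fields(secret_values)
print(safe)
```

The function returns a new dictionary, so the original values stay available for building the actual `CREATE SECRET` statement while only the redacted copy is logged.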
Advanced Usage¶
For detailed examples and advanced usage patterns, see:¶
- User Guide: Comprehensive usage documentation
- Examples: Real-world configuration examples
- Migration Guide: Migrate from legacy patterns
- Architecture Documentation: Deep dive into system design
Extensibility¶
- Custom Import Resolvers: Implement custom import loading logic
- Custom Environment Processors: Add specialized environment handling
- Custom Filesystem Integration: Support new protocols and backends
- Plugin Architecture: Extensible design for new features