API Reference¶
This section documents the core API functions and classes in Duckalog, with a focus on path resolution features.
Core Configuration API¶
load_config()¶
Load, interpolate, and validate a Duckalog configuration file with optional path resolution.
from duckalog import load_config
# Load with path resolution enabled (default)
config = load_config("catalog.yaml")
# Load with explicit path resolution control
config = load_config("catalog.yaml", resolve_paths=True)
config = load_config("catalog.yaml", resolve_paths=False)
# Load without processing SQL files (for validation)
config = load_config("catalog.yaml", load_sql_files=False)
Parameters:
- path (str): Path to YAML or JSON configuration file
- load_sql_files (bool, optional): Whether to load SQL from external files. Default: True
- sql_file_loader (SQLFileLoader, optional): Custom SQL file loader instance
- resolve_paths (bool, optional): Whether to resolve relative paths. Default: True
Returns:
- Config: Validated configuration object
Raises:
- ConfigError: Configuration parsing or validation errors (inherits from DuckalogError)
Path Resolution API¶
Path Detection Functions¶
is_relative_path()¶
Detect if a path is relative based on platform-specific rules.
from duckalog.path_resolution import is_relative_path
# Returns True for relative paths
assert is_relative_path("data/file.parquet")
assert is_relative_path("../shared/data.parquet")
# Returns False for absolute paths and URIs
assert not is_relative_path("/absolute/path/file.parquet")
assert not is_relative_path("s3://bucket/file.parquet")
Parameters:
- path (str): Path string to check
Returns:
- bool: True if path is relative, False otherwise
detect_path_type()¶
Categorize paths by type for processing.
from duckalog.path_resolution import detect_path_type
path_type = detect_path_type("../data/file.parquet") # Returns: "relative"
path_type = detect_path_type("/absolute/file.parquet") # Returns: "absolute"
path_type = detect_path_type("s3://bucket/file.parquet") # Returns: "remote"
Parameters:
- path (str): Path string to analyze
Returns:
- str: One of "relative", "absolute", "remote", "invalid"
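The categorization above can be sketched with standard-library tools alone. The helper below is a rough, POSIX-oriented stand-in (not Duckalog's actual implementation) that mirrors the documented return values:

```python
import os.path
from urllib.parse import urlparse

def classify_path(path: str) -> str:
    """Rough sketch of the documented path categories."""
    if not path:
        return "invalid"
    # Scheme-qualified URIs (s3://, http://, ...) count as remote.
    if urlparse(path).scheme not in ("", "file"):
        return "remote"
    return "absolute" if os.path.isabs(path) else "relative"
```

Note the scheme check runs first, so `s3://bucket/file.parquet` is never mistaken for a relative path.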
Path Resolution Functions¶
resolve_relative_path()¶
Resolve a relative path to an absolute path, interpreted relative to the configuration file's directory.
from duckalog.path_resolution import resolve_relative_path
from pathlib import Path
config_dir = Path("/project/config")
resolved = resolve_relative_path("data/file.parquet", config_dir)
# Returns: "/project/config/data/file.parquet"
Parameters:
- path (str): Path to resolve (relative or absolute)
- config_dir (Path): Directory containing the configuration file
Returns:
- str: Resolved absolute path
Raises:
- ValueError: If path resolution fails or violates security rules
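Setting the security checks aside, the resolution step behaves like joining against the config directory and normalizing. A minimal POSIX-style sketch of that behavior (not the library's code, which also enforces the security rules above):

```python
import os.path

def resolve_against(path: str, config_dir: str) -> str:
    # Absolute paths pass through unchanged; relative paths are
    # joined to the config directory and lexically normalized.
    if os.path.isabs(path):
        return path
    return os.path.normpath(os.path.join(config_dir, path))
```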
Security Validation Functions¶
validate_path_security()¶
Validate that resolved paths don't violate security boundaries.
from duckalog.path_resolution import validate_path_security
from pathlib import Path
config_dir = Path("/project/config")
# Safe paths return True
assert validate_path_security("data/file.parquet", config_dir)
assert validate_path_security("../shared/data.parquet", config_dir)
# Dangerous paths return False
assert not validate_path_security("../../../etc/passwd", config_dir)
Parameters:
- path (str): Path to validate (will be resolved if relative)
- config_dir (Path): Directory containing the configuration file
Returns:
- bool: True if path is safe, False otherwise
validate_file_accessibility()¶
Validate that a file path is accessible and readable.
from duckalog.path_resolution import validate_file_accessibility
accessible, error = validate_file_accessibility("/path/to/file.parquet")
if accessible:
print("File is accessible")
else:
print(f"File access error: {error}")
Parameters:
- path (str): File path to validate
Returns:
- tuple[bool, str | None]: (is_accessible, error_message)
Path Normalization Functions¶
normalize_path_for_sql()¶
Normalize a path for use in SQL statements.
from duckalog.path_resolution import normalize_path_for_sql
sql_path = normalize_path_for_sql("/path/to/file.parquet")
# Returns: "'/path/to/file.parquet'"
sql_path = normalize_path_for_sql("/path/file's_data.parquet")
# Returns: "'/path/file''s_data.parquet'" (properly escaped)
Parameters:
- path (str): Absolute path to normalize
Returns:
- str: SQL-safe quoted path
Root-Based Path Security Functions¶
is_within_allowed_roots()¶
Check if a resolved path is within any of the allowed root directories using robust cross-platform validation.
from duckalog.path_resolution import is_within_allowed_roots
from pathlib import Path
config_dir = Path("/project/config")
allowed_roots = [config_dir]
# Check if a path is within allowed roots
is_safe = is_within_allowed_roots("/project/config/data/file.parquet", allowed_roots)
# Returns: True
is_unsafe = is_within_allowed_roots("/etc/passwd", allowed_roots)
# Returns: False
is_traversal_blocked = is_within_allowed_roots("../../../etc/passwd", allowed_roots)
# Returns: False (path traversal blocked)
Parameters:
- candidate_path (str): The path to check (will be resolved to absolute)
- allowed_roots (list[Path]): List of Path objects representing allowed root directories
Returns:
- bool: True if the candidate path is within at least one allowed root, False otherwise
Raises:
- ValueError: If the candidate path cannot be resolved (invalid path)
Security Features:
- Uses Path.resolve() to follow symlinks and get canonical paths
- Uses os.path.commonpath() for robust cross-platform path comparison
- Prevents all forms of path traversal attacks regardless of encoding or separators
- Handles Windows drive letters and UNC paths correctly
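The security features listed above combine canonicalization with a common-path containment test. A self-contained sketch of that pattern, using the same stdlib primitives the documentation names (illustrative only, not the library's source):

```python
import os.path

def within_roots(candidate: str, roots: list[str]) -> bool:
    # realpath canonicalizes and follows symlinks, like Path.resolve().
    resolved = os.path.realpath(candidate)
    for root in roots:
        canonical_root = os.path.realpath(root)
        try:
            # commonpath gives a robust containment test; a plain string
            # prefix check would wrongly accept /project/config2.
            if os.path.commonpath([resolved, canonical_root]) == canonical_root:
                return True
        except ValueError:
            # Raised for mixed drives on Windows; treat as outside the root.
            continue
    return False
```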
SQL Generation API¶
SQL Quoting Functions¶
quote_ident()¶
Quote a SQL identifier using double quotes with proper escaping.
from duckalog import quote_ident
# Quote simple identifiers
assert quote_ident("events") == '"events"'
# Quote identifiers with spaces
assert quote_ident("daily users") == '"daily users"'
# Quote identifiers with quotes (escapes embedded quotes)
assert quote_ident('user "events"') == '"user ""events"""'
Parameters:
- identifier (str): Identifier to quote
Returns:
- str: Identifier wrapped in double quotes with proper escaping
quote_literal()¶
Quote a SQL string literal using single quotes with proper escaping.
from duckalog import quote_literal
# Quote simple strings
assert quote_literal("user's data") == "'user''s data'"
# Quote file paths
assert quote_literal("path/to/file.parquet") == "'path/to/file.parquet'"
# Quote SQL statements
assert quote_literal("SELECT * FROM table") == "'SELECT * FROM table'"
# Quote empty strings
assert quote_literal("") == "''"
Parameters:
- value (str): String literal to quote
Returns:
- str: String wrapped in single quotes with proper escaping
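Both helpers follow the standard SQL escaping rule: double every embedded quote character, then wrap. A self-contained re-implementation of that rule (illustrative, not the library's source):

```python
def ident(name: str) -> str:
    # Identifiers: double any embedded double quote, wrap in double quotes.
    return '"' + name.replace('"', '""') + '"'

def literal(value: str) -> str:
    # String literals: double any embedded single quote, wrap in single quotes.
    return "'" + value.replace("'", "''") + "'"
```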
generate_view_sql()¶
Generate a CREATE OR REPLACE VIEW statement for a view configuration.
from duckalog import ViewConfig, generate_view_sql
view = ViewConfig(
name="test_view",
source="duckdb",
database="my_db",
table="users"
)
sql = generate_view_sql(view)
# Returns: CREATE OR REPLACE VIEW "test_view" AS SELECT * FROM "my_db"."users";
Parameters:
- view (ViewConfig): View configuration to generate SQL for
Returns:
- str: SQL statement creating the view with proper quoting
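For the attached-database case shown above, the generated statement is just the identifier-quoting rule applied to each name. A hypothetical sketch of that shape (the real function also handles file-based sources and inline SQL):

```python
def quote(name: str) -> str:
    # Same double-quote escaping rule as quote_ident().
    return '"' + name.replace('"', '""') + '"'

def attached_view_sql(view_name: str, database: str, table: str) -> str:
    # Shape: CREATE OR REPLACE VIEW "<name>" AS SELECT * FROM "<db>"."<table>";
    return (
        f"CREATE OR REPLACE VIEW {quote(view_name)} "
        f"AS SELECT * FROM {quote(database)}.{quote(table)};"
    )
```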
generate_secret_sql()¶
Generate CREATE SECRET statement for a secret configuration with proper escaping.
from duckalog import SecretConfig, generate_secret_sql
secret = SecretConfig(
type="s3",
name="prod_s3",
provider="config",
key_id="user's key", # Contains single quote
secret="secret'with'quotes"
)
sql = generate_secret_sql(secret)
# Returns: CREATE SECRET prod_s3 (TYPE S3, KEY_ID 'user''s key', SECRET 'secret''with''quotes')
Parameters:
- secret (SecretConfig): Secret configuration to generate SQL for
Returns:
- str: SQL statement creating the secret with proper escaping
Note: This function enforces strict type checking for secret options and will raise TypeError for unsupported types (non-bool, int, float, or str values).
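The strict type checking mentioned in the note can be pictured as a per-option renderer that accepts only bool, int, float, or str. This is an illustrative sketch; the rendered keyword style is an assumption, not the library's verified output:

```python
def render_secret_option(key: str, value: object) -> str:
    # bool must be checked before int: bool is a subclass of int in Python.
    if isinstance(value, bool):
        return f"{key.upper()} {'true' if value else 'false'}"
    if isinstance(value, (int, float)):
        return f"{key.upper()} {value}"
    if isinstance(value, str):
        # Same single-quote escaping as quote_literal().
        return f"{key.upper()} '" + value.replace("'", "''") + "'"
    raise TypeError(f"Unsupported secret option type: {type(value).__name__}")
```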
Configuration Models¶
Config¶
Top-level configuration model with path resolution integration.
from duckalog import load_config
config = load_config("catalog.yaml")
# Access configuration properties
print(config.version)
print(len(config.views))
# Views have resolved paths when resolution is enabled
for view in config.views:
print(f"View: {view.name}, URI: {view.uri}")
ViewConfig¶
Model for individual view definitions that may contain resolved paths.
from duckalog.config import ViewConfig
# View with relative path (will be resolved during config loading)
view = ViewConfig(
name="events",
source="parquet",
uri="data/events.parquet" # Relative path
)
# View with absolute path (unchanged during resolution)
view = ViewConfig(
name="remote_data",
source="parquet",
uri="s3://bucket/events.parquet" # Remote URI - unchanged
)
AttachmentConfig¶
Models for database attachments with path resolution support.
from duckalog.config import DuckDBAttachment, SQLiteAttachment
# DuckDB attachment with relative path
duckdb_attach = DuckDBAttachment(
alias="reference_db",
path="./databases/reference.duckdb", # Will be resolved
read_only=True
)
# SQLite attachment with relative path
sqlite_attach = SQLiteAttachment(
alias="legacy_db",
path="../legacy/system.db" # Will be resolved
)
Exception Classes¶
Duckalog uses a consistent exception hierarchy based on DuckalogError as the base class for all library exceptions. This provides a unified error handling interface and makes it easy to catch all Duckalog-specific errors.
Base Exception¶
DuckalogError¶
Base exception for all Duckalog-specific errors. You can catch this class to handle any Duckalog-related error, or catch more specific subclasses for targeted error handling.
from duckalog import DuckalogError
try:
config = load_config("catalog.yaml")
build_catalog(config)
except DuckalogError as e:
# Handle any Duckalog-specific error
print(f"Duckalog error: {e}")
Configuration Errors¶
ConfigError¶
Raised for configuration-related errors including parsing, validation, and path resolution failures. Inherits from DuckalogError.
from duckalog import load_config, ConfigError
try:
config = load_config("catalog.yaml")
except ConfigError as e:
print(f"Configuration error: {e}")
Common scenarios:
- Missing required configuration fields
- Invalid YAML/JSON syntax
- Unresolved environment variable placeholders (${env:VAR})
- Invalid view definitions
PathResolutionError¶
Raised specifically when path resolution fails due to security or access issues. Inherits from ConfigError.
from duckalog import PathResolutionError, resolve_relative_path
from pathlib import Path
try:
resolved = resolve_relative_path("../../../etc/passwd", Path("/project/config"))
except PathResolutionError as e:
print(f"Path resolution failed: {e}")
print(f"Original path: {e.original_path}")
print(f"Resolved path: {e.resolved_path}")
RemoteConfigError¶
Raised when remote configuration loading fails. Inherits from ConfigError.
from duckalog import RemoteConfigError, load_config_from_uri
try:
config = load_config_from_uri("s3://bucket/config.yaml")
except RemoteConfigError as e:
print(f"Remote config error: {e}")
SQL File Errors¶
SQLFileError¶
Base exception for SQL file-related errors. Inherits from DuckalogError.
from duckalog import SQLFileError
try:
# Operations that load SQL from files
pass
except SQLFileError as e:
print(f"SQL file error: {e}")
Subclasses for specific SQL file errors:
- SQLFileNotFoundError: Referenced SQL file does not exist
- SQLFilePermissionError: SQL file cannot be read due to permissions
- SQLFileEncodingError: SQL file has invalid encoding
- SQLFileSizeError: SQL file exceeds size limits
- SQLTemplateError: Template processing fails
Engine Errors¶
EngineError¶
Raised during catalog builds when DuckDB operations fail. Inherits from DuckalogError.
from duckalog import EngineError, build_catalog
try:
build_catalog("catalog.yaml")
except EngineError as e:
print(f"Engine error: {e}")
Common scenarios:
- DuckDB connection failures
- SQL execution errors
- Attachment setup failures
- Extension installation/loading errors
- Secret creation failures
Exception Chaining¶
All Duckalog exceptions support proper exception chaining to preserve the original error context:
from duckalog import EngineError
try:
# Some operation that fails
raise ValueError("Original database connection failed")
except ValueError as exc:
raise EngineError("Failed to connect to DuckDB") from exc
This preserves the full traceback and makes debugging much easier.
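The same pattern can be exercised end to end with stand-in classes (hypothetical stand-ins here, mirroring the documented hierarchy):

```python
class DuckalogError(Exception):
    """Stand-in for the library's base exception."""

class EngineError(DuckalogError):
    """Stand-in for the engine-level error."""

def connect() -> None:
    try:
        raise ValueError("Original database connection failed")
    except ValueError as exc:
        # `raise ... from exc` links the original error as __cause__.
        raise EngineError("Failed to connect to DuckDB") from exc

try:
    connect()
except EngineError as err:
    assert isinstance(err.__cause__, ValueError)  # original context preserved
```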
Generated API Documentation¶
This section is generated from the Duckalog source code using mkdocstrings.
duckalog¶
Duckalog public API.
AttachmentsConfig¶
Bases: BaseModel
Collection of attachment configurations.
Attributes:
| Name | Type | Description |
|---|---|---|
| duckdb | list[DuckDBAttachment] | DuckDB attachment entries. |
| sqlite | list[SQLiteAttachment] | SQLite attachment entries. |
| postgres | list[PostgresAttachment] | Postgres attachment entries. |
| duckalog | list[DuckalogAttachment] | Duckalog config attachment entries. |
Config¶
Bases: BaseModel
Top-level Duckalog configuration.
Attributes:
| Name | Type | Description |
|---|---|---|
| version | int | Positive integer describing the config schema version. |
| duckdb | DuckDBConfig | DuckDB session and connection settings. |
| views | list[ViewConfig] | List of view definitions to create in the catalog. |
| attachments | AttachmentsConfig | Optional attachments to external databases. |
| iceberg_catalogs | list[IcebergCatalogConfig] | Optional Iceberg catalog definitions. |
| semantic_models | list[SemanticModelConfig] | Optional semantic model definitions for business metadata. |
ConfigError¶
Bases: DuckalogError
Configuration-related errors.
This exception is raised when a catalog configuration cannot be read, parsed, interpolated, or validated according to the Duckalog schema.
Typical error conditions include:
- The config file does not exist or cannot be read.
- The file is not valid YAML/JSON.
- Required fields are missing or invalid.
- An environment variable placeholder cannot be resolved.
DuckDBAttachment¶
Bases: BaseModel
Configuration for attaching another DuckDB database.
Attributes:
| Name | Type | Description |
|---|---|---|
| alias | str | Alias under which the database will be attached. |
| path | str | Filesystem path to the DuckDB database file. |
| read_only | bool | Whether the attachment should be opened in read-only mode. Defaults to … |
DuckDBConfig¶
Bases: BaseModel
DuckDB connection and session settings.
Attributes:
| Name | Type | Description |
|---|---|---|
| database | str | Path to the DuckDB database file. Defaults to … |
| install_extensions | list[str] | Names of extensions to install before use. |
| load_extensions | list[str] | Names of extensions to load in the session. |
| pragmas | list[str] | SQL statements (typically PRAGMA statements). |
| settings | Optional[Union[str, list[str]]] | DuckDB SET statements executed after pragmas. Can be a single string or list of strings. |
| secrets | list[SecretConfig] | List of secret definitions for external services and databases. |
EngineError¶
Bases: DuckalogError
Engine-level error raised during catalog builds.
This exception wraps lower-level DuckDB errors, such as failures to connect to the database, attach external systems, or execute generated SQL statements.
IcebergCatalogConfig¶
Bases: BaseModel
Configuration for an Iceberg catalog.
Attributes:
| Name | Type | Description |
|---|---|---|
| name | str | Catalog name referenced by Iceberg views. |
| catalog_type | str | Backend type (for example, …). |
| uri | Optional[str] | Optional URI used by certain catalog types. |
| warehouse | Optional[str] | Optional warehouse location for catalog data. |
| options | dict[str, Any] | Additional catalog-specific options. |
PostgresAttachment¶
Bases: BaseModel
Configuration for attaching a Postgres database.
Attributes:
| Name | Type | Description |
|---|---|---|
| alias | str | Alias used inside DuckDB to reference the Postgres database. |
| host | str | Hostname or IP address of the Postgres server. |
| port | int | TCP port of the Postgres server. |
| database | str | Database name to connect to. |
| user | str | Username for authentication. |
| password | str | Password for authentication. |
| sslmode | Optional[str] | Optional SSL mode (for example, …). |
| options | dict[str, Any] | Extra key/value options passed to the attachment clause. |
SQLiteAttachment¶
Bases: BaseModel
Configuration for attaching a SQLite database.
Attributes:
| Name | Type | Description |
|---|---|---|
| alias | str | Alias under which the SQLite database will be attached. |
| path | str | Filesystem path to the SQLite database file. |
SecretConfig¶
Bases: BaseModel
Configuration for a DuckDB secret.
Attributes:
| Name | Type | Description |
|---|---|---|
| type | SecretType | Secret type (s3, azure, gcs, http, postgres, mysql). |
| name | Optional[str] | Optional name for the secret (defaults to type if not provided). |
| provider | SecretProvider | Secret provider (config or credential_chain). |
| persistent | bool | Whether to create a persistent secret. Defaults to False. |
| scope | Optional[str] | Optional scope prefix for the secret. |
| key_id | Optional[str] | Access key ID or username for authentication. |
| secret | Optional[str] | Secret key or password for authentication. |
| region | Optional[str] | Geographic region for cloud services. |
| endpoint | Optional[str] | Custom endpoint URL for cloud services. |
| connection_string | Optional[str] | Full connection string for databases. |
| tenant_id | Optional[str] | Azure tenant ID for authentication. |
| account_name | Optional[str] | Azure storage account name. |
| client_id | Optional[str] | Azure client ID for authentication. |
| client_secret | Optional[str] | Azure client secret for authentication. |
| service_account_key | Optional[str] | GCS service account key. |
| json_key | Optional[str] | GCS JSON key. |
| bearer_token | Optional[str] | HTTP bearer token for authentication. |
| header | Optional[str] | HTTP header for authentication. |
| database | Optional[str] | Database name for database secrets. |
| host | Optional[str] | Database host for database secrets. |
| port | Optional[int] | Database port for database secrets. |
| user | Optional[str] | Database username (alternative to key_id for database types). |
| password | Optional[str] | Database password (alternative to secret for database types). |
| options | dict[str, Any] | Additional key-value options for the secret. |
ViewConfig¶
Bases: BaseModel
Definition of a single catalog view.
A view can be defined in several ways:
1. Inline SQL: Using the sql field with raw SQL text
2. SQL File: Using sql_file to reference external SQL files
3. SQL Template: Using sql_template for parameterized SQL files
4. Data Source: Using source + required fields for direct data access
5. Source + SQL: Using source for data access plus sql for transformations
For data sources, the required fields depend on the source type:
- Parquet/Delta: uri field is required
- Iceberg: Either uri OR both catalog and table
- DuckDB/SQLite/Postgres: Both database and table are required
When using SQL with a data source, the SQL will be applied as a transformation over the data from the specified source.
Additional metadata fields such as description and tags do not affect
SQL generation but are preserved for documentation and tooling.
Attributes:
| Name | Type | Description |
|---|---|---|
| name | str | Unique view name within the config. |
| sql | Optional[str] | Raw SQL text defining the view body. |
| sql_file | Optional[SQLFileReference] | Direct reference to a SQL file. |
| sql_template | Optional[SQLFileReference] | Reference to a SQL template file with variable substitution. |
| source | Optional[EnvSource] | Source type (e.g. parquet, delta, iceberg, duckdb, sqlite, postgres). |
| uri | Optional[str] | URI for file- or table-based sources (Parquet/Delta/Iceberg). |
| database | Optional[str] | Attachment alias for attached-database sources. |
| table | Optional[str] | Table name (optionally schema-qualified) for attached sources. |
| catalog | Optional[str] | Iceberg catalog name for catalog-based Iceberg views. |
| options | dict[str, Any] | Source-specific options passed to scan functions. |
| description | Optional[str] | Optional human-readable description of the view. |
| tags | list[str] | Optional list of tags for classification. |
build_catalog(config_path, db_path=None, dry_run=False, verbose=False, filesystem=None, include_secrets=True)¶
Build or update a DuckDB catalog from a configuration file.
This function is the high-level entry point used by both the CLI and Python API. It loads the config, optionally performs a dry-run SQL generation, or otherwise connects to DuckDB, sets up attachments and Iceberg catalogs, and creates or replaces configured views.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config_path | str | Path to the YAML/JSON configuration file. | required |
| db_path | str \| None | Optional override for the configured database path. | None |
| dry_run | bool | If True, return the generated SQL script instead of executing it. | False |
| verbose | bool | If True, enable verbose logging during the build. | False |
| filesystem | Any \| None | Optional pre-configured fsspec filesystem object for remote export authentication. If not provided, default authentication will be used. | None |
Returns:
| Type | Description |
|---|---|
| str \| None | The generated SQL script as a string when dry_run=True, otherwise None. |
Raises:
| Type | Description |
|---|---|
| ConfigError | If the configuration file is invalid. |
| EngineError | If connecting to DuckDB or executing SQL fails, or if remote export fails. |
Example
Build a catalog in-place::
from duckalog import build_catalog
build_catalog("catalog.yaml")
Build and export to remote storage::
build_catalog("catalog.yaml", db_path="s3://my-bucket/catalog.duckdb")
Generate SQL without modifying the database::
sql = build_catalog("catalog.yaml", dry_run=True)
print(sql)
connect_and_build_catalog(config_path, database_path=None, dry_run=False, verbose=False, read_only=False, **kwargs)¶
Build a catalog and create a DuckDB connection in one operation.
This function combines catalog building with connection creation, providing a streamlined workflow for users who want to start working with their catalog immediately after creating it.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config_path | str | Path to the YAML/JSON configuration file. | required |
| database_path | str \| None | Optional database path override. | None |
| dry_run | bool | If True, only validates configuration and returns SQL. If False, builds the catalog and creates a connection. | False |
| verbose | bool | Enable verbose logging during build process. | False |
| read_only | bool | Open the resulting connection in read-only mode. | False |
| **kwargs | Any | Additional keyword arguments. | {} |
Returns:
| Type | Description |
|---|---|
| DuckDBPyConnection \| str \| None | A DuckDB connection object for immediate use, or SQL string when dry_run=True. |
Raises:
| Type | Description |
|---|---|
| ConfigError | If the configuration file is invalid. |
| EngineError | If catalog building or connection fails. |
| FileNotFoundError | If the resulting database file doesn't exist (after build). |
Example
Build catalog and start querying immediately::
conn = connect_and_build_catalog("catalog.yaml")
data = conn.execute("SELECT * FROM important_table").fetchall()
print(f"Found {len(data)} records")
conn.close()
Validate only (dry run)::
sql = connect_and_build_catalog("catalog.yaml", dry_run=True)
print("SQL generation completed, no database created")
Custom database path::
conn = connect_and_build_catalog("catalog.yaml", database_path="analytics.db")
print("Connected to custom database: analytics.db")
If dry_run=True, the function only validates the configuration and returns the SQL script
without creating any database files or connections.
connect_to_catalog(config_path, database_path=None, read_only=False)¶
Connect to an existing DuckDB database created by Duckalog.
This function provides a direct connection to a DuckDB database that was previously created using Duckalog. It simplifies the workflow for users who just want to start querying their catalog database.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config_path | str | Path to the YAML/JSON configuration file for an existing catalog. | required |
| database_path | str \| None | Optional database path override. If not provided, uses the path from the configuration. | None |
| read_only | bool | Open the connection in read-only mode for safety. | False |
Returns:
| Type | Description |
|---|---|
| DuckDBPyConnection | An active DuckDB connection object ready for query execution. |
Raises:
| Type | Description |
|---|---|
| ConfigError | If the configuration file is invalid. |
| FileNotFoundError | If the specified database file doesn't exist. |
| Error | If connection or queries fail. |
Example
Connect to an existing catalog::
from duckalog import connect_to_catalog
conn = connect_to_catalog("catalog.yaml")
# Use the connection for queries
result = conn.execute("SELECT * FROM some_table").fetchall()
conn.close()
With path override::
conn = connect_to_catalog("catalog.yaml", database_path="analytics.db")
With read-only mode::
conn = connect_to_catalog("catalog.yaml", read_only=True)
With context manager::
from duckalog import connect_to_catalog_cm
with connect_to_catalog_cm("catalog.yaml") as conn:
data = conn.execute("SELECT * FROM users").fetchall()
print(f"Found {len(data)} records")
# Connection automatically closed here
connect_to_catalog_cm(config_path, database_path=None, read_only=False)¶
Context manager version of connect_to_catalog for automatic connection cleanup.
Usage::
from duckalog import connect_to_catalog_cm
with connect_to_catalog_cm("catalog.yaml") as conn:
data = conn.execute("SELECT * FROM users").fetchall()
print(f"Found {len(data)} records")
# Connection is automatically closed here
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config_path | str | Path to the YAML/JSON configuration file for an existing catalog. | required |
| database_path | str \| None | Optional database path override. | None |
| read_only | bool | Open the connection in read-only mode for safety. | False |
Yields:
| Type | Description |
|---|---|
| Generator[DuckDBPyConnection] | An active DuckDB connection that will be closed automatically. |
Raises:
| Type | Description |
|---|---|
| ConfigError | If the configuration file is invalid. |
| FileNotFoundError | If the specified database file doesn't exist. |
| Error | If connection or queries fail. |
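A wrapper like this is typically a thin contextlib.contextmanager around the plain connect function. A generic sketch of the pattern, using a fake connection class invented purely for illustration:

```python
from contextlib import contextmanager
from typing import Callable, Iterator

class FakeConnection:
    """Minimal stand-in for a DuckDB connection."""
    def __init__(self) -> None:
        self.closed = False
    def close(self) -> None:
        self.closed = True

@contextmanager
def connection_cm(factory: Callable[[], FakeConnection]) -> Iterator[FakeConnection]:
    conn = factory()
    try:
        yield conn
    finally:
        # Guaranteed cleanup, even if the body raises.
        conn.close()

with connection_cm(FakeConnection) as conn:
    assert not conn.closed  # open inside the with-block
assert conn.closed  # closed automatically on exit
```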
create_config_template(format='yaml', output_path=None, database_name='analytics_catalog.duckdb', project_name='my_analytics_project')¶
Generate a basic, valid Duckalog configuration template.
This function creates a configuration template with sensible defaults and educational example content that demonstrates key Duckalog features.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| format | ConfigFormat | Output format for the configuration ('yaml' or 'json'). Defaults to 'yaml'. | 'yaml' |
| output_path | str \| None | Optional path to write the configuration file. If provided, the template is written to this path and the content is also returned as a string. | None |
| database_name | str | Name for the DuckDB database file. | 'analytics_catalog.duckdb' |
| project_name | str | Name used in comments to personalize the template. | 'my_analytics_project' |
Returns:
| Type | Description |
|---|---|
| str | The generated configuration as a string. |
Raises:
| Type | Description |
|---|---|
| ValueError | If format is not 'yaml' or 'json'. |
| ConfigError | If the generated template fails validation. |
| OSError | If writing to output_path fails. |
Example
Generate a YAML template::
template = create_config_template(format='yaml')
print(template)
Generate and save a JSON template::
template = create_config_template(
format='json',
output_path='my_config.json'
)
generate_all_views_sql(config, include_secrets=False)¶
Generate SQL for all views in a configuration.
The output includes a descriptive header with the config version followed
by a CREATE OR REPLACE VIEW statement for each view in the order they
appear in the configuration.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | Config | The validated :class:Config instance. | required |
| include_secrets | bool | Whether to include CREATE SECRET statements for secrets. | False |
Returns:
| Type | Description |
|---|---|
| str | A multi-statement SQL script suitable for use as a catalog definition. |
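The described output shape — a header, then one statement per view in config order — can be pictured as simple concatenation. The header text below is invented for illustration; the real header wording is not documented here:

```python
def assemble_script(version: int, view_statements: list[str]) -> str:
    # Header first, then each CREATE OR REPLACE VIEW statement in config order.
    header = f"-- Duckalog catalog (config schema version {version})"
    return "\n\n".join([header, *view_statements])
```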
generate_secret_sql(secret)¶
Generate CREATE SECRET statement for a DuckDB secret.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| secret | SecretConfig | Secret configuration object. | required |
Returns:
| Type | Description |
|---|---|
| str | SQL CREATE SECRET statement. |
generate_sql(config_path)¶
Generate a full SQL script from a config file.
This is a convenience wrapper around :func:load_config and
:func:generate_all_views_sql that does not connect to DuckDB.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config_path | str | Path to the YAML/JSON configuration file. | required |
Returns:
| Type | Description |
|---|---|
| str | A multi-statement SQL script containing CREATE OR REPLACE VIEW statements for all configured views. |
Raises:
| Type | Description |
|---|---|
| ConfigError | If the configuration file is invalid. |
Example
from duckalog import generate_sql
sql = generate_sql("catalog.yaml")
print("CREATE VIEW" in sql)  # True
generate_view_sql(view)¶
Generate a CREATE OR REPLACE VIEW statement for a single view.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| view | ViewConfig | The :class:ViewConfig to generate SQL for. | required |
Returns:
| Type | Description |
|---|---|
| str | A single SQL statement that creates or replaces the view. |
load_config(path, load_sql_files=True, sql_file_loader=None, resolve_paths=True, filesystem=None)¶
Load, interpolate, and validate a Duckalog configuration file.
This helper is the main entry point for turning a YAML or JSON file into a
validated :class:Config instance. It applies environment-variable
interpolation and enforces the configuration schema.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | str | Path to a YAML or JSON config file, or a remote URI. | required |
| load_sql_files | bool | Whether to load and process SQL from external files. If False, SQL file references are left as-is for later processing. | True |
| sql_file_loader | Optional[Any] | Optional SQLFileLoader instance for loading SQL files. If None, a default loader will be created. | None |
| resolve_paths | bool | Whether to resolve relative paths to absolute paths. If True, relative paths in view URIs and attachment paths will be resolved relative to the config file's directory. For remote configs, this defaults to False. | True |
| filesystem | Optional[Any] | Optional fsspec filesystem object to use for remote operations. If provided, this filesystem will be used instead of creating a new one based on URI scheme. Useful for custom authentication or advanced use cases. | None |
Returns:
| Type | Description |
|---|---|
| Any | A validated :class:Config instance. |
Raises:
| Type | Description |
|---|---|
| ConfigError | If the file cannot be read, is not valid YAML/JSON, fails schema validation, or contains unresolved ${env:VAR} placeholders. |
Example
Load a catalog from catalog.yaml::
from duckalog import load_config
config = load_config("catalog.yaml")
print(len(config.views))
Load a catalog from S3::
config = load_config("s3://my-bucket/configs/catalog.yaml")
print(len(config.views))
Load a catalog with custom filesystem::
import fsspec
fs = fsspec.filesystem("s3", key="key", secret="secret", anon=False)
config = load_config("s3://my-bucket/configs/catalog.yaml", filesystem=fs)
print(len(config.views))
quote_ident(value)¶
Quote a SQL identifier using double quotes.
This helper wraps a string in double quotes and escapes any embedded double quotes according to SQL rules.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| value | str | Identifier to quote (for example, a view or column name). | required |
Returns:
| Type | Description |
|---|---|
| str | The identifier wrapped in double quotes. |
Example
quote_ident("events")  # Returns: '"events"'
quote_literal(value)¶
Quote a SQL string literal using single quotes.
This helper wraps a string in single quotes and escapes any embedded single quotes according to SQL rules.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| value | str | String literal to quote (for example, a file path, secret, or connection string). | required |
Returns:
| Type | Description |
|---|---|
| str | The string wrapped in single quotes with proper escaping. |
Example
quote_literal("path/to/file.parquet")  # Returns: "'path/to/file.parquet'"
quote_literal("user's data")  # Returns: "'user''s data'"
render_options(options)¶
Render a mapping of options into scan-function arguments.
The resulting string is suitable for appending to a *_scan function
call. Keys are sorted alphabetically to keep output deterministic.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| options | dict[str, Any] | Mapping of option name to value (str, bool, int, or float). | required |
Returns:
| Type | Description |
|---|---|
| str | A string that starts with a comma separator followed by the rendered options, or an empty string if no options are provided. |
Raises:
| Type | Description |
|---|---|
| TypeError | If a value has a type that cannot be rendered safely. |
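Deterministic ordering plus per-type rendering can be sketched as follows. The exact quoting and separator conventions here are assumptions for illustration, not the library's verified output:

```python
from typing import Any

def render_scan_options(options: dict[str, Any]) -> str:
    parts = []
    for key in sorted(options):  # alphabetical keys keep output deterministic
        value = options[key]
        if isinstance(value, bool):  # check bool before int: bool subclasses int
            parts.append(f"{key}={'true' if value else 'false'}")
        elif isinstance(value, (int, float)):
            parts.append(f"{key}={value}")
        elif isinstance(value, str):
            parts.append(f"{key}='" + value.replace("'", "''") + "'")
        else:
            raise TypeError(f"Cannot render option {key!r} of type {type(value).__name__}")
    # Leading separator so the result can be appended directly to a *_scan call.
    return ", " + ", ".join(parts) if parts else ""
```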
validate_config(config_path)¶
Validate a configuration file without touching DuckDB.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config_path | str | Path to the YAML/JSON configuration file. | required |
Raises:
| Type | Description |
|---|---|
| ConfigError | If the configuration file is missing, malformed, or does not satisfy the schema and interpolation rules. |
Example
from duckalog import validate_config
validate_config("catalog.yaml")  # raises on invalid config
validate_generated_config(content, format='yaml')¶
Validate that generated configuration content can be loaded successfully.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| content | str | Configuration content as string. | required |
| format | ConfigFormat | Format of the content ('yaml' or 'json'). | 'yaml' |
Raises:
| Type | Description |
|---|---|
| ConfigError | If the configuration cannot be loaded or is invalid. |