
API Reference

This section documents the core API functions and classes in Duckalog, with a focus on path resolution features.

Core Configuration API

load_config()

Load, interpolate, and validate a Duckalog configuration file with optional path resolution.

from duckalog import load_config

# Load with path resolution enabled (default)
config = load_config("catalog.yaml")

# Load with explicit path resolution control
config = load_config("catalog.yaml", resolve_paths=True)
config = load_config("catalog.yaml", resolve_paths=False)

# Load without processing SQL files (for validation)
config = load_config("catalog.yaml", load_sql_files=False)

Parameters:

- path (str): Path to YAML or JSON configuration file
- load_sql_files (bool, optional): Whether to load SQL from external files. Default: True
- sql_file_loader (SQLFileLoader, optional): Custom SQL file loader instance
- resolve_paths (bool, optional): Whether to resolve relative paths. Default: True

Returns: - Config: Validated configuration object

Raises: - ConfigError: Configuration parsing or validation errors (inherits from DuckalogError)

Path Resolution API

Path Detection Functions

is_relative_path()

Detect if a path is relative based on platform-specific rules.

from duckalog.path_resolution import is_relative_path

# Returns True for relative paths
assert is_relative_path("data/file.parquet")
assert is_relative_path("../shared/data.parquet")

# Returns False for absolute paths and URIs
assert not is_relative_path("/absolute/path/file.parquet")
assert not is_relative_path("s3://bucket/file.parquet")

Parameters: - path (str): Path string to check

Returns: - bool: True if path is relative, False otherwise

detect_path_type()

Categorize paths by type for processing.

from duckalog.path_resolution import detect_path_type

path_type = detect_path_type("../data/file.parquet")  # Returns: "relative"
path_type = detect_path_type("/absolute/file.parquet")  # Returns: "absolute"
path_type = detect_path_type("s3://bucket/file.parquet")  # Returns: "remote"

Parameters: - path (str): Path string to analyze

Returns: - str: One of "relative", "absolute", "remote", "invalid"

Path Resolution Functions

resolve_relative_path()

Resolve a relative path to an absolute path relative to configuration directory.

from duckalog.path_resolution import resolve_relative_path
from pathlib import Path

config_dir = Path("/project/config")
resolved = resolve_relative_path("data/file.parquet", config_dir)
# Returns: "/project/config/data/file.parquet"

Parameters:

- path (str): Path to resolve (relative or absolute)
- config_dir (Path): Directory containing the configuration file

Returns: - str: Resolved absolute path

Raises: - ValueError: If path resolution fails or violates security rules

Security Validation Functions

validate_path_security()

Validate that resolved paths don't violate security boundaries.

from duckalog.path_resolution import validate_path_security
from pathlib import Path

config_dir = Path("/project/config")

# Safe paths return True
assert validate_path_security("data/file.parquet", config_dir)
assert validate_path_security("../shared/data.parquet", config_dir)

# Dangerous paths return False
assert not validate_path_security("../../../etc/passwd", config_dir)

Parameters:

- path (str): Path to validate (will be resolved if relative)
- config_dir (Path): Directory containing the configuration file

Returns: - bool: True if path is safe, False otherwise

validate_file_accessibility()

Validate that a file path is accessible and readable.

from duckalog.path_resolution import validate_file_accessibility

accessible, error = validate_file_accessibility("/path/to/file.parquet")

if accessible:
    print("File is accessible")
else:
    print(f"File access error: {error}")

Parameters: - path (str): File path to validate

Returns: - tuple[bool, str | None]: (is_accessible, error_message)

Path Normalization Functions

normalize_path_for_sql()

Normalize a path for use in SQL statements.

from duckalog.path_resolution import normalize_path_for_sql

sql_path = normalize_path_for_sql("/path/to/file.parquet")
# Returns: "'/path/to/file.parquet'"

sql_path = normalize_path_for_sql("/path/file's_data.parquet")  
# Returns: "'/path/file''s_data.parquet'" (properly escaped)

Parameters: - path (str): Absolute path to normalize

Returns: - str: SQL-safe quoted path

Root-Based Path Security Functions

is_within_allowed_roots()

Check if a resolved path is within any of the allowed root directories using robust cross-platform validation.

from duckalog.path_resolution import is_within_allowed_roots
from pathlib import Path

config_dir = Path("/project/config")
allowed_roots = [config_dir]

# Check if a path is within allowed roots
is_safe = is_within_allowed_roots("/project/config/data/file.parquet", allowed_roots)
# Returns: True

is_unsafe = is_within_allowed_roots("/etc/passwd", allowed_roots)  
# Returns: False

is_traversal_blocked = is_within_allowed_roots("../../../etc/passwd", allowed_roots)
# Returns: False (path traversal blocked)

Parameters:

- candidate_path (str): The path to check (will be resolved to absolute)
- allowed_roots (list[Path]): List of Path objects representing allowed root directories

Returns: - bool: True if the candidate path is within at least one allowed root, False otherwise

Raises: - ValueError: If the candidate path cannot be resolved (invalid path)

Security Features:

- Uses Path.resolve() to follow symlinks and get canonical paths
- Uses os.path.commonpath() for robust cross-platform path comparison
- Prevents all forms of path traversal attacks regardless of encoding or separators
- Handles Windows drive letters and UNC paths correctly
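The checks listed above can be sketched in pure Python. This is an illustrative reimplementation of the described contract, not the library's code:

```python
import os
from pathlib import Path


def is_within_allowed_roots_sketch(candidate_path: str, allowed_roots: list[Path]) -> bool:
    """Illustrative sketch: resolve the candidate and test containment per root."""
    # Path.resolve() follows symlinks and collapses ".." segments into a canonical path.
    resolved = Path(candidate_path).resolve()
    for root in allowed_roots:
        root_resolved = root.resolve()
        try:
            # commonpath compares normalized components, which handles separators
            # and Windows drive letters more robustly than string prefix checks.
            if os.path.commonpath([resolved, root_resolved]) == str(root_resolved):
                return True
        except ValueError:
            # Raised for paths on different Windows drives; not within this root.
            continue
    return False
```

The real function adds further validation (and raises ValueError for unresolvable paths), but the resolve-then-commonpath pattern is the core of the containment test.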

SQL Generation API

SQL Quoting Functions

quote_ident()

Quote a SQL identifier using double quotes with proper escaping.

from duckalog import quote_ident

# Quote simple identifiers
assert quote_ident("events") == '"events"'

# Quote identifiers with spaces
assert quote_ident("daily users") == '"daily users"'

# Quote identifiers with quotes (escapes embedded quotes)
assert quote_ident('user "events"') == '"user ""events"""'

Parameters: - identifier (str): Identifier to quote

Returns: - str: Identifier wrapped in double quotes with proper escaping

quote_literal()

Quote a SQL string literal using single quotes with proper escaping.

from duckalog import quote_literal

# Quote simple strings
assert quote_literal("user's data") == "'user''s data'"

# Quote file paths
assert quote_literal("path/to/file.parquet") == "'path/to/file.parquet'"

# Quote SQL statements
assert quote_literal("SELECT * FROM table") == "'SELECT * FROM table'"

# Quote empty strings
assert quote_literal("") == "''"

Parameters: - value (str): String literal to quote

Returns: - str: String wrapped in single quotes with proper escaping

generate_view_sql()

Generate a CREATE OR REPLACE VIEW statement for a view configuration.

from duckalog import ViewConfig, generate_view_sql

view = ViewConfig(
    name="test_view",
    source="duckdb",
    database="my_db",
    table="users"
)

sql = generate_view_sql(view)
# Returns: CREATE OR REPLACE VIEW "test_view" AS SELECT * FROM "my_db"."users";

Parameters: - view (ViewConfig): View configuration to generate SQL for

Returns: - str: SQL statement creating the view with proper quoting

generate_secret_sql()

Generate CREATE SECRET statement for a secret configuration with proper escaping.

from duckalog import SecretConfig, generate_secret_sql

secret = SecretConfig(
    type="s3",
    name="prod_s3",
    provider="config",
    key_id="user's key",  # Contains single quote
    secret="secret'with'quotes"
)

sql = generate_secret_sql(secret)
# Returns: CREATE SECRET prod_s3 (TYPE S3, KEY_ID 'user''s key', SECRET 'secret''with''quotes')

Parameters: - secret (SecretConfig): Secret configuration to generate SQL for

Returns: - str: SQL statement creating the secret with proper escaping

Note: This function enforces strict type checking for secret options and will raise TypeError for unsupported types (non-bool, int, float, or str values).
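The strict type checking described in the note can be sketched as follows. This is an illustrative stand-alone function, not the library's implementation; the key ordering detail is that bool must be tested before int:

```python
def render_secret_option_sketch(key: str, value: object) -> str:
    """Illustrative sketch of strict typing for secret option values."""
    # bool must be checked before int, because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return f"{key.upper()} {'TRUE' if value else 'FALSE'}"
    if isinstance(value, (int, float)):
        return f"{key.upper()} {value}"
    if isinstance(value, str):
        # Escape embedded single quotes for SQL string literals.
        escaped = value.replace("'", "''")
        return f"{key.upper()} '{escaped}'"
    raise TypeError(f"Unsupported secret option type for {key!r}: {type(value).__name__}")
```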

Configuration Models

Config

Top-level configuration model with path resolution integration.

from duckalog import load_config

config = load_config("catalog.yaml")

# Access configuration properties
print(config.version)
print(len(config.views))

# Views have resolved paths when resolution is enabled
for view in config.views:
    print(f"View: {view.name}, URI: {view.uri}")

ViewConfig

Model for individual view definitions that may contain resolved paths.

from duckalog.config import ViewConfig

# View with relative path (will be resolved during config loading)
view = ViewConfig(
    name="events",
    source="parquet", 
    uri="data/events.parquet"  # Relative path
)

# View with absolute path (unchanged during resolution)
view = ViewConfig(
    name="remote_data",
    source="parquet",
    uri="s3://bucket/events.parquet"  # Remote URI - unchanged
)

AttachmentConfig

Models for database attachments with path resolution support.

from duckalog.config import DuckDBAttachment, SQLiteAttachment

# DuckDB attachment with relative path
duckdb_attach = DuckDBAttachment(
    alias="reference_db",
    path="./databases/reference.duckdb",  # Will be resolved
    read_only=True
)

# SQLite attachment with relative path
sqlite_attach = SQLiteAttachment(
    alias="legacy_db", 
    path="../legacy/system.db"  # Will be resolved
)

Exception Classes

Duckalog uses a consistent exception hierarchy based on DuckalogError as the base class for all library exceptions. This provides a unified error handling interface and makes it easy to catch all Duckalog-specific errors.

Base Exception

DuckalogError

Base exception for all Duckalog-specific errors. You can catch this class to handle any Duckalog-related error, or catch more specific subclasses for targeted error handling.

from duckalog import DuckalogError

try:
    config = load_config("catalog.yaml")
    build_catalog(config)
except DuckalogError as e:
    # Handle any Duckalog-specific error
    print(f"Duckalog error: {e}")

Configuration Errors

ConfigError

Raised for configuration-related errors including parsing, validation, and path resolution failures. Inherits from DuckalogError.

from duckalog import load_config, ConfigError

try:
    config = load_config("catalog.yaml")
except ConfigError as e:
    print(f"Configuration error: {e}")

Common scenarios:

- Missing required configuration fields
- Invalid YAML/JSON syntax
- Unresolved environment variable placeholders (${env:VAR})
- Invalid view definitions

PathResolutionError

Raised specifically when path resolution fails due to security or access issues. Inherits from ConfigError.

from duckalog import PathResolutionError, resolve_relative_path
from pathlib import Path

try:
    resolved = resolve_relative_path("../../../etc/passwd", Path("/project/config"))
except PathResolutionError as e:
    print(f"Path resolution failed: {e}")
    print(f"Original path: {e.original_path}")
    print(f"Resolved path: {e.resolved_path}")

RemoteConfigError

Raised when remote configuration loading fails. Inherits from ConfigError.

from duckalog import RemoteConfigError, load_config_from_uri

try:
    config = load_config_from_uri("s3://bucket/config.yaml")
except RemoteConfigError as e:
    print(f"Remote config error: {e}")

SQL File Errors

SQLFileError

Base exception for SQL file-related errors. Inherits from DuckalogError.

from duckalog import SQLFileError

try:
    # Operations that load SQL from files
    pass
except SQLFileError as e:
    print(f"SQL file error: {e}")

Subclasses for specific SQL file errors:

- SQLFileNotFoundError: Referenced SQL file does not exist
- SQLFilePermissionError: SQL file cannot be read due to permissions
- SQLFileEncodingError: SQL file has invalid encoding
- SQLFileSizeError: SQL file exceeds size limits
- SQLTemplateError: Template processing fails

Engine Errors

EngineError

Raised during catalog builds when DuckDB operations fail. Inherits from DuckalogError.

from duckalog import EngineError, build_catalog

try:
    build_catalog("catalog.yaml")
except EngineError as e:
    print(f"Engine error: {e}")

Common scenarios:

- DuckDB connection failures
- SQL execution errors
- Attachment setup failures
- Extension installation/loading errors
- Secret creation failures

Exception Chaining

All Duckalog exceptions support proper exception chaining to preserve the original error context:

from duckalog import EngineError

try:
    # Some operation that fails
    raise ValueError("Original database connection failed")
except ValueError as exc:
    raise EngineError("Failed to connect to DuckDB") from exc

This preserves the full traceback and makes debugging much easier.
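Because the chain uses `raise ... from`, the original exception remains reachable through `__cause__` (standard Python semantics). The stand-in classes below are for illustration only, not duckalog's own definitions:

```python
class DuckalogError(Exception):
    """Stand-in for duckalog's base exception (illustration only)."""


class EngineError(DuckalogError):
    """Stand-in for duckalog's EngineError (illustration only)."""


def connect():
    try:
        raise ValueError("Original database connection failed")
    except ValueError as exc:
        # Chain the low-level error so the full traceback is preserved.
        raise EngineError("Failed to connect to DuckDB") from exc


try:
    connect()
except EngineError as err:
    # The original ValueError travels along as the cause.
    assert isinstance(err.__cause__, ValueError)
```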

Generated API Documentation

This section is generated from the Duckalog source code using mkdocstrings.

duckalog

Duckalog public API.

AttachmentsConfig

Bases: BaseModel

Collection of attachment configurations.

Attributes:

Name Type Description
duckdb list[DuckDBAttachment]

DuckDB attachment entries.

sqlite list[SQLiteAttachment]

SQLite attachment entries.

postgres list[PostgresAttachment]

Postgres attachment entries.

duckalog list[DuckalogAttachment]

Duckalog config attachment entries.

Config

Bases: BaseModel

Top-level Duckalog configuration.

Attributes:

Name Type Description
version int

Positive integer describing the config schema version.

duckdb DuckDBConfig

DuckDB session and connection settings.

views list[ViewConfig]

List of view definitions to create in the catalog.

attachments AttachmentsConfig

Optional attachments to external databases.

iceberg_catalogs list[IcebergCatalogConfig]

Optional Iceberg catalog definitions.

semantic_models list[SemanticModelConfig]

Optional semantic model definitions for business metadata.

ConfigError

Bases: DuckalogError

Configuration-related errors.

This exception is raised when a catalog configuration cannot be read, parsed, interpolated, or validated according to the Duckalog schema.

Typical error conditions include:

  • The config file does not exist or cannot be read.
  • The file is not valid YAML/JSON.
  • Required fields are missing or invalid.
  • An environment variable placeholder cannot be resolved.

DuckDBAttachment

Bases: BaseModel

Configuration for attaching another DuckDB database.

Attributes:

Name Type Description
alias str

Alias under which the database will be attached.

path str

Filesystem path to the DuckDB database file.

read_only bool

Whether the attachment should be opened in read-only mode. Defaults to True for safety.

DuckDBConfig

Bases: BaseModel

DuckDB connection and session settings.

Attributes:

Name Type Description
database str

Path to the DuckDB database file. Defaults to ":memory:".

install_extensions list[str]

Names of extensions to install before use.

load_extensions list[str]

Names of extensions to load in the session.

pragmas list[str]

SQL statements (typically SET pragmas) executed after connecting and loading extensions.

settings Optional[Union[str, list[str]]]

DuckDB SET statements executed after pragmas. Can be a single string or list of strings.

secrets list[SecretConfig]

List of secret definitions for external services and databases.
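A configuration fragment exercising these fields might look like the following. This is illustrative only; the field names follow the attribute list above, and the specific extension and setting values are placeholders:

```yaml
duckdb:
  database: ":memory:"
  install_extensions: [httpfs]
  load_extensions: [httpfs]
  pragmas:
    - "SET memory_limit='2GB'"
  settings: "SET threads=4"
  secrets:
    - type: s3
      provider: credential_chain
```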

EngineError

Bases: DuckalogError

Engine-level error raised during catalog builds.

This exception wraps lower-level DuckDB errors, such as failures to connect to the database, attach external systems, or execute generated SQL statements.

IcebergCatalogConfig

Bases: BaseModel

Configuration for an Iceberg catalog.

Attributes:

Name Type Description
name str

Catalog name referenced by Iceberg views.

catalog_type str

Backend type (for example, rest, hive, glue).

uri Optional[str]

Optional URI used by certain catalog types.

warehouse Optional[str]

Optional warehouse location for catalog data.

options dict[str, Any]

Additional catalog-specific options.

PostgresAttachment

Bases: BaseModel

Configuration for attaching a Postgres database.

Attributes:

Name Type Description
alias str

Alias used inside DuckDB to reference the Postgres database.

host str

Hostname or IP address of the Postgres server.

port int

TCP port of the Postgres server.

database str

Database name to connect to.

user str

Username for authentication.

password str

Password for authentication.

sslmode Optional[str]

Optional SSL mode (for example, require).

options dict[str, Any]

Extra key/value options passed to the attachment clause.

SQLiteAttachment

Bases: BaseModel

Configuration for attaching a SQLite database.

Attributes:

Name Type Description
alias str

Alias under which the SQLite database will be attached.

path str

Filesystem path to the SQLite .db file.

SecretConfig

Bases: BaseModel

Configuration for a DuckDB secret.

Attributes:

Name Type Description
type SecretType

Secret type (s3, azure, gcs, http, postgres, mysql).

name Optional[str]

Optional name for the secret (defaults to type if not provided).

provider SecretProvider

Secret provider (config or credential_chain).

persistent bool

Whether to create a persistent secret. Defaults to False.

scope Optional[str]

Optional scope prefix for the secret.

key_id Optional[str]

Access key ID or username for authentication.

secret Optional[str]

Secret key or password for authentication.

region Optional[str]

Geographic region for cloud services.

endpoint Optional[str]

Custom endpoint URL for cloud services.

connection_string Optional[str]

Full connection string for databases.

tenant_id Optional[str]

Azure tenant ID for authentication.

account_name Optional[str]

Azure storage account name.

client_id Optional[str]

Azure client ID for authentication.

client_secret Optional[str]

Azure client secret for authentication.

service_account_key Optional[str]

GCS service account key.

json_key Optional[str]

GCS JSON key.

bearer_token Optional[str]

HTTP bearer token for authentication.

header Optional[str]

HTTP header for authentication.

database Optional[str]

Database name for database secrets.

host Optional[str]

Database host for database secrets.

port Optional[int]

Database port for database secrets.

user Optional[str]

Database username (alternative to key_id for database types).

password Optional[str]

Database password (alternative to secret for database types).

options dict[str, Any]

Additional key-value options for the secret.

ViewConfig

Bases: BaseModel

Definition of a single catalog view.

A view can be defined in several ways:

1. Inline SQL: Using the sql field with raw SQL text
2. SQL File: Using sql_file to reference external SQL files
3. SQL Template: Using sql_template for parameterized SQL files
4. Data Source: Using source + required fields for direct data access
5. Source + SQL: Using source for data access plus sql for transformations

For data sources, the required fields depend on the source type:

- Parquet/Delta: uri field is required
- Iceberg: Either uri OR both catalog and table
- DuckDB/SQLite/Postgres: Both database and table are required

When using SQL with a data source, the SQL will be applied as a transformation over the data from the specified source.

Additional metadata fields such as description and tags do not affect SQL generation but are preserved for documentation and tooling.
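As a rough sketch, the definition styles above might appear in a configuration like this. The fragment is illustrative: field names follow the attribute list below, and the exact shape of SQLFileReference entries may differ:

```yaml
views:
  - name: inline_sql_view          # 1. inline SQL
    sql: "SELECT 1 AS x"
  - name: file_backed_view         # 2. external SQL file
    sql_file: sql/report.sql
  - name: parquet_view             # 3. direct data source
    source: parquet
    uri: data/events.parquet
  - name: attached_view            # 4. attached-database source
    source: duckdb
    database: reference_db
    table: users
  - name: filtered_view            # 5. source plus SQL transformation
    source: parquet
    uri: data/events.parquet
    sql: "SELECT * FROM events WHERE status = 'active'"
```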

Attributes:

Name Type Description
name str

Unique view name within the config.

sql Optional[str]

Raw SQL text defining the view body.

sql_file Optional[SQLFileReference]

Direct reference to a SQL file.

sql_template Optional[SQLFileReference]

Reference to a SQL template file with variable substitution.

source Optional[EnvSource]

Source type (e.g. "parquet", "iceberg", "duckdb").

uri Optional[str]

URI for file- or table-based sources (Parquet/Delta/Iceberg).

database Optional[str]

Attachment alias for attached-database sources.

table Optional[str]

Table name (optionally schema-qualified) for attached sources.

catalog Optional[str]

Iceberg catalog name for catalog-based Iceberg views.

options dict[str, Any]

Source-specific options passed to scan functions.

description Optional[str]

Optional human-readable description of the view.

tags list[str]

Optional list of tags for classification.

build_catalog(config_path, db_path=None, dry_run=False, verbose=False, filesystem=None, include_secrets=True)

Build or update a DuckDB catalog from a configuration file.

This function is the high-level entry point used by both the CLI and Python API. It loads the config, optionally performs a dry-run SQL generation, or otherwise connects to DuckDB, sets up attachments and Iceberg catalogs, and creates or replaces configured views.

Parameters:

Name Type Description Default
config_path str

Path to the YAML/JSON configuration file.

required
db_path str | None

Optional override for duckdb.database in the config. Can be a local path or remote URI (s3://, gs://, gcs://, abfs://, adl://, sftp://).

None
dry_run bool

If True, do not connect to DuckDB; instead generate and return the full SQL script for all views.

False
verbose bool

If True, enable more verbose logging via the standard logging module.

False
filesystem Any | None

Optional pre-configured fsspec filesystem object for remote export authentication. If not provided, default authentication will be used.

None
include_secrets bool

Whether to include CREATE SECRET statements when building the catalog.

True

Returns:

Type Description
str | None

The generated SQL script as a string when dry_run is True, otherwise None when the catalog is applied to DuckDB.

Raises:

Type Description
ConfigError

If the configuration file is invalid.

EngineError

If connecting to DuckDB or executing SQL fails, or if remote export fails.

Example

Build a catalog in-place::

from duckalog import build_catalog

build_catalog("catalog.yaml")

Build and export to remote storage::

build_catalog("catalog.yaml", db_path="s3://my-bucket/catalog.duckdb")

Generate SQL without modifying the database::

sql = build_catalog("catalog.yaml", dry_run=True)
print(sql)

connect_and_build_catalog(config_path, database_path=None, dry_run=False, verbose=False, read_only=False, **kwargs)

Build a catalog and create a DuckDB connection in one operation.

This function combines catalog building with connection creation, providing a streamlined workflow for users who want to start working with their catalog immediately after creating it.

Parameters:

Name Type Description Default
config_path str

Path to the YAML/JSON configuration file.

required
database_path str | None

Optional database path override.

None
dry_run bool

If True, only validates configuration and returns SQL. If False, builds the catalog and creates a connection.

False
verbose bool

Enable verbose logging during build process.

False
read_only bool

Open the resulting connection in read-only mode.

False
**kwargs Any

Additional keyword arguments.

{}

Returns:

Type Description
DuckDBPyConnection | str | None

A DuckDB connection object for immediate use, or SQL string when dry_run=True.

Raises:

Type Description
ConfigError

If the configuration file is invalid.

EngineError

If catalog building or connection fails.

FileNotFoundError

If the resulting database file doesn't exist (after build).

Example

Build catalog and start querying immediately::

conn = connect_and_build_catalog("catalog.yaml")
data = conn.execute("SELECT * FROM important_table").fetchall()
print(f"Found {len(data)} records")
conn.close()

Validate only (dry run)::

sql = connect_and_build_catalog("catalog.yaml", dry_run=True)
print("SQL generation completed, no database created")

Custom database path::

conn = connect_and_build_catalog("catalog.yaml", database_path="analytics.db")
print("Connected to custom database: analytics.db")

If dry_run=True, the function only validates the configuration and returns the SQL script without creating any database files or connections.

connect_to_catalog(config_path, database_path=None, read_only=False)

Connect to an existing DuckDB database created by Duckalog.

This function provides a direct connection to a DuckDB database that was previously created using Duckalog. It simplifies the workflow for users who just want to start querying their catalog database.

Parameters:

Name Type Description Default
config_path str

Path to the YAML/JSON configuration file for an existing catalog.

required
database_path str | None

Optional database path override. If not provided, uses the path from the configuration.

None
read_only bool

Open the connection in read-only mode for safety.

False

Returns:

Type Description
DuckDBPyConnection

An active DuckDB connection object ready for query execution.

Raises:

Type Description
ConfigError

If the configuration file is invalid.

FileNotFoundError

If the specified database file doesn't exist.

Error

If connection or queries fail.

Example

Connect to an existing catalog::

from duckalog import connect_to_catalog
conn = connect_to_catalog("catalog.yaml")

# Use the connection for queries
result = conn.execute("SELECT * FROM some_table").fetchall()
conn.close()

With path override::

conn = connect_to_catalog("catalog.yaml", database_path="analytics.db")

With read-only mode::

conn = connect_to_catalog("catalog.yaml", read_only=True)

With context manager::

from duckalog import connect_to_catalog_cm
with connect_to_catalog_cm("catalog.yaml") as conn:
    data = conn.execute("SELECT * FROM users").fetchall()
    print(f"Found {len(data)} records")
# Connection automatically closed here

connect_to_catalog_cm(config_path, database_path=None, read_only=False)

Context manager version of connect_to_catalog for automatic connection cleanup.

Usage::

from duckalog import connect_to_catalog_cm
with connect_to_catalog_cm("catalog.yaml") as conn:
    data = conn.execute("SELECT * FROM users").fetchall()
    print(f"Found {len(data)} records")
# Connection is automatically closed here

Parameters:

Name Type Description Default
config_path str

Path to the YAML/JSON configuration file for an existing catalog.

required
database_path str | None

Optional database path override.

None
read_only bool

Open the connection in read-only mode for safety.

False

Yields:

Type Description
Generator[DuckDBPyConnection]

An active DuckDB connection that will be closed automatically.

Raises:

Type Description
ConfigError

If the configuration file is invalid.

FileNotFoundError

If the specified database file doesn't exist.

Error

If connection or queries fail.

create_config_template(format='yaml', output_path=None, database_name='analytics_catalog.duckdb', project_name='my_analytics_project')

Generate a basic, valid Duckalog configuration template.

This function creates a configuration template with sensible defaults and educational example content that demonstrates key Duckalog features.

Parameters:

Name Type Description Default
format ConfigFormat

Output format for the configuration ('yaml' or 'json'). Defaults to 'yaml'.

'yaml'
output_path str | None

Optional path to write the configuration file. If provided, the template is written to this path and the content is also returned as a string.

None
database_name str

Name for the DuckDB database file.

'analytics_catalog.duckdb'
project_name str

Name used in comments to personalize the template.

'my_analytics_project'

Returns:

Type Description
str

The generated configuration as a string.

Raises:

Type Description
ValueError

If format is not 'yaml' or 'json'.

ConfigError

If the generated template fails validation.

OSError

If writing to output_path fails.

Example

Generate a YAML template::

template = create_config_template(format='yaml')
print(template)

Generate and save a JSON template::

template = create_config_template(
    format='json',
    output_path='my_config.json'
)

generate_all_views_sql(config, include_secrets=False)

Generate SQL for all views in a configuration.

The output includes a descriptive header with the config version followed by a CREATE OR REPLACE VIEW statement for each view in the order they appear in the configuration.

Parameters:

Name Type Description Default
config Config

The validated :class:Config instance to render.

required
include_secrets bool

Whether to include CREATE SECRET statements for secrets.

False

Returns:

Type Description
str

A multi-statement SQL script suitable for use as a catalog definition.

generate_secret_sql(secret)

Generate CREATE SECRET statement for a DuckDB secret.

Parameters:

Name Type Description Default
secret SecretConfig

Secret configuration object.

required

Returns:

Type Description
str

SQL CREATE SECRET statement.

generate_sql(config_path)

Generate a full SQL script from a config file.

This is a convenience wrapper around :func:load_config and :func:generate_all_views_sql that does not connect to DuckDB.

Parameters:

Name Type Description Default
config_path str

Path to the YAML/JSON configuration file.

required

Returns:

Type Description
str

A multi-statement SQL script containing CREATE OR REPLACE VIEW

str

statements for all configured views.

Raises:

Type Description
ConfigError

If the configuration file is invalid.

Example

from duckalog import generate_sql

sql = generate_sql("catalog.yaml")
print("CREATE VIEW" in sql)  # True

generate_view_sql(view)

Generate a CREATE OR REPLACE VIEW statement for a single view.

Parameters:

Name Type Description Default
view ViewConfig

The :class:ViewConfig to generate SQL for.

required

Returns:

Type Description
str

A single SQL statement that creates or replaces the view.

load_config(path, load_sql_files=True, sql_file_loader=None, resolve_paths=True, filesystem=None)

Load, interpolate, and validate a Duckalog configuration file.

This helper is the main entry point for turning a YAML or JSON file into a validated :class:Config instance. It applies environment-variable interpolation and enforces the configuration schema.

Parameters:

Name Type Description Default
path str

Path to a YAML or JSON config file, or a remote URI.

required
load_sql_files bool

Whether to load and process SQL from external files. If False, SQL file references are left as-is for later processing.

True
sql_file_loader Optional[Any]

Optional SQLFileLoader instance for loading SQL files. If None, a default loader will be created.

None
resolve_paths bool

Whether to resolve relative paths to absolute paths. If True, relative paths in view URIs and attachment paths will be resolved relative to the config file's directory. For remote configs, this defaults to False.

True
filesystem Optional[Any]

Optional fsspec filesystem object to use for remote operations. If provided, this filesystem will be used instead of creating a new one based on URI scheme. Useful for custom authentication or advanced use cases.

None

Returns:

Type Description
Any

A validated :class:Config object.

Raises:

Type Description
ConfigError

If the file cannot be read, is not valid YAML/JSON, fails schema validation, contains unresolved ${env:VAR_NAME} placeholders, or if SQL file loading fails.

Example

Load a catalog from catalog.yaml::

from duckalog import load_config

config = load_config("catalog.yaml")
print(len(config.views))

Load a catalog from S3::

config = load_config("s3://my-bucket/configs/catalog.yaml")
print(len(config.views))

Load a catalog with custom filesystem::

import fsspec
fs = fsspec.filesystem("s3", key="key", secret="secret", anon=False)
config = load_config("s3://my-bucket/configs/catalog.yaml", filesystem=fs)
print(len(config.views))

quote_ident(value)

Quote a SQL identifier using double quotes.

This helper wraps a string in double quotes and escapes any embedded double quotes according to SQL rules.

Parameters:

Name Type Description Default
value str

Identifier to quote (for example, a view or column name).

required

Returns:

Type Description
str

The identifier wrapped in double quotes.

Example

quote_ident("events")  # returns '"events"'

quote_literal(value)

Quote a SQL string literal using single quotes.

This helper wraps a string in single quotes and escapes any embedded single quotes according to SQL rules.

Parameters:

Name Type Description Default
value str

String literal to quote (for example, a file path, secret, or connection string).

required

Returns:

Type Description
str

The string wrapped in single quotes with proper escaping.

Example

quote_literal("path/to/file.parquet")  # returns "'path/to/file.parquet'"
quote_literal("user's data")  # returns "'user''s data'"

render_options(options)

Render a mapping of options into scan-function arguments.

The resulting string is suitable for appending to a *_scan function call. Keys are sorted alphabetically to keep output deterministic.

Parameters:

Name Type Description Default
options dict[str, Any]

Mapping of option name to value (str, bool, int, or float).

required

Returns:

Type Description
str

A string that starts with ", " when options are present (for example, ", hive_partitioning=TRUE") or an empty string when no options are provided.

Raises:

Type Description
TypeError

If a value has a type that cannot be rendered safely.
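The contract described above — sorted keys, a leading ", " only when options exist, and a TypeError for unrenderable values — can be sketched like this. It is an illustrative reimplementation, not the library's code:

```python
def render_options_sketch(options: dict) -> str:
    """Illustrative sketch of render_options' described contract."""
    if not options:
        return ""
    parts = []
    for key in sorted(options):  # sorted keys keep output deterministic
        value = options[key]
        if isinstance(value, bool):  # bool before int: bool subclasses int
            rendered = "TRUE" if value else "FALSE"
        elif isinstance(value, (int, float)):
            rendered = str(value)
        elif isinstance(value, str):
            rendered = "'" + value.replace("'", "''") + "'"
        else:
            raise TypeError(f"Cannot render option {key!r} of type {type(value).__name__}")
        parts.append(f"{key}={rendered}")
    return ", " + ", ".join(parts)
```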

validate_config(config_path)

Validate a configuration file without touching DuckDB.

Parameters:

Name Type Description Default
config_path str

Path to the YAML/JSON configuration file.

required

Raises:

Type Description
ConfigError

If the configuration file is missing, malformed, or does not satisfy the schema and interpolation rules.

Example

from duckalog import validate_config

validate_config("catalog.yaml")  # raises on invalid config

validate_generated_config(content, format='yaml')

Validate that generated configuration content can be loaded successfully.

Parameters:

Name Type Description Default
content str

Configuration content as string.

required
format ConfigFormat

Format of the content ('yaml' or 'json').

'yaml'

Raises:

Type Description
ConfigError

If the configuration cannot be loaded or is invalid.
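The parse-then-wrap behavior can be sketched as follows. This is an illustrative, stdlib-only stand-in: only a JSON branch and a minimal shape check are shown, whereas the real function also handles YAML and applies full schema validation, and the ConfigError class here is a stand-in:

```python
import json


class ConfigError(Exception):
    """Stand-in for duckalog's ConfigError (illustration only)."""


def validate_generated_config_sketch(content: str, format: str = "json") -> None:
    """Illustrative sketch: parse content and wrap failures in ConfigError."""
    if format != "json":
        raise ValueError("this sketch only handles 'json'")
    try:
        data = json.loads(content)
    except json.JSONDecodeError as exc:
        # Chain the parser error so the original context is preserved.
        raise ConfigError(f"Invalid JSON configuration: {exc}") from exc
    # Minimal structural check standing in for full schema validation.
    if not isinstance(data, dict) or "version" not in data:
        raise ConfigError("Configuration must be a mapping with a 'version' field")
```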