Advanced Usage¶
Welcome to the advanced usage guide for FlowerPower. This document covers more complex configurations and use cases to help you get the most out of the library.
See also:
- Guides → Compose Pipelines With Additional Modules
- Guides → Asynchronous Execution
Configuration Flexibility¶
FlowerPower offers multiple ways to configure your project, ensuring flexibility for different environments and workflows. Configuration is applied in this order (highest wins):
- Runtime kwargs / `RunConfig` (programmatic overrides at execution time)
- Environment overlays via `FP_PIPELINE__*` / `FP_PROJECT__*` variables
- YAML files after env interpolation (`${VAR}`, `${VAR:-default}`, etc.)
- Global env shims like `FP_LOG_LEVEL`, `FP_EXECUTOR`, etc. (applied only if more specific keys are not set)
- Code defaults (struct defaults)
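The precedence order can be pictured as layered dictionary merges, with higher-priority layers applied last. This is only an illustrative sketch of the idea, not FlowerPower's actual merge code; the key names are made up for the example.

```python
# Illustrative sketch of the precedence order above -- not FlowerPower's
# actual merge logic. Lower-priority layers are applied first, so later
# (higher-priority) layers overwrite earlier ones key by key.

def merge_layers(*layers):
    """Merge dicts left-to-right; later layers win on conflicts."""
    merged = {}
    for layer in layers:
        merged.update(layer)
    return merged

code_defaults  = {"log_level": "WARNING", "executor": "synchronous"}
global_shims   = {"log_level": "INFO"}        # e.g. FP_LOG_LEVEL
yaml_config    = {"executor": "threadpool"}   # pipeline YAML after interpolation
env_overlays   = {"log_level": "DEBUG"}       # e.g. FP_PIPELINE__RUN__LOG_LEVEL
runtime_kwargs = {"executor": "synchronous"}  # RunConfig / kwargs

effective = merge_layers(code_defaults, global_shims, yaml_config,
                         env_overlays, runtime_kwargs)
print(effective)  # {'log_level': 'DEBUG', 'executor': 'synchronous'}
```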
Programmatic Configuration (recommended)¶
Use `RunConfig` or keyword arguments at execution time to override YAML and environment settings.
Environment Variable Overlays¶
You can set typed, nested overrides using double-underscore paths:
- `FP_PIPELINE__RUN__LOG_LEVEL=DEBUG`
- `FP_PIPELINE__RUN__EXECUTOR__TYPE=threadpool`
- `FP_PROJECT__ADAPTER__HAMILTON_TRACKER__API_KEY=...`
Global shims still work and are applied only if pipeline/project-specific keys are not set:
- `FP_LOG_LEVEL=INFO`
- `FP_EXECUTOR=threadpool`, `FP_EXECUTOR_MAX_WORKERS=8`, `FP_EXECUTOR_NUM_CPUS=4`
- `FP_MAX_RETRIES=3`, `FP_RETRY_DELAY=2.0`, `FP_JITTER_FACTOR=0.2`
Values are strictly coerced (bool/int/float) and JSON is supported for objects/lists.
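The double-underscore path and coercion rules can be sketched in plain Python. This is an illustration of the behavior described above, not FlowerPower's own parser; the variable names are taken from the examples.

```python
import json

# Illustrative sketch of the overlay rules described above: double-underscore
# paths become nested keys, and values are strictly coerced to
# bool/int/float, with JSON accepted for objects and lists.
# This is not FlowerPower's actual implementation.

def coerce(raw):
    """Coerce an env-var string to bool/int/float/JSON where unambiguous."""
    low = raw.lower()
    if low in ("true", "false"):
        return low == "true"
    try:
        return int(raw)
    except ValueError:
        pass
    try:
        return float(raw)
    except ValueError:
        pass
    try:
        return json.loads(raw)  # objects/lists like '{"a": 1}' or '[1, 2]'
    except json.JSONDecodeError:
        return raw              # leave as a plain string

def overlay(env):
    """Turn FP_PIPELINE__A__B=... style vars into a nested dict."""
    tree = {}
    for key, raw in env.items():
        if not key.startswith("FP_PIPELINE__"):
            continue
        path = key[len("FP_PIPELINE__"):].lower().split("__")
        node = tree
        for part in path[:-1]:
            node = node.setdefault(part, {})
        node[path[-1]] = coerce(raw)
    return tree

env = {
    "FP_PIPELINE__RUN__LOG_LEVEL": "DEBUG",
    "FP_PIPELINE__RUN__EXECUTOR__MAX_WORKERS": "8",
    "FP_PIPELINE__RUN__WITH_ADAPTER": "true",
}
print(overlay(env))
# {'run': {'log_level': 'DEBUG', 'executor': {'max_workers': 8}, 'with_adapter': True}}
```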
YAML Environment Interpolation¶
YAML supports Docker Compose–style variable expansion inside values.
Supported forms: `${VAR}`, `${VAR:-default}` (default when unset or empty), `${VAR-default}` (default when unset), `${VAR:?err}` / `${VAR?err}` (required), and `$${...}`, which escapes `$`. If the expanded value is valid JSON, it becomes a typed object/list/number/bool/null.
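For example, a pipeline YAML might use the forms above like this. The keys and variable names here are illustrative, not a required schema:

```yaml
# conf/pipelines/my_pipeline.yml -- illustrative keys, not a fixed schema
run:
  log_level: ${FP_LOG_LEVEL:-INFO}       # default when unset or empty
  executor:
    type: ${EXECUTOR_TYPE-threadpool}    # default only when unset
    max_workers: ${WORKERS:-8}           # "8" is valid JSON, so it becomes an int
adapter:
  hamilton_tracker:
    api_key: ${TRACKER_API_KEY:?TRACKER_API_KEY must be set}  # fail if missing
```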
Direct Module Usage¶
For fine-grained control, you can work directly with `PipelineManager`.
PipelineManager¶
The `PipelineManager` is responsible for loading, validating, and executing data pipelines.
Hooks¶
Hooks allow you to inject custom logic at specific points in the pipeline lifecycle, such as pre-execution validation or post-execution logging.
Adding Hooks¶
Use the `add_hook` method on the `PipelineRegistry` to add hooks to your pipeline.
This appends a template function to the hook file. Customize the function in `hooks/my_pipeline/hook.py` to implement your logic, e.g. for MQTT config building.
Hooks are executed automatically during pipeline runs based on their type.
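A customized hook file might look roughly like the following. This is a hypothetical shape for the MQTT-config-building example: the actual template that `add_hook` generates defines the real function name and signature, so adapt your code to match it.

```python
# hooks/my_pipeline/hook.py -- hypothetical shape; the real template
# generated by `add_hook` may differ, so match its signature instead.

def mqtt_build_config(payload: dict) -> dict:
    """Example MQTT config-building hook: map an incoming MQTT message
    payload onto the run configuration the pipeline should use.
    All key names here are assumptions for illustration."""
    return {
        "inputs": {
            "device_id": payload.get("device_id", "unknown"),
            "reading": payload.get("value"),
        },
        "final_vars": ["processed_reading"],
    }

print(mqtt_build_config({"device_id": "sensor-1", "value": 21.5}))
```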
Adapters¶
Integrate with popular MLOps and observability tools using adapters.
- Hamilton Tracker: For dataflow and lineage tracking.
- MLflow: For experiment tracking.
- OpenTelemetry: For distributed tracing and metrics.
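Adapters are typically enabled in the project configuration. The fragment below is only a sketch of what such a section could look like; the exact keys belong to FlowerPower's schema, so verify them against your generated `project.yml` before relying on them:

```yaml
# project.yml -- illustrative adapter section; exact keys may differ
adapter:
  hamilton_tracker:
    api_key: ${TRACKER_API_KEY:?required}
  mlflow:
    tracking_uri: http://localhost:5000
  opentelemetry:
    endpoint: http://localhost:4317
```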
Filesystem Abstraction¶
FlowerPower uses the `fsspeckit` library to provide a unified interface for interacting with different filesystems, including local storage, S3, and GCS. This allows you to switch between storage backends without changing your code.
Security¶
FlowerPower includes built-in security features to prevent common vulnerabilities, such as directory traversal attacks. All file paths provided to configuration loaders and filesystem utilities are validated to ensure they are within the project's base directory.
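The core of such a check can be sketched with `pathlib`: resolve the candidate path and refuse anything that escapes the base directory. This is an illustration in the spirit of the validation described above, not FlowerPower's actual implementation.

```python
from pathlib import Path

# Illustrative directory-traversal check, in the spirit of the validation
# described above; not FlowerPower's actual code. Requires Python 3.9+
# for Path.is_relative_to.

def is_inside(base_dir: str, user_path: str) -> bool:
    """Return True only if user_path stays within base_dir after resolution."""
    base = Path(base_dir).resolve()
    candidate = (base / user_path).resolve()
    return candidate.is_relative_to(base)

print(is_inside("/tmp/project", "conf/pipelines/etl.yml"))  # True
print(is_inside("/tmp/project", "../../etc/passwd"))        # False
```

Resolving before comparing is what defeats `..` segments and symlink tricks; a naive string-prefix check on the raw path would not.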
Extensible I/O Plugins¶
The `flowerpower-io` plugin extends FlowerPower's I/O capabilities, allowing you to connect to various data sources and sinks through a simple plugin architecture.
Supported Types Include:
- CSV, JSON, Parquet
- DeltaTable
- DuckDB, PostgreSQL, MySQL, MSSQL, Oracle, SQLite
- MQTT
To use a plugin, simply specify its type in your pipeline configuration.
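For instance, a source/sink configuration could look roughly like this. The key names are purely illustrative; consult the `flowerpower-io` documentation for the actual schema:

```yaml
# Illustrative only -- check the flowerpower-io docs for the real schema
source:
  type: parquet
  path: data/input/events.parquet
sink:
  type: postgresql
  connection: ${POSTGRES_DSN:?set POSTGRES_DSN}
  table: processed_events
```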
Troubleshooting¶
Here are some common issues and how to resolve them:
- Redis Connection Error: Ensure your Redis server is running and accessible. Check the `redis.host` and `redis.port` settings in your configuration.
- Configuration Errors: Use the `flowerpower pipeline show-summary` command to inspect the loaded configuration and identify any misconfigurations.
- Module Not Found: Make sure your pipeline and task modules are on Python's path. You can add directories to the path using the `PYTHONPATH` environment variable.
Note
For more detailed information, refer to the API documentation.