# API Reference

Welcome to the `flowerpower-io` API reference documentation. This section provides detailed information about all public classes, functions, and methods available in the library.

## Overview

The `flowerpower-io` library provides a unified interface for reading and writing data across a variety of sources and formats. The API is organized into several modules:
- Base Classes - Core classes for file and database operations
- Metadata Functions - Functions for extracting metadata from data sources
- Loader Classes - Classes for reading data from various sources
- Saver Classes - Classes for writing data to various destinations
## Quick Navigation

### Base Classes

The base classes form the foundation of the library and provide common functionality for all I/O operations; a short sketch of the shared interface follows the list below.
- `BaseFileIO` - Base class for file I/O operations
- `BaseFileReader` - Base class for file reading operations
- `BaseDatasetReader` - Base class for dataset reading operations
- `BaseFileWriter` - Base class for file writing operations
- `BaseDatasetWriter` - Base class for dataset writing operations
- `BaseDatabaseIO` - Base class for database operations
- `BaseDatabaseReader` - Base class for database reading operations
- `BaseDatabaseWriter` - Base class for database writing operations
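Because every concrete loader inherits from the same reader base classes, swapping one file format for another usually means changing only the class you instantiate. A minimal sketch of that pattern, relying only on the constructors and the `to_polars()` method shown in the usage examples below:

```python
from flowerpower_io import CSVLoader, ParquetLoader

# Both loaders derive from BaseFileReader, so they expose the same
# conversion methods; only the class you construct changes.
def load_any(path: str):
    loader = CSVLoader(path) if path.endswith(".csv") else ParquetLoader(path)
    return loader.to_polars()

df = load_any("data.csv")
```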
### Metadata Functions
Metadata functions help you understand the structure and properties of your data before processing it.
- `get_dataframe_metadata` - Extract metadata from DataFrames
- `get_pyarrow_table_metadata` - Extract metadata from PyArrow Tables
- `get_pyarrow_dataset_metadata` - Extract metadata from PyArrow Datasets
- `get_duckdb_relation_metadata` - Extract metadata from DuckDB relations
- `get_datafusion_relation_metadata` - Extract metadata from DataFusion relations
- `get_file_metadata` - Extract metadata from files
- `get_database_metadata` - Extract metadata from database tables
- `get_metadata` - Generic metadata extraction function
### Loader Classes

Loader classes provide specialized functionality for reading data from various sources; a minimal loading sketch follows the lists below.

#### File Loaders
- `CSVLoader` - Load data from CSV files
- `ParquetLoader` - Load data from Parquet files
- `JSONLoader` - Load data from JSON files
- `DeltaTableLoader` - Load data from Delta Lake tables
- `PydalaLoader` - Load data from Pydala datasets
- `MQTTLoader` - Load data from MQTT messages
#### Database Loaders
- `SQLiteLoader` - Load data from SQLite databases
- `DuckDBLoader` - Load data from DuckDB databases
- `PostgreSQLLoader` - Load data from PostgreSQL databases
- `MySQLLoader` - Load data from MySQL databases
- `MSSQLLoader` - Load data from Microsoft SQL Server databases
- `OracleLoader` - Load data from Oracle databases
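For a quick local test, the embedded engines need only a file path. A hedged sketch, assuming `SQLiteLoader` mirrors the `path`/`table_name` parameters that `SQLiteSaver` takes in the usage examples below:

```python
from flowerpower_io import SQLiteLoader

# Assumed parameters, mirroring SQLiteSaver's path/table_name;
# check the SQLiteLoader reference for the exact signature.
loader = SQLiteLoader(
    path="database.db",
    table_name="users"
)
df = loader.to_polars()
```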
### Saver Classes

Saver classes provide specialized functionality for writing data to various destinations; a short saving sketch follows the lists below.

#### File Savers
- `CSVSaver` - Save data to CSV files
- `ParquetSaver` - Save data to Parquet files
- `JSONSaver` - Save data to JSON files
- `DeltaTableSaver` - Save data to Delta Lake tables
- `PydalaSaver` - Save data to Pydala datasets
- `MQTTSaver` - Save data to MQTT messages
#### Database Savers
- `SQLiteSaver` - Save data to SQLite databases
- `DuckDBSaver` - Save data to DuckDB databases
- `PostgreSQLSaver` - Save data to PostgreSQL databases
- `MySQLSaver` - Save data to MySQL databases
- `MSSQLSaver` - Save data to Microsoft SQL Server databases
- `OracleSaver` - Save data to Oracle databases
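File savers follow the same shape as the loaders: construct with a destination, then call `write()`. A short sketch, assuming `JSONSaver` takes a path the way `CSVSaver` and `ParquetSaver` do in the usage examples:

```python
from flowerpower_io import JSONSaver

# Assumed path-based constructor, by analogy with CSVSaver and
# ParquetSaver; verify against the JSONSaver reference.
saver = JSONSaver("output/data.json")
# df: a DataFrame produced by one of the loaders
saver.write(df)
```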
## Usage Examples

### Basic File Operations

```python
from flowerpower_io import CSVLoader, ParquetSaver

# Load data from CSV
loader = CSVLoader("data.csv")
df = loader.to_polars()

# Save data to Parquet
saver = ParquetSaver("output/")
saver.write(df)
```
### Database Operations

```python
from flowerpower_io import PostgreSQLLoader, SQLiteSaver

# Load from PostgreSQL
loader = PostgreSQLLoader(
    host="localhost",
    username="user",
    password="password",
    database="mydb",
    table_name="users"
)
df = loader.to_polars()

# Save to SQLite
saver = SQLiteSaver(
    path="database.db",
    table_name="users"
)
saver.write(df)
```
### Metadata Extraction

```python
from flowerpower_io.metadata import get_dataframe_metadata

# Get metadata from a DataFrame
metadata = get_dataframe_metadata(df)
print(metadata)
```
## Common Patterns

### Reading Multiple Files

```python
from flowerpower_io import ParquetLoader

# Load multiple Parquet files with a glob pattern
loader = ParquetLoader("data/*.parquet")
df = loader.to_polars()
```
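When you want the matched files kept separate rather than combined into one frame, the Performance Tips below mention a `concat` flag. A hedged sketch, assuming `concat` is a keyword argument of the conversion method; check the `ParquetLoader` reference for where the flag actually lives:

```python
from flowerpower_io import ParquetLoader

loader = ParquetLoader("data/*.parquet")

# Assumed keyword: with concat=False each matched file is returned
# separately instead of as one concatenated DataFrame.
frames = loader.to_polars(concat=False)
for frame in frames:
    print(frame.shape)
```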
### Writing with Partitioning

```python
from flowerpower_io import ParquetSaver

# Save with partitioning and zstd compression
saver = ParquetSaver(
    path="output/",
    partition_by="category",
    compression="zstd"
)
saver.write(df)
```
### Database Connection Management

```python
from flowerpower_io import PostgreSQLLoader

# Use the loader as a context manager so the connection is released
with PostgreSQLLoader(
    host="localhost",
    username="user",
    password="password",
    database="mydb",
    table_name="users"
) as loader:
    df = loader.to_polars()
```
## Error Handling

The library provides comprehensive error handling for various scenarios:

```python
from flowerpower_io import CSVLoader

try:
    loader = CSVLoader("nonexistent.csv")
    df = loader.to_polars()
except FileNotFoundError:
    print("File not found")
except Exception as e:
    print(f"Error: {e}")
```
## Performance Tips

- Use `opt_dtypes=True` for better memory efficiency
- Use `batch_size` for large datasets
- Use `concat=False` when working with multiple files separately
- Use appropriate compression for your data format
- Use partitioning for large datasets
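Putting the first two tips together, a hedged sketch that assumes `opt_dtypes` and `batch_size` are keyword arguments of the conversion methods; consult the loader reference for their exact location:

```python
from flowerpower_io import ParquetLoader

loader = ParquetLoader("data/*.parquet")

# Assumed keywords: opt_dtypes shrinks column dtypes where possible;
# batch_size bounds how much data is materialized at once.
df = loader.to_polars(opt_dtypes=True, batch_size=100_000)
```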