55 lines
1.1 KiB
Markdown
55 lines
1.1 KiB
Markdown
# Data Processing Patterns
|
|
|
|
> **Status**: 🚧 Placeholder - Content in development
|
|
|
|
## Overview
|
|
|
|
Patterns for data analysis, ETL, and processing using Polars, pandas, and other data libraries in UV single-file
|
|
scripts.
|
|
|
|
## Topics to Cover
|
|
|
|
- [ ] Polars patterns (recommended for performance)
|
|
- [ ] Pandas alternatives
|
|
- [ ] CSV/Excel processing
|
|
- [ ] JSON data manipulation
|
|
- [ ] Data validation and cleaning
|
|
- [ ] Aggregation and transformation
|
|
- [ ] Memory-efficient processing
|
|
|
|
## Quick Example
|
|
|
|
```python
|
|
#!/usr/bin/env -S uv run
|
|
# /// script
|
|
# requires-python = ">=3.11"
|
|
# dependencies = ["polars>=0.20.0"]
|
|
# ///
|
|
import polars as pl
|
|
|
|
def analyze_csv(file_path: str):
|
|
df = pl.read_csv(file_path)
|
|
|
|
# Basic analysis
|
|
summary = df.describe()
|
|
print(summary)
|
|
|
|
# Filter and aggregate
|
|
result = (
|
|
df.filter(pl.col("value") > 100)
|
|
.groupby("category")
|
|
.agg(pl.col("value").mean())
|
|
)
|
|
print(result)
|
|
```
|
|
|
|
## TODO
|
|
|
|
This file will be expanded to include:
|
|
|
|
- Complete Polars patterns
|
|
- Performance optimization techniques
|
|
- Large file processing strategies
|
|
- Data validation patterns
|
|
- Export formats and options
|