Initial commit

This commit is contained in:
Zhongwei Li
2025-11-29 18:00:18 +08:00
commit 765529cd13
69 changed files with 18291 additions and 0 deletions

View File

@@ -0,0 +1,54 @@
# Data Processing Patterns
> **Status**: 🚧 Placeholder - Content in development
## Overview
Patterns for data analysis, ETL, and processing using Polars, pandas, and other data libraries in UV single-file
scripts.
## Topics to Cover
- [ ] Polars patterns (recommended for performance)
- [ ] Pandas alternatives
- [ ] CSV/Excel processing
- [ ] JSON data manipulation
- [ ] Data validation and cleaning
- [ ] Aggregation and transformation
- [ ] Memory-efficient processing
## Quick Example
```python
#!/usr/bin/env -S uv run
# /// script
# requires-python = ">=3.11"
# dependencies = ["polars>=0.20.0"]
# ///
import polars as pl
def analyze_csv(file_path: str):
df = pl.read_csv(file_path)
# Basic analysis
summary = df.describe()
print(summary)
# Filter and aggregate
result = (
df.filter(pl.col("value") > 100)
.groupby("category")
.agg(pl.col("value").mean())
)
print(result)
```
## TODO
This file will be expanded to include:
- Complete Polars patterns
- Performance optimization techniques
- Large file processing strategies
- Data validation patterns
- Export formats and options