Initial commit

Zhongwei Li
2025-11-29 18:20:43 +08:00
commit 07c3046c05
4 changed files with 274 additions and 0 deletions

README.md (new file, 3 lines added)

@@ -0,0 +1,3 @@
# docx-smart-extractor
Extract and analyze Word documents (1MB-50MB+) with minimal token usage. The extractor losslessly captures all text, tables, formatting, and document structure, and achieves a 10-50x token reduction through local extraction, semantic chunking by headings, and intelligent caching.
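
The "semantic chunking by headings" step could look roughly like the sketch below. This is not the project's actual code: the `Chunk` class, the `chunk_by_headings` function, and the `(style_name, text)` input shape are hypothetical, and the sketch assumes paragraphs have already been extracted locally with Word's built-in `Heading N` style names.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Chunk:
    """One heading and the body paragraphs that follow it (hypothetical type)."""
    heading: str
    level: int
    paragraphs: List[str] = field(default_factory=list)


def chunk_by_headings(paragraphs: List[Tuple[str, str]]) -> List[Chunk]:
    """Group (style_name, text) pairs into chunks, one chunk per heading.

    Assumes headings use style names of the form "Heading 1", "Heading 2", ...
    Text that appears before the first heading goes into a "(preamble)" chunk.
    """
    chunks = [Chunk(heading="(preamble)", level=0)]
    for style, text in paragraphs:
        parts = style.split()
        is_heading = len(parts) == 2 and parts[0] == "Heading" and parts[1].isdigit()
        if is_heading:
            # A heading starts a new chunk.
            chunks.append(Chunk(heading=text, level=int(parts[1])))
        elif text.strip():
            # Non-empty body text belongs to the most recent chunk.
            chunks[-1].paragraphs.append(text)
    # Drop the preamble chunk if nothing preceded the first heading.
    return [c for c in chunks if c.level > 0 or c.paragraphs]
```

With chunks of this shape, a caller can send only the sections relevant to a question rather than the whole document, which is where a token reduction would come from.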