Initial commit
54
skills/audio-transcript-cleanup/SKILL.md
Normal file
@@ -0,0 +1,54 @@
---
name: audio-transcription-cleanup
description: Transform messy voice transcription text into well-formatted, human-readable documents while preserving original meaning
---

# Audio Transcription Cleanup

Clean up raw audio transcriptions by removing filler words, fixing errors, and adding proper structure.

## Usage

Use the `audio_transcript_cleanup.py` script to process transcript files:

```bash
# Write to the default output location (~/tmp/cleaned_transcript.md); an existing file there is overwritten
python scripts/audio_transcript_cleanup.py --transcript-file /path/to/transcript.txt

# Write to a custom output location; fails if the file already exists
python scripts/audio_transcript_cleanup.py --transcript-file /path/to/transcript.txt --output /path/to/output.md
```
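
The same commands can be driven from Python when several transcripts need processing. The following is a minimal sketch, not part of the skill itself: `build_command` and `clean` are hypothetical helper names, and the sketch assumes the script exits with a non-zero status when it fails.

```python
import subprocess
import sys
from pathlib import Path
from typing import List, Optional

# Script path as used in the shell commands above.
SCRIPT = Path("scripts/audio_transcript_cleanup.py")


def build_command(transcript: Path, output: Optional[Path] = None) -> List[str]:
    """Assemble the argument list using only the two documented flags."""
    command = [sys.executable, str(SCRIPT), "--transcript-file", str(transcript)]
    if output is not None:
        command += ["--output", str(output)]
    return command


def clean(transcript: Path, output: Optional[Path] = None) -> None:
    """Run the cleanup script (hypothetical wrapper).

    check=True raises CalledProcessError on a non-zero exit status, which is
    assumed here to be how the script reports failure.
    """
    subprocess.run(build_command(transcript, output), check=True)


# Show the command without running it.
print(build_command(Path("/path/to/transcript.txt")))
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.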

## What It Does

The script automatically:
- Removes verbal artifacts (um, uh, like, you know, 呃, 啊, 那个, etc.)
- Fixes spelling and grammar errors
- Adds semantic paragraph breaks and section headings
- Converts spoken fragments into complete sentences
- Preserves all original information (no summarization)
- Auto-detects the language and keeps the phrasing natural in that language
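
To make the first bullet concrete, here is a deliberately naive regex pass over hesitation sounds. It is an illustration only and not how the script works: the Claude CLI listed under Requirements suggests the cleanup is model-driven, and `strip_fillers` is a hypothetical name. The sketch also shows why simple pattern matching is not enough, since fillers such as "like", "啊", or "那个" often carry real meaning and cannot be deleted blindly.

```python
import re

# Only unambiguous hesitation sounds are listed. Words such as "like",
# "you know", "啊" or "那个" are context-dependent, so a regex cannot
# remove them without risking a change in meaning.
FILLERS = re.compile(r"\b(?:um+|uh+)\b[,.]?\s*|呃+[，,]?\s*", re.IGNORECASE)


def strip_fillers(text: str) -> str:
    """Remove hesitation sounds and tidy the leftover whitespace."""
    cleaned = FILLERS.sub("", text)
    return re.sub(r"\s{2,}", " ", cleaned).strip()


print(strip_fillers("Um, so we, uh, shipped it."))  # so we, shipped it.
print(strip_fillers("呃，我们明天开会"))  # 我们明天开会
```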

## Options

- `--transcript-file` (required) - Path to the transcript file to clean up
- `--output` (optional) - Custom output path (default: `~/tmp/cleaned_transcript.md`)
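
For readers who want to see how these two options might be declared, the sketch below uses Python's standard `argparse` module. The flag names and the default path come from the list above; `build_parser` and `DEFAULT_OUTPUT` are assumed names, and the real script may be organized differently.

```python
import argparse
from pathlib import Path

# Assumed constant holding the documented default output location.
DEFAULT_OUTPUT = Path.home() / "tmp" / "cleaned_transcript.md"


def build_parser() -> argparse.ArgumentParser:
    """Declare the two documented options (hypothetical layout)."""
    parser = argparse.ArgumentParser(prog="audio_transcript_cleanup.py")
    parser.add_argument(
        "--transcript-file",
        required=True,
        type=Path,
        help="Path to the transcript file to clean up",
    )
    parser.add_argument(
        "--output",
        type=Path,
        default=DEFAULT_OUTPUT,
        help="Custom output path",
    )
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["--transcript-file", "notes.txt"])
print(args.transcript_file, args.output)
```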

## Output Behavior

- **Default location**: `~/tmp/cleaned_transcript.md`. An existing file at this path is overwritten.
- **Custom location**: Existing files are never overwritten; the script raises an error if the file already exists.
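
The two rules above map naturally onto Python's file modes: `"w"` truncates an existing file, while `"x"` (exclusive creation) raises `FileExistsError`. The following is a sketch of that idea under stated assumptions, not the script's actual code; `write_output` and its parameters are hypothetical.

```python
from pathlib import Path
from typing import Optional

# Documented default output location.
DEFAULT_OUTPUT = Path.home() / "tmp" / "cleaned_transcript.md"


def write_output(
    text: str,
    output: Optional[Path] = None,
    default: Path = DEFAULT_OUTPUT,
) -> Path:
    """Write cleaned text following the documented overwrite rules."""
    target = default if output is None else output
    target.parent.mkdir(parents=True, exist_ok=True)
    # Default location: "w" overwrites. Custom location: "x" refuses to
    # overwrite and raises FileExistsError if the file already exists.
    mode = "w" if output is None else "x"
    with open(target, mode, encoding="utf-8") as handle:
        handle.write(text)
    return target
```

Using `"x"` instead of checking `Path.exists()` first avoids a race in which another process creates the file between the check and the write.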

## Language Support

Auto-detects and works with:
- English
- Chinese (Mandarin, Cantonese)
- Mixed-language content
- Multi-speaker transcriptions

## Requirements

- Python 3.11+
- Claude CLI must be installed and accessible
- The transcript file must exist at the specified path
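
These requirements can be verified before running the script. The sketch below is one possible preflight check, with assumptions labeled: `check_requirements` is a hypothetical helper, and the executable name `claude` is assumed for the Claude CLI and may differ on a given installation.

```python
import shutil
import sys
from pathlib import Path
from typing import List


def check_requirements(transcript_file: Path, cli_name: str = "claude") -> List[str]:
    """Return a list of unmet requirements; an empty list means all checks passed."""
    problems = []
    if sys.version_info < (3, 11):
        found = f"{sys.version_info.major}.{sys.version_info.minor}"
        problems.append(f"Python 3.11+ required, found {found}")
    # shutil.which returns None when the executable is not found on PATH.
    # The name "claude" is an assumption about how the CLI is installed.
    if shutil.which(cli_name) is None:
        problems.append(f"CLI '{cli_name}' not found on PATH")
    if not transcript_file.is_file():
        problems.append(f"Transcript file does not exist: {transcript_file}")
    return problems


for problem in check_requirements(Path("/path/to/transcript.txt")):
    print(problem)
```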