zhongwei/gh-machu-gwu-sanhe-claude-code-plugins-plugins-social-media-network-youtube

Zhongwei Li 7d5b628e7d Initial commit

2025-11-30 08:38:41 +08:00

1.7 KiB

name	description
audio-transcription-cleanup	Transform messy voice transcription text into well-formatted, human-readable documents while preserving original meaning

Audio Transcription Cleanup

Clean up raw audio transcriptions by removing filler words, fixing errors, and adding proper structure.

Usage

Use the audio_transcript_cleanup.py script to process transcript files:

# Use default output location (~/tmp/cleaned_transcript.md - allows overwrite)
python scripts/audio_transcript_cleanup.py --transcript-file /path/to/transcript.txt

# Specify custom output location (cannot overwrite existing files)
python scripts/audio_transcript_cleanup.py --transcript-file /path/to/transcript.txt --output /path/to/output.md

What It Does

The script automatically:

Removes verbal artifacts (um, uh, like, you know, 呃, 啊, 那个, etc.)
Fixes spelling and grammar errors
Adds semantic paragraph breaks and section headings
Converts spoken fragments into complete sentences
Preserves all original information (no summarization)
Auto-detects language and maintains natural expression
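
To make the first item concrete, here is a minimal rule-based sketch of filler removal. It is not how the script works: the Requirements section suggests the cleanup is delegated to the Claude CLI, and a regex pass is swapped in here purely for illustration. The filler lists and the function name `strip_fillers` are hypothetical.

```python
import re

# Hypothetical filler lists for illustration only. "like" and "啊" are left out
# on purpose because they are too often legitimate words or particles.
EN_FILLERS = ["um", "uh", "you know"]
ZH_FILLERS = ["呃", "那个"]

# English fillers are matched as whole words, with an optional trailing comma.
_EN_PATTERN = re.compile(
    r"\b(?:" + "|".join(map(re.escape, EN_FILLERS)) + r")\b,?\s*",
    re.IGNORECASE,
)
# Chinese has no word boundaries, so this matches the raw characters plus an
# optional trailing comma. Caveat: "那个" also means "that (one)", so this
# naive rule can delete meaningful text.
_ZH_PATTERN = re.compile(
    "(?:" + "|".join(map(re.escape, ZH_FILLERS)) + ")[，,、]?"
)


def strip_fillers(text: str) -> str:
    """Remove a small set of verbal fillers and collapse leftover spaces."""
    text = _EN_PATTERN.sub("", text)
    text = _ZH_PATTERN.sub("", text)
    return re.sub(r"[ \t]{2,}", " ", text).strip()
```

The caveat in the comments is the point of the sketch: deciding whether "like" or "那个" is a filler depends on context, which is a plausible reason to hand the task to a language model rather than to fixed rules.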

Options

--transcript-file (required) - Path to the transcript file to clean up
--output (optional) - Custom output path (default: ~/tmp/cleaned_transcript.md)
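
For reference, the two options could be declared with `argparse` roughly as follows. This is a sketch under assumptions, not the script's actual source: only the flag names and the default path come from this document, while `build_parser` and `DEFAULT_OUTPUT` are hypothetical names.

```python
import argparse
from pathlib import Path

# Default taken from this README; the constant name is an assumption.
DEFAULT_OUTPUT = Path.home() / "tmp" / "cleaned_transcript.md"


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing the two documented options."""
    parser = argparse.ArgumentParser(
        description="Clean up a raw audio transcript."
    )
    parser.add_argument(
        "--transcript-file",
        required=True,
        type=Path,
        help="Path to the transcript file to clean up",
    )
    parser.add_argument(
        "--output",
        type=Path,
        default=None,  # None means "fall back to DEFAULT_OUTPUT"
        help=f"Custom output path (default: {DEFAULT_OUTPUT})",
    )
    return parser
```

Leaving `--output` as `None` when omitted lets later code tell "user chose the default" apart from "user typed the default path explicitly", which matters for the overwrite rules described under Output Behavior.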

Output Behavior

Default location: ~/tmp/cleaned_transcript.md - Allows overwrite
Custom location: Cannot overwrite existing files (raises error if file exists)
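
The overwrite rules above can be sketched as a small helper. The function name `resolve_output_path` and the choice of `FileExistsError` are assumptions; the document only says that an error is raised when a custom output file already exists.

```python
from pathlib import Path
from typing import Optional

# Default taken from this README; the constant name is an assumption.
DEFAULT_OUTPUT = Path.home() / "tmp" / "cleaned_transcript.md"


def resolve_output_path(output: Optional[Path]) -> Path:
    """Return the path to write to, enforcing the documented overwrite rules."""
    if output is None:
        # Default location: overwriting an earlier result is allowed.
        return DEFAULT_OUTPUT
    if output.exists():
        # Custom location: never clobber a file the user already has.
        raise FileExistsError(f"Refusing to overwrite existing file: {output}")
    return output
```

The asymmetry is a reasonable safety trade-off: the default path acts as a scratch file, while a path the user names explicitly is treated as something worth protecting.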

Language Support

Auto-detects and works with:

English
Chinese (Mandarin, Cantonese)
Mixed language content
Multi-speaker transcriptions

Requirements

Python 3.11+
Claude CLI must be installed and accessible
Transcript file must exist at the specified path
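
A quick pre-flight check for these three requirements might look like the sketch below. The helper name `check_requirements` is hypothetical, and the executable name `claude` is an assumption about how the Claude CLI is installed; adjust it if your installation differs.

```python
import shutil
import sys
from pathlib import Path


def check_requirements(transcript_file: Path, cli_name: str = "claude") -> list[str]:
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    if sys.version_info < (3, 11):
        problems.append(
            f"Python 3.11+ is required, found {sys.version_info.major}.{sys.version_info.minor}"
        )
    # shutil.which returns None when the executable is not found on PATH.
    if shutil.which(cli_name) is None:
        problems.append(f"CLI executable '{cli_name}' was not found on PATH")
    if not transcript_file.is_file():
        problems.append(f"Transcript file does not exist: {transcript_file}")
    return problems
```

Collecting all problems before reporting, rather than failing on the first one, lets a user fix their environment in a single pass.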