Initial commit

Zhongwei Li
2025-11-30 08:40:04 +08:00
commit f12538b75e
11 changed files with 853 additions and 0 deletions

@@ -0,0 +1,46 @@
### Searching
Search file names in the current directory and subdirectories
```bash
fd "search term"
```
Find all PDFs
```bash
fd -e pdf
```
Find files by name (glob pattern)
```bash
fd -g "*.txt"
```
Search file contents for "config"
```bash
rg "config"
```
#### Reading & Converting Files (with pandoc)
Convert PDF to text (pandoc cannot read PDF, so use `pdftotext` from Poppler)
```bash
pdftotext document.pdf   # writes document.txt
```
Convert Word document to text
```bash
pandoc document.docx -t plain -o document.txt
```
Convert RTF or HTML to text
```bash
pandoc document.rtf -t plain -o document.txt
pandoc document.html -t plain -o document.txt
```
Batch convert all PDFs to text
```bash
for f in *.pdf; do pdftotext "$f" "${f%.pdf}.txt"; done
```
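The batch loop relies on the `${f%.pdf}.txt` expansion to derive each output name, and when no PDFs match, the unexpanded glob is passed through as a literal file name. A minimal dry-run sketch of that behaviour follows; the function name `plan_pdf_batch` is hypothetical, and it only prints what would be converted instead of calling `pdftotext`:

```bash
# Dry-run sketch (hypothetical helper): print each PDF in a directory and the
# .txt name the batch loop would write, without running pdftotext.
plan_pdf_batch() {
  dir="${1:-.}"
  for f in "$dir"/*.pdf; do
    # With no matches the glob stays literal ("dir/*.pdf"); skip that case.
    [ -e "$f" ] || continue
    # ${f%.pdf} strips the trailing ".pdf" before ".txt" is appended.
    printf '%s -> %s\n' "$f" "${f%.pdf}.txt"
  done
}
```

Running `plan_pdf_batch .` before the real loop is a cheap way to check the output names, including ones with spaces.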
#### Summary
- `fd` replaces `mdfind` for fast file search.
- `rg` (ripgrep) replaces Spotlight content search.
- `pandoc` + `pdftotext` replace `textutil` for format conversion.
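
The tool-per-format split summarized above could be wrapped in a small dispatcher. This is only a sketch: the function names `converter_for` and `to_text` are hypothetical, and the extension-to-tool mapping simply mirrors the commands in this cheat sheet.

```bash
# Hypothetical helper: name the converter this cheat sheet uses for a file.
converter_for() {
  local ext
  # Lowercase the extension so "REPORT.PDF" and "report.pdf" match alike.
  ext=$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')
  case "$ext" in
    pdf)           echo "pdftotext" ;;
    docx|rtf|html) echo "pandoc" ;;
    *)             echo "unsupported" ;;
  esac
}

# Convert one file to plain text next to the original (foo.pdf -> foo.txt).
to_text() {
  local src
  local out
  src="$1"
  out="${1%.*}.txt"
  case "$(converter_for "$src")" in
    pdftotext) pdftotext "$src" "$out" ;;
    pandoc)    pandoc "$src" -t plain -o "$out" ;;
    *)         echo "no converter for: $src" >&2; return 1 ;;
  esac
}
```

For example, `to_text report.pdf` would run `pdftotext report.pdf report.txt`, provided `pdftotext` is installed.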

@@ -0,0 +1,111 @@
### Use Spotlight for searching
#### Search Current Directory and Subdirectories
```bash
mdfind -onlyin . "search term"
```
Or you can be more explicit:
```bash
mdfind -onlyin "$PWD" "search term"
```
#### Examples
**Find all PDF files in current directory tree:**
```bash
mdfind -onlyin . "kMDItemContentType == 'com.adobe.pdf'"
```
**Search for files containing "config" in current directory:**
```bash
mdfind -onlyin . "config"
```
**Find files by name in current directory:**
```bash
mdfind -onlyin . -name "*.txt"
```
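File-name matching can also be written as a raw Spotlight query on the `kMDItemFSName` attribute, in the same style as the `kMDItemContentType` example. The sketch below only builds the query string; the helper name `spotlight_ext_query` is hypothetical, and the trailing `c` modifier is assumed to make the comparison case-insensitive.

```bash
# Hypothetical helper: build a Spotlight query that matches file names
# ending in the given extension.
spotlight_ext_query() {
  # The 'c' after the closing quote requests a case-insensitive comparison.
  printf "kMDItemFSName == '*.%s'c" "$1"
}

# Usage (macOS only):
#   mdfind -onlyin . "$(spotlight_ext_query txt)"
```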
### Reading Files
`textutil` is a built-in macOS command-line utility for converting between document formats. It is particularly useful for extracting plain text from word-processing documents.
#### Supported Input Formats
**Text formats:**
- `.txt` - Plain text
- `.rtf` - Rich Text Format
- `.rtfd` - RTF with attachments
- `.html` - HTML documents
- `.xml` - Word 2003 XML (`wordml`)
**Document formats:**
- `.doc` - Microsoft Word (older format)
- `.docx` - Microsoft Word (newer format)
- `.odt` - OpenDocument Text
**Other formats:**
- `.webarchive` - Safari web archives
**Not supported as input:** `.pdf` and `.pages`. Use `pdftotext` for PDFs, and export Pages documents to `.docx` from the Pages app first.
#### Supported Output Formats
You can convert to these formats using the `-convert` option:
- `txt` - Plain text
- `rtf` - Rich Text Format
- `rtfd` - RTF with attachments
- `html` - HTML
- `wordml` - Word 2003 XML
- `doc` - Microsoft Word
- `docx` - Microsoft Word (newer)
- `odt` - OpenDocument Text
- `webarchive` - Web archive
#### Common Usage Examples
**Convert PDF to text** (`textutil` cannot read PDF; use `pdftotext` from Poppler instead):
```bash
pdftotext document.pdf                # writes document.txt
pdftotext document.pdf extracted.txt  # explicit output name
```
**Convert Word doc to plain text:**
```bash
textutil -convert txt document.docx
```
**Convert multiple files:**
```bash
textutil -convert txt *.docx
textutil -convert html *.rtf
```
**Extract text from a Pages document** (`textutil` cannot read `.pages`; export the document to Word format from Pages first):
```bash
textutil -convert txt document.docx -output text_version.txt
```