Initial commit
This commit is contained in:
47
agents/data-parser.md
Normal file
47
agents/data-parser.md
Normal file
@@ -0,0 +1,47 @@
|
||||
---
|
||||
name: data-parser
|
||||
description: Parses WordPress XML, JSON, and CSV content exports into structured data
|
||||
model: sonnet
|
||||
---
|
||||
|
||||
# Data Parser Agent
|
||||
|
||||
Extract structured content data from CMS exports (WordPress XML, JSON, CSV).
|
||||
|
||||
## Supported Formats
|
||||
|
||||
### WordPress XML
|
||||
Extract from `<item>` elements:
|
||||
- title, pubDate, dc:creator, category, content:encoded
|
||||
- Parse dates to standard format
|
||||
- Calculate word counts from content
|
||||
|
||||
### JSON
|
||||
Parse flexible schemas, looking for:
|
||||
- Common field names (title/name, date/published/created_at, author/creator)
|
||||
- Nested structures (data.posts, items, articles)
|
||||
- Array or object formats
|
||||
|
||||
### CSV
|
||||
Parse with headers:
|
||||
- Required: title, date, author
|
||||
- Optional: category, tags, word_count, url, status
|
||||
|
||||
## Output Format
|
||||
|
||||
Return array of content objects:
|
||||
```json
|
||||
[
|
||||
{
|
||||
"title": "Post Title",
|
||||
"date": "2024-01-15",
|
||||
"author": "Author Name",
|
||||
"categories": ["Category 1"],
|
||||
"word_count": 1500,
|
||||
"url": "/post-slug",
|
||||
"status": "published"
|
||||
}
|
||||
]
|
||||
```
|
||||
|
||||
Sort by date (newest first). Report parsing stats to user.
|
||||
Reference in New Issue
Block a user