---
description: Aggregate crypto news with sentiment analysis and market impact scoring from 50+ sources
shortcut: an
---

# Aggregate Crypto News

Multi-source cryptocurrency news aggregation system with AI-powered sentiment analysis, trend detection, and market impact prediction. Monitors 50+ crypto news sources, social media platforms, and official project announcements in real time.

**Supported Sources**: CoinDesk, CoinTelegraph, Decrypt, The Block, Bitcoin Magazine, CryptoSlate, Twitter/X, Reddit, Medium, Telegram, Discord, official project blogs, SEC filings, exchange announcements

## When to Use This Command

Use `/aggregate-news` when you need to:

- Monitor breaking crypto news across multiple sources in real time
- Analyze sentiment shifts before market movements
- Track specific coins, projects, or topics (DeFi, NFTs, regulation)
- Identify trending narratives and meme coins early
- Correlate news events with price movements
- Research fundamentals to support trading decisions
- Monitor regulatory developments and exchange listings
- Track whale social media activity and influencer sentiment

**DON'T use this command for:**

- Single-source news reading - use direct RSS feeds
- Historical price analysis - use `/analyze-trends` instead
- On-chain data analysis - use `/analyze-chain` instead
- Technical analysis - use `/generate-signal` instead
- Real-time price tracking - use `/track-price` instead

## Design Decisions

**Why Multi-Source Aggregation?**
Single sources have bias and delays. Aggregating 50+ sources provides broad market coverage and cross-verification of breaking news.

**Why Sentiment Analysis?**
- **AI sentiment chosen**: Sentiment in the news flow often shifts 15-30 minutes before the corresponding price move
- **Manual reading rejected**: Impractical to read 1,000+ articles/day across sources

**Why Real-Time vs. Hourly Digests?**
- **Real-time chosen**: Breaking news (exchange hacks, regulations) impacts prices within seconds
- **Hourly rejected**: 60-minute delays miss trading opportunities

**Why Deduplication?**
- **Deduplication chosen**: 50+ sources report the same news, creating roughly 80% redundancy
- **Keeping all articles rejected**: Would generate 5,000+ alerts/day

**Why Market Impact Scoring?**
- **Scoring chosen**: Not all news is equally important; a 0-100 scale ranks it
- **Equal weighting rejected**: Treats minor updates the same as exchange hacks

## Prerequisites

Before running this command, ensure you have:

1. **News API Access** (at least 3 recommended):
   - NewsAPI.org key (free: 100 requests/day; paid tiers for higher volume)
   - CryptoCompare News API (free tier available)
   - CoinGecko News API (no key required, rate limited)
   - Alternative Data News API (premium)
   - Messari News API (enterprise)

2. **Social Media APIs**:
   - Twitter/X API v2 bearer token (Basic tier: $100/month)
   - Reddit API credentials (free with rate limits)
   - Telegram bot token (for channel monitoring)
   - Discord bot token + OAuth2 (for server monitoring)

3. **AI/ML Services** (choose one):
   - OpenAI API key (GPT-4 for sentiment: ~$0.03/1K tokens)
   - Anthropic Claude API key (recommended: ~$0.015/1K tokens)
   - Google Vertex AI credentials (PaLM 2)
   - Local sentiment model (FinBERT, CryptoBERT)

4. **Database** (for historical tracking):
   - PostgreSQL 13+ with pgvector (similarity search)
   - MongoDB 5+ (document storage)
   - Elasticsearch 8+ (full-text search, recommended)

5. **Infrastructure**:
   - Linux server (Ubuntu 20.04+) or Docker
   - 8GB RAM minimum (16GB for 50+ sources)
   - 500GB storage (about 6 months of history)
   - Redis (for real-time caching and deduplication)

## Implementation Process

### Step 1: Configure News Sources

Create `config/news_sources.json`:

```json
|
|
{
|
|
"sources": {
|
|
"mainstream_crypto": [
|
|
{
|
|
"name": "CoinDesk",
|
|
"type": "rss",
|
|
"url": "https://www.coindesk.com/arc/outboundfeeds/rss/",
|
|
"impact_weight": 1.0,
|
|
"credibility": 0.95
|
|
},
|
|
{
|
|
"name": "CoinTelegraph",
|
|
"type": "rss",
|
|
"url": "https://cointelegraph.com/rss",
|
|
"impact_weight": 0.9,
|
|
"credibility": 0.90
|
|
},
|
|
{
|
|
"name": "The Block",
|
|
"type": "rss",
|
|
"url": "https://www.theblock.co/rss.xml",
|
|
"impact_weight": 0.95,
|
|
"credibility": 0.92
|
|
},
|
|
{
|
|
"name": "Decrypt",
|
|
"type": "rss",
|
|
"url": "https://decrypt.co/feed",
|
|
"impact_weight": 0.85,
|
|
"credibility": 0.88
|
|
}
|
|
],
|
|
"social_media": [
|
|
{
|
|
"name": "CryptoTwitter",
|
|
"type": "twitter_search",
|
|
"keywords": ["#Bitcoin", "#Ethereum", "#Crypto", "$BTC", "$ETH"],
|
|
"min_followers": 10000,
|
|
"verified_only": false,
|
|
"impact_weight": 0.7,
|
|
"credibility": 0.70
|
|
},
|
|
{
|
|
"name": "CryptocurrencySubreddit",
|
|
"type": "reddit",
|
|
"subreddit": "CryptoCurrency",
|
|
"min_upvotes": 100,
|
|
"impact_weight": 0.6,
|
|
"credibility": 0.65
|
|
}
|
|
],
|
|
"official_sources": [
|
|
{
|
|
"name": "SEC_Filings",
|
|
"type": "sec_rss",
|
|
"url": "https://www.sec.gov/cgi-bin/browse-edgar?action=getcurrent&CIK=&type=&company=bitcoin&dateb=&owner=exclude&start=0&count=40&output=atom",
|
|
"impact_weight": 1.5,
|
|
"credibility": 1.0
|
|
}
|
|
]
|
|
},
|
|
"fetch_intervals": {
|
|
"high_priority": 60,
|
|
"medium_priority": 300,
|
|
"low_priority": 900
|
|
},
|
|
"content_filters": {
|
|
"min_word_count": 50,
|
|
"exclude_keywords": ["advertisement", "sponsored", "partner content"],
|
|
"language": "en"
|
|
}
|
|
}
|
|
```
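
The aggregator reads this file once at startup. Below is a minimal sketch of how the config could be loaded and sanity-checked before the first fetch cycle; the validation rules (required keys, interval floor) are illustrative assumptions, not requirements of the command itself:

```python
import json
from pathlib import Path

REQUIRED_SOURCE_KEYS = {"name", "type", "impact_weight", "credibility"}  # illustrative minimum

def load_sources_config(path: str = "config/news_sources.json") -> dict:
    """Load the sources config and run basic sanity checks before fetching."""
    config = json.loads(Path(path).read_text())

    for category, sources in config["sources"].items():
        for source in sources:
            missing = REQUIRED_SOURCE_KEYS - source.keys()
            if missing:
                raise ValueError(f"{category}/{source.get('name', '?')} missing keys: {missing}")
            if not 0.0 <= source["credibility"] <= 1.0:
                raise ValueError(f"{source['name']}: credibility must be in [0, 1]")

    # Fetch intervals are in seconds; anything under ~30s risks rate limiting
    for tier, seconds in config["fetch_intervals"].items():
        if seconds < 30:
            raise ValueError(f"fetch interval '{tier}' too aggressive: {seconds}s")

    return config

if __name__ == "__main__":
    cfg = load_sources_config()
    total = sum(len(v) for v in cfg["sources"].values())
    print(f"Loaded {total} sources across {len(cfg['sources'])} categories")
```

Failing fast on a malformed source entry keeps one typo from silently dropping an entire source category.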

### Step 2: Set Up Sentiment Analysis

Create `config/sentiment_config.json`:

```json
|
|
{
|
|
"provider": "anthropic",
|
|
"model": "claude-3-5-sonnet-20241022",
|
|
"api_key_env": "ANTHROPIC_API_KEY",
|
|
|
|
"sentiment_scale": {
|
|
"very_bearish": -1.0,
|
|
"bearish": -0.5,
|
|
"neutral": 0.0,
|
|
"bullish": 0.5,
|
|
"very_bullish": 1.0
|
|
},
|
|
|
|
"entity_extraction": {
|
|
"enabled": true,
|
|
"categories": [
|
|
"cryptocurrencies",
|
|
"exchanges",
|
|
"projects",
|
|
"people",
|
|
"regulations",
|
|
"events"
|
|
]
|
|
},
|
|
|
|
"market_impact": {
|
|
"factors": [
|
|
"source_credibility",
|
|
"sentiment_strength",
|
|
"entity_relevance",
|
|
"social_engagement",
|
|
"breaking_news_indicator"
|
|
],
|
|
"weights": {
|
|
"source_credibility": 0.3,
|
|
"sentiment_strength": 0.25,
|
|
"entity_relevance": 0.2,
|
|
"social_engagement": 0.15,
|
|
"breaking_news_indicator": 0.1
|
|
}
|
|
},
|
|
|
|
"cache_ttl_seconds": 3600,
|
|
"batch_size": 10
|
|
}
|
|
```
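
Because the `market_impact.weights` above sum to 1.0, the impact score reduces to a weighted average of normalized factor scores scaled to 0-100. A minimal sketch of that calculation; the factor values in the example are made up for illustration:

```python
def market_impact_score(factors: dict, weights: dict) -> int:
    """Combine normalized factor scores (0.0-1.0) into a 0-100 impact score."""
    total_weight = sum(weights.values()) or 1.0
    weighted = sum(factors.get(name, 0.0) * weight for name, weight in weights.items())
    return round(100 * weighted / total_weight)

# Weights from sentiment_config.json; factor values are hypothetical
weights = {
    "source_credibility": 0.3,
    "sentiment_strength": 0.25,
    "entity_relevance": 0.2,
    "social_engagement": 0.15,
    "breaking_news_indicator": 0.1,
}
factors = {
    "source_credibility": 0.95,   # e.g. a tier-1 outlet
    "sentiment_strength": 0.92,   # |sentiment_score|
    "entity_relevance": 0.9,      # BTC mentioned prominently
    "social_engagement": 0.6,
    "breaking_news_indicator": 1.0,
}
print(market_impact_score(factors, weights))  # prints 88
```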

### Step 3: Initialize Database Schema

```sql
|
|
-- PostgreSQL with pgvector for semantic search
|
|
CREATE EXTENSION IF NOT EXISTS vector;
|
|
|
|
CREATE TABLE news_articles (
|
|
id SERIAL PRIMARY KEY,
|
|
article_id VARCHAR(255) UNIQUE NOT NULL,
|
|
source VARCHAR(100) NOT NULL,
|
|
source_type VARCHAR(50) NOT NULL,
|
|
title TEXT NOT NULL,
|
|
content TEXT,
|
|
summary TEXT,
|
|
url TEXT UNIQUE NOT NULL,
|
|
author VARCHAR(255),
|
|
published_at TIMESTAMP NOT NULL,
|
|
fetched_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
|
|
language VARCHAR(10) DEFAULT 'en',
|
|
|
|
-- Sentiment analysis
|
|
sentiment_score FLOAT,
|
|
sentiment_label VARCHAR(20),
|
|
confidence FLOAT,
|
|
|
|
-- Market impact
|
|
market_impact_score INTEGER,
|
|
impact_category VARCHAR(20),
|
|
|
|
-- Engagement metrics
|
|
social_shares INTEGER DEFAULT 0,
|
|
comments_count INTEGER DEFAULT 0,
|
|
engagement_score FLOAT,
|
|
|
|
    -- Vector embedding for similarity search
    embedding vector(1536)
);

-- PostgreSQL does not support inline INDEX clauses; create indexes separately
CREATE INDEX idx_published_at ON news_articles (published_at DESC);
CREATE INDEX idx_source ON news_articles (source);
CREATE INDEX idx_sentiment ON news_articles (sentiment_score);
CREATE INDEX idx_impact ON news_articles (market_impact_score DESC);
|
|
|
|
CREATE TABLE article_entities (
|
|
id SERIAL PRIMARY KEY,
|
|
article_id INTEGER REFERENCES news_articles(id) ON DELETE CASCADE,
|
|
entity_type VARCHAR(50) NOT NULL,
|
|
entity_name VARCHAR(255) NOT NULL,
|
|
entity_ticker VARCHAR(20),
|
|
relevance_score FLOAT,
|
|
    sentiment_score FLOAT
);

CREATE INDEX idx_entity_name ON article_entities (entity_name);
CREATE INDEX idx_entity_ticker ON article_entities (entity_ticker);
|
|
|
|
CREATE TABLE trending_topics (
|
|
id SERIAL PRIMARY KEY,
|
|
topic VARCHAR(255) UNIQUE NOT NULL,
|
|
category VARCHAR(50),
|
|
mention_count INTEGER DEFAULT 1,
|
|
avg_sentiment FLOAT,
|
|
market_impact_score INTEGER,
|
|
first_seen TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
|
|
last_updated TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
|
|
    is_trending BOOLEAN DEFAULT FALSE
);

CREATE INDEX idx_trending ON trending_topics (is_trending, market_impact_score DESC);
|
|
|
|
CREATE TABLE article_duplicates (
|
|
id SERIAL PRIMARY KEY,
|
|
article_id INTEGER REFERENCES news_articles(id),
|
|
duplicate_of INTEGER REFERENCES news_articles(id),
|
|
similarity_score FLOAT,
|
|
    detected_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX idx_article ON article_duplicates (article_id);
|
|
|
|
-- Elasticsearch index mapping (alternative/complement)
|
|
-- Run via Elasticsearch REST API
|
|
PUT /crypto_news
|
|
{
|
|
"mappings": {
|
|
"properties": {
|
|
"title": {"type": "text", "analyzer": "english"},
|
|
"content": {"type": "text", "analyzer": "english"},
|
|
"source": {"type": "keyword"},
|
|
"published_at": {"type": "date"},
|
|
"sentiment_score": {"type": "float"},
|
|
"market_impact_score": {"type": "integer"},
|
|
"entities": {
|
|
"type": "nested",
|
|
"properties": {
|
|
"name": {"type": "keyword"},
|
|
"type": {"type": "keyword"},
|
|
"sentiment": {"type": "float"}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
```
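
The `embedding vector(1536)` column supports semantic near-duplicate detection and related-article lookups beyond simple title matching. A minimal sketch of a similarity query against this schema using psycopg2 and pgvector's cosine-distance operator; how the query embedding is produced is outside this snippet, so it is passed in as a plain list:

```python
import psycopg2

def find_similar_articles(conn, embedding: list[float], limit: int = 5) -> list[tuple]:
    """Return the closest stored articles by cosine similarity using pgvector's <=> operator."""
    # pgvector accepts the '[x, y, ...]' text form; the pgvector-python adapter is an alternative
    vec = str(embedding)
    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT article_id, title, 1 - (embedding <=> %s::vector) AS similarity
            FROM news_articles
            WHERE embedding IS NOT NULL
            ORDER BY embedding <=> %s::vector
            LIMIT %s
            """,
            (vec, vec, limit),
        )
        return cur.fetchall()
```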

### Step 4: Run News Aggregator

Execute the aggregation script:

```bash
# Start real-time aggregation (all sources)
python3 news_aggregator.py --sources-config config/news_sources.json \
  --sentiment-config config/sentiment_config.json \
  --min-impact 40 \
  --deduplicate

# Monitor specific topics
python3 news_aggregator.py --topics "Bitcoin,Ethereum,DeFi" \
  --alert-threshold 70

# Export to webhook
python3 news_aggregator.py --webhook-url https://your-api.com/news \
  --format json \
  --interval 300

# Generate daily digest (see the query sketch below)
python3 news_aggregator.py --digest daily \
  --email your@email.com \
  --top-stories 20
```
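
The `--digest daily` mode works from articles already stored in PostgreSQL rather than refetching sources. A minimal sketch of the kind of query such a digest could run; the exact digest logic is not part of the aggregator script shown later, so treat the column choices as illustrative:

```python
import psycopg2

def top_stories(conn, limit: int = 20) -> list[tuple]:
    """Return today's highest-impact articles for the daily digest."""
    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT title, source, url, market_impact_score, sentiment_label
            FROM news_articles
            WHERE published_at >= CURRENT_DATE
            ORDER BY market_impact_score DESC NULLS LAST, published_at DESC
            LIMIT %s
            """,
            (limit,),
        )
        return cur.fetchall()

conn = psycopg2.connect(dbname="crypto_news")  # connection details as in the aggregator
for title, source, url, impact, sentiment in top_stories(conn):
    print(f"[{impact}] {title} ({source}, {sentiment}) {url}")
```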

### Step 5: Set Up Monitoring Dashboard

Create an Elasticsearch/Kibana stack for visualization:

```yaml
# docker-compose.yml
version: '3.8'
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.11.0
    environment:
      - discovery.type=single-node
      - xpack.security.enabled=false
    ports:
      - "9200:9200"
    volumes:
      - es_data:/usr/share/elasticsearch/data

  kibana:
    image: docker.elastic.co/kibana/kibana:8.11.0
    ports:
      - "5601:5601"
    depends_on:
      - elasticsearch

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    volumes:
      - redis_data:/data

volumes:
  es_data:
  redis_data:
```
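
With this stack running, analyzed articles can be mirrored into the `crypto_news` Elasticsearch index defined in Step 3 for full-text search and Kibana dashboards. A minimal sketch with the official async client; the document fields follow the earlier mapping, and error handling and bulk indexing are omitted for brevity:

```python
import asyncio
from elasticsearch import AsyncElasticsearch

async def index_article(es: AsyncElasticsearch, article: dict) -> None:
    """Index one analyzed article document into the crypto_news index."""
    await es.index(
        index="crypto_news",
        id=article["article_id"],
        document={
            "title": article["title"],
            "content": article.get("content", ""),
            "source": article["source"],
            "published_at": article["published_at"],
            "sentiment_score": article.get("sentiment_score", 0.0),
            "market_impact_score": article.get("market_impact_score", 0),
            "entities": article.get("entities", []),
        },
    )

async def demo() -> None:
    es = AsyncElasticsearch("http://localhost:9200")
    try:
        await index_article(es, {
            "article_id": "demo-1",
            "title": "Example headline",
            "source": "CoinDesk",
            "published_at": "2025-10-11T14:20:00Z",
        })
    finally:
        await es.close()

asyncio.run(demo())
```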

## Output Format

The command generates 5 output files:

### 1. `news_alerts_YYYYMMDD_HHMMSS.json`

Real-time high-impact news alerts:

```json
|
|
{
|
|
"alert_id": "alert_1634567890_coindesk_abc123",
|
|
"timestamp": "2025-10-11T14:23:45Z",
|
|
"article": {
|
|
"title": "SEC Approves Bitcoin Spot ETF from BlackRock",
|
|
"source": "CoinDesk",
|
|
"author": "Jamie Crawley",
|
|
"url": "https://www.coindesk.com/...",
|
|
"published_at": "2025-10-11T14:20:00Z",
|
|
"summary": "The U.S. Securities and Exchange Commission has approved BlackRock's application for a spot Bitcoin ETF, marking a historic moment for cryptocurrency adoption."
|
|
},
|
|
"sentiment_analysis": {
|
|
"overall_sentiment": "very_bullish",
|
|
"sentiment_score": 0.92,
|
|
"confidence": 0.95,
|
|
"reasoning": "Major regulatory approval from SEC, institutional adoption milestone, reduces regulatory risk, expected to increase demand significantly."
|
|
},
|
|
"market_impact": {
|
|
"impact_score": 98,
|
|
"impact_category": "CRITICAL",
|
|
"affected_assets": [
|
|
{
|
|
"ticker": "BTC",
|
|
"name": "Bitcoin",
|
|
"expected_impact": "very_positive",
|
|
"reasoning": "Direct positive impact - increased institutional demand, reduced regulatory uncertainty"
|
|
},
|
|
{
|
|
"ticker": "ETH",
|
|
"name": "Ethereum",
|
|
"expected_impact": "positive",
|
|
"reasoning": "Indirect positive - sets precedent for ETH ETF approval"
|
|
}
|
|
],
|
|
"predicted_price_movement": {
|
|
"direction": "up",
|
|
"magnitude": "high",
|
|
"timeframe": "immediate_to_1_week",
|
|
"confidence": 0.88
|
|
}
|
|
},
|
|
"entities_mentioned": [
|
|
{"name": "SEC", "type": "regulator", "sentiment": 0.8},
|
|
{"name": "BlackRock", "type": "institution", "sentiment": 0.9},
|
|
{"name": "Bitcoin", "type": "cryptocurrency", "sentiment": 0.95}
|
|
],
|
|
"social_engagement": {
|
|
"twitter_mentions": 15234,
|
|
"reddit_upvotes": 8945,
|
|
"trending_score": 95
|
|
}
|
|
}
|
|
```
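
Downstream consumers (trading bots, dashboards, notification services) can read these alert files directly. A small sketch that loads one alert and prints the affected assets when the impact score clears a threshold; the filename is a hypothetical example following the naming pattern above:

```python
import json
from pathlib import Path

def summarize_alert(path: str, min_impact: int = 80) -> None:
    """Print a one-line summary for an alert whose impact score meets the threshold."""
    alert = json.loads(Path(path).read_text())
    impact = alert["market_impact"]["impact_score"]
    if impact < min_impact:
        return
    tickers = ", ".join(a["ticker"] for a in alert["market_impact"]["affected_assets"])
    print(f"[{impact}] {alert['article']['title']} -> {tickers} "
          f"({alert['sentiment_analysis']['overall_sentiment']})")

summarize_alert("news_alerts_20251011_142345.json")  # hypothetical filename
```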

### 2. `daily_summary_YYYYMMDD.json`

Daily news summary with trends:

```json
|
|
{
|
|
"date": "2025-10-11",
|
|
"summary": {
|
|
"total_articles": 1247,
|
|
"unique_sources": 52,
|
|
"avg_sentiment": 0.23,
|
|
"sentiment_distribution": {
|
|
"very_bearish": 89,
|
|
"bearish": 234,
|
|
"neutral": 567,
|
|
"bullish": 289,
|
|
"very_bullish": 68
|
|
}
|
|
},
|
|
"trending_topics": [
|
|
{
|
|
"topic": "Bitcoin ETF Approval",
|
|
"mentions": 342,
|
|
"avg_sentiment": 0.87,
|
|
"impact_score": 95,
|
|
"related_tickers": ["BTC", "ETH"],
|
|
"trend_direction": "surging"
|
|
},
|
|
{
|
|
"topic": "Ethereum Shanghai Upgrade",
|
|
"mentions": 198,
|
|
"avg_sentiment": 0.65,
|
|
"impact_score": 78,
|
|
"related_tickers": ["ETH"],
|
|
"trend_direction": "rising"
|
|
}
|
|
],
|
|
"sentiment_shifts": [
|
|
{
|
|
"asset": "BTC",
|
|
"previous_24h_sentiment": 0.12,
|
|
"current_sentiment": 0.68,
|
|
"shift": 0.56,
|
|
"significance": "major_positive_shift"
|
|
}
|
|
],
|
|
"top_sources": [
|
|
{"source": "CoinDesk", "articles": 87, "avg_impact": 72},
|
|
{"source": "The Block", "articles": 65, "avg_impact": 78},
|
|
{"source": "CoinTelegraph", "articles": 134, "avg_impact": 65}
|
|
]
|
|
}
|
|
```

### 3. `trending_narratives_YYYYMMDD.csv`

Trending narratives and memes:

```csv
|
|
rank,narrative,mentions_24h,mentions_7d,growth_rate,avg_sentiment,related_coins,sample_headlines
|
|
1,"Bitcoin ETF Approval",342,892,283%,0.87,"BTC,ETH","SEC Approves BlackRock Bitcoin ETF; Historic Day for Crypto; ETF Approval Sends BTC to $50K"
|
|
2,"AI Crypto Projects",215,456,112%,0.72,"FET,AGIX,RNDR","AI Tokens Surge 40% on ChatGPT Integration; Fetch.ai Partners with Bosch"
|
|
3,"Memecoin Season",189,1234,53%,0.45,"DOGE,SHIB,PEPE","New Memecoin PEPE2 Explodes 300%; Dogecoin Whale Activity Spikes"
|
|
```

### 4. `entity_sentiment_tracker.json`

Per-entity sentiment tracking:

```json
|
|
{
|
|
"timestamp": "2025-10-11T14:23:45Z",
|
|
"entities": {
|
|
"cryptocurrencies": [
|
|
{
|
|
"ticker": "BTC",
|
|
"name": "Bitcoin",
|
|
"mentions_24h": 3421,
|
|
"sentiment_score": 0.68,
|
|
"sentiment_trend": "strongly_improving",
|
|
"market_impact_score": 85,
|
|
"top_keywords": ["ETF", "approval", "institutional", "adoption"]
|
|
},
|
|
{
|
|
"ticker": "ETH",
|
|
"name": "Ethereum",
|
|
"mentions_24h": 1876,
|
|
"sentiment_score": 0.52,
|
|
"sentiment_trend": "improving",
|
|
"market_impact_score": 72,
|
|
"top_keywords": ["Shanghai", "staking", "upgrade", "Layer2"]
|
|
}
|
|
],
|
|
"exchanges": [
|
|
{
|
|
"name": "Binance",
|
|
"mentions_24h": 567,
|
|
"sentiment_score": -0.23,
|
|
"sentiment_trend": "declining",
|
|
"reason": "Regulatory concerns, DOJ investigation news"
|
|
}
|
|
],
|
|
"regulations": [
|
|
{
|
|
"topic": "SEC Crypto Policy",
|
|
"mentions_24h": 892,
|
|
"sentiment_score": 0.45,
|
|
"sentiment_trend": "improving",
|
|
"reason": "ETF approval signals more favorable stance"
|
|
}
|
|
]
|
|
}
|
|
}
|
|
```

### 5. `market_moving_events.json`

Critical market-moving events detected:

```json
|
|
{
|
|
"date": "2025-10-11",
|
|
"critical_events": [
|
|
{
|
|
"event_id": "evt_20251011_001",
|
|
"title": "SEC Approves Bitcoin Spot ETF",
|
|
"category": "regulation",
|
|
"impact_score": 98,
|
|
"detected_at": "2025-10-11T14:20:45Z",
|
|
"price_impact_observed": {
|
|
"BTC": {
|
|
"price_before": 43250,
|
|
"price_15min_after": 47800,
|
|
"change_pct": 10.5,
|
|
"volume_increase_pct": 340
|
|
}
|
|
},
|
|
"news_velocity": {
|
|
"articles_first_hour": 87,
|
|
"sources_reporting": 45,
|
|
"social_mentions_first_hour": 234567
|
|
}
|
|
}
|
|
]
|
|
}
|
|
```

## Code Example 1: Core News Aggregator (Python)

```python
|
|
#!/usr/bin/env python3
|
|
"""
|
|
Production-grade cryptocurrency news aggregator with AI sentiment analysis.
|
|
Supports 50+ news sources, real-time monitoring, and market impact scoring.
|
|
"""
|
|
|
|
import asyncio
|
|
import hashlib
|
|
import json
|
|
import logging
|
|
import os
import re
|
|
import time
|
|
from datetime import datetime, timedelta
|
|
from typing import Dict, List, Optional, Set
|
|
from dataclasses import dataclass, asdict
|
|
from urllib.parse import urlparse
|
|
|
|
import aiohttp
|
|
import feedparser
|
|
import psycopg2
|
|
from psycopg2.extras import execute_batch
|
|
import redis
|
|
from anthropic import AsyncAnthropic
|
|
from elasticsearch import AsyncElasticsearch
|
|
from sklearn.metrics.pairwise import cosine_similarity
|
|
import numpy as np
|
|
|
|
logging.basicConfig(
|
|
level=logging.INFO,
|
|
format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
|
|
)
|
|
logger = logging.getLogger(__name__)
|
|
|
|
|
|
@dataclass
|
|
class NewsArticle:
|
|
"""Represents a news article with all metadata."""
|
|
article_id: str
|
|
source: str
|
|
source_type: str
|
|
title: str
|
|
content: str
|
|
summary: str
|
|
url: str
|
|
author: Optional[str]
|
|
published_at: datetime
|
|
fetched_at: datetime
|
|
|
|
# Sentiment
|
|
sentiment_score: float = 0.0
|
|
sentiment_label: str = "neutral"
|
|
confidence: float = 0.0
|
|
|
|
# Market impact
|
|
market_impact_score: int = 0
|
|
impact_category: str = "low"
|
|
|
|
# Engagement
|
|
social_shares: int = 0
|
|
comments_count: int = 0
|
|
engagement_score: float = 0.0
|
|
|
|
def to_dict(self) -> Dict:
|
|
"""Convert to dictionary for JSON serialization."""
|
|
d = asdict(self)
|
|
d['published_at'] = self.published_at.isoformat()
|
|
d['fetched_at'] = self.fetched_at.isoformat()
|
|
return d
|
|
|
|
|
|
class NewsScraper:
|
|
"""Fetch news from multiple sources."""
|
|
|
|
def __init__(self, sources_config: Dict):
|
|
self.sources_config = sources_config
|
|
self.session: Optional[aiohttp.ClientSession] = None
|
|
|
|
async def __aenter__(self):
|
|
self.session = aiohttp.ClientSession()
|
|
return self
|
|
|
|
async def __aexit__(self, exc_type, exc_val, exc_tb):
|
|
if self.session:
|
|
await self.session.close()
|
|
|
|
async def fetch_all_sources(self) -> List[NewsArticle]:
|
|
"""Fetch from all configured sources concurrently."""
|
|
tasks = []
|
|
|
|
for category, sources in self.sources_config['sources'].items():
|
|
for source in sources:
|
|
if source['type'] == 'rss':
|
|
tasks.append(self._fetch_rss(source))
|
|
elif source['type'] == 'twitter_search':
|
|
tasks.append(self._fetch_twitter(source))
|
|
elif source['type'] == 'reddit':
|
|
tasks.append(self._fetch_reddit(source))
|
|
|
|
results = await asyncio.gather(*tasks, return_exceptions=True)
|
|
|
|
articles = []
|
|
for result in results:
|
|
if isinstance(result, Exception):
|
|
logger.error(f"Source fetch failed: {result}")
|
|
elif result:
|
|
articles.extend(result)
|
|
|
|
return articles
|
|
|
|
async def _fetch_rss(self, source: Dict) -> List[NewsArticle]:
|
|
"""Fetch articles from RSS feed."""
|
|
articles = []
|
|
|
|
try:
|
|
async with self.session.get(source['url'], timeout=30) as response:
|
|
if response.status != 200:
|
|
logger.error(f"RSS fetch failed for {source['name']}: {response.status}")
|
|
return articles
|
|
|
|
content = await response.text()
|
|
feed = feedparser.parse(content)
|
|
|
|
for entry in feed.entries:
|
|
article = NewsArticle(
|
|
article_id=self._generate_article_id(entry.link),
|
|
source=source['name'],
|
|
source_type='rss',
|
|
title=entry.title,
|
|
content=entry.get('summary', ''),
|
|
summary=entry.get('summary', '')[:500],
|
|
url=entry.link,
|
|
author=entry.get('author'),
|
|
published_at=self._parse_published_date(entry),
|
|
fetched_at=datetime.utcnow()
|
|
)
|
|
articles.append(article)
|
|
|
|
except Exception as e:
|
|
logger.error(f"Error fetching RSS from {source['name']}: {e}")
|
|
|
|
return articles
|
|
|
|
async def _fetch_twitter(self, source: Dict) -> List[NewsArticle]:
|
|
"""Fetch tweets from Twitter API."""
|
|
# Simplified - production would use full Twitter API v2 integration
|
|
articles = []
|
|
|
|
try:
|
|
# Twitter API v2 search endpoint
|
|
bearer_token = os.getenv('TWITTER_BEARER_TOKEN')
|
|
headers = {'Authorization': f'Bearer {bearer_token}'}
|
|
|
|
            query = ' OR '.join(source['keywords'])
            url = "https://api.twitter.com/2/tweets/search/recent"
            # Pass the query via params so it is URL-encoded; request created_at explicitly
            params = {'query': query, 'max_results': '100', 'tweet.fields': 'created_at'}

            async with self.session.get(url, headers=headers, params=params, timeout=30) as response:
|
|
if response.status != 200:
|
|
logger.error(f"Twitter fetch failed: {response.status}")
|
|
return articles
|
|
|
|
data = await response.json()
|
|
|
|
for tweet in data.get('data', []):
|
|
article = NewsArticle(
|
|
article_id=self._generate_article_id(tweet['id']),
|
|
source=source['name'],
|
|
source_type='twitter',
|
|
title=tweet['text'][:100],
|
|
content=tweet['text'],
|
|
summary=tweet['text'][:200],
|
|
url=f"https://twitter.com/i/web/status/{tweet['id']}",
|
|
author=None, # Would fetch from includes.users
|
|
published_at=datetime.fromisoformat(tweet['created_at'].replace('Z', '+00:00')),
|
|
fetched_at=datetime.utcnow()
|
|
)
|
|
articles.append(article)
|
|
|
|
except Exception as e:
|
|
logger.error(f"Error fetching Twitter: {e}")
|
|
|
|
return articles
|
|
|
|
async def _fetch_reddit(self, source: Dict) -> List[NewsArticle]:
|
|
"""Fetch posts from Reddit."""
|
|
articles = []
|
|
|
|
try:
|
|
# Reddit API
|
|
url = f"https://www.reddit.com/r/{source['subreddit']}/hot.json?limit=100"
|
|
headers = {'User-Agent': 'CryptoNewsAggregator/1.0'}
|
|
|
|
async with self.session.get(url, headers=headers, timeout=30) as response:
|
|
if response.status != 200:
|
|
logger.error(f"Reddit fetch failed: {response.status}")
|
|
return articles
|
|
|
|
data = await response.json()
|
|
|
|
for post in data['data']['children']:
|
|
post_data = post['data']
|
|
|
|
if post_data['ups'] < source.get('min_upvotes', 0):
|
|
continue
|
|
|
|
article = NewsArticle(
|
|
article_id=self._generate_article_id(post_data['id']),
|
|
source=f"r/{source['subreddit']}",
|
|
source_type='reddit',
|
|
title=post_data['title'],
|
|
content=post_data.get('selftext', ''),
|
|
summary=post_data.get('selftext', '')[:500],
|
|
url=f"https://reddit.com{post_data['permalink']}",
|
|
author=post_data['author'],
|
|
                        published_at=datetime.utcfromtimestamp(post_data['created_utc']),
|
|
fetched_at=datetime.utcnow(),
|
|
social_shares=post_data['ups'],
|
|
comments_count=post_data['num_comments']
|
|
)
|
|
articles.append(article)
|
|
|
|
except Exception as e:
|
|
logger.error(f"Error fetching Reddit: {e}")
|
|
|
|
return articles
|
|
|
|
def _generate_article_id(self, identifier: str) -> str:
|
|
"""Generate unique article ID from URL or identifier."""
|
|
return hashlib.md5(identifier.encode()).hexdigest()
|
|
|
|
def _parse_published_date(self, entry) -> datetime:
|
|
"""Parse published date from feed entry."""
|
|
if hasattr(entry, 'published_parsed') and entry.published_parsed:
|
|
return datetime(*entry.published_parsed[:6])
|
|
return datetime.utcnow()
|
|
|
|
|
|
class SentimentAnalyzer:
|
|
"""AI-powered sentiment analysis for crypto news."""
|
|
|
|
def __init__(self, config: Dict):
|
|
self.config = config
|
|
self.client = AsyncAnthropic(api_key=os.getenv(config['api_key_env']))
|
|
self.model = config['model']
|
|
self.cache: Dict[str, Dict] = {}
|
|
|
|
async def analyze_batch(self, articles: List[NewsArticle]) -> List[NewsArticle]:
|
|
"""Analyze sentiment for batch of articles."""
|
|
        cached_articles: List[NewsArticle] = []
        pending: List[NewsArticle] = []
        tasks = []

        for article in articles:
            # Check cache first
            cache_key = article.article_id
            if cache_key in self.cache:
                cached = self.cache[cache_key]
                if time.time() - cached['timestamp'] < self.config['cache_ttl_seconds']:
                    article.sentiment_score = cached['sentiment_score']
                    article.sentiment_label = cached['sentiment_label']
                    article.confidence = cached['confidence']
                    cached_articles.append(article)
                    continue

            # Keep tasks and their source articles aligned so results can be matched up
            pending.append(article)
            tasks.append(self._analyze_single(article))

        results = await asyncio.gather(*tasks, return_exceptions=True)

        analyzed_articles = list(cached_articles)
        for article, result in zip(pending, results):
            if isinstance(result, Exception):
                logger.error(f"Sentiment analysis failed: {result}")
                analyzed_articles.append(article)  # keep the article with default neutral sentiment
            else:
                analyzed_articles.append(result)

        return analyzed_articles
|
|
|
|
async def _analyze_single(self, article: NewsArticle) -> NewsArticle:
|
|
"""Analyze sentiment for single article using Claude."""
|
|
prompt = f"""Analyze the sentiment of this cryptocurrency news article and its potential market impact.
|
|
|
|
Title: {article.title}
|
|
|
|
Content: {article.content[:2000]}
|
|
|
|
Provide your analysis in JSON format:
|
|
{{
|
|
"sentiment": "very_bearish|bearish|neutral|bullish|very_bullish",
|
|
"sentiment_score": <float between -1.0 and 1.0>,
|
|
"confidence": <float between 0.0 and 1.0>,
|
|
"market_impact_score": <integer 0-100>,
|
|
"reasoning": "<brief explanation>",
|
|
"entities": [
|
|
{{"name": "<entity>", "type": "<type>", "sentiment": <score>}}
|
|
]
|
|
}}
|
|
|
|
Consider:
|
|
- Regulatory developments (high impact)
|
|
- Exchange listings/delistings (medium-high impact)
|
|
- Major partnerships (medium impact)
|
|
- Technical upgrades (medium impact)
|
|
- General market commentary (low impact)
|
|
"""
|
|
|
|
try:
|
|
response = await self.client.messages.create(
|
|
model=self.model,
|
|
max_tokens=1000,
|
|
messages=[{"role": "user", "content": prompt}]
|
|
)
|
|
|
|
# Extract JSON from response
|
|
content = response.content[0].text
|
|
json_match = re.search(r'\{.*\}', content, re.DOTALL)
|
|
if json_match:
|
|
analysis = json.loads(json_match.group())
|
|
|
|
article.sentiment_score = analysis['sentiment_score']
|
|
article.sentiment_label = analysis['sentiment']
|
|
article.confidence = analysis['confidence']
|
|
article.market_impact_score = analysis['market_impact_score']
|
|
|
|
# Cache result
|
|
self.cache[article.article_id] = {
|
|
'sentiment_score': article.sentiment_score,
|
|
'sentiment_label': article.sentiment_label,
|
|
'confidence': article.confidence,
|
|
'timestamp': time.time()
|
|
}
|
|
|
|
except Exception as e:
|
|
logger.error(f"Sentiment analysis failed for {article.article_id}: {e}")
|
|
|
|
return article
|
|
|
|
|
|
class Deduplicator:
|
|
"""Detect and remove duplicate articles."""
|
|
|
|
def __init__(self, redis_client: redis.Redis, similarity_threshold: float = 0.85):
|
|
self.redis = redis_client
|
|
self.similarity_threshold = similarity_threshold
|
|
|
|
def deduplicate(self, articles: List[NewsArticle]) -> List[NewsArticle]:
|
|
"""Remove duplicate articles based on title similarity."""
|
|
unique_articles = []
|
|
seen_titles = set()
|
|
|
|
for article in articles:
|
|
# Check Redis cache for seen URL
|
|
if self.redis.exists(f"article:{article.url}"):
|
|
logger.debug(f"Duplicate URL detected: {article.url}")
|
|
continue
|
|
|
|
# Check title similarity
|
|
is_duplicate = False
|
|
normalized_title = self._normalize_title(article.title)
|
|
|
|
for seen_title in seen_titles:
|
|
similarity = self._calculate_similarity(normalized_title, seen_title)
|
|
if similarity >= self.similarity_threshold:
|
|
is_duplicate = True
|
|
logger.debug(f"Duplicate title detected: {article.title} (similarity: {similarity:.2f})")
|
|
break
|
|
|
|
if not is_duplicate:
|
|
unique_articles.append(article)
|
|
seen_titles.add(normalized_title)
|
|
|
|
# Cache URL in Redis (24h TTL)
|
|
self.redis.setex(f"article:{article.url}", 86400, "1")
|
|
|
|
logger.info(f"Deduplicated {len(articles)} articles to {len(unique_articles)} unique")
|
|
return unique_articles
|
|
|
|
def _normalize_title(self, title: str) -> str:
|
|
"""Normalize title for comparison."""
|
|
# Remove special characters, lowercase, remove extra spaces
|
|
title = re.sub(r'[^\w\s]', '', title.lower())
|
|
title = re.sub(r'\s+', ' ', title).strip()
|
|
return title
|
|
|
|
def _calculate_similarity(self, title1: str, title2: str) -> float:
|
|
"""Calculate Jaccard similarity between titles."""
|
|
words1 = set(title1.split())
|
|
words2 = set(title2.split())
|
|
|
|
if not words1 or not words2:
|
|
return 0.0
|
|
|
|
intersection = words1.intersection(words2)
|
|
union = words1.union(words2)
|
|
|
|
return len(intersection) / len(union)
|
|
|
|
|
|
class CryptoNewsAggregator:
|
|
"""Main news aggregation orchestrator."""
|
|
|
|
def __init__(self, sources_config_path: str, sentiment_config_path: str):
|
|
with open(sources_config_path) as f:
|
|
self.sources_config = json.load(f)
|
|
|
|
with open(sentiment_config_path) as f:
|
|
self.sentiment_config = json.load(f)
|
|
|
|
self.redis_client = redis.Redis(host='localhost', port=6379, decode_responses=True)
|
|
self.scraper = NewsScraper(self.sources_config)
|
|
self.sentiment_analyzer = SentimentAnalyzer(self.sentiment_config)
|
|
self.deduplicator = Deduplicator(self.redis_client)
|
|
|
|
# Database connection
|
|
self.db_conn = psycopg2.connect(
|
|
host=os.getenv('POSTGRES_HOST', 'localhost'),
|
|
database=os.getenv('POSTGRES_DB', 'crypto_news'),
|
|
user=os.getenv('POSTGRES_USER', 'postgres'),
|
|
password=os.getenv('POSTGRES_PASSWORD')
|
|
)
|
|
|
|
async def run_aggregation_cycle(self) -> Dict:
|
|
"""Run single aggregation cycle."""
|
|
logger.info("Starting news aggregation cycle")
|
|
|
|
# Step 1: Fetch from all sources
|
|
async with self.scraper:
|
|
raw_articles = await self.scraper.fetch_all_sources()
|
|
|
|
logger.info(f"Fetched {len(raw_articles)} articles from all sources")
|
|
|
|
# Step 2: Deduplicate
|
|
unique_articles = self.deduplicator.deduplicate(raw_articles)
|
|
|
|
# Step 3: Sentiment analysis
|
|
analyzed_articles = await self.sentiment_analyzer.analyze_batch(unique_articles)
|
|
|
|
# Step 4: Filter by minimum impact score
|
|
min_impact = self.sentiment_config.get('min_impact_score', 40)
|
|
high_impact_articles = [
|
|
a for a in analyzed_articles
|
|
if a.market_impact_score >= min_impact
|
|
]
|
|
|
|
logger.info(f"Found {len(high_impact_articles)} high-impact articles (score >= {min_impact})")
|
|
|
|
# Step 5: Store in database
|
|
self._store_articles(analyzed_articles)
|
|
|
|
# Step 6: Generate alerts for critical news
|
|
critical_articles = [a for a in analyzed_articles if a.market_impact_score >= 80]
|
|
if critical_articles:
|
|
await self._send_alerts(critical_articles)
|
|
|
|
return {
|
|
'total_fetched': len(raw_articles),
|
|
'unique_articles': len(unique_articles),
|
|
'analyzed_articles': len(analyzed_articles),
|
|
'high_impact': len(high_impact_articles),
|
|
'critical_alerts': len(critical_articles)
|
|
}
|
|
|
|
def _store_articles(self, articles: List[NewsArticle]) -> None:
|
|
"""Store articles in PostgreSQL."""
|
|
with self.db_conn.cursor() as cur:
|
|
for article in articles:
|
|
try:
|
|
cur.execute("""
|
|
INSERT INTO news_articles (
|
|
article_id, source, source_type, title, content, summary, url,
|
|
author, published_at, fetched_at, sentiment_score, sentiment_label,
|
|
confidence, market_impact_score, social_shares, comments_count
|
|
) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)
|
|
ON CONFLICT (article_id) DO UPDATE SET
|
|
sentiment_score = EXCLUDED.sentiment_score,
|
|
market_impact_score = EXCLUDED.market_impact_score
|
|
""", (
|
|
article.article_id, article.source, article.source_type,
|
|
article.title, article.content, article.summary, article.url,
|
|
article.author, article.published_at, article.fetched_at,
|
|
article.sentiment_score, article.sentiment_label, article.confidence,
|
|
article.market_impact_score, article.social_shares, article.comments_count
|
|
))
|
|
except Exception as e:
|
|
logger.error(f"Error storing article {article.article_id}: {e}")
|
|
|
|
self.db_conn.commit()
|
|
logger.info(f"Stored {len(articles)} articles in database")
|
|
|
|
async def _send_alerts(self, articles: List[NewsArticle]) -> None:
|
|
"""Send alerts for critical news."""
|
|
for article in articles:
|
|
logger.warning(f"CRITICAL NEWS ALERT: {article.title} (impact: {article.market_impact_score})")
|
|
# Implement webhook/Slack/Discord alerting here
|
|
|
|
|
|
async def main():
|
|
"""Main entry point."""
|
|
import argparse
|
|
|
|
parser = argparse.ArgumentParser(description='Aggregate cryptocurrency news with sentiment analysis')
|
|
parser.add_argument('--sources-config', default='config/news_sources.json')
|
|
parser.add_argument('--sentiment-config', default='config/sentiment_config.json')
|
|
parser.add_argument('--interval', type=int, default=300, help='Fetch interval in seconds')
|
|
|
|
args = parser.parse_args()
|
|
|
|
aggregator = CryptoNewsAggregator(args.sources_config, args.sentiment_config)
|
|
|
|
while True:
|
|
try:
|
|
stats = await aggregator.run_aggregation_cycle()
|
|
logger.info(f"Cycle complete: {stats}")
|
|
|
|
await asyncio.sleep(args.interval)
|
|
|
|
except KeyboardInterrupt:
|
|
logger.info("Shutting down...")
|
|
break
|
|
except Exception as e:
|
|
logger.error(f"Error in aggregation cycle: {e}")
|
|
await asyncio.sleep(60)
|
|
|
|
|
|
if __name__ == '__main__':
|
|
asyncio.run(main())
|
|
```

## Error Handling

| Error Type | Detection | Resolution | Prevention |
|------------|-----------|------------|------------|
| RSS feed timeout | Connection timeout (30s) | Skip source, continue with others | Per-source timeouts, retry mechanism |
| API rate limiting (Twitter) | HTTP 429 response | Exponential backoff (1 min, 5 min, 15 min; see the sketch below) | Track rate limits, stagger requests |
| Sentiment API failure | HTTP 5xx or timeout | Fall back to rule-based sentiment | Circuit breaker, local model fallback |
| Database connection lost | psycopg2.OperationalError | Reconnect with exponential backoff | Connection pooling, health checks |
| Duplicate article | Identical URL or 85%+ title similarity | Skip article, log duplicate | Redis caching with 24h TTL |
| Invalid RSS/JSON | Parsing exception | Log error, skip source | Validate feed structure before parsing |
| Memory overflow | RSS feed >100MB | Stream parsing, limit entries | Limit feed size, process in batches |
| Stale feed data | Last update >24h old | Alert admin, skip source temporarily | Monitor feed freshness, automated source health checks |

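The rate-limiting row above calls for exponential backoff. A minimal sketch of a retry helper in the same async style as the aggregator's fetchers; the 1 min / 5 min / 15 min schedule mirrors the table, and the retry count is an assumption:

```python
import asyncio
from typing import Optional

import aiohttp

async def fetch_with_backoff(session: aiohttp.ClientSession, url: str,
                             delays: tuple = (60.0, 300.0, 900.0)) -> Optional[str]:
    """GET a URL, retrying on HTTP 429 with the backoff schedule from the table above."""
    for attempt in range(len(delays) + 1):
        async with session.get(url, timeout=aiohttp.ClientTimeout(total=30)) as response:
            if response.status != 429:
                response.raise_for_status()
                return await response.text()
            if attempt == len(delays):
                return None  # retries exhausted
            # Honor Retry-After when the server provides it, otherwise fall back to the schedule
            delay = float(response.headers.get("Retry-After", delays[attempt]))
        await asyncio.sleep(delay)
    return None
```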

## Configuration Options

```yaml
|
|
# config/news_aggregator.yml
|
|
sources:
|
|
fetch_interval_seconds: 300
|
|
timeout_seconds: 30
|
|
max_articles_per_source: 100
|
|
|
|
credibility_weights:
|
|
tier_1: 1.0 # CoinDesk, The Block, Bloomberg
|
|
tier_2: 0.8 # CoinTelegraph, Decrypt
|
|
tier_3: 0.6 # Social media, blogs
|
|
|
|
sentiment:
|
|
provider: anthropic|openai|local
|
|
batch_size: 10
|
|
cache_ttl: 3600
|
|
min_confidence: 0.5
|
|
|
|
deduplication:
|
|
enabled: true
|
|
similarity_threshold: 0.85
|
|
cache_ttl_hours: 24
|
|
|
|
market_impact:
|
|
min_alert_score: 80
|
|
score_components:
|
|
source_credibility: 0.30
|
|
sentiment_strength: 0.25
|
|
entity_relevance: 0.20
|
|
social_engagement: 0.15
|
|
breaking_news: 0.10
|
|
|
|
alerts:
|
|
channels:
|
|
- slack
|
|
- discord
|
|
- email
|
|
min_impact_score: 80
|
|
rate_limit_per_hour: 20
|
|
|
|
storage:
|
|
database: postgresql
|
|
retention_days: 180
|
|
enable_elasticsearch: true
|
|
|
|
performance:
|
|
max_concurrent_fetches: 20
|
|
redis_cache: true
|
|
async_processing: true
|
|
```

## Best Practices

### DO:
- ✅ Aggregate from at least 20 sources for comprehensive coverage
- ✅ Implement deduplication (roughly 80% of news is duplicated across sources)
- ✅ Use AI sentiment analysis (>80% accuracy with GPT-4/Claude)
- ✅ Cache sentiment results (1h TTL) to reduce API costs
- ✅ Monitor source reliability and adjust weights dynamically
- ✅ Store historical data for backtesting sentiment signals
- ✅ Implement rate limiting per source to avoid bans
- ✅ Use Redis for real-time caching and deduplication
- ✅ Alert only on high-impact news (score >80) to avoid fatigue
- ✅ Track correlation between sentiment shifts and price movements

### DON'T:
- ❌ Rely on a single news source - it creates blind spots
- ❌ Skip sentiment analysis - raw news requires manual interpretation
- ❌ Process articles synchronously - too slow for real-time use
- ❌ Store unlimited history - it bloats the database (180 days max)
- ❌ Alert on every article - it causes notification fatigue
- ❌ Ignore social media - it often breaks news before mainstream outlets
- ❌ Trust sentiment without context - weigh source credibility
- ❌ Skip deduplication - it wastes roughly 80% of processing
- ❌ Serve stale articles - mark news older than 24h clearly
- ❌ Hardcode source URLs - sources change frequently

## Performance Considerations

- **Fetch Latency**: Async concurrent fetching covers 50 sources in under 30s
- **Sentiment Processing**: Batches of 10 articles take roughly 5s with Claude
- **Deduplication**: Redis O(1) lookups, under 1ms per article
- **Database Writes**: Batch inserts (100 articles) in roughly 200ms (see the sketch after the tips below)
- **Memory Usage**: Around 4GB for 10K articles in memory plus the Redis cache
- **API Costs**:
  - Claude sentiment: $0.015/1K tokens, roughly $0.002 per article
  - Twitter/X API: $100+/month depending on the read-volume tier
  - NewsAPI: free tier is sufficient for testing

**Optimization Tips:**

1. Use Redis as a hot cache (recent articles, sentiment results)
2. Use connection pooling for the database (10-20 connections)
3. Process sentiment in batches of 10 to maximize throughput
4. Use Elasticsearch for full-text search instead of PostgreSQL LIKE queries
5. Archive articles older than 90 days to cold storage (S3/Glacier)

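The batch-insert figure above assumes multi-row writes rather than the per-article INSERTs shown in `_store_articles`. A sketch of how that method could be adapted with `psycopg2.extras.execute_batch` (already imported by the aggregator); the reduced column list covers only the NOT NULL columns plus the two scores, for brevity:

```python
from psycopg2.extras import execute_batch

def store_articles_batch(conn, articles) -> None:
    """Insert analyzed articles in pages of 100 rows instead of one statement per row."""
    rows = [
        (a.article_id, a.source, a.source_type, a.title, a.url,
         a.published_at, a.sentiment_score, a.market_impact_score)
        for a in articles
    ]
    with conn.cursor() as cur:
        execute_batch(
            cur,
            """
            INSERT INTO news_articles (
                article_id, source, source_type, title, url,
                published_at, sentiment_score, market_impact_score
            ) VALUES (%s, %s, %s, %s, %s, %s, %s, %s)
            ON CONFLICT (article_id) DO UPDATE SET
                sentiment_score = EXCLUDED.sentiment_score,
                market_impact_score = EXCLUDED.market_impact_score
            """,
            rows,
            page_size=100,
        )
    conn.commit()
```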

## Security Considerations

- **API Key Management**: Store keys in environment variables; rotate quarterly
- **Rate Limiting**: Respect source rate limits to avoid bans
- **Input Validation**: Sanitize all HTML/text before storage (XSS prevention)
- **URL Validation**: Verify URLs before storage; block malicious domains
- **Database Access**: Read-only user for analytics; write user for the aggregator only
- **Content Filtering**: Filter spam, advertisements, and malicious links
- **Error Disclosure**: Don't expose internal system details in logs
- **Webhook Security**: Use HMAC signatures for alert webhooks (sketched below)
- **Compliance**: GDPR-compliant data retention (180 days) and user data anonymization
- **Monitoring Access**: Require authentication for dashboards

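The webhook-signing recommendation above can be implemented with a shared secret and an HMAC header that the receiver verifies before trusting an alert. A minimal sketch; the header name and the environment variable holding the secret are illustrative choices:

```python
import hashlib
import hmac
import json
import os

import aiohttp

async def post_signed_alert(session: aiohttp.ClientSession, url: str, alert: dict) -> int:
    """POST an alert with an HMAC-SHA256 signature computed over the exact request body."""
    secret = os.environ["ALERT_WEBHOOK_SECRET"].encode()
    body = json.dumps(alert, separators=(",", ":")).encode()
    signature = hmac.new(secret, body, hashlib.sha256).hexdigest()
    headers = {
        "Content-Type": "application/json",
        # Receiver recomputes the HMAC and compares with hmac.compare_digest
        "X-Signature-SHA256": signature,
    }
    async with session.post(url, data=body, headers=headers) as response:
        return response.status
```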

## Related Commands

- `/analyze-sentiment` - Deep sentiment analysis for specific topics
- `/monitor-whales` - Track whale transactions correlated with news
- `/generate-signal` - Trading signals incorporating news sentiment
- `/track-price` - Price tracking with news overlay
- `/analyze-chain` - On-chain metrics correlation with news events
- `/scan-movers` - Market-mover detection with news attribution

## Version History

- **v1.0.0** (2025-10-11) - Initial release with 50+ sources and AI sentiment
- **v1.1.0** (planned) - ML-based trend prediction, narrative clustering
- **v1.2.0** (planned) - Real-time event extraction, entity relationship mapping
- **v2.0.0** (planned) - Predictive market impact modeling, automated trading integration