Initial commit

Zhongwei Li
2025-11-30 08:47:18 +08:00
commit 57a131c6fd
18 changed files with 6838 additions and 0 deletions

commands/deploy-network.md Normal file

@@ -0,0 +1,595 @@
---
description: Deploy network configuration end-to-end
argument-hint: Optional deployment requirements
---
You are orchestrating a comprehensive network deployment workflow using a structured multi-phase approach to ensure safe, validated, and reversible network changes.
## Workflow Steps
### 1. Gather Deployment Requirements
If the user provides deployment details, use those directly. Otherwise, ask the user for:
**Target Environment:**
- Deployment environment (production, staging, development, lab)
- Target systems and platforms (FRR, SONiC, Debian/Ubuntu, etc.)
- Number of devices affected
- Network criticality level (mission-critical, business-critical, standard)
**Configuration Requirements:**
- What configurations to deploy (interfaces, routing, VLANs, etc.)
- Configuration files or requirements (if not already generated)
- Dependencies between configurations
- Order of operations
**Change Window:**
- Scheduled maintenance window (date, time, duration)
- Maximum acceptable downtime
- Business impact of downtime
- Peak vs off-peak considerations
**Rollback Requirements:**
- Rollback criteria (what conditions trigger rollback)
- Maximum time before rollback decision
- Rollback testing requirements
- Configuration backup retention
**Testing and Validation:**
- Pre-deployment tests
- Post-deployment validation criteria
- Automated vs manual testing
- Acceptance criteria
**Communication and Coordination:**
- Stakeholders to notify
- Change control approval
- Team coordination needs
- Communication channels (Slack, email, phone)
### 2. Launch network-orchestrator for Deployment Planning
Use the Task tool to launch the network-orchestrator agent with comprehensive deployment requirements:
```
Orchestrate a network deployment with the following requirements:
Environment: [production/staging/lab]
Criticality: [mission-critical/business-critical/standard]
Target Systems: [list of devices/platforms]
Deployment Requirements:
[Detailed configuration requirements or existing config files]
Change Window:
- Start: [date/time]
- Duration: [hours]
- Max Downtime: [minutes]
Please coordinate a complete deployment workflow including:
1. Pre-Deployment Phase
- Configuration generation (if needed)
- Configuration review (architecture)
- Security review (for production)
- Change approval process
- Team preparation and briefing
2. Deployment Preparation
- Backup current configurations
- Prepare rollback procedures
- Stage configuration files
- Verify access to all devices
- Test connectivity to management interfaces
3. Deployment Execution
- Pre-deployment validation
- Phased deployment procedure
- Step-by-step deployment commands
- Checkpoints between phases
- Progress monitoring
4. Post-Deployment Validation
- Connectivity tests
- Routing validation
- Service validation
- Performance verification
- Comprehensive validation commands
5. Rollback Procedures
- Rollback triggers
- Rollback execution steps
- Rollback validation
- Emergency procedures
6. Documentation
- Deployment checklist
- As-built documentation
- Lessons learned
- Change log update
```
### 3. Execute Pre-Deployment Phase
Work through pre-deployment requirements:
**Configuration Generation (if needed):**
- Use appropriate specialist agents (frr-config-generator, netplan-config-generator, etc.)
- Generate all required configuration files
- Validate syntax of all configurations
**Configuration Review:**
- Use network-architecture-reviewer agent for technical review
- Identify and fix any configuration errors
- Verify IP addressing, routing logic, redundancy
**Security Review (Production Only):**
- Use network-security-reviewer agent for NIST-based review
- Address critical and high security findings
- Document accepted risks
**Change Approval:**
- Submit change request to CAB (Change Advisory Board)
- Document change rationale and impact
- Obtain necessary approvals
- Communicate change window to stakeholders
### 4. Prepare Deployment Environment
Execute preparation steps:
**Backup Current State:**
```bash
# Capture one timestamp so every file from this run shares the same suffix
TS=$(date +%Y%m%d_%H%M%S)
# For each device, back up the current configuration (only the files that exist on that platform)
sudo cp /etc/frr/frr.conf /etc/frr/frr.conf.backup.$TS
sudo cp /etc/network/interfaces /etc/network/interfaces.backup.$TS
sudo mkdir -p /etc/netplan/backup
sudo cp /etc/netplan/*.yaml /etc/netplan/backup/
sudo cp /etc/sonic/config_db.json /etc/sonic/config_db.json.backup.$TS
# Document current state
ip addr show > ~/pre-deploy-state-$TS.txt
ip route show >> ~/pre-deploy-state-$TS.txt
sudo vtysh -c "show running-config" >> ~/pre-deploy-state-$TS.txt
```
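A backup only helps if the copy is intact. The following is a minimal sketch of a helper that stamps every backup in a run with one shared timestamp and verifies each copy byte for byte. The function name `backup_file` and the `TS` variable are illustrative assumptions, not part of FRR, SONiC, or netplan.

```bash
#!/usr/bin/env bash
# Hypothetical helper: back up one file with a shared timestamp and verify the copy.
# TS is captured once so every backup from the same run carries the same suffix.
TS="${TS:-$(date +%Y%m%d_%H%M%S)}"

backup_file() {
    local src="$1"
    local dest="${src}.backup.${TS}"

    # Skip files that do not exist on this platform (e.g. no SONiC config on an FRR host)
    if [ ! -f "$src" ]; then
        echo "skip: $src not found" >&2
        return 0
    fi

    cp -p "$src" "$dest" || return 1

    # cmp exits non-zero if the copy differs from the source
    if cmp -s "$src" "$dest"; then
        echo "$dest"
    else
        echo "error: backup of $src does not match" >&2
        return 1
    fi
}

# Example (run as root on the target device):
#   backup_file /etc/frr/frr.conf
#   backup_file /etc/sonic/config_db.json
```

The function prints the backup path on success, so the caller can record it in the change ticket for use during rollback.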
**Stage Configuration Files:**
```bash
# Copy configs to staging directory
sudo mkdir -p /opt/network-configs/deploy-$(date +%Y%m%d)
sudo cp new-configs/* /opt/network-configs/deploy-$(date +%Y%m%d)/
# Verify file integrity
md5sum /opt/network-configs/deploy-$(date +%Y%m%d)/* > checksums.txt
```
**Verify Management Access:**
```bash
# Test SSH access to all devices
for device in device1 device2 device3; do
    echo "Testing $device..."
    ssh -o ConnectTimeout=5 admin@"$device" "hostname"
done
# Verify console access availability
# Ensure KVM/IPMI/console server access is working
```
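For more than a handful of devices it helps to see every unreachable host in one pass instead of stopping at the first failure. This sketch assumes a non-interactive SSH probe and an `admin` user; `ssh_probe` and `check_access` are hypothetical names.

```bash
# Hypothetical helper: run a probe against each device and report every failure
# before returning, instead of stopping at the first unreachable host.
# The admin user and the 5-second timeout are assumptions.
ssh_probe() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "admin@$1" hostname >/dev/null 2>&1
}

check_access() {
    local probe="$1"; shift
    local failed=0 device
    for device in "$@"; do
        if "$probe" "$device"; then
            echo "OK   $device"
        else
            echo "FAIL $device"
            failed=$((failed + 1))
        fi
    done
    # Non-zero exit when any device is unreachable
    [ "$failed" -eq 0 ]
}

# Example:
#   check_access ssh_probe device1 device2 device3 || echo "fix access before proceeding"
```

Passing the probe as an argument keeps the loop reusable for other checks, such as a console-server or IPMI reachability test.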
**Team Coordination:**
- Assemble deployment team
- Brief on deployment plan
- Assign roles (deployment lead, validator, rollback coordinator)
- Test communication channels
- Establish escalation procedures
### 5. Execute Pre-Deployment Validation
Run pre-deployment checks:
**Network Health Check:**
```bash
# Check current connectivity
ping -c 4 <critical-destinations>
# Verify routing (on FRR hosts, run show commands through vtysh)
sudo vtysh -c "show ip route 0.0.0.0/0"
sudo vtysh -c "show ip bgp summary"
sudo vtysh -c "show ip ospf neighbor"
# Check interface status
show interfaces status
ip addr show
# Verify no existing issues
show logging | tail -100
```
**Dependency Verification:**
```bash
# Check required packages installed
dpkg -l | grep -E "frr|netplan|vlan|bridge-utils"
# Verify kernel modules loaded
lsmod | grep -E "8021q|bonding"
# Check disk space
df -h /etc
# Verify NTP synchronized
timedatectl status
```
**Go/No-Go Decision:**
- Review all pre-deployment checks
- Confirm team readiness
- Verify change window
- Get final approval to proceed
- Document decision
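The go/no-go review can be backed by a small gate that runs each pre-deployment check and reports a single decision. This is a sketch under stated assumptions: `go_no_go` is a hypothetical function, and the check commands in the example are placeholders to adapt per environment.

```bash
# Hypothetical go/no-go gate: every named check must pass for a GO decision.
# Each argument is "label:command"; the command is evaluated by the shell,
# so only pass commands you wrote yourself.
go_no_go() {
    local decision="GO" item label cmd
    for item in "$@"; do
        label="${item%%:*}"
        cmd="${item#*:}"
        if eval "$cmd" >/dev/null 2>&1; then
            echo "PASS $label"
        else
            echo "FAIL $label"
            decision="NO-GO"
        fi
    done
    echo "Decision: $decision"
    [ "$decision" = "GO" ]
}

# Example (placeholder checks):
#   go_no_go \
#     "ntp:timedatectl show -p NTPSynchronized --value | grep -qx yes" \
#     "modules:lsmod | grep -q 8021q"
```

The printed PASS/FAIL lines can be pasted into the change record to document the decision.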
### 6. Execute Phased Deployment
Deploy configurations in phases with validation between each phase:
**Phase 1: Non-Disruptive Changes**
```bash
# Deploy configurations that won't cause outages
# Examples: interface descriptions, logging, monitoring
# Apply changes
# Validate changes
# Checkpoint - if issues, rollback this phase
```
**Phase 2: Core Infrastructure**
```bash
# Deploy core routing changes
# Configure BGP/OSPF/IS-IS
# Update route policies
# Apply changes in controlled sequence
# Validate routing convergence
# Monitor for issues
# Checkpoint - continue or rollback
```
**Phase 3: Access Layer**
```bash
# Deploy access layer changes
# Configure VLANs, interfaces, ACLs
# Apply changes
# Validate connectivity
# Test user/server access
# Checkpoint - continue or rollback
```
**Phase 4: Final Services**
```bash
# Deploy remaining services
# QoS, monitoring, final optimizations
# Apply changes
# Full validation
# Final checkpoint
```
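The apply, validate, checkpoint pattern shared by the four phases can be sketched as one reusable runner. The apply, validate, and rollback arguments are shell functions you supply; `run_phase` and the example function names are hypothetical and not defined by any of the target platforms.

```bash
# Hypothetical phase runner: apply a phase, validate it, and roll back only that
# phase if either step fails.
run_phase() {
    local name="$1" apply="$2" validate="$3" rollback="$4"

    echo "[$name] applying"
    if ! "$apply"; then
        echo "[$name] apply failed, rolling back"
        "$rollback"
        return 1
    fi

    echo "[$name] validating"
    if ! "$validate"; then
        echo "[$name] validation failed, rolling back"
        "$rollback"
        return 1
    fi

    echo "[$name] checkpoint passed"
}

# Example: stop at the first phase that fails its checkpoint
#   run_phase "phase1" apply_p1 validate_p1 rollback_p1 &&
#   run_phase "phase2" apply_p2 validate_p2 rollback_p2
```

Chaining phases with `&&` means a failed checkpoint halts the rollout, leaving earlier phases in place for a separate rollback decision.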
### 7. Execute Post-Deployment Validation
Comprehensive validation after deployment:
**Connectivity Validation:**
```bash
# Test basic connectivity
ping -c 10 <gateway>
ping -c 10 <critical-server>
ping -c 10 8.8.8.8
# Test from multiple sources
# Verify bidirectional connectivity
```
**Routing Validation:**
```bash
# Verify routing protocols (FRR: run inside vtysh, or wrap each as sudo vtysh -c "<command>")
show ip bgp summary
show ip bgp neighbors
show ip ospf neighbor
show isis neighbor
# Check routing tables
show ip route
show ip route bgp
show ip route ospf
# Verify route advertisement/receipt
show ip bgp neighbors <peer> advertised-routes
show ip bgp neighbors <peer> routes
```
**Interface Validation:**
```bash
# Check all interfaces up
show interfaces status
ip link show | grep "state UP"
# Verify no errors
show interfaces counters errors
ip -s link show
# Check port channels/bonds
show interfaces portchannel
cat /proc/net/bonding/bond0
```
**Service Validation:**
```bash
# Test critical services
curl -I http://<web-service>
nc -zv <database-server> 5432
ssh <jump-host> "uptime"
# Verify DNS
nslookup <critical-hostname>
# Check monitoring
# Verify metrics collection
# Confirm alerts functioning
```
**Performance Validation:**
```bash
# Check latency
ping -c 100 <destination> | tail -1
# Measure throughput (if applicable)
iperf3 -c <test-server>
# Verify no packet loss
ping -c 1000 <destination> | grep "packet loss"
```
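To turn the packet-loss check into a pass/fail result, the ping summary line can be parsed. This sketch assumes the iputils summary format (`... received, 0% packet loss, ...`); `packet_loss_pct` and `loss_within` are hypothetical helper names.

```bash
# Hypothetical helper: read ping output on stdin and print the packet-loss
# percentage. Assumes the iputils summary format, e.g.
# "100 packets transmitted, 100 received, 0% packet loss, time 99123ms".
packet_loss_pct() {
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# Succeeds only if loss is at or below the given threshold in percent (default 0).
# Fails if no summary line is found, so a broken ping is not mistaken for success.
loss_within() {
    local max="${1:-0}" loss
    loss="$(packet_loss_pct)"
    [ -n "$loss" ] && LC_ALL=C awk -v l="$loss" -v m="$max" 'BEGIN { exit !(l <= m) }'
}

# Example:
#   ping -c 100 <destination> | loss_within 0 || echo "packet loss detected"
```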
**Application Testing:**
- Test critical business applications
- Verify user access
- Check server connectivity
- Validate service dependencies
### 8. Complete Deployment or Execute Rollback
Based on validation results:
**If Successful:**
```bash
# Document successful deployment
# Update as-built documentation
# Update network diagrams
# Update IPAM/inventory
# Notify stakeholders of completion
# Close change ticket
# Schedule post-implementation review
```
**If Rollback Needed:**
```bash
# Execute rollback procedures
sudo cp /etc/frr/frr.conf.backup.YYYYMMDD_HHMMSS /etc/frr/frr.conf
sudo systemctl restart frr
sudo cp /etc/network/interfaces.backup.YYYYMMDD_HHMMSS /etc/network/interfaces
sudo systemctl restart networking
sudo cp /etc/netplan/backup/*.yaml /etc/netplan/
sudo netplan apply
sudo cp /etc/sonic/config_db.json.backup.YYYYMMDD_HHMMSS /etc/sonic/config_db.json
sudo config reload -y
# Validate rollback successful
# Same validation steps as post-deployment
# Document rollback reason
# Schedule root cause analysis
# Plan remediation
```
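During a rollback there is little time to look up the exact `YYYYMMDD_HHMMSS` suffix. A minimal sketch of a helper that picks the newest backup, assuming the `.backup.<timestamp>` naming used during preparation; `latest_backup` is a hypothetical name.

```bash
# Hypothetical helper: print the most recent timestamped backup of a file.
# Relies on the .backup.YYYYMMDD_HHMMSS suffix sorting chronologically as text.
latest_backup() {
    local src="$1" latest="" f
    for f in "$src".backup.*; do
        [ -e "$f" ] || continue      # glob matched nothing
        latest="$f"                  # globs expand in sorted order, so the last wins
    done
    [ -n "$latest" ] && echo "$latest"
}

# Example (FRR):
#   backup="$(latest_backup /etc/frr/frr.conf)" &&
#       sudo cp "$backup" /etc/frr/frr.conf && sudo systemctl restart frr
```

If several deployments ran on the same day, confirm that the newest backup is the one taken immediately before this change.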
## Deployment Best Practices
### Change Management
1. **Follow Change Control Process**
- Submit RFC (Request for Change)
- Get CAB approval for production
- Communicate to stakeholders
- Document all changes
2. **Use Maintenance Windows**
- Schedule during low-traffic periods
- Allow sufficient time
- Have extended window for complex changes
- Plan for overrun scenarios
3. **Phased Deployment**
- Deploy in logical phases
- Validate between phases
- Checkpoints for go/no-go decisions
- Ability to rollback individual phases
4. **Test First**
- Test in lab environment
- Staging environment validation
- Pilot deployment (subset of devices)
- Then full production
### Risk Mitigation
1. **Have Backups**
- Configuration backups
- Full system backups where applicable
- Test restore procedures
- Off-site backup copies
2. **Rollback Plan**
- Documented rollback procedures
- Tested rollback process
- Clear rollback criteria
- Rollback time estimates
3. **Team Coordination**
- Dedicated deployment team
- Clear roles and responsibilities
- Communication plan
- Escalation procedures
4. **Monitoring**
- Enhanced monitoring during deployment
- Real-time alerting
- Performance baselines
- Anomaly detection
### Communication
1. **Pre-Deployment**
- Notify stakeholders of change window
- Communicate expected impact
- Provide contact information
- Set expectations
2. **During Deployment**
- Regular status updates
- Immediate notification of issues
- Progress against timeline
- Any deviations from plan
3. **Post-Deployment**
- Completion notification
- Summary of changes
- Any issues encountered
- Next steps if applicable
## Deployment Checklist
### Pre-Deployment
- [ ] Configuration files generated and reviewed
- [ ] Architecture review completed
- [ ] Security review completed (production)
- [ ] Change approval obtained
- [ ] Team briefed and assigned roles
- [ ] Backups completed and verified
- [ ] Rollback procedures documented
- [ ] Management access verified
- [ ] Staging environment tested
- [ ] Communication sent to stakeholders
### During Deployment
- [ ] Pre-deployment validation passed
- [ ] Phase 1 deployed and validated
- [ ] Phase 2 deployed and validated
- [ ] Phase 3 deployed and validated
- [ ] Final phase deployed and validated
- [ ] All checkpoints passed
- [ ] No unexpected issues
- [ ] Timeline on track
### Post-Deployment
- [ ] All connectivity tests passed
- [ ] Routing validated
- [ ] Services operational
- [ ] Performance acceptable
- [ ] No errors in logs
- [ ] Monitoring functioning
- [ ] Applications tested
- [ ] Stakeholders notified
- [ ] Documentation updated
- [ ] Change ticket closed
## Deployment Scenarios
### Low-Risk Deployment
- Lab or development environment
- Non-critical systems
- Well-tested configurations
- Easy rollback
**Simplified Process:**
1. Generate configurations
2. Quick review
3. Backup current state
4. Deploy
5. Validate
6. Done
### Medium-Risk Deployment
- Staging or pre-production
- Business-critical (not mission-critical)
- Tested configurations
- Documented rollback
**Standard Process:**
1. Generate and review configurations
2. Change approval (simplified)
3. Backup and prepare
4. Phased deployment with validation
5. Full validation
6. Document
### High-Risk Deployment
- Production environment
- Mission-critical systems
- Complex changes
- High business impact
**Full Process:**
1. Design and plan
2. Generate configurations
3. Architecture review
4. Security review
5. Lab testing
6. Change approval (full CAB)
7. Stakeholder communication
8. Comprehensive backups
9. Team coordination
10. Pre-deployment validation
11. Phased deployment with checkpoints
12. Comprehensive post-deployment validation
13. Documentation and closure
14. Post-implementation review
## Common Deployment Issues
**Configuration Syntax Errors:**
- Validate all configs before deployment
- Use dry-run/test modes
- Have syntax checkers in place
**Connectivity Loss:**
- Always have out-of-band access
- Test on non-production first
- Have console/KVM access ready
- Deploy during maintenance window
**Routing Convergence Issues:**
- Allow time for convergence
- Monitor convergence in real-time
- Have pre-calculated convergence times
- Test failover scenarios
**Rollback Complexity:**
- Keep rollback procedures simple
- Test rollback procedures beforehand
- Have automated rollback scripts
- Clear rollback criteria
## Notes
- Deployments should never be rushed
- Always have a rollback plan
- Test everything in non-production first
- Communication is critical for success
- Document everything
- Learn from each deployment (post-implementation review)
- Automation reduces risk for repetitive deployments
## Example Task Invocation
```
deploy-network Deploy BGP configuration to 10 data center leaf switches in production. We have a 4-hour maintenance window Saturday 2-6am. Need full architecture and security review before deployment. Configs are already generated and need deployment orchestration with phased rollout and comprehensive validation.
```

commands/design-network.md Normal file

@@ -0,0 +1,494 @@
---
description: Design network architecture and topology
argument-hint: Optional network requirements
---
You are orchestrating a comprehensive network design workflow using a structured approach to create detailed network architecture plans.
## Workflow Steps
### 1. Gather Requirements
If the user provides specific requirements in their message, use those directly. Otherwise, ask the user for:
**Network Type and Purpose:**
- Network environment (data center, campus, branch office, cloud, hybrid)
- Primary use case (server connectivity, user access, WAN, internet edge)
- Criticality level (production, development, testing)
**Scale Requirements:**
- Number of devices to support (servers, users, IoT devices)
- Expected throughput (1G, 10G, 25G, 100G, 400G)
- Number of sites/locations
- Growth projections (1 year, 3 years, 5 years)
**Redundancy and High Availability:**
- Uptime requirements (99.9%, 99.99%, 99.999%)
- Acceptable downtime window
- Failure tolerance (single link, single device, entire site)
- Disaster recovery requirements
- Geographic redundancy needs
**Routing and Connectivity:**
- Routing protocol preferences (BGP, OSPF, IS-IS, static)
- Layer 2 vs Layer 3 architecture
- Network segmentation requirements
- Internet connectivity (single/dual provider, bandwidth)
- WAN connectivity requirements
**Security Requirements:**
- Compliance requirements (PCI-DSS, HIPAA, SOC 2, NIST)
- Network segmentation needs (DMZ, internal zones, guest)
- Firewall requirements
- IDS/IPS requirements
- Access control requirements
**Technical Constraints:**
- Existing infrastructure to integrate with
- Vendor preferences or restrictions
- Budget constraints
- Physical space constraints
- Power and cooling constraints
**Performance Requirements:**
- Latency requirements
- Packet loss tolerance
- QoS requirements
- Bandwidth guarantees
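The uptime targets listed under Redundancy and High Availability translate directly into a yearly downtime budget, which helps when agreeing on an acceptable downtime window. A minimal sketch, assuming a 365-day year (525,600 minutes); the function name is illustrative.

```bash
# Allowed downtime per year, in minutes, for a given availability percentage.
# Assumes a 365-day year (525,600 minutes).
downtime_minutes_per_year() {
    LC_ALL=C awk -v a="$1" 'BEGIN { printf "%.1f\n", (100 - a) / 100 * 525600 }'
}

# Examples:
#   downtime_minutes_per_year 99.9     # 525.6 (about 8.8 hours)
#   downtime_minutes_per_year 99.99    # 52.6
#   downtime_minutes_per_year 99.999   # 5.3
```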
### 2. Launch network-orchestrator Agent
Use the Task tool to launch the network-orchestrator agent with a comprehensive design prompt:
```
Design a network architecture for the following requirements:
[Insert all gathered requirements here]
Please provide a comprehensive network design including:
1. Network Architecture Overview
- Architecture type (spine-leaf, three-tier, collapsed core, etc.)
- Design rationale and trade-offs
- Scalability analysis
2. Topology Design
- Physical topology diagram description
- Logical topology diagram description
- Device placement and roles
- Link design and redundancy
3. IP Addressing Scheme
- Subnet allocation plan
- IP address management strategy
- VLAN design and numbering
- Loopback addressing scheme
4. Routing Design
- Routing protocol selection and justification
- IGP design (areas, levels, metrics)
- BGP design (AS numbering, peering strategy)
- Route summarization strategy
- Failure scenario analysis
5. High Availability Design
- Redundancy architecture
- Failover mechanisms
- Link aggregation design
- Gateway redundancy (VRRP, HSRP, anycast)
- Fast convergence mechanisms (BFD, tuned timers)
6. Security Architecture
- Network segmentation design
- Trust zones and boundaries
- Firewall placement
- ACL strategy
- Management network design
7. Device Requirements
- Switch/router specifications
- Port count and speed requirements
- Buffer and table size requirements
- Feature requirements
8. Implementation Phases
- Phase 1: Core infrastructure
- Phase 2: Distribution/access layers
- Phase 3: Services and optimization
- Migration strategy (if applicable)
9. Validation and Testing Plan
- Design validation steps
- Testing scenarios
- Acceptance criteria
10. Documentation Deliverables
- Network diagrams
- IP address inventory
- Configuration templates
- Runbooks
```
### 3. Review Design Output
When the agent returns the design, review it for:
- Completeness of all required sections
- Alignment with requirements
- Realistic scalability projections
- Appropriate technology selections
- Clear migration/implementation path
- Comprehensive failure scenario analysis
### 4. Create Design Documentation
Ensure the design includes comprehensive documentation:
**Network Diagrams:**
- Physical topology (rack elevations, cable paths)
- Logical topology (L2 and L3)
- IP addressing diagram
- Security zones diagram
- Failure scenario diagrams
**Design Specifications:**
- Bill of Materials (BOM)
- Cable schedule
- IP address allocation table
- VLAN table
- Routing protocol configuration summary
**Implementation Guide:**
- Pre-implementation checklist
- Step-by-step implementation procedure
- Configuration snippets
- Testing procedures
- Rollback procedures
### 5. Validate Design Against Requirements
Conduct a requirements validation check:
```
Requirements Validation Checklist:
□ Meets capacity requirements (current and future)
□ Meets redundancy/HA requirements
□ Meets performance requirements (latency, throughput)
□ Meets security requirements
□ Scalable to projected growth
□ Within budget constraints
□ Compatible with existing infrastructure
□ Follows industry best practices
□ Has documented failure scenarios
□ Includes clear implementation plan
```
### 6. Conduct Design Review Sessions
Organize design review with stakeholders:
**Pre-Review Preparation:**
- Distribute design documentation 1 week before review
- Prepare presentation slides
- Identify key decision points
- Prepare answers to anticipated questions
**Review Agenda:**
1. **Executive Summary** (15 min)
- Business requirements recap
- Proposed architecture overview
- Key benefits and trade-offs
2. **Technical Deep Dive** (45 min)
- Topology and architecture
- IP addressing and routing
- High availability design
- Security architecture
3. **Implementation Plan** (30 min)
- Phased approach
- Timeline and milestones
- Resource requirements
- Risk mitigation
4. **Q&A and Feedback** (30 min)
- Stakeholder questions
- Feedback collection
- Action items
### 7. Iterate Based on Feedback
After design review:
- Document all feedback and concerns
- Update design based on valid concerns
- Re-evaluate technology choices if needed
- Update cost estimates
- Revise implementation timeline
- Schedule a follow-up review if major changes are made
### 8. Create Final Design Package
Assemble comprehensive design deliverables:
**Design Documents:**
1. **Executive Summary** (2-3 pages)
- Business requirements
- Proposed solution overview
- Cost and timeline summary
- Key benefits
2. **Architecture Design Document** (20-50 pages)
- Detailed architecture description
- Technology selection rationale
- All network diagrams
- IP addressing tables
- Device specifications
- Security design
3. **Implementation Plan** (10-20 pages)
- Phased implementation approach
- Detailed task list with owners
- Timeline/Gantt chart
- Testing plan
- Risk assessment and mitigation
4. **Configuration Templates**
- Base configuration templates
- Security hardening templates
- Monitoring configuration
5. **Operations Runbook**
- Day 1 operations procedures
- Troubleshooting guides
- Escalation procedures
- Maintenance procedures
## Best Practices for Network Design
### Architecture Selection
**Spine-Leaf (Clos) Architecture:**
- **Use for**: Data centers, high-performance computing
- **Benefits**: Predictable latency, easy scaling, high bandwidth
- **Considerations**: Requires L3 everywhere, more complex routing
**Three-Tier (Core-Distribution-Access):**
- **Use for**: Campus networks, traditional enterprise
- **Benefits**: Well understood, hierarchical, scalable
- **Considerations**: Can have bottlenecks at aggregation layer
**Collapsed Core (Two-Tier):**
- **Use for**: Small to medium enterprises, branch offices
- **Benefits**: Simplified, lower cost, easier management
- **Considerations**: Less scalable, potential bottlenecks
### IP Addressing Best Practices
1. **Use RFC1918 Private Address Space Efficiently**
- 10.0.0.0/8 for large enterprises
- 172.16.0.0/12 for medium enterprises
- 192.168.0.0/16 for small offices
2. **Allocate Contiguous Blocks**
- Allow for route summarization
- Simplify routing tables
- Enable easier growth
3. **Reserve Ranges**
- Management network: /24 per location
- Loopbacks: /32 from dedicated range
- Point-to-point links: /30 or /31
- Future growth: 30-50% headroom
4. **Document Everything**
- IPAM (IP Address Management) system
- Spreadsheet with allocations
- DNS and DHCP integration
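The sizing rules above (point-to-point /30 or /31, /32 loopbacks, growth headroom) come down to simple arithmetic. The sketch below uses hypothetical helper names and a 30% default headroom, matching the lower end of the range given above.

```bash
# Usable host addresses for an IPv4 prefix length.
# /31 (RFC 3021 point-to-point) and /32 (loopback) have no network/broadcast overhead.
usable_hosts() {
    local prefix="$1"
    case "$prefix" in
        32) echo 1 ;;
        31) echo 2 ;;
        *)  echo $(( (1 << (32 - prefix)) - 2 )) ;;
    esac
}

# Longest prefix (smallest subnet) that fits the required hosts plus headroom in percent.
prefix_for_hosts() {
    local hosts="$1" headroom="${2:-30}" need prefix
    need=$(( (hosts * (100 + headroom) + 99) / 100 ))   # round up
    for prefix in $(seq 30 -1 1); do
        if [ "$(usable_hosts "$prefix")" -ge "$need" ]; then
            echo "$prefix"
            return 0
        fi
    done
    return 1
}

# Example: 200 hosts with 30% headroom need 260 addresses, so a /24 (254) is too small
#   prefix_for_hosts 200 30   # 23
```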
### Routing Protocol Selection
**BGP:**
- **Use for**: Data center fabrics, internet edge, multi-tenant
- **Pros**: Scalable, flexible policy control, industry standard
- **Cons**: More complex, requires careful design
**OSPF:**
- **Use for**: Campus networks, enterprise core
- **Pros**: Fast convergence, well understood, feature-rich
- **Cons**: A single flat area scales poorly; SPF recalculation can be CPU intensive in large areas
**IS-IS:**
- **Use for**: Service provider networks, very large enterprises
- **Pros**: Scales well, stable, low overhead
- **Cons**: Less common, fewer engineers familiar with it
**Static Routes:**
- **Use for**: Small networks, specific use cases, backup paths
- **Pros**: Simple, predictable, no protocol overhead
- **Cons**: Not scalable, manual updates, no automatic failover
### High Availability Design Principles
1. **Eliminate Single Points of Failure**
- Redundant power supplies
- Dual network paths
- Multiple uplinks
- Redundant services (DNS, DHCP, etc.)
2. **Use Redundancy Protocols**
- VRRP/HSRP for gateway redundancy
- LACP for link aggregation
- BFD for fast failure detection
- Route redundancy with equal-cost multipath
3. **Design for Fast Convergence**
- Tune protocol timers appropriately
- Use BFD (sub-second detection)
- Pre-provision backup paths
- Minimize spanning-tree domains
4. **Consider Failure Scenarios**
- Single link failure
- Single device failure
- Power failure
- Site failure (for multi-site)
- Human error
### Security Design Principles
1. **Defense in Depth**
- Multiple layers of security controls
- Network segmentation
- Least privilege access
- Monitoring and logging
2. **Network Segmentation**
- Separate trust zones (internet, DMZ, internal, management)
- VLANs for logical separation
- Firewalls between zones
- Micro-segmentation for critical assets
3. **Access Control**
- Management network isolation
- SSH key authentication
- TACACS+ or RADIUS
- Role-based access control
4. **Monitoring and Logging**
- Centralized syslog
- SNMP monitoring
- NetFlow/sFlow for traffic analysis
- Security event correlation
## Common Design Patterns
### Data Center Leaf-Spine
**Architecture:**
- Spine layer: High-capacity switches (100G/400G)
- Leaf layer: ToR switches connecting servers
- Every leaf connects to every spine
- L3 routing to the leaf switches
**Routing:**
- BGP for underlay (eBGP with unique ASN per leaf)
- EVPN for overlay
- BFD for fast convergence
**Benefits:**
- Linear scaling
- Predictable latency
- High bandwidth
- Easy to automate
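Two numbers are worth checking early in a leaf-spine design: the oversubscription ratio at each leaf and the total number of fabric links. A minimal sketch with hypothetical function names; the port counts in the example are illustrative, not a recommendation.

```bash
# Leaf oversubscription ratio: server-facing bandwidth divided by spine-facing bandwidth.
# Arguments: servers per leaf, server port speed (Gbps), uplinks per leaf, uplink speed (Gbps).
oversubscription() {
    LC_ALL=C awk -v s="$1" -v ss="$2" -v u="$3" -v us="$4" \
        'BEGIN { printf "%.2f:1\n", (s * ss) / (u * us) }'
}

# Total leaf-to-spine links when every leaf connects to every spine.
# Arguments: number of leaves, number of spines.
fabric_links() {
    echo $(( $1 * $2 ))
}

# Example: 48 servers at 25G per leaf with 4 x 100G uplinks
#   oversubscription 48 25 4 100   # 3.00:1
#   fabric_links 10 4              # 40
```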
### Campus Three-Tier
**Architecture:**
- Core: High-speed backbone (typically two or more switches)
- Distribution: Aggregates access switches, L3 boundary
- Access: User/device connectivity, L2 usually
**Routing:**
- OSPF for campus routing
- Default gateway at distribution
- Static routes to core (optional)
**Benefits:**
- Well understood design
- Clear hierarchy
- Scalable for medium to large campuses
### Branch Office Hub-and-Spoke
**Architecture:**
- Central hub (data center or headquarters)
- Branch sites connect to hub
- Optional direct branch-to-branch links (partial mesh) for critical sites
**Connectivity:**
- MPLS WAN or SD-WAN
- Internet VPN backup
- Dual-homed branches for critical sites
**Routing:**
- BGP for WAN (with MPLS provider)
- OSPF or EIGRP internally
- Default route to hub
## Design Validation Checklist
Before finalizing design:
**Capacity Planning:**
- [ ] Port density meets current needs + 30% growth
- [ ] Link bandwidth supports peak traffic + 50% headroom
- [ ] Routing table size within device limits
- [ ] MAC table size sufficient for L2 domains
**Redundancy:**
- [ ] No single points of failure in critical paths
- [ ] All uplinks are redundant
- [ ] Power is redundant
- [ ] Management access is redundant
**Performance:**
- [ ] Latency meets requirements
- [ ] Bandwidth meets requirements
- [ ] QoS design supports critical applications
- [ ] Convergence time is acceptable
**Security:**
- [ ] Network segmentation implemented
- [ ] Firewalls properly placed
- [ ] ACLs defined for critical segments
- [ ] Management network isolated
- [ ] Monitoring and logging configured
**Scalability:**
- [ ] Design supports 3-5 year growth
- [ ] IP addressing allows for expansion
- [ ] Routing design scales appropriately
- [ ] Physical space for additional devices
**Operational:**
- [ ] Monitoring and management tools identified
- [ ] Automation approach defined
- [ ] Documentation complete
- [ ] Training plan for operations team
- [ ] Runbooks created
## Notes
- Network design is iterative - expect multiple revision cycles
- Involve all stakeholders early (network, security, operations, business)
- Consider operational complexity vs. technical perfection
- Document design decisions and trade-offs
- Plan for day 2 operations from the start
- Always have a rollback plan
- Test designs in lab before production deployment
## Example Task Invocation
```
design-network I need to design a data center network for 500 servers across 10 racks, requiring 25G server connectivity and 100G spine uplinks, with full redundancy and BGP routing for a multi-tenant cloud environment
```


@@ -0,0 +1,316 @@
---
description: Generate FRRouting configuration files
argument-hint: Optional routing requirements
---
You are initiating FRR configuration generation using a structured workflow to create production-ready FRRouting configuration files.
## Workflow Steps
### 1. Gather Requirements
If the user provides specific requirements in their message, use those directly. Otherwise, ask the user for:
**Required Information:**
- Routing protocols needed (BGP, OSPF, IS-IS, RIP, static routes, etc.)
- Router ID (e.g., 10.0.0.1)
- Network type (data center leaf-spine, campus core, WAN edge, etc.)
**Protocol-Specific Information:**
**For BGP:**
- Local ASN (e.g., 65001)
- Neighbor details (IP addresses, remote ASNs)
- Address families (IPv4 unicast, IPv6 unicast, EVPN, etc.)
- Route filtering requirements (prefix lists, route maps)
- BGP authentication (MD5 passwords)
- Communities and AS-path filtering
**For OSPF:**
- OSPF process ID
- Area design (area 0 backbone, additional areas)
- Network statements
- Interface costs and priorities
- Authentication (if needed)
- Area types (stub, NSSA, etc.)
**For IS-IS:**
- NET address
- Level design (Level 1, Level 2, or both)
- Interface metrics
- Authentication
**For BFD:**
- BFD parameters for fast failure detection
- Target protocols (BGP, OSPF, IS-IS)
**Additional Requirements:**
- Static routes needed
- Route redistribution between protocols
- Access lists or prefix lists
- VRF configurations (if multi-tenancy needed)
- Authentication requirements
- Specific routing policies
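To show how the gathered BGP details map onto configuration, here is a minimal sketch that renders a single-neighbor stanza. `render_bgp` is a hypothetical helper and all values are examples. Route filtering is left out; recent FRR releases reject eBGP routes without an explicit policy by default, so a real configuration would also attach the prefix lists or route maps gathered above.

```bash
# Hypothetical generator: print a minimal FRR BGP stanza from gathered requirements.
# Arguments: local ASN, router ID, neighbor IP, remote ASN.
render_bgp() {
    local asn="$1" router_id="$2" neighbor="$3" remote_as="$4"
    cat <<EOF
router bgp ${asn}
 bgp router-id ${router_id}
 neighbor ${neighbor} remote-as ${remote_as}
 !
 address-family ipv4 unicast
  neighbor ${neighbor} activate
 exit-address-family
!
EOF
}

# Example: write a candidate file for review, not directly to /etc/frr
#   render_bgp 65001 10.0.0.1 10.0.0.2 65002 > frr.conf.candidate
```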
### 2. Launch frr-config-generator Agent
Use the Task tool to launch the frr-config-generator agent with a detailed prompt containing:
```
Generate FRR configuration files for the following requirements:
[Insert gathered requirements here with all details]
Please provide:
1. Complete /etc/frr/daemons file
2. Complete /etc/frr/frr.conf configuration
3. Any additional configuration files needed
4. Step-by-step deployment procedure
5. Validation commands to verify the configuration
6. Troubleshooting commands
7. Rollback procedure
```
### 3. Review Generated Configuration
When the agent returns the configuration, review it for:
- Correct syntax for FRR version
- Proper routing protocol configuration
- Complete authentication settings
- Required route filtering
- Appropriate logging configuration
- Documentation and comments
### 4. Validate Configuration Syntax
Provide the user with validation commands they should run:
```bash
# Validate the candidate configuration without applying it (-C is vtysh's dry-run mode)
sudo vtysh -C -f ./frr.conf
# Check the installed configuration for errors
sudo vtysh -C -f /etc/frr/frr.conf
# Verify which daemons are enabled
grep -E '^[a-z0-9]+d=yes' /etc/frr/daemons
```
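A protocol configured in `frr.conf` does nothing if its daemon is disabled in the daemons file. This sketch checks that every required daemon is enabled, assuming the usual `name=yes` line format; both function names are hypothetical.

```bash
# Hypothetical helper: list daemons enabled in an FRR daemons file.
# Matches lines such as "bgpd=yes" and ignores comments and option lines.
enabled_daemons() {
    local file="${1:-/etc/frr/daemons}"
    sed -n 's/^\([a-z0-9]*d\)=yes[[:space:]]*$/\1/p' "$file"
}

# Succeeds only if every daemon named after the file argument is enabled.
require_daemons() {
    local file="$1" d; shift
    for d in "$@"; do
        enabled_daemons "$file" | grep -qx "$d" || { echo "missing: $d" >&2; return 1; }
    done
}

# Example: a BGP + BFD design needs both daemons enabled
#   require_daemons /etc/frr/daemons bgpd bfdd
```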
### 5. Present Deployment Procedure
Ensure the generated configuration includes a safe deployment procedure:
1. **Backup current configuration**
```bash
sudo cp /etc/frr/frr.conf /etc/frr/frr.conf.backup.$(date +%Y%m%d_%H%M%S)
sudo cp /etc/frr/daemons /etc/frr/daemons.backup.$(date +%Y%m%d_%H%M%S)
```
2. **Deploy new configuration**
```bash
# Copy new daemons file
sudo cp daemons /etc/frr/daemons
# Copy new configuration
sudo cp frr.conf /etc/frr/frr.conf
# Set correct permissions
sudo chown frr:frr /etc/frr/frr.conf
sudo chmod 640 /etc/frr/frr.conf
```
3. **Restart FRR services**
```bash
# Restart FRR
sudo systemctl restart frr
# Check service status
sudo systemctl status frr
```
4. **Verify configuration**
```bash
# Enter vtysh
sudo vtysh
# Show running configuration
show running-config
# Show protocol-specific status
show ip bgp summary # For BGP
show ip ospf neighbor # For OSPF
show isis neighbor # For IS-IS
show ip route # Routing table
```
### 6. Provide Validation Commands
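Include comprehensive validation commands for each configured protocol. These are vtysh commands: run them inside `sudo vtysh`, or wrap each one as `sudo vtysh -c "<command>"`: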
**BGP Validation:**
```bash
# Check BGP summary
show ip bgp summary
# Check BGP neighbors
show ip bgp neighbors
# Check received/advertised routes
show ip bgp neighbors <neighbor-ip> routes
show ip bgp neighbors <neighbor-ip> advertised-routes
# Check BGP communities
show ip bgp community
```
**OSPF Validation:**
```bash
# Check OSPF neighbors
show ip ospf neighbor
# Check OSPF database
show ip ospf database
# Check OSPF interfaces
show ip ospf interface
# Check OSPF routes
show ip route ospf
```
**IS-IS Validation:**
```bash
# Check IS-IS neighbors
show isis neighbor
# Check IS-IS database
show isis database
# Check IS-IS topology
show isis topology
```
**BFD Validation:**
```bash
# Check BFD peers
show bfd peers
# Check BFD peer details
show bfd peer <neighbor-ip>
```
### 7. Include Troubleshooting Commands
Provide troubleshooting commands for common issues:
```bash
# Check FRR daemon status
sudo systemctl status frr
# View FRR logs
sudo journalctl -u frr -f
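# Show logging configuration and destinations
sudo vtysh -c "show logging"
# Debug BGP (debug and clear commands run inside vtysh; turn debugging off with "no debug ..." when done)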
debug bgp updates
debug bgp neighbor-events
# Debug OSPF
debug ospf events
debug ospf packet all
# Clear BGP sessions (use with caution)
clear ip bgp *
clear ip bgp <neighbor-ip>
```
### 8. Document Rollback Procedure
Ensure rollback procedure is clearly documented:
```bash
# Stop FRR
sudo systemctl stop frr
# Restore backup configuration
sudo cp /etc/frr/frr.conf.backup.YYYYMMDD_HHMMSS /etc/frr/frr.conf
sudo cp /etc/frr/daemons.backup.YYYYMMDD_HHMMSS /etc/frr/daemons
# Restart FRR
sudo systemctl start frr
# Verify rollback
sudo vtysh -c "show running-config"
```
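The `YYYYMMDD_HHMMSS` placeholder in the rollback commands has to be replaced with a real backup name. A minimal sketch for finding the newest one, assuming backups follow the `<file>.backup.<timestamp>` naming used in the backup step (the function name `latest_backup` is hypothetical):

```bash
# Hypothetical helper: print the newest timestamped backup of a file.
# Relies on shell globs expanding in sorted order, so the fixed-width
# YYYYMMDD_HHMMSS suffix puts the most recent backup last.
latest_backup() (
    newest=""
    for candidate in "$1".backup.*; do
        [ -e "$candidate" ] && newest="$candidate"
    done
    # Returns non-zero and prints nothing when no backup exists.
    [ -n "$newest" ] && printf '%s\n' "$newest"
)
```

For example, `sudo cp "$(latest_backup /etc/frr/frr.conf)" /etc/frr/frr.conf` would restore the most recent copy; checking that the command printed a path before copying is advisable.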
## Best Practices
When generating FRR configurations:
1. **Security First**
- Always use authentication for routing protocols
- Implement prefix filtering on BGP sessions
- Use MD5 authentication for BGP neighbors
- Limit administrative access with ACLs
2. **Routing Protocol Selection**
- BGP: For data center fabrics, WAN, and internet connectivity
- OSPF: For campus networks and enterprise routing
- IS-IS: For large service provider networks
- Static routes: For simple scenarios or specific routing needs
3. **High Availability**
- Configure BFD for fast failure detection
- Use multiple BGP sessions for redundancy
- Implement proper OSPF area design
- Configure appropriate route summarization
4. **Operational Excellence**
- Include comprehensive logging
- Document all routing policies
- Use descriptive neighbor names
- Maintain configuration version control
- Test in non-production first
5. **Performance Optimization**
- Configure appropriate timers
- Use route summarization
- Implement route dampening for BGP
- Optimize prefix limits
## Common Scenarios
### Data Center Leaf-Spine BGP
- Use eBGP for the underlay
- Implement EVPN for overlay
- Configure BFD for fast convergence
- Use route reflectors for scaling
### Campus OSPF Network
- Design multi-area OSPF
- Use area 0 as backbone
- Implement stub areas where appropriate
- Configure OSPF authentication
### Internet Edge BGP
- Implement comprehensive prefix filtering
- Configure BGP communities
- Use local preference and MED
- Implement route dampening
- Filter bogon prefixes
## Notes
- FRR configuration uses vtysh CLI syntax similar to industry-standard routing platforms
- Configuration can be managed via /etc/frr/frr.conf or through vtysh interactive CLI
- Always test routing changes in non-production environments first
- Monitor routing protocol convergence during changes
- Keep backup configurations for quick rollback
## Example Task Invocation
```
generate-frr-config I need BGP configuration for a data center leaf switch with ASN 65001, two spine neighbors (192.168.1.1 AS 65100 and 192.168.1.2 AS 65100), advertising loopback 10.0.0.1/32 and local networks 10.10.0.0/24
```


@@ -0,0 +1,472 @@
---
description: Generate /etc/network/interfaces configuration files
argument-hint: Optional interface requirements
---
You are initiating /etc/network/interfaces configuration generation using a structured workflow to create production-ready Debian/Ubuntu networking configuration files.
## Workflow Steps
### 1. Gather Requirements
If the user provides specific requirements in their message, use those directly. Otherwise, ask the user for:
**Basic Requirements:**
- Target system (Debian version, Ubuntu version)
- Interfaces to configure (eth0, enp0s3, etc.)
- IP addressing method (static, DHCP, or both)
- DNS nameservers
- Search domains
**For Static IP Configuration:**
- IP address and netmask (e.g., 192.168.1.100/24)
- Gateway IP address
- Additional IP addresses (if needed)
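Users often supply a dotted netmask while the example above uses CIDR notation. The sketch below converts one to the other; the function name `mask_to_prefix` is hypothetical, and it only sums the bits per octet without checking that the mask is contiguous.

```bash
# Hypothetical helper: convert a dotted netmask to a CIDR prefix length.
# Usage: mask_to_prefix 255.255.255.0
mask_to_prefix() (
    prefix=0
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the mask into its four octets
    IFS=$old_ifs
    for octet in "$@"; do
        case "$octet" in
            255) prefix=$((prefix + 8)) ;;
            254) prefix=$((prefix + 7)) ;;
            252) prefix=$((prefix + 6)) ;;
            248) prefix=$((prefix + 5)) ;;
            240) prefix=$((prefix + 4)) ;;
            224) prefix=$((prefix + 3)) ;;
            192) prefix=$((prefix + 2)) ;;
            128) prefix=$((prefix + 1)) ;;
            0) ;;
            *) echo "invalid netmask octet: $octet" >&2; exit 1 ;;
        esac
    done
    echo "$prefix"
)
```

With this, `192.168.1.100` plus `255.255.255.0` can be recorded as `192.168.1.100/24`.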
**For VLAN Configuration:**
- VLAN IDs and parent interfaces
- IP addressing for each VLAN
- VLAN naming convention
**For Bridge Configuration:**
- Bridge interfaces to create
- Physical interfaces to attach to bridges
- STP settings (on/off)
- IP addressing for bridges
- Use case (virtualization, container networking)
**For Bond Configuration:**
- Bond interfaces to create
- Physical interfaces to bond
- Bond mode (active-backup, 802.3ad, balance-rr, etc.)
- MII monitoring interval
- Primary interface (for active-backup)
**Advanced Options:**
- MTU settings (jumbo frames)
- Static routes
- Policy routing
- IPv6 configuration
- Pre/post up/down scripts
### 2. Launch interfaces-config-generator Agent
Use the Task tool to launch the interfaces-config-generator agent with a detailed prompt containing:
```
Generate /etc/network/interfaces configuration for the following requirements:
[Insert gathered requirements here with all details]
Please provide:
1. Complete /etc/network/interfaces file content
2. List of required packages to install
3. Step-by-step deployment procedure
4. Validation commands
5. Rollback procedure
6. Comments explaining each section
```
### 3. Review Generated Configuration
When the agent returns the configuration, review it for:
- Correct syntax and indentation
- Loopback interface inclusion
- Proper use of auto/allow-hotplug directives
- No conflicting gateway definitions
- Correct netmask/CIDR notation
- Required package dependencies documented
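The gateway check in this list can be partly automated. The sketch below counts `gateway` lines per address family in an interfaces file; the function name `count_gateways` is hypothetical, and it assumes each `gateway` line belongs to the most recent `iface` stanza above it.

```bash
# Hypothetical helper: count "gateway" lines per address family.
# Usage: count_gateways /etc/network/interfaces
count_gateways() {
    awk '
        $1 == "iface"   { family = $3 }        # remember the current stanza family
        $1 == "gateway" { count[family]++ }
        END { printf "inet=%d inet6=%d\n", count["inet"], count["inet6"] }
    ' "$1"
}
```

A count above 1 for either family is worth a closer look, since only one default gateway per address family is normally wanted.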
### 4. Identify Required Packages
Ensure the configuration includes a list of required packages:
**Common Package Requirements:**
```bash
# Base networking (usually pre-installed)
apt-get install ifupdown
# For VLAN support
apt-get install vlan
# For bridge support
apt-get install bridge-utils
# For bonding support
apt-get install ifenslave
# For advanced routing
apt-get install iproute2
```
### 5. Present Deployment Procedure
Ensure the generated configuration includes a safe deployment procedure:
1. **Install Required Packages**
```bash
# Update package lists
sudo apt-get update
# Install required packages
sudo apt-get install -y vlan bridge-utils ifenslave
# Load kernel modules
sudo modprobe 8021q # VLAN support
sudo modprobe bonding # Bonding support
# Make modules load at boot
echo "8021q" | sudo tee -a /etc/modules
echo "bonding" | sudo tee -a /etc/modules
```
2. **Backup Current Configuration**
```bash
# Backup interfaces file
sudo cp /etc/network/interfaces /etc/network/interfaces.backup.$(date +%Y%m%d_%H%M%S)
# Backup current network state (one timestamp so both commands write to the same file)
ts=$(date +%Y%m%d_%H%M%S)
ip addr show > ~/network-backup-$ts.txt
ip route show >> ~/network-backup-$ts.txt
```
3. **Test Configuration Syntax**
```bash
# Dry-run the new file before installing it (-i reads an alternate interfaces file)
sudo ifup --no-act -i new-interfaces eth0
sudo ifup --no-act -i new-interfaces <interface-name>
# List the stanza lines for a quick visual check (this does not validate syntax)
grep -E "^(auto|allow-hotplug|iface)" new-interfaces
```
4. **Deploy New Configuration**
```bash
# Copy new configuration
sudo cp new-interfaces /etc/network/interfaces
# Set correct permissions
sudo chmod 644 /etc/network/interfaces
sudo chown root:root /etc/network/interfaces
```
5. **Apply Configuration**
```bash
# Method 1: Restart networking service (may cause temporary disconnection)
sudo systemctl restart networking
# Method 2: Bring down and up specific interfaces
sudo ifdown eth0 && sudo ifup eth0
# Method 3: Reboot (safest for complex changes)
sudo reboot
```
6. **Verify Configuration**
```bash
# Check interface status
ip addr show
# Check routing table
ip route show
# Test connectivity
ping -c 4 <gateway-ip>
ping -c 4 8.8.8.8
# Check DNS resolution
nslookup google.com
```
### 6. Provide Validation Commands
Include comprehensive validation commands:
**Interface Status:**
```bash
# Show all interfaces
ip addr show
# Show specific interface
ip addr show eth0
# Show interface statistics
ip -s link show eth0
# Check interface up/down state
ip link show | grep "state UP"
```
**Routing Validation:**
```bash
# Show main routing table
ip route show
# Show all routing tables
ip route show table all
# Show specific route
ip route get 8.8.8.8
```
**VLAN Validation:**
```bash
# Check VLAN interfaces
cat /proc/net/vlan/config
# Show VLAN interface details
ip -d link show eth0.100
```
**Bridge Validation:**
```bash
# Show bridge interfaces
brctl show
# Show bridge details
bridge link show
# Check STP status
brctl showstp br0
```
**Bond Validation:**
```bash
# Check bonding status
cat /proc/net/bonding/bond0
# Show bond interface details
ip -d link show bond0
```
### 7. Include Troubleshooting Commands
Provide troubleshooting commands for common issues:
**Interface Not Coming Up:**
```bash
# Check interface configuration
sudo ifquery eth0
# Try manual bring-up with verbose output
sudo ifup -v eth0
# Check system logs
sudo journalctl -u networking -n 50
# Check interface configuration file syntax
sudo ifquery --list
```
**No Network Connectivity:**
```bash
# Check interface status
ip link show
# Check IP addressing
ip addr show
# Check default route
ip route show default
# Check physical link
ethtool eth0
# Test ARP
ip neigh show
```
**VLAN Issues:**
```bash
# Verify VLAN module loaded
lsmod | grep 8021q
# Check VLAN interface
cat /proc/net/vlan/eth0.100
# Manually create VLAN to test
sudo ip link add link eth0 name eth0.100 type vlan id 100
```
**Bridge Issues:**
```bash
# Check bridge configuration
brctl show
# View bridge MAC learning table
brctl showmacs br0
# Check STP state
brctl showstp br0
```
**Bond Issues:**
```bash
# Check bonding module
lsmod | grep bonding
# View bond status
cat /proc/net/bonding/bond0
# Check bond mode and slaves
ip -d link show bond0
```
### 8. Document Rollback Procedure
Ensure rollback procedure is clearly documented:
```bash
# Method 1: Restore backup configuration
sudo cp /etc/network/interfaces.backup.YYYYMMDD_HHMMSS /etc/network/interfaces
sudo systemctl restart networking
# Method 2: Manual interface configuration (temporary, lost at reboot)
# Bring the link up first; the default route cannot be added while it is down
sudo ip link set eth0 up
sudo ip addr add 192.168.1.100/24 dev eth0
sudo ip route add default via 192.168.1.1
# Method 3: Boot into recovery mode
# Reboot and select recovery mode from GRUB menu
# Edit /etc/network/interfaces manually
# Resume normal boot
# Verify rollback
ip addr show
ip route show
ping -c 4 <gateway-ip>
```
## Best Practices
When generating /etc/network/interfaces configurations:
1. **Always Include Loopback**
```
auto lo
iface lo inet loopback
```
2. **Use auto vs allow-hotplug Appropriately**
- `auto`: For interfaces that should always come up at boot
- `allow-hotplug`: For removable devices (USB, wireless)
3. **Consistent Indentation**
- Use spaces or tabs consistently
- Indent option lines under iface declarations
4. **Gateway Configuration**
- Only one default gateway per address family
- Specify gateway on the primary internet-facing interface
5. **Documentation**
- Add comments explaining complex configurations
- Document interface purposes
- Note any external dependencies
6. **Testing**
- Always use `ifup --no-act` before applying
- Test in non-production first
- Have console access before making changes
- Keep backup configurations
7. **Modular Configuration**
- Use `/etc/network/interfaces.d/` for complex setups
- Separate VLANs, bridges, bonds into different files
## Common Scenarios
### Simple Static IP Server
```
auto lo
iface lo inet loopback
auto eth0
iface eth0 inet static
    address 192.168.1.100
    netmask 255.255.255.0
    gateway 192.168.1.1
    dns-nameservers 8.8.8.8 8.8.4.4
```
### DHCP with Static Route
```
auto eth0
iface eth0 inet dhcp
    up ip route add 10.0.0.0/8 via 192.168.1.254
    down ip route del 10.0.0.0/8 via 192.168.1.254
```
### VLAN Configuration
```
auto eth0
iface eth0 inet manual
auto eth0.100
iface eth0.100 inet static
    address 10.0.100.1
    netmask 255.255.255.0
    vlan-raw-device eth0
```
### Bridge for Virtualization
```
auto br0
iface br0 inet static
    address 192.168.1.10
    netmask 255.255.255.0
    gateway 192.168.1.1
    bridge_ports eth0 eth1
    bridge_stp off
    bridge_fd 0
```
### Active-Backup Bond
```
auto bond0
iface bond0 inet static
    address 192.168.1.10
    netmask 255.255.255.0
    bond-slaves eth0 eth1
    bond-mode active-backup
    bond-miimon 100
    bond-primary eth0
```
## Migration Notes
**For Systems Using Netplan:**
- Ubuntu 17.10+ uses netplan by default
- /etc/network/interfaces is deprecated on these systems
- Consider using generate-netplan-config instead
- If using the interfaces file on a netplan system, remove or disable the netplan configuration first so the two do not manage the same interfaces
**Checking Current Network Manager:**
```bash
# Check if netplan is active
ls -la /etc/netplan/
# Check if using systemd-networkd
systemctl status systemd-networkd
# Check if using NetworkManager
systemctl status NetworkManager
# Check if using ifupdown
systemctl status networking
```
## Notes
- /etc/network/interfaces is the traditional Debian/Ubuntu networking configuration
- Widely supported across Debian releases and Ubuntu versions pre-17.10
- Still commonly used for servers and systems requiring fine-grained control
- Requires ifupdown package
- Configuration changes require interface restart or system reboot
- Not all features available with all network managers
## Example Task Invocation
```
generate-interfaces-config I need static IP 192.168.1.50/24 on eth0 with gateway 192.168.1.1, two VLANs (VLAN 100 and 200), and a bridge br0 for KVM
```


@@ -0,0 +1,628 @@
---
description: Generate netplan YAML configuration files
argument-hint: Optional netplan requirements
---
You are initiating netplan configuration generation using a structured workflow to create production-ready netplan YAML configuration files for modern Ubuntu and Debian systems.
## Workflow Steps
### 1. Gather Requirements
If the user provides specific requirements in their message, use those directly. Otherwise, ask the user for:
**Basic Requirements:**
- Target system (Ubuntu version 17.10+, Debian with netplan)
- Renderer preference (networkd for servers, NetworkManager for desktops)
- Interfaces to configure (eth0, enp0s3, wlan0, etc.)
- IP addressing method (static, DHCP, or both)
- DNS nameservers
- Search domains
**For Static IP Configuration:**
- IP address and CIDR (e.g., 192.168.1.100/24)
- Gateway IP address (or use routes with "to: default")
- Additional IP addresses (if needed)
**For VLAN Configuration:**
- VLAN IDs and parent interfaces
- IP addressing for each VLAN
- VLAN naming convention (e.g., vlan100 or eth0.100)
**For Bridge Configuration:**
- Bridge interfaces to create
- Physical interfaces to attach to bridges
- STP settings (true/false)
- Forward delay settings
- IP addressing for bridges
- Use case (virtualization, container networking)
**For Bond Configuration:**
- Bond interfaces to create
- Physical interfaces to bond
- Bond mode (active-backup, 802.3ad, balance-rr, balance-xor, etc.)
- MII monitor interval
- Primary interface (for active-backup mode)
- LACP rate (for 802.3ad)
**For WiFi Configuration:**
- SSID and password
- Security type (WPA2, WPA3)
- DHCP or static configuration
**Advanced Options:**
- MTU settings (jumbo frames for 9000 MTU)
- Static routes with metrics
- Routing policy rules
- IPv6 configuration
- Optional interfaces (don't wait at boot)
- DHCP overrides (use-dns, use-routes, etc.)
### 2. Launch netplan-config-generator Agent
Use the Task tool to launch the netplan-config-generator agent with a detailed prompt containing:
```
Generate netplan YAML configuration for the following requirements:
[Insert gathered requirements here with all details]
Please provide:
1. Complete netplan YAML file content
2. Recommended filename (e.g., /etc/netplan/01-network-config.yaml)
3. Step-by-step deployment procedure
4. Validation commands with netplan try
5. Rollback procedure
6. Comments explaining each section
7. Any version-specific considerations
```
### 3. Review Generated Configuration
When the agent returns the configuration, review it for:
- Valid YAML syntax (proper indentation with spaces, not tabs)
- Includes `version: 2` at the top
- Correct renderer specification
- Proper CIDR notation for IP addresses
- No conflicting gateway definitions
- Appropriate use of modern routes vs deprecated gateway4/gateway6
- Correct interface naming for the target system
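Several items on this checklist can be screened mechanically before a human review. The sketch below is one possible pre-check, not a replacement for `netplan generate`; the function name `netplan_quick_check` and its messages are hypothetical.

```bash
# Hypothetical pre-check for a netplan YAML file.
# Prints one line per problem found and exits non-zero if there is any.
netplan_quick_check() (
    file="$1"
    status=0
    tab=$(printf '\t')
    if grep -q "$tab" "$file"; then
        echo "tabs found (YAML indentation must use spaces)"
        status=1
    fi
    if ! grep -Eq '^[[:space:]]*version:[[:space:]]*2[[:space:]]*$' "$file"; then
        echo "missing 'version: 2'"
        status=1
    fi
    if grep -Eq '^[[:space:]]*gateway[46]:' "$file"; then
        echo "deprecated gateway4/gateway6 (use routes with 'to: default')"
        status=1
    fi
    exit "$status"
)
```

It only pattern-matches text, so a file that passes can still be invalid YAML; the netplan tooling remains the authoritative check.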
### 4. Validate YAML Syntax
Before deployment, ensure YAML syntax validation:
```bash
# Install YAML linter if needed
sudo apt-get install -y yamllint
# Validate YAML syntax
yamllint /etc/netplan/01-network-config.yaml
# Or parse it with Python (requires PyYAML)
python3 -c "import yaml; yaml.safe_load(open('/etc/netplan/01-network-config.yaml'))"
# Netplan's own syntax check
sudo netplan generate
```
### 5. Present Deployment Procedure
Ensure the generated configuration includes a safe deployment procedure:
1. **Backup Current Configuration**
```bash
# Backup all existing netplan files
sudo mkdir -p /etc/netplan/backup
sudo cp /etc/netplan/*.yaml /etc/netplan/backup/
# Backup current network state (one timestamp so both commands write to the same file)
ts=$(date +%Y%m%d_%H%M%S)
ip addr show > ~/network-backup-$ts.txt
ip route show >> ~/network-backup-$ts.txt
```
2. **Create New Configuration File**
```bash
# Create new netplan configuration
sudo nano /etc/netplan/01-network-config.yaml
# Paste the generated YAML configuration
# Save and exit (Ctrl+X, Y, Enter)
# Set correct permissions
sudo chmod 600 /etc/netplan/01-network-config.yaml
sudo chown root:root /etc/netplan/01-network-config.yaml
```
3. **Remove Conflicting Configurations (if any)**
```bash
# List existing netplan files
ls -la /etc/netplan/
# Remove old cloud-init config if replacing it
# sudo rm /etc/netplan/50-cloud-init.yaml
# Or disable cloud-init network config
# echo "network: {config: disabled}" | sudo tee /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg
```
4. **Test Configuration with Auto-Revert**
```bash
# Try configuration with 120-second auto-revert
sudo netplan try
# The configuration will be applied
# If you lose connectivity, it auto-reverts after 120 seconds
# If everything works, press Enter to accept
# For debug output
sudo netplan --debug try
```
5. **Apply Configuration Permanently**
```bash
# After successful try, apply permanently
sudo netplan apply
# Or with debug output
sudo netplan --debug apply
```
6. **Verify Configuration**
```bash
# Check interface status
ip addr show
# Check routing table
ip route show
# Check IPv6 routes
ip -6 route show
# Test connectivity
ping -c 4 <gateway-ip>
ping -c 4 8.8.8.8
# Check DNS resolution
resolvectl status
nslookup google.com
```
### 6. Provide Validation Commands
Include comprehensive validation commands:
**Netplan Status:**
```bash
# Generate backend configuration
sudo netplan generate
# Show the merged netplan configuration
sudo netplan get
# Show current netplan configuration
cat /etc/netplan/*.yaml
```
**Interface Status:**
```bash
# Show all interfaces with networkctl (systemd-networkd)
networkctl status
# Show specific interface
networkctl status eth0
# Traditional ip command
ip addr show
ip link show
# Show interface statistics
ip -s link show eth0
```
**Routing Validation:**
```bash
# Show IPv4 routes
ip route show
# Show IPv6 routes
ip -6 route show
# Show all routing tables
ip route show table all
# Test specific route
ip route get 8.8.8.8
```
**DNS Validation:**
```bash
# Check DNS configuration (systemd-resolved)
resolvectl status
# Check resolv.conf
cat /etc/resolv.conf
# Test DNS resolution
nslookup google.com
dig google.com
```
**VLAN Validation:**
```bash
# Show VLAN interfaces
ip -d link show type vlan
# Show specific VLAN
ip -d link show vlan100
```
**Bridge Validation:**
```bash
# Show bridge interfaces
ip -d link show type bridge
# Show bridge details with bridge command
bridge link show
# Check bridge forwarding database
bridge fdb show
```
**Bond Validation:**
```bash
# Show bond interfaces
ip -d link show type bond
# Check bonding status (if using kernel bonding)
cat /proc/net/bonding/bond0
# Show bond details
ip -d link show bond0
```
**WiFi Validation:**
```bash
# Show WiFi status
nmcli device wifi list
# Show connection status
nmcli connection show
# Check WiFi interface
iw dev wlan0 info
```
### 7. Include Troubleshooting Commands
Provide troubleshooting commands for common issues:
**Configuration Not Applying:**
```bash
# Debug netplan apply
sudo netplan --debug apply
# Check netplan logs
sudo journalctl -u systemd-networkd -n 50
# For NetworkManager renderer
sudo journalctl -u NetworkManager -n 50
# Generate configuration manually
sudo netplan generate
# Check generated files
ls -la /run/netplan/
cat /run/netplan/*.network
```
**YAML Syntax Errors:**
```bash
# Validate YAML syntax
yamllint /etc/netplan/*.yaml
# Check for tabs vs spaces (prints every line that contains a tab)
grep -n "$(printf '\t')" /etc/netplan/01-network-config.yaml
# Python YAML validation
python3 -c "import yaml; yaml.safe_load(open('/etc/netplan/01-network-config.yaml'))"
```
**Interface Not Coming Up:**
```bash
# Check interface status
networkctl status eth0
# Check systemd-networkd status
sudo systemctl status systemd-networkd
# Restart networkd
sudo systemctl restart systemd-networkd
# Check for interface errors
ip link show eth0
# Check kernel messages
dmesg | grep eth0
```
**No Network Connectivity:**
```bash
# Check IP address assignment
ip addr show
# Check default route
ip route show default
# Check physical link
ethtool eth0
# Test ARP
ip neigh show
# Check firewall
sudo iptables -L -n -v
```
**DNS Not Working:**
```bash
# Check systemd-resolved status
sudo systemctl status systemd-resolved
# Restart resolved
sudo systemctl restart systemd-resolved
# Check DNS servers
resolvectl status
# Check resolv.conf symlink
ls -la /etc/resolv.conf
# Test DNS manually
dig @8.8.8.8 google.com
```
**Renderer Issues:**
```bash
# Check which renderer is active
ps aux | grep -E "networkd|NetworkManager"
# Switch from networkd to NetworkManager
sudo systemctl stop systemd-networkd
sudo systemctl disable systemd-networkd
sudo systemctl start NetworkManager
sudo systemctl enable NetworkManager
# Then update netplan renderer to NetworkManager
```
### 8. Document Rollback Procedure
Ensure rollback procedure is clearly documented:
```bash
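# Method 1: Restore backup configuration
# Remove the newly created file first so it cannot linger next to the restored ones
sudo rm -f /etc/netplan/01-network-config.yaml
sudo cp /etc/netplan/backup/*.yaml /etc/netplan/
sudo netplan apply
# Method 2: Manual configuration (temporary - survives until reboot)
# Bring the link up first; the default route cannot be added while it is down
sudo ip link set eth0 up
sudo ip addr add 192.168.1.100/24 dev eth0
sudo ip route add default via 192.168.1.1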
# Method 3: Boot with old configuration
# If netplan try timed out, old config is automatically restored
# Method 4: Edit from recovery mode
# Reboot into recovery mode if remote access is lost
# Edit /etc/netplan/*.yaml
# Run: netplan apply
# Resume normal boot
# Verify rollback
ip addr show
ip route show
ping -c 4 <gateway-ip>
```
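Restoring a backup by copying files over the top leaves any newly added YAML file in place. The sketch below shows one way to make the restore exact, assuming the backup directory layout used in the backup step; the function name `restore_netplan` is hypothetical and it must run with enough privileges to write to the target directory.

```bash
# Hypothetical helper: replace all YAML files in a netplan directory
# with the contents of a backup directory.
# Usage: restore_netplan /etc/netplan/backup /etc/netplan
restore_netplan() (
    backup_dir="$1"
    target_dir="$2"
    # Refuse to delete anything unless the backup actually holds YAML files.
    set -- "$backup_dir"/*.yaml
    [ -e "$1" ] || { echo "no backup files in $backup_dir" >&2; exit 1; }
    rm -f "$target_dir"/*.yaml      # subdirectories such as backup/ are untouched
    cp -p "$@" "$target_dir"/
)
```

After it succeeds, `sudo netplan apply` is still needed to activate the restored files.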
## Best Practices
When generating netplan configurations:
1. **Use Correct YAML Syntax**
- Always use spaces for indentation (typically 2 spaces)
- Never use tabs
- Use proper list syntax with dashes
- Quote special characters in SSIDs
2. **File Naming Convention**
- Use numeric prefixes: 01-, 02-, 10-, 50-
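- Files processed in lexicographical order; later files override earlier ones
- Example: `/etc/netplan/01-network-config.yaml`
- Cloud-init writes `50-cloud-init.yaml`, so use a higher prefix (e.g. 60- or 99-) to override it, or remove/disable the cloud-init file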
3. **Renderer Selection**
- `networkd`: For servers, VMs, containers (lightweight)
- `NetworkManager`: For desktops, laptops (feature-rich GUI)
4. **Modern vs Legacy Syntax**
- Prefer `routes` over deprecated `gateway4`/`gateway6`
- Use `nameservers` for DNS configuration
- Use `addresses` with CIDR notation
5. **Security**
- Set file permissions to 600 (only root readable)
- Don't commit files with WiFi passwords to public repos
- Use separate files for sensitive configurations
6. **Testing Strategy**
- Always use `netplan try` before `netplan apply`
- Test in non-production first
- Have console/KVM access before making changes
- Keep backup configurations
7. **Documentation**
- Add YAML comments for complex configurations
- Document interface purposes
- Note any quirks or special requirements
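The file naming rule is easier to reason about when the read order is visible. The sketch below lists the YAML files of one directory in plain lexicographical order; the function name `netplan_read_order` is hypothetical, and it simplifies by looking at a single directory, whereas netplan also reads `/run/netplan` and `/lib/netplan`. In netplan's merge behavior, settings in later files override the same keys from earlier ones.

```bash
# Hypothetical helper: list netplan YAML files in the order they are read.
# Usage: netplan_read_order /etc/netplan
netplan_read_order() (
    cd "$1" || exit 1
    for f in *.yaml; do
        [ -e "$f" ] && printf '%s\n' "$f"
    done | LC_ALL=C sort      # byte-wise ordering of the file names
)
```

The last name printed is the file whose settings win when two files configure the same interface.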
## Common Scenarios
### Simple Static IP Server (Modern Syntax)
```yaml
network:
  version: 2
  renderer: networkd
  ethernets:
    eth0:
      addresses:
        - 192.168.1.100/24
      routes:
        - to: default
          via: 192.168.1.1
      nameservers:
        addresses:
          - 8.8.8.8
          - 8.8.4.4
```
### DHCP Configuration
```yaml
network:
  version: 2
  renderer: networkd
  ethernets:
    eth0:
      dhcp4: true
      dhcp6: false
```
### VLAN Configuration
```yaml
network:
  version: 2
  renderer: networkd
  ethernets:
    eth0:
      dhcp4: false
  vlans:
    vlan100:
      id: 100
      link: eth0
      addresses:
        - 10.0.100.1/24
    vlan200:
      id: 200
      link: eth0
      addresses:
        - 10.0.200.1/24
```
### Bridge for Virtualization
```yaml
network:
  version: 2
  renderer: networkd
  ethernets:
    eth0:
      dhcp4: false
    eth1:
      dhcp4: false
  bridges:
    br0:
      interfaces:
        - eth0
        - eth1
      addresses:
        - 192.168.1.10/24
      routes:
        - to: default
          via: 192.168.1.1
      parameters:
        stp: false
        forward-delay: 0
```
### Bond Configuration (LACP)
```yaml
network:
  version: 2
  renderer: networkd
  ethernets:
    eth0:
      dhcp4: false
    eth1:
      dhcp4: false
  bonds:
    bond0:
      interfaces:
        - eth0
        - eth1
      addresses:
        - 192.168.1.10/24
      routes:
        - to: default
          via: 192.168.1.1
      parameters:
        mode: 802.3ad
        mii-monitor-interval: 100
        lacp-rate: fast
```
### WiFi Configuration
```yaml
network:
  version: 2
  renderer: NetworkManager
  wifis:
    wlan0:
      access-points:
        "MyNetwork":
          password: "securepassword"
      dhcp4: true
```
## Version Compatibility
**Netplan 0.103+:**
- Prefer `routes` with `to: default` over `gateway4`/`gateway6`
- Better routing policy support
**Ubuntu Versions:**
- Ubuntu 17.10+: Netplan by default
- Ubuntu 20.04 LTS: Netplan 0.99+
- Ubuntu 22.04 LTS: Netplan 0.103+
- Ubuntu 24.04 LTS: Latest netplan features
**Renderer Availability:**
- `systemd-networkd`: Available on all netplan systems
- `NetworkManager`: May need installation on server editions
## Migration Notes
**From /etc/network/interfaces:**
- Netplan uses YAML instead of interfaces file format
- Convert `iface eth0 inet static` to YAML ethernets section
- Convert bond/bridge syntax to netplan parameters
- Update scripts to use `netplan apply` instead of `ifup/ifdown`
**From NetworkManager (GUI Config):**
- Export existing connections to netplan format
- Use `renderer: NetworkManager` to keep using GUI
- Or migrate to `renderer: networkd` for server deployments
## Notes
- Netplan is a network configuration abstraction for Ubuntu 17.10+
- Generates backend configuration for NetworkManager or systemd-networkd
- YAML configuration stored in `/etc/netplan/`
- Changes require `netplan apply` to take effect
- `netplan try` is the safest way to test changes (auto-reverts after timeout)
- Files processed in lexicographical order (01-, 02-, etc.); later files override earlier ones
- Cloud-init may create 50-cloud-init.yaml; a file with a higher prefix (e.g. 99-) overrides it, a 01- file does not
## Example Task Invocation
```
generate-netplan-config I need Ubuntu 22.04 server with static IP 192.168.1.50/24 on enp0s3, gateway 192.168.1.1, two VLANs (100 and 200 on enp0s3), DNS 8.8.8.8, using networkd renderer
```


@@ -0,0 +1,435 @@
---
description: Review network configurations for errors and best practices
argument-hint: Optional configuration files or paths
---
You are initiating a comprehensive network configuration review using a structured workflow to identify errors, validate best practices, and provide improvement recommendations.
## Workflow Steps
### 1. Gather Configuration Information
If the user provides configuration files or paths, use those directly. Otherwise, ask the user for:
**Configuration Files:**
- File paths to review or paste configuration content directly
- Configuration type (interfaces, netplan, FRR, SONiC, vendor-specific)
- Multiple files if reviewing a complete setup
**Review Scope:**
- Target environment (production, staging, development, lab)
- Criticality level (mission-critical, business-critical, standard)
- Specific concerns or focus areas
- Known issues to investigate
**Context Information:**
- Network architecture type (data center, campus, branch, etc.)
- Scale (number of devices, interfaces, routes)
- Existing vs. new deployment
- Recent changes (if troubleshooting)
### 2. Launch network-architecture-reviewer Agent
Use the Task tool to launch the network-architecture-reviewer agent with a detailed prompt:
```
Review the following network configuration files for errors and best practices:
Configuration Type: [interfaces/netplan/FRR/SONiC/etc.]
Environment: [production/staging/lab]
Criticality: [mission-critical/business-critical/standard]
Configuration Files:
[Paste configuration content or provide file paths]
Please perform a comprehensive review including:
1. Syntax Validation
- Check for configuration syntax errors
- Validate proper formatting and indentation
- Identify typos and common mistakes
2. Logical Validation
- Verify IP addressing scheme (no conflicts, proper subnetting)
- Check routing logic and gateway configurations
- Validate interface relationships
- Confirm VLAN and network segmentation
3. Best Practices Assessment
- Evaluate against industry standards
- Check for recommended practices
- Assess scalability and maintainability
- Review documentation quality
4. Common Pitfalls and Anti-patterns
- Identify single points of failure
- Check for routing loops or suboptimal routing
- Look for security vulnerabilities
- Find performance bottlenecks
5. Recommendations
- Prioritize issues by severity (Critical, High, Medium, Low)
- Provide specific corrective actions
- Suggest improvements and optimizations
- Include relevant documentation references
Focus areas (if specified): [user-specified concerns]
```
### 3. Analyze Review Output
When the agent returns the review, organize findings by:
- **Critical Issues**: Must fix before deployment
- **High Priority**: Should fix soon
- **Medium Priority**: Fix when convenient
- **Low Priority**: Nice-to-have improvements
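If findings are collected as plain text, the ordering above can be applied with standard tools. The sketch below assumes one finding per line in a `SEVERITY|title` format, which is an illustrative convention and not something the reviewer agent is known to emit; the function name `sort_findings` is hypothetical.

```bash
# Hypothetical helper: order findings Critical, High, Medium, Low.
# Reads "SEVERITY|title" lines on stdin; unknown severities sort last.
sort_findings() {
    awk -F'|' '
        BEGIN { rank["Critical"] = 1; rank["High"] = 2; rank["Medium"] = 3; rank["Low"] = 4 }
        {
            r = ($1 in rank) ? rank[$1] : 9
            print r "|" NR "|" $0      # NR keeps the original order within a severity
        }
    ' | sort -t'|' -k1,1n -k2,2n | cut -d'|' -f3-
}
```

Piping the result through `cut -d'|' -f1 | uniq -c` gives the per-severity totals needed for the executive summary.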
### 4. Create Issue Report
Structure the review findings in a clear report format:
**Executive Summary:**
- Overall assessment (Ready/Needs Work/Major Issues)
- Total number of issues by severity
- Top 3-5 critical findings
- Recommendation (Deploy/Fix and Review/Major Rework)
**Detailed Findings:**
For each issue:
```
[SEVERITY] Category: Issue Title
Location: <file>:<line> or <section>
Description:
<Clear description of what's wrong>
Impact:
<Potential consequences if not fixed>
Current Configuration:
<Show problematic configuration snippet>
Recommended Fix:
<Specific corrective action with example>
Reference:
<Link to documentation or best practice guide>
```
### 5. Validate Findings
Review the findings for accuracy:
- Verify each issue is actually a problem in context
- Check that recommendations are appropriate for the environment
- Ensure fixes won't introduce new issues
- Validate that all critical items are caught
### 6. Provide Remediation Plan
Create a prioritized remediation plan:
**Immediate Actions (Critical - Fix Before Deployment):**
1. Issue description and fix
2. Estimated time to fix
3. Testing required
**Short-term Actions (High Priority - Fix Within 1 Week):**
1. Issue description and fix
2. Estimated time to fix
3. Testing required
**Medium-term Actions (Medium Priority - Fix Within 1 Month):**
1. Issue description and fix
2. Estimated time to fix
3. Testing required
**Long-term Improvements (Low Priority - Plan for Future):**
1. Enhancement description
2. Benefits
3. Effort estimate
### 7. Include Validation Commands
Provide commands to validate each fix:
**For Interface Configurations:**
```bash
# Debian/Ubuntu /etc/network/interfaces
sudo ifup --no-act <interface>
sudo ifquery <interface>
# Netplan
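sudo netplan generate      # Parse the YAML and render backend config without applying it
sudo netplan --debug try   # Apply with automatic rollback unless confirmed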
# Verify after applying
ip addr show
ip route show
```
**For FRR Configurations:**
```bash
# Validate configuration syntax
sudo vtysh -C -f /etc/frr/frr.conf
# Check after applying
sudo vtysh -c "show running-config"
sudo vtysh -c "show ip bgp summary"
sudo vtysh -c "show ip ospf neighbor"
```
**For SONiC Configurations:**
```bash
# Validate JSON syntax
python3 -m json.tool config_db.json
jq . config_db.json
# Check after applying
show runningconfiguration all
show interfaces status
show ip bgp summary
```
### 8. Document Best Practices
Include best practices reference for the configuration type:
**General Networking:**
- No conflicting default gateways
- Proper subnet sizing with growth room
- Consistent naming conventions
- Comprehensive documentation
- Version control for configurations
**Routing:**
- Authentication on routing protocols
- Route filtering and summarization
- Appropriate protocol selection for scale
- Redundant paths where needed
- Convergence time considerations
**High Availability:**
- Eliminate single points of failure
- Redundancy protocols (VRRP, LACP, etc.)
- Fast failover mechanisms (BFD)
- Tested failure scenarios
**Security:**
- Management network isolation
- Encrypted management protocols (SSH, not Telnet)
- ACLs for traffic control
- Logging and monitoring configured
- Regular security updates
## Review Categories and Checks
### Configuration Syntax and Correctness
**For /etc/network/interfaces:**
- [ ] Valid syntax and indentation
- [ ] Loopback interface configured
- [ ] Proper use of auto/allow-hotplug
- [ ] Valid IP addresses and CIDR notation
- [ ] No gateway conflicts
- [ ] Required packages documented (vlan, bridge-utils, etc.)
**For Netplan:**
- [ ] Valid YAML syntax (spaces, not tabs)
- [ ] version: 2 specified
- [ ] Correct renderer choice
- [ ] Valid CIDR notation
- [ ] Proper interface naming
- [ ] No conflicting gateways
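The "spaces, not tabs" item in the Netplan checklist is easy to check mechanically before running any Netplan command. The sketch below uses only the Python standard library and looks solely at indentation; it does not parse YAML or validate Netplan keys, and the sample document is invented for illustration.

```python
def find_tab_indented_lines(yaml_text):
    """Return 1-based line numbers whose leading whitespace contains a tab."""
    bad = []
    for number, line in enumerate(yaml_text.splitlines(), start=1):
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            bad.append(number)
    return bad


sample = (
    "network:\n"
    "  version: 2\n"
    "  renderer: networkd\n"
    "  ethernets:\n"
    "\teth0:\n"  # tab-indented on purpose
    "      dhcp4: true\n"
)
print(find_tab_indented_lines(sample))
```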
**For FRR:**
- [ ] Valid daemon configuration
- [ ] Correct routing protocol syntax
- [ ] Proper route-map and prefix-list definitions
- [ ] Valid BGP/OSPF configuration
- [ ] Authentication configured
**For SONiC:**
- [ ] Valid JSON syntax
- [ ] Required sections present (DEVICE_METADATA)
- [ ] Correct interface naming for platform
- [ ] Valid feature configuration
### Network Design Best Practices
**IP Addressing:**
- [ ] No IP address conflicts or overlaps
- [ ] Appropriate subnet sizing
- [ ] RFC1918 private addressing used correctly
- [ ] Loopback addresses configured
- [ ] DHCP vs static appropriate for use case
**Routing:**
- [ ] No routing loops
- [ ] Appropriate routing protocol
- [ ] Route summarization where applicable
- [ ] Route filtering configured
- [ ] Redistribution done correctly
**High Availability:**
- [ ] Redundant paths configured
- [ ] Bonding/teaming properly configured
- [ ] Gateway redundancy (VRRP/HSRP)
- [ ] Fast failover (BFD) configured
- [ ] Acceptable convergence time
**Scalability:**
- [ ] Design supports growth projections
- [ ] Efficient routing protocol usage
- [ ] Proper network segmentation
- [ ] VLAN design is scalable
- [ ] Capacity planning considered
### Performance Considerations
- [ ] MTU configured appropriately (jumbo frames if needed)
- [ ] Link speeds set correctly
- [ ] QoS/traffic shaping configured
- [ ] Buffer sizes appropriate
- [ ] TCP optimization if needed
### Operational Considerations
**Documentation:**
- [ ] Clear comments in configuration
- [ ] Design decisions documented
- [ ] Rollback procedures defined
- [ ] Contact information included
**Maintainability:**
- [ ] Consistent naming conventions
- [ ] Logical organization
- [ ] Modular structure
- [ ] Version control friendly
**Monitoring:**
- [ ] Logging configured
- [ ] SNMP community strings secure
- [ ] Syslog destinations configured
- [ ] Interface descriptions present
## Common Issues and Fixes
### Critical Issues
**Multiple Default Gateways:**
```
Problem: Multiple interfaces with gateway defined
Impact: Unpredictable routing behavior
Fix: Remove gateway from secondary interfaces, use static routes instead
```
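A reviewer can detect this condition in an `/etc/network/interfaces` file with a short script. The sketch below is a simplified reading of the file format: it tracks `iface` stanzas, collects `gateway` lines per address family, and reports families with more than one gateway. It ignores `source` includes and other advanced features, and the sample addresses are documentation ranges chosen for the example.

```python
from collections import defaultdict

# Keywords that begin a new stanza and therefore end the current iface block.
STANZA_STARTERS = ("auto", "source", "source-directory", "mapping")


def gateways_by_family(interfaces_text):
    """Map address family -> list of (interface, gateway) pairs."""
    found = defaultdict(list)
    current = None  # (interface, family) of the stanza being read
    for raw in interfaces_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        words = line.split()
        keyword = words[0]
        if keyword == "iface" and len(words) >= 3:
            current = (words[1], words[2])
        elif keyword in STANZA_STARTERS or keyword.startswith("allow-"):
            current = None
        elif keyword == "gateway" and current and len(words) >= 2:
            iface, family = current
            found[family].append((iface, words[1]))
    return dict(found)


def conflicting_gateways(interfaces_text):
    """Return only the address families that define more than one gateway."""
    return {
        family: pairs
        for family, pairs in gateways_by_family(interfaces_text).items()
        if len(pairs) > 1
    }


SAMPLE = """
auto eth0
iface eth0 inet static
    address 192.0.2.10/24
    gateway 192.0.2.1
iface eth0 inet6 static
    address 2001:db8::10/64
    gateway 2001:db8::1

auto eth1
iface eth1 inet static
    address 198.51.100.10/24
    gateway 198.51.100.1
"""
print(conflicting_gateways(SAMPLE))
```

One IPv4 and one IPv6 gateway on the same host is normal, which is why the check groups by address family before counting.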
**IP Address Conflicts:**
```
Problem: Same IP on multiple interfaces or duplicate IPs in network
Impact: Network connectivity failures
Fix: Use unique IPs, implement IPAM system
```
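Overlapping subnets across interfaces can be found with Python's standard `ipaddress` module. This is a minimal sketch: the interface names and addresses are invented, and a real review would read them from the configuration under test or an IPAM export.

```python
import ipaddress
from itertools import combinations


def find_overlaps(assignments):
    """Return pairs of interface names whose networks overlap.

    `assignments` maps an interface name to an address in CIDR form.
    """
    parsed = {
        name: ipaddress.ip_interface(cidr) for name, cidr in assignments.items()
    }
    overlaps = []
    for (name_a, if_a), (name_b, if_b) in combinations(parsed.items(), 2):
        # Only compare like with like: IPv4 against IPv4, IPv6 against IPv6.
        if if_a.version == if_b.version and if_a.network.overlaps(if_b.network):
            overlaps.append((name_a, name_b))
    return overlaps


assignments = {
    "eth0": "10.0.1.1/24",
    "eth1": "10.0.1.129/25",  # sits inside eth0's /24
    "eth2": "10.0.2.1/24",
}
print(find_overlaps(assignments))
```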
**Routing Loops:**
```
Problem: Routes pointing back to same router
Impact: Packet loops, network outage
Fix: Correct routing table, use route filtering
```
**Invalid Syntax:**
```
Problem: Configuration file won't parse
Impact: Configuration won't apply
Fix: Correct syntax errors, validate before applying
```
### High-Risk Issues
**Missing Route Filtering:**
```
Problem: No prefix lists on BGP sessions
Impact: Route leaks, blackhole traffic
Fix: Implement strict prefix filtering
```
**No Redundancy:**
```
Problem: Single path to critical resources
Impact: Outage if link/device fails
Fix: Add redundant links, use bonding/LACP
```
**Weak Security:**
```
Problem: Telnet, HTTP, SNMPv1/v2c enabled
Impact: Security vulnerabilities
Fix: Use SSH, HTTPS, SNMPv3 only
```
## Best Practices Reference
### Interface Configuration
- Use descriptive interface descriptions
- Configure appropriate MTU for network type
- Enable only required protocols
- Document interface purpose and connections
### Routing Protocols
- **OSPF**: Use area 0 as backbone, proper area design
- **BGP**: Implement route filtering, use prefix lists
- **IS-IS**: Proper NET addressing, level hierarchy
- **All**: Use BFD for fast failure detection
### Network Segmentation
- Separate management from data plane
- Use VLANs for logical separation
- Implement proper inter-VLAN routing
- Document VLAN purposes
## Output Format
The review report should include:
1. **Executive Summary** (1 page)
   - Overall assessment
   - Issue count by severity
   - Key recommendations
   - Deploy/no-deploy decision
2. **Detailed Findings** (as many pages as needed)
   - Each issue with severity, location, description
   - Impact analysis
   - Recommended fixes with examples
3. **Positive Observations**
   - Good practices found
   - Correct implementations
   - Thoughtful design decisions
4. **Remediation Roadmap**
   - Prioritized action items
   - Fix procedures
   - Testing requirements
   - Timeline estimates
5. **Validation Commands**
   - Commands to test each fix
   - Expected outputs
   - Verification procedures
## Notes
- Review is advisory - final decisions rest with network team
- Consider environment context (lab vs production)
- Balance technical perfection with operational reality
- Provide constructive, actionable feedback
- Include references to official documentation
- Test recommendations before applying to production
## Example Task Invocation
```
review-network-config Please review my FRR BGP configuration for a data center leaf switch. Config file is at /etc/frr/frr.conf and I'm seeing some routes not being advertised to spine neighbors. Environment is production, mission-critical.
```


@@ -0,0 +1,475 @@
---
description: Review network security based on NIST standards
argument-hint: Optional network architecture or config files
---
You are initiating a comprehensive network security review using a structured workflow based on NIST (National Institute of Standards and Technology) standards and industry security best practices.
## Workflow Steps
### 1. Gather Security Review Requirements
If the user provides specific information, use that directly. Otherwise, ask the user for:
**Network Information:**
- Network architecture diagrams or descriptions
- Configuration files (FRR, interfaces, netplan, SONiC, firewall configs)
- Network topology (data center, campus, branch, cloud, hybrid)
- Critical assets and data flows
**Environment Context:**
- Criticality level (mission-critical, business-critical, standard)
- Exposure (internet-facing, internal, DMZ, management)
- Industry and compliance requirements (PCI-DSS, HIPAA, SOC 2, FedRAMP)
- Specific NIST frameworks to align with
**Security Concerns:**
- Known vulnerabilities or weaknesses
- Recent security incidents
- Specific threats or attack vectors of concern
- Third-party security audit findings
**Scope:**
- Full infrastructure review or specific components
- Focus areas (access control, encryption, routing, etc.)
- Depth of review (high-level vs deep technical)
### 2. Launch network-security-reviewer Agent
Use the Task tool to launch the network-security-reviewer agent with a comprehensive prompt:
```
Perform a comprehensive NIST-based security review of the following network:
Network Type: [data center/campus/branch/etc.]
Criticality: [mission-critical/business-critical/standard]
Exposure: [internet-facing/internal/DMZ]
Compliance Requirements: [PCI-DSS/HIPAA/SOC 2/NIST CSF/etc.]
Network Architecture and Configurations:
[Provide diagrams, descriptions, and configuration files]
Specific Concerns (if any):
[User-specified security concerns]
Please perform a comprehensive security review based on NIST standards covering:
1. Network Segmentation and Isolation (NIST SP 800-41, 800-125B)
- Trust zone separation
- DMZ implementation
- VLAN/VRF isolation
- Micro-segmentation for critical assets
- East-west traffic control
2. Access Control and Authentication (NIST SP 800-53 AC Family)
- Management access security
- Authentication mechanisms (MFA, SSH keys, TACACS+)
- Authorization and privilege escalation
- Session management
- Role-based access control
3. Encryption and Data Protection (NIST SP 800-52, 800-77, 800-113)
- Management protocol encryption (SSH vs Telnet)
- Routing protocol security
- VPN configuration and cipher suites
- SNMP security (SNMPv3)
- WiFi security (WPA3/WPA2 Enterprise)
4. Routing Security (Industry Best Practices + NIST)
- BGP security (authentication, filtering, RPKI)
- OSPF/IS-IS authentication
- Route filtering and validation
- Prevention of route hijacking
- Bogon filtering
5. Denial of Service Protection (NIST Principles)
- Rate limiting and CoPP
- SYN flood protection
- ICMP rate limiting
- Broadcast storm control
- Resource limits
6. Logging, Monitoring, and Auditing (NIST SP 800-53 AU Family)
- Centralized logging
- Log integrity
- NTP synchronization
- SNMP monitoring security
- NetFlow/sFlow analysis
- Retention policies
7. Interface and Service Security (Attack Surface Reduction)
- Unused interfaces shutdown
- Unnecessary services disabled
- Port security
- DHCP snooping
- Dynamic ARP Inspection
8. Management Plane Security (NIST Principles)
- Out-of-band management
- Management network isolation
- Encrypted protocols only
- Jump host/bastion architecture
- Console security
9. Firmware and Configuration Management (NIST SP 800-53 CM Family)
- Patch management
- Configuration baselines
- Change control
- Vulnerability scanning
- CIS benchmark compliance
10. NIST Compliance Mapping (CSF or SP 800-53)
- Map controls to NIST framework
- Identify compliance gaps
- Document deviations
- Provide remediation roadmap
Provide output in the following format:
- Executive summary with overall risk assessment
- Critical/High/Medium/Low findings with NIST references
- Compliance matrix
- Remediation roadmap with priorities
- Validation commands
```
### 3. Analyze Security Review Output
When the agent returns the review, organize by:
- **Critical Findings**: Immediate security risks requiring urgent action
- **High Findings**: Significant vulnerabilities to fix within 1-3 months
- **Medium Findings**: Security improvements needed
- **Low Findings**: Best practice enhancements
### 4. Create Security Assessment Report
Structure the security findings in a comprehensive report:
**Executive Summary:**
- Overall Security Posture (Critical Risk / High Risk / Medium Risk / Low Risk)
- Number of findings by severity
- Top 5 critical security gaps
- NIST compliance status
- Key recommendations
**Risk Assessment Matrix:**
```
| Finding | Severity | NIST Control | Likelihood | Impact | Risk Score |
|---------|----------|--------------|------------|--------|------------|
| ... | Critical | AC-2 | High | High | 9/10 |
```
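The document does not define how the Risk Score column is calculated. One common approach, assumed here purely for illustration, is to map Likelihood and Impact onto a three-point scale and multiply them, which yields scores from 1 to 9. The level names, weights, and sample findings below are all assumptions.

```python
# Assumed three-point scale; adjust to the organization's risk methodology.
LEVELS = {"Low": 1, "Medium": 2, "High": 3}


def risk_score(likelihood, impact):
    """Return likelihood x impact on the assumed 1-9 scale."""
    return LEVELS[likelihood] * LEVELS[impact]


rows = [
    ("Telnet enabled on management interface", "High", "High"),
    ("No NTP authentication", "Low", "Medium"),
    ("BGP sessions without prefix filters", "Medium", "High"),
]

# Sort so the highest-risk findings appear first in the matrix.
ranked = sorted(rows, key=lambda row: risk_score(row[1], row[2]), reverse=True)
for finding, likelihood, impact in ranked:
    print(f"{risk_score(likelihood, impact)}  {finding}")
```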
**Detailed Findings:**
For each security issue:
```
[SEVERITY] Category: Finding Title
NIST Reference: SP 800-53 XX-Y / CSF Category
Location: <device/config section>
Vulnerability:
<Description of the security issue>
Risk:
<Potential impact and exploitation scenario>
Evidence:
<Configuration excerpt or observation>
Attack Vector:
<How this could be exploited>
Recommendation:
<Specific remediation steps with examples>
NIST Compliance:
<How remediation addresses NIST requirements>
Priority: [Immediate/30 days/90 days]
```
### 5. Map to NIST Cybersecurity Framework
Create CSF compliance matrix:
**Identify (ID):**
- Asset Management (ID.AM)
- Risk Assessment (ID.RA)
- Findings and gaps
**Protect (PR):**
- Access Control (PR.AC)
- Data Security (PR.DS)
- Findings and gaps
**Detect (DE):**
- Anomalies and Events (DE.AE)
- Continuous Monitoring (DE.CM)
- Findings and gaps
**Respond (RS):**
- Response Planning (RS.RP)
- Communications (RS.CO)
- Findings and gaps
**Recover (RC):**
- Recovery Planning (RC.RP)
- Improvements (RC.IM)
- Findings and gaps
### 6. Create Remediation Roadmap
Provide prioritized remediation plan:
**Phase 1: Immediate (0-30 days) - Critical Vulnerabilities**
1. Finding: Unencrypted management protocols
   - Action: Disable Telnet/HTTP, enable SSH/HTTPS only
   - Owner: Network Security Team
   - Effort: 2 days
   - NIST: SC-8, SC-13
2. Finding: No BGP authentication
   - Action: Enable TCP-AO (RFC 5925) on all BGP sessions, falling back to TCP MD5 (RFC 2385) only where TCP-AO is unsupported
   - Owner: Network Engineering
   - Effort: 1 week
   - NIST: SC-8
**Phase 2: Short-term (1-3 months) - High Priority**
[Continue with high priority items]
**Phase 3: Medium-term (3-6 months) - Medium Priority**
[Continue with medium priority items]
**Phase 4: Long-term (6-12 months) - Enhancements**
[Continue with improvements]
### 7. Provide Validation Commands
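Include commands to verify security controls. The examples mix Cisco IOS-style `show` commands with Linux commands; translate them to the target platform's CLI (for example, `vtysh` on FRR):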
**Authentication and Access Control:**
```bash
# Verify SSH only (no Telnet)
show line vty 0 4 | include "transport input"
netstat -tuln | grep ':23 '  # Should show no Telnet listener (trailing space avoids matching :2300 etc.)
# Check AAA configuration
show running-config | section aaa
show tacacs
show radius
# Verify session timeouts
show line vty 0 4 | include "exec-timeout"
```
**Encryption Verification:**
```bash
# Check routing protocol authentication
show running-config | section "router bgp"
show ip ospf interface | include "authentication"
# Verify VPN encryption
show crypto ipsec sa
show crypto isakmp sa
# Check SNMPv3
show snmp user
show running-config | include "snmp-server"
```
**Network Segmentation:**
```bash
# Verify VLAN configuration
show vlan brief
show vlan id 100
# Check ACLs
show ip access-lists
show running-config | include "access-list"
# Verify firewall rules (if applicable)
iptables -L -n -v
nft list ruleset
```
**Logging and Monitoring:**
```bash
# Check syslog configuration
show logging
show running-config | include logging
# Verify NTP
show ntp status
show ntp associations
# Check SNMP (should be v3 only)
show snmp community # Should be empty or not exist
show snmp user
```
**Routing Security:**
```bash
# BGP prefix filtering
show ip bgp neighbors <peer> | include "prefix-list"
show ip prefix-list
# Route authentication
show ip bgp neighbors <peer> | include password
show ip ospf interface | include auth
# Check for bogon filtering
show ip access-lists | include "deny.*10\.0\.0\.0"
```
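The `grep` for `10.0.0.0` above only confirms that one range is mentioned in an ACL. To test whether a given prefix falls inside any commonly filtered range, a reviewer could use a small helper like the sketch below. The prefix list is an illustrative IPv4 subset, not a complete or authoritative bogon list.

```python
import ipaddress

# Illustrative subset of commonly filtered IPv4 ranges; not a complete list.
BOGON_PREFIXES = [
    ipaddress.ip_network(prefix)
    for prefix in (
        "0.0.0.0/8",
        "10.0.0.0/8",
        "100.64.0.0/10",
        "127.0.0.0/8",
        "169.254.0.0/16",
        "172.16.0.0/12",
        "192.0.2.0/24",
        "192.168.0.0/16",
        "198.18.0.0/15",
        "198.51.100.0/24",
        "203.0.113.0/24",
        "224.0.0.0/4",
        "240.0.0.0/4",
    )
]


def is_bogon(prefix):
    """Return True if `prefix` lies entirely inside one of the listed ranges."""
    network = ipaddress.ip_network(prefix)
    return any(
        network.version == bogon.version and network.subnet_of(bogon)
        for bogon in BOGON_PREFIXES
    )


for candidate in ("10.1.0.0/16", "192.168.1.0/24", "8.8.8.0/24"):
    print(candidate, is_bogon(candidate))
```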
### 8. Document Compliance Mapping
Create NIST SP 800-53 compliance matrix:
| Control Family | Control | Status | Evidence | Gaps | Remediation |
|----------------|---------|--------|----------|------|-------------|
| AC (Access Control) | AC-2 Account Management | Partial | TACACS+ configured | No MFA | Implement MFA |
| AC (Access Control) | AC-3 Access Enforcement | Non-Compliant | None | No ACLs | Implement ACLs |
| SC (System and Communications Protection) | SC-8 Transmission Confidentiality and Integrity | Compliant | SSH only | - | - |
| SC (System and Communications Protection) | SC-13 Cryptographic Protection | Partial | Strong ciphers | Weak BGP auth | Migrate to TCP-AO (RFC 5925) |
| AU (Audit and Accountability) | AU-2 Audit Events | Compliant | Syslog configured | - | - |
| AU (Audit and Accountability) | AU-6 Audit Review, Analysis, and Reporting | Partial | Logs collected | No analysis | Implement SIEM |
## Common Security Vulnerabilities
### Critical Security Issues
**Unencrypted Management Protocols:**
```
Vulnerability: Telnet, HTTP, SNMPv1/v2c in use
Risk: Credential theft, man-in-the-middle attacks
NIST: SC-8, SC-13
Fix: Use SSH, HTTPS, SNMPv3 exclusively
```
**Default or Weak Passwords:**
```
Vulnerability: Default credentials or weak passwords
Risk: Unauthorized access, privilege escalation
NIST: IA-5
Fix: Strong password policy, key-based authentication
```
**No Routing Protocol Authentication:**
```
Vulnerability: BGP/OSPF without authentication
Risk: Route hijacking, traffic interception
NIST: SC-8
Fix: Enable protocol authentication (TCP-AO or, at minimum, TCP MD5 for BGP; HMAC-SHA cryptographic authentication for OSPF)
```
**No Network Segmentation:**
```
Vulnerability: Flat network design
Risk: Lateral movement, widespread compromise
NIST: SC-7
Fix: Implement VLANs, firewalls, ACLs
```
### High-Risk Security Issues
**Missing Logging and Monitoring:**
```
Vulnerability: Inadequate logging
Risk: Inability to detect/respond to incidents
NIST: AU-2, AU-6, SI-4
Fix: Centralized syslog, SNMP, NetFlow
```
**No Rate Limiting or DoS Protection:**
```
Vulnerability: No control plane protection
Risk: Denial of service attacks
NIST: SC-5
Fix: Implement CoPP, rate limiting
```
**Weak BGP Security:**
```
Vulnerability: No prefix filtering, no ROV
Risk: Route leaks, prefix hijacking
NIST: SP 800-189 (BGP security guidance) and industry best practices
Fix: Prefix lists, RPKI validation
```
## Security Best Practices
### Defense in Depth
**Layer 1: Physical Security**
- Console port security
- Secure rack access
- Environmental controls
**Layer 2: Network Segmentation**
- VLANs and VRFs
- Firewalls between zones
- DMZ for public services
- Micro-segmentation
**Layer 3: Access Control**
- Strong authentication (MFA)
- Least privilege access
- Role-based access control
- Regular access reviews
**Layer 4: Encryption**
- Management traffic encryption
- VPN for remote access
- Routing protocol authentication
- SNMPv3 only
**Layer 5: Monitoring and Detection**
- Centralized logging
- Anomaly detection
- Traffic analysis
- Security event correlation
**Layer 6: Incident Response**
- Response procedures
- Escalation paths
- Forensics capability
- Recovery procedures
### Security Hardening Checklist
- [ ] Disable Telnet, enable SSH only
- [ ] Disable HTTP, enable HTTPS only
- [ ] Use SNMPv3 with encryption
- [ ] Configure routing protocol authentication
- [ ] Implement BGP prefix filtering
- [ ] Configure control plane protection
- [ ] Enable logging to central server
- [ ] Configure NTP for time sync
- [ ] Shutdown unused interfaces
- [ ] Disable unnecessary services
- [ ] Implement AAA (TACACS+/RADIUS)
- [ ] Configure session timeouts
- [ ] Use strong passwords or keys
- [ ] Implement network segmentation
- [ ] Configure ACLs for traffic control
- [ ] Enable port security on switches
- [ ] Configure DHCP snooping
- [ ] Enable Dynamic ARP Inspection
- [ ] Implement management VLAN/VRF
- [ ] Regular firmware updates
## Notes
- Security is a continuous process, not a one-time check
- Risk tolerance varies by organization - work with security team
- Balance security with operational requirements
- Document all security decisions and accepted risks
- Revalidate security after any network changes
- Keep up with evolving threats and NIST updates
- Test security controls regularly
## Example Task Invocation
```
review-network-security Please review security of our data center network with 10 leaf-spine switches running BGP. We're preparing for SOC 2 audit and need NIST SP 800-53 compliance assessment. Main concerns are BGP security and encryption of management protocols. Configuration files attached.
```

571
commands/sonic-config.md Normal file

@@ -0,0 +1,571 @@
---
description: Generate SONiC NOS configuration files
argument-hint: Optional SONiC requirements
---
You are initiating SONiC (Software for Open Networking in the Cloud) NOS configuration using a structured workflow to create production-ready SONiC configuration files and operational procedures.
## Workflow Steps
### 1. Gather Requirements
If the user provides specific requirements in their message, use those directly. Otherwise, ask the user for:
**Basic Requirements:**
- SONiC version (community or enterprise/vendor-specific)
- Platform/hardware (Broadcom, Mellanox, Intel, etc.)
- Switch role (Leaf, Spine, ToR, Border, etc.)
- Hostname and basic metadata
**Configuration Type Needed:**
- Interface configuration (physical ports, speeds, MTU)
- VLAN configuration
- Port channel/LAG configuration
- BGP routing configuration
- OSPF routing configuration
- ACL configuration
- QoS configuration
- Loopback interfaces
- Static routes
- System management (NTP, syslog, SNMP)
**For Interface Configuration:**
- Interface names (Ethernet0, Ethernet4, etc.)
- Speeds (10G, 25G, 40G, 100G, 400G)
- Admin status (up/down)
- MTU settings (typically 9100 for data centers)
- FEC settings (RS, FC)
**For VLAN Configuration:**
- VLAN IDs and descriptions
- VLAN member ports
- Tagging mode (tagged/untagged)
- VLAN interface IP addresses
**For Port Channel/LAG:**
- Port channel interface names
- Member interfaces
- LACP configuration
- Minimum links
**For BGP Configuration:**
- Local ASN
- BGP neighbors (IP, ASN, descriptions)
- Peer groups
- Route policies and prefix lists
- Address families (IPv4, IPv6, EVPN)
- Authentication
**For ACL Configuration:**
- ACL table names and types (L3, L2, CTRLPLANE)
- ACL rules (priorities, actions, match criteria)
- Port bindings
**For QoS Configuration:**
- DSCP to TC mapping
- TC to queue mapping
- Scheduler policies
- Port QoS profiles
### 2. Launch sonic-engineer Agent
Use the Task tool to launch the sonic-engineer agent with a detailed prompt containing:
```
Generate SONiC configuration for the following requirements:
[Insert gathered requirements here with all details]
Please provide:
1. Complete config_db.json file
2. Equivalent CLI commands for reference
3. Step-by-step deployment procedure
4. Validation commands specific to this configuration
5. Rollback procedure
6. Any platform-specific notes or requirements
7. Prerequisites (SONiC version, required features)
```
### 3. Review Generated Configuration
When the agent returns the configuration, review it for:
- Valid JSON syntax
- Correct SONiC schema structure
- All required sections present (DEVICE_METADATA, etc.)
- Proper interface naming for the platform
- No conflicting configurations
- Complete BGP/routing configuration
- Appropriate security settings
### 4. Validate JSON Syntax
Before deployment, validate the JSON syntax:
```bash
# Validate JSON syntax
python3 -m json.tool config_db.json
# Or use jq
jq . config_db.json
# Check for common issues
jq 'keys' config_db.json # Show top-level keys
```
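The same checks can be scripted in Python for use in a CI job. This is a minimal sketch: it confirms that the file parses and that `DEVICE_METADATA` (the one section this document names as required) is present. It does not validate the full SONiC schema, and the function name is hypothetical.

```python
import json

# Only DEVICE_METADATA is named as required in this document; extend as needed.
REQUIRED_TABLES = ("DEVICE_METADATA",)


def check_config_db(text):
    """Return a list of problems found in a config_db.json document."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value must be a JSON object"]
    return [
        f"missing required table: {table}"
        for table in REQUIRED_TABLES
        if table not in data
    ]


good = '{"DEVICE_METADATA": {"localhost": {"hostname": "leaf1"}}}'
missing = '{"PORT": {}}'
broken = '{"PORT": }'
for label, text in (("good", good), ("missing", missing), ("broken", broken)):
    print(label, check_config_db(text))
```

An empty list means no problems were found by these two checks; any other result should block deployment.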
### 5. Present Deployment Procedure
Ensure the generated configuration includes a safe deployment procedure:
1. **Backup Current Configuration**
```bash
# Persist the running configuration to /etc/sonic/config_db.json
sudo config save -y
# Create a timestamped backup (one timestamp shared by all backup files)
TS=$(date +%Y%m%d_%H%M%S)
sudo cp /etc/sonic/config_db.json "/etc/sonic/config_db.json.backup.${TS}"
# Save current state
show runningconfiguration all > ~/sonic-config-backup-${TS}.txt
show interfaces status >> ~/sonic-config-backup-${TS}.txt
```
2. **Validate New Configuration**
```bash
# Validate JSON syntax
python3 -m json.tool new_config_db.json
# Validate SONiC config format
sonic-cfggen -j new_config_db.json --print-data
# Check for required keys
jq 'has("DEVICE_METADATA")' new_config_db.json
```
3. **Deploy Configuration**
```bash
# Copy new configuration
sudo cp new_config_db.json /etc/sonic/config_db.json
# Set correct permissions
sudo chown root:root /etc/sonic/config_db.json
sudo chmod 644 /etc/sonic/config_db.json
```
4. **Apply Configuration**
```bash
# Choose ONE of the following methods.
# Method 1: Load the file into ConfigDB without restarting services
sudo config load /etc/sonic/config_db.json -y
# Method 2: Full configuration reload (restarts services)
sudo config reload -y
# Method 3: Load, then persist the result
sudo config load /etc/sonic/config_db.json -y && sudo config save -y
```
5. **Verify Configuration**
```bash
# Check interfaces
show interfaces status
# Check IP configuration
show ip interfaces
# Check BGP (if configured)
show ip bgp summary
# Check VLANs (if configured)
show vlan brief
# Check port channels (if configured)
show interfaces portchannel
# Check system status
show system-health summary
```
### 6. Provide Validation Commands
Include comprehensive validation commands for each configuration type:
**Interface Validation:**
```bash
# Show all interface status
show interfaces status
# Show specific interface
show interfaces status Ethernet0
# Show interface counters
show interfaces counters
# Show interface errors
show interfaces counters errors
# Show transceiver information
show interfaces transceiver info
# Show interface description
show interfaces description
```
**VLAN Validation:**
```bash
# Show VLAN configuration
show vlan brief
# Show detailed VLAN config
show vlan config
# Show member ports of VLAN 100
show vlan config | grep -w Vlan100
```
**Port Channel Validation:**
```bash
# Show port channel summary
show interfaces portchannel
# Show LACP state for a port channel (teamd runs in its own container)
sudo docker exec -i teamd teamdctl PortChannel1 state
# Show port channel interface status
show interfaces status PortChannel1
```
**BGP Validation:**
```bash
# Show BGP summary
show ip bgp summary
# Show BGP neighbors
show ip bgp neighbors
# Show BGP routes
show ip bgp network
# Show received routes from neighbor
show ip bgp neighbors 192.168.1.1 received-routes
# Show advertised routes to neighbor
show ip bgp neighbors 192.168.1.1 advertised-routes
# Show BGP configuration
show runningconfiguration bgp
```
**OSPF Validation:**
```bash
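# SONiC runs OSPF inside FRR, so query it through vtysh
# Show OSPF neighbors
sudo vtysh -c "show ip ospf neighbor"
# Show OSPF routes
sudo vtysh -c "show ip ospf route"
# Show OSPF database
sudo vtysh -c "show ip ospf database"
# Show OSPF interfaces
sudo vtysh -c "show ip ospf interface"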
```
**ACL Validation:**
```bash
# Show ACL tables
show acl table
# Show ACL rules
show acl rule
# Show ACL tables and rules as loaded by acl-loader
acl-loader show table
acl-loader show rule
# Show ACL counters
aclshow -a
```
**QoS Validation:**
```bash
# Show QoS maps stored in ConfigDB (DB 4)
redis-cli -n 4 KEYS "DSCP_TO_TC_MAP|*"
redis-cli -n 4 KEYS "TC_TO_QUEUE_MAP|*"
# Show queue counters
show queue counters
# Show priority-group watermarks
show priority-group watermark shared
```
**System Validation:**
```bash
# Show system information
show version
show platform summary
show platform syseeprom
# Show services
show services
# Show system health
show system-health summary
# Show running configuration
show runningconfiguration all
```
### 7. Include Troubleshooting Commands
Provide troubleshooting commands for common issues:
**Configuration Not Applied:**
```bash
# Check config_db.json syntax
python3 -m json.tool /etc/sonic/config_db.json
# Check SONiC services
show services
# Restart specific service
sudo systemctl restart bgp
sudo systemctl restart swss
# Check service logs
sudo journalctl -u bgp -n 100
sudo journalctl -u swss -n 100
# View syslog
show logging
tail -f /var/log/syslog
```
**Interface Issues:**
```bash
# Check interface admin state
show interfaces status Ethernet0
# Check physical link
show interfaces transceiver info Ethernet0
# Check interface errors
show interfaces counters errors Ethernet0
# Clear interface counters
sonic-clear counters
# Check ASIC programming (ASIC_DB is DB 1)
redis-cli -n 1 KEYS "ASIC_STATE:SAI_OBJECT_TYPE_PORT:*" | head
```
**BGP Not Establishing:**
```bash
# Check BGP configuration
show runningconfiguration bgp
# Check BGP neighbors
show ip bgp neighbors 192.168.1.1
# Enable BGP debugging
vtysh -c "debug bgp neighbor-events"
vtysh -c "debug bgp updates"
# Check connectivity to neighbor
ping 192.168.1.1
# Check routing table
show ip route
```
**VLAN Issues:**
```bash
# Check VLAN configuration
show vlan config
# Check VLAN member configuration
redis-cli -n 4 HGETALL "VLAN_MEMBER|Vlan100|Ethernet8"
# Check bridge FDB
show mac
# Check VLAN interface
show ip interfaces | grep Vlan
```
**Database Issues:**
```bash
# Access config database (DB 4)
redis-cli -n 4
# Show all keys
redis-cli -n 4 KEYS "*"
# Show specific configuration
redis-cli -n 4 HGETALL "PORT|Ethernet0"
redis-cli -n 4 HGETALL "DEVICE_METADATA|localhost"
# Check application database (DB 0)
redis-cli -n 0 KEYS "*"
```
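ConfigDB keys such as `VLAN_MEMBER|Vlan100|Ethernet8` join the table name and the entry's key parts with a `|` separator. When post-processing the output of `redis-cli -n 4 KEYS "*"`, it can help to split and group them. The helper names below are hypothetical and the key list is a hand-written sample, not real device output.

```python
from collections import defaultdict


def split_config_key(key):
    """Split 'TABLE|part1|part2' into ('TABLE', ('part1', 'part2'))."""
    table, *parts = key.split("|")
    return table, tuple(parts)


def group_keys_by_table(keys):
    """Group ConfigDB keys by table name, preserving input order."""
    tables = defaultdict(list)
    for key in keys:
        table, parts = split_config_key(key)
        tables[table].append(parts)
    return dict(tables)


sample_keys = [
    "PORT|Ethernet0",
    "DEVICE_METADATA|localhost",
    "VLAN_MEMBER|Vlan100|Ethernet8",
    "PORT|Ethernet4",
]
for table, entries in group_keys_by_table(sample_keys).items():
    print(table, entries)
```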
### 8. Document Rollback Procedure
Ensure the rollback procedure is clearly documented:
```bash
# Method 1: Restore from backup
sudo cp /etc/sonic/config_db.json.backup.YYYYMMDD_HHMMSS /etc/sonic/config_db.json
sudo config reload -y
# Method 2: Load previous working config
sudo config load /etc/sonic/config_db.json.backup.YYYYMMDD_HHMMSS -y
# Method 3: Manual configuration via CLI (temporary)
# Use vtysh for routing protocols
sudo vtysh
# Use config commands for interfaces/VLANs
sudo config interface ip add Ethernet0 192.168.1.1/24
# Method 4: Factory reset (CAUTION)
# sudo config-setup factory
# Verify rollback
show interfaces status
show ip bgp summary
show vlan brief
```
## Best Practices
When generating SONiC configurations:
1. **Configuration Management**
   - Always backup before changes
   - Use version control for config_db.json
   - Test in lab environment first
   - Document all changes
2. **Interface Configuration**
   - Use consistent interface naming
   - Configure appropriate MTU for network (9100 for data centers)
   - Enable FEC where appropriate
   - Add meaningful descriptions
3. **Routing Configuration**
   - Use BGP authentication
   - Implement prefix filtering
   - Configure maximum-prefix limits
   - Use BFD for fast convergence
4. **VLAN Design**
   - Plan VLAN ID scheme
   - Use meaningful VLAN descriptions
   - Separate traffic types appropriately
   - Configure VLAN interfaces for L3
5. **High Availability**
   - Configure redundant uplinks
   - Use port channels for link aggregation
   - Implement BFD for fast failure detection
   - Configure multiple BGP sessions
6. **Security**
   - Implement control plane ACLs
   - Use routing protocol authentication
   - Configure management ACLs
   - Enable logging and monitoring
7. **Operational Excellence**
   - Configure NTP for time synchronization
   - Set up syslog to central server
   - Enable SNMP monitoring
   - Use consistent naming conventions
## Common Scenarios
### Data Center Leaf Switch (BGP Unnumbered)
- Underlay BGP with spine neighbors
- VLAN configuration for server access
- Port channels for server bonding
- Loopback for VTEP
- ACLs for security
### Top-of-Rack (ToR) Switch
- Access port configuration for servers
- Uplinks to spine (port channels)
- VLANs for network segmentation
- Basic BGP or OSPF routing
- QoS policies
### Spine Switch
- High-density 100G/400G interfaces
- BGP configuration for all leaf neighbors
- Route reflection (if used)
- Minimal VLANs (management only)
- BFD for fast convergence
### Border/Edge Switch
- External BGP peering
- Route filtering and policies
- ACLs for security
- NAT configuration (if supported)
- Internet routing table handling
## SONiC Architecture Notes
**Key Components:**
- **Redis Database**: Configuration and state storage
- **Docker Containers**: Modular service architecture
- **SAI**: Switch Abstraction Interface for hardware
- **FRR**: Routing protocol daemon (BGP, OSPF, etc.)
- **Orchestration Agent**: Translates config to ASIC
**Database Structure:**
- **ConfigDB** (DB 4): Configuration data
- **AppDB** (DB 0): Application state
- **StateDB** (DB 6): Operational state
- **ASIC_DB** (DB 1): Hardware programming
**Configuration Methods:**
1. config_db.json (recommended for automation)
2. CLI commands (immediate application)
3. OpenConfig/gNMI (enterprise features)
4. REST API (if available)
## Platform Considerations
**Broadcom-based Switches:**
- Common in enterprise and cloud
- SAI fully supported
- Check BCM shell access if needed
**Mellanox-based Switches:**
- Common in high-performance networks
- Spectrum ASIC series
- Check SX-SDK version
**Barefoot/Intel Tofino:**
- Programmable pipeline
- P4 runtime support
- Check platform-specific features
**Interface Naming:**
- Usually Ethernet0, Ethernet4, etc. (increments of 4 when each front-panel port uses four lanes)
- Check platform documentation for mapping
- Alias field for human-readable names
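The naming pattern above can be generated for planning purposes. This sketch assumes every port uses the same number of lanes, which is not true on all platforms or after port breakout; the platform's own port mapping remains the authority. The function name and defaults are illustrative.

```python
def interface_names(port_count, lanes_per_port=4):
    """Return SONiC-style names assuming a uniform lane count per port."""
    return [f"Ethernet{index * lanes_per_port}" for index in range(port_count)]


print(interface_names(4))
print(interface_names(4, lanes_per_port=8))
```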
## Notes
- SONiC uses JSON-based configuration (config_db.json)
- Configuration stored in Redis database
- Supports both CLI and file-based configuration
- Container-based architecture for modularity
- Uses FRR for routing protocols (BGP, OSPF, IS-IS)
- SAI provides hardware abstraction
- Always validate JSON syntax before deployment
- Test routing changes in maintenance windows
- Monitor ASIC programming after changes
## Example Task Invocation
```
sonic-config I need a data center leaf switch configuration with ASN 65001, two spine BGP neighbors (192.168.1.1 and 192.168.1.2 both AS 65100), VLAN 100 for servers on Ethernet8-Ethernet24, loopback 10.0.0.1/32, and port channel with Ethernet0 and Ethernet4 for uplink
```
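For orientation, the request above might translate into a `config_db.json` fragment along the following lines. This is a hedged sketch, not the agent's actual output: the hostname `leaf1`, the neighbor names, the port channel name `PortChannel1`, the untagged VLAN mode, and the reading of "Ethernet8-Ethernet24" as every fourth port are all assumptions. Table and field names follow the community ConfigDB layout as commonly documented and should be verified against the target SONiC release. The fragment omits the `PORT` table and other sections, so it is not deployable as written.

```python
import json


def build_leaf_fragment():
    """Build an illustrative, partial ConfigDB dictionary for the example request."""
    # Assumes ports are numbered in steps of 4: Ethernet8, 12, 16, 20, 24.
    vlan_ports = [f"Ethernet{number}" for number in range(8, 25, 4)]
    return {
        "DEVICE_METADATA": {
            "localhost": {
                "hostname": "leaf1",  # hypothetical hostname
                "bgp_asn": "65001",
                "type": "LeafRouter",
            }
        },
        "LOOPBACK_INTERFACE": {
            "Loopback0": {},
            "Loopback0|10.0.0.1/32": {},
        },
        "VLAN": {"Vlan100": {"vlanid": "100"}},
        "VLAN_MEMBER": {
            f"Vlan100|{port}": {"tagging_mode": "untagged"} for port in vlan_ports
        },
        "PORTCHANNEL": {"PortChannel1": {"admin_status": "up", "mtu": "9100"}},
        "PORTCHANNEL_MEMBER": {
            "PortChannel1|Ethernet0": {},
            "PortChannel1|Ethernet4": {},
        },
        "BGP_NEIGHBOR": {
            "192.168.1.1": {"asn": "65100", "name": "spine1"},
            "192.168.1.2": {"asn": "65100", "name": "spine2"},
        },
    }


fragment = build_leaf_fragment()
print(json.dumps(fragment, indent=2))
```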