Network Automation Agent 🤖

An AI-powered network automation assistant that uses natural language to manage network devices. Built with LangGraph, Groq (Llama 3.3), and Nornir.

✨ Features

Linear Pipeline Architecture: A deterministic "One-Shot" workflow (Intent → Action → Summary) that eliminates infinite loops and ensures predictable behavior.
Natural Language Interface: Describe intents in plain English (e.g., "Show interfaces on R1" or "Configure VLAN 10 on Switch 2").
Structured Outputs: Uses Pydantic to enforce strict data schemas, ensuring the AI produces clean Markdown summaries and structured JSON every time.
Smart Context Management: Intelligently compresses massive network outputs (like show running-config) to maintain long conversation history without hitting token limits.
Human-in-the-Loop: Critical configuration changes trigger an interrupt, requiring explicit user approval via CLI before execution.
Multi-Vendor Support: Works with Cisco IOS/XE, Arista EOS, Juniper Junos, etc. (via Netmiko/Nornir).
Enhanced Validation & Risk Assessment: Advanced validation layer that checks commands against device inventory and assesses risk of configuration changes before execution.
Comprehensive Monitoring & Observability: Built-in monitoring dashboard, performance metrics, and alerting system with support for email and Slack notifications.
Safety-First Design: Multiple validation layers prevent unauthorized or dangerous operations on network devices.

🏗️ Architecture

The application follows a Linear Pipeline design to ensure safety and reliability in network operations:

graph TD
    Start --> Context[Context Manager]
    Context --> Understand[Understanding Node]

    Understand -->|Chat| End
    Understand -->|Show Command| Execute[Execute Node]
    Understand -->|Config Command| Approval[Approval Node]

    Approval -->|Approved| Execute
    Approval -->|Denied| Response[Response Node]

    Execute --> Response
    Response --> End

Workflow Logic

Message Manager: Compresses old tool outputs to save tokens while keeping the conversation flow intact.
Understanding Node: Analyzes user intent and selects the appropriate tool (show_command or config_command) with enhanced validation.
Approval Node: Intercepts state-changing commands. Pauses for user confirmation with risk assessment.
Execute Node: Runs Nornir tasks against live devices and bundles the raw output.
Response Node: Analyzes the raw execution data and generates a professional Markdown summary using strict Pydantic schemas.

Package Structure

network-automation-agent/
├── agent/                  # AI Logic
│   ├── workflow_manager.py # Linear Graph definition
│   ├── schemas.py          # Pydantic output models
│   ├── prompts.py          # System prompts
│   ├── nodes.py            # All workflow nodes (understanding, execute, approval, response)
│   ├── state.py            # State definitions
│   └── constants.py        # Shared constants
├── core/                   # Infrastructure
│   ├── config.py           # Configuration management
│   ├── nornir_manager.py   # Device connectivity
│   ├── llm_provider.py     # LLM client factory
│   ├── message_manager.py  # Token optimization
│   ├── device_inventory.py # Device validation
│   └── task_executor.py    # Task execution
├── monitoring/             # Monitoring & Observability
│   ├── tracing.py          # LangSmith integration
│   ├── callbacks.py        # Monitoring callbacks
│   ├── dashboard.py        # Dashboard functionality
│   └── alerting.py         # Alert management system
├── tools/                  # Capabilities
│   ├── show_tool.py        # Read-only commands
│   ├── config_tool.py      # Config changes
│   ├── registry.py         # Tool registry
│   └── validators.py       # Input validation
├── cli/                    # User Interface
│   ├── application.py      # Main application logic
│   ├── orchestrator.py     # Workflow orchestration
│   └── bootstrapper.py     # Dependency initialization
├── ui/                     # Presentation
│   └── console_ui.py       # Rich-based terminal UI
├── utils/                  # Utilities
│   ├── logger.py           # Logging utilities
│   └── responses.py        # Response helpers
├── main.py                 # Application entry point
├── hosts.yaml              # Device inventory
├── groups.yaml             # Device groups
├── config.yaml             # Application configuration
├── pyproject.toml          # Project dependencies
└── uv.lock                 # Dependency lock file

📊 Monitoring & Observability

The Network Automation Agent includes a comprehensive monitoring and observability system that provides real-time insights into workflow performance, tool execution, and system health.

Monitoring Features

Real-time Dashboard: Text-based dashboard showing system health, performance metrics, and recent sessions
Performance Metrics: Track tool execution times, LLM response times, and success rates
Alerting System: Configurable alerts for slow performance, errors, and failures with email and Slack notifications
Session Tracking: Monitor individual workflow sessions with detailed execution statistics
LangSmith Integration: Optional integration for advanced tracing and analytics

Monitoring Dashboard

To view the monitoring dashboard:

uv run python main.py --monitor

The dashboard displays:

System status and uptime
Performance metrics with health indicators
Recent session history
Alert summary and recent alerts

Alert Configuration

The system supports multiple alert types and severity levels:

Alert Types: PERFORMANCE, ERROR, FAILURE, TIMEOUT, SECURITY
Severity Levels: LOW, MEDIUM, HIGH, CRITICAL
Notification Channels: Email, Slack, console logging

For advanced alerting configuration, you can set up email and Slack notifications in the monitoring configuration.

🚀 Quick Start

Prerequisites

Python 3.12+
uv package manager (recommended) or pip
Network devices with SSH access
Groq API key

Installation

Clone the repository:

git clone <repository-url>
cd network-automation-agent

Install dependencies with uv:

# Install uv if you don't have it
pip install uv

# Sync project dependencies
uv sync

Configure Environment:

# Copy the example environment file
cp .env.example .env
# Edit .env and add your Groq API key: GROQ_API_KEY=your_key_here

Configure Device Inventory: Edit hosts.yaml and groups.yaml to match your network environment:

# hosts.yaml - Define your network devices
r1:
  hostname: 192.168.1.1
  groups: [cisco]
s1:
  hostname: 192.168.1.2
  groups: [arista]

# groups.yaml - Define device groups and credentials
cisco:
  platform: cisco_ios
  username: admin
  password: secure_password
arista:
  platform: arista_eos
  username: admin
  password: secure_password

Usage

Interactive Chat Mode (Recommended):

uv run python main.py --chat

Single Command Mode:

uv run python main.py "show ip interface brief on R1"

Specify Target Device:

uv run python main.py --device R1 "show version"

Debug Mode:

uv run python main.py --chat --debug

🔧 Configuration

Application Configuration (config.yaml)

The application uses a Nornir-based configuration that supports:

Inventory management (host and group files)
Parallel execution settings (num_workers)
Connection timeouts and options
Logging configuration

Key settings that can be overridden via environment variables:

NUM_WORKERS: Number of parallel workers (default: 20)
NETMIKO_TIMEOUT: Command timeout in seconds (default: 30)
NETMIKO_CONN_TIMEOUT: Connection timeout in seconds (default: 10)
NETMIKO_SESSION_TIMEOUT: Session timeout in seconds (default: 60)

Environment Variables

Required:

GROQ_API_KEY: API key for Groq cloud service

Optional:

NUM_WORKERS: Number of concurrent connections to devices
NETMIKO_TIMEOUT: Command execution timeout
NETMIKO_CONN_TIMEOUT: Device connection timeout
NETMIKO_SESSION_TIMEOUT: Session timeout

🛡️ Safety & Validation

Multi-Layer Validation System

The agent implements multiple layers of safety:

Device Inventory Validation: Ensures target devices exist before execution
Command Validation: Validates command syntax and safety
Risk Assessment: Evaluates configuration commands for potential risks
Human Approval: Critical changes require explicit user confirmation

Command Types

Show Commands: Execute directly after validation
Config Commands: Require explicit user approval with risk assessment

🧪 Testing

Run the test suite to ensure everything works correctly:

# Run all tests
uv run pytest

# Run specific test file
uv run pytest tests/unit/test_core/test_config.py

# Run with verbose output
uv run pytest -v

# Run integration tests
uv run pytest tests/integration/

🤝 Contributing

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Make your changes
Add tests if applicable
Ensure tests pass: uv run pytest
Commit your changes with descriptive messages
Push to the branch: git push origin feature/amazing-feature
Open a pull request

Development Commands

Install dependencies: uv sync
Add new dependency: uv add package_name
Update dependencies: uv sync --refresh
Run tests: uv run pytest
Run with debug: uv run python main.py --chat --debug

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🚨 Disclaimer

This tool is designed for managing network infrastructure. Use responsibly and ensure you have proper authorization before connecting to any network devices. The authors are not responsible for any damage caused by misuse of this tool.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Network Automation Agent 🤖

✨ Features

🏗️ Architecture

Workflow Logic

Package Structure

📊 Monitoring & Observability

Monitoring Features

Monitoring Dashboard

Alert Configuration

🚀 Quick Start

Prerequisites

Installation

Usage

🔧 Configuration

Application Configuration (config.yaml)

Environment Variables

🛡️ Safety & Validation

Multi-Layer Validation System

Command Types

🧪 Testing

🤝 Contributing

Development Commands

📄 License

🚨 Disclaimer

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 181 Commits
agent		agent
cli		cli
core		core
docs		docs
monitoring		monitoring
net_lab		net_lab
tests		tests
tools		tools
ui		ui
utils		utils
.gitignore		.gitignore
.python-version		.python-version
CLAUDE.md		CLAUDE.md
LICENSE		LICENSE
PROGRESS_TRACKER.md		PROGRESS_TRACKER.md
README.md		README.md
config.yaml		config.yaml
main.py		main.py
plan.md		plan.md
pyproject.toml		pyproject.toml
uv.lock		uv.lock

Folders and files

Latest commit

History

Repository files navigation

Network Automation Agent 🤖

✨ Features

🏗️ Architecture

Workflow Logic

Package Structure

📊 Monitoring & Observability

Monitoring Features

Monitoring Dashboard

Alert Configuration

🚀 Quick Start

Prerequisites

Installation

Usage

🔧 Configuration

Application Configuration (config.yaml)

Environment Variables

🛡️ Safety & Validation

Multi-Layer Validation System

Command Types

🧪 Testing

🤝 Contributing

Development Commands

📄 License

🚨 Disclaimer

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages