1. Introduction to NetDevOps and Infrastructure as Code
Welcome to “NetDevOps for Network Engineers,” a guide designed to equip you with the essential knowledge and practical skills to transform network operations using modern automation principles. In today’s rapidly evolving digital landscape, manual network configuration and troubleshooting are no longer sustainable. The demand for agility, reliability, and scalability necessitates a fundamental shift in how networks are designed, deployed, and managed. This is where NetDevOps and Infrastructure as Code (IaC) come into play.
This inaugural chapter serves as your gateway into the world of NetDevOps and IaC. We will lay the foundational understanding necessary to grasp the subsequent technical deep dives. We’ll explore the core philosophies, key components, and the compelling reasons why these methodologies are becoming indispensable for every forward-thinking network engineer.
What this chapter covers:
- The genesis and core principles of NetDevOps.
- The definition and benefits of Infrastructure as Code in a networking context.
- An overview of modern programmable network interfaces like NETCONF, RESTCONF, gRPC, and their role with YANG data models.
- Practical, albeit introductory, examples of how IaC and automation manifest across multi-vendor network devices.
- Initial considerations for security, troubleshooting, and performance in an automated environment.
Why it’s important:
Embracing NetDevOps and IaC is not just about learning new tools; it’s about adopting a new mindset that fosters collaboration, accelerates deployment cycles, minimizes human error, and ensures consistency across your network infrastructure. As network automation matures, engineers who master these concepts will lead the charge in building more resilient, efficient, and scalable networks.
What you’ll be able to do after this chapter:
- Clearly articulate the concepts of NetDevOps and Infrastructure as Code.
- Understand the fundamental shift from traditional CLI management to API-driven automation.
- Recognize the value of programmable interfaces (NETCONF, RESTCONF, gRPC) and YANG for multi-vendor automation.
- Appreciate the importance of version control, continuous integration, and continuous delivery in networking.
- Be prepared to delve into the practical applications of Ansible and Python for network automation in subsequent chapters.
2. Technical Concepts
2.1. The Essence of NetDevOps
NetDevOps is the application of DevOps principles—culture, automation, lean, measurement, and sharing—to network operations. It bridges the gap between traditional networking practices and modern software development methodologies, aiming to bring agility, efficiency, and reliability to network infrastructure management.
Core Tenets of NetDevOps:
- Automation: Automating repetitive and error-prone tasks, from configuration deployment to monitoring and troubleshooting.
- Collaboration: Fostering closer working relationships between network, development, and operations teams to break down silos.
- Continuous Integration (CI) & Continuous Delivery (CD): Implementing pipelines to test network changes automatically before deployment and deploying validated changes efficiently.
- Version Control: Managing network configurations and automation scripts in a version control system (e.g., Git) for traceability, collaboration, and rollback capabilities.
- Monitoring & Observability: Leveraging telemetry and advanced monitoring tools to gain real-time insights into network state and performance.
Why NetDevOps?
The demands on modern networks keep growing. Applications require faster provisioning, higher availability, and greater agility. Traditional CLI-driven, manual processes are slow, prone to human error, and cannot keep pace. NetDevOps provides a framework to meet these challenges, enabling network engineers to:
- Reduce operational costs and time.
- Increase network stability and reliability.
- Accelerate service delivery.
- Promote innovation and experimentation.
Here’s a high-level architectural view of a NetDevOps pipeline:
@startuml
skinparam handwritten true
skinparam style strict
cloud "Network Team" as NetTeam
cloud "DevOps Team" as DevOpsTeam
package "Version Control System (Git)" as VCS {
rectangle "Configuration Files" as Configs
rectangle "Automation Scripts (Ansible/Python)" as Scripts
rectangle "IaC Templates (Terraform)" as IaC
}
rectangle "Automation Engine (Ansible/Python/Terraform)" as Engine
rectangle "CI/CD Pipeline (Jenkins/GitLab CI/Argo CD)" as CICD
package "Network Devices" as Devices {
component "Cisco IOS-XE" as Cisco
component "Juniper JunOS" as Juniper
component "Arista EOS" as Arista
component "Other Vendors" as Other
}
rectangle "Monitoring & Telemetry" as Monitoring
rectangle "Alerting & Logging" as Alerting
NetTeam --> VCS : Commits network code
DevOpsTeam --> VCS : Collaborates on pipeline
VCS --> CICD : Trigger pipeline on push
CICD --> Engine : Executes automation
Engine --> Cisco : (NETCONF/RESTCONF/gRPC)
Engine --> Juniper : (NETCONF/RESTCONF)
Engine --> Arista : (eAPI/OpenConfig)
Engine --> Other : (Vendor API)
Devices --> Monitoring : Telemetry data
Monitoring --> Alerting : Anomalies & events
Alerting --> NetTeam : Notifications
Alerting --> DevOpsTeam : Notifications
CICD -right-> Monitoring : Deploy Monitoring Agents
note right of Engine
Leverages APIs (NETCONF, RESTCONF, gRPC, eAPI)
and CLI for automation
end note
@enduml
2.2. Infrastructure as Code (IaC) for Networks
Infrastructure as Code (IaC) is the practice of managing and provisioning infrastructure through machine-readable definition files, rather than physical hardware configuration or interactive configuration tools. For networks, IaC means defining the desired state of your network devices, services, and topologies using code.
Key Principles of Network IaC:
- Declarative Approach: Instead of specifying how to achieve a state (imperative), IaC describes what the desired final state should be. The automation tool then figures out the steps.
  - Example (Imperative - CLI): configure terminal, interface GigabitEthernet1/0/1, description "Uplink to Core", no shutdown, exit, end, write memory.
  - Example (Declarative - IaC): A YAML file specifying interface: GigabitEthernet1/0/1, description: "Uplink to Core", state: up.
- Idempotence: Applying the same IaC script multiple times yields the same result. The script intelligently determines if changes are needed and only applies them if the current state deviates from the desired state. This is crucial for consistency and safety.
- Version Control: All network configuration code is stored in a version control system (like Git), providing a complete history of changes, accountability, and the ability to roll back to previous states.
- Testability: Because network configurations are code, they can be tested like software code (unit tests, integration tests, end-to-end tests) before deployment, significantly reducing errors.
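The declarative and idempotent principles above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular tool is implemented: the function names plan_changes and apply_desired_state are hypothetical, and the "device" is just a dictionary standing in for real device state.

```python
from typing import Dict, List, Tuple


def plan_changes(desired: Dict[str, str], current: Dict[str, str]) -> List[str]:
    """Return a description of every setting that differs from the desired state."""
    return [
        f"{key}: {current.get(key)!r} -> {value!r}"
        for key, value in desired.items()
        if current.get(key) != value
    ]


def apply_desired_state(
    desired: Dict[str, str], current: Dict[str, str]
) -> Tuple[Dict[str, str], List[str]]:
    """Apply only the differences and report what was changed."""
    changes = plan_changes(desired, current)
    new_state = {**current, **desired}
    return new_state, changes


# Desired state, mirroring the YAML example in the list above
desired = {"description": "Uplink to Core", "state": "up"}
# Hypothetical starting state of the device
device = {"description": "", "state": "down"}

device, first_run = apply_desired_state(desired, device)
device, second_run = apply_desired_state(desired, device)

print(first_run)   # two changes on the first run
print(second_run)  # nothing left to change on the second run
```

Running the same desired state twice produces changes only the first time, which is the property that makes repeated pipeline runs safe.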
Benefits of Network IaC:
- Consistency: Eliminates configuration drift and ensures all devices conform to defined standards.
- Speed & Agility: Rapid deployment and modification of network services.
- Reduced Human Error: Automation minimizes manual mistakes.
- Traceability & Auditability: Full history of changes and who made them.
- Disaster Recovery: Ability to quickly rebuild network infrastructure from code.
- Collaboration: Enables multiple engineers to work on network changes simultaneously.
Here’s how an IaC workflow typically looks:
digraph IaC_Workflow {
rankdir=LR;
node [shape=box, style=filled, fillcolor=lightblue];
"Code Repository (Git)" -> "CI/CD Pipeline" [label="Push Network Code"];
"CI/CD Pipeline" -> "Testing & Validation" [label="Run Tests"];
"Testing & Validation" -> "Automation Engine" [label="Approved Changes"];
"Automation Engine" -> "Network Infrastructure" [label="Apply Configuration"];
"Network Infrastructure" -> "Monitoring & Observability" [label="Emit Telemetry"];
"Monitoring & Observability" -> "Code Repository (Git)" [label="Detect Drift/Report State"];
subgraph cluster_legend {
label="Legend";
color=grey;
node [shape=plaintext, fillcolor=white];
manual [label="Manual Interaction"];
automated [label="Automated Process"];
}
}
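The "Testing & Validation" stage in the workflow above can start very small. The sketch below checks a desired-state record before it ever reaches a device. The field names (name, address, state) and the function validate_interface are assumptions chosen for illustration, not a standard schema.

```python
import ipaddress
from typing import List


def validate_interface(intent: dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means the intent passed."""
    errors = []
    if not intent.get("name"):
        errors.append("name is required")
    try:
        # Accepts values such as "10.0.0.1/32"; raises ValueError on malformed input
        ipaddress.ip_interface(intent.get("address", ""))
    except ValueError:
        errors.append(f"invalid address: {intent.get('address')!r}")
    if intent.get("state") not in ("up", "down"):
        errors.append(f"state must be 'up' or 'down', got {intent.get('state')!r}")
    return errors


good = {"name": "Loopback0", "address": "10.0.0.1/32", "state": "up"}
bad = {"name": "", "address": "10.0.0.300/32", "state": "sideways"}

print(validate_interface(good))
print(validate_interface(bad))
```

A CI job could run checks like this on every push and block the pipeline when the returned list is not empty.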
2.3. Programmable Interfaces: NETCONF, RESTCONF, gRPC, and YANG
The shift to NetDevOps and IaC is largely powered by the evolution of network device management interfaces beyond the traditional Command Line Interface (CLI). These modern interfaces provide structured, programmatic access to network devices, enabling true automation and multi-vendor interoperability.
2.3.1. YANG Data Models
At the heart of modern programmable interfaces is YANG (Yet Another Next Generation). YANG is a data modeling language used to model configuration and state data, notifications, and RPCs for network devices. It provides a standardized, vendor-agnostic way to describe network elements and their attributes.
- RFC Reference: RFC 7950 (The YANG 1.1 Data Modeling Language) and RFC 6020 (YANG - A Data Modeling Language for the Network Configuration Protocol (NETCONF)).
- Benefits:
- Standardization: Enables interoperability across different vendors.
- Structure: Provides a well-defined schema, making automation code more robust.
- Validation: Tools can validate configuration against the YANG model before pushing to the device, preventing errors.
- Abstraction: Abstracts away vendor-specific CLI syntax.
Example: A simplified YANG module for an interface
module example-interface {
  yang-version 1.1;
  namespace "urn:example:interface";
  prefix "ex-if";

  // Needed for the inet:ipv4-address type used below
  import ietf-inet-types {
    prefix inet;
  }

  organization "Example Corp";
  contact "[email protected]";
  description "A simple YANG model for network interfaces.";

  revision 2026-01-24 {
    description "Initial revision.";
    reference "RFC 0000";
  }

  container interfaces {
    description "Top-level container for interfaces.";
    list interface {
      key "name";
      description "A list of network interfaces.";
      leaf name {
        type string;
        description "Interface name (e.g., 'GigabitEthernet1/0/1').";
      }
      leaf description {
        type string;
        description "User-defined interface description.";
      }
      leaf enabled {
        type boolean;
        default "true";
        description "Administratively enabled/disabled status.";
      }
      container ipv4 {
        description "IPv4 configuration.";
        leaf address {
          type inet:ipv4-address;
          description "Primary IPv4 address.";
        }
        leaf netmask {
          type inet:ipv4-address;
          description "IPv4 netmask.";
        }
      }
    }
  }
}
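To make the link between a YANG model and the data on the wire concrete, the following sketch builds XML instance data shaped like the example-interface module above, using only the Python standard library. It is an illustration under assumptions: build_interfaces is a hypothetical helper, and it enforces just one rule from the model (the list key name must be present) rather than performing full YANG validation.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the example-interface module
NS = "urn:example:interface"


def build_interfaces(entries):
    """Build an <interfaces> element that follows the example module's structure."""
    root = ET.Element(f"{{{NS}}}interfaces")
    for entry in entries:
        if "name" not in entry:
            raise ValueError("list key 'name' is required")
        iface = ET.SubElement(root, f"{{{NS}}}interface")
        # The list key is emitted first
        ET.SubElement(iface, f"{{{NS}}}name").text = entry["name"]
        if "description" in entry:
            ET.SubElement(iface, f"{{{NS}}}description").text = entry["description"]
        # The model's default for 'enabled' is true
        enabled = entry.get("enabled", True)
        ET.SubElement(iface, f"{{{NS}}}enabled").text = str(enabled).lower()
    return root


# Serialize with the module namespace as the default namespace
ET.register_namespace("", NS)
xml_text = ET.tostring(
    build_interfaces([{"name": "Loopback0", "description": "Demo"}]),
    encoding="unicode",
)
print(xml_text)
```

In practice a YANG-aware library or the device itself would validate the payload; the point here is only that the model dictates element names, nesting, and keys.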
2.3.2. NETCONF (Network Configuration Protocol)
NETCONF is an XML-based protocol designed for managing network devices. It uses Remote Procedure Calls (RPCs) to perform configuration operations and retrieve state information.
- RFC Reference: RFC 6241 (Network Configuration Protocol (NETCONF)).
- Key Features:
- Transactional: Supports atomic commits and rollback capabilities.
- Structured Data: Relies heavily on YANG data models, exchanging data in XML format.
- Secure Transport: Typically runs over SSH (TCP port 830).
- Capabilities Exchange: Devices advertise their supported YANG models and capabilities.
A conceptual packet/data structure for a NETCONF RPC:
packetdiag {
colwidth = 32
0-31: SSH Header
32-63: XML PDU Length
64-95: <rpc> Element (NETCONF Op)
96-127: <edit-config> / <get> / <get-config> (Operation type)
128-159: <target> / <source> (e.g., <running/>, <candidate/>)
160-255: <config> / <filter> (YANG-modeled data in XML)
256-287: </rpc> (End RPC)
}
Note: This is a highly simplified conceptual representation. The actual NETCONF XML payload and its transport over SSH are more complex.
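As a rough illustration of the layers in the diagram above, this sketch builds a get-config RPC as XML and appends the end-of-message marker that NETCONF 1.0 uses over SSH (NETCONF 1.1 sessions use chunked framing instead). The helper name build_get_config is hypothetical, and a real client library would also handle the SSH session and the hello/capabilities exchange.

```python
import xml.etree.ElementTree as ET

NETCONF_NS = "urn:ietf:params:xml:ns:netconf:base:1.0"
# End-of-message delimiter for NETCONF 1.0 framing over SSH
EOM = "]]>]]>"


def build_get_config(message_id: int, source: str = "running") -> str:
    """Return a framed <get-config> RPC for the given datastore."""
    rpc = ET.Element(f"{{{NETCONF_NS}}}rpc", {"message-id": str(message_id)})
    get_config = ET.SubElement(rpc, f"{{{NETCONF_NS}}}get-config")
    src = ET.SubElement(get_config, f"{{{NETCONF_NS}}}source")
    # The datastore is expressed as an empty element, e.g. <running/>
    ET.SubElement(src, f"{{{NETCONF_NS}}}{source}")
    return ET.tostring(rpc, encoding="unicode") + EOM


print(build_get_config(101))
```

The serializer may choose a namespace prefix such as ns0; that is still valid XML and carries the same meaning as the unprefixed form.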
2.3.3. RESTCONF (RESTful Network Configuration Protocol)
RESTCONF is an HTTP-based protocol that provides a RESTful interface for interacting with network devices using YANG data models. It maps YANG data models to a hierarchical URI structure.
- RFC Reference: RFC 8040 (RESTCONF Protocol).
- Key Features:
- HTTP Methods: Uses standard HTTP methods (GET, POST, PUT, PATCH, DELETE) for operations.
- Data Formats: Supports JSON and XML data payloads.
- Ease of Use: Leverages familiar web technologies, making it accessible to web developers.
- Stateful vs. Stateless: RESTful by design, favoring stateless interactions.
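The mapping from a YANG list entry to a RESTCONF URI can be sketched as follows. The host address and the /restconf API root are assumptions (the root is discoverable per RFC 8040 and may differ on a given device), and restconf_interface_url is a hypothetical helper. The important detail is that list keys containing a slash must be percent-encoded.

```python
from urllib.parse import quote

# Media type defined by RFC 8040 for JSON-encoded YANG data
HEADERS = {
    "Accept": "application/yang-data+json",
    "Content-Type": "application/yang-data+json",
}


def restconf_interface_url(host: str, name: str) -> str:
    """Build the data resource URL for one entry of ietf-interfaces:interfaces/interface."""
    # safe="" forces "/" in the key to be encoded as %2F
    key = quote(name, safe="")
    return f"https://{host}/restconf/data/ietf-interfaces:interfaces/interface={key}"


print(restconf_interface_url("192.0.2.10", "Loopback0"))
print(restconf_interface_url("192.0.2.10", "GigabitEthernet1/0/1"))
```

A GET on such a URL reads the entry, while PUT, PATCH, and DELETE replace, merge into, or remove it.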
2.3.4. gRPC (gRPC Remote Procedure Calls)
gRPC is a high-performance, open-source RPC framework, originally developed at Google, that runs over HTTP/2. It uses Protocol Buffers (Protobuf) for defining services and messages, offering a language-agnostic way to build APIs. In networking, it’s often used with YANG models via gNMI (gRPC Network Management Interface).
- RFC Reference: gRPC itself is not an RFC standard, but it is widely adopted across the industry. gNMI is a community-driven specification maintained by the OpenConfig project.
- Key Features:
- High Performance: Built on HTTP/2, leveraging multiplexing and header compression.
- Strongly Typed: Protocol Buffers provide strict schema definition, preventing data type errors.
- Bi-directional Streaming: Supports efficient streaming of telemetry data.
- Language Agnostic: Tools can generate client/server code in multiple programming languages.
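gNMI addresses YANG-modeled data with paths made of named elements and optional keys. The sketch below parses the common string form of such a path into that structure. It does not use any gRPC library, parse_gnmi_path is a hypothetical helper, and the dictionaries only mirror the name/key shape of a gNMI path element rather than the generated Protobuf classes.

```python
import re
from typing import Dict, List

# One path element: a name followed by zero or more [key=value] selectors
_ELEM = re.compile(r"([^/\[\]]+)((?:\[[^\]]*\])*)")
_KEY = re.compile(r"\[([^=\]]+)=([^\]]*)\]")


def parse_gnmi_path(path: str) -> List[Dict[str, object]]:
    """Split a path such as /a/b[name=x]/c into elements with their keys."""
    elems = []
    for name, keys in _ELEM.findall(path):
        elems.append({"name": name, "key": dict(_KEY.findall(keys))})
    return elems


for elem in parse_gnmi_path(
    "/interfaces/interface[name=GigabitEthernet1/0/1]/config/description"
):
    print(elem)
```

Because key values are matched inside the brackets, an interface name that contains slashes stays in one piece instead of being split into extra path elements.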
Evolution of Network Management Interfaces:
digraph G {
rankdir=LR;
node [shape=box, style=filled, fillcolor=lightyellow];
edge [fontname="Helvetica,Arial,sans-serif", fontsize=10];
"Manual CLI" [label="Manual CLI\n(Error-prone, Slow)"];
"SNMP" [label="SNMP\n(Monitoring, Limited Config)"];
"Proprietary APIs" [label="Proprietary APIs\n(Vendor Lock-in)"];
"NETCONF" [label="NETCONF\n(Transactional, Structured XML)"];
"RESTCONF" [label="RESTCONF\n(HTTP/JSON/XML, Web-friendly)"];
"gRPC/gNMI" [label="gRPC/gNMI\n(High-perf, Streaming Telemetry)"];
"Manual CLI" -> "SNMP" [label="Next Step"];
"SNMP" -> "Proprietary APIs" [label="Limited Automation"];
"Proprietary APIs" -> "NETCONF" [label="Standardization Need"];
"NETCONF" -> "RESTCONF" [label="RESTful Approach"];
"RESTCONF" -> "gRPC/gNMI" [label="High Performance, Telemetry"];
"YANG Data Models" [shape=cylinder, style=filled, fillcolor=lightgreen];
"YANG Data Models" -> "NETCONF" [label="Drives"];
"YANG Data Models" -> "RESTCONF" [label="Drives"];
"YANG Data Models" -> "gRPC/gNMI" [label="Drives"];
{rank = same; "NETCONF"; "RESTCONF"; "gRPC/gNMI";}
}
3. Configuration Examples (Illustrative IaC)
For this introductory chapter, we’ll demonstrate a simple, idempotent configuration task across multiple vendors: configuring a loopback interface with a description. This highlights how the desired state is expressed.
Security Warning: Never embed sensitive credentials directly into configuration files or scripts destined for version control. Use secure credential management systems (e.g., Ansible Vault, environment variables, secret managers). The examples below omit direct credentials for brevity, assuming secure handling.
3.1. Cisco IOS-XE/NX-OS Configuration
Here’s how you might configure a Loopback0 interface with a description, first with the traditional CLI and then, conceptually, as a NETCONF XML payload.
! Configure Loopback0 with a description and IP address
configure terminal
interface Loopback0
description Automated_Loopback_Interface_IaC_Demo
ip address 10.0.0.1 255.255.255.255
no shutdown
end
write memory
! Verification commands
show ip interface brief Loopback0
show running-config interface Loopback0
Conceptual NETCONF XML for Desired State:
This XML represents the desired state for a specific interface based on a YANG model. An automation tool would construct and send this via NETCONF.
<rpc message-id="101" xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
  <edit-config>
    <target>
      <running/>
    </target>
    <config>
      <interfaces xmlns="urn:ietf:params:xml:ns:yang:ietf-interfaces">
        <interface>
          <name>Loopback0</name>
          <!-- ietf-interfaces requires a type when an interface is created -->
          <type xmlns:ianaift="urn:ietf:params:xml:ns:yang:iana-if-type">ianaift:softwareLoopback</type>
          <description>Automated_Loopback_Interface_IaC_Demo</description>
          <enabled>true</enabled>
          <ipv4 xmlns="urn:ietf:params:xml:ns:yang:ietf-ip">
            <address>
              <ip>10.0.0.1</ip>
              <netmask>255.255.255.255</netmask>
            </address>
          </ipv4>
        </interface>
      </interfaces>
    </config>
  </edit-config>
</rpc>
3.2. Juniper JunOS Configuration
For Juniper, we’ll configure a loopback interface. JunOS also uses XML internally, making it very suitable for NETCONF.
# Configure Loopback0 with a description and IP address
configure
set interfaces lo0 unit 0 description "Automated_Loopback_Interface_IaC_Demo"
set interfaces lo0 unit 0 family inet address 10.0.0.1/32
commit and-quit
# Verification commands
show interfaces lo0 terse
show configuration interfaces lo0
Conceptual NETCONF XML for Desired State:
<rpc message-id="102" xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
  <edit-config>
    <target><candidate/></target>
    <default-operation>merge</default-operation>
    <config>
      <configuration>
        <interfaces>
          <interface>
            <name>lo0</name>
            <unit>
              <name>0</name>
              <description>Automated_Loopback_Interface_IaC_Demo</description>
              <family>
                <inet>
                  <address>
                    <name>10.0.0.1/32</name>
                  </address>
                </inet>
              </family>
            </unit>
          </interface>
        </interfaces>
      </configuration>
    </config>
  </edit-config>
</rpc>
<rpc message-id="103" xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
  <commit/>
</rpc>
3.3. Arista EOS Configuration
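Arista EOS is Linux-based and offers a robust eAPI (Extended API) that accepts JSON-RPC 2.0 requests over HTTP or HTTPS and uses JSON for data exchange. It wraps CLI commands in a structured request rather than exposing a RESTful resource model.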
! Configure Loopback0 with a description and IP address
configure
interface Loopback0
description Automated_Loopback_Interface_IaC_Demo
ip address 10.0.0.1/32
no shutdown
end
copy running-config startup-config
! Verification commands
show ip interface brief Loopback0
show running-config interface Loopback0
Conceptual eAPI JSON-RPC request (HTTP POST) for Desired State:
An automation tool would formulate a JSON payload like this and send it via HTTP POST to the /command-api endpoint.
{
  "jsonrpc": "2.0",
  "method": "runCmds",
  "params": {
    "version": 1,
    "cmds": [
      "enable",
      "configure terminal",
      "interface Loopback0",
      "description Automated_Loopback_Interface_IaC_Demo",
      "ip address 10.0.0.1/32",
      "no shutdown",
      "end",
      "copy running-config startup-config"
    ],
    "format": "json"
  },
  "id": "1"
}
4. Automation Examples (Introductory)
These examples illustrate the basic interaction with network devices using Python and Ansible for the loopback interface configuration.
4.1. Python (Netmiko)
This Python script uses Netmiko (a multi-vendor SSH connection library) to configure a loopback interface.
import os
from netmiko import ConnectHandler

# Security warning: never hardcode credentials. This script reads them from
# environment variables and fails fast with a KeyError if they are missing.
# NET_SECRET (the enable password) is an assumed variable name; it falls back
# to NET_PASSWORD when unset.
USERNAME = os.environ["NET_USERNAME"]
PASSWORD = os.environ["NET_PASSWORD"]
SECRET = os.environ.get("NET_SECRET", PASSWORD)

DEVICE_TYPE_MAP = {
    "cisco_ios": {
        "device_type": "cisco_ios",
        "username": USERNAME,
        "password": PASSWORD,
        "secret": SECRET,
    },
    "juniper_junos": {
        "device_type": "juniper_junos",
        "username": USERNAME,
        "password": PASSWORD,
    },
    "arista_eos": {
        "device_type": "arista_eos",
        "username": USERNAME,
        "password": PASSWORD,
        "secret": SECRET,
    },
}

# Replace with actual device details.
devices = [
    {"host": "192.168.1.101", "vendor": "cisco_ios"},
    {"host": "192.168.1.102", "vendor": "juniper_junos"},
    {"host": "192.168.1.103", "vendor": "arista_eos"},
]

config_commands = {
    "cisco_ios": [
        "interface Loopback0",
        "description Automated_Loopback_Interface_IaC_Demo",
        "ip address 10.0.0.1 255.255.255.255",
        "no shutdown",
    ],
    "juniper_junos": [
        'set interfaces lo0 unit 0 description "Automated_Loopback_Interface_IaC_Demo"',
        "set interfaces lo0 unit 0 family inet address 10.0.0.1/32",
    ],
    "arista_eos": [
        "interface Loopback0",
        "description Automated_Loopback_Interface_IaC_Demo",
        "ip address 10.0.0.1/32",
        "no shutdown",
    ],
}

verify_commands = {
    "cisco_ios": "show ip interface brief Loopback0",
    "juniper_junos": "show interfaces lo0 terse",
    "arista_eos": "show ip interface brief Loopback0",
}

for device in devices:
    vendor = device["vendor"]
    device_params = DEVICE_TYPE_MAP[vendor].copy()
    device_params["host"] = device["host"]
    try:
        print(f"\n--- Connecting to {device['host']} ({vendor}) ---")
        with ConnectHandler(**device_params) as net_connect:
            # Cisco and Arista need privileged (enable) mode before configuring
            if vendor in ("cisco_ios", "arista_eos"):
                net_connect.enable()

            # Send configuration commands; stay in config mode on Junos so the
            # candidate configuration can be committed afterwards
            output = net_connect.send_config_set(
                config_commands[vendor],
                exit_config_mode=(vendor != "juniper_junos"),
            )
            print("Configuration applied:")
            print(output)

            if vendor == "juniper_junos":
                # For Juniper, commit is required after configuration
                print("Committing changes on Juniper device...")
                print(net_connect.commit())
            else:
                # Persist the running configuration on Cisco and Arista
                print("Saving configuration...")
                net_connect.save_config()

            # Verify configuration
            print("Verifying configuration...")
            verify_cmd = verify_commands.get(vendor, "show version")  # Fallback
            print(net_connect.send_command(verify_cmd))
    except Exception as e:
        print(f"Error connecting to or configuring {device['host']}: {e}")
4.2. Ansible Playbook
This Ansible playbook automates the loopback configuration across Cisco, Juniper, and Arista devices.
---
- name: Configure Loopback0 via IaC Introduction
  hosts: all_network_devices
  gather_facts: false  # No need to gather facts for this simple task

  # Security warning: Use Ansible Vault for sensitive variables
  # like `ansible_user` and `ansible_password` in production.
  # For this example, we assume these are passed securely or
  # defined in an inventory file protected by Vault.
  vars:
    loopback_id: 0
    loopback_description: "Automated_Loopback_Interface_IaC_Demo"
    loopback_ip: "10.0.0.1"
    loopback_prefix: "32"  # Equivalent to 255.255.255.255
    # Assumed helper variable: the IOS CLI expects a dotted-decimal mask
    loopback_netmask: "255.255.255.255"

  tasks:
    - name: Configure Loopback0 on Cisco IOS-XE
      cisco.ios.ios_config:
        parents: "interface Loopback{{ loopback_id }}"
        lines:
          - "description {{ loopback_description }}"
          - "ip address {{ loopback_ip }} {{ loopback_netmask }}"
          - "no shutdown"
        save_when: modified  # Only save if changes were made
      when: ansible_network_os == 'ios' or ansible_network_os == 'ios_xe'

    - name: Configure Loopback0 on Juniper JunOS
      # junos_config commits only when the candidate differs from the active config
      junipernetworks.junos.junos_config:
        lines:
          - 'set interfaces lo{{ loopback_id }} unit 0 description "{{ loopback_description }}"'
          - "set interfaces lo{{ loopback_id }} unit 0 family inet address {{ loopback_ip }}/{{ loopback_prefix }}"
      when: ansible_network_os == 'junos'

    - name: Configure Loopback0 on Arista EOS
      arista.eos.eos_config:
        parents: "interface Loopback{{ loopback_id }}"
        lines:
          - "description {{ loopback_description }}"
          - "ip address {{ loopback_ip }}/{{ loopback_prefix }}"
          - "no shutdown"
        save_when: modified  # Only save if changes were made
      when: ansible_network_os == 'eos'

    - name: Verify Loopback0 configuration on Cisco
      cisco.ios.ios_command:
        commands: "show ip interface brief Loopback{{ loopback_id }}"
      register: cisco_loopback_output
      when: ansible_network_os == 'ios' or ansible_network_os == 'ios_xe'

    - name: Display Cisco verification output
      ansible.builtin.debug:
        var: cisco_loopback_output.stdout_lines
      when: ansible_network_os == 'ios' or ansible_network_os == 'ios_xe'

    - name: Verify Loopback0 configuration on Juniper
      junipernetworks.junos.junos_rpc:
        rpc: get-interface-information
        args:
          interface-name: "lo{{ loopback_id }}"
          terse: true
      register: juniper_loopback_output
      when: ansible_network_os == 'junos'

    - name: Display Juniper verification output
      ansible.builtin.debug:
        var: juniper_loopback_output.output
      when: ansible_network_os == 'junos'

    - name: Verify Loopback0 configuration on Arista
      arista.eos.eos_command:
        commands: "show ip interface brief Loopback{{ loopback_id }}"
      register: arista_loopback_output
      when: ansible_network_os == 'eos'

    - name: Display Arista verification output
      ansible.builtin.debug:
        var: arista_loopback_output.stdout_lines
      when: ansible_network_os == 'eos'
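Note: This playbook requires an Ansible inventory file that defines the all_network_devices group and sets the ansible_network_os variable for each host. Collection-based inventories usually set the fully qualified value (for example cisco.ios.ios, junipernetworks.junos.junos, or arista.eos.eos); if yours does, adjust the when conditions to match.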
4.3. Terraform (Conceptual IaC for Provisioning)
Terraform excels at provisioning and managing infrastructure rather than just device configurations. While direct device configuration is possible with specific providers, a common IaC pattern is to provision higher-level network constructs or cloud resources. For an introductory chapter, we’ll show a simple example of defining a “network resource” which, in a real scenario, might translate to a cloud VPC or a managed network segment. This demonstrates the declarative nature of IaC with Terraform.
# This is a conceptual example for demonstrating IaC principles with Terraform.
# In a real-world scenario, you would use a specific provider (e.g., Cisco Meraki,
# AWS VPC, Azure VNet, or custom network device providers) to manage actual network resources.

# --- main.tf ---
resource "local_file" "network_segment_definition" {
  # This 'local_file' resource conceptually represents a network segment
  # defined as code. In practice, this would be a provider resource
  # that interacts with a real network controller or device API.
  content = jsonencode({
    segment_name = "production-web-tier"
    cidr_block   = "10.0.0.0/24"
    vlans = [
      { id = 10, name = "Web_Servers" },
      { id = 20, name = "Database_Servers" }
    ],
    gateways = [
      { ip = "10.0.0.1", device_role = "Primary-Router" }
    ],
    description = "IaC defined network segment for web application deployment."
  })
  filename = "network_segments/production-web-tier.json"
}

output "segment_details_file" {
  value       = local_file.network_segment_definition.filename
  description = "Path to the IaC-defined network segment details."
}
To run this: terraform init, terraform plan, terraform apply.
This example creates a local JSON file representing the desired state of a network segment, demonstrating that network definitions can be version-controlled and managed declaratively. The local_file resource acts as a placeholder for what would typically be a provider-specific resource managing actual network infrastructure like a Cisco DNA Center policy, a cloud VPC, or a device group in a network management system.
5. Security Considerations
Adopting NetDevOps and IaC introduces new attack vectors and necessitates a strong focus on security. Automated processes can amplify vulnerabilities if not properly secured.
5.1. Attack Vectors
- Compromised Automation Host/Platform: If the server running Ansible, Python scripts, or CI/CD pipelines is compromised, an attacker could gain control over your entire network.
- Insecure Credentials: Hardcoding usernames/passwords in scripts or storing them unencrypted in version control.
- API Vulnerabilities: Exploiting weaknesses in NETCONF, RESTCONF, gRPC, or vendor APIs (e.g., authentication bypass, injection attacks).
- Version Control System (VCS) Compromise: Tampering with network configurations or automation code in Git, leading to malicious deployments or denial-of-service.
- Configuration Drift: Undesired manual changes bypassing the IaC pipeline, creating inconsistencies and potential security gaps.
- Insider Threats: Malicious actors within the organization manipulating automation tools or IaC.
5.2. Mitigation Strategies & Best Practices
- Secure Credential Management:
- Ansible Vault: Encrypt sensitive data (passwords, API keys) at rest.
- Environment Variables: Store credentials in environment variables on the automation host, not in scripts.
- Secret Managers: Integrate with dedicated secret management services (e.g., HashiCorp Vault, AWS Secrets Manager, Azure Key Vault).
- Never Hardcode: Absolutely avoid placing credentials directly in code.
- Least Privilege:
- API/Device Access: Grant network devices and automation tools only the minimum necessary permissions (e.g., read-only access for verification, specific configuration commands for deployment).
- Automation User Accounts: Use dedicated, non-admin accounts for automation with granular RBAC (Role-Based Access Control).
- Secure Automation Platforms:
- Hardened OS: Ensure automation servers (e.g., Jenkins, GitLab Runner, Ansible Automation Platform) are running on a hardened operating system.
- Network Segmentation: Isolate automation platforms in a secure network segment.
- Regular Patching: Keep all software (OS, automation tools, libraries) up to date.
- Audit Logging: Enable comprehensive logging for all automation activities and access attempts.
- Version Control System (VCS) Security:
- Branch Protections: Implement branch protection rules (e.g., require code reviews, signed commits) for critical configuration branches.
- Access Control: Strict access control to the VCS.
- Code Review: Mandate peer reviews for all network configuration code changes.
- Secure APIs:
- TLS/SSL: Always use encrypted transport (HTTPS for RESTCONF, SSH for NETCONF, TLS for gRPC).
- API Key Management: Rotate API keys regularly.
- Input Validation: Ensure automation scripts validate input to prevent injection attacks.
- Immutable Infrastructure & Configuration:
- Strive for immutable network configurations where possible, rebuilding or replacing components rather than modifying them in place.
- Regularly audit device configurations against the IaC source of truth to detect and remediate drift.
- Continuous Security Testing: Integrate security scanning tools into your CI/CD pipeline to check for vulnerabilities in automation code or configuration policies.
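The "never hardcode" rule above is easy to enforce in code. The sketch below reads credentials from environment variables and refuses to continue when one is missing, instead of silently falling back to a default password. NET_USERNAME and NET_PASSWORD follow the naming used in this chapter's Netmiko example; load_credentials and MissingCredentialError are hypothetical names.

```python
import os
from typing import Dict, Mapping


class MissingCredentialError(RuntimeError):
    """Raised when a required credential is not present in the environment."""


def load_credentials(
    env: Mapping[str, str] = os.environ, prefix: str = "NET_"
) -> Dict[str, str]:
    """Return username/password from the environment, or fail loudly."""
    required = ("USERNAME", "PASSWORD")
    missing = [f"{prefix}{key}" for key in required if not env.get(f"{prefix}{key}")]
    if missing:
        raise MissingCredentialError(f"missing environment variables: {', '.join(missing)}")
    return {key.lower(): env[f"{prefix}{key}"] for key in required}


# Demonstration with an explicit mapping instead of the real environment
try:
    load_credentials({"NET_USERNAME": "automation"})
except MissingCredentialError as exc:
    print(exc)
```

The same function can later be pointed at a secret manager client by passing a different mapping, which keeps the calling code unchanged.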
5.3. Security Configuration Example (Conceptual SSH hardening for Cisco)
This is a conceptual example of using automation to enforce a basic security hardening on SSH for Cisco devices.
! Configure SSH v2, strong algorithms, and local authentication
configure terminal
crypto key generate rsa label SSH_KEYS modulus 2048
ip ssh rsa keypair-name SSH_KEYS
ip ssh version 2
ip ssh authentication-retries 3
ip ssh time-out 60
ip ssh dh min size 2048
ip ssh server algorithm encryption [email protected] aes256-ctr
ip ssh server algorithm mac hmac-sha2-512 hmac-sha2-256
line vty 0 4
transport input ssh
login local
end
write memory
! Verification Commands:
show ip ssh
show running-config | section ip ssh
show running-config | section line vty
Security Warning: Generating new crypto keys will interrupt existing SSH sessions. Ensure this is done during a maintenance window or with appropriate fallback access.
6. Verification & Troubleshooting
Automation aims to reduce errors, but it doesn’t eliminate the need for verification and troubleshooting. IaC fundamentally changes how these activities are performed.
6.1. Verification Commands (for Loopback Configuration)
After deploying the loopback configuration, these commands would be used to confirm the desired state.
# Cisco IOS-XE/NX-OS
show ip interface brief Loopback0
show running-config interface Loopback0
# Juniper JunOS
show interfaces lo0 terse
show configuration interfaces lo0
# Arista EOS
show ip interface brief Loopback0
show running-config interface Loopback0
Expected Output (Cisco Example):
Interface IP-Address OK? Method Status Protocol
Loopback0 10.0.0.1 YES manual up up
Building configuration...
Current configuration : 110 bytes
!
interface Loopback0
description Automated_Loopback_Interface_IaC_Demo
ip address 10.0.0.1 255.255.255.255
end
Annotations:
- The Status and Protocol being up indicates the interface is administratively and operationally active.
- The description and ip address match the values intended by the IaC.
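The manual check described above can be automated by parsing the command output and comparing it with the intended values. This is a minimal sketch that assumes the column layout shown in the expected output; parse_interface_brief is a hypothetical helper, and production code would more likely rely on a structured API or a maintained parsing library.

```python
from typing import Dict

SAMPLE = """\
Interface IP-Address OK? Method Status Protocol
Loopback0 10.0.0.1 YES manual up up
"""


def parse_interface_brief(output: str) -> Dict[str, Dict[str, str]]:
    """Turn 'show ip interface brief' style rows into a dict keyed by interface name."""
    rows = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 6 or fields[0] == "Interface":
            continue  # skip the header and anything that is not a data row
        rows[fields[0]] = {
            "address": fields[1],
            # Status can be two words, e.g. "administratively down"
            "status": " ".join(fields[4:-1]),
            "protocol": fields[-1],
        }
    return rows


state = parse_interface_brief(SAMPLE)["Loopback0"]
intended = {"address": "10.0.0.1", "status": "up", "protocol": "up"}
print("matches intent" if state == intended else f"mismatch: {state}")
```

A check like this can run as the final pipeline stage so that a deployment is only marked successful when the device reports the intended state.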
6.2. Common Issues & Troubleshooting
| Issue | Description | Potential Root Causes | Debug Commands / Resolution Steps |
|---|---|---|---|
| Connectivity Failure | Automation engine cannot reach the device. | Network firewall blocking, incorrect IP address, device offline, SSH/API service down. | ping <device_ip>, ssh <device_ip>, nc -vz <device_ip> <port>. Check device status and network path. |
| Authentication Errors | Automation tool fails to log in. | Incorrect username/password, SSH key issue, local/RADIUS/TACACS+ server issues. | Double-check credentials. Test manual login. Check debug aaa authentication on device, show users. |
| Authorization Errors | User logs in but lacks permissions for configuration. | Insufficient RBAC privileges for the automation user. | Review user roles/privileges on the device. Grant necessary permissions. For NETCONF/RESTCONF, check specific YANG module access. |
| Syntax Errors (IaC/API) | Device rejects configuration due to malformed input. | Incorrect YANG data, invalid CLI command in script, wrong JSON/XML structure. | Examine automation tool output for error messages. Compare generated payload/commands to vendor documentation. Use vendor-specific validation tools (e.g., Cisco YANG Suite). |
| Idempotence Failure | Automation applies changes even when not needed, or misses existing config. | Logic error in automation script, inconsistent state awareness, device not reporting current state accurately. | Debug script logic. Perform dry-run or check mode if available. Manually verify device state before and after automation. |
| Configuration Drift | Manual changes bypass IaC, leading to inconsistencies. | Lack of clear change management policy, emergency manual changes not committed to IaC. | Implement regular audits of running config against IaC. Enforce IaC-only changes. Alert on detected drift. |
| Session Lock/Resource Busy | Automation tool cannot acquire necessary locks (e.g., NETCONF candidate config). | Another session holds the lock, previous automation run failed to release lock. | On device, identify and clear existing sessions/locks. Implement robust error handling in automation to release locks gracefully. |
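The "Configuration Drift" row recommends auditing the running configuration against the IaC definition. One way to sketch such an audit is a plain text diff using Python's standard difflib module. The helper names and sample configuration below are assumptions for illustration; a real audit would fetch the running configuration from the device and would likely need vendor-specific normalization.

```python
import difflib
from typing import List


def normalize(config: str) -> List[str]:
    """Drop blank lines, '!' separators, and trailing whitespace."""
    lines = []
    for line in config.splitlines():
        stripped = line.rstrip()
        if stripped and stripped.strip() != "!":
            lines.append(stripped)
    return lines


def detect_drift(intended: str, running: str) -> List[str]:
    """Return unified-diff lines; an empty list means no drift was found."""
    return list(
        difflib.unified_diff(
            normalize(intended),
            normalize(running),
            fromfile="intended (IaC)",
            tofile="running (device)",
            lineterm="",
        )
    )


# Illustrative intended state, matching the loopback example in Section 6.1.
INTENDED = """\
interface Loopback0
 description Automated_Loopback_Interface_IaC_Demo
 ip address 10.0.0.1 255.255.255.255
"""

if __name__ == "__main__":
    # Simulate a manual change made outside the IaC workflow.
    running = INTENDED.replace(
        "Automated_Loopback_Interface_IaC_Demo", "changed_by_hand"
    )
    for diff_line in detect_drift(INTENDED, running):
        print(diff_line)
```

A non-empty result could feed an alert or fail a scheduled audit job, which is the "alert on detected drift" step from the table.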
7. Performance Optimization
Although this chapter is introductory, understanding the performance considerations of network automation early is crucial for scaling later.
- Parallel Execution: Most automation tools (Ansible, Nornir, Python with asyncio) support running tasks on multiple devices concurrently. This significantly reduces total execution time.
  - Ansible: Use the forks parameter in ansible.cfg or on the command line.
  - Python: Use concurrent.futures (ThreadPoolExecutor) or asyncio.
- Batching API Calls: For APIs that support it, send multiple configuration or query operations in a single request to minimize network latency and overhead.
- Targeted Changes: Only send configuration for elements that actually need to change (idempotence helps here). Avoid pushing full configurations if only a small part is modified.
- Caching: Cache device facts or frequently accessed read-only data to reduce API calls.
- Efficient Data Processing: Optimize Python scripts for faster data parsing and manipulation.
- Monitoring Automation Execution: Implement metrics and logging to track automation script execution times, success rates, and identify bottlenecks.
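To make the parallel execution point concrete, here is a minimal sketch using concurrent.futures from the Python standard library. The collect_facts function is a hypothetical stand-in for a real per-device task (it only sleeps to simulate latency), and the addresses come from a documentation range.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def collect_facts(host: str) -> dict:
    """Hypothetical per-device task; a real one would open an SSH/API session."""
    time.sleep(0.1)  # Stand-in for network latency
    return {"host": host, "reachable": True}


def run_in_parallel(hosts, task, max_workers=10):
    """Run task(host) concurrently and return {host: result or exception}."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(task, host): host for host in hosts}
        for future in as_completed(futures):
            host = futures[future]
            try:
                results[host] = future.result()
            except Exception as exc:
                # One failing device should not stop the remaining devices.
                results[host] = exc
    return results


if __name__ == "__main__":
    devices = [f"192.0.2.{i}" for i in range(1, 6)]
    started = time.perf_counter()
    outcome = run_in_parallel(devices, collect_facts, max_workers=5)
    elapsed = time.perf_counter() - started
    print(f"{len(outcome)} devices in {elapsed:.2f}s")
```

Threads suit this workload because each task spends most of its time waiting on the network. The max_workers value plays a role similar to Ansible's forks setting and should be tuned to what the devices and the control node can tolerate.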
8. Hands-On Lab: Deploying a Basic Loopback Interface via Ansible
This lab will guide you through setting up a simple Ansible environment and using it to configure a loopback interface on a simulated or real Cisco IOS-XE device.
Lab Topology (Conceptual):
nwdiag {
  network Management_Network {
    address = "192.168.1.0/24"
    Ansible_Control_Node [address = "192.168.1.10"];
    Router_1 [address = "192.168.1.101"];
  }
}
Objectives:
- Set up an Ansible control node.
- Create an Ansible inventory file.
- Create an Ansible playbook to configure a loopback interface.
- Execute the playbook and verify the configuration.
Step-by-Step Configuration:
Prerequisites:
- A Linux-based system (e.g., Ubuntu, CentOS, WSL) to act as the Ansible Control Node.
- A Cisco IOS-XE device (physical or virtual, e.g., Cisco CSR1000V) with SSH enabled and network connectivity to the Ansible Control Node. Ensure you have an administrator username and password.
1. Install Ansible on Control Node:
# For Ubuntu/Debian
sudo apt update
sudo apt install software-properties-common
sudo apt-add-repository --yes --update ppa:ansible/ansible
sudo apt install ansible
# For CentOS/RHEL
sudo yum install epel-release
sudo yum install ansible
Verify installation:
ansible --version
2. Create a Project Directory:
mkdir ansible_netdevops_intro
cd ansible_netdevops_intro
3. Create an Inventory File (inventory.ini):
Replace YOUR_DEVICE_IP with your Cisco IOS-XE device’s IP address. The SSH username and password placeholders are set in the playbook in step 4 (or configure SSH key-based authentication for production).
[cisco_iosxe]
Router_1 ansible_host=YOUR_DEVICE_IP ansible_network_os=cisco.ios.ios ansible_connection=ansible.netcommon.network_cli
[all_network_devices:children]
cisco_iosxe
Security Warning: Storing passwords in cleartext, whether in inventory.ini or in a playbook, is highly insecure for production. Use Ansible Vault instead: keep variables such as ansible_password in a separate variables file, encrypt it with ansible-vault encrypt <file>, and supply the vault password at runtime. For this simple lab, you might temporarily use cleartext, but understand the risks.
4. Create an Ansible Playbook (loopback_config.yml):
---
- name: Configure Loopback0 on Cisco IOS-XE Lab Device
  hosts: cisco_iosxe
  gather_facts: false
  vars:
    ansible_user: YOUR_SSH_USERNAME
    ansible_password: YOUR_SSH_PASSWORD # Remember to use Ansible Vault in production!
    ansible_become: yes # Required for privilege escalation on Cisco devices
    ansible_become_method: enable # Method to gain enable mode
    ansible_become_pass: YOUR_ENABLE_PASSWORD # Enable password (Vault this!)
    loopback_id: 0
    loopback_description: "Lab_Automated_Loopback_IaC"
    loopback_ip: "172.16.10.1"
    loopback_prefix: "24" # 255.255.255.0
    loopback_mask: "255.255.255.0" # Added helper variable: the IOS CLI expects a dotted-decimal mask
  tasks:
    - name: Configure Loopback{{ loopback_id }}
      cisco.ios.ios_config:
        parents: "interface Loopback{{ loopback_id }}"
        lines:
          - "description {{ loopback_description }}"
          - "ip address {{ loopback_ip }} {{ loopback_mask }}"
          - "no shutdown" # Usually absent from running-config, so this task may report "changed" on every run
        save_when: modified # Only save if changes were made
      register: config_result
    - name: Display configuration result
      ansible.builtin.debug:
        var: config_result
    - name: Verify Loopback configuration
      cisco.ios.ios_command:
        commands:
          - "show ip interface brief Loopback{{ loopback_id }}"
      register: verify_output
    - name: Display verification output
      ansible.builtin.debug:
        var: verify_output.stdout_lines
Note: Make sure the cisco.ios collection is installed: ansible-galaxy collection install cisco.ios.
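The playbook's loopback_prefix comment pairs the /24 prefix with the mask 255.255.255.0, which is the form the IOS ip address command expects. As a rough sketch of that conversion, the standard-library ipaddress module can derive the mask from a prefix length. Both function names here are illustrative helpers, not part of Ansible or any network library.

```python
import ipaddress


def prefix_to_netmask(prefix_length) -> str:
    """Convert an IPv4 prefix length (e.g. 24 or "24") to a dotted-decimal mask."""
    network = ipaddress.ip_network(f"0.0.0.0/{prefix_length}")
    return str(network.netmask)


def ios_ip_address_line(ip: str, prefix_length) -> str:
    """Build an IOS-style 'ip address <ip> <mask>' line from an IP and prefix."""
    interface = ipaddress.ip_interface(f"{ip}/{prefix_length}")
    return f"ip address {interface.ip} {interface.netmask}"


if __name__ == "__main__":
    print(prefix_to_netmask(24))
    print(ios_ip_address_line("172.16.10.1", "24"))
```

Invalid input, such as a prefix length above 32 or a malformed address, raises ValueError, which is a convenient early check before anything is sent to a device.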
5. Execute the Playbook:
ansible-playbook -i inventory.ini loopback_config.yml
If you used Ansible Vault for credentials, you’d add --ask-vault-pass or configure ansible.cfg.
6. Verification Steps:
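Observe the output of the playbook. The Display verification output task should show the configured IP address and an up/up status for Loopback0. The description does not appear in this command's output; confirm it in the running configuration.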
You can also log into your Cisco IOS-XE device manually via SSH and run:
show ip interface brief Loopback0
show running-config interface Loopback0
Confirm the output matches your playbook’s intended configuration.
Challenge Exercises:
- Modify the playbook to configure a different loopback interface (e.g., Loopback1) with a different IP address and description.
- Add a task to the playbook to verify the device’s hostname before configuring the loopback interface.
- (Advanced) Try to implement the same configuration using a Python script with Netmiko, similar to the example in Section 4.1.
9. Best Practices Checklist
- Version Control Everything: All network configurations, automation scripts, and IaC templates are stored in a Git repository.
- Declarative Over Imperative: Strive to define the desired state rather than a sequence of commands.
- Idempotent Automation: Scripts and playbooks should be executable multiple times without causing unintended side effects.
- Secure Credential Management: Use Ansible Vault, environment variables, or dedicated secret managers; never hardcode credentials.
- Least Privilege Access: Grant automation users and API clients only the minimum necessary permissions.
- Code Review: All changes to network code are peer-reviewed before merging and deployment.
- Automated Testing: Implement unit, integration, and end-to-end tests for network changes.
- Continuous Integration/Continuous Delivery (CI/CD): Automate the testing and deployment pipeline.
- Monitoring & Alerting: Continuously monitor network state, performance, and automation execution for anomalies.
- Audit Trails: Maintain comprehensive logs of all automated and manual network changes.
- Documentation: Document automation workflows, IaC definitions, and API usage clearly.
- Small, Incremental Changes: Break down large changes into smaller, manageable, and easily reversible units.
- Backup & Rollback Strategy: Ensure you have clear procedures for backing up configurations and rolling back to previous states.
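As one illustration of the secure credential management item, a script can read credentials from environment variables and fail fast when they are missing, rather than embedding them in code. This is a minimal sketch; the NETDEVOPS_ variable prefix and the function and exception names are assumptions chosen for this example.

```python
import os


class MissingCredentialError(RuntimeError):
    """Raised when a required credential variable is not set."""


def load_credentials(env=os.environ, prefix="NETDEVOPS_"):
    """Read username and password from environment variables.

    The variable names (NETDEVOPS_USERNAME, NETDEVOPS_PASSWORD) are
    hypothetical; use whatever convention your team agrees on.
    """
    required = ("USERNAME", "PASSWORD")
    missing = [prefix + key for key in required if not env.get(prefix + key)]
    if missing:
        # Report only the variable names, never any credential values.
        raise MissingCredentialError(
            "Missing environment variables: " + ", ".join(missing)
        )
    return {key.lower(): env[prefix + key] for key in required}


if __name__ == "__main__":
    try:
        credentials = load_credentials()
        print("Loaded credentials for user:", credentials["username"])
    except MissingCredentialError as error:
        print(error)
```

Passing the environment as a parameter keeps the function easy to test with a plain dictionary. For anything beyond a lab, a dedicated secret manager or Ansible Vault remains the stronger option.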
10. Reference Links
- NETCONF RFC 6241: https://datatracker.ietf.org/doc/html/rfc6241
- RESTCONF RFC 8040: https://datatracker.ietf.org/doc/html/rfc8040
- YANG RFC 7950 (YANG 1.1): https://datatracker.ietf.org/doc/html/rfc7950
- Cisco DevNet - Standard Network Devices (NETCONF/RESTCONF/gRPC/YANG): https://developer.cisco.com/site/standard-network-devices/
- Cisco DevNet - Infrastructure as Code: https://developer.cisco.com/iac/
- Cisco YANG Suite: https://developer.cisco.com/yangsuite/
- Ansible Network Automation: https://docs.ansible.com/ansible/latest/network/index.html
- Netmiko GitHub: https://github.com/ktbyers/netmiko
- NAPALM GitHub: https://github.com/napalm-automation/napalm
- Nornir GitHub: https://github.com/nornir-automation/nornir
- Terraform Documentation: https://www.terraform.io/docs/
- PlantUML Official Site: https://plantuml.com/
- nwdiag Examples: http://blockdiag.com/en/nwdiag/nwdiag-examples.html
- Graphviz DOT Language: https://graphviz.org/doc/info/lang.html
- packetdiag Examples: http://blockdiag.com/en/nwdiag/packetdiag-examples.html
11. What’s Next
This chapter provided a high-level but essential introduction to the philosophy and fundamental concepts behind NetDevOps and Infrastructure as Code. You should now have a solid understanding of why these methodologies are critical for modern network engineers, the role of programmable interfaces, and the benefits they bring.
In the upcoming chapters, we will dive deeper into the practical implementation of these concepts:
- Chapter 2: Getting Started with Python for Network Automation: We will explore Python fundamentals, essential libraries (Netmiko, NAPALM), and basic scripting for network interaction.
- Chapter 3: Mastering Ansible for Multi-Vendor Network Automation: This chapter will focus on Ansible architecture, playbooks, modules, roles, and collections for automating configuration and operational tasks across diverse network devices.
- Chapter 4: Advanced Network APIs and YANG Data Models: We will explore NETCONF, RESTCONF, and gRPC in detail, including how to interact with them programmatically using Python and how to leverage YANG data models for robust, standardized automation.
Prepare to transform your approach to network engineering from manual to automated, from reactive to proactive, and from siloed to collaborative. The journey to becoming a NetDevOps expert has just begun!