7.1 Introduction

The transition from manual configuration to automated, programmatic network management is a cornerstone of NetDevOps. While Ansible and Python excel at imperative task execution and configuration management, Infrastructure as Code (IaC) tools like Terraform offer a powerful, declarative approach to provisioning and managing infrastructure, including network devices and cloud networking services.

This chapter delves into Terraform, focusing on its application within a NetDevOps framework for both traditional network hardware and modern cloud environments. We will explore Terraform’s core principles, its unique capabilities for state management, and how it integrates with diverse network ecosystems, from Cisco IOS XE routers to cloud-native Virtual Private Clouds (VPCs).

What this chapter covers:

  • The fundamental concepts of Infrastructure as Code and Terraform.
  • Terraform’s architecture, including providers, resources, data sources, and modules.
  • How Terraform interacts with multi-vendor network devices using APIs like NETCONF and RESTCONF, leveraging YANG data models.
  • Utilizing Terraform for provisioning and managing cloud network infrastructure across major providers (AWS, Azure, GCP).
  • Practical configuration examples for Cisco, Juniper, and cloud environments.
  • Security best practices, verification, and troubleshooting techniques specific to Terraform.
  • Strategies for optimizing Terraform deployments in production.

Why it’s important: Terraform allows network engineers to define their network infrastructure in code, enabling version control, collaboration, idempotency, and the ability to rapidly deploy, modify, and destroy network components with confidence. This declarative approach minimizes configuration drift, enhances auditability, and supports agile development methodologies for network operations, aligning perfectly with NetDevOps principles.

What you’ll be able to do after: Upon completing this chapter, you will understand how to design and implement network infrastructure using Terraform, manage multi-vendor network devices declaratively, provision cloud network resources, and integrate these practices into your NetDevOps workflows. You will be equipped to leverage Terraform for building robust, scalable, and automated network environments.

7.2 Technical Concepts

7.2.1 Infrastructure as Code (IaC) and Declarative Configuration

Infrastructure as Code (IaC) is the practice of managing and provisioning infrastructure through machine-readable definition files, rather than physical hardware configuration or interactive configuration tools. The entire network infrastructure, including devices, firewalls, load balancers, and cloud resources, is described in a high-level language.

Terraform embodies the declarative IaC paradigm. Instead of writing a script that describes how to achieve a desired state (imperative), you describe what the desired end-state should be. Terraform then figures out the necessary steps to reach that state. This is a significant shift from traditional network CLI management.
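
As a minimal sketch of the declarative style (using the real AWS provider), the configuration states only the desired end-state; Terraform computes whatever create, update, or delete steps are required:

```hcl
# Declarative: describe the end-state only. No ordering, no step-by-step commands.
# Terraform determines whether to create, modify, or leave this VPC untouched.
resource "aws_vpc" "example" {
  cidr_block = "10.0.0.0/16"
}
```

Running terraform apply twice against this file changes nothing the second time, which is the idempotency property discussed below.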

Key IaC Benefits:

  • Version Control: Track changes, revert to previous states, collaborate using tools like Git.
  • Idempotency: Applying the same configuration multiple times yields the same result without unintended side effects.
  • Consistency: Eliminates “snowflake” configurations, ensuring uniformity across environments.
  • Repeatability: Easily recreate environments (dev, test, production).
  • Reduced Human Error: Automates complex provisioning tasks.
  • Auditability: Changes are tracked in code, providing a clear history.

IaC Workflow with Terraform

digraph IaC_Terraform_Workflow {
    rankdir=LR;
    node [shape=box];

    user [label="Network Engineer (HCL Code)"];
    git [label="Version Control (Git)"];
    terraform_cli [label="Terraform CLI"];
    provider [label="Terraform Provider"];
    network_devices [label="Network Devices/Cloud APIs", shape=cylinder];
    terraform_state [label="Terraform State File", shape=Mrecord];

    user -> git [label="Pushes HCL"];
    git -> terraform_cli [label="Fetches HCL"];
    terraform_cli -> terraform_state [label="Loads Current State"];
    terraform_cli -> provider [label="Requests Resource State"];
    provider -> network_devices [label="Interacts with APIs"];
    network_devices -> provider [label="Returns Actual State"];
    provider -> terraform_cli [label="Returns Actual State"];
    terraform_cli -> terraform_state [label="Compares, Updates State"];
    terraform_cli -> user [label="Shows Plan/Status"];
    terraform_cli -> provider [label="Applies Changes (if approved)"];
}

Figure 7.1: Terraform Infrastructure as Code Workflow

7.2.2 Terraform Core Concepts

a) HashiCorp Configuration Language (HCL): Terraform uses HCL, a declarative language designed for configuration files. It’s human-readable and supports expressions, variables, and complex data structures.

b) Providers: Providers are plugins that Terraform uses to interact with an upstream API to manage resources. There are thousands of providers for cloud services (AWS, Azure, GCP), SaaS products, and network devices. For network devices, providers often translate HCL into API calls (NETCONF, RESTCONF, gNMI, vendor-specific APIs) that devices understand. Examples include aws, azurerm, google, ciscoiosxe, junos, arista, and netconf.
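
Providers are declared and version-pinned in a terraform block. A minimal sketch for the AWS provider:

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"  # Registry address of the provider plugin
      version = "~> 5.0"         # Pin the major version for reproducible runs
    }
  }
}

# Provider-level configuration (credentials typically come from the environment)
provider "aws" {
  region = "us-east-1"
}
```

terraform init downloads the pinned provider into the working directory before any plan or apply.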

c) Resources: The fundamental building blocks of infrastructure defined in Terraform. A resource block describes one or more infrastructure objects, such as a cloud VPC, a network interface, a router, or a VLAN. Terraform manages the lifecycle of these resources (create, read, update, delete).

d) Data Sources: Allow Terraform to fetch information about existing infrastructure objects, without managing their lifecycle. This is useful for querying current state or referencing resources managed outside of Terraform.
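
For instance, a data source can look up a VPC created outside Terraform and feed its ID into managed resources. A sketch (the tag value is illustrative):

```hcl
# Read-only lookup of an existing VPC that Terraform does not manage
data "aws_vpc" "existing" {
  tags = {
    Name = "legacy-prod-vpc"  # Illustrative tag; match your environment
  }
}

# The looked-up ID can then anchor new, Terraform-managed resources
resource "aws_subnet" "new" {
  vpc_id     = data.aws_vpc.existing.id
  cidr_block = "10.0.50.0/24"
}
```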

e) Modules: Self-contained packages of Terraform configurations that are reusable. Modules encapsulate related resources, promoting organization, reusability, and consistency. A module can define a complete network segment, a VPN tunnel, or a standardized device configuration.
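
Calling a module is a single block that wires inputs to the module's variables. The sketch below assumes a local module path and the input names used by the VPC module in Section 7.5:

```hcl
module "prod_vpc" {
  source = "./modules/vpc"  # Local path; could also be a registry address

  region               = "us-east-1"
  vpc_cidr             = "10.10.0.0/16"
  public_subnet_cidrs  = ["10.10.1.0/24"]
  private_subnet_cidrs = ["10.10.2.0/24"]
  vpc_name             = "prod-vpc"
}
```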

f) State: Terraform maintains a terraform.tfstate file, which maps the real-world resources to your configuration. This state file is crucial:

  • It tracks metadata about your infrastructure.
  • It’s used to compare the desired state (HCL) with the actual state (remote infrastructure).
  • It helps Terraform understand which resources to create, update, or destroy.
  • Critical Security Note: State files can contain sensitive information. They should be stored securely (e.g., in a remote backend with encryption) and never committed to source control.

g) Remote Backends: For collaborative environments and security, Terraform state files should be stored in a remote, shared, and versioned backend (e.g., AWS S3, Azure Blob Storage, HashiCorp Consul, Terraform Cloud/Enterprise). Remote backends also support state locking to prevent concurrent modifications.
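
A remote backend is declared inside the terraform block. A sketch using AWS S3 with DynamoDB-based state locking (bucket and table names are hypothetical):

```hcl
terraform {
  backend "s3" {
    bucket         = "example-terraform-state"   # Hypothetical bucket name
    key            = "network/terraform.tfstate" # Path to this project's state
    region         = "us-east-1"
    encrypt        = true                        # Server-side encryption at rest
    dynamodb_table = "terraform-locks"           # Hypothetical table for state locking
  }
}
```

After adding or changing a backend block, terraform init migrates existing state to the new location.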

h) Workspaces: Allow you to manage multiple distinct instances of the same configuration. This is useful for creating separate environments (e.g., dev, stage, prod) from a single set of Terraform files.
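
One common pattern keys per-environment values off the active workspace name (selected with terraform workspace select). A sketch:

```hcl
# Map workspace names to per-environment settings
locals {
  env_cidrs = {
    dev  = "10.20.0.0/16"
    prod = "10.10.0.0/16"
  }
}

resource "aws_vpc" "main" {
  # terraform.workspace evaluates to "default", "dev", "prod", etc.
  cidr_block = local.env_cidrs[terraform.workspace]
  tags = {
    Environment = terraform.workspace
  }
}
```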

7.2.3 Terraform for Network Devices

Modern network devices expose programmatic interfaces, moving beyond CLI scraping. Terraform leverages these APIs, primarily NETCONF and RESTCONF, which utilize YANG data models to define device configurations and operational states.

a) YANG Data Models: YANG (RFC 6020, RFC 7950) is a data modeling language used to define the configuration and state data of network devices. It provides a standardized, structured, and vendor-agnostic way to represent network elements.

b) NETCONF: NETCONF (Network Configuration Protocol - RFC 6241) is a standardized, XML-based protocol for installing, manipulating, and deleting the configuration of network devices. It’s connection-oriented and uses RPCs (Remote Procedure Calls). Terraform providers for network devices often use NETCONF for robust, transaction-based configuration management.

c) RESTCONF: RESTCONF (RFC 8040) is an HTTP-based protocol that provides a REST-like interface for interacting with data defined by YANG models. It’s often preferred for simpler integrations and web-based applications. Many modern network devices support both NETCONF and RESTCONF.

d) gRPC Network Management Interface (gNMI): While NETCONF/RESTCONF are dominant for configuration, gNMI (a Google-led initiative) is gaining traction for high-performance telemetry and configuration. Some advanced Terraform providers may interface with gNMI for specific use cases, though it’s less common for general configuration provisioning than NETCONF/RESTCONF.

Terraform Interaction with Network Devices

@startuml
skinparam handwritten true
skinparam cloudBorderColor #ADD8E6
skinparam nodeBorderColor #8A2BE2
skinparam databaseBorderColor #FFD700
skinparam rectangleBorderColor #FFA500

cloud "Terraform Control Plane" as TF {
  rectangle "Terraform CLI" as CLI
  rectangle "Terraform State" as State
  rectangle "Terraform Providers" as Providers
}

package "Network Automation Interfaces" {
  component "NETCONF" as NETCONF_API
  component "RESTCONF" as RESTCONF_API
  component "Vendor API (e.g., ACI)" as Vendor_API
}

node "Cisco IOS XE Device" as IOSXE {
  database "YANG Data Model" as IOSXE_YANG
}

node "Juniper JunOS Device" as JUNOS {
  database "YANG Data Model" as JUNOS_YANG
}

node "Arista EOS Device" as ARISTA {
  database "YANG Data Model" as ARISTA_YANG
}

CLI --> Providers : HCL configuration
Providers --> NETCONF_API : NETCONF RPCs (XML)
Providers --> RESTCONF_API : RESTCONF HTTP (JSON/XML)
Providers --> Vendor_API : Vendor-specific Calls
Providers <--> State : Read/Write State

NETCONF_API --> IOSXE_YANG : Configure Device
RESTCONF_API --> IOSXE_YANG : Configure Device
NETCONF_API --> JUNOS_YANG : Configure Device
RESTCONF_API --> JUNOS_YANG : Configure Device
Vendor_API --> ARISTA_YANG : Configure Device

IOSXE_YANG -- IOSXE
JUNOS_YANG -- JUNOS
ARISTA_YANG -- ARISTA

@enduml

Figure 7.2: Terraform Interaction with Multi-Vendor Network Devices

7.2.4 Terraform for Cloud Networking

Cloud platforms (AWS, Azure, GCP) are API-driven by design, exposing comprehensive APIs for all their services, including networking. Terraform excels at provisioning and managing these cloud network resources, integrating seamlessly with their native APIs.

Common Cloud Network Resources Managed by Terraform:

  • Virtual Private Clouds (VPCs) / Virtual Networks (VNets): Isolated network segments in the cloud.
  • Subnets: Divisions within a VPC/VNet.
  • Route Tables: Control network traffic flow.
  • Security Groups / Network Security Groups (NSGs): Stateful firewalls for instances/subnets.
  • Load Balancers: Distribute traffic across instances.
  • VPN Gateways / Direct Connect / ExpressRoute / Cloud Interconnect: Hybrid cloud connectivity.
  • Transit Gateways / Hub-and-Spoke Topologies: Centralized routing and connectivity for multiple VPCs.
  • DNS Services: Route 53, Azure DNS, Cloud DNS.

The declarative nature of Terraform maps perfectly to the API-driven nature of cloud networking.

Hybrid Cloud Network Architecture with Terraform

nwdiag {
  // Define custom styles for clouds and on-prem
  define style cloud {
    color = "#ADD8E6"; // Light Blue
    border_color = "#4169E1"; // Royal Blue
    font_color = "#191970"; // Midnight Blue
    border_width = 2;
  }
  define style on_prem {
    color = "#FFDAB9"; // Peach Puff
    border_color = "#FFA07A"; // Light Salmon
    font_color = "#8B0000"; // Dark Red
    border_width = 2;
  }
  define style device_router {
    shape = router;
    color = "#E0FFFF";
    border_color = "#00BFFF";
  }
  define style device_server {
    shape = box;
    color = "#F0FFF0";
    border_color = "#3CB371";
  }

  // Cloud network (AWS)
  cloud "AWS Cloud" {
    style = cloud;
    network "AWS VPC Production" {
      address = "10.10.0.0/16"
      description = "Terraform-Managed Prod VPC"

      network "Private Subnet A" {
        address = "10.10.1.0/24"
        instance_prod_app [address = "10.10.1.10", style = device_server];
      }
      network "Private Subnet B" {
        address = "10.10.2.0/24"
        instance_prod_db [address = "10.10.2.20", style = device_server];
      }
      network "Public Subnet" {
        address = "10.10.0.0/24"
        lb_public [address = "10.10.0.5", shape=cloud]; // Representing a Load Balancer
      }
    }
    network "AWS VPC Development" {
      address = "10.20.0.0/16"
      description = "Terraform-Managed Dev VPC"
      dev_router [address = "10.20.0.1", style = device_router];
    }

    // Transit Gateway for inter-VPC and hybrid connectivity
    network "AWS Transit Gateway" {
      address = "VPN/Direct Connect Tunnel"
      description = "Centralized Routing Hub"
      aws_tgw [shape=cloud]; // Abstract representation of TGW
    }

    aws_tgw -- "AWS VPC Production" : VPC Attachment
    aws_tgw -- "AWS VPC Development" : VPC Attachment
  }

  // On-Premises Network
  group "On-Premises Data Center" {
    style = on_prem;
    network "Internal Network" {
      address = "192.168.1.0/24"
      onprem_router [address = "192.168.1.1", style = device_router];
      onprem_server [address = "192.168.1.100", style = device_server];
    }
    network "DMZ Network" {
      address = "172.16.0.0/24"
      firewall [address = "172.16.0.1", shape=firewall];
    }
  }

  // Connectivity between On-Prem and Cloud
  onprem_router -- firewall;
  firewall -- aws_tgw : IPsec VPN Tunnel
  // Implicitly, lb_public has internet access
}

Figure 7.3: Hybrid Cloud Network Topology Managed by Terraform

7.3 Configuration Examples

These examples demonstrate using Terraform to configure network devices and cloud resources. They assume a basic Terraform setup (installed CLI, configured cloud credentials).

7.3.1 Cisco IOS XE (NETCONF/RESTCONF Provider)

This example uses the generic netconf provider to configure a VLAN on a Cisco IOS XE device. The device must have NETCONF enabled (NETCONF runs over SSH); the prerequisite configuration below also enables RESTCONF.

Prerequisites:

  • Cisco IOS XE device reachable via SSH.
  • NETCONF/RESTCONF enabled on the device. Example config:
    !
    username terraform privilege 15 secret 0 terraform_password
    !
    netconf-yang
     ssh
    !
    restconf
     transport https
     port 443
    !
    
  • A ~/.netconf.yml or similar file with credentials, or direct inline credentials (less secure).

Terraform Files:

main.tf

# main.tf for Cisco IOS XE VLAN Configuration

# Configure the NETCONF provider
# Ensure host, username, and password are correct for your device
provider "netconf" {
  device = "iosxe_router" # Name defined in ~/.netconf.yml or provide directly
  # Alternatively, provide host, username, password directly:
  # host     = "192.168.1.10"
  # username = "terraform"
  # password = "terraform_password"
  port     = 830 # Default NETCONF over SSH port
  # Use skip_verify for lab environments if self-signed certs
  # skip_verify = true
}

# Define a NETCONF device for the provider
# This resource doesn't configure anything on the device,
# but it's used by other resources to target the specific device.
resource "netconf_device" "iosxe_router" {
  name = "iosxe_router"
  host = "192.168.1.10" # Replace with your IOS XE device IP
}

# Define a resource to manage a VLAN using NETCONF
# This uses a generic 'netconf_edit_config' resource, providing the XML payload.
# For production, consider vendor-specific providers (e.g., ciscoiosxe) if available,
# as they abstract away the XML/JSON and provide HCL-native resource definitions.
resource "netconf_edit_config" "vlan_data" {
  device_name = netconf_device.iosxe_router.name
  target      = "running" # Apply to running configuration
  config_xml  = <<EOF
<config>
  <native xmlns="http://cisco.com/ns/yang/Cisco-IOS-XE-native">
    <vlan>
      <vlan-list>
        <id>100</id>
        <name>Terraform_VLAN</name>
      </vlan-list>
      <vlan-list>
        <id>101</id>
        <name>Another_Terraform_VLAN</name>
      </vlan-list>
    </vlan>
  </native>
</config>
EOF
  # Lifecycle rule to prevent Terraform from destroying the VLAN if the resource is removed
  # This is a common practice for core network configurations.
  # If you want Terraform to manage deletion, remove this block.
  lifecycle {
    prevent_destroy = true
    ignore_changes = [config_xml] # Changes made to config_xml outside Terraform won't trigger a re-apply
  }
}

# Output the device name after successful application
output "configured_iosxe_device" {
  value       = netconf_device.iosxe_router.name
  description = "The name of the Cisco IOS XE device configured by Terraform."
}

Security Warning: Embedding credentials directly in main.tf is highly insecure for production. Use environment variables, terraform.tfvars, or a secrets management tool such as HashiCorp Vault. For the netconf provider, storing credentials in ~/.netconf.yml or using SSH agent forwarding is more secure.
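
A minimal sketch of the variable-based approach (the file names are conventions, not requirements):

```hcl
# variables.tf
variable "device_password" {
  description = "Password for the device management user"
  type        = string
  sensitive   = true # Redacted from terraform plan/apply output
}

# Supply the value outside source control, e.g.:
#   export TF_VAR_device_password='...'        (environment variable)
# or via a terraform.tfvars file excluded through .gitignore
```

The provider block can then reference var.device_password instead of a literal string.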

Deployment Steps:

  1. Initialize Terraform: terraform init
  2. Review the plan: terraform plan
  3. Apply the configuration: terraform apply

Verification Commands (Cisco IOS XE):

! Verify VLAN configuration
show vlan brief
show running-config | section vlan

Expected Output:

VLAN Name                             Status    Ports
---- -------------------------------- --------- -------------------------------
1    default                          active
100  Terraform_VLAN                   active
101  Another_Terraform_VLAN           active
...

7.3.2 Juniper JunOS (NETCONF/RESTCONF Provider)

As in the Cisco example, this configuration uses the netconf provider, here to configure logical interfaces on a Juniper JunOS device.

Prerequisites:

  • Juniper JunOS device reachable via SSH.
  • NETCONF enabled on the device. Example config:
    # Set up NETCONF over SSH
    set system services netconf ssh
    # Create user for Terraform
    set system login user terraform class super-user authentication plain-text-password
    # Provide password when prompted
    

Terraform Files:

main.tf

# main.tf for Juniper JunOS Interface Configuration

# Configure the NETCONF provider
provider "netconf" {
  device = "juniper_srx" # Name defined in ~/.netconf.yml or provide directly
  # host     = "192.168.1.20" # Replace with your JunOS device IP
  # username = "terraform"
  # password = "your_juniper_password"
  port     = 830
}

resource "netconf_device" "juniper_srx" {
  name = "juniper_srx"
  host = "192.168.1.20" # Replace with your JunOS device IP
}

# Define a resource to manage a logical interface on JunOS using NETCONF
resource "netconf_edit_config" "loopback_interface" {
  device_name = netconf_device.juniper_srx.name
  target      = "running" # Apply to running configuration
  config_xml  = <<EOF
<configuration>
  <interfaces>
    <interface>
      <name>lo0</name>
      <unit>
        <name>0</name>
        <family>
          <inet>
            <address>
              <name>10.0.0.1/32</name>
            </address>
          </inet>
        </family>
      </unit>
    </interface>
    <interface>
      <name>ge-0/0/0</name>
      <unit>
        <name>0</name>
        <family>
          <inet>
            <address>
              <name>192.168.20.1/24</name>
            </address>
          </inet>
        </family>
      </unit>
    </interface>
  </interfaces>
</configuration>
EOF
  # Commit and synchronize changes
  commit_confirmed      = false # Set to true for a 'commit confirmed' operation
  commit_synchronize    = true # Ensures configuration is synchronized to the backup Routing Engine
  commit_comment        = "Terraform: Configured loopback and ge-0/0/0.0 interfaces"
  # lifecycle { prevent_destroy = true } # Consider for critical resources
}

output "configured_juniper_device" {
  value       = netconf_device.juniper_srx.name
  description = "The name of the Juniper JunOS device configured by Terraform."
}

Security Warning: Same as for Cisco, avoid hardcoding credentials.

Deployment Steps:

  1. Initialize Terraform: terraform init
  2. Review the plan: terraform plan
  3. Apply the configuration: terraform apply

Verification Commands (Juniper JunOS):

# Verify interface configuration
show interfaces lo0.0
show interfaces ge-0/0/0.0
show configuration interfaces | display set

Expected Output:

user@juniper-srx> show interfaces lo0.0
  Logical interface lo0.0 (Index 66) (SNMP ifIndex 506)
    Flags: Up SNMP-Traps 0x4000000 Encapsulation: ENET2
    inet  addr 10.0.0.1/32
user@juniper-srx> show interfaces ge-0/0/0.0
  Logical interface ge-0/0/0.0 (Index 67) (SNMP ifIndex 507)
    Flags: Up SNMP-Traps 0x4000000 Encapsulation: ENET2
    inet  addr 192.168.20.1/24

7.3.3 Cloud Networking (AWS VPC Example)

This example provisions an AWS VPC, subnets, and a security group.

Prerequisites:

  • AWS account with appropriate IAM permissions.
  • AWS credentials configured for Terraform (e.g., via environment variables AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, or ~/.aws/credentials).

Terraform Files:

main.tf

# main.tf for AWS VPC and Subnet Configuration

# Configure the AWS provider
provider "aws" {
  region = "us-east-1" # Specify your desired AWS region
}

# Create a new VPC
resource "aws_vpc" "netdevops_vpc" {
  cidr_block = "10.0.0.0/16"
  enable_dns_support = true
  enable_dns_hostnames = true

  tags = {
    Name        = "NetDevOps-Terraform-VPC"
    ManagedBy   = "Terraform"
    Environment = "Dev"
  }
}

# Create a public subnet
resource "aws_subnet" "public_subnet" {
  vpc_id            = aws_vpc.netdevops_vpc.id
  cidr_block        = "10.0.1.0/24"
  availability_zone = "us-east-1a" # Must match the provider region; provider arguments cannot be interpolated here
  map_public_ip_on_launch = true # Instances in this subnet get public IPs

  tags = {
    Name        = "NetDevOps-Public-Subnet"
    ManagedBy   = "Terraform"
  }
}

# Create a private subnet
resource "aws_subnet" "private_subnet" {
  vpc_id            = aws_vpc.netdevops_vpc.id
  cidr_block        = "10.0.2.0/24"
  availability_zone = "us-east-1a" # Must match the provider region; provider arguments cannot be interpolated here

  tags = {
    Name        = "NetDevOps-Private-Subnet"
    ManagedBy   = "Terraform"
  }
}

# Create a Security Group (Firewall Rules)
resource "aws_security_group" "web_sg" {
  name        = "web-security-group"
  description = "Allow HTTP/HTTPS traffic"
  vpc_id      = aws_vpc.netdevops_vpc.id

  ingress {
    description = "Allow HTTP from anywhere"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    description = "Allow HTTPS from anywhere"
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1" # Allow all outbound traffic
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name        = "WebSecurityGroup"
    ManagedBy   = "Terraform"
  }
}

# Output the VPC ID and Subnet IDs
output "vpc_id" {
  value       = aws_vpc.netdevops_vpc.id
  description = "The ID of the created VPC."
}

output "public_subnet_id" {
  value       = aws_subnet.public_subnet.id
  description = "The ID of the public subnet."
}

output "private_subnet_id" {
  value       = aws_subnet.private_subnet.id
  description = "The ID of the private subnet."
}

Deployment Steps:

  1. Initialize Terraform: terraform init
  2. Review the plan: terraform plan
  3. Apply the configuration: terraform apply

Verification (AWS Console/CLI):

  • Navigate to the VPC service in the AWS console.
  • Verify the existence of “NetDevOps-Terraform-VPC” with CIDR 10.0.0.0/16.
  • Check the subnets for NetDevOps-Public-Subnet (10.0.1.0/24) and NetDevOps-Private-Subnet (10.0.2.0/24).
  • Confirm the web-security-group exists with appropriate ingress/egress rules.

Expected Output (CLI):

Apply complete! Resources: 4 added, 0 changed, 0 destroyed.

Outputs:

private_subnet_id = "subnet-xxxxxxxxxxxxxxxxxx"
public_subnet_id = "subnet-xxxxxxxxxxxxxxxxxx"
vpc_id = "vpc-xxxxxxxxxxxxxxxxxx"

7.4 Network Diagrams

As demonstrated throughout the examples, clear network diagrams are crucial for understanding and documenting infrastructure managed by Terraform.

7.4.1 Network Topology (nwdiag)

A diagram illustrating a typical multi-region, hybrid cloud network managed by Terraform.

nwdiag {
  // Define custom styles for different components
  define style cloud_region {
    color = "#E0FFFF"; // Azure light blue
    border_color = "#4169E1"; // Royal Blue
    font_color = "#191970"; // Midnight Blue
    border_width = 2;
    fontsize = 14;
  }
  define style on_prem_dc {
    color = "#FFFACD"; // Lemon Chiffon
    border_color = "#DAA520"; // Goldenrod
    font_color = "#8B0000"; // Dark Red
    border_width = 2;
    fontsize = 14;
  }
  define style device_router {
    shape = router;
    color = "#F0FFF0"; // Honeydew
    border_color = "#3CB371"; // Medium Sea Green
    fontsize = 12;
  }
  define style device_firewall {
    shape = firewall;
    color = "#FFEFD5"; // PapayaWhip
    border_color = "#FF8C00"; // Dark Orange
    fontsize = 12;
  }
  define style device_server {
    shape = box;
    color = "#F5F5DC"; // Beige
    border_color = "#A0522D"; // Sienna
    fontsize = 12;
  }
  define style service_lb {
    shape = cloud;
    color = "#F0F8FF"; // AliceBlue
    border_color = "#1E90FF"; // DodgerBlue
    fontsize = 12;
  }
  define style internet_cloud {
    shape = cloud;
    color = "#F8F8FF"; // GhostWhite
    border_color = "#778899"; // Light Slate Gray
    fontsize = 12;
  }

  // AWS Region 1 (us-east-1)
  cloud "AWS Region A (us-east-1)" {
    style = cloud_region;
    network "AWS VPC A (10.1.0.0/16)" {
      description = "Terraform-Managed"
      router_aws_a [style = device_router];
      network "Public Subnet (10.1.1.0/24)" {
        lb_web_a [style = service_lb];
      }
      network "Private Subnet (10.1.2.0/24)" {
        app_server_a [style = device_server];
      }
    }
  }

  // AWS Region 2 (us-west-2)
  cloud "AWS Region B (us-west-2)" {
    style = cloud_region;
    network "AWS VPC B (10.2.0.0/16)" {
      description = "Terraform-Managed"
      router_aws_b [style = device_router];
      network "Public Subnet (10.2.1.0/24)" {
        lb_web_b [style = service_lb];
      }
      network "Private Subnet (10.2.2.0/24)" {
        app_server_b [style = device_server];
      }
    }
  }

  // On-Premises Data Center
  group "On-Premises Data Center" {
    style = on_prem_dc;
    network "Core Network (192.168.10.0/24)" {
      description = "Managed by Terraform/Ansible"
      core_router [style = device_router, address="192.168.10.1"];
      core_switch [shape = switch];
      db_server [style = device_server, address="192.168.10.10"];
    }
    network "DMZ (172.16.0.0/24)" {
      fw_onprem [style = device_firewall, address="172.16.0.1"];
      web_proxy [style = device_server, address="172.16.0.10"];
    }
  }

  // Interconnects
  internet "Internet" {
    style = internet_cloud;
  }

  // Cloud Interconnections
  router_aws_a -- router_aws_b [label="VPC Peering / Transit Gateway"];
  fw_onprem -- router_aws_a [label="Direct Connect / VPN Tunnel"];

  // Internet Connectivity
  lb_web_a -- internet;
  lb_web_b -- internet;
  web_proxy -- internet;

  // Internal On-Prem Connectivity
  core_router -- fw_onprem;
  core_router -- core_switch;
  core_switch -- db_server;
}

Figure 7.4: Multi-Region Hybrid Cloud Network Topology

7.4.2 Protocol Flow (graphviz)

Illustrating the Terraform plan and apply lifecycle.

digraph Terraform_Lifecycle {
    rankdir=LR;
    node [shape=box, style=filled, fillcolor=lightblue];
    edge [color=gray, arrowhead=vee];

    start [label="Start (terraform init)", shape=ellipse, fillcolor=lightgreen];
    config [label="HCL Configuration (.tf files)"];
    provider_plugins [label="Provider Plugins", shape=cylinder, fillcolor=lightgray];
    state_file [label="Terraform State (.tfstate)", shape=Mrecord, fillcolor=lightyellow];
    current_infra [label="Current Infrastructure (APIs)", shape=cylinder, fillcolor=lightgray];
    plan [label="terraform plan", shape=box, fillcolor=orange];
    diff [label="Compare (HCL vs State vs Actual)"];
    execution_plan [label="Execution Plan (Proposed Changes)", fillcolor=lightpink];
    review [label="Review Plan & Approve", shape=hexagon, fillcolor=cyan];
    apply [label="terraform apply", shape=box, fillcolor=orange];
    provider_action [label="Provider Actions (API Calls)", fillcolor=lightgray];
    updated_infra [label="Updated Infrastructure", shape=cylinder, fillcolor=lightgreen];
    update_state [label="Update State File"];
    end [label="End (Infrastructure Deployed/Modified)", shape=ellipse, fillcolor=lightgreen];
    destroy [label="terraform destroy", shape=box, fillcolor=red];
    destroy_action [label="Provider Actions (Delete API Calls)", fillcolor=red];
    destroyed_infra [label="Destroyed Infrastructure", shape=cylinder, fillcolor=darkgray];
    clear_state [label="Clear State File"];


    start -> config;
    config -> provider_plugins;
    provider_plugins -> plan;
    plan -> state_file [label="Read State"];
    plan -> current_infra [label="Read Current State via APIs"];
    plan -> diff;
    diff -> execution_plan;
    execution_plan -> review;
    review -> apply [label="Approve"];
    review -> destroy [label="Decide to Destroy (Alternative)"];

    apply -> provider_action [label="Execute Changes via APIs"];
    provider_action -> updated_infra;
    updated_infra -> update_state;
    update_state -> state_file;
    update_state -> end;

    destroy -> destroy_action;
    destroy_action -> destroyed_infra;
    destroyed_infra -> clear_state;
    clear_state -> state_file;
    clear_state -> end;
}

Figure 7.5: Terraform Plan and Apply Lifecycle

7.4.3 Architecture (PlantUML)

High-level architecture depicting a NetDevOps pipeline utilizing Terraform for both network and cloud provisioning.

@startuml
skinparam handwritten true
skinparam style strictuml

rectangle "Network Engineering Team" as Team
cloud "Version Control System (GitLab/GitHub)" as VCS
rectangle "CI/CD Pipeline (Jenkins/GitLab CI/GitHub Actions)" as CI_CD
cloud "Terraform Cloud/Enterprise" as TF_Cloud_RemoteState
rectangle "Terraform Workspace" as TF_Workspace

package "Network Automation Layer" {
  component "Terraform CLI" as TF_CLI
  component "Terraform Providers" as TF_Providers
}

package "Cloud Infrastructure" {
  cloud "AWS VPCs & Services" as AWS
  cloud "Azure VNets & Services" as Azure
  cloud "GCP VPCs & Services" as GCP
}

package "On-Prem Network Infrastructure" {
  node "Cisco IOS XE Devices" as Cisco
  node "Juniper JunOS Devices" as Juniper
  node "Arista EOS Devices" as Arista
}

Team --> VCS : Pushes HCL Code
VCS --> CI_CD : Trigger Pipeline (Webhook)
CI_CD --> TF_Workspace : Checkout Code
CI_CD --> TF_CLI : Executes 'terraform plan/apply'
TF_CLI <--> TF_Cloud_RemoteState : Remote State Backend & Locking
TF_CLI --> TF_Providers : Delegates API Calls

TF_Providers --> AWS : Provision/Manage Cloud Resources
TF_Providers --> Azure : Provision/Manage Cloud Resources
TF_Providers --> GCP : Provision/Manage Cloud Resources

TF_Providers --> Cisco : Config via NETCONF/RESTCONF/API
TF_Providers --> Juniper : Config via NETCONF/RESTCONF/API
TF_Providers --> Arista : Config via EOS API/NETCONF

VCS ..> TF_Cloud_RemoteState : Store TF Modules

@enduml

Figure 7.6: NetDevOps Architecture with Terraform for Hybrid Infrastructure

7.5 Automation Examples (Terraform HCL)

These examples focus on the Terraform HCL configurations themselves, as Terraform is primarily an IaC automation tool.

7.5.1 Modular AWS VPC with Subnets and Route Tables

This demonstrates a more robust, modular approach for AWS networking, creating a reusable VPC module.

Module Definition (modules/vpc/main.tf):

# modules/vpc/main.tf

variable "region" {
  description = "AWS region for the VPC"
  type        = string
}

variable "vpc_cidr" {
  description = "CIDR block for the VPC"
  type        = string
}

variable "public_subnet_cidrs" {
  description = "List of CIDR blocks for public subnets"
  type        = list(string)
}

variable "private_subnet_cidrs" {
  description = "List of CIDR blocks for private subnets"
  type        = list(string)
}

variable "vpc_name" {
  description = "Name tag for the VPC"
  type        = string
  default     = "terraform-managed-vpc"
}

# AWS VPC
resource "aws_vpc" "main" {
  cidr_block           = var.vpc_cidr
  enable_dns_support   = true
  enable_dns_hostnames = true
  tags = {
    Name        = var.vpc_name
    ManagedBy   = "Terraform-Module"
  }
}

# Internet Gateway (for public subnets to access internet)
resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id
  tags = {
    Name = "${var.vpc_name}-igw"
  }
}

# Public Subnets
resource "aws_subnet" "public" {
  count             = length(var.public_subnet_cidrs)
  vpc_id            = aws_vpc.main.id
  cidr_block        = var.public_subnet_cidrs[count.index]
  availability_zone = "${var.region}${element(["a", "b", "c"], count.index)}"
  map_public_ip_on_launch = true
  tags = {
    Name        = "${var.vpc_name}-public-subnet-${count.index}"
    ManagedBy   = "Terraform-Module"
  }
}

# Private Subnets
resource "aws_subnet" "private" {
  count             = length(var.private_subnet_cidrs)
  vpc_id            = aws_vpc.main.id
  cidr_block        = var.private_subnet_cidrs[count.index]
  availability_zone = "${var.region}${element(["a", "b", "c"], count.index)}"
  tags = {
    Name        = "${var.vpc_name}-private-subnet-${count.index}"
    ManagedBy   = "Terraform-Module"
  }
}

# Public Route Table
resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id
  tags = {
    Name = "${var.vpc_name}-public-rt"
  }
}

resource "aws_route" "public_internet_gateway" {
  route_table_id         = aws_route_table.public.id
  destination_cidr_block = "0.0.0.0/0"
  gateway_id             = aws_internet_gateway.main.id
}

# Associate public subnets with public route table
resource "aws_route_table_association" "public" {
  count          = length(var.public_subnet_cidrs)
  subnet_id      = aws_subnet.public[count.index].id
  route_table_id = aws_route_table.public.id
}

# Output values
output "vpc_id" {
  value = aws_vpc.main.id
}

output "public_subnet_ids" {
  value = [for s in aws_subnet.public : s.id]
}

output "private_subnet_ids" {
  value = [for s in aws_subnet.private : s.id]
}

Root Configuration (main.tf in parent directory):

# main.tf in root directory

provider "aws" {
  region = "us-east-1"
}

module "prod_vpc" {
  source = "./modules/vpc" # Path to your VPC module
  region = "us-east-1"
  vpc_cidr = "10.10.0.0/16"
  public_subnet_cidrs = ["10.10.1.0/24", "10.10.2.0/24"]
  private_subnet_cidrs = ["10.10.10.0/24", "10.10.11.0/24"]
  vpc_name = "production-vpc"
}

module "dev_vpc" {
  source = "./modules/vpc" # Reuse the same module for dev environment
  region = "us-east-1"
  vpc_cidr = "10.20.0.0/16"
  public_subnet_cidrs = ["10.20.1.0/24"]
  private_subnet_cidrs = ["10.20.10.0/24"]
  vpc_name = "development-vpc"
}

output "prod_vpc_id" {
  value = module.prod_vpc.vpc_id
}

output "dev_vpc_id" {
  value = module.dev_vpc.vpc_id
}

This modular structure allows you to instantiate multiple VPCs with consistent patterns, drastically reducing code duplication and maintenance.

7.6 Security Considerations

Security is paramount in IaC, especially when managing network infrastructure.

7.6.1 State File Security

  • Sensitive Data: Terraform state files often contain sensitive information (IP addresses, network topology details, sometimes even plaintext passwords if not handled carefully).
  • Remote Backend: Always use a remote backend (e.g., AWS S3 with encryption, Azure Blob Storage, HashiCorp Consul, Terraform Cloud/Enterprise) for state storage. This centralizes state, enables locking, and provides access control.
  • Encryption: Ensure the remote backend is configured for encryption at rest (e.g., S3 server-side encryption).
  • Access Control: Implement strict Access Control Lists (ACLs) or IAM policies for who can read/write to the state file. Follow the principle of least privilege.
  • Never Commit to Git: The terraform.tfstate file should never be committed to version control directly. Add it to .gitignore.
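
The "never commit" rule is easy to enforce mechanically. A minimal .gitignore for a Terraform repository might look like this:

```gitignore
# Local state files and backups -- never commit these
terraform.tfstate
terraform.tfstate.backup
*.tfstate*

# Provider binaries and module cache
.terraform/

# Variable files that may contain secrets
*.tfvars
*.auto.tfvars
```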

7.6.2 Credentials and Secrets Management

  • Avoid Hardcoding: Never hardcode API keys, usernames, passwords, or tokens directly in your HCL code.
  • Environment Variables: Use environment variables for sensitive provider authentication (e.g., AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY for the AWS provider, or TF_VAR_-prefixed variables such as TF_VAR_netconf_password to populate sensitive Terraform input variables).
  • Secrets Management Tools: Integrate with dedicated secrets management solutions like HashiCorp Vault, AWS Secrets Manager, Azure Key Vault, or GCP Secret Manager. These tools securely store and provide access to credentials at runtime.
  • Terraform CLI var files: For non-sensitive but environment-specific variables, use terraform.tfvars. For sensitive variables, use a *.auto.tfvars file that is not committed to Git, or better yet, inject them from a secrets manager in a CI/CD pipeline.
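
As an illustrative sketch of the secrets-manager approach (the secret name and its JSON keys are hypothetical), credentials can be fetched at plan/apply time instead of living in HCL or tfvars files:

```hcl
# Hypothetical example: pull device credentials from AWS Secrets Manager.
data "aws_secretsmanager_secret_version" "iosxe" {
  secret_id = "netdevops/iosxe-credentials" # placeholder secret name
}

locals {
  # The secret is assumed to be stored as JSON: {"username": "...", "password": "..."}
  iosxe_creds = jsondecode(data.aws_secretsmanager_secret_version.iosxe.secret_string)
}

# The values can then be passed on to a provider or module, e.g.:
#   username = local.iosxe_creds.username
#   password = local.iosxe_creds.password
```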

7.6.3 Access Control and RBAC

  • Least Privilege: Configure IAM/RBAC roles for Terraform execution with the minimum necessary permissions to provision and manage resources. For example, a role might only have permissions to create/modify VPCs and subnets, but not delete critical production databases.
  • Separate Environments: Use separate cloud accounts, IAM roles, or Terraform workspaces/projects for different environments (dev, test, prod) to prevent accidental cross-contamination or unauthorized access.

7.6.4 Drift Detection and Remediation

  • Configuration Drift: This occurs when the actual state of infrastructure deviates from the desired state defined in HCL. This can happen due to manual changes, out-of-band configurations, or other automation tools.
  • Regular terraform plan: Periodically run terraform plan in a read-only mode (e.g., in a CI/CD pipeline) to detect drift.
  • Automated Remediation (Caution): While terraform apply can remediate drift, automating this can be risky if changes are not reviewed. Consider manual approval for remediation or use drift detection for alerting rather than immediate auto-remediation.
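
A drift-detection gate for CI can build on terraform plan's -detailed-exitcode flag, which distinguishes "no changes" from "changes pending". A minimal sketch (assumes the terraform binary and provider credentials are available to the runner):

```shell
#!/bin/sh
# Exit codes with -detailed-exitcode:
#   0 = no changes (no drift), 1 = error, 2 = changes pending (possible drift)
terraform plan -detailed-exitcode -input=false -out=drift.tfplan
rc=$?

if [ "$rc" -eq 2 ]; then
  echo "Drift or pending changes detected -- review drift.tfplan before remediating"
  exit 1
elif [ "$rc" -ne 0 ]; then
  echo "terraform plan failed"
  exit "$rc"
fi
echo "No drift detected"
```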

7.6.5 Supply Chain Security (Providers and Modules)

  • Trusted Sources: Only use Terraform providers and modules from trusted sources (HashiCorp Registry, verified partners, or internal repositories).
  • Versioning: Pin provider and module versions (required_providers and module blocks) to ensure consistent behavior and prevent unexpected changes from new releases.
  • Security Scanning: Incorporate static analysis tools (e.g., Checkov, Trivy, tfsec) into your CI/CD pipeline to scan HCL code for security misconfigurations and compliance violations before deployment.
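
Version pinning from the bullet above looks like this in practice (the registry module shown is illustrative):

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # allow 5.x updates, block a breaking 6.0 jump
    }
  }
}

module "vpc" {
  source  = "terraform-aws-modules/vpc/aws" # example registry module
  version = "~> 5.1"                        # pin module versions too
  # ... module inputs ...
}
```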

7.6.6 Example: Protecting Sensitive Outputs

# main.tf (excerpt) - Sensitive Output Example

# Assume you have a resource that generates a sensitive value, e.g., a shared secret for a VPN
resource "random_string" "vpn_secret" {
  length  = 32
  special = true
  numeric = true
  upper   = true
  lower   = true
}

output "vpn_shared_secret" {
  value       = random_string.vpn_secret.result
  description = "The shared secret for the VPN connection."
  sensitive   = true # Mark this output as sensitive
}

When an output is marked sensitive = true, Terraform redacts its value in CLI output, showing (sensitive value) instead. Note, however, that the raw value is still stored in plaintext in the state file, which is yet another reason to encrypt and restrict access to your state backend.
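
In practice (assuming the configuration above has been applied):

```shell
# In the full output listing, sensitive values are redacted:
terraform output
#   vpn_shared_secret = <sensitive>

# Reveal the raw value only when genuinely needed, e.g. in a pipeline step:
terraform output -raw vpn_shared_secret
```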

7.7 Verification & Troubleshooting

Terraform provides robust tools for verification and offers clear error messages, but understanding common issues helps.

7.7.1 Verification Commands

Terraform’s primary verification comes from its plan and apply outputs.

  • terraform init: Initializes your working directory, downloading necessary providers and modules.
    • Verification: Successful download of providers, creation of .terraform directory.
    • Troubleshooting: Network connectivity issues, incorrect provider source in HCL, provider registry unreachable.
  • terraform validate: Checks the HCL code for syntax errors and internal consistency. It does not access any remote services.
    • Verification: Success! The configuration is valid.
    • Troubleshooting: Syntax errors in HCL (missing brackets, typos), incorrect variable declarations.
  • terraform plan: Generates an execution plan, showing what actions Terraform will take (create, update, destroy) to reach the desired state defined in your HCL. It queries the actual state of the infrastructure.
    • Verification: Review the proposed changes carefully. Ensure it matches your expectations. Look for N to add, N to change, N to destroy.
    • Troubleshooting:
      • Unexpected changes: Often due to drift (manual changes) or incorrect logic in your HCL.
      • Provider errors: If Terraform can’t connect to the API or authenticate, plan will fail.
      • Resource not found: If data sources try to query non-existent resources.
  • terraform apply: Executes the actions proposed in the plan. Requires explicit approval.
    • Verification: Successful completion of resource provisioning. The output shows Apply complete! Resources: N added, N changed, N destroyed.
    • Troubleshooting:
      • API errors: Permissions issues, invalid parameters passed to the device/cloud API, rate limiting.
      • State conflicts: If multiple users apply changes concurrently without state locking.
      • Timeout errors: If resource creation takes longer than the provider’s configured timeout.
  • terraform show: Reads the current state file and prints the resource attributes that Terraform knows about.
    • Verification: Check if the deployed resources match your configuration and if sensitive data is redacted.
  • terraform output <output_name>: Displays the value of a specific output variable.
    • Verification: Quickly check values of key resources (e.g., VPC ID, interface IP).
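
Chained together, a typical verification workflow looks like this (a sketch; assumes provider credentials are already configured):

```shell
terraform init                # download providers and modules
terraform validate            # offline syntax and consistency check
terraform plan -out=tfplan    # save the reviewed plan to a file
terraform apply tfplan        # apply exactly the plan that was reviewed
terraform output              # inspect output values of key resources
```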

7.7.2 Common Issues and Resolution Steps

  • HCL Syntax Errors
    Symptoms: Error: Missing argument, Error: Invalid block definition, Unterminated string literal.
    Resolution: terraform validate is your first line of defense. Pay close attention to line numbers in error messages. Use a good IDE with HCL syntax highlighting.
  • Provider Issues
    Symptoms: Error: provider.aws: NoCredentialProviders (AWS), Error: dial tcp: i/o timeout (NETCONF), Error: Failed to refresh state.
    Resolution: Authentication: double-check credentials (environment variables, ~/.aws/credentials, ~/.netconf.yml). Connectivity: ping the target host, check the port, review firewall rules. Provider version: ensure required_providers in versions.tf is correct and run terraform init -upgrade.
  • State File Corruption
    Symptoms: Error: Failed to load state, or a resource exists in state but not in the configuration.
    Resolution: NEVER manually edit terraform.tfstate directly. Use the terraform state subcommands (terraform state rm, terraform state mv) and terraform import to reconcile. In extreme cases, restore from a remote backend backup.
  • Configuration Drift
    Symptoms: terraform plan shows unexpected changes to existing resources.
    Resolution: Review the plan output carefully and determine whether the drift was intentional (a manual change) or not. If unintentional, terraform apply will revert it. If intentional, update the HCL, or use terraform taint to force recreation.
  • Resource Creation Fails
    Symptoms: Error: Error creating EC2 instance: UnauthorizedOperation, Error: VLAN ID 100 already exists.
    Resolution: Permissions: verify IAM/RBAC policies for the user/role running Terraform. API errors: the provider error message usually contains the underlying API error (e.g., duplicate resource name, invalid value); consult the vendor API documentation.
  • Timeout Errors
    Symptoms: Error: timeout while waiting for state to become 'running'.
    Resolution: Increase timeout settings in the resource block (if supported by the provider) or the provider configuration. This often happens with network device reboots or slow cloud resource provisioning.
  • Dependency Issues
    Symptoms: Resources fail to provision because a dependency isn't ready (e.g., a route table without a VPC).
    Resolution: Terraform usually manages dependencies implicitly through resource references. If explicit ordering is needed, use depends_on = [resource.type.name]. Ensure your network architecture is logically sound.
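
For the state-file scenarios above, the relevant commands look like this (the resource addresses and VPC ID are placeholders):

```shell
terraform state list                            # enumerate resources tracked in state
terraform state rm 'aws_subnet.public[0]'       # forget a resource without destroying it
terraform state mv aws_vpc.old aws_vpc.main     # move/rename a resource address
terraform import aws_vpc.main vpc-0abc1234      # adopt an existing resource into state
```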

7.7.3 Root Cause Analysis for Network-Specific Issues

  • NETCONF/RESTCONF Specific:
    • Incorrect XML/JSON Payload: Terraform might send well-formed XML/JSON to the device, yet the content can be semantically invalid against the YANG model. Enable Terraform's debug logging (e.g., TF_LOG=DEBUG terraform plan) to inspect the payload the provider actually sends, and validate it against the YANG schema using tools like Cisco YANG Suite.
    • Device Capability Mismatch: Ensure the device supports the specific YANG model or NETCONF operations being attempted. Use show netconf-yang capabilities (Cisco) or show netconf capabilities (Juniper) on the device.
    • Firewall between Terraform and Device: Ensure TCP port 830 (NETCONF over SSH) or 443/80 (RESTCONF over HTTPS/HTTP) is open.
  • Cloud Network Specific:
    • CIDR Overlaps: Terraform will often detect and fail on CIDR block overlaps when creating VPCs/VNets or subnets.
    • Availability Zone (AZ) Capacity: Occasionally, an AZ may lack capacity for a specific resource type, leading to provisioning failures. Terraform retries can sometimes overcome this, or you may need to adjust your AZ strategy.
    • Routing Conflicts: Incorrect route table configurations can cause provisioning failures or lead to unreachability post-deployment.
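
The reachability checks mentioned above can be scripted quickly from the Terraform workstation (host, ports, and credentials below match this chapter's lab; adjust as needed):

```shell
# Is the NETCONF-over-SSH port open?
nc -zv 192.168.1.100 830

# Is RESTCONF reachable? (-k skips certificate verification for lab self-signed certs)
curl -k -u terraform:terraform_password \
  -H "Accept: application/yang-data+json" \
  https://192.168.1.100/restconf/data/ietf-interfaces:interfaces
```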

7.8 Performance Optimization

Optimizing Terraform deployments in complex network environments involves structuring your code and leveraging Terraform features effectively.

7.8.1 Modularization and Granularity

  • Small, Focused Modules: Break down large configurations into smaller, reusable modules. For instance, an AWS module for VPCs, another for security groups, and a third for EC2 instances. For network devices, modules could represent a standardized device configuration, a VLAN stack, or a routing protocol setup.
  • Reduced Scope: Smaller modules lead to faster terraform plan and terraform apply times, as Terraform only needs to process a subset of resources.
  • Parallel Execution: Terraform can often parallelize resource creation/modification. Well-defined, independent modules can benefit from this.

7.8.2 Remote State Management

  • State Locking: Essential for collaborative environments. Remote backends (e.g., S3, Consul, Terraform Cloud) provide state locking to prevent multiple terraform apply operations from conflicting.
  • Encryption: Store state files encrypted at rest.
  • Versioning: Utilize state backend versioning (e.g., S3 versioning) for easy rollbacks and audit trails.

7.8.3 Provider Configuration

  • Connection Pooling (where available): Some network device providers might support connection pooling for NETCONF/RESTCONF sessions, reducing overhead for multiple configuration changes.
  • Batching API Calls: Advanced providers may batch multiple changes into a single API call, improving performance, especially over high-latency links.
  • Rate Limiting: Be aware of API rate limits imposed by cloud providers or network devices. Terraform providers often have built-in retry mechanisms, but excessive resource creation can still hit limits.

7.8.4 CI/CD Integration

  • Automated plan on PR: Run terraform plan automatically on every pull request to get quick feedback on proposed changes and detect drift.
  • Targeted Applies: For large configurations, use terraform apply -target=resource.type.name to apply changes only to specific resources. Use this with caution as it can break implicit dependencies and lead to partial deployments. It’s generally better to let Terraform manage the entire graph.
  • Terraform Cloud/Enterprise: Leverage these platforms for managed remote state, shared module registries, policy enforcement (Sentinel), and streamlined CI/CD integration, which significantly boosts team productivity and performance.

7.8.5 Resource count and for_each

  • Efficiently provision multiple similar resources using count or for_each meta-arguments. This reduces HCL boilerplate and simplifies managing large numbers of identical network segments, interfaces, or security groups.
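
As an illustrative sketch of for_each (the names and CIDRs are hypothetical, and aws_vpc.main is assumed to exist elsewhere), several subnets can be provisioned from one map:

```hcl
variable "subnets" {
  type = map(string)
  default = {
    app  = "10.0.1.0/24"
    data = "10.0.2.0/24"
    mgmt = "10.0.3.0/24"
  }
}

resource "aws_subnet" "this" {
  for_each   = var.subnets       # one subnet instance per map entry
  vpc_id     = aws_vpc.main.id
  cidr_block = each.value
  tags = {
    Name = "subnet-${each.key}"  # addressable as aws_subnet.this["app"], etc.
  }
}
```

Unlike count, for_each keys instances by name rather than by position, so removing one entry from the map does not shift and recreate the others.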

7.9 Hands-On Lab: Hybrid Network Provisioning

This lab will guide you through provisioning a simple hybrid network using Terraform. It will involve:

  1. Creating a cloud VPC with subnets (AWS).
  2. Configuring a loopback interface on a Cisco IOS XE device.
  3. Connecting the two logically (conceptual, as physical VPN setup is complex for a simple lab).

Lab Topology

nwdiag {
  // Internet reachability for the public subnet
  internet [shape = cloud];
  internet -- aws_router;

  // AWS Cloud (us-east-1): Terraform-managed NetDevOps VPC (10.0.0.0/16)
  network public_subnet {
    address = "10.0.1.0/24";
    aws_router [shape = router];  // logical representation of AWS routing
    aws_public_instance [address = "10.0.1.10"];
  }

  network private_subnet {
    address = "10.0.2.0/24";
    aws_router;
    aws_private_instance [address = "10.0.2.10"];
  }

  // On-premises lab network
  network on_prem {
    address = "192.168.1.0/24";
    cisco_iosxe [address = "192.168.1.100", shape = router];
  }

  // Conceptual VPN/Direct Connect link (physical VPN setup is out of scope for this lab)
  cisco_iosxe -- aws_router;
}

Figure 7.7: Hands-On Lab Hybrid Network Topology

7.9.1 Objectives

  • Set up Terraform for AWS and a local Cisco IOS XE device.
  • Provision an AWS VPC, subnets, and an Internet Gateway.
  • Configure a loopback interface and a VLAN on the Cisco IOS XE device.
  • Output key IDs for verification.

7.9.2 Prerequisites

  • AWS Account: With credentials configured (e.g., ~/.aws/credentials).
  • Cisco IOS XE Device:
    • Running on a platform like VIRL/CML, EVE-NG, or a physical device.
    • SSH accessible from your Terraform workstation.
    • NETCONF/RESTCONF enabled. User terraform with password terraform_password and privilege 15.
    • IP address: 192.168.1.100 (adjust if needed).
  • Terraform CLI: Installed on your workstation.

7.9.3 Step-by-Step Configuration

Step 1: Create Lab Directory Structure

mkdir netdevops-terraform-lab
cd netdevops-terraform-lab
touch main.tf versions.tf terraform.tfvars
mkdir modules
mkdir modules/cisco_iosxe
touch modules/cisco_iosxe/main.tf modules/cisco_iosxe/variables.tf modules/cisco_iosxe/outputs.tf

Step 2: Define Terraform Providers and Backend (versions.tf)

# versions.tf
terraform {
  required_version = ">= 1.0.0" # Ensure compatibility

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
    netconf = {
      source  = "netconf-ng/netconf"
      version = "~> 1.0"
    }
  }
  # Example remote backend (uncomment and configure for production)
  /*
  backend "s3" {
    bucket         = "your-terraform-state-bucket"
    key            = "netdevops-lab/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-state-lock" # For state locking
  }
  */
}

Step 3: Define Variables (terraform.tfvars)

# terraform.tfvars (Sensitive data should NOT be committed to VCS!)
# For lab purposes, you might keep these here, but for production, use environment variables or a secrets manager.

aws_region           = "us-east-1"
aws_vpc_cidr         = "10.0.0.0/16"
aws_public_subnet_cidr  = "10.0.1.0/24"
aws_private_subnet_cidr = "10.0.2.0/24"

cisco_iosxe_host     = "192.168.1.100"
cisco_iosxe_username = "terraform"
cisco_iosxe_password = "terraform_password" # Change in production!

Step 4: AWS Configuration (main.tf - root directory)

# main.tf (root directory)

variable "aws_region" {
  description = "AWS region for the VPC"
  type        = string
}

variable "aws_vpc_cidr" {
  description = "CIDR block for the AWS VPC"
  type        = string
}

variable "aws_public_subnet_cidr" {
  description = "CIDR block for the AWS public subnet"
  type        = string
}

variable "aws_private_subnet_cidr" {
  description = "CIDR block for the AWS private subnet"
  type        = string
}

provider "aws" {
  region = var.aws_region
}

# AWS VPC
resource "aws_vpc" "netdevops_vpc" {
  cidr_block           = var.aws_vpc_cidr
  enable_dns_support   = true
  enable_dns_hostnames = true
  tags = {
    Name = "NetDevOps-Lab-VPC"
  }
}

# Internet Gateway
resource "aws_internet_gateway" "netdevops_igw" {
  vpc_id = aws_vpc.netdevops_vpc.id
  tags = {
    Name = "NetDevOps-Lab-IGW"
  }
}

# Public Subnet
resource "aws_subnet" "netdevops_public_subnet" {
  vpc_id                  = aws_vpc.netdevops_vpc.id
  cidr_block              = var.aws_public_subnet_cidr
  availability_zone       = "${var.aws_region}a"
  map_public_ip_on_launch = true
  tags = {
    Name = "NetDevOps-Lab-Public-Subnet"
  }
}

# Private Subnet
resource "aws_subnet" "netdevops_private_subnet" {
  vpc_id                  = aws_vpc.netdevops_vpc.id
  cidr_block              = var.aws_private_subnet_cidr
  availability_zone       = "${var.aws_region}a"
  tags = {
    Name = "NetDevOps-Lab-Private-Subnet"
  }
}

# Public Route Table
resource "aws_route_table" "netdevops_public_rt" {
  vpc_id = aws_vpc.netdevops_vpc.id
  tags = {
    Name = "NetDevOps-Lab-Public-RT"
  }
}

resource "aws_route" "public_internet_route" {
  route_table_id         = aws_route_table.netdevops_public_rt.id
  destination_cidr_block = "0.0.0.0/0"
  gateway_id             = aws_internet_gateway.netdevops_igw.id
}

resource "aws_route_table_association" "public_subnet_association" {
  subnet_id      = aws_subnet.netdevops_public_subnet.id
  route_table_id = aws_route_table.netdevops_public_rt.id
}

# Output for AWS resources
output "aws_vpc_id" {
  value = aws_vpc.netdevops_vpc.id
  description = "ID of the provisioned AWS VPC."
}
output "aws_public_subnet_id" {
  value = aws_subnet.netdevops_public_subnet.id
  description = "ID of the public subnet."
}
output "aws_private_subnet_id" {
  value = aws_subnet.netdevops_private_subnet.id
  description = "ID of the private subnet."
}

# Call Cisco IOS XE module
module "cisco_network_config" {
  source = "./modules/cisco_iosxe"
  iosxe_host     = var.cisco_iosxe_host
  iosxe_username = var.cisco_iosxe_username
  iosxe_password = var.cisco_iosxe_password
}

Step 5: Cisco IOS XE Module (modules/cisco_iosxe/main.tf)

# modules/cisco_iosxe/main.tf

variable "iosxe_host" {
  description = "IP address of the Cisco IOS XE device"
  type        = string
}

variable "iosxe_username" {
  description = "Username for NETCONF access"
  type        = string
  sensitive   = true
}

variable "iosxe_password" {
  description = "Password for NETCONF access"
  type        = string
  sensitive   = true
}

provider "netconf" {
  host     = var.iosxe_host
  username = var.iosxe_username
  password = var.iosxe_password
  port     = 830
  # skip_verify = true # Uncomment in lab environments to skip host key / certificate verification
}

resource "netconf_device" "iosxe_lab_device" {
  name = "iosxe-lab-device"
  host = var.iosxe_host
}

# Configure a Loopback Interface
resource "netconf_edit_config" "loopback_config" {
  device_name = netconf_device.iosxe_lab_device.name
  target      = "running"
  config_xml  = <<EOF
<config>
  <native xmlns="http://cisco.com/ns/yang/Cisco-IOS-XE-native">
    <interface>
      <Loopback>
        <name>100</name>
        <ip>
          <address>
            <primary>
              <address>192.168.255.1</address>
              <mask>255.255.255.0</mask>
            </primary>
          </address>
        </ip>
        <description>Configured by Terraform NetDevOps Lab</description>
      </Loopback>
    </interface>
  </native>
</config>
EOF
  lifecycle { prevent_destroy = true } # Any plan that would destroy this resource errors out
  commit_comment = "Terraform: Configured Loopback100"
}

# Configure a VLAN
resource "netconf_edit_config" "vlan_config" {
  device_name = netconf_device.iosxe_lab_device.name
  target      = "running"
  config_xml  = <<EOF
<config>
  <native xmlns="http://cisco.com/ns/yang/Cisco-IOS-XE-native">
    <vlan>
      <vlan-list>
        <id>200</id>
        <name>Terraform-Lab-VLAN</name>
      </vlan-list>
    </vlan>
  </native>
</config>
EOF
  lifecycle { prevent_destroy = true }
  commit_comment = "Terraform: Configured VLAN200"
}

output "iosxe_device_name" {
  value = netconf_device.iosxe_lab_device.name
  description = "Name of the Cisco IOS XE device managed by Terraform."
}
output "iosxe_loopback_ip" {
  value = "192.168.255.1"
  description = "IP address of the configured Loopback100 interface."
}

Step 6: Initialize, Plan, and Apply

From the netdevops-terraform-lab root directory:

terraform init
terraform plan
terraform apply

Review the plan output carefully, then type yes to apply.

7.9.4 Verification Steps

AWS Verification:

  1. Log in to the AWS Management Console.
  2. Navigate to VPC service in the us-east-1 region.
  3. Confirm the NetDevOps-Lab-VPC with CIDR 10.0.0.0/16 exists.
  4. Check the Subnets section to see NetDevOps-Lab-Public-Subnet (10.0.1.0/24) and NetDevOps-Lab-Private-Subnet (10.0.2.0/24).
  5. Verify the Internet Gateway and Public Route Table are associated correctly.

Cisco IOS XE Verification:

  1. SSH into your Cisco IOS XE device.
  2. Run the following commands:
    show ip interface brief | include Loopback100
    show running-config interface Loopback100
    show vlan brief | include Terraform-Lab-VLAN
    
  3. Confirm Loopback100 is up with IP 192.168.255.1 and VLAN200 exists with the correct name.

7.9.5 Challenge Exercises

  1. Add a Security Group: Modify the AWS configuration to create a security group allowing SSH (port 22) from your IP and attach it to a hypothetical EC2 instance.
  2. Juniper Integration: Add a junos provider (or use netconf with Juniper XML) to configure a loopback interface on a separate Juniper device.
  3. Variable Inputs: Convert more hardcoded values (e.g., VLAN ID, loopback IP) into variables in the Cisco module.
  4. Destroy: Once satisfied, understand and execute terraform destroy to tear down the provisioned infrastructure.
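
For exercise 4, preview the teardown before executing it. Note that the prevent_destroy lifecycle settings in the Cisco module will cause a full destroy to error out until those settings are removed:

```shell
terraform plan -destroy    # preview exactly what would be removed
terraform destroy          # prompts for confirmation before tearing down
```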

7.10 Best Practices Checklist

Applying these best practices will lead to more maintainable, secure, and efficient Terraform deployments for network infrastructure.

  • Version Control: Store all HCL code in a Git repository.
  • Remote State: Use a remote backend for state management and enable state locking.
  • State Encryption: Ensure your remote state backend encrypts data at rest.
  • .gitignore: Always include terraform.tfstate*, .terraform/, and *.tfvars (for sensitive files) in your .gitignore.
  • Modularize: Organize configurations into small, reusable modules.
  • Variable Inputs: Use variables for all configurable parameters, especially sensitive ones (and mark them sensitive = true for outputs).
  • Secrets Management: Integrate with a secrets manager (Vault, AWS Secrets Manager) for credentials, avoiding hardcoding.
  • Least Privilege: Configure IAM/RBAC roles for Terraform execution with the minimum necessary permissions.
  • Provider Pinning: Explicitly define required provider versions to ensure stability.
  • Meaningful Naming: Use clear, consistent naming conventions for resources and tags.
  • Documentation: Add comments to HCL code and provide READMEs for modules.
  • Continuous Integration: Integrate terraform plan into your CI pipeline for every pull request.
  • Dry Runs: Always perform terraform plan before terraform apply.
  • Small, Incremental Changes: Avoid large, sweeping changes in a single terraform apply.
  • Destroy Awareness: Understand the impact of terraform destroy and use prevent_destroy for critical resources.
  • Drift Detection: Regularly run terraform plan to identify and manage configuration drift.
  • Automated Testing: Implement validation tests (e.g., using Terratest or pytest for HCL) to verify deployed infrastructure.
  • Tagging: Use consistent tagging for cloud resources for cost allocation, automation, and management.

7.11 What’s Next

This chapter provided a comprehensive deep dive into leveraging Terraform for Infrastructure as Code in network engineering. You’ve learned about its declarative nature, core components, and how to apply it to both traditional network devices and cloud networking. The ability to define, provision, and manage your network infrastructure in code is a fundamental skill in modern NetDevOps.

In the next chapter, we will shift our focus to “Continuous Integration and Continuous Delivery (CI/CD) for Network Automation.” We will explore how to integrate the Ansible playbooks, Python scripts, and Terraform configurations you’ve learned to build robust, automated pipelines that ensure rapid, reliable, and consistent deployments across your network infrastructure. This includes setting up Git workflows, automated testing, and deployment strategies that truly embody the NetDevOps ethos.