Objective
For a long time, I have wanted to automate the physical part of my lab. My goal was to be able to wipe the devices and then bootstrap them with different configs to match the solution I am researching, all without manual intervention. Even surface-level thinking about this process made me realize there is one very large circular dependency in the workflow: if your automation tool wipes the network device and reloads it, the tool is essentially cutting off its own arms, because the management address and credentials it relies on are erased along with everything else. One common solution is to give your automation tools access to the raw console lines of the devices, but I prefer a strictly Ethernet solution. Using the built-in management uplink reflects a more real-world approach to networking and also frees up the console port.
The goal for this solution was to be able to return my Cisco Software Defined Access (SDA) lab rack to a bootstrapped state after each night in the lab. That way, I could make a bunch of changes, break the fabric, even sever the connection to DNA Center, and still be able to re-bootstrap the devices at the end of the night. The next time I lab, I can just readopt them into DNA Center without doing any manual configuration on the switches.
My lab rack is pretty simple and consists of the following hardware:
- 1x Cisco 3650 (SDA Fusion Node)
- 6x Cisco 3850 (SDA Borders & Control Plane Nodes)
All of these devices have one key feature that makes this possible: an out-of-band management port in its own VRF. In the back of the rack, each management port is cabled to a desktop switch with a Raspberry Pi connected. The Raspberry Pi runs Ansible and a TFTP server. The wired out-of-band network is completely self-contained; traffic never leaves the L2 domain between the switches' management ports and the Pi. To access the Pi itself, I use its built-in wireless.
Now that we’ve laid the foundation, we need a way to wipe the switches but retain management connectivity. A simple “write erase” won’t do the trick here, because it also removes the management IP address and the credentials Ansible logs in with. The initial manual bootstrap of a device when it is first installed in the rack is generally unavoidable, although experimenting with Plug and Play (PnP) might be an option in the future. The objective is to eliminate the need to manually reconnect a device to Ansible after it is reset at the end of a lab session.
The key to maintaining the connection to Ansible is that when I “wipe” my devices, I am not actually wiping them. Instead, Ansible compiles a minimal configuration file and places it in the TFTP directory on the Pi. Ansible then connects to the switch over SSH and overwrites the startup-config with the configuration it just compiled. This fulfills the objective of “wiping” without losing the switch’s connection to Ansible.
Execution
A TFTP server is required to use the playbook below; a stock install of tftpd-hpa will suffice.
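If you would rather have Ansible prepare the TFTP side as well, something along these lines could work. This is only a sketch, not part of my rack's playbooks: it assumes the Pi runs a Debian-based OS, that Ansible is run on the Pi itself, and that tftpd-hpa serves the same /srv/tftp directory the wipe playbook writes to.

```yaml
---
# Sketch only: prepares the TFTP server on the Ansible host (the Pi).
# Assumes a Debian-based OS and the /srv/tftp root used by wipe.yaml.
- name: Prepare TFTP server on the Ansible host
  hosts: localhost
  connection: local
  become: true
  gather_facts: false
  tasks:
    - name: Install tftpd-hpa
      ansible.builtin.apt:
        name: tftpd-hpa
        state: present
        update_cache: true

    - name: Ensure the TFTP root exists and is readable
      ansible.builtin.file:
        path: /srv/tftp
        state: directory
        mode: "0755"

    - name: Ensure tftpd-hpa is running and starts at boot
      ansible.builtin.service:
        name: tftpd-hpa
        state: started
        enabled: true
```

If your distribution's tftpd-hpa package serves a different directory, adjust the path here and in the wipe playbook so the two match.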
Below is an example of a “wipe.yaml” playbook that I execute on my rack after completing lab work:
---
- name: SDA Rack Wipe + Retain Management
  hosts: fabric_devices
  gather_facts: false
  connection: network_cli
  tasks:
    - name: 1.0 | Compile Startup Config File
      ansible.builtin.copy:
        dest: "/srv/tftp/{{ hostname }}"
        content: |
          !
          username ansible privilege 15 password 0 cisco
          !
          enable password cisco
          !
          ip tftp source-interface g0/0
          !
          ip domain-name ansible-configured.local
          !
          hostname {{ hostname }}
          !
          ip ssh version 2
          !
          interface GigabitEthernet0/0
           description ANSIBLE MANAGEMENT INTERFACE - DO NOT MODIFY
           ip address {{ inventory_hostname }} 255.255.255.0
          !
          line vty 0 4
           privilege level 15
           login local
           transport input all
      # Write the file on the Ansible host (the Pi), not through the switch CLI
      delegate_to: localhost

    - name: 2.0 | Set File Prompt
      cisco.ios.ios_config:
        lines:
          - file prompt quiet

    - name: 3.0 | Replace Current Configuration
      cisco.ios.ios_command:
        commands:
          - command: copy tftp://10.254.254.100/{{ hostname }} nvram:/startup-config

    - name: 4.0 | Reload into new configuration
      cisco.ios.ios_command:
        commands:
          - command: reload in 1
Sample Inventory File:
[fabric_devices]
10.254.254.11 hostname=CCIE-BLDG1-SW1
[fabric_devices:vars]
ansible_network_os=ios
ansible_user='ansible'
ansible_ssh_pass='cisco'
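In this demo, 10.254.254.0/24 is the out-of-band network, with .100 being the Ansible host (the Pi). Each switch is listed by its management IP address, which the playbook reads as “inventory_hostname”, and the “hostname” host variable supplies both the device hostname and the name of its file on the TFTP server.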
The playbook runs against a group of hosts named “fabric_devices”. It first compiles a startup config file from the host variables in the inventory file and places it in the TFTP root. Next, it sets “file prompt quiet” on the switch so that the copy runs without stopping for confirmation. The new file is then copied via TFTP directly over the startup-config in NVRAM in a single step. Finally, the playbook schedules a reload so the switch boots into the new configuration.
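Because the reload is only scheduled, the playbook finishes before the switches have actually gone down and come back. A follow-up play along these lines could confirm that each switch returned with the compiled config. Treat it as a sketch under assumptions rather than something from my rack: the delay and timeout values are guesses to tune for your hardware, and the hostname check simply reuses the “hostname” variable from the inventory.

```yaml
---
# Sketch only: confirm each switch is reachable again after "reload in 1".
- name: Wait for wiped switches to return
  hosts: fabric_devices
  gather_facts: false
  connection: network_cli
  tasks:
    - name: Wait for SSH on the management address
      ansible.builtin.wait_for:
        host: "{{ inventory_hostname }}"
        port: 22
        delay: 120     # assumed: covers the one-minute reload timer and shutdown
        timeout: 900   # assumed upper bound on boot time
      delegate_to: localhost

    - name: Read back the hostname line
      cisco.ios.ios_command:
        commands:
          - show running-config | include ^hostname
      register: hostname_check

    - name: Fail if the switch did not boot the compiled config
      ansible.builtin.assert:
        that:
          - hostname in hostname_check.stdout[0]
```

The wait runs on the Ansible host rather than on the switch, so it only tests that TCP port 22 answers on the out-of-band address; the two tasks after it log in and check that the expected hostname is present.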
Conclusion
After the switches reload, they remain accessible by Ansible with no other configuration present. This allows me to run my DNA Center bootstrap playbook against the switches to provision the commands required for DNA Center Discovery. This method has reduced the time it takes to deploy Software Defined Access by approximately 75%, allowing me to start fresh with a new fabric.
In the future, I plan to add steps to my “wipe.yaml” to remove the devices from DNA Center’s inventory, and then add them back in the DNA Center bootstrap playbook. While I have used this approach for SDA, it can be applied to almost any IOS-XE-based lab rack for a variety of other solutions.