I recently got access to a Cisco MDS switch in the lab and had the opportunity to play with it before it went into production. Specifically, it was a Connectrix MDS 9396T, which is a 32Gb FC switch with up to 96 ports. The setup was far from straightforward, so I thought I would share my lessons learned to spare other people unnecessary pain. This blog post has three sections:
- Installation
- Configuration. This includes the resolution to the 4 issues I encountered during the setup
- Playbook examples
Cisco developed some Ansible modules that allow us to configure multiple aspects of the switch. In particular I am interested in zoning: aliases, zones, zonesets and VSANs. You can get the modules from GitHub. A quick look at the repo reveals that you can do more than just manage storage functionality. It provides capabilities for virtually every aspect of managing NX-OS.
In the repo it states that the modules have been tested with Cisco NX-OS 7.0(3)I5(1) on Nexus Switches and NX-OS 8.4(1) on MDS Switches. It is worth noting that another Cisco page seems to suggest that Ansible support was introduced with 8.4(1). My switch is running NX-OS software version 8.4(2c), so it should be OK. However, for the record, I initially tested all the steps below with version 8.3(1) and the behavior was identical, meaning Ansible worked with that version too, as long as you solve the 4 issues I explain below.
When it comes to Ansible versions, the repo calls for >=2.9.10 and on my system I am running 2.9.21.
Installation
The initial installation works as described. The documentation doesn’t call out any specific requirements. The installation seems to install everything necessary. Maybe that’s part of the reason why the installation takes close to 2 minutes:
[root@ansible3 ~]# ansible-galaxy collection install cisco.nxos
Process install dependency map
Starting collection install process
Installing 'cisco.nxos:2.4.0' to '/root/.ansible/collections/ansible_collections/cisco/nxos'
Installing 'ansible.netcommon:2.2.0' to '/root/.ansible/collections/ansible_collections/ansible/netcommon'
Installing 'ansible.utils:2.3.0' to '/root/.ansible/collections/ansible_collections/ansible/utils'
Once it’s finished I can see 99 modules … I guess that’s the other reason it takes longer.
[root@ansible3 ~]# ls /root/.ansible/collections/ansible_collections/cisco/nxos/plugins/modules/
__init__.py nxos_gir_profile_management.py nxos_ntp.py
nxos_aaa_server_host.py nxos_gir.py nxos_nxapi.py
nxos_aaa_server.py nxos_hsrp_interfaces.py nxos_ospf_interfaces.py
...
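If you want the installation to be repeatable across control nodes, the collection can also be pinned in a requirements file. This is a minimal sketch, assuming a file called "requirements.yml" (the name is my choice) and the same cisco.nxos version that appears in the output above. The two dependencies should be pulled in automatically.

```yaml
# requirements.yml (hypothetical file name)
# Pins the collection to the version that was installed above.
collections:
  - name: cisco.nxos
    version: "2.4.0"
```

It would then be installed with "ansible-galaxy collection install -r requirements.yml".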
Configuration
When I search for examples beyond what the GitHub repo provides, what comes out on top is 2 documents from Cisco regarding the programmability of the MDS 9000 range for software versions 8.x.
Amongst other things, these examples use “connection=local” and a parameter called “provider”. When I follow these examples I get the first error message:
Unsupported parameters for (cisco.nxos.nxos_devicealias) module: provider Supported parameters include: da, distribute, mode, rename
So this is saying that the “provider” parameter is not recognized. The interesting thing is that even if you delete it from your playbook, it is still added at runtime somehow and you get the same error. You can verify that it is still being passed by running the playbook with the “-vvv” option and looking at the value of “module_args”:
[root@ansible3 mds]# ansible-playbook test.yml -vvv
...
"invocation": {
    "module_args": {
        "da": [
            {
                "name": "testalias1",
                "pwwn": "11:22:33:44:55:66:77:88"
            }
        ],
        "distribute": true,
        "mode": "enhanced",
        "provider": {
            "auth_pass": null,
            "authorize": false,
            ...
The only way I could get it to work was by removing all references to “connection=local”, both in the playbook itself and in “/etc/ansible/hosts”. Instead we are going to use “connection: network_cli”, which causes the playbook to connect to the switch over SSH. Once I made that change I got a different error … at least that’s progress 🙂
Unable to automatically determine host network os. Please manually configure ansible_network_os value for this host
The way to get rid of this error message is to add “ansible_network_os=nxos”. You can add this to your playbook as a variable:
vars:
  ansible_network_os: nxos
Or you can add it to the “/etc/ansible/hosts” file that Cisco suggested. I personally prefer the previous method
[all:vars]
ansible_network_os=nxos
This allows us to move forward, but we are not out of the woods yet. Now we get a different error message
ansible.module_utils.connection.ConnectionError: No authentication methods available
So it is complaining about authentication. When I tested this I still had the “provider” parameter in my playbook and I even got a “[WARNING]: provider is unnecessary when using network_cli and will be ignored”. After trying a few things I discovered that, when using the “network_cli” connection method, the credentials need to be passed in these specific variables:
vars:
  ansible_user: "{{ un }}"
  ansible_password: "{{ pwd }}"
  ansible_network_os: nxos
tasks:
  - name: Create alias
    cisco.nxos.nxos_devicealias:
      distribute: false
      mode: basic
      da:
        - name: testalias1
          pwwn: 11:22:33:44:55:66:77:88
By the way, note how the variables need to exist but they are not parameters of the actual module. You can create these variables in any way you would normally create your variables, e.g. explicitly in the playbook, by reading them from an encrypted file with “vars_files”, or in the “hosts” file.
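As one way of keeping these settings out of the playbook, the same variables could live in a group variables file. This is only a sketch: the path "group_vars/switch251.yml" is an assumption based on the inventory group name used in the playbook examples further down, and "un" and "pwd" are assumed to be defined elsewhere (for example in an encrypted file).

```yaml
# group_vars/switch251.yml (hypothetical path; "switch251" is the inventory group)
ansible_connection: network_cli   # SSH-based connection instead of connection=local
ansible_network_os: nxos          # avoids the "Unable to automatically determine host network os" error
ansible_user: "{{ un }}"          # un and pwd are assumed to be defined elsewhere
ansible_password: "{{ pwd }}"
```

With a file like this in place the play itself would only need “hosts”, “gather_facts” and the tasks.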
And with that, we come to the final hurdle. This is the final error I encountered
ansible.module_utils.connection.ConnectionError: paramiko: The authenticity of host ‘10.1.1.251’ can’t be established
This error message is more straightforward. It is related to the behavior of SSH when you connect to a host for the first time. There are 2 options. The preferred one is to add entries for the switches to the “known_hosts” file. This file is in the home directory of the user that will run the playbooks. For example, I am running the playbooks as root, so the location of the file is “~/.ssh/known_hosts”. If you have only a handful of switches, the easiest way is to SSH to each of them once.
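For more than a handful of switches, the first option could be scripted with a small play that runs on the control node. This is a rough sketch rather than something I ran against the MDS: the variable "mds_switches" is a name I made up, it assumes "ssh-keyscan" is installed on the control node and that the switch offers an RSA host key, and it blindly trusts whatever key the switch presents, so the fingerprints should be checked out of band.

```yaml
---
- name: Pre-populate known_hosts for the MDS switches
  hosts: localhost
  connection: local
  gather_facts: false
  vars:
    mds_switches:            # hypothetical variable holding the switch addresses
      - 10.1.1.251
  tasks:
    - name: Add the SSH host key of each switch
      known_hosts:
        name: "{{ item }}"
        # ssh-keyscan prints "<host> <keytype> <key>", which is the format the module expects
        key: "{{ lookup('pipe', 'ssh-keyscan -t rsa ' + item) }}"
        state: present
      loop: "{{ mds_switches }}"
```

By default the “known_hosts” module writes to the file in the home directory of the user running the play, which is the same file mentioned above.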
The other option would be to add this to your “/etc/ansible/ansible.cfg“, so that it doesn’t bother checking the keys. Of course this is not secure and not a best practice, so use at your own peril
[defaults]
host_key_checking = false
With all those teething issues out of the way it is time to start having fun!
Playbook Examples
Let’s start by creating an alias. My switch is brand new so there is nothing configured yet
mds32-sw1# show device-alias database
There are no entries in the database
I have added a single task to my playbook, which looks as follows:
---
- name: Cisco zoning
  hosts: switch251
  connection: network_cli
  gather_facts: false
  vars_files:
    - creds.yml
  vars:
    ansible_user: "{{ un }}"
    ansible_password: "{{ pwd }}"
    ansible_network_os: nxos
  tasks:
    - name: Create alias
      cisco.nxos.nxos_devicealias:
        distribute: false
        mode: basic
        da:
          - name: testalias1
            pwwn: 11:22:33:44:55:66:77:88
And the “/etc/ansible/hosts” file is as simple as this
[switch251]
10.1.1.251
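For completeness, the “creds.yml” file referenced by “vars_files” only needs to define the two variables that the playbook maps to “ansible_user” and “ansible_password”. The values below are placeholders, not the ones from my lab.

```yaml
# creds.yml (placeholder values, replace with your own)
un: admin
pwd: "ChangeMe123"
```

A file like this would normally be encrypted with "ansible-vault encrypt creds.yml", after which the playbook is run with "ansible-playbook test.yml --ask-vault-pass".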
We run it and it runs to completion
[root@ansible3 mds]# ansible-playbook test.yml
PLAY [Cisco zoning] **************************************************************************************
TASK [Create alias] **************************************************************************************
changed: [10.1.1.251]
PLAY RECAP ***************************************************************************
10.1.1.251 : ok=1 changed=1 unreachable=0 failed=0
If I open an SSH session to the switch I can verify the alias has been created
mds32-sw1# show device-alias database
device-alias name testalias1 pwwn 11:22:33:44:55:66:77:88
Total number of entries = 1
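The same check could be done from the playbook instead of a separate SSH session. The sketch below uses the “nxos_command” module from the same collection and assumes it behaves on MDS the way it does on Nexus. The register name "alias_db" is arbitrary.

```yaml
    - name: Read the device-alias database
      cisco.nxos.nxos_command:
        commands:
          - show device-alias database
      register: alias_db

    - name: Print what the switch returned
      debug:
        var: alias_db.stdout_lines
```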
The good news is that the module seems to be idempotent, because if we run it again it doesn’t change anything
[root@ansible3 mds]# ansible-playbook test.yml
PLAY [Cisco zoning] **************************************************************************************
TASK [Create alias] **************************************************************************************
ok: [10.1.1.251]
PLAY RECAP ***************************************************************************
10.1.1.251 : ok=1 changed=0 unreachable=0 failed=0
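Rather than reading the recap by eye, the idempotency check could be made explicit by repeating the task and asserting on its result. This is a sketch: "second_pass" is just a register name I picked, and the tasks are meant to be appended after the original “Create alias” task.

```yaml
    - name: Create alias (second pass)
      cisco.nxos.nxos_devicealias:
        distribute: false
        mode: basic
        da:
          - name: testalias1
            pwwn: "11:22:33:44:55:66:77:88"
      register: second_pass

    - name: Fail if the second pass changed anything
      assert:
        that:
          - not second_pass.changed
        fail_msg: "nxos_devicealias reported a change on a repeated run"
```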
This time we will remove the alias that we just created, so instead of specifying the “pwwn” we use “remove: True“
- name: Remove alias
  cisco.nxos.nxos_devicealias:
    distribute: false
    mode: basic
    da:
      - name: testalias1
        remove: True
If you run it multiple times you will see that this function is also idempotent. We can verify the alias is gone by querying the switch
mds32-sw1# show device-alias database
There are no entries in the database
Let’s look at another aspect of managing Cisco MDS switches, the configuration of VSANs. Let’s add a new task to our playbook:
- name: Add a vsan with an interface on it
  cisco.nxos.nxos_vsan:
    vsan:
      - id: 10
        interface:
          - fc1/1
        name: vsan-SAN-A
        remove: false
        suspend: false
After running the playbook we can verify that the VSAN has been created and that port “fc1/1” has been added as a member
mds32-sw1# show vsan 10
vsan 10 information
name:vsan-SAN-A state:active
interoperability mode:default
loadbalancing:src-id/dst-id/oxid
operational state:up
mds32-sw1# show vsan 10 membership
vsan 10 interfaces:
fc1/1
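Since “vsan” takes a list, several VSANs can presumably be handled in a single task. The sketch below extends the task above with a second entry. VSAN 20, its name and port “fc1/2” are made-up values for illustration and were not part of my test.

```yaml
    - name: Add two vsans with one interface each
      cisco.nxos.nxos_vsan:
        vsan:
          - id: 10
            name: vsan-SAN-A
            interface:
              - fc1/1
          - id: 20                 # hypothetical second VSAN
            name: vsan-SAN-B       # hypothetical name
            interface:
              - fc1/2              # hypothetical port
```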
Finally let’s take a look at an example of creating a zone using the vsan and the device alias we created earlier
- name: Zone testing
  cisco.nxos.nxos_zone_zoneset:
    zone_zoneset_details:
      - vsan: 10
        mode: basic
        zone:
          - members:
              - pwwn: "11:11:11:11:11:11:11:11"
              - pwwn: "12:12:12:12:12:12:12:12"
              - device_alias: testalias1
            name: zoneA
Once the playbook runs we can verify the results in the switch session
mds32-sw1# show zone vsan 10
zone name zoneA vsan 10
pwwn 11:11:11:11:11:11:11:11
pwwn 12:12:12:12:12:12:12:12
pwwn 11:22:33:44:55:66:77:88 [testalias1]
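The same module also handles zonesets, which is the remaining piece of the zoning workflow. The following is a sketch I did not run as part of this test: the zoneset name "zsetA" is made up, and activating a zoneset changes the active zoning in that VSAN, so it is best tried in a lab first.

```yaml
    - name: Add zoneA to a zoneset and activate it
      cisco.nxos.nxos_zone_zoneset:
        zone_zoneset_details:
          - vsan: 10
            mode: basic
            zoneset:
              - name: zsetA          # hypothetical zoneset name
                members:
                  - name: zoneA      # the zone created in the previous task
                action: activate
```

On the switch, “show zoneset active vsan 10” should then show the activated zoneset.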
In conclusion, the Ansible modules to manage the storage functionality in Cisco Nexus and MDS switches contain a lot of great functionality and will be the ‘go-to’ tool for those organizations that are considering automating Cisco switch operations. The Ansible modules offer a convenient declarative syntax and seem to be idempotent. Once the following issues have been solved during the initial configuration, as described earlier, the modules can prove very valuable:
- “Unsupported parameters for (cisco.nxos.nxos_devicealias) module: provider Supported parameters include: da, distribute, mode, rename”
- “Unable to automatically determine host network os. Please manually configure ansible_network_os value for this host”
- “ansible.module_utils.connection.ConnectionError: No authentication methods available”
- “ansible.module_utils.connection.ConnectionError: paramiko: The authenticity of host can’t be established”