Ansible with Cisco MDS/DellEMC Connectrix

I recently got access to a Cisco MDS switch in the lab and had the opportunity to play with it before it went into production. Specifically, it was a Connectrix MDS 9396T, which is a 32Gb FC switch with up to 96 ports. The setup was far from “straightforward”, so I thought I would share my lessons learned so that other people don’t suffer unnecessary pain. This blog post has three sections:

  • Installation
  • Configuration. This includes the resolution to the 4 issues I encountered during the setup
  • Playbook examples

Cisco developed some Ansible modules that allow us to configure multiple aspects of the switch. In particular I am interested in zoning: device aliases, zones, zonesets and VSANs. You can get the modules from GitHub. A quick look at the repo reveals that you can do more than just manage storage functionality: it provides capabilities for virtually every aspect of managing NX-OS.

The repo states that the modules have been tested with Cisco NX-OS 7.0(3)I5(1) on Nexus switches and NX-OS 8.4(1) on MDS switches. It is worth noting that another Cisco page seems to suggest that Ansible support was introduced with 8.4(1). My switch is running NX-OS software version 8.4(2c), so it should be OK. However, for the record, I initially tested all the steps below with version 8.3(1) and the behavior was identical, meaning Ansible worked with that version too, as long as you solve the 4 issues I explain below.

When it comes to Ansible versions, the repo calls for >=2.9.10 and in my system I am running 2.9.21.

Installation

The initial installation works as described. The documentation doesn’t call out any specific requirements. The installation seems to install everything necessary. Maybe that’s part of the reason why the installation takes close to 2 minutes:

[root@ansible3 ~]# ansible-galaxy collection install cisco.nxos
Process install dependency map
Starting collection install process
Installing 'cisco.nxos:2.4.0' to '/root/.ansible/collections/ansible_collections/cisco/nxos'
Installing 'ansible.netcommon:2.2.0' to '/root/.ansible/collections/ansible_collections/ansible/netcommon'
Installing 'ansible.utils:2.3.0' to '/root/.ansible/collections/ansible_collections/ansible/utils'

Once it’s finished I can see 99 modules … I guess that’s the other reason it takes longer.

[root@ansible3 ~]# ls /root/.ansible/collections/ansible_collections/cisco/nxos/plugins/modules/
__init__.py                          nxos_gir_profile_management.py  nxos_ntp.py              
nxos_aaa_server_host.py              nxos_gir.py                     nxos_nxapi.py            
nxos_aaa_server.py                   nxos_hsrp_interfaces.py         nxos_ospf_interfaces.py  
...
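
If you want the install to be repeatable on other control nodes, one option is to pin the collection in a requirements file. This is only a sketch: the file name “requirements.yml” is an assumption, and the version is simply the one reported in the install output above. It would be installed with “ansible-galaxy collection install -r requirements.yml”.

---
# requirements.yml (hypothetical file name)
# Pins the collection to the version reported by the install output above.
# The ansible.netcommon and ansible.utils dependencies are pulled in automatically.
collections:
  - name: cisco.nxos
    version: "2.4.0"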

Configuration

When I search for examples beyond what the GitHub repo provides, what comes out on top is 2 documents from Cisco regarding the programmability of the MDS 9000 range for software versions 8.x.

Amongst other things, these examples use “connection=local” and a parameter called “provider”. When I follow these examples I get the first error message:

Unsupported parameters for (cisco.nxos.nxos_devicealias) module: provider Supported parameters include: da, distribute, mode, rename

So this is saying that the “provider” parameter is not recognized. The interesting thing is that if you delete it from your playbook it is still added at runtime somehow and you get the same error. You can verify that it is still being passed by running your playbook with the “-vvv” option and looking at the value of “module_args”:

[root@ansible3 mds]# ansible-playbook test.yml -vvv
    ...
    "invocation": {
        "module_args": {
            "da": [
                {
                    "name": "testalias1",
                    "pwwn": "11:22:33:44:55:66:77:88"
                }
            ],
            "distribute": true,
            "mode": "enhanced",
            "provider": {
                "auth_pass": null,
                "authorize": false,
    ...

The only way I could get it to work was by removing all references to “connection=local”, both in the playbook itself and in “/etc/ansible/hosts”. Instead we are going to use “connection: network_cli”. This will cause the playbook to use SSH to connect to the switch. Once I make that change I get a different error … at least that’s progress 🙂

Unable to automatically determine host network os. Please manually configure ansible_network_os value for this host

The way to get rid of this error message is to set “ansible_network_os” to “nxos”. You can add this to your playbook as a variable:

  vars:
    ansible_network_os: nxos

Or you can add it to the “/etc/ansible/hosts” file that Cisco suggested, although I personally prefer the previous method:

[all:vars]
ansible_network_os=nxos
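
The same settings can also live in a YAML inventory instead of the INI-style hosts file. The following is a sketch under assumptions: the file name “inventory.yml” is made up for illustration, and setting “ansible_connection” here is simply the inventory equivalent of “connection: network_cli” in the play.

---
# inventory.yml (hypothetical file name)
all:
  children:
    switch251:
      hosts:
        10.1.1.251:
      vars:
        ansible_connection: network_cli   # SSH-based CLI connection instead of "local"
        ansible_network_os: nxos          # avoids the "Unable to automatically determine host network os" error

You would then point the playbook at it with “ansible-playbook -i inventory.yml test.yml”.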

This allows us to move forward, but we are not out of the woods yet. Now we get a different error message:

ansible.module_utils.connection.ConnectionError: No authentication methods available

So it is complaining about authentication. When I tested this I still had the “provider” parameter in my playbook and I even got a “[WARNING]: provider is unnecessary when using network_cli and will be ignored”. After trying a few things I discovered that when using the “network_cli” connection method the credentials need to be passed in these specific variables:

  vars:
    ansible_user: "{{ un }}"
    ansible_password: "{{ pwd }}"
    ansible_network_os: nxos

  tasks:
  - name: Create alias
    cisco.nxos.nxos_devicealias:
      distribute: false
      mode: basic
      da:
        - name: testalias1
          pwwn: 11:22:33:44:55:66:77:88

By the way, note how the variables need to exist but they are not parameters of the actual module. You can create these variables in any way you would normally create your variables, for example explicitly in the playbook, by reading them from an encrypted file with “vars_files”, or in the “hosts” file.
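
As an example of the encrypted-file approach, a file along these lines would supply the “un” and “pwd” variables used above. The file name “creds.yml” and both values are placeholders, not real credentials. The file can be encrypted with “ansible-vault encrypt creds.yml”, after which the playbook is run with the “--ask-vault-pass” option.

---
# creds.yml (placeholder values; encrypt with ansible-vault before use)
un: admin
pwd: "ChangeMe123"

The play then loads it next to the “vars” section:

  vars_files:
    - creds.yml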

And with that, we come to the final hurdle. This is the last error I encountered:

ansible.module_utils.connection.ConnectionError: paramiko: The authenticity of host ‘10.1.1.251’ can’t be established

This error message is more straightforward. It is related to the behavior of SSH when you connect to a host for the first time. There are 2 options. The preferred one is to add entries for the switches to the “known_hosts” file. This file is in the home directory of the user that will run the playbooks. For example, I am running the playbooks as root, so the location of the file is “~/.ssh/known_hosts”, that is “/root/.ssh/known_hosts”. If you have only a handful of switches the easiest way is to SSH to each of them once and accept the host key.
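
With more than a handful of switches, seeding “known_hosts” could itself be automated. The play below is a minimal sketch, not something taken from the Cisco documentation: it runs on the control node, fetches the host key with “ssh-keyscan” and stores it with the “known_hosts” module. The “rsa” key type is an assumption about what the switch offers, and a key fetched this way should still be compared against the fingerprint shown on the switch console before you trust it.

---
# seed_known_hosts.yml (hypothetical file name)
- name: Seed known_hosts for the MDS switch
  hosts: localhost
  connection: local
  gather_facts: false

  tasks:
  - name: Add the switch host key to ~/.ssh/known_hosts
    known_hosts:
      name: "10.1.1.251"
      # ssh-keyscan prints "host key-type key", which is the format the module expects
      key: "{{ lookup('pipe', 'ssh-keyscan -t rsa 10.1.1.251') }}"
      state: present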

The other option would be to add this to your “/etc/ansible/ansible.cfg”, so that it doesn’t bother checking the keys. Of course this is not secure and not a best practice, so use at your own peril:

[defaults]
host_key_checking = false

With all those teething issues out of the way it is time to start having fun!

Playbook Examples

Let’s start by creating an alias. My switch is brand new so there is nothing configured yet

mds32-sw1# show device-alias database
There are no entries in the database

I have added a single task to my playbook, which looks as follows:

---
- name: Cisco zoning
  hosts: switch251
  connection: network_cli
  gather_facts: false

  vars_files:
    - creds.yml

  vars:
    ansible_user: "{{ un }}"
    ansible_password: "{{ pwd }}"
    ansible_network_os: nxos

  tasks:
  - name: Create alias
    cisco.nxos.nxos_devicealias:
      distribute: false
      mode: basic
      da:
        - name: testalias1
          pwwn: 11:22:33:44:55:66:77:88

And the “/etc/ansible/hosts” file is as simple as this:

[switch251]
10.1.1.251

We run it and it runs to completion:

[root@ansible3 mds]# ansible-playbook test.yml

PLAY [Cisco zoning] **************************************************************************************

TASK [Create alias] **************************************************************************************
changed: [10.1.1.251]

PLAY RECAP ***************************************************************************
10.1.1.251             : ok=1    changed=1    unreachable=0    failed=0

If I open an SSH session to the switch I can verify the alias has been created:

mds32-sw1# show device-alias database
device-alias name testalias1 pwwn 11:22:33:44:55:66:77:88

Total number of entries = 1
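
The same check can be done from the playbook instead of a manual SSH session. The two tasks below are a sketch that assumes the “cisco.nxos.nxos_command” module from the same collection works against the MDS over “network_cli”; the registered variable name “da_db” is made up for illustration.

  - name: Read back the device-alias database
    cisco.nxos.nxos_command:
      commands:
        - show device-alias database
    register: da_db

  - name: Check that testalias1 is present
    assert:
      that:
        # stdout is a list with one entry per command that was run
        - "'testalias1' in da_db.stdout[0]"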

The good news is that the module seems to be idempotent, because if we run it again it doesn’t change anything:

[root@ansible3 mds]# ansible-playbook test.yml

PLAY [Cisco zoning] **************************************************************************************

TASK [Create alias] **************************************************************************************
ok: [10.1.1.251]

PLAY RECAP ***************************************************************************
10.1.1.251             : ok=1    changed=0    unreachable=0    failed=0

This time we will remove the alias that we just created, so instead of specifying the “pwwn” we use “remove: True”:

  - name: Remove alias
    cisco.nxos.nxos_devicealias:
      distribute: false
      mode: basic
      da:
        - name: testalias1
          remove: True

If you run it multiple times you will see that this operation is also idempotent. We can verify the alias is gone by querying the switch:

mds32-sw1# show device-alias database
There are no entries in the database

Let’s look at another aspect of managing Cisco MDS switches, the configuration of VSANs. Let’s add a new task to our playbook:

  - name: Add a vsan with an interface on it
    cisco.nxos.nxos_vsan:
      vsan:
      - id: 10
        interface:
        - fc1/1
        name: vsan-SAN-A
        remove: false
        suspend: false

After running the playbook we can verify that the VSAN has been created and that port “fc1/1” has been added as a member:

mds32-sw1# show vsan 10
vsan 10 information
         name:vsan-SAN-A  state:active
         interoperability mode:default
         loadbalancing:src-id/dst-id/oxid
         operational state:up

mds32-sw1# show vsan 10 membership
vsan 10 interfaces:
    fc1/1
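
To clean up after testing, the reverse operation should only need the “remove” flag flipped. This is a sketch that assumes the flag behaves as its name suggests and that the VSAN id alone is enough to identify what to delete:

  - name: Remove the test vsan
    cisco.nxos.nxos_vsan:
      vsan:
      - id: 10
        remove: true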

Finally, let’s take a look at an example of creating a zone using the VSAN and the device alias we created earlier (the alias was deleted in the removal test above, so it needs to be re-created first):

  - name: Zone testing
    cisco.nxos.nxos_zone_zoneset:
      zone_zoneset_details:
      - vsan: 10
        mode: basic
        zone:
        - members:
          - pwwn: "11:11:11:11:11:11:11:11"
          - pwwn: "12:12:12:12:12:12:12:12"
          - device_alias: testalias1
          name: zoneA

Once the playbook runs we can verify the results in the switch session

mds32-sw1# show zone vsan 10
zone name zoneA vsan 10
  pwwn 11:11:11:11:11:11:11:11
  pwwn 12:12:12:12:12:12:12:12
  pwwn 11:22:33:44:55:66:77:88 [testalias1]
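
A zone only takes effect once it belongs to an active zoneset. The same module accepts a “zoneset” list for this. The task below is a sketch rather than a tested run, and the zoneset name “zoneset-SAN-A” is made up for illustration:

  - name: Add zoneA to a zoneset and activate it
    cisco.nxos.nxos_zone_zoneset:
      zone_zoneset_details:
      - vsan: 10
        mode: basic
        zoneset:
        - name: zoneset-SAN-A      # hypothetical zoneset name
          members:
          - name: zoneA            # the zone created in the previous task
          action: activate

On the switch, “show zoneset active vsan 10” should then list the zoneset and its member zone.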

In conclusion, the Ansible modules to manage the storage functionality in Cisco Nexus and MDS switches contain a lot of great functionality and will be the ‘go-to’ tool for organizations that are considering automating Cisco switch operations. The Ansible modules offer a convenient declarative syntax and seem to be idempotent. Once the following issues have been solved during the initial configuration as described earlier, the modules can prove very valuable:

  • “Unsupported parameters for (cisco.nxos.nxos_devicealias) module: provider Supported parameters include: da, distribute, mode, rename”
  • “Unable to automatically determine host network os. Please manually configure ansible_network_os value for this host”
  • “ansible.module_utils.connection.ConnectionError: No authentication methods available”
  • “ansible.module_utils.connection.ConnectionError: paramiko: The authenticity of host can’t be established”
