How Can a SysAdmin Apply Python Skills to His Daily Work? Part 1

“The system administrator needs to be able to program” – this phrase often provokes objections from many professionals.

  • What for? Doing it by hand is more reliable.
  • But you can automate typical operations.
  • And break a bunch of devices if something goes wrong?
  • But you can break them just as well by hand.

That is a summary of the typical discussion on this issue. Most admins stop at editing previously copied pieces of config in a text editor and pasting them into the console, or at preparing typical configuration files and still loading them onto the equipment by hand through the console.

If you look at the manufacturers of network equipment, it turns out that Cisco, for one, has long offered a variety of options for automating work with network equipment: from Tcl on IOS to Python on NX-OS and IOS-XR. This is called network automation or network programmability, and Cisco has courses in this direction.

And Cisco is not alone here: Juniper with PyEZ, HP, Huawei and so on.

There are many tools: NETCONF, RESTCONF, Ansible, Puppet and, again and again, Python. We will postpone the analysis of specific tools for later and move on to a concrete example.

The second question, which sometimes causes heated discussions and usually ends in complete mutual misunderstanding, is: “Does a system administrator really need network devices in DNS?” Let’s leave a detailed analysis of the participants’ positions for later and formulate the task that led to Python and SNMP. It all started with a traceroute.

Despite the variety of monitoring systems that watch and see a lot, and despite MPLS-TE, which steers traffic in bizarre ways, properly working ICMP and the traceroute and ping utilities can in many cases give the right information right away. But traceroute output that consists only of IP addresses takes extra effort on a large network: you have to work out exactly which devices the packets passed through. For example, we see that forward and reverse traffic from a user goes through different routers, but which ones? The obvious solution is to enter the routers’ addresses in DNS. And in corporate networks, where unnumbered interfaces are rare and each link gets its own addresses, entering the interface addresses in DNS lets you quickly see which router interface an ICMP packet came from.

However, maintaining the DNS database by hand on a large network takes a great deal of labor, and not of the most challenging kind. But the interface domain name consists of the interface name, interface description, router hostname and domain name. The router carries all of this in its configuration. The main thing is to collect it, glue it together properly and bind it to the right address.

So this task should be automated.
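
As a rough sketch of the “glue” step, assuming the pieces have already been collected: the function name make_interface_fqdn and the rule for turning an interface name into a DNS label are illustrative assumptions, not part of the original script. The interface description is left out for brevity.

```python
import re

def make_interface_fqdn(if_name, hostname, domain):
    """Glue an interface name, a router hostname and a domain into one DNS name.

    Hypothetical helper: the naming scheme is an assumption for illustration.
    """
    # DNS labels may contain only letters, digits and hyphens,
    # so replace every other run of characters (such as "/") with "-"
    label = re.sub(r'[^a-z0-9]+', '-', if_name.lower()).strip('-')
    return '.'.join([label, hostname.lower(), domain.lower()])

print(make_interface_fqdn('GigabitEthernet0/1', 'core-rtr1', 'mydomain.com'))
# prints: gigabitethernet0-1.core-rtr1.mydomain.com
```

With names like this in DNS, a traceroute hop resolves to both the router and the interface the packet arrived on.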

The first idea, parsing configurations, quickly faded: the network is large and multi-vendor, with equipment from different generations, so parsing configs quickly became unpopular.

The second idea was to use something that gives consistent answers to the same requests on equipment from different vendors. The answer was obvious: SNMP. For all its quirks, it is implemented in the software of every vendor.

Let’s get started

First, we need to install Python:

sudo apt-get install python3

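We need modules for working with SNMP, with IP addresses and with time. To install them we need pip. Admittedly, it now usually comes bundled with Python.

sudo apt install python3-pip

Now we install the modules. Only pysnmp has to be installed: ipaddress and datetime are part of the Python 3 standard library. The scripts below use the synchronous pysnmp.hlapi interface (getCmd) as found in pysnmp 4.x; newer releases may expose it differently.

pip3 install pysnmp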

Let’s try to get the hostname from a router. SNMP identifies each piece of data on a host by an OID (object identifier): we send a request with an OID, and the host returns the value that corresponds to it. We want the hostname, so we need to query 1.3.6.1.2.1.1.5.0.
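
To make the OID less of a magic number, here is a small sketch of how it is put together. The OIDs are the standard ones from the system group of SNMPv2-MIB; the helper names oid_to_tuple and is_under are made up for this illustration.

```python
# The "system" group of SNMPv2-MIB lives under 1.3.6.1.2.1.1
SYSTEM_GROUP = '1.3.6.1.2.1.1'

# A few scalar objects from that group; the trailing .0 is the instance of a scalar
SYSTEM_OIDS = {
    'sysDescr': SYSTEM_GROUP + '.1.0',
    'sysUpTime': SYSTEM_GROUP + '.3.0',
    'sysContact': SYSTEM_GROUP + '.4.0',
    'sysName': SYSTEM_GROUP + '.5.0',
    'sysLocation': SYSTEM_GROUP + '.6.0',
}

def oid_to_tuple(oid):
    """Turn a dotted OID string into a tuple of integers."""
    return tuple(int(part) for part in oid.split('.'))

def is_under(oid, prefix):
    """True if oid lies in the subtree rooted at prefix."""
    oid_t, prefix_t = oid_to_tuple(oid), oid_to_tuple(prefix)
    return oid_t[:len(prefix_t)] == prefix_t

print(SYSTEM_OIDS['sysName'])                          # 1.3.6.1.2.1.1.5.0
print(is_under(SYSTEM_OIDS['sysName'], SYSTEM_GROUP))  # True
```

Comparing tuples of integers rather than strings keeps 1.3.6.1.2.1.10 from being mistaken for something under 1.3.6.1.2.1.1.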

So here is the first script, which requests only the hostname.

# import section
from pysnmp.hlapi import *
from ipaddress import *
from datetime import datetime

# var section

#snmp
community_string = 'derfnutfo'  # From file
ip_address_host = '192.168.88.1'  # From file
port_snmp = 161
OID_sysName = '1.3.6.1.2.1.1.5.0'  # From SNMPv2-MIB hostname/sysname

# function section

def snmp_getcmd(community, ip, port, OID):
    return (getCmd(SnmpEngine(),
                   CommunityData(community),
                   UdpTransportTarget((ip, port)),
                   ContextData(),
                   ObjectType(ObjectIdentity(OID))))

def snmp_get_next(community, ip, port, OID):
    # getCmd() returns a generator; next() performs the request and yields one response
    errorIndication, errorStatus, errorIndex, varBinds = next(snmp_getcmd(community, ip, port, OID))
    for name, val in varBinds:
        # a single OID was requested, so the first value is the answer
        return val.prettyPrint()

#code section

sysname = snmp_get_next(community_string, ip_address_host, port_snmp, OID_sysName)
print('hostname = ' + sysname)

Run and get:

hostname = MikroTik

Let’s take a look at the script in more detail:

First, we import the necessary modules:

  1. pysnmp – allows the script to work with the host via SNMP
  2. ipaddress – works with IP addresses: checking addresses for correctness, checking whether an address belongs to a network, etc.
  3. datetime – gets the current time. In this task it is needed for timestamps in the logs.
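
A short illustration of the two ipaddress features mentioned above, using only the standard library. The /24 network is an assumption chosen to contain the example router address.

```python
from ipaddress import ip_address, ip_network

addr = ip_address('192.168.88.1')
net = ip_network('192.168.88.0/24')  # assumed network for the example

# membership test: does the address belong to the network?
print(addr in net)        # True

# correctness check: a malformed address raises ValueError
try:
    ip_address('12.43.dsds.f4')
except ValueError as err:
    print('invalid address:', err)
```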

Then we define four variables:

  1. Community string
  2. Host address
  3. SNMP port
  4. OID value

Two functions:

1. snmp_getcmd
2. snmp_get_next

  • The first function prepares a GET request to the specified host, on the specified port, with the specified community and OID.
  • The second function consumes the generator returned by snmp_getcmd and extracts the value from the response. Splitting this into two functions was probably not entirely right, but that is how it turned out.
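
To see why the second function calls next(), here is a sketch with a stand-in generator instead of pysnmp. fake_getcmd and get_first_value are hypothetical; fake_getcmd only imitates the shape of what getCmd yields, a tuple of errorIndication, errorStatus, errorIndex and varBinds.

```python
def fake_getcmd(value):
    """Hypothetical stand-in for pysnmp's getCmd(): a generator that yields
    one (errorIndication, errorStatus, errorIndex, varBinds) tuple."""
    yield (None, 0, 0, [('1.3.6.1.2.1.1.5.0', value)])

def get_first_value(generator):
    # next() runs the generator up to its first yield, i.e. takes one response
    errorIndication, errorStatus, errorIndex, varBinds = next(generator)
    for name, val in varBinds:
        return val  # one OID was requested, so the first value is the answer
    return None     # the response carried no values

print(get_first_value(fake_getcmd('MikroTik')))   # MikroTik
```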

This script lacks some things:

1. The script needs to load the IP addresses of the hosts, for example from a text file. Each address must be checked for correctness as it is loaded, otherwise pysnmp can be very surprised and the script will stop with a traceback. It does not matter whether the addresses come from a file or from a database, but you must be sure that the addresses you received are valid. So, the source of the addresses is a text file, one address per line, in dotted-decimal form.

2. The network equipment may be switched off at polling time or misconfigured. In that case pysnmp returns something quite different from what we expect, and further processing of that result stops the script with a traceback. We need an error handler for our SNMP interaction.

3. We need a log file in which the handled errors are recorded.

Load the addresses and create a log file

  • Add variables for the file names.
  • Write a function check_ip to verify the correctness of an address.
  • Write a function get_from_file that loads the addresses, checks each one for correctness and, if an address is invalid, writes a message about it to the log.
  • Load the data into a list.

filename_of_ip = 'ip.txt' # name of the file with IP addresses
#log
filename_log = 'zone_gen.log' # name of the log file

def check_ip(ip): # ip address verification correctness
    try:
        ip_address(ip)
    except ValueError:
        return False
    else:
        return True

def get_from_file(file, filelog): # selects ip addresses from the file. one line - one address in decimal form
    list_ip = []
    with open(file, 'r') as fd:
        for line in fd:
            line = line.strip()
            if check_ip(line):
                list_ip.append(line)
            else:
                # invalid address: record it in the log passed in as filelog and skip it
                filelog.write(datetime.strftime(datetime.now(), "%Y.%m.%d %H:%M:%S") + ': Error Garbage at source ip addresses ' + line + '\n')
                print('Error Garbage at source ip addresses ' + line)
    return list_ip

#code section

# open the log file
filed = open(filename_log,'w')

# write down the current time
filed.write(datetime.strftime(datetime.now(), "%Y.%m.%d %H:%M:%S") + '\n')

ip_from_file = get_from_file(filename_of_ip, filed)

for ip_address_host in ip_from_file:
    sysname = snmp_get_next(community_string, ip_address_host, port_snmp, OID_sysName)
    print('hostname = ' + sysname)

filed.write(datetime.strftime(datetime.now(), "%Y.%m.%d %H:%M:%S") + '\n')
filed.close()

Create the file ip.txt

192.168.88.1
172.1.1.1
12.43.dsds.f4
192.168.88.1

The second address in this list does not respond to SNMP, and the third is not a valid address at all. Run the script and see for yourself that an error handler for SNMP is needed.

Error ip 12.43.dsds.f4
hostname = MikroTik
Traceback (most recent call last):
  File "/snmp/snmp_read3.py", line 77, in <module>
    print('hostname = ' + sysname)
TypeError: Can't convert 'NoneType' object to str implicitly

Process finished with exit code 1

From the traceback alone it is impossible to tell that the failure was caused by an unreachable host. Let’s intercept the possible reasons for the script stopping and write all the information to the log.
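
Where the NoneType comes from is easier to see in isolation. Judging by the traceback, the response for the unreachable host carried no values, so the loop in snmp_get_next never ran and the function fell off its end. A minimal illustration without pysnmp (first_value is a hypothetical stand-in for that loop):

```python
def first_value(var_binds):
    # Same shape as the loop in snmp_get_next: return on the first item
    for name, val in var_binds:
        return val
    # With an empty sequence the loop body never runs,
    # and the function implicitly returns None

result = first_value([])
print(result is None)   # True

try:
    print('hostname = ' + result)
except TypeError as err:
    # concatenating str and None is what stopped the script
    print('TypeError:', err)
```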

Creating an error handler for pysnmp

The snmp_get_next function already receives errorIndication, errorStatus, errorIndex and varBinds. The received data arrives in varBinds, and error information arrives in the variables whose names begin with error. It only remains to handle them correctly. Since the script will later get several more functions for working with SNMP, it makes sense to process the errors in a separate function.

def errors(errorIndication, errorStatus, errorIndex, ip, file):
    # error handling. In case of errors, we return False and write to the file
    now = datetime.strftime(datetime.now(), "%Y.%m.%d %H:%M:%S")
    if errorIndication:
        # transport-level problem, for example a timeout
        print(errorIndication, 'ip address ', ip)
        file.write(now + ' : ' + str(errorIndication) + ' = ip address = ' + ip + '\n')
        return False
    elif errorStatus:
        # the agent answered with an SNMP error. varBinds is not passed to this function,
        # so we report the index of the failed variable rather than its OID
        message = '%s at index %s' % (errorStatus.prettyPrint(), errorIndex and int(errorIndex) or '?')
        print(now + ' : ' + message)
        file.write(now + ' : ' + message + ' = ip address = ' + ip + '\n')
        return False
    else:
        return True

Now we add error handling and logging to the snmp_get_next function. The function should now return not only the data but also a flag that says whether the request succeeded.

def snmp_get_next(community, ip, port, OID, file):
    errorIndication, errorStatus, errorIndex, varBinds = next(snmp_getcmd(community, ip, port, OID))
    if errors(errorIndication, errorStatus, errorIndex, ip, file):
        for name, val in varBinds:
            # one scalar value was requested, so return the first value
            return (val.prettyPrint(), True)
    # either an error was reported or the response carried no values
    file.write(datetime.strftime(datetime.now(), "%Y.%m.%d %H:%M:%S") + ' : Error snmp_get_next ip = ' + ip + ' OID = ' + OID + '\n')
    return ('Error', False)

Now the code section needs a small rewrite to take the success flag into account.

In addition, we add a few checks:

1. The sysname is less than three characters long. We write such cases to the log so that we can look at them more closely.

2. We discovered that some Huawei and CatOS devices return only the hostname, without the domain. Since we do not really want to hunt for a separate OID (it may not exist at all, or this may be a software bug), we append the domain manually for such hosts.

3. We found that hosts with an incorrect community behave differently: most trigger the error handler, but some for some reason return a response that the script treats as a normal situation.

4. For the debugging period we add different logging levels, so that later we do not have to pick unnecessary messages out of the whole script.
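
The log-level checks can also be gathered into one small helper instead of repeating the comparison before every write. This is a sketch of one possible shape, not the author’s code: write_log and LEVELS are hypothetical names, the 'quiet' level is invented for the example, and the ordering of 'normal' and 'debug' is inferred from the checks in the script.

```python
import io
from datetime import datetime

# Hypothetical ordering: a higher number means a more verbose log
LEVELS = {'quiet': 0, 'normal': 1, 'debug': 2}

def write_log(file, log_level, message_level, message):
    """Write message to file if the configured log_level is verbose enough."""
    if LEVELS[log_level] >= LEVELS[message_level]:
        stamp = datetime.now().strftime('%Y.%m.%d %H:%M:%S')
        file.write(stamp + ' : ' + message + '\n')
        return True
    return False

log = io.StringIO()  # stands in for the real log file
write_log(log, 'normal', 'debug', 'sysname MikroTik len 8')     # skipped
write_log(log, 'normal', 'normal', 'check domain : MikroTik')   # written
print(log.getvalue(), end='')
```

Python’s standard logging module provides the same kind of level filtering out of the box, if a dependency-free helper is not a requirement.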

# add to the var section
log_level = 'debug'  # 'debug' or 'normal'
domain = 'mydomain.com'  # appended to hostnames that come back without a domain

for ip_address_host in ip_from_file:
    # get sysname hostname + domainname, error flag
    sysname, flag_snmp_get = snmp_get_next(community_string, ip_address_host, port_snmp, OID_sysName, filed)

    if flag_snmp_get:
        # It's OK, the host responded to snmp
        if sysname == 'No Such Object currently exists at this OID':
            # ...but the community is invalid. Skip the host, otherwise we catch a traceback later.
            # This community problem cannot be caught any other way, so always ask for the hostname first, which every device returns
            print('ERROR community', sysname, ' ', ip_address_host)
            filed.write(datetime.strftime(datetime.now(), "%Y.%m.%d %H:%M:%S") + ' : ' + 'ERROR community sysname = ' + sysname + '  ip = ' + ip_address_host + '\n')
        else:
            if log_level == 'debug':
                filed.write(datetime.strftime(datetime.now(), "%Y.%m.%d %H:%M:%S") + ' : ' + '  sysname ' + sysname + ' type ' + str(type(sysname)) + ' len ' + str(len(sysname)) + ' ip ' + ip_address_host + '\n')
            if len(sysname) < 3:
                # suspiciously short name: log it so that we can look at it more closely
                if log_level == 'debug' or log_level == 'normal':
                    filed.write(datetime.strftime(datetime.now(), "%Y.%m.%d %H:%M:%S") + ' : ' + 'Error sysname < 3 = ' + sysname + '  ip = ' + ip_address_host + '\n')
            if sysname.find(domain) == -1:
                # something gave up a hostname without a domain, for example, Huawei or Catos
                sysname = sysname + '.' + domain
                if log_level == 'debug' or log_level == 'normal':
                    filed.write("check domain     : " + sysname + " " + ip_address_host + " " + "\n")
            print('hostname = ' + sysname)

Let’s check this script on the same file ip.txt

Error The garbage in the source of ip addresses 12.43.dsds.f4
hostname = MikroTik.mydomain.com
No SNMP response received before timeout ip address 172.1.1.1
hostname = MikroTik.mydomain.com

Everything worked as expected: we caught all the errors, and the script skipped the hosts with errors. Now, with this script, you can collect the hostname from every device that responds to SNMP.

The whole script in one piece

# import section
from pysnmp.hlapi import *
from ipaddress import *
from datetime import datetime

# var section

#snmp
community_string = 'derfnutfo'
ip_address_host = '192.168.88.1'
port_snmp = 161
OID_sysName = '1.3.6.1.2.1.1.5.0'  # From SNMPv2-MIB hostname / sysname

filename_of_ip = 'ip.txt'  # file with ip addresses
#log
filename_log = 'zone_gen.log'  # for the log file
log_level = 'debug'
domain = 'mydomain.com'

# function section

def snmp_getcmd(community, ip, port, OID):
    # type class 'generator': errorIndication, errorStatus, errorIndex, result[3] - list
    # method get: get the result of access to the device by SNMP with the specified OID
    return (getCmd(SnmpEngine(),
                   CommunityData(community),
                   UdpTransportTarget((ip, port)),
                   ContextData(),
                   ObjectType(ObjectIdentity(OID))))

def snmp_get_next(community, ip, port, OID, file):
    # method handles the class generator from def snmp_getcmd
    # process errors, output the type class 'pysnmp.smi.rfc1902.ObjectType' with OID (in name) and value (in val)
    # we get one scalar value
    errorIndication, errorStatus, errorIndex, varBinds = next(snmp_getcmd(community, ip, port, OID))
    if errors(errorIndication, errorStatus, errorIndex, ip, file):
        for name, val in varBinds:
            return (val.prettyPrint(), True)
    # either an error was reported or the response carried no values
    file.write(datetime.strftime(datetime.now(), "%Y.%m.%d %H:%M:%S") + ' : Error snmp_get_next ip = ' + ip + ' OID = ' + OID + '\n')
    return ('Error', False)

def get_from_file(file, filelog):
    # Loading ip addresses from file, writing errors to filelog
    list_ip = []
    with open(file, 'r') as fd:
        for line in fd:
            line = line.strip()
            if check_ip(line):
                list_ip.append(line)
            else:
                filelog.write(datetime.strftime(datetime.now(), "%Y.%m.%d %H:%M:%S") + ': Error ip ' + line + '\n')
                print('Error ip ' + line)
    return list_ip

def check_ip(ip):
    # Check the ip address for correctness. False: check failed.
    try:
        ip_address(ip)
    except ValueError:
        return False
    else:
        return True

def errors(errorIndication, errorStatus, errorIndex, ip, file):
    # error handling. In case of errors return False and write to the file
    now = datetime.strftime(datetime.now(), "%Y.%m.%d %H:%M:%S")
    if errorIndication:
        print(errorIndication, 'ip address ', ip)
        file.write(now + ' : ' + str(errorIndication) + ' = ip address = ' + ip + '\n')
        return False
    elif errorStatus:
        # varBinds is not passed to this function, so report the index of the failed variable
        message = '%s at index %s' % (errorStatus.prettyPrint(), errorIndex and int(errorIndex) or '?')
        print(now + ' : ' + message)
        file.write(now + ' : ' + message + ' = ip address = ' + ip + '\n')
        return False
    else:
        return True

#code section

# open the log file
filed = open(filename_log, 'w')

# write down the current time
filed.write(datetime.strftime(datetime.now(), "%Y.%m.%d %H:%M:%S") + '\n')

ip_from_file = get_from_file(filename_of_ip, filed)

for ip_address_host in ip_from_file:
    # get sysname hostname + domainname, error flag
    sysname, flag_snmp_get = snmp_get_next(community_string, ip_address_host, port_snmp, OID_sysName, filed)

    if flag_snmp_get:
        # It's OK, the host responded to snmp
        if sysname == 'No Such Object currently exists at this OID':
            # ...but the community is invalid. Skip the host, otherwise we catch a traceback later.
            # This community problem cannot be caught any other way, so always ask for the hostname first, which every device returns
            print('ERROR community', sysname, ' ', ip_address_host)
            filed.write(datetime.strftime(datetime.now(), "%Y.%m.%d %H:%M:%S") + ' : ' + 'ERROR community sysname = ' + sysname + '  ip = ' + ip_address_host + '\n')
        else:
            if log_level == 'debug':
                filed.write(datetime.strftime(datetime.now(), "%Y.%m.%d %H:%M:%S") + ' : ' + '  sysname ' + sysname + ' type ' + str(type(sysname)) + ' len ' + str(len(sysname)) + ' ip ' + ip_address_host + '\n')
            if len(sysname) < 3:
                sysname = 'None_sysname'
                if log_level == 'debug' or log_level == 'normal':
                    filed.write(datetime.strftime(datetime.now(), "%Y.%m.%d %H:%M:%S") + ' : ' + 'Error sysname < 3 = ' + sysname + '  ip = ' + ip_address_host + '\n')
            if sysname.find(domain) == -1:
                # something gave up a hostname without a domain, for example, Huawei or Catos
                sysname = sysname + '.' + domain
                if log_level == 'debug' or log_level == 'normal':
                    filed.write("check domain     : " + sysname + " " + ip_address_host + " " + "\n")
            print('hostname = ' + sysname)

filed.write(datetime.strftime(datetime.now(), "%Y.%m.%d %H:%M:%S") + '\n')
filed.close()

Now it remains to collect the interface names, interface descriptions and interface addresses, and to lay them out correctly into BIND configuration files. But more about that in the second part.

PS: Strictly speaking, log messages should be built on a different principle, for example: time, special character, error code, special character, object description, special character, additional information. This will later help set up automatic processing of the log.
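
A sketch of such a format, assuming ';' as the special character. The error code E_TIMEOUT, the example date and the helper names format_log_line and parse_log_line are all made up for illustration.

```python
from datetime import datetime

DELIMITER = ';'  # assumed separator; pick one that cannot occur in the first three fields

def format_log_line(when, code, obj, info):
    """Build: time;error code;object description;additional information"""
    stamp = when.strftime('%Y.%m.%d %H:%M:%S')
    return DELIMITER.join([stamp, code, obj, info])

def parse_log_line(line):
    """Split a line back into its four fields."""
    # maxsplit=3 keeps any delimiter inside the last field intact
    stamp, code, obj, info = line.rstrip('\n').split(DELIMITER, 3)
    return {'time': datetime.strptime(stamp, '%Y.%m.%d %H:%M:%S'),
            'code': code, 'object': obj, 'info': info}

line = format_log_line(datetime(2018, 1, 15, 10, 30, 0),
                       'E_TIMEOUT', '172.1.1.1',
                       'No SNMP response received before timeout')
print(line)
# 2018.01.15 10:30:00;E_TIMEOUT;172.1.1.1;No SNMP response received before timeout
print(parse_log_line(line)['code'])   # E_TIMEOUT
```

With fixed fields like these, filtering the log by error code or by host becomes a one-line split instead of a search through free text.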
UPD: error correction.