How Can a SysAdmin Apply Python Skills to Daily Work? (Part 1)
“The system administrator needs to be able to program” – this phrase often provokes objections from many professionals.
- What for? Doing it by hand is more reliable.
- But you can automate typical operations.
- And break a bunch of devices if something goes wrong?
- But you can break them just as well by hand.
That is the gist of the typical discussion on this issue. Most admins stop at editing previously copied pieces of config in a text editor and pasting them into the console, or at preparing typical configuration files but still loading them onto the equipment by hand through the console.
If you look at the manufacturers of network equipment, it turns out that Cisco, for one, has long offered a variety of options for automating work with network equipment: from Tcl on IOS to Python on NX-OS and IOS XR. This is called network automation or network programmability, and Cisco has courses in this area.
And Cisco is not alone here: Juniper with PyEZ, HP, Huawei and so on.
There are many tools: NETCONF, RESTCONF, Ansible, Puppet and, of course, Python. We will leave the analysis of specific tools for later and move on to a concrete example.
The second question, which sometimes causes heated discussions and usually ends in complete mutual misunderstanding, is: “Does a system administrator really need network devices in DNS?”
Let’s leave a detailed analysis of the participants’ positions for later and formulate the task that led to Python and SNMP. It all started with a traceroute.
Despite the variety of monitoring systems that watch and see a lot, and despite MPLS-TE, which routes traffic in bizarre ways, plain ICMP with the traceroute and ping utilities can in many cases give the right information right away. But traceroute output that consists only of IP addresses takes extra effort in a large network to work out exactly which path the packets took. For example, we see that forward and reverse traffic from a user goes through different routers, but which ones? The obvious solution is to enter the routers’ addresses in DNS. And in corporate networks, where unnumbered interfaces are rarely used and each link gets its own addresses, entering the interface addresses in DNS lets you quickly see which interface of the router the ICMP packet came from.
However, maintaining the DNS database by hand on a large network takes a very large amount of labor, and not of the most interesting kind. But the interface’s domain name consists of the interface name, the interface description, the router’s hostname and the domain name. The router carries all of this in its configuration. The main thing is to collect it, glue it together properly and bind it to the right address.
So this task should be automated.
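As a rough illustration of the "glue" step, here is a minimal sketch of how such a name could be assembled. The function names `dns_label` and `interface_fqdn`, the sanitizing rules and the sample values are assumptions made for illustration; the article does not prescribe a particular naming scheme.

```python
import re

def dns_label(text):
    """Turn free text (e.g. an interface name) into a valid DNS label."""
    # Keep letters, digits and hyphens; everything else becomes a hyphen.
    label = re.sub(r'[^a-z0-9-]+', '-', text.lower()).strip('-')
    # A single DNS label may be at most 63 characters long.
    return label[:63].strip('-')

def interface_fqdn(if_name, if_descr, hostname, domain):
    """Glue interface name, description, hostname and domain into one name."""
    interface_part = dns_label(if_name + ' ' + if_descr)
    return '.'.join([interface_part, dns_label(hostname), domain])

# Hypothetical sample values, not taken from a real device.
print(interface_fqdn('GigabitEthernet0/1', 'to core', 'R1', 'mydomain.com'))
```

Characters such as "/" and spaces are not allowed in hostnames, which is why the interface name and description are sanitized before being joined.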
The first thought, parsing the configurations, quickly faded: the network is large, multi-vendor, and includes equipment from different generations.
The second thought was to use something that gives the right answers to universal requests on equipment from different vendors. The answer was obvious: SNMP. For all its quirks, it is implemented in the software of every vendor.
Let’s get started
First, we need to install Python:
sudo apt-get install python3
We need modules for working with SNMP, IP addresses and time. To install them we need pip. It is usually bundled with Python, but on Debian and Ubuntu it comes as a separate package:
sudo apt install python3-pip
Now we install the modules. Only pysnmp needs to be installed with pip; datetime and ipaddress are part of the Python 3 standard library.
pip3 install pysnmp
Let’s try to get the hostname from the router. SNMP requests address data on the host by OID, and the host returns the information that corresponds to that OID. We want the hostname, so we need to query 1.3.6.1.2.1.1.5.0 (sysName from SNMPv2-MIB).
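To make the OID less magical: the sysName OID from SNMPv2-MIB is 1.3.6.1.2.1.1.5.0, where 1.3.6.1.2.1.1 is the standard "system" group, 5 is sysName inside it, and the trailing 0 denotes the single (scalar) instance. The sketch below lists the neighbouring scalars of that group; the dictionary and the helper name `system_oid` are only illustrative.

```python
# Scalars of the standard "system" group (1.3.6.1.2.1.1).
SYSTEM_GROUP = '1.3.6.1.2.1.1'
SYSTEM_SCALARS = {
    'sysDescr': 1,
    'sysObjectID': 2,
    'sysUpTime': 3,
    'sysContact': 4,
    'sysName': 5,
    'sysLocation': 6,
}

def system_oid(name):
    """Return the full OID of a system-group scalar, including the instance .0."""
    return '%s.%d.0' % (SYSTEM_GROUP, SYSTEM_SCALARS[name])

print(system_oid('sysName'))
```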
So here is the first script, which requests only the hostname.
# import section
from pysnmp.hlapi import *
from ipaddress import *
from datetime import datetime

# var section
# snmp
community_string = 'derfnutfo'  # From file
ip_address_host = '192.168.88.1'  # From file
port_snmp = 161
OID_sysName = '1.3.6.1.2.1.1.5.0'  # From SNMPv2-MIB hostname/sysname

# function section
def snmp_getcmd(community, ip, port, OID):
    # builds the GET request; getCmd returns a generator
    return (getCmd(SnmpEngine(),
                   CommunityData(community),
                   UdpTransportTarget((ip, port)),
                   ContextData(),
                   ObjectType(ObjectIdentity(OID))))

def snmp_get_next(community, ip, port, OID):
    # runs the request and returns the value from the response
    errorIndication, errorStatus, errorIndex, varBinds = next(snmp_getcmd(community, ip, port, OID))
    for name, val in varBinds:
        return (val.prettyPrint())

# code section
sysname = (snmp_get_next(community_string, ip_address_host, port_snmp, OID_sysName))
print('hostname= ' + sysname)
Run and get:
hostname = MikroTik
Let’s take a look at the script in more detail:
First, we import the necessary modules:
- pysnmp – allows the script to work with the host via SNMP
- ipaddress – works with addresses: validating addresses, checking whether an address belongs to a network, and so on.
- datetime – gets the current time. In this task it is needed for writing logs.
Then we define four variables:
- SNMP community string
- Host address
- SNMP port
- OID value
Then come two functions:
- The first function, snmp_getcmd, builds a GET request to the specified host, at the specified port, with the specified community and OID.
- The second function, snmp_get_next, drives the generator returned by snmp_getcmd and returns the value from the response. Splitting this into two functions was probably not entirely right, but that is how it turned out.
This script lacks some things:
1. The script needs to load the IP addresses of the hosts, for example from a text file. While loading, each address must be checked for validity; otherwise pysnmp may be very surprised and the script will stop with a traceback. It does not matter whether the addresses come from a file or from a database, but you must be sure that the addresses you received are valid. Here the source of the addresses is a text file: one line, one address in dotted-decimal form.
2. The network equipment may be switched off at the time of polling or may be misconfigured. In that case pysnmp returns something quite different from what we expect, and further processing of the received data stops the script with a traceback. We need an error handler for our SNMP interaction.
3. A log file is needed, in which the handled errors will be recorded.
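The first point can be tried out in isolation, without any router. Below is a minimal sketch using the standard ipaddress module; the helper name `split_addresses` and the sample lines are made up for illustration.

```python
from ipaddress import ip_address, ip_network

def split_addresses(lines):
    """Split raw text lines into valid IP addresses and garbage."""
    good, bad = [], []
    for line in lines:
        candidate = line.strip()
        if not candidate:
            continue  # ignore empty lines
        try:
            ip_address(candidate)
        except ValueError:
            bad.append(candidate)
        else:
            good.append(candidate)
    return good, bad

# Hypothetical input, as it would be read from a text file.
good, bad = split_addresses(['192.168.88.1\n', '12.43.dsds.f4\n', '\n', '10.0.0.300\n'])
print(good)  # ['192.168.88.1']
print(bad)   # ['12.43.dsds.f4', '10.0.0.300']

# The same module can also answer "does this address belong to that network?"
print(ip_address('192.168.88.1') in ip_network('192.168.88.0/24'))  # True
```

Collecting the rejected lines instead of only printing them makes it easy to write them to the log in one place.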
Load the addresses and create a log file
- Add a variable for the file name.
- Write a function check_ip that verifies that an address is valid.
- Write a function get_from_file that loads the addresses, checks each one for validity and, if an address is invalid, writes a message about it to the log.
- Load the data into a list.
filename_of_ip = 'ip.txt'  # name of the file with IP addresses
# log
filename_log = 'zone_gen.log'

def check_ip(ip):
    # checks that the IP address is valid
    try:
        ip_address(ip)
    except ValueError:
        return False
    else:
        return True

def get_from_file(file, filelog):
    # reads IP addresses from the file: one line, one address in dotted-decimal form
    list_ip = []
    with open(file, 'r') as fd:
        for line in fd:
            line = line.rstrip('\n')
            if check_ip(line):
                list_ip.append(line)
            else:
                filelog.write(datetime.strftime(datetime.now(), "%Y.%m.%d %H:%M:%S") + ': Error Garbage at source ip addresses ' + line + '\n')
                print('Error Garbage at source ip addresses ' + line)
    return list_ip

# code section
# open the log file
filed = open(filename_log, 'w')
# write down the current time
filed.write(datetime.strftime(datetime.now(), "%Y.%m.%d %H:%M:%S") + '\n')
ip_from_file = get_from_file(filename_of_ip, filed)
for ip_address_host in ip_from_file:
    sysname = (snmp_get_next(community_string, ip_address_host, port_snmp, OID_sysName))
    print('hostname= ' + sysname)
filed.write(datetime.strftime(datetime.now(), "%Y.%m.%d %H:%M:%S") + '\n')
filed.close()
Create the file ip.txt
The second address in this list does not respond to SNMP. Run the script and see for yourself that an error handler for SNMP is needed.
Error ip 12.43.dsds.f4
hostname = MikroTik
Traceback (most recent call last):
  File "/snmp/snmp_read3.py", line 77, in <module>
    print('hostname= ' + sysname)
TypeError: Can't convert 'NoneType' object to str implicitly
Process finished with exit code 1
From the traceback alone it is impossible to tell that the failure was caused by an unreachable host. Let’s try to intercept the possible reasons for the script stopping and write all the information to the log.
Creating an error handler for pysnmp
The snmp_get_next function already receives errorIndication, errorStatus, errorIndex and varBinds. The received data ends up in varBinds, and the error information ends up in the variables whose names begin with error. It only needs to be handled correctly. Since the script will later get several more functions for working with SNMP, it makes sense to handle the errors in a separate function.
def errors(errorIndication, errorStatus, errorIndex, ip, file, varBinds=None):
    # error handling: on errors we write to the log file and return False
    # varBinds is optional; when passed, it is used to name the variable that caused the error
    timestamp = datetime.strftime(datetime.now(), "%Y.%m.%d %H:%M:%S")
    if errorIndication:
        # transport-level problem, for example a timeout
        print(errorIndication, 'ip address ', ip)
        file.write(timestamp + ' : ' + str(errorIndication) + ' = ip address = ' + ip + '\n')
        return False
    elif errorStatus:
        # the agent answered, but with an SNMP error
        where = varBinds[int(errorIndex) - 1] if (errorIndex and varBinds) else '?'
        message = timestamp + ' : ' + '%s at %s' % (errorStatus.prettyPrint(), where)
        print(message)
        file.write(message + '\n')
        return False
    else:
        return True
Now we add error handling to the snmp_get_next function and write to the log file. The function should now return not only the data but also a flag that tells whether there were errors.
def snmp_get_next(community, ip, port, OID, file):
    errorIndication, errorStatus, errorIndex, varBinds = next(snmp_getcmd(community, ip, port, OID))
    if errors(errorIndication, errorStatus, errorIndex, ip, file):
        for name, val in varBinds:
            return (val.prettyPrint(), True)
    else:
        file.write(datetime.strftime(datetime.now(), "%Y.%m.%d %H:%M:%S") + ' : Error snmp_get_next ip = ' + ip + ' OID = ' + OID + '\n')
        return ('Error', False)
Now the code section needs to be rewritten a bit to take into account that the request now reports whether it succeeded.
In addition, we add a few checks:
1. The sysname is less than three characters long. We write such a sysname to the log so that we can look at it more closely.
2. We discover that some Huawei and CatOS devices return only the hostname, without the domain. Since we do not really want to look for a separate OID for the domain (it is not certain that one exists at all; maybe it is a software bug), we add the domain manually for such hosts.
3. We find that hosts with an incorrect community behave differently: most trigger the error handler, but some, for some reason, give an answer that the script perceives as a normal situation.
4. For the time of debugging we add different logging levels, so that later we do not have to pick unnecessary messages out of the whole script.
# log_level ('debug' or 'normal') and domain (for example 'mydomain.com') must be defined in the var section
for ip_address_host in ip_from_file:
    # get sysname (hostname + domain name) and the error flag
    sysname, flag_snmp_get = (snmp_get_next(community_string, ip_address_host, port_snmp, OID_sysName, filed))
    if flag_snmp_get:
        # the host responded to SNMP
        if sysname == 'No Such Object currently exists at this OID':
            # the community is invalid: skip the host, otherwise we get a traceback.
            # This problem cannot be caught in any other way, so always ask for
            # the hostname first, since every device returns it.
            print('ERROR community', sysname, ' ', ip_address_host)
            filed.write(datetime.strftime(datetime.now(), "%Y.%m.%d %H:%M:%S") + ' : ' + 'ERROR community sysname = ' + sysname + ' ip = ' + ip_address_host + '\n')
        else:
            if log_level == 'debug':
                filed.write(datetime.strftime(datetime.now(), "%Y.%m.%d %H:%M:%S") + ' : ' + ' sysname ' + sysname + ' type ' + str(type(sysname)) + ' len ' + str(len(sysname)) + ' ip ' + ip_address_host + '\n')
            if len(sysname) < 3:
                if log_level == 'debug' or log_level == 'normal':
                    filed.write(datetime.strftime(datetime.now(), "%Y.%m.%d %H:%M:%S") + ' : ' + 'Error sysname 3 = ' + sysname + ' ip = ' + ip_address_host + '\n')
            if sysname.find(domain) == -1:
                # the device returned a hostname without a domain, for example Huawei or CatOS
                sysname = sysname + '.' + domain
                if log_level == 'debug' or log_level == 'normal':
                    filed.write("check domain : " + sysname + " " + ip_address_host + " " + "\n")
            print('hostname= ' + sysname)
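The sysname checks from the loop can also be expressed as a small pure function, which is easier to try out without live devices. This is only a sketch: the function name `normalize_sysname`, the constant name and the return convention are assumptions, not part of the original script.

```python
# The string that the script above compares the sysname with.
NO_SUCH_OBJECT = 'No Such Object currently exists at this OID'

def normalize_sysname(sysname, domain):
    """Return (fqdn, warnings) for a sysname, or (None, warnings) to skip the host."""
    warnings = []
    if sysname == NO_SUCH_OBJECT:
        # Treated in the loop above as a sign of a wrong community.
        return None, ['ERROR community']
    if len(sysname) < 3:
        warnings.append('suspiciously short sysname')
    if domain not in sysname:
        # The device returned a bare hostname: append the domain manually.
        sysname = sysname + '.' + domain
    return sysname, warnings

print(normalize_sysname('MikroTik', 'mydomain.com'))
```

The caller would then decide what to write to the log depending on the logging level, which keeps the checks themselves free of file handling.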
Let’s check this script on the same file ip.txt
Error The garbage in the source of ip addresses 12.43.dsds.f4
hostname = MikroTik.mydomain.com
No SNMP response received before timeout ip address 126.96.36.199
hostname = MikroTik.mydomain.com
Everything worked as expected: we caught all the errors, and the script skipped the hosts with errors. Now, with this script, you can collect the hostnames of all the devices that respond to SNMP.
Now it remains to collect the interface names, interface descriptions and interface addresses and to lay them out correctly in BIND configuration files. But more on that in the second part.
PS: Strictly speaking, log messages should be built according to a consistent pattern.
For example: time, separator, error code, separator, object description, separator, additional information. This will later help to set up automatic processing of the log.
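One possible shape for such a line, as a sketch only: the separator, the field order, the function names and the error code below are assumptions chosen for illustration, not a format used by the script in this article.

```python
from datetime import datetime

SEPARATOR = ';'

def format_log_line(code, obj, info='', now=None):
    """Build one log line: time;error_code;object_description;additional_information."""
    now = now or datetime.now()
    fields = [now.strftime('%Y.%m.%d %H:%M:%S'), code, obj, info]
    # Remove the separator from field values so the line stays parseable.
    return SEPARATOR.join(str(f).replace(SEPARATOR, ',') for f in fields)

def parse_log_line(line):
    """Split a line produced by format_log_line back into a dict."""
    time, code, obj, info = line.rstrip('\n').split(SEPARATOR)
    return {'time': time, 'code': code, 'object': obj, 'info': info}

# A fixed, arbitrary timestamp is passed in so that the output is reproducible.
line = format_log_line('SNMP_TIMEOUT', '192.168.88.2', 'no response',
                       now=datetime(2018, 1, 1, 12, 0, 0))
print(line)
print(parse_log_line(line)['code'])
```

With a fixed number of fields and a reserved separator, the log can be filtered by error code with any tool that splits text on a character.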
UPD: error correction.