How To Collect And Analysis Of NetFlow Traffic With Mikrotik
Once upon a time, in a distant distant galaxy … Though, to think, it was only 15 years ago.
In general, there were times when solutions based on FreeBSD and Linux were used as the central gateway to the Internet. And these solutions were lovingly tuned, and they were hung with all possible and impossible functions (from the firewall and VPN servers to TFTP + PXE-services of the diskless boot) … and there was no trouble, and everything was fine …
- But times are changing, new solutions are emerging, companies appear that “cheaply and angrily” prepare the Linux kernel, hang out with the necessary functionality and sell it for very modest money (comparable to the cost of hardware).
An example of such solutions is the company Mikrotik and its similar solutions.
Total current situation – to buy and put Mikrotik in an organization in which 10 to 5000 computers, faster and more economical than taking the “regular system” and collect the gateway in parts (networked, software, services, etc.).
- At the same time, the tasks of accounting traffic were and still are. And then a nearby server (usually a Linux-based Linux or FreeBSD) comes to the rescue.
The calculation of WEB traffic is simple and understandable – a bundle of Squid + LightSquid allows you to quickly and easily collect and aggregate information about who visits which sites, what files it downloads and how many it hangs in Youtube. If necessary, you can also limit websites, time, etc. A simple, convenient solution, tested for years. In Mikrotik one rule is issued, issuing an IP proxy server on the Internet. And everyone is happy.
But the problem is that not everything passes successfully through Squid. There are customer banks written without HTTP support and Socks proxies. There are complex programs that use different connections for different types of traffic-the result-they either work poorly through Proxy or do not work at all. And there is a separate category of so-called VIP-persons … which is easier to give “Full NAT” than to aggravate relations when “something does not open with them”.
Thus, in Mikrotik sooner or later, but there will be separate rules that issue “special” directly through NAT, bypassing the proxy server. And we do not see their traffic in statistics.
The decision to record such traffic suggests the following:
- Enable the capture of NetFlow statistics on the external interface of Mikrotik;
- Sending these statistics to the NAS (for example, to the flow-tools service, via flow-capture)
For convenient analysis of the files received on the NAS server, this solution is proposed to be improved with a couple of self-scripted scripts:
- Perl-script, which will process ft-files, and upload information to the MySQL database;
- PHP-script, which will perform the role of preconfigured UI, for convenient analysis of NetFlow-data.
Everything is simple and according to documentation:
set enabled=yes interfaces=WAN
/ip traffic-flow target
add dst-address=<NAS IP Address> port=8787 v9-template-timeout=1m version=5
Configuring Flow-Tools with the example of FreeBSD
# Install the NetFlow sensor:
pkg install streaming tools
# Launch setup:
echo ‘flow_capture_enable = "YES"’ >> /etc/rc.conf.local
echo ‘flow_capture_flags = "- N-2"’ >> /etc/rc.conf.local
starting the flow_capture service
Installation and preparation of MySQL DBMS for importing NetFlow data
# Install, run and configure the MySQL server:
pkg install mysql56-server
# Start the service
echo ‘mysql_enable = "YES"’ >> /etc/rc.conf
service mysql start
# Initial configuration of the DBMS:
# Install Perl-modules to run the script with the DBMS:
pkg install p5-DBI p5-DBD-mysql
# We enter the DBMS and create a database and a user to work with it:
mysql -u root -p
Create DBMS and user:
mysql> create database netflow;
mysql> grant insert,create,update,select,delete on netflow.* to nfuser@’localhost’ identified by ‘987654321’;
mysql> flush privileges;
Perl-script for analysis of ft- * NetFlow statistics files and data loading in MySQL
The script was not written from scratch – long ago, in 2005, an article on the calculation of traffic on the FreeBSD gateway was posted on the OpenNet website using the NetGraph module ng_ipacct.
- The download script was taken as a basis and rewritten for use with NetFlow and flow-tools. Works both on FreeBSD and Linux (only ways to rewrite to the flow-cat and flow-print programs).
Features of the script – this option is designed to analyze all ft-files over the past day and load them into the database (line by line). At the same time, it was implemented to exclude rows from several templates so as not to load redundant information into MySQL (for example, to exclude broadcast traffic, DNS query traffic, traffic from HTTP / Socks proxies (the proxy statistics are in a different place). allows you to reduce the number of rows loaded into the database in 10 or even 20-30 times.
Tables in the DBMS are created automatically with the beginning of the new month. To the standard NetFlow v5 format, the day is added, the recording time (the time of the generated ft-files is used – for example, every 15 minutes), the NetFlow source name and the network interface name are also indicated.
PHP UI for simplified SQL query building
In the distant 2005, when the author’s Perl scripts were used, we used SQL commands to analyze the data in MySQL … and everything suited everybody.
But sooner or later the time has come when you are tired of entering inquiries. And, having gathered his thoughts, he wrote a small PHP-code that allowed to build SQL-queries in a faster and easier way.
What makes it possible to make a script:
- Generate SQL queries, see the final query itself and its result in a table form;
- See the amount of NetFlow traffic for any month;
- Group traffic by one metric (SourceIP, SourcePort, DestIP, DestPort, Proto, Date);
- See the number of rows instead of the amount (both for the month and for grouping by one characteristic);
- Filter by traffic volume;
- See the entries themselves instead of the amount of traffic;
- Request the specified number of records from the database (for example, for further copy/paste in Excel).
- The owner of the netflow.php file must be a Web server user (for example, Apache).
- Access to the DBMS is specified in the netflow.php file explicitly – so change it for yourself.
Maintenance of MySQL tables
If the tables are quite large (although no one bothers you in loading only the DBMS that you need, except “slag” and reducing the size), that is an interesting technique that allows you to significantly reduce the size of the database. It’s about using a database compression in the MyISAM format, and also about index optimization.
To automatically execute these procedures, another Perl script was written, which on the first day of each new month is run by Cron:
- It repairs the table just in case (suddenly there were incorrect service outages and MyISAM contains errors);
- Renames the table, adding the character c at the end (from the word compressed);
- Runs the programs myisamchk and myisampack, which compress the database and build a new and sorted index for the compressed table.
After the script is executed, this table will become ReadOnly, it will be compressed (the size will decrease by a factor of 3), and a new sorted and optimized index will be built. Requests to this table will be faster.