Service checks in LibreNMS (http, all other Nagios plugins)
Published: 10-09-2018 | Author: Remy van Elst
LibreNMS is becoming one of my favorite monitoring tools. Setup and getting started is easy and it has enough advanced options and tunables. I recently discovered that LibreNMS is able to check services as well. Services, in this context, means executing Nagios plugins (like check_http, check_ping, etc). This allows you to check services that SNMP does not cover by default, like HTTP(S) health checks, certificate expiry, TCP port checks (e.g. RDP) and anything for which you can write a Nagios plugin yourself. The performance data, if available, is graphed automatically. Alerting is done with the regular LibreNMS alerts. This guide covers the setup of services (it's not enabled by default) and a few basic checks, like an HTTP health check, certificate expiry and SSH monitoring.
Nagios check plugins
For those unfamiliar with Nagios, it is a monitoring system which can execute checks. These checks are scripts and programs which take input (for example, which host to check and which thresholds to use), do a check and then return an exit code and some performance data. The plugins can be in any language; Nagios only cares about the exit codes. They can be the following:
- 0: OK
- 1: WARNING
- 2: CRITICAL
- 3: UNKNOWN
- 4 and up: not defined by the plugin guidelines
For example, to check if a website is working, you would use the check_http plugin. This plugin checks if the site returns a 200 OK and if so, gives exit status 0. If not, for example because of a timeout or a 50x error, it returns status 2; a 4xx response such as access denied gives status 1. Nagios can then do all kinds of alerting based on those statuses.
Performance data is a list of space-separated label=value pairs added after a pipe character (|) following the status output in the command result. This can be anything, for example, the time the HTTP request took.
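To make the performance data format concrete, here is a minimal Python sketch that splits one line of plugin output into the status text and the metrics after the pipe character. The function name parse_plugin_output is my own and the parser is deliberately simplified (it assumes labels without spaces or quotes), so treat it as an illustration rather than a complete implementation.

```python
def parse_plugin_output(line):
    """Split one line of plugin output into (status_text, metrics).

    Simplified sketch: assumes labels contain no spaces or quotes.
    """
    status, _, perf = line.partition("|")
    metrics = {}
    for item in perf.split():
        label, _, rest = item.partition("=")
        # After the value come optional fields: warn;crit;min;max
        metrics[label] = rest.split(";")[0]
    return status.strip(), metrics


if __name__ == "__main__":
    sample = ("HTTP OK: HTTP/1.1 200 OK - 1320 bytes in 0.199 second "
              "response time |time=0.198748s;;;0.000000 size=1320B;;;0")
    print(parse_plugin_output(sample))
```

The values keep their unit suffix (s, B) as strings; a graphing tool would strip the unit and store the number.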
Since you can write these scripts yourself, any monitoring system that uses these plugins is very extensible. It can check anything you want as long as you can write a script for it. This makes the monitoring tool very powerful: you're not limited to what it provides.
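As an illustration of how little a custom plugin needs, here is a sketch of a hypothetical check_backup_age.py that goes WARNING or CRITICAL when a file has not been modified for too long. The plugin name, its arguments and the BACKUP_AGE prefix are made up for this example; only the exit codes and the "status | performance data" output shape follow the Nagios plugin conventions.

```python
#!/usr/bin/env python3
"""Hypothetical check_backup_age plugin: alert when a file is too old."""
import os
import sys
import time

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(age_seconds, warn, crit):
    """Map a file age in seconds to a state using two thresholds."""
    if age_seconds >= crit:
        return CRITICAL
    if age_seconds >= warn:
        return WARNING
    return OK


def main(argv):
    if len(argv) != 4:
        print("UNKNOWN - usage: check_backup_age.py <path> <warn_seconds> <crit_seconds>")
        return UNKNOWN
    path = argv[1]
    try:
        warn, crit = float(argv[2]), float(argv[3])
        age = time.time() - os.path.getmtime(path)
    except (ValueError, OSError) as exc:
        # Bad thresholds or an unreadable file: we cannot judge the state
        print(f"UNKNOWN - {exc}")
        return UNKNOWN
    state = evaluate(age, warn, crit)
    # Status text, a pipe, then performance data as label=value;warn;crit
    print(f"BACKUP_AGE {LABELS[state]} - {path} is {age:.0f}s old "
          f"| age={age:.0f}s;{warn:.0f};{crit:.0f}")
    return state


if __name__ == "__main__":
    sys.exit(main(sys.argv))
```

You can run it on the shell and inspect the exit code with echo $? to see which state the monitoring system would record.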
To read more about monitoring, you can read one of my other articles.
Enabling service checks
Service checks are not enabled by default in LibreNMS. The documentation explains how to enable the module. In this guide I assume your path is /opt/librenms/. Edit your LibreNMS config file (config.php) and add the following line:
$config['show_services'] = 1;
Save the file.
Edit the LibreNMS cronjob to include service checks:
*/5 * * * * librenms /opt/librenms/services-wrapper.py 1
Make sure the Nagios plugins are installed:
apt-get install nagios-plugins nagios-plugins-extra
Do a test to see if the plugins work:
/usr/lib/nagios/plugins/check_http -H raymii.org -S -p 443
HTTP OK: HTTP/1.1 200 OK - 1320 bytes in 0.199 second response time |time=0.198748s;;;0.000000 size=1320B;;;0
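If you want to script this kind of test, the following Python sketch runs a plugin and translates its exit code into a state name. The helper name run_check is my own; the plugin path and arguments are the ones from the shell test above. It is a rough illustration of what a monitoring system does with a plugin, not how LibreNMS itself is implemented.

```python
import subprocess
import sys

STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def run_check(command, timeout=30):
    """Run a plugin (list of arguments) and return (state, output)."""
    try:
        result = subprocess.run(command, capture_output=True,
                                text=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired) as exc:
        # Plugin missing, not executable, or it hung: state is unknown
        return "UNKNOWN", str(exc)
    # Exit codes outside 0-3 are treated as UNKNOWN in this sketch
    state = STATES.get(result.returncode, "UNKNOWN")
    return state, result.stdout.strip()


if __name__ == "__main__":
    # Use the command given on the command line, or fall back to the test above
    command = sys.argv[1:] or ["/usr/lib/nagios/plugins/check_http",
                               "-H", "raymii.org", "-S", "-p", "443"]
    print(run_check(command))
```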
Adding a (dummy) host
You must have a host in LibreNMS to be able to add service checks. Normally you would use snmp to monitor devices, but if you just want to do simple (HTTP) checks without SNMP you can add a host without SNMP or TCP checks. Via Add Device you can enter a URL/IP. Uncheck the SNMP checkbox and check the Force add button.
If this device does not accept ICMP (ping) traffic, you can disable that as well. Go to the device, select the Cog menu, Edit, "Misc" tab, then check "Disable ICMP Test?".
If you do want to use SNMP, here is a quick guide for Ubuntu. First install snmpd:
apt-get install snmpd
Edit the configuration. Remove everything and add the following:
agentAddress udp:161
createUser <username> SHA "<password>" AES "<password2>"
view systemonly included .1.3.6.1.2.1.1
view systemonly included .1.3.6.1.2.1.25.1
rwuser <username>
sysLocation <location>
sysContact <your name and email>
includeAllDisks 10%
defaultMonitors yes
linkUpDownNotifications yes
Change the username and passwords to a long and secure name and passwords (8 characters minimum). Restart snmpd:
service snmpd restart
Add a rule in your firewall to only allow access to UDP port 161 from your monitoring service and deny all other traffic.
You can now add this machine in LibreNMS using SNMPv3 and the authentication data you provided.
Configuring services in LibreNMS
In LibreNMS you should now have a new tab button in the top menu, named "Services".
Make sure you added a host as described above. You can navigate to a host and click the "Services" tab, then click "Add service". In the top menu bar you can also click "Services", "Add Service". You then have to select the host as well.
The type is the Nagios plugin you want to use. In our case, http (the check_ part is not shown).
Enter a meaningful description. For example, "HTTP Check https://example.org/path/to/data".
The IP address can be the hostname or the IP. It is recommended to make this the same as the host the services are coupled to.
The "Parameters" are the Nagios check command parameters, as you would pass them on the shell. In the case of an HTTP check for one of the servers hosting raymii.org it would be:
-E -I 22.214.171.124 -S -p 443 -u "/s/index.html"
IP Address: raymii.org
-E: extended performance data
-I 126.96.36.199: the specific IP address (optional, I have multiple A records)
-S: use SSL
-p 443: use port 443
-u "/s/index.html": the URL to request. (optional)
All parameters can be found on the monitoring-plugins website. You can test on the shell first before you add the check to LibreNMS.
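To see how the form fields relate to a shell command you can test, here is a small Python sketch that combines the type, the IP address and the parameters into one command line. It assumes the command has the shape check_<type> -H <ip address> <parameters>; that mapping, the function name build_check_command and the plugin directory constant are my assumptions for illustration, so compare the result with what your installation actually runs.

```python
import shlex

PLUGIN_DIR = "/usr/lib/nagios/plugins"  # plugin path used earlier in this guide


def build_check_command(service_type, ip_address, parameters):
    """Return the argument list for check_<type> -H <ip> <parameters>.

    Assumption: the form fields map onto the command line in this order.
    """
    command = [f"{PLUGIN_DIR}/check_{service_type}", "-H", ip_address]
    # shlex.split honours shell-style quoting, e.g. -u "/s/index.html"
    command += shlex.split(parameters)
    return command


if __name__ == "__main__":
    cmd = build_check_command("http", "raymii.org",
                              '-E -S -p 443 -u "/s/index.html"')
    # Print a copy-pasteable command line for testing on the shell
    print(shlex.join(cmd))
```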
Save the dialog box and wait a few minutes for the check to run.
An SSH check is even simpler: just select SSH as the type and add the check. Here is an example of a Cisco switch where SSH is checked:
A certificate check, to get an alert when a certificate is about to expire, can also be done. The type is http and the parameters are:
--sni -S -p 443 -C 30
It will check if the certificate expires within 30 days.
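The logic behind such an expiry check can be sketched in a few lines of Python. This is not the code of check_http; it is a simplified illustration that assumes a WARNING when fewer than warn_days remain and a CRITICAL once the certificate has expired. The function name and the dates in the example are made up.

```python
from datetime import datetime, timezone

OK, WARNING, CRITICAL = 0, 1, 2


def certificate_state(not_after, now, warn_days=30):
    """Return (state, days_left) for a certificate expiry time.

    Assumed behaviour: WARNING below warn_days, CRITICAL once expired.
    """
    days_left = (not_after - now).days
    if not_after <= now:
        return CRITICAL, days_left
    if days_left < warn_days:
        return WARNING, days_left
    return OK, days_left


if __name__ == "__main__":
    expiry = datetime(2018, 10, 1, tzinfo=timezone.utc)  # made-up expiry date
    today = datetime(2018, 9, 10, tzinfo=timezone.utc)
    print(certificate_state(expiry, today))
```

With 21 days left and a 30 day threshold, the example falls in the WARNING range, which is what would trigger the LibreNMS alert.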
There is a default alert rule in LibreNMS named
services.service_status != 0 AND macros.device_up = 1
If you want to differentiate between WARNING and CRITICAL Nagios alerts, you can create two rules:
# warning
services.service_status = 1 AND macros.device_up = 1
# critical
services.service_status = 2 AND macros.device_up = 1
Specific alerting and rechecking when a check fails is not as configurable as in Icinga or Nagios. The check will run, and alert you on a failure. Icinga/Nagios allow you to configure escalation paths and advanced re-checking. For example, when a check fails, recheck it 4 times with an interval of X seconds (instead of the regular check interval) and only alert if it still fails.
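The re-checking behaviour described here can be sketched as follows. This is neither Icinga's nor LibreNMS's implementation, just a minimal Python illustration of the idea: after a failure the check is repeated a few times at a short interval and only the final result counts. The function name and its parameters are hypothetical.

```python
import time

OK = 0


def check_with_rechecks(run_check, rechecks=4, interval=10, sleep=time.sleep):
    """Run a check; on failure repeat it up to `rechecks` times.

    run_check is any callable returning a Nagios exit code. The last
    result is returned, so a failure is only reported if the initial
    check and every recheck failed.
    """
    state = run_check()
    for _ in range(rechecks):
        if state == OK:
            break
        sleep(interval)  # short retry interval, not the regular one
        state = run_check()
    return state


if __name__ == "__main__":
    # Fake check that fails twice and then recovers
    results = iter([2, 2, 0])
    print(check_with_rechecks(lambda: next(results), interval=0))
```

A wrapper script built on this idea could be used as the service check itself, as long as the total retry time stays well below the regular check interval.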
In Icinga you can define (service or host) groups and apply service checks to these groups. LibreNMS doesn't allow this, so you cannot define a check and apply it to a group. If you need to check 100 servers, it means defining 100 checks by hand, one per server.
Here is an example of services that are down:
Here is an example of a dummy host (no ICMP or SNMP) with an HTTP check and alerting enabled:
Tags: bash, icinga, librenms, logging, monitoring, nagios, observium, plugin, python, tutorials