Munin optimization guide for Debian (rrdcached, tmpfs, ionice and nice)

08-12-2012 | Remy van Elst


This guide will help you tune the performance of Munin. Once Munin monitors more than a few hosts, its performance drops and it needs more resources, mostly disk IO. Newer releases improve on this, but it is still not perfect. You can tune Munin yourself to reduce the IO load.

This guide assumes a working Munin setup (1.4 or 2.0) on Debian/Ubuntu. It also assumes root access to the server (either su or sudo).

rrdcached

rrdcached is, as the name implies, a caching daemon for RRD files. Munin uses RRD files as its database and updates them every 5 minutes, which causes a lot of random IO. rrdcached batches these updates, so writes happen less often and more sequentially.

Do note that rrdcached is only fully supported in munin 2.0. Usage on 1.4 is not supported.

First install it:

apt-get install rrdcached

Then run the command below to start rrdcached and create the socket:

sudo -u munin /usr/bin/rrdcached \
  -p /run/munin/rrdcached.pid \
  -B -b /var/lib/munin/ \
  -F -j /var/lib/munin/rrdcached-journal/ \
  -m 0660 -l unix:/run/munin/rrdcached.sock \
  -w 1800 -z 1800 -f 3600

Make sure to add it to /etc/rc.local so that rrdcached is started again after a reboot.

Now add the following to your /etc/munin/munin.conf to enable the socket:

rrdcached_socket /run/munin/rrdcached.sock
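
To double-check that the config line and the running daemon agree, a small sketch like the one below can help. The function names configured_socket and socket_status are my own and not part of Munin; the script only reads the config file and tests whether the socket file exists.

```shell
#!/bin/sh
# Hypothetical helpers, not shipped with Munin or rrdcached.

# Print the socket path from a munin.conf-style file.
# $1 = path to the config file, e.g. /etc/munin/munin.conf
configured_socket() {
    # Only match lines whose first word is exactly "rrdcached_socket",
    # so commented-out lines are ignored.
    awk '$1 == "rrdcached_socket" { print $2; exit }' "$1"
}

# Report whether the configured socket actually exists.
# $1 = path to the config file
socket_status() {
    sock=$(configured_socket "$1")
    if [ -z "$sock" ]; then
        echo "no rrdcached_socket line found"
    elif [ -S "$sock" ]; then
        echo "socket $sock exists"
    else
        echo "socket $sock is configured but missing"
    fi
}

# Example call on a real system:
#   socket_status /etc/munin/munin.conf
```

If the socket is reported as missing, rrdcached is most likely not running, or it was started with a different -l path.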

Some info about the command:

rrdcached writes the spooled data every 5 minutes by default, which is the same interval as the Munin master. For the caching to have any effect, raise the flush intervals so that more data is spooled before it is written. Use the following parameters, and tune them to your liking:

-w 1800 Wait 30 minutes before writing data
-z 1800 Delay writes by a random factor of up to 30 minutes (this should be equal to, or lower than, -w)
-f 3600 Flush all data every hour
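
To get a feeling for these numbers, here is a rough back-of-the-envelope calculation in shell. It assumes the standard 5 minute (300 second) update interval; the function name updates_per_write is just something I made up for the example.

```shell
#!/bin/sh
# Rough estimate: how many 5-minute updates pile up per RRD file
# before rrdcached writes them out.

update_interval=300   # munin-update runs every 5 minutes
write_timeout=1800    # value given to rrdcached -w

# $1 = write timeout in seconds, $2 = update interval in seconds
updates_per_write() {
    echo $(( $1 / $2 ))
}

echo "updates batched per write: $(updates_per_write "$write_timeout" "$update_interval")"
# prints: updates batched per write: 6
```

Because -z adds a random delay of up to another 30 minutes, the real number per file will usually be somewhat higher than this estimate.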

By default munin-graph runs every 5 minutes, so the caching configured above has no effect as long as the data is read every 5 minutes. To solve this we split the Munin updating from the Munin graphing.

Edit the file /etc/cron.d/munin and add the following line:

10 * * * *      munin if [ -x /usr/bin/munin-graph ]; then /usr/bin/munin-graph; fi

The file /usr/bin/munin-graph does not exist yet, so we are going to create it:

nano /usr/bin/munin-graph

Now add this:

#!/bin/bash
# We always launch munin-html.
# It is a noop if html_strategy is "cgi"
# "$@" is quoted so arguments containing spaces are passed on intact.
nice /usr/share/munin/munin-html "$@" || exit 1

# The result of munin-html is needed for munin-graph.
# It is a noop if graph_strategy is "cgi"
nice /usr/share/munin/munin-graph --cron "$@" || exit 1

and make it executable:

chmod +x /usr/bin/munin-graph

Now we edit the /usr/bin/munin-cron file and comment out the lines we put in the munin-graph file:

[...]
# We always launch munin-html.
# It is a noop if html_strategy is "cgi"
# nice /usr/share/munin/munin-html $@ || exit 1

# The result of munin-html is needed for munin-graph.
# It is a noop if graph_strategy is "cgi"
# nice /usr/share/munin/munin-graph --cron $@ || exit 1   

By doing this, munin-update still runs every 5 minutes, while the graphing and HTML page creation run only once per hour. No data is lost, because the values are still collected every 5 minutes, and I don't think you are going to look at the graphs every 5 minutes. If you need fresher graphs, just change the crontab file to run munin-graph more often.

nice and ionice

This little tweak to the Munin cronjob makes it friendlier on IO and CPU. Note that ionice only has an effect with a disk scheduler that supports IO priorities, such as CFQ, which was the default scheduler when this guide was written. Also check the syntax of the nice command on your system, because different versions exist.

Edit the /etc/cron.d/munin file and change the cronjob:

*/5 * * * *     munin if [ -x /usr/bin/munin-cron ]; then /usr/bin/ionice -c 3 /usr/bin/nice -n 19 /usr/bin/munin-cron; fi

If you have applied the above tweak with rrdcached then you can also make the other cronjob nicer:

10 * * * *      munin if [ -x /usr/bin/munin-graph ]; then /usr/bin/ionice -c 3 /usr/bin/nice -n 19 /usr/bin/munin-graph; fi
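
If you are not sure which disk scheduler is active, the kernel lists the available schedulers per block device in /sys/block/<device>/queue/scheduler and puts the active one in square brackets. The sketch below prints it for every device; active_scheduler is a helper name I made up, and the script only reads from /sys.

```shell
#!/bin/sh
# Print the active IO scheduler for each block device.

# $1 = contents of a queue/scheduler file, e.g. "noop deadline [cfq]"
active_scheduler() {
    # The active scheduler is the name between the square brackets.
    printf '%s\n' "$1" | sed -n 's/.*\[\(.*\)\].*/\1/p'
}

for f in /sys/block/*/queue/scheduler; do
    [ -r "$f" ] || continue
    dev=${f#/sys/block/}   # strip the leading path
    dev=${dev%%/*}         # keep only the device name
    echo "$dev: $(active_scheduler "$(cat "$f")")"
done
```

If the scheduler shown does not support IO priorities, the ionice part of the cronjob is harmless but will not change anything; the nice part still works.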

munin html and graphs in RAM

By mounting the folder where Munin creates the HTML (in my case /var/www/munin) as tmpfs, we lower the IO a bit more, because the files are written to RAM instead of to disk. Losing them after a power outage or reboot is not an issue, because the HTML and graphs are generated fresh every time.

Edit /etc/fstab and add the following (be careful not to change something else):

tmpfs  /var/www/munin   tmpfs   rw,mode=755,uid=munin,gid=munin,size=150M   0 0

Because my Munin server has 256 MB RAM I give it a size of 150M. If you have more RAM available you can easily raise this to 1000M. If you save your Munin graphs somewhere else (check the htmldir variable in /etc/munin/munin.conf), then make sure you change the mount point as well.

Now make sure it is mounted by executing the following:

mount -a
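
To confirm that the folder really ended up on tmpfs, you can look it up in /proc/mounts. Below is a minimal sketch; is_tmpfs_mount is a hypothetical helper name, and the second argument exists only so the function can be pointed at another mounts table.

```shell
#!/bin/sh
# Succeeds if the given mount point is mounted as tmpfs.
# $1 = mount point, $2 = mounts table (defaults to /proc/mounts)
is_tmpfs_mount() {
    awk -v mp="$1" '
        $2 == mp && $3 == "tmpfs" { found = 1 }
        END { exit !found }
    ' "${2:-/proc/mounts}"
}

# Change the path if your htmldir is somewhere else.
if is_tmpfs_mount /var/www/munin; then
    echo "/var/www/munin is on tmpfs"
else
    echo "/var/www/munin is NOT on tmpfs"
fi
```

Running df -h /var/www/munin afterwards also shows how much of the configured size is in use.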

cgi strategy

Another way to improve the performance of Munin is to use a CGI-based graph strategy. However, for me this made the Munin web interface terribly slow and unworkable, so I will not cover it here.

Tags: debian, ionice, monitoring, munin, nice, performance, rrdcached, tmpfs, ubuntu