Friday, September 18, 2009

Nagios Remote Monitoring Unix/Linux with check_by_ssh

Nagios (http://www.nagios.org) can do full remote monitoring by ssh'ing into the remote server and running the Nagios plugin there. For this to work, SSH has to work without interaction (key-based login, no password prompt), and only the plugins need to be installed on the remote server. This monitoring does not require any special privileges (i.e. it doesn't need to run as root).

On the remote (to be monitored) server (assumes a user name of nagmon):
useradd -d /export/home/nagmon -m nagmon *create our local monitoring service account

passwd nagmon *the password you set is unimportant; we won't ever use it to log in, you just need to set one to enable the account

mkdir /export/home/nagmon/.ssh

chown nagmon /export/home/nagmon/.ssh


On the nagios (monitor) server:
ssh-keygen -t rsa -b 4096 -f /etc/ssh/nagmon_rsa *generate an RSA key pair saved to /etc/ssh/nagmon_rsa; when prompted for a passphrase just hit enter
cat /etc/ssh/nagmon_rsa.pub *this is the public key file that will be used to authenticate ssh; copy all the text in the file


On the remote (to be monitored) server:
vi /export/home/nagmon/.ssh/authorized_keys *paste the public key text copied from the nagios server into the file, then save and exit the file

chown nagmon /export/home/nagmon/.ssh/authorized_keys *make sure our service account can read the key file

chmod 600 /export/home/nagmon/.ssh/authorized_keys *ssh will reject the connection if the proper permissions are not set on the file

mkdir -p /usr/lib/nagios/plugins *install/copy the nagios plugins to this directory, usually by copying from the nagios server. Make sure the plugins are compiled for the system/processor you are using.
chmod 755 /usr/lib/nagios/plugins/*
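
The ownership and permission steps above are easy to get slightly wrong, and sshd can then refuse the key. As a rough sketch, a small shell function can check them after the fact. The name check_ssh_perms and its output strings are made up for illustration, and it assumes a POSIX find is available on the remote server:

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios or OpenSSH): check the ownership
# and modes that the steps above set up for the monitoring account.
# Usage: check_ssh_perms HOME_DIR OWNER
check_ssh_perms() {
    ssh_dir="$1/.ssh"
    key_file="$ssh_dir/authorized_keys"
    owner=$2
    status=0

    # Nothing else is worth checking if either path is missing.
    [ -d "$ssh_dir" ] || { echo "MISSING $ssh_dir"; return 1; }
    [ -f "$key_file" ] || { echo "MISSING $key_file"; return 1; }

    # find prints the path only when the test after -prune matches.
    for path in "$ssh_dir" "$key_file"; do
        if [ -z "$(find "$path" -prune -user "$owner")" ]; then
            echo "BAD OWNER $path"
            status=1
        fi
    done

    # The directory must not be writable by group or others.
    if [ -n "$(find "$ssh_dir" -prune \( -perm -020 -o -perm -002 \))" ]; then
        echo "BAD MODE $ssh_dir (group/other writable)"
        status=1
    fi

    # The key file should be exactly mode 600, matching the chmod above.
    if [ -z "$(find "$key_file" -prune -perm 600)" ]; then
        echo "BAD MODE $key_file (expected 600)"
        status=1
    fi

    [ "$status" -eq 0 ] && echo "OK"
    return "$status"
}
```

Running check_ssh_perms /export/home/nagmon nagmon should print OK once the steps above are done; any other line names the path that needs fixing.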

*Example Nagios Config (monitoring a remote disk):
Add to the file that contains your command configurations:
define command{
name check_remote_disk
command_name check_remote_disk
command_line /usr/local/nagios/libexec/check_by_ssh -H $HOSTADDRESS$ -2 -C '/usr/lib/nagios/plugins/check_disk -w $ARG2$ -c $ARG3$ -p $ARG1$' -l nagmon -i /etc/ssh/nagmon_rsa
}
Add to the file that contains your service configurations:
define service{
use local-service *this is a template, yours will probably be named differently
host_name
service_description usr partition
check_command check_remote_disk!/usr!25%!10%
}
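
To see how the service's check_command arguments end up in the command_line, here is a small sketch that does the macro substitution by hand. The function name expand_check_remote_disk is made up for illustration, and 192.0.2.10 in the usage note is only a placeholder address; Nagios does this substitution itself, so this is just a way to preview the result:

```shell
#!/bin/sh
# Hypothetical helper: print the command line that results from substituting
# the host address and ARG1..ARG3 into the check_remote_disk definition above.
# Usage: expand_check_remote_disk HOSTADDRESS 'check_command value'
# The body runs in a subshell so the IFS and globbing changes stay local.
expand_check_remote_disk() (
    # Split the check_command value into fields at each exclamation mark.
    set -f
    IFS='!'
    set -- "$1" $2
    # Now: $1 = host address, $2 = command name, $3..$5 = ARG1..ARG3.

    # Same order as the command_line: -w ARG2 -c ARG3 -p ARG1
    printf "%s -H %s -2 -C '%s -w %s -c %s -p %s' -l nagmon -i /etc/ssh/nagmon_rsa\n" \
        /usr/local/nagios/libexec/check_by_ssh "$1" \
        /usr/lib/nagios/plugins/check_disk "$4" "$5" "$3"
)
```

For example, expand_check_remote_disk 192.0.2.10 'check_remote_disk!/usr!25%!10%' prints:
/usr/local/nagios/libexec/check_by_ssh -H 192.0.2.10 -2 -C '/usr/lib/nagios/plugins/check_disk -w 25% -c 10% -p /usr' -l nagmon -i /etc/ssh/nagmon_rsa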
That's it! You can run any of the local-check Nagios plugins over SSH this way and have the results returned to Nagios.
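
Before waiting for the scheduler, it can be handy to run a check by hand from the nagios server, as the same user Nagios runs as. The sketch below uses plain ssh rather than check_by_ssh, and both function names are made up for illustration. The exit status mapping follows the usual plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN); reporting every other status as UNKNOWN is a simplification on my part:

```shell
#!/bin/sh
# Map a plugin exit status to its Nagios state name.
# Any status outside 0-2 is reported as UNKNOWN (simplification).
nagios_state_name() {
    case $1 in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}

# Hypothetical wrapper: run one remote plugin by hand over ssh and report
# the state Nagios would derive from its exit status.
# Usage: run_remote_check HOST 'plugin command line'
run_remote_check() {
    host=$1
    remote_command=$2
    # BatchMode makes ssh fail instead of prompting, which is closer to the
    # non-interactive way the check runs under Nagios.
    ssh -o BatchMode=yes -i /etc/ssh/nagmon_rsa -l nagmon "$host" "$remote_command"
    rc=$?
    echo "state: $(nagios_state_name "$rc") (exit status $rc)"
    return "$rc"
}
```

A call would look like run_remote_check HOST '/usr/lib/nagios/plugins/check_disk -w 25% -c 10% -p /usr', with HOST replaced by the remote server's address. If ssh itself fails (for example because the remote host key has not been accepted yet, or the key file is not readable by the current user), it exits with 255, which shows up here as UNKNOWN.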

Sunday, September 13, 2009

Updating Vendor Certifications

I've been working on getting my certifications up to date, mainly because I see certifications as a good benchmark for keeping your general knowledge current. It's easy to get deep into a specific function of a technology, but certifications make you step back and look at the big picture again.

But I digress. I just re-certified my CCNA (Cisco Certified Network Associate) and it made me think a little. I did the same certification about a decade ago, and frankly the test was significantly easier and covered a lot less material back then. I suspect that Cisco made the CCNA harder as a result of introducing its new entry-level CCENT certification. But as a hiring manager, my impression of the CCNA was clearly off base prior to retaking it this year. We were making this certification a requirement for our entry-level networking folks and I don't think that was appropriate.

So how often do vendors update certifications by making them significantly more difficult, and how do vendors get the word out on these changes to hiring managers? Now I know, but I clearly hadn't gotten this message prior to completing the test myself.


*Example: I still recall being annoyed on my first CCNA that I had learned STP inside and out, and the only question I had on it was something like "what protocol would you use to prevent layer 2 loops". For the recent CCNA exam you had to know VTP, trunking protocols, port trunk modes, STP, RSTP, etc. So there was clearly a lot more content.