NAGIOS

Reference

frg-vt-dev-xen-03-vm-08.dev..... http://doc.nagios-fr.org/3_0/html/ch06s02.html
http://nagios.sourceforge.net/docs/3_0/objectdefinitions.html#host
!! Note: I am using Nagios 2.0 because we run an old CentOS at work
http://doc.fedora-fr.org/wiki/Nagios#Configuration
http://nagios.sourceforge.net/docs/2_0/xodtemplate.html

My complete files for the test are here

Install

Prerequisites

Install dependencies

yum install httpd gcc glibc glibc-common gd gd-devel

User

/usr/sbin/useradd -m nagios
passwd nagios

User group

Create a group for external commands and add the nagios and apache users to it (note that usermod -G replaces a user's existing supplementary groups)
/usr/sbin/groupadd nagcmd
/usr/sbin/usermod -G nagcmd nagios
/usr/sbin/usermod -G nagcmd apache

Setup Nagios webapp

yum

yum install nagios nagios-plugins-nrpe nagios-plugins nagios-plugins-ping
Base plugins used
yum install perl-Nagios-Metrics.noarch nagios-plugins-procs_memory nagios-plugins-procs nagios-plugins-memory nagios-plugins-load nagios-plugins-users nagios-plugins-mounted-disks nagios-plugins-disk
Additional plugin
yum install kelkooMonitoringStatsBucheron.rpm.noarch

Enable http

vi /etc/httpd/conf.d/nagios.conf
:%s/deny from all/allow from all/
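The same substitution can be done without opening vi, which is handy in an install script. This is only a sketch: it works on a temporary stand-in file with assumed sample content, not on the real /etc/httpd/conf.d/nagios.conf.

```shell
# Hypothetical stand-in for /etc/httpd/conf.d/nagios.conf, so nothing real is touched.
conf=$(mktemp)
printf '%s\n' '<Directory "/usr/share/nagios">' \
              '   order deny,allow' \
              '   deny from all' \
              '</Directory>' > "$conf"

# Same substitution as the vi command above, applied to every matching line.
sed 's/deny from all/allow from all/' "$conf" > "$conf.new" && mv "$conf.new" "$conf"

cat "$conf"
```

On the real file, keep a backup copy first and restart httpd afterwards.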

cgi.cfg

Authorize the nagios user
vi /usr/local/nagios/etc/cgi.cfg

SELinux

vi /etc/selinux/config
SELINUX=permissive
(or disabled)

Connect

http://localhost/nagios/index.html
my personal server

Setup Nagios Base

vi /etc/services
nsca 5667/tcp #nagios
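If this step is scripted, the entry should be added only when it is missing, so that running the script twice does not duplicate the line. A minimal sketch, using a temporary stand-in for /etc/services; the function name add_nsca_entry is hypothetical.

```shell
# Hypothetical stand-in for /etc/services.
services=$(mktemp)
printf 'ssh\t\t22/tcp\n' > "$services"

add_nsca_entry() {
    # $1: path to the services file; append the nsca line only if absent
    grep -q '^nsca[[:space:]]' "$1" || printf 'nsca\t\t5667/tcp\t# nagios\n' >> "$1"
}

add_nsca_entry "$services"
add_nsca_entry "$services"   # second call is a no-op
grep -c '^nsca' "$services"  # prints 1
```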

nagios.cfg

File containing the list of configuration files
vi /etc/nagios/nagios.cfg
Uncomment:
cfg_file=/etc/nagios/contactgroups.cfg
cfg_file=/etc/nagios/contacts.cfg
cfg_file=/etc/nagios/hosts.cfg

contacts.cfg

define contact{
	contact_name                    lionel
	host_notifications_enabled      1
	service_notifications_enabled   1
	service_notification_period     24x7
	host_notification_period        24x7
	service_notification_options    w,u,c,r
	host_notification_options       d,u,r
	service_notification_commands   notify-by-email
	host_notification_commands      host-notify-by-email
	email                           lionel.gadille@kelkoo.fr
	can_submit_commands             1
}

contactgroups.cfg

vi /etc/nagios/contactgroups.cfg
define contactgroup{
	contactgroup_name lionel
	alias lionel gadille
	members lionel,oleg
}

Check the configuration

/usr/sbin/nagios -v /etc/nagios/nagios.cfg

Start Nagios

service nagios reload
The localhost entry should now show as OK
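Since a reload with a broken configuration can leave Nagios down, the check and the reload can be chained so that the reload only happens when the check passes. The following is a minimal sketch, assuming the binary and config paths used above; the function name check_and_reload and the two override variables are hypothetical.

```shell
# Paths taken from the commands above; override them to try the function elsewhere.
NAGIOS_BIN=${NAGIOS_BIN:-/usr/sbin/nagios}
NAGIOS_CFG=${NAGIOS_CFG:-/etc/nagios/nagios.cfg}

check_and_reload() {
    # Run the pre-flight check; reload only when it exits 0.
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG" >/dev/null; then
        service nagios reload
    else
        echo "configuration check failed, not reloading" >&2
        return 1
    fi
}
```

Calling check_and_reload after each edit of a .cfg file replaces the two manual steps.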

Adding hosts

New Host Configuration

under /etc/nagios/

hostgroups.cfg

define hostgroup{
	hostgroup_name monreseau
	alias Mon reseau
	members spokeniece
}

hosts.cfg

define host{
        host_name spokeniece
        alias spokeniece
        address 10.76.76.175
        max_check_attempts 2
        check_period 24x7
        contact_groups lionel
        notification_period             24x7
        notification_options            d,u,r
        notification_interval           30
        check_command                   check_host_alive
}
define command{
        command_name                    check_host_alive
        command_line                    $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 1
}
Note the correspondence between members spokeniece (hostgroups.cfg) and host_name.
check_command is the command used to verify that the host is alive.
The command used here is named check_host_alive (a ping in this case).
The host name (arrow 1) can turn red (host unreachable) if check_host_alive fails.
The second arrow corresponds to the active ping test defined below.

services.cfg

define service{  	 
	host_name 	spokeniece
	service_description 	PING
	check_period 	24x7
	max_check_attempts 	3
	normal_check_interval 	5 	; 5 time units (minutes with the default interval_length=60)
	retry_check_interval 	1 	; retry every 1 time unit after a non-OK result
	contact_groups 	lionel
	notification_interval   10 	; in time units (minutes by default), not seconds
	notification_period 	24x7
	notification_options 	c,r
	check_command 	check_ping!100.0,20%!500.0,60%
} 
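In check_command, the `!` characters separate the command name from its arguments, which Nagios exposes to the command definition as $ARG1$, $ARG2$, and so on. The sketch below only illustrates that splitting; split_check_command is a hypothetical helper, and the check_ping command definition that consumes the two arguments (typically as the warning and critical thresholds) is not shown in these notes.

```shell
split_check_command() {
    # $1: a check_command value such as "check_ping!100.0,20%!500.0,60%"
    # Prints the command name, then one "$ARGn$ = value" line per argument.
    old_ifs=$IFS
    IFS='!'
    set -f            # no globbing while the string is split
    set -- $1
    set +f
    IFS=$old_ifs
    printf 'command_name = %s\n' "$1"
    shift
    n=1
    for arg in "$@"; do
        printf '$ARG%s$ = %s\n' "$n" "$arg"
        n=$((n + 1))
    done
}

split_check_command 'check_ping!100.0,20%!500.0,60%'
```

With the service above, this lists check_ping as the command name, 100.0,20% as $ARG1$ and 500.0,60% as $ARG2$.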

Start and result

Check and restart

/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
service nagios reload

Nagios content

In Service Detail the new host should now be listed.
The ping result is updated after the configured interval.

plugins

/usr/lib/nagios/plugins/

Nsca

Nsca is passive: it is the client that must send the messages.
It is a good way to monitor a script that sends messages such as:
ok I am starting
ok I have finished
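A wrapper along the following lines could send those two messages around any command. It is a sketch under assumptions: the function names report and monitored_run are hypothetical, the send_nsca path and positional server argument follow the usage shown later in these notes, and the local hostname is assumed to match the host_name declared in Nagios.

```shell
SEND_NSCA=${SEND_NSCA:-/usr/sbin/send_nsca}
NAGIOS_SERVER=${NAGIOS_SERVER:-localhost}

report() {
    # $1: service_description, $2: status code (0-3), $3: free text
    printf '%s\t%s\t%s\t%s\n' "$(hostname)" "$1" "$2" "$3" | "$SEND_NSCA" "$NAGIOS_SERVER"
}

monitored_run() {
    # $1: service_description, remaining arguments: the command to run
    svc=$1
    shift
    report "$svc" 0 "ok starting"
    if "$@"; then
        report "$svc" 0 "ok finished"
    else
        rc=$?
        report "$svc" 2 "failed with exit code $rc"
        return "$rc"
    fi
}
```

For example, `monitored_run MyJob /path/to/script.pl` (both names are placeholders) reports the start, then either the normal end or a critical status carrying the exit code.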

Nsca Server

Setup

yum install nsca
chkconfig nsca on
/etc/init.d/nsca start

nagios.cfg

Enable freshness checking globally
check_service_freshness=1
# lowest value allowed
service_freshness_check_interval=10
check_external_commands=1
command_check_interval=15s
!! Note: even with a local override, no service will be checked more often than the value set in nagios.cfg

services.cfg

########### Template
define service{
         use                    generic-service
         name                   passive_service
         active_checks_enabled  0
         passive_checks_enabled 1                               ; We want only passive checking
         flap_detection_enabled 0
         register               0                               ; This is a template, not a real service
         is_volatile            0
         check_period           24x7
         max_check_attempts     1
         normal_check_interval  5
         retry_check_interval   1
         check_freshness        0
         contact_groups         admins,lionel
         check_command          check_dummy!0
         notification_interval  10                              ; minutes
         notification_period    24x7
         notification_options   w,u,c,r
         stalking_options       w,c,u
}

define service{
         use                 passive_service
         service_description TestMessage
         host_name           localhost
}
This is a simple service to validate that the nsca daemon is running.
It uses the template above to avoid rewriting everything.

Check/restart

/usr/sbin/nagios -v /etc/nagios/nagios.cfg
service nagios restart
service nsca restart

Send a message

Method 1:
echo -e "localhost\tTestMessage\t1\tTest en echo Warning"|/usr/sbin/send_nsca localhost
1 data packet(s) sent to host successfully.
Method 2: write the test message to a file (test.msg), separating fields with tabs, not spaces:
/usr/sbin/send_nsca localhost < test.msg
1 data packet(s) sent to host successfully.
The service works locally: your Nagios status should change according to the message sent

Explanation of the nsca message

echo -e "localhost\tTestMessage\t0\ttest en echo Pass"|/usr/sbin/send_nsca localhost
localhost = host_name
TestMessage = service_description
0, 1, 2 = OK, warning, critical (3 = unknown)
test en echo Pass = free text displayed in the status
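Because a space typed instead of a tab is easy to miss, a message file can be checked before it is piped to send_nsca. The sketch below assumes the four-field layout described above; check_nsca_file is a hypothetical helper and the file is a temporary one.

```shell
check_nsca_file() {
    # $1: path to a message file; succeeds only if every line has
    # 4 tab-separated fields and a status code between 0 and 3.
    awk -F '\t' '
        NF != 4 || $3 !~ /^[0-3]$/ { printf "line %d is malformed\n", NR; bad = 1 }
        END { exit bad }
    ' "$1"
}

# Build the message with printf so the separators are real tabs.
msg=$(mktemp)
printf '%s\t%s\t%s\t%s\n' localhost TestMessage 0 'test en echo Pass' > "$msg"
check_nsca_file "$msg" && echo "message file looks valid"
```

Using printf rather than echo -e also avoids depending on how a given shell's echo treats escape sequences.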

Nsca client/server

Nsca Server

services.cfg

define service{
         use                    generic-service
         name                   stat
         active_checks_enabled  0
         passive_checks_enabled 1                               ; We want only passive checking
         flap_detection_enabled 0
         register               0                               ; This is a template, not a real service
         is_volatile            0
         check_period           24x7
         max_check_attempts     1
         normal_check_interval  5
         retry_check_interval   1
         check_freshness        1
         freshness_threshold    300                             ; seconds (PROD: 3600*30 = 108000)
         contact_groups         admins,lionel
         check_command          check_dummy!0
         notification_interval  120                             ; minutes
         notification_period    24x7
         notification_options   w,u,c,r
         stalking_options       w,c,u
}
define service{
         use                 stat
         service_description kog_tar_ftp_ImpressionLog_fr
         host_name           spokeniece
}
define service{
         use                 stat
         service_description kog_tar_ftp_OfferLeadsLog_fr
         host_name           spokeniece
}
define command{
        command_name    criticalOnLateServiceStatus
        command_line    echo 'No response for the passive service' && exit 2
}

define command{
        command_name    warningOnLateServiceStatus
        command_line    echo 'No response for the passive service' && exit 1
}

define command{
        command_name    check_dummy
        command_line    echo 'No response for the passive service' && exit 3
}
This example defines two additional entries for spokeniece.
Compared with the previous example, I changed check_freshness to 1
and set freshness_threshold to 300.
This means that if no message has arrived after 60*5=300 seconds, the check_command is called.
In production I will set 30 hours, so if my Perl script does not send its message every day, production is alerted.
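The threshold arithmetic and the idea behind the freshness test can be sketched as follows. This only illustrates the rule described above (a result older than freshness_threshold is considered stale); it is not Nagios's actual implementation, and both function names are hypothetical.

```shell
hours_to_seconds() {
    # $1: number of hours; prints the equivalent in seconds
    echo $(( $1 * 3600 ))
}

is_stale() {
    # $1: time of the last passive result, $2: current time,
    # $3: freshness_threshold (all in seconds)
    [ $(( $2 - $1 )) -gt "$3" ]
}

hours_to_seconds 30                      # production value: prints 108000
is_stale 0 301 300 && echo "stale: the check_command would be run"
```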

Nsca Client

yum install nsca-client nsca-split
Send a message
echo -e "spokeniece\tkog_tar_ftp_ImpressionLog_fr\t1\ttest en echo"|sudo /usr/sbin/send_nsca raptor
raptor is the name of my Nagios server
Mind the sudo: see the sudoers page on my site

Nrpe Client

chkconfig nrpe on
/etc/init.d/nrpe start
http://wiki.corp.gadille.free.fr/foswiki/EUMarketPlaceEngineering/DC_Monitoring