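Nagios is a free, open-source network monitoring tool. It monitors the status of Windows, Linux, and Unix hosts, network devices such as switches and routers, printers, and more. When a system or service enters an abnormal state, it alerts the operations staff by email or SMS right away, and it sends a recovery notification once the state returns to normal.
Nagios official website: http://www.nagios.org/
The official documentation already explains the installation clearly: http://nagios.sourceforge.net/docs/3_0/quickstart-fedora.html
I installed the latest stable versions available at the time of writing: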
nagios-4.0.8
nagios-plugins-2.0.3
Installation steps:
Preparation before installation
Work on the server as root. The following software is required:
Apache
PHP
GCC compiler
GD development libraries
All of these can be installed with yum:
yum install httpd php
yum install gcc glibc glibc-common
yum install gd gd-devel
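Before continuing, it can help to confirm that the prerequisites are actually on the PATH. The following is a minimal sketch; check_prereqs is a helper name I made up for illustration, not part of Nagios or yum.

```shell
#!/bin/sh
# Hypothetical helper: report which of the given commands are missing from PATH.
# Returns 0 when every command is found, 1 otherwise.
check_prereqs() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing: $cmd"
            missing=1
        fi
    done
    return "$missing"
}

# Usage (as root, so that /usr/sbin is on the PATH):
#   check_prereqs httpd php gcc
```

The GD libraries are not commands, so they are not covered by this check; rpm -q gd gd-devel can be used for those.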
1. Create the user and groups
useradd -m nagios
passwd nagios
Create a new nagcmd group for allowing external commands to be submitted through the web interface. Add both the nagios user and the apache user to the group.
/usr/sbin/groupadd nagcmd
/usr/sbin/usermod -a -G nagcmd nagios
/usr/sbin/usermod -a -G nagcmd apache
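To confirm that both accounts ended up in the nagcmd group, something like the sketch below can be used. The in_group function is a hypothetical helper written for this article; it only relies on the standard id, tr, and grep commands.

```shell
#!/bin/sh
# Hypothetical helper: succeed if user $1 belongs to group $2.
in_group() {
    # id -nG prints the user's group names separated by spaces;
    # put one name per line and look for an exact match.
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qxF "$2"
}

# Usage (on the Nagios server):
#   in_group nagios nagcmd && in_group apache nagcmd && echo "group setup OK"
```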
2. Download Nagios and the Nagios plugins
Create the directory /software and put the downloaded files there.
Download from the official site: http://www.nagios.org/download
Local copies:
nagios-4.0.8.tar.gz
nagios-plugins-2.0.3.tar.gz
3. Compile and install Nagios
# cd /software
# tar -zxvf nagios-4.0.8.tar.gz
# cd nagios-4.0.8
Run the Nagios configure script, passing the name of the group you created earlier like so:
./configure --with-command-group=nagcmd
Compile the Nagios source code.
make all
Install binaries, init script, sample config files and set permissions on the external command directory.
make install
make install-init
make install-config
make install-commandmode
Don't start Nagios yet - there's still more that needs to be done...
4. Customize Configuration
Sample configuration files have now been installed in the /usr/local/nagios/etc directory. These sample files should work fine for getting started with Nagios. You'll need to make just one change before you proceed...
Edit the /usr/local/nagios/etc/objects/contacts.cfg config file with your favorite editor and change the email address associated with the nagiosadmin contact definition to the address you'd like to use for receiving alerts.
vi /usr/local/nagios/etc/objects/contacts.cfg
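For reference, the nagiosadmin contact in the sample contacts.cfg looks roughly like the fragment below, and only the email line needs to change. The address admin@example.com is a placeholder I chose for illustration, not a value from this installation.

```cfg
define contact{
        contact_name    nagiosadmin        ; short name of the contact
        use             generic-contact    ; inherit defaults from the generic-contact template
        alias           Nagios Admin       ; full name
        email           admin@example.com  ; placeholder: the address that should receive alerts
        }
```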
5. Configure the Web Interface
Install the Nagios web config file in the Apache conf.d directory.
make install-webconf
Create a nagiosadmin account for logging into the Nagios web interface. Remember the password you assign to this account - you'll need it later.
htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
(I set the password to 123456 for this walkthrough; use a strong password on a real server.)
Restart Apache to make the new settings take effect.
service httpd restart
Note: Consider implementing the enhanced CGI security measures described in the official Nagios documentation to ensure that your web authentication credentials are not compromised.
6. Install the Nagios plugins
cd /software
tar -zxvf nagios-plugins-2.0.3.tar.gz
cd nagios-plugins-2.0.3
./configure --with-nagios-user=nagios --with-nagios-group=nagios
make
make install
7. Start Nagios
Add Nagios to the services started at boot:
chkconfig --add nagios
chkconfig nagios on
Verify that the Nagios configuration files contain no errors:
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
If there are no errors, start Nagios:
service nagios start
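Since the configuration is edited and the service restarted several times in this article, the "verify, then restart" sequence can be wrapped in a small function. This is only a sketch under the paths used above; restart_if_valid and the NAGIOS_BIN/NAGIOS_CFG variables are names I made up.

```shell
#!/bin/sh
# Paths from this installation; override them in the environment if yours differ.
NAGIOS_BIN=${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}
NAGIOS_CFG=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}

# Hypothetical helper: restart Nagios only when the pre-flight check passes.
restart_if_valid() {
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG" >/dev/null 2>&1; then
        service nagios restart
    else
        echo "configuration check failed; Nagios was not restarted" >&2
        return 1
    fi
}

# Usage: restart_if_valid
```

When the check fails, run the nagios -v command by hand to see the full error report.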
8. Disable SELinux
vim /etc/selinux/config
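Change SELINUX=enforcing to SELINUX=disabled.
The change takes effect after a reboot.
To switch SELinux to permissive mode immediately without rebooting (this setting does not survive a reboot):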
# setenforce 0
9. Log in from a browser
The user name is nagiosadmin and the password is the one set earlier (123456 in my case).
http://localhost/nagios/
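I changed Apache's default port from 80 to 800, so on my server the URL needs :800 after the host name.
After logging in, the web interface lists the items that are monitored on the local host by default.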
10. Nagios configuration files
After installation the Nagios home directory is /usr/local/nagios, and all configuration files are under /usr/local/nagios/etc:
cgi.cfg
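htpasswd.users //credentials for logging in to the web interface, created above
nagios.cfg //the main Nagios configuration file; to monitor other hosts, reference their configuration files here
resource.cfg //defines the location of the Nagios plugins
The objects directory is just as important, since most day-to-day changes are made there. Its default files are:
commands.cfg //command definitions that other configuration files can reference
contacts.cfg //definitions of contacts and contact groups
localhost.cfg //monitoring of the local host
printer.cfg //template for monitoring printers, not enabled by default
switch.cfg //template for monitoring switches and routers, not enabled by default
templates.cfg //host and service templates that other configuration files can reference
timeperiods.cfg //definitions of the time periods Nagios uses for monitoring
windows.cfg //configuration for monitoring Windows hosts, not enabled by default
Monitoring the local host:
Monitoring of the local host is enabled in /usr/local/nagios/etc/nagios.cfg by the following line, which is present by default: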
cfg_file=/usr/local/nagios/etc/objects/localhost.cfg
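By default the local host configuration monitors 8 services: Current Load, Current Users, HTTP, PING, Root Partition, SSH, Swap Usage, and Total Processes.
To add a check for the /data partition on the local host:
vim /usr/local/nagios/etc/objects/localhost.cfg
Append the following at the end of the file: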
define service{
use local-service
host_name localhost
service_description Data Partition
check_command check_local_disk!20%!10%!/data
notifications_enabled 1
}
Save the file and restart the Nagios service; the new check then appears in the web interface.
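The check_command value is split on "!": the first field names a command from commands.cfg, and the remaining fields become $ARG1$, $ARG2$, and so on. As far as I know, the sample commands.cfg shipped with Nagios defines check_local_disk roughly as follows, so 20%, 10%, and /data become the warning threshold, the critical threshold, and the partition to check:

```cfg
define command{
        command_name    check_local_disk
        command_line    $USER1$/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
        }
```

With these arguments the service goes to WARNING when free space on /data drops below 20% and to CRITICAL below 10%. Running /usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /data by hand should produce the same output that Nagios will see.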
Configuring Nagios to send email:
Reference: http://xiaosa.blog.51cto.com/665033/381310/
Download sendEmail: http://caspian.dotconf.net/menu/Software/SendEmail/
Local copy: sendEmail-v1.56.tar.gz
cd /usr/local/
tar -zxvf sendEmail-v1.56.tar.gz
Copy the executable:
cp sendEmail-v1.56/sendEmail /usr/local/bin/
Send a test message:
# sendEmail -f nagios@domain.com -t 7344506@qq.com -s mail.domain.com -u "This is subject" -xu nagios -xp passpwd -m "This is content."
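Options:
-f sender address
-t recipient address
-s host name or IP address of the SMTP server
-u subject of the message
-xu user name for SMTP authentication
-xp password for SMTP authentication (note: some passwords seem to be rejected; for example, d!5neyland was not recognized correctly when I used it)
-m body of the message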
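My guess about the password problem is that it comes from the shell rather than from sendEmail: in an interactive bash session an unquoted "!" triggers history expansion, so the password is altered before sendEmail sees it. If that is the cause, single quotes should keep it intact. The sketch below shows the idea; send_test_mail is a hypothetical helper, and the addresses are the placeholders used above.

```shell
#!/bin/sh
# Single quotes keep "!" and other special characters literal.
SMTP_PASS='d!5neyland'

# Hypothetical helper: send a test message to the address given as $1.
send_test_mail() {
    sendEmail -f nagios@domain.com -t "$1" -s mail.domain.com \
        -u "This is subject" -m "This is content." \
        -xu nagios -xp "$SMTP_PASS"
}

# Usage: send_test_mail someone@example.com
```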
Edit commands.cfg:
vim /usr/local/nagios/etc/objects/commands.cfg
# 'notify-host-by-email' command definition
define command{
command_name notify-host-by-email
command_line /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n" | /usr/local/bin/sendEmail -f nagios@domain.com -t $CONTACTEMAIL$ -s mail.domain.com -u "** $NOTIFICATIONTYPE$ Host Alert: $HOSTNAME$ is $HOSTSTATE$ **" -xu nagios -xp password
}
# 'notify-service-by-email' command definition
define command{
command_name notify-service-by-email
command_line /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$\n" | /usr/local/bin/sendEmail -f nagios@domain.com -t $CONTACTEMAIL$ -s mail.domain.com -u "** $NOTIFICATIONTYPE$ Service Alert: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **" -xu nagios -xp password
}
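Save the file and restart the Nagios service.
The following steps are performed on the monitored host:
See this document for reference: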
NRPE.pdf
Create the nagios user and group:
# useradd nagios
Create a directory for the software:
# mkdir /software
# cd /software
Download nagios-plugins-2.0.3.tar.gz and nrpe-2.15.tar.gz:
# wget http://nagios-plugins.org/download/nagios-plugins-2.0.3.tar.gz
# wget -O nrpe-2.15.tar.gz http://sourceforge.net/projects/nagios/files/nrpe-2.x/nrpe-2.15/nrpe-2.15.tar.gz/download
Install the Nagios plugins:
# tar -zxvf nagios-plugins-2.0.3.tar.gz
# cd nagios-plugins-2.0.3
# ./configure
# make && make install
Set ownership:
# chown nagios:nagios /usr/local/nagios/
# chown -R nagios.nagios /usr/local/nagios/libexec/
Install xinetd:
# yum install xinetd
Install NRPE:
# tar -zxvf nrpe-2.15.tar.gz
# cd nrpe-2.15/
# ./configure
# make all
Install the NRPE plugin (for testing), daemon, and sample daemon config file.
# make install-plugin
# make install-daemon
# make install-daemon-config
Install the NRPE daemon as a service under xinetd.
# make install-xinetd
Add the address of the Nagios monitoring server to the only_from line in /etc/xinetd.d/nrpe:
only_from = 127.0.0.1 12.92.117.15
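For context, the /etc/xinetd.d/nrpe file created by make install-xinetd typically looks like the fragment below; the exact contents may differ between NRPE versions. MONITORING_SERVER_IP is a placeholder for the address of the Nagios server, and only_from takes a space-separated list of addresses.

```cfg
# /etc/xinetd.d/nrpe (typical layout; verify against your own copy)
service nrpe
{
        flags           = REUSE
        socket_type     = stream
        port            = 5666
        wait            = no
        user            = nagios
        group           = nagios
        server          = /usr/local/nagios/bin/nrpe
        server_args     = -c /usr/local/nagios/etc/nrpe.cfg --inetd
        log_on_failure  += USERID
        disable         = no
        # placeholder: replace MONITORING_SERVER_IP with the Nagios server's address
        only_from       = 127.0.0.1 MONITORING_SERVER_IP
}
```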
Append the nrpe service to /etc/services:
# echo "nrpe 5666/tcp # nrpe" >> /etc/services
Restart the xinetd service:
# /etc/init.d/xinetd restart
Check that it is listening (output like the line below means it is running):
# netstat -at|grep nrpe
tcp 0 0 *:nrpe *:* LISTEN
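The only_from setting in /etc/xinetd.d/nrpe now allows two addresses to connect to this host: the local address 127.0.0.1 and the address of the monitoring server. Test that both can connect.
On the monitored host: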
# /usr/local/nagios/libexec/check_nrpe -H 127.0.0.1
NRPE v2.15 //version information in the output means the connection works
On the monitoring server:
# /usr/local/nagios/libexec/check_nrpe -H www.iub.com.sg
NRPE v2.15 //version information in the output means the connection works
Open port 5666 on the monitored host:
iptables -I RH-Firewall-1-INPUT -p tcp -m tcp --dport 5666 -j ACCEPT
On the monitoring server:
# cd /usr/local/nagios/etc/objects/
Add the following to commands.cfg:
define command{
command_name check_nrpe
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
Create the file www.domain.com.cfg with the following content:
define host{
use linux-server
host_name www.domain.com
alias www.domain.com
address 115.11.115.222
}
define service{
use generic-service
host_name www.domain.com
service_description CPU Load
check_command check_nrpe!check_load
}
Reference this configuration file in nagios.cfg:
# vim /usr/local/nagios/etc/nagios.cfg
Add:
cfg_file=/usr/local/nagios/etc/objects/www.domain.com.cfg
Restart the Nagios service and check the result in the web interface.
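Note that the check_load argument passed to check_nrpe is not a plugin name but the name of a command defined in nrpe.cfg on the monitored host. Assuming the sample file installed by make install-daemon-config at /usr/local/nagios/etc/nrpe.cfg, the relevant entry looks roughly like this (the thresholds are the sample defaults as I remember them and may differ in your copy):

```cfg
# /usr/local/nagios/etc/nrpe.cfg on the monitored host
# command[<name>]=<command line to run when check_nrpe asks for <name>>
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
```

To test the whole chain by hand, run /usr/local/nagios/libexec/check_nrpe -H 115.11.115.222 -c check_load on the monitoring server. Any further check added to the Nagios side with check_nrpe!<name> needs a matching command[<name>] line in nrpe.cfg on the monitored host.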
To be continued...