Nagios in Practice: Monitoring a Production Redis Cluster
2020-11-09 14:40:02 Editor: 小采



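Preface: I previously set up Redis performance graphs in Cacti, which show how Redis performance trends over time, but real-time alerting was still missing. This article fills that gap: when the Redis service hits a bottleneck or misbehaves, an alert is sent promptly so the DBA can respond right away.
1. Download the Redis monitoring plugin

Redis is already installed on the servers, so monitoring can be set up directly. For installing the Redis cluster, see http://blog.itpub.net/26230597/viewspace-1145831/. The plugin can be downloaded from http://download.csdn.net/detail/mchdba/8023351. Two versions are available, one written in Perl and one in PHP. Either works; this article uses the Perl script.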

2. Grant execute permission

Copy check_redis.php and check_redis.pl to /usr/lib/nagios/plugins/, then hand them to the nagios user and make them executable:

[root@wgq_41 plugins]# cd /usr/lib/nagios/plugins/

[root@wgq_41 plugins]# chown nagios:nagios check_redis.*

[root@wgq_41 plugins]# chmod 750 check_redis.*
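
Nagios runs plugins as the nagios user, so it is worth confirming that the ownership and mode really ended up as intended. The following is a minimal sketch, assuming GNU stat is available; the helper name plugin_perms_ok is made up for this illustration and is not part of Nagios or the plugin.

```shell
# Hypothetical helper: succeed only if FILE exists and has the expected
# owner name and octal mode. Relies on GNU stat (coreutils).
plugin_perms_ok() {
    file=$1; want_owner=$2; want_mode=$3
    [ -f "$file" ] || return 1
    [ "$(stat -c '%U' "$file")" = "$want_owner" ] || return 1
    [ "$(stat -c '%a' "$file")" = "$want_mode" ] || return 1
}

# Usage on the Nagios server:
#   plugin_perms_ok /usr/lib/nagios/plugins/check_redis.pl nagios 750 \
#       && echo "permissions OK"
```

With mode 750, only the owner and the owning group can run the script, which is enough as long as Nagios executes checks as nagios.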

3. Define the monitoring command

[root@wgq objects]# vim /usr/local/nagios/etc/objects/commands.cfg

# added by tim on 20141010, for redis

# check redis

define command {

command_name check_redis

command_line /usr/lib/nagios/plugins/check_redis.pl -H $HOSTADDRESS$ -p $ARG1$ -a $ARG2$ -w $ARG3$ -c $ARG4$ -f

}
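
To see how the $ARGn$ macros in this command_line get filled in, it can help to expand one check_command value by hand. The sketch below imitates that substitution in plain shell: the fields after the command name, separated by "!", become $ARG1$ through $ARG4$. The function name expand_check_redis and the address 192.0.2.10 are placeholders for illustration only; Nagios does this expansion internally.

```shell
# Hypothetical helper: print the command line that results from the
# check_redis command_line template for a host address and a
# check_command value of the form name!ARG1!ARG2!ARG3!ARG4.
# The body runs in a subshell so the IFS and "set -f" changes do not leak.
expand_check_redis() (
    address=$1
    spec=$2
    set -f          # no filename expansion while splitting
    IFS='!'
    set -- $spec    # $1 = command name, $2..$5 = $ARG1$..$ARG4$
    printf '%s\n' "/usr/lib/nagios/plugins/check_redis.pl -H $address -p $2 -a $3 -w $4 -c $5 -f"
)

# Run from a script (interactive bash treats "!" inside double quotes
# as history expansion):
expand_check_redis 192.0.2.10 "check_redis!6379!'connected_clients,blocked_clients'!200,50!600,150"
# prints: /usr/lib/nagios/plugins/check_redis.pl -H 192.0.2.10 -p 6379 -a 'connected_clients,blocked_clients' -w 200,50 -c 600,150 -f
```

The -a, -w and -c lists are positional: the first warning and critical values apply to the first attribute, the second to the second, and so on, so the three lists should have the same number of entries.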

4. Define the Redis hosts to monitor

[root@wgq etc]# vim /usr/local/nagios/etc/hosts.cfg

# No.018,redis master server

define host{

use linux-server

host_name cache-1

alias cache-1

address 10.xxx.3.x0

check_command check-host-alive

max_check_attempts 5

check_period 24x7

contact_groups ops

notification_interval 30

notification_period 24x7

notification_options d,u,r

}

# No.020 cache-3 redis slave server

define host{

use linux-server

host_name cache-3

alias cache-3

address 10.xx.3.x2

check_command check-host-alive

max_check_attempts 5

check_period 24x7

contact_groups ops

notification_interval 30

notification_period 24x7

notification_options d,u,r

}

5. Define the Redis host group

define hostgroup {

hostgroup_name Redis_Servers

alias Redisservices

members cache-1,cache-3

}
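
A host group may only list hosts that have their own "define host" block; a member name without a matching host_name makes the Nagios configuration check fail. As a rough pre-check, assuming host and host group definitions live in files that contain no service definitions, something like the following could flag such names. The function undefined_hostgroup_members is a made-up helper, not a Nagios tool.

```shell
# Hypothetical helper: print every hostgroup member that has no matching
# "host_name" line in the given object files (one name per line, sorted).
# Pass only files with host/hostgroup definitions; service definitions
# also use host_name and would hide missing hosts.
undefined_hostgroup_members() {
    awk '
        $1 == "host_name" { defined[$2] = 1 }
        $1 == "members" {
            line = $0
            sub(/^[ \t]*members[ \t]+/, "", line)   # drop the directive name
            gsub(/[ \t]/, "", line)                 # tolerate "a, b" spacing
            n = split(line, m, ",")
            for (i = 1; i <= n; i++) wanted[m[i]] = 1
        }
        END { for (h in wanted) if (!(h in defined)) print h }
    ' "$@" | sort
}

# Usage: undefined_hostgroup_members /usr/local/nagios/etc/hosts.cfg
# No output means every member is defined.
```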

6. Define the Redis service checks

[root@wgq objects]# vim /usr/local/nagios/etc/objects/services_redis.cfg

# Redis Master checks

define service {

host_name cache-1

servicegroups Redisservices

service_description Redis Master Clients

check_command check_redis!6379!'connected_clients,blocked_clients,client_longest_output_list,client_biggest_input_buf'!200,50,~,~!600,150,~,~

max_check_attempts 5

normal_check_interval 3

retry_check_interval 2

check_period 24x7

notification_interval 10

notification_period 24x7

notification_options w,u,c,r

contact_groups ops

}

define service {

host_name cache-1

servicegroups Redisservices

service_description Redis Master Memory

check_command check_redis!6379!'used_memory_human,used_memory_peak_human'!~,~!~,~

max_check_attempts 5

normal_check_interval 3

retry_check_interval 2

check_period 24x7

notification_interval 10

notification_period 24x7

notification_options w,u,c,r

contact_groups ops

}

define service {

host_name cache-1

servicegroups Redisservices

service_description Redis Master CPU

check_command check_redis!6379!'used_cpu_sys,used_cpu_user,used_cpu_sys_children,used_cpu_user_children'!~,~,~,~!~,~,~,~ ; no alert thresholds defined

max_check_attempts 5

normal_check_interval 3

retry_check_interval 2

check_period 24x7

notification_interval 10

notification_period 24x7

notification_options w,u,c,r

contact_groups ops

}

# Redis Slave checks

define service {

host_name cache-3

servicegroups Redisservices

service_description Redis Slave Clients

check_command check_redis!6379!'connected_clients,blocked_clients,client_longest_output_list,client_biggest_input_buf'!200,50,~,~!600,150,~,~

max_check_attempts 5

normal_check_interval 3

retry_check_interval 2

check_period 24x7

notification_interval 10

notification_period 24x7

notification_options w,u,c,r

contact_groups ops

}

define service {

host_name cache-3

servicegroups Redisservices

service_description Redis Slave Memory

check_command check_redis!6379!'used_memory_human,used_memory_peak_human'!~,~!~,~

max_check_attempts 5

normal_check_interval 3

retry_check_interval 2

check_period 24x7

notification_interval 10

notification_period 24x7

notification_options w,u,c,r

contact_groups ops

}

define service {

host_name cache-3

servicegroups Redisservices

service_description Redis Slave CPU

check_command check_redis!6379!'used_cpu_sys,used_cpu_user,used_cpu_sys_children,used_cpu_user_children'!~,~,~,~!~,~,~,~ ; no alert thresholds defined

max_check_attempts 5

normal_check_interval 3

retry_check_interval 2

check_period 24x7

notification_interval 10

notification_period 24x7

notification_options w,u,c,r

contact_groups ops

}
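
In the service definitions above, each position in the warning and critical lists belongs to the attribute at the same position, and "~" means that no threshold is set for it. The sketch below models how a single attribute would be judged under the simplest reading, where a plain number means "alert when the value is greater". It is a simplified illustration, not the plugin's own code, and eval_threshold is a made-up name; the return values follow the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL).

```shell
# Simplified model of one attribute check with integer values:
# "~" disables a threshold, a plain number alerts when the value exceeds it.
# Prints the state name and returns the matching plugin exit code.
eval_threshold() {
    value=$1; warn=$2; crit=$3
    if [ "$crit" != "~" ] && [ "$value" -gt "$crit" ]; then
        echo CRITICAL; return 2
    fi
    if [ "$warn" != "~" ] && [ "$value" -gt "$warn" ]; then
        echo WARNING; return 1
    fi
    echo OK; return 0
}

# connected_clients with the thresholds used above (warn 200, crit 600):
#   eval_threshold 35 200 600     prints OK
#   eval_threshold 250 200 600    prints WARNING
#   eval_threshold 700 200 600    prints CRITICAL
# An attribute with no thresholds never alerts (quote the tilde):
#   eval_threshold 700 '~' '~'    prints OK
```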

Give the nagios user ownership of the file; a configuration file needs to be readable, not executable

[root@wgq objects]# chown nagios:nagios services_redis.cfg

[root@wgq objects]# chmod 644 services_redis.cfg

Add the new service file to nagios.cfg

[root@wgq etc]# vim /usr/local/nagios/etc/nagios.cfg

cfg_file=/usr/local/nagios/etc/objects/services_redis.cfg
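
Before reloading, it can be useful to confirm that the cfg_file line is really active and that the whole configuration still passes the pre-flight check. A small sketch follows; has_cfg_file is a hypothetical helper, and the path /usr/local/nagios/bin/nagios is an assumption based on the /usr/local/nagios prefix used in this article.

```shell
# Hypothetical helper: succeed if MAIN_CFG contains an uncommented
# "cfg_file=OBJECT_FILE" line (exact, whole-line match).
has_cfg_file() {
    main_cfg=$1; object_file=$2
    grep -qxF "cfg_file=$object_file" "$main_cfg"
}

# Usage on the Nagios server:
#   has_cfg_file /usr/local/nagios/etc/nagios.cfg \
#       /usr/local/nagios/etc/objects/services_redis.cfg || echo "not included"
#
# Then let Nagios verify the full configuration (binary path assumed):
#   /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
```

The -v run reports objects that reference undefined hosts, groups or commands, which is easier to read than a failed reload.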

7. Test the Redis check

Run /usr/lib/nagios/plugins/check_redis.pl -H cache-1 -a 'connected_clients,blocked_clients' -w ~,~ -c ~,~ -m -M 4G -A -R -T to confirm that the Redis check works:

[root@wgq plugins]# /usr/lib/nagios/plugins/check_redis.pl -H 10.2xx.3.x0 -a 'connected_clients,blocked_clients' -w ~,~ -c ~,~ -m -M 4G -A -R -T

OK: REDIS 2.8.8 on 10.2xx.3.x0:6379 has 1 databases (db0) with 28497 keys, up 76 days 2 hours - response in 0.004s, hitrate is 12.83%, memory use is 194.14M (peak 205.14M, 6.49% of max, fragmentation 1.37%), connected_clients is 35, blocked_clients is 11 | redis_build_id=d322d411218ade61 total_connections_received=341191c used_memory_lua=33792 aof_rewrite_buffer_length=0 used_memory_rss=278749184B redis_git_dirty=0 loading=0 redis_mode=standalone latest_fork_usec=5588 repl_backlog_first_byte_offset=0 sync_partial_ok=0 master_repl_offset=0 uptime_in_days=76c aof_rewrite_scheduled=0 lru_clock=39276 rdb_bgsave_in_progress=0 rejected_connections=0 repl_backlog_active=0 aof_delayed_fsync=1 sync_full=0 process_id=7776 used_memory_human=194.14M aof_current_rewrite_time_sec=-1 used_memory=203570960 aof_enabled=1 blocked_clients=11 aof_last_bgrewrite_status=ok aof_rewrite_in_progress=0 sync_partial_err=0 used_cpu_sys_children=2222.75 connected_slaves=0 repl_backlog_histlen=0 uptime_in_seconds=6576292c repl_backlog_size=1048576 os=Linux 2.6.32-358.el6.x86_ x86_ used_cpu_sys=320.80 aof_pending_bio_fsync=0 connected_clients=35 rdb_last_bgsave_time_sec=1 used_memory_peak_human=205.14M run_id=d1fc098d26fa4bbcef3eabeec6d19a858f03dd00 rdb_last_bgsave_status=ok pubsub_patterns=8 client_biggest_input_buf=0 keyspace_hits=421756c rdb_last_save_time=1412935342 rdb_changes_since_last_save=318 db0_keys=28497 db0_expires=7 db0_avg_ttl=34003 aof_pending_rewrite=0 aof_buffer_length=0 config_file=/usr/local/redis-2.8.8/etc/redis.conf pubsub_channels=0 used_cpu_user_children=21375.34 hz=10 aof_last_rewrite_time_sec=2 aof_last_write_status=ok aof_base_size=82883253 used_cpu_user=18460.42 keyspace_misses=286602797c tcp_port=6379 total_commands_processed=797581196c mem_fragmentation_ratio=1.37 aof_current_size=1485850 rdb_current_bgsave_time_sec=-1 client_longest_output_list=0 instantaneous_ops_per_sec=114 evicted_keys=0c used_memory_peak=215106272B expired_keys=577c total_keys=28497 
total_expires=7 response_time=0.003802s hitrate=12.8281% memory_utilization=6.49013519287109%

[root@wgq plugins]#
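
The plugin output follows the usual Nagios layout: a human-readable status text, then "|", then performance data as space-separated label=value pairs. When only one figure is of interest, it can be pulled out of that long line with standard tools. The helper below is a sketch with a made-up name (perf_value); it returns the raw value including any unit suffix such as "c" or "B".

```shell
# Hypothetical helper: read plugin output on standard input and print the
# value of one performance-data label (the part after "|").
perf_value() {
    label=$1
    sed 's/^[^|]*|//' \
        | tr ' ' '\n' \
        | awk -F= -v want="$label" '$1 == want { print $2; exit }'
}

# Usage:
#   /usr/lib/nagios/plugins/check_redis.pl -H cache-1 \
#       -a 'connected_clients,blocked_clients' -w ~,~ -c ~,~ -A \
#       | perf_value connected_clients
```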

8. Check the Redis service status

First reload Nagios so that the newly added Redis configuration takes effect

[root@wgq objects]# service nagios reload

Running configuration check...

Reloading nagios configuration...

done

[root@wgq objects]#

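The Redis monitoring services in the Nagios web interface are shown in the figure below:

[Figure: Redis service status page in the Nagios web interface]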


9. Errors encountered along the way and how they were resolved

Error:

[root@wgq_line_cache_3_41 plugins]# ./check_redis.pl --help

Can't locate Redis.pm in @INC (@INC contains: /usr/local/lib/perl5 /usr/local/share/perl5 /usr/lib/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib/perl5 /usr/share/perl5 .) at ./check_redis.pl line 421.

BEGIN failed--compilation aborted at ./check_redis.pl line 421.

[root@wgq_line_cache_3_41 plugins]#

[root@wgq_line_cache_3_41 plugins]# perl -MCPAN -e shell

Terminal does not support AddHistory.

cpan shell -- CPAN exploration and modules installation (v1.9402)

Enter 'h' for help.

cpan[1]> install Redis

Can't locate Module/Build/Tiny.pm in @INC (@INC contains: /usr/local/lib/perl5 /usr/local/share/perl5 /usr/lib/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib/perl5 /usr/share/perl5 .) at Build.PL line 2.

BEGIN failed--compilation aborted at Build.PL line 2.

Warning: No success on command[/usr/bin/perl Build.PL --installdirs site]

Warning (usually harmless): 'YAML' not installed, will not store persistent state

DAMS/Redis-1.976.tar.gz

/usr/bin/perl Build.PL --installdirs site -- NOT OK

Running Build test

Make had some problems, won't test

Running Build install

Make had some problems, won't install

Could not read '/root/.cpan/build/Redis-1.976-Zhz6xI/META.yml'. Falling back to other methods to determine prerequisites……

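YAML is a data-centric markup language that uses ASCII characters (hyphens, question marks, colons, commas and so on) to structure data blocks (scalar values or hashes). Like XML, YAML is machine-readable and works with many scripting languages, Perl among them. Install YAML as follows:

cpan[2]> install YAML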

……

Appending installation info to /usr/lib/perl5/perllocal.pod

INGY/YAML-1.12.tar.gz

/usr/bin/make install -- OK

CPAN: YAML loaded ok (v1.12)

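PS: This installation may fail because of network problems; re-running install YAML a few times usually gets it through.

Then run install Redis again, which prints the following: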

cpan[4]> install Redis

Running install for module 'Redis'

Running Build for D/DA/DAMS/Redis-1.976.tar.gz

Has already been unwrapped into directory /root/.cpan/build/Redis-1.976-cUL4rt

'/usr/bin/perl Build.PL --installdirs site' returned status 512, won't make

Running Build test

Make had some problems, won't test

Running Build install

Make had some problems, won't install

cpan[5]>

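The build still fails because Build.PL cannot load Module::Build::Tiny (the "Can't locate Module/Build/Tiny.pm" error shown earlier), so install that module first:

cpan[5]> install Module::Build::Tiny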

Once that succeeds, run install Redis again

cpan[6]> install Redis

The Redis module now installs successfully.
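
To confirm from the shell that the modules involved here can actually be loaded, the common "perl -MModule -e 1" idiom is enough. A minimal sketch, where perl_has_module is a made-up wrapper name:

```shell
# Succeed if the named Perl module can be loaded by the system perl.
perl_has_module() {
    perl -M"$1" -e 1 >/dev/null 2>&1
}

# Usage: report which of the modules from this section are still missing.
#   for m in Module::Build::Tiny YAML Redis; do
#       perl_has_module "$m" || echo "missing: $m"
#   done
```

Once Redis loads this way, ./check_redis.pl --help should no longer stop at the "Can't locate Redis.pm" error.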

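<All rights reserved. This article may be reproduced, but only with a link to the original source; otherwise legal action will be taken.>

Reference: http://exchange.nagios.org/directory/Plugins/Databases/check_redis-2Epl/details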
