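I. Preparation
1. IP address plan
Server 1 hostname: DS1
Server 1 external IP (eth0): 192.168.1.201
Server 1 heartbeat IP (eth1): 192.168.10.201 # used to detect whether server 2 is alive; usually a single cable connects it directly to server 2, so it must be on the same subnet as server 2
Server 2 hostname: DS2
Server 2 external IP (eth0): 192.168.1.202
Server 2 heartbeat IP (eth1): 192.168.10.202 # used to detect whether server 1 is alive; usually a single cable connects it directly to server 1, so it must be on the same subnet as server 1
Virtual IP (VIP): 192.168.1.200 # the address that serves clients; users never see the servers' real external IPs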
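The same-subnet requirement for the two heartbeat addresses can be checked with a small script. This is only a sketch: `same_subnet24` is a hypothetical helper, and it assumes a /24 netmask on the heartbeat link, which the plan above does not state explicitly.

```shell
#!/bin/sh
# Hypothetical helper: succeed when two IPv4 addresses share the same /24 prefix.
# Assumption: the heartbeat link uses a 255.255.255.0 netmask.
same_subnet24() (
    a=${1%.*}   # drop the last octet, e.g. 192.168.10.201 -> 192.168.10
    b=${2%.*}
    [ "$a" = "$b" ]
)

# Heartbeat addresses from the plan above
if same_subnet24 192.168.10.201 192.168.10.202; then
    echo "heartbeat IPs are on the same /24"
else
    echo "heartbeat IPs are on different networks" >&2
fi
```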
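2. On both servers, disable the firewall and install nginx
iptables -F # flush the firewall rules; the firewall and SELinux are disabled only for convenience in this lab, and production systems need proper firewall rules instead
setenforce 0 # put SELinux into permissive mode until the next reboot
yum install -y nginx # install nginx, the service used as the test subject in this lab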
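For production, where flushing the rules is not acceptable, a starting point might look like the sketch below. It only prints candidate iptables commands rather than running them. `heartbeat_fw_rules` is a hypothetical helper; UDP port 694 is the heartbeat port set by udpport in ha.cf, and TCP port 80 assumes nginx listens on its default port.

```shell
#!/bin/sh
# Print (do not run) iptables commands for one node.
# $1 = the peer's heartbeat IP
heartbeat_fw_rules() {
    peer=$1
    # Heartbeat traffic from the peer over the dedicated link (udpport 694)
    echo "iptables -A INPUT -i eth1 -p udp -s $peer --dport 694 -j ACCEPT"
    # Assumption: nginx serves HTTP on its default port 80
    echo "iptables -A INPUT -i eth0 -p tcp --dport 80 -j ACCEPT"
}

heartbeat_fw_rules 192.168.10.202   # candidate rules for DS1
```

Review the printed rules against the existing rule set before applying them; rule order and the default policy matter.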
II. Installing heartbeat
1. Install the EPEL repository:
- [root@DS1 ~]# rpm -ivh 'http://www.lishiming.net/data/attachment/forum/epel-release-6-8_32.noarch.rpm'
2. Install heartbeat and libnet:
- [root@DS1 ~]# yum install -y heartbeat* libnet
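III. Configuring the primary node
1. Add the hostnames to /etc/hosts
- [root@DS1 ~]# vi /etc/hosts
- 192.168.1.201 ds1
- 192.168.1.202 ds2
2. Copy the sample configuration files
- [root@DS1 ~]# cd /usr/share/doc/heartbeat-3.0.4/
- [root@DS1 ~]# cp authkeys ha.cf haresources /etc/ha.d/
3. Configure the authkeys file, which authenticates the two nodes to each other. The authentication method and key must be identical on the primary and standby nodes.
- [root@DS1 ~]# vi /etc/ha.d/authkeys
- auth 3 # uncomment and set it to the index number of the method chosen below
- #1 crc # checksum only, lowest overhead
- #2 sha1 HI! # strongest of the three, highest overhead; "HI!" is the key
- 3 md5 Hello! # uncomment to use md5, which is somewhat weaker than sha1 but also cheaper; "Hello!" is the key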
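The authkeys file holds a shared secret, and heartbeat expects it to be readable by root only (mode 600). The sketch below writes the two active lines and tightens the permissions. `write_authkeys` is a hypothetical helper, and the demo writes to a scratch file rather than /etc/ha.d/authkeys.

```shell
#!/bin/sh
# Write a minimal md5 authkeys file and restrict it to mode 600.
# $1 = target path, $2 = shared key
write_authkeys() (
    path=$1
    secret=$2
    umask 077                                  # no group/other access on creation
    printf 'auth 3\n3 md5 %s\n' "$secret" > "$path"
    chmod 600 "$path"
)

# Demo against a scratch file
demo=$(mktemp)
write_authkeys "$demo" 'Hello!'
cat "$demo"
rm -f "$demo"
```

On a real node the call would be `write_authkeys /etc/ha.d/authkeys 'Hello!'`, run as root on both servers with the same key.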
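4. Configure the haresources file, which defines the cluster resources: the primary node, the VIP, the netmask, the broadcast address, and the services to start
- [root@DS1 ~]# vim /etc/ha.d/haresources
- DS1 192.168.1.200/24/eth0:0 nginx # DS1 is the primary node, the VIP is 192.168.1.200, and the managed service is nginx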
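5. Configure ha.cf, heartbeat's main configuration file
- [root@DS1 ~]# vim /etc/ha.d/ha.cf
- logfile /var/log/ha.log # where heartbeat writes its log
- #bcast eth1 # send heartbeats as Ethernet broadcasts on eth1 (left commented out here)
- ucast eth1 192.168.10.202 # send heartbeats by unicast over eth1 to the peer's heartbeat IP
- keepalive 2 # heartbeat interval: one heartbeat every 2 seconds
- warntime 10 # warning threshold: if no heartbeat arrives from the peer within 10 s, a warning is written to the log
- deadtime 30 # if the standby node receives no heartbeat from the primary within 30 s, it declares the peer dead and immediately takes over its services
- initdead 120 # grace period after boot or reboot, while networking may not be usable yet; set it to at least twice deadtime
- udpport 694 # UDP port used for heartbeat traffic; 694 is the default
- auto_failback on # when on, resources move back to the primary once it recovers
- node DS1 # primary node hostname
- node DS2 # standby node hostname
- ping 192.168.1.1 # ping node used as a tiebreaker to test network connectivity; choose a robust device such as a switch
- respawn hacluster /usr/lib64/heartbeat/ipfail # processes listed here are started together with heartbeat; ipfail monitors network connectivity and runs as user hacluster. The path differs between 32-bit and 64-bit systems, and a wrong path prevents heartbeat from starting
- debugfile /var/log/ha-debug.log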
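The timing directives depend on each other, so a quick sanity check before starting heartbeat can catch typos. The sketch below verifies the rule stated above (initdead at least twice deadtime) and, as an extra assumption of this example, that warntime is shorter than deadtime. `hacf_value` and `check_hacf_timing` are hypothetical helpers, and they only understand plain values in seconds.

```shell
#!/bin/sh
# Read one directive's value from an ha.cf-style file.
# $1 = file, $2 = directive name (e.g. deadtime)
hacf_value() { awk -v key="$2" '$1 == key { print $2; exit }' "$1"; }

# Succeed when warntime < deadtime and initdead >= 2 * deadtime.
check_hacf_timing() (
    warn=$(hacf_value "$1" warntime)
    dead=$(hacf_value "$1" deadtime)
    init=$(hacf_value "$1" initdead)
    # Exit status 2 when any of the three directives is missing
    if [ -z "$warn" ] || [ -z "$dead" ] || [ -z "$init" ]; then
        exit 2
    fi
    [ "$warn" -lt "$dead" ] && [ "$init" -ge $((dead * 2)) ]
)

# Demo with the values used above, written to a scratch file
demo=$(mktemp)
cat > "$demo" <<'EOF'
keepalive 2
warntime 10
deadtime 30
initdead 120
EOF
if check_hacf_timing "$demo"; then
    echo "timing OK"
else
    echo "timing looks wrong"
fi
rm -f "$demo"
```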
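IV. Configuring the standby node
The standby node's configuration is identical to the primary's except for the following line:
- [root@DS2 ~]# vim /etc/ha.d/ha.cf
- ucast eth1 192.168.10.201 # send heartbeats by unicast over eth1 to the peer's heartbeat IP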
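Because only the ucast line differs, the standby node's ha.cf can be derived from the primary's instead of being edited by hand. A minimal sketch, where `hacf_for_peer` is a hypothetical helper that rewrites the address on the ucast line and leaves everything else untouched:

```shell
#!/bin/sh
# Print a copy of an ha.cf file with the ucast peer address replaced.
# $1 = source ha.cf, $2 = heartbeat interface, $3 = new peer heartbeat IP
hacf_for_peer() {
    sed "s/^ucast[[:space:]]\{1,\}$2[[:space:]]\{1,\}[0-9.]\{1,\}/ucast $2 $3/" "$1"
}

# Demo on a scratch file
demo=$(mktemp)
printf 'ucast eth1 192.168.10.202\nnode DS1\nnode DS2\n' > "$demo"
hacf_for_peer "$demo" eth1 192.168.10.201
rm -f "$demo"
```

On DS1 this would be used as `hacf_for_peer /etc/ha.d/ha.cf eth1 192.168.10.201 > ha.cf.ds2`, after which the result is copied to DS2 along with the unchanged authkeys and haresources files.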
V. Testing heartbeat
Start heartbeat on both servers: service heartbeat start
Under normal conditions, ifconfig on DS1 shows an eth0:0 entry, while DS2 has no eth0:0 entry.
- [root@DS1 ~]# ifconfig
- eth0:0 Link encap:Ethernet HWaddr 00:0C:29:8B:5F:54
- inet addr:192.168.1.200 Bcast:192.168.1.255 Mask:255.255.255.0
- UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Stop the heartbeat service on DS1; DS2 then automatically brings up eth0:0.
- [root@DS2 ~]# ifconfig
- eth0:0 Link encap:Ethernet HWaddr 00:0C:29:63:7C:F8
- inet addr:192.168.1.200 Bcast:192.168.1.255 Mask:255.255.255.0
- UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
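The ifconfig checks above can be scripted so that each node reports whether it currently holds the VIP. In this sketch, `holds_vip` is a hypothetical helper that reads interface listings on stdin and looks for the address as a whole word; the sample text is taken from the output above.

```shell
#!/bin/sh
# Succeed when the text on stdin (for example ifconfig output) lists the VIP.
# $1 = virtual IP to look for
holds_vip() {
    # -F: fixed string, -w: whole word, so 192.168.1.20 does not match 192.168.1.200
    grep -qFw -- "$1"
}

# Demo with a fragment of the ifconfig output shown above
sample='eth0:0 Link encap:Ethernet HWaddr 00:0C:29:8B:5F:54
inet addr:192.168.1.200 Bcast:192.168.1.255 Mask:255.255.255.0'
if printf '%s\n' "$sample" | holds_vip 192.168.1.200; then
    echo "VIP present"
fi
```

On a live node the check would be `ifconfig | holds_vip 192.168.1.200`.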
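Now check the log on DS2 (the log path is whatever logfile in ha.cf specifies): DS2 has detected that DS1 shut down and has taken over DS1's services.
- [root@DS2 ~]# cat /var/log/ha-log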
- Nov 10 01:12:36 DS2 heartbeat: [8935]: info: Received shutdown notice from 'ds1'.
- Nov 10 01:12:36 DS2 heartbeat: [8935]: info: Resources being acquired from ds1.
- Nov 10 01:12:36 DS2 heartbeat: [10374]: info: acquire local HA resources (standby).
- Nov 10 01:12:36 DS2 heartbeat: [10375]: info: No local resources [/usr/share/heartbeat/ResourceManager listkeys ds2] to acquire.
- Nov 10 01:12:36 DS2 heartbeat: [10374]: info: local HA resource acquisition completed (standby).
- Nov 10 01:12:36 DS2 heartbeat: [8935]: info: Standby resource acquisition done [foreign].
- harc(default)[10400]: 2014/11/10_01:12:36 info: Running /etc/ha.d//rc.d/status status
- mach_down(default)[10417]: 2014/11/10_01:12:36 info: Taking over resource group 192.168.1.200/24/eth0:0
- ResourceManager(default)[10444]: 2014/11/10_01:12:36 info: Acquiring resource group: ds1 192.168.1.200/24/eth0:0 nginx
- /usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.1.200)[10472]: 2014/11/10_01:12:36 INFO: Resource is stopped
- ResourceManager(default)[10444]: 2014/11/10_01:12:36 info: Running /etc/ha.d/resource.d/IPaddr 192.168.1.200/24/eth0:0 start
- IPaddr(IPaddr_192.168.1.200)[10603]: 2014/11/10_01:12:37 INFO: Adding inet address 192.168.1.200/24 with broadcast address 192.168.1.255 to device eth0 (with label eth0:0)
- IPaddr(IPaddr_192.168.1.200)[10603]: 2014/11/10_01:12:37 INFO: Bringing device eth0 up
- IPaddr(IPaddr_192.168.1.200)[10603]: 2014/11/10_01:12:37 INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.1.200 eth0 192.168.1.200 auto not_used not_used
- /usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.1.200)[10577]: 2014/11/10_01:12:37 INFO: Success
- ResourceManager(default)[10444]: 2014/11/10_01:12:37 info: Running /etc/init.d/nginx start
- mach_down(default)[10417]: 2014/11/10_01:12:37 info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired
- mach_down(default)[10417]: 2014/11/10_01:12:37 info: mach_down takeover complete for node ds1.
- Nov 10 01:12:37 DS2 heartbeat: [8935]: info: mach_down takeover complete.
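Start the heartbeat service on DS1 again and check the log on DS2: DS2 detects that DS1 has started, releases the resources it had taken over, and returns to standby.
- [root@DS2 ~]# cat /var/log/ha-log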
- Nov 10 01:17:40 DS2 heartbeat: [8935]: info: Heartbeat restart on node ds1
- Nov 10 01:17:40 DS2 heartbeat: [8935]: info: Link ds1:eth1 up.
- Nov 10 01:17:40 DS2 heartbeat: [8935]: info: Status update for node ds1: status init
- Nov 10 01:17:40 DS2 heartbeat: [8935]: info: Status update for node ds1: status up
- Nov 10 01:17:40 DS2 ipfail: [8962]: info: Link Status update: Link ds1/eth1 now has status up
- Nov 10 01:17:40 DS2 ipfail: [8962]: info: Status update: Node ds1 now has status init
- Nov 10 01:17:40 DS2 ipfail: [8962]: info: Status update: Node ds1 now has status up
- harc(default)[10748]: 2014/11/10_01:17:40 info: Running /etc/ha.d//rc.d/status status
- harc(default)[10765]: 2014/11/10_01:17:40 info: Running /etc/ha.d//rc.d/status status
- Nov 10 01:17:42 DS2 heartbeat: [8935]: info: Status update for node ds1: status active
- Nov 10 01:17:42 DS2 ipfail: [8962]: info: Status update: Node ds1 now has status active
- harc(default)[10782]: 2014/11/10_01:17:42 info: Running /etc/ha.d//rc.d/status status
- Nov 10 01:17:42 DS2 heartbeat: [8935]: info: remote resource transition completed.
- Nov 10 01:17:42 DS2 heartbeat: [8935]: info: ds2 wants to go standby [foreign]
- Nov 10 01:17:43 DS2 heartbeat: [8935]: info: standby: ds1 can take our foreign resources
- Nov 10 01:17:43 DS2 heartbeat: [10799]: info: give up foreign HA resources (standby).
- ResourceManager(default)[10812]: 2014/11/10_01:17:43 info: Releasing resource group: ds1 192.168.1.200/24/eth0:0 nginx
- ResourceManager(default)[10812]: 2014/11/10_01:17:43 info: Running /etc/init.d/nginx stop
- Nov 10 01:17:43 DS2 ipfail: [8962]: info: Asking other side for ping node count.
- ResourceManager(default)[10812]: 2014/11/10_01:17:43 info: Running /etc/ha.d/resource.d/IPaddr 192.168.1.200/24/eth0:0 stop
- IPaddr(IPaddr_192.168.1.200)[10898]: 2014/11/10_01:17:43 INFO: IP status = ok, IP_CIP=
- /usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.1.200)[10872]: 2014/11/10_01:17:43 INFO: Success
- Nov 10 01:17:43 DS2 heartbeat: [10799]: info: foreign HA resource release completed (standby).
- Nov 10 01:17:43 DS2 heartbeat: [8935]: info: Local standby process completed [foreign].
- Nov 10 01:17:43 DS2 heartbeat: [8935]: WARN: 1 lost packet(s) for [ds1] [11:13]
- Nov 10 01:17:43 DS2 heartbeat: [8935]: info: remote resource transition completed.
- Nov 10 01:17:43 DS2 heartbeat: [8935]: info: No pkts missing from ds1!
- Nov 10 01:17:43 DS2 heartbeat: [8935]: info: Other node completed standby takeover of foreign resources.
- Nov 10 01:17:52 DS2 ipfail: [8962]: info: No giveup timer to abort.
- Nov 10 01:17:57 DS2 heartbeat: [8935]: info: ds1 wants to go standby [foreign]
- Nov 10 01:17:57 DS2 heartbeat: [8935]: info: standby: acquire [foreign] resources from ds1
- Nov 10 01:17:57 DS2 heartbeat: [10966]: info: acquire local HA resources (standby).
- Nov 10 01:17:57 DS2 heartbeat: [10966]: info: local HA resource acquisition completed (standby).
- Nov 10 01:17:57 DS2 heartbeat: [8935]: info: Standby resource acquisition done [foreign].
- Nov 10 01:17:58 DS2 heartbeat: [8935]: info: remote resource transition completed.
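The two log excerpts above are long, but the takeover and the failback each reduce to one ResourceManager line: "Acquiring resource group:" or "Releasing resource group:". The sketch below reports the most recent of the two. `last_resource_event` is a hypothetical helper, and it relies only on those two message strings as they appear in the logs above.

```shell
#!/bin/sh
# Print "acquired", "released", or "unknown" for the most recent
# resource-group event in a heartbeat log supplied on stdin.
last_resource_event() {
    awk '
        /Acquiring resource group:/ { state = "acquired" }
        /Releasing resource group:/ { state = "released" }
        END { print (state == "" ? "unknown" : state) }
    '
}

# Demo with two lines shaped like the log entries above
printf '%s\n' \
    'ResourceManager(default)[10444]: info: Acquiring resource group: ds1 192.168.1.200/24/eth0:0 nginx' \
    'ResourceManager(default)[10812]: info: Releasing resource group: ds1 192.168.1.200/24/eth0:0 nginx' \
    | last_resource_event
```

On DS2 the helper would read the real log, for example `last_resource_event < /var/log/ha-log`.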