corosync + pacemaker installation and configuration for httpd high availability

Latest recommended article published on 2022-10-16 17:35:29

donghaixiaolongwang


Article tags: high-availability cluster basics

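corosync is a Messaging Layer. Combined with pacemaker, it is used on many Linux distributions to make services highly available. For the history of corosync, a quick web search is enough. If the principles behind high-availability clusters are unclear, see my earlier summary article.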

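## Making httpd highly available

Plan:

Machine 1: IP address = 172.16.100.7

Machine 2: IP address = 172.16.100.2

Machine 3: time server, IP = 172.16.0.1

We assume 172.16.100.1 is the address that serves clients, i.e. the VIP.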

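1. Install the web service on both machines and put a test page in the document root

yum install httpd -y

chkconfig httpd off ## Disable httpd autostart at boot.

cd /var/www/html

vim 1.html ### To make the test result visible, create a page with the same name, 1.html, but different content in the web document root of each machine. Any content will do.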

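2. Install corosync and pacemaker, plus crmsh, the command-line tool used here to configure the high-availability cluster.

yum info corosync ## Check whether corosync is installed

yum install corosync ## Install corosync

yum info pacemaker ## Check whether pacemaker is installed

yum install pacemaker ## Install pacemaker

vim /etc/yum.repos.d/HA.repo ## Add the yum repository that provides crmsh. The download may be slow; if it times out, retry the install.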

[network_ha-clustering_Stable]
name=Stable High Availability/Clustering packages (CentOS_CentOS-6)
type=rpm-md
baseurl=http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-6/
gpgcheck=1
gpgkey=http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-6//repodata/repomd.xml.key
enabled=1

yum install crmsh -y ## Install crmsh

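3. Set the hostname on both machines. This is required: cluster communication depends on it.

hostname node1.magedu.com ### Run on machine 1; takes effect until the next reboot

vim /etc/sysconfig/network ### Run on machine 1; set HOSTNAME=node1.magedu.com to make it permanent

hostname node2.magedu.com ### Run on machine 2; takes effect until the next reboot

vim /etc/sysconfig/network ### Run on machine 2; set HOSTNAME=node2.magedu.com to make it permanent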

4. Configure name resolution on both machines (same steps on each). Use /etc/hosts rather than DNS for the hostname <=> IP address mapping.

vim /etc/hosts

172.16.100.7 node1.magedu.com

172.16.100.2 node2.magedu.com
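
Before moving on, it can help to confirm that each node name maps to the planned address in the hosts file itself, independent of DNS. The following is a minimal sketch; the helper name hosts_lookup is hypothetical (not part of any package) and it only reads a hosts-format file.

```shell
# Print the address mapped to NAME in a hosts-format file.
# Usage: hosts_lookup NAME [FILE]   (FILE defaults to /etc/hosts)
# hosts_lookup is an illustrative helper, not a standard command.
hosts_lookup() {
    name=$1
    file=${2:-/etc/hosts}
    awk -v n="$name" '
        { sub(/#.*/, "") }                  # drop trailing comments
        { for (i = 2; i <= NF; i++)         # fields 2..NF are names and aliases
              if ($i == n) { print $1; exit } }
    ' "$file"
}

# Example (run on either node):
# hosts_lookup node1.magedu.com
# hosts_lookup node2.magedu.com
```

If either call prints nothing, the entry is missing from /etc/hosts on that machine.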

5. Assign the planned IPs to the two machines and check that they can ping each other

ifconfig eth0 172.16.100.7/16 ## Set the IP on machine 1

ifconfig eth0 172.16.100.2/16 ## Set the IP on machine 2

ping node2.magedu.com ## On node1

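6. Set up SSH mutual trust.

ssh-keygen -t rsa -P '' ## Generate a key pair; run on machine 1

ssh-copy-id -i ~/.ssh/id_rsa.pub root@node2.magedu.com ## Copy the public key to machine 2

Then repeat the same steps on machine 2 to copy its public key to machine 1.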

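7. Synchronize time. This is required: the cluster relies on a common clock when deciding whether a node has a problem and whether to fence it.

service ntpd stop ## Stop the ntpd service on both machines

chkconfig ntpd off ## Disable ntpd autostart at boot on both machines

ntpdate 172.16.0.1 ## Sync time from the machine at 172.16.0.1. Any machine given that IP and set up as an NTP server will do.

If time synchronization fails, see: http://www.blogjava.net/spray/archive/2008/07/10/213964.html

crontab -e ## On both machines, add a scheduled job that syncs every 5 minutes. Required.

*/5 * * * * /usr/sbin/ntpdate 172.16.0.1 > /dev/null 2>&1

crontab -l ## List the scheduled jobs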
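
Editing the crontab by hand on two machines is easy to get inconsistent. As a rough sketch, the entry can also be added non-interactively and idempotently. The helper name add_cron_entry is hypothetical; it only filters text, and the entry is written here with the portable redirection `> /dev/null 2>&1`.

```shell
# Read an existing crontab on stdin and print it with ENTRY appended,
# unless an identical line is already present.
# add_cron_entry is an illustrative helper, not a standard command.
add_cron_entry() {
    entry=$1
    current=$(cat)
    if [ -n "$current" ]; then
        printf '%s\n' "$current"
    fi
    # -F: fixed string, -x: whole-line match, so "*" is not a wildcard here
    if ! printf '%s\n' "$current" | grep -qFx -- "$entry"; then
        printf '%s\n' "$entry"
    fi
}

# Example (run on both nodes):
# crontab -l 2>/dev/null \
#   | add_cron_entry '*/5 * * * * /usr/sbin/ntpdate 172.16.0.1 > /dev/null 2>&1' \
#   | crontab -
```

Running the example twice leaves a single copy of the entry, which makes it safe to repeat on each node.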

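8. Configure corosync. The lines after the vim command are the contents of the configuration file. For details, see man corosync.conf

vim /etc/corosync/corosync.conf ## Edit the configuration file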

# Please read the corosync.conf.5 manual page
compatibility: whitetank

totem {
   ## Configuration file format version; do not change it.
   version: 2

   # secauth: Enable mutual node authentication. If you choose to
   # enable this ("on"), then do remember to create a shared
   # secret with "corosync-keygen".
   ## Enables authentication between cluster nodes so that other hosts cannot join the cluster.
   secauth: on

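   ## Number of threads to start concurrently. Usually only worth changing on a single-core CPU; on multi-core CPUs it does not need to be changed.
   threads: 2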

   # interface: define at least one interface to communicate
   # over. If you define more than one interface stanza, you must
   # also set rrp_mode.
   interface {
                # Rings must be consecutively numbered, starting at 0.
       ringnumber: 0
       # This is normally the *network* address of the
       # interface to bind to. This ensures that you can use
       # identical instances of this configuration file
       # across all your cluster nodes, without having to
       # modify this option.
       ## Network address of the subnet the cluster runs on.
       bindnetaddr: 172.16.0.0
       # However, if you have multiple physical network
       # interfaces configured for the same subnet, then the
       # network address alone is not sufficient to identify
       # the interface Corosync should bind to. In that case,
       # configure the *host* address of the interface
       # instead:
       # bindnetaddr: 192.168.1.1
       # When selecting a multicast address, consider RFC
       # 2365 (which, among other things, specifies that
       # 239.255.x.x addresses are left to the discretion of
       # the network administrator). Do not reuse multicast
       # addresses across multiple Corosync clusters sharing
       # the same network.
       ## Multicast address used for communication between cluster nodes. Look up which multicast addresses are available before choosing one.
       mcastaddr: 239.255.1.1
       # Corosync uses the port you specify here for UDP
       # messaging, and also the immediately preceding
       # port. Thus if you set this to 5405, Corosync sends
       # messages over UDP ports 5405 and 5404.
       ## Multicast port; keep the default.
       mcastport: 5405
       # Time-to-live for cluster communication packets. The
       # number of hops (routers) that this ring will allow
       # itself to pass. Note that multicast routing must be
       # specifically enabled on most network routers.
       ttl: 1
   }
}

## Logging settings; not covered here.
logging {
   # Log the source file and line where messages are being
   # generated. When in doubt, leave off. Potentially useful for
   # debugging.
   fileline: off
   # Log to standard error. When in doubt, set to no. Useful when
   # running in the foreground (when invoking "corosync -f")
   to_stderr: no
   # Log to a log file. When set to "no", the "logfile" option
   # must not be set.
   to_logfile: yes
   logfile: /var/log/cluster/corosync.log
   # Log to the system log daemon. When in doubt, set to yes.
   to_syslog: yes
   # Log debug messages (very verbose). When in doubt, leave off.
   debug: off
   # Log messages with time stamps. When in doubt, set to on
   # (unless you are only logging to syslog, where double
   # timestamps can be annoying).
   timestamp: on
   logger_subsys {
       subsys: AMF
       debug: off
   }
}

amf{
mode:disabled
}

## Start pacemaker once corosync has finished starting.
service {
ver:0
name:pacemaker
}
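
The bindnetaddr value in the configuration above is the network address of the interface, not the host address. As a small sketch of how 172.16.0.0 follows from the planned addresses (172.16.100.7/16 and 172.16.100.2/16), the helper below derives the network address from ADDRESS/PREFIX. The name network_addr is hypothetical, and the sketch handles IPv4 only.

```shell
# Print the IPv4 network address for ADDRESS/PREFIX, e.g. 172.16.100.7/16.
# network_addr is an illustrative helper, not part of corosync.
network_addr() {
    echo "$1" | awk -F'[./]' '{
        prefix = $5
        out = ""
        for (i = 1; i <= 4; i++) {
            bits = prefix - 8 * (i - 1)     # mask bits that fall inside this octet
            if (bits > 8) bits = 8
            if (bits < 0) bits = 0
            block = 2 ^ (8 - bits)          # address block size within this octet
            out = out (i > 1 ? "." : "") ($i - $i % block)
        }
        print out
    }'
}

# Example: both nodes should yield the same value for bindnetaddr.
# network_addr 172.16.100.7/16
# network_addr 172.16.100.2/16
```

Because both nodes produce the same network address, the same corosync.conf can be copied to each of them unchanged.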

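9. Generate the authentication key and make sure every node has a copy

corosync-keygen ## Writes /etc/corosync/authkey; it reads from /dev/random and may wait until enough entropy is available

scp -p /etc/corosync/authkey /etc/corosync/corosync.conf node2.magedu.com:/etc/corosync/ ## Copy the key and the configuration file to node2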

10. Start corosync

service iptables stop ## Stop the firewall

setenforce 0 ## Put SELinux into permissive mode so it does not interfere

On node1: service corosync start

For node2, run this on node1: ssh node2.magedu.com 'service corosync start'

11. Check the startup for errors

(1). Check that the corosync engine started correctly

 
[root@node1 ~]# grep -e "Corosync Cluster Engine" -e "configuration file" /var/log/cluster/corosync.log
Aug 17 17:31:20 corosync [MAIN  ] Corosync Cluster Engine ('1.4.1'): started and ready to provide service.
Aug 17 17:31:20 corosync [MAIN  ] Successfully read main configuration file '/etc/corosync/corosync.conf'.

(2). Check that the initial membership notifications were sent

 
[root@node1 ~]# grep TOTEM /var/log/cluster/corosync.log
Aug 17 17:31:20 corosync [TOTEM ] Initializing transport (UDP/IP Multicast).
Aug 17 17:31:20 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Aug 17 17:31:21 corosync [TOTEM ] The network interface [192.168.1.201] is now up.
Aug 17 17:31:21 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.

(3). Check whether any errors occurred during startup

 
   
[root@node1 ~]# grep ERROR: /var/log/cluster/corosync.log
Aug 17 17:31:21 corosync [pcmk  ] ERROR: process_ais_conf: You have configured a cluster using the Pacemaker plugin for Corosync. The plugin is not supported in this environment and will be removed very soon.
Aug 17 17:31:21 corosync [pcmk  ] ERROR: process_ais_conf:  Please see Chapter 8 of 'Clusters from Scratch' (http://www.clusterlabs.org/doc) for details on using Pacemaker with CMAN

(4). Check that pacemaker started correctly

 
[root@node1 ~]# grep pcmk_startup /var/log/cluster/corosync.log
Aug 17 17:31:21 corosync [pcmk  ] info: pcmk_startup: CRM: Initialized
Aug 17 17:31:21 corosync [pcmk  ] Logging: Initialized pcmk_startup
Aug 17 17:31:21 corosync [pcmk  ] info: pcmk_startup: Maximum core file size is: 18446744073709551615
Aug 17 17:31:21 corosync [pcmk  ] info: pcmk_startup: Service: 9
Aug 17 17:31:21 corosync [pcmk  ] info: pcmk_startup: Local hostname: node1.test.com
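
The greps in step 11 can be combined into one pass over the log. The sketch below looks for four marker strings taken from the sample output in this article; the function name check_corosync_log is hypothetical, and the exact log wording may differ between corosync versions, so treat the marker list as an assumption.

```shell
# Report which start-up markers appear in a corosync log file.
# Usage: check_corosync_log [LOGFILE]; returns non-zero if any marker is missing.
# check_corosync_log is an illustrative helper; the markers come from the
# sample log lines in this article and may vary by version.
check_corosync_log() {
    log=${1:-/var/log/cluster/corosync.log}
    rc=0
    for pattern in \
        "Corosync Cluster Engine" \
        "Successfully read main configuration file" \
        "A processor joined or left the membership" \
        "pcmk_startup: CRM: Initialized"
    do
        if grep -qF -- "$pattern" "$log"; then
            echo "ok:      $pattern"
        else
            echo "missing: $pattern"
            rc=1
        fi
    done
    return $rc
}

# Example (run on each node after starting corosync):
# check_corosync_log /var/log/cluster/corosync.log
```

A non-zero exit status makes the check easy to run over ssh against the other node as well.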

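12. Add cluster resources with crm

crm has two modes:
   Interactive:
       changes take effect only after you run commit
   Batch:
       changes take effect immediately

crm ## Start the crm shell

configure ## Enter configuration mode

property stonith-enabled=false ## This test cluster has no STONITH device; without this setting, verify reports errors

primitive webip ocf:heartbeat:IPaddr params ip=172.16.100.1 nic=eth0 cidr_netmask=16 ## Add the VIP resource (the address that serves clients). Note: if a parameter value contains spaces, wrap it in double quotes.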

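verify ## Check the configuration for problems

commit ## Changes take effect only after commit; this is what "interactive" means

show ## View the configuration

show xml ## View the configuration as XML

primitive httpd lsb:httpd op start timeout=20 ## Add the httpd resource

show ## View the resources

commit ## Commit so the change takes effect

cd ..

status ## Check where each resource is running

configure ## Enter configuration mode

group webservice webip httpd ## Define a resource group

verify

commit ## Commit the group

cd ..

status ## Check the status; the resources now all run on one node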

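configure

property no-quorum-policy=ignore ## Set the no-quorum policy. The cluster has only two nodes, so when one goes down the other no longer has quorum; the policy must be set to ignore for resources to keep running. In practice, use an odd number of nodes, or, if the count is even, at least make sure there are more nodes.

commit

cd ..

node standby ## Test: put one node into standby.

status ## Check whether the resources moved to the other node

node online ## Bring the node back online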
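
The no-quorum-policy=ignore setting used above follows from simple vote arithmetic: a partition has quorum only when it holds more than half of the votes. The sketch below assumes one vote per node; the function names quorum_needed and has_quorum are hypothetical.

```shell
# Votes needed for quorum with TOTAL equally weighted nodes: floor(TOTAL / 2) + 1.
# quorum_needed and has_quorum are illustrative helpers, assuming one vote per node.
quorum_needed() {
    echo $(( $1 / 2 + 1 ))
}

# Succeeds when ALIVE out of TOTAL nodes still hold quorum.
# Usage: has_quorum TOTAL ALIVE
has_quorum() {
    [ "$2" -ge "$(quorum_needed "$1")" ]
}

# Example: a two-node cluster that loses one node.
# quorum_needed 2
# has_quorum 2 1 && echo "quorum kept" || echo "quorum lost"
```

With two nodes, both votes are needed, so a single surviving node never has quorum; with three nodes, two votes suffice and one failure is tolerated.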

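#### httpd is now highly available, using a resource group to keep the resources on the same node. Next, use resource constraints instead to keep all resources on the same node.

resource stop webservice

configure delete webservice ## Delete the resource group

commit ## Commit so the change takes effect

status ## Check the state of the cluster resources

configure colocation httpd_with_webip inf: httpd webip ## Add a colocation constraint so that httpd and webip run on the same node.

verify ## Verify

commit ## Commit

status ## Check the status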

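configure order webip_before_httpd mandatory: webip httpd ## Add an order constraint so that the node has the IP before httpd starts

verify

status ## Check the status

configure location webip_on_node1 webip rule 100: #uname eq node1.magedu.com ## Add a location constraint so that the service prefers to run on node1

verify

commit

node standby ## Put one node into standby

status

node online ## Bring that node back online, then check whether the resources moved back.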

Common commands:

verify ## Check the configuration for problems

crm configure property stonith-enabled=false

property stonith-enabled=false

verify

commit

Wrap values that contain spaces in double quotes

primitive webip ocf:heartbeat:IPaddr params ip=172.16.100.1 nic=eth0 cidr_netmask=16

verify

show

show xml

stop webip ## Stop a resource

resources

list

start webip

list

migrate ## Migrate a resource

crm_mon

providers httpd ## Show which provider supplies the httpd RA

classes

list lsb

meta lsb:httpd ## View its metadata

configure

primitive httpd lsb:httpd op start timeout=20

show

verify

commit

show

crm status

group webservice webip httpd

verify

crm status

crm node standby / online

crm status

crm configure property no-quorum-policy=ignore

crm configure show

resource

stop webservice

list

cleanup webservice

cleanup webip

cleanup httpd

cleanstate node1.magedu.com

cleanstate node2.magedu.com

resource

start webservice

edit

verify

show

quit

resource

migrate ## Migrate a resource

commit

crm status

crm node standby

crm resource

stop webservice

configure

delete webservice

show

commit

configure

help colocation

colocation httpd_with_webip inf: httpd webip

show xml

verify

commit

crm status

order webip_before_httpd mandatory: webip httpd

show xml

commit

crm status

crm node standby

crm_mon

crm node online

location webip_on_node1 webip rule 100: #uname eq node1.magedu.com

show xml

verify

commit

crm status

crm node standby

crm status

crm configure

rsc_defaults resource-stickiness=200 ## Set the default resource stickiness

verify

commit

crm node standby

crm node online

meta ocf:heartbeat:Filesystem

Appendix:

RHEL 6.x RHCS: corosync

RHEL 5.x RHCS: openais, cman, rgmanager

corosync: Messaging Layer

openais: AIS

corosync --> pacemaker

SUSE Linux Enterprise Server: Hawk, WebGUI

LCMC: Linux Cluster Management Console

RHCS: Conga(luci/ricci)

webGUI

keepalived: VRRP, 2 nodes

Reference: http://freeloda.blog.51cto.com/2033581/1275528
