Managing a Cluster with Luci
I. Installing Cluster-Related Software
1. Service Components
- Configuration Information
  ccsd - Cluster Configuration System daemon
- High-Availability Management
  aisexec - OpenAIS cluster manager: communications, encryption, quorum, membership
  rgmanager - Cluster resource group manager
- Shared Storage Related
  fenced - I/O fencing daemon
  DLM - Distributed Lock Manager
  dlm_controld - Manages DLM groups
  lock_dlmd - Manages interaction between DLM and GFS
  clvmd - Clustered Logical Volume Manager daemon
- Deployment
  luci - Conga project web interface
  system-config-cluster - standalone GUI configuration tool
2. Architecture
- RHEL4 CMAN/DLM Architecture
  [diagram: http://blog.51cto.com/attachment/201201/162039471.png]
- RHEL5 CMAN/DLM/OpenAIS Architecture
  [diagram: http://blog.51cto.com/attachment/201201/162052620.png]
3. Installation
[root@node1 ~]# yum install cman rgmanager -y
[root@node1 ~]# yum install -y cluster-cim lvm2-cluster system-config-cluster
[root@node1 ~]# yum install ricci -y
[root@node1 ~]# yum install luci
[root@node1 ~]# service ricci start
Starting oddjobd: [  OK  ]
generating SSL certificates... done
Starting ricci: [  OK  ]
[root@node1 ~]# service cman start
Starting cluster:
Loading modules... done
Mounting configfs... done
Starting ccsd... done
Starting cman... failed
/usr/sbin/cman_tool: ccsd is not running [FAILED]
[root@node1 ~]# service rgmanager start
[root@node1 ~]# service qdiskd start
[root@node1 ~]# luci_admin init
Initializing the luci server
Creating the 'admin' user
Enter password:
Confirm password:
Please wait...
The admin password has been successfully set.
Generating SSL certificates...
The luci server has been successfully initialized
You must restart the luci server for changes to take effect.
Run "service luci restart" to do so
[root@node1 ~]# service luci restart
Shutting down luci: [  OK  ]
Starting luci: Generating https SSL certificates... done [  OK  ]
Point your web browser to https://node1:8084 to access luci
[root@node1 ~]# chkconfig ricci on
[root@node1 ~]# chkconfig cman on
[root@node1 ~]# chkconfig rgmanager on
[root@node1 ~]# chkconfig qdiskd on
[root@node1 ~]# chkconfig luci on
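The installation and chkconfig steps above must be repeated on every node. As a minimal sketch (the node names node1-node3 and passwordless ssh are assumptions for illustration), the following generates a per-node setup script for review rather than executing anything directly:

```shell
#!/bin/sh
# Generate (not execute) a per-node setup script; review it before running.
# Node names and ssh access are assumed for illustration.
OUT=/tmp/cluster_setup.sh
: > "$OUT"
for node in node1 node2 node3; do
  # install the cluster packages on each node
  echo "ssh $node 'yum install -y cman rgmanager ricci cluster-cim lvm2-cluster system-config-cluster'" >> "$OUT"
  # enable the cluster services at boot on each node
  for svc in ricci cman rgmanager qdiskd; do
    echo "ssh $node 'chkconfig $svc on'" >> "$OUT"
  done
done
cat "$OUT"
```

Writing the commands to a file instead of running them keeps the sketch safe to inspect before anything touches the nodes.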
II. Creating the Quorum Disk
[root@node1 ~]# mkqdisk -l cluster8_qdisk -c /dev/sdc
mkqdisk v0.6.0
Writing new quorum disk label 'cluster8_qdisk' to /dev/sdc.
WARNING: About to destroy all data on /dev/sdc; proceed [N/y] ? y
Initializing status block for node 1...
Initializing status block for node 2...
Initializing status block for node 3...
Initializing status block for node 4...
Initializing status block for node 5...
Initializing status block for node 6...
Initializing status block for node 7...
Initializing status block for node 8...
Initializing status block for node 9...
Initializing status block for node 10...
Initializing status block for node 11...
Initializing status block for node 12...
Initializing status block for node 13...
Initializing status block for node 14...
Initializing status block for node 15...
Initializing status block for node 16...
[root@node1 ~]# mkqdisk -L
/dev/disk/by-id/scsi-1IET_00010001:
/dev/disk/by-path/ip-10.160.100.40:3260-iscsi-iqn.20120116.target:dskpub-lun-1:
/dev/sdc:
Magic: eb7a62c2
Label: cluster8_qdisk
Created: Mon Jan 16 13:45:21 2012
Host: node1
Kernel Sector Size: 512
Recorded Sector Size: 512
[root@node1 ~]# service qdiskd start   # this service must be started on every node
# quorum disk
- A qdisk of roughly 10 MB is enough. The -l option must be used to assign a label here, because the device name /dev/sdc can change between boots; a udev-based naming approach cannot be used for this, so the label is relied on instead. Running mkqdisk on any one node is sufficient.
  mkqdisk -L          # list all quorum disks
  mkqdisk -f <label>  # show details for a specific quorum disk
- How often the node checks connectivity can be configured, using ping as the heuristic: for example, ping once per second, and if 50 pings in a row succeed, the peer is considered alive.
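The heuristic behaviour described above can be illustrated with a rough shell simulation. This is a simplified model of how qdiskd evaluates heuristics, not qdiskd itself (real qdiskd sums the scores of all currently passing heuristics each cycle and declares a node dead only after tko consecutive failing cycles); the `true` command stands in for the real ping check, and the interval sleep is omitted to keep the sketch fast:

```shell
#!/bin/sh
# Simplified model of qdiskd heuristic evaluation (illustration only).
HEURISTIC=true          # stand-in for: ping -c1 -t1 192.168.0.254
SCORE_PER_PASS=1        # points a passing heuristic contributes per cycle
MIN_SCORE=1             # score required for the node to count as healthy
TKO=10                  # consecutive failing cycles before declaring the node dead

failures=0
state=alive
for cycle in $(seq 1 $TKO); do
  # compute this cycle's score from the heuristic result
  if $HEURISTIC; then score=$SCORE_PER_PASS; else score=0; fi
  # track consecutive failing cycles; a passing cycle resets the counter
  if [ "$score" -lt "$MIN_SCORE" ]; then
    failures=$((failures + 1))
  else
    failures=0
  fi
  [ "$failures" -ge "$TKO" ] && state=dead
done
echo "node state: $state" | tee /tmp/qdisk_sim_state
```

With the always-passing stand-in heuristic, the score never drops below MIN_SCORE, so the node stays alive; swapping `HEURISTIC=false` shows the tko counter reaching its limit instead.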
III. Configuring and Managing the Cluster with luci
1. Create the cluster (add the nodes to cluster8)
[screenshot: http://blog.51cto.com/attachment/201201/162122911.jpg]
2. Add the quorum disk to the cluster
[screenshot: http://blog.51cto.com/attachment/201201/162157447.jpg]
- Parameter descriptions:
  interval 2 : run the check every 2 seconds
  votes 1 : the quorum disk contributes 1 vote
  TKO 10 : ten cycles
  Minimum Score : one point per successful check
  Device -- /dev/sda3
  Label -- cluster8_qdisk
- If the check runs ten times at one point each, the score must exceed 5 for the cluster to be considered healthy.
  Path to program               interval   score
  ping -c1 -t1 192.168.0.254       2         1
With cman_tool status you will see that, in a two-node cluster, the expected votes become 3 and the total votes are also 3, with Quorum at 2. This shows that the quorum calculation includes the Votes value configured above. Two parameters in /etc/cluster/cluster.conf need attention: set expected_votes="3" manually, and set two_node="0". two_node is the switch for the special two-node cluster mode: 0 means the cluster is treated as having more than two nodes (the normal quorum rules apply), while 1 means it is a two-node cluster. After editing, restart the rgmanager, qdiskd and cman services on every node.
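The expected_votes/two_node edit just described can be scripted. The sketch below works on a scratch copy in /tmp so the real /etc/cluster/cluster.conf stays untouched until the result has been reviewed:

```shell
#!/bin/sh
# Sketch: flip two_node off and set expected_votes in a scratch copy of cluster.conf.
# Point CONF at /etc/cluster/cluster.conf only after reviewing the result.
CONF=/tmp/cluster.conf.demo
cat > "$CONF" <<'EOF'
<cman expected_votes="2" two_node="1"/>
EOF
# rewrite both attributes in place (GNU sed assumed)
sed -i 's/expected_votes="[0-9]*"/expected_votes="3"/; s/two_node="1"/two_node="0"/' "$CONF"
grep cman "$CONF"
```

After applying the same edit to the real file, restart rgmanager, qdiskd and cman on every node as noted above.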
3. Add fence devices to the nodes (the procedure is similar on each node)
[screenshot: http://blog.51cto.com/attachment/201201/162217193.png]
4. Create a failover domain
[screenshot: http://blog.51cto.com/attachment/201201/162232884.jpg]
# priority: the lower the number, the higher the priority; it controls the order in which the service starts on the nodes.
Prioritized: enables priorities for the domain; node priorities can only be set once this is enabled on the domain.
Restrict failover…: strictly restricts failover to domain members; the service can only run on nodes that belong to this domain.
5. Add shared resources
Share a GFS file system; other shared resources are created the same way.
[screenshot: http://blog.51cto.com/attachment/201201/162303671.jpg]
6. Add the shared resources to a service
[screenshot: http://blog.51cto.com/attachment/201201/162318875.jpg]
7. View and edit the configuration file (/etc/cluster/cluster.conf)
[root@node1 ~]# vim /etc/cluster/cluster.conf
<?xml version="1.0"?>
<cluster alias="cluster8" config_version="9" name="cluster8">
  <!-- every time cluster.conf is modified, config_version must be incremented by 1 -->
  <clusternodes>
    <clusternode name="node1hb" nodeid="1" votes="5">
      <fence/>
    </clusternode>
    <clusternode name="node3hb" nodeid="2" votes="5">
      <fence/>
    </clusternode>
    <clusternode name="node2hb" nodeid="3" votes="5">
      <fence/>
    </clusternode>
  </clusternodes>
  <cman expected_votes="13"/>
  <fencedevices/>
  <rm>
    <failoverdomains>
      <failoverdomain name="httpd_fail" nofailback="0" ordered="1" restricted="1">
        <failoverdomainnode name="node1hb" priority="1"/>
        <failoverdomainnode name="node3hb" priority="10"/>
        <failoverdomainnode name="node2hb" priority="5"/>
      </failoverdomain>
    </failoverdomains>
    <resources>
      <clusterfs device="/dev/vg01/lv01" force_unmount="0" fsid="34218" fstype="gfs" mountpoint="/var/www/html" name="httpd_files" self_fence="0"/>
      <ip address="192.168.32.21" monitor_link="0"/>
      <script file="/etc/init.d/httpd" name="httpd"/>
    </resources>
    <service autostart="1" domain="httpd_fail" exclusive="1" name="httpd_srv" recovery="relocate">
      <clusterfs fstype="gfs" ref="httpd_files"/>
      <ip ref="192.168.32.21"/>
      <script ref="httpd"/>
    </service>
  </rm>
  <quorumd interval="5" label="cluster8_qdisk" min_score="1" tko="10" votes="10">
    <heuristic interval="5" program="ping -c1 -t1 192.168.32.254" score="1"/>
  </quorumd>
</cluster>
[root@node1 ~]# ccs_tool update /etc/cluster/cluster.conf
Config file updated from version 9 to 10
Update complete.
# propagate the updated configuration to the cluster
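Since config_version must be bumped before every ccs_tool update, the increment can be automated. A minimal sketch, demonstrated on a scratch copy (point CONF at the real /etc/cluster/cluster.conf for actual use):

```shell
#!/bin/sh
# Sketch: increment config_version automatically before running ccs_tool update.
CONF=/tmp/cluster.conf.demo2
printf '<cluster alias="cluster8" config_version="9" name="cluster8"/>\n' > "$CONF"
# extract the current version number from the attribute
cur=$(sed -n 's/.*config_version="\([0-9]*\)".*/\1/p' "$CONF")
# rewrite it as cur + 1 (GNU sed assumed)
sed -i "s/config_version=\"$cur\"/config_version=\"$((cur + 1))\"/" "$CONF"
grep -o 'config_version="[0-9]*"' "$CONF"
# then: ccs_tool update "$CONF"
```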
IV. Starting Services and Testing
1. Start and stop services
[root@node1 ~]# clusvcadm -R httpd_srv
Local machine trying to restart service:httpd_srv...Success
# restart the httpd_srv service
[root@node1 ~]# clusvcadm -s httpd_srv
Local machine stopping service:httpd_srv...Success
# stop the httpd_srv service
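Beyond -R and -s, clusvcadm offers a few other routine operations. The sketch below writes a small cheat sheet to a file for review rather than running anything against the cluster; node2hb is taken from this cluster's member list:

```shell
#!/bin/sh
# Write a reference sheet of common clusvcadm operations (nothing is executed).
OUT=/tmp/clusvcadm_cheatsheet.txt
cat > "$OUT" <<'EOF'
clusvcadm -e httpd_srv            # enable (start) the service
clusvcadm -d httpd_srv            # disable (stop) the service
clusvcadm -s httpd_srv            # stop the service
clusvcadm -R httpd_srv            # restart the service on its current owner
clusvcadm -r httpd_srv -m node2hb # relocate the service to member node2hb
EOF
cat "$OUT"
```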
2、檢測cluster狀态
[root@node1 ~]# clustat -l
Cluster Status for cluster8 @ Mon Jan 16 15:45:19 2012
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
node1hb 1 Online, Local, rgmanager
node3hb 2 Online, rgmanager
node2hb 3 Online, rgmanager
/dev/disk/by-id/scsi-1IET_00010001 0 Online, Quorum Disk
Service Information
------- -----------
Service Name : service:httpd_srv
Current State : started (112)
Flags : none (0)
Owner : node1hb
Last Owner : node1hb
Last Transition : Mon Jan 16 15:44:43 2012
# clustat -i 3 : refresh the cluster status display every 3 seconds
[root@node1 ~]# ccs_test connect
Connect successful.
Connection descriptor = 36960
[root@node1 ~]# ccs_tool lsnode
Cluster name: cluster8, config_version: 9
Nodename Votes Nodeid Fencetype
node1hb 5 1
node3hb 5 2
node2hb 5 3
[root@node1 ~]# cman_tool services
type level name id state
fence 0 default 00010003 none
[1 2 3]
dlm 1 clvmd 00010001 none
dlm 1 rgmanager 00020003 none
dlm 1 gfslv01 00030001 none
[1]
gfs 2 gfslv01 00020001 none
This article was reposted from the netsword 51CTO blog; original link: http://blog.51cto.com/netsword/764825