Redis Study Notes (6): Setting Up a Cluster

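I last wrote a Redis study note back in 2014, and almost two years have gone by since. One of the biggest changes in Redis over that time is the official release of the cluster feature. Building a Redis cluster used to mean doing your own sharding with consistent hashing. Now it is much easier: the built-in cluster feature can be used directly, and it also supports adding nodes dynamically, HA, and redistributing the cache after nodes are added or removed (resharding).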

Below is how I set up a cluster on a Mac, following the official cluster-tutorial:

1. Download and build the latest Redis

Building is simple: a single make command does it. If that is unfamiliar, see my earlier note: Redis Study Notes (1): Building, Starting, and Stopping.

2. Create six directories

mkdir -p ~/app/redis-cluster/  # create the root directory first
cd ~/app/redis-cluster/
mkdir 7000 7001 7002 7003 7004 7005

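Note: like most distributed middleware, Redis Cluster relies on voting to keep the cluster highly available: marking a node as failed and promoting a slave both need agreement from a majority of the masters. As with ZooKeeper, an odd number of masters is therefore the usual choice (the cluster keeps working as long as fewer than half of the masters fail), and three masters is the minimum. Add one slave per master as a backup, and the smallest practical Redis cluster has 6 nodes.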
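The majority arithmetic behind that minimum can be sketched in bash. The function name `tolerable_master_failures` is made up for this illustration; it only encodes the rule that more than half of the masters must stay reachable.

```shell
# Hypothetical helper: how many masters can fail while a majority remains.
tolerable_master_failures() {
  local masters=$1
  local majority=$(( masters / 2 + 1 ))   # smallest strict majority
  echo $(( masters - majority ))
}

for n in 3 4 5; do
  echo "$n masters: majority $(( n / 2 + 1 )), can lose $(tolerable_master_failures "$n")"
done
```

Four masters tolerate no more failures than three, which is why odd counts are the usual choice.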

Then copy the Redis built in step 1 into each of these six directories.

3. Configuration file

port 7000
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout 5000
appendonly yes      

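Save the lines above as redis-cluster.conf and put a copy in the redis directory inside each of the six directories, changing port to match: port 7000 under 7000, port 7001 under 7001, and so on.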
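Editing six copies by hand is error-prone, so the files can also be generated. The following is a minimal sketch: `write_cluster_conf`, the `CLUSTER_ROOT` variable, and the layout (one file directly under each port directory) are assumptions for illustration, so move the files into the redis directory or adjust the path to match your own layout.

```shell
# Sketch: write one redis-cluster.conf per port under a root directory.
# write_cluster_conf and the target layout are assumptions for illustration.
write_cluster_conf() {
  local root=$1 port=$2
  mkdir -p "$root/$port"
  cat > "$root/$port/redis-cluster.conf" <<EOF
port $port
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout 5000
appendonly yes
EOF
}

root=${CLUSTER_ROOT:-$(mktemp -d)}   # defaults to a scratch directory
for port in 7000 7001 7002 7003 7004 7005; do
  write_cluster_conf "$root" "$port"
done
grep -H '^port' "$root"/*/redis-cluster.conf   # show the port line of each file
```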

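cluster-node-timeout is the maximum time, in milliseconds, that a node may be unreachable from the other nodes in the cluster. The setting above is 5 seconds: a node that the others cannot reach for more than 5 seconds is considered down.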

4. Start each Redis instance

In the src subdirectory of the redis directory under each of the six directories, run:

./redis-server ../redis-cluster.conf      

That starts the six nodes, 7000 through 7005.

5. Install the Ruby redis module

brew update 
brew install ruby
sudo gem install redis # note: use a proxy/VPN for this step if gem downloads are blocked or slow

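Explanation: step 4 started six Redis servers, but they are completely independent of one another. A separate tool is needed to join them into a cluster: redis-trib.rb, a Ruby script that ships with Redis (my guess is that the Redis author is fond of Ruby). The Mac comes with Ruby 2.0 but not the redis module, so the module has to be installed, or creating the cluster in the next step will fail.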

6. Create the cluster

./redis-trib.rb create --replicas 1 127.0.0.1:7000 127.0.0.1:7001 \
127.0.0.1:7002 127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005      

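Still in the src subdirectory of one of the directories, run the command above and the cluster is created. --replicas 1 means one replica (slave) is created for each master, so of the six nodes 127.0.0.1:7000 through 127.0.0.1:7005, three are assigned as masters and the other three as slaves.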
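The relationship between the node list, --replicas, and the number of masters can be sketched as follows. `node_addrs` is a hypothetical helper, and the script only prints the create command rather than running it.

```shell
# Hypothetical helper: print host:port for every port in [first, last].
node_addrs() {
  local host=$1 first=$2 last=$3 port
  local -a out=()
  for (( port = first; port <= last; port++ )); do
    out+=("$host:$port")
  done
  echo "${out[*]}"
}

replicas=1
addrs=$(node_addrs 127.0.0.1 7000 7005)
read -r -a nodes <<< "$addrs"          # split the list to count the nodes

# Each master plus its replicas forms one group.
echo "nodes: ${#nodes[@]}, masters: $(( ${#nodes[@]} / (replicas + 1) ))"
echo "./redis-trib.rb create --replicas $replicas $addrs"
```

With six nodes and one replica per master this reports three masters, matching the layout described above.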

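Note: creating the cluster with redis-trib only has to be done once, because each node saves the cluster layout in its nodes.conf. If the machine shuts down and all six nodes stop, they come back in cluster mode automatically on the next start; there is no need to run redis-trib.rb create again.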

If you now list the redis processes with ps, each one has the word cluster appended:

[Screenshot: ps output for the redis-server processes]

To see which ports are masters and which are slaves, use:

./redis-trib.rb check 127.0.0.1:7000      

The output looks like this:

>>> Performing Cluster Check (using node 127.0.0.1:7000)
S: 0b7e0d5337e87ac7b59bba4c1248e5c9e8d1905e 127.0.0.1:7000
   slots: (0 slots) slave
   replicates 38910c5baafea02c5303505acfd9bd331c608cfc
M: e0e8dfddd4e9d855090d6efd18e55ea9c0e1f7aa 127.0.0.1:7001
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
S: 88e16f91609c03277f2ee6ce5285932f58c221c1 127.0.0.1:7005
   slots: (0 slots) slave
   replicates ec964a7c7cd53b986f54318a190c1426fc53a5fa
S: be7e9fd3b7d096b037306bc14e1017150fa59d7a 127.0.0.1:7004
   slots: (0 slots) slave
   replicates e0e8dfddd4e9d855090d6efd18e55ea9c0e1f7aa
M: 38910c5baafea02c5303505acfd9bd331c608cfc 127.0.0.1:7003
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
M: ec964a7c7cd53b986f54318a190c1426fc53a5fa 127.0.0.1:7002
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.      

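From the output above, 7000, 7004, and 7005 are slaves while 7001, 7003, and 7002 are masters (if you have run failover tests by hand, such as stopping a node and bringing it back, your output may differ). Besides check, another commonly used subcommand is info: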

./redis-trib.rb info 127.0.0.1:7000      
127.0.0.1:7001 (e0e8dfdd...) -> 2 keys | 5462 slots | 1 slaves.
127.0.0.1:7003 (38910c5b...) -> 2 keys | 5461 slots | 1 slaves.
127.0.0.1:7002 (ec964a7c...) -> 0 keys | 5461 slots | 1 slaves.
[OK] 4 keys in 3 masters.
0.00 keys per slot on average.      

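It lists every master, including how many cached keys and how many slaves each one has, the total number of keys across all masters, and the average number of keys per slot. To see the other redis-trib subcommands, run: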

./redis-trib.rb help      

The output is:

Usage: redis-trib <command> <options> <arguments ...>

  create          host1:port1 ... hostN:portN
                  --replicas <arg>
  check           host:port
  info            host:port
  fix             host:port
                  --timeout <arg>
  reshard         host:port
                  --from <arg>
                  --to <arg>
                  --slots <arg>
                  --yes
                  --timeout <arg>
                  --pipeline <arg>
  rebalance       host:port
                  --weight <arg>
                  --auto-weights
                  --use-empty-masters
                  --timeout <arg>
                  --simulate
                  --pipeline <arg>
                  --threshold <arg>
  add-node        new_host:new_port existing_host:existing_port
                  --slave
                  --master-id <arg>
  del-node        host:port node_id
  set-timeout     host:port milliseconds
  call            host:port command arg arg .. arg
  import          host:port
                  --from <arg>
                  --copy
                  --replace
  help            (show this help)

For check, fix, reshard, del-node, set-timeout you can specify the host and port of any working node in the cluster.      

The word slot has come up several times above, so here is a brief explanation:

[Figure: the 16384 slots divided among the master nodes]

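As the figure above shows, redis-cluster divides the cluster's whole key space into 16384 slots. When six nodes are split into three masters and three slaves, the cluster effectively has three HA groups, and the three masters share the slots evenly. For every operation on a key (reading or writing a cache entry, say), Redis runs CRC16 over the key and takes the result modulo 16384. The remainder is the slot, and the slot determines which master stores the entry. When the cluster grows or loses nodes, only slots need to be reassigned (that is, some slots are moved from some nodes to others).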
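The key-to-slot mapping can be reproduced outside Redis. Redis Cluster hashes a key with CRC16 (the XMODEM variant) and takes the result modulo 16384. The following bash sketch makes two simplifying assumptions: keys are plain ASCII, and hash tags (the `{...}` part of a key) are not handled. The function names `crc16_xmodem` and `key_slot` are made up for this example.

```shell
# CRC16/XMODEM: polynomial 0x1021, initial value 0, processed bit by bit.
# Assumes ASCII keys; requires bash (printf -v, arithmetic for-loops).
crc16_xmodem() {
  local s=$1 crc=0 i j byte
  for (( i = 0; i < ${#s}; i++ )); do
    printf -v byte '%d' "'${s:i:1}"        # ASCII code of the i-th character
    crc=$(( crc ^ (byte << 8) ))
    for (( j = 0; j < 8; j++ )); do
      if (( crc & 0x8000 )); then
        crc=$(( ((crc << 1) ^ 0x1021) & 0xFFFF ))
      else
        crc=$(( (crc << 1) & 0xFFFF ))
      fi
    done
  done
  echo "$crc"
}

# Slot = CRC16(key) mod 16384 (hash tags are ignored in this sketch).
key_slot() {
  echo $(( $(crc16_xmodem "$1") % 16384 ))
}

key_slot foo    # prints 12182
```

On a running cluster the same number can be checked with the CLUSTER KEYSLOT command in redis-cli.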

7. Using the redis-cli client

./redis-cli -c -h localhost -p 7000      

Note the -c option, which enables cluster mode. Try adding an entry:

localhost:7000> set user1 jimmy
-> Redirected to slot [8106] located at 127.0.0.1:7001
OK      

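Note the second line of output: the key user1 hashes to slot 8106, which lives on the node at port 7001. (7000 is a slave and 7001 is the master that owns this slot; only that master accepts the write.) Repeating the operation on 7001 does not produce the second line, because 7001 already owns the slot and no redirect is needed: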

➜  src ./redis-cli -c -h localhost -p 7001
localhost:7001> set user1 yang
OK
localhost:7001>      

8. Failover test

First check the current master/slave layout with redis-trib.rb:

➜  src ./redis-trib.rb check localhost:7000
>>> Performing Cluster Check (using node localhost:7000)
S: 0b7e0d5337e87ac7b59bba4c1248e5c9e8d1905e localhost:7000
   slots: (0 slots) slave
   replicates 38910c5baafea02c5303505acfd9bd331c608cfc
M: ec964a7c7cd53b986f54318a190c1426fc53a5fa 127.0.0.1:7002
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
M: e0e8dfddd4e9d855090d6efd18e55ea9c0e1f7aa 127.0.0.1:7001
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
S: be7e9fd3b7d096b037306bc14e1017150fa59d7a 127.0.0.1:7004
   slots: (0 slots) slave
   replicates e0e8dfddd4e9d855090d6efd18e55ea9c0e1f7aa
S: 88e16f91609c03277f2ee6ce5285932f58c221c1 127.0.0.1:7005
   slots: (0 slots) slave
   replicates ec964a7c7cd53b986f54318a190c1426fc53a5fa
M: 38910c5baafea02c5303505acfd9bd331c608cfc 127.0.0.1:7003
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.      

The output shows that 7000 is the slave of 7003 (38910c5baafea02c5303505acfd9bd331c608cfc). Now kill the redis process on 7003 by hand and watch the terminal output of 7000:

3872:S 21 Mar 10:55:55.663 * Connecting to MASTER 127.0.0.1:7003
3872:S 21 Mar 10:55:55.663 * MASTER <-> SLAVE sync started
3872:S 21 Mar 10:55:55.663 # Error condition on socket for SYNC: Connection refused
3872:S 21 Mar 10:55:55.771 * Marking node 38910c5baafea02c5303505acfd9bd331c608cfc as failing (quorum reached).
3872:S 21 Mar 10:55:55.771 # Cluster state changed: fail
3872:S 21 Mar 10:55:55.869 # Start of election delayed for 954 milliseconds (rank #0, offset 183).
3872:S 21 Mar 10:55:56.703 * Connecting to MASTER 127.0.0.1:7003
3872:S 21 Mar 10:55:56.703 * MASTER <-> SLAVE sync started
3872:S 21 Mar 10:55:56.703 # Error condition on socket for SYNC: Connection refused
3872:S 21 Mar 10:55:56.909 # Starting a failover election for epoch 10.
3872:S 21 Mar 10:55:56.911 # Failover election won: I'm the new master.
3872:S 21 Mar 10:55:56.911 # configEpoch set to 10 after successful failover
3872:M 21 Mar 10:55:56.911 * Discarding previously cached master state.
3872:M 21 Mar 10:55:56.911 # Cluster state changed: ok      

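Note lines 5, 6, and 11. Line 5 shows that the cluster state switched to fail because 7003 went down, line 6 shows 7000 scheduling a failover election, and line 11 shows the node on port 7000 winning the election and becoming the new master.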
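In a longer log these events are easier to find with a filter than by counting lines. A minimal sketch, assuming the log text looks like the excerpt above; `failover_events` is a made-up name and the sample lines are copied from that excerpt.

```shell
# Keep only the lines that describe the failover itself.
failover_events() {
  grep -E 'as failing|Cluster state changed|[Ff]ailover election'
}

failover_events <<'EOF'
3872:S 21 Mar 10:55:55.663 # Error condition on socket for SYNC: Connection refused
3872:S 21 Mar 10:55:55.771 # Cluster state changed: fail
3872:S 21 Mar 10:55:56.909 # Starting a failover election for epoch 10.
3872:S 21 Mar 10:55:56.911 # Failover election won: I'm the new master.
3872:M 21 Mar 10:55:56.911 # Cluster state changed: ok
EOF
```

The connection error is dropped and the four state-change and election lines remain.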

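Note: if the master and the slaves of one shard are all down, the whole cluster stops accepting read and write commands and redis-cli reports cluster down, although other commands such as info still work. This is the default behaviour (controlled by cluster-require-full-coverage), and it lasts until one node of that shard comes back.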
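Whether the cluster is currently ok or fail can also be read from the CLUSTER INFO command, which reports a cluster_state field. The sketch below only parses that field; `cluster_state` is a hypothetical helper, and the sample input stands in for real output so the snippet runs without a server.

```shell
# Extract the cluster_state value from CLUSTER INFO output on stdin.
# The output uses CRLF line endings, so carriage returns are stripped first.
cluster_state() {
  tr -d '\r' | awk -F: '$1 == "cluster_state" { print $2 }'
}

# Sample input in place of: ./redis-cli -p 7000 cluster info | cluster_state
printf 'cluster_state:fail\r\ncluster_slots_assigned:16384\r\n' | cluster_state
```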

9. Scaling out the cluster

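As the business grows, the cluster will sooner or later need more capacity. The following shows how to add two more nodes. First make two copies of 7000, named 7006 and 7007, then go to the src subdirectory of the redis directory under 7006 and 7007 and run: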

rm -f nodes.conf dump.rdb appendonly.aof

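Because 7000 was started earlier, the copies already contain some data. The data file, the append-only log, and the cluster's nodes.conf must be deleted so that each copy becomes an empty standalone Redis node; otherwise it cannot join the cluster.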

Then edit redis-cluster.conf:

port 7000
cluster-enabled yes
cluster-config-file "nodes.conf"
cluster-node-timeout 10000
appendonly yes
# Generated by CONFIG REWRITE
dir "/Users/yjmyzz/app/redis-cluster/7000/redis-3.0.7/src"      

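Two things need changing. First, the port on the first line must match 7006 or 7007. Second, the last two lines were added automatically after 7000 had been running; delete them, since the dir line still points at 7000's src directory.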
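Both edits can be done with one sed call. This is a sketch under the assumption that the copied file looks like the one above; `derive_conf` is a hypothetical helper that writes the cleaned config to stdout instead of editing in place, which avoids the differences between the macOS and GNU versions of sed -i.

```shell
# Print a copy of a node config with a new port and without the
# section that CONFIG REWRITE appended (from the marker line to the end).
derive_conf() {
  local src=$1 port=$2
  sed -e "s/^port .*/port $port/" \
      -e '/^# Generated by CONFIG REWRITE/,$d' "$src"
}

src=$(mktemp)
cat > "$src" <<'EOF'
port 7000
cluster-enabled yes
cluster-config-file "nodes.conf"
cluster-node-timeout 10000
appendonly yes
# Generated by CONFIG REWRITE
dir "/Users/yjmyzz/app/redis-cluster/7000/redis-3.0.7/src"
EOF
derive_conf "$src" 7006
rm -f "$src"
```

Redirect the output to the new node's redis-cluster.conf, once for 7006 and once for 7007.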

After that, start the two Redis nodes 7006 and 7007. At this point they have no relationship with the cluster. The following command adds 7006 to the cluster as a master:

./redis-trib.rb add-node 127.0.0.1:7006 127.0.0.1:7000      

Note: the first argument is the new node's IP:port; the second is any working node already in the cluster.

If all goes well, the output is:

>>> Adding node 127.0.0.1:7006 to cluster 127.0.0.1:7000
>>> Performing Cluster Check (using node 127.0.0.1:7000)
M: 0b7e0d5337e87ac7b59bba4c1248e5c9e8d1905e 127.0.0.1:7000
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
M: e0e8dfddd4e9d855090d6efd18e55ea9c0e1f7aa 127.0.0.1:7001
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
S: be7e9fd3b7d096b037306bc14e1017150fa59d7a 127.0.0.1:7004
   slots: (0 slots) slave
   replicates e0e8dfddd4e9d855090d6efd18e55ea9c0e1f7aa
M: ec964a7c7cd53b986f54318a190c1426fc53a5fa 127.0.0.1:7002
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
S: 88e16f91609c03277f2ee6ce5285932f58c221c1 127.0.0.1:7005
   slots: (0 slots) slave
   replicates ec964a7c7cd53b986f54318a190c1426fc53a5fa
S: 38910c5baafea02c5303505acfd9bd331c608cfc 127.0.0.1:7003
   slots: (0 slots) slave
   replicates 0b7e0d5337e87ac7b59bba4c1248e5c9e8d1905e
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 127.0.0.1:7006 to make it join the cluster.
[OK] New node added correctly.      

Confirm the state with check again:

➜  src ./redis-trib.rb check 127.0.0.1:7000
>>> Performing Cluster Check (using node 127.0.0.1:7000)
M: 0b7e0d5337e87ac7b59bba4c1248e5c9e8d1905e 127.0.0.1:7000
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
M: e0e8dfddd4e9d855090d6efd18e55ea9c0e1f7aa 127.0.0.1:7001
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
S: be7e9fd3b7d096b037306bc14e1017150fa59d7a 127.0.0.1:7004
   slots: (0 slots) slave
   replicates e0e8dfddd4e9d855090d6efd18e55ea9c0e1f7aa
M: 226d1af3c95bf0798ea9fed86373b89347f889da 127.0.0.1:7006
   slots: (0 slots) master
   0 additional replica(s)
M: ec964a7c7cd53b986f54318a190c1426fc53a5fa 127.0.0.1:7002
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
S: 88e16f91609c03277f2ee6ce5285932f58c221c1 127.0.0.1:7005
   slots: (0 slots) slave
   replicates ec964a7c7cd53b986f54318a190c1426fc53a5fa
S: 38910c5baafea02c5303505acfd9bd331c608cfc 127.0.0.1:7003
   slots: (0 slots) slave
   replicates 0b7e0d5337e87ac7b59bba4c1248e5c9e8d1905e
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.      

Lines 12-14 show that 7006 is now a new master in the cluster. Next, add 7007 as a slave with:

./redis-trib.rb add-node --slave --master-id 226d1af3c95bf0798ea9fed86373b89347f889da 127.0.0.1:7007 127.0.0.1:7000      

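Two extra options appear here: --slave says the new node should join as a slave, and --master-id xxxxx says whose slave it should be. The xxxxx part is 7006's ID from the earlier check output. Afterwards, confirm the state again: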

➜  src ./redis-trib.rb check 127.0.0.1:7000
>>> Performing Cluster Check (using node 127.0.0.1:7000)
M: 0b7e0d5337e87ac7b59bba4c1248e5c9e8d1905e 127.0.0.1:7000
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
S: 792bcccf35845c4922dd33d7f9827420ebb89bc9 127.0.0.1:7007
   slots: (0 slots) slave
   replicates 226d1af3c95bf0798ea9fed86373b89347f889da
M: e0e8dfddd4e9d855090d6efd18e55ea9c0e1f7aa 127.0.0.1:7001
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
S: be7e9fd3b7d096b037306bc14e1017150fa59d7a 127.0.0.1:7004
   slots: (0 slots) slave
   replicates e0e8dfddd4e9d855090d6efd18e55ea9c0e1f7aa
M: 226d1af3c95bf0798ea9fed86373b89347f889da 127.0.0.1:7006
   slots: (0 slots) master
   1 additional replica(s)
M: ec964a7c7cd53b986f54318a190c1426fc53a5fa 127.0.0.1:7002
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
S: 88e16f91609c03277f2ee6ce5285932f58c221c1 127.0.0.1:7005
   slots: (0 slots) slave
   replicates ec964a7c7cd53b986f54318a190c1426fc53a5fa
S: 38910c5baafea02c5303505acfd9bd331c608cfc 127.0.0.1:7003
   slots: (0 slots) slave
   replicates 0b7e0d5337e87ac7b59bba4c1248e5c9e8d1905e
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.      

Lines 6-8 and 15-17 show that 7007 is now the slave of 7006.
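Rather than counting lines, the slave-to-master pairing can be extracted from the check output with awk. A minimal sketch, assuming the output format shown above; `replica_map` is a made-up name, and the sample input is the pair of 7006/7007 entries from that output.

```shell
# Read `redis-trib.rb check` output on stdin and print "slave -> master".
replica_map() {
  awk '
    /^[MS]: /        { id = $2; addr[id] = $3; order[++n] = id }
    /^ +replicates / { master_of[id] = $2 }
    END {
      for (i = 1; i <= n; i++) {
        id = order[i]
        if (id in master_of) print addr[id] " -> " addr[master_of[id]]
      }
    }'
}

replica_map <<'EOF'
M: 226d1af3c95bf0798ea9fed86373b89347f889da 127.0.0.1:7006
   slots: (0 slots) master
   1 additional replica(s)
S: 792bcccf35845c4922dd33d7f9827420ebb89bc9 127.0.0.1:7007
   slots: (0 slots) slave
   replicates 226d1af3c95bf0798ea9fed86373b89347f889da
EOF
```

Against a live cluster the input would come from ./redis-trib.rb check 127.0.0.1:7000 piped into the function.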

10. Reshard: redistributing slots

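Adding nodes raises a problem: the 16384 slots are already fully divided among the other three groups, so the new node has no slots and cannot hold any cache entries. The slots need to be redistributed.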

➜  src ./redis-trib.rb info 127.0.0.1:7000
127.0.0.1:7000 (0b7e0d53...) -> 4 keys | 5461 slots | 1 slaves.
127.0.0.1:7001 (e0e8dfdd...) -> 4 keys | 5462 slots | 1 slaves.
127.0.0.1:7006 (226d1af3...) -> 0 keys | 0 slots | 1 slaves. # 7006 has no slots at all
127.0.0.1:7002 (ec964a7c...) -> 9 keys | 5461 slots | 1 slaves.
[OK] 17 keys in 4 masters.
0.00 keys per slot on average.      

Slots can be reassigned with:

./redis-trib.rb reshard 127.0.0.1:7000      

The IP:port after reshard can be any working node in the cluster.

➜  src ./redis-trib.rb reshard 127.0.0.1:7000
>>> Performing Cluster Check (using node 127.0.0.1:7000)
M: 0b7e0d5337e87ac7b59bba4c1248e5c9e8d1905e 127.0.0.1:7000
   slots:1792-4095 (2304 slots) master
   0 additional replica(s)
   ...
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
How many slots do you want to move (from 1 to 16384)? 1000 # number of slots to move
What is the receiving node ID? 0b7e0d5337e87ac7b59bba4c1248e5c9e8d1905e # ID of the target node
Please enter all the source node IDs.
  Type 'all' to use all the nodes as source nodes for the hash slots.
  Type 'done' once you entered all the source nodes IDs.
Source node #1:all # use every node as a source
    ...
    Moving slot 4309 from ec964a7c7cd53b986f54318a190c1426fc53a5fa
    Moving slot 4310 from ec964a7c7cd53b986f54318a190c1426fc53a5fa
    Moving slot 4311 from ec964a7c7cd53b986f54318a190c1426fc53a5fa
    Moving slot 4312 from ec964a7c7cd53b986f54318a190c1426fc53a5fa
    Moving slot 4313 from ec964a7c7cd53b986f54318a190c1426fc53a5fa
Do you want to proceed with the proposed reshard plan (yes/no)? yes # confirm

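Note: think carefully about the first prompt, the number of slots to move. Entering 16384 moves every slot onto one node, which makes the distribution even more unbalanced! Moving 500 to 1000 at a time is advisable, since that keeps the impact on production small.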

When entering source nodes, besides all you can also enter the IDs of specific source nodes:

[OK] All 16384 slots covered.
How many slots do you want to move (from 1 to 16384)? 300
What is the receiving node ID? 0b7e0d5337e87ac7b59bba4c1248e5c9e8d1905e 
Please enter all the source node IDs.
  Type 'all' to use all the nodes as source nodes for the hash slots.
  Type 'done' once you entered all the source nodes IDs.
Source node #1:226d1af3c95bf0798ea9fed86373b89347f889da # ID of a source node
Source node #2:done # done means no more source nodes

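reshard can be run repeatedly until the distribution is what you want. (Personally I find reshard a little tedious: the number of slots to move has to be worked out by hand, and it would be nicer to have an option that spreads the 16384 slots evenly on its own. The help output above does list a rebalance command with a --use-empty-masters option, which appears to cover this case.) Once the adjustment is done, look at the distribution again: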
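The hand calculation is simple when a new, empty master should end up with an even share: divide 16384 by the number of masters. The sketch below does that and prints a non-interactive reshard command built from the --from, --to, --slots, and --yes flags listed in the help output earlier. It assumes that --from accepts all in the same way the interactive prompt does, and `slots_per_master` is a hypothetical helper. The command is only printed, not run.

```shell
TOTAL_SLOTS=16384

# Even share of slots for one master, rounded down.
slots_per_master() {
  echo $(( TOTAL_SLOTS / $1 ))
}

masters=4
target=$(slots_per_master "$masters")
new_master_id=226d1af3c95bf0798ea9fed86373b89347f889da   # 7006's ID from the check output

echo "each of $masters masters should hold about $target slots"
# Assumption: --from all behaves like typing "all" at the interactive prompt.
echo "./redis-trib.rb reshard --from all --to $new_master_id --slots $target --yes 127.0.0.1:7000"
```

To follow the advice about moving 500 to 1000 slots at a time, the same command can be issued several times with a smaller --slots value.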

➜  src ./redis-trib.rb info 127.0.0.1:7000
127.0.0.1:7000 (0b7e0d53...) -> 4 keys | 4072 slots | 0 slaves.
127.0.0.1:7001 (e0e8dfdd...) -> 5 keys | 4099 slots | 0 slaves.
127.0.0.1:7006 (226d1af3...) -> 5 keys | 4132 slots | 4 slaves.
127.0.0.1:7002 (ec964a7c...) -> 3 keys | 4081 slots | 0 slaves.
[OK] 17 keys in 4 masters.
0.00 keys per slot on average.      

11. Removing nodes with del-node

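Where there is scaling out there is also the opposite need: nodes that are no longer needed can be removed with del-node. For example, after some careless experimenting I found that 7006 had ended up with 4 slaves while the other masters had none, which is clearly unreasonable.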

The command to remove a node:

./redis-trib.rb del-node 127.0.0.1:7006 88e16f91609c03277f2ee6ce5285932f58c221c1      

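The ip:port after del-node can be any working node in the cluster, and the last argument is the ID of the node to remove. Note that only slaves and empty masters can be removed. If a master is not empty, first use reshard to move its slots to other nodes, then delete it. In a master-slave group, once all of the master's slots have been moved away and the master is deleted, the remaining slaves find a new master and become slaves of another master.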

Also: removing a node does more than take it out of the cluster; it also stops the Redis service on the target node.
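Since a master has to be empty before it can be removed, it helps to check its slot count first. The following is a minimal sketch that reads the output of the info subcommand shown earlier; `slots_of` is a made-up helper and the sample lines are taken from that output.

```shell
# Print the number of slots held by the master at the given address,
# reading `redis-trib.rb info` output on stdin.
slots_of() {
  local addr=$1
  awk -v addr="$addr" '
    $1 == addr {
      for (i = 1; i < NF; i++)
        if ($(i + 1) == "slots") print $i   # the number just before "slots"
    }'
}

info_sample='127.0.0.1:7000 (0b7e0d53...) -> 4 keys | 5461 slots | 1 slaves.
127.0.0.1:7006 (226d1af3...) -> 0 keys | 0 slots | 1 slaves.'

if [ "$(printf '%s\n' "$info_sample" | slots_of 127.0.0.1:7006)" = "0" ]; then
  echo "7006 holds no slots, so del-node can remove it"
else
  echo "7006 still holds slots, reshard them away first"
fi
```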

Author: 菩提樹下的楊過
