Table of Contents
- 1. Overview
- 2. Building a Redis High-Availability Cluster
- 3. Communication Between Redis Cluster Nodes
- 4. Network Jitter
- 5. Redis Cluster Election Analysis
- 5.1 Must the Cluster Be Complete to Serve Requests?
- 5.2 Why Does a Redis Cluster Need at Least Three Master Nodes, and Why Is an Odd Number Recommended?
- 5.3 Sentinel Leader Election Flow
- 6. Adding/Removing Nodes
1. Overview
Before Redis 3.0, clusters were generally built around the Sentinel tool, which monitors the master node's state; if the master fails, Sentinel performs a master-slave switch, promoting one of the slaves to master. Sentinel is somewhat fiddly to configure, and its performance and availability are middling. In particular, there is a brief window of failed requests during a failover: the cluster can take ten or even tens of seconds to decide the master is down and elect a slave as the new master. In a high-concurrency scenario like Taobao's Double 11, even a momentary interruption on the Redis master is frightening: it means tens of millions of product and order queries go straight to the database, which may well collapse under the flood of requests.
Sentinel mode usually exposes only a single master to serve requests, so it cannot support very high concurrency. Even if a single Redis node could handle, say, 100,000 concurrent requests, that is still far short of the tens of millions seen on Double 11. Nor should a single master be given too much memory, or the persistence files grow too large and slow down data recovery and master-slave synchronization.

(Figure: Sentinel mode)
A Redis Cluster beats the older Sentinel mode on both performance and availability, and it is simpler to configure. Compared with a Sentinel setup, a high-availability cluster at least never goes entirely unavailable for a stretch of time after a master goes down while a new master is being elected. That is because the cluster has several masters: when we write a large batch of data into the Redis service, each key is hashed and the data lands on different masters, so when one master goes down, writes that hash to the other masters still succeed.

(Figure: High-availability cluster mode)
2. Building a Redis High-Availability Cluster
A Redis cluster needs at least three master nodes. Here we set up three masters and give each master one slave, six Redis nodes in total on ports 8001 through 8006. The author again deploys all six nodes on a single machine. The steps are as follows:
Configuration 1-1
#Under the Redis install directory, create a config and a data directory, copy redis.conf into config, rename it redis-8001.conf, and edit it as below. Some of these settings were covered in the earlier master-slave/Sentinel article and are not repeated here.
port 8001
protected-mode no
daemonize yes
pidfile "/var/run/redis-8001.pid"
logfile "8001.log"
dir "/home/lf/redis-6.2.1/data"
dbfilename "dump-8001.rdb"
#bind 127.0.0.1 -::1
appendonly yes
appendfilename "appendonly-8001.aof"
requirepass "123456"
#password used for authentication within the cluster
masterauth 123456
#enable cluster mode
cluster-enabled yes
#cluster node info file; keep the 800x here in sync with port
cluster-config-file nodes-8001.conf
#node timeout, in milliseconds
cluster-node-timeout 15000
After editing redis-8001.conf, copy it and rename the copies redis-8002.conf, redis-8003.conf, redis-8004.conf, redis-8005.conf and redis-8006.conf, then replace every 8001 in each file with 8002, 8003, 8004, 8005 and 8006 respectively. In vim the replacement can be done in one batch:
:%s/old/new/g
Note: if the cluster spans several servers, also run the commands below on each server to turn off the firewall, so the Redis processes on different servers are not blocked from reaching one another:
systemctl stop firewalld # stop the firewall for now
systemctl disable firewalld # keep it from starting at boot
Next, we add one setting to redis-8001.conf alone:
min-replicas-to-write 1
With this setting, a write to the master does not return until it has been replicated to at least one slave; with a value of 3, the master would have to replicate to 3 slaves before returning. It reduces data loss when the master goes down and a slave takes over, but it cannot eliminate that loss entirely, and it costs performance, since the master must confirm replication to the given number of slaves before answering the client.
Now start the Redis services on ports 8001 through 8006 in turn:
[root@master redis-6.2.1]# src/redis-server config/redis-8001.conf
[root@master redis-6.2.1]# src/redis-server config/redis-8002.conf
[root@master redis-6.2.1]# src/redis-server config/redis-8003.conf
[root@master redis-6.2.1]# src/redis-server config/redis-8004.conf
[root@master redis-6.2.1]# src/redis-server config/redis-8005.conf
[root@master redis-6.2.1]# src/redis-server config/redis-8006.conf
The six Redis services created above are still independent of each other. Let's look at the command that joins them into one cluster:
[root@master redis-6.2.1]# src/redis-cli --cluster help
Cluster Manager Commands:
create host1:port1 ... hostN:portN #IPs and ports of the Redis services forming the cluster
--cluster-replicas <arg> #number of replicas; N means each master gets N slaves
……
Now we join the six services into a cluster with this command. Six services means 3 masters and 3 slaves, so --cluster-replicas should be 1:
#create the cluster
[root@master redis-6.2.1]# src/redis-cli -a 123456 --cluster create --cluster-replicas 1 192.168.6.86:8001 192.168.6.86:8002 192.168.6.86:8003 192.168.6.86:8004 192.168.6.86:8005 192.168.6.86:8006
>>> Performing hash slots allocation on 6 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
Adding replica 192.168.6.86:8005 to 192.168.6.86:8001
Adding replica 192.168.6.86:8006 to 192.168.6.86:8002
Adding replica 192.168.6.86:8004 to 192.168.6.86:8003
>>> Trying to optimize slaves allocation for anti-affinity
[WARNING] Some slaves are in the same host as their master
#1>
M: 28ad6b59866832b13dbd58dd944e641862702e23 192.168.6.86:8001
slots:[0-5460] (5461 slots) master
M: baf630fe745d9f1db7a58ffb96e180fab1047c79 192.168.6.86:8002
slots:[5461-10922] (5462 slots) master
M: 115a626ee6d475076b096181ab10d3ab6988cc04 192.168.6.86:8003
slots:[10923-16383] (5461 slots) master
S: 54b6c985bf0f41fa1b92cff7c165c317dd0a30c7 192.168.6.86:8004
replicates baf630fe745d9f1db7a58ffb96e180fab1047c79
S: 9c6f93c3b5329e60032b970b57e599b98961cba6 192.168.6.86:8005
replicates 115a626ee6d475076b096181ab10d3ab6988cc04
S: aa6ce37e876660161403a801adb8fc7a79a9d876 192.168.6.86:8006
replicates 28ad6b59866832b13dbd58dd944e641862702e23
Can I set the above configuration? (type 'yes' to accept): yes #2>
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
>>> Performing Cluster Check (using node 192.168.6.86:8001)
#3>
M: 28ad6b59866832b13dbd58dd944e641862702e23 192.168.6.86:8001
slots:[0-5460] (5461 slots) master
1 additional replica(s)
S: aa6ce37e876660161403a801adb8fc7a79a9d876 192.168.6.86:8006
slots: (0 slots) slave
replicates 28ad6b59866832b13dbd58dd944e641862702e23
M: baf630fe745d9f1db7a58ffb96e180fab1047c79 192.168.6.86:8002
slots:[5461-10922] (5462 slots) master
1 additional replica(s)
S: 9c6f93c3b5329e60032b970b57e599b98961cba6 192.168.6.86:8005
slots: (0 slots) slave
replicates 115a626ee6d475076b096181ab10d3ab6988cc04
M: 115a626ee6d475076b096181ab10d3ab6988cc04 192.168.6.86:8003
slots:[10923-16383] (5461 slots) master
1 additional replica(s)
S: 54b6c985bf0f41fa1b92cff7c165c317dd0a30c7 192.168.6.86:8004
slots: (0 slots) slave
replicates baf630fe745d9f1db7a58ffb96e180fab1047c79
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
Let's dissect part of the cluster-creation output. There are 3 M entries and 3 S entries, for master and slave nodes respectively, each followed by the node ID and IP:port. By default the cluster takes the first three services we listed as masters, so with our input the services on ports 8001, 8002 and 8003 become masters. Each master entry also shows the slot range it owns: Redis divides the keyspace into 16384 slots, and each master stores a portion of them. Here 8001 owns slots [0,5460], 8002 owns [5461,10922], and 8003 owns [10923,16383]. When we store or read a key, the Redis client hashes the key and runs the command on the master owning the resulting slot. The slave entries look much like the master ones except for the slot range; a slave's replicates {masterID} line names the ID of its master. So slave 8004 belongs to master 8002, slave 8005 to master 8003, and slave 8006 to master 8001.
#1>
M (master): 28ad6b59866832b13dbd58dd944e641862702e23 (node ID) 192.168.6.86:8001 (node IP and port)
slots:[0-5460] (5461 slots) master (slot range: keys hashing into 0~5460 land on this node)
M: baf630fe745d9f1db7a58ffb96e180fab1047c79 192.168.6.86:8002
slots:[5461-10922] (5462 slots) master
M: 115a626ee6d475076b096181ab10d3ab6988cc04 192.168.6.86:8003
slots:[10923-16383] (5461 slots) master
S: 54b6c985bf0f41fa1b92cff7c165c317dd0a30c7 192.168.6.86:8004
replicates baf630fe745d9f1db7a58ffb96e180fab1047c79
S: 9c6f93c3b5329e60032b970b57e599b98961cba6 192.168.6.86:8005
replicates 115a626ee6d475076b096181ab10d3ab6988cc04
S (slave): aa6ce37e876660161403a801adb8fc7a79a9d876 (node ID) 192.168.6.86:8006 (node IP and port)
replicates 28ad6b59866832b13dbd58dd944e641862702e23 (ID of the master this slave follows)
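The slot a key maps to is CRC16(key) mod 16384, where CRC16 is the CRC-16/XMODEM variant named in the Redis Cluster specification. A minimal self-contained sketch:

```python
def crc16(data: bytes) -> int:
    """CRC-16/XMODEM (polynomial 0x1021, initial value 0),
    the variant Redis Cluster uses for key hashing."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def key_slot(key: str) -> int:
    """HASH_SLOT = CRC16(key) mod 16384."""
    return crc16(key.encode()) % 16384
```

key_slot("python") evaluates to 7252 and key_slot("java") to 858, matching the redirects shown in the SET/GET examples below.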
If you accept the proposed master-slave layout, type yes at 2> and press Enter. 3> then shows the actual layout, which barring surprises should roughly match 1>. We placed all the nodes on one server here; had we spread them over several servers, Redis would deliberately place a master and its slave on different servers. It does this so that one server failing cannot take out an entire master-slave pair: with the master and slave on different machines, if the master's server dies, the slave can still be promoted to master.
To view the cluster information, connect to any node and run the CLUSTER NODES or CLUSTER INFO command:
[root@master redis-6.2.1]# src/redis-cli -a 123456 -c -p 8001
127.0.0.1:8001> CLUSTER NODES
aa6ce37e876660161403a801adb8fc7a79a9d876 192.168.6.86:8006@18006 slave 28ad6b59866832b13dbd58dd944e641862702e23 0 1618317182151 1 connected
baf630fe745d9f1db7a58ffb96e180fab1047c79 192.168.6.86:8002@18002 master - 0 1618317187163 2 connected 5461-10922
9c6f93c3b5329e60032b970b57e599b98961cba6 192.168.6.86:8005@18005 slave 115a626ee6d475076b096181ab10d3ab6988cc04 0 1618317186161 3 connected
115a626ee6d475076b096181ab10d3ab6988cc04 192.168.6.86:8003@18003 master - 0 1618317184000 3 connected 10923-16383
54b6c985bf0f41fa1b92cff7c165c317dd0a30c7 192.168.6.86:8004@18004 slave baf630fe745d9f1db7a58ffb96e180fab1047c79 0 1618317186000 2 connected
28ad6b59866832b13dbd58dd944e641862702e23 192.168.6.86:8001@18001 myself,master - 0 1618317184000 1 connected 0-5460
127.0.0.1:8001> CLUSTER INFO
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:6
cluster_my_epoch:1
cluster_stats_messages_ping_sent:61
cluster_stats_messages_pong_sent:62
cluster_stats_messages_sent:123
cluster_stats_messages_ping_received:57
cluster_stats_messages_pong_received:61
cluster_stats_messages_meet_received:5
cluster_stats_messages_received:123
CLUSTER NODES shows the cluster's master-slave layout, the slots each master manages, the master each slave follows, and each node's connection state. Note one thing: if every server in the cluster crashes, you do not need to run redis-cli --cluster create again to rebuild the cluster once the machines are back; just start each of the 8001~8006 Redis nodes and the whole cluster recovers, because once a cluster has been created, its node information is written to the nodes-800X.conf files configured earlier.
Now let's test the cluster by setting two key-value pairs, <python, flask> and <java, spring>:
[root@master redis-6.2.1]# src/redis-cli -a 123456 -c -p 8001
127.0.0.1:8001> SET python flask
-> Redirected to slot [7252] located at 192.168.6.86:8002
OK
192.168.6.86:8002> SET java spring
-> Redirected to slot [858] located at 192.168.6.86:8001
OK
192.168.6.86:8001> GET java
"spring"
192.168.6.86:8001> GET python
-> Redirected to slot [7252] located at 192.168.6.86:8002
"flask"
From the output above we can see that when setting <python, flask>, Redis computes python's hash as slot 7252, which falls in the range [5461-10922] managed by node 8002, so we are redirected to 8002. When we then set <java, spring> on master 8002, the Redis service computes java's hash as slot 858, which falls in [0-5460] managed by 8001, and redirects us to 8001. GET commands are redirected in the same way.
Now let's kill the process of 8006, the slave of 8001, to test whether the min-replicas-to-write setting given to 8001 alone takes effect. We configured 8001 to replicate written data to at least one slave before returning; with 8006 gone, we try to set <java, tomcat> on the 8001 Redis service again:
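Those redirects can be reproduced from the slot layout printed when the cluster was created. A small sketch; the ranges are hard-coded from this cluster's output, whereas a real client would discover them with CLUSTER SLOTS:

```python
# Slot ranges as assigned when this cluster was created (inclusive bounds).
SLOT_RANGES = {
    "192.168.6.86:8001": (0, 5460),
    "192.168.6.86:8002": (5461, 10922),
    "192.168.6.86:8003": (10923, 16383),
}

def node_for_slot(slot: int) -> str:
    """Return the master owning a slot, mimicking the MOVED/redirect target."""
    for node, (lo, hi) in SLOT_RANGES.items():
        if lo <= slot <= hi:
            return node
    raise ValueError(f"slot {slot} not covered")
```

With this table, slot 7252 routes to 8002 and slot 858 to 8001, exactly the redirects in the transcript above.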
[root@master redis-6.2.1]# ps -ef | grep redis
root 44661 22426 0 19:50 pts/0 00:00:00 grep --color=auto redis
root 108814 1 0 Apr13 ? 00:13:24 src/redis-server *:8002 [cluster]
root 108820 1 0 Apr13 ? 00:13:31 src/redis-server *:8003 [cluster]
root 108826 1 0 Apr13 ? 00:13:14 src/redis-server *:8004 [cluster]
root 108835 1 0 Apr13 ? 00:13:43 src/redis-server *:8005 [cluster]
root 108923 1 0 Apr13 ? 00:13:21 src/redis-server *:8001 [cluster]
root 109206 1 0 Apr13 ? 00:13:28 src/redis-server *:8006 [cluster]
root 109315 1 0 Apr13 ? 00:13:43 src/redis-server *:8007 [cluster]
root 109324 1 0 Apr13 ? 00:13:20 src/redis-server *:8008 [cluster]
root 109963 103945 0 Apr13 pts/1 00:00:00 src/redis-cli -a 123456 -c -p 8001
#kill the Redis service on port 8006
[root@master redis-6.2.1]# kill -9 109206
#after connecting to the 8001 Redis service, trying to set <java, tomcat> fails: not enough replicas to write
192.168.6.86:8001> SET java tomcat
(error) NOREPLICAS Not enough good replicas to write.
The result above confirms that min-replicas-to-write N really does ensure a write to the master returns only after being replicated to at least N slaves. If we restart the 8006 slave, it automatically rejoins the cluster, and master 8001 can set key-value pairs normally again:
[root@master redis-6.2.1]# src/redis-server config/redis-8006.conf
192.168.6.86:8001> SET java tomcat
OK
3. Communication Between Redis Cluster Nodes
Redis Cluster nodes talk to each other over the gossip protocol. There are two general ways to maintain cluster metadata (node list, master/slave roles, node count, data shared between nodes, and so on): centralized storage and gossip.
3.1 Centralized
Its advantage is timeliness: as soon as metadata changes, the change is written to the central store, where other nodes can see it immediately on read. The drawback is that all update pressure is concentrated in one place, which can strain the metadata store. Many middleware systems use ZooKeeper as such a centralized metadata store.
3.2 Gossip
The gossip protocol carries several message types, including ping, pong, meet and fail.
- meet: a node sends meet to a newly added node to bring it into the cluster, after which the new node starts communicating with the others.
- ping: each node frequently pings the other nodes with its own state and the cluster metadata it maintains; nodes exchange metadata through these pings (node additions and removals they have observed, hash slot information, and so on);
- pong: the reply to ping and meet messages, carrying the sender's state and other information; it can also be used to broadcast and refresh information;
- fail: after one node judges another node to be down, it sends fail to the other nodes to announce that the node in question is offline.
Gossip's advantage is that metadata updates are decentralized rather than concentrated in one place; update requests trickle out to all nodes, which relieves the pressure, but with some delay, so certain cluster operations may lag. Every node has a dedicated port for inter-node gossip: its own service port plus 10000, so a node serving on 8001 gossips on port 18001. At regular intervals each node pings a few of the others, and the nodes that receive a ping reply with a pong.
4. Network Jitter
Production machine-room networks are rarely calm; all kinds of problems come up. Network jitter in particular is common: some connections suddenly become unreachable, then recover shortly afterwards.
To cope with this, Redis Cluster provides the cluster-node-timeout option: a node must stay unreachable for longer than this timeout before it is judged to have failed and a master-slave switch is triggered. Without it, network jitter would cause frequent failovers (and the data re-replication that comes with them).
5. Redis Cluster Election Analysis
When a slave finds its master in the fail state, it attempts a failover in the hope of becoming the new master. Since the failed master may have more than one slave, several slaves may compete for promotion. The process runs as follows:
1. The slave notices that its master has entered the fail state.
2. It increments its copy of the cluster currentEpoch and broadcasts a FAILOVER_AUTH_REQUEST message.
3. The other nodes receive the message, but only masters respond; a master checks the requester's legitimacy and replies with FAILOVER_AUTH_ACK, sending at most one ack per epoch.
4. The slave attempting the failover collects the FAILOVER_AUTH_ACKs returned by the masters.
5. Once it has received acks from more than half of the masters, the slave becomes the new master (this is why the cluster needs at least three masters: with only two, once one fails, the single remaining master cannot produce a successful election).
6. The new master broadcasts a pong to notify the other cluster nodes. A slave does not launch an election the moment its master enters fail; it waits a short delay so the fail state has time to propagate through the cluster. If the slave tried to run immediately, masters not yet aware of the fail state might refuse to vote.
Delay formula: DELAY = 500ms + random(0~500ms) + SLAVE_RANK * 1000ms
SLAVE_RANK ranks slaves by how much data they have replicated from the master: a smaller rank means fresher data. This way the slave holding the most up-to-date data starts the election first.
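The majority check from step 5 and the delay formula can be sketched as follows; SLAVE_RANK is taken here as a plain input, while real Redis derives it from replication offsets:

```python
import random

def wins_election(acks: int, num_masters: int) -> bool:
    """Step 5: a slave is promoted once it holds acks from a majority of masters."""
    return acks > num_masters // 2

def failover_delay_ms(slave_rank: int) -> int:
    """DELAY = 500ms + random(0~500ms) + SLAVE_RANK * 1000ms."""
    return 500 + random.randint(0, 500) + slave_rank * 1000
```

With three masters, two acks win the election; with two masters, the single surviving master's one ack can never exceed half, which is the point made in step 5.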
5.1 Must the Cluster Be Complete to Serve Requests?
When cluster-require-full-coverage in redis.conf is set to no, the cluster remains available even when a master holding slots goes offline with no slave available to fail over to; when it is set to yes, the cluster becomes unavailable in that situation.
5.2 Why Does a Redis Cluster Need at Least Three Master Nodes, and Why Is an Odd Number Recommended?
A failover succeeds only when the candidate slave gathers acks from more than half of the masters, so with two masters, once one dies the survivor alone can never form a majority; three is the minimum that still leaves a working majority after one master fails. An odd count is recommended because it saves machines for the same fault tolerance: three masters and four masters can each survive only one master failure (needing majorities of 2 and 3 respectively), so the fourth master adds nothing.
A related restriction: for commands such as MSET and MGET that operate on several keys at once, Redis Cluster only supports the case where all the keys fall into the same slot. If you must batch-operate multiple keys with a command like MSET on a cluster, prefix the keys with {XX}: the sharding hash is then computed only over the value inside the braces, which guarantees the different keys land in the same slot. For example:
#user:1:name and user:2:name land in different slots, so multi-key commands like MSET cannot be used on them
192.168.6.86:8002> MSET user:1:name Tom user:2:name Amy
(error) CROSSSLOT Keys in request don't hash to the same slot
#with the {XX} prefix, {user}:1:name and {user}:2:name are guaranteed to land in the same slot
192.168.6.86:8002> MSET {user}:1:name Tom {user}:2:name Amy
-> Redirected to slot [5474] located at 192.168.6.86:8001
OK
192.168.6.86:8001> MGET {user}:1:name {user}:2:name
1) "Tom"
2) "Amy"
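The hash-tag rule the example relies on can be sketched as follows: when a key contains a {...} pair with at least one character between the braces, only that substring is hashed (the CRC-16/XMODEM routine is repeated here so the sketch stays self-contained):

```python
def crc16(data: bytes) -> int:
    # CRC-16/XMODEM, the variant Redis Cluster uses for key hashing
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) & 0xFFFF if crc & 0x8000 else (crc << 1) & 0xFFFF
    return crc

def hash_tag(key: str) -> str:
    """Return the part of the key that gets hashed: the content of the first
    {...} pair when it is non-empty, otherwise the whole key."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end > start + 1:  # at least one character between the braces
            return key[start + 1:end]
    return key

def key_slot(key: str) -> int:
    return crc16(hash_tag(key).encode()) % 16384
```

Both {user}:1:name and {user}:2:name hash only the substring user, so they share slot 5474, the slot shown in the MSET redirect above.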
5.3 Sentinel Leader Election Flow
After a sentinel marks a master server as down, it negotiates with the other sentinels to elect a sentinel leader to carry out the failover. Any sentinel that has observed the master go down may ask the others to elect it leader, on a first-come, first-served basis. Each sentinel increments its election epoch with every election, and within one epoch it votes for only one sentinel as leader. If more than half of all sentinels vote for the same sentinel, that sentinel becomes the leader. The leader then performs the failover, electing a new master from among the surviving slaves, in a process much like the cluster's own master election.
Even a deployment with a single sentinel node can elect a new master when the old master goes down; that lone sentinel simply acts as leader itself. For high availability, though, at least three sentinel nodes are generally recommended, and the reason an odd number is preferred mirrors the reasoning for an odd number of cluster masters.
6. Adding/Removing Nodes
So far we have covered creating a cluster and setting key-value pairs in it; what remains is adding nodes to and removing nodes from the cluster. Here the author walks through adding a master-slave pair of Redis nodes on ports 8007 and 8008, then removing that pair again. Following the earlier steps, copy redis.conf into the config directory as redis-8007.conf and redis-8008.conf, replace the 8001 of configuration 1-1 with 8007 and 8008, and start the two Redis services:
[root@master redis-6.2.1]# src/redis-server config/redis-8007.conf
[root@master redis-6.2.1]# src/redis-server config/redis-8008.conf
Then run redis-cli --cluster help to see how a new node is added to the cluster:
[root@master redis-6.2.1]# src/redis-cli --cluster help
Cluster Manager Commands:
create host1:port1 ... hostN:portN
--cluster-replicas <arg>
check host:port
--cluster-search-multiple-owners
info host:port
fix host:port
--cluster-search-multiple-owners
--cluster-fix-with-unreachable-masters
reshard host:port
--cluster-from <arg>
--cluster-to <arg>
--cluster-slots <arg>
--cluster-yes
--cluster-timeout <arg>
--cluster-pipeline <arg>
--cluster-replace
rebalance host:port
--cluster-weight <node1=w1...nodeN=wN>
--cluster-use-empty-masters
--cluster-timeout <arg>
--cluster-simulate
--cluster-pipeline <arg>
--cluster-threshold <arg>
--cluster-replace
add-node new_host:new_port existing_host:existing_port
--cluster-slave
--cluster-master-id <arg>
del-node host:port node_id
call host:port command arg arg .. arg
--cluster-only-masters
--cluster-only-replicas
set-timeout host:port milliseconds
import host:port
--cluster-from <arg>
--cluster-from-user <arg>
--cluster-from-pass <arg>
--cluster-from-askpass
--cluster-copy
--cluster-replace
backup host:port backup_directory
help
1. create: create a cluster from host1:port1 ... hostN:portN.
2. call: run a Redis command against the cluster.
3. add-node: add a node to the cluster; the first argument is the new node's ip:port, the second is the ip:port of any node already in the cluster.
4. del-node: remove a node.
5. reshard: reassign slots.
6. check: check the cluster's state.
Now we add the 8007 Redis service to the cluster. This takes two arguments, the new node's IP and port and the IP and port of a node already in the cluster, here 192.168.6.86:8007 and 192.168.6.86:8001:
[root@master redis-6.2.1]# src/redis-cli -a 123456 --cluster add-node 192.168.6.86:8007 192.168.6.86:8001
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Adding node 192.168.6.86:8007 to cluster 192.168.6.86:8001
>>> Performing Cluster Check (using node 192.168.6.86:8001)
M: 28ad6b59866832b13dbd58dd944e641862702e23 192.168.6.86:8001
slots:[0-5460] (5461 slots) master
1 additional replica(s)
S: aa6ce37e876660161403a801adb8fc7a79a9d876 192.168.6.86:8006
slots: (0 slots) slave
replicates 28ad6b59866832b13dbd58dd944e641862702e23
M: baf630fe745d9f1db7a58ffb96e180fab1047c79 192.168.6.86:8002
slots:[5461-10922] (5462 slots) master
1 additional replica(s)
S: 9c6f93c3b5329e60032b970b57e599b98961cba6 192.168.6.86:8005
slots: (0 slots) slave
replicates 115a626ee6d475076b096181ab10d3ab6988cc04
M: 115a626ee6d475076b096181ab10d3ab6988cc04 192.168.6.86:8003
slots:[10923-16383] (5461 slots) master
1 additional replica(s)
S: 54b6c985bf0f41fa1b92cff7c165c317dd0a30c7 192.168.6.86:8004
slots: (0 slots) slave
replicates baf630fe745d9f1db7a58ffb96e180fab1047c79
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 192.168.6.86:8007 to make it join the cluster.
[OK] New node added correctly.
When the node joins, the cluster's existing master-slave layout is printed once more, ending with the line [OK] New node added correctly, meaning the node was added successfully.
Following the same steps, we add 8008 to the cluster as well; note that this time the printed cluster information contains one more master than last time, 8007:
[root@master redis-6.2.1]# src/redis-cli -a 123456 --cluster add-node 192.168.6.86:8008 192.168.6.86:8001
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Adding node 192.168.6.86:8008 to cluster 192.168.6.86:8001
>>> Performing Cluster Check (using node 192.168.6.86:8001)
M: 28ad6b59866832b13dbd58dd944e641862702e23 192.168.6.86:8001
slots:[0-5460] (5461 slots) master
1 additional replica(s)
S: aa6ce37e876660161403a801adb8fc7a79a9d876 192.168.6.86:8006
slots: (0 slots) slave
replicates 28ad6b59866832b13dbd58dd944e641862702e23
M: baf630fe745d9f1db7a58ffb96e180fab1047c79 192.168.6.86:8002
slots:[5461-10922] (5462 slots) master
1 additional replica(s)
S: 9c6f93c3b5329e60032b970b57e599b98961cba6 192.168.6.86:8005
slots: (0 slots) slave
replicates 115a626ee6d475076b096181ab10d3ab6988cc04
M: 115a626ee6d475076b096181ab10d3ab6988cc04 192.168.6.86:8003
slots:[10923-16383] (5461 slots) master
1 additional replica(s)
M: 5846d4b7785447b9d7b1c08a0ed74c5e68f2f367 192.168.6.86:8007
slots: (0 slots) master
S: 54b6c985bf0f41fa1b92cff7c165c317dd0a30c7 192.168.6.86:8004
slots: (0 slots) slave
replicates baf630fe745d9f1db7a58ffb96e180fab1047c79
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 192.168.6.86:8008 to make it join the cluster.
[OK] New node added correctly.
Printing the cluster information now shows 8007 and 8008 both as masters, with no slots assigned to either. This is normal: a node newly added to a cluster joins as a master, and its master-slave relationship and slot assignment are left for us to set up manually:
192.168.6.86:8001> CLUSTER NODES
aa6ce37e876660161403a801adb8fc7a79a9d876 192.168.6.86:8006@18006 slave 28ad6b59866832b13dbd58dd944e641862702e23 0 1618318693000 1 connected
baf630fe745d9f1db7a58ffb96e180fab1047c79 192.168.6.86:8002@18002 master - 0 1618318692000 2 connected 5461-10922
9c6f93c3b5329e60032b970b57e599b98961cba6 192.168.6.86:8005@18005 slave 115a626ee6d475076b096181ab10d3ab6988cc04 0 1618318693725 3 connected
115a626ee6d475076b096181ab10d3ab6988cc04 192.168.6.86:8003@18003 master - 0 1618318695730 3 connected 10923-16383
5cd842f76c141eddf5270218b877a54a0c202998 192.168.6.86:8008@18008 master - 0 1618318690000 0 connected
5846d4b7785447b9d7b1c08a0ed74c5e68f2f367 192.168.6.86:8007@18007 master - 0 1618318694728 7 connected
54b6c985bf0f41fa1b92cff7c165c317dd0a30c7 192.168.6.86:8004@18004 slave baf630fe745d9f1db7a58ffb96e180fab1047c79 0 1618318691000 2 connected
28ad6b59866832b13dbd58dd944e641862702e23 192.168.6.86:8001@18001 myself,master - 0 1618318692000 1 connected 0-5460
First we connect to node 8008 and make it a slave of 8007. The command CLUSTER REPLICATE {masterID} turns a newly added master into a slave of another master; here the masterID is 8007's ID:
[root@master redis-6.2.1]# src/redis-cli -a 123456 -c -p 8008
127.0.0.1:8008> CLUSTER REPLICATE 5846d4b7785447b9d7b1c08a0ed74c5e68f2f367
OK
#CLUSTER NODES confirms that 8008 has become a slave of 8007
127.0.0.1:8008> CLUSTER NODES
baf630fe745d9f1db7a58ffb96e180fab1047c79 192.168.6.86:8002@18002 master - 0 1618318835003 2 connected 5461-10922
5846d4b7785447b9d7b1c08a0ed74c5e68f2f367 192.168.6.86:8007@18007 master - 0 1618318835000 7 connected
54b6c985bf0f41fa1b92cff7c165c317dd0a30c7 192.168.6.86:8004@18004 slave baf630fe745d9f1db7a58ffb96e180fab1047c79 0 1618318834000 2 connected
28ad6b59866832b13dbd58dd944e641862702e23 192.168.6.86:8001@18001 master - 0 1618318832000 1 connected 0-5460
115a626ee6d475076b096181ab10d3ab6988cc04 192.168.6.86:8003@18003 master - 0 1618318832999 3 connected 10923-16383
5cd842f76c141eddf5270218b877a54a0c202998 192.168.6.86:8008@18008 myself,slave 5846d4b7785447b9d7b1c08a0ed74c5e68f2f367 0 1618318833000 7 connected
9c6f93c3b5329e60032b970b57e599b98961cba6 192.168.6.86:8005@18005 slave 115a626ee6d475076b096181ab10d3ab6988cc04 0 1618318832000 3 connected
aa6ce37e876660161403a801adb8fc7a79a9d876 192.168.6.86:8006@18006 slave 28ad6b59866832b13dbd58dd944e641862702e23 0 1618318836006 1 connected
With the new master-slave pair arranged, we assign it slots using the --cluster reshard command:
[root@master redis-6.2.1]# src/redis-cli -a 123456 --cluster reshard 192.168.6.86:8001
>>> Performing Cluster Check (using node 192.168.6.86:8001)
M: 28ad6b59866832b13dbd58dd944e641862702e23 192.168.6.86:8001
slots:[0-5460] (5461 slots) master
1 additional replica(s)
S: aa6ce37e876660161403a801adb8fc7a79a9d876 192.168.6.86:8006
slots: (0 slots) slave
replicates 28ad6b59866832b13dbd58dd944e641862702e23
M: baf630fe745d9f1db7a58ffb96e180fab1047c79 192.168.6.86:8002
slots:[5461-10922] (5462 slots) master
1 additional replica(s)
S: 9c6f93c3b5329e60032b970b57e599b98961cba6 192.168.6.86:8005
slots: (0 slots) slave
replicates 115a626ee6d475076b096181ab10d3ab6988cc04
M: 115a626ee6d475076b096181ab10d3ab6988cc04 192.168.6.86:8003
slots:[10923-16383] (5461 slots) master
1 additional replica(s)
S: 5cd842f76c141eddf5270218b877a54a0c202998 192.168.6.86:8008
slots: (0 slots) slave
replicates 5846d4b7785447b9d7b1c08a0ed74c5e68f2f367
M: 5846d4b7785447b9d7b1c08a0ed74c5e68f2f367 192.168.6.86:8007
slots: (0 slots) master
1 additional replica(s)
S: 54b6c985bf0f41fa1b92cff7c165c317dd0a30c7 192.168.6.86:8004
slots: (0 slots) slave
replicates baf630fe745d9f1db7a58ffb96e180fab1047c79
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
#move 600 slots from the existing masters to another master
How many slots do you want to move (from 1 to 16384)? 600
#enter the ID of master 8007; the 600 slots will be handed over to 8007
What is the receiving node ID? 5846d4b7785447b9d7b1c08a0ed74c5e68f2f367
Please enter all the source node IDs.
#entering all draws the 600 slots from every master (8001, 8002, 8003) and assigns them to the target master (8007)
Type 'all' to use all the nodes as source nodes for the hash slots.
#entering done instead lets you name exactly which nodes the slots are taken from
Type 'done' once you entered all the source nodes IDs.
#here we enter all, letting the cluster draw slots from each master automatically; moving 600 slots produces a lot of output, excerpted here, ending at slot 11121 of master 8003
Source node #1: all
……
Moving slot 11119 from 115a626ee6d475076b096181ab10d3ab6988cc04
Moving slot 11120 from 115a626ee6d475076b096181ab10d3ab6988cc04
Moving slot 11121 from 115a626ee6d475076b096181ab10d3ab6988cc04
#enter yes and Redis starts assigning the slots.
Do you want to proceed with the proposed reshard plan (yes/no)? yes
With the slots assigned, look at the masters' slot ranges again: 8001, 8002 and 8003 now manage different ranges than before, while 8007 manages three ranges handed over from 8001, 8002 and 8003 respectively: [0,198], [5461,5661] and [10923,11121]:
127.0.0.1:8001> CLUSTER NODES
aa6ce37e876660161403a801adb8fc7a79a9d876 192.168.6.86:8006@18006 slave 28ad6b59866832b13dbd58dd944e641862702e23 0 1618319470349 1 connected
baf630fe745d9f1db7a58ffb96e180fab1047c79 192.168.6.86:8002@18002 master - 0 1618319472353 2 connected 5662-10922
9c6f93c3b5329e60032b970b57e599b98961cba6 192.168.6.86:8005@18005 slave 115a626ee6d475076b096181ab10d3ab6988cc04 0 1618319469347 3 connected
115a626ee6d475076b096181ab10d3ab6988cc04 192.168.6.86:8003@18003 master - 0 1618319471351 3 connected 11122-16383
5cd842f76c141eddf5270218b877a54a0c202998 192.168.6.86:8008@18008 slave 5846d4b7785447b9d7b1c08a0ed74c5e68f2f367 0 1618319469000 7 connected
5846d4b7785447b9d7b1c08a0ed74c5e68f2f367 192.168.6.86:8007@18007 master - 0 1618319470000 7 connected 0-198 5461-5661 10923-11121
54b6c985bf0f41fa1b92cff7c165c317dd0a30c7 192.168.6.86:8004@18004 slave baf630fe745d9f1db7a58ffb96e180fab1047c79 0 1618319468345 2 connected
28ad6b59866832b13dbd58dd944e641862702e23 192.168.6.86:8001@18001 myself,master - 0 1618319470000 1 connected 199-5460
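A quick sanity check on output like the above is to verify that the masters' ranges still tile all 16384 slots with no gap or overlap, which is what redis-cli's "Check slots coverage" step does. A sketch over this cluster's post-reshard ranges:

```python
def covers_all_slots(ranges) -> bool:
    """ranges: (start, end) inclusive slot ranges owned by the masters.
    True when they tile 0..16383 exactly, with no gap or overlap."""
    expected_start = 0
    for start, end in sorted(ranges):
        if start != expected_start:
            return False  # gap or overlap at this boundary
        expected_start = end + 1
    return expected_start == 16384

# Ranges from the CLUSTER NODES output above (8007 now owns three ranges):
POST_RESHARD = [(199, 5460), (5662, 10922), (11122, 16383),
                (0, 198), (5461, 5661), (10923, 11121)]
```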
Now let's try removing nodes. We start with the 8008 slave, using --cluster del-node {host}:{port} {nodeID} to remove a slave from the cluster:
[root@master redis-6.2.1]# src/redis-cli -a 123456 --cluster del-node 192.168.6.86:8008 5cd842f76c141eddf5270218b877a54a0c202998
>>> Removing node 5cd842f76c141eddf5270218b877a54a0c202998 from cluster 192.168.6.86:8008
>>> Sending CLUSTER FORGET messages to the cluster...
>>> Sending CLUSTER RESET SOFT to the deleted node.
我們?cè)僖瞥?007主節(jié)點(diǎn),由于8007節(jié)點(diǎn)已經(jīng)分配了槽位,直接移除會(huì)報(bào)錯(cuò),這里我們要先把8007的槽位歸還給各個(gè)主節(jié)點(diǎn),這里我們依舊使用
--cluster reshard將8007現(xiàn)有的節(jié)點(diǎn)重新劃分:
#redistribute the slots of master 8007
[root@master redis-6.2.1]# src/redis-cli -a 123456 --cluster reshard 192.168.6.86:8007
>>> Performing Cluster Check (using node 192.168.6.86:8007)
M: 5846d4b7785447b9d7b1c08a0ed74c5e68f2f367 192.168.6.86:8007
slots:[0-198],[5461-5661],[10923-11121] (599 slots) master
M: 28ad6b59866832b13dbd58dd944e641862702e23 192.168.6.86:8001
slots:[199-5460] (5262 slots) master
1 additional replica(s)
S: 54b6c985bf0f41fa1b92cff7c165c317dd0a30c7 192.168.6.86:8004
slots: (0 slots) slave
replicates baf630fe745d9f1db7a58ffb96e180fab1047c79
M: baf630fe745d9f1db7a58ffb96e180fab1047c79 192.168.6.86:8002
slots:[5662-10922] (5261 slots) master
1 additional replica(s)
M: 115a626ee6d475076b096181ab10d3ab6988cc04 192.168.6.86:8003
slots:[11122-16383] (5262 slots) master
1 additional replica(s)
S: aa6ce37e876660161403a801adb8fc7a79a9d876 192.168.6.86:8006
slots: (0 slots) slave
replicates 28ad6b59866832b13dbd58dd944e641862702e23
S: 9c6f93c3b5329e60032b970b57e599b98961cba6 192.168.6.86:8005
slots: (0 slots) slave
replicates 115a626ee6d475076b096181ab10d3ab6988cc04
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
#the 600 slots previously assigned to node 8007 are now to be moved back out
How many slots do you want to move (from 1 to 16384)? 600
#enter the ID of the node receiving the slots, here 8001
What is the receiving node ID? 28ad6b59866832b13dbd58dd944e641862702e23
Please enter all the source node IDs.
Type 'all' to use all the nodes as source nodes for the hash slots.
Type 'done' once you entered all the source nodes IDs.
#enter the ID of node 8007
Source node #1: 5846d4b7785447b9d7b1c08a0ed74c5e68f2f367
#enter done to generate the slot migration plan
Source node #2: done
……
Moving slot 11119 from 5846d4b7785447b9d7b1c08a0ed74c5e68f2f367
Moving slot 11120 from 5846d4b7785447b9d7b1c08a0ed74c5e68f2f367
Moving slot 11121 from 5846d4b7785447b9d7b1c08a0ed74c5e68f2f367
#enter yes to start the migration; the output below shows slots 11119, 11120 and 11121 being moved to master 8001
Do you want to proceed with the proposed reshard plan (yes/no)? yes
……
Moving slot 11119 from 192.168.6.86:8007 to 192.168.6.86:8001:
Moving slot 11120 from 192.168.6.86:8007 to 192.168.6.86:8001:
Moving slot 11121 from 192.168.6.86:8007 to 192.168.6.86:8001:
Redistributing 8007's slots does not put 8001, 8002 and 8003 back to their original ranges: 8001 now manages the two ranges [0,5661] and [10923,11121], no longer the original [0-5460]. We leave comparing 8002 and 8003 with their original ranges to the reader:
192.168.6.86:8001> CLUSTER NODES
aa6ce37e876660161403a801adb8fc7a79a9d876 192.168.6.86:8006@18006 slave 28ad6b59866832b13dbd58dd944e641862702e23 0 1618651357467 8 connected
baf630fe745d9f1db7a58ffb96e180fab1047c79 192.168.6.86:8002@18002 master - 0 1618651357000 2 connected 5662-10922
9c6f93c3b5329e60032b970b57e599b98961cba6 192.168.6.86:8005@18005 slave 115a626ee6d475076b096181ab10d3ab6988cc04 0 1618651356000 3 connected
115a626ee6d475076b096181ab10d3ab6988cc04 192.168.6.86:8003@18003 master - 0 1618651355000 3 connected 11122-16383
54b6c985bf0f41fa1b92cff7c165c317dd0a30c7 192.168.6.86:8004@18004 slave baf630fe745d9f1db7a58ffb96e180fab1047c79 0 1618651355463 2 connected
28ad6b59866832b13dbd58dd944e641862702e23 192.168.6.86:8001@18001 myself,master - 0 1618651354000 8 connected 0-5661 10923-11121
After the reshard, let's look at the node information once more:
127.0.0.1:8001> CLUSTER NODES
aa6ce37e876660161403a801adb8fc7a79a9d876 192.168.6.86:8006@18006 slave 28ad6b59866832b13dbd58dd944e641862702e23 0 1618320346264 8 connected
baf630fe745d9f1db7a58ffb96e180fab1047c79 192.168.6.86:8002@18002 master - 0 1618320345000 2 connected 5662-10922
9c6f93c3b5329e60032b970b57e599b98961cba6 192.168.6.86:8005@18005 slave 115a626ee6d475076b096181ab10d3ab6988cc04 0 1618320345000 3 connected
115a626ee6d475076b096181ab10d3ab6988cc04 192.168.6.86:8003@18003 master - 0 1618320345261 3 connected 11122-16383
5846d4b7785447b9d7b1c08a0ed74c5e68f2f367 192.168.6.86:8007@18007 master - 0 1618320347267 7 connected
54b6c985bf0f41fa1b92cff7c165c317dd0a30c7 192.168.6.86:8004@18004 slave baf630fe745d9f1db7a58ffb96e180fab1047c79 0 1618320343256 2 connected
28ad6b59866832b13dbd58dd944e641862702e23 192.168.6.86:8001@18001 myself,master - 0 1618320343000 8 connected 0-5661 10923-11121
Having confirmed that 8007 no longer manages any slots, we remove it from the cluster:
[root@master redis-6.2.1]# src/redis-cli -a 123456 --cluster del-node 192.168.6.86:8007 5846d4b7785447b9d7b1c08a0ed74c5e68f2f367
>>> Removing node 5846d4b7785447b9d7b1c08a0ed74c5e68f2f367 from cluster 192.168.6.86:8007
>>> Sending CLUSTER FORGET messages to the cluster...
>>> Sending CLUSTER RESET SOFT to the deleted node.
Querying the cluster information again shows that node 8007 is gone:
127.0.0.1:8001> CLUSTER NODES
aa6ce37e876660161403a801adb8fc7a79a9d876 192.168.6.86:8006@18006 slave 28ad6b59866832b13dbd58dd944e641862702e23 0 1618360351136 8 connected
baf630fe745d9f1db7a58ffb96e180fab1047c79 192.168.6.86:8002@18002 master - 0 1618360350000 2 connected 5662-10922
9c6f93c3b5329e60032b970b57e599b98961cba6 192.168.6.86:8005@18005 slave 115a626ee6d475076b096181ab10d3ab6988cc04 0 1618360350132 3 connected
115a626ee6d475076b096181ab10d3ab6988cc04 192.168.6.86:8003@18003 master - 0 1618360348127 3 connected 11122-16383
54b6c985bf0f41fa1b92cff7c165c317dd0a30c7 192.168.6.86:8004@18004 slave baf630fe745d9f1db7a58ffb96e180fab1047c79 0 1618360351000 2 connected
28ad6b59866832b13dbd58dd944e641862702e23 192.168.6.86:8001@18001 myself,master - 0 1618360350000 8 connected 0-5661 10923-11121
This concludes the detailed walkthrough of Redis cluster schemes; for more material on Redis clusters, see the other related articles on 脚本之家.