Managing a Redis Cluster


1. Listing the cluster nodes

# redis-cli -c -p 6001 cluster nodes
8e5721f66249998581e331292122ac33fcde2a79 192.168.6.251:6003@16003 master - 0 1614232987848 3 connected 10923-16383
6b4e0171623b0f35239d9c8021207e28cab0125c 192.168.6.251:6005@16005 slave 8e5721f66249998581e331292122ac33fcde2a79 0 1614232988850 3 connected
77ceff91b1e998c10ab7c596339c93989db48c2e 192.168.6.251:7001@17001 master - 0 1614232986000 0 connected
43f06aad833579da7dc0256aebe1623ac90aca21 192.168.6.251:6006@16006 master - 0 1614232987000 7 connected 0-5460
40957fddba99cd36e5827f1d7dedde90f246ac73 192.168.6.251:6002@16002 master - 0 1614232986000 2 connected 5461-10922
3102a25b63e4b2c431534112ffdb67c739563632 192.168.6.251:6001@16001 myself,slave 43f06aad833579da7dc0256aebe1623ac90aca21 0 1614232987000 7 connected
a35ebfec4a0b9eac090d62adc5f0e42766136f8b 192.168.6.251:6004@16004 slave 40957fddba99cd36e5827f1d7dedde90f246ac73 0 1614232989852 2 connected

Output format:

00fee91991f8bb6a787bb8552b5a207a4d1437f8 192.168.163.203:6380 myself,slave ec6f04c5cb3e3392b7787445440b7494ddff8923 0 0 6 connected 
<id> <ip:port> <flags> <master> <ping-sent> <pong-recv> <config-epoch> <link-state> <slot> <slot> ... <slot>
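| Field | Example value | Description |
|---|---|---|
| id | 00fee91991f8bb6a787bb8552b5a207a4d1437f8 | The node ID, a 40-character random string. It is generated when the node first starts and never changes (unless CLUSTER RESET HARD is used). |
| ip:port | 192.168.163.203:6380 | The IP and port of the cluster node. Redis 4.0 and later append the cluster bus port as @cport, as in 192.168.6.251:6003@16003 in the output above. |
| flags | myself,slave | Comma-separated flags. Possible values: myself, master, slave, fail?, fail, handshake, noaddr, noflags. |
| master | ec6f04c5cb3e3392b7787445440b7494ddff8923 | If the node is a slave, the ID of its master; otherwise - (as in the "master -" entries of the output above). |
| ping-sent | 0 | Unix time, in milliseconds, at which the currently pending ping was sent; 0 means no ping is pending. |
| pong-recv | 0 | Unix time, in milliseconds, at which the last pong was received. |
| config-epoch | 6 | The node's configuration epoch (or that of its current master if the node is a slave). Every failover creates a new, unique, increasing epoch. If several nodes claim the same hash slot, the node with the higher epoch wins. Put plainly, the epoch is a version number attached to configuration events. |
| link-state | connected | State of the node-to-node cluster bus link used to talk to the other nodes in the cluster. Either connected or disconnected. |
| slot | (none, the example node is a slave) | A hash slot number or a hash slot range. Slots start at the 9th field, and there can be at most 16384 of them. |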
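The field layout above can be turned into a small parser. The following is a minimal sketch, not part of any Redis client library: the `ClusterNode` class and `parse_cluster_node` function are hypothetical names, and the code assumes the address field has the form ip:port or ip:port@cport (newer servers may append a hostname, which this sketch ignores).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ClusterNode:
    node_id: str
    addr: str                 # "ip:port", with any "@cport" suffix removed
    flags: List[str]
    master_id: Optional[str]  # None when the field is "-"
    ping_sent: int
    pong_recv: int
    config_epoch: int
    link_state: str
    slots: List[str]          # raw slot fields such as "0-5460"


def parse_cluster_node(line: str) -> ClusterNode:
    """Split one line of CLUSTER NODES output into its fields."""
    parts = line.split()
    if len(parts) < 8:
        raise ValueError(f"expected at least 8 fields, got {len(parts)}")
    return ClusterNode(
        node_id=parts[0],
        addr=parts[1].split("@", 1)[0],
        flags=parts[2].split(","),
        master_id=None if parts[3] == "-" else parts[3],
        ping_sent=int(parts[4]),
        pong_recv=int(parts[5]),
        config_epoch=int(parts[6]),
        link_state=parts[7],
        slots=parts[8:],      # everything from the 9th field on
    )


# A line taken from the output in section 1.
sample = (
    "43f06aad833579da7dc0256aebe1623ac90aca21 192.168.6.251:6006@16006 "
    "master - 0 1614232987000 7 connected 0-5460"
)
node = parse_cluster_node(sample)
print(node.addr, node.flags, node.slots)
```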

 

2. Adding a new node

2.1. Adding a master

Add 192.168.6.251:7001 to the cluster:

# redis-cli --cluster add-node 192.168.6.251:7001 192.168.6.251:6001 
>>> Adding node 192.168.6.251:7001 to cluster 192.168.6.251:6001
>>> Performing Cluster Check (using node 192.168.6.251:6001)
S: 3102a25b63e4b2c431534112ffdb67c739563632 192.168.6.251:6001
   slots: (0 slots) slave
   replicates 43f06aad833579da7dc0256aebe1623ac90aca21
M: 8e5721f66249998581e331292122ac33fcde2a79 192.168.6.251:6003
   slots:[10923-16383] (5461 slots) master
   1 additional replica(s)
S: 6b4e0171623b0f35239d9c8021207e28cab0125c 192.168.6.251:6005
   slots: (0 slots) slave
   replicates 8e5721f66249998581e331292122ac33fcde2a79
M: 43f06aad833579da7dc0256aebe1623ac90aca21 192.168.6.251:6006
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
M: 40957fddba99cd36e5827f1d7dedde90f246ac73 192.168.6.251:6002
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
S: a35ebfec4a0b9eac090d62adc5f0e42766136f8b 192.168.6.251:6004
   slots: (0 slots) slave
   replicates 40957fddba99cd36e5827f1d7dedde90f246ac73
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 192.168.6.251:7001 to make it join the cluster.
[OK] New node added correctly.

The add-node command takes the address of the new node as its first argument and the address of any existing cluster node as its second.

2.2. Adding a slave

# redis-cli --cluster add-node 192.168.6.251:7001 192.168.6.251:6001 --cluster-slave
>>> Adding node 192.168.6.251:7001 to cluster 192.168.6.251:6001
>>> Performing Cluster Check (using node 192.168.6.251:6001)
S: 3102a25b63e4b2c431534112ffdb67c739563632 192.168.6.251:6001
   slots: (0 slots) slave
   replicates 43f06aad833579da7dc0256aebe1623ac90aca21
M: 8e5721f66249998581e331292122ac33fcde2a79 192.168.6.251:6003
   slots:[10923-16383] (5461 slots) master
   1 additional replica(s)
S: 6b4e0171623b0f35239d9c8021207e28cab0125c 192.168.6.251:6005
   slots: (0 slots) slave
   replicates 8e5721f66249998581e331292122ac33fcde2a79
M: 43f06aad833579da7dc0256aebe1623ac90aca21 192.168.6.251:6006
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
M: 40957fddba99cd36e5827f1d7dedde90f246ac73 192.168.6.251:6002
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
S: a35ebfec4a0b9eac090d62adc5f0e42766136f8b 192.168.6.251:6004
   slots: (0 slots) slave
   replicates 40957fddba99cd36e5827f1d7dedde90f246ac73
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
Automatically selected master 192.168.6.251:6003
>>> Send CLUSTER MEET to node 192.168.6.251:7001 to make it join the cluster.
Waiting for the cluster to join

>>> Configure node as replica of 192.168.6.251:6003.
[OK] New node added correctly.

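The line "Automatically selected master 192.168.6.251:6003" in the output shows that 7001 automatically became a slave of 6003. When no master is specified, redis-cli attaches the new node to a random master among those with the fewest replicas.

2.3. Adding a replica to a specific master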

# redis-cli --cluster add-node 192.168.6.251:7001 192.168.6.251:6001 --cluster-slave --cluster-master-id 40957fddba99cd36e5827f1d7dedde90f246ac73
>>> Adding node 192.168.6.251:7001 to cluster 192.168.6.251:6001
>>> Performing Cluster Check (using node 192.168.6.251:6001)
S: 3102a25b63e4b2c431534112ffdb67c739563632 192.168.6.251:6001
   slots: (0 slots) slave
   replicates 43f06aad833579da7dc0256aebe1623ac90aca21
M: 8e5721f66249998581e331292122ac33fcde2a79 192.168.6.251:6003
   slots:[10923-16383] (5461 slots) master
   1 additional replica(s)
S: 6b4e0171623b0f35239d9c8021207e28cab0125c 192.168.6.251:6005
   slots: (0 slots) slave
   replicates 8e5721f66249998581e331292122ac33fcde2a79
M: 43f06aad833579da7dc0256aebe1623ac90aca21 192.168.6.251:6006
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
M: 40957fddba99cd36e5827f1d7dedde90f246ac73 192.168.6.251:6002
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
S: a35ebfec4a0b9eac090d62adc5f0e42766136f8b 192.168.6.251:6004
   slots: (0 slots) slave
   replicates 40957fddba99cd36e5827f1d7dedde90f246ac73
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 192.168.6.251:7001 to make it join the cluster.
Waiting for the cluster to join

>>> Configure node as replica of 192.168.6.251:6002.
[OK] New node added correctly.
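The three add-node variants in sections 2.1 to 2.3 differ only in their trailing options. As a rough sketch for scripting them, the hypothetical helper `add_node_argv` below builds the argument list for each case; it only uses the options shown in the transcripts above and does not run anything itself.

```python
from typing import List, Optional


def add_node_argv(
    new_node: str,
    existing_node: str,
    as_slave: bool = False,
    master_id: Optional[str] = None,
) -> List[str]:
    """Build the redis-cli argument list for one of the add-node variants."""
    if master_id is not None and not as_slave:
        raise ValueError("master_id is only meaningful together with as_slave=True")
    argv = ["redis-cli", "--cluster", "add-node", new_node, existing_node]
    if as_slave:
        argv.append("--cluster-slave")
    if master_id is not None:
        argv += ["--cluster-master-id", master_id]
    return argv


# 2.1: add as master; 2.2: add as slave of an automatically chosen master;
# 2.3: add as slave of a specific master.
as_master = add_node_argv("192.168.6.251:7001", "192.168.6.251:6001")
as_any_slave = add_node_argv("192.168.6.251:7001", "192.168.6.251:6001", as_slave=True)
as_slave_of = add_node_argv(
    "192.168.6.251:7001",
    "192.168.6.251:6001",
    as_slave=True,
    master_id="40957fddba99cd36e5827f1d7dedde90f246ac73",
)
print(" ".join(as_slave_of))
```

The resulting list can be handed to `subprocess.run(argv, check=True)` on a host where redis-cli is installed.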

3. Removing a node

redis-cli --cluster del-node 127.0.0.1:7000 `<node-id>`

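The first argument is just any node in the cluster; the second is the ID of the node you want to remove. A master can be removed the same way, but it must be empty first. If the master is not empty, reshard its data to the other masters beforehand.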

# redis-cli --cluster del-node 192.168.6.251:6001 77ceff91b1e998c10ab7c596339c93989db48c2e
>>> Removing node 77ceff91b1e998c10ab7c596339c93989db48c2e from cluster 192.168.6.251:6001
>>> Sending CLUSTER FORGET messages to the cluster...
>>> Sending CLUSTER RESET SOFT to the deleted node.

Listing the nodes again shows that 7001 is gone:

# redis-cli -c -p 6001 cluster nodes
8e5721f66249998581e331292122ac33fcde2a79 192.168.6.251:6003@16003 master - 0 1614233026917 3 connected 10923-16383
6b4e0171623b0f35239d9c8021207e28cab0125c 192.168.6.251:6005@16005 slave 8e5721f66249998581e331292122ac33fcde2a79 0 1614233025915 3 connected
43f06aad833579da7dc0256aebe1623ac90aca21 192.168.6.251:6006@16006 master - 0 1614233027000 7 connected 0-5460
40957fddba99cd36e5827f1d7dedde90f246ac73 192.168.6.251:6002@16002 master - 0 1614233027919 2 connected 5461-10922
3102a25b63e4b2c431534112ffdb67c739563632 192.168.6.251:6001@16001 myself,slave 43f06aad833579da7dc0256aebe1623ac90aca21 0 1614233025000 7 connected
a35ebfec4a0b9eac090d62adc5f0e42766136f8b 192.168.6.251:6004@16004 slave 40957fddba99cd36e5827f1d7dedde90f246ac73 0 1614233025000 2 connected

Trying to remove a master that is not empty fails with an error:

# redis-cli --cluster del-node 192.168.6.251:6001 4622143c1d92d253843e5beb3d53e4e8c2746f5a
>>> Removing node 4622143c1d92d253843e5beb3d53e4e8c2746f5a from cluster 192.168.6.251:6001
[ERR] Node 192.168.6.251:7002 is not empty! Reshard data away and try again.
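Whether a node is empty can be checked up front from the cluster nodes output, since slots appear from the 9th field onward. The sketch below is an illustration only; `slots_owned_by` is a hypothetical helper, and it treats any extra field (including the bracketed entries that appear while a slot is migrating) as "still owns something".

```python
from typing import List


def slots_owned_by(cluster_nodes_output: str, node_id: str) -> List[str]:
    """Return the raw slot fields listed for node_id in CLUSTER NODES output."""
    for line in cluster_nodes_output.splitlines():
        fields = line.split()
        if fields and fields[0] == node_id:
            return fields[8:]  # slot fields start at the 9th column
    raise KeyError(f"node {node_id} not found")


# Two lines taken from the output in section 1.
output = """\
77ceff91b1e998c10ab7c596339c93989db48c2e 192.168.6.251:7001@17001 master - 0 1614232986000 0 connected
43f06aad833579da7dc0256aebe1623ac90aca21 192.168.6.251:6006@16006 master - 0 1614232987000 7 connected 0-5460
"""

empty_master = slots_owned_by(output, "77ceff91b1e998c10ab7c596339c93989db48c2e")
busy_master = slots_owned_by(output, "43f06aad833579da7dc0256aebe1623ac90aca21")
print("7001 safe to delete:", not empty_master)
print("6006 safe to delete:", not busy_master)
```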

If the master is not empty, first migrate its slots to another node:

redis-cli --cluster reshard 192.168.6.251:6001

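Enter the number of slots to move: 2738

Then enter the ID of the receiving node: 8e5721f66249998581e331292122ac33fcde2a79

Then enter the ID of the source node: 4622143c1d92d253843e5beb3d53e4e8c2746f5a (the instance on port 7002)

At the prompt for the next source node, enter done (only one node is being drained, so done ends the list)

Finally, confirm at the prompt: Do you want to proceed with the proposed reshard plan (yes/no)? yes

Once the migration has finished, the del-node command succeeds.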
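The same reshard can be run without the interactive prompts, because redis-cli accepts the answers as options (--cluster-from, --cluster-to, --cluster-slots, --cluster-yes). The helper name `reshard_argv` below is hypothetical; the sketch only assembles the command line and leaves running it to the caller.

```python
from typing import List


def reshard_argv(
    entry_node: str,
    source_id: str,
    target_id: str,
    slots: int,
    assume_yes: bool = True,
) -> List[str]:
    """Build a non-interactive `redis-cli --cluster reshard` command line."""
    if not 1 <= slots <= 16384:
        raise ValueError("slots must be between 1 and 16384")
    argv = [
        "redis-cli", "--cluster", "reshard", entry_node,
        "--cluster-from", source_id,
        "--cluster-to", target_id,
        "--cluster-slots", str(slots),
    ]
    if assume_yes:
        argv.append("--cluster-yes")  # skip the final yes/no confirmation
    return argv


# The values entered interactively in the walkthrough above.
argv = reshard_argv(
    "192.168.6.251:6001",
    source_id="4622143c1d92d253843e5beb3d53e4e8c2746f5a",
    target_id="8e5721f66249998581e331292122ac33fcde2a79",
    slots=2738,
)
print(" ".join(argv))
```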

4. Scaling out and in (reshard)

After adding a new master and slave pair, you need to assign slots to the new master.

redis-cli --cluster reshard 192.168.6.251:7001

It asks how many slots you want to move:

How many slots do you want to move (from 1 to 16384)?

After you enter the number of slots, it asks which node should receive them:

What is the receiving node ID?

After you enter the node ID, it asks which nodes the slots should be taken from:

Please enter all the source node IDs.
  Type 'all' to use all the nodes as source nodes for the hash slots.
  Type 'done' once you entered all the source nodes IDs.
Source node #1: 

To take slots from all nodes, enter all.

A summary of the planned migration is then shown:

Ready to move 10 slots.
  Source nodes:
    M: 8e5721f66249998581e331292122ac33fcde2a79 192.168.6.251:6003
       slots:[12288-16383] (4096 slots) master
       1 additional replica(s)
    M: 43f06aad833579da7dc0256aebe1623ac90aca21 192.168.6.251:6006
       slots:[1365-5460] (4096 slots) master
       1 additional replica(s)
    M: 40957fddba99cd36e5827f1d7dedde90f246ac73 192.168.6.251:6002
       slots:[6827-10922] (4096 slots) master
       1 additional replica(s)
  Destination node:
    M: 77ceff91b1e998c10ab7c596339c93989db48c2e 192.168.6.251:7001
       slots:[0-1364],[5461-6826],[10923-12287] (4096 slots) master
       1 additional replica(s)
  Resharding plan:
    Moving slot 12288 from 8e5721f66249998581e331292122ac33fcde2a79
    Moving slot 12289 from 8e5721f66249998581e331292122ac33fcde2a79
    Moving slot 12290 from 8e5721f66249998581e331292122ac33fcde2a79
    Moving slot 12291 from 8e5721f66249998581e331292122ac33fcde2a79
    Moving slot 1365 from 43f06aad833579da7dc0256aebe1623ac90aca21
    Moving slot 1366 from 43f06aad833579da7dc0256aebe1623ac90aca21
    Moving slot 1367 from 43f06aad833579da7dc0256aebe1623ac90aca21
    Moving slot 6827 from 40957fddba99cd36e5827f1d7dedde90f246ac73
    Moving slot 6828 from 40957fddba99cd36e5827f1d7dedde90f246ac73
    Moving slot 6829 from 40957fddba99cd36e5827f1d7dedde90f246ac73
Do you want to proceed with the proposed reshard plan (yes/no)? 

Finally, enter yes to start the migration:

Do you want to proceed with the proposed reshard plan (yes/no)? yes
Moving slot 12288 from 192.168.6.251:6003 to 192.168.6.251:7001: 
Moving slot 12289 from 192.168.6.251:6003 to 192.168.6.251:7001: 
Moving slot 12290 from 192.168.6.251:6003 to 192.168.6.251:7001: 
Moving slot 12291 from 192.168.6.251:6003 to 192.168.6.251:7001: 
Moving slot 1365 from 192.168.6.251:6006 to 192.168.6.251:7001: 
Moving slot 1366 from 192.168.6.251:6006 to 192.168.6.251:7001: 
Moving slot 1367 from 192.168.6.251:6006 to 192.168.6.251:7001: 
Moving slot 6827 from 192.168.6.251:6002 to 192.168.6.251:7001: 
Moving slot 6828 from 192.168.6.251:6002 to 192.168.6.251:7001: 
Moving slot 6829 from 192.168.6.251:6002 to 192.168.6.251:7001:
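In the plan above, the 10 requested slots are taken as 4 + 3 + 3 from the three source masters. One way to arrive at that split is a proportional share per source, rounding the first source up and the rest down. The sketch below illustrates that idea; it is an assumption about the approach rather than a copy of redis-cli's code, `split_slots` is a hypothetical name, and the order among sources with equal slot counts is simply the order in which they are passed in.

```python
import math
from typing import Dict, List, Tuple


def split_slots(num_slots: int, sources: List[Tuple[str, int]]) -> Dict[str, int]:
    """Split num_slots across sources in proportion to the slots each owns."""
    total = sum(count for _, count in sources)
    # Largest source first; sorted() is stable, so ties keep their input order.
    ordered = sorted(sources, key=lambda s: s[1], reverse=True)
    plan: Dict[str, int] = {}
    for i, (name, count) in enumerate(ordered):
        share = num_slots / total * count
        # Round the first source up and the others down.
        plan[name] = math.ceil(share) if i == 0 else math.floor(share)
    return plan


# The three source masters from the summary above, 4096 slots each.
plan = split_slots(
    10,
    [
        ("192.168.6.251:6003", 4096),
        ("192.168.6.251:6006", 4096),
        ("192.168.6.251:6002", 4096),
    ],
)
print(plan)
```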

After resharding, verify the cluster state:

# redis-cli --cluster check 192.168.6.251:7001
192.168.6.251:7001 (77ceff91...) -> 28 keys | 4106 slots | 1 slaves.
192.168.6.251:6003 (8e5721f6...) -> 25 keys | 4092 slots | 1 slaves.
192.168.6.251:6006 (43f06aad...) -> 26 keys | 4093 slots | 1 slaves.
192.168.6.251:6002 (40957fdd...) -> 23 keys | 4093 slots | 1 slaves.
[OK] 102 keys in 4 masters.
0.01 keys per slot on average.
>>> Performing Cluster Check (using node 192.168.6.251:7001)
M: 77ceff91b1e998c10ab7c596339c93989db48c2e 192.168.6.251:7001
   slots:[0-1367],[5461-6829],[10923-12291] (4106 slots) master
   1 additional replica(s)
S: 4622143c1d92d253843e5beb3d53e4e8c2746f5a 192.168.6.251:7002
   slots: (0 slots) slave
   replicates 77ceff91b1e998c10ab7c596339c93989db48c2e
M: 8e5721f66249998581e331292122ac33fcde2a79 192.168.6.251:6003
   slots:[12292-16383] (4092 slots) master
   1 additional replica(s)
M: 43f06aad833579da7dc0256aebe1623ac90aca21 192.168.6.251:6006
   slots:[1368-5460] (4093 slots) master
   1 additional replica(s)
S: a35ebfec4a0b9eac090d62adc5f0e42766136f8b 192.168.6.251:6004
   slots: (0 slots) slave
   replicates 40957fddba99cd36e5827f1d7dedde90f246ac73
S: 6b4e0171623b0f35239d9c8021207e28cab0125c 192.168.6.251:6005
   slots: (0 slots) slave
   replicates 8e5721f66249998581e331292122ac33fcde2a79
S: 3102a25b63e4b2c431534112ffdb67c739563632 192.168.6.251:6001
   slots: (0 slots) slave
   replicates 43f06aad833579da7dc0256aebe1623ac90aca21
M: 40957fddba99cd36e5827f1d7dedde90f246ac73 192.168.6.251:6002
   slots:[6830-10922] (4093 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
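The slot counts in the check output can be cross-checked by hand: each range [a-b] contains b - a + 1 slots, and the masters together must add up to 16384. A small sketch of that arithmetic, using a hypothetical helper `count_slots` and the ranges printed above:

```python
import re


def count_slots(slots_field: str) -> int:
    """Count the slots in a field such as "[0-1367],[5461-6829]"."""
    total = 0
    # Each bracket holds either a single slot "[n]" or a range "[a-b]".
    for start, end in re.findall(r"\[(\d+)(?:-(\d+))?\]", slots_field):
        last = int(end) if end else int(start)
        total += last - int(start) + 1
    return total


# Slot ranges per master, copied from the check output above.
masters = {
    "192.168.6.251:7001": "[0-1367],[5461-6829],[10923-12291]",
    "192.168.6.251:6003": "[12292-16383]",
    "192.168.6.251:6006": "[1368-5460]",
    "192.168.6.251:6002": "[6830-10922]",
}
counts = {addr: count_slots(field) for addr, field in masters.items()}
print(counts, sum(counts.values()))
```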

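5. Balancing slot counts across nodes (rebalance)

  • Check the cluster state; the number of slots assigned to each master differs widely

  • Run rebalance to even them out

  • After rebalancing, every master holds the same number of slots

  • If a new master was added and no slots were assigned to it manually, rebalance can distribute slots to it automatically: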
[root@k8s-node1 redis-6.2.0]# redis-cli --cluster rebalance --cluster-use-empty-masters 192.168.6.251:6001
>>> Performing Cluster Check (using node 192.168.6.251:6001)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Rebalancing across 4 nodes. Total weight = 4.00
Moving 1366 slots from 192.168.6.251:6006 to 192.168.6.251:7001
################################################################
Moving 1365 slots from 192.168.6.251:6002 to 192.168.6.251:7001
################################################################
Moving 1365 slots from 192.168.6.251:6003 to 192.168.6.251:7001
################################################################

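--cluster-use-empty-masters: controls whether rebalance includes masters that own no slots. By default, a master with no slots assigned does not take part in the rebalance; passing --cluster-use-empty-masters lets such masters participate.

After rebalancing, the newly added 7001 has also been assigned 4096 slots.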
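The 4096 figure follows from dividing 16384 slots evenly across 4 masters of equal weight, and the three moves in the output (1366 + 1365 + 1365) add up to exactly that. The sketch below shows the arithmetic only. The starting slot counts are hypothetical, chosen so that the result matches the moves printed above, and `rebalance_deltas` is an illustrative name rather than anything redis-cli exposes.

```python
from typing import Dict

TOTAL_SLOTS = 16384


def rebalance_deltas(slot_counts: Dict[str, int]) -> Dict[str, int]:
    """Slots each master should give away (positive) or receive (negative).

    Simplification: all masters have equal weight and the target is the
    integer share TOTAL_SLOTS // number_of_masters.
    """
    target = TOTAL_SLOTS // len(slot_counts)
    return {node: count - target for node, count in slot_counts.items()}


# Hypothetical starting point: three loaded masters and one empty master.
before = {
    "192.168.6.251:6006": 5462,
    "192.168.6.251:6002": 5461,
    "192.168.6.251:6003": 5461,
    "192.168.6.251:7001": 0,
}
deltas = rebalance_deltas(before)
print(deltas)
```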

6. Reading the output of cluster slots

Once connected to the cluster, run cluster slots to display how the slots are distributed:

192.168.6.251:6003> cluster slots
1) 1) (integer) 0
   2) (integer) 1367
   3) 1) "192.168.6.251"
      2) (integer) 6002
      3) "40957fddba99cd36e5827f1d7dedde90f246ac73"
   4) 1) "192.168.6.251"
      2) (integer) 6004
      3) "a35ebfec4a0b9eac090d62adc5f0e42766136f8b"
2) 1) (integer) 6830
   2) (integer) 10922
   3) 1) "192.168.6.251"
      2) (integer) 6002
      3) "40957fddba99cd36e5827f1d7dedde90f246ac73"
   4) 1) "192.168.6.251"
      2) (integer) 6004
      3) "a35ebfec4a0b9eac090d62adc5f0e42766136f8b"
3) 1) (integer) 1368
   2) (integer) 5460
   3) 1) "192.168.6.251"
      2) (integer) 6006
      3) "43f06aad833579da7dc0256aebe1623ac90aca21"
   4) 1) "192.168.6.251"
      2) (integer) 6001
      3) "3102a25b63e4b2c431534112ffdb67c739563632"
4) 1) (integer) 5461
   2) (integer) 6829
   3) 1) "192.168.6.251"
      2) (integer) 6003
      3) "8e5721f66249998581e331292122ac33fcde2a79"
   4) 1) "192.168.6.251"
      2) (integer) 6005
      3) "6b4e0171623b0f35239d9c8021207e28cab0125c"
5) 1) (integer) 10923
   2) (integer) 16383
   3) 1) "192.168.6.251"
      2) (integer) 6003
      3) "8e5721f66249998581e331292122ac33fcde2a79"
   4) 1) "192.168.6.251"
      2) (integer) 6005
      3) "6b4e0171623b0f35239d9c8021207e28cab0125c"
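  • The leftmost 1), 2), 3), 4), 5) show that the 16384 slots (0 to 16383) are divided into 5 ranges
  • Taking the first range as an example:
    • "1) (integer) 0" is the first slot of the range
    • "2) (integer) 1367" is the last slot of the range
    • The third element describes the master serving this slot range (IP, port, and node ID):
      3) 1) "192.168.6.251"
         2) (integer) 6002
         3) "40957fddba99cd36e5827f1d7dedde90f246ac73"
    • The fourth element describes the slave for this slot range (IP, port, and node ID):
      4) 1) "192.168.6.251"
         2) (integer) 6004
         3) "a35ebfec4a0b9eac090d62adc5f0e42766136f8b"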
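Client libraries typically hand the cluster slots reply back as nested lists with the same shape as the output above. As a minimal sketch under that assumption, the hypothetical function `master_for_slot` looks up which master serves a given slot; the sample reply holds only the first and third entries of the output above.

```python
from typing import List, Optional, Tuple


def master_for_slot(slots_reply: List[list], slot: int) -> Optional[Tuple[str, int]]:
    """Return (ip, port) of the master serving slot, or None if not covered."""
    for entry in slots_reply:
        start, end, master = entry[0], entry[1], entry[2]
        if start <= slot <= end:
            return master[0], master[1]
    return None


# Entries 1) and 3) of the output above, written as nested lists:
# [start, end, [master ip, port, id], [slave ip, port, id]]
reply = [
    [0, 1367,
     ["192.168.6.251", 6002, "40957fddba99cd36e5827f1d7dedde90f246ac73"],
     ["192.168.6.251", 6004, "a35ebfec4a0b9eac090d62adc5f0e42766136f8b"]],
    [1368, 5460,
     ["192.168.6.251", 6006, "43f06aad833579da7dc0256aebe1623ac90aca21"],
     ["192.168.6.251", 6001, "3102a25b63e4b2c431534112ffdb67c739563632"]],
]

print(master_for_slot(reply, 1000))
print(master_for_slot(reply, 5000))
```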

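7. Manual failover

When a master instance needs a version upgrade, first perform a manual failover to turn it into a slave, then upgrade it.

Redis Cluster performs a manual failover with the CLUSTER FAILOVER command, which must be run on a slave of the master being failed over.

A manual failover is safer than the automatic failover triggered by a master failure: clients are switched to the new master only once it has fully replicated the old master's data, so no data is lost.

  • Check the cluster node information; 6005 is the master and 6003 is its slave

  • Log in to 6003 and run cluster failover (note: cluster failover can only be run on a slave)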
[root@k8s-node1 redis-6.2.0]# redis-cli -c -p 6003
127.0.0.1:6003> cluster failover
  • Check the result of the manual failover: 6003 has become the master and 6005 a slave, so the two have swapped roles
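Because cluster failover is only accepted on a slave, a script can check the role of the node it is connected to before issuing the command. The sketch below is one possible guard, with hypothetical helper names; it inspects the line flagged myself in cluster nodes output and builds the same command used above, without running it.

```python
from typing import List


def connected_node_is_slave(cluster_nodes_output: str) -> bool:
    """True if the line flagged "myself" also carries the "slave" flag."""
    for line in cluster_nodes_output.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            flags = fields[2].split(",")
            if "myself" in flags:
                return "slave" in flags
    return False


def failover_argv(port: int) -> List[str]:
    """Command line that asks the node on the given local port to take over."""
    return ["redis-cli", "-p", str(port), "cluster", "failover"]


# Two lines from the output in section 1, as seen from the node on port 6001.
output = """\
43f06aad833579da7dc0256aebe1623ac90aca21 192.168.6.251:6006@16006 master - 0 1614232987000 7 connected 0-5460
3102a25b63e4b2c431534112ffdb67c739563632 192.168.6.251:6001@16001 myself,slave 43f06aad833579da7dc0256aebe1623ac90aca21 0 1614232987000 7 connected
"""

if connected_node_is_slave(output):
    print(" ".join(failover_argv(6001)))
```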
