
What are the different methods used for communication with RHEL High Availability?

There are 4 types of IP communication used by Red Hat Enterprise Linux (RHEL) High Availability.

  • Broadcast: A packet is sent to the subnet's broadcast address and all nodes in the subnet will see that message. Each of those nodes then has to decide whether to take any notice of the packet or not. Broadcast packets are not routed; they are simply transmitted all over the local network.
  • Multicast: A packet is sent to a nominated multicast address. Interested cluster nodes subscribe to multicast addresses and so those that do not subscribe to the cluster multicast do not see the packets. Multicast packets can be routed (though many switches do this badly by default).
  • Unicast: A packet is sent to a specific IP address, only that particular cluster node will see it. These messages can be routed.
  • Stream: The message types above are UDP packets; streams are TCP connections and are always unicast. Pairs of nodes establish a connection between themselves and exchange a stream of bytes in either or both directions.

NOTE: The iba transport for corosync is not supported; see "Is the iba transport supported in a RHEL 6 or 7 High Availability cluster?".

Red Hat Cluster Suite 4+

The service cman in RHEL 4 used broadcast packets by default, mainly because broadcast was the easiest to code and to set up, and because the protocol was very simple and did not generate much traffic. Due to the way the protocol was implemented it was not even capable of generating much traffic, and with so little traffic broadcast was fine for this sort of application. RHEL 4 does have a multicast option that can be enabled in /etc/cluster/cluster.conf.

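As an illustration of that multicast option, a RHEL 4 style cluster.conf fragment would look roughly like the following. The cluster name, multicast address, node name, and interface are placeholders, only one clusternode is shown, and the exact elements and attributes should be verified against the cman documentation for the release in use:

    <cluster name="example" config_version="2">
        <cman>
            <!-- placeholder cluster multicast address -->
            <multicast addr="239.192.0.10"/>
        </cman>
        <clusternodes>
            <clusternode name="node1" votes="1">
                <!-- bind the same multicast address to a specific interface on this node -->
                <multicast addr="239.192.0.10" interface="eth0"/>
            </clusternode>
        </clusternodes>
    </cluster>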
Some packets in RHEL 4 were UDP unicast packets. When cman knew exactly which cluster node to send data to (e.g. ACKs), they were sent directly to that cluster node using a unicast packet.

Red Hat Enterprise Linux Server 5 (with the High Availability Add-On) using openais

Clustering in RHEL 5 was rewritten from scratch. Red Hat integrated openais as the major part of the cluster stack. This automatically forced the product to use multicast, as that was the only type of communication openais could use at the time. openais also has a much more complex protocol than RHEL 4 cman, so isolating its traffic from the other nodes in a subnet is more important. Without proper isolation using IGMP (snooping), cluster traffic could be sent to all nodes in the subnet, even those not part of the cluster, causing traffic overload.

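For orientation, the multicast address and port that need this isolation live in the totem stanza of openais.conf or corosync.conf (on a cman-managed cluster the equivalent values are normally generated from cluster.conf). A minimal sketch with placeholder addresses:

    totem {
        version: 2
        interface {
            ringnumber: 0
            # network of the cluster interconnect (placeholder)
            bindnetaddr: 192.168.1.0
            # cluster multicast group and port (placeholders)
            mcastaddr: 239.255.1.1
            mcastport: 5405
        }
    }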
As in RHEL 4, not all openais packets are multicast. Quite a lot are actually unicast, because openais is a ring protocol and packets are sent round the ring. The join part of the protocol does use multicast though, so failure of multicast routing can be seen quite quickly in an openais cluster as cluster nodes fail to join.

The broadcast transport was added to openais and corosync to help alleviate multicast configuration and functional errors with switches, that is, to enable deployments where multicast was not functional. Broadcast was originally not supported because it transmits traffic to all nodes in the network, resulting in overload, and because broadcast is incompatible with IPv6.

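As a sketch of how broadcast is typically selected (the option and attribute names below are from memory and should be checked against the documentation for the release in use):

    totem {
        interface {
            ringnumber: 0
            # placeholder interconnect network
            bindnetaddr: 192.168.1.0
            # use the subnet broadcast address instead of a multicast group
            broadcast: yes
        }
    }

    <!-- cluster.conf equivalent, on the cman element -->
    <cman broadcast="yes"/>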
Red Hat Enterprise Linux Server 6 (with the High Availability Add-On) using corosync

RHEL 6 uses multicast as its default transport.

In RHEL 6.2 the UDPU option was added to corosync as a fully supported transport, alongside the existing broadcast and multicast transports. UDPU uses only unicast UDP packets and so helps to resolve the multicast and broadcast issues customers were facing.

All unicast UDP packets are sent directly to the targeted cluster nodes. The packets that would have been broadcast or multicast to all cluster nodes now have to be sent N-1 times (where N is the number of cluster nodes in the cluster), so there is a traffic increase compared to the other transports, but this is often a decent trade-off, particularly for smaller clusters. One restriction on UDPU clusters is that all cluster nodes need to be specified in cluster.conf (or, if using only corosync and not cman, in corosync.conf) so that the cluster node addresses are known in advance. This means there is an administrative as well as a traffic overhead for <cman transport="udpu"/> in /etc/cluster/cluster.conf.

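A minimal sketch of what this looks like on RHEL 6, with placeholder node names and addresses (fencing and other required sections omitted):

    <!-- /etc/cluster/cluster.conf: cman-based cluster using unicast UDP -->
    <cluster name="example" config_version="1">
        <cman transport="udpu"/>
        <clusternodes>
            <clusternode name="node1.example.com" nodeid="1"/>
            <clusternode name="node2.example.com" nodeid="2"/>
            <clusternode name="node3.example.com" nodeid="3"/>
        </clusternodes>
    </cluster>

For a corosync-only cluster the equivalent is a member list inside the totem interface directive, again with placeholder addresses:

    totem {
        transport: udpu
        interface {
            ringnumber: 0
            bindnetaddr: 192.168.1.0
            member {
                memberaddr: 192.168.1.11
            }
            member {
                memberaddr: 192.168.1.12
            }
            member {
                memberaddr: 192.168.1.13
            }
        }
    }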
Red Hat Enterprise Linux Server 7 (with the High Availability Add-On) using corosync

RHEL 7 uses UDPU as its default transport.
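On RHEL 7 (corosync 2.x) the node addresses live in a nodelist section of corosync.conf, which is normally generated by pcs. A minimal sketch with placeholder names:

    totem {
        version: 2
        cluster_name: example
        transport: udpu
    }
    nodelist {
        node {
            ring0_addr: node1.example.com
            nodeid: 1
        }
        node {
            ring0_addr: node2.example.com
            nodeid: 2
        }
    }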

What factors can affect cluster communication?

There are various other factors that can affect the different traffic protocols, such as which other subsystems are also in use. Below is a description of various services and protocols that can be directly or indirectly involved in cluster communication performance:

  • DLM has always used TCP streams for communication, so it is not directly involved in this. It is indirectly involved when plocks are used. The plocks used by the GFS/GFS2 file-systems use openais checkpoints (since RHEL 5). These checkpoints exchange information over the openais or corosync TOTEM protocol and so are affected by the multicast, broadcast, or UDPU issues. For a cluster with high plock usage on a GFS/GFS2 file-system there can be communication issues because of the extra traffic generated with broadcast or UDPU, which makes these protocols unsuitable for such configurations. For small-scale clusters, or clusters not using plocks on a GFS or GFS2 file-system, broadcast or UDPU can be used. In RHEL 4 plocks were implemented on top of the DLM. plocks are range locks with odd semantics, and the implementation over the DLM was not an efficient use of the DLM API. This meant that plock traffic was sent over the TCP streams of the DLM and was not subject to the issues discussed above.
  • Another factor that can affect communication is when network switches put a cap on the number of multicast packets. This can cause communication issues (and possible fencing of a cluster node) when that cap is exceeded.
  • lvm is the other notable user of the cluster protocol. The service clvmd is actually a very minimal user: it just sends locking notification messages when metadata is altered, so there is no serious impact on performance and clvmd should work quite happily in a multicast, broadcast, or UDPU installation.
  • The service cmirror is a different thing altogether. cmirror uses the openais/corosync CPG protocol to share information, and on a busy system with many mirrors this can be quite a lot of traffic, especially when cluster nodes reboot or resyncs are necessary. For this reason multicast is recommended for cmirror installations.