
[Original] A Translation of the RabbitMQ Official Documentation -- Clustering Guide

      To make this easier to use at work, I spent some spare weekend time translating the RabbitMQ cluster-configuration documentation. Given my limited ability, omissions and mistakes are hard to avoid; if anything looks wrong, you are welcome to point it out and discuss. This article follows the official English text section by section.

============== divider ================

A RabbitMQ broker is a logical grouping of one or several Erlang nodes, each running the RabbitMQ application and sharing users, virtual hosts, queues, exchanges, etc. Sometimes we refer to the collection of nodes as a cluster.

All data/state required for the operation of a RabbitMQ broker is replicated across all nodes, for reliability and scaling, with full ACID properties. An exception to this are message queues, which by default reside on the node that created them, though they are visible and reachable from all nodes. To replicate queues across nodes in a cluster, see the documentation on high availability (note that you will need a working cluster first).

RabbitMQ clustering does not tolerate network partitions well, so it should not be used over a WAN. The shovel or federation plugins are better solutions for connecting brokers across a WAN.

(Translator's note: a network partition occurs when all network links between two groups of systems in a cluster fail at the same time. When this happens, each side of the partition restarts its applications independently, which can lead to duplicated services or "split brain". Split brain arises when two independent systems configured in the same cluster each assume exclusive access to a given resource, usually a file system or volume. The most serious damage a network partition can cause is to the data on shared disks.)

The composition of a cluster can be altered dynamically. All RabbitMQ brokers start out as running on a single node. These nodes can be joined into clusters, and subsequently turned back into individual brokers again.

RabbitMQ brokers tolerate the failure of individual nodes. Nodes can be started and stopped at will.

A node can be a disk node or a RAM node. (Note: disk and disc are used interchangeably; configuration syntax and status messages normally use disc.) RAM nodes keep their state only in memory (with the exception of queue contents, which can reside on disc if the queue is persistent or too big to fit in memory). Disk nodes keep state in memory and on disk. As RAM nodes don't have to write to disk as much as disk nodes, they can perform better. However, note that since queue data is always stored on disc, the performance improvements affect only resource management (e.g. adding/removing queues, exchanges, or vhosts), not publishing or consuming speed. Because state is replicated across all nodes in the cluster, it is sufficient (but not recommended) to have just one disk node within a cluster to store the state of the cluster safely.

The following is a transcript of setting up and manipulating a RabbitMQ cluster across three machines - rabbit1, rabbit2, rabbit3 - with two of the machines replicating data on RAM and disk, and the other replicating data in RAM only.

We assume that the user is logged into all three machines, that RabbitMQ has been installed on the machines, and that the rabbitmq-server and rabbitmqctl scripts are in the user's PATH.

Erlang nodes use a cookie to determine whether they are allowed to communicate with each other - for two nodes to be able to communicate they must have the same cookie.

The cookie is just a string of alphanumeric characters. It can be as long or short as you like.

Erlang will automatically create a random cookie file when the RabbitMQ server starts up. This will typically be located in /var/lib/rabbitmq/.erlang.cookie on Unix systems and C:\Users\Current User\.erlang.cookie or C:\Documents and Settings\Current User\.erlang.cookie on Windows systems. The easiest way to proceed is to allow one node to create the file, and then copy it to all the other nodes in the cluster.

As an alternative, you can insert the option "-setcookie cookie" in the erl call in the rabbitmq-server and rabbitmqctl scripts.

Clusters are set up by re-configuring existing RabbitMQ nodes into a cluster configuration. Hence the first step is to start RabbitMQ on all nodes in the normal way:


rabbit1$ rabbitmq-server -detached
rabbit2$ rabbitmq-server -detached
rabbit3$ rabbitmq-server -detached

This creates three independent RabbitMQ brokers, one on each node, as confirmed by the cluster_status command:


rabbit1$ rabbitmqctl cluster_status
cluster status of node rabbit@rabbit1 ...
[{nodes,[{disc,[rabbit@rabbit1]}]},{running_nodes,[rabbit@rabbit1]}]
...done.
rabbit2$ rabbitmqctl cluster_status
cluster status of node rabbit@rabbit2 ...
[{nodes,[{disc,[rabbit@rabbit2]}]},{running_nodes,[rabbit@rabbit2]}]
...done.
rabbit3$ rabbitmqctl cluster_status
cluster status of node rabbit@rabbit3 ...
[{nodes,[{disc,[rabbit@rabbit3]}]},{running_nodes,[rabbit@rabbit3]}]
...done.

The node name of a RabbitMQ broker started from the rabbitmq-server shell script is rabbit@shorthostname, where the short node name is lower-case (as in rabbit@rabbit1, above). If you use the rabbitmq-server.bat batch file on Windows, the short node name is upper-case (as in RABBIT@RABBIT1). When you type node names, case matters, and these strings must match exactly.
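To see what the default node name would be on the current machine, a quick check (assuming the default rabbit prefix used by rabbitmq-server) is:

```shell
# Prints the node name rabbitmqctl would target by default on this host;
# the short host name is whatever follows the @ sign.
echo "rabbit@$(hostname -s)"
```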

In order to link up our three nodes in a cluster, we tell two of the nodes, say rabbit@rabbit2 and rabbit@rabbit3, to join the cluster of the third, say rabbit@rabbit1.

We first join rabbit@rabbit2 as a RAM node in a cluster with rabbit@rabbit1. To do that, on rabbit@rabbit2 we stop the RabbitMQ application, join the rabbit@rabbit1 cluster with the --ram flag enabled, and restart the RabbitMQ application. Note that joining a cluster implicitly resets the node, thus removing all resources and data that were previously present on that node.

rabbit2$ rabbitmqctl stop_app
stopping node rabbit@rabbit2 ...done.
rabbit2$ rabbitmqctl join_cluster --ram rabbit@rabbit1
clustering node rabbit@rabbit2 with [rabbit@rabbit1] ...done.
rabbit2$ rabbitmqctl start_app
starting node rabbit@rabbit2 ...done.

We can see that the two nodes are joined in a cluster by running the cluster_status command on either of the nodes:

rabbit1$ rabbitmqctl cluster_status
cluster status of node rabbit@rabbit1 ...
[{nodes,[{disc,[rabbit@rabbit1]},{ram,[rabbit@rabbit2]}]},
 {running_nodes,[rabbit@rabbit2,rabbit@rabbit1]}]
...done.
rabbit2$ rabbitmqctl cluster_status
cluster status of node rabbit@rabbit2 ...
[{nodes,[{disc,[rabbit@rabbit1]},{ram,[rabbit@rabbit2]}]},
 {running_nodes,[rabbit@rabbit1,rabbit@rabbit2]}]
...done.

Now we join rabbit@rabbit3 as a disk node to the same cluster. The steps are identical to the ones above, except that we omit the --ram flag in order to turn it into a disk rather than RAM node. This time we cluster to rabbit2, to demonstrate that the node chosen to cluster to does not matter - it is enough to provide one online node, and the node will be clustered to the cluster that the specified node belongs to.

rabbit3$ rabbitmqctl stop_app
stopping node rabbit@rabbit3 ...done.
rabbit3$ rabbitmqctl join_cluster rabbit@rabbit2
clustering node rabbit@rabbit3 with rabbit@rabbit2 ...done.
rabbit3$ rabbitmqctl start_app
starting node rabbit@rabbit3 ...done.

We can see that the three nodes are joined in a cluster by running the cluster_status command on any of the nodes:


rabbit1$ rabbitmqctl cluster_status
cluster status of node rabbit@rabbit1 ...
[{nodes,[{disc,[rabbit@rabbit1,rabbit@rabbit3]},{ram,[rabbit@rabbit2]}]},
 {running_nodes,[rabbit@rabbit3,rabbit@rabbit2,rabbit@rabbit1]}]
...done.
rabbit2$ rabbitmqctl cluster_status
cluster status of node rabbit@rabbit2 ...
[{nodes,[{disc,[rabbit@rabbit1,rabbit@rabbit3]},{ram,[rabbit@rabbit2]}]},
 {running_nodes,[rabbit@rabbit3,rabbit@rabbit1,rabbit@rabbit2]}]
...done.
rabbit3$ rabbitmqctl cluster_status
cluster status of node rabbit@rabbit3 ...
[{nodes,[{disc,[rabbit@rabbit3,rabbit@rabbit1]},{ram,[rabbit@rabbit2]}]},
 {running_nodes,[rabbit@rabbit2,rabbit@rabbit1,rabbit@rabbit3]}]
...done.

By following the above steps we can add new nodes to the cluster at any time, while the cluster is running.

We can change the type of a node from RAM to disk and vice versa. Say we wanted to reverse the types of rabbit@rabbit2 and rabbit@rabbit3, turning the former from a RAM node into a disk node and the latter from a disk node into a RAM node. To do that we can use the change_cluster_node_type command. The node must be stopped first.

rabbit2$ rabbitmqctl change_cluster_node_type disc
turning rabbit@rabbit2 into a disc node ...
rabbit3$ rabbitmqctl change_cluster_node_type ram
turning rabbit@rabbit3 into a ram node ...

Nodes that have been joined to a cluster can be stopped at any time. It is also OK for them to crash. In both cases the rest of the cluster continues operating unaffected, and the nodes automatically "catch up" with the other cluster nodes when they start up again.

We shut down the nodes rabbit@rabbit1 and rabbit@rabbit3 and check on the cluster status at each step:


rabbit1$ rabbitmqctl stop
stopping and halting node rabbit@rabbit1 ...done.
rabbit2$ rabbitmqctl cluster_status
cluster status of node rabbit@rabbit2 ...
[{nodes,[{disc,[rabbit@rabbit1,rabbit@rabbit2]},{ram,[rabbit@rabbit3]}]},
 {running_nodes,[rabbit@rabbit3,rabbit@rabbit2]}]
...done.
rabbit3$ rabbitmqctl cluster_status
cluster status of node rabbit@rabbit3 ...
[{nodes,[{disc,[rabbit@rabbit2,rabbit@rabbit1]},{ram,[rabbit@rabbit3]}]},
 {running_nodes,[rabbit@rabbit2,rabbit@rabbit3]}]
...done.
rabbit3$ rabbitmqctl stop
stopping and halting node rabbit@rabbit3 ...done.
rabbit2$ rabbitmqctl cluster_status
cluster status of node rabbit@rabbit2 ...
[{nodes,[{disc,[rabbit@rabbit2,rabbit@rabbit1]},{ram,[rabbit@rabbit3]}]},
 {running_nodes,[rabbit@rabbit2]}]
...done.

Now we start the nodes again, checking on the cluster status as we go along:


rabbit1$ rabbitmq-server -detached
rabbit3$ rabbitmq-server -detached
rabbit1$ rabbitmqctl cluster_status
cluster status of node rabbit@rabbit1 ...
[{nodes,[{disc,[rabbit@rabbit1,rabbit@rabbit2]},{ram,[rabbit@rabbit3]}]},
 {running_nodes,[rabbit@rabbit1,rabbit@rabbit2,rabbit@rabbit3]}]
...done.

There are some important caveats:

At least one disk node should be running at all times to prevent data loss. RabbitMQ will prevent the creation of a RAM-only cluster in many situations, but it still won't stop you from stopping and forcefully resetting all the disc nodes, which will lead to a RAM-only cluster. Doing this is not advisable and makes losing data very easy.

When the entire cluster is brought down, the last node to go down must be the first node to be brought online. If this doesn't happen, the nodes will wait 30 seconds for the last disc node to come back online, and fail afterwards. If the last node to go offline cannot be brought back up, it can be removed from the cluster using the forget_cluster_node command - consult the rabbitmqctl manpage for more information.

Nodes need to be removed explicitly from a cluster when they are no longer meant to be part of it. We first remove rabbit@rabbit3 from the cluster, returning it to independent operation. To do that, on rabbit@rabbit3 we stop the RabbitMQ application, reset the node, and restart the RabbitMQ application.

rabbit3$ rabbitmqctl stop_app
stopping node rabbit@rabbit3 ...done.
rabbit3$ rabbitmqctl reset
resetting node rabbit@rabbit3 ...done.
rabbit3$ rabbitmqctl start_app
starting node rabbit@rabbit3 ...done.

Note that it would have been equally valid to list rabbit@rabbit3 as a node.

Running the cluster_status command on the nodes confirms that rabbit@rabbit3 now is no longer part of the cluster and operates independently:

rabbit1$ rabbitmqctl cluster_status
cluster status of node rabbit@rabbit1 ...
[{nodes,[{disc,[rabbit@rabbit1,rabbit@rabbit2]}]},
 {running_nodes,[rabbit@rabbit1,rabbit@rabbit2]}]
...done.
rabbit3$ rabbitmqctl cluster_status
cluster status of node rabbit@rabbit3 ...
[{nodes,[{disc,[rabbit@rabbit3]}]},{running_nodes,[rabbit@rabbit3]}]
...done.

We can also remove nodes remotely. This is useful, for example, when having to deal with an unresponsive node. We can for example remove rabbit@rabbit1 from rabbit@rabbit2.

rabbit1$ rabbitmqctl stop_app
stopping node rabbit@rabbit1 ...done.
rabbit2$ rabbitmqctl forget_cluster_node rabbit@rabbit1
removing node rabbit@rabbit1 from cluster ...

Note that rabbit1 still thinks it's clustered with rabbit2, and trying to start it will result in an error. We will need to reset it to be able to start it again.

rabbit1$ rabbitmqctl start_app
starting node rabbit@rabbit1 ...
error: inconsistent_cluster: node rabbit@rabbit1 thinks it's clustered with node rabbit@rabbit2, but rabbit@rabbit2 disagrees
rabbit1$ rabbitmqctl reset
resetting node rabbit@rabbit1 ...done.

The cluster_status command now shows all three nodes operating as independent RabbitMQ brokers:

Note that rabbit@rabbit2 retains the residual state of the cluster, whereas rabbit@rabbit1 and rabbit@rabbit3 are freshly initialised RabbitMQ brokers. If we want to re-initialise rabbit@rabbit2 we follow the same steps as for the other nodes:

rabbit2$ rabbitmqctl reset
resetting node rabbit@rabbit2 ...done.

Instead of configuring clusters "on the fly" using the cluster command, clusters can also be set up via the RabbitMQ configuration file. The file should set the cluster_nodes field in the rabbit application to a tuple containing a list of rabbit nodes, and an atom - either disc or ram - indicating whether the node should join them as a disc node or not.

If cluster_nodes is specified, RabbitMQ will try to cluster to each node provided, and stop after it can cluster with one of them. RabbitMQ will try to cluster to any node which is online and has the same version of Erlang and RabbitMQ. If no suitable nodes are found, the node is left unclustered.

Note that the cluster configuration is applied only to fresh nodes. A fresh node is a node which has just been reset or is being started for the first time. Thus, automatic clustering won't take place after restarts of nodes. This means that any change to the clustering via rabbitmqctl will take precedence over the automatic clustering configuration.

A common use of cluster configuration via the RabbitMQ config file is to automatically configure nodes to join a common cluster. For this purpose the same cluster nodes can be specified on all cluster nodes, plus the boolean to determine disc nodes.

Say we want to join our three separate nodes of our running example back into a single cluster, with rabbit@rabbit1 and rabbit@rabbit2 being the disk nodes of the cluster. First we reset and stop all nodes, to make sure that we're working with fresh nodes:

rabbit2$ rabbitmqctl stop
stopping and halting node rabbit@rabbit2 ...done.

Now we set the relevant field in the config file:

[
  ...
  {rabbit, [
        ...
        {cluster_nodes, {['rabbit@rabbit1', 'rabbit@rabbit2', 'rabbit@rabbit3'], disc}},
        ...
  ]},
  ...
].

For instance, if this were the only field we needed to set, we would simply create the RabbitMQ config file with the contents:

[{rabbit,
  [{cluster_nodes, {['rabbit@rabbit1', 'rabbit@rabbit2', 'rabbit@rabbit3'], disc}}]}].

Since we want rabbit@rabbit3 to be a RAM node, we need to specify that in its configuration file:

[{rabbit,
  [{cluster_nodes, {['rabbit@rabbit1', 'rabbit@rabbit2', 'rabbit@rabbit3'], ram}}]}].

(Note for Erlang programmers and the curious: this is a standard Erlang configuration file. For more details, see the configuration guide and the Erlang config man page.)

Once we have the configuration files in place, we simply start the nodes. Running the cluster_status command confirms that the three nodes have indeed joined the same cluster.

Note that, in order to remove a node from an auto-configured cluster, it must first be removed from the rabbitmq.config files of the other nodes in the cluster. Only then can it be reset safely.

When upgrading from one version of RabbitMQ to another, RabbitMQ will automatically update its persistent data structures if necessary. In a cluster, this task is performed by the first disc node to be started (the "upgrader" node). Therefore when upgrading a RabbitMQ cluster, you should not attempt to start any RAM nodes first; any RAM nodes started will emit an error message and fail to start up.

All nodes in a cluster must be running the same versions of Erlang and RabbitMQ, although they may have different plugins installed. Therefore it is necessary to stop all nodes in the cluster, then start all nodes when performing an upgrade.

While not strictly necessary, it is a good idea to decide ahead of time which disc node will be the upgrader, stop that node last, and start it first. Otherwise changes to the cluster configuration that were made between the upgrader node stopping and the last node stopping will be lost.

Automatic upgrades are only possible from RabbitMQ versions 2.1.1 and later. If you have an earlier cluster, you will need to rebuild it to upgrade.

Under some circumstances it can be useful to run a cluster of RabbitMQ nodes on a single machine. This would typically be useful for experimenting with clustering on a desktop or laptop without the overhead of starting several virtual machines for the cluster. The two main requirements for running more than one node on a single machine are that each node should have a unique name and bind to a unique port / IP address combination for each protocol in use.

You can start multiple nodes on the same host manually by repeated invocation of rabbitmq-server (rabbitmq-server.bat on Windows). You must ensure that for each invocation you set the environment variables RABBITMQ_NODENAME and RABBITMQ_NODE_PORT to suitable values.

For example:

$ RABBITMQ_NODE_PORT=5672 RABBITMQ_NODENAME=rabbit rabbitmq-server -detached
$ RABBITMQ_NODE_PORT=5673 RABBITMQ_NODENAME=hare rabbitmq-server -detached
$ rabbitmqctl -n hare stop_app
$ rabbitmqctl -n hare reset
$ rabbitmqctl -n hare join_cluster --ram rabbit@`hostname -s`
$ rabbitmqctl -n hare start_app

This will set up a two-node cluster with one disc node and one RAM node. Note that if you have RabbitMQ opening any ports other than AMQP, you'll need to configure those not to clash as well - for example:

$ RABBITMQ_NODE_PORT=5672 RABBITMQ_SERVER_START_ARGS="-rabbitmq_management listener [{port,15672}]" RABBITMQ_NODENAME=rabbit rabbitmq-server -detached
$ RABBITMQ_NODE_PORT=5673 RABBITMQ_SERVER_START_ARGS="-rabbitmq_management listener [{port,15673}]" RABBITMQ_NODENAME=hare rabbitmq-server -detached

This will start two nodes (which can then be clustered) when the management plugin is installed.

The case for firewalled clustered nodes exists when nodes are in a data center or on a reliable network, but separated by firewalls. Again, clustering is not recommended over a WAN or when network links between nodes are unreliable.

If different nodes of a cluster are in the same data center, but behind firewalls, then additional configuration will be necessary to ensure inter-node communication. Erlang makes use of a port mapper daemon (epmd) for resolution of node names in a cluster. Nodes must be able to reach each other and the port mapper daemon for clustering to work.

The default epmd port is 4369, but this can be changed using the ERL_EPMD_PORT environment variable. All nodes must use the same port. Firewalls must permit traffic on this port to pass between clustered nodes. For further details see the Erlang epmd manpage.
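As a sketch (4369 here is simply the default value restated explicitly), the variable would be exported identically on every node, in the environment that starts the broker:

```shell
# All nodes and rabbitmqctl invocations must agree on the epmd port,
# so export it before starting rabbitmq-server on each machine.
export ERL_EPMD_PORT=4369
echo "$ERL_EPMD_PORT"
```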

Once a distributed Erlang node address has been resolved via epmd, other nodes will attempt to communicate directly with that address using the Erlang distributed node protocol. The port range for this communication can be configured with two parameters for the Erlang kernel application:

inet_dist_listen_min
inet_dist_listen_max

Firewalls must permit traffic in this range to pass between clustered nodes (assuming all nodes use the same port range). The default port range is unrestricted.

The Erlang kernel_app manpage contains more details on the port range that distributed Erlang nodes listen on. See the configuration page for information on how to create and edit a configuration file.
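A minimal sketch of such a configuration file entry (the range 35672-35682 is an arbitrary example chosen for illustration, not a value from the original text) would pin distribution to a fixed range:

```
[{kernel,
  [{inet_dist_listen_min, 35672},
   {inet_dist_listen_max, 35682}]}].
```

The firewall would then only need to pass the epmd port plus this fixed range between the cluster nodes.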

A client can connect as normal to any node within a cluster. If that node should fail, and the rest of the cluster survives, then the client should notice the closed connection, and should be able to reconnect to some surviving member of the cluster. Generally, it's not advisable to bake node hostnames or IP addresses into client applications: this introduces inflexibility and will require client applications to be edited, recompiled and redeployed should the configuration of the cluster change or the number of nodes in the cluster change. Instead, we recommend a more abstracted approach: this could be a dynamic DNS service which has a very short TTL configuration, or a plain TCP load balancer, or some sort of mobile IP achieved with pacemaker or similar technologies. In general, this aspect of managing the connection to nodes within a cluster is beyond the scope of RabbitMQ itself, and we recommend the use of other technologies designed specifically to solve these problems.
