Resolving split-brains

June 2017 · 2 minute read

This guide describes how to resolve a split-brain situation in three steps. First, decide which node(s) will be the split-brain victim(s), whose diverged data is discarded, and which node will be the split-brain survivor, whose data is kept. The steps are shown for a two-node cluster. We assume that drbdtop is running on all nodes and that the resource to act on is selected.

tl;dr

  • on all nodes: 'c','d' (connection, disconnect)
  • on the victim(s): 'c','m' (connection, discard-my-data)
  • on the survivor: 'c','c' (connection, connect resource)
  • be happy ;-)

Detailed version

In this section we present a step-by-step guide illustrated with screenshots.

Preparation

Select the resource on all nodes, as depicted in the following screenshot: [screenshot: select]

If the nodes are already StandAlone, this step is not necessary, but it is an easy way to make sure that all connections are disconnected: press 'c', 'd' (connection, disconnect) on all nodes.

Press 'c' on all nodes again so that they are in the connection menu: [screenshot: connect]

Victims

On the victim(s) press 'm', which issues a drbdadm connect --discard-my-data foo.

Survivor

On the survivor press 'c', which issues a drbdadm connect foo.
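
For readers who want to see the sequence outside of drbdtop, the following is a minimal sketch of the commands behind the key presses. The two connect commands are the ones named above. That the disconnect key issues drbdadm disconnect is an assumption, and split_brain_plan is a hypothetical helper name used only for illustration. The function only prints the commands and executes nothing, so the plan can be reviewed before it is run on each node.

```shell
#!/bin/sh
# Hypothetical helper: print the drbdadm commands for one node.
# Nothing is executed here; the output is only a plan to review.
split_brain_plan() {
    role=$1   # "victim" or "survivor"
    res=$2    # resource name, e.g. foo

    case "$role" in
        victim)   connect_opts="--discard-my-data " ;;
        survivor) connect_opts="" ;;
        *)        echo "unknown role: $role" >&2; return 1 ;;
    esac

    # Assumption: this is what 'c','d' does in drbdtop.
    echo "drbdadm disconnect $res"
    # From the guide: 'm' on the victim(s), 'c' on the survivor.
    echo "drbdadm connect ${connect_opts}$res"
}

split_brain_plan victim foo
split_brain_plan survivor foo
```

Run the victim plan on the victim(s) first and the survivor plan on the survivor afterwards, in the same order as the key presses above.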

After these steps, the resource starts to synchronize and the data of the victim(s) is overwritten with the data of the survivor. Note that this does not imply a full synchronization, so it should finish quickly (depending on the amount of data that diverged). You can monitor the progress in the in-sync view of drbdtop.

After the synchronization has finished, the resource is healthy again. [screenshot: healthy]