Database synchronization is the central data-integrity mechanism of the Nexus Repo Manager HA cluster architecture. It allows failed nodes to rejoin the cluster and sync with their healthy cluster mates. If, however, you need to bring down all of the nodes (e.g. for planned maintenance), additional steps are required to keep the database and cluster state accurate and to minimize the chance of data loss. Here are the complete steps to perform an orderly shutdown of a Nexus 3 Repository Manager cluster:
- Begin by checking the nexus.log of each node to confirm that the cluster is otherwise healthy. For example, you should see 'Distributed servers status' entries with an ASCII table of nodes and statuses. You are looking for an 'ONLINE' status for all members.
- If the cluster is healthy, begin shutting down the nodes one at a time. The order isn't important, as the DB state is dynamically synced among the running nodes. Monitor the nexus.log (in sonatype-work/nexus3/log/) on the surviving nodes to verify that the exiting nodes are removing themselves from the cluster.
- When there is only one node left, check the Nexus Repo Manager web UI. Is there a quorum error banner at the top of the page? Alternatively, check the nexus.log of the node; does the 'Distributed servers status' table report more than one node? If so, then follow the instructions here to zero the cluster via the Admin UI.
- Stop the final node and verify that it has shut down completely before performing any maintenance.
- Once you are ready to bring the cluster back up, start the last node and allow it to come up completely. The nexus.log should show a healthy single-node cluster.
- Bring the other nodes up in any order and verify that they are accepted into the cluster via the 'Distributed servers status' table.
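
The log checks in the steps above can be partly automated. The following is a minimal Python sketch, not an official Sonatype tool. It assumes the 'Distributed servers status' entry is followed by a pipe-delimited ASCII table whose header row contains 'Name' and 'Status' columns and whose rows end with a '|'. Verify that layout against your own nexus.log before relying on it. The function names are hypothetical and exist only for illustration.

```python
MARKER = "Distributed servers status"


def parse_last_status_table(log_text):
    """Return {node_name: status} from the last status table in the log text.

    Assumed layout (check against your own nexus.log): the marker line is
    followed by border rows starting with '+' and data rows of the form
    '|cell|cell|...|', with a header row naming 'Name' and 'Status' columns.
    """
    lines = log_text.splitlines()

    # Locate the most recent status entry in the log.
    start = None
    for i, line in enumerate(lines):
        if MARKER in line:
            start = i
    if start is None:
        return {}

    nodes = {}
    header = None
    for line in lines[start + 1:]:
        stripped = line.strip()
        if stripped.startswith("+"):
            continue  # border row
        if not stripped.startswith("|"):
            if header is not None:
                break  # first non-table line after the table ends it
            continue  # still looking for the start of the table
        # Drop the empty strings produced by the leading and trailing '|'.
        cells = [c.strip() for c in stripped.split("|")[1:-1]]
        if header is None:
            header = [c.lower() for c in cells]
            continue
        row = dict(zip(header, cells))
        name = row.get("name")
        if name:  # skip continuation rows that have an empty name cell
            nodes[name] = row.get("status", "")
    return nodes


def cluster_is_healthy(nodes):
    """True when at least one node is listed and every node is ONLINE."""
    return bool(nodes) and all(status == "ONLINE" for status in nodes.values())


def stale_members_present(nodes):
    """On the last running node, more than one listed member suggests the
    cluster still needs to be zeroed before the final shutdown."""
    return len(nodes) > 1
```

To use it, read the nexus.log from sonatype-work/nexus3/log/ on a node, pass its text to parse_last_status_table, and check cluster_is_healthy before the first shutdown. On the last remaining node, stale_members_present returning True corresponds to the case where the table still reports more than one node. The sketch only reads the log; it does not replace checking the web UI for a quorum error banner.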