Multiwrite (MW) Replication

Multiwrite replication is a good choice for a system that includes applications requiring real-time replication.
To use multiwrite replication successfully, your network links, computers, and power supplies should be reliable.
Warning!
Treat system crashes and unreliable networks as disaster scenarios. Recovery requires manual reconciliation between datastores.
How Multiwrite Replication Works
Multiwrite replication sends updates in real time to all other DSAs in the same group or region. When a client makes an update request, that update is applied immediately to the local DSA, and then to all other DSAs. The client receives a confirmation response only after all DSAs have responded.
Example: A Simple Multiwrite System
The following diagram shows these steps:
  1. The client sends an update request to the directory. DSA 1 receives the request and immediately applies the update to itself.
  2. DSA 1 sends the update request to its peers, DSA 2 and DSA 3.
  3. DSA 2 and DSA 3 each apply the update to itself, and then send an update response to DSA 1.
  4. After receiving an update response from both peers, DSA 1 sends an update confirmation to the client.
    Any client can now query any DSA and get the same response, because the update has been made to all DSAs.
    Example of a Simple Multiwrite System
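The four steps above can be sketched as follows. This is a minimal illustration of the synchronous confirmation rule only, not a real DSA implementation; all class and method names are hypothetical.

```python
# Sketch of synchronous multiwrite: the update is applied locally, then to
# every peer, and the client is confirmed only after all peers respond.
# All names here are illustrative, not part of any real DSA product API.

class DSA:
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.peers = []          # other DSAs in the same multiwrite group

    def client_update(self, key, value):
        """Handle a client update request (steps 1-4 above)."""
        self.data[key] = value                 # apply locally first
        for peer in self.peers:                # forward to every peer
            peer.apply_replicated(key, value)  # synchronous: wait for each
        return "confirmed"                     # only after all peers respond

    def apply_replicated(self, key, value):
        self.data[key] = value                 # peer applies, then "responds"


dsa1, dsa2, dsa3 = DSA("DSA 1"), DSA("DSA 2"), DSA("DSA 3")
dsa1.peers = [dsa2, dsa3]

assert dsa1.client_update("cn=Smith", "updated") == "confirmed"
# Any DSA can now answer a query for this entry with the same result.
assert dsa1.data == dsa2.data == dsa3.data
```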
Recovery in Multiwrite
A multiwrite update can fail. Usually, this is because a DSA is down or otherwise offline. When the DSA comes back online again, the system must be recovered.
With multiwrite replication, there are two ways you can recover the system: multiwrite recovery and DISP recovery. If recovery is with multiwrite, the replication scheme is just called multiwrite. If recovery is with DISP, the replication scheme is called Multiwrite-DISP.
If you want to ensure that a region continues to service a namespace even if one DSA fails, you need at least three DSAs in each region. This is because if one DSA fails, you may need to take a second DSA offline to resynchronize the failed DSA.
Multiwrite Queues
While a peer DSA is offline, the sending DSA puts new updates for the peer in a queue, and periodically tries to connect to the peer.
When the peer DSA comes back online, the queued requests are sent in the order that they were processed locally.
While the peer DSA remains offline, the queue grows. The DSA raises an alarm, and writes to the alarm log, at 60%, 70%, 80%, 90%, and 100% of queue capacity.
If the queue becomes full, the sending DSA gives up on the unavailable peer. It discards all the queued requests for the unavailable DSA and temporarily drops it from the multiwrite set. To bring the dropped DSA back into service, you must resynchronize the DSA datastores manually, and then restart the DSAs.
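The queue behavior described above can be sketched as follows. The class name, method names, and fixed capacity are illustrative assumptions; only the alarm thresholds and the discard-and-drop behavior come from the text.

```python
# Sketch of a multiwrite queue for an offline peer: updates are queued,
# alarms are raised at 60-100% of capacity, and a full queue causes all
# queued updates for that peer to be discarded and the peer to be dropped.
# Names and the capacity value are hypothetical, not a real product API.

ALARM_THRESHOLDS = (60, 70, 80, 90, 100)   # percent of queue capacity

class MultiwriteQueue:
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []
        self.alarms = []        # stands in for the alarm log
        self.dropped = False    # peer dropped from the multiwrite set

    def enqueue(self, update):
        if self.dropped:
            return              # peer already requires manual resync
        if len(self.items) == self.capacity:
            self.items.clear()  # discard everything for the unavailable DSA
            self.dropped = True
            return
        self.items.append(update)
        pct = len(self.items) * 100 // self.capacity
        for t in ALARM_THRESHOLDS:
            if pct >= t and t not in self.alarms:
                self.alarms.append(t)

q = MultiwriteQueue(capacity=10)
for i in range(10):
    q.enqueue(f"update {i}")
q.enqueue("one too many")       # queue is full: discard and drop the peer
assert q.alarms == [60, 70, 80, 90, 100]
assert q.dropped and q.items == []
```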
If a DSA has attempted a multiwrite operation, the
get dsp
console command returns one of the following states:
  • Failed
    Indicates that the multiwrite DSA is not responding to an update. Updates for that DSA are held in the multiwrite queue until the DSA responds, or until the maximum multiwrite queue size is exceeded.
  • Recovering
    Indicates that the DSA has become available and still has old updates in its queue.
  • OK
    This is the normal multiwrite DSA status.
Multiwrite Replication with Multiwrite Recovery
Multiwrite is based on the idea that the multiwrite DSAs are usually functioning and connected. If one of the DSAs in the region shuts down or becomes disconnected, any updates are queued in another DSA's memory until the offline DSA becomes available once more.
After the DSA puts the update request in a queue, it sends a confirmation to the client. In effect, multiwrite reverts to a write-behind scheme until the offline DSA becomes available.
A queued update is stored in memory only, and is lost if the DSA holding the queue is restarted.
The following diagrams show how a DSA in a simple multiwrite system recovers.
  1. The system is functioning correctly.
    A single router DSA passes client requests to two data DSAs, which replicate all changes to each other.
    Single router DSA passing client requests to two data DSAs, which replicate all changes to each other
  2. Data DSA 2 goes offline.
    Data DSA 2 goes offline
    While DSA 2 is down, the following happens when the client application makes an update request:
    1. The router DSA passes the update request to DSA 1.
    2. DSA 1 makes the update to itself and queues the update for DSA 2.
      DSA 2 is now out-of-date.
  3. DSA 2 comes online again, as follows:
    1. DSA 2 starts up in recovery mode, which means that it can receive binds only from its peer, DSA 1.
    2. DSA 1 sends updates from its queue, in the order that it queued them, to DSA 2, as follows:
      Data DSA 2 comes online in recovery mode
    3. When the queue is empty, DSA 1 sends a notification to DSA 2 that the data is synchronized. This switches DSA 2 out of recovery mode, returning it to service.
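The recovery sequence above can be sketched as follows. The classes and methods are illustrative assumptions; the state strings mirror the statuses reported by the get dsp console command.

```python
# Sketch of multiwrite recovery: while DSA 2 is offline, DSA 1 queues
# updates in the order it applied them; when DSA 2 returns it starts in
# recovery mode, DSA 1 replays the queue, and an empty queue switches
# DSA 2 back to normal service. Names are hypothetical, not a real API.

class DataDSA:
    def __init__(self, name):
        self.name, self.data, self.online = name, {}, True
        self.state = "OK"          # as reported by `get dsp`
        self.queue_for_peer = []   # updates held for an offline peer

    def update(self, peer, key, value):
        self.data[key] = value
        if peer.online:
            peer.data[key] = value
        else:
            peer.state = "Failed"            # peer is not responding
            self.queue_for_peer.append((key, value))

    def recover_peer(self, peer):
        peer.online, peer.state = True, "Recovering"
        while self.queue_for_peer:           # replay in original order
            key, value = self.queue_for_peer.pop(0)
            peer.data[key] = value
        peer.state = "OK"                    # queue empty: data synchronized


dsa1, dsa2 = DataDSA("DSA 1"), DataDSA("DSA 2")
dsa2.online = False
dsa1.update(dsa2, "cn=Jones", "v1")
dsa1.update(dsa2, "cn=Jones", "v2")          # DSA 2 is now out of date
assert dsa2.state == "Failed" and dsa2.data == {}
dsa1.recover_peer(dsa2)
assert dsa2.state == "OK" and dsa2.data == dsa1.data
```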
Multiwrite Groups
Multiwrite replication works well if all DSAs are connected by high-speed, high-bandwidth links.
If some of your DSAs are connected by high-latency links, updates to the entire directory are slower. This is because multiwrite replication is synchronous by default: an update operation is not confirmed until all the multiwrite peers have applied it, which provides for load sharing and failover.
If your backbone includes any slow network connections, you should set up multiwrite groups. DSAs connected by slow links should be in different groups.
Within a multiwrite group, replication is synchronous. Between groups, replication is asynchronous.
How Multiwrite Groups Work
For each namespace partition, each group has one hub DSA. This is the DSA that accepts multiwrite requests from DSAs in other groups.
The following diagram shows a backbone with one namespace partition (Staff) across three regions:
A backbone with one namespace partition, Staff across three regions
The diagram shows the following steps:
  1. Write to self (synchronous):
    A client sends an update request to a DSA, which applies the update to itself.
  2. Write to peers in group:
    If the local update succeeds, the DSA sends the request to its peers in the same group. If these updates succeed, the peers send confirmations back to the first DSA.
  3. Send response to client:
    When the first DSA has received confirmation from each peer in its region, it sends the confirmation response to the client.
  4. Write to hub DSAs in other groups (asynchronous):
    The first DSA sends the request to the hub DSAs in each of the other groups.
  5. Hub DSAs write to peers:
    Each hub DSA sends the request to the other DSAs in its group.
    The following steps are not shown in the diagram.
  6. Peer DSAs write to self:
    Each peer DSA makes the update.
  7. Peer DSAs send confirmation to hubs:
    Each peer DSA sends confirmation of the update to the hub DSA of its group.
  8. Hub DSAs send confirmation to the first DSA:
    Each hub DSA sends the confirmation response to the first DSA. This DSA has already sent confirmation back to the client, so the client is not affected by the slow links.
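The group flow above can be sketched as follows, with the asynchronous hub writes modelled as a deferred task list rather than real network threads. All class and function names are illustrative assumptions, not a real DSA product API.

```python
# Sketch of multiwrite groups: replication is synchronous within a group
# (client is confirmed after local peers apply the update) and asynchronous
# between groups (hub DSAs are written later, over the slow links).

class Group:
    def __init__(self, name):
        self.name, self.members = name, []

    @property
    def hub(self):
        return self.members[0]   # one hub DSA per group (per partition)

class GroupDSA:
    def __init__(self, name, group):
        self.name, self.group, self.data = name, group, {}
        group.members.append(self)

def client_update(first_dsa, other_groups, deferred, key, value):
    first_dsa.data[key] = value                # 1. write to self
    for peer in first_dsa.group.members:       # 2. write to peers in group
        peer.data[key] = value
    confirmation = "confirmed"                 # 3. respond to client now
    for g in other_groups:                     # 4. hubs written later (async)
        deferred.append((g, key, value))
    return confirmation

def run_deferred(deferred):
    for group, key, value in deferred:
        hub = group.hub                        # hub accepts the request...
        hub.data[key] = value
        for member in group.members:           # 5-7. ...and fans it out
            member.data[key] = value


a, b = Group("A"), Group("B")
a1, a2 = GroupDSA("A1", a), GroupDSA("A2", a)
b1, b2 = GroupDSA("B1", b), GroupDSA("B2", b)
deferred = []
assert client_update(a1, [b], deferred, "cn=Lee", "v1") == "confirmed"
assert b1.data == {}    # client already confirmed; slow links not yet used
run_deferred(deferred)
assert b1.data == b2.data == a1.data
```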
Example: A Backbone with Three Multiwrite Groups
A global company has directory hosts in North America, Central America, North Africa and Europe.
This diagram shows the speed of the network connections between the sites:
Map showing which sites are connected by slow and fast links
Sites 1, 2, 3, 4, and 5 are linked by fast connections. Sites 6, 7, 8, and 9 are also linked by fast connections.
However, these two groups of sites are connected by slow links. Also, Site 10 is only connected to other sites by slow links.
The directory designers decide to create the following groups:
Group A: Site 1, Site 2, Site 3, Site 4, Site 5
Group B: Site 6, Site 7, Site 8, Site 9
Group C: Site 10
Example: Shutting Down a Data Centre
Using the previous example, suppose Group A needs to be taken offline, where:
  • All the DSAs have “set wait-for-multiwrite = true;”
  • All the DSAs have “set multi-write-disp-recovery = false;”
This scenario allows all the DSAs to be stopped while updates are coming in from Group B and Group C.
For all the multiwrite group DSAs in Site 1, Site 2, Site 3, Site 4, and Site 5 servicing Group A:
  1. Run the
    set isolate-multi-write-group=true;
    command, either through DXconsole or in the configuration (followed by an init).
  2. Issue the
    dxserver stop
    command for each DSA.
  3. Once the queues (if any) have drained, the DSAs stop.
  4. When bringing Group A back online, ensure that the “set isolate-multi-write-group” command is not present, or is set to false.
If Group A is offline for too long, multiwrite queues may fill on Group B and Group C DSAs. If the queues are not large enough or the outage takes too long, then a manual synchronization of all replicating DSAs may be required.