From ec909ced57e7d96b86e7ddb04e1cb67d970df124 Mon Sep 17 00:00:00 2001 From: Josip Seljan <62958579+the-joksim@users.noreply.github.com> Date: Wed, 25 Nov 2020 14:35:08 +0100 Subject: [PATCH] Update nomenclature in the replication feature spec (#47) * Update nomenclature in the replication feature spec * Update the replication feature spec to contain the changes to the new replica registering syntax Co-authored-by: jseljan --- docs/feature_spec/replication.md | 134 ++++++++++++++++++------------- 1 file changed, 76 insertions(+), 58 deletions(-) diff --git a/docs/feature_spec/replication.md b/docs/feature_spec/replication.md index 1539b8ed0..87b8e5ef9 100644 --- a/docs/feature_spec/replication.md +++ b/docs/feature_spec/replication.md @@ -5,18 +5,18 @@ Replication is a method that ensures that multiple database instances are storing the same data. To enable replication, there must be at least two instances of Memgraph in a cluster. Each instance has one of either two roles: -master or slave. The master instance is the instance that accepts writes to the -database and replicates its state to the slaves. In a cluster, there can only -be one master. There can be one or more slaves. None of the slaves will accept +main or replica. The main instance is the instance that accepts writes to the +database and replicates its state to the replicas. In a cluster, there can only +be one main. There can be one or more replicas. None of the replicas will accept write queries, but they will always accept read queries (there is an exception -to this rule and is described below). Slaves can also be configured to be -replicas of slaves, not necessarily replicas of the master. Each instance will +to this rule and is described below). Replicas can also be configured to be +replicas of replicas, not necessarily replicas of the main. Each instance will always be reachable using the standard supported communication protocols. The replication will replicate WAL data. All data is transported through a custom binary protocol that will try remain backward compatible, so that replication immediately allows for zero downtime upgrades. -Each slave can be configured to accept replicated data in one of the following +Each replica can be configured to accept replicated data in one of the following modes: - synchronous - asynchronous @@ -24,27 +24,27 @@ modes: ### Synchronous Replication -When the data is replicated to a slave synchronously, all of the data of a -currently pending transaction must be sent to the synchronous slave before the +When the data is replicated to a replica synchronously, all of the data of a +currently pending transaction must be sent to the synchronous replica before the transaction is able to commit its changes. This mode has a positive implication that all data that is committed to the -master will always be replicated to the synchronous slave. It also has a -negative performance implication because non-responsive slaves could grind all +main will always be replicated to the synchronous replica. It also has a +negative performance implication because non-responsive replicas could grind all query execution to a halt. This mode is good when you absolutely need to be sure that all data is always -consistent between the master and the slave. +consistent between the main and the replica. ### Asynchronous Replication -When the data is replicated to a slave asynchronously, all pending transactions -are immediately committed and their data is replicated to the asynchronous -slave in the background. +When the data is replicated to a replica asynchronously, all pending +transactions are immediately committed and their data is replicated to the +asynchronous replica in the background. This mode has a positive performance implication in which it won't slow down -query execution. It also has a negative implication that the data between the -master and the slave is almost never in a consistent state (when the data is +query execution. It also has a negative implication that the data between the +main and the replica is almost never in a consistent state (when the data is being changed). This mode is good when you don't care about consistency and only need an @@ -52,28 +52,28 @@ eventually consistent cluster, but you care about performance. ### Semi-synchronous Replication -When the data is replicated to a slave semi-synchronously, the data is +When the data is replicated to a replica semi-synchronously, the data is replicated using both the synchronous and asynchronous methodology. The data is -always replicated synchronously, but, if the slave for any reason doesn't +always replicated synchronously, but, if the replica for any reason doesn't respond within a preset timeout, the pending transaction is committed and the -data is replicated to the slave asynchronously. +data is replicated to the replica asynchronously. This mode has a positive implication that all data that is committed is -*mostly* replicated to the semi-synchronous slave. It also has a negative +*mostly* replicated to the semi-synchronous replica. It also has a negative performance implication as the synchronous replication mode. This mode is useful when you want the replication to be synchronous to ensure -that the data within the cluster is consistent, but you don't want the master -to grind to a halt when you have a non-responsive slave. +that the data within the cluster is consistent, but you don't want the main +to grind to a halt when you have a non-responsive replica. -### Addition of a New Slave +### Addition of a New Replica -Each slave when added to the cluster (in any mode) will first start out as an -asynchronous slave. That will allow slaves that have fallen behind to first -catch-up to the current state of the database. When the slave is in a state -that it isn't lagging behind the master it will then be promoted (in a brief -stop-the-world operation) to a semi-synchronous or synchronous slave. Slaves -that are added as asynchronous slaves will remain asynchronous. +Each replica, when added to the cluster (in any mode), will first start out as +an asynchronous replica. That will allow replicas that have fallen behind to +first catch-up to the current state of the database. When the replica is in a +state that it isn't lagging behind the main it will then be promoted (in a brief +stop-the-world operation) to a semi-synchronous or synchronous replica. Slaves +that are added as asynchronous replicas will remain asynchronous. ## User Facing Setup @@ -81,49 +81,67 @@ that are added as asynchronous slaves will remain asynchronous. Replication configuration is done primarily through openCypher commands. This allows the cluster to be dynamically rearranged (new leader election, addition -of a new slave, etc.). +of a new replica, etc.). -Each Memgraph instance when first started will be a master. You have to change -the mode of all slave nodes using the following openCypher query before you can -enable replication on the master: +Each Memgraph instance when first started will be a main. You have to change +the role of all replica nodes using the following openCypher query before you +can enable replication on the main: ```plaintext -SET REPLICATION MODE TO (MASTER|SLAVE); +SET REPLICATION ROLE TO (MAIN|REPLICA); ``` -After you have set your slave instance to the correct operating mode, you can -enable replication in the master instance by issuing the following openCypher +After you have set your replica instance to the correct operating role, you can +enable replication in the main instance by issuing the following openCypher command: ```plaintext -CREATE REPLICA name (SYNC|ASYNC) [WITH TIMEOUT 0.5] TO ; +REGISTER REPLICA name (SYNC|ASYNC) [WITH TIMEOUT 0.5] TO ; ``` -Each Memgraph instance will remember that the configuration was set to and will +The socket address must be a string of the following form: + +```plaintext +"IP_ADDRESS:PORT_NUMBER" +``` + +where IP_ADDRESS is a valid IP address, and PORT_NUMBER is a valid port number, +both given in decimal notation. +Note that in this case they must be separated by a single colon. +Alternatively, one can give the socket address as: + +```plaintext +"IP_ADDRESS" +``` + +where IP_ADDRESS must be a valid IP address, and the port number will be +assumed to be the default one (we specify it to be 10000). + +Each Memgraph instance will remember what the configuration was set to and will automatically resume with its role when restarted. ### How to Setup an Advanced Replication Scenario? The configuration allows for a more advanced scenario like this: ```plaintext -master -[asyncrhonous]-> slave 1 -[semi-synchronous]-> slave 2 +main -[asynchronous]-> replica 1 -[semi-synchronous]-> replica 2 ``` To configure the above scenario, issue the following commands: ```plaintext -SET REPLICATION MODE TO SLAVE; # on slave 1 -SET REPLICATION MODE TO SLAVE; # on slave 2 +SET REPLICATION ROLE TO REPLICA; # on replica 1 +SET REPLICATION ROLE TO REPLICA; # on replica 2 -CREATE REPLICA slave1 ASYNC TO ; # on master -CREATE REPLICA slave2 SYNC WITH TIMEOUT 0.5 TO ; # on slave 1 +REGISTER REPLICA replica1 ASYNC TO ; # on main +REGISTER REPLICA replica2 SYNC WITH TIMEOUT 0.5 TO ; # on replica 1 ``` ### How to See the Current Replication Status? -To see the replication mode of the current Memgraph instance, you can issue the +To see the replication ROLE of the current Memgraph instance, you can issue the following query: ```plaintext -SHOW REPLICATION MODE; +SHOW REPLICATION ROLE; ``` To see the replicas of the current Memgraph instance, you can issue the @@ -139,34 +157,34 @@ To delete a replica, issue the following query: DELETE REPLICA 'name'; ``` -### How to Promote a New Master? +### How to Promote a New Main? -When you have an already set-up cluster, to promote a new master, just set the -slave that you want to be a master to the master role. +When you have an already set-up cluster, to promote a new main, just set the +replica that you want to be a main to the main role. ```plaintext -SET REPLICATION MODE TO MASTER; # on desired slave +SET REPLICATION ROLE TO MAIN; # on desired replica ``` -After the command is issued, if the original master is still alive, it won't be -able to replicate its data to the slave (the new master) anymore and will enter +After the command is issued, if the original main is still alive, it won't be +able to replicate its data to the replica (the new main) anymore and will enter an error state. You must ensure that at any given point in time there aren't -two masters in the cluster. +two mains in the cluster. ## Integration with Memgraph -WAL `Delta`s are replicated between the replication master and slave. With -`Delta`s, all `StorageGlobalOperation`s are also replicated. Replication is +WAL `Delta`s are replicated between the replication main and replica. With +`Delta`s, all `StorageGlobalOperation`s are also replicated. Replication is essentially the same as appending to the WAL. Synchronous replication will occur in `Commit` and each -`StorageGlobalOperation` handler. The storage itself guarantees that `Commit` +`StorageGlobalOperation` handler. The storage itself guarantees that `Commit` will be called single-threadedly and that no `StorageGlobalOperation` will be -executed during an active transaction. Asynchronous replication will load its -data from already written WAL files and transmit the data to the slave. All +executed during an active transaction. Asynchronous replication will load its +data from already written WAL files and transmit the data to the replica. All data will be replicated using our RPC protocol (SLK encoded). -For each replica the replication master (or slave) will keep track of the +For each replica the replication main (or replica) will keep track of the replica's state. That way, it will know which operations must be transmitted to the replica and which operations can be skipped. When a replica is very stale, a snapshot will be transmitted to it so that it can quickly synchronize with