Update nomenclature in the replication feature spec (#47)
* Update nomenclature in the replication feature spec * Update the replication feature spec to contain the changes to the new replica registering syntax Co-authored-by: jseljan <josip.seljan@memgraph.io>
This commit is contained in:
parent
7e9175052a
commit
ec909ced57
@ -5,18 +5,18 @@
|
|||||||
Replication is a method that ensures that multiple database instances are
|
Replication is a method that ensures that multiple database instances are
|
||||||
storing the same data. To enable replication, there must be at least two
|
storing the same data. To enable replication, there must be at least two
|
||||||
instances of Memgraph in a cluster. Each instance has one of either two roles:
|
instances of Memgraph in a cluster. Each instance has one of either two roles:
|
||||||
master or slave. The master instance is the instance that accepts writes to the
|
main or replica. The main instance is the instance that accepts writes to the
|
||||||
database and replicates its state to the slaves. In a cluster, there can only
|
database and replicates its state to the replicas. In a cluster, there can only
|
||||||
be one master. There can be one or more slaves. None of the slaves will accept
|
be one main. There can be one or more replicas. None of the replicas will accept
|
||||||
write queries, but they will always accept read queries (there is an exception
|
write queries, but they will always accept read queries (there is an exception
|
||||||
to this rule and is described below). Slaves can also be configured to be
|
to this rule and is described below). Replicas can also be configured to be
|
||||||
replicas of slaves, not necessarily replicas of the master. Each instance will
|
replicas of replicas, not necessarily replicas of the main. Each instance will
|
||||||
always be reachable using the standard supported communication protocols. The
|
always be reachable using the standard supported communication protocols. The
|
||||||
replication will replicate WAL data. All data is transported through a custom
|
replication will replicate WAL data. All data is transported through a custom
|
||||||
binary protocol that will try remain backward compatible, so that replication
|
binary protocol that will try remain backward compatible, so that replication
|
||||||
immediately allows for zero downtime upgrades.
|
immediately allows for zero downtime upgrades.
|
||||||
|
|
||||||
Each slave can be configured to accept replicated data in one of the following
|
Each replica can be configured to accept replicated data in one of the following
|
||||||
modes:
|
modes:
|
||||||
- synchronous
|
- synchronous
|
||||||
- asynchronous
|
- asynchronous
|
||||||
@ -24,27 +24,27 @@ modes:
|
|||||||
|
|
||||||
### Synchronous Replication
|
### Synchronous Replication
|
||||||
|
|
||||||
When the data is replicated to a slave synchronously, all of the data of a
|
When the data is replicated to a replica synchronously, all of the data of a
|
||||||
currently pending transaction must be sent to the synchronous slave before the
|
currently pending transaction must be sent to the synchronous replica before the
|
||||||
transaction is able to commit its changes.
|
transaction is able to commit its changes.
|
||||||
|
|
||||||
This mode has a positive implication that all data that is committed to the
|
This mode has a positive implication that all data that is committed to the
|
||||||
master will always be replicated to the synchronous slave. It also has a
|
main will always be replicated to the synchronous replica. It also has a
|
||||||
negative performance implication because non-responsive slaves could grind all
|
negative performance implication because non-responsive replicas could grind all
|
||||||
query execution to a halt.
|
query execution to a halt.
|
||||||
|
|
||||||
This mode is good when you absolutely need to be sure that all data is always
|
This mode is good when you absolutely need to be sure that all data is always
|
||||||
consistent between the master and the slave.
|
consistent between the main and the replica.
|
||||||
|
|
||||||
### Asynchronous Replication
|
### Asynchronous Replication
|
||||||
|
|
||||||
When the data is replicated to a slave asynchronously, all pending transactions
|
When the data is replicated to a replica asynchronously, all pending
|
||||||
are immediately committed and their data is replicated to the asynchronous
|
transactions are immediately committed and their data is replicated to the
|
||||||
slave in the background.
|
asynchronous replica in the background.
|
||||||
|
|
||||||
This mode has a positive performance implication in which it won't slow down
|
This mode has a positive performance implication in which it won't slow down
|
||||||
query execution. It also has a negative implication that the data between the
|
query execution. It also has a negative implication that the data between the
|
||||||
master and the slave is almost never in a consistent state (when the data is
|
main and the replica is almost never in a consistent state (when the data is
|
||||||
being changed).
|
being changed).
|
||||||
|
|
||||||
This mode is good when you don't care about consistency and only need an
|
This mode is good when you don't care about consistency and only need an
|
||||||
@ -52,28 +52,28 @@ eventually consistent cluster, but you care about performance.
|
|||||||
|
|
||||||
### Semi-synchronous Replication
|
### Semi-synchronous Replication
|
||||||
|
|
||||||
When the data is replicated to a slave semi-synchronously, the data is
|
When the data is replicated to a replica semi-synchronously, the data is
|
||||||
replicated using both the synchronous and asynchronous methodology. The data is
|
replicated using both the synchronous and asynchronous methodology. The data is
|
||||||
always replicated synchronously, but, if the slave for any reason doesn't
|
always replicated synchronously, but, if the replica for any reason doesn't
|
||||||
respond within a preset timeout, the pending transaction is committed and the
|
respond within a preset timeout, the pending transaction is committed and the
|
||||||
data is replicated to the slave asynchronously.
|
data is replicated to the replica asynchronously.
|
||||||
|
|
||||||
This mode has a positive implication that all data that is committed is
|
This mode has a positive implication that all data that is committed is
|
||||||
*mostly* replicated to the semi-synchronous slave. It also has a negative
|
*mostly* replicated to the semi-synchronous replica. It also has a negative
|
||||||
performance implication as the synchronous replication mode.
|
performance implication as the synchronous replication mode.
|
||||||
|
|
||||||
This mode is useful when you want the replication to be synchronous to ensure
|
This mode is useful when you want the replication to be synchronous to ensure
|
||||||
that the data within the cluster is consistent, but you don't want the master
|
that the data within the cluster is consistent, but you don't want the main
|
||||||
to grind to a halt when you have a non-responsive slave.
|
to grind to a halt when you have a non-responsive replica.
|
||||||
|
|
||||||
### Addition of a New Slave
|
### Addition of a New Replica
|
||||||
|
|
||||||
Each slave when added to the cluster (in any mode) will first start out as an
|
Each replica, when added to the cluster (in any mode), will first start out as
|
||||||
asynchronous slave. That will allow slaves that have fallen behind to first
|
an asynchronous replica. That will allow replicas that have fallen behind to
|
||||||
catch-up to the current state of the database. When the slave is in a state
|
first catch-up to the current state of the database. When the replica is in a
|
||||||
that it isn't lagging behind the master it will then be promoted (in a brief
|
state that it isn't lagging behind the main it will then be promoted (in a brief
|
||||||
stop-the-world operation) to a semi-synchronous or synchronous slave. Slaves
|
stop-the-world operation) to a semi-synchronous or synchronous replica. Slaves
|
||||||
that are added as asynchronous slaves will remain asynchronous.
|
that are added as asynchronous replicas will remain asynchronous.
|
||||||
|
|
||||||
## User Facing Setup
|
## User Facing Setup
|
||||||
|
|
||||||
@ -81,49 +81,67 @@ that are added as asynchronous slaves will remain asynchronous.
|
|||||||
|
|
||||||
Replication configuration is done primarily through openCypher commands. This
|
Replication configuration is done primarily through openCypher commands. This
|
||||||
allows the cluster to be dynamically rearranged (new leader election, addition
|
allows the cluster to be dynamically rearranged (new leader election, addition
|
||||||
of a new slave, etc.).
|
of a new replica, etc.).
|
||||||
|
|
||||||
Each Memgraph instance when first started will be a master. You have to change
|
Each Memgraph instance when first started will be a main. You have to change
|
||||||
the mode of all slave nodes using the following openCypher query before you can
|
the role of all replica nodes using the following openCypher query before you
|
||||||
enable replication on the master:
|
can enable replication on the main:
|
||||||
|
|
||||||
```plaintext
|
```plaintext
|
||||||
SET REPLICATION MODE TO (MASTER|SLAVE);
|
SET REPLICATION ROLE TO (MAIN|REPLICA);
|
||||||
```
|
```
|
||||||
|
|
||||||
After you have set your slave instance to the correct operating mode, you can
|
After you have set your replica instance to the correct operating role, you can
|
||||||
enable replication in the master instance by issuing the following openCypher
|
enable replication in the main instance by issuing the following openCypher
|
||||||
command:
|
command:
|
||||||
```plaintext
|
```plaintext
|
||||||
CREATE REPLICA name (SYNC|ASYNC) [WITH TIMEOUT 0.5] TO <ip_address>;
|
REGISTER REPLICA name (SYNC|ASYNC) [WITH TIMEOUT 0.5] TO <socket_address>;
|
||||||
```
|
```
|
||||||
|
|
||||||
Each Memgraph instance will remember that the configuration was set to and will
|
The socket address must be a string of the following form:
|
||||||
|
|
||||||
|
```plaintext
|
||||||
|
"IP_ADDRESS:PORT_NUMBER"
|
||||||
|
```
|
||||||
|
|
||||||
|
where IP_ADDRESS is a valid IP address, and PORT_NUMBER is a valid port number,
|
||||||
|
both given in decimal notation.
|
||||||
|
Note that in this case they must be separated by a single colon.
|
||||||
|
Alternatively, one can give the socket address as:
|
||||||
|
|
||||||
|
```plaintext
|
||||||
|
"IP_ADDRESS"
|
||||||
|
```
|
||||||
|
|
||||||
|
where IP_ADDRESS must be a valid IP address, and the port number will be
|
||||||
|
assumed to be the default one (we specify it to be 10000).
|
||||||
|
|
||||||
|
Each Memgraph instance will remember what the configuration was set to and will
|
||||||
automatically resume with its role when restarted.
|
automatically resume with its role when restarted.
|
||||||
|
|
||||||
### How to Setup an Advanced Replication Scenario?
|
### How to Setup an Advanced Replication Scenario?
|
||||||
|
|
||||||
The configuration allows for a more advanced scenario like this:
|
The configuration allows for a more advanced scenario like this:
|
||||||
```plaintext
|
```plaintext
|
||||||
master -[asyncrhonous]-> slave 1 -[semi-synchronous]-> slave 2
|
main -[asynchronous]-> replica 1 -[semi-synchronous]-> replica 2
|
||||||
```
|
```
|
||||||
|
|
||||||
To configure the above scenario, issue the following commands:
|
To configure the above scenario, issue the following commands:
|
||||||
```plaintext
|
```plaintext
|
||||||
SET REPLICATION MODE TO SLAVE; # on slave 1
|
SET REPLICATION ROLE TO REPLICA; # on replica 1
|
||||||
SET REPLICATION MODE TO SLAVE; # on slave 2
|
SET REPLICATION ROLE TO REPLICA; # on replica 2
|
||||||
|
|
||||||
CREATE REPLICA slave1 ASYNC TO <slave1_ip>; # on master
|
REGISTER REPLICA replica1 ASYNC TO <replica1_sa>; # on main
|
||||||
CREATE REPLICA slave2 SYNC WITH TIMEOUT 0.5 TO <slave2_ip>; # on slave 1
|
REGISTER REPLICA replica2 SYNC WITH TIMEOUT 0.5 TO <replica2_sa>; # on replica 1
|
||||||
```
|
```
|
||||||
|
|
||||||
### How to See the Current Replication Status?
|
### How to See the Current Replication Status?
|
||||||
|
|
||||||
To see the replication mode of the current Memgraph instance, you can issue the
|
To see the replication ROLE of the current Memgraph instance, you can issue the
|
||||||
following query:
|
following query:
|
||||||
|
|
||||||
```plaintext
|
```plaintext
|
||||||
SHOW REPLICATION MODE;
|
SHOW REPLICATION ROLE;
|
||||||
```
|
```
|
||||||
|
|
||||||
To see the replicas of the current Memgraph instance, you can issue the
|
To see the replicas of the current Memgraph instance, you can issue the
|
||||||
@ -139,34 +157,34 @@ To delete a replica, issue the following query:
|
|||||||
DELETE REPLICA 'name';
|
DELETE REPLICA 'name';
|
||||||
```
|
```
|
||||||
|
|
||||||
### How to Promote a New Master?
|
### How to Promote a New Main?
|
||||||
|
|
||||||
When you have an already set-up cluster, to promote a new master, just set the
|
When you have an already set-up cluster, to promote a new main, just set the
|
||||||
slave that you want to be a master to the master role.
|
replica that you want to be a main to the main role.
|
||||||
|
|
||||||
```plaintext
|
```plaintext
|
||||||
SET REPLICATION MODE TO MASTER; # on desired slave
|
SET REPLICATION ROLE TO MAIN; # on desired replica
|
||||||
```
|
```
|
||||||
|
|
||||||
After the command is issued, if the original master is still alive, it won't be
|
After the command is issued, if the original main is still alive, it won't be
|
||||||
able to replicate its data to the slave (the new master) anymore and will enter
|
able to replicate its data to the replica (the new main) anymore and will enter
|
||||||
an error state. You must ensure that at any given point in time there aren't
|
an error state. You must ensure that at any given point in time there aren't
|
||||||
two masters in the cluster.
|
two mains in the cluster.
|
||||||
|
|
||||||
## Integration with Memgraph
|
## Integration with Memgraph
|
||||||
|
|
||||||
WAL `Delta`s are replicated between the replication master and slave. With
|
WAL `Delta`s are replicated between the replication main and replica. With
|
||||||
`Delta`s, all `StorageGlobalOperation`s are also replicated. Replication is
|
`Delta`s, all `StorageGlobalOperation`s are also replicated. Replication is
|
||||||
essentially the same as appending to the WAL.
|
essentially the same as appending to the WAL.
|
||||||
|
|
||||||
Synchronous replication will occur in `Commit` and each
|
Synchronous replication will occur in `Commit` and each
|
||||||
`StorageGlobalOperation` handler. The storage itself guarantees that `Commit`
|
`StorageGlobalOperation` handler. The storage itself guarantees that `Commit`
|
||||||
will be called single-threadedly and that no `StorageGlobalOperation` will be
|
will be called single-threadedly and that no `StorageGlobalOperation` will be
|
||||||
executed during an active transaction. Asynchronous replication will load its
|
executed during an active transaction. Asynchronous replication will load its
|
||||||
data from already written WAL files and transmit the data to the slave. All
|
data from already written WAL files and transmit the data to the replica. All
|
||||||
data will be replicated using our RPC protocol (SLK encoded).
|
data will be replicated using our RPC protocol (SLK encoded).
|
||||||
|
|
||||||
For each replica the replication master (or slave) will keep track of the
|
For each replica the replication main (or replica) will keep track of the
|
||||||
replica's state. That way, it will know which operations must be transmitted to
|
replica's state. That way, it will know which operations must be transmitted to
|
||||||
the replica and which operations can be skipped. When a replica is very stale,
|
the replica and which operations can be skipped. When a replica is very stale,
|
||||||
a snapshot will be transmitted to it so that it can quickly synchronize with
|
a snapshot will be transmitted to it so that it can quickly synchronize with
|
||||||
|
Loading…
Reference in New Issue
Block a user