memgraph/docs/dev/durability/wal.md
Ivan Paljak d106aff88f Implement full durability mode
Summary:
This diff introduces a new flag:
* `--synchronous-commit`

The `--synchronous-commit` flag tells the WAL when the deltas should be flushed
to the disk drive. By default it is off and the WAL flushes deltas every `N`
milliseconds. If it is turned on, then on every transaction end (commit or
abort) the WAL first flushes the deltas and only then returns from ending the
transaction.

Reviewers: buda, vkasljevic, mferencevic, teon.banek, ipaljak

Reviewed By: mferencevic

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1542
2018-08-29 16:05:07 +02:00


Write-ahead logging

Typically WAL denotes the process of writing a "log" of database operations (state changes) to persistent storage before committing the transaction, thus ensuring that the state can be recovered (in the case of a crash) for all the transactions which the database committed.

The WAL is a fine-grained durability format. Its purpose is to store database changes fast. Its primary purpose is not to provide space-efficient storage, nor to support fast recovery. For that reason it's often used in combination with a different persistence mechanism (in Memgraph's case the "snapshot") that has complementary characteristics.

Guarantees

Ensuring that the log is written before the transaction is committed can slow down the database. For that reason this guarantee is most often configurable in databases.

Memgraph offers two options for the WAL. In the default option, the WAL is flushed to the disk periodically and transactions do not wait for the flush to complete. This introduces the risk of database inconsistency, because an operating system or hardware crash might lead to missing transactions in the WAL. Memgraph will handle this as if those transactions never happened. The second option, called synchronous commit, instructs Memgraph to flush the WAL to the disk when a transaction completes; the transaction does not finish until the flush is done. This option can be turned on with the --synchronous-commit command line flag.
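
The difference between the two options can be illustrated with a minimal Python sketch. This is not Memgraph's implementation: the class ToyWal, its methods, the textual record format and the synchronous_commit parameter are hypothetical names chosen for illustration. The periodic flusher is omitted; in the default mode something like a background timer would call flush() every N milliseconds.

```python
import os
import tempfile


class ToyWal:
    """Toy write-ahead log illustrating the two flush policies (hypothetical)."""

    def __init__(self, path, synchronous_commit=False):
        self._file = open(path, "ab")
        self._buffer = []  # deltas that have not reached the disk yet
        self.synchronous_commit = synchronous_commit

    def emplace(self, delta: bytes):
        # Deltas are only buffered in memory at this point.
        self._buffer.append(delta)

    def flush(self):
        # Write the buffered deltas and force them to stable storage.
        for delta in self._buffer:
            self._file.write(delta + b"\n")
        self._buffer.clear()
        self._file.flush()
        os.fsync(self._file.fileno())

    def end_transaction(self, tx_id: int, committed: bool):
        marker = b"COMMIT " if committed else b"ABORT "
        self.emplace(marker + str(tx_id).encode())
        if self.synchronous_commit:
            # Synchronous commit: do not return until the deltas are on disk.
            self.flush()
        # Otherwise a periodic flusher (not shown) would call flush() later.

    def close(self):
        self._file.close()


def demo():
    """Return the on-disk WAL size right after a transaction ends, per mode."""
    sizes = {}
    for sync in (False, True):
        with tempfile.TemporaryDirectory() as tmp:
            path = os.path.join(tmp, "wal")
            wal = ToyWal(path, synchronous_commit=sync)
            wal.emplace(b"BEGIN 1")
            wal.end_transaction(1, committed=True)
            sizes[sync] = os.path.getsize(path)
            wal.close()
    return sizes
```

In this sketch, demo() reports an empty file for the default mode, because nothing has been flushed when the transaction ends, and a non-empty file for the synchronous mode. A crash at that moment would lose the transaction only in the first case.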

Format

The WAL file contains a series of DB state changes called StateDeltas. Each of them describes what the state change is and in which transaction it happened. Some meta-information needed to ensure proper state recovery is also recorded (transaction beginnings and commits/aborts).

The following is guaranteed w.r.t. StateDelta ordering in a single WAL file:

  • For two ops in the same transaction, if op A happened before B in the database, that ordering is preserved in the log.
  • Transaction begin/commit/abort messages also appear in exactly the same order as they were executed in the transactional engine.
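
The two guarantees above can be made concrete with a small Python sketch. The StateDelta fields and the operation names (CREATE_VERTEX, SET_PROPERTY) are assumptions made for illustration; the actual StateDelta layout is not described in this document. The example log interleaves two transactions and shows what a reader of the log may rely on: the per-transaction order of operations and the order of transaction begin/commit/abort markers.

```python
from collections import defaultdict
from dataclasses import dataclass

TX_MARKERS = ("BEGIN", "COMMIT", "ABORT")


@dataclass(frozen=True)
class StateDelta:
    # Hypothetical fields, not the real on-disk format.
    kind: str          # a transaction marker or an operation name
    tx_id: int
    payload: str = ""


def ops_per_transaction(log):
    """Group operation deltas by transaction, keeping their order in the log."""
    grouped = defaultdict(list)
    for delta in log:
        if delta.kind not in TX_MARKERS:
            grouped[delta.tx_id].append(delta.payload)
    return dict(grouped)


def transaction_events(log):
    """Return begin/commit/abort markers in the order they appear in the log."""
    return [(d.kind, d.tx_id) for d in log if d.kind in TX_MARKERS]


# Two interleaved transactions; transaction 2 commits before transaction 1.
log = [
    StateDelta("BEGIN", 1),
    StateDelta("BEGIN", 2),
    StateDelta("CREATE_VERTEX", 1, "A"),
    StateDelta("CREATE_VERTEX", 2, "X"),
    StateDelta("SET_PROPERTY", 1, "B"),
    StateDelta("COMMIT", 2),
    StateDelta("COMMIT", 1),
]
```

Here ops_per_transaction(log) keeps "A" before "B" for transaction 1, and transaction_events(log) shows the commit of transaction 2 before the commit of transaction 1, matching the order in the transactional engine.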

Recovery

The database can recover from the WAL on startup. This works in conjunction with snapshot recovery: the database first attempts to recover from the latest snapshot and then applies as much as possible from the WAL files. To speed up recovery, only those transactions that were not recovered from the snapshot are recovered from the WAL. It is possible (but inefficient) to recover the database from the WAL only, provided all the WAL files created since DB start are available. It is not possible to recover partial database state (i.e. from some suffix of WAL files, without the preceding snapshot).
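
The recovery procedure can be sketched as a toy model in Python. Everything here is a simplifying assumption rather than Memgraph's actual code: the state is a plain dictionary, the snapshot records which transaction ids it already covers under a hypothetical "covered_tx" key, and WAL entries are simple tuples. The sketch also assumes that transactions whose commit never reached the WAL are skipped.

```python
def recover(snapshot, wal_deltas):
    """Rebuild state from an optional snapshot plus WAL deltas (toy model).

    snapshot:   None, or a dict with "state" (dict) and "covered_tx"
                (set of transaction ids already reflected in the snapshot).
    wal_deltas: list of (kind, tx_id, key, value) tuples in log order.
    """
    state = dict(snapshot["state"]) if snapshot else {}
    covered = set(snapshot["covered_tx"]) if snapshot else set()

    # First pass: find the transactions whose commit reached the WAL.
    committed = {tx for kind, tx, _, _ in wal_deltas if kind == "COMMIT"}

    # Second pass: replay committed transactions the snapshot does not cover.
    for kind, tx, key, value in wal_deltas:
        if kind != "SET" or tx not in committed or tx in covered:
            continue
        state[key] = value
    return state


wal = [
    ("BEGIN", 1, None, None), ("SET", 1, "a", 1), ("COMMIT", 1, None, None),
    ("BEGIN", 2, None, None), ("SET", 2, "b", 2), ("COMMIT", 2, None, None),
    ("BEGIN", 3, None, None), ("SET", 3, "c", 3),  # never committed
]
snapshot = {"state": {"a": 1}, "covered_tx": {1}}

with_snapshot = recover(snapshot, wal)  # replays only transaction 2
wal_only = recover(None, wal)           # replays transactions 1 and 2
```

Both calls produce the same state in this example. The snapshot path simply skips transaction 1, which is why combining the snapshot with the WAL is faster than replaying the full WAL.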