
## Durability and Data Recovery

*Memgraph* uses two mechanisms to ensure the durability of the stored data:

  * write-ahead logging (WAL) and
  * taking periodic snapshots.

Write-ahead logging works by recording every database modification in a log
file. This ensures that all operations are done atomically and provides a
trace of the steps needed to reconstruct the database state.

Snapshots are taken periodically during the entire runtime of *Memgraph*. When
a snapshot is triggered, the whole data storage is written to disk. The
snapshot file provides a quicker way to restore the database state.

Database recovery is done on startup from the most recent snapshot file that
can be found. Since the snapshot may be older than the most recent update
logged in the WAL file, the recovery process then applies the remaining state
changes found in that WAL file.
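
The recovery order described above can be illustrated with a toy model. The
following is only a conceptual sketch, not *Memgraph's* actual implementation
or file format: the dictionary layout, the `tx_id` and `last_tx_id` fields and
the `recover` function are all assumptions made for illustration.

```python
def recover(snapshot, wal):
    """Rebuild state from a snapshot plus the WAL entries logged after it."""
    # Start from a copy of the data captured in the snapshot.
    state = dict(snapshot["data"])
    for entry in wal:
        # Changes already contained in the snapshot are skipped.
        if entry["tx_id"] <= snapshot["last_tx_id"]:
            continue
        if entry["op"] == "set":
            state[entry["key"]] = entry["value"]
        elif entry["op"] == "delete":
            state.pop(entry["key"], None)
    return state


# Hypothetical snapshot taken after transaction 2.
snapshot = {"last_tx_id": 2, "data": {"a": 1, "b": 2}}

# Hypothetical log that also contains two newer changes.
wal = [
    {"tx_id": 1, "op": "set", "key": "a", "value": 1},
    {"tx_id": 2, "op": "set", "key": "b", "value": 2},
    {"tx_id": 3, "op": "set", "key": "b", "value": 20},
    {"tx_id": 4, "op": "delete", "key": "a"},
]

recovered = recover(snapshot, wal)
print(recovered)  # {'b': 20}
```

In this model, replaying the whole log from an empty state gives the same
result. The snapshot only shortens the replay, which is why it provides a
quicker way to restore the database state.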

NOTE: Snapshot and WAL files are not (currently) compatible between *Memgraph*
versions.

The behaviour of the above mechanisms can be tweaked in the configuration
file, usually found in `/etc/memgraph/memgraph.conf`.

In addition to the durability and recovery mechanisms described above, a
snapshot file may be generated using *Memgraph's* import tools. For more
information, take a look at the **Import Tools** chapter.

## Storable Data Types

Since *Memgraph* is a *graph* database management system, data is stored in
the form of graph elements: nodes and edges. Each graph element can also
contain various types of data. This chapter describes which data types are
supported in *Memgraph*.

### Node Labels & Edge Types

Each node can have any number of labels. A label is a text value, which can be
used to *label* or group nodes as the user sees fit. A user can change labels
at any time. Similarly to labels, each edge has a type, represented as text.
Unlike nodes, which can have multiple labels or none at all, edges *must* have
exactly one edge type. Another difference from labels is that an edge type is
set upon creation and never modified again.
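
The difference between labels and edge types can be modelled in a few lines.
This is a minimal sketch of the rules stated above, not *Memgraph's* internal
data structures: the `Node` and `Edge` classes and their field names are
hypothetical.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Node:
    # A node may carry zero or more labels, and they can change at any time.
    labels: Set[str] = field(default_factory=set)


@dataclass(frozen=True)
class Edge:
    # An edge has exactly one type, which is fixed at creation.
    edge_type: str

    def __post_init__(self):
        if not self.edge_type:
            raise ValueError("an edge must have exactly one type")


person = Node()                  # no labels at all is allowed
person.labels.add("Person")      # labels can be added ...
person.labels.add("Author")
person.labels.discard("Author")  # ... and removed later

wrote = Edge("WROTE")
try:
    wrote.edge_type = "EDITED"   # rejected: the dataclass is frozen
except AttributeError:
    print("edge types cannot be changed after creation")
```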

### Properties

Nodes and edges can store various properties. These are like mappings or
tables containing property names and their accompanying values. Property names
are represented as text, while values can be of different types. Each property
name can store a single value; it is not possible to have multiple properties
with the same name on a single graph element. Naturally, the same property
names can be found across multiple graph elements. There are no restrictions
on the number of properties that can be stored in a single graph element. The
only restriction is that the values must be of the supported types. The
following table lists the supported data types.

 Type      | Description
-----------|------------
 `Null`    | Denotes that the property has no value. This is the same as if the property does not exist.
 `String`  | A character string, i.e. text.
 `Boolean` | A boolean value, either `true` or `false`.
 `Integer` | An integer number.
 `Float`   | A floating-point number, i.e. a real number.
 `List`    | A list containing any number of property values of any supported type. It can be used to store multiple values under a single property name.
 `Map`     | A mapping of string keys to values of any supported type.

Note that even though it's possible to store `List` and `Map` property values, it is not possible to modify them in place. It is, however, possible to replace them completely. So, the following queries are legal:

    CREATE (:Node {property: [1, 2, 3]})
    CREATE (:Node {property: {key: "value"}})

However, these queries are not:

    MATCH (n:Node) SET n.property[0] = 0
    MATCH (n:Node) SET n.property.key = "other value"

### Cold Data on Disk

Although *Memgraph* is an in-memory database by default, it offers an option
to store a certain amount of data on disk. More precisely, the user can pass
a list of properties they wish to keep stored on disk via the command line.
In certain cases, this might result in a significant performance boost due to
reduced memory usage. It is recommended to use this feature on large,
cold properties, i.e. properties that are rarely accessed.

For example, a user of a library database might identify author biographies
and book summaries as cold properties. In that case, the user should run
*Memgraph* as follows:

```
/usr/lib/memgraph/memgraph --properties-on-disk biography,summary
```

Note that the usage of *Memgraph* does not change, i.e. durability and
data recovery mechanisms are still in place and the query language remains
the same. It is also important to note that the user cannot change the storage
location of a property while *Memgraph* is running. However, the user can
reload the database from a snapshot with a different list of on-disk
properties, and only the properties in that list will then be stored on disk.
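
When the list of cold properties is kept in a script or deployment tool, the
command line shown above can be assembled programmatically. This is a small
sketch: the `--properties-on-disk` flag and the binary path are taken from the
example above, while the `memgraph_command` helper is hypothetical.

```python
def memgraph_command(cold_properties, binary="/usr/lib/memgraph/memgraph"):
    """Build the argument list for starting Memgraph with on-disk properties."""
    for name in cold_properties:
        # The flag takes a comma-separated list, so a comma in a name
        # would be read as a separator.
        if "," in name:
            raise ValueError("property name contains a comma: " + repr(name))
    if not cold_properties:
        return [binary]
    return [binary, "--properties-on-disk", ",".join(cold_properties)]


command = memgraph_command(["biography", "summary"])
print(" ".join(command))
# /usr/lib/memgraph/memgraph --properties-on-disk biography,summary
```

The resulting list can be passed to a process launcher such as Python's
`subprocess.run`. Changing the list only takes effect the next time *Memgraph*
is started.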