Remove docs/user_technical
Summary:
The new source of truth is https://github.com/memgraph/docs because content
writers and community members will write most of the content.

Reviewers: ipaljak, teon.banek

Reviewed By: ipaljak

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1634
This commit is contained in:
parent
5097c10ba8
commit
7b70d126f6
@ -1,41 +0,0 @@
|
||||
# Technical Documentation
|
||||
|
||||
## About Memgraph
|
||||
|
||||
Memgraph is an ACID compliant high performance transactional in-memory graph
|
||||
database management system featuring highly concurrent
|
||||
data structures, multi-version concurrency control and asynchronous IO.
|
||||
|
||||
[//]: # (When adding a new documentation file, please add it to the list)
|
||||
|
||||
## Contents
|
||||
|
||||
* [About Memgraph](#about-memgraph)
|
||||
* [Quick Start](01_quick-start.md)
|
||||
* [Tutorials](tutorials/02_tutorials-overview.md)
|
||||
* [Analysing TED Talks](tutorials/03_analyzing-TED-talks.md)
|
||||
* [Graphing the Premier League](tutorials/04_graphing-the-premier-league.md)
|
||||
* [Exploring the European Road Network](tutorials/05_exploring-the-european-road-network.md)
|
||||
* [How to](how_to_guides/01_how-to-guides-overview.md)
|
||||
* [Import Data?](how_to_guides/02_import-data.md)
|
||||
* [Query Memgraph Programmatically?](how_to_guides/03_query-memgraph-programmatically.md)
|
||||
* [Ingest Data Using Kafka?](how_to_guides/04_ingest-data-using-kafka.md)
|
||||
* [Manage User Privileges?](how_to_guides/05_manage-user-privileges.md)
|
||||
* [Concepts](concepts/01_concepts_overview.md)
|
||||
* [Storage](concepts/02_storage.md)
|
||||
* [Graph Algorithms](concepts/03_graph-algorithms.md)
|
||||
* [Indexing](concepts/04_indexing.md)
|
||||
* [Reference Guide](reference_guide/01_reference-overview.md)
|
||||
* [Reading Existing Data](reference_guide/02_reading-existing-data.md)
|
||||
* [Writing New Data](reference_guide/03_writing-new-data.md)
|
||||
* [Reading and Writing](reference_guide/04_reading-and-writing.md)
|
||||
* [Indexing](reference_guide/05_indexing.md)
|
||||
* [Graph Algorithms](reference_guide/06_graph-algorithms.md)
|
||||
* [Graph Streams](reference_guide/07_graph-streams.md)
|
||||
* [Security](reference_guide/08_security.md)
|
||||
* [Dynamic Graph Partitioner](reference_guide/09_dynamic-graph-partitioner.md)
|
||||
* [Other Features](reference_guide/10_other-features.md)
|
||||
* [Differences](reference_guide/11_differences.md)
|
||||
* [Upcoming Features](upcoming-features.md)
|
||||
|
||||
[//]: # (Nothing should go below the contents section)
|
@ -1,11 +0,0 @@
|
||||
## Concepts Overview
|
||||
|
||||
Articles within the concepts section serve as an in-depth introduction to the
|
||||
inner workings of Memgraph. These tend to be quite technical in nature and
|
||||
are recommended for advanced users and other graph database enthusiasts.
|
||||
|
||||
So far we have covered the following topics:
|
||||
|
||||
* [Data Storage](02_storage.md)
|
||||
* [Graph Algorithms](03_graph-algorithms.md)
|
||||
* [Indexing](04_indexing.md)
|
@ -1,106 +0,0 @@
|
||||
## Durability and Data Recovery
|
||||
|
||||
*Memgraph* uses two mechanisms to ensure the durability of the stored data:
|
||||
|
||||
* write-ahead logging (WAL) and
|
||||
* taking periodic snapshots.
|
||||
|
||||
Write-ahead logging works by logging all database modifications to a file.
|
||||
This ensures that all operations are done atomically and provides a trace of
|
||||
steps needed to reconstruct the database state.
|
||||
|
||||
Snapshots are taken periodically during the entire runtime of *Memgraph*. When
|
||||
a snapshot is triggered, the whole data storage is written to disk. The
|
||||
snapshot file provides a quicker way to restore the database state.
|
||||
|
||||
Database recovery is done on startup from the most recently found snapshot
|
||||
file. Since the snapshot may be older than the most recent update logged in
|
||||
the WAL file, the recovery process will apply the remaining state changes
|
||||
found in the said WAL file.
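The recovery order described above (load the latest snapshot, then replay the newer WAL entries) can be sketched with a toy key/value model. This is only an illustration: the record layout, the `last_tx` field and the function names are assumptions made for the example and do not reflect Memgraph's actual file formats.

```python
def apply_delta(state, delta):
    """Apply one logged modification to an in-memory key/value state."""
    if delta["op"] == "set":
        state[delta["key"]] = delta["value"]
    elif delta["op"] == "delete":
        state.pop(delta["key"], None)
    else:
        raise ValueError("unknown operation: {}".format(delta["op"]))


def recover(snapshot, wal):
    """Rebuild the state from a snapshot plus the WAL entries logged after it.

    `snapshot` is {"last_tx": int, "state": dict} and `wal` is a list of
    deltas in commit order (both layouts are hypothetical).
    """
    state = dict(snapshot["state"])
    for delta in wal:
        # Entries already reflected in the snapshot are skipped.
        if delta["tx"] > snapshot["last_tx"]:
            apply_delta(state, delta)
    return state


snapshot = {"last_tx": 2, "state": {"a": 1, "b": 2}}
wal = [
    {"tx": 1, "op": "set", "key": "a", "value": 1},
    {"tx": 2, "op": "set", "key": "b", "value": 2},
    {"tx": 3, "op": "set", "key": "a", "value": 10},
    {"tx": 4, "op": "delete", "key": "b"},
]
recovered = recover(snapshot, wal)
print(recovered)  # {'a': 10}
```

Replaying the whole WAL from an empty state gives the same result; the snapshot only shortens the replay.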
|
||||
|
||||
NOTE: Snapshot and WAL files are not (currently) compatible between *Memgraph*
|
||||
versions.
|
||||
|
||||
Behaviour of the above mechanisms can be tweaked in the configuration file,
|
||||
usually found in `/etc/memgraph/memgraph.conf`.
|
||||
|
||||
In addition to the above-mentioned durability and recovery mechanisms, a
|
||||
snapshot file may be generated using *Memgraph's* import tools. For more
|
||||
information, take a look at the [Import Tools](../how_to_guides/02_import-tools.md)
|
||||
article.
|
||||
|
||||
## Storable Data Types
|
||||
|
||||
Since *Memgraph* is a *graph* database management system, data is stored in
|
||||
the form of graph elements: nodes and edges. Each graph element can also
|
||||
contain various types of data. This chapter describes which data types are
|
||||
supported in *Memgraph*.
|
||||
|
||||
### Node Labels & Edge Types
|
||||
|
||||
Each node can have any number of labels. A label is a text value, which can be
|
||||
used to *label* or group nodes according to users' desires. A user can change
|
||||
labels at any time. Similarly to labels, each edge can have a type,
|
||||
represented as text. Unlike nodes, which can have multiple labels or none at
|
||||
all, edges *must* have exactly one edge type. Another difference from labels is
|
||||
that the edge types are set upon creation and never modified again.
|
||||
|
||||
### Properties
|
||||
|
||||
Nodes and edges can store various properties. These are like mappings or
|
||||
tables containing property names and their accompanying values. Property names
|
||||
are represented as text, while values can be of different types. Each property
|
||||
name can store a single value; it is not possible to have multiple properties
|
||||
with the same name on a single graph element. Naturally, the same property
|
||||
names can be found across multiple graph elements. Also, there are no
|
||||
restrictions on the number of properties that can be stored in a single graph
|
||||
element. The only restriction is that the values must be of the supported
|
||||
types. Following is a table of supported data types.
|
||||
|
||||
Type | Description
|
||||
-----------|------------
|
||||
`Null` | Denotes that the property has no value. This is the same as if the property does not exist.
|
||||
`String` | A character string, i.e. text.
|
||||
`Boolean` | A boolean value, either `true` or `false`.
|
||||
`Integer` | An integer number.
|
||||
`Float` | A floating-point number, i.e. a real number.
|
||||
`List` | A list containing any number of property values of any supported type. It can be used to store multiple values under a single property name.
|
||||
`Map` | A mapping of string keys to values of any supported type.
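To make the table concrete, here is a small Python sketch that classifies a client-side value into one of the types above and rejects anything else. The function name and the mapping from Python types to the table's types are assumptions made for illustration; they are not part of Memgraph or of any driver.

```python
def property_type(value):
    """Return the table's type name for a Python value, or raise TypeError."""
    if value is None:
        return "Null"
    # bool is checked before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "Boolean"
    if isinstance(value, int):
        return "Integer"
    if isinstance(value, float):
        return "Float"
    if isinstance(value, str):
        return "String"
    if isinstance(value, list):
        for item in value:
            property_type(item)  # every element must itself be storable
        return "List"
    if isinstance(value, dict):
        for key, item in value.items():
            if not isinstance(key, str):
                raise TypeError("Map keys must be strings")
            property_type(item)
        return "Map"
    raise TypeError("unsupported property value: {!r}".format(value))


print(property_type([1, "two", {"three": 3.0}]))  # List
```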
|
||||
|
||||
Note that even though it's possible to store `List` and `Map` property values, it is not possible to modify them in place. It is, however, possible to replace them completely. So, the following queries are legal:
|
||||
|
||||
```opencypher
|
||||
CREATE (:Node {property: [1, 2, 3]})
|
||||
CREATE (:Node {property: {key: "value"}})
|
||||
```
|
||||
|
||||
However, these queries are not:
|
||||
|
||||
```opencypher
|
||||
MATCH (n:Node) SET n.property[0] = 0
|
||||
MATCH (n:Node) SET n.property.key = "other value"
|
||||
```
|
||||
|
||||
### Cold data on disk
|
||||
|
||||
Although *Memgraph* is an in-memory database by default, it offers an option
|
||||
to store a certain amount of data on disk. More precisely, the user can pass
|
||||
a list of properties they wish to keep stored on disk via the command line.
|
||||
In certain cases, this might result in a significant performance boost due to
|
||||
reduced memory usage. It is recommended to use this feature on large,
|
||||
cold properties, i.e. properties that are rarely accessed.
|
||||
|
||||
For example, a user of a library database might identify author biographies
|
||||
and book summaries as cold properties. In that case, the user should run
|
||||
*Memgraph* as follows:
|
||||
|
||||
```bash
|
||||
/usr/lib/memgraph/memgraph --properties-on-disk biography,summary
|
||||
```
|
||||
|
||||
Note that the usage of *Memgraph* has not changed, i.e. durability and
|
||||
data recovery mechanisms are still in place and the query language remains
|
||||
the same. It is also important to note that the user cannot change the storage
|
||||
location of a property while *Memgraph* is running. Naturally, the user can
|
||||
reload their database from snapshot, provide a different list of properties on
|
||||
disk and rest assured that only those properties will be stored on disk.
|
@ -1,156 +0,0 @@
|
||||
## Graph Algorithms
|
||||
|
||||
### Introduction
|
||||
|
||||
A graph is a mathematical structure used to describe a set of objects in which
|
||||
some pairs of objects are "related" in some sense. Generally, we consider
|
||||
those objects as abstractions named `nodes` (also called `vertices`).
|
||||
Aforementioned relations between nodes are modelled by an abstraction named
|
||||
`edge` (also called `relationship`).
|
||||
|
||||
It turns out that a lot of real-world problems can be successfully modeled
|
||||
using graphs. Natural examples include railway networks between
|
||||
cities, computer networks, piping systems and Memgraph itself.
|
||||
|
||||
This article outlines some of the most important graph algorithms
|
||||
that are internally used by Memgraph. We believe that advanced users could
|
||||
significantly benefit from obtaining basic knowledge about those algorithms.
|
||||
The users should also note that this article does not contain an in-depth
|
||||
analysis of algorithms and their implementation details since those are
|
||||
well documented in the appropriate literature and, in our opinion, go well out
|
||||
of scope for user documentation. That being said, we will include the relevant
|
||||
information for using Memgraph effectively and efficiently.
|
||||
|
||||
Contents of this article include:
|
||||
|
||||
* [Breadth First Search (BFS)](#breadth-first-search)
|
||||
* [Weighted Shortest Path (WSP)](#weighted-shortest-path)
|
||||
|
||||
|
||||
### Breadth First Search
|
||||
|
||||
[Breadth First Search](https://en.wikipedia.org/wiki/Breadth-first_search)
|
||||
is a way of traversing a graph data structure. The
|
||||
traversal starts from a single node (usually referred to as source node) and,
|
||||
during the traversal, breadth is prioritized over depth, hence the name of the
|
||||
algorithm. More precisely, when we visit some node, we can safely assume that
|
||||
we have already visited all nodes that are fewer edges away from the source node.
|
||||
An interesting side-effect of traversing a graph in BFS order is the fact
|
||||
that, when we visit a particular node, we can easily find a path from
|
||||
the source node to the newly visited node with the least number of edges.
|
||||
Since in this context we disregard the edge weights, we can say that BFS is
|
||||
a solution to an unweighted shortest path problem.
|
||||
|
||||
The algorithm itself proceeds as follows:
|
||||
|
||||
* Keep around a set of nodes that are equidistant from the source node.
|
||||
Initially, this set contains only the source node.
|
||||
* Expand to all not yet visited nodes that are a single edge away from that
|
||||
set. Note that the set of those nodes is also equidistant from the source
|
||||
node.
|
||||
* Replace the set with a set of nodes obtained in the previous step.
|
||||
* Terminate the algorithm when the set is empty.
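The steps above can be sketched in a few lines of Python. This is a minimal illustration over an adjacency-list dictionary, not Memgraph's implementation; the function name and the sample graph are made up for the example.

```python
def bfs_levels(graph, source):
    """Return a list of sets; levels[d] holds the nodes exactly d edges from source."""
    visited = {source}
    frontier = {source}  # the current set of equidistant nodes
    levels = []
    while frontier:  # terminate when the set is empty
        levels.append(frontier)
        next_frontier = set()
        for node in frontier:
            for neighbour in graph.get(node, ()):
                if neighbour not in visited:
                    visited.add(neighbour)
                    next_frontier.add(neighbour)
        frontier = next_frontier  # replace the set with the newly reached nodes
    return levels


graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
levels = bfs_levels(graph, "a")
print(levels)
```

For this sample graph the levels are `{a}`, `{b, c}`, `{d}` and `{e}`, in that order.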
|
||||
|
||||
The order of visited nodes is nicely visualized in the following animation from
|
||||
Wikipedia. Note that each row contains nodes that are equidistant from the
|
||||
source and thus represents one of the sets mentioned above.
|
||||
|
||||
![visualization](https://upload.wikimedia.org/wikipedia/commons/5/5d/Breadth-First-Search-Algorithm.gif)
|
||||
|
||||
The standard BFS implementation deviates from the above description by relying on
|
||||
a FIFO (first in, first out) queue data structure. Nevertheless, the
|
||||
functionality is equivalent and its runtime is bounded by `O(|V| + |E|)` where
|
||||
`V` denotes the set of nodes and `E` denotes the set of edges. Therefore,
|
||||
it provides a more efficient way of finding unweighted shortest paths than
|
||||
running [Dijkstra's algorithm](#weighted-shortest-path) on a graph
|
||||
with edge weights equal to `1`.
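For comparison, a queue-based variant might look like the sketch below. It also records each node's parent so that a path with the fewest edges can be reconstructed. The function name and the sample graph are illustrative assumptions.

```python
from collections import deque


def bfs_shortest_path(graph, source, target):
    """Return a path with the fewest edges from source to target, or None."""
    parent = {source: None}  # doubles as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()  # FIFO order visits nearer nodes first
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbour in graph.get(node, ()):
            if neighbour not in parent:
                parent[neighbour] = node
                queue.append(neighbour)
    return None


graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
path = bfs_shortest_path(graph, "a", "e")
print(path)  # ['a', 'b', 'd', 'e']
```

Every node is enqueued at most once and every adjacency list is scanned at most once, which is where the `O(|V| + |E|)` bound comes from.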
|
||||
|
||||
### Weighted Shortest Path
|
||||
|
||||
In [graph theory](https://en.wikipedia.org/wiki/Graph_theory), the weighted shortest
|
||||
path problem is the problem of finding a path between two nodes in a graph such
|
||||
that the sum of the weights of edges connecting nodes on the path is minimized.
|
||||
|
||||
#### Dijkstra's algorithm
|
||||
|
||||
One of the most important algorithms for finding weighted shortest paths is
|
||||
[Dijkstra's algorithm](https://en.wikipedia.org/wiki/Dijkstra%27s_algorithm).
|
||||
Our implementation uses a modified version of this algorithm that can handle
|
||||
a length restriction. The length restriction parameter is optional; when it is
not set, the complexity of the algorithm may increase. It is important to
|
||||
note that the term "length" in this context denotes the number of traversed
|
||||
edges and not the sum of their weights.
|
||||
|
||||
The algorithm itself is based on a couple of greedy observations and could
|
||||
be expressed in natural language as follows:
|
||||
|
||||
* Keep around a set of already visited nodes along with their corresponding
|
||||
shortest paths from source node. Initially, this set contains only the
|
||||
source node with the shortest distance of `0`.
|
||||
* Find an edge that goes from a visited node to an unvisited one such that the
|
||||
shortest path from source to the visited node increased by the weight of
|
||||
that edge is minimized. Traverse that edge and add a newly visited node with
|
||||
appropriate distance to the set of already visited nodes.
|
||||
* Repeat the process until the destination node is visited.
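The greedy procedure above can be sketched in Python with a binary heap from the standard library. This is a textbook-style illustration, not Memgraph's implementation: the function names and the sample weights are hypothetical, and a binary heap gives `O((|V| + |E|)log|V|)` rather than the tighter bound that more elaborate heap structures achieve.

```python
import heapq


def undirected(edges):
    """Build an adjacency list from (u, v, weight) triples."""
    graph = {}
    for u, v, w in edges:
        graph.setdefault(u, []).append((v, w))
        graph.setdefault(v, []).append((u, w))
    return graph


def dijkstra(graph, source, target):
    """Return (total_weight, path) for a minimum-weight path, or None.

    Edge weights are assumed to be non-negative.
    """
    heap = [(0, source, [source])]
    visited = set()
    while heap:
        dist, node, path = heapq.heappop(heap)
        if node in visited:
            continue  # a shorter path to this node was already settled
        visited.add(node)
        if node == target:
            return dist, path
        for neighbour, weight in graph.get(node, ()):
            if neighbour not in visited:
                heapq.heappush(heap, (dist + weight, neighbour, path + [neighbour]))
    return None


# Hypothetical sample weights.
graph = undirected([(0, 1, 5), (1, 4, 4), (4, 5, 3), (0, 2, 2), (2, 3, 2), (3, 4, 4)])
result = dijkstra(graph, 0, 5)
print(result)  # (11, [0, 2, 3, 4, 5])
```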
|
||||
|
||||
The described algorithm is nicely visualized in the following animation from
|
||||
Wikipedia. Note that edge weights correspond to the Euclidean distance between
|
||||
nodes which represent points on a plane.
|
||||
|
||||
![visualization](https://upload.wikimedia.org/wikipedia/commons/e/e4/DijkstraDemo.gif)
|
||||
|
||||
Using appropriate data structures, the worst-case performance of our
implementation can be expressed as `O(|E| + |V|log|V|)` where `E` denotes
the set of edges and `V` denotes the set of nodes.
|
||||
|
||||
A sample query that finds a shortest path between two nodes looks as follows:
|
||||
|
||||
```opencypher
|
||||
MATCH (a {id: 723})-[edge_list *wShortest 10 (e, n | e.weight) total_weight]-(b {id: 882}) RETURN *
|
||||
```
|
||||
|
||||
This query has an upper bound length restriction set to `10`. This means that no
|
||||
path that traverses more than `10` edges will be considered as a valid result.
|
||||
|
||||
##### Upper Bound Implications
|
||||
|
||||
Since the upper bound parameter is optional, we can have different results based
|
||||
on this parameter.
|
||||
|
||||
Consider the following graph and sample queries.
|
||||
|
||||
![sample-graph](../data/graph.png)
|
||||
|
||||
```opencypher
|
||||
MATCH (a {id: 0})-[edge_list *wShortest 3 (e, n | e.weight) total_weight]-(b {id: 5}) RETURN *
|
||||
```
|
||||
|
||||
```opencypher
|
||||
MATCH (a {id: 0})-[edge_list *wShortest (e, n | e.weight) total_weight]-(b {id: 5}) RETURN *
|
||||
```
|
||||
|
||||
The first query will try to find the weighted shortest path between nodes `0`
|
||||
and `5` with the restriction on the path length set to `3`, and the second query
|
||||
will try to find the weighted shortest path with no restriction on the path
|
||||
length.
|
||||
|
||||
The expected result for the first query is `0 -> 1 -> 4 -> 5` with the total
|
||||
cost of `12`, while the expected result for the second query is
|
||||
`0 -> 2 -> 3 -> 4 -> 5` with the total cost of `11`. Obviously, the second
|
||||
query can find the true shortest path because it has no restrictions on the
|
||||
length.
|
||||
|
||||
To handle cases when the length restriction is set, the *weighted shortest path*
algorithm uses both the node and the distance (the number of edges traversed so
far) as the state. This causes the search space to increase by a factor of the
given upper bound. On the other hand, when the upper bound parameter is not
set, the search space might contain the whole graph.
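The effect of the upper bound on the search state can be sketched as follows. The function name is hypothetical, and since the sample graph is given only as an image, the edge weights below are invented so that the totals match the ones quoted above (`12` with a bound of `3`, `11` without a bound). This is an illustration of the idea, not Memgraph's code.

```python
import heapq


def bounded_shortest_path(graph, source, target, max_hops=None):
    """Minimum-weight path using at most max_hops edges (unbounded if None)."""
    heap = [(0, 0, source, [source])]
    settled = set()
    while heap:
        dist, hops, node, path = heapq.heappop(heap)
        # With a bound, the search state is (node, hops); without it, just the node.
        state = node if max_hops is None else (node, hops)
        if state in settled:
            continue
        settled.add(state)
        if node == target:
            return dist, path
        if max_hops is not None and hops == max_hops:
            continue  # no further expansion is allowed from this state
        for neighbour, weight in graph.get(node, ()):
            heapq.heappush(
                heap, (dist + weight, hops + 1, neighbour, path + [neighbour]))
    return None


# Hypothetical undirected weights chosen to reproduce the totals in the text.
edges = [(0, 1, 5), (1, 4, 4), (4, 5, 3), (0, 2, 2), (2, 3, 2), (3, 4, 4)]
graph = {}
for u, v, w in edges:
    graph.setdefault(u, []).append((v, w))
    graph.setdefault(v, []).append((u, w))

bounded = bounded_shortest_path(graph, 0, 5, max_hops=3)
unbounded = bounded_shortest_path(graph, 0, 5)
print(bounded)    # (12, [0, 1, 4, 5])
print(unbounded)  # (11, [0, 2, 3, 4, 5])
```

With a bound, the same node can be settled once per hop count, which is the factor-of-the-bound growth in the search space described above.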
|
||||
|
||||
Because of this, one should always try to narrow down the upper bound limit to
|
||||
be as precise as possible in order to have a more performant query.
|
||||
|
||||
### Where to next?
|
||||
|
||||
For a real-world application of WSP, we encourage you to visit our article
|
||||
on [exploring the European road network](../tutorials/04_exploring-the-european-road-network.md)
|
||||
which was specially crafted to showcase our graph algorithms.
|
@ -1,92 +0,0 @@
|
||||
## Indexing
|
||||
|
||||
### Introduction
|
||||
|
||||
A database index is a data structure used to improve the speed of data retrieval
|
||||
within a database at the cost of additional writes and storage space for
|
||||
maintaining the index data structure.
|
||||
|
||||
Armed with a deep understanding of their data model and use-case, users can decide
|
||||
which data to index and, by doing so, significantly improve their data retrieval
|
||||
efficiency.
|
||||
|
||||
### Index Types
|
||||
|
||||
At Memgraph, we support two types of indexes:
|
||||
|
||||
* label index
|
||||
* label-property index
|
||||
|
||||
Label indexing is enabled by default in Memgraph, i.e., Memgraph automatically
|
||||
indexes labeled data. By doing so we optimize queries which fetch nodes by
|
||||
label:
|
||||
|
||||
```opencypher
|
||||
MATCH (n: Label) ... RETURN n
|
||||
```
|
||||
|
||||
Indexes can also be created on data with a specific combination of label and
|
||||
property, hence the name label-property index. This operation needs to be
|
||||
specified by the user and should be used with a specific data model and
|
||||
use-case in mind.
|
||||
|
||||
For example, suppose we are storing information about certain people in our
|
||||
database and we are often interested in retrieving their age. In that case,
|
||||
it might be beneficial to create an index on nodes labeled as `:Person` which
|
||||
have a property named `age`. We can do so by using the following language
|
||||
construct:
|
||||
|
||||
```opencypher
|
||||
CREATE INDEX ON :Person(age)
|
||||
```
|
||||
|
||||
After the creation of that index, those queries will be more efficient due to
|
||||
the fact that Memgraph's query engine will not have to fetch each `:Person` node
|
||||
and check whether the property exists. Moreover, even if all nodes labeled as
|
||||
`:Person` had an `age` property, creating such an index might still prove to be
|
||||
beneficial. The main reason is that entries within that index are kept sorted
|
||||
by property value. Queries such as the following are therefore more efficient:
|
||||
|
||||
```opencypher
|
||||
MATCH (n :Person {age: 42}) RETURN n
|
||||
```
|
||||
|
||||
Index-based retrieval can also be invoked on queries with `WHERE` clauses.
|
||||
For instance, the following query will have the same effect as the previous
|
||||
one:
|
||||
|
||||
```opencypher
|
||||
MATCH (n) WHERE n:Person AND n.age = 42 RETURN n
|
||||
```
|
||||
|
||||
Naturally, indexes will also be used when filtering based on less than or
|
||||
greater than comparisons. For example, filtering all minors (persons
|
||||
under 18 years of age under Croatian law) using the following query will use
|
||||
index based retrieval:
|
||||
|
||||
```opencypher
|
||||
MATCH (n) WHERE n:Person AND n.age < 18 RETURN n
|
||||
```
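To see why keeping index entries sorted by property value helps with equality and range filters, consider the following sketch. It uses a sorted Python list and binary search as a stand-in for the real index structure; the class and method names are hypothetical, and insertion into a Python list is `O(n)`, which is a simplification of this sketch.

```python
from bisect import bisect_left, bisect_right


class LabelPropertyIndex:
    """Toy index over one (label, property) pair, sorted by property value."""

    def __init__(self):
        self._entries = []  # sorted list of (property_value, node_id)

    def insert(self, value, node_id):
        entry = (value, node_id)
        self._entries.insert(bisect_left(self._entries, entry), entry)

    def equal(self, value):
        lo = bisect_left(self._entries, (value, float("-inf")))
        hi = bisect_right(self._entries, (value, float("inf")))
        return [node_id for _, node_id in self._entries[lo:hi]]

    def less_than(self, value):
        # Everything before the first entry with this value is strictly smaller.
        hi = bisect_left(self._entries, (value, float("-inf")))
        return [node_id for _, node_id in self._entries[:hi]]


index = LabelPropertyIndex()
for node_id, age in [(1, 42), (2, 17), (3, 30), (4, 12), (5, 42)]:
    index.insert(age, node_id)

print(index.equal(42))      # [1, 5]
print(index.less_than(18))  # [4, 2]
```

Both lookups locate their boundaries with binary search instead of checking the `age` of every `:Person` node.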
|
||||
|
||||
Bear in mind that `WHERE` filters can contain arbitrarily complex expressions,
in which case index-based retrieval might not be used. Nevertheless, we are continually
|
||||
improving our index usage recognition algorithms.
|
||||
|
||||
### Underlying Implementation
|
||||
|
||||
The central part of our index data structure is a highly-concurrent skip list.
|
||||
Skip lists are probabilistic data structures that allow fast search within an
|
||||
ordered sequence of elements. The structure itself is built in layers where the
|
||||
bottom layer is an ordinary linked list that preserves the order. Each higher
|
||||
level can be imagined as a highway for layers below.
|
||||
|
||||
The implementation details behind skip list operations are well documented
|
||||
in the literature and are out of scope for this article. Nevertheless, we
|
||||
believe that it is important for more advanced users to understand the following
|
||||
implications of this data structure (`n` denotes the current number of elements
|
||||
in a skip list):
|
||||
|
||||
* Average insertion time is `O(log(n))`
|
||||
* Average deletion time is `O(log(n))`
|
||||
* Average search time is `O(log(n))`
|
||||
* Average memory consumption is `O(n)`
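For readers who have not met skip lists before, a minimal single-threaded sketch follows. It only illustrates the layered structure behind the averages listed above. It is not concurrent and is not Memgraph's implementation, and all names in it are made up for the example.

```python
import random


class _Node:
    def __init__(self, key, height):
        self.key = key
        self.forward = [None] * height  # forward[i] is the next node on layer i


class SkipList:
    MAX_HEIGHT = 16

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._head = _Node(None, self.MAX_HEIGHT)  # sentinel, holds no key
        self._height = 1

    def _random_height(self):
        # Each extra layer is added with probability 1/2.
        height = 1
        while height < self.MAX_HEIGHT and self._rng.random() < 0.5:
            height += 1
        return height

    def _predecessors(self, key):
        """For each layer, find the last node whose key is smaller than `key`."""
        update = [self._head] * self.MAX_HEIGHT
        node = self._head
        for level in range(self._height - 1, -1, -1):
            while node.forward[level] is not None and node.forward[level].key < key:
                node = node.forward[level]  # travel along the "highway"
            update[level] = node
        return update

    def insert(self, key):
        update = self._predecessors(key)
        candidate = update[0].forward[0]
        if candidate is not None and candidate.key == key:
            return False  # keys are unique in this sketch
        height = self._random_height()
        self._height = max(self._height, height)
        node = _Node(key, height)
        for level in range(height):
            node.forward[level] = update[level].forward[level]
            update[level].forward[level] = node
        return True

    def remove(self, key):
        update = self._predecessors(key)
        node = update[0].forward[0]
        if node is None or node.key != key:
            return False
        for level in range(len(node.forward)):
            update[level].forward[level] = node.forward[level]
        return True

    def __contains__(self, key):
        candidate = self._predecessors(key)[0].forward[0]
        return candidate is not None and candidate.key == key

    def __iter__(self):
        node = self._head.forward[0]  # the bottom layer is an ordinary linked list
        while node is not None:
            yield node.key
            node = node.forward[0]


skip_list = SkipList(seed=1)
for key in [5, 1, 9, 3, 7]:
    skip_list.insert(key)
print(list(skip_list))  # [1, 3, 5, 7, 9]
```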
|
Binary file not shown.
Before Width: | Height: | Size: 12 KiB |
@ -1,12 +0,0 @@
|
||||
## How-to Guides Overview
|
||||
|
||||
Articles within the how-to guides section serve as a cookbook for getting
|
||||
things done as fast as possible. These articles tend to provide a step-by-step
|
||||
guide on how to use a certain Memgraph feature or solve a particular problem.
|
||||
|
||||
So far we have covered the following topics:
|
||||
|
||||
* [How to Import Data?](02_import-tools.md)
|
||||
* [How to Query Memgraph Programmatically?](03_query-memgraph-programmatically.md)
|
||||
* [How to Ingest Data Using Kafka?](04_ingest-data-using-kafka.md)
* [How to Manage User Privileges?](05_manage-user-privileges.md)
|
@ -1,118 +0,0 @@
|
||||
## How to Import Data?
|
||||
|
||||
Memgraph comes with tools for importing data into the database. Currently,
|
||||
only import of CSV-formatted data is supported. We plan to support more formats in
|
||||
the future.
|
||||
|
||||
### CSV Import Tool
|
||||
|
||||
CSV data should be in a Neo4j-compatible CSV format. A detailed format
|
||||
specification can be found
|
||||
[here](https://neo4j.com/docs/operations-manual/current/tools/import/file-header-format/).
|
||||
|
||||
The import tool is run from the console, using the `mg_import_csv` command.
|
||||
|
||||
If you installed Memgraph using Docker, you will need to run the importer
|
||||
using the following command:
|
||||
|
||||
```bash
docker run -v mg_lib:/var/lib/memgraph -v mg_etc:/etc/memgraph -v mg_import:/import-data \
  --entrypoint=mg_import_csv memgraph
```
|
||||
|
||||
You can pass CSV files containing node data using the `--nodes` option.
|
||||
Multiple files can be specified by repeating the `--nodes` option. At least
|
||||
one node file should be specified. Similarly, graph edges (also known as
|
||||
relationships) are passed via the `--relationships` option. Multiple
|
||||
relationship files are imported by repeating the option. Unlike nodes,
|
||||
relationships are not required.
|
||||
|
||||
After reading the CSV files, the tool will by default search for the installed
|
||||
Memgraph configuration. If the configuration is found, the data will be
|
||||
written in the configured durability directory. If the configuration isn't
|
||||
found, you will need to use the `--out` option to specify the output file. You
|
||||
can use the same option to override the default behaviour.
|
||||
|
||||
Memgraph will recover the imported data on the next startup by looking in the
|
||||
durability directory.
|
||||
|
||||
For information on other options, run:
|
||||
|
||||
```bash
|
||||
mg_import_csv --help
|
||||
```
|
||||
|
||||
When using Docker, this translates to:
|
||||
|
||||
```bash
|
||||
docker run --entrypoint=mg_import_csv memgraph --help
|
||||
```
|
||||
|
||||
#### Example
|
||||
|
||||
Let's import a simple dataset.
|
||||
|
||||
Store the following in `comment_nodes.csv`.
|
||||
|
||||
```csv
|
||||
id:ID(COMMENT_ID),country:string,browser:string,content:string,:LABEL
|
||||
0,Croatia,Chrome,yes,Message;Comment
|
||||
1,United Kingdom,Chrome,thanks,Message;Comment
|
||||
2,Germany,,LOL,Message;Comment
|
||||
3,France,Firefox,I see,Message;Comment
|
||||
4,Italy,Internet Explorer,fine,Message;Comment
|
||||
```
|
||||
|
||||
Now, let's add `forum_nodes.csv`.
|
||||
|
||||
```csv
|
||||
id:ID(FORUM_ID),title:string,:LABEL
|
||||
0,General,Forum
|
||||
1,Support,Forum
|
||||
2,Music,Forum
|
||||
3,Film,Forum
|
||||
4,Programming,Forum
|
||||
```
|
||||
|
||||
And finally, set relationships between comments and forums in
|
||||
`relationships.csv`.
|
||||
|
||||
```csv
|
||||
:START_ID(COMMENT_ID),:END_ID(FORUM_ID),:TYPE
|
||||
0,0,POSTED_ON
|
||||
1,1,POSTED_ON
|
||||
2,2,POSTED_ON
|
||||
3,3,POSTED_ON
|
||||
4,4,POSTED_ON
|
||||
```
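To clarify how the header fields tie the files together, here is a rough Python sketch that reads such headers and resolves ID spaces like `COMMENT_ID` and `FORUM_ID`. It is not how `mg_import_csv` works internally. The function names are hypothetical, and dropping empty cells (such as the missing browser) is an assumption of this sketch.

```python
import csv
import io


def parse_nodes(text):
    """Parse a node CSV; return (id_space, {id: (labels, properties)})."""
    rows = csv.reader(io.StringIO(text))
    header = next(rows)
    id_space, nodes = None, {}
    for row in rows:
        node_id, labels, props = None, [], {}
        for column, cell in zip(header, row):
            name, _, kind = column.partition(":")
            if kind.startswith("ID("):
                id_space = kind[len("ID("):-1]  # text inside ID(...)
                node_id = cell  # this sketch keeps the ID only as the lookup key
            elif kind == "LABEL":
                labels = cell.split(";")  # multiple labels are ';'-separated
            elif cell != "":
                props[name] = cell
        nodes[node_id] = (labels, props)
    return id_space, nodes


def parse_relationships(text):
    """Parse a relationship CSV into dicts with (id_space, id) endpoints."""
    rows = csv.reader(io.StringIO(text))
    header = next(rows)
    edges = []
    for row in rows:
        edge = {}
        for column, cell in zip(header, row):
            kind = column.partition(":")[2]
            if kind.startswith("START_ID("):
                edge["start"] = (kind[len("START_ID("):-1], cell)
            elif kind.startswith("END_ID("):
                edge["end"] = (kind[len("END_ID("):-1], cell)
            elif kind == "TYPE":
                edge["type"] = cell
        edges.append(edge)
    return edges


comments = (
    "id:ID(COMMENT_ID),country:string,browser:string,content:string,:LABEL\n"
    "0,Croatia,Chrome,yes,Message;Comment\n"
    "2,Germany,,LOL,Message;Comment\n"
)
relationships = ":START_ID(COMMENT_ID),:END_ID(FORUM_ID),:TYPE\n0,0,POSTED_ON\n"

id_space, nodes = parse_nodes(comments)
edges = parse_relationships(relationships)
print(id_space, nodes["2"])
print(edges)
```

Because each endpoint carries its ID space, comment `0` and forum `0` remain distinct nodes even though their IDs are equal.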
|
||||
|
||||
Now, you can import the dataset in Memgraph.
|
||||
|
||||
WARNING: Your existing recovery data will be considered obsolete, and Memgraph
|
||||
will load the new dataset.
|
||||
|
||||
Use the following command:
|
||||
|
||||
```bash
|
||||
mg_import_csv --overwrite --nodes=comment_nodes.csv --nodes=forum_nodes.csv --relationships=relationships.csv
|
||||
```
|
||||
|
||||
If using Docker, things are a bit more complicated. First, you need to copy the
CSV files to where the Docker container can see them:
|
||||
|
||||
```bash
|
||||
mkdir -p /var/lib/docker/volumes/mg_import/_data
|
||||
cp comment_nodes.csv forum_nodes.csv relationships.csv /var/lib/docker/volumes/mg_import/_data
|
||||
```
|
||||
|
||||
Then, run the importer with the following:
|
||||
|
||||
```bash
docker run -v mg_lib:/var/lib/memgraph -v mg_etc:/etc/memgraph -v mg_import:/import-data \
  --entrypoint=mg_import_csv memgraph \
  --overwrite \
  --nodes=/import-data/comment_nodes.csv --nodes=/import-data/forum_nodes.csv \
  --relationships=/import-data/relationships.csv
```
|
||||
|
||||
Next time you run Memgraph, the dataset will be loaded.
|
@ -1,213 +0,0 @@
|
||||
## How to Query Memgraph Programmatically?
|
||||
|
||||
### Supported Languages
|
||||
|
||||
If users wish to query Memgraph programmatically, they can do so using the
|
||||
[Bolt protocol](https://boltprotocol.org). Bolt was designed for efficient
|
||||
communication with graph databases and Memgraph supports
|
||||
[Version 1](https://boltprotocol.org/v1) of the protocol. Bolt protocol drivers
|
||||
for some popular programming languages are listed below:
|
||||
|
||||
* [Java](https://github.com/neo4j/neo4j-java-driver)
|
||||
* [Python](https://github.com/neo4j/neo4j-python-driver)
|
||||
* [JavaScript](https://github.com/neo4j/neo4j-javascript-driver)
|
||||
* [C#](https://github.com/neo4j/neo4j-dotnet-driver)
|
||||
* [Ruby](https://github.com/neo4jrb/neo4j)
|
||||
* [Haskell](https://github.com/zmactep/hasbolt)
|
||||
* [PHP](https://github.com/graphaware/neo4j-bolt-php)
|
||||
|
||||
### Secure Sockets Layer (SSL)
|
||||
|
||||
Secure connections are supported and enabled by default. The server initially
|
||||
ships with a self-signed testing certificate. The certificate can be replaced
|
||||
by editing the following parameters in `/etc/memgraph/memgraph.conf`:
|
||||
```
|
||||
--cert-file=/path/to/ssl/certificate.pem
|
||||
--key-file=/path/to/ssl/privatekey.pem
|
||||
```
|
||||
To disable SSL support and use insecure connections to the database you should
|
||||
set both parameters (`--cert-file` and `--key-file`) to empty values.
|
||||
|
||||
### Examples
|
||||
|
||||
In this article we have included some basic usage examples for the following
|
||||
supported languages:
|
||||
|
||||
* [Python](#python-example)
|
||||
* [Java](#java-example)
|
||||
* [JavaScript](#javascript-example)
|
||||
* [C#](#c-sharp-example)
|
||||
|
||||
Examples for the languages listed above are equivalent.
|
||||
|
||||
#### Python Example
|
||||
|
||||
Neo4j officially supports Python for interacting with an openCypher and Bolt
|
||||
compliant database. For details consult the
|
||||
[official documentation](http://neo4j.com/docs/api/python-driver) and the
|
||||
[GitHub project](https://github.com/neo4j/neo4j-python-driver).
|
||||
|
||||
The code snippet below outlines a basic usage example which connects to the
|
||||
database and executes a couple of elementary queries.
|
||||
|
||||
```python
from neo4j.v1 import GraphDatabase, basic_auth

# Initialize and configure the driver.
#   * provide the correct URL where Memgraph is reachable;
#   * use an empty user name and password.
driver = GraphDatabase.driver("bolt://localhost:7687",
                              auth=basic_auth("", ""))

# Start a session in which queries are executed.
session = driver.session()

# Execute openCypher queries.
# After each query, call either `consume()` or `data()`
session.run('CREATE (alice:Person {name: "Alice", age: 22})').consume()

# Get all the vertices from the database (potentially multiple rows).
vertices = session.run('MATCH (n) RETURN n').data()
# Assuming we started with an empty database, we should have Alice
# as the only row in the results.
only_row = vertices.pop()
alice = only_row["n"]

# Print out what we retrieved.
print("Found a vertex with labels '{}', name '{}' and age {}".format(
    alice.labels, alice['name'], alice['age']))

# Remove all the data from the database.
session.run('MATCH (n) DETACH DELETE n').consume()

# Close the session and the driver.
session.close()
driver.close()
```
|
||||
|
||||
#### Java Example
|
||||
|
||||
Details about the Java driver can be found on
|
||||
[GitHub](https://github.com/neo4j/neo4j-java-driver).
|
||||
|
||||
The code snippet below outlines a basic usage example which connects to the
|
||||
database and executes a couple of elementary queries.
|
||||
|
||||
```java
import org.neo4j.driver.v1.*;
import org.neo4j.driver.v1.types.*;
import static org.neo4j.driver.v1.Values.parameters;
import java.util.*;

public class JavaQuickStart {
    public static void main(String[] args) {
        // Initialize driver.
        Config config = Config.build().toConfig();
        Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                                             AuthTokens.basic("",""),
                                             config);
        // Execute basic queries.
        try (Session session = driver.session()) {
            StatementResult rs1 = session.run("MATCH (n) DETACH DELETE n");
            StatementResult rs2 = session.run(
                "CREATE (alice: Person {name: 'Alice', age: 22})");
            StatementResult rs3 = session.run("MATCH (n) RETURN n");
            List<Record> records = rs3.list();
            Record record = records.get(0);
            Node node = record.get("n").asNode();
            System.out.println(node.get("name").asString());
        } catch (Exception e) {
            System.out.println(e);
            System.exit(1);
        }
        // Cleanup.
        driver.close();
    }
}
```
|
||||
|
||||
#### JavaScript Example
|
||||
|
||||
Details about the JavaScript driver can be found on
|
||||
[GitHub](https://github.com/neo4j/neo4j-javascript-driver).
|
||||
|
||||
Here is an example related to `Node.js`. Memgraph doesn't have integrated
|
||||
support for `WebSocket` which is required during the execution in any web
|
||||
browser. If you want to run `openCypher` queries from a web browser,
|
||||
[websockify](https://github.com/novnc/websockify) has to be up and running.
|
||||
Requests from web browsers are wrapped into `WebSocket` messages, and a proxy
|
||||
is needed to handle the overhead. The proxy has to be configured to point
to Memgraph's Bolt port, and the web browser driver has to send requests to the
proxy port.
|
||||
|
||||
The code snippet below outlines a basic usage example which connects to the
|
||||
database and executes a couple of elementary queries.
|
||||
|
||||
```javascript
var neo4j = require('neo4j-driver').v1;
var driver = neo4j.driver("bolt://localhost:7687",
                          neo4j.auth.basic("neo4j", "1234"));
var session = driver.session();

function die() {
  session.close();
  driver.close();
}

function run_query(query, callback) {
  var run = session.run(query, {});
  run.then(callback).catch(function (error) {
    console.log(error);
    die();
  });
}

run_query("MATCH (n) DETACH DELETE n", function (result) {
  console.log("Database cleared.");
  run_query("CREATE (alice: Person {name: 'Alice', age: 22})", function (result) {
    console.log("Record created.");
    run_query("MATCH (n) RETURN n", function (result) {
      console.log("Record matched.");
      var alice = result.records[0].get("n");
      console.log(alice.labels[0]);
      console.log(alice.properties["name"]);
      session.close();
      driver.close();
    });
  });
});
```
|
||||
|
||||
#### C# Example {#c-sharp-example}
|
||||
|
||||
The details about C# driver can be found on
|
||||
[GitHub](https://github.com/neo4j/neo4j-dotnet-driver).
|
||||
|
||||
The code snippet below outlines a basic usage example which connects to the
|
||||
database and executes a couple of elementary queries.
|
||||
|
||||
```csharp
using System;
using System.Linq;
using Neo4j.Driver.V1;

public class Basic {
  public static void Main(string[] args) {
    // Initialize the driver.
    var config = Config.DefaultConfig;
    using(var driver = GraphDatabase.Driver("bolt://localhost:7687", AuthTokens.None, config))
    using(var session = driver.Session())
    {
      // Run basic queries.
      session.Run("MATCH (n) DETACH DELETE n").Consume();
      session.Run("CREATE (alice:Person {name: \"Alice\", age: 22})").Consume();
      var result = session.Run("MATCH (n) RETURN n").First();
      var alice = (INode) result["n"];
      Console.WriteLine(alice["name"]);
      Console.WriteLine(string.Join(", ", alice.Labels));
      Console.WriteLine(alice["age"]);
    }
    Console.WriteLine("All ok!");
  }
}
```
|
@ -1,116 +0,0 @@
|
||||
## How to Ingest Data Using Kafka
|
||||
|
||||
Apache Kafka is an open-source stream-processing software platform. The project
|
||||
aims to provide a unified, high-throughput, low-latency platform for handling
|
||||
real-time data feeds.
|
||||
|
||||
Memgraph offers easy data import at the source using Kafka as the
|
||||
high-throughput messaging system.
|
||||
|
||||
At this point, we strongly advise you to read the streaming section of our
|
||||
[reference guide](../reference_guide/07_graph-streams.md).
|
||||
|
||||
In this article, we assume you have a local instance of Kafka. You can find
|
||||
more about running Kafka [here](https://kafka.apache.org/quickstart).
|
||||
|
||||
From this point on, we assume you have an instance of Kafka running on
`localhost:9092` with a topic `test`, and that you've started Memgraph and have
a Memgraph client running.
|
||||
|
||||
Each Kafka stream in Memgraph requires a transform script written in `Python`
that knows how to interpret incoming data and transform it into queries that
Memgraph understands. Let's assume you have a script available at
`http://localhost/transform.py`.
|
||||
|
||||
Let's also assume the Kafka topic contains two types of messages:
|
||||
|
||||
* Node creation: the message contains a single number, the node id.
|
||||
* Edge creation: the message contains two numbers, origin node id and
|
||||
destination node id.
|
||||
|
||||
To create a stream, enter the following query in the client:
|
||||
|
||||
```opencypher
|
||||
CREATE STREAM mystream AS LOAD DATA KAFKA 'localhost:9092' WITH TOPIC 'test' WITH
|
||||
TRANSFORM 'http://localhost/transform.py'
|
||||
```
|
||||
|
||||
This will create the stream inside Memgraph but will not start it yet. However,
|
||||
if the Kafka instance isn't available on the given URI, or the topic doesn't
|
||||
exist, the query will fail with an appropriate message.
|
||||
|
||||
For example, if the transform script can't be found at the given URI, the following
|
||||
error will be shown:
|
||||
|
||||
```plaintext
|
||||
Client received exception: Couldn't get the transform script from http://localhost/transform.py
|
||||
```
|
||||
Similarly, if the given Kafka topic doesn't exist, we'll get the following:
|
||||
|
||||
```plaintext
|
||||
Client received exception: Kafka stream mystream, topic not found
|
||||
```
|
||||
|
||||
After a successful stream creation, you can check the status of all streams by
|
||||
executing:
|
||||
|
||||
```opencypher
|
||||
SHOW STREAMS
|
||||
```
|
||||
|
||||
This should produce the following output:
|
||||
|
||||
```plaintext
+----------+----------------+-------+-------------------------------+---------+
| name     | uri            | topic | transform                     | status  |
+----------+----------------+-------+-------------------------------+---------+
| mystream | localhost:9092 | test  | http://localhost/transform.py | stopped |
+----------+----------------+-------+-------------------------------+---------+
```
|
||||
As you can see, the status of this stream is `stopped`.
|
||||
|
||||
In order to see if everything is correct, you can test the stream by executing:
|
||||
|
||||
```opencypher
|
||||
TEST STREAM mystream;
|
||||
```
|
||||
|
||||
This will ingest data from Kafka, but instead of writing it to Memgraph, it will
|
||||
just output the result.
|
||||
|
||||
If the `test` Kafka topic contained two messages, `1` and `1 2`, the result
|
||||
of the `TEST STREAM` query would look like:
|
||||
|
||||
```plaintext
+------------------------------------------------------------------------------+-------------------------+
| query                                                                        | params                  |
+------------------------------------------------------------------------------+-------------------------+
| CREATE (:Node {id: $id})                                                     | {id:"1"}                |
| MATCH (n:Node {id: $from_id}), (m:Node {id: $to_id}) CREATE (n)-[:Edge]->(m) | {from_id:"1",to_id:"2"} |
+------------------------------------------------------------------------------+-------------------------+
```
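
To connect this output to the transform script, the sketch below shows one way
a script could turn the two message types into the queries listed above. It
assumes the script exposes a function named `stream` that receives a batch of
raw messages as `bytes` and returns `(query, parameters)` pairs. That entry
point and the helper names `create_node` and `create_edge` are assumptions
made for illustration; the graph streams reference guide describes the actual
interface.

```python
# Sketch of a transform script. The `stream(batch)` entry point and the
# helper names are assumptions, not a confirmed Memgraph interface.


def create_node(node_id):
    """Build the query for a node-creation message."""
    return ("CREATE (:Node {id: $id})", {"id": node_id})


def create_edge(from_id, to_id):
    """Build the query for an edge-creation message."""
    return (
        "MATCH (n:Node {id: $from_id}), (m:Node {id: $to_id}) "
        "CREATE (n)-[:Edge]->(m)",
        {"from_id": from_id, "to_id": to_id},
    )


def stream(batch):
    """Turn a batch of raw Kafka messages into (query, params) pairs."""
    result = []
    for raw in batch:
        fields = raw.decode("utf-8").split()
        if len(fields) == 1:
            result.append(create_node(fields[0]))
        elif len(fields) == 2:
            result.append(create_edge(fields[0], fields[1]))
        # Messages with any other shape are skipped.
    return result
```

Calling `stream([b"1", b"1 2"])` yields one node-creation pair and one
edge-creation pair. The ids stay strings because the messages are not
converted to numbers, which matches the quoted parameters in the table.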
|
||||
|
||||
To start ingesting data from a stream, you need to execute the following query:
|
||||
|
||||
```opencypher
|
||||
START STREAM mystream;
|
||||
```
|
||||
|
||||
If we check the stream status now, the output would look like this:
|
||||
|
||||
```plaintext
+----------+----------------+-------+-------------------------------+---------+
| name     | uri            | topic | transform                     | status  |
+----------+----------------+-------+-------------------------------+---------+
| mystream | localhost:9092 | test  | http://localhost/transform.py | running |
+----------+----------------+-------+-------------------------------+---------+
```
|
||||
|
||||
To stop ingesting data, the stop stream query needs to be executed:
|
||||
|
||||
```opencypher
|
||||
STOP STREAM mystream;
|
||||
```
|
||||
|
||||
If Memgraph shuts down, all streams that existed before the shutdown are going
|
||||
to be recovered.
|
@ -1,142 +0,0 @@
|
||||
## How to Manage User Privileges?
|
||||
|
||||
Most databases have multiple users accessing and modifying
|
||||
data within the database. This might pose a serious security concern for the
|
||||
system administrators who wish to grant only certain privileges to certain
|
||||
users. A typical example would be an internal database of some company which
|
||||
tracks data about their employees. Naturally, only certain users of the database
|
||||
should be able to perform queries which modify that data.
|
||||
|
||||
At Memgraph, we provide the administrators with the option of granting,
|
||||
denying or revoking a certain set of privileges to some users or groups of users
|
||||
(i.e. users that are assigned a specific user role), thereby eliminating such
|
||||
security concerns.
|
||||
|
||||
By default, anyone can connect to Memgraph and is granted all privileges.
After the first user is created, Memgraph will execute a query if and only
if either the user or its role is granted the required privilege and neither
the user nor its role is denied that privilege. Otherwise, Memgraph will not
execute that query. Note that `DENY` is a stronger operation than `GRANT`:
a privilege denied to either the user or its role stays denied, whatever is
granted to the other. Furthermore, if neither the user nor its role is
explicitly granted or denied a certain privilege, that user will not be able
to perform that specific query. This effect is also known as a silent deny.
The information above is condensed in the following table:
|
||||
|
||||
User Status | Role Status | Effective Status
----------- | ----------- | ----------------
GRANT       | GRANT       | GRANT
GRANT       | DENY        | DENY
GRANT       | NULL        | GRANT
DENY        | GRANT       | DENY
DENY        | DENY        | DENY
DENY        | NULL        | DENY
NULL        | GRANT       | GRANT
NULL        | DENY        | DENY
NULL        | NULL        | DENY
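
The table can also be read as a small decision function. The sketch below is
only a model of the rules stated above, not Memgraph's implementation; the
function name `effective_status` is hypothetical, and `None` stands in for
`NULL`.

```python
# Hypothetical helper that mirrors the table. None stands for NULL, meaning
# the privilege is neither granted nor denied.

GRANT = "GRANT"
DENY = "DENY"


def effective_status(user_status, role_status):
    """Combine a user's status with its role's status for one privilege."""
    statuses = (user_status, role_status)
    if DENY in statuses:
        # An explicit DENY on either level always wins.
        return DENY
    if GRANT in statuses:
        return GRANT
    # Nothing was granted or denied: the silent deny.
    return DENY
```

Checking `DENY` first is what makes it the stronger operation, and the final
fallback is the silent deny.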
|
||||
|
||||
All supported commands that deal with accessing or modifying users, user
|
||||
roles and privileges can only be executed by users that are granted the
|
||||
`AUTH` privilege. All of those commands are listed in the appropriate
|
||||
[reference guide](../reference_guide/08_security.md).
|
||||
|
||||
At the moment, privileges are confined to users' abilities to perform certain
|
||||
openCypher queries. Namely, users can be given permission to execute a subset
|
||||
of the following commands: `CREATE`, `DELETE`, `MATCH`, `MERGE`, `SET`,
|
||||
`REMOVE`, `INDEX`, `AUTH`, `STREAM`.
|
||||
|
||||
We could naturally cluster those privileges into groups:
|
||||
|
||||
* Privilege to access data (`MATCH`)
|
||||
* Privilege to modify data (`MERGE`, `SET`)
|
||||
* Privilege to create and delete data (`CREATE`, `DELETE`, `REMOVE`)
|
||||
* Privilege to index data (`INDEX`)
|
||||
* Privilege to use data streaming (`STREAM`)
|
||||
* Privilege to view and alter users, roles and privileges (`AUTH`)
|
||||
|
||||
If you are unfamiliar with any of these commands, you can look them up in our
|
||||
[reference guide](../reference_guide/01_reference-overview.md).
|
||||
|
||||
Similarly, the complete list of commands which can be executed under `AUTH`
|
||||
privilege can be viewed in the
|
||||
[appropriate article](../reference_guide/08_security.md) within our reference
|
||||
guide.
|
||||
|
||||
The remainder of this article outlines a recommended workflow of
|
||||
user management within an internal database of a fictitious company.
|
||||
|
||||
### Creating an Administrator
|
||||
|
||||
As stated in the introduction, after the first user is created, Memgraph
will execute a query for a given user only if the effective status of the
corresponding privilege evaluates to `GRANT`. As a corollary, the person who
created the first user might not be able to perform any meaningful action
after their session has ended. To prevent that from happening, we strongly
recommend that the first created user be an administrator who is granted all
privileges.
|
||||
|
||||
Therefore, let's create a user named `admin` and set its password to `0000`.
|
||||
This can be done by executing:
|
||||
|
||||
```openCypher
|
||||
CREATE USER admin IDENTIFIED BY '0000';
|
||||
```
|
||||
|
||||
Granting all privileges to our `admin` user can be done as follows:
|
||||
|
||||
```openCypher
|
||||
GRANT ALL PRIVILEGES TO admin;
|
||||
```
|
||||
|
||||
At this point, the current user can close their session and log into a new
one as the `admin` user they have just created. The remainder of the article
is written from the viewpoint of an administrator who is granted
all privileges.
|
||||
|
||||
### Creating Other Users
|
||||
|
||||
Our fictitious company is internally divided into teams, and each team has
|
||||
its own supervisor. All employees of the company need to access and modify
|
||||
data within the database.
|
||||
|
||||
Creating a user account for a new hire named Alice can be done as follows:
|
||||
|
||||
```openCypher
|
||||
CREATE USER alice IDENTIFIED BY '0042';
|
||||
```
|
||||
|
||||
Alice should also be granted the privileges to access and modify data, which
can be done by executing the following:
|
||||
|
||||
```openCypher
|
||||
GRANT MATCH, MERGE, SET TO alice;
|
||||
```
|
||||
|
||||
### Creating User Roles
|
||||
|
||||
Each team supervisor needs to have additional privileges that allow them to
|
||||
create new data or delete existing data from the database. Instead of tediously
|
||||
granting additional privileges to each supervisor using language constructs from
|
||||
the previous chapter, we could do so by creating a new user role for
|
||||
supervisors.
|
||||
|
||||
Creating a user role named `supervisor` can be done by executing the following
|
||||
command:
|
||||
|
||||
```openCypher
|
||||
CREATE ROLE supervisor;
|
||||
```
|
||||
|
||||
Granting the privilege to create and delete data to our newly created role can
|
||||
be done as follows:
|
||||
|
||||
```openCypher
|
||||
GRANT CREATE, DELETE, REMOVE TO supervisor;
|
||||
```
|
||||
|
||||
Finally, we need to assign that role to each of the supervisors. Suppose a user
|
||||
named `bob` is indeed a supervisor within the company. Assigning them that role
|
||||
within the database can be done by the following command:
|
||||
|
||||
```openCypher
|
||||
SET ROLE FOR bob TO supervisor;
|
||||
```
|
@ -1,290 +0,0 @@
|
||||
## Quick Start
|
||||
|
||||
This article briefly outlines the basic steps necessary to install and run
|
||||
Memgraph. It also gives a brief glimpse into the world of openCypher and
|
||||
outlines some information on programmatic querying of Memgraph. The users
|
||||
should also make sure to read and fully understand the implications of
|
||||
[telemetry](#telemetry) at the very end of the article.
|
||||
|
||||
### Installation
|
||||
|
||||
Depending on their preference, users can download the Memgraph binary
|
||||
as:
|
||||
|
||||
* [a Debian package for Debian 9 (Stretch)](#debian-installation)
|
||||
* [an RPM package for CentOS 7](#RPM-installation)
|
||||
* [a Docker image](#docker-installation)
|
||||
|
||||
After downloading the binary, users are advised to proceed to the corresponding
|
||||
section below which outlines the installation details.
|
||||
|
||||
It is important to note that newer versions of Memgraph are currently not
|
||||
backward compatible with older versions. This is mainly noticeable by
|
||||
being unable to load storage snapshots between different versions.
|
||||
|
||||
#### Debian Package Installation {#debian-installation}
|
||||
|
||||
After downloading Memgraph as a Debian package, install it by running the
|
||||
following:
|
||||
|
||||
```bash
|
||||
dpkg -i /path/to/memgraph_<version>.deb
|
||||
```
|
||||
|
||||
On successful installation, Memgraph should already be running. To
make sure that is true, the user can start it explicitly with the command:
|
||||
|
||||
|
||||
```bash
|
||||
systemctl start memgraph
|
||||
```
|
||||
|
||||
To verify that Memgraph is running, the user can run the following command:
|
||||
|
||||
```bash
|
||||
journalctl --unit memgraph
|
||||
```
|
||||
|
||||
If successful, the user should receive an output similar to the following:
|
||||
|
||||
```bash
|
||||
Nov 23 13:40:13 hostname memgraph[14654]: Starting 8 BoltS workers
|
||||
Nov 23 13:40:13 hostname memgraph[14654]: BoltS server is fully armed and operational
|
||||
Nov 23 13:40:13 hostname memgraph[14654]: BoltS listening on 0.0.0.0 at 7687
|
||||
```
|
||||
|
||||
At this point, Memgraph is ready to process queries. To try out some elementary
|
||||
queries, the user should proceed to the [querying](#querying) section of this
|
||||
article.
|
||||
|
||||
To shut down the Memgraph server, issue the following command:
|
||||
|
||||
```bash
|
||||
systemctl stop memgraph
|
||||
```
|
||||
|
||||
Memgraph configuration is available in `/etc/memgraph/memgraph.conf`. If the
|
||||
configuration is altered, Memgraph needs to be restarted.
|
||||
|
||||
#### RPM Package Installation {#RPM-installation}
|
||||
|
||||
After downloading the RPM package of Memgraph, the user can install it by
|
||||
issuing the following command:
|
||||
|
||||
```bash
|
||||
rpm -U /path/to/memgraph-<version>.rpm
|
||||
```
|
||||
|
||||
After the successful installation, Memgraph can be started as a service. To do
|
||||
so, the user can type the following command:
|
||||
|
||||
```bash
|
||||
systemctl start memgraph
|
||||
```
|
||||
|
||||
To verify that Memgraph is running, the user should run the following command:
|
||||
|
||||
```bash
|
||||
journalctl --unit memgraph
|
||||
```
|
||||
|
||||
If successful, the user should receive an output similar to the following:
|
||||
|
||||
```bash
|
||||
Nov 23 13:40:13 hostname memgraph[14654]: Starting 8 BoltS workers
|
||||
Nov 23 13:40:13 hostname memgraph[14654]: BoltS server is fully armed and operational
|
||||
Nov 23 13:40:13 hostname memgraph[14654]: BoltS listening on 0.0.0.0 at 7687
|
||||
```
|
||||
|
||||
At this point, Memgraph is ready to process queries. To try out some elementary
|
||||
queries, the user should proceed to the [querying](#querying) section of this
|
||||
article.
|
||||
|
||||
To shut down the Memgraph server, issue the following command:
|
||||
|
||||
```bash
|
||||
systemctl stop memgraph
|
||||
```
|
||||
|
||||
Memgraph configuration is available in `/etc/memgraph/memgraph.conf`. If the
|
||||
configuration is altered, Memgraph needs to be restarted.
|
||||
|
||||
#### Docker Installation {#docker-installation}
|
||||
|
||||
Before proceeding with the installation, the user should install the Docker
|
||||
engine on their system. Instructions on how to install Docker can be found on
|
||||
the [official Docker website](https://docs.docker.com/engine/installation).
|
||||
Memgraph's Docker image was built with Docker version `1.12` and should be
|
||||
compatible with all newer versions.
|
||||
|
||||
After successful Docker installation, the user should download the Memgraph
Docker image and import it using the following command:
|
||||
|
||||
```bash
|
||||
docker load -i /path/to/memgraph-<version>-docker.tar.gz
|
||||
```
|
||||
|
||||
To actually start Memgraph, the user should issue the following command:
|
||||
|
||||
```bash
|
||||
docker run -p 7687:7687 \
|
||||
-v mg_lib:/var/lib/memgraph -v mg_log:/var/log/memgraph -v mg_etc:/etc/memgraph \
|
||||
memgraph
|
||||
```
|
||||
|
||||
If successful, the user should be greeted with the following message:
|
||||
|
||||
```bash
|
||||
Starting 8 workers
|
||||
Server is fully armed and operational
|
||||
Listening on 0.0.0.0 at 7687
|
||||
```
|
||||
|
||||
At this point, Memgraph is ready to process queries. To try out some elementary
|
||||
queries, the user should proceed to the [querying](#querying) section of this
|
||||
article.
|
||||
|
||||
To stop Memgraph, press `Ctrl-c`.
|
||||
|
||||
#### Note about named volumes
|
||||
|
||||
Memgraph configuration is available in Docker's named volume `mg_etc`. On
|
||||
Linux systems it should be in
|
||||
`/var/lib/docker/volumes/mg_etc/_data/memgraph.conf`. After changing the
|
||||
configuration, Memgraph needs to be restarted.
|
||||
|
||||
If it happens that the named volumes are reused between different Memgraph
|
||||
versions, Docker will overwrite a folder within the container with existing
|
||||
data from the host machine. If a new file is introduced, or two versions of
|
||||
Memgraph are not compatible, some features might not work or Memgraph might
|
||||
not be able to work correctly. We strongly advise the users to use another
|
||||
named volume for a different Memgraph version or to remove the existing volume
|
||||
from the host with the following command:
|
||||
|
||||
```bash
|
||||
docker volume rm <volume_name>
|
||||
```
|
||||
#### Note for OS X/macOS Users {#OSX-note}
|
||||
|
||||
Although unlikely, some OS X/macOS users might experience minor difficulties
|
||||
after following the Docker installation instructions. Instead of running on
|
||||
`localhost`, a Docker container for Memgraph might be running on a custom IP
|
||||
address. Fortunately, that IP address can be found using the following
|
||||
steps:
|
||||
|
||||
1) Find out the container ID of the Memgraph container
|
||||
|
||||
By issuing the command `docker ps` the user should get an output similar to the
|
||||
following:
|
||||
|
||||
```bash
|
||||
CONTAINER ID IMAGE COMMAND CREATED ...
|
||||
9397623cd87e memgraph "/usr/lib/memgraph/m…" 2 seconds ago ...
|
||||
```
|
||||
|
||||
At this point, it is important to remember the container ID of the Memgraph
|
||||
image. In our case, that is `9397623cd87e`.
|
||||
|
||||
2) Use the container ID to retrieve an IP of the container
|
||||
|
||||
```bash
|
||||
docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' 9397623cd87e
|
||||
```
|
||||
|
||||
The command above should yield the sought IP. If that IP does not correspond to
|
||||
`localhost`, it should be used instead of `localhost` when firing up the
|
||||
`neo4j-client` in the [querying](#querying) section.
|
||||
|
||||
### Querying {#querying}
|
||||
|
||||
Memgraph supports the openCypher query language which has been developed by
|
||||
[Neo4j](http://neo4j.com). It is a declarative language developed specifically
|
||||
for interaction with graph databases which is currently going through a
|
||||
vendor-independent standardization process.
|
||||
|
||||
The easiest way to execute openCypher queries against Memgraph is by using
|
||||
Neo4j's command-line tool. The command-line `neo4j-client` can be installed as
|
||||
described [on the official website](https://neo4j-client.net).
|
||||
|
||||
After installing `neo4j-client`, the user can connect to the running Memgraph
|
||||
instance by issuing the following shell command:
|
||||
|
||||
```bash
|
||||
neo4j-client -u "" -p "" localhost 7687
|
||||
```
|
||||
|
||||
After the client has started, it should present a command prompt similar to:
|
||||
|
||||
```bash
|
||||
neo4j-client 2.1.3
|
||||
Enter `:help` for usage hints.
|
||||
Connected to 'neo4j://@localhost:7687'
|
||||
neo4j>
|
||||
```
|
||||
|
||||
At this point it is possible to execute openCypher queries on Memgraph. Each
|
||||
query needs to end with the `;` (*semicolon*) character. For example:
|
||||
|
||||
```opencypher
|
||||
CREATE (u:User {name: "Alice"})-[:Likes]->(m:Software {name: "Memgraph"});
|
||||
```
|
||||
|
||||
The above will create 2 nodes in the database, one labeled "User" with name
|
||||
"Alice" and the other labeled "Software" with name "Memgraph". It will also
|
||||
create a relationship that "Alice" *likes* "Memgraph".
|
||||
|
||||
To find created nodes and relationships, execute the following query:
|
||||
|
||||
```opencypher
|
||||
MATCH (u:User)-[r]->(x) RETURN u, r, x;
|
||||
```
|
||||
|
||||
#### Supported Languages
|
||||
|
||||
If users wish to query Memgraph programmatically, they can do so using the
|
||||
[Bolt protocol](https://boltprotocol.org). Bolt was designed for efficient
|
||||
communication with graph databases and Memgraph supports
|
||||
[Version 1](https://boltprotocol.org/v1) of the protocol. Bolt protocol drivers
|
||||
for some popular programming languages are listed below:
|
||||
|
||||
* [Java](https://github.com/neo4j/neo4j-java-driver)
|
||||
* [Python](https://github.com/neo4j/neo4j-python-driver)
|
||||
* [JavaScript](https://github.com/neo4j/neo4j-javascript-driver)
|
||||
* [C#](https://github.com/neo4j/neo4j-dotnet-driver)
|
||||
* [Ruby](https://github.com/neo4jrb/neo4j)
|
||||
* [Haskell](https://github.com/zmactep/hasbolt)
|
||||
* [PHP](https://github.com/graphaware/neo4j-bolt-php)
|
||||
|
||||
We have included some basic usage examples for some of the supported languages
|
||||
in the article about [programmatic querying](how_to_guides/03_query-memgraph-programmatically.md).
|
||||
|
||||
### Telemetry {#telemetry}
|
||||
|
||||
Telemetry is an automated process by which some useful data is collected at
a remote point. At Memgraph, we use telemetry for the sole purpose of improving
our product. To that end, we collect some data about the machine that executes
the database (CPU, memory, OS and kernel information) as well as some data
about the database runtime (CPU usage, memory usage, vertex and edge counts).
|
||||
|
||||
Here at Memgraph, we deeply care about the privacy of our users and do not
|
||||
collect any sensitive information. If users wish to disable Memgraph's telemetry
|
||||
features, they can easily do so by either changing the line in
`/etc/memgraph/memgraph.conf` that enables telemetry (`--telemetry-enabled=true`)
to `--telemetry-enabled=false`, or by passing `--telemetry-enabled=false`
as a command-line argument when running the executable.
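
For readers who manage the configuration file with a script, the edit could be
automated along the following lines. This is a minimal sketch: the function
name `disable_telemetry` is hypothetical, and it assumes the flag sits on a
line of its own, as the description above suggests.

```python
# Hypothetical helper. Only the flag name and its two values come from the
# text above; the function itself is an illustration.


def disable_telemetry(config_text):
    """Return the configuration text with telemetry switched off."""
    lines = []
    for line in config_text.splitlines():
        if line.strip() == "--telemetry-enabled=true":
            line = "--telemetry-enabled=false"
        lines.append(line)
    return "\n".join(lines) + "\n"
```

The function works on the file's contents, so a caller would read
`/etc/memgraph/memgraph.conf`, pass the text through it, and write the result
back. As with any configuration change, Memgraph needs to be restarted
afterwards.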
|
||||
|
||||
### Where to Next
|
||||
|
||||
To learn more about the openCypher language, the user should visit our
|
||||
[reference guide](reference_guide/01_reference-overview.md) article.
|
||||
For real-world examples of how to use Memgraph, we strongly suggest reading
|
||||
through the following articles:
|
||||
|
||||
* [Analyzing TED Talks](tutorials/03_analyzing-TED-talks.md)
* [Graphing the Premier League](tutorials/04_graphing-the-premier-league.md)
* [Exploring the European Road Network](tutorials/05_exploring-the-european-road-network.md)
|
||||
|
||||
Details on what can be stored in Memgraph can be found in the article about
|
||||
[Data Storage](concepts/02_storage.md).
|
||||
|
||||
We *welcome and encourage* your feedback!
|
@ -1,22 +0,0 @@
|
||||
## Reference Overview
|
||||
|
||||
[*openCypher*](http://www.opencypher.org/) is a query language for querying
|
||||
graph databases. It aims to be intuitive and easy to learn, while
|
||||
providing a powerful interface for working with graph based data.
|
||||
|
||||
*Memgraph* supports most of the commonly used constructs of the language. The
|
||||
reference guide contains the details of implemented features. It also lists
the features of the language that are not yet supported.
|
||||
|
||||
Our reference guide currently consists of the following articles:
|
||||
|
||||
* [Reading Existing Data](02_reading-existing-data.md)
|
||||
* [Writing New Data](03_writing-new-data.md)
|
||||
* [Reading and Writing](04_reading-and-writing.md)
|
||||
* [Indexing](05_indexing.md)
|
||||
* [Graph Algorithms](06_graph-algorithms.md)
|
||||
* [Graph Streams](07_graph-streams.md)
|
||||
* [Security](08_security.md)
|
||||
* [Dynamic Graph Partitioner](09_dynamic-graph-partitioner.md)
|
||||
* [Other Features](10_other-features.md)
|
||||
* [Differences](11_differences.md)
|
@ -1,280 +0,0 @@
|
||||
## Reading Existing Data
|
||||
|
||||
The simplest usage of the language is to find data stored in the
|
||||
database. For that purpose, the following clauses are offered:
|
||||
|
||||
* `MATCH`, which searches for patterns;
* `WHERE`, for filtering the matched data;
* `RETURN`, for defining what will be presented to the user in the result
  set and
* `UNION` and `UNION ALL`, for combining results from multiple queries.
|
||||
|
||||
### MATCH
|
||||
|
||||
This clause is used to obtain data from Memgraph by matching it to a given
|
||||
pattern. For example, to find each node in the database, you can use the
|
||||
following query.
|
||||
|
||||
```opencypher
|
||||
MATCH (node) RETURN node
|
||||
```
|
||||
|
||||
Finding connected nodes can be achieved by using the query:
|
||||
|
||||
```opencypher
|
||||
MATCH (node1)-[connection]-(node2) RETURN node1, connection, node2
|
||||
```
|
||||
|
||||
In addition to general pattern matching, you can narrow the search down by
|
||||
specifying node labels and properties. Similarly, edge types and properties
|
||||
can also be specified. For example, finding each node labeled as `Person` and
|
||||
with property `age` being 42, is done with the following query.
|
||||
|
||||
```opencypher
|
||||
MATCH (n :Person {age: 42}) RETURN n
|
||||
```
|
||||
|
||||
Their friends can be found with the following query.
|
||||
|
||||
```opencypher
|
||||
MATCH (n :Person {age: 42})-[:FriendOf]-(friend) RETURN friend
|
||||
```
|
||||
|
||||
There are cases when a user needs to find data which is connected by
|
||||
traversing a path of connections, but the user doesn't know how many
|
||||
connections need to be traversed. openCypher allows for designating patterns
|
||||
with *variable path lengths*. Matching such a path is achieved by using the
|
||||
`*` (*asterisk*) symbol inside the edge element of a pattern. For example,
|
||||
traversing from `node1` to `node2` by following any number of connections in a
|
||||
single direction can be achieved with:
|
||||
|
||||
```opencypher
|
||||
MATCH (node1)-[r*]->(node2) RETURN node1, r, node2
|
||||
```
|
||||
|
||||
If paths are very long, finding them could take a long time. To prevent that,
|
||||
a user can provide the minimum and maximum length of the path. For example,
|
||||
paths of length between 2 and 4 can be obtained with a query like:
|
||||
|
||||
```opencypher
|
||||
MATCH (node1)-[r*2..4]->(node2) RETURN node1, r, node2
|
||||
```
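
To make the length bounds concrete, the sketch below enumerates the same kind
of paths on a toy graph in Python. It is an illustration of the semantics, not
of how Memgraph evaluates the query. The function name and the edge-list
representation are made up for this example, and it assumes that an edge is
not traversed twice within a single path.

```python
# Toy model of `(node1)-[r*2..4]->(node2)` on a list of directed edges.
# Edges are distinct (source, target) tuples; all names are illustrative.


def variable_length_paths(edges, min_len, max_len):
    """Return every directed path with between min_len and max_len edges.

    A path is a tuple of edges, and no edge is used twice within one path.
    Assumes 1 <= min_len <= max_len.
    """
    outgoing = {}
    for edge in edges:
        outgoing.setdefault(edge[0], []).append(edge)

    found = []

    def extend(path):
        if len(path) >= min_len:
            found.append(tuple(path))
        if len(path) == max_len:
            # The upper bound stops the search from going any deeper.
            return
        for edge in outgoing.get(path[-1][1], []):
            if edge not in path:
                path.append(edge)
                extend(path)
                path.pop()

    for edge in edges:
        extend([edge])
    return found
```

The upper bound is what keeps the search cheap: once a path reaches `max_len`
edges it is never extended, however large the rest of the graph is.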
|
||||
|
||||
It is possible to name patterns in the query and return the resulting paths.
|
||||
This is especially useful when matching variable length paths:
|
||||
|
||||
```opencypher
|
||||
MATCH path = ()-[r*2..4]->() RETURN path
|
||||
```
|
||||
|
||||
More details on how `MATCH` works can be found
|
||||
[here](https://neo4j.com/docs/developer-manual/current/cypher/clauses/match/).
|
||||
|
||||
The `MATCH` clause can be modified by prepending the `OPTIONAL` keyword.
|
||||
`OPTIONAL MATCH` clause behaves the same as a regular `MATCH`, but when it
|
||||
fails to find the pattern, missing parts of the pattern will be filled with
|
||||
`null` values. Examples can be found
|
||||
[here](https://neo4j.com/docs/developer-manual/current/cypher/clauses/optional-match/).
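
The `null`-filling behaviour can be modelled in a few lines of Python. The
sketch below imitates what a query such as
`MATCH (n :Person) OPTIONAL MATCH (n)-[:FriendOf]-(friend) RETURN n, friend`
would return. Both the query and the dictionary-based data model are
illustrative assumptions, and `None` plays the role of `null`.

```python
# Toy model of OPTIONAL MATCH. `friends_of` maps a person to a list of
# friends; all names here are illustrative.


def optional_match(people, friends_of):
    """Return (person, friend) rows, keeping people without any friends."""
    rows = []
    for person in people:
        friends = friends_of.get(person, [])
        if friends:
            rows.extend((person, friend) for friend in friends)
        else:
            # A regular MATCH would drop this person entirely.
            rows.append((person, None))
    return rows
```

With a regular `MATCH`, the `else` branch would not exist and people without
friends would simply be missing from the result.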
|
||||
|
||||
### WHERE
|
||||
|
||||
You have already seen that simple filtering can be achieved by using labels
|
||||
and properties in `MATCH` patterns. When more complex filtering is desired,
|
||||
you can use `WHERE` paired with `MATCH` or `OPTIONAL MATCH`. For example,
|
||||
finding each person older than 20 is done with this query.
|
||||
|
||||
```opencypher
|
||||
MATCH (n :Person) WHERE n.age > 20 RETURN n
|
||||
```
|
||||
|
||||
Additional examples can be found
|
||||
[here](https://neo4j.com/docs/developer-manual/current/cypher/clauses/where/).
|
||||
|
||||
### RETURN
|
||||
|
||||
The `RETURN` clause defines which data should be included in the resulting
|
||||
set. Basic usage was already shown in the examples for `MATCH` and `WHERE`
|
||||
clauses. Another feature of `RETURN` is renaming the results using the `AS`
|
||||
keyword.
|
||||
|
||||
For example:
|
||||
|
||||
```opencypher
|
||||
MATCH (n :Person) RETURN n AS people
|
||||
```
|
||||
|
||||
That query would display all nodes under the header named `people` instead of
|
||||
`n`.
|
||||
|
||||
When you want to get everything that was matched, you can use the `*`
|
||||
(*asterisk*) symbol.
|
||||
|
||||
This query:
|
||||
|
||||
```opencypher
|
||||
MATCH (node1)-[connection]-(node2) RETURN *
|
||||
```
|
||||
|
||||
is equivalent to:
|
||||
|
||||
```opencypher
|
||||
MATCH (node1)-[connection]-(node2) RETURN node1, connection, node2
|
||||
```
|
||||
|
||||
`RETURN` can be followed by the `DISTINCT` operator, which will remove
|
||||
duplicate results. For example, getting unique names of people can be achieved
|
||||
with:
|
||||
|
||||
```opencypher
|
||||
MATCH (n :Person) RETURN DISTINCT n.name
|
||||
```
|
||||
|
||||
Besides choosing what will be the result and how it will be named, the
|
||||
`RETURN` clause can also be used to:
|
||||
|
||||
* limit results with `LIMIT` sub-clause;
|
||||
* skip results with `SKIP` sub-clause;
|
||||
* order results with `ORDER BY` sub-clause and
|
||||
* perform aggregations (such as `count`).
|
||||
|
||||
More details on `RETURN` can be found
|
||||
[here](https://neo4j.com/docs/developer-manual/current/cypher/clauses/return/).
|
||||
|
||||
#### SKIP & LIMIT
|
||||
|
||||
These sub-clauses take the number of results to skip or to return.
|
||||
For example, to get the first 3 results you can use this query.
|
||||
|
||||
```opencypher
|
||||
MATCH (n :Person) RETURN n LIMIT 3
|
||||
```
|
||||
|
||||
If you want to get all the results after the first 3, you can use the
|
||||
following.
|
||||
|
||||
```opencypher
|
||||
MATCH (n :Person) RETURN n SKIP 3
|
||||
```
|
||||
|
||||
The `SKIP` and `LIMIT` can be combined. So for example, to get the 2nd result,
|
||||
you can do:
|
||||
|
||||
```opencypher
|
||||
MATCH (n :Person) RETURN n SKIP 1 LIMIT 1
|
||||
```
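
One way to think about `SKIP` and `LIMIT` is as a slice over the list of results. The sketch below is a plain-Python analogy, assuming the results are already available as a list; `skip_limit` and the sample rows are hypothetical.

```python
def skip_limit(rows, skip=0, limit=None):
    """Model SKIP and LIMIT as a slice over an already produced result list."""
    end = None if limit is None else skip + limit
    return rows[skip:end]


people = ["Ana", "Bob", "Cy", "Dee", "Eve"]  # hypothetical result rows

first_three = skip_limit(people, limit=3)          # RETURN n LIMIT 3
after_three = skip_limit(people, skip=3)           # RETURN n SKIP 3
second_only = skip_limit(people, skip=1, limit=1)  # RETURN n SKIP 1 LIMIT 1
```

Without an `ORDER BY`, which row counts as "the 2nd result" depends on the order in which the patterns happen to be matched.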
|
||||
|
||||
#### ORDER BY
|
||||
|
||||
Since the patterns which are matched can come in any order, it is very useful
|
||||
to be able to enforce some ordering among the results. In such cases, you can
|
||||
use the `ORDER BY` sub-clause.
|
||||
|
||||
For example, the following query will get all `:Person` nodes and order them
|
||||
by their names.
|
||||
|
||||
```opencypher
|
||||
MATCH (n :Person) RETURN n ORDER BY n.name
|
||||
```
|
||||
|
||||
By default, ordering will be in the ascending order. To change the order to be
|
||||
descending, you should append `DESC`.
|
||||
|
||||
For example, to order people by their name descending, you can use this query.
|
||||
|
||||
```opencypher
|
||||
MATCH (n :Person) RETURN n ORDER BY n.name DESC
|
||||
```
|
||||
|
||||
You can also order by multiple variables. The results will be sorted by the
|
||||
first variable listed. If the values are equal, the results are sorted by the
|
||||
second variable, and so on.
|
||||
|
||||
Example. Ordering by first name descending and last name ascending.
|
||||
|
||||
```opencypher
|
||||
MATCH (n :Person) RETURN n ORDER BY n.name DESC, n.lastName
|
||||
```
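
The same two-key ordering can be sketched in Python to show how ties on the first key are broken by the second. The sample records are invented, and the sketch relies on Python's sort being stable; it says nothing about how Memgraph sorts internally.

```python
people = [
    {"name": "Ana", "lastName": "Zed"},
    {"name": "Bob", "lastName": "Young"},
    {"name": "Ana", "lastName": "Abel"},
]

# Sort by the secondary key first (lastName ascending) ...
by_last_name = sorted(people, key=lambda p: p["lastName"])
# ... then by the primary key (name descending). Because the sort is stable,
# rows with equal names keep their lastName order from the previous step.
ordered = sorted(by_last_name, key=lambda p: p["name"], reverse=True)

full_names = [(p["name"], p["lastName"]) for p in ordered]
```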
|
||||
|
||||
Note that `ORDER BY` sees only the variable names as carried over by `RETURN`.
|
||||
This means that the following will result in an error.
|
||||
|
||||
```opencypher
|
||||
MATCH (old :Person) RETURN old AS new ORDER BY old.name
|
||||
```
|
||||
|
||||
Instead, the `new` variable must be used:
|
||||
|
||||
```opencypher
|
||||
MATCH (old :Person) RETURN old AS new ORDER BY new.name
|
||||
```
|
||||
|
||||
The `ORDER BY` sub-clause may come in handy with `SKIP` and/or `LIMIT`
|
||||
sub-clauses. For example, to get the oldest person you can use the following.
|
||||
|
||||
```opencypher
|
||||
MATCH (n :Person) RETURN n ORDER BY n.age DESC LIMIT 1
|
||||
```
|
||||
|
||||
#### Aggregating
|
||||
|
||||
openCypher has functions for aggregating data. Memgraph currently supports
|
||||
the following aggregating functions.
|
||||
|
||||
* `avg`, for calculating the average.
|
||||
* `collect`, for collecting multiple values into a single list or map. If
|
||||
given a single expression values are collected into a list. If given two
|
||||
expressions, values are collected into a map where the first expression
|
||||
denotes map keys (must be string values) and the second expression denotes
|
||||
map values.
|
||||
* `count`, for counting the resulting values.
|
||||
* `max`, for calculating the maximum result.
|
||||
* `min`, for calculating the minimum result.
|
||||
* `sum`, for getting the sum of numeric results.
|
||||
|
||||
Example, calculating the average age:
|
||||
|
||||
```opencypher
|
||||
MATCH (n :Person) RETURN avg(n.age) AS averageAge
|
||||
```
|
||||
|
||||
Collecting items into a list:
|
||||
|
||||
```opencypher
|
||||
MATCH (n :Person) RETURN collect(n.name) AS list_of_names
|
||||
```
|
||||
|
||||
Collecting items into a map:
|
||||
|
||||
```opencypher
|
||||
MATCH (n :Person) RETURN collect(n.name, n.age) AS map_name_to_age
|
||||
```
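
For readers who think in Python, the three aggregations above correspond roughly to the expressions below. This is an analogy over invented sample data; it ignores `null` handling and does not claim anything about what happens when two rows produce the same map key.

```python
people = [
    {"name": "Ana", "age": 30},
    {"name": "Bob", "age": 20},
]

# avg(n.age)
average_age = sum(p["age"] for p in people) / len(people)

# collect(n.name): one expression, so values go into a list
list_of_names = [p["name"] for p in people]

# collect(n.name, n.age): two expressions, so values go into a map
map_name_to_age = {p["name"]: p["age"] for p in people}
```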
|
||||
|
||||
Click
|
||||
[here](https://neo4j.com/docs/developer-manual/current/cypher/functions/aggregating/)
|
||||
for additional details on how aggregations work.
|
||||
|
||||
### UNION and UNION ALL
|
||||
|
||||
openCypher supports combining results from multiple queries into a single result
|
||||
set. That result will contain rows that belong to queries in the union
|
||||
respecting the union type.
|
||||
|
||||
Using `UNION` will contain only distinct rows while `UNION ALL` will keep all
|
||||
rows from all given queries.
|
||||
|
||||
Restrictions when using `UNION` or `UNION ALL`:
|
||||
* The number and the names of columns returned by queries must be the same
|
||||
for all of them.
|
||||
* There can be only one union type between single queries, i.e. a query can't
|
||||
contain both `UNION` and `UNION ALL`.
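
The difference between the two union types can be sketched over plain Python lists of rows. The helper names and sample rows are hypothetical, and the row order shown here is an artefact of the sketch rather than something the database guarantees.

```python
def union_all(*results):
    """Keep every row from every query, duplicates included."""
    return [row for result in results for row in result]


def union(*results):
    """Keep only distinct rows (first occurrence wins in this sketch)."""
    seen = set()
    distinct = []
    for row in union_all(*results):
        if row not in seen:
            seen.add(row)
            distinct.append(row)
    return distinct


# Hypothetical single-column results, both with the column "name".
person_names = [("Alien",), ("Ana",), ("Ana",)]
movie_names = [("Alien",), ("Heat",)]

all_rows = union_all(person_names, movie_names)
distinct_rows = union(person_names, movie_names)
```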
|
||||
|
||||
Example, get the distinct names that appear among persons or movies:
|
||||
|
||||
```opencypher
|
||||
MATCH(n: Person) RETURN n.name AS name UNION MATCH(n: Movie) RETURN n.name AS name
|
||||
```
|
||||
|
||||
Example, get all names of persons and movies (including duplicates):
|
||||
|
||||
```opencypher
|
||||
MATCH(n: Person) RETURN n.name AS name UNION ALL MATCH(n: Movie) RETURN n.name AS name
```
|
@ -1,92 +0,0 @@
|
||||
## Writing New Data
|
||||
|
||||
For adding new data, you can use the following clauses.
|
||||
|
||||
* `CREATE`, for creating new nodes and edges.
|
||||
* `SET`, for adding new or updating existing labels and properties.
|
||||
* `DELETE`, for deleting nodes and edges.
|
||||
* `REMOVE`, for removing labels and properties.
|
||||
|
||||
You can still use the `RETURN` clause to produce results after writing, but it
|
||||
is not mandatory.
|
||||
|
||||
Details on which kind of data can be stored in *Memgraph* can be found in
|
||||
[Data Storage](../concepts/02_storage.md) chapter.
|
||||
|
||||
### CREATE
|
||||
|
||||
This clause is used to add new nodes and edges to the database. The creation
|
||||
is done by providing a pattern, similarly to `MATCH` clause.
|
||||
|
||||
For example, to create 2 new nodes connected with a new edge, use this query.
|
||||
|
||||
```opencypher
|
||||
CREATE (node1)-[:edge_type]->(node2)
|
||||
```
|
||||
|
||||
Labels and properties can be set during creation using the same syntax as in
|
||||
`MATCH` patterns. For example, creating a node with a label and a
|
||||
property:
|
||||
|
||||
```opencypher
|
||||
CREATE (node :Label {property: "my property value"})
|
||||
```
|
||||
|
||||
Additional information on `CREATE` is
|
||||
[here](https://neo4j.com/docs/developer-manual/current/cypher/clauses/create/).
|
||||
|
||||
### SET
|
||||
|
||||
The `SET` clause is used to update labels and properties of already existing
|
||||
data.
|
||||
|
||||
Example. Incrementing everyone's age by 1.
|
||||
|
||||
```opencypher
|
||||
MATCH (n :Person) SET n.age = n.age + 1
|
||||
```
|
||||
|
||||
Click
|
||||
[here](https://neo4j.com/docs/developer-manual/current/cypher/clauses/set/)
|
||||
for a more detailed explanation on what can be done with `SET`.
|
||||
|
||||
### DELETE
|
||||
|
||||
This clause is used to delete nodes and edges from the database.
|
||||
|
||||
Example. Removing all edges of a single type.
|
||||
|
||||
```opencypher
|
||||
MATCH ()-[edge :type]-() DELETE edge
|
||||
```
|
||||
|
||||
When testing the database, you often want to have a clean start by deleting
|
||||
every node and edge in the database. It is reasonable that deleting each node
|
||||
should delete all edges coming into or out of that node.
|
||||
|
||||
```opencypher
|
||||
MATCH (node) DELETE node
|
||||
```
|
||||
|
||||
But, openCypher prevents accidental deletion of edges. Therefore, the above
|
||||
query will report an error. Instead, you need to use the `DETACH` keyword,
|
||||
which will remove edges from a node you are deleting. The following should
|
||||
work and *delete everything* in the database.
|
||||
|
||||
```opencypher
|
||||
MATCH (node) DETACH DELETE node
|
||||
```
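
The rule that a node with attached edges cannot be deleted without `DETACH` can be illustrated with a toy in-memory graph. The `Graph` class below is purely hypothetical and only models the described behaviour.

```python
class Graph:
    """Toy graph: a set of node names and a set of (from, to) edges."""

    def __init__(self):
        self.nodes = set()
        self.edges = set()

    def delete_node(self, node, detach=False):
        attached = {edge for edge in self.edges if node in edge}
        if attached and not detach:
            # Plain DELETE refuses to silently drop edges.
            raise ValueError(f"node {node!r} still has edges; use detach=True")
        self.edges -= attached  # DETACH removes the edges first
        self.nodes.discard(node)


graph = Graph()
graph.nodes |= {"a", "b"}
graph.edges.add(("a", "b"))

try:
    graph.delete_node("a")  # like: MATCH (node) DELETE node
    plain_delete_failed = False
except ValueError:
    plain_delete_failed = True

graph.delete_node("a", detach=True)  # like: MATCH (node) DETACH DELETE node
```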
|
||||
|
||||
More examples are
|
||||
[here](https://neo4j.com/docs/developer-manual/current/cypher/clauses/delete/).
|
||||
|
||||
### REMOVE
|
||||
|
||||
The `REMOVE` clause is used to remove labels and properties from nodes and
|
||||
edges.
|
||||
|
||||
Example.
|
||||
|
||||
```opencypher
|
||||
MATCH (n :WrongLabel) REMOVE n :WrongLabel, n.property
|
||||
```
|
@ -1,51 +0,0 @@
|
||||
## Reading and Writing
|
||||
|
||||
OpenCypher supports combining multiple reads and writes using the
|
||||
`WITH` clause. In addition to combining, the `MERGE` clause is provided which
|
||||
may create patterns if they do not exist.
|
||||
|
||||
### WITH
|
||||
|
||||
The write part of the query cannot be simply followed by another read part. In
|
||||
order to combine them, `WITH` clause must be used. The names this clause
|
||||
establishes are transferred from one part to another.
|
||||
|
||||
For example, creating a node and finding all nodes with the same property.
|
||||
|
||||
```opencypher
|
||||
CREATE (node {property: 42}) WITH node.property AS propValue
|
||||
MATCH (n {property: propValue}) RETURN n
|
||||
```
|
||||
|
||||
Note that the `node` is not visible after `WITH`, since only `node.property`
|
||||
was carried over.
|
||||
|
||||
This clause behaves very much like `RETURN`, so you should refer to features
|
||||
of `RETURN`.
|
||||
|
||||
### MERGE
|
||||
|
||||
The `MERGE` clause is used to ensure that a pattern you are looking for exists
|
||||
in the database. This means that if the pattern is not found, it will be
|
||||
created. In a way, this clause is like a combination of `MATCH` and `CREATE`.
|
||||
|
||||
|
||||
Example. Ensure that a person has at least one friend.
|
||||
|
||||
```opencypher
|
||||
MATCH (n :Person) MERGE (n)-[:FriendOf]->(m)
|
||||
```
|
||||
|
||||
The clause also provides additional features for updating the values depending
|
||||
on whether the pattern was created or matched. This is achieved with `ON
|
||||
CREATE` and `ON MATCH` sub clauses.
|
||||
|
||||
Example. Set different properties depending on what `MERGE` did.
|
||||
|
||||
```opencypher
|
||||
MATCH (n :Person) MERGE (n)-[:FriendOf]->(m)
|
||||
ON CREATE SET m.prop = "created" ON MATCH SET m.prop = "existed"
|
||||
```
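
The match-or-create behaviour of `MERGE`, together with `ON CREATE` and `ON MATCH`, can be sketched as follows. This is a simplified model over an invented dictionary-based graph; the function name and the id given to the created node are assumptions made for illustration.

```python
def merge_friend(graph, person):
    """Match existing FriendOf targets of `person`, or create one."""
    matched = [m for (n, m) in graph["friend_of"] if n == person]
    if matched:
        for friend in matched:
            graph["prop"][friend] = "existed"  # ON MATCH SET m.prop = "existed"
        return matched
    friend = "new_friend_of_" + person  # hypothetical id for the created node
    graph["friend_of"].append((person, friend))
    graph["prop"][friend] = "created"  # ON CREATE SET m.prop = "created"
    return [friend]


graph = {"friend_of": [("ana", "bob")], "prop": {}}
merge_friend(graph, "ana")  # pattern exists, so it is matched
merge_friend(graph, "cy")   # pattern is missing, so it is created
```

Running the merge a second time for `cy` would take the match branch, since the pattern now exists.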
|
||||
|
||||
For more details, click [this
|
||||
link](https://neo4j.com/docs/developer-manual/current/cypher/clauses/merge/).
|
@ -1,56 +0,0 @@
|
||||
## Indexing
|
||||
|
||||
An index stores additional information on certain types of data, so that
|
||||
retrieving said data becomes more efficient. Downsides of indexing are:
|
||||
|
||||
* requiring extra storage for each index and
|
||||
* slowing down writes to the database.
|
||||
|
||||
Carefully choosing which data to index can tremendously improve data retrieval
|
||||
efficiency, and thus make index downsides negligible.
|
||||
|
||||
Memgraph automatically indexes labeled data. This improves queries
|
||||
which fetch nodes by label:
|
||||
|
||||
```opencypher
|
||||
MATCH (n :Label) ... RETURN n
|
||||
```
|
||||
|
||||
Indexing can also be applied to data with a specific combination of label and
|
||||
property. These are not automatically created, instead a user needs to create
|
||||
them explicitly. Creation is done using a special
|
||||
`CREATE INDEX ON :Label(property)` language construct.
|
||||
|
||||
For example, to index nodes which are labeled as `:Person` and have a property
|
||||
named `age`:
|
||||
|
||||
```opencypher
|
||||
CREATE INDEX ON :Person(age)
|
||||
```
|
||||
|
||||
After the index is created, retrieving those nodes will become more efficient.
|
||||
For example, the following query will use the index to directly retrieve the
`:Person` nodes whose `age` property equals `42`, instead of fetching each
`:Person` node and checking its `age` property.
|
||||
|
||||
```opencypher
|
||||
MATCH (n :Person {age: 42}) RETURN n
|
||||
```
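
To make the trade-off concrete, here is a small Python sketch that contrasts a full scan with a lookup in a label-property index. It is an illustration only, assuming a dictionary keyed by property value; Memgraph's actual index structure is not described in this document, and all names and data are invented.

```python
from collections import defaultdict

nodes = [
    {"id": 1, "labels": {"Person"}, "props": {"age": 42}},
    {"id": 2, "labels": {"Person"}, "props": {"age": 20}},
    {"id": 3, "labels": {"Person"}, "props": {}},
    {"id": 4, "labels": {"Movie"}, "props": {"age": 42}},
]


def scan_lookup(age):
    """Without an index: visit every node and check label and property."""
    return [
        n["id"]
        for n in nodes
        if "Person" in n["labels"] and n["props"].get("age") == age
    ]


# Like CREATE INDEX ON :Person(age): extra storage, built once and
# maintained on every write.
person_age_index = defaultdict(list)
for node in nodes:
    if "Person" in node["labels"] and "age" in node["props"]:
        person_age_index[node["props"]["age"]].append(node["id"])


def index_lookup(age):
    """With the index: jump straight to the matching nodes."""
    return person_age_index.get(age, [])
```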
|
||||
|
||||
Using index based retrieval also works when filtering labels and properties
|
||||
with `WHERE`. For example, the same effect as in the previous example can be
|
||||
done with:
|
||||
|
||||
```opencypher
|
||||
MATCH (n) WHERE n:Person AND n.age = 42 RETURN n
|
||||
```
|
||||
|
||||
Since the filter inside `WHERE` can contain any kind of an expression, the
|
||||
expression can be complicated enough so that the index does not get used. We
|
||||
are continuously improving the recognition of index usage opportunities from a
|
||||
`WHERE` expression. If there is any suspicion that an index may not be used,
|
||||
we recommend putting properties and labels inside the `MATCH` pattern.
|
||||
|
||||
Currently, once an index is created it cannot be deleted. This feature will be
|
||||
implemented very soon. The expected syntax for removing an index will be `DROP
|
||||
INDEX ON :Label(property)`.
|
@ -1,137 +0,0 @@
|
||||
## Graph Algorithms
|
||||
|
||||
### Filtering Variable Length Paths
|
||||
|
||||
OpenCypher supports only simple filtering when matching variable length paths.
|
||||
For example:
|
||||
|
||||
```opencypher
|
||||
MATCH (n)-[edge_list:Type * {x: 42}]-(m)
|
||||
```
|
||||
|
||||
This will produce only those paths whose edges have the required `Type` and `x`
|
||||
property value. Edges that compose the produced paths are stored in a symbol
|
||||
named `edge_list`. Naturally, the user could have specified any other symbol
|
||||
name.
|
||||
|
||||
Memgraph extends openCypher with a syntax for arbitrary filter expressions
|
||||
during path matching. The next example filters edges which have property `x`
|
||||
between `0` and `10`.
|
||||
|
||||
```opencypher
|
||||
MATCH (n)-[edge_list * (edge, node | 0 < edge.x < 10)]-(m)
|
||||
```
|
||||
|
||||
Here we introduce a lambda function with parentheses, where the first two
|
||||
arguments, `edge` and `node`, correspond to each edge and node during path
|
||||
matching. `node` is the destination node we are moving to across the current
|
||||
`edge`. The last `node` value will be the same value as `m`. Following the
|
||||
pipe (`|`) character is an arbitrary expression which must produce a boolean
|
||||
value. If `True`, matching continues, otherwise the path is discarded.
|
||||
|
||||
The previous example can be written using the `all` function:
|
||||
|
||||
```opencypher
|
||||
MATCH (n)-[edge_list *]-(m) WHERE all(edge IN edge_list WHERE 0 < edge.x < 10)
|
||||
```
|
||||
|
||||
However, filtering using a lambda function is more efficient because paths
|
||||
may be discarded earlier in the traversal. Furthermore, it provides more
|
||||
flexibility for deciding what kind of paths are matched due to more expressive
|
||||
filtering capabilities. Therefore, filtering through lambda functions should
|
||||
be preferred whenever possible.
|
||||
|
||||
### Breadth First Search
|
||||
|
||||
A typical graph use-case is searching for the shortest path between nodes.
|
||||
The openCypher standard does not define this feature, so Memgraph provides
|
||||
a custom implementation, based on the edge expansion syntax.
|
||||
|
||||
Finding the shortest path between nodes can be done using breadth-first
|
||||
expansion:
|
||||
|
||||
```opencypher
|
||||
MATCH (a {id: 723})-[edge_list:Type *bfs..10]-(b {id: 882}) RETURN *
|
||||
```
|
||||
|
||||
The above query will find all paths of length up to 10 between nodes `a` and `b`.
|
||||
The edge type and maximum path length are used in the same way as in variable
|
||||
length expansion.
|
||||
|
||||
To find only the shortest path, simply append `LIMIT 1` to the `RETURN` clause.
|
||||
|
||||
```opencypher
|
||||
MATCH (a {id: 723})-[edge_list:Type *bfs..10]-(b {id: 882}) RETURN * LIMIT 1
|
||||
```
|
||||
|
||||
Breadth-first expansion allows an arbitrary expression filter that determines
|
||||
if an expansion is allowed. Following is an example in which expansion is
|
||||
allowed only over edges whose `x` property is greater than `12` and nodes
whose `y` property is less than `3`:
|
||||
|
||||
```opencypher
|
||||
MATCH (a {id: 723})-[*bfs..10 (e, n | e.x > 12 AND n.y < 3)]-() RETURN *
|
||||
```
|
||||
|
||||
The filter is defined as a lambda function over `e` and `n`, which denote the edge
|
||||
and node being expanded over in the breadth first search. Note that if the user
|
||||
omits the edge list symbol (`edge_list` in previous examples) it will not be included
|
||||
in the result.
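
The idea of a filter lambda applied during breadth-first expansion can be sketched in Python. This is a generic textbook BFS under stated assumptions (undirected edges stored as dictionaries, node properties kept in a separate mapping), not Memgraph's implementation; every name in it is hypothetical.

```python
from collections import deque


def bfs_expand(edges, start, max_depth, allow=lambda edge, node: True):
    """Return {node: list of edges on a shortest allowed path from start}."""
    adjacency = {}
    for edge in edges:
        # Edges are treated as undirected, like -[...]- in the query.
        adjacency.setdefault(edge["a"], []).append((edge, edge["b"]))
        adjacency.setdefault(edge["b"], []).append((edge, edge["a"]))

    paths = {start: []}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if len(paths[current]) == max_depth:
            continue  # upper bound on path length reached
        for edge, neighbour in adjacency.get(current, []):
            # The filter is checked while expanding, so disallowed
            # edges and nodes are never explored further.
            if neighbour in paths or not allow(edge, neighbour):
                continue
            paths[neighbour] = paths[current] + [edge]
            queue.append(neighbour)
    return paths


edges = [
    {"a": 1, "b": 2, "x": 20},
    {"a": 2, "b": 3, "x": 20},
    {"a": 1, "b": 3, "x": 5},
]
node_y = {1: 0, 2: 1, 3: 2}  # hypothetical `y` property per node

unfiltered = bfs_expand(edges, 1, 10)
filtered = bfs_expand(edges, 1, 10, lambda e, n: e["x"] > 12 and node_y[n] < 3)
```

With the filter, the direct edge from node 1 to node 3 is rejected because its `x` is too small, so node 3 is reached over two edges instead of one.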
|
||||
|
||||
There are a few benefits of the breadth-first expansion approach, as opposed to
|
||||
a specialized `shortestPath` function. For one, it is possible to inject
|
||||
expressions that filter on nodes and edges along the path itself, not just the final
|
||||
destination node. Furthermore, it's possible to find multiple paths to multiple destination
|
||||
nodes regardless of their length. Also, it is possible to simply go through a node's
|
||||
neighbourhood in breadth-first manner.
|
||||
|
||||
Currently, it isn't possible to get all shortest paths to a single node using
|
||||
Memgraph's breadth-first expansion.
|
||||
|
||||
### Weighted Shortest Path
|
||||
|
||||
Another standard use-case in a graph is searching for the weighted shortest
|
||||
path between nodes. The openCypher standard does not define this feature, so
|
||||
Memgraph provides a custom implementation, based on the edge expansion syntax.
|
||||
|
||||
Finding the weighted shortest path between nodes is done using the weighted
|
||||
shortest path expansion:
|
||||
|
||||
```opencypher
|
||||
MATCH (a {id: 723})-[
|
||||
edge_list *wShortest 10 (e, n | e.weight) total_weight
|
||||
]-(b {id: 882})
|
||||
RETURN *
|
||||
```
|
||||
|
||||
The above query will find the shortest path of length up to 10 nodes between
|
||||
nodes `a` and `b`. The length restriction parameter is optional.
|
||||
|
||||
Weighted Shortest Path expansion allows an arbitrary expression that determines
|
||||
the weight for the current expansion. Total weight of a path is calculated as
|
||||
the sum of all weights on the path between two nodes. Following is an example in
|
||||
which the weight between nodes is defined as the product of edge weights
|
||||
(instead of sum), assuming all weights are greater than `1`:
|
||||
|
||||
```opencypher
|
||||
MATCH (a {id: 723})-[
|
||||
edge_list *wShortest 10 (e, n | log(e.weight)) total_weight
|
||||
]-(b {id: 882})
|
||||
RETURN exp(total_weight)
|
||||
```
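
The logarithm trick works because `log(a) + log(b) = log(a * b)`, so minimising a sum of logarithms minimises the product. The sketch below uses a standard Dijkstra search with a weight lambda to show this. It is an illustration, not Memgraph's algorithm; it omits the optional length limit, and the graph data and function name are invented.

```python
import heapq
import math


def w_shortest(edges, source, target, weight):
    """Return the smallest total weight from source to target, or None."""
    adjacency = {}
    for a, b, props in edges:
        adjacency.setdefault(a, []).append((b, props))
        adjacency.setdefault(b, []).append((a, props))

    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        total, node = heapq.heappop(heap)
        if node == target:
            return total
        if total > best.get(node, math.inf):
            continue  # stale heap entry
        for neighbour, props in adjacency.get(node, []):
            candidate = total + weight(props, neighbour)
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return None


edges = [
    (1, 2, {"weight": 2.0}),
    (2, 3, {"weight": 3.0}),
    (1, 3, {"weight": 7.0}),
]

# Sum of weights: 2 + 3 beats the direct edge of weight 7.
total_sum = w_shortest(edges, 1, 3, lambda e, n: e["weight"])

# Product of weights: sum the logarithms, then undo with exp.
total_log = w_shortest(edges, 1, 3, lambda e, n: math.log(e["weight"]))
total_product = math.exp(total_log)
```

The requirement that all weights be greater than `1` keeps every logarithm positive, which is what a Dijkstra-style search needs.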
|
||||
|
||||
Weighted Shortest Path expansion also allows an arbitrary expression filter
|
||||
that determines if an expansion is allowed. Following is an example in which
|
||||
expansion is allowed only over edges whose `x` property is greater than `12`
and nodes whose `y` property is less than `3`:
|
||||
|
||||
```opencypher
|
||||
MATCH (a {id: 723})-[
|
||||
edge_list *wShortest 10 (e, n | e.weight) total_weight (e, n | e.x > 12 AND n.y < 3)
|
||||
]-(b {id: 882})
|
||||
RETURN exp(total_weight)
|
||||
```
|
||||
|
||||
Both weight and filter expression are defined as lambda functions over `e` and
|
||||
`n`, which denote the edge and the node being expanded over in the weighted
|
||||
shortest path search.
|
@ -1,115 +0,0 @@
|
||||
## Graph Streams
|
||||
|
||||
### Kafka
|
||||
|
||||
Memgraph's custom openCypher clause for creating a stream is:
|
||||
```opencypher
|
||||
CREATE STREAM stream_name AS
|
||||
LOAD DATA KAFKA 'URI'
|
||||
WITH TOPIC 'topic'
|
||||
WITH TRANSFORM 'URI'
|
||||
[BATCH_INTERVAL milliseconds]
|
||||
[BATCH_SIZE count]
|
||||
```
|
||||
The `CREATE STREAM` clause happens in a transaction.
|
||||
|
||||
`WITH TOPIC` parameter specifies the Kafka topic from which we'll stream
|
||||
data.
|
||||
|
||||
`WITH TRANSFORM` parameter should contain a URI of the transform script.
|
||||
We cover more about the transform script later, in the [transform](#transform)
|
||||
section.
|
||||
|
||||
`BATCH_INTERVAL` parameter defines the time interval in milliseconds
|
||||
which is the time between two successive stream importing operations.
|
||||
|
||||
`BATCH_SIZE` parameter defines the count of Kafka messages that will be
|
||||
batched together before import.
|
||||
|
||||
If both `BATCH_INTERVAL` and `BATCH_SIZE` parameters are given, the condition
|
||||
that is satisfied first will trigger the batched import.
|
||||
|
||||
Default value for `BATCH_INTERVAL` is 100 milliseconds, and the default value
|
||||
for `BATCH_SIZE` is 10.
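
The "whichever condition is satisfied first" rule can be written down as a small predicate. This is a sketch of one plausible reading of the text, using the default values stated above; the function and parameter names are hypothetical and the exact boundary behaviour is an assumption.

```python
def should_import(buffered_messages, elapsed_ms,
                  batch_size=10, batch_interval_ms=100):
    """Trigger a batched import when either limit has been reached."""
    return buffered_messages >= batch_size or elapsed_ms >= batch_interval_ms


size_reached = should_import(buffered_messages=10, elapsed_ms=5)
interval_reached = should_import(buffered_messages=1, elapsed_ms=100)
neither_reached = should_import(buffered_messages=9, elapsed_ms=99)
```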
|
||||
|
||||
The `DROP` clause deletes a stream:
|
||||
```opencypher
|
||||
DROP STREAM stream_name;
|
||||
```
|
||||
|
||||
The `SHOW` clause enables you to see all configured streams:
|
||||
```opencypher
|
||||
SHOW STREAMS;
|
||||
```
|
||||
|
||||
You can also start/stop streams with the `START` and `STOP` clauses:
|
||||
```opencypher
|
||||
START STREAM stream_name [LIMIT count BATCHES];
|
||||
STOP STREAM stream_name;
|
||||
```
|
||||
A stream needs to be stopped in order to start it and it needs to be started in
|
||||
order to stop it. Starting a started or stopping a stopped stream will not
|
||||
affect that stream.
|
||||
|
||||
There are also convenience clauses to start and stop all streams:
|
||||
```opencypher
|
||||
START ALL STREAMS;
|
||||
STOP ALL STREAMS;
|
||||
```
|
||||
|
||||
Before the actual import, you can also test the stream with the `TEST
|
||||
STREAM` clause:
|
||||
```opencypher
|
||||
TEST STREAM stream_name [LIMIT count BATCHES];
|
||||
```
|
||||
When a stream is tested, data extraction and transformation occurs, but nothing
|
||||
is inserted into the graph.
|
||||
|
||||
A stream needs to be stopped in order to test it. When the batch limit is
|
||||
omitted, `TEST STREAM` will run for only one batch by default.
|
||||
|
||||
#### Transform
|
||||
|
||||
The transform script allows Memgraph users to have custom Kafka messages and
|
||||
still be able to import data in Memgraph by adding the logic to decode the
|
||||
messages in the transform script.
|
||||
|
||||
The entry point of the transform script from Memgraph is the `stream` function.
|
||||
Input for the `stream` function is a list of bytes that represent byte encoded
|
||||
Kafka messages, and the output of the `stream` function must be a list of
|
||||
tuples containing openCypher string queries and corresponding parameters stored
|
||||
in a dictionary.
|
||||
|
||||
To be more precise, the signature of the `stream` function looks like the
|
||||
following:
|
||||
|
||||
```plaintext
|
||||
stream : [bytes] -> [(str, {str : type})]
|
||||
type : none | bool | int | float | str | list | dict
|
||||
```
|
||||
|
||||
An example of a simple transform script that creates vertices if the message
|
||||
contains one number (the vertex id) or it creates edges if the message contains
|
||||
two numbers (origin vertex id and destination vertex id) would look like the
|
||||
following:
|
||||
|
||||
```python
|
||||
def create_vertex(vertex_id):
|
||||
return ("CREATE (:Node {id: $id})", {"id": vertex_id})
|
||||
|
||||
|
||||
def create_edge(from_id, to_id):
|
||||
return ("MATCH (n:Node {id: $from_id}), (m:Node {id: $to_id}) "\
|
||||
"CREATE (n)-[:Edge]->(m)", {"from_id": from_id, "to_id": to_id})
|
||||
|
||||
|
||||
def stream(batch):
|
||||
result = []
|
||||
for item in batch:
|
||||
message = item.decode('utf-8').split()
|
||||
if len(message) == 1:
|
||||
result.append(create_vertex(message[0]))
|
||||
elif len(message) == 2:
|
||||
result.append(create_edge(message[0], message[1]))
|
||||
return result
|
||||
```
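
Before wiring a transform script into a stream, it may help to check locally that its output matches the signature given above. The checker below is a hypothetical helper, not part of Memgraph, and the tiny `stream` function is only a stand-in for a real transform script.

```python
ALLOWED_TYPES = (type(None), bool, int, float, str, list, dict)


def check_stream_output(result):
    """Raise TypeError unless result looks like [(str, {str: type})]."""
    if not isinstance(result, list):
        raise TypeError("stream must return a list")
    for item in result:
        if not (isinstance(item, tuple) and len(item) == 2):
            raise TypeError("each item must be a (query, parameters) tuple")
        query, parameters = item
        if not isinstance(query, str) or not isinstance(parameters, dict):
            raise TypeError("expected (str, dict)")
        for key, value in parameters.items():
            if not isinstance(key, str) or not isinstance(value, ALLOWED_TYPES):
                raise TypeError(f"bad parameter {key!r}")
    return True


def stream(batch):
    """Stand-in transform: one vertex per message."""
    return [
        ("CREATE (:Node {id: $id})", {"id": message.decode("utf-8")})
        for message in batch
    ]


output = stream([b"1", b"2"])
is_valid = check_stream_output(output)
```

The `TEST STREAM` clause described earlier remains the way to exercise the script against real Kafka messages; a check like this only catches shape errors early.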
|
@ -1,125 +0,0 @@
|
||||
## Security
|
||||
|
||||
Before reading this article we highly recommend going through a how-to guide
|
||||
on [managing user privileges](../how_to_guides/05_manage-user-privileges.md)
|
||||
which contains more thorough explanations of the concepts behind `openCypher`
|
||||
commands listed in this article.
|
||||
|
||||
### Users
|
||||
|
||||
Creating a user can be done by executing the following command:
|
||||
|
||||
```openCypher
|
||||
CREATE USER user_name [IDENTIFIED BY 'password'];
|
||||
```
|
||||
If the user should authenticate themself on each session, i.e. provide their
|
||||
password on each session, the part within the brackets is mandatory. Otherwise,
|
||||
the password is set to `null` and the user will be allowed to log in using
|
||||
any password provided that they provide the correct username.
|
||||
|
||||
You can also set or alter a user's password anytime by issuing the following
|
||||
command:
|
||||
|
||||
```openCypher
|
||||
SET PASSWORD FOR user_name TO 'new_password';
|
||||
```
|
||||
|
||||
Removing a user's password, i.e. allowing the user to log in using any
|
||||
password can be done by setting it to `null` as follows:
|
||||
|
||||
```openCypher
|
||||
SET PASSWORD FOR user_name TO null;
|
||||
```
|
||||
|
||||
### User Roles
|
||||
|
||||
Each user can be assigned at most one user role. One can think of user roles
|
||||
as abstractions which capture the privilege levels of a set of users. For
|
||||
example, suppose that `Dominik` and `Marko` belong to upper management of
|
||||
a certain company. It makes sense to grant them a set of privileges that other
|
||||
users are not entitled to so, instead of granting those privileges to each
|
||||
of them, we can create a role with those privileges called `manager`
|
||||
which we assign to `Dominik` and `Marko`.
|
||||
|
||||
In other words, each privilege that is granted to a user role is automatically
|
||||
granted to a user (unless it has been explicitly denied to that user).
|
||||
Similarly, each privilege that is denied to a user role is automatically denied
|
||||
to a user (even if it has been explicitly granted to that user).
|
||||
|
||||
Creating a user role can be done by executing the following command:
|
||||
|
||||
```openCypher
|
||||
CREATE ROLE role_name;
|
||||
```
|
||||
|
||||
Assigning a user role to a certain user can be done by the following command:
|
||||
|
||||
```openCypher
|
||||
SET ROLE FOR user_name TO role_name;
|
||||
```
|
||||
|
||||
Removing the role from the user can be done by:
|
||||
|
||||
```openCypher
|
||||
CLEAR ROLE FOR user_name;
|
||||
```
|
||||
|
||||
Finally, showing all users that have a certain role can be done as:
|
||||
|
||||
```openCypher
|
||||
SHOW USERS FOR role_name;
|
||||
```
|
||||
|
||||
Similarly, querying which role a certain user has can be done as:
|
||||
|
||||
```openCypher
|
||||
SHOW ROLE FOR user_name;
|
||||
```
|
||||
|
||||
### Privileges
|
||||
|
||||
At the moment, privileges are confined to users' abilities to perform certain
|
||||
`OpenCypher` queries. Namely, users can be given permission to execute a subset
|
||||
of the following commands: `CREATE`, `DELETE`, `MATCH`, `MERGE`, `SET`,
|
||||
`REMOVE`, `INDEX`, `AUTH`, `STREAM`.
|
||||
|
||||
Granting a certain set of privileges to a specific user or user role can be
|
||||
done by issuing the following command:
|
||||
|
||||
```openCypher
|
||||
GRANT privilege_list TO user_or_role;
|
||||
```
|
||||
|
||||
For example, granting `AUTH` and `STREAM` privileges to users with the role
|
||||
`moderator` would be written as:
|
||||
|
||||
```openCypher
|
||||
GRANT AUTH, STREAM TO moderator;
|
||||
```
|
||||
|
||||
Similarly, denying privileges is done using the `DENY` keyword instead of
|
||||
`GRANT`.
|
||||
|
||||
Both denied and granted privileges can be revoked, meaning that their status is
|
||||
not defined for that user or role. Revoking is done using the `REVOKE` keyword.
|
||||
The users should note that, although semantically unintuitive, the level of a
|
||||
certain privilege can be raised by using `REVOKE`. For instance, suppose a user
|
||||
has been denied a `STREAM` privilege, but the role it belongs to is granted
|
||||
that privilege. Currently, the user is unable to use data streaming features,
|
||||
but, after revoking the user's `STREAM` privilege, they will be able to do so.
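
The interplay of granted, denied and revoked privileges between a user and its role can be modelled in a few lines. This sketch follows the rules as described in this section (a denial on either level wins, otherwise a grant on either level is enough). Treating a privilege that is undefined on both levels as not permitted is an assumption, and all names are hypothetical.

```python
GRANT, DENY = "GRANT", "DENY"


def is_allowed(privilege, user_privileges, role_privileges):
    """Combine the user's own status with the status of its role."""
    statuses = {user_privileges.get(privilege), role_privileges.get(privilege)}
    if DENY in statuses:
        return False  # a denial on either level wins
    return GRANT in statuses  # otherwise one grant is enough


role = {"STREAM": GRANT}  # GRANT STREAM TO role
user = {"STREAM": DENY}   # DENY STREAM TO user

before_revoke = is_allowed("STREAM", user, role)

del user["STREAM"]        # REVOKE STREAM FROM user: status becomes undefined
after_revoke = is_allowed("STREAM", user, role)
```

After the revoke, only the role's grant remains, which is why revoking raises the user's effective privilege level in this case.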
|
||||
|
||||
Finally, if you wish to grant, deny or revoke all privileges and find it tedious
|
||||
to explicitly list them, you can use the `ALL PRIVILEGES` construct instead.
|
||||
For example, revoking all privileges from user `jdoe` can be done with the
|
||||
following command:
|
||||
|
||||
```openCypher
|
||||
REVOKE ALL PRIVILEGES FROM jdoe;
|
||||
```
|
||||
|
||||
Obtaining the status of each privilege for a certain user or role can be
|
||||
done by issuing the following command:
|
||||
|
||||
```openCypher
|
||||
SHOW PRIVILEGES FOR user_or_role;
|
||||
```
|
@ -1,9 +0,0 @@
|
||||
## Dynamic Graph Partitioner
|
||||
|
||||
Memgraph supports dynamic graph partitioning which dynamically improves
|
||||
performance on datasets that are badly partitioned over workers. To enable it, the user
|
||||
should use the following flag when firing up the *master* node:
|
||||
|
||||
```plaintext
|
||||
--dynamic_graph_partitioner_enable
|
||||
```
|
@ -1,160 +0,0 @@
|
||||
## Other Features
|
||||
|
||||
The following sections describe some of the other supported features.
|
||||
|
||||
### UNWIND
|
||||
|
||||
The `UNWIND` clause is used to unwind a list of values as individual rows.
|
||||
|
||||
Example. Produce rows out of a single list.
|
||||
|
||||
```opencypher
|
||||
UNWIND [1,2,3] AS listElement RETURN listElement
|
||||
```
|
||||
|
||||
More examples are
|
||||
[here](https://neo4j.com/docs/developer-manual/current/cypher/clauses/unwind/).
|
||||
|
||||
### Functions
|
||||
|
||||
This section contains the list of other supported functions.
|
||||
|
||||
Name | Description
|
||||
-----------------|------------
|
||||
`coalesce` | Returns the first non null argument.
|
||||
`startNode` | Returns the starting node of an edge.
|
||||
`endNode` | Returns the destination node of an edge.
|
||||
`degree` | Returns the number of edges (both incoming and outgoing) of a node.
|
||||
`head` | Returns the first element of a list.
|
||||
`last` | Returns the last element of a list.
|
||||
`properties` | Returns the properties of a node or an edge.
|
||||
`size` | Returns the number of elements in a list or a map. When given a string it returns the number of characters. When given a path it returns the number of expansions (edges) in that path.
|
||||
`toBoolean` | Converts the argument to a boolean.
|
||||
`toFloat` | Converts the argument to a floating point number.
|
||||
`toInteger` | Converts the argument to an integer.
|
||||
`type` | Returns the type of an edge as a character string.
|
||||
`keys` | Returns a list keys of properties from an edge or a node. Each key is represented as a string of characters.
|
||||
`labels` | Returns a list of labels from a node. Each label is represented as a character string.
|
||||
`nodes` | Returns a list of nodes from a path.
|
||||
`relationships` | Returns a list of relationships (edges) from a path.
|
||||
`range` | Constructs a list of value in given range.
|
||||
`tail` | Returns all elements after the first of a given list.
|
||||
`abs` | Returns the absolute value of a number.
|
||||
`ceil` | Returns the smallest integer greater than or equal to given number.
|
||||
`floor` | Returns the largest integer smaller than or equal to given number.
|
||||
`round` | Returns the number, rounded to the nearest integer. Tie-breaking is done using the *commercial rounding*, where -1.5 produces -2 and 1.5 produces 2.
|
||||
`exp` | Calculates `e^n` where `e` is the base of the natural logarithm, and `n` is the given number.
|
||||
`log` | Calculates the natural logarithm of a given number.
|
||||
`log10` | Calculates the logarithm (base 10) of a given number.
|
||||
`sqrt` | Calculates the square root of a given number.
|
||||
`acos` | Calculates the arccosine of a given number.
|
||||
`asin` | Calculates the arcsine of a given number.
|
||||
`atan` | Calculates the arctangent of a given number.
|
||||
`atan2` | Calculates the two-argument arctangent (arctangent2) of two given numbers.
|
||||
`cos` | Calculates the cosine of a given number.
|
||||
`sin` | Calculates the sine of a given number.
|
||||
`tan` | Calculates the tangent of a given number.
|
||||
`sign` | Applies the signum function to a given number and returns the result. The signum of positive numbers is 1, of negative -1 and for 0 returns 0.
|
||||
`e` | Returns the base of the natural logarithm.
|
||||
`pi` | Returns the constant *pi*.
|
||||
`rand` | Returns a random floating point number between 0 (inclusive) and 1 (exclusive).
|
||||
`startsWith` | Check if the first argument starts with the second.
|
||||
`endsWith` | Check if the first argument ends with the second.
|
||||
`contains` | Check if the first argument has an element which is equal to the second argument.
|
||||
`left` | Returns a string containing the specified number of leftmost characters of the original string.
|
||||
`lTrim` | Returns the original string with leading whitespace removed.
|
||||
`replace` | Returns a string in which all occurrences of a specified string in the original string have been replaced by another (specified) string.
|
||||
`reverse` | Returns a string in which the order of all characters in the original string has been reversed.
|
||||
`right` | Returns a string containing the specified number of rightmost characters of the original string.
|
||||
`rTrim` | Returns the original string with trailing whitespace removed.
|
||||
`split` | Returns a list of strings resulting from the splitting of the original string around matches of the given delimiter.
|
||||
`substring` | Returns a substring of the original string, beginning at a 0-based start index and having the given length.
|
||||
`toLower` | Returns the original string in lowercase.
|
||||
`toString` | Converts an integer, float or boolean value to a string.
|
||||
`toUpper` | Returns the original string in uppercase.
|
||||
`trim` | Returns the original string with leading and trailing whitespace removed.
|
||||
`all` | Check if all elements of a list satisfy a predicate.<br/>The syntax is: `all(variable IN list WHERE predicate)`.<br/> NOTE: Whenever possible, use Memgraph's lambda functions when matching instead.
|
||||
`single` | Check if only one element of a list satisfies a predicate.<br/>The syntax is: `single(variable IN list WHERE predicate)`.
|
||||
`reduce` | Accumulate list elements into a single result by applying an expression. The syntax is:<br/>`reduce(accumulator = initial_value, variable IN list | expression)`.
|
||||
`extract` | A list of values obtained by evaluating an expression for each element in list. The syntax is:<br/>`extract(variable IN list | expression)`.
|
||||
`assert` | Raises an exception reported to the client if the given argument is not `true`.
|
||||
`counter` | Generates integers that are guaranteed to be unique on the database level, for the given counter name.
|
||||
`counterSet` | Sets the counter with the given name to the given value.
|
||||
`indexInfo` | Returns a list of all the indexes available in the database. The list includes indexes that are not yet ready for use (they are concurrently being built by another transaction).
|
||||
`timestamp` | Returns the difference, measured in milliseconds, between the current time and midnight, January 1, 1970 UTC.
|
||||
`id` | Returns the identifier of a given node or edge. The identifier is generated when the node or edge is created and is persisted through the durability mechanism.
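
The tie-breaking rule of `round` differs from the default in some languages; Python's built-in `round`, for instance, rounds ties to the nearest even integer. As a rough illustration only, and not Memgraph's implementation, the following Python sketch reproduces commercial rounding with the standard `decimal` module. The helper name `commercial_round` is made up for this example.

```python
from decimal import Decimal, ROUND_HALF_UP


def commercial_round(value: float) -> int:
    """Round to the nearest integer, breaking ties away from zero."""
    # Decimal's ROUND_HALF_UP sends ties away from zero for both signs.
    # Going through repr() keeps the decimal digits the float displays.
    return int(Decimal(repr(value)).quantize(Decimal("1"), rounding=ROUND_HALF_UP))


print(commercial_round(1.5), commercial_round(-1.5))  # 2 -2
print(round(2.5), commercial_round(2.5))              # 2 3
```

The second line of output shows the difference: Python's own `round` maps 2.5 to 2, while commercial rounding maps it to 3.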
|
||||
|
||||
### String Operators
|
||||
|
||||
Apart from comparison and concatenation operators, openCypher provides special
|
||||
string operators for easier matching of substrings:
|
||||
|
||||
Operator | Description
|
||||
-------------------|------------
|
||||
`a STARTS WITH b` | Returns true if a prefix of string `a` is equal to string `b`.
|
||||
`a ENDS WITH b` | Returns true if a suffix of string `a` is equal to string `b`.
|
||||
`a CONTAINS b` | Returns true if some substring of string `a` is equal to string `b`.
|
||||
|
||||
### Parameters
|
||||
|
||||
When automating the queries for Memgraph, it comes in handy to change only
|
||||
some parts of the query. Usually, these parts are values which are used for
|
||||
filtering results or similar, while the rest of the query remains the same.
|
||||
|
||||
Parameters allow reusing the same query, but with different parameter values.
|
||||
The syntax uses the `$` symbol to designate a parameter name. We don't allow
|
||||
old Cypher parameter syntax using curly braces. For example, you can parameterize
|
||||
filtering a node property:
|
||||
|
||||
```opencypher
|
||||
MATCH (node1 {property: $propertyValue}) RETURN node1
|
||||
```
|
||||
|
||||
You can use parameters instead of any literal in the query, but not instead of
|
||||
property maps, even though that is allowed in standard openCypher. The following
|
||||
example is illegal in Memgraph:
|
||||
|
||||
```opencypher
|
||||
MATCH (node1 $propertyValue) RETURN node1
|
||||
```
|
||||
|
||||
To use parameters with the Python driver, use the following syntax:
|
||||
|
||||
```python
|
||||
session.run('CREATE (alice:Person {name: $name, age: $ageValue})',
            name='Alice', ageValue=22).consume()
|
||||
```
|
||||
|
||||
To use parameters whose names are integers, you will need to wrap parameters in
|
||||
a dictionary and convert them to strings before running a query:
|
||||
|
||||
```python
|
||||
session.run('CREATE (alice:Person {name: $0, age: $1})',
            {'0': "Alice", '1': 22}).consume()
|
||||
```
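
The wrapping step can be automated. The helper below is a hypothetical convenience function (it is not part of any driver) that turns a sequence of positional values into the string-keyed dictionary used in the example above.

```python
from typing import Any, Dict, Sequence


def positional_params(values: Sequence[Any]) -> Dict[str, Any]:
    """Map positional values to the string keys '0', '1', ... (for $0, $1, ...)."""
    return {str(index): value for index, value in enumerate(values)}


params = positional_params(["Alice", 22])
print(params)  # {'0': 'Alice', '1': 22}
```

The resulting dictionary could then be passed as the second argument of `session.run`.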
|
||||
|
||||
To use parameters with another driver, please consult the appropriate
|
||||
documentation.
|
||||
|
||||
### CASE
|
||||
|
||||
openCypher supports conditional expressions through two forms of the `CASE`
expression: simple and generic. The simple form compares an expression against
multiple values. For the first value that matches, the result of the expression
provided after the corresponding `THEN` keyword is returned. If nothing
matches, the value following `ELSE` is returned if it is provided, or `null` if
`ELSE` is not used:
|
||||
|
||||
```opencypher
|
||||
MATCH (n)
|
||||
RETURN CASE n.currency WHEN "DOLLAR" THEN "$" WHEN "EURO" THEN "€" ELSE "UNKNOWN" END
|
||||
```
|
||||
|
||||
In the generic form, you don't need to provide an expression whose value is compared to
|
||||
predicates, but you can list multiple predicates and the first one that evaluates
|
||||
to true is matched:
|
||||
|
||||
```opencypher
|
||||
MATCH (n)
|
||||
RETURN CASE WHEN n.height < 30 THEN "short" WHEN n.height > 300 THEN "tall" END
|
||||
```
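
For readers more familiar with imperative languages, the two forms correspond roughly to a lookup with a default and to an if/elif chain. The Python sketch below mirrors the two queries above. The function names are invented for illustration, and `None` stands in for `null`.

```python
from typing import Optional


def currency_symbol(currency: Optional[str]) -> str:
    """Simple form: compare one expression against a list of values."""
    symbols = {"DOLLAR": "$", "EURO": "€"}
    # The ELSE branch supplies the default when nothing matches.
    return symbols.get(currency, "UNKNOWN")


def height_class(height: Optional[float]) -> Optional[str]:
    """Generic form: the first predicate that evaluates to true wins."""
    if height is None:
        return None  # a comparison with null is not true, so no branch matches
    if height < 30:
        return "short"
    if height > 300:
        return "tall"
    return None  # no ELSE was given, so the result is null


print(currency_symbol("EURO"), currency_symbol("YEN"))
print(height_class(10), height_class(100), height_class(500))
```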
|
@ -1,63 +0,0 @@
|
||||
## Differences
|
||||
|
||||
Although we try to implement the openCypher query language as closely to the
|
||||
language reference as possible, we had to make some changes to enhance the
|
||||
user experience.
|
||||
|
||||
### Symbolic Names
|
||||
|
||||
We don't allow symbolic names (variables, label names...) to be openCypher
|
||||
keywords (WHERE, MATCH, COUNT, SUM...).
|
||||
|
||||
### Unicode Codepoints in String Literal
|
||||
|
||||
Use `\u` followed by 4 hex digits in a string literal for a UTF-16 codepoint and
`\U` followed by 8 hex digits for a UTF-32 codepoint in Memgraph.
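
Python string literals happen to use the same two escape forms, so a quick Python check can show what the escapes denote. This is only an analogy and does not exercise Memgraph's parser; the variable names are arbitrary.

```python
# \u takes exactly 4 hex digits; \U takes exactly 8.
e_acute = "\u00e9"       # LATIN SMALL LETTER E WITH ACUTE
grinning = "\U0001F600"  # a code point that does not fit into 4 hex digits

print(e_acute == "é")                      # True
print(hex(ord(grinning)), len(grinning))   # 0x1f600 1
```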
|
||||
|
||||
|
||||
### Difference from Neo4j's Cypher Implementation
|
||||
|
||||
The openCypher initiative stems from Neo4j's Cypher query language. The following is a list
|
||||
of the most important differences between Neo4j's Cypher and Memgraph's openCypher implementation,
|
||||
for users who are already familiar with Neo4j. There might be other differences not documented
|
||||
here (especially subtle semantic ones).
|
||||
|
||||
#### Unsupported Constructs
|
||||
|
||||
* Data importing. Memgraph doesn't support Cypher's CSV importing capabilities.
|
||||
* The `FOREACH` language construct for performing an operation on every list element.
|
||||
* The `CALL` construct for a standalone function call. This can be expressed using
|
||||
`RETURN functioncall()`. For example, with Memgraph you can get information about
|
||||
the indexes present in the database using the `RETURN indexinfo()` openCypher query.
|
||||
* Stored procedures.
|
||||
* Regular expressions for string matching.
|
||||
* `shortestPath` and `allShortestPaths` functions. `shortestPath` can be expressed using
|
||||
Memgraph's breadth-first expansion syntax already described in this document.
|
||||
* Patterns in expressions. For example, Memgraph doesn't support `size((n)-->())`. Most of the time
|
||||
the same functionalities can be expressed differently in Memgraph using `OPTIONAL` expansions,
|
||||
function calls etc.
|
||||
* Map projections such as `MATCH (n) RETURN n {.property1, .property2}`.
|
||||
|
||||
#### Unsupported Functions
|
||||
|
||||
General purpose functions:
|
||||
|
||||
* `exists(n.property)` - This can be expressed using `n.property IS NOT NULL`.
|
||||
* `length()` is named `size()` in Memgraph.
|
||||
|
||||
Aggregation functions:
|
||||
|
||||
* `count(DISTINCT variable)` - This can be expressed using `WITH DISTINCT variable RETURN count(variable)`.
|
||||
|
||||
Mathematical functions:
|
||||
|
||||
* `percentileDisc()`
|
||||
* `stDev()`
|
||||
* `point()`
|
||||
* `distance()`
|
||||
* `degrees()`
|
||||
|
||||
List functions:
|
||||
|
||||
* `any()`
|
||||
* `none()`
|
@ -1,14 +0,0 @@
|
||||
## Tutorials Overview
|
||||
|
||||
Articles within the tutorials section serve as real-world examples of using
|
||||
Memgraph. These articles tend to provide the user with a reasonably-sized
|
||||
dataset and some example queries that showcase how to use Memgraph on that
|
||||
particular dataset. We encourage all Memgraph users to go through at least
|
||||
one of the tutorials as they can also serve as a verification that Memgraph
|
||||
is successfully installed on your system.
|
||||
|
||||
So far we have covered the following topics:
|
||||
|
||||
* [Analyzing TED Talks](02_analyzing-TED-talks.md)
|
||||
* [Graphing the Premier League](03_graphing-the-premier-league.md)
|
||||
* [Exploring the European Road Network](04_exploring-the-european-road-network.md)
|
@ -1,176 +0,0 @@
|
||||
## Analyzing TED Talks
|
||||
|
||||
This article is a part of a series intended to show users how to use Memgraph
|
||||
on real-world data and, by doing so, retrieve some interesting and useful
|
||||
information.
|
||||
|
||||
We highly recommend checking out the other articles from this series:
|
||||
|
||||
* [Exploring the European Road Network](04_exploring-the-european-road-network.md)
|
||||
* [Graphing the Premier League](03_graphing-the-premier-league.md)
|
||||
|
||||
### Introduction
|
||||
|
||||
[TED](https://www.ted.com/) is a nonprofit organization devoted to spreading
|
||||
ideas, usually in the form of short, powerful talks.
|
||||
Today, TED talks are influential videos from expert speakers on almost all
|
||||
topics — from science to business to global issues.
|
||||
Here we present a small dataset which consists of 97 talks, show how to model
|
||||
this data as a graph and demonstrate a few example queries.
|
||||
|
||||
### Data Model
|
||||
|
||||
Each TED talk has a main speaker, so we
|
||||
identify two types of nodes — `Talk` and `Speaker`. Also, we will add
|
||||
an edge of type `Gave` pointing to a `Talk` from its main `Speaker`.
|
||||
Each speaker has a name so we can add property `name` to `Speaker` node.
|
||||
Likewise, we'll add properties `name`, `title` and `description` to node
|
||||
`Talk`. Furthermore, each talk is given in a specific TED event, so we can
|
||||
create node `Event` with property `name` and relationship `InEvent` between
|
||||
talk and event.
|
||||
|
||||
Talks are tagged with keywords to facilitate searching, hence we
|
||||
add node `Tag` with property `name` and relationship `HasTag` between talk and
|
||||
tag. Moreover, users give ratings to each talk by selecting up to three
|
||||
predefined string values. Therefore we add node `Rating` with these values as
|
||||
property `name` and relationship `HasRating` with property `user_count` between
|
||||
talk and rating nodes.
|
||||
|
||||
### Importing the Snapshot
|
||||
|
||||
We have prepared a database snapshot for this example, so the user can easily
|
||||
import it when starting Memgraph using the `--durability-directory` option.
|
||||
|
||||
```bash
|
||||
/usr/lib/memgraph/memgraph --durability-directory /usr/share/memgraph/examples/TEDTalk \
|
||||
--durability-enabled=false --snapshot-on-exit=false
|
||||
```
|
||||
|
||||
When using Memgraph installed from DEB or RPM package, the currently running
|
||||
Memgraph server may need to be stopped before importing the example. The user
|
||||
can do so using the following command:
|
||||
|
||||
```bash
|
||||
systemctl stop memgraph
|
||||
```
|
||||
|
||||
When using Docker, the example can be imported with the following command:
|
||||
|
||||
```bash
|
||||
docker run -p 7687:7687 \
|
||||
-v mg_lib:/var/lib/memgraph -v mg_log:/var/log/memgraph -v mg_etc:/etc/memgraph \
|
||||
memgraph --durability-directory /usr/share/memgraph/examples/TEDTalk \
|
||||
--durability-enabled=false --snapshot-on-exit=false
|
||||
```
|
||||
|
||||
The user should note that any modifications of the database state will persist
|
||||
only during this run of Memgraph.
|
||||
|
||||
### Example Queries
|
||||
|
||||
1) Find all talks given by a specific speaker:
|
||||
|
||||
```opencypher
|
||||
MATCH (n:Speaker {name: "Hans Rosling"})-[:Gave]->(m:Talk)
|
||||
RETURN m.title;
|
||||
```
|
||||
|
||||
2) Find the top 20 speakers with the most talks given:
|
||||
|
||||
```opencypher
|
||||
MATCH (n:Speaker)-[:Gave]->(m)
|
||||
RETURN n.name, COUNT(m) AS TalksGiven
|
||||
ORDER BY TalksGiven DESC LIMIT 20;
|
||||
```
|
||||
|
||||
3) Find talks related by tag to a specific talk and count them:
|
||||
|
||||
```opencypher
|
||||
MATCH (n:Talk {name: "Michael Green: Why we should build wooden skyscrapers"})
|
||||
-[:HasTag]->(t:Tag)<-[:HasTag]-(m:Talk)
|
||||
WITH * ORDER BY m.name
|
||||
RETURN t.name, COLLECT(m.name), COUNT(m) AS TalksCount
|
||||
ORDER BY TalksCount DESC;
|
||||
```
|
||||
|
||||
4) Find the 20 most frequently used tags:
|
||||
|
||||
```opencypher
|
||||
MATCH (t:Tag)<-[:HasTag]-(n:Talk)
|
||||
RETURN t.name AS Tag, COUNT(n) AS TalksCount
|
||||
ORDER BY TalksCount DESC, Tag LIMIT 20;
|
||||
```
|
||||
|
||||
5) Find the 20 talks most often rated as "Funny". If you want to query by other ratings,
|
||||
possible values are: Obnoxious, Jaw-dropping, OK, Persuasive, Beautiful,
|
||||
Confusing, Longwinded, Unconvincing, Fascinating, Ingenious, Courageous, Funny,
|
||||
Informative and Inspiring.
|
||||
|
||||
```opencypher
|
||||
MATCH (r:Rating{name:"Funny"})<-[e:HasRating]-(m:Talk)
|
||||
RETURN m.name, e.user_count ORDER BY e.user_count DESC LIMIT 20;
|
||||
```
|
||||
|
||||
6) Find inspiring talks and their speakers from the field of technology:
|
||||
|
||||
```opencypher
|
||||
MATCH (n:Talk)-[:HasTag]->(m:Tag {name: "technology"})
|
||||
MATCH (n)-[r:HasRating]->(p:Rating {name: "Inspiring"})
|
||||
MATCH (n)<-[:Gave]-(s:Speaker)
|
||||
WHERE r.user_count > 1000
|
||||
RETURN n.title, s.name, r.user_count ORDER BY r.user_count DESC;
|
||||
```
|
||||
|
||||
7) Now let's see one real-world example — how to make a real-time
|
||||
recommendation. If you've just watched a talk from a certain
|
||||
speaker (e.g. Hans Rosling) you might be interested in finding more talks from
|
||||
the same speaker on a similar topic:
|
||||
|
||||
```opencypher
|
||||
MATCH (n:Speaker {name: "Hans Rosling"})-[:Gave]->(m:Talk)
|
||||
MATCH (t:Talk {title: "New insights on poverty"})-[:HasTag]->(tag:Tag)<-[:HasTag]-(m)
|
||||
WITH * ORDER BY tag.name
|
||||
RETURN m.title as Title, COLLECT(tag.name), COUNT(tag) as TagCount
|
||||
ORDER BY TagCount DESC, Title;
|
||||
```
|
||||
|
||||
The following few queries are focused on extracting information about
|
||||
TED events.
|
||||
|
||||
8) Find how many talks were given per event:
|
||||
|
||||
```opencypher
|
||||
MATCH (n:Event)<-[:InEvent]-(t:Talk)
|
||||
RETURN n.name as Event, COUNT(t) AS TalksCount
|
||||
ORDER BY TalksCount DESC, Event
|
||||
LIMIT 20;
|
||||
```
|
||||
|
||||
9) Find the most popular tags in a specific event:
|
||||
|
||||
```opencypher
|
||||
MATCH (n:Event {name:"TED2006"})<-[:InEvent]-(t:Talk)-[:HasTag]->(tag:Tag)
|
||||
RETURN tag.name as Tag, COUNT(t) AS TalksCount
|
||||
ORDER BY TalksCount DESC, Tag
|
||||
LIMIT 20;
|
||||
```
|
||||
|
||||
10) Discover which speakers participated in more than 2 events:
|
||||
|
||||
```opencypher
|
||||
MATCH (n:Speaker)-[:Gave]->(t:Talk)-[:InEvent]->(e:Event)
|
||||
WITH n, COUNT(e) AS EventsCount WHERE EventsCount > 2
|
||||
RETURN n.name as Speaker, EventsCount
|
||||
ORDER BY EventsCount DESC, Speaker;
|
||||
```
|
||||
|
||||
11) For each speaker, search for other speakers that participated in the same
|
||||
events:
|
||||
|
||||
```opencypher
|
||||
MATCH (n:Speaker)-[:Gave]->()-[:InEvent]->(e:Event)<-[:InEvent]-()<-[:Gave]-(m:Speaker)
|
||||
WHERE n.name != m.name
|
||||
WITH DISTINCT n, m ORDER BY m.name
|
||||
RETURN n.name AS Speaker, COLLECT(m.name) AS Others
|
||||
ORDER BY Speaker;
|
||||
```
|
@ -1,190 +0,0 @@
|
||||
## Graphing the Premier League
|
||||
|
||||
This article is a part of a series intended to show users how to use Memgraph
|
||||
on real-world data and, by doing so, retrieve some interesting and useful
|
||||
information.
|
||||
|
||||
We highly recommend checking out the other articles from this series:
|
||||
|
||||
* [Analyzing TED Talks](02_analyzing-TED-talks.md)
|
||||
* [Exploring the European Road Network](04_exploring-the-european-road-network.md)
|
||||
|
||||
### Introduction
|
||||
|
||||
[Football](https://en.wikipedia.org/wiki/Association_football)
|
||||
is a team sport played between two teams of eleven
|
||||
players with a spherical ball. The game is played on a rectangular pitch with
|
||||
a goal at each end. The object of the game is to score by moving the ball
|
||||
beyond the goal line into the opposing goal. The game is played by more than
|
||||
250 million players in over 200 countries, making it the world's most
|
||||
popular sport.
|
||||
|
||||
In this article, we will present a graph model of a reasonably sized dataset
|
||||
of football matches across the world's most popular leagues.
|
||||
|
||||
### Data Model
|
||||
|
||||
In essence, we are trying to model a set of football matches. All information
|
||||
about a single match is going to be contained in three nodes and two edges.
|
||||
Two of the nodes will represent the teams that have played the match, while the
|
||||
third node will represent the game itself. Both edges are directed from the
|
||||
team nodes to the game node and are labeled as `:Played`.
|
||||
|
||||
Let us consider a real life example of this model—Arsene Wenger's 1000th
|
||||
game in charge of Arsenal. This was a regular fixture of the 2013/2014
|
||||
English Premier League, yet it was written in the stars that this historic
|
||||
moment would be a big London derby against Chelsea at Stamford Bridge. The
|
||||
sketch below shows how this game is being modeled in our database.
|
||||
|
||||
```
|
||||
+---------------+                                            +-----------------------------+
|n: Team        |                                            |w: Game                      |
|               |-[:Played {side: "home", outcome: "won"}]-->|                             |
|name: "Chelsea"|                                            |HT_home_score: 4             |
+---------------+                                            |HT_away_score: 0             |
                                                             |HT_result: "H"               |
                                                             |FT_home_score: 6             |
                                                             |FT_away_score: 0             |
                                                             |FT_result: "H"               |
+---------------+                                            |date: "2014-03-22"           |
|m: Team        |                                            |league: "ENG-Premier League" |
|               |-[:Played {side: "away", outcome: "lost"}]->|season: 2013                 |
|name: "Arsenal"|                                            |referee: "Andre Marriner"    |
+---------------+                                            +-----------------------------+
|
||||
```
|
||||
|
||||
### Importing the Snapshot
|
||||
|
||||
We have prepared a database snapshot for this example, so the user can easily
|
||||
import it when starting Memgraph using the `--durability-directory` option.
|
||||
|
||||
```bash
|
||||
/usr/lib/memgraph/memgraph --durability-directory /usr/share/memgraph/examples/football \
|
||||
--durability-enabled=false --snapshot-on-exit=false
|
||||
```
|
||||
|
||||
When using Memgraph installed from DEB or RPM package, the currently running
|
||||
Memgraph server may need to be stopped before importing the example. The user
|
||||
can do so using the following command:
|
||||
|
||||
```bash
|
||||
systemctl stop memgraph
|
||||
```
|
||||
|
||||
When using Docker, the example can be imported with the following command:
|
||||
|
||||
```bash
|
||||
docker run -p 7687:7687 \
|
||||
-v mg_lib:/var/lib/memgraph -v mg_log:/var/log/memgraph -v mg_etc:/etc/memgraph \
|
||||
memgraph --durability-directory /usr/share/memgraph/examples/football \
|
||||
--durability-enabled=false --snapshot-on-exit=false
|
||||
```
|
||||
|
||||
The user should note that any modifications of the database state will persist
|
||||
only during this run of Memgraph.
|
||||
|
||||
### Example Queries
|
||||
|
||||
1) You might wonder, what leagues are supported?
|
||||
|
||||
```opencypher
|
||||
MATCH (n:Game)
|
||||
RETURN DISTINCT n.league AS League
|
||||
ORDER BY League;
|
||||
```
|
||||
|
||||
2) We have stored a certain number of seasons for each league. What is the
|
||||
oldest/newest season we have included?
|
||||
|
||||
```opencypher
|
||||
MATCH (n:Game)
|
||||
RETURN DISTINCT n.league AS League, MIN(n.season) AS Oldest, MAX(n.season) AS Newest
|
||||
ORDER BY League;
|
||||
```
|
||||
|
||||
3) You have already seen one game between Chelsea and Arsenal, let's list all of
|
||||
them in chronological order.
|
||||
|
||||
```opencypher
|
||||
MATCH (n:Team {name: "Chelsea"})-[e:Played]->(w:Game)<-[f:Played]-(m:Team {name: "Arsenal"})
|
||||
RETURN w.date AS Date, e.side AS Chelsea, f.side AS Arsenal,
|
||||
w.FT_home_score AS home_score, w.FT_away_score AS away_score
|
||||
ORDER BY Date;
|
||||
```
|
||||
|
||||
4) How about filtering games in which Chelsea won?
|
||||
|
||||
```opencypher
|
||||
MATCH (n:Team {name: "Chelsea"})-[e:Played {outcome: "won"}]->
|
||||
(w:Game)<-[f:Played]-(m:Team {name: "Arsenal"})
|
||||
RETURN w.date AS Date, e.side AS Chelsea, f.side AS Arsenal,
|
||||
w.FT_home_score AS home_score, w.FT_away_score AS away_score
|
||||
ORDER BY Date;
|
||||
```
|
||||
|
||||
5) Home field advantage is a thing in football. Let's list the number of home
|
||||
defeats for each Premier League team in the 2016/2017 season.
|
||||
|
||||
```opencypher
|
||||
MATCH (n:Team)-[:Played {side: "home", outcome: "lost"}]->
|
||||
(w:Game {league: "ENG-Premier League", season: 2016})
|
||||
RETURN n.name AS Team, count(w) AS home_defeats
|
||||
ORDER BY home_defeats, Team;
|
||||
```
|
||||
|
||||
6) At the end of the season the team with the most points wins the league. For
|
||||
each victory, a team is awarded 3 points and for each draw it is awarded
|
||||
1 point. Let's find out how many points the reigning champions (Chelsea) had
|
||||
at the end of the 2016/2017 season.
|
||||
|
||||
```opencypher
|
||||
MATCH (n:Team {name: "Chelsea"})-[:Played {outcome: "drew"}]->(w:Game {season: 2016})
|
||||
WITH n, COUNT(w) AS draw_points
|
||||
MATCH (n)-[:Played {outcome: "won"}]->(w:Game {season: 2016})
|
||||
RETURN draw_points + 3 * COUNT(w) AS total_points;
|
||||
```
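
The arithmetic behind this query is simply three points per win plus one point per draw. As a sanity check, here is the same computation as a small Python function. The win and draw counts in the usage line are placeholder numbers, not values taken from the dataset.

```python
def league_points(wins: int, draws: int) -> int:
    """Three points for each win, one for each draw, none for a loss."""
    return 3 * wins + draws


# Placeholder counts for illustration only.
print(league_points(wins=10, draws=4))  # 34
```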
|
||||
|
||||
7) In fact, why not retrieve the whole table?
|
||||
|
||||
```opencypher
|
||||
MATCH (n)-[:Played {outcome: "drew"}]->(w:Game {league: "ENG-Premier League", season: 2016})
|
||||
WITH n, COUNT(w) AS draw_points
|
||||
MATCH (n)-[:Played {outcome: "won"}]->(w:Game {league: "ENG-Premier League", season: 2016})
|
||||
RETURN n.name AS Team, draw_points + 3 * COUNT(w) AS total_points
|
||||
ORDER BY total_points DESC;
|
||||
```
|
||||
|
||||
8) People have always debated which of the major leagues is the most exciting.
|
||||
One basic metric is the average number of goals per game. Let's see the results
|
||||
at the end of the 2016/2017 season. WARNING: This might shock you.
|
||||
|
||||
```opencypher
|
||||
MATCH (w:Game {season: 2016})
|
||||
RETURN w.league, AVG(w.FT_home_score) + AVG(w.FT_away_score) AS avg_goals_per_game
|
||||
ORDER BY avg_goals_per_game DESC;
|
||||
```
|
||||
|
||||
9) Another metric might be the number of comebacks—games where one side
|
||||
was winning at half time but was overtaken by the other side by the end
|
||||
of the match. Let's count such occurrences during all supported seasons across
|
||||
all supported leagues.
|
||||
|
||||
```opencypher
|
||||
MATCH (g:Game) WHERE
|
||||
(g.HT_result = "H" AND g.FT_result = "A") OR
|
||||
(g.HT_result = "A" AND g.FT_result = "H")
|
||||
RETURN g.league AS League, count(g) AS Comebacks
|
||||
ORDER BY Comebacks DESC;
|
||||
```
|
||||
|
||||
10) Exciting leagues also tend to be very unpredictable. On that note, let's
|
||||
list all triplets of teams where, during the course of one season, team A won
|
||||
against team B, team B won against team C and team C won against team A.
|
||||
|
||||
```opencypher
|
||||
MATCH (a)-[:Played {outcome: "won"}]->(p:Game {league: "ENG-Premier League", season: 2016})<--
|
||||
(b)-[:Played {outcome: "won"}]->(q:Game {league: "ENG-Premier League", season: 2016})<--
|
||||
(c)-[:Played {outcome: "won"}]->(r:Game {league: "ENG-Premier League", season: 2016})<--(a)
|
||||
WHERE p.date < q.date AND q.date < r.date
|
||||
RETURN a.name AS Team1, b.name AS Team2, c.name AS Team3;
|
||||
```
|
@ -1,178 +0,0 @@
|
||||
## Exploring the European Road Network
|
||||
|
||||
This article is a part of a series intended to show users how to use Memgraph
|
||||
on real-world data and, by doing so, retrieve some interesting and useful
|
||||
information.
|
||||
|
||||
We highly recommend checking out the other articles from this series:
|
||||
|
||||
* [Analyzing TED Talks](02_analyzing-TED-talks.md)
|
||||
* [Graphing the Premier League](03_graphing-the-premier-league.md)
|
||||
|
||||
### Introduction
|
||||
|
||||
This particular article outlines how to use some of Memgraph's built-in graph
|
||||
algorithms. More specifically, the article shows how to use the breadth-first search
|
||||
graph traversal algorithm, and Dijkstra's algorithm for finding weighted
|
||||
shortest paths between nodes in the graph.
|
||||
|
||||
### Data model
|
||||
|
||||
One of the most common applications of graph traversal algorithms is driving
|
||||
route computation, so we will use a European road network graph as an example.
|
||||
The graph consists of 999 major European cities from 39 countries in total.
|
||||
Each city is connected to the country it belongs to via an edge of type `:In_`.
|
||||
There are edges of type `:Road` connecting cities less than 500 kilometers
|
||||
apart. Distance between cities is specified in the `length` property of the
|
||||
edge.
|
||||
|
||||
### Importing the Snapshot
|
||||
|
||||
We have prepared a database snapshot for this example, so the user can easily
|
||||
import it when starting Memgraph using the `--durability-directory` option.
|
||||
|
||||
```bash
|
||||
/usr/lib/memgraph/memgraph --durability-directory /usr/share/memgraph/examples/Europe \
|
||||
--durability-enabled=false --snapshot-on-exit=false
|
||||
```
|
||||
|
||||
When using Memgraph installed from DEB or RPM package, the currently running
|
||||
Memgraph server may need to be stopped before importing the example. The user
|
||||
can do so using the following command:
|
||||
|
||||
```bash
|
||||
systemctl stop memgraph
|
||||
```
|
||||
|
||||
When using Docker, the example can be imported with the following command:
|
||||
|
||||
```bash
|
||||
docker run -p 7687:7687 \
|
||||
-v mg_lib:/var/lib/memgraph -v mg_log:/var/log/memgraph -v mg_etc:/etc/memgraph \
|
||||
memgraph --durability-directory /usr/share/memgraph/examples/Europe \
|
||||
--durability-enabled=false --snapshot-on-exit=false
|
||||
```
|
||||
|
||||
The user should note that any modifications of the database state will persist
|
||||
only during this run of Memgraph.
|
||||
|
||||
### Example Queries
|
||||
|
||||
1) Let's list all of the countries in our road network.
|
||||
|
||||
```opencypher
|
||||
MATCH (c:Country) RETURN c.name ORDER BY c.name;
|
||||
```
|
||||
|
||||
2) Which Croatian cities are in our road network?
|
||||
|
||||
```opencypher
|
||||
MATCH (c:City)-[:In_]->(:Country {name: "Croatia"})
|
||||
RETURN c.name ORDER BY c.name;
|
||||
```
|
||||
|
||||
3) Which cities in our road network are less than 200 km away from Zagreb?
|
||||
|
||||
```opencypher
|
||||
MATCH (:City {name: "Zagreb"})-[r:Road]->(c:City)
|
||||
WHERE r.length < 200
|
||||
RETURN c.name ORDER BY c.name;
|
||||
```
|
||||
|
||||
Now let's try some queries using Memgraph's graph traversal capabilities.
|
||||
|
||||
4) Say you want to drive from Zagreb to Paris. You might wonder, what is the
|
||||
least number of cities you have to visit if you don't want to drive more than
|
||||
500 kilometers between stops. Since the edges in our road network don't connect
|
||||
cities that are more than 500 km apart, this is a great use case for the
|
||||
breadth-first search (BFS) algorithm.
|
||||
|
||||
```opencypher
|
||||
MATCH p = (:City {name: "Zagreb"})
|
||||
-[:Road * bfs]->
|
||||
(:City {name: "Paris"})
|
||||
RETURN nodes(p);
|
||||
```
|
||||
|
||||
5) What if we want to bike to Paris instead of driving? It is unreasonable (and
|
||||
dangerous!) to bike 500 km per day. Let's limit ourselves to biking no more
|
||||
than 200 km in one go.
|
||||
|
||||
```opencypher
|
||||
MATCH p = (:City {name: "Zagreb"})
|
||||
-[:Road * bfs (e, v | e.length <= 200)]->
|
||||
(:City {name: "Paris"})
|
||||
RETURN nodes(p);
|
||||
```
|
||||
|
||||
"What is this special syntax?", you might wonder.
|
||||
|
||||
`(e, v | e.length <= 200)` is called a *filter lambda*. It's a function that
|
||||
takes an edge symbol `e` and a vertex symbol `v` and decides whether this edge
|
||||
and vertex pair should be considered valid in breadth-first expansion by
|
||||
returning true or false (or `null`). In the above example, the lambda returns
|
||||
true if edge length is not greater than 200, because we don't want to bike more
|
||||
than 200 km in one go.
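
To make the role of the filter lambda concrete, here is a minimal breadth-first search in Python that accepts such a predicate. It is a sketch of the general technique, not Memgraph's implementation. The graph representation, the function name `bfs_path`, and the toy distances are all assumptions made for this example.

```python
from collections import deque
from typing import Callable, Dict, List, Optional, Tuple

# Adjacency list: city -> list of (neighbour, road length in km).
Graph = Dict[str, List[Tuple[str, int]]]


def bfs_path(graph: Graph, source: str, target: str,
             allow: Callable[[int, str], bool] = lambda e, v: True
             ) -> Optional[List[str]]:
    """Return a path with the fewest edges, expanding only allowed (edge, vertex) pairs."""
    previous: Dict[str, Optional[str]] = {source: None}
    queue = deque([source])
    while queue:
        city = queue.popleft()
        if city == target:
            # Walk the predecessor links back to the source.
            path: List[str] = []
            node: Optional[str] = city
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour, length in graph.get(city, []):
            # The filter sees the edge (here just its length) and the vertex.
            if neighbour not in previous and allow(length, neighbour):
                previous[neighbour] = city
                queue.append(neighbour)
    return None


# Toy data, not taken from the road network snapshot.
roads: Graph = {
    "A": [("B", 150), ("C", 450)],
    "B": [("D", 180)],
    "C": [("D", 100)],
    "D": [],
}
print(bfs_path(roads, "A", "D"))
print(bfs_path(roads, "A", "D", lambda e, v: e <= 200))
print(bfs_path(roads, "A", "D", lambda e, v: v != "B"))
```

In this sketch an edge and vertex pair that fails the predicate is simply never enqueued, which mirrors how a rejected pair is excluded from the expansion.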
|
||||
|
||||
6) Let's say we also don't want to visit Vienna on our way to Paris, because we
|
||||
have a lot of friends there and visiting all of them would take up a lot of our
|
||||
time. We just have to update our filter lambda.
|
||||
|
||||
```opencypher
|
||||
MATCH p = (:City {name: "Zagreb"})
|
||||
-[:Road * bfs (e, v | e.length <= 200 AND v.name != "Vienna")]->
|
||||
(:City {name: "Paris"})
|
||||
RETURN nodes(p);
|
||||
```
|
||||
|
||||
As you can see, without the additional restriction we could visit 11 cities. If
|
||||
we want to avoid Vienna, we must visit at least 12 cities.
|
||||
|
||||
7) Instead of counting the cities visited, we might want to find the shortest
|
||||
paths in terms of distance travelled. This is a textbook application of
|
||||
Dijkstra's algorithm. The following query will return the list of cities on the
|
||||
shortest path from Zagreb to Paris along with the total length of the path.
|
||||
|
||||
```opencypher
|
||||
MATCH p = (:City {name: "Zagreb"})
|
||||
-[:Road * wShortest (e, v | e.length) total_weight]->
|
||||
(:City {name: "Paris"})
|
||||
RETURN nodes(p) as cities, total_weight;
|
||||
```
|
||||
|
||||
As you can see, the syntax is quite similar to breadth-first search syntax.
|
||||
Instead of a filter lambda, we need to provide a *weight lambda* and the *total
|
||||
weight symbol*. Given an edge and vertex pair, the weight lambda must return the
|
||||
cost of expanding to the given vertex using the given edge. The path returned
|
||||
will have the smallest possible sum of costs, and that sum will be stored in the total
|
||||
weight symbol. A limitation of Dijkstra's algorithm is that the cost must be
|
||||
non-negative.
|
||||
|
||||
8) We can also combine weight and filter lambdas in the shortest-path query.
|
||||
Let's say we're interested in the shortest path that doesn't require travelling
|
||||
more than 200 km in one go for our bike route.
|
||||
|
||||
```opencypher
|
||||
MATCH p = (:City {name: "Zagreb"})
|
||||
-[:Road * wShortest (e, v | e.length) total_weight (e, v | e.length <= 200)]->
|
||||
(:City {name: "Paris"})
|
||||
RETURN nodes(p) as cities, total_weight;
|
||||
```
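
How a weight lambda and a filter lambda interact can be sketched in plain Python. This is only an illustration of Dijkstra's algorithm under stated assumptions, not Memgraph's implementation: `weighted_shortest_path`, the toy graph `roads`, and its lengths are hypothetical, and vertices are plain strings.

```python
import heapq


def weighted_shortest_path(graph, source, target, weight,
                           edge_filter=lambda e, v: True):
    """Return (path, total_weight) for the cheapest path, or (None, None).

    weight stands in for the weight lambda and edge_filter for the
    filter lambda; both receive an edge dict and the neighbour vertex.
    """
    best = {source: 0}
    parents = {source: None}
    heap = [(0, source)]
    while heap:
        cost, current = heapq.heappop(heap)
        if cost > best[current]:
            continue  # stale heap entry, a cheaper route was already found
        if current == target:
            path = []
            while current is not None:
                path.append(current)
                current = parents[current]
            return path[::-1], cost
        for edge, neighbour in graph.get(current, []):
            if not edge_filter(edge, neighbour):
                continue  # expansion rejected by the filter
            step = weight(edge, neighbour)
            if step < 0:
                raise ValueError("Dijkstra's algorithm needs non-negative costs")
            new_cost = cost + step
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                parents[neighbour] = current
                heapq.heappush(heap, (new_cost, neighbour))
    return None, None


# Hypothetical toy road network; lengths are invented for illustration.
roads = {
    "A": [({"length": 150}, "B"), ({"length": 320}, "D")],
    "B": [({"length": 180}, "C")],
    "C": [({"length": 190}, "D")],
    "D": [],
}

by_length = lambda e, v: e["length"]
fastest = weighted_shortest_path(roads, "A", "D", by_length)
bikeable = weighted_shortest_path(roads, "A", "D", by_length,
                                  lambda e, v: e["length"] <= 200)
```

In this toy graph the filter excludes the direct 320 km road, so the filtered result is longer in total even though every single leg stays within the limit.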
|
||||
|
||||
9) Let's try to find the 10 cities that are furthest away from Zagreb.
|
||||
|
||||
```opencypher
|
||||
MATCH (:City {name: "Zagreb"})
|
||||
-[:Road * wShortest (e, v | e.length) total_weight]->
|
||||
(c:City)
|
||||
RETURN c, total_weight
|
||||
ORDER BY total_weight DESC LIMIT 10;
|
||||
```
|
||||
|
||||
It is not surprising to see that they are all in Siberia.
|
||||
|
||||
To learn more about these algorithms, we suggest you check out their Wikipedia
|
||||
pages:
|
||||
|
||||
* [Breadth-first search](https://en.wikipedia.org/wiki/Breadth-first_search)
|
||||
* [Dijkstra's algorithm](https://en.wikipedia.org/wiki/Dijkstra%27s_algorithm)
|
@ -1,61 +0,0 @@
|
||||
## Upcoming Features
|
||||
|
||||
This chapter describes some of the planned features that we at Memgraph are
|
||||
working on.
|
||||
|
||||
### Performance Improvements
|
||||
|
||||
Excellent database performance is one of Memgraph's long-standing goals. We
|
||||
will continue working to improve it. This includes:
|
||||
|
||||
* query compilation;
|
||||
* query execution;
|
||||
* core engine performance;
|
||||
* algorithmic improvements (e.g. bidirectional breadth-first search);
|
||||
* memory usage; and
|
||||
* other improvements.
|
||||
|
||||
### Label-Property Index Usage Improvements
|
||||
|
||||
Currently, indices on combinations of labels and properties can be created, but
|
||||
cannot be deleted. We plan to add a new query language construct which will
|
||||
allow deletion of created indices.
|
||||
|
||||
### Improving openCypher Support
|
||||
|
||||
Although we have implemented the most common features of the openCypher query
|
||||
language, there are other useful features we are still working on.
|
||||
|
||||
#### Functions
|
||||
|
||||
Memgraph's openCypher implementation supports the most useful functions, but
|
||||
there are more which openCypher provides. Some are related to not yet
|
||||
implemented features like paths, while some may use the features Memgraph
|
||||
already supports. Out of the remaining functions, some are more useful than
|
||||
others and as such they will be supported sooner.
|
||||
|
||||
#### List Comprehensions
|
||||
|
||||
List comprehensions are similar to the supported `collect` function, which
|
||||
generates a list out of multiple values. But unlike `collect`, list
|
||||
comprehensions offer a powerful mechanism for filtering or otherwise
|
||||
manipulating values which are collected into a list.
|
||||
|
||||
For example, getting numbers between 0 and 10 and squaring them:
|
||||
|
||||
```opencypher
|
||||
RETURN [x IN range(0, 10) | x^2] AS squares
|
||||
```
|
||||
|
||||
As another example, collecting `:Person` nodes with `age` less than 42 without
|
||||
list comprehensions can be achieved with:
|
||||
|
||||
```opencypher
|
||||
MATCH (n :Person) WHERE n.age < 42 RETURN collect(n)
|
||||
```
|
||||
|
||||
Using list comprehensions, the same can be done with the query:
|
||||
|
||||
```opencypher
|
||||
MATCH (n :Person) RETURN [x IN collect(n) WHERE x.age < 42]
|
||||
```
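
For readers who know Python, its list comprehensions offer a rough analogy for the two examples above. This is only a sketch of the idea in another language: the `people` list is a hypothetical stand-in for `:Person` nodes, and numeric result types may differ from what a database returns.

```python
# openCypher's range(0, 10) includes both endpoints, so the Python
# counterpart needs range(0, 11).
squares = [x ** 2 for x in range(0, 11)]

# Hypothetical stand-ins for :Person nodes.
people = [
    {"name": "Ann", "age": 30},
    {"name": "Bob", "age": 42},
    {"name": "Cy", "age": 17},
]

# Filtering inside the comprehension, like the WHERE part of the query.
younger = [p for p in people if p["age"] < 42]
```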
|