Remove docs/dev (migrated to Notion) (#84)

This commit is contained in:
Marko Budiselić 2021-01-26 20:08:40 +01:00 committed by GitHub
parent 90d4ebdb1e
commit 2caf0e617e
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
57 changed files with 5 additions and 5663 deletions

View File

@ -4,7 +4,7 @@ on:
push:
paths-ignore:
- 'docs/**'
- '*.md'
- '**/*.md'
jobs:
community_build:

View File

@ -6,34 +6,8 @@ data structures, multi-version concurrency control and asynchronous IO.
## Development Documentation
* [Quick Start](docs/dev/quick-start.md)
* [Workflow](docs/dev/workflow.md)
* [Storage](docs/dev/storage/v2/contents.md)
* [Query Engine](docs/dev/query/contents.md)
* [Communication](docs/dev/communication/contents.md)
* [Lisp C++ Preprocessor (LCP)](docs/dev/lcp.md)
## Feature Specifications
Each prominent Memgraph feature requires a feature specification. The purpose
of the feature specification is to have a base for discussing all aspects of
the feature. Elements of feature specifications should be:
* High-level context.
* Interface.
* User stories. Usage from the end-user perspective. In the case of a library,
those should be examples of how to use the programming interface. In the case
of a shell script, examples of how to use the flags.
* Discussion about concurrency, memory management, error management.
* Any other essential functional or non-functional requirements.
* Test and benchmark strategy.
* Possible future changes/improvements/extensions.
* Security concerns.
* Additional and/or optional implementation details.
It's crucial to keep the feature spec up-to-date with the implementation. Take a
look at the list of [feature specifications](docs/feature_spec/contents.md) to
learn more about powerful Memgraph features.
Please continue
[here](https://www.notion.so/memgraph/memgraph-0428591638604c8385550e214ea9f3e6).
## User Documentation

View File

@ -1,269 +0,0 @@
# Code Review Guidelines
This chapter describes some of the things you should be on the lookout for when
reviewing someone else's code.
## Exceptions
Although the Google C++ Style Guide forbids exceptions, we do allow them in
our codebase. As a reviewer you should watch out for the following.
The documentation of throwing functions needs to be in-sync with the
implementation. This must be enforced recursively. I.e. if a function A now
throws a new exception, and the function B uses A, then B needs to handle that
exception or have its documentation updated and so on. Naturally, the same
applies when an exception is removed.
Transitive callers of the function which throws a new exception must be OK
with that. This ties into the previous point. You need to check that all users
of the new exception either handle it correctly or propagate it.
Exceptions should not escape out of class destructors, because that will
terminate the program. The code should be changed so that such cases are not
possible.
Watch out for exceptions being thrown in class constructors. Although this is well defined
in C++, it usually implies that a constructor is doing too much work and the
class construction & initialization should be redesigned. Usual approaches are
using the (Static) Factory Method pattern or having some sort of an
initialization method that needs to be called after the construction is done.
Prefer the Factory Method.
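As a rough, hedged illustration (not code from our repository), a static Factory
Method moves the fallible work out of the constructor and can signal failure
without throwing:
```cpp
#include <fstream>
#include <optional>
#include <string>
#include <utility>

class ConfigFile {
 public:
  // Construction that can fail lives here; the constructor itself stays
  // trivial and non-throwing.
  static std::optional<ConfigFile> Create(const std::string &path) {
    std::ifstream stream(path);
    if (!stream) return std::nullopt;  // signal failure without an exception
    return ConfigFile(std::move(stream));
  }

 private:
  explicit ConfigFile(std::ifstream stream) : stream_(std::move(stream)) {}

  std::ifstream stream_;
};
```
Callers then write something like `auto config = ConfigFile::Create(path);` and
check the optional before using it.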
Don't forget that STL functions may also throw!
## Pointers & References
In cases when some code passes a pointer or reference, or when code stores a
pointer or reference, you should take a careful look at the following.
* Lifetime of the pointed to value (this includes both ownership and
multithreaded access).
* In case of a class, check validity of destructor and move/copy
constructors.
  * Is the pointed-to value mutated? If not, it should be `const` (`const Type
    *` or `const Type &`).
## Allocators & Memory Resources
With the introduction of polymorphic allocators (C++17 `<memory_resource>` and
our `utils/memory.hpp`) we get more convenient type signatures for
containers so as to keep the outward facing API nice. This convenience comes
at the cost of fewer static checks on the type level due to type erasure.
For example:
```cpp
std::pmr::vector<int> first_vec(std::pmr::null_memory_resource());
std::pmr::vector<int> second_vec(std::pmr::new_delete_resource());
second_vec = first_vec;  // What happens here?

// Or with our implementation
utils::MonotonicBufferResource monotonic_memory(1024);
std::vector<int, utils::Allocator<int>> first_vec(&monotonic_memory);
std::vector<int, utils::Allocator<int>> second_vec(utils::NewDeleteResource());
second_vec = first_vec;  // What happens here?
```
In the above, both `first_vec` and `second_vec` have the same type, but have
*different* allocators! This can lead to ambiguity when moving or copying
elements between them.
You need to watch out for the following.
* Swapping can lead to undefined behaviour if the allocators are not equal.
* Is the move construction done with the right allocator?
* Is the move assignment done correctly? Note that it may also throw an exception.
* Is the copy construction done with the right allocator?
* Is the copy assignment done correctly?
* Using `auto` makes allocator propagation rules rather ambiguous.
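To make the first two points above concrete, here is a minimal sketch using the
standard `<memory_resource>` types; the same reasoning applies to
`utils/memory.hpp`:
```cpp
#include <memory_resource>
#include <vector>

int main() {
  std::pmr::monotonic_buffer_resource pool(1024);
  std::pmr::vector<int> first_vec({1, 2, 3}, &pool);
  std::pmr::vector<int> second_vec(std::pmr::new_delete_resource());

  // Copy assignment is well defined: polymorphic_allocator does not propagate
  // on assignment, so second_vec keeps its own resource and copies the
  // elements into it.
  second_vec = first_vec;

  // Swapping containers whose allocators compare unequal is undefined
  // behaviour, so guard it explicitly.
  if (first_vec.get_allocator() == second_vec.get_allocator()) {
    first_vec.swap(second_vec);
  }
}
```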
## Classes & Object Oriented Programming
A common mistake is to use classes, inheritance and "OOP" when it's not
needed. This section shows examples of encountered cases.
### Classes without (Meaningful) Members
```cpp
class MyCoolClass {
 public:
  int BeCool(int a, int b) { return a + b; }
  void SaySomethingCool() { std::cout << "Hello!"; }
};
```
The above class has no members (i.e. state) which affect the behaviour of
methods. This class need not exist; it can easily be replaced with a
more modular (and shorter) design -- top level functions.
```cpp
int BeCool(int a, int b) { return a + b; }
void SaySomethingCool() { std::cout << "Hello!"; }
```
### Classes with a Single Public Method
```cpp
class MyAwesomeClass {
 public:
  MyAwesomeClass(int state) : state_(state) {}

  int GetAwesome() { return GetAwesomeImpl() + 1; }

 private:
  int state_;

  int GetAwesomeImpl() { return state_; }
};
```
The above class has a `state_` and even a private method, but there's only one
public method -- `GetAwesome`.
You should check "Does the stored state have any meaningful influence on the
public method?", similarly to the previous point.
In the above case it doesn't, and the class should be replaced with a public
function in `.hpp` while the private method should become a private function
in `.cpp` (static or in anonymous namespace).
```cpp
// hpp
int GetAwesome(int state);

// cpp
namespace {
int GetAwesomeImpl(int state) { return state; }
}

int GetAwesome(int state) { return GetAwesomeImpl(state) + 1; }
```
A counterexample is when the state is meaningful.
```cpp
class Counter {
 public:
  Counter(int state) : state_(state) {}

  int Get() { return state_++; }

 private:
  int state_;
};
```
But even that could be replaced with a closure.
```cpp
auto MakeCounter(int state) {
  return [state]() mutable { return state++; };
}
```
### Private Methods
Instead of private methods, top level functions should be preferred. The
reasoning is completely explained in "Effective C++" Item 23 by Scott Meyers.
In our codebase, even improvements to compilation times can be noticed if
private methods in interface (`.hpp`) files are replaced with top level
functions in implementation (`.cpp`) files.
### Inheritance
The rule is simple -- if there are no virtual methods (except maybe the destructor),
then the class should be marked as `final` and never inherited.
If there are virtual methods (i.e. the class is meant to be inherited), make sure
that either a public virtual destructor or a protected non-virtual destructor
exists. See "Effective C++" Item 7 by Scott Meyers. Also take a look at
"Effective C++" Items 32---39 by Scott Meyers.
Here is an example of how inheritance with no virtual methods can be replaced
with composition.
```cpp
class MyBase {
 public:
  virtual ~MyBase() {}

  void DoSomethingBase() { ... }
};

class MyDerived final : public MyBase {
 public:
  void DoSomethingNew() { ... DoSomethingBase(); ... }
};
```
With composition, the above becomes.
```cpp
class MyBase final {
 public:
  void DoSomethingBase() { ... }
};

class MyDerived final {
  MyBase base_;

 public:
  void DoSomethingNew() { ... base_.DoSomethingBase(); ... }
};
```
The composition approach is preferred as it encapsulates the fact that
`MyBase` is used for the implementation and users only interact with the
public interface of `MyDerived`. Additionally, you can easily replace `MyBase
base_;` with a C++ PIMPL idiom (`std::unique_ptr<MyBase> base_;`) to make the
code more modular with regards to compilation.
More advanced C++ users will recognize that the encapsulation feature of the
non-PIMPL composition can be replaced with private inheritance.
```cpp
class MyDerived final : private MyBase {
 public:
  void DoSomethingNew() { ... MyBase::DoSomethingBase(); ... }
};
```
One common "counterexample" is the ability to store objects of
different types in a container or pass them to a function. Unfortunately, this
is not that good a design. For example:
```cpp
class MyBase {
  ...  // No virtual methods (but the destructor)
};
class MyFirstClass final : public MyBase { ... };
class MySecondClass final : public MyBase { ... };

std::vector<std::unique_ptr<MyBase>> first_and_second_classes;
first_and_second_classes.push_back(std::make_unique<MyFirstClass>());
first_and_second_classes.push_back(std::make_unique<MySecondClass>());

void FunctionOnFirstOrSecond(const MyBase &first_or_second, ...) { ... }
```
With C++17, the containers for different types should be implemented with
`std::variant`, and as before the functions can be templated.
```cpp
class MyFirstClass final { ... };
class MySecondClass final { ... };

std::vector<std::variant<MyFirstClass, MySecondClass>> first_and_second_classes;
// Notice no heap allocation, since we don't store a pointer
first_and_second_classes.emplace_back(MyFirstClass());
first_and_second_classes.emplace_back(MySecondClass());

// You can also use `std::variant` here instead of template
template <class TFirstOrSecond>
void FunctionOnFirstOrSecond(const TFirstOrSecond &first_or_second, ...) { ... }
```
Naturally, if the base class has meaningful virtual methods (i.e. other than the
destructor) it may be OK to use inheritance, but also consider alternatives.
See "Effective C++" Items 32---39 by Scott Meyers.
### Multiple Inheritance
Multiple inheritance should not be used unless all base classes are pure
interface classes. This decision is inherited from [Google C++ Style
Guide](https://google.github.io/styleguide/cppguide.html#Inheritance). For
example on how to design with and around multiple inheritance refer to
"Effective C++" Item 40 by Scott Meyers.
Naturally, if there *really* is no better design, then multiple inheritance is
allowed. An example of this can be found in our codebase when inheriting
Visitor classes (though even that could be replaced with `std::variant` for
example).
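For illustration, here is a hedged sketch of the allowed case, where every base
is a pure interface class (only pure virtual methods and a virtual destructor):
```cpp
#include <string>

class Readable {
 public:
  virtual ~Readable() = default;
  virtual std::string Read() = 0;
};

class Writable {
 public:
  virtual ~Writable() = default;
  virtual void Write(const std::string &data) = 0;
};

// Multiple inheritance is acceptable here because both bases are pure
// interface classes.
class Buffer final : public Readable, public Writable {
 public:
  std::string Read() override { return data_; }
  void Write(const std::string &data) override { data_ += data; }

 private:
  std::string data_;
};
```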
## Code Format & Style
If something doesn't conform to our code formatting and style, just refer the
author to either [C++ Style](cpp-code-conventions.md) or [Other Code
Conventions](other-code-conventions.md).

View File

@ -1,5 +0,0 @@
# Communication
## Bolt
Memgraph implements [Bolt communication protocol](https://7687.org/).

View File

@ -1,350 +0,0 @@
# C++ Code Conventions
This chapter describes code conventions which should be followed when writing
C++ code.
## Code Style
Memgraph uses the
[Google Style Guide for C++](https://google.github.io/styleguide/cppguide.html)
in most of its code. You should follow them whenever writing new code.
Besides following the style guide, take a look at
[Code Review Guidelines](code-review.md) for common design issues and pitfalls
with C++ as well as [Required Reading](required-reading.md).
### Often Overlooked Style Conventions
#### Pointers & References
References provide a shorter syntax for accessing members and better declare
the intent that a pointer *should* not be `nullptr`. They do not prevent
accessing a `nullptr` and obfuscate the client/calling code because the
reference argument is passed just like a value. Errors with such code have
been very difficult to debug. Therefore, pointers are always used. They will
not prevent bugs but will make some of them more obvious when reading code.
The only time a reference can be used is if it is `const`. Note that this
kind of reference is not allowed if it is stored somewhere, i.e. in a class.
You should use a pointer to `const` then. The primary reason is that
references obscure the semantics of moving an object, thus making bugs with
references pointing to invalid memory harder to track down.
An example of this can be seen when capturing member variables by reference
inside a lambda.
Let's define a class that has two members, where one of those members is a
lambda that captures the other member by reference.
```cpp
struct S {
  std::function<void()> foo;
  int bar;

  S() : foo([&]() { std::cout << bar; }) {}
};
```
What would happen if we move an instance of this object? Our lambda
reference capture will point to the same location as before, i.e. it
will point to the **old** memory location of `bar`. This means we have
a dangling reference in our code!
There are multiple ways to avoid this. The simple solutions would be
capturing by value or disabling move constructors/assignments.
Still, if we capture by reference an object that is not a member
of the struct containing the lambda, we can still have a dangling
reference if we move that object somewhere in our code and there is
nothing we can do to prevent that.
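Here is a short, illustrative sketch of those two workarounds for the member
case:
```cpp
#include <functional>
#include <iostream>

struct SByValue {
  int bar = 0;
  // Capturing bar by value: moving SByValue can no longer dangle, but the
  // lambda sees the value bar had when the lambda was created.
  std::function<void()> foo = [bar = bar]() { std::cout << bar; };
};

struct SNoMove {
  std::function<void()> foo;
  int bar = 0;

  SNoMove() : foo([this]() { std::cout << bar; }) {}
  // Deleting the moves (which also removes the implicit copies) guarantees the
  // captured `this` can never end up pointing at a moved-from object.
  SNoMove(SNoMove &&) = delete;
  SNoMove &operator=(SNoMove &&) = delete;
};
```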
So, be careful with lambda captures, and remember that references are
still pointers under the hood!
[Style guide reference](https://google.github.io/styleguide/cppguide.html#Reference_Arguments)
#### Constructors & RAII
RAII (Resource Acquisition is Initialization) is a nice mechanism for managing
resources. It is especially useful when exceptions are used, such as in our
code. Unfortunately, constructors do have 2 major downsides.
* Only exceptions can be used to signal failure.
* Calls to virtual methods are not resolved as expected.
For those reasons, the style guide recommends that constructors do only minimal work that cannot fail.
Using virtual methods or doing a lot more should be delegated to some form of
`Init` method, possibly coupled with static factory methods. Similar rules
apply to destructors, which are not allowed to even throw exceptions.
[Style guide reference](https://google.github.io/styleguide/cppguide.html#Doing_Work_in_Constructors)
#### Constructors and member variables
One of the most powerful tools in C++ is move semantics. We won't go into
detail about how it works, but you should know how to utilize it as much as
possible. In our example we will define a small `struct` called `S` which
contains only a single member, `text` of type `std::string`.
```cpp
struct S {
std::string text;
};
```
We want to define a constructor that takes a `std::string`, and saves its
value in `text`. This is a common situation, where the constructor takes
a value, and saves it in the object to be constructed.
Our first implementation would look like this:
```cpp
S(const std::string &s) : text(s) {}
```
This is a valid solution but with one downside - we always copy. If we
construct an object like this:
```cpp
S s("some text");
```
we would create a temporary `std::string` object and then copy it to our member
variable.
Of course, we know what to do now - we will capture the temporary variable using
`&&` and move it into our `text` variable.
```cpp
S(std::string &&s) : text(std::move(s)) {}
```
Now let's add an extra member variable of type `std::vector<int>` called
`words`. Our constructor accepts 2 values now - `std::vector<int>` and
`std::string`. Those arguments could be passed by value, by reference, or
as rvalues. To cover all the cases we need to define a dedicated constructor
for each case.
Fortunately, there are two simpler options, the first one is writing a
templated constructor:
```cpp
template<typename T1, typename T2>
S(T1 &&s, T2 &&v) : text(std::forward<T1>(s)), words(std::forward<T2>(v)) {}
```
But don't forget to define a `requires` clause so you don't accept just any type. This
solution is optimal but really hard to read AND write. The second solution is
something you should ALWAYS prefer in these simple cases where we only store
one of the arguments:
```cpp
S(std::string s, std::vector<int> v) : text(std::move(s)), words(std::move(v)) {}
```
This way we have an almost optimal solution. The only extra operation is the
extra move when we pass an lvalue: we copy the value into `s`, and then move it
into the `text` member. Before, we would copy directly into `text`. Also, while
you should write const-correct code, `s` and `v` must not be `const` here. Why
is that? You CANNOT move a const object! The move would just degrade to copying
the object. This is a small price to pay for much cleaner and more maintainable
code.
### Additional Style Conventions
Old code may have broken Google C++ Style accidentally, but new code
should adhere to it as closely as possible. We do have some exceptions
to Google style as well as additions for unspecified conventions.
#### Using C++ Exceptions
Unlike Google, we do not forbid using exceptions.
But, you should be very careful when using them and introducing new ones. They
are indeed handy, but cause problems with understanding the control flow since
exceptions are another form of `goto`. It also becomes very hard to determine
that the program is in a correct state after the stack is unwound and the thrown
exception handled. Other than those issues, throwing exceptions in destructors
will terminate the program. The same will happen if a thread doesn't handle an
exception even though it is not the main thread.
[Style guide reference](https://google.github.io/styleguide/cppguide.html#Exceptions)
In general, when introducing a new exception, either via `throw` statement or
calling a function which throws, you must examine all transitive callers and
update their implementation and/or documentation.
#### Assertions
We use `CHECK` and `DCHECK` macros from the glog library. You are encouraged to
use them as often as possible to both document and validate various pre and
post conditions of a function.
`CHECK` remains even in release builds and should be preferred over its cousin
`DCHECK`, which only exists in debug builds. The primary reason is that you
want to trigger assertions in release builds in case the tests didn't
completely validate all code paths. It is better to fail fast and crash the
program than to leave it in an undefined state and potentially corrupt the end
user's work. In cases when profiling shows that `CHECK` is causing a visible
slowdown, you should switch to `DCHECK`.
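For example (a hypothetical helper, not code from our repository):
```cpp
#include <cmath>

#include <glog/logging.h>

double Ratio(int numerator, int denominator) {
  // CHECK documents and enforces the precondition even in release builds.
  CHECK_NE(denominator, 0) << "Ratio requires a non-zero denominator";
  const double result = static_cast<double>(numerator) / denominator;
  // DCHECK only fires in debug builds; prefer CHECK unless profiling shows the
  // release-build check causes a visible slowdown.
  DCHECK(std::isfinite(result));
  return result;
}
```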
#### Template Parameter Naming
Template parameter names should start with capital letter 'T' followed by a
short descriptive name. For example:
```cpp
template <typename TKey, typename TValue>
class KeyValueStore
```
## Code Formatting
You should install `clang-format` and run it on code you change or add. The
root of Memgraph's project contains the `.clang-format` file, which specifies
how formatting should behave. Running `clang-format -style=file` in the
project's root will read the file and behave as expected. For ease of use, you
should integrate formatting with your favourite editor.
The code formatting isn't enforced, because sometimes manual formatting may
produce better results. Still, running `clang-format` is strongly encouraged.
## Documentation
Besides following the comment guidelines from [Google Style
Guide](https://google.github.io/styleguide/cppguide.html#Comments), your
documentation of the public API should be
[Doxygen](https://github.com/doxygen/doxygen) compatible. For private parts of
the code or for comments accompanying the implementation, you are free to
break doxygen compatibility. In both cases, you should write your
documentation as full sentences, correctly written in English.
## Doxygen
To start a Doxygen compatible documentation string, you should open your
comment with either a JavaDoc style block comment (`/**`) or a line comment
containing 3 slashes (`///`). Take a look at the 2 examples below.
### Block Comment
```cpp
/**
 * One sentence, brief description.
 *
 * Long form description.
 */
```
### Line Comment
```cpp
/// One sentence, brief description.
///
/// Long form description.
```
If you only have a brief description, you may collapse the documentation into
a single line.
### Block Comment
```cpp
/** Brief description. */
```
### Line Comment
```cpp
/// Brief description.
```
Whichever style you choose, keep it consistent across the whole file.
Doxygen supports various commands in comments, such as `@file` and `@param`.
These help Doxygen to render specified things differently or to track them for
cross referencing. If you want to learn more, take a look at these two links:
* http://www.stack.nl/~dimitri/doxygen/manual/docblocks.html
* http://www.stack.nl/~dimitri/doxygen/manual/commands.html
## Examples
Below are a few examples of documentation from the codebase.
### Function
```cpp
/**
 * Removes whitespace characters from the start and from the end of a string.
 *
 * @param s String that is going to be trimmed.
 *
 * @return Trimmed string.
 */
inline std::string Trim(const std::string &s);
```
### Class
```cpp
/** Base class for logical operators.
 *
 * Each operator describes an operation, which is to be performed on the
 * database. Operators are iterated over using a @c Cursor. Various operators
 * can serve as inputs to others and thus a sequence of operations is formed.
 */
class LogicalOperator
    : public ::utils::Visitable<HierarchicalLogicalOperatorVisitor> {
 public:
  /** Constructs a @c Cursor which is used to run this operator.
   *
   * @param GraphDbAccessor Used to perform operations on the database.
   */
  virtual std::unique_ptr<Cursor> MakeCursor(GraphDbAccessor &db) const = 0;

  /** Return @c Symbol vector where the results will be stored.
   *
   * Currently, output symbols are only generated in the @c Produce operator.
   * @c Skip, @c Limit and @c OrderBy propagate the symbols from @c Produce (if
   * it exists as an input operator). In the future, we may want this method to
   * return the symbols that will be set in this operator.
   *
   * @param SymbolTable used to find symbols for expressions.
   * @return std::vector<Symbol> used for results.
   */
  virtual std::vector<Symbol> OutputSymbols(const SymbolTable &) const {
    return std::vector<Symbol>();
  }

  virtual ~LogicalOperator() {}
};
```
### File Header
```cpp
/// @file visitor.hpp
///
/// This file contains the generic implementation of visitor pattern.
///
/// There are 2 approaches to the pattern:
///
/// * classic visitor pattern using @c Accept and @c Visit methods, and
/// * hierarchical visitor which also uses @c PreVisit and @c PostVisit
/// methods.
///
/// Classic Visitor
/// ===============
///
/// Explanation on the classic visitor pattern can be found from many
/// sources, but here is the link to hopefully most easily accessible
/// information: https://en.wikipedia.org/wiki/Visitor_pattern
///
/// The idea behind the generic implementation of classic visitor pattern is to
/// allow returning any type via @c Accept and @c Visit methods. Traversing the
/// class hierarchy is relegated to the visitor classes. Therefore, visitor
/// should call @c Accept on children when visiting their parents. To implement
/// such a visitor refer to @c Visitor and @c Visitable classes.
///
/// Hierarchical Visitor
/// ====================
///
/// Unlike the classic visitor, the intent of this design is to allow the
/// visited structure itself to control the traversal. This way the internal
/// children structure of classes can remain private. On the other hand,
/// visitors may want to differentiate visiting composite types from leaf types.
/// Composite types are those which contain visitable children, unlike the leaf
/// nodes. Differentiation is accomplished by providing @c PreVisit and @c
/// PostVisit methods, which should be called inside @c Accept of composite
/// types. Regular @c Visit is only called inside @c Accept of leaf types.
/// To implement such a visitor refer to @c CompositeVisitor, @c LeafVisitor and
/// @c Visitable classes.
///
/// Implementation of hierarchical visiting is modelled after:
/// http://wiki.c2.com/?HierarchicalVisitorPattern
```

File diff suppressed because it is too large

View File

@ -1,15 +0,0 @@
# Other Code Conventions
While we are mainly programming in C++, we do use other programming languages
when appropriate. This chapter describes conventions for such code.
## Python
Code written in Python should adhere to
[PEP 8](https://www.python.org/dev/peps/pep-0008/). You should run `flake8` on
your code to automatically check compliance.
## Common Lisp
Code written in Common Lisp should adhere to
[Google Common Lisp Style](https://google.github.io/styleguide/lispguide.xml).

View File

@ -1 +0,0 @@
html/

View File

@ -1,34 +0,0 @@
# Query Parsing, Planning and Execution
This part of the documentation deals with query execution.
Memgraph currently supports only query interpretation. Each new query is
parsed, analysed and translated into a sequence of operations which are then
executed on the main database storage. Query execution is organized into the
following phases:
1. [Lexical Analysis (Tokenization)](parsing.md)
2. [Syntactic Analysis (Parsing)](parsing.md)
3. [Semantic Analysis and Symbol Generation](semantic.md)
4. [Logical Planning](planning.md)
5. [Logical Plan Execution](execution.md)
The main entry point is `Interpreter::operator()`, which takes a query text
string and produces a `Results` object. To instantiate the object,
`Interpreter` needs to perform the above steps from 1 to 4. If any of the
steps fail, a `QueryException` is thrown. The complete `LogicalPlan` is
wrapped into a `CachedPlan` and stored for reuse. This way we can skip the
whole process of analysing a query if it appears to be the same as before.
When we have a valid plan, the client code can invoke `Results::PullAll` with a
stream object. The `Results` instance will then execute the plan and fill the
stream with the obtained results.
Since we want to optionally run Memgraph as a distributed database, we have
hooks for creating a different plan of logical operators.
`DistributedInterpreter` inherits from `Interpreter` and overrides
`MakeLogicalPlan` method. This method needs to return a concrete instance of
`LogicalPlan`, and in case of distributed database that will be
`DistributedLogicalPlan`.
![Interpreter Class Diagram](interpreter-class.png)

View File

@ -1,373 +0,0 @@
# Logical Plan Execution
We implement classical iterator style operators. Logical operators define
operations on database. They encapsulate the following info: what the input is
(another `LogicalOperator`), what to do with the data, and how to do it.
Currently logical operators can have zero or more input operations, and thus a
`LogicalOperator` tree is formed. Most `LogicalOperator` types have only one
input, so we are mostly working with chains instead of full fledged trees.
You can find information on each operator in `src/query/plan/operator.lcp`.
## Cursor
Logical operators do not perform database work themselves. Instead they create
`Cursor` objects that do the actual work, based on the info in the operator.
Cursors expose a `Pull` method that gets called by the cursor's consumer. The
consumer keeps pulling as long as the `Pull` returns `true` (indicating it
successfully performed some work and might be eligible for another `Pull`).
Most cursors will call the `Pull` function of their input cursor, so
typically a cursor chain is created that is analogous to the logical operator
chain it's created from.
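The consumer side of this model looks roughly like the following sketch;
`Frame` and the `Pull` signature are simplified stand-ins for the real types:
```cpp
// Simplified stand-ins, only to show the consumer-driven Pull loop.
struct Frame {
  // Values indexed by symbols would live here.
};

struct Cursor {
  virtual ~Cursor() = default;
  // Returns true while another row's worth of values was placed on the frame.
  virtual bool Pull(Frame &frame) = 0;
};

inline void PullAll(Cursor &cursor, Frame &frame) {
  // The consumer keeps pulling for as long as the cursor reports progress.
  while (cursor.Pull(frame)) {
    // Consume the values currently on the frame (e.g. stream them out).
  }
}
```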
## Frame
The `Frame` object contains all the data of the current `Pull` chain. It
serves for communicating data between cursors.
For example, in a `MATCH (n) RETURN n` query the `ScanAllCursor` places a
vertex on the `Frame` for each `Pull`. It places it in the slot reserved for
the `n` symbol. Then the `ProduceCursor` can take that same value from the
`Frame` because it knows the appropriate symbol. `Frame` positions are indexed
by `Symbol` objects.
## ExpressionEvaluator
Expression results are not placed on the `Frame` since they do not need to be
communicated between different `Cursors`. Instead, expressions are evaluated
using an instance of `ExpressionEvaluator`. Since generally speaking an
expression can be defined by a tree of subexpressions, the
`ExpressionEvaluator` is implemented as a tree visitor. There is a performance
sub-optimality here because a stack is used to communicate intermediary
expression results between elements of the tree. This is one of the reasons
why it's planned to use `Frame` for intermediary expression results as well.
The other reason is that it might facilitate compilation later on.
## Cypher Execution Semantics
Cypher query execution has *mostly* well-defined semantics. Some are
explicitly defined by openCypher and its TCK, while others are implicitly
defined by Neo4j's implementation of Cypher that we want to be generally
compatible with.
These semantics can in short be described as follows: a Cypher query consists
of multiple clauses, some of which modify the graph. Generally, every clause in the
query, when reading it left to right, operates on a consistent state of the
property graph, untouched by subsequent clauses. This means that a `MATCH`
clause in the beginning operates on a graph-state in which modifications by
the subsequent `SET` are not visible.
The stated semantics feel very natural to the end-user, and Neo seems to
implement them well. For Memgraph the situation is complex because
`LogicalOperator` execution (through a `Cursor`) happens one `Pull` at a time
(generally meaning all the query clauses get executed for every top-level
`Pull`). This is not inherently consistent with Cypher semantics because a
`SET` clause can modify data, and the `MATCH` clause that precedes it might
see the modification in a subsequent `Pull`. Also, the `RETURN` clause might
want to stream results to the user before all `SET` clauses have been
executed, so the user might see some intermediate graph state. There are many
edge-cases that Memgraph does its best to avoid to stay true to Cypher
semantics, while at the same time using a high-performance streaming approach.
The edge-cases are enumerated in this document along with the implementation
details they imply.
## Implementation Peculiarities
### Once
An operator that does nothing but whose `Cursor::Pull` returns `true` on the
first `Pull` and `false` on subsequent ones. This operator is used when
another operator has an optional input, because in Cypher a clause will
typically execute once for every input from the preceding clauses, or just
once if there was no preceding input. For example, consider the `CREATE`
clause. In the query `CREATE (n)` only one node is created, while in the query
`MATCH (n) CREATE (m)` a node is created for each existing node. Thus in our
`CreateNode` logical operator the input is either a `ScanAll` operator, or a
`Once` operator.
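Behaviour-wise, `Once` boils down to something like this sketch (illustrative
only; the real cursor also receives a frame and an execution context):
```cpp
#include <utility>

class OnceCursor {
 public:
  // Reports success exactly once, then keeps returning false.
  bool Pull() { return !std::exchange(did_pull_, true); }

 private:
  bool did_pull_{false};
};
```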
### storage::View
In the previous section, [Cypher Execution
Semantics](#cypher-execution-semantics), we mentioned how the preceding
clauses should not see changes made in subsequent ones. For that reason, some
operators take a `storage::View` enum value. This value determines which state of
the graph an operator sees.
Consider the query `MATCH (n)--(m) WHERE n.x = 0 SET m.x = 1`. Naive streaming
could match a vertex `n` on the given criteria, expand to `m`, update its
property, and in the next iteration consider the vertex previously matched to
`m` and skip it because its newly set property value does not qualify. This
is not how Cypher works. To handle this issue properly, Memgraph designed the
`VertexAccessor` class that tracks two versions of data: one that was visible
before the current transaction+command, and the optional other that was
created in the current transaction+command. The `MATCH` clause will be planned
as `ScanAll` and `Expand` operations using `storage::View::OLD` value. This
will ensure modifications performed in the same query do not affect it. The
same applies to edges and the `EdgeAccessor` class.
### Existing Record Detection
It's possible that a pattern element has already been declared in the same
pattern, or a preceding pattern. For example `MATCH (n)--(m), (n)--(l)` or a
cycle-detection match `MATCH (n)-->(n) RETURN n`. Implementation-wise,
existing record detection just checks that the expanded record is equal to the
one already on the frame.
### Why Not Use Separate Expansion Ops for Edges and Vertices?
Expanding an edge and a vertex in separate ops is not feasible when matching a
cycle in bi-directional expansions. Consider the query `MATCH (n)--(n) RETURN
n`. Let's try to expand first the edge in one op, and vertex in the next. The
vertex expansion consumes the edge expansion input. It takes the expanded edge
from the frame. It needs to detect a cycle by comparing the vertex existing on
the frame with one of the edge vertices (`from` or `to`). But which one? It
doesn't know, and can't ensure correct cycle detection.
### Data Visibility During and After SET
In Cypher, setting values always works on the latest version of data (from
preceding or current clause). That means that within a `SET` clause all the
changes from previous clauses must be visible, as well as changes done by the
current `SET` clause. Also, if there is a clause after `SET` it must see *all*
the changes performed by the preceding `SET`. Both these things are best
illustrated with the following queries executed on an empty database:
CREATE (n:A {x:0})-[:EdgeType]->(m:B {x:0})
MATCH (n)--(m) SET m.x = n.x + 1 RETURN labels(n), n.x, labels(m), m.x
This returns:
+---------+---+---------+---+
|labels(n)|n.x|labels(m)|m.x|
+:=======:+:=:+:=======:+:=:+
|[A]      |2  |[B]      |1  |
+---------+---+---------+---+
|[B]      |1  |[A]      |2  |
+---------+---+---------+---+
The obtained result implies the following operations:
1. In the first iteration, we set the value of `B.x` to 1.
2. In the second iteration, we observe `B.x` with the value of 1 and set
`A.x` to 2.
3. In `RETURN` we see all the changes made in both iterations.
To implement the desired behavior, Memgraph utilizes two techniques. The first is
the already mentioned tracking of two versions of data in vertex accessors.
Using this approach ensures that the second iteration in the example query
sees the data modification performed by the preceding iteration. The second
technique is the `Accumulate` operation that accumulates all the iterations
from the preceding logical op before passing them to the next logical op. In
the example query, `Accumulate` ensures that the results returned to the user
reflect changes performed in all iterations of the query (naive streaming
could stream results at the end of first iteration producing inconsistent
results). Note that `Accumulate` is demanding regarding memory and slows down
query execution. For that reason it should be used only when necessary, for
example it does not have to be used in a query that has `MATCH` and `SET` but
no `RETURN`.
### Neo4j Inconsistency on Multiple SET Clauses
Considering the preceding example, it could be expected that when a query has
multiple `SET` clauses, all the changes from the preceding ones are visible.
This is not the case in Neo4j's implementation. Consider the following queries
executed on an empty database:
CREATE (n:A {x:0})-[:EdgeType]->(m:B {x:0})
MATCH (n)--(m) SET n.x = n.x + 1 SET m.x = m.x * 2
RETURN labels(n), n.x, labels(m), m.x
This returns:
+---------+---+---------+---+
|labels(n)|n.x|labels(m)|m.x|
+:=======:+:=:+:=======:+:=:+
|[A]      |2  |[B]      |1  |
+---------+---+---------+---+
|[B]      |1  |[A]      |2  |
+---------+---+---------+---+
If all the iterations of the first `SET` clause were executed before executing
the second, all the resulting values would be 2. This not being the case, we
conclude that Neo4j does not use a barrier-like mechanism between `SET`
clauses. It is Memgraph's current vision that this is inconsistent and we
plan to reduce Neo4j compliance in favour of operation consistency.
### Double Deletion
It's possible to match the same graph element multiple times in a single query
and delete it. Neo supports this, and so do we. The relevant implementation
detail is in the `GraphDbAccessor` class, where the record deletion functions
reside, and not in the logical plan execution. It comes down to checking if a
record has already been deleted in the current transaction+command and not
attempting to do it again (which would result in a crash).
### Set + Delete Edge-case
It's legal for a query to combine `SET` and `DELETE` clauses. Consider the
following queries executed on an empty database:
CREATE ()-[:T]->()
MATCH (n)--(m) SET n.x = 42 DETACH DELETE m
Due to the `MATCH` being undirected the second pull will attempt to set data
on a deleted vertex. This is not a legal operation in Memgraph's storage
implementation. For that reason the logical operator for `SET` must check if
the record it's trying to set something on has been deleted by the current
transaction+command. If so, the modification is not executed.
### Deletion Accumulation
Sometimes it's necessary to accumulate deletions of all the matches before
attempting to execute them. Consider the following. Start with an empty
database and execute these queries:
CREATE ()-[:T]->()-[:T]->()
MATCH (a)-[r1]-(b)-[r2]-(c) DELETE r1, b, c
Note that the `DELETE` clause attempts to delete node `c`, but it does not
detach it by deleting edge `r2`. However, due to the undirected edge in the
`MATCH`, both edges get pulled and deleted.
Currently Memgraph does not support this behavior, while Neo does. There are a few
ways that we could do this.
* Accumulate on deletion (that sucks because we have to keep track of
everything that gets returned after the deletion).
* Maybe we could stream through the deletion op, but defer actual deletion
until plan-execution end.
* Ignore this because it's very edgy (this is the currently selected option).
### Aggregation Without Input
It is necessary to define what aggregation ops return when they receive no
input. Following is a table that shows what Neo4j's Cypher implementation and
SQL produce.
+-------------+------------------------+---------------------+---------------------+------------------+
| \<OP\>      | 1. Cypher, no group-by | 2. Cypher, group-by | 3. SQL, no group-by | 4. SQL, group-by |
+=============+:======================:+:===================:+:===================:+:================:+
| Count(\*)   | 0                      | \<NO\_ROWS>         | 0                   | \<NO\_ROWS>      |
+-------------+------------------------+---------------------+---------------------+------------------+
| Count(prop) | 0                      | \<NO\_ROWS>         | 0                   | \<NO\_ROWS>      |
+-------------+------------------------+---------------------+---------------------+------------------+
| Sum         | 0                      | \<NO\_ROWS>         | NULL                | \<NO\_ROWS>      |
+-------------+------------------------+---------------------+---------------------+------------------+
| Avg         | NULL                   | \<NO\_ROWS>         | NULL                | \<NO\_ROWS>      |
+-------------+------------------------+---------------------+---------------------+------------------+
| Min         | NULL                   | \<NO\_ROWS>         | NULL                | \<NO\_ROWS>      |
+-------------+------------------------+---------------------+---------------------+------------------+
| Max         | NULL                   | \<NO\_ROWS>         | NULL                | \<NO\_ROWS>      |
+-------------+------------------------+---------------------+---------------------+------------------+
| Collect     | []                     | \<NO\_ROWS>         | N/A                 | N/A              |
+-------------+------------------------+---------------------+---------------------+------------------+
Where:
1. `MATCH (n) RETURN <OP>(n.prop)`
2. `MATCH (n) RETURN <OP>(n.prop), (n.prop2)`
3. `SELECT <OP>(prop) FROM Table`
4. `SELECT <OP>(prop), prop2 FROM Table GROUP BY prop2`
Neo's Cypher implementation diverges from SQL only when performing `SUM`.
Memgraph implements SQL-like behavior. It is considered that `SUM` of
arbitrary elements should not be implicitly 0, especially in a property graph
without a strict schema (the property in question can contain values of
arbitrary types, or no values at all).
### OrderBy
The `OrderBy` logical operator sorts the results in the desired order. It
occurs in Cypher as part of a `WITH` or `RETURN` clause. Both the concept and
the implementation are straightforward. It's necessary for the logical op to
`Pull` everything from its input so it can be sorted. It's not necessary to
keep the whole `Frame` state of each input; it is sufficient to keep a list of
`TypedValues` on which the results will be sorted, and another list of values
that need to be remembered and recreated on the `Frame` when yielding.
The sorting itself is made to reflect that of Neo's implementation, which comes
down to the following points (a sketch of such a comparator follows the list).
* `Null` comes last (as if it's greater than anything).
* Primitive types compare naturally, with no implicit casting except from
`int` to `double`.
* Complex types are not comparable.
* Every unsupported comparison results in an exception that gets propagated
to the end user.
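A self-contained sketch of a comparator implementing the points above;
Memgraph's real `TypedValue` comparison is richer, this only illustrates the
rules:
```cpp
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <string>
#include <variant>

// Illustrative value type, standing in for TypedValue.
using Value = std::variant<std::monostate /* Null */, int64_t, double, std::string>;

// Returns true if a sorts before b in ascending order.
bool SortsBefore(const Value &a, const Value &b) {
  // Null comes last, i.e. it is treated as greater than anything else.
  if (std::holds_alternative<std::monostate>(a)) return false;
  if (std::holds_alternative<std::monostate>(b)) return true;
  // Numeric types compare naturally, with implicit int to double casting.
  auto as_double = [](const Value &v) -> std::optional<double> {
    if (const auto *i = std::get_if<int64_t>(&v)) return static_cast<double>(*i);
    if (const auto *d = std::get_if<double>(&v)) return *d;
    return std::nullopt;
  };
  const auto x = as_double(a);
  const auto y = as_double(b);
  if (x && y) return *x < *y;
  // Strings compare with strings; every other combination is unsupported.
  const auto *s1 = std::get_if<std::string>(&a);
  const auto *s2 = std::get_if<std::string>(&b);
  if (s1 && s2) return *s1 < *s2;
  // Unsupported comparisons result in an exception propagated to the user.
  throw std::runtime_error("Unsupported comparison");
}
```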
### Limit in Write Queries
`Limit` can be used as part of a write query, in which case it will *not*
reduce the amount of performed updates. For example, consider a database that
has 10 vertices. The query `MATCH (n) SET n.x = 1 RETURN n LIMIT 3` will
result in all vertices having their property value changed, while returning
only the first three to the client. This makes sense from the implementation
standpoint, because `Accumulate` is planned after `SetProperty` but before
`Produce` and `Limit` operations. Note that this behavior can be
non-deterministic in some queries, since it relies on the order of iteration
over nodes which is undefined when not explicitly specified.
### Merge
`MERGE` in Cypher attempts to match a pattern. If it already exists, it does
nothing and subsequent clauses like `RETURN` can use the matched pattern
elements. If the pattern can't match to any data, it creates it. For detailed
information see Neo4j's [merge
documentation.](https://neo4j.com/docs/developer-manual/current/cypher/clauses/merge/)
An important thing about `MERGE` is visibility of modified data. `MERGE` takes
an input (typically a `MATCH`) and has two additional *phases*: the merging
part, and the subsequent set parts (`ON MATCH SET` and `ON CREATE SET`).
Analysis of Neo4j's behavior indicates that each of these three phases (input,
merge, set) does not see changes to the graph state done by a subsequent phase.
The input phase does not see data created by the merge phase, nor the set
phase. This is consistent with what seems like the general Cypher philosophy
that query clause effects aren't visible in the preceding clauses.
We define the `Merge` logical operator as a *routing* operator that uses three
logical operator branches.
1. The input from a preceding clause.
For example in `MATCH (n), (m) MERGE (n)-[:T]-(m)`. This input is
optional because `MERGE` is allowed to be the first clause in a query.
2. The `merge_match` branch.
This logical operator branch is `Pull`-ed from until exhausted for each
successful `Pull` from the input branch.
3. The `merge_create` branch.
This branch is `Pull`ed when the `merge_match` branch does not match
anything (no successful `Pull`s) for an input `Pull`. It is `Pull`ed only
once in such a situation, since only one creation needs to occur for a
failed match.
The `ON MATCH SET` and `ON CREATE SET` parts of the `MERGE` clause are
included in the `merge_match` and `merge_create` branches respectively. They
are placed on the end of their branches so that they execute only when those
branches succeed.
Memgraph strives to be consistent with Neo in its `MERGE` implementation,
while at the same time keeping performance as good as possible. Consistency
with Neo w.r.t. graph state visibility is not trivial. Documentation for
`Expand` and `Set` describes how Memgraph keeps track of both the updated
version of an edge/vertex and the old one, as it was before the current
transaction+command. This technique is also used in `Merge`. The input
phase/branch of `Merge` always looks at the old data. The merge phase needs to
see the new data so it doesn't create more data than necessary.
For example, consider the query.
MATCH (p:Person) MERGE (c:City {name: p.lives_in})
This query needs to create a city node only once for each unique `p.lives_in`.
Finally, the set phase of a `MERGE` clause should not affect the merge phase.
To achieve this, the `merge_match` branch of the `Merge` operator should see
the latest created nodes, but filter them on their old state (if those nodes
were not created by the `merge_create` branch). Implementation-wise that means
that `ScanAll` and `Expand` operators in the `merge_match` branch need to look
at the new graph state, while `Filter` operators look at the old one, if available.

View File

@ -1,23 +0,0 @@
digraph interpreter {
node [fontname="dejavusansmono"]
edge [fontname="dejavusansmono"]
node [shape=record]
edge [dir=back,arrowtail=empty,arrowsize=1.5]
Interpreter [label="{\N|+ operator(query : string, ...) : Results\l|
# MakeLogicalPlan(...) : LogicalPlan\l|
- plan_cache_ : Map(QueryHash, CachedPlan)\l}"]
Interpreter -> DistributedInterpreter
Results [label="{\N|+ PullAll(stream) : void\l|- plan_ : CachedPlan\l}"]
Interpreter -> Results
[dir=forward,style=dashed,arrowhead=open,label="<<create>>"]
CachedPlan -> Results
[dir=forward,arrowhead=odiamond,taillabel="1",headlabel="*"]
Interpreter -> CachedPlan [arrowtail=diamond,taillabel="1",headlabel="*"]
CachedPlan -> LogicalPlan [arrowtail=diamond]
LogicalPlan [label="{\N|+ GetRoot() : LogicalOperator
\l+ GetCost() : double\l}"]
LogicalPlan -> SingleNodeLogicalPlan [style=dashed]
LogicalPlan -> DistributedLogicalPlan [style=dashed]
DistributedInterpreter -> DistributedLogicalPlan
[dir=forward,style=dashed,arrowhead=open,label="<<create>>"]
}

Binary file not shown (removed image, 55 KiB).

View File

@ -1,62 +0,0 @@
# Lexical and Syntactic Analysis
## Antlr
We use Antlr for lexical and syntax analysis of Cypher queries. Antlr uses the
grammar file `Cypher.g4` downloaded from http://www.opencypher.org to generate
the parser and the visitor for the Cypher parse tree. Even though the provided
grammar is not very pleasant to work with, we decided not to make any drastic
changes to it so that our transition to newly published versions of
`Cypher.g4` would be easier. Nevertheless, we had to fix some bugs and add
features, so our version is not completely the same.
In addition to using `Cypher.g4`, we have `MemgraphCypher.g4`. This grammar
file defines Memgraph specific extensions to the original grammar. The most
notable example is the inclusion of syntax for handling authorization. At the
moment, some extensions are also found in `Cypher.g4`. For example, the syntax
for using a lambda function in relationship patterns. These extensions should
be moved out of `Cypher.g4`, so that it remains as close to the original
grammar as possible. Additionally, having `MemgraphCypher.g4` may not be
enough if we wish to split the functionality for community and enterprise
editions of Memgraph.
## Abstract Syntax Tree (AST)
Since the Antlr generated visitor and the official openCypher grammar are not very
practical to use, we translate Antlr's AST into our own AST. Currently there
are ~40 types of nodes in our AST. Their definitions can be found in
`src/query/frontend/ast/ast.lcp`.
Major groups of types can be found under the following base types.
* `Expression` --- types corresponding to Cypher expressions.
* `Clause` --- types corresponding to Cypher clauses.
* `PatternAtom` --- node or edge related information.
* `Query` --- different kinds of queries, allows extending the language with
Memgraph specific query syntax.
Memory management of created AST nodes is done with `AstStorage`. Each type
must be created by invoking the `AstStorage::Create` method. This way all of the
pointers to nodes and their children are raw pointers. The only owner of
allocated memory is the `AstStorage`. When the storage goes out of scope, the
pointers become invalid. It may be more natural to handle tree ownership via
`unique_ptr`, i.e. each node owns its children. But there are some benefits to
having a custom storage and allocation scheme.
The primary reason we opted for not using `unique_ptr` is the requirement of
Antlr's base visitor class that the resulting values must be copyable. The
result is wrapped in `antlr::Any` so that the derived visitor classes may
return any type they wish when visiting Antlr's AST. Unfortunately,
`antlr::Any` does not work with non-copyable types.
Another benefit of having `AstStorage` is that we can easily add a different
allocation scheme for AST nodes. The interface of node creation would not
change.
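A minimal, self-contained sketch of this ownership scheme (illustrative types,
not Memgraph's real AST classes):
```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct AstNode {
  virtual ~AstNode() = default;
};

struct Identifier : AstNode {
  explicit Identifier(std::string name) : name(std::move(name)) {}
  std::string name;
};

class AstStorage {
 public:
  // Every node is created through the storage, which is the sole owner.
  template <class TNode, class... TArgs>
  TNode *Create(TArgs &&...args) {
    nodes_.push_back(std::make_unique<TNode>(std::forward<TArgs>(args)...));
    return static_cast<TNode *>(nodes_.back().get());
  }

 private:
  std::vector<std::unique_ptr<AstNode>> nodes_;
};

int main() {
  AstStorage storage;
  // The raw pointer is valid only for as long as `storage` is alive.
  Identifier *ident = storage.Create<Identifier>("n");
  (void)ident;
}
```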
### AST Translation
The translation process is done via the `CypherMainVisitor` class, which is
derived from the Antlr generated visitor. Besides instantiating our AST types, a
minimal number of syntactic checks are done on a query. These checks handle
the cases which were valid in the original openCypher grammar, but may be invalid
when combined with other syntax elements.

View File

@ -1,526 +0,0 @@
# Logical Planning
After the semantic analysis and symbol generation, the AST is converted to a
tree of logical operators. This conversion is called *planning* and the tree
of logical operators is called a *plan*. The whole planning process is done in
the following steps.
1. [AST Preprocessing](#ast-preprocessing)
The first step is to preprocess the AST by collecting
information on filters, dividing the query into parts, normalizing patterns
in `MATCH` clauses, etc.
2. [Logical Operator Planning](#logical-operator-planning)
After the preprocess step, the planning can be done via 2 planners:
`VariableStartPlanner` and `RuleBasedPlanner`. The first planner will
generate multiple plans where each plan has different starting points for
searching the patterns in `MATCH` clauses. The second planner produces a
single plan by mapping the query parts as they are to logical operators.
3. [Logical Plan Postprocessing](#logical-plan-postprocessing)
In this stage, we perform various transformations on the generated logical
plan. Here we want to optimize the operations in order to improve
performance during the execution. Naturally, transformations need to
preserve the semantic behaviour of the original plan.
4. [Cost Estimation](#cost-estimation)
After the generation, the execution cost of each plan is estimated. This
estimation is used to select the best plan which will be executed.
The implementation can be found in the `query/plan` directory, with the public
entry point being `query/plan/planner.hpp`.
## AST Preprocessing
Each openCypher query consists of at least 1 **single query**. Multiple single
queries are chained together using a **query combinator**. Currently, there is
only one combinator, `UNION`. The preprocessing step starts in the
`CollectQueryParts` function. This function will take a look at each single
query and divide it into parts. Each part is separated with `RETURN` and
`WITH` clauses. For example:
```
MATCH (n) CREATE (m) WITH m MATCH (l)-[]-(m) RETURN l
|                         |                         |
|------- part 1 ----------+-------- part 2 ---------|
|                                                   |
|------------------ single query -------------------|
```
Each part is created by collecting all `MATCH` clauses and *normalizing* their
patterns. Pattern normalization is the process of converting an arbitrarily
long pattern chain of nodes and edges into a list of triplets `(start node,
edge, end node)`. The triplets should preserve the semantics of the match. For
example:
MATCH (a)-[p]-(b)-[q]-(c)-[r]-(d)
is equivalent to:
MATCH (a)-[p]-(b), (b)-[q]-(c), (c)-[r]-(d)
With this representation, it becomes easier to reorder the triplets and choose
different strategies for pattern matching.
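Conceptually, the normalized form is just a list of such triplets, as in this
illustrative sketch (not the planner's actual types):
```cpp
#include <string>
#include <vector>

struct Triplet {
  std::string start_node;
  std::string edge;
  std::string end_node;
};

// MATCH (a)-[p]-(b)-[q]-(c)-[r]-(d) normalized into triplets:
const std::vector<Triplet> normalized = {
    {"a", "p", "b"},
    {"b", "q", "c"},
    {"c", "r", "d"},
};
```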
In addition to normalizing patterns, all of the filter expressions in patterns
and inside of the `WHERE` clause (of the accompanying `MATCH`) are extracted
and stored separately. During the extraction, symbols used in the filter
expression are collected. This allows for planning filters in a valid order,
as the matching for triplets is being done. Another important benefit of
having extra information on filters, is to recognize when a database index
could be used.
After each `MATCH` is processed, they are all grouped, so that even the whole
`MATCH` clauses may be reordered. The important thing is to remember which
symbols were used to name edges in each `MATCH`. With those symbols we can
plan for *cyphermorphism*, i.e. ensure different edges in the search pattern
of a single `MATCH` map to different edges in the graph. This preserves the
semantics of the query, even though we may have reordered the matching. The
same steps are done for `OPTIONAL MATCH`.
Another clause which needs processing is `MERGE`. Here we normalize the
pattern, since the `MERGE` is a bit like `MATCH` and `CREATE` in one.
All the other clauses are left as is.
In the end, each query part consists of:
* processed and grouped `MATCH` clauses;
* processed and grouped `OPTIONAL MATCH` clauses;
* processed `MERGE` matching pattern and
* unchanged remaining clauses.
The last stored clause is guaranteed to be either `WITH` or `RETURN`.
## Logical Operator Planning
### Variable Start Planner
The `VariableStartPlanner` generates multiple plans for a single query. Each
plan is generated by selecting a different starting point for pattern
matching.
The algorithm works as follows.
1. For each query part:
1. For each node in triplets of collected `MATCH` clauses:
i. Add the node to a set of `expanded` nodes
ii. Select a triplet `(start node, edge, end node)` whose `start node` is
in the `expanded` set
iii. If no triplet was selected, choose a new starting node that isn't in
`expanded` and continue expanding
iv. Repeat steps ii. -- iii. until all triplets have been selected
and store that as a variation of the `MATCH` clauses
2. Do step 1.1. for `OPTIONAL MATCH` and `MERGE` clauses
3. Take all combinations of the generated `MATCH`, `OPTIONAL MATCH` and
`MERGE` and store them as variations of the query part.
2. For each combination of query part variations:
1. Generate a plan using the rule based planner
### Rule Based Planner
The `RuleBasedPlanner` generates a single plan for a single query. A plan is
generated by following hardcoded rules for producing logical operators. The
following sections are an overview on how each openCypher clause is converted
to a `LogicalOperator`.
#### MATCH
The `MATCH` clause is used to specify which patterns need to be searched for in
the database. These patterns are normalized in the preprocess step to be
represented as triplets `(start node, edge, end node)`. When there is no edge,
then the triplet is reduced only to the `start node`. Generating the operators
is done by looping over these triplets.
##### Searching for Nodes
The simplest search is finding standalone nodes. For example, `MATCH (n)`
will find all the nodes in the graph. This is accomplished by generating a
`ScanAll` operator and forwarding the node symbol which should store the
results. In this case, all the nodes will be referenced by `n`.
Multiple nodes can be specified in a single match, e.g. `MATCH (n), (m)`.
Planning is done by repeating the same steps for each sub pattern (separated
by a comma). In this case, we would get 2 `ScanAll` operators chained one
after the other. An optimization can be obtained if the node in the pattern is
already searched for. In `MATCH (n), (n)` we can drop the second `ScanAll`
operator since we have already generated it for the first node.
##### Searching for Relationships
A more advanced search includes finding nodes with relationships. For example,
`MATCH (n)-[r]-(m)` should find every pair of connected nodes in the database.
This means that if a single node has multiple connections, it will be
repeated for each combination of pairs. The generation of operators starts
from the first node in the pattern. If we are referencing a new starting node,
we need to generate a `ScanAll` which finds all the nodes and stores them
into `n`. Then, we generate an `Expand` operator which reads the `n` and
traverses all the edges of that node. The edge is stored into `r`, while the
destination node is stored in `m`.
Matching multiple relationships proceeds similarly, by repeating the same
steps. The only difference is that we need to ensure that different edges in the
search pattern map to different edges in the graph. This means that after each
`Expand` operator, we need to generate an `EdgeUniquenessFilter`. We provide
this operator with a list of symbols for the previously matched edges and the
symbol for the current edge.
For example.
MATCH (n)-[r1]-(m)-[r2]-(l)
The above is preprocessed into
MATCH (n)-[r1]-(m), (m)-[r2]-(l)
Then we look at each triplet in order and perform the described steps. This
way, we would generate:
ScanAll (n) > Expand (n, r1, m) > Expand (m, r2, l) >
EdgeUniquenessFilter ([r1], r2)
Note that we don't need to make `EdgeUniquenessFilter` after the first
`Expand`, since there are no edges to compare to. This filtering needs to work
across multiple patterns, but inside a *single* `MATCH` clause.
Let's take a look at the following.
MATCH (n)-[r1]-(m), (m)-[r2]-(l)
We would also generate the exact same operators.
ScanAll (n) > Expand (n, r1, m) > Expand (m, r2, l) >
EdgeUniquenessFilter ([r1], r2)
On the other hand,
MATCH (n)-[r1]-(m) MATCH (m)-[r2]-(l)-[r3]-(i)
would reset the uniqueness filtering at the start of the second match. This
would mean that we output the following:
ScanAll (n) > Expand (n, r1, m) > Expand (m, r2, l) > Expand (l, r3, i) >
EdgeUniquenessFilter ([r2], r3)
There is a difference in how we handle edge uniqueness compared to Neo4j.
Neo4j does not allow searching for a single edge multiple times, but we've
decided to support that.
For example, the user can say the following.
MATCH (n)-[r]-(m)-[r]-(l)
We would ensure that both `r` variables match to the same edge. In our
terminology, we call this the *edge cycle*. For the above example, we would
generate this plan.
ScanAll (n) > Expand (n, r, m) > Expand (m, r, l)
We do not put an `EdgeUniquenessFilter` operator between the 2 `Expand`
operators; instead, we tell the 2nd `Expand` that it is an edge cycle. The 2nd
`Expand` will then ensure that both occurrences of `r` match the same edge.
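The two checks discussed above can be sketched as follows, using a hypothetical `Edge` stand-in; the real operators work with frames and symbols rather than bare helper functions.
```cpp
#include <cstdint>
#include <vector>

// Illustrative stand-in for a matched edge; the real accessors are richer.
struct Edge {
  int64_t id;
  bool operator==(const Edge &other) const { return id == other.id; }
};

// EdgeUniquenessFilter semantics: a frame passes only if the current edge
// differs from every previously matched edge whose symbol it was given.
bool PassesUniquenessFilter(const Edge &current,
                            const std::vector<Edge> &previous) {
  for (const auto &edge : previous)
    if (edge == current) return false;
  return true;
}

// Edge-cycle semantics in the 2nd Expand: keep only the expansion whose edge
// is the one already bound to the same variable.
bool MatchesEdgeCycle(const Edge &candidate, const Edge &already_bound) {
  return candidate == already_bound;
}
```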
##### Filtering
To narrow the search down, the patterns in `MATCH` can have filtered labels
and properties. A more general filtering is done using the accompanying
`WHERE` clause. During the preprocess step, all filters are collected and
extracted into expressions. Additional information on which symbols are used
is also stored. This way, each time we generate a `ScanAll` or `Expand`, we
look at all the filters to see if any of them can be used, i.e. whether the
symbols they use have been bound by a newly produced operator. If a filter expression
can be used, we immediately add a `Filter` operator with that expression.
For example.
MATCH (n)-[r]-(m :label) WHERE n.prop = 42
We would produce:
ScanAll (n) > Filter (n.prop) > Expand (n, r, m) > Filter (m :label)
This means that the same plan is generated for the query:
MATCH (n {prop: 42})-[r]-(m :label)
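The check which decides whether a collected filter can be planned might be sketched like this; `Symbol`, `FilterInfo` and `CanApplyFilter` are simplified placeholders, not the actual planner types.
```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified bookkeeping for a collected filter.
using Symbol = std::string;

struct FilterInfo {
  std::vector<Symbol> used_symbols;  // symbols referenced by the expression
};

// A collected filter can be planned as soon as every symbol it uses has been
// bound by an already generated ScanAll or Expand.
bool CanApplyFilter(const std::set<Symbol> &bound_symbols,
                    const FilterInfo &filter) {
  for (const auto &symbol : filter.used_symbols)
    if (bound_symbols.count(symbol) == 0) return false;
  return true;
}
```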
#### OPTIONAL
If a `MATCH` clause is preceded by `OPTIONAL`, then we need to generate a plan
such that we produce results even if we fail to match anything. This is
accomplished by generating an `Optional` operator, which takes 2 operator
trees:
* input operation and
* optional operation.
The input is the operation we generated for the part of the query before
`OPTIONAL MATCH`. For the optional operation, we simply generate the `OPTIONAL
MATCH` part just like we would for regular `MATCH`. In addition to operations,
we need to send the symbols which are set during optional matching to the
`Optional` operator. The operator will reset values of those symbols to
`null`, when the optional part fails to match.
#### RETURN & WITH
`RETURN` and `WITH` clauses are very similar to each other. The only
difference is that `WITH` separates parts of the query and can be paired with
a `WHERE` clause.
The common part is generating operators for the body of the clause. Separation
of query parts is mostly done in semantic analysis, which checks that only the
symbols exposed through `WITH` are visible in the query parts after the
clause. The minor part is done in planning.
##### Named Results
Both clauses contain multiple named expressions (`expr AS name`) which are
used to generate `Produce` operator.
##### Aggregations
If an expression contains an aggregation operator (`sum`, `avg`, ...) we need
to plan the `Aggregate` operator as input to `Produce`. This case is more
complex, because aggregation in openCypher can perform implicit grouping of
results used for aggregation.
For example, `WITH/RETURN sum(n.x) AS s, n.y AS group` will implicitly group
by `n.y` expression.
Another, obscure grouping can be achieved with `RETURN sum(n.a) + n.b AS s`.
Here, the `n.b` will be used for grouping, even though both the `sum` and
`n.b` are in the same named expression.
Therefore, we need to collect all expressions which do not contain
aggregations and use them for grouping. You may have noticed that in the last
example `sum` is actually a sub-expression of `+`. `Aggregate` operator does
not see that (nor should it), so the responsibility of evaluating that falls
on `Produce`. One way is for `Aggregate` to store results of grouping
expressions on the frame in addition to aggregation results. Unfortunately,
this would require rewiring named expressions in `Produce` to reference
already evaluated expressions. In the current implementation, we opted for
`Aggregate` to store only aggregation results on the frame, while `Produce`
will re-evaluate all the other (grouping) expressions. To handle that, symbols
which are used in expressions are passed to `Aggregate`, so that they can be
remembered. `Produce` will read those symbols from the frame and use them to
re-evaluate the needed expressions.
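As a toy illustration of this split for `WITH sum(n.x) AS s, n.y AS group`, consider the sketch below; `Frame`, `Symbol`, `Vertex` and `Value` are simplified stand-ins for the real classes.
```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-ins for the real Symbol, Frame and vertex types.
using Symbol = std::string;
struct Vertex {
  std::map<std::string, double> props;
};
struct Value {
  double number = 0;
  Vertex vertex;
};
using Frame = std::map<Symbol, Value>;

// Aggregate stores only the aggregation result under `sum_symbol`; Produce
// reads it back and re-evaluates the grouping expression `n.y` through the
// remembered `n_symbol` instead of reading a cached grouping value.
std::pair<double, double> ProduceRow(const Frame &frame,
                                     const Symbol &sum_symbol,
                                     const Symbol &n_symbol) {
  double sum = frame.at(sum_symbol).number;               // written by Aggregate
  double group = frame.at(n_symbol).vertex.props.at("y"); // re-evaluated here
  return {sum, group};
}
```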
##### Accumulation
After we have `Produce` and potentially `Aggregate`, we need to handle a
special case when the part of the query before `RETURN` or `WITH` performs
updates. For that, we want to run that part of the query fully, so that we get
the latest results. This is accomplished by adding `Accumulate` operator as
input to `Aggregate` or `Produce` (if there is no aggregation). Accumulation
will store all the values for all the used symbols inside `RETURN` and `WITH`,
so that they can be used in the operator which follows. This way, only parts
of the frame are copied, instead of the whole frame. There is a minor
difference when planning `WITH` compared to `RETURN`. Since `WITH` can
separate writing from reading, we need to advance the transaction command.
This enables the later, read parts of the query to obtain the newest changes.
This is supported by passing `advance_command` flag to `Accumulate` operator.
In the simplest case, common to both clauses, we have `Accumulate > Aggregate
> Produce` operators, where `Accumulate` and `Aggregate` may be left out.
##### Ordering
Planning `ORDER BY` is simple enough. Since it may see new symbols (filled in
`Produce`), we add the `OrderBy` operator at the end. The operator will change
the order of produced results, so we pass it the ordering expressions and the
output symbols of named expressions.
##### Filtering
A final difference in `WITH`, is when it contains a `WHERE` clause. For that,
we simply generate the `Filter` operator, appended after `Produce` or
`OrderBy` (depending which operator is last).
##### Skipping and Limiting
If we have `SKIP` or `LIMIT`, we generate `Skip` or `Limit` operators,
respectively. These operators are put at the end of the clause.
This placement may have some unexpected behaviour when combined with
operations that update the graph. For example.
MATCH (n) SET n.x = n.x + 1 RETURN n LIMIT 1
The above query may be interpreted as if the `SET` will be done only once.
Since this is a write query, we need to accumulate results, so the part before
`RETURN` will execute completely. The accumulated results will be yielded up
to the given limit, and the user would get only the first `n` that was
updated. This may confuse the user because in reality, every node in the
database had been updated.
Note that `Skip` always comes before `Limit`. In the current implementation,
they are generated directly one after the other.
#### CREATE
`CREATE` clause is used to create nodes and edges (relationships).
For multiple `CREATE` clauses, or multiple creation patterns in a single
clause, we repeat the following steps for each of them.
##### Creating a Single Node
A node is created by simply specifying a node pattern.
For example `CREATE (n :label {property: "value"}), ()` would create 2 nodes.
The 1st one would be created with a label and a property. This node could be
referenced later in the query, by using the variable `n`. The 2nd node cannot
be referenced and it would be created without any labels or properties. For
node creation, we generate a `CreateNode` operator and pass it all the details
of node creation: variable symbol, labels and properties. In the mentioned
example, we would have `CreateNode > CreateNode`.
##### Creating a Relationship
To create a relationship, the `CREATE` clause must contain a pattern with a
directed edge. Compared to creating a single node, this case is a bit more
complicated, because either side of the edge may not exist. By exist, we mean
that the endpoint is a variable which already references a node.
For example, `MATCH (n) CREATE (n)-[r]->(m)` would create an edge `r` and a
node `m` for each matched node `n`. If we focus on the `CREATE` part, we
generate `CreateExpand (n, r, m)` where `n` already exists (refers to matched
node) and `m` would be newly created along with edge `r`. If we had only
`CREATE (n)-[r]->(m)`, then we would need to create both nodes of the edge
`r`. This is done by generating `CreateNode (n) > CreateExpand(n, r, m)`. The
final case is when both endpoints refer to an existing node. For example, when
adding a node with a cyclical connection `CREATE (n)-[r]->(n)`. In this case,
we would generate `CreateNode (n) > CreateExpand (n, r, n)`. We would tell
`CreateExpand` to only create the edge `r` between the already created `n`.
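The per-edge decision above can be summarized with the following sketch, which only produces operator names as strings to show the three cases; the real planner, of course, emits operator objects.
```cpp
#include <set>
#include <string>
#include <vector>

// Returns the operators that would be emitted for a single CREATE edge
// triplet `(start)-[edge]->(end)`, given the symbols bound so far.
std::vector<std::string> PlanCreateEdge(std::set<std::string> &bound,
                                        const std::string &start,
                                        const std::string &edge,
                                        const std::string &end) {
  std::vector<std::string> ops;
  if (bound.count(start) == 0) {
    // The start node is new, so it needs its own CreateNode first.
    ops.push_back("CreateNode (" + start + ")");
    bound.insert(start);
  }
  // CreateExpand always creates the edge, and also the end node unless that
  // endpoint is already bound.
  ops.push_back("CreateExpand (" + start + ", " + edge + ", " + end + ")");
  bound.insert(end);
  return ops;
}
```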
#### MERGE
Although the merge operation is complex, planning turns out to be relatively
simple. The pattern inside the `MERGE` clause is used for both matching and
creating. Therefore, we create 2 operator trees, one for each action.
For example.
MERGE (n)-[r:r]-(m)
We would generate a single `Merge` operator which has the following.
* No input operation (since it is not preceded by any other clause).
* On match operation
`ScanAll (n) > Expand (n, r, m) > Filter (r)`
* On create operation
`CreateNode (n) > CreateExpand (n, r, m)`
In cases when `MERGE` contains `ON MATCH` and `ON CREATE` parts, we simply
append their operations to the respective operator trees.
Observe the following example.
MERGE (n)-[r:r]-(m) ON MATCH SET n.x = 42 ON CREATE SET m :label
The `Merge` would be generated with the following.
* No input operation (again, since there is no clause preceding it).
* On match operation
`ScanAll (n) > Expand (n, r, m) > Filter (r) > SetProperty (n.x, 42)`
* On create operation
`CreateNode (n) > CreateExpand (n, r, m) > SetLabels (n, :label)`
When we have preceding clauses, we simply put their operator as input to
`Merge`.
MATCH (n) MERGE (n)-[r:r]-(m)
The above would be generated as
ScanAll (n) > Merge (on_match_operation, on_create_operation)
Here we need to be careful to recognize which symbols are already declared.
But, since the `on_match_operation` uses the same algorithm for generating a
`Match`, that problem is handled there. The same should hold for
`on_create_operation`, which uses the process of generating a `Create`. So,
finally for this example, the `Merge` would have:
* Input operation
`ScanAll (n)`
* On match operation
`Expand (n, r, m) > Filter (r)`
Note that `ScanAll` is not needed since we get the nodes from input.
* On create operation
`CreateExpand (n, r, m)`
Note that `CreateNode` is dropped, since we want to expand the existing one.
## Logical Plan Postprocessing
Postprocessing of a logical plan is done by rewriting the original plan into
a more efficient one while preserving the original semantics of the operations.
The rewriters are found in the `query/plan/rewrite` directory, and currently we
only have one -- `IndexLookupRewriter`.
### IndexLookupRewriter
The job of this rewriter is to merge `Filter` and `ScanAll` operations into
equivalent `ScanAllBy<Index>` operations. In almost all cases using indexed
lookup will be faster than regular lookup, so `IndexLookupRewriter` simply
does the transformations whenever possible. The simplest case being the
following, assuming we have an index over `id`.
* Original Plan
`ScanAll (n) > Filter (id(n) == 42) > Produce (n)`
* Rewritten Plan
`ScanAllById (n, id=42) > Produce (n)`
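A rough sketch of the shape of this rewrite, using heavily simplified, hypothetical operator structs rather than the real `LogicalOperator` classes:
```cpp
#include <cstdint>
#include <memory>
#include <optional>
#include <string>

// Hypothetical, heavily simplified operator descriptions for illustration.
struct ScanAll {
  std::string symbol;
};
struct Filter {
  std::shared_ptr<ScanAll> input;
  std::string filtered_symbol, property;
  int64_t literal_value;
};
struct ScanAllByProperty {
  std::string symbol, property;
  int64_t value;
};

// If the Filter sits directly on top of a ScanAll of the same symbol and an
// index over `property` exists, the pair collapses into a single indexed scan.
std::optional<ScanAllByProperty> TryRewrite(const Filter &filter,
                                            bool index_exists) {
  if (!index_exists || !filter.input ||
      filter.input->symbol != filter.filtered_symbol)
    return std::nullopt;
  return ScanAllByProperty{filter.filtered_symbol, filter.property,
                           filter.literal_value};
}
```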
Naturally, there are some cases we need to be careful about.
1. Operators with Multiple Branches
Here we may not carry `Filter` operations outside of the operator into
its branches, so the branches are rewritten as standalone plans with a
brand new `IndexLookupRewriter`. Some of the operators with multiple
branches are `Merge`, `Optional`, `Cartesian` and `Union`.
2. Expand Operators
Expand operations aren't that tricky to handle, but they have a special
case where we want to use an indexed lookup of the destination so that the
expansion is performed between known nodes. This decision may depend on
various parameters which may need further tweaking as we encounter more
use-cases of Cypher queries.
## Cost Estimation
Cost estimation is the final step of processing a logical plan. The
implementation can be found in `query/plan/cost_estimator.hpp`. We give each
operator a cost based on the estimated cardinality of results of that operator
and on the preset coefficient of the runtime performance of that operator.
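As an illustration, the estimate for a `ScanAll` might be computed along these lines; the structure and the coefficient value here are assumptions, not the actual `cost_estimator.hpp` code.
```cpp
// Walking the plan bottom-up, each operator scales the incoming cardinality
// and adds (cardinality * preset coefficient) to the running cost.
struct CostEstimate {
  double cardinality = 1.0;
  double cost = 0.0;
};

CostEstimate EstimateScanAll(const CostEstimate &input, double vertex_count) {
  constexpr double kScanAllCoefficient = 1.0;  // assumed preset coefficient
  CostEstimate output;
  output.cardinality = input.cardinality * vertex_count;
  output.cost = input.cost + output.cardinality * kScanAllCoefficient;
  return output;
}
```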
This scheme is rather simple and works quite well, but there are a couple of
improvements we may want to make at some point.
* Track more information about the stored graph and use that to improve the
estimates.
* Do a quick, partial run of the plan and tweak the estimation based on how
many results each operator produced. This may require us to have some kind
of representative subset of the stored graph.
* Write micro benchmarks for each operator and based on the results create
sensible preset coefficients. This would replace the current coefficients
which are just assumptions on how each operator implementation performs.

View File

@ -1,134 +0,0 @@
# Semantic Analysis and Symbol Generation
In this phase, various semantic and variable type checks are performed.
Additionally, we generate symbols which map AST nodes to stored values
computed from evaluated expressions.
## Symbol Generation
Implementation can be found in `query/frontend/semantic/symbol_generator.cpp`.
Symbols are generated for each AST node that represents data that needs to
have storage. Currently, these are:
* `NamedExpression`
* `CypherUnion`
* `Identifier`
* `Aggregation`
You may notice that the above AST nodes may not correspond to something named
by a user. For example, `Aggregation` can be a part of a larger expression and
thus remain unnamed. The reason we still generate symbols is to have a uniform
behaviour when executing a query as well as allow for caching the results of
expression evaluation.
AST nodes do not actually store a `Symbol` instance; instead, they have an
`int32_t` index identifying the symbol in the `SymbolTable` class. This is
done to minimize the size of AST types as well as to allow easier sharing of
the same symbols among multiple instances of AST nodes.
The storage for evaluated data is represented by the `Frame` class. Each
symbol determines a unique position in the frame. During interpretation,
evaluation of expressions which have a symbol will either read or store values
in the frame. For example, an instance of `Identifier` will use the symbol to
find and read the value from `Frame`. On the other hand, `NamedExpression`
will take the result of evaluating its own expression and store it in the
`Frame`.
When a symbol is created, context of creation is used to assign a type to that
symbol. This type is used for simple type checking operations. For example,
`MATCH (n)` will create a symbol for variable `n`. Since the `MATCH (n)`
represents finding a vertex in the graph, we can set `Symbol::Type::Vertex`
for that symbol. Later, for example in `MATCH ()-[n]-()` we see that variable
`n` is used as an edge. Since we already have a symbol for that variable, we
detect this type mismatch and raise a `SemanticException`.
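The following sketch shows these relationships with heavily simplified types; the real `Symbol`, `SymbolTable`, `Frame` and `SemanticException` classes differ (a plain `std::runtime_error` stands in for the exception here).
```cpp
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

enum class SymbolType { Any, Vertex, Edge };

struct Symbol {
  std::string name;
  int32_t position;  // slot in the Frame
  SymbolType type = SymbolType::Any;
};

class SymbolTable {
 public:
  int32_t CreateSymbol(const std::string &name, SymbolType type) {
    symbols_.push_back({name, static_cast<int32_t>(symbols_.size()), type});
    return symbols_.back().position;  // AST nodes keep only this index
  }
  Symbol &at(int32_t index) { return symbols_[index]; }

 private:
  std::vector<Symbol> symbols_;
};

// The type check: reusing a Vertex symbol as an Edge raises an error.
void CheckOrSetType(Symbol &symbol, SymbolType used_as) {
  if (symbol.type == SymbolType::Any) {
    symbol.type = used_as;
  } else if (symbol.type != used_as) {
    throw std::runtime_error("Type mismatch for symbol " + symbol.name);
  }
}

struct TypedValue { /* evaluated expression result */ };
// One Frame slot per symbol; operators read and write via Symbol::position.
using Frame = std::vector<TypedValue>;
```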
The basic rule of symbol generation is that variables inside `MATCH`, `CREATE`,
`MERGE`, `WITH ... AS` and `RETURN ... AS` clauses establish new symbols.
### Symbols in Patterns
Inside `MATCH`, symbols are created only if they didn't exist before. For
example, patterns in `MATCH (n {a: 5})--(m {b: 5}) RETURN n, m` will create 2
symbols: one for `n` and one for `m`. `RETURN` clause will, in turn, reference
those symbols. Symbols established in a part of pattern are immediately bound
and visible in later parts. For example, `MATCH (n)--(n)` will create a symbol
for variable `n` for the 1st `(n)`. That symbol is referenced in the 2nd `(n)`.
Note that the symbol is not bound inside the 1st `(n)` itself. What this means is that,
for example, `MATCH (n {a: n.b})` should raise an error, because `n` is not
yet bound when encountering `n.b`. On the other hand,
`MATCH (n)--(n {a: n.b})` is fine.
The `CREATE` is similar to `MATCH`, but it *always* establishes symbols for
variables which create graph elements. What this means is that, for example
`MATCH (n) CREATE (n)` is not allowed. `CREATE` wants to create a new node,
for which we already have a symbol. In such a case, we need to throw an error
that the variable `n` is being redeclared. On the other hand `MATCH (n) CREATE
(n)-[r :r]->(n)` is fine, because `CREATE` will only create the edge `r`,
connecting the already existing node `n`. Remaining behaviour is the same as
in `MATCH`. This means that we can simplify `CREATE` to be like `MATCH` with 2
special cases.
1. Are we creating a node, i.e. `CREATE (n)`? If yes, then the symbol for
`n` must not have been created before. Otherwise, we reference the
existing symbol.
2. Are we creating an edge, i.e. we encounter a variable for an edge inside
`CREATE`? If yes, then that variable must not reference a symbol.
The `MERGE` clause is treated the same as `CREATE` with regards to symbol
generation. The only difference is that we allow bidirectional edges in the
pattern. When creating such a pattern, the direction of the created edge is
arbitrarily determined.
### Symbols in WITH and RETURN
In addition to patterns, new symbols are established in the `WITH` clause.
This clause makes the new symbols visible *only* to the rest of the query.
For example, `MATCH (old) WITH old AS new RETURN new, old` should raise an
error that `old` is unbound inside `RETURN`.
There is a special case with symbol visibility in `WHERE` and `ORDER BY`. They
need to see both the old and the new symbols. Therefore `MATCH (old) RETURN
old AS new ORDER BY old.prop` needs to work. On the other hand, if we perform
aggregations inside `WITH` or `RETURN`, then the old symbols should not be
visible in either `WHERE` or `ORDER BY`. Since the aggregation has to go
through all the results in order to generate the final value, it makes no
sense to store old symbols and their values. A query like `MATCH (old) WITH
SUM(old.prop) AS sum WHERE old.prop = 42 RETURN sum` needs to raise an error
that `old` is unbound inside `WHERE`.
For cases when `SKIP` and `LIMIT` appear, we disallow any identifiers from
appearing in their expressions. Basically, `SKIP` and `LIMIT` can only be
constant expressions[^1]. For example, `MATCH (old) RETURN old AS new SKIP
new.prop` needs to raise that variables are not allowed in `SKIP`. It makes no
sense to allow variables, since their values may vary on each iteration. On
the other hand, we could support variables bound to constant expressions, but for
simplicity we do not. For example, `MATCH (old) RETURN old, 2 AS limit_var
LIMIT limit_var` would still throw an error.
Finally, we generate symbols for names created in `RETURN` clause. These
symbols are used for the final results of a query.
NOTE: New symbols in `WITH` and `RETURN` should be unique. This means that
`WITH a AS same, b AS same` is not allowed, and neither is a construct like
`RETURN 2, 2`.
### Symbols in Functions which Establish New Scope
Symbols can also be created in some functions. These functions usually take an
expression, bind a single variable and run the expression inside the newly
established scope.
The `all` function takes a list, creates a variable for each list element and runs
the predicate expression. For example:
MATCH (n) RETURN n, all(n IN n.prop_list WHERE n < 42)
We create a new symbol for use inside `all`. This means that the `WHERE n <
42` uses the `n` which takes values from the elements of `n.prop_list`. The
original `n` bound by `MATCH` is not visible inside the `all` function, but it
is visible outside. Therefore, the `RETURN n` and `n.prop_list` reference the
`n` from `MATCH`.
[^1]: Constant expressions are expressions for which the result can be
computed at compile time.

View File

@ -1,107 +0,0 @@
# Quick Start
A short chapter on downloading the Memgraph source, compiling and running.
## Obtaining the Source Code
Memgraph uses `git` for source version control. You will need to install `git`
on your machine before you can download the source code.
On Debian systems, you can do it inside a terminal with the following
command:
apt install git
After installing `git`, you are now ready to fetch your own copy of Memgraph
source code. Run the following command:
git clone https://github.com/memgraph/memgraph.git
The above will create a `memgraph` directory and put all source code there.
## Compiling Memgraph
With the source code, you are now ready to compile Memgraph. Well... Not
quite. You'll need to download Memgraph's dependencies first.
In your terminal, position yourself in the obtained memgraph directory.
cd memgraph
### Installing Dependencies
Dependencies that are required by the codebase should be checked by running the
`init` script:
./init
If the script fails, dependency installation scripts can be found under
`environment/os/`. The directory contains a dependency management script for
each supported operating system. E.g. if your system is **Debian 10**, run the
following to install all required build packages:
./environment/os/debian-10.sh install MEMGRAPH_BUILD_DEPS
Once everything is installed, rerun the `init` script.
Once the `init` script is successfully finished, issue the following commands:
mkdir -p build
./libs/setup.sh
### Compiling
Memgraph is compiled using our own custom toolchain that can be obtained from
the toolchain repository. You should read the `environment/README.txt` file
in the repository and install the appropriate toolchain for your distribution.
After you have installed the toolchain you should read the instructions for the
toolchain in the toolchain install directory (`/opt/toolchain-vXYZ/README.md`)
and install dependencies that are necessary to run the toolchain.
When you want to compile Memgraph you should activate the toolchain using the
prepared toolchain activation script that is also described in the toolchain
`README`.
NOTE: You **must** activate the toolchain every time you want to compile
Memgraph!
You should now activate the toolchain in your console.
source /opt/toolchain-vXYZ/activate
With all of the dependencies installed and the build environment set up, you
need to configure the build system. To do that, execute the following:
cd build
cmake ..
If everything went OK, you can now, finally, compile Memgraph.
make -j$(nproc)
### Running
After the compilation verify that Memgraph works:
./memgraph --version
To make extra sure, run the unit tests:
ctest -R unit -j$(nproc)
## Problems
If you have any trouble running the above commands, contact your nearest
developer who successfully built Memgraph. Ask for help and insist on getting
this document updated with correct steps!
## Next Steps
Familiarise yourself with our code conventions and guidelines:
* [C++ Code](cpp-code-conventions.md)
* [Other Code](other-code-conventions.md)
* [Code Review Guidelines](code-review.md)
Take a look at the list of [required reading](required-reading.md) for
brushing up on technical skills.

View File

@ -1,129 +0,0 @@
# Required Reading
This chapter lists a few books that should be read by everyone working on
Memgraph. Since Memgraph is developed primarily with C++, Python and Common
Lisp, books are oriented around those languages. Of course, there are plenty
of general books which will help you improve your technical skills (such as
"The Pragmatic Programmer", "The Mythical Man-Month", etc.), but they are not
listed here. This way the list should be kept short and the *required* part in
"Required Reading" more easily honored.
Some of these books you may find in our office, so feel free to pick them up.
If any are missing and you would like a physical copy, don't be afraid to
request the book for our office shelves.
Besides reading, don't get stuck in a rut and be a
[Blub Programmer](http://www.paulgraham.com/avg.html).
## Effective C++ by Scott Meyers
Required for C++ developers.
The book is a must-read as it explains most common gotchas of using C++. After
reading this book, you are good to write competent C++ which will pass code
reviews easily.
## Effective Modern C++ by Scott Meyers
Required for C++ developers.
This is a continuation of the previous book; it covers updates to C++ that
came with C++11 and later. The book isn't as imperative as the previous one,
but it will make you aware of modern features we are using in our codebase.
## Practical Common Lisp by Peter Siebel
Required for Common Lisp developers.
Free: http://www.gigamonkeys.com/book/
We use Common Lisp to generate C++ code and make our lives easier.
Unfortunately, not many developers are familiar with the language. This book
will make you familiar very quickly as it has tons of very practical
exercises. E.g. implementing a unit testing library, a serialization library and
bundling all that to create an mp3 music server.
## Effective Python by Brett Slatkin
(Almost) required reading for Python developers.
Why the "almost"? Well, Python is relatively easy to pick up and you will
probably learn all the gotchas during code review from someone more
experienced. This makes the book less necessary for a newcomer to Memgraph,
but the book is not advanced enough to delegate it to
[Advanced Reading](#advanced-reading). The book is written in similar vein as
the "Effective C++" ones and will make you familiar with nifty Python features
that make everyone's lives easier.
# Advanced Reading
The books listed below are not required reading, but you may want to read them
at some point when you feel comfortable enough.
## Design Patterns by Gamma et. al.
Recommended for C++ developers.
This book is highly divisive because it introduced a culture centered around
design patterns. The main issue is the overuse of patterns, which complicates the
code. This has made many Java programs serve as examples of highly
complicated, "enterprise" code.
Unfortunately, design patterns are pretty much stand-ins for missing
language features. This is most evident in dynamic languages such as Python
and Lisp, as demonstrated by
[Peter Norvig](http://www.norvig.com/design-patterns/).
Or as [Paul Graham](http://www.paulgraham.com/icad.html) put it:
```
This practice is not only common, but institutionalized. For example, in the
OO world you hear a good deal about "patterns". I wonder if these patterns are
not sometimes evidence of case (c), the human compiler, at work. When I see
patterns in my programs, I consider it a sign of trouble. The shape of a
program should reflect only the problem it needs to solve. Any other
regularity in the code is a sign, to me at least, that I'm using abstractions
that aren't powerful enough-- often that I'm generating by hand the expansions
of some macro that I need to write
```
After presenting the book so negatively, why should you even read it then?
Well, it is good to be aware of those design patterns and use them when
appropriate. They can improve modularity and reuse of the code. You will also
find examples of such patterns in our code, primarily Strategy and Visitor
patterns. The book is also a good stepping stone to more advanced reading
about software design.
## Modern C++ Design by Andrei Alexandrescu
Recommended for C++ developers.
This book can be treated as a continuation of the previous "Design Patterns"
book. It introduced "dark arts of template meta-programming" to the world.
Many of the patterns are converted to use C++ templates which makes them even
better for reuse. But, like the previous book, there are downsides if used too
much. You should approach it with a critical eye and it will help you
understand ideas that are used in some parts of our codebase.
## Large Scale C++ Software Design by John Lakos
Recommended for C++ developers.
An old book, but well worth the read. Lakos presents a very pragmatic view of
writing modular software and how it affects both development time as well as
program runtime. Some things are outdated or controversial, but it will help
you understand how the whole C++ process of working in a large team, compiling
and linking affects development.
## On Lisp by Paul Graham
Recommended for Common Lisp developers.
Free: http://www.paulgraham.com/onlisp.html
An excellent continuation to "Practical Common Lisp". It starts off slow, as if
introducing the language, but very quickly picks up speed. The main meat of
the book is macros and their uses. From using macros to define cooperative
concurrency to including Prolog as if it's part of Common Lisp. The book will
help you understand more advanced macros that are occasionally used in our
Lisp C++ Preprocessor (LCP).

View File

@ -1,110 +0,0 @@
# DatabaseAccessor
A `DatabaseAccessor` actually wraps transactional access to database
data for a single transaction. In that sense the naming is bad. It
encapsulates references to the database and the transaction object.
It contains logic for working with database content (graph element
data) in the context of a single transaction. All CRUD operations are
performed within a single transaction (as Memgraph is a transactional
database), and therefore iteration over data, finding a specific graph
element, etc. are all functionalities of a `GraphDbAccessor`.
In single-node Memgraph the database accessor also defined the lifetime
of a transaction. Even though a `Transaction` object was owned by the
transactional engine, it was the `GraphDbAccessor`'s lifetime that the object
was bound to (the transaction was implicitly aborted in
`GraphDbAccessor`'s destructor, if it was not explicitly ended before
that).
# RecordAccessor
It is important to understand data organization and access in the
storage layer. This discussion pertains to vertices and edges as graph
elements that the end client works with.
Memgraph uses MVCC (documented on its own page). This means that for
each graph element there could be different versions visible to
different currently executing transactions. When we talk about a
`Vertex` or `Edge` as a data structure we typically mean one of those
versions. In code this semantic is implemented so that both those classes
inherit `mvcc::Record`, which in turn inherits `mvcc::Version`.
Handling MVCC and visibility is not in itself trivial. In addition to that,
there is other bookkeeping to be performed when working with data. For
that reason, Memgraph uses "accessors" to define an API of working with
data in a safe way. Most of the code in Memgraph (for example the
interpretation code) should work with accessors. There is a
`RecordAccessor` as a base class for `VertexAccessor` and
`EdgeAccessor`. Following is an enumeration of their purpose.
### Data Access
The client interacts with Memgraph using the Cypher query language. That
language has certain semantics which imply that multiple versions of the
data need to be visible during the execution of a single query. For
example: expansion over the graph is always done over the graph state as
it was at the beginning of the transaction.
The `RecordAccessor` exposes functions to switch between the old and the new
versions of the same graph element (intelligently named `SwitchOld` and
`SwitchNew`) within a single transaction. In that way the client code
(mostly the interpreter) can avoid dealing with the underlying MVCC
version concepts.
### Updates
Data updates are also done through accessors. Meaning: there are methods
on the accessors that modify data; the client code should almost never
interact directly with `Vertex` or `Edge` objects.
The accessor layer takes care of creating versions in the MVCC layer and
performing updates on the appropriate versions.
Next, for many kinds of updates it is necessary to update the relevant
indexes. There are implicit indexes for vertex labels, as
well as user-created indexes for (label, property) pairs. The accessor
layer takes care of updating the indexes when these values are changed.
Each update also triggers a log statement in the write-ahead log. This
is also handled by the accessor layer.
### Distributed
In distributed Memgraph accessors also contain a lot of the remote graph
element handling logic. More info on that is available in the
documentation for distributed.
### Deferred MVCC Data Lookup for Edges
Vertices and edges are versioned using MVCC. This means that for each
transaction an MVCC lookup needs to be done to determine which version
is visible to that transaction. This tends to slow things down due to
cache invalidations (version lists and versions are stored in arbitrary
locations on the heap).
However, for edges, only the properties are mutable. The edge endpoints
and type are fixed once the edge is created. For that reason both edge
endpoints and type are available in vertex data, so that when expanding
it is not mandatory to do MVCC lookups of versioned, mutable data. This
logic is implemented in `RecordAccessor` and `EdgeAccessor`.
### Exposure
The original idea and implementation of graph element accessors was that
they'd prevent client code from ever interacting with raw `Vertex` or
`Edge` data. This however turned out to be impractical when implementing
distributed Memgraph and the raw data members have since been exposed
(through getters to old and new version pointers). However, refrain from
working with that data directly whenever possible! Always consider the
accessors to be the first go-to for interacting with data, especially
when in the context of a transaction.
# Skiplist Accessor
The term "accessor" is also used in the context of a skiplist. Every
operation on a skiplist must be performed within an
accessor. The skiplist ensures that there will be no physical deletions
of an object during the lifetime of an accessor. This mechanism is used
to ensure deletion correctness in a highly concurrent container.
We only mention that here to avoid confusion regarding terminology.

View File

@ -1,6 +0,0 @@
# Storage v1
* [Accessors](accessors.md)
* [Indexes](indexes.md)
* [Property Storage](property-storage.md)
* [Durability](durability.md)

View File

@ -1,80 +0,0 @@
# Durability
## Write-ahead Logging
Typically WAL denotes the process of writing a "log" of database
operations (state changes) to persistent storage before committing the
transaction, thus ensuring that the state can be recovered (in the case
of a crash) for all the transactions which the database committed.
The WAL is a fine-grained durability format. Its primary purpose is to store
database changes fast; it is not meant to provide space-efficient storage, nor
to support fast recovery. For that reason
it's often used in combination with a different persistence mechanism
(in Memgraph's case the "snapshot") that has complementary
characteristics.
### Guarantees
Ensuring that the log is written before the transaction is committed can
slow down the database. For that reason this guarantee is most often
configurable in databases.
Memgraph offers two options for the WAL. The default option, where the WAL is
flushed to the disk periodically and transactions do not wait for this to
complete, introduces the risk of database inconsistency because an operating
system or hardware crash might lead to missing transactions in the WAL. Memgraph
will handle this as if those transactions never happened. The second option,
called synchronous commit, will instruct Memgraph to flush the WAL to the disk
when a transaction completes, and the transaction will wait
for the flush to finish. This option can be turned on with the
`--synchronous-commit` command line flag.
### Format
The WAL file contains a series of DB state changes called `StateDelta`s.
Each of them describes what the state change is and in which transaction
it happened. Also, some kinds of meta-information needed to ensure proper
state recovery are recorded (transaction beginnings and commits/aborts).
The following is guaranteed w.r.t. `StateDelta` ordering in
a single WAL file:
- For two ops in the same transaction, if op A happened before B in the
database, that ordering is preserved in the log.
- Transaction begin/commit/abort messages also appear in exactly the
same order as they were executed in the transactional engine.
### Recovery
The database can recover from the WAL on startup. This works in
conjunction with snapshot recovery. The database attempts to recover from
the latest snapshot and then apply as much as possible from the WAL
files. Only those transactions that were not recovered from the snapshot
are recovered from the WAL, for speed efficiency. It is possible (but
inefficient) to recover the database from WAL only, provided all the WAL
files created from DB start are available. It is not possible to recover
partial database state (i.e. from some suffix of WAL files, without the
preceding snapshot).
## Snapshots
A "snapshot" is a record of the current database state stored in permanent
storage. Note that the term "snapshot" is used also in the context of
the transaction engine to denote a set of running transactions.
A snapshot is written to the file by Memgraph periodically if so
configured. The snapshot creation process is done within a transaction created
specifically for that purpose. The transaction is needed to ensure that
the stored state is internally consistent.
The database state can be recovered from the snapshot during startup, if
so configured. This recovery works in conjunction with write-ahead log
recovery.
A single snapshot contains all the data needed to recover a database. In
that sense snapshots are independent of each other and old snapshots can
be deleted once the new ones are safely stored, if it is not necessary
to revert the database to some older state.
The exact format of the snapshot file is defined inline in the snapshot
creation code.

View File

@ -1,116 +0,0 @@
# Label Indexes
These are unsorted indexes that contain all the vertices that have the label
the indexes are for (one index per label). These kinds of indexes get
automatically generated for each label used in the database.
### Updating the Indexes
Whenever something gets added to a record, we update the index (add that
record to the index). We keep an index which might contain garbage (irrelevant
records, e.g. because the value got removed), but we filter it out when
querying the index. We do it like this for two reasons. First, we avoid the
bookkeeping needed to decide whether to update the index at the end of the
transaction (commit/abort phase). Second, the current interpreter advances the
command within a transaction and as such assumes that the indexes already
contain objects added in the previous command of this transaction, so we need
to update over the whole scope of the transaction (whenever something is added
to the record).
### Index Entries Label
These kinds of indexes internally keep track of (record, vlist) pairs.
Why do we need to keep track of exactly those two things?
Problems with the two different approaches:
1) Keep track of just the record:
- We need the `VersionList` for creating an accessor (this in itself is a
deal-breaker).
- Semantically it makes sense. An edge/vertex maps bijectively to a
`VersionList`.
- We might try to access some members of record while the record is being
modified from another thread.
- A vertex/edge could get updated, thus expiring the record in the index.
The newly created record should be present in the index, but it's not.
Without the `VersionList` we can't reach the newly created record.
- Probably there are even more reasons... It should be obvious by now that
we need the `VersionList` in the index.
2) Keep track of just the version list:
- Removing from an index is a problem for two major reasons. First, if we
only have the `VersionList`, checking if it should be removed implies
checking all the reachable records, which is not thread-safe. Second,
there are issues with concurrent removal and insertion. The cleanup thread
could determine the vertex/edge should be removed from the index and
remove it, while in between those ops another thread attempts to insert
the `VersionList` into the index. The insertion does nothing because the
`VersionList` is already in, but it gets removed immediately after.
Because of the inability to keep track of just the record or just the version
list, we need to keep track of both of them. The problems mentioned above are
resolved, in the same order, with the (record, vlist) pair as follows:
- simple `vlist.find(current transaction)` will get us the newest visible
record
- we'll never try to access some record if it's still being written since we
will always operate on vlist.find returned record
- newest record will contain that label
- since we have the (record, vlist) pair as the key in the index, when we
update and delete at the same time we will never delete the same (record,
vlist) pair we are adding, because the pair we are deleting is already
superseded by a newer record and as such won't be inserted while it's being
deleted
### Querying the Index
We run through the index for the given label, do a `vlist.find` operation for
the current transaction, and check if the newest returned record has that
label. If it does, we return it. By now you are probably wondering: aren't we
sometimes returning duplicate vlist entries? And you are wondering correctly,
we would be. However, we make sure that the entries in the index are sorted by
their `vlist*`, so we can filter out consecutive duplicate `vlist*` entries and
return only one of them, while still being able to create an iterator over the
index.
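A minimal sketch of that consecutive-duplicate filtering, using placeholder types instead of the real skiplist entries:
```cpp
#include <vector>

// Illustrative only: the real index stores skiplist entries keyed by the
// (record, vlist) pair, not plain structs.
struct Record;
struct VersionList;

struct IndexEntry {
  const Record *record;
  const VersionList *vlist;
};

// Entries are ordered by their vlist pointer, so a scan only needs to compare
// each entry with the previously returned one to skip duplicates.
std::vector<const VersionList *> DistinctVlists(
    const std::vector<IndexEntry> &entries) {
  std::vector<const VersionList *> result;
  for (const auto &entry : entries) {
    if (result.empty() || result.back() != entry.vlist)
      result.push_back(entry.vlist);
  }
  return result;
}
```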
### Cleaning the Index
Cleaning the index is not as straightforward as it seems, as a lot of garbage
can accumulate, but it's hard to know when exactly we can delete some (record,
vlist) pair. First, let's assume that we are doing the cleaning process at
some `transaction_id`, `id` such that there doesn't exist an active transaction
with an id lower than `id`.
We scan through the whole index, and for each (record, vlist) pair we first
check if it was deleted before `id` (i.e. no transaction with an id >= `id`
will ever again see that record). If it was, we might naively say that it's
safe to delete it. However, we must take into account that when some new
record is created from this record (an update operation), that new record
still contains the label, but by deleting this pair we would lose the vlist
from the index, because the new record won't be added to the index again (we
didn't explicitly add that label to it again).
Because of this we have to 'update' this index (record, vlist) pair. We have
to update the record to now point to a newer record in vlist, the one that is
not deleted yet. We can do that by querying the `version_list` for the last
record inside (oldest it has &mdash; remember that `mvcc_gc` will re-link not
visible records so the last record will be visible for the current GC id).
When updating the record inside the index, it's not okay to just update the
pointer and leave the index as it is, because with updating the `record*` we
might change the relative order of entries inside the index. We first have to
re-insert it with new `record*`, and then delete the old entry. And we need to
do insertion before the remove operation! Otherwise it could happen that the
vlist with a newer record with that label won't exist while some transaction
is querying the index.
Records which we added as a consequence of deleting older records will
eventually be removed from the index if they don't contain the label, because
if we see that a record is not deleted we check whether it still contains the
label. We also need to be careful here, because we can't perform that check
while the record is potentially being updated by some transaction (race
condition). We can only check whether a record still contains the label if its
creation id is smaller than our `id`, as that implies that the creating
transaction either aborted or committed, since our `id` is equal to the oldest
active transaction at the time the GC started.

View File

@ -1,131 +0,0 @@
# Property Storage
Although the reader is probably familiar with properties in *Memgraph*, let's
briefly recap.
Both vertices and edges can store an arbitrary number of properties. Properties
are, in essence, ordered pairs of property names and property values. Each
property name within a single graph element (edge/node) can store a single
property value. Property names are represented as strings, while property values
must be one of the following types:
Type | Description
-----------|------------
`Null` | Denotes that the property has no value. This is the same as if the property does not exist.
`String` | A character string, i.e. text.
`Boolean` | A boolean value, either `true` or `false`.
`Integer` | An integer number.
`Float` | A floating-point number, i.e. a real number.
`List` | A list containing any number of property values of any supported type. It can be used to store multiple values under a single property name.
`Map` | A mapping of string keys to values of any supported type.
Property values are modeled in a class conveniently called `PropertyValue`.
## Mapping Between Property Names and Property Keys
Although users think of property names in terms of descriptive strings
(e.g. "location" or "department"), *Memgraph* internally converts those names
into property keys which are, essentially, unsigned 16-bit integers.
Property keys are modelled by a not-so-conveniently named class called
`Property` which can be found in `storage/types.hpp`. The actual conversion
between property names and property keys is done within the `ConcurrentIdMapper`
but the internals of that implementation are out of scope for understanding
property storage.
## PropertyValueStore
Both `Edge` and `Vertex` objects contain an instance of `PropertyValueStore`
object which is responsible for storing properties of a corresponding graph
element.
An interface of `PropertyValueStore` is as follows:
Method | Description
-----------|------------
`at` | Returns the `PropertyValue` for a given `Property` (key).
`set` | Stores a given `PropertyValue` under a given `Property` (key).
`erase` | Deletes a given `Property` (key) alongside its corresponding `PropertyValue`.
`clear` | Clears the storage.
`iterator` | Provides an extension of `std::input_iterator` that iterates over storage.
## Storage Location
By default, *Memgraph* is an in-memory database and all properties are therefore
stored in working memory unless specified otherwise by the user. The user has
an option to specify via the command line which properties they wish to be stored
on disk.
Storage location of each property is encapsulated within a `Property` object
which is ensured by the `ConcurrentIdMapper`. More precisely, the unsigned 16-bit
property key has the following format:
```
|---location--|------id------|
|-Memory|Disk-|-----2^15-----|
```
In other words, the most significant bit determines the location where the
property will be stored.
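Decoding that layout could look like the following sketch; which value of the top bit denotes disk storage is an assumption made purely for illustration.
```cpp
#include <cstdint>

// The top bit selects the storage location, the remaining 15 bits are the id.
constexpr uint16_t kLocationBit = 1u << 15;

bool IsStoredOnDisk(uint16_t property_key) {
  return (property_key & kLocationBit) != 0;  // assumed: set bit means disk
}

uint16_t PropertyId(uint16_t property_key) {
  return property_key & static_cast<uint16_t>(~kLocationBit);
}
```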
### In-memory Storage
The underlying implementation of in-memory storage for the time being is
`std::vector<std::pair<Property, PropertyValue>>`. Implementations of `at`, `set`
and `erase` are linear in time. This implementation is arguably more efficient
than `std::map` or `std::unordered_map` when the average number of properties of
a record is relatively small (up to 10) which seems to be the case.
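A simplified version of such a vector-backed store is sketched below; the real `PropertyValueStore` works with the actual `Property` and `PropertyValue` classes and differs in details such as value ownership.
```cpp
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Simplified stand-ins for the real key and value types.
using Property = uint16_t;
using PropertyValue = double;

class TinyPropertyStore {
 public:
  // Linear scan, fast enough for the ~10 properties a record typically has.
  std::optional<PropertyValue> at(Property key) const {
    for (const auto &kv : props_)
      if (kv.first == key) return kv.second;
    return std::nullopt;
  }

  void set(Property key, PropertyValue value) {
    for (auto &kv : props_)
      if (kv.first == key) {
        kv.second = value;
        return;
      }
    props_.emplace_back(key, value);
  }

  void erase(Property key) {
    for (auto it = props_.begin(); it != props_.end(); ++it)
      if (it->first == key) {
        props_.erase(it);
        return;
      }
  }

 private:
  std::vector<std::pair<Property, PropertyValue>> props_;
};
```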
### On-disk Storage
#### KVStore
Disk storage is modeled by an abstraction of key-value storage as implemented in
`storage/kvstore.hpp`. An interface of this abstraction is as follows:
Method | Description
----------------|------------
`Put` | Stores the given value under the given key.
`Get` | Obtains the given value stored under the given key.
`Delete` | Deletes a given (key, value) pair from storage.
`DeletePrefix` | Deletes all (key, value) pairs where key begins with a given prefix.
`Size` | Returns the size of the storage or, optionally, the number of stored pairs that begin with a given prefix.
`iterator` | Provides an extension of `std::input_iterator` that iterates over storage.
Keys and values in this context are of type `std::string`.
The actual underlying implementation of this abstraction uses
[RocksDB](https://rocksdb.org) &mdash; a persistent key-value store for fast
storage.
It is worth noting that the custom iterator implementation allows the user
to iterate over a given prefix. Otherwise, the implementation follows familiar
C++ constructs and can be used as follows:
```
KVStore storage = ...;
for (auto it = storage.begin(); it != storage.end(); ++it) {}
for (auto kv : storage) {}
for (auto it = storage.begin("prefix"); it != storage.end("prefix"); ++it) {}
```
Note that it is not possible to scan over multiple prefixes. For instance, one
might assume that you can scan over all keys that fall in a certain
lexicographical range. Unfortunately, that is not the case and running the
following code will result in an infinite loop with a touch of undefined
behavior.
```
KVStore storage = ...;
for (auto it = storage.begin("alpha"); it != storage.end("omega"); ++it) {}
```
#### Data Organization on Disk
Each `PropertyValueStore` instance can access a static `KVStore` object that can
store `(key, value)` pairs on disk. The key of each property on disk consists of
two parts &mdash; a unique identifier (unsigned 64-bit integer) of the current
record version (see mvcc documentation for further clarification) and a
property key as described above. The actual value of the property is serialized
into a bytestring using bolt `BaseEncoder`. Similarly, deserialization is
performed by bolt `Decoder`.

View File

@ -1,3 +0,0 @@
# Storage v2
TODO(gitbuda): Write documentation.

View File

@ -1,166 +0,0 @@
# Memgraph Workflow
This chapter describes the usual workflow for working on Memgraph.
## Git
Memgraph uses [git](https://git-scm.com/) for source version control. If you
obtained the source, you probably already have it installed. Before you can
track new changes, you need to setup some basic information.
First, tell git your name:
git config --global user.name "FirstName LastName"
Then, set your Memgraph email:
git config --global user.email "my.email@memgraph.com"
Finally, make git aware of your favourite editor:
git config --global core.editor "vim"
## Github
All of the code in Memgraph needs to go through code review before it can be
accepted in the codebase. This is done through [Github](https://github.com/).
You should already have it installed if you followed the steps in [Quick
Start](quick-start.md).
## Working on Your Feature Branch
Git has a concept of source code **branches**. The `master` branch contains all
of the changes which were reviewed and accepted in Memgraph's code base. The
`master` branch is selected by default.
### Creating a Branch
When working on a new feature or fixing a bug, you should create a new branch
out of the `master` branch. There are two branch types, **epic** and **task**
branches. The epic branch is created when introducing a new feature or any work
unit requiring more than one commit. More commits are required to split the
work into chunks, making it easier to review code or find a bug (in each
commit, there could be various problems, e.g., related to performance or
concurrency issues, which are the hardest to track down). Each commit on the
master or epic branch should be a compilable and well-documented set of
changes. Task branches should be created when a smaller work unit has to be
integrated into the codebase. The task branch could be branched out of the
master or an epic branch. We manage epics and tasks on the project management
tool called [Airtable](https://airtable.com/tblTUqycq8sHTTkBF). Each epic is
prefixed by `Exyz-MG`, while each task has a `Tabcd-MG` prefix.
Examples on how to create branches follow:
```
git checkout master
git checkout -b T0025-MG-fix-a-problem
...
git checkout master
git checkout -b E025-MG-huge-feature
...
git checkout E025-MG-huge-feature
git checkout -b T0123-MG-add-feature-part
```
Note that a branch is created from the currently selected branch. So, if you
wish to create another branch from `master` you need to switch to `master`
first.
### Making and Committing Changes
When you have a branch for your new addition, you can now actually start
implementing it. After some amount of time, you may have created new files,
modified others and maybe even deleted unused files. You need to tell git to
track those changes. This is accomplished with `git add` and `git rm`
commands.
git add path-to-new-file path-to-modified-file
git rm path-to-deleted-file
To check that everything is correctly tracked, you may use the `git status`
command. It will also print the name of the currently selected branch.
If everything seems OK, you should commit these changes to git.
git commit
You will be presented with an editor where you need to type the commit
message. Writing a good commit message is an art in itself. You should take a
look at the links below. We try to follow these conventions as much as
possible.
* [How to Write a Git Commit Message](http://chris.beams.io/posts/git-commit/)
* [A Note About Git Commit Messages](http://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html)
* [stopwritingramblingcommitmessages](http://stopwritingramblingcommitmessages.com/)
### Sending Changes on a Review
After finishing your work on your feature branch, you will want to send it on
code review. This is done by pushing the branch to Github and creating a pull
request. You can find all PRs
[here](https://github.com/memgraph/memgraph/pulls).
### Code Integration
When working, you have to integrate changes into your work or push your work
to make it available to others. To pull changes into a local `branch`, usually run
the following:
git checkout {{branch}}
git pull origin {{branch}}
To push your changes, usually run the following:
git checkout {{branch}}
git push origin {{branch}}
Sometimes, things could get a little bit more complicated. The diagram below shows
which git operation should be performed when a piece of code has to be integrated
from one branch to another. Note, `main_branch` is the **master** branch in our
case.
```
|<---------------------------|
| squash merge |
|--------------------------->|
| merge |
| |
|<-----------|<--------------|
| merge | squash merge |
| | |
|----------->|-------------->|
| rebase | merge |
| | rebase --onto |
| | |
main_branch epic_branch task_branch
```
There are a couple of cases:
* If a code has to be integrated from a task branch to the main branch, use
**squash merge**. While you were working on a task, you probably committed a
couple of cleanup commits that are not relevant to the main branch. In the
other direction, while integrating the main branch to a task branch, the
**regular merge** is ok because changes from the task branch will later be
squash merged.
* You should use **squash merge** when integrating changes from task to epic
branch (task might have irrelevant commits). On the other hand, you should
use a **regular merge** when an epic is completed and has to be integrated into
the main branch. An epic is a more significant piece of work, split into
compilable and testable commits. All these commits should be preserved to be
able to find potential issues later on.
* You should use **rebase** when integrating changes from main to an epic
branch. The epic branch has to be as clean as possible, avoid pure merge
commits. Once you rebase epic on main, all commits on the epic branch will
change the hashes. The implications are: 1) you have to force push your local
branch to the origin, 2) if you made a task branch out of the epic branch, you
would have to use **rebase --onto** (please refer to `git help rebase` for
details). In simple cases, **regular merge** should be sufficient to integrate
changes from epic to a task branch (that can even be done via GitHub web
interface).
During any code integration, you may get reports that some files have
conflicting changes. If you need help resolving them, don't be afraid to ask
around! After you've resolved them, mark them as done with the `git add` command.
You may then continue with `git {{action}} --continue`.

View File

@ -1,185 +0,0 @@
# Python 3 Query Modules
## Introduction
Memgraph exposes a C API for writing the so called Query Modules. These
modules contain definitions of procedures which can be invoked through the
query language using the `CALL ... YIELD ...` syntax. This mechanism allows
database users to extend Memgraph with their own algorithms and
functionalities.
Using a low level language like C can be quite cumbersome for writing modules,
so it seems natural to add support for a higher level language on top of the
existing C API.
There are languages written exactly for this purpose of extending C with high
level constructs, for example Lua and Guile. Instead of those, we have chosen
Python 3 to be the first high level language we will support, the primary reason
being that it's very popular, so more people should be able to write modules.
Another benefit of Python which comes out of its popularity is the large
ecosystem of libraries, especially graph algorithm related ones like NetworkX.
Python does have significant performance and implementation downsides compared
to Lua and Guile, but these are described in more detail later in this
document.
## Python 3 API Overview
The Python 3 API should be as user friendly as possible as well as look
Pythonic. This implies that some functions from the C API will not map to the
exact same functions. The most obvious case for a Pythonic approach is
registering procedures of a query module. Let's take a look at the C example
and its transformation to Python.
```c
static void procedure(const struct mgp_list *args,
const struct mgp_graph *graph, struct mgp_result *result,
struct mgp_memory *memory);
int mgp_init_module(struct mgp_module *module, struct mgp_memory *memory) {
struct mgp_proc *proc =
mgp_module_add_read_procedure(module, "procedure", procedure);
if (!proc) return 1;
if (!mgp_proc_add_arg(proc, "required_arg",
mgp_type_nullable(mgp_type_any())))
return 1;
struct mgp_value *null_value = mgp_value_make_null(memory);
if (!mgp_proc_add_opt_arg(proc, "optional_arg",
mgp_type_nullable(mgp_type_any()), null_value)) {
mgp_value_destroy(null_value);
return 1;
}
mgp_value_destroy(null_value);
if (!mgp_proc_add_result(proc, "result", mgp_type_string())) return 1;
if (!mgp_proc_add_result(proc, "args",
mgp_type_list(mgp_type_nullable(mgp_type_any()))))
return 1;
return 0;
}
```
In Python things should be a lot simpler.
```Python
# mgp.read_proc obtains the procedure name via __name__ attribute of a function.
@mgp.read_proc(# Arguments passed to multiple mgp_proc_add_arg calls
(('required_arg', mgp.Nullable(mgp.Any)), ('optional_arg', mgp.Nullable(mgp.Any), None)),
# Result fields passed to multiple mgp_proc_add_result calls
(('result', str), ('args', mgp.List(mgp.Nullable(mgp.Any)))))
def procedure(args, graph, result, memory):
pass
```
Here we have replaced `mgp_module_*` and `mgp_proc_*` C API with a much
simpler decorator function in Python -- `mgp.read_proc`. The types of
arguments and result fields can both be our types as well as Python builtin
types which can map to supported `mgp_value` types. The expected builtin types
we ought to support are: `bool`, `str`, `int`, `float` and `map`, while the
rest of the types are provided via our Python API. Optionally, we can add
convenience support for `object` type which would map to
`mgp.Nullable(mgp.Any)` and `list` which would map to
`mgp.List(mgp.Nullable(mgp.Any))`. Also, it makes sense to take a look if we
can leverage Python's `typing` module here.
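It's not decided yet what that would look like; purely as a sketch (nothing
below is an existing API, and `mgp` is the hypothetical module from this
document), type information could perhaps be derived from annotations instead
of explicit tuples:
```Python
import typing
import mgp  # hypothetical Python API module described in this document

# Sketch only: argument and result types are read from the annotations,
# so the decorator no longer needs explicit type tuples.
@mgp.read_proc
def procedure(required_arg: typing.Optional[typing.Any],
              optional_arg: typing.Optional[typing.Any] = None
              ) -> typing.NamedTuple("Result", [("result", str),
                                                ("args", typing.List[typing.Any])]):
    pass
```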
Another Pythonic change is to remove `mgp_value` C API from Python altogether.
This means that the arguments a Python procedure receives are not `mgp_value`
instances but rather `PyObject` instances. In other words, our implementation
would immediately marshal `mgp_value` to the corresponding type in Python.
Obviously, we would need to provide our own Python types for non-builtin
things like `mgp.Vertex` (equivalent to `mgp_vertex`) and others.
Continuing from our example above, let's say the procedure was invoked through
Cypher using the following query.
    MATCH (n) CALL py_module.procedure(42, n) YIELD *;
The Python procedure could then do the following and complete without throwing
either an AssertionError or a ValueError.
```Python
def procedure(args, graph, result, memory):
assert isinstance(args, list)
# Unpacking throws ValueError if args does not contain exactly 2 values.
required_arg, optional_arg = args
    assert isinstance(required_arg, int)
assert isinstance(optional_arg, mgp.Vertex)
```
The rest of the C API should naturally map to either top level functions or
class methods as appropriate.
## Loading Python Query Modules
Our current mechanism for loading the modules is to look for `.so` files in
the directory specified by `--query-modules` flag. This is done when Memgraph
is started. We can extend this mechanism to look for `.py` files in addition
to `.so` files in the same directory and import them in the embedded Python
interpreter. The only issue is embedding the interpreter in Memgraph. There
are multiple choices:
1. Building Memgraph and statically linking to Python.
2. Building Memgraph and dynamically linking to Python, and distributing
Python with Memgraph's installation.
3. Building Memgraph and dynamically linking to Python, but without
distributing the Python library.
4. Building Memgraph and optionally loading Python library by trying to
`dlopen` it.
The first two options are only viable if the Python license allows, and this
will need further investigation.
The third option adds Python as an installation dependency for Memgraph, and
without it Memgraph will not run. This is problematic for users which cannot
or do not want to install Python 3.
The fourth option avoids all of the issues present in the first 3 options, but
comes at a higher implementation cost. We would need to try to `dlopen` the
Python library and setup function pointers. If we succeed we would import
`.py` files from the `--query-modules` directory. On the other hand, if the
user does not have Python, `dlopen` would fail and Memgraph would run without
Python support.
After live discussion, we've decided to go with option 3. This way we don't
have to worry about mismatching Python versions we support and what the users
expect. Also, we should target Python 3.5 as that should be common between
Debian and CentOS for which we ship installation packages.
## Performance and Implementation Problems
As previously mentioned, embedding Python introduces usability issues compared
to other embeddable languages.
The first, major issue is Global Interpreter Lock (GIL). Initializing Python
will start a single global interpreter and running multiple threads will
require acquiring GIL. In practice, this means that when multiple users run a
procedure written in Python in parallel the execution will not actually be
parallel. Python's interpreter will jump between executing one user's
procedure and the other's. This can be quite an issue for long running
procedures when multiple users are querying Memgraph. The solution for this
issue is Python's API for sub-interpreters. Unfortunately, the support for
them is rather poor, and we hit a lot of critical bugs in the API when we tried
to use them. For the time being, we will have to accept the GIL and its downsides.
Perhaps in the future we will gain more knowledge on how we could reduce the
acquire rate of GIL or the sub-interpreter API will get improved.
Another major issue is memory allocation. Python's C API does not have support
for setting up a temporary allocator during execution of a single function.
It only has support for setting up a global heap allocator. This obviously
impacts our control of memory during a query procedure invocation. Besides
potential performance penalty, a procedure could allocate much more memory
than we would actually allow for execution of a single query. This means that
options controlling the memory limit during query execution are useless. On
the bright side, Python does use block style allocators and reference
counting, so the performance penalty and global memory usage should not be
that terrible.
The final issue that isn't as major as the ones above is the global state of
the interpreter. In practice this means that any registered procedure and
imported module has access to any other procedure and module. This may pollute
the namespace for other users, but it should not be much of a problem because
Python always has things under a module scope. The other, slightly bigger
downside is that a malicious user could use this knowledge to modify other
modules and procedures. This seems like a major issue, but if we take the
bigger picture into consideration, we already have a security issue in general
by invoking `dlopen` on `.so` and potentially running arbitrary code. This was
the trade off we chose to allow users to extend Memgraph. It's up to the users
to write sane extensions and protect their servers from access.

View File

@ -1,198 +0,0 @@
# Tensorflow Op - Technicalities
The final result should be a shared object (".so") file that can be dynamically
loaded by the Tensorflow runtime in order to directly access the bolt client.
## About Tensorflow
Tensorflow is usually used with Python such that the Python code is used to
define a directed acyclic computation graph. Basically no computation is done
in Python. Instead, values from Python are copied into the graph structure as
constants to be used by other Ops. The directed acyclic graph naturally ends up
with two sets of border nodes, one for inputs, one for outputs. These are
sometimes called "feeds".
Following the Python definition of the graph, during training, the entire data
processing graph/pipeline is called from Python as a single expression. This
leads to lazy evaluation since the called result has already been defined for a
while.
Tensorflow internally works with tensors, i.e. n-dimensional arrays. That means
all of its inputs, as well as its outputs, need to be matrices. While it is
possible to feed data directly from Python's numpy matrices straight into
Tensorflow, this is less desirable than using the Tensorflow data API (which
defines data input and processing as a Tensorflow graph) because:
1. The data API is written in C++ and entirely avoids Python and as such is
faster
2. The data API, unlike Python, is available in "Tensorflow Serving", the
   default way to serve Tensorflow models in production.
Once the entire input pipeline is defined via the tf.data API, its input is
basically a list of node IDs the model is supposed to work with. The model,
through the data API knows how to connect to Memgraph and execute openCypher
queries in order to get the remaining data it needs. (For example features of
neighbouring nodes.)
## The Interface
I think it's best you read the official guide...
<https://www.tensorflow.org/extend/adding_an_op> And especially the addition
that specifies how data ops are special
<https://www.tensorflow.org/extend/new_data_formats>
## Compiling the TF Op
There are two options for compiling a custom op. One of them involves pulling
the TF source, adding your code to it and compiling via bazel. This is
probably awkward to do for us and would significantly slow down compilation.
The other method involves installing Tensorflow as a Python package and pulling
the required headers from, for example,
`/usr/local/lib/python3.6/site-packages/tensorflow/include`. We can then compile
our Op with our regular build system.
This is practical since we can copy the required headers to our repo. If
necessary, we can have several versions of the headers to build several
versions of our Op for every TF version which we want to support. (But this is
unlikely to be required as the API should be stable).
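For reference, the installed Tensorflow package can report its own include and
library locations, which we could feed into our build system (a small
illustration; the exact integration is up to us):
```python3
import tensorflow as tf

# Paths and flags of the installed Tensorflow package; these can be passed
# to our regular build system when compiling the custom op.
print(tf.sysconfig.get_include())
print(tf.sysconfig.get_lib())
print(tf.sysconfig.get_compile_flags())
print(tf.sysconfig.get_link_flags())
```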
## Example for Using the Bolt Client Tensorflow Op
### Dynamic Loading
``` python3
import tensorflow as tf
mg_ops = tf.load_op_library('/usr/bin/memgraph/tensorflow_ops.so')
```
### Basic Usage
``` python3
dataset = mg_ops.OpenCypherDataset(
# This is probably unfortunate as the username and password
# get hardcoded into the graph, but for the simple case it's fine
"hostname:7687", auth=("user", "pass"),
# Our query
'''
MATCH (n:Train) RETURN n.id, n.features
''',
# Cast return values to these types
(tf.string, tf.float32))
# Some Tensorflow data api boilerplate
iterator = dataset.make_one_shot_iterator()
next_element = iterator.get_next()
# Up to now we have only defined our computation graph which basically
# just connects to Memgraph
# `next_element` is not really data but a handle to a node in the Tensorflow
# graph, which we can and do evaluate
# It is a Tensorflow tensor with shape=(None, 2)
# and dtype=(tf.string, tf.float)
# shape `None` means the shape of the tensor is unknown at definition time
# and is dynamic and will only be known once the tensor has been evaluated
with tf.Session() as sess:
node_ids = sess.run(next_element)
# `node_ids` contains IDs and features of all the nodes
# in the graph with the label "Train"
# It is a numpy.ndarray with a shape ($n_matching_nodes, 2)
```
### Memgraph Client as a Generic Tensorflow Op
Other than the Tensorflow Data Op, we'll want to support a generic Tensorflow
Op which can be put anywhere in the Tensorflow computation Graph. It takes in
an arbitrary tensor and produces a tensor. This would be used in the GraphSage
algorithm to fetch the lowest level features into Tensorflow
```python3
requested_ids = np.array([1, 2, 3])
ids_placeholder = tf.placeholder(tf.int32)
model = mg_ops.OpenCypher(
"hostname:7687", auth=("user", "pass"),
"""
UNWIND $node_ids as nid
MATCH (n:Train {id: nid})
RETURN n.features
""",
# What to call the input tensor as an openCypher parameter
parameter_name="node_ids",
# Type of our resulting tensor
dtype=(tf.float32)
)
features = model(ids_placeholder)
with tf.Session() as sess:
result = sess.run(features,
feed_dict={ids_placeholder: requested_ids})
```
This is probably easier to implement than the Data Op, so it might be a good
idea to start with.
### Production Usage
During training, in the GraphSage algorithm at least, Memgraph is at the
beginning and at the end of the Tensorflow computation graph. At the
beginning, the Data Op provides the node IDs which are fed into the generic
Tensorflow Op to find their neighbours and their neighbours and their features.
Production usage differs in that we don't use the Data Op. The Data Op is
effectively cut off and the initial input is fed by Tensorflow serving, with
the data found in the request.
For example a JSON request to classify a node might look like:
`POST http://host:port/v1/models/GraphSage/versions/v1:classify`
With the contents:
```json
{
"examples": [
{"node_id": 1},
{"node_id": 2}
],
}
```
Every element of the "examples" list is an example to be computed. Each is
represented by a dict with keys matching names of feeds in the Tensorflow graph
and values being the values we want fed in for each example.
The REST API then replies in kind with the classification result in JSON.
A note about adding our custom Op to Tensorflow Serving: our Op's `.so` can be
added into the Bazel build to link with Tensorflow Serving, or it can be
dynamically loaded by starting Tensorflow Serving with the flag
`--custom_op_paths`.
### Considerations
There might be an issue here: the URL used to connect to Memgraph is hardcoded
into the op and would thus be wrong when moved to production, requiring some
type of a hack to make it work. We probably want to solve this by having the
client op take in another tf.Variable as an input which would contain a
connection URL and username/password. We have to research whether this makes
it easy enough to move to production, as the connection string variable is
still a part of the graph, but maybe easier to replace.
It is probably the best idea to utilize openCypher parameters to make our
queries flexible. The exact API as to how to declare the parameters in Python
is open to discussion.
The Data Op might not even be necessary to implement as it is not key for
production use. It can be replaced in training mode with feed dicts and either:
1. Getting the initial list of nodes via a Python Bolt client (see the sketch
   below)
2. Creating a separate Tensorflow computation graph that gets all the relevant
node IDs into Python
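For option 1, a minimal sketch using a standard Bolt client could look like the
following (the `neo4j` driver, hostname and credentials are placeholders; any
Bolt-speaking client would do):
```python3
from neo4j import GraphDatabase

# Fetch the initial list of training node IDs over Bolt; the IDs can then be
# fed into the Tensorflow graph through a placeholder and feed_dict.
driver = GraphDatabase.driver("bolt://hostname:7687", auth=("user", "pass"))
with driver.session() as session:
    node_ids = [record["id"]
                for record in session.run("MATCH (n:Train) RETURN n.id AS id")]
driver.close()
```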

View File

@ -1,33 +0,0 @@
# Feature Specifications
## Active
* [Python Query Modules](active/python-query-modules.md)
* [Tensorflow Op](active/tensorflow-op.md)
## Draft
* [A-star Variable-length Expand](draft/a-star-variable-length-expand.md)
* [Cloud-native Graph Store](draft/cloud-native-graph-store.md)
* [Compile Filter Expressions](draft/compile-filter-expressions.md)
* [Database Triggers](draft/database-triggers.md)
* [Date and Time Data Types](draft/date-and-time-data-types.md)
* [Distributed Query Execution](draft/distributed-query-execution.md)
* [Edge Create or Update Queries](draft/edge-create-or-update-queries.md)
* [Extend Variable-length Filter Expressions](draft/extend-variable-length-filter-expression.md)
* [Geospatial Data Types](draft/geospatial-data-types.md)
* [Hybrid Storage Engine](draft/hybrid-storage-engine.md)
* [Load Data Queries](draft/load-data-queries.md)
* [Multitenancy](draft/multitenancy.md)
* [Query Compilation](draft/query-compilation.md)
* [Release Log Levels](draft/release-log-levels.md)
* [Rust Query Modules](draft/rust-query-modules.md)
* [Sharded Graph Store](draft/sharded-graph-store.md)
* [Storage Memory Management](draft/storage-memory-management.md)
* [Vectorized Query Execution](draft/vectorized-query-execution.md)
## Obsolete
* [Distributed](obsolete/distributed.md)
* [High-availability](obsolete/high-availability.md)
* [Kafka Integration](obsolete/kafka-integration.md)

View File

@ -1,15 +0,0 @@
# A-star Variable-length Expand
Like DFS/BFS/WeightedShortestPath, it should be possible to support the A-star
algorithm in the format of variable length expansion.
Syntactically, the query should look like the following one:
```
MATCH (start)-[
*aStar{{hops}} {{heuristic_expression}} {{weight_expression}} {{aggregated_weight_variable}} {{filtering_expression}}
]-(end)
RETURN {{aggregated_weight_variable}};
```
It would be convenient to add geospatial data support first because A-star
works well with geospatial data (a natural heuristic function might exist).

View File

@ -1,7 +0,0 @@
# Cloud-native Graph Store
The biggest problem with the current in-memory storage is the total cost of
ownership for large, infrequently updated datasets. An idea to solve that is to
decouple storage and compute inside a cloud environment. E.g., on AWS, a
database instance could use EC2 machines to run the query execution against
data stored inside S3.

View File

@ -1,40 +0,0 @@
# Compile Filter Expressions
Memgraph evaluates filter expressions by traversing the abstract syntax tree of
the given filter. Filtering is a general operation in query execution.
Some simple examples are:
```
MATCH (n:Person {name: "John"}) WHERE n.age > 20 AND n.age < 40 RETURN n;
MATCH (a {id: 723})-[*bfs..10 (e, n | e.x > 12 AND n.y < 3)]-() RETURN *;
```
A more real-world example looks like this (Ethereum network analysis):
```
MATCH (a: Address {addr: ''})-[]->(t: Transaction)-[]->(b: Address)
RETURN DISTINCT b.addr
UNION
MATCH (a: Address {addr: ''})-[]->(t: Transaction)-[]->(b1: Address)-[]->(t2: Transaction)-[]->(b: Address)
WHERE t2.timestamp > t.timestamp
RETURN DISTINCT b.addr
UNION
MATCH (a: Address {addr: ''})-[]->(t: Transaction)-[]->(b1: Address)-[]->(t2: Transaction)-[]->(b2: Address)-[]->(t3: Transaction)-[]->(b: Address)
WHERE t2.timestamp > t.timestamp AND t3.timestamp > t2.timestamp
RETURN DISTINCT b.addr
UNION
MATCH (a: Address {addr: ''})-[]->(t: Transaction)-[]->(b1: Address)-[]->(t2: Transaction)-[]->(b2: Address)-[]->(t3: Transaction)-[]->(b3: Address)-[]->(t4: Transaction)-[]->(b: Address)
WHERE t2.timestamp > t.timestamp AND t3.timestamp > t2.timestamp AND t4.timestamp > t3.timestamp
RETURN DISTINCT b.addr
UNION
MATCH (a: Address {addr: ''})-[]->(t: Transaction)-[]->(b1: Address)-[]->(t2: Transaction)-[]->(b2: Address)-[]->(t3: Transaction)-[]->(b3: Address)-[]->(t4: Transaction)-[]->(b4: Address)-[]->(t5: Transaction)-[]->(b: Address)
WHERE t2.timestamp > t.timestamp AND t3.timestamp > t2.timestamp AND t4.timestamp > t3.timestamp AND t5.timestamp > t4.timestamp
RETURN DISTINCT b.addr;
```
Filtering may take a significant portion of query execution, which means it has
to be fast.
The first step towards improvement might be to expose an API under which a
developer can implement their filtering logic (it's OK to support only C++ in the
beginning). Later on, we can introduce an automatic compilation of filtering
expressions.

View File

@ -1,14 +0,0 @@
# Database Triggers
Memgraph doesn't have any built-in notification mechanism yet. In the case a
user wants to get notified about anything happening inside Memgraph, the only
option is some pull mechanism from the client code. In many cases, that might
be suboptimal.
A natural place to start would be to put some notification code on each update
action inside Memgraph. It's probably too early to send a notification
immediately after a WAL delta gets created, but some point after the transaction
commits or after WAL deltas are written to disk might be a pretty good place.
Furthermore, Memgraph has the query module infrastructure. The first
implementation might call a user-defined query module procedure and pass
whatever gets created or updated during the query execution.
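Purely as an illustration (reusing the hypothetical Python query module API
sketched elsewhere in this document; none of this exists yet), such a procedure
could look like:
```python
import mgp  # hypothetical module from the Python query modules spec

# Sketch only: a procedure the trigger machinery could call after a commit,
# receiving whatever was created or updated during the transaction.
@mgp.read_proc((('updated_objects', mgp.List(mgp.Nullable(mgp.Any))),),
               (('handled', bool),))
def on_update(args, graph, result, memory):
    updated_objects, = args
    # Forward the changes to an external notification system here.
    pass
```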

View File

@ -1,13 +0,0 @@
# Date and Time Data Types
Neo4j offers the following functionality:
* https://neo4j.com/docs/cypher-manual/current/syntax/temporal/
* https://neo4j.com/docs/cypher-manual/current/functions/temporal/
The question is, how are we going to support equivalent capabilities? We need
something very similar because these are, in general, very well defined types.
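For reference, the equivalent Neo4j syntax (illustrative only; ours could
differ) looks roughly like:
```
RETURN date("2021-01-26") AS d,
       duration.between(date("2021-01-01"), date("2021-01-26")) AS elapsed;
```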
A note about the storage is that Memgraph has a limit on the total number of
different data types, 16 at this point. We have to be mindful of that during
the design phase.

View File

@ -1,10 +0,0 @@
# Distributed Query Execution
Add the ability to execute graph algorithms on a cluster of machines. The scope
of this is ONLY the query execution without changing the underlying storage
because that's much more complex. The first significant decision here is to
figure out whether we implement our own distributed execution engine or deploy
something already available, like [Giraph](https://giraph.apache.org). An
important part is that Giraph by itself isn't enough because people want to
update data on the fly. The final solution needs to provide some updating
capabilities.

View File

@ -1,14 +0,0 @@
# Edge Create or Update Queries
The old semantic of the `MERGE` clause is quite tricky. The new semantic of
`MERGE` is explained
[here](https://blog.acolyer.org/2019/09/18/updating-graph-databases-with-cypher/).
Similar to `MERGE`, but maybe simpler, would be to define clauses and semantics
that apply only to a single edge. In case an edge between two nodes doesn't
exist, it should be created. On the other hand, if it exists, it should be
updated. The syntax should look similar to the following:
```
MERGE EDGE (a)-[e:Type {props}]->(b) [ON CREATE SET expression ON UPDATE SET expression] ...
```

View File

@ -1,12 +0,0 @@
# Extend Variable-length Filter Expressions
Variable-length filtering (DFS/BFS/WeightedShortestPath) can be arbitrarily
complex. At this point, the filtering expression only gets the currently visited
node and edge:
```
MATCH (a {id: 723})-[*bfs..10 (e, n | e.x > 12 AND n.y < 3)]-() RETURN *;
```
If a user had the whole path available, they could write more complex filtering
logic.

View File

@ -1,28 +0,0 @@
# Geospatial Data Types
Neo4j offers the following functionality:
* https://neo4j.com/docs/cypher-manual/current/syntax/spatial/
* https://neo4j.com/docs/cypher-manual/current/functions/spatial/
The question is, how are we going to support equivalent capabilities? We need
something very similar because these are, in general, very well defined types.
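For reference, a Neo4j-style spatial expression (illustrative only; our syntax
could differ) is roughly:
```
RETURN distance(point({latitude: 45.81, longitude: 15.98}),
                point({latitude: 48.21, longitude: 16.37})) AS meters;
```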
The main reasons for implementing this feature are:
1. Ease of use. At this point, users have to encode/decode geospatial data
   manually.
2. Memory efficiency in some cases (though user-defined encoding could still
   be more efficient).
The number of functionalities that could be built on top of geospatial types is
huge. Probably some C/C++ libraries should be used:
* https://github.com/OSGeo/gdal
* http://geostarslib.sourceforge.net/
* https://www.cgal.org
Furthermore, the query engine could use these data types during query execution
(specific for query execution).
Also, the storage engine could have specialized indices for these types of
data.
A note about the storage is that Memgraph has a limit on the total number of
different data types, 16 at this point. We have to be mindful of that during
the design phase.

View File

@ -1,20 +0,0 @@
# Hybrid Storage Engine
The goal here is to improve Memgraph storage massively! Please take a look
[here](http://cidrdb.org/cidr2020/papers/p29-neumann-cidr20.pdf) for the
reasons.
The general idea is to store edges on disk by using an LSM-like data structure.
Storing edge properties will be tricky because a strict schema also has to be
introduced. Otherwise, it's impossible to store data on disk optimally (Neo4j
already has a pretty optimized implementation of that). Furthermore, we have to
introduce the paging concept.
This is a complex feature because various aspects of the core engine have to be
considered and probably updated (memory management, garbage collection,
indexing).
## References
* [On Disk IO, Part 3: LSM Trees](https://medium.com/databasss/on-disk-io-part-3-lsm-trees-8b2da218496f)
* [2020-04-13 On-disk Edge Store Research](https://docs.google.com/document/d/1avoR2g9dNWa4FSFt9NVn4JrT6uOAH_ReNeUoNVsJ7J4)

View File

@ -1,17 +0,0 @@
# Load Data Queries
Loading data into Memgraph is a challenging task. We have to implement
something equivalent to the [Neo4j LOAD
CSV](https://neo4j.com/developer/guide-import-csv/#import-load-csv). This
feature seems relatively straightforward to implement because `LoadCSV` could
be another operator that would yield row by row. By having the operator, the
operation would be composable with the rest of the `CREATE`|`MERGE` queries.
Composability is key because users would be able to combine various
clauses to import data.
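A hedged sketch of such a composable clause (the exact syntax is open; the file
path and labels are placeholders) could be:
```
LOAD CSV FROM "/path/to/people.csv" WITH HEADER AS row
CREATE (:Person {name: row.name, age: row.age});
```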
A more general concept is [SingleStore
Pipelines](https://docs.singlestore.com/v7.1/reference/sql-reference/pipelines-commands/create-pipeline).
We already tried this with [Graph Streams](../obsolete/kafka-integration.md). An
option is to migrate that code into a standalone product
[here](https://github.com/memgraph/mgtools).

View File

@ -1,15 +0,0 @@
# Multitenancy
[Multitenancy](https://en.wikipedia.org/wiki/Multitenancy) is a feature mainly
in the domain of ease of use. Neo4j made a great move by introducing
[Fabric](https://neo4j.com/developer/multi-tenancy-worked-example).
Memgraph's first step in a similar direction would be to add an abstraction layer
containing multiple `Storage` instances, plus the ability to specify a database
instance per client session or database transaction.
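For illustration only (Neo4j Fabric-style syntax; our final syntax could
differ), selecting a database per query could look like:
```
USE tenant1 MATCH (n) RETURN count(n);
```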
## Replication Context
Each transaction has to encode which database it is executed on.
Once a replica gets delta objects containing database info, the replica engine
could apply changes locally.

View File

@ -1,14 +0,0 @@
# Query Compilation
Memgraph supports the interpretation of queries in a pull-based way. An
advantage of interpreting queries is a short time to start of execution, which is
convenient when a user wants to test a bunch of queries in a short time. The
downside is slow runtime. The runtime could be improved by compiling query
plans.
## Research Area 1
The easiest route to the query compilation might be generating [virtual
constexpr](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1064r0.html)
pull functions, making a dynamic library out of the entire compiled query plan,
and swapping query plans during the database runtime.

View File

@ -1,17 +0,0 @@
# Release Log Levels
It's impossible to control the log level in Memgraph Community. That means it's
tough to debug issues when interacting with Memgraph. At least three log levels
should be available to the user:
* Log nothing (as it is now).
* Log each executed query.
* Log Bolt server states.
Memgraph Enterprise has the audit log feature. The audit log provides
additional info about each query (user, source, etc.), but it's only available
in the Enterprise edition. Furthermore, the intention of audit logs isn't
debugging.
An important note is that the logged queries should be stripped of literal
values because, in the Memgraph cloud context, we shouldn't log sensitive data.

View File

@ -1,15 +0,0 @@
# Rust Query Modules
Memgraph provides the query modules infrastructure. It's possible to write
query modules in
[C/C++](https://docs.memgraph.com/memgraph/reference-overview/query-modules/c-api)
and
[Python](https://docs.memgraph.com/memgraph/reference-overview/query-modules/python-api).
The problem with C/C++ is that it's very error-prone and time-consuming.
Python's problem is that it's slow and has a bunch of other limitations listed
in the [feature spec](../active/python-query-modules.md).
On the other hand, Rust is fast and much less error-prone compared to C. It
should be possible to use [bindgen](https://github.com/rust-lang/rust-bindgen)
to generate bindings out of the current C API and write wrapper code for Rust
developers to enjoy.

View File

@ -1,8 +0,0 @@
# Sharded Graph Store
Add the ability to shard graph data across machines in a cluster. The scope of
this is ONLY changes to the storage engine.
## References
* [Spinner: Scalable Graph Partitioning in the Cloud](https://arxiv.org/pdf/1404.3861.pdf)

View File

@ -1,13 +0,0 @@
# Storage Memory Management
If Memgraph uses too much memory, the OS will kill it. There has to be an internal
mechanism to control memory usage.
Since C++17, polymorphic allocators are an excellent way to inject custom
memory management while keeping the code modular. Memgraph already uses PMR in the
query execution. Also, refer to [1] on how to start with PMR in the storage
context.
## Resources
[1] [PMR: Mistakes Were Made](https://www.youtube.com/watch?v=6BLlIj2QoT8)

View File

@ -1,9 +0,0 @@
# Vectorized Query Execution
The Memgraph query engine pulls records one by one during query execution. A more
efficient way would be to pull multiple records in an array. Adding that
shouldn't be complicated, but it wouldn't be advantageous without also vectorizing
the fetching of records from the storage.
On the query engine level, the array could be part of the frame. In other
words, the frame and the code dealing with the frame have to be changed.

View File

@ -1,148 +0,0 @@
# Distributed Memgraph specs
This document describes the reasoning behind Memgraph's distributed concepts.
## Distributed state machine
Memgraph's distributed mode introduces two states of the cluster, recovering and
working. The change between states shouldn't happen often, but when it happens
it can take a while to make a transition from one to another.
### Recovering
This state is the default state for Memgraph when the cluster starts with
recovery flags. If the recovery finishes successfully, the state changes to
working. If recovery fails, the user will be presented with a message that
explains what happened and what are the next steps.
Another way to enter this state is failure. If the cluster encounters a failure,
the master will enter the Recovering mode. This time, it will wait for all
workers to respond with a message saying they are alive and well, making
sure they all have a consistent state.
### Working
This state should be the default state of Memgraph most of the time. When in
this state, Memgraph accepts connections from Bolt clients and allows query
execution.
If distributed execution fails for a transaction, that transaction, and all
other active transactions will be aborted and the cluster will enter the
Recovering state.
## Durability
One of the important concepts in distributed Memgraph is durability.
### Cluster configuration
When running Memgraph in distributed mode, the master will store cluster
metadata in a persistent store. If for some reason the cluster shuts down,
recovering Memgraph from durability files shouldn't require any additional
flags.
### Database ID
Each new and clean run of Memgraph should generate a new globally unique
database id. This id will associate all files that were persisted with this
run. Adding the database id to snapshots, write-ahead logs and cluster metadata
files ties them to a specific Memgraph run, and it makes recovery easier to reason
about.
When recovering, the cluster won't generate a new id, but will reuse the one
from the snapshot/wal that it was able to recover from.
### Durability files
Memgraph uses snapshots and write-ahead logs for durability.
When Memgraph recovers it has to make sure all machines in the cluster recover
to the same recovery point. This is done by finding a common snapshot and
finding common transactions in per-machine available write-ahead logs.
Since we can not be sure that each machine persisted durability files, we need
to be able to negotiate a common recovery point in the cluster. Possible
durability file failures could require starting the cluster from scratch,
purging everything from storage and recovering from existing durability files.
We need to ensure that we keep WAL files containing information about
transactions between all existing snapshots. This will provide better durability
in the case of a random machine durability file failure, where the cluster can
find a common recovery point that all machines in the cluster have.
Also, we should suggest, and make clear in the docs, that anything less than two
snapshots isn't considered safe for recovery.
### Recovery
The recovery happens in following steps:
* Master enables worker registration.
* Master recovers cluster metadata from the persisted storage.
* Master waits for all required workers to register.
* Master broadcasts a recovery request to all workers.
* Workers respond with a set of possible recovery points.
* Master finds a common recovery point for the whole cluster.
* Master broadcasts a recovery request with the common recovery point.
* Master waits for the cluster to recover.
* After a successful cluster recovery, the master can enter the Working state.
## Dynamic Graph Partitioning (abbr. DGP)
### Implemented parameters
--dynamic-graph-partitioner-enabled (If the dynamic graph partitioner should be
enabled.) type: bool default: false (start time)
--dgp-improvement-threshold (How much better should specific node score be
to consider a migration to another worker. This represents the minimal
difference between new score that the vertex will have when migrated
and the old one such that it's migrated.) type: int32 default: 10
(start time)
--dgp-max-batch-size (Maximal amount of vertices which should be migrated
in one dynamic graph partitioner step.) type: int32 default: 2000
(start time)
### Design decisions
* Each partitioning session has to be a new transaction.
* When and how does an instance perform the moves?
* Periodically.
* Token sharing (round robin, exactly one instance at a time has an
opportunity to perform the moves).
* On server-side serialization error (when DGP receives an error).
-> Quit partitioning and wait for the next turn.
* On client-side serialization error (when end client receives an error).
-> The client should never receive an error because of any
internal operation.
-> For the first implementation, it's good enough to wait until data becomes
available again.
-> It would be nice to achieve that DGP has lower priority than end client
operations.
### End-user parameters
* --dynamic-graph-partitioner-enabled (execution time)
* --dgp-improvement-threshold (execution time)
* --dgp-max-batch-size (execution time)
* --dgp-min-batch-size (execution time)
-> Minimum number of nodes that will be moved in each step.
* --dgp-fitness-threshold (execution time)
-> Do not perform moves if partitioning is good enough.
* --dgp-delta-turn-time (execution time)
-> Time between each turn.
* --dgp-delta-step-time (execution time)
-> Time between each step.
* --dgp-step-time (execution time)
-> Time limit per each step.
### Testing
The implementation has to provide good enough results in terms of:
* How good the partitioning is (numeric value), aka goodness.
* Workload execution time.
* Stress test correctness.
Test cases:
* N not connected subgraphs
-> shuffle nodes to N instances
-> run partitioning
-> test perfect partitioning.
* N connected subgraph
-> shuffle nodes to N instance
-> run partitioning
-> test partitioning.
* Take realistic workload (Long Running, LDBC1, LDBC2, Card Fraud, BFS, WSP)
-> measure exec time
-> run partitioning
-> test partitioning
-> measure exec time (during and after partitioning).

View File

@ -1,275 +0,0 @@
# High Availability (abbr. HA)
## High Level Context
High availability is a characteristic of a system which aims to ensure a
certain level of operational performance for a higher-than-normal period.
Although there are multiple ways to design highly available systems, Memgraph
strives to achieve HA by elimination of single points of failure. In essence,
this implies adding redundancy to the system so that a failure of a component
does not imply the failure of the entire system. To ensure this, HA Memgraph
implements the [Raft consensus algorithm](https://raft.github.io/).
Correct implementation of the algorithm guarantees that the cluster will be
fully functional (available) as long as any strong majority of the servers are
operational and can communicate with each other and with clients. For example,
clusters of three or four machines can tolerate the failure of a single server,
clusters of five and six machines can tolerate the failure of any two servers,
and so on. Therefore, we strongly recommend a setup of an odd-sized cluster.
### Performance Implications
Internally, Raft achieves high availability by keeping a consistent replicated
log on each server within the cluster. Therefore, we must successfully replicate
a transaction on the majority of servers within the cluster before we actually
commit it and report the result back to the client. This operation represents
a significant performance hit when compared with single node version of
Memgraph.
Luckily, the algorithm can be tweaked in a way which allows read-only
transactions to perform significantly better than those which modify the
database state. That being said, the performance of read-only operations
is still not going to be on par with single node Memgraph.
This section will be updated with exact numbers once we integrate HA with
new storage.
With the old storage, write throughput was almost five times lower than read
throughput (~30000 reads per second vs ~6000 writes per second).
## User Facing Setup
### How to Setup HA Memgraph Cluster?
First, the user needs to install `memgraph_ha` package on each machine
in their cluster. HA Memgraph should be available as a Debian package,
so its installation on each machine should be as simple as:
```plaintext
dpkg -i /path/to/memgraph_ha_<version>.deb
```
After successful installation of the `memgraph_ha` package, the user should
finish its configuration before attempting to start the cluster.
There are two main things that need to be configured on every node in order for
the cluster to be able to run:
1. The user has to edit the main configuration file and specify the unique node
ID to each server in the cluster
2. The user has to create a file that describes all IP addresses of all servers
that will be used in the cluster
The `memgraph_ha` binary loads all main configuration parameters from
`/etc/memgraph/memgraph_ha.conf`. On each node of the cluster, the user should
uncomment the `--server-id=0` parameter and change its value to the `server_id`
of that node.
The last step before starting the server is to create a `coordination`
configuration file. That file is already present as an example in
`/etc/memgraph/coordination.json.example` and you have to copy it to
`/etc/memgraph/coordination.json` and edit it according to your cluster
configuration. The file contains coordination info consisting of a list of
`server_id`, `ip_address` and `rpc_port` lists. The assumed contents of the
`coordination.json` file are:
```plaintext
[
[1, "192.168.0.1", 10000],
[2, "192.168.0.2", 10000],
[3, "192.168.0.3", 10000]
]
```
Here, each line corresponds to coordination of one server. The first entry is
that server's ID, the second is its IP address and the third is the RPC port it
listens to. This port should not be confused with the port used for client
interaction via the Bolt protocol.
The `ip_address` entered for each `server_id` *must* match the exact IP address
that belongs to that server and that will be used to communicate to other nodes
in the cluster. The coordination configuration file *must* be identical on all
nodes in the cluster.
After the user has set the `server_id` on each node in
`/etc/memgraph/memgraph_ha.conf` and provided the same
`/etc/memgraph/coordination.json` file to each node in the cluster, they can
start the Memgraph HA service by issuing the following command on each node in
the cluster:
```plaintext
systemctl start memgraph_ha
```
### How to Configure Raft Parameters?
All Raft configuration parameters can be controlled by modifying
`/etc/memgraph/raft.json`. The assumed contents of the `raft.json` file are:
```plaintext
{
"election_timeout_min": 750,
"election_timeout_max": 1000,
"heartbeat_interval": 100,
"replication_timeout": 20000,
"log_size_snapshot_threshold": 50000
}
```
The meaning behind each entry is demystified in the following table:
Flag | Description
------------------------------|------------
`election_timeout_min` | Lower bound for the randomly sampled reelection timer given in milliseconds
`election_timeout_max` | Upper bound for the randomly sampled reelection timer given in milliseconds
`heartbeat_interval` | Time interval between consecutive heartbeats given in milliseconds
`replication_timeout` | Time interval allowed for data replication given in milliseconds
`log_size_snapshot_threshold` | Allowed number of entries in Raft log before its compaction
### How to Query HA Memgraph via Proxy?
This chapter describes how to query HA Memgraph using our proxy server.
Note that this is not intended to be a long-term solution. Instead, we will
implement a proper Memgraph HA client which is capable of communicating with
the HA cluster. Once our own client is implemented, it will no longer be
possible to query HA Memgraph using other clients (such as neo4j client).
The Bolt protocol that is exposed by each Memgraph HA node is an extended
version of the standard Bolt protocol. In order to be able to communicate with
the highly available cluster of Memgraph HA nodes, the client must have some
logic implemented in itself so that it can communicate correctly with all nodes
in the cluster. To facilitate a faster start with the HA cluster we will build
the Memgraph HA proxy binary that communicates with all nodes in the HA cluster
using the extended Bolt protocol and itself exposes a standard Bolt protocol to
the user. All standard Bolt clients (libraries and custom systems) can
communicate with the Memgraph HA proxy without any code modifications.
The HA proxy should be deployed on each client machine that is used to
communicate with the cluster. It can't be deployed on the Memgraph HA nodes!
When using the Memgraph HA proxy, the communication flow is described in the
following diagram:
```plaintext
Memgraph HA node 1 -----+
|
Memgraph HA node 2 -----+ Memgraph HA proxy <---> any standard Bolt client (C, Java, PHP, Python, etc.)
|
Memgraph HA node 3 -----+
```
To setup the Memgraph HA proxy the user should install the `memgraph_ha_proxy`
package.
After its successful installation, the user should enter all endpoints of the
HA Memgraph cluster servers into the configuration before attempting to start
the HA Memgraph proxy server.
The HA Memgraph proxy server loads all of its configuration from
`/etc/memgraph/memgraph_ha_proxy.conf`. Assuming that the cluster is set up
like in the previous examples, the user should uncomment and enter the following
value into the `--endpoints` parameter:
```plaintext
--endpoints=192.168.0.1:7687,192.168.0.2:7687,192.168.0.3:7687
```
Note that the IP addresses used in the example match the individual cluster
nodes' IP addresses, but the ports used are the Bolt server ports exposed by
each node (currently the default value of `7687`).
The user can now start the proxy by using the following command:
```plaintext
systemctl start memgraph_ha_proxy
```
After the proxy has been started, the user can query the HA cluster by
connecting to the HA Memgraph proxy IP address using their favorite Bolt
client.
## Integration with Memgraph
The first thing that should be defined is a single instruction within the
context of Raft (i.e. a single entry in a replicated log).
These instructions should be completely deterministic when applied
to the state machine. We have therefore decided that the appropriate level
of abstraction within Memgraph corresponds to `Delta`s (data structures
which describe a single change to the Memgraph state, used for durability
in WAL). Moreover, a single instruction in a replicated log will consist of a
batch of `Delta`s which correspond to a single transaction that's about
to be **committed**.
Apart from `Delta`s, there are certain operations within the storage called
`StorageGlobalOperations` which do not conform to usual transactional workflow
(e.g. Creating indices). Since our storage engine implementation guarantees
that at the moment of their execution no other transactions are active, we can
safely replicate them as well. In other words, no additional logic needs to be
implemented because of them.
Therefore, we will introduce a new `RaftDelta` object which can be constructed
both from storage `Delta` and `StorageGlobalOperation`. Instead of appending
these to WAL (as we do in single node), we will start to replicate them across
our cluster. Once we have replicated the corresponding Raft log entry on
majority of the cluster, we are able to safely commit the transaction or execute
a global operation. If for any reason the replication fails (leadership change,
worker failures, etc.) the transaction will be aborted.
In the follower mode, we need to be able to apply `RaftDelta`s we got from
the leader when the protocol allows us to do so. In that case, we will use the
same concepts from durability in storage v2, i.e., applying deltas maps
completely to recovery from WAL in storage v2.
## Test and Benchmark Strategy
We have already implemented some integration and stress tests. These are:
1. leader election -- Tests whether leader election works properly.
2. basic test -- Tests basic leader election and log replication.
3. term updates test -- Tests a specific corner case (which used to fail)
regarding term updates.
4. log compaction test -- Tests whether log compaction works properly.
5. large log entries -- Tests whether we can successfully replicate relatively
large log entries.
6. index test -- Tests whether index creation works in HA.
7. normal operation stress test -- Long running concurrent stress test under
normal conditions (no failures).
8. read benchmark -- Measures read throughput in HA.
9. write benchmark -- Measures write throughput in HA.
At the moment, our main goal is to pass existing tests and have a stable version
on our stress test. We should also implement a stress test which occasionally
introduces different types of failures in our cluster (we did this kind of
testing manually thus far). Passing these tests should convince us that we have
a "stable enough" version which we can start pushing to our customers.
Additional (proper) testing should probably involve some ideas from
[here](https://jepsen.io/analyses/dgraph-1-0-2)
## Possible Future Changes/Improvements/Extensions
There are two general directions in which we can alter HA Memgraph. The first
direction assumes we are going to stick with the Raft protocol. In that case
there are a few known ways to extend the basic algorithm in order to gain
better performance or achieve extra functionality. In no particular order,
these are:
1. Improving read performance using leader leases [Section 6.4 from Raft thesis]
2. Introducing cluster membership changes [Chapter 4 from Raft thesis]
3. Introducing a [learner mode](https://etcd.io/docs/v3.3.12/learning/learner/).
4. Consider different log compaction strategies [Chapter 5 from Raft thesis]
5. Removing HA proxy and implementing our own HA Memgraph client.
On the other hand, we might decide in the future to base our HA implementation
on a completely different protocol which might even offer different guarantees.
In that case we probably need to do a bit more of market research and weigh the
trade-offs of different solutions.
[This](https://www.postgresql.org/docs/9.5/different-replication-solutions.html)
might be a good starting point.
## Reading materials
1. [Raft paper](https://raft.github.io/raft.pdf)
2. [Raft thesis](https://github.com/ongardie/dissertation) (book.pdf)
3. [Raft playground](https://raft.github.io/)
4. [Leader Leases](https://blog.yugabyte.com/low-latency-reads-in-geo-distributed-sql-with-raft-leader-leases/)
5. [Improving Raft ETH](https://pub.tik.ee.ethz.ch/students/2017-FS/SA-2017-80.pdf)

View File

@ -1,117 +0,0 @@
# Kafka Integration
## openCypher clause
One must be able to specify the following when importing data from Kafka:
* Kafka URI
* Kafka topic
* Transform [script](transform.md) URI
Minimum required syntax looks like:
```opencypher
CREATE STREAM stream_name AS LOAD DATA KAFKA 'URI'
WITH TOPIC 'topic'
WITH TRANSFORM 'URI';
```
The full openCypher clause for creating a stream is:
```opencypher
CREATE STREAM stream_name AS
LOAD DATA KAFKA 'URI'
WITH TOPIC 'topic'
WITH TRANSFORM 'URI'
[BATCH_INTERVAL milliseconds]
[BATCH_SIZE count]
```
The `CREATE STREAM` clause happens in a transaction.
`WITH TOPIC` parameter specifies the Kafka topic from which we'll stream
data.
`WITH TRANSFORM` parameter should contain a URI of the transform script.
`BATCH_INTERVAL` parameter defines the time interval in milliseconds
which is the time between two successive stream importing operations.
`BATCH_SIZE` parameter defines the count of Kafka messages that will be
batched together before import.
If both `BATCH_INTERVAL` and `BATCH_SIZE` parameters are given, the condition
that is satisfied first will trigger the batched import.
The default value for `BATCH_INTERVAL` is 100 milliseconds, and the default value
for `BATCH_SIZE` is 10.
The `DROP` clause deletes a stream:
```opencypher
DROP STREAM stream_name;
```
The `SHOW` clause enables you to see all configured streams:
```opencypher
SHOW STREAMS;
```
You can also start/stop streams with the `START` and `STOP` clauses:
```opencypher
START STREAM stream_name [LIMIT count BATCHES];
STOP STREAM stream_name;
```
A stream needs to be stopped in order to start it and it needs to be started in
order to stop it. Starting a started or stopping a stopped stream will not
affect that stream.
There are also convenience clauses to start and stop all streams:
```opencypher
START ALL STREAMS;
STOP ALL STREAMS;
```
Before the actual import, you can also test the stream with the `TEST
STREAM` clause:
```opencypher
TEST STREAM stream_name [LIMIT count BATCHES];
```
When a stream is tested, data extraction and transformation occurs, but no
output is inserted in the graph.
A stream needs to be stopped in order to test it. When the batch limit is
omitted, `TEST STREAM` will run for only one batch by default.
## Data Transform
The transform script is a user defined script written in Python. The script
should be aware of the data format in the Kafka message.
Each Kafka message is byte length encoded, which means that the first eight
bytes of each message contain the length of the message.
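As a hedged illustration (the exact byte order is an assumption here), a
transform script could strip that length prefix like this:
```python
import struct

def strip_length_prefix(message_bytes):
    # The first eight bytes carry the payload length; big-endian is assumed.
    (length,) = struct.unpack(">Q", message_bytes[:8])
    return message_bytes[8:8 + length]
```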
A sample code for a streaming transform script could look like this:
```python
def create_vertex(vertex_id):
return ("CREATE (:Node {id: $id})", {"id": vertex_id})
def create_edge(from_id, to_id):
return ("MATCH (n:Node {id: $from_id}), (m:Node {id: $to_id}) "\
"CREATE (n)-[:Edge]->(m)", {"from_id": from_id, "to_id": to_id})
def stream(batch):
result = []
for item in batch:
message = item.decode('utf-8').strip().split()
if len(message) == 1:
            result.append(create_vertex(message[0]))
else:
result.append(create_edge(message[0], message[1]))
return result
```
The script should output openCypher query strings based on the type of the
records.

View File

@ -1,222 +0,0 @@
# Replication
## High Level Context
Replication is a method that ensures that multiple database instances are
storing the same data. To enable replication, there must be at least two
instances of Memgraph in a cluster. Each instance has one of either two roles:
main or replica. The main instance is the instance that accepts writes to the
database and replicates its state to the replicas. In a cluster, there can only
be one main. There can be one or more replicas. None of the replicas will accept
write queries, but they will always accept read queries (there is an exception
to this rule, and it is described below). Replicas can also be configured to be
replicas of replicas, not necessarily replicas of the main. Each instance will
always be reachable using the standard supported communication protocols. The
replication will replicate WAL data. All data is transported through a custom
binary protocol that will try to remain backward compatible, so that replication
immediately allows for zero downtime upgrades.
Each replica can be configured to accept replicated data in one of the following
modes:
- synchronous
- asynchronous
- semi-synchronous
### Synchronous Replication
When the data is replicated to a replica synchronously, all of the data of a
currently pending transaction must be sent to the synchronous replica before the
transaction is able to commit its changes.
This mode has a positive implication that all data that is committed to the
main will always be replicated to the synchronous replica. It also has a
negative performance implication because non-responsive replicas could grind all
query execution to a halt.
This mode is good when you absolutely need to be sure that all data is always
consistent between the main and the replica.
### Asynchronous Replication
When the data is replicated to a replica asynchronously, all pending
transactions are immediately committed and their data is replicated to the
asynchronous replica in the background.
This mode has a positive performance implication in which it won't slow down
query execution. It also has a negative implication that the data between the
main and the replica is almost never in a consistent state (when the data is
being changed).
This mode is good when you don't care about consistency and only need an
eventually consistent cluster, but you care about performance.
### Semi-synchronous Replication
When the data is replicated to a replica semi-synchronously, the data is
replicated using both the synchronous and asynchronous methodology. The data is
always replicated synchronously, but, if the replica for any reason doesn't
respond within a preset timeout, the pending transaction is committed and the
data is replicated to the replica asynchronously.
This mode has a positive implication that all data that is committed is
*mostly* replicated to the semi-synchronous replica. It also has the same negative
performance implication as the synchronous replication mode.
This mode is useful when you want the replication to be synchronous to ensure
that the data within the cluster is consistent, but you don't want the main
to grind to a halt when you have a non-responsive replica.
### Addition of a New Replica
Each replica, when added to the cluster (in any mode), will first start out as
an asynchronous replica. That will allow replicas that have fallen behind to
first catch up to the current state of the database. When the replica is in a
state where it isn't lagging behind the main, it will then be promoted (in a brief
stop-the-world operation) to a semi-synchronous or synchronous replica. Replicas
that are added as asynchronous replicas will remain asynchronous.
## User Facing Setup
### How to Setup a Memgraph Cluster with Replication?
Replication configuration is done primarily through openCypher commands. This
allows the cluster to be dynamically rearranged (new leader election, addition
of a new replica, etc.).
Each Memgraph instance, when first started, will be a main. You have to change
the role of all replica nodes using the following openCypher query before you
can enable replication on the main:
```plaintext
SET REPLICATION ROLE TO (MAIN|REPLICA) WITH PORT <port_number>;
```
Note that the "WITH PORT <port_number>" part of the query sets the replication port,
but it applies only to replicas. In other words, if you try to set the
replication port while the instance is a main, a semantic exception will be thrown.
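For example, a minimal sketch of demoting an instance to the replica role,
assuming the default replication port of 10000 is used:
```plaintext
SET REPLICATION ROLE TO REPLICA WITH PORT 10000; # on each replica instance
```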
After you have set your replica instance to the correct operating role, you can
enable replication in the main instance by issuing the following openCypher
command:
```plaintext
REGISTER REPLICA name (SYNC|ASYNC) [WITH TIMEOUT 0.5] TO <socket_address>;
```
The socket address must be a string of the following form:
```plaintext
"IP_ADDRESS:PORT_NUMBER"
```
where IP_ADDRESS is a valid IP address, and PORT_NUMBER is a valid port number,
both given in decimal notation.
Note that in this case they must be separated by a single colon.
Alternatively, one can give the socket address as:
```plaintext
"IP_ADDRESS"
```
where IP_ADDRESS must be a valid IP address, and the port number will be
assumed to be the default one (we specify it to be 10000).
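As an illustration, the following registers two replicas, one with an explicit
port and one relying on the default port; the replica names and IP addresses
are placeholders:
```plaintext
REGISTER REPLICA replica1 SYNC WITH TIMEOUT 0.5 TO "10.0.0.2:10000"; # explicit port
REGISTER REPLICA replica2 ASYNC TO "10.0.0.3"; # default port 10000 assumed
```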
Each Memgraph instance will remember what the configuration was set to and will
automatically resume with its role when restarted.
### How to See the Current Replication Status?
To see the replication ROLE of the current Memgraph instance, you can issue the
following query:
```plaintext
SHOW REPLICATION ROLE;
```
To see the replicas of the current Memgraph instance, you can issue the
following query:
```plaintext
SHOW REPLICAS;
```
To delete a replica, issue the following query:
```plaintext
DROP REPLICA 'name';
```
### How to Promote a New Main?
When you have an already set-up cluster, to promote a new main, simply set the
replica that you want to become the new main to the main role.
```plaintext
SET REPLICATION ROLE TO MAIN; # on desired replica
```
After the command is issued, if the original main is still alive, it won't be
able to replicate its data to the replica (the new main) anymore and will enter
an error state. You must ensure that at any given point in time there aren't
two mains in the cluster.
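A possible switchover sequence is sketched below, assuming the old main is
demoted by hand once it is reachable again; the name, address, and port are
placeholders:
```plaintext
SET REPLICATION ROLE TO REPLICA WITH PORT 10000; # on the old main, after it comes back
REGISTER REPLICA old_main SYNC WITH TIMEOUT 0.5 TO "old_main_ip:10000"; # on the new main
```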
## Limitations and Potential Features
Currently, we do not support chained replicas, i.e. a replica can't have its
own replica. When this feature becomes available, the user will be able to
configure scenarios such as the following one:
```plaintext
main -[asynchronous]-> replica 1 -[semi-synchronous]-> replica 2
```
To configure the above scenario, the user will be able to issue the following
commands:
```plaintext
SET REPLICATION ROLE TO REPLICA WITH PORT <port1>; # on replica 1
SET REPLICATION ROLE TO REPLICA WITH PORT <port2>; # on replica 2
REGISTER REPLICA replica1 ASYNC TO "replica1_ip_address:port1"; # on main
REGISTER REPLICA replica2 SYNC WITH TIMEOUT 0.5
TO "replica2_ip_address:port2"; # on replica 1
```
In addition, we do not yet support advanced recovery mechanisms. For example,
if a main crashes, a suitable replica will take its place as the new main, but
if the crashed main goes back online, it will not be able to reclaim its
previous role and will instead be forced to act as a replica of the new main.
In the upcoming releases, we might add more advanced recovery mechanisms.
However, users are able to set up their own recovery policies using the basic
recovery mechanisms we currently provide, which can cover a wide range of
real-life scenarios.
Also, we do not yet support the replication of authentication configurations,
rendering access control replication unavailable.
The query and authentication modules, as well as audit logs, are not replicated.
## Integration with Memgraph
WAL `Delta`s are replicated between the replication main and its replicas. Along
with `Delta`s, all `StorageGlobalOperation`s are also replicated. Replication is
essentially the same as appending to the WAL.
Synchronous replication will occur in `Commit` and each
`StorageGlobalOperation` handler. The storage itself guarantees that `Commit`
will be called from a single thread and that no `StorageGlobalOperation` will be
executed during an active transaction. Asynchronous replication will load its
data from already written WAL files and transmit the data to the replica. All
data will be replicated using our RPC protocol (SLK encoded).
For each of its replicas, the replication main (or an intermediate replica) will
keep track of that replica's state. That way, it will know which operations must be transmitted to
the replica and which operations can be skipped. When a replica is very stale,
a snapshot will be transmitted to it so that it can quickly synchronize with
the current state. All following operations will transmit WAL deltas.
## Reading Materials
1. [PostgreSQL comparison of different solutions](https://www.postgresql.org/docs/12/different-replication-solutions.html)
2. [PostgreSQL docs](https://www.postgresql.org/docs/12/runtime-config-replication.html)
3. [MySQL reference manual](https://dev.mysql.com/doc/refman/8.0/en/replication.html)
4. [MySQL docs](https://dev.mysql.com/doc/refman/8.0/en/replication-setup-slaves.html)
5. [MySQL master switch](https://dev.mysql.com/doc/refman/8.0/en/replication-solutions-switch.html)

View File

@ -1,22 +0,0 @@
# Memgraph LaTeX Beamer Template
This folder contains all of the needed files for creating a presentation with
Memgraph styling. You should use this style for any public presentations.
Feel free to improve it according to style guidelines and raise issues if you
find any.
## Usage
Copy the contents of this folder (excluding this README file) to where you
want to write your own presentation. After copying, you can start editing the
`template.tex` with your content.
To compile the presentation to a PDF, run `latexmk -pdf -xelatex`. Some
directives require XeLaTeX, so you need to pass `-xelatex` as the final option
of `latexmk`. You may also need to install additional LaTeX packages if the
compilation complains about missing ones.
To clean up the generated files, use `latexmk -C`. This will also delete the
generated PDF. If you wish to remove generated files except the PDF, use
`latexmk -c`.

View File

@ -1,82 +0,0 @@
\NeedsTeXFormat{LaTeX2e}
\ProvidesClass{mg-beamer}[2018/03/26 Memgraph Beamer]
\DeclareOption*{\PassOptionsToClass{\CurrentOption}{beamer}}
\ProcessOptions \relax
\LoadClass{beamer}
\usetheme{Pittsburgh}
% Memgraph color palette
\definecolor{mg-purple}{HTML}{720096}
\definecolor{mg-red}{HTML}{DD2222}
\definecolor{mg-orange}{HTML}{FB6E00}
\definecolor{mg-yellow}{HTML}{FFC500}
\definecolor{mg-gray}{HTML}{857F87}
\definecolor{mg-black}{HTML}{231F20}
\RequirePackage{fontspec}
% Title fonts
\setbeamerfont{frametitle}{family={\fontspec[Path = ./mg-style/fonts/]{EncodeSansSemiCondensed-Regular.ttf}}}
\setbeamerfont{title}{family={\fontspec[Path = ./mg-style/fonts/]{EncodeSansSemiCondensed-Regular.ttf}}}
% Body font
\RequirePackage[sfdefault,light]{roboto}
% Roboto is pretty bad for monospace font. We will find a replacement.
% \setmonofont{RobotoMono-Regular.ttf}[Path = ./mg-style/fonts/]
% Title slide styles
% \setbeamerfont{frametitle}{size=\huge}
% \setbeamerfont{title}{size=\huge}
% \setbeamerfont{date}{size=\tiny}
% Other typography styles
\setbeamertemplate{frametitle}[default][center]
\setbeamercolor{frametitle}{fg=mg-black}
\setbeamercolor{title}{fg=mg-black}
\setbeamercolor{section in toc}{fg=mg-black}
\setbeamercolor{local structure}{fg=mg-orange}
\setbeamercolor{alert text}{fg=mg-red}
% Commands
\newcommand{\mgalert}[1]{{\usebeamercolor[fg]{alert text}#1}}
\newcommand{\titleframe}{\frame[plain]{\titlepage}}
\newcommand{\mgtexttt}[1]{{\textcolor{mg-gray}{\texttt{#1}}}}
% Title slide background
\RequirePackage{tikz,calc}
% Use title-slide-169 if aspect ratio is 16:9
\pgfdeclareimage[interpolate=true,width=\paperwidth,height=\paperheight]{logo}{mg-style/title-slide-169}
\setbeamertemplate{background}{
\begin{tikzpicture}
\useasboundingbox (0,0) rectangle (\the\paperwidth,\the\paperheight);
\pgftext[at=\pgfpoint{0}{0},left,base]{\pgfuseimage{logo}};
\ifnum\thepage>1\relax
\useasboundingbox (0,0) rectangle (\the\paperwidth,\the\paperheight);
\fill[white, opacity=1](0,\the\paperheight)--(\the\paperwidth,\the\paperheight)--(\the\paperwidth,0)--(0,0)--(0,\the\paperheight);
\fi
\end{tikzpicture}
}
% Footline content
\setbeamertemplate{navigation symbols}{}%remove navigation symbols
\setbeamertemplate{footline}{
\begin{beamercolorbox}[ht=1.6cm,wd=\paperwidth]{footlinecolor}
\vspace{0.1cm}
\hfill
\begin{minipage}[c]{3cm}
\begin{center}
\includegraphics[height=0.8cm]{mg-style/memgraph-logo.png}
\end{center}
\end{minipage}
\begin{minipage}[c]{7cm}
\insertshorttitle\ --- \insertsection
\end{minipage}
\begin{minipage}[c]{2cm}
\tiny{\insertframenumber{} of \inserttotalframenumber}
\end{minipage}
\end{beamercolorbox}
}
\endinput

Binary file not shown.


Binary file not shown.


Binary file not shown.


View File

@ -1,40 +0,0 @@
% Set 16:9 aspect ratio
\documentclass[aspectratio=169]{mg-beamer}
% Default directive sets the regular 4:3 aspect ratio
% \documentclass{mg-beamer}
\mode<presentation>
% requires xelatex
\usepackage{ccicons}
\title{Insert Presentation Title}
\titlegraphic{\ccbyncnd}
\author{Insert Name}
% Institute doesn't look good in our current styling class.
% \institute[Memgraph Ltd.]{\pgfimage[height=1.5cm]{mg-logo.png}}
% Date is autogenerated on compilation, so no need to set it explicitly,
% unless you wish to override it with a different date.
% \date{March 23, 2018}
\begin{document}
\titleframe
\section{Intro}
\begin{frame}{Contents}
\tableofcontents
\end{frame}
\begin{frame}{Memgraph Markup Test}
\begin{itemize}
\item \mgtexttt{Prefer \textbackslash{}mgtexttt for monospace}
\item Replace this slide with your own
\item Add even more slides in different sections
\item Make sure you spellcheck your presentation
\end{itemize}
\end{frame}
\end{document}

View File

@ -1,44 +1,4 @@
# Memgraph Build and Run Environments
## Toolchain Installation Procedure
1) Download the toolchain for your operating system from one of the following
links (current active toolchain is `toolchain-v2`):
* [CentOS 7](https://s3-eu-west-1.amazonaws.com/deps.memgraph.io/toolchain-v2/toolchain-v2-binaries-centos-7.tar.gz)
* [CentOS 8](https://s3-eu-west-1.amazonaws.com/deps.memgraph.io/toolchain-v2/toolchain-v2-binaries-centos-8.tar.gz)
* [Debian 9](https://s3-eu-west-1.amazonaws.com/deps.memgraph.io/toolchain-v2/toolchain-v2-binaries-debian-9.tar.gz)
* [Debian 10](https://s3-eu-west-1.amazonaws.com/deps.memgraph.io/toolchain-v2/toolchain-v2-binaries-debian-10.tar.gz)
* [Ubuntu 18.04](https://s3-eu-west-1.amazonaws.com/deps.memgraph.io/toolchain-v2/toolchain-v2-binaries-ubuntu-18.04.tar.gz)
* [Ubuntu 20.04](https://s3-eu-west-1.amazonaws.com/deps.memgraph.io/toolchain-v2/toolchain-v2-binaries-ubuntu-20.04.tar.gz)
2) Extract the toolchain with the following command:
```bash
tar xzvf {{toolchain-archive}}.tar.gz -C /opt
```
3) Check and install required toolchain runtime dependencies by executing
(e.g., on **Debian 10**):
```bash
./environment/os/debian-10.sh check TOOLCHAIN_RUN_DEPS
./environment/os/debian-10.sh install TOOLCHAIN_RUN_DEPS
```
4) Activate the toolchain:
```bash
source /opt/toolchain-v2/activate
```
## Toolchain Upgrade Procedure
1) Build a new toolchain for each supported OS (latest versions).
2) If the new toolchain doesn't compile on some supported OS, the last
compilable toolchain has to be used instead. In other words, the project has
to compile on the oldest active toolchain as well. Suppose some
changes/improvements were added when migrating to the latest toolchain; in
that case, the maintainer has to ensure that the project still compiles on
previous toolchains (everything from the `init` script to the actual code has to
work on all supported operating systems).
Please continue
[here](https://www.notion.so/memgraph/Tools-05e0baafb78a49b386e0063b4833d23d).