Remove docs/dev (migrated to Notion) (#84)

This commit is contained in: parent 90d4ebdb1e, commit 2caf0e617e

2 .github/workflows/diff.yaml (vendored)

@@ -4,7 +4,7 @@ on:
  push:
    paths-ignore:
      - 'docs/**'
      - '*.md'
      - '**/*.md'

jobs:
  community_build:

30 README.md

@@ -6,34 +6,8 @@ data structures, multi-version concurrency control and asynchronous IO.

## Development Documentation

* [Quick Start](docs/dev/quick-start.md)
* [Workflow](docs/dev/workflow.md)
* [Storage](docs/dev/storage/v2/contents.md)
* [Query Engine](docs/dev/query/contents.md)
* [Communication](docs/dev/communication/contents.md)
* [Lisp C++ Preprocessor (LCP)](docs/dev/lcp.md)

## Feature Specifications

Each prominent Memgraph feature requires a feature specification. The purpose
of the feature specification is to have a base for discussing all aspects of
the feature. Elements of feature specifications should be:

* High-level context.
* Interface.
* User stories. Usage from the end-user perspective. In the case of a library,
  that should be cases on how to use the programming interface. In the case of
  a shell script, that should be cases on how to use flags.
* Discussion about concurrency, memory management, error management.
* Any other essential functional or non-functional requirements.
* Test and benchmark strategy.
* Possible future changes/improvements/extensions.
* Security concerns.
* Additional and/or optional implementation details.

It's crucial to keep the feature specification up-to-date with the
implementation. Take a look at the list of
[feature specifications](docs/feature_spec/contents.md) to learn more about
powerful Memgraph features.

Please continue
[here](https://www.notion.so/memgraph/memgraph-0428591638604c8385550e214ea9f3e6).

## User Documentation

@@ -1,269 +0,0 @@

# Code Review Guidelines

This chapter describes some of the things you should be on the lookout for
when reviewing someone else's code.

## Exceptions

Although the Google C++ Style Guide forbids exceptions, we do allow them in
our codebase. As a reviewer you should watch out for the following.

The documentation of throwing functions needs to be in sync with the
implementation. This must be enforced recursively, i.e. if a function A now
throws a new exception and function B uses A, then B needs to handle that
exception or have its documentation updated, and so on. Naturally, the same
applies when an exception is removed.

Transitive callers of the function which throws a new exception must be OK
with that. This ties into the previous point. You need to check that all users
of the new exception either handle it correctly or propagate it.

Exceptions should not escape out of class destructors, because that will
terminate the program. The code should be changed so that such cases are not
possible.

Exceptions being thrown in class constructors. Although this is well defined
in C++, it usually implies that a constructor is doing too much work and the
class construction & initialization should be redesigned. Usual approaches are
using the (Static) Factory Method pattern or having some sort of an
initialization method that needs to be called after the construction is done.
Prefer the Factory Method.

Don't forget that STL functions may also throw!
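
To make the constructor advice above concrete, here is a minimal sketch of the
Static Factory Method approach; the `LogFile` class is hypothetical and not
taken from the codebase:

```cpp
#include <cstdio>
#include <optional>
#include <string>

// Instead of a constructor that would have to throw when opening fails, a
// static factory method reports failure through its return value.
class LogFile {
 public:
  static std::optional<LogFile> Create(const std::string &path) {
    std::FILE *handle = std::fopen(path.c_str(), "a");
    if (!handle) return std::nullopt;  // no exception escapes construction
    return LogFile(handle);
  }

  LogFile(LogFile &&other) noexcept : handle_(other.handle_) {
    other.handle_ = nullptr;
  }

  ~LogFile() {
    if (handle_) std::fclose(handle_);
  }

 private:
  explicit LogFile(std::FILE *handle) : handle_(handle) {}

  std::FILE *handle_;
};
```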

## Pointers & References

In cases when some code passes a pointer or reference, or if some code stores
a pointer or reference, you should take a careful look at the following.

* Lifetime of the pointed-to value (this includes both ownership and
  multithreaded access).
* In case of a class, check validity of the destructor and move/copy
  constructors.
* Is the pointed-to value mutated? If not, it should be `const` (`const Type *`
  or `const Type &`).
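
A small, hypothetical illustration of the last point (not from the codebase):
the summary is only read, so it is passed as a pointer to `const`, while the
log is mutated and stays non-const:

```cpp
#include <string>
#include <vector>

// The read-only input is a pointer to const; the mutated output is not.
void AppendSummary(const std::string *summary,
                   std::vector<std::string> *log) {
  log->push_back(*summary);
}
```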

## Allocators & Memory Resources

With the introduction of polymorphic allocators (C++17 `<memory_resource>` and
our `utils/memory.hpp`) we get more convenient type signatures for containers,
so as to keep the outward facing API nice. This convenience comes at the cost
of fewer static checks on the type level, due to type erasure.

For example:

    std::pmr::vector<int> first_vec(std::pmr::null_memory_resource());
    std::pmr::vector<int> second_vec(std::pmr::new_delete_resource());

    second_vec = first_vec;  // What happens here?

    // Or with our implementation
    utils::MonotonicBufferResource monotonic_memory(1024);
    std::vector<int, utils::Allocator<int>> first_vec(&monotonic_memory);
    std::vector<int, utils::Allocator<int>> second_vec(utils::NewDeleteResource());

    second_vec = first_vec;  // What happens here?

In the above, both `first_vec` and `second_vec` have the same type, but have
*different* allocators! This can lead to ambiguity when moving or copying
elements between them.

You need to watch out for the following.

* Swapping can lead to undefined behaviour if the allocators are not equal.
* Is the move construction done with the right allocator?
* Is the move assignment done correctly? It may also throw an exception.
* Is the copy construction done with the right allocator?
* Is the copy assignment done correctly?
* Using `auto` makes allocator propagation rules rather ambiguous.
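
As a standard-library-only sketch of the first and last pitfalls above (using
`std::pmr` directly instead of our `utils` wrappers), consider:

```cpp
#include <memory_resource>
#include <vector>

int main() {
  std::pmr::monotonic_buffer_resource arena(1024);
  std::pmr::vector<int> first_vec(&arena);
  std::pmr::vector<int> second_vec(std::pmr::new_delete_resource());

  first_vec.assign({1, 2, 3});

  // Copy assignment copies the elements but *not* the allocator, so
  // second_vec keeps allocating from new_delete_resource() even though the
  // two vectors have the exact same type.
  second_vec = first_vec;

  // Swapping containers whose allocators do not compare equal is undefined
  // behaviour, so the following line must not be written:
  // first_vec.swap(second_vec);
}
```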

## Classes & Object Oriented Programming

A common mistake is to use classes, inheritance and "OOP" when it's not
needed. This section shows examples of encountered cases.

### Classes without (Meaningful) Members

    class MyCoolClass {
     public:
      int BeCool(int a, int b) { return a + b; }

      void SaySomethingCool() { std::cout << "Hello!"; }
    };

The above class has no members (i.e. state) which affect the behaviour of its
methods. This class need not exist; it can easily be replaced with a more
modular (and shorter) design -- top level functions.

    int BeCool(int a, int b) { return a + b; }

    void SaySomethingCool() { std::cout << "Hello!"; }

### Classes with a Single Public Method

    class MyAwesomeClass {
     public:
      MyAwesomeClass(int state) : state_(state) {}

      int GetAwesome() { return GetAwesomeImpl() + 1; }

     private:
      int state_;

      int GetAwesomeImpl() { return state_; }
    };

The above class has a `state_` and even a private method, but there's only one
public method -- `GetAwesome`.

You should check "Does the stored state have any meaningful influence on the
public method?", similarly to the previous point.

In the above case it doesn't, and the class should be replaced with a public
function in `.hpp`, while the private method should become a private function
in `.cpp` (static or in an anonymous namespace).

    // hpp
    int GetAwesome(int state);

    // cpp
    namespace {
    int GetAwesomeImpl(int state) { return state; }
    }
    int GetAwesome(int state) { return GetAwesomeImpl(state) + 1; }

A counterexample is when the state is meaningful.

    class Counter {
     public:
      Counter(int state) : state_(state) {}

      int Get() { return state_++; }

     private:
      int state_;
    };

But even that could be replaced with a closure.

    auto MakeCounter(int state) {
      return [state]() mutable { return state++; };
    }

### Private Methods

Instead of private methods, top level functions should be preferred. The
reasoning is completely explained in "Effective C++" Item 23 by Scott Meyers.
In our codebase, even improvements to compilation times can be noticed if
private methods in interface (`.hpp`) files are replaced with top level
functions in implementation (`.cpp`) files.

### Inheritance

The rule is simple -- if there are no virtual methods (except maybe the
destructor), then the class should be marked as `final` and never inherited.

If there are virtual methods (i.e. the class is meant to be inherited), make
sure that either a public virtual destructor or a protected non-virtual
destructor exists. See "Effective C++" Item 7 by Scott Meyers. Also take a
look at "Effective C++" Items 32--39 by Scott Meyers.

An example of how inheritance with no virtual methods is replaced with
composition.

    class MyBase {
     public:
      virtual ~MyBase() {}

      void DoSomethingBase() { ... }
    };

    class MyDerived final : public MyBase {
     public:
      void DoSomethingNew() { ... DoSomethingBase(); ... }
    };

With composition, the above becomes.

    class MyBase final {
     public:
      void DoSomethingBase() { ... }
    };

    class MyDerived final {
      MyBase base_;

     public:
      void DoSomethingNew() { ... base_.DoSomethingBase(); ... }
    };

The composition approach is preferred as it encapsulates the fact that
`MyBase` is used for the implementation and users only interact with the
public interface of `MyDerived`. Additionally, you can easily replace `MyBase
base_;` with the C++ PIMPL idiom (`std::unique_ptr<MyBase> base_;`) to make
the code more modular with regards to compilation.

More advanced C++ users will recognize that the encapsulation feature of the
non-PIMPL composition can be replaced with private inheritance.

    class MyDerived final : private MyBase {
     public:
      void DoSomethingNew() { ... MyBase::DoSomethingBase(); ... }
    };

One of the common "counterexamples" is the ability to store objects of
different types in a container or pass them to a function. Unfortunately, this
is not that good of a design. For example.

    class MyBase {
      ... // No virtual methods (but the destructor)
    };

    class MyFirstClass final : public MyBase { ... };

    class MySecondClass final : public MyBase { ... };

    std::vector<std::unique_ptr<MyBase>> first_and_second_classes;
    first_and_second_classes.push_back(std::make_unique<MyFirstClass>());
    first_and_second_classes.push_back(std::make_unique<MySecondClass>());

    void FunctionOnFirstOrSecond(const MyBase &first_or_second, ...) { ... }

With C++17, the containers for different types should be implemented with
`std::variant`, and as before the functions can be templated.

    class MyFirstClass final { ... };

    class MySecondClass final { ... };

    std::vector<std::variant<MyFirstClass, MySecondClass>> first_and_second_classes;
    // Notice no heap allocation, since we don't store a pointer
    first_and_second_classes.emplace_back(MyFirstClass());
    first_and_second_classes.emplace_back(MySecondClass());

    // You can also use `std::variant` here instead of a template
    template <class TFirstOrSecond>
    void FunctionOnFirstOrSecond(const TFirstOrSecond &first_or_second, ...) { ... }

Naturally, if the base class has meaningful virtual methods (i.e. other than
the destructor) it may be OK to use inheritance, but also consider
alternatives. See "Effective C++" Items 32--39 by Scott Meyers.

### Multiple Inheritance

Multiple inheritance should not be used unless all base classes are pure
interface classes. This decision is inherited from the [Google C++ Style
Guide](https://google.github.io/styleguide/cppguide.html#Inheritance). For
examples on how to design with and around multiple inheritance refer to
"Effective C++" Item 40 by Scott Meyers.

Naturally, if there *really* is no better design, then multiple inheritance is
allowed. An example of this can be found in our codebase when inheriting
Visitor classes (though even that could be replaced with `std::variant`, for
example).

## Code Format & Style

If something doesn't conform to our code formatting and style, just refer the
author to either [C++ Style](cpp-code-conventions.md) or [Other Code
Conventions](other-code-conventions.md).

@@ -1,5 +0,0 @@

# Communication

## Bolt

Memgraph implements the [Bolt communication protocol](https://7687.org/).

@@ -1,350 +0,0 @@

# C++ Code Conventions

This chapter describes code conventions which should be followed when writing
C++ code.

## Code Style

Memgraph uses the
[Google Style Guide for C++](https://google.github.io/styleguide/cppguide.html)
in most of its code. You should follow it whenever writing new code.
Besides following the style guide, take a look at
[Code Review Guidelines](code-review.md) for common design issues and pitfalls
with C++, as well as [Required Reading](required-reading.md).

### Often Overlooked Style Conventions

#### Pointers & References

References provide a shorter syntax for accessing members and better declare
the intent that a pointer *should* not be `nullptr`. They do not prevent
accessing a `nullptr`, and they obfuscate the client/calling code because a
reference argument is passed just like a value. Errors with such code have
been very difficult to debug. Therefore, pointers are always used. They will
not prevent bugs, but will make some of them more obvious when reading code.

The only time a reference can be used is if it is `const`. Note that this
kind of reference is not allowed if it is stored somewhere, i.e. in a class.
You should use a pointer to `const` then. The primary reason is that
references obscure the semantics of moving an object, thus making bugs with
references pointing to invalid memory harder to track down.

An example of this can be seen when capturing member variables by reference
inside a lambda. Let's define a class that has two members, where one of those
members is a lambda that captures the other member by reference.

```cpp
struct S {
  std::function<void()> foo;
  int bar;

  S() : foo([&]() { std::cout << bar; }) {}
};
```

What would happen if we moved an instance of this object? Our lambda reference
capture will point to the same location as before, i.e. it will point to the
**old** memory location of `bar`. This means we have a dangling reference in
our code!

There are multiple ways to avoid this. The simple solutions are capturing by
value or disabling move constructors/assignments. Still, if we capture by
reference an object that is not a member of the struct containing the lambda,
we can still end up with a dangling reference if we move that object somewhere
in our code, and there is nothing we can do to prevent that. So, be careful
with lambda captures, and remember that a reference is still a pointer under
the hood!
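
One of the simple fixes mentioned above, sketched out (illustrative only):
capture `this` explicitly and delete the move and copy operations, so the
object the lambda refers to can never be relocated:

```cpp
#include <functional>
#include <iostream>

struct S {
  std::function<void()> foo;
  int bar = 0;

  S() : foo([this]() { std::cout << bar; }) {}

  // The lambda stores a pointer to this object, so moving or copying S would
  // leave foo referring to the wrong (or destroyed) instance. Forbid both.
  S(const S &) = delete;
  S &operator=(const S &) = delete;
  S(S &&) = delete;
  S &operator=(S &&) = delete;
};
```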

[Style guide reference](https://google.github.io/styleguide/cppguide.html#Reference_Arguments)

#### Constructors & RAII

RAII (Resource Acquisition Is Initialization) is a nice mechanism for managing
resources. It is especially useful when exceptions are used, as they are in
our code. Unfortunately, it does have 2 major downsides.

* Only exceptions can be used to signal failure.
* Calls to virtual methods are not resolved as expected.

For those reasons the style guide recommends doing only minimal work that
cannot fail in a constructor. Using virtual methods or doing a lot more should
be delegated to some form of an `Init` method, possibly coupled with static
factory methods. Similar rules apply to destructors, which are not allowed to
even throw exceptions.
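
A minimal, hypothetical sketch of the second pitfall: during a base class
constructor, virtual calls dispatch to the base implementation, not to the
derived override:

```cpp
#include <iostream>

class Base {
 public:
  Base() {
    // Prints "constructing Base": while Base's constructor runs, the object
    // is still a Base, so Name() does not dispatch to Derived::Name().
    std::cout << "constructing " << Name() << '\n';
  }
  virtual ~Base() = default;
  virtual const char *Name() const { return "Base"; }
};

class Derived final : public Base {
 public:
  const char *Name() const override { return "Derived"; }
};

int main() { Derived derived; }
```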

[Style guide reference](https://google.github.io/styleguide/cppguide.html#Doing_Work_in_Constructors)

#### Constructors and member variables

One of the most powerful tools in C++ is move semantics. We won't go into
detail about how it works, but you should know how to utilize it as much as
possible. In our example we will define a small `struct` called `S` which
contains only a single member, `text`, of type `std::string`.

```cpp
struct S {
  std::string text;
};
```

We want to define a constructor that takes a `std::string` and saves its
value in `text`. This is a common situation, where the constructor takes
a value and saves it in the object to be constructed.

Our first implementation would look like this:

```cpp
S(const std::string &s) : text(s) {}
```

This is a valid solution, but with one downside - we always copy. If we
construct an object like this:

```cpp
S s("some text");
```

we would create a temporary `std::string` object and then copy it to our
member variable.

Of course, we know what to do now - we will capture temporary variables using
`&&` and move them into our `text` variable.

```cpp
S(std::string &&s) : text(std::move(s)) {}
```

Now let's add an extra member variable of type `std::vector<int>` called
`words`. Our constructors accept 2 values now - a `std::vector<int>` and a
`std::string`. Those arguments could be passed by value, by reference, or
as rvalues. To cover all the cases we would need to define a dedicated
constructor for each case. Fortunately, there are two simpler options. The
first one is writing a templated constructor:

```cpp
template <typename T1, typename T2>
S(T1 &&s, T2 &&v) : text(std::forward<T1>(s)), words(std::forward<T2>(v)) {}
```

But don't forget to define a `requires` clause so you don't accept just any
type. This solution is optimal, but really hard to read AND write. The second
solution is something you should ALWAYS prefer in these simple cases where we
only store one of the arguments:

```cpp
S(std::string s, std::vector<int> v) : text(std::move(s)), words(std::move(v)) {}
```

This way we have an almost optimal solution. The only extra operation is the
extra move when we pass an `lvalue`: we copy the value into `s` and then move
it into the `text` variable. Before, we would copy directly into `text`.
Also, you should ALWAYS write const-correct code, meaning `s` and `v` cannot
be `const`, as that would not be correct here. Why is that? You CANNOT move a
const object! It would just degrade to copying the object. I would say that
this is a small price to pay for much cleaner and more maintainable code.
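
For completeness, here is a sketch of what the `requires` clause on the
templated constructor above could look like, using standard C++20 concepts;
the exact constraint shown is an illustration, not a codebase requirement:

```cpp
#include <concepts>
#include <string>
#include <utility>
#include <vector>

struct S {
  std::string text;
  std::vector<int> words;

  template <typename T1, typename T2>
    requires std::constructible_from<std::string, T1 &&> &&
             std::constructible_from<std::vector<int>, T2 &&>
  S(T1 &&s, T2 &&v)
      : text(std::forward<T1>(s)), words(std::forward<T2>(v)) {}
};
```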

### Additional Style Conventions

Old code may have broken the Google C++ Style accidentally, but new code
should adhere to it as closely as possible. We do have some exceptions
to the Google style, as well as additions for unspecified conventions.

#### Using C++ Exceptions

Unlike Google, we do not forbid using exceptions.

But you should be very careful when using them and introducing new ones. They
are indeed handy, but they cause problems with understanding the control flow,
since exceptions are another form of `goto`. It also becomes very hard to
determine that the program is in a correct state after the stack is unwound
and the thrown exception handled. Other than those issues, throwing exceptions
in destructors will terminate the program. The same will happen if a thread
doesn't handle an exception, even though it is not the main thread.

[Style guide reference](https://google.github.io/styleguide/cppguide.html#Exceptions)

In general, when introducing a new exception, either via a `throw` statement
or by calling a function which throws, you must examine all transitive callers
and update their implementation and/or documentation.

#### Assertions

We use the `CHECK` and `DCHECK` macros from the glog library. You are
encouraged to use them as often as possible to both document and validate
various pre and post conditions of a function.

`CHECK` remains even in release builds and should be preferred over its cousin
`DCHECK`, which only exists in debug builds. The primary reason is that you
want to trigger assertions in release builds in case the tests didn't
completely validate all code paths. It is better to fail fast and crash the
program than to leave it in an undefined state and potentially corrupt the end
user's work. In cases when profiling shows that `CHECK` is causing a visible
slowdown, you should switch to `DCHECK`.
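
A short sketch of using these macros to document pre and post conditions (the
function itself is hypothetical):

```cpp
#include <cstdint>

#include <glog/logging.h>

int64_t Divide(int64_t dividend, int64_t divisor) {
  // Precondition, checked even in release builds.
  CHECK(divisor != 0) << "Division by zero";
  int64_t quotient = dividend / divisor;
  // Postcondition, checked only in debug builds.
  DCHECK(quotient * divisor + dividend % divisor == dividend);
  return quotient;
}
```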

#### Template Parameter Naming

Template parameter names should start with a capital letter 'T' followed by a
short descriptive name. For example:

```cpp
template <typename TKey, typename TValue>
class KeyValueStore
```

## Code Formatting

You should install `clang-format` and run it on code you change or add. The
root of Memgraph's project contains the `.clang-format` file, which specifies
how formatting should behave. Running `clang-format -style=file` in the
project's root will read the file and behave as expected. For ease of use, you
should integrate formatting with your favourite editor.

The code formatting isn't enforced, because sometimes manual formatting may
produce better results. Still, running `clang-format` is strongly encouraged.

## Documentation

Besides following the comment guidelines from the [Google Style
Guide](https://google.github.io/styleguide/cppguide.html#Comments), your
documentation of the public API should be
[Doxygen](https://github.com/doxygen/doxygen) compatible. For private parts of
the code or for comments accompanying the implementation, you are free to
break Doxygen compatibility. In both cases, you should write your
documentation as full sentences, correctly written in English.

## Doxygen

To start a Doxygen compatible documentation string, you should open your
comment with either a JavaDoc style block comment (`/**`) or a line comment
containing 3 slashes (`///`). Take a look at the 2 examples below.

### Block Comment

```cpp
/**
 * One sentence, brief description.
 *
 * Long form description.
 */
```

### Line Comment

```cpp
/// One sentence, brief description.
///
/// Long form description.
```

If you only have a brief description, you may collapse the documentation into
a single line.

### Block Comment

```cpp
/** Brief description. */
```

### Line Comment

```cpp
/// Brief description.
```

Whichever style you choose, keep it consistent across the whole file.

Doxygen supports various commands in comments, such as `@file` and `@param`.
These help Doxygen to render specified things differently or to track them for
cross referencing. If you want to learn more, take a look at these two links:

* http://www.stack.nl/~dimitri/doxygen/manual/docblocks.html
* http://www.stack.nl/~dimitri/doxygen/manual/commands.html

## Examples

Below are a few examples of documentation from the codebase.

### Function

```cpp
/**
 * Removes whitespace characters from the start and from the end of a string.
 *
 * @param s String that is going to be trimmed.
 *
 * @return Trimmed string.
 */
inline std::string Trim(const std::string &s);
```

### Class

```cpp
/** Base class for logical operators.
 *
 * Each operator describes an operation, which is to be performed on the
 * database. Operators are iterated over using a @c Cursor. Various operators
 * can serve as inputs to others and thus a sequence of operations is formed.
 */
class LogicalOperator
    : public ::utils::Visitable<HierarchicalLogicalOperatorVisitor> {
 public:
  /** Constructs a @c Cursor which is used to run this operator.
   *
   * @param GraphDbAccessor Used to perform operations on the database.
   */
  virtual std::unique_ptr<Cursor> MakeCursor(GraphDbAccessor &db) const = 0;

  /** Return @c Symbol vector where the results will be stored.
   *
   * Currently, output symbols are only generated in @c Produce operator.
   * @c Skip, @c Limit and @c OrderBy propagate the symbols from @c Produce (if
   * it exists as input operator). In the future, we may want this method to
   * return the symbols that will be set in this operator.
   *
   * @param SymbolTable used to find symbols for expressions.
   * @return std::vector<Symbol> used for results.
   */
  virtual std::vector<Symbol> OutputSymbols(const SymbolTable &) const {
    return std::vector<Symbol>();
  }

  virtual ~LogicalOperator() {}
};
```

### File Header

```cpp
/// @file visitor.hpp
///
/// This file contains the generic implementation of visitor pattern.
///
/// There are 2 approaches to the pattern:
///
/// * classic visitor pattern using @c Accept and @c Visit methods, and
/// * hierarchical visitor which also uses @c PreVisit and @c PostVisit
///   methods.
///
/// Classic Visitor
/// ===============
///
/// Explanation on the classic visitor pattern can be found from many
/// sources, but here is the link to hopefully most easily accessible
/// information: https://en.wikipedia.org/wiki/Visitor_pattern
///
/// The idea behind the generic implementation of classic visitor pattern is to
/// allow returning any type via @c Accept and @c Visit methods. Traversing the
/// class hierarchy is relegated to the visitor classes. Therefore, visitor
/// should call @c Accept on children when visiting their parents. To implement
/// such a visitor refer to @c Visitor and @c Visitable classes.
///
/// Hierarchical Visitor
/// ====================
///
/// Unlike the classic visitor, the intent of this design is to allow the
/// visited structure itself to control the traversal. This way the internal
/// children structure of classes can remain private. On the other hand,
/// visitors may want to differentiate visiting composite types from leaf types.
/// Composite types are those which contain visitable children, unlike the leaf
/// nodes. Differentiation is accomplished by providing @c PreVisit and @c
/// PostVisit methods, which should be called inside @c Accept of composite
/// types. Regular @c Visit is only called inside @c Accept of leaf types.
/// To implement such a visitor refer to @c CompositeVisitor, @c LeafVisitor and
/// @c Visitable classes.
///
/// Implementation of hierarchical visiting is modelled after:
/// http://wiki.c2.com/?HierarchicalVisitorPattern
```

1349 docs/dev/lcp.md (file diff suppressed because it is too large)

@@ -1,15 +0,0 @@

# Other Code Conventions

While we are mainly programming in C++, we do use other programming languages
when appropriate. This chapter describes conventions for such code.

## Python

Code written in Python should adhere to
[PEP 8](https://www.python.org/dev/peps/pep-0008/). You should run `flake8` on
your code to automatically check compliance.

## Common Lisp

Code written in Common Lisp should adhere to
[Google Common Lisp Style](https://google.github.io/styleguide/lispguide.xml).

1 docs/dev/query/.gitignore (vendored)

@@ -1 +0,0 @@

html/

@@ -1,34 +0,0 @@

# Query Parsing, Planning and Execution

This part of the documentation deals with query execution.

Memgraph currently supports only query interpretation. Each new query is
parsed, analysed and translated into a sequence of operations which are then
executed on the main database storage. Query execution is organized into the
following phases:

1. [Lexical Analysis (Tokenization)](parsing.md)
2. [Syntactic Analysis (Parsing)](parsing.md)
3. [Semantic Analysis and Symbol Generation](semantic.md)
4. [Logical Planning](planning.md)
5. [Logical Plan Execution](execution.md)

The main entry point is `Interpreter::operator()`, which takes a query text
string and produces a `Results` object. To instantiate the object,
`Interpreter` needs to perform the above steps 1 through 4. If any of the
steps fail, a `QueryException` is thrown. The complete `LogicalPlan` is
wrapped into a `CachedPlan` and stored for reuse. This way we can skip the
whole process of analysing a query if it appears to be the same as a previous
one.

When we have a valid plan, the client code can invoke `Results::PullAll` with
a stream object. The `Results` instance will then execute the plan and fill
the stream with the obtained results.

Since we want to optionally run Memgraph as a distributed database, we have
hooks for creating a different plan of logical operators.
`DistributedInterpreter` inherits from `Interpreter` and overrides the
`MakeLogicalPlan` method. This method needs to return a concrete instance of
`LogicalPlan`, and in the case of a distributed database that will be
`DistributedLogicalPlan`.

![Interpreter Class Diagram](interpreter-class.png)

@@ -1,373 +0,0 @@

# Logical Plan Execution

We implement classical iterator style operators. Logical operators define
operations on the database. They encapsulate the following info: what the
input is (another `LogicalOperator`), what to do with the data, and how to do
it.

Currently logical operators can have zero or more input operations, and thus a
`LogicalOperator` tree is formed. Most `LogicalOperator` types have only one
input, so we are mostly working with chains instead of full fledged trees.
You can find information on each operator in `src/query/plan/operator.lcp`.

## Cursor

Logical operators do not perform database work themselves. Instead they create
`Cursor` objects that do the actual work, based on the info in the operator.
Cursors expose a `Pull` method that gets called by the cursor's consumer. The
consumer keeps pulling as long as `Pull` returns `true` (indicating it
successfully performed some work and might be eligible for another `Pull`).
Most cursors will call the `Pull` function of their input cursor, so typically
a cursor chain is created that is analogous to the logical operator chain it
was created from.
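
The following self-contained toy model illustrates the pull protocol described
above; the names and signatures are simplified (the real `Cursor::Pull` takes
a `Frame` and an execution context):

```cpp
#include <memory>
#include <utility>

// Every operator's cursor reports through Pull() whether it produced a row.
struct Cursor {
  virtual ~Cursor() = default;
  virtual bool Pull() = 0;
};

// Produces exactly one row and is then exhausted, like the Once operator
// described later in this document.
struct OnceCursor final : Cursor {
  bool Pull() override { return !std::exchange(did_pull_, true); }
  bool did_pull_ = false;
};

// A cursor with an input: it produces a row whenever its input does.
struct PassThroughCursor final : Cursor {
  explicit PassThroughCursor(std::unique_ptr<Cursor> input)
      : input_(std::move(input)) {}
  bool Pull() override { return input_->Pull(); }
  std::unique_ptr<Cursor> input_;
};

int main() {
  // A two-element cursor chain, analogous to a two-operator plan.
  PassThroughCursor chain(std::make_unique<OnceCursor>());
  while (chain.Pull()) {
    // The consumer does its per-row work here.
  }
}
```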

## Frame

The `Frame` object contains all the data of the current `Pull` chain. It
serves for communicating data between cursors.

For example, in a `MATCH (n) RETURN n` query the `ScanAllCursor` places a
vertex on the `Frame` for each `Pull`. It places it in the slot reserved for
the `n` symbol. Then the `ProduceCursor` can take that same value from the
`Frame` because it knows the appropriate symbol. `Frame` positions are indexed
by `Symbol` objects.
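
A toy model of that idea (the real `Frame` stores `TypedValue`s and is sized
from the symbol table; the types here are stand-ins):

```cpp
#include <string>
#include <vector>

struct Symbol {
  int position;      // index of this symbol's slot in the frame
  std::string name;  // e.g. "n"
};

class Frame {
 public:
  explicit Frame(int size) : elems_(size) {}

  // Cursors read and write slots through the symbol, not by raw index.
  double &operator[](const Symbol &symbol) { return elems_[symbol.position]; }

 private:
  std::vector<double> elems_;  // stand-in for a vector of TypedValue
};
```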

## ExpressionEvaluator

Expression results are not placed on the `Frame` since they do not need to be
communicated between different `Cursors`. Instead, expressions are evaluated
using an instance of `ExpressionEvaluator`. Since, generally speaking, an
expression can be defined by a tree of subexpressions, the
`ExpressionEvaluator` is implemented as a tree visitor. There is a performance
sub-optimality here because a stack is used to communicate intermediary
expression results between elements of the tree. This is one of the reasons
why it's planned to use the `Frame` for intermediary expression results as
well. The other reason is that it might facilitate compilation later on.

## Cypher Execution Semantics

Cypher query execution has *mostly* well-defined semantics. Some are
explicitly defined by openCypher and its TCK, while others are implicitly
defined by Neo4j's implementation of Cypher, which we want to be generally
compatible with.

These semantics can in short be described as follows: a Cypher query consists
of multiple clauses, some of which modify the graph. Generally, every clause
in the query, when reading it left to right, operates on a consistent state of
the property graph, untouched by subsequent clauses. This means that a `MATCH`
clause at the beginning operates on a graph state in which modifications by
the subsequent `SET` are not visible.

The stated semantics feel very natural to the end user, and Neo seems to
implement them well. For Memgraph the situation is complex because
`LogicalOperator` execution (through a `Cursor`) happens one `Pull` at a time
(generally meaning all the query clauses get executed for every top-level
`Pull`). This is not inherently consistent with Cypher semantics because a
`SET` clause can modify data, and the `MATCH` clause that precedes it might
see the modification in a subsequent `Pull`. Also, the `RETURN` clause might
want to stream results to the user before all `SET` clauses have been
executed, so the user might see some intermediate graph state. There are many
edge cases that Memgraph does its best to avoid to stay true to Cypher
semantics, while at the same time using a high-performance streaming approach.
The edge cases are enumerated in this document along with the implementation
details they imply.

## Implementation Peculiarities

### Once

An operator that does nothing, but whose `Cursor::Pull` returns `true` on the
first `Pull` and `false` on subsequent ones. This operator is used when
another operator has an optional input, because in Cypher a clause will
typically execute once for every input from the preceding clauses, or just
once if there was no preceding input. For example, consider the `CREATE`
clause. In the query `CREATE (n)` only one node is created, while in the query
`MATCH (n) CREATE (m)` a node is created for each existing node. Thus in our
`CreateNode` logical operator the input is either a `ScanAll` operator or a
`Once` operator.

### storage::View

In the previous section, [Cypher Execution
Semantics](#cypher-execution-semantics), we mentioned how the preceding
clauses should not see changes made in subsequent ones. For that reason, some
operators take a `storage::View` enum value. This value determines which state
of the graph an operator sees.

Consider the query `MATCH (n)--(m) WHERE n.x = 0 SET m.x = 1`. Naive streaming
could match a vertex `n` on the given criteria, expand to `m`, update its
property, and in the next iteration consider the vertex previously matched to
`m` and skip it because its newly set property value does not qualify. This
is not how Cypher works. To handle this issue properly, Memgraph designed the
`VertexAccessor` class that tracks two versions of data: one that was visible
before the current transaction+command, and the optional other that was
created in the current transaction+command. The `MATCH` clause will be planned
as `ScanAll` and `Expand` operations using the `storage::View::OLD` value.
This will ensure modifications performed in the same query do not affect it.
The same applies to edges and the `EdgeAccessor` class.

### Existing Record Detection

It's possible that a pattern element has already been declared in the same
pattern, or in a preceding pattern. For example `MATCH (n)--(m), (n)--(l)` or
a cycle-detection match `MATCH (n)-->(n) RETURN n`. Implementation-wise,
existing record detection just checks that the expanded record is equal to the
one already on the frame.

### Why Not Use Separate Expansion Ops for Edges and Vertices?

Expanding an edge and a vertex in separate ops is not feasible when matching a
cycle in bi-directional expansions. Consider the query `MATCH (n)--(n) RETURN
n`. Let's try to expand first the edge in one op, and the vertex in the next.
The vertex expansion consumes the edge expansion input. It takes the expanded
edge from the frame. It needs to detect a cycle by comparing the vertex
existing on the frame with one of the edge's vertices (`from` or `to`). But
which one? It doesn't know, and can't ensure correct cycle detection.

### Data Visibility During and After SET

In Cypher, setting values always works on the latest version of data (from the
preceding or current clause). That means that within a `SET` clause all the
changes from previous clauses must be visible, as well as changes done by the
current `SET` clause. Also, if there is a clause after `SET` it must see *all*
the changes performed by the preceding `SET`. Both these things are best
illustrated with the following queries executed on an empty database:

    CREATE (n:A {x:0})-[:EdgeType]->(m:B {x:0})
    MATCH (n)--(m) SET m.x = n.x + 1 RETURN labels(n), n.x, labels(m), m.x

This returns:

+---------+---+---------+---+
|labels(n)|n.x|labels(m)|m.x|
+:=======:+:=:+:=======:+:=:+
|[A]      |2  |[B]      |1  |
+---------+---+---------+---+
|[B]      |1  |[A]      |2  |
+---------+---+---------+---+

The obtained result implies the following operations:

1. In the first iteration, set the value of `B.x` to 1.
2. In the second iteration, observe `B.x` with the value of 1 and set
   `A.x` to 2.
3. In `RETURN`, see all the changes made in both iterations.

To implement the desired behaviour Memgraph utilizes two techniques. The first
is the already mentioned tracking of two versions of data in vertex accessors.
Using this approach ensures that the second iteration in the example query
sees the data modification performed by the preceding iteration. The second
technique is the `Accumulate` operation that accumulates all the iterations
from the preceding logical op before passing them to the next logical op. In
the example query, `Accumulate` ensures that the results returned to the user
reflect changes performed in all iterations of the query (naive streaming
could stream results at the end of the first iteration, producing inconsistent
results). Note that `Accumulate` is demanding regarding memory and slows down
query execution. For that reason it should be used only when necessary; for
example, it does not have to be used in a query that has `MATCH` and `SET` but
no `RETURN`.

### Neo4j Inconsistency on Multiple SET Clauses

Considering the preceding example, it could be expected that when a query has
multiple `SET` clauses, all the changes from the preceding ones are visible.
This is not the case in Neo4j's implementation. Consider the following queries
executed on an empty database:

    CREATE (n:A {x:0})-[:EdgeType]->(m:B {x:0})
    MATCH (n)--(m) SET n.x = n.x + 1 SET m.x = m.x * 2
    RETURN labels(n), n.x, labels(m), m.x

This returns:

+---------+---+---------+---+
|labels(n)|n.x|labels(m)|m.x|
+:=======:+:=:+:=======:+:=:+
|[A]      |2  |[B]      |1  |
+---------+---+---------+---+
|[B]      |1  |[A]      |2  |
+---------+---+---------+---+

If all the iterations of the first `SET` clause were executed before executing
the second, all the resulting values would be 2. Since this is not the case,
we conclude that Neo4j does not use a barrier-like mechanism between `SET`
clauses. It is Memgraph's current vision that this is inconsistent and we
plan to reduce Neo4j compliance in favour of operation consistency.

### Double Deletion

It's possible to match the same graph element multiple times in a single query
and delete it. Neo supports this, and so do we. The relevant implementation
detail is in the `GraphDbAccessor` class, where the record deletion functions
reside, and not in the logical plan execution. It comes down to checking if a
record has already been deleted in the current transaction+command and not
attempting to do it again (which would result in a crash).

### Set + Delete Edge-case

It's legal for a query to combine `SET` and `DELETE` clauses. Consider the
following queries executed on an empty database:

    CREATE ()-[:T]->()
    MATCH (n)--(m) SET n.x = 42 DETACH DELETE m

Due to the `MATCH` being undirected, the second pull will attempt to set data
on a deleted vertex. This is not a legal operation in Memgraph's storage
implementation. For that reason the logical operator for `SET` must check if
the record it's trying to set something on has been deleted by the current
transaction+command. If so, the modification is not executed.

### Deletion Accumulation

Sometimes it's necessary to accumulate deletions of all the matches before
attempting to execute them. Consider the following. Start with an empty
database and execute these queries:

    CREATE ()-[:T]->()-[:T]->()
    MATCH (a)-[r1]-(b)-[r2]-(c) DELETE r1, b, c

Note that the `DELETE` clause attempts to delete node `c`, but it does not
detach it by deleting edge `r2`. However, due to the undirected edge in the
`MATCH`, both edges get pulled and deleted.

Currently Memgraph does not support this behaviour, while Neo does. There are
a few ways that we could do this.

* Accumulate on deletion (that sucks because we have to keep track of
  everything that gets returned after the deletion).
* Maybe we could stream through the deletion op, but defer actual deletion
  until plan-execution end.
* Ignore this because it's very edgy (this is the currently selected option).

### Aggregation Without Input

It is necessary to define what aggregation ops return when they receive no
input. The following table shows what Neo4j's Cypher implementation and SQL
produce.

+-------------+------------------------+---------------------+---------------------+------------------+
| \<OP\>      | 1. Cypher, no group-by | 2. Cypher, group-by | 3. SQL, no group-by | 4. SQL, group-by |
+=============+:======================:+:===================:+:===================:+:================:+
| Count(\*)   | 0                      | \<NO\_ROWS>         | 0                   | \<NO\_ROWS>      |
+-------------+------------------------+---------------------+---------------------+------------------+
| Count(prop) | 0                      | \<NO\_ROWS>         | 0                   | \<NO\_ROWS>      |
+-------------+------------------------+---------------------+---------------------+------------------+
| Sum         | 0                      | \<NO\_ROWS>         | NULL                | \<NO\_ROWS>      |
+-------------+------------------------+---------------------+---------------------+------------------+
| Avg         | NULL                   | \<NO\_ROWS>         | NULL                | \<NO\_ROWS>      |
+-------------+------------------------+---------------------+---------------------+------------------+
| Min         | NULL                   | \<NO\_ROWS>         | NULL                | \<NO\_ROWS>      |
+-------------+------------------------+---------------------+---------------------+------------------+
| Max         | NULL                   | \<NO\_ROWS>         | NULL                | \<NO\_ROWS>      |
+-------------+------------------------+---------------------+---------------------+------------------+
| Collect     | []                     | \<NO\_ROWS>         | N/A                 | N/A              |
+-------------+------------------------+---------------------+---------------------+------------------+

Where:

1. `MATCH (n) RETURN <OP>(n.prop)`
2. `MATCH (n) RETURN <OP>(n.prop), (n.prop2)`
3. `SELECT <OP>(prop) FROM Table`
4. `SELECT <OP>(prop), prop2 FROM Table GROUP BY prop2`

Neo's Cypher implementation diverges from SQL only when performing `SUM`.
Memgraph implements SQL-like behaviour. It is considered that the `SUM` of
arbitrary elements should not be implicitly 0, especially in a property graph
without a strict schema (the property in question can contain values of
arbitrary types, or no values at all).

### OrderBy

The `OrderBy` logical operator sorts the results in the desired order. It
occurs in Cypher as part of a `WITH` or `RETURN` clause. Both the concept and
the implementation are straightforward. It's necessary for the logical op to
`Pull` everything from its input so it can be sorted. It's not necessary to
keep the whole `Frame` state of each input; it is sufficient to keep a list of
`TypedValues` on which the results will be sorted, and another list of values
that need to be remembered and recreated on the `Frame` when yielding.

The sorting itself is made to reflect that of Neo's implementation, which
comes down to these points.

* `Null` comes last (as if it's greater than anything).
* Primitive types compare naturally, with no implicit casting except from
  `int` to `double`.
* Complex types are not comparable.
* Every unsupported comparison results in an exception that gets propagated
  to the end user.
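
A simplified sketch of those ordering rules (the real comparator works on
`TypedValue` and raises an exception for unsupported combinations; here `Null`
is modelled as an empty optional and only numeric values are shown):

```cpp
#include <cstdint>
#include <optional>
#include <variant>

using Value = std::optional<std::variant<int64_t, double>>;

double AsDouble(const std::variant<int64_t, double> &v) {
  // The only implicit cast allowed during comparison: int to double.
  return std::holds_alternative<int64_t>(v)
             ? static_cast<double>(std::get<int64_t>(v))
             : std::get<double>(v);
}

// Returns true when lhs sorts before rhs in ascending order.
bool OrderedBefore(const Value &lhs, const Value &rhs) {
  if (!lhs) return false;  // Null comes last, as if greater than anything
  if (!rhs) return true;
  return AsDouble(*lhs) < AsDouble(*rhs);
}
```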

### Limit in Write Queries

`Limit` can be used as part of a write query, in which case it will *not*
reduce the number of performed updates. For example, consider a database that
has 10 vertices. The query `MATCH (n) SET n.x = 1 RETURN n LIMIT 3` will
result in all vertices having their property value changed, while returning
only the first 3 to the client. This makes sense from the implementation
standpoint, because `Accumulate` is planned after `SetProperty` but before the
`Produce` and `Limit` operations. Note that this behaviour can be
non-deterministic in some queries, since it relies on the order of iteration
over nodes, which is undefined when not explicitly specified.

### Merge

`MERGE` in Cypher attempts to match a pattern. If it already exists, it does
nothing and subsequent clauses like `RETURN` can use the matched pattern
elements. If the pattern can't match any data, it creates it. For detailed
information see Neo4j's [merge
documentation](https://neo4j.com/docs/developer-manual/current/cypher/clauses/merge/).

An important thing about `MERGE` is the visibility of modified data. `MERGE`
takes an input (typically a `MATCH`) and has two additional *phases*: the
merging part, and the subsequent set parts (`ON MATCH SET` and `ON CREATE
SET`). Analysis of Neo4j's behaviour indicates that each of these three phases
(input, merge, set) does not see changes to the graph state done by a
subsequent phase. The input phase does not see data created by the merge
phase, nor by the set phase. This is consistent with what seems like the
general Cypher philosophy that query clause effects aren't visible in the
preceding clauses.

We define the `Merge` logical operator as a *routing* operator that uses three
logical operator branches.

1. The input from a preceding clause.

   For example in `MATCH (n), (m) MERGE (n)-[:T]-(m)`. This input is
   optional because `MERGE` is allowed to be the first clause in a query.

2. The `merge_match` branch.

   This logical operator branch is `Pull`-ed from until exhausted for each
   successful `Pull` from the input branch.

3. The `merge_create` branch.

   This branch is `Pull`-ed when the `merge_match` branch does not match
   anything (no successful `Pull`s) for an input `Pull`. It is `Pull`-ed only
   once in such a situation, since only one creation needs to occur for a
   failed match.

The `ON MATCH SET` and `ON CREATE SET` parts of the `MERGE` clause are
included in the `merge_match` and `merge_create` branches respectively. They
are placed at the end of their branches so that they execute only when those
branches succeed.

Memgraph strives to be consistent with Neo in its `MERGE` implementation,
while at the same time keeping performance as good as possible. Consistency
with Neo w.r.t. graph state visibility is not trivial. The documentation for
`Expand` and `Set` describes how Memgraph keeps track of both the updated
version of an edge/vertex and the old one, as it was before the current
transaction+command. This technique is also used in `Merge`. The input
phase/branch of `Merge` always looks at the old data. The merge phase needs to
see the new data so it doesn't create more data than necessary.

For example, consider the query.

    MATCH (p:Person) MERGE (c:City {name: p.lives_in})

This query needs to create a city node only once for each unique `p.lives_in`.
Finally, the set phase of a `MERGE` clause should not affect the merge phase.
To achieve this, the `merge_match` branch of the `Merge` operator should see
the latest created nodes, but filter them on their old state (if those nodes
were not created by the `create_branch`). Implementation-wise that means that
the `ScanAll` and `Expand` operators in the `merge_branch` need to look at the
new graph state, while the `Filter` operators look at the old, if available.

@@ -1,23 +0,0 @@

digraph interpreter {
  node [fontname="dejavusansmono"]
  edge [fontname="dejavusansmono"]
  node [shape=record]
  edge [dir=back,arrowtail=empty,arrowsize=1.5]
  Interpreter [label="{\N|+ operator(query : string, ...) : Results\l|
                      # MakeLogicalPlan(...) : LogicalPlan\l|
                      - plan_cache_ : Map(QueryHash, CachedPlan)\l}"]
  Interpreter -> DistributedInterpreter
  Results [label="{\N|+ PullAll(stream) : void\l|- plan_ : CachedPlan\l}"]
  Interpreter -> Results
      [dir=forward,style=dashed,arrowhead=open,label="<<create>>"]
  CachedPlan -> Results
      [dir=forward,arrowhead=odiamond,taillabel="1",headlabel="*"]
  Interpreter -> CachedPlan [arrowtail=diamond,taillabel="1",headlabel="*"]
  CachedPlan -> LogicalPlan [arrowtail=diamond]
  LogicalPlan [label="{\N|+ GetRoot() : LogicalOperator
                      \l+ GetCost() : double\l}"]
  LogicalPlan -> SingleNodeLogicalPlan [style=dashed]
  LogicalPlan -> DistributedLogicalPlan [style=dashed]
  DistributedInterpreter -> DistributedLogicalPlan
      [dir=forward,style=dashed,arrowhead=open,label="<<create>>"]
}

Binary file not shown. (Before: 55 KiB)

@@ -1,62 +0,0 @@

# Lexical and Syntactic Analysis

## Antlr

We use Antlr for lexical and syntactic analysis of Cypher queries. Antlr uses
the grammar file `Cypher.g4`, downloaded from http://www.opencypher.org, to
generate the parser and the visitor for the Cypher parse tree. Even though the
provided grammar is not very pleasant to work with, we decided not to make any
drastic changes to it so that our transition to newly published versions of
`Cypher.g4` would be easier. Nevertheless, we had to fix some bugs and add
features, so our version is not completely the same.

In addition to using `Cypher.g4`, we have `MemgraphCypher.g4`. This grammar
file defines Memgraph specific extensions to the original grammar. The most
notable example is the inclusion of syntax for handling authorization. At the
moment, some extensions are also found in `Cypher.g4`, for example the syntax
for using a lambda function in relationship patterns. These extensions should
be moved out of `Cypher.g4`, so that it remains as close to the original
grammar as possible. Additionally, having `MemgraphCypher.g4` may not be
enough if we wish to split the functionality for community and enterprise
editions of Memgraph.

## Abstract Syntax Tree (AST)

Since the Antlr generated visitor and the official openCypher grammar are not
very practical to use, we translate Antlr's AST to our own AST. Currently
there are ~40 types of nodes in our AST. Their definitions can be found in
`src/query/frontend/ast/ast.lcp`.

Major groups of types can be found under the following base types.

* `Expression` --- types corresponding to Cypher expressions.
* `Clause` --- types corresponding to Cypher clauses.
* `PatternAtom` --- node or edge related information.
* `Query` --- different kinds of queries; allows extending the language with
  Memgraph specific query syntax.

Memory management of created AST nodes is done with `AstStorage`. Each node
must be created by invoking the `AstStorage::Create` method. This way all of
the pointers to nodes and their children are raw pointers. The only owner of
the allocated memory is the `AstStorage`. When the storage goes out of scope,
the pointers become invalid. It may be more natural to handle tree ownership
via `unique_ptr`, i.e. each node owning its children, but there are some
benefits to having a custom storage and allocation scheme.
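
A hypothetical sketch of that ownership scheme (the real `AstStorage` and node
types differ in detail):

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct AstNode {
  virtual ~AstNode() = default;
};

struct Identifier final : AstNode {
  std::string name;
};

class AstStorage {
 public:
  // Every node is created through the storage, which is its sole owner; the
  // returned raw pointer is valid only as long as the storage is alive.
  template <class TNode, class... TArgs>
  TNode *Create(TArgs &&...args) {
    nodes_.push_back(std::make_unique<TNode>(std::forward<TArgs>(args)...));
    return static_cast<TNode *>(nodes_.back().get());
  }

 private:
  std::vector<std::unique_ptr<AstNode>> nodes_;
};

// Usage: the tree holds raw, non-owning pointers into the storage.
// AstStorage storage;
// Identifier *ident = storage.Create<Identifier>();
```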
|
||||
|
||||
The primary reason we opted for not using `unique_ptr` is the requirement of
|
||||
Antlr's base visitor class that the resulting values must by copyable. The
|
||||
result is wrapped in `antlr::Any` so that the derived visitor classes may
|
||||
return any type they wish when visiting Antlr's AST. Unfortunately,
|
||||
`antlr::Any` does not work with non-copyable types.
|
||||
|
||||
Another benefit of having `AstStorage` is that we can easily add a different
|
||||
allocation scheme for AST nodes. The interface of node creation would not
|
||||
change.
|
||||
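To make the ownership model concrete, here is a minimal sketch of what an
arena-style storage with a `Create` method could look like. The node type
and the exact signatures are simplified assumptions for illustration; the
real definitions live in `ast.lcp` and the query frontend.

    #include <memory>
    #include <string>
    #include <utility>
    #include <vector>

    // Hypothetical common base for all AST nodes (the real types differ).
    struct AstNode {
      virtual ~AstNode() = default;
    };

    struct Identifier : AstNode {
      explicit Identifier(std::string name) : name(std::move(name)) {}
      std::string name;
    };

    // The storage owns every node; callers only ever see raw pointers whose
    // lifetime is bound to the storage object.
    class AstStorage {
     public:
      template <class TNode, class... TArgs>
      TNode *Create(TArgs &&...args) {
        nodes_.emplace_back(
            std::make_unique<TNode>(std::forward<TArgs>(args)...));
        return static_cast<TNode *>(nodes_.back().get());
      }

     private:
      std::vector<std::unique_ptr<AstNode>> nodes_;
    };

    int main() {
      AstStorage storage;
      auto *ident = storage.Create<Identifier>("n");  // raw, storage-owned
      return ident->name.size() == 1 ? 0 : 1;
    }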
|
||||
### AST Translation
|
||||
|
||||
The translation process is done via the `CypherMainVisitor` class, which is
derived from the Antlr-generated visitor. Besides instantiating our AST
types, a minimal number of syntactic checks is done on a query. These
checks handle the cases which were valid in the original openCypher
grammar, but may be invalid when combined with other syntax elements.
|
@ -1,526 +0,0 @@
|
||||
# Logical Planning
|
||||
|
||||
After the semantic analysis and symbol generation, the AST is converted to a
|
||||
tree of logical operators. This conversion is called *planning* and the tree
|
||||
of logical operators is called a *plan*. The whole planning process is done in
|
||||
the following steps.
|
||||
|
||||
1. [AST Preprocessing](#ast-preprocessing)
|
||||
|
||||
The first step is to preprocess the AST by collecting
|
||||
information on filters, divide the query into parts, normalize patterns
|
||||
in `MATCH` clauses, etc.
|
||||
|
||||
2. [Logical Operator Planning](#logical-operator-planning)
|
||||
|
||||
After the preprocess step, the planning can be done via 2 planners:
|
||||
`VariableStartPlanner` and `RuleBasedPlanner`. The first planner will
|
||||
generate multiple plans where each plan has different starting points for
|
||||
searching the patterns in `MATCH` clauses. The second planner produces a
|
||||
single plan by mapping the query parts as they are to logical operators.
|
||||
|
||||
3. [Logical Plan Postprocessing](#logical-plan-postprocessing)
|
||||
|
||||
In this stage, we perform various transformations on the generated logical
|
||||
plan. Here we want to optimize the operations in order to improve
|
||||
performance during the execution. Naturally, transformations need to
|
||||
preserve the semantic behaviour of the original plan.
|
||||
|
||||
4. [Cost Estimation](#cost-estimation)
|
||||
|
||||
After the generation, the execution cost of each plan is estimated. This
|
||||
estimation is used to select the best plan which will be executed.
|
||||
|
||||
The implementation can be found in the `query/plan` directory, with the public
|
||||
entry point being `query/plan/planner.hpp`.
|
||||
|
||||
## AST Preprocessing
|
||||
|
||||
Each openCypher query consists of at least 1 **single query**. Multiple single
|
||||
queries are chained together using a **query combinator**. Currently, there is
|
||||
only one combinator, `UNION`. The preprocessing step starts in the
|
||||
`CollectQueryParts` function. This function will take a look at each single
|
||||
query and divide it into parts. Each part is separated with `RETURN` and
|
||||
`WITH` clauses. For example:
|
||||
|
||||
MATCH (n) CREATE (m) WITH m MATCH (l)-[]-(m) RETURN l
|
||||
| | |
|
||||
|------- part 1 -----------+-------- part 2 --------|
|
||||
| |
|
||||
|-------------------- single query -----------------|
|
||||
|
||||
Each part is created by collecting all `MATCH` clauses and *normalizing* their
|
||||
patterns. Pattern normalization is the process of converting an arbitrarily
|
||||
long pattern chain of nodes and edges into a list of triplets `(start node,
|
||||
edge, end node)`. The triplets should preserve the semantics of the match. For
|
||||
example:
|
||||
|
||||
MATCH (a)-[p]-(b)-[q]-(c)-[r]-(d)
|
||||
|
||||
is equivalent to:
|
||||
|
||||
MATCH (a)-[p]-(b), (b)-[q]-(c), (c)-[r]-(d)
|
||||
|
||||
With this representation, it becomes easier to reorder the triplets and choose
|
||||
different strategies for pattern matching.
|
||||
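A rough sketch of such a normalization, assuming the pattern is already
available as alternating lists of node and edge atoms (the names and types
below are illustrative, not the actual planner code):

    #include <cassert>
    #include <cstddef>
    #include <string>
    #include <vector>

    struct NodeAtom { std::string symbol; };
    struct EdgeAtom { std::string symbol; };

    struct Triplet {
      NodeAtom start;
      EdgeAtom edge;
      NodeAtom end;
    };

    // Convert (a)-[p]-(b)-[q]-(c) into [(a, p, b), (b, q, c)].
    std::vector<Triplet> NormalizePattern(const std::vector<NodeAtom> &nodes,
                                          const std::vector<EdgeAtom> &edges) {
      assert(nodes.size() == edges.size() + 1);
      std::vector<Triplet> triplets;
      for (std::size_t i = 0; i < edges.size(); ++i) {
        triplets.push_back({nodes[i], edges[i], nodes[i + 1]});
      }
      return triplets;
    }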
|
||||
In addition to normalizing patterns, all of the filter expressions in patterns
|
||||
and inside of the `WHERE` clause (of the accompanying `MATCH`) are extracted
|
||||
and stored separately. During the extraction, symbols used in the filter
|
||||
expression are collected. This allows for planning filters in a valid order,
|
||||
as the matching for triplets is being done. Another important benefit of
|
||||
having extra information on filters, is to recognize when a database index
|
||||
could be used.
|
||||
|
||||
After each `MATCH` is processed, they are all grouped, so that even whole
`MATCH` clauses may be reordered. The important thing is to remember which
symbols were used to name edges in each `MATCH`. With those symbols we can
plan for *cyphermorphism*, i.e. ensure that different edges in the search
pattern of a single `MATCH` map to different edges in the graph. This
preserves the semantics of the query, even though we may have reordered the
matching. The same steps are done for `OPTIONAL MATCH`.
|
||||
|
||||
Another clause which needs processing is `MERGE`. Here we normalize the
|
||||
pattern, since the `MERGE` is a bit like `MATCH` and `CREATE` in one.
|
||||
|
||||
All the other clauses are left as is.
|
||||
|
||||
In the end, each query part consists of:
|
||||
|
||||
* processed and grouped `MATCH` clauses;
|
||||
* processed and grouped `OPTIONAL MATCH` clauses;
|
||||
* processed `MERGE` matching pattern and
|
||||
* unchanged remaining clauses.
|
||||
|
||||
The last stored clause is guaranteed to be either `WITH` or `RETURN`.
|
||||
|
||||
## Logical Operator Planning
|
||||
|
||||
### Variable Start Planner
|
||||
|
||||
The `VariableStartPlanner` generates multiple plans for a single query. Each
|
||||
plan is generated by selecting a different starting point for pattern
|
||||
matching.
|
||||
|
||||
The algorithm works as follows.
|
||||
|
||||
1. For each query part:
|
||||
1. For each node in triplets of collected `MATCH` clauses:
|
||||
i. Add the node to a set of `expanded` nodes
|
||||
ii. Select a triplet `(start node, edge, end node)` whose `start node` is
|
||||
in the `expanded` set
|
||||
iii. If no triplet was selected, choose a new starting node that isn't in
|
||||
`expanded` and continue expanding
|
||||
iv. Repeat steps ii. -- iii. until all triplets have been selected
|
||||
and store that as a variation of the `MATCH` clauses
|
||||
2. Do step 1.1. for `OPTIONAL MATCH` and `MERGE` clauses
|
||||
3. Take all combinations of the generated `MATCH`, `OPTIONAL MATCH` and
|
||||
`MERGE` and store them as variations of the query part.
|
||||
2. For each combination of query part variations:
|
||||
1. Generate a plan using the rule based planner
|
||||
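A minimal sketch of the enumeration described above, using the triplets
produced by pattern normalization (an illustration of the idea, not the
actual `VariableStartPlanner` code):

    #include <set>
    #include <string>
    #include <vector>

    struct Triplet {
      std::string start, edge, end;
    };

    // For a chosen starting node, greedily order the triplets so that each
    // one expands from an already-bound node (simplified; ignores direction).
    std::vector<Triplet> OrderFrom(std::string start, std::vector<Triplet> rest) {
      std::set<std::string> expanded{start};
      std::vector<Triplet> ordered;
      while (!rest.empty()) {
        bool progress = false;
        for (auto it = rest.begin(); it != rest.end(); ++it) {
          if (expanded.count(it->start) || expanded.count(it->end)) {
            expanded.insert(it->start);
            expanded.insert(it->end);
            ordered.push_back(*it);
            rest.erase(it);
            progress = true;
            break;
          }
        }
        // Disconnected pattern: pick a fresh starting node and continue.
        if (!progress) expanded.insert(rest.front().start);
      }
      return ordered;
    }

    // Each distinct starting node yields one variation of the MATCH.
    std::vector<std::vector<Triplet>> Variations(
        const std::vector<Triplet> &triplets) {
      std::set<std::string> starts;
      for (const auto &t : triplets) {
        starts.insert(t.start);
        starts.insert(t.end);
      }
      std::vector<std::vector<Triplet>> result;
      for (const auto &s : starts) result.push_back(OrderFrom(s, triplets));
      return result;
    }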
|
||||
### Rule Based Planner
|
||||
|
||||
The `RuleBasedPlanner` generates a single plan for a single query. A plan is
|
||||
generated by following hardcoded rules for producing logical operators. The
|
||||
following sections are an overview on how each openCypher clause is converted
|
||||
to a `LogicalOperator`.
|
||||
|
||||
#### MATCH
|
||||
|
||||
`MATCH` clause is used to specify which patterns need to be searched for in
|
||||
the database. These patterns are normalized in the preprocess step to be
|
||||
represented as triplets `(start node, edge, end node)`. When there is no edge,
|
||||
then the triplet is reduced only to the `start node`. Generating the operators
|
||||
is done by looping over these triplets.
|
||||
|
||||
##### Searching for Nodes
|
||||
|
||||
The simplest search is finding standalone nodes. For example, `MATCH (n)`
|
||||
will find all the nodes in the graph. This is accomplished by generating a
|
||||
`ScanAll` operator and forwarding the node symbol which should store the
|
||||
results. In this case, all the nodes will be referenced by `n`.
|
||||
|
||||
Multiple nodes can be specified in a single match, e.g. `MATCH (n), (m)`.
|
||||
Planning is done by repeating the same steps for each sub pattern (separated
|
||||
by a comma). In this case, we would get 2 `ScanAll` operators chained one
|
||||
after the other. An optimization can be obtained if the node in the pattern is
|
||||
already searched for. In `MATCH (n), (n)` we can drop the second `ScanAll`
|
||||
operator since we have already generated it for the first node.
|
||||
|
||||
##### Searching for Relationships
|
||||
|
||||
A more advanced search includes finding nodes with relationships. For example,
|
||||
`MATCH (n)-[r]-(m)` should find every pair of connected nodes in the database.
|
||||
This means, that if a single node has multiple connections, it will be
|
||||
repeated for each combination of pairs. The generation of operators starts
|
||||
from the first node in the pattern. If we are referencing a new starting node,
|
||||
we need to generate a `ScanAll` which finds all the nodes and stores them
|
||||
into `n`. Then, we generate an `Expand` operator which reads the `n` and
|
||||
traverses all the edges of that node. The edge is stored into `r`, while the
|
||||
destination node is stored in `m`.
|
||||
|
||||
Matching multiple relationships proceeds similarly, by repeating the same
|
||||
steps. The only difference is that we need to ensure different edges in the
|
||||
search pattern, map to different edges in the graph. This means that after each
|
||||
`Expand` operator, we need to generate an `EdgeUniquenessFilter`. We provide
|
||||
this operator with a list of symbols for the previously matched edges and the
|
||||
symbol for the current edge.
|
||||
|
||||
For example.
|
||||
|
||||
MATCH (n)-[r1]-(m)-[r2]-(l)
|
||||
|
||||
The above is preprocessed into
|
||||
|
||||
MATCH (n)-[r1]-(m), (m)-[r2]-(l)
|
||||
|
||||
Then we look at each triplet in order and perform the described steps. This
|
||||
way, we would generate:
|
||||
|
||||
ScanAll (n) > Expand (n, r1, m) > Expand (m, r2, l) >
|
||||
EdgeUniquenessFilter ([r1], r2)
|
||||
|
||||
Note that we don't need to generate an `EdgeUniquenessFilter` after the
first `Expand`, since there are no previous edges to compare against. This
filtering needs to work across multiple patterns, but inside a *single*
`MATCH` clause.
|
||||
|
||||
Let's take a look at the following.
|
||||
|
||||
MATCH (n)-[r1]-(m), (m)-[r2]-(l)
|
||||
|
||||
We would also generate the exact same operators.
|
||||
|
||||
ScanAll (n) > Expand (n, r1, m) > Expand (m, r2, l) >
|
||||
EdgeUniquenessFilter ([r1], r2)
|
||||
|
||||
On the other hand,
|
||||
|
||||
MATCH (n)-[r1]-(m) MATCH (m)-[r2]-(l)-[r3]-(i)
|
||||
|
||||
would reset the uniqueness filtering at the start of the second match. This
|
||||
would mean that we output the following:
|
||||
|
||||
ScanAll (n) > Expand (n, r1, m) > Expand (m, r2, l) > Expand (l, r3, i) >
|
||||
EdgeUniquenessFilter ([r2], r3)
|
||||
|
||||
There is a difference in how we handle edge uniqueness compared to Neo4j.
|
||||
Neo4j does not allow searching for a single edge multiple times, but we've
|
||||
decided to support that.
|
||||
|
||||
For example, the user can say the following.
|
||||
|
||||
MATCH (n)-[r]-(m)-[r]-(l)
|
||||
|
||||
We would ensure that both `r` variables match to the same edge. In our
|
||||
terminology, we call this the *edge cycle*. For the above example, we would
|
||||
generate this plan.
|
||||
|
||||
ScanAll (n) > Expand (n, r, m) > Expand (m, r, l)
|
||||
|
||||
We do not put an `EdgeUniquenessFilter` operator between the 2 `Expand`
operators; instead, we tell the 2nd `Expand` that it is an edge cycle. That
2nd `Expand` will then ensure that both occurrences of `r` matched the same
edge.
|
||||
|
||||
##### Filtering
|
||||
|
||||
To narrow the search down, the patterns in `MATCH` can have filtered labels
and properties. A more general filtering is done using the accompanying
`WHERE` clause. During the preprocessing step, all filters are collected
and extracted into expressions. Additional information on which symbols are
used is also stored. This way, each time we generate a `ScanAll` or
`Expand`, we look at all the filters to see if any of them can be used,
i.e. whether the symbols they use have been bound by a newly produced
operator. If a filter expression can be used, we immediately add a `Filter`
operator with that expression.
|
||||
|
||||
For example.
|
||||
|
||||
MATCH (n)-[r]-(m :label) WHERE n.prop = 42
|
||||
|
||||
We would produce:
|
||||
|
||||
ScanAll (n) > Filter (n.prop) > Expand (n, r, m) > Filter (m :label)
|
||||
|
||||
This means that the same plan is generated for the query:
|
||||
|
||||
MATCH (n {prop: 42})-[r]-(m :label)
|
||||
|
||||
#### OPTIONAL
|
||||
|
||||
If a `MATCH` clause is preceded by `OPTIONAL`, then we need to generate a plan
|
||||
such that we produce results even if we fail to match anything. This is
|
||||
accomplished by generating an `Optional` operator, which takes 2 operator
|
||||
trees:
|
||||
|
||||
* input operation and
|
||||
* optional operation.
|
||||
|
||||
The input is the operation we generated for the part of the query before
|
||||
`OPTIONAL MATCH`. For the optional operation, we simply generate the `OPTIONAL
|
||||
MATCH` part just like we would for regular `MATCH`. In addition to operations,
|
||||
we need to send the symbols which are set during optional matching to the
|
||||
`Optional` operator. The operator will reset values of those symbols to
|
||||
`null`, when the optional part fails to match.
|
||||
|
||||
#### RETURN & WITH
|
||||
|
||||
`RETURN` and `WITH` clauses are very similar to each other. The only
|
||||
difference is that `WITH` separates parts of the query and can be paired with
|
||||
`WHERE` clause.
|
||||
|
||||
The common part is generating operators for the body of the clause. Separation
|
||||
of query parts is mostly done in semantic analysis, which checks that only the
|
||||
symbols exposed through `WITH` are visible in the query parts after the
|
||||
clause. The minor part is done in planning.
|
||||
|
||||
##### Named Results
|
||||
|
||||
Both clauses contain multiple named expressions (`expr AS name`) which are
|
||||
used to generate `Produce` operator.
|
||||
|
||||
##### Aggregations
|
||||
|
||||
If an expression contains an aggregation operator (`sum`, `avg`, ...) we need
|
||||
to plan the `Aggregate` operator as input to `Produce`. This case is more
|
||||
complex, because aggregation in openCypher can perform implicit grouping of
|
||||
results used for aggregation.
|
||||
|
||||
For example, `WITH/RETURN sum(n.x) AS s, n.y AS group` will implicitly group
|
||||
by `n.y` expression.
|
||||
|
||||
Another, obscure grouping can be achieved with `RETURN sum(n.a) + n.b AS s`.
|
||||
Here, the `n.b` will be used for grouping, even though both the `sum` and
|
||||
`n.b` are in the same named expression.
|
||||
|
||||
Therefore, we need to collect all expressions which do not contain
aggregations and use them for grouping. You may have noticed that in the
last example `sum` is actually a sub-expression of `+`. The `Aggregate`
operator does not see that (nor should it), so the responsibility of
evaluating the full expression falls on `Produce`. One option would be for
`Aggregate` to store the results of grouping expressions on the frame in
addition to the aggregation results. Unfortunately, this would require
rewiring named expressions in `Produce` to reference the already evaluated
expressions. In the current implementation, we opted for `Aggregate` to
store only aggregation results on the frame, while `Produce` re-evaluates
all the other (grouping) expressions. To handle that, the symbols used in
those expressions are passed to `Aggregate`, so that their values can be
remembered. `Produce` will read those symbols from the frame and use them
to re-evaluate the needed expressions.
|
||||
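For illustration, splitting a single named expression into aggregations and
(simplified, leaf-level) group-by expressions could look roughly like this;
the types are placeholders, not the real AST classes:

    #include <string>
    #include <vector>

    struct Expression {
      bool is_aggregation = false;       // e.g. sum, avg, ...
      std::vector<Expression> children;  // sub-expressions
      std::string symbol;                // simplified identifier
    };

    // Collect aggregations and non-aggregated (group-by) sub-expressions of
    // one named expression, mirroring the split between Aggregate and
    // Produce described above.
    void Split(const Expression &expr, std::vector<Expression> *aggregations,
               std::vector<Expression> *group_by) {
      if (expr.is_aggregation) {
        aggregations->push_back(expr);
        return;
      }
      if (expr.children.empty()) {
        group_by->push_back(expr);
        return;
      }
      for (const auto &child : expr.children) Split(child, aggregations, group_by);
    }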
|
||||
##### Accumulation
|
||||
|
||||
After we have `Produce` and potentially `Aggregate`, we need to handle a
special case when the part of the query before `RETURN` or `WITH` performs
updates. In that case, we want to run that part of the query fully, so that
we get the latest results. This is accomplished by adding an `Accumulate`
operator as input to `Aggregate` or `Produce` (if there is no aggregation).
Accumulation will store the values of all the symbols used inside `RETURN`
and `WITH`, so that they can be used in the operator which follows. This
way, only parts of the frame are copied, instead of the whole frame. There
is a minor difference between planning `WITH` and `RETURN`. Since `WITH`
can separate writing from reading, we need to advance the transaction
command. This enables the later, read parts of the query to obtain the
newest changes. This is supported by passing the `advance_command` flag to
the `Accumulate` operator.
|
||||
|
||||
In the simplest case, common to both clauses, we have `Accumulate > Aggregate
|
||||
> Produce` operators, where `Accumulate` and `Aggregate` may be left out.
|
||||
|
||||
##### Ordering
|
||||
|
||||
Planning `ORDER BY` is simple enough. Since it may see new symbols (filled in
|
||||
`Produce`), we add the `OrderBy` operator at the end. The operator will change
|
||||
the order of produced results, so we pass it the ordering expressions and the
|
||||
output symbols of named expressions.
|
||||
|
||||
##### Filtering
|
||||
|
||||
A final difference in `WITH`, is when it contains a `WHERE` clause. For that,
|
||||
we simply generate the `Filter` operator, appended after `Produce` or
|
||||
`OrderBy` (depending which operator is last).
|
||||
|
||||
##### Skipping and Limiting
|
||||
|
||||
If we have `SKIP` or `LIMIT`, we generate `Skip` or `Limit` operators,
|
||||
respectively. These operators are put at the end of the clause.
|
||||
|
||||
This placement may have some unexpected behaviour when combined with
|
||||
operations that update the graph. For example.
|
||||
|
||||
MATCH (n) SET n.x = n.x + 1 RETURN n LIMIT 1
|
||||
|
||||
The above query may be interpreted as if the `SET` will be done only once.
|
||||
Since this is a write query, we need to accumulate results, so the part before
|
||||
`RETURN` will execute completely. The accumulated results will be yielded up
|
||||
to the given limit, and the user would get only the first `n` that was
|
||||
updated. This may confuse the user because in reality, every node in the
|
||||
database had been updated.
|
||||
|
||||
Note that `Skip` always comes before `Limit`. In the current implementation,
|
||||
they are generated directly one after the other.
|
||||
|
||||
#### CREATE
|
||||
|
||||
`CREATE` clause is used to create nodes and edges (relationships).
|
||||
|
||||
For multiple `CREATE` clauses, or multiple creation patterns in a single
clause, we repeat the following steps for each of them.
|
||||
|
||||
##### Creating a Single Node
|
||||
|
||||
A node is created by simply specifying a node pattern.
|
||||
|
||||
For example `CREATE (n :label {property: "value"}), ()` would create 2 nodes.
|
||||
The 1st one would be created with a label and a property. This node could be
|
||||
referenced later in the query, by using the variable `n`. The 2nd node cannot
|
||||
be referenced and it would be created without any labels or properties. For
|
||||
node creation, we generate a `CreateNode` operator and pass it all the details
|
||||
of node creation: variable symbol, labels and properties. In the mentioned
|
||||
example, we would have `CreateNode > CreateNode`.
|
||||
|
||||
##### Creating a Relationship
|
||||
|
||||
To create a relationship, the `CREATE` clause must contain a pattern with a
|
||||
directed edge. Compared to creating a single node, this case is a bit more
|
||||
complicated, because either side of the edge may not exist. By exist, we mean
|
||||
that the endpoint is a variable which already references a node.
|
||||
|
||||
For example, `MATCH (n) CREATE (n)-[r]->(m)` would create an edge `r` and a
|
||||
node `m` for each matched node `n`. If we focus on the `CREATE` part, we
|
||||
generate `CreateExpand (n, r, m)` where `n` already exists (refers to matched
|
||||
node) and `m` would be newly created along with edge `r`. If we had only
|
||||
`CREATE (n)-[r]->(m)`, then we would need to create both nodes of the edge
|
||||
`r`. This is done by generating `CreateNode (n) > CreateExpand(n, r, m)`. The
|
||||
final case is when both endpoints refer to an existing node. For example, when
|
||||
adding a node with a cyclical connection `CREATE (n)-[r]->(n)`. In this case,
|
||||
we would generate `CreateNode (n) > CreateExpand (n, r, n)`. We would tell
|
||||
`CreateExpand` to only create the edge `r` between the already created `n`.
|
||||
|
||||
#### MERGE
|
||||
|
||||
Although the merge operation is complex, planning turns out to be relatively
|
||||
simple. The pattern inside the `MERGE` clause is used for both matching and
|
||||
creating. Therefore, we create 2 operator trees, one for each action.
|
||||
|
||||
For example.
|
||||
|
||||
MERGE (n)-[r:r]-(m)
|
||||
|
||||
We would generate a single `Merge` operator which has the following.
|
||||
|
||||
* No input operation (since it is not preceded by any other clause).
|
||||
|
||||
* On match operation
|
||||
|
||||
`ScanAll (n) > Expand (n, r, m) > Filter (r)`
|
||||
|
||||
* On create operation
|
||||
|
||||
`CreateNode (n) > CreateExpand (n, r, m)`
|
||||
|
||||
In cases when `MERGE` contains `ON MATCH` and `ON CREATE` parts, we simply
|
||||
append their operations to the respective operator trees.
|
||||
|
||||
Observe the following example.
|
||||
|
||||
MERGE (n)-[r:r]-(m) ON MATCH SET n.x = 42 ON CREATE SET m :label
|
||||
|
||||
The `Merge` would be generated with the following.
|
||||
|
||||
* No input operation (again, since there is no clause preceding it).
|
||||
|
||||
* On match operation
|
||||
|
||||
`ScanAll (n) > Expand (n, r, m) > Filter (r) > SetProperty (n.x, 42)`
|
||||
|
||||
* On create operation
|
||||
|
||||
`CreateNode (n) > CreateExpand (n, r, m) > SetLabels (n, :label)`
|
||||
|
||||
When we have preceding clauses, we simply put their operator as input to
|
||||
`Merge`.
|
||||
|
||||
MATCH (n) MERGE (n)-[r:r]-(m)
|
||||
|
||||
The above would be generated as
|
||||
|
||||
ScanAll (n) > Merge (on_match_operation, on_create_operation)
|
||||
|
||||
Here we need to be careful to recognize which symbols are already declared.
|
||||
But, since the `on_match_operation` uses the same algorithm for generating a
|
||||
`Match`, that problem is handled there. The same should hold for
|
||||
`on_create_operation`, which uses the process of generating a `Create`. So,
|
||||
finally for this example, the `Merge` would have:
|
||||
|
||||
* Input operation
|
||||
|
||||
`ScanAll (n)`
|
||||
|
||||
* On match operation
|
||||
|
||||
`Expand (n, r, m) > Filter (r)`
|
||||
|
||||
Note that `ScanAll` is not needed since we get the nodes from input.
|
||||
|
||||
* On create operation
|
||||
|
||||
`CreateExpand (n, r, m)`
|
||||
|
||||
Note that `CreateNode` is dropped, since we want to expand the existing one.
|
||||
|
||||
## Logical Plan Postprocessing
|
||||
|
||||
Postprocessing of a logical plan is done by rewriting the original plan
into a more efficient one, while preserving the original semantics of the
operations. The rewriters are found in the `query/plan/rewrite` directory;
currently we only have one -- `IndexLookupRewriter`.
|
||||
|
||||
### IndexLookupRewriter
|
||||
|
||||
The job of this rewriter is to merge `Filter` and `ScanAll` operations into
|
||||
equivalent `ScanAllBy<Index>` operations. In almost all cases using indexed
|
||||
lookup will be faster than regular lookup, so `IndexLookupRewriter` simply
|
||||
does the transformations whenever possible. The simplest case being the
|
||||
following, assuming we have an index over `id`.
|
||||
|
||||
* Original Plan
|
||||
|
||||
`ScanAll (n) > Filter (id(n) == 42) > Produce (n)`
|
||||
|
||||
* Rewritten Plan
|
||||
|
||||
`ScanAllById (n, id=42) > Produce (n)`
|
||||
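A sketch of the kind of matching such a rewrite involves; the operator
structs below are simplified placeholders rather than the real `query/plan`
types:

    #include <optional>
    #include <string>

    struct ScanAll { std::string symbol; };

    struct IdEqualsFilter {  // stands in for Filter(id(n) == <literal>)
      std::string symbol;
      long id;
    };

    struct ScanAllById {
      std::string symbol;
      long id;
    };

    // If the filter constrains exactly the symbol produced by the scan,
    // the two operators can be fused into an indexed lookup.
    std::optional<ScanAllById> TryRewrite(const ScanAll &scan,
                                          const IdEqualsFilter &filter) {
      if (filter.symbol != scan.symbol) return std::nullopt;
      return ScanAllById{scan.symbol, filter.id};
    }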
|
||||
Naturally, there are some cases we need to be careful about.
|
||||
|
||||
1. Operators with Multiple Branches

Here we may not carry `Filter` operations outside of the operator into
its branches, so the branches are rewritten as stand-alone plans with a
brand new `IndexLookupRewriter`. Some of the operators with multiple
branches are `Merge`, `Optional`, `Cartesian` and `Union`.
|
||||
|
||||
2. Expand Operators
|
||||
|
||||
Expand operations aren't that tricky to handle, but they have a special
|
||||
case where we want to use an indexed lookup of the destination so that the
|
||||
expansion is performed between known nodes. This decision may depend on
|
||||
various parameters which may need further tweaking as we encounter more
|
||||
use-cases of Cypher queries.
|
||||
|
||||
## Cost Estimation
|
||||
|
||||
Cost estimation is the final step of processing a logical plan. The
|
||||
implementation can be found in `query/plan/cost_estimator.hpp`. We give each
|
||||
operator a cost based on the estimated cardinality of results of that operator
|
||||
and on the preset coefficient of the runtime performance of that operator.
|
||||
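The core idea can be illustrated with a small sketch; the coefficients and
selectivities below are made up for the example, while the real ones live
in `query/plan/cost_estimator.hpp`:

    struct CostEstimate {
      double cardinality = 1.0;  // estimated number of output rows
      double cost = 0.0;         // accumulated cost of the whole plan
    };

    // Hypothetical per-operator coefficients (cost of processing one row).
    constexpr double kScanAllCoef = 1.0;
    constexpr double kExpandCoef = 3.0;
    constexpr double kFilterCoef = 1.5;
    constexpr double kFilterSelectivity = 0.25;

    CostEstimate ScanAll(double vertex_count) {
      return {vertex_count, vertex_count * kScanAllCoef};
    }

    CostEstimate Expand(CostEstimate input, double avg_degree) {
      double rows = input.cardinality * avg_degree;
      return {rows, input.cost + rows * kExpandCoef};
    }

    CostEstimate Filter(CostEstimate input) {
      return {input.cardinality * kFilterSelectivity,
              input.cost + input.cardinality * kFilterCoef};
    }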
|
||||
This scheme is rather simple and works quite well, but there are a couple
of improvements we may want to make at some point.
|
||||
|
||||
* Track more information about the stored graph and use that to improve the
|
||||
estimates.
|
||||
* Do a quick, partial run of the plan and tweak the estimation based on how
many results each operator produced. This may require having some kind
of representative subset of the stored graph.
|
||||
* Write micro benchmarks for each operator and based on the results create
|
||||
sensible preset coefficients. This would replace the current coefficients
|
||||
which are just assumptions on how each operator implementation performs.
|
@ -1,134 +0,0 @@
|
||||
# Semantic Analysis and Symbol Generation
|
||||
|
||||
In this phase, various semantic and variable type checks are performed.
|
||||
Additionally, we generate symbols which map AST nodes to stored values
|
||||
computed from evaluated expressions.
|
||||
|
||||
## Symbol Generation
|
||||
|
||||
Implementation can be found in `query/frontend/semantic/symbol_generator.cpp`.
|
||||
|
||||
Symbols are generated for each AST node that represents data that needs to
|
||||
have storage. Currently, these are:
|
||||
|
||||
* `NamedExpression`
|
||||
* `CypherUnion`
|
||||
* `Identifier`
|
||||
* `Aggregation`
|
||||
|
||||
You may notice that the above AST nodes may not correspond to something
named by a user. For example, an `Aggregation` can be part of a larger
expression and thus remain unnamed. The reason we still generate symbols is
to have uniform behaviour when executing a query, as well as to allow
caching the results of expression evaluation.
|
||||
|
||||
AST nodes do not actually store a `Symbol` instance; instead, they have an
`int32_t` index identifying the symbol in the `SymbolTable` class. This is
done to minimize the size of AST types, as well as to allow easier sharing
of the same symbols between multiple instances of AST nodes.
|
||||
|
||||
The storage for evaluated data is represented by the `Frame` class. Each
symbol determines a unique position in the frame. During interpretation,
evaluation of expressions which have a symbol will either read or store
values in the frame. For example, an instance of `Identifier` will use its
symbol to find and read the value from the `Frame`. On the other hand, a
`NamedExpression` will take the result of evaluating its own expression and
store it in the `Frame`.
|
||||
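A stripped-down sketch of how symbols, the symbol table and the frame fit
together (the real classes carry more information, such as the symbol type
discussed below):

    #include <cstdint>
    #include <string>
    #include <vector>

    struct Symbol {
      std::string name;
      int32_t position;  // index into the frame
    };

    class SymbolTable {
     public:
      Symbol CreateSymbol(const std::string &name) {
        symbols_.push_back({name, static_cast<int32_t>(symbols_.size())});
        return symbols_.back();
      }
      int32_t max_position() const {
        return static_cast<int32_t>(symbols_.size());
      }

     private:
      std::vector<Symbol> symbols_;
    };

    // TypedValue is a placeholder for the real dynamically typed value class.
    using TypedValue = std::string;

    class Frame {
     public:
      explicit Frame(int32_t size) : values_(size) {}
      TypedValue &at(const Symbol &sym) { return values_[sym.position]; }

     private:
      std::vector<TypedValue> values_;
    };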
|
||||
When a symbol is created, context of creation is used to assign a type to that
|
||||
symbol. This type is used for simple type checking operations. For example,
|
||||
`MATCH (n)` will create a symbol for variable `n`. Since the `MATCH (n)`
|
||||
represents finding a vertex in the graph, we can set `Symbol::Type::Vertex`
|
||||
for that symbol. Later, for example in `MATCH ()-[n]-()` we see that variable
|
||||
`n` is used as an edge. Since we already have a symbol for that variable, we
|
||||
detect this type mismatch and raise a `SemanticException`.
|
||||
|
||||
The basic rule of symbol generation is that variables inside `MATCH`,
`CREATE`, `MERGE`, `WITH ... AS` and `RETURN ... AS` clauses establish new
symbols.
|
||||
|
||||
### Symbols in Patterns
|
||||
|
||||
Inside `MATCH`, symbols are created only if they didn't exist before. For
|
||||
example, patterns in `MATCH (n {a: 5})--(m {b: 5}) RETURN n, m` will create 2
|
||||
symbols: one for `n` and one for `m`. `RETURN` clause will, in turn, reference
|
||||
those symbols. Symbols established in a part of pattern are immediately bound
|
||||
and visible in later parts. For example, `MATCH (n)--(n)` will create a symbol
|
||||
for variable `n` for 1st `(n)`. That symbol is referenced in 2nd `(n)`. Note
|
||||
that the symbol is not bound inside 1st `(n)` itself. What this means is that,
|
||||
for example, `MATCH (n {a: n.b})` should raise an error, because `n` is not
|
||||
yet bound when encountering `n.b`. On the other hand,
|
||||
`MATCH (n)--(n {a: n.b})` is fine.
|
||||
|
||||
The `CREATE` is similar to `MATCH`, but it *always* establishes symbols for
|
||||
variables which create graph elements. What this means is that, for example
|
||||
`MATCH (n) CREATE (n)` is not allowed. `CREATE` wants to create a new node,
|
||||
for which we already have a symbol. In such a case, we need to throw an error
|
||||
that the variable `n` is being redeclared. On the other hand `MATCH (n) CREATE
|
||||
(n)-[r :r]->(n)` is fine, because `CREATE` will only create the edge `r`,
|
||||
connecting the already existing node `n`. Remaining behaviour is the same as
|
||||
in `MATCH`. This means that we can simplify `CREATE` to be like `MATCH` with 2
|
||||
special cases.
|
||||
|
||||
1. Are we creating a node, i.e. `CREATE (n)`? If yes, then the symbol for
|
||||
`n` must not have been created before. Otherwise, we reference the
|
||||
existing symbol.
|
||||
2. Are we creating an edge, i.e. we encounter a variable for an edge inside
|
||||
`CREATE`? If yes, then that variable must not reference a symbol.
|
||||
|
||||
The `MERGE` clause is treated the same as `CREATE` with regards to symbol
|
||||
generation. The only difference is that we allow bidirectional edges in the
|
||||
pattern. When creating such a pattern, the direction of the created edge is
|
||||
arbitrarily determined.
|
||||
|
||||
### Symbols in WITH and RETURN
|
||||
|
||||
In addition to patterns, new symbols are established in the `WITH` clause.
|
||||
This clause makes the new symbols visible *only* to the rest of the query.
|
||||
For example, `MATCH (old) WITH old AS new RETURN new, old` should raise an
|
||||
error that `old` is unbound inside `RETURN`.
|
||||
|
||||
There is a special case with symbol visibility in `WHERE` and `ORDER BY`. They
|
||||
need to see both the old and the new symbols. Therefore `MATCH (old) RETURN
|
||||
old AS new ORDER BY old.prop` needs to work. On the other hand, if we perform
|
||||
aggregations inside `WITH` or `RETURN`, then the old symbols should not be
visible in either `WHERE` or `ORDER BY`. Since the aggregation has to go
|
||||
through all the results in order to generate the final value, it makes no
|
||||
sense to store old symbols and their values. A query like `MATCH (old) WITH
|
||||
SUM(old.prop) AS sum WHERE old.prop = 42 RETURN sum` needs to raise an error
|
||||
that `old` is unbound inside `WHERE`.
|
||||
|
||||
For cases when `SKIP` and `LIMIT` appear, we disallow any identifiers from
|
||||
appearing in their expressions. Basically, `SKIP` and `LIMIT` can only be
|
||||
constant expressions[^1]. For example, `MATCH (old) RETURN old AS new SKIP
|
||||
new.prop` needs to raise that variables are not allowed in `SKIP`. It makes no
|
||||
sense to allow variables, since their values may vary on each iteration. On
|
||||
the other hand, we could support variables bound to constant expressions,
but for simplicity we do not. For example, `MATCH (old) RETURN old, 2 AS limit_var
|
||||
LIMIT limit_var` would still throw an error.
|
||||
|
||||
Finally, we generate symbols for names created in `RETURN` clause. These
|
||||
symbols are used for the final results of a query.
|
||||
|
||||
NOTE: New symbols in `WITH` and `RETURN` should be unique. This means that
|
||||
`WITH a AS same, b AS same` is not allowed, and neither is a construct like
`RETURN 2, 2`.
|
||||
|
||||
### Symbols in Functions which Establish New Scope
|
||||
|
||||
Symbols can also be created in some functions. These functions usually take an
|
||||
expression, bind a single variable and run the expression inside the newly
|
||||
established scope.
|
||||
|
||||
The `all` function takes a list, creates a variable for list element and runs
|
||||
the predicate expression. For example:
|
||||
|
||||
MATCH (n) RETURN n, all(n IN n.prop_list WHERE n < 42)
|
||||
|
||||
We create a new symbol for use inside `all`. This means that `WHERE n < 42`
uses the `n` which takes its values from the elements of `n.prop_list`. The
original `n` bound by `MATCH` is not visible inside the `all` function, but
it is visible outside. Therefore, `RETURN n` and `n.prop_list` reference
the `n` from `MATCH`.
|
||||
|
||||
[^1]: Constant expressions are expressions for which the result can be
|
||||
computed at compile time.
|
@ -1,107 +0,0 @@
|
||||
# Quick Start
|
||||
|
||||
A short chapter on downloading the Memgraph source, compiling and running.
|
||||
|
||||
## Obtaining the Source Code
|
||||
|
||||
Memgraph uses `git` for source version control. You will need to install `git`
|
||||
on your machine before you can download the source code.
|
||||
|
||||
On Debian systems, you can do it inside a terminal with the following
|
||||
command:
|
||||
|
||||
apt install git
|
||||
|
||||
After installing `git`, you are now ready to fetch your own copy of Memgraph
|
||||
source code. Run the following command:
|
||||
|
||||
git clone https://github.com/memgraph/memgraph.git
|
||||
|
||||
The above will create a `memgraph` directory and put all source code there.
|
||||
|
||||
## Compiling Memgraph
|
||||
|
||||
With the source code, you are now ready to compile Memgraph. Well... Not
|
||||
quite. You'll need to download Memgraph's dependencies first.
|
||||
|
||||
In your terminal, position yourself in the obtained memgraph directory.
|
||||
|
||||
cd memgraph
|
||||
|
||||
### Installing Dependencies
|
||||
|
||||
Dependencies that are required by the codebase should be checked by running the
|
||||
`init` script:
|
||||
|
||||
./init
|
||||
|
||||
If the script fails, dependency installation scripts can be found under
`environment/os/`. The directory contains a dependency management script
for each supported operating system. E.g. if your system is **Debian 10**,
run the following to install all required build packages:
|
||||
|
||||
./environment/os/debian-10.sh install MEMGRAPH_BUILD_DEPS
|
||||
|
||||
Once everything is installed, rerun the `init` script.
|
||||
|
||||
Once the `init` script is successfully finished, issue the following commands:
|
||||
|
||||
mkdir -p build
|
||||
./libs/setup.sh
|
||||
|
||||
### Compiling
|
||||
|
||||
Memgraph is compiled using our own custom toolchain that can be obtained from
|
||||
the toolchain repository. You should read the `environment/README.txt` file
|
||||
in the repository and install the appropriate toolchain for your distribution.
|
||||
After you have installed the toolchain you should read the instructions for the
|
||||
toolchain in the toolchain install directory (`/opt/toolchain-vXYZ/README.md`)
|
||||
and install dependencies that are necessary to run the toolchain.
|
||||
|
||||
When you want to compile Memgraph you should activate the toolchain using the
|
||||
prepared toolchain activation script that is also described in the toolchain
|
||||
`README`.
|
||||
|
||||
NOTE: You **must** activate the toolchain every time you want to compile
|
||||
Memgraph!
|
||||
|
||||
You should now activate the toolchain in your console.
|
||||
|
||||
source /opt/toolchain-vXYZ/activate
|
||||
|
||||
With all of the dependencies installed and the build environment set up, you
|
||||
need to configure the build system. To do that, execute the following:
|
||||
|
||||
cd build
|
||||
cmake ..
|
||||
|
||||
If everything went OK, you can now, finally, compile Memgraph.
|
||||
|
||||
make -j$(nproc)
|
||||
|
||||
### Running
|
||||
|
||||
After the compilation verify that Memgraph works:
|
||||
|
||||
./memgraph --version
|
||||
|
||||
To make extra sure, run the unit tests:
|
||||
|
||||
ctest -R unit -j$(nproc)
|
||||
|
||||
## Problems
|
||||
|
||||
If you have any trouble running the above commands, contact your nearest
|
||||
developer who successfully built Memgraph. Ask for help and insist on getting
|
||||
this document updated with correct steps!
|
||||
|
||||
## Next Steps
|
||||
|
||||
Familiarise yourself with our code conventions and guidelines:
|
||||
|
||||
* [C++ Code](cpp-code-conventions.md)
|
||||
* [Other Code](other-code-conventions.md)
|
||||
* [Code Review Guidelines](code-review.md)
|
||||
|
||||
Take a look at the list of [required reading](required-reading.md) for
|
||||
brushing up on technical skills.
|
@ -1,129 +0,0 @@
|
||||
# Required Reading
|
||||
|
||||
This chapter lists a few books that should be read by everyone working on
|
||||
Memgraph. Since Memgraph is developed primarily with C++, Python and Common
|
||||
Lisp, books are oriented around those languages. Of course, there are plenty
|
||||
of general books which will help you improve your technical skills (such as
|
||||
"The Pragmatic Programmer", "The Mythical Man-Month", etc.), but they are not
|
||||
listed here. This way the list should be kept short and the *required* part in
|
||||
"Required Reading" more easily honored.
|
||||
|
||||
Some of these books you may find in our office, so feel free to pick them up.
|
||||
If any are missing and you would like a physical copy, don't be afraid to
|
||||
request the book for our office shelves.
|
||||
|
||||
Besides reading, don't get stuck in a rut and be a
|
||||
[Blub Programmer](http://www.paulgraham.com/avg.html).
|
||||
|
||||
## Effective C++ by Scott Meyers
|
||||
|
||||
Required for C++ developers.
|
||||
|
||||
The book is a must-read as it explains most common gotchas of using C++. After
|
||||
reading this book, you are good to write competent C++ which will pass code
|
||||
reviews easily.
|
||||
|
||||
## Effective Modern C++ by Scott Meyers
|
||||
|
||||
Required for C++ developers.
|
||||
|
||||
This is a continuation of the previous book, it covers updates to C++ which
|
||||
came with C++11 and later. The book isn't as imperative as the previous one,
|
||||
but it will make you aware of modern features we are using in our codebase.
|
||||
|
||||
## Practical Common Lisp by Peter Siebel
|
||||
|
||||
Required for Common Lisp developers.
|
||||
|
||||
Free: http://www.gigamonkeys.com/book/
|
||||
|
||||
We use Common Lisp to generate C++ code and make our lives easier.
|
||||
Unfortunately, not many developers are familiar with the language. This book
|
||||
will make you familiar very quickly as it has tons of very practical
|
||||
exercises, e.g. implementing a unit testing library, a serialization
library and bundling all that to create an mp3 music server.
|
||||
|
||||
## Effective Python by Brett Slatkin
|
||||
|
||||
(Almost) required reading for Python developers.
|
||||
|
||||
Why the "almost"? Well, Python is relatively easy to pick up and you will
|
||||
probably learn all the gotchas during code review from someone more
|
||||
experienced. This makes the book less necessary for a newcomer to Memgraph,
|
||||
but the book is not advanced enough to delegate it to
|
||||
[Advanced Reading](#advanced-reading). The book is written in a similar vein to
|
||||
the "Effective C++" ones and will make you familiar with nifty Python features
|
||||
that make everyone's lives easier.
|
||||
|
||||
# Advanced Reading
|
||||
|
||||
The books listed below are not required reading, but you may want to read them
|
||||
at some point when you feel comfortable enough.
|
||||
|
||||
## Design Patterns by Gamma et. al.
|
||||
|
||||
Recommended for C++ developers.
|
||||
|
||||
This book is highly divisive because it introduced a culture centered around
|
||||
design patterns. The main issue is the overuse of patterns, which
complicates the code. This has made many Java programs serve as examples of
highly complicated, "enterprise" code.
|
||||
|
||||
Unfortunately, design patterns are often just stand-ins for missing
language features. This is most evident in dynamic languages such as Python
and Lisp, as demonstrated by
[Peter Norvig](http://www.norvig.com/design-patterns/).
|
||||
|
||||
Or as [Paul Graham](http://www.paulgraham.com/icad.html) put it:
|
||||
|
||||
```
|
||||
This practice is not only common, but institutionalized. For example, in the
|
||||
OO world you hear a good deal about "patterns". I wonder if these patterns are
|
||||
not sometimes evidence of case (c), the human compiler, at work. When I see
|
||||
patterns in my programs, I consider it a sign of trouble. The shape of a
|
||||
program should reflect only the problem it needs to solve. Any other
|
||||
regularity in the code is a sign, to me at least, that I'm using abstractions
|
||||
that aren't powerful enough-- often that I'm generating by hand the expansions
|
||||
of some macro that I need to write
|
||||
```
|
||||
|
||||
After presenting the book so negatively, why should you even read it then?
|
||||
Well, it is good to be aware of those design patterns and use them when
|
||||
appropriate. They can improve modularity and reuse of the code. You will also
|
||||
find examples of such patterns in our code, primarily Strategy and Visitor
|
||||
patterns. The book is also a good stepping stone to more advanced reading
|
||||
about software design.
|
||||
|
||||
## Modern C++ Design by Andrei Alexandrescu
|
||||
|
||||
Recommended for C++ developers.
|
||||
|
||||
This book can be treated as a continuation of the previous "Design Patterns"
|
||||
book. It introduced "dark arts of template meta-programming" to the world.
|
||||
Many of the patterns are converted to use C++ templates which makes them even
|
||||
better for reuse. But, like the previous book, there are downsides if used too
|
||||
much. You should approach it with a critical eye and it will help you
|
||||
understand ideas that are used in some parts of our codebase.
|
||||
|
||||
## Large Scale C++ Software Design by John Lakos
|
||||
|
||||
Recommended for C++ developers.
|
||||
|
||||
An old book, but well worth the read. Lakos presents a very pragmatic view of
|
||||
writing modular software and how it affects both development time as well as
|
||||
program runtime. Some things are outdated or controversial, but it will help
|
||||
you understand how the whole C++ process of working in a large team, compiling
|
||||
and linking affects development.
|
||||
|
||||
## On Lisp by Paul Graham
|
||||
|
||||
Recommended for Common Lisp developers.
|
||||
|
||||
Free: http://www.paulgraham.com/onlisp.html
|
||||
|
||||
An excellent continuation to "Practical Common Lisp". It starts off slow, as if
|
||||
introducing the language, but very quickly picks up speed. The main meat of
|
||||
the book are macros and their uses. From using macros to define cooperative
|
||||
concurrency to including Prolog as if it's part of Common Lisp. The book will
|
||||
help you understand more advanced macros that are occasionally used in our
|
||||
Lisp C++ Preprocessor (LCP).
|
@ -1,110 +0,0 @@
|
||||
# DatabaseAccessor
|
||||
|
||||
A `DatabaseAccessor` actually wraps a transactional access to database
|
||||
data, for a single transaction. In that sense the naming is bad. It
|
||||
encapsulates references to the database and the transaction object.
|
||||
|
||||
It contains logic for working with database content (graph element
|
||||
data) in the context of a single transaction. All CRUD operations are
|
||||
performed within a single transaction (as Memgraph is a transactional
|
||||
database), and therefore iteration over data, finding a specific graph
|
||||
element etc are all functionalities of a `GraphDbAccessor`.
|
||||
|
||||
In single-node Memgraph the database accessor also defined the lifetime
|
||||
of a transaction. Even though a `Transaction` object was owned by the
|
||||
transactional engine, it was `GraphDbAccessor`'s lifetime that object
|
||||
was bound to (the transaction was implicitly aborted in
|
||||
`GraphDbAccessor`'s destructor, if it was not explicitly ended before
|
||||
that).
|
||||
|
||||
# RecordAccessor
|
||||
|
||||
It is important to understand data organization and access in the
|
||||
storage layer. This discussion pertains to vertices and edges as graph
|
||||
elements that the end client works with.
|
||||
|
||||
Memgraph uses MVCC (documented on its own page). This means that for
|
||||
each graph element there could be different versions visible to
|
||||
different currently executing transactions. When we talk about a
|
||||
`Vertex` or `Edge` as a data structure we typically mean one of those
|
||||
versions. In code this semantic is implemented so that both those classes
|
||||
inherit `mvcc::Record`, which in turn inherits `mvcc::Version`.
|
||||
|
||||
Handling MVCC and visibility is not in itself trivial. Next to that,
|
||||
there is other book-keeping to be performed when working with data. For
|
||||
that reason, Memgraph uses "accessors" to define an API of working with
|
||||
data in a safe way. Most of the code in Memgraph (for example the
|
||||
interpretation code) should work with accessors. There is a
|
||||
`RecordAccessor` as a base class for `VertexAccessor` and
|
||||
`EdgeAccessor`. Following is an enumeration of their purpose.
|
||||
|
||||
### Data Access
|
||||
|
||||
The client interacts with Memgraph using the Cypher query language. That
|
||||
language has certain semantics which imply that multiple versions of the
|
||||
data need to be visible during the execution of a single query. For
|
||||
example: expansion over the graph is always done over the graph state as
|
||||
it was at the beginning of the transaction.
|
||||
|
||||
The `RecordAccessor` exposes functions to switch between the old and the new
|
||||
versions of the same graph element (intelligently named `SwitchOld` and
|
||||
`SwitchNew`) within a single transaction. In that way the client code
|
||||
(mostly the interpreter) can avoid dealing with the underlying MVCC
|
||||
version concepts.
|
||||
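Conceptually, the accessor keeps pointers to both versions and a notion of
which one is the current view, roughly as in the following sketch (the real
class does considerably more, e.g. lazy lookups and book-keeping):

    template <class TRecord>
    class RecordAccessorSketch {
     public:
      RecordAccessorSketch(TRecord *old_version, TRecord *new_version)
          : old_(old_version), new_(new_version), current_(old_version) {}

      // View the graph element as it was at the start of the transaction.
      void SwitchOld() { current_ = old_ ? old_ : new_; }

      // View the locally updated version, if this transaction created one.
      void SwitchNew() { current_ = new_ ? new_ : old_; }

      const TRecord &current() const { return *current_; }

     private:
      TRecord *old_;
      TRecord *new_;
      TRecord *current_;
    };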
|
||||
### Updates
|
||||
|
||||
Data updates are also done through accessors. Meaning: there are methods
|
||||
on the accessors that modify data; the client code should almost never
|
||||
interact directly with `Vertex` or `Edge` objects.
|
||||
|
||||
The accessor layer takes care of creating versions in the MVCC layer and
|
||||
performing updates on appropriate versions.
|
||||
|
||||
Next, for many kinds of updates it is necessary to update the relevant
|
||||
indexes. There are implicit indexes for vertex labels, as
|
||||
well as user-created indexes for (label, property) pairs. The accessor
|
||||
layer takes care of updating the indexes when these values are changed.
|
||||
|
||||
Each update also triggers a log statement in the write-ahead log. This
|
||||
is also handled by the accessor layer.
|
||||
|
||||
### Distributed
|
||||
|
||||
In distributed Memgraph accessors also contain a lot of the remote graph
|
||||
element handling logic. More info on that is available in the
|
||||
documentation for distributed.
|
||||
|
||||
### Deferred MVCC Data Lookup for Edges
|
||||
|
||||
Vertices and edges are versioned using MVCC. This means that for each
|
||||
transaction an MVCC lookup needs to be done to determine which version
|
||||
is visible to that transaction. This tends to slow things down due to
|
||||
cache invalidations (version lists and versions are stored in arbitrary
|
||||
locations on the heap).
|
||||
|
||||
However, for edges, only the properties are mutable. The edge endpoints
|
||||
and type are fixed once the edge is created. For that reason both edge
|
||||
endpoints and type are available in vertex data, so that when expanding
|
||||
it is not mandatory to do MVCC lookups of versioned, mutable data. This
|
||||
logic is implemented in `RecordAccessor` and `EdgeAccessor`.
|
||||
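In other words, the adjacency entry stored with a vertex can already carry
the immutable parts of the edge, so expansion does not need to touch the
edge's MVCC chain. A simplified sketch of such an entry (names are
illustrative):

    #include <cstdint>
    #include <vector>

    struct VersionList;  // MVCC chain of records for one graph element

    // One adjacency entry kept inside vertex data: everything needed to
    // expand (endpoints and edge type) without an MVCC lookup of the edge.
    struct EdgeEntry {
      VersionList *edge_versions;   // only needed when properties are read
      VersionList *other_endpoint;  // destination vertex
      int32_t edge_type;            // immutable, safe to cache here
    };

    struct VertexData {
      std::vector<EdgeEntry> in_edges;
      std::vector<EdgeEntry> out_edges;
    };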
|
||||
### Exposure
|
||||
|
||||
The original idea and implementation of graph element accessors was that
|
||||
they'd prevent client code from ever interacting with raw `Vertex` or
|
||||
`Edge` data. This however turned out to be impractical when implementing
|
||||
distributed Memgraph and the raw data members have since been exposed
|
||||
(through getters to old and new version pointers). However, refrain from
|
||||
working with that data directly whenever possible! Always consider the
|
||||
accessors to be the first go-to for interacting with data, especially
|
||||
when in the context of a transaction.
|
||||
|
||||
# Skiplist Accessor
|
||||
|
||||
The term "accessor" is also used in the context of a skiplist. Every
|
||||
operation on a skiplist must be performed within an
|
||||
accessor. The skiplist ensures that there will be no physical deletions
|
||||
of an object during the lifetime of an accessor. This mechanism is used
|
||||
to ensure deletion correctness in a highly concurrent container.
|
||||
We only mention that here to avoid confusion regarding terminology.
|
@ -1,6 +0,0 @@
|
||||
# Storage v1
|
||||
|
||||
* [Accessors](accessors.md)
|
||||
* [Indexes](indexes.md)
|
||||
* [Property Storage](property-storage.md)
|
||||
* [Durability](durability.md)
|
@ -1,80 +0,0 @@
|
||||
# Durability
|
||||
|
||||
## Write-ahead Logging
|
||||
|
||||
Typically WAL denotes the process of writing a "log" of database
|
||||
operations (state changes) to persistent storage before committing the
|
||||
transaction, thus ensuring that the state can be recovered (in the case
|
||||
of a crash) for all the transactions which the database committed.
|
||||
|
||||
The WAL is a fine-grained durability format. Its purpose is to store
database changes fast. Its primary purpose is not to provide
|
||||
space-efficient storage, nor to support fast recovery. For that reason
|
||||
it's often used in combination with a different persistence mechanism
|
||||
(in Memgraph's case the "snapshot") that has complementary
|
||||
characteristics.
|
||||
|
||||
### Guarantees
|
||||
|
||||
Ensuring that the log is written before the transaction is committed can
|
||||
slow down the database. For that reason this guarantee is most often
|
||||
configurable in databases.
|
||||
|
||||
Memgraph offers two options for the WAL. The default option, where the WAL is
|
||||
flushed to the disk periodically and transactions do not wait for this to
|
||||
complete, introduces the risk of database inconsistency because an operating
|
||||
system or hardware crash might lead to missing transactions in the WAL. Memgraph
|
||||
will handle this as if those transactions never happened. The second option,
|
||||
called synchronous commit, will instruct Memgraph to wait for the WAL to be
|
||||
flushed to the disk when a transaction completes; the transaction will wait
for the flush to finish. This option can be turned on with the
|
||||
`--synchronous-commit` command line flag.
|
||||
|
||||
### Format
|
||||
|
||||
The WAL file contains a series of DB state changes called `StateDelta`s.
|
||||
Each of them describes what the state change is and in which transaction
|
||||
it happened. Some kinds of meta-information needed to ensure proper state
recovery are also recorded (transaction beginnings and commits/aborts).
|
||||
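For illustration only, a delta could be modelled along these lines; the
real `StateDelta` has more operation kinds and encodes values differently:

    #include <cstdint>
    #include <string>

    struct StateDelta {
      enum class Type {
        TRANSACTION_BEGIN,
        TRANSACTION_COMMIT,
        TRANSACTION_ABORT,
        CREATE_VERTEX,
        SET_PROPERTY,
        // ... other state changes
      };

      Type type;
      uint64_t transaction_id;  // which transaction produced the change
      uint64_t gid = 0;         // affected graph element, when applicable
      std::string property;     // e.g. the property name for SET_PROPERTY
      std::string value;        // simplified encoded value
    };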
|
||||
The following is guaranteed w.r.t. `StateDelta` ordering in
|
||||
a single WAL file:
|
||||
- For two ops in the same transaction, if op A happened before B in the
|
||||
database, that ordering is preserved in the log.
|
||||
- Transaction begin/commit/abort messages also appear in exactly the
|
||||
same order as they were executed in the transactional engine.
|
||||
|
||||
### Recovery
|
||||
|
||||
The database can recover from the WAL on startup. This works in
|
||||
conjunction with snapshot recovery. The database attempts to recover from
|
||||
the latest snapshot and then apply as much as possible from the WAL
|
||||
files. Only those transactions that were not recovered from the snapshot
|
||||
are recovered from the WAL, for speed efficiency. It is possible (but
|
||||
inefficient) to recover the database from WAL only, provided all the WAL
|
||||
files created from DB start are available. It is not possible to recover
|
||||
partial database state (i.e. from some suffix of WAL files, without the
|
||||
preceding snapshot).
|
||||
|
||||
## Snapshots
|
||||
|
||||
A "snapshot" is a record of the current database state stored in permanent
|
||||
storage. Note that the term "snapshot" is used also in the context of
|
||||
the transaction engine to denote a set of running transactions.
|
||||
|
||||
A snapshot is written to the file by Memgraph periodically if so
|
||||
configured. The snapshot creation process is done within a transaction created
|
||||
specifically for that purpose. The transaction is needed to ensure that
|
||||
the stored state is internally consistent.
|
||||
|
||||
The database state can be recovered from the snapshot during startup, if
|
||||
so configured. This recovery works in conjunction with write-ahead log
|
||||
recovery.
|
||||
|
||||
A single snapshot contains all the data needed to recover a database. In
|
||||
that sense snapshots are independent of each other and old snapshots can
|
||||
be deleted once the new ones are safely stored, if it is not necessary
|
||||
to revert the database to some older state.
|
||||
|
||||
The exact format of the snapshot file is defined inline in the snapshot
|
||||
creation code.
|
@ -1,116 +0,0 @@
|
||||
# Label Indexes
|
||||
|
||||
These are unsorted indexes that contain all the vertices that have the label
|
||||
the indexes are for (one index per label). These kinds of indexes get
|
||||
automatically generated for each label used in the database.
|
||||
|
||||
### Updating the Indexes
|
||||
|
||||
Whenever something gets added to a record, we update the index (add that
record to the index). We keep an index which might contain garbage (records
that are no longer relevant, because the value got removed or something
similar), but we filter it out when querying the index. We do it like this
so that we don't have to do bookkeeping and decide whether to update the
index at the end of the transaction (commit/abort phase). Moreover, the
current interpreter advances the command within a transaction and assumes
that the indexes already contain the objects added by the previous command
of that transaction, so we need to update the index over the whole scope of
the transaction (whenever something is added to a record).
|
||||
|
||||
### Index Entries Label
|
||||
|
||||
These kinds of indexes internally keep track of the pair (record, vlist).
Why do we need to keep track of exactly those two things?
|
||||
|
||||
Problems with two different approaches:
|
||||
|
||||
1) Keep track of just the record:
|
||||
|
||||
- We need the `VersionList` for creating an accessor (this in itself is a
|
||||
deal-breaker).
|
||||
- Semantically it makes sense. An edge/vertex maps bijectively to a
|
||||
`VersionList`.
|
||||
- We might try to access some members of record while the record is being
|
||||
modified from another thread.
|
||||
- A vertex/edge could get updated, thus expiring the record in the index.
|
||||
The newly created record should be present in the index, but it's not.
|
||||
Without the `VersionList` we can't reach the newly created record.
|
||||
- Probably there are even more reasons... It should be obvious by now that
|
||||
we need the `VersionList` in the index.
|
||||
|
||||
2) Keep track of just the version list:
|
||||
|
||||
- Removing from an index is a problem for two major reasons. First, if we
|
||||
only have the `VersionList`, checking if it should be removed implies
|
||||
checking all the reachable records, which is not thread-safe. Second,
|
||||
there are issues with concurrent removal and insertion. The cleanup thread
|
||||
could determine the vertex/edge should be removed from the index and
|
||||
remove it, while in between those ops another thread attempts to insert
|
||||
the `VersionList` into the index. The insertion does nothing because the
|
||||
`VersionList` is already in, but it gets removed immediately after.
|
||||
|
||||
Because neither the record alone nor the version list alone is enough, we need
to keep track of both of them. The problems mentioned above are resolved, in
the same order, by the (record, vlist) pair as follows:

- A simple `vlist.find(current transaction)` will get us the newest visible
  record.
- We will never try to access a record while it's still being written, since we
  always operate on the record returned by `vlist.find`.
- The newest record will contain that label.
- Since the (record, vlist) pair is the key in the index, a concurrent update
  and delete will never delete the same (record, vlist) pair that is being
  added, because the pair being deleted is already superseded by a newer record
  and as such won't be inserted while it's being deleted.
|
||||
|
||||
### Querying the Index
|
||||
|
||||
We run through the index for the given label, do a `vlist.find` operation for
the current transaction, and check whether the newest returned record has that
label. If it does, we return it. By now you are probably wondering: aren't we
sometimes returning duplicate vlist entries? You are wondering correctly, we
would be, but the entries in the index are kept sorted by their `vlist*`, so we
can filter out consecutive duplicate `vlist*` values and return only one of
them while still being able to create an iterator over the index.
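To make the consecutive-duplicate filtering concrete, here is a minimal,
self-contained C++ sketch. The types below are illustrative stand-ins, not the
actual index or `VersionList` implementation:

```cpp
#include <iostream>
#include <vector>

// Hypothetical stand-ins for the real index entry types.
struct Record {};
struct VersionList {};

struct IndexEntry {
  Record *record;
  VersionList *vlist;
};

// Entries are assumed to be kept sorted by their vlist pointer, so duplicates
// of the same vlist are consecutive and can be skipped in a single pass.
std::vector<VersionList *> DistinctVlists(const std::vector<IndexEntry> &entries) {
  std::vector<VersionList *> result;
  const VersionList *previous = nullptr;
  for (const auto &entry : entries) {
    if (entry.vlist == previous) continue;  // consecutive duplicate, skip it
    previous = entry.vlist;
    result.push_back(entry.vlist);
    // The real implementation would now do vlist->find(current_tx) and check
    // whether the newest visible record still carries the label.
  }
  return result;
}

int main() {
  VersionList a, b;
  Record r1, r2, r3;
  std::vector<IndexEntry> entries{{&r1, &a}, {&r2, &a}, {&r3, &b}};
  std::cout << DistinctVlists(entries).size() << "\n";  // prints 2
  return 0;
}
```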
|
||||
|
||||
### Cleaning the Index
|
||||
|
||||
Cleaning the index is not as straightforward as it seems: a lot of garbage can
accumulate, but it's hard to know exactly when we can delete some (record,
vlist) pair. First, let's assume that we are doing the cleaning process at some
transaction id `id` such that there is no active transaction with an id lower
than `id`.
|
||||
|
||||
We scan through the whole index, and for each (record, vlist) pair we first
check whether it was deleted before `id` (i.e. no transaction with an id >=
`id` will ever again see that record). If it was, we might naively say that
it's safe to delete the pair, but we must take into account that when a new
record is created from this record (an update operation), the new record still
contains the label. By deleting this pair we would lose sight of that vlist,
because the new record was never added to the index again (we didn't explicitly
add the label to it again).
|
||||
|
||||
Because of this, we have to 'update' the index (record, vlist) pair. We have to
update the record so that it points to a newer record in the vlist, one that is
not deleted yet. We can do that by querying the `version_list` for its last
(oldest) record; remember that `mvcc_gc` re-links non-visible records, so the
last record will be visible for the current GC id. When updating the record
inside the index, it's not okay to just update the pointer and leave the index
as it is, because changing the `record*` might change the relative order of
entries inside the index. We first have to re-insert the entry with the new
`record*`, and then delete the old entry. The insertion must happen before the
remove operation! Otherwise it could happen that no vlist with a newer record
carrying that label exists in the index while some transaction is querying it.
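The insert-before-remove ordering can be sketched as follows; the `std::set`
and the pointer pairs below only stand in for the real concurrent index
structure and are not the actual implementation:

```cpp
#include <iostream>
#include <set>
#include <utility>

// Illustrative entry: (vlist pointer, record pointer), kept ordered so that
// entries of the same vlist stay next to each other.
using Entry = std::pair<const void *, const void *>;

void RefreshEntry(std::set<Entry> &index, const void *vlist,
                  const void *old_record, const void *new_record) {
  // 1) Insert the entry pointing to the newer record first, so concurrent
  //    readers always see at least one entry for this vlist.
  index.insert({vlist, new_record});
  // 2) Only then remove the entry pointing to the obsolete record.
  index.erase({vlist, old_record});
}

int main() {
  int vlist = 0, old_rec = 1, new_rec = 2;
  std::set<Entry> index{{&vlist, &old_rec}};
  RefreshEntry(index, &vlist, &old_rec, &new_rec);
  std::cout << index.size() << "\n";  // prints 1, now pointing to new_rec
  return 0;
}
```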
|
||||
|
||||
Records we added as a consequence of deleting older records will eventually be
removed from the index if they don't contain the label, because whenever we see
that a record is not deleted we check whether it still contains the label. We
also need to be careful here: we can't perform that check while the record is
potentially being updated by some transaction (a race condition), so we may
only check whether a record still contains the label if its creation id is
smaller than our `id`. That implies that the creating transaction has either
aborted or committed, since our `id` equals the id of the oldest active
transaction at the time the GC started.
|
@ -1,131 +0,0 @@
|
||||
# Property Storage
|
||||
|
||||
Although the reader is probably familiar with properties in *Memgraph*, let's
|
||||
briefly recap.
|
||||
|
||||
Both vertices and edges can store an arbitrary number of properties. Properties
|
||||
are, in essence, ordered pairs of property names and property values. Each
|
||||
property name within a single graph element (edge/node) can store a single
|
||||
property value. Property names are represented as strings, while property values
|
||||
must be one of the following types:
|
||||
|
||||
Type | Description
|
||||
-----------|------------
|
||||
`Null` | Denotes that the property has no value. This is the same as if the property does not exist.
|
||||
`String` | A character string, i.e. text.
|
||||
`Boolean` | A boolean value, either `true` or `false`.
|
||||
`Integer` | An integer number.
|
||||
`Float` | A floating-point number, i.e. a real number.
|
||||
`List` | A list containing any number of property values of any supported type. It can be used to store multiple values under a single property name.
|
||||
`Map` | A mapping of string keys to values of any supported type.
|
||||
|
||||
Property values are modeled in a class conveniently called `PropertyValue`.
|
||||
|
||||
## Mapping Between Property Names and Property Keys
|
||||
|
||||
Although users think of property names in terms of descriptive strings
|
||||
(e.g. "location" or "department"), *Memgraph* internally converts those names
|
||||
into property keys which are, essentially, unsigned 16-bit integers.
|
||||
|
||||
Property keys are modelled by a not-so-conveniently named class called
|
||||
`Property` which can be found in `storage/types.hpp`. The actual conversion
|
||||
between property names and property keys is done within the `ConcurrentIdMapper`
|
||||
but the internals of that implementation are out of scope for understanding
|
||||
property storage.
|
||||
|
||||
## PropertyValueStore
|
||||
|
||||
Both `Edge` and `Vertex` objects contain an instance of `PropertyValueStore`
|
||||
object which is responsible for storing properties of a corresponding graph
|
||||
element.
|
||||
|
||||
An interface of `PropertyValueStore` is as follows:
|
||||
|
||||
Method | Description
|
||||
-----------|------------
|
||||
`at` | Returns the `PropertyValue` for a given `Property` (key).
|
||||
`set` | Stores a given `PropertyValue` under a given `Property` (key).
|
||||
`erase` | Deletes a given `Property` (key) alongside its corresponding `PropertyValue`.
|
||||
`clear` | Clears the storage.
|
||||
`iterator` | Provides an extension of `std::input_iterator` that iterates over storage.
|
||||
|
||||
## Storage Location
|
||||
|
||||
By default, *Memgraph* is an in-memory database and all properties are therefore
|
||||
stored in working memory unless specified otherwise by the user. The user has an
|
||||
option to specify via the command line which properties they wish to be stored
|
||||
on disk.
|
||||
|
||||
The storage location of each property is encapsulated within a `Property` object
|
||||
which is ensured by the `ConcurrentIdMapper`. More precisely, the unsigned 16-bit
|
||||
property key has the following format:
|
||||
|
||||
```
|
||||
|---location--|------id------|
|
||||
|-Memory|Disk-|-----2^15-----|
|
||||
```
|
||||
|
||||
In other words, the most significant bit determines the location where the
|
||||
property will be stored.
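As a quick illustration of that layout, a key could be unpacked along the
following lines. This is only a sketch: which bit value maps to disk and which
to memory is an assumption here, not something taken from the real `Property`
class:

```cpp
#include <cstdint>
#include <iostream>

enum class Location { Memory, Disk };

// The most significant bit of the 16-bit key selects the storage location,
// the remaining 15 bits are the id (assumed polarity: set bit == disk).
Location KeyLocation(uint16_t key) {
  return (key & 0x8000) ? Location::Disk : Location::Memory;
}

uint16_t KeyId(uint16_t key) { return key & 0x7FFF; }

int main() {
  uint16_t disk_key = 0x8005;    // location bit set, id == 5
  uint16_t memory_key = 0x0005;  // location bit clear, id == 5
  std::cout << (KeyLocation(disk_key) == Location::Disk) << " "
            << KeyId(memory_key) << "\n";  // prints "1 5"
  return 0;
}
```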
|
||||
|
||||
### In-memory Storage
|
||||
|
||||
The underlying implementation of in-memory storage is, for the time being,
`std::vector<std::pair<Property, PropertyValue>>`. Implementations of `at`, `set`
and `erase` are linear in time. This implementation is arguably more efficient
than `std::map` or `std::unordered_map` when the average number of properties of
a record is relatively small (up to 10), which seems to be the case.
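A simplified, self-contained sketch of such a vector-backed store is shown
below. It only illustrates the linear-time `at`/`set`/`erase` semantics and is
not the real `PropertyValueStore`, which works with Memgraph's actual
`Property`/`PropertyValue` types and also covers the on-disk case:

```cpp
#include <cstdint>
#include <iostream>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-ins for Property (the 16-bit key) and PropertyValue.
using Property = uint16_t;
using PropertyValue = std::string;

class SimplePropertyStore {
 public:
  // Linear scan; fine for the typical handful of properties per record.
  std::optional<PropertyValue> at(Property key) const {
    for (const auto &kv : props_)
      if (kv.first == key) return kv.second;
    return std::nullopt;
  }

  void set(Property key, PropertyValue value) {
    for (auto &kv : props_)
      if (kv.first == key) { kv.second = std::move(value); return; }
    props_.emplace_back(key, std::move(value));
  }

  void erase(Property key) {
    for (auto it = props_.begin(); it != props_.end(); ++it)
      if (it->first == key) { props_.erase(it); return; }
  }

 private:
  std::vector<std::pair<Property, PropertyValue>> props_;
};

int main() {
  SimplePropertyStore store;
  store.set(1, "Alice");
  std::cout << store.at(1).value_or("<null>") << "\n";  // prints "Alice"
  store.erase(1);
  std::cout << store.at(1).has_value() << "\n";         // prints 0
  return 0;
}
```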
|
||||
|
||||
### On-disk Storage
|
||||
|
||||
#### KVStore
|
||||
|
||||
Disk storage is modeled by an abstraction of key-value storage as implemented in
|
||||
`storage/kvstore.hpp`. An interface of this abstraction is as follows:
|
||||
|
||||
Method | Description
|
||||
----------------|------------
|
||||
`Put` | Stores the given value under the given key.
|
||||
`Get` | Obtains the given value stored under the given key.
|
||||
`Delete` | Deletes a given (key, value) pair from storage.
|
||||
`DeletePrefix` | Deletes all (key, value) pairs where key begins with a given prefix.
|
||||
`Size` | Returns the size of the storage or, optionally, the number of stored pairs that begin with a given prefix.
|
||||
`iterator` | Provides an extension of `std::input_iterator` that iterates over storage.
|
||||
|
||||
Keys and values in this context are of type `std::string`.
|
||||
|
||||
The actual underlying implementation of this abstraction uses
|
||||
[RocksDB](https://rocksdb.org), a persistent key-value store for fast
|
||||
storage.
|
||||
|
||||
It is worth noting that the custom iterator implementation allows the user
to iterate over a given prefix. Otherwise, the implementation follows familiar
C++ constructs and can be used as follows:
|
||||
|
||||
```
|
||||
KVStore storage = ...;
|
||||
for (auto it = storage.begin(); it != storage.end(); ++it) {}
|
||||
for (auto kv : storage) {}
|
||||
for (auto it = storage.begin("prefix"); it != storage.end("prefix"); ++it) {}
|
||||
```
|
||||
|
||||
Note that it is not possible to scan over multiple prefixes. For instance, one
|
||||
might assume that you can scan over all keys that fall in a certain
|
||||
lexicographical range. Unfortunately, that is not the case and running the
|
||||
following code will result in an infinite loop with a touch of undefined
|
||||
behavior.
|
||||
|
||||
```
|
||||
KVStore storage = ...;
|
||||
for (auto it = storage.begin("alpha"); it != storage.end("omega"); ++it) {}
|
||||
```
|
||||
|
||||
#### Data Organization on Disk
|
||||
|
||||
Each `PropertyValueStore` instance can access a static `KVStore` object that can
|
||||
store `(key, value)` pairs on disk. The key of each property on disk consists of
|
||||
two parts: a unique identifier (an unsigned 64-bit integer) of the current
record version (see the MVCC documentation for further clarification) and a
|
||||
property key as described above. The actual value of the property is serialized
|
||||
into a bytestring using bolt `BaseEncoder`. Similarly, deserialization is
|
||||
performed by bolt `Decoder`.
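For illustration, such a composite disk key could be assembled as below. The
exact byte layout (big-endian, version identifier followed by property key) is
an assumption made here so that all properties of one record version share a
common key prefix; the authoritative format is whatever the actual
serialization code writes:

```cpp
#include <cstdint>
#include <iostream>
#include <string>

// Builds a disk key from the 64-bit record version identifier and the 16-bit
// property key by concatenating their big-endian byte representations.
std::string MakeDiskKey(uint64_t version_id, uint16_t property_key) {
  std::string key;
  for (int shift = 56; shift >= 0; shift -= 8)
    key.push_back(static_cast<char>((version_id >> shift) & 0xFF));
  key.push_back(static_cast<char>((property_key >> 8) & 0xFF));
  key.push_back(static_cast<char>(property_key & 0xFF));
  return key;
}

int main() {
  std::string key = MakeDiskKey(42, 0x8005);
  std::cout << key.size() << "\n";  // prints 10 (8 version bytes + 2 key bytes)
  return 0;
}
```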
|
@ -1,3 +0,0 @@
|
||||
# Storage v2
|
||||
|
||||
TODO(gitbuda): Write documentation.
|
@ -1,166 +0,0 @@
|
||||
# Memgraph Workflow
|
||||
|
||||
This chapter describes the usual workflow for working on Memgraph.
|
||||
|
||||
## Git
|
||||
|
||||
Memgraph uses [git](https://git-scm.com/) for source version control. If you
|
||||
obtained the source, you probably already have it installed. Before you can
|
||||
track new changes, you need to setup some basic information.
|
||||
|
||||
First, tell git your name:
|
||||
|
||||
git config --global user.name "FirstName LastName"
|
||||
|
||||
Then, set your Memgraph email:
|
||||
|
||||
git config --global user.email "my.email@memgraph.com"
|
||||
|
||||
Finally, make git aware of your favourite editor:
|
||||
|
||||
git config --global core.editor "vim"
|
||||
|
||||
## Github
|
||||
|
||||
All of the code in Memgraph needs to go through code review before it can be
|
||||
accepted in the codebase. This is done through [Github](https://github.com/).
|
||||
You should already have it installed if you followed the steps in [Quick
|
||||
Start](quick-start.md).
|
||||
|
||||
## Working on Your Feature Branch
|
||||
|
||||
Git has a concept of source code **branches**. The `master` branch contains all
|
||||
of the changes which were reviewed and accepted in Memgraph's code base. The
|
||||
`master` branch is selected by default.
|
||||
|
||||
### Creating a Branch
|
||||
|
||||
When working on a new feature or fixing a bug, you should create a new branch
out of the `master` branch. There are two branch types, **epic** and **task**
branches. An epic branch is created when introducing a new feature or any work
unit requiring more than one commit. The work is split into multiple commits to
make it easier to review the code or find a bug (each commit could introduce
various problems, e.g., performance or concurrency issues, which are the
hardest to track down). Each commit on the master or an epic branch should be a
compilable and well-documented set of changes. A task branch should be created
when a smaller work unit has to be integrated into the codebase. A task branch
could be branched out of the master or an epic branch. We manage epics and
tasks on the project management tool called
[Airtable](https://airtable.com/tblTUqycq8sHTTkBF). Each epic is prefixed by
`Exyz-MG`, while each task has a `Tabcd-MG` prefix. Examples of how to create
branches follow:
|
||||
|
||||
```
|
||||
git checkout master
|
||||
git checkout -b T0025-MG-fix-a-problem
|
||||
...
|
||||
git checkout master
|
||||
git checkout -b E025-MG-huge-feature
|
||||
...
|
||||
git checkout E025-MG-huge-feature
|
||||
git checkout -b T0123-MG-add-feature-part
|
||||
```
|
||||
|
||||
Note that a branch is created from the currently selected branch. So, if you
|
||||
wish to create another branch from `master` you need to switch to `master`
|
||||
first.
|
||||
|
||||
### Making and Committing Changes
|
||||
|
||||
When you have a branch for your new addition, you can now actually start
|
||||
implementing it. After some amount of time, you may have created new files,
|
||||
modified others and maybe even deleted unused files. You need to tell git to
|
||||
track those changes. This is accomplished with `git add` and `git rm`
|
||||
commands.
|
||||
|
||||
git add path-to-new-file path-to-modified-file
|
||||
git rm path-to-deleted-file
|
||||
|
||||
To check that everything is correctly tracked, you may use the `git status`
|
||||
command. It will also print the name of the currently selected branch.
|
||||
|
||||
If everything seems OK, you should commit these changes to git.
|
||||
|
||||
git commit
|
||||
|
||||
You will be presented with an editor where you need to type the commit
|
||||
message. Writing a good commit message is an art in itself. You should take a
|
||||
look at the links below. We try to follow these conventions as much as
|
||||
possible.
|
||||
|
||||
* [How to Write a Git Commit Message](http://chris.beams.io/posts/git-commit/)
|
||||
* [A Note About Git Commit Messages](http://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html)
|
||||
* [stopwritingramblingcommitmessages](http://stopwritingramblingcommitmessages.com/)
|
||||
|
||||
### Sending Changes on a Review
|
||||
|
||||
After finishing your work on your feature branch, you will want to send it on
|
||||
code review. This is done by pushing the branch to Github and creating a pull
|
||||
request. You can find all PRs
|
||||
[here](https://github.com/memgraph/memgraph/pulls).
|
||||
|
||||
### Code Integration
|
||||
|
||||
When working, you have to integrate some changes to your work or push your work
|
||||
to be available for others. To pull changes into a local `branch`, usually run
|
||||
the following:
|
||||
|
||||
git checkout {{branch}}
|
||||
git pull origin {{branch}}
|
||||
|
||||
To push your changes, usually run the following:
|
||||
|
||||
git checkout {{branch}}
|
||||
git push origin {{branch}}
|
||||
|
||||
Sometimes, things can get a little more complicated. The diagram below shows
which git operation should be performed when a piece of code has to be
integrated from one branch to another. Note that `main_branch` is the
**master** branch in our case.
|
||||
|
||||
```
|
||||
|<---------------------------|
|
||||
| squash merge |
|
||||
|--------------------------->|
|
||||
| merge |
|
||||
| |
|
||||
|<-----------|<--------------|
|
||||
| merge | squash merge |
|
||||
| | |
|
||||
|----------->|-------------->|
|
||||
| rebase | merge |
|
||||
| | rebase --onto |
|
||||
| | |
|
||||
main_branch epic_branch task_branch
|
||||
```
|
||||
|
||||
There are a couple of cases:
|
||||
|
||||
* If code has to be integrated from a task branch to the main branch, use
  **squash merge**. While you were working on a task, you probably committed a
  couple of cleanup commits that are not relevant to the main branch. In the
  other direction, while integrating the main branch into a task branch, a
  **regular merge** is OK because changes from the task branch will later be
  squash merged.
|
||||
|
||||
* You should use **squash merge** when integrating changes from a task to an
  epic branch (the task might have irrelevant commits). On the other hand, you
  should use a **regular merge** when an epic is completed and has to be
  integrated into the main branch. An epic is a more significant piece of work,
  decoupled into compilable and testable commits. All these commits should be
  preserved to be able to find potential issues later on.
|
||||
|
||||
* You should use **rebase** when integrating changes from main to an epic
  branch. The epic branch has to be as clean as possible, so avoid pure merge
  commits. Once you rebase the epic on main, all commits on the epic branch
  will change their hashes. The implications are: 1) you have to force push
  your local branch to the origin, 2) if you made a task branch out of the epic
  branch, you will have to use **rebase --onto** (please refer to `git help
  rebase` for details). In simple cases, a **regular merge** should be
  sufficient to integrate changes from an epic to a task branch (that can even
  be done via the GitHub web interface).
|
||||
|
||||
During any code integration, you may get reports that some files have
|
||||
conflicting changes. If you need help resolving them, don't be afraid to ask
|
||||
around! After you've resolved them, mark them as done with the `git add` command.
|
||||
You may then continue with `git {{action}} --continue`.
|
@ -1,185 +0,0 @@
|
||||
# Python 3 Query Modules
|
||||
|
||||
## Introduction
|
||||
|
||||
Memgraph exposes a C API for writing the so called Query Modules. These
|
||||
modules contain definitions of procedures which can be invoked through the
|
||||
query language using the `CALL ... YIELD ...` syntax. This mechanism allows
|
||||
database users to extend Memgraph with their own algorithms and
|
||||
functionalities.
|
||||
|
||||
Using a low level language like C can be quite cumbersome for writing modules,
|
||||
so it seems natural to add support for a higher level language on top of the
|
||||
existing C API.
|
||||
|
||||
There are languages written exactly for this purpose of extending C with high
|
||||
level constructs, for example Lua and Guile. Instead of those, we have chosen
|
||||
Python 3 to be the first high-level language we will support. The primary
reason is that it's very popular, so more people should be able to write modules.
|
||||
Another benefit of Python which comes out of its popularity is the large
|
||||
ecosystem of libraries, especially graph algorithm related ones like NetworkX.
|
||||
Python does have significant performance and implementation downsides compared
|
||||
to Lua and Guile, but these are described in more detail later in this
|
||||
document.
|
||||
|
||||
## Python 3 API Overview
|
||||
|
||||
The Python 3 API should be as user friendly as possible as well as look
|
||||
Pythonic. This implies that some functions from the C API will not map to the
|
||||
exact same functions. The most obvious case for a Pythonic approach is
|
||||
registering procedures of a query module. Let's take a look at the C example
|
||||
and its transformation to Python.
|
||||
|
||||
```c
|
||||
static void procedure(const struct mgp_list *args,
|
||||
const struct mgp_graph *graph, struct mgp_result *result,
|
||||
struct mgp_memory *memory);
|
||||
|
||||
int mgp_init_module(struct mgp_module *module, struct mgp_memory *memory) {
|
||||
struct mgp_proc *proc =
|
||||
mgp_module_add_read_procedure(module, "procedure", procedure);
|
||||
if (!proc) return 1;
|
||||
if (!mgp_proc_add_arg(proc, "required_arg",
|
||||
mgp_type_nullable(mgp_type_any())))
|
||||
return 1;
|
||||
struct mgp_value *null_value = mgp_value_make_null(memory);
|
||||
if (!mgp_proc_add_opt_arg(proc, "optional_arg",
|
||||
mgp_type_nullable(mgp_type_any()), null_value)) {
|
||||
mgp_value_destroy(null_value);
|
||||
return 1;
|
||||
}
|
||||
mgp_value_destroy(null_value);
|
||||
if (!mgp_proc_add_result(proc, "result", mgp_type_string())) return 1;
|
||||
if (!mgp_proc_add_result(proc, "args",
|
||||
mgp_type_list(mgp_type_nullable(mgp_type_any()))))
|
||||
return 1;
|
||||
return 0;
|
||||
}
|
||||
```
|
||||
|
||||
In Python things should be a lot simpler.
|
||||
|
||||
```Python
|
||||
# mgp.read_proc obtains the procedure name via __name__ attribute of a function.
|
||||
@mgp.read_proc(# Arguments passed to multiple mgp_proc_add_arg calls
|
||||
(('required_arg', mgp.Nullable(mgp.Any)), ('optional_arg', mgp.Nullable(mgp.Any), None)),
|
||||
# Result fields passed to multiple mgp_proc_add_result calls
|
||||
(('result', str), ('args', mgp.List(mgp.Nullable(mgp.Any)))))
|
||||
def procedure(args, graph, result, memory):
|
||||
pass
|
||||
```
|
||||
|
||||
Here we have replaced the `mgp_module_*` and `mgp_proc_*` C API with a much
simpler decorator function in Python, `mgp.read_proc`. The types of arguments
and result fields can be both our own types and Python builtin types which map
to supported `mgp_value` types. The expected builtin types we ought to support
are `bool`, `str`, `int`, `float` and `map`, while the rest of the types are
provided via our Python API. Optionally, we can add convenience support for the
`object` type, which would map to `mgp.Nullable(mgp.Any)`, and `list`, which
would map to `mgp.List(mgp.Nullable(mgp.Any))`. Also, it makes sense to look
into whether we can leverage Python's `typing` module here.
|
||||
|
||||
Another Pythonic change is to remove the `mgp_value` C API from Python
altogether. This means that the arguments a Python procedure receives are not
`mgp_value` instances but rather `PyObject` instances. In other words, our
implementation would immediately marshal each `mgp_value` to the corresponding
type in Python. Obviously, we would need to provide our own Python types for
non-builtin things like `mgp.Vertex` (equivalent to `mgp_vertex`) and others.
|
||||
|
||||
Continuing from our example above, let's say the procedure was invoked through
|
||||
Cypher using the following query.
|
||||
|
||||
MATCH (n) CALL py_module.procedure(42, n) YIELD *;
|
||||
|
||||
The Python procedure could then do the following and complete without throwing
either an AssertionError or a ValueError.
|
||||
|
||||
```Python
|
||||
def procedure(args, graph, result, memory):
    assert isinstance(args, list)
    # Unpacking throws ValueError if args does not contain exactly 2 values.
    required_arg, optional_arg = args
    assert isinstance(required_arg, int)
    assert isinstance(optional_arg, mgp.Vertex)
|
||||
```
|
||||
|
||||
The rest of the C API should naturally map to either top level functions or
|
||||
class methods as appropriate.
|
||||
|
||||
## Loading Python Query Modules
|
||||
|
||||
Our current mechanism for loading the modules is to look for `.so` files in
|
||||
the directory specified by the `--query-modules` flag. This is done when Memgraph
|
||||
is started. We can extend this mechanism to look for `.py` files in addition
|
||||
to `.so` files in the same directory and import them in the embedded Python
|
||||
interpreter. The only issue is embedding the interpreter in Memgraph. There
|
||||
are multiple choices:
|
||||
|
||||
1. Building Memgraph and statically linking to Python.
|
||||
2. Building Memgraph and dynamically linking to Python, and distributing
|
||||
Python with Memgraph's installation.
|
||||
3. Building Memgraph and dynamically linking to Python, but without
|
||||
distributing the Python library.
|
||||
4. Building Memgraph and optionally loading Python library by trying to
|
||||
`dlopen` it.
|
||||
|
||||
The first two options are only viable if the Python license allows, and this
|
||||
will need further investigation.
|
||||
|
||||
The third option adds Python as an installation dependency for Memgraph, and
|
||||
without it Memgraph will not run. This is problematic for users who cannot
|
||||
or do not want to install Python 3.
|
||||
|
||||
The fourth option avoids all of the issues present in the first 3 options, but
|
||||
comes at a higher implementation cost. We would need to try to `dlopen` the
|
||||
Python library and setup function pointers. If we succeed we would import
|
||||
`.py` files from the `--query-modules` directory. On the other hand, if the
|
||||
user does not have Python, `dlopen` would fail and Memgraph would run without
|
||||
Python support.
|
||||
|
||||
After a live discussion, we've decided to go with option 3. This way we don't
have to worry about mismatches between the Python versions we support and what
the users expect. Also, we should target Python 3.5, as that should be the
common version between Debian and CentOS, for which we ship installation
packages.
|
||||
|
||||
## Performance and Implementation Problems
|
||||
|
||||
As previously mentioned, embedding Python introduces usability issues compared
|
||||
to other embeddable languages.
|
||||
|
||||
The first, major issue is the Global Interpreter Lock (GIL). Initializing
Python will start a single global interpreter, and running multiple threads
will require acquiring the GIL. In practice, this means that when multiple
users run a procedure written in Python in parallel, the execution will not
actually be parallel. Python's interpreter will jump between executing one
user's procedure and the other's. This can be quite an issue for long-running
procedures when multiple users are querying Memgraph. A potential solution for
this issue is Python's API for sub-interpreters. Unfortunately, the support for
them is rather poor, and the API exhibited a lot of critical bugs when we tried
to use it. For the time being, we will have to accept the GIL and its
downsides. Perhaps in the future we will gain more knowledge on how to reduce
the rate at which the GIL is acquired, or the sub-interpreter API will get
improved.
|
||||
|
||||
Another major issue is memory allocation. Python's C API does not have support
|
||||
for setting up a temporary allocator during execution of a single function.
|
||||
It only has support for setting up a global heap allocator. This obviously
|
||||
impacts our control of memory during a query procedure invocation. Besides
|
||||
potential performance penalty, a procedure could allocate much more memory
|
||||
than we would actually allow for execution of a single query. This means that
|
||||
options controlling the memory limit during query execution are useless. On
|
||||
the bright side, Python does use block style allocators and reference
|
||||
counting, so the performance penalty and global memory usage should not be
|
||||
that terrible.
|
||||
|
||||
The final issue that isn't as major as the ones above is the global state of
|
||||
the interpreter. In practice this means that any registered procedure and
|
||||
imported module has access to any other procedure and module. This may pollute
|
||||
the namespace for other users, but it should not be much of a problem because
|
||||
Python always has things under a module scope. The other, slightly bigger
|
||||
downside is that a malicious user could use this knowledge to modify other
|
||||
modules and procedures. This seems like a major issue, but if we take the
|
||||
bigger picture into consideration, we already have a security issue in general
|
||||
by invoking `dlopen` on `.so` and potentially running arbitrary code. This was
|
||||
the trade-off we chose to allow users to extend Memgraph. It's up to the users
to write sane extensions and to protect access to their servers.
|
@ -1,198 +0,0 @@
|
||||
# Tensorflow Op - Technicalities
|
||||
|
||||
The final result should be a shared object (".so") file that can be dynamically
|
||||
loaded by the Tensorflow runtime in order to directly access the bolt client.
|
||||
|
||||
## About Tensorflow
|
||||
|
||||
Tensorflow is usually used with Python such that the Python code is used to
|
||||
define a directed acyclic computation graph. Basically no computation is done
|
||||
in Python. Instead, values from Python are copied into the graph structure as
|
||||
constants to be used by other Ops. The directed acyclic graph naturally ends up
|
||||
with two sets of border nodes, one for inputs, one for outputs. These are
|
||||
sometimes called "feeds".
|
||||
|
||||
Following the Python definition of the graph, during training, the entire data
|
||||
processing graph/pipeline is called from Python as a single expression. This
|
||||
leads to lazy evaluation since the called result has already been defined for a
|
||||
while.
|
||||
|
||||
Tensorflow internally works with tensors, i.e. n-dimensional arrays. That means
|
||||
all of its inputs need to be matrices as well as its outputs. While it is
|
||||
possible to feed data directly from Python's numpy matrices straight into
|
||||
Tensorflow, this is less desirable than using the Tensorflow data API (which
|
||||
defines data input and processing as a Tensorflow graph) because:
|
||||
|
||||
1. The data API is written in C++ and entirely avoids Python, and as such is
   faster.
2. The data API, unlike Python, is available in "Tensorflow serving", the
   default way to serve Tensorflow models in production.
|
||||
|
||||
Once the entire input pipeline is defined via the tf.data API, its input is
|
||||
basically a list of node IDs the model is supposed to work with. The model,
|
||||
through the data API knows how to connect to Memgraph and execute openCypher
|
||||
queries in order to get the remaining data it needs. (For example features of
|
||||
neighbouring nodes.)
|
||||
|
||||
## The Interface
|
||||
|
||||
I think it's best you read the official guide...
|
||||
<https://www.tensorflow.org/extend/adding_an_op> And especially the addition
|
||||
that specifies how data ops are special
|
||||
<https://www.tensorflow.org/extend/new_data_formats>
|
||||
|
||||
## Compiling the TF Op
|
||||
|
||||
There are two options for compiling a custom op. One of them involves pulling
|
||||
the TF source, adding your code to it and compiling via bazel. This is
|
||||
probably awkward to do for us and would significantly slow down compilation.
|
||||
|
||||
The other method involves installing Tensorflow as a Python package and pulling
the required headers from, for example,
`/usr/local/lib/python3.6/site-packages/tensorflow/include`. We can then compile
our Op with our regular build system.
|
||||
|
||||
This is practical since we can copy the required headers to our repo. If
|
||||
necessary, we can have several versions of the headers to build several
|
||||
versions of our Op for every TF version which we want to support. (But this is
|
||||
unlikely to be required as the API should be stable).
|
||||
|
||||
## Example for Using the Bolt Client Tensorflow Op
|
||||
|
||||
### Dynamic Loading
|
||||
|
||||
``` python3
|
||||
import tensorflow as tf
|
||||
|
||||
mg_ops = tf.load_op_library('/usr/bin/memgraph/tensorflow_ops.so')
|
||||
```
|
||||
|
||||
### Basic Usage
|
||||
|
||||
``` python3
|
||||
dataset = mg_ops.OpenCypherDataset(
|
||||
# This is probably unfortunate as the username and password
|
||||
# get hardcoded into the graph, but for the simple case it's fine
|
||||
"hostname:7687", auth=("user", "pass"),
|
||||
|
||||
# Our query
|
||||
'''
|
||||
MATCH (n:Train) RETURN n.id, n.features
|
||||
''',
|
||||
|
||||
# Cast return values to these types
|
||||
(tf.string, tf.float32))
|
||||
|
||||
# Some Tensorflow data api boilerplate
|
||||
iterator = dataset.make_one_shot_iterator()
|
||||
next_element = iterator.get_next()
|
||||
|
||||
# Up to now we have only defined our computation graph which basically
|
||||
# just connects to Memgraph
|
||||
# `next_element` is not really data but a handle to a node in the Tensorflow
|
||||
# graph, which we can and do evaluate
|
||||
# It is a Tensorflow tensor with shape=(None, 2)
|
||||
# and dtype=(tf.string, tf.float)
|
||||
# shape `None` means the shape of the tensor is unknown at definition time
|
||||
# and is dynamic and will only be known once the tensor has been evaluated
|
||||
|
||||
with tf.Session() as sess:
|
||||
node_ids = sess.run(next_element)
|
||||
# `node_ids` contains IDs and features of all the nodes
|
||||
# in the graph with the label "Train"
|
||||
# It is a numpy.ndarray with a shape ($n_matching_nodes, 2)
|
||||
```
|
||||
|
||||
### Memgraph Client as a Generic Tensorflow Op
|
||||
|
||||
Other than the Tensorflow Data Op, we'll want to support a generic Tensorflow
|
||||
Op which can be put anywhere in the Tensorflow computation Graph. It takes in
|
||||
an arbitrary tensor and produces a tensor. This would be used in the GraphSage
|
||||
algorithm to fetch the lowest level features into Tensorflow
|
||||
|
||||
```python3
|
||||
requested_ids = np.array([1, 2, 3])
|
||||
ids_placeholder = tf.placeholder(tf.int32)
|
||||
|
||||
model = mg_ops.OpenCypher(
|
||||
"hostname:7687", auth=("user", "pass"),
|
||||
"""
|
||||
UNWIND $node_ids as nid
|
||||
MATCH (n:Train {id: nid})
|
||||
RETURN n.features
|
||||
""",
|
||||
|
||||
# What to call the input tensor as an openCypher parameter
|
||||
parameter_name="node_ids",
|
||||
|
||||
# Type of our resulting tensor
|
||||
dtype=(tf.float32)
|
||||
)
|
||||
|
||||
features = model(ids_placeholder)
|
||||
|
||||
with tf.Session() as sess:
|
||||
result = sess.run(features,
|
||||
feed_dict={ids_placeholder: requested_ids})
|
||||
```
|
||||
|
||||
This is probably easier to implement than the Data Op, so it might be a good
|
||||
idea to start with.
|
||||
|
||||
### Production Usage
|
||||
|
||||
During training, in the GraphSage algorithm at least, Memgraph is at the
|
||||
beginning and at the end of the Tensorflow computation graph. At the
|
||||
beginning, the Data Op provides the node IDs which are fed into the generic
Tensorflow Op to find their neighbours, those neighbours' neighbours, and their features.
|
||||
|
||||
Production usage differs in that we don't use the Data Op. The Data Op is
|
||||
effectively cut off and the initial input is fed by Tensorflow serving, with
|
||||
the data found in the request.
|
||||
|
||||
For example a JSON request to classify a node might look like:
|
||||
|
||||
`POST http://host:port/v1/models/GraphSage/versions/v1:classify`
|
||||
|
||||
With the contents:
|
||||
|
||||
```json
|
||||
{
  "examples": [
    {"node_id": 1},
    {"node_id": 2}
  ]
}
|
||||
```
|
||||
|
||||
Every element of the "examples" list is an example to be computed. Each is
|
||||
represented by a dict with keys matching names of feeds in the Tensorflow graph
|
||||
and values being the values we want fed in for each example.
|
||||
|
||||
The REST API then replies in kind with the classification result in JSON.
|
||||
|
||||
A note about adding our custom Op to Tensorflow serving: our Op's `.so` can be
added into the Bazel build to link with Tensorflow serving, or it can be
dynamically loaded by starting Tensorflow serving with the `--custom_op_paths`
flag.
|
||||
|
||||
### Considerations
|
||||
|
||||
There might be an issue here in that the URL used to connect to Memgraph is
hardcoded into the op and would thus be wrong when moved to production,
requiring some kind of hack to make it work. We probably want to solve this by
having the client op take another `tf.Variable` as an input which would contain
the connection URL and username/password. We have to research whether this
makes it easy enough to move to production, as the connection string variable
is still a part of the graph, but it might be easier to replace.
|
||||
|
||||
It is probably the best idea to utilize openCypher parameters to make our
|
||||
queries flexible. The exact API as to how to declare the parameters in Python
|
||||
is open to discussion.
|
||||
|
||||
The Data Op might not even be necessary to implement as it is not key for
|
||||
production use. It can be replaced in training mode with feed dicts and either
|
||||
|
||||
1. Getting the initial list of nodes via a Python Bolt client
|
||||
2. Creating a separate Tensorflow computation graph that gets all the relevant
|
||||
node IDs into Python
|
@ -1,33 +0,0 @@
|
||||
# Feature Specifications
|
||||
|
||||
## Active
|
||||
|
||||
* [Python Query Modules](active/python-query-modules.md)
|
||||
* [Tensorflow Op](active/tensorflow-op.md)
|
||||
|
||||
## Draft
|
||||
|
||||
* [A-star Variable-length Expand](draft/a-star-variable-length-expand.md)
|
||||
* [Cloud-native Graph Store](draft/cloud-native-graph-store.md)
|
||||
* [Compile Filter Expressions](draft/compile-filter-expressions.md)
|
||||
* [Database Triggers](draft/database-triggers.md)
|
||||
* [Date and Time Data Types](draft/date-and-time-data-types.md)
|
||||
* [Distributed Query Execution](draft/distributed-query-execution.md)
|
||||
* [Edge Create or Update Queries](draft/edge-create-or-update-queries.md)
|
||||
* [Extend Variable-length Filter Expressions](draft/extend-variable-length-filter-expression.md)
|
||||
* [Geospatial Data Types](draft/geospatial-data-types.md)
|
||||
* [Hybrid Storage Engine](draft/hybrid-storage-engine.md)
|
||||
* [Load Data Queries](draft/load-data-queries.md)
|
||||
* [Multitenancy](draft/multitenancy.md)
|
||||
* [Query Compilation](draft/query-compilation.md)
|
||||
* [Release Log Levels](draft/release-log-levels.md)
|
||||
* [Rust Query Modules](draft/rust-query-modules.md)
|
||||
* [Sharded Graph Store](draft/sharded-graph-store.md)
|
||||
* [Storage Memory Management](draft/storage-memory-management.md)
|
||||
* [Vectorized Query Execution](draft/vectorized-query-execution.md)
|
||||
|
||||
## Obsolete
|
||||
|
||||
* [Distributed](obsolete/distributed.md)
|
||||
* [High-availability](obsolete/high-availability.md)
|
||||
* [Kafka Integration](obsolete/kafka-integration.md)
|
@ -1,15 +0,0 @@
|
||||
# A-star Variable-length Expand
|
||||
|
||||
Like DFS/BFS/WeightedShortestPath, it should be possible to support the A-star
algorithm in the form of variable-length expansion.
|
||||
|
||||
Syntactically, the query should look like the following one:
|
||||
```
|
||||
MATCH (start)-[
|
||||
*aStar{{hops}} {{heuristic_expression}} {{weight_expression}} {{aggregated_weight_variable}} {{filtering_expression}}
|
||||
]-(end)
|
||||
RETURN {{aggregated_weight_variable}};
|
||||
```
|
||||
|
||||
It would be convenient to add geospatial data support first, because A-star
works well with geospatial data (a heuristic function might exist).
|
@ -1,7 +0,0 @@
|
||||
# Cloud-native Graph Store
|
||||
|
||||
The biggest problem with the current in-memory storage is the total cost of
|
||||
ownership for large, infrequently updated datasets. An idea to solve that is to
decouple storage and compute inside a cloud environment. E.g., on AWS, a
|
||||
database instance could use EC2 machines to run the query execution against
|
||||
data stored inside S3.
|
@ -1,40 +0,0 @@
|
||||
# Compile Filter Expressions
|
||||
|
||||
Memgraph evaluates a filter expression by traversing the abstract syntax tree of
|
||||
the given filter. Filtering is a general operation in query execution.
|
||||
|
||||
Some simple examples are:
|
||||
```
|
||||
MATCH (n:Person {name: "John"}) WHERE n.age > 20 AND n.age < 40 RETURN n;
|
||||
MATCH (a {id: 723})-[*bfs..10 (e, n | e.x > 12 AND n.y < 3)]-() RETURN *;
|
||||
```
|
||||
|
||||
More real-world example looks like this (Ethereum network analysis):
|
||||
```
|
||||
MATCH (a: Address {addr: ''})-[]->(t: Transaction)-[]->(b: Address)
|
||||
RETURN DISTINCT b.addr
|
||||
UNION
|
||||
MATCH (a: Address {addr: ''})-[]->(t: Transaction)-[]->(b1: Address)-[]->(t2: Transaction)-[]->(b: Address)
|
||||
WHERE t2.timestamp > t.timestamp
|
||||
RETURN DISTINCT b.addr
|
||||
UNION
|
||||
MATCH (a: Address {addr: ''})-[]->(t: Transaction)-[]->(b1: Address)-[]->(t2: Transaction)-[]->(b2: Address)-[]->(t3: Transaction)-[]->(b: Address)
|
||||
WHERE t2.timestamp > t.timestamp AND t3.timestamp > t2.timestamp
|
||||
RETURN DISTINCT b.addr
|
||||
UNION
|
||||
MATCH (a: Address {addr: ''})-[]->(t: Transaction)-[]->(b1: Address)-[]->(t2: Transaction)-[]->(b2: Address)-[]->(t3: Transaction)-[]->(b3: Address)-[]->(t4: Transaction)-[]->(b: Address)
|
||||
WHERE t2.timestamp > t.timestamp AND t3.timestamp > t2.timestamp AND t4.timestamp > t3.timestamp
|
||||
RETURN DISTINCT b.addr
|
||||
UNION
|
||||
MATCH (a: Address {addr: ''})-[]->(t: Transaction)-[]->(b1: Address)-[]->(t2: Transaction)-[]->(b2: Address)-[]->(t3: Transaction)-[]->(b3: Address)-[]->(t4: Transaction)-[]->(b4: Address)-[]->(t5: Transaction)-[]->(b: Address)
|
||||
WHERE t2.timestamp > t.timestamp AND t3.timestamp > t2.timestamp AND t4.timestamp > t3.timestamp AND t5.timestamp > t4.timestamp
|
||||
RETURN DISTINCT b.addr;
|
||||
```
|
||||
|
||||
Filtering may take a significant portion of query execution, which means it has
|
||||
to be fast.
|
||||
|
||||
The first step towards improvement might be to expose an API under which a
developer can implement their own filtering logic (it's OK to support only C++
in the beginning). Later on, we can introduce automatic compilation of
filtering expressions.
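One possible shape of such an API is sketched below. None of these names exist
in the codebase; the sketch only shows how a hand-written C++ predicate could
replace AST interpretation for a filter like `n.age > 20 AND n.age < 40`:

```cpp
#include <cstdint>
#include <functional>
#include <iostream>
#include <string>
#include <unordered_map>

// Illustrative record type standing in for a vertex/edge accessor.
struct Record {
  std::unordered_map<std::string, int64_t> int_props;
};

// A compiled filter is just a predicate over a record.
using CompiledFilter = std::function<bool(const Record &)>;

// Hand-written equivalent of: n.age > 20 AND n.age < 40
CompiledFilter AgeBetween20And40() {
  return [](const Record &r) {
    auto it = r.int_props.find("age");
    return it != r.int_props.end() && it->second > 20 && it->second < 40;
  };
}

int main() {
  Record r{{{"age", 30}}};
  CompiledFilter filter = AgeBetween20And40();
  std::cout << filter(r) << "\n";  // prints 1
  return 0;
}
```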
|
@ -1,14 +0,0 @@
|
||||
# Database Triggers
|
||||
|
||||
Memgraph doesn't have any built-in notification mechanism yet. If a
|
||||
user wants to get notified about anything happening inside Memgraph, the only
|
||||
option is some pull mechanism from the client code. In many cases, that might
|
||||
be suboptimal.
|
||||
|
||||
A natural place to start would be to put some notification code on each update
action inside Memgraph. It's probably too early to send a notification
immediately after a WAL delta gets created, but some point after the
transaction commits or after WAL deltas are written to disk might be a pretty
good place.
|
||||
Furthermore, Memgraph has the query module infrastructure. The first
|
||||
implementation might call a user-defined query module procedure and pass
|
||||
whatever gets created or updated during the query execution.
|
@ -1,13 +0,0 @@
|
||||
# Date and Time Data Types
|
||||
|
||||
Neo4j offers the following functionality:
|
||||
|
||||
* https://neo4j.com/docs/cypher-manual/current/syntax/temporal/
|
||||
* https://neo4j.com/docs/cypher-manual/current/functions/temporal/
|
||||
|
||||
The question is, how are we going to support equivalent capabilities? We need
|
||||
something very similar because these are, in general, very well defined types.
|
||||
|
||||
A note about the storage is that Memgraph has a limit on the total number of
|
||||
different data types, 16 at this point. We have to be mindful of that during
|
||||
the design phase.
|
@ -1,10 +0,0 @@
|
||||
# Distributed Query Execution
|
||||
|
||||
Add the ability to execute graph algorithms on a cluster of machines. The scope
|
||||
of this is ONLY the query execution without changing the underlying storage
|
||||
because that's much more complex. The first significant decision here is to
|
||||
figure out whether to implement our own distributed execution engine or deploy
|
||||
something already available, like [Giraph](https://giraph.apache.org). An
|
||||
important part is that Giraph by itself isn't enough because people want to
|
||||
update data on the fly. The final solution needs to provide some updating
|
||||
capabilities.
|
@ -1,14 +0,0 @@
|
||||
# Edge Create or Update Queries
|
||||
|
||||
The old semantic of the `MERGE` clause is quite tricky. The new semantic of
|
||||
`MERGE` is explained
|
||||
[here](https://blog.acolyer.org/2019/09/18/updating-graph-databases-with-cypher/).
|
||||
|
||||
Similar to `MERGE`, but maybe simpler is to define clauses and semantics that
|
||||
apply only to a single edge. If an edge between two nodes doesn't
|
||||
exist, it should be created. On the other hand, if it exists, it should be
|
||||
updated. The syntax should look similar to the following:
|
||||
|
||||
```
|
||||
MERGE EDGE (a)-[e:Type {props}]->(b) [ON CREATE SET expression ON UPDATE SET expression] ...
|
||||
```
|
@ -1,12 +0,0 @@
|
||||
# Extend Variable-length Filter Expressions
|
||||
|
||||
Variable-length filtering (DFS/BFS/WeightedShortestPath) can be arbitrarily
complex. At this point, the filtering expression only gets the currently visited
|
||||
node and edge:
|
||||
|
||||
```
|
||||
MATCH (a {id: 723})-[*bfs..10 (e, n | e.x > 12 AND n.y < 3)]-() RETURN *;
|
||||
```
|
||||
|
||||
If a user had the whole path available, they could write more complex filtering
logic.
|
@ -1,28 +0,0 @@
|
||||
# Geospatial Data Types
|
||||
|
||||
Neo4j offers the following functionality:
|
||||
|
||||
* https://neo4j.com/docs/cypher-manual/current/syntax/spatial/
|
||||
* https://neo4j.com/docs/cypher-manual/current/functions/spatial/
|
||||
|
||||
The question is, how are we going to support equivalent capabilities? We need
|
||||
something very similar because these are, in general, very well defined types.
|
||||
|
||||
The main reasons for implementing this feature are:
|
||||
1. Ease of use. At this point, users have to encode/decode these data types
   manually.
|
||||
2. Memory efficiency in some cases (although a user-defined encoding could
   still be more efficient).
|
||||
|
||||
The number of functionalities that could be built on top of geospatial types is
huge. Probably some C/C++ libraries should be used:

* https://github.com/OSGeo/gdal
* http://geostarslib.sourceforge.net/

Furthermore, the query engine could use these data types during query
execution; a library specific to query execution is:

* https://www.cgal.org

Also, the storage engine could have specialized indices for these types of
data.
|
||||
|
||||
A note about the storage is that Memgraph has a limit on the total number of
|
||||
different data types, 16 at this point. We have to be mindful of that during
|
||||
the design phase.
|
@ -1,20 +0,0 @@
|
||||
# Hybrid Storage Engine
|
||||
|
||||
The goal here is to improve Memgraph storage massively! Please take a look
|
||||
[here](http://cidrdb.org/cidr2020/papers/p29-neumann-cidr20.pdf) for the
|
||||
reasons.
|
||||
|
||||
The general idea is to store edges on disk by using an LSM-like data structure.
Storing edge properties will be tricky because a strict schema also has to be
|
||||
introduced. Otherwise, it's impossible to store data on disk optimally (Neo4j
|
||||
already has a pretty optimized implementation of that). Furthermore, we have to
|
||||
introduce the paging concept.
|
||||
|
||||
This is a complex feature because various aspects of the core engine have to be
|
||||
considered and probably updated (memory management, garbage collection,
|
||||
indexing).
|
||||
|
||||
## References
|
||||
|
||||
* [On Disk IO, Part 3: LSM Trees](https://medium.com/databasss/on-disk-io-part-3-lsm-trees-8b2da218496f)
|
||||
* [2020-04-13 On-disk Edge Store Research](https://docs.google.com/document/d/1avoR2g9dNWa4FSFt9NVn4JrT6uOAH_ReNeUoNVsJ7J4)
|
@ -1,17 +0,0 @@
|
||||
# Load Data Queries
|
||||
|
||||
Loading data into Memgraph is a challenging task. We have to implement
|
||||
something equivalent to the [Neo4j LOAD
|
||||
CSV](https://neo4j.com/developer/guide-import-csv/#import-load-csv). This
|
||||
feature seems relatively straightforward to implement because `LoadCSV` could
|
||||
be another operator that would yield row by row. By having the operator, the
|
||||
operation would be composable with the rest of the `CREATE`|`MERGE` queries.
|
||||
The composability is the key because users would be able to combine various
|
||||
clauses to import data.
|
||||
|
||||
A more general concept is [SingleStore
|
||||
Pipelines](https://docs.singlestore.com/v7.1/reference/sql-reference/pipelines-commands/create-pipeline).
|
||||
|
||||
We already tried with [Graph Streams](../obsolete/kafka-integration.md). An option
|
||||
is to migrate that code as a standalone product
|
||||
[here](https://github.com/memgraph/mgtools).
|
@ -1,15 +0,0 @@
|
||||
# Multitenancy
|
||||
|
||||
[Multitenancy](https://en.wikipedia.org/wiki/Multitenancy) is a feature mainly
|
||||
in the domain of ease of use. Neo4j made a great move by introducing
|
||||
[Fabric](https://neo4j.com/developer/multi-tenancy-worked-example).
|
||||
|
||||
Memgraph's first step in a similar direction would be to add an abstraction layer
|
||||
containing multiple `Storage` instances + the ability to specify a database
|
||||
instance per client session or database transaction.
|
||||
|
||||
## Replication Context
|
||||
|
||||
Each transaction has to encode on top of which database it's getting executed.
|
||||
Once a replica gets delta objects containing database info, the replica engine
|
||||
could apply changes locally.
|
@ -1,14 +0,0 @@
|
||||
# Query Compilation
|
||||
|
||||
Memgraph supports the interpretation of queries in a pull-based way. An
|
||||
advantage of interpreting queries is a fast time until the execution, which is
|
||||
convenient when a user wants to test a bunch of queries in a short time. The
|
||||
downside is slow runtime. The runtime could be improved by compiling query
|
||||
plans.
|
||||
|
||||
## Research Area 1
|
||||
|
||||
The easiest route to the query compilation might be generating [virtual
|
||||
constexpr](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1064r0.html)
|
||||
pull functions, making a dynamic library out of the entire compiled query plan,
|
||||
and swapping query plans during the database runtime.
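A rough sketch of the dynamic-library part of that idea follows. The function
name, signature and file path are assumptions for illustration only; a real
compiled plan would also need to receive the frame, the transaction and the
storage accessors:

```cpp
#include <dlfcn.h>

#include <cstdio>

// Assumed signature of a generated pull function (declared extern "C" in the
// generated code so the symbol name is not mangled): returns true while it can
// produce another row.
using PullFn = bool (*)();

int main() {
  // The generated plan would have been compiled beforehand, e.g.:
  //   g++ -shared -fPIC -o /tmp/plan_1234.so plan_1234.cpp
  void *handle = dlopen("/tmp/plan_1234.so", RTLD_NOW);
  if (!handle) {
    std::fprintf(stderr, "dlopen failed: %s\n", dlerror());
    return 1;
  }
  auto pull = reinterpret_cast<PullFn>(dlsym(handle, "Pull"));
  if (pull) {
    while (pull()) {
      // Each successful call produced one row of the result.
    }
  }
  dlclose(handle);
  return 0;
}
```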
|
@ -1,17 +0,0 @@
|
||||
# Release Log Levels
|
||||
|
||||
It's impossible to control the log level in Memgraph Community. That means it's
|
||||
tough to debug issues when interacting with Memgraph. At least three log levels
|
||||
should be available to the user:
|
||||
|
||||
* Log nothing (as it is now).
|
||||
* Log each executed query.
|
||||
* Log Bolt server states.
|
||||
|
||||
Memgraph Enterprise has the audit log feature. The audit log provides
|
||||
additional info about each query (user, source, etc.), but it's only available
|
||||
in the Enterprise edition. Furthermore, the intention of audit logs isn't
|
||||
debugging.
|
||||
|
||||
An important note is that the logged queries should be stripped out because, in
|
||||
the Memgraph cloud context, we shouldn't log sensitive data.
|
@ -1,15 +0,0 @@
|
||||
# Rust Query Modules
|
||||
|
||||
Memgraph provides the query modules infrastructure. It's possible to write
|
||||
query modules in
|
||||
[C/C++](https://docs.memgraph.com/memgraph/reference-overview/query-modules/c-api)
|
||||
and
|
||||
[Python](https://docs.memgraph.com/memgraph/reference-overview/query-modules/python-api).
|
||||
The problem with C/C++ is that it's very error-prone and time-consuming.
|
||||
Python's problem is that it's slow and has a bunch of other limitations listed
|
||||
in the [feature spec](../active/python-query-modules.md).
|
||||
|
||||
On the other hand, Rust is fast and much less error-prone compared to C. It
|
||||
should be possible to use [bindgen](https://github.com/rust-lang/rust-bindgen)
|
||||
to generate bindings out of the current C API and write wrapper code for Rust
|
||||
developers to enjoy.
|
@ -1,8 +0,0 @@
|
||||
# Sharded Graph Store
|
||||
|
||||
Add the ability to shard graph data across machines in a cluster. The scope of
|
||||
this is ONLY changes to the storage engine.
|
||||
|
||||
## References
|
||||
|
||||
* [Spinner: Scalable Graph Partitioning in the Cloud](https://arxiv.org/pdf/1404.3861.pdf)
|
@ -1,13 +0,0 @@
|
||||
# Storage Memory Management
|
||||
|
||||
If Memgraph uses too much memory, the OS will kill it. There has to be an internal
|
||||
mechanism to control memory usage.
|
||||
|
||||
Since C++17, polymorphic allocators are an excellent way to inject custom
memory management while keeping the code modular. Memgraph already uses PMR in
query execution. Also, refer to [1] on how to start with PMR in the storage
|
||||
context.
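As a reminder of what the PMR approach looks like, here is a minimal,
standard-library-only example (not Memgraph code): a monotonic buffer resource
backs the allocations of containers that receive it through the polymorphic
allocator.

```cpp
#include <array>
#include <cstddef>
#include <iostream>
#include <memory_resource>
#include <string>
#include <vector>

int main() {
  // A fixed buffer backs all allocations; when it is exhausted, the resource
  // falls back to the upstream resource (here: null, so it throws instead).
  std::array<std::byte, 4096> buffer;
  std::pmr::monotonic_buffer_resource resource(
      buffer.data(), buffer.size(), std::pmr::null_memory_resource());

  // Containers receive the resource through the polymorphic allocator, so the
  // container code stays generic while the memory policy is injected.
  std::pmr::vector<std::pmr::string> names(&resource);
  names.emplace_back("vertex");
  names.emplace_back("edge");

  std::cout << names.size() << "\n";  // prints 2
  return 0;
}
```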
|
||||
|
||||
## Resources
|
||||
|
||||
[1] [PMR: Mistakes Were Made](https://www.youtube.com/watch?v=6BLlIj2QoT8)
|
@ -1,9 +0,0 @@
|
||||
# Vectorized Query Execution
|
||||
|
||||
The Memgraph query engine pulls records one by one during query execution. A
more efficient way would be to pull multiple records at once in an array. Adding that
|
||||
shouldn't be complicated, but it wouldn't be advantageous without vectorizing
|
||||
fetching records from the storage.
|
||||
|
||||
On the query engine level, the array could be part of the frame. In other
|
||||
words, the frame and the code dealing with the frame have to be changed.
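A toy sketch of the batched approach is shown below; the types and names are
illustrative only and do not correspond to the actual operator or frame
interface:

```cpp
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <vector>

// Illustrative record source standing in for a storage scan.
struct Scan {
  int64_t next = 0, end = 10;
};

// Batched pull: fills up to `batch_size` records and reports how many were
// produced; 0 means the source is exhausted.
std::size_t PullBatch(Scan &scan, std::vector<int64_t> &batch,
                      std::size_t batch_size) {
  batch.clear();
  while (batch.size() < batch_size && scan.next < scan.end)
    batch.push_back(scan.next++);
  return batch.size();
}

int main() {
  Scan scan;
  std::vector<int64_t> batch;  // in the real engine this would live in the frame
  std::size_t total = 0;
  while (std::size_t n = PullBatch(scan, batch, 4)) total += n;
  std::cout << total << "\n";  // prints 10
  return 0;
}
```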
|
@ -1,148 +0,0 @@
|
||||
# Distributed Memgraph specs
|
||||
This document describes the reasoning behind Memgraph's distributed concepts.
|
||||
|
||||
## Distributed state machine
|
||||
Memgraph's distributed mode introduces two states of the cluster, recovering and
|
||||
working. The change between states shouldn't happen often, but when it happens
|
||||
it can take a while to make a transition from one to another.
|
||||
|
||||
### Recovering
|
||||
This state is the default state for Memgraph when the cluster starts with
|
||||
recovery flags. If the recovery finishes successfully, the state changes to
|
||||
working. If recovery fails, the user will be presented with a message that
|
||||
explains what happened and what the next steps are.
|
||||
|
||||
Another way to enter this state is failure. If the cluster encounters a failure,
|
||||
the master will enter the Recovering mode. This time, it will wait for all
|
||||
workers to respond with a message saying they are alive and well, and it will
make sure they all have a consistent state.
|
||||
|
||||
### Working
|
||||
This state should be the default state of Memgraph most of the time. When in
|
||||
this state, Memgraph accepts connections from Bolt clients and allows query
|
||||
execution.
|
||||
|
||||
If distributed execution fails for a transaction, that transaction, and all
|
||||
other active transactions will be aborted and the cluster will enter the
|
||||
Recovering state.
|
||||
|
||||
## Durability
|
||||
One of the important concepts in distributed Memgraph is durability.
|
||||
|
||||
### Cluster configuration
|
||||
When running Memgraph in distributed mode, the master will store cluster
|
||||
metadata in a persistent store. If for some reason the cluster shuts down,
|
||||
recovering Memgraph from durability files shouldn't require any additional
|
||||
flags.
|
||||
|
||||
### Database ID
|
||||
Each new and clean run of Memgraph should generate a new globally unique
|
||||
database id. This id will associate all files that were persisted during this
run. Adding the database id to snapshots, write-ahead logs and cluster metadata
files ties them to a specific Memgraph run, and it makes recovery easier to reason
|
||||
about.
|
||||
|
||||
When recovering, the cluster won't generate a new id, but will reuse the one
|
||||
from the snapshot/wal that it was able to recover from.
|
||||
|
||||
### Durability files
|
||||
Memgraph uses snapshots and write-ahead logs for durability.
|
||||
|
||||
When Memgraph recovers it has to make sure all machines in the cluster recover
|
||||
to the same recovery point. This is done by finding a common snapshot and
|
||||
finding common transactions in per-machine available write-ahead logs.
|
||||
|
||||
Since we can not be sure that each machine persisted durability files, we need
|
||||
to be able to negotiate a common recovery point in the cluster. Possible
|
||||
durability file failures could require to start the cluster from scratch,
|
||||
purging everything from storage and recovering from existing durability files.
|
||||
|
||||
We need to ensure that we keep wal files containing information about
|
||||
transactions between all existing snapshots. This will provide better durability
|
||||
in the case of a random machine durability file failure, where the cluster can
|
||||
find a common recovery point that all machines in the cluster have.
|
||||
|
||||
Also, we should suggest and make clear docs that anything less than two
|
||||
snapshots isn't considered safe for recovery.
|
||||
|
||||
### Recovery
|
||||
The recovery happens in following steps:
|
||||
* Master enables worker registration.
|
||||
* Master recovers cluster metadata from the persisted storage.
|
||||
* Master waits all required workers to register.
|
||||
* Master broadcasts a recovery request to all workers.
|
||||
* Workers respond with with a set of possible recovery points.
|
||||
* Master finds a common recovery point for the whole cluster.
|
||||
* Master broadcasts a recovery request with the common recovery point.
|
||||
* Master waits for the cluster to recover.
|
||||
* After a successful cluster recovery, master can enter Working state.
|
||||
|
||||
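As an illustration of the negotiation step only (the names are hypothetical
and this is not the actual RPC layer), the common recovery point could be
computed as the newest point present in every worker's response:

```python
# Hypothetical sketch of choosing a common recovery point.
# Each worker reports the recovery points (e.g. snapshot/transaction ids) it
# can recover to; the master picks the newest point shared by everyone.

def find_common_recovery_point(worker_responses):
    """worker_responses: dict mapping worker id -> set of recovery point ids."""
    if not worker_responses:
        raise ValueError("no workers registered")
    common = set.intersection(*worker_responses.values())
    if not common:
        # No shared point: the cluster cannot recover consistently.
        return None
    return max(common)  # prefer the most recent shared point


if __name__ == "__main__":
    responses = {
        1: {10, 20, 30},
        2: {20, 30},
        3: {10, 20},
    }
    print(find_common_recovery_point(responses))  # -> 20
```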
## Dynamic Graph Partitioning (abbr. DGP)

### Implemented parameters

--dynamic-graph-partitioner-enabled (If the dynamic graph partitioner should be
  enabled.) type: bool default: false (start time)
--dgp-improvement-threshold (How much better should specific node score be
  to consider a migration to another worker. This represents the minimal
  difference between new score that the vertex will have when migrated
  and the old one such that it's migrated.) type: int32 default: 10
  (start time)
--dgp-max-batch-size (Maximal amount of vertices which should be migrated
  in one dynamic graph partitioner step.) type: int32 default: 2000
  (start time)
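To make the role of `--dgp-improvement-threshold` and `--dgp-max-batch-size`
concrete, here is a rough, hypothetical sketch of a single partitioner step.
The scoring function itself is out of scope and all names are illustrative
only, not taken from the implementation:

```python
# Hypothetical sketch of one DGP step: migrate a vertex only if its score on
# the best other worker beats its current score by at least the improvement
# threshold, and never move more than max_batch_size vertices per step.

IMPROVEMENT_THRESHOLD = 10   # --dgp-improvement-threshold
MAX_BATCH_SIZE = 2000        # --dgp-max-batch-size


def plan_migrations(vertices, score,
                    improvement_threshold=IMPROVEMENT_THRESHOLD,
                    max_batch_size=MAX_BATCH_SIZE):
    """vertices: iterable of (vertex_id, current_worker, candidate_workers)."""
    migrations = []
    for vertex_id, current_worker, candidates in vertices:
        old_score = score(vertex_id, current_worker)
        best_worker = max(candidates, key=lambda w: score(vertex_id, w))
        new_score = score(vertex_id, best_worker)
        if new_score - old_score >= improvement_threshold:
            migrations.append((vertex_id, best_worker))
            if len(migrations) == max_batch_size:
                break  # the rest waits for the next step
    return migrations
```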
### Design decisions

* Each partitioning session has to be a new transaction.
* When and how does an instance perform the moves?
  * Periodically.
  * Token sharing (round robin, exactly one instance at a time has an
    opportunity to perform the moves).
* On server-side serialization error (when DGP receives an error).
  -> Quit partitioning and wait for the next turn.
* On client-side serialization error (when the end client receives an error).
  -> The client should never receive an error because of any
     internal operation.
  -> For the first implementation, it's good enough to wait until data becomes
     available again.
  -> It would be nice to achieve that DGP has lower priority than end client
     operations.

### End-user parameters

* --dynamic-graph-partitioner-enabled (execution time)
* --dgp-improvement-threshold (execution time)
* --dgp-max-batch-size (execution time)
* --dgp-min-batch-size (execution time)
  -> Minimum number of nodes that will be moved in each step.
* --dgp-fitness-threshold (execution time)
  -> Do not perform moves if the partitioning is good enough.
* --dgp-delta-turn-time (execution time)
  -> Time between each turn.
* --dgp-delta-step-time (execution time)
  -> Time between each step.
* --dgp-step-time (execution time)
  -> Time limit per each step.

### Testing

The implementation has to provide good enough results in terms of:

* How good the partitioning is (numeric value), aka goodness.
* Workload execution time.
* Stress test correctness.

Test cases:

* N not connected subgraphs
  -> shuffle nodes to N instances
  -> run partitioning
  -> test perfect partitioning.
* N connected subgraphs
  -> shuffle nodes to N instances
  -> run partitioning
  -> test partitioning.
* Take a realistic workload (Long Running, LDBC1, LDBC2, Card Fraud, BFS, WSP)
  -> measure exec time
  -> run partitioning
  -> test partitioning
  -> measure exec time (during and after partitioning).
@ -1,275 +0,0 @@

# High Availability (abbr. HA)

## High Level Context

High availability is a characteristic of a system which aims to ensure a
certain level of operational performance for a higher-than-normal period.
Although there are multiple ways to design highly available systems, Memgraph
strives to achieve HA by eliminating single points of failure. In essence,
this implies adding redundancy to the system so that a failure of a component
does not imply the failure of the entire system. To ensure this, HA Memgraph
implements the [Raft consensus algorithm](https://raft.github.io/).

A correct implementation of the algorithm guarantees that the cluster will be
fully functional (available) as long as any strong majority of the servers are
operational and can communicate with each other and with clients. For example,
clusters of three or four machines can tolerate the failure of a single
server, clusters of five and six machines can tolerate the failure of any two
servers, and so on. Therefore, we strongly recommend a setup of an odd-sized
cluster.
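The arithmetic behind that recommendation is simple and can be sanity-checked
with a few lines (this is generic Raft majority math, nothing
Memgraph-specific):

```python
# Majority math behind Raft availability: a cluster of n servers needs a
# strong majority (floor(n/2) + 1) to stay available, so it tolerates
# floor((n-1)/2) failures. Even cluster sizes add a server without adding
# fault tolerance, hence the recommendation for odd-sized clusters.

def majority(n):
    return n // 2 + 1


def tolerated_failures(n):
    return n - majority(n)


if __name__ == "__main__":
    for n in range(3, 8):
        print(f"{n} servers -> majority {majority(n)}, "
              f"tolerates {tolerated_failures(n)} failure(s)")
    # 3 -> 1, 4 -> 1, 5 -> 2, 6 -> 2, 7 -> 3
```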
### Performance Implications

Internally, Raft achieves high availability by keeping a consistent replicated
log on each server within the cluster. Therefore, we must successfully
replicate a transaction on the majority of servers within the cluster before
we actually commit it and report the result back to the client. This operation
represents a significant performance hit when compared with the single node
version of Memgraph.

Luckily, the algorithm can be tweaked in a way which allows read-only
transactions to perform significantly better than those which modify the
database state. That being said, the performance of read-only operations
is still not going to be on par with single node Memgraph.

This section will be updated with exact numbers once we integrate HA with the
new storage.

With the old storage, write throughput was almost five times lower than read
throughput (~30000 reads per second vs ~6000 writes per second).

## User Facing Setup

### How to Set Up an HA Memgraph Cluster?

First, the user needs to install the `memgraph_ha` package on each machine
in their cluster. HA Memgraph should be available as a Debian package,
so its installation on each machine should be as simple as:

```plaintext
dpkg -i /path/to/memgraph_ha_<version>.deb
```

After successful installation of the `memgraph_ha` package, the user should
finish its configuration before attempting to start the cluster.

There are two main things that need to be configured on every node in order
for the cluster to be able to run:

1. The user has to edit the main configuration file and specify a unique node
   ID for each server in the cluster.
2. The user has to create a file that describes all IP addresses of all
   servers that will be used in the cluster.

The `memgraph_ha` binary loads all main configuration parameters from
`/etc/memgraph/memgraph_ha.conf`. On each node of the cluster, the user should
uncomment the `--server-id=0` parameter and change its value to the
`server_id` of that node.

The last step before starting the server is to create a `coordination`
configuration file. That file is already present as an example in
`/etc/memgraph/coordination.json.example` and you have to copy it to
`/etc/memgraph/coordination.json` and edit it according to your cluster
configuration. The file contains coordination info consisting of a list of
`server_id`, `ip_address` and `rpc_port` lists. The assumed contents of the
`coordination.json` file are:

```plaintext
[
  [1, "192.168.0.1", 10000],
  [2, "192.168.0.2", 10000],
  [3, "192.168.0.3", 10000]
]
```

Here, each line corresponds to the coordination of one server. The first entry
is that server's ID, the second is its IP address and the third is the RPC
port it listens to. This port should not be confused with the port used for
client interaction via the Bolt protocol.

The `ip_address` entered for each `server_id` *must* match the exact IP
address that belongs to that server and that will be used to communicate with
other nodes in the cluster. The coordination configuration file *must* be
identical on all nodes in the cluster.
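Since the same `coordination.json` must be identical on every node, a tiny
validation script can catch the most common mistakes early. This is an
illustrative helper only, not something shipped with the `memgraph_ha`
package:

```python
# Hypothetical helper: sanity-check /etc/memgraph/coordination.json
# (a JSON list of [server_id, ip_address, rpc_port] triples).

import ipaddress
import json
import sys


def check_coordination(path="/etc/memgraph/coordination.json"):
    with open(path) as f:
        entries = json.load(f)
    seen_ids = set()
    for entry in entries:
        server_id, ip, rpc_port = entry  # raises if the shape is wrong
        if server_id in seen_ids:
            raise ValueError(f"duplicate server_id {server_id}")
        seen_ids.add(server_id)
        ipaddress.ip_address(ip)  # raises if not a valid IP address
        if not 1 <= rpc_port <= 65535:
            raise ValueError(f"invalid rpc_port {rpc_port}")
    return entries


if __name__ == "__main__":
    path = sys.argv[1] if len(sys.argv) > 1 else "/etc/memgraph/coordination.json"
    check_coordination(path)
    print("coordination file looks OK")
```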
After the user has set the `server_id` on each node in
`/etc/memgraph/memgraph_ha.conf` and provided the same
`/etc/memgraph/coordination.json` file to each node in the cluster, they can
start the Memgraph HA service by issuing the following command on each node in
the cluster:

```plaintext
systemctl start memgraph_ha
```

### How to Configure Raft Parameters?

All Raft configuration parameters can be controlled by modifying
`/etc/memgraph/raft.json`. The assumed contents of the `raft.json` file are:

```plaintext
{
  "election_timeout_min": 750,
  "election_timeout_max": 1000,
  "heartbeat_interval": 100,
  "replication_timeout": 20000,
  "log_size_snapshot_threshold": 50000
}
```

The meaning behind each entry is explained in the following table:

Flag                            | Description
--------------------------------|------------
`election_timeout_min`          | Lower bound for the randomly sampled reelection timer, given in milliseconds
`election_timeout_max`          | Upper bound for the randomly sampled reelection timer, given in milliseconds
`heartbeat_interval`            | Time interval between consecutive heartbeats, given in milliseconds
`replication_timeout`           | Time interval allowed for data replication, given in milliseconds
`log_size_snapshot_threshold`   | Allowed number of entries in the Raft log before its compaction
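To illustrate how the two election timeout bounds interact (a generic Raft
detail, not code taken from Memgraph), each follower re-arms its election
timer with a value drawn uniformly from that range, which keeps split votes
unlikely while heartbeats keep arriving well within the lower bound:

```python
# Illustrative sketch of the randomized election timeout used by Raft
# followers, driven by the values from raft.json above.

import json
import random

RAFT_CONFIG = json.loads("""
{
  "election_timeout_min": 750,
  "election_timeout_max": 1000,
  "heartbeat_interval": 100
}
""")


def next_election_timeout_ms(config=RAFT_CONFIG):
    """Pick a fresh, randomized timeout; it is reset on every heartbeat."""
    return random.uniform(config["election_timeout_min"],
                          config["election_timeout_max"])


if __name__ == "__main__":
    timeout = next_election_timeout_ms()
    # With a 100 ms heartbeat interval, several heartbeats fit comfortably
    # inside even the smallest possible timeout (750 ms), so elections only
    # start when the leader really goes silent.
    print(f"follower would start an election after {timeout:.0f} ms of silence")
```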
### How to Query HA Memgraph via Proxy?

This chapter describes how to query HA Memgraph using our proxy server.
Note that this is not intended to be a long-term solution. Instead, we will
implement a proper Memgraph HA client which is capable of communicating with
the HA cluster. Once our own client is implemented, it will no longer be
possible to query HA Memgraph using other clients (such as the neo4j client).

The Bolt protocol that is exposed by each Memgraph HA node is an extended
version of the standard Bolt protocol. In order to be able to communicate with
the highly available cluster of Memgraph HA nodes, the client must have some
logic implemented in itself so that it can communicate correctly with all
nodes in the cluster. To facilitate a faster start with the HA cluster, we
will build the Memgraph HA proxy binary that communicates with all nodes in
the HA cluster using the extended Bolt protocol and itself exposes a standard
Bolt protocol to the user. All standard Bolt clients (libraries and custom
systems) can communicate with the Memgraph HA proxy without any code
modifications.

The HA proxy should be deployed on each client machine that is used to
communicate with the cluster. It can't be deployed on the Memgraph HA nodes!

When using the Memgraph HA proxy, the communication flow is described in the
following diagram:

```plaintext
Memgraph HA node 1 -----+
                        |
Memgraph HA node 2 -----+ Memgraph HA proxy <---> any standard Bolt client (C, Java, PHP, Python, etc.)
                        |
Memgraph HA node 3 -----+
```

To set up the Memgraph HA proxy, the user should install the
`memgraph_ha_proxy` package.

After its successful installation, the user should enter all endpoints of the
HA Memgraph cluster servers into the configuration before attempting to start
the HA Memgraph proxy server.

The HA Memgraph proxy server loads all of its configuration from
`/etc/memgraph/memgraph_ha_proxy.conf`. Assuming that the cluster is set up
like in the previous examples, the user should uncomment and enter the
following value into the `--endpoints` parameter:

```plaintext
--endpoints=192.168.0.1:7687,192.168.0.2:7687,192.168.0.3:7687
```

Note that the IP addresses used in the example match the individual cluster
nodes' IP addresses, but the ports used are the Bolt server ports exposed by
each node (currently the default value of `7687`).

The user can now start the proxy by using the following command:

```plaintext
systemctl start memgraph_ha_proxy
```

After the proxy has been started, the user can query the HA cluster by
connecting to the HA Memgraph proxy IP address using their favorite Bolt
client.

## Integration with Memgraph

The first thing that should be defined is a single instruction within the
context of Raft (i.e. a single entry in a replicated log).
These instructions should be completely deterministic when applied
to the state machine. We have therefore decided that the appropriate level
of abstraction within Memgraph corresponds to `Delta`s (data structures
which describe a single change to the Memgraph state, used for durability
in the WAL). Moreover, a single instruction in a replicated log will consist
of a batch of `Delta`s which correspond to a single transaction that's about
to be **committed**.

Apart from `Delta`s, there are certain operations within the storage called
`StorageGlobalOperations` which do not conform to the usual transactional
workflow (e.g. creating indices). Since our storage engine implementation
guarantees that at the moment of their execution no other transactions are
active, we can safely replicate them as well. In other words, no additional
logic needs to be implemented because of them.

Therefore, we will introduce a new `RaftDelta` object which can be constructed
both from a storage `Delta` and a `StorageGlobalOperation`. Instead of
appending these to the WAL (as we do in single node), we will start to
replicate them across our cluster. Once we have replicated the corresponding
Raft log entry on a majority of the cluster, we are able to safely commit the
transaction or execute a global operation. If for any reason the replication
fails (leadership change, worker failures, etc.), the transaction will be
aborted.

In the follower mode, we need to be able to apply `RaftDelta`s we got from
the leader when the protocol allows us to do so. In that case, we will use the
same concepts from durability in storage v2, i.e., applying deltas maps
completely to recovery from WAL in storage v2.
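To make the intended flow concrete, here is a very rough, hypothetical sketch
of how a batch of deltas might travel through the Raft layer at commit time.
None of these names come from the Memgraph codebase; the point is only the
ordering: replicate to a majority first, commit (or abort) second.

```python
# Hypothetical sketch: commit path of a transaction under Raft.
# replicate_to_majority() stands in for appending a log entry and waiting
# until a strong majority of the cluster has acknowledged it.

class ReplicationError(Exception):
    pass


def commit_transaction(raft, transaction):
    # 1. Collect the transaction's deltas into one Raft log entry.
    entry = {"tx_id": transaction["id"], "deltas": transaction["deltas"]}
    # 2. Replicate before committing; leadership changes or worker failures
    #    surface here as an exception.
    try:
        raft.replicate_to_majority(entry)
    except ReplicationError:
        transaction["state"] = "aborted"
        raise
    # 3. Only now is it safe to commit locally and report success.
    transaction["state"] = "committed"
    return transaction
```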
## Test and Benchmark Strategy

We have already implemented some integration and stress tests. These are:

1. leader election -- Tests whether leader election works properly.
2. basic test -- Tests basic leader election and log replication.
3. term updates test -- Tests a specific corner case (which used to fail)
   regarding term updates.
4. log compaction test -- Tests whether log compaction works properly.
5. large log entries -- Tests whether we can successfully replicate relatively
   large log entries.
6. index test -- Tests whether index creation works in HA.
7. normal operation stress test -- Long running concurrent stress test under
   normal conditions (no failures).
8. read benchmark -- Measures read throughput in HA.
9. write benchmark -- Measures write throughput in HA.

At the moment, our main goal is to pass the existing tests and have a stable
version on our stress test. We should also implement a stress test which
occasionally introduces different types of failures in our cluster (we did
this kind of testing manually thus far). Passing these tests should convince
us that we have a "stable enough" version which we can start pushing to our
customers.

Additional (proper) testing should probably involve some ideas from
[here](https://jepsen.io/analyses/dgraph-1-0-2).

## Possible Future Changes/Improvements/Extensions

There are two general directions in which we can alter HA Memgraph. The first
direction assumes we are going to stick with the Raft protocol. In that case
there are a few known ways to extend the basic algorithm in order to gain
better performance or achieve extra functionality. In no particular order,
these are:

1. Improving read performance using leader leases [Section 6.4 from the Raft thesis]
2. Introducing cluster membership changes [Chapter 4 from the Raft thesis]
3. Introducing a [learner mode](https://etcd.io/docs/v3.3.12/learning/learner/)
4. Considering different log compaction strategies [Chapter 5 from the Raft thesis]
5. Removing the HA proxy and implementing our own HA Memgraph client

On the other hand, we might decide in the future to base our HA implementation
on a completely different protocol which might even offer different
guarantees. In that case we probably need to do a bit more market research and
weigh the trade-offs of different solutions.
[This](https://www.postgresql.org/docs/9.5/different-replication-solutions.html)
might be a good starting point.

## Reading materials

1. [Raft paper](https://raft.github.io/raft.pdf)
2. [Raft thesis](https://github.com/ongardie/dissertation) (book.pdf)
3. [Raft playground](https://raft.github.io/)
4. [Leader Leases](https://blog.yugabyte.com/low-latency-reads-in-geo-distributed-sql-with-raft-leader-leases/)
5. [Improving Raft ETH](https://pub.tik.ee.ethz.ch/students/2017-FS/SA-2017-80.pdf)
@ -1,117 +0,0 @@

# Kafka Integration

## openCypher clause

One must be able to specify the following when importing data from Kafka:

* Kafka URI
* Kafka topic
* Transform [script](transform.md) URI

The minimum required syntax looks like:

```opencypher
CREATE STREAM stream_name AS LOAD DATA KAFKA 'URI'
    WITH TOPIC 'topic'
    WITH TRANSFORM 'URI';
```

The full openCypher clause for creating a stream is:

```opencypher
CREATE STREAM stream_name AS
    LOAD DATA KAFKA 'URI'
    WITH TOPIC 'topic'
    WITH TRANSFORM 'URI'
    [BATCH_INTERVAL milliseconds]
    [BATCH_SIZE count]
```

The `CREATE STREAM` clause happens in a transaction.

The `WITH TOPIC` parameter specifies the Kafka topic from which we'll stream
data.

The `WITH TRANSFORM` parameter should contain a URI of the transform script.

The `BATCH_INTERVAL` parameter defines the time interval in milliseconds
between two successive stream importing operations.

The `BATCH_SIZE` parameter defines the number of Kafka messages that will be
batched together before import.

If both the `BATCH_INTERVAL` and `BATCH_SIZE` parameters are given, the
condition that is satisfied first will trigger the batched import, as sketched
below.

The default value for `BATCH_INTERVAL` is 100 milliseconds, and the default
value for `BATCH_SIZE` is 10.
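A rough sketch of that trigger logic (illustrative only; the real consumer is
implemented inside Memgraph and the names used here are made up):

```python
# Illustrative sketch of the "whichever comes first" batching rule:
# flush a batch after BATCH_SIZE messages or after BATCH_INTERVAL ms,
# whichever condition is satisfied first.

import time

BATCH_INTERVAL_MS = 100  # default BATCH_INTERVAL
BATCH_SIZE = 10          # default BATCH_SIZE


def batches(poll_message, batch_interval_ms=BATCH_INTERVAL_MS,
            batch_size=BATCH_SIZE):
    """Yield batches of messages; poll_message() returns a message or None."""
    while True:
        batch = []
        deadline = time.monotonic() + batch_interval_ms / 1000.0
        while len(batch) < batch_size and time.monotonic() < deadline:
            message = poll_message()
            if message is not None:
                batch.append(message)
        yield batch  # handed to the transform script, then imported
```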
The `DROP` clause deletes a stream:

```opencypher
DROP STREAM stream_name;
```

The `SHOW` clause enables you to see all configured streams:

```opencypher
SHOW STREAMS;
```

You can also start/stop streams with the `START` and `STOP` clauses:

```opencypher
START STREAM stream_name [LIMIT count BATCHES];
STOP STREAM stream_name;
```

A stream needs to be stopped in order to start it, and it needs to be started
in order to stop it. Starting a started or stopping a stopped stream will not
affect that stream.

There are also convenience clauses to start and stop all streams:

```opencypher
START ALL STREAMS;
STOP ALL STREAMS;
```

Before the actual import, you can also test the stream with the `TEST STREAM`
clause:

```opencypher
TEST STREAM stream_name [LIMIT count BATCHES];
```

When a stream is tested, data extraction and transformation occurs, but no
output is inserted into the graph.

A stream needs to be stopped in order to test it. When the batch limit is
omitted, `TEST STREAM` will run for only one batch by default.

## Data Transform

The transform script is a user-defined script written in Python. The script
should be aware of the data format in the Kafka message.

Each Kafka message is byte length encoded, which means that the first eight
bytes of each message contain the length of the message.

A sample code for a streaming transform script could look like this:

```python
def create_vertex(vertex_id):
    return ("CREATE (:Node {id: $id})", {"id": vertex_id})


def create_edge(from_id, to_id):
    return ("MATCH (n:Node {id: $from_id}), (m:Node {id: $to_id}) "
            "CREATE (n)-[:Edge]->(m)", {"from_id": from_id, "to_id": to_id})


def stream(batch):
    result = []
    for item in batch:
        message = item.decode('utf-8').strip().split()
        if len(message) == 1:
            result.append(create_vertex(message[0]))
        else:
            result.append(create_edge(message[0], message[1]))
    return result
```

The script should output openCypher query strings based on the type of the
records.
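For example, feeding the sample script two hypothetical messages (one vertex,
one edge) would produce query/parameter pairs like the ones below. This is
just a local illustration of the contract, not how Memgraph actually invokes
the script:

```python
# Illustration of the transform contract: each batch item is a bytes object,
# and stream() returns a list of (query, parameters) pairs.
# (Assumes the stream() definition from the sample script above.)

if __name__ == "__main__":
    batch = [b"1", b"1 2"]
    for query, params in stream(batch):
        print(query, params)
    # CREATE (:Node {id: $id}) {'id': '1'}
    # MATCH (n:Node {id: $from_id}), (m:Node {id: $to_id}) CREATE (n)-[:Edge]->(m) {'from_id': '1', 'to_id': '2'}
```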
@ -1,222 +0,0 @@

# Replication

## High Level Context

Replication is a method that ensures that multiple database instances are
storing the same data. To enable replication, there must be at least two
instances of Memgraph in a cluster. Each instance has one of two roles: main
or replica. The main instance is the instance that accepts writes to the
database and replicates its state to the replicas. In a cluster, there can
only be one main. There can be one or more replicas. None of the replicas will
accept write queries, but they will always accept read queries (there is an
exception to this rule, described below). Replicas can also be configured to
be replicas of replicas, not necessarily replicas of the main. Each instance
will always be reachable using the standard supported communication protocols.
The replication will replicate WAL data. All data is transported through a
custom binary protocol that will try to remain backward compatible, so that
replication immediately allows for zero downtime upgrades.

Each replica can be configured to accept replicated data in one of the
following modes:

- synchronous
- asynchronous
- semi-synchronous

### Synchronous Replication

When the data is replicated to a replica synchronously, all of the data of a
currently pending transaction must be sent to the synchronous replica before
the transaction is able to commit its changes.

This mode has a positive implication that all data that is committed to the
main will always be replicated to the synchronous replica. It also has a
negative performance implication because non-responsive replicas could grind
all query execution to a halt.

This mode is good when you absolutely need to be sure that all data is always
consistent between the main and the replica.

### Asynchronous Replication

When the data is replicated to a replica asynchronously, all pending
transactions are immediately committed and their data is replicated to the
asynchronous replica in the background.

This mode has a positive performance implication in that it won't slow down
query execution. It also has a negative implication that the data between the
main and the replica is almost never in a consistent state (while the data is
being changed).

This mode is good when you don't care about consistency and only need an
eventually consistent cluster, but you care about performance.

### Semi-synchronous Replication

When the data is replicated to a replica semi-synchronously, the data is
replicated using both the synchronous and asynchronous methodology. The data
is always replicated synchronously, but, if the replica for any reason doesn't
respond within a preset timeout, the pending transaction is committed and the
data is replicated to the replica asynchronously (see the sketch below).

This mode has a positive implication that all data that is committed is
*mostly* replicated to the semi-synchronous replica. It also has the same
negative performance implication as the synchronous replication mode.

This mode is useful when you want the replication to be synchronous to ensure
that the data within the cluster is consistent, but you don't want the main
to grind to a halt when you have a non-responsive replica.
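A minimal, self-contained sketch of that timeout rule (this is not Memgraph's
replication API; the queue simply stands in for the replica's acknowledgement
channel and all names are invented for illustration):

```python
# Hypothetical sketch of semi-synchronous commit: wait for the replica up to
# a preset timeout; on timeout, commit anyway and let the remaining data be
# shipped asynchronously in the background.

import queue
import threading

REPLICA_TIMEOUT_SECONDS = 0.5


def commit_semi_sync(send_to_replica, ack_queue,
                     timeout=REPLICA_TIMEOUT_SECONDS):
    """send_to_replica() ships the pending transaction data; the replica's
    acknowledgement (if any) arrives on ack_queue."""
    send_to_replica()
    try:
        ack_queue.get(timeout=timeout)       # synchronous path
        return "committed, replica in sync"
    except queue.Empty:                      # timed out: fall back to async
        return "committed, replica will catch up asynchronously"


if __name__ == "__main__":
    acks = queue.Queue()
    # Simulate a replica that responds after 0.1 s (within the timeout).
    threading.Timer(0.1, lambda: acks.put("ack")).start()
    print(commit_semi_sync(lambda: None, acks))
```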
### Addition of a New Replica

Each replica, when added to the cluster (in any mode), will first start out as
an asynchronous replica. That will allow replicas that have fallen behind to
first catch up to the current state of the database. When the replica is in a
state where it isn't lagging behind the main, it will then be promoted (in a
brief stop-the-world operation) to a semi-synchronous or synchronous replica.
Replicas that are added as asynchronous replicas will remain asynchronous.

## User Facing Setup

### How to Set Up a Memgraph Cluster with Replication?

Replication configuration is done primarily through openCypher commands. This
allows the cluster to be dynamically rearranged (new leader election, addition
of a new replica, etc.).

Each Memgraph instance when first started will be a main. You have to change
the role of all replica nodes using the following openCypher query before you
can enable replication on the main:

```plaintext
SET REPLICATION ROLE TO (MAIN|REPLICA) WITH PORT <port_number>;
```

Note that the "WITH PORT <port_number>" part of the query sets the replication
port, but it applies only to the replica. In other words, if you try to set
the replication port as the main, a semantic exception will be thrown.

After you have set your replica instance to the correct operating role, you
can enable replication in the main instance by issuing the following
openCypher command:

```plaintext
REGISTER REPLICA name (SYNC|ASYNC) [WITH TIMEOUT 0.5] TO <socket_address>;
```

The socket address must be a string of the following form:

```plaintext
"IP_ADDRESS:PORT_NUMBER"
```

where IP_ADDRESS is a valid IP address and PORT_NUMBER is a valid port number,
both given in decimal notation.
Note that in this case they must be separated by a single colon.
Alternatively, one can give the socket address as:

```plaintext
"IP_ADDRESS"
```

where IP_ADDRESS must be a valid IP address, and the port number will be
assumed to be the default one (we specify it to be 10000).
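Parsing that address format is straightforward; a small illustrative helper
(not part of Memgraph) that applies the default port of 10000 when the port
is omitted could look like this:

```python
# Illustrative parser for the socket address format used by REGISTER REPLICA:
# either "IP_ADDRESS:PORT_NUMBER" or just "IP_ADDRESS" (default port 10000).

import ipaddress

DEFAULT_REPLICATION_PORT = 10000


def parse_socket_address(address, default_port=DEFAULT_REPLICATION_PORT):
    if ":" in address:
        ip, port_str = address.rsplit(":", 1)
        port = int(port_str)
    else:
        ip, port = address, default_port
    ipaddress.ip_address(ip)  # raises ValueError for an invalid IP
    if not 1 <= port <= 65535:
        raise ValueError(f"invalid port: {port}")
    return ip, port


if __name__ == "__main__":
    print(parse_socket_address("192.168.0.2:10050"))  # ('192.168.0.2', 10050)
    print(parse_socket_address("192.168.0.3"))        # ('192.168.0.3', 10000)
```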
Each Memgraph instance will remember what the configuration was set to and
will automatically resume with its role when restarted.

### How to See the Current Replication Status?

To see the replication ROLE of the current Memgraph instance, you can issue
the following query:

```plaintext
SHOW REPLICATION ROLE;
```

To see the replicas of the current Memgraph instance, you can issue the
following query:

```plaintext
SHOW REPLICAS;
```

To delete a replica, issue the following query:

```plaintext
DROP REPLICA 'name';
```

### How to Promote a New Main?

When you have an already set-up cluster, to promote a new main, just set the
replica that you want to become the main to the main role:

```plaintext
SET REPLICATION ROLE TO MAIN; # on desired replica
```

After the command is issued, if the original main is still alive, it won't be
able to replicate its data to the replica (the new main) anymore and will
enter an error state. You must ensure that at any given point in time there
aren't two mains in the cluster.

## Limitations and Potential Features

Currently, we do not support chained replicas, i.e. a replica can't have its
own replica. When this feature becomes available, the user will be able to
configure scenarios such as the following one:

```plaintext
main -[asynchronous]-> replica 1 -[semi-synchronous]-> replica 2
```

To configure the above scenario, the user will be able to issue the following
commands:

```plaintext
SET REPLICATION ROLE TO REPLICA WITH PORT <port1>; # on replica 1
SET REPLICATION ROLE TO REPLICA WITH PORT <port2>; # on replica 2

REGISTER REPLICA replica1 ASYNC TO "replica1_ip_address:port1"; # on main
REGISTER REPLICA replica2 SYNC WITH TIMEOUT 0.5
    TO "replica2_ip_address:port2"; # on replica 1
```

In addition, we do not yet support advanced recovery mechanisms. For example,
if a main crashes, a suitable replica will take its place as the new main. If
the crashed main goes back online, it will not be able to reclaim its previous
role, but will be forced to be a replica of the new main.
In the upcoming releases, we might be adding more advanced recovery
mechanisms. However, users are able to set up their own recovery policies
using the basic recovery mechanisms we currently provide, which can cover a
wide range of real-life scenarios.

Also, we do not yet support the replication of authentication configurations,
rendering access control replication unavailable.

The query and authentication modules, as well as audit logs, are not
replicated.

## Integration with Memgraph

WAL `Delta`s are replicated between the replication main and replica. With
`Delta`s, all `StorageGlobalOperation`s are also replicated. Replication is
essentially the same as appending to the WAL.

Synchronous replication will occur in `Commit` and in each
`StorageGlobalOperation` handler. The storage itself guarantees that `Commit`
will be called from a single thread and that no `StorageGlobalOperation` will
be executed during an active transaction. Asynchronous replication will load
its data from already written WAL files and transmit the data to the replica.
All data will be replicated using our RPC protocol (SLK encoded).

For each replica, the replication main (or replica) will keep track of the
replica's state. That way, it will know which operations must be transmitted
to the replica and which operations can be skipped. When a replica is very
stale, a snapshot will be transmitted to it so that it can quickly synchronize
with the current state. All following operations will transmit WAL deltas.
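A rough, hypothetical sketch of that catch-up decision (the names are invented
for illustration): the main compares the replica's last confirmed position
with the range still covered by its local WAL files and either streams the
missing deltas or falls back to sending a snapshot first.

```python
# Hypothetical sketch of the catch-up decision for a (possibly stale) replica.

def plan_catch_up(replica_last_committed, wal_oldest_kept, wal_newest):
    """All arguments are logical positions (e.g. transaction timestamps).

    Returns which mechanism should bring the replica up to date."""
    if replica_last_committed >= wal_newest:
        return "up to date: stream new deltas as they are produced"
    if replica_last_committed >= wal_oldest_kept:
        return "replay WAL deltas newer than the replica's position"
    # The replica is so stale that the needed WAL files are already gone.
    return "transmit a snapshot first, then continue with WAL deltas"


if __name__ == "__main__":
    print(plan_catch_up(replica_last_committed=120,
                        wal_oldest_kept=100, wal_newest=150))
    print(plan_catch_up(replica_last_committed=40,
                        wal_oldest_kept=100, wal_newest=150))
```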
## Reading materials

1. [PostgreSQL comparison of different solutions](https://www.postgresql.org/docs/12/different-replication-solutions.html)
2. [PostgreSQL docs](https://www.postgresql.org/docs/12/runtime-config-replication.html)
3. [MySQL reference manual](https://dev.mysql.com/doc/refman/8.0/en/replication.html)
4. [MySQL docs](https://dev.mysql.com/doc/refman/8.0/en/replication-setup-slaves.html)
5. [MySQL master switch](https://dev.mysql.com/doc/refman/8.0/en/replication-solutions-switch.html)
@ -1,22 +0,0 @@

# Memgraph LaTeX Beamer Template

This folder contains all of the needed files for creating a presentation with
Memgraph styling. You should use this style for any public presentations.

Feel free to improve it according to style guidelines and raise issues if you
find any.

## Usage

Copy the contents of this folder (excluding this README file) to where you
want to write your own presentation. After copying, you can start editing the
`template.tex` with your content.

To compile the presentation to a PDF, run `latexmk -pdf -xelatex`. Some
directives require XeLaTeX, so you need to pass `-xelatex` as the final option
of `latexmk`. You may also need to install some packages if the compilation
complains about missing packages.

To clean up the generated files, use `latexmk -C`. This will also delete the
generated PDF. If you wish to remove generated files except the PDF, use
`latexmk -c`.
@ -1,82 +0,0 @@

\NeedsTeXFormat{LaTeX2e}
\ProvidesClass{mg-beamer}[2018/03/26 Memgraph Beamer]

\DeclareOption*{\PassOptionsToClass{\CurrentOption}{beamer}}

\ProcessOptions \relax

\LoadClass{beamer}

\usetheme{Pittsburgh}

% Memgraph color palette
\definecolor{mg-purple}{HTML}{720096}
\definecolor{mg-red}{HTML}{DD2222}
\definecolor{mg-orange}{HTML}{FB6E00}
\definecolor{mg-yellow}{HTML}{FFC500}
\definecolor{mg-gray}{HTML}{857F87}
\definecolor{mg-black}{HTML}{231F20}

\RequirePackage{fontspec}
% Title fonts
\setbeamerfont{frametitle}{family={\fontspec[Path = ./mg-style/fonts/]{EncodeSansSemiCondensed-Regular.ttf}}}
\setbeamerfont{title}{family={\fontspec[Path = ./mg-style/fonts/]{EncodeSansSemiCondensed-Regular.ttf}}}
% Body font
\RequirePackage[sfdefault,light]{roboto}
% Roboto is pretty bad for monospace font. We will find a replacement.
% \setmonofont{RobotoMono-Regular.ttf}[Path = ./mg-style/fonts/]

% Title slide styles
% \setbeamerfont{frametitle}{size=\huge}
% \setbeamerfont{title}{size=\huge}
% \setbeamerfont{date}{size=\tiny}

% Other typography styles
\setbeamertemplate{frametitle}[default][center]
\setbeamercolor{frametitle}{fg=mg-black}
\setbeamercolor{title}{fg=mg-black}
\setbeamercolor{section in toc}{fg=mg-black}
\setbeamercolor{local structure}{fg=mg-orange}
\setbeamercolor{alert text}{fg=mg-red}

% Commands
\newcommand{\mgalert}[1]{{\usebeamercolor[fg]{alert text}#1}}
\newcommand{\titleframe}{\frame[plain]{\titlepage}}
\newcommand{\mgtexttt}[1]{{\textcolor{mg-gray}{\texttt{#1}}}}

% Title slide background
\RequirePackage{tikz,calc}
% Use title-slide-169 if aspect ratio is 16:9
\pgfdeclareimage[interpolate=true,width=\paperwidth,height=\paperheight]{logo}{mg-style/title-slide-169}
\setbeamertemplate{background}{
  \begin{tikzpicture}
    \useasboundingbox (0,0) rectangle (\the\paperwidth,\the\paperheight);
    \pgftext[at=\pgfpoint{0}{0},left,base]{\pgfuseimage{logo}};
    \ifnum\thepage>1\relax
      \useasboundingbox (0,0) rectangle (\the\paperwidth,\the\paperheight);
      \fill[white, opacity=1](0,\the\paperheight)--(\the\paperwidth,\the\paperheight)--(\the\paperwidth,0)--(0,0)--(0,\the\paperheight);
    \fi
  \end{tikzpicture}
}

% Footline content
\setbeamertemplate{navigation symbols}{}%remove navigation symbols
\setbeamertemplate{footline}{
  \begin{beamercolorbox}[ht=1.6cm,wd=\paperwidth]{footlinecolor}
    \vspace{0.1cm}
    \hfill
    \begin{minipage}[c]{3cm}
      \begin{center}
        \includegraphics[height=0.8cm]{mg-style/memgraph-logo.png}
      \end{center}
    \end{minipage}
    \begin{minipage}[c]{7cm}
      \insertshorttitle\ --- \insertsection
    \end{minipage}
    \begin{minipage}[c]{2cm}
      \tiny{\insertframenumber{} of \inserttotalframenumber}
    \end{minipage}
  \end{beamercolorbox}
}

\endinput
Binary file not shown.
Binary file not shown.
Before Width: | Height: | Size: 26 KiB |
Binary file not shown.
Before Width: | Height: | Size: 185 KiB |
Binary file not shown.
Before Width: | Height: | Size: 189 KiB |
@ -1,40 +0,0 @@

% Set 16:9 aspect ratio
\documentclass[aspectratio=169]{mg-beamer}
% Default directive sets the regular 4:3 aspect ratio
% \documentclass{mg-beamer}
\mode<presentation>

% requires xelatex
\usepackage{ccicons}

\title{Insert Presentation Title}
\titlegraphic{\ccbyncnd}
\author{Insert Name}

% Institute doesn't look good in our current styling class.
% \institute[Memgraph Ltd.]{\pgfimage[height=1.5cm]{mg-logo.png}}

% Date is autogenerated on compilation, so no need to set it explicitly,
% unless you wish to override it with a different date.
% \date{March 23, 2018}

\begin{document}

\titleframe

\section{Intro}

\begin{frame}{Contents}
  \tableofcontents
\end{frame}

\begin{frame}{Memgraph Markup Test}
  \begin{itemize}
    \item \mgtexttt{Prefer \\mgtexttt for monospace}
    \item Replace this slide with your own
    \item Add even more slides in different sections
    \item Make sure you spellcheck your presentation
  \end{itemize}
\end{frame}

\end{document}
@ -1,44 +1,4 @@

# Memgraph Build and Run Environments

## Toolchain Installation Procedure

1) Download the toolchain for your operating system from one of the following
   links (current active toolchain is `toolchain-v2`):

   * [CentOS 7](https://s3-eu-west-1.amazonaws.com/deps.memgraph.io/toolchain-v2/toolchain-v2-binaries-centos-7.tar.gz)
   * [CentOS 8](https://s3-eu-west-1.amazonaws.com/deps.memgraph.io/toolchain-v2/toolchain-v2-binaries-centos-8.tar.gz)
   * [Debian 9](https://s3-eu-west-1.amazonaws.com/deps.memgraph.io/toolchain-v2/toolchain-v2-binaries-debian-9.tar.gz)
   * [Debian 10](https://s3-eu-west-1.amazonaws.com/deps.memgraph.io/toolchain-v2/toolchain-v2-binaries-debian-10.tar.gz)
   * [Ubuntu 18.04](https://s3-eu-west-1.amazonaws.com/deps.memgraph.io/toolchain-v2/toolchain-v2-binaries-ubuntu-18.04.tar.gz)
   * [Ubuntu 20.04](https://s3-eu-west-1.amazonaws.com/deps.memgraph.io/toolchain-v2/toolchain-v2-binaries-ubuntu-20.04.tar.gz)

2) Extract the toolchain with the following command:

```bash
tar xzvf {{toolchain-archive}}.tar.gz -C /opt
```

3) Check and install required toolchain runtime dependencies by executing
   (e.g., on **Debian 10**):

```bash
./environment/os/debian-10.sh check TOOLCHAIN_RUN_DEPS
./environment/os/debian-10.sh install TOOLCHAIN_RUN_DEPS
```

4) Activate the toolchain:

```bash
source /opt/toolchain-v2/activate
```

## Toolchain Upgrade Procedure

1) Build a new toolchain for each supported OS (latest versions).
2) If the new toolchain doesn't compile on some supported OS, the last
   compilable toolchain has to be used instead. In other words, the project
   has to compile on the oldest active toolchain as well. Suppose some
   changes/improvements were added when migrating to the latest toolchain; in
   that case, the maintainer has to ensure that the project still compiles on
   previous toolchains (everything from `init` script to the actual code has
   to work on all supported operating systems).

Please continue
[here](https://www.notion.so/memgraph/Tools-05e0baafb78a49b386e0063b4833d23d).