Summary:
This only imports the `.py` files into the global interpreter. These
modules are not exposed to query execution. A later diff will add that
support.
Reviewers: mferencevic, ipaljak
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2660
Summary:
The antlr openCypher parser generation is moved to query. Also, header files
have been added to the list of generated files so that if any header file is
deleted CMake will know that it has to regenerate it.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2652
Summary:
The property store stores a map of `PropertyId` to `PropertyValue` mappings. It
compresses all of the values in order to use as little memory as possible.
Reviewers: teon.banek, ipaljak
Reviewed By: teon.banek
Subscribers: buda, pullbot
Differential Revision: https://phabricator.memgraph.io/D2604
Summary:
Most of the examples require properties to be enabled on edges so this change
specifies that flag explicitly. Also, errors are handled better by replacing
`bolt_client` with `mg_client`. Missing indices are created to improve import
speed.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2594
Summary:
The configuration file that was used for the Debian/CentOS package was manually
written. This diff adds a configuration file generator that extracts all of the
necessary information about the flags directly from the built binary and uses
that information to generate the configuration file for the packages.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2589
Summary:
The function will also be used during AST construction in order to
support `CALL ... YIELD *` syntax.
Reviewers: mferencevic, ipaljak
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2586
Summary:
The garbage collector had a race condition when it would delete deltas that
were in the middle of an object's delta chain. In the process of deleting
(unlinking) the delta, the garbage collector previously wouldn't acquire any
locks. That operation was then racing with the standard MVCC
`CreateAndLinkDelta` function that adds a new delta into the chain.
Fortunately, `CreateAndLinkDelta` always does its modifications while holding a
lock to the owner of the chain (either a vertex or an edge) so this change just
adds the lock acquiring to the garbage collector.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2582
Summary:
The functions that previously had locks in them are always called while the
vertex lock is already being held. Also, the lock guards were implemented
incorrectly.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2580
Summary:
Edge filters (edge type and destination vertex) are now handled natively in the
storage API. The API is implemented to be the fastest possible when using the
filters with the assumption that the number of edge types will be small.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2576
Summary:
This diff renames `__reload__` procedure to be `mg.reload` accepting a
module name. The main CallCustomProcedure function is now split into
multiple parts, so that there's more control over finding a procedure,
type checking its arguments and finally checking the returned result
set.
Depends on D2572
Reviewers: mferencevic, ipaljak
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2573
Summary:
The query execution now has a timeout for each Cypher query it executes. The
timeout is implemented using TSC and will work only when TSC is available (same
as PROFILE). TSC is used to mitigate the performance impact of reading the
current time constantly.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2562
Summary:
Now when iterating over a label+property index the index verifies that the
bounds meet the criteria imposed by openCypher.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2552
Summary:
This simplifies the C API and reduces total allocations done when
constructing a type.
Reviewers: mferencevic, ipaljak
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2550
Summary:
Previously, when accessing the labels/properties/edges of a vertex/edge that
was just created the NEW view would correctly display the change, but the OLD
view would be invalid and would crash the database. With this change the OLD
view of a freshly created vertex/edge won't cause a crash, but will instead
report an error.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2549
Summary:
The type system is modelled after "CIP2015-09-16"
https://github.com/opencypher/openCypher/blob/master/cip/1.accepted/CIP2015-09-16-public-type-system-type-annotation.adoc
This is needed for registering procedures and their signatures. The
users will be able to specify what a custom procedure accepts and
returns. All of this needs to be available for inspection during
runtime. Therefore, this diff implements printing types as a user
presentable string. In the future, we will probably want to add type
checking through these types, because openCypher requires type checking
on values passed in and returned from custom procedures.
Reviewers: mferencevic, ipaljak, dsantl
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2544
Summary: Also test Explain and Profile through Intepreter.
Reviewers: mferencevic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2546
Summary:
The dumper is now a function and doesn't have to worry about any state. The
function streams the Cypher queries directly to the client. This diff also
makes the dumper work with storage v2.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2545
Summary:
The atomic memory order should be `acquire` for `load` operations, `release`
for `store` operations and `acq_rel` for any RMW (read-modify-write) operations
(like `fetch_add`).
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2540
Summary:
- Add the `AnyStream` wrapper
- Remove the `Results` struct and store a function (handler) instead
Reviewers: teon.banek, mferencevic
Reviewed By: teon.banek, mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2497
Summary:
All mgp_* symbols are exported from Memgraph executable, no other
symbols should be visible.
The primary C API header, mg_procedure.h, is now part of the
installation. Also, added a shippable query module example.
Directory `query_modules` is meant to contain sources of modules we
write and ship as part of the installation. Currently, there's only an
example module, but there may be potentially more. Some modules could
only be installed as part of the enterprise release.
For Memgraph to load custom procedures, it needs to be started with a
flag pointing to a directory with compiled shared libraries implementing
those procedures.
Reviewers: mferencevic, ipaljak, llugovic, dsantl, buda
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2538
Summary:
This change increases the throughput of the storage v2 durability 20x. With
this change, the storage v2 durability is 3x faster than the storage v1
durability in both recovery and snapshotting (before the change v2 durability
is slower than v1 durability).
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2529
Summary:
Store accumulated results as `communication::bolt::Value`s instead of
`TypedValue`s.
Add additional overloads for `Result` and `Summary` which accept `TypedValue`s
but internally perform conversions.
Reviewers: teon.banek, mferencevic
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2514
Summary:
Depends on D2471
- Add pointer to storage to `InterpreterContext`
- Rename `operator()` to `Prepare`
- Use `Interpret` instead of `operator()` (`Interpret` will be removed soon)
- Remove the `in_explicit_transaction` parameter
- Remove the memory resource parameter from `Interpret`
- Remove the storage accessor parameter from `Interpret`
- Fix up tests (remove the `Interpreter` from `database_transaction_timeout`)
Reviewers: teon.banek, mferencevic
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2482
Summary: Make `InterpreterContext` a top level instead of a nested struct
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2512
Summary: For example, the aggregate element produced for `COUNT(*)` has its `value` set to `NULL`.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2463
Summary:
The registry is now in the `query::procedure` namespace, this makes the
naming more consistent. I.e. we are dealing with custom procedure which
are contained in modules. This naming convention is similar to Python
source code where each file represents a module and each module provides
multiple functions (or procedures in our case). At the moment we only
support exactly 1 procedure per module, but the openCypher syntax allows
for more.
Reviewers: mferencevic, ipaljak, dsantl
Differential Revision: https://phabricator.memgraph.io/D2454
Summary:
This diff implements a mechanism for registering plugins which provide
custom procedures for openCypher. Although the `Plugin` struct already
stores some function pointers, these are not set in stone w.r.t. to
requirements and signatures.
For example, in the future, we may want to allow a single plugin to
register multiple custom procedures instead of just one.
Reviewers: ipaljak, dsantl, mferencevic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2386
Summary:
This diff adds support for an auth module. The module is used to provide
authentication and authorization (only user to role mappings). The module can
be written in any language and uses a simple protocol to communicate with
Memgraph.
Reviewers: teon.banek, buda
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2359
Summary:
Switching to Storage V2 API will require passing storage::View when
serializing VertexAccessor and EdgeAccessor, so this is just the first
step in adapting the code.
Reviewers: mferencevic, ipaljak
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2352
Summary:
Most instances of `@throw std::bad_alloc` are left unexplained as these
functions perform general heap allocations are it's obvious from the
function name that it will do so. Basically anything with `Create`, `Make` or
`Build` implies allocations. Additionally, which parts exactly perform
allocations are an implementation detail. Functions which do unexpected
heap allocations have the reason stated in the documentation, these
functions typically have exactly one spot which could raise such an
exception.
Some functions are marked as `noexcept`, these are usually "special
functions" such as constructors and operators. This could potentially
improve performance because STL may use API overloads that work faster
with `noexcept` stuff. Remaining non-throwing functions aren't marked as
`noexcept` as that wasn't our practice nor is common in our codebase. On
the other hand, if we continue enforcing the documentation of thrown
exceptions, perhaps we should start using `noexcept`.
Reviewers: mferencevic, ipaljak
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2350
Summary:
This makes Gid the same as the one in storage/v2. Before they can be
merge into one implementation, we probably want to have a similar
transition for remaining ID types.
Depends on D2346
Reviewers: mferencevic, ipaljak
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2347
Summary:
It never made sense that a global ID is its own namespace in the storage
directory tree.
Reviewers: mferencevic, ipaljak
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2346
Summary:
This effectively replaces the old PropertyValue implementation from the
one in storage/v2
Depends on D2333
Reviewers: mferencevic, ipaljak
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2335
Summary:
This is a different scheme for setting up a bookkeeping object while
still supporting arbitrary allocation alignment requests. The previous
scheme was simpler as it always allocated a power of 2 bytes, but the
trade-off was increased memory usage. This should waste less memory.
Reviewers: mtomic, mferencevic, ipaljak
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2321
Summary:
There's a possible race condition where we add a deleted vertex into
index and garbage collection removes it from main storage before indices are
cleaned-up.
Reviewers: mferencevic, teon.banek
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2314
Summary:
Utils now contain a BasicResult implementation which supports different
types of error. This will make it useful for other parts of the code as
well as during the transition from old to new storage.
Reviewers: mtomic, msantl, mferencevic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2291
Summary:
The documentation includes `std` exceptions like `std::bad_alloc` or
`std::system_error`, for which there's probably nothing we can do. This
may seem unnecessary, but it will be really helpful when writing the C
API for interfacing with custom modules and plugins, as well as when
switching to storage v2 API.
In general, we should start updating the documentation of functions
which may throw exceptions. This ought to be enforced in code review, so
that the implementation and documentation are kept in sync.
Reviewers: mferencevic, mtomic, msantl
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2288
Summary:
This ought to simplify the code which needs to work with any kind of
vertex iteration, be it through an index store or regular.
Reviewers: mtomic, mferencevic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2258
Summary: this will make GC for indices easier
Reviewers: teon.banek, mferencevic
Reviewed By: teon.banek, mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2223
Summary:
`std::unordered_map` is 56 bytes in size, `std::map` is 48 bytes in size.
Also, `std::map` doesn't require the key type to be hashable.
Reviewers: mtomic, teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2218
Summary:
For proper client interaction, we need to expose the (term_id, log_index)
pair for the transaction that's about to be replicated and we need to be able
to retrieve the status of a transaction defined by that pair. Transaction
status can be one of the following:
1) REPLICATED (self-explanatory)
2) WAITING (waiting for replication)
3) ABORTED (self-explanatory)
4) INVALID (received request with either invalid term_id or invalid log_index)
Reviewers: mferencevic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2201
Summary:
The first 2 tuple elements are redundant as they are available through
EdgeAccessor and they needlessly complicate the usage of the API.
Reviewers: mtomic, mferencevic
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2200
Summary:
The newly added methods are modeled after the ones being added in C++20 to
`std::pmr::polymorphic_allocator`.
Reviewers: mtomic, mferencevic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2197
Summary:
`lock_guard` is holding vertex lock while we're deleting the vertex,
and then it might try to unlock it in its destructor and access freed memory.
Reviewers: teon.banek, mferencevic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2192
Summary:
This is a preparation step in case we want to have a custom allocator in
SkipList. For example, pool based allocator for SkipListNode.
Introduction of MemoryResource and removal of `calloc` has reduced the
performance a bit according to micro benchmarks. This performance hit is
not visible on benchmarks which do more concurrent operations.
Reviewers: mferencevic, mtomic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2140
Summary:
This change implements full edges support in storage v2. Edges can be created
and deleted. Support for detach-deleting vertices is added and regular vertex
deletion verifies existance of edges.
Reviewers: mtomic, teon.banek
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2180
Summary:
Instead of calling the wanted method, this diff takes copies of both `is_replicated(x)` and `is_active(x)` results so that they can't change during the method execution.
The problem with the previous implementation was that the information about a
log could change during two consecutive queries as ilustrated in the example
below.
```
Thread 1 Thread2
rlog->set_active(1);
if (rlog->is_replicated(1)) return true;
rlog->set_replicated(1);
if (rlog->is_active(1)) return false;
throw InvalidLogReplicationLookup();
```
The snippet above would throw as we don't have any information about the log
`1`, because the `set_replicated` call would "shadow" the active bit.
Reviewers: ipaljak
Reviewed By: ipaljak
Subscribers: pullbot, mferencevic
Differential Revision: https://phabricator.memgraph.io/D2165
Summary:
We forgot to update `modified_vertices` in the case when the vertex
has an empty version chain. It didn't manifest before because it was impossible
for a vertex to have an empty version chain without garbage collection.
Reviewers: mferencevic, teon.banek
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2154
Summary:
We forgot to add the newly created vertex into `modified_vertices`
list.
Reviewers: teon.banek, mferencevic
Reviewed By: mferencevic
Differential Revision: https://phabricator.memgraph.io/D2153
Summary:
Initial implementation of new storage engine. It implements snapshot isolation
for transactions. All changes in the database are stored as deltas instead of
making full copies. Currently, the storage supports full transaction
functionality (commit, abort, command advancement). Also, support has been
implemented only for vertices that have only labels.
Reviewers: teon.banek, mtomic
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2138
Summary:
Micro benchmarks show some minor variations compared to the previous
commit. Smaller cases are a bit worse while larger data cases are a bit
better.
Reviewers: mtomic, mferencevic
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2136
Summary:
Micro benchmarks show improvements in performance of MapLiteral from 5%
to 40% depending on the size of the input. On the other hand, a sequence
of AdditionOperators behaves the same with both allocation schemes.
Reviewers: mtomic, mferencevic, msantl
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2132
Summary:
The global variable may hide the fact that it uses the default
utils::NewDeleteResource() for allocations.
Reviewers: mtomic, llugovic, mferencevic
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2121
Summary:
Split up `process-file` so that looking at the generated code for an LCP form is
easier from the REPL.
`process-lcp`, `generate-hpp` and `generate-cpp` now perform the generation of
C++ code, but take a list of "C++ elements" (the results of LCP forms) as input
and write their output to streams. They do no reading/evaluating of LCP forms of
their own.
`read-lcp` and `read-lcp-file` are used to read and evaluate a stream of LCP
forms. The latter is a specialized version for file streams which also reports
the position of the form within the file when an error happens.
`process-lcp-string` and `process-lcp-file` are convenient wrappers around the
main functionality that take a string (file) and output to strings (files).
Using `read-lcp` and `read-lcp-file` they process LCP forms and pass them off to
`process-lcp` for code generation.
Reviewers: mtomic, teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2097
Summary:
`dump()` awesome Memgraph function is removed.
`mg_dump` executable now uses "dump database" command instead of "return dump()"
and dumps database to standard output instead of a file.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2122
Summary:
Stream queries to the output table.
For effective output streaming, a simple operator, `OutputTableStream` is implemented which fetches and
produces a single row on each Pull.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2099
Summary:
This is unfortunately needed in the C++17 standard, so that the
allocator is correctly propagated to elements of pair which respect the
"Uses Allocator" protocol. C++20 standard resolves this issue, but we
still have a long way before it is released and implemented by the
compiler and standard library vendors.
Reviewers: mtomic, llugovic, mferencevic
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2107
Summary: Forward declare the needed struct and include in the correct file.
Reviewers: ipaljak
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2110
Summary:
This diff only introduces a MemoryResource member to TypedValue and correctly
propagates through various constructors and assignments. At the moment,
MemoryResource is not used to actually allocate anything.
Reviewers: mtomic, llugovic, mferencevic, msantl
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2085
Summary:
HA should now support constraints in the same way the SM version does.
I only tested this thing manually, but I plan to add a new integration test for
this also.
Reviewers: ipaljak, vkasljevic, mferencevic
Reviewed By: ipaljak, mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2083
Summary:
Implemented a simple dump executable which runs `RETURN dump()` on the database using Bolt protocol.
The `dump()` function returns a list of openCypher queries needed for database reconstruction (note: in the future we should use more appropriate method that supports streaming of data).
Reviewers: teon.banek, mferencevic
Reviewed By: teon.banek
Subscribers: pullbot, mferencevic
Differential Revision: https://phabricator.memgraph.io/D2067
Summary:
Fix condition variable notifications
Fix vote requested invalid memory access (size off by one)
Fix blocking wait in RaftPeer while candidate
Don't copy large log entries when only the term is needed
Reviewers: ipaljak, msantl
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2088
Summary:
Preparing unique constraint code to be implemented in HA. To do so, I'm
moving everything to `stograge/common/` folder. I also added a new header,
`gid.hpp` which does a `ifdef` include of the correct `gid.hpp` based on the
product.
Reviewers: ipaljak, mferencevic, vkasljevic, teon.banek
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2079
Summary:
Instead of asking rocksdb for snapshot metadata, keep it in memory for
faster access.
Reviewers: ipaljak
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2075
Summary:
During it's leadership, one peer can receive RPC messages from other peers that his reign is over.
The problem is when this happens during a transaction commit.
This is handled in the following way.
If we're the current leader and we want to commit a transaction, we need to make sure the Raft Log is replicated before we can tell the client that the transaction is committed.
During that wait, we can only notice that the replication takes too long, and we report that with `LOG(WARNING)` messages.
If we change the Raft mode during the wait, our Raft implementation will internally commit this transaction, but won't be able to acquire the Raft lock because the `db.Reset` has been called.
This is why there is an manual lock acquire. If we pick up that the `db.Reset` has been called, we throw an `UnexpectedLeaderChangeException` exception to the client.
Another thing with long running transactions, if someone decides to kill a `memgraph_ha` instance during the commit, the transaction will have `abort` hint set. This will cause the `src/query/operator.cpp` to throw a `HintedAbortError`. We need to catch this during the shutdown, because the `memgraph_ha` isn't dead from the user perspective, and the transaction wasn't aborted because it took too long, but we can differentiate between those two.
Reviewers: mferencevic, ipaljak
Reviewed By: mferencevic, ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1956
Summary:
Micro benchmarks show no change compared to global new & delete. This is
to be expected, because Unwind relies only on `std::vector` which ought
to reserve the memory in reasonable chunks.
Reviewers: mtomic, llugovic
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2064
Summary:
Micro benchmarks show an improvement to performance of about 10%
compared to global new & delete.
Reviewers: mtomic, llugovic
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2061
Summary: We should now exchange around O(log(n)) messages to get a follower that is n transactions behind back in sync.
Reviewers: msantl, mferencevic
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2051
Summary:
I was a bit stupid and thought I can use `std::swap`, while `std::swap`
is implemented in terms of move itself.
Reviewers: mtomic, mferencevic
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2054
Summary:
A `valgrind` run of a modified client that connects multiple times to the
database now shows no memory leaks, previously the context created with
`SSL_CTX_new` was leaked every time.
Reviewers: teon.banek, mtomic
Reviewed By: teon.banek, mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2052
Summary:
Micro benchmarks show that MonotonicBufferResource improves performance
by a factor of 1.5.
Reviewers: mtomic, mferencevic, llugovic
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2048
Summary:
Benchmarks show minor improvements. Perhaps it makes sense at some later date
to use another allocator for things lasting only in a single `Pull`.
Reviewers: mferencevic, mtomic, llugovic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2018
Summary:
Unfortunately, the written micro benchmark only reports minor
improvements compared to default allocator. The results are in some
cases even a tiny bit worse.
Reviewers: mtomic, mferencevic, llugovic
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2039
Summary:
This change introduces dumping of indices keys. During the dump process,
an internal label is assigned to each vertex and index on vertex's
internal property id is created for faster matching during edge creation.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: msantl, pullbot
Differential Revision: https://phabricator.memgraph.io/D2046
Summary:
Prior to this change, a huge query was returned by DumpGenerator that
dumped the entire graph. This change split the single query to multiple
queries, each dumping a single vertex/edge. For easier vertex matching
when dumping edge, an internal property id is assigned to each vertex and
removed after the whole graph is dumped.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2038
Summary:
# Summary
## Concepts and terminology
A **namestring for <object>** is a Lisp string that names the C++ language element
`<object>` (such as a namespace, a variable, a class, etc.). Therefore we have a
"namestring for a namespace", a "namestring for a variable", etc.
A **designator for a namestring for <object>** is a Lisp object that denotes a
"namestring for <object>". These are symbols and strings themselves.
A **typestring** is a Lisp string that represents a C++ type, i.e. its
declaration.
A **typestring designator** is a Lisp object that denotes a typestring.
A typestring (and the corresponding C++ type) is said to be **supported** if it
can be parsed using `parse-cpp-type-declaration`. This concept is important and
should form the base of our design because we can't really hope to ever support
all of C++'s type declarations.
A typestring (and the corresponding C++ type) that is not supported is
**unsupported**.
A **processed typestring** is a typestring that is either fully qualified or
unqualified.
A C++ type is said to be **known** if LCP knows extra information about the type,
rather than just what kind of type it is, in which namespace it lives, etc. For
now, the only known types are those that are defined within LCP itself using
`define-class` & co.
A C++ type is **unknown** if it is not known.
**Typestring resolution** is the process of resolving a (processed) typestring
into an instance of `cpp-type` or `unsupported-cpp-type`.
**Resolving accessors** are accessors which perform typestring resolution.
## Changes
Explicitly introduce the concept of supported and known types. `cpp-type` models
supported types while `unsupported-cpp-type` models unsupported types.
Subclasses of `cpp-type` model known types. `general-cpp-type` is either a
`cpp-type` or an `unsupported-cpp-type`.
Add various type queries.
Fix `define-rpc`'s `:initarg` (remove it from the `cpp-member` struct).
Introduce namestrings and their designators in `names.lisp`.
Introduce typestrings and their designators.
Introduce **processed typestrings**. Our DSL's macros (`define-class` & co.)
convert all of the given typestrings into processed typestrings because we don't
attempt to support partially qualified names and relative name lookup yet. A
warning is signalled when a partially qualified name is treated as a fully
qualified name.
The slots of `cpp-type`, `cpp-class`, `cpp-member` and `cpp-capnp-opts` now
store processed typestrings which are lazily resolved into their corresponding
C++ types.
The only thing that instances of `unsupported-cpp-type` are good for is getting
the typestring that was used to construct them. Most of LCP's functions only
work with known C++ types, i.e. `cpp-type` instances. The only function so far
that works for both of them is `cpp-type-decl`.
Since "unsupportedness" is now explicitly part of LCP, client code is expected
to manually check whether a returned type is unsupported or not (or face
receiving an error otherwise), unless a function is documented to return only
`cpp-type` instances.
A similar thing goes for "knowness". Client code is expected to manually check
whether a returned type is known or not, unless a function is documented to
return only (instances of `cpp-type` subclasses) known types.
## TODO
Resolution still has to be done for the following slots of the
following structures:
- `slk-opts`
- `save-args`
- `load-args`
- `clone-opts`
- `args`
- `return-type`
Reviewers: teon.banek, mtomic
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1962
Summary:
This should hopefully resolve multithreading issues when multiple LCP
invocations tried to process the same file.
Reviewers: mferencevic, msantl
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2029
Summary:
Read throughput dropped by about 50% from the last diff.
This should fix it (at least according to local measurements).
Reviewers: msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2027
Summary:
According to the written benchmark, using MonotonicBufferResource yields
significant improvements to performance of Distinct. The setup fills the
database with vertices depending on the benchmark state. No edges are
created. Then we run DISTINCT on that. Since each vertex is unique, we
will store everything in the `DistinctCursor::seen_rows_`, which is
backed by a MemoryResource. This setup, on my machine, yields 10 times
better performance when run with MonotonicBufferResource.
Reviewers: mferencevic, mtomic, msantl
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1894
Summary:
This allows us to decouple issuing heartbeats from the server mode
which is useful when we know we will transition to LEADER but cannot yet
change the mode due to some Raft internals.
It can now happen that a couple of HBs are sent when in FOLLOWER or CANDIDATE
mode, but this doesn't affect the correctness of the protocol.
Reviewers: msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2025
Summary:
There will be a lot of leftover files, execute the following commands inside
`src/` to remove them:
```
git clean -xf
rm -r rpc/ storage/single_node_ha/rpc/
```
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2011
Summary:
Clang-analyzer-core found NullDereference issue in LruList. I don't see
how that can occur, but as a precaution I'll add a CHECK.
Reviewers: msantl, ipaljak
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2015
Summary:
Queries such as `UNWIND RANGE(1, 90000) AS x CREATE(:node{id:x});` work
now. At some point (cca 100000 state deltas) the messages become too large for
Cap'n Proto to handle and that exception kills the cluster. This is fine since
we are about to replace it anyway.
Also, the issue with leadership change during replication still remains if the
user tempers with the constants (election timeout more or less the same to hb
interval). This is unavoidable, so we need to address that issue in the future.
Also, I've removed the unnecessary `backoff_until_` variable and the logic around it.
It's a small change so I kept it within this diff (sorry).
Reviewers: msantl, mferencevic
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1998
Summary:
This API change is needed in order to propagate the memory allocation
scheme for the execution of LogicalOperator::Cursor
Depends on D1990
Reviewers: mtomic, mferencevic
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1980
Summary:
The new distributed directory is inside the query, and mirrors the query
structure. This groups all of the distributed (query) source code
together, which should make the potential directory extraction easier.
Reviewers: mferencevic, llugovic, mtomic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1923
Summary:
Manually call `CompactRange` on KVStore (rocksDB) to free disk space
upon Raft log compaction.
Raft log takes surprisingly large amount of disk space (1MB for 1000 entries) so
we need to free disk space as we compact the log into snapshot.
When `log_size_snapshot_threshold` set to `5000` the disk usage of the
`durability_dir` goes from `4.8 MB` to `240 KB`.
Reviewers: ipaljak
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1939
Summary:
This is still in progress.
As reported in
https://app.asana.com/0/743890251333732/971034572323525/f
memgraph distributed fails when the snapshot that it tries to recover from is
invalid. Instead of inheritance of `StorageGc` this diff adds composition and we
don't try to register a rpc server after the initialization.
Reviewers: ipaljak, vkasljevic, mtomic
Reviewed By: vkasljevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1930
Summary:
Same as `UniqueLabelPropertyConstratint` except it works with multiple properties.
Because of that it is a bit more complex and slower.
Reviewers: msantl, ipaljak
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1926
Summary:
Sometimes, when the leader resigns it's leadership, a newly elected
leader would send the old leader `AppendEntriesRPC` that would cause the
snapshot recovery to happen. This diff prevents that.
Reviewers: mferencevic, ipaljak
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1943
Summary:
Based on the failure manifested in
https://apollo.memgraph.io/runs/512803/ it seems like machines give each other
votes for the same term.
Looking at the code, `voted_for_` variable wasn't assigned on the election start
and the election starter could grant his vote to someone else but would still
count his vote to himself.
Reviewers: ipaljak, mferencevic
Reviewed By: mferencevic
Subscribers: pullbot, vkasljevic
Differential Revision: https://phabricator.memgraph.io/D1941
Summary:
This is a bugfix for D1836. It made `SymbolTable` return references to vector
elements, which then get invalidated and weird stuff happens.
This made a `DCHECK` in `rule_based_planner.hpp` trigger, and it was noticed by
@ipaljak 2 months later. All `DCHECK`s in `rule_based_planner.hpp` are now
changed to `CHECK`s.
Also, hash function for `Symbol` was wrong, because it also took
`user_declared` field into consideration, and `==` operator doesn't do that.
Reviewers: ipaljak, teon.banek, mferencevic, msantl
Reviewed By: msantl
Subscribers: pullbot, ipaljak
Differential Revision: https://phabricator.memgraph.io/D1938
Summary:
It seems that older clang versions erroneously accept modification of
constant pointers, this is a quick hack which resolves the issue on new
clang versions.
None of the const methods are really const and should not be marked as
such. This whole accessor thing is just a steaming pile of crap and
should be rewritten from scratch.
Reviewers: vkasljevic, msantl
Reviewed By: vkasljevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1922
Summary:
For HA benchmarks, if one of the executables exits with a status other
than zero, the benchmark should fail.
Also, removing `LOG(INFO)`, since failing benchmarks should flag where to look.
Reviewers: ipaljak
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1921
Summary:
Concurent access to the map that contains replication log timeouts
caused the HA version to often report replication log timeout errors.
Adding locks around the access prevets them from happening.
Performance on Apollo reports write speed around 8k/s.
Reviewers: ipaljak
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1920
Summary:
Case with existence constraints is about 8-9% slower then case without
existence constraints. Before this diff that difference was about 15-16%.
Reviewers: msantl, ipaljak
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1917
Summary:
UniqueLabelPropertyConstraint defines label + property restriction on vertices.
Label + Property + PropertyValue for that property must be unique at any given moment.
Reviewers: msantl, ipaljak
Reviewed By: msantl
Subscribers: mferencevic, pullbot
Differential Revision: https://phabricator.memgraph.io/D1884
Summary:
PROVE:PLAN clears the previously stored results for the current suite (which is
the suite associated with the current package) and prevents "result
accumulation" (and the accompanying huge and partly outdated reports).
Reviewers: mtomic, teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1909
Summary:
`SHOW STORAGE STATS` when executed in a Raft cluster should return
stats for each member of the cluster.
`StorageStats` starts a RPC server on each member of the cluster that answers
about its local storage stats.
The query can be invoked only on the current leader, the leader sends a request
to each peer and shows the results it gets. If some peers don't answer within 1
second, stats for those peers won't be shown.
The new output can be seen here: P27
Reviewers: ipaljak, mferencevic
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1907
Summary:
Added a new config parameter, replication timeout. This parameter sets the
upper limit to the replication phase and once the timeout exceeds, the
transaction engine stops accepting new transactions.
We could experience this timeout in two cases:
1. a network partition
2. majority of the cluster stops working
Reviewers: ipaljak
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1893
Summary: Removing Kafka, auth and audit log features from HA instance for the time being.
Reviewers: mferencevic, msantl
Reviewed By: mferencevic, msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1882
Summary:
Added index creation and deletion handling in StateDelta.
Also included an integration test that creates an index and makes sure that it
gets replicated by killing each peer eventually causing a leader re-election.
Reviewers: ipaljak
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1886
Summary: `DataRpcClient` and `DataRpcServer` now work with both old and new record. This will be needed later when I replace `Cache` with `LruCache`.
Reviewers: msantl, teon.banek, ipaljak
Reviewed By: msantl, teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1875
Summary:
`EdgesIterable` is used for iterating over Edges in distributed memgraph. Because of lru cache there is a possibility of data getting evicted as someone iterates over it.
To prevent that `EdgesIterable` will lock that data and release it when it's deconstructed.
Reviewers: msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1868
Summary:
CachedDataLock is necessary for lru cache as remote data is no longer
persistent. Most methods internally handle this, but for methods that
return pointers or references to remote data, we need to manually
lock data.
Reviewers: msantl, ipaljak
Reviewed By: msantl
Subscribers: teon.banek, pullbot
Differential Revision: https://phabricator.memgraph.io/D1869
Summary:
Existence constraint ensures that all nodes with certain label have a certain property.
`ExistenceRule` defines label -> properties rule and `ExistenceConstraints` manages all
constraints.
Reviewers: msantl, ipaljak, teon.banek, mferencevic
Reviewed By: msantl, teon.banek, mferencevic
Subscribers: mferencevic, pullbot
Differential Revision: https://phabricator.memgraph.io/D1797
Summary:
RecordAccessor doesn't implement `operator<`, so it doesn't make sense
to have it inherit TotalOrdering.
Reviewers: mferencevic, msantl, vkasljevic
Reviewed By: vkasljevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1866
Summary: Teon found this nit so we might fix it.
Reviewers: ipaljak
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1860
Summary:
During the following scenario:
- start a HA cluster with 3 machines
- find the leader and start sending queries
- SIGTERM the leader but leave other 2 machines untouched
The leader would be stuck in the shutdown phase.
This was happening because during the shutdown phase of the Bolt server, a
`graph_db_accessor` would try to commit a transaction after we've already shut
down Raft server. Raft, although not running, is still thinking it's in the
Leader mode. Tx Engine calls the `SafeToCommit` method to Commit transactions,
and ends up in an infinite loop.
Since Raft was shut down it won't handle any of the incoming RPCs and won't
change it's mode.
The fix here is to shut down the Bolt server before Raft, so we don't have any
pending commits once Raft is shut down.
Reviewers: ipaljak
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1853
Summary:
Added a new awesome function called `storageInfo`. This function
returns a list of key-value pairs containing the following (useful?)
information:
* number of vertices
* number of edges
* average degree
* memory usage
* disk usage
The current implementation is in `awesome_memgraph_functions` but it will end up
as a separate clause for better access control.
Reviewers: teon.banek, mtomic, mferencevic
Reviewed By: teon.banek
Subscribers: pullbot, buda
Differential Revision: https://phabricator.memgraph.io/D1850
Summary:
This change splits mg-communication into mg-communication and
mg-comm-rpc. The main reason for doing this, is to make separation of
enterprise features from community Memgraph more clear.
Reviewers: mferencevic, msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1844
Summary:
In Raft, we often need to access persistent state of the server
without modifying it. In order to speed up such operations, we
keep an in-memory copy of that state.
In this diff we make a copy of all persistent state except for
the log itself. Running our feature benchmark locally, we manage
to increase the throughput for cca 750 queries/s.
Reviewers: msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1843
Summary:
RuleBasedPlanner now generates only the regular ScanAll operations, and
Filter operations are appended as soon as possible. The newly added
Rewrite step, takes this operator tree and replaces viable Filter &
ScanAll operators with appropriate ScanAllBy<Index> operator. This
change ought to simplify the behaviour of DistributedPlanner when that
stage is moved before the indexed lookup rewrite.
Showing unoptimized plan in interactive planner is also supported.
Reviewers: mtomic, llugovic
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1839
Summary:
All AST nodes had a member `uid_` that was used as a key in
`SymbolTable`. It is renamed to `symbol_pos_` and it appears only in
`Identifier`, `NamedExpression` and `Aggregation`, since only those types were
used in `SymbolTable`. SymbolGenerator is now responsible for creating symbols
in `SymbolTable` and assigning positions to AST nodes.
Cloning and serialization code is now simpler since there is no need to track
UIDs.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1836
Summary:
In this part of log compaction for raft, I've implemented snapshooting
and snapshot recovery. I've also refactored the code a bit, so `RaftServer` now
has a pointer to the `GraphDb` and it can do some things by itself.
Log compaction requires some further work. Since snapshooting isn't synchronous
between peers, and each peer can work at their own pace, once we've compacted
the log so that the next log to be sent to peer `x` isn't available anymore, we
need to send the snapshot over the wire. This means that the next part will
contain the `InstallSnapshotRPC` and then maybe one more that will implement the
logic of sending `LogEntry` or the whole snapshot.
Reviewers: ipaljak
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1834
Summary:
`ReplicationLog` had a classic off-by-one bug. The `valid_prefix`
variable wasn't set properly.
This diff also includes a poor man's version of a HA client. This client
assumes that all the HA instances run on a single machine and that the
corresponding Bold endpoints have open ports ranging from `7687` to
`7687 + num_machines - 1`.
This should make it easeir to test certain things, ie. disk usage, P25.
This test revealed the bug with `ReplicationLog`
Reviewers: ipaljak
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1813
Summary:
Variable expansions cannot appear in merge patterns or after updates,
so they can only be planned with GraphState::OLD. Because of that, it makes
sense to remove GraphView parameter from them to reduce confusion.
Reviewers: teon.banek, llugovic
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1825
Summary:
Implement proper plan cloning using LCP instead of hacking it with
serialization.
depends on D1815
Reviewers: teon.banek, llugovic
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1816
Summary:
Once a leader loses it's leadership, in order to handle hanging
transactions, we reset the storage and the transaction engine.
This requires to re-apply all the commited entries from the log.
Once we add snapshot (log compaction) we would need to do that also.
One thing to have in mind is the `election_timeout_min` parameter. If it's set
too low it could trigger leader re-election too often.
Reviewers: ipaljak
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1822
Summary:
use newly added LCP functionality to get rid of manually written
`Clone` functions in AST.
depends on D1808
Reviewers: teon.banek, llugovic
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1815
Summary:
When there was an empty line comment starting right at the end of the
query, stripper wouldn't properly change state from `IN_LINE_COMMENT` to `OUT`
and it would return the wrong length (0) from `MatchWhitespaceAndComments`.
Because of that, the two slashes would be interpreted as division operators.
Query "RETURN 5;//" would be changed by stripper into "RETURN 5; / /" which
obviously can't be parsed;
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1819
Summary:
Add automatic generating of cloning (deep-copy) functions for LCP
defined classes. This enables us to remove a bunch of manually written `Clone`
functions from AST and also to implement logical plan cloning properly (before
it was using serialization).
Reviewers: teon.banek, llugovic
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1808
Summary:
Rename Context to ExecutionContext and make it struct
Move ParsingContext to cypher_main_visitor.hpp
Reviewers: mtomic, llugovic
Reviewed By: llugovic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1810
Summary:
Identifiers would be printed correctly if they were printed as part of a larger
`Expression` (so the visitor did the printing). However, if `PrintObject` was
called with an `Identifier *` then it would delegate to the template overload
and just print the pointer.
Reviewers: mtomic, teon.banek
Reviewed By: mtomic
Subscribers: mferencevic, pullbot
Differential Revision: https://phabricator.memgraph.io/D1802
Summary:
This diff removes the need for a database when parsing a query and
creating an Ast. Instead of storing storage::{Label,Property,EdgeType}
in Ast nodes, we store the name and an index into all of the names. This
allows for easy creation of a map from {Label,Property,EdgeType} index
into the concrete storage type. Obviously, this comes with a performance
penalty during execution, but it should be minor. The upside is that the
query/frontend minimally depends on storage (PropertyValue), which makes
writing tests easier as well as running them a lot faster (there is no
database setup). This is most noticeable in the ast_serialization test
which took a long time due to start up of a distributed database.
Reviewers: mtomic, llugovic
Reviewed By: mtomic
Subscribers: mferencevic, pullbot
Differential Revision: https://phabricator.memgraph.io/D1774
Summary:
- Rename `type` to `cpp-type` for consistency
- Fix docs
- Guard against simultaneous `type-params` and `type-args`
- Explicitly initialize to the empty list
- Fix test
Reviewers: mtomic, teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1806
Summary:
Each `raft::LogEntry` is now persisted under its own key in our `KVStore`. Locally running our HA feature benchmark yields the following results:
```
duration 23.7
executed_writes: 15000
write_per_second: 632.888
```
This represents about 5x increase in throughput.
Reviewers: msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1799
Summary:
`Edges` class didn't properly filter edges when given both edge type
and destination arguments, which causes weird bugs in query execution (wrong
edge getting matched in `Expand` with existing node flag set). This is probably
because we didn't support using both filters at the same time, but the API and
documentation for `VertexAccessor::in` and `VertexAccessor::out` functions
didn't reflect that.
Reviewers: teon.banek, msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1796
Summary:
Before I wrongly assumed `Shutdown` will always be called on Cursors and
removed BFS subcursors there. Now it is done using transactional cache clean-up
mechanism. There's a separate clean-up for `BfsSubcursorStorage` (for actual
subcursors) and `BfsRpcServer` (for database accessors created by the server).
I've changed `BfsRpcServer` to have a `GraphDbAccessor` per transaction,
instead of per `BfsSubcursor`. Mainly because there is no reliable way to check
if the transaction tied to the accessor has expired as it is not safe to call
`transaction_id` method (since `GraphDbAccessor` is holding only a reference to
`Transaction` object).
Reviewers: teon.banek, mferencevic
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1792
Summary:
Created a new integration test for Raft protocol.
The tests iterates through the Raft cluster and does the following:
* kill machine `X`
* execute a query
* bring `X` back to life
The first step is to insert a vertex in the cluster, and last step is to check
if the cluster has all the data.
I also edited some of the raft core files because this test surafaced some bugs.
The `tester` binary is a hacked version of the HA client and so are the parts in
the code that refuse to execute a query is the machine is not in `Leader` mode.o
Those parts will go away once we have a proper HA client.
I've run the `runner.py` for a while (215 times)
```
while ./runner.py &> log.txt; do echo -n "."; done
```
and it didn't break.
Reviewers: ipaljak, mferencevic
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1788
Summary: This simple function is required by the Tensorflow integration so that Memgraph can always return regular matrices of desired size.
Reviewers: teon.banek, mtomic, dsantl
Reviewed By: mtomic, dsantl
Subscribers: mferencevic, pullbot
Differential Revision: https://phabricator.memgraph.io/D1783
Summary:
Creating Raft noop logs on leader change will trigger the whole
log replication procedure that ends up committing/applying state deltas on newly
elected leaders that didn't receive the last commit index from the previous
leader.
I also included a small tweak that won't trigger add logs when a transaction
contains only BEGIN and ABORT StateDeltas, because we don't want to replicate
read queries.
Reviewers: ipaljak
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1785
Summary:
This (almost) removes the dependency of operators on NodeAtom and
EdgeAtom. Only EdgeAtom::Direction is needed. The change was done as the
initial step of removing dependency on storage from Ast. Additionally,
it makes sense for LogicalOperator to only depend on Expression classes.
Reviewers: mtomic, llugovic
Reviewed By: llugovic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1779
Summary:
TransactionReplicator replicates transactions on follower machines in
HA memgraph. Our DB accessor API doesn't provide us with the functionality to
begin transactions with non-increasing ids. This is why the
`TransactionReplicator` uses a internal map that maps tx ids from the leader
node to transactions on the follower node (whose id doesn't have to match the
leaders tx id).
If the leader has the following transaction timeline:
```
L
tx1
|
| tx2
| |
| |
| |
| |
| |
| tx2
|
|
|
|
tx1
```
`tx2` will commit first and will be replicated. When applying `tx2` on follower
nodes, they will start a new transaction with tx id `1`. When `tx1` starts
replicating, followers will start a new transaction with tx id `2`. And this is
wehre `TransactionReplicator` kicks in.
Reviewers: ipaljak
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1775
Summary:
Explicitly start and stop raft server.
This way we can be sure that raft won't try to use coordination after it's
shutdown, and we can define the start of th raft protocol easier.
Reviewers: ipaljak
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1766
Summary:
This is just the first diff that tries to wire the raft protocol into
memgraph.
In this diff I'm introducing transaction engine reset functionality. I also
introduced `RaftInterface` which should be used wherever someone wants to access
Raft from Memgraph.
For design decisions see the feature spec.
Reviewers: ipaljak, teon.banek
Reviewed By: ipaljak
Subscribers: pullbot, teon.banek
Differential Revision: https://phabricator.memgraph.io/D1758
Summary: This diff contains a rough implementation of the Raft protocol which ends at leader election.
Reviewers: msantl
Reviewed By: msantl
Subscribers: teon.banek, pullbot
Differential Revision: https://phabricator.memgraph.io/D1744
Summary:
Classes marked with `:serialize (:slk)` will now generate SLK
serialization code. This diff also changes how the `:serialize` option
is parsed, so that multiple different serialization backends are
supported.
Reviewers: mtomic, llugovic, mferencevic
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1755
Summary: This change undoes the performance impact that was introduced in D1117
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1756
Summary:
This should cover the minimum required feature set for generating the
serialization code for SLK. There are some TODO comments, mostly
concerning quality of life improvements. The documentation on LCP has
been updated.
Additionally, any previous CHECK which would trigger if loading went
wrong is now replaced by raising SlkDecodeException. Other assertions of
code misuse are left as CHECK invocations.
Reviewers: mtomic, llugovic, mferencevic
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1754
Summary:
Serialization of std::unique_ptr and std::shared_ptr now requires a
custom callback for handling the concrete element that is pointed to.
We use C++ type trait which checks whether the we are using a
polymorphic type --- class that has at least 1 virtual function. This
obviously doesn't work with pointers to base classes of hierarchies
without virtual member functions. But we don't use that kind of
inheritance, why would we, right? :)
(Luckily the breaking case isn't serialized, and hopefully never will
be. Perhaps we fix such inheritance in the future.)
Reviewers: mferencevic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1753
Summary: Starting memgraph with logging verbosity level 12 would crash memgraph, because extended and regular callbacks were not properly differentiated in logging.
Reviewers: mferencevic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1747
Summary:
It doesn't make sense to tie skipping a member for serialization to a
particular serialization implementation.
Reviewers: mtomic, llugovic
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1743
Summary:
With quite frequent changes of serialization backend, it's getting
really painful updating the C++ implementation. This diff defines the
Symbol class via LCP, so that the C++ serialization is generated instead
of written by hand.
Reviewers: mtomic, llugovic
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1741
Summary:
Instantiation with VertexAccessor was never used, so the template
needlessly complicated the rest of the codebase.
Reviewers: mtomic, llugovic
Reviewed By: mtomic
Subscribers: mferencevic, pullbot
Differential Revision: https://phabricator.memgraph.io/D1736
Summary:
TypeInfo will be needed for upcoming serialization via SLK. This diff
changes the already defined TypeInfo by removing the reliance on
capnp::typeId calls. The struct itself is now in utils.
Hopefully, this shouldn't break our RPC stack due to new ID generation.
Reviewers: mferencevic, mtomic, llugovic
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1735
Summary:
This is the initial work on transferring our serialization code from
Cap'n Proto to SaveLoadKit. The commit contains tests for generated
code, but still requires work on supporting some features. Most notably,
generating and storing type IDs for derived classes so that they can be
loaded from a base pointer. Naturally, loading implementation hasn't
even been started yet.
Reviewers: mtomic, llugovic, mferencevic
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1733
Summary:
GraphDb now has GetStorageStat method that returns live view to
object containing vertex_count, edge_count and avg_degree().
Stat is updated on each garbage collection run and represents number of
objects that are visible to any transaction.
Reviewers: teon.banek, msantl, ipaljak
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1731
Summary: SmallVector data structure introduction. This is the first step. In this revision are SmallVector implementation and tests. The second step will be swapping std::vector with SmallVector in the current codebase.
Reviewers: teon.banek, buda, ipaljak, mtomic
Reviewed By: teon.banek, ipaljak
Subscribers: ipaljak, pullbot
Differential Revision: https://phabricator.memgraph.io/D1730
Summary:
Basic RPC messages for Raft protocol. They will most likely be updated as we
move along with the implementation.
Reviewers: msantl, teon.banek, mferencevic
Reviewed By: msantl, teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1726
Summary:
Since we need to send `StateDelta`s over the wire in HA, we need to be
able to serialize those bad boys.
This diff hopefully does this the right way.
Reviewers: teon.banek, mferencevic, ipaljak
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1725
Summary:
This change makes HierarchicalTreeVisitor visit only Cypher related AST
nodes. QueryVisitor can be used to differentiate between various query
types we have. The next step is to either rename HierarchicalTreeVisitor
to something like CypherQueryVisitor, or perhaps extract Clause visiting
from it.
Reviewers: mtomic, llugovic
Reviewed By: llugovic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1710
Summary:
See https://app.asana.com/0/743890251333732/888297761596047/f for more details.
# BEFORE
```
Note: Google Test filter = *UniqueConstraintRecovery*
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from Durability
[ RUN ] Durability.UniqueConstraintRecovery
unknown file: Failure
C++ exception with description "Index couldn't be created due to constraint
violation!" thrown in the test body.
[ FAILED ] Durability.UniqueConstraintRecovery (3 ms)
[----------] 1 test from Durability (3 ms total)
[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (3 ms total)
[ PASSED ] 0 tests.
[ FAILED ] 1 test, listed below:
[ FAILED ] Durability.UniqueConstraintRecovery
1 FAILED TEST
```
# AFTER
```
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from Durability
[ RUN ] Durability.UniqueConstraintRecovery
[ OK ] Durability.UniqueConstraintRecovery (4 ms)
[----------] 1 test from Durability (4 ms total)
[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (4 ms total)
[ PASSED ] 1 test.
```
Reviewers: ipaljak, vkasljevic
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1714
Summary:
Initial for fork (in form of a c/p) form the current single node
version. Once we finish HA we plan to re-link the files to the single node
versions if they don't change.
Reviewers: ipaljak, buda, mferencevic
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1705
Summary:
This is the initial step in making the visitors more granular in what
they visit.
Also add ignoring multiple inheritance in LCP and update LCP docs.
Reviewers: mtomic, llugovic
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1704
Summary:
Generate conversion functions from C++ enum to Cap'n Proto and vice
verse. This should reduce the generated code size and hopefully improve
readability.
Use conversion functions for enums in default serialization generation.
`capnp-save-enum` and `capnp-load-enum` should now be used exclusively
to serialize non LCP defined enumerations. They now always require
`enum-values` argument.
Reviewers: mtomic, llugovic
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1696
Summary: this should reduce parsing time for very simple queries.
Reviewers: teon.banek, mferencevic
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1698
Summary:
Unfortunately, CMake macros cannot access variables defined in the file
where macro is defined. This means that we have to repeat the values by
hand in order to track the correct dependency of LCP.
Reviewers: mtomic, llugovic, mferencevic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1701
Summary:
Blocking transaction has the ability to stop the transaction engine from
starting new transactions (regular or blocking) and to wait all other active
transactions to finish (to become non active, committed or aborted). One thing
that blocking transactions support is defining the parent transaction which
does not need to end in order for the blocking one to start. This is because of
a use case where we start nested transactions.
One could thing we should build indexes inside those blocking transactions. This
is true and I wanted to implement this, but this would require some digging in
the interpreter which I didn't want to do in this change.
Reviewers: mferencevic, vkasljevic, teon.banek
Reviewed By: mferencevic, teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1695
Summary:
`Query` is now an abstract class which has `CypherQuery`,
`ExplainQuery`, `IndexQuery`, `AuthQuery` and `StreamQuery` as derived
classes. Only `CypherQuery` is forwarded to planner and the rest of the
queries are handled directly in the interpreter. This enabled us to
remove auth, explain and stream operators, clean up `Context` class and
remove coupling between `Results` class and plan cache. This should make
it easier to add similar functionality because no logical operator
boilerplate is needed. It should also be easier to separate community
and enterprise features for open source.
Remove Explain logical operator
Separate IndexQuery in AST
Handle index creation in interpreter
Remove CreateIndex operator and ast nodes
Remove plan cache reference from Results
Move auth queries out of operator tree
Remove auth from context
Fix tests, separate stream queries
Remove in_explicit_transaction and streams from context
Reviewers: teon.banek, mferencevic, msantl
Reviewed By: teon.banek, mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1664
Summary:
As Tom requested in the #graphileon slack channel, I changed the error
message when node can't be updated.
Reviewers: mferencevic, ipaljak
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1687
Summary:
The row elements must not be moved, because the following pulls may
still use them.
Reviewers: mtomic, llugovic, mferencevic
Reviewed By: llugovic, mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1672
Summary:
LabelPropertyIndex now has the ability to enforce unique constraint.
This doesn't lock the tx engine.
Reviewers: teon.banek, mferencevic
Reviewed By: teon.banek
Subscribers: pullbot, vkasljevic, buda
Differential Revision: https://phabricator.memgraph.io/D1660
Summary: Up till now, `AstStorage` also took care of tracking the root of the `Query` and loading of cloning of `Query` nodes would change that root. This felt out of place because sometimes `AstStorage` is used only for storing expressions, and we don't even have an entire query in the storage. This diff removes that feature from `AstStorage`. Now its only functionality is owning AST nodes and assigning unique IDs to them.
Reviewers: teon.banek, llugovic
Reviewed By: teon.banek
Subscribers: mferencevic, pullbot
Differential Revision: https://phabricator.memgraph.io/D1646
Summary:
If the `STDIN_FILENO` is a `TTY`, then an interactive command prompt
is shown. Otherwise, client runs in non-interactive mode.
Shell commands start with a colon sign, and end at the end of
line.
Csv and tabular output formats are supported.
Reviewers: teon.banek, mferencevic
Reviewed By: teon.banek, mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1633
Summary:
This diff fixes the storage part of the bug that happen for the
following queries if all nodes end up on the non-master machine.
```
CREATE (a:T{c: True})-[:X{x: 2.5}]->(:A:B)-[:Y]->()-[:Z{r: 1}]->(a)
MATCH (:T{c: True})-[a:X{x: 2.5}]->(node:A:B)-[:Y]->()-[:Z{r: 1}]->() RETURN a AS node, node AS a
```
The current state of the code doesn't work completely, it's still missing the
query execution fix.
Reviewers: teon.banek, ipaljak, vkasljevic
Reviewed By: teon.banek
Subscribers: mferencevic, pullbot
Differential Revision: https://phabricator.memgraph.io/D1629
Summary:
Start removing `is_remote` from `Address`
Remove `GlobalAddress`
Remove `GlobalizedAddress`
Remove bitmasks from `Address`
Remove `is_local` from `Address`
Remove `is_local` from `RecordAccessor`
Remove `worker_id` from `Address`
Remove `worker_id` from `GidGenerator`
Unfriend `IndexRpcServer` from `Storage`
Remove `LocalizedAddressIfPossible`
Make member private
Remove `worker_id` from `Storage`
Copy function to ease removal of distributed logic
Remove `worker_id` from `WriteAheadLog`
Remove `worker_id` from `GraphDb`
Remove `worker_id` from durability
Remove nonexistant function
Remove `gid` from `Address`
Remove usage of `Address`
Remove `Address`
Remove `VertexAddress` and `EdgeAddress`
Fix Id test
Remove `cypher_id` from `VersionList`
Remove `cypher_id` from durability
Remove `cypher_id` member from `VersionList`
Remove `cypher_id` from database
Fix recovery (revert D1142)
Remove unnecessary functions from `GraphDbAccessor`
Revert `InsertEdge` implementation to the way it was in/before D1142
Remove leftover `VertexAddress` from `Edge`
Remove `PostCreateIndex` and `PopulateIndexFromBuildIndex`
Split durability paths into single node and distributed
Fix `TransactionIdFromWalFilename` implementation
Fix tests
Remove `cypher_id` from `snapshooter` and `durability` test
Reviewers: msantl, teon.banek
Reviewed By: msantl
Subscribers: msantl, pullbot
Differential Revision: https://phabricator.memgraph.io/D1647
Summary:
`heaptrack` shows a miniscule decrease in memory usage during query
execution.
Running the below query on the TEDTalk database 100 times gives the
following results:
- number of allocations: from 642647 to 642589
- bytes allocated in total: from 48.79 MiB to 48.78 MiB
```
MATCH (t:Tag)<-[:HasTag]-(n:Talk)
RETURN t.name AS Tag, COUNT(n) AS TalksCount
ORDER BY TalksCount DESC, Tag LIMIT 20;
```
Regarding `TypedValue`'s destructor: we're allowed to manually destruct the
union memebers that we constructed using placement-new. However, it is undefined
behavior to call the destructor after an object's lifetime has ended. Calling
`TypedValue`'s own destructor within its assignment operator counts as ending
its lifetime, which means that the next call to its destructor will invoke
undefined behavior.
See the C++ Draft N4140, clauses 12.4/15 and 3.8/1.3.
Reviewers: teon.banek, mtomic
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1630
Summary:
To clean the working directory after this diff you should execute:
```
rm src/database/counters_rpc_messages.capnp
rm src/database/counters_rpc_messages.hpp
rm src/database/serialization.capnp
rm src/database/serialization.hpp
```
Reviewers: teon.banek, msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1636
Summary: Move explain to Cypher.g4 so MemgraphCypher contains only enterprise features. Also added `EXPLAIN` to keyword list because query `MATCH (explain) RETURN explain` wasn't working.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1632
Summary:
This should fix the issue of having a write followed by a read during
distributed execution. Any kind of merger of plans should behave like a
Cartesian with regards to planning the following ScanAll.
Reviewers: mtomic, mferencevic
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1624
Summary:
This diff splits single node and distributed storage from each other.
Currently all of the storage code is copied into two directories (one single
node, one distributed). The logic used in the storage implementation isn't
touched, it will be refactored in following diffs.
To clean the working directory after this diff you should execute:
```
rm database/state_delta.capnp
rm database/state_delta.hpp
rm storage/concurrent_id_mapper_rpc_messages.capnp
rm storage/concurrent_id_mapper_rpc_messages.hpp
```
Reviewers: teon.banek, buda, msantl
Reviewed By: teon.banek, msantl
Subscribers: teon.banek, pullbot
Differential Revision: https://phabricator.memgraph.io/D1625
Summary:
TODO:
~~1. Figure out how to propagate exceptions during lambda evaluation to master.~~
~~2. Make some more complicated test cases to see if everything is~~
~~sent over the network properly (lambdas depending on frame, evaluation context).~~
~~3. Support only `GraphView::OLD`.~~
4. [MAYBE] Send only parts of the frame necessary for lambda evaluation.
~~5. Fix EdgeType handling~~
--------------------
Serialize frame and send it in PrepareForExpand RPC
Move Lambda out of ExpandVariable
Send symbol table and filter lambda in CreateBfsSubcursor RPC
Evaluate filter lambda during the expansion
Send evaluation context in CreateBfsSubcursor RPC
Reviewers: teon.banek, msantl
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1600
Summary:
This should allow us to more easily decouple the code which should be
open sourced. Unfortunately, the downside of this approach is that we
cannot rely on virtual calls to dispatch the serialization to correct
type. Another downside is that members need to be publicly accessible
for serialization.
Reviewers: mtomic, msantl
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1596
Summary:
This diff changes the RPC layer to directly return `TResponse` to the user when
issuing a `Call<...>` RPC call. The call throws an exception on failure
(instead of the previous return `nullopt`).
All servers (network, RPC and distributed) are set to have explicit `Shutdown`
methods so that a controlled shutdown can always be performed. The object
destructors now have `CHECK`s to enforce that the `AwaitShutdown` methods were
called.
The distributed memgraph is changed that none of the binaries (master/workers)
crash when there is a communication failure. Instead, the whole cluster starts
a graceful shutdown when a persistent communication error is detected.
Transient errors are allowed during execution. The transaction that errored out
will be aborted on the whole cluster. The cluster state is managed using a new
Heartbeat RPC call.
Reviewers: buda, teon.banek, msantl
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1604
Summary:
This change should allow for multiple `add_lcp` functions to exist in a
single CMakeLists.txt, each adding LCP files to a different target.
Supporting this feature allows us to separate LCP files for single node
and distributed.
Reviewers: mferencevic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1613
Summary: This diff undoes the changes made in D925
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1614
Summary:
Even though we may not need the results of a RPC, we should check that
it completed without error.
Reviewers: mferencevic, mtomic
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1610
Summary:
Knowledge of distributed operators is now removed from PlanPrinter
class. The DistributedOperatorVisitor and PlanPrinter are modified so
that they may be multiple inherited. This is done using virtual
inheritance of HierarchicalLogicalOperatorVisitor. Multiple inheritance
is used to derived a DistributedPlanPrinter which knows how to print
distributed in operators. This removes the dependency of single node
pretty printing on distributed.
Reviewers: msantl, mtomic
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1606
Summary:
With this diff we should be able to support dynamic worker addition.
This is ofcourse a minimal effort maximal impact approach.
This diff introduces new RPC calls when a worker registers.
The `DynamicWorkerAddition` doesn't use `GraphDbAccessor` to get indices because
they create WAL entries.
Reviewers: vkasljevic, ipaljak, buda
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1594
Summary:
Previous version of code was wrong in some cases. Example:
There's an edge e = (a)->(b). Node (a) belongs to worker 1, node (b) belongs to worker 0. Therefore, edge e belongs to worker 1.
When edge e was serialized on worker 0, address of node b was local and it would be globalized, but the worker id of edge (e) would be used, which is wrong.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1599
Summary:
In a bunch of places `TypedValue` was used where `PropertyValue` should be. A lot of times it was only because `TypedValue` serialization code could be reused for `PropertyValue`, only without providing callbacks for `VERTEX`, `EDGE` and `PATH`. So first I wrote separate serialization code for `PropertyValue` and put it into storage folder. Then I fixed all the places where `TypedValue` was incorrectly used instead of `PropertyValue`. I also disabled implicit `TypedValue` to `PropertyValue` conversion in hopes of preventing misuse in the future.
After that, I wrote code for `VertexAccessor` and `EdgeAccessor` serialization and put it into `storage` folder because it was almost duplicated in distributed BFS and pull produce RPC messages. On the sender side, some subset of records (old or new or both) is serialized, and on the reciever side, records are deserialized and immediately put into transaction cache.
Then I rewrote the `TypedValue` serialization functions (`SaveCapnpTypedValue` and `LoadCapnpTypedValue`) to not take callbacks for `VERTEX`, `EDGE` and `PATH`, but use accessor serialization functions instead. That means that any code that wants to use `TypedValue` serialization must hold a reference to `GraphDbAccessor` and `DataManager`, so that should make clients reconsider if they really want to use `TypedValue` instead of `PropertyValue`.
Reviewers: teon.banek, msantl
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1598
Summary:
This change introduces a pure virtual initial implementation of the transaction
engine which is then implemented in two versions: single node and distributed.
The interface classes now have the following hierarchy:
```
Engine (pure interface)
|
+----+---------- EngineDistributed (common logic)
| |
EngineSingleNode +-------+--------+
| |
EngineMaster EngineWorker
```
In addition to this layout the `EngineMaster` uses `EngineSingleNode` as its
underlying storage engine and only changes the necessary functions to make
them work with the `EngineWorker`.
After this change I recommend that you delete the following leftover files:
```
rm src/distributed/transactional_cache_cleaner_rpc_messages.*
rm src/transactions/common.*
rm src/transactions/engine_rpc_messages.*
```
Reviewers: teon.banek, msantl, buda
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1589
Summary:
This change improves detection of errorneous situations when starting a
distributed cluster on a single machine. It asserts that the user hasn't
started more memgraph nodes on the same machine with the same durability
directory. Also, this diff improves worker registration. Now workers don't have
to have explicitly set IP addresses. The master will deduce them from the
connecting IP when the worker registers.
Reviewers: teon.banek, buda, msantl
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1582
Summary:
This is a simple change which modifies interface of
awesome_memgraph_functions to accept C-style pointer to array with
count. Doing things this way, allows us to easily try out different
allocation schemes for function arguments. In this diff, we are now
using stack allocation of arguments in a plain fixed size array. This is
done when the number of arguments is small. According to heaptrack, this
small change should yield noticeable improvements to heap usage.
Obviously, this doesn't solve the problem of heap allocations inside
TypedValue arguments themselves. These allocations appear when
std::string and std::vector is used inside TypedValue.
Micro benchmarks show that there is some performance improvement,
mostly around the limits of using array vs std::vector. The improvement is
more noticeable with multiple threads, due to primary gain being in avoiding
calls to memory allocation.
Reviewers: mtomic, msantl, mferencevic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1581
Summary:
These functions were defined in multiple places. They are moved to
cmake/functions.cmake to keep only one source of truth.
Reviewers: mferencevic, msantl, mculinovic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1578
Summary: Recovery file is version inconsistent iff it starts with a correct magic number, has at least one integer written after that and that integer differs from kVersion.
Reviewers: mferencevic, ipaljak, vkasljevic, buda
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1565
Summary:
In order to add kafka benchmark, `memgraph_bolt.cpp` has been split.
Now we have `memgraph_init.cpp/hpp` files with common memgraph startup code.
Kafka benchmark implements a new `main` function that doesn't start a bolt
server, it just creates and starts a stream. Then it waits for the stream to
start consuming and measures the time it took to import the given number of
entries.
This benchmark is in a new folder, `feature_benchmark`, and so should any new
bechmark that measures performance of memgraphs features.
Reviewers: mferencevic, teon.banek, ipaljak, vkasljevic
Reviewed By: mferencevic, teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1552
Summary:
This diff introduces a new flags
* `--synchronous-commit`
The `--synchronous-commit` tells the WAL when should the deltas be flushed to
the disk drive. By default this is off and the WAL flushes deltas every `N`
milliseconds. If it's turned on, on every transaction end, commit or abort, the
WAL will first flush the deltas and only after that will return from ending a
transaction.
Reviewers: buda, vkasljevic, mferencevic, teon.banek, ipaljak
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1542
Summary: A quick clean-up of user visible error messages. Tried to make them gramatically correct by capitalizing the first word in the sentence and putting a dot at the end.
Reviewers: teon.banek, buda, ipaljak
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1571
Summary:
Changed GRANT ROLE to SET ROLE. Now it is `SET ROLE FOR user TO ROLE` instead of `GRANT ROLE role TO user`. It makes more sense because our users can only have 1 role.
Changed REVOKE ROLE to CLEAR ROLE. Now it is `CLEAR ROLE FOR user` instead of `REVOKE ROLE role FOR user`. REVOKE ROLE would throw exception if user was not a member of role. CLEAR ROLE clears the role whatever it is. I find that the latter makes more sense combined with SET ROLE.
Changed `SHOW ROLE FOR USER user` to `SHOW ROLE FOR user`.
Changed `SHOW USERS FOR ROLE role` to `SHOW USERS FOR role`.
Reviewers: mferencevic, teon.banek, buda
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1572
Summary:
Visitor pattern's main issue is cyclical dependency between classes that
are visited and the visitor instance itself. We need to decouple this
dependency if we want to open source part of the code, namely
non-distributed part. This decoupling is achieved through the use of
`dynamic_cast` in distributed operators. Hopefully the solution is good
enough and doesn't cause performance issues. An alternative solution is
to build our own custom double dispatch solution, but that will
basically boil down to our implementation of runtime type information
and casts.
Note, this only decouples the distributed operators. If and when we
decide that other operators shouldn't be open sourced, the same
`dynamic_cast` pattern should be applied in them also.
Depends on D1563
Reviewers: mtomic, msantl, buda
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1566
Summary:
This is the first step in separating the implementation of distributed
features in operators. Following steps are:
* decoupling distributed visitors
* injecting distributed details in operator state
* minor cleanup or anything else that was overlooked
Reviewers: mtomic, msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1563
Summary:
Instead of DataManager returning Cache which can fetch data when needed,
I refactored the code so that the Cache is simple wrapper around
unordered_map and the DataManager is one that is fetching data. Also Cache
is not visible from outside of the DataManager so we can add LRU policy
without changing anything else.
Reviewers: msantl, ipaljak, teon.banek, buda
Reviewed By: msantl, teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1545
Summary:
Added WAL and Snapshot explorer utility executables that read a wal or a
snapshot file and print the content of it, but the main purpose is that they
check it.
This is useful when debugging durability and want to know where it gets stuck.
Reviewers: dgleich, buda
Reviewed By: buda
Subscribers: ipaljak, vkasljevic, teon.banek, pullbot
Differential Revision: https://phabricator.memgraph.io/D1422
Summary:
Updated the feature specs, the changelog and added a new section in
user technical.
Reviewers: mferencevic, mculinovic, buda, ipaljak
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1534
Summary:
The Kafka Python transform functionality uses a Python script to transform
incoming Kafka data into queries and parameters that are executed against the
database. When starting the Python transform script it is started in a
sandboxed environment so that it can't do harm to the host system or the
database.
Reviewers: msantl, teon.banek
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1509
Summary:
Since RecordAccessor is often instantiated and copied, using a
factory-like function to allocate concrete types on the heap would be
too costly. The approach in this diff uses a Strategy pattern (see
"Design Patterns" by Gamma et al.), where the Strategy interface is
given as RecordAccessor::Impl. Concrete implementations are then created
for each GraphDb. This allows us to instantiate the concrete
RecordAccessors::Impl *once* and *share* it among all RecordAccessors.
Reviewers: msantl, vkasljevic
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1510
Summary:
Our implementation of the Bolt protocol now correctly handles INIT message
metadata used for authentification. The related description from the Bolt
standard is: 'The token must contain either just the entry {"scheme": "none"}
or the keys scheme, principal, and credentials. Example {"scheme": "basic",
"principal": "user", "credentials": "secret"}". If no scheme is provided, it
defaults to "none".'
Reviewers: msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1518
Summary:
Since we switched to Cap'n Proto serialization there's no need for
keeping boost around anymore.
Reviewers: mtomic, mferencevic, buda
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1515
Summary:
The raft implementation has been stale for a while now. It doesn't
compile, nor uses Cap'n Proto for serialization. In the future we would
probably rewrite it, so it doesn't need to be part of the repo at this
moment.
Reviewers: mferencevic, mtomic, buda
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1513
Summary: first step for a cleaner query front end that might be easier to open source
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1511
Summary:
GraphDbAccessor is now constructed only through GraphDb. This allows the
concrete GraphDb to instantiate a concrete GraphDbAccessor. This allows
us to use virtual calls, so that the implementation may be kept
separate. The major downside of doing things this way is heap allocation
of GraphDbAccessor. In case it turns out to be a real performance
issues, another solution with pointer to static implementation may be
used.
InsertVertexIntoRemote is now a non-member function, which reduces
coupling. It made no sense for it to be member function because it used
only the public parts of GraphDbAccessor.
Reviewers: msantl, mtomic, mferencevic
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1504
Summary:
This change, hopefully, simplifies the implementation of different kinds
of GraphDb. The pimpl idiom is now simplified by removing all of the
crazy inheritance. Implementations classes are just plain data stores,
without any methods. The interface classes now have a more flat
hierarchy:
```
GraphDb (pure interface)
|
+----+---------- DistributedGraphDb (pure interface)
| |
Single Node +-----+------+
| |
Master Worker
```
DistributedGraphDb is used as an intermediate interface for all the
things that should work only in distributed. Therefore, virtual calls
for distributed stuff have been removed from GraphDb. Some are exposed
via DistributedGraphDb, other's are only in concrete Master and Worker
classes. The code which relied on those virtual calls has been
refactored to either use DistributedGraphDb, take a pointer to what is
actually needed or use dynamic_cast. Obviously, dynamic_cast is a
temporary solution and should be replaced with another mechanism (e.g.
virtual call, or some other function pointer style).
The cost of the above change is some code duplication in constructors
and destructors of classes. This duplication has a lot of little tweaks
that make it hard to generalize, not to mention that virtual calls do
not work in constructor and destructor. If we really care about
generalizing this, we should think about abandoning RAII in favor of
constructor + Init method.
The next steps for splitting the dependencies that seem logical are:
1) Split GraphDbAccessor implementation, either via inheritance or
passing in an implementation pointer. GraphDbAccessor should then
only be created by a virtual call on GraphDb.
2) Split Interpreter implementation. Besides allowing single node
interpreter to exist without depending on distributed, this will
enable the planner and operators to be correctly separated.
Reviewers: msantl, mferencevic, ipaljak
Reviewed By: msantl
Subscribers: dgleich, pullbot
Differential Revision: https://phabricator.memgraph.io/D1493
Summary: This is a simple thing that enables us to use keywords as symbolic names. A bad thing is that planning will be done for queries which differ only in keyword case (e.g. "MATCH (x) RETURN x" and "match (x) return x"), but that should not be a significant performance impact as queries are done programmaticaly when high throughput is expected and we don't expect letter case change in that case.
Reviewers: buda, mferencevic, teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1496
Summary:
Run each query from a batch in a separate transaction. This will allow us to
abort a transaction if the interpreter throws on a query.
We should keep that in mind in the future for possible speedups.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1501
Summary:
First iteration in implementing kafka.
Currently, memgraph streams won't use the transform script provided in the
`CREATE STREAM` query.
There is a manual test that serves a POC purpose which we'll use to fully wire
kafka in memgraph.
Since streams need to download the script, I moved curl init from
telemetry.
Reviewers: teon.banek, mferencevic
Reviewed By: mferencevic
Subscribers: ipaljak, pullbot, buda
Differential Revision: https://phabricator.memgraph.io/D1491
Summary:
This change should preclude the need to specify `:capnp-save` and
`:capnp-load` functions for regularly saved elements of `std::vector<T>`
and `std::optional<T>`. Regular saving in this context means saving
primitive types or compound types which have a `Save(capnp::Builder *)`
method.
Reviewers: mtomic, msantl
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1497
Summary:
Previously, our implementation of the Bolt protocol buffered all results in
memory before sending them out to the client. This implementation immediately
streams the results to the client to avoid any memory allocations. Also, this
implementation splits the interpretation and pulling logic into two.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1495
Summary:
This change should completely support planning Optional for distributed
execution. Cartesian matching is handled, as well as dependencies
between optional branch and the main input branch.
Unit tests are expanded to cover the planning algorithm.
Reviewers: msantl, mtomic
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1480
Summary:
During the creation of indexes there could be a case in which a vertex contains a label/property but is not a part of index after
index building completes.
This happens if vertices are being inserted while the index is being built.
Reviewers: buda, msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1484
Summary:
Session specifics have been move out of the Bolt `executing` state, and
are accessed via pure virtual Session type. Our server is templated on
the session and we are setting the concrete type, so there should be no
virtual call overhead. Abstract Session is used to indicate the
interface, this could have also been templated, but the explicit
interface definition makes it clearer.
Specific session implementation for running Memgraph is now implemented
in memgraph_bolt, which instantiates the concrete session type. This may
not be 100% appropriate place, but Memgraph specific session isn't
needed anywhere else.
Bolt/communication tests now use a dummy session and depend only on
communication, which significantly improves test run times.
All these changes make the communication a library which doesn't depend
on storage nor the database. Only shared connection points, which aren't
part of the base communication library are:
* glue/conversion -- which converts between storage and bolt types, and
* communication/result_stream_faker -- templated, but used in tests and query/repl
Depends on D1453
Reviewers: mferencevic, buda, mtomic, msantl
Reviewed By: mferencevic, mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1456
Summary:
This is the first step in cutting the crazy dependencies of
communication module to the whole database. Includes have been
reorganized and conversion between DecodedValue and other Memgraph types
(TypedValue and PropertyValue) has been extracted to a higher level
component called `communication/conversion`. Encoder, like Decoder, now
relies only on DecodedValue. Hopefully the conversion operations will
not significantly slow down streaming Bolt data.
Additionally, Bolt ID is now wrapped in a class. Our storage model uses
*unsigned* int64, while Bolt expects *signed* int64. The implicit
conversions may lead to encode/decode errors, so the wrapper should
enforce some type safety to prevent such errors.
Reviewers: mferencevic, buda, msantl, mtomic
Reviewed By: mferencevic, mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1453
Summary:
Added test stream functionality. When a stream is configured, it will try to
consume messages from a kafka topic and return them back to the user.
For now, the messages aren't transformed, so it just returns the payload string.
Depends on D1466
Next steps are persisting stream metadata and transforming messages in order to
store them in the graph.
Reviewers: teon.banek, mtomic
Reviewed By: teon.banek
Subscribers: pullbot, buda
Differential Revision: https://phabricator.memgraph.io/D1474