Summary:
This is the initial step to getting a correct version of distributed
planning of Cartesian operator. Functions and structs have been added
which should collect enough information to correctly order the execution
with regards to dependencies among Cartesian branches. The support
functionality should be the same as was before, but unsupported cases
should now raise an exception instead of leading to undefined behaviour.
Reviewers: msantl, mtomic, buda
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1418
Summary:
Converts the RPC stack to use Cap'n Proto for serialization instead of
boost. There are still some traces of boost in other places in the code,
but most of it is removed. A future diff should cleanup boost for good.
The RPC API is now changed to be more flexible with regards to how
serialize data. This makes the simplest cases a bit more verbose, but
allows complex serialization code to be correctly written instead of
relying on hacks. (For reference, look for the old serialization of
`PullRpc` which had a nasty pointer hacks to inject accessors in
`TypedValue`.)
Since RPC messages were uselessly modeled via inheritance of Message
base class, that class is now removed. Furthermore, that approach
doesn't really work with Cap'n Proto. Instead, each message type is
required to have some type information. This can be automated, so
`define-rpc` has been added to LCP, which hopefully simplifies defining
new RPC request and response messages.
Specify Cap'n Proto schema ID in cmake
This preserves Cap'n Proto generated typeIds across multiple generations
of capnp schemas through LCP. It is imperative that typeId stays the
same to ensure that different compilations of Memgraph may communicate
via RPC in a distributed cluster.
Use CLOS for meta information on C++ types in LCP
Since some structure slots and functions have started to repeat
themselves, it makes sense to model C++ meta information via Common Lisp
Object System.
Depends on D1391
Reviewers: buda, dgleich, mferencevic, mtomic, mculinovic, msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1407
Summary:
Add additional structs and functions for handling C++ meta information.
Add capnp-file and capnp-id arguments to lcp:process-file.
Generate cpp along with hpp and capnp in lcp.
Wrap LogicalOperator base class in lcp:define-class.
Modify logical operators for capnp serialization.
Add query/common.capnp.
Reviewers: mculinovic, buda, mtomic, msantl, ipaljak, dgleich, mferencevic
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1391
Summary:
Utils source files are now moved to a standalone mg-utils library.
Unit and manual tests are no longer collected using glob recursion in
cmake, but are explicitly listed. This allows us to set only required
dependencies of those tests.
Both of these changes should improve compilation and link times, as well
as lower the disk usage.
Additional improvement would be to cleanup utils header files to be
split in .hpp and .cpp as well as merging threading into utils. Other
potential library extractions that shouldn't be difficult are:
* data_structures
* io/network
* communication
Reviewers: buda, mferencevic, dgleich, ipaljak, mculinovic, mtomic, msantl
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1408
Summary:
It seems that streamoff is not in std::iostream namespace, but in std::
namespace. This caused problems with newer versions of libraries.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1404
Summary:
Command id is necessary in remote produce to identify an ongoing pull
because a transaction can have multiple commands that all belong under
the same plan and tx id.
Reviewers: teon.banek, mtomic, buda
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1386
Summary:
Removing `AS_IS` from GraphView because it doesn't seem like it is necessary for query execution and it also has weird semantics (you might get a mix of old and new records). `Unwind`, `OrderBy` and `PullRemoteOrderBy` now use `OLD` graph view.
Remove AS_IS from GraphView
Fix query_cost_estimator tests
Fix query_expression_evaluator tests
Fix query_plan_match_filter_return tests
Fix query_plan_create_set_remove_delete tests
Fix query_plan_accumulate_aggregate tests
Reviewers: teon.banek, buda
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1390
Summary:
A simplified end-to-end implementation
POD interface set-up, still have bugs with HDDkeys
Version bug fix and first iterator implementation
Fixed out-of-scope reference in PVS iterator
Added PVS unit tests
Reviewers: buda, mferencevic, dgleich, teon.banek
Reviewed By: buda, dgleich
Subscribers: mferencevic, pullbot
Differential Revision: https://phabricator.memgraph.io/D1369
Summary:
Since we are moving from boost to Capnp for serialization, it makes
sense to keep all of the LogicalOperator classes in LCP format. This
will make it easier to generate Capnp code.
Depends on D1361
Reviewers: buda, mferencevic, msantl, dgleich, ipaljak, mculinovic, mtomic
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1362
Summary: Global address of `from` was stored in Edge record when created via CreateEdge RPC.
Reviewers: buda, msantl, dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1387
Summary:
Pointers in AstTreeStorage can point to same nodes. These nodes
should be serialized(saved) only once when serializing Ast tree.
Same goes for deserializing(loading).
When deserializing nodes, one should not use storage Create method,
because it will add node which isn't completely loaded to storage vector.
Therefore, one should use new Deserialize method which doesn't
add node to storage vector. Inserting node into storage vector
is defered until all node data is loaded.
Pass pointer to saved uids set, instead of storage
Delete unused set
Reviewers: teon.banek, msantl
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1344
Summary:
In order to enhance C++ metaprogramming capabilities, a custom
preprocessing step is added before compilation. C++ code may be mixed
with Lisp code in order to generate a complete C++ source code. The
mechanism is hooked into cmake. To notify cmake of .lcp files, `add_lcp`
function in src/CMakeLists.txt needs to be invoked.
The main executable entry point is in tools/lcp, while the source code
is in src/lisp/lcp.lisp
The main goal of LCP is to auto generate class serialization code and
member variable getter functions. This should now be significantly less
error prone, since you cannot forget to serialize a member variable
through this mechanism. Future uses should be generating other repeating
code, such as `Clone` methods or perhaps some debug information.
.lcp files may contain mixed C++ code (enclosed in #>cpp ... cpp<#
blocks) with Common Lisp code.
NOTE: With great power comes great responsibility. Lisp metaprogramming
capabilities are incredibly powerful. To keep the sanity of the team
intact, use Lisp preprocessing only when *really* necessary.
Reviewers: buda, mferencevic, msantl, dgleich, ipaljak, mculinovic, mtomic
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1361
Summary:
GC called clog info on transactions which were not completed even though it
could deduce that they are not finished. By avoiding calling clog info on them
we can lower the number of rpc calls and remove a strange behaviour of gc
appearing to be hanging while recovering a worker.
Reviewers: mculinovic, mferencevic, buda
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1377
Summary:
This binary is installed and packaged for release. This is just a quick
solution for releasing the Community 0.10 version. We still need to
setup the installation and packaging for both the Enterprise and
Community versions. Additionally, the automated build system needs to
test both binaries for correct behaviour. Obviously, some tests can only
be run on one of the 2 versions.
Reviewers: mferencevic, buda
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1363
Summary:
Added a new markdown file for concepts and a new test to test the edge
case with upper bound.
Reviewers: teon.banek, mtomic, dgleich, buda, ipaljak
Reviewed By: teon.banek, dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1366
Summary:
Wal retention test was flaky, this kinda fixes it.
It's hard to fix it completely since there are separate threads happening which
are not synchronized, and forcing the synchronization from the outside would
not agree to the specifications i.e. single-threadness of WAL.
Reviewers: buda
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1358
Summary: Locks should be released as early as possible
Reviewers: msantl, florijan
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1357
Summary:
Use /var/lib/memgraph as home dir for memgraph user.
Merge post and pre RPM scripts in RPM spec file.
Properly handle memgraph.service and directory permissions.
Reviewers: mferencevic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1350
Summary:
- Removed a lot of stuff that was incorrect and/or unnecessary
- Fixed const-correctness in the skiplist family
Reviewers: dgleich, teon.banek, buda
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1351
Summary: Do it so properties-on-disk can be implemented.
Reviewers: buda, dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1343
Summary:
If someone was to call `EnsureDir` concurrently it might fail because
the `fs::create_directories` returns false when the directory already exists
with the `error_code` value being `0`.
Reviewers: dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1349
Summary:
After the commit log was cleared there was some transaction that tried to acquire
a lock on an object that was taken by a transaction that was not longer active on
the worker. Inquring the commit log about that transaction causes a crash since
the commit log is cleared of that transaction.
Solution is to clear the transaction cache before clearing the commit log, which
forces the transactions to release their locks and as such their ids will never be
queried through the commit log in the future.
Reviewers: florijan, msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1342
Summary:
Implemented cluster discovery in distributed memgraph.
When a worker registers, it sends a RPC request to master.
The master assigns that worker an id and sends the information about other
workers (pairs of <worker_id, endpoint>) to the new worker.
Master also sends the information about the new worker to all existing workers
in the process of worker registration.
After the last worker registers, all memgraph instances in the clusters should
know about every other.
Reviewers: mtomic, buda, florijan
Reviewed By: mtomic
Subscribers: teon.banek, dgleich, pullbot
Differential Revision: https://phabricator.memgraph.io/D1339
Summary: Make synchronized snapshot. This invokese the snapshooter on workers on the master snapshot scheduler interval.
Reviewers: msantl, mtomic
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1334
Summary:
Snapshots should have the transaction from which they were created because we need this info for recovery later on.
Otherwise we wouldn't be able to tell the workers from which snapshots to recover. The whole cluster should be recovered
from the same transaction snapshot.
Reviewers: msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1338
Summary: Adds a commit log garbage collector, which clears old transactions from the commit log
Reviewers: florijan
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1310
Summary:
When commiting/aborting a transaction in tx master engine, make a two
phase commit to all workers so they can stop all futures and clear
transactional cache.
Reviewers: dgleich, florijan
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1320
Summary:
Snapshot scheduler object was released from unique ptr and not actually freed, which
caused the snapshooter to access the tx_engine after it was destructed.
Reviewers: mferencevic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1325
Summary:
Master shouldn't stop processing rpc calls immediately on shutdown. It
should wait for all workers to stop, and then destroy itself.
Reviewers: dgleich, mferencevic
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1330
Summary:
Remove semicolon.
Semicolon shouldn't be used within macros and should
be explicitly provided by the user of the said macro.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1321
Summary:
The network layer now has a `Session` that handles all things that should be
done before the `Execute` method is called on sessions. Also, all sessions
now communicate using streams instead of holding the input buffer and writing
to the `Socket`. This design will allow implementation of a SSL middleware.
Reviewers: buda, dgleich
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1314
Summary:
Before we used `utils::Future` only where it's created by our `ThreadPool`.
I suggest in this diff that we use it everywhere, it's a bit more defensive and
should not have any downsides.
Reviewers: msantl, teon.banek
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1316
Summary:
Add different priority VLOGs for distributed memgraph.
For level 3 you'll get logs for dispatching/consuming plans.
For level 4 you'll get logs for tx start/commit/abort, remote produce, remote
pull, remote result consume,
For level 5 there will be a log for each request/response made by the RPC
client.
Master log snippet P9
Worker log snippet P10
Reviewers: florijan, teon.banek
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1296
Summary:
Remove "produce_" and "Produce" as prefix from all distributed stuff.
It's not removed in src/query/ stuff (operators).
Reviewers: dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1315
Summary:
When performing recovery, ensure that the transaction ID in engine is
bumped to one after the max tx id seen in recovery.
Reviewers: dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1312
Summary:
Find[Vertex|Edge] -> Find[Vertex|Edge]Optional
Find[Vertex|Edge]Checked -> Find[Vertex|Edge]
In some places change old code that finds-optional and immediately checks
to use the checked functionality.
It seems that in all the src/ stuff optional finds are no loger used,
only in tests, but there they are used extensively so I don't feel those
functions should be removed.
Reviewers: dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1309
Summary: This is correct, and uses less memory.
Reviewers: teon.banek, buda, mislav.bradac
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1304
Summary:
Also removed some convenience code that became obsolete.
No logic changes.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1303
Summary:
Writers preference is not guaranteed by the standard and as such
there are variations between implementations in different versions
of libphthread.
Reviewers: dgleich
Reviewed By: dgleich
Differential Revision: https://phabricator.memgraph.io/D1305
Summary:
Ensures that query plans are invalidated on the remote only when it's
guaranteed they will never be used on the master. This is done by
invalidating remote caches in the `CachedPlan` destructor.
There are two unplesant side-effects. First, an RPC call is made in an
object destructor. This is somewhat ugly, but not that different then
making an RPC call that must succeed in any other function. Note that
this does NOT slow down any query execution because the relevant
destructor is called by the skiplist garbage collector. The second ugly
side-effect is that in the unit test now we need to sleep to ensure the
skiplist GC destructs a cached plan before checking that it's
invalidated on the remote worker.
We might want to redesign this at some point.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1302
Summary:
Recursion in Version destructor was causing a SEGFAULT for
long version lists because of a stack overflow.
Reviewers: florijan
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1301
Summary:
- Remove caches on workers as a result of plan expiration or race
during insertion.
- Extract caching functionality into a class.
- Minor refactor of Interpreter::operator()
- New RPC and test for it.
- Rename ConsumePlanRes to DispatchPlanRes for consistency, remove
return value as it's always true and never used.
- Interpreter is now constructed with a `GraphDb` reference. At the
moment only for reaching the `distributed::PlanDispatcher`, but in
the future we should probably use that primarily for planning.
I added a function to `PlanConsumer` that is only used for testing.
I prefer not doing this, but I felt this needed testing. I can remove
it now if you like.
Reviewers: teon.banek, msantl
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1292
Summary:
Added a class that is supposed to be used as a base class on classes
where we want to see the stacktrace of destructor call.
Example of the output is P7 where classes `GraphDb` and `Client` extend
`LogDestructor` base.
Reviewers: florijan, teon.banek
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1281
Summary:
This fixes a bug where the planning would raise NotYetImplemented error,
due to preventing plan splitting while the operator is already on
master. For example, `MATCH (a), (b) CREATE (a)-[e:r]->(b) RETURN e`
would split the plan after Cartesian and before Create. Thus, the rest
of the plan would be on master, including the Accumulate before Produce.
We still can (and must) replace Accumulate with Synchronize but this
would fail due to unneeded check.
Reviewers: florijan, msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1272
Summary:
I have not thoroughly thought this through, especially the worker
destruction (is it legit to abort all running tx?), but it's tested to
abort during remote pull, what we need.
Also I improved error handling for vertex deletion failure during
remote pull (@dgleich).
Reviewers: teon.banek, msantl, dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1263