Summary:
Instead of DataManager returning Cache which can fetch data when needed,
I refactored the code so that the Cache is simple wrapper around
unordered_map and the DataManager is one that is fetching data. Also Cache
is not visible from outside of the DataManager so we can add LRU policy
without changing anything else.
Reviewers: msantl, ipaljak, teon.banek, buda
Reviewed By: msantl, teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1545
Summary:
Since we switched to Cap'n Proto serialization there's no need for
keeping boost around anymore.
Reviewers: mtomic, mferencevic, buda
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1515
Summary:
GraphDbAccessor is now constructed only through GraphDb. This allows the
concrete GraphDb to instantiate a concrete GraphDbAccessor. This allows
us to use virtual calls, so that the implementation may be kept
separate. The major downside of doing things this way is heap allocation
of GraphDbAccessor. In case it turns out to be a real performance
issues, another solution with pointer to static implementation may be
used.
InsertVertexIntoRemote is now a non-member function, which reduces
coupling. It made no sense for it to be member function because it used
only the public parts of GraphDbAccessor.
Reviewers: msantl, mtomic, mferencevic
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1504
Summary:
This change, hopefully, simplifies the implementation of different kinds
of GraphDb. The pimpl idiom is now simplified by removing all of the
crazy inheritance. Implementations classes are just plain data stores,
without any methods. The interface classes now have a more flat
hierarchy:
```
GraphDb (pure interface)
|
+----+---------- DistributedGraphDb (pure interface)
| |
Single Node +-----+------+
| |
Master Worker
```
DistributedGraphDb is used as an intermediate interface for all the
things that should work only in distributed. Therefore, virtual calls
for distributed stuff have been removed from GraphDb. Some are exposed
via DistributedGraphDb, other's are only in concrete Master and Worker
classes. The code which relied on those virtual calls has been
refactored to either use DistributedGraphDb, take a pointer to what is
actually needed or use dynamic_cast. Obviously, dynamic_cast is a
temporary solution and should be replaced with another mechanism (e.g.
virtual call, or some other function pointer style).
The cost of the above change is some code duplication in constructors
and destructors of classes. This duplication has a lot of little tweaks
that make it hard to generalize, not to mention that virtual calls do
not work in constructor and destructor. If we really care about
generalizing this, we should think about abandoning RAII in favor of
constructor + Init method.
The next steps for splitting the dependencies that seem logical are:
1) Split GraphDbAccessor implementation, either via inheritance or
passing in an implementation pointer. GraphDbAccessor should then
only be created by a virtual call on GraphDb.
2) Split Interpreter implementation. Besides allowing single node
interpreter to exist without depending on distributed, this will
enable the planner and operators to be correctly separated.
Reviewers: msantl, mferencevic, ipaljak
Reviewed By: msantl
Subscribers: dgleich, pullbot
Differential Revision: https://phabricator.memgraph.io/D1493
Summary:
This change should preclude the need to specify `:capnp-save` and
`:capnp-load` functions for regularly saved elements of `std::vector<T>`
and `std::optional<T>`. Regular saving in this context means saving
primitive types or compound types which have a `Save(capnp::Builder *)`
method.
Reviewers: mtomic, msantl
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1497
Summary:
During the creation of indexes there could be a case in which a vertex contains a label/property but is not a part of index after
index building completes.
This happens if vertices are being inserted while the index is being built.
Reviewers: buda, msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1484
Summary:
Wal on workers didn't contain committed transactions ids, this is needed for
distributed recovery so that the master may decide which transactions are
present on all the workers.
Reviewers: buda, msantl
Reviewed By: buda
Subscribers: pullbot, msantl, buda
Differential Revision: https://phabricator.memgraph.io/D1440
Summary:
Converts the RPC stack to use Cap'n Proto for serialization instead of
boost. There are still some traces of boost in other places in the code,
but most of it is removed. A future diff should cleanup boost for good.
The RPC API is now changed to be more flexible with regards to how
serialize data. This makes the simplest cases a bit more verbose, but
allows complex serialization code to be correctly written instead of
relying on hacks. (For reference, look for the old serialization of
`PullRpc` which had a nasty pointer hacks to inject accessors in
`TypedValue`.)
Since RPC messages were uselessly modeled via inheritance of Message
base class, that class is now removed. Furthermore, that approach
doesn't really work with Cap'n Proto. Instead, each message type is
required to have some type information. This can be automated, so
`define-rpc` has been added to LCP, which hopefully simplifies defining
new RPC request and response messages.
Specify Cap'n Proto schema ID in cmake
This preserves Cap'n Proto generated typeIds across multiple generations
of capnp schemas through LCP. It is imperative that typeId stays the
same to ensure that different compilations of Memgraph may communicate
via RPC in a distributed cluster.
Use CLOS for meta information on C++ types in LCP
Since some structure slots and functions have started to repeat
themselves, it makes sense to model C++ meta information via Common Lisp
Object System.
Depends on D1391
Reviewers: buda, dgleich, mferencevic, mtomic, mculinovic, msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1407
Summary:
Utils source files are now moved to a standalone mg-utils library.
Unit and manual tests are no longer collected using glob recursion in
cmake, but are explicitly listed. This allows us to set only required
dependencies of those tests.
Both of these changes should improve compilation and link times, as well
as lower the disk usage.
Additional improvement would be to cleanup utils header files to be
split in .hpp and .cpp as well as merging threading into utils. Other
potential library extractions that shouldn't be difficult are:
* data_structures
* io/network
* communication
Reviewers: buda, mferencevic, dgleich, ipaljak, mculinovic, mtomic, msantl
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1408
Summary:
Command id is necessary in remote produce to identify an ongoing pull
because a transaction can have multiple commands that all belong under
the same plan and tx id.
Reviewers: teon.banek, mtomic, buda
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1386
Summary:
A simplified end-to-end implementation
POD interface set-up, still have bugs with HDDkeys
Version bug fix and first iterator implementation
Fixed out-of-scope reference in PVS iterator
Added PVS unit tests
Reviewers: buda, mferencevic, dgleich, teon.banek
Reviewed By: buda, dgleich
Subscribers: mferencevic, pullbot
Differential Revision: https://phabricator.memgraph.io/D1369
Summary: Global address of `from` was stored in Edge record when created via CreateEdge RPC.
Reviewers: buda, msantl, dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1387
Summary:
Implemented cluster discovery in distributed memgraph.
When a worker registers, it sends a RPC request to master.
The master assigns that worker an id and sends the information about other
workers (pairs of <worker_id, endpoint>) to the new worker.
Master also sends the information about the new worker to all existing workers
in the process of worker registration.
After the last worker registers, all memgraph instances in the clusters should
know about every other.
Reviewers: mtomic, buda, florijan
Reviewed By: mtomic
Subscribers: teon.banek, dgleich, pullbot
Differential Revision: https://phabricator.memgraph.io/D1339
Summary: Make synchronized snapshot. This invokese the snapshooter on workers on the master snapshot scheduler interval.
Reviewers: msantl, mtomic
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1334
Summary: Adds a commit log garbage collector, which clears old transactions from the commit log
Reviewers: florijan
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1310
Summary:
When commiting/aborting a transaction in tx master engine, make a two
phase commit to all workers so they can stop all futures and clear
transactional cache.
Reviewers: dgleich, florijan
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1320
Summary:
Master shouldn't stop processing rpc calls immediately on shutdown. It
should wait for all workers to stop, and then destroy itself.
Reviewers: dgleich, mferencevic
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1330
Summary:
Remove "produce_" and "Produce" as prefix from all distributed stuff.
It's not removed in src/query/ stuff (operators).
Reviewers: dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1315
Summary:
Find[Vertex|Edge] -> Find[Vertex|Edge]Optional
Find[Vertex|Edge]Checked -> Find[Vertex|Edge]
In some places change old code that finds-optional and immediately checks
to use the checked functionality.
It seems that in all the src/ stuff optional finds are no loger used,
only in tests, but there they are used extensively so I don't feel those
functions should be removed.
Reviewers: dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1309
Summary:
Also removed some convenience code that became obsolete.
No logic changes.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1303
Summary:
Ensures that query plans are invalidated on the remote only when it's
guaranteed they will never be used on the master. This is done by
invalidating remote caches in the `CachedPlan` destructor.
There are two unplesant side-effects. First, an RPC call is made in an
object destructor. This is somewhat ugly, but not that different then
making an RPC call that must succeed in any other function. Note that
this does NOT slow down any query execution because the relevant
destructor is called by the skiplist garbage collector. The second ugly
side-effect is that in the unit test now we need to sleep to ensure the
skiplist GC destructs a cached plan before checking that it's
invalidated on the remote worker.
We might want to redesign this at some point.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1302
Summary:
- Remove caches on workers as a result of plan expiration or race
during insertion.
- Extract caching functionality into a class.
- Minor refactor of Interpreter::operator()
- New RPC and test for it.
- Rename ConsumePlanRes to DispatchPlanRes for consistency, remove
return value as it's always true and never used.
- Interpreter is now constructed with a `GraphDb` reference. At the
moment only for reaching the `distributed::PlanDispatcher`, but in
the future we should probably use that primarily for planning.
I added a function to `PlanConsumer` that is only used for testing.
I prefer not doing this, but I felt this needed testing. I can remove
it now if you like.
Reviewers: teon.banek, msantl
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1292
Summary:
I have not thoroughly thought this through, especially the worker
destruction (is it legit to abort all running tx?), but it's tested to
abort during remote pull, what we need.
Also I improved error handling for vertex deletion failure during
remote pull (@dgleich).
Reviewers: teon.banek, msantl, dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1263