Summary:
The single-binary distributed tests didn't wait for the whole cluster to be up
and running before executing their test logic. Now every test ensures that all
workers are registered with the master and with all other workers before any
test logic starts.
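A minimal sketch of such a readiness check (the `GetWorkerIds` accessor and the
polling helper are illustrative, not the actual test fixture API):
```
#include <chrono>
#include <cstddef>
#include <thread>

// Illustrative: poll the master until the expected number of workers has
// registered, giving up after a timeout so a broken cluster fails fast.
template <class Master>
bool WaitForClusterUp(Master &master, size_t expected_workers,
                      std::chrono::seconds timeout = std::chrono::seconds{10}) {
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  while (std::chrono::steady_clock::now() < deadline) {
    // GetWorkerIds is a hypothetical accessor; + 1 accounts for the master.
    if (master.GetWorkerIds().size() == expected_workers + 1) return true;
    std::this_thread::sleep_for(std::chrono::milliseconds{100});
  }
  return false;
}
```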
Reviewers: mculinovic
Reviewed By: mculinovic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1724
Summary:
To clean the working directory after this diff you should execute:
```
rm src/database/counters_rpc_messages.capnp
rm src/database/counters_rpc_messages.hpp
rm src/database/serialization.capnp
rm src/database/serialization.hpp
```
Reviewers: teon.banek, msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1636
Summary:
This diff splits single node and distributed storage from each other.
Currently all of the storage code is copied into two directories (one single
node, one distributed). The logic used in the storage implementation isn't
touched; it will be refactored in the following diffs.
To clean the working directory after this diff you should execute:
```
rm database/state_delta.capnp
rm database/state_delta.hpp
rm storage/concurrent_id_mapper_rpc_messages.capnp
rm storage/concurrent_id_mapper_rpc_messages.hpp
```
Reviewers: teon.banek, buda, msantl
Reviewed By: teon.banek, msantl
Subscribers: teon.banek, pullbot
Differential Revision: https://phabricator.memgraph.io/D1625
Summary:
This diff changes the RPC layer to directly return `TResponse` to the user when
issuing a `Call<...>` RPC call. The call now throws an exception on failure
(instead of returning `nullopt` as before).
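In practice, call sites change roughly like this (a sketch; `EchoRpc` and the
exact exception type are illustrative):
```
// Before: every call site had to check an optional-like result.
//   auto maybe_response = client.Call<EchoRpc>(message);
//   if (!maybe_response) { /* handle failure */ }

// After: the response is returned directly and failures throw.
try {
  auto response = client.Call<EchoRpc>(message);
  // ... use response ...
} catch (const RpcFailedException &) {
  // Handle the communication failure.
}
```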
All servers (network, RPC and distributed) now have explicit `Shutdown`
methods so that a controlled shutdown can always be performed. The object
destructors now have `CHECK`s to enforce that the `AwaitShutdown` methods were
called.
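A minimal sketch of that lifecycle contract, assuming glog-style `CHECK`
(illustrative, not the exact server code):
```
#include <glog/logging.h>

class Server {
 public:
  ~Server() {
    CHECK(!alive_) << "Call Shutdown and AwaitShutdown before destruction!";
  }
  void Shutdown() { /* signal worker threads to stop accepting work */ }
  void AwaitShutdown() {
    /* join worker threads */
    alive_ = false;
  }

 private:
  bool alive_{true};
};
```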
The distributed memgraph is changed so that none of the binaries
(master/workers) crash when there is a communication failure. Instead, the
whole cluster starts a graceful shutdown when a persistent communication error
is detected.
Transient errors are allowed during execution. The transaction that errored out
will be aborted on the whole cluster. The cluster state is managed using a new
Heartbeat RPC call.
Reviewers: buda, teon.banek, msantl
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1604
Summary:
With this diff we should be able to support dynamic worker addition.
This is, of course, a minimal-effort, maximal-impact approach.
This diff introduces new RPC calls that are issued when a worker registers.
`DynamicWorkerAddition` doesn't use `GraphDbAccessor` to get indices because
accessors create WAL entries.
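A hedged sketch of what such a registration-time message pair might look like
(names and fields are illustrative, not the actual RPC definitions):
```
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Illustrative request/response for fetching the indices a newly added
// worker has to build, read directly from storage rather than through a
// GraphDbAccessor (which would write WAL entries).
struct DynamicWorkerReq {
  int16_t worker_id;
};

struct DynamicWorkerRes {
  // (label, property) index keys the worker should create locally.
  std::vector<std::pair<std::string, std::string>> indices_to_create;
};
```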
Reviewers: vkasljevic, ipaljak, buda
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1594
Summary:
This change introduces a pure virtual interface for the transaction
engine, which is then implemented in two versions: single node and distributed.
The interface classes now have the following hierarchy:
```
Engine (pure interface)
|
+----+---------- EngineDistributed (common logic)
| |
EngineSingleNode +-------+--------+
| |
EngineMaster EngineWorker
```
In addition to this layout the `EngineMaster` uses `EngineSingleNode` as its
underlying storage engine and only changes the necessary functions to make
them work with the `EngineWorker`.
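In C++ terms, the hierarchy above corresponds roughly to this sketch (method
names are trimmed to illustrative essentials):
```
class Transaction;

// Pure interface implemented by every engine.
class Engine {
 public:
  virtual ~Engine() = default;
  virtual Transaction *Begin() = 0;
  virtual void Commit(const Transaction &t) = 0;
  virtual void Abort(const Transaction &t) = 0;
};

// Self-contained implementation used by single-node memgraph.
class EngineSingleNode : public Engine { /* ... */ };

// Common logic shared by the distributed engines.
class EngineDistributed : public Engine { /* ... */ };

// The master delegates to an embedded EngineSingleNode and overrides only
// what must coordinate with the workers.
class EngineMaster : public EngineDistributed { /* EngineSingleNode local_; */ };
class EngineWorker : public EngineDistributed { /* ... */ };
```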
After this change I recommend that you delete the following leftover files:
```
rm src/distributed/transactional_cache_cleaner_rpc_messages.*
rm src/transactions/common.*
rm src/transactions/engine_rpc_messages.*
```
Reviewers: teon.banek, msantl, buda
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1589
Summary:
This change improves detection of erroneous situations when starting a
distributed cluster on a single machine. It asserts that the user hasn't
started more than one memgraph node on the same machine with the same
durability directory. Also, this diff improves worker registration: workers no
longer have to have explicitly set IP addresses, as the master will deduce them
from the connecting IP when a worker registers.
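A sketch of that fallback (the `Endpoint` and request types here are
simplified stand-ins for the actual network/RPC structures):
```
#include <cstdint>
#include <string>

// Illustrative types; the actual network/RPC structures differ.
struct Endpoint {
  std::string address;
  uint16_t port;
};
struct RegisterWorkerReq {
  std::string address;  // empty when the worker didn't set one explicitly
  uint16_t port;
};

// If the worker didn't set its own address, fall back to the address of
// the connection the registration request arrived on.
Endpoint WorkerEndpoint(const RegisterWorkerReq &req,
                        const Endpoint &connecting) {
  if (req.address.empty()) return {connecting.address, req.port};
  return {req.address, req.port};
}
```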
Reviewers: teon.banek, buda, msantl
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1582
Summary:
GraphDbAccessor is now constructed only through GraphDb. This allows the
concrete GraphDb to instantiate a concrete GraphDbAccessor, which in turn
lets us use virtual calls, so that the implementation may be kept
separate. The major downside of doing things this way is the heap allocation
of GraphDbAccessor. In case it turns out to be a real performance
issue, another solution with a pointer to a static implementation may be
used.
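A minimal sketch of the construction pattern (simplified signatures; the
factory method name is assumed):
```
#include <memory>

class GraphDbAccessor {
 public:
  virtual ~GraphDbAccessor() = default;
  // ... vertex/edge access methods ...
};

class GraphDb {
 public:
  virtual ~GraphDb() = default;
  // Each concrete GraphDb returns its own accessor; the heap allocation
  // mentioned above is the price of this virtual factory.
  virtual std::unique_ptr<GraphDbAccessor> Access() = 0;
};
```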
InsertVertexIntoRemote is now a non-member function, which reduces
coupling. It made no sense for it to be a member function because it used
only the public parts of GraphDbAccessor.
Reviewers: msantl, mtomic, mferencevic
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1504
Summary:
This change, hopefully, simplifies the implementation of different kinds
of GraphDb. The pimpl idiom is now simplified by removing all of the
crazy inheritance. Implementation classes are just plain data stores,
without any methods. The interface classes now have a flatter
hierarchy:
```
GraphDb (pure interface)
|
+----+---------- DistributedGraphDb (pure interface)
| |
Single Node +-----+------+
| |
Master Worker
```
DistributedGraphDb is used as an intermediate interface for all the
things that should work only in distributed mode. Therefore, virtual calls
for distributed functionality have been removed from GraphDb. Some are
exposed via DistributedGraphDb, others only in the concrete Master and
Worker classes. The code which relied on those virtual calls has been
refactored to either use DistributedGraphDb, take a pointer to what is
actually needed, or use dynamic_cast. Obviously, dynamic_cast is a
temporary solution and should be replaced with another mechanism (e.g. a
virtual call or some other function-pointer style).
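The temporary escape hatch looks roughly like this sketch (`RemoteWork` is a
hypothetical distributed-only method):
```
void DoDistributedPart(GraphDb &db) {
  // Temporary: downcast to reach distributed-only functionality. To be
  // replaced by a virtual call or a passed-in pointer to the needed part.
  if (auto *distributed = dynamic_cast<DistributedGraphDb *>(&db)) {
    distributed->RemoteWork();
  }
  // A single-node GraphDb has nothing distributed to do.
}
```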
The cost of the above change is some code duplication in constructors
and destructors of classes. This duplication has a lot of little tweaks
that make it hard to generalize, not to mention that virtual calls do
not work in constructors and destructors. If we really care about
generalizing this, we should think about abandoning RAII in favor of
constructor + Init method.
The logical next steps for splitting the dependencies are:
1) Split GraphDbAccessor implementation, either via inheritance or
passing in an implementation pointer. GraphDbAccessor should then
only be created by a virtual call on GraphDb.
2) Split Interpreter implementation. Besides allowing single node
interpreter to exist without depending on distributed, this will
enable the planner and operators to be correctly separated.
Reviewers: msantl, mferencevic, ipaljak
Reviewed By: msantl
Subscribers: dgleich, pullbot
Differential Revision: https://phabricator.memgraph.io/D1493
Summary:
The WAL on workers didn't contain committed transaction IDs. These are needed
for distributed recovery so that the master can decide which transactions are
present on all the workers.
Reviewers: buda, msantl
Reviewed By: buda
Subscribers: pullbot, msantl, buda
Differential Revision: https://phabricator.memgraph.io/D1440
Summary: Make snapshots synchronized. This invokes the snapshooter on workers at the master's snapshot scheduler interval.
Reviewers: msantl, mtomic
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1334
Summary:
When committing/aborting a transaction in the master transaction engine,
perform a two-phase commit to all workers so they can stop all futures and
clear their transactional caches.
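A hedged sketch of the notification flow (`StopFuturesRpc`, `ClearCacheRpc`
and `worker_clients_` are stand-ins for the actual names):
```
// Illustrative two-phase notification on commit/abort.
void MasterEngine::NotifyWorkersOnTransactionEnd(uint64_t tx_id) {
  // Phase 1: every worker stops all futures tied to the transaction.
  for (auto &worker : worker_clients_) worker.Call<StopFuturesRpc>(tx_id);
  // Phase 2: every worker clears its transactional cache.
  for (auto &worker : worker_clients_) worker.Call<ClearCacheRpc>(tx_id);
}
```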
Reviewers: dgleich, florijan
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1320
Summary:
Remove "produce_" and "Produce" as prefix from all distributed stuff.
It's not removed in src/query/ stuff (operators).
Reviewers: dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1315
Summary:
- The expansion vertex gets created on the origin's worker
- The edge automatically gets created wherever necessary
- Vertex creation logic reuse
- End-to-end test
Reviewers: teon.banek, msantl
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1227
Summary:
- Add the said test
- Add `workerid` function
- Add random wait (enabled with a flag) to `rpc::Client::Call` for network
latency simulation (see the sketch below)
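A minimal sketch of such an injected wait (the flag wiring, delay range and
distribution are illustrative):
```
#include <chrono>
#include <random>
#include <thread>

// Illustrative: called at the start of rpc::Client::Call to simulate
// network latency when the (hypothetical) random-latency flag is set.
void MaybeSimulateLatency(bool enabled) {
  if (!enabled) return;
  static thread_local std::mt19937 gen{std::random_device{}()};
  std::uniform_int_distribution<int> delay_ms(0, 100);
  std::this_thread::sleep_for(std::chrono::milliseconds(delay_ms(gen)));
}
```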
Reviewers: buda, mferencevic, teon.banek
Reviewed By: buda, teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1222
Summary:
This may improve the query execution workload, because the Expand will
yield local results while it waits for remote ones. Note that we rely on
the fact that walking the graph produces results without any predetermined
order. Therefore, we can yield paths as we see fit. Enforcing the order
is done through the OrderBy operator, and this change shouldn't affect that.
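Conceptually the pull loop becomes something like this sketch (names are
illustrative; the real cursor logic is more involved):
```
// Illustrative: prefer locally available expansions and only block on
// remote futures once local results are exhausted.
bool ExpandCursor::Pull(Frame &frame) {
  while (true) {
    if (PullLocal(frame)) return true;        // yield local results first
    if (PullReadyRemote(frame)) return true;  // take whatever has arrived
    if (!AnyRemotePending()) return false;    // everything is drained
    WaitForAnyRemote();                       // finally, block
  }
}
```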
Reviewers: florijan, msantl, buda
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1215
Summary:
Implementation of remote vertex and edge creation. This diff addresses
the creation API (`GraphDbAccessor::InsertEdge`,
`GraphDbAccessor::InsertRemoteVertex`) and the necessary RPC and
`RemoteCache` stuff.
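Illustrative usage of the creation API described above (the signatures are
assumed, not exact):
```
// Create a vertex on a chosen remote worker, then connect it to a local
// vertex; the RPC and RemoteCache machinery is hidden behind the accessor.
auto remote_vertex = dba.InsertRemoteVertex(target_worker_id);
auto edge = dba.InsertEdge(local_vertex, remote_vertex, edge_type);
```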
What is missing for full remote creation support are
`query::plan::operator` changes that are expected to be minor. Pushing this
diff as it's large enough; operator and end-to-end tests will come in the next
diff.
Also, the naming of existing structures and files is confusing ("update"
referring to both updates and creations, `results` used too often, etc.). I
will address this too, but feel free to comment on bad naming.
Reviewers: dgleich, teon.banek, msantl
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1210
Summary:
We have been using `Edges::VertexAddress` and `Edges::EdgeAddress` a lot
in other parts of the codebase because it's cleaner to write than
`Address<mvcc::VersionList<Edge>>`, especially in code that should not
really be MVCC-aware. However, a lot of that code should not really be
`Edges`-aware either, as that's a storage data structure that should not
be exposed.
This became annoying, so I extracted these addresses into a type-file. I
don't really like this approach; it might be better to have
`Vertex::Address` and `Edge::Address`, but that means we'd have to
import those headers and we'd get circular dependencies.
“The horror! The horror!”
- Joseph Conrad, Heart of Darkness
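For reference, the extracted type-file boils down to aliases along these
lines (a sketch; the forward declarations stand in for the real headers):
```
// Illustrative sketch of the extracted type-file. Address, VersionList,
// Vertex and Edge are the existing storage types referenced above.
template <typename T> class Address;
namespace mvcc { template <typename T> class VersionList; }
class Vertex;
class Edge;

using VertexAddress = Address<mvcc::VersionList<Vertex>>;
using EdgeAddress = Address<mvcc::VersionList<Edge>>;
```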
Reviewers: teon.banek, buda
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1204
Summary:
Updates are supported; insertions and removals are not in this diff. The
test is a bit overdesigned, it happens.
Reviewers: teon.banek, dgleich, msantl
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1176
Summary:
Defined a new test, based on a reported bug, for multiple remote expansion.
Fixed the bug. Introduced minor refactors in distributed unit testing.
Reviewers: mculinovic, dgleich
Reviewed By: mculinovic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1173
Summary:
This diff consolidates local and remote update handling. It ensures and
tests that updates for remote elements are visible locally (on the
updating worker).
The next part will be accumulating remote updates and applying them on
the owner.
Also extracted a common testing fixture.
Reviewers: dgleich, buda, mtomic
Reviewed By: mtomic
Subscribers: mtomic, pullbot
Differential Revision: https://phabricator.memgraph.io/D1169