Summary:
Implementation of remote vertex and edge creation. This diff addresses
the creation API (`GraphDbAccessor::InsertEdge`,
`GraphDbAccessor::InsertRemoteVertex`) and the necessary RPC and
`RemoteCache` machinery.
What is missing for full remote creation support are
`query::plan::operator` changes that are expected to be minor. Pushing
this diff as it's already large enough; operator and end-to-end tests
will follow in the next one.
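For orientation, a hypothetical usage sketch of the creation API (the
names follow this summary, but the signatures are assumptions):
```cpp
// Hypothetical sketch, not the exact interface: `dba` is a
// GraphDbAccessor in a distributed transaction.
auto local = dba.InsertVertex();                  // created on this worker
auto remote = dba.InsertRemoteVertex(worker_id);  // created on `worker_id` via RPC
// The edge creation API covers endpoints living on different workers.
auto edge = dba.InsertEdge(local, remote, dba.EdgeType("knows"));
```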
Also, the naming of existing structures and files is confusing
(`update` referring to both updates and creations, `results` used too
often, etc.). I will address this too, but feel free to comment on bad
naming.
Reviewers: dgleich, teon.banek, msantl
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1210
Summary:
On the master cleanups are hooked directly into the transaction engine.
This is beneficial because the master might have bigger caches and we
want to clear them as soon as possible.
On the workers there is a periodic RPC call to the master about living
transactions, which takes care of releasing local caches. This is
suboptimal because long transactions will prevent cache GC (like with
data GC). It is however fairly simple.
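A sketch of the worker-side loop this describes (class and RPC names
are assumptions):
```cpp
// Periodically ask the master which transactions are still alive and
// release cached data for everything else. A long transaction stays in
// the living set, which is exactly the GC limitation noted above.
void CacheGarbageCollector::Run() {
  while (keep_running_) {
    std::this_thread::sleep_for(std::chrono::milliseconds(200));
    auto living = master_client_.Call<LivingTransactionsRpc>();
    remote_cache_.ClearCachesExcept(living.transaction_ids);
  }
}
```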
Note that all cleanup is now done automatically, and `RemotePull` has
been reduced accordingly. @msantl, please verify correctness and
consider if the code can be additionally simplified.
Reviewers: teon.banek, msantl
Reviewed By: msantl
Subscribers: pullbot, msantl
Differential Revision: https://phabricator.memgraph.io/D1202
Summary:
We have been using `Edges::VertexAddress` and `Edges::EdgeAddress` a lot
in other parts of the codebase because it's cleaner to write than
`Address<mvcc::VersionList<Edge>>`, especially in code that should not
really be MVCC-aware. However, a lot of that code should not really be
`Edges`-aware either, as that's a storage data structure that should not
be exposed.
This became annoying, so I extracted these addresses into a type-file. I
don't really like this approach; it might be better to have
`Vertex::Address` and `Edge::Address`, but that means we'd have to
import those headers and we'd get circular dependencies.
“The horror! The horror!”
- Joseph Conrad, Heart of Darkness
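In essence the type-file reduces to a pair of aliases; a sketch (the
header path and namespace are assumptions):
```cpp
// storage/types.hpp -- hypothetical extracted type-file.
#include "mvcc/version_list.hpp"
#include "storage/address.hpp"

class Vertex;
class Edge;

namespace storage {
using VertexAddress = Address<mvcc::VersionList<Vertex>>;
using EdgeAddress = Address<mvcc::VersionList<Edge>>;
}  // namespace storage
```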
Reviewers: teon.banek, buda
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1204
Summary:
Remote caches used to be owned by `GraphDbAccessor`. An advantage of
that was immediate cleanup on destruction. A disadvantage was sharing
the remote cache between multiple program flows in the same transaction
in distributed execution (one would have to share the accessor).
We will have to do post-transactional global cleanup anyway, since we
leak, which reduces the stated advantage. And the stated disadvantage
is becoming more and more pronounced as additional components need
access to the remote cache.
Hence the refactor.
Reviewers: buda, teon.banek, msantl
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1186
Summary:
Updates are supported in this diff; insertions and removals are not.
The test is a bit overdesigned, but it happens.
Reviewers: teon.banek, dgleich, msantl
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1176
Summary:
Refactor in two ways. First, expose members directly, without getters,
as we will need most of them in distributed; this was always the
sensible thing to do. Second, add storage type values to deltas. This
is also sensible, and it will be very beneficial in distributed. We
didn't do it before because name<->value type mappings aren't guaranteed
to be the same after recovery. A task has been added to address this
(preserve mappings in durability).
Reviewers: dgleich, buda
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1167
Summary:
A hack worthy of young master Gleich. I *think* it's correct though,
and the tests pass. End-to-end cluster recovery testing will be written
and tried out by @mculinovic.
Reviewers: dgleich, mculinovic
Reviewed By: dgleich
Subscribers: pullbot, mculinovic
Differential Revision: https://phabricator.memgraph.io/D1163
Summary:
Change `GraphDb` so it exposes index clients in the same
convention as other members.
Reviewers: dgleich, mculinovic
Reviewed By: mculinovic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1159
Summary: `PullRemoteCursor` pulls from all clients in a round-robin fashion until every client is exhausted and there are no more results to return.
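A simplified sketch of the strategy (types reduced to ints; the real
cursor pulls frames of query results over RPC):
```cpp
#include <optional>
#include <vector>

// Simplified stand-in for an RPC client pulling one result at a time.
struct Client {
  std::optional<int> PullOne();  // nullopt once exhausted
};

std::optional<int> RoundRobinPull(std::vector<Client> &clients,
                                  size_t &next) {
  while (!clients.empty()) {
    if (auto result = clients[next].PullOne()) {
      next = (next + 1) % clients.size();  // move on to the next client
      return result;
    }
    clients.erase(clients.begin() + next);  // this client is exhausted
    if (!clients.empty()) next %= clients.size();
  }
  return std::nullopt;  // all clients exhausted, no more results
}
```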
Reviewers: teon.banek, florijan
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1147
Summary:
It is possible that we have a global address to resolve for a graph
element that's actually local. Consider W1 expanding, getting data from
W2, expanding from there and getting data that is on W1. We then don't
want to do an RPC from W1 to W1, but rather a direct local lookup.
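The shortcut amounts to a worker-id check before going over the
network (a sketch with assumed names):
```cpp
// Resolve a global address: if it points into this worker's storage,
// look it up directly instead of issuing an RPC to ourselves.
VertexAccessor GraphDbAccessor::Resolve(storage::VertexAddress address) {
  if (address.worker_id() == db_.WorkerId())
    return VertexAccessor(address.local(), *this);  // direct local lookup
  return remote_cache_.Get(*this, address);         // RPC to the owner
}
```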
Reviewers: dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1145
Summary:
NOTE: This diff is still in progress. Many TODOs, lacking
documentation, etc. But the main logic is there (some of it could be
different), and it tests out OK.
Reviewers: teon.banek, msantl, buda
Reviewed By: teon.banek
Subscribers: mferencevic, pullbot
Differential Revision: https://phabricator.memgraph.io/D1138
Summary:
Start removal of old logic
Remove more obsolete classes
Move Message class to RPC
Remove client logic from system
Remove messaging namespace
Move protocol from messaging to rpc
Move System from messaging to rpc
Remove unnecessary namespace
Remove System from RPC Client
Split Client and Server into separate files
Start implementing new client logic
First semi-working state
Changed network protocol layout
Rewrite client
Fix client receive bug
Cleanup code of debug lines
Migrate to accessors
Migrate back to binary boost archives
Remove debug logging from server
Disable timeout test
Reduce message_id from uint64_t to uint32_t
Add multiple workers to server
Fix compiler warnings
Apply clang-format
Reviewers: teon.banek, florijan, dgleich, buda, mtomic
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1129
Summary:
- End to end distributed GraphDb testing
- Refactors as necessary
- Basic RemoteCache for storing remote data
- RemoteDataRpc
As we are on a tight schedule, let's please focus on the essentials:
functionality and proper testing.
Reviewers: dgleich, teon.banek, buda
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1121
Summary:
A slight insanity here... I realized I will need to create
`GraphDbAccessor` instances (which need a `GraphDb &`) within some
members of `::impl` classes. Within those classes I can pass `this` to
those members, if `this` is a valid `GraphDb`. Semantically it really is
(at the moment), but hierarchically it wasn't. This diff changes that.
`GraphDb` is now only an interface. `PublicBase` is the base for all
the public classes, `PrivateBase` for the `::impl` classes. Seems to
work.
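Roughly, the double hierarchy looks like this (an illustrative sketch,
not the exact code):
```cpp
class GraphDb {  // pure interface; `this` is a valid GraphDb everywhere
 public:
  virtual tx::Engine &tx_engine() = 0;
  virtual ~GraphDb() {}
};

namespace impl {
// Base for the `::impl` classes: owns the members and their wiring, so
// it can hand `*this` to members that need a `GraphDb &`.
class PrivateBase : public GraphDb { /* ... */ };
}  // namespace impl

// Base for the public classes; delegates to the impl it owns.
class PublicBase : public GraphDb {
 public:
  tx::Engine &tx_engine() override { return impl_->tx_engine(); }

 private:
  std::unique_ptr<impl::PrivateBase> impl_;
};
```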
Oh yes, another thing to keep in mind when doing this: I should avoid
calling virtual functions during construction of the public classes
(the motivation for the double hierarchy). Before this diff the getters
weren't virtual; now they are, so all the appropriate changes had to be
made in the code as well.
Buda, was this a task I could have delegated to you or Cula?
Reviewers: teon.banek, dgleich, buda
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1120
Summary:
Adds the worker id to snapshot and WAL filenames.
Adds a new worker_id flag to be used for recovering a worker with a
distributed snapshot.
Adds a worker_id field to the snapshot to check for consistency.
Reviewers: florijan
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1096
Summary:
GraphDb is refactored to become an API exposing different parts
necessary for the database to function. These different parts can have
different implementations in SingleNode or distributed Master/Server
GraphDb implementations.
Interally GraphDb is implemented using two class heirarchies. One
contains all the members and correct wiring for each situation. The
other takes care of initialization and shutdown. This architecture is
practical because it can guarantee that the initialization of the
object structure is complete, before initializing state.
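A minimal illustration of that guarantee (hypothetical names): keeping
the members in one base class means they are fully constructed before
the initialization logic in the derived constructor body runs:
```cpp
// All members, correctly wired for one configuration.
struct MasterMembers {
  tx::MasterEngine tx_engine;
  Storage storage;
};

class Master final : public GraphDb, private MasterMembers {
 public:
  Master() {
    // The MasterMembers base is fully constructed before this body
    // runs, so starting state (servers, threads) cannot observe a
    // partially built object structure.
    StartServer();
  }
  ~Master() { StopServer(); }
};
```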
Reviewers: buda, mislav.bradac, dgleich, teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1093
Summary:
No logic changes, just split `tx::MasterEngine` into
`tx::SingleNodeEngine` and `tx::MasterEngine`. This gives better
responsibility separation and is more appropriate now that there is no
Start/Shutdown.
Reviewers: dgleich, teon.banek, buda
Reviewed By: dgleich, teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1099
Summary:
Serialization of vertices and edges for distributed, based on Boost
serialization. Therefore, TypedValue serialization was moved from the
AST to utils.
Reviewers: buda, dgleich, teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1088
Summary:
This is a proposal on how the WAL recovery process can be implemented so
that Deltas aren't accumulated, but instead applied in the same order
they are written to the WAL.
I *believe* that the only additional requirement on the system is
atomic transaction Begin/Commit/Abort. By atomic I mean that they are
present in the WAL in exactly the same order as in the transaction
engine, to ensure the same commitability of the original and recovery
transactions.
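The replay loop this implies, sketched with assumed delta and engine
names:
```cpp
// Apply WAL deltas in exactly the order they were written, keeping one
// live recovery transaction per transaction encountered in the WAL.
std::unordered_map<tx::TransactionId, tx::Transaction *> live;
for (const auto &delta : wal_deltas) {
  switch (delta.type) {
    case Delta::Type::TRANSACTION_BEGIN:
      live[delta.transaction_id] = engine.Begin();
      break;
    case Delta::Type::TRANSACTION_COMMIT:
      engine.Commit(*live[delta.transaction_id]);
      live.erase(delta.transaction_id);
      break;
    case Delta::Type::TRANSACTION_ABORT:
      engine.Abort(*live[delta.transaction_id]);
      live.erase(delta.transaction_id);
      break;
    default:  // a regular state-change delta
      ApplyStateDelta(*live[delta.transaction_id], delta);
  }
}
```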
This could be a requirement for HA recovery. It is desirable that WAL
and HA log become the same thing, and the recovery process too.
Reviewers: mtomic, dgleich, mislav.bradac
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1068
Summary:
Although the first solution used cereal, the final implementation uses
boost. Since cereal is still used in the codebase, compilation has
been modified to support multithreaded cereal.
In addition to serializing Ast classes, the following also needed to be
serialized:
* GraphDbTypes
* Symbol
* TypedValue
TypedValue is treated specially, by inlining the serialization code in
the Ast class, concretely PrimitiveLiteral.
Another special case was the Function Ast class, which now stores a
function name that is resolved to a concrete std::function on
construction.
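A sketch of the PrimitiveLiteral special case (simplified; accessor
names are assumptions and the real code covers all TypedValue types):
```cpp
#include <boost/serialization/access.hpp>
#include <boost/serialization/split_member.hpp>
#include <boost/serialization/string.hpp>

class PrimitiveLiteral {
 public:
  TypedValue value_;

 private:
  friend class boost::serialization::access;

  // TypedValue serialization is inlined here instead of living on
  // TypedValue itself.
  template <class TArchive>
  void save(TArchive &ar, const unsigned int) const {
    auto type = value_.type();
    ar << type;
    switch (type) {
      case TypedValue::Type::Int: {
        const auto value = value_.Value<int64_t>();
        ar << value;
        break;
      }
      case TypedValue::Type::String: {
        const auto value = value_.Value<std::string>();
        ar << value;
        break;
      }
      // ... the remaining TypedValue types are handled analogously ...
      default:
        throw std::runtime_error("Unsupported TypedValue type");
    }
  }

  template <class TArchive>
  void load(TArchive &ar, const unsigned int) { /* mirror of save */ }

  BOOST_SERIALIZATION_SPLIT_MEMBER()
};
```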
Tests have been added for the serialized Ast in
tests/unit/cypher_main_visitor.
Reviewers: mferencevic, mislav.bradac, florijan
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1067
Summary:
The distributed ID mapper is not yet utilised in GraphDb as those
changes are in D1060. Depending on landing order it will be added.
Reviewers: dgleich, mislav.bradac
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1064
Summary:
What's done:
- `RecordAccessor` can represent remote data
- `GraphDbAccessor` manages remote data
- Cleanup: different `EdgeAccessor` laziness (@dgleich: take a look), unused methods, documentation...
- `TODO` placeholders for remote implementation
What's not done:
- RPC and data transfer
- how exactly remote errors are handled
- not sure if any MVCC Record info for remote data should be tracked
- WAL and RPC Deltas properly handled (Gleich working on extracting `Wal::Op`)
This implementation should not break single-node execution, and should provide good abstractions and placeholders for distributed. Once that's satisfied, it should land.
Reviewers: dgleich, buda, mislav.bradac
Reviewed By: dgleich
Subscribers: dgleich, pullbot
Differential Revision: https://phabricator.memgraph.io/D1030
Summary: Operations are moved and renamed from WAL to a separate file in preparation for HA and distributed storage.
Reviewers: florijan, mtomic, mislav.bradac
Reviewed By: florijan
Subscribers: mislav.bradac, pullbot
Differential Revision: https://phabricator.memgraph.io/D1034
Summary:
The current idea is that the same MG binary can be used for single-node,
distributed master and distributed worker. The transactional engine in
the single-node and distributed master is the same: it determines the
transactional time and exposes all the "global" functionalities. In the
distributed worker the "global" functions must contact the master.
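A sketch of the engine split this implies (names are assumptions):
single-node and master share the engine that determines transactional
time, while the worker engine forwards "global" operations to the
master over RPC:
```cpp
class Engine {
 public:
  virtual tx::Transaction *Begin() = 0;
  virtual void Commit(tx::Transaction &t) = 0;
  virtual ~Engine() {}
};

class MasterEngine : public Engine {
  // Owns the transactional counters and the commit log; also answers
  // the workers' RPC requests. Used by single-node and master alike.
};

class WorkerEngine : public Engine {
 public:
  tx::Transaction *Begin() override {
    // A "global" function: must ask the master for a transaction id
    // and snapshot.
    auto response = master_client_.Call<BeginRpc>();
    return CreateLocalTransaction(response.id, response.snapshot);
  }
};
```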
Reviewers: dgleich, mislav.bradac, buda
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1013
Summary: Removing this because it will never be used; we already have replacements for it.
Reviewers: buda
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1016
Summary: Code simplification made possible by making `locks_` `mutable` in `tx::Transaction`.
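The gist of the simplification (a minimal illustration with assumed
member names): taking a lock doesn't change a transaction's logical
state, so a `mutable` lock store lets lock acquisition happen through
const references:
```cpp
class Transaction {
 public:
  // Callable on a const Transaction &, removing the need for
  // const_casts / non-const plumbing at call sites.
  void TakeLock(RecordLock &lock) const { locks_.Take(&lock); }

 private:
  // Lock bookkeeping, not logical transaction state; hence `mutable`.
  mutable LockStore locks_;
};
```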
Reviewers: dgleich, buda
Reviewed By: buda
Differential Revision: https://phabricator.memgraph.io/D1015