Summary:
It is possible that boost's archive didn't write all data to a stream
before the archive is destroyed. To make sure everything works, archives
are now wrapped in a nested block.
Additionally, there may be some issues if the stream isn't set for
binary reading/writing due to new line character conversion. This
shouldn't pose problems on Linux, but just to make sure it doesn't bite
us in the a$% down the line.
Reviewers: mferencevic, mculinovic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1194
Summary:
The same behaviour can now be achieved by passing a nullptr for input
operator.
Reviewers: florijan, msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1195
Summary:
Remote caches used to be owned by `GraphDbAccessor`. An advantage of
that was immediate cleanup when destructing. A disadvantage was sharing
the remote cache between mutliple program-flows in the same transaction
in distributed (one would have to share the accessor).
We will have to do post-transactional global cleanup anyway, since we
leak, which reduces the above stated advantage. And the stated
disadvantage is becoming more and more pronounced as additional
components need access to the remote cache.
Hence the refactor.
Reviewers: buda, teon.banek, msantl
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1186
Summary:
Updates are supported, insertions and removals not in this diff. The
test is a bit overdesigned, it happens.
Reviewers: teon.banek, dgleich, msantl
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1176
Summary:
Distributed plan is now dispatched before inserting into a cache. This
prevents a race condition when another thread would take the cached plan
and assume it was dispatched, thus crashing during execution.
The fix introduces a possibility that the same plan may be distributed
and dispatched twice (with different plan IDs). This shouldn't raise an
issue during execution, because only one plan will be cached and thus
executed. Obviously, we still need to invalidate the unused plan from
worker's caches.
Reviewers: florijan, msantl
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1178
Summary:
Defined a new test based on reported bug, for multiple remote expansion.
Fixed the bug. Introduced minor refactors in distributed unit testing.
Reviewers: mculinovic, dgleich
Reviewed By: mculinovic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1173
Summary:
This is a basic take on planning distributed writes. Main logic is in handling
the Accumulate operator, which requires Synchronize operation if the results
are used after modification.
Tests for planning distributed writes have been added.
Reviewers: florijan, msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1172
Summary:
This diff consolidates local and remote update handling. It ensures and
tests that updates for remote elements are visible locally (on the
updating worker).
The next part will be accumulating remote updates and applying them on
the owner.
Also extracted a common testing fixture.
Reviewers: dgleich, buda, mtomic
Reviewed By: mtomic
Subscribers: mtomic, pullbot
Differential Revision: https://phabricator.memgraph.io/D1169
Summary:
Refactor in two ways. First, expose members without getters as we will
need most of them in distributed. And this was always the sensible thing
to do. Second, add storage type values to deltas. This is also a
sensible thing to do, and it will be very beneficial in distributed. We
didn't do it before because name<->value type mappings aren't guaranteed
to be the same after recovery. A task has been added to address this
(preserve mappings in durability).
Reviewers: dgleich, buda
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1167
Summary:
A hack worthy of young master Gleich. I *think* it's correct though, and
the tests pass. End-to-end cluster recovery testing will be written and
tried out by @mculinovic
Reviewers: dgleich, mculinovic
Reviewed By: dgleich
Subscribers: pullbot, mculinovic
Differential Revision: https://phabricator.memgraph.io/D1163
Summary:
Change `GraphDb` so it exposes index clients in the same
convention as other members.
Reviewers: dgleich, mculinovic
Reviewed By: mculinovic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1159
Summary:
This avoids the unnecessary work of storing symbols which can never be
read after an aggregation is complete. With regards to distributed, a
major benefit is gained in reducing what is transferred over the
network. Hopefully, this doesn't break some obscure case where we
actually needed to remember all used symbols.
Reviewers: florijan, mislav.bradac, msantl
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1158
Summary:
Queries such as `RETURN 1` should only be run on a single machine. This
change assumes that a query should only be distributed if it contains at
least one `ScanAll` operator, i.e. a `MATCH` clause.
Reviewers: florijan, msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1154
Summary:
Remote pulls can now be async. Note that
`RemotePullRpcClients::RemotePull` takes references to data structures
which should not be temporary in the caller. Still, maybe safer to make
copies?
Changed `RpcWorkerClients` API to make that possible.
Reviewers: dgleich, msantl, teon.banek
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1150
Summary: PullRemoteCursor will pull all clients in a RoundRobin fashion until all clients are exhausted and there are no more results to return.
Reviewers: teon.banek, florijan
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1147
Summary:
It is possible that we have a global address to resolve, for a graph
element that's local. Consider W1 expanding, getting data from W2,
expanding from there and getting data that is on W1. We then don't want
to do RPC from W1 to W1, but do a lookup directly.
Reviewers: dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1145
Summary:
NOTE: This diff is still in progress. Many TODOs, lacking documentation
etc. But the main logic is there (some could be different), and it tests
out OK.
Reviewers: teon.banek, msantl, buda
Reviewed By: teon.banek
Subscribers: mferencevic, pullbot
Differential Revision: https://phabricator.memgraph.io/D1138
Summary:
Remove AdvanceCommand operator declaration.
Add query/plan/distributed source files.
Add hacked cloning of LogicalOperator via serialization.
Add virtual Default methods to CompositeVisitor.
Use BOOST_CLASS_EXPORT_KEY in ast.hpp and operator.hpp.
This is needed in a single binary to correctly serialize polymorphic
classes. The previous implementation worked because the memgraph_lib was
linked with test binaries, but nobody was actually serializing things
inside the single memgraph binary itself.
Print PullRemote symbols in tests/manual/query_planner
Add names to implicitly created aggregation symbols
Reviewers: florijan, msantl
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1137
Summary:
With this patch the number of packets for a simple RPC call is lowered
from 22 to 12 (45% reduction). The number of packets for the Bolt protocol
is lowered from 26 to 18 (30% reduction).
Impact on the Bolt protocol will be a constant of ~ 8 packets less per
connection, while the impact on the RPC protocol will be approximately
a 45% reduction overall.
Reviewers: buda, teon.banek
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1141
Summary: See above. The unit test creates two clients on demand so I guess it works.
Reviewers: mferencevic, florijan, teon.banek
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1136
Summary:
Start removal of old logic
Remove more obsolete classes
Move Message class to RPC
Remove client logic from system
Remove messaging namespace
Move protocol from messaging to rpc
Move System from messaging to rpc
Remove unnecessary namespace
Remove System from RPC Client
Split Client and Server into separate files
Start implementing new client logic
First semi-working state
Changed network protocol layout
Rewrite client
Fix client receive bug
Cleanup code of debug lines
Migrate to accessors
Migrate back to binary boost archives
Remove debug logging from server
Disable timeout test
Reduce message_id from uint64_t to uint32_t
Add multiple workers to server
Fix compiler warnings
Apply clang-format
Reviewers: teon.banek, florijan, dgleich, buda, mtomic
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1129
Summary:
Added test for `PlanDispatcher` and `PlanConsumer`.
This diff also contains a fix for the async rpc call on all clients.h
Reviewers: florijan, teon.banek
Reviewed By: florijan
Subscribers: pullbot, dgleich
Differential Revision: https://phabricator.memgraph.io/D1135
Summary:
It seems that RecordAccessor &co are ready for read-only distributed
execution. In read-only there is no command advancement and the implied
cache invalidation, `SwitchOld` and `SwitchNew` perform default
switching and `Reconstruct` uses the `RemoteCache` which is implemented.
I just added a few TODOs for proper CRUD.
Reviewers: dgleich
Reviewed By: dgleich
Differential Revision: https://phabricator.memgraph.io/D1125
Summary:
- End to end distributed GraphDb testing
- Refactors as necessary
- Basic RemoteCache for storing remote data
- RemoteDataRpc
As we are on a tight schedule, please let's focus on the essentials:
functionality and proper testing.
Reviewers: dgleich, teon.banek, buda
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1121
Summary:
With the added support for serialization, we should be able to transfer
plans across distributed workers.
The planner tests has been extended to test serialization. Operators
should be mostly tested, but the expression they contain aren't
completely. The quick solution is to use typeid for rudimentary
expression equality testing. The more involved solution of comparing
the expression tree for equality is the correct choice. It should be
done in the near future.
Reviewers: florijan, msantl
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1122
Summary:
Other than the plan operators and the frame, we will need to pass the
generated symbol table to distributed workers.
Reviewers: florijan, msantl
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1123
Summary:
A slight insanity here... I realized I will need to create
`GraphDbAccessor` instance (which need `&GraphDb`) within some members
of `::impl` classes. Within those classes I can pass `this` to those
members, if `this` is a valid `GraphDb`. Semantically it really is (at
the moment), but heirarchically it wasn't. This diff changes that.
`GraphDb` is now only an interface. `PublicBase` is the base for all
the public classes, `PrivateBase` for the `::impl` classes. Seems to
work.
Oh yes, another thing to keep in mind when doing this is that I should avoid
calling virtual functions in public classes (the motivation for the double
heirarchy). Before this diff the getters weren't virtual, now they are, so
I should have made all the appropriate changes in code as well.
Buda, was this a task I could have delegated to you or Cula?
Reviewers: teon.banek, dgleich, buda
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1120
Summary:
Virtual destructors were missing in classes/structs which can
be inherited.
A missing virtual destructor gives undefined behaviour when
deleting derived class using base type.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1117
Summary:
Adds worker id to snapshot and wal filename.
Adds a new worker_id flag to be used for recovering a worker with a distributed snapshot.
Adds worker_id field to snapshot to check for consistency.
Reviewers: florijan
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1096
Summary:
Previously, we would have a `DCHECK` which crashes the application. This
was evident when testing a queries, such as:
MATCH (n) DELETE n SET n.prop = 42
Since the argument to update clauses is evaluated during execution, it
makes it very difficult to prevent such errors during semantic analysis.
For example:
MATCH (n)--(m) WITH collect(n) as ns, m
DETACH DELETE ns[m.prop] SET head(ns).prop = 42
Test query updates on deleted graph elements
Reviewers: florijan, dgleich
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1114
Summary:
Close the file descriptor in File destructor. This will prevent
accidental crashes during unexpected destructor calls. For example, if
an exception is thrown before the file is closed. File now takes
ownership of the descriptor. These changes now honor RAII idiom, which
should handle most of the peculiarities of C++.
Use optional value for TryOpenFile function, instead of returning a File
without a descriptor. It makes the failure state more semantically clear
to the API user.
Merge utils/filesystem with utils/file
The files aren't that big, and the naming is a bit confusing because
functions aren't really grouped for file and filesystem distinction.
Reviewers: mferencevic, mtomic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1111
Summary:
Added wrappers for some Unix system calls in utils/filesystem.hpp and implemented
a simple log storage interface for Raft. It is not very efficient, we will need
something more sophisticated later, but this is good enough for testing.
Reviewers: mferencevic, mislav.bradac, buda, mculinovic
Reviewed By: mferencevic
Subscribers: teon.banek, dgleich, pullbot
Differential Revision: https://phabricator.memgraph.io/D1091
Summary:
GraphDb is refactored to become an API exposing different parts
necessary for the database to function. These different parts can have
different implementations in SingleNode or distributed Master/Server
GraphDb implementations.
Interally GraphDb is implemented using two class heirarchies. One
contains all the members and correct wiring for each situation. The
other takes care of initialization and shutdown. This architecture is
practical because it can guarantee that the initialization of the
object structure is complete, before initializing state.
Reviewers: buda, mislav.bradac, dgleich, teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1093
Summary:
No logic changes, just split `tx::MasterEngine` into
`tx::SingleNodeEngine` and `tx::MasterEngine`. This gives better
responsibility separation and is more appropriate now there is no
Start/Shutdown.
Reviewers: dgleich, teon.banek, buda
Reviewed By: dgleich, teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1099
Summary:
Serialization of vertices and edges for distributed. Based on Boost
serialization. Threrefore moved TypedValue serialization from AST to
utils.
Reviewers: buda, dgleich, teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1088
Summary:
A PropertyValueStore is not a generic data structure, but only ever used
to store properties in a Vertex/Edge. It has behaviours specific to it.
So, the templatization was not necessary.
Reviewers: buda
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1089
Summary:
If there was no plan caching, the CachedPlan would not survive
`Interpreter::operator()`, as it was not owned by the
`Interpreter::Result`. If there was caching, it could hapen that the
cache got invalidated while that plan was being interpreted (by
another thread) without that interpretation retaining ownership.
Also simplified code around this.
Reviewers: mislav.bradac, teon.banek
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1084
Summary:
Not having a virtual destructor caused tests
to fail (cypher_main_visitor, interpreter) sporadically
since unfreed memory was re-used incorrectly.
Also Valgrind complained constantly.
Reviewers: florijan, teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1081
Summary:
This is a proposal on how the WAL recovery process can be implemented so
that Deltas aren't accumulated, but instead applied in the same order
they are written to the WAL.
I *believe* that the only additional requirement on the system are
atomic transaction Begin/Commit/Abort. By atomic I mean that they are
present in the WAL in exactly the same ordering like in the transaciton
engine, to ensure the same commitability of original and recovery
transactions.
This could be a requirement for HA recovery. It is desirable that WAL
and HA log become the same thing, and the recovery process too.
Reviewers: mtomic, dgleich, mislav.bradac
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1068
Summary: Remove two tx::Transaction methods that are not defined and never used.
Reviewers: dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1072
Summary:
Although the first solution used cereal, the final implementation uses
boost. Since the cereal is still used in the codebase, compilation has
been modified to support multithreaded cereal.
In addition to serializing Ast classes, the following also needed to be
serialized:
* GraphDbTypes
* Symbol
* TypedValue
TypedValue is treated specially, by inlining the serialization code in
the Ast class, concretely PrimitiveLiteral.
Another special case was the Function Ast class, which now stores a
function name which is resolved to a concrete std::function on
construction.
Tests have been added for serialized Ast in
tests/unit/cypher_main_visitor
Reviewers: mferencevic, mislav.bradac, florijan
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1067
Summary:
The distributed ID mapper is not yet utilised in GraphDb as those
changes are in D1060. Depending on landing order it will be added.
Reviewers: dgleich, mislav.bradac
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1064
Summary:
Implement AddComand method on RaftMember
Move RPCType out of rpc request and reply
Add unit test for AddCommand
Reviewers: mislav.bradac, buda
Reviewed By: mislav.bradac
Subscribers: pullbot, mculinovic
Differential Revision: https://phabricator.memgraph.io/D1042