Summary:
Command id is necessary in remote produce to identify an ongoing pull
because a transaction can have multiple commands that all belong under
the same plan and tx id.
Reviewers: teon.banek, mtomic, buda
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1386
Summary:
Removing `AS_IS` from GraphView because it doesn't seem like it is necessary for query execution and it also has weird semantics (you might get a mix of old and new records). `Unwind`, `OrderBy` and `PullRemoteOrderBy` now use `OLD` graph view.
Remove AS_IS from GraphView
Fix query_cost_estimator tests
Fix query_expression_evaluator tests
Fix query_plan_match_filter_return tests
Fix query_plan_create_set_remove_delete tests
Fix query_plan_accumulate_aggregate tests
Reviewers: teon.banek, buda
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1390
Summary:
A simplified end-to-end implementation
POD interface set-up, still have bugs with HDDkeys
Version bug fix and first iterator implementation
Fixed out-of-scope reference in PVS iterator
Added PVS unit tests
Reviewers: buda, mferencevic, dgleich, teon.banek
Reviewed By: buda, dgleich
Subscribers: mferencevic, pullbot
Differential Revision: https://phabricator.memgraph.io/D1369
Summary:
Since we are moving from boost to Capnp for serialization, it makes
sense to keep all of the LogicalOperator classes in LCP format. This
will make it easier to generate Capnp code.
Depends on D1361
Reviewers: buda, mferencevic, msantl, dgleich, ipaljak, mculinovic, mtomic
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1362
Summary: Global address of `from` was stored in Edge record when created via CreateEdge RPC.
Reviewers: buda, msantl, dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1387
Summary:
Pointers in AstTreeStorage can point to same nodes. These nodes
should be serialized(saved) only once when serializing Ast tree.
Same goes for deserializing(loading).
When deserializing nodes, one should not use storage Create method,
because it will add node which isn't completely loaded to storage vector.
Therefore, one should use new Deserialize method which doesn't
add node to storage vector. Inserting node into storage vector
is defered until all node data is loaded.
Pass pointer to saved uids set, instead of storage
Delete unused set
Reviewers: teon.banek, msantl
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1344
Summary:
In order to enhance C++ metaprogramming capabilities, a custom
preprocessing step is added before compilation. C++ code may be mixed
with Lisp code in order to generate a complete C++ source code. The
mechanism is hooked into cmake. To notify cmake of .lcp files, `add_lcp`
function in src/CMakeLists.txt needs to be invoked.
The main executable entry point is in tools/lcp, while the source code
is in src/lisp/lcp.lisp
The main goal of LCP is to auto generate class serialization code and
member variable getter functions. This should now be significantly less
error prone, since you cannot forget to serialize a member variable
through this mechanism. Future uses should be generating other repeating
code, such as `Clone` methods or perhaps some debug information.
.lcp files may contain mixed C++ code (enclosed in #>cpp ... cpp<#
blocks) with Common Lisp code.
NOTE: With great power comes great responsibility. Lisp metaprogramming
capabilities are incredibly powerful. To keep the sanity of the team
intact, use Lisp preprocessing only when *really* necessary.
Reviewers: buda, mferencevic, msantl, dgleich, ipaljak, mculinovic, mtomic
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1361
Summary:
GC called clog info on transactions which were not completed even though it
could deduce that they are not finished. By avoiding calling clog info on them
we can lower the number of rpc calls and remove a strange behaviour of gc
appearing to be hanging while recovering a worker.
Reviewers: mculinovic, mferencevic, buda
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1377
Summary:
This binary is installed and packaged for release. This is just a quick
solution for releasing the Community 0.10 version. We still need to
setup the installation and packaging for both the Enterprise and
Community versions. Additionally, the automated build system needs to
test both binaries for correct behaviour. Obviously, some tests can only
be run on one of the 2 versions.
Reviewers: mferencevic, buda
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1363
Summary:
Added a new markdown file for concepts and a new test to test the edge
case with upper bound.
Reviewers: teon.banek, mtomic, dgleich, buda, ipaljak
Reviewed By: teon.banek, dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1366
Summary:
Wal retention test was flaky, this kinda fixes it.
It's hard to fix it completely since there are separate threads happening which
are not synchronized, and forcing the synchronization from the outside would
not agree to the specifications i.e. single-threadness of WAL.
Reviewers: buda
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1358
Summary: Locks should be released as early as possible
Reviewers: msantl, florijan
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1357
Summary:
Use /var/lib/memgraph as home dir for memgraph user.
Merge post and pre RPM scripts in RPM spec file.
Properly handle memgraph.service and directory permissions.
Reviewers: mferencevic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1350
Summary:
- Removed a lot of stuff that was incorrect and/or unnecessary
- Fixed const-correctness in the skiplist family
Reviewers: dgleich, teon.banek, buda
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1351
Summary: Do it so properties-on-disk can be implemented.
Reviewers: buda, dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1343
Summary:
If someone was to call `EnsureDir` concurrently it might fail because
the `fs::create_directories` returns false when the directory already exists
with the `error_code` value being `0`.
Reviewers: dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1349
Summary:
After the commit log was cleared there was some transaction that tried to acquire
a lock on an object that was taken by a transaction that was not longer active on
the worker. Inquring the commit log about that transaction causes a crash since
the commit log is cleared of that transaction.
Solution is to clear the transaction cache before clearing the commit log, which
forces the transactions to release their locks and as such their ids will never be
queried through the commit log in the future.
Reviewers: florijan, msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1342
Summary:
Implemented cluster discovery in distributed memgraph.
When a worker registers, it sends a RPC request to master.
The master assigns that worker an id and sends the information about other
workers (pairs of <worker_id, endpoint>) to the new worker.
Master also sends the information about the new worker to all existing workers
in the process of worker registration.
After the last worker registers, all memgraph instances in the clusters should
know about every other.
Reviewers: mtomic, buda, florijan
Reviewed By: mtomic
Subscribers: teon.banek, dgleich, pullbot
Differential Revision: https://phabricator.memgraph.io/D1339
Summary: Make synchronized snapshot. This invokese the snapshooter on workers on the master snapshot scheduler interval.
Reviewers: msantl, mtomic
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1334
Summary:
Snapshots should have the transaction from which they were created because we need this info for recovery later on.
Otherwise we wouldn't be able to tell the workers from which snapshots to recover. The whole cluster should be recovered
from the same transaction snapshot.
Reviewers: msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1338
Summary: Adds a commit log garbage collector, which clears old transactions from the commit log
Reviewers: florijan
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1310
Summary:
When commiting/aborting a transaction in tx master engine, make a two
phase commit to all workers so they can stop all futures and clear
transactional cache.
Reviewers: dgleich, florijan
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1320
Summary:
Snapshot scheduler object was released from unique ptr and not actually freed, which
caused the snapshooter to access the tx_engine after it was destructed.
Reviewers: mferencevic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1325
Summary:
Master shouldn't stop processing rpc calls immediately on shutdown. It
should wait for all workers to stop, and then destroy itself.
Reviewers: dgleich, mferencevic
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1330
Summary:
Remove semicolon.
Semicolon shouldn't be used within macros and should
be explicitly provided by the user of the said macro.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1321
Summary:
The network layer now has a `Session` that handles all things that should be
done before the `Execute` method is called on sessions. Also, all sessions
now communicate using streams instead of holding the input buffer and writing
to the `Socket`. This design will allow implementation of a SSL middleware.
Reviewers: buda, dgleich
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1314
Summary:
Before we used `utils::Future` only where it's created by our `ThreadPool`.
I suggest in this diff that we use it everywhere, it's a bit more defensive and
should not have any downsides.
Reviewers: msantl, teon.banek
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1316
Summary:
Add different priority VLOGs for distributed memgraph.
For level 3 you'll get logs for dispatching/consuming plans.
For level 4 you'll get logs for tx start/commit/abort, remote produce, remote
pull, remote result consume,
For level 5 there will be a log for each request/response made by the RPC
client.
Master log snippet P9
Worker log snippet P10
Reviewers: florijan, teon.banek
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1296
Summary:
Remove "produce_" and "Produce" as prefix from all distributed stuff.
It's not removed in src/query/ stuff (operators).
Reviewers: dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1315
Summary:
When performing recovery, ensure that the transaction ID in engine is
bumped to one after the max tx id seen in recovery.
Reviewers: dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1312
Summary:
Find[Vertex|Edge] -> Find[Vertex|Edge]Optional
Find[Vertex|Edge]Checked -> Find[Vertex|Edge]
In some places change old code that finds-optional and immediately checks
to use the checked functionality.
It seems that in all the src/ stuff optional finds are no loger used,
only in tests, but there they are used extensively so I don't feel those
functions should be removed.
Reviewers: dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1309
Summary: This is correct, and uses less memory.
Reviewers: teon.banek, buda, mislav.bradac
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1304
Summary:
Also removed some convenience code that became obsolete.
No logic changes.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1303
Summary:
Writers preference is not guaranteed by the standard and as such
there are variations between implementations in different versions
of libphthread.
Reviewers: dgleich
Reviewed By: dgleich
Differential Revision: https://phabricator.memgraph.io/D1305
Summary:
Ensures that query plans are invalidated on the remote only when it's
guaranteed they will never be used on the master. This is done by
invalidating remote caches in the `CachedPlan` destructor.
There are two unplesant side-effects. First, an RPC call is made in an
object destructor. This is somewhat ugly, but not that different then
making an RPC call that must succeed in any other function. Note that
this does NOT slow down any query execution because the relevant
destructor is called by the skiplist garbage collector. The second ugly
side-effect is that in the unit test now we need to sleep to ensure the
skiplist GC destructs a cached plan before checking that it's
invalidated on the remote worker.
We might want to redesign this at some point.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1302
Summary:
Recursion in Version destructor was causing a SEGFAULT for
long version lists because of a stack overflow.
Reviewers: florijan
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1301
Summary:
- Remove caches on workers as a result of plan expiration or race
during insertion.
- Extract caching functionality into a class.
- Minor refactor of Interpreter::operator()
- New RPC and test for it.
- Rename ConsumePlanRes to DispatchPlanRes for consistency, remove
return value as it's always true and never used.
- Interpreter is now constructed with a `GraphDb` reference. At the
moment only for reaching the `distributed::PlanDispatcher`, but in
the future we should probably use that primarily for planning.
I added a function to `PlanConsumer` that is only used for testing.
I prefer not doing this, but I felt this needed testing. I can remove
it now if you like.
Reviewers: teon.banek, msantl
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1292
Summary:
Added a class that is supposed to be used as a base class on classes
where we want to see the stacktrace of destructor call.
Example of the output is P7 where classes `GraphDb` and `Client` extend
`LogDestructor` base.
Reviewers: florijan, teon.banek
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1281
Summary:
This fixes a bug where the planning would raise NotYetImplemented error,
due to preventing plan splitting while the operator is already on
master. For example, `MATCH (a), (b) CREATE (a)-[e:r]->(b) RETURN e`
would split the plan after Cartesian and before Create. Thus, the rest
of the plan would be on master, including the Accumulate before Produce.
We still can (and must) replace Accumulate with Synchronize but this
would fail due to unneeded check.
Reviewers: florijan, msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1272
Summary:
I have not thoroughly thought this through, especially the worker
destruction (is it legit to abort all running tx?), but it's tested to
abort during remote pull, what we need.
Also I improved error handling for vertex deletion failure during
remote pull (@dgleich).
Reviewers: teon.banek, msantl, dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1263
Summary:
If the `db` says we need to abort we should abort. There are two
places (in both `PullRemote` and `PullRemoteOrderBy`) where the check is made.
The first one is on the very beginning of the `Pull` method and the second one
is in the loop that checks/waits for remote results.
Reviewers: teon.banek, florijan
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1265
Summary:
Instead of passing `coordination`, pass `rpc_worker_clients` that
holds a map of worker_id->clientPool. By having only one instance of
`RpcWorkerClients` that is owned by `GraphDB` and passing it by refference
we'll share the same client pools for rpc clients.
Reviewers: teon.banek, florijan, dgleich, mferencevic
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1261
Summary:
Usually a single RPC call takes 50us, which is a lot lower than our
millisecond sleep. Calling sleep with less than a ms duration may not
be really supported (due to timer granularity), but this change should
cause some sub-ms sleep.
Reviewers: florijan, msantl, mferencevic
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1258
Summary:
There's no need to heap allocate the request which is used only to be
serialized and sent over the network. Unfortunately, this change
complicates the reading a bit, because we need to deserialize the raw
pointer and put it in std::unique_ptr instead of simply deserializing
the unique_ptr.
Heap allocation shows up visibly in perf during benchmarks, so this
change may yield some improvement.
Reviewers: mferencevic, mtomic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1254
Summary: It was possible for transaction to be double freed because both the AllClear and SingleClear of transactions could delete the pointer
Reviewers: florijan, mferencevic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1250
Summary:
Release of per-transaction data in distributed Memgraph refactored. The
master node no longer releases each time a transaction is done, thus
offloading some work from the engine.
Reviewers: dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1235
Summary:
Remove a method in tx::Engine whose results can be obtained from commit
log info (also guaranteed to be globally correct in distributed).
Reviewers: dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1240
Summary:
Difficult to test properly as the problem was an implicit conversion of
`gid::Gid` to `mvcc::VersionList<> *`. The only proper way to defend
against this would be to make `gid::Gid` a type.
The `CHECK` added in this diff does not help, but should be there, so...
Reviewers: dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1244
Summary:
During distributed execution, OrderBy is split across workers and the
master gets to merge those results via PullRemoteOrderBy. Since this
operator may be an input to almost any other operator, virtual accessors
to `input` have been added in LogicalOperator.
Depends on D1221
Reviewers: florijan, msantl, buda
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1232
Summary:
Extracted `TypedValueVectorCompare` and `RemotePuller` from operators
so it can be reused.
The new `PullRemoteOrerBy` operator pulls one result from each worker and one
from master, relies on the fact that workers/master returned sorted results,
returns the next one in order, and pulls the source of that result to get the
next one.
Depends on D1215 that (at the moment) is still in review.
Reviewers: florijan, teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1221
Summary:
- The expansion vertex gets created on the origin's worker
- The edge automatically gets created wherever necessary
- Vertex creation logic reuse
- End to end test
Reviewers: teon.banek, msantl
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1227
Summary:
To see the thread names in htop press <F2> to enter setup and under
'Display options' choose 'Show custom thread names'.
Reviewers: buda, teon.banek, florijan
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1234
Summary:
- Add the said test
- Add `workerid` function
- Add random wait (enabled with flag) to `rpc::Client::Call` for network
latency simulation
Reviewers: buda, mferencevic, teon.banek
Reviewed By: buda, teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1222
Summary:
Previously, the RPC stack used the network stack only to receive messages. The
messages were then added to a separate queue that was processed by different
thread pools. This design was inefficient because there was a lock when
inserting and getting messages from the common queue.
This diff removes the need for separate thread pools by utilising the new
network stack design. This is possible because the new network stack allows
full processing of the network request without blocking the whole queue.
Reviewers: buda, florijan, teon.banek, dgleich, mislav.bradac
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1229
Summary:
This may improve the query execution workload, because the Expand will
yield local results while it waits for remote ones. Note that we rely on
the fact that walking the graph produces results without any predetermined
order. Therefore, we can yield paths as we see fit. Enforcing the order
is done through OrderBy operator, and this change shouldn't affect that.
Reviewers: florijan, msantl, buda
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1215
Summary:
Updating a record locally while there is an remote update waiting to be applied caused
the operation to return as already deleted, instead of applying it
Reviewers: florijan
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1226
Summary:
Previously, the network stack `communication::Server` accepted connections and
assigned them statically in a round-robin fashion to `communication::Worker`.
That meant that if two compute intensive connections were assigned to the same
worker they would block each other while the other workers would do nothing.
This implementation replaces `communication::Worker` with
`communication::Listener` which holds all accepted connections in one pool and
ensures that all workers execute all connections.
Reviewers: buda, florijan, teon.banek
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1220
Summary:
- Add `database::GraphDb::GetWorkerIds()`
- Change `CreateNode` constructor API
- Make `CreateNode` distribute nodes uniformly over workers
Did not yet modify `CreateExpand`, coming in the next diff.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1216
Summary:
The queue size stat is updated every time we pop from queue.
This means it won't get updated if we get requests while all threads are
busy processing some long running request. I don't think we care about
the exact queue size in that case, so that is fine.
Reviewers: mferencevic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1213
Summary:
Implementation of remote vertex and edge creation. This diff addresses
the creation API (`GraphDbAccessor::InsertEdge`,
`GraphDbAccessor::InsertRemoteVertex`) and the necessary RPC and
`RemoteCache` stuff.
What is missing for full remote creation support are
`query::plan::operator` changes that are expected to minor. Pushing this
diff as it's large enough, operator and end to end tests in the next.
Also, the naming of existing structures and files is confusing (update
refering to both updates and created, `results` used too often etc.). I
will address this too, but feel free to comment on bad naming.
Reviewers: dgleich, teon.banek, msantl
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1210
Summary:
This is still very much in progress. No advanced checks are done to
prevent planning unimplemented things. Basic Cartesian product should
work, for example `MATCH (a), (b) CREATE (a)-[:r]->(c)-[:r]->(b)`. But
anything more advanced may lead to undefined behaviour of the planner
and therefore execution. Use at your own risk!
Add ModifiedSymbols method to LogicalOperator
For planning Cartesian, we need information on which symbols are filled
by operator sub-trees. Currently, this is used to set symbols which
should be transferred over network. Later, they should be used to detect
whether filter expressions use symbols modified from Cartesian branches.
Then we will be able to ensure correct dependency of filters and their
behaviour.
Prepare DistributedPlan for multiple worker plans
Since Cartesian branches need to be split and handled by each worker, we
now dispatch multiple plans to workers.
Reviewers: florijan, msantl, buda
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1208
Summary:
Rename and fix concurrent list test.,
This tests a fix for the concurrent list.I was not able to reproduce the flakyness, but this could be it.
If it happens again we'll know that this is not the solution to the problem, and look further.
Change memory ordering
Reviewers: mferencevic
Reviewed By: mferencevic
Subscribers: buda, pullbot
Differential Revision: https://phabricator.memgraph.io/D1188
Summary:
Stats server wasn't connecting to the right service on statsd.
Also, benchmark client stats now have prefix `client` instead
of machine name to be consistent with memgraphs stats naming
which starts with `master` or `worker`.
Reviewers: mtomic, buda
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1209
Summary:
On the master cleanups are hooked directly into the transaction engine.
This is beneficial because the master might have bigger caches and we
want to clear them as soon as possible.
On the workers there is a periodic RPC call to the master about living
transactions, which takes care of releasing local caches. This is
suboptimal because long transactions will prevent cache GC (like with
data GC). It is however fairly simple.
Note that all cleanup is not done automatically and `RemotePull` has
been reduced accordingly. @msantl, please verify correctness and
consider if the code can be additionally simplified.
Reviewers: teon.banek, msantl
Reviewed By: msantl
Subscribers: pullbot, msantl
Differential Revision: https://phabricator.memgraph.io/D1202
Summary:
Pulls left op cursor and keeps the result, and then for each pull of
the right op cursor, adds all the left op results to produce a cartesian
product.
Reviewers: teon.banek, florijan
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1201
Summary:
Get rid of client class
Fix cppcheck errors
Add documentation to metrics.hpp
Add documentation to stats.hpp
Remove stats from global namespace
Fix build failures
Refactor a bit
Refactor stopwatch into a function
Add rpc execution time stats
Fix segmentation fault
Reviewers: mferencevic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1200
Summary:
We have been using `Edges::VertexAddress` and `Edges::EdgeAddress` a lot
in other parts of the codebase because it's cleaner to write then
`Address<mvcc::VersionList<Edge>>`, especially in code what should not
really be MVCC-aware. However, a lot of that code should not really be
`Edges` aware either, as that's a storage datastructure that should not
be exposed.
This became annoying, so I extracted these addresses into a type-file. I
don't really like this approach, it might be better to have
`Vertex::Address` and `Edge::Address`, but that means we'd have to
import those headers and we'd get circular dependencies.
“The horror! The horror!”
- Joseph Conrad, Heart of Darkness
Reviewers: teon.banek, buda
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1204
Summary:
Added a hidden flag, a int32 value, that represents the number of
milliseconds the thread that pulls remote results should sleep in case when the
local results are exhausted but there are still some remote results that are
pending.
Reviewers: teon.banek, florijan
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1198
Summary:
It is possible that boost's archive didn't write all data to a stream
before the archive is destroyed. To make sure everything works, archives
are now wrapped in a nested block.
Additionally, there may be some issues if the stream isn't set for
binary reading/writing due to new line character conversion. This
shouldn't pose problems on Linux, but just to make sure it doesn't bite
us in the a$% down the line.
Reviewers: mferencevic, mculinovic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1194
Summary:
The same behaviour can now be achieved by passing a nullptr for input
operator.
Reviewers: florijan, msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1195
Summary:
Remote caches used to be owned by `GraphDbAccessor`. An advantage of
that was immediate cleanup when destructing. A disadvantage was sharing
the remote cache between mutliple program-flows in the same transaction
in distributed (one would have to share the accessor).
We will have to do post-transactional global cleanup anyway, since we
leak, which reduces the above stated advantage. And the stated
disadvantage is becoming more and more pronounced as additional
components need access to the remote cache.
Hence the refactor.
Reviewers: buda, teon.banek, msantl
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1186
Summary:
Updates are supported, insertions and removals not in this diff. The
test is a bit overdesigned, it happens.
Reviewers: teon.banek, dgleich, msantl
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1176
Summary:
Distributed plan is now dispatched before inserting into a cache. This
prevents a race condition when another thread would take the cached plan
and assume it was dispatched, thus crashing during execution.
The fix introduces a possibility that the same plan may be distributed
and dispatched twice (with different plan IDs). This shouldn't raise an
issue during execution, because only one plan will be cached and thus
executed. Obviously, we still need to invalidate the unused plan from
worker's caches.
Reviewers: florijan, msantl
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1178
Summary:
Defined a new test based on reported bug, for multiple remote expansion.
Fixed the bug. Introduced minor refactors in distributed unit testing.
Reviewers: mculinovic, dgleich
Reviewed By: mculinovic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1173
Summary:
This is a basic take on planning distributed writes. Main logic is in handling
the Accumulate operator, which requires Synchronize operation if the results
are used after modification.
Tests for planning distributed writes have been added.
Reviewers: florijan, msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1172
Summary:
This diff consolidates local and remote update handling. It ensures and
tests that updates for remote elements are visible locally (on the
updating worker).
The next part will be accumulating remote updates and applying them on
the owner.
Also extracted a common testing fixture.
Reviewers: dgleich, buda, mtomic
Reviewed By: mtomic
Subscribers: mtomic, pullbot
Differential Revision: https://phabricator.memgraph.io/D1169
Summary:
Refactor in two ways. First, expose members without getters as we will
need most of them in distributed. And this was always the sensible thing
to do. Second, add storage type values to deltas. This is also a
sensible thing to do, and it will be very beneficial in distributed. We
didn't do it before because name<->value type mappings aren't guaranteed
to be the same after recovery. A task has been added to address this
(preserve mappings in durability).
Reviewers: dgleich, buda
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1167
Summary:
A hack worthy of young master Gleich. I *think* it's correct though, and
the tests pass. End-to-end cluster recovery testing will be written and
tried out by @mculinovic
Reviewers: dgleich, mculinovic
Reviewed By: dgleich
Subscribers: pullbot, mculinovic
Differential Revision: https://phabricator.memgraph.io/D1163
Summary:
Change `GraphDb` so it exposes index clients in the same
convention as other members.
Reviewers: dgleich, mculinovic
Reviewed By: mculinovic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1159
Summary:
This avoids the unnecessary work of storing symbols which can never be
read after an aggregation is complete. With regards to distributed, a
major benefit is gained in reducing what is transferred over the
network. Hopefully, this doesn't break some obscure case where we
actually needed to remember all used symbols.
Reviewers: florijan, mislav.bradac, msantl
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1158
Summary:
Queries such as `RETURN 1` should only be run on a single machine. This
change assumes that a query should only be distributed if it contains at
least one `ScanAll` operator, i.e. a `MATCH` clause.
Reviewers: florijan, msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1154
Summary:
Remote pulls can now be async. Note that
`RemotePullRpcClients::RemotePull` takes references to data structures
which should not be temporary in the caller. Still, maybe safer to make
copies?
Changed `RpcWorkerClients` API to make that possible.
Reviewers: dgleich, msantl, teon.banek
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1150
Summary: PullRemoteCursor will pull all clients in a RoundRobin fashion until all clients are exhausted and there are no more results to return.
Reviewers: teon.banek, florijan
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1147
Summary:
It is possible that we have a global address to resolve, for a graph
element that's local. Consider W1 expanding, getting data from W2,
expanding from there and getting data that is on W1. We then don't want
to do RPC from W1 to W1, but do a lookup directly.
Reviewers: dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1145
Summary:
NOTE: This diff is still in progress. Many TODOs, lacking documentation
etc. But the main logic is there (some could be different), and it tests
out OK.
Reviewers: teon.banek, msantl, buda
Reviewed By: teon.banek
Subscribers: mferencevic, pullbot
Differential Revision: https://phabricator.memgraph.io/D1138
Summary:
Remove AdvanceCommand operator declaration.
Add query/plan/distributed source files.
Add hacked cloning of LogicalOperator via serialization.
Add virtual Default methods to CompositeVisitor.
Use BOOST_CLASS_EXPORT_KEY in ast.hpp and operator.hpp.
This is needed in a single binary to correctly serialize polymorphic
classes. The previous implementation worked because the memgraph_lib was
linked with test binaries, but nobody was actually serializing things
inside the single memgraph binary itself.
Print PullRemote symbols in tests/manual/query_planner
Add names to implicitly created aggregation symbols
Reviewers: florijan, msantl
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1137
Summary:
With this patch the number of packets for a simple RPC call is lowered
from 22 to 12 (45% reduction). The number of packets for the Bolt protocol
is lowered from 26 to 18 (30% reduction).
Impact on the Bolt protocol will be a constant of ~ 8 packets less per
connection, while the impact on the RPC protocol will be approximately
a 45% reduction overall.
Reviewers: buda, teon.banek
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1141
Summary: See above. The unit test creates two clients on demand so I guess it works.
Reviewers: mferencevic, florijan, teon.banek
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1136
Summary:
Start removal of old logic
Remove more obsolete classes
Move Message class to RPC
Remove client logic from system
Remove messaging namespace
Move protocol from messaging to rpc
Move System from messaging to rpc
Remove unnecessary namespace
Remove System from RPC Client
Split Client and Server into separate files
Start implementing new client logic
First semi-working state
Changed network protocol layout
Rewrite client
Fix client receive bug
Cleanup code of debug lines
Migrate to accessors
Migrate back to binary boost archives
Remove debug logging from server
Disable timeout test
Reduce message_id from uint64_t to uint32_t
Add multiple workers to server
Fix compiler warnings
Apply clang-format
Reviewers: teon.banek, florijan, dgleich, buda, mtomic
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1129
Summary:
Added test for `PlanDispatcher` and `PlanConsumer`.
This diff also contains a fix for the async rpc call on all clients.h
Reviewers: florijan, teon.banek
Reviewed By: florijan
Subscribers: pullbot, dgleich
Differential Revision: https://phabricator.memgraph.io/D1135
Summary:
It seems that RecordAccessor &co are ready for read-only distributed
execution. In read-only there is no command advancement and the implied
cache invalidation, `SwitchOld` and `SwitchNew` perform default
switching and `Reconstruct` uses the `RemoteCache` which is implemented.
I just added a few TODOs for proper CRUD.
Reviewers: dgleich
Reviewed By: dgleich
Differential Revision: https://phabricator.memgraph.io/D1125
Summary:
- End to end distributed GraphDb testing
- Refactors as necessary
- Basic RemoteCache for storing remote data
- RemoteDataRpc
As we are on a tight schedule, please let's focus on the essentials:
functionality and proper testing.
Reviewers: dgleich, teon.banek, buda
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1121
Summary:
With the added support for serialization, we should be able to transfer
plans across distributed workers.
The planner tests has been extended to test serialization. Operators
should be mostly tested, but the expression they contain aren't
completely. The quick solution is to use typeid for rudimentary
expression equality testing. The more involved solution of comparing
the expression tree for equality is the correct choice. It should be
done in the near future.
Reviewers: florijan, msantl
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1122
Summary:
Other than the plan operators and the frame, we will need to pass the
generated symbol table to distributed workers.
Reviewers: florijan, msantl
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1123
Summary:
A slight insanity here... I realized I will need to create
`GraphDbAccessor` instance (which need `&GraphDb`) within some members
of `::impl` classes. Within those classes I can pass `this` to those
members, if `this` is a valid `GraphDb`. Semantically it really is (at
the moment), but heirarchically it wasn't. This diff changes that.
`GraphDb` is now only an interface. `PublicBase` is the base for all
the public classes, `PrivateBase` for the `::impl` classes. Seems to
work.
Oh yes, another thing to keep in mind when doing this is that I should avoid
calling virtual functions in public classes (the motivation for the double
heirarchy). Before this diff the getters weren't virtual, now they are, so
I should have made all the appropriate changes in code as well.
Buda, was this a task I could have delegated to you or Cula?
Reviewers: teon.banek, dgleich, buda
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1120
Summary:
Virtual destructors were missing in classes/structs which can
be inherited.
A missing virtual destructor gives undefined behaviour when
deleting derived class using base type.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1117
Summary:
Adds worker id to snapshot and wal filename.
Adds a new worker_id flag to be used for recovering a worker with a distributed snapshot.
Adds worker_id field to snapshot to check for consistency.
Reviewers: florijan
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1096