Commit Graph

85 Commits

Author SHA1 Message Date
Teon Banek
1fd9a72e10 Generate Load functions from LCP as top level
Summary: Depends on D1596

Reviewers: mtomic, msantl

Reviewed By: msantl

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1601
2018-09-28 10:34:20 +02:00
Teon Banek
a5926b4e0f Generate Save functions from LCP as top level
Summary:
This should allow us to more easily decouple the code which should be
open sourced. Unfortunately, the downside of this approach is that we
cannot rely on virtual calls to dispatch the serialization to correct
type. Another downside is that members need to be publicly accessible
for serialization.

Reviewers: mtomic, msantl

Reviewed By: mtomic

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1596
2018-09-28 10:26:56 +02:00
Matej Ferencevic
53c405c699 Throw exceptions on RPC failure and Distributed error handling
Summary:
This diff changes the RPC layer to directly return `TResponse` to the user when
issuing a `Call<...>` RPC call. The call throws an exception on failure
(instead of the previous return `nullopt`).

All servers (network, RPC and distributed) are set to have explicit `Shutdown`
methods so that a controlled shutdown can always be performed. The object
destructors now have `CHECK`s to enforce that the `AwaitShutdown` methods were
called.

The distributed memgraph is changed that none of the binaries (master/workers)
crash when there is a communication failure. Instead, the whole cluster starts
a graceful shutdown when a persistent communication error is detected.
Transient errors are allowed during execution. The transaction that errored out
will be aborted on the whole cluster. The cluster state is managed using a new
Heartbeat RPC call.

Reviewers: buda, teon.banek, msantl

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1604
2018-09-27 16:27:40 +02:00
Matej Ferencevic
96ece11cdd Move distributed transaction engine logic
Summary:
This change introduces a pure virtual initial implementation of the transaction
engine which is then implemented in two versions: single node and distributed.
The interface classes now have the following hierarchy:

```
    Engine (pure interface)
         |
    +----+---------- EngineDistributed (common logic)
    |                         |
EngineSingleNode      +-------+--------+
                      |                |
                 EngineMaster     EngineWorker
```

In addition to this layout the `EngineMaster` uses `EngineSingleNode` as its
underlying storage engine and only changes the necessary functions to make
them work with the `EngineWorker`.

After this change I recommend that you delete the following leftover files:
```
rm src/distributed/transactional_cache_cleaner_rpc_messages.*
rm src/transactions/common.*
rm src/transactions/engine_rpc_messages.*
```

Reviewers: teon.banek, msantl, buda

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1589
2018-09-07 11:43:57 +02:00
Dominik Gleich
d9f25cc668 Write committed/aborted op to wal
Summary:
Wal on workers didn't contain committed transactions ids, this is needed for
distributed recovery so that the master may decide which transactions are
present on all the workers.

Reviewers: buda, msantl

Reviewed By: buda

Subscribers: pullbot, msantl, buda

Differential Revision: https://phabricator.memgraph.io/D1440
2018-07-05 12:43:18 +02:00
Teon Banek
e0474a8e92 Replace boost with capnp in RPC
Summary:
Converts the RPC stack to use Cap'n Proto for serialization instead of
boost. There are still some traces of boost in other places in the code,
but most of it is removed. A future diff should cleanup boost for good.

The RPC API is now changed to be more flexible with regards to how
serialize data. This makes the simplest cases a bit more verbose, but
allows complex serialization code to be correctly written instead of
relying on hacks. (For reference, look for the old serialization of
`PullRpc` which had a nasty pointer hacks to inject accessors in
`TypedValue`.)

Since RPC messages were uselessly modeled via inheritance of Message
base class, that class is now removed. Furthermore, that approach
doesn't really work with Cap'n Proto. Instead, each message type is
required to have some type information. This can be automated, so
`define-rpc` has been added to LCP, which hopefully simplifies defining
new RPC request and response messages.

Specify Cap'n Proto schema ID in cmake

This preserves Cap'n Proto generated typeIds across multiple generations
of capnp schemas through LCP. It is imperative that typeId stays the
same to ensure that different compilations of Memgraph may communicate
via RPC in a distributed cluster.

Use CLOS for meta information on C++ types in LCP

Since some structure slots and functions have started to repeat
themselves, it makes sense to model C++ meta information via Common Lisp
Object System.

Depends on D1391

Reviewers: buda, dgleich, mferencevic, mtomic, mculinovic, msantl

Reviewed By: msantl

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1407
2018-06-04 10:45:12 +02:00
Teon Banek
c7b6cae526 Extract io/network into mg-io library
Reviewers: buda, dgleich, mferencevic

Reviewed By: mferencevic

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1411
2018-05-30 14:58:41 +02:00
Matija Santl
f872c93ad1 Add command id to remote produce
Summary:
Command id is necessary in remote produce to identify an ongoing pull
because a transaction can have multiple commands that all belong under
the same plan and tx id.

Reviewers: teon.banek, mtomic, buda

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1386
2018-05-16 10:20:39 +02:00
Dominik Gleich
7af80ebb8d Replace command_id_t with CommandId and transaction_id_t with TransactionId.
Reviewers: buda

Reviewed By: buda

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1367
2018-04-20 13:55:14 +02:00
Dominik Gleich
eb30ecb6a0 Commit log gc
Summary: Adds a commit log garbage collector, which clears old transactions from the commit log

Reviewers: florijan

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1310
2018-04-04 10:25:25 +02:00
Matija Santl
0bcf2edeae Two phase commit on cursor destruction
Summary:
When commiting/aborting a transaction in tx master engine, make a two
phase commit to all workers so they can stop all futures and clear
transactional cache.

Reviewers: dgleich, florijan

Reviewed By: dgleich

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1320
2018-04-03 16:20:00 +02:00
Marin Tomic
86072a993a Fix logging in worker_engine
Summary: log before deleting tx object

Reviewers: msantl

Reviewed By: msantl

Differential Revision: https://phabricator.memgraph.io/D1332
2018-03-30 10:41:26 +02:00
Dominik Gleich
a9dd98f8b9 Remove rpc message macro semicolon
Summary:
Remove semicolon.
Semicolon shouldn't be used within macros and should
be explicitly provided by the user of the said macro.

Reviewers: teon.banek

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1321
2018-03-28 15:16:29 +02:00
Matija Santl
29ba055b64 Add custom VLOGs for distributed memgraph
Summary:
Add different priority VLOGs for distributed memgraph.

For level 3 you'll get logs for dispatching/consuming plans.
For level 4 you'll get logs for tx start/commit/abort, remote produce, remote
pull, remote result consume,
For level 5 there will be a log for each request/response made by the RPC
client.

Master log snippet P9
Worker log snippet P10

Reviewers: florijan, teon.banek

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1296
2018-03-26 09:24:39 +02:00
Dominik Gleich
645e3bbc23 Add GlobalLast
Reviewers: florijan

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1313
2018-03-23 15:10:51 +01:00
florijan
c826fa6640 Fix recovery bug (transaction ID bumping)
Summary:
When performing recovery, ensure that the transaction ID in engine is
bumped to one after the max tx id seen in recovery.

Reviewers: dgleich

Reviewed By: dgleich

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1312
2018-03-23 10:09:04 +01:00
florijan
543f953ab5 Check all RPC call results
Summary: Also make error reporting in consistent style: "NameRpc failed"

Reviewers: teon.banek, msantl

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1294
2018-03-13 15:16:50 +01:00
Matija Santl
1ca98826af Use the same ClientPools in distributed
Summary:
Instead of passing `coordination`, pass `rpc_worker_clients` that
holds a map of worker_id->clientPool. By having only one instance of
`RpcWorkerClients` that is owned by `GraphDB` and passing it by refference
we'll share the same client pools for rpc clients.

Reviewers: teon.banek, florijan, dgleich, mferencevic

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1261
2018-03-01 17:14:59 +01:00
Dominik Gleich
79dc10960e Fix double free
Summary: It was possible for transaction to be double freed because both the AllClear and SingleClear of transactions could delete the pointer

Reviewers: florijan, mferencevic

Reviewed By: mferencevic

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1250
2018-02-27 14:31:18 +01:00
florijan
bb62f463f8 Refactor distributed transactional cache GC
Summary:
Release of per-transaction data in distributed Memgraph refactored. The
master node no longer releases each time a transaction is done, thus
offloading some work from the engine.

Reviewers: dgleich

Reviewed By: dgleich

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1235
2018-02-27 10:47:58 +01:00
florijan
c8dc07ad0e Remove tx::Engine::GlobalIsActive
Summary:
Remove a method in tx::Engine whose results can be obtained from commit
log info (also guaranteed to be globally correct in distributed).

Reviewers: dgleich

Reviewed By: dgleich

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1240
2018-02-26 14:01:49 +01:00
Matej Ferencevic
d99724647a Add thread names
Summary:
To see the thread names in htop press <F2> to enter setup and under
'Display options' choose 'Show custom thread names'.

Reviewers: buda, teon.banek, florijan

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1234
2018-02-23 16:04:49 +01:00
Matej Ferencevic
c877c87bb4 Refactor RPC
Summary:
Previously, the RPC stack used the network stack only to receive messages. The
messages were then added to a separate queue that was processed by different
thread pools. This design was inefficient because there was a lock when
inserting and getting messages from the common queue.

This diff removes the need for separate thread pools by utilising the new
network stack design. This is possible because the new network stack allows
full processing of the network request without blocking the whole queue.

Reviewers: buda, florijan, teon.banek, dgleich, mislav.bradac

Reviewed By: buda

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1229
2018-02-23 12:07:22 +01:00
florijan
2a99b9c80e Send tx info with pull remote
Reviewers: dgleich, teon.banek

Reviewed By: dgleich

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1219
2018-02-22 17:16:54 +01:00
florijan
16853e5b8a Make transaction thread-safe
Reviewers: dgleich

Reviewed By: dgleich

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1223
2018-02-22 11:53:37 +01:00
florijan
9721ccf61c Cleanup per-transaction caches in distributed
Summary:
On the master cleanups are hooked directly into the transaction engine.
This is beneficial because the master might have bigger caches and we
want to clear them as soon as possible.

On the workers there is a periodic RPC call to the master about living
transactions, which takes care of releasing local caches. This is
suboptimal because long transactions will prevent cache GC (like with
data GC). It is however fairly simple.

Note that all cleanup is not done automatically and `RemotePull` has
been reduced accordingly. @msantl, please verify correctness and
consider if the code can be additionally simplified.

Reviewers: teon.banek, msantl

Reviewed By: msantl

Subscribers: pullbot, msantl

Differential Revision: https://phabricator.memgraph.io/D1202
2018-02-16 15:30:05 +01:00
florijan
796946ad1b Implement sync operator
Reviewers: teon.banek, msantl

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1192
2018-02-14 13:03:18 +01:00
florijan
b97b48b365 Support worker transaction begin/advance/commit/abort
Reviewers: dgleich, buda, teon.banek

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1161
2018-02-01 12:12:20 +01:00
Matija Santl
78afaa07a3 Use RPC ClientPool instead of Client
Summary: Use RPC `ClientPool` instead of `Client`

Reviewers: florijan, teon.banek

Reviewed By: florijan

Subscribers: pullbot, mtomic

Differential Revision: https://phabricator.memgraph.io/D1153
2018-02-01 10:32:05 +01:00
Matej Ferencevic
fc20ddcd25 RPC refactor
Summary:
Start removal of old logic
Remove more obsolete classes
Move Message class to RPC
Remove client logic from system
Remove messaging namespace
Move protocol from messaging to rpc
Move System from messaging to rpc
Remove unnecessary namespace
Remove System from RPC Client
Split Client and Server into separate files
Start implementing new client logic
First semi-working state
Changed network protocol layout
Rewrite client
Fix client receive bug
Cleanup code of debug lines
Migrate to accessors
Migrate back to binary boost archives
Remove debug logging from server
Disable timeout test
Reduce message_id from uint64_t to uint32_t
Add multiple workers to server
Fix compiler warnings
Apply clang-format

Reviewers: teon.banek, florijan, dgleich, buda, mtomic

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1129
2018-01-24 15:27:40 +01:00
florijan
e1e4a70714 Implement graph element rpc
Summary:
- End to end distributed GraphDb testing
- Refactors as necessary
- Basic RemoteCache for storing remote data
- RemoteDataRpc

As we are on a tight schedule, please let's focus on the essentials:
functionality and proper testing.

Reviewers: dgleich, teon.banek, buda

Reviewed By: dgleich

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1121
2018-01-22 14:20:41 +01:00
florijan
9361d79c6d Implement GraphDbAccessor creation for running transaction
Reviewers: dgleich

Reviewed By: dgleich

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1119
2018-01-19 13:21:54 +01:00
Dominik Gleich
5418dfb19e Rename NetworkEndpoint
Summary:
Rename redunant port str

Add endpoint << operator

Migrate everything to endpoint

Reviewers: mferencevic, florijan

Reviewed By: mferencevic

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1100
2018-01-15 15:47:37 +01:00
florijan
c4327b26f4 Extract tx::SingleNodeEngine from tx::MasterEngine
Summary:
No logic changes, just split `tx::MasterEngine` into
`tx::SingleNodeEngine` and `tx::MasterEngine`. This gives better
responsibility separation and is more appropriate now there is no
Start/Shutdown.

Reviewers: dgleich, teon.banek, buda

Reviewed By: dgleich, teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1099
2018-01-11 15:44:42 +01:00
Dominik Gleich
007a7f1a6d Change network design from Start/Shutdown to constructor/destructor
Summary:
Make ServerT start on constructor

Remove shutdown from MasterCoordinator

Distributed system remove Shutdown

Rcp server start and shutdown removed

Reviewers: florijan, mferencevic

Reviewed By: mferencevic

Subscribers: mferencevic, pullbot

Differential Revision: https://phabricator.memgraph.io/D1097
2018-01-10 14:58:57 +01:00
Mislav Bradac
d3623585e7 Migrate cereal to boost_serialization
Reviewers: teon.banek

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1077
2017-12-27 13:44:46 +01:00
florijan
1ee6d6d05e Make WAL recovery linear
Summary:
This is a proposal on how the WAL recovery process can be implemented so
that Deltas aren't accumulated, but instead applied in the same order
they are written to the WAL.

I *believe* that the only additional requirement on the system are
atomic transaction Begin/Commit/Abort. By atomic I mean that they are
present in the WAL in exactly the same ordering like in the transaciton
engine, to ensure the same commitability of original and recovery
transactions.

This could be a requirement for HA recovery. It is desirable that WAL
and HA log become the same thing, and the recovery process too.

Reviewers: mtomic, dgleich, mislav.bradac

Reviewed By: mislav.bradac

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1068
2017-12-22 10:02:09 +01:00
florijan
9191800fc6 Remove tx::Transaction undefined methods
Summary: Remove two tx::Transaction methods that are not defined and never used.

Reviewers: dgleich

Reviewed By: dgleich

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1072
2017-12-21 12:02:39 +01:00
florijan
3ae45e0d19 Add master/worker flags, main functions and coordination
Reviewers: dgleich, mislav.bradac, buda

Reviewed By: mislav.bradac

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1060
2017-12-19 16:05:57 +01:00
Dominik Gleich
0f1e8aa763 Add rpc server local thread
Summary: Update tests for new rpc server

Reviewers: mislav.bradac, florijan, mferencevic

Reviewed By: mislav.bradac

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1057
2017-12-14 13:37:32 +01:00
Dominik Gleich
a8488ac497 Rpc client locking
Summary:
Rpc client wasn't thread safe and required a lock before each rpc call.
The locking functionality is now incorporated in Rpc client.

Reviewers: mislav.bradac

Reviewed By: mislav.bradac

Differential Revision: https://phabricator.memgraph.io/D1056
2017-12-13 17:37:32 +01:00
Dominik Gleich
19e96c98b6 Generalize cereal message macro
Reviewers: florijan

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1053
2017-12-13 12:53:15 +01:00
florijan
4982d45e27 Add RPC to distributed tx::Engine
Reviewers: mislav.bradac, dgleich

Reviewed By: dgleich

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1035
2017-12-11 11:05:01 +01:00
florijan
ce3638b25e Prepare transactional engine for distributed
Summary:
The current idea is that the same MG binary can be used for single-node,
distributed master and distributed worker. The transactional engine in
the single-node and distributed master is the same: it determines the
transactional time and exposes all the "global" functionalities. In the
distributed worker the "global" functions must contact the master.

Reviewers: dgleich, mislav.bradac, buda

Reviewed By: dgleich

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1013
2017-12-01 09:17:44 +01:00
Dominik Gleich
9c24faa4d6 Expose faster edge_accessor constructor
Summary:
Fix nullptr exceptions

Inline edge_type and from/to in edge accessor

Reviewers: florijan, buda

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1010
2017-11-29 12:40:11 +01:00
florijan
3bd5a882cf Remove Update and Reconstruct from GraphDbAccessor
Summary: Code simplification made possible by making `locks_` `mutable` in `tx::Transaction`.

Reviewers: dgleich, buda

Reviewed By: buda

Differential Revision: https://phabricator.memgraph.io/D1015
2017-11-29 10:02:01 +01:00
Dominik Gleich
a7f9255c17 Remove redundant transaction from index creation
Reviewers: florijan

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1011
2017-11-27 16:47:12 +01:00
florijan
23241e76b9 Prepare tx::Engine for distributed
Summary:
This diff contains step 1:
- Remove clog exposure from tx::engine
- Reduce and cleanup tx::Engine API

All current functionality is kept, but the API is reduced. This is very
desirable because every function in tx::Engine will need to be
considered and implemented in both Master and Worker situations. The
less we have, the better.

Next step is exactly that: seeing how each of these functions behaves in
a distributed system and implementing accordingly.

Reviewers: dgleich, mislav.bradac, buda

Reviewed By: mislav.bradac

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1008
2017-11-24 14:51:29 +01:00
Marko Budiselic
3c9e143c95 Fix SegFault within executor state
Reviewers: mislav.bradac, dgleich, florijan

Reviewed By: mislav.bradac

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D995
2017-11-20 11:27:08 +01:00
florijan
1e0ac8ab8f Write-ahead log
Summary:
My dear fellow Memgraphians. It's friday afternoon, and I am as ready to pop as WAL is to get reviewed...

What's done:
- Vertices and Edges have global IDs, stored in `VersionList`. Main storage is now a concurrent map ID->vlist_ptr.
- WriteAheadLog class added. It's based around buffering WAL::Op objects (elementraly DB changes) and periodically serializing and flusing them to disk.
- Snapshot recovery refactored, WAL recovery added. Snapshot format changed again to include necessary info.
- Durability testing completely reworked.

What's not done (and should be when we decide how):
- Old WAL file purging.
- Config refactor (naming and organization). Will do when we discuss what we want.
- Changelog and new feature documentation (both depending on the point above).
- Better error handling and recovery feedback. Currently it's all returning bools, which is not fine-grained enough (neither for errors nor partial successes, also EOF is reported as a failure at the moment).
- Moving the implementation of WAL stuff to .cpp where possible.
- Not sure if there are transactions being created outside of `GraphDbAccessor` and it's `BuildIndex`. Need to look into.
- True write-ahead logic (flag controlled): not committing a DB transaction if the WAL has not flushed it's data. We can discuss the gain/effort ratio for this feature.

Reviewers: buda, mislav.bradac, teon.banek, dgleich

Reviewed By: dgleich

Subscribers: mtomic, pullbot

Differential Revision: https://phabricator.memgraph.io/D958
2017-11-13 09:51:39 +01:00