Commit Graph

21 Commits

Author SHA1 Message Date
Matej Ferencevic
f7d1050a9d Remove ConcurrentMap from RPC
Reviewers: teon.banek

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1662
2018-10-18 10:04:55 +02:00
Matej Ferencevic
cdaf7581bf Add explicit start to servers
Reviewers: teon.banek

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1661
2018-10-16 11:39:42 +02:00
Matej Ferencevic
baae40fcc6 Move RPC server to Coordination
Reviewers: teon.banek, msantl

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1658
2018-10-16 09:12:37 +02:00
Matej Ferencevic
53c405c699 Throw exceptions on RPC failure and Distributed error handling
Summary:
This diff changes the RPC layer to directly return `TResponse` to the user when
issuing a `Call<...>` RPC call. The call throws an exception on failure
(instead of the previous return `nullopt`).

All servers (network, RPC and distributed) are set to have explicit `Shutdown`
methods so that a controlled shutdown can always be performed. The object
destructors now have `CHECK`s to enforce that the `AwaitShutdown` methods were
called.

The distributed memgraph is changed that none of the binaries (master/workers)
crash when there is a communication failure. Instead, the whole cluster starts
a graceful shutdown when a persistent communication error is detected.
Transient errors are allowed during execution. The transaction that errored out
will be aborted on the whole cluster. The cluster state is managed using a new
Heartbeat RPC call.

Reviewers: buda, teon.banek, msantl

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1604
2018-09-27 16:27:40 +02:00
Matej Ferencevic
96ece11cdd Move distributed transaction engine logic
Summary:
This change introduces a pure virtual initial implementation of the transaction
engine which is then implemented in two versions: single node and distributed.
The interface classes now have the following hierarchy:

```
    Engine (pure interface)
         |
    +----+---------- EngineDistributed (common logic)
    |                         |
EngineSingleNode      +-------+--------+
                      |                |
                 EngineMaster     EngineWorker
```

In addition to this layout the `EngineMaster` uses `EngineSingleNode` as its
underlying storage engine and only changes the necessary functions to make
them work with the `EngineWorker`.

After this change I recommend that you delete the following leftover files:
```
rm src/distributed/transactional_cache_cleaner_rpc_messages.*
rm src/transactions/common.*
rm src/transactions/engine_rpc_messages.*
```

Reviewers: teon.banek, msantl, buda

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1589
2018-09-07 11:43:57 +02:00
Matej Ferencevic
4e996d2667 Ensure that the durability directory is unique
Summary:
This change improves detection of errorneous situations when starting a
distributed cluster on a single machine. It asserts that the user hasn't
started more memgraph nodes on the same machine with the same durability
directory. Also, this diff improves worker registration. Now workers don't have
to have explicitly set IP addresses. The master will deduce them from the
connecting IP when the worker registers.

Reviewers: teon.banek, buda, msantl

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1582
2018-09-03 13:40:10 +02:00
Dominik Gleich
7af80ebb8d Replace command_id_t with CommandId and transaction_id_t with TransactionId.
Reviewers: buda

Reviewed By: buda

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1367
2018-04-20 13:55:14 +02:00
Matija Santl
7b88e514b8 Add ClusterDiscovery RPC for distributed BFS
Summary:
Implemented cluster discovery in distributed memgraph.
When a worker registers, it sends a RPC request to master.
The master assigns that worker an id and sends the information about other
workers (pairs of <worker_id, endpoint>) to the new worker.
Master also sends the information about the new worker to all existing workers
in the process of worker registration.

After the last worker registers, all memgraph instances in the clusters should
know about every other.

Reviewers: mtomic, buda, florijan

Reviewed By: mtomic

Subscribers: teon.banek, dgleich, pullbot

Differential Revision: https://phabricator.memgraph.io/D1339
2018-04-09 14:28:22 +02:00
Matija Santl
0bcf2edeae Two phase commit on cursor destruction
Summary:
When commiting/aborting a transaction in tx master engine, make a two
phase commit to all workers so they can stop all futures and clear
transactional cache.

Reviewers: dgleich, florijan

Reviewed By: dgleich

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1320
2018-04-03 16:20:00 +02:00
Dominik Gleich
645e3bbc23 Add GlobalLast
Reviewers: florijan

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1313
2018-03-23 15:10:51 +01:00
florijan
c826fa6640 Fix recovery bug (transaction ID bumping)
Summary:
When performing recovery, ensure that the transaction ID in engine is
bumped to one after the max tx id seen in recovery.

Reviewers: dgleich

Reviewed By: dgleich

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1312
2018-03-23 10:09:04 +01:00
Matija Santl
1ca98826af Use the same ClientPools in distributed
Summary:
Instead of passing `coordination`, pass `rpc_worker_clients` that
holds a map of worker_id->clientPool. By having only one instance of
`RpcWorkerClients` that is owned by `GraphDB` and passing it by refference
we'll share the same client pools for rpc clients.

Reviewers: teon.banek, florijan, dgleich, mferencevic

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1261
2018-03-01 17:14:59 +01:00
florijan
bb62f463f8 Refactor distributed transactional cache GC
Summary:
Release of per-transaction data in distributed Memgraph refactored. The
master node no longer releases each time a transaction is done, thus
offloading some work from the engine.

Reviewers: dgleich

Reviewed By: dgleich

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1235
2018-02-27 10:47:58 +01:00
florijan
c8dc07ad0e Remove tx::Engine::GlobalIsActive
Summary:
Remove a method in tx::Engine whose results can be obtained from commit
log info (also guaranteed to be globally correct in distributed).

Reviewers: dgleich

Reviewed By: dgleich

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1240
2018-02-26 14:01:49 +01:00
Matej Ferencevic
c877c87bb4 Refactor RPC
Summary:
Previously, the RPC stack used the network stack only to receive messages. The
messages were then added to a separate queue that was processed by different
thread pools. This design was inefficient because there was a lock when
inserting and getting messages from the common queue.

This diff removes the need for separate thread pools by utilising the new
network stack design. This is possible because the new network stack allows
full processing of the network request without blocking the whole queue.

Reviewers: buda, florijan, teon.banek, dgleich, mislav.bradac

Reviewed By: buda

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1229
2018-02-23 12:07:22 +01:00
florijan
9721ccf61c Cleanup per-transaction caches in distributed
Summary:
On the master cleanups are hooked directly into the transaction engine.
This is beneficial because the master might have bigger caches and we
want to clear them as soon as possible.

On the workers there is a periodic RPC call to the master about living
transactions, which takes care of releasing local caches. This is
suboptimal because long transactions will prevent cache GC (like with
data GC). It is however fairly simple.

Note that all cleanup is not done automatically and `RemotePull` has
been reduced accordingly. @msantl, please verify correctness and
consider if the code can be additionally simplified.

Reviewers: teon.banek, msantl

Reviewed By: msantl

Subscribers: pullbot, msantl

Differential Revision: https://phabricator.memgraph.io/D1202
2018-02-16 15:30:05 +01:00
florijan
b97b48b365 Support worker transaction begin/advance/commit/abort
Reviewers: dgleich, buda, teon.banek

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1161
2018-02-01 12:12:20 +01:00
Matej Ferencevic
fc20ddcd25 RPC refactor
Summary:
Start removal of old logic
Remove more obsolete classes
Move Message class to RPC
Remove client logic from system
Remove messaging namespace
Move protocol from messaging to rpc
Move System from messaging to rpc
Remove unnecessary namespace
Remove System from RPC Client
Split Client and Server into separate files
Start implementing new client logic
First semi-working state
Changed network protocol layout
Rewrite client
Fix client receive bug
Cleanup code of debug lines
Migrate to accessors
Migrate back to binary boost archives
Remove debug logging from server
Disable timeout test
Reduce message_id from uint64_t to uint32_t
Add multiple workers to server
Fix compiler warnings
Apply clang-format

Reviewers: teon.banek, florijan, dgleich, buda, mtomic

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1129
2018-01-24 15:27:40 +01:00
florijan
9361d79c6d Implement GraphDbAccessor creation for running transaction
Reviewers: dgleich

Reviewed By: dgleich

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1119
2018-01-19 13:21:54 +01:00
Dominik Gleich
5418dfb19e Rename NetworkEndpoint
Summary:
Rename redunant port str

Add endpoint << operator

Migrate everything to endpoint

Reviewers: mferencevic, florijan

Reviewed By: mferencevic

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1100
2018-01-15 15:47:37 +01:00
florijan
c4327b26f4 Extract tx::SingleNodeEngine from tx::MasterEngine
Summary:
No logic changes, just split `tx::MasterEngine` into
`tx::SingleNodeEngine` and `tx::MasterEngine`. This gives better
responsibility separation and is more appropriate now there is no
Start/Shutdown.

Reviewers: dgleich, teon.banek, buda

Reviewed By: dgleich, teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1099
2018-01-11 15:44:42 +01:00