Commit Graph

21 Commits

Author SHA1 Message Date
Matej Ferencevic
9291a5fc4d Migrate to C++17
Reviewers: teon.banek, buda

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1974
2019-04-23 14:46:44 +02:00
Matej Ferencevic
d9673698e5 Cleanup file utils
Reviewers: teon.banek, msantl

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1861
2019-02-14 12:48:37 +01:00
Matej Ferencevic
f7d1050a9d Remove ConcurrentMap from RPC
Reviewers: teon.banek

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1662
2018-10-18 10:04:55 +02:00
Matej Ferencevic
cdaf7581bf Add explicit start to servers
Reviewers: teon.banek

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1661
2018-10-16 11:39:42 +02:00
Matej Ferencevic
baae40fcc6 Move RPC server to Coordination
Reviewers: teon.banek, msantl

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1658
2018-10-16 09:12:37 +02:00
Matej Ferencevic
53c405c699 Throw exceptions on RPC failure and Distributed error handling
Summary:
This diff changes the RPC layer to directly return `TResponse` to the user when
issuing a `Call<...>` RPC call. The call throws an exception on failure
(instead of the previous return `nullopt`).

All servers (network, RPC and distributed) are set to have explicit `Shutdown`
methods so that a controlled shutdown can always be performed. The object
destructors now have `CHECK`s to enforce that the `AwaitShutdown` methods were
called.

The distributed memgraph is changed that none of the binaries (master/workers)
crash when there is a communication failure. Instead, the whole cluster starts
a graceful shutdown when a persistent communication error is detected.
Transient errors are allowed during execution. The transaction that errored out
will be aborted on the whole cluster. The cluster state is managed using a new
Heartbeat RPC call.

Reviewers: buda, teon.banek, msantl

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1604
2018-09-27 16:27:40 +02:00
Matej Ferencevic
4e996d2667 Ensure that the durability directory is unique
Summary:
This change improves detection of errorneous situations when starting a
distributed cluster on a single machine. It asserts that the user hasn't
started more memgraph nodes on the same machine with the same durability
directory. Also, this diff improves worker registration. Now workers don't have
to have explicitly set IP addresses. The master will deduce them from the
connecting IP when the worker registers.

Reviewers: teon.banek, buda, msantl

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1582
2018-09-03 13:40:10 +02:00
Dominik Gleich
0681040395 Durability distributed wal
Summary:
Recover only snapshot

Return recovery info on worker recovery

Update tests

Start wal recovery

Single node wal split

Keep track of wal possible to recover

Fix comment

Wal tx intersection

Merge branch 'master' into sync_wal_tx

Reviewers: buda, ipaljak, dgleich, vkasljevic, teon.banek

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1489
2018-07-24 08:55:13 +02:00
Dominik Gleich
797bd9e435 Notify master of worker recovery
Summary: Notify master when workers finish recovering such that master doesn't start doing stuff before they recovered

Reviewers: buda, msantl, mferencevic

Reviewed By: buda

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1389
2018-05-10 11:09:34 +02:00
Dominik Gleich
dba81f223c Ensure workers recover appropriate snapshot
Reviewers: msantl

Reviewed By: msantl

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1355
2018-04-16 16:58:54 +02:00
Dominik Gleich
87dbe26038 Check if connection alive & fix pure virtual bug
Reviewers: mferencevic, msantl, florijan

Reviewed By: msantl

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1353
2018-04-12 10:17:00 +02:00
Matija Santl
4cbfc800b8 Check that workers desired id is equal to the assigned id from master.
Summary:
Returns -1 from coordinations `AddWorker` method and propagate it to
worker if master can't assign the desired worker id.

Reviewers: dgleich, florijan

Reviewed By: dgleich

Subscribers: pullbot, buda

Differential Revision: https://phabricator.memgraph.io/D1352
2018-04-11 12:36:40 +02:00
Matija Santl
7b88e514b8 Add ClusterDiscovery RPC for distributed BFS
Summary:
Implemented cluster discovery in distributed memgraph.
When a worker registers, it sends a RPC request to master.
The master assigns that worker an id and sends the information about other
workers (pairs of <worker_id, endpoint>) to the new worker.
Master also sends the information about the new worker to all existing workers
in the process of worker registration.

After the last worker registers, all memgraph instances in the clusters should
know about every other.

Reviewers: mtomic, buda, florijan

Reviewed By: mtomic

Subscribers: teon.banek, dgleich, pullbot

Differential Revision: https://phabricator.memgraph.io/D1339
2018-04-09 14:28:22 +02:00
florijan
44aefe775e Fix flakyness in tests/unit/distributed_coord
Summary:
Instead of waiting for a fix period for the coordinations to start and
coordinate with the master, wait for each of them individually to report
being done.

Also: rename `WorkerInThread` to `WorkerCoordinationInThread`.

Reviewers: dgleich, teon.banek, msantl

Reviewed By: msantl

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1288
2018-03-12 12:30:59 +01:00
Matej Ferencevic
c877c87bb4 Refactor RPC
Summary:
Previously, the RPC stack used the network stack only to receive messages. The
messages were then added to a separate queue that was processed by different
thread pools. This design was inefficient because there was a lock when
inserting and getting messages from the common queue.

This diff removes the need for separate thread pools by utilising the new
network stack design. This is possible because the new network stack allows
full processing of the network request without blocking the whole queue.

Reviewers: buda, florijan, teon.banek, dgleich, mislav.bradac

Reviewed By: buda

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1229
2018-02-23 12:07:22 +01:00
Matej Ferencevic
fc20ddcd25 RPC refactor
Summary:
Start removal of old logic
Remove more obsolete classes
Move Message class to RPC
Remove client logic from system
Remove messaging namespace
Move protocol from messaging to rpc
Move System from messaging to rpc
Remove unnecessary namespace
Remove System from RPC Client
Split Client and Server into separate files
Start implementing new client logic
First semi-working state
Changed network protocol layout
Rewrite client
Fix client receive bug
Cleanup code of debug lines
Migrate to accessors
Migrate back to binary boost archives
Remove debug logging from server
Disable timeout test
Reduce message_id from uint64_t to uint32_t
Add multiple workers to server
Fix compiler warnings
Apply clang-format

Reviewers: teon.banek, florijan, dgleich, buda, mtomic

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1129
2018-01-24 15:27:40 +01:00
Dominik Gleich
ca9fac8adc Expose worker ids
Summary: Add worker ids expose to rpc_worker_clients

Reviewers: msantl

Reviewed By: msantl

Differential Revision: https://phabricator.memgraph.io/D1128
2018-01-22 17:09:12 +01:00
Dominik Gleich
5418dfb19e Rename NetworkEndpoint
Summary:
Rename redunant port str

Add endpoint << operator

Migrate everything to endpoint

Reviewers: mferencevic, florijan

Reviewed By: mferencevic

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1100
2018-01-15 15:47:37 +01:00
Dominik Gleich
007a7f1a6d Change network design from Start/Shutdown to constructor/destructor
Summary:
Make ServerT start on constructor

Remove shutdown from MasterCoordinator

Distributed system remove Shutdown

Rcp server start and shutdown removed

Reviewers: florijan, mferencevic

Reviewed By: mferencevic

Subscribers: mferencevic, pullbot

Differential Revision: https://phabricator.memgraph.io/D1097
2018-01-10 14:58:57 +01:00
Mislav Bradac
d3623585e7 Migrate cereal to boost_serialization
Reviewers: teon.banek

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1077
2017-12-27 13:44:46 +01:00
florijan
3ae45e0d19 Add master/worker flags, main functions and coordination
Reviewers: dgleich, mislav.bradac, buda

Reviewed By: mislav.bradac

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1060
2017-12-19 16:05:57 +01:00