Commit Graph

69 Commits

Author SHA1 Message Date
Teon Banek
2c50ea41d5 Split GraphDb to distributed and single node files
Summary:
This change, hopefully, simplifies the implementation of different kinds
of GraphDb. The pimpl idiom is now simplified by removing all of the
crazy inheritance. Implementations classes are just plain data stores,
without any methods. The interface classes now have a more flat
hierarchy:

```
    GraphDb (pure interface)
         |
    +----+---------- DistributedGraphDb (pure interface)
    |                         |
Single Node             +-----+------+
                        |            |
                      Master       Worker
```

DistributedGraphDb is used as an intermediate interface for all the
things that should work only in distributed. Therefore, virtual calls
for distributed stuff have been removed from GraphDb. Some are exposed
via DistributedGraphDb, other's are only in concrete Master and Worker
classes. The code which relied on those virtual calls has been
refactored to either use DistributedGraphDb, take a pointer to what is
actually needed or use dynamic_cast. Obviously, dynamic_cast is a
temporary solution and should be replaced with another mechanism (e.g.
virtual call, or some other function pointer style).

The cost of the above change is some code duplication in constructors
and destructors of classes. This duplication has a lot of little tweaks
that make it hard to generalize, not to mention that virtual calls do
not work in constructor and destructor. If we really care about
generalizing this, we should think about abandoning RAII in favor of
constructor + Init method.

The next steps for splitting the dependencies that seem logical are:

  1) Split GraphDbAccessor implementation, either via inheritance or
     passing in an implementation pointer. GraphDbAccessor should then
     only be created by a virtual call on GraphDb.
  2) Split Interpreter implementation. Besides allowing single node
     interpreter to exist without depending on distributed, this will
     enable the planner and operators to be correctly separated.

Reviewers: msantl, mferencevic, ipaljak

Reviewed By: msantl

Subscribers: dgleich, pullbot

Differential Revision: https://phabricator.memgraph.io/D1493
2018-07-20 10:48:38 +02:00
Matija Santl
4c27596fdd Implement kafka transform functionality
Summary:
First iteration in implementing kafka.
Currently, memgraph streams won't use the transform script provided in the
`CREATE STREAM` query.

There is a manual test that serves a POC purpose which we'll use to fully wire
kafka in memgraph.

Since streams need to download the script, I moved curl init from
telemetry.

Reviewers: teon.banek, mferencevic

Reviewed By: mferencevic

Subscribers: ipaljak, pullbot, buda

Differential Revision: https://phabricator.memgraph.io/D1491
2018-07-19 12:52:25 +02:00
Matija Santl
fa7e214bcf Add kafka library and integrate it into memgraph
Summary:
Integrated kafka library into memgraph. This version supports all opencypher
features and will only output messages consumed from kafka.

Depends on D1434

Next steps are persisting stream metadata and transforming messages in order to
store them in the graph.

Reviewers: teon.banek, mtomic, mferencevic, buda

Reviewed By: teon.banek

Subscribers: mferencevic, pullbot, buda

Differential Revision: https://phabricator.memgraph.io/D1466
2018-07-06 15:52:23 +02:00
Dominik Gleich
1875be1e34 Create Dynamic Graph Partitioner,
Summary:
Count labels in DGP

Add worker_id getter

Replace current_worker_id with worker_id

Add spinner

Add rpc calls for worker counts

Check worker capacity

Migrate to new worker

Fix moving two connected vertices

Token sharing algorithm

Reviewers: msantl, buda

Reviewed By: buda

Subscribers: msantl, pullbot

Differential Revision: https://phabricator.memgraph.io/D1392
2018-05-29 11:32:39 +02:00
Marko Budiselic
68755edc5c Fix compile warning (order of database::Config members)
Reviewers: msantl, teon.banek, ipaljak

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1396
2018-05-16 14:57:43 +02:00
Marin Tomic
91e38f6413 Distributed BFS
Summary: depends on D1387

Reviewers: msantl, teon.banek, buda

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1341
2018-05-15 17:38:51 +02:00
Ivan Paljak
909e42d414 Add initial version of properties on disk
Summary:
A simplified end-to-end implementation

POD interface set-up, still have bugs with HDDkeys

Version bug fix and first iterator implementation

Fixed out-of-scope reference in PVS iterator

Added PVS unit tests

Reviewers: buda, mferencevic, dgleich, teon.banek

Reviewed By: buda, dgleich

Subscribers: mferencevic, pullbot

Differential Revision: https://phabricator.memgraph.io/D1369
2018-05-10 17:58:38 +02:00
Dominik Gleich
797bd9e435 Notify master of worker recovery
Summary: Notify master when workers finish recovering such that master doesn't start doing stuff before they recovered

Reviewers: buda, msantl, mferencevic

Reviewed By: buda

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1389
2018-05-10 11:09:34 +02:00
Teon Banek
939056eac7 Add single node memgraph binary
Summary:
This binary is installed and packaged for release. This is just a quick
solution for releasing the Community 0.10 version. We still need to
setup the installation and packaging for both the Enterprise and
Community versions. Additionally, the automated build system needs to
test both binaries for correct behaviour. Obviously, some tests can only
be run on one of the 2 versions.

Reviewers: mferencevic, buda

Reviewed By: buda

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1363
2018-04-23 14:05:23 +02:00
Marko Budiselic
c76170a9db Clean utils folder (namespaces, function names)
Reviewers: teon.banek

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1359
2018-04-22 09:44:32 +02:00
Dominik Gleich
261d50a02e Destroy and create storage object after failed recovery
Reviewers: buda

Reviewed By: buda

Subscribers: mferencevic, pullbot

Differential Revision: https://phabricator.memgraph.io/D1370
2018-04-20 16:16:52 +02:00
Dominik Gleich
b20e31e800 Synchronize snapshooting
Summary: Make synchronized snapshot. This invokese the snapshooter on workers on the master snapshot scheduler interval.

Reviewers: msantl, mtomic

Reviewed By: msantl

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1334
2018-04-06 10:10:01 +02:00
Dominik Gleich
eb30ecb6a0 Commit log gc
Summary: Adds a commit log garbage collector, which clears old transactions from the commit log

Reviewers: florijan

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1310
2018-04-04 10:25:25 +02:00
florijan
ac8c96ccc2 Tidyup distributed stuff naming
Summary:
Remove "produce_" and "Produce" as prefix from all distributed stuff.
It's not removed in src/query/ stuff (operators).

Reviewers: dgleich

Reviewed By: dgleich

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1315
2018-03-23 16:32:29 +01:00
florijan
43c0e91057 Enable transaction killer on worker
Summary:
I have not thoroughly thought this through, especially the worker
destruction (is it legit to abort all running tx?), but it's tested to
abort during remote pull, what we need.

Also I improved error handling for vertex deletion failure during
remote pull (@dgleich).

Reviewers: teon.banek, msantl, dgleich

Reviewed By: dgleich

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1263
2018-03-05 12:45:42 +01:00
Matija Santl
1ca98826af Use the same ClientPools in distributed
Summary:
Instead of passing `coordination`, pass `rpc_worker_clients` that
holds a map of worker_id->clientPool. By having only one instance of
`RpcWorkerClients` that is owned by `GraphDB` and passing it by refference
we'll share the same client pools for rpc clients.

Reviewers: teon.banek, florijan, dgleich, mferencevic

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1261
2018-03-01 17:14:59 +01:00
Matej Ferencevic
c877c87bb4 Refactor RPC
Summary:
Previously, the RPC stack used the network stack only to receive messages. The
messages were then added to a separate queue that was processed by different
thread pools. This design was inefficient because there was a lock when
inserting and getting messages from the common queue.

This diff removes the need for separate thread pools by utilising the new
network stack design. This is possible because the new network stack allows
full processing of the network request without blocking the whole queue.

Reviewers: buda, florijan, teon.banek, dgleich, mislav.bradac

Reviewed By: buda

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1229
2018-02-23 12:07:22 +01:00
florijan
ae95e9480c Use RemoteCreateVertex in CreateNode operator
Summary:
- Add `database::GraphDb::GetWorkerIds()`
- Change `CreateNode` constructor API
- Make `CreateNode` distribute nodes uniformly over workers

Did not yet modify `CreateExpand`, coming in the next diff.

Reviewers: teon.banek

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1216
2018-02-21 12:17:43 +01:00
florijan
fc2703833c Refactor remote cache ownership
Summary:
Remote caches used to be owned by `GraphDbAccessor`. An advantage of
that was immediate cleanup when destructing. A disadvantage was sharing
the remote cache between mutliple program-flows in the same transaction
in distributed (one would have to share the accessor).

We will have to do post-transactional global cleanup anyway, since we
leak, which reduces the above stated advantage. And the stated
disadvantage is becoming more and more pronounced as additional
components need access to the remote cache.

Hence the refactor.

Reviewers: buda, teon.banek, msantl

Reviewed By: buda

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1186
2018-02-09 14:12:11 +01:00
florijan
81e2e8f64f Add remote updates RPC
Summary:
Updates are supported, insertions and removals not in this diff. The
test is a bit overdesigned, it happens.

Reviewers: teon.banek, dgleich, msantl

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1176
2018-02-07 15:29:57 +01:00
florijan
e5a55a39e3 Fix distributed master index recovery from snapshot
Summary:
Change `GraphDb` so it exposes index clients in the same
convention as other members.

Reviewers: dgleich, mculinovic

Reviewed By: mculinovic

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1159
2018-02-01 08:01:14 +01:00
Dominik Gleich
24857cc1cf Support distributed (label, property) indexes
Summary:
Call workers buildindex

Merge branch 'master' into setup_distributed_index

Use ExecuteOnWorkers api

Merge branch 'master' into setup_distributed_index

Improve test

Merge branch 'master' into setup_distributed_index

Finish test

Reviewers: florijan, teon.banek

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1134
2018-01-25 18:08:04 +01:00
florijan
ccefae4002 Add distributed execution RPC
Summary:
NOTE: This diff is still in progress. Many TODOs, lacking documentation
etc. But the main logic is there (some could be different), and it tests
out OK.

Reviewers: teon.banek, msantl, buda

Reviewed By: teon.banek

Subscribers: mferencevic, pullbot

Differential Revision: https://phabricator.memgraph.io/D1138
2018-01-25 14:50:28 +01:00
Matija Santl
62323965e3 Implement distributed plan dispatcher/consumer methods
Summary:
Implementations of `DistributePlanRpc`.
I'll add tests afterwards #promise

Reviewers: teon.banek, florijan

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1131
2018-01-24 10:50:27 +01:00
florijan
e1e4a70714 Implement graph element rpc
Summary:
- End to end distributed GraphDb testing
- Refactors as necessary
- Basic RemoteCache for storing remote data
- RemoteDataRpc

As we are on a tight schedule, please let's focus on the essentials:
functionality and proper testing.

Reviewers: dgleich, teon.banek, buda

Reviewed By: dgleich

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1121
2018-01-22 14:20:41 +01:00
florijan
1953f3563f Refactor GraphDb so ::impl classes inherit GraphDb
Summary:
A slight insanity here... I realized I will need to create
`GraphDbAccessor` instance (which need `&GraphDb`) within some members
of `::impl` classes. Within those classes I can pass `this` to those
members, if `this` is a valid `GraphDb`. Semantically it really is (at
the moment), but heirarchically it wasn't. This diff changes that.
`GraphDb`  is now only an interface. `PublicBase` is the base for all
the public classes, `PrivateBase` for the `::impl` classes. Seems to
work.

Oh yes, another thing to keep in mind when doing this is that I should avoid
calling virtual functions in public classes (the motivation for the double
heirarchy). Before this diff the getters weren't virtual, now they are, so
I should have made all the appropriate changes in code as well.

Buda, was this a task I could have delegated to you or Cula?

Reviewers: teon.banek, dgleich, buda

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1120
2018-01-19 15:40:04 +01:00
Dominik Gleich
2a130e784e Worker id in snapshot/wal
Summary:
Adds worker id to snapshot and wal filename.
Adds a new worker_id flag to be used for recovering a worker with a distributed snapshot.
Adds worker_id field to snapshot to check for consistency.

Reviewers: florijan

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1096
2018-01-18 11:46:47 +01:00
florijan
813d37e939 Migrate db::types to storage::
Reviewers: teon.banek, dgleich

Reviewed By: teon.banek, dgleich

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1110
2018-01-17 10:35:12 +01:00
Dominik Gleich
5418dfb19e Rename NetworkEndpoint
Summary:
Rename redunant port str

Add endpoint << operator

Migrate everything to endpoint

Reviewers: mferencevic, florijan

Reviewed By: mferencevic

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1100
2018-01-15 15:47:37 +01:00
florijan
6fc6a27288 Refactor GraphDb
Summary:
GraphDb is refactored to become an API exposing different parts
necessary for the database to function. These different parts can have
different implementations in SingleNode or distributed Master/Server
GraphDb implementations.

Interally GraphDb is implemented using two class heirarchies. One
contains all the members and correct wiring for each situation. The
other takes care of initialization and shutdown. This architecture is
practical because it can guarantee that the initialization of the
object structure is complete, before initializing state.

Reviewers: buda, mislav.bradac, dgleich, teon.banek

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1093
2018-01-12 16:47:24 +01:00
florijan
c4327b26f4 Extract tx::SingleNodeEngine from tx::MasterEngine
Summary:
No logic changes, just split `tx::MasterEngine` into
`tx::SingleNodeEngine` and `tx::MasterEngine`. This gives better
responsibility separation and is more appropriate now there is no
Start/Shutdown.

Reviewers: dgleich, teon.banek, buda

Reviewed By: dgleich, teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1099
2018-01-11 15:44:42 +01:00
florijan
4a0345e1c5 Prepare counter for distributed
Reviewers: dgleich, teon.banek

Reviewed By: teon.banek

Differential Revision: https://phabricator.memgraph.io/D1090
2018-01-10 13:28:41 +01:00
Dominik Gleich
503381549e Change gid bit size
Summary:
Change gid methods

Rename GidGenerator and add tests

Fix tools broken by gid changes

Reviewers: dgleich, buda

Reviewed By: buda

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1044
2017-12-28 11:04:52 +01:00
Dominik Gleich
1556d78d15 Update snapshot format
Summary:
Set vertex/edge generator id from recovery

Add tests

Reviewers: florijan

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1061
2017-12-20 16:57:42 +01:00
florijan
3ae45e0d19 Add master/worker flags, main functions and coordination
Reviewers: dgleich, mislav.bradac, buda

Reviewed By: mislav.bradac

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1060
2017-12-19 16:05:57 +01:00
florijan
37722e54b3 Add RPC to concurrent ID mapper
Summary:
The distributed ID mapper is not yet utilised in GraphDb as those
changes are in D1060. Depending on landing order it will be added.

Reviewers: dgleich, mislav.bradac

Reviewed By: dgleich

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1064
2017-12-19 09:11:16 +01:00
Dominik Gleich
3ddbcad0d9 Refactor global ids and prepare for distributed
Summary:
Change ids to global ids

Fix tests

Reviewers: florijan, buda

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1019
2017-12-05 13:05:55 +01:00
florijan
d1dbf22cd1 Prepare ConcurrentIdMapper for distributed
Reviewers: dgleich

Reviewed By: dgleich

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1017
2017-12-04 14:06:05 +01:00
florijan
ce3638b25e Prepare transactional engine for distributed
Summary:
The current idea is that the same MG binary can be used for single-node,
distributed master and distributed worker. The transactional engine in
the single-node and distributed master is the same: it determines the
transactional time and exposes all the "global" functionalities. In the
distributed worker the "global" functions must contact the master.

Reviewers: dgleich, mislav.bradac, buda

Reviewed By: dgleich

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1013
2017-12-01 09:17:44 +01:00
florijan
2fbe967465 Remove UniqueObjectStore
Summary: Because it will never be used, we already have replacements for it.

Reviewers: buda

Reviewed By: buda

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1016
2017-11-30 13:04:37 +01:00
Dominik Gleich
2b2de245d1 Allow concurrent index creation
Summary: Update tests

Reviewers: florijan, buda

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1012
2017-11-28 16:44:12 +01:00
Dominik Gleich
67538aceeb Migrate labels/properties/edgetypes to ids
Summary:
In preparation for distributed storage we need to have labels/properties/edgetypes uniquely identifiable by their ids, which will be global in near future.
The old design has to be abandoned because it's not possible to keep track of global labels/properties/edgetypes while they are local pointers.

Reviewers: mislav.bradac, florijan

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D993
2017-11-23 17:12:37 +01:00
florijan
8bbf1af525 Cleanup durability config, docs, CHANGELOG
Reviewers: teon.banek, buda, mislav.bradac, dgleich

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D992
2017-11-21 10:17:13 +01:00
florijan
1e0ac8ab8f Write-ahead log
Summary:
My dear fellow Memgraphians. It's friday afternoon, and I am as ready to pop as WAL is to get reviewed...

What's done:
- Vertices and Edges have global IDs, stored in `VersionList`. Main storage is now a concurrent map ID->vlist_ptr.
- WriteAheadLog class added. It's based around buffering WAL::Op objects (elementraly DB changes) and periodically serializing and flusing them to disk.
- Snapshot recovery refactored, WAL recovery added. Snapshot format changed again to include necessary info.
- Durability testing completely reworked.

What's not done (and should be when we decide how):
- Old WAL file purging.
- Config refactor (naming and organization). Will do when we discuss what we want.
- Changelog and new feature documentation (both depending on the point above).
- Better error handling and recovery feedback. Currently it's all returning bools, which is not fine-grained enough (neither for errors nor partial successes, also EOF is reported as a failure at the moment).
- Moving the implementation of WAL stuff to .cpp where possible.
- Not sure if there are transactions being created outside of `GraphDbAccessor` and it's `BuildIndex`. Need to look into.
- True write-ahead logic (flag controlled): not committing a DB transaction if the WAL has not flushed it's data. We can discuss the gain/effort ratio for this feature.

Reviewers: buda, mislav.bradac, teon.banek, dgleich

Reviewed By: dgleich

Subscribers: mtomic, pullbot

Differential Revision: https://phabricator.memgraph.io/D958
2017-11-13 09:51:39 +01:00
Teon Banek
55456b4214 Remove Dbms
Summary:
Remove name from GraphDb.
Take GraphDb in query test macros instead of accessor.
Add is_accepting_transactions flag to GraphDb.

Reviewers: mislav.bradac, florijan, mferencevic

Reviewed By: mislav.bradac

Subscribers: mferencevic, pullbot

Differential Revision: https://phabricator.memgraph.io/D940
2017-10-30 12:33:29 +01:00
Mislav Bradac
c6f1920f8b Remove EdgeType index - not used in interpreter
Reviewers: teon.banek, florijan

Reviewed By: teon.banek, florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D870
2017-10-06 15:02:34 +02:00
florijan
bfbec8d550 Counter function added
Reviewers: buda, mferencevic, mislav.bradac

Reviewed By: mislav.bradac

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D787
2017-09-13 17:09:12 +02:00
florijan
a656ba3343 Scheduler - removed templatization
Summary: Because it was unnecessary and also implemented wrong (if someone tried using a Schduler with something other then std::mutex, it would not compile). We can trivially add this if it ever becomes necessary.

Reviewers: buda, mislav.bradac

Reviewed By: buda, mislav.bradac

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D666
2017-08-17 16:10:41 +02:00
florijan
71d8062af1 GraphDb - index garbage collection fix
Summary:
A single line (graph_db.cpp:109 in the new code) was missing. This should have been done in D355 (made by DGleich, approved by Flor AND Buda AND Mislav :D).

Converted a lambda to a method for convenience.

Reviewers: buda, dgleich, mislav.bradac

Reviewed By: buda

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D665
2017-08-17 09:14:00 +02:00
florijan
0703724295 GraphDbAccessor::BuildIndex - deadlock bugfix
Reviewers: buda, mislav.bradac, dgleich

Reviewed By: buda

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D658
2017-08-11 10:34:18 +02:00