Add initial version of dynamic partitioning feature spec
Reviewers: dgleich, dtomicevic, msantl
Reviewed By: msantl
Differential Revision: https://phabricator.memgraph.io/D1388
commit 5db39ac501 (parent 98350274ad)
docs
@ -1,4 +1,5 @@

## Dynamic Graph Partitioning

Memgraph supports dynamic graph partitioning similar to the Spinner algorithm
mentioned in this paper: https://arxiv.org/pdf/1404.3861.pdf.

@ -7,6 +8,7 @@ it tries to keep closely connected data on one worker. It tries to avoid jumps
across workers when querying/traversing the distributed graph.

### Our implementation

It works independently on each worker, but the migration runs on only one
worker at a time. It achieves that by sharing a token between workers; the
token ownership is transferred to the next worker.
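As an illustration of the token scheme, here is a minimal sketch; the class
and method names are hypothetical, not the actual Memgraph code:

```cpp
// Hypothetical sketch of the token-sharing scheme described above: every
// worker runs this logic independently, but only the current token holder
// performs migrations, after which it hands the token to the next worker.
class DgpWorker {
 public:
  DgpWorker(int worker_id, int worker_count)
      : worker_id_(worker_id), worker_count_(worker_count) {}

  // Called periodically on every worker.
  void Poll() {
    if (!has_token_) return;  // not our turn, another worker is migrating
    RunMigrationStep();       // migrate one batch of vertices
    has_token_ = false;
    SendTokenTo((worker_id_ + 1) % worker_count_);  // round-robin hand-off
  }

  void ReceiveToken() { has_token_ = true; }

 private:
  void RunMigrationStep() { /* pick the best vertices and move them */ }
  void SendTokenTo(int /*worker_id*/) { /* e.g. an RPC to that worker */ }

  int worker_id_;
  int worker_count_;
  bool has_token_ = false;
};
```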
@ -18,6 +20,7 @@ migrations, which might cause an update of some vertex from two or more
different transactions.

### Migrations

For each vertex and worker id (a label in the context of the DGP algorithm) we
define a score function. The score function takes into account the labels of
the endpoints of the vertex's edges (in/out) and the capacity of the worker
with said label.
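For illustration, a Spinner-style score can be sketched as a mix of neighbour
locality and remaining worker capacity. Everything below (the signature, the
weighting) is an assumption, not the real implementation:

```cpp
#include <unordered_map>

// Rough sketch of a Spinner-style score for placing a vertex on `worker`.
// `neighbour_count[w]` = number of endpoints of the vertex's in/out edges
// that currently live on worker (label) w; `degree` = total number of such
// endpoints. The second term penalises workers that are close to capacity.
double Score(int worker, const std::unordered_map<int, int> &neighbour_count,
             int degree, double worker_load, double worker_capacity) {
  auto it = neighbour_count.find(worker);
  double locality = (degree > 0 && it != neighbour_count.end())
                        ? static_cast<double>(it->second) / degree
                        : 0.0;                       // neighbours kept local
  double free_capacity = 1.0 - worker_load / worker_capacity;
  return locality + free_capacity;  // relative weighting is tunable
}
```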

docs/feature_specs/dynamic_graph_partitioning.md (new file, 75 lines)
@ -0,0 +1,75 @@

# Dynamic Graph Partitioning (abbr. DGP)

## Implementation

Take a look under `dev/memgraph/distributed/dynamic_graph_partitioning.md`.

### Implemented parameters

--dynamic-graph-partitioner-enabled (If the dynamic graph partitioner should
  be enabled.) type: bool default: false (start time)
--dgp-improvement-threshold (How much better a specific node's score has to
  be to consider a migration to another worker. This is the minimal
  difference between the new score the vertex would have after migration and
  its old score for the migration to happen.) type: int32 default: 10
  (start time)
--dgp-max-batch-size (Maximal number of vertices that should be migrated in
  one dynamic graph partitioner step.) type: int32 default: 2000 (start time)

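The help text above matches the gflags `--help` output format, so the flags
could plausibly be declared as follows; only the names, types, and defaults
come from the spec, the rest is a sketch:

```cpp
#include <gflags/gflags.h>

// Sketch of how the flags above could be declared with gflags. Only the flag
// names, types, and default values are taken from the spec.
DEFINE_bool(dynamic_graph_partitioner_enabled, false,
            "If the dynamic graph partitioner should be enabled.");
DEFINE_int32(dgp_improvement_threshold, 10,
             "How much better a node's score has to be to consider a "
             "migration to another worker.");
DEFINE_int32(dgp_max_batch_size, 2000,
             "Maximal number of vertices migrated in one dynamic graph "
             "partitioner step.");
```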

## Planning

### Design decisions

* Each partitioning session has to be a new transaction.
* When and how does an instance perform the moves?
  * Periodically.
  * Token sharing (round robin: exactly one instance at a time has the
    opportunity to perform the moves); see the sketch after this list.
* On a server-side serialization error (when DGP receives an error)
  -> quit partitioning and wait for the next turn.
* On a client-side serialization error (when the end client receives an error)
  -> The client should never receive an error because of any internal
     operation.
  -> For the first implementation, it's good enough to wait until the data
     becomes available again.
  -> It would be nice if DGP had a lower priority than end-client operations.

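A sketch of one partitioning turn under these decisions; all types and
helpers below are stand-ins for the real storage API:

```cpp
#include <stdexcept>

// Hypothetical stand-ins for the real storage API, only to keep the sketch
// self-contained.
struct SerializationError : std::runtime_error {
  using std::runtime_error::runtime_error;
};
struct Transaction {
  void Commit() {}
  void Abort() {}
};
struct Database {
  Transaction Begin() { return {}; }
};

void MoveBatch(Transaction &) { /* migrate up to --dgp-max-batch-size vertices */ }

// One partitioning turn: it runs in its own transaction, and on a
// server-side serialization error the worker quits partitioning and waits
// for its next turn instead of retrying.
void PartitioningTurn(Database &db) {
  Transaction tx = db.Begin();
  try {
    MoveBatch(tx);
    tx.Commit();
  } catch (const SerializationError &) {
    tx.Abort();
    return;  // wait for the next turn
  }
}
```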
### End-user parameters

* --dynamic-graph-partitioner-enabled (execution time)
* --dgp-improvement-threshold (execution time)
* --dgp-max-batch-size (execution time)
* --dgp-min-batch-size (execution time)
  -> Minimum number of nodes that will be moved in each step.
* --dgp-fitness-threshold (execution time)
  -> Do not perform moves if the partitioning is already good enough.
* --dgp-delta-turn-time (execution time)
  -> Time between two consecutive turns.
* --dgp-delta-step-time (execution time)
  -> Time between two consecutive steps.
* --dgp-step-time (execution time)
  -> Time limit for each step.

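To show how the proposed timing parameters could interact, here is a sketch
of a worker's main loop; the scheduler structure and helper names are
assumptions, and the flag values are hard-coded for brevity:

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Assumed helpers: token ownership and a migration routine that keeps moving
// vertices while the supplied predicate stays true.
bool HasToken() { return true; /* placeholder */ }
void MigrateWhile(const std::function<bool()> &keep_going) { (void)keep_going; }

// Sketch of the DGP main loop: while a worker holds the token it performs
// steps bounded by --dgp-step-time, pausing --dgp-delta-step-time between
// steps; between turns it sleeps --dgp-delta-turn-time.
void RunDgpLoop() {
  using Clock = std::chrono::steady_clock;
  const auto delta_turn_time = std::chrono::seconds(10);  // --dgp-delta-turn-time
  const auto delta_step_time = std::chrono::seconds(1);   // --dgp-delta-step-time
  const auto step_time = std::chrono::milliseconds(500);  // --dgp-step-time
  while (true) {
    while (HasToken()) {  // one turn
      const auto deadline = Clock::now() + step_time;
      MigrateWhile([&] { return Clock::now() < deadline; });  // one step
      std::this_thread::sleep_for(delta_step_time);
    }
    std::this_thread::sleep_for(delta_turn_time);
  }
}
```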
### Testing

The implementation has to provide good enough results in terms of:

* How good the partitioning is (numeric value), aka goodness.
* Workload execution time.
* Stress test correctness.

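The spec does not pin down the goodness metric; one natural candidate,
sketched below as an assumption, is the fraction of edges whose endpoints
live on the same worker:

```cpp
#include <cstdint>
#include <vector>

// One plausible goodness measure (an assumption; the spec does not define
// the metric): the fraction of edges with both endpoints on the same worker.
struct Edge {
  int from_worker;  // worker holding the edge's source vertex
  int to_worker;    // worker holding the edge's destination vertex
};

double Goodness(const std::vector<Edge> &edges) {
  if (edges.empty()) return 1.0;
  std::int64_t local = 0;
  for (const Edge &e : edges)
    if (e.from_worker == e.to_worker) ++local;  // no cross-worker jump
  return static_cast<double>(local) / edges.size();
}
```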
Test cases:

* N disconnected subgraphs
  -> shuffle nodes across N instances
  -> run partitioning
  -> test for perfect partitioning.
* N connected subgraphs
  -> shuffle nodes across N instances
  -> run partitioning
  -> test the partitioning.
* Take a realistic workload (Long Running, LDBC1, LDBC2, Card Fraud, BFS, WSP)
  -> measure execution time
  -> run partitioning
  -> test the partitioning
  -> measure execution time (during and after partitioning).
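
For the first test case, the success condition can be checked mechanically:
after DGP runs, each disconnected subgraph should live entirely on one
worker. A sketch of that check, with a hypothetical data layout:

```cpp
#include <unordered_map>
#include <vector>

// Sketch of the "perfect partitioning" check for N disconnected subgraphs
// (the data layout is hypothetical): every subgraph must end up on exactly
// one worker.
struct PlacedVertex {
  int subgraph_id;  // which disconnected component the vertex belongs to
  int worker_id;    // where DGP placed it
};

bool PerfectlyPartitioned(const std::vector<PlacedVertex> &vertices) {
  std::unordered_map<int, int> worker_of_subgraph;
  for (const PlacedVertex &v : vertices) {
    auto [it, inserted] =
        worker_of_subgraph.emplace(v.subgraph_id, v.worker_id);
    if (!inserted && it->second != v.worker_id) return false;  // split subgraph
  }
  return true;
}
```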