Commit Graph

555 Commits

Author SHA1 Message Date
Teon Banek
a44d9a0fac Always plan Synchronize if the plan needs it
Reviewers: florijan, msantl

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1241
2018-02-26 13:16:08 +01:00
Teon Banek
30d2cfb9db Plan PullRemoteOrderBy
Summary:
During distributed execution, OrderBy is split across workers and the
master gets to merge those results via PullRemoteOrderBy. Since this
operator may be an input to almost any other operator, virtual accessors
to `input` have been added in LogicalOperator.

Depends on D1221

Reviewers: florijan, msantl, buda

Reviewed By: msantl

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1232
2018-02-26 11:15:46 +01:00
Matija Santl
b9d61a0127 Implement distributed aware OrderBy operator
Summary:
Extracted `TypedValueVectorCompare` and `RemotePuller` from operators
so it can be reused.
The new `PullRemoteOrerBy` operator pulls one result from each worker and one
from master, relies on the fact that workers/master returned sorted results,
returns the next one in order, and pulls the source of that result to get the
next one.

Depends on D1215 that (at the moment) is still in review.

Reviewers: florijan, teon.banek

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1221
2018-02-23 17:44:21 +01:00
florijan
cbc9420c17 Support distributed query::plan::CreateExpand
Summary:
- The expansion vertex gets created on the origin's worker
- The edge automatically gets created wherever necessary
- Vertex creation logic reuse
- End to end test

Reviewers: teon.banek, msantl

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1227
2018-02-23 17:34:01 +01:00
Matej Ferencevic
d99724647a Add thread names
Summary:
To see the thread names in htop press <F2> to enter setup and under
'Display options' choose 'Show custom thread names'.

Reviewers: buda, teon.banek, florijan

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1234
2018-02-23 16:04:49 +01:00
florijan
80b86f0314 Add test/manual/distributed_console
Summary:
- Add the said test
- Add `workerid` function
- Add random wait (enabled with flag) to `rpc::Client::Call` for network
  latency simulation

Reviewers: buda, mferencevic, teon.banek

Reviewed By: buda, teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1222
2018-02-23 15:23:26 +01:00
Matej Ferencevic
c877c87bb4 Refactor RPC
Summary:
Previously, the RPC stack used the network stack only to receive messages. The
messages were then added to a separate queue that was processed by different
thread pools. This design was inefficient because there was a lock when
inserting and getting messages from the common queue.

This diff removes the need for separate thread pools by utilising the new
network stack design. This is possible because the new network stack allows
full processing of the network request without blocking the whole queue.

Reviewers: buda, florijan, teon.banek, dgleich, mislav.bradac

Reviewed By: buda

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1229
2018-02-23 12:07:22 +01:00
Teon Banek
452ce6a30c Use std::async for fetching remote edges in Expand
Summary:
This may improve the query execution workload, because the Expand will
yield local results while it waits for remote ones. Note that we rely on
the fact that walking the graph produces results without any predetermined
order. Therefore, we can yield paths as we see fit. Enforcing the order
is done through OrderBy operator, and this change shouldn't affect that.

Reviewers: florijan, msantl, buda

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1215
2018-02-22 17:11:53 +01:00
Dominik Gleich
753aa07cdf Tests and fix remote update already deleted bug
Summary:
Updating a record locally while there is an remote update waiting to be applied caused
the operation to return as already deleted, instead of applying it

Reviewers: florijan

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1226
2018-02-22 16:36:22 +01:00
Matej Ferencevic
017e8004e8 Refactor network stack
Summary:
Previously, the network stack `communication::Server` accepted connections and
assigned them statically in a round-robin fashion to `communication::Worker`.
That meant that if two compute intensive connections were assigned to the same
worker they would block each other while the other workers would do nothing.

This implementation replaces `communication::Worker` with
`communication::Listener` which holds all accepted connections in one pool and
ensures that all workers execute all connections.

Reviewers: buda, florijan, teon.banek

Reviewed By: buda

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1220
2018-02-22 16:29:17 +01:00
Teon Banek
c7eaf4711a Pass a flag for distributed CreateNode
Reviewers: florijan

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1218
2018-02-22 14:25:46 +01:00
Marin Tomic
5ff1ca4a5d Add client side RPC stats
Summary: ^^

Reviewers: mferencevic, buda

Reviewed By: mferencevic

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1224
2018-02-22 12:34:48 +01:00
Dominik Gleich
46b8a91a2f Add rpc_worker_clients tests
Reviewers: florijan

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1214
2018-02-21 13:54:28 +01:00
florijan
ae95e9480c Use RemoteCreateVertex in CreateNode operator
Summary:
- Add `database::GraphDb::GetWorkerIds()`
- Change `CreateNode` constructor API
- Make `CreateNode` distribute nodes uniformly over workers

Did not yet modify `CreateExpand`, coming in the next diff.

Reviewers: teon.banek

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1216
2018-02-21 12:17:43 +01:00
florijan
0c40c67ac2 Implement remote create (storage, RPC, not operator)
Summary:
Implementation of remote vertex and edge creation. This diff addresses
the creation API (`GraphDbAccessor::InsertEdge`,
`GraphDbAccessor::InsertRemoteVertex`) and the necessary RPC and
`RemoteCache` stuff.

What is missing for full remote creation support are
`query::plan::operator` changes that are expected to minor. Pushing this
diff as it's large enough, operator and end to end tests in the next.

Also, the naming of existing structures and files is confusing (update
refering to both updates and created, `results` used too often etc.). I
will address this too, but feel free to comment on bad naming.

Reviewers: dgleich, teon.banek, msantl

Reviewed By: dgleich

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1210
2018-02-21 09:17:48 +01:00
Teon Banek
2055176139 Plan basic distributed Cartesian without any checks
Summary:
This is still very much in progress.  No advanced checks are done to
prevent planning unimplemented things.  Basic Cartesian product should
work, for example `MATCH (a), (b) CREATE (a)-[:r]->(c)-[:r]->(b)`. But
anything more advanced may lead to undefined behaviour of the planner
and therefore execution. Use at your own risk!

Add ModifiedSymbols method to LogicalOperator

For planning Cartesian, we need information on which symbols are filled
by operator sub-trees.  Currently, this is used to set symbols which
should be transferred over network. Later, they should be used to detect
whether filter expressions use symbols modified from Cartesian branches.
Then we will be able to ensure correct dependency of filters and their
behaviour.

Prepare DistributedPlan for multiple worker plans

Since Cartesian branches need to be split and handled by each worker, we
now dispatch multiple plans to workers.

Reviewers: florijan, msantl, buda

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1208
2018-02-20 13:02:08 +01:00
Teon Banek
8856682c7f Always plan Synchronize for write queries
Reviewers: florijan, msantl

Reviewed By: florijan, msantl

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1179
2018-02-20 09:55:42 +01:00
Dominik Gleich
0639c44ed7 Fix flaky concurrent list test
Summary:
Rename and fix concurrent list test.,
This tests a fix for the concurrent list.I was not able to reproduce the flakyness, but this could be it.
If it happens again we'll know that this is not the solution to the problem, and look further.
Change memory ordering

Reviewers: mferencevic

Reviewed By: mferencevic

Subscribers: buda, pullbot

Differential Revision: https://phabricator.memgraph.io/D1188
2018-02-19 12:49:38 +01:00
florijan
9721ccf61c Cleanup per-transaction caches in distributed
Summary:
On the master cleanups are hooked directly into the transaction engine.
This is beneficial because the master might have bigger caches and we
want to clear them as soon as possible.

On the workers there is a periodic RPC call to the master about living
transactions, which takes care of releasing local caches. This is
suboptimal because long transactions will prevent cache GC (like with
data GC). It is however fairly simple.

Note that all cleanup is not done automatically and `RemotePull` has
been reduced accordingly. @msantl, please verify correctness and
consider if the code can be additionally simplified.

Reviewers: teon.banek, msantl

Reviewed By: msantl

Subscribers: pullbot, msantl

Differential Revision: https://phabricator.memgraph.io/D1202
2018-02-16 15:30:05 +01:00
Matija Santl
84e7aec27b Implement Cartesian Cursor
Summary:
Pulls left op cursor and keeps the result, and then for each pull of
the right op cursor, adds all the left op results to produce a cartesian
product.

Reviewers: teon.banek, florijan

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1201
2018-02-16 13:43:12 +01:00
Matej Ferencevic
763bdaac0f Add support for large messages to RPC
Reviewers: buda, teon.banek, mculinovic

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1196
2018-02-16 12:38:31 +01:00
Marin Tomic
d15464e181 Add some metric types and basic RPC server stats
Summary:
Get rid of client class

Fix cppcheck errors

Add documentation to metrics.hpp

Add documentation to stats.hpp

Remove stats from global namespace

Fix build failures

Refactor a bit

Refactor stopwatch into a function

Add rpc execution time stats

Fix segmentation fault

Reviewers: mferencevic

Reviewed By: mferencevic

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1200
2018-02-16 08:33:15 +01:00
florijan
b2d7f95568 Extract address types
Summary:
We have been using `Edges::VertexAddress` and `Edges::EdgeAddress` a lot
in other parts of the codebase because it's cleaner to write then
`Address<mvcc::VersionList<Edge>>`, especially in code what should not
really be MVCC-aware. However, a lot of that code should not really be
`Edges` aware either, as that's a storage datastructure that should not
be exposed.

This became annoying, so I extracted these addresses into a type-file. I
don't really like this approach, it might be better to have
`Vertex::Address` and `Edge::Address`, but that means we'd have to
import those headers and we'd get circular dependencies.

“The horror! The horror!”
   - Joseph Conrad, Heart of Darkness

Reviewers: teon.banek, buda

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1204
2018-02-15 17:31:10 +01:00
florijan
796946ad1b Implement sync operator
Reviewers: teon.banek, msantl

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1192
2018-02-14 13:03:18 +01:00
Matija Santl
b1a8e48dc4 Add ExpandWeightedShortestPathCursor
Summary: Implement Dijkstras algorithm to find weighted shortest path.

Reviewers: teon.banek, florijan, buda

Reviewed By: teon.banek

Subscribers: pullbot, mtomic

Differential Revision: https://phabricator.memgraph.io/D1182
2018-02-14 09:32:41 +01:00
Teon Banek
4326847ab3 Add SINGLE function to openCypher
Reviewers: florijan, msantl, buda

Reviewed By: msantl

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1193
2018-02-13 17:30:44 +01:00
Teon Banek
e8608f4050 Request correct remote symbols for non-associative aggregation
Reviewers: florijan, msantl, mculinovic

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1191
2018-02-13 15:39:09 +01:00
Teon Banek
c1e4676316 Add optional total_weight variable to wShortest grammar
Reviewers: florijan, msantl, buda

Reviewed By: msantl

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1187
2018-02-12 13:23:17 +01:00
Teon Banek
db407142ce Add wShortest expansion to openCypher
Reviewers: florijan, msantl

Reviewed By: msantl

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1181
2018-02-09 11:02:41 +01:00
Matija Santl
013ee34f90 Fix RemotePull infinite loop when all workers don't have results
Reviewers: buda, teon.banek, florijan

Reviewed By: buda

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1184
2018-02-08 16:29:34 +01:00
florijan
45da645a4b Fix remote pull
Summary:
- Extracted TypedValue reconstruction into utils (will need that soon)
- Added more error handling in RemotePull

Reviewers: teon.banek, msantl

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1183
2018-02-08 15:00:40 +01:00
florijan
81e2e8f64f Add remote updates RPC
Summary:
Updates are supported, insertions and removals not in this diff. The
test is a bit overdesigned, it happens.

Reviewers: teon.banek, dgleich, msantl

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1176
2018-02-07 15:29:57 +01:00
Teon Banek
fa55c130ba Add REDUCE function to openCypher
Reviewers: florijan, msantl

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1175
2018-02-07 12:48:20 +01:00
Matija Santl
a66351c3f4 Make RemotePull operator async
Reviewers: teon.banek, florijan

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1168
2018-02-06 13:42:11 +01:00
florijan
f84b0b0787 Fix address serialization
Summary:
Defined a new test based on reported bug, for multiple remote expansion.
Fixed the bug. Introduced minor refactors in distributed unit testing.

Reviewers: mculinovic, dgleich

Reviewed By: mculinovic

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1173
2018-02-06 11:01:06 +01:00
Teon Banek
abe419984a Plan Unwind and write operations in distributed
Summary:
This is a basic take on planning distributed writes. Main logic is in handling
the Accumulate operator, which requires Synchronize operation if the results
are used after modification.

Tests for planning distributed writes have been added.

Reviewers: florijan, msantl

Reviewed By: msantl

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1172
2018-02-06 10:13:53 +01:00
florijan
f808252142 Prepare RecordAccessor for distributed, part one
Summary:
This diff consolidates local and remote update handling. It ensures and
tests that updates for remote elements are visible locally (on the
updating worker).

The next part will be accumulating remote updates and applying them on
the owner.

Also extracted a common testing fixture.

Reviewers: dgleich, buda, mtomic

Reviewed By: mtomic

Subscribers: mtomic, pullbot

Differential Revision: https://phabricator.memgraph.io/D1169
2018-02-05 09:48:50 +01:00
Marin Tomic
dca1e9eebc Add initial version of statsd
Reviewers: mferencevic, teon.banek, buda, mculinovic

Reviewed By: mferencevic

Subscribers: teon.banek, pullbot

Differential Revision: https://phabricator.memgraph.io/D1152
2018-02-02 17:02:37 +01:00
florijan
1d5d67aeac Refactor database::StateDelta
Summary:
Refactor in two ways. First, expose members without getters as we will
need most of them in distributed. And this was always the sensible thing
to do. Second, add storage type values to deltas. This is also a
sensible thing to do, and it will be very beneficial in distributed. We
didn't do it before because name<->value type mappings aren't guaranteed
to be the same after recovery. A task has been added to address this
(preserve mappings in durability).

Reviewers: dgleich, buda

Reviewed By: dgleich

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1167
2018-02-02 11:44:03 +01:00
Teon Banek
583f6f1794 Add Synchronize operator stub
Reviewers: florijan, msantl

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1166
2018-02-02 11:21:15 +01:00
Teon Banek
27a90a5c25 Support distributed CreateIndex plan
Reviewers: florijan, msantl

Reviewed By: msantl

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1162
2018-02-01 13:05:11 +01:00
florijan
b97b48b365 Support worker transaction begin/advance/commit/abort
Reviewers: dgleich, buda, teon.banek

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1161
2018-02-01 12:12:20 +01:00
florijan
e5035cf477 Support graph elements in remote pull rpc
Reviewers: teon.banek, dgleich, msantl, buda

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1157
2018-02-01 10:53:41 +01:00
Teon Banek
abdd7d8e9d Add basic tests for distributed query plans
Reviewers: florijan, msantl

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1156
2018-01-31 12:56:55 +01:00
florijan
3f30a6fad9 Add async remote pull RPC
Summary:
Remote pulls can now be async. Note that
`RemotePullRpcClients::RemotePull` takes references to data structures
which should not be temporary in the caller. Still, maybe safer to make
copies?

Changed `RpcWorkerClients` API to make that possible.

Reviewers: dgleich, msantl, teon.banek

Reviewed By: msantl

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1150
2018-01-30 10:32:20 +01:00
Dominik Gleich
c37bb87ed8 Support snapshot creation and recovery in distributed
Summary:
Add custom encoder/decoder

Update snapshot recovery

Reviewers: florijan, teon.banek, mferencevic, mculinovic

Reviewed By: florijan

Subscribers: mferencevic, pullbot

Differential Revision: https://phabricator.memgraph.io/D1142
2018-01-29 19:16:13 +01:00
florijan
bfb3a0d9b1 Resolve global address to local
Summary:
It is possible that we have a global address to resolve, for a graph
element that's local. Consider W1 expanding, getting data from W2,
expanding from there and getting data that is on W1. We then don't want
to do RPC from W1 to W1, but do a lookup directly.

Reviewers: dgleich

Reviewed By: dgleich

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1145
2018-01-26 16:09:19 +01:00
Dominik Gleich
24857cc1cf Support distributed (label, property) indexes
Summary:
Call workers buildindex

Merge branch 'master' into setup_distributed_index

Use ExecuteOnWorkers api

Merge branch 'master' into setup_distributed_index

Improve test

Merge branch 'master' into setup_distributed_index

Finish test

Reviewers: florijan, teon.banek

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1134
2018-01-25 18:08:04 +01:00
florijan
ccefae4002 Add distributed execution RPC
Summary:
NOTE: This diff is still in progress. Many TODOs, lacking documentation
etc. But the main logic is there (some could be different), and it tests
out OK.

Reviewers: teon.banek, msantl, buda

Reviewed By: teon.banek

Subscribers: mferencevic, pullbot

Differential Revision: https://phabricator.memgraph.io/D1138
2018-01-25 14:50:28 +01:00
Matej Ferencevic
e2f2cd5722 Improve network performance
Summary:
With this patch the number of packets for a simple RPC call is lowered
from 22 to 12 (45% reduction). The number of packets for the Bolt protocol
is lowered from 26 to 18 (30% reduction).
Impact on the Bolt protocol will be a constant of ~ 8 packets less per
connection, while the impact on the RPC protocol will be approximately
a 45% reduction overall.

Reviewers: buda, teon.banek

Reviewed By: buda

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1141
2018-01-25 14:04:02 +01:00