Commit Graph

134 Commits

Author SHA1 Message Date
Matija Santl
a026c4c764 Add test stream clause
Summary:
Added test stream functionality. When a stream is configured, it will try to
consume messages from a kafka topic and return them back to the user.
For now, the messages aren't transformed, so it just returns the payload string.

Depends on D1466

Next steps are persisting stream metadata and transforming messages in order to
store them in the graph.

Reviewers: teon.banek, mtomic

Reviewed By: teon.banek

Subscribers: pullbot, buda

Differential Revision: https://phabricator.memgraph.io/D1474
2018-07-09 11:00:18 +02:00
Marin Tomic
c4f51d87f8 Implement Reset for distributed operators
Reviewers: teon.banek, msantl, buda

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1467
2018-07-06 16:02:17 +02:00
Matija Santl
fa7e214bcf Add kafka library and integrate it into memgraph
Summary:
Integrated kafka library into memgraph. This version supports all opencypher
features and will only output messages consumed from kafka.

Depends on D1434

Next steps are persisting stream metadata and transforming messages in order to
store them in the graph.

Reviewers: teon.banek, mtomic, mferencevic, buda

Reviewed By: teon.banek

Subscribers: mferencevic, pullbot, buda

Differential Revision: https://phabricator.memgraph.io/D1466
2018-07-06 15:52:23 +02:00
Marin Tomic
e2f9eb6fa5 Stop bfs early when possible
Summary: When doing bfs with given endpoint, we can stop the traversal on first successful pull.

Reviewers: teon.banek, msantl, mculinovic, buda

Reviewed By: msantl

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1469
2018-07-06 15:32:43 +02:00
Matija Santl
1c2f599a93 Add kafka openCypher clauses
Summary:
Added basic functionality for kafka streams. The `CREATE STREAM` clause is a
simplified version from the one mentioned in D1415 so we can start testing
end-to-end sooner.

This diff also includes a bug fix in `lcp.list ` for operators that have no
members.

Reviewers: teon.banek, mtomic, buda

Reviewed By: mtomic

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1434
2018-07-06 15:29:28 +02:00
Teon Banek
c9b75cbb45 Remove unused private member
Reviewers: mtomic

Reviewed By: mtomic

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1457
2018-06-28 16:54:30 +02:00
Marin Tomic
cd07664564 Add timestamp function
Reviewers: teon.banek, buda

Reviewed By: teon.banek

Subscribers: mferencevic, pullbot

Differential Revision: https://phabricator.memgraph.io/D1452
2018-06-27 16:06:54 +02:00
Teon Banek
4b97747c14 Allow planning Cartesian after Produce
Summary:
Hopefully, the mechanism of generating Cartesian is general enough, so
this simple change should work correctly in all cases.

Planner tests have been modified to use a FakeDbAccessor in order to
speed them up and potentially allow extracting planning into a library.

Reviewers: msantl, mtomic, buda

Reviewed By: mtomic

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1431
2018-06-20 12:55:56 +02:00
Marin Tomic
2c5d756d52 Remove unused members of ModifyUser and DropUser
Summary: Get rid of warning

Reviewers: msantl, teon.banek

Reviewed By: msantl

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1433
2018-06-19 09:28:41 +02:00
Marin Tomic
b9be394cb2 Add parsing and planning of basic user management queries
Reviewers: teon.banek, mferencevic

Reviewed By: teon.banek, mferencevic

Subscribers: pullbot, buda

Differential Revision: https://phabricator.memgraph.io/D1398
2018-06-14 16:51:22 +02:00
Teon Banek
e0474a8e92 Replace boost with capnp in RPC
Summary:
Converts the RPC stack to use Cap'n Proto for serialization instead of
boost. There are still some traces of boost in other places in the code,
but most of it is removed. A future diff should cleanup boost for good.

The RPC API is now changed to be more flexible with regards to how
serialize data. This makes the simplest cases a bit more verbose, but
allows complex serialization code to be correctly written instead of
relying on hacks. (For reference, look for the old serialization of
`PullRpc` which had a nasty pointer hacks to inject accessors in
`TypedValue`.)

Since RPC messages were uselessly modeled via inheritance of Message
base class, that class is now removed. Furthermore, that approach
doesn't really work with Cap'n Proto. Instead, each message type is
required to have some type information. This can be automated, so
`define-rpc` has been added to LCP, which hopefully simplifies defining
new RPC request and response messages.

Specify Cap'n Proto schema ID in cmake

This preserves Cap'n Proto generated typeIds across multiple generations
of capnp schemas through LCP. It is imperative that typeId stays the
same to ensure that different compilations of Memgraph may communicate
via RPC in a distributed cluster.

Use CLOS for meta information on C++ types in LCP

Since some structure slots and functions have started to repeat
themselves, it makes sense to model C++ meta information via Common Lisp
Object System.

Depends on D1391

Reviewers: buda, dgleich, mferencevic, mtomic, mculinovic, msantl

Reviewed By: msantl

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1407
2018-06-04 10:45:12 +02:00
Teon Banek
c7b6cae526 Extract io/network into mg-io library
Reviewers: buda, dgleich, mferencevic

Reviewed By: mferencevic

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1411
2018-05-30 14:58:41 +02:00
Matija Santl
f872c93ad1 Add command id to remote produce
Summary:
Command id is necessary in remote produce to identify an ongoing pull
because a transaction can have multiple commands that all belong under
the same plan and tx id.

Reviewers: teon.banek, mtomic, buda

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1386
2018-05-16 10:20:39 +02:00
Marin Tomic
91e38f6413 Distributed BFS
Summary: depends on D1387

Reviewers: msantl, teon.banek, buda

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1341
2018-05-15 17:38:51 +02:00
Marin Tomic
0a6b8cdf4f Remove AS_IS from GraphView
Summary:
Removing `AS_IS` from GraphView because it doesn't seem like it is necessary for query execution and it also has weird semantics (you might get a mix of old and new records). `Unwind`, `OrderBy` and `PullRemoteOrderBy` now use `OLD` graph view.

Remove AS_IS from GraphView

Fix query_cost_estimator tests

Fix query_expression_evaluator tests

Fix query_plan_match_filter_return tests

Fix query_plan_create_set_remove_delete tests

Fix query_plan_accumulate_aggregate tests

Reviewers: teon.banek, buda

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1390
2018-05-15 13:10:19 +02:00
Teon Banek
30506f44f5 Use LCP to generate LogicalOperator boost serialization
Summary:
Since we are moving from boost to Capnp for serialization, it makes
sense to keep all of the LogicalOperator classes in LCP format. This
will make it easier to generate Capnp code.

Depends on D1361

Reviewers: buda, mferencevic, msantl, dgleich, ipaljak, mculinovic, mtomic

Reviewed By: mtomic

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1362
2018-05-10 09:56:40 +02:00
Marko Budiselic
c76170a9db Clean utils folder (namespaces, function names)
Reviewers: teon.banek

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1359
2018-04-22 09:44:32 +02:00
Matija Santl
5c7d3a908f Add two state dijkstra for wsp implementation
Summary:
Added a new markdown file for concepts and a new test to test the edge
case with upper bound.

Reviewers: teon.banek, mtomic, dgleich, buda, ipaljak

Reviewed By: teon.banek, dgleich

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1366
2018-04-20 16:10:42 +02:00
florijan
a88c598822 Use wait-in-destruct future everywhere
Summary:
Before we used `utils::Future` only where it's created by our `ThreadPool`.
I suggest in this diff that we use it everywhere, it's a bit more defensive and
should not have any downsides.

Reviewers: msantl, teon.banek

Reviewed By: msantl

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1316
2018-03-26 14:15:11 +02:00
Matija Santl
29ba055b64 Add custom VLOGs for distributed memgraph
Summary:
Add different priority VLOGs for distributed memgraph.

For level 3 you'll get logs for dispatching/consuming plans.
For level 4 you'll get logs for tx start/commit/abort, remote produce, remote
pull, remote result consume,
For level 5 there will be a log for each request/response made by the RPC
client.

Master log snippet P9
Worker log snippet P10

Reviewers: florijan, teon.banek

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1296
2018-03-26 09:24:39 +02:00
florijan
ac8c96ccc2 Tidyup distributed stuff naming
Summary:
Remove "produce_" and "Produce" as prefix from all distributed stuff.
It's not removed in src/query/ stuff (operators).

Reviewers: dgleich

Reviewed By: dgleich

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1315
2018-03-23 16:32:29 +01:00
Teon Banek
226992f420 Check values can be used for range indexed ScanAll
Summary: Test indexing ScanAll with invalid property values

Reviewers: florijan, msantl

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1293
2018-03-13 10:23:30 +01:00
florijan
67092ae4d7 Fix ScanAll cursor family w.r.t. empty collections
Reviewers: teon.banek, dgleich, msantl

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1291
2018-03-12 11:18:07 +01:00
florijan
42ca81eb01 Use custom future that waits on destruct
Reviewers: teon.banek, dgleich

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1286
2018-03-09 10:16:17 +01:00
florijan
43c0e91057 Enable transaction killer on worker
Summary:
I have not thoroughly thought this through, especially the worker
destruction (is it legit to abort all running tx?), but it's tested to
abort during remote pull, what we need.

Also I improved error handling for vertex deletion failure during
remote pull (@dgleich).

Reviewers: teon.banek, msantl, dgleich

Reviewed By: dgleich

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1263
2018-03-05 12:45:42 +01:00
Matija Santl
418db646c1 Abort from remote pulls if db says so
Summary:
If the `db` says we need to abort we should abort. There are two
places (in both `PullRemote` and `PullRemoteOrderBy`) where the check is made.
The first one is on the very beginning of the `Pull` method and the second one
is in the loop that checks/waits for remote results.

Reviewers: teon.banek, florijan

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1265
2018-03-02 14:23:02 +01:00
Teon Banek
f963c54f77 Use 10us sleep while waiting on RPC
Summary:
Usually a single RPC call takes 50us, which is a lot lower than our
millisecond sleep. Calling sleep with less than a ms duration may not
be really supported (due to timer granularity), but this change should
cause some sub-ms sleep.

Reviewers: florijan, msantl, mferencevic

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1258
2018-03-01 14:50:36 +01:00
Dominik Gleich
99375a4b47 Vertex removal using rpcs
Summary:
Remove vertex remote

Add tests

Reviewers: florijan, teon.banek

Reviewed By: florijan

Subscribers: teon.banek, pullbot

Differential Revision: https://phabricator.memgraph.io/D1230
2018-02-28 11:35:44 +01:00
Teon Banek
f1aece8352 Visit PullRemote's input only if it has any
Reviewers: florijan, msantl

Reviewed By: msantl

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1239
2018-02-26 13:18:10 +01:00
Teon Banek
30d2cfb9db Plan PullRemoteOrderBy
Summary:
During distributed execution, OrderBy is split across workers and the
master gets to merge those results via PullRemoteOrderBy. Since this
operator may be an input to almost any other operator, virtual accessors
to `input` have been added in LogicalOperator.

Depends on D1221

Reviewers: florijan, msantl, buda

Reviewed By: msantl

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1232
2018-02-26 11:15:46 +01:00
Matija Santl
b9d61a0127 Implement distributed aware OrderBy operator
Summary:
Extracted `TypedValueVectorCompare` and `RemotePuller` from operators
so it can be reused.
The new `PullRemoteOrerBy` operator pulls one result from each worker and one
from master, relies on the fact that workers/master returned sorted results,
returns the next one in order, and pulls the source of that result to get the
next one.

Depends on D1215 that (at the moment) is still in review.

Reviewers: florijan, teon.banek

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1221
2018-02-23 17:44:21 +01:00
florijan
cbc9420c17 Support distributed query::plan::CreateExpand
Summary:
- The expansion vertex gets created on the origin's worker
- The edge automatically gets created wherever necessary
- Vertex creation logic reuse
- End to end test

Reviewers: teon.banek, msantl

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1227
2018-02-23 17:34:01 +01:00
Teon Banek
452ce6a30c Use std::async for fetching remote edges in Expand
Summary:
This may improve the query execution workload, because the Expand will
yield local results while it waits for remote ones. Note that we rely on
the fact that walking the graph produces results without any predetermined
order. Therefore, we can yield paths as we see fit. Enforcing the order
is done through OrderBy operator, and this change shouldn't affect that.

Reviewers: florijan, msantl, buda

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1215
2018-02-22 17:11:53 +01:00
florijan
ae95e9480c Use RemoteCreateVertex in CreateNode operator
Summary:
- Add `database::GraphDb::GetWorkerIds()`
- Change `CreateNode` constructor API
- Make `CreateNode` distribute nodes uniformly over workers

Did not yet modify `CreateExpand`, coming in the next diff.

Reviewers: teon.banek

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1216
2018-02-21 12:17:43 +01:00
Teon Banek
2055176139 Plan basic distributed Cartesian without any checks
Summary:
This is still very much in progress.  No advanced checks are done to
prevent planning unimplemented things.  Basic Cartesian product should
work, for example `MATCH (a), (b) CREATE (a)-[:r]->(c)-[:r]->(b)`. But
anything more advanced may lead to undefined behaviour of the planner
and therefore execution. Use at your own risk!

Add ModifiedSymbols method to LogicalOperator

For planning Cartesian, we need information on which symbols are filled
by operator sub-trees.  Currently, this is used to set symbols which
should be transferred over network. Later, they should be used to detect
whether filter expressions use symbols modified from Cartesian branches.
Then we will be able to ensure correct dependency of filters and their
behaviour.

Prepare DistributedPlan for multiple worker plans

Since Cartesian branches need to be split and handled by each worker, we
now dispatch multiple plans to workers.

Reviewers: florijan, msantl, buda

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1208
2018-02-20 13:02:08 +01:00
florijan
9721ccf61c Cleanup per-transaction caches in distributed
Summary:
On the master cleanups are hooked directly into the transaction engine.
This is beneficial because the master might have bigger caches and we
want to clear them as soon as possible.

On the workers there is a periodic RPC call to the master about living
transactions, which takes care of releasing local caches. This is
suboptimal because long transactions will prevent cache GC (like with
data GC). It is however fairly simple.

Note that all cleanup is not done automatically and `RemotePull` has
been reduced accordingly. @msantl, please verify correctness and
consider if the code can be additionally simplified.

Reviewers: teon.banek, msantl

Reviewed By: msantl

Subscribers: pullbot, msantl

Differential Revision: https://phabricator.memgraph.io/D1202
2018-02-16 15:30:05 +01:00
Matija Santl
84e7aec27b Implement Cartesian Cursor
Summary:
Pulls left op cursor and keeps the result, and then for each pull of
the right op cursor, adds all the left op results to produce a cartesian
product.

Reviewers: teon.banek, florijan

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1201
2018-02-16 13:43:12 +01:00
Teon Banek
b8acefb1e3 Add Cartesian operator stub
Reviewers: florijan, msantl

Reviewed By: msantl

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1199
2018-02-15 12:45:14 +01:00
Matija Santl
9eb01f363f Add sleep in remote pull operator
Summary:
Added a hidden flag, a int32 value, that represents the number of
milliseconds the thread that pulls remote results should sleep in case when the
local results are exhausted but there are still some remote results that are
pending.

Reviewers: teon.banek, florijan

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1198
2018-02-15 10:21:34 +01:00
florijan
796946ad1b Implement sync operator
Reviewers: teon.banek, msantl

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1192
2018-02-14 13:03:18 +01:00
Teon Banek
0616eb0aa4 Remove pull_local flag from PullRemote
Summary:
The same behaviour can now be achieved by passing a nullptr for input
operator.

Reviewers: florijan, msantl

Reviewed By: msantl

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1195
2018-02-14 12:56:44 +01:00
Matija Santl
b1a8e48dc4 Add ExpandWeightedShortestPathCursor
Summary: Implement Dijkstras algorithm to find weighted shortest path.

Reviewers: teon.banek, florijan, buda

Reviewed By: teon.banek

Subscribers: pullbot, mtomic

Differential Revision: https://phabricator.memgraph.io/D1182
2018-02-14 09:32:41 +01:00
Matija Santl
013ee34f90 Fix RemotePull infinite loop when all workers don't have results
Reviewers: buda, teon.banek, florijan

Reviewed By: buda

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1184
2018-02-08 16:29:34 +01:00
florijan
45da645a4b Fix remote pull
Summary:
- Extracted TypedValue reconstruction into utils (will need that soon)
- Added more error handling in RemotePull

Reviewers: teon.banek, msantl

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1183
2018-02-08 15:00:40 +01:00
Matija Santl
a66351c3f4 Make RemotePull operator async
Reviewers: teon.banek, florijan

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1168
2018-02-06 13:42:11 +01:00
Teon Banek
583f6f1794 Add Synchronize operator stub
Reviewers: florijan, msantl

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1166
2018-02-02 11:21:15 +01:00
florijan
e5035cf477 Support graph elements in remote pull rpc
Reviewers: teon.banek, dgleich, msantl, buda

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1157
2018-02-01 10:53:41 +01:00
florijan
3f30a6fad9 Add async remote pull RPC
Summary:
Remote pulls can now be async. Note that
`RemotePullRpcClients::RemotePull` takes references to data structures
which should not be temporary in the caller. Still, maybe safer to make
copies?

Changed `RpcWorkerClients` API to make that possible.

Reviewers: dgleich, msantl, teon.banek

Reviewed By: msantl

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1150
2018-01-30 10:32:20 +01:00
Matija Santl
a73a4c3762 Implement PullRemote logical operator
Summary: PullRemoteCursor will pull all clients in a RoundRobin fashion until all clients are exhausted and there are no more results to return.

Reviewers: teon.banek, florijan

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1147
2018-01-26 17:12:45 +01:00
Teon Banek
dbc9c995c9 Plan distributed execution of basic read operators
Summary:
Remove AdvanceCommand operator declaration.
Add query/plan/distributed source files.
Add hacked cloning of LogicalOperator via serialization.
Add virtual Default methods to CompositeVisitor.

Use BOOST_CLASS_EXPORT_KEY in ast.hpp and operator.hpp.

This is needed in a single binary to correctly serialize polymorphic
classes. The previous implementation worked because the memgraph_lib was
linked with test binaries, but nobody was actually serializing things
inside the single memgraph binary itself.

Print PullRemote symbols in tests/manual/query_planner

Add names to implicitly created aggregation symbols

Reviewers: florijan, msantl

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1137
2018-01-25 14:20:51 +01:00