Summary:
Visitor pattern's main issue is cyclical dependency between classes that
are visited and the visitor instance itself. We need to decouple this
dependency if we want to open source part of the code, namely
non-distributed part. This decoupling is achieved through the use of
`dynamic_cast` in distributed operators. Hopefully the solution is good
enough and doesn't cause performance issues. An alternative solution is
to build our own custom double dispatch solution, but that will
basically boil down to our implementation of runtime type information
and casts.
Note, this only decouples the distributed operators. If and when we
decide that other operators shouldn't be open sourced, the same
`dynamic_cast` pattern should be applied in them also.
Depends on D1563
Reviewers: mtomic, msantl, buda
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1566
Summary:
This is the first step in separating the implementation of distributed
features in operators. Following steps are:
* decoupling distributed visitors
* injecting distributed details in operator state
* minor cleanup or anything else that was overlooked
Reviewers: mtomic, msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1563
Summary:
Since we switched to Cap'n Proto serialization there's no need for
keeping boost around anymore.
Reviewers: mtomic, mferencevic, buda
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1515
Summary:
This change should completely support planning Optional for distributed
execution. Cartesian matching is handled, as well as dependencies
between optional branch and the main input branch.
Unit tests are expanded to cover the planning algorithm.
Reviewers: msantl, mtomic
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1480
Summary:
Added test stream functionality. When a stream is configured, it will try to
consume messages from a kafka topic and return them back to the user.
For now, the messages aren't transformed, so it just returns the payload string.
Depends on D1466
Next steps are persisting stream metadata and transforming messages in order to
store them in the graph.
Reviewers: teon.banek, mtomic
Reviewed By: teon.banek
Subscribers: pullbot, buda
Differential Revision: https://phabricator.memgraph.io/D1474
Summary:
Integrated kafka library into memgraph. This version supports all opencypher
features and will only output messages consumed from kafka.
Depends on D1434
Next steps are persisting stream metadata and transforming messages in order to
store them in the graph.
Reviewers: teon.banek, mtomic, mferencevic, buda
Reviewed By: teon.banek
Subscribers: mferencevic, pullbot, buda
Differential Revision: https://phabricator.memgraph.io/D1466
Summary:
Added basic functionality for kafka streams. The `CREATE STREAM` clause is a
simplified version from the one mentioned in D1415 so we can start testing
end-to-end sooner.
This diff also includes a bug fix in `lcp.list ` for operators that have no
members.
Reviewers: teon.banek, mtomic, buda
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1434
Summary:
Hopefully, the mechanism of generating Cartesian is general enough, so
this simple change should work correctly in all cases.
Planner tests have been modified to use a FakeDbAccessor in order to
speed them up and potentially allow extracting planning into a library.
Reviewers: msantl, mtomic, buda
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1431
Summary:
This change should correctly plan Cartesian which have dependent Filter
or Expand operators. Tests have been added for those cases. Other cases
are not yet supported and should throw an exception.
Reviewers: msantl, mtomic, buda
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1426
Summary:
Add additional structs and functions for handling C++ meta information.
Add capnp-file and capnp-id arguments to lcp:process-file.
Generate cpp along with hpp and capnp in lcp.
Wrap LogicalOperator base class in lcp:define-class.
Modify logical operators for capnp serialization.
Add query/common.capnp.
Reviewers: mculinovic, buda, mtomic, msantl, ipaljak, dgleich, mferencevic
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1391
Summary:
This fixes a bug where the planning would raise NotYetImplemented error,
due to preventing plan splitting while the operator is already on
master. For example, `MATCH (a), (b) CREATE (a)-[e:r]->(b) RETURN e`
would split the plan after Cartesian and before Create. Thus, the rest
of the plan would be on master, including the Accumulate before Produce.
We still can (and must) replace Accumulate with Synchronize but this
would fail due to unneeded check.
Reviewers: florijan, msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1272
Summary:
During distributed execution, OrderBy is split across workers and the
master gets to merge those results via PullRemoteOrderBy. Since this
operator may be an input to almost any other operator, virtual accessors
to `input` have been added in LogicalOperator.
Depends on D1221
Reviewers: florijan, msantl, buda
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1232
Summary:
This is still very much in progress. No advanced checks are done to
prevent planning unimplemented things. Basic Cartesian product should
work, for example `MATCH (a), (b) CREATE (a)-[:r]->(c)-[:r]->(b)`. But
anything more advanced may lead to undefined behaviour of the planner
and therefore execution. Use at your own risk!
Add ModifiedSymbols method to LogicalOperator
For planning Cartesian, we need information on which symbols are filled
by operator sub-trees. Currently, this is used to set symbols which
should be transferred over network. Later, they should be used to detect
whether filter expressions use symbols modified from Cartesian branches.
Then we will be able to ensure correct dependency of filters and their
behaviour.
Prepare DistributedPlan for multiple worker plans
Since Cartesian branches need to be split and handled by each worker, we
now dispatch multiple plans to workers.
Reviewers: florijan, msantl, buda
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1208
Summary:
This is a basic take on planning distributed writes. Main logic is in handling
the Accumulate operator, which requires Synchronize operation if the results
are used after modification.
Tests for planning distributed writes have been added.
Reviewers: florijan, msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1172
Summary:
With the added support for serialization, we should be able to transfer
plans across distributed workers.
The planner tests has been extended to test serialization. Operators
should be mostly tested, but the expression they contain aren't
completely. The quick solution is to use typeid for rudimentary
expression equality testing. The more involved solution of comparing
the expression tree for equality is the correct choice. It should be
done in the near future.
Reviewers: florijan, msantl
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1122
Summary:
GraphDb is refactored to become an API exposing different parts
necessary for the database to function. These different parts can have
different implementations in SingleNode or distributed Master/Server
GraphDb implementations.
Interally GraphDb is implemented using two class heirarchies. One
contains all the members and correct wiring for each situation. The
other takes care of initialization and shutdown. This architecture is
practical because it can guarantee that the initialization of the
object structure is complete, before initializing state.
Reviewers: buda, mislav.bradac, dgleich, teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1093
Summary:
Union query combinator implementation consists of:
* adjustments to the AST and `cypher_main_visitor`
* enabling `QueryStripper` to parse multiple `return` statements (not stopping after first)
* symbol generation for union results
* union logical operator
* query plan generator adjustments
Reviewers: teon.banek, mislav.bradac
Reviewed By: teon.banek
Subscribers: pullbot, buda
Differential Revision: https://phabricator.memgraph.io/D1038
Summary:
This change generates multiple PropertyFilters for expressions such as
`n.prop1 = m.prop2`. When choosing one PropertyFilter, we want to also
remove the other one, because they represent the same original
expression. Therefore, the removal is no longer based on FilterInfo
equality, but on the original expression equality. Additionally,
FilterInfo and PropertyFilter equality operators have been removed to
avoid any pretense they do what you expect or want.
Reviewers: florijan, msantl
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1021
Summary:
Referring to the TCK failure on:
```
MATCH (a {name: 'Andres'})<-[:FATHER]-(child)
RETURN {foo: a.name='Andres', kids: collect(child.name)}
```
In the planner we'd only treat a list|map as a group_by if it contained
no aggregations. That's changed so that if a map contains both aggregations
and non-aggregations, then non-aggregations are treated as individual
group_by expressions.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: buda, pullbot, teon.banek
Differential Revision: https://phabricator.memgraph.io/D1004
Summary:
Previously, named path symbols remained untracked as `new_symbols` during planning. This meant that
operator `Optional` would be left unaware of those symbols, and therefore not reset them to `Null`
if optional matching failed.
Test Optional operator will be aware of path symbols
Reviewers: florijan, mislav.bradac
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D974
Summary:
Remove name from GraphDb.
Take GraphDb in query test macros instead of accessor.
Add is_accepting_transactions flag to GraphDb.
Reviewers: mislav.bradac, florijan, mferencevic
Reviewed By: mislav.bradac
Subscribers: mferencevic, pullbot
Differential Revision: https://phabricator.memgraph.io/D940
Summary:
Move QueryParts and Filters to a new file.
Reorganize FilterInfo struct.
Remove label filter if we do indexed scan by label.
Remove property filter used in indexed scan.
Reviewers: florijan
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D915
Summary: This change increases the planning time, but should reduce memory consumption.
Reviewers: florijan, mislav.bradac
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D901
Summary:
- Removed BreadthFirstAtom, using EdgeAtom only with a Type enum.
- Both variable expansions (breadth and depth first) now have mandatory inner node and edge Identifiers.
- Both variable expansions use inline property filtering and support inline lambdas.
- BFS and variable expansion now have the same planning process.
- Planner modified in the following ways:
- Variable expansions support inline property filtering (two filters added to all_filters, one for inline, one for post-expand).
- Asserting against existing_edge since we don't support that anymore.
- Edge and node symbols bound after variable expansion to disallow post-expand filters to get inlined.
- Some things simplified due to different handling.
- BreadthFirstExpand logical operator merged into ExpandVariable. Two Cursor classes remain and are dynamically chosen from.
As part of planned planner refactor we should ensure that a filter is applied only once. The current implementation is very suboptimal for property filtering in variable expansions.
@buda: we will start refactoring this these days. This current planner logic is too dense and complex. It is becoming technical debt. Most of the time I spent working on this has been spent figuring the planning out, and I still needed Teon's help at times. Implementing the correct and optimal version of query execution (avoiding multiple potentially expensive filterings) was out of reach also due to tech debt.
Reviewers: buda, teon.banek
Reviewed By: teon.banek
Subscribers: pullbot, buda
Differential Revision: https://phabricator.memgraph.io/D852
Summary:
- The new BFS syntax implemented as proposed.
- AST BreadthFirstAtom now uses EdgeAtom members: has_range_{true}, upper_bound_, lower_bound_
- Edges data structure now handles all the edge filtering (single or multiple edges), to ease planning. Additional edge filtering (additional Filter op in the plan) is removed. AST EdgeTypeTest is no longer used and is removed.
Current state is stable but there are things left to do:
- BFS property filtering.
- BFS lower_bound_ support.
- Support for lambdas in variable length expansion. This includes obligatory (even if not user_defined) inner_node and inner_edge symbols for easier handling.
- Code-sharing between BFS and variable length expansions.
I'll add asana tasks (and probably start working on them immediately) when/if this lands.
Reviewers: buda, teon.banek, mislav.bradac
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D836
Summary:
Test planner splits MATCH ... WHERE
Remove distinction between FilterAnd and AndOperator
Reviewers: florijan, mislav.bradac
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D814
Summary:
Antlr grammar has been updated to support putting edge types after the BFS
symbol. Planner collects edge type filters for BFS and inlines them in the
operator by joining the filter with the user input BFS filter itself. This
requires no change from the standpoint of the operator. On the other hand, in
order to use the faster lookup by a single edge type, `ExpandBreadthFirst`
operator now accept an optional edge type. The edge type is passed from the
planner only if the user is filtering by a single type.
Unit tests as well as tck have been updated.
Reviewers: florijan, mislav.bradac
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D777