Commit Graph

76 Commits

Author SHA1 Message Date
Teon Banek
b90375c3ae Remove GraphDbAccessor and storage types from Ast
Summary:
This diff removes the need for a database when parsing a query and
creating an Ast. Instead of storing storage::{Label,Property,EdgeType}
in Ast nodes, we store the name and an index into all of the names. This
allows for easy creation of a map from {Label,Property,EdgeType} index
into the concrete storage type. Obviously, this comes with a performance
penalty during execution, but it should be minor. The upside is that the
query/frontend minimally depends on storage (PropertyValue), which makes
writing tests easier as well as running them a lot faster (there is no
database setup). This is most noticeable in the ast_serialization test
which took a long time due to start up of a distributed database.

Reviewers: mtomic, llugovic

Reviewed By: mtomic

Subscribers: mferencevic, pullbot

Differential Revision: https://phabricator.memgraph.io/D1774
2019-01-16 09:47:42 +01:00
Lovro Lugovic
2730f2d35f Add query profiling: Tree
Reviewers: mtomic, teon.banek, mferencevic

Reviewed By: mtomic, teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1772
2019-01-15 12:13:40 +01:00
Matej Ferencevic
3209788cd4 Implement new spin lock
Reviewers: teon.banek, buda

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1786
2019-01-08 09:15:07 +01:00
Marin Tomic
e4b661fe4a Cache required privileges for query execution
Summary: this should reduce parsing time for very simple queries.

Reviewers: teon.banek, mferencevic

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1698
2018-10-26 10:40:13 +02:00
Marin Tomic
eee8b57daf Separate query types in AST and interpreter
Summary:
`Query` is now an abstract class which has `CypherQuery`,
`ExplainQuery`, `IndexQuery`, `AuthQuery` and `StreamQuery` as derived
classes. Only `CypherQuery` is forwarded to planner and the rest of the
queries are handled directly in the interpreter. This enabled us to
remove auth, explain and stream operators, clean up `Context` class and
remove coupling between `Results` class and plan cache. This should make
it easier to add similar functionality because no logical operator
boilerplate is needed. It should also be easier to separate community
and enterprise features for open source.

Remove Explain logical operator
Separate IndexQuery in AST
Handle index creation in interpreter
Remove CreateIndex operator and ast nodes
Remove plan cache reference from Results
Move auth queries out of operator tree
Remove auth from context
Fix tests, separate stream queries
Remove in_explicit_transaction and streams from context

Reviewers: teon.banek, mferencevic, msantl

Reviewed By: teon.banek, mferencevic

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1664
2018-10-22 16:43:42 +02:00
Marin Tomic
285e02d5ec Remove root tracking from AST storage
Summary: Up till now, `AstStorage` also took care of tracking the root of the `Query` and loading of cloning of `Query` nodes would change that root. This felt out of place because sometimes `AstStorage` is used only for storing expressions, and we don't even have an entire query in the storage. This diff removes that feature from `AstStorage`. Now its only functionality is owning AST nodes and assigning unique IDs to them.

Reviewers: teon.banek, llugovic

Reviewed By: teon.banek

Subscribers: mferencevic, pullbot

Differential Revision: https://phabricator.memgraph.io/D1646
2018-10-16 10:22:21 +02:00
Teon Banek
91566bb9fc Don't implicitly wait on futures during query execution
Summary:
Even though we may not need the results of a RPC, we should check that
it completed without error.

Reviewers: mferencevic, mtomic

Reviewed By: mtomic

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1610
2018-09-26 11:02:32 +02:00
Marin Tomic
b5cdf6b476 Clean-up TypedValue misuse
Summary:
In a bunch of places `TypedValue` was used where `PropertyValue` should be. A lot of times it was only because `TypedValue` serialization code could be reused for `PropertyValue`, only without providing callbacks for `VERTEX`, `EDGE` and `PATH`. So first I wrote separate serialization code for `PropertyValue` and put it into storage folder. Then I fixed all the places where `TypedValue` was incorrectly used instead of `PropertyValue`. I also disabled implicit `TypedValue` to `PropertyValue` conversion in hopes of preventing misuse in the future.

After that, I wrote code for `VertexAccessor` and `EdgeAccessor` serialization and put it into `storage` folder because it was almost duplicated in distributed BFS and pull produce RPC messages. On the sender side, some subset of records (old or new or both) is serialized, and on the reciever side, records are deserialized and immediately put into transaction cache.

Then I rewrote the `TypedValue` serialization functions (`SaveCapnpTypedValue` and `LoadCapnpTypedValue`) to not take callbacks for `VERTEX`, `EDGE` and `PATH`, but use accessor serialization functions instead. That means that any code that wants to use `TypedValue` serialization must hold a reference to `GraphDbAccessor` and `DataManager`, so that should make clients reconsider if they really want to use `TypedValue` instead of `PropertyValue`.

Reviewers: teon.banek, msantl

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1598
2018-09-13 13:45:54 +02:00
Marin Tomic
ba3837e942 Remove is_query_cached from Context
Reviewers: teon.banek

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1588
2018-09-04 14:17:49 +02:00
Teon Banek
4db62b18f0 Decouple visiting of distributed operators from regular
Summary:
Visitor pattern's main issue is cyclical dependency between classes that
are visited and the visitor instance itself. We need to decouple this
dependency if we want to open source part of the code, namely
non-distributed part. This decoupling is achieved through the use of
`dynamic_cast` in distributed operators. Hopefully the solution is good
enough and doesn't cause performance issues. An alternative solution is
to build our own custom double dispatch solution, but that will
basically boil down to our implementation of runtime type information
and casts.

Note, this only decouples the distributed operators. If and when we
decide that other operators shouldn't be open sourced, the same
`dynamic_cast` pattern should be applied in them also.

Depends on D1563

Reviewers: mtomic, msantl, buda

Reviewed By: mtomic

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1566
2018-08-28 14:53:02 +02:00
Teon Banek
6427902920 Extract distributed interpretation out of Interpreter
Reviewers: mtomic, mferencevic, msantl, buda

Reviewed By: buda

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1560
2018-08-27 09:31:39 +02:00
Teon Banek
50c75c56a4 Add EXPLAIN to openCypher
Summary:
  * Move PlanPrinter from test to memgraph
  * Add explainQuery to MemgraphCypher.g4
  * Add Explain operator
  * Update changelog

Reviewers: mtomic, buda, ipaljak

Reviewed By: mtomic

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1555
2018-08-23 14:05:32 +02:00
Marin Tomic
327c3c5d9b Add required privileges for query to Results
Reviewers: mferencevic, buda

Reviewed By: mferencevic

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1537
2018-08-16 15:59:10 +02:00
Matej Ferencevic
2ecb660790 Initial implementation of authentication
Reviewers: teon.banek, buda

Reviewed By: teon.banek

Subscribers: mtomic, pullbot

Differential Revision: https://phabricator.memgraph.io/D1488
2018-07-27 13:08:17 +02:00
Matija Santl
4c27596fdd Implement kafka transform functionality
Summary:
First iteration in implementing kafka.
Currently, memgraph streams won't use the transform script provided in the
`CREATE STREAM` query.

There is a manual test that serves a POC purpose which we'll use to fully wire
kafka in memgraph.

Since streams need to download the script, I moved curl init from
telemetry.

Reviewers: teon.banek, mferencevic

Reviewed By: mferencevic

Subscribers: ipaljak, pullbot, buda

Differential Revision: https://phabricator.memgraph.io/D1491
2018-07-19 12:52:25 +02:00
Matej Ferencevic
4f417e1f5d Support streaming of Bolt results
Summary:
Previously, our implementation of the Bolt protocol buffered all results in
memory before sending them out to the client. This implementation immediately
streams the results to the client to avoid any memory allocations. Also, this
implementation splits the interpretation and pulling logic into two.

Reviewers: teon.banek

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1495
2018-07-18 13:01:50 +02:00
Teon Banek
b15eeffd48 Extract communication to static library
Summary:
Session specifics have been move out of the Bolt `executing` state, and
are accessed via pure virtual Session type. Our server is templated on
the session and we are setting the concrete type, so there should be no
virtual call overhead. Abstract Session is used to indicate the
interface, this could have also been templated, but the explicit
interface definition makes it clearer.

Specific session implementation for running Memgraph is now implemented
in memgraph_bolt, which instantiates the concrete session type. This may
not be 100% appropriate place, but Memgraph specific session isn't
needed anywhere else.

Bolt/communication tests now use a dummy session and depend only on
communication, which significantly improves test run times.

All these changes make the communication a library which doesn't depend
on storage nor the database. Only shared connection points, which aren't
part of the base communication library are:

  * glue/conversion -- which converts between storage and bolt types, and
  * communication/result_stream_faker -- templated, but used in tests and query/repl

Depends on D1453

Reviewers: mferencevic, buda, mtomic, msantl

Reviewed By: mferencevic, mtomic

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1456
2018-07-11 12:52:41 +02:00
Teon Banek
d7a9c5bab8 Extract TypedValue/DecodedValue conversion to higher component
Summary:
This is the first step in cutting the crazy dependencies of
communication module to the whole database. Includes have been
reorganized and conversion between DecodedValue and other Memgraph types
(TypedValue and PropertyValue) has been extracted to a higher level
component called `communication/conversion`. Encoder, like Decoder, now
relies only on DecodedValue. Hopefully the conversion operations will
not significantly slow down streaming Bolt data.

Additionally, Bolt ID is now wrapped in a class. Our storage model uses
*unsigned* int64, while Bolt expects *signed* int64. The implicit
conversions may lead to encode/decode errors, so the wrapper should
enforce some type safety to prevent such errors.

Reviewers: mferencevic, buda, msantl, mtomic

Reviewed By: mferencevic, mtomic

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1453
2018-07-11 12:51:31 +02:00
Marin Tomic
3948cea83c Rename AstTreeStorage to AstStorage
Summary: happiness

Reviewers: teon.banek

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1403
2018-06-14 13:39:03 +02:00
Teon Banek
c7b6cae526 Extract io/network into mg-io library
Reviewers: buda, dgleich, mferencevic

Reviewed By: mferencevic

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1411
2018-05-30 14:58:41 +02:00
florijan
95f1b6fe56 Fix remote plan invalidation
Summary:
Ensures that query plans are invalidated on the remote only when it's
guaranteed they will never be used on the master. This is done by
invalidating remote caches in the `CachedPlan` destructor.

There are two unplesant side-effects. First, an RPC call is made in an
object destructor. This is somewhat ugly, but not that different then
making an RPC call that must succeed in any other function. Note that
this does NOT slow down any query execution because the relevant
destructor is called by the skiplist garbage collector. The second ugly
side-effect is that in the unit test now we need to sleep to ensure the
skiplist GC destructs a cached plan before checking that it's
invalidated on the remote worker.

We might want to redesign this at some point.

Reviewers: teon.banek

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1302
2018-03-16 11:18:16 +01:00
florijan
03f1547a8d Invalidate distributed plan caches
Summary:
 - Remove caches on workers as a result of plan expiration or race
   during insertion.
 - Extract caching functionality into a class.
 - Minor refactor of Interpreter::operator()
 - New RPC and test for it.
 - Rename ConsumePlanRes to DispatchPlanRes for consistency, remove
   return value as it's always true and never used.
 - Interpreter is now constructed with a `GraphDb` reference. At the
   moment only for reaching the `distributed::PlanDispatcher`, but in
   the future we should probably use that primarily for planning.

I added a function to `PlanConsumer` that is only used for testing.
I prefer not doing this, but I felt this needed testing. I can remove
it now if you like.

Reviewers: teon.banek, msantl

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1292
2018-03-13 14:59:51 +01:00
Teon Banek
2055176139 Plan basic distributed Cartesian without any checks
Summary:
This is still very much in progress.  No advanced checks are done to
prevent planning unimplemented things.  Basic Cartesian product should
work, for example `MATCH (a), (b) CREATE (a)-[:r]->(c)-[:r]->(b)`. But
anything more advanced may lead to undefined behaviour of the planner
and therefore execution. Use at your own risk!

Add ModifiedSymbols method to LogicalOperator

For planning Cartesian, we need information on which symbols are filled
by operator sub-trees.  Currently, this is used to set symbols which
should be transferred over network. Later, they should be used to detect
whether filter expressions use symbols modified from Cartesian branches.
Then we will be able to ensure correct dependency of filters and their
behaviour.

Prepare DistributedPlan for multiple worker plans

Since Cartesian branches need to be split and handled by each worker, we
now dispatch multiple plans to workers.

Reviewers: florijan, msantl, buda

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1208
2018-02-20 13:02:08 +01:00
Teon Banek
760c6246d8 Make and dispatch worker plans on distributed master
Reviewers: florijan, msantl

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1144
2018-01-29 10:55:57 +01:00
florijan
6fc6a27288 Refactor GraphDb
Summary:
GraphDb is refactored to become an API exposing different parts
necessary for the database to function. These different parts can have
different implementations in SingleNode or distributed Master/Server
GraphDb implementations.

Interally GraphDb is implemented using two class heirarchies. One
contains all the members and correct wiring for each situation. The
other takes care of initialization and shutdown. This architecture is
practical because it can guarantee that the initialization of the
object structure is complete, before initializing state.

Reviewers: buda, mislav.bradac, dgleich, teon.banek

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1093
2018-01-12 16:47:24 +01:00
florijan
7b3d298741 Fix interpreter plan ownership
Summary:
If there was no plan caching, the CachedPlan would not survive
`Interpreter::operator()`, as it was not owned by the
`Interpreter::Result`. If there was caching, it could hapen that the
cache got invalidated while that plan was being interpreted (by
another thread) without that interpretation retaining ownership.

Also simplified code around this.

Reviewers: mislav.bradac, teon.banek

Reviewed By: mislav.bradac

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1084
2017-12-27 13:33:13 +01:00
florijan
9b878c91eb Make interpretation pullable
Reviewers: mislav.bradac, teon.banek

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1079
2017-12-22 15:19:05 +01:00
Mislav Bradac
e078c9f7cd Refactor Interpreter
Reviewers: teon.banek

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D896
2017-10-11 11:06:25 +02:00
Mislav Bradac
31798eb957 Handle index creation correctly
Reviewers: teon.banek, buda

Reviewed By: buda

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D891
2017-10-09 19:22:39 +02:00
Mislav Bradac
3140f175fc Implement record lock deadlock breaker
Reviewers: florijan, buda

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D823
2017-09-27 16:11:05 +02:00
Teon Banek
eb821aa615 Remove ast-cache flag
Summary:
AST caching should be well tested by now.
We should consider removing `Context.is_query_cached_` member as well as the
implementation and tests for `CypherMainVisitor`.

Reviewers: mislav.bradac, mferencevic, buda

Reviewed By: mislav.bradac

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D833
2017-09-27 09:04:31 +02:00
florijan
a9381df09e Edges data structure now supports multiple edge filtering (implicit OR)
Summary: - modified all utils/algorithm functions to be inline and in the utils namespace

Reviewers: teon.banek

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D830
2017-09-26 13:46:18 +02:00
florijan
d9503d6b65 Flags cleanup and QueryEngine removal
Summary:
I started with cleaning flags up (removing unused ones, documenting undocumented ones). There were some flags to remove in `QueryEngine`. Seeing how we never use hardcoded queries (AFAIK last Mislav's testing also indicated they aren't faster then interpretation), when removing those unused flags the `QueryEngine` becomes obsolete. That means that a bunch of other stuff becomes obsolete, along with the hardcoded queries. So I removed it all (this has been discussed and approved on the daily).

Some flags that were previously undocumented in `docs/user_technical/installation` are now documented. The following flags are NOT documented and in my opinion should not be displayed when starting `./memgraph --help` (@mferencevic):
```
query_vertex_count_to_expand_existsing (from rule_based_planner.cpp)
query_max_plans (rule_based_planner.cpp)
```

If you think that another organization is needed w.r.t. flag visibility, comment.

@teon.banek: I had to remove some stuff from CMakeLists to make it buildable. Please review what I removed and clean up if necessary if/when this lands. If the needed changes are minor, you can also comment.

Reviewers: buda, mislav.bradac, teon.banek, mferencevic

Reviewed By: buda, mislav.bradac

Subscribers: pullbot, mferencevic, teon.banek

Differential Revision: https://phabricator.memgraph.io/D825
2017-09-22 17:03:45 +02:00
Teon Banek
182a9241cf Use symbol names for header if missing token position
Summary:
This fixes a bug, where streaming would try to get the name of the symbol from
an invalid token position. For example, `MATCH (n) WITH n RETURN *`

Reviewers: florijan, mislav.bradac

Reviewed By: florijan, mislav.bradac

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D811
2017-09-20 15:38:53 +02:00
Teon Banek
08eaaba77f Add caching logical plans
Summary: Add const to MakeCursor method

Reviewers: florijan, mislav.bradac

Reviewed By: mislav.bradac

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D791
2017-09-20 13:31:26 +02:00
Teon Banek
b31478dfd8 Look into parameter value when estimating plan cost
Summary: Test cost estimator considers parameter values

Reviewers: florijan, mislav.bradac

Reviewed By: florijan

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D798
2017-09-15 16:14:57 +02:00
Teon Banek
d116e0deb1 Generate ParameterLookup instead of PrimitiveLiteral
Summary:
This is done when the generated AST will be cached.
Remove LiteralsPlugger.

Reviewers: florijan, mislav.bradac

Reviewed By: mislav.bradac

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D788
2017-09-13 17:49:34 +02:00
Teon Banek
a83bea0b74 Add ParameterLookup to AST
Reviewers: florijan, mislav.bradac

Reviewed By: mislav.bradac

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D782
2017-09-13 14:28:48 +02:00
Teon Banek
0aee18544d Add caching VerticesCount during planning and estimation
Summary:
Benchmark planning and estimating indexed ScanAll. According to the benchmark,
caching speeds up the whole process of planning and estimation by a factor of
2. Most of the performance gain is in the `CostEstimator` itself, due to plenty
of calls to `VerticesCount` when estimating all of the generated plans.

Reviewers: mislav.bradac, florijan

Reviewed By: mislav.bradac

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D765
2017-09-09 12:26:48 +02:00
Matej Ferencevic
6a1f083370 Added parameters to query logging.
Reviewers: buda

Reviewed By: buda

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D757
2017-09-06 16:48:09 +02:00
Matej Ferencevic
70d9f3f6f1 Refactored harness and added PostgreSQL support.
Summary:
Moved Neo4j config to config dir.

Neo4j and PostgreSQL are now downloaded to libs.

Renamed metadata flags in memgraph.

Changed apollo generate for new harness.

Reviewers: mislav.bradac

Reviewed By: mislav.bradac

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D741
2017-09-05 09:45:41 +02:00
Matej Ferencevic
65c6b50ade Removed query logging in release mode.
Reviewers: mislav.bradac, buda

Reviewed By: mislav.bradac, buda

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D742
2017-09-04 13:37:20 +02:00
Teon Banek
fdc389f1eb Templatize CostEstimator on DbAccessor
Summary:
This allows for inserting dummy DbAccessor in tests. Unfortunate side
effect of this change is that the whole implementation had to be moved
from cpp to hpp.

Also templatize remaining RuleBasedPlanner implementation

Reviewers: florijan, mislav.bradac

Reviewed By: mislav.bradac

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D704
2017-08-24 14:27:14 +02:00
Mislav Bradac
cc8e9f2996 Fix of stupid, stupid bug
Reviewers: mferencevic, buda

Reviewed By: mferencevic

Subscribers: florijan, pullbot

Differential Revision: https://phabricator.memgraph.io/D635
2017-08-04 15:03:59 +02:00
Mislav Bradac
d750970381 Lock when accessing parser to prevent antlr bugs
Reviewers: buda, mferencevic

Reviewed By: buda, mferencevic

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D625
2017-08-03 16:38:08 +02:00
Mislav Bradac
e8a465e4b5 Add query parameters support
Reviewers: buda

Reviewed By: buda

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D570
2017-07-19 18:44:59 +02:00
Mislav Bradac
4dabf31005 Remove unused function time_second
Reviewers: teon.banek

Reviewed By: teon.banek

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D561
2017-07-17 15:55:43 +02:00
Mislav Bradac
50cd53e5c9 Migrate timer to use walltime, instead of cputime
Summary:
 - Move some stuff to poc
 - Use steady_clock instead of system_clock

Reviewers: buda

Reviewed By: buda

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D555
2017-07-17 13:42:44 +02:00
Teon Banek
88153595ce Make GraphDbAccessor mandatory for planning
Reviewers: florijan, mislav.bradac, buda

Reviewed By: buda

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D527
2017-07-10 10:13:12 +02:00
Teon Banek
f5db9d65b0 Add CreateIndex operator
Reviewers: florijan, mislav.bradac, buda

Reviewed By: florijan, buda

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D516
2017-07-03 11:03:15 +02:00