Summary:
- Add the `AnyStream` wrapper
- Remove the `Results` struct and store a function (handler) instead
Reviewers: teon.banek, mferencevic
Reviewed By: teon.banek, mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2497
Summary:
Depends on D2471
- Add pointer to storage to `InterpreterContext`
- Rename `operator()` to `Prepare`
- Use `Interpret` instead of `operator()` (`Interpret` will be removed soon)
- Remove the `in_explicit_transaction` parameter
- Remove the memory resource parameter from `Interpret`
- Remove the storage accessor parameter from `Interpret`
- Fix up tests (remove the `Interpreter` from `database_transaction_timeout`)
Reviewers: teon.banek, mferencevic
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2482
Summary: Make `InterpreterContext` a top level instead of a nested struct
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2512
Summary:
Micro benchmarks show some minor variations compared to the previous
commit. Smaller cases are a bit worse while larger data cases are a bit
better.
Reviewers: mtomic, mferencevic
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2136
Summary:
Micro benchmarks show improvements in performance of MapLiteral from 5%
to 40% depending on the size of the input. On the other hand, a sequence
of AdditionOperators behaves the same with both allocation schemes.
Reviewers: mtomic, mferencevic, msantl
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2132
Summary:
Rename Context to ExecutionContext and make it struct
Move ParsingContext to cypher_main_visitor.hpp
Reviewers: mtomic, llugovic
Reviewed By: llugovic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1810
Summary:
This diff removes the need for a database when parsing a query and
creating an Ast. Instead of storing storage::{Label,Property,EdgeType}
in Ast nodes, we store the name and an index into all of the names. This
allows for easy creation of a map from {Label,Property,EdgeType} index
into the concrete storage type. Obviously, this comes with a performance
penalty during execution, but it should be minor. The upside is that the
query/frontend minimally depends on storage (PropertyValue), which makes
writing tests easier as well as running them a lot faster (there is no
database setup). This is most noticeable in the ast_serialization test
which took a long time due to start up of a distributed database.
Reviewers: mtomic, llugovic
Reviewed By: mtomic
Subscribers: mferencevic, pullbot
Differential Revision: https://phabricator.memgraph.io/D1774
Summary: this should reduce parsing time for very simple queries.
Reviewers: teon.banek, mferencevic
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1698
Summary:
`Query` is now an abstract class which has `CypherQuery`,
`ExplainQuery`, `IndexQuery`, `AuthQuery` and `StreamQuery` as derived
classes. Only `CypherQuery` is forwarded to planner and the rest of the
queries are handled directly in the interpreter. This enabled us to
remove auth, explain and stream operators, clean up `Context` class and
remove coupling between `Results` class and plan cache. This should make
it easier to add similar functionality because no logical operator
boilerplate is needed. It should also be easier to separate community
and enterprise features for open source.
Remove Explain logical operator
Separate IndexQuery in AST
Handle index creation in interpreter
Remove CreateIndex operator and ast nodes
Remove plan cache reference from Results
Move auth queries out of operator tree
Remove auth from context
Fix tests, separate stream queries
Remove in_explicit_transaction and streams from context
Reviewers: teon.banek, mferencevic, msantl
Reviewed By: teon.banek, mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1664
Summary: Up till now, `AstStorage` also took care of tracking the root of the `Query` and loading of cloning of `Query` nodes would change that root. This felt out of place because sometimes `AstStorage` is used only for storing expressions, and we don't even have an entire query in the storage. This diff removes that feature from `AstStorage`. Now its only functionality is owning AST nodes and assigning unique IDs to them.
Reviewers: teon.banek, llugovic
Reviewed By: teon.banek
Subscribers: mferencevic, pullbot
Differential Revision: https://phabricator.memgraph.io/D1646
Summary:
Even though we may not need the results of a RPC, we should check that
it completed without error.
Reviewers: mferencevic, mtomic
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1610
Summary:
In a bunch of places `TypedValue` was used where `PropertyValue` should be. A lot of times it was only because `TypedValue` serialization code could be reused for `PropertyValue`, only without providing callbacks for `VERTEX`, `EDGE` and `PATH`. So first I wrote separate serialization code for `PropertyValue` and put it into storage folder. Then I fixed all the places where `TypedValue` was incorrectly used instead of `PropertyValue`. I also disabled implicit `TypedValue` to `PropertyValue` conversion in hopes of preventing misuse in the future.
After that, I wrote code for `VertexAccessor` and `EdgeAccessor` serialization and put it into `storage` folder because it was almost duplicated in distributed BFS and pull produce RPC messages. On the sender side, some subset of records (old or new or both) is serialized, and on the reciever side, records are deserialized and immediately put into transaction cache.
Then I rewrote the `TypedValue` serialization functions (`SaveCapnpTypedValue` and `LoadCapnpTypedValue`) to not take callbacks for `VERTEX`, `EDGE` and `PATH`, but use accessor serialization functions instead. That means that any code that wants to use `TypedValue` serialization must hold a reference to `GraphDbAccessor` and `DataManager`, so that should make clients reconsider if they really want to use `TypedValue` instead of `PropertyValue`.
Reviewers: teon.banek, msantl
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1598
Summary:
Visitor pattern's main issue is cyclical dependency between classes that
are visited and the visitor instance itself. We need to decouple this
dependency if we want to open source part of the code, namely
non-distributed part. This decoupling is achieved through the use of
`dynamic_cast` in distributed operators. Hopefully the solution is good
enough and doesn't cause performance issues. An alternative solution is
to build our own custom double dispatch solution, but that will
basically boil down to our implementation of runtime type information
and casts.
Note, this only decouples the distributed operators. If and when we
decide that other operators shouldn't be open sourced, the same
`dynamic_cast` pattern should be applied in them also.
Depends on D1563
Reviewers: mtomic, msantl, buda
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1566
Summary:
First iteration in implementing kafka.
Currently, memgraph streams won't use the transform script provided in the
`CREATE STREAM` query.
There is a manual test that serves a POC purpose which we'll use to fully wire
kafka in memgraph.
Since streams need to download the script, I moved curl init from
telemetry.
Reviewers: teon.banek, mferencevic
Reviewed By: mferencevic
Subscribers: ipaljak, pullbot, buda
Differential Revision: https://phabricator.memgraph.io/D1491
Summary:
Previously, our implementation of the Bolt protocol buffered all results in
memory before sending them out to the client. This implementation immediately
streams the results to the client to avoid any memory allocations. Also, this
implementation splits the interpretation and pulling logic into two.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1495
Summary:
Session specifics have been move out of the Bolt `executing` state, and
are accessed via pure virtual Session type. Our server is templated on
the session and we are setting the concrete type, so there should be no
virtual call overhead. Abstract Session is used to indicate the
interface, this could have also been templated, but the explicit
interface definition makes it clearer.
Specific session implementation for running Memgraph is now implemented
in memgraph_bolt, which instantiates the concrete session type. This may
not be 100% appropriate place, but Memgraph specific session isn't
needed anywhere else.
Bolt/communication tests now use a dummy session and depend only on
communication, which significantly improves test run times.
All these changes make the communication a library which doesn't depend
on storage nor the database. Only shared connection points, which aren't
part of the base communication library are:
* glue/conversion -- which converts between storage and bolt types, and
* communication/result_stream_faker -- templated, but used in tests and query/repl
Depends on D1453
Reviewers: mferencevic, buda, mtomic, msantl
Reviewed By: mferencevic, mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1456
Summary:
This is the first step in cutting the crazy dependencies of
communication module to the whole database. Includes have been
reorganized and conversion between DecodedValue and other Memgraph types
(TypedValue and PropertyValue) has been extracted to a higher level
component called `communication/conversion`. Encoder, like Decoder, now
relies only on DecodedValue. Hopefully the conversion operations will
not significantly slow down streaming Bolt data.
Additionally, Bolt ID is now wrapped in a class. Our storage model uses
*unsigned* int64, while Bolt expects *signed* int64. The implicit
conversions may lead to encode/decode errors, so the wrapper should
enforce some type safety to prevent such errors.
Reviewers: mferencevic, buda, msantl, mtomic
Reviewed By: mferencevic, mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1453
Summary:
Ensures that query plans are invalidated on the remote only when it's
guaranteed they will never be used on the master. This is done by
invalidating remote caches in the `CachedPlan` destructor.
There are two unplesant side-effects. First, an RPC call is made in an
object destructor. This is somewhat ugly, but not that different then
making an RPC call that must succeed in any other function. Note that
this does NOT slow down any query execution because the relevant
destructor is called by the skiplist garbage collector. The second ugly
side-effect is that in the unit test now we need to sleep to ensure the
skiplist GC destructs a cached plan before checking that it's
invalidated on the remote worker.
We might want to redesign this at some point.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1302
Summary:
- Remove caches on workers as a result of plan expiration or race
during insertion.
- Extract caching functionality into a class.
- Minor refactor of Interpreter::operator()
- New RPC and test for it.
- Rename ConsumePlanRes to DispatchPlanRes for consistency, remove
return value as it's always true and never used.
- Interpreter is now constructed with a `GraphDb` reference. At the
moment only for reaching the `distributed::PlanDispatcher`, but in
the future we should probably use that primarily for planning.
I added a function to `PlanConsumer` that is only used for testing.
I prefer not doing this, but I felt this needed testing. I can remove
it now if you like.
Reviewers: teon.banek, msantl
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1292
Summary:
This is still very much in progress. No advanced checks are done to
prevent planning unimplemented things. Basic Cartesian product should
work, for example `MATCH (a), (b) CREATE (a)-[:r]->(c)-[:r]->(b)`. But
anything more advanced may lead to undefined behaviour of the planner
and therefore execution. Use at your own risk!
Add ModifiedSymbols method to LogicalOperator
For planning Cartesian, we need information on which symbols are filled
by operator sub-trees. Currently, this is used to set symbols which
should be transferred over network. Later, they should be used to detect
whether filter expressions use symbols modified from Cartesian branches.
Then we will be able to ensure correct dependency of filters and their
behaviour.
Prepare DistributedPlan for multiple worker plans
Since Cartesian branches need to be split and handled by each worker, we
now dispatch multiple plans to workers.
Reviewers: florijan, msantl, buda
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1208
Summary:
GraphDb is refactored to become an API exposing different parts
necessary for the database to function. These different parts can have
different implementations in SingleNode or distributed Master/Server
GraphDb implementations.
Interally GraphDb is implemented using two class heirarchies. One
contains all the members and correct wiring for each situation. The
other takes care of initialization and shutdown. This architecture is
practical because it can guarantee that the initialization of the
object structure is complete, before initializing state.
Reviewers: buda, mislav.bradac, dgleich, teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1093