Summary:
The Kafka Python transform functionality uses a Python script to transform
incoming Kafka data into queries and parameters that are executed against the
database. When starting the Python transform script it is started in a
sandboxed environment so that it can't do harm to the host system or the
database.
Reviewers: msantl, teon.banek
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1509
Summary:
Since we switched to Cap'n Proto serialization there's no need for
keeping boost around anymore.
Reviewers: mtomic, mferencevic, buda
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1515
Summary: first step for a cleaner query front end that might be easier to open source
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1511
Summary:
GraphDbAccessor is now constructed only through GraphDb. This allows the
concrete GraphDb to instantiate a concrete GraphDbAccessor. This allows
us to use virtual calls, so that the implementation may be kept
separate. The major downside of doing things this way is heap allocation
of GraphDbAccessor. In case it turns out to be a real performance
issues, another solution with pointer to static implementation may be
used.
InsertVertexIntoRemote is now a non-member function, which reduces
coupling. It made no sense for it to be member function because it used
only the public parts of GraphDbAccessor.
Reviewers: msantl, mtomic, mferencevic
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1504
Summary:
This change, hopefully, simplifies the implementation of different kinds
of GraphDb. The pimpl idiom is now simplified by removing all of the
crazy inheritance. Implementations classes are just plain data stores,
without any methods. The interface classes now have a more flat
hierarchy:
```
GraphDb (pure interface)
|
+----+---------- DistributedGraphDb (pure interface)
| |
Single Node +-----+------+
| |
Master Worker
```
DistributedGraphDb is used as an intermediate interface for all the
things that should work only in distributed. Therefore, virtual calls
for distributed stuff have been removed from GraphDb. Some are exposed
via DistributedGraphDb, other's are only in concrete Master and Worker
classes. The code which relied on those virtual calls has been
refactored to either use DistributedGraphDb, take a pointer to what is
actually needed or use dynamic_cast. Obviously, dynamic_cast is a
temporary solution and should be replaced with another mechanism (e.g.
virtual call, or some other function pointer style).
The cost of the above change is some code duplication in constructors
and destructors of classes. This duplication has a lot of little tweaks
that make it hard to generalize, not to mention that virtual calls do
not work in constructor and destructor. If we really care about
generalizing this, we should think about abandoning RAII in favor of
constructor + Init method.
The next steps for splitting the dependencies that seem logical are:
1) Split GraphDbAccessor implementation, either via inheritance or
passing in an implementation pointer. GraphDbAccessor should then
only be created by a virtual call on GraphDb.
2) Split Interpreter implementation. Besides allowing single node
interpreter to exist without depending on distributed, this will
enable the planner and operators to be correctly separated.
Reviewers: msantl, mferencevic, ipaljak
Reviewed By: msantl
Subscribers: dgleich, pullbot
Differential Revision: https://phabricator.memgraph.io/D1493
Summary: This is a simple thing that enables us to use keywords as symbolic names. A bad thing is that planning will be done for queries which differ only in keyword case (e.g. "MATCH (x) RETURN x" and "match (x) return x"), but that should not be a significant performance impact as queries are done programmaticaly when high throughput is expected and we don't expect letter case change in that case.
Reviewers: buda, mferencevic, teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1496
Summary:
First iteration in implementing kafka.
Currently, memgraph streams won't use the transform script provided in the
`CREATE STREAM` query.
There is a manual test that serves a POC purpose which we'll use to fully wire
kafka in memgraph.
Since streams need to download the script, I moved curl init from
telemetry.
Reviewers: teon.banek, mferencevic
Reviewed By: mferencevic
Subscribers: ipaljak, pullbot, buda
Differential Revision: https://phabricator.memgraph.io/D1491
Summary:
This change should preclude the need to specify `:capnp-save` and
`:capnp-load` functions for regularly saved elements of `std::vector<T>`
and `std::optional<T>`. Regular saving in this context means saving
primitive types or compound types which have a `Save(capnp::Builder *)`
method.
Reviewers: mtomic, msantl
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1497
Summary:
Previously, our implementation of the Bolt protocol buffered all results in
memory before sending them out to the client. This implementation immediately
streams the results to the client to avoid any memory allocations. Also, this
implementation splits the interpretation and pulling logic into two.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1495
Summary:
This change should completely support planning Optional for distributed
execution. Cartesian matching is handled, as well as dependencies
between optional branch and the main input branch.
Unit tests are expanded to cover the planning algorithm.
Reviewers: msantl, mtomic
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1480
Summary:
Session specifics have been move out of the Bolt `executing` state, and
are accessed via pure virtual Session type. Our server is templated on
the session and we are setting the concrete type, so there should be no
virtual call overhead. Abstract Session is used to indicate the
interface, this could have also been templated, but the explicit
interface definition makes it clearer.
Specific session implementation for running Memgraph is now implemented
in memgraph_bolt, which instantiates the concrete session type. This may
not be 100% appropriate place, but Memgraph specific session isn't
needed anywhere else.
Bolt/communication tests now use a dummy session and depend only on
communication, which significantly improves test run times.
All these changes make the communication a library which doesn't depend
on storage nor the database. Only shared connection points, which aren't
part of the base communication library are:
* glue/conversion -- which converts between storage and bolt types, and
* communication/result_stream_faker -- templated, but used in tests and query/repl
Depends on D1453
Reviewers: mferencevic, buda, mtomic, msantl
Reviewed By: mferencevic, mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1456
Summary:
This is the first step in cutting the crazy dependencies of
communication module to the whole database. Includes have been
reorganized and conversion between DecodedValue and other Memgraph types
(TypedValue and PropertyValue) has been extracted to a higher level
component called `communication/conversion`. Encoder, like Decoder, now
relies only on DecodedValue. Hopefully the conversion operations will
not significantly slow down streaming Bolt data.
Additionally, Bolt ID is now wrapped in a class. Our storage model uses
*unsigned* int64, while Bolt expects *signed* int64. The implicit
conversions may lead to encode/decode errors, so the wrapper should
enforce some type safety to prevent such errors.
Reviewers: mferencevic, buda, msantl, mtomic
Reviewed By: mferencevic, mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1453
Summary:
Added test stream functionality. When a stream is configured, it will try to
consume messages from a kafka topic and return them back to the user.
For now, the messages aren't transformed, so it just returns the payload string.
Depends on D1466
Next steps are persisting stream metadata and transforming messages in order to
store them in the graph.
Reviewers: teon.banek, mtomic
Reviewed By: teon.banek
Subscribers: pullbot, buda
Differential Revision: https://phabricator.memgraph.io/D1474
Summary:
Integrated kafka library into memgraph. This version supports all opencypher
features and will only output messages consumed from kafka.
Depends on D1434
Next steps are persisting stream metadata and transforming messages in order to
store them in the graph.
Reviewers: teon.banek, mtomic, mferencevic, buda
Reviewed By: teon.banek
Subscribers: mferencevic, pullbot, buda
Differential Revision: https://phabricator.memgraph.io/D1466
Summary: When doing bfs with given endpoint, we can stop the traversal on first successful pull.
Reviewers: teon.banek, msantl, mculinovic, buda
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1469
Summary:
Added basic functionality for kafka streams. The `CREATE STREAM` clause is a
simplified version from the one mentioned in D1415 so we can start testing
end-to-end sooner.
This diff also includes a bug fix in `lcp.list ` for operators that have no
members.
Reviewers: teon.banek, mtomic, buda
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1434
Summary:
For reasons unknown to man, antlr didn't want to parse calls
to functions whose name is a single hex letter properly. Now it should work.
Reviewers: teon.banek, buda
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1464
Summary:
Added missing string functions.
I also cleaned up error messages a bit in effort to make them uniform.
Reviewers: teon.banek, buda
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1458
Summary:
Hopefully, the mechanism of generating Cartesian is general enough, so
this simple change should work correctly in all cases.
Planner tests have been modified to use a FakeDbAccessor in order to
speed them up and potentially allow extracting planning into a library.
Reviewers: msantl, mtomic, buda
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1431
Summary:
This change should correctly plan Cartesian which have dependent Filter
or Expand operators. Tests have been added for those cases. Other cases
are not yet supported and should throw an exception.
Reviewers: msantl, mtomic, buda
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1426
Summary:
This is the initial step to getting a correct version of distributed
planning of Cartesian operator. Functions and structs have been added
which should collect enough information to correctly order the execution
with regards to dependencies among Cartesian branches. The support
functionality should be the same as was before, but unsupported cases
should now raise an exception instead of leading to undefined behaviour.
Reviewers: msantl, mtomic, buda
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1418
Summary:
Converts the RPC stack to use Cap'n Proto for serialization instead of
boost. There are still some traces of boost in other places in the code,
but most of it is removed. A future diff should cleanup boost for good.
The RPC API is now changed to be more flexible with regards to how
serialize data. This makes the simplest cases a bit more verbose, but
allows complex serialization code to be correctly written instead of
relying on hacks. (For reference, look for the old serialization of
`PullRpc` which had a nasty pointer hacks to inject accessors in
`TypedValue`.)
Since RPC messages were uselessly modeled via inheritance of Message
base class, that class is now removed. Furthermore, that approach
doesn't really work with Cap'n Proto. Instead, each message type is
required to have some type information. This can be automated, so
`define-rpc` has been added to LCP, which hopefully simplifies defining
new RPC request and response messages.
Specify Cap'n Proto schema ID in cmake
This preserves Cap'n Proto generated typeIds across multiple generations
of capnp schemas through LCP. It is imperative that typeId stays the
same to ensure that different compilations of Memgraph may communicate
via RPC in a distributed cluster.
Use CLOS for meta information on C++ types in LCP
Since some structure slots and functions have started to repeat
themselves, it makes sense to model C++ meta information via Common Lisp
Object System.
Depends on D1391
Reviewers: buda, dgleich, mferencevic, mtomic, mculinovic, msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1407
Summary:
Add additional structs and functions for handling C++ meta information.
Add capnp-file and capnp-id arguments to lcp:process-file.
Generate cpp along with hpp and capnp in lcp.
Wrap LogicalOperator base class in lcp:define-class.
Modify logical operators for capnp serialization.
Add query/common.capnp.
Reviewers: mculinovic, buda, mtomic, msantl, ipaljak, dgleich, mferencevic
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1391
Summary:
Command id is necessary in remote produce to identify an ongoing pull
because a transaction can have multiple commands that all belong under
the same plan and tx id.
Reviewers: teon.banek, mtomic, buda
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1386
Summary:
Removing `AS_IS` from GraphView because it doesn't seem like it is necessary for query execution and it also has weird semantics (you might get a mix of old and new records). `Unwind`, `OrderBy` and `PullRemoteOrderBy` now use `OLD` graph view.
Remove AS_IS from GraphView
Fix query_cost_estimator tests
Fix query_expression_evaluator tests
Fix query_plan_match_filter_return tests
Fix query_plan_create_set_remove_delete tests
Fix query_plan_accumulate_aggregate tests
Reviewers: teon.banek, buda
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1390
Summary:
Since we are moving from boost to Capnp for serialization, it makes
sense to keep all of the LogicalOperator classes in LCP format. This
will make it easier to generate Capnp code.
Depends on D1361
Reviewers: buda, mferencevic, msantl, dgleich, ipaljak, mculinovic, mtomic
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1362
Summary:
Pointers in AstTreeStorage can point to same nodes. These nodes
should be serialized(saved) only once when serializing Ast tree.
Same goes for deserializing(loading).
When deserializing nodes, one should not use storage Create method,
because it will add node which isn't completely loaded to storage vector.
Therefore, one should use new Deserialize method which doesn't
add node to storage vector. Inserting node into storage vector
is defered until all node data is loaded.
Pass pointer to saved uids set, instead of storage
Delete unused set
Reviewers: teon.banek, msantl
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1344
Summary:
Added a new markdown file for concepts and a new test to test the edge
case with upper bound.
Reviewers: teon.banek, mtomic, dgleich, buda, ipaljak
Reviewed By: teon.banek, dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1366
Summary:
When commiting/aborting a transaction in tx master engine, make a two
phase commit to all workers so they can stop all futures and clear
transactional cache.
Reviewers: dgleich, florijan
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1320
Summary:
Before we used `utils::Future` only where it's created by our `ThreadPool`.
I suggest in this diff that we use it everywhere, it's a bit more defensive and
should not have any downsides.
Reviewers: msantl, teon.banek
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1316
Summary:
Add different priority VLOGs for distributed memgraph.
For level 3 you'll get logs for dispatching/consuming plans.
For level 4 you'll get logs for tx start/commit/abort, remote produce, remote
pull, remote result consume,
For level 5 there will be a log for each request/response made by the RPC
client.
Master log snippet P9
Worker log snippet P10
Reviewers: florijan, teon.banek
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1296
Summary:
Remove "produce_" and "Produce" as prefix from all distributed stuff.
It's not removed in src/query/ stuff (operators).
Reviewers: dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1315
Summary:
Ensures that query plans are invalidated on the remote only when it's
guaranteed they will never be used on the master. This is done by
invalidating remote caches in the `CachedPlan` destructor.
There are two unplesant side-effects. First, an RPC call is made in an
object destructor. This is somewhat ugly, but not that different then
making an RPC call that must succeed in any other function. Note that
this does NOT slow down any query execution because the relevant
destructor is called by the skiplist garbage collector. The second ugly
side-effect is that in the unit test now we need to sleep to ensure the
skiplist GC destructs a cached plan before checking that it's
invalidated on the remote worker.
We might want to redesign this at some point.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1302
Summary:
- Remove caches on workers as a result of plan expiration or race
during insertion.
- Extract caching functionality into a class.
- Minor refactor of Interpreter::operator()
- New RPC and test for it.
- Rename ConsumePlanRes to DispatchPlanRes for consistency, remove
return value as it's always true and never used.
- Interpreter is now constructed with a `GraphDb` reference. At the
moment only for reaching the `distributed::PlanDispatcher`, but in
the future we should probably use that primarily for planning.
I added a function to `PlanConsumer` that is only used for testing.
I prefer not doing this, but I felt this needed testing. I can remove
it now if you like.
Reviewers: teon.banek, msantl
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1292
Summary:
This fixes a bug where the planning would raise NotYetImplemented error,
due to preventing plan splitting while the operator is already on
master. For example, `MATCH (a), (b) CREATE (a)-[e:r]->(b) RETURN e`
would split the plan after Cartesian and before Create. Thus, the rest
of the plan would be on master, including the Accumulate before Produce.
We still can (and must) replace Accumulate with Synchronize but this
would fail due to unneeded check.
Reviewers: florijan, msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1272
Summary:
I have not thoroughly thought this through, especially the worker
destruction (is it legit to abort all running tx?), but it's tested to
abort during remote pull, what we need.
Also I improved error handling for vertex deletion failure during
remote pull (@dgleich).
Reviewers: teon.banek, msantl, dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1263
Summary:
If the `db` says we need to abort we should abort. There are two
places (in both `PullRemote` and `PullRemoteOrderBy`) where the check is made.
The first one is on the very beginning of the `Pull` method and the second one
is in the loop that checks/waits for remote results.
Reviewers: teon.banek, florijan
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1265
Summary:
Usually a single RPC call takes 50us, which is a lot lower than our
millisecond sleep. Calling sleep with less than a ms duration may not
be really supported (due to timer granularity), but this change should
cause some sub-ms sleep.
Reviewers: florijan, msantl, mferencevic
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1258
Summary:
During distributed execution, OrderBy is split across workers and the
master gets to merge those results via PullRemoteOrderBy. Since this
operator may be an input to almost any other operator, virtual accessors
to `input` have been added in LogicalOperator.
Depends on D1221
Reviewers: florijan, msantl, buda
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1232
Summary:
Extracted `TypedValueVectorCompare` and `RemotePuller` from operators
so it can be reused.
The new `PullRemoteOrerBy` operator pulls one result from each worker and one
from master, relies on the fact that workers/master returned sorted results,
returns the next one in order, and pulls the source of that result to get the
next one.
Depends on D1215 that (at the moment) is still in review.
Reviewers: florijan, teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1221
Summary:
- The expansion vertex gets created on the origin's worker
- The edge automatically gets created wherever necessary
- Vertex creation logic reuse
- End to end test
Reviewers: teon.banek, msantl
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1227
Summary:
- Add the said test
- Add `workerid` function
- Add random wait (enabled with flag) to `rpc::Client::Call` for network
latency simulation
Reviewers: buda, mferencevic, teon.banek
Reviewed By: buda, teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1222
Summary:
This may improve the query execution workload, because the Expand will
yield local results while it waits for remote ones. Note that we rely on
the fact that walking the graph produces results without any predetermined
order. Therefore, we can yield paths as we see fit. Enforcing the order
is done through OrderBy operator, and this change shouldn't affect that.
Reviewers: florijan, msantl, buda
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1215
Summary:
- Add `database::GraphDb::GetWorkerIds()`
- Change `CreateNode` constructor API
- Make `CreateNode` distribute nodes uniformly over workers
Did not yet modify `CreateExpand`, coming in the next diff.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1216
Summary:
This is still very much in progress. No advanced checks are done to
prevent planning unimplemented things. Basic Cartesian product should
work, for example `MATCH (a), (b) CREATE (a)-[:r]->(c)-[:r]->(b)`. But
anything more advanced may lead to undefined behaviour of the planner
and therefore execution. Use at your own risk!
Add ModifiedSymbols method to LogicalOperator
For planning Cartesian, we need information on which symbols are filled
by operator sub-trees. Currently, this is used to set symbols which
should be transferred over network. Later, they should be used to detect
whether filter expressions use symbols modified from Cartesian branches.
Then we will be able to ensure correct dependency of filters and their
behaviour.
Prepare DistributedPlan for multiple worker plans
Since Cartesian branches need to be split and handled by each worker, we
now dispatch multiple plans to workers.
Reviewers: florijan, msantl, buda
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1208
Summary:
On the master cleanups are hooked directly into the transaction engine.
This is beneficial because the master might have bigger caches and we
want to clear them as soon as possible.
On the workers there is a periodic RPC call to the master about living
transactions, which takes care of releasing local caches. This is
suboptimal because long transactions will prevent cache GC (like with
data GC). It is however fairly simple.
Note that all cleanup is not done automatically and `RemotePull` has
been reduced accordingly. @msantl, please verify correctness and
consider if the code can be additionally simplified.
Reviewers: teon.banek, msantl
Reviewed By: msantl
Subscribers: pullbot, msantl
Differential Revision: https://phabricator.memgraph.io/D1202
Summary:
Pulls left op cursor and keeps the result, and then for each pull of
the right op cursor, adds all the left op results to produce a cartesian
product.
Reviewers: teon.banek, florijan
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1201
Summary:
Added a hidden flag, a int32 value, that represents the number of
milliseconds the thread that pulls remote results should sleep in case when the
local results are exhausted but there are still some remote results that are
pending.
Reviewers: teon.banek, florijan
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1198
Summary:
The same behaviour can now be achieved by passing a nullptr for input
operator.
Reviewers: florijan, msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1195
Summary:
Distributed plan is now dispatched before inserting into a cache. This
prevents a race condition when another thread would take the cached plan
and assume it was dispatched, thus crashing during execution.
The fix introduces a possibility that the same plan may be distributed
and dispatched twice (with different plan IDs). This shouldn't raise an
issue during execution, because only one plan will be cached and thus
executed. Obviously, we still need to invalidate the unused plan from
worker's caches.
Reviewers: florijan, msantl
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1178
Summary:
This is a basic take on planning distributed writes. Main logic is in handling
the Accumulate operator, which requires Synchronize operation if the results
are used after modification.
Tests for planning distributed writes have been added.
Reviewers: florijan, msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1172
Summary:
This avoids the unnecessary work of storing symbols which can never be
read after an aggregation is complete. With regards to distributed, a
major benefit is gained in reducing what is transferred over the
network. Hopefully, this doesn't break some obscure case where we
actually needed to remember all used symbols.
Reviewers: florijan, mislav.bradac, msantl
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1158
Summary:
Queries such as `RETURN 1` should only be run on a single machine. This
change assumes that a query should only be distributed if it contains at
least one `ScanAll` operator, i.e. a `MATCH` clause.
Reviewers: florijan, msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1154
Summary:
Remote pulls can now be async. Note that
`RemotePullRpcClients::RemotePull` takes references to data structures
which should not be temporary in the caller. Still, maybe safer to make
copies?
Changed `RpcWorkerClients` API to make that possible.
Reviewers: dgleich, msantl, teon.banek
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1150
Summary: PullRemoteCursor will pull all clients in a RoundRobin fashion until all clients are exhausted and there are no more results to return.
Reviewers: teon.banek, florijan
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1147
Summary:
NOTE: This diff is still in progress. Many TODOs, lacking documentation
etc. But the main logic is there (some could be different), and it tests
out OK.
Reviewers: teon.banek, msantl, buda
Reviewed By: teon.banek
Subscribers: mferencevic, pullbot
Differential Revision: https://phabricator.memgraph.io/D1138
Summary:
Remove AdvanceCommand operator declaration.
Add query/plan/distributed source files.
Add hacked cloning of LogicalOperator via serialization.
Add virtual Default methods to CompositeVisitor.
Use BOOST_CLASS_EXPORT_KEY in ast.hpp and operator.hpp.
This is needed in a single binary to correctly serialize polymorphic
classes. The previous implementation worked because the memgraph_lib was
linked with test binaries, but nobody was actually serializing things
inside the single memgraph binary itself.
Print PullRemote symbols in tests/manual/query_planner
Add names to implicitly created aggregation symbols
Reviewers: florijan, msantl
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1137
Summary:
With the added support for serialization, we should be able to transfer
plans across distributed workers.
The planner tests has been extended to test serialization. Operators
should be mostly tested, but the expression they contain aren't
completely. The quick solution is to use typeid for rudimentary
expression equality testing. The more involved solution of comparing
the expression tree for equality is the correct choice. It should be
done in the near future.
Reviewers: florijan, msantl
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1122
Summary:
Other than the plan operators and the frame, we will need to pass the
generated symbol table to distributed workers.
Reviewers: florijan, msantl
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1123
Summary:
Previously, we would have a `DCHECK` which crashes the application. This
was evident when testing a queries, such as:
MATCH (n) DELETE n SET n.prop = 42
Since the argument to update clauses is evaluated during execution, it
makes it very difficult to prevent such errors during semantic analysis.
For example:
MATCH (n)--(m) WITH collect(n) as ns, m
DETACH DELETE ns[m.prop] SET head(ns).prop = 42
Test query updates on deleted graph elements
Reviewers: florijan, dgleich
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1114
Summary:
GraphDb is refactored to become an API exposing different parts
necessary for the database to function. These different parts can have
different implementations in SingleNode or distributed Master/Server
GraphDb implementations.
Interally GraphDb is implemented using two class heirarchies. One
contains all the members and correct wiring for each situation. The
other takes care of initialization and shutdown. This architecture is
practical because it can guarantee that the initialization of the
object structure is complete, before initializing state.
Reviewers: buda, mislav.bradac, dgleich, teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1093
Summary:
Serialization of vertices and edges for distributed. Based on Boost
serialization. Threrefore moved TypedValue serialization from AST to
utils.
Reviewers: buda, dgleich, teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1088
Summary:
If there was no plan caching, the CachedPlan would not survive
`Interpreter::operator()`, as it was not owned by the
`Interpreter::Result`. If there was caching, it could hapen that the
cache got invalidated while that plan was being interpreted (by
another thread) without that interpretation retaining ownership.
Also simplified code around this.
Reviewers: mislav.bradac, teon.banek
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1084
Summary:
Although the first solution used cereal, the final implementation uses
boost. Since the cereal is still used in the codebase, compilation has
been modified to support multithreaded cereal.
In addition to serializing Ast classes, the following also needed to be
serialized:
* GraphDbTypes
* Symbol
* TypedValue
TypedValue is treated specially, by inlining the serialization code in
the Ast class, concretely PrimitiveLiteral.
Another special case was the Function Ast class, which now stores a
function name which is resolved to a concrete std::function on
construction.
Tests have been added for serialized Ast in
tests/unit/cypher_main_visitor
Reviewers: mferencevic, mislav.bradac, florijan
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1067
Summary:
Special casing the last token when adding it to the named expression list.
Added a test to check this case.
Reviewers: mislav.bradac
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1052
Summary: Add curvy braces handling in `QueryStripper` and an accompanying test for it.
Reviewers: mislav.bradac
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1051
Summary:
Union query combinator implementation consists of:
* adjustments to the AST and `cypher_main_visitor`
* enabling `QueryStripper` to parse multiple `return` statements (not stopping after first)
* symbol generation for union results
* union logical operator
* query plan generator adjustments
Reviewers: teon.banek, mislav.bradac
Reviewed By: teon.banek
Subscribers: pullbot, buda
Differential Revision: https://phabricator.memgraph.io/D1038
Summary:
What's done:
- `RecordAccessor` can represent remote data
- `GraphDbAccessor` manages remote data
- Cleanup: different `EdgeAccessor lazyness (@dgleich: take a look), unused methods, documentation...
- `TODO` placeholders for remote implementation
What's not done:
- RPC and data transfer
- how exactly remote errors are handled
- not sure if any MVCC Record info for remote data should be tracked
- WAL and RPC Deltas properly handled (Gleich working on extracting `Wal::Op`)
This implementation should not break single-node execution, and should provide good abstractions and placeholders for distributed. Once that's satisfied, it should land.
Reviewers: dgleich, buda, mislav.bradac
Reviewed By: dgleich
Subscribers: dgleich, pullbot
Differential Revision: https://phabricator.memgraph.io/D1030
Summary:
This change generates multiple PropertyFilters for expressions such as
`n.prop1 = m.prop2`. When choosing one PropertyFilter, we want to also
remove the other one, because they represent the same original
expression. Therefore, the removal is no longer based on FilterInfo
equality, but on the original expression equality. Additionally,
FilterInfo and PropertyFilter equality operators have been removed to
avoid any pretense they do what you expect or want.
Reviewers: florijan, msantl
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1021
Summary:
Referring to the TCK failure on:
```
MATCH (a {name: 'Andres'})<-[:FATHER]-(child)
RETURN {foo: a.name='Andres', kids: collect(child.name)}
```
In the planner we'd only treat a list|map as a group_by if it contained
no aggregations. That's changed so that if a map contains both aggregations
and non-aggregations, then non-aggregations are treated as individual
group_by expressions.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: buda, pullbot, teon.banek
Differential Revision: https://phabricator.memgraph.io/D1004
Summary:
In preparation for distributed storage we need to have labels/properties/edgetypes uniquely identifiable by their ids, which will be global in near future.
The old design has to be abandoned because it's not possible to keep track of global labels/properties/edgetypes while they are local pointers.
Reviewers: mislav.bradac, florijan
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D993
Summary:
Fixing https://app.asana.com/0/170237629387822/481366792497820/f
Test plan:
Started two builds, one with the fix and the second one without the fix.
Connected to each of them using `neo4j` client.
Logs received from the build wihtout the fix:
```
neo4j> Create (n: BigInteger{id:12345678912345678912345}) return n;
<interactive>:0:0: error:
```
```
I1120 13:29:09.551208 30482 executing.hpp:69] [Run] 'Create (n: BigInteger{id:12345678912345678912345}) return n'
W1120 13:29:09.552387 30482 executing.hpp:145] Error message:
```
Logs received from the build with the fix:
```
neo4j> Create (n: BigInteger{id:12345678912345678912345}) return n;
<interactive>:0:0: error: Integer literal exceeds 64 bits
```
```
I1120 13:29:07.940943 30453 executing.hpp:69] [Run] 'Create (n: BigInteger{id:12345678912345678912345}) return n'
W1120 13:29:07.942919 30453 executing.hpp:146] Error message: Integer literal exceeds 64 bits
```
Reviewers: mislav.bradac, teon.banek
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D997
Summary:
My dear fellow Memgraphians. It's friday afternoon, and I am as ready to pop as WAL is to get reviewed...
What's done:
- Vertices and Edges have global IDs, stored in `VersionList`. Main storage is now a concurrent map ID->vlist_ptr.
- WriteAheadLog class added. It's based around buffering WAL::Op objects (elementraly DB changes) and periodically serializing and flusing them to disk.
- Snapshot recovery refactored, WAL recovery added. Snapshot format changed again to include necessary info.
- Durability testing completely reworked.
What's not done (and should be when we decide how):
- Old WAL file purging.
- Config refactor (naming and organization). Will do when we discuss what we want.
- Changelog and new feature documentation (both depending on the point above).
- Better error handling and recovery feedback. Currently it's all returning bools, which is not fine-grained enough (neither for errors nor partial successes, also EOF is reported as a failure at the moment).
- Moving the implementation of WAL stuff to .cpp where possible.
- Not sure if there are transactions being created outside of `GraphDbAccessor` and it's `BuildIndex`. Need to look into.
- True write-ahead logic (flag controlled): not committing a DB transaction if the WAL has not flushed it's data. We can discuss the gain/effort ratio for this feature.
Reviewers: buda, mislav.bradac, teon.banek, dgleich
Reviewed By: dgleich
Subscribers: mtomic, pullbot
Differential Revision: https://phabricator.memgraph.io/D958
Summary:
We need to explicitly say that UNION clause isn't supported, otherwise
it gets silently ignored. So for example, `CREATE () UNION CREATE ()`
would create 2 nodes without a hitch. On the other hand,
`RETURN 1 UNION RETURN 2` would complain that there is more than 1
RETURN in the query, which was misleading.
Reviewers: florijan, mislav.bradac
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D976
Summary:
Previously, named path symbols remained untracked as `new_symbols` during planning. This meant that
operator `Optional` would be left unaware of those symbols, and therefore not reset them to `Null`
if optional matching failed.
Test Optional operator will be aware of path symbols
Reviewers: florijan, mislav.bradac
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D974
Summary:
Warnings I ignored:
Creates a new stacktrace object and then dumps it
102.570384 (102.495075) E[1]: [src/utils/exceptions.hpp:116]: (performance) Variable 'stacktrace_' is assigned in constructor body. Consider performing initialization in initialization list.
102.570390 (102.495081) E[1]: [src/utils/exceptions.hpp:127]: (performance) Variable 'stacktrace_' is assigned in constructor body. Consider performing initialization in initialization list.
Used all over the codebase without explicit cast
102.570412 (102.495103) E[1]: [src/utils/stacktrace.hpp:14]: (style) Class 'Line' has a constructor with 1 argument that is not explicit.
Not really used anywhere before initialized:
102.570526 (102.495217) E[1]: [src/data_structures/concurrent/skiplist.hpp:467]: (warning) Member variable 'Accessor::preds' is not initialized in the constructor.
102.570530 (102.495221) E[1]: [src/data_structures/concurrent/skiplist.hpp:467]: (warning) Member variable 'Accessor::succs' is not initialized in the constructor.
Implicit conversions between types are used all over the codebase:
102.570548 (102.495239) E[1]: [src/storage/property_value.hpp:41]: (style) Class 'PropertyValue' has a constructor with 1 argument that is not explicit.
102.570552 (102.495243) E[1]: [src/storage/property_value.hpp:42]: (style) Class 'PropertyValue' has a constructor with 1 argument that is not explicit.
102.570557 (102.495248) E[1]: [src/storage/property_value.hpp:43]: (style) Class 'PropertyValue' has a constructor with 1 argument that is not explicit.
102.570561 (102.495252) E[1]: [src/storage/property_value.hpp:44]: (style) Class 'PropertyValue' has a constructor with 1 argument that is not explicit.
102.570566 (102.495257) E[1]: [src/storage/property_value.hpp:47]: (style) Class 'PropertyValue' has a constructor with 1 argument that is not explicit.
102.570570 (102.495261) E[1]: [src/storage/property_value.hpp:50]: (style) Class 'PropertyValue' has a constructor with 1 argument that is not explicit.
102.570578 (102.495269) E[1]: [src/storage/property_value.hpp:53]: (style) Class 'PropertyValue' has a constructor with 1 argument that is not explicit.
102.570582 (102.495273) E[1]: [src/storage/property_value.hpp:57]: (style) Class 'PropertyValue' has a constructor with 1 argument that is not explicit.
102.570591 (102.495282) E[1]: [src/query/typed_value.hpp:80]: (style) Class 'TypedValue' has a constructor with 1 argument that is not explicit.
102.570596 (102.495287) E[1]: [src/query/typed_value.hpp:81]: (style) Class 'TypedValue' has a constructor with 1 argument that is not explicit.
102.570601 (102.495292) E[1]: [src/query/typed_value.hpp:82]: (style) Class 'TypedValue' has a constructor with 1 argument that is not explicit.
102.570605 (102.495296) E[1]: [src/query/typed_value.hpp:83]: (style) Class 'TypedValue' has a constructor with 1 argument that is not explicit.
102.570609 (102.495300) E[1]: [src/query/typed_value.hpp:89]: (style) Class 'TypedValue' has a constructor with 1 argument that is not explicit.
102.570614 (102.495305) E[1]: [src/query/typed_value.hpp:92]: (style) Class 'TypedValue' has a constructor with 1 argument that is not explicit.
102.570618 (102.495309) E[1]: [src/query/typed_value.hpp:95]: (style) Class 'TypedValue' has a constructor with 1 argument that is not explicit.
102.570623 (102.495314) E[1]: [src/query/typed_value.hpp:98]: (style) Class 'TypedValue' has a constructor with 1 argument that is not explicit.
102.570627 (102.495318) E[1]: [src/query/typed_value.hpp:102]: (style) Class 'TypedValue' has a constructor with 1 argument that is not explicit.
102.570632 (102.495323) E[1]: [src/query/typed_value.hpp:105]: (style) Class 'TypedValue' has a constructor with 1 argument that is not explicit.
102.570636 (102.495327) E[1]: [src/query/typed_value.hpp:108]: (style) Class 'TypedValue' has a constructor with 1 argument that is not explicit.
102.570641 (102.495332) E[1]: [src/query/typed_value.hpp:109]: (style) Class 'TypedValue' has a constructor with 1 argument that is not explicit.
102.570645 (102.495336) E[1]: [src/communication/bolt/v1/decoder/decoded_value.hpp:88]: (style) Class 'DecodedValue' has a constructor with 1 argument that is not explicit.
102.570650 (102.495341) E[1]: [src/communication/bolt/v1/decoder/decoded_value.hpp:89]: (style) Class 'DecodedValue' has a constructor with 1 argument that is not explicit.
102.570654 (102.495345) E[1]: [src/communication/bolt/v1/decoder/decoded_value.hpp:90]: (style) Class 'DecodedValue' has a constructor with 1 argument that is not explicit.
102.570659 (102.495350) E[1]: [src/communication/bolt/v1/decoder/decoded_value.hpp:91]: (style) Class 'DecodedValue' has a constructor with 1 argument that is not explicit.
102.570663 (102.495354) E[1]: [src/communication/bolt/v1/decoder/decoded_value.hpp:94]: (style) Class 'DecodedValue' has a constructor with 1 argument that is not explicit.
102.570668 (102.495359) E[1]: [src/communication/bolt/v1/decoder/decoded_value.hpp:97]: (style) Class 'DecodedValue' has a constructor with 1 argument that is not explicit.
102.570672 (102.495363) E[1]: [src/communication/bolt/v1/decoder/decoded_value.hpp:100]: (style) Class 'DecodedValue' has a constructor with 1 argument that is not explicit.
102.570677 (102.495368) E[1]: [src/communication/bolt/v1/decoder/decoded_value.hpp:104]: (style) Class 'DecodedValue' has a constructor with 1 argument that is not explicit.
102.570681 (102.495372) E[1]: [src/communication/bolt/v1/decoder/decoded_value.hpp:107]: (style) Class 'DecodedValue' has a constructor with 1 argument that is not explicit.
102.570690 (102.495381) E[1]: [src/communication/bolt/v1/decoder/decoded_value.hpp:110]: (style) Class 'DecodedValue' has a constructor with 1 argument that is not explicit.
102.570694 (102.495385) E[1]: [src/communication/bolt/v1/decoder/decoded_value.hpp:113]: (style) Class 'DecodedValue' has a constructor with 1 argument that is not explicit.
CypherParser:
102.570767 (102.495458) E[1]: [src/query/frontend/opencypher/generated/CypherParser.h:69]: (style) Class 'CypherParser' has a constructor with 1 argument that is not explicit.
102.570772 (102.495463) E[1]: [src/query/frontend/opencypher/generated/CypherLexer.h:40]: (style) Class 'CypherLexer' has a constructor with 1 argument that is not explicit.
102.570776 (102.495467) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:86]: (style) The scope of the variable '_la' can be reduced.
102.570781 (102.495472) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:311]: (style) The scope of the variable '_la' can be reduced.
102.570785 (102.495476) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:402]: (style) The scope of the variable '_la' can be reduced.
102.570789 (102.495480) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:497]: (style) The scope of the variable '_la' can be reduced.
102.570797 (102.495488) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:778]: (style) The scope of the variable '_la' can be reduced.
102.570802 (102.495493) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:895]: (style) The scope of the variable '_la' can be reduced.
102.570806 (102.495497) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:991]: (style) The scope of the variable '_la' can be reduced.
102.570811 (102.495502) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:1190]: (style) The scope of the variable '_la' can be reduced.
102.570815 (102.495506) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:1274]: (style) The scope of the variable '_la' can be reduced.
102.570820 (102.495511) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:1393]: (style) The scope of the variable '_la' can be reduced.
102.570824 (102.495515) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:1570]: (style) The scope of the variable '_la' can be reduced.
102.570829 (102.495520) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:1695]: (style) The scope of the variable '_la' can be reduced.
102.570834 (102.495525) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:1800]: (style) The scope of the variable '_la' can be reduced.
102.570839 (102.495530) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:1903]: (style) The scope of the variable '_la' can be reduced.
102.570843 (102.495534) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:2019]: (style) The scope of the variable '_la' can be reduced.
102.570848 (102.495539) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:2228]: (style) The scope of the variable '_la' can be reduced.
102.570852 (102.495543) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:2542]: (style) The scope of the variable '_la' can be reduced.
102.570857 (102.495548) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:2797]: (style) The scope of the variable '_la' can be reduced.
102.570861 (102.495552) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:2966]: (style) The scope of the variable '_la' can be reduced.
102.570866 (102.495557) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:3067]: (style) The scope of the variable '_la' can be reduced.
102.570870 (102.495561) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:3289]: (style) The scope of the variable '_la' can be reduced.
102.570875 (102.495566) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:3295]: (style) The scope of the variable 'alt' can be reduced.
102.570879 (102.495570) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:3419]: (style) The scope of the variable '_la' can be reduced.
102.570884 (102.495575) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:3596]: (style) The scope of the variable '_la' can be reduced.
102.570888 (102.495579) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:3688]: (style) The scope of the variable '_la' can be reduced.
102.570893 (102.495584) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:3963]: (style) The scope of the variable '_la' can be reduced.
102.570897 (102.495588) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:4452]: (style) The scope of the variable '_la' can be reduced.
102.570902 (102.495593) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:4586]: (style) The scope of the variable '_la' can be reduced.
102.570906 (102.495597) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:4813]: (style) The scope of the variable '_la' can be reduced.
102.570911 (102.495602) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:4943]: (style) The scope of the variable '_la' can be reduced.
102.570918 (102.495609) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:5026]: (style) The scope of the variable '_la' can be reduced.
102.570923 (102.495614) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:5569]: (style) The scope of the variable '_la' can be reduced.
102.570928 (102.495619) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:5664]: (style) The scope of the variable '_la' can be reduced.
102.570932 (102.495623) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:5755]: (style) The scope of the variable '_la' can be reduced.
102.570937 (102.495628) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:5888]: (style) The scope of the variable '_la' can be reduced.
102.570941 (102.495632) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:6045]: (style) The scope of the variable '_la' can be reduced.
102.570946 (102.495637) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:6142]: (style) The scope of the variable '_la' can be reduced.
102.570950 (102.495641) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:6347]: (style) The scope of the variable '_la' can be reduced.
102.570955 (102.495646) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:6523]: (style) The scope of the variable '_la' can be reduced.
102.570959 (102.495650) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:6614]: (style) The scope of the variable '_la' can be reduced.
102.570964 (102.495655) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:6899]: (style) The scope of the variable '_la' can be reduced.
102.570968 (102.495659) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:6992]: (style) The scope of the variable '_la' can be reduced.
102.570973 (102.495664) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:7147]: (style) The scope of the variable '_la' can be reduced.
102.570977 (102.495668) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:7680]: (style) The scope of the variable '_la' can be reduced.
102.570982 (102.495673) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:7759]: (style) The scope of the variable '_la' can be reduced.
102.570986 (102.495677) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:7938]: (style) The scope of the variable '_la' can be reduced.
102.570991 (102.495682) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:8126]: (style) The scope of the variable '_la' can be reduced.
102.570995 (102.495686) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:8220]: (style) The scope of the variable '_la' can be reduced.
102.571000 (102.495691) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:8313]: (style) The scope of the variable '_la' can be reduced.
102.571004 (102.495695) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:8491]: (style) The scope of the variable '_la' can be reduced.
102.571009 (102.495700) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:8703]: (style) The scope of the variable '_la' can be reduced.
102.571013 (102.495704) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:8783]: (style) The scope of the variable '_la' can be reduced.
102.571018 (102.495709) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:8914]: (style) The scope of the variable '_la' can be reduced.
102.571022 (102.495713) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:9119]: (style) The scope of the variable '_la' can be reduced.
102.571027 (102.495718) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:9220]: (style) The scope of the variable '_la' can be reduced.
102.571034 (102.495725) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:9414]: (style) The scope of the variable '_la' can be reduced.
102.571039 (102.495730) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:9660]: (style) The scope of the variable '_la' can be reduced.
102.571043 (102.495734) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:10008]: (style) The scope of the variable '_la' can be reduced.
102.571048 (102.495739) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:10158]: (style) The scope of the variable '_la' can be reduced.
102.571052 (102.495743) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:10250]: (style) The scope of the variable '_la' can be reduced.
102.571057 (102.495748) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:10370]: (style) The scope of the variable '_la' can be reduced.
102.571061 (102.495752) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:10637]: (style) The scope of the variable '_la' can be reduced.
102.571065 (102.495756) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:10749]: (style) The scope of the variable '_la' can be reduced.
102.571070 (102.495761) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:10815]: (style) The scope of the variable '_la' can be reduced.
102.571075 (102.495766) E[1]: [src/query/frontend/opencypher/generated/CypherParser.cpp:10881]: (style) The scope of the variable '_la' can be reduced.
We know that we represented it correctly in memory:
102.571079 (102.495770) E[1]: [src/communication/bolt/v1/decoder/decoder.hpp:252]: (portability) Casting between integer* and double* which have an incompatible binary data representation.
Cont assigned but not used after:
102.571101 (102.495792) E[1]: [src/query/frontend/ast/ast.hpp:1008]: (style) Variable 'cont' is assigned a value that is never used.
Reviewers: teon.banek, buda, mferencevic
Reviewed By: teon.banek
Subscribers: mferencevic, pullbot
Differential Revision: https://phabricator.memgraph.io/D967
Summary:
Tests have been updated to catch this error and other behaviour. Other
than this change, `AND` should behave as before.
Reviewers: florijan, mislav.bradac
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D970
Summary: TypedValue assignment operator=(TypedValue &other) is inefficient for types which already have a defined assignment operator, or can be assigned trivially, without using TypedValue to pass the value for assignment.
Reviewers: mislav.bradac, florijan, teon.banek
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D934
Summary:
Remove name from GraphDb.
Take GraphDb in query test macros instead of accessor.
Add is_accepting_transactions flag to GraphDb.
Reviewers: mislav.bradac, florijan, mferencevic
Reviewed By: mislav.bradac
Subscribers: mferencevic, pullbot
Differential Revision: https://phabricator.memgraph.io/D940
Summary:
Move QueryParts and Filters to a new file.
Reorganize FilterInfo struct.
Remove label filter if we do indexed scan by label.
Remove property filter used in indexed scan.
Reviewers: florijan
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D915
Summary: This change increases the planning time, but should reduce memory consumption.
Reviewers: florijan, mislav.bradac
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D901
Summary: This is not a very important functionality, but it turned out simple to do, so let's add it to have a consistent query support.
Reviewers: buda, teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D862
Summary:
- Removed BreadthFirstAtom, using EdgeAtom only with a Type enum.
- Both variable expansions (breadth and depth first) now have mandatory inner node and edge Identifiers.
- Both variable expansions use inline property filtering and support inline lambdas.
- BFS and variable expansion now have the same planning process.
- Planner modified in the following ways:
- Variable expansions support inline property filtering (two filters added to all_filters, one for inline, one for post-expand).
- Asserting against existing_edge since we don't support that anymore.
- Edge and node symbols bound after variable expansion to disallow post-expand filters to get inlined.
- Some things simplified due to different handling.
- BreadthFirstExpand logical operator merged into ExpandVariable. Two Cursor classes remain and are dynamically chosen from.
As part of planned planner refactor we should ensure that a filter is applied only once. The current implementation is very suboptimal for property filtering in variable expansions.
@buda: we will start refactoring this these days. This current planner logic is too dense and complex. It is becoming technical debt. Most of the time I spent working on this has been spent figuring the planning out, and I still needed Teon's help at times. Implementing the correct and optimal version of query execution (avoiding multiple potentially expensive filterings) was out of reach also due to tech debt.
Reviewers: buda, teon.banek
Reviewed By: teon.banek
Subscribers: pullbot, buda
Differential Revision: https://phabricator.memgraph.io/D852
Summary:
- The new BFS syntax implemented as proposed.
- AST BreadthFirstAtom now uses EdgeAtom members: has_range_{true}, upper_bound_, lower_bound_
- Edges data structure now handles all the edge filtering (single or multiple edges), to ease planning. Additional edge filtering (additional Filter op in the plan) is removed. AST EdgeTypeTest is no longer used and is removed.
Current state is stable but there are things left to do:
- BFS property filtering.
- BFS lower_bound_ support.
- Support for lambdas in variable length expansion. This includes obligatory (even if not user_defined) inner_node and inner_edge symbols for easier handling.
- Code-sharing between BFS and variable length expansions.
I'll add asana tasks (and probably start working on them immediately) when/if this lands.
Reviewers: buda, teon.banek, mislav.bradac
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D836
Summary:
AST caching should be well tested by now.
We should consider removing `Context.is_query_cached_` member as well as the
implementation and tests for `CypherMainVisitor`.
Reviewers: mislav.bradac, mferencevic, buda
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D833
Summary: - modified all utils/algorithm functions to be inline and in the utils namespace
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D830
Summary:
I started with cleaning flags up (removing unused ones, documenting undocumented ones). There were some flags to remove in `QueryEngine`. Seeing how we never use hardcoded queries (AFAIK last Mislav's testing also indicated they aren't faster then interpretation), when removing those unused flags the `QueryEngine` becomes obsolete. That means that a bunch of other stuff becomes obsolete, along with the hardcoded queries. So I removed it all (this has been discussed and approved on the daily).
Some flags that were previously undocumented in `docs/user_technical/installation` are now documented. The following flags are NOT documented and in my opinion should not be displayed when starting `./memgraph --help` (@mferencevic):
```
query_vertex_count_to_expand_existsing (from rule_based_planner.cpp)
query_max_plans (rule_based_planner.cpp)
```
If you think that another organization is needed w.r.t. flag visibility, comment.
@teon.banek: I had to remove some stuff from CMakeLists to make it buildable. Please review what I removed and clean up if necessary if/when this lands. If the needed changes are minor, you can also comment.
Reviewers: buda, mislav.bradac, teon.banek, mferencevic
Reviewed By: buda, mislav.bradac
Subscribers: pullbot, mferencevic, teon.banek
Differential Revision: https://phabricator.memgraph.io/D825
Summary:
Added path encoding and decoding in our Bolt layer. Returning paths to the Neo client seems to work fine.
Not sure how to test that the decoded works fine (our client?).
Tests are not written, that suite is complicated. Leaving that to mferencevic.
Also reduced DecodedValue a bit with macros. Less code, less error prone.
Reviewers: buda, florijan
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D810
Summary:
Test planner splits MATCH ... WHERE
Remove distinction between FilterAnd and AndOperator
Reviewers: florijan, mislav.bradac
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D814
Summary:
This fixes a bug, where streaming would try to get the name of the symbol from
an invalid token position. For example, `MATCH (n) WITH n RETURN *`
Reviewers: florijan, mislav.bradac
Reviewed By: florijan, mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D811
Summary: Removed the traversal API, been waiting for named path to land because that data structure has replaced traversal's Path. I left the wiki page of the API, just put a warning that it's not used.
Reviewers: buda, teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D809
Summary:
Three TODOs resolved.
1. around line 897 - we currently don't support expansion into existing variable length edges (there is a TODO in symbol_generator.cpp:213), so this should not be done at the moment.
2. around line 1025 - This TODO was on review and nobody commented, so I'm removing it. Should have done that when the diff landed.
3. around line 1560 - This does not seem possible. Edge-uniqueness checks happen within a single `[OPTIONAL ] MATCH`. If it is OPTIONAL (the case interesting here), then the uniqueness check also gets planned under the optional branch. So, if an optional fails, the uniqueness check will get skipped, as opposed to getting executed over a Null. I added an edge-case test to verify this (and checked with the planner test).
Reviewers: buda, teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D807
Summary:
- Keys() functions in the indices can't be const because ConcurrentMap doesn't provide const accessors (and they are broken in skiplist) :D
- no cucumber tests because many tests create indices so it's hard to say what's inside and what not
Reviewers: buda, mislav.bradac
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D797
Summary:
This is done when the generated AST will be cached.
Remove LiteralsPlugger.
Reviewers: florijan, mislav.bradac
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D788
Summary:
Replaced std::list with std::vector in all plan operators. Performance increase in harness tests is not visible. Defined a custom test:
```
unwind range(0, 1000000) as x
create ({a: tointeger(rand() * 100), b: tointeger(rand() * 100), c: tointeger(rand() * 100), d: tointeger(rand() * 10), e: tointeger(rand() * 10), f: tointeger(rand() * 10)});
match (n) return min(n.a), max(n.b), sum(n.c), n.d, n.e, n.f
match (n) with distinct n.a AS a, n.b AS b, n.c AS c, n.d AS d, n.e AS e, n.f AS f return count(*)
```
In that test performance gains are 9.8% on the aggregation query (mean 0.83s vs 092s) and 34% (mean 2.15s vs 3.25s) on the distinct query. Doubt we'll see much on any of the LDBC tests because they don't stress those operators nearly as much.
Reviewers: buda, teon.banek, mislav.bradac
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D778
Summary:
Antlr grammar has been updated to support putting edge types after the BFS
symbol. Planner collects edge type filters for BFS and inlines them in the
operator by joining the filter with the user input BFS filter itself. This
requires no change from the standpoint of the operator. On the other hand, in
order to use the faster lookup by a single edge type, `ExpandBreadthFirst`
operator now accept an optional edge type. The edge type is passed from the
planner only if the user is filtering by a single type.
Unit tests as well as tck have been updated.
Reviewers: florijan, mislav.bradac
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D777
Summary: Reduces latency on LDBC query 9 from 7.9sec to 6.8sec (14%). That query has 650k rows in ORDER BY, 3 ordering elements and 10ish values get returned (both of them are now accumulated into vectors).
Reviewers: buda, mislav.bradac
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D775
Summary:
Add function First to utils.
Insert EdgeType into Expand during planning.
Reviewers: florijan, mislav.bradac
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D769
Summary: The constness of the DbAccessor interferes with caching the results.
Reviewers: florijan, mislav.bradac
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D771
Summary:
Benchmark planning and estimating indexed ScanAll. According to the benchmark,
caching speeds up the whole process of planning and estimation by a factor of
2. Most of the performance gain is in the `CostEstimator` itself, due to plenty
of calls to `VerticesCount` when estimating all of the generated plans.
Reviewers: mislav.bradac, florijan
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D765
Summary:
- The `Edges` data structure now handles common ops, including providing an iterator over edges whose "other" vertex is know.
- This should improve performance on dense_expand tests in the harness without other side-effects.
- query::plan::Expand operator modified not to check for existing-node stuff since that now gets handled by the `Edges` data structure.
- `Edges::Iterator` implemented only for const iterators since that suffices for now. Can implement non-const if the need arrises.
Reviewers: buda, mislav.bradac, teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D763
Summary: `assert` function added. Useful to us for asserting DB state in harness tests. Potentially useful to the client for breaking out of a query as soon as a predicate fails, as opposed to collecting result and checking them client-side.
Reviewers: buda, mislav.bradac, teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D750
Summary:
Moved Neo4j config to config dir.
Neo4j and PostgreSQL are now downloaded to libs.
Renamed metadata flags in memgraph.
Changed apollo generate for new harness.
Reviewers: mislav.bradac
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D741
Summary: Some TypedValue arithmetic ops threw exceptions for the wrong reasons, the op applicability type checks were wrong. Also the tests for that behavior were wrong. These are the fixes.
Reviewers: mislav.bradac, buda
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D724
Summary:
This allows for inserting dummy DbAccessor in tests. Unfortunate side
effect of this change is that the whole implementation had to be moved
from cpp to hpp.
Also templatize remaining RuleBasedPlanner implementation
Reviewers: florijan, mislav.bradac
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D704
Summary:
Added:
- map support in PropertyValue
- conversion of map TypedValue to PropertyValue if appropriate flag is set (undocumented because it's private)
- ordering of map PropertyValue in LabelPropertyIndex
- issue raised regarding list and value property modifications in storage (currently unsupported)
Maybe I missed some feature or whatever?
Reviewers: mislav.bradac, buda, teon.banek
Reviewed By: mislav.bradac, buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D692
Summary:
This change should allow for passing a different PlanningContext and/or
GraphDbAccessor. In turn, we can write tests which pass a dummy context
for decoupled testing of the planning process (from the rest of the
system).
Reviewers: mislav.bradac
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D700
Summary:
This is needed in cases when the planner decides to start expanding from
the other end.
Reviewers: mislav.bradac, florijan
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D681
Summary:
Matching variants are generated iteratively and limited. Additionally,
When generating expansions, if there is nothing to continue on the
expanded nodes, simply append all the remaining expansions to the
currently generated one. This should speed up queries with large number
of `MATCH` clauses.
Reviewers: mislav.bradac
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D612
Summary: Not strictly neccessary, but it's been itching me. It took an hour.
Reviewers: buda, mislav.bradac
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D648
Summary:
- added only one function for getting the total (in + out) vertex degree, it's required for the Ravelin use-case
- specific `degree_in` and `degree_out` functions can be added as necessary
- also fixed random_graph_generator bug (needed it for testing)
Reviewers: buda, mislav.bradac
Reviewed By: buda, mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D652
Summary:
Now all ScanAll and Expand ops are covered by the cost estimator. For ScanAll with indices cost estimation is pretty good, for new Expand ops it is tragically bad (Expand to the power of expansion depth, plus arbitrary filtering). Static cost estimation is wrong wrong wrong.
Currently cost estimation of even trivial plans that use indices is wrong because the planner leaves filtering expressions that are implicitly handled by the index in the operator tree, IIRC. Tasking Teon to revise this, even though I'm not sure how bad an influence this has on cost estimation and it's use in plan choosing.
Reviewers: mislav.bradac, teon.banek, buda
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D633
Summary:
This diff contains a bug fix for the expansion operators that are currently on dev.
More importantly, it proposes end-to-end testing for edge-cases for which it's a
pain to write single-phase tests. In my opinion this is OK, you're all reviewers so
you can comment.
The test relies on left-to-right query execution. We need this guarantee in tests
like this. I propose renaming "RuleBasedPlanner" to "LeftToRightPlanner" to make
this explicit. As Teon is not here at the moment, will make this a task/discussion.
Reviewers: buda, mislav.bradac, teon.banek, lion
Reviewed By: mislav.bradac
Subscribers: mferencevic, pullbot
Differential Revision: https://phabricator.memgraph.io/D626
Summary:
Not complete (but review can start):
- implementation should be done
- still need to finish tests
- documentation missing
Reviewers: mislav.bradac, teon.banek, buda
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D616
Summary:
Implement trie and use it in stripper
Make it nicer
Reviewers: buda, florijan
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D614
Summary:
`UNWIND` can come before `MATCH`, so it needs to break query parts. If
it didn't, a query part would incorrectly grab all the matches and plan
them incorrectly. A test for such a case has been added.
Reviewers: florijan
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D598
Summary:
Add All expression to Ast
Evaluate All expression
Visit All and generate symbols
Handle All when collecting context during planning
Reviewers: florijan, mislav.bradac, buda
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D587
Summary:
Add EdgeList symbol type and check for redeclaration
The result of variable path is a list of edges, so the symbol type has
been added. In the future, we need to extend the type checker and the
type structure to have a generic list type.
We also currently do not support reusing an already bound symbol for a
variable path, so the SymbolGenerator will raise a redeclaration error.
Reviewers: florijan, mislav.bradac
Reviewed By: mislav.bradac
Subscribers: pullbot, lion
Differential Revision: https://phabricator.memgraph.io/D574
Summary:
Variable expansion logical operator added. Some functionalities are missing:
- taking into account optional matching when expanding into existing symbol
- accepting Expression bounds (current implementation takes size_t)
Also, a TODO is added for handling optional matching in the uniqueness operator (with an Asana task)
All this will be done in the following diff, this is already substantial.
Also, please consider if we want to have all those `VLOG`s in the code. Not very pretty. And I think that `VLOG` is not compiled-away in release build, will put an asana task.
Reviewers: teon.banek, mislav.bradac, buda
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D569
Summary:
Allow expressions for variable length path bounds
Replace test which expected a syntax exception
Since we now allow variable length to have an arbitrary expression, the
test case is obsolete. It was replaced with something that excepts an
expression which wasn't allowed before.
Reviewers: florijan, mislav.bradac, buda
Reviewed By: mislav.bradac
Subscribers: lion, pullbot
Differential Revision: https://phabricator.memgraph.io/D568
Summary:
- GraphDbAccessor - index range API added
- index api tests refactored
- skiplist minor cleanup.
Reviewers: teon.banek, buda, mislav.bradac
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D533