Summary:
Store accumulated results as `communication::bolt::Value`s instead of
`TypedValue`s.
Add additional overloads for `Result` and `Summary` which accept `TypedValue`s
but internally perform conversions.
Reviewers: teon.banek, mferencevic
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2514
Summary:
Depends on D2471
- Add pointer to storage to `InterpreterContext`
- Rename `operator()` to `Prepare`
- Use `Interpret` instead of `operator()` (`Interpret` will be removed soon)
- Remove the `in_explicit_transaction` parameter
- Remove the memory resource parameter from `Interpret`
- Remove the storage accessor parameter from `Interpret`
- Fix up tests (remove the `Interpreter` from `database_transaction_timeout`)
Reviewers: teon.banek, mferencevic
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2482
Summary: Make `InterpreterContext` a top level instead of a nested struct
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2512
Summary:
The new distributed directory is inside the query, and mirrors the query
structure. This groups all of the distributed (query) source code
together, which should make the potential directory extraction easier.
Reviewers: mferencevic, llugovic, mtomic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1923
Summary: Add flag to disable printing of records.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1898
Summary:
RuleBasedPlanner now generates only the regular ScanAll operations, and
Filter operations are appended as soon as possible. The newly added
Rewrite step, takes this operator tree and replaces viable Filter &
ScanAll operators with appropriate ScanAllBy<Index> operator. This
change ought to simplify the behaviour of DistributedPlanner when that
stage is moved before the indexed lookup rewrite.
Showing unoptimized plan in interactive planner is also supported.
Reviewers: mtomic, llugovic
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1839
Summary:
`ReplicationLog` had a classic off-by-one bug. The `valid_prefix`
variable wasn't set properly.
This diff also includes a poor man's version of a HA client. This client
assumes that all the HA instances run on a single machine and that the
corresponding Bold endpoints have open ports ranging from `7687` to
`7687 + num_machines - 1`.
This should make it easeir to test certain things, ie. disk usage, P25.
This test revealed the bug with `ReplicationLog`
Reviewers: ipaljak
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1813
Summary:
Rename Context to ExecutionContext and make it struct
Move ParsingContext to cypher_main_visitor.hpp
Reviewers: mtomic, llugovic
Reviewed By: llugovic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1810
Summary:
Identifiers would be printed correctly if they were printed as part of a larger
`Expression` (so the visitor did the printing). However, if `PrintObject` was
called with an `Identifier *` then it would delegate to the template overload
and just print the pointer.
Reviewers: mtomic, teon.banek
Reviewed By: mtomic
Subscribers: mferencevic, pullbot
Differential Revision: https://phabricator.memgraph.io/D1802
Summary:
This diff removes the need for a database when parsing a query and
creating an Ast. Instead of storing storage::{Label,Property,EdgeType}
in Ast nodes, we store the name and an index into all of the names. This
allows for easy creation of a map from {Label,Property,EdgeType} index
into the concrete storage type. Obviously, this comes with a performance
penalty during execution, but it should be minor. The upside is that the
query/frontend minimally depends on storage (PropertyValue), which makes
writing tests easier as well as running them a lot faster (there is no
database setup). This is most noticeable in the ast_serialization test
which took a long time due to start up of a distributed database.
Reviewers: mtomic, llugovic
Reviewed By: mtomic
Subscribers: mferencevic, pullbot
Differential Revision: https://phabricator.memgraph.io/D1774
Summary:
This change makes HierarchicalTreeVisitor visit only Cypher related AST
nodes. QueryVisitor can be used to differentiate between various query
types we have. The next step is to either rename HierarchicalTreeVisitor
to something like CypherQueryVisitor, or perhaps extract Clause visiting
from it.
Reviewers: mtomic, llugovic
Reviewed By: llugovic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1710
Summary:
The single-binary distributed tests didn't wait for the whole cluster to be up
and running before they started running their tests. Now all tests ensure that
all workers are registered to the master and all other workers before starting
any tests.
Reviewers: mculinovic
Reviewed By: mculinovic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1724
Summary:
`Query` is now an abstract class which has `CypherQuery`,
`ExplainQuery`, `IndexQuery`, `AuthQuery` and `StreamQuery` as derived
classes. Only `CypherQuery` is forwarded to planner and the rest of the
queries are handled directly in the interpreter. This enabled us to
remove auth, explain and stream operators, clean up `Context` class and
remove coupling between `Results` class and plan cache. This should make
it easier to add similar functionality because no logical operator
boilerplate is needed. It should also be easier to separate community
and enterprise features for open source.
Remove Explain logical operator
Separate IndexQuery in AST
Handle index creation in interpreter
Remove CreateIndex operator and ast nodes
Remove plan cache reference from Results
Move auth queries out of operator tree
Remove auth from context
Fix tests, separate stream queries
Remove in_explicit_transaction and streams from context
Reviewers: teon.banek, mferencevic, msantl
Reviewed By: teon.banek, mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1664
Summary: Up till now, `AstStorage` also took care of tracking the root of the `Query` and loading of cloning of `Query` nodes would change that root. This felt out of place because sometimes `AstStorage` is used only for storing expressions, and we don't even have an entire query in the storage. This diff removes that feature from `AstStorage`. Now its only functionality is owning AST nodes and assigning unique IDs to them.
Reviewers: teon.banek, llugovic
Reviewed By: teon.banek
Subscribers: mferencevic, pullbot
Differential Revision: https://phabricator.memgraph.io/D1646
Summary:
Start removing `is_remote` from `Address`
Remove `GlobalAddress`
Remove `GlobalizedAddress`
Remove bitmasks from `Address`
Remove `is_local` from `Address`
Remove `is_local` from `RecordAccessor`
Remove `worker_id` from `Address`
Remove `worker_id` from `GidGenerator`
Unfriend `IndexRpcServer` from `Storage`
Remove `LocalizedAddressIfPossible`
Make member private
Remove `worker_id` from `Storage`
Copy function to ease removal of distributed logic
Remove `worker_id` from `WriteAheadLog`
Remove `worker_id` from `GraphDb`
Remove `worker_id` from durability
Remove nonexistant function
Remove `gid` from `Address`
Remove usage of `Address`
Remove `Address`
Remove `VertexAddress` and `EdgeAddress`
Fix Id test
Remove `cypher_id` from `VersionList`
Remove `cypher_id` from durability
Remove `cypher_id` member from `VersionList`
Remove `cypher_id` from database
Fix recovery (revert D1142)
Remove unnecessary functions from `GraphDbAccessor`
Revert `InsertEdge` implementation to the way it was in/before D1142
Remove leftover `VertexAddress` from `Edge`
Remove `PostCreateIndex` and `PopulateIndexFromBuildIndex`
Split durability paths into single node and distributed
Fix `TransactionIdFromWalFilename` implementation
Fix tests
Remove `cypher_id` from `snapshooter` and `durability` test
Reviewers: msantl, teon.banek
Reviewed By: msantl
Subscribers: msantl, pullbot
Differential Revision: https://phabricator.memgraph.io/D1647
Summary:
To clean the working directory after this diff you should execute:
```
rm src/database/counters_rpc_messages.capnp
rm src/database/counters_rpc_messages.hpp
rm src/database/serialization.capnp
rm src/database/serialization.hpp
```
Reviewers: teon.banek, msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1636
Summary:
This diff splits single node and distributed storage from each other.
Currently all of the storage code is copied into two directories (one single
node, one distributed). The logic used in the storage implementation isn't
touched, it will be refactored in following diffs.
To clean the working directory after this diff you should execute:
```
rm database/state_delta.capnp
rm database/state_delta.hpp
rm storage/concurrent_id_mapper_rpc_messages.capnp
rm storage/concurrent_id_mapper_rpc_messages.hpp
```
Reviewers: teon.banek, buda, msantl
Reviewed By: teon.banek, msantl
Subscribers: teon.banek, pullbot
Differential Revision: https://phabricator.memgraph.io/D1625
Summary:
This diff changes the RPC layer to directly return `TResponse` to the user when
issuing a `Call<...>` RPC call. The call throws an exception on failure
(instead of the previous return `nullopt`).
All servers (network, RPC and distributed) are set to have explicit `Shutdown`
methods so that a controlled shutdown can always be performed. The object
destructors now have `CHECK`s to enforce that the `AwaitShutdown` methods were
called.
The distributed memgraph is changed that none of the binaries (master/workers)
crash when there is a communication failure. Instead, the whole cluster starts
a graceful shutdown when a persistent communication error is detected.
Transient errors are allowed during execution. The transaction that errored out
will be aborted on the whole cluster. The cluster state is managed using a new
Heartbeat RPC call.
Reviewers: buda, teon.banek, msantl
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1604
Summary:
Knowledge of distributed operators is now removed from PlanPrinter
class. The DistributedOperatorVisitor and PlanPrinter are modified so
that they may be multiple inherited. This is done using virtual
inheritance of HierarchicalLogicalOperatorVisitor. Multiple inheritance
is used to derived a DistributedPlanPrinter which knows how to print
distributed in operators. This removes the dependency of single node
pretty printing on distributed.
Reviewers: msantl, mtomic
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1606
Summary:
In a bunch of places `TypedValue` was used where `PropertyValue` should be. A lot of times it was only because `TypedValue` serialization code could be reused for `PropertyValue`, only without providing callbacks for `VERTEX`, `EDGE` and `PATH`. So first I wrote separate serialization code for `PropertyValue` and put it into storage folder. Then I fixed all the places where `TypedValue` was incorrectly used instead of `PropertyValue`. I also disabled implicit `TypedValue` to `PropertyValue` conversion in hopes of preventing misuse in the future.
After that, I wrote code for `VertexAccessor` and `EdgeAccessor` serialization and put it into `storage` folder because it was almost duplicated in distributed BFS and pull produce RPC messages. On the sender side, some subset of records (old or new or both) is serialized, and on the reciever side, records are deserialized and immediately put into transaction cache.
Then I rewrote the `TypedValue` serialization functions (`SaveCapnpTypedValue` and `LoadCapnpTypedValue`) to not take callbacks for `VERTEX`, `EDGE` and `PATH`, but use accessor serialization functions instead. That means that any code that wants to use `TypedValue` serialization must hold a reference to `GraphDbAccessor` and `DataManager`, so that should make clients reconsider if they really want to use `TypedValue` instead of `PropertyValue`.
Reviewers: teon.banek, msantl
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1598
Summary:
This change improves detection of errorneous situations when starting a
distributed cluster on a single machine. It asserts that the user hasn't
started more memgraph nodes on the same machine with the same durability
directory. Also, this diff improves worker registration. Now workers don't have
to have explicitly set IP addresses. The master will deduce them from the
connecting IP when the worker registers.
Reviewers: teon.banek, buda, msantl
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1582
Summary: Recovery file is version inconsistent iff it starts with a correct magic number, has at least one integer written after that and that integer differs from kVersion.
Reviewers: mferencevic, ipaljak, vkasljevic, buda
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1565
Summary:
Added WAL and Snapshot explorer utility executables that read a wal or a
snapshot file and print the content of it, but the main purpose is that they
check it.
This is useful when debugging durability and want to know where it gets stuck.
Reviewers: dgleich, buda
Reviewed By: buda
Subscribers: ipaljak, vkasljevic, teon.banek, pullbot
Differential Revision: https://phabricator.memgraph.io/D1422
Summary:
The raft implementation has been stale for a while now. It doesn't
compile, nor uses Cap'n Proto for serialization. In the future we would
probably rewrite it, so it doesn't need to be part of the repo at this
moment.
Reviewers: mferencevic, mtomic, buda
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1513
Summary: first step for a cleaner query front end that might be easier to open source
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1511
Summary:
GraphDbAccessor is now constructed only through GraphDb. This allows the
concrete GraphDb to instantiate a concrete GraphDbAccessor. This allows
us to use virtual calls, so that the implementation may be kept
separate. The major downside of doing things this way is heap allocation
of GraphDbAccessor. In case it turns out to be a real performance
issues, another solution with pointer to static implementation may be
used.
InsertVertexIntoRemote is now a non-member function, which reduces
coupling. It made no sense for it to be member function because it used
only the public parts of GraphDbAccessor.
Reviewers: msantl, mtomic, mferencevic
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1504
Summary:
This change, hopefully, simplifies the implementation of different kinds
of GraphDb. The pimpl idiom is now simplified by removing all of the
crazy inheritance. Implementations classes are just plain data stores,
without any methods. The interface classes now have a more flat
hierarchy:
```
GraphDb (pure interface)
|
+----+---------- DistributedGraphDb (pure interface)
| |
Single Node +-----+------+
| |
Master Worker
```
DistributedGraphDb is used as an intermediate interface for all the
things that should work only in distributed. Therefore, virtual calls
for distributed stuff have been removed from GraphDb. Some are exposed
via DistributedGraphDb, other's are only in concrete Master and Worker
classes. The code which relied on those virtual calls has been
refactored to either use DistributedGraphDb, take a pointer to what is
actually needed or use dynamic_cast. Obviously, dynamic_cast is a
temporary solution and should be replaced with another mechanism (e.g.
virtual call, or some other function pointer style).
The cost of the above change is some code duplication in constructors
and destructors of classes. This duplication has a lot of little tweaks
that make it hard to generalize, not to mention that virtual calls do
not work in constructor and destructor. If we really care about
generalizing this, we should think about abandoning RAII in favor of
constructor + Init method.
The next steps for splitting the dependencies that seem logical are:
1) Split GraphDbAccessor implementation, either via inheritance or
passing in an implementation pointer. GraphDbAccessor should then
only be created by a virtual call on GraphDb.
2) Split Interpreter implementation. Besides allowing single node
interpreter to exist without depending on distributed, this will
enable the planner and operators to be correctly separated.
Reviewers: msantl, mferencevic, ipaljak
Reviewed By: msantl
Subscribers: dgleich, pullbot
Differential Revision: https://phabricator.memgraph.io/D1493
Summary:
During the creation of indexes there could be a case in which a vertex contains a label/property but is not a part of index after
index building completes.
This happens if vertices are being inserted while the index is being built.
Reviewers: buda, msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1484
Summary:
Session specifics have been move out of the Bolt `executing` state, and
are accessed via pure virtual Session type. Our server is templated on
the session and we are setting the concrete type, so there should be no
virtual call overhead. Abstract Session is used to indicate the
interface, this could have also been templated, but the explicit
interface definition makes it clearer.
Specific session implementation for running Memgraph is now implemented
in memgraph_bolt, which instantiates the concrete session type. This may
not be 100% appropriate place, but Memgraph specific session isn't
needed anywhere else.
Bolt/communication tests now use a dummy session and depend only on
communication, which significantly improves test run times.
All these changes make the communication a library which doesn't depend
on storage nor the database. Only shared connection points, which aren't
part of the base communication library are:
* glue/conversion -- which converts between storage and bolt types, and
* communication/result_stream_faker -- templated, but used in tests and query/repl
Depends on D1453
Reviewers: mferencevic, buda, mtomic, msantl
Reviewed By: mferencevic, mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1456
Summary:
This is the first step in cutting the crazy dependencies of
communication module to the whole database. Includes have been
reorganized and conversion between DecodedValue and other Memgraph types
(TypedValue and PropertyValue) has been extracted to a higher level
component called `communication/conversion`. Encoder, like Decoder, now
relies only on DecodedValue. Hopefully the conversion operations will
not significantly slow down streaming Bolt data.
Additionally, Bolt ID is now wrapped in a class. Our storage model uses
*unsigned* int64, while Bolt expects *signed* int64. The implicit
conversions may lead to encode/decode errors, so the wrapper should
enforce some type safety to prevent such errors.
Reviewers: mferencevic, buda, msantl, mtomic
Reviewed By: mferencevic, mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1453
Summary:
Added test stream functionality. When a stream is configured, it will try to
consume messages from a kafka topic and return them back to the user.
For now, the messages aren't transformed, so it just returns the payload string.
Depends on D1466
Next steps are persisting stream metadata and transforming messages in order to
store them in the graph.
Reviewers: teon.banek, mtomic
Reviewed By: teon.banek
Subscribers: pullbot, buda
Differential Revision: https://phabricator.memgraph.io/D1474
Summary:
Added basic functionality for kafka streams. The `CREATE STREAM` clause is a
simplified version from the one mentioned in D1415 so we can start testing
end-to-end sooner.
This diff also includes a bug fix in `lcp.list ` for operators that have no
members.
Reviewers: teon.banek, mtomic, buda
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1434
Summary:
This diff implements OpenSSL support in the network stack.
Currently SSL support is only enabled for Bolt connections,
support for RPC connections will be added in another diff.
Reviewers: buda, teon.banek
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1328
Summary:
Add telemetry to main memgraph binary
Add resource usage collector
Add telemetry flag
Change telemetry collector logic
Fix utils compilation
Add timestamp
Add first version of interactive test
Started working on test runner
Implement all tests
Flake8 on runner.py
Integrate test with Apollo
Add TODO
Reviewers: buda, teon.banek
Reviewed By: buda, teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1419
Summary:
Hopefully, the mechanism of generating Cartesian is general enough, so
this simple change should work correctly in all cases.
Planner tests have been modified to use a FakeDbAccessor in order to
speed them up and potentially allow extracting planning into a library.
Reviewers: msantl, mtomic, buda
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1431
Summary:
Clangs Address, Thread and Undefined Behaviour sanitizers are now available as build options.
Only one of those can be set ON in the same build.
Reviewers: teon.banek, buda
Reviewed By: buda
Subscribers: buda, pullbot
Differential Revision: https://phabricator.memgraph.io/D1380
Summary:
Converts the RPC stack to use Cap'n Proto for serialization instead of
boost. There are still some traces of boost in other places in the code,
but most of it is removed. A future diff should cleanup boost for good.
The RPC API is now changed to be more flexible with regards to how
serialize data. This makes the simplest cases a bit more verbose, but
allows complex serialization code to be correctly written instead of
relying on hacks. (For reference, look for the old serialization of
`PullRpc` which had a nasty pointer hacks to inject accessors in
`TypedValue`.)
Since RPC messages were uselessly modeled via inheritance of Message
base class, that class is now removed. Furthermore, that approach
doesn't really work with Cap'n Proto. Instead, each message type is
required to have some type information. This can be automated, so
`define-rpc` has been added to LCP, which hopefully simplifies defining
new RPC request and response messages.
Specify Cap'n Proto schema ID in cmake
This preserves Cap'n Proto generated typeIds across multiple generations
of capnp schemas through LCP. It is imperative that typeId stays the
same to ensure that different compilations of Memgraph may communicate
via RPC in a distributed cluster.
Use CLOS for meta information on C++ types in LCP
Since some structure slots and functions have started to repeat
themselves, it makes sense to model C++ meta information via Common Lisp
Object System.
Depends on D1391
Reviewers: buda, dgleich, mferencevic, mtomic, mculinovic, msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1407
Summary:
Utils source files are now moved to a standalone mg-utils library.
Unit and manual tests are no longer collected using glob recursion in
cmake, but are explicitly listed. This allows us to set only required
dependencies of those tests.
Both of these changes should improve compilation and link times, as well
as lower the disk usage.
Additional improvement would be to cleanup utils header files to be
split in .hpp and .cpp as well as merging threading into utils. Other
potential library extractions that shouldn't be difficult are:
* data_structures
* io/network
* communication
Reviewers: buda, mferencevic, dgleich, ipaljak, mculinovic, mtomic, msantl
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1408
Summary:
Snapshots should have the transaction from which they were created because we need this info for recovery later on.
Otherwise we wouldn't be able to tell the workers from which snapshots to recover. The whole cluster should be recovered
from the same transaction snapshot.
Reviewers: msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1338
Summary:
When commiting/aborting a transaction in tx master engine, make a two
phase commit to all workers so they can stop all futures and clear
transactional cache.
Reviewers: dgleich, florijan
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1320
Summary:
Ensures that query plans are invalidated on the remote only when it's
guaranteed they will never be used on the master. This is done by
invalidating remote caches in the `CachedPlan` destructor.
There are two unplesant side-effects. First, an RPC call is made in an
object destructor. This is somewhat ugly, but not that different then
making an RPC call that must succeed in any other function. Note that
this does NOT slow down any query execution because the relevant
destructor is called by the skiplist garbage collector. The second ugly
side-effect is that in the unit test now we need to sleep to ensure the
skiplist GC destructs a cached plan before checking that it's
invalidated on the remote worker.
We might want to redesign this at some point.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1302
Summary:
- Remove caches on workers as a result of plan expiration or race
during insertion.
- Extract caching functionality into a class.
- Minor refactor of Interpreter::operator()
- New RPC and test for it.
- Rename ConsumePlanRes to DispatchPlanRes for consistency, remove
return value as it's always true and never used.
- Interpreter is now constructed with a `GraphDb` reference. At the
moment only for reaching the `distributed::PlanDispatcher`, but in
the future we should probably use that primarily for planning.
I added a function to `PlanConsumer` that is only used for testing.
I prefer not doing this, but I felt this needed testing. I can remove
it now if you like.
Reviewers: teon.banek, msantl
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1292
Summary: this will make delete query more efficient
Reviewers: mferencevic, mculinovic
Reviewed By: mferencevic
Differential Revision: https://phabricator.memgraph.io/D1282
Summary: Property connected_frauds added and set to zero for strata scenario.
Reviewers: mtomic
Reviewed By: mtomic
Differential Revision: https://phabricator.memgraph.io/D1275