Summary:
Most instances of `@throw std::bad_alloc` are left unexplained as these
functions perform general heap allocations are it's obvious from the
function name that it will do so. Basically anything with `Create`, `Make` or
`Build` implies allocations. Additionally, which parts exactly perform
allocations are an implementation detail. Functions which do unexpected
heap allocations have the reason stated in the documentation, these
functions typically have exactly one spot which could raise such an
exception.
Some functions are marked as `noexcept`, these are usually "special
functions" such as constructors and operators. This could potentially
improve performance because STL may use API overloads that work faster
with `noexcept` stuff. Remaining non-throwing functions aren't marked as
`noexcept` as that wasn't our practice nor is common in our codebase. On
the other hand, if we continue enforcing the documentation of thrown
exceptions, perhaps we should start using `noexcept`.
Reviewers: mferencevic, ipaljak
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2350
Summary:
This makes Gid the same as the one in storage/v2. Before they can be
merge into one implementation, we probably want to have a similar
transition for remaining ID types.
Depends on D2346
Reviewers: mferencevic, ipaljak
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2347
Summary:
It never made sense that a global ID is its own namespace in the storage
directory tree.
Reviewers: mferencevic, ipaljak
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2346
Summary:
This effectively replaces the old PropertyValue implementation from the
one in storage/v2
Depends on D2333
Reviewers: mferencevic, ipaljak
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2335
Summary:
There's a possible race condition where we add a deleted vertex into
index and garbage collection removes it from main storage before indices are
cleaned-up.
Reviewers: mferencevic, teon.banek
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2314
Summary:
Utils now contain a BasicResult implementation which supports different
types of error. This will make it useful for other parts of the code as
well as during the transition from old to new storage.
Reviewers: mtomic, msantl, mferencevic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2291
Summary:
The documentation includes `std` exceptions like `std::bad_alloc` or
`std::system_error`, for which there's probably nothing we can do. This
may seem unnecessary, but it will be really helpful when writing the C
API for interfacing with custom modules and plugins, as well as when
switching to storage v2 API.
In general, we should start updating the documentation of functions
which may throw exceptions. This ought to be enforced in code review, so
that the implementation and documentation are kept in sync.
Reviewers: mferencevic, mtomic, msantl
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2288
Summary:
This ought to simplify the code which needs to work with any kind of
vertex iteration, be it through an index store or regular.
Reviewers: mtomic, mferencevic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2258
Summary: this will make GC for indices easier
Reviewers: teon.banek, mferencevic
Reviewed By: teon.banek, mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2223
Summary:
`std::unordered_map` is 56 bytes in size, `std::map` is 48 bytes in size.
Also, `std::map` doesn't require the key type to be hashable.
Reviewers: mtomic, teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2218
Summary:
The first 2 tuple elements are redundant as they are available through
EdgeAccessor and they needlessly complicate the usage of the API.
Reviewers: mtomic, mferencevic
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2200
Summary:
`lock_guard` is holding vertex lock while we're deleting the vertex,
and then it might try to unlock it in its destructor and access freed memory.
Reviewers: teon.banek, mferencevic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2192
Summary:
This change implements full edges support in storage v2. Edges can be created
and deleted. Support for detach-deleting vertices is added and regular vertex
deletion verifies existance of edges.
Reviewers: mtomic, teon.banek
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2180
Summary:
We forgot to update `modified_vertices` in the case when the vertex
has an empty version chain. It didn't manifest before because it was impossible
for a vertex to have an empty version chain without garbage collection.
Reviewers: mferencevic, teon.banek
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2154
Summary:
We forgot to add the newly created vertex into `modified_vertices`
list.
Reviewers: teon.banek, mferencevic
Reviewed By: mferencevic
Differential Revision: https://phabricator.memgraph.io/D2153
Summary:
Initial implementation of new storage engine. It implements snapshot isolation
for transactions. All changes in the database are stored as deltas instead of
making full copies. Currently, the storage supports full transaction
functionality (commit, abort, command advancement). Also, support has been
implemented only for vertices that have only labels.
Reviewers: teon.banek, mtomic
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2138
Summary:
HA should now support constraints in the same way the SM version does.
I only tested this thing manually, but I plan to add a new integration test for
this also.
Reviewers: ipaljak, vkasljevic, mferencevic
Reviewed By: ipaljak, mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2083
Summary:
Preparing unique constraint code to be implemented in HA. To do so, I'm
moving everything to `stograge/common/` folder. I also added a new header,
`gid.hpp` which does a `ifdef` include of the correct `gid.hpp` based on the
product.
Reviewers: ipaljak, mferencevic, vkasljevic, teon.banek
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2079
Summary:
There will be a lot of leftover files, execute the following commands inside
`src/` to remove them:
```
git clean -xf
rm -r rpc/ storage/single_node_ha/rpc/
```
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D2011
Summary:
Manually call `CompactRange` on KVStore (rocksDB) to free disk space
upon Raft log compaction.
Raft log takes surprisingly large amount of disk space (1MB for 1000 entries) so
we need to free disk space as we compact the log into snapshot.
When `log_size_snapshot_threshold` set to `5000` the disk usage of the
`durability_dir` goes from `4.8 MB` to `240 KB`.
Reviewers: ipaljak
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1939
Summary:
This is still in progress.
As reported in
https://app.asana.com/0/743890251333732/971034572323525/f
memgraph distributed fails when the snapshot that it tries to recover from is
invalid. Instead of inheritance of `StorageGc` this diff adds composition and we
don't try to register a rpc server after the initialization.
Reviewers: ipaljak, vkasljevic, mtomic
Reviewed By: vkasljevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1930
Summary:
Same as `UniqueLabelPropertyConstratint` except it works with multiple properties.
Because of that it is a bit more complex and slower.
Reviewers: msantl, ipaljak
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1926
Summary:
It seems that older clang versions erroneously accept modification of
constant pointers, this is a quick hack which resolves the issue on new
clang versions.
None of the const methods are really const and should not be marked as
such. This whole accessor thing is just a steaming pile of crap and
should be rewritten from scratch.
Reviewers: vkasljevic, msantl
Reviewed By: vkasljevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1922
Summary:
Case with existence constraints is about 8-9% slower then case without
existence constraints. Before this diff that difference was about 15-16%.
Reviewers: msantl, ipaljak
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1917
Summary:
UniqueLabelPropertyConstraint defines label + property restriction on vertices.
Label + Property + PropertyValue for that property must be unique at any given moment.
Reviewers: msantl, ipaljak
Reviewed By: msantl
Subscribers: mferencevic, pullbot
Differential Revision: https://phabricator.memgraph.io/D1884
Summary:
`EdgesIterable` is used for iterating over Edges in distributed memgraph. Because of lru cache there is a possibility of data getting evicted as someone iterates over it.
To prevent that `EdgesIterable` will lock that data and release it when it's deconstructed.
Reviewers: msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1868
Summary:
CachedDataLock is necessary for lru cache as remote data is no longer
persistent. Most methods internally handle this, but for methods that
return pointers or references to remote data, we need to manually
lock data.
Reviewers: msantl, ipaljak
Reviewed By: msantl
Subscribers: teon.banek, pullbot
Differential Revision: https://phabricator.memgraph.io/D1869
Summary:
Existence constraint ensures that all nodes with certain label have a certain property.
`ExistenceRule` defines label -> properties rule and `ExistenceConstraints` manages all
constraints.
Reviewers: msantl, ipaljak, teon.banek, mferencevic
Reviewed By: msantl, teon.banek, mferencevic
Subscribers: mferencevic, pullbot
Differential Revision: https://phabricator.memgraph.io/D1797
Summary:
RecordAccessor doesn't implement `operator<`, so it doesn't make sense
to have it inherit TotalOrdering.
Reviewers: mferencevic, msantl, vkasljevic
Reviewed By: vkasljevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1866
Summary:
`Edges` class didn't properly filter edges when given both edge type
and destination arguments, which causes weird bugs in query execution (wrong
edge getting matched in `Expand` with existing node flag set). This is probably
because we didn't support using both filters at the same time, but the API and
documentation for `VertexAccessor::in` and `VertexAccessor::out` functions
didn't reflect that.
Reviewers: teon.banek, msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1796
Summary:
This is just the first diff that tries to wire the raft protocol into
memgraph.
In this diff I'm introducing transaction engine reset functionality. I also
introduced `RaftInterface` which should be used wherever someone wants to access
Raft from Memgraph.
For design decisions see the feature spec.
Reviewers: ipaljak, teon.banek
Reviewed By: ipaljak
Subscribers: pullbot, teon.banek
Differential Revision: https://phabricator.memgraph.io/D1758
Summary:
GraphDb now has GetStorageStat method that returns live view to
object containing vertex_count, edge_count and avg_degree().
Stat is updated on each garbage collection run and represents number of
objects that are visible to any transaction.
Reviewers: teon.banek, msantl, ipaljak
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1731
Summary:
Since we need to send `StateDelta`s over the wire in HA, we need to be
able to serialize those bad boys.
This diff hopefully does this the right way.
Reviewers: teon.banek, mferencevic, ipaljak
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1725
Summary:
Initial for fork (in form of a c/p) form the current single node
version. Once we finish HA we plan to re-link the files to the single node
versions if they don't change.
Reviewers: ipaljak, buda, mferencevic
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1705
Summary:
Blocking transaction has the ability to stop the transaction engine from
starting new transactions (regular or blocking) and to wait all other active
transactions to finish (to become non active, committed or aborted). One thing
that blocking transactions support is defining the parent transaction which
does not need to end in order for the blocking one to start. This is because of
a use case where we start nested transactions.
One could thing we should build indexes inside those blocking transactions. This
is true and I wanted to implement this, but this would require some digging in
the interpreter which I didn't want to do in this change.
Reviewers: mferencevic, vkasljevic, teon.banek
Reviewed By: mferencevic, teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1695
Summary:
LabelPropertyIndex now has the ability to enforce unique constraint.
This doesn't lock the tx engine.
Reviewers: teon.banek, mferencevic
Reviewed By: teon.banek
Subscribers: pullbot, vkasljevic, buda
Differential Revision: https://phabricator.memgraph.io/D1660
Summary:
This diff fixes the storage part of the bug that happen for the
following queries if all nodes end up on the non-master machine.
```
CREATE (a:T{c: True})-[:X{x: 2.5}]->(:A:B)-[:Y]->()-[:Z{r: 1}]->(a)
MATCH (:T{c: True})-[a:X{x: 2.5}]->(node:A:B)-[:Y]->()-[:Z{r: 1}]->() RETURN a AS node, node AS a
```
The current state of the code doesn't work completely, it's still missing the
query execution fix.
Reviewers: teon.banek, ipaljak, vkasljevic
Reviewed By: teon.banek
Subscribers: mferencevic, pullbot
Differential Revision: https://phabricator.memgraph.io/D1629
Summary:
Start removing `is_remote` from `Address`
Remove `GlobalAddress`
Remove `GlobalizedAddress`
Remove bitmasks from `Address`
Remove `is_local` from `Address`
Remove `is_local` from `RecordAccessor`
Remove `worker_id` from `Address`
Remove `worker_id` from `GidGenerator`
Unfriend `IndexRpcServer` from `Storage`
Remove `LocalizedAddressIfPossible`
Make member private
Remove `worker_id` from `Storage`
Copy function to ease removal of distributed logic
Remove `worker_id` from `WriteAheadLog`
Remove `worker_id` from `GraphDb`
Remove `worker_id` from durability
Remove nonexistant function
Remove `gid` from `Address`
Remove usage of `Address`
Remove `Address`
Remove `VertexAddress` and `EdgeAddress`
Fix Id test
Remove `cypher_id` from `VersionList`
Remove `cypher_id` from durability
Remove `cypher_id` member from `VersionList`
Remove `cypher_id` from database
Fix recovery (revert D1142)
Remove unnecessary functions from `GraphDbAccessor`
Revert `InsertEdge` implementation to the way it was in/before D1142
Remove leftover `VertexAddress` from `Edge`
Remove `PostCreateIndex` and `PopulateIndexFromBuildIndex`
Split durability paths into single node and distributed
Fix `TransactionIdFromWalFilename` implementation
Fix tests
Remove `cypher_id` from `snapshooter` and `durability` test
Reviewers: msantl, teon.banek
Reviewed By: msantl
Subscribers: msantl, pullbot
Differential Revision: https://phabricator.memgraph.io/D1647
Summary:
`heaptrack` shows a miniscule decrease in memory usage during query
execution.
Running the below query on the TEDTalk database 100 times gives the
following results:
- number of allocations: from 642647 to 642589
- bytes allocated in total: from 48.79 MiB to 48.78 MiB
```
MATCH (t:Tag)<-[:HasTag]-(n:Talk)
RETURN t.name AS Tag, COUNT(n) AS TalksCount
ORDER BY TalksCount DESC, Tag LIMIT 20;
```
Regarding `TypedValue`'s destructor: we're allowed to manually destruct the
union memebers that we constructed using placement-new. However, it is undefined
behavior to call the destructor after an object's lifetime has ended. Calling
`TypedValue`'s own destructor within its assignment operator counts as ending
its lifetime, which means that the next call to its destructor will invoke
undefined behavior.
See the C++ Draft N4140, clauses 12.4/15 and 3.8/1.3.
Reviewers: teon.banek, mtomic
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1630
Summary:
To clean the working directory after this diff you should execute:
```
rm src/database/counters_rpc_messages.capnp
rm src/database/counters_rpc_messages.hpp
rm src/database/serialization.capnp
rm src/database/serialization.hpp
```
Reviewers: teon.banek, msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1636
Summary:
This diff splits single node and distributed storage from each other.
Currently all of the storage code is copied into two directories (one single
node, one distributed). The logic used in the storage implementation isn't
touched, it will be refactored in following diffs.
To clean the working directory after this diff you should execute:
```
rm database/state_delta.capnp
rm database/state_delta.hpp
rm storage/concurrent_id_mapper_rpc_messages.capnp
rm storage/concurrent_id_mapper_rpc_messages.hpp
```
Reviewers: teon.banek, buda, msantl
Reviewed By: teon.banek, msantl
Subscribers: teon.banek, pullbot
Differential Revision: https://phabricator.memgraph.io/D1625
Summary:
This should allow us to more easily decouple the code which should be
open sourced. Unfortunately, the downside of this approach is that we
cannot rely on virtual calls to dispatch the serialization to correct
type. Another downside is that members need to be publicly accessible
for serialization.
Reviewers: mtomic, msantl
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1596
Summary:
This diff changes the RPC layer to directly return `TResponse` to the user when
issuing a `Call<...>` RPC call. The call throws an exception on failure
(instead of the previous return `nullopt`).
All servers (network, RPC and distributed) are set to have explicit `Shutdown`
methods so that a controlled shutdown can always be performed. The object
destructors now have `CHECK`s to enforce that the `AwaitShutdown` methods were
called.
The distributed memgraph is changed that none of the binaries (master/workers)
crash when there is a communication failure. Instead, the whole cluster starts
a graceful shutdown when a persistent communication error is detected.
Transient errors are allowed during execution. The transaction that errored out
will be aborted on the whole cluster. The cluster state is managed using a new
Heartbeat RPC call.
Reviewers: buda, teon.banek, msantl
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1604
Summary:
Previous version of code was wrong in some cases. Example:
There's an edge e = (a)->(b). Node (a) belongs to worker 1, node (b) belongs to worker 0. Therefore, edge e belongs to worker 1.
When edge e was serialized on worker 0, address of node b was local and it would be globalized, but the worker id of edge (e) would be used, which is wrong.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1599
Summary:
In a bunch of places `TypedValue` was used where `PropertyValue` should be. A lot of times it was only because `TypedValue` serialization code could be reused for `PropertyValue`, only without providing callbacks for `VERTEX`, `EDGE` and `PATH`. So first I wrote separate serialization code for `PropertyValue` and put it into storage folder. Then I fixed all the places where `TypedValue` was incorrectly used instead of `PropertyValue`. I also disabled implicit `TypedValue` to `PropertyValue` conversion in hopes of preventing misuse in the future.
After that, I wrote code for `VertexAccessor` and `EdgeAccessor` serialization and put it into `storage` folder because it was almost duplicated in distributed BFS and pull produce RPC messages. On the sender side, some subset of records (old or new or both) is serialized, and on the reciever side, records are deserialized and immediately put into transaction cache.
Then I rewrote the `TypedValue` serialization functions (`SaveCapnpTypedValue` and `LoadCapnpTypedValue`) to not take callbacks for `VERTEX`, `EDGE` and `PATH`, but use accessor serialization functions instead. That means that any code that wants to use `TypedValue` serialization must hold a reference to `GraphDbAccessor` and `DataManager`, so that should make clients reconsider if they really want to use `TypedValue` instead of `PropertyValue`.
Reviewers: teon.banek, msantl
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1598
Summary:
Since RecordAccessor is often instantiated and copied, using a
factory-like function to allocate concrete types on the heap would be
too costly. The approach in this diff uses a Strategy pattern (see
"Design Patterns" by Gamma et al.), where the Strategy interface is
given as RecordAccessor::Impl. Concrete implementations are then created
for each GraphDb. This allows us to instantiate the concrete
RecordAccessors::Impl *once* and *share* it among all RecordAccessors.
Reviewers: msantl, vkasljevic
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1510
Summary:
Since we switched to Cap'n Proto serialization there's no need for
keeping boost around anymore.
Reviewers: mtomic, mferencevic, buda
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1515
Summary:
GraphDbAccessor is now constructed only through GraphDb. This allows the
concrete GraphDb to instantiate a concrete GraphDbAccessor. This allows
us to use virtual calls, so that the implementation may be kept
separate. The major downside of doing things this way is heap allocation
of GraphDbAccessor. In case it turns out to be a real performance
issues, another solution with pointer to static implementation may be
used.
InsertVertexIntoRemote is now a non-member function, which reduces
coupling. It made no sense for it to be member function because it used
only the public parts of GraphDbAccessor.
Reviewers: msantl, mtomic, mferencevic
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1504
Summary:
This change, hopefully, simplifies the implementation of different kinds
of GraphDb. The pimpl idiom is now simplified by removing all of the
crazy inheritance. Implementations classes are just plain data stores,
without any methods. The interface classes now have a more flat
hierarchy:
```
GraphDb (pure interface)
|
+----+---------- DistributedGraphDb (pure interface)
| |
Single Node +-----+------+
| |
Master Worker
```
DistributedGraphDb is used as an intermediate interface for all the
things that should work only in distributed. Therefore, virtual calls
for distributed stuff have been removed from GraphDb. Some are exposed
via DistributedGraphDb, other's are only in concrete Master and Worker
classes. The code which relied on those virtual calls has been
refactored to either use DistributedGraphDb, take a pointer to what is
actually needed or use dynamic_cast. Obviously, dynamic_cast is a
temporary solution and should be replaced with another mechanism (e.g.
virtual call, or some other function pointer style).
The cost of the above change is some code duplication in constructors
and destructors of classes. This duplication has a lot of little tweaks
that make it hard to generalize, not to mention that virtual calls do
not work in constructor and destructor. If we really care about
generalizing this, we should think about abandoning RAII in favor of
constructor + Init method.
The next steps for splitting the dependencies that seem logical are:
1) Split GraphDbAccessor implementation, either via inheritance or
passing in an implementation pointer. GraphDbAccessor should then
only be created by a virtual call on GraphDb.
2) Split Interpreter implementation. Besides allowing single node
interpreter to exist without depending on distributed, this will
enable the planner and operators to be correctly separated.
Reviewers: msantl, mferencevic, ipaljak
Reviewed By: msantl
Subscribers: dgleich, pullbot
Differential Revision: https://phabricator.memgraph.io/D1493
Summary:
Session specifics have been move out of the Bolt `executing` state, and
are accessed via pure virtual Session type. Our server is templated on
the session and we are setting the concrete type, so there should be no
virtual call overhead. Abstract Session is used to indicate the
interface, this could have also been templated, but the explicit
interface definition makes it clearer.
Specific session implementation for running Memgraph is now implemented
in memgraph_bolt, which instantiates the concrete session type. This may
not be 100% appropriate place, but Memgraph specific session isn't
needed anywhere else.
Bolt/communication tests now use a dummy session and depend only on
communication, which significantly improves test run times.
All these changes make the communication a library which doesn't depend
on storage nor the database. Only shared connection points, which aren't
part of the base communication library are:
* glue/conversion -- which converts between storage and bolt types, and
* communication/result_stream_faker -- templated, but used in tests and query/repl
Depends on D1453
Reviewers: mferencevic, buda, mtomic, msantl
Reviewed By: mferencevic, mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1456
Summary:
This is the first step in cutting the crazy dependencies of
communication module to the whole database. Includes have been
reorganized and conversion between DecodedValue and other Memgraph types
(TypedValue and PropertyValue) has been extracted to a higher level
component called `communication/conversion`. Encoder, like Decoder, now
relies only on DecodedValue. Hopefully the conversion operations will
not significantly slow down streaming Bolt data.
Additionally, Bolt ID is now wrapped in a class. Our storage model uses
*unsigned* int64, while Bolt expects *signed* int64. The implicit
conversions may lead to encode/decode errors, so the wrapper should
enforce some type safety to prevent such errors.
Reviewers: mferencevic, buda, msantl, mtomic
Reviewed By: mferencevic, mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1453
Summary:
Converts the RPC stack to use Cap'n Proto for serialization instead of
boost. There are still some traces of boost in other places in the code,
but most of it is removed. A future diff should cleanup boost for good.
The RPC API is now changed to be more flexible with regards to how
serialize data. This makes the simplest cases a bit more verbose, but
allows complex serialization code to be correctly written instead of
relying on hacks. (For reference, look for the old serialization of
`PullRpc` which had a nasty pointer hacks to inject accessors in
`TypedValue`.)
Since RPC messages were uselessly modeled via inheritance of Message
base class, that class is now removed. Furthermore, that approach
doesn't really work with Cap'n Proto. Instead, each message type is
required to have some type information. This can be automated, so
`define-rpc` has been added to LCP, which hopefully simplifies defining
new RPC request and response messages.
Specify Cap'n Proto schema ID in cmake
This preserves Cap'n Proto generated typeIds across multiple generations
of capnp schemas through LCP. It is imperative that typeId stays the
same to ensure that different compilations of Memgraph may communicate
via RPC in a distributed cluster.
Use CLOS for meta information on C++ types in LCP
Since some structure slots and functions have started to repeat
themselves, it makes sense to model C++ meta information via Common Lisp
Object System.
Depends on D1391
Reviewers: buda, dgleich, mferencevic, mtomic, mculinovic, msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1407
Summary:
Add additional structs and functions for handling C++ meta information.
Add capnp-file and capnp-id arguments to lcp:process-file.
Generate cpp along with hpp and capnp in lcp.
Wrap LogicalOperator base class in lcp:define-class.
Modify logical operators for capnp serialization.
Add query/common.capnp.
Reviewers: mculinovic, buda, mtomic, msantl, ipaljak, dgleich, mferencevic
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1391
Summary:
A simplified end-to-end implementation
POD interface set-up, still have bugs with HDDkeys
Version bug fix and first iterator implementation
Fixed out-of-scope reference in PVS iterator
Added PVS unit tests
Reviewers: buda, mferencevic, dgleich, teon.banek
Reviewed By: buda, dgleich
Subscribers: mferencevic, pullbot
Differential Revision: https://phabricator.memgraph.io/D1369
Summary: Global address of `from` was stored in Edge record when created via CreateEdge RPC.
Reviewers: buda, msantl, dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1387
Summary:
- Removed a lot of stuff that was incorrect and/or unnecessary
- Fixed const-correctness in the skiplist family
Reviewers: dgleich, teon.banek, buda
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1351
Summary: Do it so properties-on-disk can be implemented.
Reviewers: buda, dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1343
Summary:
Remove "produce_" and "Produce" as prefix from all distributed stuff.
It's not removed in src/query/ stuff (operators).
Reviewers: dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1315
Summary: This is correct, and uses less memory.
Reviewers: teon.banek, buda, mislav.bradac
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1304
Summary:
Also removed some convenience code that became obsolete.
No logic changes.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1303
Summary:
Instead of passing `coordination`, pass `rpc_worker_clients` that
holds a map of worker_id->clientPool. By having only one instance of
`RpcWorkerClients` that is owned by `GraphDB` and passing it by refference
we'll share the same client pools for rpc clients.
Reviewers: teon.banek, florijan, dgleich, mferencevic
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1261
Summary:
Previously, the RPC stack used the network stack only to receive messages. The
messages were then added to a separate queue that was processed by different
thread pools. This design was inefficient because there was a lock when
inserting and getting messages from the common queue.
This diff removes the need for separate thread pools by utilising the new
network stack design. This is possible because the new network stack allows
full processing of the network request without blocking the whole queue.
Reviewers: buda, florijan, teon.banek, dgleich, mislav.bradac
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1229
Summary:
Updating a record locally while there is an remote update waiting to be applied caused
the operation to return as already deleted, instead of applying it
Reviewers: florijan
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1226
Summary:
Implementation of remote vertex and edge creation. This diff addresses
the creation API (`GraphDbAccessor::InsertEdge`,
`GraphDbAccessor::InsertRemoteVertex`) and the necessary RPC and
`RemoteCache` stuff.
What is missing for full remote creation support are
`query::plan::operator` changes that are expected to minor. Pushing this
diff as it's large enough, operator and end to end tests in the next.
Also, the naming of existing structures and files is confusing (update
refering to both updates and created, `results` used too often etc.). I
will address this too, but feel free to comment on bad naming.
Reviewers: dgleich, teon.banek, msantl
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1210
Summary:
We have been using `Edges::VertexAddress` and `Edges::EdgeAddress` a lot
in other parts of the codebase because it's cleaner to write then
`Address<mvcc::VersionList<Edge>>`, especially in code what should not
really be MVCC-aware. However, a lot of that code should not really be
`Edges` aware either, as that's a storage datastructure that should not
be exposed.
This became annoying, so I extracted these addresses into a type-file. I
don't really like this approach, it might be better to have
`Vertex::Address` and `Edge::Address`, but that means we'd have to
import those headers and we'd get circular dependencies.
“The horror! The horror!”
- Joseph Conrad, Heart of Darkness
Reviewers: teon.banek, buda
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1204
Summary:
Remote caches used to be owned by `GraphDbAccessor`. An advantage of
that was immediate cleanup when destructing. A disadvantage was sharing
the remote cache between mutliple program-flows in the same transaction
in distributed (one would have to share the accessor).
We will have to do post-transactional global cleanup anyway, since we
leak, which reduces the above stated advantage. And the stated
disadvantage is becoming more and more pronounced as additional
components need access to the remote cache.
Hence the refactor.
Reviewers: buda, teon.banek, msantl
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1186
Summary:
Updates are supported, insertions and removals not in this diff. The
test is a bit overdesigned, it happens.
Reviewers: teon.banek, dgleich, msantl
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1176
Summary:
This diff consolidates local and remote update handling. It ensures and
tests that updates for remote elements are visible locally (on the
updating worker).
The next part will be accumulating remote updates and applying them on
the owner.
Also extracted a common testing fixture.
Reviewers: dgleich, buda, mtomic
Reviewed By: mtomic
Subscribers: mtomic, pullbot
Differential Revision: https://phabricator.memgraph.io/D1169
Summary:
Refactor in two ways. First, expose members without getters as we will
need most of them in distributed. And this was always the sensible thing
to do. Second, add storage type values to deltas. This is also a
sensible thing to do, and it will be very beneficial in distributed. We
didn't do it before because name<->value type mappings aren't guaranteed
to be the same after recovery. A task has been added to address this
(preserve mappings in durability).
Reviewers: dgleich, buda
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1167
Summary:
It is possible that we have a global address to resolve, for a graph
element that's local. Consider W1 expanding, getting data from W2,
expanding from there and getting data that is on W1. We then don't want
to do RPC from W1 to W1, but do a lookup directly.
Reviewers: dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1145
Summary:
Start removal of old logic
Remove more obsolete classes
Move Message class to RPC
Remove client logic from system
Remove messaging namespace
Move protocol from messaging to rpc
Move System from messaging to rpc
Remove unnecessary namespace
Remove System from RPC Client
Split Client and Server into separate files
Start implementing new client logic
First semi-working state
Changed network protocol layout
Rewrite client
Fix client receive bug
Cleanup code of debug lines
Migrate to accessors
Migrate back to binary boost archives
Remove debug logging from server
Disable timeout test
Reduce message_id from uint64_t to uint32_t
Add multiple workers to server
Fix compiler warnings
Apply clang-format
Reviewers: teon.banek, florijan, dgleich, buda, mtomic
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1129
Summary:
It seems that RecordAccessor &co are ready for read-only distributed
execution. In read-only there is no command advancement and the implied
cache invalidation, `SwitchOld` and `SwitchNew` perform default
switching and `Reconstruct` uses the `RemoteCache` which is implemented.
I just added a few TODOs for proper CRUD.
Reviewers: dgleich
Reviewed By: dgleich
Differential Revision: https://phabricator.memgraph.io/D1125
Summary:
- End to end distributed GraphDb testing
- Refactors as necessary
- Basic RemoteCache for storing remote data
- RemoteDataRpc
As we are on a tight schedule, please let's focus on the essentials:
functionality and proper testing.
Reviewers: dgleich, teon.banek, buda
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1121
Summary:
Virtual destructors were missing in classes/structs which can
be inherited.
A missing virtual destructor gives undefined behaviour when
deleting derived class using base type.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1117
Summary:
Previously, we would have a `DCHECK` which crashes the application. This
was evident when testing a queries, such as:
MATCH (n) DELETE n SET n.prop = 42
Since the argument to update clauses is evaluated during execution, it
makes it very difficult to prevent such errors during semantic analysis.
For example:
MATCH (n)--(m) WITH collect(n) as ns, m
DETACH DELETE ns[m.prop] SET head(ns).prop = 42
Test query updates on deleted graph elements
Reviewers: florijan, dgleich
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1114
Summary:
GraphDb is refactored to become an API exposing different parts
necessary for the database to function. These different parts can have
different implementations in SingleNode or distributed Master/Server
GraphDb implementations.
Interally GraphDb is implemented using two class heirarchies. One
contains all the members and correct wiring for each situation. The
other takes care of initialization and shutdown. This architecture is
practical because it can guarantee that the initialization of the
object structure is complete, before initializing state.
Reviewers: buda, mislav.bradac, dgleich, teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1093
Summary:
Serialization of vertices and edges for distributed. Based on Boost
serialization. Threrefore moved TypedValue serialization from AST to
utils.
Reviewers: buda, dgleich, teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1088
Summary:
A PropertyValueStore is not a generic data structure, but only ever used
to store properties in a Vertex/Edge. It has behaviours specific to it.
So, the templatization was not necessary.
Reviewers: buda
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1089
Summary:
Not having a virtual destructor caused tests
to fail (cypher_main_visitor, interpreter) sporadically
since unfreed memory was re-used incorrectly.
Also Valgrind complained constantly.
Reviewers: florijan, teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1081
Summary:
The distributed ID mapper is not yet utilised in GraphDb as those
changes are in D1060. Depending on landing order it will be added.
Reviewers: dgleich, mislav.bradac
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1064
Summary:
What's done:
- `RecordAccessor` can represent remote data
- `GraphDbAccessor` manages remote data
- Cleanup: different `EdgeAccessor lazyness (@dgleich: take a look), unused methods, documentation...
- `TODO` placeholders for remote implementation
What's not done:
- RPC and data transfer
- how exactly remote errors are handled
- not sure if any MVCC Record info for remote data should be tracked
- WAL and RPC Deltas properly handled (Gleich working on extracting `Wal::Op`)
This implementation should not break single-node execution, and should provide good abstractions and placeholders for distributed. Once that's satisfied, it should land.
Reviewers: dgleich, buda, mislav.bradac
Reviewed By: dgleich
Subscribers: dgleich, pullbot
Differential Revision: https://phabricator.memgraph.io/D1030
Summary: Operations are moved and renamed from WAL to a separate file in preparation for HA and distributed storage.
Reviewers: florijan, mtomic, mislav.bradac
Reviewed By: florijan
Subscribers: mislav.bradac, pullbot
Differential Revision: https://phabricator.memgraph.io/D1034
Summary:
The current idea is that the same MG binary can be used for single-node,
distributed master and distributed worker. The transactional engine in
the single-node and distributed master is the same: it determines the
transactional time and exposes all the "global" functionalities. In the
distributed worker the "global" functions must contact the master.
Reviewers: dgleich, mislav.bradac, buda
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1013
Summary: Because it will never be used, we already have replacements for it.
Reviewers: buda
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1016
Summary: Code simplification made possible by making `locks_` `mutable` in `tx::Transaction`.
Reviewers: dgleich, buda
Reviewed By: buda
Differential Revision: https://phabricator.memgraph.io/D1015
Summary:
In preparation for distributed storage we need to have labels/properties/edgetypes uniquely identifiable by their ids, which will be global in near future.
The old design has to be abandoned because it's not possible to keep track of global labels/properties/edgetypes while they are local pointers.
Reviewers: mislav.bradac, florijan
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D993
Summary: Vertex and Edge now use Address for storing connections to other Edges and Vertices, to support distributed storage.
Reviewers: mislav.bradac, dgleich, buda
Reviewed By: mislav.bradac, dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D977
Summary:
My dear fellow Memgraphians. It's friday afternoon, and I am as ready to pop as WAL is to get reviewed...
What's done:
- Vertices and Edges have global IDs, stored in `VersionList`. Main storage is now a concurrent map ID->vlist_ptr.
- WriteAheadLog class added. It's based around buffering WAL::Op objects (elementraly DB changes) and periodically serializing and flusing them to disk.
- Snapshot recovery refactored, WAL recovery added. Snapshot format changed again to include necessary info.
- Durability testing completely reworked.
What's not done (and should be when we decide how):
- Old WAL file purging.
- Config refactor (naming and organization). Will do when we discuss what we want.
- Changelog and new feature documentation (both depending on the point above).
- Better error handling and recovery feedback. Currently it's all returning bools, which is not fine-grained enough (neither for errors nor partial successes, also EOF is reported as a failure at the moment).
- Moving the implementation of WAL stuff to .cpp where possible.
- Not sure if there are transactions being created outside of `GraphDbAccessor` and it's `BuildIndex`. Need to look into.
- True write-ahead logic (flag controlled): not committing a DB transaction if the WAL has not flushed it's data. We can discuss the gain/effort ratio for this feature.
Reviewers: buda, mislav.bradac, teon.banek, dgleich
Reviewed By: dgleich
Subscribers: mtomic, pullbot
Differential Revision: https://phabricator.memgraph.io/D958
Summary: In the current state, it was not possible to iterate, or even access a const map, or const set structure because of an incorrect implementation of "ConstAccessors".
Reviewers: mislav.bradac, teon.banek, buda
Reviewed By: teon.banek, buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D902
Summary:
- The new BFS syntax implemented as proposed.
- AST BreadthFirstAtom now uses EdgeAtom members: has_range_{true}, upper_bound_, lower_bound_
- Edges data structure now handles all the edge filtering (single or multiple edges), to ease planning. Additional edge filtering (additional Filter op in the plan) is removed. AST EdgeTypeTest is no longer used and is removed.
Current state is stable but there are things left to do:
- BFS property filtering.
- BFS lower_bound_ support.
- Support for lambdas in variable length expansion. This includes obligatory (even if not user_defined) inner_node and inner_edge symbols for easier handling.
- Code-sharing between BFS and variable length expansions.
I'll add asana tasks (and probably start working on them immediately) when/if this lands.
Reviewers: buda, teon.banek, mislav.bradac
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D836
Summary: - modified all utils/algorithm functions to be inline and in the utils namespace
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D830
Summary:
Added path encoding and decoding in our Bolt layer. Returning paths to the Neo client seems to work fine.
Not sure how to test that the decoded works fine (our client?).
Tests are not written, that suite is complicated. Leaving that to mferencevic.
Also reduced DecodedValue a bit with macros. Less code, less error prone.
Reviewers: buda, florijan
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D810
Summary:
- The `Edges` data structure now handles common ops, including providing an iterator over edges whose "other" vertex is know.
- This should improve performance on dense_expand tests in the harness without other side-effects.
- query::plan::Expand operator modified not to check for existing-node stuff since that now gets handled by the `Edges` data structure.
- `Edges::Iterator` implemented only for const iterators since that suffices for now. Can implement non-const if the need arrises.
Reviewers: buda, mislav.bradac, teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D763
Summary:
Added:
- map support in PropertyValue
- conversion of map TypedValue to PropertyValue if appropriate flag is set (undocumented because it's private)
- ordering of map PropertyValue in LabelPropertyIndex
- issue raised regarding list and value property modifications in storage (currently unsupported)
Maybe I missed some feature or whatever?
Reviewers: mislav.bradac, buda, teon.banek
Reviewed By: mislav.bradac, buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D692
Summary: Not strictly neccessary, but it's been itching me. It took an hour.
Reviewers: buda, mislav.bradac
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D648
Summary:
Not complete (but review can start):
- implementation should be done
- still need to finish tests
- documentation missing
Reviewers: mislav.bradac, teon.banek, buda
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D616
Summary:
Variable expansion logical operator added. Some functionalities are missing:
- taking into account optional matching when expanding into existing symbol
- accepting Expression bounds (current implementation takes size_t)
Also, a TODO is added for handling optional matching in the uniqueness operator (with an Asana task)
All this will be done in the following diff, this is already substantial.
Also, please consider if we want to have all those `VLOG`s in the code. Not very pretty. And I think that `VLOG` is not compiled-away in release build, will put an asana task.
Reviewers: teon.banek, mislav.bradac, buda
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D569
Summary:
- GraphDbAccessor - index range API added
- index api tests refactored
- skiplist minor cleanup.
Reviewers: teon.banek, buda, mislav.bradac
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D533
Summary:
- GC changed to evaluate old records w.r.t. the oldest transaction's id AND snapshot, as opposed to only id
- MVCC hints exp+aborted race condition prevented
- minor MVCC refactors and cleanups
- minor Transaction refactors and cleanups
Reviewers: buda, dgleich
Reviewed By: buda, dgleich
Subscribers: dtomicevic, pullbot
Differential Revision: https://phabricator.memgraph.io/D434
Summary: Memgraph seemed like it leaked memory while in fact it was just keeping it close by. By forcing release of free memory on top of heap back to the operating system we are making sure we are not using more memory than we have to in any given moment.
Reviewers: mferencevic, buda
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D385
Summary:
Gradicek's Mvcc test have seen the following changes:
- provided a test infrastructure (fixture and macros) to facilitate testing and increase readability
- split tests into multi-transaction update, VersionList::find and general Mvcc testing
- multi-transaction update tests have been refactored (i *think* nothing got deleted, but it was a mess so I don't guarantee)
- changed all multithreaded tests to be single-threaded because multiple threads were not necessary
- changed transaction naming from T5, T8, T10 to T1, T2... for consistency with actual transaction indices
What still needs to be done:
- Gleich and Gradicek need to review the infrastructure (possible improvements)
- multi-transaction update tests need to be addressed by Gradicek (see "TODO gradicek" in code, discuss with Flor)
- the wiki/draw.io documentation needs to be updated. it is not imperative that all the tests be drawn in draw.io, only the general infrastructure explained. perhaps only a few examples drawn. Gradicek discuss with Flor
- Gleich see the "TODO Gleich" lines in the diff and discuss with flor
Suggested workflow:
- review this diff, hopefully land (before resolving all the TODOs)
- discard D169
- Gradicek and Gleich address the TODOs
- Flor reviews the results (in following diffs)
Reviewers: dgleich, matej.gradicek, buda
Reviewed By: matej.gradicek, buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D348
Summary:
- an enum in query/common.hpp for flagging what kind of switching should be done
- changes in expression evaluator
- changes in logical operators
- modification in RecordAccessor::SwitchOld to support operator functionality
- tests
Reviewers: teon.banek, mislav.bradac, buda
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D314