Summary:
This is still in progress.
As reported in
https://app.asana.com/0/743890251333732/971034572323525/f
distributed memgraph fails when the snapshot it tries to recover from is
invalid. Instead of inheriting from `StorageGc`, this diff uses composition, and
we no longer try to register an RPC server after initialization.
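For illustration only, a minimal sketch of the composition approach, with `Gc`
and `Storage` as made-up stand-ins for the real classes:
```
#include <iostream>

// Hypothetical stand-ins for the real classes; names are illustrative only.
class Gc {
 public:
  void Run() { std::cout << "collecting garbage\n"; }
};

class Storage {
 public:
  // Composition: the storage owns a Gc instance instead of inheriting from it.
  void Collect() { gc_.Run(); }

 private:
  Gc gc_;
};

int main() {
  Storage storage;
  storage.Collect();
}
```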
Reviewers: ipaljak, vkasljevic, mtomic
Reviewed By: vkasljevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1930
Summary:
The case with existence constraints is about 8-9% slower than the case without
them. Before this diff, the difference was about 15-16%.
Reviewers: msantl, ipaljak
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1917
Summary:
UniqueLabelPropertyConstraint defines a label + property restriction on vertices:
the combination of label, property and that property's value must be unique at
any given moment.
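As a rough, hypothetical sketch of the invariant (none of these names are the
real implementation), uniqueness can be pictured as a set of
(label, property, value) triples that rejects duplicates:
```
#include <cassert>
#include <set>
#include <string>
#include <tuple>

// Hypothetical sketch: the constraint holds iff every (label, property, value)
// triple appears on at most one vertex at any given moment.
class UniqueConstraintSketch {
 public:
  // Returns false if another vertex already holds this (label, property, value).
  bool Insert(const std::string &label, const std::string &property,
              const std::string &value) {
    return entries_.emplace(label, property, value).second;
  }

  void Remove(const std::string &label, const std::string &property,
              const std::string &value) {
    entries_.erase({label, property, value});
  }

 private:
  std::set<std::tuple<std::string, std::string, std::string>> entries_;
};

int main() {
  UniqueConstraintSketch constraint;
  assert(constraint.Insert("Person", "email", "a@b.com"));
  assert(!constraint.Insert("Person", "email", "a@b.com"));  // duplicate rejected
}
```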
Reviewers: msantl, ipaljak
Reviewed By: msantl
Subscribers: mferencevic, pullbot
Differential Revision: https://phabricator.memgraph.io/D1884
Summary:
`SHOW STORAGE STATS`, when executed in a Raft cluster, should return
stats for each member of the cluster.
`StorageStats` starts an RPC server on each member of the cluster that answers
requests about its local storage stats.
The query can be invoked only on the current leader; the leader sends a request
to each peer and shows the results it gets. If a peer doesn't answer within 1
second, its stats won't be shown.
The new output can be seen here: P27
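A minimal sketch of the fan-out-with-timeout pattern described above, using
plain standard-library futures instead of the real RPC layer (the peer names
and the `FetchStats` helper are made up):
```
#include <chrono>
#include <cstdint>
#include <future>
#include <iostream>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the per-peer RPC that returns local storage stats.
std::map<std::string, int64_t> FetchStats(const std::string &peer) {
  return {{"vertex_count", 42}, {"edge_count", 10}};
}

int main() {
  std::vector<std::string> peers = {"peer1", "peer2", "peer3"};
  std::vector<std::pair<std::string, std::future<std::map<std::string, int64_t>>>>
      pending;
  for (const auto &peer : peers)
    pending.emplace_back(peer, std::async(std::launch::async, FetchStats, peer));

  // Show stats only for peers that answer within 1 second; skip the rest.
  for (auto &[peer, fut] : pending) {
    if (fut.wait_for(std::chrono::seconds(1)) == std::future_status::ready) {
      auto stats = fut.get();
      std::cout << peer << ": vertex_count=" << stats["vertex_count"] << "\n";
    } else {
      std::cout << peer << ": no answer within 1s, skipping\n";
    }
  }
}
```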
Reviewers: ipaljak, mferencevic
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1907
Summary:
Added index creation and deletion handling in StateDelta.
Also included an integration test that creates an index and makes sure that it
gets replicated, killing each peer in turn and eventually causing a leader
re-election.
Reviewers: ipaljak
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1886
Summary: `DataRpcClient` and `DataRpcServer` now work with both the old and the new record. This will be needed later when I replace `Cache` with `LruCache`.
Reviewers: msantl, teon.banek, ipaljak
Reviewed By: msantl, teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1875
Summary:
An existence constraint ensures that all nodes with a certain label have a certain
property. `ExistenceRule` defines a label -> properties rule and
`ExistenceConstraints` manages all such constraints.
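As a hypothetical illustration of the label -> properties rule (not the actual
API), a rule is satisfied by a vertex that either lacks the label or has every
required property:
```
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical sketch of a label -> properties rule.
struct ExistenceRuleSketch {
  std::string label;
  std::vector<std::string> properties;

  // The rule is satisfied if the vertex doesn't have the label, or it has the
  // label and every required property is present.
  bool CheckVertex(const std::set<std::string> &labels,
                   const std::set<std::string> &props) const {
    if (!labels.count(label)) return true;
    for (const auto &property : properties)
      if (!props.count(property)) return false;
    return true;
  }
};

int main() {
  ExistenceRuleSketch rule{"Person", {"name", "age"}};
  assert(rule.CheckVertex({"Person"}, {"name", "age"}));
  assert(!rule.CheckVertex({"Person"}, {"name"}));  // missing "age"
  assert(rule.CheckVertex({"Company"}, {}));        // label not present
}
```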
Reviewers: msantl, ipaljak, teon.banek, mferencevic
Reviewed By: msantl, teon.banek, mferencevic
Subscribers: mferencevic, pullbot
Differential Revision: https://phabricator.memgraph.io/D1797
Summary:
Consider the following scenario:
- start an HA cluster with 3 machines
- find the leader and start sending queries
- SIGTERM the leader but leave the other 2 machines untouched
The leader would get stuck in the shutdown phase.
This was happening because, during the shutdown phase of the Bolt server, a
`graph_db_accessor` would try to commit a transaction after we had already shut
down the Raft server. Raft, although not running, still thinks it's in Leader
mode. The transaction engine calls the `SafeToCommit` method to commit
transactions and ends up in an infinite loop.
Since Raft was shut down, it won't handle any of the incoming RPCs and won't
change its mode.
The fix here is to shut down the Bolt server before Raft, so we don't have any
pending commits once Raft is shut down.
Reviewers: ipaljak
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1853
Summary:
Added a new awesome function called `storageInfo`. This function
returns a list of key-value pairs containing the following (useful?)
information:
* number of vertices
* number of edges
* average degree
* memory usage
* disk usage
The current implementation is in `awesome_memgraph_functions` but it will end up
as a separate clause for better access control.
Reviewers: teon.banek, mtomic, mferencevic
Reviewed By: teon.banek
Subscribers: pullbot, buda
Differential Revision: https://phabricator.memgraph.io/D1850
Summary:
In this part of log compaction for Raft, I've implemented snapshotting
and snapshot recovery. I've also refactored the code a bit, so `RaftServer` now
has a pointer to the `GraphDb` and can do some things by itself.
Log compaction requires some further work. Since snapshotting isn't synchronized
between peers, and each peer can work at its own pace, once we've compacted the
log so that the next log entry to be sent to peer `x` isn't available anymore,
we need to send the snapshot over the wire. This means that the next part will
contain `InstallSnapshotRPC`, and then maybe one more part that will implement
the logic of choosing between sending a `LogEntry` or the whole snapshot.
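A sketch of the decision this implies, assuming the compacted log keeps entries
starting from some first retained index (all names here are illustrative, not
the real `RaftServer` code):
```
#include <cstdint>
#include <iostream>

// Hypothetical sketch: after compaction the log only holds entries with
// index >= first_retained_index; earlier state lives in the snapshot.
enum class ReplicationAction { kAppendEntries, kInstallSnapshot };

ReplicationAction ChooseAction(uint64_t next_index_for_peer,
                               uint64_t first_retained_index) {
  // If the next entry the peer needs was compacted away, we must ship the
  // whole snapshot instead of individual log entries.
  if (next_index_for_peer < first_retained_index)
    return ReplicationAction::kInstallSnapshot;
  return ReplicationAction::kAppendEntries;
}

int main() {
  std::cout << (ChooseAction(5, 10) == ReplicationAction::kInstallSnapshot)
            << "\n";  // peer is behind the compaction point -> send snapshot
  std::cout << (ChooseAction(12, 10) == ReplicationAction::kAppendEntries)
            << "\n";  // entry still in the log -> send log entries
}
```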
Reviewers: ipaljak
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1834
Summary:
Once a leader loses its leadership, in order to handle hanging
transactions, we reset the storage and the transaction engine.
This requires re-applying all the committed entries from the log.
Once we add snapshots (log compaction), we will need to re-apply those as well.
One thing to keep in mind is the `election_timeout_min` parameter. If it's set
too low, it could trigger leader re-elections too often.
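A rough sketch of the reset-and-reapply idea with made-up types (not the actual
storage or transaction engine):
```
#include <cstdint>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical sketch of resetting state and re-applying the committed log.
struct LogEntry {
  std::string delta;  // stand-in for a serialized StateDelta
};

class StateMachineSketch {
 public:
  void Reset() { applied_.clear(); }
  void Apply(const LogEntry &entry) { applied_.push_back(entry.delta); }
  const std::vector<std::string> &applied() const { return applied_; }

 private:
  std::vector<std::string> applied_;
};

int main() {
  std::vector<LogEntry> log = {{"CREATE_VERTEX"}, {"SET_PROPERTY"}};
  uint64_t commit_index = 2;  // entries [0, commit_index) are committed

  StateMachineSketch machine;
  // On losing leadership: drop in-flight state, then replay the committed prefix.
  machine.Reset();
  for (uint64_t i = 0; i < commit_index; ++i) machine.Apply(log[i]);
  std::cout << machine.applied().size() << " committed entries re-applied\n";
}
```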
Reviewers: ipaljak
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1822
Summary:
Previously, I wrongly assumed `Shutdown` would always be called on Cursors and
removed BFS subcursors there. Now this is done using the transactional cache
clean-up mechanism. There's a separate clean-up for `BfsSubcursorStorage` (for
actual subcursors) and `BfsRpcServer` (for database accessors created by the
server).
I've changed `BfsRpcServer` to have a `GraphDbAccessor` per transaction,
instead of per `BfsSubcursor`, mainly because there is no reliable way to check
if the transaction tied to the accessor has expired, as it is not safe to call
the `transaction_id` method (since `GraphDbAccessor` holds only a reference to
the `Transaction` object).
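A minimal sketch of the accessor-per-transaction bookkeeping with a
transactional clean-up hook (hypothetical names, not the real `BfsRpcServer`):
```
#include <cstdint>
#include <iostream>
#include <map>

// Hypothetical stand-in for a database accessor tied to one transaction.
struct AccessorSketch {
  uint64_t tx_id;
};

class BfsRpcServerSketch {
 public:
  // One accessor per transaction, created lazily on first use.
  AccessorSketch &GetAccessor(uint64_t tx_id) {
    return accessors_.try_emplace(tx_id, AccessorSketch{tx_id}).first->second;
  }

  // Called from the transactional cache clean-up once transactions expire.
  void ClearTransactionalCache(uint64_t oldest_active_tx_id) {
    for (auto it = accessors_.begin(); it != accessors_.end();)
      it = (it->first < oldest_active_tx_id) ? accessors_.erase(it) : ++it;
  }

  bool HasAccessor(uint64_t tx_id) const { return accessors_.count(tx_id) > 0; }

 private:
  std::map<uint64_t, AccessorSketch> accessors_;
};

int main() {
  BfsRpcServerSketch server;
  server.GetAccessor(1);
  server.GetAccessor(2);
  server.ClearTransactionalCache(/*oldest_active_tx_id=*/2);
  std::cout << "tx 1 accessor kept: " << server.HasAccessor(1) << "\n";  // 0
  std::cout << "tx 2 accessor kept: " << server.HasAccessor(2) << "\n";  // 1
}
```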
Reviewers: teon.banek, mferencevic
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1792
Summary:
Creating Raft noop logs on leader change triggers the whole
log replication procedure, which ends up committing/applying state deltas on
newly elected leaders that didn't receive the last commit index from the
previous leader.
I also included a small tweak that avoids appending logs when a transaction
contains only BEGIN and ABORT StateDeltas, because we don't want to replicate
read queries.
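The read-query check can be sketched roughly like this (the delta types are
simplified stand-ins for the real StateDeltas):
```
#include <cassert>
#include <vector>

// Simplified stand-in for StateDelta types.
enum class DeltaType { kBegin, kAbort, kCommit, kCreateVertex, kSetProperty };

// A transaction whose deltas are only BEGIN and ABORT made no writes, so
// there is nothing worth replicating through the Raft log.
bool ShouldReplicate(const std::vector<DeltaType> &deltas) {
  for (auto delta : deltas)
    if (delta != DeltaType::kBegin && delta != DeltaType::kAbort) return true;
  return false;
}

int main() {
  assert(!ShouldReplicate({DeltaType::kBegin, DeltaType::kAbort}));  // read-only
  assert(ShouldReplicate(
      {DeltaType::kBegin, DeltaType::kCreateVertex, DeltaType::kCommit}));
}
```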
Reviewers: ipaljak
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1785
Summary:
TransactionReplicator replicates transactions on follower machines in
HA memgraph. Our DB accessor API doesn't provide the functionality to
begin transactions with non-increasing ids. This is why the
`TransactionReplicator` uses an internal map that maps tx ids from the leader
node to transactions on the follower node (whose ids don't have to match the
leader's tx ids).
If the leader has the following transaction timeline:
```
L
tx1
|
| tx2
| |
| |
| |
| |
| |
| tx2
|
|
|
|
tx1
```
`tx2` will commit first and will be replicated. When applying `tx2`, follower
nodes will start a new transaction with tx id `1`. When `tx1` starts
replicating, followers will start a new transaction with tx id `2`. This is
where `TransactionReplicator` kicks in.
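A rough sketch of that id mapping, with a fake local begin-transaction function
(none of this is the real engine API):
```
#include <cstdint>
#include <iostream>
#include <unordered_map>

// Hypothetical stand-in: the follower's engine hands out its own increasing ids.
uint64_t BeginLocalTransaction() {
  static uint64_t next_id = 0;
  return ++next_id;
}

class TransactionReplicatorSketch {
 public:
  // Map the leader's tx id to a local follower transaction, beginning one
  // on first sight of that leader id.
  uint64_t LocalTx(uint64_t leader_tx_id) {
    auto it = leader_to_local_.find(leader_tx_id);
    if (it == leader_to_local_.end())
      it = leader_to_local_.emplace(leader_tx_id, BeginLocalTransaction()).first;
    return it->second;
  }

 private:
  std::unordered_map<uint64_t, uint64_t> leader_to_local_;
};

int main() {
  TransactionReplicatorSketch replicator;
  // The leader commits tx2 before tx1, so the follower sees tx id 2 first.
  std::cout << "leader tx2 -> local tx " << replicator.LocalTx(2) << "\n";  // 1
  std::cout << "leader tx1 -> local tx " << replicator.LocalTx(1) << "\n";  // 2
}
```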
Reviewers: ipaljak
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1775
Summary:
Explicitly start and stop the Raft server.
This way we can be sure that Raft won't try to use coordination after its
shutdown, and we can define the start of the Raft protocol more easily.
Reviewers: ipaljak
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1766
Summary:
This is just the first diff that tries to wire the Raft protocol into
memgraph.
In this diff I'm introducing transaction engine reset functionality. I also
introduce `RaftInterface`, which should be used wherever someone wants to access
Raft from Memgraph.
For design decisions see the feature spec.
Reviewers: ipaljak, teon.banek
Reviewed By: ipaljak
Subscribers: pullbot, teon.banek
Differential Revision: https://phabricator.memgraph.io/D1758
Summary:
GraphDb now has a `GetStorageStat` method that returns a live view of an
object containing `vertex_count`, `edge_count` and `avg_degree()`.
The stat is updated on each garbage collection run and represents the number of
objects that are visible to any transaction.
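A minimal sketch of such a live stat object, overwritten by the GC run and
readable at any time (hypothetical names; atomics are just one way to
illustrate the live view):
```
#include <atomic>
#include <cstdint>
#include <iostream>

// Hypothetical sketch of a live storage stat: the GC run overwrites the
// counters, readers can look at them at any moment.
struct StorageStatSketch {
  std::atomic<int64_t> vertex_count{0};
  std::atomic<int64_t> edge_count{0};

  double avg_degree() const {
    auto vertices = vertex_count.load();
    // Every edge contributes two endpoints to the degree sum.
    return vertices == 0 ? 0.0 : 2.0 * edge_count.load() / vertices;
  }
};

int main() {
  StorageStatSketch stat;
  // Pretend a garbage collection run just counted the visible objects.
  stat.vertex_count = 100;
  stat.edge_count = 250;
  std::cout << "avg degree: " << stat.avg_degree() << "\n";  // 5
}
```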
Reviewers: teon.banek, msantl, ipaljak
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1731
Summary:
Since we need to send `StateDelta`s over the wire in HA, we need to be
able to serialize those bad boys.
This diff hopefully does this the right way.
Reviewers: teon.banek, mferencevic, ipaljak
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1725
Summary:
Initial fork (in the form of a copy/paste) from the current single node
version. Once we finish HA, we plan to re-link the files to the single node
versions if they don't change.
Reviewers: ipaljak, buda, mferencevic
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1705
Summary:
A blocking transaction has the ability to stop the transaction engine from
starting new transactions (regular or blocking) and to wait for all other active
transactions to finish (to become inactive, i.e. committed or aborted). One thing
that blocking transactions support is defining a parent transaction which
does not need to end in order for the blocking one to start. This covers the
use case where we start nested transactions.
One could think we should build indexes inside those blocking transactions. This
is true and I wanted to implement it, but it would require some digging in
the interpreter which I didn't want to do in this change.
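A condition-variable sketch of the waiting behaviour described above (not the
real engine; the parent handling is reduced to excluding one transaction from
the wait):
```
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <iostream>
#include <mutex>
#include <set>
#include <thread>

// Hypothetical sketch: a blocking transaction stops new transactions from
// starting and waits until every active transaction (except its parent) ends.
class EngineSketch {
 public:
  uint64_t Begin() {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [&] { return !blocked_; });  // new transactions are held back
    uint64_t id = ++next_id_;
    active_.insert(id);
    return id;
  }

  void End(uint64_t id) {
    std::lock_guard<std::mutex> lock(mutex_);
    active_.erase(id);
    cv_.notify_all();
  }

  // Blocks new transactions, then waits for all active ones except `parent`.
  void RunBlocking(uint64_t parent) {
    std::unique_lock<std::mutex> lock(mutex_);
    blocked_ = true;
    cv_.wait(lock, [&] { return active_.size() == active_.count(parent); });
    std::cout << "blocking transaction running\n";
    blocked_ = false;
    cv_.notify_all();
  }

 private:
  std::mutex mutex_;
  std::condition_variable cv_;
  std::set<uint64_t> active_;
  bool blocked_ = false;
  uint64_t next_id_ = 0;
};

int main() {
  EngineSketch engine;
  uint64_t parent = engine.Begin();
  uint64_t other = engine.Begin();
  std::thread finisher([&] {
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
    engine.End(other);  // the blocking transaction can proceed after this
  });
  engine.RunBlocking(parent);  // waits for `other`, tolerates `parent`
  finisher.join();
  engine.End(parent);
}
```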
Reviewers: mferencevic, vkasljevic, teon.banek
Reviewed By: mferencevic, teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1695
Summary:
As Tom requested in the #graphileon slack channel, I changed the error
message shown when a node can't be updated.
Reviewers: mferencevic, ipaljak
Reviewed By: ipaljak
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1687
Summary:
LabelPropertyIndex now has the ability to enforce a unique constraint.
This doesn't lock the tx engine.
Reviewers: teon.banek, mferencevic
Reviewed By: teon.banek
Subscribers: pullbot, vkasljevic, buda
Differential Revision: https://phabricator.memgraph.io/D1660
Summary:
Start removing `is_remote` from `Address`
Remove `GlobalAddress`
Remove `GlobalizedAddress`
Remove bitmasks from `Address`
Remove `is_local` from `Address`
Remove `is_local` from `RecordAccessor`
Remove `worker_id` from `Address`
Remove `worker_id` from `GidGenerator`
Unfriend `IndexRpcServer` from `Storage`
Remove `LocalizedAddressIfPossible`
Make member private
Remove `worker_id` from `Storage`
Copy function to ease removal of distributed logic
Remove `worker_id` from `WriteAheadLog`
Remove `worker_id` from `GraphDb`
Remove `worker_id` from durability
Remove nonexistent function
Remove `gid` from `Address`
Remove usage of `Address`
Remove `Address`
Remove `VertexAddress` and `EdgeAddress`
Fix Id test
Remove `cypher_id` from `VersionList`
Remove `cypher_id` from durability
Remove `cypher_id` member from `VersionList`
Remove `cypher_id` from database
Fix recovery (revert D1142)
Remove unnecessary functions from `GraphDbAccessor`
Revert `InsertEdge` implementation to the way it was in/before D1142
Remove leftover `VertexAddress` from `Edge`
Remove `PostCreateIndex` and `PopulateIndexFromBuildIndex`
Split durability paths into single node and distributed
Fix `TransactionIdFromWalFilename` implementation
Fix tests
Remove `cypher_id` from `snapshooter` and `durability` test
Reviewers: msantl, teon.banek
Reviewed By: msantl
Subscribers: msantl, pullbot
Differential Revision: https://phabricator.memgraph.io/D1647
Summary:
To clean the working directory after this diff you should execute:
```
rm src/database/counters_rpc_messages.capnp
rm src/database/counters_rpc_messages.hpp
rm src/database/serialization.capnp
rm src/database/serialization.hpp
```
Reviewers: teon.banek, msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1636
Summary:
This diff splits single node and distributed storage from each other.
Currently all of the storage code is copied into two directories (one single
node, one distributed). The logic used in the storage implementation isn't
touched; it will be refactored in subsequent diffs.
To clean the working directory after this diff you should execute:
```
rm database/state_delta.capnp
rm database/state_delta.hpp
rm storage/concurrent_id_mapper_rpc_messages.capnp
rm storage/concurrent_id_mapper_rpc_messages.hpp
```
Reviewers: teon.banek, buda, msantl
Reviewed By: teon.banek, msantl
Subscribers: teon.banek, pullbot
Differential Revision: https://phabricator.memgraph.io/D1625
Summary:
TODO:
~~1. Figure out how to propagate exceptions during lambda evaluation to master.~~
~~2. Make some more complicated test cases to see if everything is~~
~~sent over the network properly (lambdas depending on frame, evaluation context).~~
~~3. Support only `GraphView::OLD`.~~
4. [MAYBE] Send only parts of the frame necessary for lambda evaluation.
~~5. Fix EdgeType handling~~
--------------------
Serialize frame and send it in PrepareForExpand RPC
Move Lambda out of ExpandVariable
Send symbol table and filter lambda in CreateBfsSubcursor RPC
Evaluate filter lambda during the expansion
Send evaluation context in CreateBfsSubcursor RPC
Reviewers: teon.banek, msantl
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1600
Summary:
This should allow us to more easily decouple the code which should be
open sourced. Unfortunately, the downside of this approach is that we
cannot rely on virtual calls to dispatch the serialization to the correct
type. Another downside is that members need to be publicly accessible
for serialization.
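A tiny illustration of what dispatching serialization without virtual calls
tends to look like: free function overloads chosen at compile time (the types
here are made up):
```
#include <iostream>
#include <string>

// Made-up value types; the point is that serialization is resolved by
// overload at compile time, not through a virtual Save() method.
struct Vertex {
  int id;
};
struct Edge {
  int from, to;
};

std::string Save(const Vertex &v) { return "Vertex{" + std::to_string(v.id) + "}"; }
std::string Save(const Edge &e) {
  return "Edge{" + std::to_string(e.from) + "->" + std::to_string(e.to) + "}";
}

// The caller must know the concrete type; there is no base-class pointer that
// could pick the right overload at runtime, which is the downside noted above.
template <typename T>
void Serialize(const T &value) {
  std::cout << Save(value) << "\n";
}

int main() {
  Serialize(Vertex{1});
  Serialize(Edge{1, 2});
}
```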
Reviewers: mtomic, msantl
Reviewed By: mtomic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1596
Summary:
This diff changes the RPC layer to directly return `TResponse` to the user when
issuing a `Call<...>` RPC call. The call throws an exception on failure
(instead of returning `nullopt` as before).
All servers (network, RPC and distributed) are set to have explicit `Shutdown`
methods so that a controlled shutdown can always be performed. The object
destructors now have `CHECK`s to enforce that the `AwaitShutdown` methods were
called.
The distributed memgraph is changed so that none of the binaries (master/workers)
crash when there is a communication failure. Instead, the whole cluster starts
a graceful shutdown when a persistent communication error is detected.
Transient errors are allowed during execution. The transaction that errored out
will be aborted on the whole cluster. The cluster state is managed using a new
Heartbeat RPC call.
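A sketch contrasting the old optional-returning style with the new throwing
style (the `Call` shown here is an illustrative stand-in, not the real RPC
client):
```
#include <iostream>
#include <optional>
#include <stdexcept>
#include <string>

struct SumResponse {
  int result;
};

// Old style (illustrative): the caller has to check for nullopt on failure.
std::optional<SumResponse> CallOld(int a, int b, bool network_ok) {
  if (!network_ok) return std::nullopt;
  return SumResponse{a + b};
}

// New style (illustrative): the response is returned directly and a failure
// surfaces as an exception the caller can't silently ignore.
SumResponse Call(int a, int b, bool network_ok) {
  if (!network_ok) throw std::runtime_error("RPC call failed");
  return SumResponse{a + b};
}

int main() {
  if (auto response = CallOld(1, 2, true)) std::cout << response->result << "\n";
  try {
    std::cout << Call(1, 2, false).result << "\n";
  } catch (const std::runtime_error &e) {
    std::cout << "caught: " << e.what() << "\n";
  }
}
```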
Reviewers: buda, teon.banek, msantl
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1604