#include "gtest/gtest.h"

#include <mutex>
#include <optional>
#include <unordered_map>
#include <vector>

#include "durability/single_node_ha/state_delta.hpp"
#include "raft/raft_interface.hpp"
#include "transactions/single_node_ha/engine.hpp"
#include "transactions/transaction.hpp"

using namespace tx;
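
// Minimal in-memory mock of raft::RaftInterface: it always reports itself as
// a stable leader (term 1, everything replicated and safe to commit) and
// records each emplaced StateDelta per transaction so tests can inspect the
// resulting log.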
class RaftMock final : public raft::RaftInterface {
 public:
  raft::DeltaStatus Emplace(const database::StateDelta &delta) override {
    // Record the delta under its transaction ID. The parameter is a const
    // reference, so it is copied into the log.
    log_[delta.transaction_id].emplace_back(delta);
    return {true, std::nullopt};
  }

  bool SafeToCommit(const tx::TransactionId &) override {
    return true;
  }

  bool IsLeader() override { return true; }

  uint64_t TermId() override { return 1; }

  raft::TxStatus TransactionStatus(uint64_t term_id,
                                   uint64_t log_index) override {
    return raft::TxStatus::REPLICATED;
  }

  // Test helper (not part of RaftInterface): returns the deltas recorded for
  // the given transaction.
  std::vector<database::StateDelta> GetLogForTx(
      const tx::TransactionId &tx_id) {
    return log_[tx_id];
  }

  std::mutex &WithLock() override { return lock_; }

 private:
  std::unordered_map<tx::TransactionId, std::vector<database::StateDelta>> log_;
  std::mutex lock_;
};
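
// Engine::Reset() should restart the transaction ID counter: a transaction
// begun after the reset must get ID 1 again.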
TEST(Engine, Reset) {
  RaftMock raft;
  Engine engine{&raft};

  auto t0 = engine.Begin();
  EXPECT_EQ(t0->id_, 1);
  engine.Commit(*t0);

  engine.Reset();

  auto t1 = engine.Begin();
  EXPECT_EQ(t1->id_, 1);
  engine.Commit(*t1);
}
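
// A transaction's lifetime should appear in the Raft log as a
// TRANSACTION_BEGIN delta followed by a TRANSACTION_COMMIT delta, both
// carrying the transaction's ID.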
TEST(Engine, TxStateDelta) {
  RaftMock raft;
  Engine engine{&raft};

  auto t0 = engine.Begin();
  tx::TransactionId tx_id = t0->id_;
  engine.Commit(*t0);

  auto t0_log = raft.GetLogForTx(tx_id);
  EXPECT_EQ(t0_log.size(), 2);

  using Type = database::StateDelta::Type;
  EXPECT_EQ(t0_log[0].type, Type::TRANSACTION_BEGIN);
  EXPECT_EQ(t0_log[0].transaction_id, tx_id);
  EXPECT_EQ(t0_log[1].type, Type::TRANSACTION_COMMIT);
  EXPECT_EQ(t0_log[1].transaction_id, tx_id);
}
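
// Note: this file defines no main(); it is expected to be linked against
// gtest_main or a harness that calls RUN_ALL_TESTS().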