BuildIndex now iterates over fewer vertices

Summary:
Measured improvements in a test scenario with 4 labels, 1M vertices for each label:
Old code: ~2.5 seconds per index
New code: ~1.5 seconds per index

When building an index for a nonexistent label, the new code finishes immediately. The old code's run time depended on the total number of vertices in the database, regardless of the label.
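
The difference in work can be illustrated with a toy model. This is only a sketch: `ToyGraph`, `BuildStats`, `BuildByFullScan`, `BuildByLabelIndex` and `MakeToyGraph` are hypothetical names invented for the illustration, not Memgraph types, and the model counts visited vertices only (it says nothing about transactions, record visibility or actual timings).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Toy stand-in for the database (hypothetical, not Memgraph's real types):
// each vertex carries one label, and a label index maps a label to the ids
// of the vertices that carry it.
struct ToyGraph {
  std::vector<std::string> vertex_labels;  // vertex id -> label
  std::unordered_map<std::string, std::vector<std::size_t>> label_index;

  void AddVertex(const std::string &label) {
    label_index[label].push_back(vertex_labels.size());
    vertex_labels.push_back(label);
  }
};

struct BuildStats {
  std::size_t visited = 0;  // vertices the build loop had to look at
  std::size_t indexed = 0;  // vertices that actually carry the label
};

// Old approach: walk every vertex in the database.
BuildStats BuildByFullScan(const ToyGraph &graph, const std::string &label) {
  BuildStats stats;
  for (const auto &vertex_label : graph.vertex_labels) {
    ++stats.visited;
    if (vertex_label == label) ++stats.indexed;
  }
  return stats;
}

// New approach: walk only the vertices listed under the label.
BuildStats BuildByLabelIndex(const ToyGraph &graph, const std::string &label) {
  BuildStats stats;
  auto it = graph.label_index.find(label);
  if (it == graph.label_index.end()) return stats;  // unknown label: no work
  for (std::size_t id : it->second) {
    assert(id < graph.vertex_labels.size());
    ++stats.visited;
    if (graph.vertex_labels[id] == label) ++stats.indexed;
  }
  return stats;
}

// Builds a graph with `per_label` vertices for each of the given labels.
ToyGraph MakeToyGraph(const std::vector<std::string> &labels,
                      std::size_t per_label) {
  ToyGraph graph;
  for (const auto &label : labels) {
    for (std::size_t i = 0; i < per_label; ++i) graph.AddVertex(label);
  }
  return graph;
}
```

With four labels of equal size, the full scan visits four times as many vertices as the label-index walk, and for a label that was never used the label-index walk visits none.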

The new code *could* be slower when building an index for a label that has a lot of vertices, if the label index has not been garbage collected recently and contains a lot of junk. If desired, this can be avoided with a simple check in the `BuildIndex` function (if label_index cardinality > total cardinality, iterate over all vertices instead).
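
A minimal sketch of that check, assuming both cardinalities are available as plain counts. `ScanSource` and `ChooseScanSource` are hypothetical names used only for illustration; this commit does not add such a check.

```cpp
#include <cassert>
#include <cstddef>

// Which collection the index build should iterate over.
enum class ScanSource { kLabelIndex, kAllVertices };

// Hypothetical helper: a label index that holds more entries than there are
// vertices in total must contain junk, so a full scan is the cheaper choice.
// Otherwise the label index is at most as large as the full vertex set.
ScanSource ChooseScanSource(std::size_t label_index_cardinality,
                            std::size_t total_cardinality) {
  return label_index_cardinality > total_cardinality
             ? ScanSource::kAllVertices
             : ScanSource::kLabelIndex;
}
```

The comparison is deliberately strict, so a clean index covering every vertex still goes through the label index.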

Reviewers: buda, mislav.bradac, teon.banek

Reviewed By: mislav.bradac

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D767
florijan 2017-09-08 10:30:43 +02:00
parent 6edf2cc5ab
commit 5066b1b80d

@@ -94,19 +94,17 @@ void GraphDbAccessor::BuildIndex(const GraphDbTypes::Label &label,
     wait_transaction->Commit();
   }
-  // This transaction surely sees everything that happened before CreateIndex.
-  auto transaction = db_.tx_engine_.Begin();
-  for (auto vertex_vlist : db_.vertices_.access()) {
-    auto vertex_record = vertex_vlist->find(*transaction);
-    // Check if visible record exists, if it exists apply function on it.
-    if (vertex_record == nullptr) continue;
-    db_.label_property_index_.UpdateOnLabelProperty(vertex_vlist,
-                                                    vertex_record);
+  // This accessor's transaction surely sees everything that happened before
+  // CreateIndex.
+  GraphDbAccessor dba(db_);
+  for (auto vertex : dba.Vertices(label, false)) {
+    db_.label_property_index_.UpdateOnLabelProperty(vertex.vlist_,
+                                                    vertex.current_);
   }
   // Commit transaction as we finished applying method on newest visible
   // records.
-  transaction->Commit();
+  dba.Commit();
   // After these two operations we are certain that everything is contained in
   // the index under the assumption that this transaction contained no
   // vertex/edge insert/update before this method was invoked.