Reviewers: buda, msantl, ipaljak Reviewed By: ipaljak Subscribers: teon.banek, pullbot Differential Revision: https://phabricator.memgraph.io/D1476
5.1 KiB
DatabaseAccessor
A DatabaseAccessor
actually wraps a transactional access to database
data, for a single transaction. In that sense the naming is bad. It
encapsulates references to the database and the transaction object.
It contains logic for working with database content (graph element
data) in the context of a single transaction. All CRUD operations are
performed within a single transaction (as Memgraph is a transactional
database), and therefore iteration over data, finding a specific graph
element etc are all functionalities of a GraphDbAccessor
.
In single-node Memgraph the database accessor also defined the lifetime
of a transaction. Even though a Transaction
object was owned by the
transactional engine, it was GraphDbAccessor
's lifetime that object
was bound to (the transaction was implicitly aborted in
GraphDbAccessor
's destructor, if it was not explicitly ended before
that).
RecordAccessor
It is important to understand data organization and access in the storage layer. This discussion pertains to vertices and edges as graph elements that the end client works with.
Memgraph uses MVCC (documented on it's own page). This means that for
each graph element there could be different versions visible to
different currently executing transactions. When we talk about a
Vertex
or Edge
as a data structure we typically mean one of those
versions. In code this semantic is implemented so that both those classes
inherit mvcc::Record
, which in turn inherits mvcc::Version
.
Handling MVCC and visibility is not in itself trivial. Next to that,
there is other book-keeping to be performed when working with data. For
that reason, Memgraph uses "accessors" to define an API of working with
data in a safe way. Most of the code in Memgraph (for example the
interpretation code) should work with accessors. There is a
RecordAccessor
as a base class for VertexAccessor
and
EdgeAccessor
. Following is an enumeration of their purpose.
Data access
The client interacts with Memgraph using the Cypher query language. That language has certain semantics which imply that multiple versions of the data need to be visible during the execution of a single query. For example: expansion over the graph is always done over the graph state as it was at the beginning of the transaction.
The RecordAccessor
exposes functions to switch between the old and the new
versions of the same graph element (intelligently named SwitchOld
and
SwitchNew
) within a single transaction. In that way the client code
(mostly the interpreter) can avoid dealing with the underlying MVCC
version concepts.
Updates
Data updates are also done through accessors. Meaning: there are methods
on the accessors that modify data, the client code should almost never
interact directly with Vertex
or Edge
objects.
The accessor layer takes care of creating version in the MVCC layer and performing updates on appropriate versions.
Next, for many kinds of updates it is necessary to update the relevant indexes. There are implicit indexes for vertex labels, as well as user-created indexes for (label, property) pairs. The accessor layer takes care of updating the indexes when these values are changed.
Each update also triggers a log statement in the write-ahead log. This is also handled by the accessor layer.
Distributed
In distributed Memgraph accessors also contain a lot of the remote graph element handling logic. More info on that is available in the documentation for distributed.
Deferred MVCC data lookup for Edges
Vertices and edges are versioned using MVCC. This means that for each transaction an MVCC lookup needs to be done to determine which version is visible to that transaction. This tends to slow things down due to cache invalidations (version lists and versions are stored in arbitrary locations on the heap).
However, for edges, only the properties are mutable. The edge endpoints
and type are fixed once the edge is created. For that reason both edge
endpoints and type are available in vertex data, so that when expanding
it is not mandatory to do MVCC lookups of versioned, mutable data. This
logic is implemented in RecordAccessor
and EdgeAccessor
.
Exposure
The original idea and implementation of graph element accessors was that
they'd prevent client code from ever interacting with raw Vertex
or
Edge
data. This however turned out to be impractical when implementing
distributed Memgraph and the raw data members have since been exposed
(through getters to old and new version pointers). However, refrain from
working with that data directly whenever possible! Always consider the
accessors to be the first go-to for interacting with data, especially
when in the context of a transaction.
Skiplist accessor
The term "accessor" is also used in the context of a skiplist. Every operation on a skiplist must be performed within on an accessor. The skiplist ensures that there will be no physical deletions of an object during the lifetime of an accessor. This mechanism is used to ensure deletion correctness in a highly concurrent container. We only mention that here to avoid confusion regarding terminology.