diff --git a/docs/user_technical/concept__indexing.md b/docs/user_technical/concept__indexing.md new file mode 100644 index 000000000..034de92fc --- /dev/null +++ b/docs/user_technical/concept__indexing.md @@ -0,0 +1,96 @@ +## Indexing {#indexing-concept} + +### Introduction + +A database index is a data structure used to improve the speed of data retrieval +within a database at the cost of additional writes and storage space for +maintaining the index data structure. + +Armed with deep understanding of their data model and use-case, users can decide +which data to index and, by doing so, significantly improve their data retrieval +efficiency + +### Index Types + +At Memgraph, we support two types of indexes: + + * label index + * label-property index + +Label indexing is enabled by default in Memgraph, i.e., Memgraph automatically +indexes labeled data. By doing so we optimize queries which fetch nodes by +label: + +```opencypher +MATCH (n: Label) ... RETURN n +``` + +Indexes can also be created on data with a specific combination of label and +property, hence the name label-property index. This operation needs to be +specified by the user and should be used with a specific data model and +use-case in mind. + +For example, suppose we are storing information about certain people in our +database and we are often interested in retrieving their age. In that case, +it might be beneficial to create an index on nodes labeled as `:Person` which +have a property named `age`. We can do so by using the following language +construct: + +```opencypher +CREATE INDEX ON :Person(age) +``` + +After the creation of that index, those queries will be more efficient due to +the fact that Memgraph's query engine will not have to fetch each `:Person` node +and check whether the property exists. Moreover, even if all nodes labeled as +`:Person` had an `age` property, creating such index might still prove to be +beneficial. The main reason is that entries within that index are kept sorted +by property value. Queries such as the following are therefore more efficient: + +```opencypher +MATCH (n :Person {age: 42}) RETURN n +``` + +Index based retrieval can also be invoked on queries with `WHERE` statements. +For instance, the following query will have the same effect as the previous +one: + +```opencypher +MATCH (n) WHERE n:Person AND n.age = 42 RETURN n +``` + +Naturally, indexes will also be used when filtering based on less than or +greater than comparisons. For example, filtering all minors (persons +under 18 years of age under Croatian law) using the following query will use +index based retrieval: + +```opencypher +MATCH (n) WHERE n:PERSON and n.age < 18 RETURN n +``` + +Bear in mind that `WHERE` filters could contain arbitrarily complex expressions +and index based retrieval might not be used. Nevertheless, we are continually +improving our index usage recognition algorithms. + +### Underlying Implementation + +The central part of our index data structure is a highly-concurrent skip list. +Skip lists are probabilistic data structures that allow fast search within an +ordered sequence of elements. The structure itself is built in layers where the +bottom layer is an ordinary linked list that preserves the order. Each higher +level can be imagined as a highway for layers below. + +The implementation details behind skip list operations are well documented +in the literature and are out of scope for this article. Nevertheless, we +believe that it is important for more advanced users to understand the following +implications of this data structure (`n` denotes the current number of elements +in a skip list): + + * Average insertion time is `O(log(n))` + * Average deletion time is `O(log(n))` + * Average search time is `O(log(n))` + * Average memory consumption is `O(n)` + +### Index Commands + + * [CREATE INDEX ON](reference__create_index.md) diff --git a/docs/user_technical/reference__create_index.md b/docs/user_technical/reference__create_index.md new file mode 100644 index 000000000..f04a44d58 --- /dev/null +++ b/docs/user_technical/reference__create_index.md @@ -0,0 +1,17 @@ +## CREATE INDEX + +### Summary + +Create an index on the specified label, property pair. + +### Syntax + +```opencypher +CREATE INDEX ON :<label_name>(<property_name>) +``` + +### Remarks + + * `label_name` is the name of the record label. + * `property_name` is the name of the property within a record. + * At the moment, created indexes cannot be deleted.