Add indexing concept and reference in user_techincal

Reviewers: buda, dtomicevic Reviewed By: buda Differential Revision: https://phabricator.memgraph.io/D1485
2018-08-03 13:07:30 +02:00 · 2018-08-03 13:07:30 +02:00 · ff5eba73e0
commit ff5eba73e0
parent d51be890d2
2 changed files with 113 additions and 0 deletions
--- a/docs/user_technical/concept__indexing.md
+++ b/docs/user_technical/concept__indexing.md
@ -0,0 +1,96 @@
+## Indexing {#indexing-concept}
+
+### Introduction
+
+A database index is a data structure used to improve the speed of data retrieval
+within a database at the cost of additional writes and storage space for
+maintaining the index data structure.
+
+Armed with deep understanding of their data model and use-case, users can decide
+which data to index and, by doing so, significantly improve their data retrieval
+efficiency
+
+### Index Types
+
+At Memgraph, we support two types of indexes:
+
+  * label index
+  * label-property index
+
+Label indexing is enabled by default in Memgraph, i.e., Memgraph automatically
+indexes labeled data. By doing so we optimize queries which fetch nodes by
+label:
+
+```opencypher
+MATCH (n: Label) ... RETURN n
+```
+
+Indexes can also be created on data with a specific combination of label and
+property, hence the name label-property index. This operation needs to be
+specified by the user and should be used with a specific data model and
+use-case in mind.
+
+For example, suppose we are storing information about certain people in our
+database and we are often interested in retrieving their age. In that case,
+it might be beneficial to create an index on nodes labeled as `:Person` which
+have a property named `age`. We can do so by using the following language
+construct:
+
+```opencypher
+CREATE INDEX ON :Person(age)
+```
+
+After the creation of that index, those queries will be more efficient due to
+the fact that Memgraph's query engine will not have to fetch each `:Person` node
+and check whether the property exists. Moreover, even if all nodes labeled as
+`:Person` had an `age` property, creating such index might still prove to be
+beneficial. The main reason is that entries within that index are kept sorted
+by property value. Queries such as the following are therefore more efficient:
+
+```opencypher
+MATCH (n :Person {age: 42}) RETURN n
+```
+
+Index based retrieval can also be invoked on queries with `WHERE` statements.
+For instance, the following query will have the same effect as the previous
+one:
+
+```opencypher
+MATCH (n) WHERE n:Person AND n.age = 42 RETURN n
+```
+
+Naturally, indexes will also be used when filtering based on less than or
+greater than comparisons. For example, filtering all minors (persons
+under 18 years of age under Croatian law) using the following query will use
+index based retrieval:
+
+```opencypher
+MATCH (n) WHERE n:PERSON and n.age < 18 RETURN n
+```
+
+Bear in mind that `WHERE` filters could contain arbitrarily complex expressions
+and index based retrieval might not be used. Nevertheless, we are continually
+improving our index usage recognition algorithms.
+
+### Underlying Implementation
+
+The central part of our index data structure is a highly-concurrent skip list.
+Skip lists are probabilistic data structures that allow fast search within an
+ordered sequence of elements. The structure itself is built in layers where the
+bottom layer is an ordinary linked list that preserves the order. Each higher
+level can be imagined as a highway for layers below.
+
+The implementation details behind skip list operations are well documented
+in the literature and are out of scope for this article. Nevertheless, we
+believe that it is important for more advanced users to understand the following
+implications of this data structure (`n` denotes the current number of elements
+in a skip list):
+
+  * Average insertion time is `O(log(n))`
+  * Average deletion time is `O(log(n))`
+  * Average search time is `O(log(n))`
+  * Average memory consumption is `O(n)`
+
+### Index Commands
+
+  * [CREATE INDEX ON](reference__create_index.md)
--- a/docs/user_technical/reference__create_index.md
+++ b/docs/user_technical/reference__create_index.md
@ -0,0 +1,17 @@
+## CREATE INDEX
+
+### Summary
+
+Create an index on the specified label, property pair.
+
+### Syntax
+
+```opencypher
+CREATE INDEX ON :<label_name>(<property_name>)
+```
+
+### Remarks
+
+  * `label_name` is the name of the record label.
+  * `property_name` is the name of the property within a record.
+  * At the moment, created indexes cannot be deleted.