2017-10-26 00:02:44 +08:00
|
|
|
DISCLAIMER: this is just an initial test, graph might not resemble
|
|
|
|
the graph in the use case at all and the data might be completely
|
|
|
|
irrelevant.
|
|
|
|
|
|
|
|
We tried generating a few sample graphs from the vague description
|
|
|
|
given in the use case doc. Then we tried writing queries that would
|
|
|
|
solve the problem of updating nodes when a leaf value changes,
|
|
|
|
assuming all the internal nodes compute only the sum function.
|
|
|
|
|
|
|
|
We start by creating an index on `id` property to improve initial lookup
|
|
|
|
performance:
|
|
|
|
|
|
|
|
CREATE INDEX ON :Leaf(id)
|
|
|
|
|
|
|
|
Set values of all leafs to 1:
|
|
|
|
|
|
|
|
MATCH (u:Leaf) SET u.value = 1
|
|
|
|
|
|
|
|
Now we initialize the values of all other nodes in the graph:
|
|
|
|
|
|
|
|
MATCH (u) WHERE NOT u:Leaf SET u.value = 0
|
|
|
|
|
|
|
|
MATCH (u) WITH u
|
|
|
|
ORDER BY u.topological_index DESC
|
|
|
|
MATCH (u)-->(v) SET u.value = u.value + v.value
|
|
|
|
|
|
|
|
Change the value of a leaf:
|
|
|
|
|
2017-10-27 16:27:59 +08:00
|
|
|
MATCH (u:Leaf {id: "18"}) SET u.value = 10
|
2017-10-26 00:02:44 +08:00
|
|
|
|
|
|
|
We have to reset all the updated nodes to a neutral element:
|
|
|
|
|
|
|
|
MATCH (u:Leaf {id: "18"})<-[* bfs]-(v)
|
|
|
|
WHERE NOT v:Leaf SET v.value = 0
|
|
|
|
|
|
|
|
Finally, we recalculate their values in topological order:
|
|
|
|
|
|
|
|
MATCH (u:Leaf {id: "18"})<-[* bfs]-(v)
|
|
|
|
WITH v ORDER BY v.topological_index DESC
|
|
|
|
MATCH (v)-->(w) SET v.value = v.value + w.value
|
|
|
|
|
|
|
|
There are a few assumptions made worth pointing out.
|
|
|
|
|
|
|
|
* We are able to efficiently maintain topological order
|
|
|
|
of vertices in the graph.
|
|
|
|
|
|
|
|
* It is possible to accumulate the value of the function. Formally:
|
2017-10-27 16:27:59 +08:00
|
|
|
$$f(x_1, x_2, ..., x_n) = g(...(g(g(x_1, x_2), x_3), ...), x_n)$$
|
2017-10-26 00:02:44 +08:00
|
|
|
|
|
|
|
* There is a neutral element for the operation. However, this
|
|
|
|
assumption can be dropped by introducing an artificial neutral element.
|
|
|
|
|
2017-10-27 16:27:59 +08:00
|
|
|
Number of operations required is proportional to the sum of degrees of affected
|
2017-10-26 00:02:44 +08:00
|
|
|
nodes.
|
|
|
|
|
|
|
|
We generated graph with $10^5$ nodes ($20\ 000$ nodes in each layer), varied the
|
|
|
|
degree distribution in node layers and measured time for the query to execute:
|
|
|
|
|
|
|
|
| # | Root-Category-Group degree | Group-CustomGroup-Leaf degree | Time |
|
|
|
|
|:-:|:---------------------------:|:-----------------------------:|:---------:|
|
|
|
|
| 1 | [1, 10] | [20, 40] | ~1.1s |
|
|
|
|
| 2 | [1, 10] | [50, 100] | ~2.5s |
|
|
|
|
| 3 | [10, 50] | [50, 100] | ~3.3s |
|
|
|
|
|
|
|
|
Due to the structure of the graph, update of a leaf required update of almost
|
|
|
|
all the nodes in the graph so we don't show times required for initial graph
|
|
|
|
update and update after leaf change separately.
|
|
|
|
|
|
|
|
However, there is not enough info on the use case to make the test more
|
|
|
|
sophisticated.
|