Add BFS and Dijkstra example

Reviewers: buda, msantl, teon.banek, ipaljak

Reviewed By: buda, ipaljak

Subscribers: pullbot

Differential Revision: https://phabricator.memgraph.io/D1423
Marin Tomic 2018-06-15 10:08:05 +02:00
parent b9be394cb2
commit b017283bfb
4 changed files with 131179 additions and 0 deletions


@ -331,6 +331,170 @@ WHERE p.date < q.date AND q.date < r.date
RETURN a.name AS Team1, b.name AS Team2, c.name AS Team3;
```
### European road network example
In this section we will show how to use some of Memgraph's built-in graph
algorithms. More specifically, we will show how to use the breadth-first search
graph traversal algorithm and Dijkstra's algorithm for finding weighted
shortest paths between nodes in the graph.
#### Data model
One of the most common applications of graph traversal algorithms is driving
route computation, so we will use a European road network graph as an example.
The graph consists of 999 major European cities from 39 countries in total.
Each city is connected to the country it belongs to via an edge of type `:In_`.
Edges of type `:Road` connect cities that are less than 500 kilometers apart.
The distance between two cities, in kilometers, is stored in the `length`
property of the edge.
#### Example queries
We have prepared a database snapshot for this example, so you can easily import
it when starting Memgraph using the `--durability-directory` option.
```bash
/usr/lib/memgraph/memgraph --durability-directory /usr/share/memgraph/examples/Europe \
--durability-enabled=false --snapshot-on-exit=false
```
When using Docker, you can import the example with the following command:
```bash
docker run -p 7687:7687 \
-v mg_lib:/var/lib/memgraph -v mg_log:/var/log/memgraph -v mg_etc:/etc/memgraph \
memgraph --durability-directory /usr/share/memgraph/examples/Europe \
--durability-enabled=false --snapshot-on-exit=false
```
Now you're ready to try out some of the following queries.
NOTE: If you modify the dataset, the changes will persist only for the duration
of this run of Memgraph.
Let's start off with a few simple queries.
1) Let's list all of the countries in our road network.
```opencypher
MATCH (c:Country) RETURN c.name ORDER BY c.name;
```
2) Which Croatian cities are in our road network?
```opencypher
MATCH (c:City)-[:In_]->(:Country {name: "Croatia"})
RETURN c.name ORDER BY c.name;
```
3) Which cities in our road network are less than 200 km away from Zagreb?
```opencypher
MATCH (:City {name: "Zagreb"})-[r:Road]->(c:City)
WHERE r.length < 200
RETURN c.name ORDER BY c.name;
```
Now let's try some queries using Memgraph's graph traversal capabilities.
4) Say you want to drive from Zagreb to Paris. You might wonder what the
smallest number of cities you have to visit is if you don't want to drive more
than 500 kilometers between stops. Since the edges in our road network don't
connect cities that are more than 500 km apart, this is a great use case for
the breadth-first search (BFS) algorithm.
```opencypher
MATCH p = (:City {name: "Zagreb"})
-[:Road * bfs]->
(:City {name: "Paris"})
RETURN nodes(p);
```
5) What if we want to bike to Paris instead of driving? It is unreasonable (and
dangerous!) to bike 500 km per day. Let's limit ourselves to biking no more
than 200 km in one go.
```opencypher
MATCH p = (:City {name: "Zagreb"})
-[:Road * bfs (e, v | e.length <= 200)]->
(:City {name: "Paris"})
RETURN nodes(p);
```
"What is this special syntax?", you might wonder.
`(e, v | e.length <= 200)` is called a *filter lambda*. It's a function that
takes an edge symbol `e` and a vertex symbol `v` and decides whether this edge
and vertex pair should be considered valid in breadth-first expansion by
returning true or false (or null). In the above example, the lambda returns
true if the edge length is not greater than 200, because we don't want to bike
more than 200 km in one go.
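To make the role of the filter lambda more concrete, here is a minimal Python
sketch of breadth-first search with a filter callback. It is only an
illustration of the idea and not Memgraph's implementation. The toy `ROADS`
map, the `bfs_path` function and the `edge_filter` parameter are all
hypothetical names made up for this example.

```python
from collections import deque

# Hypothetical toy road map (not the Europe dataset):
# city -> list of (neighbouring city, road length in km).
ROADS = {
    "A": [("B", 150), ("D", 450), ("E", 100)],
    "B": [("A", 150), ("D", 180)],
    "C": [("E", 120), ("D", 100)],
    "D": [("A", 450), ("B", 180), ("C", 100)],
    "E": [("A", 100), ("C", 120)],
}


def bfs_path(roads, source, target, edge_filter=lambda length, city: True):
    """Return a path with the fewest hops from source to target.

    An (edge, vertex) pair is expanded only if edge_filter returns a truthy
    value for it, which mimics the filter lambda described above.
    """
    parent = {source: None}  # also serves as the set of visited cities
    queue = deque([source])
    while queue:
        city = queue.popleft()
        if city == target:
            # Walk the parent links back to the source to rebuild the path.
            path = []
            while city is not None:
                path.append(city)
                city = parent[city]
            return path[::-1]
        for neighbour, length in roads[city]:
            if neighbour not in parent and edge_filter(length, neighbour):
                parent[neighbour] = city
                queue.append(neighbour)
    return None  # target is unreachable under the given filter
```

Calling `bfs_path(ROADS, "A", "D")` takes the direct 450 km road, while
`bfs_path(ROADS, "A", "D", lambda length, city: length <= 200)` has to go
through `B`, in the same way the bike query above has to avoid long roads.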
6) Let's say we also don't want to visit Vienna on our way to Paris, because we
have a lot of friends there and visiting all of them would take up a lot of our
time. We just have to update our filter lambda.
```opencypher
MATCH p = (:City {name: "Zagreb"})
-[:Road * bfs (e, v | e.length <= 200 AND v.name != "Vienna")]->
(:City {name: "Paris"})
RETURN nodes(p);
```
As you can see, without the additional restriction the route takes 11 stops
after leaving Zagreb (Paris included). If we want to avoid Vienna, it takes at
least 12 stops.
7) Instead of counting the cities visited, we might want to find the shortest
paths in terms of distance travelled. This is a textbook application of
Dijkstra's algorithm. The following query will return the list of cities on the
shortest path from Zagreb to Paris along with the total length of the path.
```opencypher
MATCH p = (:City {name: "Zagreb"})
-[:Road * wShortest (e, v | e.length) total_weight]->
(:City {name: "Paris"})
RETURN nodes(p) as cities, total_weight;
```
As you can see, the syntax is quite similar to the breadth-first search syntax.
Instead of a filter lambda, we need to provide a *weight lambda* and the *total
weight symbol*. Given an edge and vertex pair, the weight lambda must return
the cost of expanding to the given vertex using the given edge. The path
returned will have the smallest possible sum of costs, and that sum will be
stored in the total weight symbol. A limitation of Dijkstra's algorithm is that
the cost must be non-negative.
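The following Python sketch shows how a weight lambda and a total weight fit
into Dijkstra's algorithm. It is a simplified illustration under our own
assumptions, not the code Memgraph runs. The toy `ROADS` map and the
`weighted_shortest_path` function are hypothetical.

```python
import heapq

# Hypothetical toy road map (not the Europe dataset):
# city -> list of (neighbouring city, road length in km).
ROADS = {
    "A": [("B", 150), ("D", 450), ("E", 100)],
    "B": [("A", 150), ("D", 180)],
    "C": [("E", 120), ("D", 100)],
    "D": [("A", 450), ("B", 180), ("C", 100)],
    "E": [("A", 100), ("C", 120)],
}


def weighted_shortest_path(roads, source, target,
                           weight=lambda length, city: length):
    """Return (path, total_weight) for the cheapest path, or (None, None)."""
    best = {source: 0}       # cheapest known total weight per city
    parent = {source: None}
    heap = [(0, source)]     # priority queue ordered by total weight
    while heap:
        total, city = heapq.heappop(heap)
        if total > best[city]:
            continue  # stale queue entry, a cheaper one was already handled
        if city == target:
            path = []
            while city is not None:
                path.append(city)
                city = parent[city]
            return path[::-1], total
        for neighbour, length in roads[city]:
            cost = weight(length, neighbour)  # plays the role of the weight lambda
            if cost < 0:
                raise ValueError("Dijkstra's algorithm needs non-negative costs")
            candidate = total + cost
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                parent[neighbour] = city
                heapq.heappush(heap, (candidate, neighbour))
    return None, None
```

Here `weighted_shortest_path(ROADS, "A", "D")` prefers the three short roads
through `E` and `C` over the single direct road, because their lengths add up
to less.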
8) We can also combine weight and filter lambdas in the shortest-path query.
Let's say we're interested in the shortest path that doesn't require travelling
more than 200 km in one go for our bike route.
```opencypher
MATCH p = (:City {name: "Zagreb"})
-[:Road * wShortest (e, v | e.length) total_weight (e, v | e.length <= 200)]->
(:City {name: "Paris"})
RETURN nodes(p) as cities, total_weight;
```
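One way to picture the combination of the two lambdas is a Dijkstra loop that
first asks the filter whether an edge may be used at all, and only then asks
the weight function what it costs. The Python sketch below follows that idea.
It is an assumption about the general technique rather than a description of
Memgraph's internals, and `BIKE_ROADS` and `filtered_shortest_path` are
hypothetical names.

```python
import heapq

# Hypothetical toy road map (not the Europe dataset):
# city -> list of (neighbouring city, road length in km).
BIKE_ROADS = {
    "A": [("B", 250), ("C", 120)],
    "B": [("A", 250), ("D", 60)],
    "C": [("A", 120), ("D", 200)],
    "D": [("B", 60), ("C", 200)],
}


def filtered_shortest_path(roads, source, target, weight, edge_filter):
    """Dijkstra's algorithm restricted to edges accepted by edge_filter."""
    best = {source: 0}
    parent = {source: None}
    heap = [(0, source)]
    while heap:
        total, city = heapq.heappop(heap)
        if total > best[city]:
            continue  # stale queue entry
        if city == target:
            path = []
            while city is not None:
                path.append(city)
                city = parent[city]
            return path[::-1], total
        for neighbour, length in roads[city]:
            if not edge_filter(length, neighbour):
                continue  # the filter rejects this edge and vertex pair
            candidate = total + weight(length, neighbour)
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                parent[neighbour] = city
                heapq.heappush(heap, (candidate, neighbour))
    return None, None


by_length = lambda length, city: length
no_limit = lambda length, city: True
bike_limit = lambda length, city: length <= 200
```

With `no_limit` the cheapest route from `A` to `D` uses the 250 km road to
`B`. With `bike_limit` that road is excluded, so the result is the slightly
longer route through `C`.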
9) Let's try to find the 10 cities that are furthest away from Zagreb.
```opencypher
MATCH (:City {name: "Zagreb"})
-[:Road * wShortest (e, v | e.length) total_weight]->
(c:City)
RETURN c, total_weight
ORDER BY total_weight DESC LIMIT 10;
```
It is not surprising to see that they are all in Russia, many of them in
Siberia and the Russian Far East.
To learn more about these algorithms, we suggest you check out their Wikipedia
pages:
* [Breadth-first search](https://en.wikipedia.org/wiki/Breadth-first_search)
* [Dijkstra's algorithm](https://en.wikipedia.org/wiki/Dijkstra%27s_algorithm)
Now you're ready to explore the world of graph databases with Memgraph
by yourself and try it on many more examples and datasets.

File diff suppressed because it is too large


@ -0,0 +1,175 @@
Feature: European Road Network Example
Scenario: List all countries in the dataset
Given graph "europe"
When executing query:
"""
MATCH (c:Country) RETURN c.name ORDER BY c.name
"""
Then the result should be:
| c.name |
| 'Albania' |
| 'Austria' |
| 'Belarus' |
| 'Belgium' |
| 'Bosnia and Herzegovina' |
| 'Bulgaria' |
| 'Croatia' |
| 'Cyprus' |
| 'Czechia' |
| 'Denmark' |
| 'Estonia' |
| 'Finland' |
| 'France' |
| 'Germany' |
| 'Greece' |
| 'Hungary' |
| 'Iceland' |
| 'Ireland' |
| 'Italy' |
| 'Kosovo' |
| 'Latvia' |
| 'Lithuania' |
| 'Macedonia' |
| 'Moldova' |
| 'Montenegro' |
| 'Netherlands' |
| 'Norway' |
| 'Poland' |
| 'Portugal' |
| 'Romania' |
| 'Russia' |
| 'Serbia' |
| 'Slovakia' |
| 'Slovenia' |
| 'Spain' |
| 'Sweden' |
| 'Switzerland' |
| 'Ukraine' |
| 'United Kingdom' |
Scenario: Find all Croatian cities in the road network
Given graph "europe"
When executing query:
"""
MATCH (c:City)-[:In_]->(:Country {name: "Croatia"})
RETURN c.name ORDER BY c.name
"""
Then the result should be:
| c.name |
| 'Osijek' |
| 'Rijeka' |
| 'Split' |
| 'Zagreb' |
Scenario: Find cities less than 200 km from Zagreb
Given graph "europe"
When executing query:
"""
MATCH (:City {name: "Zagreb"})-[r:Road]->(c:City)
WHERE r.length < 200
RETURN c.name ORDER BY c.name;
"""
Then the result should be:
| c.name |
| 'Banja Luka' |
| 'Graz' |
| 'Ljubljana' |
| 'Maribor' |
| 'Rijeka' |
Scenario: Shortest path from Zagreb to Paris (BFS)
Given graph "europe"
When executing query:
"""
MATCH p = (:City {name: "Zagreb"})
-[:Road * bfs]->
(:City {name: "Paris"})
RETURN nodes(p);
"""
Then the result should be:
| nodes(p) |
| [(:City {name: 'Zagreb'}), (:City {name: 'Bolzano'}), (:City {name: 'Mulhouse'}), (:City {name: 'Paris'})] |
Scenario: Shortest path from Zagreb to Paris (BFS <= 200km)
Given graph "europe"
When executing query:
"""
MATCH p = (:City {name: "Zagreb"})
-[:Road * bfs (e, v | e.length <= 200)]->
(:City {name: "Paris"})
RETURN nodes(p);
"""
Then the result should be:
| nodes(p) |
| [(:City{name:'Zagreb'}),(:City{name:'Graz'}),(:City{name:'Vienna'}),(:City{name:'Linz'}),(:City{name:'Salzburg'}),(:City{name:'Munich'}),(:City{name:'Augsburg'}),(:City{name:'Esslingen'}),(:City{name:'Ludwigshafen am Rhein'}),(:City{name:'Metz'}),(:City{name:'Reims'}),(:City{name:'Paris'})] |
Scenario: Shortest path from Zagreb to Paris (BFS <= 200km, no Vienna)
Given graph "europe"
When executing query:
"""
MATCH p = (:City {name: "Zagreb"})
-[:Road * bfs (e, v | e.length <= 200 AND v.name != "Vienna")]->
(:City {name: "Paris"})
RETURN nodes(p);
"""
Then the result should be:
| nodes(p) |
| [(:City{name:'Zagreb'}),(:City{name:'Ljubljana'}),(:City{name:'Trieste'}),(:City{name:'Mestre'}),(:City{name:'Trento'}),(:City{name:'Innsbruck'}),(:City{name:'Munich'}),(:City{name:'Augsburg'}),(:City{name:'Esslingen'}),(:City{name:'Ludwigshafen am Rhein'}),(:City{name:'Metz'}),(:City{name:'Reims'}),(:City{name:'Paris'})] |
Scenario: Shortest path from Zagreb to Paris (Dijkstra)
Given graph "europe"
When executing query:
"""
MATCH p = (:City {name: "Zagreb"})
-[:Road * wShortest (e, v | e.length) total_weight]->
(:City {name: "Paris"})
RETURN nodes(p) as cities;
"""
Then the result should be:
| cities |
| [(:City{name:'Zagreb'}),(:City{name:'Ljubljana'}),(:City{name:'Bolzano'}),(:City{name:'Basel'}),(:City{name:'Créteil'}),(:City{name:'Paris'})] |
Scenario: Shortest path from Zagreb to Paris (Dijkstra <= 200km)
Given graph "europe"
When executing query:
"""
MATCH p = (:City {name: "Zagreb"})
-[:Road * wShortest (e, v | e.length) total_weight (e, v | e.length <= 200)]->
(:City {name: "Paris"})
RETURN nodes(p) as cities;
"""
Then the result should be:
| cities |
| [(:City{name:'Zagreb'}),(:City{name:'Graz'}),(:City{name:'Vienna'}),(:City{name:'Linz'}),(:City{name:'Salzburg'}),(:City{name:'Munich'}),(:City{name:'Augsburg'}),(:City{name:'Pforzheim'}),(:City{name:'Saarbrücken'}),(:City{name:'Metz'}),(:City{name:'Reims'}),(:City{name:'Paris'})] |
Scenario: Ten cities furthest away from Zagreb
Given graph "europe"
When executing query:
"""
MATCH (:City {name: "Zagreb"})
-[:Road * wShortest (e, v | e.length) total_weight]->
(c:City)
RETURN c.name
ORDER BY total_weight DESC LIMIT 10;
"""
Then the result should be:
| c.name |
| 'Norilsk' |
| 'Vorkuta' |
| 'Novyy Urengoy' |
| 'Noyabrsk' |
| 'Ukhta' |
| 'Severodvinsk' |
| 'Arkhangelsk' |
| 'Khabarovsk' |
| 'Blagoveshchensk' |
| 'Petropavlovsk-Kamchatsky' |

File diff suppressed because it is too large