6294bf19ec
Reviewers: mtomic, buda, florijan, mferencevic Reviewed By: florijan Subscribers: pullbot Differential Revision: https://phabricator.memgraph.io/D1073
119 lines
3.3 KiB
Markdown
119 lines
3.3 KiB
Markdown
## Import Tools
|
|
|
|
Memgraph comes with tools for importing data into the database. Currently,
|
|
only import of CSV formatted is supported. We plan to support more formats in
|
|
the future.
|
|
|
|
### CSV Import Tool
|
|
|
|
CSV data should be in Neo4j CSV compatible format. Detailed format
|
|
specification can be found
|
|
[here](https://neo4j.com/docs/operations-manual/current/tools/import/file-header-format/).
|
|
|
|
The import tool is run from the console, using the `mg_import_csv` command.
|
|
|
|
If you installed Memgraph using Docker, you will need to run the importer
|
|
using the following command:
|
|
|
|
```
|
|
docker run -v mg_lib:/var/lib/memgraph -v mg_etc:/etc/memgraph -v mg_import:/import-data \
|
|
--entrypoint=mg_import_csv memgraph
|
|
```
|
|
|
|
You can pass CSV files containing node data using the `--nodes` option.
|
|
Multiple files can be specified by repeating the `--nodes` option. At least
|
|
one node file should be specified. Similarly, graph edges (also known as
|
|
relationships) are passed via the `--relationships` option. Multiple
|
|
relationship files are imported by repeating the option. Unlike nodes,
|
|
relationships are not required.
|
|
|
|
After reading the CSV files, the tool will by default search for the installed
|
|
Memgraph configuration. If the configuration is found, the data will be
|
|
written in the configured durability directory. If the configuration isn't
|
|
found, you will need to use the `--out` option to specify the output file. You
|
|
can use the same option to override the default behaviour.
|
|
|
|
Memgraph will recover the imported data on the next startup by looking in the
|
|
durability directory.
|
|
|
|
For information on other options, run:
|
|
|
|
```
|
|
mg_import_csv --help
|
|
```
|
|
|
|
When using Docker, this translates to:
|
|
|
|
```
|
|
docker run --entrypoint=mg_import_csv memgraph --help
|
|
```
|
|
|
|
#### Example
|
|
|
|
Let's import a simple dataset.
|
|
|
|
Store the following in `comment_nodes.csv`.
|
|
|
|
```
|
|
id:ID(COMMENT_ID),country:string,browser:string,content:string,:LABEL
|
|
0,Croatia,Chrome,yes,Message;Comment
|
|
1,United Kingdom,Chrome,thanks,Message;Comment
|
|
2,Germany,,LOL,Message;Comment
|
|
3,France,Firefox,I see,Message;Comment
|
|
4,Italy,Internet Explorer,fine,Message;Comment
|
|
```
|
|
|
|
Now, let's add `forum_nodes.csv`.
|
|
|
|
```
|
|
id:ID(FORUM_ID),title:string,:LABEL
|
|
0,General,Forum
|
|
1,Support,Forum
|
|
2,Music,Forum
|
|
3,Film,Forum
|
|
4,Programming,Forum
|
|
```
|
|
|
|
And finally, set relationships between comments and forums in
|
|
`relationships.csv`.
|
|
|
|
```
|
|
:START_ID(COMMENT_ID),:END_ID(FORUM_ID),:TYPE
|
|
0,0,POSTED_ON
|
|
1,1,POSTED_ON
|
|
2,2,POSTED_ON
|
|
3,3,POSTED_ON
|
|
4,4,POSTED_ON
|
|
```
|
|
|
|
Now, you can import the dataset in Memgraph.
|
|
|
|
WARNING: Your existing recovery data will be considered obsolete, and Memgraph
|
|
will load the new dataset.
|
|
|
|
Use the following command:
|
|
|
|
```
|
|
mg_import_csv --nodes=comment_nodes.csv --nodes=forum_nodes.csv --relationships=relationships.csv
|
|
```
|
|
|
|
If using Docker, things are a bit more complicated. First you need to move the
|
|
CSV files where the Docker image can see them:
|
|
|
|
```
|
|
mkdir -p /var/lib/docker/volumes/mg_import/_data
|
|
cp comment_nodes.csv forum_nodes.csv relationships.csv /var/lib/docker/volumes/mg_import/_data
|
|
```
|
|
|
|
Then, run the importer with the following:
|
|
|
|
```
|
|
docker run -v mg_lib:/var/lib/memgraph -v mg_etc:/etc/memgraph -v mg_import:/import-data \
|
|
--entrypoint=mg_import_csv memgraph \
|
|
--nodes=/import-data/comment_nodes.csv --nodes=/import-data/forum_nodes.csv \
|
|
--relationships=/import-data/relationships.csv
|
|
```
|
|
|
|
Next time you run Memgraph, the dataset will be loaded.
|
|
|