Reviewers: buda Reviewed By: buda Differential Revision: https://phabricator.memgraph.io/D1432
3.4 KiB
Import Tools
Memgraph comes with tools for importing data into the database. Currently, only import of CSV formatted is supported. We plan to support more formats in the future.
CSV Import Tool
CSV data should be in Neo4j CSV compatible format. Detailed format specification can be found here.
The import tool is run from the console, using the mg_import_csv
command.
If you installed Memgraph using Docker, you will need to run the importer using the following command:
docker run -v mg_lib:/var/lib/memgraph -v mg_etc:/etc/memgraph -v mg_import:/import-data \
--entrypoint=mg_import_csv memgraph
You can pass CSV files containing node data using the --nodes
option.
Multiple files can be specified by repeating the --nodes
option. At least
one node file should be specified. Similarly, graph edges (also known as
relationships) are passed via the --relationships
option. Multiple
relationship files are imported by repeating the option. Unlike nodes,
relationships are not required.
After reading the CSV files, the tool will by default search for the installed
Memgraph configuration. If the configuration is found, the data will be
written in the configured durability directory. If the configuration isn't
found, you will need to use the --out
option to specify the output file. You
can use the same option to override the default behaviour.
Memgraph will recover the imported data on the next startup by looking in the durability directory.
For information on other options, run:
mg_import_csv --help
When using Docker, this translates to:
docker run --entrypoint=mg_import_csv memgraph --help
Example
Let's import a simple dataset.
Store the following in comment_nodes.csv
.
id:ID(COMMENT_ID),country:string,browser:string,content:string,:LABEL
0,Croatia,Chrome,yes,Message;Comment
1,United Kingdom,Chrome,thanks,Message;Comment
2,Germany,,LOL,Message;Comment
3,France,Firefox,I see,Message;Comment
4,Italy,Internet Explorer,fine,Message;Comment
Now, let's add forum_nodes.csv
.
id:ID(FORUM_ID),title:string,:LABEL
0,General,Forum
1,Support,Forum
2,Music,Forum
3,Film,Forum
4,Programming,Forum
And finally, set relationships between comments and forums in
relationships.csv
.
:START_ID(COMMENT_ID),:END_ID(FORUM_ID),:TYPE
0,0,POSTED_ON
1,1,POSTED_ON
2,2,POSTED_ON
3,3,POSTED_ON
4,4,POSTED_ON
Now, you can import the dataset in Memgraph.
WARNING: Your existing recovery data will be considered obsolete, and Memgraph will load the new dataset.
Use the following command:
mg_import_csv --nodes=comment_nodes.csv --nodes=forum_nodes.csv --relationships=relationships.csv
If using Docker, things are a bit more complicated. First you need to move the CSV files where the Docker image can see them:
mkdir -p /var/lib/docker/volumes/mg_import/_data
cp comment_nodes.csv forum_nodes.csv relationships.csv /var/lib/docker/volumes/mg_import/_data
Then, run the importer with the following:
docker run -v mg_lib:/var/lib/memgraph -v mg_etc:/etc/memgraph -v mg_import:/import-data \
--entrypoint=mg_import_csv memgraph \
--nodes=/import-data/comment_nodes.csv --nodes=/import-data/forum_nodes.csv \
--relationships=/import-data/relationships.csv
Next time you run Memgraph, the dataset will be loaded.