Summary: Once a snapshot is successfully written, delete WAL files which are no longer necessary for recovery. Note that this prohibits recovering the WAL from any except the last snapshot.
Reviewers: buda, mislav.bradac, dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1000
Summary:
This diff contains step 1:
- Remove clog exposure from tx::engine
- Reduce and cleanup tx::Engine API
All current functionality is kept, but the API is reduced. This is very
desirable because every function in tx::Engine will need to be
considered and implemented in both Master and Worker situations. The
less we have, the better.
Next step is exactly that: seeing how each of these functions behaves in
a distributed system and implementing accordingly.
Reviewers: dgleich, mislav.bradac, buda
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1008
Summary:
Referring to the TCK failure on:
```
MATCH (a {name: 'Andres'})<-[:FATHER]-(child)
RETURN {foo: a.name='Andres', kids: collect(child.name)}
```
In the planner we'd only treat a list|map as a group_by if it contained
no aggregations. That's changed so that if a map contains both aggregations
and non-aggregations, then non-aggregations are treated as individual
group_by expressions.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: buda, pullbot, teon.banek
Differential Revision: https://phabricator.memgraph.io/D1004
Summary:
In preparation for distributed storage we need to have labels/properties/edgetypes uniquely identifiable by their ids, which will be global in near future.
The old design has to be abandoned because it's not possible to keep track of global labels/properties/edgetypes while they are local pointers.
Reviewers: mislav.bradac, florijan
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D993
Summary:
Looking for connected components in a random graph. This test performs the following:
- Generates a random graph that is NOT sequential in memory (otherwise itertion over edges is 2 or more times faster).
- Connectivity by iterating over all the edges.
- Ditto over vertices.
- Ditto over vertices in parallel.
Not done:
- Edge filtering based on XY. I could/should add that to see how it affects perf.
- Getting component info out from union-find.
Local results are encouraging. Iterating over the graph is the bottleneck. Still, I get connectivity of 10M vertices/edges in <7sec (parallel over vertices). Will test on 250M remote now.
Locally obtained results (20M/20M, 2 threads)
```
I1115 14:57:55.136875 357 otto_parallel.cpp:50] Generating 2000000 vertices...
I1115 14:58:19.057734 357 otto_parallel.cpp:74] Generated 2000000 vertices in 23.9208 seconds.
I1115 14:58:19.919221 357 otto_parallel.cpp:82] Generating 2000000 edges...
I1115 14:58:39.519951 357 otto_parallel.cpp:93] Generated 2000000 edges in 19.3398 seconds.
I1115 14:58:39.520349 357 otto_parallel.cpp:196] Running Edge iteration...
I1115 14:58:43.857264 357 otto_parallel.cpp:199] Done in 4.33691 seconds, result: 3999860270398
I1115 14:58:43.857316 357 otto_parallel.cpp:196] Running Vertex iteration...
I1115 14:58:49.498181 357 otto_parallel.cpp:199] Done in 5.64087 seconds, result: 4000090070787
I1115 14:58:49.498208 357 otto_parallel.cpp:196] Running Connected components - Edges...
I1115 14:58:54.232530 357 otto_parallel.cpp:199] Done in 4.73433 seconds, result: 323935
I1115 14:58:54.232570 357 otto_parallel.cpp:196] Running Connected components - Vertices...
I1115 14:59:00.412395 357 otto_parallel.cpp:199] Done in 6.17983 seconds, result: 323935
I1115 14:59:00.412422 357 otto_parallel.cpp:196] Running Parallel connected components - Vertices...
I1115 14:59:04.662087 357 otto_parallel.cpp:199] Done in 4.24967 seconds, result: 323935
I1115 14:59:04.662116 357 otto_parallel.cpp:196] Running Expansion...
I1115 14:59:13.913015 357 otto_parallel.cpp:199] Done in 9.25091 seconds, result: 323935
```
Reviewers: buda, mislav.bradac, dgleich, teon.banek
Reviewed By: buda, teon.banek
Subscribers: teon.banek, pullbot
Differential Revision: https://phabricator.memgraph.io/D982
Summary: Vertex and Edge now use Address for storing connections to other Edges and Vertices, to support distributed storage.
Reviewers: mislav.bradac, dgleich, buda
Reviewed By: mislav.bradac, dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D977
Summary:
My dear fellow Memgraphians. It's friday afternoon, and I am as ready to pop as WAL is to get reviewed...
What's done:
- Vertices and Edges have global IDs, stored in `VersionList`. Main storage is now a concurrent map ID->vlist_ptr.
- WriteAheadLog class added. It's based around buffering WAL::Op objects (elementraly DB changes) and periodically serializing and flusing them to disk.
- Snapshot recovery refactored, WAL recovery added. Snapshot format changed again to include necessary info.
- Durability testing completely reworked.
What's not done (and should be when we decide how):
- Old WAL file purging.
- Config refactor (naming and organization). Will do when we discuss what we want.
- Changelog and new feature documentation (both depending on the point above).
- Better error handling and recovery feedback. Currently it's all returning bools, which is not fine-grained enough (neither for errors nor partial successes, also EOF is reported as a failure at the moment).
- Moving the implementation of WAL stuff to .cpp where possible.
- Not sure if there are transactions being created outside of `GraphDbAccessor` and it's `BuildIndex`. Need to look into.
- True write-ahead logic (flag controlled): not committing a DB transaction if the WAL has not flushed it's data. We can discuss the gain/effort ratio for this feature.
Reviewers: buda, mislav.bradac, teon.banek, dgleich
Reviewed By: dgleich
Subscribers: mtomic, pullbot
Differential Revision: https://phabricator.memgraph.io/D958
Summary:
Previously, named path symbols remained untracked as `new_symbols` during planning. This meant that
operator `Optional` would be left unaware of those symbols, and therefore not reset them to `Null`
if optional matching failed.
Test Optional operator will be aware of path symbols
Reviewers: florijan, mislav.bradac
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D974
Summary:
Tests have been updated to catch this error and other behaviour. Other
than this change, `AND` should behave as before.
Reviewers: florijan, mislav.bradac
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D970
Summary:
Use sigaction to register signal handlers.
This is preferred over `signal` function, according to `man 3p signal`.
Add global sig_atomic_t flag when shutting down.
Block other signal handlers when shutting down.
Reviewers: mislav.bradac, mferencevic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D943
Summary:
Remove name from GraphDb.
Take GraphDb in query test macros instead of accessor.
Add is_accepting_transactions flag to GraphDb.
Reviewers: mislav.bradac, florijan, mferencevic
Reviewed By: mislav.bradac
Subscribers: mferencevic, pullbot
Differential Revision: https://phabricator.memgraph.io/D940
Summary:
- Removed durability::Summary because it was wired into reader and stopped me from recovering WAL files.
- Refactored and renamed BufferedFile(Reader/Writer) to HashedFile(Reader/Writer).
- Vertex and edge counts in the snapshot are now hashed.
Breaking snapshot compatibility again (hashing), but since the previous version was not released, and we are not caching snapshots, the previous version does not need to be supported.
Reviewers: teon.banek, mislav.bradac, buda
Reviewed By: teon.banek, mislav.bradac
Subscribers: dgleich, pullbot
Differential Revision: https://phabricator.memgraph.io/D932
Summary:
Time csv_to_snapshot conversion and log it.
Check if writing csv_to_snapshot failed.
Extract LoadConfig from memgraph_bolt to config.hpp.
Read memgraph config in csv_to_snapshot for snapshot_directory.
Rename csv_to_snapshot to mg_import_csv.
Add tests for tools.
Run tools tests in apollo.
Reviewers: mislav.bradac, florijan, mferencevic, buda
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D931
Summary:
Move QueryParts and Filters to a new file.
Reorganize FilterInfo struct.
Remove label filter if we do indexed scan by label.
Remove property filter used in indexed scan.
Reviewers: florijan
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D915
Summary: Locked version. There are some benchmarks, it seems the lock won't be the bottleneck in the WAL (DB ops causing WAL delta insertions into it will be slower, flushing the WAL be slower).
Reviewers: buda, mislav.bradac, dgleich
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D919
Summary:
New snapshot structure:
- magic number
- snapshot version (old-version recovery not yet implemented)
- transaction snapshot (will be used in the WAL)
- the rest is as before (indices, vertices, edges)
Not backward compatible with the old snapshotting.
Does not improve error handling (user feedback). A task for that has been added.
Reviewers: buda, mislav.bradac, mferencevic, teon.banek
Reviewed By: teon.banek
Subscribers: teon.banek, dgleich, pullbot
Differential Revision: https://phabricator.memgraph.io/D912
Summary:
Since we have different kind of workers in Apollo we should pass
assigned cpus to harness from apollo generate script and not define them
in harness or in benchmarks.
Reviewers: mferencevic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D916