Summary:
- Removed a lot of stuff that was incorrect and/or unnecessary
- Fixed const-correctness in the skiplist family
Reviewers: dgleich, teon.banek, buda
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1351
Summary:
Implemented cluster discovery in distributed memgraph.
When a worker registers, it sends an RPC request to master.
The master assigns that worker an id and sends the information about other
workers (pairs of <worker_id, endpoint>) to the new worker.
Master also sends the information about the new worker to all existing workers
in the process of worker registration.
After the last worker registers, all memgraph instances in the cluster should
know about each other.
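The registration flow above could look roughly like the following in-memory sketch. `Master`, `Worker`, `RegisterWorker` and the endpoint strings are hypothetical names used only for illustration, direct member access stands in for the RPC calls, and reserving id 0 for the master is an assumption.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical in-memory stand-in for a worker; the real system talks RPC.
struct Worker {
  int id = -1;                       // assigned by the master on registration
  std::map<int, std::string> known;  // worker_id -> endpoint of other workers
};

class Master {
 public:
  // Assigns an id to the new worker, tells it about all existing workers and
  // tells every existing worker about the new one.
  void RegisterWorker(Worker &worker, const std::string &endpoint) {
    assert(worker.id == -1);  // a worker registers only once
    worker.id = next_id_++;
    for (auto &[id, entry] : workers_) {
      worker.known[id] = entry.endpoint;
      entry.worker->known[worker.id] = endpoint;
    }
    workers_[worker.id] = {endpoint, &worker};
  }

 private:
  struct Entry {
    std::string endpoint;
    Worker *worker = nullptr;
  };
  int next_id_ = 1;  // assumption: id 0 is reserved for the master
  std::map<int, Entry> workers_;
};
```

After three calls to `RegisterWorker`, each `Worker::known` map holds the endpoints of the other two workers.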
Reviewers: mtomic, buda, florijan
Reviewed By: mtomic
Subscribers: teon.banek, dgleich, pullbot
Differential Revision: https://phabricator.memgraph.io/D1339
Summary: Make synchronized snapshots. This invokes the snapshooter on workers at the master snapshot scheduler interval.
Reviewers: msantl, mtomic
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1334
Summary:
Snapshots should have the transaction from which they were created because we need this info for recovery later on.
Otherwise we wouldn't be able to tell the workers from which snapshots to recover. The whole cluster should be recovered
from the same transaction snapshot.
Reviewers: msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1338
Summary: Adds a commit log garbage collector, which clears old transactions from the commit log
Reviewers: florijan
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1310
Summary:
When committing/aborting a transaction in tx master engine, make a two
phase commit to all workers so they can stop all futures and clear
transactional cache.
Reviewers: dgleich, florijan
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1320
Summary:
Results are:
```
I0327 12:22:56.312857 13756] Testing with 1 threads and 100000 transactions per thread...
I0327 12:22:56.337549 13756] Result (millions of transactions per second) 4.07272
I0327 12:22:56.337618 13756] Testing with 2 threads and 100000 transactions per thread...
I0327 12:22:56.449129 13756] Result (millions of transactions per second) 1.79392
I0327 12:22:56.449151 13756] Testing with 4 threads and 100000 transactions per thread...
I0327 12:22:56.821496 13756] Result (millions of transactions per second) 1.07434
I0327 12:22:56.821519 13756] Testing with 8 threads and 100000 transactions per thread...
I0327 12:22:58.265359 13756] Result (millions of transactions per second) 0.554081
I0327 12:22:58.265383 13756] Testing with 16 threads and 100000 transactions per thread...
I0327 12:23:03.978154 13756] Result (millions of transactions per second) 0.280075
```
After changing the lock to `std::mutex`:
```
I0327 12:28:47.493680 14755] Testing with 1 threads and 100000 transactions per thread...
I0327 12:28:47.520134 14755] Result (millions of transactions per second) 3.80314
I0327 12:28:47.520270 14755] Testing with 2 threads and 100000 transactions per thread...
I0327 12:28:47.744608 14755] Result (millions of transactions per second) 0.891592
I0327 12:28:47.744639 14755] Testing with 4 threads and 100000 transactions per thread...
I0327 12:28:48.213151 14755] Result (millions of transactions per second) 0.853791
I0327 12:28:48.213181 14755] Testing with 8 threads and 100000 transactions per thread...
I0327 12:28:49.342561 14755] Result (millions of transactions per second) 0.70836
I0327 12:28:49.342594 14755] Testing with 16 threads and 100000 transactions per thread...
I0327 12:28:51.722991 14755] Result (millions of transactions per second) 0.672164
```
Reviewers: dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1319
Summary:
The network layer now has a `Session` that handles all things that should be
done before the `Execute` method is called on sessions. Also, all sessions
now communicate using streams instead of holding the input buffer and writing
to the `Socket`. This design will allow implementation of an SSL middleware.
Reviewers: buda, dgleich
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1314
Summary:
Remove "produce_" and "Produce" as prefix from all distributed stuff.
It's not removed in src/query/ stuff (operators).
Reviewers: dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1315
Summary:
When performing recovery, ensure that the transaction ID in engine is
bumped to one after the max tx id seen in recovery.
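A minimal sketch of the bump, assuming the engine keeps a "next id to hand out" counter. `NextIdAfterRecovery` and `TransactionId` are hypothetical names, and never moving the counter backwards is an assumption, not something stated above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

using TransactionId = uint64_t;

// Returns the id the engine should hand out next after recovery: one past the
// largest transaction id seen while recovering, unless the engine's current
// counter is already larger.
TransactionId NextIdAfterRecovery(TransactionId current,
                                  const std::vector<TransactionId> &recovered) {
  TransactionId max_seen = 0;
  for (TransactionId id : recovered) max_seen = std::max(max_seen, id);
  // Guard against wrapping around when adding one.
  assert(max_seen < std::numeric_limits<TransactionId>::max());
  return std::max(current, max_seen + 1);
}
```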
Reviewers: dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1312
Summary:
Find[Vertex|Edge] -> Find[Vertex|Edge]Optional
Find[Vertex|Edge]Checked -> Find[Vertex|Edge]
In some places change old code that finds-optional and immediately checks
to use the checked functionality.
It seems that in all the src/ stuff optional finds are no longer used,
only in tests, but there they are used extensively so I don't feel those
functions should be removed.
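The naming convention can be illustrated with a small sketch. `Accessor` and its string payload are hypothetical stand-ins, and throwing from the checked variant is an assumption; the source does not say how the check fails.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for a storage accessor.
class Accessor {
 public:
  void Insert(int64_t gid, std::string label) {
    vertices_[gid] = std::move(label);
  }

  // Optional flavour: the caller decides what a missing vertex means.
  std::optional<std::string> FindVertexOptional(int64_t gid) const {
    auto it = vertices_.find(gid);
    if (it == vertices_.end()) return std::nullopt;
    return it->second;
  }

  // Checked flavour: a missing vertex is treated as an error.
  std::string FindVertex(int64_t gid) const {
    auto found = FindVertexOptional(gid);
    if (!found) throw std::logic_error("vertex not found");
    return *found;
  }

 private:
  std::map<int64_t, std::string> vertices_;
};
```

Call sites that previously did a find-optional followed by an immediate check collapse into a single `FindVertex` call.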
Reviewers: dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1309
Summary:
Writers preference is not guaranteed by the standard and as such
there are variations between implementations in different versions
of libpthread.
Reviewers: dgleich
Reviewed By: dgleich
Differential Revision: https://phabricator.memgraph.io/D1305
Summary:
I think the rethrow as it was done in `.../clientes/common.hpp` is not
correct, as it could have caught any exception type descending from
`BasicException` and copied it into `optional<BasicException>`, thus
losing the exact type and members of the original exception. I think that
the proposed formulation of a rethrow in the `catch` does it properly.
I made a small test for this behavior, see P11.
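The difference can be reproduced with a small standalone sketch. This is not the code from P11; `QueryException` and both helper functions are hypothetical, and only `BasicException` is a name taken from the description above.

```cpp
#include <cassert>
#include <exception>
#include <optional>
#include <string>

struct BasicException : std::exception {
  const char *what() const noexcept override { return "basic"; }
};
struct QueryException : BasicException {
  const char *what() const noexcept override { return "query"; }
};

// Copies the caught exception into optional<BasicException> and throws the
// copy later. The copy is sliced down to the base class.
std::string RethrowViaCopy() {
  std::optional<BasicException> stored;
  try {
    throw QueryException();
  } catch (const BasicException &e) {
    stored = e;  // slicing happens here
  }
  try {
    throw *stored;
  } catch (const QueryException &) {
    return "QueryException";
  } catch (const BasicException &) {
    return "BasicException";
  }
}

// Rethrows inside the handler with a bare `throw;`, which propagates the
// original exception object with its dynamic type intact.
std::string RethrowInCatch() {
  try {
    try {
      throw QueryException();
    } catch (const BasicException &) {
      throw;
    }
  } catch (const QueryException &) {
    return "QueryException";
  } catch (const BasicException &) {
    return "BasicException";
  }
}
```

`RethrowViaCopy()` reports `BasicException`, while `RethrowInCatch()` reports `QueryException`.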
Reviewers: teon.banek, mferencevic, mislav.bradac
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1298
Summary:
Ensures that query plans are invalidated on the remote only when it's
guaranteed they will never be used on the master. This is done by
invalidating remote caches in the `CachedPlan` destructor.
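A minimal sketch of the destructor-driven invalidation, assuming the remote call can be modelled as a callback. The constructor signature and the `invalidate_remote` callback are hypothetical; only the class name `CachedPlan` comes from the description above.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical cached plan that notifies the workers when it is destroyed.
class CachedPlan {
 public:
  CachedPlan(int64_t plan_id, std::function<void(int64_t)> invalidate_remote)
      : plan_id_(plan_id), invalidate_remote_(std::move(invalidate_remote)) {}

  // Non-copyable, so the invalidation runs exactly once per plan.
  CachedPlan(const CachedPlan &) = delete;
  CachedPlan &operator=(const CachedPlan &) = delete;

  // Destructors are implicitly noexcept: if the callback throws, the program
  // terminates. The call therefore has to be one that "must succeed".
  ~CachedPlan() {
    if (invalidate_remote_) invalidate_remote_(plan_id_);
  }

 private:
  int64_t plan_id_;
  std::function<void(int64_t)> invalidate_remote_;
};
```

Because the plan is only invalidated remotely once the local object is gone, no query on the master can still be holding it at that point.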
There are two unpleasant side-effects. First, an RPC call is made in an
object destructor. This is somewhat ugly, but not that different than
making an RPC call that must succeed in any other function. Note that
this does NOT slow down any query execution because the relevant
destructor is called by the skiplist garbage collector. The second ugly
side-effect is that in the unit test now we need to sleep to ensure the
skiplist GC destructs a cached plan before checking that it's
invalidated on the remote worker.
We might want to redesign this at some point.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1302
Summary:
- Remove caches on workers as a result of plan expiration or race
during insertion.
- Extract caching functionality into a class.
- Minor refactor of Interpreter::operator()
- New RPC and test for it.
- Rename ConsumePlanRes to DispatchPlanRes for consistency, remove
return value as it's always true and never used.
- Interpreter is now constructed with a `GraphDb` reference. At the
moment only for reaching the `distributed::PlanDispatcher`, but in
the future we should probably use that primarily for planning.
I added a function to `PlanConsumer` that is only used for testing.
I prefer not doing this, but I felt this needed testing. I can remove
it now if you like.
Reviewers: teon.banek, msantl
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1292
Summary:
Instead of waiting for a fixed period for the coordinations to start and
coordinate with the master, wait for each of them individually to report
being done.
Also: rename `WorkerInThread` to `WorkerCoordinationInThread`.
Reviewers: dgleich, teon.banek, msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1288
Summary: this will make delete query more efficient
Reviewers: mferencevic, mculinovic
Reviewed By: mferencevic
Differential Revision: https://phabricator.memgraph.io/D1282
Summary: Property connected_frauds added and set to zero for strata scenario.
Reviewers: mtomic
Reviewed By: mtomic
Differential Revision: https://phabricator.memgraph.io/D1275
Summary:
This fixes a bug where the planning would raise NotYetImplemented error,
due to preventing plan splitting while the operator is already on
master. For example, `MATCH (a), (b) CREATE (a)-[e:r]->(b) RETURN e`
would split the plan after Cartesian and before Create. Thus, the rest
of the plan would be on master, including the Accumulate before Produce.
We still can (and must) replace Accumulate with Synchronize but this
would fail due to unneeded check.
Reviewers: florijan, msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1272
Summary:
I have not thoroughly thought this through, especially the worker
destruction (is it legit to abort all running tx?), but it's tested to
abort during remote pull, which is what we need.
Also I improved error handling for vertex deletion failure during
remote pull (@dgleich).
Reviewers: teon.banek, msantl, dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1263
Summary:
Instead of passing `coordination`, pass `rpc_worker_clients` that
holds a map of worker_id->clientPool. By having only one instance of
`RpcWorkerClients` that is owned by `GraphDB` and passing it by reference
we'll share the same client pools for rpc clients.
Reviewers: teon.banek, florijan, dgleich, mferencevic
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1261
Summary:
Release of per-transaction data in distributed Memgraph refactored. The
master node no longer releases each time a transaction is done, thus
offloading some work from the engine.
Reviewers: dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1235
Summary:
Remove a method in tx::Engine whose results can be obtained from commit
log info (also guaranteed to be globally correct in distributed).
Reviewers: dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1240
Summary:
Difficult to test properly as the problem was an implicit conversion of
`gid::Gid` to `mvcc::VersionList<> *`. The only proper way to defend
against this would be to make `gid::Gid` a type.
The `CHECK` added in this diff does not help, but should be there, so...
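One way to make the id a distinct type is a small wrapper class with an explicit constructor, sketched below. This only illustrates the idea and does not reproduce the original bug; `Gid` and `VersionList` here are simplified stand-ins for `gid::Gid` and `mvcc::VersionList<>`.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

struct VersionList {};  // stand-in for mvcc::VersionList<>

// A distinct wrapper type instead of an alias for an integer. The constructor
// is explicit and there are no conversion operators, so a Gid cannot silently
// turn into, or be built from, anything else.
class Gid {
 public:
  explicit Gid(uint64_t value) : value_(value) {}
  uint64_t value() const { return value_; }
  friend bool operator==(Gid a, Gid b) { return a.value_ == b.value_; }

 private:
  uint64_t value_;
};

// Mixing ids with pointers or raw integers now fails at compile time.
static_assert(!std::is_convertible_v<Gid, VersionList *>,
              "Gid must not convert to a version list pointer");
static_assert(!std::is_convertible_v<uint64_t, Gid>,
              "raw integers must be wrapped explicitly");
static_assert(!std::is_convertible_v<Gid, uint64_t>,
              "the raw value is only reachable through value()");
```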
Reviewers: dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1244
Summary:
During distributed execution, OrderBy is split across workers and the
master gets to merge those results via PullRemoteOrderBy. Since this
operator may be an input to almost any other operator, virtual accessors
to `input` have been added in LogicalOperator.
Depends on D1221
Reviewers: florijan, msantl, buda
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1232
Summary:
Extracted `TypedValueVectorCompare` and `RemotePuller` from operators
so it can be reused.
The new `PullRemoteOrderBy` operator pulls one result from each worker and one
from master, relies on the fact that workers/master returned sorted results,
returns the next one in order, and pulls the source of that result to get the
next one.
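The merge step described above is a k-way merge of sorted streams. The sketch below uses plain integer vectors as stand-ins for the worker and master result streams and a min-heap to pick the next result; both `MergeSorted` and the heap are assumptions for illustration, and the actual operator may simply scan the current heads.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Merges already-sorted result streams (one per worker plus the master) by
// repeatedly emitting the smallest head and then pulling the next value from
// the stream that head came from.
std::vector<int> MergeSorted(const std::vector<std::vector<int>> &sources) {
  using Head = std::pair<int, size_t>;  // (value, index of its source)
  std::priority_queue<Head, std::vector<Head>, std::greater<Head>> heads;
  std::vector<size_t> next(sources.size(), 0);

  // Pull one result from every non-empty source.
  for (size_t i = 0; i < sources.size(); ++i)
    if (!sources[i].empty()) heads.emplace(sources[i][next[i]++], i);

  std::vector<int> merged;
  while (!heads.empty()) {
    auto [value, source] = heads.top();
    heads.pop();
    merged.push_back(value);
    // Refill from the source that produced the emitted value.
    if (next[source] < sources[source].size())
      heads.emplace(sources[source][next[source]++], source);
  }
  return merged;
}
```

The result is correct only if every input stream is itself sorted, which is the property the operator relies on.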
Depends on D1215 that (at the moment) is still in review.
Reviewers: florijan, teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1221
Summary:
- The expansion vertex gets created on the origin's worker
- The edge automatically gets created wherever necessary
- Vertex creation logic reuse
- End to end test
Reviewers: teon.banek, msantl
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1227
Summary:
To see the thread names in htop press <F2> to enter setup and under
'Display options' choose 'Show custom thread names'.
Reviewers: buda, teon.banek, florijan
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1234
Summary:
- Add the said test
- Add `workerid` function
- Add random wait (enabled with flag) to `rpc::Client::Call` for network
latency simulation
Reviewers: buda, mferencevic, teon.banek
Reviewed By: buda, teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1222
Summary:
Previously, the RPC stack used the network stack only to receive messages. The
messages were then added to a separate queue that was processed by different
thread pools. This design was inefficient because there was a lock when
inserting and getting messages from the common queue.
This diff removes the need for separate thread pools by utilising the new
network stack design. This is possible because the new network stack allows
full processing of the network request without blocking the whole queue.
Reviewers: buda, florijan, teon.banek, dgleich, mislav.bradac
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1229
Summary:
This may improve the query execution workload, because the Expand will
yield local results while it waits for remote ones. Note that we rely on
the fact that walking the graph produces results without any predetermined
order. Therefore, we can yield paths as we see fit. Enforcing the order
is done through OrderBy operator, and this change shouldn't affect that.
Reviewers: florijan, msantl, buda
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1215
Summary:
Updating a record locally while there is a remote update waiting to be applied caused
the operation to return as already deleted, instead of applying it.
Reviewers: florijan
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1226
Summary:
Previously, the network stack `communication::Server` accepted connections and
assigned them statically in a round-robin fashion to `communication::Worker`.
That meant that if two compute intensive connections were assigned to the same
worker they would block each other while the other workers would do nothing.
This implementation replaces `communication::Worker` with
`communication::Listener` which holds all accepted connections in one pool and
ensures that all workers execute all connections.
Reviewers: buda, florijan, teon.banek
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1220
Summary:
- Add `database::GraphDb::GetWorkerIds()`
- Change `CreateNode` constructor API
- Make `CreateNode` distribute nodes uniformly over workers
Did not yet modify `CreateExpand`, coming in the next diff.
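A minimal sketch of picking an owner for a new node, assuming the ids come from something like `GetWorkerIds()`. `PickWorker` is a hypothetical helper, and choosing uniformly at random is an assumption; the description only says nodes are distributed uniformly.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Picks the worker that should own a newly created node, uniformly at random
// among the given worker ids.
int PickWorker(const std::vector<int> &worker_ids, std::mt19937 &rng) {
  assert(!worker_ids.empty());
  std::uniform_int_distribution<size_t> pick(0, worker_ids.size() - 1);
  return worker_ids[pick(rng)];
}
```

A round-robin counter would also spread nodes evenly and is deterministic, at the cost of shared state between concurrent queries.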
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1216
Summary:
Implementation of remote vertex and edge creation. This diff addresses
the creation API (`GraphDbAccessor::InsertEdge`,
`GraphDbAccessor::InsertRemoteVertex`) and the necessary RPC and
`RemoteCache` stuff.
What is missing for full remote creation support are
`query::plan::operator` changes that are expected to be minor. Pushing this
diff as it's large enough, operator and end to end tests in the next.
Also, the naming of existing structures and files is confusing (update
referring to both updates and creations, `results` used too often etc.). I
will address this too, but feel free to comment on bad naming.
Reviewers: dgleich, teon.banek, msantl
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1210
Summary:
This is still very much in progress. No advanced checks are done to
prevent planning unimplemented things. Basic Cartesian product should
work, for example `MATCH (a), (b) CREATE (a)-[:r]->(c)-[:r]->(b)`. But
anything more advanced may lead to undefined behaviour of the planner
and therefore execution. Use at your own risk!
Add ModifiedSymbols method to LogicalOperator
For planning Cartesian, we need information on which symbols are filled
by operator sub-trees. Currently, this is used to set symbols which
should be transferred over network. Later, they should be used to detect
whether filter expressions use symbols modified from Cartesian branches.
Then we will be able to ensure correct dependency of filters and their
behaviour.
Prepare DistributedPlan for multiple worker plans
Since Cartesian branches need to be split and handled by each worker, we
now dispatch multiple plans to workers.
Reviewers: florijan, msantl, buda
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1208
Summary:
Rename and fix concurrent list test.
This tests a fix for the concurrent list. I was not able to reproduce the flakiness, but this could be it.
If it happens again we'll know that this is not the solution to the problem, and look further.
Change memory ordering
Reviewers: mferencevic
Reviewed By: mferencevic
Subscribers: buda, pullbot
Differential Revision: https://phabricator.memgraph.io/D1188
Summary:
Stats server wasn't connecting to the right service on statsd.
Also, benchmark client stats now have prefix `client` instead
of machine name to be consistent with memgraph's stats naming
which starts with `master` or `worker`.
Reviewers: mtomic, buda
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1209
Summary:
On the master cleanups are hooked directly into the transaction engine.
This is beneficial because the master might have bigger caches and we
want to clear them as soon as possible.
On the workers there is a periodic RPC call to the master about living
transactions, which takes care of releasing local caches. This is
suboptimal because long transactions will prevent cache GC (like with
data GC). It is however fairly simple.
Note that all cleanup is now done automatically and `RemotePull` has
been reduced accordingly. @msantl, please verify correctness and
consider if the code can be additionally simplified.
Reviewers: teon.banek, msantl
Reviewed By: msantl
Subscribers: pullbot, msantl
Differential Revision: https://phabricator.memgraph.io/D1202
Summary:
Pulls left op cursor and keeps the result, and then for each pull of
the right op cursor, adds all the left op results to produce a cartesian
product.
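The pull order above can be sketched with vectors standing in for the two operator cursors. `Cartesian` as a free function template is a hypothetical simplification of the real operator.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Materializes the left input once, then for every row pulled from the right
// input emits that row combined with each stored left row.
template <typename L, typename R>
std::vector<std::pair<L, R>> Cartesian(const std::vector<L> &left_input,
                                       const std::vector<R> &right_input) {
  // Pull the left cursor to exhaustion and keep its results.
  std::vector<L> left_rows(left_input.begin(), left_input.end());

  std::vector<std::pair<L, R>> out;
  for (const auto &right_row : right_input)   // one pull of the right cursor
    for (const auto &left_row : left_rows)    // combined with all left rows
      out.emplace_back(left_row, right_row);
  return out;
}
```

Only the left side is buffered, so memory use is proportional to the left input regardless of how large the right input is.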
Reviewers: teon.banek, florijan
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1201
Summary:
Get rid of client class
Fix cppcheck errors
Add documentation to metrics.hpp
Add documentation to stats.hpp
Remove stats from global namespace
Fix build failures
Refactor a bit
Refactor stopwatch into a function
Add rpc execution time stats
Fix segmentation fault
Reviewers: mferencevic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1200
Summary:
We have been using `Edges::VertexAddress` and `Edges::EdgeAddress` a lot
in other parts of the codebase because it's cleaner to write than
`Address<mvcc::VersionList<Edge>>`, especially in code that should not
really be MVCC-aware. However, a lot of that code should not really be
`Edges` aware either, as that's a storage data structure that should not
be exposed.
This became annoying, so I extracted these addresses into a type-file. I
don't really like this approach, it might be better to have
`Vertex::Address` and `Edge::Address`, but that means we'd have to
import those headers and we'd get circular dependencies.
“The horror! The horror!”
- Joseph Conrad, Heart of Darkness
Reviewers: teon.banek, buda
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1204
Summary:
Added latest tck openCypher tests. Made sure the file is parseable and that
it returns coverage.
Also bumped continuous integration configuration to use the 09 version.
Reviewers: mferencevic, teon.banek, buda
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1197
Summary:
Updates are supported, insertions and removals not in this diff. The
test is a bit overdesigned, it happens.
Reviewers: teon.banek, dgleich, msantl
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1176
Summary:
Defined a new test based on reported bug, for multiple remote expansion.
Fixed the bug. Introduced minor refactors in distributed unit testing.
Reviewers: mculinovic, dgleich
Reviewed By: mculinovic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1173
Summary:
This is a basic take on planning distributed writes. Main logic is in handling
the Accumulate operator, which requires Synchronize operation if the results
are used after modification.
Tests for planning distributed writes have been added.
Reviewers: florijan, msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1172
Summary:
This diff consolidates local and remote update handling. It ensures and
tests that updates for remote elements are visible locally (on the
updating worker).
The next part will be accumulating remote updates and applying them on
the owner.
Also extracted a common testing fixture.
Reviewers: dgleich, buda, mtomic
Reviewed By: mtomic
Subscribers: mtomic, pullbot
Differential Revision: https://phabricator.memgraph.io/D1169
Summary:
Refactor in two ways. First, expose members without getters as we will
need most of them in distributed. And this was always the sensible thing
to do. Second, add storage type values to deltas. This is also a
sensible thing to do, and it will be very beneficial in distributed. We
didn't do it before because name<->value type mappings aren't guaranteed
to be the same after recovery. A task has been added to address this
(preserve mappings in durability).
Reviewers: dgleich, buda
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1167
Summary:
Queries such as `RETURN 1` should only be run on a single machine. This
change assumes that a query should only be distributed if it contains at
least one `ScanAll` operator, i.e. a `MATCH` clause.
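The check amounts to walking the operator tree and looking for a `ScanAll`. The sketch below uses a heavily simplified, hypothetical `Operator` struct that identifies operators by name; the real planner works on typed `LogicalOperator` classes.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical, heavily simplified operator tree node.
struct Operator {
  std::string name;
  std::vector<std::unique_ptr<Operator>> inputs;
};

// A plan is considered for distribution only if some operator in it is a
// ScanAll; plans such as the one for `RETURN 1` contain none.
bool ShouldDistribute(const Operator &op) {
  if (op.name == "ScanAll") return true;
  for (const auto &input : op.inputs)
    if (input && ShouldDistribute(*input)) return true;
  return false;
}
```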
Reviewers: florijan, msantl
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1154
Summary:
Remote pulls can now be async. Note that
`RemotePullRpcClients::RemotePull` takes references to data structures
which should not be temporary in the caller. Still, maybe safer to make
copies?
Changed `RpcWorkerClients` API to make that possible.
Reviewers: dgleich, msantl, teon.banek
Reviewed By: msantl
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1150
Summary:
It is possible that we have a global address to resolve, for a graph
element that's local. Consider W1 expanding, getting data from W2,
expanding from there and getting data that is on W1. We then don't want
to do RPC from W1 to W1, but do a lookup directly.
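A minimal sketch of the shortcut, assuming a global address carries the owning worker's id. `GlobalAddress`, `Resolver` and the string payload are hypothetical, and the RPC is modelled as a callback.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <utility>

struct GlobalAddress {
  int worker_id;
  uint64_t gid;
};

// Hypothetical resolver living on one worker.
class Resolver {
 public:
  Resolver(int my_worker_id, std::function<std::string(GlobalAddress)> rpc)
      : my_worker_id_(my_worker_id), rpc_(std::move(rpc)) {}

  void InsertLocal(uint64_t gid, std::string data) {
    local_[gid] = std::move(data);
  }

  std::string Resolve(GlobalAddress address) const {
    // The address is global but points at this worker: look it up directly
    // instead of doing an RPC to ourselves.
    if (address.worker_id == my_worker_id_) return local_.at(address.gid);
    return rpc_(address);
  }

 private:
  int my_worker_id_;
  std::function<std::string(GlobalAddress)> rpc_;
  std::map<uint64_t, std::string> local_;
};
```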
Reviewers: dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1145
Summary:
NOTE: This diff is still in progress. Many TODOs, lacking documentation
etc. But the main logic is there (some could be different), and it tests
out OK.
Reviewers: teon.banek, msantl, buda
Reviewed By: teon.banek
Subscribers: mferencevic, pullbot
Differential Revision: https://phabricator.memgraph.io/D1138
Summary:
Remove AdvanceCommand operator declaration.
Add query/plan/distributed source files.
Add hacked cloning of LogicalOperator via serialization.
Add virtual Default methods to CompositeVisitor.
Use BOOST_CLASS_EXPORT_KEY in ast.hpp and operator.hpp.
This is needed in a single binary to correctly serialize polymorphic
classes. The previous implementation worked because the memgraph_lib was
linked with test binaries, but nobody was actually serializing things
inside the single memgraph binary itself.
Print PullRemote symbols in tests/manual/query_planner
Add names to implicitly created aggregation symbols
Reviewers: florijan, msantl
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1137
Summary:
With this patch the number of packets for a simple RPC call is lowered
from 22 to 12 (45% reduction). The number of packets for the Bolt protocol
is lowered from 26 to 18 (30% reduction).
Impact on the Bolt protocol will be a constant of ~ 8 packets less per
connection, while the impact on the RPC protocol will be approximately
a 45% reduction overall.
Reviewers: buda, teon.banek
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1141
Summary: See above. The unit test creates two clients on demand so I guess it works.
Reviewers: mferencevic, florijan, teon.banek
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1136
Summary:
Start removal of old logic
Remove more obsolete classes
Move Message class to RPC
Remove client logic from system
Remove messaging namespace
Move protocol from messaging to rpc
Move System from messaging to rpc
Remove unnecessary namespace
Remove System from RPC Client
Split Client and Server into separate files
Start implementing new client logic
First semi-working state
Changed network protocol layout
Rewrite client
Fix client receive bug
Cleanup code of debug lines
Migrate to accessors
Migrate back to binary boost archives
Remove debug logging from server
Disable timeout test
Reduce message_id from uint64_t to uint32_t
Add multiple workers to server
Fix compiler warnings
Apply clang-format
Reviewers: teon.banek, florijan, dgleich, buda, mtomic
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1129
Summary:
Added test for `PlanDispatcher` and `PlanConsumer`.
This diff also contains a fix for the async rpc call on all clients.
Reviewers: florijan, teon.banek
Reviewed By: florijan
Subscribers: pullbot, dgleich
Differential Revision: https://phabricator.memgraph.io/D1135
Summary:
* add run_pokec script because more than one step is required
* refactor of plot_throughput script
* move all plot scripts under tools/plot
Reviewers: mferencevic, teon.banek, mislav.bradac
Reviewed By: mferencevic
Subscribers: florijan, pullbot, buda
Differential Revision: https://phabricator.memgraph.io/D1106
Summary:
- End to end distributed GraphDb testing
- Refactors as necessary
- Basic RemoteCache for storing remote data
- RemoteDataRpc
As we are on a tight schedule, please let's focus on the essentials:
functionality and proper testing.
Reviewers: dgleich, teon.banek, buda
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1121
Summary:
With the added support for serialization, we should be able to transfer
plans across distributed workers.
The planner tests have been extended to test serialization. Operators
should be mostly tested, but the expressions they contain aren't
completely. The quick solution is to use typeid for rudimentary
expression equality testing. The more involved solution of comparing
the expression tree for equality is the correct choice. It should be
done in the near future.
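The limitation of the `typeid` approach is easy to see in a small sketch. The `Expression`, `Literal` and `Addition` types below are hypothetical stand-ins for the AST classes.

```cpp
#include <cassert>
#include <typeinfo>

struct Expression {
  virtual ~Expression() = default;
};
struct Literal : Expression {
  explicit Literal(int v) : value(v) {}
  int value;
};
struct Addition : Expression {};

// Rudimentary equality: only checks that both expressions have the same
// dynamic type and ignores their contents and children.
bool SameKind(const Expression &a, const Expression &b) {
  return typeid(a) == typeid(b);
}
```

`SameKind(Literal(1), Literal(2))` is true even though the literals differ, which is why comparing the full expression tree is the more thorough option.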
Reviewers: florijan, msantl
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1122
Summary:
Other than the plan operators and the frame, we will need to pass the
generated symbol table to distributed workers.
Reviewers: florijan, msantl
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1123
Summary:
Adds worker id to snapshot and wal filename.
Adds a new worker_id flag to be used for recovering a worker with a distributed snapshot.
Adds worker_id field to snapshot to check for consistency.
Reviewers: florijan
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1096
Summary:
Previously, we would have a `DCHECK` which crashes the application. This
was evident when testing queries, such as:
MATCH (n) DELETE n SET n.prop = 42
Since the argument to update clauses is evaluated during execution, it
makes it very difficult to prevent such errors during semantic analysis.
For example:
MATCH (n)--(m) WITH collect(n) as ns, m
DETACH DELETE ns[m.prop] SET head(ns).prop = 42
Test query updates on deleted graph elements
Reviewers: florijan, dgleich
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1114
Summary:
Close the file descriptor in File destructor. This will prevent
accidental crashes during unexpected destructor calls. For example, if
an exception is thrown before the file is closed. File now takes
ownership of the descriptor. These changes now honor RAII idiom, which
should handle most of the peculiarities of C++.
Use optional value for TryOpenFile function, instead of returning a File
without a descriptor. It makes the failure state more semantically clear
to the API user.
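A minimal POSIX sketch of both ideas, an owning `File` and an optional-returning open. The signatures are assumptions and not the actual utils API; only the names `File` and `TryOpenFile` come from the description above.

```cpp
#include <fcntl.h>
#include <unistd.h>

#include <cassert>
#include <optional>
#include <string>
#include <utility>

// Owns a file descriptor and closes it on destruction (RAII).
class File {
 public:
  explicit File(int fd) : fd_(fd) {}
  File(File &&other) noexcept : fd_(std::exchange(other.fd_, -1)) {}
  File &operator=(File &&other) noexcept {
    if (this != &other) {
      Close();
      fd_ = std::exchange(other.fd_, -1);
    }
    return *this;
  }
  File(const File &) = delete;  // exactly one owner per descriptor
  File &operator=(const File &) = delete;
  ~File() { Close(); }

  int fd() const { return fd_; }

 private:
  void Close() {
    if (fd_ != -1) {
      ::close(fd_);
      fd_ = -1;
    }
  }
  int fd_;
};

// Returns std::nullopt on failure instead of a File without a descriptor.
std::optional<File> TryOpenFile(const std::string &path, int flags) {
  int fd = ::open(path.c_str(), flags);
  if (fd == -1) return std::nullopt;
  return File(fd);
}
```

If an exception propagates while a `File` is in scope, stack unwinding runs the destructor and the descriptor is still closed.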
Merge utils/filesystem with utils/file
The files aren't that big, and the naming is a bit confusing because
functions aren't really grouped for file and filesystem distinction.
Reviewers: mferencevic, mtomic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1111
Summary:
Added wrappers for some Unix system calls in utils/filesystem.hpp and implemented
a simple log storage interface for Raft. It is not very efficient, we will need
something more sophisticated later, but this is good enough for testing.
Reviewers: mferencevic, mislav.bradac, buda, mculinovic
Reviewed By: mferencevic
Subscribers: teon.banek, dgleich, pullbot
Differential Revision: https://phabricator.memgraph.io/D1091
Summary:
The directory was never actually copied on apollo, so tests weren't even
doing anything...
Also remove fswatcher unit test, it should be rewritten correctly.
Reviewers: mislav.bradac, mferencevic, buda
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1108
Summary: Cleanup of very old data files from the tests/data folder.
Reviewers: mferencevic
Reviewed By: mferencevic
Subscribers: pullbot, buda
Differential Revision: https://phabricator.memgraph.io/D1107
Summary:
GraphDb is refactored to become an API exposing different parts
necessary for the database to function. These different parts can have
different implementations in SingleNode or distributed Master/Server
GraphDb implementations.
Internally GraphDb is implemented using two class hierarchies. One
contains all the members and correct wiring for each situation. The
other takes care of initialization and shutdown. This architecture is
practical because it can guarantee that the initialization of the
object structure is complete, before initializing state.
Reviewers: buda, mislav.bradac, dgleich, teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1093
Summary:
No logic changes, just split `tx::MasterEngine` into
`tx::SingleNodeEngine` and `tx::MasterEngine`. This gives better
responsibility separation and is more appropriate now there is no
Start/Shutdown.
Reviewers: dgleich, teon.banek, buda
Reviewed By: dgleich, teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1099
Summary:
Improve Apollo config files
Add name to apollo_build
Remove old generate script from build
Add build_release symlink to release build
Rename 'args' to 'arguments'
Add run definition for cppcheck
Host doxygen documentation
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1095
Summary:
Serialization of vertices and edges for distributed. Based on Boost
serialization. Therefore moved TypedValue serialization from AST to
utils.
Reviewers: buda, dgleich, teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1088
Summary:
`JailService` exposes the jail API and knows how to
start and stop binaries. Every test should be
defined as a Python module with an exposed
`run` method. The `master.py` script is used
for running tests and takes the test module
name as an argument. Machine IP addresses are
defined in environment variables. To run a test
locally, use the `local_runner` script.
Reviewers: mislav.bradac, mferencevic, mtomic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1065
Summary:
A PropertyValueStore is not a generic data structure, but only ever used
to store properties in a Vertex/Edge. It has behaviours specific to it.
So, the templatization was not necessary.
Reviewers: buda
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1089
Summary:
Although the first solution used cereal, the final implementation uses
boost. Since cereal is still used in the codebase, compilation has
been modified to support multithreaded cereal.
In addition to serializing Ast classes, the following also needed to be
serialized:
* GraphDbTypes
* Symbol
* TypedValue
TypedValue is treated specially, by inlining the serialization code in
the Ast class, concretely PrimitiveLiteral.
Another special case was the Function Ast class, which now stores a
function name which is resolved to a concrete std::function on
construction.
Tests have been added for serialized Ast in
tests/unit/cypher_main_visitor
Reviewers: mferencevic, mislav.bradac, florijan
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1067
Summary:
The distributed ID mapper is not yet utilised in GraphDb as those
changes are in D1060. Depending on landing order it will be added.
Reviewers: dgleich, mislav.bradac
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1064
Summary:
Implement AddCommand method on RaftMember
Move RPCType out of rpc request and reply
Add unit test for AddCommand
Reviewers: mislav.bradac, buda
Reviewed By: mislav.bradac
Subscribers: pullbot, mculinovic
Differential Revision: https://phabricator.memgraph.io/D1042
Summary:
Special casing the last token when adding it to the named expression list.
Added a test to check this case.
Reviewers: mislav.bradac
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1052
Summary:
Rpc client wasn't thread safe and required a lock before each rpc call.
The locking functionality is now incorporated in Rpc client.
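A minimal sketch of the idea, written in Python for brevity rather than in the project's own code. `RpcClient`, `call`, `_send` and `hammer` are hypothetical names, and the network round trip is faked; the only point is that the lock lives inside the client, so callers no longer take one themselves.

```python
import threading


class RpcClient:
    """Hypothetical client that serializes calls with an internal lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._in_flight = 0      # calls currently inside _send
        self.max_in_flight = 0   # highest concurrency ever observed
        self.calls = 0

    def _send(self, request):
        # Stand-in for the real network round trip (assumption).
        self._in_flight += 1
        self.max_in_flight = max(self.max_in_flight, self._in_flight)
        self.calls += 1
        self._in_flight -= 1
        return ("reply", request)

    def call(self, request):
        # Callers no longer lock; the client does it on every call.
        with self._lock:
            return self._send(request)


def hammer(client, n_threads=8, n_calls=200):
    """Issue calls from several threads and wait for all of them."""

    def worker():
        for i in range(n_calls):
            client.call(i)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return client
```

With the lock held around `_send`, `max_in_flight` never exceeds 1 regardless of how many threads share the client.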
Reviewers: mislav.bradac
Reviewed By: mislav.bradac
Differential Revision: https://phabricator.memgraph.io/D1056
Summary: Add curly braces handling in `QueryStripper` and an accompanying test for it.
Reviewers: mislav.bradac
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1051
Summary: In this diff I just wanted to fix the tests' flakiness. We can discuss if we want to always pass endpoint as an argument and never pass address:port pair explicitly. However, if we decide that, I will do that change in another diff.
Reviewers: dgleich, florijan
Reviewed By: dgleich, florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1049
Summary:
Union query combinator implementation consists of:
* adjustments to the AST and `cypher_main_visitor`
* enabling `QueryStripper` to parse multiple `return` statements (not stopping after first)
* symbol generation for union results
* union logical operator
* query plan generator adjustments
Reviewers: teon.banek, mislav.bradac
Reviewed By: teon.banek
Subscribers: pullbot, buda
Differential Revision: https://phabricator.memgraph.io/D1038
Summary: RPC recipient should update its term even if it is rejecting the request.
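The rule can be illustrated with a heavily simplified vote handler, sketched in Python. `Member` and `on_request_vote` are hypothetical names, and the log check compares indices only (real Raft also compares the term of the last log entry). What matters is the ordering: the newer term is adopted before any of the rejection checks run.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Member:
    """Hypothetical, heavily simplified Raft member state."""
    term: int = 0
    voted_for: Optional[str] = None
    last_log_index: int = 0

    def on_request_vote(self, candidate, candidate_term, candidate_last_log_index):
        # Adopt a newer term first, whether or not the vote is granted.
        if candidate_term > self.term:
            self.term = candidate_term
            self.voted_for = None
        # Reject stale candidates.
        if candidate_term < self.term:
            return self.term, False
        # Reject if we already voted for someone else in this term.
        if self.voted_for not in (None, candidate):
            return self.term, False
        # Reject candidates whose log is behind ours (index-only, simplified).
        if candidate_last_log_index < self.last_log_index:
            return self.term, False
        self.voted_for = candidate
        return self.term, True
```

A member at term 1 that rejects a term 3 candidate because of a short log still answers with term 3 and stays at term 3 afterwards.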
Reviewers: mislav.bradac, buda
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1039
Summary:
Implement log replication
Rebase and fix src/CMakeLists.txt
Some style fixes
Changed shared_ptr to unique_ptr for RaftPeerState
Change Id and Leader to const
Move implementation to separate class
Fix raft_experiments.cpp
Reviewers: mislav.bradac
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1033
Summary:
What's done:
- `RecordAccessor` can represent remote data
- `GraphDbAccessor` manages remote data
- Cleanup: different `EdgeAccessor` laziness (@dgleich: take a look), unused methods, documentation...
- `TODO` placeholders for remote implementation
What's not done:
- RPC and data transfer
- how exactly remote errors are handled
- not sure if any MVCC Record info for remote data should be tracked
- WAL and RPC Deltas properly handled (Gleich working on extracting `Wal::Op`)
This implementation should not break single-node execution, and should provide good abstractions and placeholders for distributed. Once that's satisfied, it should land.
Reviewers: dgleich, buda, mislav.bradac
Reviewed By: dgleich
Subscribers: dgleich, pullbot
Differential Revision: https://phabricator.memgraph.io/D1030
Summary: Operations are moved and renamed from WAL to a separate file in preparation for HA and distributed storage.
Reviewers: florijan, mtomic, mislav.bradac
Reviewed By: florijan
Subscribers: mislav.bradac, pullbot
Differential Revision: https://phabricator.memgraph.io/D1034
Summary:
Parsing queries with string values wasn't correct. The parser ignored
which type of quote mark started the string value. It simply treated
any unescaped quote mark as a possible beginning or end of a string
value.
The following example didn't work:
{name: "Chris Anderson: Technology's long tail"}
The detected string value was "Chris Anderson: Technology' and because
of that the query broke. Now, a string is detected by remembering which
quote mark started the string value, and the value must end with that
same quote mark.
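A minimal sketch of the technique in Python; it is not the actual `QueryStripper` code, and `find_string_literals` is a hypothetical helper. The scanner remembers the opening quote character and closes the literal only on an unescaped quote of the same kind.

```python
def find_string_literals(query):
    """Return (start, end) index pairs of quoted string literals in `query`.

    A literal ends only at an unescaped quote of the same kind that opened it.
    """
    spans = []
    opening = None   # quote character that opened the current literal
    start = 0
    i = 0
    while i < len(query):
        ch = query[i]
        if opening is not None and ch == "\\":
            i += 2   # skip the escaped character inside a literal
            continue
        if opening is None:
            if ch in ("'", '"'):
                opening, start = ch, i
        elif ch == opening:
            spans.append((start, i + 1))
            opening = None
        i += 1
    return spans
```

For the failing example, the apostrophe in `Technology's` is ignored because the literal was opened with a double quote, so the whole value is returned as one span.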
Reviewers: teon.banek, msantl
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1029
Summary:
It occurred to me that part of the durability test flakiness might be
that the same durability directory is always used. If the test is run
simultaneously on a single system, there will be interference.
This might not actually fix all the flakiness :(
I also made the `utils::RandomString` function since that's now used in
multiple places, tested it etc.
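A rough Python illustration of the approach. `random_string` is only a hypothetical analogue of `utils::RandomString`, and the `durability_` path layout is an assumption made for the example, not the naming the tests actually use.

```python
import random
import string


def random_string(length, rng=random):
    """Return `length` random ASCII letters."""
    return "".join(rng.choice(string.ascii_letters) for _ in range(length))


def unique_durability_dir(base, length=8, rng=random):
    """Build a per-run directory path so parallel test runs do not share state."""
    return f"{base}/durability_{random_string(length, rng)}"
```

Two test processes calling `unique_durability_dir` with the same base end up in different directories with overwhelming probability, which removes the interference described above.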
Reviewers: buda, dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1020
Summary:
This change generates multiple PropertyFilters for expressions such as
`n.prop1 = m.prop2`. When choosing one PropertyFilter, we want to also
remove the other one, because they represent the same original
expression. Therefore, the removal is no longer based on FilterInfo
equality, but on the original expression equality. Additionally,
FilterInfo and PropertyFilter equality operators have been removed to
avoid any pretense they do what you expect or want.
Reviewers: florijan, msantl
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1021
Summary:
The current idea is that the same MG binary can be used for single-node,
distributed master and distributed worker. The transactional engine in
the single-node and distributed master is the same: it determines the
transactional time and exposes all the "global" functionalities. In the
distributed worker the "global" functions must contact the master.
Reviewers: dgleich, mislav.bradac, buda
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1013
Summary: Because it will never be used, we already have replacements for it.
Reviewers: buda
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1016
Summary: Once a snapshot is successfully written, delete WAL files which are no longer necessary for recovery. Note that this prohibits recovering the WAL from any except the last snapshot.
Reviewers: buda, mislav.bradac, dgleich
Reviewed By: dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1000
Summary:
This diff contains step 1:
- Remove clog exposure from tx::engine
- Reduce and cleanup tx::Engine API
All current functionality is kept, but the API is reduced. This is very
desirable because every function in tx::Engine will need to be
considered and implemented in both Master and Worker situations. The
less we have, the better.
Next step is exactly that: seeing how each of these functions behaves in
a distributed system and implementing accordingly.
Reviewers: dgleich, mislav.bradac, buda
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D1008
Summary:
Referring to the TCK failure on:
```
MATCH (a {name: 'Andres'})<-[:FATHER]-(child)
RETURN {foo: a.name='Andres', kids: collect(child.name)}
```
In the planner we'd only treat a list|map as a group_by if it contained
no aggregations. That's changed so that if a map contains both aggregations
and non-aggregations, then non-aggregations are treated as individual
group_by expressions.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: buda, pullbot, teon.banek
Differential Revision: https://phabricator.memgraph.io/D1004
Summary:
In preparation for distributed storage we need to have labels/properties/edgetypes uniquely identifiable by their ids, which will be global in near future.
The old design has to be abandoned because it's not possible to keep track of global labels/properties/edgetypes while they are local pointers.
Reviewers: mislav.bradac, florijan
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D993
Summary:
Looking for connected components in a random graph. This test performs the following:
- Generates a random graph that is NOT sequential in memory (otherwise iteration over edges is 2 or more times faster).
- Connectivity by iterating over all the edges.
- Ditto over vertices.
- Ditto over vertices in parallel.
Not done:
- Edge filtering based on XY. I could/should add that to see how it affects perf.
- Getting component info out from union-find.
Local results are encouraging. Iterating over the graph is the bottleneck. Still, I get connectivity of 10M vertices/edges in <7sec (parallel over vertices). Will test on 250M remote now.
Locally obtained results (2M/2M, 2 threads)
```
I1115 14:57:55.136875 357 otto_parallel.cpp:50] Generating 2000000 vertices...
I1115 14:58:19.057734 357 otto_parallel.cpp:74] Generated 2000000 vertices in 23.9208 seconds.
I1115 14:58:19.919221 357 otto_parallel.cpp:82] Generating 2000000 edges...
I1115 14:58:39.519951 357 otto_parallel.cpp:93] Generated 2000000 edges in 19.3398 seconds.
I1115 14:58:39.520349 357 otto_parallel.cpp:196] Running Edge iteration...
I1115 14:58:43.857264 357 otto_parallel.cpp:199] Done in 4.33691 seconds, result: 3999860270398
I1115 14:58:43.857316 357 otto_parallel.cpp:196] Running Vertex iteration...
I1115 14:58:49.498181 357 otto_parallel.cpp:199] Done in 5.64087 seconds, result: 4000090070787
I1115 14:58:49.498208 357 otto_parallel.cpp:196] Running Connected components - Edges...
I1115 14:58:54.232530 357 otto_parallel.cpp:199] Done in 4.73433 seconds, result: 323935
I1115 14:58:54.232570 357 otto_parallel.cpp:196] Running Connected components - Vertices...
I1115 14:59:00.412395 357 otto_parallel.cpp:199] Done in 6.17983 seconds, result: 323935
I1115 14:59:00.412422 357 otto_parallel.cpp:196] Running Parallel connected components - Vertices...
I1115 14:59:04.662087 357 otto_parallel.cpp:199] Done in 4.24967 seconds, result: 323935
I1115 14:59:04.662116 357 otto_parallel.cpp:196] Running Expansion...
I1115 14:59:13.913015 357 otto_parallel.cpp:199] Done in 9.25091 seconds, result: 323935
```
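For reference, counting connected components with union-find can be sketched as follows. This is a generic single-threaded Python version with path halving and union by size, not the benchmark's implementation; `count_components` is a hypothetical name and vertices are assumed to be numbered from 0.

```python
def count_components(vertex_count, edges):
    """Count connected components with union-find (path halving + union by size)."""
    parent = list(range(vertex_count))
    size = [1] * vertex_count

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    components = vertex_count
    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra == rb:
            continue                        # already in the same component
        if size[ra] < size[rb]:
            ra, rb = rb, ra                 # attach the smaller tree to the larger
        parent[rb] = ra
        size[ra] += size[rb]
        components -= 1
    return components
```

Each edge that joins two previously separate sets reduces the component count by one, so a single pass over the edges is enough.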
Reviewers: buda, mislav.bradac, dgleich, teon.banek
Reviewed By: buda, teon.banek
Subscribers: teon.banek, pullbot
Differential Revision: https://phabricator.memgraph.io/D982
Summary: Vertex and Edge now use Address for storing connections to other Edges and Vertices, to support distributed storage.
Reviewers: mislav.bradac, dgleich, buda
Reviewed By: mislav.bradac, dgleich
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D977
Summary:
My dear fellow Memgraphians. It's Friday afternoon, and I am as ready to pop as WAL is to get reviewed...
What's done:
- Vertices and Edges have global IDs, stored in `VersionList`. Main storage is now a concurrent map ID->vlist_ptr.
- WriteAheadLog class added. It's based around buffering WAL::Op objects (elementary DB changes) and periodically serializing and flushing them to disk.
- Snapshot recovery refactored, WAL recovery added. Snapshot format changed again to include necessary info.
- Durability testing completely reworked.
What's not done (and should be when we decide how):
- Old WAL file purging.
- Config refactor (naming and organization). Will do when we discuss what we want.
- Changelog and new feature documentation (both depending on the point above).
- Better error handling and recovery feedback. Currently it's all returning bools, which is not fine-grained enough (neither for errors nor partial successes, also EOF is reported as a failure at the moment).
- Moving the implementation of WAL stuff to .cpp where possible.
- Not sure if there are transactions being created outside of `GraphDbAccessor` and its `BuildIndex`. Need to look into.
- True write-ahead logic (flag controlled): not committing a DB transaction if the WAL has not flushed its data. We can discuss the gain/effort ratio for this feature.
Reviewers: buda, mislav.bradac, teon.banek, dgleich
Reviewed By: dgleich
Subscribers: mtomic, pullbot
Differential Revision: https://phabricator.memgraph.io/D958
Summary:
Previously, named path symbols remained untracked as `new_symbols` during planning. This meant that
operator `Optional` would be left unaware of those symbols, and therefore not reset them to `Null`
if optional matching failed.
Test Optional operator will be aware of path symbols
Reviewers: florijan, mislav.bradac
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D974
Summary:
Tests have been updated to catch this error and other behaviour. Other
than this change, `AND` should behave as before.
Reviewers: florijan, mislav.bradac
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D970
Summary:
Use sigaction to register signal handlers.
This is preferred over `signal` function, according to `man 3p signal`.
Add global sig_atomic_t flag when shutting down.
Block other signal handlers when shutting down.
Reviewers: mislav.bradac, mferencevic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D943
Summary:
Remove name from GraphDb.
Take GraphDb in query test macros instead of accessor.
Add is_accepting_transactions flag to GraphDb.
Reviewers: mislav.bradac, florijan, mferencevic
Reviewed By: mislav.bradac
Subscribers: mferencevic, pullbot
Differential Revision: https://phabricator.memgraph.io/D940
Summary:
- Removed durability::Summary because it was wired into reader and stopped me from recovering WAL files.
- Refactored and renamed BufferedFile(Reader/Writer) to HashedFile(Reader/Writer).
- Vertex and edge counts in the snapshot are now hashed.
Breaking snapshot compatibility again (hashing), but since the previous version was not released, and we are not caching snapshots, the previous version does not need to be supported.
Reviewers: teon.banek, mislav.bradac, buda
Reviewed By: teon.banek, mislav.bradac
Subscribers: dgleich, pullbot
Differential Revision: https://phabricator.memgraph.io/D932
Summary:
Time csv_to_snapshot conversion and log it.
Check if writing csv_to_snapshot failed.
Extract LoadConfig from memgraph_bolt to config.hpp.
Read memgraph config in csv_to_snapshot for snapshot_directory.
Rename csv_to_snapshot to mg_import_csv.
Add tests for tools.
Run tools tests in apollo.
Reviewers: mislav.bradac, florijan, mferencevic, buda
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D931
Summary:
Move QueryParts and Filters to a new file.
Reorganize FilterInfo struct.
Remove label filter if we do indexed scan by label.
Remove property filter used in indexed scan.
Reviewers: florijan
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D915
Summary: Locked version. There are some benchmarks, it seems the lock won't be the bottleneck in the WAL (DB ops causing WAL delta insertions into it will be slower, flushing the WAL will be slower).
Reviewers: buda, mislav.bradac, dgleich
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D919
Summary:
New snapshot structure:
- magic number
- snapshot version (old-version recovery not yet implemented)
- transaction snapshot (will be used in the WAL)
- the rest is as before (indices, vertices, edges)
Not backward compatible with the old snapshotting.
Does not improve error handling (user feedback). A task for that has been added.
Reviewers: buda, mislav.bradac, mferencevic, teon.banek
Reviewed By: teon.banek
Subscribers: teon.banek, dgleich, pullbot
Differential Revision: https://phabricator.memgraph.io/D912
Summary:
Since we have different kind of workers in Apollo we should pass
assigned cpus to harness from apollo generate script and not define them
in harness or in benchmarks.
Reviewers: mferencevic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D916
Summary: RemoveEdges is an extremely slow operation because it iterates over all the vertices to find the appropriate edge. It kind of messes up the DB usage. This diff stops it ever getting called, but does not delete the function. We might want it to happen **very rarely**, but it's probably best never to call it.
Reviewers: buda, mferencevic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D910
Summary: It wasn't used in MG, only in tests.
Reviewers: buda, dgleich, mislav.bradac
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D909
Summary: This change increases the planning time, but should reduce memory consumption.
Reviewers: florijan, mislav.bradac
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D901
Summary: In the current state, it was not possible to iterate, or even access a const map, or const set structure because of an incorrect implementation of "ConstAccessors".
Reviewers: mislav.bradac, teon.banek, buda
Reviewed By: teon.banek, buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D902
Summary: Added support for optional properties, random integers (uniform distr) and random strings.
Reviewers: mislav.bradac, buda
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D885
Summary:
This puts the whole installation and packaging under a single point of
entry. (Docker, DEB, RPM, etc.)
Rename alpha.dockerfile to beta.dockerfile
Use Debian Stretch for docker
Remove building old hardcoded compiler
Rename build_interpreter to build_memgraph
Remove unused config-file
Reviewers: mferencevic, buda
Reviewed By: mferencevic, buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D857
Summary: This is not a very important functionality, but it turned out simple to do, so let's add it to have a consistent query support.
Reviewers: buda, teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D862
Summary:
Increased query execution timeout.
Enabled global queries by default.
Implemented faster RandomElement for vertices and edges.
Changed long running verify message format.
Changed vertex and edge count to be per worker.
Reviewers: mislav.bradac
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D864
Summary:
Batch creation now does not iterate over V**2 vertex pairs, but over V vertices and generates E/V random edges for each one. Assuming there is an index over :V(id), this is much more efficient and supports larger vertex counts (Ferenc: modify the test config as desired).
An additional requirement is that E/V (for each worker) is a whole number.
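The counting behind this can be sketched in Python. The real test creates edges through queries; `generate_edges` is a hypothetical helper that only shows the per-vertex loop and the whole-number requirement.

```python
import random


def generate_edges(vertex_count, edge_count, rng=random):
    """Yield (source, destination) id pairs: E/V random edges per vertex."""
    if edge_count % vertex_count != 0:
        raise ValueError("E/V must be a whole number")
    per_vertex = edge_count // vertex_count
    for source in range(vertex_count):
        for _ in range(per_vertex):
            # Destination is picked at random; the source is visited once.
            yield source, rng.randrange(vertex_count)
```

The loop does V * (E/V) = E steps instead of V**2, and every vertex is the source of exactly E/V edges.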
Reviewers: mferencevic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D860
Summary:
- Removed BreadthFirstAtom, using EdgeAtom only with a Type enum.
- Both variable expansions (breadth and depth first) now have mandatory inner node and edge Identifiers.
- Both variable expansions use inline property filtering and support inline lambdas.
- BFS and variable expansion now have the same planning process.
- Planner modified in the following ways:
- Variable expansions support inline property filtering (two filters added to all_filters, one for inline, one for post-expand).
- Asserting against existing_edge since we don't support that anymore.
- Edge and node symbols bound after variable expansion to disallow post-expand filters to get inlined.
- Some things simplified due to different handling.
- BreadthFirstExpand logical operator merged into ExpandVariable. Two Cursor classes remain and are dynamically chosen from.
As part of planned planner refactor we should ensure that a filter is applied only once. The current implementation is very suboptimal for property filtering in variable expansions.
@buda: we will start refactoring this these days. This current planner logic is too dense and complex. It is becoming technical debt. Most of the time I spent working on this has been spent figuring the planning out, and I still needed Teon's help at times. Implementing the correct and optimal version of query execution (avoiding multiple potentially expensive filterings) was out of reach also due to tech debt.
Reviewers: buda, teon.banek
Reviewed By: teon.banek
Subscribers: pullbot, buda
Differential Revision: https://phabricator.memgraph.io/D852
Summary:
Log files aren't created by default anymore.
All logs are reported to stderr by default.
Normalized flag names.
Removed unnecessary flags from gflags.
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D856
Summary:
Add json and cppitertools to libs/CMakeLists.txt.
Import external projects as libraries.
This removes the need to use `add_dependencies` in order to link with
external project.
Extract common ExternalProject_Add function.
Add macro for easier addition of external libraries.
Reviewers: mislav.bradac, mferencevic
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D845
Summary:
- The new BFS syntax implemented as proposed.
- AST BreadthFirstAtom now uses EdgeAtom members: has_range_{true}, upper_bound_, lower_bound_
- Edges data structure now handles all the edge filtering (single or multiple edges), to ease planning. Additional edge filtering (additional Filter op in the plan) is removed. AST EdgeTypeTest is no longer used and is removed.
Current state is stable but there are things left to do:
- BFS property filtering.
- BFS lower_bound_ support.
- Support for lambdas in variable length expansion. This includes obligatory (even if not user_defined) inner_node and inner_edge symbols for easier handling.
- Code-sharing between BFS and variable length expansions.
I'll add asana tasks (and probably start working on them immediately) when/if this lands.
Reviewers: buda, teon.banek, mislav.bradac
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D836
Summary:
AST caching should be well tested by now.
We should consider removing `Context.is_query_cached_` member as well as the
implementation and tests for `CypherMainVisitor`.
Reviewers: mislav.bradac, mferencevic, buda
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D833
Summary: - modified all utils/algorithm functions to be inline and in the utils namespace
Reviewers: teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D830
Summary:
I started with cleaning flags up (removing unused ones, documenting undocumented ones). There were some flags to remove in `QueryEngine`. Seeing how we never use hardcoded queries (AFAIK Mislav's last testing also indicated they aren't faster than interpretation), when removing those unused flags the `QueryEngine` becomes obsolete. That means that a bunch of other stuff becomes obsolete, along with the hardcoded queries. So I removed it all (this has been discussed and approved on the daily).
Some flags that were previously undocumented in `docs/user_technical/installation` are now documented. The following flags are NOT documented and in my opinion should not be displayed when starting `./memgraph --help` (@mferencevic):
```
query_vertex_count_to_expand_existing (from rule_based_planner.cpp)
query_max_plans (rule_based_planner.cpp)
```
If you think that another organization is needed w.r.t. flag visibility, comment.
@teon.banek: I had to remove some stuff from CMakeLists to make it buildable. Please review what I removed and clean up if necessary if/when this lands. If the needed changes are minor, you can also comment.
Reviewers: buda, mislav.bradac, teon.banek, mferencevic
Reviewed By: buda, mislav.bradac
Subscribers: pullbot, mferencevic, teon.banek
Differential Revision: https://phabricator.memgraph.io/D825
Summary:
Added a warning to the log. Every 3 seconds (let's not make that configurable). Can be turned off.
I still don't like this. We are raising another thread and reading a file to do monitoring. We're developing a DB, not a sys monitor. Serious admins do that themselves. But, here it is.
UPDATE:
Cleaned up `utils/sysinfo/memory`. Removed all unused functions. Removed the faulty memory check in `tests/concurrent`.
Reviewers: buda, mferencevic, mislav.bradac
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D819
Summary:
Setup scaffolding for building Memgraph tools.
Change `utils::Split` without delimiter to split on whitespace.
This should make `Split` behave just like Python's `str.split`, which is
more practical for splitting on word boundaries.
Add `utils::StartsWith` function.
Rewrite csv_to_snapshot to C++.
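For reference, this is the Python `str.split` behaviour that `utils::Split` without a delimiter is meant to mirror (the sample strings are made up for illustration):

```python
text = "  MATCH   (n)\tRETURN n \n"

# No delimiter: split on runs of whitespace; leading and trailing
# whitespace produce no empty strings.
words = text.split()

# Explicit delimiter: every occurrence splits, so empty strings can appear.
fields = "a,,b".split(",")
```

Here `words` contains only the four tokens, while `fields` keeps the empty string between the two commas.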
Reviewers: mferencevic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D822
Summary:
Run neo4j and memgraph from run_benchmark script.
This makes mg and neo scripts obsolete.
Reviewers: buda, teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D806
Summary:
Test planner splits MATCH ... WHERE
Remove distinction between FilterAnd and AndOperator
Reviewers: florijan, mislav.bradac
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D814
Summary:
This fixes a bug, where streaming would try to get the name of the symbol from
an invalid token position. For example, `MATCH (n) WITH n RETURN *`
Reviewers: florijan, mislav.bradac
Reviewed By: florijan, mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D811
Summary: Removed the traversal API, been waiting for named path to land because that data structure has replaced traversal's Path. I left the wiki page of the API, just put a warning that it's not used.
Reviewers: buda, teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D809
Summary:
Three TODOs resolved.
1. around line 897 - we currently don't support expansion into existing variable length edges (there is a TODO in symbol_generator.cpp:213), so this should not be done at the moment.
2. around line 1025 - This TODO was on review and nobody commented, so I'm removing it. Should have done that when the diff landed.
3. around line 1560 - This does not seem possible. Edge-uniqueness checks happen within a single `[OPTIONAL ] MATCH`. If it is OPTIONAL (the case interesting here), then the uniqueness check also gets planned under the optional branch. So, if an optional fails, the uniqueness check will get skipped, as opposed to getting executed over a Null. I added an edge-case test to verify this (and checked with the planner test).
Reviewers: buda, teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D807
Summary:
- Keys() functions in the indices can't be const because ConcurrentMap doesn't provide const accessors (and they are broken in skiplist) :D
- no cucumber tests because many tests create indices so it's hard to say what's inside and what not
Reviewers: buda, mislav.bradac
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D797
Summary:
This is done when the generated AST will be cached.
Remove LiteralsPlugger.
Reviewers: florijan, mislav.bradac
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D788
Summary: I was mistaken in my calculations before and gave it +-3sigma tolerance (0.0027 probability of failure). Now I changed it to +-5sigma, which is good enough for CERN, and should be for us too.
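The quoted failure probability can be checked with a short Python calculation, assuming the tested quantity is approximately normally distributed. `two_sided_tail` is a hypothetical helper name.

```python
import math


def two_sided_tail(sigmas):
    """Probability that a normal variable lands more than `sigmas` std devs from its mean."""
    return math.erfc(sigmas / math.sqrt(2))
```

`two_sided_tail(3)` is about 0.0027, matching the figure above, while `two_sided_tail(5)` is below one in a million.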
Reviewers: mislav.bradac, mferencevic
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D780
Summary:
Antlr grammar has been updated to support putting edge types after the BFS
symbol. Planner collects edge type filters for BFS and inlines them in the
operator by joining the filter with the user input BFS filter itself. This
requires no change from the standpoint of the operator. On the other hand, in
order to use the faster lookup by a single edge type, `ExpandBreadthFirst`
operator now accept an optional edge type. The edge type is passed from the
planner only if the user is filtering by a single type.
Unit tests as well as tck have been updated.
Reviewers: florijan, mislav.bradac
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D777
Summary:
Add function First to utils.
Insert EdgeType into Expand during planning.
Reviewers: florijan, mislav.bradac
Reviewed By: florijan
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D769
Summary: The constness of the DbAccessor interferes with caching the results.
Reviewers: florijan, mislav.bradac
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D771
Summary:
Benchmark planning and estimating indexed ScanAll. According to the benchmark,
caching speeds up the whole process of planning and estimation by a factor of
2. Most of the performance gain is in the `CostEstimator` itself, due to plenty
of calls to `VerticesCount` when estimating all of the generated plans.
Reviewers: mislav.bradac, florijan
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D765
Summary:
This fixes the problem when the test would fail on some PCs...
Notably, mine...
Reviewers: florijan, mferencevic, mislav.bradac, buda
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D770
Summary:
- The `Edges` data structure now handles common ops, including providing an iterator over edges whose "other" vertex is known.
- This should improve performance on dense_expand tests in the harness without other side-effects.
- query::plan::Expand operator modified not to check for existing-node stuff since that now gets handled by the `Edges` data structure.
- `Edges::Iterator` implemented only for const iterators since that suffices for now. Can implement non-const if the need arises.
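The idea of an edge container that filters by a known "other" vertex can be sketched as below. This is an illustrative Python model with integer ids, not the C++ `Edges` class; the method names and the flat-list storage are assumptions.

```python
from typing import Iterator, List, Optional, Tuple


class Edges:
    """Hypothetical sketch: a flat list of (other_vertex, edge) pairs."""

    def __init__(self) -> None:
        self._storage: List[Tuple[int, int]] = []

    def add(self, other_vertex: int, edge: int) -> None:
        self._storage.append((other_vertex, edge))

    def remove(self, edge: int) -> None:
        self._storage = [(v, e) for v, e in self._storage if e != edge]

    def iterate(self, other_vertex: Optional[int] = None) -> Iterator[Tuple[int, int]]:
        # With a known destination, skip entries that point elsewhere, so the
        # caller does not have to do the existing-node check itself.
        for vertex, edge in self._storage:
            if other_vertex is None or vertex == other_vertex:
                yield vertex, edge
```

Calling `iterate()` with no argument walks every edge, while `iterate(v)` yields only the edges whose other end is `v`.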
Reviewers: buda, mislav.bradac, teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D763
Summary: For the time being we'll go with vector and a custom iterator for known-destination-vertex lookups, but this benchmark might be handy to keep for future work.
Reviewers: buda, mislav.bradac, teon.banek
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D762
Summary: Ugly formatting and bad variable naming due to 1024 limit on neo4j client.
Reviewers: buda, teon.banek
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D748
Summary:
1. Test setup rewritten to take about 8 seconds. Note that edges are created by using:
`MATCH (a) WITH a MATCH (b) WITH b WHERE rand() < X CREATE (a)-[:ET]->(b)`
Where `X` is a threshold calculated so that the expected number of created edges equals the desired edge count. This seems to be the only feasible way of generating a large number of edges, since query execution time depends on vertex count rather than on edge count.
2. Using the new `assert` function to verify graph state. I recommend doing that in all the harness tests (I don't think we currently have something better).
3. All tests rewritten to take around 200ms per iteration.
4. Tests use SKIP to avoid sending data to the client, while ensuring that the appropriate operations get executed. This currently seems like the best way of removing unwanted side-effects.
Harness will cost us our sanity. And it doesn't even provide good quality regression testing we really need :(
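The threshold `X` from point 1 can be derived as follows. The sketch assumes `rand()` is uniform on [0, 1) and that the double MATCH enumerates every ordered pair of vertices, self-pairs included; `edge_threshold` is a hypothetical helper name, not part of the harness.

```python
def edge_threshold(vertex_count: int, desired_edges: int) -> float:
    """Threshold X such that the expected number of created edges is desired_edges."""
    if vertex_count <= 0:
        raise ValueError("vertex_count must be positive")
    # MATCH (a) ... MATCH (b) yields vertex_count ** 2 ordered (a, b) pairs.
    pairs = vertex_count ** 2
    if desired_edges > pairs:
        raise ValueError("cannot expect more edges than candidate pairs")
    # Each pair creates an edge with probability X, so E[edges] = pairs * X.
    return desired_edges / pairs


print(edge_threshold(1000, 10000))  # 0.01
```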
Reviewers: buda, mislav.bradac, mferencevic
Reviewed By: mferencevic
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D752
Summary: It turns out to be trivial if I use UNWIND for vertex creation, MATCH for edge creation and UNWIND for test duration. It took hours to converge to this :(
Reviewers: mislav.bradac, buda
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D747
Summary: `assert` function added. Useful to us for asserting DB state in harness tests. Potentially useful to the client for breaking out of a query as soon as a predicate fails, as opposed to collecting results and checking them client-side.
Reviewers: buda, mislav.bradac, teon.banek
Reviewed By: teon.banek
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D750
Summary:
Moved Neo4j config to config dir.
Neo4j and PostgreSQL are now downloaded to libs.
Renamed metadata flags in memgraph.
Changed apollo generate for new harness.
Reviewers: mislav.bradac
Reviewed By: mislav.bradac
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D741
Summary:
Don't require setup_system to run as root, nor apt-get
Implement command_fail for ldbc/setup_dependencies
ldbc.setup_dataset: Find Java on ArchLinux
Reviewers: buda, mferencevic
Reviewed By: buda
Subscribers: pullbot
Differential Revision: https://phabricator.memgraph.io/D729