Summary: Updated the feature specs, the changelog and added a new section in user technical. Reviewers: mferencevic, mculinovic, buda, ipaljak Reviewed By: ipaljak Subscribers: pullbot Differential Revision: https://phabricator.memgraph.io/D1534
Kafka - openCypher clause
One must be able to specify the following when importing data from Kafka:
- Kafka URI
- Kafka topic
- Transform script URI
The minimum required syntax is:
CREATE STREAM stream_name AS LOAD DATA KAFKA 'URI'
WITH TOPIC 'topic'
WITH TRANSFORM 'URI';
The full openCypher clause for creating a stream is:
CREATE STREAM stream_name AS
LOAD DATA KAFKA 'URI'
WITH TOPIC 'topic'
WITH TRANSFORM 'URI'
[BATCH_INTERVAL milliseconds]
[BATCH_SIZE count]
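As an illustration of how the required and optional parts of the clause fit together, here is a minimal Python sketch that assembles the query text. The helper name `build_create_stream` and the example URI, topic, and script values are hypothetical and not part of the spec; the helper does no quoting or escaping, so it is only suitable for trusted input.

```python
from typing import Optional


def build_create_stream(
    name: str,
    kafka_uri: str,
    topic: str,
    transform_uri: str,
    batch_interval_ms: Optional[int] = None,
    batch_size: Optional[int] = None,
) -> str:
    """Assemble a CREATE STREAM query string (hypothetical helper, no escaping)."""
    parts = [
        f"CREATE STREAM {name} AS",
        f"LOAD DATA KAFKA '{kafka_uri}'",
        f"WITH TOPIC '{topic}'",
        f"WITH TRANSFORM '{transform_uri}'",
    ]
    # Optional parameters are emitted only when the caller supplies them.
    if batch_interval_ms is not None:
        parts.append(f"BATCH_INTERVAL {batch_interval_ms}")
    if batch_size is not None:
        parts.append(f"BATCH_SIZE {batch_size}")
    return " ".join(parts) + ";"


# Example values below are made up for illustration.
query = build_create_stream(
    "my_stream", "localhost:9092", "events", "file:///tmp/transform.py",
    batch_size=50,
)
print(query)
```

Omitting both optional arguments yields the minimum required syntax.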
The CREATE STREAM clause happens in a transaction.
The WITH TOPIC parameter specifies the Kafka topic from which data is streamed.
The WITH TRANSFORM parameter should contain the URI of the transform script.
The BATCH_INTERVAL parameter defines the time interval, in milliseconds, between two successive stream import operations.
The BATCH_SIZE parameter defines the number of Kafka messages that are batched together before import.
If both BATCH_INTERVAL and BATCH_SIZE parameters are given, the condition that is satisfied first will trigger the batched import.
The default value for BATCH_INTERVAL is 100 milliseconds, and the default value for BATCH_SIZE is 10.
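The "whichever condition is satisfied first" rule can be sketched as a small Python simulation. This is not the actual consumer code: the function names are hypothetical, and it assumes the interval is measured from the arrival of the first message in a batch, which the spec does not state.

```python
from typing import List

DEFAULT_BATCH_INTERVAL_MS = 100  # default from the spec
DEFAULT_BATCH_SIZE = 10          # default from the spec


def batch_ready(count: int, elapsed_ms: int,
                batch_interval_ms: int = DEFAULT_BATCH_INTERVAL_MS,
                batch_size: int = DEFAULT_BATCH_SIZE) -> bool:
    """A batch is imported as soon as either limit is reached."""
    return count >= batch_size or elapsed_ms >= batch_interval_ms


def split_into_batches(arrival_ms: List[int],
                       batch_interval_ms: int = DEFAULT_BATCH_INTERVAL_MS,
                       batch_size: int = DEFAULT_BATCH_SIZE) -> List[List[int]]:
    """Group sorted message arrival times (in ms) into batches."""
    batches: List[List[int]] = []
    current: List[int] = []
    window_start = 0
    for t in arrival_ms:
        # The interval may have expired while waiting for this message.
        if current and batch_ready(len(current), t - window_start,
                                   batch_interval_ms, batch_size):
            batches.append(current)
            current = []
        if not current:
            window_start = t  # assumption: the window opens with the first message
        current.append(t)
        # The size limit may be reached by this message.
        if batch_ready(len(current), t - window_start,
                       batch_interval_ms, batch_size):
            batches.append(current)
            current = []
    if current:
        batches.append(current)  # flush whatever is left at the end
    return batches


# Size limit closes the first batch; the second is flushed at the end of input.
by_size = split_into_batches([0, 10, 20, 150, 160], batch_size=3)
# Interval limit closes the first batch before the message at 120 ms is added.
by_interval = split_into_batches([0, 50, 120, 130])
```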
The DROP clause deletes a stream:
DROP STREAM stream_name;
The SHOW clause enables you to see all configured streams:
SHOW STREAMS;
You can also start and stop streams with the START and STOP clauses:
START STREAM stream_name [LIMIT count BATCHES];
STOP STREAM stream_name;
A stream must be stopped before it can be started, and started before it can be stopped. Starting an already started stream or stopping an already stopped stream has no effect on that stream.
There are also convenience clauses to start and stop all streams:
START ALL STREAMS;
STOP ALL STREAMS;
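The start and stop semantics described above amount to a two-state model in which redundant transitions are no-ops. The following Python sketch is a toy in-memory model, not the actual implementation; the class and method names are hypothetical, and it assumes a newly created stream begins in the stopped state.

```python
from typing import Dict


class StreamRegistry:
    """Toy model of stream states (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._running: Dict[str, bool] = {}  # stream name -> is it running?

    def create(self, name: str) -> None:
        if name in self._running:
            raise ValueError(f"stream {name!r} already exists")
        self._running[name] = False  # assumption: new streams start out stopped

    def start(self, name: str) -> bool:
        """START STREAM: returns True only if the state actually changed."""
        changed = not self._running[name]
        self._running[name] = True
        return changed

    def stop(self, name: str) -> bool:
        """STOP STREAM: returns True only if the state actually changed."""
        changed = self._running[name]
        self._running[name] = False
        return changed

    def start_all(self) -> None:
        """START ALL STREAMS: already running streams are left untouched."""
        for name in self._running:
            self.start(name)

    def stop_all(self) -> None:
        """STOP ALL STREAMS: already stopped streams are left untouched."""
        for name in self._running:
            self.stop(name)

    def is_running(self, name: str) -> bool:
        return self._running[name]


registry = StreamRegistry()
registry.create("a")
registry.create("b")
first_start = registry.start("a")   # state changes: stopped -> running
second_start = registry.start("a")  # no-op: already running
registry.start_all()                # starts "b", leaves "a" as it is
```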
Before the actual import, you can also test the stream with the TEST STREAM clause:
TEST STREAM stream_name [LIMIT count BATCHES];
When a stream is tested, data extraction and transformation occur, but no output is inserted into the graph.
A stream needs to be stopped in order to test it. When the batch limit is omitted, TEST STREAM will run for only one batch by default.
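The dry-run behaviour of TEST STREAM can be pictured with a short Python sketch. It assumes, purely for illustration, that a transform is a callable applied to each message; the spec does not define the transform script's interface, and `dry_run_stream` is a hypothetical name.

```python
from typing import Callable, List, Sequence


def dry_run_stream(
    batches: Sequence[Sequence[str]],
    transform: Callable[[str], str],
    is_running: bool = False,
    limit: int = 1,
) -> List[str]:
    """Transform up to `limit` batches and return the output without importing it."""
    if is_running:
        # Mirrors the rule that a stream must be stopped before it is tested.
        raise RuntimeError("stream must be stopped before it can be tested")
    results: List[str] = []
    for batch in batches[:limit]:  # limit defaults to one batch
        results.extend(transform(message) for message in batch)
    return results  # nothing is written to the graph


# Made-up messages and a stand-in transform (str.upper) for illustration.
pending = [["a", "b"], ["c"]]
one_batch = dry_run_stream(pending, str.upper)
two_batches = dry_run_stream(pending, str.upper, limit=2)
```

Returning the transformed output instead of applying it is what distinguishes this from a real import in the sketch.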