memgraph/docs/feature_specs/kafka/transform.md
Matija Santl 4ee3db80b0 Add kafka documentation
2018-08-09 16:52:52 +02:00

Kafka - data transform

The transform script is a user-defined script written in Python. It must know the data format of the Kafka messages it receives.

Each Kafka message is byte length encoded, which means that the first eight bytes of each message contain the length of the message.
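
As a rough sketch of what such framing looks like, the snippet below packs and unpacks a length-prefixed message with Python's `struct` module. The byte order (little-endian), the unsigned 64-bit type, and the convention that the length counts only the payload are assumptions made for illustration; the text above only states that the first eight bytes hold the length. The helper names `frame` and `unframe` are hypothetical.

```python
import struct

# Assumption: little-endian unsigned 64-bit length prefix.
LENGTH_FORMAT = "<Q"
LENGTH_SIZE = struct.calcsize(LENGTH_FORMAT)  # 8 bytes


def frame(payload):
    """Prefix payload with its length in eight bytes (hypothetical helper)."""
    return struct.pack(LENGTH_FORMAT, len(payload)) + payload


def unframe(buffer):
    """Split one framed message off the front of buffer.

    Returns (payload, remaining_bytes).
    """
    if len(buffer) < LENGTH_SIZE:
        raise ValueError("buffer too short to hold the length prefix")
    (length,) = struct.unpack_from(LENGTH_FORMAT, buffer, 0)
    end = LENGTH_SIZE + length
    if len(buffer) < end:
        raise ValueError("buffer ends before the full payload")
    return buffer[LENGTH_SIZE:end], buffer[end:]


# Two made-up messages written back to back, then read in order.
framed = frame(b"1 2") + frame(b"3")
first, rest = unframe(framed)
second, rest = unframe(rest)
```

Under these assumptions, a reader can pull messages off a byte stream one at a time without relying on any delimiter inside the payload.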

Sample code for a streaming transform script could look like this:

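def create_vertex(vertex_id):
    # Return a (query, parameters) pair that creates a single node.
    return ("CREATE (:Node {id: $id})", {"id": vertex_id})


def create_edge(from_id, to_id):
    # Return a (query, parameters) pair that connects two existing nodes.
    return ("MATCH (n:Node {id: $from_id}), (m:Node {id: $to_id}) "
            "CREATE (n)-[:Edge]->(m)", {"from_id": from_id, "to_id": to_id})


def stream(batch):
    # Each item in the batch holds the raw bytes of one Kafka message.
    result = []
    for item in batch:
        message = item.decode('utf-8').strip().split()
        if len(message) == 1:
            # A single token is a vertex id.
            result.append(create_vertex(message[0]))
        else:
            # Two tokens describe an edge: source id, then target id.
            result.append(create_edge(message[0], message[1]))
    return result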

The script should output openCypher queries, each paired with its parameters, based on the type of the records.
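
To illustrate the shape of that output, the sketch below runs a compact copy of the sample transform on a hand-made batch. The batch contents are made up for illustration, and the names `to_query` and `transform_batch` are hypothetical; how Memgraph actually invokes the script is not shown here. Unlike the sample `stream()`, this variant skips blank records, which is an added assumption about how malformed input might be handled.

```python
def to_query(tokens):
    """Map one tokenized record to a (query, parameters) pair."""
    if len(tokens) == 1:
        return ("CREATE (:Node {id: $id})", {"id": tokens[0]})
    return ("MATCH (n:Node {id: $from_id}), (m:Node {id: $to_id}) "
            "CREATE (n)-[:Edge]->(m)",
            {"from_id": tokens[0], "to_id": tokens[1]})


def transform_batch(batch):
    """Same logic as the sample stream(), but skips blank records."""
    result = []
    for item in batch:
        tokens = item.decode("utf-8").strip().split()
        if tokens:
            result.append(to_query(tokens))
    return result


# Made-up batch: two vertex records, one edge record, one blank record.
sample_batch = [b"1\n", b"2\n", b"1 2\n", b"\n"]
queries = transform_batch(sample_batch)
for query, params in queries:
    print(query, params)
```

Note that the ids arrive as strings because the records are decoded text; a script that needs integer ids would have to convert them itself.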