# Lisp C++ Preprocessor (LCP)

In our development process we are using Common Lisp to generate some parts of
the C++ codebase. The idea behind this is supplementing C++ with better
meta-programming capabilities to automate tasks and prevent bugs due to code
duplication. The primary candidate for using more powerful meta-programming is
generating serialization code. Such code is almost always the same: go through
all `struct` or `class` members and invoke the serialization function on them.
Writing such code manually is error-prone when adding members, because you may
easily forget to correctly update the serialization code. Thus, the Lisp C++
Preprocessor was born. It is hooked into our build process as a step before
compilation. The remainder of the document describes how to use LCP and its
features.

Contents

* [Running LCP](#running-lcp)
* [Writing LCP](#writing-lcp)
  - [Inlining C++ in Common Lisp](#inlining-cpp)
  - [C++ Namespaces](#cpp-namespaces)
  - [C++ Enumerations](#cpp-enums)
  - [C++ Classes & Structs](#cpp-classes)
  - [Defining an RPC](#defining-an-rpc)
  - [Cap'n Proto Serialization](#capnp-serial)
  - [SaveLoadKit Serialization](#slk-serial)

## Running LCP

You can generate C++ from an LCP file by running the following command.

`./tools/lcp <path-to-file.lcp>`

The LCP will produce a `path-to-file.hpp` file and potentially a
`path-to-file.lcp.cpp` file. The `.cpp` file is generated if some parts of the
code need to be in the implementation file. This is usually the case when
generating serialization code. Note that the `.cpp` file has the extension
appended to `.lcp`, so that you are free to define your own `path-to-file.cpp`
which includes the generated `path-to-file.hpp`.

One serialization format uses the Cap'n Proto library, but to use it, you need
to provide an ID. The ID is generated by invoking `capnp id`. When you want to
generate Cap'n Proto serialization, you need to pass the generated ID to LCP.

`./tools/lcp <path-to-file.lcp> $(capnp id)`

Generating Cap'n Proto serialization will produce an additional file,
`path-to-file.capnp`, which contains the serialization schema.

You may wonder why the LCP doesn't invoke `capnp id` itself. Unfortunately,
such behaviour would be wrong when running LCP on the same file multiple
times. Each run would produce a different ID and the serialization code would
be incompatible between versions.

### CMake

The LCP is run in CMake using the `add_lcp` function defined in
`CMakeLists.txt`. You can take a look at the function documentation there for
information on how to add your new LCP files to the build system.

## Writing LCP

An LCP file should have the `.lcp` extension, but the code written is
completely valid Common Lisp code. This means that you have a complete
language at your disposal before the C++ is even compiled. You can view this
as similar to C++ templates and macros, except that those do not have access
to a complete language.

Besides Common Lisp, you are allowed to write C++ code verbatim. This means
that C++ and Lisp code coexist in the file. How to do that, as well as other
features, is described below.

### Inlining C++ in Common Lisp {#inlining-cpp}

To insert C++ code, you need to use a `#>cpp ... cpp<#` block. This is most
often used at the top of the file to write Doxygen documentation and put some
includes. For example:

```cpp
#>cpp
/// @file My Doxygen style documentation about this file

#pragma once

#include <vector>
cpp<#
```

The above code will be pasted as is into the generated header file. If you
wish to have a C++ block in the `.cpp` implementation file instead, you should
use the `lcp:in-impl` function. For example:

```cpp
(lcp:in-impl
#>cpp
void MyClass::Method(int awesome_number) {
  // Do something with awesome_number
}
cpp<#)
```

The C++ block also supports string interpolation with a syntax akin to shell
variable access, `${lisp-variable}`. At the moment, only variables are
supported and they have to be pretty printable in Common Lisp (i.e. support
the `~A` format directive). For example, we can make a precomputed sine
table for the integers from 0 up to (but not including) 5:

```lisp
(let ((sin-from-0-to-5
       (format nil "~{~A~^, ~}" (loop for i from 0 below 5 collect (sin i)))))
  #>cpp
  static const double kSinFrom0To5[] = {${sin-from-0-to-5}};
  cpp<#)
```

The following will be generated.

```cpp
static const double kSinFrom0To5[] = {0.0, 0.84147096, 0.9092974, 0.14112, -0.7568025};
```

Since you have a complete language at your disposal, this is a powerful tool
to generate tables for computations which would take a very long time during
the execution of the C++ program.

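
As a quick sanity check of the generated table, the following sketch compares
each entry with `std::sin` evaluated at run time. Only `kSinFrom0To5` comes
from the example above; the helper name `TableMatchesStdSin` and the
tolerances are assumptions made for this illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Table as generated by the LCP example above.
static const double kSinFrom0To5[] = {0.0, 0.84147096, 0.9092974, 0.14112,
                                      -0.7568025};

// Hypothetical helper: returns true if every table entry is within
// `tolerance` of std::sin evaluated at the same integer argument.
bool TableMatchesStdSin(double tolerance) {
  assert(tolerance >= 0.0);
  const std::size_t size = sizeof(kSinFrom0To5) / sizeof(kSinFrom0To5[0]);
  for (std::size_t i = 0; i < size; ++i) {
    const double expected = std::sin(static_cast<double>(i));
    if (std::fabs(kSinFrom0To5[i] - expected) > tolerance) return false;
  }
  return true;
}
```

A loose tolerance such as `1e-6` passes, while a very strict one such as
`1e-12` does not, because the printed values carry only about 8 significant
digits. Keep that in mind if the generated table needs full `double`
precision.
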
### C++ Namespaces {#cpp-namespaces}

Although you can use inline C++ to open and close namespaces, it is
recommended to use the `lcp:namespace` and `lcp:pop-namespace` functions. LCP
will report an error if you have an unclosed namespace, unlike Clang and GCC,
which often give strange errors due to C++ grammar ambiguity. An additional
benefit is that LCP will track the namespace stack and correctly wrap any C++
code which should be put in the `.cpp` file.

For example:

```lisp
;; example.lcp
(lcp:namespace utils)

;; Function declaration in header
#>cpp
bool StartsWith(const std::string &string, const std::string &prefix);
cpp<#

;; Function implementation in implementation file
(lcp:in-impl
#>cpp
bool StartsWith(const std::string &string, const std::string &prefix) {
  // Implementation code
  return false;
}
cpp<#)

(lcp:pop-namespace) ;; utils
```

The above will produce 2 files, header and implementation:

```cpp
// example.hpp
namespace utils {

bool StartsWith(const std::string &string, const std::string &prefix);

}
```

```cpp
// example.lcp.cpp
namespace utils {

bool StartsWith(const std::string &string, const std::string &prefix) {
  // Implementation code
  return false;
}

}
```

### C++ Enumerations {#cpp-enums}

LCP provides a `lcp:define-enum` macro to define a C++ `enum class` type. This
makes LCP aware of the type and all its possible values, which in turn makes
it possible to generate the serialization code. In the future, LCP may
generate "string to enum" and "enum to string" functions.

Example:

```lisp
(lcp:define-enum days-in-week
  (monday tuesday wednesday thursday friday saturday sunday)
  ;; Optional documentation
  (:documentation "Enumerates days of the week")
  ;; Optional directive to generate serialization code
  (:serialize))
```

Produces:

```cpp
/// Enumerates days of the week
enum class DaysInWeek {
  MONDAY,
  TUESDAY,
  WEDNESDAY,
  THURSDAY,
  FRIDAY,
  SATURDAY,
  SUNDAY
};

// serialization code ...
```

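
The conversion functions mentioned above are not generated by LCP at the
moment. The following is only a hypothetical, hand-written sketch of what
they could look like for `DaysInWeek`; the names `DaysInWeekToString`,
`StringToDaysInWeek` and `kAllDays` are assumptions and not part of LCP.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

enum class DaysInWeek {
  MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY, SATURDAY, SUNDAY
};

// Hypothetical list of all values, used for the reverse lookup below.
static const DaysInWeek kAllDays[] = {
    DaysInWeek::MONDAY,   DaysInWeek::TUESDAY, DaysInWeek::WEDNESDAY,
    DaysInWeek::THURSDAY, DaysInWeek::FRIDAY,  DaysInWeek::SATURDAY,
    DaysInWeek::SUNDAY};

// "enum to string": returns the enumerator name.
const char *DaysInWeekToString(DaysInWeek day) {
  switch (day) {
    case DaysInWeek::MONDAY: return "MONDAY";
    case DaysInWeek::TUESDAY: return "TUESDAY";
    case DaysInWeek::WEDNESDAY: return "WEDNESDAY";
    case DaysInWeek::THURSDAY: return "THURSDAY";
    case DaysInWeek::FRIDAY: return "FRIDAY";
    case DaysInWeek::SATURDAY: return "SATURDAY";
    case DaysInWeek::SUNDAY: return "SUNDAY";
  }
  throw std::invalid_argument("Unknown DaysInWeek value");
}

// "string to enum": throws if the name matches no enumerator.
DaysInWeek StringToDaysInWeek(const std::string &name) {
  for (DaysInWeek day : kAllDays) {
    if (name == DaysInWeekToString(day)) return day;
  }
  throw std::invalid_argument("Unknown day name: " + name);
}

// Usage example: every value survives a round trip through its name.
void CheckDaysInWeekRoundTrip() {
  for (DaysInWeek day : kAllDays) {
    assert(StringToDaysInWeek(DaysInWeekToString(day)) == day);
  }
}
```
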
### C++ Classes & Structs {#cpp-classes}

For defining C++ classes, there is a `lcp:define-class` macro. Its counterpart
for structures is `lcp:define-struct`. They are exactly the same, except that
`lcp:define-struct` puts members in public scope by default, just like in
C++.

Defining classes is a bit more involved, because they have many customization
options. The syntax follows the syntax of class definition in Common Lisp
(see `defclass`).

Basic example:

```lisp
(lcp:define-class my-class ()
  ((primitive-value :int64_t)
   (stl-vector "std::vector<int>"))
  ;; Optional documentation
  (:documentation "My class documentation")
  ;; Define explicitly public, protected or private code. All are optional.
  (:public #>cpp // some public code, e.g. methods cpp<#)
  (:protected #>cpp // protected cpp<#)
  (:private #>cpp //private cpp<#))
```

The above will generate:

```cpp
/// My class documentation
class MyClass {
 public:
  // some public code, e.g. methods

 protected:
  // protected

 private:
  // private

  int64_t primitive_value_;
  std::vector<int> stl_vector_;
};
```

As you can see, members in LCP are followed by a type. For primitive types, a
Lisp keyword is used, e.g. `:int64_t`, `:bool`, etc. Other types, like STL
containers, use a string containing a valid C++ type.

C++ supports nesting types inside a class. You can do the same in LCP inside
any of the scoped additions.

For example:

```lisp
(lcp:define-class my-class ()
  ((member "NestedType")
   (value "NestedEnum"))
  (:private
    (lcp:define-enum nested-enum (first-value second-value))

    (lcp:define-class nested-type ()
      ((member :int64_t)))

    #>cpp
    // Some other C++ code
    cpp<#))
```

The above should produce expected results.

You can add base classes after the class name. The name should be a Lisp
symbol for base classes defined through `lcp:define-class`, so that LCP tracks
the inheritance. Otherwise, it should be a string.

For example:

```lisp
(lcp:define-class derived (my-class "UnknownInterface")
  ())
```

Will generate:

```cpp
class Derived : public MyClass, public UnknownInterface {
};
```

Similarly, you can specify template parameters. Instead of giving just a name
to `define-class`, you give a list where the first element is the name of the
class, while the others name the template parameters.

```lisp
(lcp:define-class (my-map t-key t-value) ()
  ((underlying-map "std::unordered_map<TKey, TValue>")))
```

The above will generate:

```cpp
template <class TKey, class TValue>
class MyMap {
 private:
  std::unordered_map<TKey, TValue> underlying_map_;
};
```

Other than tweaking the class definition, you can also do additional
configuration of members. The following options are supported.

* `:initval` -- sets the initial value of a member
* `:reader` -- generates a public getter
* `:scope` -- sets the scope of a member, one of `:public`, `:private` or
  `:protected`
* `:documentation` -- Doxygen documentation of a member
* various serialization options which are explained later

For example:

```lisp
(lcp:define-class my-class ()
  ((member "std::vector<int>" :scope :protected :initval "1, 2, 3" :reader t
           :documentation "Member documentation")))
```

Will generate:

```cpp
class MyClass {
 public:
  const auto &member() { return member_; }

 protected:
  /// Member documentation
  std::vector<int> member_{1, 2, 3};
};
```

### Defining an RPC

In our codebase, we have implemented remote procedure calls. These are used
for communication between Memgraph instances in a distributed system. Each RPC
is registered by its type and requires serializable data structures. Writing
an RPC-compliant structure requires a lot of boilerplate. To ease the pain of
defining a new RPC we have a macro, `lcp:define-rpc`.

The definition consists of 2 parts: request and response. You can specify
members of each part. Member definition is the same as in `lcp:define-class`.

For example:

```lisp
(lcp:define-rpc query-result
  (:request
    ((tx-id "tx::TransactionId")
     (query-id :int64_t)))
  (:response
    ((values "std::vector<int>"))))
```

The above will generate a relatively large amount of C++ code, which is
omitted here as the details aren't important for understanding the use.
Examining the generated code is left as an exercise for the reader.

The important detail is that in C++ you will have a `QueryResultRpc`
structure, which is used to register the behaviour of an RPC server. You need
to perform the registration manually. For example:

```cpp
// somewhere in code you have a server instance
rpc_server.Register<QueryResultRpc>(
    [](const auto &req_reader, auto *res_builder) {
      QueryResultReq request;
      Load(&request, req_reader);
      // process the request and send the response
      QueryResultRes response(values_for_response);
      Save(response, res_builder);
    });


// somewhere else you have a client which sends the RPC
tx::TransactionId tx_id = ...
int64_t query_id = ...
auto response = rpc_client.template Call<QueryResultRpc>(tx_id, query_id);
if (response) {
  const auto &values = response->getValues();
  // do something with values
}
```

RPC structures use Cap'n Proto for serialization. The above variables
`req_reader` and `res_builder` are used to access Cap'n Proto structures.
The LCP will also generate the Cap'n Proto schema alongside the C++
code for serialization.

### Cap'n Proto Serialization {#capnp-serial}

The primary purpose of LCP was to make serialization of types easier. Our
serialization library of choice for C++ is Cap'n Proto. LCP provides
generation and tuning of its serialization code. Previously, LCP supported
Boost.Serialization, but it was removed.

To specify a class or structure for serialization, you may pass a
`:serialize :capnp` option when defining such a type. (Note that
`lcp:define-enum` takes `:serialize` without any arguments.)

For example:

```lisp
(lcp:define-struct my-struct ()
  ((member :int64_t))
  (:serialize :capnp))
```

The `:serialize` option will generate a Cap'n Proto schema of the class and
store it in the `.capnp` file. C++ code will be generated for saving and
loading members:

```cpp
// Top level functions
void Save(const MyStruct &self, capnp::MyStruct::Builder *builder);
void Load(MyStruct *self, const capnp::MyStruct::Reader &reader);
```

Since we use top level functions, the class needs to have some sort of public
access to its members.

The schema file will be namespaced in `capnp`. To add a prefix namespace, use
the `lcp:capnp-namespace` function. For example, if we use
`(lcp:capnp-namespace "my_namespace")` then the reader and builder would be in
`my_namespace::capnp`.

Serializing a class hierarchy is also supported. The most basic case with
single inheritance works out of the box. Handling other cases is explained in
later sections.

For example:

```lisp
(lcp:define-class base ()
  ((base-member "std::vector<int64_t>" :scope :public))
  (:serialize :capnp))

(lcp:define-class derived (base)
  ((derived-member :bool :scope :public))
  (:serialize :capnp))
```

Note that all classes need to have the `:serialize` option set. Signatures of
`Save` and `Load` functions are changed to accept reader and builder to the
base class. The `Load` function now takes a `std::unique_ptr<T> *` which is
used to take ownership of a concrete type. This approach transfers the
responsibility of type allocation and construction from the user of `Load` to
`Load` itself.

```cpp
void Save(const Derived &self, capnp::Base::Builder *builder);
void Load(std::unique_ptr<Base> *self, const capnp::Base::Reader &reader);
```

#### Multiple Inheritance

Cap'n Proto does not support any form of inheritance, so we are handling it
manually. Single inheritance was relatively easy to add on top of Cap'n
Proto: we simply enumerate all derived types inside the union of a base type.

Multiple inheritance is a different beast and as such is not directly
supported.

One way to use multiple inheritance is only to implement the interface of pure
virtual classes without any members (i.e. interface classes). In such a case,
you do not want to serialize any other base class except the primary one. To
let LCP know that is the case, use `:ignore-other-base-classes t`. LCP will
only try to serialize the base class that is the first (leftmost) in the list
of super classes.

```lisp
(lcp:define-class derived (primary-base some-interface other-interface)
  ...
  (:serialize :capnp :ignore-other-base-classes t))
```

Another form of multiple inheritance is reusing some common code. In
actuality, this is a very bad code practice and should be replaced with
composition. If it would take too long to fix such code to use proper
composition, we can tell LCP to treat such inherited classes as if they were
indeed composed. This is done via the `:inherit-compose` option.

For example:

```lisp
(lcp:define-class derived (first-base second-base)
  ...
  (:serialize :capnp :inherit-compose '(second-base)))
```

With `:inherit-compose` you can pass a list of parent classes which should be
encoded as composition inside the Cap'n Proto schema. LCP will complain if
there is multiple inheritance but you didn't specify `:inherit-compose`.

The downside of this approach is that `Save` and `Load` will work only on
`FirstBase`. Serializing a pointer to `SecondBase` would be incorrect.

#### Inheriting C++ Class Outside of LCP

Classes defined outside of `lcp:define-class` are not visible to LCP and LCP
will not be able to generate correct serialization code.

So far, the only classes inherited from outside of LCP have been pure
interfaces which need no serialization code. To signal this to LCP, pass the
option `:base t` to `:serialize :capnp` on the inheriting class. LCP will then
treat that class as actually being the base class of a hierarchy.

For example:

```lisp
(lcp:define-class my-class ("utils::TotalOrdering")
  (...)
  (:serialize :capnp :base t))

(lcp:define-class derived (my-class)
  (...)
  (:serialize :capnp))
```

Only the base class for serialization has the `:base t` option set. Derived
classes are defined as usual. This relies on the fact that we do not expect
anyone to have a pointer to `utils::TotalOrdering` and use it for
serialization and deserialization.

#### Template Classes

Currently, LCP supports the most primitive form of serializing templated
classes. The template arguments must be provided to specify an explicit
instantiation. Cap'n Proto does support generics, so we may want to upgrade
LCP to use them in the future.

To specify template arguments, pass a `:type-args` option. For example:

```lisp
(lcp:define-class (my-container t-value) ()
  (...)
  (:serialize :capnp :type-args '(my-class)))
```

The above will support serialization of the `MyContainer<MyClass>` type.

The syntax will work even if our templated class inherits from non-templated
classes. All other cases of inheritance with templates are forbidden in LCP
serialization.

#### Cap'n Proto Schemas and Type Conversions

You can import other serialization schemas by using the `lcp:capnp-import`
function. It expects a name for the import and the path to the schema file.

For example, to import everything from `utils/serialization.capnp` under the
name `Utils`, you can do the following:

```lisp
(lcp:capnp-import 'utils "/utils/serialization.capnp")
```

To use those types, you need to register a conversion from the C++ type to
the schema type. There are two options: registering a conversion for the whole
file with `lcp:capnp-type-conversion`, or converting a specific class member.

For example, you have a class with a member of type `Bound` and there is a
schema for it, also named `Bound`, inside the imported schema.

You can use `lcp:capnp-type-conversion` like so:

```lisp
(lcp:capnp-type-conversion "Bound" "Utils.Bound")

(lcp:define-class my-class ()
  ((my-bound "Bound")))
```

Specifying only a member conversion can be done with the `:capnp-type` member
option:

```lisp
(lcp:define-class my-class ()
  ((my-bound "Bound" :capnp-type "Utils.Bound")))
```

#### Custom Save and Load Hooks

Sometimes the default serialization is not adequate and you may wish to
provide your own serialization code. For those reasons, LCP provides
`:capnp-save`, `:capnp-load` and `:capnp-init` options on each class member.

The simplest is `:capnp-init` which, when set to `nil`, will not generate an
`init<member>` call on a builder. Cap'n Proto requires that compound types are
initialized before beginning to serialize their members. `:capnp-init` allows
you to delay the initialization to your custom save code. You rarely want to
set `:capnp-init nil`.

Custom save code is added as a value of `:capnp-save`. It should be a function
which takes 3 arguments.

1. Name of builder variable.
2. Name of the class (or struct) member.
3. Name of the member in Cap'n Proto schema.

The result of the function needs to be a C++ code block.

You will rarely need to use the 3rd argument, so it should be ignored in most
cases. It is usually needed when you set `:capnp-init nil`, so that you can
correctly initialize the builder.

Similarly, `:capnp-load` expects a function which takes a reader, C++ member
and Cap'n Proto member, and returns a C++ block.

Example:

```lisp
(lcp:define-class my-class ()
  ((my-member "ComplexType"
              :capnp-init nil
              :capnp-save (lambda (builder member capnp-name)
                            #>cpp
                            auto data = ${member}.GetSaveData();
                            auto my_builder = ${builder}.init${capnp-name}();
                            my_builder.setData(data);
                            cpp<#)
              :capnp-load (lambda (reader member capnp-name)
                            (declare (ignore capnp-name))
                            #>cpp
                            auto data = ${reader}.getData();
                            ${member}.LoadFromData(data);
                            cpp<#)))
  (:serialize :capnp))
```

With custom serialization code, you may want to get additional details through
extra arguments to `Save` and `Load` functions. This is described in the next
section.

There are also cases where you always need custom serialization code. LCP
provides helper functions for abstracting some common details. These functions
are listed further down in this document.

#### Arguments for Save and Load

Default arguments for the `Save` and `Load` functions are the Cap'n Proto
builder and reader, respectively. In some cases you may wish to send
additional arguments. This is most commonly needed when tracking `shared_ptr`
serialization, to avoid serializing the same pointer multiple times.

Additional arguments are specified by passing `:save-args` and `:load-args`.
You can specify either of them, but in most cases you want both.

For example:

```lisp
;; Class for tracking details during save
(lcp:define-class save-helper ()
  (...))

;; Class for tracking details during load
(lcp:define-class load-helper ()
  (...))

(lcp:define-class my-class ()
  ((member "std::shared_ptr<int>"
           :capnp-save ;; custom save
           :capnp-load ;; custom load
           ))
  (:serialize :capnp
              :save-args '((save-helper "SaveHelper *"))
              :load-args '((load-helper "LoadHelper *"))))
```

The custom serialization code will now have access to `save_helper` and
`load_helper` variables in C++. You can add more arguments by expanding the
list of pairs, e.g.

```lisp
:save-args '((first-helper "SomeType *") (second-helper "OtherType *") ...)
```

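
To illustrate why such helper arguments are useful, here is a minimal sketch
of pointer tracking during save. The contents of `SaveHelper`, its `Track`
method and the ID scheme are hypothetical and chosen only for this example;
they are not part of LCP or Cap'n Proto. The idea is that custom save code
asks the helper whether a pointer was already seen and, if so, writes only its
ID instead of the pointed-to data.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <unordered_map>
#include <utility>

// Hypothetical helper, passed to `Save` through `:save-args`.
class SaveHelper {
 public:
  // Returns {id, first_time}. `first_time` is true only on the first call
  // for a given pointer, which is when the pointed-to data should be saved.
  std::pair<std::uint64_t, bool> Track(const std::shared_ptr<int> &pointer) {
    assert(pointer != nullptr);
    auto found = ids_.find(pointer.get());
    if (found != ids_.end()) return {found->second, false};
    const std::uint64_t id = next_id_++;
    ids_.emplace(pointer.get(), id);
    return {id, true};
  }

 private:
  // Maps the raw address of each saved object to the ID assigned to it.
  std::unordered_map<const void *, std::uint64_t> ids_;
  std::uint64_t next_id_ = 0;
};
```

A matching `LoadHelper` would presumably keep the reverse mapping, from ID to
an already loaded `std::shared_ptr`, so that shared ownership is restored on
load.
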
#### Custom Serialization Helper Functions

##### Helper for `std::optional`

When using `std::optional` with primitive C++ types or custom types known to
LCP, you do not need to use any helper. In the example below, things should be
serialized as expected:

```lisp
(lcp:define-class my-class-with-primitive-optional ()
  ((primitive-optional "std::experimental::optional<int64_t>"))
  (:serialize :capnp))

(lcp:define-class my-class-with-known-type-optional ()
  ((known-type-optional "std::experimental::optional<MyClassWithPrimitiveOptional>"))
  (:serialize :capnp))
```

In cases when the value contained in `std::optional` needs custom
serialization code you may use `lcp:capnp-save-optional` and
`lcp:capnp-load-optional`.

Both functions expect 3 arguments.

1. Cap'n Proto type in C++.
2. C++ type of the value inside `std::optional`.
3. Optional C++ lambda code.

The lambda code is optional, because LCP will generate the default
serialization code which invokes the `Save` and `Load` functions on the value
stored inside the optional. Since most of the serialized classes follow the
convention, you will rarely need to provide this 3rd argument.

For example:

```lisp
(lcp:define-class my-class ()
  ((member "std::experimental::optional<SomeType>"
           :capnp-save (lcp:capnp-save-optional
                         "capnp::SomeType" "SomeType"
                         "[](auto *builder, const auto &val) { ... }")
           :capnp-load (lcp:capnp-load-optional
                         "capnp::SomeType" "SomeType"
                         "[](const auto &reader) { ... return loaded_val; }"))))
```

##### Helper for `std::vector`

For custom serialization of vector elements, you may use
`lcp:capnp-save-vector` and `lcp:capnp-load-vector`. They function exactly the
same as helpers for `std::optional`.

##### Helper for enumerations

If the enumeration is defined via `lcp:define-enum`, the default LCP
serialization should generate the correct code.

However, if LCP cannot infer the serialization code, you can use helper
functions `lcp:capnp-save-enum` and `lcp:capnp-load-enum`. Both functions
require 3 arguments.

1. C++ type of equivalent Cap'n Proto enum.
2. Original C++ enum type.
3. List of enumeration values.

Example:

```lisp
(lcp:define-class my-class ()
  ((enum-value "SomeEnum"
               :capnp-init nil ;; must be set to nil
               :capnp-save (lcp:capnp-save-enum
                             "capnp::SomeEnum" "SomeEnum"
                             '(first-value second-value))
               :capnp-load (lcp:capnp-load-enum
                             "capnp::SomeEnum" "SomeEnum"
                             '(first-value second-value)))))
```

### SaveLoadKit Serialization {#slk-serial}

LCP supports generating serialization code for use with our own simple
serialization framework, SaveLoadKit (SLK).

To specify a class or structure for serialization, pass a `:serialize :slk`
class option. For example:

```lisp
(lcp:define-struct my-struct ()
  ((member :int64_t))
  (:serialize :slk))
```

The above will generate C++ functions for saving and loading all members of
the defined type. The generated code is put inside the `slk` namespace. For
the above example, we would get the following declarations:

```cpp
namespace slk {
void Save(const MyStruct &self, slk::Builder *builder);
void Load(MyStruct *self, slk::Reader *reader);
}
```

Since we use top level (i.e. non-member) functions, the class members need to
have public access. The primary reason why we use non-member functions is the
ability to have them decoupled from types. This in turn allows us to easily
compile the code with and without serialization. The obvious downside is the
requirement of public access, which could potentially allow for erroneous use
of classes and their members. Therefore, the recommended way to use
serialization is with plain old data (POD) types. The programmer needs to be
aware of that and use POD as an immutable type as much as possible. This
recommendation of using POD types will also help minimize the complexity of
serialization code as well as minimize required features in LCP.

Another requirement on serialized types is that they need to be default
constructible. This keeps the serialization implementation simple and uniform.
Each type is first default constructed, potentially on stack memory. Then the
`slk::Load` function is invoked with the pointer to that instance. We could
add support for having a pointer to uninitialized memory and perform the
construction in `slk::Load` to allow types which aren't default constructible.
At the moment, implementing this support would needlessly complicate our code
where most of the types can be and are default constructible.

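
The round trip described above can be sketched as follows. The `Builder` and
`Reader` types here are simplified, hypothetical stand-ins for `slk::Builder`
and `slk::Reader`, whose real interfaces are not shown in this document. The
namespace is called `slk_sketch` to make that explicit, and
`std::runtime_error` is used in place of the real decoding exception. The
sketch shows non-member `Save` and `Load` functions working on public
members, and a default constructed instance being filled in by `Load`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

namespace slk_sketch {

// Hypothetical stand-in for slk::Builder: appends raw bytes to a buffer.
class Builder {
 public:
  void Write(const void *data, std::size_t size) {
    assert(data != nullptr);
    const auto *bytes = static_cast<const std::uint8_t *>(data);
    buffer_.insert(buffer_.end(), bytes, bytes + size);
  }
  const std::vector<std::uint8_t> &buffer() const { return buffer_; }

 private:
  std::vector<std::uint8_t> buffer_;
};

// Hypothetical stand-in for slk::Reader: reads raw bytes back in order.
class Reader {
 public:
  explicit Reader(const std::vector<std::uint8_t> &buffer) : buffer_(buffer) {}
  void Read(void *data, std::size_t size) {
    if (size > buffer_.size() - position_)
      throw std::runtime_error("Unexpected end of serialized data");
    std::memcpy(data, buffer_.data() + position_, size);
    position_ += size;
  }

 private:
  const std::vector<std::uint8_t> &buffer_;
  std::size_t position_ = 0;
};

}  // namespace slk_sketch

// Plain old data type with public members, as recommended above.
struct MyStruct {
  std::int64_t member = 0;
};

namespace slk_sketch {

// Non-member functions: MyStruct itself knows nothing about serialization.
void Save(const MyStruct &self, Builder *builder) {
  builder->Write(&self.member, sizeof(self.member));
}

void Load(MyStruct *self, Reader *reader) {
  reader->Read(&self->member, sizeof(self->member));
}

}  // namespace slk_sketch

// Usage example: save a value and load it into a fresh instance.
MyStruct RoundTrip(const MyStruct &original) {
  slk_sketch::Builder builder;
  slk_sketch::Save(original, &builder);
  slk_sketch::Reader reader(builder.buffer());
  MyStruct loaded;  // Default constructed first, then filled in by Load.
  slk_sketch::Load(&loaded, &reader);
  return loaded;
}
```

Because `Save` and `Load` live outside of `MyStruct`, the struct can be
compiled without the serialization code, which is the decoupling argued for
above.
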
#### Single Inheritance
|
|
|
|
The first and most common step out of the POD zone is having classes with
|
|
inheritance. LCP supports serializing classes with single inheritance.
|
|
|
|
A minor complication appears when loading a pointer to a base class. When we
|
|
have a pointer to a base class, serializing it may save the data of some
|
|
concrete, derived type. Loading the pointer back will need to determine which
|
|
type was actually serialized. When we know the concrete type, we need to
|
|
construct it and load it. Finally, we can return a base pointer to that. For
|
|
this reason, we generate 2 loading functions: regular `Load` and
|
|
`ConstructAndLoad`. The latter function is used to do the whole process of
|
|
determining the type, constructing it and invoking regular `Load`. Since we
|
|
cannot know the type of the serialized pointer upfront, we cannot allocate the
|
|
exact required memory on the stack. For that reason, `ConstructAndLoad` will
|
|
perform a heap allocation for you. Obviously, this could be a performance
|
|
issue. In cases when we know the exact concrete type, then we can use the
|
|
regular `Load` which expects the pointer to that type. If you are using `Load`
|
|
instead of `ConstructAndLoad`, read the next paragraph carefully!
|
|
|
|
Determining which type was serialized works by storing the `id` of
|
|
`utils::TypeInfo` when saving a class which is anywhere in the inheritance
|
|
hierarchy. This is the *first* thing the invocation to `Save` does. Later,
|
|
when we call `ConstructAndLoad` it will read that type `id` and dispatch on it
|
|
to construct the instance of that type and call the appropriate `Load`
|
|
function. Beware when invoking `Load` of polymorphic types yourself! You
|
|
*need* to read the type `id` yourself *first* and then invoke the `Load`
|
|
function. Things will not work correctly if you forget to do that, because
|
|
`Load` expects to read the serialized data members and not the type
|
|
information.
|
|
|
|
For example:
|
|
|
|
```lisp
|
|
(lcp:define-class base ()
|
|
...
|
|
(:serialize :slk))
|
|
|
|
(lcp:define-class derived (base)
|
|
...
|
|
(:serialize :slk))
|
|
```
|
|
|
|
We get the following declarations generated:
|
|
|
|
```cpp
namespace slk {
// Save will correctly forward to derived class using `dynamic_cast`!
void Save(const Base &self, slk::Builder *builder);
// Load only the Base instance, does *not* forward!
void Load(Base *self, slk::Reader *reader);
// Construct the concrete type (could be Base or any derived) and call the
// correct Load. Raises `slk::SlkDecodeException` if an unknown type is
// serialized.
void ConstructAndLoad(std::unique_ptr<Base> *self, slk::Reader *reader);

void Save(const Derived &self, slk::Builder *builder);
void Load(Derived *self, slk::Reader *reader);
// This will raise slk::SlkDecodeException, if something other than `Derived`
// was serialized. `Derived` does not have any subclasses.
void ConstructAndLoad(std::unique_ptr<Derived> *self, slk::Reader *reader);
}  // namespace slk
```

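The type `id` protocol described above can be illustrated with a small,
self-contained sketch. This is *not* the code LCP generates: `Builder`,
`Reader`, the `kTypeId` constants and the use of `std::runtime_error` are all
hypothetical stand-ins (for `slk::Builder`, `slk::Reader`, the
`utils::TypeInfo` id and `slk::SlkDecodeException` respectively), chosen only
to keep the example compilable on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-ins for slk::Builder and slk::Reader.
struct Builder {
  std::vector<uint64_t> data;
};

class Reader {
 public:
  explicit Reader(const std::vector<uint64_t> &data) : data_(&data) {}
  uint64_t Next() {
    if (pos_ >= data_->size()) throw std::runtime_error("unexpected end of data");
    return (*data_)[pos_++];
  }

 private:
  const std::vector<uint64_t> *data_;
  size_t pos_ = 0;
};

struct Base {
  static constexpr uint64_t kTypeId = 1;  // stand-in for the utils::TypeInfo id
  virtual ~Base() = default;
  virtual uint64_t TypeId() const { return kTypeId; }
  uint64_t x = 0;
};

struct Derived : Base {
  static constexpr uint64_t kTypeId = 2;
  uint64_t TypeId() const override { return kTypeId; }
  uint64_t y = 0;
};

// Save writes the type id *first*, then the data members.
void Save(const Base &self, Builder *builder) {
  assert(builder != nullptr);
  builder->data.push_back(self.TypeId());
  builder->data.push_back(self.x);
  if (const auto *derived = dynamic_cast<const Derived *>(&self))
    builder->data.push_back(derived->y);
}

// Regular Load reads only data members; the type id must already be consumed.
void Load(Base *self, Reader *reader) { self->x = reader->Next(); }

void Load(Derived *self, Reader *reader) {
  Load(static_cast<Base *>(self), reader);
  self->y = reader->Next();
}

// Reads the type id, heap-allocates the matching type and calls its Load.
void ConstructAndLoad(std::unique_ptr<Base> *self, Reader *reader) {
  const uint64_t type_id = reader->Next();
  if (type_id == Base::kTypeId) {
    auto base = std::make_unique<Base>();
    Load(base.get(), reader);
    *self = std::move(base);
  } else if (type_id == Derived::kTypeId) {
    auto derived = std::make_unique<Derived>();
    Load(derived.get(), reader);
    *self = std::move(derived);
  } else {
    throw std::runtime_error("unknown type id");
  }
}
```

In this sketch, a caller who knows the concrete type and wants to avoid the
heap allocation would call `reader.Next()` once to consume the type id and then
call `Load` on a stack-allocated `Derived`. Skipping that first read would make
`Load` interpret the type id as the first data member.
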
#### Multiple Inheritance

Serializing classes with multiple inheritance is *not* supported!

Usually, multiple inheritance is used to satisfy some interface which doesn't
carry data for serialization. In such cases, you can ignore the multiple
inheritance by specifying the `:ignore-other-base-classes` option. For example:

```lisp
(lcp:define-class derived (primary-base some-interface ...)
  ...
  (:serialize :slk (:ignore-other-base-classes t)))
```

The above will produce serialization code as if `derived` is inheriting *only*
from `primary-base`.

#### Templated Types

Serializing templated types is *not* supported!

You may still write your own serialization code in C++, but LCP will not
generate it for you.

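As a rough illustration of what hand-written serialization for a templated
type could look like, here is a minimal sketch. `Builder`, `Reader`, the
primitive `Save`/`Load` overloads and the `Interval` template are all
hypothetical and only mimic the shape of the `slk` declarations shown earlier;
the real `slk` API may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical stand-ins for slk::Builder and slk::Reader.
struct Builder {
  std::vector<uint64_t> data;
};

class Reader {
 public:
  explicit Reader(const std::vector<uint64_t> &data) : data_(&data) {}
  uint64_t Next() {
    if (pos_ >= data_->size()) throw std::runtime_error("unexpected end of data");
    return (*data_)[pos_++];
  }

 private:
  const std::vector<uint64_t> *data_;
  size_t pos_ = 0;
};

// Primitive overloads, declared before the templates that rely on them.
void Save(uint64_t value, Builder *builder) { builder->data.push_back(value); }
void Load(uint64_t *value, Reader *reader) { *value = reader->Next(); }

// A templated type for which serialization has to be written by hand.
template <typename T>
struct Interval {
  T low{};
  T high{};
};

// Members are saved and loaded in the same order.
template <typename T>
void Save(const Interval<T> &self, Builder *builder) {
  assert(builder != nullptr);
  Save(self.low, builder);
  Save(self.high, builder);
}

template <typename T>
void Load(Interval<T> *self, Reader *reader) {
  assert(self != nullptr && reader != nullptr);
  Load(&self->low, reader);
  Load(&self->high, reader);
}
```

The only invariant the sketch relies on is that `Save` and `Load` visit the
members in the same order, which is exactly the kind of bookkeeping LCP
automates for non-templated types.
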
#### Custom Save and Load Hooks

In cases when default serialization is not adequate, you may wish to provide
your own serialization code. LCP provides `:slk-save` and `:slk-load` options
for each member.

These hooks for custom serialization expect a function with a single argument,
`member`, representing the member currently being serialized. This allows you
to write a more generic function which works with any member of some type. The
return value of the function needs to be C++ code. The returned code may
assume that `self` and `builder` (or `reader`, when loading) are in scope, just
like they are found in the generated `Save` and `Load` declarations.

For example, one of the most common use cases is saving and loading
a `std::shared_ptr`. You need to provide an argument which is used to track
which pointers were already (de)serialized. Let's take a look at how this
could be done in LCP.

```lisp
(lcp:define-struct my-struct ()
  ((some-ptr "std::shared_ptr<SomeType>"
             :slk-save (lambda (member)
                         #>cpp
                         std::vector<SomeType *> already_saved;
                         slk::Save(self.${member}, builder, &already_saved);
                         cpp<#)
             :slk-load (lambda (member)
                         #>cpp
                         std::vector<std::shared_ptr<SomeType>> already_loaded;
                         slk::Load(&self->${member}, reader, &already_loaded);
                         cpp<#)))
  (:serialize :slk))
```

The above use is very artificial, because we usually have multiple shared
pointers across different members. In such cases we would like to share the
tracking data. One way to do that is explained in the next section.

#### Additional Arguments to Generated Save and Load

As you may have noticed, primary arguments for `Save` and `Load` are the type
instance and a `slk::Builder` or a `slk::Reader`. In some cases we would like
to accept additional arguments to help us with the serialization process.
Let's see how this is done in LCP using the `:save-args` and `:load-args`
options for `:slk` serialization.

Both `:save-args` and `:load-args` options expect a list of pairs. Each pair
designates one argument. The first element of the pair is the argument name
and the second is the C++ type of that argument.

As mentioned in the previous section, one of the most common cases where
default serialization doesn't cut it is when we have a `std::shared_ptr`.
Here, we would like to track already serialized pointers. Instead of having
some kind of a global variable, we could pass the tracking data as an
additional argument. Let's take the example from the previous section, and
have it take tracking data as an argument to `Save` and `Load` of `my-struct`
type.

```lisp
(lcp:define-struct my-struct ()
  ((some-ptr "std::shared_ptr<SomeType>"
             :slk-save (lambda (member)
                         #>cpp
                         slk::Save(self.${member}, builder, already_saved);
                         cpp<#)
             :slk-load (lambda (member)
                         #>cpp
                         slk::Load(&self->${member}, reader, already_loaded);
                         cpp<#)))
  (:serialize :slk (:save-args '((already-saved "std::vector<SomeType *> *"))
                    :load-args '((already-loaded "std::vector<std::shared_ptr<SomeType>> *")))))
```

The generated declarations now look like the following:

```cpp
void Save(const MyStruct &self, slk::Builder *builder,
          std::vector<SomeType *> *already_saved);
void Load(MyStruct *self, slk::Reader *reader,
          std::vector<std::shared_ptr<SomeType>> *already_loaded);
```

This can now be handy when serializing multiple instances of `my-struct`. For
example:

```lisp
(lcp:define-struct my-array-of-struct ()
  ((structs "std::vector<MyStruct>"
            :slk-save (lambda (member)
                        #>cpp
                        slk::Save(self.${member}.size(), builder);
                        std::vector<SomeType *> already_saved;
                        for (const auto &my_struct : self.${member})
                          slk::Save(my_struct, builder, &already_saved);
                        cpp<#)
            :slk-load (lambda (member)
                        #>cpp
                        size_t size = 0;
                        slk::Load(&size, reader);
                        self->${member}.resize(size);
                        std::vector<std::shared_ptr<SomeType>> already_loaded;
                        for (size_t i = 0; i < size; ++i)
                          slk::Load(&self->${member}[i], reader, &already_loaded);
                        cpp<#)))
  (:serialize :slk))
```

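To show why sharing the tracking vectors matters, here is a self-contained
sketch of pointer tracking. It is only an assumption about how such tracking
could work: `Builder`, `Reader`, `SomeType` and the tag-based encoding
(0 for null, 1 for a new object, 2 for a back-reference) are made up for this
example and are not taken from the `slk` implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-ins for slk::Builder, slk::Reader and SomeType.
struct Builder {
  std::vector<uint64_t> data;
};

class Reader {
 public:
  explicit Reader(const std::vector<uint64_t> &data) : data_(&data) {}
  uint64_t Next() {
    if (pos_ >= data_->size()) throw std::runtime_error("unexpected end of data");
    return (*data_)[pos_++];
  }

 private:
  const std::vector<uint64_t> *data_;
  size_t pos_ = 0;
};

struct SomeType {
  uint64_t value = 0;
};

// Encoding assumed by this sketch: 0 = null pointer, 1 = new object followed
// by its value, 2 = back-reference followed by an index into the tracker.
void Save(const std::shared_ptr<SomeType> &ptr, Builder *builder,
          std::vector<SomeType *> *already_saved) {
  assert(builder != nullptr && already_saved != nullptr);
  if (!ptr) {
    builder->data.push_back(0);
    return;
  }
  auto found = std::find(already_saved->begin(), already_saved->end(), ptr.get());
  if (found != already_saved->end()) {
    builder->data.push_back(2);
    builder->data.push_back(static_cast<uint64_t>(found - already_saved->begin()));
    return;
  }
  builder->data.push_back(1);
  builder->data.push_back(ptr->value);
  already_saved->push_back(ptr.get());
}

void Load(std::shared_ptr<SomeType> *ptr, Reader *reader,
          std::vector<std::shared_ptr<SomeType>> *already_loaded) {
  assert(ptr != nullptr && already_loaded != nullptr);
  const uint64_t tag = reader->Next();
  if (tag == 0) {
    ptr->reset();
    return;
  }
  if (tag == 2) {
    const uint64_t index = reader->Next();
    if (index >= already_loaded->size()) throw std::runtime_error("bad back-reference");
    *ptr = (*already_loaded)[index];  // share ownership with the earlier load
    return;
  }
  if (tag != 1) throw std::runtime_error("unknown pointer tag");
  auto loaded = std::make_shared<SomeType>();
  loaded->value = reader->Next();
  already_loaded->push_back(loaded);
  *ptr = std::move(loaded);
}
```

If every call created its own tracking vector, as in the artificial example
from the previous section, two members pointing to the same object would be
saved twice and loaded as two distinct objects. Passing one tracker through
all calls keeps the aliasing intact.
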