Seaplus: Streamlining a safe execution of C/C++ code from Erlang

Organisation: Copyright (C) 2018-2019 Olivier Boudeville
Contact: about (dash) seaplus (at) esperide (dot) com
Creation date: Sunday, December 23, 2018
Last updated: Wednesday, February 20, 2019
Dedication: Users and maintainers of the Seaplus bridge, version 1.0.
Abstract: The role of the Seaplus bridge is to control C or C++ code from Erlang, not as a NIF but through a port, and to streamline the corresponding integration process.

The latest version of this documentation is to be found at the official Seaplus website (http://seaplus.esperide.org).

This Seaplus documentation is also available in the PDF format (see seaplus.pdf), and mirrored here.

Important Note

Seaplus is still a work in progress - not usable yet!

Overview

A typical use case is having a C or C++ library of interest that we would like to be able to use from Erlang, whereas, for any reason (availability of sources, complexity, size or interest), recoding it in Erlang is not desirable.

However tempting it may be to integrate C/C++ code tightly into the Erlang VM (typically through a NIF), one may prefer trading maximum performance for safety, and run that C/C++ code (which is often at least partly foreign, hence possibly unreliable) in a separate, isolated (operating system) process.

Then the integrated code will not be able to crash the Erlang application, and for example any memory leak it would induce would only affect its own process - not the application one.
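The isolation argument above can be illustrated with a minimal POSIX sketch, in which a crashing function only takes down the child process that runs it. The names risky_call and run_isolated are hypothetical and are not part of Seaplus; Seaplus itself relies on an Erlang port rather than on a direct fork, so this only shows the underlying principle.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for unreliable foreign code: crashes when given 0. */
static int risky_call(int a)
{
    if (a == 0)
        abort();  /* simulates a fatal fault in the integrated library */
    return 100 / a;
}

/* Runs risky_call(a) in a child process.
 * Returns 0 and stores the result on success, -1 if the child crashed
 * or could not be started. The calling process is never taken down. */
int run_isolated(int a, int *result)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: compute, send the result through the pipe, then exit. */
        close(fds[0]);
        int r = risky_call(a);
        ssize_t written = write(fds[1], &r, sizeof r);
        _exit(written == (ssize_t) sizeof r ? 0 : 1);
    }

    /* Parent: read the result (if any), then collect the child's status. */
    close(fds[1]);
    int r = 0;
    ssize_t got = read(fds[0], &r, sizeof r);
    close(fds[0]);

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    if (WIFEXITED(status) && WEXITSTATUS(status) == 0
            && got == (ssize_t) sizeof r) {
        *result = r;
        return 0;
    }
    return -1;  /* the child crashed; only that process was lost */
}
```

Calling run_isolated(0, &r) reports a failure to the caller instead of terminating it, which is the property sought when running foreign code outside of the VM.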

Indeed, taking into account the Erlang Interoperability Tutorial, the following approaches are the most commonly considered ones when having to make C/C++ code available from Erlang:

  • raw ports and linked-in drivers: they are mostly obsolete for the task at hand (superseded by better counterparts)
  • os:cmd/1: a rudimentary solution that offers little control and requires much syntactic parsing effort
  • custom socket-based protocol: quite low-level and complicated
  • NIF: as mentioned, they may jeopardise the VM (depending on the use case, this may be acceptable or not)
  • C-Node and Erl_Interface: this is the combination that we preferred for Seaplus, and that we tried to streamline/automate here, at least partially

In a nutshell, this approach consists in spawning a "fake" Erlang node written in C (the C-Node) and using the standard Erlang external term format to communicate with it (relying for that on the Erl_Interface facility). Doing so allows seamless communication despite language heterogeneity.
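To give an idea of what the external term format looks like on the wire, here is a minimal sketch that hand-encodes an integer the way term_to_binary/1 would (for example 42 becomes the three bytes 131, 97, 42). The function name etf_encode_int is hypothetical; real bridging code would rather rely on the encoders provided by Erl_Interface than on a hand-written one.

```c
#include <stddef.h>
#include <stdint.h>

/* Tags of the Erlang external term format used in this sketch. */
#define ETF_VERSION        131  /* leading version byte */
#define ETF_SMALL_INTEGER   97  /* followed by one unsigned byte */
#define ETF_INTEGER         98  /* followed by a signed 32-bit big-endian value */

/* Encodes 'value' into 'buf' (of capacity 'cap').
 * Returns the number of bytes written, or 0 if the buffer is too small. */
size_t etf_encode_int(int32_t value, unsigned char *buf, size_t cap)
{
    if (value >= 0 && value <= 255) {
        if (cap < 3)
            return 0;
        buf[0] = ETF_VERSION;
        buf[1] = ETF_SMALL_INTEGER;
        buf[2] = (unsigned char) value;
        return 3;
    }

    if (cap < 6)
        return 0;

    uint32_t u = (uint32_t) value;  /* two's complement bit pattern */
    buf[0] = ETF_VERSION;
    buf[1] = ETF_INTEGER;
    buf[2] = (unsigned char) (u >> 24);
    buf[3] = (unsigned char) (u >> 16);
    buf[4] = (unsigned char) (u >> 8);
    buf[5] = (unsigned char) u;
    return 6;
}
```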

C-Node and Erl_Interface help a lot, yet, as shown in this example, quite a lot of boiler-plate/bridging code (home-made encoding and conventions) remains needed.
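As an illustration of the kind of home-made conventions involved, the sketch below frames a message with a 2-byte big-endian length header, as done for a port opened with the {packet, 2} option in the Erlang Interoperability Tutorial. Whether Seaplus uses this exact header size is an assumption here, and frame_pack/frame_unpack are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Wraps 'payload' into a frame: 2-byte big-endian length, then the payload.
 * Returns the total frame size, or 0 if the payload is too long for a
 * 2-byte header or does not fit in 'out'. */
size_t frame_pack(const unsigned char *payload, size_t len,
                  unsigned char *out, size_t cap)
{
    if (len > 0xFFFF || cap < len + 2)
        return 0;
    out[0] = (unsigned char) (len >> 8);
    out[1] = (unsigned char) (len & 0xFF);
    memcpy(out + 2, payload, len);
    return len + 2;
}

/* Locates the payload of the frame starting at 'in' ('avail' bytes readable).
 * Returns the payload length and sets '*payload', or returns -1 if the frame
 * is still incomplete. */
long frame_unpack(const unsigned char *in, size_t avail,
                  const unsigned char **payload)
{
    if (avail < 2)
        return -1;
    size_t len = ((size_t) in[0] << 8) | in[1];
    if (avail < len + 2)
        return -1;
    *payload = in + 2;
    return (long) len;
}
```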

The goal of Seaplus is to reduce that interfacing effort, thanks to a set of generic, transverse functions on either side (modules in Erlang, a library in C/C++) and the use of metaprogramming (i.e. the Seaplus parse transform) in order to generate at least a part of the code needed on both sides. The developer nevertheless keeps enough leeway to define precisely the mapping interface that they prefer (e.g. with regard to naming, types introduced and used, management of resource ownership, etc.).

Ceylan-Seaplus relies on various facilities offered by the Ceylan-Myriad toolbox.

Usage

So we would have here a (possibly third-party) service (typically a library, directly usable from C, offering a set of functions) that we want to integrate, i.e. to make available from Erlang.

Let's suppose that said service is named Foobar, and that the functions it provides (hence on the C side) are declared as (typically in some foobar.h header file [1], referring to a possibly opaque foobar.so library):

#include <stdbool.h>

struct foo_data { int count; float value; };

enum foo_status {low_speed,moderate_speed,full_speed};
enum tur_status {tur_value,non_tur_value};

int foo(int a);
struct foo_data * bar(double a, enum foo_status status);
enum tur_status baz(unsigned int u, const char * m);
bool tur();
char * frob(enum tur_status);
[1] See the full, unedited version of the foobar.h test header that is actually used.

This example was designed to reproduce real-life situations, such as atoms vs enums, dynamic memory allocation (for the returned struct) and runtime failures (since calling foo(0) triggers a division by zero).

What would be the corresponding ideal Erlang interface to make such a fantastic service available?

First of all, multiple corresponding Erlang APIs can be considered, and some design choices have to be made (we can foresee that some are more elegant/convenient than others, and that a perfect, universal, one-size-fits-all automated mapping does not seem achievable).

An easy step is to decide, at least in most cases, to map each of these C functions to an Erlang counterpart function that, unsurprisingly, bears the same name and most of the time has the same arity, and to have them gathered into a single module that would be best named foobar (and thus shall be defined in foobar.erl).

We believe that, in order to rely on a convenient Erlang-side API for this service, adaptations have to be made (ex: with regard to typing), and thus that it should preferably be defined in an ad-hoc manner (i.e. it should be tailor-made, rather than be automatically generated through a mapping possibly suffering from impedance mismatch).

So such a service-specific API shall be devised by the service integrator (i.e. the developer in charge of the integration of the C/C++ code to Erlang). But how?

At the very least, what will be offered on the Erlang side by our foobar module shall be somehow specified. A very appropriate way of doing so is to list the type specifications of the targeted counterpart functions meant to be ultimately available (defined and exported) from Erlang, like in [2]:

-module(foobar).

-record(foo_data, {count :: integer(), value :: float()}).
-type foo_data() :: #foo_data{}.

-type foo_status() :: 'low_speed'|'moderate_speed'|'full_speed'.
-type tur_status() :: 'tur_value'|'non_tur_value'.

-spec foo(integer()) -> integer().
-spec bar(float(), foo_status()) -> foo_data().
-spec baz(integer(), text_utils:ustring()) -> tur_status().
-spec tur() -> boolean().
-spec frob(tur_status()) -> text_utils:ustring().
[2] See the full, unedited version of the foobar.erl API module that is actually used, together with its foobar.hrl header file.

Comments (description, usage, examples) are also expected to accompany these specs; they are omitted in this documentation for brevity.
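On the C side, the atoms-vs-enums correspondence implied by the foo_status() type could be handled with a simple lookup table, as in the following sketch. The helper names foo_status_from_atom and foo_status_to_atom are hypothetical; in actual bridging code the atom name would typically be obtained by decoding the received term with Erl_Interface.

```c
#include <stddef.h>
#include <string.h>

enum foo_status { low_speed, moderate_speed, full_speed };

/* Atom names, listed in the same order as the enum values. */
static const char *const foo_status_names[] = {
    "low_speed", "moderate_speed", "full_speed"
};

#define FOO_STATUS_COUNT \
    (sizeof foo_status_names / sizeof foo_status_names[0])

/* Converts an atom name into the corresponding enum value.
 * Returns 0 and sets '*out' on success, -1 if the name is unknown. */
int foo_status_from_atom(const char *atom_name, enum foo_status *out)
{
    for (size_t i = 0; i < FOO_STATUS_COUNT; i++) {
        if (strcmp(atom_name, foo_status_names[i]) == 0) {
            *out = (enum foo_status) i;
            return 0;
        }
    }
    return -1;
}

/* Converts an enum value back into its atom name (NULL if out of range). */
const char *foo_status_to_atom(enum foo_status s)
{
    size_t i = (size_t) s;
    return i < FOO_STATUS_COUNT ? foo_status_names[i] : NULL;
}
```

Rejecting unknown names explicitly, rather than defaulting to some enum value, keeps a typo in an atom from silently turning into a valid C argument.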

Other facility functions, whose signature (if not implementation) is the same from one service to another (e.g. to start/stop the service from Erlang), will certainly be needed by all integrated services as well. However, listing these facility functions in our foobar module would offer little interest (as they are the same for all integrated services), so these extra functions are to remain implicit here [3].

These service-level built-in functions automatically defined by Seaplus are:

  • start/{0,1,2} and start_link/{0,1,2}
  • stop/{0,1}
[3] Note though that, at least for some services, specific initialisation/tear-down functions may exist in the vanilla, C version of that service. In that case, they should be added among said function specifications (preferably named for example init/teardown or alike, in order to distinguish from the Seaplus-reserved start/stop primitives), so that they are available from Erlang as well.

Of course such a module, as it was defined above (i.e. just a set of function specifications), is useless and would not even compile as such. But the Seaplus parse transform will automatically enrich and transform it so that, once the C part is available, the Foobar service becomes fully usable from Erlang, with no extra boilerplate code to be added by the Erlang integrator.

More precisely, for each of the function type specifications, a corresponding bridging implementation will be generated and added (unless the foobar module already includes one, so that the user can selectively override the Seaplus code generation), whilst the facility functions will be included as well.

Here is a corresponding (mostly meaningless) usage example [4] of this foobar module, when executed from any given process (ex: a test one):

foobar:start(),
MyFooData = foobar:bar(3.14,full_speed),
NewCount = foobar:foo(MyFooData#foo_data.count),
Res = case foobar:tur() of
  true ->
    foobar:baz(NewCount,"Hello");
  false ->
    non_tur_value
end,
io:format("Having: ~s~n",[foobar:frob(Res)]),
foobar:stop().
[4] See the full, unedited version of the foobar_test.erl module used to test the Erlang-integrated service (emulating an actual use of that service).

At this point, one may think that, thanks to these function specs, the full counterpart C bridging code could be automagically generated along with the Erlang bridging code. Unfortunately, not exactly! At least, not yet; maybe some day (if ever possible and tractable). Currently, only parts of it are generated.

Indeed, C-side elements are produced by the Seaplus parse transform (notably the function selector include, used to map functions on either side), but the conversion (thanks to Erl_Interface) from the Erlang terms received by the port into arguments that will feed the C functions, and the other way round (i.e. from the C results to the Erlang terms that shall be sent back), is still left to the service integrator.
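The role of such a function selector can be sketched as a switch on a function identifier shared by both sides. Everything below is hypothetical: the identifier names and values (FOO_1_ID, TUR_0_ID), the stand-in bodies of foo and tur, and the dispatch function, which assumes that arguments were already decoded into plain C integers.

```c
#include <stdbool.h>

/* Hypothetical identifiers, one per exported function/arity pair; both
 * sides must agree on them. */
#define FOO_1_ID 1
#define TUR_0_ID 2

/* Stand-ins for the service functions (not the actual Foobar code). */
static int foo(int a) { return a + 1; }
static bool tur(void) { return true; }

/* Selects and calls the C function designated by 'function_id'.
 * Returns 0 and sets '*result' on success, -1 on an unknown identifier
 * or an unexpected number of arguments. */
int dispatch(int function_id, const int *args, int arg_count, int *result)
{
    switch (function_id) {
    case FOO_1_ID:
        if (arg_count != 1)
            return -1;
        *result = foo(args[0]);
        return 0;
    case TUR_0_ID:
        if (arg_count != 0)
            return -1;
        *result = tur() ? 1 : 0;
        return 0;
    default:
        return -1;
    }
}
```

In an actual bridge, each case would also host the service-specific conversions between Erlang terms and C values, which is precisely the part left to the integrator.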

This work remains, yet it is also a chance to better adapt the bridging code to the interfacing contract one would like to be fulfilled, for example with regard to resource ownership. Indeed, should the C part take pointers as arguments, shall it delete them once having used them? Conversely, should a C function return a pointer to a dynamically allocated memory, who is responsible for the eventual deallocation of it?

To address these questions, service-specific choices and conventions have to be applied, and this information cannot be found or deduced generically by an algorithm (including the Seaplus one). As a result, we believe that in all cases some effort still has to be made by the service integrator.
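As an example of such a convention, the sketch below assumes that the record returned by bar is allocated with malloc and owned by the caller; the bridging wrapper then copies the fields out and releases the memory itself. This ownership rule, the stand-in body of bar and the bar_bridge name are all assumptions made for illustration, not the actual Foobar or Seaplus code.

```c
#include <stdlib.h>

struct foo_data { int count; float value; };
enum foo_status { low_speed, moderate_speed, full_speed };

/* Stand-in for the service function: returns a heap-allocated record
 * that the caller is assumed to own (NULL on allocation failure). */
static struct foo_data *bar(double a, enum foo_status status)
{
    struct foo_data *d = malloc(sizeof *d);
    if (d == NULL)
        return NULL;
    d->count = (int) status;
    d->value = (float) a;
    return d;
}

/* Bridging wrapper applying the assumed convention: the fields are copied
 * to the caller-provided locations, then the C-side record is freed, so
 * that no pointer outlives the call. Returns 0 on success, -1 on failure. */
int bar_bridge(double a, enum foo_status status, int *count, float *value)
{
    struct foo_data *d = bar(a, status);
    if (d == NULL)
        return -1;
    *count = d->count;
    *value = d->value;
    free(d);
    return 0;
}
```

With a different contract (for instance a record owned by the library and reused between calls), the free would have to be dropped, which is why this choice cannot be made generically.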

Licence

Seaplus is licensed by its author (Olivier Boudeville) under a disjunctive tri-license giving you the choice of one of the three following sets of free software/open source licensing terms:

This allows the use of the Seaplus code in as wide a variety of software projects as possible, while still maintaining copyleft on this code.

Being triple-licensed means that someone (the licensee) who modifies and/or distributes it can choose which of the available sets of licence terms he is operating under.

We hope that enhancements will be back-contributed (ex: thanks to merge requests), so that everyone will be able to benefit from them.

Current Stable Version & Download

Using Stable Release Archive

Currently no source archive is specifically distributed; please refer to the following section.

Using Cutting-Edge GIT

We try to ensure that the main line (in the master branch) always stays functional. Evolutions are to take place in feature branches.

This integration layer, Ceylan-Seaplus, relies (only) on:

We prefer using GNU/Linux, sticking to the latest stable release of Erlang, and building it from sources, thanks to GNU make.

For that we devised the install-erlang.sh script; a simple use of it is:

$ ./install-erlang.sh --doc-install --generate-plt

One may execute ./install-erlang.sh --help for more details about how to configure it, notably in order to enable all modules of interest (crypto, wx, etc.) even if they are optional in the context of Seaplus.

As a result, once proper Erlang and C environments are available, the Ceylan-Myriad repository should be cloned and built, before doing the same with the Ceylan-Seaplus repository, like in:

$ git clone https://github.com/Olivier-Boudeville/Ceylan-Myriad
$ cd Ceylan-Myriad && make all && cd ..
$ git clone https://github.com/Olivier-Boudeville/Ceylan-Seaplus
$ cd Ceylan-Seaplus && make all

One can then test the whole with:

$ cd tests/c-test
$ make integration-test

Support

Bugs, questions, remarks, patches, requests for enhancements, etc. are to be sent to the project interface, or directly to the mail address mentioned at the beginning of this document.

Seaplus Inner Workings

The inner workings are mostly those described in the Erl_Interface tutorial, augmented with conventions and automated by the Seaplus parse transform as much as realistically possible (hence a code generation that is exhaustive on the Erlang side, and partial on the C side), and adapted for increased performance (notably: no extra relay process between the user code and the port).

Please React!

If you have information more detailed or more recent than what is presented in this document, or if you noticed errors, omissions or insufficiently discussed points, drop us a line! (for that, follow the Support guidelines).

Ending Word

Have fun with Seaplus!
