Introduction to Web Assembly with C/C++: Part 1

I’ve been taking advantage of Web Assembly lately. It is supported by all the major browsers, lets one make use of useful code that has already been written for other environments, and provides some performance benefits over JavaScript. Web Assembly has a lot of potential and support, and I’d like to introduce other developers to it. I’m going to be using C++ in this post, but it is by no means the only language that can be used with Web Assembly. In this post I talk about why someone might want to consider Web Assembly and how to get a development environment set up.

What is Web Assembly?

Web Assembly is a specification for a virtual machine that runs in the browser. Compared to the highly dynamic JavaScript, Web Assembly can achieve much higher performance. Contrary to a popular misconception, though, Web Assembly doesn’t completely replace JavaScript. You will probably use the two together. Web Assembly is a stack-based virtual machine that compilers can target, and many of the tool chains that produce it are built on LLVM (originally short for Low Level Virtual Machine). If someone wanted to make a new programming language they could have the compiler for their language produce LLVM’s intermediate code and then use an already existing tool chain to compile it to platform-specific code. A person building a compiler for a new language wouldn’t need to make completely separate systems for different CPU architectures. Because LLVM can emit Web Assembly, code written in a variety of languages can run on it. At the time of writing there isn’t support for garbage collection, which restricts the languages that can target it. C/C++, C#, and Rust are a few languages that can be used with Web Assembly presently, with more expected in the future.
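
To make the phrase “stack based virtual machine” a little more concrete, here is a minimal sketch of a toy stack machine in C++. It is not real Web Assembly: the three opcodes (Push, Add, Mul) and the names Op, Instr, and run are invented for this illustration. It only shows the general idea that instructions take their operands from a stack and put their results back on it.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

// Toy instruction set. These are NOT real Web Assembly opcodes;
// they are hypothetical names used only for this illustration.
enum class Op { Push, Add, Mul };

struct Instr {
    Op op;
    std::int32_t value; // only meaningful for Push
};

// Executes the program and returns the single value left on the stack.
std::int32_t run(const std::vector<Instr>& program) {
    std::vector<std::int32_t> stack;
    for (const Instr& in : program) {
        if (in.op == Op::Push) {
            stack.push_back(in.value);
            continue;
        }
        // Add and Mul both pop two operands and push one result.
        if (stack.size() < 2) {
            throw std::runtime_error("stack underflow");
        }
        std::int32_t rhs = stack.back();
        stack.pop_back();
        std::int32_t lhs = stack.back();
        stack.pop_back();
        stack.push_back(in.op == Op::Add ? lhs + rhs : lhs * rhs);
    }
    if (stack.size() != 1) {
        throw std::runtime_error("program must leave exactly one value");
    }
    return stack.back();
}
```

Calling run with the program Push 2, Push 3, Add, Push 4, Mul evaluates (2 + 3) * 4. A real Web Assembly engine validates and compiles its instructions rather than interpreting them this naively, so treat this purely as a mental model.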

What Other Languages Can I Use?

  • C/C++ – I’ll be using that language in this article
  • C#/.Net – I’ve got interest in this one and will write about it in the future.
  • Elixir
  • Go
  • Java
  • Python
  • Rust – This is a newer language

Why Use Web Assembly?

I suggest Web Assembly primarily for the performance benefits in computationally expensive operations.  The binary format it uses is much more strict than JavaScript and it is more suitable for computationally intensive operations. There is also a lot of existing and tested code for work such as cryptography or video decoders that exists in C/C++ that one might want to use in a page. Despite all its flexibility interpreted JavaScript code doesn’t run as fast as a native binary. For some types of applications this difference in performance isn’t important (such as in a word processor). For other applications differences in performance translate into differences in experiences.

While the demand for performance is a motivation to make a native binary there are also security considerations. Native binaries may have access to more system resources than a web implemented solution. There may be more concern with ensuring that a program (especially if it is from a third party) doesn’t do anything malicious or access resources without permission. Web Assembly helps bridge the gap between these two needs; it provides a higher performance execution environment within a sandbox.

C++? Can’t I Cause a Buffer Overflow With That?

Sure. But only within the confines of the sandbox in which the code will run. It could crash your program, but it can’t cause arbitrary execution of code outside the sandbox. Also note that presently Web Assembly doesn’t have any bindings to host APIs. When you target Web Assembly you don’t have an environment that allows you to bypass the security restrictions under which JavaScript code runs. There’s no direct access to the file system and no access to memory outside of your program, and you are still restricted to communicating through WebSockets and HTTP requests that don’t violate CORS restrictions.
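
The sandbox idea can be sketched in plain C++. The toy model below assumes a module’s memory is one contiguous block of bytes and that every access is checked against its size; the names LinearMemory and careless_copy are hypothetical and exist only for this illustration. A sloppy copy can still trample the program’s own data, but an access past the end of the block is stopped instead of reaching anything outside it.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Toy stand-in for a module's memory (hypothetical class, not an Emscripten API).
class LinearMemory {
public:
    explicit LinearMemory(std::size_t size) : bytes_(size, 0) {}

    void store(std::size_t address, std::uint8_t value) {
        check(address);
        bytes_[address] = value;
    }

    std::uint8_t load(std::size_t address) const {
        check(address);
        return bytes_[address];
    }

private:
    // Every access is validated; anything outside the block "traps".
    void check(std::size_t address) const {
        if (address >= bytes_.size()) {
            throw std::out_of_range("trap: access outside linear memory");
        }
    }

    std::vector<std::uint8_t> bytes_;
};

// A copy with no length check of its own, in the spirit of an unchecked strcpy.
void careless_copy(LinearMemory& mem, std::size_t dest,
                   const std::uint8_t* src, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        mem.store(dest + i, src[i]);
    }
}
```

With an 8-byte memory, copying 6 bytes to address 0 silently overwrites whatever the program kept at addresses 4 and 5, which is the kind of bug that can still crash or corrupt your own program. Copying the same 6 bytes to address 4 throws as soon as the copy reaches address 8.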

How Do I Set Up a Developer Environment?

There are different versions of instructions on the Internet for installing the Web Assembly tools. If you are running Windows 10 you may come across a set of instructions that start by telling you to install the Windows Subsystem for Linux. Don’t use those instructions; I personally think they are unnecessarily complex. While I have the Windows Subsystem for Linux installed and running for other purposes, that’s not where I like to compile my Web Assembly code.

Using your operating system of choice (Windows 10/8/7, macOS, Linux), clone the Emscripten SDK (emsdk) git repository, run a few scripts from it, and you are ready to go. Here are the commands to use. If you are on Windows, omit the ./ at the beginning of the commands.

git clone https://github.com/emscripten-core/emsdk.git
cd emsdk
git pull
./emsdk install latest
./emsdk activate latest

With the tools installed you will also want to set some environment variables. There is a script for doing this. On Windows 10 run

emsdk_env.bat

For the other operating systems run

source emsdk_env.sh

The changes that this script makes to the environment variables aren’t persistent; it will need to be run again in each new terminal session, including after a reboot. For an editor I suggest using Visual Studio Code. I’ll be compiling from the command line in this article. Feel free to use the editor of your choice.

Web Assembly Explorer

I don’t use this tool within this article, but Web Assembly Explorer is available as an online tool for compiling C++ into Web Assembly and is an option if you don’t have the tools installed.

https://mbebenita.github.io/WasmExplorer/

Hello World

Now that we have the tools installed we can compile and run something. We will do a hello world program. Type the following source code and save it in hello.cpp.

#include <stdio.h>  // declares printf

int main(int argc, char** argv)
{
    printf("Hello World!\n");
    return 0;
}

To compile the code from the command line type the following.

emcc hello.cpp -o hello.html

After the compiler runs you will have three new files.

  • hello.wasm – the compiled version of your program
  • hello.html – an HTML page for hosting your web assembly
  • hello.js – JavaScript for loading your web assembly into the page

If you try to open the HTML file directly your code probably will not run. Instead the page will have to be served through an HTTP server. If you have Node installed you can use the http-server package. You can install it with

npm install http-server -g

Then start the server from the directory with your hello.html

http-server . -p 81

Here I’ve instructed the http-server to run on port 81. You can use the port of your choice here provided nothing else is using it. Remember to substitute the port that you chose throughout the rest of these instructions.

Open a browser and navigate to http://localhost:81/hello.html. You’ll see your code run. If you view the source for the page there is a lot of “noise” in the file. Much of that noise is from the displayed images being embedded within the HTML.  That’s fine for playing around. But you will want to have something customized to your own needs.

We can provide a shell or template file for the compiler to use. Emscripten has a minimal file available at https://github.com/emscripten-core/emscripten/blob/master/src/shell_minimal.html. Download that file. It will be used as our starting point. It is convenient for the sake of distribution for everything to be in one file, but I don’t like the CSS and JavaScript being embedded within the file. The CSS here isn’t needed and is being deleted. I’m moving the JavaScript to its own file and adding a script reference to it in my HTML. There are several items within the HTML and the script that are not necessarily needed. Let’s look at the script first and start making this minimal file even more minimalist.

At the top of the script there are three variables that reference page elements used to indicate download status and progress. Those are not absolutely necessary, so I’m deleting them along with the references to them. Lower in the JavaScript is a method named setStatus. I’m replacing its body with a call to console.log() to print the text that is passed to it. The first set of programs that I’m going to write won’t use a canvas. The element isn’t needed for now; I’m commenting it out instead of deleting it so that I can use it later. Having deleted the first three lines of this file and the code that references them, I’m returning to the HTML. Most of it is being deleted. I’ve commented out the canvas reference. There is a line in the HTML file with the text {{{ SCRIPT }}}. The compiler will take this file as a template and replace {{{ SCRIPT }}} with the reference to the script specific to our Web Assembly file.

When the Web Assembly program executes a printf() the text will be written to the textarea element. I place my hello.cpp file among these files and then compile it with the following command.

emcc hello.cpp --shell-file shell_minimal.html -o hello.html

The --shell-file argument indicates what file to use as a template. The -o parameter gives the name of the HTML file to write to. If you look at hello.html you can see it is almost identical to the input template. Run the site now and you’ll see the same result, but with a much cleaner interface.

Binding Functions

I mentioned earlier that Web Assembly doesn’t have any bindings to operating system functions. It also doesn’t have bindings to the browser, nor does it have access to the DOM. It is up to the page that loads the Web Assembly to expose functions to it. In emscripten.js the Module object defines a number of functions that are going to be made available to the Web Assembly. When the C/C++ code calls printf it will be passed through the JavaScript function defined here of the same name. It isn’t a requirement that the names be the same, but it is easier to keep track of function associations if they are.

Calling C/C++ From JavaScript

But what if you have your own functions that you wish to bind so that your JavaScript code can call the C++ code? The Module object has a function named ccall that can be used to call C/C++ code from JavaScript and another function named cwrap to build a function object that we can hold onto for repeated calls to the same function. To use these functions some additional compile flags will be needed.

To demonstrate the use of both of these methods of calling C/C++ code from JavaScript I’m going to declare three new functions in the C++ code.

  • void testCall() – accepts no parameters and returns no value. This method only prints a string so that we know that our call to it was successful.
  • void printNumber(int num) – accepts an integer argument and prints it. This lets us know that our value was successfully passed.
  • int square(int c) – accepts an integer and returns the square of that integer. This lets us see that a value can be returned back from the code.

The C++ language performs what is called name mangling; the names of the functions in the compiled code are different from the names in the source code. For the functions that we want to use from outside the C++ code we need to wrap the declarations in an extern "C" block. If our code were being written in C instead of C++ this wouldn’t be necessary. I still prefer C++ because of some of the features that the language offers. Normally I would have a declaration such as this in a header file, but for now my C++ program is in a single file. Close to the top of the program I make the following declarations.

extern "C" {
    void testCall();
    void printNumber(int f);
    int square(int c);
}

The implementation for the functions is what you would expect.

void testCall() 
{
    printf("function was called!\n");
}

void printNumber(int f) {
    printf("Printing the number %d\n", f);
}

int square(int c)
{
    return c*c;
}
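
One thing the declarations above don’t address is that a compiler is free to drop functions that nothing in the C++ code calls. The sketch below shows one possible way to guard against that, using the EMSCRIPTEN_KEEPALIVE macro from emscripten.h. This is an alternative that this article does not otherwise use, and the #ifdef fallback is my own assumption, added so that the same file also builds with a regular native compiler.

```cpp
#include <stdio.h>

#ifdef __EMSCRIPTEN__
#include <emscripten.h>       // provides EMSCRIPTEN_KEEPALIVE
#else
#define EMSCRIPTEN_KEEPALIVE  // native build: the macro expands to nothing
#endif

// extern "C" turns off C++ name mangling so the functions keep their plain names.
extern "C" {

// Prints a fixed message so we can tell the call went through.
EMSCRIPTEN_KEEPALIVE void testCall()
{
    printf("function was called!\n");
}

// Prints the integer that was passed in.
EMSCRIPTEN_KEEPALIVE void printNumber(int f)
{
    printf("Printing the number %d\n", f);
}

// Returns the square of its argument.
EMSCRIPTEN_KEEPALIVE int square(int c)
{
    return c * c;
}

} // extern "C"
```

These definitions would sit alongside the existing main in hello.cpp. Whether you mark functions this way or list them through compiler settings is a matter of preference; the point is only that functions called solely from JavaScript need some form of protection from being removed.
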
There’s a change to my main function too. I’ve had to include a new header file, emscripten.h, because I am about to use one of the functions that it provides. In main I added the following line.

EM_ASM ( InitWrappers());

It will result in a JavaScript function named InitWrappers() being called. I will talk about how EM_ASM works in a following section. I’m adding a third