Using WebAssembly threads from C, C++ and Rust

WebAssembly threads support is one of the most important performance additions to WebAssembly. It allows you to either run parts of your code in parallel on separate cores, or the same code over independent parts of the input data, scaling it to as many cores as the user has and significantly reducing the overall execution time.

In this article you will learn how to use WebAssembly threads to bring multithreaded applications written in languages like C, C++, and Rust to the web.

How WebAssembly threads work #

WebAssembly threads is not a separate feature, but a combination of several components that allows WebAssembly apps to use traditional multithreading paradigms on the web.

Web Workers #

First component is the regular Workers you know and love from JavaScript. WebAssembly threads use the new Worker constructor to create new underlying threads. Each thread loads a JavaScript glue, and then the main thread uses Worker#postMessage method to share the compiled WebAssembly.Module as well as a shared WebAssembly.Memory (see below) with those other threads. This establishes communication and allows all those threads to run the same WebAssembly code on the same shared memory without going through JavaScript again.

Web Workers have been around for over a decade now, are widely supported, and don’t require any special flags.

`SharedArrayBuffer` #

WebAssembly memory is represented by a WebAssembly.Memory object in the JavaScript API. By default WebAssembly.Memory is a wrapper around an ArrayBuffer—a raw byte buffer that can be accessed only by a single thread.

> new WebAssembly.Memory({ initial:1, maximum:10 }).buffer
ArrayBuffer { … }

To support multithreading, WebAssembly.Memory gained a shared variant too. When created with a shared flag via the JavaScript API, or by the WebAssembly binary itself, it becomes a wrapper around a SharedArrayBuffer instead. It’s a variation of ArrayBuffer that can be shared with other threads and read or modified simultaneously from either side.

> new WebAssembly.Memory({ initial:1, maximum:10, shared:true }).buffer
SharedArrayBuffer { … }

Unlike postMessage, normally used for communication between main thread and Web Workers, SharedArrayBuffer doesn’t require copying data or even waiting for the event loop to send and receive messages. Instead, any changes are seen by all threads nearly instantly, which makes it a much better compilation target for traditional synchronisation primitives.

SharedArrayBuffer has a complicated history. It was initially shipped in several browsers mid-2017, but had to be disabled in the beginning of 2018 due to discovery of Spectre vulnerabilities. The particular reason was that data extraction in Spectre relies on timing attacks—measuring execution time of a particular piece of code. To make this kind of attack harder, browsers reduced precision of standard timing APIs like Date.now and performance.now. However, shared memory, combined with a simple counter loop running in a separate thread is also a very reliable way to get high-precision timing, and it’s much harder to mitigate without significantly throttling runtime performance.

Instead, Chrome 68 (mid-2018) re-enabled SharedArrayBuffer again by leveraging Site Isolation—a feature that puts different websites into different processes and makes it much more difficult to use side-channel attacks like Spectre. However, this mitigation was still limited only to Chrome desktop, as Site Isolation is a fairly expensive feature, and couldn’t be enabled by default for all sites on low-memory mobile devices nor was it yet implemented by other vendors.

Fast-forward to 2020, Chrome and Firefox both have implementations of Site Isolation, and a standard way for websites to opt-in to the feature with COOP and COEP headers. An opt-in mechanism allows to use Site Isolation even on low-powered devices where enabling it for all the websites would be too expensive. To opt-in, add the following headers to the main document in your server configuration:

Cross-Origin-Embedder-Policy: require-corp
Cross-Origin-Opener-Policy: same-origin

Once you opt-in, you get access to SharedArrayBuffer (including WebAssembly.Memory backed by a SharedArrayBuffer), precise timers, memory measurement and other APIs that require an isolated origin for security reasons. Check out the Making your website “cross-origin isolated” using COOP and COEP for more details.

WebAssembly atomics #

While SharedArrayBuffer allows each thread to read and write to the same memory, for correct communication you want to make sure they don’t perform conflicting operations at the same time. For example, it’s possible for one thread to start reading data from a shared address, while another thread is writing to it, so the first thread will now get a corrupted result. This category of bugs is known as race conditions. In order to prevent race conditions, you need to somehow synchronize those accesses. This is where atomic operations come in.

WebAssembly atomics is an extension to the WebAssembly instruction set that allow to read and write small cells of data (usually 32- and 64-bit integers) “atomically”. That is, in a way that guarantees that no two threads are reading or writing to the same cell at the same time, preventing such conflicts at a low level. Additionally, WebAssembly atomics contain two more instruction kinds—“wait” and “notify”—that allow one thread to sleep (“wait”) on a given address in a shared memory until another thread wakes it up via “notify”.

All the higher-level synchronisation primitives, including channels, mutexes, and read-write locks build upon those instructions.

#rust #c, c++ #cplusplus #c++ #c #rust programming

How WebAssembly threads work #

Web Workers #

SharedArrayBuffer #

WebAssembly atomics #

web.dev

Using WebAssembly threads from C, C++ and Rust

`SharedArrayBuffer` #