Code caching for WebAssembly developers

Code caching for WebAssembly developers

Caching is useful for improving the performance of an app,You can store compiled WebAssembly modules on the client so they don't have to be downloaded and compiled every time.

Caching is useful for improving the performance of an app,You can store compiled WebAssembly modules on the client so they don't have to be downloaded and compiled every time.

There’s a saying among developers that the fastest code is code that doesn’t run. Likewise, the fastest compiling code is code that doesn’t have to be compiled. WebAssembly code caching is a new optimization in Chrome and V8 that tries to avoid code compilation by caching the native code produced by the compiler. We’ve written about how Chrome and V8 cache JavaScript code in the past, and best practices for taking advantage of this optimization. In this blog post, we describe the operation of Chrome’s WebAssembly code cache and how developers can take advantage of it to speed up loading for applications with large WebAssembly modules.

WebAssembly compilation recap

WebAssembly is a way to run non-JavaScript code on the Web. A web app can use WebAssembly by loading a .wasm resource, which contains partially compiled code from another language, such as C, C++, or Rust (and more to come.) The WebAssembly compiler’s job is to decode the .wasm resource, validate that it is well-formed, and then compile it to native machine code that can be executed on the user’s machine.

V8 has two compilers for WebAssembly: Liftoff and TurboFan. Liftoff is the baseline compiler, which compiles modules as quickly as possible so execution can begin as soon as possible. TurboFan is V8’s optimizing compiler for both JavaScript and WebAssembly. It runs in the background to generate high-quality native code to give a web app optimal performance over the long term. For large WebAssembly modules, TurboFan can take significant amounts of time — 30 seconds to a minute or more — to completely finish compiling a WebAssembly module to native code.

That’s where code caching comes in. Once TurboFan has finished compiling a large WebAssembly module, Chrome can save the code in its cache so that the next time the module is loaded, we can skip both Liftoff and TurboFan compilation, leading to faster startup and reduced power consumption — compiling code is very CPU-intensive.

WebAssembly code caching uses the same machinery in Chrome that is used for JavaScript code caching. We use the same type of storage, and the same double-keyed caching technique that keeps code compiled by different origins separate in accordance with site isolation, an important Chrome security feature.

WebAssembly code caching algorithm

For now, WebAssembly caching is only implemented for the streaming API calls, compileStreaming and instantiateStreaming. These operate on an HTTP fetch of a .wasm resource, making it easier to use Chrome’s resource fetching and caching mechanisms, and providing a handy resource URL to use as the key to identify the WebAssembly module. The caching algorithm works as follows:

  1. When a .wasm resource is first requested (i.e. a cold run), Chrome downloads it from the network and streams it to V8 to compile. Chrome also stores the .wasm resource in the browser’s resource cache, stored in the file system of the user’s device. This resource cache allows Chrome to load the resource faster the next time it’s needed.
  2. When TurboFan has completely finished compiling the module, and if the .wasm resource is large enough (currently 128 kB), Chrome writes the compiled code to the WebAssembly code cache. This code cache is physically separate from the resource cache in step 1.
  3. When a .wasm resource is requested a second time (i.e. a hot run), Chrome loads the .wasmresource from the resource cache and simultaneously queries the code cache. If there is a cache hit, then the compiled module bytes are sent to the renderer process and passed to V8 which deserializes the code instead of compiling the module. Deserializing is faster and less CPU-intensive than compiling.
  4. It may be that the cached code is no longer valid. This can happen because the .wasm resource has changed, or because V8 has changed, something that is expected to happen at least every 6 weeks because of Chrome’s rapid release cycle. In this case the cached native code is cleared from the cache, and compilation proceeds as in step 1.

Based on this description, we can give some recommendations for improving your website’s use of the WebAssembly code cache.

Tip 1: use the WebAssembly streaming API

Since code caching only works with the streaming API, compile or instantiate your WebAssembly module with compileStreaming or instantiateStreaming, as in this JavaScript snippet:

(async () => {
const fetchPromise = fetch('fibonacci.wasm');
const { instance } = await WebAssembly.instantiateStreaming(fetchPromise);
const result = instance.exports.fibonacci(42);

This article goes into detail about the advantages of using the WebAssembly streaming API. Emscripten tries to use this API by default when it generates loader code for your app. Note that streaming requires that the .wasm resource has the correct MIME type, so the server must send the Content-Type: application/wasm header in its response.

Tip 2: be cache-friendly

Since code caching depends on the resource URL and whether the .wasm resource is up-to-date, developers should try to keep those both stable. If the .wasm resource is fetched from a different URL, it is considered different and V8 has to compile the module again. Similarly, if the .wasm resource is no longer valid in the resource cache, then Chrome has to throw away any cached code.

Keep your code stable

Whenever you ship a new WebAssembly module, it must be completely recompiled. Ship new versions of your code only when necessary to deliver new features or fix bugs. When your code hasn’t changed, let Chrome know. When the browser makes an HTTP request for a resource URL, such as a WebAssembly module, it includes the date and time of the last fetch of that URL. If the server knows that the file hasn’t changed, it can send back a 304 Not Modified response, which tells Chrome and V8 that the cached resource and therefore the cached code are still valid. On the other hand, returning a 200 OK response updates the cached .wasm resource and invalidates the code cache, reverting WebAssembly back to a cold run. Follow web resource best practices by using the response to inform the browser about whether the .wasm resource is cacheable, how long it’s expected to be valid, or when it was last modified.

Don’t change your code’s URL

Cached compiled code is associated with the URL of the .wasm resource, which makes it easy to look up without having to scan the actual resource. This means that changing the URL of a resource (including any query parameters!) creates a new entry in our resource cache, which also requires a complete recompile and creates a new code cache entry.

Go big (but not too big!)

The principal heuristic of WebAssembly code caching is the size of the .wasm resource. If the .wasmresource is smaller than a certain threshold size, we don’t cache the compiled module bytes. The reasoning here is that V8 can compile small modules quickly, possibly faster than loading the compiled code from the cache. At the moment, the cutoff is for .wasm resources of 128 kB or more.

But bigger is better only up to a point. Because caches take up space on the user’s machine, Chrome is careful not to consume too much space. Right now, on desktop machines, the code caches typically hold a few hundred megabytes of data. Since the Chrome caches also restrict the largest entries in the cache to some fraction of the total cache size, there is a further limit of about 150 MB for the compiled WebAssembly code (half the total cache size). It is important to note that compiled modules are often 5–7 times larger than the corresponding .wasm resource on a typical desktop machine.

This size heuristic, like the rest of the caching behavior, may change as we determine what works best for users and developers.

Use a service worker

WebAssembly code caching is enabled for workers and service workers, so it’s possible to use them to load, compile, and cache a new version of code so it’s available the next time your app starts. Every web site must perform at least one full compilation of a WebAssembly module — use workers to hide that from your users.


As a developer, you might want to check that your compiled module is being cached by Chrome. WebAssembly code caching events are not exposed by default in Chrome’s Developer Tools, so the best way to find out whether your modules are being cached is to use the slightly lower-level chrome://tracing feature.

chrome://tracing records instrumented traces of Chrome during some period of time. Tracing records the behavior of the entire browser, including other tabs, windows, and extensions, so it works best when done in a clean user profile, with extensions disabled, and with no other browser tabs open:

# Start a new Chrome browser session with a clean user profile and extensions disabled
google-chrome --user-data-dir="$(mktemp -d)" --disable-extensions

Navigate to chrome://tracing and click ‘Record’ to begin a tracing session. On the dialog window that appears, click ‘Edit Categories’ and check the devtools.timeline category on the right side under ‘Disabled by Default Categories’ (you can uncheck any other pre-selected categories to reduce the amount of data collected). Then click the ‘Record’ button on the dialog to begin the trace.

In another tab load or reload your app. Let it run long enough, 10 seconds or more, to make sure TurboFan compilation completes. When done, click ‘Stop’ to end the trace. A timeline view of events appears. At the top right of the tracing window, there is a text box, just to the right of ‘View Options’. Type v8.wasm to filter out non-WebAssembly events. You should see one or more of the following events:

  • v8.wasm.streamFromResponseCallback — The resource fetch passed to instantiateStreaming received a response.
  • v8.wasm.compiledModule — TurboFan finished compiling the .wasm resource.
  • v8.wasm.cachedModule — Chrome wrote the compiled module to the code cache.
  • v8.wasm.moduleCacheHit — Chrome found the code in its cache while loading the .wasmresource.
  • v8.wasm.moduleCacheInvalid — V8 wasn’t able to deserialize the cached code because it was out of date.

On a cold run, we expect to see v8.wasm.streamFromResponseCallback and v8.wasm.compiledModule events. This indicates that the WebAssembly module was received, and compilation succeeded. If neither event is observed, check that your WebAssembly streaming API calls are working correctly.

After a cold run, if the size threshold was exceeded, we also expect to see a v8.wasm.cachedModuleevent, meaning that the compiled code was sent to the cache. It is possible that we get this event but that the write doesn’t succeed for some reason. There is currently no way to observe this, but metadata on the events can show the size of the code. Very large modules may not fit in the cache.

When caching is working correctly, a hot run produces two events: v8.wasm.streamFromResponseCallback and v8.wasm.moduleCacheHit. The metadata on these events allows you to see the size of the compiled code.

For more on using chrome://tracing, see our article on JavaScript (byte)code caching for developers.


For most developers, code caching should “just work”. It works best, like any cache, when things are stable. Chrome’s caching heuristics may change between versions, but code caching does have behaviors that can be used, and limitations which can be avoided. Careful analysis using chrome://tracing can help you tweak and optimize the use of the WebAssembly code cache by your web app.

Mobile App Development Company India | Ecommerce Web Development Company India

Mobile App Development Company India | Ecommerce Web Development Company India

Best Mobile App Development Company India, WebClues Global is one of the leading web and mobile app development company. Our team offers complete IT solutions including Cross-Platform App Development, CMS & E-Commerce, and UI/UX Design.

We are custom eCommerce Development Company working with all types of industry verticals and providing them end-to-end solutions for their eCommerce store development.

Know more about Top E-Commerce Web Development Company

JavaScript developers should you be using Web Workers?

JavaScript developers should you be using Web Workers?

Do you think JavaScript developers should be making more use of Web Workers to shift execution off of the main thread?

Originally published by David Gilbertson at

So, Web Workers. Those wonderful little critters that allow us to execute JavaScript off the main thread.

Also known as “no, you’re thinking of Service Workers”.

Photo by Caleb Jones on Unsplash

Before I get into the meat of the article, please sit for a lesson in how computers work:

Understood? Good.

For the red/green colourblind, let me explain. While a CPU is doing one thing, it can’t be doing another thing, which means you can’t sort a big array while a user scrolls the screen.

This is bad, if you have a big array and users with fingers.

Enter, Web Workers. These split open the atomic concept of a ‘CPU’ and allow us to think in terms of threads. We can use one thread to handle user-facing work like touch events and rendering the UI, and different threads to carry out all other work.

Check that out, the main thread is green the whole way through, ready to receive and respond to the gentle caress of a user.

You’re excited (I can tell), if we only have UI code on the main thread and all other code can go in a worker, things are going to be amazing (said the way Oprah would say it).

But cool your jets for just a moment, because websites are mostly about the UI — it’s why we have screens. And a lot of a user’s interactions with your site will be tapping on the screen, waiting for a response, reading, tapping, looking, reading, and so on.

So we can’t just say “here’s some JS that takes 20ms to run, chuck it on a thread”, we must think about where that execution time exists in the user’s world of tap, read, look, read, tap…

I like to boil this down to one specific question:

Is the user waiting anyway?

Imagine we have created some sort of git-repository-hosting website that shows all sorts of things about a repository. We have a cool feature called ‘issues’. A user can even click an ‘issues’ tab in our website to see a list of all issues relating to the repository. Groundbreaking!

When our users click this issues tab, the site is going to fetch the issue data, process it in some way — perhaps sort, or format dates, or work out which icon to show — then render the UI.

Inside the user’s computer, that’ll look exactly like this.

Look at that processing stage, locking up the main thread even though it has nothing to do with the UI! That’s terrible, in theory.

But think about what the human is actually doing at this point. They’re waiting for the common trio of network/process/render; just sittin’ around with less to do than the Bolivian Navy.

Because we care about our users, we show a loading indicator to let them know we’ve received their request and are working on it — putting the human in a ‘waiting’ state. Let’s add that to the diagram.

Now that we have a human in the picture, we can mix in a Web Worker and think about the impact it will have on their life:


First thing to note is that we’re not doing anything in parallel. We need the data from the network before we process it, and we need to process the data before we can render the UI. The elapsed time doesn’t change.

(BTW, the time involved in moving data to a Web Worker and back is negligible: 1ms per 100 KB is a decent rule of thumb.)

So we can move work off the main thread and have a page that is responsive during that time, but to what end? If our user is sitting there looking at a spinner for 600ms, have we enriched their experience by having a responsive screen for the middle third?


I’ve fudged these diagrams a little bit to make them the gorgeous specimens of graphic design that they are, but they’re not really to scale.

When responding to a user request, you’ll find that the network and DOM-manipulating part of any given task take much, much longer than the pure-JS data processing part.

I saw an article recently making the case that updating a Redux store was a good candidate for Web Workers because it’s not UI work (and non-UI work doesn’t belong on the main thread).

Chucking the data processing over to a worker thread sounds sensible, but the idea struck me as a little, umm, academic.

First, let’s split instances of ‘updating a store’ into two categories:

  1. Updating a store in response to a user interaction, then updating the UI in response to the data change
  2. Not that first one

If the first scenario, a user taps a button on the screen — perhaps to change the sort order of a list. The store updates, and this results in a re-rendering of the DOM (since that’s the point of a store).

Let me just delete one thing from the previous diagram:

In my experience, it is rare that the store-updating step goes beyond a few dozen milliseconds, and is generally followed by ten times that in DOM updating, layout, and paint. If I’ve got a site that’s taking longer than this, I’d be asking questions about why I have so much data in the browser and so much DOM, rather than on which thread I should do my processing.

So the question we’re faced with is the same one from above: the user tapped something on the screen, we’re going to work on that request for hopefully less than a second, why would we want to make the screen responsive during that time?

OK what about the second scenario, where a store update isn’t in response to a user interaction? Performing an auto-save, for example — there’s nothing more annoying than an app becoming unresponsive doing something you didn’t ask it to do.

Actually there’s heaps of things more annoying than that. Teens, for example.

Anyhoo, if you’re doing an auto-save and taking 100ms to process data client-side before sending it off to a server, then you should absolutely use a Web Worker.

In fact, any ‘background’ task that the user hasn’t asked for, or isn’t waiting for, is a good candidate for moving to a Web Worker.

The matter of value

Complexity is expensive, and implementing Web Workers ain’t cheap.

If you’re using a bundler — and you are — you’ll have a lot of reading to do, and probably npm packages to install. If you’ve got a create-react-app app, prepare to eject (and put aside two days twice a year to update 30 different packages when the next version of Babel/Redux/React/ESLint comes out).

Also, if you want to share anything fancier than plain data between a worker and the main thread you’ve got some more reading to do (comlink is your friend).

What I’m getting at is this: if the benefit is real, but minimal, then you’ve gotta ask if there’s something else you could spend a day or two on with a greater benefit to your users.

This thinking is true of everything, of course, but I’ve found that Web Workers have a particularly poor benefit-to-effort ratio.

Hey David, why you hate Web Workers so bad?

Good question.

This is a doweling jig:

I own a doweling jig. I love my doweling jig. If I need to drill a hole into the end of a piece of wood and ensure that it’s perfectly perpendicular to the surface, I use my doweling jig.

But I don’t use it to eat breakfast. For that I use a spoon.

Four years ago I was working on some fancy animations. They looked slick on a fast device, but janky on a slow one. So I wrote fireball-js, which executes a rudimentary performance benchmark on the user’s device and returns a score, allowing me to run my animations only on devices that would render them smoothly.

Where’s the best spot to run some CPU intensive code that the user didn’t request? On a different thread, of course. A Web Worker was the correct tool for the job.

Fast forward to 2019 and you’ll find me writing a routing algorithm for a mapping application. This requires parsing a big fat GeoJSON map into a collection of nodes and edges, to be used when a user asks for directions. The processing isn’t in response to a user request and the user isn’t waiting on it. And so, a Web Worker is the correct tool for the job.

It was only when doing this that it dawned on me: in the intervening quartet of years, I have seen exactly zero other instances where Web Workers would have improved the user experience.

Contrast this with a recent resurgence in Web Worker wonderment, and combine that contrast with the fact that I couldn’t think of anything else to write about, then concatenate that combined contrast with my contrarian character and you’ve got yourself a blog post telling you that maybe Web Workers are a teeny-tiny bit overhyped.

Thanks for reading

If you liked this post, share it with all of your programming buddies!

Follow us on Facebook | Twitter

Further reading

An Introduction to Web Workers

JavaScript Web Workers: A Beginner’s Guide

Using Web Workers to Real-time Processing

How to use Web Workers in Angular app

Using Web Workers with Angular CLI

WebAssembly for Web Developers

WebAssembly for Web Developers

WebAssembly for Web Developers. WebAssembly is often hailed as a performance tool for critical tasks or to bring existing C++ code bases to the web – such as games. But WebAssembly is so much more. You can use WebAssembly as a puzzle piece to give the web platform the few missing capability that you are missing or to surgically replace a JavaScript bottleneck.