Faster script loading with BinaryAST?

Faster script loading with BinaryAST?

BinaryAST is a new over-the-wire format for JavaScript that aims to speed up parsing while keeping the semantics of the original JavaScript intact

BinaryAST is a new over-the-wire format for JavaScript that aims to speed up parsing while keeping the semantics of the original JavaScript intact

JavaScript Cold starts

The performance of applications on the web platform is becoming increasingly bottlenecked by the startup (load) time. Large amounts of **JavaScript **code are required to create rich web experiences that we’ve become used to. When we look at the total size of JavaScript requested on mobile devices from HTTPArchive, we see that an average page loads 350KB of JavaScript, while 10% of pages go over the 1MB threshold. The rise of more complex applications can push these numbers even higher.

While caching helps, popular websites regularly release new code, which makes cold start (first load) times particularly important. With browsers moving to separate caches for different domains to prevent cross-site leaks, the importance of cold starts is growing even for popular subresources served from CDNs, as they can no longer be safely shared.

Usually, when talking about the cold start performance, the primary factor considered is a raw download speed. However, on modern interactive pages one of the other big contributors to cold starts is JavaScript parsing time. This might seem surprising at first, but makes sense - before starting to execute the code, the engine has to first parse the fetched JavaScript, make sure it doesn’t contain any syntax errors and then compile it to the initial bytecode. As networks become faster, parsing and compilation of JavaScript could become the dominant factor.

The device capability (CPU or memory performance) is the most important factor in the variance of JavaScript parsing times and correspondingly the time to application start. A 1MB JavaScript file will take an order of a 100 ms to parse on a modern desktop or high-end mobile device but can take over a second on an average phone (Moto G4).

A more detailed post on the overall cost of parsing, compiling and execution of JavaScript shows how the JavaScript boot time can vary on different mobile devices. For example, in the case of news.google.com, it can range from 4s on a Pixel 2 to 28s on a low-end device.

While engines continuously improve raw parsing performance, with V8 in particular doubling it over the past year, as well as moving more things off the main thread, parsers still have to do lots of potentially unnecessary work that consumes memory, battery and might delay the processing of the useful resources.

The “BinaryAST” Proposal

This is where **BinaryAST **comes in. BinaryAST is a new over-the-wire format for JavaScript proposed and actively developed by Mozilla that aims to speed up parsing while keeping the semantics of the original JavaScript intact. It does so by using an efficient binary representation for code and data structures, as well as by storing and providing extra information to guide the parser ahead of time.

The name comes from the fact that the format stores the JavaScript source as an AST encoded into a binary file. The specification lives at tc39.github.io/proposal-binary-ast and is being worked on by engineers from Mozilla, Facebook, Bloomberg and Cloudflare.

“Making sure that web applications start quickly is one of the most important, but also one of the most challenging parts of web development. We know that BinaryAST can radically reduce startup time, but we need to collect real-world data to demonstrate its impact. Cloudflare's work on enabling use of BinaryAST with Cloudflare Workers is an important step towards gathering this data at scale.”> Till Schneidereit, Senior Engineering Manager, Developer Technologies> Mozilla## Parsing JavaScript

For regular JavaScript code to execute in a browser the source is parsed into an intermediate representation known as an AST that describes the syntactic structure of the code. This representation can then be compiled into a byte code or a native machine code for execution.

A simple example of adding two numbers can be represented in an AST as:

Parsing JavaScript is not an easy task; no matter which optimisations you apply, it still requires reading the entire text file char by char, while tracking extra context for syntactic analysis.

The goal of the BinaryAST is to reduce the complexity and the amount of work the browser parser has to do overall by providing an additional information and context by the time and place where the parser needs it.

To execute JavaScript delivered as BinaryAST the only steps required are:

Another benefit of BinaryAST is that it makes possible to only parse the critical code necessary for start-up, completely skipping over the unused bits. This can dramatically improve the initial loading time.

This post will now describe some of the challenges of parsing JavaScript in more detail, explain how the proposed format addressed them, and how we made it possible to run its encoder in Workers.

Hoisting

JavaScript relies on hoisting for all declarations - variables, functions, classes. Hoisting is a property of the language that allows you to declare items after the point they’re syntactically used.

Let's take the following example:

function f() {
	return g();
}

function g() {
	return 42;
}

Here, when the parser is looking at the body of f, it doesn’t know yet what g is referring to - it could be an already existing global function or something declared further in the same file - so it can’t finalise parsing of the original function and start the actual compilation.

BinaryAST fixes this by storing all the scope information and making it available upfront before the actual expressions.

As shown by the difference between the initial AST and the enhanced AST in a JSON representation:

Lazy parsing

One common technique used by modern engines to improve parsing times is lazy parsing. It utilises the fact that lots of websites include more JavaScript than they actually need, especially for the start-up.

Working around this involves a set of heuristics that try to guess when any given function body in the code can be safely skipped by the parser initially and delayed for later. A common example of such heuristic is immediately running the full parser for any function that is wrapped into parentheses:

(function(...

Such prefix usually indicates that a following function is going to be an IIFE (immediately-invoked function expression), and so the parser can assume that it will be compiled and executed ASAP, and wouldn’t benefit from being skipped over and delayed for later.

(function() {
	…
})();

These heuristics significantly improve the performance of the initial parsing and cold starts, but they’re not completely reliable or trivial to implement.

One of the reasons is the same as in the previous section - even with lazy parsing, you still need to read the contents, analyse them and store an additional scope information for the declarations.

Another reason is that the JavaScript specification requires reporting any syntax errors immediately during load time, and not when the code is actually executed. A class of these errors, called early errors, is checking for mistakes like usage of the reserved words in invalid contexts, strict mode violations, variable name clashes and more. All of these checks require not only lexing JavaScript source, but also tracking extra state even during the lazy parsing.

Having to do such extra work means you need to be careful about marking functions as lazy too eagerly, especially if they actually end up being executed during the page load. Otherwise you’re making cold start costs even worse, as now every function that is erroneously marked as lazy, needs to be parsed twice - once by the lazy parser and then again by the full one.

Because BinaryAST is meant to be an output format of other tools such as Babel, TypeScript and bundlers such as Webpack, the browser parser can rely on the JavaScript being already analysed and verified by the initial parser. This allows it to skip function bodies completely, making lazy parsing essentially free.

It reduces the cost of a completely unused code - while including it is still a problem in terms of the network bandwidth (don’t do this!), at least it’s not affecting parsing times anymore. These benefits apply equally to the code that is used later in the page lifecycle (for example, invoked in response to user actions), but is not required during the startup.

Last but not least important benefit of such approach is that BinaryAST encodes lazy annotations as part of the format, giving tools and developers direct and full control over the heuristics. For example, a tool targeting the Web platform or a framework CLI can use its domain-specific knowledge to mark some event handlers as lazy or eager depending on the context and the event type.

Avoiding ambiguity in parsing

Using a text format for a programming language is great for readability and debugging, but it's not the most efficient representation for parsing and execution.

For example, parsing low-level types like numbers, booleans and even strings from text requires extra analysis and computation, which is unnecessary when you can just store and read them as native binary-encoded values in the first place and read directly on the other side.

Another problem is an ambiguity in the grammar itself. It was already an issue in the ES5 world, but could usually be resolved with some extra bookkeeping based on the previously seen tokens. However, in ES6+ there are productions that can be ambiguous all the way through until they’re parsed completely.

For example, a token sequence like:

(a, {b: c, d}, [e = 1])...

can start either a parenthesized comma expression with nested object and array literals and an assignment:

(a, {b: c, d}, [e = 1]); // it was an expression

or a parameter list of an arrow expression function with nested object and array patterns and a default value:

(a, {b: c, d}, [e = 1]) => … // it was a parameter list

Both representations are perfectly valid, but have completely different semantics, and you can’t know which one you’re dealing with until you see the final token.

To work around this, parsers usually have to either backtrack, which can easily get exponentially slow, or to parse contents into intermediate node types that are capable of holding both expressions and patterns, with following conversion. The latter approach preserves linear performance, but makes the implementation more complicated and requires preserving more state.

In the BinaryAST format this issue doesn't exist in the first place because the parser sees the type of each node before it even starts parsing its contents.

Cloudflare Implementation

Currently, the format is still in flux, but the very first version of the client-side implementation was released under a flag in Firefox Nightly several months ago. Keep in mind this is only an initial unoptimised prototype, and there are already several experiments changing the format to provide improvements to both size and parsing performance.

On the producer side, the reference implementation lives at github.com/binast/binjs-ref. Our goal was to take this reference implementation and consider how we would deploy it at Cloudflare scale.

If you dig into the codebase, you will notice that it currently consists of two parts.

One is the encoder itself, which is responsible for taking a parsed AST, annotating it with scope and other relevant information, and writing out the result in one of the currently supported formats. This part is written in Rust and is fully native.

Another part is what produces that initial AST - the parser. Interestingly, unlike the encoder, it's implemented in JavaScript.

Unfortunately, there is currently no battle-tested native JavaScript parser with an open API, let alone implemented in Rust. There have been a few attempts, but, given the complexity of JavaScript grammar, it’s better to wait a bit and make sure they’re well-tested before incorporating it into the production encoder.

On the other hand, over the last few years the JavaScript ecosystem grew to extensively rely on developer tools implemented in JavaScript itself. In particular, this gave a push to rigorous parser development and testing. There are several JavaScript parser implementations that have been proven to work on thousands of real-world projects.

With that in mind, it makes sense that the BinaryAST implementation chose to use one of them - in particular, Shift - and integrated it with the Rust encoder, instead of attempting to use a native parser.

Connecting Rust and JavaScript

Integration is where things get interesting.

Rust is a native language that can compile to an executable binary, but JavaScript requires a separate engine to be executed. To connect them, we need some way to transfer data between the two without sharing the memory.

Initially, the reference implementation generated JavaScript code with an embedded input on the fly, passed it to Node.js and then read the output when the process had finished. That code contained a call to the Shift parser with an inlined input string and produced the AST back in a JSON format.

This doesn’t scale well when parsing lots of JavaScript files, so the first thing we did is transformed the Node.js side into a long-living daemon. Now Rust could spawn a required Node.js process just once and keep passing inputs into it and getting responses back as individual messages.

Running in the cloud

While the Node.js solution worked fairly well after these optimisations, shipping both a Node.js instance and a native bundle to production requires some effort. It's also potentially risky and requires manual sandboxing of both processes to make sure we don’t accidentally start executing malicious code.

On the other hand, the only thing we needed from Node.js is the ability to run the JavaScript parser code. And we already have an isolated JavaScript engine running in the cloud - Cloudflare Workers! By additionally compiling the native Rust encoder to Wasm (which is quite easy with the native toolchain and wasm-bindgen), we can even run both parts of the code in the same process, making cold starts and communication much faster than in a previous model.

Optimising data transfer

The next logical step is to reduce the overhead of data transfer. JSON worked fine for communication between separate processes, but with a single process we should be able to retrieve the required bits directly from the JavaScript-based AST.

To attempt this, first of all, we needed to move away from the direct JSON usage to something more generic that would allow us to support various import formats. The Rust ecosystem already has an amazing serialisation framework for that - Serde.

Aside from allowing us to be more flexible in regard to the inputs, rewriting to Serde helped an existing native use case too. Now, instead of parsing JSON into an intermediate representation and then walking through it, all the native typed AST structures can be deserialized directly from the stdout pipe of the Node.js process in a streaming manner. This significantly improved both the CPU usage and memory pressure.

But there is one more thing we can do: instead of serializing and deserializing from an intermediate format (let alone, a text format like JSON), we should be able to operate [almost] directly on JavaScript values, saving memory and repetitive work.

How is this possible? wasm-bindgen provides a type called JsValue that stores a handle to an arbitrary value on the JavaScript side. This handle internally contains an index into a predefined array.

Each time a JavaScript value is passed to the Rust side as a result of a function call or a property access, it’s stored in this array and an index is sent to Rust. The next time Rust wants to do something with that value, it passes the index back and the JavaScript side retrieves the original value from the array and performs the required operation.

By reusing this mechanism, we could implement a Serde deserializer that requests only the required values from the JS side and immediately converts them to their native representation. It’s now open-sourced under https://github.com/cloudflare/serde-wasm-bindgen.

At first, we got a much worse performance out of this due to the overhead of more frequent calls between 1) Wasm and JavaScript - SpiderMonkey has improved these recently, but other engines still lag behind and 2) JavaScript and C++, which also can’t be optimised well in most engines.

The JavaScript <-> C++ overhead comes from the usage of TextEncoder to pass strings between JavaScript and Wasm in wasm-bindgen, and, indeed, it showed up as the highest in the benchmark profiles. This wasn’t surprising - after all, strings can appear not only in the value payloads, but also in property names, which have to be serialized and sent between JavaScript and Wasm over and over when using a generic JSON-like structure.

Luckily, because our deserializer doesn’t have to be compatible with JSON anymore, we can use our knowledge of Rust types and cache all the serialized property names as JavaScript value handles just once, and then keep reusing them for further property accesses.

This, combined with some changes to wasm-bindgen which we have upstreamed, allows our deserializer to be up to 3.5x faster in benchmarks than the original Serde support in wasm-bindgen, while saving ~33% off the resulting code size. Note that for string-heavy data structures it might still be slower than the current JSON-based integration, but situation is expected to improve over time when reference types proposal lands natively in Wasm.

After implementing and integrating this deserializer, we used the wasm-pack plugin for Webpack to build a Worker with both Rust and JavaScript parts combined and shipped it to some test zones.

Show me the numbers

Keep in mind that this proposal is in very early stages, and current benchmarks and demos are not representative of the final outcome (which should improve numbers much further).

As mentioned earlier, BinaryAST can mark functions that should be parsed lazily ahead of time. By using different levels of lazification in the encoder (https://github.com/binast/binjs-ref/blob/b72aff7dac7c692a604e91f166028af957cdcda5/crates/binjs_es6/src/lazy.rs#L43) and running tests against some popular JavaScript libraries, we found following speed-ups.

Level 0 (no functions are lazified)

With lazy parsing disabled in both parsers we got a raw parsing speed improvement of between 3 and 10%.

Level 3 (functions up to 3 levels deep are lazified)

But with the lazification set to skip nested functions of up to 3 levels we see much more dramatic improvements in parsing time between 90 and 97%. As mentioned earlier in the post, BinaryAST makes lazy parsing essentially free by completely skipping over the marked functions.

All the numbers are from manual tests on a Linux x64 Intel i7 with 16Gb of ram.

While these synthetic benchmarks are impressive, they are not representative of real-world scenarios. Normally you will use at least some of the loaded JavaScript during the startup. To check this scenario, we decided to test some realistic pages and demos on desktop and mobile Firefox and found speed-ups in page loads too.

For a sample application (https://github.com/cloudflare/binjs-demo, https://serve-binjs.that-test.site/) which weighed in at around 1.2 MB of JavaScript we got the following numbers for initial script execution:

Here is a video that will give you an idea of the improvement as seen by a user on mobile Firefox (in this case showing the entire page startup time):

Next step is to start gathering data on real-world websites, while improving the underlying format.

How do I test BinaryAST on my website?

We’ve open-sourced our Worker so that it could be installed on any Cloudflare zone: https://github.com/binast/binjs-ref/tree/cf-wasm.

One thing to be currently wary of is that, even though the result gets stored in the cache, the initial encoding is still an expensive process, and might easily hit CPU limits on any non-trivial JavaScript files and fall back to the unencoded variant. We are working to improve this situation by releasing BinaryAST encoder as a separate feature with more relaxed limits in the following few days.

Meanwhile, if you want to play with BinaryAST on larger real-world scripts, an alternative option is to use a static binjs_encode tool from https://github.com/binast/binjs-ref to pre-encode JavaScript files ahead of time. Then, you can use a Worker from https://github.com/cloudflare/binast-cf-worker to serve the resulting BinaryAST assets when supported and requested by the browser.

On the client side, you’ll currently need to download Firefox Nightly, go to about:config and enable unrestricted BinaryAST support via the following options:

Now, when opening a website with either of the Workers installed, Firefox will get BinaryAST instead of JavaScript automatically.

Summary

The amount of JavaScript in modern apps is presenting performance challenges for all consumers. Engine vendors are experimenting with different ways to improve the situation - some are focusing on raw decoding performance, some on parallelizing operations to reduce overall latency, some are researching new optimised formats for data representation, and some are inventing and improving protocols for the network delivery.

No matter which one it is, we all have a shared goal of making the Web better and faster. On Cloudflare's side, we're always excited about collaborating with all the vendors and combining various approaches to make that goal closer with every step.

Mobile App Development Company India | Ecommerce Web Development Company India

Mobile App Development Company India | Ecommerce Web Development Company India

Best Mobile App Development Company India, WebClues Global is one of the leading web and mobile app development company. Our team offers complete IT solutions including Cross-Platform App Development, CMS & E-Commerce, and UI/UX Design.

We are custom eCommerce Development Company working with all types of industry verticals and providing them end-to-end solutions for their eCommerce store development.

Know more about Top E-Commerce Web Development Company

JavaScript developers should you be using Web Workers?

JavaScript developers should you be using Web Workers?

Do you think JavaScript developers should be making more use of Web Workers to shift execution off of the main thread?

Originally published by David Gilbertson at https://medium.com

So, Web Workers. Those wonderful little critters that allow us to execute JavaScript off the main thread.

Also known as “no, you’re thinking of Service Workers”.

Photo by Caleb Jones on Unsplash

Before I get into the meat of the article, please sit for a lesson in how computers work:

Understood? Good.

For the red/green colourblind, let me explain. While a CPU is doing one thing, it can’t be doing another thing, which means you can’t sort a big array while a user scrolls the screen.

This is bad, if you have a big array and users with fingers.

Enter, Web Workers. These split open the atomic concept of a ‘CPU’ and allow us to think in terms of threads. We can use one thread to handle user-facing work like touch events and rendering the UI, and different threads to carry out all other work.

Check that out, the main thread is green the whole way through, ready to receive and respond to the gentle caress of a user.

You’re excited (I can tell), if we only have UI code on the main thread and all other code can go in a worker, things are going to be amazing (said the way Oprah would say it).

But cool your jets for just a moment, because websites are mostly about the UI — it’s why we have screens. And a lot of a user’s interactions with your site will be tapping on the screen, waiting for a response, reading, tapping, looking, reading, and so on.

So we can’t just say “here’s some JS that takes 20ms to run, chuck it on a thread”, we must think about where that execution time exists in the user’s world of tap, read, look, read, tap…

I like to boil this down to one specific question:

Is the user waiting anyway?

Imagine we have created some sort of git-repository-hosting website that shows all sorts of things about a repository. We have a cool feature called ‘issues’. A user can even click an ‘issues’ tab in our website to see a list of all issues relating to the repository. Groundbreaking!

When our users click this issues tab, the site is going to fetch the issue data, process it in some way — perhaps sort, or format dates, or work out which icon to show — then render the UI.

Inside the user’s computer, that’ll look exactly like this.

Look at that processing stage, locking up the main thread even though it has nothing to do with the UI! That’s terrible, in theory.

But think about what the human is actually doing at this point. They’re waiting for the common trio of network/process/render; just sittin’ around with less to do than the Bolivian Navy.

Because we care about our users, we show a loading indicator to let them know we’ve received their request and are working on it — putting the human in a ‘waiting’ state. Let’s add that to the diagram.

Now that we have a human in the picture, we can mix in a Web Worker and think about the impact it will have on their life:

Hmmm.

First thing to note is that we’re not doing anything in parallel. We need the data from the network before we process it, and we need to process the data before we can render the UI. The elapsed time doesn’t change.

(BTW, the time involved in moving data to a Web Worker and back is negligible: 1ms per 100 KB is a decent rule of thumb.)

So we can move work off the main thread and have a page that is responsive during that time, but to what end? If our user is sitting there looking at a spinner for 600ms, have we enriched their experience by having a responsive screen for the middle third?

No.

I’ve fudged these diagrams a little bit to make them the gorgeous specimens of graphic design that they are, but they’re not really to scale.

When responding to a user request, you’ll find that the network and DOM-manipulating part of any given task take much, much longer than the pure-JS data processing part.

I saw an article recently making the case that updating a Redux store was a good candidate for Web Workers because it’s not UI work (and non-UI work doesn’t belong on the main thread).

Chucking the data processing over to a worker thread sounds sensible, but the idea struck me as a little, umm, academic.

First, let’s split instances of ‘updating a store’ into two categories:

  1. Updating a store in response to a user interaction, then updating the UI in response to the data change
  2. Not that first one

If the first scenario, a user taps a button on the screen — perhaps to change the sort order of a list. The store updates, and this results in a re-rendering of the DOM (since that’s the point of a store).

Let me just delete one thing from the previous diagram:

In my experience, it is rare that the store-updating step goes beyond a few dozen milliseconds, and is generally followed by ten times that in DOM updating, layout, and paint. If I’ve got a site that’s taking longer than this, I’d be asking questions about why I have so much data in the browser and so much DOM, rather than on which thread I should do my processing.

So the question we’re faced with is the same one from above: the user tapped something on the screen, we’re going to work on that request for hopefully less than a second, why would we want to make the screen responsive during that time?

OK what about the second scenario, where a store update isn’t in response to a user interaction? Performing an auto-save, for example — there’s nothing more annoying than an app becoming unresponsive doing something you didn’t ask it to do.

Actually there’s heaps of things more annoying than that. Teens, for example.

Anyhoo, if you’re doing an auto-save and taking 100ms to process data client-side before sending it off to a server, then you should absolutely use a Web Worker.

In fact, any ‘background’ task that the user hasn’t asked for, or isn’t waiting for, is a good candidate for moving to a Web Worker.

The matter of value

Complexity is expensive, and implementing Web Workers ain’t cheap.

If you’re using a bundler — and you are — you’ll have a lot of reading to do, and probably npm packages to install. If you’ve got a create-react-app app, prepare to eject (and put aside two days twice a year to update 30 different packages when the next version of Babel/Redux/React/ESLint comes out).

Also, if you want to share anything fancier than plain data between a worker and the main thread you’ve got some more reading to do (comlink is your friend).

What I’m getting at is this: if the benefit is real, but minimal, then you’ve gotta ask if there’s something else you could spend a day or two on with a greater benefit to your users.

This thinking is true of everything, of course, but I’ve found that Web Workers have a particularly poor benefit-to-effort ratio.

Hey David, why you hate Web Workers so bad?

Good question.

This is a doweling jig:

I own a doweling jig. I love my doweling jig. If I need to drill a hole into the end of a piece of wood and ensure that it’s perfectly perpendicular to the surface, I use my doweling jig.

But I don’t use it to eat breakfast. For that I use a spoon.

Four years ago I was working on some fancy animations. They looked slick on a fast device, but janky on a slow one. So I wrote fireball-js, which executes a rudimentary performance benchmark on the user’s device and returns a score, allowing me to run my animations only on devices that would render them smoothly.

Where’s the best spot to run some CPU intensive code that the user didn’t request? On a different thread, of course. A Web Worker was the correct tool for the job.

Fast forward to 2019 and you’ll find me writing a routing algorithm for a mapping application. This requires parsing a big fat GeoJSON map into a collection of nodes and edges, to be used when a user asks for directions. The processing isn’t in response to a user request and the user isn’t waiting on it. And so, a Web Worker is the correct tool for the job.

It was only when doing this that it dawned on me: in the intervening quartet of years, I have seen exactly zero other instances where Web Workers would have improved the user experience.

Contrast this with a recent resurgence in Web Worker wonderment, and combine that contrast with the fact that I couldn’t think of anything else to write about, then concatenate that combined contrast with my contrarian character and you’ve got yourself a blog post telling you that maybe Web Workers are a teeny-tiny bit overhyped.

Thanks for reading

If you liked this post, share it with all of your programming buddies!

Follow us on Facebook | Twitter

Further reading

An Introduction to Web Workers

JavaScript Web Workers: A Beginner’s Guide

Using Web Workers to Real-time Processing

How to use Web Workers in Angular app

Using Web Workers with Angular CLI