Originally published by Jan de Mooij at https://hacks.mozilla.org
Modern web applications load and execute a lot more JavaScript code than they did just a few years ago. While JIT (just-in-time) compilers have been very successful in making JavaScript performant, we needed a better solution to deal with these new workloads.
To address this, we’ve added a new, generated JavaScript bytecode interpreter to the JavaScript engine in Firefox 70. The interpreter is available now in the Firefox Nightly channel, and will go to general release in October. Instead of writing or generating a new interpreter from scratch, we found a way to do this by sharing most code with our existing Baseline JIT.
The new Baseline Interpreter has resulted in performance improvements, memory usage reductions and code simplifications. Here’s how we got there:
In modern JavaScript engines, each function is initially executed in a bytecode interpreter. Functions that are called a lot (or perform many loop iterations) are compiled to native machine code.
Firefox has an interpreter written in C++ and multiple JIT tiers:
Ion JIT code for a function can be ‘deoptimized’ and thrown away for various reasons, for example when the function is called with a new argument type. This is called a bailout. When a bailout happens, execution continues in the Baseline code until the next Ion compilation.
Until Firefox 70, the execution pipeline for a very hot function looked like this:
Although this works pretty well, we ran into the following problems with the first part of the pipeline (C++ Interpreter and Baseline JIT):
We needed type information from the Baseline JIT to enable the more optimized tiers, and we wanted to use JIT compilation for runtime speed. However, the modern web has such large codebases that even the relatively fast Baseline JIT Compiler spent a lot of time compiling. To address this, Firefox 70 adds a new tier called the Baseline Interpreter to the pipeline:
The Baseline Interpreter sits between the C++ interpreter and the Baseline JIT and has elements from both. It executes all bytecode instructions with a fixed interpreter loop (like the C++ interpreter). In addition, it uses Inline Caches to improve performance and collect type information (like the Baseline JIT).
Generating an interpreter isn’t a new idea. However, we found a nice new way to do it by reusing most of the Baseline JIT Compiler code. The Baseline JIT is a template JIT, meaning each bytecode instruction is compiled to a mostly fixed sequence of machine instructions. We generate those sequences into an interpreter loop instead.
As mentioned above, the Baseline JIT uses Inline Caches (ICs) both to make it fast and to help Ion compilation. To get type information, the Ion JIT compiler can inspect the Baseline ICs.
Because we wanted the Baseline Interpreter to use exactly the same Inline Caches and type information as the Baseline JIT, we added a new data structure called JitScript. JitScript contains all type information and IC data structures used by both the Baseline Interpreter and JIT.
The diagram below shows what this looks like in memory. Each arrow is a pointer in C++. Initially, the function just has a JSScript with the bytecode that can be interpreted by the C++ interpreter. After a few calls/iterations we create the JitScript, attach it to the JSScript and can now run the script in the Baseline Interpreter.
As the code gets warmer we may also create the BaselineScript (Baseline JIT code) and then the IonScript (Ion JIT code).
Note that the Baseline JIT data for a function is now just the machine code. We’ve moved all the inline caches and profiling data into JitScript.
The Baseline Interpreter uses the same frame layout as the Baseline JIT, but we’ve added some interpreter-specific fields to the frame. For example, the bytecode PC (program counter), a pointer to the bytecode instruction we are currently executing, is not updated explicitly in Baseline JIT code. It can be determined from the return address if needed, but the Baseline Interpreter has to store it in the frame.
Sharing the frame layout like this has a lot of advantages. We’ve made almost no changes to C++ and IC code to support Baseline Interpreter frames—they’re just like Baseline JIT frames. Furthermore, When the script is warm enough for Baseline JIT compilation, switching from Baseline Interpreter code to Baseline JIT code is a matter of jumping from the interpreter code into JIT code.
Because the Baseline Interpreter and JIT are so similar, a lot of the code generation code can be shared too. To do this, we added a templated BaselineCodeGen
base class with two derived classes:
BaselineCompiler
: used by the Baseline JIT to compile a script’s bytecode to machine code.BaselineInterpreterGenerator
: used to generate the Baseline Interpreter code.The base class has a Handler C++ template argument that can be used to specialize behavior for either the Baseline Interpreter or JIT. A lot of Baseline JIT code can be shared this way. For example, the implementation of the JSOP_GETPROP
bytecode instruction (for a property access like obj.foo
in JavaScript code) is shared code. It calls the emitNextIC
helper method that’s specialized for either Interpreter or JIT mode.
With all these pieces in place, we were able to implement the BaselineInterpreterGenerator
class to generate the Baseline Interpreter! It generates a threaded interpreter loop: The code for each bytecode instruction is followed by an indirect jump to the next bytecode instruction.
For example, on x64 we currently generate the following machine code to interpret JSOP_ZERO
(bytecode instruction to push a zero value on the stack):
// Push Int32Value(0).movabsq $-0x7800000000000, %r11
pushq %r11
// Increment bytecode pc register.
addq $0x1, %r14
// Patchable NOP for debugger support.
nopl (%rax,%rax)
// Load the next opcode.
movzbl (%r14), %ecx
// Jump to interpreter code for the next instruction.
leaq 0x432e(%rip), %rbx
jmpq *(%rbx,%rcx,8)
When we enabled the Baseline Interpreter in Firefox Nightly (version 70) back in July, we increased the Baseline JIT warm-up threshold from 10 to 100. The warm-up count is determined by counting the number of calls to the function + the number of loop iterations so far. The Baseline Interpreter has a threshold of 10, same as the old Baseline JIT threshold. This means that the Baseline JIT has a lot less code to compile.
After this landed in Firefox Nightly our performance testing infrastructure detected several improvements:
Note that we’ve landed more performance improvements since this first landed.
To measure how the Baseline Interpreter’s performance compares to the C++ Interpreter and the Baseline JIT, I ran Speedometer and Google Docs on Windows 10 64-bit on Mozilla’s Try server and enabled the tiers one by one. (The following numbers reflect the best of 7 runs.):
On Google Docs we see that the Baseline Interpreter is much faster than just the C++ Interpreter. Enabling the Baseline JIT too makes the page load only a little bit faster.
On the Speedometer benchmark we get noticeably better results when we enable the Baseline JIT tier. The Baseline Interpreter does again much better than just the C++ Interpreter:
We think these numbers are great: the Baseline Interpreter is much faster than the C++ Interpreter and its start-up time (JitScript allocation) is much faster than Baseline JIT compilation (at least 10 times faster).
After this all landed and stuck, we were able to simplify the Baseline JIT and Ion code by taking advantage of the Baseline Interpreter.
For example, deoptimization bailouts from Ion now resume in the Baseline Interpreter instead of in the Baseline JIT. The interpreter can re-enter Baseline JIT code at the next loop iteration in the JS code. Resuming in the interpreter is much easier than resuming in the middle of Baseline JIT code. We now have to record less metadata for Baseline JIT code, so Baseline JIT compilation got faster too. Similarly, we were able to remove a lot of complicated code for debugger support and exception handling.
With the Baseline Interpreter in place, it should now be possible to move Baseline JIT compilation off-thread. We will be working on that in the coming months, and we anticipate more performance improvements in this area.
Thanks for reading ❤
If you liked this post, please do share/like it with all of your programming buddies!
Follow us on Facebook | Twitter
☞ The Complete JavaScript Course 2019: Build Real Projects!
☞ Vue JS 2 - The Complete Guide (incl. Vue Router & Vuex)
☞ JavaScript Bootcamp - Build Real World Applications
☞ JavaScript Programming Tutorial - Full JavaScript Course for Beginners
☞ New ES2019 Features Every JavaScript Developer Should Know
☞ Best JavaScript Frameworks, Libraries and Tools to Use in 2019
☞ JavaScript Basics Before You Learn React
#javascript #web-development