Implement a complete Rust reimplementation of the LibJS frontend:
lexer, parser, AST, scope collector, and bytecode code generator.
The Rust pipeline is built via Corrosion (CMake-Cargo bridge) and
linked into LibJS as a static library. It is gated behind a build
flag (ENABLE_RUST, on by default except on Windows) and two runtime
environment variables:
- LIBJS_CPP: Use the C++ pipeline instead of Rust
- LIBJS_COMPARE_PIPELINES=1: Run both pipelines in lockstep,
aborting on any difference in AST or bytecode generated.
The C++ side communicates with Rust through a C FFI layer
(RustIntegration.cpp/h) that passes source text to Rust and receives
a populated Executable back via a BytecodeFactory interface.
Now that initialize_environment() uses pre-computed data and
execute_module() caches its executable / uses TLA shared data,
we can drop the AST reference after it's no longer needed.
For TLA modules, the AST is dropped immediately after constructing
the SharedFunctionInstanceData (which takes its own ref). For non-TLA
modules, the AST is dropped after the first bytecode compilation.
Also remove the m_default_export field (replaced by the pre-computed
m_default_export_binding_name) and extract default export info in
parse() instead of the constructor.
For non-TLA modules, cache the compiled Bytecode::Executable on the
SourceTextModule so we only compile once.
For TLA modules, pre-create the SharedFunctionInstanceData for the
async wrapper function at construction time. The wrapper function's
first invocation will compile it to bytecode and then automatically
drop the AST via clear_compile_inputs().
Extract the data needed by initialize_environment() from the AST at
construction time, following the pattern already used by Script for
global declaration instantiation.
This pre-computes var declared names, lexical bindings (with function
indices), functions to initialize (with SharedFunctionInstanceData),
and the default export binding name.
This has quite a lot of fall out. But the majority of it is just type or
UDL substitution, where the changes just fall through to other function
calls.
By changing property key storage to UTF-16, the main affected areas are:
* NativeFunction names must now be UTF-16
* Bytecode identifiers must now be UTF-16
* Module/binding names must now be UTF-16
This reverts commit c14173f651. We
should only annotate the minimum number of symbols that external
consumers actually use, so I am starting from scratch to do that
Resulting in a massive rename across almost everywhere! Alongside the
namespace change, we now have the following names:
* JS::NonnullGCPtr -> GC::Ref
* JS::GCPtr -> GC::Ptr
* JS::HeapFunction -> GC::Function
* JS::CellImpl -> GC::Cell
* JS::Handle -> GC::Root