Now that the Rust pipeline is the sole compilation path, remove all
C++ parser/codegen fallback paths from the callers:
- Script::parse() no longer falls back to C++ Parser
- SourceTextModule::parse() no longer falls back to C++ Parser
- perform_eval() no longer falls back to C++ Parser + Generator
- create_dynamic_function() no longer falls back to C++ Parser
- ShadowRealm eval no longer falls back to C++ Parser + Generator
- Interpreter::run(Script&) no longer falls back to Generator
Also remove the now-dead old constructors that took C++ AST nodes,
the module_requests() helper, and AST dump code from js.cpp.
Replace the BytecodeFactory header with cbindgen.
This will help ensure that types and enums and constants are kept in
sync between the C++ and Rust code. It's also a step in exporting more
Rust enums directly rather than relying on magic constants for
switch statements.
The FFI functions are now all placed in the JS::FFI namespace, which
is the cause for all the churn in the scripting parts of LibJS and
LibWeb.
Create a SourceCode on the main thread (performing UTF-8 to UTF-16
conversion), then submit parse_program() to the ThreadPool for
Rust parsing on a worker thread. This unblocks the WebContent event
loop during external script loading.
Add Script::create_from_parsed() and
ClassicScript::create_from_pre_parsed() factory methods that take a
pre-parsed RustParsedProgram and a SourceCode, performing only the
GC-allocating compile step on the main thread.
Falls back to synchronous parsing when the Rust pipeline is
unavailable (LIBJS_CPP=1 or LIBJS_COMPARE_PIPELINES=1).
Implement a complete Rust reimplementation of the LibJS frontend:
lexer, parser, AST, scope collector, and bytecode code generator.
The Rust pipeline is built via Corrosion (CMake-Cargo bridge) and
linked into LibJS as a static library. It is gated behind a build
flag (ENABLE_RUST, on by default except on Windows) and two runtime
environment variables:
- LIBJS_CPP: Use the C++ pipeline instead of Rust
- LIBJS_COMPARE_PIPELINES=1: Run both pipelines in lockstep,
aborting on any difference in AST or bytecode generated.
The C++ side communicates with Rust through a C FFI layer
(RustIntegration.cpp/h) that passes source text to Rust and receives
a populated Executable back via a BytecodeFactory interface.
Add static factory methods create_for_function_node() on
SharedFunctionInstanceData and update all callers to use them instead
of FunctionNode::ensure_shared_data().
This removes the GC::Root<SharedFunctionInstanceData> cache from
FunctionNode, eliminating the coupling between the RefCounted AST
and GC-managed runtime objects. The cache was effectively dead code:
hoisted declarations use m_functions_to_initialize directly, and
function expressions always create fresh instances during codegen.
After compiling the bytecode executable on first run, null out the
AST (m_parse_node) and clear AnnexB candidates since they are no
longer needed. This frees the memory held by the entire AST for the
script's lifetime.
The parse_node() accessor now returns a nullable pointer. Callers
(js.cpp for AST dumping, Interpreter for first compilation) access
the AST before it is dropped.
Add Script::global_declaration_instantiation() that performs the GDI
algorithm using pre-computed name lists and shared function data
instead of walking the AST.
Runtime checks (has_lexical_declaration, can_declare_global_function,
etc.) remain since they depend on global environment state. AnnexB
iterates pre-collected candidates and calls
set_should_do_additional_annexB_steps() on stored refs.
The Interpreter::run(Script&) now calls the Script method instead of
the Program method.
Extract all data that Program::global_declaration_instantiation()
needs from the AST into value-type metadata stored on the Script
object, computed once during construction. This includes lexical
names, var names, functions to initialize, var scoped names, AnnexB
candidates, and lexical bindings.
This prepares for rewriting GDI to use the pre-computed metadata
instead of walking the AST.
This moves the responsibility of setting up a SourceCode object to the
users of JS::Lexer.
This means Lexer and Parser are free to use string views into the
SourceCode internally while working.
It also means Lexer no longer has to think about anything other than
UTF-16 (or ASCII) inputs. So the unit test for parsing various invalid
UTF-8 sequences is deleted here.
Resulting in a massive rename across almost everywhere! Alongside the
namespace change, we now have the following names:
* JS::NonnullGCPtr -> GC::Ref
* JS::GCPtr -> GC::Ptr
* JS::HeapFunction -> GC::Function
* JS::CellImpl -> GC::Cell
* JS::Handle -> GC::Root
Now that the heap has no knowledge about a JavaScript realm and is
purely for managing the memory of the heap, it does not make sense
to name this function to say that it is a non-realm variant.