Pack the asm Call fast path metadata next to the executable pointer
so the interpreter can fetch both values with one paired load. This
removes several dependent shared-data loads from the hot path.
Keep the executable pointer and packed metadata in separate registers
through this binding so the fast path can still use the paired-load
layout after any non-strict this adjustment.
Lower the packed metadata flag checks correctly on x86_64 as well.
Those bits now live above bit 31, so the generator uses bt for single-
bit high masks and covers that path with a unit test.
Add a runtime test that exercises both object and global this binding
through the asm Call fast path.
Store whether a function can participate in JS-to-JS inline calls on
SharedFunctionInstanceData instead of recomputing the function kind,
class-constructor bit, and bytecode availability at each fast-path
call site.
Cache the flattened enumerable key snapshot for each `for..in` site and
reuse a `PropertyNameIterator` when the receiver shape, dictionary
generation, indexed storage kind and length, prototype chain
validity, and magical-length state still match.
Handle packed indexed receivers as well as plain named-property
objects. Teach `ObjectPropertyIteratorNext` in `asmint.asm` to return
cached property values directly and to fall back to the slow iterator
logic when any guard fails.
Treat arrays' hidden non-enumerable `length` property as a visited
name for for-in shadowing, and include the receiver's magical-length
state in the cache key so arrays and plain objects do not share
snapshots.
Add `test-js` and `test-js-bytecode` coverage for mixed numeric and
named keys, packed receiver transitions, re-entry, iterator reuse, GC
retention, array length shadowing, and same-site cache reuse.
Use mimalloc for Ladybird-owned allocations without overriding malloc().
Route kmalloc(), kcalloc(), krealloc(), and kfree() through mimalloc,
and put the embedded Rust crates on the same allocator via a shared
shim in AK/kmalloc.cpp.
This also lets us drop kfree_sized(), since it no longer used its size
argument. StringData, Utf16StringData, JS object storage, Rust error
strings, and the CoreAudio playback helpers can all free their AK-backed
storage with plain kfree().
Sanitizer builds still use the system allocator. LeakSanitizer does not
reliably trace references stored in mimalloc-managed AK containers, so
static caches and other long-lived roots can look leaked. Pass the old
size into the Rust realloc shim so aligned fallback reallocations can
move posix_memalign-backed blocks safely.
Static builds still need a little linker help. macOS app binaries need
the Rust allocator entry points forced in from liblagom-ak.a, while
static ELF links can pull in identical allocator shim definitions from
multiple Rust staticlibs. Keep the Apple -u flags and allow those
duplicate shim symbols for LibJS and LibRegex links on Linux and BSD.
The proposal has not seemed to progress for a while, and there is
a open issue about module imports which breaks HTML integration.
While we could probably make an AD-HOC change to fix that issue,
it is deep enough in the JS engine that I am not particularly
keen on making that change.
Until other browsers begin to make positive signals about shipping
ShadowRealms, let's remove our implementation for now.
There is still some cleanup that can be done with regard to the
HTML integration, but there are a few more items that need to be
untangled there.
Treat zero length UTF-16 slices from Rust as empty views at the FFI
boundary instead of assuming a non null backing pointer.
Add a regression test which crashed before these changes. Fixes
a crash loading github.com/ladybirdbrowser/ladybird.
When decoding a zero length string, we would pass through the (null)
data() of an empty Vector to the Utf16View constructor. Return an empty
PrimitiveString early instead.
Now that LibRegex is safe to use (for parsing) off the main thread,
we can validate regex literals directly while parsing JavaScript.
This allows us to remove the deferred regex compilation pass that we
previously ran on the main thread after parsing JS in the background.
Switch LibJS `RegExp` over to the Rust-backed `ECMAScriptRegex` APIs.
Route `new RegExp()`, regex literals, and the RegExp builtins through
the new compile and exec APIs, and stop re-validating patterns with the
deleted C++ parser on the way in. Preserve the observable error
behavior by carrying structured compile errors and backtracking-limit
failures across the FFI boundary. Cache compiled regex state and named
capture metadata on `RegExpObject` in the new representation.
Use the new API surface to simplify and speed up the builtin paths too:
share `exec_internal`, cache compiled regex pointers, keep the legacy
RegExp statics lazy, run global replace through batch `find_all`, and
optimize replace, test, split, and String helper paths. Add regression
tests for those JavaScript-visible paths.
Delete Parser.cpp/h and ScopeCollector.cpp/h, now that all parsing
goes through the Rust pipeline.
Port test262-runner to use RustIntegration::parse_program() for its
fast parse-only check instead of the C++ Parser.
Add parsed_program_has_errors() and free_parsed_program() to the
RustIntegration public API for parse-only use cases.
The Rust pipeline is now the only compilation path, so remove:
- The LIBJS_CPP environment variable check
- The rust_pipeline_enabled() helper
- The #ifdef ENABLE_RUST / #else stub section
- The test-js-cpp CTest target and LIBJS_TEST_PARSER_MODE env var
- The ParserMode enum and canParseSourceWithCpp/Rust test functions
rust_pipeline_available() now unconditionally returns true.
Remove PipelineComparison.cpp/h and all LIBJS_COMPARE_PIPELINES
support from RustIntegration.cpp. This includes:
- The compare_pipelines_enabled() function
- All comparison blocks in compile_script/eval/module/function
- The pair_shared_function_data() helper
- The m_cpp_comparison_sfd field on SharedFunctionInstanceData
The Rust pipeline has been validated extensively through comparison
testing and no longer needs the side-by-side verification harness.
Replace the BytecodeFactory header with cbindgen.
This will help ensure that types and enums and constants are kept in
sync between the C++ and Rust code. It's also a step in exporting more
Rust enums directly rather than relying on magic constants for
switch statements.
The FFI functions are now all placed in the JS::FFI namespace, which
is the cause for all the churn in the scripting parts of LibJS and
LibWeb.
Instead of storing a u32 index into a cache vector and looking up the
cache at runtime through a chain of dependent loads (load Executable*,
load vector data pointer, multiply index, add), store the actual cache
pointer as a u64 directly in the instruction stream.
A fixup pass (Executable::fixup_cache_pointers()) runs after Executable
construction in both the Rust and C++ pipelines, walking the bytecode
and replacing each index with the corresponding pointer.
The cache pointer type is encoded in Bytecode.def (e.g.
PropertyLookupCache*, GlobalVariableCache*) so the fixup switch is
auto-generated by the Python Op code generator, making it impossible
to forget updating the fixup when adding new cached instructions.
This eliminates 3-4 dependent loads on every inline cache access in
both the C++ interpreter and the assembly interpreter.
Add compile_parsed_module() to RustIntegration, which takes a
RustParsedProgram and a SourceCode (from parse_program with
ProgramType::Module) and compiles it on the main thread with GC
interaction.
Rewrite compile_module() to use the new split functions internally.
Add SourceTextModule::parse_from_pre_parsed() and
JavaScriptModuleScript::create_from_pre_parsed() to allow creating
module scripts from a pre-parsed RustParsedProgram.
This prepares the infrastructure for off-thread module parsing.
Create a SourceCode on the main thread (performing UTF-8 to UTF-16
conversion), then submit parse_program() to the ThreadPool for
Rust parsing on a worker thread. This unblocks the WebContent event
loop during external script loading.
Add Script::create_from_parsed() and
ClassicScript::create_from_pre_parsed() factory methods that take a
pre-parsed RustParsedProgram and a SourceCode, performing only the
GC-allocating compile step on the main thread.
Falls back to synchronous parsing when the Rust pipeline is
unavailable (LIBJS_CPP=1 or LIBJS_COMPARE_PIPELINES=1).
Expose the Rust parse/compile split to C++ callers:
- parse_program(): takes raw UTF-16 data and a ProgramType
parameter (Script or Module). No GC interaction, thread-safe.
- compile_parsed_script(): takes a pre-parsed RustParsedProgram
and a SourceCode, checks for errors, and calls
rust_compile_parsed_script(). Returns a ScriptResult.
Rewrite compile_script() to use the split path internally. The
pipeline comparison logic now gets the AST dump from the
ParsedProgram before compilation consumes it.
ClassFieldInitializerStatement is a synthetic AST node that does not
support dump_to_string(). Guard the pipeline comparison code against
this case to avoid a VERIFY failure when comparing class field
initializer functions.
LIBJS_COMPARE_PIPELINES previously only compared top-level
script/eval/module bytecodes. Function bodies are compiled lazily
via compile_function(), and that path had no comparison at all.
Fix this by pairing each Rust-compiled SharedFunctionInstanceData
with its C++ counterpart during top-level compilation. When a
function is later lazily compiled, compile_function() runs both
pipelines and compares the bytecodes (crashing on mismatch, same
as the top-level comparisons). The pairing is done recursively so
nested functions are also covered.
The Rust FFI requires UTF-16 source data, so ASCII-stored source code
must be widened to UTF-16. Previously, this conversion was done into a
temporary buffer on every call to compile_function, meaning the entire
source file was converted for each lazily-compiled function. For large
modules with many functions, this caused heavy spinning.
Move the conversion into SourceCode::utf16_data() which lazily converts
and caches the result once per source file. Subsequent compilations of
functions from the same file reuse the cached data.
The Rust bytecode pipeline stores SharedFunctionInstanceData pointers
as raw void pointers invisible to the GC. If garbage collection runs
during compilation (triggered by heap allocation of a new SFD), it
can collect previously created SFDs, leaving stale pointers that
crash during the next GC marking phase.
Every other Rust compilation entry point (compile_script, compile_eval,
compile_shadow_realm_eval, compile_dynamic_function, compile_function)
already uses GC::DeferGC to prevent this. Add the missing DeferGC to
compile_module and compile_builtin_file.
Implement a complete Rust reimplementation of the LibJS frontend:
lexer, parser, AST, scope collector, and bytecode code generator.
The Rust pipeline is built via Corrosion (CMake-Cargo bridge) and
linked into LibJS as a static library. It is gated behind a build
flag (ENABLE_RUST, on by default except on Windows) and two runtime
environment variables:
- LIBJS_CPP: Use the C++ pipeline instead of Rust
- LIBJS_COMPARE_PIPELINES=1: Run both pipelines in lockstep,
aborting on any difference in AST or bytecode generated.
The C++ side communicates with Rust through a C FFI layer
(RustIntegration.cpp/h) that passes source text to Rust and receives
a populated Executable back via a BytecodeFactory interface.