Commit graph

24 commits

Author SHA1 Message Date
Andreas Kling
517812647a LibJS: Pack asm Call shared-data metadata
Pack the asm Call fast path metadata next to the executable pointer
so the interpreter can fetch both values with one paired load. This
removes several dependent shared-data loads from the hot path.

Keep the executable pointer and packed metadata in separate registers
through this binding so the fast path can still use the paired-load
layout after any non-strict this adjustment.

Lower the packed metadata flag checks correctly on x86_64 as well.
Those bits now live above bit 31, so the generator uses bt for single-
bit high masks and covers that path with a unit test.

Add a runtime test that exercises both object and global this binding
through the asm Call fast path.
2026-04-14 12:37:12 +02:00
Andreas Kling
df0fdee2a0 LibJS: Cache JS-to-JS inline call eligibility
Store whether a function can participate in JS-to-JS inline calls on
SharedFunctionInstanceData instead of recomputing the function kind,
class-constructor bit, and bytecode availability at each fast-path
call site.
2026-04-14 08:14:43 +02:00
Andreas Kling
879ac36e45 LibJS: Cache stable for-in iteration at bytecode sites
Cache the flattened enumerable key snapshot for each `for..in` site and
reuse a `PropertyNameIterator` when the receiver shape, dictionary
generation, indexed storage kind and length, prototype chain
validity, and magical-length state still match.

Handle packed indexed receivers as well as plain named-property
objects. Teach `ObjectPropertyIteratorNext` in `asmint.asm` to return
cached property values directly and to fall back to the slow iterator
logic when any guard fails.

Treat arrays' hidden non-enumerable `length` property as a visited
name for for-in shadowing, and include the receiver's magical-length
state in the cache key so arrays and plain objects do not share
snapshots.

Add `test-js` and `test-js-bytecode` coverage for mixed numeric and
named keys, packed receiver transitions, re-entry, iterator reuse, GC
retention, array length shadowing, and same-site cache reuse.
2026-04-10 15:12:53 +02:00
Andreas Kling
b23aa38546 AK: Adopt mimalloc v2 as main allocator
Use mimalloc for Ladybird-owned allocations without overriding malloc().
Route kmalloc(), kcalloc(), krealloc(), and kfree() through mimalloc,
and put the embedded Rust crates on the same allocator via a shared
shim in AK/kmalloc.cpp.

This also lets us drop kfree_sized(), since it no longer used its size
argument. StringData, Utf16StringData, JS object storage, Rust error
strings, and the CoreAudio playback helpers can all free their AK-backed
storage with plain kfree().

Sanitizer builds still use the system allocator. LeakSanitizer does not
reliably trace references stored in mimalloc-managed AK containers, so
static caches and other long-lived roots can look leaked. Pass the old
size into the Rust realloc shim so aligned fallback reallocations can
move posix_memalign-backed blocks safely.

Static builds still need a little linker help. macOS app binaries need
the Rust allocator entry points forced in from liblagom-ak.a, while
static ELF links can pull in identical allocator shim definitions from
multiple Rust staticlibs. Keep the Apple -u flags and allow those
duplicate shim symbols for LibJS and LibRegex links on Linux and BSD.
2026-04-08 09:57:53 +02:00
Shannon Booth
f27bc38aa7 Everywhere: Remove ShadowRealm support
The proposal has not seemed to progress for a while, and there is
a open issue about module imports which breaks HTML integration.
While we could probably make an AD-HOC change to fix that issue,
it is deep enough in the JS engine that I am not particularly
keen on making that change.

Until other browsers begin to make positive signals about shipping
ShadowRealms, let's remove our implementation for now.

There is still some cleanup that can be done with regard to the
HTML integration, but there are a few more items that need to be
untangled there.
2026-04-05 13:57:58 +02:00
Shannon Booth
adabc5cedb LibJS: Handle empty UTF-16 strings in Rust FFI
Treat zero length UTF-16 slices from Rust as empty views at the FFI
boundary instead of assuming a non null backing pointer.

Add a regression test which crashed before these changes. Fixes
a crash loading github.com/ladybirdbrowser/ladybird.
2026-03-31 22:33:36 +02:00
Shannon Booth
685c875431 LibJS: Avoid constructing Utf16View with a null pointer for empty string
When decoding a zero length string, we would pass through the (null)
data() of an empty Vector to the Utf16View constructor. Return an empty
PrimitiveString early instead.
2026-03-31 13:48:50 +01:00
Andreas Kling
c8a0a960b5 LibJS: Validate regex literals during parsing
Now that LibRegex is safe to use (for parsing) off the main thread,
we can validate regex literals directly while parsing JavaScript.

This allows us to remove the deferred regex compilation pass that we
previously ran on the main thread after parsing JS in the background.
2026-03-27 17:32:19 +01:00
Andreas Kling
e243e146de LibJS+LibRegex: Switch RegExp over to the Rust engine
Switch LibJS `RegExp` over to the Rust-backed `ECMAScriptRegex` APIs.

Route `new RegExp()`, regex literals, and the RegExp builtins through
the new compile and exec APIs, and stop re-validating patterns with the
deleted C++ parser on the way in. Preserve the observable error
behavior by carrying structured compile errors and backtracking-limit
failures across the FFI boundary. Cache compiled regex state and named
capture metadata on `RegExpObject` in the new representation.

Use the new API surface to simplify and speed up the builtin paths too:
share `exec_internal`, cache compiled regex pointers, keep the legacy
RegExp statics lazy, run global replace through batch `find_all`, and
optimize replace, test, split, and String helper paths. Add regression
tests for those JavaScript-visible paths.
2026-03-27 17:32:19 +01:00
Andreas Kling
169452f41b LibJS: Remove C++ parser
Delete Parser.cpp/h and ScopeCollector.cpp/h, now that all parsing
goes through the Rust pipeline.

Port test262-runner to use RustIntegration::parse_program() for its
fast parse-only check instead of the C++ Parser.

Add parsed_program_has_errors() and free_parsed_program() to the
RustIntegration public API for parse-only use cases.
2026-03-19 21:55:10 -05:00
Andreas Kling
0c7d50b33d LibJS: Remove LIBJS_CPP env var and ENABLE_RUST guards
The Rust pipeline is now the only compilation path, so remove:

- The LIBJS_CPP environment variable check
- The rust_pipeline_enabled() helper
- The #ifdef ENABLE_RUST / #else stub section
- The test-js-cpp CTest target and LIBJS_TEST_PARSER_MODE env var
- The ParserMode enum and canParseSourceWithCpp/Rust test functions

rust_pipeline_available() now unconditionally returns true.
2026-03-19 21:55:10 -05:00
Andreas Kling
2c45472a11 LibJS: Remove pipeline comparison infrastructure
Remove PipelineComparison.cpp/h and all LIBJS_COMPARE_PIPELINES
support from RustIntegration.cpp. This includes:

- The compare_pipelines_enabled() function
- All comparison blocks in compile_script/eval/module/function
- The pair_shared_function_data() helper
- The m_cpp_comparison_sfd field on SharedFunctionInstanceData

The Rust pipeline has been validated extensively through comparison
testing and no longer needs the side-by-side verification harness.
2026-03-19 21:55:10 -05:00
Andrew Kaster
f06bd0303f LibJS: Use enum for retrieving well known symbols from C++ to Rust 2026-03-19 09:48:32 +01:00
Andrew Kaster
5d43707896 LibJS: Directly use LiteralValueKind enum across FFI boundary 2026-03-19 09:48:32 +01:00
Andrew Kaster
92e4c20ad5 LibJS: Generate FFI header using cbindgen instead of hand-rolling
Replace the BytecodeFactory header with cbindgen.

This will help ensure that types and enums and constants are kept in
sync between the C++ and Rust code. It's also a step in exporting more
Rust enums directly rather than relying on magic constants for
switch statements.

The FFI functions are now all placed in the JS::FFI namespace, which
is the cause for all the churn in the scripting parts of LibJS and
LibWeb.
2026-03-17 20:49:50 -05:00
Andreas Kling
54a1a66112 LibJS: Store cache pointers directly in bytecode instructions
Instead of storing a u32 index into a cache vector and looking up the
cache at runtime through a chain of dependent loads (load Executable*,
load vector data pointer, multiply index, add), store the actual cache
pointer as a u64 directly in the instruction stream.

A fixup pass (Executable::fixup_cache_pointers()) runs after Executable
construction in both the Rust and C++ pipelines, walking the bytecode
and replacing each index with the corresponding pointer.

The cache pointer type is encoded in Bytecode.def (e.g.
PropertyLookupCache*, GlobalVariableCache*) so the fixup switch is
auto-generated by the Python Op code generator, making it impossible
to forget updating the fixup when adding new cached instructions.

This eliminates 3-4 dependent loads on every inline cache access in
both the C++ interpreter and the assembly interpreter.
2026-03-08 10:27:13 +01:00
Andreas Kling
3f4d3d6108 LibJS+LibWeb: Add C++ compile_parsed_module wrapper
Add compile_parsed_module() to RustIntegration, which takes a
RustParsedProgram and a SourceCode (from parse_program with
ProgramType::Module) and compiles it on the main thread with GC
interaction.

Rewrite compile_module() to use the new split functions internally.

Add SourceTextModule::parse_from_pre_parsed() and
JavaScriptModuleScript::create_from_pre_parsed() to allow creating
module scripts from a pre-parsed RustParsedProgram.

This prepares the infrastructure for off-thread module parsing.
2026-03-06 13:06:05 +01:00
Andreas Kling
d8921646f5 LibJS+LibWeb: Parse classic scripts off the main thread
Create a SourceCode on the main thread (performing UTF-8 to UTF-16
conversion), then submit parse_program() to the ThreadPool for
Rust parsing on a worker thread. This unblocks the WebContent event
loop during external script loading.

Add Script::create_from_parsed() and
ClassicScript::create_from_pre_parsed() factory methods that take a
pre-parsed RustParsedProgram and a SourceCode, performing only the
GC-allocating compile step on the main thread.

Falls back to synchronous parsing when the Rust pipeline is
unavailable (LIBJS_CPP=1 or LIBJS_COMPARE_PIPELINES=1).
2026-03-06 13:06:05 +01:00
Andreas Kling
6983c6b1a5 LibJS: Add C++ parse_program/compile_parsed_script wrappers
Expose the Rust parse/compile split to C++ callers:

- parse_program(): takes raw UTF-16 data and a ProgramType
  parameter (Script or Module). No GC interaction, thread-safe.
- compile_parsed_script(): takes a pre-parsed RustParsedProgram
  and a SourceCode, checks for errors, and calls
  rust_compile_parsed_script(). Returns a ScriptResult.

Rewrite compile_script() to use the split path internally. The
pipeline comparison logic now gets the AST dump from the
ParsedProgram before compilation consumes it.
2026-03-06 13:06:05 +01:00
Andreas Kling
8d51371e22 LibJS: Skip AST dump for ClassFieldInitializerStatement
ClassFieldInitializerStatement is a synthetic AST node that does not
support dump_to_string(). Guard the pipeline comparison code against
this case to avoid a VERIFY failure when comparing class field
initializer functions.
2026-03-01 21:20:54 +01:00
Andreas Kling
766567a9d5 LibJS: Compare Rust and C++ bytecode for lazily compiled functions
LIBJS_COMPARE_PIPELINES previously only compared top-level
script/eval/module bytecodes. Function bodies are compiled lazily
via compile_function(), and that path had no comparison at all.

Fix this by pairing each Rust-compiled SharedFunctionInstanceData
with its C++ counterpart during top-level compilation. When a
function is later lazily compiled, compile_function() runs both
pipelines and compares the bytecodes (crashing on mismatch, same
as the top-level comparisons). The pairing is done recursively so
nested functions are also covered.
2026-03-01 21:20:54 +01:00
Andreas Kling
4b47245e99 LibJS: Cache ASCII-to-UTF-16 source conversion for Rust compilation
The Rust FFI requires UTF-16 source data, so ASCII-stored source code
must be widened to UTF-16. Previously, this conversion was done into a
temporary buffer on every call to compile_function, meaning the entire
source file was converted for each lazily-compiled function. For large
modules with many functions, this caused heavy spinning.

Move the conversion into SourceCode::utf16_data() which lazily converts
and caches the result once per source file. Subsequent compilations of
functions from the same file reuse the cached data.
2026-02-25 00:00:52 +01:00
Andreas Kling
a7d1c4baa6 LibJS: Defer GC during Rust-pipeline module and builtin compilation
The Rust bytecode pipeline stores SharedFunctionInstanceData pointers
as raw void pointers invisible to the GC. If garbage collection runs
during compilation (triggered by heap allocation of a new SFD), it
can collect previously created SFDs, leaving stale pointers that
crash during the next GC marking phase.

Every other Rust compilation entry point (compile_script, compile_eval,
compile_shadow_realm_eval, compile_dynamic_function, compile_function)
already uses GC::DeferGC to prevent this. Add the missing DeferGC to
compile_module and compile_builtin_file.
2026-02-25 00:00:52 +01:00
Andreas Kling
6cdfbd01a6 LibJS: Add alternative source-to-bytecode pipeline in Rust
Implement a complete Rust reimplementation of the LibJS frontend:
lexer, parser, AST, scope collector, and bytecode code generator.

The Rust pipeline is built via Corrosion (CMake-Cargo bridge) and
linked into LibJS as a static library. It is gated behind a build
flag (ENABLE_RUST, on by default except on Windows) and two runtime
environment variables:

- LIBJS_CPP: Use the C++ pipeline instead of Rust
- LIBJS_COMPARE_PIPELINES=1: Run both pipelines in lockstep,
  aborting on any difference in AST or bytecode generated.

The C++ side communicates with Rust through a C FFI layer
(RustIntegration.cpp/h) that passes source text to Rust and receives
a populated Executable back via a BytecodeFactory interface.
2026-02-24 09:39:42 +01:00