Add a ref-counted decoded bytecode cache backing so bytecode cache
materialization can create fresh script or module records from a shared
decoded sidecar without passing around one-shot raw blob ownership.
Keep that backing in ExecutableBacking for records materialized from
bytecode cache sidecars, so the immutable decoded data stays alive for
as long as the installed record needs it.
Cover the shared backing path with a bytecode-cache test that
materializes and runs two scripts from one decoded backing.
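A minimal sketch of the shared-backing idea, assuming std::shared_ptr as a stand-in for the ref-counted backing. DecodedBacking, Record, and materialize() are hypothetical names for illustration, not the real LibJS types:

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the immutable decoded sidecar data.
struct DecodedBacking {
    std::vector<std::uint8_t> bytecode;
};

// Hypothetical stand-in for an installed script or module record.
class Record {
public:
    explicit Record(std::shared_ptr<DecodedBacking const> backing)
        : m_backing(std::move(backing))
    {
    }

    // "Running" the record here just sums the borrowed bytes.
    unsigned run() const
    {
        unsigned sum = 0;
        for (auto byte : m_backing->bytecode)
            sum += byte;
        return sum;
    }

private:
    // Shared ownership keeps the decoded data alive as long as the record.
    std::shared_ptr<DecodedBacking const> m_backing;
};

// Materialize a fresh record from a shared backing. The caller keeps its
// reference, so the same backing can be materialized again later.
inline Record materialize(std::shared_ptr<DecodedBacking const> const& backing)
{
    return Record(backing);
}
```

Two records created from one backing each hold a reference, so dropping the caller's reference does not invalidate either record.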
Store JavaScript bytecode side data in the WebContent HTTP memory
cache and replay it when serving cached responses. Also update an
already-complete memory-cache entry when asynchronous bytecode cache
generation finishes, so the first source-only response does not keep
shadowing the disk-cache sidecar during same-process navigations.
Keep the HTTP memory-cache backfill keyed with the request headers that
populated the memory-cache entry, so Vary responses still receive their
generated bytecode sidecar.
Add LibHTTP coverage for round-tripping bytecode side data through a
memory-cache entry, attaching it after the response body has already
been cached, and matching Vary headers during updates. Add LibWeb
coverage for preserving the memory-cache request headers when cloning
responses.
Warm cache hits used to validate bytecode cache blobs on the main
thread. Route script and module sidecars through a worker step that
decodes and validates the cache blob, then returns the validated blob
for main-thread materialization.
Keep the source bytes mmap-backed and avoid decoding the full source on
cache hits. The main thread still computes the UTF-16 source length so
validation can reject stale blobs before materialization.
Remove the decoded source length getter now that sidecar validation uses
the explicit validation API instead.
The Rust pipeline is always available now, so the getter only wrapped
code that always ran. Remove it and make the JavaScript fetch paths use
the off-thread Rust pipeline directly.
This also removes the unreachable synchronous fallback branches from the
classic script and module fetch paths.
Post off-thread font, script, and DNS completion work back to direct
Core::EventLoop references. These callbacks target the process main
loop, which is intentionally kept alive for the lifetime of the
process.
Switch bytecode cache source identity from decoded UTF-16 source text to
the original encoded response bytes plus the effective source encoding.
Store the decoded source length in the cache blob header so warm loads
can build lazy SourceCode objects without decoding the source before
checking the sidecar.
This removes the main-thread decoded_source_text_info pass from valid
warm-cache script and module loads. The source is only decoded on cache
miss, or when a rejected sidecar falls back to source compilation.
After generating a bytecode cache blob, map it as ImmutableBytes and ask
the owning JS record to install it. This lets a cold-cache load move to
mapped cache backing without waiting for a later page load.
Avoid decoding warm-cache script responses into full UTF-16 SourceCode
buffers when a bytecode cache sidecar is available. SourceCode now keeps
the original immutable source bytes and source encoding, then decodes
only when full source text or a Function.toString() range is requested.
Compute the bytecode cache source hash while streaming decoded code
points from the response bytes, so cache validation does not force an
intermediate UTF-8 string. Function and class source text metadata now
stores SourceCode ranges instead of views into a materialized buffer.
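A rough sketch of decode-on-demand source text. LazySource is a hypothetical class, and the decoder assumes one byte per code unit (ASCII or Latin-1) purely to keep the example short; the real SourceCode honors the stored source encoding:

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for a SourceCode that keeps the encoded bytes.
class LazySource {
public:
    explicit LazySource(std::string encoded_bytes)
        : m_bytes(std::move(encoded_bytes))
    {
    }

    // Decode only when the full text is requested, then reuse the result.
    std::u16string const& text() const
    {
        if (!m_decoded.has_value()) {
            std::u16string decoded;
            decoded.reserve(m_bytes.size());
            // Assumption: every byte maps to exactly one UTF-16 code unit.
            for (unsigned char byte : m_bytes)
                decoded.push_back(static_cast<char16_t>(byte));
            m_decoded = std::move(decoded);
            ++m_decode_count;
        }
        return *m_decoded;
    }

    std::size_t decode_count() const { return m_decode_count; }

private:
    std::string m_bytes;
    mutable std::optional<std::u16string> m_decoded;
    mutable std::size_t m_decode_count { 0 };
};
```

A warm-cache load that never asks for the text never pays for the decode.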
Keep consumed response body bytes in Core::ImmutableBytes instead of
requiring a ByteBuffer. This lets responses that already arrived as
file-backed immutable data keep that representation through body
consumption, while streamed responses can still adopt their
accumulated ByteBuffer without another copy.
Update the body consumers that only inspect bytes to read from
immutable byte views. Font loading still copies at its existing
ownership boundary, where the off-thread preparation path takes a
ByteBuffer.
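A small sketch of adopting an accumulated buffer without a copy. The ImmutableBytes class below is a simplified hypothetical model built on std::vector and std::shared_ptr, not the real Core::ImmutableBytes API:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

using ByteBuffer = std::vector<std::uint8_t>;

// Hypothetical model of immutable, shareable response bytes.
class ImmutableBytes {
public:
    // Adopt an accumulated buffer by moving it; the bytes are not copied.
    static ImmutableBytes adopt(ByteBuffer&& buffer)
    {
        return ImmutableBytes(std::make_shared<ByteBuffer>(std::move(buffer)));
    }

    std::uint8_t const* data() const { return m_storage->data(); }
    std::size_t size() const { return m_storage->size(); }

private:
    explicit ImmutableBytes(std::shared_ptr<ByteBuffer const> storage)
        : m_storage(std::move(storage))
    {
    }

    // Copies of ImmutableBytes share this storage instead of duplicating it.
    std::shared_ptr<ByteBuffer const> m_storage;
};
```

Consumers that only inspect bytes can read through data() and size(); a copy is only needed where a caller requires its own mutable buffer.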
Hash ASCII-backed script source as equivalent UTF-16 data in fixed-size
chunks instead of first forcing SourceCode to allocate a full widened
copy. Keep the existing bytecode cache key stable by preserving the same
byte sequence that the previous utf16_data() path hashed.
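A sketch of why chunked hashing can keep the key stable, assuming a streaming hash. FNV-1a is used here only as a simple stand-in for the real hash, and the function names and the 64-unit chunk size are illustrative assumptions:

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <vector>

constexpr std::uint64_t fnv_offset = 14695981039346656037ull;
constexpr std::uint64_t fnv_prime = 1099511628211ull;

// Feed more bytes into a running FNV-1a hash.
inline void hash_update(std::uint64_t& hash, unsigned char const* data, std::size_t size)
{
    for (std::size_t i = 0; i < size; ++i) {
        hash ^= data[i];
        hash *= fnv_prime;
    }
}

// Previous approach: allocate a full widened copy, then hash it.
inline std::uint64_t hash_widened(std::string_view ascii)
{
    std::vector<char16_t> widened;
    widened.reserve(ascii.size());
    for (unsigned char c : ascii)
        widened.push_back(static_cast<char16_t>(c));
    std::uint64_t hash = fnv_offset;
    hash_update(hash, reinterpret_cast<unsigned char const*>(widened.data()),
        widened.size() * sizeof(char16_t));
    return hash;
}

// Chunked approach: widen into a small fixed buffer and hash each chunk.
inline std::uint64_t hash_chunked(std::string_view ascii)
{
    std::array<char16_t, 64> chunk {};
    std::uint64_t hash = fnv_offset;
    for (std::size_t offset = 0; offset < ascii.size(); offset += chunk.size()) {
        std::size_t count = std::min(chunk.size(), ascii.size() - offset);
        for (std::size_t i = 0; i < count; ++i)
            chunk[i] = static_cast<char16_t>(static_cast<unsigned char>(ascii[offset + i]));
        hash_update(hash, reinterpret_cast<unsigned char const*>(chunk.data()),
            count * sizeof(char16_t));
    }
    return hash;
}
```

Because both paths feed the hash the same byte sequence in the same order, the results match while the chunked path needs only a fixed-size scratch buffer.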
Keep executable bytecode payloads decoded from owner-backed bytecode
cache blobs as ranges into the original blob instead of copying them
into Rust Vec allocations. The mapped blob owner is held by decoded
executable records, including lazy nested function executables, so the
borrowed bytecode remains alive until materialization copies it into the
final C++ Executable.
Use the owner-backed decoder for HTTP bytecode cache hits and keep the
plain byte decoder for tests and in-memory callers. Add coverage for
materializing bytecode cache data from an ImmutableBytes mapped file.
Remove the concrete WebAssemblyModule include from ModuleScript.h and
include it only in the .cpp files that need the complete type. This
keeps the module script header from pulling WebAssembly implementation
details into unrelated LibWeb translation units.
Attach cached JavaScript bytecode sidecars to HTTP response headers so
WebContent can materialize classic and module scripts directly from a
decoded cache blob on cache hits.
Carry the disk cache vary key with the sidecar and reuse it when storing
fresh bytecode, avoiding mismatches against the augmented network
request headers used to create the cache entry.
Keep CORS-filtered module responses intact for status, MIME, and script
creation checks. Read bytecode sidecar data only from the internal
response, and treat decode or materialization failure as a cache miss
that falls back to normal source compilation.
Schedule JavaScript bytecode cache generation after downloaded classic
scripts and modules have been handed back to the main thread.
The cache job reparses and fully compiles on the thread pool,
serializes the bytecode blob, and stores it as HTTP cache sidecar data.
RequestServer finalizes the disk cache entry before notifying
WebContent, so the script fetcher can attach the sidecar immediately.
Store a SHA-256 fingerprint of the decoded source text in each bytecode
cache blob, and require callers to provide the expected fingerprint when
validating or decoding a blob.
This rejects sidecars for stale HTTP cache entries whose URL and request
headers still match but whose source body has been replaced. Bytecode
cache tests cover the mismatched-source rejection path.
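A simplified sketch of fingerprint-gated decoding. The standard library has no SHA-256, so FNV-1a stands in for it here; CacheBlob, encode_blob(), and decode_blob() are hypothetical names, not the real bytecode cache API:

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>
#include <utility>

// Stand-in fingerprint (FNV-1a); the commit describes SHA-256.
inline std::uint64_t fingerprint(std::string_view source)
{
    std::uint64_t hash = 14695981039346656037ull;
    for (unsigned char byte : source) {
        hash ^= byte;
        hash *= 1099511628211ull;
    }
    return hash;
}

// Hypothetical blob layout: the source fingerprint travels with the bytecode.
struct CacheBlob {
    std::uint64_t source_fingerprint;
    std::string bytecode;
};

inline CacheBlob encode_blob(std::string_view source, std::string bytecode)
{
    return CacheBlob { fingerprint(source), std::move(bytecode) };
}

// Callers must supply the fingerprint of the source they actually have.
// A mismatch means the cached body was replaced, so the blob is rejected.
inline std::optional<std::string> decode_blob(CacheBlob const& blob, std::uint64_t expected)
{
    if (blob.source_fingerprint != expected)
        return std::nullopt;
    return blob.bytecode;
}
```

A rejected blob behaves like a cache miss, so the caller can fall back to compiling from source.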
After off-thread script or module compilation hands top-level bytecode
back to the main thread, clone the remaining lazy function payloads and
compile them on the thread pool.
Install completed bytecode only when the function is still lazy. If the
main thread compiled it first, discard the stale result and schedule
another pass over that executable so nested lazy payloads still move to
bytecode.
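The install rule can be sketched roughly as follows. LazyFunction and install_if_still_lazy() are hypothetical stand-ins, and bytecode is modeled as a plain string:

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for a function whose body may not be compiled yet.
struct LazyFunction {
    std::optional<std::string> bytecode; // empty while the function is lazy

    bool is_lazy() const { return !bytecode.has_value(); }
};

enum class InstallResult {
    Installed,
    DiscardedStale,
};

// Runs on the main thread when a worker finishes compiling a function.
inline InstallResult install_if_still_lazy(LazyFunction& function, std::string compiled)
{
    if (!function.is_lazy()) {
        // The main thread compiled this function first. Drop the worker
        // result; the caller would schedule another pass over the
        // executable so nested lazy payloads still get compiled.
        return InstallResult::DiscardedStale;
    }
    function.bytecode = std::move(compiled);
    return InstallResult::Installed;
}
```

Checking and installing on the main thread means no lock is needed around the lazy check in this model.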
Split Rust program compilation so code generation and assembly finish
before the main thread materializes GC-backed executable objects. The
new CompiledProgram handle owns the parsed program, generator state, and
bytecode until C++ consumes it on the main thread.
Wire WebContent script fetching through that handle for classic scripts
and modules. Syntax-error paths still return ParsedProgram, so existing
error reporting stays in place. Successful fetches now do top-level
codegen on the thread pool before deferred_invoke hands control back to
the main thread.
Executable creation, SharedFunctionInstanceData materialization, module
metadata extraction, and declaration data extraction still run on the
main thread where VM and GC access is valid.
Avoid expensive cross-hierarchy dynamic_cast from JS::Object to
UniversalGlobalScopeMixin on every microtask checkpoint.
Since UniversalGlobalScopeMixin is not in the JS::Object
inheritance chain, as<UniversalGlobalScopeMixin>(JS::Object&)
falls through to dynamic_cast, which is very costly. Profiling
showed this taking ~14% of total CPU time.
Add EnvironmentSettingsObject::universal_global_scope() backed
by a pointer cached eagerly during initialization.
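A minimal model of the change, with hypothetical simplified classes. The cross-cast from Object to the mixin needs dynamic_cast, while the cached pointer is computed once from the concrete type:

```cpp
#include <cassert>

struct Object {
    virtual ~Object() = default;
};

// Not part of the Object inheritance chain.
struct UniversalGlobalScopeMixin {
    virtual ~UniversalGlobalScopeMixin() = default;
};

// The concrete global object inherits from both.
struct Window : Object, UniversalGlobalScopeMixin {
};

class EnvironmentSettings {
public:
    explicit EnvironmentSettings(Window& global)
        : m_global(global)
        , m_universal_global_scope(&global) // static upcast, done once
    {
    }

    // Old path: a cross-hierarchy cast on every call.
    UniversalGlobalScopeMixin& universal_global_scope_slow()
    {
        return dynamic_cast<UniversalGlobalScopeMixin&>(m_global);
    }

    // New path: return the pointer cached during initialization.
    UniversalGlobalScopeMixin& universal_global_scope()
    {
        return *m_universal_global_scope;
    }

private:
    Object& m_global;
    UniversalGlobalScopeMixin* m_universal_global_scope;
};
```

Both accessors yield the same subobject; only the cost per call differs.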
Replace the BytecodeFactory header with cbindgen.
This will help ensure that types, enums, and constants are kept in
sync between the C++ and Rust code. It's also a step toward exporting
more Rust enums directly rather than relying on magic constants for
switch statements.
The FFI functions are now all placed in the JS::FFI namespace, which
is the cause of all the churn in the scripting parts of LibJS and
LibWeb.
When parse_off_thread() completes, the result callback runs inside a
deferred_invoke, which executes outside the HTML event loop's task
model. This meant that any microtasks queued by the callback (e.g.
promise reactions from react_to_promise during module linking) were
never drained, since HTML::EventLoop::process() only performs
microtask checkpoints after executing an HTML task.
Fix this by performing an explicit microtask checkpoint after the
parse result callback. This ensures that promise reactions queued
during module linking are drained immediately.
This fixes module worker scripts timing out because their loading
promise chains would stall indefinitely.
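A toy model of the fix, assuming a hypothetical MiniEventLoop with a plain microtask queue. It is not the HTML::EventLoop API; it only shows the checkpoint running right after the deferred callback:

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

class MiniEventLoop {
public:
    void queue_microtask(std::function<void()> task)
    {
        m_microtasks.push_back(std::move(task));
    }

    // Drain the queue, including microtasks queued while draining.
    void perform_a_microtask_checkpoint()
    {
        while (!m_microtasks.empty()) {
            auto task = std::move(m_microtasks.front());
            m_microtasks.pop_front();
            task();
        }
    }

    // Models the deferred parse-result callback: it runs outside any task,
    // so the checkpoint has to be performed explicitly afterwards.
    void run_deferred_callback(std::function<void()> const& callback)
    {
        callback();
        perform_a_microtask_checkpoint();
    }

    std::size_t pending_microtasks() const { return m_microtasks.size(); }

private:
    std::deque<std::function<void()>> m_microtasks;
};
```

Without the checkpoint call in run_deferred_callback(), any microtask queued by the callback would sit in the queue until some unrelated task happened to run.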
Use the parse_off_thread() helper to submit
parse_program(ProgramType::Module) to the ThreadPool for parsing
on a worker thread. Bounce back to the main thread to compile and
deliver the result via deferred_invoke.
Falls back to synchronous parsing when the Rust pipeline is
unavailable (LIBJS_CPP=1 or LIBJS_COMPARE_PIPELINES=1).
Create a SourceCode on the main thread (performing UTF-8 to UTF-16
conversion), then submit parse_program() to the ThreadPool for
Rust parsing on a worker thread. This unblocks the WebContent event
loop during external script loading.
Add Script::create_from_parsed() and
ClassicScript::create_from_pre_parsed() factory methods that take a
pre-parsed RustParsedProgram and a SourceCode, performing only the
GC-allocating compile step on the main thread.
Falls back to synchronous parsing when the Rust pipeline is
unavailable (LIBJS_CPP=1 or LIBJS_COMPARE_PIPELINES=1).
Issue #6294 describes an edge case where the browser crashes if the same
module is loaded three times in a document and all attempts fail.
Failure scenario:
1. Module load 1 sets the state to "Fetching".
2. Module load 2 registers a callback to `on_complete` since the
   current state is "Fetching".
3. Module load 1 finishes with a failure, invoking the callback for load
   number 2.
4. Module load 3 causes a crash. The state is neither "Fetching" nor
   "ModuleScript", so we'll reset the state to "Fetching". This invokes
   the callback for module load 2 again, now with an unexpected state,
   which will cause an assert violation.
The proposed fix is to remove the condition that invokes `on_complete`
immediately only for successfully loaded modules. The callback should
be invoked regardless of whether the fetch succeeded or failed.
This reveals a separate bug in HTMLScriptElement, where
`mark_as_ready()` can be invoked before
`m_steps_to_run_when_the_result_is_ready` is assigned.
This appears to be a spec bug, reported as
https://github.com/whatwg/html/issues/12073 and addressed by delaying
the callback by a task, similar to how the issue was resolved for inline
scripts.
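The module map behavior after the fix can be sketched roughly as below. ModuleMapEntry and EntryState are hypothetical simplifications, with the entry created in the "Fetching" state by the first load:

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

enum class EntryState {
    Fetching,
    ModuleScript,
    Failed,
};

// Hypothetical model of one module map entry.
class ModuleMapEntry {
public:
    using Callback = std::function<void(EntryState)>;

    // A later load either waits for the in-flight fetch or, once the fetch
    // has finished, is told the outcome right away, success or failure.
    void load(Callback on_complete)
    {
        if (m_state == EntryState::Fetching) {
            m_waiting.push_back(std::move(on_complete));
            return;
        }
        on_complete(m_state);
    }

    // The in-flight fetch finished; notify each waiter exactly once.
    void finish(bool succeeded)
    {
        assert(m_state == EntryState::Fetching);
        m_state = succeeded ? EntryState::ModuleScript : EntryState::Failed;
        auto waiting = std::move(m_waiting);
        m_waiting.clear();
        for (auto& callback : waiting)
            callback(m_state);
    }

private:
    EntryState m_state { EntryState::Fetching };
    std::vector<Callback> m_waiting;
};
```

In this model a third load after a failed fetch does not reset the state, so the callback registered by the second load is never invoked twice.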
The end goal here is for LibHTTP to be the home of our RFC 9111 (HTTP
caching) implementation. We currently have one implementation in LibWeb
for our in-memory cache and another in RequestServer for our disk cache.
The implementations both largely revolve around interacting with HTTP
headers. But in LibWeb, we are using Fetch's header infra, and in RS we
are using our home-grown header infra from LibHTTP.
So to give these a common denominator, this patch replaces the LibHTTP
implementation with Fetch's infra. Our existing LibHTTP implementation
was not particularly compliant with any spec, so this at least gives us
a standards-based common implementation.
This migration also required moving a handful of other Fetch AOs over
to LibHTTP. (It turns out these AOs were all from the Fetch/Infra/HTTP
folder, so perhaps it makes sense for LibHTTP to be the implementation
of that entire set of facilities.)
An upcoming commit will migrate the contents of Headers.h/cpp to LibHTTP
for use outside of LibWeb. These CORS and MIME helpers depend on other
LibWeb facilities, however, so they cannot be moved.
We have a couple of ways to designate spec notes and (our) developer
notes in comments, but we never really settled on a single approach. As
a result, we have a bit of a mixed bag of note comments on our hands.
To the extent that I could find them, I changed developer notes to
`// NB: ...` and changed spec notes to `// NOTE: ...`. The rationale for
this is that in most web specs, notes are prefixed by `NOTE: ...` so
this makes it easier to copy paste verbatim. The choice for `NB: ...` is
pretty arbitrary, but it makes it stand out from the regular spec notes
and it was already in wide use in our codebase.
Atlassian login gets the base URL for its module scripts by throwing an
error and pulling out the current script's URL from error.stack with
regex.
Since we only returned a basename for module scripts, the regex would
fail to match and the page would try to use `/` as a base URL (because
it does [matched_string] + "/"), which is not a valid base URL.
While debugging a spec-compliant implementation of ReadableStreamPipeTo,
I spent a lot of time inspecting promise internals. This is much less
noisy if we halve the number of temporary promises.
PrimitiveString is now internally either UTF-8, UTF-16, or both.
We no longer convert them to/from ByteString anywhere, nor does VM have
a ByteString cache.
This patch removes these 2 unused algorithms:
1. `fetch_internal_module_script_graph`
2. `fetch_descendants_of_a_module_script`
Both algorithms were removed from the spec and are not used in our
codebase.