The exception handler table is sorted by start_offset, so use
binary_search instead of a linear scan. This matches the pattern
already used by source_range_at() in the same file.
After replacing the runtime unwind context stack with explicit
completion records for try/finally dispatch, the distinction between
"handler" (catch) and "finalizer" (finally) in the exception handler
table is no longer meaningful at runtime.
handle_exception() checked handler first, then finalizer, but they
did the exact same thing (set the PC). When both were present, the
finalizer was dead code.
Collapse both fields into a single handler_offset (now non-optional,
since an entry always has a target), remove the finalizer concept
from BasicBlock, UnwindContext, and ExceptionHandlers, and simplify
handle_exception() to a direct assignment.
Bytecode source map entries are always added in order of increasing
bytecode offset, and lookups only happen during error handling (a cold
path). This makes a sorted vector with binary search a better fit than
a hash map.
This change reduces memory overhead and speeds up bytecode generation
by avoiding hash table operations during compilation. Lookups remain
fast via binary search, and since source_range_at() is only called
when generating stack traces, the O(log n) lookup is acceptable.
Every function call allocates an ExecutionContext with a trailing array
of Values for registers, locals, constants, and arguments. Previously,
the constructor would initialize all slots to js_special_empty_value(),
but constant slots were then immediately overwritten by the interpreter
copying in values from the Executable before execution began.
To eliminate this redundant initialization, we rearrange the layout from
[registers | constants | locals] to [registers | locals | constants].
This groups registers and locals together at the front, allowing us to
initialize only those slots while leaving constant slots uninitialized
until they're populated with their actual values.
This reduces the per-call initialization cost from O(registers + locals
+ constants) to O(registers + locals).
Also tightens up the types involved (size_t -> u32) and adds VERIFYs to
guard against overflow when computing the combined slot counts, and to
ensure the total fits within the 29-bit operand index field.
When a function creates object literals with simple property names,
we now cache the resulting shape after the first instantiation. On
subsequent calls, we create the object with the cached shape directly
and write property values at their known offsets.
This avoids repeated shape transitions and property offset lookups
for a common JavaScript pattern.
The optimization uses two new bytecode instructions:
- CacheObjectShape: Captures the final shape after object construction
- InitObjectLiteralProperty: Writes properties using cached offsets
Only "simple" object literals are optimized (string literal keys with
simple value expressions). Complex cases like computed properties,
getters/setters, and spread elements use the existing slow path.
3.4x speedup on a microbenchmark that repeatedly instantiates an object
literal with 26 properties. Small progressions on various benchmarks.
This adds visit_edges(Cell::Visitor&) methods to various helper structs
that contain GC pointers, and makes sure they are called from owning
GC-heap-allocated objects as needed.
These were found by our Clang plugin after expanding its capabilities.
The added rules will be enforced by CI going forward.
This resolves a FIXME in its code generation, particularly for:
- Caching the template object
- Setting the correct property attributes
- Freezing the resulting objects
This allows archive.org to load, which uses the Lit library.
The Lit library caches these template objects to determine if a
template has changed, allowing it to determine to do a full template
rerender or only partially update the rendering. Before, we would
always cause a full rerender on update because we didn't return the
same template object.
This caused issues with archive.org's code, I believe particularly with
its router library, where we would constantly detach and reattach nodes
unexpectedly, ending up with the page content not being attached to the
router's custom element.
Instead of creating PropertyKeys on the fly during interpreter
execution, we now store fully-formed ones in the Executable.
This avoids a whole bunch of busywork in property access instructions
and substantially reduces code size bloat.
While we're in the bytecode compiler, we want to know which type of
Operand we're dealing with, but once we've generated the bytecode
stream, we only ever need its index.
This patch simplifies Operand by removing the aarch64 bitfield hacks
and makes it 32-bit on all platforms. We keep 3 type bits in the high
bits of the index while compiling, and then zero them out when
flattening the final bytecode stream.
This makes bytecode more compact on x86_64, and avoids bit twiddling
on aarch64. Everyone wins something!
When stringifying bytecode for debugging output, we now have an API in
Executable that can look at a raw operand index and tell you what type
of operand it was, based on known quantities of each type in the stack
frame.
This commits puts the strict mode flag in the header of every bytecode
instruction. This allows us to check for strict mode without looking at
the currently running execution context.
Resulting in a massive rename across almost everywhere! Alongside the
namespace change, we now have the following names:
* JS::NonnullGCPtr -> GC::Ref
* JS::GCPtr -> GC::Ptr
* JS::HeapFunction -> GC::Function
* JS::CellImpl -> GC::Cell
* JS::Handle -> GC::Root