If neither block is a Shared Data Block we can use memcpy rather than
copying one byte at a time. Currently, we are using `memcpy`
unconditionally, as Shared Data Blocks are not yet supported in this
method.
With this change, `GetIterator` no longer GC-allocates an
`IteratorRecord`. Instead, it stores the iterator record fields in
bytecode registers. This avoids per-iteration allocations in patterns
like: `for (let [x] of array) {}`.
`IteratorRecord` now inherits from `IteratorRecordImpl`, which holds the
iteration state. This allows the existing iteration helpers
(`iterator_next()`, `iterator_step()`, etc.) operate on both the
GC-allocated and the register-backed forms.
Microbenchmarks:
1.1x array-destructuring-assignment-rest.js
1.226x array-destructuring-assignment.js
Instead of using this span, we can just use the getter that calculates
the base of the register/constant/local/argument array based on the
ExecutionContext's own address.
We don't need to return two values; running an executable only ever
produces a throw completion, or a normal completion, i.e a Value.
This necessitated a few minor changes, such as adding a way to check
if a JS::Cell is a GeneratorResult.
This simplifies function entry/exit and lets us just walk away from the
used ExecutionContext instead of resetting a bunch of its state when
returning control to the caller.
Instead of having ExecutionContext track function names separately,
we give FunctionObject a virtual function that returns an appropriate
name string for use in call stacks.
This commits puts the strict mode flag in the header of every bytecode
instruction. This allows us to check for strict mode without looking at
the currently running execution context.
This shrinks every Statement and ECMAScriptFunctionObject by one
pointer, and puts the bytecode cache in the only place that actually
makes use of it anyway: functions.
When an object becomes too big (currently 64 properties or more), we
change its shape to a dictionary and don't do any further transitions.
However, this means the Shape of the object no longer changes, so the
cache invalidation check of `current_shape != cache.shape` is no longer
a valid check.
This fixes that by keeping track of a generation number for the Shape
both on the Shape object and in the cache, allowing that to be checked
instead of the Shape identity. The generation is incremented whenever
the dictionary is mutated.
Fixes stale cache lookups on Gmail preventing emails from being
displayed.
I was not able to produce a reproduction for this, plus the generation
count was over the 20k mark on Gmail.
Previously, named capture groups in RegExp results did not always follow
their source order, and unmatched groups were omitted. According to the
spec, all named capture groups must appear in the result object in the
order they are defined, even if they did not participate in the match.
This commit makes sure we follow this requirement.
- Add FLATTEN (same as we do for internal_call()).
- Demote nice-to-have VERIFYs to ASSERTs.
- Pass already-known Realm to ordinary_create_from_constructor
1.03x speedup on Octane/earley-boyer.js
We can use caching in a million more places. This is just me running JS
benchmarks and looking at which get() call sites were hot and putting
caches there.
Lots of nice speedups all over the place, some examples:
1.19x speedup on Octane/raytrace.js
1.13x speedup on Octane/earley-boyer.js
1.12x speedup on Kraken/ai-astar.js
1.10x speedup on Octane/box2d.js
1.08x speedup on Octane/gbemu.js
1.05x speedup on Octane/regexp.js
To speed up property access, callers of get() can now provide a lookup
cache like so:
static Bytecode::PropertyLookupCache cache;
auto value = TRY(object.get(property, cache));
Note that the cache has to be `static` or it won't make sense!
This basically brings the inline caches from our bytecode VM straight
into C++ land, allowing us to gain serious performance improvements.
The implementation shares code with the GetById bytecode instruction.
This makes the instanceof operator signficantly faster by avoiding a
generic function call to @@hasInstance unless it has been overridden.
1.15x speed-up on Octane/earley-boyer.js
Linux, x86_64, Sanitizer, GNU runners on GitHub Action fail randomly
with a stack overflow on recursive test called:
Libraries/LibJS/Tests/runtime-error-call-stack-size.js
Instead of converting them to doubles and doing double math, just do the
arithmetic operation in i64 space instead.
This gives us a ~1.25x speed-up on this kind of micro-benchmark:
(() => {
let a = -2124299999;
for (let i = 0; i < 100_000_000; ++i) {
a + a;
}
})()
Same idea for Add, Sub, and Mul.
There's a fair bit of overflowing Int32 arithmetic in some of the
JetStream benchmarks, and this seems like an obvious improvement.
The [[Set]] operation on objects will normally end up doing
[[GetOwnProperty]] twice, but if the object and the receiver are one on
the same, we can avoid the double work.
AFAIK this is not observable via proxies or other mechanisms.
1.10x speed-up on MicroBench/object-set-with-rope-strings.js
We are often forced to convert numbers to strings inside LibJS, e.g when
iterating over the property names of an array, but it's also just a very
common operation in general.
This patch adds a 1000-entry string cache for the numbers 0-999 since
those appear to be by far the most common ones we convert.
This is a normative change in the ECMA-262 spec. See:
541b2f6
Note we already handled this case well, but let's update our impl to
match the latest spec text.