We can use caching in a million more places. This is just me running JS
benchmarks and looking at which get() call sites were hot and putting
caches there.
Lots of nice speedups all over the place, some examples:
1.19x speedup on Octane/raytrace.js
1.13x speedup on Octane/earley-boyer.js
1.12x speedup on Kraken/ai-astar.js
1.10x speedup on Octane/box2d.js
1.08x speedup on Octane/gbemu.js
1.05x speedup on Octane/regexp.js
To speed up property access, callers of get() can now provide a lookup
cache like so:
```cpp
static Bytecode::PropertyLookupCache cache;
auto value = TRY(object.get(property, cache));
```
Note that the cache has to be `static` or it won't make sense!
This basically brings the inline caches from our bytecode VM straight
into C++ land, allowing us to gain serious performance improvements.
The implementation shares code with the GetById bytecode instruction.
This makes the instanceof operator significantly faster by avoiding a
generic function call to @@hasInstance unless it has been overridden.
1.15x speed-up on Octane/earley-boyer.js
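As a rough model of the instanceof fast path (invented names, not the
actual LibJS code): the generic call is only needed when @@hasInstance
has been overridden; the default behavior is just a prototype-chain
walk.

```cpp
#include <functional>

struct Object {
    Object const* prototype { nullptr }; // the object's [[Prototype]]
};

struct FunctionObject : Object {
    Object const* prototype_property { nullptr }; // the function's .prototype
    // Non-null only if the user overrode Symbol.hasInstance.
    std::function<bool(Object const&)> custom_has_instance;
};

bool instance_of(Object const& value, FunctionObject const& constructor)
{
    // Generic path: a function call, needed only for an overridden
    // @@hasInstance.
    if (constructor.custom_has_instance)
        return constructor.custom_has_instance(value);

    // Fast path: the default @@hasInstance is OrdinaryHasInstance,
    // i.e. a plain walk up the value's prototype chain.
    for (auto const* proto = value.prototype; proto; proto = proto->prototype) {
        if (proto == constructor.prototype_property)
            return true;
    }
    return false;
}
```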
The Linux, x86_64, Sanitizer, GNU runners on GitHub Actions randomly
fail with a stack overflow in the recursive test:
Libraries/LibJS/Tests/runtime-error-call-stack-size.js
Instead of converting the Int32 operands to doubles and doing double
math, just do the arithmetic operation in i64 space.
This gives us a ~1.25x speed-up on this kind of micro-benchmark:
```js
(() => {
    let a = -2124299999;
    for (let i = 0; i < 100_000_000; ++i) {
        a + a;
    }
})()
```
Same idea for Add, Sub, and Mul.
There's a fair bit of overflowing Int32 arithmetic in some of the
JetStream benchmarks, and this seems like an obvious improvement.
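As a simplified sketch of the approach (stand-in types, not the actual
LibJS code): the sum of two Int32 values always fits in an i64, so the
exact result can be computed first and only stored as a double when it
leaves the Int32 range.

```cpp
#include <cstdint>

// Tiny stand-in for a JS number: either an exact Int32 or a double.
struct JSNumber {
    bool is_int32 { false };
    int32_t i32 { 0 };
    double d { 0 };
};

JSNumber add_int32(int32_t lhs, int32_t rhs)
{
    // An i64 can represent any sum (or difference, or product) of two
    // Int32 values, so this is exact.
    int64_t result = static_cast<int64_t>(lhs) + static_cast<int64_t>(rhs);
    JSNumber number;
    if (result >= INT32_MIN && result <= INT32_MAX) {
        number.is_int32 = true;
        number.i32 = static_cast<int32_t>(result);
    } else {
        // Overflowed Int32: the i64 result still converts losslessly to
        // double here (|result| < 2^32); the win is skipping double math
        // on the hot path.
        number.d = static_cast<double>(result);
    }
    return number;
}
```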
The [[Set]] operation on objects will normally end up doing
[[GetOwnProperty]] twice, but if the object and the receiver are one and
the same, we can avoid the double work.
AFAIK this is not observable via proxies or other mechanisms.
1.10x speed-up on MicroBench/object-set-with-rope-strings.js
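A self-contained sketch of the idea (simplified names, not LibJS's
actual code): the generic set path looks up the own descriptor on the
object and then again on the receiver, so when they are the same
object, the first result can be reused.

```cpp
#include <optional>
#include <string>
#include <unordered_map>

struct Descriptor {
    double value { 0 };
    bool writable { true };
};

struct Object {
    std::unordered_map<std::string, Descriptor> properties;

    std::optional<Descriptor> get_own_property(std::string const& key) const
    {
        auto it = properties.find(key);
        if (it == properties.end())
            return {};
        return it->second;
    }

    bool set(std::string const& key, double value, Object& receiver)
    {
        auto own = get_own_property(key);
        // Fast path from this change: when the object and the receiver
        // are one and the same, the descriptor we already fetched is the
        // receiver's too, so the second lookup is skipped.
        auto existing = (this == &receiver) ? own : receiver.get_own_property(key);
        if (existing.has_value() && !existing->writable)
            return false;
        receiver.properties[key] = Descriptor { value, true };
        return true;
    }
};
```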
We are often forced to convert numbers to strings inside LibJS, e.g. when
iterating over the property names of an array, but it's also just a very
common operation in general.
This patch adds a 1000-entry string cache for the numbers 0-999 since
those appear to be by far the most common ones we convert.
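A minimal sketch of such a cache (illustrative names, not the actual
LibJS code):

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>

// Build the string forms of 0..999 once; converting the most common
// small numbers then costs only an array lookup.
static std::string const* small_int_to_string(int64_t value)
{
    static std::array<std::string, 1000> const cache = [] {
        std::array<std::string, 1000> table;
        for (int i = 0; i < 1000; ++i)
            table[i] = std::to_string(i);
        return table;
    }();
    if (value < 0 || value >= 1000)
        return nullptr; // caller falls back to a normal conversion
    return &cache[static_cast<std::size_t>(value)];
}
```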
This is a normative change in the ECMA-262 spec. See:
541b2f6
Note we already handled this case well, but let's update our impl to
match the latest spec text.
The optimization for non-shared ArrayBuffers operates on incorrect
values of from_byte_index and to_byte_index because they will have been
modified in the preceding steps. This causes the incorrect range to be
copied within the buffer.
Before this change, PropertyNameIterator (used by for..in) and
`Object::enumerable_own_property_names()` (used by `Object.keys()`,
`Object.values()`, and `Object.entries()`) enumerated an object's own
enumerable properties exactly as the spec prescribes:
- Call `internal_own_property_keys()`, allocating a list of `JS::Value`
keys.
- For each key, call `internal_get_own_property()` to obtain a
descriptor and check `[[Enumerable]]`.
While that is required in the general case (e.g. for Proxy objects or
platform/exotic objects that override `[[OwnPropertyKeys]]`), it's
overkill for ordinary JS objects that store their own properties in the
shape table and indexed-properties storage.
This change introduces `for_each_own_property_with_enumerability()`,
which, for objects where
`eligible_for_own_property_enumeration_fast_path()` is `true`, lets us
read the enumerability directly from shape metadata (and from
indexed-properties storage) without a per-property descriptor lookup.
When we cannot avoid `internal_get_own_property()`, we still
benefit by skipping the temporary `Vector<Value>` of keys and avoiding
the unnecessary round-trip between PropertyKey and Value.
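A self-contained model of the fast path (invented names; the real
signature lives in LibJS): enumerability is read straight from
shape-style metadata, with no per-property descriptor and no temporary
list of Value-wrapped keys.

```cpp
#include <functional>
#include <string>
#include <vector>

struct ShapeEntry {
    std::string key;
    bool enumerable { true };
};

struct OrdinaryObject {
    std::vector<ShapeEntry> shape; // stand-in for shape metadata

    // Visits (key, enumerable) pairs straight from shape metadata,
    // without building a property descriptor per property.
    void for_each_own_property_with_enumerability(
        std::function<void(std::string const&, bool)> const& visit) const
    {
        for (auto const& entry : shape)
            visit(entry.key, entry.enumerable);
    }
};

// Usage resembling Object.keys(): collect only enumerable own keys,
// with no intermediate vector of Value-wrapped keys.
std::vector<std::string> keys_of(OrdinaryObject const& object)
{
    std::vector<std::string> keys;
    object.for_each_own_property_with_enumerability(
        [&](std::string const& key, bool enumerable) {
            if (enumerable)
                keys.push_back(key);
        });
    return keys;
}
```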
We already had IC support in PutById for the following cases:
- Changing an existing own property
- Calling a setter located in the prototype chain
This was enough to speed up code where structurally identical objects
(same shape) are processed in a loop:
```js
const arr = [{ a: 1 }, { a: 2 }, { a: 3 }];
for (let obj of arr) {
obj.a += 1;
}
```
However, creating structurally identical objects in a loop was still
slow:
```js
for (let i = 0; i < 10_000_000; i++) {
const o = {};
o.a = 1;
o.b = 2;
o.c = 3;
}
```
This change addresses that by adding a new IC type that caches both the
source and target shapes, allowing property additions to be fast-pathed
by directly jumping to the shape that already includes the new property.
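A self-contained sketch of what such a cache amounts to (invented
names, not the actual LibJS IC):

```cpp
#include <cstddef>
#include <vector>

struct Shape {}; // stand-in: each distinct property layout is one Shape

struct Object {
    Shape const* shape { nullptr };
    std::vector<double> storage; // property slots (doubles for simplicity)
};

// The new IC caches both shapes: the one seen before the addition and
// the one that already includes the new property.
struct AddPropertyCache {
    Shape const* source_shape { nullptr };
    Shape const* target_shape { nullptr };
    std::size_t storage_offset { 0 }; // slot the new property lands in
};

// On a hit, jump straight to the target shape and store the value,
// skipping the generic shape-transition lookup.
bool try_cached_add(Object& object, AddPropertyCache const& cache, double value)
{
    if (object.shape != cache.source_shape)
        return false; // miss: take the slow path instead
    object.shape = cache.target_shape;
    if (object.storage.size() <= cache.storage_offset)
        object.storage.resize(cache.storage_offset + 1);
    object.storage[cache.storage_offset] = value;
    return true;
}
```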
Adds an inline implementation for the most common case: when the
`Value` is already an object.
1.47x improvement on the following benchmark:
```js
const o = {};
for (let i = 0; i < 10_000_000; i++) {
o.a = 1;
o.b = 2;
o.c = 3;
}
```
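A rough sketch of the pattern (invented names): the "already an object"
check is answered inline, and only primitives take the out-of-line
conversion call.

```cpp
struct Object {};

struct Value {
    Object* object { nullptr }; // non-null means "already an object"
    bool is_object() const { return object != nullptr; }
    Object& as_object() const { return *object; }
};

Object& to_object_slow(Value)
{
    // Stand-in: the real slow path wraps primitives in objects.
    static Object wrapper;
    return wrapper;
}

// Inline fast path: no out-of-line call when the Value is an object.
inline Object& to_object(Value value)
{
    if (value.is_object())
        return value.as_object();
    return to_object_slow(value);
}
```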
This first pass only applies to the following two cases:
- Public functions returning a view type into an object they own
- Public ctors storing a view type
This catches a grand total of one (1) issue, which is fixed in
the previous commit.
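For illustration, hypothetical examples of the two shapes the pass
looks for (not the actual issue that was found):

```cpp
#include <string>
#include <string_view>

class Example {
public:
    // Case 1: a public function returning a view into storage the
    // object owns; the view dangles if it outlives the Example.
    std::string_view name() const { return m_name; }

    // Case 2: a public constructor storing a view type; the viewed
    // buffer must outlive the Example for m_view to stay valid.
    explicit Example(std::string_view view)
        : m_view(view)
    {
    }

private:
    std::string m_name;
    std::string_view m_view;
};
```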
After LibJS's symbol exports were optimized, the js, test-js,
test262-runner, test-wasm, and LibWeb targets began to hit linker
errors following the work to add Windows support for the test-web and
ladybird targets. These extra JS_API annotations fix all of those
linker errors.
The relevant kind of ArrayBuffer DataBlock is now a struct containing
both a ByteBuffer* and a size_t size, rather than just a ByteBuffer*,
with the size being that of the ByteBuffer. This kind of DataBlock is only
used for WebAssembly.Memory (see commit 4fd43a8f96), meaning this
change won't affect any other code. This change is required to pass one
WPT subtest in wasm/jsapi/memory/grow.any.html, since old fixed-length
SharedArrayBuffers after a WebAssembly.Memory growth should keep their
length, while the new buffer after the growth will have the updated
length.
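A hedged sketch of the shape of that change (illustrative, not the
actual declaration):

```cpp
#include <cstddef>

class ByteBuffer; // underlying storage, shared with the WebAssembly.Memory

// Before, this kind of DataBlock was just a ByteBuffer*. Carrying its
// own size lets an old fixed-length SharedArrayBuffer keep reporting
// its pre-growth length even after the shared ByteBuffer has grown.
struct WasmMemoryDataBlock {
    ByteBuffer* buffer { nullptr };
    std::size_t size { 0 };
};
```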
Before porting to UTF-16, these instances held a String. The port to
UTF-16 changed them to hold the original string as a StringView, and
lazily allocated the UTF-16 message as needed. This somehow negatively
impacted the zlib.js benchmark in the Octane suite.