This adds the ability to compile and run JavaScript to implement
native functions. This is particularly useful for native functions that
are not ordinary functions, for example async functions such as
Array.fromAsync, which need to yield.
These functions are not allowed to observe anything from outside their
own environment. Any global identifier is instead assumed to be a
reference to an abstract operation or a constant. The generator injects
the appropriate bytecode when the identifier matches a known name;
anything else causes a code generation error.
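As a purely hypothetical sketch (not the actual LibJS source), such a
builtin could be written as ordinary JavaScript, subject to the
closed-world rule above:
```js
// Hypothetical sketch only. This function may not observe anything
// outside its own environment: any free global identifier must match a
// known abstract operation or constant name, or bytecode generation
// fails.
async function fromAsyncLike(items) {
    const result = [];
    for await (const item of items) {
        // `for await` needs to yield, which is exactly why implementing
        // this kind of function directly in C++ is awkward.
        result.push(item);
    }
    return result;
}
```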
This commit adds a new Bytecode.def file that describes all the LibJS
bytecode instructions.
From this, we are able to generate the full declarations for all C++
bytecode instruction classes, as well as their serialization code.
Note that parts of the bytecode compiler were updated, since
instructions no longer have default constructor arguments.
The big immediate benefit here is that we lose a couple thousand lines
of hand-written C++ code. Going forward, this also allows us to do more
tooling for the bytecode VM, now that we have an authoritative
description of its instructions.
Key things to know:
- Instructions can inherit from one another. At the moment, everything
simply inherits from the base "Instruction".
- @terminator means the instruction terminates a basic block.
- @nothrow means the instruction cannot throw. This affects how the
interpreter interacts with it.
- Variable-length instructions are automatically supported. Just put an
array of something as the last field of the instruction.
- The m_length field is magical. If present, it will be populated with
the full length of the instruction. This is used for variable-length
instructions.
With this change, `GetIterator` no longer GC-allocates an
`IteratorRecord`. Instead, it stores the iterator record fields in
bytecode registers. This avoids per-iteration allocations in patterns
like: `for (let [x] of array) {}`.
`IteratorRecord` now inherits from `IteratorRecordImpl`, which holds the
iteration state. This allows the existing iteration helpers
(`iterator_next()`, `iterator_step()`, etc.) to operate on both the
GC-allocated and the register-backed forms.
Microbenchmarks:
1.1x array-destructuring-assignment-rest.js
1.226x array-destructuring-assignment.js
This is only used to specify how a property is being added to an object
by Put* instructions, so let's call it PutKind.
Also add an enumeration X macro for it to prepare for upcoming
specializations.
Previously, PutById constructed a PropertyKey from the identifier,
which coerced numeric-like strings to numbers. This moves that decision
to bytecode generation: the bytecode generator now emits PutByNumericId
for numeric keys and PutById for string keys. This removes per-execution
parsing from the interpreter.
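Roughly, a "numeric-like" key is one that canonically names an array
index; in JavaScript these spellings refer to the same property:
```js
const o = {};
o["2"] = "a";
console.log(o[2] === o["2"]); // true: "2" and 2 name the same property
```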
1.4x speedup on the following microbenchmark:
```js
const o = {};
for (let i = 0; i < 10_000_000; i++) {
    o.a = 1;
    o.b = 2;
    o.c = 3;
}
```
Previously, the given test would create an object whose `test` property
pointed to itself.
This happened because `temp = temp.test || {}` overwrote the `temp`
local register, so `temp.test = temp` then stored into the newly created
object instead of the original one it had fetched.
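One plausible shape of the pattern (assumed here for illustration; the
actual test may differ) is a chained assignment whose right-hand side
reassigns the variable used as the base:
```js
let temp = {};
const original = temp;

// The base of `temp.test` must be captured before the right-hand side
// runs, so the write has to land on `original`, not on the new object
// that `temp` gets reassigned to.
temp.test = temp = temp.test || {};

console.log(original.test === temp); // true: original points at the new object
console.log(temp.test);              // undefined: no self-reference
```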
This allows https://www.yorkshiretea.co.uk/ to load; it was previously
failing during Gsap library initialization.
If a class doesn't have any private fields, we can avoid allocating a
PrivateEnvironment for it.
This allows us to skip thousands of unnecessary PrivateEnvironment
allocations on Discord.
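For example, only the second of these two classes still needs a
PrivateEnvironment:
```js
// No private names anywhere: no PrivateEnvironment allocation needed.
class Point {
    constructor(x, y) {
        this.x = x;
        this.y = y;
    }
}

// Declares a private field, so a PrivateEnvironment is still allocated.
class Counter {
    #count = 0;
    increment() {
        return ++this.#count;
    }
}
```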
The definitions of this function in `SwitchStatement` and
`LabelledStatement` don't actually override anything. We can also make
`IterationStatement` abstract; there is no need for a fallback
error-generating stub implementation of this method.
`var` bindings are never in the temporal dead zone (TDZ), and so we
know accessing them will not throw.
We now take advantage of this by having a specialized environment
binding value getter that doesn't check for exceptional cases.
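The difference is observable in the language itself:
```js
{
    // A `let` binding is in the TDZ until its declaration executes,
    // so this read must be checked and throws a ReferenceError.
    try {
        blocked;
    } catch (e) {
        console.log(e instanceof ReferenceError); // true
    }
    let blocked = 1;
}

{
    // A `var` binding is hoisted and initialized to undefined up front,
    // so reading it early can never throw; no TDZ check is needed.
    console.log(hoisted); // undefined
    var hoisted = 1;
}
```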
1.08x speedup on JetStream.
...by avoiding allocation of the `{ value, done }` iterator result
object. This change applies the same optimization that 81b6a11 added for
`for..in` and `for..of`.
Makes the following microbenchmark go 22% faster on my computer:
```js
function f() {
    const arr = [];
    for (let i = 0; i < 10_000_000; i++) {
        arr.push([i]);
    }
    let sum = 0;
    for (let [i] of arr) {
        sum += i;
    }
}
f();
```
Introduce a special instruction for `for..of` and `for..in` loops that
skips the `{ value, done }` result object allocation when the iterator
is a builtin (array, map, set, string). This significantly reduces GC
pressure and avoids extracting the `value` and `done` properties.
This change makes the following microbenchmark 48% faster on my
computer:
```js
const arr = new Array(10_000_000);
let counter = 0;
for (let _ of arr) {
    counter++;
}
```
This allows us to get rid of the instructions that move arguments into
locals, and to allocate a smaller JS::Value vector in ExecutionContext
by reusing the slots that were already allocated for arguments.
With this change, for the following function:
```js
function f(x, y) {
    return x + y;
}
```
we now produce the following bytecode:
```
[ 0] 0: Add dst:reg6, lhs:arg0, rhs:arg1
[ 10] Return value:reg6
```
instead of:
```
[ 0] 0: GetArgument 0, dst:x~1
[ 10] GetArgument 1, dst:y~0
[ 20] Add dst:reg6, lhs:x~1, rhs:y~0
[ 30] Return value:reg6
```
By doing that, we consistently use the Identifier node for identifiers
and also enable the mechanism that registers identifiers in the
corresponding ScopePusher for catch parameters, which is necessary for
upcoming changes.
The special empty value (which we use for array holes, for empty
Optional<Value>, and for a few other placeholder/sentinel tasks) still
exists, but you now create one via JS::js_special_empty_value() and
check for it with Value::is_special_empty_value().
The main idea here is to make it very unlikely to accidentally create an
unexpected special empty value.
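For reference, the most visible of those placeholder uses is array
holes:
```js
// Index 1 of this array is a hole: there is no property there at all,
// which the engine represents internally with the special empty value.
const holey = [1, , 3];
console.log(1 in holey);   // false
console.log(holey.length); // 3
```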
Basically convert o["foo"]=x into o.foo=x when emitting bytecode.
These are effectively the same thing, and the latter format opts
into using an inline cache for the property lookups.
Basically convert o["foo"] into o.foo when emitting bytecode. These are
effectively the same thing, and the latter format opts into using an
inline cache for the property lookups.
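They really are the same operation as far as the language is concerned:
```js
const o = { foo: 1 };
console.log(o["foo"] === o.foo); // true: identical property lookup
```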
Instead of returning internal generator results as ordinary JS::Objects
with properties, we now use GeneratorResult and CompletionCell, which
both inherit from Cell directly and allow efficient access to state.
1.59x speedup on JetStream3/lazy-collections.js :^)
This avoids emitting TDZ checks for multiple bindings declared and
referenced within one variable declaration, e.g.:
var foo = 0, bar = foo;
In the above case, we'd emit an unnecessary TDZ check for the second
reference to `foo`.