Functions created via new Function() cannot assume that unresolved
identifiers refer to global variables, since they may be called in
an arbitrary scope. Pass a flag through the scope collector analysis
to suppress the global identifier optimization in this case.
When the parser speculatively tries to parse an arrow function
expression, encountering `this` inside a default parameter value
like `(a = this)` propagates uses_this flags to ancestor function
scopes via set_uses_this(). If the arrow attempt failed (no `=>`
followed), these flags were left behind, incorrectly marking
ancestor scopes as using `this`.
Fix this by saving and restoring the uses_this and
uses_this_from_environment flags on all ancestor function scopes
around speculative arrow function parsing.
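As a rough illustration of the save/restore idea, here is a minimal
sketch. The scope shape ({ parent, usesThis, usesThisFromEnvironment })
and the helper names are hypothetical, not the parser's real types, and
it assumes the saved flags are put back only when the attempt fails.

```javascript
// Hypothetical scope shape: { parent, usesThis, usesThisFromEnvironment }.
function trySpeculativeParse(scope, attempt) {
    // Snapshot both flags on the current scope and every ancestor.
    const saved = [];
    for (let s = scope; s !== null; s = s.parent) {
        saved.push({
            scope: s,
            usesThis: s.usesThis,
            usesThisFromEnvironment: s.usesThisFromEnvironment,
        });
    }

    const succeeded = attempt();
    if (!succeeded) {
        // The attempt failed, so undo any flags it left behind.
        for (const entry of saved) {
            entry.scope.usesThis = entry.usesThis;
            entry.scope.usesThisFromEnvironment = entry.usesThisFromEnvironment;
        }
    }
    return succeeded;
}

// Stand-in for set_uses_this(): marks the scope and all of its ancestors.
function markUsesThis(scope) {
    for (let s = scope; s !== null; s = s.parent)
        s.usesThis = true;
}

const outerScope = { parent: null, usesThis: false, usesThisFromEnvironment: false };
const innerScope = { parent: outerScope, usesThis: false, usesThisFromEnvironment: false };

const parsedAsArrow = trySpeculativeParse(innerScope, () => {
    markUsesThis(innerScope); // `this` seen while trying `(a = this)` as parameters
    return false;             // no `=>` followed, so the attempt fails
});
```

The sketch does not model the later, non-speculative parse of the same
tokens, which would set the flags again wherever `this` really is used.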
The scope collector uses HashMaps for identifier groups and variables,
which means their iteration order is non-deterministic. This causes
local variable indices and function declaration instantiation (FDI)
bytecode to vary between runs.
Fix this by sorting identifier group keys alphabetically before
assigning local variable indices, and sorting vars_to_initialize by
name before emitting FDI bytecode.
Also make register allocation deterministic by always picking the
lowest-numbered free register instead of whichever one happens to be
at the end of the free list.
This is preparation for bringing in a new source->bytecode pipeline
written in Rust. Checking for regressions is significantly easier
if we can expect identical output from both pipelines.
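The two ordering rules can be sketched as follows. This is a toy model
with made-up names (assignLocalIndices, RegisterAllocator), not the
actual generator code. Note that a JavaScript Map iterates in insertion
order, so the sort here stands in for removing any dependence on
container order; the default sort() compares UTF-16 code units.

```javascript
// Assign local indices in sorted-name order, so the result does not
// depend on the order in which names were added to the map.
function assignLocalIndices(identifierGroups) {
    const names = [...identifierGroups.keys()].sort();
    const indices = new Map();
    names.forEach((name, index) => indices.set(name, index));
    return indices;
}

// Always hand out the lowest-numbered free register.
class RegisterAllocator {
    constructor() {
        this.free = [];
        this.next = 0;
    }
    allocate() {
        if (this.free.length === 0)
            return this.next++;
        const lowest = Math.min(...this.free);
        this.free.splice(this.free.indexOf(lowest), 1);
        return lowest;
    }
    release(register) {
        this.free.push(register);
    }
}

const firstRun = assignLocalIndices(new Map([["b", {}], ["a", {}], ["c", {}]]));
const secondRun = assignLocalIndices(new Map([["c", {}], ["b", {}], ["a", {}]]));

const allocator = new RegisterAllocator();
const r0 = allocator.allocate();
const r1 = allocator.allocate();
const r2 = allocator.allocate();
allocator.release(r2);
allocator.release(r0);
const reused = allocator.allocate(); // lowest free register, not the last released
```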
When a nested function (arrow or function expression) inside a default
parameter expression captures a name that also has a body var
declaration, the capture must propagate to the parent scope. Otherwise,
the outer scope optimizes the binding to a local register, making it
invisible to GetBinding at runtime.
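A small script of the kind this affects might look like the following
(function and variable names are made up for illustration). The arrow
function in the default value closes over the parameter scope, so its
`x` refers to the enclosing function's binding rather than the body's
`var x`.

```javascript
function enclosing() {
    let x = "outer";
    // The arrow is created while evaluating the parameter list, where the
    // body's `var x` is not visible, so it captures enclosing()'s `x`.
    function withDefault(getX = () => x) {
        var x = "body";
        return [getX(), x];
    }
    return withDefault();
}

const [capturedValue, bodyValue] = enclosing();
```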
When a function has parameter expressions (default values), body var
declarations that shadow a name referenced in a default parameter
expression must not be optimized to local variables. The default
expression needs to resolve the name from the outer scope via the
environment chain, not read the uninitialized local.
We now mark identifiers referenced during formal parameter parsing
with an IsReferencedInFormalParameters flag, and skip local variable
optimization for body vars that carry both this flag and IsVar (but
not IsForbiddenLexical, which indicates parameter names themselves).
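For example, in a script like this one (names are illustrative), the
default expression reads the outer `x`, while the body sees its own
`var x`:

```javascript
function enclosingScope() {
    var x = "outer";
    // `a = x` is evaluated in the parameter scope, which does not see the
    // body-level `var x`, so it resolves to enclosingScope()'s `x`.
    function shadowing(a = x) {
        var x = "body";
        return [a, x];
    }
    return shadowing();
}

const [defaultValue, shadowedValue] = enclosingScope();
```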
When an identifier was registered and its group already existed but
had no declaration_kind set, we failed to propagate the new
declaration kind to the existing group. This caused var declarations
to lose their annotation in AST dumps when the identifier was
referenced before its declaration.
ScopeCollector::add_declaration() was adding var declarations to the
top-level scope's m_var_declarations once per bound identifier and once
more after the for_each_bound_identifier loop - so a `var a, b, c`
would be added 4 times instead of 1.
The Script constructor iterates m_var_declarations and expands each
entry's bound identifiers, resulting in O(N²) work for a single var
statement with N declarators.
Running the Emscripten-compiled version of ScummVM with a 32,174-
declarator var statement, this produced over 1 billion entries,
consuming 14+ GB of RAM and blocking the event loop for 35+ seconds.
After this fix, this drops down to 200 MB and just short of 200 ms.
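The growth can be checked with a back-of-the-envelope model. It only
counts identifiers and assumes, as described above, that the statement
was added N + 1 times and that each copy is expanded into its N bound
identifiers. The function name is made up for this sketch.

```javascript
// Number of identifiers visited when expanding the recorded entries for a
// single var statement with `declaratorCount` declarators.
function countExpandedIdentifiers(declaratorCount, duplicateEntries) {
    // Buggy path: one entry per bound identifier plus one more after the loop.
    const entries = duplicateEntries ? declaratorCount + 1 : 1;
    return entries * declaratorCount;
}

const smallBefore = countExpandedIdentifiers(3, true);      // `var a, b, c`: 4 entries * 3
const smallAfter = countExpandedIdentifiers(3, false);      // 1 entry * 3
const largeBefore = countExpandedIdentifiers(32174, true);  // 32175 * 32174
const largeAfter = countExpandedIdentifiers(32174, false);
```

Under this model the 32,174-declarator case comes to 1,035,198,450
identifiers before the fix and 32,174 after it.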
When a function accesses the arguments object in non-strict mode, scope
analysis was skipping argument index assignment for all parameter
candidates. This is correct for regular parameters (which participate in
the sloppy-mode arguments-parameter linkage), but rest parameters never
participate in that linkage and should always get their argument index.
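The independence of a rest parameter from the arguments object can be
observed from script. In this illustrative snippet (names are made up),
writes through one are never visible through the other:

```javascript
function collect(first, ...rest) {
    // `rest` is a fresh array; it is not linked to the arguments object.
    arguments[1] = "changed via arguments";
    rest[1] = "changed via rest";
    return { restFirst: rest[0], thirdArgument: arguments[2] };
}

const observed = collect("a", "b", "c");
```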
Destructured parameter bindings (e.g. the x and y in function f([x, y]))
were not receiving any local indices during scope analysis. This meant
they could not benefit from the fast local variable access path in the
bytecode interpreter.
The issue had two parts:
1. set_function_parameters() only registered plain Identifier parameters
   with the IsParameterCandidate flag. BindingPattern parameters only
   got IsForbiddenLexical, which caused their identifiers to be skipped
   entirely during resolve_identifiers().
2. resolve_identifiers() unconditionally called set_argument_index() for
   all parameter candidates, but get_index_of_parameter_name() only
   finds plain Identifier parameters, not bindings inside patterns.
Fix this by registering destructured binding identifiers with
IsParameterCandidate so they participate in resolution, and then falling
back to a local variable slot when get_index_of_parameter_name() does
not find them. This is correct because the argument register holds the
whole object/array to be destructured, while the individual bindings
live in local variable slots.
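The fallback can be pictured with a toy model. The data shapes and the
function name are hypothetical and only loosely mirror the real
resolution code: plain identifier parameters keep their argument index,
and names bound inside a pattern get a local slot instead.

```javascript
// Hypothetical model: each formal parameter is either { name } for a plain
// identifier or { pattern: [names...] } for a destructuring pattern.
function assignParameterSlots(parameters) {
    const slots = new Map();
    let nextLocal = 0;
    parameters.forEach((parameter, position) => {
        if (parameter.name !== undefined) {
            // Found by name in the parameter list: use the argument index.
            slots.set(parameter.name, { kind: "argument", index: position });
            return;
        }
        // Not a plain identifier: each binding falls back to a local slot.
        for (const name of parameter.pattern)
            slots.set(name, { kind: "local", index: nextLocal++ });
    });
    return slots;
}

// Models `function f(a, [x, y])`.
const slots = assignParameterSlots([{ name: "a" }, { pattern: ["x", "y"] }]);
```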
has_declaration_in_current_function() only checked IsLexical and
IsVar flags, but parameters carry IsParameterCandidate instead.
This meant a parameter named "arguments" wasn't recognized as a
declaration, causing the scope analysis to incorrectly treat it
as an access to the arguments exotic object instead of a plain
parameter binding.
Add IsParameterCandidate to the flags check so parameters are
correctly recognized, allowing "arguments" to get its [argument:N]
index like any other parameter.
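The expected script-level behavior is that such a parameter simply
holds the passed value. The snippet below builds the function with the
Function constructor so that it stays non-strict even if the
surrounding file is strict (a parameter named "arguments" is a syntax
error in strict mode); the name echo is made up.

```javascript
// Equivalent to the sloppy-mode declaration:
//   function echo(arguments) { return arguments; }
const echo = new Function("arguments", "return arguments;");

// The parameter shadows the arguments object, so this is the plain value.
const echoed = echo(42);
```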
Replace the ScopePusher RAII class (which performed scope analysis
in its destructor chain during parsing) with a two-phase approach:
1. ScopeCollector builds a tree of ScopeRecord nodes during parsing
   via RAII ScopeHandle objects. It records declarations, identifier
   references, and flags, but does not resolve anything.
2. After parsing completes, ScopeCollector::analyze() walks the tree
   bottom-up and performs all resolution: propagate eval/with
   poisoning, resolve identifiers to locals/globals/arguments, hoist
   functions (Annex B.3.3), and build FunctionScopeData.
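The two phases can be sketched in miniature as follows. This is a
heavily simplified model, not the real classes: the JavaScript
ScopeRecord and Collector below are stand-ins, scopes are opened and
closed by explicit calls rather than RAII handles, and resolution is
reduced to a "local", "outer" or "global" label.

```javascript
class ScopeRecord {
    constructor(parent) {
        this.parent = parent;
        this.children = [];
        this.declarations = new Set();
        this.references = [];
        this.resolved = new Map();
        if (parent !== null)
            parent.children.push(this);
    }
}

class Collector {
    constructor() {
        this.root = new ScopeRecord(null);
        this.current = this.root;
    }

    // Phase 1: record only, resolve nothing.
    openScope() { this.current = new ScopeRecord(this.current); }
    closeScope() { this.current = this.current.parent; }
    declare(name) { this.current.declarations.add(name); }
    reference(name) { this.current.references.push(name); }

    // Phase 2: walk the finished tree, children first.
    analyze(scope = this.root) {
        for (const child of scope.children)
            this.analyze(child);
        for (const name of scope.references) {
            let owner = scope;
            while (owner !== null && !owner.declarations.has(name))
                owner = owner.parent;
            const kind = owner === null ? "global" : owner === scope ? "local" : "outer";
            scope.resolved.set(name, kind);
        }
    }
}

const collector = new Collector();
collector.openScope();         // e.g. a function body
collector.reference("x");      // referenced before its declaration is seen
collector.declare("x");
collector.reference("y");      // never declared anywhere
const functionScope = collector.current;
collector.closeScope();
collector.analyze();
```

Because resolution runs only after the whole tree exists, the early
reference to x still resolves to the declaration that follows it.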
Key design decisions:
- ScopeRecord::ast_node is a RefPtr<ScopeNode> to prevent
  use-after-free when synthesize_binding_pattern re-parses an
  expression as a binding pattern (the original parse's scope records
  survive with stale AST node pointers).
- Parser::scope_collector() returns the override collector if set
  (for synthesize_binding_pattern's nested parser), ensuring all
  scope operations route to the outer parser's scope tree.
- FunctionNode::local_variables_names() delegates to its body's
  ScopeNode rather than copying at parse time, since analysis runs
  after parsing.