ladybird

mirror of https://github.com/LadybirdBrowser/ladybird.git synced 2026-04-19 02:10:26 +00:00

Author	SHA1	Message	Date
Andreas Kling	40430d6087	LibJS: Detect direct eval calls in parse_call_expression() The parser previously detected direct eval() calls at the end of parse_expression(), by checking if the final expression was a CallExpression with "eval" as the callee. This missed cases where eval() appeared as a subexpression, e.g. `eval(code) | 0`, since the final expression would be a BinaryExpression, not a CallExpression. Move the detection into parse_call_expression() where the CallExpression is actually created. This ensures we always set the contains_direct_call_to_eval flag regardless of surrounding operators, so local variables are correctly placed in the declarative environment where eval'd code can find them.	2026-02-07 18:05:41 +01:00
Luke Wilde	e5f5329f9d	LibJS: Consume identifier when matching an identifier name in bindings Otherwise we'll fail to recognise that the identifier token is invalid (e.g. a keyword such as null) in a binding pattern and that it may not be a binding pattern after all, but an object expression. Fixes some scripts on Discord failing to parse.	2026-02-06 16:09:33 +01:00
Andreas Kling	5238841da2	LibJS: Mark named function expression identifiers at individual level Previously, when parsing a named function expression like `Oops = function Oops() { Oops }`, the parser set a group-level flag `might_be_variable_in_lexical_scope_in_named_function_assignment` that propagated to the parent scope. This incorrectly prevented ALL `Oops` identifiers from being marked as global, including those outside the function expression. Fix this by marking identifiers individually using `set_is_inside_scope_with_eval()` only for identifiers inside the function scope. This allows identifiers outside the function expression to correctly use GetGlobal/SetGlobal while identifiers inside still use GetBinding (since they may refer to the function's name binding).	2026-01-27 10:58:39 +01:00
Andreas Kling	871d93355b	LibJS: Stop propagating is_inside_scope_with_eval across functions Previously, when a nested function contained eval(), the parser would mark all identifiers in parent functions as "inside scope with eval". This prevented those identifiers from being marked as global, forcing them to use GetBinding instead of GetGlobal. However, eval() can only inject variables into its containing function's scope, not into parent function scopes. So a parent function's reference to a global like `Number` should still be able to use GetGlobal even if a nested function contains eval(). This change adds a new flag `m_eval_in_current_function` that propagates through block scopes within the same function but stops at function boundaries. This flag is used for marking identifiers, while the existing `m_screwed_by_eval_in_scope_chain` continues to propagate across functions for local variable deoptimization (since eval can access closure variables). Before: `new Number(42)` in outer() with eval in inner() -> GetBinding After: `new Number(42)` in outer() with eval in inner() -> GetGlobal	2026-01-27 10:58:39 +01:00
Andreas Kling	88d715fc68	LibJS: Eliminate HashMap operations in SFID by caching parser data Cache necessary data during parsing to eliminate HashMap operations in SharedFunctionInstanceData construction. Before: 2 HashMap copies + N HashMap insertions with hash computations After: Direct vector iteration with no hashing Build FunctionScopeData for function scopes in the parser containing: - functions_to_initialize: deduplicated var-scoped function decls - vars_to_initialize: var decls with is_parameter/is_function_name - var_names: HashTable for AnnexB extension checks - Pre-computed counts for environment size calculation - Flags for "arguments" handling Add ScopeNode::ensure_function_scope_data() to compute the data on-demand for edge cases that don't go through normal parser flow (synthetic class constructors, static initializers, module wrappers). Use this cached data directly in SFID with zero HashMap operations.	2026-01-25 23:08:36 +01:00
Jelle Raaijmakers	ae20ecf857	AK+Everywhere: Add Vector::contains(predicate) and use it No functional changes.	2026-01-08 15:27:30 +00:00
Luke Wilde	d766e41c94	LibJS: Store tagged template literal raw strings as StringLiterals	2026-01-06 23:25:36 +01:00
Andreas Kling	d6fbde43f8	LibJS: Track eval() scope membership per-identifier The previous fix prevented eval() in sibling function scopes from affecting each other, but it still had a limitation: when identifiers from multiple scopes were merged into the same identifier group at Program scope, the presence of eval() anywhere would taint all identifiers in the group. This change tracks per-identifier whether it was inside a scope with eval() in the scope chain. When a scope closes, if it contains eval() or has eval() in its parent chain, each identifier in that scope is marked with `is_inside_scope_with_eval`. At Program scope finalization, only identifiers that are NOT marked can be optimized to global lookups. This allows code like: ```js var x = undefined; // Can be optimized (program scope) (function() { function walk() { undefined; } // Cannot be optimized eval(''); })(); ``` Before: Neither `undefined` could be optimized After: The program-scope `undefined` is optimized, while the one inside the function with eval() correctly uses dynamic lookup.	2026-01-06 00:11:28 +01:00
Andreas Kling	1f5e209032	LibJS: Fix overly conservative eval() deopt for undeclared identifiers The parser was preventing identifiers like `undefined` from being optimized to global constants when eval() was present in any function scope, even when eval() could not affect the identifier's binding. This change makes the parser smarter by preventing eval() in one function scope from affecting global identifier optimization in sibling function scopes. The key insight is that eval() in function A cannot shadow globals in function B.	2026-01-04 13:13:26 +01:00
Andreas Kling	63eccc5640	LibJS: Don't make extra copies of every JS function's source code Instead, let functions have a view into the AST's SourceCode object's underlying string data. The source string is kept alive by the AST, so it's fine to have views into it as long as the AST exists. Reduces memory footprint on my x.com home feed by 65 MiB.	2025-12-21 10:06:04 -06:00
Andreas Kling	ece0b72e3c	LibJS: Don't set [[HomeObject]] for non-method object properties This fixes an issue where we'd incorrectly retain objects via the [[HomeObject]] slot. This common pattern was affected: Object.defineProperty(o, "foo", { get: function() { return 123; } }); Above, the object literal would get assigned to the [[HomeObject]] slot even though "get" is not a "method" per the spec. This frees about 30,000 objects on my x.com home feed.	2025-12-17 12:50:17 -06:00
Andreas Kling	fa44fd58d8	LibJS: Remove ParserState::lookahead_lexer The lookahead lexer used by next_token() no longer needs to be kept alive, since tokens created by Parser::next_token() now have any string views guaranteed safe by the fact that they point into the one true SourceCode provided by whoever set up the lexer.	2025-11-09 12:14:03 +01:00
Andreas Kling	0dacc94edd	LibJS: Have JS::Lexer take a JS::SourceCode as input This moves the responsibility of setting up a SourceCode object to the users of JS::Lexer. This means Lexer and Parser are free to use string views into the SourceCode internally while working. It also means Lexer no longer has to think about anything other than UTF-16 (or ASCII) inputs. So the unit test for parsing various invalid UTF-8 sequences is deleted here.	2025-11-09 12:14:03 +01:00
Andreas Kling	fdd9413e71	LibJS: Avoid some unnecessary ref count churn in parser	2025-11-09 12:14:03 +01:00
Andreas Kling	841fe0b51c	LibJS: Don't store current token in both Lexer and Parser Just give Parser a way to access the one stored in Lexer.	2025-11-09 12:14:03 +01:00
Andreas Kling	d3e8fbd9cd	LibJS: Don't create unique FunctionParameters for every empty param set	2025-11-09 12:14:03 +01:00
Andreas Kling	72aa90312a	LibJS: Make JS::Token::message an enum instead of a StringView Just to make JS::Token a little smaller.	2025-11-09 12:14:03 +01:00
Andreas Kling	fb05063dde	LibJS: Let bytecode instructions know whether they are in strict mode This commits puts the strict mode flag in the header of every bytecode instruction. This allows us to check for strict mode without looking at the currently running execution context.	2025-10-29 21:20:10 +01:00
Andreas Kling	5a7b0a07cb	LibJS: Mark global function declarations as globals This allows us to use the GetGlobal and SetGlobal bytecode instructions for them, enabling cached accesses. 2.62x speed-up on this Fibonacci program: function fib(n) { return n < 2 ? n : fib(n - 1) + fib(n - 2); } for (let i = 0; i < 50_000; ++i) fib(10);	2025-10-13 17:15:44 +02:00
Timothy Flynn	62d85dd90a	LibJS: Port RegExp flags and patterns to UTF-16	2025-08-13 09:56:13 -04:00
Timothy Flynn	b955c9b2a9	LibJS: Port the Identifier AST (and related) nodes to UTF-16 This eliminates quite a lot of UTF-8 / UTF-16 churn.	2025-08-13 09:56:13 -04:00
Timothy Flynn	00182a2405	LibJS: Port the JS lexer and parser to UTF-16 This ports the lexer to UTF-16 and deals with the immediate fallout up to the AST. The AST will be dealt with in upcoming commits. The lexer will still accept UTF-8 strings as input, and will transcode them to UTF-16 for lexing. This doesn't actually incur a new allocation, as we were already converting the input StringView to a ByteString for each lexer. One immediate logical benefit here is that we do not need to know off- hand how many UTF-8 bytes some special code points occupy. They all happen to be a single UTF-16 code unit. So instead of advancing the lexer by 3 positions in some cases, we can just always advance by 1.	2025-08-13 09:56:13 -04:00
Timothy Flynn	eb74781a2d	LibJS: Keep the lookahead lexer alive after parsing its next token Currently, the lexer holds a ByteString, which is always heap-allocated. When we create a copy of the lexer for the lookahead token, that token will outlive the lexer copy. The token holds a couple of string views into the lexer's source string. This is fine for now, because the source string will be kept alive by the original lexer. But if the lexer were to hold a String or Utf16String, short strings will be stored on the stack due to SSO. Thus the token will hold views into released stack data. We need to keep the lookahead lexer alive to prevent UAF on views into its source string.	2025-08-13 09:56:13 -04:00
ayeteadoe	2e2484257d	LibJS: Enable EXPLICIT_SYMBOL_EXPORT and annotate minimum symbol set	2025-07-22 11:51:29 -04:00
ayeteadoe	539a675802	LibJS: Revert Enable EXPLICIT_SYMBOL_EXPORT This reverts commit `c14173f651`. We should only annotate the minimum number of symbols that external consumers actually use, so I am starting from scratch to do that	2025-07-22 11:51:29 -04:00
Timothy Flynn	66006d3812	AK+LibJS: Extract some UTF-16 helpers for use in an outside class An upcoming Utf16String will need access to these helpers. Let's make them publicly available.	2025-07-03 09:51:56 -04:00
ayeteadoe	c14173f651	LibJS: Enable EXPLICIT_SYMBOL_EXPORT	2025-06-30 10:50:36 -06:00
Luke Wilde	f12b6b258f	LibJS: Don't use presence of function params to identify function scope Instead, we can just use the scope type to determine if a scope is a function scope. This fixes using `this` for parameter default values in arrow functions crashing. This happened by `uses_this_from_environment` was not set in `set_uses_this`, as it didn't think it was in a function scope whilst parsing parameters. Fixes closing modal dialogs causing a crash on https://www.ikea.com/ No test262 diff. Reverts the functional part of `08cfd5f`, because it was a workaround for this issue.	2025-06-17 20:48:45 +02:00
Viktor Szépe	19f88f96dc	Everywhere: Fix typos - act III	2025-06-16 14:20:48 +01:00
Aliaksandr Kalenik	dcfc515cd0	LibJS: Fix arrow function parsing bug In the following example: ```js const f = (i) => ({ obj: { a: { x: i }, b: { x: i } }, g: () => {}, }); ``` The body of function `f` is initially parsed as an arrow function. As a result, what is actually an object expression is interpreted as a formal parameter with a binding pattern. Since duplicate identifiers are not allowed in this context (`i` in the example), the parser generates an error, causing the entire script to fail parsing. This change ignores the "Duplicate parameter names in bindings" error during arrow function parameter parsing, allowing the parser to continue and recognize the object expression of the outer arrow function with an implicit return. Fixes error on https://chat.openai.com/	2025-05-26 12:44:21 +03:00
Aliaksandr Kalenik	db480b1f0c	LibJS: Preserve information about local variables declaration kind This is required for upcoming change where we want to emit ThrowIfTDZ for assignment expressions only for lexical declarations.	2025-05-06 12:06:23 +02:00
Andreas Kling	bf1b754e91	LibJS: Optimize reading known-to-be-initialized `var` bindings `var` bindings are never in the temporal dead zone (TDZ), and so we know accessing them will not throw. We now take advantage of this by having a specialized environment binding value getter that doesn't check for exceptional cases. 1.08x speedup on JetStream.	2025-05-04 02:31:18 +02:00
Timothy Flynn	3867a192a1	LibJS: Update spec steps / links for the import-assertions proposal This proposal has reached stage 4 and been merged into the main ECMA-262 spec. See: `4e3450e`	2025-04-29 07:33:08 -04:00
Aliaksandr Kalenik	2d732b2251	LibJS: Skip allocating locals for arguments that allowed to be local This allows us to get rid of instructions that move arguments to locals and allocate smaller JS::Value vector in ExecutionContext by reusing slots that were already allocated for arguments. With this change for following function: ```js function f(x, y) { return x + y; } ``` we now produce following bytecode: ``` [ 0] 0: Add dst:reg6, lhs:arg0, rhs:arg1 [ 10] Return value:reg6 ``` instead of: ``` [ 0] 0: GetArgument 0, dst:x~1 [ 10] GetArgument 1, dst:y~0 [ 20] Add dst:reg6, lhs:x~1, rhs:y~0 [ 30] Return value:reg6 ```	2025-04-26 11:02:29 +02:00
Aliaksandr Kalenik	81a3bfd492	LibJS: Allow using locals if `arguments` is used in strict mode Previously we blocked using locals for function arguments whenever `arguments` was mentioned in function body, however, this is not necessary in strict mode, where mutations to the arguments object are not reflected in the function arguments and vice versa.	2025-04-25 21:08:24 +02:00
Aliaksandr Kalenik	7932091e02	LibJS: Allow using local variable for catch parameters Local variables are faster to access and if all catch parameters are locals we can skip lexical environment allocation.	2025-04-22 21:57:25 +02:00
Aliaksandr Kalenik	0f14c70252	LibJS: Use Identifier to represent CatchClause parameter names By doing that we consistently use Identifier node for identifiers and also enable mechanism that registers identifiers in a corresponding ScopePusher for catch parameters, which is necessary for work in the upcoming changes.	2025-04-22 21:57:25 +02:00
Andrew Kaster	c471faee10	LibJS: Launder const in the parser where required with strict RefPtrs These places should be updated to not require this hackery, but pulling on this thread involves touching almost every method in the parser.	2025-04-16 10:41:44 -06:00
Andreas Kling	ef4e7b7945	LibJS: Make JS parser emit accurate `this` insights for constructors This way we don't have to handle it when instantiating the constructor.	2025-04-08 18:52:35 +02:00
devgianlu	08cfd5ff1b	LibJS: Set empty function parameters on ClassStaticInit scope This prevents the variables declared inside a class static initializer to escape to the nearest containing function causing all sorts of memory corruptions.	2025-04-05 18:20:36 +01:00
devgianlu	6aea459e00	LibJS: Wrap `static_init_block_scope` call in its own scope	2025-04-05 18:20:36 +01:00
R-Goc	28d5d982ce	Everywhere: Remove unused private fields This commit removes the -Wno-unusued-private-field flag, thus reenabling the warning. Unused field were either removed or marked [[maybe_unused]] when unsure.	2025-04-04 12:40:07 +02:00
Andreas Kling	6c70dc5f09	LibJS: Create FunctionParameters earlier in the parser This avoids making multiple copies of the Vector<FunctionParameter> in the parser.	2025-03-27 19:50:13 +00:00
Andreas Kling	7477002e46	LibJS: Keep parsed function parameters in a shared data structure Instead of making a copy of the Vector<FunctionParameter> from the AST every time we instantiate an ECMAScriptFunctionObject, we now keep the parameters in a ref-counted FunctionParameters object. This reduces memory usage, and also allows us to cache the bytecode executables for default parameter expressions without recompiling them for every instantiation. :^)	2025-03-27 15:00:43 +00:00
Andreas Kling	46a5710238	LibJS: Use FlyString in PropertyKey instead of DeprecatedFlyString This required dealing with substantial fallout.	2025-03-24 22:27:17 +00:00
Timothy Flynn	b64a355a30	LibJS: Remove support for the "assert" keyword for import attributes This was removed from the spec some time ago. See: `14286bb`	2025-01-21 14:58:32 +01:00
Timothy Flynn	47ba231a9b	LibJS: Do not consume "with" tokens in import statements as identifiers The "with" statement is its own token (TokenType::With), and thus would fail to parse as an identifier. We've already asserted that the token we are parsing is "with" or "assert", so just consume it.	2025-01-21 14:58:32 +01:00
Timothy Flynn	7d420bbd3d	LibJS: Update the noted grammar for ImportDeclaration	2025-01-21 14:58:32 +01:00
Luke Wilde	5f33383a7b	LibJS: Propagate direct eval presence if the current scope is screwed Previously it only deoptimized the parent scope if the current scope contains direct eval, which is incorrect because code ran in direct eval mode has access to the entire scope chain it was executed in. The fix is to also propagate direct eval's presence if the current scope is marked as being screwed by direct eval. This fixes Google's botguard failing to complete on Google sign in, as it tried to access local variables outside of a direct parent function with eval, causing it throw "unhandled" exceptions. Unhandled is in quotes because their bytecode VM _technically_ caught it, but it was considered an unhandled exception. This was determined by removing get optimizations and then adding debug output for every get operation. Using this, I noticed that for these errors, it would access the 'message' and 'stack' properties. This is because their error handler function noticed this was not a synthesised error, which is never expected to happen. That was determined by using Chrome Devtools 'pause on handled exception' feature, and noticing it never threw a '[var] is not defined' exception, but only synthesized error objects which contained a sentinel value to let it know it was synthesized. I added debug output to eval to print out what was being eval'd because it makes heavy use of eval. This revealed that the exceptions only came from eval. I then dumped every generated executable and noticed the variables it was trying to access were generated as local variables in the top scope. This led to checking what makes a variable considered local or not, which then lead to this block of code in ~ScopePusher that propagates eval presence only to the immediate parent scope. This variable directly controls whether to create all variables properly with variable environments and bindings or allow them to be stored as local registers tied to that function's executable. Since this now lets botguard run to completion, it no longer considers us to be an insecure/potential bot browser when signing in, now allowing us to be able to sign in to Google.	2025-01-17 14:36:03 +01:00
Timothy Flynn	ada36e5c0a	LibJS: Allow async functions named "async" as function properties For example, https://locals.com/site/discover has a script with an object of the form: var f = { parser: { sync() {}, async async() {}, } }; We were previously throwing a syntax error on the async function, as we specifically did not allow using "async" as a function name here.	2024-12-26 17:23:10 +01:00
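
Several of the commits above depend on the difference between direct and indirect eval() in JavaScript. The following is a minimal sketch of that language behavior only, not code from the repository; the function and variable names (`directEval`, `indirectEval`, `closureEval`, `localOnlyValue`, `captured`) are made up for illustration.

```js
// Direct eval: the callee is the plain identifier `eval`, so the evaluated
// code can read the caller's local bindings. The call stays "direct" even
// when it is only a subexpression, as in `eval(code) | 0`.
function directEval() {
  const localOnlyValue = 42;
  return eval("localOnlyValue") | 0;
}

// Indirect eval: calling eval through another reference evaluates the code
// in the global scope, where the local binding is not visible.
function indirectEval() {
  const localOnlyValue = 42;
  const indirect = eval;
  return indirect("typeof localOnlyValue");
}

// Direct eval inside a nested function can still reach variables of the
// enclosing function, so those variables need real environment bindings.
function closureEval() {
  const captured = "from outer";
  function inner() {
    return eval("captured");
  }
  return inner();
}

console.log(directEval());   // 42
console.log(indirectEval()); // "undefined"
console.log(closureEval());  // "from outer"
```

The third case appears to be why an enclosing function's locals cannot be kept in registers when a nested function contains direct eval, even though identifiers in that enclosing function that refer to globals can still use cached global lookups.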


222 commits