Commit graph

22 commits

Author SHA1 Message Date
Andreas Kling
0969a5cd9a LibJS: Use Substring for legacy regexp statics
Keep the legacy regexp static properties backed by PrimitiveString
values instead of eagerly copied Utf16Strings. lastMatch, leftContext,
rightContext, and $1-$9 now materialize lazy Substrings from the
original match input when accessed.

Keep RegExp.input as a separate slot from the match source so manual
writes do not rewrite the last match state. Add coverage for that
behavior and for rope-backed UTF-16 inputs.
2026-04-11 00:35:36 +02:00
Andreas Kling
e243e146de LibJS+LibRegex: Switch RegExp over to the Rust engine
Switch LibJS `RegExp` over to the Rust-backed `ECMAScriptRegex` APIs.

Route `new RegExp()`, regex literals, and the RegExp builtins through
the new compile and exec APIs, and stop re-validating patterns with the
deleted C++ parser on the way in. Preserve the observable error
behavior by carrying structured compile errors and backtracking-limit
failures across the FFI boundary. Cache compiled regex state and named
capture metadata on `RegExpObject` in the new representation.

Use the new API surface to simplify and speed up the builtin paths too:
share `exec_internal`, cache compiled regex pointers, keep the legacy
RegExp statics lazy, run global replace through batch `find_all`, and
optimize replace, test, split, and String helper paths. Add regression
tests for those JavaScript-visible paths.
2026-03-27 17:32:19 +01:00
Andreas Kling
30f108ba36 LibJS: Remove C++ lexer, use Rust tokenizer for syntax highlighting
Delete Lexer.cpp/h and Token.cpp, replacing all tokenization with a
new rust_tokenize() FFI function that calls back for each token.

Rewrite SyntaxHighlighter.cpp and js.cpp REPL to use the Rust
tokenizer. The token type and category enums in Token.h now mirror
the Rust definitions in token.rs.

Move is_syntax_character/is_whitespace/is_line_terminator helpers
into RegExpConstructor.cpp as static functions, since they were only
used there.
2026-03-19 21:55:10 -05:00
Timothy Flynn
d7e828a366 LibJS: Make more use of Value::is and Value::as_if in JS runtime 2026-02-27 17:19:33 +01:00
Timothy Flynn
adf6024805 LibJS: Update spec steps / links for the RegExp.escape proposal
This proposal has reached stage 4 and been merged into the main ECMA-262
spec. See:

e2da759
2025-04-29 07:33:08 -04:00
Timothy Flynn
db87f173fb LibJS: Implement the RegExp.escape proposal
https://tc39.es/proposal-regex-escaping/
2024-12-05 13:56:21 +01:00
Shannon Booth
f87041bf3a LibGC+Everywhere: Factor out a LibGC from LibJS
Resulting in a massive rename across almost everywhere! Alongside the
namespace change, we now have the following names:

 * JS::NonnullGCPtr -> GC::Ref
 * JS::GCPtr -> GC::Ptr
 * JS::HeapFunction -> GC::Function
 * JS::CellImpl -> GC::Cell
 * JS::Handle -> GC::Root
2024-11-15 14:49:20 +01:00
Timothy Flynn
93712b24bf Everywhere: Hoist the Libraries folder to the top-level 2024-11-10 12:50:45 +01:00
Andreas Kling
13d7c09125 Libraries: Move to Userland/Libraries/ 2021-01-12 12:17:46 +01:00
Linus Groh
5cb45e4feb LibJS: Make RegExp() constructor spec-compliant
- Default values should depend on arguments being undefined, not being
  missing
- "(?:)" for empty pattern happens in RegExp.prototype.source, not the
  constructor
2020-11-28 01:20:11 +01:00
AnotherTest
3200ff5f4f LibJS+js: Rename RegExp.{content => pattern}
The spec talks about it as 'pattern', so let's use that instead.
2020-11-27 21:32:41 +01:00
Andreas Kling
7b863330dc LibJS: Cache commonly used FlyStrings in the VM
Roughly 7% of test-js runtime was spent creating FlyStrings from string
literals. This patch frontloads that work and caches all the commonly
used names in LibJS on a CommonPropertyNames struct that hangs off VM.
2020-10-13 23:57:45 +02:00
Andreas Kling
2bc5bc64fb LibJS: Remove a whole bunch of includes of <LibJS/Interpreter.h> 2020-09-27 20:26:58 +02:00
Andreas Kling
f79d4c7347 LibJS: Remove Interpreter& argument to Function::construct()
This is no longer needed, we can get everything we need from the VM.
2020-09-27 20:26:58 +02:00
Andreas Kling
340a115dfe LibJS: Make native function/property callbacks take VM, not Interpreter
More work on decoupling the general runtime from Interpreter. The goal
is becoming clearer. Interpreter should be one possible way to execute
code inside a VM. In the future we might have other ways :^)
2020-09-27 20:26:58 +02:00
Andreas Kling
1ff9d33131 LibJS: Make Function::call() not require an Interpreter&
This makes a difference inside ScriptFunction::call(), which will now
instantiate a temporary Interpreter if one is not attached to the VM.
2020-09-27 20:26:58 +02:00
Andreas Kling
aaf6014ae1 LibJS: Simplify Cell::initialize()
Remove the Interpreter& argument and pass only GlobalObject&. We can
find everything we need via the global object anyway.
2020-07-23 17:31:08 +02:00
Matthew Olsson
bda39ef7ab LibJS: Explicitly pass a "Function& new_target" to Function::construct
This allows the proxy handler to pass the proper new.target to construct
handlers.
2020-07-01 11:16:37 +02:00
Andreas Kling
2fe4285693 LibJS: Object::initialize() overrides must always call base class 2020-06-20 17:50:48 +02:00
Andreas Kling
64513f3c23 LibJS: Move native objects towards two-pass construction
To make sure that everything is set up correctly in objects before we
start adding properties to them, we split cell allocation into 3 steps:

1. Allocate a cell of appropriate size from the Heap
2. Call the C++ constructor on the cell
3. Call initialize() on the constructed object

The job of initialize() is to define all the initial properties.
Doing it in a second pass guarantees that the Object has a valid Shape
and can find its own GlobalObject.
2020-06-20 15:46:30 +02:00
Andreas Kling
affc479e83 LibJS+LibWeb: Remove a bunch of calls to Interpreter::global_object()
Objects should get the GlobalObject from themselves instead. However,
it's not yet available during construction so this only switches code
that happens after construction.

To support multiple global objects, Interpreter needs to stop holding
on to "the" global object and let each object graph own their global.
2020-06-08 12:25:45 +02:00
Matthew Olsson
61ac1d3ffa LibJS: Lex and parse regex literals, add RegExp objects
This adds regex parsing/lexing, as well as a relatively empty
RegExpObject. The purpose of this patch is to allow the engine to not
get hung up on parsing regexes. This will aid in finding new syntax
errors (say, from google or twitter) without having to replace all of
their regexes first!
2020-06-07 19:06:55 +02:00