Commit graph

4 commits

Author SHA1 Message Date
Andreas Kling
b81269e78b Libraries: Clean up UTF-16 source text paths
Store parser errors, source range filenames, source code filenames,
module source, and Rust parser errors as UTF-16 where they flow back
into JavaScript-visible strings. Keep byte-oriented source buffers
byte-backed.

Remove temporary PrimitiveString, ByteString, and UTF-8 detours from
JSON, RegExp, module debug logging, print formatting, and tests.
2026-06-22 19:51:25 +02:00
Andreas Kling
2d20322fce LibRegex: Compile ECMAScript patterns from UTF-16
Accept Utf16View patterns at the LibRegex compile boundary and pass
UTF-16 or ASCII storage directly into the Rust regex parser. This keeps
JavaScript regular expression construction from converting patterns
through UTF-8 when LibRegex can consume the same UTF-16 representation
used by LibJS.

Update RegExp construction, HTML pattern validation, the regex fuzzer,
and LibRegex tests to use the UTF-16 compile API.
2026-06-22 16:10:40 +02:00
Andreas Kling
f627b7dcbb LibRegex: Respect V8 astral literal lastIndex behavior
Preserve V8's behavior for bare single-astral literals when a unicode
global search starts in the middle of a surrogate pair. We were
snapping that lastIndex back to the pair start unconditionally,
which let /😀/gu and /\u{1F600}/gu match where V8 returns null.

Expose that literal shape from LibRegex to LibJS and add runtime
coverage for the bare literal case alongside a grouped control.
2026-03-27 17:32:19 +01:00
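
The index arithmetic behind f627b7dcbb can be sketched as follows. This is a minimal illustration, not code from LibRegex or LibJS: the helper names are hypothetical. It shows only how to tell that a UTF-16 index falls in the middle of a surrogate pair and what "snapping back to the pair start" means. It does not model the rule for when to skip the snap for bare astral literals.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helpers for illustration; these names are not from LibRegex or LibJS.
constexpr bool is_high_surrogate(char16_t unit) { return unit >= 0xD800 && unit <= 0xDBFF; }
constexpr bool is_low_surrogate(char16_t unit) { return unit >= 0xDC00 && unit <= 0xDFFF; }

// True when `index` points at the low half of a well-formed surrogate pair.
constexpr bool is_mid_surrogate_pair(std::u16string_view text, std::size_t index)
{
    if (index == 0 || index >= text.size())
        return false;
    return is_low_surrogate(text[index]) && is_high_surrogate(text[index - 1]);
}

// The unconditional snap the commit message describes: move a mid-pair
// index back to the start of the pair, and leave every other index alone.
constexpr std::size_t snap_to_pair_start(std::u16string_view text, std::size_t index)
{
    return is_mid_surrogate_pair(text, index) ? index - 1 : index;
}

// U+1F600 occupies two UTF-16 code units (a high and a low surrogate).
constexpr std::u16string_view face = u"\U0001F600";

static_assert(face.size() == 2);
static_assert(is_mid_surrogate_pair(face, 1));
static_assert(snap_to_pair_start(face, 1) == 0);
static_assert(snap_to_pair_start(face, 0) == 0);
```

With a lastIndex of 1 on a string holding only U+1F600, the unconditional snap yields 0, which is the starting position the commit says must not be used for the bare literal case.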
Andreas Kling
34d954e2d7 LibRegex: Add ECMAScriptRegex and migrate callers
Add `ECMAScriptRegex`, LibRegex's C++ facade for ECMAScript regexes.

The facade owns compilation, execution, captures, named groups, and
error translation for the Rust backend, which lets callers stop
depending on the legacy parser and matcher types directly. Use it in the
remaining non-LibJS callers: URLPattern, HTML input pattern handling,
and the places in LibHTTP that only needed token validation.

Where a full regex engine was unnecessary, replace those call sites with
direct character checks. Also update focused LibURL, LibHTTP, and WPT
coverage for the migrated callers and corrected surrogate handling.
2026-03-27 17:32:19 +01:00
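
As a rough illustration of the "direct character checks" mentioned in 34d954e2d7, the sketch below validates an HTTP token without a regex engine. It assumes "token" means the RFC 9110 `token` grammar (one or more `tchar`). The function names are hypothetical and are not the LibHTTP implementation.

```cpp
#include <string_view>

// Hypothetical helper: true if `c` is an RFC 9110 tchar
// (ASCII letter, digit, or one of the listed symbols).
constexpr bool is_http_token_char(char c)
{
    if ((c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z'))
        return true;
    constexpr std::string_view symbols = "!#$%&'*+-.^_`|~";
    return symbols.find(c) != std::string_view::npos;
}

// Hypothetical helper: a token is a non-empty run of tchar characters.
constexpr bool is_http_token(std::string_view text)
{
    if (text.empty())
        return false;
    for (char c : text) {
        if (!is_http_token_char(c))
            return false;
    }
    return true;
}

static_assert(is_http_token("Content-Type"));
static_assert(!is_http_token(""));
static_assert(!is_http_token("bad token"));
```

A loop like this avoids compiling a pattern at all, which fits the commit's point that some call sites never needed a full regex engine.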