Previously, both string_position and view_index used code unit offsets
regardless of mode. Now in unicode mode, these variables track code
point positions while string_position_in_code_units is properly
updated to reflect code unit offsets.
There apparently is a bit of a disconnect between the spec asking us to
construct the pattern using code points and LibRegex not being able to
swallow those. Whenever we had multi-byte code points in the pattern and
tried to match that in unicode mode, we would fail.
Change the parser to encode all non-ASCII code units. Fixes 2 test262
cases in `language/literals/regexp`.
Before, If the cache was empty we would try and evict non-existant
entries and crash. So the fix is to make sure that we don't saturate
the cache with a single parse result.
If the regex always matches the input, even if it's past the end, then
we need to stop execution of the regex when it's past the end. This
corresponds to step 13.a and prevents it from infinitely looping.
Reduced from: d98672060f/packages/react-i18n/src/utilities/money.ts (L10-L14)
- Default values should depend on arguments being undefined, not being
missing
- "(?:)" for empty pattern happens in RegExp.prototype.source, not the
constructor