Commit graph

34 commits

Author SHA1 Message Date
mjz19910
10ec98dd38 Everywhere: Fix spelling mistakes 2022-01-07 15:44:42 +01:00
Timothy Flynn
f16f3c4677 LibJS: Update ToRawPrecision / ToRawFixed AO spec comments
This is a normative change in the Intl spec:
f0f66cf

There are two main changes here:
1. Converting BigInt/Number objects to mathematical values.
2. A change in how ToRawPrecision computes its exponent and significant
   digits.

For (1), we do not yet support BigInt number formatting, thus already
have coerced Number objects to a double. When BigInt is supported, the
number passed into these methods will likely still be a Value, thus can
be coereced then.

For (2), our implementation already returns the expected edge-case
results pointed out on the spec PR.
2022-01-02 20:07:03 +01:00
Timothy Flynn
a3149c11e5 LibJS: Explicitly handle postive/negative infinity in Intl.NumberFormat
This is a normative change in the Intl spec:
f0f66cf

Our implementation is unaffected by this change. LibUnicode pre-computes
positive, negative, and signless format patterns, so we already format
negative infinity correctly. Also, the CLDR does not contain specific
locale-dependent strings for negative infinity anyways.
2022-01-02 20:07:03 +01:00
Timothy Flynn
9ce4ff4265 LibJS: Avoid crashing when the Unicode data generators are disabled
The general idea when ENABLE_UNICODE_DATABASE_DOWNLOAD is OFF has been
that the Intl APIs will provide obviously incorrect results, but should
not crash. This regressed a bit with NumberFormat and DateTimeFormat.
2021-12-22 17:30:43 +01:00
Timothy Flynn
2a7f36b392 LibJS+LibUnicode: Generate unique numeric symbol lists
There are 443 number system objects generated, each of which held an
array of number system symbols. Of those 443 arrays, only 39 are unique.

To uniquely store these, this change moves the generated NumericSymbol
enumeration to the public LibUnicode/NumberFormat.h header with a pre-
defined set of symbols that we need. This is to ensure the generated,
unique arrays are created in a known order with known symbols. While it
is unfortunate to no longer discover these symbols at generation time,
it does allow us to ignore unwanted symbols and perform less string-to-
enumeration conversions at lookup time.
2021-12-11 14:17:47 +00:00
Timothy Flynn
d2588d852b LibJS: Change all [[RelevantExtensionKeys]] to return constexpr arrays
There's no need to allocate a vector for this internal slot. Similar to
commit: bb11437792
2021-12-01 16:36:26 +00:00
Timothy Flynn
914675e826 LibJS+LibUnicode: Separate number formatting methods from Locale.h
Currently, we generate separate data files for locale and number format
related tables/methods, but provide public accessors for all of the data
in one Locale.h file. Rather than continuing this trend for date-time,
relative time, etc. formatting, it's a bit easier to reason about if the
public accessors are also in separate files.
2021-11-29 22:48:46 +00:00
Timothy Flynn
251f692440 LibJS: Re-implement SetNumberFormatDigitOptions AO
This is an editorial change in the Intl spec.

See: d89c84f
2021-11-24 14:17:15 +00:00
Timothy Flynn
a1d5849e67 LibJS: Implement unit number formatting 2021-11-16 23:14:09 +00:00
Timothy Flynn
04b8b87c17 LibJS+LibUnicode: Support multiple identifiers within format pattern
This wasn't the case for compact patterns, but unit patterns can contain
multiple (up to 2, really) identifiers that must each be recognized by
LibJS.

Each generated NumberFormat object now stores an array of identifiers
parsed. The format pattern itself is encoded with the index into this
array for that identifier, e.g. the compact format string "0K" will
become "{number}{compactIdentifier:0}".
2021-11-16 23:14:09 +00:00
Timothy Flynn
3b68370212 LibJS+LibUnicode: Rename the generated compact_identifier to identifier
This field is currently used to store the StringView into the compact
name/symbol in the format string. Units will need to store a similar
field, so rename the field to be more generic, and extract the parser
for it.
2021-11-16 23:14:09 +00:00
Timothy Flynn
6d34a0b4e8 LibJS+LibUnicode: Rename method to select a NumberFormat plurality
Instead of currency pattern lookups within select_currency_unit_pattern,
rename the method to select_pattern_with_plurality and accept any list
of patterns. This method will be needed for units.
2021-11-16 23:14:09 +00:00
Timothy Flynn
99c15741ba LibJS: Conditionally ignore [[UseGrouping]] in compact notation 2021-11-16 00:56:55 +00:00
Timothy Flynn
14aca03161 LibJS: Remove FIXME comment from PartitionNotationSubPattern AO
All possible patterns generated by LibUnicode are now handled. We have a
similar VERIFY_NOT_REACHED in PartitionNumberPattern.
2021-11-16 00:56:55 +00:00
Timothy Flynn
fdae323401 LibJS: Implement compact formatting for Intl.NumberFormat 2021-11-16 00:56:55 +00:00
Timothy Flynn
80b86d20dc LibJS: Cache the number format used for compact notation
Finding the best number format to use for compact notation involves
creating a Vector of all compact formats for the locale and looking for
the one that best matches the number's magnitude. ECMA-402 wants this
number format to be found multiple times, so cache the result for future
use.
2021-11-16 00:56:55 +00:00
Timothy Flynn
1f546476d5 LibJS+LibUnicode: Fix computation of compact pattern exponents
The compact scale of each formatting rule was precomputed in commit:
be69eae651

Using the formula: compact scale = magnitude - pattern scale

This computation was off-by-one.

For example, consider the format key "10000-count-one", which maps to
"00 thousand" in en-US. What we are really after is the exponent that
best represents the string "thousand" for values greater than 10000
and less than 100000 (the next format key). We were previously doing:

    log10(10000) - "00 thousand".count("0") = 2

Which clearly isn't what we want. Instead, if we do:

    log10(10000) + 1 - "00 thousand".count("0") = 3

We get the correct exponent for each format key for each locale.

This commit also renames the generated variable from "compact_scale" to
"exponent" to match the terminology used in ECMA-402.
2021-11-16 00:56:55 +00:00
Timothy Flynn
4d79ab6866 LibJS: Implement engineering and scientific number formatting 2021-11-14 17:00:35 +00:00
Timothy Flynn
15c5fbd9e9 LibJS: Implement number grouping for Intl.NumberFormat
For example, in en-US, the number 123456 should be formatted as the
string "123,456". In en-IN, it should be formatted as "1,23,456".
2021-11-14 10:35:19 +00:00
Timothy Flynn
3450def494 LibJS: Implement Intl.NumberFormat.prototype.formatToParts 2021-11-13 19:01:25 +00:00
Timothy Flynn
c65dea64bd LibJS+LibUnicode: Don't remove {currency} keys in GetNumberFormatPattern
In order to implement Intl.NumberFormat.prototype.formatToParts, do not
replace {currency} keys in the format pattern before ECMA-402 tells us
to. Otherwise, the array return by formatToParts will not contain the
expected currency key.

Early replacement was done to avoid resolving the currency display more
than once, as it involves a couple of round trips to search through
LibUnicode data. So this adds a non-standard method to NumberFormat to
do this resolution and cache the result.

Another side effect of this change is that LibUnicode must replace unit
format patterns of the form "{0} {1}" during code generation. These were
previously skipped during code generation because LibJS would just
replace the keys with the currency display at runtime. But now that the
currency display injection is delayed, any {0} or {1} keys in the format
pattern will cause PartitionNumberPattern to abort.
2021-11-13 19:01:25 +00:00
Timothy Flynn
d872d030f1 LibJS: Avoid potential for dangling string views in partition patterns
There aren't any dangling views in as of yet, but a subsequent commit
will cause the "part" variable to be a view into an internally generated
string. Therefore, after returning from PartitionNumberPattern, that
view will be pointed at freed memory.

This commit is to set the precendence of not returning a view to "part".
2021-11-13 19:01:25 +00:00
Timothy Flynn
a701ed52fc LibJS+LibUnicode: Fully implement currency number formatting
Currencies are a bit strange; the layout of currency data in the CLDR is
not particularly compatible with what ECMA-402 expects. For example, the
currency format in the "en" and "ar" locales for the Latin script are:

    en: "¤#,##0.00"
    ar: "¤\u00A0#,##0.00"

Note how the "ar" locale has a non-breaking space after the currency
symbol (¤), but "en" does not. This does not mean that this space will
appear in the "ar"-formatted string, nor does it mean that a space won't
appear in the "en"-formatted string. This is a runtime decision based on
the currency display chosen by the user ("$" vs. "USD" vs. "US dollar")
and other rules in the Unicode TR-35 spec.

ECMA-402 shies away from the nuances here with "implementation-defined"
steps. LibUnicode will store the data parsed from the CLDR however it is
presented; making decisions about spacing, etc. will occur at runtime
based on user input.
2021-11-13 11:52:45 +00:00
Timothy Flynn
89523f70cf LibJS: Begin implementing Intl.NumberFormat.prototype.format
There is quite a lot to be done here so this is just a first pass at
number formatting. Decimal and percent formatting are mostly working,
but only for standard and compact notation (engineering and scientific
notation are not implemented here). Currency formatting is parsed, but
there is more work to be done to handle e.g. using symbols instead of
currency codes ("$" instead of "USD"), and putting spaces around the
currency symbol ("USD 2.00" instead of "USD2.00").
2021-11-12 09:17:08 +00:00
Linus Groh
b7e5f08e56 LibJS: Convert Object::get() to ThrowCompletionOr
To no one's surprise, this patch is pretty big - this is possibly the
most used AO of all of them. Definitely worth it though.
2021-10-03 20:14:03 +01:00
Idan Horowitz
768009e005 LibJS: Convert NumberFormat AOs to ThrowCompletionOr 2021-09-18 22:59:15 +03:00
Idan Horowitz
407cf04884 LibJS: Convert get_number_option() to ThrowCompletionOr 2021-09-18 22:21:15 +03:00
Idan Horowitz
6d3de03549 LibJS: Convert default_number_option() to ThrowCompletionOr 2021-09-18 22:21:15 +03:00
Idan Horowitz
b9c7a629f8 LibJS: Convert coerce_options_to_object() to ThrowCompletionOr 2021-09-18 22:21:15 +03:00
Idan Horowitz
de9785b71b LibJS: Convert Intl::get_option() to ThrowCompletionOr 2021-09-18 22:21:15 +03:00
Idan Horowitz
3758e65293 LibJS: Convert canonicalize_locale_list() to ThrowCompletionOr 2021-09-18 22:21:15 +03:00
Timothy Flynn
7769cd2cab LibJS: Move number_format_relevant_extension_keys to Intl.NumberFormat
This method represents the Intl.NumberFormat's [[RelevantExtensionKeys]]
internal slot, so it makes more sense for this to be directly in the
class itself.
2021-09-12 12:57:17 +01:00
Timothy Flynn
94a5a0437c LibJS: Move Intl.NumberFormat's AOs to its object file 2021-09-12 12:57:17 +01:00
Timothy Flynn
07f12b108b LibJS: Implement a nearly empty Intl.NumberFormat object
This adds plumbing for the Intl.NumberFormat object, constructor, and
prototype.
2021-09-11 11:05:50 +01:00