LibUnicode: Parse and generate per-locale plural rules from the CLDR
Plural rules in the CLDR are of the form:
"cs": {
"pluralRule-count-one": "i = 1 and v = 0 @integer 1",
"pluralRule-count-few": "i = 2..4 and v = 0 @integer 2~4",
"pluralRule-count-many": "v != 0 @decimal 0.0~1.5, 10.0, 100.0 ...",
"pluralRule-count-other": "@integer 0, 5~19, 100, 1000, 10000 ..."
}
The syntax is described here:
https://unicode.org/reports/tr35/tr35-numbers.html#Plural_rules_syntax
There are up to 2 sets of rules for each locale, a cardinal set and an
ordinal set. The approach here is to generate a C++ function for each
set of rules. Each condition in the rules (e.g. "i = 1 and v = 0") is
transpiled to a C++ if-statement within its function. Then lookup tables
are generated to match locales to their generated functions.
NOTE: -Wno-parentheses-equality is added to the LibUnicodeData compile
flags because the generated plural rules have lots of extra parentheses
(because e.g. we need to selectively negate and combine rules). The code
to generate only exactly the right number of parentheses is quite hairy,
so this just tells the compiler to ignore the extras.
2022-07-07 09:44:17 -04:00
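As a rough illustration of the approach described above, the "cs" cardinal rules could transpile to something like the sketch below. The PluralCategory and Operands types and the s_cardinal_functions table are hypothetical stand-ins, not the real LibLocale definitions; the function name follows the "{form}_plurality_{locale}" pattern that generated_method_name() produces. The doubled parentheses mirror what generate_relation() emits and are the reason for the -Wno-parentheses-equality flag.

```cpp
#include <array>
#include <cstdint>

// Hypothetical stand-ins for the real category and operand types.
enum class PluralCategory { One, Few, Many, Other };

struct Operands {
    uint64_t i; // integer digits of the number
    uint64_t v; // number of visible fraction digits
};

// Sketch of a generated function for the "cs" cardinal rules.
// Each CLDR condition becomes one if-statement; "other" is the fallback.
PluralCategory cardinal_plurality_cs(Operands ops)
{
    // "i = 1 and v = 0"
    if (((ops.i == 1)) && ((ops.v == 0)))
        return PluralCategory::One;
    // "i = 2..4 and v = 0"
    if (((2 <= ops.i && ops.i <= 4)) && ((ops.v == 0)))
        return PluralCategory::Few;
    // "v != 0" is emitted as a negated equality.
    if (!((ops.v == 0)))
        return PluralCategory::Many;
    return PluralCategory::Other;
}

// Hypothetical lookup table mapping a locale index to its generated function.
using PluralityFunction = PluralCategory (*)(Operands);
static constexpr std::array<PluralityFunction, 1> s_cardinal_functions { cardinal_plurality_cs };
```

Under these assumptions, a caller would index the table by locale and pass the operands, e.g. s_cardinal_functions[0]({ 3, 0 }) selects the "few" category.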
/*
 * Copyright (c) 2022, Tim Flynn <trflynn89@serenityos.org>
 *
 * SPDX-License-Identifier: BSD-2-Clause
 */

#include "../LibUnicode/GeneratorUtil.h" // FIXME: Move this somewhere common.
#include <AK/DeprecatedString.h>
#include <AK/JsonObject.h>
#include <AK/JsonParser.h>
#include <AK/JsonValue.h>
#include <AK/LexicalPath.h>
#include <AK/SourceGenerator.h>
#include <AK/StringBuilder.h>
#include <AK/Variant.h>
#include <LibCore/ArgsParser.h>
#include <LibCore/DeprecatedFile.h>
#include <LibCore/Directory.h>
#include <LibLocale/PluralRules.h>

static DeprecatedString format_identifier(StringView owner, DeprecatedString identifier)
{
    identifier = identifier.replace("-"sv, "_"sv, ReplaceMode::All);

    if (all_of(identifier, is_ascii_digit))
        return DeprecatedString::formatted("{}_{}", owner[0], identifier);
    if (is_ascii_lower_alpha(identifier[0]))
        return DeprecatedString::formatted("{:c}{}", to_ascii_uppercase(identifier[0]), identifier.substring_view(1));
    return identifier;
}

struct Relation {
    using Range = Array<u32, 2>;
    using Comparator = Variant<u32, Range>;

    enum class Type {
        Equality,
        Inequality,
    };

    DeprecatedString const& modulus_variable_name() const
    {
        VERIFY(modulus.has_value());

        if (!cached_modulus_variable_name.has_value())
            cached_modulus_variable_name = DeprecatedString::formatted("mod_{}_{}", symbol, *modulus);

        return *cached_modulus_variable_name;
    }

    DeprecatedString const& exponential_variable_name() const
    {
        if (!cached_exponential_variable_name.has_value())
            cached_exponential_variable_name = DeprecatedString::formatted("exp_{}", symbol);

        return *cached_exponential_variable_name;
    }

    void generate_relation(SourceGenerator& generator) const
    {
        auto append_variable_name = [&]() {
            if (modulus.has_value())
                generator.append(modulus_variable_name());
            else if (symbol == 'e' || symbol == 'c')
                generator.append(exponential_variable_name());
            else
                generator.append(DeprecatedString::formatted("ops.{}", Locale::PluralOperands::symbol_to_variable_name(symbol)));
        };

        auto append_value = [&](u32 value) {
            append_variable_name();
            generator.append(" == "sv);
            generator.append(DeprecatedString::number(value));
        };

        auto append_range = [&](auto const& range) {
            // This check avoids generating "0 <= unsigned_value", which is always true.
            if (range[0] != 0 || Locale::PluralOperands::symbol_requires_floating_point_modulus(symbol)) {
                generator.append(DeprecatedString::formatted("{} <= ", range[0]));
                append_variable_name();
                generator.append(" && "sv);
            }

            append_variable_name();
            generator.append(DeprecatedString::formatted(" <= {}", range[1]));
        };

        if (type == Type::Inequality)
            generator.append("!"sv);

        generator.append("("sv);

        bool first = true;
        for (auto const& comparator : comparators) {
            generator.append(first ? "("sv : " || ("sv);

            comparator.visit(
                [&](u32 value) { append_value(value); },
                [&](Range const& range) { append_range(range); });

            generator.append(")"sv);
            first = false;
        }

        generator.append(")"sv);
    }

    void generate_precomputed_variables(SourceGenerator& generator, HashTable<DeprecatedString>& generated_variables) const
    {
        // FIXME: How do we handle the exponential symbols? They seem unused by ECMA-402.
        if (symbol == 'e' || symbol == 'c') {
            if (auto variable = exponential_variable_name(); !generated_variables.contains(variable)) {
                generated_variables.set(variable);
                generator.set("variable"sv, move(variable));

                generator.append(R"~~~(
    auto @variable@ = 0;)~~~");
            }
        }

        if (!modulus.has_value())
            return;

        auto variable = modulus_variable_name();
        if (generated_variables.contains(variable))
            return;

        generated_variables.set(variable);
        generator.set("variable"sv, move(variable));
        generator.set("operand"sv, Locale::PluralOperands::symbol_to_variable_name(symbol));
        generator.set("modulus"sv, DeprecatedString::number(*modulus));

        if (Locale::PluralOperands::symbol_requires_floating_point_modulus(symbol)) {
            generator.append(R"~~~(
    auto @variable@ = fmod(ops.@operand@, @modulus@);)~~~");
        } else {
            generator.append(R"~~~(
    auto @variable@ = ops.@operand@ % @modulus@;)~~~");
        }
    }

    Type type;
    char symbol { 0 };
    Optional<u32> modulus;
    Vector<Comparator> comparators;

private:
    mutable Optional<DeprecatedString> cached_modulus_variable_name;
    mutable Optional<DeprecatedString> cached_exponential_variable_name;
};

struct Condition {
    void generate_condition(SourceGenerator& generator) const
    {
        for (size_t i = 0; i < relations.size(); ++i) {
            if (i > 0)
                generator.append(" || "sv);

            auto const& conjunctions = relations[i];
            if (conjunctions.size() > 1)
                generator.append("("sv);

            for (size_t j = 0; j < conjunctions.size(); ++j) {
                if (j > 0)
                    generator.append(" && "sv);

                conjunctions[j].generate_relation(generator);
            }

            if (conjunctions.size() > 1)
                generator.append(")"sv);
        }
    }

    void generate_precomputed_variables(SourceGenerator& generator, HashTable<DeprecatedString>& generated_variables) const
    {
        for (auto const& conjunctions : relations) {
            for (auto const& relation : conjunctions)
                relation.generate_precomputed_variables(generator, generated_variables);
        }
    }

    Vector<Vector<Relation>> relations;
};

struct Range {
    DeprecatedString start;
    DeprecatedString end;
    DeprecatedString category;
};

using Conditions = HashMap<DeprecatedString, Condition>;
using Ranges = Vector<Range>;

struct LocaleData {
    static DeprecatedString generated_method_name(StringView form, StringView locale)
    {
        return DeprecatedString::formatted("{}_plurality_{}", form, format_identifier({}, locale));
    }

    Conditions& rules_for_form(StringView form)
    {
        if (form == "cardinal")
            return cardinal_rules;
        if (form == "ordinal")
            return ordinal_rules;
        VERIFY_NOT_REACHED();
    }

    Conditions cardinal_rules;
    Conditions ordinal_rules;
    Ranges plural_ranges;
};

struct CLDR {
    UniqueStringStorage unique_strings;
    HashMap<DeprecatedString, LocaleData> locales;
};

static Relation parse_relation(StringView relation)
{
    static constexpr auto equality_operator = " = "sv;
    static constexpr auto inequality_operator = " != "sv;
    static constexpr auto modulus_operator = " % "sv;
    static constexpr auto range_operator = ".."sv;
    static constexpr auto set_operator = ',';

    Relation parsed;

    StringView lhs;
    StringView rhs;

    if (auto index = relation.find(equality_operator); index.has_value()) {
        parsed.type = Relation::Type::Equality;
        lhs = relation.substring_view(0, *index);
        rhs = relation.substring_view(*index + equality_operator.length());
    } else if (auto index = relation.find(inequality_operator); index.has_value()) {
        parsed.type = Relation::Type::Inequality;
        lhs = relation.substring_view(0, *index);
        rhs = relation.substring_view(*index + inequality_operator.length());
    } else {
        VERIFY_NOT_REACHED();
    }

    if (auto index = lhs.find(modulus_operator); index.has_value()) {
        auto symbol = lhs.substring_view(0, *index);
        VERIFY(symbol.length() == 1);

        auto modulus = lhs.substring_view(*index + modulus_operator.length()).to_uint();
        VERIFY(modulus.has_value());

        parsed.symbol = symbol[0];
        parsed.modulus = move(modulus);
    } else {
        VERIFY(lhs.length() == 1);
        parsed.symbol = lhs[0];
    }

    rhs.for_each_split_view(set_operator, SplitBehavior::Nothing, [&](auto set) {
        if (auto index = set.find(range_operator); index.has_value()) {
            auto range_begin = set.substring_view(0, *index).to_uint();
            VERIFY(range_begin.has_value());

            auto range_end = set.substring_view(*index + range_operator.length()).to_uint();
            VERIFY(range_end.has_value());

            parsed.comparators.empend(Array { *range_begin, *range_end });
        } else {
            auto value = set.to_uint();
            VERIFY(value.has_value());

            parsed.comparators.empend(*value);
        }
    });

    return parsed;
}
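// Illustration only (not part of the generator): a minimal sketch of the same relation-splitting
// idea using only the C++ standard library. SketchRelation, sketch_to_uint and
// sketch_parse_relation are hypothetical names; single values are stored as {value, value} here,
// which is an assumption of this sketch rather than how Relation stores its comparators.

```cpp
#include <cassert>
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>
#include <utility>
#include <vector>

// Hypothetical stand-in for the generator's Relation type.
struct SketchRelation {
    bool is_inequality { false };
    char symbol { 0 };
    std::optional<unsigned> modulus;
    // Ranges are stored as {begin, end}; single values as {value, value}.
    std::vector<std::pair<unsigned, unsigned>> comparators;
};

// Parses an unsigned integer and asserts that the whole text was consumed.
static unsigned sketch_to_uint(std::string_view text)
{
    unsigned value = 0;
    auto result = std::from_chars(text.data(), text.data() + text.size(), value);
    assert(result.ec == std::errc {} && result.ptr == text.data() + text.size());
    (void)result;
    return value;
}

static SketchRelation sketch_parse_relation(std::string_view relation)
{
    SketchRelation parsed;

    // " = " (with its leading space) cannot match inside " != ", so checking equality first is safe.
    std::string_view op = " = ";
    auto index = relation.find(op);
    if (index == std::string_view::npos) {
        op = " != ";
        index = relation.find(op);
        parsed.is_inequality = true;
    }
    assert(index != std::string_view::npos);

    auto lhs = relation.substr(0, index);
    auto rhs = relation.substr(index + op.size());

    // An optional " % N" on the left-hand side becomes the modulus.
    constexpr std::string_view modulus_operator = " % ";
    if (auto mod = lhs.find(modulus_operator); mod != std::string_view::npos) {
        parsed.modulus = sketch_to_uint(lhs.substr(mod + modulus_operator.size()));
        lhs = lhs.substr(0, mod);
    }
    assert(lhs.size() == 1);
    parsed.symbol = lhs[0];

    // The right-hand side is a comma-separated set of values and ".." ranges.
    while (!rhs.empty()) {
        auto comma = rhs.find(',');
        auto item = rhs.substr(0, comma);
        rhs = comma == std::string_view::npos ? std::string_view {} : rhs.substr(comma + 1);

        if (auto dots = item.find(".."); dots != std::string_view::npos)
            parsed.comparators.emplace_back(sketch_to_uint(item.substr(0, dots)), sketch_to_uint(item.substr(dots + 2)));
        else
            parsed.comparators.emplace_back(sketch_to_uint(item), sketch_to_uint(item));
    }

    return parsed;
}
```

// For example, "n % 10 = 2..4,9" yields symbol 'n', modulus 10, and two comparators: the range
// 2..4 and the single value 9.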
// https://unicode.org/reports/tr35/tr35-numbers.html#Plural_rules_syntax
//
// A very simplified view of a plural rule is:
//
//    condition.* ([@integer|@decimal] sample)+
//
// The "sample" is a series of integer or decimal values that fit the specified condition. The
// condition may be one or more binary expressions, chained together with "and" or "or" operators.
static void parse_condition(StringView category, StringView rule, Conditions& rules)
{
    static constexpr auto other_category = "other"sv;
    static constexpr auto disjunction_keyword = " or "sv;
    static constexpr auto conjunction_keyword = " and "sv;

    // We don't need the examples in the generated code, so we can drop them here.
    auto example_index = rule.find('@');
    VERIFY(example_index.has_value());

    auto condition = rule.substring_view(0, *example_index).trim_whitespace();

    // Our implementation does not generate rules for the "other" category. We simply return "other"
    // for values that do not match any rules. This will need to be revisited if this VERIFY fails.
    if (condition.is_empty()) {
        VERIFY(category == other_category);
        return;
    }

    auto& relation_list = rules.ensure(category);

    // The grammar for a condition (i.e. a chain of relations) is:
    //
    //     condition     = and_condition ('or' and_condition)*
    //     and_condition = relation ('and' relation)*
    //
    // This affords some simplicity in that disjunctions are never embedded within a conjunction.
    condition.for_each_split_view(disjunction_keyword, SplitBehavior::Nothing, [&](auto disjunction) {
        Vector<Relation> conjunctions;

        disjunction.for_each_split_view(conjunction_keyword, SplitBehavior::Nothing, [&](auto relation) {
            conjunctions.append(parse_relation(relation));
        });

        relation_list.relations.append(move(conjunctions));
    });
}
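// Illustration only (not part of the generator): a minimal standard-library sketch of the
// two-level split described by the condition grammar, i.e. split on " or " first and then split
// each piece on " and ". sketch_split and sketch_split_condition are hypothetical names. Unlike
// parse_condition, this sketch tolerates a rule with no '@' sample section and only trims spaces.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <vector>

// Splits text on every occurrence of separator, keeping empty pieces.
static std::vector<std::string> sketch_split(std::string_view text, std::string_view separator)
{
    std::vector<std::string> parts;
    while (true) {
        auto index = text.find(separator);
        parts.emplace_back(text.substr(0, index));
        if (index == std::string_view::npos)
            break;
        text.remove_prefix(index + separator.size());
    }
    return parts;
}

// Returns a list of disjunctions, each of which is a list of relation strings that must all hold.
static std::vector<std::vector<std::string>> sketch_split_condition(std::string_view rule)
{
    // Drop the "@integer ..." / "@decimal ..." samples, then trim surrounding spaces.
    auto condition = rule.substr(0, rule.find('@'));
    while (!condition.empty() && condition.front() == ' ')
        condition.remove_prefix(1);
    while (!condition.empty() && condition.back() == ' ')
        condition.remove_suffix(1);

    std::vector<std::vector<std::string>> disjunctions;

    // An empty condition corresponds to the "other" category, which has no relations.
    if (condition.empty())
        return disjunctions;

    for (auto const& disjunction : sketch_split(condition, " or "))
        disjunctions.push_back(sketch_split(disjunction, " and "));

    return disjunctions;
}
```

// Because "or" binds more loosely than "and" in the grammar, the outer list is the disjunction
// and each inner list is a conjunction; no further nesting is needed.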
static ErrorOr<void> parse_plural_rules(DeprecatedString core_supplemental_path, StringView file_name, CLDR& cldr)
{
    static constexpr auto form_prefix = "plurals-type-"sv;
    static constexpr auto rule_prefix = "pluralRule-count-"sv;

    LexicalPath plurals_path(move(core_supplemental_path));
    plurals_path = plurals_path.append(file_name);

    auto plurals = TRY(read_json_file(plurals_path.string()));
    auto const& supplemental_object = plurals.as_object().get_object("supplemental"sv).value();
    supplemental_object.for_each_member([&](auto const& key, auto const& plurals_object) {
        if (!key.starts_with(form_prefix))
            return;

        auto form = key.substring_view(form_prefix.length());

        plurals_object.as_object().for_each_member([&](auto const& loc, auto const& rules) {
            auto locale = cldr.locales.get(loc);
            if (!locale.has_value())
                return;

            rules.as_object().for_each_member([&](auto const& key, auto const& condition) {
                VERIFY(key.starts_with(rule_prefix));

                auto category = key.substring_view(rule_prefix.length());
                parse_condition(category, condition.as_string(), locale->rules_for_form(form));
            });
        });
    });

    return {};
}
// https://unicode.org/reports/tr35/tr35-numbers.html#Plural_Ranges
static ErrorOr<void> parse_plural_ranges(DeprecatedString core_supplemental_path, CLDR& cldr)
{
    static constexpr auto start_segment = "-start-"sv;
    static constexpr auto end_segment = "-end-"sv;

    LexicalPath plural_ranges_path(move(core_supplemental_path));
    plural_ranges_path = plural_ranges_path.append("pluralRanges.json"sv);

    auto plural_ranges = TRY(read_json_file(plural_ranges_path.string()));
    auto const& supplemental_object = plural_ranges.as_object().get_object("supplemental"sv).value();
    auto const& plurals_object = supplemental_object.get_object("plurals"sv).value();

    plurals_object.for_each_member([&](auto const& loc, auto const& ranges_object) {
        auto locale = cldr.locales.get(loc);
        if (!locale.has_value())
            return;

        ranges_object.as_object().for_each_member([&](auto const& range, auto const& category) {
            auto start_index = range.find(start_segment);
            VERIFY(start_index.has_value());

            auto end_index = range.find(end_segment);
            VERIFY(end_index.has_value());

            *start_index += start_segment.length();

            auto start = range.substring(*start_index, *end_index - *start_index);
            auto end = range.substring(*end_index + end_segment.length());

            locale->plural_ranges.empend(move(start), move(end), category.as_string());
        });
    });

    return {};
}
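// Illustration only (not part of the generator): a minimal standard-library sketch of how a
// plural range key is cut into its start and end categories around the "-start-" and "-end-"
// segments. sketch_split_range_key is a hypothetical name, and the example key shape
// "pluralRange-start-one-end-few" is an assumption about the CLDR JSON layout.

```cpp
#include <cassert>
#include <string_view>
#include <utility>

// Returns {start category, end category} for a key containing "-start-<start>-end-<end>".
static std::pair<std::string_view, std::string_view> sketch_split_range_key(std::string_view key)
{
    constexpr std::string_view start_segment = "-start-";
    constexpr std::string_view end_segment = "-end-";

    auto start_index = key.find(start_segment);
    assert(start_index != std::string_view::npos);

    auto end_index = key.find(end_segment);
    assert(end_index != std::string_view::npos);

    // Skip past "-start-" so that start_index points at the first character of the start category.
    start_index += start_segment.size();

    auto start = key.substr(start_index, end_index - start_index);
    auto end = key.substr(end_index + end_segment.size());

    return { start, end };
}
```

// The start category is whatever lies between the two segments, and the end category is
// everything after "-end-".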
static ErrorOr<void> parse_all_locales(DeprecatedString core_path, DeprecatedString locale_names_path, CLDR& cldr)
{
    LexicalPath core_supplemental_path(move(core_path));
    core_supplemental_path = core_supplemental_path.append("supplemental"sv);
    VERIFY(Core::DeprecatedFile::is_directory(core_supplemental_path.string()));
    auto remove_variants_from_path = [&](DeprecatedString path) -> ErrorOr<DeprecatedString> {
        auto parsed_locale = TRY(CanonicalLanguageID::parse(cldr.unique_strings, LexicalPath::basename(path)));
        StringBuilder builder;
        builder.append(cldr.unique_strings.get(parsed_locale.language));
        if (auto script = cldr.unique_strings.get(parsed_locale.script); !script.is_empty())
            builder.appendff("-{}", script);
        if (auto region = cldr.unique_strings.get(parsed_locale.region); !region.is_empty())
            builder.appendff("-{}", region);

        return builder.to_deprecated_string();
    };

    TRY(Core::Directory::for_each_entry(TRY(String::formatted("{}/main", locale_names_path)), Core::DirIterator::SkipParentAndBaseDir, [&](auto& entry, auto& directory) -> ErrorOr<IterationDecision> {
        auto locale_path = LexicalPath::join(directory.path().string(), entry.name).string();
        auto language = TRY(remove_variants_from_path(locale_path));

        cldr.locales.ensure(language);
        return IterationDecision::Continue;
    }));
    TRY(parse_plural_rules(core_supplemental_path.string(), "plurals.json"sv, cldr));
    TRY(parse_plural_rules(core_supplemental_path.string(), "ordinals.json"sv, cldr));
    TRY(parse_plural_ranges(core_supplemental_path.string(), cldr));
    return {};
}
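// The plural ranges parsed above are later emitted as one generated function
// per locale. The sketch below shows roughly what such a function could look
// like. Everything in it is an assumption made for illustration: the enum is a
// stand-in for the real PluralCategory, "xx" is a made-up locale, the
// (start, end) pairs are not real CLDR data, and the function name
// range_plurality_xx is hypothetical (the real name comes from
// LocaleData::generated_method_name).

```cpp
#include <cassert>

// Stand-in for the real PluralCategory enum (assumed set of values).
enum class PluralCategory { Other, Zero, One, Two, Few, Many };

// Fallback for locales without range data: the range takes the end category.
static PluralCategory default_range(PluralCategory, PluralCategory end)
{
    return end;
}

// Hypothetical generated range function for the made-up locale "xx".
// Each parsed range becomes one if-statement; unmatched pairs fall back to end.
static PluralCategory range_plurality_xx(PluralCategory start, PluralCategory end)
{
    if (start == PluralCategory::One && end == PluralCategory::Few)
        return PluralCategory::Few;
    if (start == PluralCategory::Few && end == PluralCategory::One)
        return PluralCategory::Other;
    return end;
}
```

// Calling range_plurality_xx(PluralCategory::Few, PluralCategory::One) yields
// PluralCategory::Other in this sketch, while any pair without an explicit
// entry yields its end category, the same behavior as default_range.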
static ErrorOr<void> generate_unicode_locale_header(Core::BufferedFile& file, CLDR&)
{
    StringBuilder builder;
    SourceGenerator generator { builder };
    generator.append(R"~~~(
#include <AK/Types.h>
#pragma once
namespace Locale {
)~~~");
    generator.append(R"~~~(
}
)~~~");
    TRY(file.write_until_depleted(generator.as_string_view().bytes()));
    return {};
}
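// The implementation generator that follows emits one function per locale and
// rule set, plus a lookup table of function pointers. The sketch below shows
// roughly what that output could look like for the "cs" cardinal rules quoted
// in the commit message ("i = 1 and v = 0", "i = 2..4 and v = 0", "v != 0").
// The types, the operand field names, the function name cardinal_plurality_cs
// and the two-entry table are assumptions for illustration, not the real
// LibLocale definitions or the generator's exact output.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Stand-ins for the real LibLocale types (assumed names and fields).
enum class PluralCategory { Other, Zero, One, Two, Few, Many };

struct PluralOperands {
    std::uint64_t integer_digits;            // "i" in the CLDR rule syntax
    std::uint64_t number_of_fraction_digits; // "v" in the CLDR rule syntax
};

using PluralCategoryFunction = PluralCategory (*)(PluralOperands);

// Fallback for locales that have no rules for a given form.
static PluralCategory default_category(PluralOperands)
{
    return PluralCategory::Other;
}

// Hypothetical generated function: each CLDR condition becomes one if-statement.
static PluralCategory cardinal_plurality_cs(PluralOperands ops)
{
    if ((ops.integer_digits == 1) && (ops.number_of_fraction_digits == 0))
        return PluralCategory::One;
    if ((ops.integer_digits >= 2 && ops.integer_digits <= 4) && (ops.number_of_fraction_digits == 0))
        return PluralCategory::Few;
    if (ops.number_of_fraction_digits != 0)
        return PluralCategory::Many;
    return PluralCategory::Other;
}

// One slot per locale in sorted order; slot 0 stands for a locale without rules.
static constexpr std::array<PluralCategoryFunction, 2> s_cardinal_functions { {
    default_category,
    cardinal_plurality_cs,
} };
```

// With this table, s_cardinal_functions[1]({ 3, 0 }) selects the "cs" function
// and returns PluralCategory::Few, while slot 0 always returns
// PluralCategory::Other.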
static ErrorOr<void> generate_unicode_locale_implementation(Core::BufferedFile& file, CLDR& cldr)
{
    StringBuilder builder;
    SourceGenerator generator { builder };
    auto locales = cldr.locales.keys();
    quick_sort(locales);
    generator.append(R"~~~(
#include <AK/Array.h>
#include <LibLocale/Locale.h>
#include <LibLocale/LocaleData.h>
#include <LibLocale/PluralRules.h>
#include <LibLocale/PluralRulesData.h>
#include <math.h>
namespace Locale {
using PluralCategoryFunction = PluralCategory(*)(PluralOperands);
using PluralRangeFunction = PluralCategory(*)(PluralCategory, PluralCategory);
static PluralCategory default_category(PluralOperands)
{
    return PluralCategory::Other;
}
static PluralCategory default_range(PluralCategory, PluralCategory end)
{
    return end;
}
)~~~");
    auto append_rules = [&](auto form, auto const& locale, auto const& rules) {
        if (rules.is_empty())
            return;
        generator.set("method"sv, LocaleData::generated_method_name(form, locale));
        HashTable<DeprecatedString> generated_variables;
        generator.append(R"~~~(
static PluralCategory @method@([[maybe_unused]] PluralOperands ops)
{)~~~");
        for (auto [category, condition] : rules) {
            condition.generate_precomputed_variables(generator, generated_variables);
            generator.append(R"~~~(
    if ()~~~");
            generator.set("category"sv, format_identifier({}, category));
            condition.generate_condition(generator);
            generator.append(R"~~~()
        return PluralCategory::@category@;)~~~");
        }
        generator.append(R"~~~(
    return PluralCategory::Other;
}
)~~~");
    };
    auto append_ranges = [&](auto const& locale, auto const& ranges) {
        if (ranges.is_empty())
            return;
        generator.set("method"sv, LocaleData::generated_method_name("range"sv, locale));
        generator.append(R"~~~(
static PluralCategory @method@(PluralCategory start, PluralCategory end)
{)~~~");
        for (auto const& range : ranges) {
            generator.set("start"sv, format_identifier({}, range.start));
            generator.set("end"sv, format_identifier({}, range.end));
            generator.set("category"sv, format_identifier({}, range.category));
            generator.append(R"~~~(
    if (start == PluralCategory::@start@ && end == PluralCategory::@end@)
        return PluralCategory::@category@;)~~~");
        }
        generator.append(R"~~~(
    return end;
}
)~~~");
    };
    auto append_lookup_table = [&](auto type, auto form, auto default_, auto data_for_locale) {
        generator.set("type"sv, type);
        generator.set("form"sv, form);
        generator.set("default"sv, default_);
        generator.set("size"sv, DeprecatedString::number(locales.size()));
        generator.append(R"~~~(
static constexpr Array<@type@, @size@> s_@form@_functions { {)~~~");
        for (auto const& locale : locales) {
            auto& rules = data_for_locale(cldr.locales.find(locale)->value, form);
            if (rules.is_empty()) {
                generator.append(R"~~~(
    @default@,)~~~");
            } else {
                generator.set("method"sv, LocaleData::generated_method_name(form, locale));
                generator.append(R"~~~(
    @method@,)~~~");
            }
        }
        generator.append(R"~~~(
} };
)~~~");
    };
    auto append_categories = [&](auto const& name, auto const& rules) {
        generator.set("name", name);
        generator.set("size", DeprecatedString::number(rules.size() + 1));
        generator.append(R"~~~(
static constexpr Array<PluralCategory, @size@> @name@ { { PluralCategory::Other)~~~");
        for (auto [category, condition] : rules) {
            generator.set("category"sv, format_identifier({}, category));
            generator.append(", PluralCategory::@category@"sv);
        }
        generator.append("} };");
    };
    for (auto [locale, rules] : cldr.locales) {
        append_rules("cardinal"sv, locale, rules.cardinal_rules);
        append_rules("ordinal"sv, locale, rules.ordinal_rules);
        append_ranges(locale, rules.plural_ranges);
    }
    append_lookup_table("PluralCategoryFunction"sv, "cardinal"sv, "default_category"sv, [](auto& rules, auto form) -> Conditions& { return rules.rules_for_form(form); });
    append_lookup_table("PluralCategoryFunction"sv, "ordinal"sv, "default_category"sv, [](auto& rules, auto form) -> Conditions& { return rules.rules_for_form(form); });
    append_lookup_table("PluralRangeFunction"sv, "range"sv, "default_range"sv, [](auto& rules, auto) -> Ranges& { return rules.plural_ranges; });
    generate_mapping(generator, locales, "PluralCategory"sv, "s_cardinal_categories"sv, "s_cardinal_categories_{}"sv, format_identifier,
        [&](auto const& name, auto const& locale) {
            auto& rules = cldr.locales.find(locale)->value;
            append_categories(name, rules.rules_for_form("cardinal"sv));
        });

    generate_mapping(generator, locales, "PluralCategory"sv, "s_ordinal_categories"sv, "s_ordinal_categories_{}"sv, format_identifier,
        [&](auto const& name, auto const& locale) {
            auto& rules = cldr.locales.find(locale)->value;
            append_categories(name, rules.rules_for_form("ordinal"sv));
        });
    generator.append(R"~~~(
PluralCategory determine_plural_category(StringView locale, PluralForm form, PluralOperands operands)
{
    auto locale_value = locale_from_string(locale);
    if (!locale_value.has_value())
        return PluralCategory::Other;

    auto locale_index = to_underlying(*locale_value) - 1; // Subtract 1 because 0 == Locale::None.
    PluralCategoryFunction decider { nullptr };

    switch (form) {
    case PluralForm::Cardinal:
        decider = s_cardinal_functions[locale_index];
        break;
    case PluralForm::Ordinal:
        decider = s_ordinal_functions[locale_index];
        break;
    }

    return decider(move(operands));
}

ReadonlySpan<PluralCategory> available_plural_categories(StringView locale, PluralForm form)
{
    auto locale_value = locale_from_string(locale);
    if (!locale_value.has_value())
        return {};

    auto locale_index = to_underlying(*locale_value) - 1; // Subtract 1 because 0 == Locale::None.

    switch (form) {
    case PluralForm::Cardinal:
        return s_cardinal_categories[locale_index];
    case PluralForm::Ordinal:
        return s_ordinal_categories[locale_index];
    }

    VERIFY_NOT_REACHED();
}

PluralCategory determine_plural_range(StringView locale, PluralCategory start, PluralCategory end)
{
    auto locale_value = locale_from_string(locale);
    if (!locale_value.has_value())
        return PluralCategory::Other;

    auto locale_index = to_underlying(*locale_value) - 1; // Subtract 1 because 0 == Locale::None.

    PluralRangeFunction decider = s_range_functions[locale_index];
    return decider(start, end);
}

}
)~~~");

    TRY(file.write_until_depleted(generator.as_string_view().bytes()));
    return {};
}

ErrorOr<int> serenity_main(Main::Arguments arguments)
{
    StringView generated_header_path;
    StringView generated_implementation_path;
    StringView core_path;
    StringView locale_names_path;

    Core::ArgsParser args_parser;
    args_parser.add_option(generated_header_path, "Path to the Unicode locale header file to generate", "generated-header-path", 'h', "generated-header-path");
    args_parser.add_option(generated_implementation_path, "Path to the Unicode locale implementation file to generate", "generated-implementation-path", 'c', "generated-implementation-path");
    args_parser.add_option(core_path, "Path to cldr-core directory", "core-path", 'r', "core-path");
    args_parser.add_option(locale_names_path, "Path to cldr-localenames directory", "locale-names-path", 'l', "locale-names-path");
    args_parser.parse(arguments);

    auto generated_header_file = TRY(open_file(generated_header_path, Core::File::OpenMode::Write));
    auto generated_implementation_file = TRY(open_file(generated_implementation_path, Core::File::OpenMode::Write));

    CLDR cldr;
    TRY(parse_all_locales(core_path, locale_names_path, cldr));

    TRY(generate_unicode_locale_header(*generated_header_file, cldr));
    TRY(generate_unicode_locale_implementation(*generated_implementation_file, cldr));

    return 0;
}