| 
									
										
											  
											
												LibUnicode: Parse and generate per-locale plural rules from the CLDR
Plural rules in the CLDR are of the form:
"cs": {
    "pluralRule-count-one": "i = 1 and v = 0 @integer 1",
    "pluralRule-count-few": "i = 2..4 and v = 0 @integer 2~4",
    "pluralRule-count-many": "v != 0 @decimal 0.0~1.5, 10.0, 100.0 ...",
    "pluralRule-count-other": "@integer 0, 5~19, 100, 1000, 10000 ..."
}
The syntax is described here:
https://unicode.org/reports/tr35/tr35-numbers.html#Plural_rules_syntax
There are up to 2 sets of rules for each locale, a cardinal set and an
ordinal set. The approach here is to generate a C++ function for each
set of rules. Each condition in the rules (e.g. "i = 1 and v = 0") is
transpiled to a C++ if-statement within its function. Then lookup tables
are generated to match locales to their generated functions.
NOTE: -Wno-parentheses-equality is added to the LibUnicodeData compile
flags because the generated plural rules have lots of extra parentheses
(because e.g. we need to selectively negate and combine rules). The code
to generate only exactly the right number of parentheses is quite hairy,
so this just tells the compiler to ignore the extras.
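
As a rough illustration of the approach described above, the "cs" cardinal rules could be transpiled into something like the following. This is a minimal sketch, not the generator's actual output: the names PluralOperands, cs_cardinal, and lookup_cardinal are hypothetical, the enum is a trimmed-down stand-in for the real PluralCategory, and the lookup is a plain if-chain rather than a generated table.

```cpp
#include <cstdint>
#include <string_view>

// Trimmed-down stand-in for the real enum (assumption for this sketch).
enum class PluralCategory { Other, One, Few, Many };

// Hypothetical operand bundle, following the TR35 operand names:
// i = integer digits of the number, v = count of visible fraction digits.
struct PluralOperands {
    uint64_t i;
    uint64_t v;
};

// Each CLDR condition becomes one if-statement; "other" is the fallback.
static PluralCategory cs_cardinal(PluralOperands const& ops)
{
    // "i = 1 and v = 0"
    if ((ops.i == 1) && (ops.v == 0))
        return PluralCategory::One;
    // "i = 2..4 and v = 0"
    if (((ops.i >= 2) && (ops.i <= 4)) && (ops.v == 0))
        return PluralCategory::Few;
    // "v != 0"
    if (ops.v != 0)
        return PluralCategory::Many;
    return PluralCategory::Other;
}

using PluralFunction = PluralCategory (*)(PluralOperands const&);

// Hypothetical lookup from locale to its generated function.
static PluralFunction lookup_cardinal(std::string_view locale)
{
    if (locale == "cs")
        return cs_cardinal;
    return nullptr;
}
```

Under these assumptions, the value 3 (i = 3, v = 0) would fall into "few" and 1.5 (i = 1, v = 1) into "many", matching the sample values listed in the CLDR data.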
											
										 
											2022-07-07 09:44:17 -04:00
										 |  |  | /*
 | 
					
						
							| 
									
										
										
										
											2025-07-23 15:25:14 -04:00
										 |  |  |  * Copyright (c) 2022-2025, Tim Flynn <trflynn89@ladybird.org> | 
					
						
							| 
									
										
											  
											
										 |  |  |  * | 
					
						
							|  |  |  |  * SPDX-License-Identifier: BSD-2-Clause | 
					
						
							|  |  |  |  */ | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2024-06-23 09:14:27 -04:00
										 |  |  | #include <LibUnicode/PluralRules.h>
 | 
					
						
							| 
									
										
											  
											
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2024-06-23 09:14:27 -04:00
										 |  |  | namespace Unicode { | 
					
						
							| 
									
										
											  
											
										 |  |  | 
 | 
					
						
							|  |  |  | PluralForm plural_form_from_string(StringView plural_form) | 
					
						
							|  |  |  | { | 
					
						
							|  |  |  |     if (plural_form == "cardinal"sv) | 
					
						
							|  |  |  |         return PluralForm::Cardinal; | 
					
						
							|  |  |  |     if (plural_form == "ordinal"sv) | 
					
						
							|  |  |  |         return PluralForm::Ordinal; | 
					
						
							|  |  |  |     VERIFY_NOT_REACHED(); | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | StringView plural_form_to_string(PluralForm plural_form) | 
					
						
							|  |  |  | { | 
					
						
							|  |  |  |     switch (plural_form) { | 
					
						
							|  |  |  |     case PluralForm::Cardinal: | 
					
						
							|  |  |  |         return "cardinal"sv; | 
					
						
							|  |  |  |     case PluralForm::Ordinal: | 
					
						
							|  |  |  |         return "ordinal"sv; | 
					
						
							|  |  |  |     } | 
					
						
							| 
									
										
										
										
											2024-06-14 14:15:28 -04:00
										 |  |  |     VERIFY_NOT_REACHED(); | 
					
						
							| 
									
										
											  
											
										 |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2025-07-23 15:25:14 -04:00
										 |  |  | PluralCategory plural_category_from_string(Utf16View const& category) | 
					
						
							| 
									
										
										
										
											2022-07-07 12:05:05 -04:00
										 |  |  | { | 
					
						
							| 
									
										
										
										
											2024-06-14 14:15:28 -04:00
										 |  |  |     if (category == "other"sv) | 
					
						
							|  |  |  |         return PluralCategory::Other; | 
					
						
							|  |  |  |     if (category == "zero"sv) | 
					
						
							|  |  |  |         return PluralCategory::Zero; | 
					
						
							|  |  |  |     if (category == "one"sv) | 
					
						
							|  |  |  |         return PluralCategory::One; | 
					
						
							|  |  |  |     if (category == "two"sv) | 
					
						
							|  |  |  |         return PluralCategory::Two; | 
					
						
							|  |  |  |     if (category == "few"sv) | 
					
						
							|  |  |  |         return PluralCategory::Few; | 
					
						
							|  |  |  |     if (category == "many"sv) | 
					
						
							|  |  |  |         return PluralCategory::Many; | 
					
						
							|  |  |  |     if (category == "0"sv) | 
					
						
							|  |  |  |         return PluralCategory::ExactlyZero; | 
					
						
							|  |  |  |     if (category == "1"sv) | 
					
						
							|  |  |  |         return PluralCategory::ExactlyOne; | 
					
						
							|  |  |  |     VERIFY_NOT_REACHED(); | 
					
						
							| 
									
										
										
										
											2022-07-07 12:05:05 -04:00
										 |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2025-07-23 15:25:14 -04:00
										 |  |  | Utf16View plural_category_to_string(PluralCategory category) | 
					
						
							| 
									
										
										
										
											2022-07-11 10:58:48 -04:00
										 |  |  | { | 
					
						
							| 
									
										
										
										
											2024-06-14 14:15:28 -04:00
										 |  |  |     switch (category) { | 
					
						
							|  |  |  |     case PluralCategory::Other: | 
					
						
							|  |  |  |         return "other"sv; | 
					
						
							|  |  |  |     case PluralCategory::Zero: | 
					
						
							|  |  |  |         return "zero"sv; | 
					
						
							|  |  |  |     case PluralCategory::One: | 
					
						
							|  |  |  |         return "one"sv; | 
					
						
							|  |  |  |     case PluralCategory::Two: | 
					
						
							|  |  |  |         return "two"sv; | 
					
						
							|  |  |  |     case PluralCategory::Few: | 
					
						
							|  |  |  |         return "few"sv; | 
					
						
							|  |  |  |     case PluralCategory::Many: | 
					
						
							|  |  |  |         return "many"sv; | 
					
						
							|  |  |  |     case PluralCategory::ExactlyZero: | 
					
						
							|  |  |  |         return "0"sv; | 
					
						
							|  |  |  |     case PluralCategory::ExactlyOne: | 
					
						
							|  |  |  |         return "1"sv; | 
					
						
							|  |  |  |     } | 
					
						
							|  |  |  |     VERIFY_NOT_REACHED(); | 
					
						
							| 
									
										
										
										
											2022-07-11 10:58:48 -04:00
										 |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
											  
											
										 |  |  | } |