[3.14] gh-87281: Improve documentation for locale.setlocale() and locale.getlocale() (GH-137313) (#137722)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Miss Islington (bot) 2025-09-11 11:16:18 +02:00 committed by GitHub
parent 4b093e1796
commit 374b242efa

@@ -34,12 +34,17 @@ The :mod:`locale` module defines the following exception and functions:
If *locale* is given and not ``None``, :func:`setlocale` modifies the locale
setting for the *category*. The available categories are listed in the data
description below. *locale* may be a string, or an iterable of two strings
(language code and encoding). If it's an iterable, it's converted to a locale
name using the locale aliasing engine. An empty string specifies the user's
description below. *locale* may be a :ref:`string <locale_name>`, or a pair,
language code and encoding. An empty string specifies the user's
default settings. If the modification of the locale fails, the exception
:exc:`Error` is raised. If successful, the new locale setting is returned.
If *locale* is a pair, it is converted to a locale name using
the locale aliasing engine.
The language code has the same format as a :ref:`locale name <locale_name>`,
but without encoding and ``@``-modifier.
The language code and encoding can be ``None``.
If *locale* is omitted or ``None``, the current setting for *category* is
returned.
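
As a rough illustration of the behaviour described above, the following sketch queries the current setting, switches to the "C" locale, tries a (language code, encoding) pair, and then restores the original setting. The helper name ``try_setlocale`` and the pair ``("en_US", "UTF-8")`` are assumptions made for this example; whether that locale is available depends on the system configuration.

```python
import locale


def try_setlocale(category, loc):
    """Hypothetical helper: return the new setting, or None on failure."""
    try:
        return locale.setlocale(category, loc)
    except locale.Error:
        return None


# Omitting *locale* (or passing None) only queries the current setting.
saved = locale.setlocale(locale.LC_CTYPE)

try:
    # A plain string locale name.
    as_string = try_setlocale(locale.LC_CTYPE, "C")

    # A (language code, encoding) pair, converted to a locale name by the
    # locale aliasing engine. This locale may not be installed, in which
    # case the helper returns None instead of raising locale.Error.
    as_pair = try_setlocale(locale.LC_CTYPE, ("en_US", "UTF-8"))
finally:
    # Restore the setting that was active before the experiment.
    locale.setlocale(locale.LC_CTYPE, saved)
```

Catching :exc:`Error` in one place keeps the calling code free of platform checks, since the set of installed locales is not known in advance.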
@@ -345,22 +350,26 @@ The :mod:`locale` module defines the following exception and functions:
``'LANG'``. The GNU gettext search path contains ``'LC_ALL'``,
``'LC_CTYPE'``, ``'LANG'`` and ``'LANGUAGE'``, in that order.
Except for the code ``'C'``, the language code corresponds to :rfc:`1766`.
*language code* and *encoding* may be ``None`` if their values cannot be
The language code has the same format as a :ref:`locale name <locale_name>`,
but without encoding and ``@``-modifier.
The language code and encoding may be ``None`` if their values cannot be
determined.
The "C" locale is represented as ``(None, None)``.
.. deprecated-removed:: 3.11 3.15
.. function:: getlocale(category=LC_CTYPE)
Returns the current setting for the given locale category as sequence containing
*language code*, *encoding*. *category* may be one of the :const:`!LC_\*` values
except :const:`LC_ALL`. It defaults to :const:`LC_CTYPE`.
Returns the current setting for the given locale category as a tuple containing
the language code and encoding. *category* may be one of the :const:`!LC_\*`
values except :const:`LC_ALL`. It defaults to :const:`LC_CTYPE`.
Except for the code ``'C'``, the language code corresponds to :rfc:`1766`.
*language code* and *encoding* may be ``None`` if their values cannot be
The language code has the same format as a :ref:`locale name <locale_name>`,
but without encoding and ``@``-modifier.
The language code and encoding may be ``None`` if their values cannot be
determined.
The "C" locale is represented as ``(None, None)``.
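
A minimal sketch of the return value described above, assuming the process is free to change ``LC_CTYPE`` temporarily. It switches to the "C" locale, reads the setting back with :func:`getlocale`, and restores the previous setting afterwards. The variable names are illustrative only.

```python
import locale

# Remember the current setting so it can be restored afterwards.
saved = locale.setlocale(locale.LC_CTYPE)

try:
    locale.setlocale(locale.LC_CTYPE, "C")

    # getlocale() defaults to LC_CTYPE and returns a
    # (language code, encoding) tuple.
    c_setting = locale.getlocale()
    language_code, encoding = c_setting
finally:
    locale.setlocale(locale.LC_CTYPE, saved)

# For the "C" locale both elements are None.
print(c_setting)
```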
.. function:: getpreferredencoding(do_setlocale=True)
@@ -615,6 +624,61 @@ whose high bit is set (i.e., non-ASCII bytes) are never converted or considered
part of a character class such as letter or whitespace.
.. _locale_name:
Locale names
------------
The format of the locale name is platform dependent, and the set of supported
locales can depend on the system configuration.
On Posix platforms, it usually has the format [1]_:
.. productionlist:: locale_name
: language ["_" territory] ["." charset] ["@" modifier]
where *language* is a two- or three-letter language code from `ISO 639`_,
*territory* is a two-letter country or region code from `ISO 3166`_,
*charset* is a locale encoding, and *modifier* is a script name,
a language subtag, a sort order identifier, or other locale modifier
(for example, "latin", "valencia", "stroke" and "euro").
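
The POSIX production above can be illustrated with a small parser. This is only a sketch of the usual format, not part of the :mod:`locale` API: the names ``PosixLocaleName`` and ``split_posix_locale_name`` are hypothetical, and real platforms may accept names that do not follow this pattern.

```python
import re
from typing import NamedTuple, Optional


class PosixLocaleName(NamedTuple):
    """Hypothetical container for the parts of a POSIX-style locale name."""
    language: str
    territory: Optional[str]
    charset: Optional[str]
    modifier: Optional[str]


# language ["_" territory] ["." charset] ["@" modifier]
_POSIX_NAME = re.compile(
    r"(?P<language>[^_.@]+)"
    r"(?:_(?P<territory>[^.@]+))?"
    r"(?:\.(?P<charset>[^@]+))?"
    r"(?:@(?P<modifier>.+))?"
)


def split_posix_locale_name(name):
    """Split *name* into its parts; raise ValueError if it does not match."""
    match = _POSIX_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"not a POSIX-style locale name: {name!r}")
    return PosixLocaleName(**match.groupdict())


parts = split_posix_locale_name("ca_ES.UTF-8@valencia")
print(parts.language, parts.territory, parts.charset, parts.modifier)
```

Optional parts that are absent come back as ``None``, so ``split_posix_locale_name("sr@latin")`` has a modifier but no territory or charset.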
On Windows, several formats are supported. [2]_ [3]_
A subset of `IETF BCP 47`_ tags:
.. productionlist:: locale_name
: language ["-" script] ["-" territory] ["." charset]
: language ["-" script] "-" territory "-" modifier
where *language* and *territory* have the same meaning as in Posix,
*script* is a four-letter script code from `ISO 15924`_,
and *modifier* is a language subtag, a sort order identifier
or custom modifier (for example, "valencia", "stroke" or "x-python").
Both hyphen (``'-'``) and underscore (``'_'``) separators are supported.
Only UTF-8 encoding is allowed for BCP 47 tags.
Windows also supports locale names in the format:
.. productionlist:: locale_name
: language ["_" territory] ["." charset]
where *language* and *territory* are full names, such as "English" and
"United States", and *charset* is either a code page number (for example, "1252")
or UTF-8.
Only the underscore separator is supported in this format.
The "C" locale is supported on all platforms.
.. _ISO 639: https://www.iso.org/iso-639-language-code
.. _ISO 3166: https://www.iso.org/iso-3166-country-codes.html
.. _IETF BCP 47: https://www.rfc-editor.org/info/bcp47
.. _ISO 15924: https://www.unicode.org/iso15924/
.. [1] `IEEE Std 1003.1-2024; 8.2 Internationalization Variables <https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap08.html#tag_08_02>`_
.. [2] `UCRT Locale names, Languages, and Country/Region strings <https://learn.microsoft.com/en-us/cpp/c-runtime-library/locale-names-languages-and-country-region-strings>`_
.. [3] `Locale Names <https://learn.microsoft.com/en-us/windows/win32/intl/locale-names>`_
.. _embedding-locale:
For extension writers and programs that embed Python