mirror of
				https://github.com/python/cpython.git
				synced 2025-10-30 21:21:22 +00:00 
			
		
		
		
	
		
			
				
	
	
		
			277 lines
		
	
	
	
		
			12 KiB
		
	
	
	
		
			TeX
		
	
	
	
	
	
			
		
		
	
	
			277 lines
		
	
	
	
		
			12 KiB
		
	
	
	
		
			TeX
		
	
	
	
	
	
| \section{\module{locale} ---
 | |
|          Internationalization services.}
 | |
| \declaremodule{standard}{locale}
 | |
| 
 | |
| 
 | |
| \modulesynopsis{Internationalization services.}
 | |
| 
 | |
| 
 | |
| The \code{locale} module opens access to the \POSIX{} locale database
 | |
| and functionality. The \POSIX{} locale mechanism allows programmers
 | |
| to deal with certain cultural issues in an application, without
 | |
| requiring the programmer to know all the specifics of each country
 | |
| where the software is executed.
 | |
| 
 | |
| The \module{locale} module is implemented on top of the
 | |
| \module{_locale}\refbimodindex{_locale} module, which in turn uses an
 | |
| ANSI \C{} locale implementation if available.
 | |
| 
 | |
| The \module{locale} module defines the following exception and
 | |
| functions:
 | |
| 
 | |
| 
 | |
| \begin{funcdesc}{setlocale}{category\optional{, value}}
 | |
| If \var{value} is specified, modifies the locale setting for the
 | |
| \var{category}. The available categories are listed in the data
 | |
| description below. The value is the name of a locale. An empty string
 | |
| specifies the user's default settings. If the modification of the
 | |
| locale fails, the exception \exception{Error} is
 | |
| raised. If successful, the new locale setting is returned.
 | |
| 
 | |
| If no \var{value} is specified, the current setting for the
 | |
| \var{category} is returned.
 | |
| 
 | |
| \function{setlocale()} is not thread safe on most systems. Applications
 | |
| typically start with a call of
 | |
| \begin{verbatim}
 | |
| import locale
 | |
| locale.setlocale(locale.LC_ALL,"")
 | |
| \end{verbatim}
 | |
| This sets the locale for all categories to the user's default setting
 | |
| (typically specified in the \code{LANG} environment variable). If the
 | |
| locale is not changed thereafter, using multithreading should not
 | |
| cause problems.
 | |
| \end{funcdesc}
 | |
| 
 | |
| \begin{excdesc}{Error}
 | |
| Exception raised when \function{setlocale()} fails.
 | |
| \end{excdesc}
 | |
| 
 | |
| \begin{funcdesc}{localeconv}{}
 | |
| Returns the database of of the local conventions as a dictionary. This
 | |
| dictionary has the following strings as keys:
 | |
| \begin{itemize}
 | |
| \item \code{decimal_point} specifies the decimal point used in
 | |
| floating point number representations for the \code{LC_NUMERIC}
 | |
| category.
 | |
| \item \code{grouping} is a sequence of numbers specifying at which
 | |
| relative positions the \code{thousands_sep} is expected. If the
 | |
| sequence is terminated with \code{locale.CHAR_MAX}, no further
 | |
| grouping is performed. If the sequence terminates with a \code{0}, the last
 | |
| group size is repeatedly used.
 | |
| \item \code{thousands_sep} is the character used between groups.
 | |
| \item \code{int_curr_symbol} specifies the international currency
 | |
| symbol from the \code{LC_MONETARY} category.
 | |
| \item \code{currency_symbol} is the local currency symbol.
 | |
| \item \code{mon_decimal_point} is the decimal point used in monetary
 | |
| values.
 | |
| \item \code{mon_thousands_sep} is the separator for grouping of
 | |
| monetary values.
 | |
| \item \code{mon_grouping} has the same format as the \code{grouping}
 | |
| key; it is used for monetary values.
 | |
| \item \code{positive_sign} and \code{negative_sign} gives the sign
 | |
| used for positive and negative monetary quantities.
 | |
| \item \code{int_frac_digits} and \code{frac_digits} specify the number
 | |
| of fractional digits used in the international and local formatting
 | |
| of monetary values.
 | |
| \item \code{p_cs_precedes} and \code{n_cs_precedes} specifies whether
 | |
| the currency symbol precedes the value for positive or negative
 | |
| values.
 | |
| \item \code{p_sep_by_space} and \code{n_sep_by_space} specifies
 | |
| whether there is a space between the positive or negative value and
 | |
| the currency symbol.
 | |
| \item \code{p_sign_posn} and \code{n_sign_posn} indicate how the
 | |
| sign should be placed for positive and negative monetary values. 
 | |
| \end{itemize}
 | |
| 
 | |
| The possible values for \code{p_sign_posn} and \code{n_sign_posn}
 | |
| are given below.
 | |
| 
 | |
| \begin{tableii}{c|l}{code}{Value}{Explanation}
 | |
| \lineii{0}{Currency and value are surrounded by parentheses.}
 | |
| \lineii{1}{The sign should precede the value and currency symbol.}
 | |
| \lineii{2}{The sign should follow the value and currency symbol.}
 | |
| \lineii{3}{The sign should immediately precede the value.}
 | |
| \lineii{4}{The sign should immediately follow the value.}
 | |
| \lineii{LC_MAX}{Nothing is specified in this locale.}
 | |
| \end{tableii}
 | |
| \end{funcdesc}
 | |
| 
 | |
| \begin{funcdesc}{strcoll}{string1,string2}
 | |
| Compares two strings according to the current \constant{LC_COLLATE}
 | |
| setting. As any other compare function, returns a negative, or a
 | |
| positive value, or \code{0}, depending on whether \var{string1}
 | |
| collates before or after \var{string2} or is equal to it.
 | |
| \end{funcdesc}
 | |
| 
 | |
| \begin{funcdesc}{strxfrm}{string}
 | |
| Transforms a string to one that can be used for the built-in function
 | |
| \function{cmp()}\bifuncindex{cmp}, and still returns locale-aware
 | |
| results.  This function can be used when the same string is compared
 | |
| repeatedly, e.g. when collating a sequence of strings.
 | |
| \end{funcdesc}
 | |
| 
 | |
| \begin{funcdesc}{format}{format, val, \optional{grouping\code{ = 0}}}
 | |
| Formats a number \var{val} according to the current
 | |
| \constant{LC_NUMERIC} setting.  The format follows the conventions of
 | |
| the \code{\%} operator.  For floating point values, the decimal point
 | |
| is modified if appropriate.  If \var{grouping} is true, also takes the
 | |
| grouping into account.
 | |
| \end{funcdesc}
 | |
| 
 | |
| \begin{funcdesc}{str}{float}
 | |
| Formats a floating point number using the same format as the built-in
 | |
| function \code{str(\var{float})}, but takes the decimal point into
 | |
| account.
 | |
| \end{funcdesc}
 | |
| 
 | |
| \begin{funcdesc}{atof}{string}
 | |
| Converts a string to a floating point number, following the
 | |
| \constant{LC_NUMERIC} settings.
 | |
| \end{funcdesc}
 | |
| 
 | |
| \begin{funcdesc}{atoi}{string}
 | |
| Converts a string to an integer, following the \constant{LC_NUMERIC}
 | |
| conventions.
 | |
| \end{funcdesc}
 | |
| 
 | |
| \begin{datadesc}{LC_CTYPE}
 | |
| \refstmodindex{string}
 | |
| Locale category for the character type functions. Depending on the
 | |
| settings of this category, the functions of module \module{string}
 | |
| dealing with case change their behaviour.
 | |
| \end{datadesc}
 | |
| 
 | |
| \begin{datadesc}{LC_COLLATE}
 | |
| Locale category for sorting strings. The functions
 | |
| \function{strcoll()} and \function{strxfrm()} of the \module{locale}
 | |
| module are affected.
 | |
| \end{datadesc}
 | |
| 
 | |
| \begin{datadesc}{LC_TIME}
 | |
| Locale category for the formatting of time. The function
 | |
| \function{time.strftime()} follows these conventions.
 | |
| \end{datadesc}
 | |
| 
 | |
| \begin{datadesc}{LC_MONETARY}
 | |
| Locale category for formatting of monetary values. The available
 | |
| options are available from the \function{localeconv()} function.
 | |
| \end{datadesc}
 | |
| 
 | |
| \begin{datadesc}{LC_MESSAGES}
 | |
| Locale category for message display. Python currently does not support
 | |
| application specific locale-aware messages. Messages displayed by the
 | |
| operating system, like those returned by \function{os.strerror()}
 | |
| might be affected by this category.
 | |
| \end{datadesc}
 | |
| 
 | |
| \begin{datadesc}{LC_NUMERIC}
 | |
| Locale category for formatting numbers. The functions
 | |
| \function{format()}, \function{atoi()}, \function{atof()} and
 | |
| \function{str()} of the \module{locale} module are affected by that
 | |
| category. All other numeric formatting operations are not affected.
 | |
| \end{datadesc}
 | |
| 
 | |
| \begin{datadesc}{LC_ALL}
 | |
| Combination of all locale settings. If this flag is used when the
 | |
| locale is changed, setting the locale for all categories is
 | |
| attempted. If that fails for any category, no category is changed at
 | |
| all. When the locale is retrieved using this flag, a string indicating
 | |
| the setting for all categories is returned. This string can be later
 | |
| used to restore the settings.
 | |
| \end{datadesc}
 | |
| 
 | |
| \begin{datadesc}{CHAR_MAX}
 | |
| This is a symbolic constant used for different values returned by
 | |
| \function{localeconv()}.
 | |
| \end{datadesc}
 | |
| 
 | |
| Example:
 | |
| 
 | |
| \begin{verbatim}
 | |
| >>> import locale
 | |
| >>> loc = locale.setlocale(locale.LC_ALL) # get current locale
 | |
| >>> locale.setlocale(locale.LC_ALL, "de") # use German locale
 | |
| >>> locale.strcoll("f\344n", "foo") # compare a string containing an umlaut 
 | |
| >>> locale.setlocale(locale.LC_ALL, "") # use user's preferred locale
 | |
| >>> locale.setlocale(locale.LC_ALL, "C") # use default (C) locale
 | |
| >>> locale.setlocale(locale.LC_ALL, loc) # restore saved locale
 | |
| \end{verbatim}
 | |
| 
 | |
| \subsection{Background, details, hints, tips and caveats}
 | |
| 
 | |
| The C standard defines the locale as a program-wide property that may
 | |
| be relatively expensive to change.  On top of that, some
 | |
| implementation are broken in such a way that frequent locale changes
 | |
| may cause core dumps.  This makes the locale somewhat painful to use
 | |
| correctly.
 | |
| 
 | |
| Initially, when a program is started, the locale is the \samp{C} locale, no
 | |
| matter what the user's preferred locale is.  The program must
 | |
| explicitly say that it wants the user's preferred locale settings by
 | |
| calling \code{setlocale(LC_ALL, "")}.
 | |
| 
 | |
| It is generally a bad idea to call \function{setlocale()} in some library
 | |
| routine, since as a side effect it affects the entire program.  Saving
 | |
| and restoring it is almost as bad: it is expensive and affects other
 | |
| threads that happen to run before the settings have been restored.
 | |
| 
 | |
| If, when coding a module for general use, you need a locale
 | |
| independent version of an operation that is affected by the locale
 | |
| (e.g. \function{string.lower()}, or certain formats used with
 | |
| \function{time.strftime()})), you will have to find a way to do it
 | |
| without using the standard library routine.  Even better is convincing
 | |
| yourself that using locale settings is okay.  Only as a last resort
 | |
| should you document that your module is not compatible with
 | |
| non-\samp{C} locale settings.
 | |
| 
 | |
| The case conversion functions in the
 | |
| \module{string}\refstmodindex{string} and
 | |
| \module{strop}\refbimodindex{strop} modules are affected by the locale
 | |
| settings.  When a call to the \function{setlocale()} function changes
 | |
| the \constant{LC_CTYPE} settings, the variables
 | |
| \code{string.lowercase}, \code{string.uppercase} and
 | |
| \code{string.letters} (and their counterparts in \module{strop}) are
 | |
| recalculated.  Note that this code that uses these variable through
 | |
| `\keyword{from} ... \keyword{import} ...', e.g. \code{from string
 | |
| import letters}, is not affected by subsequent \function{setlocale()}
 | |
| calls.
 | |
| 
 | |
| The only way to perform numeric operations according to the locale
 | |
| is to use the special functions defined by this module:
 | |
| \function{atof()}, \function{atoi()}, \function{format()},
 | |
| \function{str()}.
 | |
| 
 | |
| \subsection{For extension writers and programs that embed Python}
 | |
| \label{embedding-locale}
 | |
| 
 | |
| Extension modules should never call \function{setlocale()}, except to
 | |
| find out what the current locale is.  But since the return value can
 | |
| only be used portably to restore it, that is not very useful (except
 | |
| perhaps to find out whether or not the locale is \samp{C}).
 | |
| 
 | |
| When Python is embedded in an application, if the application sets the
 | |
| locale to something specific before initializing Python, that is
 | |
| generally okay, and Python will use whatever locale is set,
 | |
| \emph{except} that the \constant{LC_NUMERIC} locale should always be
 | |
| \samp{C}.
 | |
| 
 | |
| The \function{setlocale()} function in the \module{locale} module contains
 | |
| gives the Python progammer the impression that you can manipulate the
 | |
| \constant{LC_NUMERIC} locale setting, but this not the case at the \C{}
 | |
| level: \C{} code will always find that the \constant{LC_NUMERIC} locale
 | |
| setting is \samp{C}.  This is because too much would break when the
 | |
| decimal point character is set to something else than a period
 | |
| (e.g. the Python parser would break).  Caveat: threads that run
 | |
| without holding Python's global interpreter lock may occasionally find
 | |
| that the numeric locale setting differs; this is because the only
 | |
| portable way to implement this feature is to set the numeric locale
 | |
| settings to what the user requests, extract the relevant
 | |
| characteristics, and then restore the \samp{C} numeric locale.
 | |
| 
 | |
| When Python code uses the \module{locale} module to change the locale,
 | |
| this also affects the embedding application.  If the embedding
 | |
| application doesn't want this to happen, it should remove the
 | |
| \module{_locale} extension module (which does all the work) from the
 | |
| table of built-in modules in the \file{config.c} file, and make sure
 | |
| that the \module{_locale} module is not accessible as a shared library.
 | 
