gh-130567: Remove optimistic allocation in locale.strxfrm() (GH-137143)

On modern systems, the result of wcsxfrm() is much larger than the size of
the input string (from 4+2*n on Windows to 4+5*n on Linux for simple
ASCII strings), so optimistic allocation of the buffer of the same size
never works.
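
To see the growth described above on a given platform, the required size can be queried without writing anything. This is a minimal sketch, not CPython code: `xfrm_required_len` is a hypothetical helper name, and it relies only on the standard C rule that `wcsxfrm()` accepts a null destination when the size argument is 0. The value it returns depends on the platform and on the current LC_COLLATE locale.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper (illustration only): number of wchar_t units,
 * excluding the terminating null, that wcsxfrm() needs for `s` under
 * the current LC_COLLATE locale.
 * Returns (size_t)-1 if wcsxfrm() reports an error other than ERANGE. */
static size_t
xfrm_required_len(const wchar_t *s)
{
    assert(s != NULL);
    errno = 0;
    /* With n == 0 the destination may be NULL; nothing is written. */
    size_t needed = wcsxfrm(NULL, s, 0);
    if (errno && errno != ERANGE) {
        return (size_t)-1;
    }
    return needed;
}
```

Comparing `xfrm_required_len(s)` with `wcslen(s)` after `setlocale(LC_COLLATE, "")` should show whether a same-size buffer could ever be enough in that locale.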

The exception is if the locale is "C" (or unset), but in that case the `wcsxfrm`
call should be fast (and calling `locale.strxfrm()` doesn't make too much
sense in the first place).
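
The replacement strategy (query the size first, then allocate once) can be sketched outside CPython as follows. This is an illustration under stated assumptions, not the code from this commit: `xfrm_dup` is a hypothetical name, it uses `malloc()` instead of `PyMem_New()`, and it reports failure by returning NULL instead of raising an exception.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper (illustration only): return a malloc()ed,
 * null-terminated buffer holding the wcsxfrm() transformation of `s`,
 * and store its length in *out_len. Returns NULL on failure.
 * The caller must free() the result. */
static wchar_t *
xfrm_dup(const wchar_t *s, size_t *out_len)
{
    assert(s != NULL && out_len != NULL);

    /* Pass 1: size query only; no buffer is needed when n == 0. */
    errno = 0;
    size_t needed = wcsxfrm(NULL, s, 0);
    if (errno && errno != ERANGE) {
        return NULL;
    }
    /* Guard the multiplication below against overflow. */
    if (needed >= SIZE_MAX / sizeof(wchar_t)) {
        return NULL;
    }

    wchar_t *buf = malloc((needed + 1) * sizeof(wchar_t));
    if (buf == NULL) {
        return NULL;
    }

    /* Pass 2: transform into a buffer of exactly the reported size. */
    errno = 0;
    size_t written = wcsxfrm(buf, s, needed + 1);
    if (errno || written > needed) {
        free(buf);
        return NULL;
    }
    *out_len = written;
    return buf;
}
```

A program that has not called `setlocale()` runs in the "C" locale, where common C libraries are expected to return a transformation of the same length as the input, so `xfrm_dup(L"abc", &n)` would likely yield `n == 3` there.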
Serhiy Storchaka 2025-10-16 10:54:41 +03:00 committed by GitHub
parent 3a81313019
commit 2a2bc82cef

@@ -455,35 +455,23 @@ _locale_strxfrm_impl(PyObject *module, PyObject *str)
         goto exit;
     }
-    /* assume no change in size, first */
-    n1 = n1 + 1;
-    /* Yet another +1 is needed to work around a platform bug in wcsxfrm()
-     * on macOS. See gh-130567. */
-    buf = PyMem_New(wchar_t, n1+1);
-    if (!buf) {
-        PyErr_NoMemory();
-        goto exit;
-    }
     errno = 0;
-    n2 = wcsxfrm(buf, s, n1);
+    n2 = wcsxfrm(NULL, s, 0);
     if (errno && errno != ERANGE) {
         PyErr_SetFromErrno(PyExc_OSError);
         goto exit;
     }
-    if (n2 >= (size_t)n1) {
-        /* more space needed */
-        wchar_t * new_buf = PyMem_Realloc(buf, (n2+1)*sizeof(wchar_t));
-        if (!new_buf) {
-            PyErr_NoMemory();
-            goto exit;
-        }
-        buf = new_buf;
-        errno = 0;
-        n2 = wcsxfrm(buf, s, n2+1);
-        if (errno) {
-            PyErr_SetFromErrno(PyExc_OSError);
-            goto exit;
-        }
+    buf = PyMem_New(wchar_t, n2+1);
+    if (!buf) {
+        PyErr_NoMemory();
+        goto exit;
+    }
+    errno = 0;
+    n2 = wcsxfrm(buf, s, n2+1);
+    if (errno) {
+        PyErr_SetFromErrno(PyExc_OSError);
+        goto exit;
     }
     /* The result is just a sequence of integers, they are not necessary
        Unicode code points, so PyUnicode_FromWideChar() cannot be used