Commit graph

173 commits

Author SHA1 Message Date
Karolina Surma
1887a95f51
gh-128341: Use _Py_ABI_SLOT in stdlib modules (#145770)
Rename from _Py_INTERNAL_ABI_SLOT to _Py_ABI_SLOT
and define the macro using _PyABIInfo_DEFAULT.

Use the ABI slot in stdlib extension modules to enable running
a check of ABI version compatibility.

_tkinter, _tracemalloc and readline don't use the slots, hence they need
explicit handling.

Co-authored-by: Victor Stinner <vstinner@python.org>
2026-03-24 17:47:55 +00:00
Petr Viktorin
91cd2e5806
gh-146175: Soft-deprecate outdated macros; convert internal usage (GH-146178)
Co-authored-by: Victor Stinner <vstinner@python.org>
2026-03-23 12:42:09 +01:00
Serhiy Storchaka
4561f6418a
gh-145264: Do not ignore excess Base64 data after the first padded quad (GH-145267)
Base64 decoder (see binascii.a2b_base64(), base64.b64decode(), etc)
no longer ignores excess data after the first padded quad in non-strict
(default) mode.  Instead, in conformance with RFC 4648, it ignores the
pad character, "=", if it is present before the end of the encoded data.
2026-03-22 23:12:58 +02:00
kangtastic
b4e5bc2164
gh-146192: Add base32 support to binascii (GH-146193)
Add base32 encoder and decoder functions implemented in
C to the binascii module and use them to greatly improve the
performance and reduce the memory usage of the existing
base32 codec functions in the base64 module.
2026-03-22 23:10:28 +02:00
Victor Stinner
b36b87bcbb
gh-145980: Fix copy/paste mistake in binascii.c (#146230) 2026-03-20 18:12:10 +00:00
Serhiy Storchaka
4507d496b4
gh-145980: Add support for alternative alphabets in the binascii module (GH-145981)
* Add the alphabet parameter in functions b2a_base64(), a2b_base64(),
  b2a_base85(), and a2b_base85().
* And a number of "*_ALPHABET" constants.
* Remove b2a_z85() and a2b_z85().
2026-03-20 13:07:00 +02:00
Serhiy Storchaka
99e2c5eccd
gh-144545: Improve handling of default values in Argument Clinic (GH-146016)
* Add the c_init_default attribute which is used to initialize the C variable
  if the default is not explicitly provided.
* Add the c_default_init() method which is used to derive c_default from
  default if c_default is not explicitly provided.
* Explicit c_default and py_default are now almost always have precedence
  over the generated value.
* Add support for bytes literals as default values.
* Improve support for str literals as default values (support non-ASCII
  and non-printable characters and special characters like backslash or quotes).
* Fix support for str and bytes literals containing trigraphs, "/*" and "*/".
* Improve support for default values in converters "char" and "int(accept={str})".
* Converter "int(accept={str})" now requires 1-character string instead of
  integer as default value.
* Add support for non-None default values in converter "Py_buffer": NULL,
  str and bytes literals.
* Improve error handling for invalid default values.
* Rename Null to NullType for consistency.
2026-03-17 12:16:35 +02:00
Victor Stinner
3c38feb2a2
gh-129813: Document that PyBytesWriter_GetData() cannot fail (#145900)
Document that PyBytesWriter_GetData() and PyBytesWriter_GetSize()
getter functions cannot fail
2026-03-13 19:44:51 +01:00
Pieter Eendebak
8060aa5d7d
gh-145376: Fix various refleaks in Objects/ (#145609) 2026-03-09 14:17:27 +01:00
Serhiy Storchaka
b32c830d44
gh-101178: Fix possible integer overflow in Ascii85 encoder with wrapcol=1 (GH-144778)
It could happen if the size of the input is more than 4/5 of sys.maxsize
(only feasible on 32-bit platforms).

Also simplify the integer overflow checks in the Base64 encoder, and
harmonize them with the code for Ascii85 and Base85.
2026-02-24 11:40:24 +02:00
kangtastic
45d4a34720
gh-101178: Add Ascii85, Base85, and Z85 support to binascii (GH-102753)
Add Ascii85, Base85, and Z85 encoders and decoders to binascii,
replacing the existing pure Python implementations in base64.

This makes the codecs two orders of magnitude faster and consume
two orders of magnitude less memory.

Note that attempting to decode Ascii85 or Base85 data of length 1 mod 5
(after accounting for Ascii85 quirks) now produces an error, as no
encoder would emit such data. This should be the only significant
externally visible difference compared to the old implementation.

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2026-02-06 16:43:16 +02:00
Serhiy Storchaka
4644fed819
gh-144001: Support ignoring the invalid pad character in Base64 decoding (GH-144306) 2026-02-05 21:14:49 +02:00
Serhiy Storchaka
92c0ec2b00
gh-144264: Speed up Base64 decoding of data containing ignored characters (GH-144265)
Try the fast path again after decoding a quad the slow path.
Use a bitmap cache for the ignorechars argument.
2026-01-29 17:33:10 +02:00
Serhiy Storchaka
7febbe6b60
gh-144001: Support ignorechars in binascii.a2b_base64() and base64.b64decode() (GH-144024) 2026-01-26 20:11:40 +02:00
Serhiy Storchaka
a471a32f4b
gh-143214: Add the wrapcol parameter in binascii.b2a_base64() and base64.b64encode() (GH-143216) 2026-01-14 14:44:53 +02:00
Gregory P. Smith
61fc72a4a4
gh-124951: Optimize base64 encode & decode for an easy 2-3x speedup [no SIMD] (GH-143262)
Optimize base64 encoding/decoding by eliminating loop-carried dependencies. Key changes:
- Add `base64_encode_trio()` and `base64_decode_quad()` helper functions that process complete groups independently
- Add `base64_encode_fast()` and `base64_decode_fast()` wrappers
- Update `b2a_base64` and `a2b_base64` to use fast path for complete groups

Performance gains (encode/decode speedup vs main, PGO builds):
```
             64 bytes    64K        1M
  Zen2:      1.2x/1.8x   1.7x/2.8x  1.5x/2.8x
  Zen4:      1.2x/1.7x   1.6x/3.0x  1.5x/3.0x  [old data, likely faster]
  M4:        1.3x/1.9x   2.3x/2.8x  2.4x/2.9x  [old data, likely faster]
  RPi5-32:   1.2x/1.2x   2.4x/2.4x  2.0x/2.1x
```

Based on my exploratory work done in https://github.com/python/cpython/compare/main...gpshead:cpython:claude/vectorize-base64-c-S7Hku

See PR and issue for further thoughts on sometimes MUCH faster SIMD vectorized versions of this.
2026-01-01 22:03:05 -08:00
Victor Stinner
ca99af3c5e
gh-129813, PEP 782: Use PyBytesWriter in binascii (#138825)
Replace the private _PyBytesWriter API with the new public
PyBytesWriter API.
2025-09-13 18:25:16 +02:00
Adam Turner
918e3ba6c0
GH-137623: Use an AC decorator for docstring line length enforcement (#137690) 2025-08-18 18:29:00 +01:00
Youfu Zhang
fe47d9bee3
gh-118314: Fix padding edge case in binascii.a2b_base64 strict mode (GH-118320)
Fix an edge case in `binascii.a2b_base64` strict mode, where
excessive padding was not detected when no padding is necessary.

Co-authored-by: Terry Jan Reedy <tjreedy@udel.edu>
Co-authored-by: Pieter Eendebak <pieter.eendebak@gmail.com>
2024-05-07 11:18:45 +02:00
Brett Simmers
c2627d6eea
gh-116322: Add Py_mod_gil module slot (#116882)
This PR adds the ability to enable the GIL if it was disabled at
interpreter startup, and modifies the multi-phase module initialization
path to enable the GIL when loading a module, unless that module's spec
includes a slot indicating it can run safely without the GIL.

PEP 703 called the constant for the slot `Py_mod_gil_not_used`; I went
with `Py_MOD_GIL_NOT_USED` for consistency with gh-104148.

A warning will be issued up to once per interpreter for the first
GIL-using module that is loaded. If `-v` is given, a shorter message
will be printed to stderr every time a GIL-using module is loaded
(including the first one that issues a warning).
2024-05-03 11:30:55 -04:00
Gregory P. Smith
4eddb4c9d9
gh-105967: Work around a macOS bug, limit zlib C library crc32 API calls to 1gig (#112615)
Work around a macOS bug, limit zlib crc32 calls to 1GiB.

Without this, `zlib.crc32` and `binascii.crc32` could produce incorrect
results on multi-gigabyte inputs depending on the macOS version's Apple
supplied zlib implementation.
2023-12-04 12:04:05 -08:00
Furkan Onder
32c37fe1ba
gh-67565: Remove redundant C-contiguity checks (GH-105521)
Co-authored-by: Stefan Krah <skrah@bytereef.org>
2023-10-23 12:54:46 +03:00
Serhiy Storchaka
329e4a1a3f
gh-86493: Modernize modules initialization code (GH-106858)
Use PyModule_Add() or PyModule_AddObjectRef() instead of soft deprecated
PyModule_AddObject().
2023-07-25 14:34:49 +03:00
Inada Naoki
d5bd32fb48
gh-104922: remove PY_SSIZE_T_CLEAN (#106315) 2023-07-02 15:07:46 +09:00
Victor Stinner
ef300937c2
gh-92536: Remove PyUnicode_READY() calls (#105210)
Since Python 3.12, PyUnicode_READY() does nothing and always
returns 0.
2023-06-02 01:33:17 +02:00
Eric Snow
a9c6e0618f
gh-99113: Add Py_MOD_PER_INTERPRETER_GIL_SUPPORTED (gh-104205)
Here we are doing no more than adding the value for Py_mod_multiple_interpreters and using it for stdlib modules.  We will start checking for it in gh-104206 (once PyInterpreterState.ceval.own_gil is added in gh-104204).
2023-05-05 21:11:27 +00:00
Serhiy Storchaka
a87c46eab3
bpo-15999: Accept arbitrary values for boolean parameters. (#15609)
builtins and extension module functions and methods that expect boolean values for parameters now accept any Python object rather than just a bool or int type. This is more consistent with how native Python code itself behaves.
2022-12-03 11:52:21 -08:00
oda-gitso
32e3b790bc
gh-93172: Remove unnecessary "if"s in binascii_a2b_qp_impl() from Modules/binascii.c (GH-93181) 2022-05-25 11:38:47 -04:00
Gregory P. Smith
9d1c4d69db
bpo-38256: Fix binascii.crc32() when inputs are 4+GiB (GH-32000)
When compiled with `USE_ZLIB_CRC32` defined (`configure` sets this on POSIX systems), `binascii.crc32(...)` failed to compute the correct value when the input data was >= 4GiB. Because the zlib crc32 API is limited to a 32-bit length.

This lines it up with the `zlib.crc32(...)` implementation that doesn't have that flaw.

**Performance:** This also adopts the same GIL releasing for larger inputs logic that `zlib.crc32` has, and causes the Windows build to always use zlib's crc32 instead of our slow C code as zlib is a required build dependency on Windows.
2022-03-20 12:28:15 -07:00
Christian Heimes
03e9f5dc75
bpo-43974: Move Py_BUILD_CORE_MODULE into module code (GH-29157)
setup.py no longer defines Py_BUILD_CORE_MODULE. Instead every
module defines the macro before #include "Python.h" unless
Py_BUILD_CORE_BUILTIN is already defined.

Py_BUILD_CORE_BUILTIN is defined for every module that is built by
Modules/Setup.

The PR also simplifies Modules/Setup. Makefile and makesetup
already define Py_BUILD_CORE_BUILTIN and include Modules/internal
for us.

Signed-off-by: Christian Heimes <christian@python.org>
2021-10-22 15:36:28 +02:00
Victor Stinner
5f09bb021a
bpo-35134: Add Include/cpython/longobject.h (GH-29044)
Move Include/longobject.h non-limited API to a new
Include/cpython/longobject.h header file.

Move the following definitions to the internal C API:

* _PyLong_DigitValue
* _PyLong_FormatAdvancedWriter()
* _PyLong_FormatWriter()
2021-10-19 02:04:52 +02:00
Victor Stinner
bbe7497c5a
bpo-45434: Remove pystrhex.h header file (GH-28923)
Move Include/pystrhex.h to Include/internal/pycore_strhex.h.
The header file only contains private functions.

The following C extensions are now built with Py_BUILD_CORE_MODULE
macro defined to get access to the internal C API:

* _blake2
* _hashopenssl
* _md5
* _sha1
* _sha3
* _ssl
* binascii
2021-10-13 15:22:35 +02:00
Victor Stinner
a806608705
bpo-45085: Remove the binhex module (GH-28117)
The binhex module, deprecated in Python 3.9, is now removed. The
following binascii functions, deprecated in Python 3.9, are now also
removed:

* a2b_hqx(), b2a_hqx();
* rlecode_hqx(), rledecode_hqx().

The binascii.crc_hqx() function remains available.
2021-09-02 12:10:08 +02:00
Idan Moral
366fcbac18
bpo-44678: Separate error message for discontinuous padding in binascii.a2b_base64 strict mode (GH-27249)
* Renamed assertLeadingPadding function to match logic
* Added a separate error message for discontinuous padding
* Updated the tests for discontinuous padding
2021-07-19 15:42:19 -07:00
Idan Moral
35b98e38b6
bpo-43086: Add handling for out-of-spec data in a2b_base64 (GH-24402)
binascii.a2b_base64 gains a strict_mode= parameter. When enabled it will raise an
error on input that deviates from the base64 spec in any way.  The default remains
False for backward compatibility.

Code reviews and minor tweaks by: Gregory P. Smith <greg@krypto.org> [Google]
2021-07-18 17:45:19 -07:00
Dong-hee Na
9b06e4b535
Use get_binascii_state instead of PyModule_GetState (GH-26069) 2021-05-13 00:09:30 +09:00
Andy Lester
7668a8bc93
Use calloc-based functions, not malloc. (GH-19152) 2020-03-24 23:26:44 -05:00
Victor Stinner
5b1ef200d3
bpo-39824: module_traverse() don't call m_traverse if md_state=NULL (GH-18738)
Extension modules: m_traverse, m_clear and m_free functions of
PyModuleDef are no longer called if the module state was requested
but is not allocated yet. This is the case immediately after the
module is created and before the module is executed (Py_mod_exec
function). More precisely, these functions are not called if m_size is
greater than 0 and the module state (as returned by
PyModule_GetState()) is NULL.

Extension modules without module state (m_size <= 0) are not affected.

Co-Authored-By: Petr Viktorin <encukou@gmail.com>
2020-03-17 18:09:46 +01:00
Hai Shi
aa0c0808ef
bpo-1635741: Fix potential refleaks in binascii module (GH-18613) 2020-03-11 17:50:52 +01:00
Victor Stinner
c38fd0df2b
bpo-39353: binascii.crc_hqx() is no longer deprecated (GH-18276)
The binascii.crc_hqx() function is no longer deprecated.
2020-01-30 09:56:40 +01:00
Victor Stinner
beea26b57e
bpo-39353: Deprecate the binhex module (GH-18025)
Deprecate binhex4 and hexbin4 standards. Deprecate the binhex module
and the following binascii functions:

* b2a_hqx(), a2b_hqx()
* rlecode_hqx(), rledecode_hqx()
* crc_hqx()
2020-01-22 20:44:22 +01:00
Sergey Fedoseev
1c5e68e714 bpo-34749: Improved performance of binascii.a2b_base64(). (GH-9444)
https://bugs.python.org/issue34749
2019-07-14 05:15:32 -07:00
Gregory P. Smith
0c2f930564
bpo-22385: Support output separators in hex methods. (#13578)
* bpo-22385: Support output separators in hex methods.

Also in binascii.hexlify aka b2a_hex.

The underlying implementation behind all hex generation in CPython uses the
same pystrhex.c implementation.  This adds support to bytes, bytearray,
and memoryview objects.

The binascii module functions exist rather than being slated for deprecation
because they return bytes rather than requiring an intermediate step through a
str object.

This change was inspired by MicroPython which supports sep in its binascii
implementation (and does not yet support the .hex methods).

https://bugs.python.org/issue22385
2019-05-29 11:46:58 -07:00
Marcel Plch
33e71e01e9 bpo-31862: Port binascii to PEP 489 multiphase initialization (GH-4108) 2019-05-22 13:51:25 +02:00
Serhiy Storchaka
d53fe5f407
bpo-36254: Fix invalid uses of %d in format strings in C. (GH-12264) 2019-03-13 22:59:55 +02:00
Tal Einat
1fba2ffc37 bpo-34736: improve error message for invalid length b64decode inputs (GH-9563)
Improvements:
1. Include the number of valid data characters in the error message.
2. Mention "number of data characters" rather than "length".


https://bugs.python.org/issue34736
2018-09-27 22:57:22 -07:00
Tal Einat
1b85c71a21
bpo-33770: improve base64 exception message for encoded inputs of invalid length (#7416) 2018-06-10 10:01:50 +03:00
Sergey Fedoseev
6b5df906af bpo-32147: Improved perfomance of binascii.unhexlify(). (GH-4586) 2018-02-26 22:35:41 +02:00
Segev Finer
679b566622 bpo-9566: Fix some Windows x64 compiler warnings (#2492)
* bpo-9566: Silence liblzma warnings

* bpo-9566: Silence tcl warnings

* bpo-9566: Silence tk warnings

* bpo-9566: Silence tix warnings

* bpo-9566: Fix some library warnings

* bpo-9566: Fix msvcrtmodule.c warnings

* bpo-9566: Silence _bz2 warnings

* bpo-9566: Fixed some _ssl warnings

* bpo-9566: Fix _msi warnings

* bpo-9566: Silence _ctypes warnings

* Revert "bpo-9566: Fixed some _ssl warnings"

This reverts commit a639001c94.

* bpo-9566: Also consider NULL as a possible error in HANDLE_return_converter

* bpo-9566: whitespace fixes
2017-07-26 15:17:57 -07:00
Xiang Zhang
13f1f423fa bpo-30103: Allow Uuencode in Python using backtick as zero instead of space (#1326) 2017-05-03 11:16:21 +08:00