Add Ascii85, Base85, and Z85 encoders and decoders to binascii,
replacing the existing pure Python implementations in base64.
This makes the codecs two orders of magnitude faster and consume
two orders of magnitude less memory.
Note that attempting to decode Ascii85 or Base85 data of length 1 mod 5
(after accounting for Ascii85 quirks) now produces an error, as no
encoder would emit such data. This should be the only significant
externally visible difference compared to the old implementation.
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Treat "+" and "/" like other characters not in the alternative Base64
alphabet when both altchars and ignorechars are specified.
E.g. discard them if they are not in altchars but are in ignorechars,
and set error if they are not in altchars and not in ignorechars.
Only emit warnings if ignorechars is not specified.
Emit a warning in base64.urlsafe_b64decode() and base64.b64decode() when
the "+" or "/" characters occur in the Base64 data with alternative
alphabet if they are not the part of the alternative alphabet.
It is a DeprecationWarning in the strict mode (will be error) and
a FutureWarning in non-strict mode (will be ignored).
This makes it analogous to a85encode() and b85encode() and allows the
user to more easily meet the Z85 specification, which requires input
lengths to be a multiple of 4.
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Co-authored-by: Chris Markiewicz <effigies@gmail.com>
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
* gh-118673: Remove shebang and executable bits from stdlib modules.
* Removed shebangs and exe bits on turtledemo scripts.
The setting was inappropriate for '__main__' and inconsistent across the other modules. The scripts can still be executed directly by invoking with the desired interpreter.
* Use binascii.a2b_base64 to validate b64decode input.
This change leads to exception messages changes (mostly).
* Added more information to docstring of b64decode
* Added a reference to binascii.a2b_base64 in the docs
Remove base64.encodestring() and base64.decodestring(), aliases
deprecated since Python 3.1: use base64.encodebytes() and
base64.decodebytes() instead.
Allow developers to not have to either test on N Python versions or
looked through multiple versions of the docs to know whether they can
easily update.
* There are only two base-64 alphabets defined by the RFCs, not three
* Due to the internal translation, plus (+) and slash (/) are never discarded
* standard_ and urlsafe_b64decode() discard characters as well
Also update the doc strings to clarify data types, based on revision
92760d2edc9e, correct the exception raised by b16decode(), and correct the
parameter name for the base-85 functions.
base32, ascii85 and base85 codecs in the base64 module, and delay the
initialization of the unquote_to_bytes() table of the urllib.parse module, to
not waste memory if these modules are not used.
This mostly affected the encodebytes and decodebytes function
(which are used by base64_codec)
Also added a test to ensure all bytes-bytes codecs can handle
memoryview input and tests for handling of multidimensional
and non-bytes format input in the modern base64 API.
Patch by Neil Tallim. This provides a mechanism for module
users to achieve RFC 3548 compliance in the cases where ignoring
non-base64-alphabet input characters is *not* mandated by the RFC that
references RFC 3548.