gh-148100: Soft deprecate re.match and re.Pattern.match in favour of prefixmatch (#148101)

Hugo van Kemenade 2026-04-15 23:09:47 +03:00 committed by GitHub
parent e998eb9a83
commit eb2f634b83
5 changed files with 95 additions and 31 deletions

View file

@@ -15,6 +15,8 @@ Deprecations
.. include:: pending-removal-in-future.rst
.. include:: soft-deprecations.rst
C API deprecations
------------------

View file

@@ -0,0 +1,21 @@
Soft deprecations
-----------------
There are no plans to remove :term:`soft deprecated` APIs.
* :func:`re.match` and :meth:`re.Pattern.match` are now
:term:`soft deprecated` in favor of the new :func:`re.prefixmatch` and
:meth:`re.Pattern.prefixmatch` APIs, which have been added as alternate,
more explicit names. These are intended to be used to alleviate confusion
around what *match* means by following the Zen of Python's *"Explicit is
better than implicit"* mantra. Most other language regular expression
libraries use an API named *match* to mean what Python has always called
*search*.
We **do not** plan to remove the older :func:`!match` name, as it has been
used in code for over 30 years. Code supporting older versions of Python
should continue to use :func:`!match`, while new code should prefer
:func:`!prefixmatch`. See :ref:`prefixmatch-vs-match`.
(Contributed by Gregory P. Smith in :gh:`86519` and
Hugo van Kemenade in :gh:`148100`.)

View file

@@ -52,7 +52,7 @@ fine-tuning parameters.
.. _re-syntax:
Regular Expression Syntax
Regular expression syntax
-------------------------
A regular expression (or RE) specifies a set of strings that matches it; the
@@ -205,7 +205,7 @@ The special characters are:
*without* establishing any backtracking points.
This is the possessive version of the quantifier above.
For example, on the 6-character string ``'aaaaaa'``, ``a{3,5}+aa``
attempt to match 5 ``'a'`` characters, then, requiring 2 more ``'a'``\ s,
attempts to match 5 ``'a'`` characters, then, requiring 2 more ``'a'``\ s,
will need more characters than available and thus fail, while
``a{3,5}aa`` will match with ``a{3,5}`` capturing 5, then 4 ``'a'``\ s
by backtracking and then the final 2 ``'a'``\ s are matched by the final
@@ -717,7 +717,7 @@ three digits in length.
.. _contents-of-module-re:
Module Contents
Module contents
---------------
The module defines several functions, constants, and an exception. Some of the
@@ -833,8 +833,8 @@ Flags
will be conditionally ORed with other flags. Example of use as a default
value::
def myfunc(text, flag=re.NOFLAG):
return re.search(text, flag)
def myfunc(pattern, text, flag=re.NOFLAG):
return re.search(pattern, text, flag)
.. versionadded:: 3.11
@@ -954,9 +954,10 @@ Functions
:func:`~re.match`. Use that name when you need to retain compatibility with
older Python versions.
.. versionchanged:: 3.15
The alternate :func:`~re.prefixmatch` name of this API was added as a
more explicitly descriptive name than :func:`~re.match`. Use it to better
.. deprecated:: 3.15
:func:`~re.match` has been :term:`soft deprecated` in favor of
the alternate :func:`~re.prefixmatch` name of this API which is
more explicitly descriptive. Use it to better
express intent. The norm in other languages and regular expression
implementations is to use the term *match* to refer to the behavior of
what Python has always called :func:`~re.search`.
@@ -1246,7 +1247,7 @@ Exceptions
.. _re-objects:
Regular Expression Objects
Regular expression objects
--------------------------
.. class:: Pattern
@@ -1309,9 +1310,10 @@ Regular Expression Objects
:meth:`~Pattern.match`. Use that name when you need to retain compatibility
with older Python versions.
.. versionchanged:: 3.15
The alternate :meth:`~Pattern.prefixmatch` name of this API was added as
a more explicitly descriptive name than :meth:`~Pattern.match`. Use it to
.. deprecated:: 3.15
:meth:`~Pattern.match` has been :term:`soft deprecated` in favor of
the alternate :meth:`~Pattern.prefixmatch` name of this API which is
more explicitly descriptive. Use it to
better express intent. The norm in other languages and regular expression
implementations is to use the term *match* to refer to the behavior of
what Python has always called :meth:`~Pattern.search`.
@@ -1396,7 +1398,7 @@ Regular Expression Objects
.. _match-objects:
Match Objects
Match objects
-------------
Match objects always have a boolean value of ``True``.
@@ -1615,11 +1617,11 @@ when there is no match, you can test whether there was a match with a simple
.. _re-examples:
Regular Expression Examples
Regular expression examples
---------------------------
Checking for a Pair
Checking for a pair
^^^^^^^^^^^^^^^^^^^
In this example, we'll use the following helper function to display match
@@ -1705,15 +1707,21 @@ expressions.
| ``%x``, ``%X`` | ``[-+]?(0[xX])?[\dA-Fa-f]+`` |
+--------------------------------+---------------------------------------------+
To extract the filename and numbers from a string like ::
To extract the filename and numbers from a string like:
.. code-block:: text
/usr/sbin/sendmail - 0 errors, 4 warnings
you would use a :c:func:`!scanf` format like ::
you would use a :c:func:`!scanf` format like:
.. code-block:: text
%s - %d errors, %d warnings
The equivalent regular expression would be ::
The equivalent regular expression would be:
.. code-block:: text
(\S+) - (\d+) errors, (\d+) warnings
@@ -1772,18 +1780,24 @@ not familiar with the Python API's divergence from what otherwise become the
industry norm.
Quoting from the Zen Of Python (``python3 -m this``): *"Explicit is better than
implicit"*. Anyone reading the name :func:`~re.prefixmatch` is likely to
understand the intended semantics. When reading :func:`~re.match` there remains
implicit"*. Anyone reading the name :func:`!prefixmatch` is likely to
understand the intended semantics. When reading :func:`!match` there remains
a seed of doubt about the intended behavior to anyone not already familiar with
this old Python gotcha.
We **do not** plan to deprecate and remove the older *match* name,
We **do not** plan to remove the older :func:`!match` name,
as it has been used in code for over 30 years.
Code supporting older versions of Python should continue to use *match*.
It has been :term:`soft deprecated`:
code supporting older versions of Python should continue to use :func:`!match`,
while new code should prefer :func:`!prefixmatch`.
.. versionadded:: 3.15
:func:`!prefixmatch`
Making a Phonebook
.. deprecated:: 3.15
:func:`!match` is :term:`soft deprecated`
Making a phonebook
^^^^^^^^^^^^^^^^^^
:func:`split` splits a string into a list delimited by the passed pattern. The
@@ -1844,7 +1858,7 @@ house number from the street name:
['Heather', 'Albrecht', '548.326.4584', '919', 'Park Place']]
Text Munging
Text munging
^^^^^^^^^^^^
:func:`sub` replaces every occurrence of a pattern with a string or the
@@ -1864,7 +1878,7 @@ in each word of a sentence except for the first and last characters::
'Pofsroser Aodlambelk, plasee reoprt yuor asnebces potlmrpy.'
Finding all Adverbs
Finding all adverbs
^^^^^^^^^^^^^^^^^^^
:func:`findall` matches *all* occurrences of a pattern, not just the first
@@ -1877,7 +1891,7 @@ the following manner::
['carefully', 'quickly']
Finding all Adverbs and their Positions
Finding all adverbs and their positions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If one wants more information about all matches of a pattern than the matched
@@ -1893,7 +1907,7 @@ to find all of the adverbs *and their positions* in some text, they would use
40-47: quickly
Raw String Notation
Raw string notation
^^^^^^^^^^^^^^^^^^^
Raw string notation (``r"text"``) keeps regular expressions sane. Without it,
@@ -1917,7 +1931,7 @@ functionally identical::
<re.Match object; span=(0, 1), match='\\'>
Writing a Tokenizer
Writing a tokenizer
^^^^^^^^^^^^^^^^^^^
A `tokenizer or scanner <https://en.wikipedia.org/wiki/Lexical_analysis>`_

View file

@@ -945,9 +945,10 @@ pickle
re
--
* :func:`re.prefixmatch` and a corresponding :meth:`~re.Pattern.prefixmatch`
have been added as alternate more explicit names for the existing
:func:`re.match` and :meth:`~re.Pattern.match` APIs. These are intended
* :func:`re.prefixmatch` and a corresponding :meth:`re.Pattern.prefixmatch`
have been added as alternate, more explicit names for the existing
and now :term:`soft deprecated`
:func:`re.match` and :meth:`re.Pattern.match` APIs. These are intended
to be used to alleviate confusion around what *match* means by following the
Zen of Python's *"Explicit is better than implicit"* mantra. Most other
language regular expression libraries use an API named *match* to mean what
@@ -1685,6 +1686,27 @@ New deprecations
(Contributed by Bénédikt Tran in :gh:`134978`.)
* :mod:`re`:
* :func:`re.match` and :meth:`re.Pattern.match` are now
:term:`soft deprecated` in favor of the new :func:`re.prefixmatch` and
:meth:`re.Pattern.prefixmatch` APIs, which have been added as alternate,
more explicit names. These are intended to be used to alleviate confusion
around what *match* means by following the Zen of Python's *"Explicit is
better than implicit"* mantra. Most other language regular expression
libraries use an API named *match* to mean what Python has always called
*search*.
We **do not** plan to remove the older :func:`!match` name, as it has been
used in code for over 30 years. Code supporting older versions of Python
should continue to use :func:`!match`, while new code should prefer
:func:`!prefixmatch`. See :ref:`prefixmatch-vs-match`.
(Contributed by Gregory P. Smith in :gh:`86519` and
Hugo van Kemenade in :gh:`148100`.)
* :mod:`struct`:
* Calling the ``Struct.__new__()`` without required argument now is
@@ -1757,6 +1779,8 @@ New deprecations
.. include:: ../deprecations/pending-removal-in-future.rst
.. include:: ../deprecations/soft-deprecations.rst
C API changes
=============

View file

@@ -0,0 +1,3 @@
:term:`Soft deprecate <soft deprecated>` :func:`re.match` and
:meth:`re.Pattern.match` in favour of :func:`re.prefixmatch` and
:meth:`re.Pattern.prefixmatch`. Patch by Hugo van Kemenade.
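
Illustrative note, not part of the commit: the behavior the docs above describe could be sketched as follows. This is a minimal sketch and assumes, per the diff, that `re.prefixmatch` exists only on Python 3.15 and later. The module-level `prefixmatch` alias is a hypothetical compatibility shim that falls back to `re.match` on older versions.

```python
import re

# Hypothetical shim (an assumption, not from the commit): use re.prefixmatch
# where it exists (3.15+ per the diff), otherwise fall back to re.match,
# which the docs describe as the same API under the older name.
prefixmatch = getattr(re, "prefixmatch", re.match)

text = "warning: 4 errors"

# Anchored at the start of the string: no digits there, so no match.
anchored = prefixmatch(r"\d+", text)

# search() scans the whole string, which is what other languages'
# regex libraries usually call "match".
anywhere = re.search(r"\d+", text)

# The pattern does occur at the start here, so this one succeeds.
at_start = prefixmatch(r"warn", text)

print(anchored, anywhere.group(), at_start.group())
```

Code that must keep running on older Python versions can continue to call `re.match` directly, as the deprecation notes recommend; the alias is only one way to adopt the new name early.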