|
|
|
|
@ -1,4 +1,4 @@
|
|
|
|
|
# Autogenerated by Sphinx on Tue Jul 22 19:42:37 2025
|
|
|
|
|
# Autogenerated by Sphinx on Thu Aug 14 15:19:40 2025
|
|
|
|
|
# as part of the release process.
|
|
|
|
|
|
|
|
|
|
topics = {
|
|
|
|
|
@ -492,18 +492,65 @@ The transformation rule is defined as follows:
|
|
|
|
|
Python supports string and bytes literals and various numeric
|
|
|
|
|
literals:
|
|
|
|
|
|
|
|
|
|
literal: stringliteral | bytesliteral | NUMBER
|
|
|
|
|
literal: strings | NUMBER
|
|
|
|
|
|
|
|
|
|
Evaluation of a literal yields an object of the given type (string,
|
|
|
|
|
bytes, integer, floating-point number, complex number) with the given
|
|
|
|
|
value. The value may be approximated in the case of floating-point
|
|
|
|
|
and imaginary (complex) literals. See section Literals for details.
|
|
|
|
|
and imaginary (complex) literals. See section Literals for details.
|
|
|
|
|
See section String literal concatenation for details on "strings".
|
|
|
|
|
|
|
|
|
|
All literals correspond to immutable data types, and hence the
|
|
|
|
|
object’s identity is less important than its value. Multiple
|
|
|
|
|
evaluations of literals with the same value (either the same
|
|
|
|
|
occurrence in the program text or a different occurrence) may obtain
|
|
|
|
|
the same object or a different object with the same value.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
String literal concatenation
|
|
|
|
|
============================
|
|
|
|
|
|
|
|
|
|
Multiple adjacent string or bytes literals (delimited by whitespace),
|
|
|
|
|
possibly using different quoting conventions, are allowed, and their
|
|
|
|
|
meaning is the same as their concatenation:
|
|
|
|
|
|
|
|
|
|
>>> "hello" 'world'
|
|
|
|
|
'helloworld'
|
|
|
|
|
|
|
|
|
|
Formally:
|
|
|
|
|
|
|
|
|
|
strings: ( STRING | fstring)+ | tstring+
|
|
|
|
|
|
|
|
|
|
This feature is defined at the syntactical level, so it only works
|
|
|
|
|
with literals. To concatenate string expressions at run time, the ‘+’
|
|
|
|
|
operator may be used:
|
|
|
|
|
|
|
|
|
|
>>> greeting = "Hello"
|
|
|
|
|
>>> space = " "
|
|
|
|
|
>>> name = "Blaise"
|
|
|
|
|
>>> print(greeting + space + name) # not: print(greeting space name)
|
|
|
|
|
Hello Blaise
|
|
|
|
|
|
|
|
|
|
Literal concatenation can freely mix raw strings, triple-quoted
|
|
|
|
|
strings, and formatted string literals. For example:
|
|
|
|
|
|
|
|
|
|
>>> "Hello" r', ' f"{name}!"
|
|
|
|
|
'Hello, Blaise!'
|
|
|
|
|
|
|
|
|
|
This feature can be used to reduce the number of backslashes needed,
|
|
|
|
|
to split long strings conveniently across long lines, or even to add
|
|
|
|
|
comments to parts of strings. For example:
|
|
|
|
|
|
|
|
|
|
re.compile("[A-Za-z_]" # letter or underscore
|
|
|
|
|
"[A-Za-z0-9_]*" # letter, digit or underscore
|
|
|
|
|
)
|
|
|
|
|
|
|
|
|
|
However, bytes literals may only be combined with other bytes literals;
|
|
|
|
|
not with string literals of any kind. Also, template string literals
|
|
|
|
|
may only be combined with other template string literals:
|
|
|
|
|
|
|
|
|
|
>>> t"Hello" t"{name}!"
|
|
|
|
|
Template(strings=('Hello', '!'), interpolations=(...))
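The restriction on bytes literals can be checked with a short sketch. The name "mixed_error" and the use of "compile()" to capture the error are choices made for this illustration, not part of the reference text:

```python
# Adjacent bytes literals are merged at compile time, like str literals.
merged = b"ab" b"cd"

# Mixing bytes and str literals is rejected by the compiler. compile()
# is used here so the error can be caught instead of stopping the script.
try:
    compile('b"ab" "cd"', "<example>", "eval")
    mixed_error = None
except SyntaxError as exc:
    mixed_error = exc

print(merged)
print(type(mixed_error).__name__)
```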
|
|
|
|
|
''',
|
|
|
|
|
'attribute-access': r'''Customizing attribute access
|
|
|
|
|
****************************
|
|
|
|
|
@ -10209,71 +10256,112 @@ str.zfill(width)
|
|
|
|
|
'strings': '''String and Bytes literals
|
|
|
|
|
*************************
|
|
|
|
|
|
|
|
|
|
String literals are described by the following lexical definitions:
|
|
|
|
|
String literals are text enclosed in single quotes ("'") or double
|
|
|
|
|
quotes ("""). For example:
|
|
|
|
|
|
|
|
|
|
stringliteral: [stringprefix](shortstring | longstring)
|
|
|
|
|
stringprefix: "r" | "u" | "R" | "U" | "f" | "F" | "t" | "T"
|
|
|
|
|
| "fr" | "Fr" | "fR" | "FR" | "rf" | "rF" | "Rf" | "RF"
|
|
|
|
|
| "tr" | "Tr" | "tR" | "TR" | "rt" | "rT" | "Rt" | "RT"
|
|
|
|
|
shortstring: "'" shortstringitem* "'" | '"' shortstringitem* '"'
|
|
|
|
|
longstring: "\'\'\'" longstringitem* "\'\'\'" | '"""' longstringitem* '"""'
|
|
|
|
|
shortstringitem: shortstringchar | stringescapeseq
|
|
|
|
|
longstringitem: longstringchar | stringescapeseq
|
|
|
|
|
shortstringchar: <any source character except "\\" or newline or the quote>
|
|
|
|
|
longstringchar: <any source character except "\\">
|
|
|
|
|
stringescapeseq: "\\" <any source character>
|
|
|
|
|
"spam"
|
|
|
|
|
'eggs'
|
|
|
|
|
|
|
|
|
|
bytesliteral: bytesprefix(shortbytes | longbytes)
|
|
|
|
|
bytesprefix: "b" | "B" | "br" | "Br" | "bR" | "BR" | "rb" | "rB" | "Rb" | "RB"
|
|
|
|
|
shortbytes: "'" shortbytesitem* "'" | '"' shortbytesitem* '"'
|
|
|
|
|
longbytes: "\'\'\'" longbytesitem* "\'\'\'" | '"""' longbytesitem* '"""'
|
|
|
|
|
shortbytesitem: shortbyteschar | bytesescapeseq
|
|
|
|
|
longbytesitem: longbyteschar | bytesescapeseq
|
|
|
|
|
shortbyteschar: <any ASCII character except "\\" or newline or the quote>
|
|
|
|
|
longbyteschar: <any ASCII character except "\\">
|
|
|
|
|
bytesescapeseq: "\\" <any ASCII character>
|
|
|
|
|
The quote used to start the literal also terminates it, so a string
|
|
|
|
|
literal can only contain the other quote (except with escape
|
|
|
|
|
sequences, see below). For example:
|
|
|
|
|
|
|
|
|
|
One syntactic restriction not indicated by these productions is that
|
|
|
|
|
whitespace is not allowed between the "stringprefix" or "bytesprefix"
|
|
|
|
|
and the rest of the literal. The source character set is defined by
|
|
|
|
|
the encoding declaration; it is UTF-8 if no encoding declaration is
|
|
|
|
|
given in the source file; see section Encoding declarations.
|
|
|
|
|
'Say "Hello", please.'
|
|
|
|
|
"Don't do that!"
|
|
|
|
|
|
|
|
|
|
In plain English: Both types of literals can be enclosed in matching
|
|
|
|
|
single quotes ("'") or double quotes ("""). They can also be enclosed
|
|
|
|
|
in matching groups of three single or double quotes (these are
|
|
|
|
|
generally referred to as *triple-quoted strings*). The backslash ("\\")
|
|
|
|
|
character is used to give special meaning to otherwise ordinary
|
|
|
|
|
characters like "n", which means ‘newline’ when escaped ("\\n"). It can
|
|
|
|
|
also be used to escape characters that otherwise have a special
|
|
|
|
|
meaning, such as newline, backslash itself, or the quote character.
|
|
|
|
|
See escape sequences below for examples.
|
|
|
|
|
Except for this limitation, the choice of quote character ("'" or """)
|
|
|
|
|
does not affect how the literal is parsed.
|
|
|
|
|
|
|
|
|
|
Bytes literals are always prefixed with "'b'" or "'B'"; they produce
|
|
|
|
|
an instance of the "bytes" type instead of the "str" type. They may
|
|
|
|
|
only contain ASCII characters; bytes with a numeric value of 128 or
|
|
|
|
|
greater must be expressed with escapes.
|
|
|
|
|
Inside a string literal, the backslash ("\\") character introduces an
|
|
|
|
|
*escape sequence*, which has special meaning depending on the
|
|
|
|
|
character after the backslash. For example, "\\"" denotes the double
|
|
|
|
|
quote character, and does *not* end the string:
|
|
|
|
|
|
|
|
|
|
Both string and bytes literals may optionally be prefixed with a
|
|
|
|
|
letter "'r'" or "'R'"; such constructs are called *raw string
|
|
|
|
|
literals* and *raw bytes literals* respectively and treat backslashes
|
|
|
|
|
as literal characters. As a result, in raw string literals, "'\\U'"
|
|
|
|
|
and "'\\u'" escapes are not treated specially.
|
|
|
|
|
>>> print("Say \\"Hello\\" to everyone!")
|
|
|
|
|
Say "Hello" to everyone!
|
|
|
|
|
|
|
|
|
|
See escape sequences below for a full list of such sequences, and more
|
|
|
|
|
details.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Triple-quoted strings
|
|
|
|
|
=====================
|
|
|
|
|
|
|
|
|
|
Strings can also be enclosed in matching groups of three single or
|
|
|
|
|
double quotes. These are generally referred to as *triple-quoted
|
|
|
|
|
strings*:
|
|
|
|
|
|
|
|
|
|
"""This is a triple-quoted string."""
|
|
|
|
|
|
|
|
|
|
In triple-quoted literals, unescaped quotes are allowed (and are
|
|
|
|
|
retained), except that three unescaped quotes in a row terminate the
|
|
|
|
|
literal, if they are of the same kind ("'" or """) used at the start:
|
|
|
|
|
|
|
|
|
|
"""This string has "quotes" inside."""
|
|
|
|
|
|
|
|
|
|
Unescaped newlines are also allowed and retained:
|
|
|
|
|
|
|
|
|
|
\'\'\'This triple-quoted string
|
|
|
|
|
continues on the next line.\'\'\'
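As a small sketch of both rules, the following literal keeps its unescaped quotes and its newline. The variable name "text" is only illustrative:

```python
# Unescaped double quotes and a real newline inside a triple-quoted literal.
text = """She said "hi"
and left."""

# The newline typed in the source is part of the resulting string.
print(text.splitlines())
```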
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
String prefixes
|
|
|
|
|
===============
|
|
|
|
|
|
|
|
|
|
String literals can have an optional *prefix* that influences how the
|
|
|
|
|
content of the literal is parsed, for example:
|
|
|
|
|
|
|
|
|
|
b"data"
|
|
|
|
|
f'{result=}'
|
|
|
|
|
|
|
|
|
|
The allowed prefixes are:
|
|
|
|
|
|
|
|
|
|
* "b": Bytes literal
|
|
|
|
|
|
|
|
|
|
* "r": Raw string
|
|
|
|
|
|
|
|
|
|
* "f": Formatted string literal (“f-string”)
|
|
|
|
|
|
|
|
|
|
* "t": Template string literal (“t-string”)
|
|
|
|
|
|
|
|
|
|
* "u": No effect (allowed for backwards compatibility)
|
|
|
|
|
|
|
|
|
|
See the linked sections for details on each type.
|
|
|
|
|
|
|
|
|
|
Prefixes are case-insensitive (for example, "B" works the same as
|
|
|
|
|
"b"). The "r" prefix can be combined with "f", "t" or "b", so "fr",
|
|
|
|
|
"rf", "tr", "rt", "br" and "rb" are also valid prefixes.
|
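The prefix rules above can be sketched as follows. The example is written as ordinary Python source, so backslashes appear undoubled, and the variable names are illustrative only:

```python
# Case and order of a combined prefix do not matter.
raw_bytes_variants = [rb"\n", Rb"\n", bR"\n", br"\n", BR"\n"]
all_equal = all(v == b"\\n" for v in raw_bytes_variants)

# The "u" prefix has no effect on the resulting str.
legacy_same = u"text" == "text"

# "r" combined with "f": the field is replaced, the backslash is kept.
name = "x"
raw_formatted = rf"{name}\n"
```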
|
|
|
|
|
|
|
|
|
Added in version 3.3: The "'rb'" prefix of raw bytes literals has been
|
|
|
|
|
added as a synonym of "'br'". Support for the Unicode legacy literal
|
|
|
|
|
("u'value'") was reintroduced to simplify the maintenance of dual
|
|
|
|
|
Python 2.x and 3.x codebases. See **PEP 414** for more information.
|
|
|
|
|
|
|
|
|
|
A string literal with "f" or "F" in its prefix is a *formatted string
|
|
|
|
|
literal*; see f-strings. The "f" may be combined with "r", but not
|
|
|
|
|
with "b" or "u", therefore raw formatted strings are possible, but
|
|
|
|
|
formatted bytes literals are not.
|
|
|
|
|
|
|
|
|
|
In triple-quoted literals, unescaped newlines and quotes are allowed
|
|
|
|
|
(and are retained), except that three unescaped quotes in a row
|
|
|
|
|
terminate the literal. (A “quote” is the character used to open the
|
|
|
|
|
literal, i.e. either "'" or """.)
|
|
|
|
|
Formal grammar
|
|
|
|
|
==============
|
|
|
|
|
|
|
|
|
|
String literals, except “f-strings” and “t-strings”, are described by
|
|
|
|
|
the following lexical definitions.
|
|
|
|
|
|
|
|
|
|
These definitions use negative lookaheads ("!") to indicate that an
|
|
|
|
|
ending quote ends the literal.
|
|
|
|
|
|
|
|
|
|
STRING: [stringprefix] (stringcontent)
|
|
|
|
|
stringprefix: <("r" | "u" | "b" | "br" | "rb"), case-insensitive>
|
|
|
|
|
stringcontent:
|
|
|
|
|
| "'" ( !"'" stringitem)* "'"
|
|
|
|
|
| '"' ( !'"' stringitem)* '"'
|
|
|
|
|
| "\'\'\'" ( !"\'\'\'" longstringitem)* "\'\'\'"
|
|
|
|
|
| '"""' ( !'"""' longstringitem)* '"""'
|
|
|
|
|
stringitem: stringchar | stringescapeseq
|
|
|
|
|
stringchar: <any source_character, except backslash and newline>
|
|
|
|
|
longstringitem: stringitem | newline
|
|
|
|
|
stringescapeseq: "\\" <any source_character>
|
|
|
|
|
|
|
|
|
|
Note that as in all lexical definitions, whitespace is significant. In
|
|
|
|
|
particular, the prefix (if any) must be immediately followed by the
|
|
|
|
|
starting quote.
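A quick way to observe this rule is to compile source text with and without a space after the prefix. The helper "compiles()" is a hypothetical name introduced for this sketch:

```python
def compiles(source):
    """Return True if source is a valid Python expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False

# The prefix directly followed by the quote is one bytes literal.
adjacent_ok = compiles('b"data"')

# With a space, "b" is read as a name followed by a separate string
# literal, which is not a valid expression.
separated_ok = compiles('b "data"')
```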
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Escape sequences
|
|
|
|
|
@ -10283,101 +10371,197 @@ Unless an "'r'" or "'R'" prefix is present, escape sequences in string
|
|
|
|
|
and bytes literals are interpreted according to rules similar to those
|
|
|
|
|
used by Standard C. The recognized escape sequences are:
|
|
|
|
|
|
|
|
|
|
+---------------------------+-----------------------------------+---------+
|
|
|
|
|
| Escape Sequence | Meaning | Notes |
|
|
|
|
|
|===========================|===================================|=========|
|
|
|
|
|
| "\\"<newline> | Backslash and newline ignored | (1) |
|
|
|
|
|
+---------------------------+-----------------------------------+---------+
|
|
|
|
|
| "\\\\" | Backslash ("\\") | |
|
|
|
|
|
+---------------------------+-----------------------------------+---------+
|
|
|
|
|
| "\\'" | Single quote ("'") | |
|
|
|
|
|
+---------------------------+-----------------------------------+---------+
|
|
|
|
|
| "\\"" | Double quote (""") | |
|
|
|
|
|
+---------------------------+-----------------------------------+---------+
|
|
|
|
|
| "\\a" | ASCII Bell (BEL) | |
|
|
|
|
|
+---------------------------+-----------------------------------+---------+
|
|
|
|
|
| "\\b" | ASCII Backspace (BS) | |
|
|
|
|
|
+---------------------------+-----------------------------------+---------+
|
|
|
|
|
| "\\f" | ASCII Formfeed (FF) | |
|
|
|
|
|
+---------------------------+-----------------------------------+---------+
|
|
|
|
|
| "\\n" | ASCII Linefeed (LF) | |
|
|
|
|
|
+---------------------------+-----------------------------------+---------+
|
|
|
|
|
| "\\r" | ASCII Carriage Return (CR) | |
|
|
|
|
|
+---------------------------+-----------------------------------+---------+
|
|
|
|
|
| "\\t" | ASCII Horizontal Tab (TAB) | |
|
|
|
|
|
+---------------------------+-----------------------------------+---------+
|
|
|
|
|
| "\\v" | ASCII Vertical Tab (VT) | |
|
|
|
|
|
+---------------------------+-----------------------------------+---------+
|
|
|
|
|
| "\\*ooo*" | Character with octal value *ooo* | (2,4) |
|
|
|
|
|
+---------------------------+-----------------------------------+---------+
|
|
|
|
|
| "\\x*hh*" | Character with hex value *hh* | (3,4) |
|
|
|
|
|
+---------------------------+-----------------------------------+---------+
|
|
|
|
|
+----------------------------------------------------+----------------------------------------------------+
|
|
|
|
|
| Escape Sequence | Meaning |
|
|
|
|
|
|====================================================|====================================================|
|
|
|
|
|
| "\\"<newline> | Ignored end of line |
|
|
|
|
|
+----------------------------------------------------+----------------------------------------------------+
|
|
|
|
|
| "\\\\" | Backslash |
|
|
|
|
|
+----------------------------------------------------+----------------------------------------------------+
|
|
|
|
|
| "\\'" | Single quote |
|
|
|
|
|
+----------------------------------------------------+----------------------------------------------------+
|
|
|
|
|
| "\\"" | Double quote |
|
|
|
|
|
+----------------------------------------------------+----------------------------------------------------+
|
|
|
|
|
| "\\a" | ASCII Bell (BEL) |
|
|
|
|
|
+----------------------------------------------------+----------------------------------------------------+
|
|
|
|
|
| "\\b" | ASCII Backspace (BS) |
|
|
|
|
|
+----------------------------------------------------+----------------------------------------------------+
|
|
|
|
|
| "\\f" | ASCII Formfeed (FF) |
|
|
|
|
|
+----------------------------------------------------+----------------------------------------------------+
|
|
|
|
|
| "\\n" | ASCII Linefeed (LF) |
|
|
|
|
|
+----------------------------------------------------+----------------------------------------------------+
|
|
|
|
|
| "\\r" | ASCII Carriage Return (CR) |
|
|
|
|
|
+----------------------------------------------------+----------------------------------------------------+
|
|
|
|
|
| "\\t" | ASCII Horizontal Tab (TAB) |
|
|
|
|
|
+----------------------------------------------------+----------------------------------------------------+
|
|
|
|
|
| "\\v" | ASCII Vertical Tab (VT) |
|
|
|
|
|
+----------------------------------------------------+----------------------------------------------------+
|
|
|
|
|
| "\\*ooo*" | Octal character |
|
|
|
|
|
+----------------------------------------------------+----------------------------------------------------+
|
|
|
|
|
| "\\x*hh*" | Hexadecimal character |
|
|
|
|
|
+----------------------------------------------------+----------------------------------------------------+
|
|
|
|
|
| "\\N{*name*}" | Named Unicode character |
|
|
|
|
|
+----------------------------------------------------+----------------------------------------------------+
|
|
|
|
|
| "\\u*xxxx*" | Hexadecimal Unicode character |
|
|
|
|
|
+----------------------------------------------------+----------------------------------------------------+
|
|
|
|
|
| "\\U*xxxxxxxx*" | Hexadecimal Unicode character |
|
|
|
|
|
+----------------------------------------------------+----------------------------------------------------+
|
|
|
|
|
|
|
|
|
|
Escape sequences only recognized in string literals are:
|
|
|
|
|
|
|
|
|
|
+---------------------------+-----------------------------------+---------+
|
|
|
|
|
| Escape Sequence | Meaning | Notes |
|
|
|
|
|
|===========================|===================================|=========|
|
|
|
|
|
| "\\N{*name*}" | Character named *name* in the | (5) |
|
|
|
|
|
| | Unicode database | |
|
|
|
|
|
+---------------------------+-----------------------------------+---------+
|
|
|
|
|
| "\\u*xxxx*" | Character with 16-bit hex value | (6) |
|
|
|
|
|
| | *xxxx* | |
|
|
|
|
|
+---------------------------+-----------------------------------+---------+
|
|
|
|
|
| "\\U*xxxxxxxx*" | Character with 32-bit hex value | (7) |
|
|
|
|
|
| | *xxxxxxxx* | |
|
|
|
|
|
+---------------------------+-----------------------------------+---------+
|
|
|
|
|
Ignored end of line
|
|
|
|
|
-------------------
|
|
|
|
|
|
|
|
|
|
Notes:
|
|
|
|
|
A backslash can be added at the end of a line to ignore the newline:
|
|
|
|
|
|
|
|
|
|
1. A backslash can be added at the end of a line to ignore the
|
|
|
|
|
newline:
|
|
|
|
|
>>> 'This string will not include \\
|
|
|
|
|
... backslashes or newline characters.'
|
|
|
|
|
'This string will not include backslashes or newline characters.'
|
|
|
|
|
|
|
|
|
|
>>> 'This string will not include \\
|
|
|
|
|
... backslashes or newline characters.'
|
|
|
|
|
'This string will not include backslashes or newline characters.'
|
|
|
|
|
The same result can be achieved using triple-quoted strings, or
|
|
|
|
|
parentheses and string literal concatenation.
|
|
|
|
|
|
|
|
|
|
The same result can be achieved using triple-quoted strings, or
|
|
|
|
|
parentheses and string literal concatenation.
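The equivalence with parentheses and string literal concatenation can be sketched like this, using illustrative variable names:

```python
# Backslash-newline inside the literal: neither character is kept.
with_backslash = 'This string will not include \
backslashes or newline characters.'

# The same text built from two adjacent literals inside parentheses.
with_parens = ('This string will not include '
               'backslashes or newline characters.')

print(with_backslash == with_parens)
```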
|
|
|
|
|
|
|
|
|
|
2. As in Standard C, up to three octal digits are accepted.
|
|
|
|
|
Escaped characters
|
|
|
|
|
------------------
|
|
|
|
|
|
|
|
|
|
Changed in version 3.11: Octal escapes with value larger than
|
|
|
|
|
"0o377" produce a "DeprecationWarning".
|
|
|
|
|
To include a backslash in a non-raw Python string literal, it must be
|
|
|
|
|
doubled. The "\\\\" escape sequence denotes a single backslash
|
|
|
|
|
character:
|
|
|
|
|
|
|
|
|
|
Changed in version 3.12: Octal escapes with value larger than
|
|
|
|
|
"0o377" produce a "SyntaxWarning". In a future Python version they
|
|
|
|
|
will be eventually a "SyntaxError".
|
|
|
|
|
>>> print('C:\\\\Program Files')
|
|
|
|
|
C:\\Program Files
|
|
|
|
|
|
|
|
|
|
3. Unlike in Standard C, exactly two hex digits are required.
|
|
|
|
|
Similarly, the "\\'" and "\\"" sequences denote the single and double
|
|
|
|
|
quote character, respectively:
|
|
|
|
|
|
|
|
|
|
4. In a bytes literal, hexadecimal and octal escapes denote the byte
|
|
|
|
|
with the given value. In a string literal, these escapes denote a
|
|
|
|
|
Unicode character with the given value.
|
|
|
|
|
>>> print('\\' and \\"')
|
|
|
|
|
' and "
|
|
|
|
|
|
|
|
|
|
5. Changed in version 3.3: Support for name aliases [1] has been
|
|
|
|
|
added.
|
|
|
|
|
|
|
|
|
|
6. Exactly four hex digits are required.
|
|
|
|
|
Octal character
|
|
|
|
|
---------------
|
|
|
|
|
|
|
|
|
|
7. Any Unicode character can be encoded this way. Exactly eight hex
|
|
|
|
|
digits are required.
|
|
|
|
|
The sequence "\\*ooo*" denotes a *character* with the octal (base 8)
|
|
|
|
|
value *ooo*:
|
|
|
|
|
|
|
|
|
|
Unlike Standard C, all unrecognized escape sequences are left in the
|
|
|
|
|
string unchanged, i.e., *the backslash is left in the result*. (This
|
|
|
|
|
behavior is useful when debugging: if an escape sequence is mistyped,
|
|
|
|
|
the resulting output is more easily recognized as broken.) It is also
|
|
|
|
|
important to note that the escape sequences only recognized in string
|
|
|
|
|
literals fall into the category of unrecognized escapes for bytes
|
|
|
|
|
literals.
|
|
|
|
|
>>> '\\120'
|
|
|
|
|
'P'
|
|
|
|
|
|
|
|
|
|
Up to three octal digits (0 through 7) are accepted.
|
|
|
|
|
|
|
|
|
|
In a bytes literal, *character* means a *byte* with the given value.
|
|
|
|
|
In a string literal, it means a Unicode character with the given
|
|
|
|
|
value.
|
|
|
|
|
|
|
|
|
|
Changed in version 3.11: Octal escapes with value larger than "0o377"
|
|
|
|
|
(255) produce a "DeprecationWarning".
|
|
|
|
|
|
|
|
|
|
Changed in version 3.12: Octal escapes with value larger than "0o377"
|
|
|
|
|
(255) produce a "SyntaxWarning". In a future Python version they will
|
|
|
|
|
raise a "SyntaxError".
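A brief sketch of the octal rules, written as ordinary source with single backslashes. The variable names are illustrative:

```python
# At most three octal digits are consumed; the fourth digit is ordinary text.
two_chars = "\1201"

# A single digit is enough, which is why "\0" is a common NUL escape.
nul = "\0"

# In a bytes literal the same escape denotes a byte value.
byte_value = b"\120"[0]
```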
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Hexadecimal character
|
|
|
|
|
---------------------
|
|
|
|
|
|
|
|
|
|
The sequence "\\x*hh*" denotes a *character* with the hex (base 16)
|
|
|
|
|
value *hh*:
|
|
|
|
|
|
|
|
|
|
>>> '\\x50'
|
|
|
|
|
'P'
|
|
|
|
|
|
|
|
|
|
Unlike in Standard C, exactly two hex digits are required.
|
|
|
|
|
|
|
|
|
|
In a bytes literal, *character* means a *byte* with the given value.
|
|
|
|
|
In a string literal, it means a Unicode character with the given
|
|
|
|
|
value.
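The difference between the two literal types can be sketched as follows. The names "text" and "data" are illustrative:

```python
# In a str literal, \xe9 is the Unicode character U+00E9.
text = "\xe9"

# In a bytes literal, \xe9 is the single byte 233.
data = b"\xe9"

# The two only coincide under an encoding that maps U+00E9 to byte 0xE9.
utf8_bytes = text.encode("utf-8")
latin1_bytes = text.encode("latin-1")
```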
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Named Unicode character
|
|
|
|
|
-----------------------
|
|
|
|
|
|
|
|
|
|
The sequence "\\N{*name*}" denotes a Unicode character with the given
|
|
|
|
|
*name*:
|
|
|
|
|
|
|
|
|
|
>>> '\\N{LATIN CAPITAL LETTER P}'
|
|
|
|
|
'P'
|
|
|
|
|
>>> '\\N{SNAKE}'
|
|
|
|
|
'🐍'
|
|
|
|
|
|
|
|
|
|
This sequence cannot appear in bytes literals.
|
|
|
|
|
|
|
|
|
|
Changed in version 3.3: Support for name aliases has been added.
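As a sketch, names accepted by this escape can be cross-checked against the standard "unicodedata" module. The variable names are illustrative:

```python
import unicodedata

snake = "\N{SNAKE}"

# unicodedata.lookup() resolves a character name at run time.
same_via_lookup = unicodedata.lookup("SNAKE") == snake

# unicodedata.name() goes the other way, from character to name.
name_of_p = unicodedata.name("P")
```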
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Hexadecimal Unicode characters
|
|
|
|
|
------------------------------
|
|
|
|
|
|
|
|
|
|
These sequences "\\u*xxxx*" and "\\U*xxxxxxxx*" denote the Unicode
|
|
|
|
|
character with the given hex (base 16) value. Exactly four digits are
|
|
|
|
|
required for "\\u"; exactly eight digits are required for "\\U". The
|
|
|
|
|
latter can encode any Unicode character.
|
|
|
|
|
|
|
|
|
|
>>> '\\u1234'
|
|
|
|
|
'ሴ'
|
|
|
|
|
>>> '\\U0001f40d'
|
|
|
|
|
'🐍'
|
|
|
|
|
|
|
|
|
|
These sequences cannot appear in bytes literals.
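A short sketch relating both forms to "chr()", with illustrative variable names:

```python
# Four hex digits cover code points up to U+FFFF.
bmp = "\u1234"

# Eight hex digits are needed for code points above U+FFFF.
astral = "\U0001f40d"

# Either way the result is a single character.
lengths = (len(bmp), len(astral))
```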
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Unrecognized escape sequences
|
|
|
|
|
-----------------------------
|
|
|
|
|
|
|
|
|
|
Unlike in Standard C, all unrecognized escape sequences are left in
|
|
|
|
|
the string unchanged, that is, *the backslash is left in the result*:
|
|
|
|
|
|
|
|
|
|
>>> print('\\q')
|
|
|
|
|
\\q
|
|
|
|
|
>>> list('\\q')
|
|
|
|
|
['\\\\', 'q']
|
|
|
|
|
|
|
|
|
|
Note that for bytes literals, the escape sequences only recognized in
|
|
|
|
|
string literals ("\\N...", "\\u...", "\\U...") fall into the category of
|
|
|
|
|
unrecognized escapes.
|
|
|
|
|
|
|
|
|
|
Changed in version 3.6: Unrecognized escape sequences produce a
|
|
|
|
|
"DeprecationWarning".
|
|
|
|
|
|
|
|
|
|
Changed in version 3.12: Unrecognized escape sequences produce a
|
|
|
|
|
"SyntaxWarning". In a future Python version they will be eventually a
|
|
|
|
|
"SyntaxWarning". In a future Python version they will raise a
|
|
|
|
|
"SyntaxError".
|
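The warning can be observed by compiling a literal at run time, as in this sketch. The variable names are illustrative, and the warning category reported depends on the Python version, as described above:

```python
import warnings

# The source text below is the four characters '\q' (quotes included).
# A raw string keeps this example itself free of unrecognized escapes.
source = r"'\q'"

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    value = eval(compile(source, "<example>", "eval"))

# The backslash is kept, so the value has two characters.
print(list(value))
print([w.category.__name__ for w in caught])
```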
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Bytes literals
|
|
|
|
|
==============
|
|
|
|
|
|
|
|
|
|
*Bytes literals* are always prefixed with "'b'" or "'B'"; they produce
|
|
|
|
|
an instance of the "bytes" type instead of the "str" type. They may
|
|
|
|
|
only contain ASCII characters; bytes with a numeric value of 128 or
|
|
|
|
|
greater must be expressed with escape sequences (typically Hexadecimal
|
|
|
|
|
character or Octal character):
|
|
|
|
|
|
|
|
|
|
>>> b'\\x89PNG\\r\\n\\x1a\\n'
|
|
|
|
|
b'\\x89PNG\\r\\n\\x1a\\n'
|
|
|
|
|
>>> list(b'\\x89PNG\\r\\n\\x1a\\n')
|
|
|
|
|
[137, 80, 78, 71, 13, 10, 26, 10]
|
|
|
|
|
|
|
|
|
|
Similarly, a zero byte must be expressed using an escape sequence
|
|
|
|
|
(typically "\\0" or "\\x00").
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Raw string literals
|
|
|
|
|
===================
|
|
|
|
|
|
|
|
|
|
Both string and bytes literals may optionally be prefixed with a
|
|
|
|
|
letter "'r'" or "'R'"; such constructs are called *raw string
|
|
|
|
|
literals* and *raw bytes literals* respectively and treat backslashes
|
|
|
|
|
as literal characters. As a result, in raw string literals, escape
|
|
|
|
|
sequences are not treated specially:
|
|
|
|
|
|
|
|
|
|
>>> r'\\d{4}-\\d{2}-\\d{2}'
|
|
|
|
|
'\\\\d{4}-\\\\d{2}-\\\\d{2}'
|
|
|
|
|
|
|
|
|
|
Even in a raw literal, quotes can be escaped with a backslash, but the
|
|
|
|
|
backslash remains in the result; for example, "r"\\""" is a valid
|
|
|
|
|
string literal consisting of two characters: a backslash and a double
|
|
|
|
|
@ -10387,6 +10571,201 @@ cannot end in a single backslash* (since the backslash would escape
|
|
|
|
|
the following quote character). Note also that a single backslash
|
|
|
|
|
followed by a newline is interpreted as those two characters as part
|
|
|
|
|
of the literal, *not* as a line continuation.
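These rules can be sketched as follows, written as ordinary source with single backslashes. The helper "compiles()" and the variable names are introduced only for this illustration:

```python
def compiles(source):
    """Return True if source is a valid Python expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False

# The backslash escapes the quote but stays in the result.
quote_pair = r"\""

# A raw literal ending in one backslash is unterminated.
trailing_backslash_ok = compiles("r'abc\\'")

# One workaround: append a non-raw literal holding the final backslash.
path = r"C:\Program Files" "\\"
```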
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
f-strings
|
|
|
|
|
=========
|
|
|
|
|
|
|
|
|
|
Added in version 3.6.
|
|
|
|
|
|
|
|
|
|
A *formatted string literal* or *f-string* is a string literal that is
|
|
|
|
|
prefixed with "f" or "F". These strings may contain replacement
|
|
|
|
|
fields, which are expressions delimited by curly braces "{}". While
|
|
|
|
|
other string literals always have a constant value, formatted strings
|
|
|
|
|
are really expressions evaluated at run time.
|
|
|
|
|
|
|
|
|
|
Escape sequences are decoded like in ordinary string literals (except
|
|
|
|
|
when a literal is also marked as a raw string). After decoding, the
|
|
|
|
|
grammar for the contents of the string is:
|
|
|
|
|
|
|
|
|
|
f_string: (literal_char | "{{" | "}}" | replacement_field)*
|
|
|
|
|
replacement_field: "{" f_expression ["="] ["!" conversion] [":" format_spec] "}"
|
|
|
|
|
f_expression: (conditional_expression | "*" or_expr)
|
|
|
|
|
("," conditional_expression | "," "*" or_expr)* [","]
|
|
|
|
|
| yield_expression
|
|
|
|
|
conversion: "s" | "r" | "a"
|
|
|
|
|
format_spec: (literal_char | replacement_field)*
|
|
|
|
|
literal_char: <any code point except "{", "}" or NULL>
|
|
|
|
|
|
|
|
|
|
The parts of the string outside curly braces are treated literally,
|
|
|
|
|
except that any doubled curly braces "'{{'" or "'}}'" are replaced
|
|
|
|
|
with the corresponding single curly brace. A single opening curly
|
|
|
|
|
bracket "'{'" marks a replacement field, which starts with a Python
|
|
|
|
|
expression. To display both the expression text and its value after
|
|
|
|
|
evaluation (useful in debugging), an equal sign "'='" may be added
|
|
|
|
|
after the expression. A conversion field, introduced by an exclamation
|
|
|
|
|
point "'!'" may follow. A format specifier may also be appended,
|
|
|
|
|
introduced by a colon "':'". A replacement field ends with a closing
|
|
|
|
|
curly bracket "'}'".
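The parts of a replacement field can be sketched one at a time. The variable names are illustrative, and the last line needs Python 3.8 or later for the "=" form:

```python
value = 3.14159

# Doubled braces produce literal braces around the formatted value.
braced = f"{{{value:.2f}}}"

# "!r" applies repr() first; ":>8" then right-aligns the result in 8 columns.
converted = f"{'hi'!r:>8}"

# "=" echoes the expression text ahead of the formatted value.
debug = f"{value = :.1f}"
```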
|
|
|
|
|
|
|
|
|
|
Expressions in formatted string literals are treated like regular
|
|
|
|
|
Python expressions surrounded by parentheses, with a few exceptions.
|
|
|
|
|
An empty expression is not allowed, and both "lambda" and assignment
|
|
|
|
|
expressions ":=" must be surrounded by explicit parentheses. Each
|
|
|
|
|
expression is evaluated in the context where the formatted string
|
|
|
|
|
literal appears, in order from left to right. Replacement expressions
|
|
|
|
|
can contain newlines in both single-quoted and triple-quoted f-strings
|
|
|
|
|
and they can contain comments. Everything that comes after a "#"
|
|
|
|
|
inside a replacement field is a comment (even closing braces and
|
|
|
|
|
quotes). In that case, replacement fields must be closed on a
|
|
|
|
|
different line.
|
|
|
|
|
|
|
|
|
|
>>> a = 2
>>> f"abc{a # This is a comment }"
|
|
|
|
|
... + 3}"
|
|
|
|
|
'abc5'
|
|
|
|
|
|
|
|
|
|
Changed in version 3.7: Prior to Python 3.7, an "await" expression and
|
|
|
|
|
comprehensions containing an "async for" clause were illegal in the
|
|
|
|
|
expressions in formatted string literals due to a problem with the
|
|
|
|
|
implementation.
|
|
|
|
|
|
|
|
|
|
Changed in version 3.12: Prior to Python 3.12, comments were not
|
|
|
|
|
allowed inside f-string replacement fields.
|
|
|
|
|
|
|
|
|
|
When the equal sign "'='" is provided, the output will have the
|
|
|
|
|
expression text, the "'='" and the evaluated value. Spaces after the
|
|
|
|
|
opening brace "'{'", within the expression and after the "'='" are all
|
|
|
|
|
retained in the output. By default, the "'='" causes the "repr()" of
|
|
|
|
|
the expression to be provided, unless there is a format specified.
|
|
|
|
|
When a format is specified it defaults to the "str()" of the
|
|
|
|
|
expression unless a conversion "'!r'" is declared.
|
|
|
|
|
|
|
|
|
|
Added in version 3.8: The equal sign "'='".
|
|
|
|
|
|
|
|
|
|
If a conversion is specified, the result of evaluating the expression
|
|
|
|
|
is converted before formatting. Conversion "'!s'" calls "str()" on
|
|
|
|
|
the result, "'!r'" calls "repr()", and "'!a'" calls "ascii()".
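A minimal sketch of the three conversions applied to one non-ASCII string, with illustrative variable names:

```python
name = "Zo\xeb"  # "Zoe" with a diaeresis on the last letter

as_str = f"{name!s}"      # str(): the text itself
as_repr = f"{name!r}"     # repr(): quoted, non-ASCII kept
as_ascii = f"{name!a}"    # ascii(): quoted, non-ASCII escaped
```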
|
|
|
|
|
|
|
|
|
|
The result is then formatted using the "format()" protocol. The
|
|
|
|
|
format specifier is passed to the "__format__()" method of the
|
|
|
|
|
expression or conversion result. An empty string is passed when the
|
|
|
|
|
format specifier is omitted. The formatted result is then included in
|
|
|
|
|
the final value of the whole string.
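The protocol can be made visible with a class that records the specifier it receives. "Tag" and its "upper" specifier are hypothetical and exist only for this sketch:

```python
class Tag:
    """Hypothetical class that records each format specifier it is given."""

    def __init__(self, text):
        self.text = text
        self.seen_specs = []

    def __format__(self, spec):
        self.seen_specs.append(spec)
        return self.text.upper() if spec == "upper" else self.text

tag = Tag("note")
plain = f"{tag}"          # no specifier: __format__ receives ""
shouted = f"{tag:upper}"  # __format__ receives "upper"
```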
|
|
|
|
|
|
|
|
|
|
Top-level format specifiers may include nested replacement fields.
|
|
|
|
|
These nested fields may include their own conversion fields and format
|
|
|
|
|
specifiers, but may not include more deeply nested replacement fields.
|
|
|
|
|
The format specifier mini-language is the same as that used by the
|
|
|
|
|
"str.format()" method.
|
|
|
|
|
|
|
|
|
|
Formatted string literals may be concatenated, but replacement fields
|
|
|
|
|
cannot be split across literals.
|
|
|
|
|
|
|
|
|
|
Some examples of formatted string literals:
|
|
|
|
|
|
|
|
|
|
>>> name = "Fred"
|
|
|
|
|
>>> f"He said his name is {name!r}."
|
|
|
|
|
"He said his name is 'Fred'."
|
|
|
|
|
>>> f"He said his name is {repr(name)}." # repr() is equivalent to !r
|
|
|
|
|
"He said his name is 'Fred'."
|
|
|
|
|
>>> import decimal
>>> width = 10
|
|
|
|
|
>>> precision = 4
|
|
|
|
|
>>> value = decimal.Decimal("12.34567")
|
|
|
|
|
>>> f"result: {value:{width}.{precision}}" # nested fields
|
|
|
|
|
'result:      12.35'
|
|
|
|
|
>>> from datetime import datetime
>>> today = datetime(year=2017, month=1, day=27)
|
|
|
|
|
>>> f"{today:%B %d, %Y}" # using date format specifier
|
|
|
|
|
'January 27, 2017'
|
|
|
|
|
>>> f"{today=:%B %d, %Y}" # using date format specifier and debugging
|
|
|
|
|
'today=January 27, 2017'
|
|
|
|
|
>>> number = 1024
|
|
|
|
|
>>> f"{number:#0x}" # using integer format specifier
|
|
|
|
|
'0x400'
|
|
|
|
|
>>> foo = "bar"
|
|
|
|
|
>>> f"{ foo = }" # preserves whitespace
|
|
|
|
|
" foo = 'bar'"
|
|
|
|
|
>>> line = "The mill's closed"
|
|
|
|
|
>>> f"{line = }"
|
|
|
|
|
'line = "The mill\\'s closed"'
|
|
|
|
|
>>> f"{line = :20}"
|
|
|
|
|
"line = The mill's closed   "
|
|
|
|
|
>>> f"{line = !r:20}"
|
|
|
|
|
'line = "The mill\\'s closed" '
|
|
|
|
|
|
|
|
|
|
Reusing the outer f-string quoting type inside a replacement field is
|
|
|
|
|
permitted:
|
|
|
|
|
|
|
|
|
|
>>> a = dict(x=2)
|
|
|
|
|
>>> f"abc {a["x"]} def"
|
|
|
|
|
'abc 2 def'
|
|
|
|
|
|
|
|
|
|
Changed in version 3.12: Prior to Python 3.12, reuse of the same
|
|
|
|
|
quoting type of the outer f-string inside a replacement field was not
|
|
|
|
|
possible.
|
|
|
|
|
|
|
|
|
|
Backslashes are also allowed in replacement fields and are evaluated
|
|
|
|
|
the same way as in any other context:
|
|
|
|
|
|
|
|
|
|
>>> a = ["a", "b", "c"]
|
|
|
|
|
>>> print(f"List a contains:\\n{"\\n".join(a)}")
|
|
|
|
|
List a contains:
|
|
|
|
|
a
|
|
|
|
|
b
|
|
|
|
|
c
|
|
|
|
|
|
|
|
|
|
Changed in version 3.12: Prior to Python 3.12, backslashes were not
|
|
|
|
|
permitted inside an f-string replacement field.
|
|
|
|
|
|
|
|
|
|
Formatted string literals cannot be used as docstrings, even if they
|
|
|
|
|
do not include expressions.
|
|
|
|
|
|
|
|
|
|
>>> def foo():
|
|
|
|
|
... f"Not a docstring"
|
|
|
|
|
...
|
|
|
|
|
>>> foo.__doc__ is None
|
|
|
|
|
True
|
|
|
|
|
|
|
|
|
|
See also **PEP 498** for the proposal that added formatted string
|
|
|
|
|
literals, and "str.format()", which uses a related format string
|
|
|
|
|
mechanism.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
t-strings
|
|
|
|
|
=========
|
|
|
|
|
|
|
|
|
|
Added in version 3.14.
|
|
|
|
|
|
|
|
|
|
A *template string literal* or *t-string* is a string literal that is
|
|
|
|
|
prefixed with "t" or "T". These strings follow the same syntax and
|
|
|
|
|
evaluation rules as formatted string literals, with the following
|
|
|
|
|
differences:
|
|
|
|
|
|
|
|
|
|
* Rather than evaluating to a "str" object, t-strings evaluate to a
|
|
|
|
|
"Template" object from the "string.templatelib" module.
|
|
|
|
|
|
|
|
|
|
* The "format()" protocol is not used. Instead, the format specifier
|
|
|
|
|
and conversions (if any) are passed to a new "Interpolation" object
|
|
|
|
|
that is created for each evaluated expression. It is up to code that
|
|
|
|
|
processes the resulting "Template" object to decide how to handle
|
|
|
|
|
format specifiers and conversions.
|
|
|
|
|
|
|
|
|
|
* Format specifiers containing nested replacement fields are evaluated
|
|
|
|
|
eagerly, prior to being passed to the "Interpolation" object. For
|
|
|
|
|
instance, an interpolation of the form "{amount:.{precision}f}" will
|
|
|
|
|
evaluate the expression "{precision}" before setting the
|
|
|
|
|
"format_spec" attribute of the resulting "Interpolation" object; if
|
|
|
|
|
"precision" is (for example) "2", the resulting format specifier
|
|
|
|
|
will be "'.2f'".
|
|
|
|
|
|
|
|
|
|
* When the equal sign "'='" is provided in an interpolation
|
|
|
|
|
expression, the resulting "Template" object will have the expression
|
|
|
|
|
text along with a "'='" character placed in its "strings" attribute.
|
|
|
|
|
The "interpolations" attribute will also contain an "Interpolation"
|
|
|
|
|
instance for the expression. By default, the "conversion" attribute
|
|
|
|
|
will be set to "'r'" (that is, "repr()"), unless there is a
|
|
|
|
|
conversion explicitly specified (in which case it overrides the
|
|
|
|
|
default) or a format specifier is provided (in which case, the
|
|
|
|
|
"conversion" defaults to "None").
|
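The eager evaluation of nested fields might be inspected as in the sketch below. It assumes Python 3.14 or later for the t-string itself, so the literal is kept in a string and only evaluated when the running interpreter supports it. The names "source" and "summary" are illustrative:

```python
import sys

# The t-string is held as text so this file still parses on older versions.
source = 't"Total: {amount:.{precision}f}"'

if sys.version_info >= (3, 14):
    template = eval(source, {"amount": 12.3456, "precision": 2})
    interpolation = template.interpolations[0]
    # The nested {precision} field is already resolved in format_spec.
    summary = (template.strings, interpolation.value, interpolation.format_spec)
else:
    summary = None

print(summary)
```

No formatting has been applied to the value at this point; code that processes the "Template" decides what to do with the specifier.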
|
|
|
|
''',
|
|
|
|
|
'subscriptions': r'''Subscriptions
|
|
|
|
|
*************
|
|
|
|
|
|