* PEP-384 _struct
* More PEP-384 fixes for _struct
Summary: Add a couple more fixes for `_struct` that were previously missed, such as removing `tp_*` accessors and using `PyBytesWriter` instead of calling `PyBytes_FromStringAndSize` with `NULL`. Also added a test (sketched below) to confirm that the `iter_unpack` type is still uninstantiable.
* 📜🤖 Added by blurb_it.
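A rough sketch of such a check in Python (the test actually added may
differ in detail; it relies on the iterator type raising TypeError
when called directly):

    import struct

    # Grab the internal iterator type from an existing iterator, then
    # check that it cannot be instantiated directly from Python.
    iter_unpack_type = type(struct.iter_unpack("i", b"\x00" * 4))
    try:
        iter_unpack_type()
    except TypeError:
        pass  # expected: the type exposes no usable constructor
    else:
        raise AssertionError("iter_unpack type should be uninstantiable")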
Having these in a separate file from the one that's named after the
module in the usual way makes it very easy to miss them when looking
for tests for these two functions.
(In fact when working recently on is_normalized, I'd been surprised to
see no tests for it here and concluded the function had evaded being
tested at all. I'd gone as far as to write up some tests myself
before I spotted this other file.)
Mostly this just means moving all the one file's code into the other,
and moving code from the module toplevel to inside the test class to
keep it tidily separate from the rest of the file's code.
There's one substantive change, which slightly reduces the amount of
code to be moved: we drop the `x > sys.maxunicode` conditional and all
the `RangeError` logic behind it. Now, if that condition ever occurs,
it will cause an error at `chr(x)` and a test failure. That's the
right result because, since PEP 393 in Python 3.3, there is no longer
such a thing as an "unsupported character".
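For reference, an out-of-range value now fails at `chr()` itself:

    import sys

    chr(sys.maxunicode)           # U+10FFFF, the last valid code point
    try:
        chr(sys.maxunicode + 1)   # 0x110000 is out of range
    except ValueError:
        pass                      # chr() rejects it outright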
Accumulate certificates in a set instead of doing a costly list
containment check. A Windows cert store can easily contain over a
hundred certificates. The old code would result in well over 5,000
comparison operations.
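A rough sketch of the idea in Python terms (`enum_certs()` is a
hypothetical stand-in for whatever yields the raw certificates):

    # Old pattern: O(n) membership test per certificate, O(n**2) overall.
    certs = []
    for cert in enum_certs():
        if cert not in certs:
            certs.append(cert)

    # New pattern: O(1) average membership test via a set.
    certs = set()
    for cert in enum_certs():
        certs.add(cert)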
Signed-off-by: Christian Heimes <christian@python.org>
Since PEP 393 in Python 3.3, this value is always 0x10ffff, the
maximum codepoint in Unicode; there's no longer such a thing as a
UCS-2 build of Python, which couldn't properly represent some
characters.
There are a couple of spots left where we still condition on the value
of this constant. Take them out.
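The invariant, for reference:

    import sys

    # Since PEP 393 there is only one kind of build.
    assert sys.maxunicode == 0x10FFFF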
weakref.WeakValueDictionary defines a local remove() function used as
the callback for weak references. This function was created as a
closure. Modify the implementation to avoid the closure.
This is the converse of GH-15353 -- in addition to plenty of
scripts in the tree that are marked with the executable bit
(and so can be directly executed), there are a few that have
a leading `#!` which could let them be executed, but it doesn't
do anything because they don't have the executable bit set.
Here's a command which finds such files and marks them. The
first line finds files in the tree with a `#!` line *anywhere*;
the next-to-last step checks that the *first* line is actually of
that form. In between we filter out files that already have the
bit set, and some files that are meant as fragments to be
consumed by one or another kind of preprocessor.
$ git grep -l '^#!' \
    | grep -vxFf <( \
        git ls-files --stage \
          | perl -lane 'print $F[3] if (!/^100644/)' \
      ) \
    | grep -ve '\.in$' -e '^Doc/includes/' \
    | while read f; do
        head -c2 "$f" | grep -qxF '#!' \
          && chmod a+x "$f"; \
      done
* bpo-26185: Fix repr() on empty ZipInfo object
It was failing with an AttributeError due to the nonexistent but
required attributes file_size and compress_size. They are now
initialized to 0 in ZipInfo.__init__() (see the example below).
* Remove useless hasattr() in ZipInfo._open_to_write()
* Completely remove file_size setting in _open_to_write().
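A quick illustration of the fixed behaviour:

    import zipfile

    zi = zipfile.ZipInfo()        # not backed by any archive member
    assert zi.file_size == 0      # now set in __init__()
    assert zi.compress_size == 0
    repr(zi)                      # no longer raises AttributeError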
Restart lines now always start with '=', never end with ' ', and fill the width of the window unless that would require ending with ' ', which could be wrapped onto a line by itself and possibly confuse the user.
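A hypothetical sketch of the layout rule (`restart_line` is made up
for illustration; IDLE's actual code differs):

    def restart_line(width, filename=''):
        # Start with '=', never end with ' ', and fill `width` columns
        # whenever the tag fits.
        tag = f"RESTART: {filename or 'Shell'}"
        if width < len(tag) + 4:
            return f"= {tag}"     # too narrow: don't pad to full width
        left, extra = divmod(width - len(tag) - 2, 2)
        return f"{'=' * (left + extra)} {tag} {'=' * left}"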
The purpose of the `unicodedata.is_normalized` function is to answer
the question `str == unicodedata.normalize(form, str)` more
efficiently than writing just that, by using the "quick check"
optimization described in the Unicode standard in UAX #15.
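In other words, for any string (here a combining sequence chosen just
for illustration) the two must agree:

    import unicodedata

    s = "e\u0301"  # 'e' plus combining acute accent; not in NFC form
    fast = unicodedata.is_normalized("NFC", s)
    slow = (s == unicodedata.normalize("NFC", s))
    assert fast == slow  # both report that s is not NFC-normalized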
However, it turns out the code doesn't implement the full algorithm
from the standard, and as a result we often miss the optimization and
end up having to compute the whole normalized string after all.
Implement the standard's algorithm. This greatly speeds up
`unicodedata.is_normalized` in many cases where our partial variant
of quick-check had been returning MAYBE and the standard algorithm
returns NO.
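A sketch of UAX #15's quick-check loop in Python terms;
`quick_check_property` is a hypothetical per-character lookup of the
form-specific Quick_Check value, which the real implementation gets
from its internal database:

    import unicodedata

    def quick_check(s, form, quick_check_property):
        # Returns 'YES', 'NO', or 'MAYBE'; only 'MAYBE' forces the slow
        # normalize-and-compare fallback.
        result = 'YES'
        last_ccc = 0
        for ch in s:
            ccc = unicodedata.combining(ch)
            if last_ccc > ccc > 0:
                return 'NO'   # combining marks out of canonical order
            qc = quick_check_property(ch, form)
            if qc == 'NO':
                return 'NO'
            if qc == 'MAYBE':
                result = 'MAYBE'
            last_ccc = ccc
        return result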
In a quick test on my desktop, the existing code takes about 4.4 ms/MB
(so 4.4 ns per byte) when the partial quick-check returns MAYBE and it
has to do the slow normalize-and-compare:
$ build.base/python -m timeit -s 'import unicodedata; s = "\uf900"*500000' \
-- 'unicodedata.is_normalized("NFD", s)'
50 loops, best of 5: 4.39 msec per loop
With this patch, it gets the answer instantly (58 ns) on the same 1 MB
string:
$ build.dev/python -m timeit -s 'import unicodedata; s = "\uf900"*500000' \
-- 'unicodedata.is_normalized("NFD", s)'
5000000 loops, best of 5: 58.2 nsec per loop
This restores a small optimization that the original version of this
code had for the `unicodedata.normalize` use case.
With this, that case is actually faster than in master!
$ build.base/python -m timeit -s 'import unicodedata; s = "\u0338"*500000' \
-- 'unicodedata.normalize("NFD", s)'
500 loops, best of 5: 561 usec per loop
$ build.dev/python -m timeit -s 'import unicodedata; s = "\u0338"*500000' \
-- 'unicodedata.normalize("NFD", s)'
500 loops, best of 5: 512 usec per loop
The HTML5 output from Sphinx 2.x adds '<p>' tags within list elements. Using a new prevtag attribute, ignore these instead of emitting unwanted '\n\n'.
Also stop looking for 'first' classes on tags (no longer present) and fix a bug that produced double spacing instead of single spacing after <pre> blocks.
Extending the hover delay in test_tooltip should avoid spurious test_idle failures.
One longer delay instead of two shorter delays results in a net speedup.
* Use the 'p' format unit instead of manually calling PyObject_IsTrue().
* Pass boolean values instead of 0/1 integers to functions that expect booleans.
* Convert some arguments to boolean only once.