1564 commits

Author SHA1 Message Date
Guido van Rossum
a6edfd9737 Mark Hammond:
Fixes the MBCS codec to work correctly with zero length strings.
2000-05-03 11:03:24 +00:00
Guido van Rossum
0e4f657a50 Marc-Andre Lemburg:
Fixed \OOO interpretation for Unicode objects. \777 now
correctly produces the Unicode character with ordinal 511.
2000-05-01 21:27:20 +00:00
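The arithmetic behind the \777 example above is 7*64 + 7*8 + 7 = 511. A minimal sketch in present-day Python, which checks only the number and is not the C decoder that this commit changed:

```python
# Octal 777 equals decimal 511; this checks the arithmetic only.
ordinal = int("777", 8)
assert ordinal == 7 * 64 + 7 * 8 + 7 == 511

# chr() builds the one-character string with that ordinal (U+01FF).
ch = chr(ordinal)
print(ordinal, hex(ordinal), len(ch))
```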
Guido van Rossum
3c1bb8043f Marc-Andre Lemburg:
Fixed a reference leak in the allocator.

Renamed utf8_string to _PyUnicode_AsUTF8String() and made
it external for use by other parts of the interpreter.
2000-04-27 20:13:50 +00:00
Guido van Rossum
86662914be Marc-Andre Lemburg:
The maxsplit functionality in .splitlines() was replaced by the keepends
functionality which allows keeping the line end markers together
with the string.
2000-04-11 15:38:46 +00:00
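The keepends behaviour described in the entry above can be sketched with the `splitlines()` method as it exists in current Python. The sample text is made up for illustration, and the keyword form `keepends=True` is today's spelling rather than something the commit message states:

```python
text = "first\nsecond\r\nthird"

# Default: the line end markers are stripped from each line.
without_ends = text.splitlines()

# keepends=True: each line keeps its own terminator.
with_ends = text.splitlines(keepends=True)

print(without_ends)
print(with_ends)
```

Because the terminators are kept, joining the second result reproduces the original string exactly.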
Guido van Rossum
fd4b957b06 Marc-Andre Lemburg:
* New exported API PyUnicode_Resize()

* The experimental Keep-Alive optimization was turned back
  on after some tweaks to the implementation. It should now
  work without causing core dumps... this has yet to be tested
  though (switching it off is easy: see the unicodeobject.c
  file for details).

* Fixed a memory leak in the Unicode freelist cleanup code.

* Added tests to correctly process the return code from
  _PyUnicode_Resize().

* Fixed a bug in the 'ignore' error handling routines
  of some builtin codecs. Added test cases for these to
  test_unicode.py.
2000-04-10 13:51:10 +00:00
Guido van Rossum
5db862dd0c Skip Montanaro: add string precisions to calls to PyErr_Format
to prevent possible buffer overruns.
2000-04-10 12:46:51 +00:00
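A string precision such as `%.100s` caps how many characters a format argument can contribute. The commit applies this to C calls to PyErr_Format; the sketch below only shows the same precision syntax in Python's `%` operator. The message text and the oversized value are invented for illustration:

```python
# Hypothetical oversized value, standing in for untrusted text
# that ends up inside an error message.
name = "x" * 10_000

# "%.100s" limits the substituted text to at most 100 characters.
prefix = "unknown encoding: "
message = (prefix + "%.100s") % name

print(len(message))
```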
Guido van Rossum
ba47704943 Conrad Huang points out that "if (0 < ch < 256)", while legal C,
doesn't mean what the Python programmer thought...
2000-04-06 18:18:10 +00:00
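The difference noted in the entry above comes from how the two languages parse the expression. Python chains the comparisons, while C evaluates `(0 < ch) < 256`, and the inner result is always 0 or 1. A small sketch; `c_reading` is a hypothetical emulation of the C parse written in Python, not code from the patch:

```python
def python_reading(ch):
    # Python chains comparisons: (0 < ch) and (ch < 256).
    return 0 < ch < 256


def c_reading(ch):
    # Emulates the C parse: (0 < ch) < 256.
    # The inner comparison yields 0 or 1, which is always below 256.
    return int(0 < ch) < 256


for ch in (-5, 65, 1000):
    print(ch, python_reading(ch), c_reading(ch))
```

Under the C reading the test never rejects anything, so a range check in C has to be spelled out as `0 < ch && ch < 256`.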
Guido van Rossum
34888ed689 Fredrik Lundh: eliminate a MSVC compiler warning. 2000-04-05 21:29:50 +00:00
Guido van Rossum
9e896b37c7 Marc-Andre's third try at this bulk patch seems to work (except that
his copy of test_contains.py seems to be broken -- the lines he
deleted were already absent).  Checkin messages:


New Unicode support for int(), float(), complex() and long().

- new APIs PyInt_FromUnicode() and PyLong_FromUnicode()
- added support for Unicode to PyFloat_FromString()
- new encoding API PyUnicode_EncodeDecimal() which converts
  Unicode to a decimal char* string (used in the above new
  APIs)
- shortcuts for calls like int(<int object>) and float(<float obj>)
- tests for all of the above

Unicode compares and contains checks:
- comparing Unicode and non-string types now works; TypeErrors
  are masked, all other errors such as ValueError during
  Unicode coercion are passed through (note that PyUnicode_Compare
  does not implement the masking -- PyObject_Compare does this)
- contains now works for non-string types too; TypeErrors are
  masked and 0 returned; all other errors are passed through

Better testing support for the standard codecs.

Misc minor enhancements, such as an alias dbcs for the mbcs codec.

Changes:
- PyLong_FromString() now applies the same error checks as
  does PyInt_FromString(): trailing garbage is reported
  as an error and no longer silently ignored. The only characters
  which may be trailing the digits are 'L' and 'l' -- these
  are still silently ignored.
- string.ato?() now directly interface to int(), long() and
  float(). The error strings are now a little different, but
  the type still remains the same. These functions are now
  ready to get declared obsolete ;-)
- PyNumber_Int() now also does a check for embedded NULL chars
  in the input string; PyNumber_Long() already did this (and
  still does)

Followed by:

Looks like I've gone a step too far there... (and test_contains.py
seems to have a bug too).

I've changed back to reporting all errors in PyUnicode_Contains()
and added a few more test cases to test_contains.py (plus corrected
the join() NameError).
2000-04-05 20:11:21 +00:00
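Several of the conversion rules listed in the entry above (Unicode decimal digits accepted, trailing garbage rejected, embedded NUL characters rejected) can still be observed through `int()` in current Python. This is a sketch of today's behaviour, not of the 2000 C implementation, and `try_int` is a hypothetical helper introduced only for this example:

```python
def try_int(text):
    # Hypothetical helper: return the parsed int,
    # or None if int() rejects the text with ValueError.
    try:
        return int(text)
    except ValueError:
        return None


print(try_int("42"))            # plain ASCII digits
print(try_int("\u0664\u0662"))  # ARABIC-INDIC DIGIT FOUR, DIGIT TWO
print(try_int("42abc"))         # trailing garbage after the digits
print(try_int("4\x002"))        # embedded NUL character
```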
Guido van Rossum
2ea3e143f0 Some blank lines. 2000-03-31 17:24:09 +00:00
Guido van Rossum
b7a40ba8d3 MBCS codecs. (Win32 only.) By Mark Hammond. 2000-03-28 02:01:52 +00:00
Barry Warsaw
51ac58039f On 17-Mar-2000, Marc-Andre Lemburg said:
Attached you find an update of the Unicode implementation.

    The patch is against the current CVS version. I would appreciate
    if someone with CVS checkin permissions could check the changes
    in.

    The patch contains all bugs and patches sent this week and also
    fixes a leak in the codecs code and a bug in the free list code
    for Unicode objects (which only shows up when compiling Python
    with Py_DEBUG; thanks to MarkH for spotting this one).
2000-03-20 16:36:48 +00:00
Guido van Rossum
403d68b484 Add sq_contains implementation. 2000-03-13 15:55:09 +00:00
Guido van Rossum
d57fd91488 Unicode implementation by Marc-Andre Lemburg based on original code by
Fredrik Lundh.
2000-03-10 22:53:23 +00:00