:mod:`xml.sax.saxutils` --- SAX Utilities
=========================================

.. module:: xml.sax.saxutils
   :synopsis: Convenience functions and classes for use with SAX.

.. moduleauthor:: Lars Marius Garshol <larsga@garshol.priv.no>
.. sectionauthor:: Martin v. Löwis <martin@v.loewis.de>

**Source code:** :source:`Lib/xml/sax/saxutils.py`

--------------
					
						
The module :mod:`xml.sax.saxutils` contains a number of classes and functions
that are commonly useful when creating SAX applications, either in direct use,
or as base classes.
					
						
.. function:: escape(data, entities={})

   Escape ``'&'``, ``'<'``, and ``'>'`` in a string of data.

   You can escape other strings of data by passing a dictionary as the optional
   *entities* parameter.  The keys and values must all be strings; each key will be
   replaced with its corresponding value.  The characters ``'&'``, ``'<'`` and
   ``'>'`` are always escaped, even if *entities* is provided.

   .. note::

      This function should only be used to escape characters that
      can't be used directly in XML. Do not use this function as a general
      string translation function.
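
As a minimal sketch of the behaviour described for :func:`escape`, the snippet below escapes a string with the built-in replacements and then with one extra mapping. The sample strings and the variable names ``basic`` and ``with_quotes`` are illustrative only.

```python
from xml.sax.saxutils import escape

# '&', '<' and '>' are always replaced by their entity references.
basic = escape("AT&T <rocks> & rolls")
print(basic)  # AT&amp;T &lt;rocks&gt; &amp; rolls

# An extra mapping asks for additional replacements; here the double
# quote is turned into &quot; as well.
with_quotes = escape('say "hi"', {'"': "&quot;"})
print(with_quotes)  # say &quot;hi&quot;
```

The *entities* mapping is applied in addition to the three built-in replacements, not instead of them.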
					
						
.. function:: unescape(data, entities={})

   Unescape ``'&amp;'``, ``'&lt;'``, and ``'&gt;'`` in a string of data.

   You can unescape other strings of data by passing a dictionary as the optional
   *entities* parameter.  The keys and values must all be strings; each key will be
   replaced with its corresponding value.  ``'&amp;'``, ``'&lt;'``, and ``'&gt;'``
   are always unescaped, even if *entities* is provided.
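
The following sketch shows :func:`unescape` reversing the three built-in entity references and, with an extra mapping, one more. The ``&apos;`` mapping and the sample strings are assumptions chosen for illustration.

```python
from xml.sax.saxutils import escape, unescape

# The three built-in entity references are always converted back.
text = unescape("AT&amp;T &lt;rocks&gt;")
print(text)  # AT&T <rocks>

# Further references can be resolved through the optional mapping.
apos = unescape("it&apos;s", {"&apos;": "'"})
print(apos)  # it's

# escape() followed by unescape() returns the original string.
original = "a < b && c > d"
round_trip = unescape(escape(original))
```

Here ``round_trip`` equals ``original``, which is a quick way to check that the two functions are inverses for the built-in characters.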
					
						
.. function:: quoteattr(data, entities={})

   Similar to :func:`escape`, but also prepares *data* to be used as an
   attribute value.  The return value is a quoted version of *data* with any
   additional required replacements. :func:`quoteattr` will select a quote
   character based on the content of *data*, attempting to avoid encoding any
   quote characters in the string.  If both single- and double-quote characters
   are already in *data*, the double-quote characters will be encoded and *data*
   will be wrapped in double-quotes.  The resulting string can be used directly
   as an attribute value::

      >>> print("<element attr=%s>" % quoteattr("ab ' cd \" ef"))
      <element attr="ab ' cd &quot; ef">

   This function is useful when generating attribute values for HTML or any SGML
   using the reference concrete syntax.
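
To make the quote-selection rule concrete, this short sketch calls :func:`quoteattr` on strings with no quotes, only double quotes, only single quotes, and both. The inputs and variable names are arbitrary examples.

```python
from xml.sax.saxutils import quoteattr

# No quote characters in the data: double quotes are used.
no_quotes = quoteattr("plain")
print(no_quotes)  # "plain"

# Only double quotes in the data: wrap in single quotes, nothing encoded.
has_double = quoteattr('say "hi"')
print(has_double)  # 'say "hi"'

# Only single quotes in the data: wrap in double quotes, nothing encoded.
has_single = quoteattr("it's")
print(has_single)  # "it's"

# Both kinds present: double quotes are encoded, then wrapped in double quotes.
has_both = quoteattr("ab ' cd \" ef")
print(has_both)  # "ab ' cd &quot; ef"
```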
					
						
.. class:: XMLGenerator(out=None, encoding='iso-8859-1', short_empty_elements=False)

   This class implements the :class:`~xml.sax.handler.ContentHandler` interface
   by writing SAX events back into an XML document. In other words, using an
   :class:`XMLGenerator` as the content handler will reproduce the original
   document being parsed. *out* should be a file-like object; it defaults to
   :data:`sys.stdout`. *encoding* is the encoding of the output stream and
   defaults to ``'iso-8859-1'``.
   *short_empty_elements* controls the formatting of elements that contain no
   content:  if ``False`` (the default) they are emitted as a pair of start/end
   tags, if set to ``True`` they are emitted as a single self-closed tag.

   .. versionchanged:: 3.2
      Added the *short_empty_elements* parameter.
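
A rough sketch of :class:`XMLGenerator` in use: a small document is parsed and its events are written back into an in-memory text buffer, once with each setting of *short_empty_elements*. The helper name ``regenerate`` and the sample document are hypothetical and only serve the example.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLGenerator

SOURCE = b"<root><item name='a'>text</item><empty/></root>"

def regenerate(source, short_empty_elements):
    """Parse *source* and return the XML text written by XMLGenerator."""
    out = io.StringIO()  # any file-like object can serve as *out*
    handler = XMLGenerator(out, encoding="utf-8",
                           short_empty_elements=short_empty_elements)
    xml.sax.parseString(source, handler)
    return out.getvalue()

long_form = regenerate(SOURCE, False)   # empty element written as <empty></empty>
short_form = regenerate(SOURCE, True)   # empty element written as <empty/>
print(long_form)
print(short_form)
```

Attribute values in the output are quoted with :func:`quoteattr`, so the single-quoted attribute in the input comes back double-quoted.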
					
						
.. class:: XMLFilterBase(base)

   This class is designed to sit between an
   :class:`~xml.sax.xmlreader.XMLReader` and the client
   application's event handlers.  By default, it does nothing but pass requests up
   to the reader and events on to the handlers unmodified, but subclasses can
   override specific methods to modify the event stream or the configuration
   requests as they pass through.
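
One possible way to use :class:`XMLFilterBase` is sketched below: a subclass overrides two event methods to change element names on their way from the reader to the handler. The class ``UppercaseNames`` and the sample document are hypothetical; the filter is wrapped around a parser from :func:`xml.sax.make_parser` and feeds an :class:`XMLGenerator`.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator

class UppercaseNames(XMLFilterBase):
    """Hypothetical filter that upper-cases element names."""

    def startElement(self, name, attrs):
        # Modify the event, then pass it on to the downstream handler.
        super().startElement(name.upper(), attrs)

    def endElement(self, name):
        super().endElement(name.upper())

out = io.StringIO()
reader = UppercaseNames(xml.sax.make_parser())   # filter sits on top of the parser
reader.setContentHandler(XMLGenerator(out, encoding="utf-8"))
reader.parse(io.BytesIO(b"<doc><p>hi</p></doc>"))

filtered = out.getvalue()
print(filtered)
```

Events that the subclass does not override, such as character data, pass through unchanged.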
					
						
.. function:: prepare_input_source(source, base='')

   This function takes an input source and an optional base URL and returns a
   fully resolved :class:`~xml.sax.xmlreader.InputSource` object ready for
   reading.  The input source can be given as a string, a file-like object, or
   an :class:`~xml.sax.xmlreader.InputSource` object; parsers will use this
   function to implement the polymorphic *source* argument to their
   :meth:`~xml.sax.xmlreader.XMLReader.parse` method.
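
As an illustration of the polymorphic *source* argument, the sketch below passes two in-memory file-like objects to :func:`prepare_input_source` and inspects the result. The variable names are illustrative. A string argument is treated as a system identifier (a file name or URL) and opened, which is left out here because it needs a real file.

```python
import io
from xml.sax.saxutils import prepare_input_source
from xml.sax.xmlreader import InputSource

# A binary file-like object ends up as the byte stream of the InputSource.
byte_source = prepare_input_source(io.BytesIO(b"<doc/>"))
print(isinstance(byte_source, InputSource))   # True
print(byte_source.getByteStream().read())     # b'<doc/>'

# A text file-like object ends up as the character stream instead.
text_source = prepare_input_source(io.StringIO("<doc/>"))
print(text_source.getCharacterStream().read())  # <doc/>
```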