| 
									
										
										
										
											2000-03-31 14:58:54 +00:00
										 |  |  | # | 
					
						
							|  |  |  | # Secret Labs' Regular Expression Engine | 
					
						
							|  |  |  | # | 
					
						
							|  |  |  | # convert template to internal format | 
					
						
							|  |  |  | # | 
					
						
							| 
									
										
										
										
											2001-01-14 15:06:11 +00:00
										 |  |  | # Copyright (c) 1997-2001 by Secret Labs AB.  All rights reserved. | 
					
						
							| 
									
										
										
										
											2000-03-31 14:58:54 +00:00
										 |  |  | # | 
					
						
							| 
									
										
										
										
											2000-08-01 18:20:07 +00:00
										 |  |  | # See the sre.py file for information on usage and redistribution. | 
					
						
							| 
									
										
										
										
											2000-03-31 14:58:54 +00:00
										 |  |  | # | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2001-09-04 19:10:20 +00:00
										 |  |  | """Internal support module for sre""" | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2002-06-27 20:08:25 +00:00
										 |  |  | import _sre, sys | 
					
						
							| 
									
										
											  
											
											
										 
											2008-04-09 08:37:03 +00:00
										 |  |  | import sre_parse | 
					
						
							| 
									
										
										
										
											2000-03-31 14:58:54 +00:00
										 |  |  | from sre_constants import * | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2001-01-15 12:46:09 +00:00
										 |  |  | assert _sre.MAGIC == MAGIC, "SRE module mismatch" | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2003-04-19 12:56:08 +00:00
										 |  |  | if _sre.CODESIZE == 2: | 
					
						
							|  |  |  |     MAXCODE = 65535 | 
					
						
							|  |  |  | else: | 
					
						
							| 
									
										
										
										
											2007-01-15 16:59:06 +00:00
										 |  |  |     MAXCODE = 0xFFFFFFFF | 
					
						
							| 
									
										
										
										
											2000-07-02 12:00:07 +00:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  | def _identityfunction(x): | 
					
						
							|  |  |  |     return x | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2005-02-28 19:27:52 +00:00
										 |  |  | _LITERAL_CODES = set([LITERAL, NOT_LITERAL]) | 
					
						
							|  |  |  | _REPEATING_CODES = set([REPEAT, MIN_REPEAT, MAX_REPEAT]) | 
					
						
							|  |  |  | _SUCCESS_CODES = set([SUCCESS, FAILURE]) | 
					
						
							|  |  |  | _ASSERT_CODES = set([ASSERT, ASSERT_NOT]) | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2000-06-29 08:58:44 +00:00
										 |  |  | def _compile(code, pattern, flags): | 
					
						
							| 
									
										
										
										
											2000-06-29 23:33:12 +00:00
										 |  |  |     # internal: compile a (sub)pattern | 
					
						
							| 
									
										
										
										
											2000-06-29 16:57:40 +00:00
										 |  |  |     emit = code.append | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |     _len = len | 
					
						
							| 
									
										
										
										
											2005-02-28 19:27:52 +00:00
										 |  |  |     LITERAL_CODES = _LITERAL_CODES | 
					
						
							|  |  |  |     REPEATING_CODES = _REPEATING_CODES | 
					
						
							|  |  |  |     SUCCESS_CODES = _SUCCESS_CODES | 
					
						
							|  |  |  |     ASSERT_CODES = _ASSERT_CODES | 
					
						
							| 
									
										
										
										
											2000-03-31 14:58:54 +00:00
										 |  |  |     for op, av in pattern: | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |         if op in LITERAL_CODES: | 
					
						
							| 
									
										
										
										
											2000-06-30 07:50:59 +00:00
										 |  |  |             if flags & SRE_FLAG_IGNORECASE: | 
					
						
							|  |  |  |                 emit(OPCODES[OP_IGNORE[op]]) | 
					
						
							| 
									
										
										
										
											2001-01-15 18:28:14 +00:00
										 |  |  |                 emit(_sre.getlower(av, flags)) | 
					
						
							| 
									
										
										
										
											2000-06-30 07:50:59 +00:00
										 |  |  |             else: | 
					
						
							|  |  |  |                 emit(OPCODES[op]) | 
					
						
							| 
									
										
										
										
											2001-01-15 18:28:14 +00:00
										 |  |  |                 emit(av) | 
					
						
							| 
									
										
										
										
											2000-06-30 07:50:59 +00:00
										 |  |  |         elif op is IN: | 
					
						
							|  |  |  |             if flags & SRE_FLAG_IGNORECASE: | 
					
						
							|  |  |  |                 emit(OPCODES[OP_IGNORE[op]]) | 
					
						
							|  |  |  |                 def fixup(literal, flags=flags): | 
					
						
							| 
									
										
										
										
											2000-06-30 13:55:15 +00:00
										 |  |  |                     return _sre.getlower(literal, flags) | 
					
						
							| 
									
										
										
										
											2000-06-30 07:50:59 +00:00
										 |  |  |             else: | 
					
						
							|  |  |  |                 emit(OPCODES[op]) | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |                 fixup = _identityfunction | 
					
						
							|  |  |  |             skip = _len(code); emit(0) | 
					
						
							| 
									
										
										
										
											2000-08-07 20:59:04 +00:00
										 |  |  |             _compile_charset(av, flags, code, fixup) | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |             code[skip] = _len(code) - skip | 
					
						
							| 
									
										
										
										
											2000-06-30 10:41:31 +00:00
										 |  |  |         elif op is ANY: | 
					
						
							|  |  |  |             if flags & SRE_FLAG_DOTALL: | 
					
						
							| 
									
										
										
										
											2000-08-01 22:47:49 +00:00
										 |  |  |                 emit(OPCODES[ANY_ALL]) | 
					
						
							| 
									
										
										
										
											2000-06-30 10:41:31 +00:00
										 |  |  |             else: | 
					
						
							| 
									
										
										
										
											2000-08-01 22:47:49 +00:00
										 |  |  |                 emit(OPCODES[ANY]) | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |         elif op in REPEATING_CODES: | 
					
						
							| 
									
										
										
										
											2000-06-30 07:50:59 +00:00
										 |  |  |             if flags & SRE_FLAG_TEMPLATE: | 
					
						
							| 
									
										
										
										
											2007-08-30 01:19:48 +00:00
										 |  |  |                 raise error("internal: unsupported template operator") | 
					
						
							| 
									
										
										
										
											2000-06-30 07:50:59 +00:00
										 |  |  |                 emit(OPCODES[REPEAT]) | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |                 skip = _len(code); emit(0) | 
					
						
							| 
									
										
										
										
											2000-06-30 07:50:59 +00:00
										 |  |  |                 emit(av[0]) | 
					
						
							|  |  |  |                 emit(av[1]) | 
					
						
							|  |  |  |                 _compile(code, av[2], flags) | 
					
						
							|  |  |  |                 emit(OPCODES[SUCCESS]) | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |                 code[skip] = _len(code) - skip | 
					
						
							|  |  |  |             elif _simple(av) and op is not REPEAT: | 
					
						
							|  |  |  |                 if op is MAX_REPEAT: | 
					
						
							| 
									
										
										
										
											2003-04-14 17:59:34 +00:00
										 |  |  |                     emit(OPCODES[REPEAT_ONE]) | 
					
						
							|  |  |  |                 else: | 
					
						
							|  |  |  |                     emit(OPCODES[MIN_REPEAT_ONE]) | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |                 skip = _len(code); emit(0) | 
					
						
							| 
									
										
										
										
											2000-08-01 22:47:49 +00:00
										 |  |  |                 emit(av[0]) | 
					
						
							|  |  |  |                 emit(av[1]) | 
					
						
							|  |  |  |                 _compile(code, av[2], flags) | 
					
						
							|  |  |  |                 emit(OPCODES[SUCCESS]) | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |                 code[skip] = _len(code) - skip | 
					
						
							| 
									
										
										
										
											2000-06-30 07:50:59 +00:00
										 |  |  |             else: | 
					
						
							| 
									
										
										
										
											2000-08-01 22:47:49 +00:00
										 |  |  |                 emit(OPCODES[REPEAT]) | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |                 skip = _len(code); emit(0) | 
					
						
							| 
									
										
										
										
											2000-08-01 22:47:49 +00:00
										 |  |  |                 emit(av[0]) | 
					
						
							|  |  |  |                 emit(av[1]) | 
					
						
							|  |  |  |                 _compile(code, av[2], flags) | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |                 code[skip] = _len(code) - skip | 
					
						
							|  |  |  |                 if op is MAX_REPEAT: | 
					
						
							| 
									
										
										
										
											2000-08-01 22:47:49 +00:00
										 |  |  |                     emit(OPCODES[MAX_UNTIL]) | 
					
						
							| 
									
										
										
										
											2000-06-30 07:50:59 +00:00
										 |  |  |                 else: | 
					
						
							| 
									
										
										
										
											2000-08-01 22:47:49 +00:00
										 |  |  |                     emit(OPCODES[MIN_UNTIL]) | 
					
						
							| 
									
										
										
										
											2000-06-30 07:50:59 +00:00
										 |  |  |         elif op is SUBPATTERN: | 
					
						
							| 
									
										
										
										
											2000-08-01 18:20:07 +00:00
										 |  |  |             if av[0]: | 
					
						
							| 
									
										
										
										
											2000-06-30 07:50:59 +00:00
										 |  |  |                 emit(OPCODES[MARK]) | 
					
						
							| 
									
										
										
										
											2000-08-01 18:20:07 +00:00
										 |  |  |                 emit((av[0]-1)*2) | 
					
						
							| 
									
										
										
										
											2000-08-07 20:59:04 +00:00
										 |  |  |             # _compile_info(code, av[1], flags) | 
					
						
							| 
									
										
										
										
											2000-06-30 07:50:59 +00:00
										 |  |  |             _compile(code, av[1], flags) | 
					
						
							| 
									
										
										
										
											2000-08-01 18:20:07 +00:00
										 |  |  |             if av[0]: | 
					
						
							| 
									
										
										
										
											2000-06-30 07:50:59 +00:00
										 |  |  |                 emit(OPCODES[MARK]) | 
					
						
							| 
									
										
										
										
											2000-08-01 18:20:07 +00:00
										 |  |  |                 emit((av[0]-1)*2+1) | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |         elif op in SUCCESS_CODES: | 
					
						
							| 
									
										
										
										
											2000-06-30 10:41:31 +00:00
										 |  |  |             emit(OPCODES[op]) | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |         elif op in ASSERT_CODES: | 
					
						
							| 
									
										
										
										
											2000-07-03 18:44:21 +00:00
										 |  |  |             emit(OPCODES[op]) | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |             skip = _len(code); emit(0) | 
					
						
							| 
									
										
										
										
											2000-07-03 18:44:21 +00:00
										 |  |  |             if av[0] >= 0: | 
					
						
							|  |  |  |                 emit(0) # look ahead | 
					
						
							|  |  |  |             else: | 
					
						
							|  |  |  |                 lo, hi = av[1].getwidth() | 
					
						
							|  |  |  |                 if lo != hi: | 
					
						
							| 
									
										
										
										
											2007-08-30 01:19:48 +00:00
										 |  |  |                     raise error("look-behind requires fixed-width pattern") | 
					
						
							| 
									
										
										
										
											2000-07-03 18:44:21 +00:00
										 |  |  |                 emit(lo) # look behind | 
					
						
							|  |  |  |             _compile(code, av[1], flags) | 
					
						
							|  |  |  |             emit(OPCODES[SUCCESS]) | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |             code[skip] = _len(code) - skip | 
					
						
							| 
									
										
										
										
											2000-07-03 18:44:21 +00:00
										 |  |  |         elif op is CALL: | 
					
						
							| 
									
										
										
										
											2000-06-30 10:41:31 +00:00
										 |  |  |             emit(OPCODES[op]) | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |             skip = _len(code); emit(0) | 
					
						
							| 
									
										
										
										
											2000-06-30 10:41:31 +00:00
										 |  |  |             _compile(code, av, flags) | 
					
						
							|  |  |  |             emit(OPCODES[SUCCESS]) | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |             code[skip] = _len(code) - skip | 
					
						
							| 
									
										
										
										
											2000-06-30 10:41:31 +00:00
										 |  |  |         elif op is AT: | 
					
						
							|  |  |  |             emit(OPCODES[op]) | 
					
						
							|  |  |  |             if flags & SRE_FLAG_MULTILINE: | 
					
						
							| 
									
										
										
										
											2001-03-22 15:50:10 +00:00
										 |  |  |                 av = AT_MULTILINE.get(av, av) | 
					
						
							|  |  |  |             if flags & SRE_FLAG_LOCALE: | 
					
						
							|  |  |  |                 av = AT_LOCALE.get(av, av) | 
					
						
							|  |  |  |             elif flags & SRE_FLAG_UNICODE: | 
					
						
							|  |  |  |                 av = AT_UNICODE.get(av, av) | 
					
						
							|  |  |  |             emit(ATCODES[av]) | 
					
						
							| 
									
										
										
										
											2000-06-30 10:41:31 +00:00
										 |  |  |         elif op is BRANCH: | 
					
						
							| 
									
										
										
										
											2000-08-01 18:20:07 +00:00
										 |  |  |             emit(OPCODES[op]) | 
					
						
							| 
									
										
										
										
											2000-06-30 10:41:31 +00:00
										 |  |  |             tail = [] | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |             tailappend = tail.append | 
					
						
							| 
									
										
										
										
											2000-06-30 10:41:31 +00:00
										 |  |  |             for av in av[1]: | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |                 skip = _len(code); emit(0) | 
					
						
							| 
									
										
										
										
											2000-08-07 20:59:04 +00:00
										 |  |  |                 # _compile_info(code, av, flags) | 
					
						
							| 
									
										
										
										
											2000-06-30 10:41:31 +00:00
										 |  |  |                 _compile(code, av, flags) | 
					
						
							|  |  |  |                 emit(OPCODES[JUMP]) | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |                 tailappend(_len(code)); emit(0) | 
					
						
							|  |  |  |                 code[skip] = _len(code) - skip | 
					
						
							| 
									
										
										
										
											2000-06-30 10:41:31 +00:00
										 |  |  |             emit(0) # end of branch | 
					
						
							|  |  |  |             for tail in tail: | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |                 code[tail] = _len(code) - tail | 
					
						
							| 
									
										
										
										
											2000-06-30 10:41:31 +00:00
										 |  |  |         elif op is CATEGORY: | 
					
						
							|  |  |  |             emit(OPCODES[op]) | 
					
						
							|  |  |  |             if flags & SRE_FLAG_LOCALE: | 
					
						
							| 
									
										
										
										
											2001-03-22 15:50:10 +00:00
										 |  |  |                 av = CH_LOCALE[av] | 
					
						
							| 
									
										
										
										
											2000-06-30 10:41:31 +00:00
										 |  |  |             elif flags & SRE_FLAG_UNICODE: | 
					
						
							| 
									
										
										
										
											2001-03-22 15:50:10 +00:00
										 |  |  |                 av = CH_UNICODE[av] | 
					
						
							|  |  |  |             emit(CHCODES[av]) | 
					
						
							| 
									
										
										
										
											2000-07-03 21:31:48 +00:00
										 |  |  |         elif op is GROUPREF: | 
					
						
							| 
									
										
										
										
											2000-06-30 10:41:31 +00:00
										 |  |  |             if flags & SRE_FLAG_IGNORECASE: | 
					
						
							|  |  |  |                 emit(OPCODES[OP_IGNORE[op]]) | 
					
						
							|  |  |  |             else: | 
					
						
							|  |  |  |                 emit(OPCODES[op]) | 
					
						
							|  |  |  |             emit(av-1) | 
					
						
							| 
									
										
										
										
											2003-10-17 22:13:16 +00:00
										 |  |  |         elif op is GROUPREF_EXISTS: | 
					
						
							|  |  |  |             emit(OPCODES[op]) | 
					
						
							| 
									
										
										
										
											2005-06-02 13:35:52 +00:00
										 |  |  |             emit(av[0]-1) | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |             skipyes = _len(code); emit(0) | 
					
						
							| 
									
										
										
										
											2003-10-17 22:13:16 +00:00
										 |  |  |             _compile(code, av[1], flags) | 
					
						
							|  |  |  |             if av[2]: | 
					
						
							|  |  |  |                 emit(OPCODES[JUMP]) | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |                 skipno = _len(code); emit(0) | 
					
						
							|  |  |  |                 code[skipyes] = _len(code) - skipyes + 1 | 
					
						
							| 
									
										
										
										
											2003-10-17 22:13:16 +00:00
										 |  |  |                 _compile(code, av[2], flags) | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |                 code[skipno] = _len(code) - skipno | 
					
						
							| 
									
										
										
										
											2003-10-17 22:13:16 +00:00
										 |  |  |             else: | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |                 code[skipyes] = _len(code) - skipyes + 1 | 
					
						
							| 
									
										
										
										
											2000-06-30 07:50:59 +00:00
										 |  |  |         else: | 
					
						
							| 
									
										
										
										
											2007-08-30 01:19:48 +00:00
										 |  |  |             raise ValueError("unsupported operand type", op) | 
					
						
							| 
									
										
										
										
											2000-03-31 14:58:54 +00:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2000-08-07 20:59:04 +00:00
										 |  |  | def _compile_charset(charset, flags, code, fixup=None): | 
					
						
							|  |  |  |     # compile charset subprogram | 
					
						
							|  |  |  |     emit = code.append | 
					
						
							| 
									
										
										
										
											2002-06-02 00:40:05 +00:00
										 |  |  |     if fixup is None: | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |         fixup = _identityfunction | 
					
						
							| 
									
										
										
										
											2000-08-07 20:59:04 +00:00
										 |  |  |     for op, av in _optimize_charset(charset, fixup): | 
					
						
							|  |  |  |         emit(OPCODES[op]) | 
					
						
							|  |  |  |         if op is NEGATE: | 
					
						
							|  |  |  |             pass | 
					
						
							|  |  |  |         elif op is LITERAL: | 
					
						
							|  |  |  |             emit(fixup(av)) | 
					
						
							|  |  |  |         elif op is RANGE: | 
					
						
							|  |  |  |             emit(fixup(av[0])) | 
					
						
							|  |  |  |             emit(fixup(av[1])) | 
					
						
							|  |  |  |         elif op is CHARSET: | 
					
						
							|  |  |  |             code.extend(av) | 
					
						
							| 
									
										
										
										
											2001-07-02 16:58:38 +00:00
										 |  |  |         elif op is BIGCHARSET: | 
					
						
							|  |  |  |             code.extend(av) | 
					
						
							| 
									
										
										
										
											2000-08-07 20:59:04 +00:00
										 |  |  |         elif op is CATEGORY: | 
					
						
							|  |  |  |             if flags & SRE_FLAG_LOCALE: | 
					
						
							|  |  |  |                 emit(CHCODES[CH_LOCALE[av]]) | 
					
						
							|  |  |  |             elif flags & SRE_FLAG_UNICODE: | 
					
						
							|  |  |  |                 emit(CHCODES[CH_UNICODE[av]]) | 
					
						
							|  |  |  |             else: | 
					
						
							|  |  |  |                 emit(CHCODES[av]) | 
					
						
							|  |  |  |         else: | 
					
						
							| 
									
										
										
										
											2007-08-30 01:19:48 +00:00
										 |  |  |             raise error("internal: unsupported set operator") | 
					
						
							| 
									
										
										
										
											2000-08-07 20:59:04 +00:00
										 |  |  |     emit(OPCODES[FAILURE]) | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | def _optimize_charset(charset, fixup): | 
					
						
							|  |  |  |     # internal: optimize character set | 
					
						
							|  |  |  |     out = [] | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |     outappend = out.append | 
					
						
							| 
									
										
										
										
											2004-03-27 09:24:36 +00:00
										 |  |  |     charmap = [0]*256 | 
					
						
							| 
									
										
										
										
											2000-08-07 20:59:04 +00:00
										 |  |  |     try: | 
					
						
							|  |  |  |         for op, av in charset: | 
					
						
							|  |  |  |             if op is NEGATE: | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |                 outappend((op, av)) | 
					
						
							| 
									
										
										
										
											2000-08-07 20:59:04 +00:00
										 |  |  |             elif op is LITERAL: | 
					
						
							| 
									
										
										
										
											2004-03-27 09:24:36 +00:00
										 |  |  |                 charmap[fixup(av)] = 1 | 
					
						
							| 
									
										
										
										
											2000-08-07 20:59:04 +00:00
										 |  |  |             elif op is RANGE: | 
					
						
							|  |  |  |                 for i in range(fixup(av[0]), fixup(av[1])+1): | 
					
						
							| 
									
										
										
										
											2004-03-27 09:24:36 +00:00
										 |  |  |                     charmap[i] = 1 | 
					
						
							| 
									
										
										
										
											2000-08-07 20:59:04 +00:00
										 |  |  |             elif op is CATEGORY: | 
					
						
							| 
									
										
										
										
											2001-01-14 15:06:11 +00:00
										 |  |  |                 # XXX: could append to charmap tail | 
					
						
							| 
									
										
										
										
											2000-08-07 20:59:04 +00:00
										 |  |  |                 return charset # cannot compress | 
					
						
							|  |  |  |     except IndexError: | 
					
						
							|  |  |  |         # character set contains unicode characters | 
					
						
							| 
									
										
										
										
											2001-07-02 16:58:38 +00:00
										 |  |  |         return _optimize_unicode(charset, fixup) | 
					
						
							| 
									
										
										
										
											2000-08-07 20:59:04 +00:00
										 |  |  |     # compress character map | 
					
						
							|  |  |  |     i = p = n = 0 | 
					
						
							|  |  |  |     runs = [] | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |     runsappend = runs.append | 
					
						
							| 
									
										
										
										
											2000-08-07 20:59:04 +00:00
										 |  |  |     for c in charmap: | 
					
						
							|  |  |  |         if c: | 
					
						
							|  |  |  |             if n == 0: | 
					
						
							|  |  |  |                 p = i | 
					
						
							|  |  |  |             n = n + 1 | 
					
						
							|  |  |  |         elif n: | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |             runsappend((p, n)) | 
					
						
							| 
									
										
										
										
											2000-08-07 20:59:04 +00:00
										 |  |  |             n = 0 | 
					
						
							|  |  |  |         i = i + 1 | 
					
						
							|  |  |  |     if n: | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |         runsappend((p, n)) | 
					
						
							| 
									
										
										
										
											2000-08-07 20:59:04 +00:00
										 |  |  |     if len(runs) <= 2: | 
					
						
							|  |  |  |         # use literal/range | 
					
						
							|  |  |  |         for p, n in runs: | 
					
						
							|  |  |  |             if n == 1: | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |                 outappend((LITERAL, p)) | 
					
						
							| 
									
										
										
										
											2000-08-07 20:59:04 +00:00
										 |  |  |             else: | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |                 outappend((RANGE, (p, p+n-1))) | 
					
						
							| 
									
										
										
										
											2000-08-07 20:59:04 +00:00
										 |  |  |         if len(out) < len(charset): | 
					
						
							|  |  |  |             return out | 
					
						
							|  |  |  |     else: | 
					
						
							|  |  |  |         # use bitmap | 
					
						
							| 
									
										
										
										
											2001-07-02 16:58:38 +00:00
										 |  |  |         data = _mk_bitmap(charmap) | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |         outappend((CHARSET, data)) | 
					
						
							| 
									
										
										
										
											2000-08-07 20:59:04 +00:00
										 |  |  |         return out | 
					
						
							|  |  |  |     return charset | 
					
						
							|  |  |  | 
 | 
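The run-detection step in `_optimize_charset` above can be tried in isolation. The following is a minimal sketch, not the module's own code: `find_runs` is a hypothetical helper name, and it assumes the character map is a flat list of 0/1 flags indexed by code point.

```python
def find_runs(charmap):
    """Return (start, length) pairs for each run of set flags in charmap."""
    runs = []
    start = length = 0
    for index, flag in enumerate(charmap):
        if flag:
            if length == 0:
                start = index      # a new run begins here
            length += 1
        elif length:
            runs.append((start, length))
            length = 0
    if length:                     # flush a run that reaches the end
        runs.append((start, length))
    return runs


# Hypothetical input: the set [_a-z] as a 256-entry map.
charmap = [0] * 256
charmap[ord("_")] = 1
for code in range(ord("a"), ord("z") + 1):
    charmap[code] = 1
runs = find_runs(charmap)
```

With two runs or fewer, the function above emits LITERAL/RANGE items; with more, it falls back to a bitmap.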
					
						
							| 
									
										
										
										
											2001-07-02 16:58:38 +00:00
										 |  |  | def _mk_bitmap(bits): | 
					
						
							|  |  |  |     data = [] | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |     dataappend = data.append | 
					
						
							| 
									
										
										
										
											2003-04-19 12:56:08 +00:00
										 |  |  |     if _sre.CODESIZE == 2: | 
					
						
							|  |  |  |         start = (1, 0) | 
					
						
							|  |  |  |     else: | 
					
						
							| 
									
										
										
										
											2007-01-15 16:59:06 +00:00
										 |  |  |         start = (1, 0) | 
					
						
							| 
									
										
										
										
											2003-04-19 12:56:08 +00:00
										 |  |  |     m, v = start | 
					
						
							| 
									
										
										
										
											2001-07-02 16:58:38 +00:00
										 |  |  |     for c in bits: | 
					
						
							|  |  |  |         if c: | 
					
						
							|  |  |  |             v = v + m | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |         m = m + m | 
					
						
							| 
									
										
										
										
											2001-07-02 16:58:38 +00:00
										 |  |  |         if m > MAXCODE: | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |             dataappend(v) | 
					
						
							| 
									
										
										
										
											2003-04-19 12:56:08 +00:00
										 |  |  |             m, v = start | 
					
						
							| 
									
										
										
										
											2001-07-02 16:58:38 +00:00
										 |  |  |     return data | 
					
						
							|  |  |  | 
 | 
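As an illustration of what `_mk_bitmap` computes, here is a self-contained sketch that packs a list of flags into fixed-width words, least significant bit first. The name `pack_bits` and the `word_bits` parameter are assumptions made for this example; the 16-bit default corresponds to the `_sre.CODESIZE == 2` case.

```python
def pack_bits(bits, word_bits=16):
    """Pack truthy/falsy flags into integers of word_bits bits each."""
    words = []
    limit = (1 << word_bits) - 1   # plays the role of MAXCODE
    mask, value = 1, 0
    for bit in bits:
        if bit:
            value += mask
        mask <<= 1
        if mask > limit:           # word is full: emit it and start over
            words.append(value)
            mask, value = 1, 0
    return words


# Hypothetical input: a 256-entry map with 'a', 'b' and 'c' set.
charmap = [0] * 256
for code in b"abc":
    charmap[code] = 1
words = pack_bits(charmap)
```

As in the original loop, a trailing partial word is dropped, so the input length is expected to be a multiple of the word width.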
					
						
							|  |  |  | # To represent a big charset, first a bitmap of all characters in the | 
					
						
							|  |  |  | # set is constructed. Then, this bitmap is sliced into chunks of 256 | 
					
						
							| 
									
										
											  
											
											
										 
											2007-07-11 13:09:30 +00:00
										 |  |  | # characters, duplicate chunks are eliminated, and each chunk is | 
					
						
							| 
									
										
										
										
											2001-07-02 16:58:38 +00:00
										 |  |  | # given a number. In the compiled expression, the charset is | 
					
						
							|  |  |  | # represented by a 16-bit word sequence, consisting of one word for | 
					
						
							|  |  |  | # the number of different chunks, a sequence of 256 bytes (128 words) | 
					
						
							|  |  |  | # of chunk numbers indexed by their original chunk position, and a | 
					
						
							|  |  |  | # sequence of chunks (16 words each). | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | # Compression is normally good: in a typical charset, large ranges of | 
					
						
							|  |  |  | # Unicode will be either completely excluded (e.g. if only cyrillic | 
					
						
							|  |  |  | # letters are to be matched), or completely included (e.g. if large | 
					
						
							|  |  |  | # subranges of Kanji match). These ranges will be represented by | 
					
						
							|  |  |  | # chunks of all one-bits or all zero-bits. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | # Matching can also be done efficiently: the more significant byte of | 
					
						
							|  |  |  | # the Unicode character is an index into the chunk number, and the | 
					
						
							|  |  |  | # less significant byte is a bit index in the chunk (just like the | 
					
						
							|  |  |  | # CHARSET matching). | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2003-04-19 12:56:08 +00:00
										 |  |  | # In UCS-4 mode, the BIGCHARSET opcode still supports only subsets | 
					
						
							|  |  |  | # of the basic multilingual plane; an efficient representation | 
					
						
							|  |  |  | # for all of UTF-16 has not yet been developed. This means, | 
					
						
							|  |  |  | # in particular, that negated charsets cannot be represented as | 
					
						
							|  |  |  | # bigcharsets. | 
					
						
							|  |  |  | 
 | 
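The chunking scheme described in the comments above can be sketched as follows. This is an illustrative model only: `build_chunk_table` and `contains` are hypothetical names, chunks are kept as tuples rather than packed words, and the map is assumed to cover exactly the 65536 code points of the basic multilingual plane.

```python
def build_chunk_table(charmap, chunk_size=256):
    """Split a flat 0/1 map into chunks and number the distinct ones.

    Returns (mapping, chunks): mapping[i] is the number of the chunk
    that covers positions i*chunk_size .. (i+1)*chunk_size - 1.
    """
    seen = {}
    mapping = []
    chunks = []
    for start in range(0, len(charmap), chunk_size):
        chunk = tuple(charmap[start:start + chunk_size])
        index = seen.setdefault(chunk, len(chunks))
        if index == len(chunks):   # first time this chunk is seen
            chunks.append(chunk)
        mapping.append(index)
    return mapping, chunks


def contains(mapping, chunks, codepoint):
    """High byte selects the chunk number, low byte the bit in the chunk."""
    return bool(chunks[mapping[codepoint >> 8]][codepoint & 0xFF])


# Hypothetical input: the basic Cyrillic letters U+0410..U+044F.
charmap = [0] * 65536
for cp in range(0x0410, 0x0450):
    charmap[cp] = 1
mapping, chunks = build_chunk_table(charmap)
```

Only two distinct chunks survive here (the all-zero chunk and the one holding the Cyrillic block), which is the compression effect the comments describe.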
					
						
							| 
									
										
										
										
											2001-07-02 16:58:38 +00:00
										 |  |  | def _optimize_unicode(charset, fixup): | 
					
						
							| 
									
										
										
										
											2003-04-19 12:56:08 +00:00
										 |  |  |     try: | 
					
						
							|  |  |  |         import array | 
					
						
							|  |  |  |     except ImportError: | 
					
						
							|  |  |  |         return charset | 
					
						
							| 
									
										
										
										
											2004-03-27 09:24:36 +00:00
										 |  |  |     charmap = [0]*65536 | 
					
						
							| 
									
										
										
										
											2001-07-02 16:58:38 +00:00
										 |  |  |     negate = 0 | 
					
						
							| 
									
										
										
										
											2003-04-19 12:56:08 +00:00
										 |  |  |     try: | 
					
						
							|  |  |  |         for op, av in charset: | 
					
						
							|  |  |  |             if op is NEGATE: | 
					
						
							|  |  |  |                 negate = 1 | 
					
						
							|  |  |  |             elif op is LITERAL: | 
					
						
							| 
									
										
										
										
											2004-03-27 09:24:36 +00:00
										 |  |  |                 charmap[fixup(av)] = 1 | 
					
						
							| 
									
										
										
										
											2003-04-19 12:56:08 +00:00
										 |  |  |             elif op is RANGE: | 
					
						
							| 
									
										
										
										
											2007-05-07 22:24:25 +00:00
										 |  |  |                 for i in range(fixup(av[0]), fixup(av[1])+1): | 
					
						
							| 
									
										
										
										
											2004-03-27 09:24:36 +00:00
										 |  |  |                     charmap[i] = 1 | 
					
						
							| 
									
										
										
										
											2003-04-19 12:56:08 +00:00
										 |  |  |             elif op is CATEGORY: | 
					
						
							|  |  |  |                 # XXX: could expand category | 
					
						
							|  |  |  |                 return charset # cannot compress | 
					
						
							|  |  |  |     except IndexError: | 
					
						
							| 
									
										
										
										
											2011-10-04 19:06:00 +03:00
										 |  |  |         # non-BMP characters; XXX now they should work | 
					
						
							| 
									
										
										
										
											2003-04-19 12:56:08 +00:00
										 |  |  |         return charset | 
					
						
							| 
									
										
										
										
											2001-07-02 16:58:38 +00:00
										 |  |  |     if negate: | 
					
						
							| 
									
										
										
										
											2003-04-19 12:56:08 +00:00
										 |  |  |         if sys.maxunicode != 65535: | 
					
						
							|  |  |  |             # XXX: negation does not work with big charsets | 
					
						
							| 
									
										
										
										
											2011-10-04 19:06:00 +03:00
										 |  |  |             # XXX2: now they should work, but removing this will make the | 
					
						
							|  |  |  |             # charmap 17 times bigger | 
					
						
							| 
									
										
										
										
											2003-04-19 12:56:08 +00:00
										 |  |  |             return charset | 
					
						
							| 
									
										
										
										
											2007-05-07 22:24:25 +00:00
										 |  |  |         for i in range(65536): | 
					
						
							| 
									
										
										
										
											2001-07-02 16:58:38 +00:00
										 |  |  |             charmap[i] = not charmap[i] | 
					
						
							|  |  |  |     comps = {} | 
					
						
							|  |  |  |     mapping = [0]*256 | 
					
						
							|  |  |  |     block = 0 | 
					
						
							|  |  |  |     data = [] | 
					
						
							| 
									
										
										
										
											2007-05-07 22:24:25 +00:00
										 |  |  |     for i in range(256): | 
					
						
							| 
									
										
										
										
											2001-07-02 16:58:38 +00:00
										 |  |  |         chunk = tuple(charmap[i*256:(i+1)*256]) | 
					
						
							|  |  |  |         new = comps.setdefault(chunk, block) | 
					
						
							|  |  |  |         mapping[i] = new | 
					
						
							|  |  |  |         if new == block: | 
					
						
							| 
									
										
										
										
											2002-06-27 20:08:25 +00:00
										 |  |  |             block = block + 1 | 
					
						
							|  |  |  |             data = data + _mk_bitmap(chunk) | 
					
						
							| 
									
										
										
										
											2001-07-02 16:58:38 +00:00
										 |  |  |     header = [block] | 
					
						
							| 
									
										
										
										
											2004-05-07 07:18:13 +00:00
										 |  |  |     if _sre.CODESIZE == 2: | 
					
						
							| 
									
										
										
										
											2003-04-19 12:56:08 +00:00
										 |  |  |         code = 'H' | 
					
						
							|  |  |  |     else: | 
					
						
							| 
									
										
										
										
											2004-05-07 07:18:13 +00:00
										 |  |  |         code = 'I' | 
					
						
							| 
									
										
										
										
											2003-04-19 12:56:08 +00:00
										 |  |  |     # Convert block indices to byte array of 256 bytes | 
					
						
							| 
									
										
										
										
											2010-09-01 20:29:34 +00:00
										 |  |  |     mapping = array.array('B', mapping).tobytes() | 
					
						
							| 
									
										
										
										
											2003-04-19 12:56:08 +00:00
										 |  |  |     # Convert byte array to word array | 
					
						
							| 
									
										
										
										
											2004-05-07 07:18:13 +00:00
										 |  |  |     mapping = array.array(code, mapping) | 
					
						
							|  |  |  |     assert mapping.itemsize == _sre.CODESIZE | 
					
						
							| 
									
										
										
										
											2007-07-03 16:22:09 +00:00
										 |  |  |     assert len(mapping) * mapping.itemsize == 256 | 
					
						
							| 
									
										
										
										
											2004-05-07 07:18:13 +00:00
										 |  |  |     header = header + mapping.tolist() | 
					
						
							| 
									
										
										
										
											2001-07-02 16:58:38 +00:00
										 |  |  |     data[0:0] = header | 
					
						
							| 
									
										
										
										
											2001-07-21 01:41:30 +00:00
										 |  |  |     return [(BIGCHARSET, data)] | 
					
						
							| 
									
										
										
										
											2001-07-02 16:58:38 +00:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2000-08-07 20:59:04 +00:00
										 |  |  | def _simple(av): | 
					
						
							|  |  |  |     # check if av is a "simple" operator | 
					
						
							|  |  |  |     lo, hi = av[2].getwidth() | 
					
						
							| 
									
										
										
										
											2000-10-07 17:38:23 +00:00
										 |  |  |     if lo == 0 and hi == MAXREPEAT: | 
					
						
							| 
									
										
										
										
											2007-08-30 01:19:48 +00:00
										 |  |  |         raise error("nothing to repeat") | 
					
						
							| 
									
										
										
										
											2000-08-07 20:59:04 +00:00
										 |  |  |     return lo == hi == 1 and av[2][0][0] != SUBPATTERN | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2000-06-29 23:33:12 +00:00
										 |  |  | def _compile_info(code, pattern, flags): | 
					
						
							|  |  |  |     # internal: compile an info block.  in the current version, | 
					
						
							| 
									
										
										
										
											2000-07-02 12:00:07 +00:00
										 |  |  |     # this contains min/max pattern width, and an optional literal | 
					
						
							|  |  |  |     # prefix or a character map | 
					
						
							| 
									
										
										
										
											2000-06-29 23:33:12 +00:00
										 |  |  |     lo, hi = pattern.getwidth() | 
					
						
							|  |  |  |     if lo == 0: | 
					
						
							| 
									
										
										
										
											2000-06-30 07:50:59 +00:00
										 |  |  |         return # not worth it | 
					
						
							| 
									
										
										
										
											2000-06-29 23:33:12 +00:00
										 |  |  |     # look for a literal prefix | 
					
						
							|  |  |  |     prefix = [] | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |     prefixappend = prefix.append | 
					
						
							| 
									
										
										
										
											2000-08-07 20:59:04 +00:00
										 |  |  |     prefix_skip = 0 | 
					
						
							| 
									
										
										
										
											2000-07-02 12:00:07 +00:00
										 |  |  |     charset = [] # not used | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |     charsetappend = charset.append | 
					
						
							| 
									
										
										
										
											2000-06-29 23:33:12 +00:00
										 |  |  |     if not (flags & SRE_FLAG_IGNORECASE): | 
					
						
							| 
									
										
										
										
											2000-08-07 20:59:04 +00:00
										 |  |  |         # look for literal prefix | 
					
						
							| 
									
										
										
										
											2000-06-30 07:50:59 +00:00
										 |  |  |         for op, av in pattern.data: | 
					
						
							|  |  |  |             if op is LITERAL: | 
					
						
							| 
									
										
										
										
											2000-08-07 20:59:04 +00:00
										 |  |  |                 if len(prefix) == prefix_skip: | 
					
						
							|  |  |  |                     prefix_skip = prefix_skip + 1 | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |                 prefixappend(av) | 
					
						
							| 
									
										
										
										
											2000-08-07 20:59:04 +00:00
										 |  |  |             elif op is SUBPATTERN and len(av[1]) == 1: | 
					
						
							|  |  |  |                 op, av = av[1][0] | 
					
						
							|  |  |  |                 if op is LITERAL: | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |                     prefixappend(av) | 
					
						
							| 
									
										
										
										
											2000-08-07 20:59:04 +00:00
										 |  |  |                 else: | 
					
						
							|  |  |  |                     break | 
					
						
							| 
									
										
										
										
											2000-06-30 07:50:59 +00:00
										 |  |  |             else: | 
					
						
							|  |  |  |                 break | 
					
						
							| 
									
										
										
										
											2000-08-07 20:59:04 +00:00
										 |  |  |         # if no prefix, look for charset prefix | 
					
						
							|  |  |  |         if not prefix and pattern.data: | 
					
						
							|  |  |  |             op, av = pattern.data[0] | 
					
						
							|  |  |  |             if op is SUBPATTERN and av[1]: | 
					
						
							|  |  |  |                 op, av = av[1][0] | 
					
						
							|  |  |  |                 if op is LITERAL: | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |                     charsetappend((op, av)) | 
					
						
							| 
									
										
										
										
											2000-08-07 20:59:04 +00:00
										 |  |  |                 elif op is BRANCH: | 
					
						
							|  |  |  |                     c = [] | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |                     cappend = c.append | 
					
						
							| 
									
										
										
										
											2000-08-07 20:59:04 +00:00
										 |  |  |                     for p in av[1]: | 
					
						
							|  |  |  |                         if not p: | 
					
						
							|  |  |  |                             break | 
					
						
							|  |  |  |                         op, av = p[0] | 
					
						
							|  |  |  |                         if op is LITERAL: | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |                             cappend((op, av)) | 
					
						
							| 
									
										
										
										
											2000-08-07 20:59:04 +00:00
										 |  |  |                         else: | 
					
						
							|  |  |  |                             break | 
					
						
							|  |  |  |                     else: | 
					
						
							|  |  |  |                         charset = c | 
					
						
							|  |  |  |             elif op is BRANCH: | 
					
						
							|  |  |  |                 c = [] | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |                 cappend = c.append | 
					
						
							| 
									
										
										
										
											2000-08-07 20:59:04 +00:00
										 |  |  |                 for p in av[1]: | 
					
						
							|  |  |  |                     if not p: | 
					
						
							|  |  |  |                         break | 
					
						
							|  |  |  |                     op, av = p[0] | 
					
						
							|  |  |  |                     if op is LITERAL: | 
					
						
							| 
									
										
										
										
											2004-03-26 11:16:55 +00:00
										 |  |  |                         cappend((op, av)) | 
					
						
							| 
									
										
										
										
											2000-08-07 20:59:04 +00:00
										 |  |  |                     else: | 
					
						
							|  |  |  |                         break | 
					
						
							|  |  |  |                 else: | 
					
						
							|  |  |  |                     charset = c | 
					
						
							|  |  |  |             elif op is IN: | 
					
						
							|  |  |  |                 charset = av | 
					
						
							|  |  |  | ##     if prefix: | 
					
						
							|  |  |  | ##         print "*** PREFIX", prefix, prefix_skip | 
					
						
							|  |  |  | ##     if charset: | 
					
						
							|  |  |  | ##         print "*** CHARSET", charset | 
					
						
							| 
									
										
										
										
											2000-06-29 23:33:12 +00:00
										 |  |  |     # add an info block | 
					
						
							|  |  |  |     emit = code.append | 
					
						
							|  |  |  |     emit(OPCODES[INFO]) | 
					
						
							|  |  |  |     skip = len(code); emit(0) | 
					
						
							|  |  |  |     # literal flag | 
					
						
							|  |  |  |     mask = 0 | 
					
						
							| 
									
										
										
										
											2000-07-02 12:00:07 +00:00
										 |  |  |     if prefix: | 
					
						
							|  |  |  |         mask = SRE_INFO_PREFIX | 
					
						
							| 
									
										
										
										
											2000-08-07 20:59:04 +00:00
										 |  |  |         if len(prefix) == prefix_skip == len(pattern.data): | 
					
						
							| 
									
										
										
										
											2000-07-02 12:00:07 +00:00
										 |  |  |             mask = mask + SRE_INFO_LITERAL | 
					
						
							|  |  |  |     elif charset: | 
					
						
							|  |  |  |         mask = mask + SRE_INFO_CHARSET | 
					
						
							| 
									
										
										
										
											2000-06-29 23:33:12 +00:00
										 |  |  |     emit(mask) | 
					
						
							|  |  |  |     # pattern length | 
					
						
							| 
									
										
										
										
											2000-07-02 12:00:07 +00:00
										 |  |  |     if lo < MAXCODE: | 
					
						
							|  |  |  |         emit(lo) | 
					
						
							|  |  |  |     else: | 
					
						
							|  |  |  |         emit(MAXCODE) | 
					
						
							|  |  |  |         prefix = prefix[:MAXCODE] | 
					
						
							|  |  |  |     if hi < MAXCODE: | 
					
						
							| 
									
										
										
										
											2000-06-30 07:50:59 +00:00
										 |  |  |         emit(hi) | 
					
						
							| 
									
										
										
										
											2000-06-29 23:33:12 +00:00
										 |  |  |     else: | 
					
						
							| 
									
										
										
										
											2000-06-30 07:50:59 +00:00
										 |  |  |         emit(0) | 
					
						
							| 
									
										
										
										
											2000-06-29 23:33:12 +00:00
										 |  |  |     # add literal prefix | 
					
						
							|  |  |  |     if prefix: | 
					
						
							| 
									
										
										
										
											2000-08-07 20:59:04 +00:00
										 |  |  |         emit(len(prefix)) # length | 
					
						
							|  |  |  |         emit(prefix_skip) # skip | 
					
						
							|  |  |  |         code.extend(prefix) | 
					
						
							|  |  |  |         # generate overlap table | 
					
						
							|  |  |  |         table = [-1] + ([0]*len(prefix)) | 
					
						
							| 
									
										
										
										
											2007-05-07 22:24:25 +00:00
										 |  |  |         for i in range(len(prefix)): | 
					
						
							| 
									
										
										
										
											2000-08-07 20:59:04 +00:00
										 |  |  |             table[i+1] = table[i]+1 | 
					
						
							|  |  |  |             while table[i+1] > 0 and prefix[i] != prefix[table[i+1]-1]: | 
					
						
							|  |  |  |                 table[i+1] = table[table[i+1]-1]+1 | 
					
						
							|  |  |  |         code.extend(table[1:]) # don't store first entry | 
					
						
							| 
									
										
										
										
											2000-07-02 12:00:07 +00:00
										 |  |  |     elif charset: | 
					
						
							| 
									
										
										
										
											2003-02-24 01:18:35 +00:00
										 |  |  |         _compile_charset(charset, flags, code) | 
					
						
							| 
									
										
										
										
											2000-06-29 23:33:12 +00:00
										 |  |  |     code[skip] = len(code) - skip | 
					
						
							|  |  |  | 
 | 
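The overlap table generated for a literal prefix in `_compile_info` follows the same recurrence as the failure function used in Knuth-Morris-Pratt style searching. The sketch below reproduces that loop as a standalone function; `overlap_table` is a hypothetical name, and it accepts any sequence (a string here, whereas the compiler works on lists of character codes).

```python
def overlap_table(prefix):
    """Return, for each position, the length of the longest proper
    prefix of prefix[:i+1] that is also a suffix of it."""
    table = [-1] + [0] * len(prefix)
    for i in range(len(prefix)):
        table[i + 1] = table[i] + 1
        # Fall back to shorter candidate overlaps until one extends.
        while table[i + 1] > 0 and prefix[i] != prefix[table[i + 1] - 1]:
            table[i + 1] = table[table[i + 1] - 1] + 1
    return table[1:]   # the leading -1 sentinel is not stored


example = overlap_table("abab")
```

For `"abab"` the result is `[0, 0, 1, 2]`: after matching `abab` and then failing, the search can resume with two characters already matched.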
					
						
							| 
									
										
										
										
											2003-07-02 21:37:16 +00:00
										 |  |  | def isstring(obj): | 
					
						
							| 
									
										
										
										
											2008-03-18 20:19:54 +00:00
										 |  |  |     return isinstance(obj, (str, bytes)) | 
					
						
							| 
									
										
										
										
											2003-07-02 21:37:16 +00:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2000-08-01 21:05:41 +00:00
										 |  |  | def _code(p, flags): | 
					
						
							| 
									
										
										
										
											2000-06-29 23:33:12 +00:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2000-06-01 17:39:12 +00:00
										 |  |  |     flags = p.pattern.flags | flags | 
					
						
							| 
									
										
										
										
											2000-06-29 16:57:40 +00:00
										 |  |  |     code = [] | 
					
						
							| 
									
										
										
										
											2000-06-29 23:33:12 +00:00
										 |  |  | 
 | 
					
						
							|  |  |  |     # compile info block | 
					
						
							|  |  |  |     _compile_info(code, p, flags) | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     # compile the pattern | 
					
						
							| 
									
										
										
										
											2000-06-01 17:39:12 +00:00
										 |  |  |     _compile(code, p.data, flags) | 
					
						
							| 
									
										
										
										
											2000-06-29 23:33:12 +00:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2000-06-01 17:39:12 +00:00
										 |  |  |     code.append(OPCODES[SUCCESS]) | 
					
						
							| 
									
										
										
										
											2000-06-29 23:33:12 +00:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2000-08-01 18:20:07 +00:00
										 |  |  |     return code | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | def compile(p, flags=0): | 
					
						
							|  |  |  |     # internal: convert pattern list to internal format | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2003-07-02 21:37:16 +00:00
										 |  |  |     if isstring(p): | 
					
						
							| 
									
										
										
										
											2000-08-01 18:20:07 +00:00
										 |  |  |         pattern = p | 
					
						
							|  |  |  |         p = sre_parse.parse(p, flags) | 
					
						
							|  |  |  |     else: | 
					
						
							|  |  |  |         pattern = None | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2000-08-01 21:05:41 +00:00
										 |  |  |     code = _code(p, flags) | 
					
						
							| 
									
										
										
										
											2000-08-01 18:20:07 +00:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2000-07-23 21:46:17 +00:00
										 |  |  |     # print code | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2001-01-14 15:06:11 +00:00
										 |  |  |     # XXX: <fl> get rid of this limitation! | 
					
						
							| 
									
										
										
										
											2004-10-15 06:15:08 +00:00
										 |  |  |     if p.pattern.groups > 100: | 
					
						
							|  |  |  |         raise AssertionError( | 
					
						
							|  |  |  |             "sorry, but this version only supports 100 groups" | 
					
						
							|  |  |  |             ) | 
					
						
							| 
									
										
										
										
											2000-06-29 23:33:12 +00:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2000-07-02 22:25:39 +00:00
										 |  |  |     # map in either direction | 
					
						
							|  |  |  |     groupindex = p.pattern.groupdict | 
					
						
							|  |  |  |     indexgroup = [None] * p.pattern.groups | 
					
						
							|  |  |  |     for k, i in groupindex.items(): | 
					
						
							|  |  |  |         indexgroup[i] = k | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2000-06-01 17:39:12 +00:00
										 |  |  |     return _sre.compile( | 
					
						
							| 
									
										
											  
											
											
										 
											2008-01-03 23:01:04 +00:00
										 |  |  |         pattern, flags | p.pattern.flags, code, | 
					
						
							| 
									
										
										
										
											2000-07-03 18:44:21 +00:00
										 |  |  |         p.pattern.groups-1, | 
					
						
							|  |  |  |         groupindex, indexgroup | 
					
						
							| 
									
										
										
										
											2000-06-30 07:50:59 +00:00
										 |  |  |         ) |
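
The "map in either direction" step in compile() can be illustrated in isolation. The following is a minimal sketch, not part of the module: the helper name `invert_group_index` and the sample group names are hypothetical, and it assumes, as the code above suggests, that the group count includes slot 0 for the whole match.

```python
def invert_group_index(groupindex, groups):
    """Build an index -> name list from a name -> index mapping.

    `groups` is assumed to be the total number of slots, counting
    slot 0 (the whole match). Unnamed groups stay as None.
    """
    indexgroup = [None] * groups
    for name, index in groupindex.items():
        indexgroup[index] = name
    return indexgroup


# Hypothetical pattern with three groups, of which groups 1 and 3 are named.
groupindex = {"year": 1, "month": 3}
indexgroup = invert_group_index(groupindex, 4)
print(indexgroup)
```

Keeping both directions lets the matcher resolve a name to a group number and report the name of a group given only its number, without searching the dictionary.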