| 
									
										
										
										
											2000-02-04 15:28:42 +00:00
										 |  |  | """Parse (absolute and relative) URLs.
 | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2010-04-17 14:44:14 +00:00
										 |  |  | The urlparse module is based upon the following RFC specifications. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | RFC 3986 (STD66): "Uniform Resource Identifiers" by T. Berners-Lee, R. Fielding | 
					
						
							|  |  |  | and L. Masinter, January 2005. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | RFC 2732: "Format for Literal IPv6 Addresses in URL's" by R. Hinden, B. Carpenter | 
					
						
							|  |  |  | and L. Masinter, December 1999. | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
											  
											
											
										 
											2010-06-27 22:32:30 +00:00
										 |  |  | RFC 2396: "Uniform Resource Identifiers (URI): Generic Syntax" by T. | 
					
						
							| 
									
										
										
										
											2010-04-17 14:44:14 +00:00
										 |  |  | Berners-Lee, R. Fielding, and L. Masinter, August 1998. | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2010-12-02 16:41:00 +00:00
										 |  |  | RFC 2368: "The mailto URL scheme", by P. Hoffman, L. Masinter, J. Zawinski, July 1998. | 
					
						
							| 
									
										
										
										
											2010-04-17 14:44:14 +00:00
										 |  |  | 
 | 
					
						
							|  |  |  | RFC 1808: "Relative Uniform Resource Locators", by R. Fielding, UC Irvine, June | 
					
						
							|  |  |  | 1995. | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
											  
											
											
										 
											2010-06-27 22:32:30 +00:00
										 |  |  | RFC 1738: "Uniform Resource Locators (URL)" by T. Berners-Lee, L. Masinter, M. | 
					
						
							| 
									
										
										
										
											2010-04-17 14:44:14 +00:00
										 |  |  | McCahill, December 1994. | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
											  
											
											
										 
RFC 3986 is considered the current standard and any future changes to
the urlparse module should conform with it.  The urlparse module is
currently not entirely compliant with this RFC due to de facto
scenarios for parsing, and for backward compatibility purposes, some
parsing quirks from older RFCs are retained. The test cases in
test_urlparse.py provide a good indicator of parsing behavior.
"""
					
						
from collections import namedtuple
import functools
import math
import re
import types
import warnings
					
						
__all__ = ["urlparse", "urlunparse", "urljoin", "urldefrag",
           "urlsplit", "urlunsplit", "urlencode", "parse_qs",
           "parse_qsl", "quote", "quote_plus", "quote_from_bytes",
           "unquote", "unquote_plus", "unquote_to_bytes",
           "DefragResult", "ParseResult", "SplitResult",
           "DefragResultBytes", "ParseResultBytes", "SplitResultBytes"]
					
						
# A classification of schemes.
# The empty string classifies URLs with no scheme specified,
# being the default value returned by “urlsplit” and “urlparse”.

uses_relative = ['', 'ftp', 'http', 'gopher', 'nntp', 'imap',
                 'wais', 'file', 'https', 'shttp', 'mms',
                 'prospero', 'rtsp', 'rtspu', 'sftp',
                 'svn', 'svn+ssh', 'ws', 'wss']

uses_netloc = ['', 'ftp', 'http', 'gopher', 'nntp', 'telnet',
               'imap', 'wais', 'file', 'mms', 'https', 'shttp',
               'snews', 'prospero', 'rtsp', 'rtspu', 'rsync',
               'svn', 'svn+ssh', 'sftp', 'nfs', 'git', 'git+ssh',
               'ws', 'wss']

uses_params = ['', 'ftp', 'hdl', 'prospero', 'http', 'imap',
               'https', 'shttp', 'rtsp', 'rtspu', 'sip', 'sips',
               'mms', 'sftp', 'tel']
					
						
# These are not actually used anymore, but should stay for backwards
# compatibility.  (They are undocumented, but have a public-looking name.)

non_hierarchical = ['gopher', 'hdl', 'mailto', 'news',
                    'telnet', 'wais', 'imap', 'snews', 'sip', 'sips']

uses_query = ['', 'http', 'wais', 'imap', 'https', 'shttp', 'mms',
              'gopher', 'rtsp', 'rtspu', 'sip', 'sips']

uses_fragment = ['', 'ftp', 'hdl', 'http', 'gopher', 'news',
                 'nntp', 'wais', 'https', 'shttp', 'snews',
                 'file', 'prospero']
					
						
# Characters valid in scheme names
scheme_chars = ('abcdefghijklmnopqrstuvwxyz'
                'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
                '0123456789'
                '+-.')

# Unsafe bytes to be removed per WHATWG spec
_UNSAFE_URL_BYTES_TO_REMOVE = ['\t', '\r', '\n']
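# Illustrative aside, not part of the module: a minimal sketch of what the
# list above means for callers. It assumes a Python release whose urlsplit()
# strips these three characters before splitting; the URL is a made-up
# example and the variable names are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical input with a newline, a tab and a carriage return embedded.
messy_url = "http://exam\nple.com/pa\tth?x=\r1"
clean_parts = urlsplit(messy_url)

# The three characters are dropped before the URL is split, so each
# component comes out contiguous.
print(clean_parts.netloc, clean_parts.path, clean_parts.query)
```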
					
						
def clear_cache():
    """Clear internal performance caches. Undocumented; some tests want it."""
    urlsplit.cache_clear()
    _byte_quoter_factory.cache_clear()
					
						
# Helpers for bytes handling
# For 3.2, we deliberately require applications that
# handle improperly quoted URLs to do their own
# decoding and encoding. If valid use cases are
# presented, we may relax this by using latin-1
# decoding internally for 3.3
_implicit_encoding = 'ascii'
_implicit_errors = 'strict'

def _noop(obj):
    return obj

def _encode_result(obj, encoding=_implicit_encoding,
                        errors=_implicit_errors):
    return obj.encode(encoding, errors)

def _decode_args(args, encoding=_implicit_encoding,
                       errors=_implicit_errors):
    return tuple(x.decode(encoding, errors) if x else '' for x in args)

def _coerce_args(*args):
    # Invokes decode if necessary to create str args
    # and returns the coerced inputs along with
    # an appropriate result coercion function
    #   - noop for str inputs
    #   - encoding function otherwise
    str_input = isinstance(args[0], str)
    for arg in args[1:]:
        # We special-case the empty string to support the
        # "scheme=''" default argument to some functions
        if arg and isinstance(arg, str) != str_input:
            raise TypeError("Cannot mix str and non-str arguments")
    if str_input:
        return args + (_noop,)
    return _decode_args(args) + (_encode_result,)
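# Illustrative aside, not part of the module: a small sketch of how the
# coercion helpers above surface through the public urlsplit() function.
# The URLs and variable names are hypothetical; only urlsplit, SplitResult
# and SplitResultBytes come from the module itself.

```python
from urllib.parse import urlsplit, SplitResult, SplitResultBytes

text_parts = urlsplit("http://example.com/a?b=1")
byte_parts = urlsplit(b"http://example.com/a?b=1")

# str in, str-based result out; bytes in, bytes-based result out.
result_types = (type(text_parts), type(byte_parts))

# A str URL combined with a non-empty bytes default scheme is rejected.
try:
    urlsplit("//example.com/a", scheme=b"http")
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Bytes outside ASCII fail in the implicit strict ASCII decode.
try:
    urlsplit(b"http://example.com/caf\xe9")
    non_ascii_ok = True
except UnicodeDecodeError:
    non_ascii_ok = False
```

# Callers holding non-ASCII bytes are expected to decode them on their own
# before parsing, as the "Helpers for bytes handling" comment explains.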
					
						
# Result objects are more helpful than simple tuples
class _ResultMixinStr(object):
    """Standard approach to encoding parsed results from str to bytes"""
    __slots__ = ()

    def encode(self, encoding='ascii', errors='strict'):
        return self._encoded_counterpart(*(x.encode(encoding, errors) for x in self))


class _ResultMixinBytes(object):
    """Standard approach to decoding parsed results from bytes to str"""
    __slots__ = ()

    def decode(self, encoding='ascii', errors='strict'):
        return self._decoded_counterpart(*(x.decode(encoding, errors) for x in self))
					
						
class _NetlocResultMixinBase(object):
    """Shared methods for the parsed result objects containing a netloc element"""
    __slots__ = ()

    @property
    def username(self):
        return self._userinfo[0]

    @property
    def password(self):
        return self._userinfo[1]

    @property
    def hostname(self):
        hostname = self._hostinfo[0]
        if not hostname:
            return None
        # Scoped IPv6 address may have zone info, which must not be lowercased
        # like http://[fe80::822a:a8ff:fe49:470c%tESt]:1234/keys
        separator = '%' if isinstance(hostname, str) else b'%'
        hostname, percent, zone = hostname.partition(separator)
        return hostname.lower() + percent + zone

    @property
    def port(self):
        port = self._hostinfo[1]
        if port is not None:
            if port.isdigit() and port.isascii():
                port = int(port)
            else:
                raise ValueError(f"Port could not be cast to integer value as {port!r}")
            if not (0 <= port <= 65535):
                raise ValueError("Port out of range 0-65535")
        return port

    __class_getitem__ = classmethod(types.GenericAlias)
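# Illustrative aside, not part of the module: a sketch of how the hostname
# and port properties above behave when reached through urlsplit(). The URL
# is adapted from the comment in the hostname property; port_error() is a
# hypothetical helper written only for this example.

```python
from urllib.parse import urlsplit

scoped = urlsplit("http://[FE80::822a:a8ff:fe49:470c%tESt]:1234/keys")
scoped_host = scoped.hostname   # address part lowercased, zone left as-is
scoped_port = scoped.port       # digit string converted to int

# No port in the URL: the attribute is None rather than an error.
default_port = urlsplit("http://Example.COM/").port


def port_error(url):
    """Return the ValueError message raised by reading .port, or None."""
    try:
        urlsplit(url).port
    except ValueError as exc:
        return str(exc)
    return None


bad_text = port_error("http://example.com:80x/")     # not all digits
bad_range = port_error("http://example.com:99999/")  # above 65535
```

# In the code above, the port is validated when the attribute is read, so
# the ValueError appears at the point of access.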
					
						
class _NetlocResultMixinStr(_NetlocResultMixinBase, _ResultMixinStr):
    __slots__ = ()

    @property
    def _userinfo(self):
        netloc = self.netloc
        userinfo, have_info, hostinfo = netloc.rpartition('@')
        if have_info:
            username, have_password, password = userinfo.partition(':')
            if not have_password:
                password = None
        else:
            username = password = None
        return username, password

    @property
    def _hostinfo(self):
        netloc = self.netloc
        _, _, hostinfo = netloc.rpartition('@')
        _, have_open_br, bracketed = hostinfo.partition('[')
        if have_open_br:
            hostname, _, port = bracketed.partition(']')
            _, _, port = port.partition(':')
        else:
            hostname, _, port = hostinfo.partition(':')
        if not port:
            port = None
        return hostname, port
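# Illustrative aside, not part of the module: a sketch of the userinfo
# splitting above. Because _userinfo uses rpartition('@'), only the last
# '@' separates credentials from the host. All URLs and credentials here
# are made up.

```python
from urllib.parse import urlsplit

# The password itself contains an '@'.
full = urlsplit("ftp://alice:p@ss@files.example.org:21/pub")
user_only = urlsplit("http://bob@example.com/")
anonymous = urlsplit("http://example.com/")

credentials = [
    (parts.username, parts.password)
    for parts in (full, user_only, anonymous)
]
print(credentials)
print(full.hostname, full.port)
```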
					
						
class _NetlocResultMixinBytes(_NetlocResultMixinBase, _ResultMixinBytes):
    __slots__ = ()

    @property
    def _userinfo(self):
        netloc = self.netloc
        userinfo, have_info, hostinfo = netloc.rpartition(b'@')
        if have_info:
            username, have_password, password = userinfo.partition(b':')
            if not have_password:
                password = None
        else:
            username = password = None
        return username, password

    @property
    def _hostinfo(self):
        netloc = self.netloc
        _, _, hostinfo = netloc.rpartition(b'@')
        _, have_open_br, bracketed = hostinfo.partition(b'[')
        if have_open_br:
            hostname, _, port = bracketed.partition(b']')
            _, _, port = port.partition(b':')
        else:
            hostname, _, port = hostinfo.partition(b':')
        if not port:
            port = None
        return hostname, port
					
						
_DefragResultBase = namedtuple('DefragResult', 'url fragment')
_SplitResultBase = namedtuple(
    'SplitResult', 'scheme netloc path query fragment')
_ParseResultBase = namedtuple(
    'ParseResult', 'scheme netloc path params query fragment')
					
						
_DefragResultBase.__doc__ = """
DefragResult(url, fragment)

A 2-tuple that contains the URL without its fragment identifier, and the
fragment identifier as a separate field.
"""

_DefragResultBase.url.__doc__ = """The URL with no fragment identifier."""

_DefragResultBase.fragment.__doc__ = """
Fragment identifier separated from the URL, which allows indirect
identification of a secondary resource by reference to a primary resource and
additional identifying information.
"""
					
						
_SplitResultBase.__doc__ = """
SplitResult(scheme, netloc, path, query, fragment)

A 5-tuple that contains the different components of a URL. Similar to
ParseResult, but does not split params.
"""

_SplitResultBase.scheme.__doc__ = """Specifies URL scheme for the request."""

_SplitResultBase.netloc.__doc__ = """
Network location to which the request is made.
"""

_SplitResultBase.path.__doc__ = """
The hierarchical path, such as the path to a file to download.
"""

_SplitResultBase.query.__doc__ = """
The query component, which contains non-hierarchical data that, along with
data in the path component, identifies a resource within the scope of the
URI's scheme and network location.
"""

_SplitResultBase.fragment.__doc__ = """
Fragment identifier, which allows indirect identification of a secondary
resource by reference to a primary resource and additional identifying
information.
"""
					
						
_ParseResultBase.__doc__ = """
ParseResult(scheme, netloc, path, params, query, fragment)

A 6-tuple that contains components of a parsed URL.
"""

_ParseResultBase.scheme.__doc__ = _SplitResultBase.scheme.__doc__
_ParseResultBase.netloc.__doc__ = _SplitResultBase.netloc.__doc__
_ParseResultBase.path.__doc__ = _SplitResultBase.path.__doc__
_ParseResultBase.params.__doc__ = """
Parameters for the last path element, used to dereference the URI in order to
provide access to perform some operation on the resource.
"""

_ParseResultBase.query.__doc__ = _SplitResultBase.query.__doc__
_ParseResultBase.fragment.__doc__ = _SplitResultBase.fragment.__doc__
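# Illustrative aside, not part of the module: a sketch contrasting the
# 5-field and 6-field result tuples documented above. The URL is a made-up
# example; only urlsplit() and urlparse() come from the module.

```python
from urllib.parse import urlparse, urlsplit

url = "http://example.com/report;type=pdf?page=2#summary"

# urlsplit() leaves ';type=pdf' inside the path component.
split_fields = tuple(urlsplit(url))

# urlparse() moves the parameters of the last path element into 'params'.
parse_fields = tuple(urlparse(url))

print(split_fields)
print(parse_fields)
```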
					
						
# For backwards compatibility, alias _NetlocResultMixinStr
# ResultBase is no longer part of the documented API, but it is
# retained since deprecating it isn't worth the hassle
ResultBase = _NetlocResultMixinStr
					
						
# Structured result objects for string data
class DefragResult(_DefragResultBase, _ResultMixinStr):
    __slots__ = ()
    def geturl(self):
        if self.fragment:
            return self.url + '#' + self.fragment
        else:
            return self.url

class SplitResult(_SplitResultBase, _NetlocResultMixinStr):
    __slots__ = ()
    def geturl(self):
        return urlunsplit(self)

class ParseResult(_ParseResultBase, _NetlocResultMixinStr):
    __slots__ = ()
    def geturl(self):
        return urlunparse(self)
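# Illustrative aside, not part of the module: a sketch of DefragResult and
# its geturl() method, reached through the public urldefrag() function.
# The URLs and variable names are hypothetical.

```python
from urllib.parse import urldefrag

original = "http://example.com/doc.html#section-2"
with_fragment = urldefrag(original)
without_fragment = urldefrag("http://example.com/doc.html")

# geturl() re-attaches '#fragment' only when the fragment is non-empty.
rebuilt = with_fragment.geturl()
plain = without_fragment.geturl()
```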
					
						
# Structured result objects for bytes data
class DefragResultBytes(_DefragResultBase, _ResultMixinBytes):
    __slots__ = ()
    def geturl(self):
        if self.fragment:
            return self.url + b'#' + self.fragment
        else:
            return self.url

class SplitResultBytes(_SplitResultBase, _NetlocResultMixinBytes):
    __slots__ = ()
    def geturl(self):
        return urlunsplit(self)

class ParseResultBytes(_ParseResultBase, _NetlocResultMixinBytes):
    __slots__ = ()
    def geturl(self):
        return urlunparse(self)
					
						
							| 
									
										
										
										
											2010-11-30 15:48:08 +00:00
										 |  |  | # Set up the encode/decode result pairs | 
					
						
							|  |  |  | def _fix_result_transcoding(): | 
					
						
							|  |  |  |     _result_pairs = ( | 
					
						
							|  |  |  |         (DefragResult, DefragResultBytes), | 
					
						
							|  |  |  |         (SplitResult, SplitResultBytes), | 
					
						
							|  |  |  |         (ParseResult, ParseResultBytes), | 
					
						
							|  |  |  |     ) | 
					
						
							|  |  |  |     for _decoded, _encoded in _result_pairs: | 
					
						
							|  |  |  |         _decoded._encoded_counterpart = _encoded | 
					
						
							|  |  |  |         _encoded._decoded_counterpart = _decoded | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | _fix_result_transcoding() | 
					
						
							|  |  |  | del _fix_result_transcoding | 
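# A brief sketch of what the pairing above enables, assuming the public
# urllib.parse names from the standard library: a str result can be encoded to
# its bytes counterpart and decoded back. The example URL is illustrative only.

```python
from urllib.parse import urlsplit, SplitResult, SplitResultBytes

# Parse a str URL, then move between the str and bytes result classes.
text_result = urlsplit("http://example.com/a?b=1")
bytes_result = text_result.encode()    # encoding defaults to ASCII
round_trip = bytes_result.decode()     # back to the str-based class
```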
					
						
							| 
									
										
										
										
											2006-04-21 10:40:58 +00:00
										 |  |  | 
 | 
					
						
							|  |  |  | def urlparse(url, scheme='', allow_fragments=True): | 
					
						
							| 
									
										
										
										
											2001-01-15 03:34:38 +00:00
										 |  |  |     """Parse a URL into 6 components:
 | 
					
						
							|  |  |  |     <scheme>://<netloc>/<path>;<params>?<query>#<fragment> | 
					
						
							| 
									
										
										
										
											2020-02-16 14:17:58 -05:00
										 |  |  | 
 | 
					
						
							|  |  |  |     The result is a named 6-tuple with fields corresponding to the | 
					
						
							|  |  |  |     above. It is either a ParseResult or ParseResultBytes object, | 
					
						
							|  |  |  |     depending on the type of the url parameter. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     The username, password, hostname, and port sub-components of netloc | 
					
						
							|  |  |  |     can also be accessed as attributes of the returned object. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     The scheme argument provides the default value of the scheme | 
					
						
							|  |  |  |     component when no scheme is found in url. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     If allow_fragments is False, no attempt is made to separate the | 
					
						
							|  |  |  |     fragment component from the previous component, which can be either | 
					
						
							|  |  |  |     path or query. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     Note that % escapes are not expanded. | 
					
						
							|  |  |  |     """
 | 
					
						
							| 
									
										
										
										
											2010-11-30 15:48:08 +00:00
										 |  |  |     url, scheme, _coerce_result = _coerce_args(url, scheme) | 
					
						
							| 
									
										
										
										
											2012-06-29 11:08:20 -07:00
										 |  |  |     splitresult = urlsplit(url, scheme, allow_fragments) | 
					
						
							|  |  |  |     scheme, netloc, url, query, fragment = splitresult | 
					
						
							| 
									
										
										
										
											2001-11-16 02:52:57 +00:00
										 |  |  |     if scheme in uses_params and ';' in url: | 
					
						
							|  |  |  |         url, params = _splitparams(url) | 
					
						
							|  |  |  |     else: | 
					
						
							|  |  |  |         params = '' | 
					
						
							| 
									
										
										
										
											2010-11-30 15:48:08 +00:00
										 |  |  |     result = ParseResult(scheme, netloc, url, params, query, fragment) | 
					
						
							|  |  |  |     return _coerce_result(result) | 
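# A small usage sketch for urlparse(), using the public urllib.parse import
# path. The URLs and credentials are made-up illustrations; the field names
# are the ones listed in the docstring above.

```python
from urllib.parse import urlparse

# All six components, plus the netloc sub-components as attributes.
parts = urlparse("http://user:pw@example.com:8080/a/b;type=x?q=1#frag")

# The scheme argument is only a default for URLs that carry no scheme.
defaulted = urlparse("//example.com/x", scheme="https")

# With allow_fragments=False the '#...' text stays in the preceding component.
no_frag = urlparse("http://example.com/p#f", allow_fragments=False)

# bytes input yields the bytes flavour of the result class.
bytes_parts = urlparse(b"http://example.com/x")
```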
					
						
							| 
									
										
										
										
											2001-11-16 02:52:57 +00:00
										 |  |  | 
 | 
					
						
							|  |  |  | def _splitparams(url): | 
					
						
							|  |  |  |     if '/' in url: | 
					
						
							|  |  |  |         i = url.find(';', url.rfind('/')) | 
					
						
							|  |  |  |         if i < 0: | 
					
						
							|  |  |  |             return url, '' | 
					
						
							|  |  |  |     else: | 
					
						
							|  |  |  |         i = url.find(';') | 
					
						
							|  |  |  |     return url[:i], url[i+1:] | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2005-01-09 15:29:10 +00:00
										 |  |  | def _splitnetloc(url, start=0): | 
					
						
							| 
									
										
											  
											
											
										 
											2008-01-06 16:59:19 +00:00
										 |  |  |     delim = len(url)   # position of end of domain part of url, default is end | 
					
						
							|  |  |  |     for c in '/?#':    # look for delimiters; the order is NOT important | 
					
						
							|  |  |  |         wdelim = url.find(c, start)        # find first of this delim | 
					
						
							|  |  |  |         if wdelim >= 0:                    # if found | 
					
						
							|  |  |  |             delim = min(delim, wdelim)     # use earliest delim position | 
					
						
							|  |  |  |     return url[start:delim], url[delim:]   # return (domain, rest) | 
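# To make the delimiter scan concrete, here is a standalone sketch of the same
# idea. split_netloc is a hypothetical copy written for illustration, not a
# public API, and the URL is invented.

```python
def split_netloc(url, start=0):
    """Return (netloc, rest), cutting at the earliest of '/', '?' or '#'."""
    delim = len(url)                  # default: netloc runs to end of string
    for c in "/?#":
        pos = url.find(c, start)
        if pos >= 0:
            delim = min(delim, pos)   # keep the earliest delimiter seen
    return url[start:delim], url[delim:]


# The '?' comes before the '/' inside the query, so the cut happens at '?'.
netloc, rest = split_netloc("//example.com?arg=/foo", 2)
```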
					
						
							| 
									
										
										
										
											2005-01-09 15:29:10 +00:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2019-03-07 08:02:26 -08:00
										 |  |  | def _checknetloc(netloc): | 
					
						
							|  |  |  |     if not netloc or netloc.isascii(): | 
					
						
							|  |  |  |         return | 
					
						
							|  |  |  |     # looking for characters like \u2100 that expand to 'a/c' | 
					
						
							|  |  |  |     # IDNA uses NFKC equivalence, so normalize for this check | 
					
						
							|  |  |  |     import unicodedata | 
					
						
							| 
									
										
										
										
											2019-06-04 08:55:30 -07:00
										 |  |  |     n = netloc.replace('@', '')   # ignore characters already included | 
					
						
							|  |  |  |     n = n.replace(':', '')        # but not the surrounding text | 
					
						
							|  |  |  |     n = n.replace('#', '') | 
					
						
							| 
									
										
										
										
											2019-04-30 12:03:02 +00:00
										 |  |  |     n = n.replace('?', '') | 
					
						
							|  |  |  |     netloc2 = unicodedata.normalize('NFKC', n) | 
					
						
							|  |  |  |     if n == netloc2: | 
					
						
							| 
									
										
										
										
											2019-03-07 08:02:26 -08:00
										 |  |  |         return | 
					
						
							|  |  |  |     for c in '/?#@:': | 
					
						
							|  |  |  |         if c in netloc2: | 
					
						
							| 
									
										
										
										
											2019-04-30 12:03:02 +00:00
										 |  |  |             raise ValueError("netloc '" + netloc + "' contains invalid " + | 
					
						
							| 
									
										
										
										
											2019-03-07 08:02:26 -08:00
										 |  |  |                              "characters under NFKC normalization") | 
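# A short sketch of the problem this check guards against, using only the
# standard library. The host name is invented; the observable behaviour assumed
# here is that urlsplit() raises ValueError for such a netloc.

```python
import unicodedata
from urllib.parse import urlsplit

# U+2100 (ACCOUNT OF) turns into three ASCII characters under NFKC,
# one of which is the '/' delimiter.
normalized = unicodedata.normalize("NFKC", "\u2100")

try:
    urlsplit("http://example.com\u2100evil.test/")
    rejected = False
except ValueError:
    rejected = True
```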
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2021-05-11 17:01:44 -07:00
										 |  |  | # typed=True avoids BytesWarnings being emitted during cache key | 
					
						
							|  |  |  | # comparison since this API supports both bytes and str input. | 
					
						
							|  |  |  | @functools.lru_cache(typed=True) | 
					
						
							| 
									
										
										
										
											2006-04-21 10:40:58 +00:00
										 |  |  | def urlsplit(url, scheme='', allow_fragments=True): | 
					
						
							| 
									
										
										
										
											2001-11-16 02:52:57 +00:00
										 |  |  |     """Parse a URL into 5 components:
 | 
					
						
							|  |  |  |     <scheme>://<netloc>/<path>?<query>#<fragment> | 
					
						
							| 
									
										
										
										
											2020-02-16 14:17:58 -05:00
										 |  |  | 
 | 
					
						
							|  |  |  |     The result is a named 5-tuple with fields corresponding to the | 
					
						
							|  |  |  |     above. It is either a SplitResult or SplitResultBytes object, | 
					
						
							|  |  |  |     depending on the type of the url parameter. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     The username, password, hostname, and port sub-components of netloc | 
					
						
							|  |  |  |     can also be accessed as attributes of the returned object. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     The scheme argument provides the default value of the scheme | 
					
						
							|  |  |  |     component when no scheme is found in url. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     If allow_fragments is False, no attempt is made to separate the | 
					
						
							|  |  |  |     fragment component from the previous component, which can be either | 
					
						
							|  |  |  |     path or query. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     Note that % escapes are not expanded. | 
					
						
							|  |  |  |     """
 | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2010-11-30 15:48:08 +00:00
										 |  |  |     url, scheme, _coerce_result = _coerce_args(url, scheme) | 
					
						
							| 
									
										
										
										
											2021-05-05 15:50:05 -07:00
										 |  |  | 
 | 
					
						
							|  |  |  |     for b in _UNSAFE_URL_BYTES_TO_REMOVE: | 
					
						
							|  |  |  |         url = url.replace(b, "") | 
					
						
							|  |  |  |         scheme = scheme.replace(b, "") | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2006-04-21 10:40:58 +00:00
										 |  |  |     allow_fragments = bool(allow_fragments) | 
					
						
							| 
									
										
										
										
											2001-11-16 02:52:57 +00:00
										 |  |  |     netloc = query = fragment = '' | 
					
						
							| 
									
										
										
										
											2001-01-15 03:34:38 +00:00
										 |  |  |     i = url.find(':') | 
					
						
							|  |  |  |     if i > 0: | 
					
						
							| 
									
										
										
										
											2011-04-15 18:20:24 +08:00
										 |  |  |         for c in url[:i]: | 
					
						
							|  |  |  |             if c not in scheme_chars: | 
					
						
							|  |  |  |                 break | 
					
						
							|  |  |  |         else: | 
					
						
							| 
									
										
										
										
											2019-10-18 09:07:20 -04:00
										 |  |  |             scheme, url = url[:i].lower(), url[i+1:] | 
					
						
							| 
									
										
										
										
											2011-04-15 18:20:24 +08:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2010-02-19 07:42:50 +00:00
										 |  |  |     if url[:2] == '//': | 
					
						
							| 
									
										
										
										
											2005-01-09 15:29:10 +00:00
										 |  |  |         netloc, url = _splitnetloc(url, 2) | 
					
						
							| 
									
										
										
										
											2010-04-22 12:19:46 +00:00
										 |  |  |         if (('[' in netloc and ']' not in netloc) or | 
					
						
							|  |  |  |                 (']' in netloc and '[' not in netloc)): | 
					
						
							|  |  |  |             raise ValueError("Invalid IPv6 URL") | 
					
						
							| 
									
										
										
										
											2012-05-19 08:12:00 +08:00
										 |  |  |     if allow_fragments and '#' in url: | 
					
						
							| 
									
										
										
										
											2001-11-16 02:52:57 +00:00
										 |  |  |         url, fragment = url.split('#', 1) | 
					
						
							| 
									
										
										
										
											2012-05-19 08:12:00 +08:00
										 |  |  |     if '?' in url: | 
					
						
							| 
									
										
										
										
											2001-11-16 02:52:57 +00:00
										 |  |  |         url, query = url.split('?', 1) | 
					
						
							| 
									
										
										
										
											2019-03-07 08:02:26 -08:00
										 |  |  |     _checknetloc(netloc) | 
					
						
							| 
									
										
										
										
											2006-04-21 10:40:58 +00:00
										 |  |  |     v = SplitResult(scheme, netloc, url, query, fragment) | 
					
						
							| 
									
										
										
										
											2010-11-30 15:48:08 +00:00
										 |  |  |     return _coerce_result(v) | 
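# A usage sketch for urlsplit(), with illustrative URLs. Unlike urlparse(), it
# leaves any ';params' text inside the path; the scheme is lower-cased and an
# unbalanced '[' in the netloc is rejected.

```python
from urllib.parse import urlsplit

parts = urlsplit("HTTP://Example.com/a;b?q=1#frag")

try:
    urlsplit("http://[::1/")          # '[' without a matching ']'
    bad_ipv6_rejected = False
except ValueError:
    bad_ipv6_rejected = True
```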
					
						
							| 
									
										
										
										
											1994-09-12 10:36:35 +00:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2007-05-15 18:46:22 +00:00
										 |  |  | def urlunparse(components): | 
					
						
							| 
									
										
										
										
											2001-01-15 03:34:38 +00:00
										 |  |  |     """Put a parsed URL back together again.  This may result in a
 | 
					
						
							|  |  |  |     slightly different, but equivalent URL, if the URL that was parsed | 
					
						
							|  |  |  |     originally had redundant delimiters, e.g. a ? with an empty query | 
					
						
							|  |  |  |     (the RFC states that these are equivalent)."""
 | 
					
						
							| 
									
										
										
										
											2010-11-30 15:48:08 +00:00
										 |  |  |     scheme, netloc, url, params, query, fragment, _coerce_result = ( | 
					
						
							|  |  |  |                                                   _coerce_args(*components)) | 
					
						
							| 
									
										
										
										
											2001-11-16 02:52:57 +00:00
										 |  |  |     if params: | 
					
						
							|  |  |  |         url = "%s;%s" % (url, params) | 
					
						
							| 
									
										
										
										
											2010-11-30 15:48:08 +00:00
										 |  |  |     return _coerce_result(urlunsplit((scheme, netloc, url, query, fragment))) | 
					
						
							| 
									
										
										
										
											2001-11-16 02:52:57 +00:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2007-05-15 18:46:22 +00:00
										 |  |  | def urlunsplit(components): | 
					
						
							| 
									
										
										
										
											2010-06-28 14:08:00 +00:00
										 |  |  |     """Combine the elements of a tuple as returned by urlsplit() into a
 | 
					
						
							|  |  |  |     complete URL as a string. The components argument can be any five-item iterable. | 
					
						
							|  |  |  |     This may result in a slightly different, but equivalent URL, if the URL that | 
					
						
							|  |  |  |     was parsed originally had unnecessary delimiters (for example, a ? with an | 
					
						
							|  |  |  |     empty query; the RFC states that these are equivalent)."""
 | 
					
						
							| 
									
										
										
										
											2010-11-30 15:48:08 +00:00
										 |  |  |     scheme, netloc, url, query, fragment, _coerce_result = ( | 
					
						
							|  |  |  |                                           _coerce_args(*components)) | 
					
						
							| 
									
										
										
										
											2002-10-14 19:59:54 +00:00
										 |  |  |     if netloc or (scheme and scheme in uses_netloc and url[:2] != '//'): | 
					
						
							| 
									
										
										
										
											2001-01-15 03:34:38 +00:00
										 |  |  |         if url and url[:1] != '/': | 
							|  |  |  |             url = '/' + url | 
					
						
							|  |  |  |         url = '//' + (netloc or '') + url | 
					
						
							|  |  |  |     if scheme: | 
					
						
							|  |  |  |         url = scheme + ':' + url | 
					
						
							|  |  |  |     if query: | 
					
						
							|  |  |  |         url = url + '?' + query | 
					
						
							|  |  |  |     if fragment: | 
					
						
							|  |  |  |         url = url + '#' + fragment | 
					
						
							| 
									
										
										
										
											2010-11-30 15:48:08 +00:00
										 |  |  |     return _coerce_result(url) | 
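# A small sketch of the "slightly different, but equivalent" behaviour noted
# in the docstring, with made-up URLs: an empty query loses its '?', and a
# relative path gains a leading '/' once a netloc is present.

```python
from urllib.parse import urlsplit, urlunsplit

# Round trip: the redundant trailing '?' is not reproduced.
rebuilt = urlunsplit(urlsplit("http://example.com/path?"))

# Any five-item iterable works; here a plain tuple.
built = urlunsplit(("http", "example.com", "path", "q=1", "top"))
```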
					
						
							| 
									
										
										
										
											1994-09-12 10:36:35 +00:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2006-04-21 10:40:58 +00:00
										 |  |  | def urljoin(base, url, allow_fragments=True): | 
					
						
							| 
									
										
										
										
											2001-01-15 03:34:38 +00:00
										 |  |  |     """Join a base URL and a possibly relative URL to form an absolute
 | 
					
						
							|  |  |  |     interpretation of the latter."""
 | 
					
						
							|  |  |  |     if not base: | 
					
						
							|  |  |  |         return url | 
					
						
							|  |  |  |     if not url: | 
					
						
							|  |  |  |         return base | 
					
						
							| 
									
										
										
										
											2014-08-21 19:16:17 -04:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2010-11-30 15:48:08 +00:00
										 |  |  |     base, url, _coerce_result = _coerce_args(base, url) | 
					
						
							| 
									
										
										
										
											2001-01-15 03:34:38 +00:00
										 |  |  |     bscheme, bnetloc, bpath, bparams, bquery, bfragment = \ | 
					
						
							|  |  |  |             urlparse(base, '', allow_fragments) | 
					
						
							|  |  |  |     scheme, netloc, path, params, query, fragment = \ | 
					
						
							|  |  |  |             urlparse(url, bscheme, allow_fragments) | 
					
						
							| 
									
										
										
										
											2014-08-21 19:16:17 -04:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2001-01-15 03:34:38 +00:00
										 |  |  |     if scheme != bscheme or scheme not in uses_relative: | 
					
						
							| 
									
										
										
										
											2010-11-30 15:48:08 +00:00
										 |  |  |         return _coerce_result(url) | 
					
						
							| 
									
										
										
										
											2001-01-15 03:34:38 +00:00
										 |  |  |     if scheme in uses_netloc: | 
					
						
							|  |  |  |         if netloc: | 
					
						
							| 
									
										
										
										
											2010-11-30 15:48:08 +00:00
										 |  |  |             return _coerce_result(urlunparse((scheme, netloc, path, | 
					
						
							|  |  |  |                                               params, query, fragment))) | 
					
						
							| 
									
										
										
										
											2001-01-15 03:34:38 +00:00
										 |  |  |         netloc = bnetloc | 
					
						
							| 
									
										
										
										
											2014-08-21 19:16:17 -04:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2010-12-17 04:48:45 +00:00
										 |  |  |     if not path and not params: | 
					
						
							| 
									
										
										
										
											2008-08-14 16:55:14 +00:00
										 |  |  |         path = bpath | 
					
						
							| 
									
										
										
										
											2010-12-17 04:48:45 +00:00
										 |  |  |         params = bparams | 
					
						
							| 
									
										
										
										
											2008-08-14 16:55:14 +00:00
										 |  |  |         if not query: | 
					
						
							|  |  |  |             query = bquery | 
					
						
							| 
									
										
										
										
											2010-11-30 15:48:08 +00:00
										 |  |  |         return _coerce_result(urlunparse((scheme, netloc, path, | 
					
						
							|  |  |  |                                           params, query, fragment))) | 
					
						
							| 
									
										
										
										
											2014-08-21 19:16:17 -04:00
										 |  |  | 
 | 
					
						
							|  |  |  |     base_parts = bpath.split('/') | 
					
						
							|  |  |  |     if base_parts[-1] != '': | 
					
						
							|  |  |  |         # the last item is not a directory, so will not be taken into account | 
					
						
							|  |  |  |         # in resolving the relative path | 
					
						
							|  |  |  |         del base_parts[-1] | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     # Per RFC 3986, ignore the base path entirely if the path starts at the root. | 
					
						
							|  |  |  |     if path[:1] == '/': | 
					
						
							|  |  |  |         segments = path.split('/') | 
					
						
							|  |  |  |     else: | 
					
						
							|  |  |  |         segments = base_parts + path.split('/') | 
					
						
							| 
									
										
										
										
											2014-09-22 15:49:16 +08:00
										 |  |  |         # filter out elements that would cause redundant slashes on re-joining | 
					
						
							|  |  |  |         # the resolved_path | 
					
						
							| 
									
										
										
										
											2015-04-16 02:31:14 +03:00
										 |  |  |         segments[1:-1] = filter(None, segments[1:-1]) | 
					
						
							| 
									
										
										
										
											2014-08-21 19:16:17 -04:00
										 |  |  | 
 | 
					
						
							|  |  |  |     resolved_path = [] | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     for seg in segments: | 
					
						
							|  |  |  |         if seg == '..': | 
					
						
							|  |  |  |             try: | 
					
						
							|  |  |  |                 resolved_path.pop() | 
					
						
							|  |  |  |             except IndexError: | 
					
						
							|  |  |  |                 # ignore any .. segments that would otherwise cause an IndexError | 
					
						
							|  |  |  |                 # when popped from resolved_path if resolving for rfc3986 | 
					
						
							|  |  |  |                 pass | 
					
						
							|  |  |  |         elif seg == '.': | 
					
						
							|  |  |  |             continue | 
					
						
							| 
									
										
										
										
											2001-01-15 03:34:38 +00:00
										 |  |  |         else: | 
					
						
							| 
									
										
										
										
											2014-08-21 19:16:17 -04:00
										 |  |  |             resolved_path.append(seg) | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     if segments[-1] in ('.', '..'): | 
					
						
							|  |  |  |         # Post-processing: if the last segment was '.' or '..', the result | 
					
						
							|  |  |  |         # names a directory, so append the trailing '/'. | 
					
						
							|  |  |  |         resolved_path.append('') | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     return _coerce_result(urlunparse((scheme, netloc, '/'.join( | 
					
						
							| 
									
										
										
										
											2014-09-22 15:49:16 +08:00
										 |  |  |         resolved_path) or '/', params, query, fragment))) | 
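# A few worked cases for the resolution steps above, using invented URLs. They
# cover dot-segment removal, a root-relative path, surplus '..' segments, a
# trailing '..', and a reference with a different scheme.

```python
from urllib.parse import urljoin

base = "http://example.com/a/b/c"

sibling = urljoin(base, "../d")             # 'c' is dropped, '..' removes 'b'
rooted = urljoin(base, "/x")                # leading '/' discards the base path
overshoot = urljoin(base, "../../../../x")  # extra '..' segments are ignored
parent_dir = urljoin(base, "..")            # trailing '..' keeps the final '/'
other = urljoin(base, "mailto:someone@example.com")  # scheme differs from base
```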
					
						
							| 
									
										
										
										
											2014-08-21 19:16:17 -04:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											1994-09-12 10:36:35 +00:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											1996-05-28 23:54:24 +00:00
										 |  |  | def urldefrag(url): | 
					
						
							| 
									
										
										
										
											2001-01-15 03:34:38 +00:00
										 |  |  |     """Removes any existing fragment from URL.
 | 
					
						
							| 
									
										
										
										
											1996-05-28 23:54:24 +00:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2001-01-15 03:34:38 +00:00
										 |  |  |     Returns a tuple of the defragmented URL and the fragment.  If | 
					
						
							|  |  |  |     the URL contained no fragment, the second element is the | 
					
						
							|  |  |  |     empty string. | 
					
						
							|  |  |  |     """
 | 
					
						
							| 
									
										
										
										
											2010-11-30 15:48:08 +00:00
										 |  |  |     url, _coerce_result = _coerce_args(url) | 
					
						
							| 
									
										
										
										
											2001-11-16 02:52:57 +00:00
										 |  |  |     if '#' in url: | 
					
						
							|  |  |  |         s, n, p, a, q, frag = urlparse(url) | 
					
						
							|  |  |  |         defrag = urlunparse((s, n, p, a, q, '')) | 
					
						
							|  |  |  |     else: | 
					
						
							| 
									
										
										
										
											2010-11-30 15:48:08 +00:00
										 |  |  |         frag = '' | 
					
						
							|  |  |  |         defrag = url | 
					
						
							|  |  |  |     return _coerce_result(DefragResult(defrag, frag)) | 
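A minimal usage sketch for urldefrag(), assuming the public urllib.parse import path; the example.com URLs are placeholders chosen for illustration. It shows both branches: a URL with a fragment and one without.

```python
from urllib.parse import urldefrag

# URL with a fragment: the fragment is split off and returned separately.
with_fragment = urldefrag("http://example.com/docs/page.html#section-2")

# URL without a fragment: returned unchanged, with an empty fragment.
without_fragment = urldefrag("http://example.com/docs/page.html")

print(with_fragment.url)       # http://example.com/docs/page.html
print(with_fragment.fragment)  # section-2
print(tuple(without_fragment))
```

The result is a named tuple, so it can be unpacked positionally or read through its url and fragment fields.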
					
						
_hexdig = '0123456789ABCDEFabcdef'
_hextobyte = None

def unquote_to_bytes(string):
    """unquote_to_bytes('abc%20def') -> b'abc def'."""
    # Note: str input is encoded as UTF-8. This matters only if the string
    # contains unescaped non-ASCII characters, which URIs should not.
    if not string:
        # Is it a string-like object?
        string.split
        return b''
    if isinstance(string, str):
        string = string.encode('utf-8')
    bits = string.split(b'%')
    if len(bits) == 1:
        return string
    res = [bits[0]]
    append = res.append
    # Delay the initialization of the table to not waste memory
    # if the function is never called
    global _hextobyte
    if _hextobyte is None:
        _hextobyte = {(a + b).encode(): bytes.fromhex(a + b)
                      for a in _hexdig for b in _hexdig}
    for item in bits[1:]:
        try:
            append(_hextobyte[item[:2]])
            append(item[2:])
        except KeyError:
            # Not a valid two-digit hex escape: keep the text unchanged.
            append(b'%')
            append(item)
    return b''.join(res)
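A short sketch of how unquote_to_bytes() treats valid and malformed escapes, assuming the public urllib.parse import path; the input strings are made up for illustration. Malformed escapes are left untouched, which corresponds to the KeyError branch above.

```python
from urllib.parse import unquote_to_bytes

# A valid escape is replaced by the byte it encodes.
decoded = unquote_to_bytes("abc%20def")

# '%zz' is not a valid hex escape, so it passes through unchanged.
malformed = unquote_to_bytes("100%zz")

# Escapes above 0x7F come back as raw bytes; no text decoding happens here.
non_ascii = unquote_to_bytes("caf%C3%A9")

print(decoded)    # b'abc def'
print(malformed)  # b'100%zz'
print(non_ascii)  # b'caf\xc3\xa9'
```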
					
						
_asciire = re.compile('([\x00-\x7f]+)')

def unquote(string, encoding='utf-8', errors='replace'):
    """Replace %xx escapes by their single-character equivalent. The optional
    encoding and errors parameters specify how to decode percent-encoded
    sequences into Unicode characters, as accepted by the bytes.decode()
    method.
    By default, percent-encoded sequences are decoded with UTF-8, and invalid
    sequences are replaced by a placeholder character.

    unquote('abc%20def') -> 'abc def'.
    """
    if isinstance(string, bytes):
        return unquote_to_bytes(string).decode(encoding, errors)
    if '%' not in string:
        # Is it a string-like object?
        string.split
        return string
    if encoding is None:
        encoding = 'utf-8'
    if errors is None:
        errors = 'replace'
    bits = _asciire.split(string)
    res = [bits[0]]
    append = res.append
    for i in range(1, len(bits), 2):
        append(unquote_to_bytes(bits[i]).decode(encoding, errors))
        append(bits[i + 1])
    return ''.join(res)
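A small sketch of the encoding and errors parameters of unquote(), assuming the public urllib.parse import path; the sample strings are illustrative. The same escape decodes differently depending on the codec, and with the defaults an invalid UTF-8 sequence becomes U+FFFD.

```python
from urllib.parse import unquote

# UTF-8 (the default) decodes the two-byte sequence into one character.
utf8 = unquote("caf%C3%A9")

# The single byte 0xE9 is valid Latin-1 but not valid UTF-8.
latin1 = unquote("caf%E9", encoding="latin-1")

# With the defaults, the invalid byte is replaced by U+FFFD.
replaced = unquote("caf%E9")

print(utf8)            # café
print(latin1)          # café
print(repr(replaced))  # 'caf\ufffd'
```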
					
						
def parse_qs(qs, keep_blank_values=False, strict_parsing=False,
             encoding='utf-8', errors='replace', max_num_fields=None, separator='&'):
    """Parse a query given as a string argument.

        Arguments:

        qs: percent-encoded query string to be parsed

        keep_blank_values: flag indicating whether blank values in
            percent-encoded queries should be treated as blank strings.
            A true value indicates that blanks should be retained as
            blank strings.  The default false value indicates that
            blank values are to be ignored and treated as if they were
            not included.

        strict_parsing: flag indicating what to do with parsing errors.
            If false (the default), errors are silently ignored.
            If true, errors raise a ValueError exception.

        encoding and errors: specify how to decode percent-encoded sequences
            into Unicode characters, as accepted by the bytes.decode() method.

        max_num_fields: int. If set, a ValueError is raised if more than
            max_num_fields fields are read by parse_qsl().

        separator: str. The symbol to use for separating the query arguments.
            Defaults to &.

        Returns a dictionary mapping each field name to a list of its values.
    """
    parsed_result = {}
    pairs = parse_qsl(qs, keep_blank_values, strict_parsing,
                      encoding=encoding, errors=errors,
                      max_num_fields=max_num_fields, separator=separator)
    for name, value in pairs:
        if name in parsed_result:
            parsed_result[name].append(value)
        else:
            parsed_result[name] = [value]
    return parsed_result
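A brief sketch of parse_qs() on a query with a repeated field and a blank value, assuming the public urllib.parse import path; the query string is invented for illustration. Repeated names accumulate into one list, and keep_blank_values decides whether the empty field survives.

```python
from urllib.parse import parse_qs

query = "tag=python&tag=urls&empty=&name=Ada+Lovelace"

# Default: the blank 'empty' field is dropped.
default_result = parse_qs(query)

# keep_blank_values=True retains it as an empty string.
with_blanks = parse_qs(query, keep_blank_values=True)

print(default_result)  # {'tag': ['python', 'urls'], 'name': ['Ada Lovelace']}
print(with_blanks)
```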
					
						
def parse_qsl(qs, keep_blank_values=False, strict_parsing=False,
              encoding='utf-8', errors='replace', max_num_fields=None, separator='&'):
    """Parse a query given as a string argument.

        Arguments:

        qs: percent-encoded query string to be parsed

        keep_blank_values: flag indicating whether blank values in
            percent-encoded queries should be treated as blank strings.
            A true value indicates that blanks should be retained as blank
            strings.  The default false value indicates that blank values
            are to be ignored and treated as if they were not included.

        strict_parsing: flag indicating what to do with parsing errors. If
            false (the default), errors are silently ignored. If true,
            errors raise a ValueError exception.

        encoding and errors: specify how to decode percent-encoded sequences
            into Unicode characters, as accepted by the bytes.decode() method.

        max_num_fields: int. If set, a ValueError is raised if more than
            max_num_fields fields are read by parse_qsl().

        separator: str. The symbol to use for separating the query arguments.
            Defaults to &.

        Returns a list of (name, value) pairs.
    """
    qs, _coerce_result = _coerce_args(qs)
    separator, _ = _coerce_args(separator)

    if not separator or (not isinstance(separator, (str, bytes))):
        raise ValueError("Separator must be of type string or bytes.")

    # If max_num_fields is defined then check that the number of fields
    # does not exceed max_num_fields. This prevents a memory exhaustion DoS
    # attack via post bodies with many fields.
    if max_num_fields is not None:
        num_fields = 1 + qs.count(separator) if qs else 0
        if max_num_fields < num_fields:
            raise ValueError('Max number of fields exceeded')

    r = []
    query_args = qs.split(separator) if qs else []
    for name_value in query_args:
        if not name_value and not strict_parsing:
            continue
        nv = name_value.split('=', 1)
        if len(nv) != 2:
            if strict_parsing:
                raise ValueError("bad query field: %r" % (name_value,))
            # Handle case of a control-name with no equal sign
            if keep_blank_values:
                nv.append('')
            else:
                continue
        if len(nv[1]) or keep_blank_values:
            name = nv[0].replace('+', ' ')
            name = unquote(name, encoding=encoding, errors=errors)
            name = _coerce_result(name)
            value = nv[1].replace('+', ' ')
            value = unquote(value, encoding=encoding, errors=errors)
            value = _coerce_result(value)
            r.append((name, value))
    return r
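A sketch of the separator, max_num_fields and strict_parsing arguments of parse_qsl(), assuming the public urllib.parse import path and a Python version that has the separator parameter shown in this source; the query strings are illustrative. Order and duplicates are preserved, unlike in parse_qs().

```python
from urllib.parse import parse_qsl

# A custom separator; pairs come back in order, duplicates included.
pairs = parse_qsl("a=1;b=2;a=3", separator=";")

# Three fields against a limit of two: ValueError.
try:
    parse_qsl("a=1&b=2&c=3", max_num_fields=2)
    limit_exceeded = False
except ValueError:
    limit_exceeded = True

# A field without '=' is skipped by default but rejected in strict mode.
lenient = parse_qsl("a=1&novalue")
try:
    parse_qsl("a=1&novalue", strict_parsing=True)
    strict_rejected = False
except ValueError:
    strict_rejected = True

print(pairs)    # [('a', '1'), ('b', '2'), ('a', '3')]
print(lenient)  # [('a', '1')]
print(limit_exceeded, strict_rejected)  # True True
```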
					
						
def unquote_plus(string, encoding='utf-8', errors='replace'):
    """Like unquote(), but also replace plus signs by spaces, as required for
    unquoting HTML form values.

    unquote_plus('%7e/abc+def') -> '~/abc def'
    """
    string = string.replace('+', ' ')
    return unquote(string, encoding, errors)
					
						
_ALWAYS_SAFE = frozenset(b'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
                         b'abcdefghijklmnopqrstuvwxyz'
                         b'0123456789'
                         b'_.-~')
_ALWAYS_SAFE_BYTES = bytes(_ALWAYS_SAFE)
					
						
def __getattr__(name):
    if name == 'Quoter':
        warnings.warn('Deprecated in 3.11. '
                      'urllib.parse.Quoter will be removed in Python 3.14. '
                      'It was not intended to be a public API.',
                      DeprecationWarning, stacklevel=2)
        return _Quoter
    raise AttributeError(f'module {__name__!r} has no attribute {name!r}')
					
						
class _Quoter(dict):
    """A mapping from byte values (in range(0, 256)) to strings.

    String values are percent-encoded byte values, unless the key is < 128
    and is in either the specified safe set or the always-safe set.
    """
    # Keeps a cache internally, via __missing__, for efficiency (lookups
    # of cached keys don't call Python code at all).
    def __init__(self, safe):
        """safe: bytes object."""
        self.safe = _ALWAYS_SAFE.union(safe)

    def __repr__(self):
        return f"<Quoter {dict(self)!r}>"

    def __missing__(self, b):
        # Handle a cache miss. Store quoted string in cache and return.
        res = chr(b) if b in self.safe else '%{:02X}'.format(b)
        self[b] = res
        return res
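The caching idea used by the class above can be sketched in isolation. PercentCache below is a hypothetical stand-in written for illustration, not part of urllib.parse; its miss counter is an added assumption to make the caching visible. Each byte value reaches __missing__ at most once, after which the plain dict lookup answers.

```python
class PercentCache(dict):
    """Hypothetical illustration of a dict that fills itself on cache misses."""

    def __init__(self, safe: bytes):
        self.safe = frozenset(safe)  # byte values left unescaped
        self.misses = 0              # counts calls to __missing__

    def __missing__(self, byte: int) -> str:
        # Called only when the key is absent; store the result for next time.
        self.misses += 1
        res = chr(byte) if byte in self.safe else '%{:02X}'.format(byte)
        self[byte] = res
        return res


cache = PercentCache(b"ab/")
quoted = ''.join(cache[b] for b in b"a b/a b")

print(quoted)        # a%20b/a%20b
print(cache.misses)  # 4
```

Seven bytes are looked up, but only the four distinct values ('a', ' ', 'b', '/') trigger __missing__.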
					
						
def quote(string, safe='/', encoding=None, errors=None):
    """quote('abc def') -> 'abc%20def'

    Each part of a URL, e.g. the path info, the query, etc., has a
    different set of reserved characters that must be quoted. The
    quote function offers a cautious (not minimal) way to quote a
    string for most of these parts.

    RFC 3986 Uniform Resource Identifier (URI): Generic Syntax lists
    the following (un)reserved characters.

    unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
    reserved      = gen-delims / sub-delims
    gen-delims    = ":" / "/" / "?" / "#" / "[" / "]" / "@"
    sub-delims    = "!" / "$" / "&" / "'" / "(" / ")"
                  / "*" / "+" / "," / ";" / "="

    Each of the reserved characters is reserved in some component of a URL,
    but not necessarily in all of them.

    The quote function %-escapes all characters that are neither in the
    unreserved chars ("always safe") nor the additional chars set via the
    safe arg.

    The default for the safe arg is '/'. The slash is reserved, but in
    typical usage the quote function is being called on a path where the
    existing slash characters are to be preserved.

    Python 3.7 updates from using RFC 2396 to RFC 3986 to quote URL strings.
    Now, "~" is included in the set of unreserved characters.

    string and safe may be either str or bytes objects. encoding and errors
    must not be specified if string is a bytes object.

    The optional encoding and errors parameters specify how to deal with
    non-ASCII characters, as accepted by the str.encode method.
    By default, encoding='utf-8' (characters are encoded with UTF-8), and
    errors='strict' (unsupported characters raise a UnicodeEncodeError).
    """
    if isinstance(string, str):
        if not string:
            return string
        if encoding is None:
            encoding = 'utf-8'
        if errors is None:
            errors = 'strict'
        string = string.encode(encoding, errors)
    else:
        if encoding is not None:
            raise TypeError("quote() doesn't support 'encoding' for bytes")
        if errors is not None:
            raise TypeError("quote() doesn't support 'errors' for bytes")
    return quote_from_bytes(string, safe)
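A short sketch of the safe and encoding arguments of quote(), assuming the public urllib.parse import path and Python 3.7 or later (so that "~" is unreserved); the paths and words are illustrative.

```python
from urllib.parse import quote

# Default safe='/': slashes and '~' survive, the space is escaped.
path = quote("/docs/my file~1.txt")

# safe='' escapes the slashes too, e.g. for a single path segment.
segment = quote("/docs/my file.txt", safe="")

# Non-ASCII text is encoded first; the codec changes the escapes.
utf8 = quote("na\u00efve")
latin1 = quote("na\u00efve", encoding="latin-1")

# encoding may not be combined with bytes input.
try:
    quote(b"abc", encoding="utf-8")
    bytes_rejected = False
except TypeError:
    bytes_rejected = True

print(path)     # /docs/my%20file~1.txt
print(segment)  # %2Fdocs%2Fmy%20file.txt
print(utf8)     # na%C3%AFve
print(latin1)   # na%EFve
```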
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | def quote_plus(string, safe='', encoding=None, errors=None): | 
					
						
							|  |  |  |     """Like quote(), but also replace ' ' with '+', as required for quoting
 | 
					
						
							|  |  |  |     HTML form values. Plus signs in the original string are escaped unless | 
					
						
							|  |  |  |     they are included in safe. It also does not have safe default to '/'. | 
					
						
							| 
									
										
										
										
											2008-08-06 19:31:34 +00:00
										 |  |  |     """
 | 
					
						
							| 
									
										
										
										
											2009-03-26 16:55:08 +00:00
										 |  |  |     # Check if ' ' in string, where string may either be a str or bytes.  If | 
					
						
							|  |  |  |     # there are no spaces, the regular quote will produce the right answer. | 
					
						
							|  |  |  |     if ((isinstance(string, str) and ' ' not in string) or | 
					
						
							|  |  |  |         (isinstance(string, bytes) and b' ' not in string)): | 
					
						
							|  |  |  |         return quote(string, safe, encoding, errors) | 
					
						
							|  |  |  |     if isinstance(safe, str): | 
					
						
							|  |  |  |         space = ' ' | 
					
						
							|  |  |  |     else: | 
					
						
							|  |  |  |         space = b' ' | 
					
						
							| 
									
										
										
										
											2009-05-26 18:31:11 +00:00
										 |  |  |     string = quote(string, safe + space, encoding, errors) | 
					
						
							| 
									
										
										
										
											2009-03-26 16:55:08 +00:00
										 |  |  |     return string.replace(' ', '+') | 
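The difference between quote() and quote_plus() described in the docstring above can be seen with a short sketch. The sample strings are illustrative assumptions, not taken from the module or its tests.

```python
from urllib.parse import quote, quote_plus

text = "a b/c+d"  # illustrative input, not from the module

# quote(): a space becomes %20 and '/' is kept, because safe defaults to '/'.
path_style = quote(text)

# quote_plus(): a space becomes '+'; '/' and the literal '+' are escaped,
# because safe defaults to the empty string.
form_style = quote_plus(text)

# Naming '+' in safe keeps literal plus signs untouched.
plus_kept = quote_plus("a b+c", safe="+")

print(path_style)  # a%20b/c%2Bd
print(form_style)  # a+b%2Fc%2Bd
print(plus_kept)   # a+b+c
```

Note that with safe="+" a literal plus and an encoded space both appear as '+', so the result can no longer be decoded unambiguously.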
					
						
							| 
									
										
										
										
											2008-08-18 21:44:30 +00:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2021-05-11 17:01:44 -07:00
										 |  |  | # Expectation: A typical program is unlikely to create more than 5 of these. | 
					
						
							|  |  |  | @functools.lru_cache | 
					
						
							|  |  |  | def _byte_quoter_factory(safe): | 
					
						
							|  |  |  |     return _Quoter(safe).__getitem__ | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2008-08-18 21:44:30 +00:00
										 |  |  | def quote_from_bytes(bs, safe='/'): | 
					
						
							|  |  |  |     """Like quote(), but accepts a bytes object rather than a str, and does
 | 
					
						
							|  |  |  |     not perform string-to-bytes encoding.  It always returns an ASCII string. | 
					
						
							| 
									
										
										
										
											2012-05-26 09:53:32 +08:00
										 |  |  |     quote_from_bytes(b'abc def\x3f') -> 'abc%20def%3F' | 
					
						
							| 
									
										
										
										
											2008-08-18 21:44:30 +00:00
										 |  |  |     """
 | 
					
						
							| 
									
										
										
										
											2010-05-17 17:33:07 +00:00
										 |  |  |     if not isinstance(bs, (bytes, bytearray)): | 
					
						
							|  |  |  |         raise TypeError("quote_from_bytes() expected bytes") | 
					
						
							|  |  |  |     if not bs: | 
					
						
							|  |  |  |         return '' | 
					
						
							| 
									
										
										
										
											2008-08-18 21:44:30 +00:00
										 |  |  |     if isinstance(safe, str): | 
					
						
							|  |  |  |         # Normalize 'safe' by converting to bytes and removing non-ASCII chars | 
					
						
							|  |  |  |         safe = safe.encode('ascii', 'ignore') | 
					
						
							| 
									
										
										
										
											2010-05-17 17:33:07 +00:00
										 |  |  |     else: | 
					
						
							| 
									
										
										
										
											2021-05-11 17:01:44 -07:00
										 |  |  |         # List comprehensions are faster than generator expressions. | 
					
						
							| 
									
										
										
										
											2010-05-17 17:33:07 +00:00
										 |  |  |         safe = bytes([c for c in safe if c < 128]) | 
					
						
							|  |  |  |     if not bs.rstrip(_ALWAYS_SAFE_BYTES + safe): | 
					
						
							|  |  |  |         return bs.decode() | 
					
						
							| 
									
										
										
										
											2021-05-11 17:01:44 -07:00
										 |  |  |     quoter = _byte_quoter_factory(safe) | 
					
						
							| 
									
										
										
										
											2022-09-19 16:06:25 -07:00
										 |  |  |     if (bs_len := len(bs)) < 200_000: | 
					
						
							|  |  |  |         return ''.join(map(quoter, bs)) | 
					
						
							|  |  |  |     else: | 
					
						
							|  |  |  |         # This saves memory - https://github.com/python/cpython/issues/95865 | 
					
						
							|  |  |  |         chunk_size = math.isqrt(bs_len) | 
					
						
							|  |  |  |         chunks = [''.join(map(quoter, bs[i:i+chunk_size])) | 
					
						
							|  |  |  |                   for i in range(0, bs_len, chunk_size)] | 
					
						
							|  |  |  |         return ''.join(chunks) | 
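A minimal usage sketch for quote_from_bytes(), assuming only the behavior visible in the function above: bytes and bytearray are accepted, str is rejected, and escapes are written as uppercase hex. The inputs are made up for illustration.

```python
from urllib.parse import quote_from_bytes

# Bytes outside the always-safe set are percent-encoded with uppercase hex.
quoted = quote_from_bytes(b"abc def\x3f")

# Passing safe="" also escapes '/', which is safe by default.
no_slash = quote_from_bytes(b"a/b c", safe="")

# A bytearray is accepted as well as bytes.
from_bytearray = quote_from_bytes(bytearray(b"x y"))

# A str argument is rejected; encode it first or use quote() instead.
try:
    quote_from_bytes("abc def")
    rejected_str = False
except TypeError:
    rejected_str = True

print(quoted)          # abc%20def%3F
print(no_slash)        # a%2Fb%20c
print(from_bytearray)  # x%20y
print(rejected_str)    # True
```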
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2015-05-17 20:44:50 -04:00
										 |  |  | def urlencode(query, doseq=False, safe='', encoding=None, errors=None, | 
					
						
							|  |  |  |               quote_via=quote_plus): | 
					
						
							| 
									
										
										
										
											2013-09-05 21:42:38 -07:00
										 |  |  |     """Encode a dict or sequence of two-element tuples into a URL query string.
 | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  | 
 | 
					
						
							|  |  |  |     If any values in the query arg are sequences and doseq is true, each | 
					
						
							|  |  |  |     sequence element is converted to a separate parameter. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     If the query arg is a sequence of two-element tuples, the order of the | 
					
						
							|  |  |  |     parameters in the output will match the order of parameters in the | 
					
						
							|  |  |  |     input. | 
					
						
							| 
									
										
										
										
											2010-07-03 17:48:22 +00:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2013-09-05 21:42:38 -07:00
										 |  |  |     The components of a query arg may each be either a string or a bytes type. | 
					
						
							| 
									
										
										
										
											2014-12-24 21:23:18 -05:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2015-05-17 20:44:50 -04:00
										 |  |  |     The safe, encoding, and errors parameters are passed down to the function | 
					
						
							|  |  |  |     specified by quote_via (encoding and errors only if a component is a str). | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  |     """
 | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2009-03-26 14:49:26 +00:00
										 |  |  |     if hasattr(query, "items"): | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  |         query = query.items() | 
					
						
							|  |  |  |     else: | 
					
						
							| 
									
										
										
										
											2009-03-26 16:56:59 +00:00
										 |  |  |         # It's a bother at times that strings and string-like objects are | 
					
						
							|  |  |  |         # sequences. | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  |         try: | 
					
						
							|  |  |  |             # non-sequence items should not work with len() | 
					
						
							|  |  |  |             # non-empty strings will fail this | 
					
						
							|  |  |  |             if len(query) and not isinstance(query[0], tuple): | 
					
						
							|  |  |  |                 raise TypeError | 
					
						
							| 
									
										
										
										
											2009-03-26 16:56:59 +00:00
										 |  |  |             # Zero-length sequences of all types will get here and succeed, | 
					
						
							|  |  |  |             # but that's a minor nit.  Since the original implementation | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  |             # allowed empty dicts that type of behavior probably should be | 
					
						
							|  |  |  |             # preserved for consistency | 
					
						
							| 
									
										
										
										
											2022-03-30 15:28:20 +03:00
										 |  |  |         except TypeError as err: | 
					
						
							| 
									
										
										
										
											2009-03-26 14:49:26 +00:00
										 |  |  |             raise TypeError("not a valid non-string sequence " | 
					
						
							| 
									
										
										
										
											2022-03-30 15:28:20 +03:00
										 |  |  |                             "or mapping object") from err | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  | 
 | 
					
						
							|  |  |  |     l = [] | 
					
						
							|  |  |  |     if not doseq: | 
					
						
							|  |  |  |         for k, v in query: | 
					
						
							| 
									
										
										
										
											2010-07-03 17:48:22 +00:00
										 |  |  |             if isinstance(k, bytes): | 
					
						
							| 
									
										
										
										
											2015-05-17 20:44:50 -04:00
										 |  |  |                 k = quote_via(k, safe) | 
					
						
							| 
									
										
										
										
											2010-07-03 17:48:22 +00:00
										 |  |  |             else: | 
					
						
							| 
									
										
										
										
											2015-05-17 20:44:50 -04:00
										 |  |  |                 k = quote_via(str(k), safe, encoding, errors) | 
					
						
							| 
									
										
										
										
											2010-07-03 17:48:22 +00:00
										 |  |  | 
 | 
					
						
							|  |  |  |             if isinstance(v, bytes): | 
					
						
							| 
									
										
										
										
											2015-05-17 20:44:50 -04:00
										 |  |  |                 v = quote_via(v, safe) | 
					
						
							| 
									
										
										
										
											2010-07-03 17:48:22 +00:00
										 |  |  |             else: | 
					
						
							| 
									
										
										
										
											2015-05-17 20:44:50 -04:00
										 |  |  |                 v = quote_via(str(v), safe, encoding, errors) | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  |             l.append(k + '=' + v) | 
					
						
							|  |  |  |     else: | 
					
						
							|  |  |  |         for k, v in query: | 
					
						
							| 
									
										
										
										
											2010-07-03 17:48:22 +00:00
										 |  |  |             if isinstance(k, bytes): | 
					
						
							| 
									
										
										
										
											2015-05-17 20:44:50 -04:00
										 |  |  |                 k = quote_via(k, safe) | 
					
						
							| 
									
										
										
										
											2010-07-03 17:48:22 +00:00
										 |  |  |             else: | 
					
						
							| 
									
										
										
										
											2015-05-17 20:44:50 -04:00
										 |  |  |                 k = quote_via(str(k), safe, encoding, errors) | 
					
						
							| 
									
										
										
										
											2010-07-03 17:48:22 +00:00
										 |  |  | 
 | 
					
						
							|  |  |  |             if isinstance(v, bytes): | 
					
						
							| 
									
										
										
										
											2015-05-17 20:44:50 -04:00
										 |  |  |                 v = quote_via(v, safe) | 
					
						
							| 
									
										
										
										
											2010-07-03 17:48:22 +00:00
										 |  |  |                 l.append(k + '=' + v) | 
					
						
							|  |  |  |             elif isinstance(v, str): | 
					
						
							| 
									
										
										
										
											2015-05-17 20:44:50 -04:00
										 |  |  |                 v = quote_via(v, safe, encoding, errors) | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  |                 l.append(k + '=' + v) | 
					
						
							|  |  |  |             else: | 
					
						
							|  |  |  |                 try: | 
					
						
							| 
									
										
										
										
											2009-03-26 16:56:59 +00:00
										 |  |  |                     # Is this a sufficient test for sequence-ness? | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  |                     x = len(v) | 
					
						
							|  |  |  |                 except TypeError: | 
					
						
							|  |  |  |                     # not a sequence | 
					
						
							| 
									
										
										
										
											2015-05-17 20:44:50 -04:00
										 |  |  |                     v = quote_via(str(v), safe, encoding, errors) | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  |                     l.append(k + '=' + v) | 
					
						
							|  |  |  |                 else: | 
					
						
							|  |  |  |                     # loop over the sequence | 
					
						
							|  |  |  |                     for elt in v: | 
					
						
							| 
									
										
										
										
											2010-07-03 17:48:22 +00:00
										 |  |  |                         if isinstance(elt, bytes): | 
					
						
							| 
									
										
										
										
											2015-05-17 20:44:50 -04:00
										 |  |  |                             elt = quote_via(elt, safe) | 
					
						
							| 
									
										
										
										
											2010-07-03 17:48:22 +00:00
										 |  |  |                         else: | 
					
						
							| 
									
										
										
										
											2015-05-17 20:44:50 -04:00
										 |  |  |                             elt = quote_via(str(elt), safe, encoding, errors) | 
					
						
							| 
									
										
										
										
											2010-07-03 17:48:22 +00:00
										 |  |  |                         l.append(k + '=' + elt) | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  |     return '&'.join(l) | 
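The effect of doseq and quote_via in urlencode() is easier to see side by side. This is a small sketch with made-up parameter names and values; it relies only on the behavior documented in the docstring above.

```python
from urllib.parse import quote, urlencode

# Illustrative query: one plain value and one list value.
pairs = [("q", "red shoes"), ("tag", ["a", "b"])]

# doseq=False (default): the list is passed through str() and quoted whole.
flat = urlencode(pairs)

# doseq=True: each list element becomes its own key=value parameter.
expanded = urlencode(pairs, doseq=True)

# quote_via=quote encodes spaces as %20 instead of '+'.
percent_spaces = urlencode({"q": "red shoes"}, quote_via=quote)

# bytes components are quoted directly, without string encoding.
raw_bytes = urlencode({"k": b"\xff"})

# A plain string is not a valid query argument.
try:
    urlencode("q=red shoes")
    rejected_str = False
except TypeError:
    rejected_str = True

print(flat)            # q=red+shoes&tag=%5B%27a%27%2C+%27b%27%5D
print(expanded)        # q=red+shoes&tag=a&tag=b
print(percent_spaces)  # q=red%20shoes
print(raw_bytes)       # k=%FF
```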
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2018-04-25 16:51:54 -07:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2008-07-01 19:56:00 +00:00
										 |  |  | def to_bytes(url): | 
					
						
							| 
									
										
										
										
											2018-04-25 16:51:54 -07:00
										 |  |  |     warnings.warn("urllib.parse.to_bytes() is deprecated as of 3.8", | 
					
						
							|  |  |  |                   DeprecationWarning, stacklevel=2) | 
					
						
							|  |  |  |     return _to_bytes(url) | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | def _to_bytes(url): | 
					
						
							| 
									
										
										
										
											2008-07-01 19:56:00 +00:00
										 |  |  |     """Return url unchanged after checking that a str URL is ASCII-only.""" | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  |     # Most URL schemes require ASCII. If that changes, the conversion | 
					
						
							|  |  |  |     # can be relaxed. | 
					
						
							| 
									
										
										
										
											2008-07-01 19:56:00 +00:00
										 |  |  |     # XXX get rid of to_bytes() | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  |     if isinstance(url, str): | 
					
						
							|  |  |  |         try: | 
					
						
							|  |  |  |             url = url.encode("ASCII").decode() | 
					
						
							|  |  |  |         except UnicodeError: | 
					
						
							|  |  |  |             raise UnicodeError("URL " + repr(url) + | 
					
						
							|  |  |  |                                " contains non-ASCII characters") | 
					
						
							|  |  |  |     return url | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2018-04-25 16:51:54 -07:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  | def unwrap(url): | 
					
						
							| 
									
										
										
										
											2019-05-27 15:43:45 +02:00
										 |  |  |     """Transform a string like '<URL:scheme://host/path>' into 'scheme://host/path'.
 | 
					
						
							| 
									
										
										
										
											2018-04-25 16:51:54 -07:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2019-05-27 15:43:45 +02:00
										 |  |  |     The string is returned unchanged if it's not a wrapped URL. | 
					
						
							|  |  |  |     """
 | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  |     url = str(url).strip() | 
					
						
							|  |  |  |     if url[:1] == '<' and url[-1:] == '>': | 
					
						
							|  |  |  |         url = url[1:-1].strip() | 
					
						
							| 
									
										
										
										
											2019-05-27 15:43:45 +02:00
										 |  |  |     if url[:4] == 'URL:': | 
					
						
							|  |  |  |         url = url[4:].strip() | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  |     return url | 
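A brief sketch of what unwrap() does with wrapped and unwrapped input, following the three steps in the function above (strip whitespace, drop the angle brackets, drop a leading 'URL:'). The example URLs are placeholders.

```python
from urllib.parse import unwrap

# Angle brackets and the 'URL:' prefix are both removed.
wrapped = unwrap("<URL:scheme://host/path>")

# Surrounding whitespace is stripped before the brackets are checked.
padded = unwrap("  <http://example.com/>  ")

# A string that is not wrapped comes back unchanged.
plain = unwrap("http://example.com/")

print(wrapped)  # scheme://host/path
print(padded)   # http://example.com/
print(plain)    # http://example.com/
```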
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2018-04-25 16:51:54 -07:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  | def splittype(url): | 
					
						
							| 
									
										
										
										
											2018-04-25 16:51:54 -07:00
										 |  |  |     warnings.warn("urllib.parse.splittype() is deprecated as of 3.8, " | 
					
						
							|  |  |  |                   "use urllib.parse.urlparse() instead", | 
					
						
							|  |  |  |                   DeprecationWarning, stacklevel=2) | 
					
						
							|  |  |  |     return _splittype(url) | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | _typeprog = None | 
					
						
							|  |  |  | def _splittype(url): | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  |     """splittype('type:opaquestring') --> 'type', 'opaquestring'.""" | 
					
						
							|  |  |  |     global _typeprog | 
					
						
							|  |  |  |     if _typeprog is None: | 
					
						
							| 
									
										
										
										
											2015-03-03 20:21:35 +02:00
										 |  |  |         _typeprog = re.compile('([^/:]+):(.*)', re.DOTALL) | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  | 
 | 
					
						
							|  |  |  |     match = _typeprog.match(url) | 
					
						
							|  |  |  |     if match: | 
					
						
							| 
									
										
										
										
											2015-03-03 20:21:35 +02:00
										 |  |  |         scheme, data = match.groups() | 
					
						
							|  |  |  |         return scheme.lower(), data | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  |     return None, url | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2018-04-25 16:51:54 -07:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  | def splithost(url): | 
					
						
							| 
									
										
										
										
											2018-04-25 16:51:54 -07:00
										 |  |  |     warnings.warn("urllib.parse.splithost() is deprecated as of 3.8, " | 
					
						
							|  |  |  |                   "use urllib.parse.urlparse() instead", | 
					
						
							|  |  |  |                   DeprecationWarning, stacklevel=2) | 
					
						
							|  |  |  |     return _splithost(url) | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | _hostprog = None | 
					
						
							|  |  |  | def _splithost(url): | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  |     """splithost('//host[:port]/path') --> 'host[:port]', '/path'.""" | 
					
						
							|  |  |  |     global _hostprog | 
					
						
							|  |  |  |     if _hostprog is None: | 
					
						
							| 
									
										
										
										
											2017-06-20 06:02:44 -07:00
										 |  |  |         _hostprog = re.compile('//([^/#?]*)(.*)', re.DOTALL) | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  | 
 | 
					
						
							|  |  |  |     match = _hostprog.match(url) | 
					
						
							| 
									
										
										
										
											2010-11-22 04:48:26 +00:00
										 |  |  |     if match: | 
					
						
							| 
									
										
										
										
											2015-03-03 20:21:35 +02:00
										 |  |  |         host_port, path = match.groups() | 
					
						
							|  |  |  |         if path and path[0] != '/': | 
					
						
							| 
									
										
										
										
											2010-11-22 04:48:26 +00:00
										 |  |  |             path = '/' + path | 
					
						
							|  |  |  |         return host_port, path | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  |     return None, url | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2018-04-25 16:51:54 -07:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  | def splituser(host): | 
					
						
							| 
									
										
										
										
											2018-04-25 16:51:54 -07:00
										 |  |  |     warnings.warn("urllib.parse.splituser() is deprecated as of 3.8, " | 
					
						
							|  |  |  |                   "use urllib.parse.urlparse() instead", | 
					
						
							|  |  |  |                   DeprecationWarning, stacklevel=2) | 
					
						
							|  |  |  |     return _splituser(host) | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | def _splituser(host): | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  |     """splituser('user[:passwd]@host[:port]') --> 'user[:passwd]', 'host[:port]'.""" | 
					
						
							| 
									
										
										
										
											2015-03-03 20:21:35 +02:00
										 |  |  |     user, delim, host = host.rpartition('@') | 
					
						
							|  |  |  |     return (user if delim else None), host | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2018-04-25 16:51:54 -07:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  | def splitpasswd(user): | 
					
						
							| 
									
										
										
										
											2018-04-25 16:51:54 -07:00
										 |  |  |     warnings.warn("urllib.parse.splitpasswd() is deprecated as of 3.8, " | 
					
						
							|  |  |  |                   "use urllib.parse.urlparse() instead", | 
					
						
							|  |  |  |                   DeprecationWarning, stacklevel=2) | 
					
						
							|  |  |  |     return _splitpasswd(user) | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | def _splitpasswd(user): | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  |     """splitpasswd('user:passwd') -> 'user', 'passwd'.""" | 
					
						
							| 
									
										
										
										
											2015-03-03 20:21:35 +02:00
										 |  |  |     user, delim, passwd = user.partition(':') | 
					
						
							|  |  |  |     return user, (passwd if delim else None) | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2018-04-25 16:51:54 -07:00
										 |  |  | 
 | 
					
						
							|  |  |  | def splitport(host): | 
					
						
							|  |  |  |     warnings.warn("urllib.parse.splitport() is deprecated as of 3.8, " | 
					
						
							|  |  |  |                   "use urllib.parse.urlparse() instead", | 
					
						
							|  |  |  |                   DeprecationWarning, stacklevel=2) | 
					
						
							|  |  |  |     return _splitport(host) | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  | # Lazily compiled regex for _splitport(); built on first use. | 
					
						
							|  |  |  | _portprog = None | 
					
						
							| 
									
										
										
										
											2018-04-25 16:51:54 -07:00
										 |  |  | def _splitport(host): | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  |     """splitport('host:port') --> 'host', 'port'.""" | 
					
						
							|  |  |  |     global _portprog | 
					
						
							|  |  |  |     if _portprog is None: | 
					
						
							| 
									
										
										
										
											2020-01-05 14:14:31 +02:00
										 |  |  |         _portprog = re.compile('(.*):([0-9]*)', re.DOTALL) | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2020-01-05 14:14:31 +02:00
										 |  |  |     match = _portprog.fullmatch(host) | 
					
						
							| 
									
										
										
										
											2014-01-18 18:30:33 +02:00
										 |  |  |     if match: | 
					
						
							|  |  |  |         host, port = match.groups() | 
					
						
							|  |  |  |         if port: | 
					
						
							|  |  |  |             return host, port | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  |     return host, None | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2018-04-25 16:51:54 -07:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  | def splitnport(host, defport=-1): | 
					
						
							| 
									
										
										
										
											2018-04-25 16:51:54 -07:00
										 |  |  |     warnings.warn("urllib.parse.splitnport() is deprecated as of 3.8, " | 
					
						
							|  |  |  |                   "use urllib.parse.urlparse() instead", | 
					
						
							|  |  |  |                   DeprecationWarning, stacklevel=2) | 
					
						
							|  |  |  |     return _splitnport(host, defport) | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | def _splitnport(host, defport=-1): | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  |     """Split host and port, returning numeric port.
 | 
					
						
							|  |  |  |     Return given default port if no ':' found; defaults to -1. | 
					
						
							| 
									
										
										
										
											2022-10-20 17:00:56 -04:00
										 |  |  |     Return numerical port if a valid number is found after ':'. | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  |     Return None as the port if ':' is followed by a non-numeric value."""
 | 
					
						
							| 
									
										
										
										
											2015-03-03 20:21:35 +02:00
										 |  |  |     host, delim, port = host.rpartition(':') | 
					
						
							|  |  |  |     if not delim: | 
					
						
							|  |  |  |         host = port | 
					
						
							|  |  |  |     elif port: | 
					
						
							| 
									
										
										
										
											2022-10-20 17:00:56 -04:00
										 |  |  |         if port.isdigit() and port.isascii(): | 
					
						
							| 
									
										
										
										
											2015-03-03 20:21:35 +02:00
										 |  |  |             nport = int(port) | 
					
						
							| 
									
										
										
										
											2022-10-20 17:00:56 -04:00
										 |  |  |         else: | 
					
						
							| 
									
										
										
										
											2015-03-03 20:21:35 +02:00
										 |  |  |             nport = None | 
					
						
							|  |  |  |         return host, nport | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  |     return host, defport | 
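The four outcomes listed in the _splitnport() docstring can be traced with a standalone sketch. The function name split_numeric_port is hypothetical; it mirrors the rpartition logic above so the example does not depend on a private helper.

```python
def split_numeric_port(host, default=-1):
    """Hypothetical stand-in that mirrors the logic of _splitnport()."""
    name, delim, port = host.rpartition(":")
    if not delim:
        # No ':' at all: rpartition puts the whole string in the last slot.
        return port, default
    if not port:
        # Trailing ':' with nothing after it: fall back to the default.
        return name, default
    if port.isascii() and port.isdigit():
        return name, int(port)
    # ':' followed by something that is not an ASCII number.
    return name, None


print(split_numeric_port("example.com:8080"))  # ('example.com', 8080)
print(split_numeric_port("example.com"))       # ('example.com', -1)
print(split_numeric_port("example.com:"))      # ('example.com', -1)
print(split_numeric_port("example.com:http"))  # ('example.com', None)
```

The isascii() check matters because str.isdigit() also accepts non-ASCII digits such as full-width numerals, which should not be treated as a port.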
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2018-04-25 16:51:54 -07:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  | def splitquery(url): | 
					
						
							| 
									
										
										
										
											2018-04-25 16:51:54 -07:00
										 |  |  |     warnings.warn("urllib.parse.splitquery() is deprecated as of 3.8, " | 
					
						
							|  |  |  |                   "use urllib.parse.urlparse() instead", | 
					
						
							|  |  |  |                   DeprecationWarning, stacklevel=2) | 
					
						
							|  |  |  |     return _splitquery(url) | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | def _splitquery(url): | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  |     """splitquery('/path?query') --> '/path', 'query'.""" | 
					
						
							| 
									
										
										
										
											2015-03-03 20:21:35 +02:00
										 |  |  |     path, delim, query = url.rpartition('?') | 
					
						
							|  |  |  |     if delim: | 
					
						
							|  |  |  |         return path, query | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  |     return url, None | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2018-04-25 16:51:54 -07:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  | def splittag(url): | 
					
						
							| 
									
										
										
										
											2018-04-25 16:51:54 -07:00
										 |  |  |     warnings.warn("urllib.parse.splittag() is deprecated as of 3.8, " | 
					
						
							|  |  |  |                   "use urllib.parse.urlparse() instead", | 
					
						
							|  |  |  |                   DeprecationWarning, stacklevel=2) | 
					
						
							|  |  |  |     return _splittag(url) | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | def _splittag(url): | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  |     """splittag('/path#tag') --> '/path', 'tag'.""" | 
					
						
							| 
									
										
										
										
											2015-03-03 20:21:35 +02:00
										 |  |  |     path, delim, tag = url.rpartition('#') | 
					
						
							|  |  |  |     if delim: | 
					
						
							|  |  |  |         return path, tag | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  |     return url, None | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2018-04-25 16:51:54 -07:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  | def splitattr(url): | 
					
						
							| 
									
										
										
										
											2018-04-25 16:51:54 -07:00
										 |  |  |     warnings.warn("urllib.parse.splitattr() is deprecated as of 3.8, " | 
					
						
							|  |  |  |                   "use urllib.parse.urlparse() instead", | 
					
						
							|  |  |  |                   DeprecationWarning, stacklevel=2) | 
					
						
							|  |  |  |     return _splitattr(url) | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | def _splitattr(url): | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  |     """splitattr('/path;attr1=value1;attr2=value2;...') ->
 | 
					
						
							|  |  |  |         '/path', ['attr1=value1', 'attr2=value2', ...]."""
 | 
					
						
							|  |  |  |     words = url.split(';') | 
					
						
							|  |  |  |     return words[0], words[1:] | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2018-04-25 16:51:54 -07:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  | def splitvalue(attr): | 
					
						
							| 
									
										
										
										
											2018-04-25 16:51:54 -07:00
										 |  |  |     warnings.warn("urllib.parse.splitvalue() is deprecated as of 3.8, " | 
					
						
							|  |  |  |                   "use urllib.parse.parse_qsl() instead", | 
					
						
							|  |  |  |                   DeprecationWarning, stacklevel=2) | 
					
						
							|  |  |  |     return _splitvalue(attr) | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | def _splitvalue(attr): | 
					
						
							| 
									
										
										
										
											2008-06-18 20:49:58 +00:00
										 |  |  |     """splitvalue('attr=value') --> 'attr', 'value'.""" | 
					
						
							| 
									
										
										
										
											2015-03-03 20:21:35 +02:00
										 |  |  |     attr, delim, value = attr.partition('=') | 
					
						
							|  |  |  |     return attr, (value if delim else None) |
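
The deprecation messages above point callers at `urlparse()` and `parse_qsl()`. The following is a minimal sketch of how the old helpers might map onto those public functions. The sample URLs and variable names are illustrative assumptions, not taken from this module, and the results are not drop-in identical: the private helpers split at the last `?` or `#` and return `None` for a missing part, whereas `urlsplit()` splits at the first delimiter and returns an empty string.

```python
from urllib.parse import parse_qsl, urlparse, urlsplit

# splitquery() / splittag(): urlsplit() exposes both pieces as attributes.
parts = urlsplit('/path?query#tag')
path, query, tag = parts.path, parts.query, parts.fragment

# The rpartition()-based helpers cut at the LAST '?'; urlsplit() cuts at
# the first one, so the two disagree on input like this.
ambiguous = urlsplit('/a?b?c')

# A missing query comes back as '' here, where _splitquery() returns None.
bare = urlsplit('/path')

# splitattr(): urlparse() keeps ';' parameters of the last path segment
# in .params, which can then be split by hand.
parsed = urlparse('/path;attr1=value1;attr2=value2')
attrs = parsed.params.split(';')

# splitvalue(): parse_qsl() returns a list of (name, value) pairs.
pairs = parse_qsl('attr=value')
```

Callers that rely on the last-delimiter or `None` behaviour would need to keep that logic explicitly, for example with `str.rpartition()`, rather than switching to `urlsplit()` unchanged.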