Serhiy Storchaka
bc285e5832
gh-138907: Support RFC 9309 in robotparser (GH-138908)
* empty lines are always ignored instead of separating groups
* the "user-agent" line after a rule starts a new group
* groups matching the same user agent are now merged
* the rule with the longest match wins instead of the first matching rule
* in case of equal matches, the “Allow” rule wins over “Disallow”
* special characters “$” and “*” are now supported in rules
* prefer full match for user agent
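
The rule-selection changes listed above (longest match wins, "Allow" wins ties, "$" and "*" supported) can be illustrated with a small standalone sketch. This is not the code from the commit: the helper names `rule_to_regex` and `is_allowed` are hypothetical, and measuring "longest" by pattern length is an assumption made for illustration.

```python
import re


def rule_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    "*" matches any run of characters; a trailing "$" anchors the end.
    (Hypothetical helper, not part of urllib.robotparser.)
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_allowed(rules, path):
    """Decide whether path may be fetched under one group's rules.

    rules is a list of (allow, pattern) pairs. The longest matching
    pattern wins; on equal length, an Allow rule beats a Disallow rule.
    """
    best_len = -1
    best_allow = True  # no matching rule means the path is allowed
    for allow, pattern in rules:
        if not pattern:
            continue  # assumption: an empty pattern matches nothing
        if rule_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and allow):
                best_len, best_allow = length, allow
    return best_allow


rules = [
    (False, "/private/"),
    (True, "/private/public/"),
    (False, "/*.pdf$"),
    (False, "/docs"),
    (True, "/docs"),
]

for path in ("/private/x", "/private/public/page",
             "/files/report.pdf", "/files/report.pdf?x=1", "/docs/a"):
    print(path, is_allowed(rules, path))
```

Rule order does not matter in this sketch: "/private/public/page" is allowed because the Allow pattern is longer than "/private/", and "/docs/a" is allowed because Allow wins the tie.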
2026-05-04 18:03:11 +00:00
Ned Batchelder
852ec18978
Docs: remove unneeded author attributions (#145002)
These directives are not maintained and misleadingly indicate individual
rather than community ownership.
See https://github.com/python/docs-community/issues/180 for discussion,
and https://github.com/python/devguide/pull/1740 for an update to the
devguide.
Also ensured that everyone is in the Misc/ACKS file.
2026-02-19 18:45:28 -05:00
kovan
34e5a63f14
gh-141444: Replace dead URL in urllib.robotparser example (GH-144443)
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 10:45:15 +02:00
commitWithTisha
08115d241a
Fix minor typo: 'web site' -> 'website' (GH-140561)
2025-11-04 10:23:49 +01:00
Ned Batchelder
bcb435ee8f
docs: module page titles should not start with a link to themselves (#117099)
2024-05-08 20:34:40 +01:00
Skip Montanaro
5ecfd750b4
Correct Skip Montanaro's email address (#114677)
2024-01-28 14:51:25 +00:00
Mariusz Felisiak
11749e2dc2
bpo-44740: Lowercase "internet" and "web" where appropriate. (#27378)
Co-authored-by: Łukasz Langa <lukasz@langa.pl>
2021-07-27 00:11:55 +02:00
Christopher Beacham
5db5c0669e
bpo-21475: Support the Sitemap extension in robotparser (GH-6883)
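
A minimal sketch of reading Sitemap entries through `RobotFileParser.site_maps()`, the accessor this extension is exposed through. The robots.txt content and the example.com URL are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; parse() takes an iterable of lines,
# so no network access is needed.
robots_txt = """\
Sitemap: https://example.com/sitemap.xml
User-agent: *
Disallow: /tmp/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# site_maps() returns a list of sitemap URLs, or None if there are none.
sitemaps = parser.site_maps()
print(sitemaps)
```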
2018-05-16 10:52:07 -04:00
Berker Peksag
3df02dbc8e
bpo-31325: Fix usage of namedtuple in RobotFileParser.parse() (#4529)
2017-11-23 15:40:26 -08:00
Terry Jan Reedy
4da945f361
Merge Issue #22558.
2016-06-11 15:06:08 -04:00
Terry Jan Reedy
fa089b9b0b
Issue #22558: Add remaining doc links to source code for Python-coded modules.
Reformat header above separator line (added if missing) to a common format.
Patch by Yoni Lavi.
2016-06-11 15:02:54 -04:00
Berker Peksag
960e848f0d
Issue #16099: RobotFileParser now supports Crawl-delay and Request-rate
extensions.
Patch by Nikolay Bogoychev.
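
A minimal sketch of how the two extensions surface through `RobotFileParser.crawl_delay()` and `RobotFileParser.request_rate()`. The robots.txt content and the "ExampleBot" user agent are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, parsed from memory rather than fetched.
robots_txt = """\
User-agent: *
Crawl-delay: 5
Request-rate: 3/20
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# crawl_delay() returns the delay as an int, or None if not specified.
delay = parser.crawl_delay("ExampleBot")

# request_rate() returns a named tuple with "requests" and "seconds"
# fields, or None if not specified.
rate = parser.request_rate("ExampleBot")

print(delay, rate.requests, rate.seconds)
```

A crawler could sleep for `delay` seconds between requests, or limit itself to `rate.requests` requests per `rate.seconds` seconds.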
2015-10-08 12:27:06 +03:00
Terry Jan Reedy
f3f0681794
Issue #17398: document url argument of RobotFileParser
2013-03-15 16:50:23 -04:00
Georg Brandl
0f7ede4569
Review the doc changes for the urllib package creation.
2008-06-23 11:23:31 +00:00
Senthil Kumaran
aca8fd7a9d
Documentation updates for urllib package. Modified the documentation for the
urllib, urllib2 -> urllib.request, urllib.error
urlparse -> urllib.parse
RobotParser -> urllib.robotparser
Updated tutorial references and other module references (http.client.rst,
ftplib.rst, contextlib.rst)
Updated the examples in the urllib2-howto
Addresses Issue3142.
2008-06-23 04:41:59 +00:00