mirror of https://github.com/mirror/wget.git
synced 2025-01-27 21:00:31 +08:00
Fix links to www.robotstxt.org
* NEWS: Fix links
* doc/wget.texi: Likewise
* src/res.c: Likewise

Reported-by: Noël Köthe
parent f31b93424b
commit 84a93f4127
2
NEWS
@@ -698,7 +698,7 @@ addresses when accessing the first one fails.
 non-standard port.
 
 ** Wget now supports the robots.txt directives specified in
-<http://www.robotstxt.org/wc/norobots-rfc.txt>.
+<http://www.robotstxt.org/norobots-rfc.txt>.
 
 ** URL parser has been fixed, especially the infamous overzealous
 quoting. Wget no longer dequotes reserved characters, e.g. `%3F' is
doc/wget.texi
@@ -4266,7 +4266,7 @@ server.
 
 Until version 1.8, Wget supported the first version of the standard,
 written by Martijn Koster in 1994 and available at
-@url{http://www.robotstxt.org/robotstxt.html}. As of version 1.8,
+@url{http://www.robotstxt.org/orig.html}. As of version 1.8,
 Wget has supported the additional directives specified in the internet
 draft @samp{<draft-koster-robots-00.txt>} titled ``A Method for Web
 Robots Control''. The draft, which has as far as I know never made to
@@ -4285,7 +4285,7 @@ this:
 @end example
 
 This is explained in some detail at
-@url{http://www.robotstxt.org/wc/meta-user.html}. Wget supports this
+@url{http://www.robotstxt.org/meta.html}. Wget supports this
 method of robot exclusion in addition to the usual @file{/robots.txt}
 exclusion.
 
src/res.c
@@ -37,12 +37,12 @@ as that of the covered work. */
 disallow access to certain parts of the site.
 
 The first specification was written by Martijn Koster in 1994, and
-is still available at <http://www.robotstxt.org/wc/norobots.html>.
+is still available at <http://www.robotstxt.org/orig.html>.
 In 1996, Martijn wrote an Internet Draft specifying an improved RES
 specification; however, that work was apparently abandoned since
 the draft has expired in 1997 and hasn't been replaced since. The
 draft is available at
-<http://www.robotstxt.org/wc/norobots-rfc.html>.
+<http://www.robotstxt.org/norobots-rfc.txt>.
 
 This file implements RES as specified by the draft. Note that this
 only handles the "robots.txt" support. The META tag that controls
@@ -428,7 +428,7 @@ free_specs (struct robot_specs *specs)
 
 /* The inner matching engine: return true if RECORD_PATH matches
 URL_PATH. The rules for matching are described at
-<http://www.robotstxt.org/wc/norobots-rfc.txt>, section 3.2.2. */
+<http://www.robotstxt.org/norobots-rfc.txt>, section 3.2.2. */
 
 static bool
 matches (const char *record_path, const char *url_path)
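For readers who do not want to follow the link, the matching rule that the last comment refers to (section 3.2.2 of the draft) can be sketched roughly as follows. This is an illustration only and not the body of `matches` in src/res.c, which the diff does not show. The names `path_matches`, `next_octet`, `hex_value`, and `ENCODED_SLASH` are hypothetical. The sketch assumes the rule as the draft states it: paths are compared octet by octet, `%xx` escapes are decoded before comparison except for an encoded `/`, and the match succeeds if the record path ends before any difference is found.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical sentinel: an encoded "/" (%2F) must not compare equal
   to a literal "/", so it gets a value outside the octet range.  */
#define ENCODED_SLASH 0x100

/* Hypothetical helper: numeric value of a hex digit.  The caller has
   already checked the character with isxdigit.  */
static int
hex_value (char c)
{
  if (c >= '0' && c <= '9')
    return c - '0';
  return tolower ((unsigned char) c) - 'a' + 10;
}

/* Hypothetical helper: read one logical octet from *P and advance *P.
   A well-formed %xx escape is decoded, except that %2F is reported as
   ENCODED_SLASH.  Anything else is returned as a plain octet.  */
static int
next_octet (const char **p)
{
  const char *s = *p;
  if (s[0] == '%'
      && isxdigit ((unsigned char) s[1])
      && isxdigit ((unsigned char) s[2]))
    {
      int c = hex_value (s[1]) * 16 + hex_value (s[2]);
      *p = s + 3;
      return c == '/' ? ENCODED_SLASH : c;
    }
  *p = s + 1;
  return (unsigned char) s[0];
}

/* Return true if RECORD_PATH (from robots.txt) is matched by URL_PATH:
   the end of the record path must be reached before any octet differs.  */
static bool
path_matches (const char *record_path, const char *url_path)
{
  while (*record_path != '\0')
    {
      if (*url_path == '\0')
        return false;           /* URL path is shorter than the record.  */
      if (next_octet (&record_path) != next_octet (&url_path))
        return false;
    }
  return true;
}
```

Under these assumptions, `path_matches ("/tmp", "/tmp/index.html")` is true because the record is a prefix of the URL path, `path_matches ("/%7Ejoe", "/~joe/")` is true because `%7E` decodes to `~`, and `path_matches ("/a%2Fb", "/a/b")` is false because an encoded slash is kept distinct from a literal one.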