mirror of https://github.com/mirror/wget.git (synced 2025-01-29 05:40:27 +08:00)
Fix links to www.robotstxt.org

* NEWS: Fix links
* doc/wget.texi: Likewise
* src/res.c: Likewise

Reported-by: Noël Köthe
parent f31b93424b
commit 84a93f4127
--- a/NEWS
+++ b/NEWS
@@ -698,7 +698,7 @@ addresses when accessing the first one fails.
 non-standard port.

 ** Wget now supports the robots.txt directives specified in
-<http://www.robotstxt.org/wc/norobots-rfc.txt>.
+<http://www.robotstxt.org/norobots-rfc.txt>.

 ** URL parser has been fixed, especially the infamous overzealous
 quoting. Wget no longer dequotes reserved characters, e.g. `%3F' is
--- a/doc/wget.texi
+++ b/doc/wget.texi
@@ -4266,7 +4266,7 @@ server.

 Until version 1.8, Wget supported the first version of the standard,
 written by Martijn Koster in 1994 and available at
-@url{http://www.robotstxt.org/robotstxt.html}. As of version 1.8,
+@url{http://www.robotstxt.org/orig.html}. As of version 1.8,
 Wget has supported the additional directives specified in the internet
 draft @samp{<draft-koster-robots-00.txt>} titled ``A Method for Web
 Robots Control''. The draft, which has as far as I know never made to
@@ -4285,7 +4285,7 @@ this:
 @end example

 This is explained in some detail at
-@url{http://www.robotstxt.org/wc/meta-user.html}. Wget supports this
+@url{http://www.robotstxt.org/meta.html}. Wget supports this
 method of robot exclusion in addition to the usual @file{/robots.txt}
 exclusion.

--- a/src/res.c
+++ b/src/res.c
@@ -37,12 +37,12 @@ as that of the covered work. */
 disallow access to certain parts of the site.

 The first specification was written by Martijn Koster in 1994, and
-is still available at <http://www.robotstxt.org/wc/norobots.html>.
+is still available at <http://www.robotstxt.org/orig.html>.
 In 1996, Martijn wrote an Internet Draft specifying an improved RES
 specification; however, that work was apparently abandoned since
 the draft has expired in 1997 and hasn't been replaced since. The
 draft is available at
-<http://www.robotstxt.org/wc/norobots-rfc.html>.
+<http://www.robotstxt.org/norobots-rfc.txt>.

 This file implements RES as specified by the draft. Note that this
 only handles the "robots.txt" support. The META tag that controls
@@ -428,7 +428,7 @@ free_specs (struct robot_specs *specs)

 /* The inner matching engine: return true if RECORD_PATH matches
 URL_PATH. The rules for matching are described at
-<http://www.robotstxt.org/wc/norobots-rfc.txt>, section 3.2.2. */
+<http://www.robotstxt.org/norobots-rfc.txt>, section 3.2.2. */

 static bool
 matches (const char *record_path, const char *url_path)
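
For readers who do not have the draft to hand, the matching rule that the comment on `matches` points to can be sketched roughly as below. This is an illustrative sketch only, not the code from src/res.c. The names `next_octet` and `path_matches_sketch` are hypothetical, and the rule as paraphrased here is an assumption based on a reading of section 3.2.2 of the draft: the two paths are compared octet by octet, a `%xx` escape is decoded before comparison unless it encodes `/`, and the match succeeds only if the record path ends before any difference is found.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: return the next octet of *p and advance *p
   past it.  A "%xx" escape is decoded first, unless it encodes '/',
   in which case the raw '%' is returned and only one byte is
   consumed, so the escape is compared literally.  */
static int
next_octet (const char **p)
{
  const unsigned char *s = (const unsigned char *) *p;

  if (s[0] == '%' && isxdigit (s[1]) && isxdigit (s[2]))
    {
      char hex[3] = { (char) s[1], (char) s[2], '\0' };
      int c = (int) strtol (hex, NULL, 16);
      if (c != '/')
        {
          *p += 3;
          return c;
        }
    }
  *p += 1;
  return s[0];
}

/* Hypothetical sketch of the rule: true if RECORD_PATH is a prefix
   of URL_PATH after the selective decoding done by next_octet.  */
static bool
path_matches_sketch (const char *record_path, const char *url_path)
{
  const char *rp = record_path;
  const char *up = url_path;

  while (*rp != '\0')
    {
      if (*up == '\0')
        return false;           /* URL ended before the record path */
      if (next_octet (&rp) != next_octet (&up))
        return false;           /* octets differ */
    }
  return true;                  /* record path exhausted first */
}
```

Under these assumptions, a record path of `/tmp` matches the URL path `/tmp/file.html`, while `/a%2fb.html` does not match `/a/b.html` because the encoded slash is left undecoded. One simplification to be aware of: an undecoded `%2f` is compared byte for byte, so `%2f` and `%2F` are treated as different in this sketch.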