[svn] Make -p work with framed pages.

Published in <sxsu1vby71t.fsf@florida.arsdigita.de>.
hniksic 2001-11-30 19:06:41 -08:00
parent a244a67bc3
commit 7ab7f93f8d
4 changed files with 22 additions and 10 deletions

NEWS

@@ -41,6 +41,10 @@ are now converted correctly.
 retrieving for inline images, stylesheets, and other documents needed
 to display the page.
+*** Page-requisites (-p) mode now works with frames. In other words,
+`wget -p URL-THAT-USES-FRAMES' will now download the frame HTML files,
+and all the files that they need to be displayed properly.
 ** If a host has more than one IP address, Wget uses the other
 addresses when accessing the first one fails.

TODO

@@ -15,8 +15,6 @@ changes.
 It should connect to the proxy URL, log in as username@target-site,
 and continue as usual.
-* -p should probably go "_two_ more hops" on <FRAMESET> pages.
 * Add a --range parameter allowing you to explicitly specify a range of bytes to
 get from a file over HTTP (FTP only supports ranges ending at the end of the
 file, though forcibly disconnecting from the server at the desired endpoint


@@ -1,3 +1,8 @@
+2001-12-01 Hrvoje Niksic <hniksic@arsdigita.com>
+* recur.c (retrieve_tree): Allow -p retrievals to exceed maximum
+depth by more than one.
 2001-11-30 Hrvoje Niksic <hniksic@arsdigita.com>
 * retr.c (retrieve_url): Don't allow more than 20 redirections.


@@ -255,17 +255,22 @@ retrieve_tree (const char *start_url)
       if (descend
           && depth >= opt.reclevel && opt.reclevel != INFINITE_RECURSION)
         {
-          if (opt.page_requisites && depth == opt.reclevel)
-            /* When -p is specified, we can do one more partial
-               recursion from the "leaf nodes" on the HTML document
-               tree. The recursion is partial in that we won't
-               traverse any <A> or <AREA> tags, nor any <LINK> tags
-               except for <LINK REL="stylesheet">. */
-            dash_p_leaf_HTML = TRUE;
+          if (opt.page_requisites
+              && (depth == opt.reclevel || depth == opt.reclevel + 1))
+            {
+              /* When -p is specified, we are allowed to exceed the
+                 maximum depth, but only for the "inline" links,
+                 i.e. those that are needed to display the page.
+                 Originally this could exceed the depth at most by
+                 one, but we allow one more level so that the leaf
+                 pages that contain frames can be loaded
+                 correctly. */
+              dash_p_leaf_HTML = TRUE;
+            }
           else
             {
               /* Either -p wasn't specified or it was and we've
-                 already gone the one extra (pseudo-)level that it
+                 already spent the two extra (pseudo-)levels that it
                  affords us, so we need to bail out. */
               DEBUGP (("Not descending further; at depth %d, max. %d.\n",
                        depth, opt.reclevel));
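
The depth rule introduced by this hunk can be read in isolation as a small decision function. The following is only an illustrative sketch, not code from recur.c: `classify_depth`, `enum descend_mode`, and the value chosen for `INFINITE_RECURSION` are assumptions made for the example. It mirrors the patched condition: below the limit everything is followed, at the limit and one level past it only page requisites are followed, and beyond that nothing is.

```c
#include <stdbool.h>

/* Assumed sentinel for "no depth limit"; the value here is illustrative. */
#define INFINITE_RECURSION (-1)

/* Hypothetical result type: what may be followed from a page at a given depth. */
enum descend_mode
{
  DESCEND_FULL,             /* follow all links */
  DESCEND_REQUISITES_ONLY,  /* follow only inline links (images, frames, ...) */
  DESCEND_NONE              /* do not descend further */
};

/* Hypothetical helper mirroring the patched check in retrieve_tree. */
enum descend_mode
classify_depth (int depth, int reclevel, bool page_requisites)
{
  /* No limit set, or limit not yet reached: ordinary recursion. */
  if (reclevel == INFINITE_RECURSION || depth < reclevel)
    return DESCEND_FULL;

  /* With -p, two extra pseudo-levels are allowed, but only for requisites:
     one for a leaf page's frames, one more for what those frames need. */
  if (page_requisites && (depth == reclevel || depth == reclevel + 1))
    return DESCEND_REQUISITES_ONLY;

  /* Either -p is off or both extra levels are already spent. */
  return DESCEND_NONE;
}
```

With a limit of 1 and -p set, depths 1 and 2 yield `DESCEND_REQUISITES_ONLY` and depth 3 yields `DESCEND_NONE`; before this commit the equivalent check would already have stopped at depth 2, leaving the frame documents' own images and stylesheets unfetched.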