[svn] Make -p work with framed pages.

Published in <sxsu1vby71t.fsf@florida.arsdigita.de>.
hniksic 2001-11-30 19:06:41 -08:00
parent a244a67bc3
commit 7ab7f93f8d
4 changed files with 22 additions and 10 deletions

NEWS

@@ -41,6 +41,10 @@ are now converted correctly.
 retrieving for inline images, stylesheets, and other documents needed
 to display the page.
+*** Page-requisites (-p) mode now works with frames. In other words,
+`wget -p URL-THAT-USES-FRAMES' will now download the frame HTML files,
+and all the files that they need to be displayed properly.
 ** If a host has more than one IP address, Wget uses the other
 addresses when accessing the first one fails.

TODO

@@ -15,8 +15,6 @@ changes.
 It should connect to the proxy URL, log in as username@target-site,
 and continue as usual.
-* -p should probably go "_two_ more hops" on <FRAMESET> pages.
 * Add a --range parameter allowing you to explicitly specify a range of bytes to
 get from a file over HTTP (FTP only supports ranges ending at the end of the
 file, though forcibly disconnecting from the server at the desired endpoint


@@ -1,3 +1,8 @@
+2001-12-01 Hrvoje Niksic <hniksic@arsdigita.com>
+* recur.c (retrieve_tree): Allow -p retrievals to exceed maximum
+depth by more than one.
 2001-11-30 Hrvoje Niksic <hniksic@arsdigita.com>
 * retr.c (retrieve_url): Don't allow more than 20 redirections.


@@ -255,17 +255,22 @@ retrieve_tree (const char *start_url)
       if (descend
           && depth >= opt.reclevel && opt.reclevel != INFINITE_RECURSION)
         {
-          if (opt.page_requisites && depth == opt.reclevel)
-            /* When -p is specified, we can do one more partial
-               recursion from the "leaf nodes" on the HTML document
-               tree. The recursion is partial in that we won't
-               traverse any <A> or <AREA> tags, nor any <LINK> tags
-               except for <LINK REL="stylesheet">. */
-            dash_p_leaf_HTML = TRUE;
+          if (opt.page_requisites
+              && (depth == opt.reclevel || depth == opt.reclevel + 1))
+            {
+              /* When -p is specified, we are allowed to exceed the
+                 maximum depth, but only for the "inline" links,
+                 i.e. those that are needed to display the page.
+                 Originally this could exceed the depth at most by
+                 one, but we allow one more level so that the leaf
+                 pages that contain frames can be loaded
+                 correctly. */
+              dash_p_leaf_HTML = TRUE;
+            }
           else
             {
               /* Either -p wasn't specified or it was and we've
-                 already gone the one extra (pseudo-)level that it
+                 already spent the two extra (pseudo-)levels that it
                  affords us, so we need to bail out. */
               DEBUGP (("Not descending further; at depth %d, max. %d.\n",
                        depth, opt.reclevel));
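
Illustrative note: the depth rule this hunk introduces can be restated as a small standalone function. This is only a sketch under stated assumptions, not code from Wget. `classify_depth`, `enum descend_mode`, and the value chosen for `INFINITE_RECURSION` are hypothetical names and values invented here; the real function works on `opt.reclevel`, `opt.page_requisites`, and `dash_p_leaf_HTML` inside `retrieve_tree`.

```c
#include <stdbool.h>

/* Assumption: stand-in sentinel for "no depth limit"; the actual
   definition of INFINITE_RECURSION is not shown in this diff.  */
#define INFINITE_RECURSION (-1)

/* Hypothetical result type, used only for this illustration.  */
enum descend_mode
{
  DESCEND_FULL,         /* below the limit: follow every link */
  DESCEND_INLINE_ONLY,  /* -p grace levels: only page requisites */
  DESCEND_NONE          /* past the limit: do not descend */
};

/* Mirror the condition in the hunk: with -p, a document at depth
   reclevel or reclevel + 1 may still be parsed, but only for the
   "inline" links needed to display the page.  */
enum descend_mode
classify_depth (int depth, int reclevel, bool page_requisites)
{
  if (reclevel == INFINITE_RECURSION || depth < reclevel)
    return DESCEND_FULL;

  if (page_requisites
      && (depth == reclevel || depth == reclevel + 1))
    return DESCEND_INLINE_ONLY;

  return DESCEND_NONE;
}
```

With a maximum depth of 1 and -p set, this sketch returns `DESCEND_INLINE_ONLY` for depths 1 and 2, which is the case the commit targets: a leaf page that is a frameset can load its frame documents, and those frames can in turn load their images and stylesheets.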