Commit Graph

309 Commits

Author SHA1 Message Date
Tom Szilagyi
977276374d Add support for --retry-on-http-error
* doc/wget.texi: Add description for --retry-on-http-error
* src/http.c (gethttp):
Consider given HTTP response codes as non-fatal, transient errors.
Supply a comma-separated list of 3-digit HTTP response codes as
argument. Useful to work around special circumstances where retries
are required, but the server responds with an error code normally not
retried by Wget. Such errors might be 503 (Service Unavailable) and
429 (Too Many Requests). Retries enabled by this option are performed
subject to the normal retry timing and retry count limitations of
Wget.

Using this option is intended to support special use cases only and is
generally not recommended, as it can force retries even in cases where
the server is actually trying to decrease its load. Please use it
wisely and only if you know what you are doing.

Example use and a starting point for manual testing:
  wget --retry-on-http-error=429,503 http://httpbin.org/status/503
2017-02-09 21:17:20 +01:00
Dale R. Worley
d6eead1794 Improve documentation of --trust-server-names. 2017-02-02 12:10:43 +01:00
Matthew White
c403e67935 New: --metalink-over-http Content-Type/Disposition Metalink/XML processing
* src/http.c (metalink_from_http): Process the Content-Type header.
  Add an application/metalink4+xml URL as metalink metaurl.  If the
  option opt.content_disposition is true, the Content-Disposition's
  filename is the metaurl's name
* doc/wget.texi: Update --content-disposition and --metalink-over-http
* doc/metalink-standard.txt: Update doc. Content-Type/Disposition
  processing through --metalink-over-http. Update download naming
  system about --trust-server-names and --content-disposition
* testenv/Makefile.am: Add new files
* testenv/Test-metalink-http-xml-type.py: New file. Metalink/HTTP
  Content-Type/Disposition header automated Metalink/XML tests
* testenv/Test-metalink-http-xml-type-trust.py: New file. Metalink/HTTP
  Content-Type/Disposition header with --trust-server-names automated
  Metalink/XML tests
* testenv/Test-metalink-http-xml-type-content.py: New file. Metalink/HTTP
  Content-Type/Disposition header with --content-disposition automated
  Metalink/XML tests
* testenv/Test-metalink-http-xml-type-trust-content.py: New file.
  Metalink/HTTP Content-Type/Disposition header with --trust-server-names
  and --content-disposition automated Metalink/XML tests

Process the Content-Type header, identify an application/metalink4+xml
file.  The Content-Disposition could provide an alternate name through
the "filename" field for the metalink xml file.  Respectively, the cli
options --metalink-over-http and --content-disposition are required.

When Metalink/XML auto-processing, to use the Content-Disposition's
filename, the cli option --trust-server-names is also required.
2016-09-30 19:44:06 +02:00
Matthew White
0538e791fb New option --metalink-index to process Metalink application/metalink4+xml
* NEWS: Mention the effect of --metalink-index over Metalink
* src/init.c: Add new option metalinkindex (opt.metalink_index),
  initialize to -1
* src/main.c: Add new option metalink-index (--metalink-index=NUMBER)
* src/options.h: Add new option metalink_index (int)
* src/metalink.h: Add declaration of functions fetch_metalink_file(),
  replace_metalink_basename()
* src/metalink.c: Add functions fetch_metalink_file() simple file
  fetch, replace_metalink_basename() replace file basename
* src/metalink.c (retrieve_from_metalink): New. Process Metalink
  application/metalink4+xml of opt.metalink_index ordinal number
* doc/wget.texi: Add new option metalink-index (--metalink-index)
  documentation
* doc/metalink-standard.txt: Updated doc. Add documentation about
  Metalink application/metalink4+xml metaurls download naming system
* doc/metalink-standard.txt: Update Metalink/XML and HTTP examples
* testenv/Makefile.am: Add new files
* testenv/Test-metalink-http-xml.py: New file. Metalink/HTTP automated
  Metalink/XML "application/metalink4+xml" --metalink-index tests
* testenv/Test-metalink-http-xml-trust.py: New file. Metalink/HTTP
  automated Metalink/XML "application/metalink4+xml" --metalink-index
  retrieval with --trust-server-names tests

WARNING: Do not use lib/dirname.c (dir_name) to get the directory
name, it may append a dot '.' character to the directory name.
2016-09-30 19:44:06 +02:00
Liam R. Howlett
21e1725e12 Add --use-askpass=COMMAND support
* doc/wget.texi: Add --use-askpass to documentation.
* src/init.c: Add cmd_use_askpasss to set opt.use_askpass based on
argument, WGET_ASKPASS, and SSH_ASKPASS environment variables.
opt.wget-askpass is freed in cleanup ()
* src/main.c: Update options & add spawn process of opt.use_askpass
command.
* src/options.h: Addition of string use_askpass.
* src/url.c: Function scheme_leading_string to access the leading
string of a parsed url.
* src/url.h: Prototype for scheme_leading_string for returning the
leading string.
* bootstrap.conf: Add posix_spawn to gnulib_modules

This adds the --use-askpass option which is disabled by default.

--use-askpass=COMMAND will request the username and password for a given
URL by executing the external program COMMAND.  If COMMAND is left
blank, then the external program in the environment variable
WGET_ASKPASS will be used.  If WGET_ASKPASS is not set then the
environment variable SSH_ASKPASS is used.  If there is no value set, an
error is returned.  If an error occurs requesting the username or
password, wget will exit.

Signed-off-by: Liam R. Howlett <Liam.Howlett@WindRiver.com>
2016-09-03 21:01:24 +02:00
Matthew White
943a6d585f Add new option --keep-badhash to keep Metalink's files with a bad hash
* src/init.c: Add keepbadhash
* src/main.c: Add keep-badhash
* src/options.h: Add keep_badhash
* doc/wget.texi: Add docs for --keep-badhash
* src/metalink.h: Add prototypes badhash_suffix(), badhash_or_remove()
* src/metalink.c: New functions badhash_suffix(), badhash_or_remove().
  (retrieve_from_metalink): Call badhash_or_remove() on download error

With --keep-badhash, append .badhash to Metalink's files with checksum
mismatch. (retrieve_from_metalink): unique_create() may append another
suffix to avoid overwriting existing files.

Without --keep-badhash, remove downloaded files with checksum mismatch
(this conforms to the old behaviour).
2016-08-04 12:03:49 +02:00
Noël Köthe
ef372a4f27 Fix typos
* ChangeLog-2014-12-10: invokation -> invocation
* doc/wget.texi: invokation -> invocation
* src/main.c: seperated -> separated
* src/options.h: seperated -> separated
* testenv/README: invokation -> invocation
* testenv/conf/wget_commands.py: invokation -> invocation
2016-07-02 19:01:24 +02:00
moparisthebest
54746578e9 Implement --pinnedpubkey option to pin public keys
* doc/wget.texi: Add description for --pinnedpubkey
* src/gnutls.c: New function pkp_pin_peer_pubkey(),
  (ssl_check_certificate): Check pinned cert via pkp_pin_peer_pubkey()
* src/init.c: Add option --pinnedpubkey
* src/main.c: Add option --pinnedpubkey
* src/openssl.c: New function pkp_pin_peer_pubkey(),
  (ssl_check_certificate): Check pinned cert via pkp_pin_peer_pubkey()
* src/options.h: Add new option variable 'pinnedpubkey'
* src/utils.c: New functions wg_pubkey_pem_to_der(), wg_pin_peer_pubkey()
* src/utils.h: Add prototype for wg_pin_peer_pubkey()
2016-04-11 16:18:05 +02:00
Tim Rühsen
281ad7dfb9 Fixed URLs and references in wget.texi
* wget.texi: Replace server.com by example.com,
  replace ftp://wuarchive.wustl.edu by https://example.com,
  use HTTPS instead of HTTP where possible,
  fix list archive reference,
  remove reference to wget-notify@addictivecode.org,
  change bugtracker URL to bugtracker on Savannah,
  replace yoyodyne.com by example.com,
  fix URL to VMS port
2016-03-28 18:33:33 +02:00
Tim Rühsen
76ef65b23c Add options --bind-dns-address and --dns-servers
* README.checkout: Add description for libares
* configure.ac: Add check for libares
* doc/wget.texi: Add docs for the new options
* src/build_info.c.in: Add +/-cares for --version output
* src/host.c:
  (merge_address_lists): New static function
  (address_list_from_hostent): New static function
  (wait_ares): New static function
  (callback): New static function
  (lookup_host): Add libares resolver code
* src/init.c: Add new options,
  (cleanup): Add cleanup code
* src/main.c: Add global libares channel variable
  (cmdline_option option_data): Add new options
  (print_help): Add short descriptions
  (main): Add libares init code
* src/options.h (struct options): Add option members

The new options allow to specify alternative DNS servers and
an alternate packet route for the resolver packets.
Wget has to built with libares, enabled at configure time by
./configure --with-cares.
2016-03-23 09:26:22 +01:00
Tim Rühsen
598445ebd1 Fix links to original Robots Exclusion Standard
* doc/wget.texi: Fix links
2016-03-10 16:20:05 +01:00
Darshit Shah
75e5be7aad Update documentation about bahviour of -c
* docs/wget.texi: -c will restart download from scratch if server
    does not support RANGE.

    Reported-By: David Chavez
    http://stackoverflow.com/questions/30147332/unexpected-behavior-of-wget
2016-03-01 08:10:59 +01:00
Ángel González
3eddf5c173 * doc/wget.texi: add hint for self-signed certificates 2015-12-10 23:26:14 +01:00
Giuseppe Scrivano
81061571d1 Add --check-certificate=quiet
* doc/wget.texi: Add documentation for  --check-certificate=quiet.
* src/options.h (enum CHECK_CERT_MODES): New enum.
* src/init.c (cmd_check_cert): New static function.
(cmd_boolean_internal): Likewise.
* src/gnutls.c (ssl_check_certificate): Handle CHECK_CERT_QUIET.
* src/openssl.c (ssl_check_certificate): Handle CHECK_CERT_QUIET.
2015-12-03 11:49:55 +01:00
Tim Rühsen
b041658451 Document combination of -nc and -O
Fixes #46359
2015-11-03 15:13:28 +01:00
Ander Juaristi
4ad201a7e7 Added --convert-file-only option
* src/convert.c (convert_links_in_hashtable, convert_links):
   test for CO_CONVERT_BASENAME_ONLY.
   (convert_basename): new function.
 * src/convert.h: new constant CO_CONVERT_BASENAME_ONLY.
 * src/init.c, src/main.c, src/options.h: new option "--convert-file-only".
 * doc/wget.texi: updated documentation.

 Reviewed-by: Gabriel Somlo <somlo@cmu.edu>
2015-10-13 16:17:20 +02:00
Ander Juaristi
f8901af4e0 Added support for FTPS
* doc/wget.texi: updated documentation to reflect the new FTPS functionality.
 * src/ftp-basic.c (ftp_greeting): new function to read the server's greeting.
   (ftp_login): greeting code was previously here. Moved to ftp_greeting to
   support FTPS implicit mode.
   (ftp_auth): wrapper around the AUTH TLS command.
   (ftp_ccc): wrapper around the CCC command.
   (ftp_pbsz): wrapper around the PBSZ command.
   (ftp_prot): wraooer around the PROT command.
 * src/ftp.c (get_ftp_greeting): new static function.
   (init_control_ssl_connection): new static function to start SSL/TLS on the
   control channel.
   (getftp): added hooks to support FTPS commands (RFCs 2228 and 4217).
   (ftp_loop_internal): test for new FTPS error codes.
 * src/ftp.h: new enum 'prot_level' with available FTPS protection levels +
   prototypes of previous functions. New flag for enum 'wget_ftp_fstatus' to track
   whether the data channel has some security mechanism enabled or not.
 * src/gnutls.c (struct wgnutls_transport_context): new field 'session_data'.
   (wgnutls_close): free GnuTLS session data before exiting.
   (ssl_connect_wget): save/resume SSL/TLS session.
 * src/http.c (establish_connection): refactor ssl_connect_wget call.
   (metalink_from_http): take into account SCHEME_FTPS as well.
 * src/init.c, src/main.c, src/options.h: new command line/wgetrc options.
   (main): in recursive downloads, check for SCHEME_FTPS as well.
 * src/openssl.c (struct openssl_transport_context): new field 'sess'.
   (ssl_connect_wget): save/resume SSL/TLS session.
 * src/retr.c (retrieve_url): check new scheme SCHEME_FTPS.
 * src/ssl.h (ssl_connect_wget): refactor. New parameter of type 'int *'.
 * src/url.c. src/url.h: new scheme SCHEME_FTPS.
 * src/wget.h: new FTPS error codes.
 * src/metalink.h: support FTPS scheme.
2015-09-14 10:16:44 +02:00
Ander Juaristi
58917dcde1 Updated HSTS documentation
* doc/wget.texi: updated HSTS documentation.

   Reported-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
2015-09-01 13:50:40 +02:00
Jookia
e4db00d74d Add option to write URL rejections to a tab-delimited CSV log.
* main.c: Add "--rejected-log" option.
 * init.c: Add "rejectedlog" command.
 * options.h: Add "rejected_log" parameter string.
 * wget.texi: Add brief documentation on new --rejected-log option.
 * recur.c: Optionally log details of URLs not traversed.
   Add reject_reason enum.
   (download_child_p -> download_child): Return a reject_reason.
   (descend_redirect_p -> descend_redirect): Return a reject_reason.
   (retrieve_tree): Support logging reasons for rejection.
   Add write_reject_log_header that writes a CSV format header to a file.
   Add write_reject_log_url that writes a url struct to a file in CSV format.
   Add write_reject_log_reason that writes the URL and parent URL as well as the
   rejection reason to a CSV file.
 * Test--rejected-log.px: Add a basic test for the --rejected-log command.
 * tests/Makefile.am: Run Test--rejected-log.px.

This allows you to figure out why URLs are being rejected and some context
around it. CSV is used as the output format since it can be used easily parsed,
it's delimited by tabs instead of commas to allow using all (quoted) URL
characters and includes column names which may be used for compatibility.
2015-08-06 08:10:55 +02:00
Hubert Tarasiuk
6064f21c66 Geolocation support for Metalink resources.
* doc/wget.text: Add information about --preferred-location.
* src/init.c: Add --preferred-location option.
* src/main.c (option_data): Handle --preferred-location argument.
(main): Sort resources based on location if requested.
* src/metalink.c (metalink_res_cmp): Compare based on location if
priority and preference are equal.
* src/options.h (options): Add preferred_location option.
2015-07-20 15:31:06 +02:00
Hubert Tarasiuk
37b58e3976 Metalink support.
* bootstrap.conf: Add crypto/sha256
* configure.ac: Look for libmetalink and GPGME
* doc/wget.texi: Add --input-metalink and --metalink-over-http
options description.
* po/POTFILES.in: Add metalink.c
* src/Makefile.am: Add new translation unit (metalink.c)
* src/http.c (http_stat): Add metalink field.
(free_stat): Free metalink field.
(find_key_value): Find value of given key in header string.
(has_key): Check if token exists in header string.
(find_key_values): Find all key=value pairs in header string.
(metalink_from_http): Obtain Metalink metadata from HTTP response.
(gethttp): Call metalink_from_http if requested.
(http_loop): Request Metalink metadata from HTTP response if should be.
Fall back to regular download if no Metalink metadata found.
* src/init.c: Add --input-metalink and --metalink-over-http options
* src/main.c (option_data): Handle --input-metalink and
--metalink-over-http cmd arguments.
(print_help): Print --input-metalink option description.
(main): Retrieve files from Metalink file
* src/metalink.c (retrieve_from_metalink): Download files described by
metalink.
(metalink_res_cmp): Comparator for resources priority-sorting.
* src/metalink.h: Create header for metalink.c
(RES_TYPE_SUPPORTED): Define supported resources media.
(DEFAULT_PRI): Default mirror priority for Metalink over HTTP.
(VALID_PRI_RANGE): Valid priority range.
* src/options.h (options): Add input_metalink option and metalink_over_http
options.
* src/utils.c (hex_to_string): Convert binary data to ASCII-hex.
* src/utils.h (hex_to_string): Add prototype.
* src/wget.h: Add metalink-related error enums
Add METALINK_METADATA flag for document type.
2015-07-20 15:30:39 +02:00
Hubert Tarasiuk
885eaaa214 Include --if-modified-since option in user manual.
* doc/wget.texi: Add --if-modified-since section.
2015-05-22 11:08:30 +02:00
Rohan Prinja
8d4bb928b9 Convert wget.texi to UTF-8
* doc/wget.texi: convert to UTF-8 by adding @documentencoding
* doc/Makefile.am: Pass argument --utf8 to pod2man.
2015-03-17 09:52:23 +01:00
Ander Juaristi Alamos
d0f406a13f * doc/wget.texi: The default is GnuTLS, not OpenSSL. 2015-03-16 22:46:08 +01:00
Darshit Shah
524f26a200 Docs: --post-file is binary data
Wget considers the file mentioned in the --post-file argument as a
binary file and does not strip any control characters. The lack of this
information in the documentation can cause a lot of headaches debugging
for a simple issue
2015-03-13 23:51:59 +05:30
Giuseppe Scrivano
16f1fb1d1f maint: update copyright year ranges to include 2015 2015-03-09 16:32:01 +01:00
Darshit Shah
8705e27e20 progress bar: Allow display on stderr alongwith -o
This commit causes the --show-progress option to print the progress bar
to stderr even when a logfile was explicitly provided on the command
line. Such a combination allows a user to log the output of Wget while
simultaneously keeping track of the download status.
2015-01-20 20:16:20 +01:00
Tim Ruehsen
b147fb1638 doc/wget.texi: Add 'https_only' in 'wgetrc commands' section
Reported-By: Eli Zaretskii <eliz@gnu.org>
2014-12-20 17:42:00 +01:00
Tim Rühsen
f0de37cc27 wget.texi: Clarify wgetrc command syntax
Reported-by: Eli Zaretskii <eliz@gnu.org>
2014-12-19 16:56:35 +01:00
Tim Rühsen
f37dd1aa2d wget.texi: Document --random-file and --egd-file as OpenSSL only 2014-12-17 12:39:00 +01:00
Tim Rühsen
e4a8fe84e2 Added --crl-file to load a Certificate Revocation List (CRL) file
Reported-by: Noël Köthe <noel@debian.org>
2014-11-11 15:06:51 +01:00
Darshit Shah
18b0979357 CVE-2014-4877: Arbitrary Symlink Access
Wget was susceptible to a symlink attack which could create arbitrary
files, directories or symbolic links and set their permissions when
retrieving a directory recursively through FTP. This commit changes the
default settings in Wget such that Wget no longer creates local symbolic
links, but rather traverses them and retrieves the pointed-to file in
such a retrieval.

The old behaviour can be attained by passing the --retr-symlinks=no
option to the Wget invokation command.
2014-10-27 09:18:13 +01:00
Tim Ruehsen
3e3073ca7b add TLSv1_1 and TLSv1_2 to --secure-protocol 2014-10-23 21:16:37 +02:00
Tim Ruehsen
6fc11e46ec do not use SSLv3 except explicitely requested 2014-10-19 21:57:06 +02:00
Giuseppe Scrivano
3858500de4 Fix some texinfo warnings 2014-08-03 22:52:46 +02:00
Giuseppe Scrivano
a22cd7394b Remove trailing whitespaces 2014-06-12 18:49:14 +02:00
Darshit Shah
4eeabffee6 More progress bar aesthetic changes
This commit introduces two new changes to how the progress bar looks:
1. Support the --progress=bar:noscroll option which will prevent the filename
   from scrolling in the progress bar
2. Print human readable value for the amount already downloaded for any file
2014-05-30 13:28:02 +05:30
Darshit Shah
8c2fd06ba8 Add --show-progress to force display progress bar
This is a relatively large commit that implements two major features:

1. Implement --show-progress switch to force the display of the progress bar in
   any verbosity level
2. Edit the implementation of the progress bar so that the filename is displayed
   in the same line.
2014-05-01 01:07:43 +02:00
Yousong Zhou
dfa1f4e064 Make wget capable of starting downloads from a specified position.
This patch adds an option `--start-pos' for specifying starting position
of a HTTP or FTP download.

Signed-off-by: Yousong Zhou <yszhou4tech@gmail.com>
2014-03-21 11:21:00 +01:00
Giuseppe Scrivano
351d328c07 doc: use GFDL 1.3 2013-12-29 11:41:22 +01:00
Tim Ruehsen
1dec2028d0 add/explain quoting of wildcard patterns in wget.texi 2013-10-06 23:52:22 +02:00
Tim Ruehsen
e505664ef3 added PFS to --secure-protocol 2013-09-07 13:22:15 +02:00
Tim Ruehsen
42c78fdd71 added option --https-only 2013-08-22 20:05:41 +02:00
Hrvoje Niksic
a7df7ecc2f Fix misspelling. 2013-08-13 20:52:07 +02:00
Giuseppe Scrivano
44ba49b31f doc: document --backups 2013-07-13 13:36:55 +02:00
Tomas Hozza
c78caecbb4 Document missing options and fix --preserve-permissions
Added documentation for --regex-type and --preserve-permissions
options.

Fixed --preserve-permissions to work properly also if downloading a
single file from FTP.

Signed-off-by: Tomas Hozza <thozza@redhat.com>
2013-07-11 22:01:43 +02:00
Darshit Shah
90896e3314 Follow RFC 2616 and httpbis specifications when handling redirects 2013-06-16 22:31:16 +02:00
Darshit Shah
ccd369d5f2 Fix typo in documentation. 2013-05-12 19:57:32 +02:00
Giuseppe Scrivano
8dc52c6eaa doc: add documentation for --accept-regex and --reject-regex 2013-04-28 22:41:24 +02:00
Giuseppe Scrivano
0e6f1c2dac doc: add documentation for mega dot style. 2013-04-14 14:43:16 +02:00