* src/ftp.c (getftp): Fix password/user selection
* src/http.c (initialize_request): Likewise
Before, netrc password won over interactive
--ask-password but now --ask-password wins
after change of program logic
Fixes Issue #48811
* src/connect.c: check that the fd is not bigger than FD_SETSIZE
before using FD_SET. An fd_set cannot hold fds bigger than
FD_SETSIZE, causing out-of-bounds write to a buffer on the stack.
Reported by: Jann Horn <jannh@google.com>
* src/gnutls.c (ssl_connect_wget): Call gnutls_set_default_priority()
for --secure-protocol=auto (default).
The patch fixes a behavior that may have unintended side-effects in
certain gnutls versions. Instead use the default priorities when no
options are given.
Signed-off-by: Nikos Mavrogiannopoulos <nmav@gnutls.org>
* src/http-ntlm.c: Rename base64_{encode,decode}
* src/http.c: Likewise
* src/utils.c: Likewise
* src/utils.h: Likewise
When statically linking with gnutls, we get definition clash error for
base64_encode which is also defined by gnutls.
To prevent definition clash, rename base64_{encode,decode}
Signed-off-by: Rahul Bedarkar <rahul.bedarkar@imgtec.com>
* .travis.yml: Install libidn2-dev instead libidn11-dev.
* bootstrap.conf: Add modules libunistring-optional, unistr/base,
unicase/tolower.
* configure.ac: Check for libidn2.
* src/Makefile.am: Add $(LTLIBUNISTRING) to LDADD.
* tests/Makefile.am: Set LDADD similar to LDADD in src/Makefile.am
* src/connect.c: Use libidn2 code instead of libidn.
* src/host.c: Likewise.
* src/iri.c: Likewise.
* src/iri.h: Likewise.
* src/options.h: Likewise.
* src/url.c: Likewise.
* src/url.h: Likewise.
* src/log.c: Fix C99 comment.
IDN2003 should not be used any more due to security concerns.
We use libunistring (resp. the unicode code from gnulib) for
lowercasing UTF-8 before we give data to libidn2.
TR#46 is missing, no support in libidn2 nor in libunistring.
* src/log.c: Use tcgetpgrp(STDIN_FILENO) != getpgrp() to determine when to print to STD* or logfile.
Deprecate log_request_redirect_output function.
Use different file handles for STD* and logfile, to easily switch between them when changing fg/bg.
* src/log.h: Make redirect_output function externally linked.
* src/main.c: Don't use deprecated log_request_redirect_output function. Use redirect_output instead.
* src/mswindows.c: Don't use deprecated log_request_redirect_output function. Use redirect_output instead.
* src/recur.c (descend_redirect): Ignore WG_RR_LIST and WG_RR_REGEX
for redirections.
* testenv/Makefile.am: Add Test-recursive-redirect.py
* testenv/Test-recursive-redirect.py: New test
Test-recursive-redirect.py written by Dale R. Worley.
Reported-by: "Dale R. Worley" <worley@ariadne.com>
* src/http.c (metalink_from_http): Process the Content-Type header.
Add an application/metalink4+xml URL as metalink metaurl. If the
option opt.content_disposition is true, the Content-Disposition's
filename is the metaurl's name
* doc/wget.texi: Update --content-disposition and --metalink-over-http
* doc/metalink-standard.txt: Update doc. Content-Type/Disposition
processing through --metalink-over-http. Update download naming
system about --trust-server-names and --content-disposition
* testenv/Makefile.am: Add new files
* testenv/Test-metalink-http-xml-type.py: New file. Metalink/HTTP
Content-Type/Disposition header automated Metalink/XML tests
* testenv/Test-metalink-http-xml-type-trust.py: New file. Metalink/HTTP
Content-Type/Disposition header with --trust-server-names automated
Metalink/XML tests
* testenv/Test-metalink-http-xml-type-content.py: New file. Metalink/HTTP
Content-Type/Disposition header with --content-disposition automated
Metalink/XML tests
* testenv/Test-metalink-http-xml-type-trust-content.py: New file.
Metalink/HTTP Content-Type/Disposition header with --trust-server-names
and --content-disposition automated Metalink/XML tests
Process the Content-Type header, identify an application/metalink4+xml
file. The Content-Disposition could provide an alternate name through
the "filename" field for the metalink xml file. Respectively, the cli
options --metalink-over-http and --content-disposition are required.
When Metalink/XML auto-processing, to use the Content-Disposition's
filename, the cli option --trust-server-names is also required.
* src/http.c (http_loop): Prevent SIGSEGV when hstat.local_file is
NULL, opt.content_disposition has a role in leaving the value unset
* src/http.c (gethttp): If hs->local_file is NULL (aka http_loop()'s
hstat.local_file), set it to the value of hs->metalink->origin
* src/metalink.c (retrieve_from_metalink): If opt.trustservernames is
true, use the basename of the metaurl's name to save the xml file
* doc/metalink-standard.txt: Update doc. With --trust-server-names any
Metalink/HTTP Link application/metalink4+xml file is saved using the
basename of the "name" field, if any. Update Metalink/HTTP examples
* testenv/Makefile.am: Add new file
* testenv/Test-metalink-http-xml-trust-name.py: New file. Metalink/HTTP
automated Metalink/XML, save xml files using the "name" field tests
* src/metalink.c (retrieve_from_metalink): Reject any metalink:file
without hashes. Prompt the error and switch to the next file
* testenv/Makefile.am: Add new file
* testenv/Test-metalink-xml-nohash.py: New file. Metalink/XML with no
hashes tests
Prevent SIGSEGV.
* src/http.c (metalink_from_http): Fix hash_bin_len type. Use ssize_t
instead than size_t. Reject -1 as base64_decode() return value
* testenv/Makefile.am: Add new file
* testenv/Test-metalink-http-baddigest.py: New file. Metalink/HTTP
malformed base64 Digest header tests
On malformed base64 input, ssize_t base64_decode() returns -1. Such
value is too big for a size_t variable, and used as xmalloc() value
will exaust all the memory.
* NEWS: Mention the effect of --metalink-index over Metalink
* src/init.c: Add new option metalinkindex (opt.metalink_index),
initialize to -1
* src/main.c: Add new option metalink-index (--metalink-index=NUMBER)
* src/options.h: Add new option metalink_index (int)
* src/metalink.h: Add declaration of functions fetch_metalink_file(),
replace_metalink_basename()
* src/metalink.c: Add functions fetch_metalink_file() simple file
fetch, replace_metalink_basename() replace file basename
* src/metalink.c (retrieve_from_metalink): New. Process Metalink
application/metalink4+xml of opt.metalink_index ordinal number
* doc/wget.texi: Add new option metalink-index (--metalink-index)
documentation
* doc/metalink-standard.txt: Updated doc. Add documentation about
Metalink application/metalink4+xml metaurls download naming system
* doc/metalink-standard.txt: Update Metalink/XML and HTTP examples
* testenv/Makefile.am: Add new files
* testenv/Test-metalink-http-xml.py: New file. Metalink/HTTP automated
Metalink/XML "application/metalink4+xml" --metalink-index tests
* testenv/Test-metalink-http-xml-trust.py: New file. Metalink/HTTP
automated Metalink/XML "application/metalink4+xml" --metalink-index
retrieval with --trust-server-names tests
WARNING: Do not use lib/dirname.c (dir_name) to get the directory
name, it may append a dot '.' character to the directory name.
* src/http.c (metalink_from_http): Parse Metalink/HTTP header for
metaurls application/metalink4+xml media types
* src/metalink.h: Add function declaration metalink_meta_cmp()
* src/metalink.c: Add function metalink_meta_cmp() compare metalink
metaurls priorities
Add Metalink/HTTP application/metalink4+xml media types as metaurls to
the metalink variable that will be used to download the files.
* src/metalink.h: Add declaration of function dequote_metalink_string()
* src/metalink.c: Add function dequote_metalink_string() remove
surrounding quotes from string, \' or \"
* src/metalink.c (find_key_value, find_key_values): Call dequote_metalink_string()
to remove the surrounding quotes from the parsed value
* src/metalink.c (test_find_key_value, test_find_key_values): Add
quoted key's values for unit-tests
* testenv/Makefile.am: Add new file
* testenv/Test-metalink-http-quoted.py: New file. Metalink/HTTP quoted
values tests
Some Metalink/HTTP keys, like "type" [2], may have a quoted value [1]:
Link: <http://example.com/example.ext.meta4>; rel=describedby;
type="application/metalink4+xml"
Wget was expecting a dequoted value from the Metalink module. This
patch addresses this problem.
References:
[1] Metalink/HTTP: Mirrors and Hashes
1.1. Example Metalink Server Response
https://tools.ietf.org/html/rfc6249#section-1.1
[2] Additional Link Relations
6. "type"
https://tools.ietf.org/html/rfc6903#section-6
* src/metalink.h: Add declaration of function clean_metalink_string()
* src/metalink.c: Add directive #include "xmemdup0.h"
* src/metalink.c: Add function clean_metalink_string() remove leading
and trailing white spaces and CRLF from string
* src/metalink.c (retrieve_from_metalink): Remove leading and trailing
white spaces and CRLF from url resource mres->url
* testenv/Makefile.am: Add new file
* testenv/Test-metalink-xml-urlbreak.py: New test. Metalink/XML white
spaces and CRLF in url resources tests
White spaces and CRLF are not automatically removed by libmetalink
from url strings. The Wget's Metalink module was unable to process
such url strings. This patch implements the processing of such url
strings cleaning off leading and trailing white spaces and CRLF.
If a parsed Metalink/XML url string contains strings separated by
CRLF, only the first of the series is accepted.
* src/wget.h (uerr_t): Add error code METALINK_SIZE_ERROR to enum
* src/metalink.c (retrieve_from_metalink): Use boolean variable
size_ok, when false set retr_err to METALINK_SIZE_ERROR
* testenv/Makefile.am: Add new file
* testenv/Test-metalink-xml-size.py: New file. Metalink/XML file size
tests (<size></size>)
Before this patch, no appropriate error code was returned to inform a
file size mismatch.
This patch introduces the error code METALINK_SIZE_ERROR to inform a
file size mismatch.
* NEWS: Mention the effect of --trust-server-names over Metalink
* src/metalink.h: Add declaration of function append_suffix_number()
* src/metalink.c: Add function append_suffix_number() append number to
string
* src/metalink.c (retrieve_from_metalink): Safer Metalink/XML and
Metalink/HTTP download naming system, opt.trustservernames based
* doc/metalink-standard.txt: Update doc. Explain new Metalink/XML and
Metalin/HTTP download naming system and --trust-server-names role
* testenv/Makefile.am: Add new files
* testenv/Test-metalink-xml-continue.py: Update test. Metalink/XML
continue/keep existing files (HTTP 416) with --continue tests
* testenv/Test-metalink-xml.py: Update test. Metalink/XML naming tests
* testenv/Test-metalink-xml-trust.py: New file. Metalink/XML naming
tests with --trust-server-names
* testenv/Test-metalink-xml-abspath.py: Update test. Metalink/XML
absolute path tests
* testenv/Test-metalink-xml-abspath-trust.py: New file. Metalink/XML
absolute path tests with --trust-server-names
* testenv/Test-metalink-xml-relpath.py: Update test. Metalink/XML
relative path tests
* testenv/Test-metalink-xml-relpath-trust.py: New file. Metalink/XML
relative path tests with --trust-server-names
* testenv/Test-metalink-xml-homepath.py: Update test. Metalink/XML
home path and ~ (tilde) tests
* testenv/Test-metalink-xml-homepath-trust.py: New file. Metalink/XML
home path and ~ (tilde) tests with --trust-server-names
* testenv/Test-metalink-xml-prefix.py: New file. Metalink/XML naming
tests with --directory-prefix
* testenv/Test-metalink-xml-prefix-trust.py: New file. Metalink/XML
naming tests with --directory-prefix and --trust-server-names
* testenv/Test-metalink-xml-absprefix.py: New file. Metalink/XML
absolute --directory-prefix tests
* testenv/Test-metalink-xml-absprefix-trust.py: New file. Metalink/XML
absolute --directory-prefix tests with --trust-server-names
* testenv/Test-metalink-xml-relprefix.py: New file. Metalink/XML
relative --directory-prefix tests
* testenv/Test-metalink-xml-relprefix-trust.py: New file. Metalink/XML
relative --directory-prefix tests with --trust-server-names
* testenv/Test-metalink-xml-homeprefix.py: New file. Metalink/XML home
--directory-prefix tests
* testenv/Test-metalink-xml-homeprefix-trust.py: New file. Metalink/XML
home --directory-prefix tests with --trust-server-names
The option --trust-server-names allows to use the file names parsed
from a Metalink/XML file. Without --trust-server-names, the safety
mechanism provides secure and predictable file names.
* NEWS: Mention the use of a safe Metalink destination path
* src/metalink.h: Add declaration of functions get_metalink_basename(),
last_component(), metalink_check_safe_path()
* src/metalink.c: Add directive #include "dosname.h"
* src/metalink.c: Add function get_metalink_basename() to return the
basename of a file name, strip w32's drive letter prefixes
* src/metalink.c (retrieve_from_metalink): Enforce Metalink file name
verification, if the file name is unsafe try its basename
* doc/metalink.txt: Update document. Explain --directory-prefix
The function get_metalink_basename() uses FILE_SYSTEM_PREFIX_LEN to
catch any 'C:D:file' (w32 environment), then it removes each drive
letter prefix, i.e. 'C:' and 'D:'.
Unsafe file names contain an absolute, relative, or home path. Safe
paths can be verified by libmetalink's metalink_check_safe_path().
* NEWS: Mention the effect of --directory-prefix over Metalink
* src/metalink.c (retrieve_from_metalink): Add opt.dir_prefix as
prefix to the metalink:file name mfile->name
* doc/metalink.txt: Update document. Explain --directory-prefix
When --directory-prefix=<prefix> is used, set the top of the retrieval
tree to prefix. The default is . (the current directory). Metalink/XML
and Metalink/HTTP files will be downloaded under prefix.
* src/metalink.c (retrieve_from_metalink): Change mfile->name to
filename when referring to the downloaded file
The file name could have been changed by unique_create() (or by any
other mean) before downloading. Use the name of the downloaded file
(filename) when printing output which refer to it.
* NEWS: Mention Metalink's file size verification
* src/metalink.c (retrieve_from_metalink): Add file size computation
* doc/metalink.txt: Update document. Remove resolved bugs
Reject downloaded files when they do not agree with their Metalink/XML
metalink:size: https://tools.ietf.org/html/rfc5854#section-4.2.14
At the moment of writing, Metalink/HTTP headers do not provide a file
size field. This information could be obtained from the Content-Length
header field: https://tools.ietf.org/html/rfc6249#section-7
* NEWS: Mention the effects of --continue over Metalink
* src/metalink.c (retrieve_from_metalink): On download error, resume
output_stream with the next mres->url. Keep fully downloaded files
started with --continue, otherwise rename/remove the file
* testenv/Makefile.am: Add new file
* testenv/Test-metalink-xml-continue.py: New file. Metalink/XML
continue/keep existing files (HTTP 416) with --continue tests
Before this patch, with --continue, existing and/or fully retrieved
files which fail the sanity tests were renamed (--keep-badhash), or
removed.
This patch ensures that --continue doesn't rename/remove existing
and/or fully retrieved files (HTTP 416) which fail the sanity tests.
* NEWS: Mention the Metalink "path/file" name format handling
* src/metalink.c (retrieve_from_metalink): Fix NULL filename, set
filename to the right "path/file" value
* src/metalink.c (retrieve_from_metalink): Fix NULL output_stream, set
output_stream to filename when it is created by retrieve_url()
* src/metalink.c (retrieve_from_metalink): Add RFC5854 comments about
proper metalink:file "path/file" name format handling
* doc/metalink.txt: Update document. Remove resolved bugs
If unique_create() cannot create/open the destination file, filename
and output_stream remain NULL. If fopen() is used instead, filename
always remains NULL. Both functions cannot create "path/file" trees.
Setting filename to the right value is sufficient to prevent SIGSEGV
generating from testing a NULL value. This also allows retrieve_url()
to create a "path/file" tree through opt.output_document.
Reading NULL as output_stream, when it shall not be, leads to wrong
results. For instance, a non-NULL output_stream tells when a stream
was interrupted, reading NULL instead means to assume the contrary.
This patch conforms to the RFC5854 specification:
The Metalink Download Description Format
4.1.2.1. The "name" Attribute
https://tools.ietf.org/html/rfc5854#section-4.1.2.1
* src/html-url.c (tag_handle_img): Check append_url() for NULL
return value before dereference.
Crashed reproducable with parsing srcset="data:..." inline data.
Reported-by: Coverity
* src/http.c: Add const to first param of initialize_request(),
initialize_proxy_configuration(), establish_connection(),
check_file_output(), check_auth(), gethttp(), http_loop().
* src/http.h: Add const to first param of http_loop().
* src/connect.c (connect_to_ip): Check return value of setsockopt.
* src/ftp.c (ftp_retrieve_list): Check return value of chmod.
* src/http.c (digest_authentication_encode): Cleanup code.
* src/init.c (setval_internal): Explicitely check comind range.
* src/main.c (main): Explicitely check optarg.
* src/retr.c (retr_rate): Use snprintf instead sprintf,
(retrieve_from_file): More verbose error message,
(rotate_backups): Use snprintf instead sprintf, check return
value of rename().
* src/url.c (mkalldirs): Check return value of unlink().
* src/utils.c (strdupdelim): Explicitely check beg and end for NULL,
(merge_vecs): Fix sizeof argument to char *,
(stable_sort): Use malloc instead of alloca.
* bootstrap.conf: Add xmemdup0 and strpbrk.
* src/init.c (cmd_use_askpass): Add 'const' to char *,
remove check for file existence.
* src/main.c (run_use_askpass): C89 compat init of argv,
added \n to error messages,
fixed stripping of \n and \r from input,
make run_use_askpass and use_askpass static.
* doc/wget.texi: Add --use-askpass to documentation.
* src/init.c: Add cmd_use_askpasss to set opt.use_askpass based on
argument, WGET_ASKPASS, and SSH_ASKPASS environment variables.
opt.wget-askpass is freed in cleanup ()
* src/main.c: Update options & add spawn process of opt.use_askpass
command.
* src/options.h: Addition of string use_askpass.
* src/url.c: Function scheme_leading_string to access the leading
string of a parsed url.
* src/url.h: Prototype for scheme_leading_string for returning the
leading string.
* bootstrap.conf: Add posix_spawn to gnulib_modules
This adds the --use-askpass option which is disabled by default.
--use-askpass=COMMAND will request the username and password for a given
URL by executing the external program COMMAND. If COMMAND is left
blank, then the external program in the environment variable
WGET_ASKPASS will be used. If WGET_ASKPASS is not set then the
environment variable SSH_ASKPASS is used. If there is no value set, an
error is returned. If an error occurs requesting the username or
password, wget will exit.
Signed-off-by: Liam R. Howlett <Liam.Howlett@WindRiver.com>
* configure.ac: Add --with-psl-file to set a PSL file
* src/cookies.c (check_domain_match): Load PSL_FILE with
fallback to built-in data.
This change allows package maintainers to make Wget use the latest
PSL (DAFSA or plain text), without updating libpsl itself.
E.g. Debian now comes with a DAFSA binary within the 'publicsuffix'
package which allows very fast loading (no parsing or processing needed).
* src/cookies.c (cookie_header): Use heap instead of stack.
* src/http.c (request_send): Likewise.
If wget has to handle an insanely large amount of cookies (~700,000 on
32 bit systems or ~530,000 on 64 bit systems), the stack is not large
enough to hold these pointers, leading to undefined behaviour according
to POSIX; expect a segmentation fault in real life. ;)
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
The signal handler for SIGALRM calls longjmp, but the handler is
installed before the jump target has been initialized. If another
process sends SIGALRM right between handler installation and target
initialization, the jump leads to undefined behavior.
This can easily be fixed by moving the signal handler installation
into the "SETJMP == 0" conditional block, which means that the target
has just been initialized.
* src/utils.c: call signal after SETJMP.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
* src/init.c: Remove hyphens from command names
* src/main.c: Likewise
Options with hyphens (or underscores) in their command name cannot be
set in a wgetrc file.
Signed-off-by: Jeffery To <jeffery.to@gmail.com>
* src/metalink.c (retrieve_from_metalink): Continue file download if
opt.always_rest is true
Without --continue, download as a new file with an unique name (this
conforms to the old behaviour).
* bootstrap.conf: Add crypto/md2, and crypto/md4
* src/metalink.c (retrieve_from_metalink): Add md2, and md4 support
This patch adds support for the deprecated (insecure) md2, and md4
Message-Digest algorithms to the Metalink module.
* bootstrap.conf: Add crypto/sha512
* src/metalink.c (retrieve_from_metalink): Add md5, sha1, sha224,
sha384, and sha512 support
Metalink's checksum verification was limited to sha256. This patch
adds support for md5, sha1, sha224, sha384, and sha512.
* configure.ac: Check for xattr availability
* src/Makefile.am: Add xattr.c
* src/ftp.c: Include xattr.h.
(getftp): Set attributes if enabled.
* src/http.c: Include xattr.h.
(gethttp): Add parameter 'original_url',
set attributes if enabled.
(http_loop): Add 'original_url' to call of gethttp().
* src/init.c: Add new option --xattr.
* src/main.c: Add new option --xattr, add description to help text.
* src/options.h: Add new config member 'enable_xattr'.
* src/xatrr.c: New file.
* src/xattr.h: New file.
These attributes provide a lightweight method of later determining
where a file was downloaded from.
This patch changes:
* autoconf detects whether extended attributes are available and
enables the code if they are.
* The new flags --xattr and --no-xattr control whether xattr is enabled.
* The new command "xattr = (on|off)" can be used in ~/.wgetrc or /etc/wgetrc
* The original and redirected URLs are recorded as shown below.
* This works for both single fetches and recursive mode.
The attributes that are set are:
user.xdg.origin.url: The URL that the content was fetched from.
user.xdg.referrer.url: The URL that was originally requested.
Here is an example, where http://archive.org redirects to https://archive.org:
$ wget --xattr http://archive.org
...
$ getfattr -d index.html
user.xdg.origin.url="https://archive.org/"
user.xdg.referrer.url="http://archive.org/"
These attributes were chosen based on those stored by Google Chrome
https://bugs.chromium.org/p/chromium/issues/detail?id=45903
and curl https://github.com/curl/curl/blob/master/src/tool_xattr.c
* src/openssl.c (ssl_init): Use SSL_is_init_finished() instead of
SSL_state(), conditionally skip SSLeay function calls
The python test suite makes SSL_peek() hang, consuming 100% CPU time.
This does not happen on real world TLS connections, though, but needs
investigations.
* src/hsts.c (hsts_file_access_valid): we should check for "world-writable"
files only on Unix-based systems. It's difficult to mimic the same behavior
on Windows, so it's better to just not do it.
Reported-by: Gisle Vanem <gvanem@yahoo.no>
Reported-by: Eli Zaretskii <eliz@gnu.org>
If not --trust-server-names is used, FTP will also get the destination
file name from the original url specified by the user instead of the
redirected url. Closes CVE-2016-4971.
* src/ftp.c (ftp_get_listing): Add argument original_url.
(getftp): Likewise.
(ftp_loop_internal): Likewise. Use original_url to generate the
file name if --trust-server-names is not provided.
(ftp_retrieve_glob): Likewise.
(ftp_loop): Likewise.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* src/main.c (save_hsts): save the in-memory HSTS database to a file
only if something changed.
* src/hsts.c (struct hsts_store): new field 'changed'.
(hsts_match): update field 'changed' accordingly.
(hsts_store_entry): update field 'changed' accordingly.
(hsts_store_has_changed): new function.
* src/hsts.h (hsts_store_has_changed): new function.
* hsts.c (hsts_file_access_valid): check that the file is a regular
file, and that it's not world-writable.
(hsts_store_open): if the HSTS database file does not meet the
above requirements, disable HSTS at all.
* src/hsts.c (hsts_store_entry): strictly comply with RFC 6797.
RFC 6797 states in section 8.1 that the UA's cached information should
only be updated if:
"either or both of the max-age and includeSubDomains header field
value tokens are conveying information different than that already
maintained by the UA."
* src/metalink.c (retrieve_from_metalink): Fix debug message to print the
fingerprint instead of a pointer.
* Do not delete the ChangeLog file since it is required by the Makefile
and breaks compilation
* README.checkout: Add description for libares
* configure.ac: Add check for libares
* doc/wget.texi: Add docs for the new options
* src/build_info.c.in: Add +/-cares for --version output
* src/host.c:
(merge_address_lists): New static function
(address_list_from_hostent): New static function
(wait_ares): New static function
(callback): New static function
(lookup_host): Add libares resolver code
* src/init.c: Add new options,
(cleanup): Add cleanup code
* src/main.c: Add global libares channel variable
(cmdline_option option_data): Add new options
(print_help): Add short descriptions
(main): Add libares init code
* src/options.h (struct options): Add option members
The new options allow to specify alternative DNS servers and
an alternate packet route for the resolver packets.
Wget has to built with libares, enabled at configure time by
./configure --with-cares.
* src/gnutls.c (ssl_connect_wget, ssl_check_certificate): Fix SNI server name
* src/openssl.c (ssl_connect_wget, ssl_check_certificate): Fix SNI server name
Fixes#47408
* src/url.c [HAVE_ICONV]: Include iconv.h and langinfo.h.
(convert_fname): New function.
[HAVE_ICONV]: Convert file name from remote encoding to local
encoding.
(url_file_name): Call convert_fname.
(filechr_table): Don't consider bytes in 128..159 as control
characters.
* tests/Test-ftp-iri.px: Fix the expected file name to match the
new file-name recoding. State the remote encoding explicitly on
the Wget command line.
* NEWS: Mention the URI recoding when built with libiconv.
* src/iri.c: Kick out the last converted character from iconv()
Thanks to Eli Zaretskii <eliz@gnu.org> for suggesting the fix.
Reported-by: "Andries E. Brouwer" <Andries.Brouwer@cwi.nl>
* src/ftp.c (getftp): on error, close the file and attempt to remove it
before exiting.
* src/hsts.c (hsts_store_open): update modification time in the end.
* src/options.h (CHECK_CERT_MODES): Remove C99 style comma after last
value
* src/progress.c (create_image): Do not mix statements and declarations
* src/init.c (cmd_boolean_internal): Mark unused parameters
* src/progress.c (bar_create): Define size of progress buffer explicitly
(create_image): Clean up progress bar image creation. Use memset
instead of for loops to create arrays of the same byte.
* src/http.c (initialize_request): Fix wrong params to search_netrc()
Regression introduced in commit 29850e77
Reported-by: Axel Reinhold <axel@freakout.de>
* src/hsts.c (hsts_find_entry): Fix freeing memory
(hsts_remove_entry): Remove freeing host member
(hsts_match): Free host member here
(hsts_store_entry): Free host member here
(test_url_rewrite): Fix 'created' value
(test_hsts_read_database): Fix 'created' value
Reported-by: Dagobert Michelsen <dam@opencsw.org>
* src/ftp-basic.c: The code for the new FTPS functionality was unintentionally
inside a #ifdef IPV6 block. Move the code around so that it is defined even when
IPV6 isn't used
* src/http.c (gethttp,http_loop):
Do not download/save file on error when --spider is enabled and not
working recursive.
Reported-by: Сковорода Никита Андреевич chalkerx@gmail.comFixes#45821
* src/convert.c (convert_links_in_hashtable, convert_links):
test for CO_CONVERT_BASENAME_ONLY.
(convert_basename): new function.
* src/convert.h: new constant CO_CONVERT_BASENAME_ONLY.
* src/init.c, src/main.c, src/options.h: new option "--convert-file-only".
* doc/wget.texi: updated documentation.
Reviewed-by: Gabriel Somlo <somlo@cmu.edu>
* src/hsts.c (hsts_read_database): get an open file handle
instead of a file name.
(hsts_store_dump): get an open file handle
instead of a file name.
(hsts_store_open): open the file and pass the open file handle.
(hsts_store_save): lock the file before the read-merge-dump
process.
Reported-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
* src/hsts.c (hsts_store_merge): call hsts_new_entry() if the entry
does not exist in the database.
When merging the existing HSTS database on disk with the one on memory,
the entries that were on disk but not on memory were ignored. Thus,
only the existing entries were merged. This behavior was only triggered
when more than one Wget processes were using the same HSTS database
simultaneously. This commit fixes the bug by adding the new entries
to the on-memory database if they were not found there.
* http.c (digest_authentication_encode): Wget already errors out if
qop != "auth". Then it makes no sense to test for qop == "auth-int"
later on. Currently, Wget does not support the "auth-int" qop value
and till nobidy requests, it may remain so.
* http.c (digest_authentication_encode): Some servers are still
using the obsolete RFC 2069 Digest Authentication. Allow Digest
authentication without the qop parameter for this.
Reported-by: Andreas Longwitz <longwitz@incore.de>
* doc/wget.texi: updated documentation to reflect the new FTPS functionality.
* src/ftp-basic.c (ftp_greeting): new function to read the server's greeting.
(ftp_login): greeting code was previously here. Moved to ftp_greeting to
support FTPS implicit mode.
(ftp_auth): wrapper around the AUTH TLS command.
(ftp_ccc): wrapper around the CCC command.
(ftp_pbsz): wrapper around the PBSZ command.
(ftp_prot): wraooer around the PROT command.
* src/ftp.c (get_ftp_greeting): new static function.
(init_control_ssl_connection): new static function to start SSL/TLS on the
control channel.
(getftp): added hooks to support FTPS commands (RFCs 2228 and 4217).
(ftp_loop_internal): test for new FTPS error codes.
* src/ftp.h: new enum 'prot_level' with available FTPS protection levels +
prototypes of previous functions. New flag for enum 'wget_ftp_fstatus' to track
whether the data channel has some security mechanism enabled or not.
* src/gnutls.c (struct wgnutls_transport_context): new field 'session_data'.
(wgnutls_close): free GnuTLS session data before exiting.
(ssl_connect_wget): save/resume SSL/TLS session.
* src/http.c (establish_connection): refactor ssl_connect_wget call.
(metalink_from_http): take into account SCHEME_FTPS as well.
* src/init.c, src/main.c, src/options.h: new command line/wgetrc options.
(main): in recursive downloads, check for SCHEME_FTPS as well.
* src/openssl.c (struct openssl_transport_context): new field 'sess'.
(ssl_connect_wget): save/resume SSL/TLS session.
* src/retr.c (retrieve_url): check new scheme SCHEME_FTPS.
* src/ssl.h (ssl_connect_wget): refactor. New parameter of type 'int *'.
* src/url.c. src/url.h: new scheme SCHEME_FTPS.
* src/wget.h: new FTPS error codes.
* src/metalink.h: support FTPS scheme.
* src/progress.c (create_image): progress only when in foreground
Sometimes I start wget, but the remote site is too slow, so I rather
want to run it in background, however when I simply use job control
for that, wget will keep spewing the progress bar all over my
terminal. I have found the SIGHUP/SIGUSR1 feature to redirect output
to a log file, but I think the following small patch is even more
useful, since the progress bar will simply resume when wget is
foregrounded again (also, the final message is still printed to the
terminal in any case):
* http.c (test_parse_range_header): New function to test the
function for parsing the HTTP/1.1 Content-Range header.
* test.[ch]: Same
* http.c (parse_content_range): Fix parsing code. Fail on scenarios
mentioned in rfc 7233.
* hsts.c (get_hsts_store_filename): Free the homedir value
(close_hsts_test_store): Actually free the store struct too
(test_hsts_new_entry): Pass store to close_hsts_test_store()
(test_hsts_url_rewrite_superdomain): Same
(test_hsts_url_rewrite_congruent): Same
(test_hsts_read_database): Same and homedir and store filename
* http.c (test_parse_content_disposition): Free the returned
filename
* url.c (test_append_uri_pathel): Free allocated string
* src/ftp.c (getftp): Do not use PORT when PASV fails.
* tests/FTPServer.px: Add pasv_not_supported server flag.
* tests/Makefile.am: Add Test-ftp-pasv-not-supported.px
* tests/Test-ftp-pasv-not-supported.px: New test
Fix IP address exposure when automatically falling back from
passive mode to active mode (using the PORT command). A behavior that
may be used to expose a client's privacy even when using a proxy.
* src/recur.c: Declare variables before code
(write_reject_log_url):
Use const keyword where appropriate
Use the 'default' switch statement
Use xfree() instead of free()
Renamed variable f -> fp
(write_reject_log_reason):
Use const keyword where appropriate
Use the 'default' switch statement
Renamed variable f -> fp
Renamed variable r -> reason
* main.c: Add "--rejected-log" option.
* init.c: Add "rejectedlog" command.
* options.h: Add "rejected_log" parameter string.
* wget.texi: Add brief documentation on new --rejected-log option.
* recur.c: Optionally log details of URLs not traversed.
Add reject_reason enum.
(download_child_p -> download_child): Return a reject_reason.
(descend_redirect_p -> descend_redirect): Return a reject_reason.
(retrieve_tree): Support logging reasons for rejection.
Add write_reject_log_header that writes a CSV format header to a file.
Add write_reject_log_url that writes a url struct to a file in CSV format.
Add write_reject_log_reason that writes the URL and parent URL as well as the
rejection reason to a CSV file.
* Test--rejected-log.px: Add a basic test for the --rejected-log command.
* tests/Makefile.am: Run Test--rejected-log.px.
This allows you to figure out why URLs are being rejected and some context
around it. CSV is used as the output format since it can be used easily parsed,
it's delimited by tabs instead of commas to allow using all (quoted) URL
characters and includes column names which may be used for compatibility.
* src/utils.[ch], src/http.c, src/metalink.c: Rename function
hex_to_string() to wg_hex_to_string sine it collides with a
similarly named function in OpenSSL Library.
* Makefile.am: Added new source files hsts.c and hsts.h.
* http.c (parse_strict_transport_security): new function for STS header
parsing.
(gethttp): update the HSTS store.
* http.h: new include "hsts.h".
* init.c: new options --hsts and --hsts-file.
* main.c (get_hsts_database, load_hsts, save_hsts): new functions.
New options --no-hsts and --hsts-file added to help.
(main): load and save HSTS store.
* options.h: new variables for supporting --hsts and --hsts-file.
* retr.c (retrieve_url): rewrite the URI according to the HSTS policy before
entering http_loop.
* test.c, test.h: new unit tests for HSTS.
* utils.c, utils.h (countchars): new function.
* wget.h: new preprocessor check.
* hsts.c, hsts.h: new files with the HSTS engine implementation.
Added support for HTTP Strict Transport Security (HSTS), as defined by RFC
6797.
* doc/wget.text: Add information about --preferred-location.
* src/init.c: Add --preferred-location option.
* src/main.c (option_data): Handle --preferred-location argument.
(main): Sort resources based on location if requested.
* src/metalink.c (metalink_res_cmp): Compare based on location if
priority and preference are equal.
* src/options.h (options): Add preferred_location option.
* src/http.c: Move find_key_value, has_key, find_key_values.
* src/metalink.c: To here.
* src/metalink.h: Make them non-static and add prototypes here.
* src/wget.h: Add IF_MODIFIED_SINCE enum for dt. Add TIMECONV_ERR
enum to uerr_t.
* src/http.c (time_to_rfc1123): Convert time_t do http time.
* src/http.c (initialize_request): Include If-Modified-Since header
if appropriate.
* src/http.c (set_file_timestamp): Separate this code from check_file_output.
* src/http.c (check_file_output): Use set_file_timestamp.
* src/http.c (gethttp): Handle properly 304 return code and 200 if server
ignores If-Modified-Since headers.
* src/http.c (http_loop): Load filename to hstat if condget was requested,
use IF_MODIFIED_SINCE if requested and current timestamp can be obtained.
* src/iri.c (do_conversion): Call url_unescape_except_reserved,
instead of url_unescape.
* src/url.c (url_unescape_1): New static function.
(url_unescape): Calls url_unescape_1 with mask zero. Preserves
same behavior as before. Only code changes.
(url_unescape_except_reserved): New function.
* src/url.h: Added prototype for url_unescape_except_reserved().
When the locale is US-ASCII, URIs that contain special characters
in them are converted to IRIs according to RFC 3987, section 3.2
"Converting URIs to IRIs".
* progress.c (update_speed_ring): The comment for the function
incorrectly stated that the function uses thirty samples from the
past instead of twenty.
Reported-By: Yi Li <lovelylich@gmail.com>
* src/ftp.c (ftp_loop_internal): Add option `force_full_retrieve' that force to
retrieve full file.
(ftp_retrieve_list): Pass `true' as `force_full_retrieve' option to
`ftp_loop_internal' if we want to download file with newer timestamp than local
copy.
* src/iri.c (remote_to_utf8): Do not qualify with const the output pointer.
(do_conversion): Use the provided input parameter as const.
(idn_encode): casts to remote_to_utf8 parameters are no longer needed.
* src/iri.h: Adjusted remote_to_utf8 prototype.
* src/url.c: It is no longer necessary to cast new_url to const char.
src/http.c (parse_content_disposition): stores filename* and filename
separately and choses filename* if available.
(test_parse_content_disposition): added new tests.
* src/http.c (resp_free): Change the semantics of this function.
(request_free): Change the semantics of this function.
(initialize_request): Adjust request_free call.
(establish_connection): Adjust request_free, resp_free calls.
(gethttp): Adjust request_free, resp_free calls.
* src/http.c (establish_connection): Do not free request here (it is
* never allocated here).
* src/http.c (gethttp): Free request before returning if error in
* establish_connection encountered.
* src/http.c: Log --content-on-error downloads.
* src/retr.c (retrieve_url): Register the download of an error page
when --content-on-error is specified.
* warc.c (windows_uuid_str) [WINDOWS]: New function specific to
MS-Windows.
(warc_uuid_str) [WINDOWS]: If windows_uuid_str succeeds, use its
result; otherwise use the fallback method.
xfree() might crash on libidn memory on Windows.
From 'man idn_free':
"Under Windows, different parts of the same application may use different
heap memory, and then it is important to deallocate memory allocated within
the same module that allocated it. This function makes that possible."
This commit causes the --show-progress option to print the progress bar
to stderr even when a logfile was explicitly provided on the command
line. Such a combination allows a user to log the output of Wget while
simultaneously keeping track of the download status.