* configure.ac: Add --with-psl-file to set a PSL file
* src/cookies.c (check_domain_match): Load PSL_FILE with
fallback to built-in data.
This change allows package maintainers to make Wget use the latest
PSL (DAFSA or plain text), without updating libpsl itself.
E.g. Debian now comes with a DAFSA binary within the 'publicsuffix'
package which allows very fast loading (no parsing or processing needed).
* src/cookies.c (cookie_header): Use heap instead of stack.
* src/http.c (request_send): Likewise.
If wget has to handle an insanely large amount of cookies (~700,000 on
32 bit systems or ~530,000 on 64 bit systems), the stack is not large
enough to hold these pointers, leading to undefined behaviour according
to POSIX; expect a segmentation fault in real life. ;)
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
The signal handler for SIGALRM calls longjmp, but the handler is
installed before the jump target has been initialized. If another
process sends SIGALRM right between handler installation and target
initialization, the jump leads to undefined behavior.
This can easily be fixed by moving the signal handler installation
into the "SETJMP == 0" conditional block, which means that the target
has just been initialized.
* src/utils.c: call signal after SETJMP.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
* src/init.c: Remove hyphens from command names
* src/main.c: Likewise
Options with hyphens (or underscores) in their command name cannot be
set in a wgetrc file.
Signed-off-by: Jeffery To <jeffery.to@gmail.com>
* src/metalink.c (retrieve_from_metalink): Continue file download if
opt.always_rest is true
Without --continue, download as a new file with an unique name (this
conforms to the old behaviour).
* bootstrap.conf: Add crypto/md2, and crypto/md4
* src/metalink.c (retrieve_from_metalink): Add md2, and md4 support
This patch adds support for the deprecated (insecure) md2, and md4
Message-Digest algorithms to the Metalink module.
* bootstrap.conf: Add crypto/sha512
* src/metalink.c (retrieve_from_metalink): Add md5, sha1, sha224,
sha384, and sha512 support
Metalink's checksum verification was limited to sha256. This patch
adds support for md5, sha1, sha224, sha384, and sha512.
* configure.ac: Check for xattr availability
* src/Makefile.am: Add xattr.c
* src/ftp.c: Include xattr.h.
(getftp): Set attributes if enabled.
* src/http.c: Include xattr.h.
(gethttp): Add parameter 'original_url',
set attributes if enabled.
(http_loop): Add 'original_url' to call of gethttp().
* src/init.c: Add new option --xattr.
* src/main.c: Add new option --xattr, add description to help text.
* src/options.h: Add new config member 'enable_xattr'.
* src/xatrr.c: New file.
* src/xattr.h: New file.
These attributes provide a lightweight method of later determining
where a file was downloaded from.
This patch changes:
* autoconf detects whether extended attributes are available and
enables the code if they are.
* The new flags --xattr and --no-xattr control whether xattr is enabled.
* The new command "xattr = (on|off)" can be used in ~/.wgetrc or /etc/wgetrc
* The original and redirected URLs are recorded as shown below.
* This works for both single fetches and recursive mode.
The attributes that are set are:
user.xdg.origin.url: The URL that the content was fetched from.
user.xdg.referrer.url: The URL that was originally requested.
Here is an example, where http://archive.org redirects to https://archive.org:
$ wget --xattr http://archive.org
...
$ getfattr -d index.html
user.xdg.origin.url="https://archive.org/"
user.xdg.referrer.url="http://archive.org/"
These attributes were chosen based on those stored by Google Chrome
https://bugs.chromium.org/p/chromium/issues/detail?id=45903
and curl https://github.com/curl/curl/blob/master/src/tool_xattr.c
* src/openssl.c (ssl_init): Use SSL_is_init_finished() instead of
SSL_state(), conditionally skip SSLeay function calls
The python test suite makes SSL_peek() hang, consuming 100% CPU time.
This does not happen on real world TLS connections, though, but needs
investigations.
* src/hsts.c (hsts_file_access_valid): we should check for "world-writable"
files only on Unix-based systems. It's difficult to mimic the same behavior
on Windows, so it's better to just not do it.
Reported-by: Gisle Vanem <gvanem@yahoo.no>
Reported-by: Eli Zaretskii <eliz@gnu.org>
If not --trust-server-names is used, FTP will also get the destination
file name from the original url specified by the user instead of the
redirected url. Closes CVE-2016-4971.
* src/ftp.c (ftp_get_listing): Add argument original_url.
(getftp): Likewise.
(ftp_loop_internal): Likewise. Use original_url to generate the
file name if --trust-server-names is not provided.
(ftp_retrieve_glob): Likewise.
(ftp_loop): Likewise.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* src/main.c (save_hsts): save the in-memory HSTS database to a file
only if something changed.
* src/hsts.c (struct hsts_store): new field 'changed'.
(hsts_match): update field 'changed' accordingly.
(hsts_store_entry): update field 'changed' accordingly.
(hsts_store_has_changed): new function.
* src/hsts.h (hsts_store_has_changed): new function.
* hsts.c (hsts_file_access_valid): check that the file is a regular
file, and that it's not world-writable.
(hsts_store_open): if the HSTS database file does not meet the
above requirements, disable HSTS at all.
* src/hsts.c (hsts_store_entry): strictly comply with RFC 6797.
RFC 6797 states in section 8.1 that the UA's cached information should
only be updated if:
"either or both of the max-age and includeSubDomains header field
value tokens are conveying information different than that already
maintained by the UA."
* src/metalink.c (retrieve_from_metalink): Fix debug message to print the
fingerprint instead of a pointer.
* Do not delete the ChangeLog file since it is required by the Makefile
and breaks compilation
* README.checkout: Add description for libares
* configure.ac: Add check for libares
* doc/wget.texi: Add docs for the new options
* src/build_info.c.in: Add +/-cares for --version output
* src/host.c:
(merge_address_lists): New static function
(address_list_from_hostent): New static function
(wait_ares): New static function
(callback): New static function
(lookup_host): Add libares resolver code
* src/init.c: Add new options,
(cleanup): Add cleanup code
* src/main.c: Add global libares channel variable
(cmdline_option option_data): Add new options
(print_help): Add short descriptions
(main): Add libares init code
* src/options.h (struct options): Add option members
The new options allow to specify alternative DNS servers and
an alternate packet route for the resolver packets.
Wget has to built with libares, enabled at configure time by
./configure --with-cares.
* src/gnutls.c (ssl_connect_wget, ssl_check_certificate): Fix SNI server name
* src/openssl.c (ssl_connect_wget, ssl_check_certificate): Fix SNI server name
Fixes#47408
* src/url.c [HAVE_ICONV]: Include iconv.h and langinfo.h.
(convert_fname): New function.
[HAVE_ICONV]: Convert file name from remote encoding to local
encoding.
(url_file_name): Call convert_fname.
(filechr_table): Don't consider bytes in 128..159 as control
characters.
* tests/Test-ftp-iri.px: Fix the expected file name to match the
new file-name recoding. State the remote encoding explicitly on
the Wget command line.
* NEWS: Mention the URI recoding when built with libiconv.
* src/iri.c: Kick out the last converted character from iconv()
Thanks to Eli Zaretskii <eliz@gnu.org> for suggesting the fix.
Reported-by: "Andries E. Brouwer" <Andries.Brouwer@cwi.nl>
* src/ftp.c (getftp): on error, close the file and attempt to remove it
before exiting.
* src/hsts.c (hsts_store_open): update modification time in the end.
* src/options.h (CHECK_CERT_MODES): Remove C99 style comma after last
value
* src/progress.c (create_image): Do not mix statements and declarations
* src/init.c (cmd_boolean_internal): Mark unused parameters
* src/progress.c (bar_create): Define size of progress buffer explicitly
(create_image): Clean up progress bar image creation. Use memset
instead of for loops to create arrays of the same byte.
* src/http.c (initialize_request): Fix wrong params to search_netrc()
Regression introduced in commit 29850e77
Reported-by: Axel Reinhold <axel@freakout.de>
* src/hsts.c (hsts_find_entry): Fix freeing memory
(hsts_remove_entry): Remove freeing host member
(hsts_match): Free host member here
(hsts_store_entry): Free host member here
(test_url_rewrite): Fix 'created' value
(test_hsts_read_database): Fix 'created' value
Reported-by: Dagobert Michelsen <dam@opencsw.org>
* src/ftp-basic.c: The code for the new FTPS functionality was unintentionally
inside a #ifdef IPV6 block. Move the code around so that it is defined even when
IPV6 isn't used
* src/http.c (gethttp,http_loop):
Do not download/save file on error when --spider is enabled and not
working recursive.
Reported-by: Сковорода Никита Андреевич chalkerx@gmail.comFixes#45821
* src/convert.c (convert_links_in_hashtable, convert_links):
test for CO_CONVERT_BASENAME_ONLY.
(convert_basename): new function.
* src/convert.h: new constant CO_CONVERT_BASENAME_ONLY.
* src/init.c, src/main.c, src/options.h: new option "--convert-file-only".
* doc/wget.texi: updated documentation.
Reviewed-by: Gabriel Somlo <somlo@cmu.edu>
* src/hsts.c (hsts_read_database): get an open file handle
instead of a file name.
(hsts_store_dump): get an open file handle
instead of a file name.
(hsts_store_open): open the file and pass the open file handle.
(hsts_store_save): lock the file before the read-merge-dump
process.
Reported-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
* src/hsts.c (hsts_store_merge): call hsts_new_entry() if the entry
does not exist in the database.
When merging the existing HSTS database on disk with the one on memory,
the entries that were on disk but not on memory were ignored. Thus,
only the existing entries were merged. This behavior was only triggered
when more than one Wget processes were using the same HSTS database
simultaneously. This commit fixes the bug by adding the new entries
to the on-memory database if they were not found there.
* http.c (digest_authentication_encode): Wget already errors out if
qop != "auth". Then it makes no sense to test for qop == "auth-int"
later on. Currently, Wget does not support the "auth-int" qop value
and till nobidy requests, it may remain so.
* http.c (digest_authentication_encode): Some servers are still
using the obsolete RFC 2069 Digest Authentication. Allow Digest
authentication without the qop parameter for this.
Reported-by: Andreas Longwitz <longwitz@incore.de>
* doc/wget.texi: updated documentation to reflect the new FTPS functionality.
* src/ftp-basic.c (ftp_greeting): new function to read the server's greeting.
(ftp_login): greeting code was previously here. Moved to ftp_greeting to
support FTPS implicit mode.
(ftp_auth): wrapper around the AUTH TLS command.
(ftp_ccc): wrapper around the CCC command.
(ftp_pbsz): wrapper around the PBSZ command.
(ftp_prot): wraooer around the PROT command.
* src/ftp.c (get_ftp_greeting): new static function.
(init_control_ssl_connection): new static function to start SSL/TLS on the
control channel.
(getftp): added hooks to support FTPS commands (RFCs 2228 and 4217).
(ftp_loop_internal): test for new FTPS error codes.
* src/ftp.h: new enum 'prot_level' with available FTPS protection levels +
prototypes of previous functions. New flag for enum 'wget_ftp_fstatus' to track
whether the data channel has some security mechanism enabled or not.
* src/gnutls.c (struct wgnutls_transport_context): new field 'session_data'.
(wgnutls_close): free GnuTLS session data before exiting.
(ssl_connect_wget): save/resume SSL/TLS session.
* src/http.c (establish_connection): refactor ssl_connect_wget call.
(metalink_from_http): take into account SCHEME_FTPS as well.
* src/init.c, src/main.c, src/options.h: new command line/wgetrc options.
(main): in recursive downloads, check for SCHEME_FTPS as well.
* src/openssl.c (struct openssl_transport_context): new field 'sess'.
(ssl_connect_wget): save/resume SSL/TLS session.
* src/retr.c (retrieve_url): check new scheme SCHEME_FTPS.
* src/ssl.h (ssl_connect_wget): refactor. New parameter of type 'int *'.
* src/url.c. src/url.h: new scheme SCHEME_FTPS.
* src/wget.h: new FTPS error codes.
* src/metalink.h: support FTPS scheme.
* src/progress.c (create_image): progress only when in foreground
Sometimes I start wget, but the remote site is too slow, so I rather
want to run it in background, however when I simply use job control
for that, wget will keep spewing the progress bar all over my
terminal. I have found the SIGHUP/SIGUSR1 feature to redirect output
to a log file, but I think the following small patch is even more
useful, since the progress bar will simply resume when wget is
foregrounded again (also, the final message is still printed to the
terminal in any case):
* http.c (test_parse_range_header): New function to test the
function for parsing the HTTP/1.1 Content-Range header.
* test.[ch]: Same
* http.c (parse_content_range): Fix parsing code. Fail on scenarios
mentioned in rfc 7233.
* hsts.c (get_hsts_store_filename): Free the homedir value
(close_hsts_test_store): Actually free the store struct too
(test_hsts_new_entry): Pass store to close_hsts_test_store()
(test_hsts_url_rewrite_superdomain): Same
(test_hsts_url_rewrite_congruent): Same
(test_hsts_read_database): Same and homedir and store filename
* http.c (test_parse_content_disposition): Free the returned
filename
* url.c (test_append_uri_pathel): Free allocated string