* src/http.c (resp_new): Replace \r\n by space in continuation lines
Fixes#53763
"Malicious website can write arbitrary cookie entries to cookie jar"
HTTP header parsing left the \r\n from continuation line intact.
The Set-Cookie code didn't check and could be tricked to write
\r\n into the cookie jar, allowing a server to generate cookies at will.
* http.c(gethttp): In case of a 416 response, try to drain the socket of
any bytes before reusing the connection
Reported-By: Iru Cai <mytbk920423@gmail.com>
* src/http.c(gethttp): When Encoding is gzip, ensure that the
Content-Type Header was actually seen. Without this, the "type" variable
is null causing a Segfault.
Reported-By: Noël Köthe <noel@debian.org>
* src/http.c (skip_short_body): Return error on negative chunk size
Reported-by: Antti Levomäki, Christian Jalio, Joonas Pihlaja from Forcepoint
Reported-by: Juhani Eronen from Finnish National Cyber Security Centre
* src/http.c (gethttp): Move 304 code before --adjust-extension code
This fixes applying --adjust-extension in combination with 304
HTTP responses. It could lead to .html extensions to arbitrary
files.
Reported-by: anfractuosity
There seemed to be a copy&paste error in http.c code, which decides
whether to get credentials from .netrc. In ftp.c "user" and "pass"
variables are char*, while in http.c, these are char**. For this reason
they should be dereferenced when determining if password and user login
is set to some value.
Also since both variables are dereferenced on lines above the changed
code, it does not really make sense to check if they are NULL.
This patch is based on fix from Bruce Jerrick <bmj001@gmail.com>.
Fedora bug: https://bugzilla.redhat.com/show_bug.cgi?id=1425097
Signed-off-by: Tomas Hozza <thozza@redhat.com>
* src/utils.h: Add struct file_stat_s declaration,
change prototypes of file_exists_p(),
add prototypes for fopen_stat() and open_stat().
* src/utils.c: Extend file_exists_p(),
new function fopen_stat() and open_stat(),
add new param for file_exists_p().
* src/init.h: Add param file_stats_t to run_wgetrc().
* src/ftp.c: Amend calls to extended functions.
* src/hsts.c: Likewise.
* src/http.c: Likewise.
* src/init.c: Likewise.
* src/main.c: Likewise.
* src/metalink.c: Likewise.
* src/retr.c: Likewise.
* src/url.c: Likewise.
Added fopen_stat() and open_stat() that checks to makes sure the file didn't
change underneath us.
Return error from file_exists_p().
Added a way to return error from this file without major surgery to the
callers.
Fixes: #20369
* src/http.c (gethttp): Move 504 handling to correct place.
(http_loop): Fix memeory leak.
* testenv/server/http/http_server.py: Add Content-Length header on non-2xx
status codes with a body
Reported-by: Adam Sampson
* doc/wget.texi: Add description for --retry-on-http-error
* src/http.c (gethttp):
Consider given HTTP response codes as non-fatal, transient errors.
Supply a comma-separated list of 3-digit HTTP response codes as
argument. Useful to work around special circumstances where retries
are required, but the server responds with an error code normally not
retried by Wget. Such errors might be 503 (Service Unavailable) and
429 (Too Many Requests). Retries enabled by this option are performed
subject to the normal retry timing and retry count limitations of
Wget.
Using this option is intended to support special use cases only and is
generally not recommended, as it can force retries even in cases where
the server is actually trying to decrease its load. Please use it
wisely and only if you know what you are doing.
Example use and a starting point for manual testing:
wget --retry-on-http-error=429,503 http://httpbin.org/status/503
* src/ftp.c (getftp): Fix password/user selection
* src/http.c (initialize_request): Likewise
Before, netrc password won over interactive
--ask-password but now --ask-password wins
after change of program logic
Fixes Issue #48811
* src/http-ntlm.c: Rename base64_{encode,decode}
* src/http.c: Likewise
* src/utils.c: Likewise
* src/utils.h: Likewise
When statically linking with gnutls, we get definition clash error for
base64_encode which is also defined by gnutls.
To prevent definition clash, rename base64_{encode,decode}
Signed-off-by: Rahul Bedarkar <rahul.bedarkar@imgtec.com>
* src/http.c (metalink_from_http): Process the Content-Type header.
Add an application/metalink4+xml URL as metalink metaurl. If the
option opt.content_disposition is true, the Content-Disposition's
filename is the metaurl's name
* doc/wget.texi: Update --content-disposition and --metalink-over-http
* doc/metalink-standard.txt: Update doc. Content-Type/Disposition
processing through --metalink-over-http. Update download naming
system about --trust-server-names and --content-disposition
* testenv/Makefile.am: Add new files
* testenv/Test-metalink-http-xml-type.py: New file. Metalink/HTTP
Content-Type/Disposition header automated Metalink/XML tests
* testenv/Test-metalink-http-xml-type-trust.py: New file. Metalink/HTTP
Content-Type/Disposition header with --trust-server-names automated
Metalink/XML tests
* testenv/Test-metalink-http-xml-type-content.py: New file. Metalink/HTTP
Content-Type/Disposition header with --content-disposition automated
Metalink/XML tests
* testenv/Test-metalink-http-xml-type-trust-content.py: New file.
Metalink/HTTP Content-Type/Disposition header with --trust-server-names
and --content-disposition automated Metalink/XML tests
Process the Content-Type header, identify an application/metalink4+xml
file. The Content-Disposition could provide an alternate name through
the "filename" field for the metalink xml file. Respectively, the cli
options --metalink-over-http and --content-disposition are required.
When Metalink/XML auto-processing, to use the Content-Disposition's
filename, the cli option --trust-server-names is also required.
* src/http.c (http_loop): Prevent SIGSEGV when hstat.local_file is
NULL, opt.content_disposition has a role in leaving the value unset
* src/http.c (gethttp): If hs->local_file is NULL (aka http_loop()'s
hstat.local_file), set it to the value of hs->metalink->origin
* src/http.c (metalink_from_http): Fix hash_bin_len type. Use ssize_t
instead than size_t. Reject -1 as base64_decode() return value
* testenv/Makefile.am: Add new file
* testenv/Test-metalink-http-baddigest.py: New file. Metalink/HTTP
malformed base64 Digest header tests
On malformed base64 input, ssize_t base64_decode() returns -1. Such
value is too big for a size_t variable, and used as xmalloc() value
will exaust all the memory.
* src/http.c (metalink_from_http): Parse Metalink/HTTP header for
metaurls application/metalink4+xml media types
* src/metalink.h: Add function declaration metalink_meta_cmp()
* src/metalink.c: Add function metalink_meta_cmp() compare metalink
metaurls priorities
Add Metalink/HTTP application/metalink4+xml media types as metaurls to
the metalink variable that will be used to download the files.
* src/http.c: Add const to first param of initialize_request(),
initialize_proxy_configuration(), establish_connection(),
check_file_output(), check_auth(), gethttp(), http_loop().
* src/http.h: Add const to first param of http_loop().
* src/connect.c (connect_to_ip): Check return value of setsockopt.
* src/ftp.c (ftp_retrieve_list): Check return value of chmod.
* src/http.c (digest_authentication_encode): Cleanup code.
* src/init.c (setval_internal): Explicitely check comind range.
* src/main.c (main): Explicitely check optarg.
* src/retr.c (retr_rate): Use snprintf instead sprintf,
(retrieve_from_file): More verbose error message,
(rotate_backups): Use snprintf instead sprintf, check return
value of rename().
* src/url.c (mkalldirs): Check return value of unlink().
* src/utils.c (strdupdelim): Explicitely check beg and end for NULL,
(merge_vecs): Fix sizeof argument to char *,
(stable_sort): Use malloc instead of alloca.
* src/cookies.c (cookie_header): Use heap instead of stack.
* src/http.c (request_send): Likewise.
If wget has to handle an insanely large amount of cookies (~700,000 on
32 bit systems or ~530,000 on 64 bit systems), the stack is not large
enough to hold these pointers, leading to undefined behaviour according
to POSIX; expect a segmentation fault in real life. ;)
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
* configure.ac: Check for xattr availability
* src/Makefile.am: Add xattr.c
* src/ftp.c: Include xattr.h.
(getftp): Set attributes if enabled.
* src/http.c: Include xattr.h.
(gethttp): Add parameter 'original_url',
set attributes if enabled.
(http_loop): Add 'original_url' to call of gethttp().
* src/init.c: Add new option --xattr.
* src/main.c: Add new option --xattr, add description to help text.
* src/options.h: Add new config member 'enable_xattr'.
* src/xatrr.c: New file.
* src/xattr.h: New file.
These attributes provide a lightweight method of later determining
where a file was downloaded from.
This patch changes:
* autoconf detects whether extended attributes are available and
enables the code if they are.
* The new flags --xattr and --no-xattr control whether xattr is enabled.
* The new command "xattr = (on|off)" can be used in ~/.wgetrc or /etc/wgetrc
* The original and redirected URLs are recorded as shown below.
* This works for both single fetches and recursive mode.
The attributes that are set are:
user.xdg.origin.url: The URL that the content was fetched from.
user.xdg.referrer.url: The URL that was originally requested.
Here is an example, where http://archive.org redirects to https://archive.org:
$ wget --xattr http://archive.org
...
$ getfattr -d index.html
user.xdg.origin.url="https://archive.org/"
user.xdg.referrer.url="http://archive.org/"
These attributes were chosen based on those stored by Google Chrome
https://bugs.chromium.org/p/chromium/issues/detail?id=45903
and curl https://github.com/curl/curl/blob/master/src/tool_xattr.c
* src/http.c (initialize_request): Fix wrong params to search_netrc()
Regression introduced in commit 29850e77
Reported-by: Axel Reinhold <axel@freakout.de>
* src/http.c (gethttp,http_loop):
Do not download/save file on error when --spider is enabled and not
working recursive.
Reported-by: Сковорода Никита Андреевич chalkerx@gmail.comFixes#45821
* http.c (digest_authentication_encode): Wget already errors out if
qop != "auth". Then it makes no sense to test for qop == "auth-int"
later on. Currently, Wget does not support the "auth-int" qop value
and till nobidy requests, it may remain so.
* http.c (digest_authentication_encode): Some servers are still
using the obsolete RFC 2069 Digest Authentication. Allow Digest
authentication without the qop parameter for this.
Reported-by: Andreas Longwitz <longwitz@incore.de>
* doc/wget.texi: updated documentation to reflect the new FTPS functionality.
* src/ftp-basic.c (ftp_greeting): new function to read the server's greeting.
(ftp_login): greeting code was previously here. Moved to ftp_greeting to
support FTPS implicit mode.
(ftp_auth): wrapper around the AUTH TLS command.
(ftp_ccc): wrapper around the CCC command.
(ftp_pbsz): wrapper around the PBSZ command.
(ftp_prot): wraooer around the PROT command.
* src/ftp.c (get_ftp_greeting): new static function.
(init_control_ssl_connection): new static function to start SSL/TLS on the
control channel.
(getftp): added hooks to support FTPS commands (RFCs 2228 and 4217).
(ftp_loop_internal): test for new FTPS error codes.
* src/ftp.h: new enum 'prot_level' with available FTPS protection levels +
prototypes of previous functions. New flag for enum 'wget_ftp_fstatus' to track
whether the data channel has some security mechanism enabled or not.
* src/gnutls.c (struct wgnutls_transport_context): new field 'session_data'.
(wgnutls_close): free GnuTLS session data before exiting.
(ssl_connect_wget): save/resume SSL/TLS session.
* src/http.c (establish_connection): refactor ssl_connect_wget call.
(metalink_from_http): take into account SCHEME_FTPS as well.
* src/init.c, src/main.c, src/options.h: new command line/wgetrc options.
(main): in recursive downloads, check for SCHEME_FTPS as well.
* src/openssl.c (struct openssl_transport_context): new field 'sess'.
(ssl_connect_wget): save/resume SSL/TLS session.
* src/retr.c (retrieve_url): check new scheme SCHEME_FTPS.
* src/ssl.h (ssl_connect_wget): refactor. New parameter of type 'int *'.
* src/url.c. src/url.h: new scheme SCHEME_FTPS.
* src/wget.h: new FTPS error codes.
* src/metalink.h: support FTPS scheme.
* http.c (test_parse_range_header): New function to test the
function for parsing the HTTP/1.1 Content-Range header.
* test.[ch]: Same
* http.c (parse_content_range): Fix parsing code. Fail on scenarios
mentioned in rfc 7233.
* hsts.c (get_hsts_store_filename): Free the homedir value
(close_hsts_test_store): Actually free the store struct too
(test_hsts_new_entry): Pass store to close_hsts_test_store()
(test_hsts_url_rewrite_superdomain): Same
(test_hsts_url_rewrite_congruent): Same
(test_hsts_read_database): Same and homedir and store filename
* http.c (test_parse_content_disposition): Free the returned
filename
* url.c (test_append_uri_pathel): Free allocated string
* src/utils.[ch], src/http.c, src/metalink.c: Rename function
hex_to_string() to wg_hex_to_string sine it collides with a
similarly named function in OpenSSL Library.
* Makefile.am: Added new source files hsts.c and hsts.h.
* http.c (parse_strict_transport_security): new function for STS header
parsing.
(gethttp): update the HSTS store.
* http.h: new include "hsts.h".
* init.c: new options --hsts and --hsts-file.
* main.c (get_hsts_database, load_hsts, save_hsts): new functions.
New options --no-hsts and --hsts-file added to help.
(main): load and save HSTS store.
* options.h: new variables for supporting --hsts and --hsts-file.
* retr.c (retrieve_url): rewrite the URI according to the HSTS policy before
entering http_loop.
* test.c, test.h: new unit tests for HSTS.
* utils.c, utils.h (countchars): new function.
* wget.h: new preprocessor check.
* hsts.c, hsts.h: new files with the HSTS engine implementation.
Added support for HTTP Strict Transport Security (HSTS), as defined by RFC
6797.
* src/http.c: Move find_key_value, has_key, find_key_values.
* src/metalink.c: To here.
* src/metalink.h: Make them non-static and add prototypes here.
* src/wget.h: Add IF_MODIFIED_SINCE enum for dt. Add TIMECONV_ERR
enum to uerr_t.
* src/http.c (time_to_rfc1123): Convert time_t do http time.
* src/http.c (initialize_request): Include If-Modified-Since header
if appropriate.
* src/http.c (set_file_timestamp): Separate this code from check_file_output.
* src/http.c (check_file_output): Use set_file_timestamp.
* src/http.c (gethttp): Handle properly 304 return code and 200 if server
ignores If-Modified-Since headers.
* src/http.c (http_loop): Load filename to hstat if condget was requested,
use IF_MODIFIED_SINCE if requested and current timestamp can be obtained.
src/http.c (parse_content_disposition): stores filename* and filename
separately and choses filename* if available.
(test_parse_content_disposition): added new tests.
* src/http.c (resp_free): Change the semantics of this function.
(request_free): Change the semantics of this function.
(initialize_request): Adjust request_free call.
(establish_connection): Adjust request_free, resp_free calls.
(gethttp): Adjust request_free, resp_free calls.
* src/http.c (establish_connection): Do not free request here (it is
* never allocated here).
* src/http.c (gethttp): Free request before returning if error in
* establish_connection encountered.
* src/http.c: Log --content-on-error downloads.
* src/retr.c (retrieve_url): Register the download of an error page
when --content-on-error is specified.
MIN and MAx are macros that a developer will universally expect
throughout the source. Yet, they were being defined in multiple places
across the source. Instead, define them in a single location in the
common wget.h header file and use them consistently everywhere.
This patch also adds support for multiple challenges per
WWW-Authenticate header line.
The test Test-auth-both.py now succeeds and thus is taken away
from XFAIL_TESTS (expected to fail tests).