* src/init.c (defaults): Set enable_xattr to false by default
* src/main.c (print_help): Reverse option logic of --xattr
* doc/wget.texi: Add description for --xattr
Users may not be aware that the origin URL and Referer are saved,
including credentials and possibly access tokens embedded in
the URLs.
This reverts commit 6f3b995993.
The code is obviously wrong, see https://savannah.gnu.org/bugs/?54963
Also, the example from the original post doesn't work any more.
In other words, the broken server behavior has been fixed in the meantime.
* src/openssl.c: Check for OPENSSL_NO_ENGINE before
including openssl/engine.h and before calling ENGINE_load_builtin_engines()
Fixes compilation with no engines compiled.
Copyright-paperwork-exempt: Yes
Signed-off-by: Rosen Penev <rosenp@gmail.com>
The current sufmatch does not match when the domain is dot-prefixed.
The no_proxy example in the man page (.mit.edu) does use a dot-prefixed
domain.
Signed-off-by: Luiz Angelo Daros de Luca <luizluca@gmail.com>
Copyright-paperwork-exempt: Yes
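
The dot-prefix case described above can be illustrated with a minimal
sketch. The helper name domain_suffix_match() is hypothetical and this is
not Wget's sufmatch() code; it also assumes that ".mit.edu" should match
both "mit.edu" and its subdomains.

```c
#include <stdbool.h>
#include <string.h>
#include <strings.h>

/* Hypothetical helper, not Wget's sufmatch(): return true when HOST equals
   PATTERN or is a subdomain of it.  A leading dot in PATTERN (".mit.edu")
   is skipped before comparing.  */
bool
domain_suffix_match (const char *pattern, const char *host)
{
  size_t plen, hlen;

  if (*pattern == '.')
    pattern++;                  /* ".mit.edu" -> "mit.edu" */

  plen = strlen (pattern);
  hlen = strlen (host);
  if (plen == 0 || plen > hlen)
    return false;

  /* The tail of HOST must equal PATTERN, ignoring case.  */
  if (strcasecmp (host + (hlen - plen), pattern) != 0)
    return false;

  /* Accept an exact match, or a match that starts at a label boundary.  */
  return hlen == plen || host[hlen - plen - 1] == '.';
}
```

The label-boundary check keeps "notmit.edu" from matching "mit.edu".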
* configure.ac: Check for libpcre2-8
* src/init.c (choices): Test for HAVE_LIBPCRE2
* src/main.c (main): Set regex compile and match functions
* src/options.h: Test for HAVE_LIBPCRE2
* src/utils.c: Include pcre2.h, add functions
compile_pcre2_regex() and match_pcre2_regex()
* src/utils.h: Declare compile_pcre2_regex() and match_pcre2_regex()
Fixes #54677
Reported-by: Noël Köthe
* doc/wget.texi: Add "TLSv1_3" to --secure-protocol
* src/gnutls.c (set_prio_default): Use GNUTLS_TLS1_3 where needed
Wget currently allows specifying "TLSv1_3" as the parameter for
--secure-protocol option. However it is only implemented for OpenSSL
and in case wget is compiled with GnuTLS, it causes wget to abort with:
GnuTLS: unimplemented 'secure-protocol' option value 6
GnuTLS has contained a TLS 1.3 implementation since version 3.6.3 [1].
However, currently the application must enable it explicitly for it to be
used. This will change after the draft is finalized [2]. For the time
being, I enabled it explicitly in case "TLSv1_3" is used with
--secure-protocol.
I also fixed the man page to include "TLSv1_3" in all listings of available
parameters for --secure-protocol.
[1] https://lists.gnupg.org/pipermail/gnutls-devel/2018-July/008584.html
[2] https://nikmav.blogspot.com/2018/05/gnutls-and-tls-13.html
Signed-off-by: Tomas Hozza <thozza@redhat.com>
Error: RESOURCE_LEAK (CWE-772): - REAL ERROR
wget-1.19.5/src/warc.c:1376: alloc_fn: Storage is returned from allocation function "url_escape".
wget-1.19.5/src/url.c:284:3: alloc_fn: Storage is returned from allocation function "url_escape_1".
wget-1.19.5/src/url.c:255:3: alloc_fn: Storage is returned from allocation function "xmalloc".
wget-1.19.5/lib/xmalloc.c:41:11: alloc_fn: Storage is returned from allocation function "malloc".
wget-1.19.5/lib/xmalloc.c:41:11: var_assign: Assigning: "p" = "malloc(n)".
wget-1.19.5/lib/xmalloc.c:44:3: return_alloc: Returning allocated memory "p".
wget-1.19.5/src/url.c:255:3: var_assign: Assigning: "newstr" = "xmalloc(newlen + 1)".
wget-1.19.5/src/url.c:258:3: var_assign: Assigning: "p2" = "newstr".
wget-1.19.5/src/url.c:275:3: return_alloc: Returning allocated memory "newstr".
wget-1.19.5/src/url.c:284:3: return_alloc_fn: Directly returning storage allocated by "url_escape_1".
wget-1.19.5/src/warc.c:1376: var_assign: Assigning: "redirect_location" = storage returned from "url_escape(redirect_location)".
wget-1.19.5/src/warc.c:1381: noescape: Resource "redirect_location" is not freed or pointed-to in "fprintf".
wget-1.19.5/src/warc.c:1387: leaked_storage: Returning without freeing "redirect_location" leaks the storage that it points to.
\# 1385| fflush (warc_current_cdx_file);
\# 1386|
\# 1387|-> return true;
\# 1388| }
\# 1389|
url_escape() really does return newly allocated memory, which leaks when warc_write_cdx_record() returns. In other parts of the project, the memory returned from url_escape() is usually stored in a temporary variable and then freed. I took the same approach.
Signed-off-by: Tomas Hozza <thozza@redhat.com>
In the warc_write_start_record() function, the return value of dup() is
directly used in the gzdopen() call and not stored anywhere. However, the
zlib documentation says that "The duplicated descriptor should be saved
to avoid a leak, since gzdopen does not close fd if it fails." [1].
This change stores the FD in a variable and closes it in case gzdopen()
fails.
[1] https://www.zlib.net/manual.html
Error: RESOURCE_LEAK (CWE-772):
wget-1.19.5/src/warc.c:217: open_fn: Returning handle opened by "dup".
wget-1.19.5/src/warc.c:217: leaked_handle: Failing to save or close handle opened by "dup(fileno(warc_current_file))" leaks it.
\# 215|
\# 216| /* Start a new GZIP stream. */
\# 217|-> warc_current_gzfile = gzdopen (dup (fileno (warc_current_file)), "wb9");
\# 218| warc_current_gzfile_uncompressed_size = 0;
\# 219|
Signed-off-by: Tomas Hozza <thozza@redhat.com>
Error: RESOURCE_LEAK (CWE-772):
wget-1.19.5/src/utils.c:914: open_fn: Returning handle opened by "open". [Note: The source code implementation of the function has been overridden by a user model.]
wget-1.19.5/src/utils.c:914: var_assign: Assigning: "fd" = handle returned from "open(fname, flags, mode)".
wget-1.19.5/src/utils.c:921: noescape: Resource "fd" is not freed or pointed-to in "fstat". [Note: The source code implementation of the function has been overridden by a builtin model.]
wget-1.19.5/src/utils.c:924: leaked_handle: Handle variable "fd" going out of scope leaks the handle.
\# 922| {
\# 923| logprintf (LOG_NOTQUIET, _("Failed to stat file %s, error: %s\n"), fname, strerror(errno));
\# 924|-> return -1;
\# 925| }
\# 926| #if !(defined(WINDOWS) || defined(__VMS))
This seems to be a real issue, since the opened file descriptor in "fd"
would leak. There is also an additional check below the "fstat" call, which
does close the opened "fd".
Signed-off-by: Tomas Hozza <thozza@redhat.com>
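
The pattern behind this fix, closing the descriptor on every error path
after a successful open(), can be sketched as follows. The helper name
open_and_stat() is hypothetical and the code is simplified compared to the
function in src/utils.c.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: open FNAME read-only and stat it.  On any failure
   after open() succeeded, close the descriptor before returning -1 so that
   it cannot leak.  On success the caller owns the returned descriptor.  */
int
open_and_stat (const char *fname, struct stat *st)
{
  int fd = open (fname, O_RDONLY);
  if (fd < 0)
    return -1;

  if (fstat (fd, st) < 0)
    {
      close (fd);               /* the step the leaking path was missing */
      return -1;
    }

  return fd;
}
```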
Error: RESOURCE_LEAK (CWE-772):
wget-1.19.5/src/http.c:4486: alloc_fn: Storage is returned from allocation function "url_string".
wget-1.19.5/src/url.c:2248:3: alloc_fn: Storage is returned from allocation function "xmalloc".
wget-1.19.5/lib/xmalloc.c:41:11: alloc_fn: Storage is returned from allocation function "malloc".
wget-1.19.5/lib/xmalloc.c:41:11: var_assign: Assigning: "p" = "malloc(n)".
wget-1.19.5/lib/xmalloc.c:44:3: return_alloc: Returning allocated memory "p".
wget-1.19.5/src/url.c:2248:3: var_assign: Assigning: "result" = "xmalloc(size)".
wget-1.19.5/src/url.c:2248:3: var_assign: Assigning: "p" = "result".
wget-1.19.5/src/url.c:2250:3: noescape: Resource "p" is not freed or pointed-to in function "memcpy". [Note: The source code implementation of the function has been overridden by a builtin model.]
wget-1.19.5/src/url.c:2253:7: noescape: Resource "p" is not freed or pointed-to in function "memcpy". [Note: The source code implementation of the function has been overridden by a builtin model.]
wget-1.19.5/src/url.c:2257:11: noescape: Resource "p" is not freed or pointed-to in function "memcpy". [Note: The source code implementation of the function has been overridden by a builtin model.]
wget-1.19.5/src/url.c:2264:3: noescape: Resource "p" is not freed or pointed-to in function "memcpy". [Note: The source code implementation of the function has been overridden by a builtin model.]
wget-1.19.5/src/url.c:2270:7: identity_transfer: Passing "p" as argument 1 to function "number_to_string", which returns an offset off that argument.
wget-1.19.5/src/utils.c:1776:11: var_assign_parm: Assigning: "p" = "buffer".
wget-1.19.5/src/utils.c:1847:3: return_var: Returning "p", which is a copy of a parameter.
wget-1.19.5/src/url.c:2270:7: noescape: Resource "p" is not freed or pointed-to in function "number_to_string".
wget-1.19.5/src/utils.c:1774:25: noescape: "number_to_string(char *, wgint)" does not free or save its parameter "buffer".
wget-1.19.5/src/url.c:2270:7: var_assign: Assigning: "p" = "number_to_string(p, url->port)".
wget-1.19.5/src/url.c:2273:3: noescape: Resource "p" is not freed or pointed-to in function "full_path_write".
wget-1.19.5/src/url.c:1078:47: noescape: "full_path_write(struct url const *, char *)" does not free or save its parameter "where".
wget-1.19.5/src/url.c:2287:3: return_alloc: Returning allocated memory "result".
wget-1.19.5/src/http.c:4486: var_assign: Assigning: "hurl" = storage returned from "url_string(u, URL_AUTH_HIDE_PASSWD)".
wget-1.19.5/src/http.c:4487: noescape: Resource "hurl" is not freed or pointed-to in "logprintf".
wget-1.19.5/src/http.c:4513: leaked_storage: Variable "hurl" going out of scope leaks the storage it points to.
\# 4511| {
\# 4512| printwhat (count, opt.ntry);
\# 4513|-> continue;
\# 4514| }
\# 4515| else
There are two conditional branches that call continue without freeing the memory potentially allocated and pointed to by the "hurl" pointer. In case "!opt.verbose" is true and one of the conditions in the following if/else if construction in which "continue" is called is also true, the memory allocated to "hurl" will leak.
Signed-off-by: Tomas Hozza <thozza@redhat.com>
Error: RESOURCE_LEAK (CWE-772):
wget-1.19.5/src/http.c:2434: alloc_fn: Storage is returned from allocation function "xmalloc".
wget-1.19.5/lib/xmalloc.c:41:11: alloc_fn: Storage is returned from allocation function "malloc".
wget-1.19.5/lib/xmalloc.c:41:11: var_assign: Assigning: "p" = "malloc(n)".
wget-1.19.5/lib/xmalloc.c:44:3: return_alloc: Returning allocated memory "p".
wget-1.19.5/src/http.c:2434: var_assign: Assigning: "auth_stat" = storage returned from "xmalloc(4UL)".
wget-1.19.5/src/http.c:2446: noescape: Resource "auth_stat" is not freed or pointed-to in "create_authorization_line".
wget-1.19.5/src/http.c:5203:70: noescape: "create_authorization_line(char const *, char const *, char const *, char const *, char const *, _Bool *, uerr_t *)" does not free or save its parameter "auth_err".
wget-1.19.5/src/http.c:2476: leaked_storage: Variable "auth_stat" going out of scope leaks the storage it points to.
\# 2474| /* Creating the Authorization header went wrong */
\# 2475| }
\# 2476|-> }
\# 2477| else
\# 2478| {
Error: RESOURCE_LEAK (CWE-772):
wget-1.19.5/src/http.c:2431: alloc_fn: Storage is returned from allocation function "url_full_path".
wget-1.19.5/src/url.c:1105:19: alloc_fn: Storage is returned from allocation function "xmalloc".
wget-1.19.5/lib/xmalloc.c:41:11: alloc_fn: Storage is returned from allocation function "malloc".
wget-1.19.5/lib/xmalloc.c:41:11: var_assign: Assigning: "p" = "malloc(n)".
wget-1.19.5/lib/xmalloc.c:44:3: return_alloc: Returning allocated memory "p".
wget-1.19.5/src/url.c:1105:19: var_assign: Assigning: "full_path" = "xmalloc(length + 1)".
wget-1.19.5/src/url.c:1107:3: noescape: Resource "full_path" is not freed or pointed-to in function "full_path_write".
wget-1.19.5/src/url.c:1078:47: noescape: "full_path_write(struct url const *, char *)" does not free or save its parameter "where".
wget-1.19.5/src/url.c:1110:3: return_alloc: Returning allocated memory "full_path".
wget-1.19.5/src/http.c:2431: var_assign: Assigning: "pth" = storage returned from "url_full_path(u)".
wget-1.19.5/src/http.c:2446: noescape: Resource "pth" is not freed or pointed-to in "create_authorization_line".
wget-1.19.5/src/http.c:5203:40: noescape: "create_authorization_line(char const *, char const *, char const *, char const *, char const *, _Bool *, uerr_t *)" does not free or save its parameter "path".
wget-1.19.5/src/http.c:2476: leaked_storage: Variable "pth" going out of scope leaks the storage it points to.
\# 2474| /* Creating the Authorization header went wrong */
\# 2475| }
\# 2476|-> }
\# 2477| else
\# 2478| {
Both "pth" and "auth_stat" are allocated in "check_auth()" function. These are used for creating the HTTP Authorization Request header via "create_authorization_line()" function. In case the creation went OK (auth_err == RETROK), then the memory previously allocated to "pth" and "auth_stat" is freed. However if the creation failed, then the memory is never freed and it leaks.
Signed-off-by: Tomas Hozza <thozza@redhat.com>
Error: RESOURCE_LEAK (CWE-772):
wget-1.19.5/src/ftp.c:1493: alloc_fn: Storage is returned from allocation function "fopen".
wget-1.19.5/src/ftp.c:1493: var_assign: Assigning: "fp" = storage returned from "fopen(con->target, "wb")".
wget-1.19.5/src/ftp.c:1811: leaked_storage: Variable "fp" going out of scope leaks the storage it points to.
\# 1809| if (fp && !output_stream)
\# 1810| fclose (fp);
\# 1811|-> return err;
\# 1812| }
\# 1813|
It can happen that "if (!output_stream || con->cmd & DO_LIST)" on line #1398 is true even though "output_stream != NULL". In this case a new file is opened to "fp". Later, in the FTPS branch, some error may occur and the code will jump to the label "exit_error". In "exit_error", "fp" is closed only if "output_stream == NULL". However, as described earlier, this may not be true, and "fp" leaks.
On line #1588, there is the following conditional free of "fp":
/* Close the local file. */
if (!output_stream || con->cmd & DO_LIST)
fclose (fp);
Therefore the conditional at the end of the function after "exit_error" label should be modified to:
if (fp && (!output_stream || con->cmd & DO_LIST))
fclose (fp);
This will ensure that "fp" does not leak in any case in which it could be opened.
Signed-off-by: Tomas Hozza <thozza@redhat.com>
* src/http.c (resp_new): Replace \r\n by space in continuation lines
Fixes #53763
"Malicious website can write arbitrary cookie entries to cookie jar"
HTTP header parsing left the \r\n from continuation line intact.
The Set-Cookie code didn't check and could be tricked to write
\r\n into the cookie jar, allowing a server to generate cookies at will.
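
A minimal sketch of the idea follows, assuming the header block is held in
a writable NUL-terminated buffer. The helper name
unfold_continuation_lines() is hypothetical; this is not the code in
resp_new().

```c
/* Hypothetical helper: in the NUL-terminated header block HEAD, overwrite the
   line break of every continuation line (CRLF or bare LF followed by a space
   or tab) with spaces, so that no header value keeps an embedded line
   break.  The buffer length does not change.  */
void
unfold_continuation_lines (char *head)
{
  for (char *p = head; *p; p++)
    {
      if (p[0] == '\r' && p[1] == '\n' && (p[2] == ' ' || p[2] == '\t'))
        p[0] = p[1] = ' ';
      else if (p[0] == '\n' && (p[1] == ' ' || p[1] == '\t'))
        p[0] = ' ';
    }
}
```

Line breaks that terminate a header (those not followed by a space or tab)
are left alone, so header boundaries are preserved.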
* src/openssl.c (init_prng): keep gathering entropy even though we
already have enough
(ssl_connect_with_timeout_callback): reseed PRNG again just before
the handshake
Reported-by: Jeffrey Walton <noloader@gmail.com>
This commit hardens SSL/TLS a bit more in the following ways:
* Explicitly exclude NULL authentication and the 'MEDIUM' cipher list
category. Ciphers in the 'HIGH' level are only considered - this
includes all symmetric ciphers with key lengths larger than 128 bits,
and some ('modern') 128-bit ciphers, such as AES in GCM mode.
* Allow RSA key exchange by default, but exclude it when
Perfect Forward Secrecy is desired (with --secure-protocol=PFS).
* Introduce new option --ciphers to set the cipher list that the SSL/TLS
engine will favor. This string is fed directly to the underlying TLS
library (GnuTLS or OpenSSL) without further processing, and hence its
format and syntax are directly dependent on the specific library.
Reported-by: Jeffrey Walton <noloader@gmail.com>
* src/css-url.c (get_uri_string): Check input length
* fuzz/wget_css_fuzzer.repro/buffer-overflow-6600180399865856:
Add reproducer corpus
Fixes OSS-Fuzz issue #8033.
This is a long standing bug affecting all versions <= 1.19.4.
* src/css-url.c (get_urls_css): Check input string length
* fuzz/wget_css_fuzzer.repro/negative-size-param-5724866467594240:
Add reproducer corpus
Fixes OSS-Fuzz issue #8032.
This is a long standing bug affecting all versions <= 1.19.4.
* src/css-tokens.h: Add enums and fixate values
* src/css.l: Include config.h,
ignore several compiler warnings,
update the grammar to CSS 2.2
Fixes OSS-Fuzz issue #8010 (slowness issue).
This is a long standing bug affecting all versions <= 1.19.4.
Some crafted CSS input was extremely slow / CPU wasting, so it could
be used as a DOS attack against website scanning.
The code/grammar changes were backported from Wget2.x.
* fuzz/Makefile.am: Add wget_ftpls_fuzzer
* fuzz/wget_ftpls_fuzzer.c: New fuzzer
* fuzz/wget_ftpls_fuzzer.dict: Fuzzer dictionary
* fuzz/wget_ftpls_fuzzer.in/starter: Starting corpus
* src/ftp-ls.c: Parsing function takes FILE * as argument,
new function ftp_parse_ls_fp()
* src/ftp.c: Remove static from freefileinfo()
* src/ftp.h: Add ftp_parse_ls_fp() and freefileinfo()
* fuzz/Makefile.am: Add wget_html_fuzzer
* fuzz/wget_html_fuzzer.c: New fuzzer
* fuzz/wget_html_fuzzer.dict: HTML dictionary for fuzzing
* fuzz/wget_html_fuzzer.in: Initial corpora
* src/html-url.c: Add new function get_urls_html_fm()
* src/html-url.h: Add new function get_urls_html_fm()
* src/wget.h: Fix define for fopen_wgetrc()
* fuzz/wget_options_fuzzer.c: Add fopen_wget() and fopen_wgetrc()
* src/utils.c: Use fopen_wgetrc() for config files,
don't read from stdin when fuzzing
* src/wget.h: Define fopen as fopen_wget when fuzzing,
define fopen_wgetrc as fopen when not fuzzing
* src/ftp.c (ftp_loop_internal): Set warc_tmp to NULL after ffclose()
* src/init.c (cleanup): Set output_stream to NULL after fclose()
* src/log.c (log_close): Set global stream vars to NULL after closing
* src/recur.c (retrieve_tree): Set rejectedlog to NULL after closing
* src/warc.c (warc_close): Set stream vars to NULL after closing
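
The close-then-reset pattern used in these changes can be sketched as
below. The helper name close_and_clear() is hypothetical; the actual
changes reset each variable individually. Presumably the point is that a
repeated cleanup pass must not touch a stale FILE pointer.

```c
#include <stdio.h>

/* Hypothetical helper: close *FP if it is open and reset it to NULL, so a
   second cleanup pass (or a later NULL check) never sees a stale pointer.  */
void
close_and_clear (FILE **fp)
{
  if (*fp)
    {
      fclose (*fp);
      *fp = NULL;
    }
}
```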
* Makefile.am: Add fuzz/ to SUBDIRS
* cfg.mk: Fix 'make syntax-check'
* configure.ac: Add --enable-fuzzing
* fuzz/Makefile.am: New file
* fuzz/README.md: New file
* fuzz/fuzzer.h: New file
* fuzz/get_all_corpora: New file
* fuzz/get_ossfuzz_corpora: New file
* fuzz/glob_crash.c: New file
* fuzz/main.c: New file
* fuzz/run-afl.sh: New file
* fuzz/run-clang.sh: New file
* fuzz/view-coverage.sh: New file
* fuzz/wget_options_fuzzer.c: New file
* fuzz/wget_options_fuzzer.dict: New file
* src/init.c (cleanup): Free more resources
* src/main.c (init_switches): Initialize only once,
(print_usage): Don't print if TESTING is defined
* src/utils.h: Include wget.h
Gzip compression has a number of bugs which need to be ironed out before
we can support it by default. Some of these stem from a misunderstanding
of the HTTP spec, but a lot of them are also due to many web servers not
being compliant with RFC 7231.
With this commit, I am marking GZip compression support as experimental
in GNU Wget pending further investigation and the addition of tests.
* src/init.c (defaults): Switch off compression support by default
* docs/wget.texi: State that compression is experimental
* src/log.c (check_redirect_output): tcgetpgrp can return -1 (ENOTTY),
be sure to check whether a valid controlling terminal exists before
redirecting.
Fixes: #51181
* src/http.c (gethttp): In case of a 416 response, try to drain the socket of
any bytes before reusing the connection
Reported-By: Iru Cai <mytbk920423@gmail.com>
* src/http.c (gethttp): When Encoding is gzip, ensure that the
Content-Type Header was actually seen. Without this, the "type" variable
is null, causing a Segfault.
Reported-By: Noël Köthe <noel@debian.org>
* src/retr.c (fd_read_body): Stop processing on negative chunk size
Reported-by: Antti Levomäki, Christian Jalio, Joonas Pihlaja from Forcepoint
Reported-by: Juhani Eronen from Finnish National Cyber Security Centre
* src/http.c (skip_short_body): Return error on negative chunk size
Reported-by: Antti Levomäki, Christian Jalio, Joonas Pihlaja from Forcepoint
Reported-by: Juhani Eronen from Finnish National Cyber Security Centre
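
Both chunk-size fixes above come down to rejecting a negative size before
it is used as a length. A minimal sketch, assuming the size is parsed with
strtol(); the helper name parse_chunk_size() is hypothetical and is not
taken from src/retr.c or src/http.c.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: parse the hexadecimal chunk-size line LINE into *SIZE.
   Returns false for unparsable, out-of-range, or negative sizes; the caller
   should then stop reading the body instead of using the value as a
   length.  */
bool
parse_chunk_size (const char *line, long *size)
{
  char *end;
  long n;

  errno = 0;
  n = strtol (line, &end, 16);
  if (end == line || errno == ERANGE || n < 0)
    return false;

  *size = n;
  return true;
}
```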
* src/http.c (gethttp): Move 304 code before --adjust-extension code
This fixes applying --adjust-extension in combination with 304
HTTP responses. It could lead to .html extensions being added to
arbitrary files.
Reported-by: anfractuosity
Although the code internally uses an option for (not) reading .netrc for
credentials, it was not possible to turn this behavior off on the command
line. Note that it was possible to turn it off using wgetrc.
Idea for this change came from Bruce Jerrick (bmj001@gmail.com).
Reference: https://bugzilla.redhat.com/show_bug.cgi?id=1425097
Signed-off-by: Tomas Hozza <thozza@redhat.com>
There seemed to be a copy&paste error in http.c code, which decides
whether to get credentials from .netrc. In ftp.c "user" and "pass"
variables are char*, while in http.c, these are char**. For this reason
they should be dereferenced when determining if password and user login
is set to some value.
Also since both variables are dereferenced on lines above the changed
code, it does not really make sense to check if they are NULL.
This patch is based on fix from Bruce Jerrick <bmj001@gmail.com>.
Fedora bug: https://bugzilla.redhat.com/show_bug.cgi?id=1425097
Signed-off-by: Tomas Hozza <thozza@redhat.com>
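
The pointer-level mismatch described above can be sketched as follows.
The helper name need_netrc_lookup() is hypothetical, and the exact
condition used in http.c is an assumption here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper mirroring the http.c situation: USER and PASSWD are
   char ** (pointers to the caller's string variables), so the strings
   themselves are *USER and *PASSWD.  Testing USER or PASSWD alone would only
   test the address of the caller's variable, which is assumed non-NULL
   because it was already dereferenced earlier.  */
bool
need_netrc_lookup (char **user, char **passwd)
{
  return *user == NULL || *passwd == NULL;
}
```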
* src/main.c: The --secure-protocol option accepts also values TLSv1_1
and TLSv1_2, as mentioned in the man page. However the help message
doesn't mention these two values. This patch adds TLSv1_1 and TLSv1_2 as
possible values to the help message.
Signed-off-by: Tomas Hozza <thozza@redhat.com>
* src/url.c: Check iconv() against 0, not -1
On some libiconv implementations, unknown codepoints become
encoded as ?, e.g. when converting a non-ascii codepoint to ASCII.
This results in ambiguous file names, which also makes our tests fail.
* src/connect.c (connect_to_ip): Use xfree() instead of idn2_free()
* src/host.c (lookup_host): Use xfree() instead of idn2_free()
* src/iri.h: Do not include idn2.h
* src/url.c (url_free): Use xfree() instead of idn2_free()
* src/url.h (struct url): Remove 'idn_allocated' from struct
Reported-by: Gisle Vanem
* src/utils.h: Add struct file_stat_s declaration,
change prototypes of file_exists_p(),
add prototypes for fopen_stat() and open_stat().
* src/utils.c: Extend file_exists_p(),
new function fopen_stat() and open_stat(),
add new param for file_exists_p().
* src/init.h: Add param file_stats_t to run_wgetrc().
* src/ftp.c: Amend calls to extended functions.
* src/hsts.c: Likewise.
* src/http.c: Likewise.
* src/init.c: Likewise.
* src/main.c: Likewise.
* src/metalink.c: Likewise.
* src/retr.c: Likewise.
* src/url.c: Likewise.
Added fopen_stat() and open_stat(), which check to make sure the file didn't
change underneath us.
Return error from file_exists_p().
Added a way to return error from this file without major surgery to the
callers.
Fixes: #20369
* src/iri.c: Check for libidn2 < 0.14 to include libunistring headers
The unistring functions are used only when an older version of libidn2
is used, so don't include its headers with newer libidn2 versions either.
The WARC spec requires that all URIs be enclosed in angle brackets. This
was being done in most cases, but not for "WARC-Target-URI" fields in
WARC blocks of type "response", "resource", "revisit", and "metadata".
* src/http.c (gethttp): Move 504 handling to correct place.
(http_loop): Fix memory leak.
* testenv/server/http/http_server.py: Add Content-Length header on non-2xx
status codes with a body
Reported-by: Adam Sampson
In a non-ASCII environment, the local path may contain non-ASCII
characters. The file name sent by the server must be converted before
it is concatenated to the local path. Conversion after concatenation
may result in 'iconv' errors.
* doc/wget.texi: Add description for --retry-on-http-error
* src/http.c (gethttp):
Consider given HTTP response codes as non-fatal, transient errors.
Supply a comma-separated list of 3-digit HTTP response codes as
argument. Useful to work around special circumstances where retries
are required, but the server responds with an error code normally not
retried by Wget. Such errors might be 503 (Service Unavailable) and
429 (Too Many Requests). Retries enabled by this option are performed
subject to the normal retry timing and retry count limitations of
Wget.
Using this option is intended to support special use cases only and is
generally not recommended, as it can force retries even in cases where
the server is actually trying to decrease its load. Please use it
wisely and only if you know what you are doing.
Example use and a starting point for manual testing:
wget --retry-on-http-error=429,503 http://httpbin.org/status/503
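
How a status code might be checked against such a comma-separated list can
be sketched as below. The helper name status_in_retry_list() is
hypothetical and this is not the code added to gethttp().

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: return true when STATUS appears in LIST, a
   comma-separated list of HTTP status codes such as "429,503".  Scanning
   stops at the first malformed entry.  */
bool
status_in_retry_list (const char *list, int status)
{
  const char *p = list;

  while (*p)
    {
      char *end;
      long code = strtol (p, &end, 10);

      if (end == p)
        return false;           /* no digits where a code was expected */
      if (code == status)
        return true;

      p = end;
      if (*p == ',')
        p++;                    /* move on to the next entry */
      else if (*p != '\0')
        return false;           /* unexpected character after a code */
    }

  return false;
}
```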
* src/ftp.c (getftp): Fix password/user selection
* src/http.c (initialize_request): Likewise
Before, netrc password won over interactive
--ask-password but now --ask-password wins
after change of program logic
Fixes Issue #48811
* src/connect.c: check that the fd is smaller than FD_SETSIZE
before using FD_SET. An fd_set cannot hold fds equal to or bigger than
FD_SETSIZE; using one causes an out-of-bounds write to a buffer on the stack.
Reported by: Jann Horn <jannh@google.com>
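
A minimal sketch of such a guard follows. The helper name fd_set_checked()
is hypothetical and this is not the code in src/connect.c. Valid
descriptors for an fd_set are 0 up to FD_SETSIZE - 1.

```c
#include <stdbool.h>
#include <sys/select.h>

/* Hypothetical helper: add FD to SET only when it is in the range an fd_set
   can represent.  FD_SET() with a negative fd or fd >= FD_SETSIZE would
   write outside the set.  */
bool
fd_set_checked (int fd, fd_set *set)
{
  if (fd < 0 || fd >= FD_SETSIZE)
    return false;

  FD_SET (fd, set);
  return true;
}
```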
* src/gnutls.c (ssl_connect_wget): Call gnutls_set_default_priority()
for --secure-protocol=auto (default).
The patch fixes a behavior that may have unintended side-effects in
certain gnutls versions. Instead use the default priorities when no
options are given.
Signed-off-by: Nikos Mavrogiannopoulos <nmav@gnutls.org>
* src/http-ntlm.c: Rename base64_{encode,decode}
* src/http.c: Likewise
* src/utils.c: Likewise
* src/utils.h: Likewise
When statically linking with gnutls, we get definition clash error for
base64_encode which is also defined by gnutls.
To prevent definition clash, rename base64_{encode,decode}
Signed-off-by: Rahul Bedarkar <rahul.bedarkar@imgtec.com>
* .travis.yml: Install libidn2-dev instead of libidn11-dev.
* bootstrap.conf: Add modules libunistring-optional, unistr/base,
unicase/tolower.
* configure.ac: Check for libidn2.
* src/Makefile.am: Add $(LTLIBUNISTRING) to LDADD.
* tests/Makefile.am: Set LDADD similar to LDADD in src/Makefile.am
* src/connect.c: Use libidn2 code instead of libidn.
* src/host.c: Likewise.
* src/iri.c: Likewise.
* src/iri.h: Likewise.
* src/options.h: Likewise.
* src/url.c: Likewise.
* src/url.h: Likewise.
* src/log.c: Fix C99 comment.
IDN2003 should not be used any more due to security concerns.
We use libunistring (resp. the unicode code from gnulib) for
lowercasing UTF-8 before we give data to libidn2.
TR#46 is missing, no support in libidn2 nor in libunistring.
* src/log.c: Use tcgetpgrp(STDIN_FILENO) != getpgrp() to determine when to print to STD* or logfile.
Deprecate log_request_redirect_output function.
Use different file handles for STD* and logfile, to easily switch between them when changing fg/bg.
* src/log.h: Make redirect_output function externally linked.
* src/main.c: Don't use deprecated log_request_redirect_output function. Use redirect_output instead.
* src/mswindows.c: Don't use deprecated log_request_redirect_output function. Use redirect_output instead.
* src/recur.c (descend_redirect): Ignore WG_RR_LIST and WG_RR_REGEX
for redirections.
* testenv/Makefile.am: Add Test-recursive-redirect.py
* testenv/Test-recursive-redirect.py: New test
Test-recursive-redirect.py written by Dale R. Worley.
Reported-by: "Dale R. Worley" <worley@ariadne.com>
* src/http.c (metalink_from_http): Process the Content-Type header.
Add an application/metalink4+xml URL as metalink metaurl. If the
option opt.content_disposition is true, the Content-Disposition's
filename is the metaurl's name
* doc/wget.texi: Update --content-disposition and --metalink-over-http
* doc/metalink-standard.txt: Update doc. Content-Type/Disposition
processing through --metalink-over-http. Update download naming
system about --trust-server-names and --content-disposition
* testenv/Makefile.am: Add new files
* testenv/Test-metalink-http-xml-type.py: New file. Metalink/HTTP
Content-Type/Disposition header automated Metalink/XML tests
* testenv/Test-metalink-http-xml-type-trust.py: New file. Metalink/HTTP
Content-Type/Disposition header with --trust-server-names automated
Metalink/XML tests
* testenv/Test-metalink-http-xml-type-content.py: New file. Metalink/HTTP
Content-Type/Disposition header with --content-disposition automated
Metalink/XML tests
* testenv/Test-metalink-http-xml-type-trust-content.py: New file.
Metalink/HTTP Content-Type/Disposition header with --trust-server-names
and --content-disposition automated Metalink/XML tests
Process the Content-Type header, identify an application/metalink4+xml
file. The Content-Disposition could provide an alternate name through
the "filename" field for the metalink xml file. Respectively, the cli
options --metalink-over-http and --content-disposition are required.
When Metalink/XML auto-processing, to use the Content-Disposition's
filename, the cli option --trust-server-names is also required.
* src/http.c (http_loop): Prevent SIGSEGV when hstat.local_file is
NULL, opt.content_disposition has a role in leaving the value unset
* src/http.c (gethttp): If hs->local_file is NULL (aka http_loop()'s
hstat.local_file), set it to the value of hs->metalink->origin
* src/metalink.c (retrieve_from_metalink): If opt.trustservernames is
true, use the basename of the metaurl's name to save the xml file
* doc/metalink-standard.txt: Update doc. With --trust-server-names any
Metalink/HTTP Link application/metalink4+xml file is saved using the
basename of the "name" field, if any. Update Metalink/HTTP examples
* testenv/Makefile.am: Add new file
* testenv/Test-metalink-http-xml-trust-name.py: New file. Metalink/HTTP
automated Metalink/XML, save xml files using the "name" field tests
* src/metalink.c (retrieve_from_metalink): Reject any metalink:file
without hashes. Prompt the error and switch to the next file
* testenv/Makefile.am: Add new file
* testenv/Test-metalink-xml-nohash.py: New file. Metalink/XML with no
hashes tests
Prevent SIGSEGV.
* src/http.c (metalink_from_http): Fix hash_bin_len type. Use ssize_t
instead of size_t. Reject -1 as base64_decode() return value
* testenv/Makefile.am: Add new file
* testenv/Test-metalink-http-baddigest.py: New file. Metalink/HTTP
malformed base64 Digest header tests
On malformed base64 input, ssize_t base64_decode() returns -1. Converted
to size_t, this value becomes the largest possible size, and when used as
the xmalloc() argument it will exhaust all the memory.
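
The signed-before-unsigned check can be sketched as follows. The helper
name decoded_len_to_size() is hypothetical; DECODED_LEN merely stands in
for the return value of a decoder that reports malformed input as -1.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical helper: validate the signed result of a decoder before using
   it as an allocation size.  A negative DECODED_LEN means malformed input;
   converting it to size_t would yield a huge value.  */
bool
decoded_len_to_size (ssize_t decoded_len, size_t *out)
{
  if (decoded_len < 0)
    return false;               /* malformed input: do not allocate */

  *out = (size_t) decoded_len;
  return true;
}
```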
* NEWS: Mention the effect of --metalink-index over Metalink
* src/init.c: Add new option metalinkindex (opt.metalink_index),
initialize to -1
* src/main.c: Add new option metalink-index (--metalink-index=NUMBER)
* src/options.h: Add new option metalink_index (int)
* src/metalink.h: Add declaration of functions fetch_metalink_file(),
replace_metalink_basename()
* src/metalink.c: Add functions fetch_metalink_file() simple file
fetch, replace_metalink_basename() replace file basename
* src/metalink.c (retrieve_from_metalink): New. Process Metalink
application/metalink4+xml of opt.metalink_index ordinal number
* doc/wget.texi: Add new option metalink-index (--metalink-index)
documentation
* doc/metalink-standard.txt: Updated doc. Add documentation about
Metalink application/metalink4+xml metaurls download naming system
* doc/metalink-standard.txt: Update Metalink/XML and HTTP examples
* testenv/Makefile.am: Add new files
* testenv/Test-metalink-http-xml.py: New file. Metalink/HTTP automated
Metalink/XML "application/metalink4+xml" --metalink-index tests
* testenv/Test-metalink-http-xml-trust.py: New file. Metalink/HTTP
automated Metalink/XML "application/metalink4+xml" --metalink-index
retrieval with --trust-server-names tests
WARNING: Do not use lib/dirname.c (dir_name) to get the directory
name, it may append a dot '.' character to the directory name.
* src/http.c (metalink_from_http): Parse Metalink/HTTP header for
metaurls application/metalink4+xml media types
* src/metalink.h: Add function declaration metalink_meta_cmp()
* src/metalink.c: Add function metalink_meta_cmp() compare metalink
metaurls priorities
Add Metalink/HTTP application/metalink4+xml media types as metaurls to
the metalink variable that will be used to download the files.
* src/metalink.h: Add declaration of function dequote_metalink_string()
* src/metalink.c: Add function dequote_metalink_string() remove
surrounding quotes from string, \' or \"
* src/metalink.c (find_key_value, find_key_values): Call dequote_metalink_string()
to remove the surrounding quotes from the parsed value
* src/metalink.c (test_find_key_value, test_find_key_values): Add
quoted key's values for unit-tests
* testenv/Makefile.am: Add new file
* testenv/Test-metalink-http-quoted.py: New file. Metalink/HTTP quoted
values tests
Some Metalink/HTTP keys, like "type" [2], may have a quoted value [1]:
Link: <http://example.com/example.ext.meta4>; rel=describedby;
type="application/metalink4+xml"
Wget was expecting a dequoted value from the Metalink module. This
patch addresses this problem.
References:
[1] Metalink/HTTP: Mirrors and Hashes
1.1. Example Metalink Server Response
https://tools.ietf.org/html/rfc6249#section-1.1
[2] Additional Link Relations
6. "type"
https://tools.ietf.org/html/rfc6903#section-6
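The dequoting step described above can be sketched as follows. This is an
illustration only, not the code added to src/metalink.c: the name
dequote_sketch() is hypothetical, and the sketch assumes that a quote is
removed only when the first and last characters are the same quote character.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not Wget's dequote_metalink_string): return a
   newly allocated copy of STR without one pair of matching surrounding
   quotes, either ' or ".  The caller frees the result.  Returns NULL if
   memory allocation fails.  */
char *
dequote_sketch (const char *str)
{
  size_t len;
  char *out;

  assert (str != NULL);
  len = strlen (str);

  /* Assumption: strip only when both ends carry the same quote.  */
  if (len >= 2 && (str[0] == '"' || str[0] == '\'')
      && str[len - 1] == str[0])
    {
      str++;
      len -= 2;
    }

  out = malloc (len + 1);
  if (out == NULL)
    return NULL;
  memcpy (out, str, len);
  out[len] = '\0';
  return out;
}
```

With this sketch, the quoted "type" value from the Link header above would
compare equal to the unquoted media type after dequoting.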
* src/metalink.h: Add declaration of function clean_metalink_string()
* src/metalink.c: Add directive #include "xmemdup0.h"
* src/metalink.c: Add function clean_metalink_string() remove leading
and trailing white spaces and CRLF from string
* src/metalink.c (retrieve_from_metalink): Remove leading and trailing
white spaces and CRLF from url resource mres->url
* testenv/Makefile.am: Add new file
* testenv/Test-metalink-xml-urlbreak.py: New test. Metalink/XML white
spaces and CRLF in url resources tests
White spaces and CRLF are not automatically removed by libmetalink
from url strings. Wget's Metalink module was unable to process
such url strings. This patch implements the processing of such url
strings cleaning off leading and trailing white spaces and CRLF.
If a parsed Metalink/XML url string contains strings separated by
CRLF, only the first of the series is accepted.
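The cleaning rule described above can be sketched as follows. This is an
illustration only, not the code added to src/metalink.c: the name
clean_url_sketch() is hypothetical, and the sketch assumes that "white
spaces" means spaces and tabs and that the first non-empty line is the one
that is kept.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not Wget's clean_metalink_string): return a newly
   allocated copy of the first line of STR with leading and trailing
   white space removed.  The caller frees the result.  Returns NULL if
   memory allocation fails.  */
char *
clean_url_sketch (const char *str)
{
  size_t len;
  char *out;

  assert (str != NULL);

  /* Skip leading spaces, tabs, and line breaks.  */
  str += strspn (str, " \t\r\n");

  /* Keep only the first string of the series: stop at the first CR or LF.  */
  len = strcspn (str, "\r\n");

  /* Drop trailing spaces and tabs from that string.  */
  while (len > 0 && (str[len - 1] == ' ' || str[len - 1] == '\t'))
    len--;

  out = malloc (len + 1);
  if (out == NULL)
    return NULL;
  memcpy (out, str, len);
  out[len] = '\0';
  return out;
}
```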
* src/wget.h (uerr_t): Add error code METALINK_SIZE_ERROR to enum
* src/metalink.c (retrieve_from_metalink): Use boolean variable
size_ok, when false set retr_err to METALINK_SIZE_ERROR
* testenv/Makefile.am: Add new file
* testenv/Test-metalink-xml-size.py: New file. Metalink/XML file size
tests (<size></size>)
Before this patch, no appropriate error code was returned to report a
file size mismatch.
This patch introduces the error code METALINK_SIZE_ERROR to report a
file size mismatch.
* NEWS: Mention the effect of --trust-server-names over Metalink
* src/metalink.h: Add declaration of function append_suffix_number()
* src/metalink.c: Add function append_suffix_number() append number to
string
* src/metalink.c (retrieve_from_metalink): Safer Metalink/XML and
Metalink/HTTP download naming system, opt.trustservernames based
* doc/metalink-standard.txt: Update doc. Explain new Metalink/XML and
Metalink/HTTP download naming system and --trust-server-names role
* testenv/Makefile.am: Add new files
* testenv/Test-metalink-xml-continue.py: Update test. Metalink/XML
continue/keep existing files (HTTP 416) with --continue tests
* testenv/Test-metalink-xml.py: Update test. Metalink/XML naming tests
* testenv/Test-metalink-xml-trust.py: New file. Metalink/XML naming
tests with --trust-server-names
* testenv/Test-metalink-xml-abspath.py: Update test. Metalink/XML
absolute path tests
* testenv/Test-metalink-xml-abspath-trust.py: New file. Metalink/XML
absolute path tests with --trust-server-names
* testenv/Test-metalink-xml-relpath.py: Update test. Metalink/XML
relative path tests
* testenv/Test-metalink-xml-relpath-trust.py: New file. Metalink/XML
relative path tests with --trust-server-names
* testenv/Test-metalink-xml-homepath.py: Update test. Metalink/XML
home path and ~ (tilde) tests
* testenv/Test-metalink-xml-homepath-trust.py: New file. Metalink/XML
home path and ~ (tilde) tests with --trust-server-names
* testenv/Test-metalink-xml-prefix.py: New file. Metalink/XML naming
tests with --directory-prefix
* testenv/Test-metalink-xml-prefix-trust.py: New file. Metalink/XML
naming tests with --directory-prefix and --trust-server-names
* testenv/Test-metalink-xml-absprefix.py: New file. Metalink/XML
absolute --directory-prefix tests
* testenv/Test-metalink-xml-absprefix-trust.py: New file. Metalink/XML
absolute --directory-prefix tests with --trust-server-names
* testenv/Test-metalink-xml-relprefix.py: New file. Metalink/XML
relative --directory-prefix tests
* testenv/Test-metalink-xml-relprefix-trust.py: New file. Metalink/XML
relative --directory-prefix tests with --trust-server-names
* testenv/Test-metalink-xml-homeprefix.py: New file. Metalink/XML home
--directory-prefix tests
* testenv/Test-metalink-xml-homeprefix-trust.py: New file. Metalink/XML
home --directory-prefix tests with --trust-server-names
The option --trust-server-names allows the file names parsed from a
Metalink/XML file to be used. Without --trust-server-names, the safety
mechanism provides secure and predictable file names.
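A number-appending helper such as the one added in the entry above could
look roughly like this. It is a sketch under assumptions, not the code in
src/metalink.c: the name append_number_sketch(), its parameters, and the
separator used in the test are all hypothetical, since the entry only states
that a number is appended to a string.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not Wget's append_suffix_number): return a newly
   allocated string made of STR, then SEP, then NUM in decimal.  The
   caller frees the result.  Returns NULL on failure.  */
char *
append_number_sketch (const char *str, const char *sep, unsigned long num)
{
  size_t slen, seplen;
  int digits;
  char *out;

  assert (str != NULL && sep != NULL);
  slen = strlen (str);
  seplen = strlen (sep);

  /* Ask snprintf how many characters the number needs.  */
  digits = snprintf (NULL, 0, "%lu", num);
  if (digits < 0)
    return NULL;

  out = malloc (slen + seplen + (size_t) digits + 1);
  if (out == NULL)
    return NULL;

  memcpy (out, str, slen);
  memcpy (out + slen, sep, seplen);
  /* The size argument bounds the write, including the trailing NUL.  */
  snprintf (out + slen + seplen, (size_t) digits + 1, "%lu", num);
  return out;
}
```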
* NEWS: Mention the use of a safe Metalink destination path
* src/metalink.h: Add declaration of functions get_metalink_basename(),
last_component(), metalink_check_safe_path()
* src/metalink.c: Add directive #include "dosname.h"
* src/metalink.c: Add function get_metalink_basename() to return the
basename of a file name, strip w32's drive letter prefixes
* src/metalink.c (retrieve_from_metalink): Enforce Metalink file name
verification, if the file name is unsafe try its basename
* doc/metalink.txt: Update document. Explain --directory-prefix
The function get_metalink_basename() uses FILE_SYSTEM_PREFIX_LEN to
catch any 'C:D:file' (w32 environment), then it removes each drive
letter prefix, i.e. 'C:' and 'D:'.
Unsafe file names contain an absolute, relative, or home path. Safe
paths can be verified by libmetalink's metalink_check_safe_path().
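The basename fallback described above can be sketched as follows. This is
an illustration only, not the code in src/metalink.c: basename_sketch() and
drive_prefix_len() are hypothetical names, the latter standing in for
FILE_SYSTEM_PREFIX_LEN as it behaves in a w32 environment, and treating the
backslash as a directory separator is an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical stand-in for FILE_SYSTEM_PREFIX_LEN on w32: 2 when NAME
   starts with a drive letter followed by ':', otherwise 0.  */
static size_t
drive_prefix_len (const char *name)
{
  return (isalpha ((unsigned char) name[0]) && name[1] == ':') ? 2 : 0;
}

/* Hypothetical helper (not Wget's get_metalink_basename): return a
   pointer into NAME that skips every directory component and then every
   drive letter prefix.  Nothing is allocated.  */
const char *
basename_sketch (const char *name)
{
  const char *base, *p;
  size_t n;

  assert (name != NULL);

  /* Keep only what follows the last '/' or '\\' (assumed separators).  */
  base = name;
  while ((p = strpbrk (base, "/\\")) != NULL)
    base = p + 1;

  /* Remove each drive letter prefix, so 'C:D:file' becomes 'file'.  */
  while ((n = drive_prefix_len (base)) > 0)
    base += n;

  return base;
}
```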
* NEWS: Mention the effect of --directory-prefix over Metalink
* src/metalink.c (retrieve_from_metalink): Add opt.dir_prefix as
prefix to the metalink:file name mfile->name
* doc/metalink.txt: Update document. Explain --directory-prefix
When --directory-prefix=<prefix> is used, set the top of the retrieval
tree to prefix. The default is . (the current directory). Metalink/XML
and Metalink/HTTP files will be downloaded under prefix.
* src/metalink.c (retrieve_from_metalink): Change mfile->name to
filename when referring to the downloaded file
The file name could have been changed by unique_create() (or by any
other means) before downloading. Use the name of the downloaded file
(filename) when printing output which refers to it.
* NEWS: Mention Metalink's file size verification
* src/metalink.c (retrieve_from_metalink): Add file size computation
* doc/metalink.txt: Update document. Remove resolved bugs
Reject downloaded files when they do not agree with their Metalink/XML
metalink:size: https://tools.ietf.org/html/rfc5854#section-4.2.14
At the moment of writing, Metalink/HTTP headers do not provide a file
size field. This information could be obtained from the Content-Length
header field: https://tools.ietf.org/html/rfc6249#section-7
* NEWS: Mention the effects of --continue over Metalink
* src/metalink.c (retrieve_from_metalink): On download error, resume
output_stream with the next mres->url. Keep fully downloaded files
started with --continue, otherwise rename/remove the file
* testenv/Makefile.am: Add new file
* testenv/Test-metalink-xml-continue.py: New file. Metalink/XML
continue/keep existing files (HTTP 416) with --continue tests
Before this patch, with --continue, existing and/or fully retrieved
files which fail the sanity tests were renamed (--keep-badhash), or
removed.
This patch ensures that --continue doesn't rename/remove existing
and/or fully retrieved files (HTTP 416) which fail the sanity tests.
* NEWS: Mention the Metalink "path/file" name format handling
* src/metalink.c (retrieve_from_metalink): Fix NULL filename, set
filename to the right "path/file" value
* src/metalink.c (retrieve_from_metalink): Fix NULL output_stream, set
output_stream to filename when it is created by retrieve_url()
* src/metalink.c (retrieve_from_metalink): Add RFC5854 comments about
proper metalink:file "path/file" name format handling
* doc/metalink.txt: Update document. Remove resolved bugs
If unique_create() cannot create/open the destination file, filename
and output_stream remain NULL. If fopen() is used instead, filename
always remains NULL. Neither function can create "path/file" trees.
Setting filename to the right value is sufficient to prevent a SIGSEGV
caused by testing a NULL value. This also allows retrieve_url()
to create a "path/file" tree through opt.output_document.
Reading NULL as output_stream when it should not be NULL leads to wrong
results. For instance, a non-NULL output_stream indicates that a stream
was interrupted, so reading NULL instead wrongly implies the contrary.
This patch conforms to the RFC5854 specification:
The Metalink Download Description Format
4.1.2.1. The "name" Attribute
https://tools.ietf.org/html/rfc5854#section-4.1.2.1
* src/html-url.c (tag_handle_img): Check append_url() for NULL
return value before dereference.
Crash reproducible when parsing srcset="data:..." inline data.
Reported-by: Coverity
* src/http.c: Add const to first param of initialize_request(),
initialize_proxy_configuration(), establish_connection(),
check_file_output(), check_auth(), gethttp(), http_loop().
* src/http.h: Add const to first param of http_loop().
* src/connect.c (connect_to_ip): Check return value of setsockopt.
* src/ftp.c (ftp_retrieve_list): Check return value of chmod.
* src/http.c (digest_authentication_encode): Cleanup code.
* src/init.c (setval_internal): Explicitly check comind range.
* src/main.c (main): Explicitly check optarg.
* src/retr.c (retr_rate): Use snprintf instead of sprintf,
(retrieve_from_file): More verbose error message,
(rotate_backups): Use snprintf instead of sprintf, check return
value of rename().
* src/url.c (mkalldirs): Check return value of unlink().
* src/utils.c (strdupdelim): Explicitly check beg and end for NULL,
(merge_vecs): Fix sizeof argument to char *,
(stable_sort): Use malloc instead of alloca.
* bootstrap.conf: Add xmemdup0 and strpbrk.
* src/init.c (cmd_use_askpass): Add 'const' to char *,
remove check for file existence.
* src/main.c (run_use_askpass): C89 compat init of argv,
add \n to error messages,
fix stripping of \n and \r from input,
make run_use_askpass and use_askpass static.
* doc/wget.texi: Add --use-askpass to documentation.
* src/init.c: Add cmd_use_askpass to set opt.use_askpass based on
the argument and the WGET_ASKPASS and SSH_ASKPASS environment variables.
opt.use_askpass is freed in cleanup ()
* src/main.c: Update options & add spawn process of opt.use_askpass
command.
* src/options.h: Addition of string use_askpass.
* src/url.c: Function scheme_leading_string to access the leading
string of a parsed url.
* src/url.h: Prototype for scheme_leading_string for returning the
leading string.
* bootstrap.conf: Add posix_spawn to gnulib_modules
This adds the --use-askpass option which is disabled by default.
--use-askpass=COMMAND will request the username and password for a given
URL by executing the external program COMMAND. If COMMAND is left
blank, then the external program in the environment variable
WGET_ASKPASS will be used. If WGET_ASKPASS is not set then the
environment variable SSH_ASKPASS is used. If there is no value set, an
error is returned. If an error occurs requesting the username or
password, wget will exit.
Signed-off-by: Liam R. Howlett <Liam.Howlett@WindRiver.com>
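The lookup order described above (COMMAND, then WGET_ASKPASS, then
SSH_ASKPASS, else an error) can be sketched as follows. This is an
illustration only, not the code in src/init.c: both function names are
hypothetical, and treating an empty environment variable as unset is an
assumption.

```c
#include <assert.h>
#include <stdlib.h>

/* True when S holds a non-empty string.  */
static int
is_set (const char *s)
{
  return s != NULL && s[0] != '\0';
}

/* Hypothetical helper: pick the askpass command from the option
   argument ARG and the two environment values, in that order.  Returns
   NULL when none is set, which the caller should treat as an error.  */
const char *
choose_askpass_sketch (const char *arg, const char *wget_askpass,
                       const char *ssh_askpass)
{
  if (is_set (arg))
    return arg;
  if (is_set (wget_askpass))
    return wget_askpass;
  if (is_set (ssh_askpass))
    return ssh_askpass;
  return NULL;
}

/* Hypothetical wrapper that reads the real environment.  */
const char *
askpass_from_environment_sketch (const char *arg)
{
  const char *cmd = choose_askpass_sketch (arg, getenv ("WGET_ASKPASS"),
                                           getenv ("SSH_ASKPASS"));

  /* The chosen command is never an empty string.  */
  assert (cmd == NULL || cmd[0] != '\0');
  return cmd;
}
```

Keeping the selection separate from getenv() makes the precedence easy to
check without touching the process environment.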
* configure.ac: Add --with-psl-file to set a PSL file
* src/cookies.c (check_domain_match): Load PSL_FILE with
fallback to built-in data.
This change allows package maintainers to make Wget use the latest
PSL (DAFSA or plain text), without updating libpsl itself.
E.g. Debian now comes with a DAFSA binary within the 'publicsuffix'
package which allows very fast loading (no parsing or processing needed).
* src/cookies.c (cookie_header): Use heap instead of stack.
* src/http.c (request_send): Likewise.
If wget has to handle an insanely large number of cookies (~700,000 on
32 bit systems or ~530,000 on 64 bit systems), the stack is not large
enough to hold these pointers, leading to undefined behaviour according
to POSIX; expect a segmentation fault in real life. ;)
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
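The stack-to-heap change described above follows a common pattern, sketched
below. This is an illustration only, not the code in src/cookies.c: struct
cookie_sketch and cookie_header_sketch() are hypothetical, and the point is
only that the pointer array, whose size depends on the number of cookies,
comes from malloc() rather than from the stack.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified cookie record.  */
struct cookie_sketch
{
  const char *name;
  const char *value;
};

/* Build "name=value; name=value" from COUNT cookies, skipping entries
   with a NULL name or value.  The caller frees the result.  Returns
   NULL on overflow or allocation failure.  */
char *
cookie_header_sketch (const struct cookie_sketch *cookies, size_t count)
{
  const struct cookie_sketch **chosen;
  size_t i, n = 0, len = 0;
  char *header, *p;

  assert (cookies != NULL || count == 0);

  /* Heap allocation: a huge COUNT fails cleanly instead of overrunning
     the stack.  Guard the multiplication against overflow.  */
  if (count > SIZE_MAX / sizeof *chosen)
    return NULL;
  chosen = malloc (count ? count * sizeof *chosen : 1);
  if (chosen == NULL)
    return NULL;

  for (i = 0; i < count; i++)
    if (cookies[i].name != NULL && cookies[i].value != NULL)
      {
        chosen[n++] = &cookies[i];
        /* Room for "name=value" plus a "; " separator.  */
        len += strlen (cookies[i].name) + 1 + strlen (cookies[i].value) + 2;
      }

  header = malloc (len + 1);
  if (header != NULL)
    {
      p = header;
      for (i = 0; i < n; i++)
        {
          size_t nl = strlen (chosen[i]->name);
          size_t vl = strlen (chosen[i]->value);

          if (i > 0)
            {
              memcpy (p, "; ", 2);
              p += 2;
            }
          memcpy (p, chosen[i]->name, nl);
          p += nl;
          *p++ = '=';
          memcpy (p, chosen[i]->value, vl);
          p += vl;
        }
      *p = '\0';
    }

  free (chosen);
  return header;
}
```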
The signal handler for SIGALRM calls longjmp, but the handler is
installed before the jump target has been initialized. If another
process sends SIGALRM right between handler installation and target
initialization, the jump leads to undefined behavior.
This can easily be fixed by moving the signal handler installation
into the "SETJMP == 0" conditional block, which means that the target
has just been initialized.
* src/utils.c: Call signal after SETJMP.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
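The ordering described above can be sketched as follows. This is an
illustrative POSIX example, not the code in src/utils.c: all names are
hypothetical, and sigsetjmp()/siglongjmp() are called directly where the
entry refers to SETJMP. The two demo callbacks exist only to exercise the
sketch.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stdbool.h>
#include <unistd.h>

static sigjmp_buf timeout_env;

/* SIGALRM handler: jump back to the point saved by sigsetjmp.  */
static void
on_alarm (int sig)
{
  (void) sig;
  siglongjmp (timeout_env, 1);
}

/* Run FUN (ARG) for at most SECONDS seconds.  Returns true if the
   timeout expired, false if FUN returned in time.  */
bool
run_with_timeout_sketch (unsigned int seconds, void (*fun) (void *),
                         void *arg)
{
  assert (fun != NULL);

  if (sigsetjmp (timeout_env, 1) != 0)
    {
      /* Reached through siglongjmp: the timer fired.  */
      signal (SIGALRM, SIG_DFL);
      return true;
    }

  /* The jump target is valid from here on, so only now is it safe to
     install the handler that jumps to it.  */
  signal (SIGALRM, on_alarm);
  alarm (seconds);
  fun (arg);

  /* FUN finished in time: cancel the timer and restore the default.  */
  alarm (0);
  signal (SIGALRM, SIG_DFL);
  return false;
}

/* Demo callback that returns at once.  */
void
demo_return_immediately (void *arg)
{
  (void) arg;
}

/* Demo callback that never returns on its own.  */
void
demo_block_forever (void *arg)
{
  (void) arg;
  for (;;)
    pause ();
}
```

Had signal() been called before sigsetjmp(), a SIGALRM arriving between
the two calls would make the handler jump to an uninitialized target.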
* src/init.c: Remove hyphens from command names
* src/main.c: Likewise
Options with hyphens (or underscores) in their command name cannot be
set in a wgetrc file.
Signed-off-by: Jeffery To <jeffery.to@gmail.com>