* fuzz/wget_options_fuzzer.c: Add fopen_wget() and fopen_wgetrc()
* src/utils.c: Use fopen_wgetrc() for config files,
don't read from stdin when fuzzing
* src/wget.h: Define fopen as fopen_wget when fuzzing,
define fopen_wgetrc as fopen when not fuzzing
* src/ftp.c (ftp_loop_internal): Set warc_tmp to NULL after ffclose()
* src/init.c (cleanup): Set output_stream to NULL after fclose()
* src/log.c (log_close): Set global stream vars to NULL after closing
* src/recur.c (retrieve_tree): Set rejectedlog to NULL after closing
* src/warc.c (warc_close): Set stream vars to NULL after closing
* Makefile.am: Add fuzz/ to SUBDIRS
* cfg.mk: Fix 'make syntax-check'
* configure.ac: Add --enable-fuzzing
* fuzz/Makefile.am: New file
* fuzz/README.md: New file
* fuzz/fuzzer.h: New file
* fuzz/get_all_corpora: New file
* fuzz/get_ossfuzz_corpora: New file
* fuzz/glob_crash.c: New file
* fuzz/main.c: New file
* fuzz/run-afl.sh: New file
* fuzz/run-clang.sh: New file
* fuzz/view-coverage.sh: New file
* fuzz/wget_options_fuzzer.c: New file
* fuzz/wget_options_fuzzer.dict: New file
* src/init.c (cleanup): Free more resources
* src/main.c (init_switches): Initialize only once,
(print_usage): Don't print if TESTING is defined
* src/utils.h: Include wget.h
Gzip compression has a number of bugs which need to be ironed out before
we can support it by default. Some of these stem from a misunderstanding
of the HTTP spec, but a lot of them are also due to many web servers not
being compliant with RFC 7231.
With this commit, I am marking GZip compression support as experimental
in GNU Wget pending further investigation and the addition of tests.
* src/init.c (defaults): Switch of compression support by default
* docs/wget.texi: State that compression is experimental
* src/log.c (check_redirect_output): tcgetpgrp can return -1 (ENOTTY),
be sure to check whether a valid controlling terminal exists before
redirecting.
Fixes: #51181
* http.c(gethttp): In case of a 416 response, try to drain the socket of
any bytes before reusing the connection
Reported-By: Iru Cai <mytbk920423@gmail.com>
* src/http.c(gethttp): When Encoding is gzip, ensure that the
Content-Type Header was actually seen. Without this, the "type" variable
is null causing a Segfault.
Reported-By: Noël Köthe <noel@debian.org>
* src/retr.c (fd_read_body): Stop processing on negative chunk size
Reported-by: Antti Levomäki, Christian Jalio, Joonas Pihlaja from Forcepoint
Reported-by: Juhani Eronen from Finnish National Cyber Security Centre
* src/http.c (skip_short_body): Return error on negative chunk size
Reported-by: Antti Levomäki, Christian Jalio, Joonas Pihlaja from Forcepoint
Reported-by: Juhani Eronen from Finnish National Cyber Security Centre
* src/http.c (gethttp): Move 304 code before --adjust-extension code
This fixes applying --adjust-extension in combination with 304
HTTP responses. It could lead to .html extensions to arbitrary
files.
Reported-by: anfractuosity
Although internally code uses option for (not) reading .netrc for
credentials, it was not possible to turn this behavior off on command
line. Note that it was possible to turn it off using wgetrc.
Idea for this change came from Bruce Jerrick (bmj001@gmail.com).
Reference: https://bugzilla.redhat.com/show_bug.cgi?id=1425097
Signed-off-by: Tomas Hozza <thozza@redhat.com>
There seemed to be a copy&paste error in http.c code, which decides
whether to get credentials from .netrc. In ftp.c "user" and "pass"
variables are char*, while in http.c, these are char**. For this reason
they should be dereferenced when determining if password and user login
is set to some value.
Also since both variables are dereferenced on lines above the changed
code, it does not really make sense to check if they are NULL.
This patch is based on fix from Bruce Jerrick <bmj001@gmail.com>.
Fedora bug: https://bugzilla.redhat.com/show_bug.cgi?id=1425097
Signed-off-by: Tomas Hozza <thozza@redhat.com>
* src/main.c: The --secure-protocol option accepts also values TLSv1_1
and TLSv1_2, as mentioned in the man page. However the help message
doesn't mention these two values. This patch adds TLSv1_1 and TLSv1_2 as
possible values to the help message.
Signed-off-by: Tomas Hozza <thozza@redhat.com>
* src/url.c: Check iconv() against 0, not -1
On some libiconv implementations, unknown codepoints become
encoded as ?, e.g. when converting a non-ascii codepoint to ASCII.
This results in ambigious file names which also fails our tests.
* src/connect.c (connect_to_ip): Use xfree() instead of idn2_free()
* src/host.c (lookup_host): Use xfree() instead of idn2_free()
* src/iri.h: Do not include idn2.h
* src/url.c (url_free): Use xfree() instead of idn2_free()
* src/url.h (struct url): Remove 'idn_allocated' from struct
Reported-by: Gisle Vanem
* src/utils.h: Add struct file_stat_s declaration,
change prototypes of file_exists_p(),
add prototypes for fopen_stat() and open_stat().
* src/utils.c: Extend file_exists_p(),
new function fopen_stat() and open_stat(),
add new param for file_exists_p().
* src/init.h: Add param file_stats_t to run_wgetrc().
* src/ftp.c: Amend calls to extended functions.
* src/hsts.c: Likewise.
* src/http.c: Likewise.
* src/init.c: Likewise.
* src/main.c: Likewise.
* src/metalink.c: Likewise.
* src/retr.c: Likewise.
* src/url.c: Likewise.
Added fopen_stat() and open_stat() that checks to makes sure the file didn't
change underneath us.
Return error from file_exists_p().
Added a way to return error from this file without major surgery to the
callers.
Fixes: #20369
* src/iri.c: Check for libidn2 < 0.14 to include libunistring headers
The unistring functions are used only when an older version of libidn2
is used, so don't include its headers either w/newer libdin2 versions.
The WARC spec requires that all URIs be enclosed in angle brackets. This
was being done in most cases, but not for "WARC-Target-URI" fields in
WARC blocks of type "response", "resource", "revisit", and "metadata".
* src/http.c (gethttp): Move 504 handling to correct place.
(http_loop): Fix memeory leak.
* testenv/server/http/http_server.py: Add Content-Length header on non-2xx
status codes with a body
Reported-by: Adam Sampson
In a non-ASCII environment, the local path may contain non-ASCII
characters. The server responded file name must be converted before
it is concatenated to the local path. Conversion after concatenation
may result in 'iconv' errors.
* doc/wget.texi: Add description for --retry-on-http-error
* src/http.c (gethttp):
Consider given HTTP response codes as non-fatal, transient errors.
Supply a comma-separated list of 3-digit HTTP response codes as
argument. Useful to work around special circumstances where retries
are required, but the server responds with an error code normally not
retried by Wget. Such errors might be 503 (Service Unavailable) and
429 (Too Many Requests). Retries enabled by this option are performed
subject to the normal retry timing and retry count limitations of
Wget.
Using this option is intended to support special use cases only and is
generally not recommended, as it can force retries even in cases where
the server is actually trying to decrease its load. Please use it
wisely and only if you know what you are doing.
Example use and a starting point for manual testing:
wget --retry-on-http-error=429,503 http://httpbin.org/status/503
* src/ftp.c (getftp): Fix password/user selection
* src/http.c (initialize_request): Likewise
Before, netrc password won over interactive
--ask-password but now --ask-password wins
after change of program logic
Fixes Issue #48811
* src/connect.c: check that the fd is not bigger than FD_SETSIZE
before using FD_SET. An fd_set cannot hold fds bigger than
FD_SETSIZE, causing out-of-bounds write to a buffer on the stack.
Reported by: Jann Horn <jannh@google.com>
* src/gnutls.c (ssl_connect_wget): Call gnutls_set_default_priority()
for --secure-protocol=auto (default).
The patch fixes a behavior that may have unintended side-effects in
certain gnutls versions. Instead use the default priorities when no
options are given.
Signed-off-by: Nikos Mavrogiannopoulos <nmav@gnutls.org>
* src/http-ntlm.c: Rename base64_{encode,decode}
* src/http.c: Likewise
* src/utils.c: Likewise
* src/utils.h: Likewise
When statically linking with gnutls, we get definition clash error for
base64_encode which is also defined by gnutls.
To prevent definition clash, rename base64_{encode,decode}
Signed-off-by: Rahul Bedarkar <rahul.bedarkar@imgtec.com>
* .travis.yml: Install libidn2-dev instead libidn11-dev.
* bootstrap.conf: Add modules libunistring-optional, unistr/base,
unicase/tolower.
* configure.ac: Check for libidn2.
* src/Makefile.am: Add $(LTLIBUNISTRING) to LDADD.
* tests/Makefile.am: Set LDADD similar to LDADD in src/Makefile.am
* src/connect.c: Use libidn2 code instead of libidn.
* src/host.c: Likewise.
* src/iri.c: Likewise.
* src/iri.h: Likewise.
* src/options.h: Likewise.
* src/url.c: Likewise.
* src/url.h: Likewise.
* src/log.c: Fix C99 comment.
IDN2003 should not be used any more due to security concerns.
We use libunistring (resp. the unicode code from gnulib) for
lowercasing UTF-8 before we give data to libidn2.
TR#46 is missing, no support in libidn2 nor in libunistring.
* src/log.c: Use tcgetpgrp(STDIN_FILENO) != getpgrp() to determine when to print to STD* or logfile.
Deprecate log_request_redirect_output function.
Use different file handles for STD* and logfile, to easily switch between them when changing fg/bg.
* src/log.h: Make redirect_output function externally linked.
* src/main.c: Don't use deprecated log_request_redirect_output function. Use redirect_output instead.
* src/mswindows.c: Don't use deprecated log_request_redirect_output function. Use redirect_output instead.
* src/recur.c (descend_redirect): Ignore WG_RR_LIST and WG_RR_REGEX
for redirections.
* testenv/Makefile.am: Add Test-recursive-redirect.py
* testenv/Test-recursive-redirect.py: New test
Test-recursive-redirect.py written by Dale R. Worley.
Reported-by: "Dale R. Worley" <worley@ariadne.com>
* src/http.c (metalink_from_http): Process the Content-Type header.
Add an application/metalink4+xml URL as metalink metaurl. If the
option opt.content_disposition is true, the Content-Disposition's
filename is the metaurl's name
* doc/wget.texi: Update --content-disposition and --metalink-over-http
* doc/metalink-standard.txt: Update doc. Content-Type/Disposition
processing through --metalink-over-http. Update download naming
system about --trust-server-names and --content-disposition
* testenv/Makefile.am: Add new files
* testenv/Test-metalink-http-xml-type.py: New file. Metalink/HTTP
Content-Type/Disposition header automated Metalink/XML tests
* testenv/Test-metalink-http-xml-type-trust.py: New file. Metalink/HTTP
Content-Type/Disposition header with --trust-server-names automated
Metalink/XML tests
* testenv/Test-metalink-http-xml-type-content.py: New file. Metalink/HTTP
Content-Type/Disposition header with --content-disposition automated
Metalink/XML tests
* testenv/Test-metalink-http-xml-type-trust-content.py: New file.
Metalink/HTTP Content-Type/Disposition header with --trust-server-names
and --content-disposition automated Metalink/XML tests
Process the Content-Type header, identify an application/metalink4+xml
file. The Content-Disposition could provide an alternate name through
the "filename" field for the metalink xml file. Respectively, the cli
options --metalink-over-http and --content-disposition are required.
When Metalink/XML auto-processing, to use the Content-Disposition's
filename, the cli option --trust-server-names is also required.
* src/http.c (http_loop): Prevent SIGSEGV when hstat.local_file is
NULL, opt.content_disposition has a role in leaving the value unset
* src/http.c (gethttp): If hs->local_file is NULL (aka http_loop()'s
hstat.local_file), set it to the value of hs->metalink->origin
* src/metalink.c (retrieve_from_metalink): If opt.trustservernames is
true, use the basename of the metaurl's name to save the xml file
* doc/metalink-standard.txt: Update doc. With --trust-server-names any
Metalink/HTTP Link application/metalink4+xml file is saved using the
basename of the "name" field, if any. Update Metalink/HTTP examples
* testenv/Makefile.am: Add new file
* testenv/Test-metalink-http-xml-trust-name.py: New file. Metalink/HTTP
automated Metalink/XML, save xml files using the "name" field tests
* src/metalink.c (retrieve_from_metalink): Reject any metalink:file
without hashes. Prompt the error and switch to the next file
* testenv/Makefile.am: Add new file
* testenv/Test-metalink-xml-nohash.py: New file. Metalink/XML with no
hashes tests
Prevent SIGSEGV.