2259 Commits

Author SHA1 Message Date
Tim Rühsen
9ffb64ba6a Limit file mode to u=rw on temp. downloaded files
* bootstrap.conf: Add gnulib modules fopen, open.
* src/http.c (open_output_stream): Limit file mode to u=rw
on temporary downloaded files.

Reported-by: "Misra, Deapesh" <dmisra@verisign.com>
Discovered by: Dawid Golunski (http://legalhackers.com)
2016-08-24 12:28:55 +02:00
Tim Rühsen
0787d7253e * src/css-url.c (get_urls_css): Fix memory leak 2016-08-17 23:13:27 +02:00
Tim Rühsen
964f4646da * src/html-url.c (get_urls_html): Fix memory leak 2016-08-17 23:12:25 +02:00
Tim Rühsen
262baeb113 Improve PSL cookie checking
* configure.ac: Add --with-psl-file to set a PSL file
* src/cookies.c (check_domain_match): Load PSL_FILE with
  fallback to built-in data.

This change allows package maintainers to make Wget use the latest
PSL (DAFSA or plain text), without updating libpsl itself.

E.g. Debian now comes with a DAFSA binary within the 'publicsuffix'
package which allows very fast loading (no parsing or processing needed).
2016-08-17 16:32:26 +02:00
Tobias Stoeckmann
f4aeb41899 Fix stack overflow with way too many cookies
* src/cookies.c (cookie_header): Use heap instead of stack.
* src/http.c (request_send): Likewise.

If wget has to handle an insanely large amount of cookies (~700,000 on
32 bit systems or ~530,000 on 64 bit systems), the stack is not large
enough to hold these pointers, leading to undefined behaviour according
to POSIX; expect a segmentation fault in real life. ;)

Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
2016-08-10 19:59:25 +02:00
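The following is a sketch of the general technique only, not the code from src/cookies.c. The function join_cookies_sketch() is a hypothetical name. It shows a scratch array whose size depends on remote input being allocated with calloc() rather than on the stack, so that a huge item count fails cleanly instead of overflowing the stack.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: join COUNT strings with "; " in between.
   Returns a malloc'ed string, or NULL if memory is exhausted.  */
char *
join_cookies_sketch (const char *const *items, size_t count)
{
  /* Scratch array on the heap: COUNT is controlled by the server,
     so a stack allocation (alloca or a VLA) would be unsafe.  */
  size_t *lens = calloc (count ? count : 1, sizeof *lens);
  if (!lens)
    return NULL;

  size_t total = 1;             /* room for the terminating NUL */
  for (size_t i = 0; i < count; i++)
    {
      lens[i] = strlen (items[i]);
      total += lens[i] + (i ? 2 : 0);
    }

  char *out = malloc (total);
  if (out)
    {
      char *p = out;
      for (size_t i = 0; i < count; i++)
        {
          if (i)
            {
              memcpy (p, "; ", 2);
              p += 2;
            }
          memcpy (p, items[i], lens[i]);
          p += lens[i];
        }
      *p = '\0';
    }

  free (lens);
  return out;
}
```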
Tobias Stoeckmann
a9d49e5b15 Fix signal race condition
The signal handler for SIGALRM calls longjmp, but the handler is
installed before the jump target has been initialized. If another
process sends SIGALRM right between handler installation and target
initialization, the jump leads to undefined behavior.

This can easily be fixed by moving the signal handler installation
into the "SETJMP == 0" conditional block, which means that the target
has just been initialized.

* src/utils.c: call signal after SETJMP.

Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
2016-08-09 17:38:29 +02:00
Jeffery To
0fe79eeacb Remove hyphens from command names
* src/init.c: Remove hyphens from command names
* src/main.c: Likewise

Options with hyphens (or underscores) in their command name cannot be
set in a wgetrc file.

Signed-off-by: Jeffery To <jeffery.to@gmail.com>
2016-08-05 09:45:09 +02:00
Tim Rühsen
e3fb4c3859 * src/metalink.c (badhash_suffix): Fix quoting 2016-08-04 13:09:28 +02:00
Matthew White
943a6d585f Add new option --keep-badhash to keep Metalink's files with a bad hash
* src/init.c: Add keepbadhash
* src/main.c: Add keep-badhash
* src/options.h: Add keep_badhash
* doc/wget.texi: Add docs for --keep-badhash
* src/metalink.h: Add prototypes badhash_suffix(), badhash_or_remove()
* src/metalink.c: New functions badhash_suffix(), badhash_or_remove().
  (retrieve_from_metalink): Call badhash_or_remove() on download error

With --keep-badhash, append .badhash to Metalink's files with checksum
mismatch. (retrieve_from_metalink): unique_create() may append another
suffix to avoid overwriting existing files.

Without --keep-badhash, remove downloaded files with checksum mismatch
(this conforms to the old behaviour).
2016-08-04 12:03:49 +02:00
Tim Rühsen
7fad76db4c * src/metalink.c: Remove C++ style comments 2016-08-03 13:48:07 +02:00
Matthew White
e0b60fd073 New: --continue continues partially downloaded Metalink's files
* src/metalink.c (retrieve_from_metalink): Continue file download if
  opt.always_rest is true

Without --continue, download as a new file with a unique name (this
conforms to the old behaviour).
2016-08-03 13:37:27 +02:00
Matthew White
9db02a0c46 Add support for Metalink's md2, and md4 hashes
* bootstrap.conf: Add crypto/md2, and crypto/md4
* src/metalink.c (retrieve_from_metalink): Add md2, and md4 support

This patch adds support for the deprecated (insecure) md2, and md4
Message-Digest algorithms to the Metalink module.
2016-08-03 12:58:43 +02:00
Matthew White
edad3c1df3 Add support for Metalink's md5, sha1, sha224, sha384, and sha512 hashes
* bootstrap.conf: Add crypto/sha512
* src/metalink.c (retrieve_from_metalink): Add md5, sha1, sha224,
  sha384, and sha512 support

Metalink's checksum verification was limited to sha256. This patch
adds support for md5, sha1, sha224, sha384, and sha512.
2016-08-03 12:49:26 +02:00
Sean Burford
20cac2c5ab Style fixes and DEBUG on setxattr failure.
* src/ftp.c: Fix style.
* src/http.c: Likewise.
* src/xattr.h: Likewise.
* src/xattr.c: Likewise,
  (write_xattr_metadata): Print debug msg on error.
2016-07-27 17:05:57 +02:00
Sean Burford
a933bdd31e Keep fetched URLs in POSIX extended attributes
* configure.ac: Check for xattr availability
* src/Makefile.am: Add xattr.c
* src/ftp.c: Include xattr.h.
  (getftp): Set attributes if enabled.
* src/http.c: Include xattr.h.
  (gethttp): Add parameter 'original_url',
  set attributes if enabled.
  (http_loop): Add 'original_url' to call of gethttp().
* src/init.c: Add new option --xattr.
* src/main.c: Add new option --xattr, add description to help text.
* src/options.h: Add new config member 'enable_xattr'.
* src/xattr.c: New file.
* src/xattr.h: New file.

These attributes provide a lightweight method of later determining
where a file was downloaded from.

This patch changes:
*   autoconf detects whether extended attributes are available and
    enables the code if they are.
*   The new flags --xattr and --no-xattr control whether xattr is enabled.
*   The new command "xattr = (on|off)" can be used in ~/.wgetrc or /etc/wgetrc
*   The original and redirected URLs are recorded as shown below.
*   This works for both single fetches and recursive mode.

The attributes that are set are:
user.xdg.origin.url: The URL that the content was fetched from.
user.xdg.referrer.url: The URL that was originally requested.

Here is an example, where http://archive.org redirects to https://archive.org:
$ wget --xattr http://archive.org
...
$ getfattr -d index.html
user.xdg.origin.url="https://archive.org/"
user.xdg.referrer.url="http://archive.org/"

These attributes were chosen based on those stored by Google Chrome
https://bugs.chromium.org/p/chromium/issues/detail?id=45903
and curl https://github.com/curl/curl/blob/master/src/tool_xattr.c
2016-07-22 13:42:23 +02:00
Noël Köthe
ef372a4f27 Fix typos
* ChangeLog-2014-12-10: invokation -> invocation
* doc/wget.texi: invokation -> invocation
* src/main.c: seperated -> separated
* src/options.h: seperated -> separated
* testenv/README: invokation -> invocation
* testenv/conf/wget_commands.py: invokation -> invocation
2016-07-02 19:01:24 +02:00
Tim Rühsen
309e72c74f Fix compilation for OpenSSL 1.1.0
* src/openssl.c (ssl_init): Use SSL_is_init_finished() instead of
  SSL_state(), conditionally skip SSLeay function calls

The python test suite makes SSL_peek() hang, consuming 100% CPU time.
This does not happen on real-world TLS connections, but it needs
further investigation.
2016-06-30 13:24:33 +02:00
Ander Juaristi
cdc3e28d8e Bypass world-writable checks on Windows
* src/hsts.c (hsts_file_access_valid): we should check for "world-writable"
   files only on Unix-based systems. It's difficult to mimic the same behavior
   on Windows, so it's better to just not do it.

Reported-by: Gisle Vanem <gvanem@yahoo.no>
Reported-by: Eli Zaretskii <eliz@gnu.org>
2016-06-27 09:54:32 +02:00
Tim Rühsen
e1e7afb210 Use ICONV_CONST to avoid type warning for iconv()
* src/iri.c (do_conversion): Cast 2. param of iconv() to
 'ICONV_CONST char **'
* src/url.c (convert_fname): Likewise
2016-06-12 21:51:34 +02:00
Tim Rühsen
7e585fe23d Remove check for HAVE_ICONV in src/url.c
* src/url.c: Remove check for HAVE_ICONV
2016-06-12 21:49:23 +02:00
Tim Rühsen
d75f43f083 Include gnulib fcntl.h instead of sys/fcntl.h
* src/gnutls.c: Include gnulib fcntl.h
2016-06-12 17:06:31 +02:00
Tim Rühsen
d4f97dc9af Add libraries to LDADD for wget
* src/Makefile.am: Add $(GETADDRINFO_LIB) $(HOSTENT_LIB) $(INET_NTOP_LIB)
 $(LIBSOCKET) $(LIB_CLOCK_GETTIME) $(LIB_CRYPTO) $(LIB_SELECT)
 $(LTLIBICONV) $(LTLIBINTL) $(LTLIBTHREAD) $(SERVENT_LIB) to LDADD
2016-06-12 17:02:12 +02:00
Giuseppe Scrivano
e996e322ff ftp: understand --trust-server-names on a HTTP->FTP redirect
If --trust-server-names is not used, FTP will also get the destination
file name from the original url specified by the user instead of the
redirected url.  Closes CVE-2016-4971.

* src/ftp.c (ftp_get_listing): Add argument original_url.
(getftp): Likewise.
(ftp_loop_internal): Likewise.  Use original_url to generate the
file name if --trust-server-names is not provided.
(ftp_retrieve_glob): Likewise.
(ftp_loop): Likewise.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2016-06-09 15:02:49 +02:00
Tim Rühsen
2bdfc4f521 Fix warnings for --disable-iri configure flag
* src/iri.h: Fix #define for parse_charset
* src/html-url.c: Surround some IRI code parts by #ifdef ENABLE_IRI
* src/http.c: Likewise
* src/iri.h: Likewise
* src/recur.c: Likewise
* src/retr.c: Likewise
2016-06-07 12:52:59 +02:00
Tim Rühsen
2c736abb4c Fix warning about redefinition of MAP_FAILED
* src/sysdep.h: Removed definition of MAP_FAILED
* src/utils.c: Check and define MAP_FAILED after including sys/mmap.h
2016-06-07 09:56:01 +02:00
Ander Juaristi
5224d752a5 Correct HSTS debug message
* src/main.c (save_hsts): save the in-memory HSTS database to a file
   only if something changed.
* src/hsts.c (struct hsts_store): new field 'changed'.
   (hsts_match): update field 'changed' accordingly.
   (hsts_store_entry): update field 'changed' accordingly.
   (hsts_store_has_changed): new function.
* src/hsts.h (hsts_store_has_changed): new function.
2016-05-26 16:37:51 +02:00
Ander Juaristi
2aaf12990c Check the HSTS file is not world-writable
* hsts.c (hsts_file_access_valid): check that the file is a regular
   file, and that it's not world-writable.
   (hsts_store_open): if the HSTS database file does not meet the
   above requirements, disable HSTS altogether.
2016-05-26 16:29:29 +02:00
Tim Rühsen
a952f81f3e Remove special handling for Emacs in progress bar code
* src/progress.c: Remove special 'emacs' code

Fixes #47989
2016-05-23 21:46:29 +02:00
Jernej Simončič
42cc84b6b6 Fix xsleep() for Windows (trivial change)
* src/mswindows.c (xsleep): Fix check for number of seconds
2016-04-25 15:50:23 +02:00
Sergio Gelato
96ab9cad88 More accurate log message from do_conversion()
* src/iri.c (do_conversion): More accurate log message
2016-04-17 15:28:48 +02:00
Tim Rühsen
268163444d Include sys/select.h if HAVE_LIBCARES
* src/host.c: Include sys/select.h if HAVE_LIBCARES

Reported-by: Gisle Vanem <gvanem@yahoo.no>
2016-04-17 14:18:55 +02:00
Gisle Vanem
53800415a9 Fix Windows gnulib/c-ares incompatibility of select()
* src/host.c: Undef 'select' on Windows
2016-04-17 14:15:51 +02:00
Ander Juaristi
2f1c6a05c8 Strictly comply with RFC 6797
* src/hsts.c (hsts_store_entry): strictly comply with RFC 6797.

RFC 6797 states in section 8.1 that the UA's cached information should
only be updated if:

    "either or both of the max-age and includeSubDomains header field
    value tokens are conveying information different than that already
    maintained by the UA."
2016-04-11 16:44:47 +02:00
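The rule quoted from RFC 6797 section 8.1 can be sketched like this. The struct and function names are hypothetical and much simpler than the real store in src/hsts.c; only the comparison logic is illustrated.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical, simplified cache entry for one known HSTS host.  */
struct hsts_entry_sketch
{
  time_t max_age;
  bool include_subdomains;
};

/* Update ENTRY only if the received header conveys information that
   differs from what is cached.  Returns true if ENTRY was changed.  */
bool
hsts_update_if_changed_sketch (struct hsts_entry_sketch *entry,
                               time_t max_age, bool include_subdomains)
{
  if (entry->max_age == max_age
      && entry->include_subdomains == include_subdomains)
    return false;               /* nothing new, leave the cache alone */

  entry->max_age = max_age;
  entry->include_subdomains = include_subdomains;
  return true;
}
```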
Ander Juaristi
33d860e1ef Correct HSTS database file description
* src/hsts.c (hsts_store_dump): s/[:port]/<port>/
2016-04-11 16:44:41 +02:00
moparisthebest
54746578e9 Implement --pinnedpubkey option to pin public keys
* doc/wget.texi: Add description for --pinnedpubkey
* src/gnutls.c: New function pkp_pin_peer_pubkey(),
  (ssl_check_certificate): Check pinned cert via pkp_pin_peer_pubkey()
* src/init.c: Add option --pinnedpubkey
* src/main.c: Add option --pinnedpubkey
* src/openssl.c: New function pkp_pin_peer_pubkey(),
  (ssl_check_certificate): Check pinned cert via pkp_pin_peer_pubkey()
* src/options.h: Add new option variable 'pinnedpubkey'
* src/utils.c: New functions wg_pubkey_pem_to_der(), wg_pin_peer_pubkey()
* src/utils.h: Add prototype for wg_pin_peer_pubkey()
2016-04-11 16:18:05 +02:00
Darshit Shah
d26377053d Print the fingerprint instead of the raw pointer in debugging message
* src/metalink.c (retrieve_from_metalink): Fix debug message to print the
fingerprint instead of a pointer.

* Do not delete the ChangeLog file since it is required by the Makefile
and breaks compilation
2016-03-29 15:09:04 +02:00
Darshit Shah
722675553c Revert "Print the fingerprint instead of the raw pointer in debugging message"
This reverts commit b916595168.
2016-03-29 15:07:29 +02:00
Giuseppe Scrivano
f3e63f0071 * metalink.c (retrieve_from_metalink): Fix typo 2016-03-25 16:46:39 +01:00
Giuseppe Scrivano
b916595168 Print the fingerprint instead of the raw pointer in debugging message
* src/metalink.c (retrieve_from_metalink): Fix debug message to print the
fingerprint instead of a pointer.
2016-03-25 16:23:19 +01:00
Tim Rühsen
76ef65b23c Add options --bind-dns-address and --dns-servers
* README.checkout: Add description for libares
* configure.ac: Add check for libares
* doc/wget.texi: Add docs for the new options
* src/build_info.c.in: Add +/-cares for --version output
* src/host.c:
  (merge_address_lists): New static function
  (address_list_from_hostent): New static function
  (wait_ares): New static function
  (callback): New static function
  (lookup_host): Add libares resolver code
* src/init.c: Add new options,
  (cleanup): Add cleanup code
* src/main.c: Add global libares channel variable
  (cmdline_option option_data): Add new options
  (print_help): Add short descriptions
  (main): Add libares init code
* src/options.h (struct options): Add option members

The new options allow specifying alternative DNS servers and
an alternate packet route for the resolver packets.
Wget has to be built with libares, enabled at configure time by
./configure --with-cares.
2016-03-23 09:26:22 +01:00
Tim Rühsen
d7726f8a13 Fix SNI server names with trailing dot(s)
* src/gnutls.c (ssl_connect_wget, ssl_check_certificate): Fix SNI server name
* src/openssl.c (ssl_connect_wget, ssl_check_certificate): Fix SNI server name

Fixes #47408
2016-03-16 11:23:51 +01:00
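One way to normalize a host name before using it for SNI is sketched below. The function sni_hostname_sketch() is a hypothetical name and not the code in src/gnutls.c or src/openssl.c; it only shows stripping trailing dots from a copy of the name.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc'ed copy of HOST without any
   trailing dots ("example.com." -> "example.com").
   Returns NULL if memory is exhausted.  */
char *
sni_hostname_sketch (const char *host)
{
  size_t len = strlen (host);

  while (len > 0 && host[len - 1] == '.')
    len--;

  char *out = malloc (len + 1);
  if (!out)
    return NULL;

  memcpy (out, host, len);
  out[len] = '\0';
  return out;
}
```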
Darshit Shah
7cb9efa668 Fix assertion in Progress bar
* src/progress.c (create_image): Fix off-by-one error in assert()
    statement for progress bar width.
    Reported-By: Gisle Vanem <gvanem@yahoo.no>
2016-03-05 13:27:46 +01:00
Giuseppe Scrivano
44aedd8321 src/url.c: fix make syntax-check 2016-03-03 09:40:39 +01:00
Maks Orlovich
c28f51aadf Parse <img srcset> attributes, they have image URLs.
* src/convert.h: Add link_noquote_html_p to permit rewriting URLs deep
                 inside attributes without adding extraneous quoting
* src/convert.c (convert_links): Honor link_noquote_html_p
* src/html-url.c (tag_handle_img): New function. Add srcset parsing.
2016-03-03 09:38:45 +01:00
Darshit Shah
7099f48998 Sanitize value sent to memset to prevent SEGFAULT 2016-03-01 08:11:13 +01:00
Tim Rühsen
100da11312 Fix writing WARC-Target-URI value
* src/warc.c: Add function warc_write_header_uri(),
  use it for creating WARC-Target-URI

Fixes #47281
2016-02-27 23:08:28 +01:00
Tim Rühsen
cacac6f996 Retain value of errno in logprintf(), logputs() even better
* src/log.c (logprintf,logputs): Save&Restore value of errno

Reported-by: Gisle Vanem <gvanem@yahoo.no>
2016-02-11 10:53:02 +01:00
Tim Rühsen
3056617e9c Retain value of errno in logprintf()
* src/log.c (logprintf): Save&Restore value of errno

Reported-by: Gisle Vanem <gvanem@yahoo.no>
2016-02-10 15:28:10 +01:00
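The save-and-restore pattern named in these two commits can be sketched as follows. log_printf_sketch() is a hypothetical stand-in that writes to stderr, not Wget's logprintf(); it shows why a caller can still inspect errno after logging.

```c
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical logging function: print a formatted message to stderr
   without disturbing the caller's errno.  */
void
log_printf_sketch (const char *fmt, ...)
{
  int saved_errno = errno;      /* stdio calls below may change errno */
  va_list args;

  va_start (args, fmt);
  vfprintf (stderr, fmt, args);
  va_end (args);

  errno = saved_errno;          /* hand the original value back */
}
```

A caller can then log a failure first and call strerror (errno) afterwards without getting a value left behind by the logging code.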
Tim Rühsen
b30500f0f4 Fix Test-iri-forced-remote
* tests/Test-iri-forced-remote.px: Fix encodings
2015-12-20 21:32:06 +01:00
Eli Zaretskii
59b920874d Support non-ASCII URLs
* src/url.c [HAVE_ICONV]: Include iconv.h and langinfo.h.
(convert_fname): New function.
[HAVE_ICONV]: Convert file name from remote encoding to local
encoding.
(url_file_name): Call convert_fname.
(filechr_table): Don't consider bytes in 128..159 as control
characters.

* tests/Test-ftp-iri.px: Fix the expected file name to match the
new file-name recoding.  State the remote encoding explicitly on
the Wget command line.

* NEWS: Mention the URI recoding when built with libiconv.
2015-12-18 20:54:39 +01:00