Commit Graph

2538 Commits

Author SHA1 Message Date
Tim Rühsen
d061e553a1 * src/http.c (initialize_request): Fix regression in .netrc auth
Reported-by: Axel Reinhold
2017-02-06 21:44:18 +01:00
Tim Rühsen
2ddd2b69e4 * src/iri.c (idn_encode): Fix memory leak 2017-02-06 21:39:44 +01:00
Tim Rühsen
31ac36e170 Fix include/define clash with gnulib's unlink module
* src/options.h: Rename options.unlink to options.unlink_requested
* src/init.c: Replace options unlink member by unlink_requested
* src/http.c: Likewise
* src/ftp.c: Likewise
2017-02-04 18:02:54 +01:00
Tim Rühsen
f2c4289557 * src/xattr.h: Fix #define fsetxattr for MacOS and FreeBSD
Reported-by: Zhiming Wang
2017-02-04 15:29:44 +01:00
Tim Rühsen
366d82f349 * src/utils.c: Move macro FMT_MAX_LENGTH into scope 2017-02-03 12:35:49 +01:00
Tim Rühsen
f2574e90b7 * src/utils.c: Fix -Wformat= warnings 2017-02-03 12:33:38 +01:00
Tim Rühsen
81b3aaf75c * src/gnutls.c: Fix -Wformat= warnings 2017-02-03 12:31:51 +01:00
Tim Rühsen
17d2f42a3d * src/iri.c: Remove unused macro IDNA_FLAGS 2017-02-03 12:28:37 +01:00
Tim Rühsen
638df40476 * src/iri.c: Remove use of __func__ macros 2017-02-03 12:28:05 +01:00
Tim Rühsen
00bafe72f1 * src/http.c: Fix -Wformat= warnings 2017-02-03 12:24:41 +01:00
Tim Rühsen
e777c01f43 * src/progress.c: Remove unused macro move_to_end 2017-02-03 12:18:14 +01:00
Tim Rühsen
3ba112ea57 * src/html-parse.c: Remove unused macro SKIP_NON_WS 2017-02-03 12:15:24 +01:00
Tim Rühsen
485fcfcc20 * src/hsts.c: Remove unused macro CHECK_EXPLICIT_PORT 2017-02-03 12:09:18 +01:00
Tim Rühsen
a5094731cd * src/hsts.c: Fix -Wformat= warnings 2017-02-03 12:08:08 +01:00
Tim Rühsen
3186eb2976 * src/hash.c: Explicitly convert float to int 2017-02-03 12:03:50 +01:00
Tim Rühsen
9947663af8 * src/ftp-ls.c: Fix -Wformat= warnings 2017-02-03 11:59:33 +01:00
Tim Rühsen
cfae085665 * src/ftp.c (ftp_retrieve_list): Add default to switch 2017-02-03 11:57:02 +01:00
Tim Rühsen
e69808256b * src/css-url.h: Remove redundant declaration 2017-02-03 11:53:28 +01:00
Tim Rühsen
11989ef669 * src/ftp.c: Fix -Wformat= warning 2017-02-03 11:52:08 +01:00
Tim Rühsen
5fceab6cb9 * src/http.c (test_parse_range_header): Fix constants 2017-02-03 10:32:42 +01:00
Tim Rühsen
d0e02a54ae * src/url.c (mkalldirs): Add newline to log message 2017-02-02 11:11:50 +01:00
Tim Rühsen
2e70409844 * src/cookies.c (check_domain_match): Add newline to DEBUG lines 2017-02-02 11:11:07 +01:00
Tim Rühsen
6de24fe3c0 * src/iri.c: Use TR46 non-transitional for toASCII conversion 2017-01-13 15:53:03 +01:00
Tim Rühsen
0ab3d92c85 * src/main.c: Fix _Noreturn compiler warnings 2017-01-13 15:50:19 +01:00
Tim Rühsen
4cf8af84e0 * src/utils.c: Fix _Noreturn compiler warning 2017-01-13 15:49:05 +01:00
Tim Rühsen
42b8761cbc * src/init.c (setval_internal): Fix sign compare warning 2017-01-13 15:47:02 +01:00
Tim Rühsen
fd0f759597 Replace home-grown portability code by gnulib modules
* bootstrap.conf: Add intprops, inttypes, limits-h, signal-h,
  stat, sys_types
* src/ftp.c: Replace 'struct_stat' by 'struct stat'
* src/hsts.c: Likewise
* src/http.c: Likewise
* src/main.c: Likewise
* src/netrc.c: Likewise
* src/retr.c: Likewise
* src/url.c: Likewise
* src/utils.c: Likewise
* src/sysdep.h: Remove old portability code

Further portability issues should be addressed by gnulib.
2017-01-13 15:38:15 +01:00
Tim Rühsen
a384f5e2e9 Replace WGET_* m4 macros by gnulib modules
* bootstrap.conf: Add hostent, inet_ntop, nanosleep, utimens
* configure.ac: Remove WGET_STRUCT_UTIMBUF, WGET_FNMATCH,
  WGET_NANOSLEEP, WGET_POSIX_CLOCK, WGET_NSL_SOCKET
* m4/wget.m4: Likewise
* src/Makefile.am: Add $(LIB_NANOSLEEP) $(LIB_POSIX_SPAWN) to LDADD
* tests/Makefile.am: Likewise
* src/host.c (print_address): Use inet_ntop also for IPV4
2017-01-13 12:54:35 +01:00
Tim Rühsen
5ae1f37902 Remove libidn vulnerability work-around
* src/iri.c (_utf8_is_valid): Removed

Since we are using libidn2 for IDNs, we no longer need
this work-around.
2017-01-13 12:03:47 +01:00
Tim Rühsen
0eb4a21b6c * src/iri.c (idn_encode): Use TR46 transitional if available 2017-01-13 11:45:18 +01:00
Tim Rühsen
1a01a6b2d0 Fix previous commit 2427ca4ac0 2017-01-07 15:59:11 +01:00
vijeth-aradhya
2427ca4ac0 Fix http.c and ftp.c passwd logic error
* src/ftp.c (getftp): Fix password/user selection
* src/http.c (initialize_request): Likewise

Before, netrc password won over interactive
--ask-password but now --ask-password wins
after change of program logic

Fixes Issue #48811
2017-01-06 16:18:40 +01:00
Giuseppe Scrivano
42c2ce71bc * src/main.c (main): Add missing \n in error message 2016-12-31 17:11:01 +01:00
Giuseppe Scrivano
def133f26f Check that fd_set has no fds bigger than FD_SETSIZE
* src/connect.c: check that the fd is not bigger than FD_SETSIZE
  before using FD_SET.  An fd_set cannot hold fds bigger than
  FD_SETSIZE, causing out-of-bounds write to a buffer on the stack.

Reported by: Jann Horn <jannh@google.com>
2016-12-28 12:24:19 +01:00
Nikos Mavrogiannopoulos
b9ed06afd8 Avoid calling the gnutls priority functions multiple times
* src/gnutls.c (ssl_connect_wget): Call gnutls_set_default_priority()
  for --secure-protocol=auto (default).

The patch fixes a behavior that may have unintended side-effects in
certain gnutls versions. Instead use the default priorities when no
options are given.

Signed-off-by: Nikos Mavrogiannopoulos <nmav@gnutls.org>
2016-12-20 14:48:31 +01:00
Tim Rühsen
1bdc20d774 Print debug message when skipping certain recursive downloads
* src/recur.c (retrieve_tree): Print debug message instead of
  silently skipping recursive downloads.
2016-12-19 12:19:52 +01:00
Rahul Bedarkar
e4e9d3c1c8 Rename base64_{encode,decode} (trivial patch)
* src/http-ntlm.c: Rename base64_{encode,decode}
* src/http.c: Likewise
* src/utils.c: Likewise
* src/utils.h: Likewise

When statically linking with gnutls, we get definition clash error for
base64_encode which is also defined by gnutls.

To prevent definition clash, rename base64_{encode,decode}

Signed-off-by: Rahul Bedarkar <rahul.bedarkar@imgtec.com>
2016-12-14 15:52:52 +01:00
Tim Rühsen
dcdd618b18 Add support for psl_latest()
* configure.ac: Add check for psl_latest(),
  remove --with-psl-file
* src/cookies.c (check_domain_match): Use psl_latest() if available
2016-12-11 21:04:40 +01:00
Piotr Wajda
3c796b9a85 Respect -o parameter again
* log.c: don't choose log output dynamically when opt.lfilename is set

 Regression introduced by dd5c549f6a
 Reported-by: Dale R. Worley
2016-11-09 13:32:14 +01:00
Tim Rühsen
00ae9b4ee2 Move Wget from IDN2003 (libidn) to IDN2008 (libidn2)
* .travis.yml: Install libidn2-dev instead of libidn11-dev.
* bootstrap.conf: Add modules libunistring-optional, unistr/base,
  unicase/tolower.
* configure.ac: Check for libidn2.
* src/Makefile.am: Add $(LTLIBUNISTRING) to LDADD.
* tests/Makefile.am: Set LDADD similar to LDADD in src/Makefile.am
* src/connect.c: Use libidn2 code instead of libidn.
* src/host.c: Likewise.
* src/iri.c: Likewise.
* src/iri.h: Likewise.
* src/options.h: Likewise.
* src/url.c: Likewise.
* src/url.h: Likewise.
* src/log.c: Fix C99 comment.

IDN2003 should not be used any more due to security concerns.
We use libunistring (resp. the unicode code from gnulib) for
lowercasing UTF-8 before we give data to libidn2.
TR#46 is missing, no support in libidn2 nor in libunistring.
2016-11-07 11:03:42 +01:00
Tim Rühsen
be5517f98f * src/metalink.c: Fix typo 'suceeded' -> 'succeeded'
Reported-by: Göran Uddeborg <goeran@uddeborg.se>,
             Anders Jonsson <anders.jonsson@norsjovallen.se>
2016-10-22 22:21:39 +02:00
losgrandes
dd5c549f6a Fixes #45790: wget prints its progress even when in the background
* src/log.c: Use tcgetpgrp(STDIN_FILENO) != getpgrp() to determine when to print to STD* or logfile.
  Deprecate log_request_redirect_output function.
  Use different file handles for STD* and logfile, to easily switch between them when changing fg/bg.
* src/log.h: Make redirect_output function externally linked.
* src/main.c: Don't use deprecated log_request_redirect_output function. Use redirect_output instead.
* src/mswindows.c: Don't use deprecated log_request_redirect_output function. Use redirect_output instead.
2016-10-21 19:33:29 +02:00
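The foreground/background test named in dd5c549f6a compares the terminal's foreground process group with the caller's own process group. A minimal sketch follows; the function name running_in_background is hypothetical and the surrounding log-switching logic of src/log.c is not reproduced.

```c
#include <assert.h>
#include <stdbool.h>
#include <unistd.h>

/* Returns true when the foreground process group of the terminal on fd
   is not our own process group, i.e. we run in the background.
   If fd is not a terminal, tcgetpgrp() returns -1, which never equals
   getpgrp(), so this also reports true. */
bool running_in_background(int fd)
{
    assert(fd >= 0);
    return tcgetpgrp(fd) != getpgrp();
}
```

Calling running_in_background(STDIN_FILENO) corresponds to the expression quoted in the commit message.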
losgrandes
78e0ec5f03 Fixes #46584: wget --spider always returns zero exit status
* src/ftp.c: Return error as exit value if even one file doesn't exist
2016-10-21 10:24:28 +02:00
Tim Rühsen
807d1c7d94 * src/http.c (gethttp): Accept 206 for request w/o Range header
Fixes: #49319
2016-10-12 14:59:36 +02:00
Tim Rühsen
517d799b6f Properly include iconv.h
* src/iri.c: Check HAVE_ICONV to include iconv.h
* src/url.c: Same
2016-10-07 13:24:15 +02:00
Tim Rühsen
e5164a8260 Amend redirection behavior
* src/recur.c (descend_redirect): Ignore WG_RR_LIST and WG_RR_REGEX
  for redirections.
* testenv/Makefile.am: Add Test-recursive-redirect.py
* testenv/Test-recursive-redirect.py: New test

Test-recursive-redirect.py written by Dale R. Worley.

Reported-by: "Dale R. Worley" <worley@ariadne.com>
2016-10-07 11:49:07 +02:00
Matthew White
c403e67935 New: --metalink-over-http Content-Type/Disposition Metalink/XML processing
* src/http.c (metalink_from_http): Process the Content-Type header.
  Add an application/metalink4+xml URL as metalink metaurl.  If the
  option opt.content_disposition is true, the Content-Disposition's
  filename is the metaurl's name
* doc/wget.texi: Update --content-disposition and --metalink-over-http
* doc/metalink-standard.txt: Update doc. Content-Type/Disposition
  processing through --metalink-over-http. Update download naming
  system about --trust-server-names and --content-disposition
* testenv/Makefile.am: Add new files
* testenv/Test-metalink-http-xml-type.py: New file. Metalink/HTTP
  Content-Type/Disposition header automated Metalink/XML tests
* testenv/Test-metalink-http-xml-type-trust.py: New file. Metalink/HTTP
  Content-Type/Disposition header with --trust-server-names automated
  Metalink/XML tests
* testenv/Test-metalink-http-xml-type-content.py: New file. Metalink/HTTP
  Content-Type/Disposition header with --content-disposition automated
  Metalink/XML tests
* testenv/Test-metalink-http-xml-type-trust-content.py: New file.
  Metalink/HTTP Content-Type/Disposition header with --trust-server-names
  and --content-disposition automated Metalink/XML tests

Process the Content-Type header, identify an application/metalink4+xml
file.  The Content-Disposition could provide an alternate name through
the "filename" field for the metalink xml file.  Respectively, the cli
options --metalink-over-http and --content-disposition are required.

During Metalink/XML auto-processing, using the Content-Disposition's
filename also requires the cli option --trust-server-names.
2016-09-30 19:44:06 +02:00
Matthew White
3021466817 Bugfix: Set NULL variable due to --content-disposition to Metalink origin
* src/http.c (http_loop): Prevent SIGSEGV when hstat.local_file is
  NULL, opt.content_disposition has a role in leaving the value unset
* src/http.c (gethttp): If hs->local_file is NULL (aka http_loop()'s
  hstat.local_file), set it to the value of hs->metalink->origin
2016-09-30 19:44:06 +02:00
Matthew White
c89767d8d1 New: --trust-server-names saves Metalink/HTTP xml files using the "name" field
* src/metalink.c (retrieve_from_metalink): If opt.trustservernames is
  true, use the basename of the metaurl's name to save the xml file
* doc/metalink-standard.txt: Update doc. With --trust-server-names any
  Metalink/HTTP Link application/metalink4+xml file is saved using the
  basename of the "name" field, if any. Update Metalink/HTTP examples
* testenv/Makefile.am: Add new file
* testenv/Test-metalink-http-xml-trust-name.py: New file. Metalink/HTTP
  automated Metalink/XML, save xml files using the "name" field tests
2016-09-30 19:44:06 +02:00
Matthew White
f030cdf8e2 Bugfix: Detect when a metalink:file doesn't have any hash
* src/metalink.c (retrieve_from_metalink): Reject any metalink:file
  without hashes. Prompt the error and switch to the next file
* testenv/Makefile.am: Add new file
* testenv/Test-metalink-xml-nohash.py: New file. Metalink/XML with no
  hashes tests

Prevent SIGSEGV.
2016-09-30 19:44:06 +02:00
Matthew White
5dccb2a9ce Bugfix: Detect malformed base64 Metalink/HTTP Digest header
* src/http.c (metalink_from_http): Fix hash_bin_len type. Use ssize_t
  instead of size_t. Reject -1 as base64_decode() return value
* testenv/Makefile.am: Add new file
* testenv/Test-metalink-http-baddigest.py: New file. Metalink/HTTP
  malformed base64 Digest header tests

On malformed base64 input, ssize_t base64_decode() returns -1. Stored
in a size_t variable, that value wraps to a huge number, and passing it
to xmalloc() would exhaust all memory.
2016-09-30 19:44:06 +02:00
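The signedness problem fixed in 5dccb2a9ce can be illustrated as follows. This is a sketch under assumptions: fake_decode is a hypothetical stand-in for a decoder that returns the decoded length or -1, and alloc_for_digest is an illustrative name, not a function from src/http.c.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical stand-in for a decoder: returns the decoded length,
   or -1 on malformed input ('!' is not a base64 character). */
ssize_t fake_decode(const char *in)
{
    assert(in != NULL);
    return (in[0] == '!') ? -1 : 4;
}

/* Returns a buffer of the decoded length, or NULL on malformed input. */
unsigned char *alloc_for_digest(const char *in)
{
    ssize_t len = fake_decode(in);  /* signed, so it can hold -1 */
    if (len < 0)
        return NULL;                /* reject before any allocation */
    return malloc((size_t) len);
}
```

Had len been declared as size_t, the -1 would have become SIZE_MAX and the length check could never fire.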
Matthew White
0538e791fb New option --metalink-index to process Metalink application/metalink4+xml
* NEWS: Mention the effect of --metalink-index over Metalink
* src/init.c: Add new option metalinkindex (opt.metalink_index),
  initialize to -1
* src/main.c: Add new option metalink-index (--metalink-index=NUMBER)
* src/options.h: Add new option metalink_index (int)
* src/metalink.h: Add declaration of functions fetch_metalink_file(),
  replace_metalink_basename()
* src/metalink.c: Add functions fetch_metalink_file() simple file
  fetch, replace_metalink_basename() replace file basename
* src/metalink.c (retrieve_from_metalink): New. Process Metalink
  application/metalink4+xml of opt.metalink_index ordinal number
* doc/wget.texi: Add new option metalink-index (--metalink-index)
  documentation
* doc/metalink-standard.txt: Updated doc. Add documentation about
  Metalink application/metalink4+xml metaurls download naming system
* doc/metalink-standard.txt: Update Metalink/XML and HTTP examples
* testenv/Makefile.am: Add new files
* testenv/Test-metalink-http-xml.py: New file. Metalink/HTTP automated
  Metalink/XML "application/metalink4+xml" --metalink-index tests
* testenv/Test-metalink-http-xml-trust.py: New file. Metalink/HTTP
  automated Metalink/XML "application/metalink4+xml" --metalink-index
  retrieval with --trust-server-names tests

WARNING: Do not use lib/dirname.c (dir_name) to get the directory
name, it may append a dot '.' character to the directory name.
2016-09-30 19:44:06 +02:00
Matthew White
acb1d1a668 Bugfix: Prevent sorting when there are less than two elements
* src/utils.c (stable_sort): Add condition nmemb > 1, sort only when
  there is more than one element

Prevent SIGSEGV.
2016-09-30 19:44:06 +02:00
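The nmemb > 1 guard from acb1d1a668 can be sketched as below. The sort itself is a plain insertion sort chosen for brevity, which is an assumption of this example and not the algorithm used in src/utils.c; the names sort_if_needed and cmp_int are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Comparator for ints, used by callers of sort_if_needed(). */
int cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a, y = *(const int *) b;
    return (x > y) - (x < y);
}

/* Stable insertion sort over nmemb elements of the given size.
   Does nothing for fewer than two elements, so base may be NULL
   when nmemb is 0. */
void sort_if_needed(void *base, size_t nmemb, size_t size,
                    int (*cmp)(const void *, const void *))
{
    if (nmemb > 1) {
        assert(base != NULL && cmp != NULL && size > 0);
        unsigned char *a = base;
        unsigned char *tmp = malloc(size);   /* holds one element */
        if (tmp == NULL)
            return;             /* out of memory: leave array as-is */
        for (size_t i = 1; i < nmemb; i++) {
            size_t j = i;
            memcpy(tmp, a + i * size, size);
            /* Shift strictly greater elements right; equal ones stay
               in front, which keeps the sort stable. */
            while (j > 0 && cmp(a + (j - 1) * size, tmp) > 0) {
                memcpy(a + j * size, a + (j - 1) * size, size);
                j--;
            }
            memcpy(a + j * size, tmp, size);
        }
        free(tmp);
    }
}
```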
Matthew White
628fb565c7 New: Parse Metalink/HTTP header for application/metalink4+xml
* src/http.c (metalink_from_http): Parse Metalink/HTTP header for
  metaurls application/metalink4+xml media types
* src/metalink.h: Add function declaration metalink_meta_cmp()
* src/metalink.c: Add function metalink_meta_cmp() compare metalink
  metaurls priorities

Add Metalink/HTTP application/metalink4+xml media types as metaurls to
the metalink variable that will be used to download the files.
2016-09-30 19:44:05 +02:00
Matthew White
9532861aef Bugfix: Remove surrounding quotes from Metalink/HTTP key's value
* src/metalink.h: Add declaration of function dequote_metalink_string()
* src/metalink.c: Add function dequote_metalink_string() remove
  surrounding quotes from string, \' or \"
* src/metalink.c (find_key_value, find_key_values): Call dequote_metalink_string()
  to remove the surrounding quotes from the parsed value
* src/metalink.c (test_find_key_value, test_find_key_values): Add
  quoted key's values for unit-tests
* testenv/Makefile.am: Add new file
* testenv/Test-metalink-http-quoted.py: New file. Metalink/HTTP quoted
  values tests

Some Metalink/HTTP keys, like "type" [2], may have a quoted value [1]:
Link: <http://example.com/example.ext.meta4>; rel=describedby;
type="application/metalink4+xml"

Wget was expecting a dequoted value from the Metalink module. This
patch addresses this problem.

References:
 [1] Metalink/HTTP: Mirrors and Hashes
     1.1. Example Metalink Server Response
     https://tools.ietf.org/html/rfc6249#section-1.1

 [2] Additional Link Relations
     6. "type"
     https://tools.ietf.org/html/rfc6903#section-6
2016-09-30 19:44:05 +02:00
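The dequoting described in 9532861aef can be sketched as below. This is an illustration only: dequote_in_place is a hypothetical name, it works in place, and it assumes that only one matching pair of surrounding quotes is removed; the signature of the real dequote_metalink_string() is not shown in the commit message.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: strip one pair of matching surrounding quotes
   (' or ") from str in place; other strings are left untouched. */
void dequote_in_place(char *str)
{
    assert(str != NULL);
    size_t len = strlen(str);
    if (len >= 2 && (str[0] == '"' || str[0] == '\'')
        && str[len - 1] == str[0]) {
        memmove(str, str + 1, len - 2);   /* drop the opening quote */
        str[len - 2] = '\0';              /* drop the closing quote */
    }
}
```

Applied to the "type" value from the Link header example above, this turns the quoted "application/metalink4+xml" into the bare media type.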
Matthew White
8aca8fc80d Bugfix: Process Metalink/XML url strings containing white spaces and CRLF
* src/metalink.h: Add declaration of function clean_metalink_string()
* src/metalink.c: Add directive #include "xmemdup0.h"
* src/metalink.c: Add function clean_metalink_string() remove leading
  and trailing white spaces and CRLF from string
* src/metalink.c (retrieve_from_metalink): Remove leading and trailing
  white spaces and CRLF from url resource mres->url
* testenv/Makefile.am: Add new file
* testenv/Test-metalink-xml-urlbreak.py: New test. Metalink/XML white
  spaces and CRLF in url resources tests

White spaces and CRLF are not automatically removed by libmetalink
from url strings. The Wget's Metalink module was unable to process
such url strings. This patch implements the processing of such url
strings cleaning off leading and trailing white spaces and CRLF.

If a parsed Metalink/XML url string contains strings separated by
CRLF, only the first of the series is accepted.
2016-09-30 19:44:05 +02:00
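The cleaning rule from 8aca8fc80d (trim leading and trailing white space and CRLF, keep only the first of several CRLF-separated strings) can be sketched as below. The name clean_url_in_place and the in-place interface are assumptions of this example; the real clean_metalink_string() may differ.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: keep only the first line of str, without
   leading or trailing white space, rewriting str in place. */
void clean_url_in_place(char *str)
{
    assert(str != NULL);
    char *beg = str;
    /* Skip leading white space, including CR and LF. */
    while (*beg && isspace((unsigned char) *beg))
        beg++;
    /* Stop at the first line break: later strings are dropped. */
    size_t len = strcspn(beg, "\r\n");
    /* Trim trailing white space of the part that is kept. */
    while (len > 0 && isspace((unsigned char) beg[len - 1]))
        len--;
    memmove(str, beg, len);
    str[len] = '\0';
}
```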
Matthew White
70360b3eab New: Metalink file size mismatch returns error code METALINK_SIZE_ERROR
* src/wget.h (uerr_t): Add error code METALINK_SIZE_ERROR to enum
* src/metalink.c (retrieve_from_metalink): Use boolean variable
  size_ok, when false set retr_err to METALINK_SIZE_ERROR
* testenv/Makefile.am: Add new file
* testenv/Test-metalink-xml-size.py: New file. Metalink/XML file size
  tests (<size></size>)

Before this patch, no appropriate error code was returned to report a
file size mismatch.

This patch introduces the error code METALINK_SIZE_ERROR to report a
file size mismatch.
2016-09-30 19:44:05 +02:00
Matthew White
c29983a044 New: Metalink/XML and Metalink/HTTP file naming safety rules
* NEWS: Mention the effect of --trust-server-names over Metalink
* src/metalink.h: Add declaration of function append_suffix_number()
* src/metalink.c: Add function append_suffix_number() append number to
  string
* src/metalink.c (retrieve_from_metalink): Safer Metalink/XML and
  Metalink/HTTP download naming system, opt.trustservernames based
* doc/metalink-standard.txt: Update doc. Explain new Metalink/XML and
  Metalink/HTTP download naming system and --trust-server-names role
* testenv/Makefile.am: Add new files
* testenv/Test-metalink-xml-continue.py: Update test. Metalink/XML
  continue/keep existing files (HTTP 416) with --continue tests
* testenv/Test-metalink-xml.py: Update test. Metalink/XML naming tests
* testenv/Test-metalink-xml-trust.py: New file. Metalink/XML naming
  tests with --trust-server-names
* testenv/Test-metalink-xml-abspath.py: Update test. Metalink/XML
  absolute path tests
* testenv/Test-metalink-xml-abspath-trust.py: New file. Metalink/XML
  absolute path tests with --trust-server-names
* testenv/Test-metalink-xml-relpath.py: Update test. Metalink/XML
  relative path tests
* testenv/Test-metalink-xml-relpath-trust.py: New file. Metalink/XML
  relative path tests with --trust-server-names
* testenv/Test-metalink-xml-homepath.py: Update test. Metalink/XML
  home path and ~ (tilde) tests
* testenv/Test-metalink-xml-homepath-trust.py: New file. Metalink/XML
  home path and ~ (tilde) tests with --trust-server-names
* testenv/Test-metalink-xml-prefix.py: New file. Metalink/XML naming
  tests with --directory-prefix
* testenv/Test-metalink-xml-prefix-trust.py: New file. Metalink/XML
  naming tests with --directory-prefix and --trust-server-names
* testenv/Test-metalink-xml-absprefix.py: New file. Metalink/XML
  absolute --directory-prefix tests
* testenv/Test-metalink-xml-absprefix-trust.py: New file. Metalink/XML
  absolute --directory-prefix tests with --trust-server-names
* testenv/Test-metalink-xml-relprefix.py: New file. Metalink/XML
  relative --directory-prefix tests
* testenv/Test-metalink-xml-relprefix-trust.py: New file. Metalink/XML
  relative --directory-prefix tests with --trust-server-names
* testenv/Test-metalink-xml-homeprefix.py: New file. Metalink/XML home
  --directory-prefix tests
* testenv/Test-metalink-xml-homeprefix-trust.py: New file. Metalink/XML
  home --directory-prefix tests with --trust-server-names

The option --trust-server-names allows using the file names parsed
from a Metalink/XML file.  Without --trust-server-names, the safety
mechanism provides secure and predictable file names.
2016-09-30 19:44:05 +02:00
Matthew White
43ec7008f2 Enforce Metalink file name verification, strip directory if necessary
* NEWS: Mention the use of a safe Metalink destination path
* src/metalink.h: Add declaration of functions get_metalink_basename(),
  last_component(), metalink_check_safe_path()
* src/metalink.c: Add directive #include "dosname.h"
* src/metalink.c: Add function get_metalink_basename() to return the
  basename of a file name, strip w32's drive letter prefixes
* src/metalink.c (retrieve_from_metalink): Enforce Metalink file name
  verification, if the file name is unsafe try its basename
* doc/metalink.txt: Update document. Explain --directory-prefix

The function get_metalink_basename() uses FILE_SYSTEM_PREFIX_LEN to
catch any 'C:D:file' (w32 environment), then it removes each drive
letter prefix, i.e. 'C:' and 'D:'.

Unsafe file names contain an absolute, relative, or home path.  Safe
paths can be verified by libmetalink's metalink_check_safe_path().
2016-09-30 19:44:03 +02:00
Matthew White
7d4942864b Implement Metalink/XML --directory-prefix option in Metalink module
* NEWS: Mention the effect of --directory-prefix over Metalink
* src/metalink.c (retrieve_from_metalink): Add opt.dir_prefix as
  prefix to the metalink:file name mfile->name
* doc/metalink.txt: Update document. Explain --directory-prefix

When --directory-prefix=<prefix> is used, set the top of the retrieval
tree to prefix. The default is . (the current directory). Metalink/XML
and Metalink/HTTP files will be downloaded under prefix.
2016-09-27 20:29:03 +02:00
Matthew White
666b7862bf Change mfile->name to filename in Metalink module's messages
* src/metalink.c (retrieve_from_metalink): Change mfile->name to
  filename when referring to the downloaded file

The file name could have been changed by unique_create() (or by any
other means) before downloading. Use the name of the downloaded file
(filename) when printing output which refers to it.
2016-09-27 20:29:03 +02:00
Matthew White
f3f349a0cf Add file size computation in Metalink module
* NEWS: Mention Metalink's file size verification
* src/metalink.c (retrieve_from_metalink): Add file size computation
* doc/metalink.txt: Update document. Remove resolved bugs

Reject downloaded files when they do not agree with their Metalink/XML
metalink:size: https://tools.ietf.org/html/rfc5854#section-4.2.14

At the moment of writing, Metalink/HTTP headers do not provide a file
size field. This information could be obtained from the Content-Length
header field: https://tools.ietf.org/html/rfc6249#section-7
2016-09-27 20:29:03 +02:00
Matthew White
ff444ebc2a Bugfix: Keep the download progress when alternating metalink:url
* NEWS: Mention the effects of --continue over Metalink
* src/metalink.c (retrieve_from_metalink): On download error, resume
  output_stream with the next mres->url. Keep fully downloaded files
  started with --continue, otherwise rename/remove the file
* testenv/Makefile.am: Add new file
* testenv/Test-metalink-xml-continue.py: New file. Metalink/XML
  continue/keep existing files (HTTP 416) with --continue tests

Before this patch, with --continue, existing and/or fully retrieved
files which fail the sanity tests were renamed (--keep-badhash), or
removed.

This patch ensures that --continue doesn't rename/remove existing
and/or fully retrieved files (HTTP 416) which fail the sanity tests.
2016-09-27 20:28:50 +02:00
Matthew White
96554861f9 Bugfix: Fix NULL filename and output_stream in Metalink module
* NEWS: Mention the Metalink "path/file" name format handling
* src/metalink.c (retrieve_from_metalink): Fix NULL filename, set
  filename to the right "path/file" value
* src/metalink.c (retrieve_from_metalink): Fix NULL output_stream, set
  output_stream to filename when it is created by retrieve_url()
* src/metalink.c (retrieve_from_metalink): Add RFC5854 comments about
  proper metalink:file "path/file" name format handling
* doc/metalink.txt: Update document. Remove resolved bugs

If unique_create() cannot create/open the destination file, filename
and output_stream remain NULL. If fopen() is used instead, filename
always remains NULL. Both functions cannot create "path/file" trees.

Setting filename to the right value is sufficient to prevent SIGSEGV
generating from testing a NULL value. This also allows retrieve_url()
to create a "path/file" tree through opt.output_document.

Reading NULL as output_stream, when it shall not be, leads to wrong
results. For instance, a non-NULL output_stream tells when a stream
was interrupted, reading NULL instead means to assume the contrary.

This patch conforms to the RFC5854 specification:
  The Metalink Download Description Format
  4.1.2.1.  The "name" Attribute
  https://tools.ietf.org/html/rfc5854#section-4.1.2.1
2016-09-27 20:17:08 +02:00
Tim Rühsen
a2c4849900 Fix crash on 'srcset' inline URIs
* src/html-url.c (tag_handle_img): Check append_url() for NULL
  return value before dereference.

Crashed reproducibly when parsing srcset="data:..." inline data.
Reported-by: Coverity
2016-09-09 11:44:02 +02:00
Tim Rühsen
40870e1271 * src/hsts.c (hsts_store_open): NULL check param for fclose().
Reported-by: Coverity
2016-09-09 10:22:58 +02:00
Tim Rühsen
15c1e0eb7b * src/ftp-ls.c (ftp_parse_winnt_ls): Fix memset params 2016-09-09 10:22:58 +02:00
Tim Rühsen
eba724a128 * src/utils.c (stable_sort): Use xmalloc instead of malloc 2016-09-09 10:22:58 +02:00
Tim Rühsen
66a9883c8f * src/ftp-ls.c (ftp_parse_winnt_ls): Initialize struct fileinfo cur
Reported-by: Coverity
2016-09-08 16:51:15 +02:00
Tim Rühsen
4febe72bd2 Add const to url param of some functions
* src/http.c: Add const to first param of initialize_request(),
  initialize_proxy_configuration(), establish_connection(),
  check_file_output(), check_auth(), gethttp(), http_loop().
* src/http.h: Add const to first param of http_loop().
2016-09-08 16:13:54 +02:00
Tim Rühsen
03da900c5b * src/recur.c (retrieve_tree): Fix possible NULL dereference
Reported-by: Coverity
2016-09-08 13:04:37 +02:00
Tim Rühsen
b7b67e23cd * src/http.c (initialize_request): Fix check for user
Reported-by: Coverity
2016-09-08 12:48:32 +02:00
Tim Rühsen
22aed3ed4b * src/retr.c (retrieve_url): NULL check mynewloc
Reported-by: Coverity
2016-09-08 12:46:25 +02:00
Tim Rühsen
b4465afa8a * src/utils.c (stable_sort): Reduce tmp allocation size
Reported-by: Coverity
2016-09-08 12:44:17 +02:00
Tim Rühsen
a78b83b1e9 Fix some issues detected by Coverity
* src/connect.c (connect_to_ip): Check return value of setsockopt.
* src/ftp.c (ftp_retrieve_list): Check return value of chmod.
* src/http.c (digest_authentication_encode): Cleanup code.
* src/init.c (setval_internal): Explicitly check comind range.
* src/main.c (main): Explicitly check optarg.
* src/retr.c (retr_rate): Use snprintf instead of sprintf,
  (retrieve_from_file): More verbose error message,
  (rotate_backups): Use snprintf instead of sprintf, check return
  value of rename().
* src/url.c (mkalldirs): Check return value of unlink().
* src/utils.c (strdupdelim): Explicitly check beg and end for NULL,
  (merge_vecs): Fix sizeof argument to char *,
  (stable_sort): Use malloc instead of alloca.
2016-09-08 10:12:02 +02:00
Tim Rühsen
37a5257c66 Code cleanup for --use-askpass
* bootstrap.conf: Add xmemdup0 and strpbrk.
* src/init.c (cmd_use_askpass): Add 'const' to char *,
  remove check for file existence.
* src/main.c (run_use_askpass): C89 compat init of argv,
  added \n to error messages,
  fixed stripping of \n and \r from input,
  make run_use_askpass and use_askpass static.
2016-09-08 09:07:32 +02:00
Tim Rühsen
49af22ca94 * src/http.c (check_file_output): Replace asprintf by aprintf 2016-09-07 09:31:43 +02:00
Liam R. Howlett
21e1725e12 Add --use-askpass=COMMAND support
* doc/wget.texi: Add --use-askpass to documentation.
* src/init.c: Add cmd_use_askpass to set opt.use_askpass based on
argument, WGET_ASKPASS, and SSH_ASKPASS environment variables.
opt.wget-askpass is freed in cleanup ()
* src/main.c: Update options & add spawn process of opt.use_askpass
command.
* src/options.h: Addition of string use_askpass.
* src/url.c: Function scheme_leading_string to access the leading
string of a parsed url.
* src/url.h: Prototype for scheme_leading_string for returning the
leading string.
* bootstrap.conf: Add posix_spawn to gnulib_modules

This adds the --use-askpass option which is disabled by default.

--use-askpass=COMMAND will request the username and password for a given
URL by executing the external program COMMAND.  If COMMAND is left
blank, then the external program in the environment variable
WGET_ASKPASS will be used.  If WGET_ASKPASS is not set then the
environment variable SSH_ASKPASS is used.  If there is no value set, an
error is returned.  If an error occurs requesting the username or
password, wget will exit.

Signed-off-by: Liam R. Howlett <Liam.Howlett@WindRiver.com>
2016-09-03 21:01:24 +02:00
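The lookup order described in 21e1725e12 (COMMAND, then WGET_ASKPASS, then SSH_ASKPASS, else an error) can be sketched as below. The function name pick_askpass is hypothetical, and treating an empty environment variable like an unset one is an assumption of this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: returns the program to run for --use-askpass,
   or NULL if neither the option nor the environment names one. */
const char *pick_askpass(const char *command)
{
    const char *env;

    if (command != NULL && *command != '\0')
        return command;                 /* explicit COMMAND wins */

    env = getenv("WGET_ASKPASS");
    if (env != NULL && *env != '\0')    /* assumption: empty == unset */
        return env;

    env = getenv("SSH_ASKPASS");
    if (env != NULL && *env != '\0')
        return env;

    return NULL;                        /* caller reports an error */
}
```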
Giuseppe Scrivano
690c47e3b1 Append .tmp to temporary files
* src/http.c (struct http_stat): Add `temporary` flag.
(check_file_output): Append .tmp to temporary files.
(open_output_stream): Refactor condition to use hs->temporary instead.

Reported-by: "Misra, Deapesh" <dmisra@verisign.com>
Discovered by: Dawid Golunski (http://legalhackers.com)
2016-08-24 12:29:01 +02:00
Tim Rühsen
9ffb64ba6a Limit file mode to u=rw on temp. downloaded files
* bootstrap.conf: Add gnulib modules fopen, open.
* src/http.c (open_output_stream): Limit file mode to u=rw
on temporary downloaded files.

Reported-by: "Misra, Deapesh" <dmisra@verisign.com>
Discovered by: Dawid Golunski (http://legalhackers.com)
2016-08-24 12:28:55 +02:00
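One way to create an output file that only its owner can read and write, as 9ffb64ba6a describes for temporary downloads, is to pass an explicit mode to open() and wrap the descriptor with fdopen(). This is a sketch of that general POSIX technique, not the code in open_output_stream(); the name open_private is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: open path for writing and, if the file has to
   be created, give it permissions u=rw (0600).  An existing file
   keeps its current permissions.  Returns NULL on failure. */
FILE *open_private(const char *path)
{
    assert(path != NULL);
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
    if (fd < 0)
        return NULL;
    FILE *fp = fdopen(fd, "wb");
    if (fp == NULL)
        close(fd);              /* do not leak the descriptor */
    return fp;
}
```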
Tim Rühsen
0787d7253e * src/css-url.c (get_urls_css): Fix memory leak 2016-08-17 23:13:27 +02:00
Tim Rühsen
964f4646da * src/html-url.c (get_urls_html): Fix memory leak 2016-08-17 23:12:25 +02:00
Tim Rühsen
262baeb113 Improve PSL cookie checking
* configure.ac: Add --with-psl-file to set a PSL file
* src/cookies.c (check_domain_match): Load PSL_FILE with
  fallback to built-in data.

This change allows package maintainers to make Wget use the latest
PSL (DAFSA or plain text), without updating libpsl itself.

E.g. Debian now comes with a DAFSA binary within the 'publicsuffix'
package which allows very fast loading (no parsing or processing needed).
2016-08-17 16:32:26 +02:00
Tobias Stoeckmann
f4aeb41899 Fix stack overflow with way too many cookies
* src/cookies.c (cookie_header): Use heap instead of stack.
* src/http.c (request_send): Likewise.

If wget has to handle an insanely large amount of cookies (~700,000 on
32 bit systems or ~530,000 on 64 bit systems), the stack is not large
enough to hold these pointers, leading to undefined behaviour according
to POSIX; expect a segmentation fault in real life. ;)

Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
2016-08-10 19:59:25 +02:00
Tobias Stoeckmann
a9d49e5b15 Fix signal race condition
The signal handler for SIGALRM calls longjmp, but the handler is
installed before the jump target has been initialized. If another
process sends SIGALRM right between handler installation and target
initialization, the jump leads to undefined behavior.

This can easily be fixed by moving the signal handler installation
into the "SETJMP == 0" conditional block, which means that the target
has just been initialized.

* src/utils.c: call signal after SETJMP.

Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
2016-08-09 17:38:29 +02:00
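The ordering fix from a9d49e5b15 can be sketched as below: the SIGALRM handler is installed only inside the branch where sigsetjmp() has just returned 0, so the jump target is always initialized before the handler can run. The names run_with_timeout, on_alarm and nap are hypothetical, and the sketch is not the code from src/utils.c.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stdbool.h>
#include <unistd.h>

static sigjmp_buf timeout_env;

static void on_alarm(int sig)
{
    (void) sig;
    siglongjmp(timeout_env, 1);
}

/* Example workload: sleep for the number of seconds arg points to. */
void nap(void *arg)
{
    sleep(*(unsigned *) arg);
}

/* Run fn(arg) with a timeout in whole seconds.
   Returns true if the timeout fired before fn returned. */
bool run_with_timeout(unsigned seconds, void (*fn)(void *), void *arg)
{
    assert(fn != NULL);
    if (sigsetjmp(timeout_env, 1) != 0) {
        /* Arrived here from on_alarm(): the call timed out. */
        signal(SIGALRM, SIG_DFL);
        return true;
    }
    /* The jump target is valid now, so installing the handler is safe. */
    signal(SIGALRM, on_alarm);
    alarm(seconds);
    fn(arg);
    alarm(0);                   /* cancel the pending alarm */
    signal(SIGALRM, SIG_DFL);
    return false;
}
```

Installing the handler before the sigsetjmp() call would reopen the window in which a stray SIGALRM jumps to an uninitialized target.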
Jeffery To
0fe79eeacb Remove hyphens from command names
* src/init.c: Remove hyphens from command names
* src/main.c: Likewise

Options with hyphens (or underscores) in their command name cannot be
set in a wgetrc file.

Signed-off-by: Jeffery To <jeffery.to@gmail.com>
2016-08-05 09:45:09 +02:00
Tim Rühsen
e3fb4c3859 * src/metalink.c (badhash_suffix): Fix quoting 2016-08-04 13:09:28 +02:00
Matthew White
943a6d585f Add new option --keep-badhash to keep Metalink's files with a bad hash
* src/init.c: Add keepbadhash
* src/main.c: Add keep-badhash
* src/options.h: Add keep_badhash
* doc/wget.texi: Add docs for --keep-badhash
* src/metalink.h: Add prototypes badhash_suffix(), badhash_or_remove()
* src/metalink.c: New functions badhash_suffix(), badhash_or_remove().
  (retrieve_from_metalink): Call badhash_or_remove() on download error

With --keep-badhash, append .badhash to Metalink's files with checksum
mismatch. (retrieve_from_metalink): unique_create() may append another
suffix to avoid overwriting existing files.

Without --keep-badhash, remove downloaded files with checksum mismatch
(this conforms to the old behaviour).
2016-08-04 12:03:49 +02:00
Tim Rühsen
7fad76db4c * src/metalink.c: Remove C++ style comments 2016-08-03 13:48:07 +02:00
Matthew White
e0b60fd073 New: --continue continues partially downloaded Metalink's files
* src/metalink.c (retrieve_from_metalink): Continue file download if
  opt.always_rest is true

Without --continue, download as a new file with a unique name (this
conforms to the old behaviour).
2016-08-03 13:37:27 +02:00
Matthew White
9db02a0c46 Add support for Metalink's md2, and md4 hashes
* bootstrap.conf: Add crypto/md2, and crypto/md4
* src/metalink.c (retrieve_from_metalink): Add md2, and md4 support

This patch adds support for the deprecated (insecure) md2, and md4
Message-Digest algorithms to the Metalink module.
2016-08-03 12:58:43 +02:00
Matthew White
edad3c1df3 Add support for Metalink's md5, sha1, sha224, sha384, and sha512 hashes
* bootstrap.conf: Add crypto/sha512
* src/metalink.c (retrieve_from_metalink): Add md5, sha1, sha224,
  sha384, and sha512 support

Metalink's checksum verification was limited to sha256. This patch
adds support for md5, sha1, sha224, sha384, and sha512.
2016-08-03 12:49:26 +02:00
Sean Burford
20cac2c5ab Style fixes and DEBUG on setxattr failure.
* src/ftp.c: Fix style.
* src/http.c: Likewise.
* src/xattr.h: Likewise.
* src/xattr.c: Likewise,
  (write_xattr_metadata): Print debug msg on error.
2016-07-27 17:05:57 +02:00
Sean Burford
a933bdd31e Keep fetched URLs in POSIX extended attributes
* configure.ac: Check for xattr availability
* src/Makefile.am: Add xattr.c
* src/ftp.c: Include xattr.h.
  (getftp): Set attributes if enabled.
* src/http.c: Include xattr.h.
  (gethttp): Add parameter 'original_url',
  set attributes if enabled.
  (http_loop): Add 'original_url' to call of gethttp().
* src/init.c: Add new option --xattr.
* src/main.c: Add new option --xattr, add description to help text.
* src/options.h: Add new config member 'enable_xattr'.
* src/xattr.c: New file.
* src/xattr.h: New file.

These attributes provide a lightweight method of later determining
where a file was downloaded from.

This patch changes:
*   autoconf detects whether extended attributes are available and
    enables the code if they are.
*   The new flags --xattr and --no-xattr control whether xattr is enabled.
*   The new command "xattr = (on|off)" can be used in ~/.wgetrc or /etc/wgetrc
*   The original and redirected URLs are recorded as shown below.
*   This works for both single fetches and recursive mode.

The attributes that are set are:
user.xdg.origin.url: The URL that the content was fetched from.
user.xdg.referrer.url: The URL that was originally requested.

Here is an example, where http://archive.org redirects to https://archive.org:
$ wget --xattr http://archive.org
...
$ getfattr -d index.html
user.xdg.origin.url="https://archive.org/"
user.xdg.referrer.url="http://archive.org/"

These attributes were chosen based on those stored by Google Chrome
https://bugs.chromium.org/p/chromium/issues/detail?id=45903
and curl https://github.com/curl/curl/blob/master/src/tool_xattr.c
2016-07-22 13:42:23 +02:00
Noël Köthe
ef372a4f27 Fix typos
* ChangeLog-2014-12-10: invokation -> invocation
* doc/wget.texi: invokation -> invocation
* src/main.c: seperated -> separated
* src/options.h: seperated -> separated
* testenv/README: invokation -> invocation
* testenv/conf/wget_commands.py: invokation -> invocation
2016-07-02 19:01:24 +02:00
Tim Rühsen
309e72c74f Fix compilation for OpenSSL 1.1.0
* src/openssl.c (ssl_init): Use SSL_is_init_finished() instead of
  SSL_state(), conditionally skip SSLeay function calls

The python test suite makes SSL_peek() hang, consuming 100% CPU time.
This does not happen on real-world TLS connections, but it needs
investigation.
2016-06-30 13:24:33 +02:00
Ander Juaristi
cdc3e28d8e Bypass world-writable checks on Windows
* src/hsts.c (hsts_file_access_valid): we should check for "world-writable"
   files only on Unix-based systems. It's difficult to mimic the same behavior
   on Windows, so it's better to just not do it.

Reported-by: Gisle Vanem <gvanem@yahoo.no>
Reported-by: Eli Zaretskii <eliz@gnu.org>
2016-06-27 09:54:32 +02:00
Tim Rühsen
e1e7afb210 Use ICONV_CONST to avoid type warning for iconv()
* src/iri.c (do_conversion): Cast 2. param of iconv() to
 'ICONV_CONST char **'
* src/url.c (convert_fname): Likewise
2016-06-12 21:51:34 +02:00
Tim Rühsen
7e585fe23d Remove check for HAVE_ICONV in src/url.c
* src/url.c: Remove check for HAVE_ICONV
2016-06-12 21:49:23 +02:00
Tim Rühsen
d75f43f083 Include gnulib fcntl.h instead of sys/fcntl.h
* src/gnutls.c: Include gnulib fcntl.h
2016-06-12 17:06:31 +02:00
Tim Rühsen
d4f97dc9af Add libraries to LDADD for wget
* src/Makefile.am: Add $(GETADDRINFO_LIB) $(HOSTENT_LIB) $(INET_NTOP_LIB)
 $(LIBSOCKET) $(LIB_CLOCK_GETTIME) $(LIB_CRYPTO) $(LIB_SELECT)
 $(LTLIBICONV) $(LTLIBINTL) $(LTLIBTHREAD) $(SERVENT_LIB) to LDADD
2016-06-12 17:02:12 +02:00
Giuseppe Scrivano
e996e322ff ftp: understand --trust-server-names on a HTTP->FTP redirect
If --trust-server-names is not used, FTP also takes the destination
file name from the original URL specified by the user instead of the
redirected URL.  Closes CVE-2016-4971.

* src/ftp.c (ftp_get_listing): Add argument original_url.
(getftp): Likewise.
(ftp_loop_internal): Likewise.  Use original_url to generate the
file name if --trust-server-names is not provided.
(ftp_retrieve_glob): Likewise.
(ftp_loop): Likewise.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2016-06-09 15:02:49 +02:00
Tim Rühsen
2bdfc4f521 Fix warnings for --disable-iri configure flag
* src/iri.h: Fix #define for parse_charset
* src/html-url.c: Surround some IRI code parts by #ifdef ENABLE_IRI
* src/http.c: Likewise
* src/iri.h: Likewise
* src/recur.c: Likewise
* src/retr.c: Likewise
2016-06-07 12:52:59 +02:00
Tim Rühsen
2c736abb4c Fix warning about redefinition of MAP_FAILED
* src/sysdep.h: Removed definition of MAP_FAILED
* src/utils.c: Check and define MAP_FAILED after including sys/mmap.h
2016-06-07 09:56:01 +02:00
Ander Juaristi
5224d752a5 Correct HSTS debug message
* src/main.c (save_hsts): save the in-memory HSTS database to a file
   only if something changed.
 * src/hsts.c (struct hsts_store): new field 'changed'.
   (hsts_match): update field 'changed' accordingly.
   (hsts_store_entry): update field 'changed' accordingly.
   (hsts_store_has_changed): new function.
 * src/hsts.h (hsts_store_has_changed): new function.
2016-05-26 16:37:51 +02:00
Ander Juaristi
2aaf12990c Check the HSTS file is not world-writable
* hsts.c (hsts_file_access_valid): check that the file is a regular
   file, and that it's not world-writable.
   (hsts_store_open): if the HSTS database file does not meet the
   above requirements, disable HSTS altogether.
2016-05-26 16:29:29 +02:00
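A minimal sketch of the check described in 2aaf12990c, assuming a POSIX stat(). The name file_access_valid and the choice to treat a failed stat() as invalid are assumptions, not a copy of hsts_file_access_valid().

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <sys/stat.h>

/* Returns true if `path` is a regular file that is not world-writable.
   A path that cannot be stat()ed is treated as invalid (an assumption). */
bool file_access_valid(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return false;

    return S_ISREG(st.st_mode) && !(st.st_mode & S_IWOTH);
}
```

A caller in the spirit of hsts_store_open would skip loading the database when this returns false.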
Tim Rühsen
a952f81f3e Remove special handling for Emacs in progress bar code
* src/progress.c: Remove special 'emacs' code

Fixes #47989
2016-05-23 21:46:29 +02:00
Jernej Simončič
42cc84b6b6 Fix xsleep() for Windows (trivial change)
* src/mswindows.c (xsleep): Fix check for number of seconds
2016-04-25 15:50:23 +02:00
Sergio Gelato
96ab9cad88 More accurate log message from do_conversion()
* src/iri.c (do_conversion): More accurate log message
2016-04-17 15:28:48 +02:00
Tim Rühsen
268163444d Include sys/select.h if HAVE_LIBCARES
* src/host.c: Include sys/select.h if HAVE_LIBCARES

Reported-by: Gisle Vanem <gvanem@yahoo.no>
2016-04-17 14:18:55 +02:00
Gisle Vanem
53800415a9 Fix Windows gnulib/c-ares incompatibility of select()
* src/host.c: Undef 'select' on Windows
2016-04-17 14:15:51 +02:00
Ander Juaristi
2f1c6a05c8 Strictly comply with RFC 6797
* src/hsts.c (hsts_store_entry): strictly comply with RFC 6797.

RFC 6797 states in section 8.1 that the UA's cached information should
only be updated if:

    "either or both of the max-age and includeSubDomains header field
    value tokens are conveying information different than that already
    maintained by the UA."
2016-04-11 16:44:47 +02:00
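The rule quoted from RFC 6797 section 8.1 in 2f1c6a05c8 can be sketched as below. The struct layout and the name hsts_update_entry are assumptions for illustration and are simpler than Wget's own hsts_store_entry().

```c
#include <stdbool.h>
#include <time.h>

/* Simplified stand-in for a cached HSTS entry (field names are assumed). */
struct hsts_cached {
    time_t created;
    time_t max_age;
    bool include_subdomains;
};

/* Update `e` only if max-age or includeSubDomains conveys information
   different from what is already cached. Returns true if `e` changed. */
bool hsts_update_entry(struct hsts_cached *e, time_t now,
                       time_t max_age, bool include_subdomains)
{
    if (e->max_age == max_age && e->include_subdomains == include_subdomains)
        return false;                        /* nothing new: leave entry alone */

    e->created = now;
    e->max_age = max_age;
    e->include_subdomains = include_subdomains;
    return true;
}
```

Note that in this sketch an unchanged header does not refresh the creation time.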
Ander Juaristi
33d860e1ef Correct HSTS database file description
* src/hsts.c (hsts_store_dump): s/[:port]/<port>/
2016-04-11 16:44:41 +02:00
moparisthebest
54746578e9 Implement --pinnedpubkey option to pin public keys
* doc/wget.texi: Add description for --pinnedpubkey
* src/gnutls.c: New function pkp_pin_peer_pubkey(),
  (ssl_check_certificate): Check pinned cert via pkp_pin_peer_pubkey()
* src/init.c: Add option --pinnedpubkey
* src/main.c: Add option --pinnedpubkey
* src/openssl.c: New function pkp_pin_peer_pubkey(),
  (ssl_check_certificate): Check pinned cert via pkp_pin_peer_pubkey()
* src/options.h: Add new option variable 'pinnedpubkey'
* src/utils.c: New functions wg_pubkey_pem_to_der(), wg_pin_peer_pubkey()
* src/utils.h: Add prototype for wg_pin_peer_pubkey()
2016-04-11 16:18:05 +02:00
Darshit Shah
d26377053d Print the fingerprint instead of the raw pointer in debugging message
* src/metalink.c (retrieve_from_metalink): Fix debug message to print the
fingerprint instead of a pointer.

* Do not delete the ChangeLog file since it is required by the Makefile
and breaks compilation
2016-03-29 15:09:04 +02:00
Darshit Shah
722675553c Revert "Print the fingerprint instead of the raw pointer in debugging message"
This reverts commit b916595168.
2016-03-29 15:07:29 +02:00
Giuseppe Scrivano
f3e63f0071 * metalink.c (retrieve_from_metalink): Fix typo 2016-03-25 16:46:39 +01:00
Giuseppe Scrivano
b916595168 Print the fingerprint instead of the raw pointer in debugging message
* src/metalink.c (retrieve_from_metalink): Fix debug message to print the
fingerprint instead of a pointer.
2016-03-25 16:23:19 +01:00
Tim Rühsen
76ef65b23c Add options --bind-dns-address and --dns-servers
* README.checkout: Add description for libares
* configure.ac: Add check for libares
* doc/wget.texi: Add docs for the new options
* src/build_info.c.in: Add +/-cares for --version output
* src/host.c:
  (merge_address_lists): New static function
  (address_list_from_hostent): New static function
  (wait_ares): New static function
  (callback): New static function
  (lookup_host): Add libares resolver code
* src/init.c: Add new options,
  (cleanup): Add cleanup code
* src/main.c: Add global libares channel variable
  (cmdline_option option_data): Add new options
  (print_help): Add short descriptions
  (main): Add libares init code
* src/options.h (struct options): Add option members

The new options allow specifying alternative DNS servers and
an alternate packet route for the resolver packets.
Wget has to be built with libares, enabled at configure time by
./configure --with-cares.
2016-03-23 09:26:22 +01:00
Tim Rühsen
d7726f8a13 Fix SNI server names with trailing dot(s)
* src/gnutls.c (ssl_connect_wget, ssl_check_certificate): Fix SNI server name
* src/openssl.c (ssl_connect_wget, ssl_check_certificate): Fix SNI server name

Fixes #47408
2016-03-16 11:23:51 +01:00
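One way to handle the case fixed in d7726f8a13 is to strip trailing dots before the host name is used for SNI. This is a sketch only: the helper name sni_hostname is hypothetical and the TLS library calls are left out.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of `host` without trailing dots,
   or NULL on allocation failure. The caller frees the result. */
char *sni_hostname(const char *host)
{
    size_t len = strlen(host);
    char *copy;

    while (len > 0 && host[len - 1] == '.')
        len--;

    copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;

    memcpy(copy, host, len);
    copy[len] = '\0';
    return copy;
}
```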
Darshit Shah
7cb9efa668 Fix assertion in Progress bar
* src/progress.c (create_image): Fix off-by-one error in assert()
    statement for progress bar width.
    Reported-By: Gisle Vanem <gvanem@yahoo.no>
2016-03-05 13:27:46 +01:00
Giuseppe Scrivano
44aedd8321 src/url.c: fix make syntax-check 2016-03-03 09:40:39 +01:00
Maks Orlovich
c28f51aadf Parse <img srcset> attributes, they have image URLs.
* src/convert.h: Add link_noquote_html_p to permit rewriting URLs deep
                 inside attributes without adding extraneous quoting
* src/convert.c (convert_links): Honor link_noquote_html_p
* src/html-url.c (tag_handle_img): New function. Add srcset parsing.
2016-03-03 09:38:45 +01:00
Darshit Shah
7099f48998 Sanitize value sent to memset to prevent SEGFAULT 2016-03-01 08:11:13 +01:00
Tim Rühsen
100da11312 Fix writing WARC-Target-URI value
src/warc.c: Add function warc_write_header_uri(),
            Use it for creating WARC-Target-URI

Fixes #47281
2016-02-27 23:08:28 +01:00
Tim Rühsen
cacac6f996 Retain value of errno in logprintf(), logputs() even better
* src/log.c (logprintf,logputs): Save&Restore value of errno

Reported-by: Gisle Vanem <gvanem@yahoo.no>
2016-02-11 10:53:02 +01:00
Tim Rühsen
3056617e9c Retain value of errno in logprintf()
* src/log.c (logprintf): Save&Restore value of errno

Reported-by: Gisle Vanem <gvanem@yahoo.no>
2016-02-10 15:28:10 +01:00
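The save-and-restore pattern from 3056617e9c and cacac6f996 looks roughly like this. log_message is a hypothetical stand-in for logprintf(), and only the errno handling is the point.

```c
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical logger that leaves the caller's errno untouched. */
void log_message(FILE *out, const char *fmt, ...)
{
    int saved_errno = errno;                 /* remember the caller's errno */
    va_list args;

    va_start(args, fmt);
    vfprintf(out, fmt, args);                /* may modify errno */
    va_end(args);
    fflush(out);

    errno = saved_errno;                     /* hand back the original value */
}
```

This lets a caller log a message between a failing system call and its own check of errno without losing the error code.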
Tim Rühsen
b30500f0f4 Fix Test-iri-forced-remote
* tests/Test-iri-forced-remote.px: Fix encodings
2015-12-20 21:32:06 +01:00
Eli Zaretskii
59b920874d Support non-ASCII URLs
* src/url.c [HAVE_ICONV]: Include iconv.h and langinfo.h.
(convert_fname): New function.
[HAVE_ICONV]: Convert file name from remote encoding to local
encoding.
(url_file_name): Call convert_fname.
(filechr_table): Don't consider bytes in 128..159 as control
characters.

* tests/Test-ftp-iri.px: Fix the expected file name to match the
new file-name recoding.  State the remote encoding explicitly on
the Wget command line.

* NEWS: Mention the URI recoding when built with libiconv.
2015-12-18 20:54:39 +01:00
Tim Rühsen
cbbeca2af4 Cleanup code
* src/iri.c (do_conversion): Code cleanup
2015-12-17 21:01:50 +01:00
Eli Zaretskii
93c1517c40 Set URI encoding when redirected
* src/retr.c (retrieve_url): Set URI on redirection
2015-12-17 15:27:43 +01:00
Tim Rühsen
bf5d7e9236 Remove requesting X/Open 5, POSIX 1995
* src/sysdep.h: Remove #define _XOPEN_SOURCE 500
2015-12-17 12:11:53 +01:00
Eli Zaretskii
94d9b68db9 Avoid hanging on MS-Windows when invoked with --connect-timeout
* src/connect.c (connect_to_ip) [WIN32]: Don't call fd_close if
the connection timed out, to avoid hanging.
2015-12-16 15:06:45 +01:00
Tim Rühsen
be7d19f478 Fix iconv conversion
* src/iri.c: Kick out the last converted character from iconv()

Thanks to Eli Zaretskii <eliz@gnu.org> for suggesting the fix.
Reported-by: "Andries E. Brouwer" <Andries.Brouwer@cwi.nl>
2015-12-15 10:55:41 +01:00
Ander Juaristi
478a584609 Fix leak in HSTS code
* src/hsts.c (hsts_store_open): close fp if open.
2015-12-13 16:10:16 +01:00
Ander Juaristi
994c4dcce7 Remove unused variable in ftp code
* src/ftp.c (getftp): fix compiler warning for unused variable.
2015-12-13 16:06:53 +01:00
Jernej Simončič
bf56bf4560 * src/metalink.c: Specify 'rb' as mode to open file 2015-12-11 09:58:30 +01:00
Ander Juaristi
160f0e908f Fix Coverity issues
* src/ftp.c (getftp): on error, close the file and attempt to remove it
   before exiting.
 * src/hsts.c (hsts_store_open): update modification time in the end.
2015-12-10 23:21:27 +01:00
Darshit Shah
9933da2b9f Fix remaining bugs in progress bar implementation
* src/progress.c (create_image): Ensure that the entire screen width is
drawn every time to prevent any artefacts from leaking through.
2015-12-10 13:43:45 +01:00
Darshit Shah
636a5f9a1c Eliminate more compiler warnings
* src/options.h (CHECK_CERT_MODES): Remove C99 style comma after last
value
* src/progress.c (create_image): Do not mix statements and declarations
* src/init.c (cmd_boolean_internal): Mark unused parameters
2015-12-09 09:26:24 +01:00
Darshit Shah
2257d3ebf8 Fix progress bar assertion with multibyte locales
* src/progress.c (bar_create): Define size of progress buffer explicitly
  (create_image): Clean up progress bar image creation. Use memset
  instead of for loops to create arrays of the same byte.
2015-12-09 09:26:24 +01:00
Ygal Blum
ad5a283528 Fix compilation when without-ssl is selected 2015-12-03 16:12:35 +01:00
Darshit Shah
3dd2e78256 Include Metalink and GPG information in version
* src/build_info.c.in: Include the presence of Metalink and GPGME features in
the output for wget --version
2015-12-03 16:02:51 +01:00
Giuseppe Scrivano
81061571d1 Add --check-certificate=quiet
* doc/wget.texi: Add documentation for  --check-certificate=quiet.
* src/options.h (enum CHECK_CERT_MODES): New enum.
* src/init.c (cmd_check_cert): New static function.
(cmd_boolean_internal): Likewise.
* src/gnutls.c (ssl_check_certificate): Handle CHECK_CERT_QUIET.
* src/openssl.c (ssl_check_certificate): Handle CHECK_CERT_QUIET.
2015-12-03 11:49:55 +01:00
Tim Rühsen
4e37fb6191 Fix regression in HTTP authentication
* src/http.c (initialize_request): Fix wrong params to search_netrc()

Regression introduced in commit 29850e77
Reported-by: Axel Reinhold <axel@freakout.de>
2015-11-24 10:39:39 +01:00
Tim Rühsen
218d81f6e5 Fix SIGSEGV in -N / --content-disposition combination
* src/http.c (http_loop): Fix SIGSEGV

Reported-by: "Schleusener, Jens" <Jens.Schleusener@t-online.de>
2015-11-23 15:10:00 +01:00
Ander Juaristi
46cd721c0f Fix potential NULL pointer dereference
* src/gnutls.c (ssl_connect_wget): check for NULL before calls
2015-11-20 19:22:25 +01:00
Tim Rühsen
99aa7b4f5e Fix HSTS memory issue + test code issue
* src/hsts.c (hsts_find_entry): Fix freeing memory
  (hsts_remove_entry): Remove freeing host member
  (hsts_match): Free host member here
  (hsts_store_entry): Free host member here
  (test_url_rewrite): Fix 'created' value
  (test_hsts_read_database): Fix 'created' value

Reported-by: Dagobert Michelsen <dam@opencsw.org>
2015-11-19 12:20:35 +01:00
Tim Rühsen
76da642aaf Include errno.h instead of sys/errno.h (Solaris issue)
* src/metalink.c: Include errno.h instead of sys/errno.h

Reported-by: Dagobert Michelsen <dam@opencsw.org>
2015-11-17 14:42:25 +01:00
Darshit Shah
2cfcadf5e6 Fix compile error when IPv6 is disabled
* src/ftp-basic.c: The code for the new FTPS functionality was unintentionally
inside a #ifdef IPV6 block. Move the code around so that it is defined even when
IPV6 isn't used
2015-11-17 13:40:44 +01:00
Darshit Shah
4ed540ddc7 Eliminate NDEBUG redefined warnings
* src/wget.h: Define NDEBUG only if it hasn't been defined before
2015-11-16 23:53:59 +01:00
Giuseppe Scrivano
2b418d1146 Prepare release 1.17
* gnulib: sync with upstream.
* NEWS: Update.
* src/main.c: Change the copyright year.
2015-11-15 15:00:55 +01:00
Tim Rühsen
6cdfc9c143 Do not download/save file on error when --spider enabled
* src/http.c (gethttp,http_loop):
  Do not download/save file on error when --spider is enabled and not
  working recursively.

Reported-by: Сковорода Никита Андреевич chalkerx@gmail.com
Fixes #45821
2015-11-03 14:29:36 +01:00
Tim Rühsen
b14eeb5aee Fix URL conversion for colons in filenames
* src/convert.c (construct_relative): Prepend './' to filename
* tests/Test-k.px: Amend test to succeed
2015-10-27 13:13:54 +01:00
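A sketch of the idea in b14eeb5aee: a relative link whose first path segment contains a colon can be mistaken for a URL with a scheme, so "./" is prepended. The helper name protect_leading_colon and the exact condition are assumptions; construct_relative() in src/convert.c may decide differently.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of `name`, prefixed with "./" when a ':'
   occurs before the first '/'. Returns NULL on allocation failure. */
char *protect_leading_colon(const char *name)
{
    const char *colon = strchr(name, ':');
    const char *slash = strchr(name, '/');
    bool needs_prefix = colon != NULL && (slash == NULL || colon < slash);
    size_t len = strlen(name);
    char *out = malloc(len + 3);             /* "./" + name + NUL */

    if (out == NULL)
        return NULL;

    if (needs_prefix) {
        memcpy(out, "./", 2);
        memcpy(out + 2, name, len + 1);
    } else {
        memcpy(out, name, len + 1);
    }
    return out;
}
```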
Tim Rühsen
71979f1643 Adjust indentation of --no-use-server-timestamps in help output
* src/main.c: Adjust indentation of --no-use-server-timestamps
2015-10-15 21:09:59 +02:00
Ander Juaristi
4ad201a7e7 Added --convert-file-only option
* src/convert.c (convert_links_in_hashtable, convert_links):
   test for CO_CONVERT_BASENAME_ONLY.
   (convert_basename): new function.
 * src/convert.h: new constant CO_CONVERT_BASENAME_ONLY.
 * src/init.c, src/main.c, src/options.h: new option "--convert-file-only".
 * doc/wget.texi: updated documentation.

 Reviewed-by: Gabriel Somlo <somlo@cmu.edu>
2015-10-13 16:17:20 +02:00
Ander Juaristi
f5a63e3100 Fix potential race condition
* src/hsts.c (hsts_read_database): get an open file handle
   instead of a file name.
   (hsts_store_dump): get an open file handle
   instead of a file name.
   (hsts_store_open): open the file and pass the open file handle.
   (hsts_store_save): lock the file before the read-merge-dump
   process.

 Reported-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
2015-10-09 10:13:51 +02:00
Ander Juaristi
077e897819 Fix HSTS merge bug
* src/hsts.c (hsts_store_merge): call hsts_new_entry() if the entry
   does not exist in the database.

When merging the existing HSTS database on disk with the one in memory,
the entries that were on disk but not in memory were ignored. Thus,
only the existing entries were merged. This behavior was only triggered
when more than one Wget process was using the same HSTS database
simultaneously. This commit fixes the bug by adding the new entries
to the in-memory database if they were not found there.
2015-10-09 10:13:23 +02:00
Tim Rühsen
26fadc55c2 Handle TLS rehandshakes in GnuTLS code
* src/gnutls.c: New static function _do_handshake()
* src/gnutls.c (wgnutls_read_timeout): Handle rehandshake
* src/gnutls.c (wgnutls_write): Handle rehandshake
* src/gnutls.c (ssl_connect_wget): Move handshake code into _do_handshake()

Fixes #46061
2015-09-28 16:18:33 +02:00
Darshit Shah
c387db6451 Do not test for impossible qop value
* http.c (digest_authentication_encode): Wget already errors out if
    qop != "auth". It therefore makes no sense to test for qop == "auth-int"
    later on. Currently, Wget does not support the "auth-int" qop value,
    and until somebody requests it, it may remain so.
2015-09-22 16:36:40 +05:30
Darshit Shah
12dfc03116 Fix #46024. Support RFC 2069 Digest Authentication
* http.c (digest_authentication_encode): Some servers are still
    using the obsolete RFC 2069 Digest Authentication. Allow Digest
    authentication without the qop parameter for this.

    Reported-by: Andreas Longwitz  <longwitz@incore.de>
2015-09-22 15:41:22 +05:30
Darshit Shah
3ea0beec6f Revert "Disable progress bar when wget is backgrounded (trivial patch)"
This reverts commit e624732563.
2015-09-21 19:41:38 +05:30
Ander Juaristi
f8901af4e0 Added support for FTPS
* doc/wget.texi: updated documentation to reflect the new FTPS functionality.
 * src/ftp-basic.c (ftp_greeting): new function to read the server's greeting.
   (ftp_login): greeting code was previously here. Moved to ftp_greeting to
   support FTPS implicit mode.
   (ftp_auth): wrapper around the AUTH TLS command.
   (ftp_ccc): wrapper around the CCC command.
   (ftp_pbsz): wrapper around the PBSZ command.
   (ftp_prot): wrapper around the PROT command.
 * src/ftp.c (get_ftp_greeting): new static function.
   (init_control_ssl_connection): new static function to start SSL/TLS on the
   control channel.
   (getftp): added hooks to support FTPS commands (RFCs 2228 and 4217).
   (ftp_loop_internal): test for new FTPS error codes.
 * src/ftp.h: new enum 'prot_level' with available FTPS protection levels +
   prototypes of previous functions. New flag for enum 'wget_ftp_fstatus' to track
   whether the data channel has some security mechanism enabled or not.
 * src/gnutls.c (struct wgnutls_transport_context): new field 'session_data'.
   (wgnutls_close): free GnuTLS session data before exiting.
   (ssl_connect_wget): save/resume SSL/TLS session.
 * src/http.c (establish_connection): refactor ssl_connect_wget call.
   (metalink_from_http): take into account SCHEME_FTPS as well.
 * src/init.c, src/main.c, src/options.h: new command line/wgetrc options.
   (main): in recursive downloads, check for SCHEME_FTPS as well.
 * src/openssl.c (struct openssl_transport_context): new field 'sess'.
   (ssl_connect_wget): save/resume SSL/TLS session.
 * src/retr.c (retrieve_url): check new scheme SCHEME_FTPS.
 * src/ssl.h (ssl_connect_wget): refactor. New parameter of type 'int *'.
 * src/url.c. src/url.h: new scheme SCHEME_FTPS.
 * src/wget.h: new FTPS error codes.
 * src/metalink.h: support FTPS scheme.
2015-09-14 10:16:44 +02:00
Christian Neukirchen
e624732563 Disable progress bar when wget is backgrounded (trivial patch)
* src/progress.c (create_image): progress only when in foreground

Sometimes I start wget, but the remote site is too slow, so I rather
want to run it in background, however when I simply use job control
for that, wget will keep spewing the progress bar all over my
terminal.  I have found the SIGHUP/SIGUSR1 feature to redirect output
to a log file, but I think the following small patch is even more
useful, since the progress bar will simply resume when wget is
foregrounded again (also, the final message is still printed to the
terminal in any case):
2015-09-10 10:26:29 +02:00
Hubert Tarasiuk
84b9abbf3c Do not free Metalink structure if not initialized
* src/main.c (main): Move metalink_delete to the conditional block.
2015-09-02 09:17:37 +02:00
Ander Juaristi
ab47d9fa3a Extra debug traces for HSTS.
* src/main.c (load_hsts, save_hsts): added DEBUGP() calls to signal
   reads and saves of the HSTS database file.
2015-09-01 13:50:40 +02:00
Darshit Shah
187edb604a Fix coding style violation in last commit
* http.c (test_parse_range_header): Declare loop variable
    explicitly. Not in gnu99 standard.
2015-08-31 21:04:54 +05:30
Darshit Shah
b06fca60ac Add unit test for parse_content_range() method
* http.c (test_parse_range_header): New function to test the
    function for parsing the HTTP/1.1 Content-Range header.
    * test.[ch]: Same
    * http.c (parse_content_range): Fix parsing code. Fail on scenarios
    mentioned in rfc 7233.
2015-08-30 21:34:32 +05:30
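A rough sketch of Content-Range validation in the spirit of b06fca60ac, following the RFC 7233 rules that the last byte position must not precede the first and that a known complete length must exceed the last position. parse_range is a hypothetical name, and the sscanf()-based parsing is more lenient about whitespace than a strict parser would be.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Parse a value such as "bytes 0-499/1234" or "bytes 0-499/*".
   On success stores the positions and returns true;
   *length is set to -1 when the complete length is unknown ("*"). */
bool parse_range(const char *value, long long *first,
                 long long *last, long long *length)
{
    int consumed = 0;
    const char *rest;

    /* %n is only reached if the '/' matched, so consumed stays 0 otherwise. */
    if (sscanf(value, "bytes %lld-%lld/%n", first, last, &consumed) != 2
        || consumed == 0)
        return false;

    rest = value + consumed;
    if (strcmp(rest, "*") == 0) {
        *length = -1;
    } else {
        int end = 0;
        if (sscanf(rest, "%lld%n", length, &end) != 1
            || rest[end] != '\0' || *length < 0)
            return false;
    }

    if (*first < 0 || *last < *first)
        return false;                        /* invalid byte range */
    if (*length != -1 && *length <= *last)
        return false;                        /* range exceeds complete length */
    return true;
}
```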
Tim Rühsen
c809398e8c Fix null pointer dereference
* src/metalink.c (gpg_skip_verification):
  Check output_stream before fclose
2015-08-30 14:17:47 +02:00
Tim Rühsen
88a1a79bc1 Fix leaks found by Coverity
* src/http.c (parse_strict_transport_security): Free c_max_age
             (open_output_stream): Fix indentation
* src/iri.c (locale_to_utf8): Free new
2015-08-30 14:10:25 +02:00
Tim Rühsen
398699c438 Fix two leaks found by Coverity
* src/http.c (gethttp): Do not leak 'message'.
* src/main.c (format_and_print_line): Do not leak 'line_dup'.
2015-08-29 22:35:29 +02:00
Tim Rühsen
d3504b9261 Fix resource leak discovered by Coverity
* src/retr.c (retrieve_url): Don't leak local_file.
2015-08-29 22:15:34 +02:00
Darshit Shah
6b5acff566 Fix memory leaks in unit-test
* hsts.c (get_hsts_store_filename): Free the homedir value
    (close_hsts_test_store): Actually free the store struct too
    (test_hsts_new_entry): Pass store to close_hsts_test_store()
    (test_hsts_url_rewrite_superdomain): Same
    (test_hsts_url_rewrite_congruent): Same
    (test_hsts_read_database): Same and homedir and store filename
    * http.c (test_parse_content_disposition): Free the returned
    filename
    * url.c (test_append_uri_pathel): Free allocated string
2015-08-29 22:52:49 +05:30
Darshit Shah
5c4489db9b Fix mixed-indentation in http.c
* http.c: Fix mixed indentation. Visual change only.
2015-08-29 09:45:13 +05:30
Tim Rühsen
7bed9a6f8f Suppress debug output when strings may contain password
* iri.c (do_conversion): Do not print out converted strings if they
  contain an '@'. That could be an URL with embedded password.

Fixes #45825
2015-08-27 09:55:13 +02:00
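The check described in 7bed9a6f8f amounts to a single strchr() call. In this sketch the hypothetical helper debug_printable substitutes a placeholder, whereas the commit says Wget skips the output entirely.

```c
#include <string.h>

/* Return text that is safe to print in a debug line: the string itself,
   or a placeholder if it contains '@' and might embed user:password@host. */
const char *debug_printable(const char *s)
{
    return strchr(s, '@') != NULL ? "[redacted]" : s;
}
```

The test is deliberately coarse: any '@' suppresses the string, even when no password is present.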
Ander Juaristi
d080a70a3a Fix resource leak.
* src/http.c (parse_strict_transport_security): Freed memory to avoid resource leak.
   Comply with GNU coding style.
2015-08-26 17:50:26 +05:30
Jookia
030c3379d1 Clarify that links are being converted.
* src/convert.c: Add 'links in' after 'Converted %d' and 'Converting %s'.
2015-08-21 20:58:55 +02:00
Miquel Llobet
e04c5989ff Fixed #44516 -o- not logging to stdout
src/log.c (log_init): check for hyphen in filename, set stdout
2015-08-16 00:20:20 +05:30
Daniele Calore
12bae50b28 Fix #40426: Allow -r -O- only if FILE is regular
* main.c: added check of "-r -O FILE" option combination
    allow only if FILE is a regular file (bug #40426)
2015-08-16 00:16:12 +05:30
Darshit Shah
f71887bbe5 Fix var name conflicts with math.h and wingdi.h
* src/recur.c (reject_reason): Rename all enum members to WG_RR_xx.
    * src/recur.c (retrieve_tree, download_child,
    write_reject_log_reason): Same
2015-08-15 15:43:33 +05:30
Tim Rühsen
075d755696 Fix IP address exposure in FTP code
* src/ftp.c (getftp): Do not use PORT when PASV fails.
* tests/FTPServer.px: Add pasv_not_supported server flag.
* tests/Makefile.am: Add Test-ftp-pasv-not-supported.px
* tests/Test-ftp-pasv-not-supported.px: New test

Fix IP address exposure when automatically falling back from
passive mode to active mode (using the PORT command). A behavior that
may be used to expose a client's privacy even when using a proxy.
2015-08-11 17:38:33 +02:00
Tim Rühsen
7578e47d49 Fix C89 compliancy in HSTS test code
* src/hsts.c (test_hsts_new_entry):
  Move variable assignment before code
2015-08-07 14:03:00 +02:00
Tim Rühsen
3a708f7ef8 Fix C89 compliancy in latest code
* src/recur.c: Declare variables before code
  (write_reject_log_url):
    Use const keyword where appropriate
    Use the 'default' switch statement
    Use xfree() instead of free()
    Renamed variable f -> fp
  (write_reject_log_reason):
    Use const keyword where appropriate
    Use the 'default' switch statement
    Renamed variable f -> fp
    Renamed variable r -> reason
2015-08-07 13:42:30 +02:00
Tim Rühsen
474935665e Remove redundant definition of _GNU_SOURCE
* src/warc.c: Remove definition of _GNU_SOURCE

_GNU_SOURCE is already defined in config.h
2015-08-07 13:24:14 +02:00
Jookia
e4db00d74d Add option to write URL rejections to a tab-delimited CSV log.
* main.c: Add "--rejected-log" option.
 * init.c: Add "rejectedlog" command.
 * options.h: Add "rejected_log" parameter string.
 * wget.texi: Add brief documentation on new --rejected-log option.
 * recur.c: Optionally log details of URLs not traversed.
   Add reject_reason enum.
   (download_child_p -> download_child): Return a reject_reason.
   (descend_redirect_p -> descend_redirect): Return a reject_reason.
   (retrieve_tree): Support logging reasons for rejection.
   Add write_reject_log_header that writes a CSV format header to a file.
   Add write_reject_log_url that writes a url struct to a file in CSV format.
   Add write_reject_log_reason that writes the URL and parent URL as well as the
   rejection reason to a CSV file.
 * Test--rejected-log.px: Add a basic test for the --rejected-log command.
 * tests/Makefile.am: Run Test--rejected-log.px.

This allows you to figure out why URLs are being rejected and some context
around it. CSV is used as the output format since it can be easily parsed,
it's delimited by tabs instead of commas to allow using all (quoted) URL
characters and includes column names which may be used for compatibility.
2015-08-06 08:10:55 +02:00
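The tab-delimited layout described in e4db00d74d can be sketched as below. The column names, the function names and the reduced column set are assumptions; the real log written by src/recur.c has its own header and more fields.

```c
#include <stdio.h>

/* Write the header line with (assumed) column names. */
void write_reject_header(FILE *fp)
{
    fputs("REASON\tURL\tPARENT_URL\n", fp);
}

/* Write one rejection record. URLs are expected to be percent-encoded
   already, so they cannot contain a literal tab. */
void write_reject_row(FILE *fp, const char *reason,
                      const char *url, const char *parent_url)
{
    fprintf(fp, "%s\t%s\t%s\n", reason, url, parent_url);
}
```

Because the separator is a tab, commas inside URLs need no special handling.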
Tim Rühsen
670eb924e7 Fix memory leak in HSTS code
* src/main.c (get_hsts_database): Free 'home' variable
2015-08-04 17:41:54 +02:00
Tim Rühsen
5d55018ce6 Avoid uninitialized variable in metalink code
* src/metalink.c: Init retr_err with METALINK_MISSING_RESOURCE
* src/wget.h: Add enum METALINK_MISSING_RESOURCE
2015-08-04 17:24:59 +02:00
Darshit Shah
4e56a91001 Fix function name collision with OpenSSL library
* src/utils.[ch], src/http.c, src/metalink.c: Rename function
    hex_to_string() to wg_hex_to_string() since it collides with a
    similarly named function in OpenSSL Library.
2015-07-24 23:52:43 +05:30
Alex Henrie
b6e242cd6f Make the filename marquee a proper marquee
* src/progress.c: Start the marquee in the middle of the available space
  and do not restart it until all of the text has scrolled out of view.
2015-07-22 16:52:20 +05:30
Ander Juaristi
b60131a399 Added support for HSTS.
* Makefile.am: Added new source files hsts.c and hsts.h.
 * http.c (parse_strict_transport_security): new function for STS header
   parsing.
   (gethttp): update the HSTS store.
 * http.h: new include "hsts.h".
 * init.c: new options --hsts and --hsts-file.
 * main.c (get_hsts_database, load_hsts, save_hsts): new functions.
   New options --no-hsts and --hsts-file added to help.
   (main): load and save HSTS store.
 * options.h: new variables for supporting --hsts and --hsts-file.
 * retr.c (retrieve_url): rewrite the URI according to the HSTS policy before
   entering http_loop.
 * test.c, test.h: new unit tests for HSTS.
 * utils.c, utils.h (countchars): new function.
 * wget.h: new preprocessor check.
 * hsts.c, hsts.h: new files with the HSTS engine implementation.

Added support for HTTP Strict Transport Security (HSTS), as defined by RFC
6797.
2015-07-20 15:55:57 +02:00
Giuseppe Scrivano
9e12b8ca39 fix compiler warnings
* src/utils.h: Include <stdlib.h>
* src/recur.c: Include "exits.h"
2015-07-20 15:37:52 +02:00
Hubert Tarasiuk
6064f21c66 Geolocation support for Metalink resources.
* doc/wget.texi: Add information about --preferred-location.
* src/init.c: Add --preferred-location option.
* src/main.c (option_data): Handle --preferred-location argument.
(main): Sort resources based on location if requested.
* src/metalink.c (metalink_res_cmp): Compare based on location if
priority and preference are equal.
* src/options.h (options): Add preferred_location option.
2015-07-20 15:31:06 +02:00
Hubert Tarasiuk
97389a7497 Support at most one file signature. Adapt comments to libmetalink 0.13.
* src/metalink.c (retrieve_from_metalink): Add comment about new
libmetalink version. Do not iterate over signatures - support just one.
2015-07-20 15:31:06 +02:00
Hubert Tarasiuk
225a87d4a2 Move some Metalink-related code from http.c to metalink.c.
* src/http.c: Move find_key_value, has_key, find_key_values.
* src/metalink.c: To here.
* src/metalink.h: Make them non-static and add prototypes here.
2015-07-20 15:31:06 +02:00
Hubert Tarasiuk
92a889b278 Unit test for find_key_values.
* src/http.c: Add test_find_key_values.
* src/test.c (main): Run new test.
* src/test.h: Add test_find_key_values.
2015-07-20 15:31:06 +02:00
Hubert Tarasiuk
1113e78534 Unit test for has_key.
* src/http.c: Add test_has_key.
* src/test.c (main): Run new test.
* src/test.h: Add test_has_key.
2015-07-20 15:31:06 +02:00
Hubert Tarasiuk
70cbd59ed6 Unit test for find_key_value.
* src/http.c: Add test_find_key_value.
* src/test.c (main): Run new test.
* src/test.h: Add test_find_key_value.
2015-07-20 15:31:06 +02:00
Hubert Tarasiuk
37b58e3976 Metalink support.
* bootstrap.conf: Add crypto/sha256
* configure.ac: Look for libmetalink and GPGME
* doc/wget.texi: Add --input-metalink and --metalink-over-http
options description.
* po/POTFILES.in: Add metalink.c
* src/Makefile.am: Add new translation unit (metalink.c)
* src/http.c (http_stat): Add metalink field.
(free_stat): Free metalink field.
(find_key_value): Find value of given key in header string.
(has_key): Check if token exists in header string.
(find_key_values): Find all key=value pairs in header string.
(metalink_from_http): Obtain Metalink metadata from HTTP response.
(gethttp): Call metalink_from_http if requested.
(http_loop): Request Metalink metadata from HTTP response if should be.
Fall back to regular download if no Metalink metadata found.
* src/init.c: Add --input-metalink and --metalink-over-http options
* src/main.c (option_data): Handle --input-metalink and
--metalink-over-http cmd arguments.
(print_help): Print --input-metalink option description.
(main): Retrieve files from Metalink file
* src/metalink.c (retrieve_from_metalink): Download files described by
metalink.
(metalink_res_cmp): Comparator for resources priority-sorting.
* src/metalink.h: Create header for metalink.c
(RES_TYPE_SUPPORTED): Define supported resources media.
(DEFAULT_PRI): Default mirror priority for Metalink over HTTP.
(VALID_PRI_RANGE): Valid priority range.
* src/options.h (options): Add input_metalink option and metalink_over_http
options.
* src/utils.c (hex_to_string): Convert binary data to ASCII-hex.
* src/utils.h (hex_to_string): Add prototype.
* src/wget.h: Add metalink-related error enums
Add METALINK_METADATA flag for document type.
2015-07-20 15:30:39 +02:00
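The "Convert binary data to ASCII-hex" step mentioned for hex_to_string in the commit above can be sketched roughly as follows. This is a minimal illustration, not Wget's actual code: the name bin_to_hex, its signature, and the choice of lowercase digits are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not Wget's hex_to_string): write 2*len lowercase
   hex digits plus a terminating NUL into OUT.  The caller must provide
   at least 2*len + 1 bytes.  */
void
bin_to_hex (char *out, const unsigned char *in, size_t len)
{
  static const char digits[] = "0123456789abcdef";
  size_t i;

  assert (out != NULL && (in != NULL || len == 0));

  for (i = 0; i < len; i++)
    {
      out[2 * i] = digits[in[i] >> 4];        /* high nibble */
      out[2 * i + 1] = digits[in[i] & 0x0f];  /* low nibble */
    }
  out[2 * len] = '\0';
}
```

A caller would typically pass a raw digest (for example the 32 bytes of a SHA-256 hash) and a 65-byte output buffer, then compare the result with the hex checksum from the Metalink metadata.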
Romain Bentz
80303366ae Add NULL value check to fix #45289
* src/recur.c (retrieve_tree): Check return value of url_parse()
2015-07-15 18:10:08 +02:00
Tim Rühsen
25c9b462bf Change function params to const in src/iri.[ch]
* iri.h, iri.c: Added const attribute for params of parse_charset(),
	check_encoding_name(), idn_encode(), idn_decode(),
	remote_to_utf8(), set_uri_encoding(), set_content_encoding().
2015-07-01 17:15:10 +02:00
Tim Rühsen
77f5a27e65 Work around a libidn <= 1.30 vulnerability
* src/iri.c: Add _utf8_is_valid() to check UTF-8 sequences before
  passing them to idna_to_ascii_8z().
2015-07-01 17:15:05 +02:00
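The idea of checking UTF-8 sequences before handing them to another library, as in the commit above, might look roughly like the sketch below. It is not the _utf8_is_valid() from src/iri.c. The function name and the exact set of rejected inputs (truncated sequences, stray continuation bytes, overlong forms, surrogates, code points above U+10FFFF) are assumptions chosen for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical validator: return true if the NUL-terminated string S
   is well-formed UTF-8, false otherwise.  */
bool
utf8_is_valid_sketch (const char *s)
{
  const unsigned char *p = (const unsigned char *) s;

  assert (s != NULL);

  while (*p)
    {
      size_t need, n;     /* continuation bytes expected / remaining */
      unsigned long cp;   /* decoded code point */

      if (*p < 0x80)
        {
          p++;            /* plain ASCII */
          continue;
        }
      else if ((*p & 0xE0) == 0xC0)
        { need = 1; cp = *p & 0x1F; }
      else if ((*p & 0xF0) == 0xE0)
        { need = 2; cp = *p & 0x0F; }
      else if ((*p & 0xF8) == 0xF0)
        { need = 3; cp = *p & 0x07; }
      else
        return false;     /* stray continuation byte or invalid lead byte */

      p++;
      for (n = need; n > 0; n--)
        {
          /* A NUL here means the sequence was truncated.  */
          if ((*p & 0xC0) != 0x80)
            return false;
          cp = (cp << 6) | (*p & 0x3F);
          p++;
        }

      /* Reject overlong encodings, surrogates and out-of-range values.  */
      if ((need == 1 && cp < 0x80)
          || (need == 2 && cp < 0x800)
          || (need == 3 && cp < 0x10000))
        return false;
      if ((cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
        return false;
    }

  return true;
}
```

A caller would run such a check first and skip the IDN conversion for any host name that fails it, so that the conversion routine never sees a truncated multi-byte sequence.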
Ángel González
ae58d8a78b Fix wgetrc filename creation for Windows
* init.c/wgetrc_file_name: Remove obsolete code in WINDOWS code path

Reported-by: Gisle Vanem <gvanem@yahoo.no>
2015-06-27 21:32:48 +02:00
Tim Rühsen
c6ac51d5bc Move test_* function prototypes from test.c to test.h
* src/test.c: Remove test_* function prototypes, make tests_run static
* src/test.h: Add test_* function prototypes
2015-06-13 22:34:36 +02:00
Hubert Tarasiuk
8a8d138dcc Support If-Modified-Since header in timestamping mode.
* src/wget.h: Add IF_MODIFIED_SINCE enum for dt. Add TIMECONV_ERR
enum to uerr_t.
* src/http.c (time_to_rfc1123): Convert time_t to HTTP time.
* src/http.c (initialize_request): Include If-Modified-Since header
if appropriate.
* src/http.c (set_file_timestamp): Separate this code from check_file_output.
* src/http.c (check_file_output): Use set_file_timestamp.
* src/http.c (gethttp): Handle properly 304 return code and 200 if server
ignores If-Modified-Since headers.
* src/http.c (http_loop): Load filename to hstat if condget was requested,
use IF_MODIFIED_SINCE if requested and current timestamp can be obtained.
2015-05-22 11:08:30 +02:00
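Converting a time_t into the fixed-format date used by headers such as If-Modified-Since, as described in the commit above, could be sketched as below. This is an illustration only, not Wget's time_to_rfc1123(). The name format_http_date and its return convention are assumptions. Day and month names come from fixed English tables rather than strftime(), so the output does not depend on the locale.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: format T as "Sun, 06 Nov 1994 08:49:37 GMT".
   Returns 0 on success, -1 if the conversion fails or BUF is too small.  */
int
format_http_date (char *buf, size_t bufsize, time_t t)
{
  static const char *wkday[] =
    { "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat" };
  static const char *month[] =
    { "Jan", "Feb", "Mar", "Apr", "May", "Jun",
      "Jul", "Aug", "Sep", "Oct", "Nov", "Dec" };
  struct tm *gmt;
  int n;

  assert (buf != NULL);

  gmt = gmtime (&t);    /* HTTP dates are always expressed in GMT */
  if (gmt == NULL)
    return -1;

  n = snprintf (buf, bufsize, "%s, %02d %s %04d %02d:%02d:%02d GMT",
                wkday[gmt->tm_wday], gmt->tm_mday, month[gmt->tm_mon],
                gmt->tm_year + 1900, gmt->tm_hour, gmt->tm_min,
                gmt->tm_sec);
  return (n < 0 || (size_t) n >= bufsize) ? -1 : 0;
}
```

The formatted string is 29 characters long, so a 30-byte buffer is the minimum. A client would send it as the value of If-Modified-Since and treat a 304 response as "local copy is current".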
Hubert Tarasiuk
0e8d2d4251 Add --if-modified-since option
* src/init.c: Add to commands array.
* src/main.c: Add to cmdline_option. Add to help message.
* src/options.h: Add to options struct.
2015-05-22 11:08:30 +02:00
Ander Juaristi
b0820d553b Fixed incorrect handling of reserved chars.
* src/iri.c (do_conversion): Call url_unescape_except_reserved,
instead of url_unescape.

* src/url.c (url_unescape_1): New static function.
(url_unescape): Calls url_unescape_1 with mask zero. Preserves
same behavior as before. Only code changes.
(url_unescape_except_reserved): New function.

* src/url.h: Added prototype for url_unescape_except_reserved().

When the locale is US-ASCII, URIs that contain special characters
in them are converted to IRIs according to RFC 3987, section 3.2
"Converting URIs to IRIs".
2015-05-12 21:24:06 +02:00
Darshit Shah
b6b1388fb7 Fix documentation for update_speed_ring()
* progress.c (update_speed_ring): The comment for the function
    incorrectly stated that the function uses thirty samples from the
    past instead of twenty.

    Reported-By: Yi Li <lovelylich@gmail.com>
2015-05-07 11:29:07 +05:30
Darshit Shah
9b1dd6dab8 Remove shadowed variable in http.c
* http.c (gethttp): Rename err to conn_err to prevent shadowed
    variable
2015-05-04 21:45:26 +05:30
Tim Ruehsen
6b8dfe1d6e Fix format specifier warning
* src/utils.c (aprintf): Use %d for int argument
2015-05-03 21:18:47 +02:00
Nikolay Merinov
0e6d6ca963 Fix timestamping and continue behaviour with ftp protocol.
* src/ftp.c (ftp_loop_internal): Add option `force_full_retrieve' that forces
retrieval of the full file.
(ftp_retrieve_list): Pass `true' as `force_full_retrieve' option to
`ftp_loop_internal' if we want to download a file with a newer timestamp than
the local copy.
2015-05-01 00:28:08 +02:00
Rohit Mathulla
3765a1b266 openssl: Read cert from private key file when needed
* src/openssl.c (ssl_init): Assign opt.cert_{file, type}
  from opt.private_key(_type)
2015-04-27 19:52:18 +02:00
Rohit Mathulla
8654f7e2e7 Fix double free bug in SSL code
* src/openssl.c, src/gnutls.c (ssl_init): Copy options using xstrdup
2015-04-27 19:48:51 +02:00
Hubert Tarasiuk
566696cb82 Single exit point and common cleanup code in gethttp
* src/http.c (gethttp): Common cleanup for type, message,
  req, resp, head.  Single exit point.
2015-04-20 10:54:34 +02:00
Tim Rühsen
c579c7bf1e Check memory allocations in WARC code
* src/warc.c: Remove some memory allocations,
              use xmalloc instead of malloc

Reported-by: Bill Parker <wp02855@gmail.com>
2015-04-17 22:42:59 +02:00
Tim Rühsen
4dde3e200f Add more const usage to function params
* warc.c, warc.h: Add const specifier to several function args
2015-04-17 22:42:59 +02:00
Ángel González
bef5945202 Remove memory leak in idn_encode.
* src/iri.c (idn_encode): Free buffer from remote_to_utf8
when needed; give meaningful names to variables;
remove excessive comment.
2015-04-11 21:55:17 +02:00
Hubert Tarasiuk
ac40b84ee1 Fix error in free_vec.
* src/utils.c (free_vec): Increment pointer instead of its value.

Reported-by: Gisle Vanem <gvanem@yahoo.no>
2015-04-10 18:06:14 +02:00
Ángel González
45463eaad7 Fix const usage in iri.c
* src/iri.c (remote_to_utf8): Do not qualify with const the output pointer.
(do_conversion): Use the provided input parameter as const.
(idn_encode): casts to remote_to_utf8 parameters are no longer needed.
* src/iri.h: Adjusted remote_to_utf8 prototype.
* src/url.c: It is no longer necessary to cast new_url to const char.
2015-04-10 10:23:01 +02:00
Miquel Llobet
d03b40e31e Fixed #44628 honoring RFC 6266 content-disposition
src/http.c (parse_content_disposition): stores filename* and filename
separately and chooses filename* if available.
(test_parse_content_disposition): added new tests.
2015-04-06 10:30:30 +02:00
Steven M. Schweda
5efb24e4a2 Add option to restrict filenames under VMS.
* src/options.h (enum restrict_files_os): Define "restrict_vms".
* src/init.c (defaults) [__VMS]: Set "opt.restrict_files_os" to
"restrict_vms".
(cmd_spec_restrict_file_names): honor "vms".
* src/url.c (filechr_not_unix): Define "filechr_not_vms".
(filechr_table): Update for VMS.
(append_uri_pathel): Honor opt.restrict_files_os.
(FN_QUERY_SEP): Update for VMS.
(FN_QUERY_SEP_STR): Update for VMS.
2015-04-02 15:36:42 +02:00
Hubert Tarasiuk
eae8b1d565 Change semantics of resp_free and request_free in http.c
* src/http.c (resp_free): Change the semantics of this function.
(request_free): Change the semantics of this function.
(initialize_request): Adjust request_free call.
(establish_connection): Adjust request_free, resp_free calls.
(gethttp): Adjust request_free, resp_free calls.
2015-04-01 17:01:07 +02:00
Hubert Tarasiuk
045463b814 Do not free request in establish_connection; do it in gethttp
* src/http.c (establish_connection): Do not free request here (it is
never allocated here).
* src/http.c (gethttp): Free request before returning if error in
establish_connection encountered.
2015-04-01 17:01:07 +02:00
Hubert Tarasiuk
621c313b94 Transform read_header label and goto into a loop
* src/http.c (gethttp): Replace label and goto statement with a do
loop.
2015-04-01 17:00:40 +02:00
Hubert Tarasiuk
52a7d0ad85 Factor out set_content_type function from gethttp
* src/http.c (gethttp): Move some code in...
(set_content_type): ... a new function.
2015-03-31 17:00:07 +02:00
Giuseppe Scrivano
59e9ef00e6 Factor out some gethttp code
* src/http.c (gethttp): Move some code in...
(open_output_stream): ... a new function.
2015-03-18 12:09:55 +01:00
Giuseppe Scrivano
14bbc18512 Factor out some auth gethttp code
* src/http.c (gethttp): Move some code in...
(check_auth): ... a new function.
2015-03-18 12:09:55 +01:00
Giuseppe Scrivano
8aa63e482e Factor out some gethttp code
* src/http.c (gethttp): Move some code in...
(check_file_output): ... a new function.
2015-03-18 12:09:55 +01:00
Giuseppe Scrivano
0bc2757713 Factor out some connection initialization code for gethttp
* src/http.c (gethttp): Move some initialization code in...
(establish_connection): ... a new function.
2015-03-18 12:09:55 +01:00
Giuseppe Scrivano
f8abb9dd00 Factor out some proxy initialization code for gethttp
* src/http.c (gethttp): Move some initialization code in...
(initialize_proxy_configuration): ... a new function.
2015-03-18 12:09:55 +01:00
Giuseppe Scrivano
29850e77d0 Factor out some initialization code for gethttp
* src/http.c (gethttp): Move some initialization code in...
(initialize_request): ... a new function.
2015-03-18 12:09:54 +01:00
Tim Ruehsen
799c545722 src/ftp.c: make sure warc_tmp becomes closed before return
Reported-by: Coverity bug #1188044
2015-03-18 10:46:11 +01:00
Tim Ruehsen
014b1d6041 src/http.c: fix error return of digest_authentication_encode()
Reported-by: Coverity bug #1188036
2015-03-18 10:46:05 +01:00
Darshit Shah
cc9f76c5a4 retr.c: Fix memory leak in retrieve_from_file()
Reported by: Coverity Bug 1188045
2015-03-14 16:48:30 +05:30
Darshit Shah
53b22974cb html-url.c: Fix potential memory leaks
Reported by: Coverity Bug 1188050
2015-03-14 16:48:30 +05:30
Darshit Shah
7d5a7ef9ca main.c: Fix two potential memory leaks
Reported by: Coverity bug 1188048
2015-03-14 16:48:30 +05:30
Darshit Shah
735cc220e3 retr.c: Fix two memory leaks when proxy URL is bad
Reported by: Coverity bug 1188047
2015-03-14 16:48:29 +05:30
Giuseppe Scrivano
16f1fb1d1f maint: update copyright year ranges to include 2015 2015-03-09 16:32:01 +01:00
Yousong Zhou
91e9a20752 Fix --content-on-error option handling.
* src/http.c: Log --content-on-error downloads.
* src/retr.c (retrieve_url): Register the download of an error page
when --content-on-error is specified.
2015-03-09 11:45:01 +01:00
Anderson Goulart
882ed28d59 src/main.c (--no-verbose): don't show progress bar
Fixes #44431
2015-03-09 00:44:23 +05:30
Darshit Shah
e316d253fa main.c: Use assertion to test buffer size 2015-03-07 00:38:04 +05:30
Darshit Shah
9dde436dd6 main.c: Need to explicitly disallow show_progress in -q 2015-03-02 21:40:57 +05:30
Eli Zaretskii
33c5d979ce warc.c: native uuid generation on Windows
* warc.c (windows_uuid_str) [WINDOWS]: New function specific to
MS-Windows.
(warc_uuid_str) [WINDOWS]: If windows_uuid_str succeeds, use its
result; otherwise use the fallback method.
2015-02-23 23:36:02 +01:00
Gisle Vanem
9df2250f4c idn: use idn_free() to free allocated libidn memory
xfree() might crash on libidn memory on Windows.

From 'man idn_free':
"Under Windows, different parts of the same application may use different
 heap memory, and then it is important to deallocate memory allocated within
 the same module that allocated it. This function makes that possible."
2015-02-18 12:50:57 +01:00
Tim Rühsen
3d8e765c1d gettext: Use gnulib's gettext.h for compatibility
Fixes issues with gettext on Solaris
Reported-by: Kiyoshi KANAZAWA <yoi_no_myoujou@yahoo.co.jp>
2015-02-10 09:56:32 +01:00
Tim Rühsen
c83f344564 src/openssl.c: Use SSL_state() instead of ssl_st.state
Changes in OpenSSL 1.0.2 API hides ssl_st structure members.
Reported-by: Gisle Vanem <gvanem@yahoo.no>
2015-02-10 09:53:42 +01:00
Darshit Shah
8705e27e20 progress bar: Allow display on stderr along with -o
This commit causes the --show-progress option to print the progress bar
to stderr even when a logfile was explicitly provided on the command
line. Such a combination allows a user to log the output of Wget while
simultaneously keeping track of the download status.
2015-01-20 20:16:20 +01:00
Mathieu Parent
87f4fee8c9 src/connect.c: More verbose error message (tiny change)
This fixes Debian bug #144076.
2015-01-16 10:18:13 +01:00
Tim Ruehsen
5e3a760731 src/ftp-basic.c: Accept 5-digit port numbers in EPSV responses
Reported-by: Adam Sampson <ats@offog.org>
2015-01-04 20:57:15 +01:00
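For context on the EPSV fix above: an RFC 2428 reply has the shape "229 Entering Extended Passive Mode (|||6446|)", and TCP ports run up to 65535, which is five digits. The sketch below shows one way to parse such a reply. It is not the code in src/ftp-basic.c. The function name parse_epsv_port and its error convention are assumptions made for the illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical parser: extract the port from an EPSV reply such as
   "229 Entering Extended Passive Mode (|||6446|)".
   Returns the port (1..65535) or -1 if the reply is malformed.  */
long
parse_epsv_port (const char *resp)
{
  const char *p;
  char delim;
  long port = 0;
  int ndigits = 0;

  assert (resp != NULL);

  p = strchr (resp, '(');
  if (p == NULL)
    return -1;
  p++;

  /* RFC 2428 allows any printable ASCII delimiter; '|' is the usual one.  */
  delim = *p;
  if (delim < 33 || delim > 126)
    return -1;

  /* Three delimiters in a row: protocol and address fields are empty.  */
  if (p[1] != delim || p[2] != delim)
    return -1;
  p += 3;

  /* Accept one to five digits; a sixth digit cannot be a valid port.  */
  while (isdigit ((unsigned char) *p))
    {
      if (++ndigits > 5)
        return -1;
      port = port * 10 + (*p - '0');
      p++;
    }
  if (ndigits == 0 || port == 0 || port > 65535)
    return -1;

  if (p[0] != delim || p[1] != ')')
    return -1;

  return port;
}
```

Bounding the digit count at five and then checking the numeric range keeps the accumulator from overflowing while still accepting every valid port.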
Tim Ruehsen
103cbf1751 src/http.c: Revert commit d81a8d5f56
The removal of the 'redundant' condition was a failure.
Fixes: #43876
Reported-by: Sean Jensen-Grey <seanj@xyke.com>
2014-12-27 23:27:20 +01:00
Tim Ruehsen
f6b28575cc src/main.c, src/warc.c: Use gnulib's base_name() instead of basename()
Reported-by: Eli Zaretskii <eliz@gnu.org>
2014-12-25 12:07:42 +01:00