Commit Graph

29 Commits

Author SHA1 Message Date
Matthew White
9532861aef Bugfix: Remove surrounding quotes from Metalink/HTTP key's value
* src/metalink.h: Add declaration of function dequote_metalink_string()
* src/metalink.c: Add function dequote_metalink_string() to remove
  surrounding quotes (\' or \") from a string
* src/metalink.c (find_key_value, find_key_values): Call dequote_metalink_string()
  to remove the surrounding quotes from the parsed value
* src/metalink.c (test_find_key_value, test_find_key_values): Add
  quoted key's values for unit-tests
* testenv/Makefile.am: Add new file
* testenv/Test-metalink-http-quoted.py: New file. Metalink/HTTP quoted
  values tests

Some Metalink/HTTP keys, like "type" [2], may have a quoted value [1]:
Link: <http://example.com/example.ext.meta4>; rel=describedby;
type="application/metalink4+xml"

Wget was expecting a dequoted value from the Metalink module. This
patch addresses this problem.

References:
 [1] Metalink/HTTP: Mirrors and Hashes
     1.1. Example Metalink Server Response
     https://tools.ietf.org/html/rfc6249#section-1.1

 [2] Additional Link Relations
     6. "type"
     https://tools.ietf.org/html/rfc6903#section-6
2016-09-30 19:44:05 +02:00
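The dequoting described in commit 9532861aef can be sketched as follows. This is a minimal illustration, not Wget's dequote_metalink_string(): the name dequote_sketch() is hypothetical, and the sketch assumes that only one matching pair of surrounding quotes is stripped and that the result is returned as a new allocation.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not Wget's implementation: return a newly
   allocated copy of STR without one pair of matching surrounding
   quotes (' or ").  Returns NULL on NULL input or allocation failure.  */
char *
dequote_sketch (const char *str)
{
  size_t len;
  const char *start;
  char *out;

  if (!str)
    return NULL;

  len = strlen (str);
  start = str;

  /* Strip only when the first and last characters are the same quote.  */
  if (len >= 2 && (str[0] == '"' || str[0] == '\'')
      && str[len - 1] == str[0])
    {
      start = str + 1;
      len -= 2;
    }

  out = malloc (len + 1);
  if (!out)
    return NULL;
  memcpy (out, start, len);
  out[len] = '\0';
  return out;
}
```

Under these assumptions, a header value such as type="application/metalink4+xml" would yield application/metalink4+xml, while an unquoted value is copied unchanged. The caller frees the result.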
Matthew White
8aca8fc80d Bugfix: Process Metalink/XML url strings containing white spaces and CRLF
* src/metalink.h: Add declaration of function clean_metalink_string()
* src/metalink.c: Add directive #include "xmemdup0.h"
* src/metalink.c: Add function clean_metalink_string() to remove
  leading and trailing white spaces and CRLF from a string
* src/metalink.c (retrieve_from_metalink): Remove leading and trailing
  white spaces and CRLF from url resource mres->url
* testenv/Makefile.am: Add new file
* testenv/Test-metalink-xml-urlbreak.py: New test. Metalink/XML white
  spaces and CRLF in url resources tests

libmetalink does not automatically remove white spaces and CRLF from
url strings, and Wget's Metalink module was unable to process such
strings. This patch handles them by stripping leading and trailing
white spaces and CRLF.

If a parsed Metalink/XML url string contains strings separated by
CRLF, only the first of the series is accepted.
2016-09-30 19:44:05 +02:00
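The cleaning rule described in commit 8aca8fc80d (strip leading and trailing white space and CRLF, keep only the first string of a CRLF-separated series) can be sketched as follows. The function clean_url_sketch() is a hypothetical name and not Wget's clean_metalink_string(); the sketch assumes that "white space" means spaces and tabs and that any CR or LF ends the first string.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not Wget's implementation: return a newly
   allocated copy of the first line of STR with leading and trailing
   white space removed.  Returns NULL on NULL input or allocation
   failure.  */
char *
clean_url_sketch (const char *str)
{
  size_t len;
  char *out;

  if (!str)
    return NULL;

  /* Skip leading spaces, tabs, CR and LF.  */
  str += strspn (str, " \t\r\n");

  /* Keep only the first string of a CR/LF separated series.  */
  len = strcspn (str, "\r\n");

  /* Drop trailing spaces and tabs.  */
  while (len > 0 && (str[len - 1] == ' ' || str[len - 1] == '\t'))
    len--;

  out = malloc (len + 1);
  if (!out)
    return NULL;
  memcpy (out, str, len);
  out[len] = '\0';
  return out;
}
```

With this sketch, an input holding two urls on separate lines yields only the first one, which matches the behaviour the commit message describes. The caller frees the result.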
Matthew White
70360b3eab New: Metalink file size mismatch returns error code METALINK_SIZE_ERROR
* src/wget.h (uerr_t): Add error code METALINK_SIZE_ERROR to enum
* src/metalink.c (retrieve_from_metalink): Use boolean variable
  size_ok, when false set retr_err to METALINK_SIZE_ERROR
* testenv/Makefile.am: Add new file
* testenv/Test-metalink-xml-size.py: New file. Metalink/XML file size
  tests (<size></size>)

Before this patch, no appropriate error code was returned to report a
file size mismatch.

This patch introduces the error code METALINK_SIZE_ERROR to report a
file size mismatch.
2016-09-30 19:44:05 +02:00
Matthew White
c29983a044 New: Metalink/XML and Metalink/HTTP file naming safety rules
* NEWS: Mention the effect of --trust-server-names over Metalink
* src/metalink.h: Add declaration of function append_suffix_number()
* src/metalink.c: Add function append_suffix_number() to append a
  number to a string
* src/metalink.c (retrieve_from_metalink): Safer Metalink/XML and
  Metalink/HTTP download naming system, opt.trustservernames based
* doc/metalink-standard.txt: Update doc. Explain new Metalink/XML and
  Metalink/HTTP download naming system and --trust-server-names role
* testenv/Makefile.am: Add new files
* testenv/Test-metalink-xml-continue.py: Update test. Metalink/XML
  continue/keep existing files (HTTP 416) with --continue tests
* testenv/Test-metalink-xml.py: Update test. Metalink/XML naming tests
* testenv/Test-metalink-xml-trust.py: New file. Metalink/XML naming
  tests with --trust-server-names
* testenv/Test-metalink-xml-abspath.py: Update test. Metalink/XML
  absolute path tests
* testenv/Test-metalink-xml-abspath-trust.py: New file. Metalink/XML
  absolute path tests with --trust-server-names
* testenv/Test-metalink-xml-relpath.py: Update test. Metalink/XML
  relative path tests
* testenv/Test-metalink-xml-relpath-trust.py: New file. Metalink/XML
  relative path tests with --trust-server-names
* testenv/Test-metalink-xml-homepath.py: Update test. Metalink/XML
  home path and ~ (tilde) tests
* testenv/Test-metalink-xml-homepath-trust.py: New file. Metalink/XML
  home path and ~ (tilde) tests with --trust-server-names
* testenv/Test-metalink-xml-prefix.py: New file. Metalink/XML naming
  tests with --directory-prefix
* testenv/Test-metalink-xml-prefix-trust.py: New file. Metalink/XML
  naming tests with --directory-prefix and --trust-server-names
* testenv/Test-metalink-xml-absprefix.py: New file. Metalink/XML
  absolute --directory-prefix tests
* testenv/Test-metalink-xml-absprefix-trust.py: New file. Metalink/XML
  absolute --directory-prefix tests with --trust-server-names
* testenv/Test-metalink-xml-relprefix.py: New file. Metalink/XML
  relative --directory-prefix tests
* testenv/Test-metalink-xml-relprefix-trust.py: New file. Metalink/XML
  relative --directory-prefix tests with --trust-server-names
* testenv/Test-metalink-xml-homeprefix.py: New file. Metalink/XML home
  --directory-prefix tests
* testenv/Test-metalink-xml-homeprefix-trust.py: New file. Metalink/XML
  home --directory-prefix tests with --trust-server-names

The option --trust-server-names allows the use of the file names parsed
from a Metalink/XML file.  Without --trust-server-names, the safety
mechanism provides secure and predictable file names.
2016-09-30 19:44:05 +02:00
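Commit c29983a044 adds append_suffix_number() to append a number to a string. A minimal sketch of that idea follows. The name append_number_sketch(), its signature, and the separator argument are assumptions made for illustration; the actual function's interface and the separator Wget uses are not stated in the commit message.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not Wget's append_suffix_number(): return a
   newly allocated string made of STR, then SEP, then NUM in decimal.
   Returns NULL on formatting or allocation failure.  */
char *
append_number_sketch (const char *str, const char *sep, unsigned long num)
{
  char *out;

  /* A first pass with a NULL buffer only measures the result (C99).  */
  int n = snprintf (NULL, 0, "%s%s%lu", str, sep, num);
  if (n < 0)
    return NULL;

  out = malloc ((size_t) n + 1);
  if (!out)
    return NULL;

  snprintf (out, (size_t) n + 1, "%s%s%lu", str, sep, num);
  return out;
}
```

For example, append_number_sketch ("example.meta4", ".", 1) builds "example.meta4.1". The caller frees the result.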
Matthew White
43ec7008f2 Enforce Metalink file name verification, strip directory if necessary
* NEWS: Mention the use of a safe Metalink destination path
* src/metalink.h: Add declaration of functions get_metalink_basename(),
  last_component(), metalink_check_safe_path()
* src/metalink.c: Add directive #include "dosname.h"
* src/metalink.c: Add function get_metalink_basename() to return the
  basename of a file name, strip w32's drive letter prefixes
* src/metalink.c (retrieve_from_metalink): Enforce Metalink file name
  verification, if the file name is unsafe try its basename
* doc/metalink.txt: Update document. Explain --directory-prefix

The function get_metalink_basename() uses FILE_SYSTEM_PREFIX_LEN to
catch any 'C:D:file' (w32 environment), then it removes each drive
letter prefix, i.e. 'C:' and 'D:'.

Unsafe file names contain an absolute, relative, or home path.  Safe
paths can be verified by libmetalink's metalink_check_safe_path().
2016-09-30 19:44:03 +02:00
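The basename extraction described in commit 43ec7008f2 can be sketched as follows. This is an illustration only: Wget's get_metalink_basename() relies on gnulib's last_component() and FILE_SYSTEM_PREFIX_LEN, whereas the hypothetical basename_sketch() below uses plain C and assumes that both '/' and '\' separate directories and that a drive prefix is a single letter followed by ':' on every platform.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not Wget's implementation: return a pointer
   into NAME at the start of its basename, with directories and any
   leading drive letter prefixes skipped.  Returns NULL for NULL.  */
const char *
basename_sketch (const char *name)
{
  const char *base = name;
  const char *p;

  if (!name)
    return NULL;

  /* Keep what follows the last '/' or '\\'.  */
  for (p = name; *p; p++)
    if (*p == '/' || *p == '\\')
      base = p + 1;

  /* Remove each leading drive letter prefix: "C:D:file" -> "file".  */
  while (isalpha ((unsigned char) base[0]) && base[1] == ':')
    base += 2;

  return base;
}
```

Under these assumptions, absolute, relative, and home paths such as /dir/file, ../file, and ~/file all reduce to file. A name that ends in a separator reduces to the empty string in this sketch, which a caller would need to reject.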
Matthew White
7d4942864b Implement Metalink/XML --directory-prefix option in Metalink module
* NEWS: Mention the effect of --directory-prefix over Metalink
* src/metalink.c (retrieve_from_metalink): Add opt.dir_prefix as
  prefix to the metalink:file name mfile->name
* doc/metalink.txt: Update document. Explain --directory-prefix

When --directory-prefix=<prefix> is used, set the top of the retrieval
tree to prefix. The default is . (the current directory). Metalink/XML
and Metalink/HTTP files will be downloaded under prefix.
2016-09-27 20:29:03 +02:00
Matthew White
666b7862bf Change mfile->name to filename in Metalink module's messages
* src/metalink.c (retrieve_from_metalink): Change mfile->name to
  filename when referring to the downloaded file

The file name could have been changed by unique_create() (or by any
other means) before downloading. Use the name of the downloaded file
(filename) when printing output which refers to it.
2016-09-27 20:29:03 +02:00
Matthew White
f3f349a0cf Add file size computation in Metalink module
* NEWS: Mention Metalink's file size verification
* src/metalink.c (retrieve_from_metalink): Add file size computation
* doc/metalink.txt: Update document. Remove resolved bugs

Reject downloaded files when they do not agree with their Metalink/XML
metalink:size: https://tools.ietf.org/html/rfc5854#section-4.2.14

At the moment of writing, Metalink/HTTP headers do not provide a file
size field. This information could be obtained from the Content-Length
header field: https://tools.ietf.org/html/rfc6249#section-7
2016-09-27 20:29:03 +02:00
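The size verification described in commit f3f349a0cf compares the size of the downloaded file with the metalink:size value. A minimal sketch of such a check is shown below. The function file_size_ok_sketch() is hypothetical and uses only standard C stream calls on a stream opened in binary mode; how Wget itself measures the file is not stated in the commit message.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper, not Wget's implementation: return true when
   the binary stream FP holds exactly EXPECTED bytes.  The stream is
   rewound to its start afterwards.  */
bool
file_size_ok_sketch (FILE *fp, long expected)
{
  long size;

  if (!fp || fseek (fp, 0L, SEEK_END) != 0)
    return false;

  /* In binary mode, the offset at end of file is the size in bytes.  */
  size = ftell (fp);
  rewind (fp);

  if (size < 0)
    return false;

  return size == expected;
}
```

A file for which this check fails would be rejected. Note that long limits the sketch to sizes that fit in that type on the target platform.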
Matthew White
ff444ebc2a Bugfix: Keep the download progress when alternating metalink:url
* NEWS: Mention the effects of --continue over Metalink
* src/metalink.c (retrieve_from_metalink): On download error, resume
  output_stream with the next mres->url. Keep fully downloaded files
  started with --continue, otherwise rename/remove the file
* testenv/Makefile.am: Add new file
* testenv/Test-metalink-xml-continue.py: New file. Metalink/XML
  continue/keep existing files (HTTP 416) with --continue tests

Before this patch, with --continue, existing and/or fully retrieved
files which fail the sanity tests were renamed (--keep-badhash), or
removed.

This patch ensures that --continue doesn't rename/remove existing
and/or fully retrieved files (HTTP 416) which fail the sanity tests.
2016-09-27 20:28:50 +02:00
Matthew White
96554861f9 Bugfix: Fix NULL filename and output_stream in Metalink module
* NEWS: Mention the Metalink "path/file" name format handling
* src/metalink.c (retrieve_from_metalink): Fix NULL filename, set
  filename to the right "path/file" value
* src/metalink.c (retrieve_from_metalink): Fix NULL output_stream, set
  output_stream to filename when it is created by retrieve_url()
* src/metalink.c (retrieve_from_metalink): Add RFC5854 comments about
  proper metalink:file "path/file" name format handling
* doc/metalink.txt: Update document. Remove resolved bugs

If unique_create() cannot create/open the destination file, filename
and output_stream remain NULL. If fopen() is used instead, filename
always remains NULL. Neither function can create "path/file" trees.

Setting filename to the right value is sufficient to prevent the
SIGSEGV raised when a NULL value is tested. This also allows
retrieve_url() to create a "path/file" tree through
opt.output_document.

Reading NULL as output_stream when it should not be NULL leads to wrong
results. For instance, a non-NULL output_stream indicates that a stream
was interrupted; reading NULL instead leads Wget to assume the contrary.

This patch conforms to the RFC5854 specification:
  The Metalink Download Description Format
  4.1.2.1.  The "name" Attribute
  https://tools.ietf.org/html/rfc5854#section-4.1.2.1
2016-09-27 20:17:08 +02:00
Tim Rühsen
e3fb4c3859 * src/metalink.c (badhash_suffix): Fix quoting
2016-08-04 13:09:28 +02:00
Matthew White
943a6d585f Add new option --keep-badhash to keep Metalink's files with a bad hash
* src/init.c: Add keepbadhash
* src/main.c: Add keep-badhash
* src/options.h: Add keep_badhash
* doc/wget.texi: Add docs for --keep-badhash
* src/metalink.h: Add prototypes badhash_suffix(), badhash_or_remove()
* src/metalink.c: New functions badhash_suffix(), badhash_or_remove().
  (retrieve_from_metalink): Call badhash_or_remove() on download error

With --keep-badhash, append .badhash to Metalink's files with checksum
mismatch. (retrieve_from_metalink): unique_create() may append another
suffix to avoid overwriting existing files.

Without --keep-badhash, remove downloaded files with checksum mismatch
(this conforms to the old behaviour).
2016-08-04 12:03:49 +02:00
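The --keep-badhash behaviour described in commit 943a6d585f (rename to a .badhash name, or remove the file) can be sketched as follows. The functions below are hypothetical stand-ins, not Wget's badhash_suffix() and badhash_or_remove(). In particular, the sketch calls rename() directly and does not reproduce the unique_create() step that avoids overwriting an existing file.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated "FILENAME.badhash",
   or NULL on allocation failure.  */
char *
badhash_name_sketch (const char *filename)
{
  static const char suffix[] = ".badhash";
  size_t len = strlen (filename);
  char *out = malloc (len + sizeof suffix);

  if (!out)
    return NULL;
  memcpy (out, filename, len);
  memcpy (out + len, suffix, sizeof suffix); /* copies the NUL too */
  return out;
}

/* Hypothetical helper: keep a file that failed its checksum under a
   .badhash name when KEEP_BADHASH is true, remove it otherwise.
   Returns 0 on success, nonzero on failure.  */
int
badhash_or_remove_sketch (const char *filename, bool keep_badhash)
{
  char *bad;
  int rc;

  if (!keep_badhash)
    return remove (filename);

  bad = badhash_name_sketch (filename);
  if (!bad)
    return -1;
  rc = rename (filename, bad);
  free (bad);
  return rc;
}
```

Because rename() may replace an existing target on some systems, a real implementation needs the extra uniqueness step the commit message mentions.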
Tim Rühsen
7fad76db4c * src/metalink.c: Remove C++ style comments
2016-08-03 13:48:07 +02:00
Matthew White
e0b60fd073 New: --continue continues partially downloaded Metalink's files
* src/metalink.c (retrieve_from_metalink): Continue file download if
  opt.always_rest is true

Without --continue, download as a new file with a unique name (this
conforms to the old behaviour).
2016-08-03 13:37:27 +02:00
Matthew White
9db02a0c46 Add support for Metalink's md2, and md4 hashes
* bootstrap.conf: Add crypto/md2, and crypto/md4
* src/metalink.c (retrieve_from_metalink): Add md2, and md4 support

This patch adds support for the deprecated (insecure) md2, and md4
Message-Digest algorithms to the Metalink module.
2016-08-03 12:58:43 +02:00
Matthew White
edad3c1df3 Add support for Metalink's md5, sha1, sha224, sha384, and sha512 hashes
* bootstrap.conf: Add crypto/sha512
* src/metalink.c (retrieve_from_metalink): Add md5, sha1, sha224,
  sha384, and sha512 support

Metalink's checksum verification was limited to sha256. This patch
adds support for md5, sha1, sha224, sha384, and sha512.
2016-08-03 12:49:26 +02:00
Darshit Shah
d26377053d Print the fingerprint instead of the raw pointer in debugging message
* src/metalink.c (retrieve_from_metalink): Fix debug message to print the
fingerprint instead of a pointer.

* Do not delete the ChangeLog file, since it is required by the Makefile
and deleting it breaks compilation
2016-03-29 15:09:04 +02:00
Darshit Shah
722675553c Revert "Print the fingerprint instead of the raw pointer in debugging message"
This reverts commit b916595168.
2016-03-29 15:07:29 +02:00
Giuseppe Scrivano
f3e63f0071 * metalink.c (retrieve_from_metalink): Fix typo
2016-03-25 16:46:39 +01:00
Giuseppe Scrivano
b916595168 Print the fingerprint instead of the raw pointer in debugging message
* src/metalink.c (retrieve_from_metalink): Fix debug message to print the
fingerprint instead of a pointer.
2016-03-25 16:23:19 +01:00
Jernej Simončič
bf56bf4560 * src/metalink.c: Specify 'rb' as mode to open file
2015-12-11 09:58:30 +01:00
Tim Rühsen
76da642aaf Include errno.h instead of sys/errno.h (Solaris issue)
* src/metalink.c: Include errno.h instead of sys/errno.h

Reported-by: Dagobert Michelsen <dam@opencsw.org>
2015-11-17 14:42:25 +01:00
Tim Rühsen
c809398e8c Fix null pointer dereference
* src/metalink.c (gpg_skip_verification):
  Check output_stream before fclose
2015-08-30 14:17:47 +02:00
Tim Rühsen
5d55018ce6 Avoid uninitialized variable in metalink code
* src/metalink.c: Init retr_err with METALINK_MISSING_RESOURCE
* src/wget.h: Add enum METALINK_MISSING_RESOURCE
2015-08-04 17:24:59 +02:00
Darshit Shah
4e56a91001 Fix function name collision with OpenSSL library
* src/utils.[ch], src/http.c, src/metalink.c: Rename function
    hex_to_string() to wg_hex_to_string() since it collides with a
    similarly named function in the OpenSSL library.
2015-07-24 23:52:43 +05:30
Hubert Tarasiuk
6064f21c66 Geolocation support for Metalink resources.
* doc/wget.texi: Add information about --preferred-location.
* src/init.c: Add --preferred-location option.
* src/main.c (option_data): Handle --preferred-location argument.
(main): Sort resources based on location if requested.
* src/metalink.c (metalink_res_cmp): Compare based on location if
priority and preference are equal.
* src/options.h (options): Add preferred_location option.
2015-07-20 15:31:06 +02:00
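The ordering described in commit 6064f21c66 (fall back to location when priority and preference are equal) can be sketched as a qsort() comparator. Everything below is illustrative: struct res_sketch, preferred_location_sketch, and res_cmp_sketch() are hypothetical names, not libmetalink's or Wget's. The sketch assumes that a higher preference and a lower priority number both sort first, and that locations are compared case-insensitively.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical resource record, not libmetalink's type.  */
struct res_sketch
{
  const char *url;
  const char *location;   /* country code, may be NULL */
  int preference;         /* assumed: higher sorts first */
  int priority;           /* assumed: lower sorts first */
};

/* Stand-in for the value given with --preferred-location.  */
static const char *preferred_location_sketch = NULL;

/* Return 1 when A and B are equal ignoring case, 0 otherwise.  */
static int
same_location (const char *a, const char *b)
{
  if (!a || !b)
    return 0;
  while (*a && *b)
    {
      if (tolower ((unsigned char) *a) != tolower ((unsigned char) *b))
        return 0;
      a++;
      b++;
    }
  return *a == *b;
}

/* qsort() comparator over an array of struct res_sketch.  */
int
res_cmp_sketch (const void *v1, const void *v2)
{
  const struct res_sketch *r1 = v1;
  const struct res_sketch *r2 = v2;

  if (r1->preference != r2->preference)
    return r1->preference > r2->preference ? -1 : 1;
  if (r1->priority != r2->priority)
    return r1->priority < r2->priority ? -1 : 1;

  /* Tie: a resource in the preferred location sorts first.  */
  return same_location (preferred_location_sketch, r2->location)
         - same_location (preferred_location_sketch, r1->location);
}
```

With the preferred location set to "JP", two mirrors of equal priority located in "de" and "jp" would be ordered jp first, and a mirror with a worse priority would still come last regardless of its location.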
Hubert Tarasiuk
97389a7497 Support at most one file signature. Adapt comments to libmetalink 0.13.
* src/metalink.c (retrieve_from_metalink): Add comment about new
libmetalink version. Do not iterate over signatures - support just one.
2015-07-20 15:31:06 +02:00
Hubert Tarasiuk
225a87d4a2 Move some Metalink-related code from http.c to metalink.c.
* src/http.c: Move find_key_value, has_key, find_key_values.
* src/metalink.c: To here.
* src/metalink.h: Make them non-static and add prototypes here.
2015-07-20 15:31:06 +02:00
Hubert Tarasiuk
37b58e3976 Metalink support.
* bootstrap.conf: Add crypto/sha256
* configure.ac: Look for libmetalink and GPGME
* doc/wget.texi: Add --input-metalink and --metalink-over-http
options description.
* po/POTFILES.in: Add metalink.c
* src/Makefile.am: Add new translation unit (metalink.c)
* src/http.c (http_stat): Add metalink field.
(free_stat): Free metalink field.
(find_key_value): Find value of given key in header string.
(has_key): Check if token exists in header string.
(find_key_values): Find all key=value pairs in header string.
(metalink_from_http): Obtain Metalink metadata from HTTP response.
(gethttp): Call metalink_from_http if requested.
(http_loop): Request Metalink metadata from the HTTP response when
requested. Fall back to regular download if no Metalink metadata is found.
* src/init.c: Add --input-metalink and --metalink-over-http options
* src/main.c (option_data): Handle --input-metalink and
--metalink-over-http cmd arguments.
(print_help): Print --input-metalink option description.
(main): Retrieve files from Metalink file
* src/metalink.c (retrieve_from_metalink): Download files described by
metalink.
(metalink_res_cmp): Comparator for resources priority-sorting.
* src/metalink.h: Create header for metalink.c
(RES_TYPE_SUPPORTED): Define supported resources media.
(DEFAULT_PRI): Default mirror priority for Metalink over HTTP.
(VALID_PRI_RANGE): Valid priority range.
* src/options.h (options): Add input_metalink and metalink_over_http
options.
* src/utils.c (hex_to_string): Convert binary data to ASCII-hex.
* src/utils.h (hex_to_string): Add prototype.
* src/wget.h: Add metalink-related error enums
Add METALINK_METADATA flag for document type.
2015-07-20 15:30:39 +02:00
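Commit 37b58e3976 introduces hex_to_string() (later renamed in commit 4e56a91001) to convert binary data to ASCII-hex, which is how a computed digest can be compared with a checksum written in a Metalink file. A minimal sketch of such a conversion follows. The name hex_string_sketch(), its parameter order, and the use of lowercase digits are assumptions, not Wget's actual interface.

```c
#include <stddef.h>

/* Hypothetical helper, not Wget's implementation: write the lowercase
   hexadecimal form of the LEN bytes at DATA into OUT, followed by a
   NUL.  OUT must have room for 2 * LEN + 1 characters.  */
void
hex_string_sketch (char *out, const unsigned char *data, size_t len)
{
  static const char digits[] = "0123456789abcdef";
  size_t i;

  for (i = 0; i < len; i++)
    {
      out[2 * i] = digits[data[i] >> 4];        /* high nibble */
      out[2 * i + 1] = digits[data[i] & 0x0f];  /* low nibble */
    }
  out[2 * len] = '\0';
}
```

For the three bytes 0x00, 0xab, 0xff the sketch writes "00abff" into a buffer of at least seven characters.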