GNU Wget Metalink module

Evaluation of the Metalink/XML and Metalink/HTTP implementations


1. Introduction
***************

This document, and the results it contains, focus on the evaluation
of the Metalink/XML and Metalink/HTTP implementations.

The "Directory Options" mentioned here are used on the command line in
conjunction with the option '--input-metalink=file' for Metalink/XML,
and '--metalink-over-http' for Metalink/HTTP.

$ wget --input-metalink=<file> [directory options]
$ wget --metalink-over-http [directory options] <url>

2. Notes
********

Tests for metalink:file names beginning with '/', '~/', './', or '../'
(e.g. "/path/file") shall be run manually due to security concerns.

3. Metalink files used as reference
***********************************

3.1 Test: metalink:file with "path/file" name format
====================================================

cat > test.meta4 << EOF
<?xml version="1.0" encoding="UTF-8"?>
<metalink xmlns="urn:ietf:params:xml:ns:metalink">
  <file name="path/file">
    <size>543</size>
    <hash type="sha256">d37d3965f8e1a7b16504b4273b09c392776b7e4dd17e601256c7b2fd9ce5f56e</hash>
    <hash type="md5">0f6ff5cdc15603f1b81227b5a296f001</hash>
    <url>http://wrongurl.really/gnu/wget/wget-1.18.tar.xz.sig</url>
    <url>http://ftpmirror.gnu.org/wget/wget-1.18.tar.xz.sig</url>
    <url>http://ftp.gnu.org/gnu/wget/wget-1.18.tar.xz.sig</url>
    <url>http://nl.mirror.babylon.network/gnu/wget/wget-1.18.tar.xz.sig</url>
  </file>
</metalink>
EOF

4. `wget --input-metalink=test.meta4`
*************************************

4.1 Implemented safety features
===============================

Any metalink:file name containing an absolute, relative, or home path
(see '2. Notes') parsed from Metalink/XML files is rejected.

This is a libmetalink design decision, implemented in the function
metalink_check_safe_path(). This feature shall not be modified.

All of the above conforms to the RFC 5854 standard.

References:
https://tools.ietf.org/html/rfc5854#section-4.1.2.1
https://tools.ietf.org/html/rfc5854#section-4.2.8.3
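
As an illustration only, the rejection rule can be approximated in
shell. This sketch is not libmetalink's metalink_check_safe_path(); it
merely rejects names that begin with one of the prefixes listed in
'2. Notes'. The function name is_safe_name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, NOT the libmetalink implementation.
# Succeeds (exit status 0) when the name does not start with any of the
# prefixes listed in '2. Notes'; fails (exit status 1) otherwise.
is_safe_name() {
  case "$1" in
    # '~/' is quoted so the shell does not expand it to $HOME.
    /*|'~/'*|./*|../*) return 1 ;;  # absolute, home, or relative path
    *) return 0 ;;
  esac
}

for name in "path/file" "/path/file" "~/file" "./file" "../file"; do
  if is_safe_name "$name"; then
    echo "accept: $name"
  else
    echo "reject: $name"
  fi
done
```

Only "path/file" is accepted by this loop; the other four names are
reported as rejected.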

4.2 File download behaviour
===========================

When a Metalink/XML file is parsed, wget performs these steps:
1. create the metalink:file "path/file" tree;
2. download the metalink:url file as "path/file";
3. verify the "path/file" size, if declared;
4. verify the "path/file" checksum.

All of the above conforms to the RFC 5854 standard.

References:
https://tools.ietf.org/html/rfc5854
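
Steps 3 and 4 can be reproduced by hand to cross-check a download. The
following is a minimal sketch, assuming the sha256sum utility from GNU
coreutils is available; the function name verify_download and its
argument order are hypothetical and are not part of wget.

```shell
#!/bin/sh
# Hypothetical helper: verify_download FILE SIZE SHA256
# Mirrors steps 3 and 4 above: size check first, then checksum check.
# Pass an empty SIZE to skip the size check ("if declared").
verify_download() {
  file=$1
  want_size=$2
  want_sha256=$3

  # Step 3: compare the byte count, if a size was declared.
  if [ -n "$want_size" ]; then
    got_size=$(wc -c < "$file" | tr -d ' ')
    [ "$got_size" = "$want_size" ] || { echo "size mismatch"; return 1; }
  fi

  # Step 4: compare the SHA-256 digest.
  got_sha256=$(sha256sum "$file" | cut -d ' ' -f 1)
  [ "$got_sha256" = "$want_sha256" ] || { echo "checksum mismatch"; return 1; }

  echo "ok"
}

# Self-contained demonstration on a temporary file (no network access).
tmp=$(mktemp)
printf 'hello' > "$tmp"
verify_download "$tmp" 5 "$(sha256sum "$tmp" | cut -d ' ' -f 1)"
verify_download "$tmp" 6 "unused" || true
rm -f "$tmp"
```

For the reference file in section 3.1, the call would be
`verify_download path/file 543 d37d3965f8e1a7b16504b4273b09c392776b7e4dd17e601256c7b2fd9ce5f56e`.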

4.3 Questionable behaviours
===========================

If two or more metalink:file elements are the same, wget downloads
them all.

5. `wget --metalink-over-http`
******************************

5.1 Implemented safety features
===============================

The function url_file_name() is responsible for parsing the URL's file
name and applying the "Directory Options" given on the command line.

The use of libmetalink's metalink_check_safe_path() should not be
necessary (see '4.1 Implemented safety features').

All of the above conforms to Wget's usual download behaviour.

References:
wget(1)

5.2 File download behaviour
===========================

When a Metalink/HTTP header is parsed, wget performs these steps:
1. extract metalink metadata from the header;
2. download the file from the mirror with the highest priority;
3. verify the file's size, if declared;
4. verify the file's checksum.

All of the above conforms to Wget's usual download behaviour and to
the RFC 6249 standard.

References:
wget(1)
https://tools.ietf.org/html/rfc6249
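
To make steps 1 and 2 concrete, the sketch below picks a mirror from
Metalink/HTTP response headers. It is not wget's code. It assumes the
RFC 6249 form 'Link: <url>; rel=duplicate; pri=N' with an unquoted rel
value, treats the lowest pri value as the highest priority, and skips
Link headers that carry no pri parameter. The function name
best_mirror and the example host names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, NOT wget's implementation.
# Reads HTTP response headers on stdin and prints the rel=duplicate
# mirror URL with the lowest "pri" value (the most preferred mirror).
best_mirror() {
  grep -i '^Link:' |
    grep 'rel=duplicate' |
    # Rewrite each header as "<pri> <url>"; headers without pri are dropped.
    sed -n 's/^[^<]*<\([^>]*\)>.*pri=\([0-9][0-9]*\).*/\2 \1/p' |
    sort -n |
    head -n 1 |
    cut -d ' ' -f 2
}

best_mirror <<'EOF'
HTTP/1.1 200 OK
Link: <http://mirror-b.example/file>; rel=duplicate; pri=2
Link: <http://mirror-a.example/file>; rel=duplicate; pri=1
Link: <http://example.com/file.sig>; rel=describedby; type="application/pgp-signature"
EOF
```

With the headers above, the function prints the mirror-a URL, because
pri=1 ranks ahead of pri=2 and the describedby link is ignored.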

6. Directory Options
********************

'-nd'
'--no-directories'

Does not apply to Metalink/XML files (i.e. --input-metalink=<file>).

Applies to Metalink/HTTP URLs as described in the Wget manual, see
wget(1). The target URL is the URL given on the command line.

'-x'
'--force-directories'

Does not apply to Metalink/XML files (i.e. --input-metalink=<file>).

Applies to Metalink/HTTP URLs as described in the Wget manual, see
wget(1). The target URL is the URL given on the command line.

'-nH'
'--no-host-directories'

Does not apply to Metalink/XML files (i.e. --input-metalink=<file>).

Applies to Metalink/HTTP URLs as described in the Wget manual, see
wget(1). The target URL is the URL given on the command line.

'--protocol-directories'

Does not apply to Metalink/XML files (i.e. --input-metalink=<file>).

Applies to Metalink/HTTP URLs as described in the Wget manual, see
wget(1). The target URL is the URL given on the command line.

'--cut-dirs=number'

Does not apply to Metalink/XML files (i.e. --input-metalink=<file>).

Applies to Metalink/HTTP URLs as described in the Wget manual, see
wget(1). The target URL is the URL given on the command line.

'-P prefix'
'--directory-prefix=prefix'

Sets the top of the retrieval tree to prefix for both Metalink/XML
and Metalink/HTTP downloads, see wget(1).

If combining the prefix with the file name results in an absolute,
relative, or home path, the directory components are stripped and
only the basename is used. See '4.1 Implemented safety features'.
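
The fallback described above can be sketched in shell. This follows
the wording of the paragraph only, not the wget source; the function
name target_name is hypothetical, and the path test reuses the
prefixes listed in '2. Notes'.

```shell
#!/bin/sh
# Hypothetical helper, NOT wget's implementation: target_name PREFIX NAME
# Joins PREFIX and NAME; if the result starts with '/', '~/', './', or
# '../', only the basename of the combined path is kept.
target_name() {
  combined="$1/$2"
  case "$combined" in
    # '~/' is quoted so the shell does not expand it to $HOME.
    /*|'~/'*|./*|../*) basename "$combined" ;;
    *) printf '%s\n' "$combined" ;;
  esac
}

target_name "dir" "path/file"    # prints "dir/path/file"
target_name "/tmp" "path/file"   # prints "file"
```

In this sketch a plain relative prefix such as "dir" keeps the whole
tree, while an absolute prefix such as "/tmp" reduces the target to
the basename.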