mirror of
https://github.com/mirror/wget.git
synced 2024-12-28 22:00:27 +08:00
c403e67935
* src/http.c (metalink_from_http): Process the Content-Type header. Add an application/metalink4+xml URL as metalink metaurl. If the option opt.content_disposition is true, the Content-Disposition's filename is the metaurl's name * doc/wget.texi: Update --content-disposition and --metalink-over-http * doc/metalink-standard.txt: Update doc. Content-Type/Disposition processing through --metalink-over-http. Update download naming system about --trust-server-names and --content-disposition * testenv/Makefile.am: Add new files * testenv/Test-metalink-http-xml-type.py: New file. Metalink/HTTP Content-Type/Disposition header automated Metalink/XML tests * testenv/Test-metalink-http-xml-type-trust.py: New file. Metalink/HTTP Content-Type/Disposition header with --trust-server-names automated Metalink/XML tests * testenv/Test-metalink-http-xml-type-content.py: New file. Metalink/HTTP Content-Type/Disposition header with --content-disposition automated Metalink/XML tests * testenv/Test-metalink-http-xml-type-trust-content.py: New file. Metalink/HTTP Content-Type/Disposition header with --trust-server-names and --content-disposition automated Metalink/XML tests Process the Content-Type header, identify an application/metalink4+xml file. The Content-Disposition could provide an alternate name through the "filename" field for the metalink xml file. Respectively, the cli options --metalink-over-http and --content-disposition are required. When Metalink/XML auto-processing, to use the Content-Disposition's filename, the cli option --trust-server-names is also required.
234 lines
7.2 KiB
Plaintext
234 lines
7.2 KiB
Plaintext
GNU Wget Metalink recommended behaviour
|
|
|
|
Metalink/XML and Metalink/HTTP standard reference
|
|
|
|
|
|
1. Security features
|
|
********************
|
|
|
|
Only metalink:file elements with safe "name" fields shall be accepted
|
|
[1 #section-4.1.2.1]. If unsafe metalink:file elements are saved, any
|
|
related test shall fail (see '2. Tests').
|
|
|
|
By design, libmetalink rejects unsafe metalink:file elements [3]:
|
|
* lib/metalink_helper.c (metalink_check_safe_path): Verify path
|
|
|
|
1.1 Exceptions
|
|
==============
|
|
|
|
The option --directory-prefix could allow to use an absolute, relative
|
|
or home path.
|
|
|
|
2. Tests
|
|
********
|
|
|
|
Saving a file to an unexpected path poses a security problem. We must
|
|
ensure that Wget's automated tests never modify the root and the home
|
|
paths or descend/escalate to a relative path unexpectedly.
|
|
|
|
2.1 Metalink/XML implemented tests
|
|
==================================
|
|
|
|
See testenv/Makefile.am (METALINK_TESTS).
|
|
|
|
2.2 Metalink/HTTP implemented tests
|
|
===================================
|
|
|
|
See testenv/Makefile.am (METALINK_TESTS).
|
|
|
|
3. Download file name
|
|
*********************
|
|
|
|
The download file name shall be decided by precise rules which prevent
|
|
any naming uncertainty and security issues.
|
|
|
|
3.1 Naming rules
|
|
================
|
|
|
|
The final name of downloaded files is computed starting from a trusted
|
|
name, which is then combined with the "Directory Options". The result
|
|
is verified and eventually made safer following security rules. If the
|
|
final name isn't found safe enough, then the file isn't downloaded.
|
|
|
|
Depending on the options used, a suffix could be appended to the final
|
|
name to not overwrite existing files.
|
|
|
|
3.1.1 The trusted name
|
|
======================
|
|
|
|
The option --trust-server-names decides what is the trusted name.
|
|
|
|
Any Metalink/XML element with an unsafe metalink:file "name" field is
|
|
ignored, see '1. Security features'.
|
|
|
|
3.1.1.1 Without --trust-server-names
|
|
====================================
|
|
|
|
When --trust-server-names is off, the basename of the --input-metalink
|
|
file, if available, or of the mother URL is trusted. This trusted name
|
|
is the radix of any subsequent file name.
|
|
|
|
When a Metalink/HTTP in encountered, any fetched Metalink/XML file has
|
|
its own ordinal number appended as suffix to the trusted name. In this
|
|
case scenario, an unique Metalink/XML file is saved each time applying
|
|
an additional suffix to the currently computed name when necessary.
|
|
|
|
The files described by a Metalink/XML file will be named sequentially
|
|
applying an additional suffix to the currently trusted/computed name.
|
|
|
|
3.1.1.2 With --trust-server-names
|
|
=================================
|
|
|
|
When --trust-server-names is on, the metalink:file "name" field parsed
|
|
from Metalink/XML files is trusted. When no Metalink/XML is available,
|
|
the mother URL is trusted.
|
|
|
|
Any Metalink/HTTP application/metalink4+xml file is saved using the
|
|
basename of its own Link header "name" field, if available.
|
|
|
|
In conjunction with the option --content-disposition, a 'Content-Type:
|
|
application/metalink4+xml' file is saved using the basename of its own
|
|
Content-Disposition header "filename" field, if available.
|
|
|
|
3.1.2 The final name
|
|
====================
|
|
|
|
The "Directory Options" are combined with the trusted name. The result
|
|
is evaluated again by the '1. Security features'. If the path is found
|
|
unsafe, only the basename of the final name is considered. If this is
|
|
found unsafe too, the file is not downloaded.
|
|
|
|
4. Metalink/XML
|
|
***************
|
|
|
|
4.1 Example files
|
|
=================
|
|
|
|
See [1 #section-1.1].
|
|
|
|
cat > bugus.meta4 << EOF
|
|
<?xml version="1.0" encoding="UTF-8"?>
|
|
<metalink xmlns="urn:ietf:params:xml:ns:metalink">
|
|
<file name="/dir/A/File1">
|
|
<size>1617</size>
|
|
<hash type="sha256">ecb3dff2648667513e31554b3ad054ccd89fce38e33367c9459ac3a285153742</hash>
|
|
<url>http://another.url/common_name</url>
|
|
<url>http://ftpmirror.gnu.org/bash/bash-4.3-patches/bash43-001</url>
|
|
</file>
|
|
<file name="dir/B/File2">
|
|
<size>1594</size>
|
|
<hash type="sha256">eee7cd7062ab29a9e4f02924d9c367264dcb8b162703f74ff6eb8f175a91502b</hash>
|
|
<url>http://another.url/again/common_name</url>
|
|
<url>http://ftpmirror.gnu.org/bash/bash-4.3-patches/bash43-002</url>
|
|
</file>
|
|
</metalink>
|
|
EOF
|
|
|
|
4.2 Command line example
|
|
========================
|
|
|
|
$ wget --input-metalink=bogus.meta4
|
|
|
|
4.3 Metalink/XML file parsing
|
|
=============================
|
|
|
|
The metalink xml file is parsed by one of the following libmetalink's
|
|
functions [3], depending upon the library configured to use:
|
|
* lib/libexpat_metalink_parser.c (metalink_parse_file): Expat [4]
|
|
* lib/libxml2_metalink_parser.c (metalink_parse_file): Libxml2 [5]
|
|
|
|
The result returned doesn't include unsafe metalink:file elements, as
|
|
stated at point '1. Security features'.
|
|
|
|
An empty result shall not be considered an error. Parsing errors will
|
|
be informed to the caller of libmetalink's metalink_parse_file().
|
|
|
|
4.4 Saving files
|
|
================
|
|
|
|
Fetched metalink:file elements shall be wrote using the unique "name"
|
|
field as file name [1 #section-4.1.2.1].
|
|
|
|
A metalink:file url's file name shall not substitute the "name" field.
|
|
|
|
Security exceptions are explained in '3. Download file name'.
|
|
|
|
4.5 Multi-Source download
|
|
=========================
|
|
|
|
Parallel range requests are allowed [1 #section-1].
|
|
|
|
5. Metalink/HTTP
|
|
****************
|
|
|
|
5.1 HTTP server
|
|
===============
|
|
|
|
The local server http://127.0.0.1 is used as reference in the course
|
|
of this chapter. Any server service capable of sending Metalink/HTTP
|
|
header answers may be used.
|
|
|
|
5.2 Command line example
|
|
========================
|
|
|
|
$ wget --metalink-over-http http://127.0.0.1/dir/file.ext
|
|
|
|
5.3 Metalink/HTTP header answer
|
|
===============================
|
|
|
|
See [2 #section-1.1].
|
|
|
|
Etag: "thvDyvhfIqlvFe+A9MYgxAfm1q5="
|
|
Link: <http://www2.example.com/example.ext>; rel=duplicate
|
|
Link: <ftp://ftp.example.com/example.ext>; rel=duplicate
|
|
Link: <http://example.com/example.ext.torrent>; rel=describedby;
|
|
type="application/x-bittorrent"
|
|
Link: <http://example.com/example.ext.meta4>; rel=describedby;
|
|
type="application/metalink4+xml"
|
|
Link: <http://example.com/example.ext.asc>; rel=describedby;
|
|
type="application/pgp-signature"
|
|
Digest: SHA-256=MWVkMWQxYTRiMzk5MDQ0MzI3NGU5NDEyZTk5OWY1ZGFmNzgyZTJlO
|
|
DYzYjRjYzFhOTlmNTQwYzI2M2QwM2U2MQ==
|
|
|
|
See [2 #section-4].
|
|
|
|
Link: <http://example.com/example.ext.torrent>; rel=describedby;
|
|
type="application/x-bittorrent"; name="differentname.ext"
|
|
Link: <http://example.com/example.ext.meta4>; rel=describedby;
|
|
type="application/metalink4+xml"
|
|
|
|
5.4 Saving files
|
|
================
|
|
|
|
When none of --output-document and/or --content-disposition is used,
|
|
the file name to wrote is computed from the cli's url hierarchy. The
|
|
purpose of the "Directory Options" is as usual, and the file name is
|
|
the cli's url file name, see wget(1).
|
|
|
|
The url followed to download the file shall not substitute the cli's
|
|
url to compute the file name to wrote, except when it redirects to a
|
|
Metalink/XML file, following the rules in '3. Download file name'.
|
|
|
|
5.5 Multi-Source download
|
|
=========================
|
|
|
|
Parallel range requests are allowed [2 #section-7].
|
|
|
|
4. References
|
|
*************
|
|
|
|
[1] The Metalink Download Description Format
|
|
https://tools.ietf.org/html/rfc5854
|
|
|
|
[2] Metalink/HTTP: Mirrors and Hashes
|
|
https://tools.ietf.org/html/rfc6249
|
|
|
|
[3] Libmetalink
|
|
https://github.com/metalink-dev/libmetalink
|
|
|
|
[4] Expat
|
|
http://www.libexpat.org
|
|
|
|
[5] Libxml2
|
|
http://xmlsoft.org
|