GNU Wget Metalink recommended behaviour Metalink/XML and Metalink/HTTP standard reference 1. Security features ******************** Only metalink:file elements with safe "name" fields shall be accepted [1 #section-4.1.2.1]. If unsafe metalink:file elements are saved, any related test shall fail (see '2. Tests'). By design, libmetalink rejects unsafe metalink:file elements [3]: * lib/metalink_helper.c (metalink_check_safe_path): Verify path 1.1 Exceptions ============== The option --directory-prefix could allow to use an absolute, relative or home path. 2. Tests ******** Saving a file to an unexpected path poses a security problem. We must ensure that Wget's automated tests never modify the root and the home paths or descend/escalate to a relative path unexpectedly. 2.1 Metalink/XML implemented tests ================================== See testenv/Makefile.am (METALINK_TESTS). 2.2 Metalink/HTTP implemented tests =================================== See testenv/Makefile.am (METALINK_TESTS). 3. Download file name ********************* The download file name shall be decided by precise rules which prevent any naming uncertainty and security issues. 3.1 Naming rules ================ The final name of downloaded files is computed starting from a trusted name, which is then combined with the "Directory Options". The result is verified and eventually made safer following security rules. If the final name isn't found safe enough, then the file isn't downloaded. Depending on the options used, a suffix could be appended to the final name to not overwrite existing files. 3.1.1 The trusted name ====================== The option --trust-server-names decides what is the trusted name. Any Metalink/XML element with an unsafe metalink:file "name" field is ignored, see '1. Security features'. 3.1.1.1 Without --trust-server-names ==================================== When --trust-server-names is off, the basename of the --input-metalink file, if available, or of the mother URL is trusted. This trusted name is the radix of any subsequent file name. When a Metalink/HTTP in encountered, any fetched Metalink/XML file has its own ordinal number appended as suffix to the trusted name. In this case scenario, an unique Metalink/XML file is saved each time applying an additional suffix to the currently computed name when necessary. The files described by a Metalink/XML file will be named sequentially applying an additional suffix to the currently trusted/computed name. 3.1.1.2 With --trust-server-names ================================= When --trust-server-names is on, the metalink:file "name" field parsed from Metalink/XML files is trusted. When no Metalink/XML is available, the mother URL is trusted. Any Metalink/HTTP application/metalink4+xml file is saved using the basename of its own Link header "name" field, if available. In conjunction with the option --content-disposition, a 'Content-Type: application/metalink4+xml' file is saved using the basename of its own Content-Disposition header "filename" field, if available. 3.1.2 The final name ==================== The "Directory Options" are combined with the trusted name. The result is evaluated again by the '1. Security features'. If the path is found unsafe, only the basename of the final name is considered. If this is found unsafe too, the file is not downloaded. 4. Metalink/XML *************** 4.1 Example files ================= See [1 #section-1.1]. cat > bogus.meta4 << EOF 1617 ecb3dff2648667513e31554b3ad054ccd89fce38e33367c9459ac3a285153742 http://another.url/common_name http://ftpmirror.gnu.org/bash/bash-4.3-patches/bash43-001 1594 eee7cd7062ab29a9e4f02924d9c367264dcb8b162703f74ff6eb8f175a91502b http://another.url/again/common_name http://ftpmirror.gnu.org/bash/bash-4.3-patches/bash43-002 EOF 4.2 Command line example ======================== $ wget --input-metalink=bogus.meta4 4.3 Metalink/XML file parsing ============================= The metalink xml file is parsed by one of the following libmetalink's functions [3], depending upon the library configured to use: * lib/libexpat_metalink_parser.c (metalink_parse_file): Expat [4] * lib/libxml2_metalink_parser.c (metalink_parse_file): Libxml2 [5] The result returned doesn't include unsafe metalink:file elements, as stated at point '1. Security features'. An empty result shall not be considered an error. Parsing errors will be informed to the caller of libmetalink's metalink_parse_file(). 4.4 Saving files ================ Fetched metalink:file elements shall be wrote using the unique "name" field as file name [1 #section-4.1.2.1]. A metalink:file url's file name shall not substitute the "name" field. Security exceptions are explained in '3. Download file name'. 4.5 Multi-Source download ========================= Parallel range requests are allowed [1 #section-1]. 5. Metalink/HTTP **************** 5.1 HTTP server =============== The local server http://127.0.0.1 is used as reference in the course of this chapter. Any server service capable of sending Metalink/HTTP header answers may be used. 5.2 Command line example ======================== $ wget --metalink-over-http http://127.0.0.1/dir/file.ext 5.3 Metalink/HTTP header answer =============================== See [2 #section-1.1]. Etag: "thvDyvhfIqlvFe+A9MYgxAfm1q5=" Link: ; rel=duplicate Link: ; rel=duplicate Link: ; rel=describedby; type="application/x-bittorrent" Link: ; rel=describedby; type="application/metalink4+xml" Link: ; rel=describedby; type="application/pgp-signature" Digest: SHA-256=MWVkMWQxYTRiMzk5MDQ0MzI3NGU5NDEyZTk5OWY1ZGFmNzgyZTJlO DYzYjRjYzFhOTlmNTQwYzI2M2QwM2U2MQ== See [2 #section-4]. Link: ; rel=describedby; type="application/x-bittorrent"; name="differentname.ext" Link: ; rel=describedby; type="application/metalink4+xml" 5.4 Saving files ================ When none of --output-document and/or --content-disposition is used, the file name to wrote is computed from the cli's url hierarchy. The purpose of the "Directory Options" is as usual, and the file name is the cli's url file name, see wget(1). The url followed to download the file shall not substitute the cli's url to compute the file name to wrote, except when it redirects to a Metalink/XML file, following the rules in '3. Download file name'. 5.5 Multi-Source download ========================= Parallel range requests are allowed [2 #section-7]. 4. References ************* [1] The Metalink Download Description Format https://tools.ietf.org/html/rfc5854 [2] Metalink/HTTP: Mirrors and Hashes https://tools.ietf.org/html/rfc6249 [3] Libmetalink https://github.com/metalink-dev/libmetalink [4] Expat http://www.libexpat.org [5] Libxml2 http://xmlsoft.org