* src/url.c [HAVE_ICONV]: Include iconv.h and langinfo.h.
(convert_fname): New function.
[HAVE_ICONV]: Convert file name from remote encoding to local
encoding.
(url_file_name): Call convert_fname.
(filechr_table): Don't consider bytes in 128..159 as control
characters.
* tests/Test-ftp-iri.px: Fix the expected file name to match the
new file-name recoding. State the remote encoding explicitly on
the Wget command line.
* NEWS: Mention the URI recoding when built with libiconv.
* src/iri.c: Kick out the last converted character from iconv()
Thanks to Eli Zaretskii <eliz@gnu.org> for suggesting the fix.
Reported-by: "Andries E. Brouwer" <Andries.Brouwer@cwi.nl>
* src/ftp.c (getftp): on error, close the file and attempt to remove it
before exiting.
* src/hsts.c (hsts_store_open): update modification time at the end.
* src/options.h (CHECK_CERT_MODES): Remove C99 style comma after last
value
* src/progress.c (create_image): Do not mix statements and declarations
* src/init.c (cmd_boolean_internal): Mark unused parameters
* src/progress.c (bar_create): Define size of progress buffer explicitly
(create_image): Clean up progress bar image creation. Use memset
instead of for loops to create arrays of the same byte.
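As an illustration of the memset cleanup, here is a minimal sketch. The helper fill_bar and its parameters are hypothetical and are not taken from progress.c; the sketch only shows how one memset per run of identical bytes can replace a per-byte loop.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not Wget's create_image(): draw a bar of WIDTH
   cells into BUF, which must have room for WIDTH + 1 bytes.  DONE cells
   are drawn as '=' and the remaining cells as ' '.  */
static void
fill_bar (char *buf, size_t width, size_t done)
{
  if (done > width)
    done = width;

  /* One memset per run of identical bytes replaces a per-byte loop.  */
  memset (buf, '=', done);
  memset (buf + done, ' ', width - done);
  buf[width] = '\0';
}
```

Calling fill_bar (buf, 10, 4) on an 11-byte buffer leaves four '=' characters followed by six spaces.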
* src/http.c (initialize_request): Fix wrong params to search_netrc()
Regression introduced in commit 29850e77
Reported-by: Axel Reinhold <axel@freakout.de>
* src/hsts.c (hsts_find_entry): Fix freeing memory
(hsts_remove_entry): Remove freeing host member
(hsts_match): Free host member here
(hsts_store_entry): Free host member here
(test_url_rewrite): Fix 'created' value
(test_hsts_read_database): Fix 'created' value
Reported-by: Dagobert Michelsen <dam@opencsw.org>
* src/ftp-basic.c: The code for the new FTPS functionality was unintentionally
inside a #ifdef IPV6 block. Move the code around so that it is defined even when
IPV6 isn't used
* src/http.c (gethttp,http_loop):
Do not download/save file on error when --spider is enabled and not
working recursively.
Reported-by: Сковорода Никита Андреевич <chalkerx@gmail.com>
Fixes #45821
* src/convert.c (convert_links_in_hashtable, convert_links):
test for CO_CONVERT_BASENAME_ONLY.
(convert_basename): new function.
* src/convert.h: new constant CO_CONVERT_BASENAME_ONLY.
* src/init.c, src/main.c, src/options.h: new option "--convert-file-only".
* doc/wget.texi: updated documentation.
Reviewed-by: Gabriel Somlo <somlo@cmu.edu>
* src/hsts.c (hsts_read_database): get an open file handle
instead of a file name.
(hsts_store_dump): get an open file handle
instead of a file name.
(hsts_store_open): open the file and pass the open file handle.
(hsts_store_save): lock the file before the read-merge-dump
process.
Reported-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
* src/hsts.c (hsts_store_merge): call hsts_new_entry() if the entry
does not exist in the database.
When merging the existing HSTS database on disk with the one in memory,
the entries that were on disk but not in memory were ignored. Thus,
only the existing entries were merged. This behavior was only triggered
when more than one Wget process was using the same HSTS database
simultaneously. This commit fixes the bug by adding the new entries
to the in-memory database if they were not found there.
* http.c (digest_authentication_encode): Wget already errors out if
qop != "auth". Then it makes no sense to test for qop == "auth-int"
later on. Currently, Wget does not support the "auth-int" qop value
and until somebody requests it, it may remain so.
* http.c (digest_authentication_encode): Some servers are still
using the obsolete RFC 2069 Digest Authentication. Allow Digest
authentication without the qop parameter for this.
Reported-by: Andreas Longwitz <longwitz@incore.de>
* doc/wget.texi: updated documentation to reflect the new FTPS functionality.
* src/ftp-basic.c (ftp_greeting): new function to read the server's greeting.
(ftp_login): greeting code was previously here. Moved to ftp_greeting to
support FTPS implicit mode.
(ftp_auth): wrapper around the AUTH TLS command.
(ftp_ccc): wrapper around the CCC command.
(ftp_pbsz): wrapper around the PBSZ command.
(ftp_prot): wrapper around the PROT command.
* src/ftp.c (get_ftp_greeting): new static function.
(init_control_ssl_connection): new static function to start SSL/TLS on the
control channel.
(getftp): added hooks to support FTPS commands (RFCs 2228 and 4217).
(ftp_loop_internal): test for new FTPS error codes.
* src/ftp.h: new enum 'prot_level' with available FTPS protection levels +
prototypes of previous functions. New flag for enum 'wget_ftp_fstatus' to track
whether the data channel has some security mechanism enabled or not.
* src/gnutls.c (struct wgnutls_transport_context): new field 'session_data'.
(wgnutls_close): free GnuTLS session data before exiting.
(ssl_connect_wget): save/resume SSL/TLS session.
* src/http.c (establish_connection): refactor ssl_connect_wget call.
(metalink_from_http): take into account SCHEME_FTPS as well.
* src/init.c, src/main.c, src/options.h: new command line/wgetrc options.
(main): in recursive downloads, check for SCHEME_FTPS as well.
* src/openssl.c (struct openssl_transport_context): new field 'sess'.
(ssl_connect_wget): save/resume SSL/TLS session.
* src/retr.c (retrieve_url): check new scheme SCHEME_FTPS.
* src/ssl.h (ssl_connect_wget): refactor. New parameter of type 'int *'.
* src/url.c, src/url.h: new scheme SCHEME_FTPS.
* src/wget.h: new FTPS error codes.
* src/metalink.h: support FTPS scheme.
* src/progress.c (create_image): progress only when in foreground
Sometimes I start wget, but the remote site is too slow, so I rather
want to run it in background, however when I simply use job control
for that, wget will keep spewing the progress bar all over my
terminal. I have found the SIGHUP/SIGUSR1 feature to redirect output
to a log file, but I think the following small patch is even more
useful, since the progress bar will simply resume when wget is
foregrounded again (also, the final message is still printed to the
terminal in any case).
* http.c (test_parse_range_header): New function to test the
function for parsing the HTTP/1.1 Content-Range header.
* test.[ch]: Same
* http.c (parse_content_range): Fix parsing code. Fail on scenarios
mentioned in RFC 7233.
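For illustration, a minimal sketch of strict Content-Range parsing follows. It is not Wget's parse_content_range(); the names parse_number and parse_range_sketch are hypothetical, the unit comparison is case-sensitive for simplicity, and the "bytes */LENGTH" form is not handled. It rejects ranges where the last position precedes the first, or where the complete length does not exceed the last position.

```c
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Parse a run of decimal digits at *P into *OUT and advance *P.
   Returns false if there is no digit or the value overflows.  */
static bool
parse_number (const char **p, long long *out)
{
  char *end;

  if (!isdigit ((unsigned char) **p))
    return false;
  errno = 0;
  *out = strtoll (*p, &end, 10);
  if (errno == ERANGE)
    return false;
  *p = end;
  return true;
}

/* Hypothetical sketch.  Accepts "bytes FIRST-LAST/LENGTH" and
   "bytes FIRST-LAST/*"; *LENGTH is set to -1 for the "*" form.  */
static bool
parse_range_sketch (const char *hdr, long long *first, long long *last,
                    long long *length)
{
  if (strncmp (hdr, "bytes ", 6) != 0)
    return false;
  hdr += 6;

  if (!parse_number (&hdr, first) || *hdr++ != '-')
    return false;
  if (!parse_number (&hdr, last) || *hdr++ != '/')
    return false;

  if (*hdr == '*')
    {
      *length = -1;
      hdr++;
    }
  else if (!parse_number (&hdr, length))
    return false;

  /* Reject trailing garbage and inconsistent ranges.  */
  if (*hdr != '\0' || *last < *first)
    return false;
  if (*length != -1 && *length <= *last)
    return false;
  return true;
}
```

With this sketch, "bytes 0-499/1234" is accepted, while "bytes 500-400/1234" and "bytes 0-499/400" are rejected.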
* hsts.c (get_hsts_store_filename): Free the homedir value
(close_hsts_test_store): Actually free the store struct too
(test_hsts_new_entry): Pass store to close_hsts_test_store()
(test_hsts_url_rewrite_superdomain): Same
(test_hsts_url_rewrite_congruent): Same
(test_hsts_read_database): Same, and free homedir and store filename
* http.c (test_parse_content_disposition): Free the returned
filename
* url.c (test_append_uri_pathel): Free allocated string
* src/ftp.c (getftp): Do not use PORT when PASV fails.
* tests/FTPServer.px: Add pasv_not_supported server flag.
* tests/Makefile.am: Add Test-ftp-pasv-not-supported.px
* tests/Test-ftp-pasv-not-supported.px: New test
Fix IP address exposure when automatically falling back from
passive mode to active mode (using the PORT command). This behavior
may be used to compromise a client's privacy even when using a proxy.
* src/recur.c: Declare variables before code
(write_reject_log_url):
Use const keyword where appropriate
Use the 'default' switch statement
Use xfree() instead of free()
Renamed variable f -> fp
(write_reject_log_reason):
Use const keyword where appropriate
Use the 'default' switch statement
Renamed variable f -> fp
Renamed variable r -> reason
* main.c: Add "--rejected-log" option.
* init.c: Add "rejectedlog" command.
* options.h: Add "rejected_log" parameter string.
* wget.texi: Add brief documentation on new --rejected-log option.
* recur.c: Optionally log details of URLs not traversed.
Add reject_reason enum.
(download_child_p -> download_child): Return a reject_reason.
(descend_redirect_p -> descend_redirect): Return a reject_reason.
(retrieve_tree): Support logging reasons for rejection.
Add write_reject_log_header that writes a CSV format header to a file.
Add write_reject_log_url that writes a url struct to a file in CSV format.
Add write_reject_log_reason that writes the URL and parent URL as well as the
rejection reason to a CSV file.
* Test--rejected-log.px: Add a basic test for the --rejected-log command.
* tests/Makefile.am: Run Test--rejected-log.px.
This allows you to figure out why URLs are being rejected, with some context
around each rejection. CSV is used as the output format since it can be easily
parsed. It is delimited by tabs instead of commas to allow using all (quoted) URL
characters, and it includes column names, which may be used for compatibility.
* src/utils.[ch], src/http.c, src/metalink.c: Rename function
hex_to_string() to wg_hex_to_string() since it collides with a
similarly named function in the OpenSSL library.
* Makefile.am: Added new source files hsts.c and hsts.h.
* http.c (parse_strict_transport_security): new function for STS header
parsing.
(gethttp): update the HSTS store.
* http.h: new include "hsts.h".
* init.c: new options --hsts and --hsts-file.
* main.c (get_hsts_database, load_hsts, save_hsts): new functions.
New options --no-hsts and --hsts-file added to help.
(main): load and save HSTS store.
* options.h: new variables for supporting --hsts and --hsts-file.
* retr.c (retrieve_url): rewrite the URI according to the HSTS policy before
entering http_loop.
* test.c, test.h: new unit tests for HSTS.
* utils.c, utils.h (countchars): new function.
* wget.h: new preprocessor check.
* hsts.c, hsts.h: new files with the HSTS engine implementation.
Added support for HTTP Strict Transport Security (HSTS), as defined by RFC
6797.
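As a rough illustration of Strict-Transport-Security header parsing, the sketch below extracts the max-age and includeSubDomains directives. It is not Wget's parse_strict_transport_security(); the names name_is and parse_sts_sketch are hypothetical, and error handling is reduced to a boolean result.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Case-insensitive comparison of the LEN bytes at S with NAME.  */
static bool
name_is (const char *s, size_t len, const char *name)
{
  size_t i;

  if (strlen (name) != len)
    return false;
  for (i = 0; i < len; i++)
    if (tolower ((unsigned char) s[i]) != tolower ((unsigned char) name[i]))
      return false;
  return true;
}

/* Hypothetical sketch.  Returns true if a max-age directive with a
   numeric value was found.  */
static bool
parse_sts_sketch (const char *hdr, long *max_age, bool *include_subdomains)
{
  bool have_max_age = false;

  *include_subdomains = false;
  while (*hdr)
    {
      const char *name;
      size_t len;

      /* Skip separators and whitespace before the directive name.  */
      while (*hdr == ';' || isspace ((unsigned char) *hdr))
        hdr++;
      name = hdr;
      while (*hdr && *hdr != ';' && *hdr != '='
             && !isspace ((unsigned char) *hdr))
        hdr++;
      len = (size_t) (hdr - name);

      if (name_is (name, len, "includeSubDomains"))
        *include_subdomains = true;
      else if (name_is (name, len, "max-age") && *hdr == '=')
        {
          hdr++;
          if (*hdr == '"')      /* the value may be a quoted string */
            hdr++;
          if (isdigit ((unsigned char) *hdr))
            {
              *max_age = strtol (hdr, NULL, 10);
              have_max_age = true;
            }
        }

      /* Ignore everything else up to the next directive.  */
      while (*hdr && *hdr != ';')
        hdr++;
    }
  return have_max_age;
}
```

Directive names are compared case-insensitively, and unknown directives are skipped rather than treated as errors.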
* doc/wget.texi: Add information about --preferred-location.
* src/init.c: Add --preferred-location option.
* src/main.c (option_data): Handle --preferred-location argument.
(main): Sort resources based on location if requested.
* src/metalink.c (metalink_res_cmp): Compare based on location if
priority and preference are equal.
* src/options.h (options): Add preferred_location option.
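The tie-breaking order can be sketched with a qsort() comparator. Everything here is an assumption made for illustration: the struct resource layout, the preferred_location variable, and the ordering directions for priority and preference are not taken from metalink.c or libmetalink.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical resource record; the real libmetalink type differs.  */
struct resource
{
  const char *url;
  const char *location;   /* country code, may be NULL */
  int priority;           /* assumed: lower value is tried first */
  int preference;         /* assumed: higher value is tried first */
};

/* Assumed stand-in for the value given to --preferred-location.  */
static const char *preferred_location = "de";

static int
matches_location (const struct resource *r)
{
  return r->location && preferred_location
         && strcmp (r->location, preferred_location) == 0;
}

/* Compare by priority, then preference; only when both are equal
   does a resource in the preferred location sort first.  */
static int
resource_cmp (const void *a, const void *b)
{
  const struct resource *r1 = a, *r2 = b;

  if (r1->priority != r2->priority)
    return r1->priority < r2->priority ? -1 : 1;
  if (r1->preference != r2->preference)
    return r1->preference > r2->preference ? -1 : 1;
  return matches_location (r2) - matches_location (r1);
}
```

An array of such records would be sorted with qsort (v, n, sizeof v[0], resource_cmp).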
* src/http.c: Move find_key_value, has_key, find_key_values.
* src/metalink.c: To here.
* src/metalink.h: Make them non-static and add prototypes here.
* src/wget.h: Add IF_MODIFIED_SINCE enum for dt. Add TIMECONV_ERR
enum to uerr_t.
* src/http.c (time_to_rfc1123): Convert time_t to HTTP time.
* src/http.c (initialize_request): Include If-Modified-Since header
if appropriate.
* src/http.c (set_file_timestamp): Separate this code from check_file_output.
* src/http.c (check_file_output): Use set_file_timestamp.
* src/http.c (gethttp): Handle properly 304 return code and 200 if server
ignores If-Modified-Since headers.
* src/http.c (http_loop): Load filename to hstat if condget was requested,
use IF_MODIFIED_SINCE if requested and current timestamp can be obtained.
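For illustration, converting a time_t into the date format used by If-Modified-Since might look like the sketch below. It is not Wget's time_to_rfc1123(); the name http_date_sketch and the 0/-1 return convention are assumptions, whereas the entry above reports failures through TIMECONV_ERR.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical sketch.  Formats T as an HTTP date such as
   "Sun, 06 Nov 1994 08:49:37 GMT" into BUF.  Returns 0 on success
   and -1 if the conversion fails or BUF is too small.  */
static int
http_date_sketch (time_t t, char *buf, size_t bufsize)
{
  /* Fixed English names: strftime's %a and %b depend on the locale.  */
  static const char *wkday[] =
    { "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat" };
  static const char *month[] =
    { "Jan", "Feb", "Mar", "Apr", "May", "Jun",
      "Jul", "Aug", "Sep", "Oct", "Nov", "Dec" };
  struct tm *gmt = gmtime (&t);
  int n;

  if (!gmt)
    return -1;
  n = snprintf (buf, bufsize, "%s, %02d %s %04d %02d:%02d:%02d GMT",
                wkday[gmt->tm_wday], gmt->tm_mday, month[gmt->tm_mon],
                gmt->tm_year + 1900, gmt->tm_hour, gmt->tm_min,
                gmt->tm_sec);
  return (n < 0 || (size_t) n >= bufsize) ? -1 : 0;
}
```

The resulting string can be sent as the value of an If-Modified-Since request header.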
* src/iri.c (do_conversion): Call url_unescape_except_reserved,
instead of url_unescape.
* src/url.c (url_unescape_1): New static function.
(url_unescape): Calls url_unescape_1 with mask zero. Preserves
same behavior as before. Only code changes.
(url_unescape_except_reserved): New function.
* src/url.h: Added prototype for url_unescape_except_reserved().
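The idea behind url_unescape_except_reserved() can be sketched as follows. This is not the Wget implementation, which the entry above describes as a mask passed to url_unescape_1; the reserved set is taken from RFC 3986, and keeping '%' escaped is an additional assumption of this sketch.

```c
#include <ctype.h>
#include <string.h>

/* Value of a hexadecimal digit character.  */
static int
hex_value (char c)
{
  if (c >= '0' && c <= '9')
    return c - '0';
  return tolower ((unsigned char) c) - 'a' + 10;
}

/* Hypothetical sketch.  Decodes %XX sequences in S in place, but
   leaves a sequence untouched when it encodes an RFC 3986 reserved
   character, '%' itself, or a NUL byte.  */
static void
unescape_except_reserved_sketch (char *s)
{
  static const char reserved[] = ":/?#[]@!$&'()*+,;=%";
  char *out = s;

  while (*s)
    {
      if (s[0] == '%' && isxdigit ((unsigned char) s[1])
          && isxdigit ((unsigned char) s[2]))
        {
          int c = hex_value (s[1]) * 16 + hex_value (s[2]);

          if (c != 0 && !strchr (reserved, c))
            {
              *out++ = (char) c;
              s += 3;
              continue;
            }
        }
      /* Copy literally: an ordinary byte or a kept escape.  */
      *out++ = *s++;
    }
  *out = '\0';
}
```

For example, the string "a%20b%2Fc%41" becomes "a b%2FcA": the space and 'A' are decoded, while the escaped '/' is preserved.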
When the locale is US-ASCII, URIs that contain special characters
in them are converted to IRIs according to RFC 3987, section 3.2
"Converting URIs to IRIs".
* progress.c (update_speed_ring): The comment for the function
incorrectly stated that the function uses thirty samples from the
past instead of twenty.
Reported-By: Yi Li <lovelylich@gmail.com>
* src/ftp.c (ftp_loop_internal): Add option `force_full_retrieve' that forces
retrieval of the full file.
(ftp_retrieve_list): Pass `true' as the `force_full_retrieve' option to
`ftp_loop_internal' if we want to download a file with a newer timestamp than
the local copy.
* src/iri.c (remote_to_utf8): Do not qualify with const the output pointer.
(do_conversion): Use the provided input parameter as const.
(idn_encode): casts to remote_to_utf8 parameters are no longer needed.
* src/iri.h: Adjusted remote_to_utf8 prototype.
* src/url.c: It is no longer necessary to cast new_url to const char.
* src/http.c (parse_content_disposition): stores filename* and filename
separately and chooses filename* if available.
(test_parse_content_disposition): added new tests.
* src/http.c (resp_free): Change the semantics of this function.
(request_free): Change the semantics of this function.
(initialize_request): Adjust request_free call.
(establish_connection): Adjust request_free, resp_free calls.
(gethttp): Adjust request_free, resp_free calls.
* src/http.c (establish_connection): Do not free request here (it is
never allocated here).
* src/http.c (gethttp): Free request before returning if error in
establish_connection encountered.
* src/http.c: Log --content-on-error downloads.
* src/retr.c (retrieve_url): Register the download of an error page
when --content-on-error is specified.
* warc.c (windows_uuid_str) [WINDOWS]: New function specific to
MS-Windows.
(warc_uuid_str) [WINDOWS]: If windows_uuid_str succeeds, use its
result; otherwise use the fallback method.
xfree() might crash on libidn memory on Windows.
From 'man idn_free':
"Under Windows, different parts of the same application may use different
heap memory, and then it is important to deallocate memory allocated within
the same module that allocated it. This function makes that possible."
This commit causes the --show-progress option to print the progress bar
to stderr even when a logfile was explicitly provided on the command
line. Such a combination allows a user to log the output of Wget while
simultaneously keeping track of the download status.
When errno was set to EPIPE before a call to logprintf (e.g. during close of an
SSL connection that was reset by peer), Wget would terminate unexpectedly.
It should exit only when EPIPE was triggered by the logging code.
Regression by 0b5b100fc9
* Always attempt to detect uuid.h and uuid_create().
* Split libuuid and uuid.h implementations of warc_uuid_str(), since
those APIs vary significantly.
* Correctly use the uuid.h functions
This reverts commit fcd3b3c473.
Turns out that removing the ChangeLog files causes the Wget build to
fail. While this issue is investigated and sorted out, the commit is
reverted so that people can build Wget from master.
The pointer respline in use after being passed to ftp_response() may be
uninitialized if ftp_response() fails. Ensure that respline is used
only after checking the return value of ftp_response().
From v1.16.1 onwards, Wget no longer maintains an active ChangeLog file.
Instead the ChangeLog will be automatically generated on each release
through gnulib's gitlog-to-changelog script. However, the old versions
of the ChangeLog files are retained for reference. These files are
renamed with a .pre-gitlog appended to their filenames.
Also removed the ChangeLog.README file, which is no longer required.
A call to assert(1) will always fail and cause Wget to crash. If such a
situation does arise, Wget should invoke abort() and provide a useful
error message to the user prior to exiting.
MIN and MAX are macros that a developer will universally expect
throughout the source. Yet, they were being defined in multiple places
across the source. Instead, define them in a single location in the
common wget.h header file and use them consistently everywhere.
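A single shared definition could look like the sketch below. The exact form used in wget.h may differ; the #ifndef guards are an assumption, added so that a system header which already defines the macros takes precedence.

```c
/* Each argument may be evaluated twice, so avoid passing expressions
   with side effects such as i++.  */
#ifndef MIN
# define MIN(i, j) ((i) <= (j) ? (i) : (j))
#endif
#ifndef MAX
# define MAX(i, j) ((i) >= (j) ? (i) : (j))
#endif
```

Defining the macros once in a common header avoids redefinition warnings and keeps their behavior identical across source files.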