Commit Graph

256 Commits

Author SHA1 Message Date
Jookia
e4db00d74d Add option to write URL rejections to a tab-delimited CSV log.
* main.c: Add "--rejected-log" option.
 * init.c: Add "rejectedlog" command.
 * options.h: Add "rejected_log" parameter string.
 * wget.texi: Add brief documentation on new --rejected-log option.
 * recur.c: Optionally log details of URLs not traversed.
   Add reject_reason enum.
   (download_child_p -> download_child): Return a reject_reason.
   (descend_redirect_p -> descend_redirect): Return a reject_reason.
   (retrieve_tree): Support logging reasons for rejection.
   Add write_reject_log_header that writes a CSV format header to a file.
   Add write_reject_log_url that writes a url struct to a file in CSV format.
   Add write_reject_log_reason that writes the URL and parent URL as well as the
   rejection reason to a CSV file.
 * Test--rejected-log.px: Add a basic test for the --rejected-log command.
 * tests/Makefile.am: Run Test--rejected-log.px.

This lets you see why URLs are being rejected, along with some context for each
rejection. CSV is used as the output format because it is easy to parse. It is
delimited by tabs instead of commas to allow all (quoted) URL characters, and it
includes column names, which may be used for compatibility.
2015-08-06 08:10:55 +02:00
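
The tab-delimited log with a header row described in the entry above could be read back roughly as follows. This is a minimal sketch, not Wget code: the function name `read_rejected_log` and the column names in the sample are hypothetical, and a real log's column names come from its own header line.

```python
import csv
import io


def read_rejected_log(stream):
    """Yield one dict per logged URL, keyed by the header row's column names."""
    # The commit message says the file is tab-delimited and starts with column names.
    reader = csv.DictReader(stream, delimiter="\t")
    for row in reader:
        yield row


# Hypothetical sample data; the column names here are placeholders.
sample = io.StringIO(
    "REASON\tURL\tPARENT_URL\n"
    "EXAMPLE_REASON\thttp://example.com/a\thttp://example.com/\n"
)
rows = list(read_rejected_log(sample))
```

Keying each row by the header names rather than by position keeps a reader working if columns are added or reordered later, which is presumably what the "compatibility" remark refers to.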
Ander Juaristi
b60131a399 Added support for HSTS.
* Makefile.am: Added new source files hsts.c and hsts.h.
 * http.c (parse_strict_transport_security): new function for STS header
   parsing.
   (gethttp): update the HSTS store.
 * http.h: new include "hsts.h".
 * init.c: new options --hsts and --hsts-file.
 * main.c (get_hsts_database, load_hsts, save_hsts): new functions.
   New options --no-hsts and --hsts-file added to help.
   (main): load and save HSTS store.
 * options.h: new variables for supporting --hsts and --hsts-file.
 * retr.c (retrieve_url): rewrite the URI according to the HSTS policy before
   entering http_loop.
 * test.c, test.h: new unit tests for HSTS.
 * utils.c, utils.h (countchars): new function.
 * wget.h: new preprocessor check.
 * hsts.c, hsts.h: new files with the HSTS engine implementation.

Added support for HTTP Strict Transport Security (HSTS), as defined by RFC
6797.
2015-07-20 15:55:57 +02:00
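
For context on the header parsing mentioned in the entry above, here is a minimal sketch of reading a Strict-Transport-Security header value along the lines of RFC 6797. It is not the code from http.c: the function name `parse_sts_header` is hypothetical, and splitting on ";" is a simplification that assumes no quoted value contains a semicolon.

```python
def parse_sts_header(header_value):
    """Return (max_age, include_subdomains), or None if max-age is missing or invalid."""
    max_age = None
    include_subdomains = False
    for directive in header_value.split(";"):
        name, _, value = directive.strip().partition("=")
        name = name.strip().lower()        # directive names are case-insensitive
        value = value.strip().strip('"')   # the value may be given as a quoted-string
        if name == "max-age":
            # max-age must be a non-negative integer number of seconds
            if not (value.isascii() and value.isdigit()):
                return None
            max_age = int(value)
        elif name == "includesubdomains":
            include_subdomains = True
        # Unknown directives are ignored.
    if max_age is None:
        return None                        # max-age is a required directive
    return max_age, include_subdomains


policy = parse_sts_header("max-age=31536000; includeSubDomains")
```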
Hubert Tarasiuk
6064f21c66 Geolocation support for Metalink resources.
* doc/wget.texi: Add information about --preferred-location.
* src/init.c: Add --preferred-location option.
* src/main.c (option_data): Handle --preferred-location argument.
(main): Sort resources based on location if requested.
* src/metalink.c (metalink_res_cmp): Compare based on location if
priority and preference are equal.
* src/options.h (options): Add preferred_location option.
2015-07-20 15:31:06 +02:00
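
The ordering described in the entry above (location is consulted only when priority and preference are equal) can be sketched as a sort key. This is an illustration under stated assumptions, not the metalink_res_cmp implementation: the `Resource` type and `sort_resources` are hypothetical, and the directions (lower priority number first, higher preference first) are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Hypothetical stand-in for one Metalink download resource."""
    url: str
    priority: int     # assumed: lower number is tried first
    preference: int   # assumed: higher number is tried first
    location: str     # country code such as "de"


def sort_resources(resources, preferred_location):
    """Order resources by priority, then preference, then location match."""
    def key(res):
        matches = res.location.lower() == preferred_location.lower()
        # Location is the last tuple element, so it only breaks ties.
        return (res.priority, -res.preference, 0 if matches else 1)
    return sorted(resources, key=key)


mirrors = [
    Resource("http://a.example/file", 1, 0, "us"),
    Resource("http://b.example/file", 1, 0, "de"),
    Resource("http://c.example/file", 2, 0, "de"),
]
ordered = sort_resources(mirrors, "de")
```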
Hubert Tarasiuk
37b58e3976 Metalink support.
* bootstrap.conf: Add crypto/sha256
* configure.ac: Look for libmetalink and GPGME
* doc/wget.texi: Add --input-metalink and --metalink-over-http
options description.
* po/POTFILES.in: Add metalink.c
* src/Makefile.am: Add new translation unit (metalink.c)
* src/http.c (http_stat): Add metalink field.
(free_stat): Free metalink field.
(find_key_value): Find value of given key in header string.
(has_key): Check if token exists in header string.
(find_key_values): Find all key=value pairs in header string.
(metalink_from_http): Obtain Metalink metadata from HTTP response.
(gethttp): Call metalink_from_http if requested.
(http_loop): Request Metalink metadata from HTTP response if requested.
Fall back to regular download if no Metalink metadata is found.
* src/init.c: Add --input-metalink and --metalink-over-http options
* src/main.c (option_data): Handle --input-metalink and
--metalink-over-http cmd arguments.
(print_help): Print --input-metalink option description.
(main): Retrieve files from Metalink file
* src/metalink.c (retrieve_from_metalink): Download files described by
metalink.
(metalink_res_cmp): Comparator for resources priority-sorting.
* src/metalink.h: Create header for metalink.c
(RES_TYPE_SUPPORTED): Define supported resources media.
(DEFAULT_PRI): Default mirror priority for Metalink over HTTP.
(VALID_PRI_RANGE): Valid priority range.
* src/options.h (options): Add input_metalink and metalink_over_http
options.
* src/utils.c (hex_to_string): Convert binary data to ASCII-hex.
* src/utils.h (hex_to_string): Add prototype.
* src/wget.h: Add metalink-related error enums
Add METALINK_METADATA flag for document type.
2015-07-20 15:30:39 +02:00
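
The binary-to-ASCII-hex conversion mentioned for src/utils.c in the entry above is the usual step before comparing a computed digest with a checksum published as text. A minimal sketch follows; the function name `to_ascii_hex` is hypothetical and this is not the C implementation.

```python
import hashlib


def to_ascii_hex(data):
    """Convert raw bytes to a lowercase ASCII-hex string, two characters per byte."""
    return "".join(f"{byte:02x}" for byte in data)


# Hex-encode a SHA-256 digest so it can be compared with a textual checksum.
digest = hashlib.sha256(b"abc").digest()
digest_hex = to_ascii_hex(digest)
```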
Ángel González
ae58d8a78b Fix wgetrc filename creation for Windows
* init.c/wgetrc_file_name: Remove obsolete code in WINDOWS code path

Reported-by: Gisle Vanem <gvanem@yahoo.no>
2015-06-27 21:32:48 +02:00
Hubert Tarasiuk
0e8d2d4251 Add --if-modified-since option
* src/init.c: Add to commands array.
* src/main.c: Add to cmdline_option. Add to help message.
* src/options.h: Add to options struct.
2015-05-22 11:08:30 +02:00
Steven M. Schweda
5efb24e4a2 Add option to restrict filenames for VMS.
* src/options.h (enum restrict_files_os): Define "restrict_vms".
* src/init.c (defaults) [__VMS]: Set "opt.restrict_files_os" to
"restrict_vms".
(cmd_spec_restrict_file_names): honor "vms".
* src/url.c (filechr_not_unix): Define "filechr_not_vms".
(filechr_table): Update for VMS.
(append_uri_pathel): Honor opt.restrict_files_os.
(FN_QUERY_SEP): Update for VMS.
(FN_QUERY_SEP_STR): Update for VMS.
2015-04-02 15:36:42 +02:00
Giuseppe Scrivano
16f1fb1d1f maint: update copyright year ranges to include 2015 2015-03-09 16:32:01 +01:00
Darshit Shah
8705e27e20 progress bar: Allow display on stderr along with -o
This commit causes the --show-progress option to print the progress bar
to stderr even when a logfile was explicitly provided on the command
line. Such a combination allows a user to log the output of Wget while
simultaneously keeping track of the download status.
2015-01-20 20:16:20 +01:00
Tim Rühsen
b0b1cde6e2 src/init.c: Fix indentation for crlfile option 2014-12-17 12:39:00 +01:00
Tim Ruehsen
4850e9c873 Replaced xfree_null() by xfree() and nullify argument after freeing. 2014-12-01 16:15:37 +01:00
Darshit Shah
3e609a1192 Replace all occurrences of free() with xfree() 2014-11-27 11:11:34 +05:30
Tim Rühsen
3c51ad7f02 Removed form feeds from sources and NEWS 2014-11-20 16:35:34 +01:00
Tim Rühsen
7b43510fe3 Fixes possible issues with Wget running in a Turkish locale 2014-11-20 10:56:21 +01:00
Tim Rühsen
2ece0cc425 Remove 'make check' compiler warnings 2014-11-17 11:28:20 +01:00
Tim Rühsen
e4a8fe84e2 Added --crl-file to load a Certificate Revocation List (CRL) file
Reported-by: Noël Köthe <noel@debian.org>
2014-11-11 15:06:51 +01:00
Tim Rühsen
148065bc00 content for commit 6092205538 2014-10-29 16:18:01 +01:00
Darshit Shah
18b0979357 CVE-2014-4877: Arbitrary Symlink Access
Wget was susceptible to a symlink attack which could create arbitrary
files, directories or symbolic links and set their permissions when
retrieving a directory recursively through FTP. This commit changes the
default settings in Wget such that Wget no longer creates local symbolic
links, but rather traverses them and retrieves the pointed-to file in
such a retrieval.

The old behaviour can be attained by passing the --retr-symlinks=no
option to the Wget invocation command.
2014-10-27 09:18:13 +01:00
Tim Ruehsen
3e3073ca7b add TLSv1_1 and TLSv1_2 to --secure-protocol 2014-10-23 21:16:37 +02:00
Giuseppe Scrivano
c03855be40 ftp: Replace main() with main in comments. 2014-06-12 18:49:16 +02:00
Giuseppe Scrivano
dd1b69c600 Remove trailing empty lines 2014-06-12 18:49:15 +02:00
Giuseppe Scrivano
8e6de1fb5f Drop usage of strncpy 2014-06-12 18:49:13 +02:00
Giuseppe Scrivano
087e17be1c Do not use exit() with a magic number 2014-06-12 18:48:48 +02:00
Darshit Shah
8624553a31 Whitespace and formatting changes. (Aesthetic only)
This commit makes lots of whitespace only changes. It has been ensured that this
commit does not make any changes to the functioning of the program. The only
changes that have been made are:
    * Remove trailing whitespaces
    * Convert tabs to spaces
    * Fix indentation issues in the code
    * Other aesthetic changes to the formatting of comments
2014-05-30 21:12:57 +05:30
Darshit Shah
4eeabffee6 More progress bar aesthetic changes
This commit introduces two new changes to how the progress bar looks:
1. Support the --progress=bar:noscroll option which will prevent the filename
   from scrolling in the progress bar
2. Print human readable value for the amount already downloaded for any file
2014-05-30 13:28:02 +05:30
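
As an illustration of the second change in the entry above, a human-readable size can be produced by repeatedly dividing by 1024. This is a sketch only: the function name `human_readable`, the unit letters, and the one-decimal rounding are assumptions, and Wget's actual formatting may differ.

```python
def human_readable(num_bytes):
    """Format a byte count with a binary-unit suffix, e.g. 1536 -> '1.5K'."""
    units = ["B", "K", "M", "G", "T"]
    value = float(num_bytes)
    for unit in units:
        # Stop once the value fits this unit, or when no larger unit remains.
        if value < 1024 or unit == units[-1]:
            break
        value /= 1024
    if unit == "B":
        return f"{int(value)}B"
    return f"{value:.1f}{unit}"


downloaded = human_readable(1536)
```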
Tim Ruehsen
38a7829dcb Fix compiler warnings 2014-05-12 12:18:50 +02:00
Darshit Shah
8c2fd06ba8 Add --show-progress to force display of the progress bar
This is a relatively large commit that implements two major features:

1. Implement --show-progress switch to force the display of the progress bar in
   any verbosity level
2. Edit the implementation of the progress bar so that the filename is displayed
   in the same line.
2014-05-01 01:07:43 +02:00
Yousong Zhou
dfa1f4e064 Make wget capable of starting downloads from a specified position.
This patch adds an option `--start-pos' for specifying the starting position
of an HTTP or FTP download.

Signed-off-by: Yousong Zhou <yszhou4tech@gmail.com>
2014-03-21 11:21:00 +01:00
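
For readers unfamiliar with how a client asks a server to begin at an offset: over HTTP this is normally done with a Range request header, and over FTP with a REST command. The sketch below only builds those two request strings. The function names are hypothetical, and it does not show how Wget itself implements the option.

```python
def http_range_header(start_pos):
    """Build the HTTP header asking for everything from byte start_pos onward."""
    if start_pos < 0:
        raise ValueError("start position must be non-negative")
    return {"Range": f"bytes={start_pos}-"}


def ftp_restart_command(start_pos):
    """Build the FTP command that sets the restart offset for the next transfer."""
    if start_pos < 0:
        raise ValueError("start position must be non-negative")
    return f"REST {start_pos}"


headers = http_range_header(1024)
```

A server that honors the Range header answers with status 206 (Partial Content); a server that ignores it sends the whole file with status 200, so a client needs to check which one it received.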
Darshit Shah
b65b9cb8c5 Turn --debug into no-op if compiled without debugging support 2014-02-01 11:49:49 +01:00
Darshit Shah
b9e5c3e8b3 Introduce --no-config. The wgetrc files will not be read
In case of a conflict between --config and --no-config, the one
that appears first will be considered and the other ignored.
2014-01-22 21:59:06 +01:00
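
The "first one wins" rule in the entry above can be sketched as a single left-to-right scan of the arguments. This is an illustration, not the main.c logic: the function name `resolve_config_option` and its return values are hypothetical, and only the `--config=FILE` spelling is handled.

```python
def resolve_config_option(argv):
    """Return ('no-config', None), ('config', path) or ('default', None).

    Whichever of --config=FILE and --no-config appears first decides the
    result; any later occurrence of the other option is ignored.
    """
    for arg in argv:
        if arg == "--no-config":
            return ("no-config", None)
        if arg.startswith("--config="):
            return ("config", arg.split("=", 1)[1])
    return ("default", None)


choice = resolve_config_option(["--config=my.rc", "--no-config", "http://example.com/"])
```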
Giuseppe Scrivano
70f7cdf1af Remove some useless if statements 2013-12-29 11:46:04 +01:00
Tim Ruehsen
e505664ef3 added PFS to --secure-protocol 2013-09-07 13:22:15 +02:00
Tim Ruehsen
42c78fdd71 added option --https-only 2013-08-22 20:05:41 +02:00
Ángel González
49f6d0ded8 Cleanup cmd_string_uppercase 2013-06-22 14:06:06 +02:00
Tim Ruehsen
099d8ee3da replaced read_whole_file() by getline() 2013-05-17 20:19:02 +02:00
Darshit Shah
277785fa2a Fix issue when converting string to uppercase 2013-05-05 01:34:47 +02:00
Giuseppe Scrivano
550457bcad Fix crash when receiving an HTTP redirect upon a POST request
The crash was introduced by a recent commit.
2013-05-02 21:57:20 +02:00
Darshit Shah
6c30653a1a Add a generic --method command to set a method in HTTP Requests.
Add supplementary --body-data and --body-file commands to send BODY Data.

Signed-off-by: Darshit Shah <darnir@gmail.com>
2013-04-14 12:57:58 +02:00
Giuseppe Scrivano
06fc1edb54 Remove static modifier for functions used in other modules. 2012-08-28 21:38:12 +02:00
Steven Schubiger
31674653eb Include missing header. 2012-07-08 11:36:54 +02:00
Giuseppe Scrivano
ae0598df9b Check for fclose errors. 2012-06-17 22:24:32 +02:00
Giuseppe Scrivano
90e9d9e1bd Move cleanup related code to `cleanup' 2012-06-16 12:20:33 +02:00
Giuseppe Scrivano
6b5c0c742d Rename, again, --reports-bits to --report-speed. 2012-06-06 20:41:25 +02:00
Giuseppe Scrivano
96418c6885 Rename --bits to --report-bps. 2012-06-06 14:10:07 +02:00
Gijs van Tulder
f5a1097871 Add support for --accept-regex and --reject-regex. 2012-05-09 21:18:23 +02:00
Steven Schubiger
0ccaa999a2 Fix typo. 2012-03-08 10:00:51 +01:00
Sasikantha Babu
b9b510ca5f Accept --bit option 2012-03-05 22:23:06 +01:00
Gijs van Tulder
0a8a898fbe Fix a linker error if zlib is not found. 2012-01-11 15:27:06 +01:00
Gijs van Tulder
e3820953b2 Add support for WARC files. 2011-11-04 22:25:00 +01:00
Steven Schweda
8c7bd588fe Fix some problems under VMS. 2011-10-23 13:11:22 +02:00