wget/src/xattr.c

Keep fetched URLs in POSIX extended attributes * configure.ac: Check for xattr availability * src/Makefile.am: Add xattr.c * src/ftp.c: Include xattr.h. (getftp): Set attributes if enabled. * src/http.c: Include xattr.h. (gethttp): Add parameter 'original_url', set attributes if enabled. (http_loop): Add 'original_url' to call of gethttp(). * src/init.c: Add new option --xattr. * src/main.c: Add new option --xattr, add description to help text. * src/options.h: Add new config member 'enable_xattr'. * src/xatrr.c: New file. * src/xattr.h: New file. These attributes provide a lightweight method of later determining where a file was downloaded from. This patch changes: * autoconf detects whether extended attributes are available and enables the code if they are. * The new flags --xattr and --no-xattr control whether xattr is enabled. * The new command "xattr = (on|off)" can be used in ~/.wgetrc or /etc/wgetrc * The original and redirected URLs are recorded as shown below. * This works for both single fetches and recursive mode. The attributes that are set are: user.xdg.origin.url: The URL that the content was fetched from. user.xdg.referrer.url: The URL that was originally requested. Here is an example, where http://archive.org redirects to https://archive.org: $ wget --xattr http://archive.org ... $ getfattr -d index.html user.xdg.origin.url="https://archive.org/" user.xdg.referrer.url="http://archive.org/" These attributes were chosen based on those stored by Google Chrome https://bugs.chromium.org/p/chromium/issues/detail?id=45903 and curl https://github.com/curl/curl/blob/master/src/tool_xattr.c
2016-07-21 12:15:49 +08:00
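The attributes described in the commit message can also be read back from a program rather than with getfattr. The following is only a minimal sketch, assuming Linux and glibc's <sys/xattr.h>; the helper name read_url_xattr is hypothetical and is not part of wget.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/xattr.h>

/* Hypothetical helper, not part of wget: return a malloc'd, NUL-terminated
   copy of the extended attribute NAME on PATH, or NULL on failure
   (missing file, missing attribute, or no xattr support).  */
char *
read_url_xattr (const char *path, const char *name)
{
  /* A zero-sized query asks the kernel for the attribute's length.  */
  ssize_t len = getxattr (path, name, NULL, 0);
  if (len < 0)
    return NULL;

  char *buf = malloc ((size_t) len + 1);
  if (!buf)
    return NULL;

  /* The value is stored without a trailing NUL, so add one here.  */
  len = getxattr (path, name, buf, (size_t) len);
  if (len < 0)
    {
      free (buf);
      return NULL;
    }
  buf[len] = '\0';
  return buf;
}
```

A caller would pass the downloaded file's path and either "user.xdg.origin.url" or "user.xdg.referrer.url", then free the returned string. Whether the attribute is present depends on the filesystem supporting user extended attributes.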
/* xattr.c -- POSIX Extended Attribute support.
Copyright (C) 2016, 2018-2019 Free Software Foundation, Inc.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3, or (at your option)
any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, see <http://www.gnu.org/licenses/>. */
#include "wget.h"
#include <stdio.h>
#include <string.h>
#include "log.h"
#include "utils.h"
#include "xattr.h"
#ifdef USE_XATTR

/* Set extended attribute NAME to VALUE on the file behind FP.
   Return 0 on success, a negative value on failure.  */
static int
write_xattr_metadata (const char *name, const char *value, FILE *fp)
{
  int retval = -1;

  if (name && value && fp)
    {
      retval = fsetxattr (fileno (fp), name, value, strlen (value), 0);
      /* FreeBSD's extattr_set_fd returns the length of the extended attribute. */
      retval = (retval < 0) ? retval : 0;
      if (retval)
        DEBUGP (("Failed to set xattr %s.\n", quote (name)));
    }

  return retval;
}

#else /* USE_XATTR */

/* Without xattr support this is a no-op that reports success.  */
static int
write_xattr_metadata (const char *name, const char *value, FILE *fp)
{
  (void)name;
  (void)value;
  (void)fp;

  return 0;
}

#endif /* USE_XATTR */
int
set_file_metadata (const struct url *origin_url, const struct url *referrer_url, FILE *fp)
{
  /* Save metadata about where the file came from (requested and final URLs)
   * in the user-namespace POSIX Extended Attributes of the retrieved file.
   *
   * For more details about the user namespace see
   * [http://freedesktop.org/wiki/CommonExtendedAttributes] and
   * [http://0pointer.de/lennart/projects/mod_mime_xattr/].
   */
  int retval = -1;
  char *value;

  if (!origin_url || !fp)
    return retval;

  /* Record the final URL with any credentials hidden.  */
  value = url_string (origin_url, URL_AUTH_HIDE);
  retval = write_xattr_metadata ("user.xdg.origin.url", escnonprint_uri (value), fp);
  xfree (value);

  if (!retval && referrer_url)
    {
      /* Record only the scheme, host and port of the referrer; the path,
         query and credentials are left out.  */
      struct url u;

      memset (&u, 0, sizeof (u));
      u.scheme = referrer_url->scheme;
      u.host = referrer_url->host;
      u.port = referrer_url->port;

      value = url_string (&u, 0);
      retval = write_xattr_metadata ("user.xdg.referrer.url", escnonprint_uri (value), fp);
      xfree (value);
    }

  return retval;
}
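
set_file_metadata copies only the scheme, host and port of the referrer into a zeroed struct url before formatting it, so the path, query and credentials never reach the attribute. The same idea can be sketched outside wget on a plain URL string. This is an illustration only: origin_only is a hypothetical helper, it parses the string directly instead of using wget's struct url and url_string, and it does not reproduce url_string's exact output (for example, how default ports are handled).

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of wget: write "scheme://host[:port]"
   for URL into OUT (of size OUTSZ).  Returns 0 on success, -1 if URL has
   no "://" separator, has an empty host, or the result does not fit.  */
int
origin_only (const char *url, char *out, size_t outsz)
{
  const char *sep = strstr (url, "://");
  if (!sep)
    return -1;

  const char *auth = sep + 3;                /* start of the authority */
  size_t auth_len = strcspn (auth, "/?#");   /* authority ends here */

  /* Skip "user:password@" if present; search only inside the authority.  */
  const char *host = auth;
  for (size_t i = 0; i < auth_len; i++)
    if (auth[i] == '@')
      host = auth + i + 1;

  size_t host_len = auth_len - (size_t) (host - auth);
  if (host_len == 0)
    return -1;

  /* Emit the scheme and the host[:port] part, dropping everything else.  */
  int n = snprintf (out, outsz, "%.*s://%.*s",
                    (int) (sep - url), url, (int) host_len, host);
  return (n < 0 || (size_t) n >= outsz) ? -1 : 0;
}
```

Given "http://user:pw@archive.org:8080/a/b?x=1", the helper fills the buffer with "http://archive.org:8080", which is the kind of reduced value that is safe to keep as file metadata.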