Update 20180128 Being open about data privacy.md

FelixYFZ 2018-05-20 11:11:55 +08:00 committed by GitHub
parent 3a0b1ceafc
commit fccdb35320

Being open about data privacy
======
![](https://opensource.com/sites/default/files/styles/image-full-size/public/lead-images/GOV_opendata.png?itok=M8L2HGVx)
Image by: opensource.com
Today is [Data Privacy Day][1] ("Data Protection Day" in Europe), and you might think that those of us in the open source world believe that all data should be free, [as information supposedly wants to be][2]. But life's not that simple, for two main reasons:
1. Most of us (and not just in open source) believe there's at least some data about us that we might not feel happy sharing (I compiled an example list in [a post][3] I published a while ago).
2. Many of us working in open source actually work for commercial companies or other organisations subject to legal requirements around what they can share.
So actually, data privacy is something that's important for pretty much everybody.
It turns out that the starting point for what data people and governments believe should be available for organisations to use is somewhat different between the U.S. and Europe, with the former generally providing more latitude for entities--particularly, the more cynical might suggest, large commercial entities--to use data they've collected about us as they will. Europe, on the other hand, has historically taken a more restrictive view, and on the 25th of May, Europe's view arguably will have triumphed.
### The impact of GDPR
That's a rather sweeping statement, but the fact remains that this is the date on which a piece of legislation called the General Data Protection Regulation (GDPR), enacted by the European Union in 2016, becomes enforceable. The GDPR basically provides a stringent set of rules about how personal data can be stored, what it can be used for, who can see it, and how long it can be kept. It also describes what personal data is--and it's a pretty broad set of items, from your name and home address to your medical records and on through to your computer's IP address.
What is important about the GDPR, though, is that it doesn't apply just to European companies, but to any organisation processing data about EU citizens. If you're an Argentinian, Japanese, U.S., or Russian company and you're collecting data about an EU citizen, you're subject to it.
"Pah!" you may say,<sup>1</sup> "I'm not based in the EU: what can they do to me?" The answer is simple: If you want to continue doing any business in the EU, you'd better comply, because if you breach GDPR rules, you could be liable for up to four percent of your global revenues. Yes, that's global revenues: not just revenues in a particular country in Europe or across the EU, not just profits, but global revenues. Those are the sorts of numbers that should lead you to talk to your legal team, who will direct you to your exec team, who will almost immediately direct you to your IT group to make sure you're compliant in pretty short order.
This may seem like it's not particularly relevant to non-EU citizens, but it is. For most companies, it's going to be simpler and more efficient to implement the same protection measures for data associated with all customers, partners, and employees they deal with, rather than just targeting specific measures at EU citizens. This has got to be a good thing.<sup>2</sup>
However, just because GDPR will soon be applied to organisations across the globe doesn't mean that everything's fine and dandy<sup>3</sup>: it's not. We give away information about ourselves all the time--and permission for companies to use it.
There's a telling (though disputed) saying: "If you're not paying, you're the product." What this suggests is that if you're not paying for a service, then somebody else is paying to use your data. Do you pay to use Facebook? Twitter? Gmail? How do you think they make their money? Well, partly through advertising, and some might argue that's a service they provide to you, but actually that's them using your data to get money from the advertisers. You're not really a customer of advertising--it's only once you buy something from the advertiser that you become their customer, but until you do, the relationship is between the owner of the advertising platform and the advertiser.
Some of these services allow you to pay to reduce or remove advertising (Spotify is a good example), but on the other hand, advertising may be enabled even for services that you think you do pay for (Amazon is apparently working to allow adverts via Alexa, for instance). Unless we want to start paying to use all of these "free" services, we need to be aware of what we're giving up, and making some choices about what we expose and what we don't.
### Who's the customer?
There's another issue around data that should be exercising us, and it's a direct consequence of the amounts of data that are being generated. There are many organisations out there--including "public" ones like universities, hospitals, or government departments<sup>4</sup>--who generate enormous quantities of data all the time, and who just don't have the capacity to store it. It would be a different matter if this data didn't have long-term value, but it does, as the tools for handling Big Data are developing, and organisations are realising they can mine it now and in the future.
The problem they face, though, as the amount of data increases and their capacity to store it fails to keep up, is what to do with it. Luckily--and I use this word with a very heavy dose of irony<sup>5</sup>--big corporations are stepping in to help them. "Give us your data," they say, "and we'll host it for free. We'll even let you use the data you collected when you want to!" Sounds like a great deal, yes? A fantastic example of big corporations<sup>6</sup> taking a philanthropic stance and helping out public organisations that have collected all of that lovely data about us.
Sadly, philanthropy isn't the only reason. These hosting deals come with a price: in exchange for agreeing to host the data, these corporations get to sell access to it to third parties. And do you think the public organisations, or those whose data is collected, will get a say in who these third parties are or how they will use it? I'll leave this as an exercise for the reader.<sup>7</sup>
### Open and positive
It's not all bad news, however. There's a growing "open data" movement among governments to encourage departments to make much of their data available to the public and other bodies for free. In some cases, this is being specifically legislated. Many voluntary organisations--particularly those receiving public funding--are starting to do the same. There are glimmerings of interest even from commercial organisations. What's more, there are techniques becoming available, such as those around differential privacy and multi-party computation, that are beginning to allow us to mine data across data sets without revealing too much about individuals--a computing problem that has historically been much less tractable than you might otherwise expect.
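As a rough illustration of the differential privacy idea mentioned above, the sketch below answers a counting query with random Laplace noise added, so that the published number reveals little about whether any one person is in the data set. The function names, the `epsilon` value, and the sample ages are all assumptions made for this example; a real deployment would want a vetted library and careful tracking of the privacy budget rather than this toy.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential samples with mean
    `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a noisy count of the records that match `predicate`.

    Adding or removing one person changes a count by at most 1, so the
    query's sensitivity is 1 and the noise scale is 1 / epsilon.
    Smaller epsilon means more noise and stronger privacy.
    """
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    # Hypothetical data: ages of people in a small survey.
    ages = [23, 35, 41, 52, 67, 29, 70]
    rng = random.Random()
    noisy = private_count(ages, lambda age: age >= 40, epsilon=0.5, rng=rng)
    print(f"Noisy count of people aged 40 or over: {noisy:.1f}")
```

Any single answer is deliberately imprecise, but averaged over many queries or large data sets the noise washes out, which is what makes aggregate mining possible without exposing individuals.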
What does this all mean to us? Well, I've written before on Opensource.com about the [commonwealth of open source][4], and I'm increasingly convinced that we need to look beyond just software to other areas: hardware, organisations, and, relevant to this discussion, data. Let's imagine that you're a company (A) that provides a service to another company, a customer (B).<sup>8</sup> There are four different types of data in play:
 1. Data that's fully open: visible to A, B, and the rest of the world
 2. Data that's known, shared, and confidential: visible to A and B, but nobody else
 3. Data that's company-confidential: visible to A, but not B
 4. Data that's customer-confidential: visible to B, but not A
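One way to make the four buckets concrete is to treat them as a small visibility table. This is only a sketch: the names `Visibility`, `VIEWERS`, and `can_view`, and the `"world"` label standing for everyone other than A and B, are invented here for illustration.

```python
from enum import Enum


class Visibility(Enum):
    """The four buckets of data in the A/B example."""
    OPEN = 1                    # visible to A, B, and the rest of the world
    SHARED_CONFIDENTIAL = 2     # visible to A and B, but nobody else
    COMPANY_CONFIDENTIAL = 3    # visible to A, but not B
    CUSTOMER_CONFIDENTIAL = 4   # visible to B, but not A


# Which parties may see data in each bucket.
VIEWERS = {
    Visibility.OPEN: {"A", "B", "world"},
    Visibility.SHARED_CONFIDENTIAL: {"A", "B"},
    Visibility.COMPANY_CONFIDENTIAL: {"A"},
    Visibility.CUSTOMER_CONFIDENTIAL: {"B"},
}


def can_view(party: str, bucket: Visibility) -> bool:
    """Return True if `party` is allowed to see data in `bucket`."""
    return party in VIEWERS[bucket]


if __name__ == "__main__":
    for bucket in Visibility:
        allowed = sorted(p for p in ("A", "B", "world") if can_view(p, bucket))
        print(f"{bucket.name}: {', '.join(allowed)}")
```

Writing the table down like this makes the default visible: anything not deliberately tagged `OPEN` stays hidden from the rest of the world.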
First of all, maybe we should be a bit more open about data and default to putting it into bucket 1. That data--on self-driving cars, voice recognition, mineral deposits, demographic statistics--could be enormously useful if it were available to everyone.<sup>9</sup> Also, wouldn't it be great if we could find ways to make the data in buckets 2, 3, and 4--or at least some of it--available in bucket 1, whilst still keeping the details confidential? That's the hope for some of these new techniques being researched. They're a way off, though, so don't get too excited, and in the meantime, start thinking about making more of your data open by default.
### Some concrete steps
So, what can we do around data privacy and being open? Here are a few concrete steps that occurred to me: please use the comments to contribute more.
 * Check to see whether your organisation is taking GDPR seriously. If it isn't, push for it.
 * Default to encrypting sensitive data (or hashing where appropriate), and deleting when it's no longer required--there's really no excuse for data to be in the clear these days except for when it's actually being processed.
 * Consider what information you disclose when you sign up to services, particularly social media.
 * Discuss this with your non-technical friends.
 * Educate your children, your friends' children, and their friends. Better yet, go and talk to their teachers about it and present something in their schools.
 * Encourage the organisations you work for, volunteer for, or interact with to make data open by default. Rather than thinking, "why should I make this public?" start with "why shouldn't I make this public?"
 * Try accessing some of the open data sources out there. Mine it, create apps that use it, perform statistical analyses, draw pretty graphs,<sup>10</sup> make interesting music, but consider doing something with it. Tell the organisations that sourced it, thank them, and encourage them to do more.
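To give a flavour of the "hash where appropriate, delete when no longer required" step, here is a minimal sketch using only Python's standard library. It replaces an identifier such as an email address with a keyed hash (HMAC-SHA256) and drops records that have outlived a retention period. The function names, the record layout with a `collected_at` field, and the one-year retention period are assumptions for illustration. Hashing is not encryption: data you need to read back later has to be encrypted instead, which usually means a dedicated library or a platform feature and is not shown here.

```python
import hashlib
import hmac
import secrets
from datetime import datetime, timedelta, timezone


def pseudonymise(value: str, key: bytes) -> str:
    """Return a hex-encoded keyed SHA-256 digest (HMAC) of `value`.

    The same value and key always give the same digest, so records can
    still be joined, but the original value is not stored. The key must
    be kept secret and separate from the data.
    """
    normalised = value.strip().lower().encode("utf-8")
    return hmac.new(key, normalised, hashlib.sha256).hexdigest()


def purge_expired(records, now: datetime, retention: timedelta):
    """Keep only the records collected within the retention period."""
    cutoff = now - retention
    return [record for record in records if record["collected_at"] >= cutoff]


if __name__ == "__main__":
    key = secrets.token_bytes(32)  # hypothetical key; store it securely
    now = datetime.now(timezone.utc)
    records = [
        {"user": pseudonymise("alice@example.com", key),
         "collected_at": now - timedelta(days=10)},
        {"user": pseudonymise("bob@example.com", key),
         "collected_at": now - timedelta(days=400)},
    ]
    kept = purge_expired(records, now, retention=timedelta(days=365))
    print(f"Kept {len(kept)} of {len(records)} records")
```

A keyed hash is used rather than a bare one because email addresses are easy to guess: without the key, anyone could hash a list of candidate addresses and match them against the stored digests.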
1. Though you probably won't, I admit.
2. Assuming that you believe that your personal data should be protected.
3. If you're wondering what "dandy" means, you're not alone at this point.
4. Exactly how public these institutions seem to you will probably depend on where you live: [YMMV][5].
5. And given that I'm British, that's a really very, very heavy dose.
6. And they're likely to be big corporations: nobody else can afford all of that storage and the infrastructure to keep it available.
7. No. The answer's "no."
8. Although the example works for people, too. Oh, look: A could be Alice, B could be Bob…
9. Not that we should be exposing personal data or data that actually needs to be confidential, of course--not that type of data.
10. A friend of mine decided that it always seemed to rain when she picked her children up from school, so to avoid confirmation bias, she accessed rainfall information across the school year and created graphs that she shared on social media.
--------------------------------------------------------------------------------