Merge remote-tracking branch 'refs/remotes/LCTT/master'

This commit is contained in:
joVoV 2016-08-03 17:05:42 +08:00
commit d35eef5c26
96 changed files with 5990 additions and 1928 deletions


@ -0,0 +1,111 @@
JStockLinux 上不错的股票投资组合管理软件
================================================================================
如果你在股票市场做投资,那么你可能非常清楚投资组合管理计划有多重要。管理投资组合的目标是:依据你能承受的风险、投资时间的长短和资金盈利的目标,为你量身打造一种投资计划。鉴于这类软件的重要性,市面上从来不缺乏商业性的 app 和股票行情检测软件,每一个都在兜售其复杂的投资组合管理以及跟踪报告功能。
对于我们这些 Linux 爱好者们,我也找到了一些**好用的开源投资组合管理工具**,用来在 Linux 上管理和跟踪股票的投资组合,这里高度推荐一个基于 Java 编写的管理软件 [JStock][1]。如果你不是一个 Java 粉,也许你会因为 JStock 需要运行在沉重的 JVM 环境上而放弃它。但同时,在每一个安装了 JRE 的环境中它都可以马上运行起来,在你的 Linux 环境中它会运行得很顺畅。
“开源”就意味着免费或质量低下的时代已经过去了。鉴于 JStock 只是一个个人完成的作品,作为一个投资组合管理软件,它最令人印象深刻的是包含了非常多实用的功能,以上所有的荣誉属于它的作者 Yan Cheng Cheok例如JStock 支持通过监视列表去监控价格、支持多种投资组合、支持自选/内置的股票指标与相关监测、支持 27 个不同的股票市场和跨平台的云端备份/还原。JStock 支持多平台部署Linux、OS X、Android 和 Windows你可以通过云端保存你的 JStock 投资组合,并通过云平台无缝地备份/还原到其他的不同平台上面。
现在我将向你展示如何安装以及使用过程的一些具体细节。
### 在 Linux 上安装 JStock ###
因为 JStock 使用 Java 编写,所以必须[安装 JRE][2]才能让它运行起来。小提示JStock 需要 JRE 1.7 或更高版本。如果你的 JRE 版本不能满足这个需求JStock 将会运行失败并出现下面的报错。
Exception in thread "main" java.lang.UnsupportedClassVersionError: org/yccheok/jstock/gui/JStock : Unsupported major.minor version 51.0
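在安装或升级之前,可以先用下面的命令快速确认当前的 JRE 版本(仅为一个快速检查的示例,输出格式因 JRE 发行版而异):
```
$ java -version   # 版本号应不低于 1.7
```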
在你的 Linux 上安装好了 JRE 之后,从其官网下载最新发布的 JStock然后解压并启动它。
$ wget https://github.com/yccheok/jstock/releases/download/release_1-0-7-13/jstock-1.0.7.13-bin.zip
$ unzip jstock-1.0.7.13-bin.zip
$ cd jstock
$ chmod +x jstock.sh
$ ./jstock.sh
在本教程的其余部分,让我来给大家展示一些 JStock 的实用功能。
### 监视监控列表中股票价格的波动 ###
使用 JStock你可以创建一个或多个监视列表它可以自动地监视股票价格的波动并给你提供相应的通知。在每一个监视列表里面你可以添加多个感兴趣的股票。之后在“Fall Below”和“Rise Above”字段里添加你的警戒值分别设定该股票的最低价格和最高价格。
![](https://c2.staticflickr.com/2/1588/23795349969_37f4b0f23c_c.jpg)
例如你设置了 AAPL 股票的最低/最高价格分别是 $102 和 $115.50,只要价格低于 $102 或高于 $115.50,你就会得到桌面通知。
你也可以设置邮件通知这样你将通过邮件收到价格提醒。要设置邮件通知在“Options”菜单的“Alert”标签中打开“Send message to email(s)”,填入你的 Gmail 账户。一旦完成 Gmail 认证步骤JStock 就会开始发送邮件通知到你的 Gmail 账户(也可以设置其他的第三方邮件地址)。
![](https://c2.staticflickr.com/2/1644/24080560491_3aef056e8d_b.jpg)
### 管理多个投资组合 ###
JStock 允许你管理多个投资组合。这个功能在你使用多个股票经纪人时是非常实用的。你可以为每个经纪人创建一个投资组合去管理你的“买入/卖出/红利”记录以便了解每一个经纪人的业务情况。你也可以在“Portfolio”菜单里面选择特定的投资组合来切换不同的组合项目。下面是一张截图用来展示一个假设的投资组合。
![](https://c2.staticflickr.com/2/1646/23536385433_df6c036c9a_c.jpg)
你也可以设置付给中介的费用你可以为每笔买卖交易设置中介费、印花税以及结算费。如果你比较懒你也可以在选项菜单里面启用自动费用计算并提前为每一家经纪事务所设置费用方案。当你为你的投资组合增加交易之后JStock 将自动计算并计入费用。
![](https://c2.staticflickr.com/2/1653/24055085262_0e315c3691_b.jpg)
### 使用内置/自选股票指标来监控 ###
如果你要做一些股票的技术分析你可能需要基于各种不同的标准来监控股票这里叫做“股票指标”。对于股票的跟踪JStock 提供多个[预设的技术指示器][3],以判断股票上涨、下跌或逆转的趋势。下面的列表里面是一些可用的指标。
- 平滑异同移动平均线MACD
- 相对强弱指标 (RSI)
- 资金流向指标 (MFI)
- 顺势指标 (CCI)
- 十字线
- 黄金交叉线,死亡交叉线
- 涨幅/跌幅
开启预设指示器,只需要在 JStock 中点击“Stock Indicator Editor”标签之后点击右侧面板中的安装按钮选择“Install from JStock server”选项然后安装你想要的指示器。
![](https://c2.staticflickr.com/2/1476/23867534660_b6a9c95a06_c.jpg)
一旦安装了一个或多个指示器你就可以用它们来扫描股票。选择“Stock Indicator Scanner”标签点击底部的“Scan”按钮选择需要的指示器。
![](https://c2.staticflickr.com/2/1653/24137054996_e8fcd10393_c.jpg)
当你选择完需要扫描的股票(例如 NYSE、NASDAQ以后JStock 将执行该扫描,并将该指示器捕获的结果通过列表展现。
![](https://c2.staticflickr.com/2/1446/23795349889_0f1aeef608_c.jpg)
除了预设指示器以外,你也可以使用一个图形化的工具来定义自己的指示器。下面这个图例用于监控当前价格小于或等于 60 天平均价格的股票。
![](https://c2.staticflickr.com/2/1605/24080560431_3d26eac6b5_c.jpg)
### 通过云在 Linux 和 Android 的 JStock 之间备份/恢复 ###
另一个非常棒的功能是 JStock 支持云备份恢复。JStock 可以通过 Google Drive 把你的投资组合/监视列表备份到云上并从云上恢复,这个功能可以实现在不同平台上无缝穿梭。如果你在两个不同的平台之间来回切换使用 JStock这种跨平台备份和还原非常有用。我在 Linux 桌面和 Android 手机上测试过我的 JStock 投资组合,工作得非常漂亮。我在 Android 上将 JStock 投资组合信息保存到 Google Drive 上,然后我可以在我的 Linux 版的 JStock 上恢复它。如果能够自动同步到云上,而不用我手动地触发云备份/恢复就更好了,十分期望这个功能出现。
![](https://c2.staticflickr.com/2/1537/24163165565_bb47e04d6c_c.jpg)
![](https://c2.staticflickr.com/2/1556/23536385333_9ed1a75d72_c.jpg)
如果你在从 Google Drive 还原之后不能看到你的投资信息以及监视列表请确认你的国家信息与“Country”菜单里面设置的保持一致。
JStock 的安卓免费版可以从 [Google Play Store][4] 获取到。如果你需要完整的功能(比如云备份,通知,图表等),你需要一次性支付费用升级到高级版。我认为高级版物有所值。
![](https://c2.staticflickr.com/2/1687/23867534720_18b917028c_c.jpg)
写在最后,我应该说一下它的作者 Yan Cheng Cheok他是一个十分活跃的开发者有 bug 可以及时反馈给他。这一切都要感谢他!
关于 JStock 这个投资组合跟踪软件你有什么想法呢?
--------------------------------------------------------------------------------
via: http://xmodulo.com/stock-portfolio-management-software-Linux.html
作者:[Dan Nanni][a]
译者:[ivo-wang](https://github.com/ivo-wang)
校对:[wxy](https://github.com/wxy)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://Linux.cn/) 荣誉推出
[a]:http://xmodulo.com/author/nanni
[1]:http://jstock.org/
[2]:http://ask.xmodulo.com/install-java-runtime-Linux.html
[3]:http://jstock.org/ma_indicator.html
[4]:https://play.google.com/store/apps/details?id=org.yccheok.jstock.gui


@ -0,0 +1,105 @@
Fedora 中的容器技术systemd-nspawn
===
欢迎来到“Fedora 中的容器技术”系列!本文是该系列文章中的第一篇,它将说明你可以怎样使用 Fedora 中各种可用的容器技术。本文将学习 `systemd-nspawn` 的相关知识。
### 容器是什么?
一个容器就是一个用户空间实例,它能够在与托管容器的系统(叫做宿主系统)相隔离的环境中运行一个程序或者一个操作系统。这和 `chroot` 或 [虚拟机][1] 的思想非常类似。运行在容器中的进程是由与宿主操作系统相同的内核来管理的,但它们是与宿主文件系统以及其它进程隔离开的。
### 什么是 systemd-nspawn
systemd 项目认为应当将容器技术变成桌面的基础部分并且应当和用户的其余系统集成在一起。为此systemd 提供了 `systemd-nspawn`,这款工具能够使用多种 Linux 技术创建容器。它也提供了一些容器管理工具。
`systemd-nspawn``chroot` 在许多方面都是类似的,但是前者更加强大。它虚拟化了文件系统、进程树以及客户系统中的进程间通信。它的吸引力在于它提供了很多用于管理容器的工具,例如用来管理容器的 `machinectl`。由 `systemd-nspawn` 运行的容器将会与 systemd 组件一同运行在宿主系统上。举例来说,一个容器的日志可以输出到宿主系统的日志中。
在 Fedora 24 上,`systemd-nspawn` 已经从 systemd 软件包分离出来了,所以你需要安装 `systemd-container` 软件包。一如往常,你可以使用 `dnf install systemd-container` 进行安装。
### 创建容器
使用 `systemd-nspawn` 创建一个容器是很容易的。假设你有一个专门为 Debian 编写的应用,并且无法在其它发行版中正常运行。那并不是一个问题,我们可以创建一个容器!为了创建一个使用最新版本 Debian现在是 Jessie的容器你需要挑选一个目录来放置你的系统。我暂时将使用目录 `~/DebianJessie`。
一旦你创建完目录,你需要运行 `debootstrap`,你可以从 Fedora 仓库中安装它。对于 Debian Jessie你运行下面的命令来初始化一个 Debian 文件系统。
```
$ debootstrap --arch=amd64 stable ~/DebianJessie
```
以上命令假设你的架构是 x86_64。如果不是的话你必须将 `amd64` 改为你的架构名称。你可以使用 `uname -m` 得知你的机器架构。
一旦设置好你的根目录,你就可以使用下面的命令来启动你的容器。
```
$ systemd-nspawn -bD ~/DebianJessie
```
容器将会在数秒后准备好并运行,当你试图登录时就会注意到:你无法使用你系统上的任何账户。这是因为 `systemd-nspawn` 虚拟化了用户。修复的方法很简单:将之前命令中的 `-b` 移除即可。你将直接进入容器的 root 用户的 shell。此时你可以使用 `passwd` 命令为 root 设置密码,或者使用 `adduser` 命令添加一个新用户。一旦设置好密码或添加好用户,你就可以把 `-b` 标志添加回去然后继续了。你会进入到熟悉的登录控制台,然后使用设置好的认证信息登录进去。
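把上面的流程整理成命令大致如下(仅为示例,假设容器目录为 `~/DebianJessie`
```
$ systemd-nspawn -D ~/DebianJessie    # 不带 -b直接进入容器的 root shell
passwd                                # 在容器内为 root 设置密码
exit                                  # 退出容器
$ systemd-nspawn -bD ~/DebianJessie   # 完整启动容器,用刚设置的密码登录
```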
以上对于任意你想在容器中运行的发行版都适用,但前提是你需要使用正确的包管理器创建系统。对于 Fedora你应使用 DNF 而非 `debootstrap`。想要设置一个最小化的 Fedora 系统,你可以运行下面的命令,要将“/absolute/path/”替换成任何你希望容器存放的位置。
```
$ sudo dnf --releasever=24 --installroot=/absolute/path/ install systemd passwd dnf fedora-release
```
![](https://cdn.fedoramagazine.org/wp-content/uploads/2016/06/Screenshot-from-2016-06-17-15-04-14.png)
### 设置网络
如果你尝试启动一个服务,但它绑定了你宿主机正在使用的端口,你将会注意到这个问题:你的容器正在使用和宿主机相同的网络接口。幸运的是,`systemd-nspawn` 提供了几种可以将网络从宿主机分开的方法。
#### 本地网络
第一种方法是使用 `--private-network` 标志,它默认仅创建一个回环设备。这对于你不需要使用网络的环境是非常理想的,例如构建系统和其它持续集成系统。
#### 多个网络接口
如果你有多个网络接口设备,你可以使用 `--network-interface` 标志给容器分配一个接口。想要给我的容器分配 `eno1`,我会添加选项 `--network-interface=eno1`。当某个接口分配给一个容器后,宿主机就不能同时使用那个接口了。只有当容器彻底关闭后,宿主机才可以使用那个接口。
#### 共享网络接口
对于我们中那些并没有额外的网络设备的人来说,还有其它方法可以访问容器。一种就是使用 `--port` 选项。这会将容器中的一个端口定向到宿主机。使用格式是 `协议:宿主机端口:容器端口`,这里的协议可以是 `tcp` 或者 `udp``宿主机端口` 是宿主机的一个合法端口,`容器端口` 则是容器中的一个合法端口。你可以省略协议,只指定 `宿主机端口:容器端口`。我通常的用法类似 `--port=2222:22`
你可以使用 `--network-veth` 启用完全的、仅宿主机模式的网络,这会在宿主机和容器之间创建一个虚拟的网络接口。你也可以使用 `--network-bridge` 桥接二者的连接。
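举一个组合使用的例子(仅为示意,端口请按需修改):`--network-veth` 提供仅宿主机的私有网络,同时用 `--port` 把宿主机的 2222 端口转发到容器的 SSH 端口:
```
$ systemd-nspawn -bD ~/DebianJessie --network-veth --port=tcp:2222:22
```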
### 使用 systemd 组件
如果你容器中的系统含有 D-Bus你可以使用 systemd 提供的实用工具来控制并监视你的容器。基础安装的 Debian 并不包含 `dbus`。如果你想在 Debian Jessie 中使用 `dbus`,你需要运行命令 `apt install dbus`
#### machinectl
为了能够轻松地管理容器systemd 提供了 `machinectl` 实用工具。使用 `machinectl`,你可以使用 `machinectl login name` 登录到一个容器中、使用 `machinectl status name` 检查状态、使用 `machinectl reboot name` 重启容器,或者使用 `machinectl poweroff name` 关闭容器。
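假设容器名为 `DebianJessie`(容器名默认取自其目录名),这些命令用起来类似这样:
```
$ machinectl list                    # 列出正在运行的容器
$ machinectl login DebianJessie      # 登录到容器
$ machinectl status DebianJessie     # 查看容器状态
$ machinectl reboot DebianJessie     # 重启容器
$ machinectl poweroff DebianJessie   # 关闭容器
```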
### 其它 systemd 命令
多数 systemd 命令,例如 `journalctl`、`systemd-analyze` 和 `systemctl`,都支持使用 `--machine` 选项来指定容器。例如,如果你想查看一个名为 “foobar” 的容器的日志,你可以使用 `journalctl --machine=foobar`。你也可以使用 `systemctl --machine=foobar status service` 来查看运行在这个容器中的服务状态。
![](https://cdn.fedoramagazine.org/wp-content/uploads/2016/06/Screenshot-from-2016-06-17-15-09-25.png)
### 和 SELinux 一起工作
如果你要使用 SELinux 强制模式Fedora 默认模式),你需要为你的容器设置 SELinux 环境。想要那样的话,你需要在宿主系统上运行下面两行命令。
```
$ semanage fcontext -a -t svirt_sandbox_file_t "/path/to/container(/.*)?"
$ restorecon -R /path/to/container/
```
确保使用你的容器路径替换 “/path/to/container”。对于我的容器 "DebianJessie",我会运行下面的命令:
```
$ semanage fcontext -a -t svirt_sandbox_file_t "/home/johnmh/DebianJessie(/.*)?"
$ restorecon -R /home/johnmh/DebianJessie/
```
--------------------------------------------------------------------------------
via: https://fedoramagazine.org/container-technologies-fedora-systemd-nspawn/
作者:[John M. Harris, Jr.][a]
译者:[ChrisLeeGit](https://github.com/chrisleegit)
校对:[wxy](https://github.com/wxy)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://fedoramagazine.org/container-technologies-fedora-systemd-nspawn/
[1]: https://en.wikipedia.org/wiki/Virtual_machine


@ -0,0 +1,101 @@
如何在 Ubuntu Linux 16.04上安装开源的 Discourse 论坛
===============================================================================
Discourse 是一个开源的论坛,它可以以邮件列表、聊天室或者论坛等多种形式工作,是一个广受欢迎的现代论坛工具。在服务端,它使用 Ruby on Rails 和 Postgres 搭建,并且使用 Redis 缓存来减少读取时间;在客户端,它需要支持 JavaScript 的浏览器。它非常容易定制,结构良好,并且提供了转换插件,可以对你现存的论坛、公告板进行转换,例如 vBulletin、phpBB、Drupal、SMF 等等。在这篇文章中,我们将学习在 Ubuntu 操作系统下安装 Discourse。
它以安全作为设计思想,所以发垃圾信息的人和黑客们不能轻易的实现其企图。它能很好的支持各种现代设备,并可以相应的调整以手机和平板的显示。
### 在 Ubuntu 16.04 上安装 Discourse
让我们开始吧!最少需要 1G 的内存,并且官方支持的安装过程需要已经安装了 docker。说到 docker它还需要安装 Git。要满足以上的两点要求我们只需要运行下面的命令
```
wget -qO- https://get.docker.com/ | sh
```
![](http://linuxpitstop.com/wp-content/uploads/2016/06/124.png)
用不了多久就安装好了 docker 和 Git。安装结束以后在你系统的 /var 目录下创建一个 Discourse 文件夹(当然你也可以选择其他位置)。
```
mkdir /var/discourse
```
现在我们来克隆 Discourse 的 Github 仓库到这个新建的文件夹。
```
git clone https://github.com/discourse/discourse_docker.git /var/discourse
```
进入这个克隆的文件夹。
```
cd /var/discourse
```
![](http://linuxpitstop.com/wp-content/uploads/2016/06/314.png)
你将看到“discourse-setup” 脚本文件,运行这个脚本文件进行 Discourse 的初始化。
```
./discourse-setup
```
**备注: 在安装 discourse 之前请确保你已经安装好了邮件服务器。**
安装向导将会问你以下六个问题:
```
Hostname for your Discourse?
Email address for admin account?
SMTP server address?
SMTP user name?
SMTP port [587]:
SMTP password? []:
```
![](http://linuxpitstop.com/wp-content/uploads/2016/06/411.png)
当你提交了以上信息以后,它会让你确认。如果一切都很正常,按下回车后安装就开始了。
![](http://linuxpitstop.com/wp-content/uploads/2016/06/511.png)
现在坐下来放松一会儿,安装需要花费一些时间,去倒杯咖啡,顺便留意一下有没有错误信息。
![](http://linuxpitstop.com/wp-content/uploads/2016/06/610.png)
安装成功以后看起来应该像这样。
![](http://linuxpitstop.com/wp-content/uploads/2016/06/710.png)
现在打开浏览器,如果已经做了域名解析,你可以使用你的域名来访问 Discourse 页面,否则你只能使用 IP 地址了。你将看到如下信息:
![](http://linuxpitstop.com/wp-content/uploads/2016/06/85.png)
就是这样,点击 “Sign Up” 选项创建一个新的账户,然后进行你的 Discourse 设置。
![](http://linuxpitstop.com/wp-content/uploads/2016/06/106.png)
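附带一提,如果安装完成后你又修改了配置(例如 SMTP 信息discourse_docker 仓库自带的 `launcher` 脚本可以用来重建和管理容器。下面的示例假设使用默认的容器名 `app`
```
cd /var/discourse
./launcher rebuild app   # 按新配置重建 Discourse 容器
./launcher logs app      # 查看容器日志,便于排查问题
```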
### 结论
它安装简便,运行完美。它拥有现代论坛所有必备的功能。它以 GPL 发布,是完全开源的产品。简单、易用以及特性丰富是它的最大特点。希望你喜欢这篇文章,如果有问题,可以给我们留言。
--------------------------------------------------------------------------------
via: http://linuxpitstop.com/install-discourse-on-ubuntu-linux-16-04/
作者:[Aun][a]
译者:[kokialoves](https://github.com/kokialoves)
校对:[wxy](https://github.com/wxy)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: http://linuxpitstop.com/author/aun/


@ -0,0 +1,31 @@
Fedora 内核是由什么构成的?
====================================
![](https://cdn.fedoramagazine.org/wp-content/uploads/2016/06/kernel-945x400.png)
每个 Fedora 系统都运行着一个内核。许多代码片段组合在一起使之成为现实。
每个 Fedora 内核都起始于一个来自于[上游社区][1]的基线版本——通常称之为 vanilla 内核。上游内核就是标准。Fedora 的)目标是包含尽可能多的上游代码,这样使得 bug 修复和 API 更新更加容易同时也会有更多的人审查代码。理想情况下Fedora 能够直接获取 kernel.org 的内核,然后发送给所有用户。
现实情况是,使用 vanilla 内核并不能完全满足 Fedora。Vanilla 内核可能并不支持一些 Fedora 用户希望拥有的功能。用户接收的 [Fedora 内核][2] 是在 vanilla 内核之上打了很多补丁的内核。这些补丁被认为“不在树上out of tree”。这些树外补丁大多都不会存在太久。如果某补丁能够修复一个问题那么该补丁可能会被合并到 Fedora 树,以便用户能够更快地收到修复。当内核变基到一个新版本时,已经合并到新版本中的补丁都会被移除。
一些补丁会在 Fedora 内核树上存在很长时间,安全启动补丁就是一个很好的例子。这些补丁提供了 Fedora 希望支持的功能,即使上游社区还没有接受它们。保持这些补丁更新是需要付出很多努力的,所以 Fedora 尝试减少不被上游内核维护者接受的补丁数量。
通常来说,想要在 Fedora 内核中获得一个补丁的最佳方法是先给 [Linux 内核邮件列表LKML][3] 发送补丁,然后请求将该补丁包含到 Fedora 中。如果某个维护者接受了补丁,就意味着 Fedora 内核树中将来很有可能会包含该补丁。一些来自于 GitHub 等地方的还没有提交给 LKML 的补丁是不可能进入内核树的。首先向 LKML 发送补丁是非常重要的,它能确保 Fedora 内核树中携带的补丁是功能正常的。如果没有社区审查Fedora 最终携带的补丁将会充满 bug 并会导致问题。
Fedora 内核中包含的代码来自许多地方,这一切都是为了提供最佳的体验。
--------------------------------------------------------------------------------
via: https://fedoramagazine.org/makes-fedora-kernel/
作者:[Laura Abbott][a]
译者:[ChrisLeeGit](https://github.com/chrisleegit)
校对:[wxy](https://github.com/wxy)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://fedoramagazine.org/makes-fedora-kernel/
[1]: http://www.kernel.org/
[2]: http://pkgs.fedoraproject.org/cgit/rpms/kernel.git/
[3]: http://www.labbott.name/blog/2015/10/02/the-art-of-communicating-with-lkml/


@ -1,7 +1,7 @@
使用 Python 创建你自己的 ShellPart I
使用 Python 创建你自己的 Shell (一)
==========================================
我很想知道一个 shell (像 bashcsh 等)内部是如何工作的。为了满足自己的好奇心,我使用 Python 实现了一个名为 **yosh** Your Own Shell的 Shell。本文章所介绍的概念也可以应用于其他编程语言。
我很想知道一个 shell (像 bashcsh 等)内部是如何工作的。于是为了满足自己的好奇心,我使用 Python 实现了一个名为 **yosh** Your Own Shell的 Shell。本文章所介绍的概念也可以应用于其他编程语言。
(提示:你可以在[这里](https://github.com/supasate/yosh)查找本博文使用的源代码,代码以 MIT 许可证发布。在 Mac OS X 10.11.5 上,我使用 Python 2.7.10 和 3.4.3 进行了测试。它应该可以运行在其他类 Unix 环境,比如 Linux 和 Windows 上的 Cygwin。
@ -20,15 +20,15 @@ yosh_project
`yosh_project` 为项目根目录(你也可以把它简单命名为 `yosh`)。
`yosh` 为包目录,且 `__init__.py` 可以使它成为与包目录名字相同的包(如果你不写 Python,可以忽略它。)
`yosh` 为包目录,且 `__init__.py` 可以使它成为与包的目录名字相同的包(如果你不用 Python 编写的话,可以忽略它。)
`shell.py` 是我们主要的脚本文件。
### 步骤 1Shell 循环
当启动一个 shell它会显示一个命令提示符并等待你的命令输入。在接收了输入的命令并执行它之后稍后文章会进行详细解释你的 shell 会重新回到循环等待下一条指令。
当启动一个 shell它会显示一个命令提示符并等待你的命令输入。在接收了输入的命令并执行它之后稍后文章会进行详细解释你的 shell 会重新回到这里,并循环等待下一条指令。
`shell.py`,我们会以一个简单的 mian 函数开始,该函数调用了 shell_loop() 函数,如下:
`shell.py`,我们会以一个简单的 main 函数开始,该函数调用了 shell_loop() 函数,如下:
```
def shell_loop():
@ -43,7 +43,7 @@ if __name__ == "__main__":
main()
```
接着,在 `shell_loop()`,为了指示循环是否继续或停止,我们使用了一个状态标志。在循环的开始,我们的 shell 将显示一个命令提示符,并等待读取命令输入。
接着,在 `shell_loop()`,为了指示循环是否继续或停止,我们使用了一个状态标志。在循环的开始,我们的 shell 将显示一个命令提示符,并等待读取命令输入。
```
import sys
@ -56,15 +56,15 @@ def shell_loop():
status = SHELL_STATUS_RUN
while status == SHELL_STATUS_RUN:
# Display a command prompt
### 显示命令提示符
sys.stdout.write('> ')
sys.stdout.flush()
# Read command input
### 读取命令输入
cmd = sys.stdin.readline()
```
之后,我们切分命令输入并进行执行(我们即将实现`命令切分`和`执行`函数)。
之后,我们切分命令tokenize输入并进行执行execute我们即将实现 `tokenize``execute` 函数)。
因此,我们的 shell_loop() 会是如下这样:
@ -79,33 +79,33 @@ def shell_loop():
status = SHELL_STATUS_RUN
while status == SHELL_STATUS_RUN:
# Display a command prompt
### 显示命令提示符
sys.stdout.write('> ')
sys.stdout.flush()
# Read command input
### 读取命令输入
cmd = sys.stdin.readline()
# Tokenize the command input
### 切分命令输入
cmd_tokens = tokenize(cmd)
# Execute the command and retrieve new status
### 执行该命令并获取新的状态
status = execute(cmd_tokens)
```
这就是我们整个 shell 循环。如果我们使用 `python shell.py` 启动我们的 shell它会显示命令提示符。然而如果我们输入命令并按回车它会抛出错误因为我们还没定义`命令切分`函数。
这就是我们整个 shell 循环。如果我们使用 `python shell.py` 启动我们的 shell它会显示命令提示符。然而如果我们输入命令并按回车它会抛出错误因为我们还没定义 `tokenize` 函数。
为了退出 shell可以尝试输入 ctrl-c。稍后我将解释如何以优雅的形式退出 shell。
### 步骤 2命令切分
### 步骤 2命令切分tokenize
当用户在我们的 shell 中输入命令并按下回车键,该命令将会是一个包含命令名称及其参数的字符串。因此,我们必须切分该字符串(分割一个字符串为多个标记)。
当用户在我们的 shell 中输入命令并按下回车键该命令将会是一个包含命令名称及其参数的长字符串。因此我们必须切分该字符串分割一个字符串为多个词元token。
咋一看似乎很简单。我们或许可以使用 `cmd.split()`,以空格分割输入。它对类似 `ls -a my_folder` 的命令起作用,因为它能够将命令分割为一个列表 `['ls', '-a', 'my_folder']`,这样我们便能轻易处理它们了。
然而,也有一些类似 `echo "Hello World"``echo 'Hello World'` 以单引号或双引号引用参数的情况。如果我们使用 cmd.spilt我们将会得到一个存有 3 个标记的列表 `['echo', '"Hello', 'World"']` 而不是 2 个标记的列表 `['echo', 'Hello World']`
幸运的是Python 提供了一个名为 `shlex` 的库,它能够帮助我们效验如神地分割命令。(提示:我们也可以使用正则表达式,但它不是本文的重点。)
幸运的是Python 提供了一个名为 `shlex` 的库,它能够帮助我们如魔法般地分割命令。(提示:我们也可以使用正则表达式,但它不是本文的重点。)
```
@ -120,23 +120,23 @@ def tokenize(string):
...
```
然后我们将这些标记发送到执行进程。
然后我们将这些词元发送到执行进程。
### 步骤 3执行
这是 shell 中核心有趣的一部分。当 shell 执行 `mkdir test_dir` 时,到底发生了什么?(提示: `mkdir` 是一个带有 `test_dir` 参数的执行程序,用于创建一个名为 `test_dir` 的目录。)
这是 shell 中核心有趣的一部分。当 shell 执行 `mkdir test_dir` 时,到底发生了什么?(提示: `mkdir` 是一个带有 `test_dir` 参数的执行程序,用于创建一个名为 `test_dir` 的目录。)
`execvp`涉及这一步的首个函数。在我们解释 `execvp` 所做的事之前,让我们看看它的实际效果。
`execvp` 是这一步首先需要用到的函数。在我们解释 `execvp` 所做的事之前,让我们看看它的实际效果。
```
import os
...
def execute(cmd_tokens):
# Execute command
### 执行命令
os.execvp(cmd_tokens[0], cmd_tokens)
# Return status indicating to wait for next command in shell_loop
### 返回状态以告知在 shell_loop 中等待下一个命令
return SHELL_STATUS_RUN
...
@ -144,11 +144,11 @@ def execute(cmd_tokens):
再次尝试运行我们的 shell并输入 `mkdir test_dir` 命令,接着按下回车键。
在我们敲下回车键之后,问题是我们的 shell 会直接退出而不是等待下一个命令。然而,目标正确地被创建
在我们敲下回车键之后,问题是我们的 shell 会直接退出而不是等待下一个命令。然而,目录正确地创建了
因此,`execvp` 实际上做了什么?
`execvp` 是系统调用 `exec` 的一个变体。第一个参数是程序名字。`v` 表示第二个参数是一个程序参数列表(可变参数)。`p` 表示环境变量 `PATH` 会被用于搜索给定的程序名字。在我们上一次的尝试中,它将会基于我们的 `PATH` 环境变量查找`mkdir` 程序。
`execvp` 是系统调用 `exec` 的一个变体。第一个参数是程序名字。`v` 表示第二个参数是一个程序参数列表(参数数量可变)。`p` 表示将会使用环境变量 `PATH` 搜索给定的程序名字。在我们上一次的尝试中,它将会基于我们的 `PATH` 环境变量查找`mkdir` 程序。
(还有其他 `exec` 变体,比如 execv、execvpe、execl、execlp、execlpe你可以 google 它们获取更多的信息。)
@ -158,7 +158,7 @@ def execute(cmd_tokens):
因此,我们需要其他的系统调用来解决问题:`fork`。
`fork`开辟新的内存并拷贝当前进程到一个新的进程。我们称这个新的进程为**子进程**,调用者进程为**父进程**。然后,子进程内存会被替换为被执行的程序。因此,我们的 shell也就是父进程可以免受内存替换的危险。
`fork`分配新的内存并拷贝当前进程到一个新的进程。我们称这个新的进程为**子进程**,调用者进程为**父进程**。然后,子进程内存会被替换为被执行的程序。因此,我们的 shell也就是父进程可以免受内存替换的危险。
让我们看看修改的代码。
@ -166,34 +166,34 @@ def execute(cmd_tokens):
...
def execute(cmd_tokens):
# Fork a child shell process
# If the current process is a child process, its `pid` is set to `0`
# else the current process is a parent process and the value of `pid`
# is the process id of its child process.
### 分叉一个子 shell 进程
### 如果当前进程是子进程,其 `pid` 被设置为 `0`
### 否则当前进程是父进程的话,`pid` 的值
### 是其子进程的进程 ID。
pid = os.fork()
if pid == 0:
# Child process
# Replace the child shell process with the program called with exec
### 子进程
### 用被 exec 调用的程序替换该子进程
os.execvp(cmd_tokens[0], cmd_tokens)
elif pid > 0:
# Parent process
### 父进程
while True:
# Wait response status from its child process (identified with pid)
### 等待其子进程的响应状态(以进程 ID 来查找)
wpid, status = os.waitpid(pid, 0)
# Finish waiting if its child process exits normally
# or is terminated by a signal
### 当其子进程正常退出时
### 或者其被信号中断时,结束等待状态
if os.WIFEXITED(status) or os.WIFSIGNALED(status):
break
# Return status indicating to wait for next command in shell_loop
### 返回状态以告知在 shell_loop 中等待下一个命令
return SHELL_STATUS_RUN
...
```
当我们的父进程调用 `os.fork()`时,你可以想象所有的源代码被拷贝到了新的子进程。此时此刻,父进程和子进程看到的是相同的代码,且并行运行着。
当我们的父进程调用 `os.fork()` 时,你可以想象所有的源代码被拷贝到了新的子进程。此时此刻,父进程和子进程看到的是相同的代码,且并行运行着。
如果运行的代码属于子进程,`pid` 将为 `0`。否则,如果运行的代码属于父进程,`pid` 将会是子进程的进程 id。
@ -205,13 +205,13 @@ def execute(cmd_tokens):
现在,你可以尝试运行我们的 shell 并输入 `mkdir test_dir2`。它应该可以正确执行。我们的主 shell 进程仍然存在并等待下一条命令。尝试执行 `ls`,你可以看到已创建的目录。
但是,这里仍有许多问题。
但是,这里仍有一些问题。
第一,尝试执行 `cd test_dir2`,接着执行 `ls`。它应该会进入到一个空的 `test_dir2` 目录。然而,你将会看到目录并没有变为 `test_dir2`
第二,我们仍然没有办法优雅地退出我们的 shell。
我们将会在 [Part 2][1] 解决诸如此类的问题。
我们将会在 [第二部分][1] 解决诸如此类的问题。
--------------------------------------------------------------------------------
@ -219,8 +219,8 @@ def execute(cmd_tokens):
via: https://hackercollider.com/articles/2016/07/05/create-your-own-shell-in-python-part-1/
作者:[Supasate Choochaisri][a]
译者:[译者ID](https://github.com/译者ID)
校对:[校对者ID](https://github.com/校对者ID)
译者:[cposture](https://github.com/cposture)
校对:[wxy](https://github.com/wxy)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出


@ -0,0 +1,211 @@
使用 Python 创建你自己的 Shell
===========================================
在[上篇][1]中,我们已经创建了一个 shell 主循环、切分了命令输入,以及通过 `fork``exec` 执行命令。在这部分,我们将会解决剩下的问题。首先,`cd test_dir2` 命令无法修改我们的当前目录。其次,我们仍无法优雅地从 shell 中退出。
### 步骤 4内置命令
“`cd test_dir2` 无法修改我们的当前目录” 这句话是对的,但在某种意义上也是错的。在执行完该命令之后,我们仍然处在同一目录,从这个意义上讲,它是对的。然而,目录实际上已经被修改,只不过它是在子进程中被修改。
还记得吗?我们分叉fork了一个子进程然后执行命令而执行命令的过程并没有发生在父进程上。结果是我们只是改变了子进程的当前目录而不是父进程的目录。
然后子进程退出,而父进程在原封不动的目录下继续运行。
因此,这类与 shell 自己相关的命令必须是内置命令。它必须在 shell 进程中执行而不是在分叉中forking
#### cd
让我们从 `cd` 命令开始。
我们首先创建一个 `builtins` 目录。每一个内置命令都会被放进这个目录中。
```shell
yosh_project
|-- yosh
|-- builtins
| |-- __init__.py
| |-- cd.py
|-- __init__.py
|-- shell.py
```
`cd.py` 中,我们通过使用系统调用 `os.chdir` 实现自己的 `cd` 命令。
```python
import os
from yosh.constants import *
def cd(args):
os.chdir(args[0])
return SHELL_STATUS_RUN
```
注意,我们会从内置函数返回 shell 的运行状态。所以,为了能够在项目中继续使用常量,我们将它们移至 `yosh/constants.py`
```shell
yosh_project
|-- yosh
|-- builtins
| |-- __init__.py
| |-- cd.py
|-- __init__.py
|-- constants.py
|-- shell.py
```
`constants.py` 中,我们将状态常量都放在这里。
```python
SHELL_STATUS_STOP = 0
SHELL_STATUS_RUN = 1
```
现在,我们的内置 `cd` 已经准备好了。让我们修改 `shell.py` 来处理这些内置函数。
```python
...
### 导入常量
from yosh.constants import *
### 使用哈希映射来存储内建的函数名及其引用
built_in_cmds = {}
def tokenize(string):
return shlex.split(string)
def execute(cmd_tokens):
### 从词元列表中分拆命令名称与参数
cmd_name = cmd_tokens[0]
cmd_args = cmd_tokens[1:]
### 如果该命令是一个内建命令,使用参数调用该函数
if cmd_name in built_in_cmds:
return built_in_cmds[cmd_name](cmd_args)
...
```
我们使用一个 Python 字典变量 `built_in_cmds` 作为哈希映射hash map以存储我们的内置函数。我们在 `execute` 函数中提取命令的名字和参数。如果该命令在我们的哈希映射中,则调用对应的内置函数。
(提示:`built_in_cmds[cmd_name]` 返回能直接使用参数调用的函数引用。)
我们差不多准备好使用内置的 `cd` 函数了。最后一步是将 `cd` 函数添加到 `built_in_cmds` 映射中。
```
...
### 导入所有内建函数引用
from yosh.builtins import *
...
### 注册内建函数到内建命令的哈希映射中
def register_command(name, func):
built_in_cmds[name] = func
### 在此注册所有的内建命令
def init():
register_command("cd", cd)
def main():
### 在开始主循环之前初始化 shell
init()
shell_loop()
```
我们定义了 `register_command` 函数,以添加一个内置函数到我们内置的命令哈希映射。接着,我们定义 `init` 函数并且在这里注册内置的 `cd` 函数。
注意这行 `register_command("cd", cd)` 。第一个参数为命令的名字。第二个参数为一个函数引用。为了能够让第二个参数 `cd` 指向 `yosh/builtins/cd.py` 中的 `cd` 函数,我们必须将以下这行代码放在 `yosh/builtins/__init__.py` 文件中。
```
from yosh.builtins.cd import *
```
因此,在 `yosh/shell.py` 中,当我们从 `yosh.builtins` 导入 `*` 时,我们可以得到已经通过 `yosh.builtins` 导入的 `cd` 函数引用。
我们已经准备好了代码。让我们尝试在 `yosh_project` 目录下以模块的形式运行我们的 shell`python -m yosh.shell`。
现在,`cd` 命令可以正确修改我们的 shell 目录了,同时非内置命令仍然可以工作。非常好!
#### exit
最后一块终于来了:优雅地退出。
我们需要一个可以修改 shell 状态为 `SHELL_STATUS_STOP` 的函数。这样shell 循环可以自然地结束shell 将到达终点而退出。
`cd` 一样,如果我们在子进程中分叉并执行 `exit` 函数,其对父进程是不起作用的。因此,`exit` 函数需要成为一个 shell 内置函数。
让我们从这开始:在 `builtins` 目录下创建一个名为 `exit.py` 的新文件。
```
yosh_project
|-- yosh
|-- builtins
| |-- __init__.py
| |-- cd.py
| |-- exit.py
|-- __init__.py
|-- constants.py
|-- shell.py
```
`exit.py` 定义了一个 `exit` 函数,该函数仅仅返回一个可以退出主循环的状态。
```
from yosh.constants import *
def exit(args):
return SHELL_STATUS_STOP
```
然后,我们导入位于 `yosh/builtins/__init__.py` 文件的 `exit` 函数引用。
```
from yosh.builtins.cd import *
from yosh.builtins.exit import *
```
最后,我们在 `shell.py` 中的 `init()` 函数注册 `exit` 命令。
```
...
### 在此注册所有的内建命令
def init():
register_command("cd", cd)
register_command("exit", exit)
...
```
到此为止!
尝试执行 `python -m yosh.shell`。现在你可以输入 `exit` 优雅地退出程序了。
### 最后的想法
我希望你能像我一样享受创建 `yosh` **y**our **o**wn **sh**ell的过程。但我的 `yosh` 版本仍处于早期阶段。我没有处理一些会使 shell 崩溃的极端状况。还有很多我没有覆盖的内置命令。为了提高性能,一些非内置命令也可以实现为内置命令(避免新进程的创建时间)。同时,大量的功能还没有实现(请看[公共特性](http://tldp.org/LDP/Bash-Beginners-Guide/html/x7243.html)和[不同特性](http://www.tldp.org/LDP/intro-linux/html/x12249.html))。
我已经在 https://github.com/supasate/yosh 中提供了源代码。请随意 fork 和尝试。
现在该是创建你真正自己拥有的 Shell 的时候了。
Happy Coding!
--------------------------------------------------------------------------------
via: https://hackercollider.com/articles/2016/07/06/create-your-own-shell-in-python-part-2/
作者:[Supasate Choochaisri][a]
译者:[cposture](https://github.com/cposture)
校对:[wxy](https://github.com/wxy)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://disqus.com/by/supasate_choochaisri/
[1]: https://linux.cn/article-7624-1.html
[2]: http://tldp.org/LDP/Bash-Beginners-Guide/html/x7243.html
[3]: http://www.tldp.org/LDP/intro-linux/html/x12249.html
[4]: https://github.com/supasate/yosh


@ -0,0 +1,80 @@
GNU KHATA开源的会计管理软件
============================================
作为一个活跃的 Linux 爱好者,我经常向我的朋友们介绍 Linux帮助他们选择最适合他们的发行版本同时也会帮助他们安装一些适用于他们工作的开源软件。
但是在这一次,我就变得很无奈。我的叔叔是一个自由职业的会计师,他使用一系列漂亮而成熟的付费会计软件。我曾不那么确定能在开源软件中找到可以替代的软件——直到昨天。
Abhishek 给我推荐了一些[很酷的软件][1],而其中 GNU Khata 脱颖而出。
[GNU Khata][2] 是一个会计工具,或者,我应该说成是一系列的会计工具集合?它就像财务管理方面的 [Evernote][3] 一样。它的应用是如此之广,以至于它不但可以用于个人的财务管理,也可以用于大型公司的管理,从店铺存货管理到税率计算,都可以有效处理。
有个有趣的地方Khata 这个词在印度及其他使用印地语的国家中意为“账户”,所以这个会计软件叫做 GNU Khata。
### 安装
互联网上有很多关于旧的 Web 版本的 Khata 安装介绍。现在GNU Khata 只能用在 Debian/Ubuntu 和它们的衍生版本中。我建议你按照 GNU Khata 官网给出的如下步骤来安装。我们来快速过一下。
- 从[这里][4]下载安装器。
- 在下载目录打开终端。
- 复制以下的代码,粘贴到终端并执行。
```
sudo chmod 755 GNUKhatasetup.run
sudo ./GNUKhatasetup.run
```
这就结束了,从你的 Dash 或者是应用菜单中启动 GNU Khata 吧。
### 第一次启动
GNU Khata 在浏览器中打开,并且展现以下的画面。
![](https://itsfoss.com/wp-content/uploads/2016/07/GNU-khata-1.jpg)
填写组织的名字、组织形式,财务年度并且点击 proceed 按钮进入管理设置页面。
![](https://itsfoss.com/wp-content/uploads/2016/07/GNU-khata-2.jpg)
仔细填写你的用户名、密码、安全问题及其答案并且点击“create and login”。
![](https://itsfoss.com/wp-content/uploads/2016/07/GNU-khata-3.jpg)
你已经全部设置完成了。现在就用菜单栏来开始使用 GNU Khata 管理你的财务吧,这很容易。
### 移除 GNU KHATA
如果你不想使用 GNU Khata 了,你可以执行如下命令移除:
```
sudo apt-get remove --auto-remove gnukhata-core-engine
```
你也可以通过新立得软件管理来删除它。
### GNU KHATA 真的是市面上付费会计应用的竞争对手吗?
首先GNU Khata 以简化为设计原则。顶部的菜单栏组织得很方便可以帮助你有效地进行工作。你可以选择管理不同的账户和项目并且切换非常容易。[它们的官网][5]表明GNU Khata 可以“像说印地语一样方便”LCTT 译注:原谅我,这个软件作者和本文作者是印度人……)。同时,你知道 GNU Khata 也可以在云端使用吗?
所有的主流的账户管理工具,比如分类账簿、项目报表、财务报表等等都用专业的方式整理,并且支持自定义格式和即时展示。这让会计和仓储管理看起来如此的简单。
这个项目正在积极的发展正在寻求实操中的反馈以帮助这个软件更加进步。考虑到软件的成熟性、使用的便利性还有免费的情况GNU Khata 可能会成为你最好的账簿助手。
请在评论框里留言吧,让我们知道你是如何看待 GNU Khata 的。
--------------------------------------------------------------------------------
via: https://itsfoss.com/using-gnu-khata/
作者:[Aquil Roshan][a]
译者:[MikeCoder](https://github.com/MikeCoder)
校对:[wxy](https://github.com/wxy)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://itsfoss.com/author/aquil/
[1]: https://itsfoss.com/category/apps/
[2]: http://www.gnukhata.in/
[3]: https://evernote.com/
[4]: https://cloud.openmailbox.org/index.php/s/L8ppsxtsFq1345E/download
[5]: http://www.gnukhata.in/


@ -0,0 +1,36 @@
在浏览器中体验 Ubuntu
=====================================================
[Ubuntu][2] 背后的公司 [Canonical][1] 为 Linux 的推广做了很多努力。无论你有多么不喜欢 Ubuntu你也必须承认它对 “Linux 易用性”的影响。Ubuntu 及其衍生版是使用最多的 Linux 发行版。
为了进一步推广 Ubuntu LinuxCanonical 把它放到了浏览器里,你可以在任何地方使用这个 [Ubuntu 演示版][0]。 它将帮你更好的体验 Ubuntu以便让新人更容易决定是否使用它。
你可能会争辩说 USB 版的 Linux 更好。我同意,但是你得先下载 ISO、创建 USB 启动盘、修改配置文件,然后才能使用这个 USB 启动盘来体验。这么乏味,并不是每个人都乐意这么干。在线体验是一个更好的选择。
那么,你能在 Ubuntu 在线演示里看到什么呢?实际上并不多。
你可以浏览文件,可以使用 Unity Dash浏览 Ubuntu 软件中心,甚至装几个应用(当然它们不会真的安装),还可以看一看文件浏览器和其它一些东西。以上就是全部了。但是在我看来,这已经做得很好了,足以让你知道它是个什么样子,对这个流行的操作系统有个直接感受。
![](https://itsfoss.com/wp-content/uploads/2016/07/Ubuntu-online-demo.jpeg)
![](https://itsfoss.com/wp-content/uploads/2016/07/Ubuntu-online-demo-1.jpeg)
![](https://itsfoss.com/wp-content/uploads/2016/07/Ubuntu-online-demo-2.jpeg)
如果你的朋友或者家人对试试 Linux 抱有兴趣,但是想在安装前先体验一下,你可以给他们以下链接:[Ubuntu 在线导览][0]。
--------------------------------------------------------------------------------
via: https://itsfoss.com/ubuntu-online-demo/
作者:[Abhishek Prakash][a]
译者:[kokialoves](https://github.com/kokialoves)
校对:[wxy](https://github.com/wxy)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://itsfoss.com/author/abhishek/
[0]: http://tour.ubuntu.com/en/
[1]: http://www.canonical.com/
[2]: http://www.ubuntu.com/


@ -0,0 +1,200 @@
Linux 上 10 个最好的 Markdown 编辑器
======================================
在这篇文章中,我们会点评一些可以在 Linux 上安装使用的最好的 Markdown 编辑器。 你可以找到非常多的 Linux 平台上的 Markdown 编辑器,但是在这里我们将尽可能地为您推荐那些最好的。
![](http://www.tecmint.com/wp-content/uploads/2016/07/Best-Linux-Markdown-Editors.png)
*Best Linux Markdown Editors*
对于不了解 Markdown 的人做个简单介绍Markdown 是由著名的 Aaron Swartz 和 John Gruber 发明的标记语言,其最初的解析器是一个用 Perl 写的简单、轻量的[同名工具][1]。它可以将用户写的纯文本转为可用的 HTML或 XHTML。它实际上是一门易读易写的纯文本语言以及一个用于将文本转为 HTML 的转换工具。
希望你先对 Markdown 有一个稍微的了解,接下来让我们逐一列出这些编辑器。
### 1. Atom
Atom 是一个现代的、跨平台、开源且强大的文本编辑器,它可以运行在 Linux、Windows 和 Mac OS X 等操作系统上。用户可以在它的基础上进行定制,删减修改任何配置文件。
它包含了一些非常杰出的特性:
- 内置软件包管理器
- 智能自动补全功能
- 提供多窗口操作
- 支持查找替换功能
- 包含一个文件系统浏览器
- 轻松自定义主题
- 开源、高度扩展性的软件包等
![](http://www.tecmint.com/wp-content/uploads/2016/07/Atom-Markdown-Editor-for-Linux.png)
*Atom Markdown Editor for Linux*
访问主页: <https://atom.io/>
### 2. GNU Emacs
Emacs 是 Linux 平台上一款流行的文本编辑器。它是一个非常棒的、具备高扩展性和定制性的 Markdown 语言编辑器。
它综合了以下这些神奇的特性:
- 带有丰富的内置文档,包括适合初学者的教程
- 有完整的 Unicode 支持,可显示所有的人类符号
- 支持内容识别的文本编辑模式
- 包括多种文件类型的语法高亮
- 可用 Emacs Lisp 或 GUI 对其进行高度定制
- 提供了一个包系统可用来下载安装各种扩展等
![](http://www.tecmint.com/wp-content/uploads/2016/07/Emacs-Markdown-Editor-for-Linux.png)
*Emacs Markdown Editor for Linux*
访问主页: <https://www.gnu.org/software/emacs/>
### 3. Remarkable
Remarkable 可能是 Linux 上最好的 Markdown 编辑器了,它也适用于 Windows 操作系统。它的确是一个卓越且功能齐全的 Markdown 编辑器,为用户提供了一些令人激动的特性。
一些卓越的特性:
- 支持实时预览
- 支持导出 PDF 和 HTML
- 支持 Github Markdown 语法
- 支持定制 CSS
- 支持语法高亮
- 提供键盘快捷键
- 高可定制性和其他
![](http://www.tecmint.com/wp-content/uploads/2016/07/Remarkable-Markdown-Editor-for-Linux.png)
*Remarkable Markdown Editor for Linux*
访问主页: <https://remarkableapp.github.io>
### 4. Haroopad
Haroopad 是为 Linux、Windows 和 Mac OS X 构建的跨平台 Markdown 文档处理程序。用户可以用它来书写许多专家级格式的文档,包括电子邮件、报告、博客、演示文稿和博客文章等等。
功能齐全且具备以下的亮点:
- 轻松导入内容
- 支持导出多种格式
- 广泛支持博客和邮件
- 支持许多数学表达式
- 支持 Github Markdown 扩展
- 为用户提供了一些令人兴奋的主题、皮肤和 UI 组件等等
![](http://www.tecmint.com/wp-content/uploads/2016/07/Haroopad-Markdown-Editor-for-Linux.png)
*Haroopad Markdown Editor for Linux*
访问主页: <http://pad.haroopress.com/>
### 5. ReText
ReText 是为 Linux 和其它几个 POSIX 兼容操作系统提供的简单、轻量、强大的 Markdown 编辑器。它还可以作为一个 reStructuredText 编辑器,并且具有以下的特性:
- 简单直观的 GUI
- 具备高定制性,用户可以自定义语法文件和配置选项
- 支持多种配色方案
- 支持使用多种数学公式
- 启用导出扩展等等
![](http://www.tecmint.com/wp-content/uploads/2016/07/ReText-Markdown-Editor-for-Linux.png)
*ReText Markdown Editor for Linux*
访问主页: <https://github.com/retext-project/retext>
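在 Ubuntu 及其衍生版上ReText 通常可以直接从官方仓库安装(包名以你所用发行版的仓库为准):
```
sudo apt install retext
```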
### 6. UberWriter
UberWriter 是一个简单、易用的 Linux Markdown 编辑器。它的开发受 Mac OS X 上的 iA writer 影响很大,同样它也具备这些卓越的特性:
- 使用 pandoc 进行所有的文本到 HTML 的转换
- 提供了一个简洁的 UI 界面
- 提供了一种专心distraction free模式高亮用户最后的句子
- 支持拼写检查
- 支持全屏模式
- 支持用 pandoc 导出 PDF、HTML 和 RTF
- 启用语法高亮和数学函数等等
![](http://www.tecmint.com/wp-content/uploads/2016/07/UberWriter-Markdown-Editor-for-Linux.png)
*UberWriter Markdown Editor for Linux*
访问主页: <http://uberwriter.wolfvollprecht.de/>
### 7. Mark My Words
Mark My Words 同样也是一个轻量、强大的 Markdown 编辑器。它是一个相对比较新的编辑器,因此提供了包含语法高亮在内的大量功能,以及简单而直观的 UI。
下面是一些棒极了,但还未捆绑到应用中的功能:
- 实时预览
- Markdown 解析和文件 IO
- 状态管理
- 支持导出 PDF 和 HTML
- 监测文件的修改
- 支持首选项设置
![](http://www.tecmint.com/wp-content/uploads/2016/07/MarkMyWords-Markdown-Editor-for-Linux.png)
*MarkMyWords Markdown Editor for-Linux*
访问主页: <https://github.com/voldyman/MarkMyWords>
### 8. Vim-Instant-Markdown 插件
Vim 是 Linux 上的一个久经考验的强大、流行而开源的文本编辑器,它非常适合用来编程。它也高度支持插件功能,可以让用户为其增加一些其它功能,包括 Markdown 预览。
有好几种 Vim 的 Markdown 预览插件,但是 [Vim-Instant-Markdown][2] 的表现最佳。
### 9. Bracket-MarkdownPreview 插件
Brackets 是一个现代、轻量、开源且跨平台的文本编辑器。它特别为 Web 设计和开发而构建。它的一些重要功能包括:支持内联编辑器、实时预览、预处理支持及更多。
它也是通过插件高度可扩展的,你可以使用 [Bracket-MarkdownPreview][3] 插件来编写和预览 Markdown 文档。
![](http://www.tecmint.com/wp-content/uploads/2016/07/Brackets-Markdown-Plugin.png)
*Brackets Markdown Plugin Preview*
### 10. SublimeText-Markdown 插件
Sublime Text 是一个精心打造的、流行的、跨平台文本编辑器用于代码、markdown 和普通文本。它的表现极佳,包括如下令人兴奋的功能:
- 简洁而美观的 GUI
- 支持多重选择
- 提供专心模式
- 支持窗体分割编辑
- 通过 Python 插件 API 支持高度插件化
- 完全可定制化,提供命令查找模式
[SublimeText-Markdown][4] 插件是一个支持格式高亮的软件包,带有一些漂亮的颜色方案。
![](http://www.tecmint.com/wp-content/uploads/2016/07/SublimeText-Markdown-Plugin-Preview.png)
*SublimeText Markdown Plugin Preview*
### 结论
通过上面的列表,你大概已经知道要为你的 Linux 桌面下载、安装什么样的 Markdown 编辑器和文档处理程序了。
请注意,这里提到的最好的 Markdown 编辑器可能对你来说并不是最好的选择。因此你可以通过下面的反馈部分,告诉我们你认为列表中未提及、但足够优秀、令人兴奋的 Markdown 编辑器。
--------------------------------------------------------------------------------
via: http://www.tecmint.com/best-markdown-editors-for-linux/
作者:[Aaron Kili][a]
译者:[Locez](https://github.com/locez)
校对:[wxy](https://github.com/wxy)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: http://www.tecmint.com/author/aaronkili/
[1]: https://daringfireball.net/projects/markdown/
[2]: https://github.com/suan/vim-instant-markdown
[3]: https://github.com/gruehle/MarkdownPreview
[4]: https://github.com/SublimeText-Markdown/MarkdownEditing


@ -0,0 +1,69 @@
怎样在 Ubuntu 中修改默认程序
==============================================
![](https://itsfoss.com/wp-content/uploads/2016/07/change-default-applications-ubuntu.jpg)
> 简介: 这个新手指南会向你展示如何在 Ubuntu Linux 中修改默认程序
对于我来说,安装 [VLC 多媒体播放器][1]是[安装完 Ubuntu 16.04 该做的事][2]中最先做的几件事之一。为了能够使我双击一个视频就用 VLC 打开,在我安装完 VLC 之后我会设置它为默认程序。
作为一个新手,你需要知道如何在 Ubuntu 中修改任何默认程序,这也是我今天在这篇指南中所要讲的。
### 在 UBUNTU 中修改默认程序
这里提及的方法适用于 Ubuntu 12.04、Ubuntu 14.04 和 Ubuntu 16.04。在 Ubuntu 中,有两种基本的方法可以修改默认程序:
- 通过系统设置
- 通过右键菜单
#### 1. 通过系统设置修改 Ubuntu 的默认程序
进入 Unity 面板并且搜索系统设置System Settings
![](https://itsfoss.com/wp-content/uploads/2013/11/System_Settings_Ubuntu.jpeg)
在系统设置System Settings选择详细选项Details
![](https://itsfoss.com/wp-content/uploads/2016/07/System-settings-detail-ubuntu.jpeg)
在左边的面板中选择默认程序Default Applications你会发现在右边的面板中可以修改默认程序。
![](https://itsfoss.com/wp-content/uploads/2016/07/System-settings-default-applications.jpeg)
正如看到的那样,这里只有少数几类的默认程序可以被改变。你可以在这里改变浏览器、邮箱客户端、日历、音乐、视频和相册的默认程序。那其他类型的默认程序怎么修改?
不要担心,为了修改其他类型的默认程序,我们会用到右键菜单。
#### 2. 通过右键菜单修改默认程序
如果你使用过 Windows 系统,你应该看见过右键菜单的“打开方式”,可以通过这个来修改默认程序。我们在 Ubuntu 中也有相似的方法。
右键点击一个还没有设置默认打开程序的文件选择“属性properties
![](https://itsfoss.com/wp-content/uploads/2016/05/WebP-images-Ubuntu-Linux-3.png)
*从右键菜单中选择属性*
在这里,你可以选择使用什么程序打开,并且设置为默认程序。
![](https://itsfoss.com/wp-content/uploads/2016/05/WebP-images-Ubuntu-Linux-4.png)
*在 Ubuntu 中设置打开 WebP 图片的默认程序为 gThumb*
小菜一碟不是么?一旦你做完这些,所有同样类型的文件都会用你选择的默认程序打开。
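顺带一提,如果你更喜欢用终端,也可以借助 `xdg-mime` 工具(属于 xdg-utils达到同样的效果。以下仅为示例假设你已安装 VLC 且其桌面文件名为 `vlc.desktop`
```
xdg-mime default vlc.desktop video/mp4   # 将 MP4 视频的默认程序设为 VLC
xdg-mime query default video/mp4         # 查询当前 MP4 的默认打开程序
```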
我很希望这个新手指南对你在修改 Ubuntu 的默认程序时有帮助。如果你有任何的疑问或者建议,可以随时在下面评论。
--------------------------------------------------------------------------------
via: https://itsfoss.com/change-default-applications-ubuntu/
作者:[Abhishek Prakash][a]
译者:[Locez](https://github.com/locez)
校对:[wxy](https://github.com/wxy)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://itsfoss.com/author/abhishek/
[1]: http://www.videolan.org/vlc/index.html
[2]: https://linux.cn/article-7453-1.html


@ -0,0 +1,90 @@
为你的 Linux 桌面设置一张实时的地球照片
=================================================================
![](http://www.omgubuntu.co.uk/wp-content/uploads/2016/07/Screen-Shot-2016-07-26-at-16.36.47-1.jpg)
厌倦了看同样的桌面背景了么?这里有一个(可能是)世界上最棒的桌面背景。
[Himawaripy][1] 是一个 Python 3 小脚本,它会抓取由[日本 Himawari 8 气象卫星][2]拍摄的接近实时的地球照片,并将它设置成你的桌面背景。
安装完成后,你可以将它设置成每 10 分钟运行的定时任务(自然,它要在后台运行),这样它就可以实时地取回地球的照片并设置成背景了。
因为 Himawari-8 是一颗同步轨道卫星,你只能看到澳大利亚上空的地球的图片——但是它实时的天气形态、云团和光线仍使它很壮丽,对我而言要是看到英国上方的就更好了!
高级设置允许你配置从卫星取回的图片质量,但是要记住:提高图片质量会增大文件大小,下载等待时间也会更长!
最后,虽然这个脚本与其他我们提到过的其他脚本类似,它还仍保持更新及可用。
### 获取 Himawaripy
Himawaripy 已经在一系列的桌面环境中都测试过了,包括 Unity、LXDE、i3、MATE 和其他桌面环境。它是自由开源软件,但是整体来说安装及配置不太简单。
在该项目的 [Github 主页][0]上可以找到安装和设置该应用程序的所有指导(提示:没有一键安装功能)。
- [实时地球壁纸脚本的 GitHub 主页][0]
### 安装及使用
![](http://www.omgubuntu.co.uk/wp-content/uploads/2016/07/Screen-Shot-2016-07-26-at-16.46.13-750x143.png)
一些读者请我在本文中补充一下一步步安装该应用的步骤。以下所有步骤都在其 GitHub 主页上,这里再贴一遍。
1、下载及解压 Himawaripy
这是最容易的步骤。点击下面的下载链接,然后下载最新版本,并解压到你的下载目录里面。
- [下载 Himawaripy 主干文件(.zip 格式)][3]
2、安装 python3-setuptools
你需要手动安装这个软件包Ubuntu 里面默认没有安装它:
```
sudo apt install python3-setuptools
```
3、安装 Himawaripy
在终端中,你需要切换到之前解压的目录中,并运行如下安装命令:
```
cd ~/Downloads/himawaripy-master
sudo python3 setup.py install
```
4、 看看它是否可以运行并下载最新的实时图片:
```
himawaripy
```
5、 设置定时任务
如果你希望该脚本可以在后台自动运行并更新(如果你需要手动更新,只需要运行 himawaripy 即可)
在终端中运行:
```
crontab -e
```
在其中新加一行(默认每 10 分钟运行一次):
```
*/10 * * * * /usr/local/bin/himawaripy
```
关于[配置定时任务][4]可以在 Ubuntu Wiki 上找到更多信息。
设置好定时任务后,你就不需要手动运行它了,它会每十分钟自动在后台运行一次。
--------------------------------------------------------------------------------
via: http://www.omgubuntu.co.uk/2016/07/set-real-time-earth-wallpaper-ubuntu-desktop
作者:[JOEY-ELIJAH SNEDDON][a]
译者:[geekpi](https://github.com/geekpi)
校对:[wxy](https://github.com/wxy)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://plus.google.com/117485690627814051450/?rel=author
[1]: https://github.com/boramalper/himawaripy
[2]: https://en.wikipedia.org/wiki/Himawari_8
[0]: https://github.com/boramalper/himawaripy
[3]: https://github.com/boramalper/himawaripy/archive/master.zip
[4]: https://help.ubuntu.com/community/CronHowto


@ -0,0 +1,100 @@
用 VeraCrypt 加密闪存盘
============================================
很多安全专家偏好像 VeraCrypt 这类能够用来加密闪存盘的开源软件,是因为可以获取到它的源代码。
保护 USB 闪存盘里的数据,加密是一个聪明的方法,正如我们在使用 Microsoft 的 BitLocker [加密闪存盘][1] 一文中提到的。
但是如果你不想用 BitLocker 呢?
你可能有顾虑,因为你不能够查看 Microsoft 的程序源码,那么它容易被植入用于政府或其它用途的“后门”。而由于开源软件的源码是公开的,很多安全专家认为开源软件很少藏有后门。
还好,有几个开源加密软件能作为 BitLocker 的替代。
要是你需要在 Windows 系统,苹果的 OS X 系统或者 Linux 系统上加密以及访问文件,开源软件 [VeraCrypt][2] 提供绝佳的选择。
VeraCrypt 源于 TrueCrypt。TrueCrypt 是一个备受好评的开源加密软件,尽管它现在已经停止维护了。但是 TrueCrypt 的代码通过了审核,没有发现什么重要的安全漏洞。另外,在 VeraCrypt 中对它进行了改善。
它有 Windows、OS X 和 Linux 系统的版本。
用 VeraCrypt 加密 USB 闪存盘不像用 BitLocker 那么简单,但是它也只要几分钟就好了。
### 用 VeraCrypt 加密闪存盘的 8 个步骤
对应你的操作系统 [下载 VeraCrypt][3] 之后:
打开 VeraCrypt点击 Create Volume进入 VeraCrypt 的创建卷的向导程序VeraCrypt Volume Creation Wizard
![](http://www.esecurityplanet.com/imagesvr_ce/6246/Vera0.jpg)
VeraCrypt 创建卷向导VeraCrypt Volume Creation Wizard允许你在闪存盘里新建一个加密文件容器这与其它未加密文件是独立的。或者你也可以选择加密整个闪存盘。这个时候你就选加密整个闪存盘就行。
![](http://www.esecurityplanet.com/imagesvr_ce/6703/Vera1.jpg)
然后选择标准模式Standard VeraCrypt Volume
![](http://www.esecurityplanet.com/imagesvr_ce/835/Vera2.jpg)
选择你想加密的闪存盘的驱动器卷标(这里是 O
![](http://www.esecurityplanet.com/imagesvr_ce/9427/Vera3.jpg)
选择创建卷模式Volume Creation Mode。如果你的闪存盘是空的或者你想要删除它里面的所有东西选第一个。要么你想保持所有现存的文件选第二个就好了。
![](http://www.esecurityplanet.com/imagesvr_ce/7828/Vera4.jpg)
这一步允许你选择加密选项。要是你不确定选哪个,就用默认的 AES 和 SHA-512 设置。
![](http://www.esecurityplanet.com/imagesvr_ce/5918/Vera5.jpg)
确定了卷容量后,输入并确认你想要用来加密数据的密码。
![](http://www.esecurityplanet.com/imagesvr_ce/3850/Vera6.jpg)
要有效工作VeraCrypt 要从一个熵或者“随机数”池中取出一个随机数。要初始化这个池,你将被要求随机地移动鼠标一分钟。一旦进度条变绿了,或者更方便的是等到进度条到了屏幕右边足够远的时候,点击 “Format” 来结束创建加密盘。
![](http://www.esecurityplanet.com/imagesvr_ce/7468/Vera8.jpg)
### 用 VeraCrypt 使用加密过的闪存盘
当你想要使用一个加密了的闪存盘,先插入闪存盘到电脑上,启动 VeraCrypt。
然后选择一个没有用过的卷标(比如 z:点击自动挂载设备Auto-Mount Devices
![](http://www.esecurityplanet.com/imagesvr_ce/2016/Vera10.jpg)
输入密码,点击确定。
![](http://www.esecurityplanet.com/imagesvr_ce/8222/Vera11.jpg)
挂载过程需要几分钟,这之后你的解密盘就能通过你先前选择的盘符进行访问了。
### VeraCrypt 移动硬盘安装步骤
如果你设置闪存盘的时候,选择的是加密过的容器而不是加密整个盘,你可以选择创建 VeraCrypt 称为移动盘Traveler Disk的设备。这会复制安装一个 VeraCrypt 到 USB 闪存盘。当你在别的 Windows 电脑上插入 U 盘时,就能从 U 盘自动运行 VeraCrypt也就是说没必要在新电脑上安装 VeraCrypt。
你可以设置闪存盘作为一个移动硬盘Traveler Disk在 VeraCrypt 的工具栏Tools菜单里选择 Traveler Disk SetUp 就行了。
![](http://www.esecurityplanet.com/imagesvr_ce/5812/Vera12.jpg)
要从移动盘Traveler Disk上运行 VeraCrypt你必须要有那台电脑的管理员权限这不足为奇。尽管这看起来是个限制但其实也合理因为机密文件本来就无法在不受你控制的电脑比如商务中心的电脑上安全打开。
> 本文作者 Paul Rubens 从事技术行业已经超过 20 年。这期间他为英国和国际主要的出版社,包括 《The Economist》《The Times》《Financial Times》《The BBC》《Computing》和《ServerWatch》等出版社写过文章
--------------------------------------------------------------------------------
via: http://www.esecurityplanet.com/open-source-security/how-to-encrypt-flash-drive-using-veracrypt.html
作者:[Paul Rubens][a]
译者:[GitFuture](https://github.com/GitFuture)
校对:[wxy](https://github.com/wxy)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: http://www.esecurityplanet.com/author/3700/Paul-Rubens
[1]: http://www.esecurityplanet.com/views/article.php/3880616/How-to-Encrypt-a-USB-Flash-Drive.htm
[2]: http://www.esecurityplanet.com/open-source-security/veracrypt-a-worthy-truecrypt-alternative.html
[3]: https://veracrypt.codeplex.com/releases/view/619351


@ -0,0 +1,118 @@
Git 系列(一):什么是 Git
===========
欢迎阅读本系列关于如何使用 Git 版本控制系统的教程!通过本文的介绍,你将会了解到 Git 的用途及谁该使用 Git。
如果你刚步入开源的世界,你很有可能会遇到一些在 Git 上托管代码或者发布使用版本的开源软件。事实上,不管你知道与否,你都在使用基于 Git 进行版本管理的软件Linux 内核(就算你没有在手机或者电脑上使用 Linux你正在访问的网站也是运行在 Linux 系统上的Firefox、Chrome 等其他很多项目都通过 Git 代码库和世界各地开发者共享他们的代码。
换个角度来说,你是否仅仅通过 Git 就可以和其他人共享你的代码?你是否可以在家里或者企业里私有化地使用 Git你必须要通过一个 GitHub 账号来使用 Git 吗?为什么要使用 Git 呢Git 的优势又是什么Git 是我唯一的选择吗?这些对 Git 的所有疑问都会把我们搞得一头雾水。
因此,忘记你以前所知的 Git让我们重新走进 Git 世界的大门。
### 什么是版本控制系统?
Git 首先是一个版本控制系统。现在市面上有很多不同的版本控制系统CVS、SVN、Mercurial、Fossil 当然还有 Git。
很多像 GitHub 和 GitLab 这样的服务是以 Git 为基础的,但是你也可以只使用 Git 而无需使用其他额外的服务。这意味着你可以以私有或者公有的方式来使用 Git。
如果你曾经和其他人有过任何电子文件方面的合作,你就会知道传统版本管理的工作流程。开始是很简单的:你有一个原始的版本,你把这个版本发送给你的同事,他们在接收到的版本上做了些修改,现在你们有两个版本了,然后他们把他们手上修改过的版本发回来给你。你把他们的修改合并到你手上的版本中,现在两个版本又合并成一个最新的版本了。
然后,你修改了你手上最新的版本,同时,你的同事也修改了他们手上合并前的版本。现在你们有 3 个不同的版本了,分别是合并后最新的版本,你修改后的版本,你同事手上继续修改过的版本。至此,你们的版本管理工作开始变得越来越混乱了。
正如 Jason van Gumster 在他的文章中指出 [即使是艺术家也需要版本控制][1]而且已经在个别人那里发现了这种趋势变化。无论是艺术家还是科学家开发一个某种实验版本是并不鲜见的在你的项目中可能有某个版本大获成功把项目推向一个新的高度也可能有某个版本惨遭失败。因此最终你不可避免的会创建出一堆名为project\_justTesting.kdenlive、project\_betterVersion.kdenlive、project\_best\_FINAL.kdenlive、project\_FINAL-alternateVersion.kdenlive 等类似名称的文件。
不管你是修改一个 for 循环,还是一些简单的文本编辑,一个好的版本控制系统都会让我们的生活更加的轻松。
### Git 快照
Git 可以为项目创建快照,并且存储这些快照为唯一的版本。
如果你将项目带领到了一个错误的方向上,你可以回退到上一个正确的版本,并且开始尝试另一个可行的方向。
如果你是和别人合作开发,当有人向你发送他们的修改时,你可以将这些修改合并到你的工作分支中,然后你的同事就可以获取到合并后的最新版本,并在此基础上继续工作。
Git 并不是魔法因此冲突还是会发生的“你修改了某文件的最后一行但是我把这行整行都删除了我们怎样处理这些冲突呢但是总体而言Git 会为你保留所有更改的历史版本,甚至允许并行版本。这为你保留了以任何方式处理冲突的能力。
### 分布式 Git
在不同的机器上为同一个项目工作是一件复杂的事情。因为在你开始工作时,你想要获得项目的最新版本,然后在此基础上进行修改,最后向你的同事共享这些改动。传统的方法是通过笨重的在线文件共享服务或者老旧的电邮附件,但是这两种方式都效率低下且容易出错。
Git 天生是为分布式工作设计的。如果你要参与到某个项目中你可以克隆clone该项目的 Git 仓库然后就像这个项目只有你本地一个版本一样对项目进行修改。最后使用一些简单的命令你就可以拉取pull其他开发者的修改或者你可以把你的修改推送push给别人。现在不用担心谁手上的是最新的版本或者谁的版本又存放在哪里等这些问题了。全部人都是在本地进行开发然后向共同的目标推送或者拉取更新。或者不是共同的目标这取决于项目的开发方式
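这个“克隆-修改-推送”的工作流在命令行里大致是下面这个样子(仓库地址仅为占位示例):
```
$ git clone https://example.com/our-project.git   # 克隆一份完整的本地副本
$ cd our-project
$ git pull                                        # 拉取其他开发者推送的修改
$ git push origin master                          # 把你的提交推送回去(需要写权限)
```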
### Git 界面
最原始的 Git 是运行在 Linux 终端上的应用软件。然而,得益于 Git 是开源的,并且拥有良好的设计,世界各地的开发者都可以为 Git 设计不同的访问界面。
Git 完全是免费的,并且已经打包在 Linux、BSD、Illumos 和其他类 Unix 系统中Git 命令看起来像这样:
```
$ git --version
git version 2.5.3
```
可能最著名的 Git 访问界面是基于网页的,像 GitHub、开源的 GitLab、Savannah、BitBucket 和 SourceForge 这些网站都提供基于网页的 Git 界面。这些站点为面向公众和面向社会的开源软件提供了最大限度的代码托管服务。在一定程度上基于浏览器的图形界面GUI可以降低 Git 的学习曲线。下面是 GitLab 界面的截图:
![](https://opensource.com/sites/default/files/0_gitlab.png)
再者,第三方 Git 服务提供商或者独立开发者甚至可以在 Git 的基础上开发出不是基于 HTML 的定制化前端界面。此类界面让你可以不用打开浏览器就能方便地使用 Git 进行版本管理。其中对用户最透明的方式是直接集成到文件管理器中。KDE 文件管理器 Dolphin 可以直接在目录中显示 Git 状态,甚至支持提交、推送和拉取更新操作。
![](https://opensource.com/sites/default/files/0_dolphin.jpg)
[Sparkleshare][2] 使用 Git 作为其 Dropbox 式的文件共享界面的基础。
![](https://opensource.com/sites/default/files/0_sparkleshare_1.jpg)
想了解更多的内容,可以查看 [Git wiki][3],这个(长长的)页面中展示了很多 Git 的图形界面项目。
### 谁应该使用 Git
就是你!我们更应该关心的问题是什么时候使用 Git和用 Git 来干嘛?
### 我应该在什么时候使用 Git 呢?我要用 Git 来干嘛呢?
想更深入的学习 Git我们必须比平常考虑更多关于文件格式的问题。
Git 是为了管理源代码而设计的在大多数编程语言中源代码就意味者一行行的文本。当然Git 并不知道你把这些文本当成是源代码还是下一部伟大的美式小说。因此,只要文件内容是以文本构成的,使用 Git 来跟踪和管理其版本就是一个很好的选择了。
但是什么是文本呢?如果你在像 LibreOffice 这类办公软件中编辑一些内容,通常并不会产生纯文本内容。因为通常复杂的应用软件都会对原始的文本内容进行一层封装,就如把原始文本内容用 XML 标记语言包装起来,然后封装在 Zip 包中。这种对原始文本内容进行一层封装的做法可以保证当你把文件发送给其他人时,他们可以看到你在办公软件中编辑的内容及特定的文本效果。奇怪的是,虽然,通常你的需求可能会很复杂,就像保存 [Kdenlive][4] 项目文件,或者保存从 [Inkscape][5] 导出的 SVG 文件,但是,事实上使用 Git 管理像 XML 文本这样的纯文本内容是最简单的。
如果你在使用 Unix 系统,你可以使用 `file` 命令来查看文件内容构成:
```
$ file ~/path/to/my-file.blah
my-file.blah: ASCII text
$ file ~/path/to/different-file.kra
different-file.kra: Zip data (MIME type "application/x-krita")
```
如果还是不确定,你可以使用 `head` 命令来查看文件内容:
```
$ head ~/path/to/my-file.blah
```
如果输出的文本你基本能看懂,这个文件就很有可能是文本文件。如果你仅仅在一堆乱码中偶尔看到几个熟悉的字符,那么这个文件就可能不是文本文件了。
准确的说Git 可以管理其他格式的文件但是它会把这些文件当成二进制大对象blob。两者的区别是在文本文件中Git 可以明确的告诉你在这两个快照(或者说提交)间有 3 行是修改过的。但是如果你在两个提交commit之间对一张图片进行的编辑操作Git 会怎么指出这种修改呢?实际上,因为图片并不是以某种可以增加或删除的有意义的文本构成,因此 Git 并不能明确的描述这种变化。当然我个人是非常希望图片的编辑可以像把文本“\<sky>丑陋的蓝绿色\</sky>”修改成“\<sky>漂浮着蓬松白云的天蓝色\</sky>”一样的简单,但是事实上图片的编辑并没有这么简单。
经常有人在 Git 上放入 png 图标、电子表格或者流程图这类二进制大型对象blob。尽管我们知道在 Git 上管理此类大型文件并不直观,但是,如果你需要使用 Git 来管理此类文件,你也并不需要过多的担心。如果你参与的项目同时生成文本文件和二进制大文件对象(如视频游戏中常见的场景,这些和源代码同样重要的图像和音频材料),那么你有两条路可以走:要么开发出你自己的解决方案,就如使用指向共享网络驱动器的引用;要么使用 Git 插件,如 Joey Hess 开发的 [git annex][6],以及 [Git-Media][7] 项目。
你看Git 真的是一个任何人都可以使用的工具。它是你进行文件版本管理的一个强大而且好用的工具,同时它并没有你开始认为的那么可怕。
--------------------------------------------------------------------------------
via: https://opensource.com/resources/what-is-git
作者:[Seth Kenlon][a]
译者:[cvsher](https://github.com/cvsher)
校对:[wxy](https://github.com/wxy)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://opensource.com/users/seth
[1]: https://opensource.com/life/16/2/version-control-isnt-just-programmers
[2]: http://sparkleshare.org/
[3]: https://git.wiki.kernel.org/index.php/InterfacesFrontendsAndTools#Graphical_Interfaces
[4]: https://opensource.com/life/11/11/introduction-kdenlive
[5]: http://inkscape.org/
[6]: https://git-annex.branchable.com/
[7]: https://github.com/alebedev/git-media


@ -1,21 +1,24 @@
Git 入门指南
初步了解 Git
=========================
![](https://opensource.com/sites/default/files/styles/image-full-size/public/images/life/get_started_lead.jpeg?itok=r22AKc6P)
>Image by : opensource.com
在这个系列的介绍中,我们学习到了谁应该使用 Git以及 Git 是用来做什么的。今天,我们将学习如何克隆公共的 Git 仓库,以及如何提取出独立的文件而不用克隆整个仓库。
*图片来源opensource.com*
在这个系列的[介绍篇][4]中,我们学习到了谁应该使用 Git以及 Git 是用来做什么的。今天,我们将学习如何克隆公共 Git 仓库,以及如何提取出独立的文件而不用克隆整个仓库。
由于 Git 如此流行,因而如果你能够至少熟悉一些基础的 Git 知识也能为你的生活带来很多便捷。如果你可以掌握 Git 基础(你可以的,我发誓!),那么你将能够下载任何你需要的东西,甚至还可能做一些贡献作为回馈。毕竟,那就是开源的精髓所在:你拥有获取你使用的软件代码的权利,拥有和他人分享的自由,以及只要你愿意就可以修改它的权利。只要你熟悉了 Git它就可以让这一切都变得很容易。
那么,让我们一起来熟悉 Git 吧。
### 读和写
一般来说,有两种方法可以和 Git 仓库交互:你可以从仓库中读取,或者你也能够向仓库中写入。它就像一个文件:有时候你打开一个文档只是为了阅读它,而其它时候你打开文档是因为你需要做些改动。
本文仅讲解如何从 Git 仓库读取。我们将会在后面的一篇文章中讲解如何向 Git 仓库写回的主题。
### Git 还是 GitHub
一句话澄清Git 不同于 GitHub或 GitLab或 Bitbucket。Git 是一个命令行程序,所以它就像下面这样:
```
@ -31,29 +34,32 @@ usage: Git [--version] [--help] [-C <path>]
我的文章系列将首先教你纯粹的 Git 知识,因为一旦你理解了 Git 在做什么,那么你就无需关心正在使用的前端工具是什么了。然而,我的文章系列也将涵盖通过流行的 Git 服务完成每项任务的常用方法,因为那些将可能是你首先会遇到的。
### 安装 Git
在 Linux 系统上,你可以从所使用的发行版软件仓库中获取并安装 Git。BSD 用户应当在 Ports 树的 devel 部分查找 Git。
对于闭源的操作系统,请前往 [项目网站][1] 并根据说明安装。一旦安装后,在 Linux、BSD 和 Mac OS X 上的命令应当没有任何差别。Windows 用户需要调整 Git 命令,从而和 Windows 文件系统相匹配,或者安装 Cygwin 以原生的方式运行 Git而不受 Windows 文件系统转换问题的羁绊。
对于闭源的操作系统,请前往其[项目官网][1],并根据说明安装。一旦安装后,在 Linux、BSD 和 Mac OS X 上的命令应当没有任何差别。Windows 用户需要调整 Git 命令,从而和 Windows 文件系统相匹配,或者安装 Cygwin 以原生的方式运行 Git而不受 Windows 文件系统转换问题的羁绊。
### Git 下午茶
### 下午茶和 Git
并非每个人都需要立刻将 Git 加入到我们的日常生活中。有些时候,你和 Git 最多的交互就是访问一个代码库,下载一两个文件,然后就不用它了。以这样的方式看待 Git它更像是下午茶而非一次正式的宴会。你进行一些礼节性的交谈获得了需要的信息然后你就会离开至少接下来的三个月你不再想这样说话。
当然,那是可以的。
一般来说,有两种方法访问 Git使用命令行或者使用一种神奇的因特网技术通过 web 浏览器快速轻松地访问。
假设你想要在终端中安装并使用一个回收站,因为你已经被 rm 命令毁掉太多次了。你已经听说过 Trashy 了,它称自己为「理智的 rm 命令媒介」,并且你想在安装它之前阅读它的文档。幸运的是,[Trashy 公开地托管在 GitLab.com][2]。
假设你想要给终端安装一个回收站,因为你已经被 rm 命令毁掉太多次了。你可能听说过 Trashy ,它称自己为「理智的 rm 命令中间人」,也许你想在安装它之前阅读它的文档。幸运的是,[Trashy 公开地托管在 GitLab.com][2]。
### Landgrab
我们工作的第一步是对这个 Git 仓库使用 landgrab 排序方法:我们会克隆这个完整的仓库,然后会根据内容排序。由于该仓库是托管在公共的 Git 服务平台上,所以有两种方式来完成工作:使用命令行,或者使用 web 界面。
要想使用 Git 获取整个仓库,就要使用 git clone 命令和 Git 仓库的 URL 作为参数。如果你不清楚正确的 URL 是什么仓库应该会告诉你的。GitLab 为你提供了 [Trashy][3] 仓库的拷贝-粘贴 URL。
要想使用 Git 获取整个仓库,就要使用 git clone 命令和 Git 仓库的 URL 作为参数。如果你不清楚正确的 URL 是什么仓库应该会告诉你的。GitLab 为你提供了 [Trashy][3] 仓库的用于拷贝粘贴 URL。
![](https://opensource.com/sites/default/files/1_gitlab-url.jpg)
你也许注意到了,在某些服务平台上,会同时提供 SSH 和 HTTPS 链接。只有当你拥有仓库的写权限时,你才可以使用 SSH。否则的话你必须使用 HTTPS URL。
一旦你获得了正确的 URL克隆仓库是非常容易的。就是 git clone 这个 URL 即可,可选项是可以指定要克隆到的目录。默认情况下会将 git 目录克隆到你当前所在的位置;例如,'trashy.git' 表示将仓库克隆到你当前位置的 'trashy' 目录。我使用 .clone 扩展名标记那些只读的仓库,使用 .git 扩展名标记那些我可以读写的仓库,但那无论如何也不是官方要求的。
一旦你获得了正确的 URL克隆仓库是非常容易的。就是 git clone 该 URL 即可,以及一个可选的指定要克隆到的目录。默认情况下会将 git 目录克隆到你当前所在的目录;例如,'trashy.git' 将会克隆到你当前位置的 'trashy' 目录。我使用 .clone 扩展名标记那些只读的仓库,使用 .git 扩展名标记那些我可以读写的仓库,不过这并不是官方要求的。
```
$ git clone https://gitlab.com/trashy/trashy.git trashy.clone
@ -68,30 +74,34 @@ Checking connectivity... done.
一旦成功地克隆了仓库,你就可以像对待你电脑上任何其它目录那样浏览仓库中的文件。
另外一种获得仓库拷贝的方式是使用 web 界面。GitLab 和 GitHub 都会提供一个 .zip 格式的仓库快照文件。GitHub 有一个大的绿色下载按钮,但是在 GitLab 中,可以浏览器的右侧找到并不显眼的下载按钮。
另外一种获得仓库拷贝的方式是使用 web 界面。GitLab 和 GitHub 都会提供一个 .zip 格式的仓库快照文件。GitHub 有一个大的绿色下载按钮,但是在 GitLab 中,可以在浏览器的右侧找到并不显眼的下载按钮。
![](https://opensource.com/sites/default/files/1_gitlab-zip.jpg)
### 挑选和选择
另外一种从 Git 仓库中获取文件的方法是找到你想要的文件,然后把它从仓库中拽出来。只有 web 界面才提供这种方法,本质上来说,你看到的是别人仓库的克隆;你可以把它想象成一个 HTTP 共享目录。
### 仔细挑选
另外一种从 Git 仓库中获取文件的方法是找到你想要的文件,然后把它从仓库中拽出来。只有 web 界面才提供这种方法,本质上来说,你看到的是别人的仓库克隆;你可以把它想象成一个 HTTP 共享目录。
使用这种方法的问题是,你也许会发现某些文件并不存在于原始仓库中,因为完整形式的文件可能只有在执行 make 命令后才能构建,那只有你下载了完整的仓库,阅读了 README 或者 INSTALL 文件,然后运行相关命令之后才会产生。不过,假如你确信文件存在,而你只想进入仓库,获取那个文件,然后离开的话,你就可以那样做。
在 GitLab 和 GitHub 中,单击文件链接,并在 Raw 模式下查看,然后使用你的 web 浏览器的保存功能,例如:在 Firefox 中,文件 > 保存页面为。在一个 GitWeb 仓库中(一些更喜欢自己托管 git 的人使用的私有 git 仓库 web 查看器Raw 查看链接在文件列表视图中。
在 GitLab 和 GitHub 中,单击文件链接,并在 Raw 模式下查看,然后使用你的 web 浏览器的保存功能,例如:在 Firefox 中,文件 \> 保存页面为。在一个 GitWeb 仓库中(这是个某些更喜欢自己托管 git 的人使用的私有 git 仓库 web 查看器Raw 查看链接在文件列表视图中。
![](https://opensource.com/sites/default/files/1_webgit-file.jpg)
### 最佳实践
通常认为,和 Git 交互的正确方式是克隆完整的 Git 仓库。这样认为是有几个原因的。首先,可以使用 git pull 命令轻松地使克隆仓库保持更新,这样你就不必在每次文件改变时就重回 web 站点获得一份全新的拷贝。第二,你碰巧需要做些改进,只要保持仓库整洁,那么你可以非常轻松地向原来的作者提交所做的变更。
现在,可能是时候练习查找感兴趣的 Git 仓库,然后将它们克隆到你的硬盘中了。只要你了解使用终端的基础知识,那就不会太难做到。还不知道终端使用基础吗?那再给多我 5 分钟时间吧。
现在,可能是时候练习查找感兴趣的 Git 仓库,然后将它们克隆到你的硬盘中了。只要你了解使用终端的基础知识,那就不会太难做到。还不知道基本的终端使用方式吗?那再给多我 5 分钟时间吧。
### 终端使用基础
首先要知道的是,所有的文件都有一个路径。这是有道理的;如果我让你在常规的非终端环境下为我打开一个文件,你就要导航到文件在你硬盘的位置,并且直到你找到那个文件,你要浏览一大堆窗口。例如,你也许要点击你的家目录 > 图片 > InktoberSketches > monkey.kra。
在那样的场景下,我们可以说文件 monkeysketch.kra 的路径是:$HOME/图片/InktoberSketches/monkey.kra。
在那样的场景下,文件 monkeysketch.kra 的路径是:$HOME/图片/InktoberSketches/monkey.kra。
在终端中,除非你正在处理一些特殊的系统管理员任务,你的文件路径通常是以 $HOME 开头的(或者,如果你很懒,就使用 ~ 字符),后面紧跟着一些列的文件夹直到文件名自身。
这就和你在 GUI 中点击各种图标直到找到相关的文件或文件夹类似。
如果你想把 Git 仓库克隆到你的文档目录,那么你可以打开一个终端然后运行下面的命令:
@ -100,6 +110,7 @@ Checking connectivity... done.
$ git clone https://gitlab.com/foo/bar.git
$HOME/文档/bar.clone
```
一旦克隆完成,你可以打开一个文件管理器窗口,导航到你的文档文件夹,然后你就会发现 bar.clone 目录正在等待着你访问。
如果你想要更高级点,你或许会在以后再次访问那个仓库,可以尝试使用 git pull 命令来查看项目有没有更新:
@ -111,15 +122,15 @@ bar.clone
$ git pull
```
到目前为止,你需要了解的所有终端命令就是那些了,那就去探索吧。你实践得越多Git 掌握得就越好(孰能生巧那就是游戏的名称至少它教会了你一些基础give or take a vowel
到目前为止,你需要初步了解的所有终端命令就是那些了那就去探索吧。你实践得越多Git 掌握得就越好(熟能生巧),这是重点,也是事情的本质
--------------------------------------------------------------------------------
via: https://opensource.com/life/16/7/stumbling-git
作者:[Seth Kenlon][a]
译者:[译者ID](https://github.com/chrisleegit)
校对:[校对者ID](https://github.com/校对者ID)
译者:[ChrisLeeGit](https://github.com/chrisleegit)
校对:[wxy](https://github.com/wxy)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创翻译,[Linux中国](https://linux.cn/) 荣誉推出
@ -127,4 +138,4 @@ via: https://opensource.com/life/16/7/stumbling-git
[1]: https://git-scm.com/download
[2]: https://gitlab.com/trashy/trashy
[3]: https://gitlab.com/trashy/trashy.git
[4]: https://linux.cn/article-7639-1.html


@ -0,0 +1,60 @@
Vim 起步的五个技巧
=====================================
![](https://opensource.com/sites/default/files/styles/image-full-size/public/images/education/BUSINESS_peloton.png?itok=nuMbW9d3)
多年来,我一直想学 Vim。如今 Vim 是我最喜欢的 Linux 文本编辑器,也是开发者和系统管理者最喜爱的开源工具。我说的学习,指的是真正意义上的学习。想要精通确实很难,所以我只想要达到熟练的水平。我使用了这么多年的 Linux我会的也仅仅只是打开一个文件使用上下左右箭头按键来移动光标切换到插入模式更改一些文本保存然后退出。
但那只是 Vim 最基本的操作。我的技能水平只能让我在终端使用 Vim 修改文本,但是这并没有发挥出任何一点我想象中 Vim 强大的文本处理功能。这样我完全无法让 Vim 发挥出胜过 Pico 和 Nano 的能力。
所以到底为什么要学习 Vim因为我花费了相当多的时间用于编辑文本而且我知道还有很大的效率提升空间。为什么不选择 Emacs或者是更为现代化的编辑器例如 Atom因为 Vim 适合我,至少我有一丁点的使用经验。而且,很重要的一点就是,在我需要处理的系统上很少碰见没有装 Vim 或者它的弱化版Vi的。如果你有强烈的欲望想学习对你来说更给力的 Emacs我希望这些为 Emacs 同类编辑器所写的建议也能对你有所帮助。
花了几周的时间专注提高我的 Vim 使用技巧之后,我想分享的第一个建议就是必须使用它。虽然这看起来就是明知故问的回答,但事实上它比我所预想的计划要困难一些。我的大多数工作是在网页浏览器上进行的,而且每次我需要在浏览器之外打开并编辑一段文本时,就需要避免下意识地打开 Gedit。Gedit 已经放在了我的快速启动栏中,所以第一步就是移除这个快捷方式,然后替换成 Vim 的。
为了更好的学习 Vim我尝试了很多。如果你也正想学习以下列举了一些作为推荐。
### Vimtutor
通常,最好的学习方法就是从使用应用本身开始。有一个叫 Vimtutor 的小程序,它会在你编辑文本的过程中辅导你学习基础知识,它向我展示了很多我这些年都忽视的基础命令。Vimtutor 一般在有 Vim 的地方都能找到;如果你的系统上没有,也可以很容易地从包管理器安装。
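Vimtutor 通常会随 Vim 一起提供。以 Debian/Ubuntu 系的发行版为例(其他发行版请换用对应的包管理器):
```
sudo apt install vim    # vimtutor 随 vim 软件包一起安装
vimtutor                # 启动交互式入门教程
```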
### GVim
我知道并不是每个人都认同这个,但就是它让我从使用终端中的 Vim 转战到使用 GVim 来满足我基本编辑需求。反对者表示 GVim 鼓励使用鼠标,而 Vim 主要是为键盘党设计的。但是我能通过 GVim 的下拉菜单快速找到想找的指令,并且 GVim 可以提醒我正确的指令然后通过敲键盘执行它。努力学习一个新的编辑器然后陷入无法解决的困境,这种感觉并不好受。每隔几分钟读一下 man 出来的文字或者使用搜索引擎来提醒你该用的按键序列也并不是最好的学习新事物的方法。
### 键盘表
当我转战 GVim我发现有一个键盘的“速查表”来提醒我最基础的按键很是便利。网上有很多这种可用的表你可以下载、打印然后贴在你身边的某一处地方。但是为了我的笔记本键盘我选择买一沓便签纸。这些便签纸在美国不到 10 美元,当我使用键盘编辑文本,尝试新的命令的时候,可以随时提醒我。
### Vimium
上文提到,我的工作都在浏览器上进行。其中一条我觉得很有帮助的建议就是,使用 [Vimium][1] 来增强使用 Vim 的体验。Vimium 是 Chrome 浏览器上的一个开源插件,能用 Vim 的指令快捷操作 Chrome。我发现我只用了几次使用快捷键切换上下文就好像比之前更熟悉这些快捷键了。同样的扩展 Firefox 上也有,例如 [Vimperator][2]。
### 其它人
毫无疑问,最好的学习方法就是求助于在你之前探索过的人,让他给你建议、反馈和解决方法。
如果你住在一个大城市,那么附近可能会有一个 Vim meetup 小组,或者还有 Freenode IRC 上的 #vim 频道。#vim 频道是 Freenode 上最活跃的频道之一,那上面可以针对你个人的问题来提供帮助。听上面的人发发牢骚或者看看别人尝试解决自己没有遇到过的问题,仅仅是这样我都觉得很有趣。
------
那么,现在怎么样了?到现在为止还不错。为它所花的时间是否值得,就看之后它为你节省了多少时间。但是当我发现一个新的按键序列可以用来跳过词,或者一些相似的小技巧,我经常会收获意外的惊喜与快乐。每天我都能看到一点点的回报,它们正在逐渐配得上当初的付出。
学习 Vim 并不仅仅只有这些建议,还有很多。我很喜欢指引别人去 [Vim Adventures][3],它是一种使用 Vim 按键方式进行移动的在线游戏。而就在前几天,我在 [Vimgifs.com][4] 发现了一个非常神奇的可视化学习工具,那可能就是你真正想要的:用一个个小小的 gif 动图来演示 Vim 的操作。
你有花时间学习 Vim 吗?或者是任何需要大量键盘操作的程序?那些经过你努力后掌握的工具,你认为这些努力值得吗?效率的提高有没有达到你的预期?在下面的评论区分享你们的故事吧。
--------------------------------------------------------------------------------
via: https://opensource.com/life/16/7/tips-getting-started-vim
作者:[Jason Baker][a]
译者:[maywanting](https://github.com/maywanting)
校对:[wxy](https://github.com/wxy)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://opensource.com/users/jason-baker
[1]: https://github.com/philc/vimium
[2]: http://www.vimperator.org/
[3]: http://vim-adventures.com/
[4]: http://vimgifs.com/


@ -1,4 +1,4 @@
Part 1 - LXD 2.0: LXD 入门
LXD 2.0 系列(一):LXD 入门
======================================
这是 [LXD 2.0 系列介绍文章][1]的第一篇。
@ -20,12 +20,11 @@ LXD 最主要的目标就是使用 Linux 容器而不是硬件虚拟化向用户
LXD 聚焦于系统容器,通常也被称为架构容器。这就是说 LXD 容器实际上如在裸机或虚拟机上运行一般运行了一个完整的 Linux 操作系统。
这些容器一般基于一个干净的发布镜像并会长时间运行。传统的配置管理工具和部署工具可以如在虚拟机、云和物理机器上一样与 LXD 一起使用。
这些容器一般基于一个干净的发布镜像并会长时间运行。传统的配置管理工具和部署工具可以如在虚拟机、云实例和物理机器上一样与 LXD 一起使用。
相对的, Docker 关注于短期的、无状态的最小容器,这些容器通常并不会升级或者重新配置,而是作为一个整体被替换掉。这就使得 Docker 及类似项目更像是一种软件发布机制,而不是一个机器管理工具。
这两种模型并不是完全互斥的。你完全可以使用 LXD 为你的用户提供一个完整的 Linux 系统,而他们可以在 LXD 内安装 Docker 来运行他们想要的软件。
相对的, Docker 关注于短期的、无状态的、最小化的容器,这些容器通常并不会升级或者重新配置,而是作为一个整体被替换掉。这就使得 Docker 及类似项目更像是一种软件发布机制,而不是一个机器管理工具。
这两种模型并不是完全互斥的。你完全可以使用 LXD 为你的用户提供一个完整的 Linux 系统,然后他们可以在 LXD 内安装 Docker 来运行他们想要的软件。
#### 为什么要用 LXD?
@ -35,56 +34,55 @@ LXD 聚焦于系统容器,通常也被称为架构容器。这就是说 LXD
我们把 LXD 作为解决这些缺陷的一个很好的机会。作为一个长时间运行的守护进程, LXD 可以绕开 LXC 的许多限制,比如动态资源限制、无法进行容器迁移和高效的在线迁移;同时,它也为创造新的默认体验提供了机会:默认开启安全特性,对用户更加友好。
### LXD 的主要组件
LXD 是由几个主要组件构成的,这些组件都 LXD 目录结构、命令行客户端和 API 结构体里下可见的
LXD 是由几个主要组件构成的,这些组件都出现在 LXD 目录结构、命令行客户端和 API 结构体里。
#### 容器
LXD 中的容器包括以下及部分:
- 根文件系统
- 根文件系统rootfs
- 配置选项列表,包括资源限制、环境、安全选项等等
- 设备包括磁盘、unix 字符/块设备、网络接口
- 一组继承而来的容器配置文件
- 属性(容器架构,暂时的或持久的,容器名)
- 运行时状态(当时为了记录检查点、恢复时到了 CRIU时
- 属性(容器架构、暂时的还是持久的、容器名)
- 运行时状态(当用 CRIU 来中断/恢复时)
#### 快照
容器快照和容器是一回事,只不过快照是不可修改的,只能被重命名,销毁或者用来恢复系统,但是无论如何都不能被修改。
值得注意的是,因为我们允许用户保存容器的运行时状态,这就有效的为我们提供了“有状态”的快照的功能。这就是说我们可以使用快照回滚容器的 CPU 和内存。
值得注意的是,因为我们允许用户保存容器的运行时状态,这就有效的为我们提供了“有状态”的快照的功能。这就是说我们可以使用快照回滚容器的状态,包括快照当时的 CPU 和内存状态
#### 镜像
LXD 是基于镜像实现的,所有的 LXD 容器都是来自于镜像。容器镜像通常是一些纯净的 Linux 发行版的镜像,类似于你们在虚拟机和云实例上使用的镜像。
所以可以「发布」容器:使用容器制作一个镜像并在本地或者远程 LXD 主机上使用。
所以可以「发布」一个容器:使用容器制作一个镜像并在本地或者远程 LXD 主机上使用。
镜像通常使用全部或部分 sha256 哈希码来区分。因为输入长长的哈希码对用户来说不,所以镜像可以使用几个自身的属性来区分,这就使得用户在镜像商店里方便搜索镜像。别名也可以用来 1 对 1 地把对用户友好的名字映射到某个镜像的哈希码
镜像通常使用全部或部分 sha256 哈希码来区分。因为输入长长的哈希码对用户来说不方便,所以镜像可以使用几个自身的属性来区分,这就使得用户在镜像商店里方便搜索镜像。也可以使用别名来一对一地将一个用户好记的名字映射到某个镜像的哈希码上
LXD 安装时已经配置好了三个远程镜像服务器(参见下面的远程一节):
- “ubuntu:” 提供稳定版的 Ubuntu 镜像
- “ubuntu-daily:” 提供每天构建出来的 Ubuntu
- “images” 社区维护的镜像服务器,提供一系列的 Linux 发布版,使用的是上游 LXC 的模板
- “ubuntu”:提供稳定版的 Ubuntu 镜像
- “ubuntu-daily”:提供 Ubuntu 的每日构建镜像
- “images” 社区维护的镜像服务器,提供一系列的其它 Linux 发布版,使用的是上游 LXC 的模板
LXD 守护进程会从镜像上次被使用开始自动缓存远程镜像一段时间(默认是 10 天),超过时限后这些镜像才会失效。
此外, LXD 还会自动更新远程镜像(除非指明不更新),所以本地的镜像会一直是最新版的。
#### 配置
配置文件是一种在一处定义容器配置和容器设备,然后应用到一系列容器的方法。
配置文件是一种在一个地方定义容器配置和容器设备,然后将其应用到一系列容器的方法。
一个容器可以被应用多个配置文件。当构建最终容器配置时(即通常的扩展配置),这些配置文件都会按照他们定义顺序被应用到容器上,当有重名的配置时,新的会覆盖掉旧的。然后本地容器设置会在这些基础上应用,覆盖所有来自配置文件的选项。
一个容器可以被应用多个配置文件。当构建最终容器配置时(即通常的扩展配置),这些配置文件都会按照他们定义顺序被应用到容器上,当有重名的配置键或设备时,新的会覆盖掉旧的。然后本地容器设置会在这些基础上应用,覆盖所有来自配置文件的选项。
LXD 自带两种预配置的配置文件:
- 「 default 」配置是自动应用在所有容器之上,除非用户提供了一系列替代的配置文件。目前这个配置文件只做一件事,为容器定义 eth0 网络设备。
- 「 docker” 」配置是一个允许你在容器里运行 Docker 容器的配置文件。它会要求 LXD 加载一些需要的内核模块以支持容器嵌套并创建一些设备入口
- “default”配置是自动应用在所有容器之上,除非用户提供了一系列替代的配置文件。目前这个配置文件只做一件事,为容器定义 eth0 网络设备。
- “docker”配置是一个允许你在容器里运行 Docker 容器的配置文件。它会要求 LXD 加载一些需要的内核模块以支持容器嵌套并创建一些设备。
#### 远程
@ -92,14 +90,14 @@ LXD 自带两种预配置的配置文件:
默认情况下,我们的命令行客户端会与下面几个预定义的远程服务器通信:
- local默认的远程服务器,使用 UNIX socket 和本地的 LXD 守护进程通信
- ubuntu Ubuntu 镜像服务器,提供稳定版的 Ubuntu 镜像
- ubuntu-daily Ubuntu 镜像服务器,提供每天构建出来的 Ubuntu
- images images.linuxcontainers.org 镜像服务器
- local默认的远程服务器使用 UNIX socket 和本地的 LXD 守护进程通信
- ubuntuUbuntu 镜像服务器,提供稳定版的 Ubuntu 镜像
- ubuntu-dailyUbuntu 镜像服务器,提供 Ubuntu 的每日构建版
- imagesimages.linuxcontainers.org 镜像服务器
所有这些远程服务器的组合都可以在命令行客户端里使用。
你也可以添加任意数量的远程 LXD 主机监听网络。匿名的开放镜像服务器,或者通过认证可以管理远程容器的镜像服务器,都可以添加进来。
你也可以添加任意数量的远程 LXD 主机,并配置它们监听网络。匿名的开放镜像服务器,或者通过认证可以管理远程容器的镜像服务器,都可以添加进来。
It is this remote mechanism that makes it possible to interact with remote image servers and to copy or move containers between hosts.
@ -107,30 +105,29 @@ LXD ships with two pre-configured profiles:
One of the core design requirements for LXD was to make containers as secure as possible without having to modify a modern Linux distribution running inside them.
The main security features LXD uses, through the LXC library, are:
- Kernel namespaces, especially the user namespace, which keeps everything in the container fully separate from the rest of the system. LXD uses user namespaces by default (unlike LXC) and lets users turn them off on a per-container basis when needed (marking the container "privileged").
- Seccomp system call filtering, to isolate potentially dangerous system calls.
- AppArmor, which provides additional restrictions on mounts, sockets, ptrace and file access, and in particular restricts cross-container communication.
- Capabilities, to prevent the container from loading kernel modules, altering the host system time, and so on.
- CGroups, to restrict resource usage and prevent DoS attacks against the host.
Rather than exposing these features directly the way LXC does, LXD built a new configuration language that abstracts most of them away in a much more user friendly way. For example, a user can tell LXD to pass a host device into a container without having to look up the device's major/minor numbers and update the CGroup policy by hand.
Communication with LXD itself happens over links protected by TLS 1.2, restricted to a small set of allowed ciphers. When talking to hosts that aren't covered by the system certificate authority, LXD prompts the user to validate the remote fingerprint (SSH style), then caches the fingerprint for later use.
### The REST API
Everything LXD does goes through its REST API. There is no other communication channel between the client and the daemon.
The REST API can be reached over a local unix socket, which only requires group membership for authentication, or over an HTTPS socket, which uses client certificates for authentication.
The structure of the REST API matches the components described above, making it simple and intuitive to use.
When a more complex communication mechanism is required, LXD negotiates websockets and uses those for the rest of the exchange. This is used for interactive terminal sessions, container migration and event notifications.
LXD 2.0 ships with the stable 1.0 API. Although we may add extra features to the 1.0 API, this won't break backward compatibility at the 1.0 endpoint, because we declare additional API extensions that clients can look for to discover the new features.
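As a small illustration, the 1.0 endpoint can be queried directly over the unix socket. This sketch assumes the default LXD 2.0 socket path on Ubuntu and a curl build with unix socket support:

```
# Query the stable 1.0 API endpoint; the reply includes the list of api_extensions.
curl --unix-socket /var/lib/lxd/unix.socket http://lxd/1.0
```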
### Containers at scale

View File

@ -0,0 +1,86 @@
Android vs. iPhone: Pros and Cons
===================================
>When comparing Android vs. iPhone, clearly Android has certain advantages even as the iPhone is superior in some key ways. But ultimately, which is better?
The question of Android vs. iPhone is a personal one.
Take myself, for example. I'm someone who has used both Android and the iPhone iOS. I'm well aware of the strengths of both platforms along with their weaknesses. Because of this, I decided to share my perspective regarding these two mobile platforms. Additionally, we'll take a look at my impressions of the new Ubuntu mobile platform and where it stacks up.
### What iPhone gets right
Even though I'm a full time Android user these days, I do recognize the areas where the iPhone got it right. First, Apple has a better record in updating their devices. This is especially true for older devices running iOS. With Android, if it's not a “Google blessed” Nexus...it better be a higher end carrier supported phone. Otherwise, you're going to find updates are either sparse or non-existent.
Another area where the iPhone does well is apps availability. Expanding on that: iPhone apps almost always have a cleaner look to them. This isn't to say that Android apps are ugly, rather, they may not have an expected flow and consistency found with iOS. Two examples of exclusivity and great iOS-only layout would have to be [Dark Sky][1] (weather) and [Facebook Paper][2].
Then there is the backup process. Android can, by default, back stuff up to Google. But that doesn't help much with application data! By contrast, iCloud can essentially make a full backup of your iOS device.
### Where iPhone loses me
The biggest indisputable issue I have with the iPhone is more of a hardware limitation than a software one. That issue is storage.
Look, with most Android phones, I can buy a smaller capacity phone and then add an SD card later. This does two things: First, I can use the SD card to store a lot of media files. Second, I can even use the SD card to store "some" of my apps. Apple has nothing that will touch this.
Another area where the iPhone loses me is in the lack of choice it provides. Backing up your device? Hope you like iTunes or iCloud. For someone like myself who uses Linux, this means my ONLY option would be to use iCloud.
To be ultimately fair, there are additional solutions for your iPhone if you're willing to jailbreak it. But that's not what this article is about. Same goes for rooting Android. This article is addressing a vanilla setup for both platforms.
Finally, let us not forget this little treat [iTunes decides to delete a user's music][3] because it was seen as a duplication of Apple Music contents...or something along those lines. Not iPhone specific? I disagree, as that music would have very well ended up onto the iPhone at some point. I can say with great certainty that in no universe would I ever put up with this kind of nonsense!
![](http://www.datamation.com/imagesvr_ce/5552/mobile-abstract-icon-200x150.jpg)
>The Android vs. iPhone debate depends on what features matter the most to you.
### What Android gets right
The biggest thing Android gives me that the iPhone doesn't: choice. Choices in applications, devices and overall layout of how my phone works.
I love desktop widgets! To iPhone users, they may seem really silly. But I can tell you that they save me from opening up applications as I can see the desired data without the extra hassle. Another similar feature I love is being able to install custom launchers instead of my phone's default!
Finally, I can utilize tools like [Airdroid][4] and [Tasker][5] to add full computer-like functionality to my smart phone. Airdroid allows me to treat my Android phone like a computer, with file management and SMS; with my mouse and keyboard this becomes a breeze to use. Tasker is awesome in that I can set up "recipes" to connect/disconnect, put my phone into meeting mode, or even put it into power saving mode when I set the parameters to do so. I can even set it to launch applications when I arrive at specific destinations.
### Where Android loses me
Backup options are limited to specific user data, not a full clone of your phone. Without rooting, you're either left out in the wind or you must look to the Android SDK for solutions. Expecting casual users to either root their phone or run the SDK for a complete (I mean everything) Android backup is a joke.
Yes, Google's backup service will backup Google app data, along with other related customizations. But it's nowhere near as complete as what we see with the iPhone. To accomplish something similar to what the iPhone enjoys, I've found you're going to either be rooting your Android phone or connecting it to a Windows PC to utilize some random program.
To be fair, however, I believe Nexus owners benefit from a [full backup service][6] that is device specific. Sorry, but Google's default backup is not cutting it. The same applies to adb backups via your PC - they don't always restore things as expected.
Wait, it gets better. Now, after a lot of letdowns and frustration, I found that there was one app that looked like it "might" offer a glimmer of hope: it's called Helium. Unlike other applications I found to be misleading and frustrating with their limitations, [Helium][7] initially looked like the backup application Google should have been offering all along -- emphasis on "looked like." Sadly, it was a huge letdown. Not only did I need to connect it to my computer for a first run, it didn't even work using their provided Linux script. After removing their script, I settled for a good old fashioned adb backup...to my Linux PC. Fun facts: you will need to turn on a laundry list of stuff in developer tools, and if you run the Twilight app, that needs to be turned off. It took me a bit to figure this out when the backup option for adb on my phone wasn't responding.
At the end of the day, Android has ample options for non-rooted users to back up superficial stuff like contacts, SMS and other data easily. But a deep-down phone backup is best left to a wired connection and adb, in my experience.
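For reference, here's roughly what that wired adb workflow looks like (USB debugging must be enabled in developer options, and exact flag support varies by device):

```
# Create a full backup, including apks and shared storage, to a local file.
adb backup -apk -shared -all -f backup.ab

# Restore that backup later.
adb restore backup.ab
```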
### Ubuntu will save us?
With the good and the bad examined between the two major players in the mobile space, there's a lot of hope that we're going to see good things from Ubuntu on the mobile front. Well, thus far, it's been pretty lackluster.
I like what the developers are doing with the OS and I certainly love the idea of a third option for mobile besides iPhone and Android. Unfortunately, though, it's not that popular on the phone and the tablet received a lot of bad press due to subpar hardware and a lousy demonstration that made its way onto YouTube.
To be fair, I've had subpar experiences with iPhone and Android, too, in the past. So this isn't a dig on Ubuntu. But until it starts showing up with a ready to go ecosystem of functionality that matches what Android and iOS offer, it's not something I'm terribly interested in yet. At a later date, perhaps, I'll feel like the Ubuntu phones are ready to meet my needs.
### Android vs. iPhone bottom line: Why Android wins long term
Despite its painful shortcomings, Android treats me like an adult. It doesn't lock me into only two methods for backing up my data. Yes, some of Android's limitations are due to the fact that it's focused on letting me choose how to handle my data. But, I also get to choose my own device, add storage on a whim. Android enables me to do a lot of cool stuff that the iPhone simply isn't capable of doing.
At its core, Android gives non-root users greater access to the phone's functionality. For better or worse, it's a level of freedom that I think people are gravitating towards. Now there are going to be many of you who swear by the iPhone thanks to efforts like the [libimobiledevice][8] project. But take a long hard look at all the stuff Apple blocks Linux users from doing...then ask yourself is it really worth it as a Linux user? Hit the Comments, share your thoughts on Android, iPhone or Ubuntu.
------------------------------------------------------------------------------
via: http://www.datamation.com/mobile-wireless/android-vs.-iphone-pros-and-cons.html
Author: [Matt Hartley][a]
Translator: [译者ID](https://github.com/译者ID)
Proofreader: [校对者ID](https://github.com/校对者ID)
This article was originally translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)
[a]: http://www.datamation.com/author/Matt-Hartley-3080.html
[1]: http://darkskyapp.com/
[2]: https://www.facebook.com/paper/
[3]: https://blog.vellumatlanta.com/2016/05/04/apple-stole-my-music-no-seriously/
[4]: https://www.airdroid.com/
[5]: http://tasker.dinglisch.net/
[6]: https://support.google.com/nexus/answer/2819582?hl=en
[7]: https://play.google.com/store/apps/details?id=com.koushikdutta.backup&hl=en
[8]: http://www.libimobiledevice.org/

View File

@ -1,3 +1,5 @@
MikeCoder Translating...
What containers and unikernels can learn from Arduino and Raspberry Pi
==========================================================================

View File

@ -1,3 +1,4 @@
chenxinlong translating
Who needs a GUI? How to live in a Linux terminal
=================================================
@ -84,7 +85,7 @@ LibreOffice, Google Slides or, gasp, PowerPoint. I spend a lot of time in presen
via: http://www.networkworld.com/article/3091139/linux/who-needs-a-gui-how-to-live-in-a-linux-terminal.html#slide1
Author: [Bryan Lunduke][a]
Translator: [chenxinlong](https://github.com/chenxinlong)
Proofreader: [校对者ID](https://github.com/校对者ID)
This article was originally translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)

View File

@ -1,3 +1,5 @@
translating by maywanting
5 SSH Hardening Tips
======================

View File

@ -0,0 +1,7 @@
Team translation of "Building a data science portfolio: Machine learning project"
Organizer: @选题-oska874
Participating translators: @译者-vim-kakali @译者-Noobfish @译者-zky001 @译者-kokialoves @译者-ideas4u @译者-cposture
Assignment: the original article is split into 6 parts of roughly equal length; participants pick freely, first come first served. For questions, contact @选题-oska874.

View File

@ -0,0 +1,84 @@
@noobfish translating since Aug 2nd, 2016.
>This is the third in a series of posts on how to build a Data Science Portfolio. If you like this and want to know when the next post in the series is released, you can [subscribe at the bottom of the page][1].
Data science companies are increasingly looking at portfolios when making hiring decisions. One of the reasons for this is that a portfolio is the best way to judge someone's real-world skills. The good news for you is that a portfolio is entirely within your control. If you put some work in, you can make a great portfolio that companies are impressed by.
The first step in making a high-quality portfolio is to know what skills to demonstrate. The primary skills that companies want in data scientists, and thus the primary skills they want a portfolio to demonstrate, are:
- Ability to communicate
- Ability to collaborate with others
- Technical competence
- Ability to reason about data
- Motivation and ability to take initiative
Any good portfolio will be composed of multiple projects, each of which may demonstrate 1-2 of the above points. This is the third post in a series that will cover how to make a well-rounded data science portfolio. In this post, we'll cover how to make the second project in your portfolio, and how to build an end to end machine learning project. At the end, you'll have a project that shows your ability to reason about data, and your technical competence. [Here's][2] the completed project if you want to take a look.
### An end to end project
As a data scientist, there are times when you'll be asked to take a dataset and figure out how to [tell a story with it][3]. In times like this, it's important to communicate very well, and walk through your process. Tools like Jupyter notebook, which we used in a previous post, are very good at helping you do this. The expectation here is that the deliverable is a presentation or document summarizing your findings.
However, there are other times when you'll be asked to create a project that has operational value. A project with operational value directly impacts the day-to-day operations of a company, and will be used more than once, and often by multiple people. A task like this might be "create an algorithm to forecast our churn rate", or "create a model that can automatically tag our articles". In cases like this, storytelling is less important than technical competence. You need to be able to take a dataset, understand it, then create a set of scripts that can process that data. It's often important that these scripts run quickly, and use minimal system resources like memory. It's very common that these scripts will be run several times, so the deliverable becomes the scripts themselves, not a presentation. The deliverable is often integrated into operational flows, and may even be user-facing.
The main components of building an end to end project are:
- Understanding the context
- Exploring the data and figuring out the nuances
- Creating a well-structured project, so it's easy to integrate into operational flows
- Writing high-performance code that runs quickly and uses minimal system resources
- Documenting the installation and usage of your code well, so others can use it
In order to effectively create a project of this kind, we'll need to work with multiple files. Using a text editor like [Atom][4], or an IDE like [PyCharm][5] is highly recommended. These tools will allow you to jump between files, and edit files of different types, like markdown files, Python files, and csv files. Structuring your project so it's easy to version control and upload to collaborative coding tools like [Github][6] is also useful.
![](https://www.dataquest.io/blog/images/end_to_end/github.png)
>This project on Github.
We'll use our editing tools along with libraries like [Pandas][7] and [scikit-learn][8] in this post. We'll make extensive use of Pandas [DataFrames][9], which make it easy to read in and work with tabular data in Python.
### Finding good datasets
A good dataset for an end to end portfolio project can be hard to find. [The dataset][10] needs to be sufficiently large that memory and performance constraints come into play. It also needs to potentially be operationally useful. For instance, this dataset, which contains data on the admission criteria, graduation rates, and graduate future earnings for US colleges, would be a great dataset to use to tell a story. However, as you think about the dataset, it becomes clear that there isn't enough nuance to build a good end to end project with it. For example, you could tell someone their potential future earnings if they went to a specific college, but that would be a quick lookup without enough nuance to demonstrate technical competence. You could also figure out if colleges with higher admissions standards tend to have graduates who earn more, but that would be more storytelling than operational.
These memory and performance constraints tend to come into play when you have more than a gigabyte of data, and when you have some nuance to what you want to predict, which involves running algorithms over the dataset.
A good operational dataset enables you to build a set of scripts that transform the data, and answer dynamic questions. A good example would be a dataset of stock prices. You would be able to predict the prices for the next day, and keep feeding new data to the algorithm as the markets closed. This would enable you to make trades, and potentially even profit. This wouldn't be telling a story - it would be adding direct value.
Some good places to find datasets like this are:
- [/r/datasets][11] - a subreddit that has hundreds of interesting datasets.
- [Google Public Datasets][12] - public datasets available through Google BigQuery.
- [Awesome datasets][13] - a list of datasets, hosted on Github.
As you look through these datasets, think about what questions someone might want answered with the dataset, and think if those questions are one-time (“how did housing prices correlate with the S&P 500?”), or ongoing (“can you predict the stock market?”). The key here is to find questions that are ongoing, and require the same code to be run multiple times with different inputs (different data).
For the purposes of this post, we'll look at [Fannie Mae Loan Data][14]. Fannie Mae is a government sponsored enterprise in the US that buys mortgage loans from other lenders. It then bundles these loans up into mortgage-backed securities and resells them. This enables lenders to make more mortgage loans, and creates more liquidity in the market. This theoretically leads to more homeownership, and better loan terms. From a borrower's perspective, things stay largely the same, though.
Fannie Mae releases two types of data: data on loans it acquires, and data on how those loans perform over time. In the ideal case, someone borrows money from a lender, then repays the loan until the balance is zero. However, some borrowers miss multiple payments, which can cause foreclosure. Foreclosure is when the house is seized by the bank because mortgage payments cannot be made. Fannie Mae tracks which loans have missed payments on them, and which loans needed to be foreclosed on. This data is published quarterly, and lags the current date by 1 year. As of this writing, the most recent dataset that's available is from the first quarter of 2015.
Acquisition data, which is published when the loan is acquired by Fannie Mae, contains information on the borrower, including credit score, and information on their loan and home. Performance data, which is published every quarter after the loan is acquired, contains information on the payments being made by the borrower, and the foreclosure status, if any. A loan that is acquired may have dozens of rows in the performance data. A good way to think of this is that the acquisition data tells you that Fannie Mae now controls the loan, and the performance data contains a series of status updates on the loan. One of the status updates may tell us that the loan was foreclosed on during a certain quarter.
![](https://www.dataquest.io/blog/images/end_to_end/foreclosure.jpg)
>A foreclosed home being sold.
### Picking an angle
There are a few directions we could go in with the Fannie Mae dataset. We could:
- Try to predict the sale price of a house after it's foreclosed on.
- Predict the payment history of a borrower.
- Figure out a score for each loan at acquisition time.
The important thing is to stick to a single angle. Trying to focus on too many things at once will make it hard to make an effective project. It's also important to pick an angle that has sufficient nuance. Here are examples of angles without much nuance:
- Figuring out which banks sold loans to Fannie Mae that were foreclosed on the most.
- Figuring out trends in borrower credit scores.
- Exploring which types of homes are foreclosed on most often.
- Exploring the relationship between loan amounts and foreclosure sale prices
All of the above angles are interesting, and would be great if we were focused on storytelling, but aren't great fits for an operational project.
With the Fannie Mae dataset, we'll try to predict whether a loan will be foreclosed on in the future by only using information that was available when the loan was acquired. In effect, we'll create a "score" for any mortgage that will tell us if Fannie Mae should buy it or not. This will give us a nice foundation to build on, and will be a great portfolio piece.

View File

@ -0,0 +1,98 @@
Translating by ideas4u
### Understanding the data
Let's take a quick look at the raw data files. Here are the first few lines of the acquisition data from Q1 2012:
```
100000853384|R|OTHER|4.625|280000|360|02/2012|04/2012|31|31|1|23|801|N|C|SF|1|I|CA|945||FRM|
100003735682|R|SUNTRUST MORTGAGE INC.|3.99|466000|360|01/2012|03/2012|80|80|2|30|794|N|P|SF|1|P|MD|208||FRM|788
100006367485|C|PHH MORTGAGE CORPORATION|4|229000|360|02/2012|04/2012|67|67|2|36|802|N|R|SF|1|P|CA|959||FRM|794
```
Here are the first few lines of the performance data from Q1 2012:
```
100000853384|03/01/2012|OTHER|4.625||0|360|359|03/2042|41860|0|N||||||||||||||||
100000853384|04/01/2012||4.625||1|359|358|03/2042|41860|0|N||||||||||||||||
100000853384|05/01/2012||4.625||2|358|357|03/2042|41860|0|N||||||||||||||||
```
Before starting to code, it's worth taking some time to really understand the data. This is more critical for operational projects, because we won't be interactively exploring the data, and it will be hard to notice certain nuances unless we find them up front. In this case, the first step is to read the materials on the Fannie Mae site:
- [Overview][15]
- [Glossary of useful terms][16]
- [FAQs][17]
- [Columns in the Acquisition and Performance files][18]
- [Sample Acquisition data file][19]
- [Sample Performance data file][20]
After reading through these materials, we know some key facts that will help us:
- There's an Acquisition file and a Performance file for each quarter from the year 2000 to the present. Since the data lags by a year, the most recent data available as of this writing is from 2015.
- The files are in text format, with a pipe (|) as the delimiter.
- The files don't have headers, but we have a list of what each column is.
- All together, the files contain data on 22 million loans.
Because the Performance files contain information on loans acquired in earlier years, loans acquired earlier will have more performance data (i.e., loans acquired in 2014 won't have much performance history yet).
These small pieces of information will save us a lot of time, because we now know how to structure our project and work with the data.
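As a quick sanity check of those facts, here's a minimal sketch that peeks at a few rows with Pandas (the file name is just an example; since the files have no header row, we pass header=None):

```
import pandas as pd

# Read only the first few rows of one pipe-delimited acquisition file.
peek = pd.read_csv("data/Acquisition_2012Q1.txt", sep="|", header=None, nrows=5)
print(peek.shape)
print(peek.head())
```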
### Structuring the project
Before we start downloading and exploring the data, it's important to think about how we'll structure the project. When building an end to end project, our primary goals are:
- Creating a solution that works
- Having a solution that runs quickly and uses minimal resources
- Being easy for others to extend
- Writing code that is easy for others to understand
- Writing as little code as possible
In order to achieve these goals, our project needs to be structured well. A well structured project follows a few principles:
- It separates data files from code files.
- It separates raw data from generated data.
- It has a README.md file that walks people through installing and using the project.
- It has a requirements.txt file that lists all the packages needed to run the project.
- It has a single settings.py file that contains any settings used by the other files.
  - For example, if several Python scripts read the same file, it's useful to have them all import settings and get the file name from one centralized place.
- It has a .gitignore file that prevents large or secret files from being committed.
- It breaks each step of the task into a separate file that can be executed on its own.
  - For example, we'll have one file for reading in the data, one for creating features, and one for making predictions.
- It stores intermediate values; for example, one script may output a file that the next script can read.
  - This lets us make changes in our data processing flow without recalculating everything.
Our file structure will look roughly like this:
```
loan-prediction
├── data
├── processed
├── .gitignore
├── README.md
├── requirements.txt
├── settings.py
```
### Creating the initial files
First, we need to create a loan-prediction folder, and inside it a data folder and a processed folder. The data folder will hold the raw data, and the processed folder will hold all the intermediate computed results.
Next, we create a .gitignore file, which makes sure certain files are ignored by git and never pushed to GitHub. A good example of a file that should be ignored is the .DS_Store file that OSX creates in every folder; a good starting point for .gitignore files is here. We also want to ignore the data files, both because they are very large and because the Fannie Mae terms prevent us from redistributing them, so we should add these two lines to the end of the file:
```
data
processed
```
Here's an example of a .gitignore file for this project.
Next, we need to create README.md, which will help people understand the project. The .md extension indicates that the file is in markdown format. Markdown lets you write plain text while adding whatever fancy formatting you want; here's a guide on markdown. If you upload a file called README.md to GitHub, GitHub will automatically render the markdown and show it to anyone viewing the project.
For now, we just need to put a brief description in README.md:
```
Loan Prediction
-----------------------
Predict whether or not loans acquired by Fannie Mae will go into foreclosure. Fannie Mae acquires loans from other lenders as a way of inducing them to lend more. Fannie Mae releases data on the loans it has acquired and their performance afterwards [here](http://www.fanniemae.com/portal/funding-the-market/data/loan-performance-data.html).
```
Now we can create a requirements.txt file. This will make it easy for other people to install our project. We don't yet know exactly which libraries we'll be using, but the following are a good starting point:
```
pandas
matplotlib
scikit-learn
numpy
ipython
scipy
```
These are the most commonly used libraries for data analysis tasks in Python, and it's reasonable to assume we'll use most of them. [Here][24] is an example requirements file for this project.
After creating requirements.txt, you should install the packages. For this post, we'll be using Python 3. If you don't have Python installed, you should look into [Anaconda][25], a Python installer that also installs all of the packages listed above.
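Assuming you have Python 3 and pip available, installing everything in the file is a single command:

```
pip install -r requirements.txt
```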
Finally, we can just create a blank settings.py file, since our project doesn't have any settings yet.

View File

@ -0,0 +1,194 @@
### Acquiring the data
Once we have the skeleton of our project, we can get the raw data.
Fannie Mae has some restrictions around acquiring the data, so you'll need to sign up for an account. You can find the download page [here][26]. After creating an account, you'll be able to download as few or as many loan data files as you want. The files are in zip format, and are reasonably large after decompression.
For the purposes of this blog post, we'll download everything from Q1 2012 to Q1 2015, inclusive. We'll then need to unzip all of the files. After unzipping the files, remove the original .zip files. At the end, the loan-prediction folder should look something like this:
```
loan-prediction
├── data
│ ├── Acquisition_2012Q1.txt
│ ├── Acquisition_2012Q2.txt
│ ├── Performance_2012Q1.txt
│ ├── Performance_2012Q2.txt
│ └── ...
├── processed
├── .gitignore
├── README.md
├── requirements.txt
├── settings.py
```
After downloading the data, you can use the head and tail shell commands to look at the lines in the files. Do you see any columns that aren't needed? It might be useful to consult the [pdf of column names][27] while doing this.
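For example (using one of the file names from the tree above):

```
# Peek at the first and last few lines of one acquisition file.
head -n 5 data/Acquisition_2012Q1.txt
tail -n 5 data/Acquisition_2012Q1.txt
```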
### Reading in the data
There are two issues that make our data hard to work with right now:
- The acquisition and performance datasets are segmented across multiple files.
- Each file is missing headers.
Before we can get started on working with the data, we'll need to get to the point where we have one file for the acquisition data, and one file for the performance data. Each of the files will need to contain only the columns we care about, and have the proper headers. One wrinkle here is that the performance data is quite large, so we should try to trim some of the columns if we can.
The first step is to add some variables to settings.py, which will contain the paths to our raw data and our processed data. We'll also add a few other settings that will be useful later on:
```
DATA_DIR = "data"
PROCESSED_DIR = "processed"
MINIMUM_TRACKING_QUARTERS = 4
TARGET = "foreclosure_status"
NON_PREDICTORS = [TARGET, "id"]
CV_FOLDS = 3
```
Putting the paths in settings.py will put them in a centralized place and make them easy to change down the line. When referring to the same variables in multiple files, it's easier to put them in a central place than edit them in every file when you want to change them. [Here's][28] an example settings.py file for this project.
The second step is to create a file called assemble.py that will assemble all the pieces into 2 files. When we run python assemble.py, we'll get 2 data files in the processed directory.
We'll then start writing code in assemble.py. We'll first need to define the headers for each file, so we'll need to look at the [pdf of column names][29] and create lists of the columns in each Acquisition and Performance file:
```
HEADERS = {
    "Acquisition": [
        "id",
        "channel",
        "seller",
        "interest_rate",
        "balance",
        "loan_term",
        "origination_date",
        "first_payment_date",
        "ltv",
        "cltv",
        "borrower_count",
        "dti",
        "borrower_credit_score",
        "first_time_homebuyer",
        "loan_purpose",
        "property_type",
        "unit_count",
        "occupancy_status",
        "property_state",
        "zip",
        "insurance_percentage",
        "product_type",
        "co_borrower_credit_score"
    ],
    "Performance": [
        "id",
        "reporting_period",
        "servicer_name",
        "interest_rate",
        "balance",
        "loan_age",
        "months_to_maturity",
        "maturity_date",
        "msa",
        "delinquency_status",
        "modification_flag",
        "zero_balance_code",
        "zero_balance_date",
        "last_paid_installment_date",
        "foreclosure_date",
        "disposition_date",
        "foreclosure_costs",
        "property_repair_costs",
        "recovery_costs",
        "misc_costs",
        "tax_costs",
        "sale_proceeds",
        "credit_enhancement_proceeds",
        "repurchase_proceeds",
        "other_foreclosure_proceeds",
        "non_interest_bearing_balance",
        "principal_forgiveness_balance"
    ]
}
```
The next step is to define the columns we want to keep. Since all we're measuring on an ongoing basis about the loan is whether or not it was ever foreclosed on, we can discard many of the columns in the performance data. We'll need to keep all the columns in the acquisition data, though, because we want to maximize the information we have about when the loan was acquired (after all, we're predicting if the loan will ever be foreclosed or not at the point it's acquired). Discarding columns will enable us to save disk space and memory, while also speeding up our code.
```
SELECT = {
    "Acquisition": HEADERS["Acquisition"],
    "Performance": [
        "id",
        "foreclosure_date"
    ]
}
```
Next, we'll write a function to concatenate the data sets. The below code will:
- Import a few needed libraries, including settings.
- Define a function concatenate, that:
- Gets the names of all the files in the data directory.
- Loops through each file.
- If the file isn't the right type (doesn't start with the prefix we want), we ignore it.
- Reads the file into a [DataFrame][30] with the right settings using the Pandas [read_csv][31] function.
- Sets the separator to | so the fields are read in correctly.
- The data has no header row, so sets header to None to indicate this.
- Sets names to the right value from the HEADERS dictionary - these will be the column names of our DataFrame.
- Picks only the columns from the DataFrame that we added in SELECT.
- Concatenates all the DataFrames together.
- Writes the concatenated DataFrame back to a file.
```
import os
import settings
import pandas as pd

def concatenate(prefix="Acquisition"):
    files = os.listdir(settings.DATA_DIR)
    full = []
    for f in files:
        # Skip files that don't match the prefix we're assembling.
        if not f.startswith(prefix):
            continue
        data = pd.read_csv(os.path.join(settings.DATA_DIR, f), sep="|", header=None, names=HEADERS[prefix], index_col=False)
        # Keep only the columns we selected earlier.
        data = data[SELECT[prefix]]
        full.append(data)
    # Concatenate all the quarterly DataFrames and write the result out.
    full = pd.concat(full, axis=0)
    full.to_csv(os.path.join(settings.PROCESSED_DIR, "{}.txt".format(prefix)), sep="|", header=SELECT[prefix], index=False)
```
We can call the above function twice with the arguments Acquisition and Performance to concatenate all the acquisition and performance files together. The below code will:
- Only execute if the script is called from the command line with python assemble.py.
- Concatenate all the files, and result in two files:
- `processed/Acquisition.txt`
- `processed/Performance.txt`
```
if __name__ == "__main__":
    concatenate("Acquisition")
    concatenate("Performance")
```
We now have a nice, compartmentalized assemble.py that's easy to execute, and easy to build off of. By decomposing the problem into pieces like this, we make it easy to build our project. Instead of one messy script that does everything, we define the data that will pass between the scripts, and make them completely separate from each other. When you're working on larger projects, it's a good idea to do this, because it makes it much easier to change individual pieces without having unexpected consequences on unrelated pieces of the project.
Once we finish the assemble.py script, we can run python assemble.py. You can find the complete assemble.py file [here][32].
This will result in two files in the processed directory:
```
loan-prediction
├── data
│ ├── Acquisition_2012Q1.txt
│ ├── Acquisition_2012Q2.txt
│ ├── Performance_2012Q1.txt
│ ├── Performance_2012Q2.txt
│ └── ...
├── processed
│ ├── Acquisition.txt
│ ├── Performance.txt
├── .gitignore
├── assemble.py
├── README.md
├── requirements.txt
├── settings.py
```

View File

@ -0,0 +1,84 @@
vim-kakali translating
### Computing values from the performance data
The next step we'll take is to calculate some values from processed/Performance.txt. All we want to do is to predict whether or not a property is foreclosed on. To figure this out, we just need to check if the performance data associated with a loan ever has a foreclosure_date. If foreclosure_date is None, then the property was never foreclosed on. In order to avoid including loans with little performance history in our sample, we'll also want to count up how many rows exist in the performance file for each loan. This will let us filter loans without much performance history from our training data.
One way to think of the loan data and the performance data is like this:
![](https://github.com/LCTT/wiki-images/blob/master/TranslateProject/ref_img/001.png)
As you can see above, each row in the Acquisition data can be related to multiple rows in the Performance data. In the Performance data, foreclosure_date will appear in the quarter when the foreclosure happened, so it should be blank prior to that. Some loans are never foreclosed on, so all the rows related to them in the Performance data have foreclosure_date blank.
We need to compute foreclosure_status, which is a Boolean that indicates whether a particular loan id was ever foreclosed on, and performance_count, which is the number of rows in the performance data for each loan id.
There are a few different ways to compute the counts we want:
- We could read in all the performance data, then use the Pandas groupby method on the DataFrame to figure out the number of rows associated with each loan id, and also if the foreclosure_date is ever not None for the id.
- The upside of this method is that it's easy to implement from a syntax perspective.
- The downside is that reading in all 129236094 lines in the data will take a lot of memory, and be extremely slow.
- We could read in all the performance data, then use apply on the acquisition DataFrame to find the counts for each id.
- The upside is that it's easy to conceptualize.
- The downside is that reading in all 129236094 lines in the data will take a lot of memory, and be extremely slow.
- We could iterate over each row in the performance dataset, and keep a separate dictionary of counts.
- The upside is that the dataset doesn't need to be loaded into memory, so it's extremely fast and memory-efficient.
- The downside is that it will take slightly longer to conceptualize and implement, and we need to parse the rows manually.
Loading in all the data will take quite a bit of memory, so let's go with the third option above. All we need to do is to iterate through all the rows in the Performance data, while keeping a dictionary of counts per loan id. In the dictionary, we'll keep track of how many times the id appears in the performance data, as well as if foreclosure_date is ever not None. This will give us foreclosure_status and performance_count.
We'll create a new file called annotate.py, and add in code that will enable us to compute these values. In the below code, we'll:
- Import needed libraries.
- Define a function called count_performance_rows.
- Open processed/Performance.txt. This doesn't read the file into memory, but instead opens a file handler that can be used to read in the file line by line.
- Loop through each line in the file.
- Split the line on the delimiter (|)
- Check if the loan_id is not in the counts dictionary.
- If not, add it to counts.
- Increment performance_count for the given loan_id because we're on a row that contains it.
- If date is not None, then we know that the loan was foreclosed on, so set foreclosure_status appropriately.
```
import os
import settings
import pandas as pd

def count_performance_rows():
    counts = {}
    with open(os.path.join(settings.PROCESSED_DIR, "Performance.txt"), 'r') as f:
        for i, line in enumerate(f):
            if i == 0:
                # Skip header row
                continue
            loan_id, date = line.split("|")
            loan_id = int(loan_id)
            if loan_id not in counts:
                counts[loan_id] = {
                    "foreclosure_status": False,
                    "performance_count": 0
                }
            counts[loan_id]["performance_count"] += 1
            if len(date.strip()) > 0:
                counts[loan_id]["foreclosure_status"] = True
    return counts
```
### Getting the values
Once we create our counts dictionary, we can make a function that will extract values from the dictionary if a loan_id and a key are passed in:
```
def get_performance_summary_value(loan_id, key, counts):
    # dict.get returns the supplied default when loan_id isn't in counts.
    value = counts.get(loan_id, {
        "foreclosure_status": False,
        "performance_count": 0
    })
    return value[key]
```
The above function will return the appropriate value from the counts dictionary, and will enable us to assign a foreclosure_status value and a performance_count value to each row in the Acquisition data. The [get][33] method on dictionaries returns a default value if a key isn't found, so this enables us to return sensible default values if a key isn't found in the counts dictionary.
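Here's a tiny illustration of that behavior, using made-up values:

```
counts = {100003735682: {"foreclosure_status": True, "performance_count": 12}}

# A loan id present in counts returns its stored summary.
print(get_performance_summary_value(100003735682, "performance_count", counts))  # 12

# An unseen loan id falls back to the defaults.
print(get_performance_summary_value(1, "foreclosure_status", counts))  # False
```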

View File

@ -0,0 +1,163 @@
### Annotating the data
We've added some functions to annotate.py, and now we can start working on our data files. We'll need to convert the acquisition data into a training dataset that can be used by a machine learning algorithm. This involves a few things:
- Converting all columns to numeric.
- Filling in any missing values.
- Assigning a performance_count and a foreclosure_status to each row.
- Removing any rows that don't have much performance history (i.e., where performance_count is low).
Several of our columns are strings, which aren't useful for a machine learning algorithm. However, they are really categorical variables, with a handful of different category codes, like R, S, and so on. We can convert these columns to numeric by assigning a number to each category label:
![](https://github.com/LCTT/wiki-images/blob/master/TranslateProject/ref_img/002.png)
Converting the columns this way lets us use them in our machine learning algorithm.
Some of the columns also contain dates (first_payment_date and origination_date). We can split each of these dates into two columns:
![](https://github.com/LCTT/wiki-images/blob/master/TranslateProject/ref_img/003.png)
In the code below, we'll transform the Acquisition data. We'll define a function that:
- Creates a foreclosure_status column in acquisition by looking up the values in the counts dictionary.
- Creates a performance_count column in acquisition by looking up the values in the counts dictionary.
- Converts each of the following string columns to an integer column:
  - channel
  - seller
  - first_time_homebuyer
  - loan_purpose
  - property_type
  - occupancy_status
  - property_state
  - product_type
- Converts first_payment_date and origination_date into two columns each:
  - Splits the column on the forward slash.
  - Assigns the first part of the split to a month column.
  - Assigns the second part of the split to a year column.
  - Deletes the original column.
  - In the end, we'll have first_payment_month, first_payment_year, origination_month, and origination_year.
- Fills all missing values with -1.
```
def annotate(acquisition, counts):
    acquisition["foreclosure_status"] = acquisition["id"].apply(lambda x: get_performance_summary_value(x, "foreclosure_status", counts))
    acquisition["performance_count"] = acquisition["id"].apply(lambda x: get_performance_summary_value(x, "performance_count", counts))
    for column in [
        "channel",
        "seller",
        "first_time_homebuyer",
        "loan_purpose",
        "property_type",
        "occupancy_status",
        "property_state",
        "product_type"
    ]:
        # Convert each string column to numeric category codes.
        acquisition[column] = acquisition[column].astype('category').cat.codes
    for start in ["first_payment", "origination"]:
        column = "{}_date".format(start)
        # Split MM/YYYY dates into separate month and year columns.
        acquisition["{}_year".format(start)] = pd.to_numeric(acquisition[column].str.split('/').str.get(1))
        acquisition["{}_month".format(start)] = pd.to_numeric(acquisition[column].str.split('/').str.get(0))
        del acquisition[column]
    acquisition = acquisition.fillna(-1)
    # Drop loans that don't have enough performance history.
    acquisition = acquisition[acquisition["performance_count"] > settings.MINIMUM_TRACKING_QUARTERS]
    return acquisition
```
### Pulling everything together
We're almost there; we just need to add a bit more code to annotate.py. In the code below, we:
- Define a function to read in the acquisition data.
- Define a function to write the processed data to processed/train.csv.
- If this file is called from the command line, like python annotate.py:
  - Read in the acquisition data.
  - Compute the counts from the performance data, and assign them to counts.
  - Annotate the acquisition DataFrame.
  - Write the annotated data to train.csv.
```
def read():
    acquisition = pd.read_csv(os.path.join(settings.PROCESSED_DIR, "Acquisition.txt"), sep="|")
    return acquisition

def write(acquisition):
    acquisition.to_csv(os.path.join(settings.PROCESSED_DIR, "train.csv"), index=False)

if __name__ == "__main__":
    acquisition = read()
    counts = count_performance_rows()
    acquisition = annotate(acquisition, counts)
    write(acquisition)
```
Once you've made the changes, run python annotate.py to make sure it generates train.csv. You can find the complete annotate.py file [here][34].
The folder should now look like this:
```
loan-prediction
├── data
│ ├── Acquisition_2012Q1.txt
│ ├── Acquisition_2012Q2.txt
│ ├── Performance_2012Q1.txt
│ ├── Performance_2012Q2.txt
│ └── ...
├── processed
│ ├── Acquisition.txt
│ ├── Performance.txt
│ ├── train.csv
├── .gitignore
├── annotate.py
├── assemble.py
├── README.md
├── requirements.txt
├── settings.py
```
### Finding an error metric
We're done generating our training dataset, so now we just need the last step: generating predictions. We'll need to figure out an error metric, as well as how we want to evaluate our data. In this case, there are many more loans that were never foreclosed on than loans that were, so typical accuracy measures don't apply.
If we read in the training data and count the values in the foreclosure_status column, here's what we get:
```
import os
import pandas as pd
import settings
train = pd.read_csv(os.path.join(settings.PROCESSED_DIR, "train.csv"))
train["foreclosure_status"].value_counts()
```
```
False 4635982
True 1585
Name: foreclosure_status, dtype: int64
```
Since so few of the loans were foreclosed on, just checking the percentage of correctly predicted labels would mean we could build a model that predicts False for every row and still score a very high accuracy. Instead, we'll want to use a metric that accounts for this class imbalance and ensures our predictions are actually accurate. We want to avoid too many false negatives, where we predict a loan won't be foreclosed on but it actually is, and too many false positives, where we predict a loan will be foreclosed on but it actually isn't. Of the two, false negatives are more costly for Fannie Mae, because they'd be buying loans on which they might not be able to recoup their investment.
So we'll define a metric: the number of loans where the model predicted no foreclosure but the loan was actually foreclosed on, divided by the total number of loans that were actually foreclosed on. This is the model's false negative rate. Consider this diagram:
![](https://github.com/LCTT/wiki-images/blob/master/TranslateProject/ref_img/004.png)
In the diagram above, 1 loan was predicted not to be foreclosed on, but it actually was. If we divide this by the total number of loans that were actually foreclosed on, 2, we get a false negative rate of 50%. We'll use this as our error metric, so we can evaluate the model's performance.
### Setting up a machine learning classifier
We'll use cross validation to make predictions. With cross validation, we split the data into 3 groups, then do the following:
- Train a model on groups 1 and 2, and use the model to make predictions for group 3.
- Train a model on groups 1 and 3, and use the model to make predictions for group 2.
- Train a model on groups 2 and 3, and use the model to make predictions for group 1.
Splitting the data into groups this way means that we never train a model on the same data we make predictions for, which avoids overfitting. If we overfit, we'd measure a falsely low false negative rate, which would make it hard to improve our algorithm or use it in the real world.
[Scikit-learn][35] has a function called [cross_val_predict][36] that makes it easy to perform cross validation.
We'll also need an algorithm to make predictions with. We need a classifier that can do [binary classification][37]: the target variable, foreclosure_status, has only two values, True and False.
We'll use [logistic regression][38], because it works well for binary classification, runs extremely quickly, and uses little memory. This is due to how the algorithm works: instead of constructing dozens of trees like a random forest, or doing expensive transformations like a support vector machine, logistic regression involves far fewer steps and fewer matrix operations.
We can use the [logistic regression classifier][39] implemented in scikit-learn. The only thing we need to pay attention to is the weight of each class. If we weight the classes equally, the algorithm will predict False for every row, because it is trying to minimize errors; however, we care much more about foreclosures than about loans that aren't foreclosed on. So we'll pass balanced to the class_weight keyword argument of the [LogisticRegression][40] class, to get the algorithm to weight foreclosures more and balance out the different number of occurrences of each class. This ensures the algorithm doesn't predict False for every row.

View File

@ -0,0 +1,156 @@
Translating by cposture 2016-08-02
### Making predictions
Now that we have the preliminaries out of the way, we're ready to make predictions. We'll create a new file called predict.py that will use the train.csv file we created in the last step. The below code will:
- Import needed libraries.
- Create a function called cross_validate that:
- Creates a logistic regression classifier with the right keyword arguments.
- Creates a list of columns that we want to use to train the model, removing id and foreclosure_status.
- Run cross validation across the train DataFrame.
- Return the predictions.
```
import os
import settings
import pandas as pd
from sklearn import cross_validation
from sklearn.linear_model import LogisticRegression
from sklearn import metrics

def cross_validate(train):
    # Balanced class weights keep the model from predicting False for every row.
    clf = LogisticRegression(random_state=1, class_weight="balanced")
    predictors = train.columns.tolist()
    predictors = [p for p in predictors if p not in settings.NON_PREDICTORS]
    # Make cross validated predictions for every row of the training set.
    predictions = cross_validation.cross_val_predict(clf, train[predictors], train[settings.TARGET], cv=settings.CV_FOLDS)
    return predictions
```
### Predicting error
Now, we just need to write a few functions to compute error. The below code will:
- Create a function called compute_error that:
- Uses scikit-learn to compute a simple accuracy score (the percentage of predictions that matched the actual foreclosure_status values).
- Create a function called compute_false_negatives that:
- Combines the target and the predictions into a DataFrame for convenience.
- Finds the false negative rate.
- Create a function called compute_false_positives that:
- Combines the target and the predictions into a DataFrame for convenience.
- Finds the false positive rate.
- Finds the number of loans that weren't foreclosed on, but that the model predicted would be foreclosed on.
- Divides by the total number of loans that weren't foreclosed on.
```
def compute_error(target, predictions):
    return metrics.accuracy_score(target, predictions)

def compute_false_negatives(target, predictions):
    df = pd.DataFrame({"target": target, "predictions": predictions})
    return df[(df["target"] == 1) & (df["predictions"] == 0)].shape[0] / (df[(df["target"] == 1)].shape[0] + 1)

def compute_false_positives(target, predictions):
    df = pd.DataFrame({"target": target, "predictions": predictions})
    return df[(df["target"] == 0) & (df["predictions"] == 1)].shape[0] / (df[(df["target"] == 0)].shape[0] + 1)
```
### Putting it all together
Now, we just have to put the functions together in predict.py. The below code will:
- Read in the dataset.
- Compute cross validated predictions.
- Compute the 3 error metrics above.
- Print the error metrics.
```
def read():
    train = pd.read_csv(os.path.join(settings.PROCESSED_DIR, "train.csv"))
    return train

if __name__ == "__main__":
    train = read()
    predictions = cross_validate(train)
    error = compute_error(train[settings.TARGET], predictions)
    fn = compute_false_negatives(train[settings.TARGET], predictions)
    fp = compute_false_positives(train[settings.TARGET], predictions)
    print("Accuracy Score: {}".format(error))
    print("False Negatives: {}".format(fn))
    print("False Positives: {}".format(fp))
```
Once you've added the code, you can run python predict.py to generate predictions. Running everything shows that our false negative rate is .26, which means that of the foreclosed loans, we missed predicting 26% of them. This is a good start, but could use a lot of improvement!
You can find the complete predict.py file [here][41].
Your file tree should now look like this:
```
loan-prediction
├── data
│ ├── Acquisition_2012Q1.txt
│ ├── Acquisition_2012Q2.txt
│ ├── Performance_2012Q1.txt
│ ├── Performance_2012Q2.txt
│ └── ...
├── processed
│ ├── Acquisition.txt
│ ├── Performance.txt
│ ├── train.csv
├── .gitignore
├── annotate.py
├── assemble.py
├── predict.py
├── README.md
├── requirements.txt
├── settings.py
```
### Writing up a README
Now that we've finished our end to end project, we just have to write up a README.md file so that other people know what we did, and how to replicate it. A typical README.md for a project should include these sections:
- A high level overview of the project, and what the goals are.
- Where to download any needed data or materials.
- Installation instructions.
- How to install the requirements.
- Usage instructions.
- How to run the project.
- What you should see after each step.
- How to contribute to the project.
- Good next steps for extending the project.
[Here's][42] a sample README.md for this project.
### Next steps
Congratulations, you're done making an end to end machine learning project! You can find a complete example project [here][43]. It's a good idea to upload your project to [Github][44] once you've finished it, so others can see it as part of your portfolio.
There are still quite a few angles left to explore with this data. Broadly, we can split them up into 3 categories: extending this project and making it more accurate, finding other columns to predict, and exploring the data. Here are some ideas:
- Generate more features in annotate.py.
- Switch algorithms in predict.py.
- Try using more data from Fannie Mae than we used in this post.
- Add in a way to make predictions on future data. The code we wrote will still work if we add more data, so we can add more past or future data.
- Try seeing if you can predict if a bank should have issued the loan originally (vs if Fannie Mae should have acquired the loan).
- Remove any columns from train that the bank wouldn't have known at the time of issuing the loan.
- Some columns are known when Fannie Mae bought the loan, but not before.
- Make predictions.
- Explore seeing if you can predict columns other than foreclosure_status.
- Can you predict how much the property will be worth at sale time?
- Explore the nuances between performance updates.
- Can you predict how many times the borrower will be late on payments?
- Can you map out the typical loan lifecycle?
- Map out data on a state by state or zip code by zip code level.
- Do you see any interesting patterns?
If you build anything interesting, please let us know in the comments!
If you liked this, you might like to read the other posts in our Build a Data Science Portfolio series:
- [Storytelling with data][45].
- [How to setup up a data science blog][46].

View File

@ -0,0 +1,150 @@
Let's Build A Web Server. Part 1.
=====================================
Out for a walk one day, a woman came across a construction site and saw three men working. She asked the first man, "What are you doing?" Annoyed by the question, the first man barked, "Can't you see that I'm laying bricks?" Not satisfied with the answer, she asked the second man what he was doing. The second man answered, "I'm building a brick wall." Then, turning his attention to the first man, he said, "Hey, you just passed the end of the wall. You need to take off that last brick." Again not satisfied with the answer, she asked the third man what he was doing. And the man said to her while looking up in the sky, "I am building the biggest cathedral this world has ever known." While he was standing there and looking up in the sky the other two men started arguing about the errant brick. The man turned to the first two men and said, "Hey guys, don't worry about that brick. It's an inside wall, it will get plastered over and no one will ever see that brick. Just move on to another layer."1
The moral of the story is that when you know the whole system and understand how different pieces fit together (bricks, walls, cathedral), you can identify and fix problems faster (errant brick).
What does it have to do with creating your own Web server from scratch?
I believe to become a better developer you MUST get a better understanding of the underlying software systems you use on a daily basis and that includes programming languages, compilers and interpreters, databases and operating systems, web servers and web frameworks. And, to get a better and deeper understanding of those systems you MUST re-build them from scratch, brick by brick, wall by wall.
Confucius put it this way:
>“I hear and I forget.”
![](https://ruslanspivak.com/lsbasi-part4/LSBAWS_confucius_hear.png)
>“I see and I remember.”
![](https://ruslanspivak.com/lsbasi-part4/LSBAWS_confucius_see.png)
>“I do and I understand.”
![](https://ruslanspivak.com/lsbasi-part4/LSBAWS_confucius_do.png)
I hope at this point you're convinced that it's a good idea to start re-building different software systems to learn how they work.
In this three-part series I will show you how to build your own basic Web server. Let's get started.
First things first, what is a Web server?
![](https://ruslanspivak.com/lsbaws-part1/LSBAWS_HTTP_request_response.png)
In a nutshell it's a networking server that sits on a physical server (oops, a server on a server) and waits for a client to send a request. When it receives a request, it generates a response and sends it back to the client. The communication between a client and a server happens using the HTTP protocol. A client can be your browser or any other software that speaks HTTP.
What would a very simple implementation of a Web server look like? Here is my take on it. The example is in Python but even if you don't know Python (it's a very easy language to pick up, try it!) you still should be able to understand concepts from the code and explanations below:
```
import socket

HOST, PORT = '', 8888

listen_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listen_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listen_socket.bind((HOST, PORT))
listen_socket.listen(1)
print 'Serving HTTP on port %s ...' % PORT
while True:
    client_connection, client_address = listen_socket.accept()
    request = client_connection.recv(1024)
    print request

    http_response = """\
HTTP/1.1 200 OK

Hello, World!
"""
    client_connection.sendall(http_response)
    client_connection.close()
```
Save the above code as webserver1.py or download it directly from GitHub and run it on the command line like this
```
$ python webserver1.py
Serving HTTP on port 8888 …
```
Now type in the following URL in your Web browser's address bar http://localhost:8888/hello, hit Enter, and see magic in action. You should see "Hello, World!" displayed in your browser like this:
![](https://ruslanspivak.com/lsbaws-part1/browser_hello_world.png)
Just do it, seriously. I will wait for you while you're testing it.
Done? Great. Now let's discuss how it all actually works.
First let's start with the Web address you've entered. It's called a URL and here is its basic structure:
![](https://ruslanspivak.com/lsbaws-part1/LSBAWS_URL_Web_address.png)
This is how you tell your browser the address of the Web server it needs to find and connect to and the page (path) on the server to fetch for you. Before your browser can send an HTTP request though, it first needs to establish a TCP connection with the Web server. Then it sends an HTTP request over the TCP connection to the server and waits for the server to send an HTTP response back. And when your browser receives the response it displays it; in this case it displays "Hello, World!"
Let's explore in more detail how the client and the server establish a TCP connection before sending HTTP requests and responses. To do that they both use so-called sockets. Instead of using a browser directly you are going to simulate your browser manually by using telnet on the command line.
On the same computer where you're running the Web server, fire up a telnet session on the command line, specifying a host to connect to (localhost) and a port to connect to (8888), and then press Enter:
```
$ telnet localhost 8888
Trying 127.0.0.1 …
Connected to localhost.
```
At this point you've established a TCP connection with the server running on your local host and are ready to send and receive HTTP messages. In the picture below you can see a standard procedure a server has to go through to be able to accept new TCP connections.
![](https://ruslanspivak.com/lsbaws-part1/LSBAWS_socket.png)
In the same telnet session type GET /hello HTTP/1.1 and hit Enter:
```
$ telnet localhost 8888
Trying 127.0.0.1 …
Connected to localhost.
GET /hello HTTP/1.1
HTTP/1.1 200 OK
Hello, World!
```
You've just manually simulated your browser! You sent an HTTP request and got an HTTP response back. This is the basic structure of an HTTP request:
![](https://ruslanspivak.com/lsbaws-part1/LSBAWS_HTTP_request_anatomy.png)
The HTTP request consists of the line indicating the HTTP method (GET, because we are asking our server to return us something), the path /hello that indicates a “page” on the server we want and the protocol version.
For simplicity's sake our Web server at this point completely ignores the above request line. You could just as well type in any garbage instead of "GET /hello HTTP/1.1" and you would still get back a "Hello, World!" response.
Once you've typed the request line and hit Enter the client sends the request to the server, the server reads the request line, prints it and returns the proper HTTP response.
Here is the HTTP response that the server sends back to your client (telnet in this case):
![](https://ruslanspivak.com/lsbaws-part1/LSBAWS_HTTP_response_anatomy.png)
Let's dissect it. The response consists of a status line HTTP/1.1 200 OK, followed by a required empty line, and then the HTTP response body.
The response status line HTTP/1.1 200 OK consists of the HTTP Version, the HTTP status code and the HTTP status code reason phrase OK. When the browser gets the response, it displays the body of the response, and that's why you see "Hello, World!" in your browser.
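In plain text, the exact bytes our server sends back look like this (note the required empty line between the status line and the body):

```
HTTP/1.1 200 OK

Hello, World!
```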
And that's the basic model of how a Web server works. To sum it up: The Web server creates a listening socket and starts accepting new connections in a loop. The client initiates a TCP connection and, after successfully establishing it, the client sends an HTTP request to the server and the server responds with an HTTP response that gets displayed to the user. To establish a TCP connection both clients and servers use sockets.
Now you have a very basic working Web server that you can test with your browser or some other HTTP client. As you've seen and hopefully tried, you can also be a human HTTP client too, by using telnet and typing HTTP requests manually.
Here's a question for you: "How do you run a Django application, Flask application, and Pyramid application under your freshly minted Web server without making a single change to the server to accommodate all those different Web frameworks?"
I will show you exactly how in Part 2 of the series. Stay tuned.
BTW, I'm writing a book "Let's Build A Web Server: First Steps" that explains how to write a basic web server from scratch and goes into more detail on the topics I just covered. Subscribe to the mailing list to get the latest updates about the book and the release date.
--------------------------------------------------------------------------------
via: https://ruslanspivak.com/lsbaws-part1/
Author: [Ruslan][a]
Translator: [译者ID](https://github.com/译者ID)
Proofreader: [校对者ID](https://github.com/校对者ID)
This article was originally translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)
[a]: https://linkedin.com/in/ruslanspivak/

View File

@ -0,0 +1,427 @@
Let's Build A Web Server. Part 2.
===================================
Remember, in Part 1 I asked you a question: “How do you run a Django application, Flask application, and Pyramid application under your freshly minted Web server without making a single change to the server to accommodate all those different Web frameworks?” Read on to find out the answer.
In the past, your choice of a Python Web framework would limit your choice of usable Web servers, and vice versa. If the framework and the server were designed to work together, then you were okay:
![](https://ruslanspivak.com/lsbaws-part2/lsbaws_part2_before_wsgi.png)
But you could have been faced (and maybe you were) with the following problem when trying to combine a server and a framework that werent designed to work together:
![](https://ruslanspivak.com/lsbaws-part2/lsbaws_part2_after_wsgi.png)
Basically you had to use what worked together and not what you might have wanted to use.
So, how do you then make sure that you can run your Web server with multiple Web frameworks without making code changes either to the Web server or to the Web frameworks? And the answer to that problem became the Python Web Server Gateway Interface (or WSGI for short, pronounced “wizgy”).
![](https://ruslanspivak.com/lsbaws-part2/lsbaws_part2_wsgi_idea.png)
WSGI allowed developers to separate choice of a Web framework from choice of a Web server. Now you can actually mix and match Web servers and Web frameworks and choose a pairing that suits your needs. You can run Django, Flask, or Pyramid, for example, with Gunicorn or Nginx/uWSGI or Waitress. Real mix and match, thanks to the WSGI support in both servers and frameworks:
![](https://ruslanspivak.com/lsbaws-part2/lsbaws_part2_wsgi_interop.png)
So, WSGI is the answer to the question I asked you in Part 1 and repeated at the beginning of this article. Your Web server must implement the server portion of a WSGI interface and all modern Python Web Frameworks already implement the framework side of the WSGI interface, which allows you to use them with your Web server without ever modifying your servers code to accommodate a particular Web framework.
Now you know that WSGI support by Web servers and Web frameworks allows you to choose a pairing that suits you, but it is also beneficial to server and framework developers because they can focus on their preferred area of specialization and not step on each others toes. Other languages have similar interfaces too: Java, for example, has Servlet API and Ruby has Rack.
Its all good, but I bet you are saying: “Show me the code!” Okay, take a look at this pretty minimalistic WSGI server implementation:
```
# Tested with Python 2.7.9, Linux & Mac OS X
import socket
import StringIO
import sys
class WSGIServer(object):
address_family = socket.AF_INET
socket_type = socket.SOCK_STREAM
request_queue_size = 1
def __init__(self, server_address):
# Create a listening socket
self.listen_socket = listen_socket = socket.socket(
self.address_family,
self.socket_type
)
# Allow to reuse the same address
listen_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
# Bind
listen_socket.bind(server_address)
# Activate
listen_socket.listen(self.request_queue_size)
# Get server host name and port
host, port = self.listen_socket.getsockname()[:2]
self.server_name = socket.getfqdn(host)
self.server_port = port
# Return headers set by Web framework/Web application
self.headers_set = []
def set_app(self, application):
self.application = application
def serve_forever(self):
listen_socket = self.listen_socket
while True:
# New client connection
self.client_connection, client_address = listen_socket.accept()
# Handle one request and close the client connection. Then
# loop over to wait for another client connection
self.handle_one_request()
def handle_one_request(self):
self.request_data = request_data = self.client_connection.recv(1024)
# Print formatted request data a la 'curl -v'
print(''.join(
'< {line}\n'.format(line=line)
for line in request_data.splitlines()
))
self.parse_request(request_data)
# Construct environment dictionary using request data
env = self.get_environ()
# It's time to call our application callable and get
# back a result that will become HTTP response body
result = self.application(env, self.start_response)
# Construct a response and send it back to the client
self.finish_response(result)
def parse_request(self, text):
request_line = text.splitlines()[0]
request_line = request_line.rstrip('\r\n')
# Break down the request line into components
(self.request_method, # GET
self.path, # /hello
self.request_version # HTTP/1.1
) = request_line.split()
def get_environ(self):
env = {}
# The following code snippet does not follow PEP8 conventions
# but it's formatted the way it is for demonstration purposes
# to emphasize the required variables and their values
#
# Required WSGI variables
env['wsgi.version'] = (1, 0)
env['wsgi.url_scheme'] = 'http'
env['wsgi.input'] = StringIO.StringIO(self.request_data)
env['wsgi.errors'] = sys.stderr
env['wsgi.multithread'] = False
env['wsgi.multiprocess'] = False
env['wsgi.run_once'] = False
# Required CGI variables
env['REQUEST_METHOD'] = self.request_method # GET
env['PATH_INFO'] = self.path # /hello
env['SERVER_NAME'] = self.server_name # localhost
env['SERVER_PORT'] = str(self.server_port) # 8888
return env
def start_response(self, status, response_headers, exc_info=None):
# Add necessary server headers
server_headers = [
('Date', 'Tue, 31 Mar 2015 12:54:48 GMT'),
('Server', 'WSGIServer 0.2'),
]
self.headers_set = [status, response_headers + server_headers]
# To adhere to WSGI specification the start_response must return
        # a 'write' callable. For simplicity's sake we'll ignore that detail
# for now.
# return self.finish_response
def finish_response(self, result):
try:
status, response_headers = self.headers_set
response = 'HTTP/1.1 {status}\r\n'.format(status=status)
for header in response_headers:
response += '{0}: {1}\r\n'.format(*header)
response += '\r\n'
for data in result:
response += data
# Print formatted response data a la 'curl -v'
print(''.join(
'> {line}\n'.format(line=line)
for line in response.splitlines()
))
self.client_connection.sendall(response)
finally:
self.client_connection.close()
SERVER_ADDRESS = (HOST, PORT) = '', 8888
def make_server(server_address, application):
server = WSGIServer(server_address)
server.set_app(application)
return server
if __name__ == '__main__':
if len(sys.argv) < 2:
sys.exit('Provide a WSGI application object as module:callable')
app_path = sys.argv[1]
module, application = app_path.split(':')
module = __import__(module)
application = getattr(module, application)
httpd = make_server(SERVER_ADDRESS, application)
print('WSGIServer: Serving HTTP on port {port} ...\n'.format(port=PORT))
httpd.serve_forever()
```
Its definitely bigger than the server code in Part 1, but its also small enough (just under 150 lines) for you to understand without getting bogged down in details. The above server also does more - it can run your basic Web application written with your beloved Web framework, be it Pyramid, Flask, Django, or some other Python WSGI framework.
Dont believe me? Try it and see for yourself. Save the above code as webserver2.py or download it directly from GitHub. If you try to run it without any parameters its going to complain and exit.
```
$ python webserver2.py
Provide a WSGI application object as module:callable
```
It really wants to serve your Web application and thats where the fun begins. To run the server the only thing you need installed is Python. But to run applications written with Pyramid, Flask, and Django you need to install those frameworks first. Lets install all three of them. My preferred method is by using virtualenv. Just follow the steps below to create and activate a virtual environment and then install all three Web frameworks.
```
$ [sudo] pip install virtualenv
$ mkdir ~/envs
$ virtualenv ~/envs/lsbaws/
$ cd ~/envs/lsbaws/
$ ls
bin include lib
$ source bin/activate
(lsbaws) $ pip install pyramid
(lsbaws) $ pip install flask
(lsbaws) $ pip install django
```
At this point you need to create a Web application. Lets start with Pyramid first. Save the following code as pyramidapp.py to the same directory where you saved webserver2.py or download the file directly from GitHub:
```
from pyramid.config import Configurator
from pyramid.response import Response
def hello_world(request):
return Response(
'Hello world from Pyramid!\n',
content_type='text/plain',
)
config = Configurator()
config.add_route('hello', '/hello')
config.add_view(hello_world, route_name='hello')
app = config.make_wsgi_app()
```
Now youre ready to serve your Pyramid application with your very own Web server:
```
(lsbaws) $ python webserver2.py pyramidapp:app
WSGIServer: Serving HTTP on port 8888 ...
```
You just told your server to load the app callable from the python module pyramidapp. Your server is now ready to take requests and forward them to your Pyramid application. The application only handles one route now: the /hello route. Type the http://localhost:8888/hello address into your browser, press Enter, and observe the result:
![](https://ruslanspivak.com/lsbaws-part2/lsbaws_part2_browser_pyramid.png)
You can also test the server on the command line using the curl utility:
```
$ curl -v http://localhost:8888/hello
...
```
Check what the server and curl print to standard output.
Now onto Flask. Lets follow the same steps.
```
from flask import Flask
from flask import Response
flask_app = Flask('flaskapp')
@flask_app.route('/hello')
def hello_world():
return Response(
'Hello world from Flask!\n',
mimetype='text/plain'
)
app = flask_app.wsgi_app
```
Save the above code as flaskapp.py or download it from GitHub and run the server as:
```
(lsbaws) $ python webserver2.py flaskapp:app
WSGIServer: Serving HTTP on port 8888 ...
```
Now type http://localhost:8888/hello into your browser and press Enter:
![](https://ruslanspivak.com/lsbaws-part2/lsbaws_part2_browser_flask.png)
Again, try curl and see for yourself that the server returns a message generated by the Flask application:
```
$ curl -v http://localhost:8888/hello
...
```
Can the server also handle a Django application? Try it out! It's a little bit more involved, though, and I would recommend cloning the whole repo and using djangoapp.py, which is part of the GitHub repository. Here is the source code, which basically adds the Django helloworld project (pre-created using Django's django-admin.py startproject command) to the current Python path and then imports the project's WSGI application.
```
import sys
sys.path.insert(0, './helloworld')
from helloworld import wsgi
app = wsgi.application
```
Save the above code as djangoapp.py and run the Django application with your Web server:
```
(lsbaws) $ python webserver2.py djangoapp:app
WSGIServer: Serving HTTP on port 8888 ...
```
Type the http://localhost:8888/hello address into your browser and press Enter:
![](https://ruslanspivak.com/lsbaws-part2/lsbaws_part2_browser_django.png)
And as youve already done a couple of times before, you can test it on the command line, too, and confirm that its the Django application that handles your requests this time around:
```
$ curl -v http://localhost:8888/hello
...
```
Did you try it? Did you make sure the server works with those three frameworks? If not, then please do so. Reading is important, but this series is about rebuilding and that means you need to get your hands dirty. Go and try it. I will wait for you, dont worry. No seriously, you must try it and, better yet, retype everything yourself and make sure that it works as expected.
Okay, youve experienced the power of WSGI: it allows you to mix and match your Web servers and Web frameworks. WSGI provides a minimal interface between Python Web servers and Python Web Frameworks. Its very simple and its easy to implement on both the server and the framework side. The following code snippet shows the server and the framework side of the interface:
```
def run_application(application):
"""Server code."""
# This is where an application/framework stores
# an HTTP status and HTTP response headers for the server
# to transmit to the client
headers_set = []
# Environment dictionary with WSGI/CGI variables
environ = {}
def start_response(status, response_headers, exc_info=None):
headers_set[:] = [status, response_headers]
    # Server invokes the application callable and gets back the
# response body
result = application(environ, start_response)
# Server builds an HTTP response and transmits it to the client
def app(environ, start_response):
"""A barebones WSGI app."""
start_response('200 OK', [('Content-Type', 'text/plain')])
return ['Hello world!']
run_application(app)
```
Here is how it works:
1. The framework provides an application callable (The WSGI specification doesnt prescribe how that should be implemented)
2. The server invokes the application callable for each request it receives from an HTTP client. It passes a dictionary environ containing WSGI/CGI variables and a start_response callable as arguments to the application callable.
3. The framework/application generates an HTTP status and HTTP response headers and passes them to the start_response callable for the server to store them. The framework/application also returns a response body.
4. The server combines the status, the response headers, and the response body into an HTTP response and transmits it to the client (This step is not part of the specification but its the next logical step in the flow and I added it for clarity)
And here is a visual representation of the interface:
![](https://ruslanspivak.com/lsbaws-part2/lsbaws_part2_wsgi_interface.png)
So far, youve seen the Pyramid, Flask, and Django Web applications and youve seen the server code that implements the server side of the WSGI specification. Youve even seen the barebones WSGI application code snippet that doesnt use any framework.
The thing is that when you write a Web application using one of those frameworks you work at a higher level and dont work with WSGI directly, but I know youre curious about the framework side of the WSGI interface, too because youre reading this article. So, lets create a minimalistic WSGI Web application/Web framework without using Pyramid, Flask, or Django and run it with your server:
```
def app(environ, start_response):
"""A barebones WSGI application.
This is a starting point for your own Web framework :)
"""
status = '200 OK'
response_headers = [('Content-Type', 'text/plain')]
start_response(status, response_headers)
return ['Hello world from a simple WSGI application!\n']
```
Again, save the above code in wsgiapp.py file or download it from GitHub directly and run the application under your Web server as:
```
(lsbaws) $ python webserver2.py wsgiapp:app
WSGIServer: Serving HTTP on port 8888 ...
```
Type http://localhost:8888/hello into your browser and press Enter. This is the result you should see:
![](https://ruslanspivak.com/lsbaws-part2/lsbaws_part2_browser_simple_wsgi_app.png)
You just wrote your very own minimalistic WSGI Web framework while learning about how to create a Web server! Outrageous.
Now, lets get back to what the server transmits to the client. Here is the HTTP response the server generates when you call your Pyramid application using an HTTP client:
![](https://ruslanspivak.com/lsbaws-part2/lsbaws_part2_http_response.png)
The response has some familiar parts that you saw in Part 1 but it also has something new. It has, for example, four HTTP headers that you havent seen before: Content-Type, Content-Length, Date, and Server. Those are the headers that a response from a Web server generally should have. None of them are strictly required, though. The purpose of the headers is to transmit additional information about the HTTP request/response.
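The server in this article adds Date and Server itself in start_response, while Content-Type and Content-Length come from the framework. Purely as a sketch (not part of the article's server), here is how a server that wanted to compute Content-Length on its own could assemble the response:
```
def build_response(status, response_headers, result):
    # Sketch: materialize the body first so Content-Length can be computed,
    # then assemble the status line, headers, a blank line, and the body.
    body = ''.join(result)
    headers = response_headers + [('Content-Length', str(len(body)))]
    lines = ['HTTP/1.1 {status}'.format(status=status)]
    lines.extend('{0}: {1}'.format(name, value) for name, value in headers)
    return '\r\n'.join(lines) + '\r\n\r\n' + body
```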
Now that you know more about the WSGI interface, here is the same HTTP response with some more information about what parts produced it:
![](https://ruslanspivak.com/lsbaws-part2/lsbaws_part2_http_response_explanation.png)
I havent said anything about the environ dictionary yet, but basically its a Python dictionary that must contain certain WSGI and CGI variables prescribed by the WSGI specification. The server takes the values for the dictionary from the HTTP request after parsing the request. This is what the contents of the dictionary look like:
![](https://ruslanspivak.com/lsbaws-part2/lsbaws_part2_environ.png)
A Web framework uses the information from that dictionary to decide which view to use based on the specified route, request method etc., where to read the request body from and where to write errors, if any.
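As a rough sketch, here is what the environ built by get_environ() above would contain for the “GET /hello” request; the host and port values are assumptions matching the local run used throughout this article:
```
import sys
import StringIO  # Python 2, matching the server code above

request_data = 'GET /hello HTTP/1.1\r\nHost: localhost:8888\r\n\r\n'

# A partial illustration of what get_environ() would produce:
environ = {
    # Required WSGI variables
    'wsgi.version':      (1, 0),
    'wsgi.url_scheme':   'http',
    'wsgi.input':        StringIO.StringIO(request_data),
    'wsgi.errors':       sys.stderr,
    'wsgi.multithread':  False,
    'wsgi.multiprocess': False,
    'wsgi.run_once':     False,
    # Required CGI variables
    'REQUEST_METHOD':    'GET',
    'PATH_INFO':         '/hello',
    'SERVER_NAME':       'localhost',
    'SERVER_PORT':       '8888',
}
```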
By now youve created your own WSGI Web server and youve made Web applications written with different Web frameworks. And, youve also created your barebones Web application/Web framework along the way. Its been a heck of a journey. Lets recap what your WSGI Web server has to do to serve requests aimed at a WSGI application:
- First, the server starts and loads an application callable provided by your Web framework/application
- Then, the server reads a request
- Then, the server parses it
- Then, it builds an environ dictionary using the request data
- Then, it calls the application callable with the environ dictionary and a start_response callable as parameters and gets back a response body.
- Then, the server constructs an HTTP response using the data returned by the call to the application object and the status and response headers set by the start_response callable.
- And finally, the server transmits the HTTP response back to the client
![](https://ruslanspivak.com/lsbaws-part2/lsbaws_part2_server_summary.png)
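Condensed into code, and reusing the WSGIServer methods shown earlier, one request's lifecycle looks roughly like this:
```
# A condensed sketch of one request's lifecycle, using the WSGIServer
# methods defined earlier in this article.
def handle_one_request(server):
    server.request_data = data = server.client_connection.recv(1024)  # read the request
    server.parse_request(data)                                        # parse the request line
    env = server.get_environ()                                        # build the environ dict
    result = server.application(env, server.start_response)           # call the app callable
    server.finish_response(result)                                    # build and send the response
```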
Thats about all there is to it. You now have a working WSGI server that can serve basic Web applications written with WSGI compliant Web frameworks like Django, Flask, Pyramid, or your very own WSGI framework. The best part is that the server can be used with multiple Web frameworks without any changes to the server code base. Not bad at all.
Before you go, here is another question for you to think about, “How do you make your server handle more than one request at a time?”
Stay tuned and I will show you a way to do that in Part 3. Cheers!
BTW, Im writing a book “Lets Build A Web Server: First Steps” that explains how to write a basic web server from scratch and goes into more detail on topics I just covered. Subscribe to the mailing list to get the latest updates about the book and the release date.
--------------------------------------------------------------------------------
via: https://ruslanspivak.com/lsbaws-part2/
作者:[Ruslan][a]
译者:[译者ID](https://github.com/译者ID)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://github.com/rspivak/

wyangsun translating
How to build and deploy a Facebook Messenger bot with Python and Flask, a tutorial
==========================================================================
This is my log of how I built a simple Facebook Messenger bot. The functionality is really simple: it's an echo bot that will just print back to the user what they write.
This is something akin to the Hello World example for servers, the echo server.
The goal of the project is not to build the best Messenger bot, but rather to get a feel for what it takes to build a minimal bot and how everything comes together.
- [Tech Stack][1]
- [Bot Architecture][2]
- [The Bot Server][3]
- [Deploying to Heroku][4]
- [Creating the Facebook App][5]
- [Conclusion][6]
### Tech Stack
The tech stack that was used is:
- [Heroku][7] for back end hosting. The free-tier is more than enough for a tutorial of this level. The echo bot does not require any sort of data persistence so a database was not used.
- [Python][8] was the language of choice. The version that was used is 2.7 however it can easily be ported to Python 3 with minor alterations.
- [Flask][9] as the web development framework. Its a very lightweight framework thats perfect for small scale projects/microservices.
- Finally the [Git][10] version control system was used for code maintenance and to deploy to Heroku.
- Worth mentioning: [Virtualenv][11]. This python tool is used to create “environments” clean of python libraries so you can only install the necessary requirements and minimize the app footprint.
### Bot Architecture
Messenger bots are constituted by a server that responds to two types of requests:
- GET requests are used for authentication. They are sent by Messenger with an authentication code that you register on FB.
- POST requests are used for the actual communication. The typical workflow is that Messenger initiates the communication by sending us a POST request with the data of the message sent by the user; we handle it, and send a POST request of our own back. If that one completes successfully (a 200 OK status is returned), we also respond with a 200 OK code to the initial Messenger request.
For this tutorial the app will be hosted on Heroku, which provides a nice and easy interface to deploy apps. As mentioned the free tier will suffice for this tutorial.
After the app has been deployed and is running, well create a Facebook app and link it to our app so that messenger knows where to send the requests that are meant for our bot.
### The Bot Server
The basic server code was taken from the following [Chatbot][12] project by Github user [hult (Magnus Hult)][13], with a few modifications to the code to only echo messages and a couple bugfixes I came across. This is the final version of the server code:
```
from flask import Flask, request
import json
import requests
app = Flask(__name__)
# This needs to be filled with the Page Access Token that will be provided
# by the Facebook App that will be created.
PAT = ''
@app.route('/', methods=['GET'])
def handle_verification():
print "Handling Verification."
if request.args.get('hub.verify_token', '') == 'my_voice_is_my_password_verify_me':
print "Verification successful!"
return request.args.get('hub.challenge', '')
else:
print "Verification failed!"
return 'Error, wrong validation token'
@app.route('/', methods=['POST'])
def handle_messages():
print "Handling Messages"
payload = request.get_data()
print payload
for sender, message in messaging_events(payload):
print "Incoming from %s: %s" % (sender, message)
send_message(PAT, sender, message)
return "ok"
def messaging_events(payload):
"""Generate tuples of (sender_id, message_text) from the
provided payload.
"""
data = json.loads(payload)
messaging_events = data["entry"][0]["messaging"]
for event in messaging_events:
if "message" in event and "text" in event["message"]:
yield event["sender"]["id"], event["message"]["text"].encode('unicode_escape')
else:
yield event["sender"]["id"], "I can't echo this"
def send_message(token, recipient, text):
"""Send the message text to recipient with id recipient.
"""
r = requests.post("https://graph.facebook.com/v2.6/me/messages",
params={"access_token": token},
data=json.dumps({
"recipient": {"id": recipient},
"message": {"text": text.decode('unicode_escape')}
}),
headers={'Content-type': 'application/json'})
if r.status_code != requests.codes.ok:
print r.text
if __name__ == '__main__':
app.run()
```
Lets break down the code. The first part is the imports that will be needed:
```
from flask import Flask, request
import json
import requests
```
Next we define the two functions (using the Flask specific app.route decorators) that will handle the GET and POST requests to our bot.
```
@app.route('/', methods=['GET'])
def handle_verification():
print "Handling Verification."
if request.args.get('hub.verify_token', '') == 'my_voice_is_my_password_verify_me':
print "Verification successful!"
return request.args.get('hub.challenge', '')
else:
print "Verification failed!"
return 'Error, wrong validation token'
```
The verify_token object that is being sent by Messenger will be declared by us when we create the Facebook app. We have to validate that the token we are sent matches the one we declared. Finally, we return the “hub.challenge” back to Messenger.
The function that handles the POST requests is a bit more interesting.
```
@app.route('/', methods=['POST'])
def handle_messages():
print "Handling Messages"
payload = request.get_data()
print payload
for sender, message in messaging_events(payload):
print "Incoming from %s: %s" % (sender, message)
send_message(PAT, sender, message)
return "ok"
```
When called, we grab the message payload, use the function messaging_events to break it down and extract the sender user id and the actual message sent, generating a Python iterator that we can loop over. Notice that in each request sent by Messenger it is possible to have more than one message.
```
def messaging_events(payload):
"""Generate tuples of (sender_id, message_text) from the
provided payload.
"""
data = json.loads(payload)
messaging_events = data["entry"][0]["messaging"]
for event in messaging_events:
if "message" in event and "text" in event["message"]:
yield event["sender"]["id"], event["message"]["text"].encode('unicode_escape')
else:
yield event["sender"]["id"], "I can't echo this"
```
While iterating over each message we call the send_message function, which performs the POST request back to Messenger using the Facebook Graph messages API. During this time we still have not responded to the original Messenger request, which we are blocking. This can lead to timeouts and 5XX errors.
The above was spotted during an outage caused by a bug I came across, which occurred when the user was sending emojis, which are actual unicode ids that Python was mis-encoding. We ended up sending back garbage.
This POST request back to Messenger would never finish, and that in turn would cause 5XX status codes to be returned to the original request, rendering the service unusable.
This was fixed by escaping the messages with `encode('unicode_escape')` and then, just before sending the message back, decoding it with `decode('unicode_escape')`.
```
def send_message(token, recipient, text):
"""Send the message text to recipient with id recipient.
"""
r = requests.post("https://graph.facebook.com/v2.6/me/messages",
params={"access_token": token},
data=json.dumps({
"recipient": {"id": recipient},
"message": {"text": text.decode('unicode_escape')}
}),
headers={'Content-type': 'application/json'})
if r.status_code != requests.codes.ok:
print r.text
```
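As a rough illustration of that escape/unescape round trip (Python 2, matching the code above):
```
# -*- coding: utf-8 -*-
message = u'caf\xe9 \U0001f604'             # incoming text with non-ASCII/emoji
escaped = message.encode('unicode_escape')  # plain ASCII str, safe to handle internally
restored = escaped.decode('unicode_escape') # decoded just before sending back
assert restored == message                  # the round trip is lossless
```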
### Deploying to Heroku
Once the code was built to my liking it was time for the next step.
Deploy the app.
Sure, but how?
I have deployed apps to Heroku before (mainly Rails), however I was always following a tutorial of some sort, so the configuration had already been created. In this case though I had to start from scratch.
Fortunately it was the official [Heroku documentation][14] to the rescue. The article explains nicely the bare minimum required for running an app.
Long story short, what we need besides our code are two files. The first file is the “requirements.txt” file, which is a list of the library dependencies required to run the application.
The second file required is the “Procfile”. This file is there to inform the Heroku how to run our service. Again the bare minimum needed for this file is the following:
>web: gunicorn echoserver:app
Heroku will interpret this as follows: start our app by serving the app object found in the echoserver.py file, using gunicorn as the web server. The reason we are using an additional web server is performance-related and is explained in the Heroku documentation mentioned above:
>Web applications that process incoming HTTP requests concurrently make much more efficient use of dyno resources than web applications that only process one request at a time. Because of this, we recommend using web servers that support concurrent request processing whenever developing and running production services.
>The Django and Flask web frameworks feature convenient built-in web servers, but these blocking servers only process a single request at a time. If you deploy with one of these servers on Heroku, your dyno resources will be underutilized and your application will feel unresponsive.
>Gunicorn is a pure-Python HTTP server for WSGI applications. It allows you to run any Python application concurrently by running multiple Python processes within a single dyno. It provides a perfect balance of performance, flexibility, and configuration simplicity.
Going back to our “requirements.txt” file lets see how it binds with the Virtualenv tool that was mentioned.
At any time, your development machine may have a number of Python libraries installed. When deploying applications you don't want these libraries included, as it makes it hard to work out which ones you actually use.
What Virtualenv does is create a new blank virtual environment so that you can install only the libraries that your app requires.
You can check which libraries are currently installed by running the following command:
```
kostis@KostisMBP ~ $ pip freeze
cycler==0.10.0
Flask==0.10.1
gunicorn==19.6.0
itsdangerous==0.24
Jinja2==2.8
MarkupSafe==0.23
matplotlib==1.5.1
numpy==1.10.4
pyparsing==2.1.0
python-dateutil==2.5.0
pytz==2015.7
requests==2.10.0
scipy==0.17.0
six==1.10.0
virtualenv==15.0.1
Werkzeug==0.11.10
```
Note: The pip tool should already be installed on your machine along with Python.
If not check the [official site][15] for how to install it.
Now let's use Virtualenv to create a new blank environment. First we create a new folder for our project, and cd into it:
```
kostis@KostisMBP projects $ mkdir echoserver
kostis@KostisMBP projects $ cd echoserver/
kostis@KostisMBP echoserver $
```
Now let's create a new environment called echobot. To activate it, run the following source command; checking with pip freeze, we can see that it's now empty.
```
kostis@KostisMBP echoserver $ virtualenv echobot
kostis@KostisMBP echoserver $ source echobot/bin/activate
(echobot) kostis@KostisMBP echoserver $ pip freeze
(echobot) kostis@KostisMBP echoserver $
```
We can start installing the libraries required. The ones we'll need are flask, gunicorn, and requests; with them installed, we create the requirements.txt file:
```
(echobot) kostis@KostisMBP echoserver $ pip install flask
(echobot) kostis@KostisMBP echoserver $ pip install gunicorn
(echobot) kostis@KostisMBP echoserver $ pip install requests
(echobot) kostis@KostisMBP echoserver $ pip freeze
click==6.6
Flask==0.11
gunicorn==19.6.0
itsdangerous==0.24
Jinja2==2.8
MarkupSafe==0.23
requests==2.10.0
Werkzeug==0.11.10
(echobot) kostis@KostisMBP echoserver $ pip freeze > requirements.txt
```
After all the above have been run, we create the echoserver.py file with the python code and the Procfile with the command that was mentioned, and we should end up with the following files/folders:
```
(echobot) kostis@KostisMBP echoserver $ ls
Procfile echobot echoserver.py requirements.txt
```
We are now ready to upload to Heroku. We need to do two things. The first is to install the Heroku toolbelt if it's not already installed on your system (go to [Heroku][16] for details). The second is to create a new Heroku app through the [web interface][17].
Click on the big plus sign on the top right and select “Create new app”.
--------------------------------------------------------------------------------
via: http://tsaprailis.com/2016/06/02/How-to-build-and-deploy-a-Facebook-Messenger-bot-with-Python-and-Flask-a-tutorial/
作者:[Konstantinos Tsaprailis][a]
译者:[译者ID](https://github.com/译者ID)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创翻译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://github.com/kostistsaprailis
[1]: http://tsaprailis.com/2016/06/02/How-to-build-and-deploy-a-Facebook-Messenger-bot-with-Python-and-Flask-a-tutorial/#tech-stack
[2]: http://tsaprailis.com/2016/06/02/How-to-build-and-deploy-a-Facebook-Messenger-bot-with-Python-and-Flask-a-tutorial/#bot-architecture
[3]: http://tsaprailis.com/2016/06/02/How-to-build-and-deploy-a-Facebook-Messenger-bot-with-Python-and-Flask-a-tutorial/#the-bot-server
[4]: http://tsaprailis.com/2016/06/02/How-to-build-and-deploy-a-Facebook-Messenger-bot-with-Python-and-Flask-a-tutorial/#deploying-to-heroku
[5]: http://tsaprailis.com/2016/06/02/How-to-build-and-deploy-a-Facebook-Messenger-bot-with-Python-and-Flask-a-tutorial/#creating-the-facebook-app
[6]: http://tsaprailis.com/2016/06/02/How-to-build-and-deploy-a-Facebook-Messenger-bot-with-Python-and-Flask-a-tutorial/#conclusion
[7]: https://www.heroku.com
[8]: https://www.python.org
[9]: http://flask.pocoo.org
[10]: https://git-scm.com
[11]: https://virtualenv.pypa.io/en/stable
[12]: https://github.com/hult/facebook-chatbot-python
[13]: https://github.com/hult
[14]: https://devcenter.heroku.com/articles/python-gunicorn
[15]: https://pip.pypa.io/en/stable/installing
[16]: https://toolbelt.heroku.com
[17]: https://dashboard.heroku.com/apps

alim0x translating
Implementing Mandatory Access Control with SELinux or AppArmor in Linux
===========================================================================

vim-kakali translating
Scientific Audio Processing, Part I - How to read and write Audio files with Octave 4.0.0 on Ubuntu
================
Octave, the Linux equivalent of Matlab, has a number of functions and commands that allow the acquisition, recording, playback and digital processing of audio signals for entertainment applications, research, medicine, or any other science areas. In this tutorial, we will use Octave V4.0.0 in Ubuntu, starting with reading from audio files, then moving on to writing and playing signals to emulate sounds used in a wide range of activities.
Note that the main focus of this tutorial is not to install or learn to use an audio processing software already established, but rather to understand how it works from the point of view of design and audio engineering.
### Prerequisites
The first step is to install octave. Run the following commands in a terminal to add the Octave PPA in Ubuntu and install the software.
```
sudo apt-add-repository ppa:octave/stable
sudo apt-get update
sudo apt-get install octave
```
### Step 1: Opening Octave.
In this step, we open the software by clicking on its icon; we can change the working directory by clicking on the File Browser dropdown.
![](https://www.howtoforge.com/images/how-to-read-and-write-audio-files-with-octave-4-in-ubuntu/initial.png)
### Step 2: Audio Info
The command "audioinfo" shows us relevant information about the audio file that we will process.
```
>> info = audioinfo ('testing.ogg')
```
![](https://www.howtoforge.com/images/how-to-read-and-write-audio-files-with-octave-4-in-ubuntu/audioinfo.png)
### Step 3: Reading an audio File
In this tutorial I will read and use ogg files, for which it is feasible to read characteristics like sampling rate, audio type (stereo or mono), number of channels, etc. I should mention that for the purposes of this tutorial, all the commands used will be executed in the terminal window of Octave. First, we have to save the ogg file in a variable. Note: it's important that the file be in the working path of Octave.
```
>> file='yourfile.ogg'
```
```
>> [M, fs] = audioread(file)
```
Where M is a matrix of one or two columns, depending on the number of channels and fs is the sampling frequency.
![](https://www.howtoforge.com/images/how-to-read-and-write-audio-files-with-octave-4-in-ubuntu/reading.png)
![](https://www.howtoforge.com/images/how-to-read-and-write-audio-files-with-octave-4-in-ubuntu/matrix.png)
![](https://www.howtoforge.com/images/how-to-read-and-write-audio-files-with-octave-4-in-ubuntu/big/frequency.png)
There are some options that we can use for reading audio files, such as:
```
>> [y, fs] = audioread (filename, samples)
>> [y, fs] = audioread (filename, datatype)
>> [y, fs] = audioread (filename, samples, datatype)
```
Where samples specifies starting and ending frames and datatype specifies the data type to return. We can assign values to any variable:
```
>> samples = [1, fs]
>> [y, fs] = audioread (filename, samples)
```
And about datatype:
```
>> [y,Fs] = audioread(filename,'native')
```
If the value is 'native' then the type of data depends on how the data is stored in the audio file.
### Step 4: Writing an audio file
Creating the ogg file:
For this purpose, we are going to generate an ogg file with values from a cosine. The sampling frequency that I will use is 44100 samples per second and the file will last for 10 seconds. The frequency of the cosine signal is 440 Hz.
```
>> filename='cosine.ogg';
>> fs=44100;
>> t=0:1/fs:10;
>> w=2*pi*440*t;
>> signal=cos(w);
>> audiowrite(filename, signal, fs);
```
This creates a file named 'cosine.ogg' in our workspace that contains the cosine signal.
![](https://www.howtoforge.com/images/how-to-read-and-write-audio-files-with-octave-4-in-ubuntu/cosinefile.png)
If we play the 'cosine.ogg' file then this will reproduce a 440Hz tone which is equivalent to an 'A' musical tone. If we want to see the values saved in the file we have to 'read' the file with the 'audioread' function. In a further tutorial, we will see how to write an audio file with two channels.
### Step 5: Playing an audio file
Octave, by default, has an audio player that we can use for testing purposes. Use the following functions as an example:
```
>> [y,fs]=audioread('yourfile.ogg');
>> player=audioplayer(y, fs, 8)
scalar structure containing the fields:
BitsPerSample = 8
CurrentSample = 0
DeviceID = -1
NumberOfChannels = 1
Running = off
SampleRate = 44100
TotalSamples = 236473
Tag =
Type = audioplayer
UserData = [](0x0)
>> play(player);
```
In the next parts of the tutorial, we will see advanced audio processing features and possible use cases for scientific and commercial use.
--------------------------------------------------------------------------------
via: https://www.howtoforge.com/tutorial/how-to-read-and-write-audio-files-with-octave-4-in-ubuntu/
作者:[David Duarte][a]
译者:[译者ID](https://github.com/译者ID)
校对:[校对ID](https://github.com/校对ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://twitter.com/intent/follow?original_referer=https%3A%2F%2Fwww.howtoforge.com%2Ftutorial%2Fhow-to-read-and-write-audio-files-with-octave-4-in-ubuntu%2F&ref_src=twsrc%5Etfw&region=follow_link&screen_name=howtoforgecom&tw_p=followbutton

An Introduction to Mocking in Python
=====================================
**How to Run Unit Tests Without Testing Your Patience**
More often than not, the software we write directly interacts with what we would label as “dirty” services. In layman's terms: services that are crucial to our application, but whose interactions have intended but undesired side-effects—that is, undesired in the context of an autonomous test run. For example: perhaps we're writing a social app and want to test out our new “Post to Facebook” feature, but don't want to actually post to Facebook every time we run our test suite.
The Python unittest library includes a subpackage named unittest.mock—or if you declare it as a dependency, simply mock—which provides extremely powerful and useful means by which to mock and stub out these undesired side-effects.
>Source | <http://www.toptal.com/python/an-introduction-to-mocking-in-python>
Note: mock is [newly included][1] in the standard library as of Python 3.3; prior distributions will have to use the Mock library downloadable via [PyPI][2].
### Fear System Calls
To give you another example, and one that we'll run with for the rest of the article, consider system calls. It's not difficult to see that these are prime candidates for mocking: whether you're writing a script to eject a CD drive, a web server which removes antiquated cache files from /tmp, or a socket server which binds to a TCP port, these calls all feature undesired side-effects in the context of your unit-tests.
> As a developer, you care more that your library successfully called the system function for ejecting a CD (with the correct arguments, etc.) as opposed to actually experiencing your CD tray open every time a test is run. (Or worse, multiple times, as multiple tests reference the eject code during a single unit-test run!)
Likewise, keeping your unit-tests efficient and performant means keeping as much “slow code” as possible out of the automated test runs, namely filesystem and network access.
For our first example, we'll refactor a standard Python test case from original form to one using mock. We'll demonstrate how writing a test case with mocks will make our tests smarter, faster, and able to reveal more about how the software works.
### A Simple Delete Function
We all need to delete files from our filesystem from time to time, so let's write a function in Python which will make it a bit easier for our scripts to do so.
```
#!/usr/bin/env python
import os


def rm(filename):
    os.remove(filename)
```
Obviously, our rm method at this point in time doesn't provide much more than the underlying os.remove method, but our codebase will improve, allowing us to add more functionality here.
Let's write a traditional test case, i.e., without mocks:
```
#!/usr/bin/env python
import os.path
import tempfile
import unittest

from mymodule import rm


class RmTestCase(unittest.TestCase):

    tmpfilepath = os.path.join(tempfile.gettempdir(), "tmp-testfile")

    def setUp(self):
        with open(self.tmpfilepath, "wb") as f:
            f.write("Delete me!")

    def test_rm(self):
        # remove the file
        rm(self.tmpfilepath)
        # test that it was actually removed
        self.assertFalse(os.path.isfile(self.tmpfilepath), "Failed to remove the file.")
```
Our test case is pretty simple, but every time it is run, a temporary file is created and then deleted. Additionally, we have no way of testing whether our rm method properly passes the argument down to the os.remove call. We can assume that it does based on the test above, but much is left to be desired.
### Refactoring with Mocks
Let's refactor our test case using mock:
```
#!/usr/bin/env python
import mock
import unittest

from mymodule import rm


class RmTestCase(unittest.TestCase):

    @mock.patch('mymodule.os')
    def test_rm(self, mock_os):
        rm("any path")
        # test that rm called os.remove with the right parameters
        mock_os.remove.assert_called_with("any path")
```
With these refactors, we have fundamentally changed the way that the test operates. Now, we have an insider, an object we can use to verify the functionality of another.
### Potential Pitfalls
One of the first things that should stick out is that we're using the mock.patch method decorator to mock an object located at mymodule.os, and injecting that mock into our test case method. Wouldn't it make more sense to just mock os itself, rather than the reference to it at mymodule.os?
Well, Python is somewhat of a sneaky snake when it comes to imports and managing modules. At runtime, the mymodule module has its own os which is imported into its own local scope in the module. Thus, if we mock os, we won't see the effects of the mock in the mymodule module.
The mantra to keep repeating is this:
> Mock an item where it is used, not where it came from.
If you need to mock the tempfile module for myproject.app.MyElaborateClass, you probably need to apply the mock to myproject.app.tempfile, as each module keeps its own imports.
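As a quick sketch of that rule (assuming the mymodule from above, which does `import os` at its top):
```
import mock
import mymodule

# Patching 'mymodule.os' replaces the os reference that mymodule's code
# looks up at call time; patching plain 'os' would leave mymodule's
# already-imported reference untouched.
with mock.patch('mymodule.os') as mock_os:
    mymodule.rm("any path")
    mock_os.remove.assert_called_with("any path")  # no real file was touched
```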
With that pitfall out of the way, let's keep mocking.
### Adding Validation to rm
The rm method defined earlier is quite oversimplified. We'd like to have it validate that a path exists and is a file before just blindly attempting to remove it. Let's refactor rm to be a bit smarter:
```
#!/usr/bin/env python
import os
import os.path


def rm(filename):
    if os.path.isfile(filename):
        os.remove(filename)
```
Great. Now, let's adjust our test case to keep coverage up.
```
#!/usr/bin/env python
import mock
import unittest

from mymodule import rm


class RmTestCase(unittest.TestCase):

    @mock.patch('mymodule.os.path')
    @mock.patch('mymodule.os')
    def test_rm(self, mock_os, mock_path):
        # set up the mock
        mock_path.isfile.return_value = False
        rm("any path")
        # test that the remove call was NOT called.
        self.assertFalse(mock_os.remove.called, "Failed to not remove the file if not present.")
        # make the file 'exist'
        mock_path.isfile.return_value = True
        rm("any path")
        mock_os.remove.assert_called_with("any path")
```
Our testing paradigm has completely changed. We now can verify and validate internal functionality of methods without any side-effects.
### File-Removal as a Service
So far, we've only been working with supplying mocks for functions, but not for methods on objects or cases where mocking is necessary for sending parameters. Let's cover object methods first.
We'll begin with a refactor of the rm method into a service class. There really isn't a justifiable need, per se, to encapsulate such a simple function into an object, but it will at the very least help us demonstrate key concepts in mock. Let's refactor:
```
#!/usr/bin/env python
import os
import os.path


class RemovalService(object):
    """A service for removing objects from the filesystem."""

    def rm(self, filename):
        if os.path.isfile(filename):
            os.remove(filename)
```
You'll notice that not much has changed in our test case:
```
#!/usr/bin/env python
import mock
import unittest

from mymodule import RemovalService


class RemovalServiceTestCase(unittest.TestCase):

    @mock.patch('mymodule.os.path')
    @mock.patch('mymodule.os')
    def test_rm(self, mock_os, mock_path):
        # instantiate our service
        reference = RemovalService()
        # set up the mock
        mock_path.isfile.return_value = False
        reference.rm("any path")
        # test that the remove call was NOT called.
        self.assertFalse(mock_os.remove.called, "Failed to not remove the file if not present.")
        # make the file 'exist'
        mock_path.isfile.return_value = True
        reference.rm("any path")
        mock_os.remove.assert_called_with("any path")
```
Great, so we now know that the RemovalService works as planned. Let's create another service which declares it as a dependency:
```
#!/usr/bin/env python
import os
import os.path


class RemovalService(object):
    """A service for removing objects from the filesystem."""

    def rm(self, filename):
        if os.path.isfile(filename):
            os.remove(filename)


class UploadService(object):

    def __init__(self, removal_service):
        self.removal_service = removal_service

    def upload_complete(self, filename):
        self.removal_service.rm(filename)
```
Since we now want to test UploadService, we can mock out the rm method of RemovalService with mock.patch.object and verify that upload_complete calls it:
```
#!/usr/bin/env python
import mock
import unittest

from mymodule import RemovalService, UploadService


class RemovalServiceTestCase(unittest.TestCase):

    @mock.patch('mymodule.os.path')
    @mock.patch('mymodule.os')
    def test_rm(self, mock_os, mock_path):
        # instantiate our service
        reference = RemovalService()
        # set up the mock
        mock_path.isfile.return_value = False
        reference.rm("any path")
        # test that the remove call was NOT called.
        self.assertFalse(mock_os.remove.called, "Failed to not remove the file if not present.")
        # make the file 'exist'
        mock_path.isfile.return_value = True
        reference.rm("any path")
        mock_os.remove.assert_called_with("any path")


class UploadServiceTestCase(unittest.TestCase):

    @mock.patch.object(RemovalService, 'rm')
    def test_upload_complete(self, mock_rm):
        # build our dependencies
        removal_service = RemovalService()
        reference = UploadService(removal_service)
        # call upload_complete, which should, in turn, call `rm`:
        reference.upload_complete("my uploaded file")
        # check that it called the rm method of any RemovalService
        mock_rm.assert_called_with("my uploaded file")
        # check that it called the rm method of _our_ removal_service
        removal_service.rm.assert_called_with("my uploaded file")
```
Alternatively, instead of mocking the specific instance method, we can supply a mocked instance of RemovalService itself with mock.create_autospec:
```
#!/usr/bin/env python
import mock
import unittest

from mymodule import RemovalService, UploadService


class RemovalServiceTestCase(unittest.TestCase):

    @mock.patch('mymodule.os.path')
    @mock.patch('mymodule.os')
    def test_rm(self, mock_os, mock_path):
        # instantiate our service
        reference = RemovalService()
        # set up the mock
        mock_path.isfile.return_value = False
        reference.rm("any path")
        # test that the remove call was NOT called.
        self.assertFalse(mock_os.remove.called, "Failed to not remove the file if not present.")
        # make the file 'exist'
        mock_path.isfile.return_value = True
        reference.rm("any path")
        mock_os.remove.assert_called_with("any path")


class UploadServiceTestCase(unittest.TestCase):

    def test_upload_complete(self):
        # build our dependencies
        mock_removal_service = mock.create_autospec(RemovalService)
        reference = UploadService(mock_removal_service)
        # call upload_complete, which should, in turn, call `rm`:
        reference.upload_complete("my uploaded file")
        # test that it called the rm method
        mock_removal_service.rm.assert_called_with("my uploaded file")
```
To finish up, let's write a more applicable real-world example, one which we mentioned in the introduction: posting a message to Facebook. We'll write a nice wrapper class and a corresponding test case.
```
import facebook


class SimpleFacebook(object):

    def __init__(self, oauth_token):
        self.graph = facebook.GraphAPI(oauth_token)

    def post_message(self, message):
        """Posts a message to the Facebook wall."""
        self.graph.put_object("me", "feed", message=message)
```
Here is our test case, which checks that we post the message without actually posting it to Facebook:
```
import facebook
import simple_facebook

import mock
import unittest


class SimpleFacebookTestCase(unittest.TestCase):

    @mock.patch.object(facebook.GraphAPI, 'put_object', autospec=True)
    def test_post_message(self, mock_put_object):
        sf = simple_facebook.SimpleFacebook("fake oauth token")
        sf.post_message("Hello World!")
        # verify that the message was passed through to the Graph API
        mock_put_object.assert_called_with(sf.graph, "me", "feed", message="Hello World!")
```
--------------------------------------------------------------------------------
via: http://slviki.com/index.php/2016/06/18/introduction-to-mocking-in-python/
[6]: http://www.voidspace.org.uk/python/mock/mock.html
[7]: http://www.toptal.com/qa/how-to-write-testable-code-and-why-it-matters
[8]: http://www.toptal.com/python

Container technologies in Fedora: systemd-nspawn
===
Welcome to the “Container technologies in Fedora” series! This is the first article in a series of articles that will explain how you can use the various container technologies available in Fedora. This first article will deal with `systemd-nspawn`.
### What is a container?
A container is a user-space instance which can be used to run a program or an operating system in isolation from the system hosting the container (called the host system). The idea is very similar to a `chroot` or a [virtual machine][1]. The processes running in a container are managed by the same kernel as the host operating system, but they are isolated from the host file system, and from the other processes.
### What is systemd-nspawn?
The systemd project considers container technologies as something that should fundamentally be part of the desktop and that should integrate with the rest of the users systems. To this end, systemd provides `systemd-nspawn`, a tool which is able to create containers using various Linux technologies. It also provides some container management tools.
In many ways, `systemd-nspawn` is similar to `chroot`, but is much more powerful. It virtualizes the file system, process tree, and inter-process communication of the guest system. Much of its appeal lies in the fact that it provides a number of tools, such as `machinectl`, for managing containers. Containers run by `systemd-nspawn` will integrate with the systemd components running on the host system. As an example, journal entries can be logged from a container in the host systems journal.
In Fedora 24, `systemd-nspawn` has been split out from the systemd package, so youll need to install the `systemd-container` package. As usual, you can do that with a `dnf install systemd-container`.
### Creating the container
Creating a container with `systemd-nspawn` is easy. Lets say you have an application made for Debian, and it doesnt run well anywhere else. Thats not a problem, we can make a container! To set up a container with the latest version of Debian (at this point in time, Jessie), you need to pick a directory to set up your system in. Ill be using `~/DebianJessie` for now.
Once the directory has been created, you need to run `debootstrap`, which you can install from the Fedora repositories. For Debian Jessie, you run the following command to initialize a Debian file system.
```
$ debootstrap --arch=amd64 stable ~/DebianJessie
```
This assumes your architecture is x86_64. If it isnt, you must change `amd64` to the name of your architecture. You can find your machines architecture with `uname -m`.
Once your root directory is set up, you will start your container with the following command.
```
$ systemd-nspawn -bD ~/DebianJessie
```
Youll be up and running within seconds. Youll notice something as soon as you try to log in: you cant use any accounts on your system. This is because systemd-nspawn virtualizes users. The fix is simple: remove -b from the previous command. Youll boot directly to the root shell in the container. From there, you can just use passwd to set a password for root, or you can use adduser to add a new user. As soon as youre done with that, go ahead and put the -b flag back. Youll boot to the familiar login console and you log in with the credentials you set.
All of this applies for any distribution you would want to run in the container, but you need to create the system using the correct package manager. For Fedora, you would use DNF instead of debootstrap. To set up a minimal Fedora system, you can run the following command, replacing the absolute path with wherever you want the container to be.
```
$ sudo dnf --releasever=24 --installroot=/absolute/path/ install systemd passwd dnf fedora-release
```
![](https://cdn.fedoramagazine.org/wp-content/uploads/2016/06/Screenshot-from-2016-06-17-15-04-14.png)
### Setting up the network
Youll notice an issue if you attempt to start a service that binds to a port currently in use on your host system. Your container is using the same network interface. Luckily, `systemd-nspawn` provides several ways to achieve separate networking from the host machine.
#### Local networking
The first method uses the `--private-network` flag, which only creates a loopback device by default. This is ideal for environments where you dont need networking, such as build systems and other continuous integration systems.
#### Multiple networking interfaces
If you have multiple network devices, you can give one to the container with the `--network-interface` flag. To give `eno1` to my container, I would add the flag `--network-interface=eno1`. While an interface is assigned to a container, the host cant use it at the same time. When the container is completely shut down, it will be available to the host again.
#### Sharing network interfaces
For those of us who dont have spare network devices, there are other options for providing access to the container. One of those is the `--port` flag. This forwards a port on the container to the host. The format is `protocol:host:container`, where protocol is either `tcp` or `udp`, `host` is a valid port number on the host, and `container` is a valid port on the container. You can omit the protocol and specify only `host:container`. I often use something similar to `--port=2222:22`.
You can enable complete, host-only networking with the `--network-veth` flag, which creates a virtual Ethernet interface between the host and the container. You can also bridge two connections with `--network-bridge`.
### Using systemd components
If the system in your container has D-Bus, you can use systemds provided utilities to control and monitor your container. Debian doesnt include dbus in the base install. If you want to use it with Debian Jessie, youll want to run `apt install dbus`.
#### machinectl
To easily manage containers, systemd provides the `machinectl` utility. Using machinectl, you can log in to a container with `machinectl login name`, check the status with `machinectl status name`, reboot with `machinectl reboot name`, or power it off with `machinectl poweroff name`.
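For a container named “DebianJessie”, those commands would look like this:

```
$ machinectl login DebianJessie
$ machinectl status DebianJessie
$ machinectl reboot DebianJessie
$ machinectl poweroff DebianJessie
```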
### Other systemd commands
Most systemd commands, such as journalctl, systemd-analyze, and systemctl, support containers with the `--machine` option. For example, if you want to see the journals of a container named “foobar”, you can use `journalctl --machine=foobar`. You can also see the status of a service running in this container with `systemctl --machine=foobar status service`.
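For example (assuming the “foobar” container runs a service named sshd):

```
$ journalctl --machine=foobar
$ systemctl --machine=foobar status sshd
```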
![](https://cdn.fedoramagazine.org/wp-content/uploads/2016/06/Screenshot-from-2016-06-17-15-09-25.png)
### Working with SELinux
If youre running with SELinux enforcing (the default in Fedora), youll need to set the SELinux context for your container. To do that, you need to run the following two commands on the host system.
```
$ semanage fcontext -a -t svirt_sandbox_file_t "/path/to/container(/.*)?"
$ restorecon -R /path/to/container/
```
Make sure you replace “/path/to/container” with the path to your container. For my container, “DebianJessie”, I would run the following:
```
$ semanage fcontext -a -t svirt_sandbox_file_t "/home/johnmh/DebianJessie(/.*)?"
$ restorecon -R /home/johnmh/DebianJessie/
```
--------------------------------------------------------------------------------
via: http://linoxide.com/linux-how-to/set-nginx-reverse-proxy-centos-7-cpanel/
作者:[John M. Harris, Jr.][a]
译者:[译者ID](https://github.com/译者ID)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: http://linoxide.com/linux-how-to/set-nginx-reverse-proxy-centos-7-cpanel/
[1]: https://en.wikipedia.org/wiki/Virtual_machine

name1e5s translating
TOP 5 BEST VIDEO EDITING SOFTWARE FOR LINUX IN 2016
=====================================================

How To Setup Open Source Discussion Platform Discourse On Ubuntu Linux 16.04
===============================================================================
Discourse is an open source discussion platform that can work as a mailing list, a chat room, and a forum. It is a popular, modern implementation of a successful discussion platform. On the server side, it is built with Ruby on Rails and uses Postgres as the backend; it also makes use of Redis caching to reduce loading times, while on the client side it runs in the browser using JavaScript. It is a well optimized and well structured tool. It also offers converter plugins to migrate your existing discussion boards / forums, such as vBulletin, phpBB, Drupal, SMF, etc., to Discourse. In this article, we will learn how to install Discourse on the Ubuntu operating system.
It is developed with security in mind, so spammers and attackers should not have an easy time with it. It works well on all modern devices, and adjusts its display settings accordingly for mobile devices and tablets.
### Installing Discourse on Ubuntu 16.04
Lets get started! The minimum system RAM required to run Discourse is 1 GB, and the officially supported installation process requires Docker to be installed on our Linux system. Besides Docker, it also requires Git. We can fulfill these two requirements by simply running the following command on our systems terminal.
```
wget -qO- https://get.docker.com/ | sh
```
![](http://linuxpitstop.com/wp-content/uploads/2016/06/124.png)
It shouldnt take long to complete the installation of Docker and Git. As soon as the installation process is complete, create a directory for Discourse inside the /var partition of your system (you can choose any other partition here too).
```
mkdir /var/discourse
```
Now clone the Discourse GitHub repository into this newly created directory.
```
git clone https://github.com/discourse/discourse_docker.git /var/discourse
```
Go into the cloned directory.
```
cd /var/discourse
```
![](http://linuxpitstop.com/wp-content/uploads/2016/06/314.png)
You should be able to locate the “discourse-setup” script file here; simply run this script to start the installation wizard for Discourse.
```
./discourse-setup
```
**Side note: Please make sure you have a working email server set up before attempting to install Discourse.**
The installation wizard will ask you the following six questions:
```
Hostname for your Discourse?
Email address for admin account?
SMTP server address?
SMTP user name?
SMTP port [587]:
SMTP password? []:
```
![](http://linuxpitstop.com/wp-content/uploads/2016/06/411.png)
Once you supply this information, it will ask for confirmation. If everything is fine, hit “Enter” and the installation process will take off.
![](http://linuxpitstop.com/wp-content/uploads/2016/06/511.png)
Sit back and relax! It will take a fair amount of time to complete the installation, so grab a cup of coffee, and keep an eye out for any error messages.
![](http://linuxpitstop.com/wp-content/uploads/2016/06/610.png)
Here is how the successful completion of the installation process should look:
![](http://linuxpitstop.com/wp-content/uploads/2016/06/710.png)
Now launch your web browser. If the hostname for your Discourse installation resolves properly to an IP address, you can use the hostname in your browser; otherwise, use your IP address to open the Discourse page. Here is what you should see:
![](http://linuxpitstop.com/wp-content/uploads/2016/06/85.png)
Thats it! Create a new account using the “Sign Up” option, and you should be good to go with your Discourse setup.
![](http://linuxpitstop.com/wp-content/uploads/2016/06/106.png)
### Conclusion
Discourse is easy to set up and works flawlessly. It is equipped with all the required features of a modern discussion board, is available under the General Public License, and is a 100% open source product. Simplicity, ease of use, and a powerful, long feature list are the most important features of this tool. Hope you enjoyed this article. Questions? Do let us know in the comments.
--------------------------------------------------------------------------------
via: http://linuxpitstop.com/install-discourse-on-ubuntu-linux-16-04/
作者:[Aun][a]
译者:[译者ID](https://github.com/译者ID)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: http://linuxpitstop.com/author/aun/

Python 101: An Intro to urllib
=================================
The urllib module in Python 3 is a collection of modules that you can use for working with URLs. If you are coming from a Python 2 background you will note that in Python 2 you had urllib and urllib2. These are now a part of the urllib package in Python 3. The current version of urllib is made up of the following modules:
- urllib.request
- urllib.error
- urllib.parse
- urllib.robotparser
We will be covering each part individually except for urllib.error. The official documentation actually recommends that you might want to check out the 3rd party library, requests, for a higher-level HTTP client interface. However, I believe that it can be useful to know how to open URLs and interact with them without using a 3rd party library, and it may also help you appreciate why the requests package is so popular.
---
### urllib.request
The urllib.request module is primarily used for opening and fetching URLs. Lets take a look at some of the things you can do with the urlopen function:
```
>>> import urllib.request
>>> url = urllib.request.urlopen('https://www.google.com/')
>>> url.geturl()
'https://www.google.com/'
>>> url.info()
<http.client.HTTPMessage object at 0x7fddc2de04e0>
>>> header = url.info()
>>> header.as_string()
('Date: Fri, 24 Jun 2016 18:21:19 GMT\n'
'Expires: -1\n'
'Cache-Control: private, max-age=0\n'
'Content-Type: text/html; charset=ISO-8859-1\n'
'P3P: CP="This is not a P3P policy! See '
'https://www.google.com/support/accounts/answer/151657?hl=en for more info."\n'
'Server: gws\n'
'X-XSS-Protection: 1; mode=block\n'
'X-Frame-Options: SAMEORIGIN\n'
'Set-Cookie: '
'NID=80=tYjmy0JY6flsSVj7DPSSZNOuqdvqKfKHDcHsPIGu3xFv41LvH_Jg6LrUsDgkPrtM2hmZ3j9V76pS4K_cBg7pdwueMQfr0DFzw33SwpGex5qzLkXUvUVPfe9g699Qz4cx9ipcbU3HKwrRYA; '
'expires=Sat, 24-Dec-2016 18:21:19 GMT; path=/; domain=.google.com; HttpOnly\n'
'Alternate-Protocol: 443:quic\n'
'Alt-Svc: quic=":443"; ma=2592000; v="34,33,32,31,30,29,28,27,26,25"\n'
'Accept-Ranges: none\n'
'Vary: Accept-Encoding\n'
'Connection: close\n'
'\n')
>>> url.getcode()
200
```
Here we import our module and ask it to open Googles URL. Now we have an HTTPResponse object that we can interact with. The first thing we do is call the geturl method which will return the URL of the resource that was retrieved. This is useful for finding out if we followed a redirect.
Next we call info, which will return meta-data about the page, such as headers. Because of this, we assign that result to our header variable and then call its as_string method. This prints out the header we received from Google. You can also get the HTTP response code by calling getcode, which in this case was 200, meaning it worked successfully.
If youd like to see the HTML of the page, you can call the read method on the url variable we created. I am not reproducing that here as the output will be quite long.
Please note that the request object defaults to a GET request unless you specify the data parameter. Should you pass in the data parameter, then the request object will issue a POST request instead.
---
### Downloading a file
A typical use case for the urllib package is for downloading a file. Lets find out a couple of ways we can accomplish this task:
```
>>> import urllib.request
>>> url = 'http://www.blog.pythonlibrary.org/wp-content/uploads/2012/06/wxDbViewer.zip'
>>> response = urllib.request.urlopen(url)
>>> data = response.read()
>>> with open('/home/mike/Desktop/test.zip', 'wb') as fobj:
... fobj.write(data)
...
```
Here we just open a URL that leads us to a zip file stored on my blog. Then we read the data and write it out to disk. An alternate way to accomplish this is to use urlretrieve:
```
>>> import urllib.request
>>> url = 'http://www.blog.pythonlibrary.org/wp-content/uploads/2012/06/wxDbViewer.zip'
>>> tmp_file, header = urllib.request.urlretrieve(url)
>>> with open('/home/mike/Desktop/test.zip', 'wb') as fobj:
... with open(tmp_file, 'rb') as tmp:
... fobj.write(tmp.read())
```
The urlretrieve method will copy a network object to a local file. The file it copies to is randomly named and goes into the temp directory unless you use the second parameter to urlretrieve where you can actually specify where you want the file saved. This will save you a step and make your code much simpler:
```
>>> import urllib.request
>>> url = 'http://www.blog.pythonlibrary.org/wp-content/uploads/2012/06/wxDbViewer.zip'
>>> urllib.request.urlretrieve(url, '/home/mike/Desktop/blog.zip')
('/home/mike/Desktop/blog.zip',
<http.client.HTTPMessage object at 0x7fddc21c2470>)
```
As you can see, it returns the location of where it saved the file and the header information from the request.
### Specifying Your User Agent
When you visit a website with your browser, the browser tells the website who it is. This is called the user-agent string. Pythons urllib identifies itself as Python-urllib/x.y where the x and y are major and minor version numbers of Python. Some websites wont recognize this user-agent string and will behave in strange ways or not work at all. Fortunately, its easy for you to set up your own custom user-agent string:
```
>>> import urllib.request
>>> user_agent = ' Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:47.0) Gecko/20100101 Firefox/47.0'
>>> url = 'http://www.whatsmyua.com/'
>>> headers = {'User-Agent': user_agent}
>>> request = urllib.request.Request(url, headers=headers)
>>> with urllib.request.urlopen(request) as response:
... with open('/home/mdriscoll/Desktop/user_agent.html', 'wb') as out:
... out.write(response.read())
```
Here we set our user agent to Mozilla Firefox and we set our URL to <http://www.whatsmyua.com/>, which will tell us what it thinks our user-agent string is. Then we create a Request instance using our url and headers and pass that to urlopen. Finally we save the result. If you open the result file, you will see that we successfully changed our user-agent string. Feel free to try out a few different strings with this code to see how it will change.
---
### urllib.parse
The urllib.parse library is your standard interface for breaking up URL strings and combining them back together. You can use it to convert a relative URL to an absolute URL, for example. Lets try using it to parse a URL that includes a query:
```
>>> from urllib.parse import urlparse
>>> result = urlparse('https://duckduckgo.com/?q=python+stubbing&t=canonical&ia=qa')
>>> result
ParseResult(scheme='https', netloc='duckduckgo.com', path='/', params='', query='q=python+stubbing&t=canonical&ia=qa', fragment='')
>>> result.netloc
'duckduckgo.com'
>>> result.geturl()
'https://duckduckgo.com/?q=python+stubbing&t=canonical&ia=qa'
>>> result.port
None
```
Here we import the urlparse function and pass it a URL that contains a search query to the duckduckgo website. My query was to look up articles on “python stubbing”. As you can see, it returned a ParseResult object that you can use to learn more about the URL. For example, you can get the port information (None in this case), the network location, path and much more.
### Submitting a Web Form
This module also holds the urlencode method, which is great for passing data to a URL. A typical use case for the urllib.parse library is submitting a web form. Lets find out how you might do that by having the duckduckgo search engine look for Python:
```
>>> import urllib.request
>>> import urllib.parse
>>> data = urllib.parse.urlencode({'q': 'Python'})
>>> data
'q=Python'
>>> url = 'http://duckduckgo.com/html/'
>>> full_url = url + '?' + data
>>> response = urllib.request.urlopen(full_url)
>>> with open('/home/mike/Desktop/results.html', 'wb') as f:
... f.write(response.read())
```
This is pretty straightforward. Basically we want to submit a query to duckduckgo ourselves using Python instead of a browser. To do that, we need to construct our query string using urlencode. Then we put that together to create a fully qualified URL and use urllib.request to submit the form. We then grab the result and save it to disk.
---
### urllib.robotparser
The robotparser module is made up of a single class, RobotFileParser. This class will answer questions about whether or not a specific user agent can fetch a URL that has a published robots.txt file. The robots.txt file will tell a web scraper or robot what parts of the server should not be accessed. Lets take a look at a simple example using ArsTechnicas website:
```
>>> import urllib.robotparser
>>> robot = urllib.robotparser.RobotFileParser()
>>> robot.set_url('http://arstechnica.com/robots.txt')
None
>>> robot.read()
None
>>> robot.can_fetch('*', 'http://arstechnica.com/')
True
>>> robot.can_fetch('*', 'http://arstechnica.com/cgi-bin/')
False
```
Here we import the robot parser class and create an instance of it. Then we pass it a URL that specifies where the websites robots.txt file resides. Next we tell our parser to read the file. Now that thats done, we give it a couple of different URLs to find out which ones we can crawl and which ones we cant. We quickly see that we can access the main site, but not the cgi-bin.
---
### Wrapping Up
You have reached the point where you should be able to use Pythons urllib package competently. In this chapter, we learned how to download a file, submit a web form, change our user agent and access a robots.txt file. The urllib package has a lot of additional functionality that is not covered here, such as website authentication. However, you might want to consider switching to the requests library before trying to do authentication with urllib, as the requests implementation is a lot easier to understand and debug. I also want to note that Python has support for cookies via its http.cookies module, although that is also wrapped quite well in the requests package. You should probably consider trying both to see which one makes the most sense to you.
--------------------------------------------------------------------------------
via: http://www.blog.pythonlibrary.org/2016/06/28/python-101-an-intro-to-urllib/
作者:[Mike][a]
译者:[译者ID](https://github.com/译者ID)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创翻译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: http://www.blog.pythonlibrary.org/author/mld/

What makes up the Fedora kernel?
====================================
![](https://cdn.fedoramagazine.org/wp-content/uploads/2016/06/kernel-945x400.png)
Every Fedora system runs a kernel. Many pieces of code come together to make this a reality.
Each release of the Fedora kernel starts with a baseline release from the [upstream community][1]. This is often called a vanilla kernel. The upstream kernel is the standard. The goal is to have as much code upstream as possible. This makes it easier for bug fixes and API updates to happen as well as having more people review the code. In an ideal world, Fedora would be able to take the kernel straight from kernel.org and send that out to all users.
Realistically, using the vanilla kernel isnt complete enough for Fedora. Some features Fedora users want may not be available. The [Fedora kernel][2] that users actually receive contains a number of patches on top of the vanilla kernel. These patches are considered out of tree. Many of these patches will not remain out of tree for very long. If patches are available to fix an issue, they may be pulled into the Fedora tree so the fix can go out to users faster. When the kernel is rebased to a new version, the patches are removed if they are in the new version.
Some patches remain in the Fedora kernel tree for an extended period of time. A good example of patches that fall into this category are the secure boot patches. These patches provide a feature Fedora wants to support even though the upstream community has not yet accepted them. It takes effort to keep these patches up to date so Fedora tries to minimize the number of patches that are carried without being accepted by an upstream kernel maintainer.
Generally, the best way to get a patch included in the Fedora kernel is to send it to the [Linux Kernel Mailing List (LKML)][3] first and then ask for it to be included in Fedora. If a patch has been accepted by a maintainer, it stands a very high chance of being included in the Fedora kernel tree. Patches that come from places like GitHub which have not been submitted to LKML are unlikely to be taken into the tree. Its important to send the patches to LKML first to ensure Fedora is carrying the correct patches in its tree. Without the community review, Fedora could end up carrying patches which are buggy and cause problems.
The Fedora kernel contains code from many places. All of it is necessary to give the best experience possible.
--------------------------------------------------------------------------------
via: https://fedoramagazine.org/makes-fedora-kernel/
作者:[Laura Abbott][a]
译者:[译者ID](https://github.com/译者ID)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://fedoramagazine.org/makes-fedora-kernel/
[1]: http://www.kernel.org/
[2]: http://pkgs.fedoraproject.org/cgit/rpms/kernel.git/
[3]: http://www.labbott.name/blog/2015/10/02/the-art-of-communicating-with-lkml/

Building a data science portfolio: Machine learning project
===========================================================
>This is the third in a series of posts on how to build a Data Science Portfolio. If you like this and want to know when the next post in the series is released, you can [subscribe at the bottom of the page][1].
Data science companies are increasingly looking at portfolios when making hiring decisions. One of the reasons for this is that a portfolio is the best way to judge someones real-world skills. The good news for you is that a portfolio is entirely within your control. If you put some work in, you can make a great portfolio that companies are impressed by.
The first step in making a high-quality portfolio is to know what skills to demonstrate. The primary skills that companies want in data scientists, and thus the primary skills they want a portfolio to demonstrate, are:
- Ability to communicate
- Ability to collaborate with others
- Technical competence
- Ability to reason about data
- Motivation and ability to take initiative
Any good portfolio will be composed of multiple projects, each of which may demonstrate 1-2 of the above points. This is the third post in a series that will cover how to make a well-rounded data science portfolio. In this post, well cover how to make the second project in your portfolio, and how to build an end to end machine learning project. At the end, youll have a project that shows your ability to reason about data, and your technical competence. [Heres][2] the completed project if you want to take a look.
### An end to end project
As a data scientist, there are times when youll be asked to take a dataset and figure out how to [tell a story with it][3]. In times like this, its important to communicate very well, and walk through your process. Tools like Jupyter notebook, which we used in a previous post, are very good at helping you do this. The expectation here is that the deliverable is a presentation or document summarizing your findings.
However, there are other times when youll be asked to create a project that has operational value. A project with operational value directly impacts the day-to-day operations of a company, and will be used more than once, and often by multiple people. A task like this might be “create an algorithm to forecast our churn rate”, or “create a model that can automatically tag our articles”. In cases like this, storytelling is less important than technical competence. You need to be able to take a dataset, understand it, then create a set of scripts that can process that data. Its often important that these scripts run quickly, and use minimal system resources like memory. Its very common that these scripts will be run several times, so the deliverable becomes the scripts themselves, not a presentation. The deliverable is often integrated into operational flows, and may even be user-facing.
The main components of building an end to end project are:
- Understanding the context
- Exploring the data and figuring out the nuances
- Creating a well-structured project, so its easy to integrate into operational flows
- Writing high-performance code that runs quickly and uses minimal system resources
- Documenting the installation and usage of your code well, so others can use it
In order to effectively create a project of this kind, well need to work with multiple files. Using a text editor like [Atom][4], or an IDE like [PyCharm][5] is highly recommended. These tools will allow you to jump between files, and edit files of different types, like markdown files, Python files, and csv files. Structuring your project so its easy to version control and upload to collaborative coding tools like [Github][6] is also useful.
![](https://www.dataquest.io/blog/images/end_to_end/github.png)
>This project on Github.
Well use our editing tools along with libraries like [Pandas][7] and [scikit-learn][8] in this post. Well make extensive use of Pandas [DataFrames][9], which make it easy to read in and work with tabular data in Python.
### Finding good datasets
A good dataset for an end to end portfolio project can be hard to find. [The dataset][10] needs to be sufficiently large that memory and performance constraints come into play. It also needs to potentially be operationally useful. For instance, this dataset, which contains data on the admission criteria, graduation rates, and graduate future earnings for US colleges, would be a great dataset to use to tell a story. However, as you think about the dataset, it becomes clear that there isnt enough nuance to build a good end to end project with it. For example, you could tell someone their potential future earnings if they went to a specific college, but that would be a quick lookup without enough nuance to demonstrate technical competence. You could also figure out if colleges with higher admissions standards tend to have graduates who earn more, but that would be more storytelling than operational.
These memory and performance constraints tend to come into play when you have more than a gigabyte of data, and when you have some nuance to what you want to predict, which involves running algorithms over the dataset.
A good operational dataset enables you to build a set of scripts that transform the data, and answer dynamic questions. A good example would be a dataset of stock prices. You would be able to predict the prices for the next day, and keep feeding new data to the algorithm as the markets close. This would enable you to make trades, and potentially even profit. This wouldnt be telling a story; it would be adding direct value.
Some good places to find datasets like this are:
- [/r/datasets][11] a subreddit that has hundreds of interesting datasets.
- [Google Public Datasets][12] public datasets available through Google BigQuery.
- [Awesome datasets][13] a list of datasets, hosted on Github.
As you look through these datasets, think about what questions someone might want answered with the dataset, and think if those questions are one-time (“how did housing prices correlate with the S&P 500?”), or ongoing (“can you predict the stock market?”). The key here is to find questions that are ongoing, and require the same code to be run multiple times with different inputs (different data).
For the purposes of this post, well look at [Fannie Mae Loan Data][14]. Fannie Mae is a government sponsored enterprise in the US that buys mortgage loans from other lenders. It then bundles these loans up into mortgage-backed securities and resells them. This enables lenders to make more mortgage loans, and creates more liquidity in the market. This theoretically leads to more homeownership, and better loan terms. From a borrowers perspective, things stay largely the same, though.
Fannie Mae releases two types of data: data on loans it acquires, and data on how those loans perform over time. In the ideal case, someone borrows money from a lender, then repays the loan until the balance is zero. However, some borrowers miss multiple payments, which can lead to foreclosure. Foreclosure is when the house is seized by the bank because mortgage payments cannot be made. Fannie Mae tracks which loans have missed payments, and which loans needed to be foreclosed on. This data is published quarterly, and lags the current date by 1 year. As of this writing, the most recent dataset thats available is from the first quarter of 2015.
Acquisition data, which is published when the loan is acquired by Fannie Mae, contains information on the borrower, including credit score, and information on their loan and home. Performance data, which is published every quarter after the loan is acquired, contains information on the payments being made by the borrower, and the foreclosure status, if any. A loan that is acquired may have dozens of rows in the performance data. A good way to think of this is that the acquisition data tells you that Fannie Mae now controls the loan, and the performance data contains a series of status updates on the loan. One of the status updates may tell us that the loan was foreclosed on during a certain quarter.
![](https://www.dataquest.io/blog/images/end_to_end/foreclosure.jpg)
>A foreclosed home being sold.
### Picking an angle
There are a few directions we could go in with the Fannie Mae dataset. We could:
- Try to predict the sale price of a house after its foreclosed on.
- Predict the payment history of a borrower.
- Figure out a score for each loan at acquisition time.
The important thing is to stick to a single angle. Trying to focus on too many things at once will make it hard to make an effective project. Its also important to pick an angle that has sufficient nuance. Here are examples of angles without much nuance:
- Figuring out which banks sold loans to Fannie Mae that were foreclosed on the most.
- Figuring out trends in borrower credit scores.
- Exploring which types of homes are foreclosed on most often.
- Exploring the relationship between loan amounts and foreclosure sale prices.
All of the above angles are interesting, and would be great if we were focused on storytelling, but arent great fits for an operational project.
With the Fannie Mae dataset, well try to predict whether a loan will be foreclosed on in the future by only using information that was available when the loan was acquired. In effect, well create a “score” for any mortgage that will tell us if Fannie Mae should buy it or not. This will give us a nice foundation to build on, and will be a great portfolio piece.
### Understanding the data
Lets take a quick look at the raw data files. Here are the first few rows of the acquisition data from quarter 1 of 2012:
```
100000853384|R|OTHER|4.625|280000|360|02/2012|04/2012|31|31|1|23|801|N|C|SF|1|I|CA|945||FRM|
100003735682|R|SUNTRUST MORTGAGE INC.|3.99|466000|360|01/2012|03/2012|80|80|2|30|794|N|P|SF|1|P|MD|208||FRM|788
100006367485|C|PHH MORTGAGE CORPORATION|4|229000|360|02/2012|04/2012|67|67|2|36|802|N|R|SF|1|P|CA|959||FRM|794
```
Here are the first few rows of the performance data from quarter 1 of 2012:
```
100000853384|03/01/2012|OTHER|4.625||0|360|359|03/2042|41860|0|N||||||||||||||||
100000853384|04/01/2012||4.625||1|359|358|03/2042|41860|0|N||||||||||||||||
100000853384|05/01/2012||4.625||2|358|357|03/2042|41860|0|N||||||||||||||||
```
Before proceeding too far into coding, its useful to take some time and really understand the data. This is more critical in operational projects: because we arent interactively exploring the data, it can be harder to spot certain nuances unless we find them upfront. In this case, the first step is to read the materials on the Fannie Mae site:
- [Overview][15]
- [Glossary of useful terms][16]
- [FAQs][17]
- [Columns in the Acquisition and Performance files][18]
- [Sample Acquisition data file][19]
- [Sample Performance data file][20]
After reading through these files, we know some key facts that will help us:
- Theres an Acquisition file and a Performance file for each quarter, starting from the year 2000 to present. Theres a 1 year lag in the data, so the most recent data is from 2015 as of this writing.
- The files are in text format, with a pipe (|) as a delimiter.
- The files dont have headers, but we have a list of what each column is.
- All together, the files contain data on 22 million loans.
- Because the Performance files contain information on loans acquired in previous years, there will be more performance data for loans acquired in earlier years (i.e., loans acquired in 2014 wont have much performance history).
These small bits of information will save us a ton of time as we figure out how to structure our project and work with the data.
### Structuring the project
Before we start downloading and exploring the data, its important to think about how well structure the project. When building an end-to-end project, our primary goals are:
- Creating a solution that works
- Having a solution that runs quickly and uses minimal resources
- Enabling others to easily extend our work
- Making it easy for others to understand our code
- Writing as little code as possible
In order to achieve these goals, well need to structure our project well. A well structured project follows a few principles:
- Separates data files and code files.
- Separates raw data from generated data.
- Has a README.md file that walks people through installing and using the project.
- Has a requirements.txt file that contains all the packages needed to run the project.
- Has a single settings.py file that contains any settings that are used in other files.
- For example, if you are reading the same file from multiple Python scripts, its useful to have them all import settings and get the file name from a centralized place.
- Has a .gitignore file that prevents large or secret files from being committed.
- Breaks each step in our task into a separate file that can be executed separately.
- For example, we may have one file for reading in the data, one for creating features, and one for making predictions.
- Stores intermediate values. For example, one script may output a file that the next script can read.
- This enables us to make changes in our data processing flow without recalculating everything.
Our file structure will look something like this shortly:
```
loan-prediction
├── data
├── processed
├── .gitignore
├── README.md
├── requirements.txt
├── settings.py
```
### Creating the initial files
To start with, well need to create a loan-prediction folder. Inside that folder, well need to make a data folder and a processed folder. The first will store our raw data, and the second will store any intermediate calculated values.
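In shell form, that skeleton might be created like this:

```
$ mkdir loan-prediction
$ cd loan-prediction
$ mkdir data processed
```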
Next, well make a .gitignore file. A .gitignore file will make sure certain files are ignored by git and not pushed to Github. One good example of such a file is the .DS_Store file created by OSX in every folder. A good starting point for a .gitignore file is here. Well also want to ignore the data files because they are very large, and the Fannie Mae terms prevent us from redistributing them, so we should add two lines to the end of our file:
```
data
processed
```
[Heres][21] an example .gitignore file for this project.
Next, well need to create README.md, which will help people understand the project. .md indicates that the file is in markdown format. Markdown enables you to write plain text, but also add some fancy formatting if you want. [Heres][22] a guide on markdown. If you upload a file called README.md to Github, Github will automatically process the markdown and show it to anyone who views the project. [Heres][23] an example.
For now, we just need to put a simple description in README.md:
```
Loan Prediction
-----------------------
Predict whether or not loans acquired by Fannie Mae will go into foreclosure. Fannie Mae acquires loans from other lenders as a way of inducing them to lend more. Fannie Mae releases data on the loans it has acquired and their performance afterwards [here](http://www.fanniemae.com/portal/funding-the-market/data/loan-performance-data.html).
```
Now, we can create a requirements.txt file. This will make it easy for other people to install our project. We dont know exactly what libraries well be using yet, but heres a good starting point:
```
pandas
matplotlib
scikit-learn
numpy
ipython
scipy
```
The above libraries are the most commonly used for data analysis tasks in Python, and its fair to assume that well be using most of them. [Heres][24] an example requirements file for this project.
After creating requirements.txt, you should install the packages. For this post, well be using Python 3. If you dont have Python installed, you should look into using [Anaconda][25], a Python installer that also installs all the packages listed above.
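If you manage packages with pip, one way to install everything at once is:

```
$ pip install -r requirements.txt
```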
Finally, we can just make a blank settings.py file, since we dont have any settings for our project yet.
### Acquiring the data
Once we have the skeleton of our project, we can get the raw data.
Fannie Mae has some restrictions around acquiring the data, so youll need to sign up for an account. You can find the download page [here][26]. After creating an account, youll be able to download as few or as many loan data files as you want. The files are in zip format, and are reasonably large after decompression.
For the purposes of this blog post, well download everything from Q1 2012 to Q1 2015, inclusive. Well then need to unzip all of the files and, after unzipping them, remove the original .zip files (a shell sketch of this step follows the tree below). At the end, the loan-prediction folder should look something like this:
```
loan-prediction
├── data
│ ├── Acquisition_2012Q1.txt
│ ├── Acquisition_2012Q2.txt
│ ├── Performance_2012Q1.txt
│ ├── Performance_2012Q2.txt
│ └── ...
├── processed
├── .gitignore
├── README.md
├── requirements.txt
├── settings.py
```
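In shell form, the unzip-and-cleanup step described above might look like this (run from inside the data folder; the exact archive names will vary):

```
$ cd loan-prediction/data
$ unzip '*.zip'   # extract every downloaded archive
$ rm *.zip        # remove the original .zip files
```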
After downloading the data, you can use the head and tail shell commands to look at the lines in the files. Do you see any columns that arent needed? It might be useful to consult the [pdf of column names][27] while doing this.
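For example, to peek at the first and last few lines of one of the acquisition files:

```
$ head -n 3 data/Acquisition_2012Q1.txt
$ tail -n 3 data/Acquisition_2012Q1.txt
```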
### Reading in the data
There are two issues that make our data hard to work with right now:
- The acquisition and performance datasets are segmented across multiple files.
- Each file is missing headers.
Before we can get started on working with the data, well need to get to the point where we have one file for the acquisition data, and one file for the performance data. Each of the files will need to contain only the columns we care about, and have the proper headers. One wrinkle here is that the performance data is quite large, so we should try to trim some of the columns if we can.
The first step is to add some variables to settings.py, which will contain the paths to our raw data and our processed data. Well also add a few other settings that will be useful later on:
```
DATA_DIR = "data"
PROCESSED_DIR = "processed"
MINIMUM_TRACKING_QUARTERS = 4
TARGET = "foreclosure_status"
NON_PREDICTORS = [TARGET, "id"]
CV_FOLDS = 3
```
Putting the paths in settings.py will put them in a centralized place and make them easy to change down the line. When referring to the same variables in multiple files, its easier to put them in a central place than edit them in every file when you want to change them. [Heres][28] an example settings.py file for this project.
The second step is to create a file called assemble.py that will assemble all the pieces into 2 files. When we run python assemble.py, well get 2 data files in the processed directory.
Well then start writing code in assemble.py. Well first need to define the headers for each file, so well need to look at [pdf of column names][29] and create lists of the columns in each Acquisition and Performance file:
```
HEADERS = {
"Acquisition": [
"id",
"channel",
"seller",
"interest_rate",
"balance",
"loan_term",
"origination_date",
"first_payment_date",
"ltv",
"cltv",
"borrower_count",
"dti",
"borrower_credit_score",
"first_time_homebuyer",
"loan_purpose",
"property_type",
"unit_count",
"occupancy_status",
"property_state",
"zip",
"insurance_percentage",
"product_type",
"co_borrower_credit_score"
],
"Performance": [
"id",
"reporting_period",
"servicer_name",
"interest_rate",
"balance",
"loan_age",
"months_to_maturity",
"maturity_date",
"msa",
"delinquency_status",
"modification_flag",
"zero_balance_code",
"zero_balance_date",
"last_paid_installment_date",
"foreclosure_date",
"disposition_date",
"foreclosure_costs",
"property_repair_costs",
"recovery_costs",
"misc_costs",
"tax_costs",
"sale_proceeds",
"credit_enhancement_proceeds",
"repurchase_proceeds",
"other_foreclosure_proceeds",
"non_interest_bearing_balance",
"principal_forgiveness_balance"
]
}
```
The next step is to define the columns we want to keep. Since the only thing were measuring about the loan on an ongoing basis is whether or not it was ever foreclosed on, we can discard many of the columns in the performance data. Well need to keep all the columns in the acquisition data, though, because we want to maximize the information we have about when the loan was acquired (after all, were predicting if the loan will ever be foreclosed on or not at the point its acquired). Discarding columns will enable us to save disk space and memory, while also speeding up our code.
```
SELECT = {
"Acquisition": HEADERS["Acquisition"],
"Performance": [
"id",
"foreclosure_date"
]
}
```
Next, well write a function to concatenate the data sets. The below code will:
- Import a few needed libraries, including settings.
- Define a function concatenate, that:
- Gets the names of all the files in the data directory.
- Loops through each file.
- If the file isnt the right type (doesnt start with the prefix we want), we ignore it.
- Reads the file into a [DataFrame][30] with the right settings using the Pandas [read_csv][31] function.
- Sets the separator to | so the fields are read in correctly.
- The data has no header row, so sets header to None to indicate this.
- Sets names to the right value from the HEADERS dictionary these will be the column names of our DataFrame.
- Picks only the columns from the DataFrame that we added in SELECT.
- Concatenates all the DataFrames together.
- Writes the concatenated DataFrame back to a file.
```
import os
import settings
import pandas as pd
def concatenate(prefix="Acquisition"):
files = os.listdir(settings.DATA_DIR)
full = []
for f in files:
if not f.startswith(prefix):
continue
data = pd.read_csv(os.path.join(settings.DATA_DIR, f), sep="|", header=None, names=HEADERS[prefix], index_col=False)
data = data[SELECT[prefix]]
full.append(data)
full = pd.concat(full, axis=0)
full.to_csv(os.path.join(settings.PROCESSED_DIR, "{}.txt".format(prefix)), sep="|", header=SELECT[prefix], index=False)
```
We can call the above function twice with the arguments Acquisition and Performance to concatenate all the acquisition and performance files together. The below code will:
- Only execute if the script is called from the command line with python assemble.py.
- Concatenate all the files, and result in two files:
- `processed/Acquisition.txt`
- `processed/Performance.txt`
```
if __name__ == "__main__":
concatenate("Acquisition")
concatenate("Performance")
```
We now have a nice, compartmentalized assemble.py thats easy to execute, and easy to build off of. By decomposing the problem into pieces like this, we make it easy to build our project. Instead of one messy script that does everything, we define the data that will pass between the scripts, and make them completely separate from each other. When youre working on larger projects, its a good idea to do this, because it makes it much easier to change individual pieces without having unexpected consequences on unrelated pieces of the project.
Once we finish the assemble.py script, we can run python assemble.py. You can find the complete assemble.py file [here][32].
This will result in two files in the processed directory:
```
loan-prediction
├── data
│ ├── Acquisition_2012Q1.txt
│ ├── Acquisition_2012Q2.txt
│ ├── Performance_2012Q1.txt
│ ├── Performance_2012Q2.txt
│ └── ...
├── processed
│ ├── Acquisition.txt
│ ├── Performance.txt
├── .gitignore
├── assemble.py
├── README.md
├── requirements.txt
├── settings.py
```
### Computing values from the performance data
The next step well take is to calculate some values from processed/Performance.txt. All we want to do is to predict whether or not a property is foreclosed on. To figure this out, we just need to check if the performance data associated with a loan ever has a foreclosure_date. If foreclosure_date is None, then the property was never foreclosed on. In order to avoid including loans with little performance history in our sample, well also want to count up how many rows exist in the performance file for each loan. This will let us filter loans without much performance history from our training data.
One way to think of the loan data and the performance data is like this:
![](https://github.com/LCTT/wiki-images/blob/master/TranslateProject/ref_img/001.png)
As you can see above, each row in the Acquisition data can be related to multiple rows in the Performance data. In the Performance data, foreclosure_date will appear in the quarter when the foreclosure happened, so it should be blank prior to that. Some loans are never foreclosed on, so all the rows related to them in the Performance data have foreclosure_date blank.
We need to compute foreclosure_status, which is a Boolean that indicates whether a particular loan id was ever foreclosed on, and performance_count, which is the number of rows in the performance data for each loan id.
There are a few different ways to compute the counts we want:
- We could read in all the performance data, then use the Pandas groupby method on the DataFrame to figure out the number of rows associated with each loan id, and also if the foreclosure_date is ever not None for the id.
- The upside of this method is that its easy to implement from a syntax perspective.
- The downside is that reading in all 129,236,094 lines in the data will take a lot of memory, and be extremely slow.
- We could read in all the performance data, then use apply on the acquisition DataFrame to find the counts for each id.
- The upside is that its easy to conceptualize.
- The downside is that reading in all 129,236,094 lines in the data will take a lot of memory, and be extremely slow.
- We could iterate over each row in the performance dataset, and keep a separate dictionary of counts.
- The upside is that the dataset doesnt need to be loaded into memory, so its extremely fast and memory-efficient.
- The downside is that it will take slightly longer to conceptualize and implement, and we need to parse the rows manually.
Loading in all the data will take quite a bit of memory, so lets go with the third option above. All we need to do is to iterate through all the rows in the Performance data, while keeping a dictionary of counts per loan id. In the dictionary, well keep track of how many times the id appears in the performance data, as well as if foreclosure_date is ever not None. This will give us foreclosure_status and performance_count.
Well create a new file called annotate.py, and add in code that will enable us to compute these values. In the below code, well:
- Import needed libraries.
- Define a function called count_performance_rows.
- Open processed/Performance.txt. This doesnt read the file into memory, but instead opens a file handle that can be used to read in the file line by line.
- Loop through each line in the file.
- Split the line on the delimiter (|)
- Check if the loan_id is not in the counts dictionary.
- If not, add it to counts.
- Increment performance_count for the given loan_id because were on a row that contains it.
- If date is not None, then we know that the loan was foreclosed on, so set foreclosure_status appropriately.
```
import os
import settings
import pandas as pd
def count_performance_rows():
counts = {}
with open(os.path.join(settings.PROCESSED_DIR, "Performance.txt"), 'r') as f:
for i, line in enumerate(f):
if i == 0:
# Skip header row
continue
loan_id, date = line.split("|")
loan_id = int(loan_id)
if loan_id not in counts:
counts[loan_id] = {
"foreclosure_status": False,
"performance_count": 0
}
counts[loan_id]["performance_count"] += 1
if len(date.strip()) > 0:
counts[loan_id]["foreclosure_status"] = True
return counts
```
### Getting the values
Once we create our counts dictionary, we can make a function that will extract values from the dictionary if a loan_id and a key are passed in:
```
def get_performance_summary_value(loan_id, key, counts):
value = counts.get(loan_id, {
"foreclosure_status": False,
"performance_count": 0
})
return value[key]
```
The above function will return the appropriate value from the counts dictionary, and will enable us to assign a foreclosure_status value and a performance_count value to each row in the Acquisition data. The [get][33] method on dictionaries returns a default value if a key isnt found, so this enables us to return sensible default values if a key isnt found in the counts dictionary.
### Annotating the data
Weve already added a few functions to annotate.py, but now we can get into the meat of the file. Well need to convert the acquisition data into a training dataset that can be used in a machine learning algorithm. This involves a few things:
- Converting all columns to numeric.
- Filling in any missing values.
- Assigning a performance_count and a foreclosure_status to each row.
- Removing any rows that dont have a lot of performance history (where performance_count is low).
Several of our columns are strings, which arent useful to a machine learning algorithm. However, they are actually categorical variables, where there are a few different category codes, like R, S, and so on. We can convert these columns to numeric by assigning a number to each category label:
![](https://github.com/LCTT/wiki-images/blob/master/TranslateProject/ref_img/002.png)
Converting the columns this way will allow us to use them in our machine learning algorithm.
Some of the columns also contain dates (first_payment_date and origination_date). We can split these dates into 2 columns each:
![](https://github.com/LCTT/wiki-images/blob/master/TranslateProject/ref_img/003.png)
In the below code, well transform the Acquisition data. Well define a function that:
- Creates a foreclosure_status column in acquisition by getting the values from the counts dictionary.
- Creates a performance_count column in acquisition by getting the values from the counts dictionary.
- Converts each of the following columns from a string column to an integer column:
- channel
- seller
- first_time_homebuyer
- loan_purpose
- property_type
- occupancy_status
- property_state
- product_type
- Converts first_payment_date and origination_date to 2 columns each:
- Splits the column on the forward slash.
- Assigns the first part of the split list to a month column.
- Assigns the second part of the split list to a year column.
- Deletes the column.
- At the end, well have first_payment_month, first_payment_year, origination_month, and origination_year.
- Fills any missing values in acquisition with -1.
```
def annotate(acquisition, counts):
acquisition["foreclosure_status"] = acquisition["id"].apply(lambda x: get_performance_summary_value(x, "foreclosure_status", counts))
acquisition["performance_count"] = acquisition["id"].apply(lambda x: get_performance_summary_value(x, "performance_count", counts))
for column in [
"channel",
"seller",
"first_time_homebuyer",
"loan_purpose",
"property_type",
"occupancy_status",
"property_state",
"product_type"
]:
acquisition[column] = acquisition[column].astype('category').cat.codes
for start in ["first_payment", "origination"]:
column = "{}_date".format(start)
acquisition["{}_year".format(start)] = pd.to_numeric(acquisition[column].str.split('/').str.get(1))
acquisition["{}_month".format(start)] = pd.to_numeric(acquisition[column].str.split('/').str.get(0))
del acquisition[column]
acquisition = acquisition.fillna(-1)
acquisition = acquisition[acquisition["performance_count"] > settings.MINIMUM_TRACKING_QUARTERS]
return acquisition
```
### Pulling everything together
Were almost ready to pull everything together; we just need to add a bit more code to annotate.py. In the below code, we:
- Define a function to read in the acquisition data.
- Define a function to write the processed data to processed/train.csv
- If this file is called from the command line, like python annotate.py:
- Read in the acquisition data.
- Compute the counts for the performance data, and assign them to counts.
- Annotate the acquisition DataFrame.
- Write the acquisition DataFrame to train.csv.
```
def read():
acquisition = pd.read_csv(os.path.join(settings.PROCESSED_DIR, "Acquisition.txt"), sep="|")
return acquisition
def write(acquisition):
acquisition.to_csv(os.path.join(settings.PROCESSED_DIR, "train.csv"), index=False)
if __name__ == "__main__":
acquisition = read()
counts = count_performance_rows()
acquisition = annotate(acquisition, counts)
write(acquisition)
```
Once youre done updating the file, make sure to run it with python annotate.py, to generate the train.csv file. You can find the complete annotate.py file [here][34].
The folder should now look like this:
```
loan-prediction
├── data
│ ├── Acquisition_2012Q1.txt
│ ├── Acquisition_2012Q2.txt
│ ├── Performance_2012Q1.txt
│ ├── Performance_2012Q2.txt
│ └── ...
├── processed
│ ├── Acquisition.txt
│ ├── Performance.txt
│ ├── train.csv
├── .gitignore
├── annotate.py
├── assemble.py
├── README.md
├── requirements.txt
├── settings.py
```
### Finding an error metric
Were done with generating our training dataset, and now well just need to do the final step, generating predictions. Well need to figure out an error metric, as well as how we want to evaluate our data. In this case, there are many more loans that arent foreclosed on than are, so typical accuracy measures dont make much sense.
If we read in the training data, and check the counts in the foreclosure_status column, heres what we get:
```
import pandas as pd
import settings
train = pd.read_csv(os.path.join(settings.PROCESSED_DIR, "train.csv"))
train["foreclosure_status"].value_counts()
```
```
False 4635982
True 1585
Name: foreclosure_status, dtype: int64
```
Since so few of the loans were foreclosed on, just checking the percentage of labels that were correctly predicted will mean that we can make a machine learning model that predicts False for every row, and still get a very high accuracy. Instead, well want to use a metric that takes the class imbalance into account, and ensures that we predict foreclosures accurately. We dont want too many false positives, where we predict that a loan will be foreclosed on even though it wont, or too many false negatives, where we predict that a loan wont be foreclosed on, but it is. Of these two, false negatives are more costly for Fannie Mae, because theyre buying loans where they may not be able to recoup their investment.
Well define false negative rate as the number of loans where the model predicts no foreclosure but the loan was actually foreclosed on, divided by the total number of loans that were actually foreclosed on. This is the percentage of actual foreclosures that the model “missed”. Heres a diagram:
![](https://github.com/LCTT/wiki-images/blob/master/TranslateProject/ref_img/004.png)
In the diagram above, 1 loan was predicted as not being foreclosed on, but it actually was. If we divide this by the number of loans that were actually foreclosed on, 2, we get the false negative rate, 50%. Well use this as our error metric, so we can evaluate our models performance.
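Written as a formula, the example above works out like this:

```
\text{false negative rate} = \frac{\text{missed foreclosures}}{\text{actual foreclosures}} = \frac{1}{2} = 50\%
```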
### Setting up the classifier for machine learning
Well use cross validation to make predictions. With cross validation, well divide our data into 3 groups. Then well do the following:
- Train a model on groups 1 and 2, and use the model to make predictions for group 3.
- Train a model on groups 1 and 3, and use the model to make predictions for group 2.
- Train a model on groups 2 and 3, and use the model to make predictions for group 1.
Splitting it up into groups this way means that we never train a model using the same data were making predictions for. This avoids overfitting. If we overfit, well get a falsely low false negative rate, which makes it hard to improve our algorithm or use it in the real world.
[Scikit-learn][35] has a function called [cross_val_predict][36] which will make it easy to perform cross validation.
Well also need to pick an algorithm to use to make predictions. We need a classifier that can do [binary classification][37]. The target variable, foreclosure_status only has two values, True and False.
Well use [logistic regression][38], because it works well for binary classification, runs extremely quickly, and uses little memory. This is due to how the algorithm works: instead of constructing dozens of trees, like a random forest, or doing expensive transformations, like a support vector machine, logistic regression has far fewer steps involving fewer matrix operations.
We can use the [logistic regression classifier][39] algorithm thats implemented in scikit-learn. The only thing we need to pay attention to is the weights of each class. If we weight the classes equally, the algorithm will predict False for every row, because it is trying to minimize errors. However, we care much more about foreclosures than we do about loans that arent foreclosed on. Thus, well pass balanced to the class_weight keyword argument of the [LogisticRegression][40] class, to get the algorithm to weight the foreclosures more to account for the difference in the counts of each class. This will ensure that the algorithm doesnt predict False for every row, and instead is penalized equally for making errors in predicting either class.
### Making predictions
Now that we have the preliminaries out of the way, were ready to make predictions. Well create a new file called predict.py that will use the train.csv file we created in the last step. The below code will:
- Import needed libraries.
- Create a function called cross_validate that:
- Creates a logistic regression classifier with the right keyword arguments.
- Creates a list of columns that we want to use to train the model, removing id and foreclosure_status.
- Runs cross validation across the train DataFrame.
- Returns the predictions.
```
import os
import settings
import pandas as pd
from sklearn import cross_validation
from sklearn.linear_model import LogisticRegression
from sklearn import metrics

def cross_validate(train):
    # Weight the rare foreclosure class more heavily so the model
    # doesn't just predict False for every row.
    clf = LogisticRegression(random_state=1, class_weight="balanced")
    # Train on every column except the non-predictors (id and foreclosure_status).
    predictors = train.columns.tolist()
    predictors = [p for p in predictors if p not in settings.NON_PREDICTORS]
    # cross_val_predict trains on all but one fold and predicts the held-out fold,
    # so every row gets a prediction from a model that never saw it.
    predictions = cross_validation.cross_val_predict(clf, train[predictors], train[settings.TARGET], cv=settings.CV_FOLDS)
    return predictions
```
### Predicting error
Now, we just need to write a few functions to compute error. The below code will:
- Create a function called compute_error that:
- Uses scikit-learn to compute a simple accuracy score (the percentage of predictions that matched the actual foreclosure_status values).
- Create a function called compute_false_negatives that:
- Combines the target and the predictions into a DataFrame for convenience.
- Finds the false negative rate.
- Create a function called compute_false_positives that:
- Combines the target and the predictions into a DataFrame for convenience.
- Finds the false positive rate.
- Finds the number of loans that weren't foreclosed on that the model predicted would be foreclosed on.
- Divides by the total number of loans that weren't foreclosed on.
```
def compute_error(target, predictions):
    # Plain accuracy -- misleading on its own, given the class imbalance.
    return metrics.accuracy_score(target, predictions)

def compute_false_negatives(target, predictions):
    df = pd.DataFrame({"target": target, "predictions": predictions})
    # Actual foreclosures the model missed, as a share of all actual foreclosures.
    # The + 1 in the denominator guards against division by zero.
    return df[(df["target"] == 1) & (df["predictions"] == 0)].shape[0] / (df[(df["target"] == 1)].shape[0] + 1)

def compute_false_positives(target, predictions):
    df = pd.DataFrame({"target": target, "predictions": predictions})
    # Predicted foreclosures that didn't happen, as a share of all non-foreclosures.
    return df[(df["target"] == 0) & (df["predictions"] == 1)].shape[0] / (df[(df["target"] == 0)].shape[0] + 1)
```
### Putting it all together
Now, we just have to put the functions together in predict.py. The below code will:
- Read in the dataset.
- Compute cross validated predictions.
- Compute the 3 error metrics above.
- Print the error metrics.
```
def read():
    # Load the training data assembled in the previous step.
    train = pd.read_csv(os.path.join(settings.PROCESSED_DIR, "train.csv"))
    return train

if __name__ == "__main__":
    train = read()
    predictions = cross_validate(train)
    error = compute_error(train[settings.TARGET], predictions)
    fn = compute_false_negatives(train[settings.TARGET], predictions)
    fp = compute_false_positives(train[settings.TARGET], predictions)
    print("Accuracy Score: {}".format(error))
    print("False Negatives: {}".format(fn))
    print("False Positives: {}".format(fp))
```
Once you've added the code, you can run python predict.py to generate predictions. Running everything shows that our false negative rate is 0.26, which means that of the foreclosed loans, we missed predicting 26% of them. This is a good start, but there's still plenty of room for improvement!
You can find the complete predict.py file [here][41].
Your file tree should now look like this:
```
loan-prediction
├── data
│   ├── Acquisition_2012Q1.txt
│   ├── Acquisition_2012Q2.txt
│   ├── Performance_2012Q1.txt
│   ├── Performance_2012Q2.txt
│   └── ...
├── processed
│   ├── Acquisition.txt
│   ├── Performance.txt
│   └── train.csv
├── .gitignore
├── annotate.py
├── assemble.py
├── predict.py
├── README.md
├── requirements.txt
└── settings.py
```
### Writing up a README
Now that we've finished our end-to-end project, we just have to write up a README.md file so that other people know what we did, and how to replicate it. A typical README.md for a project should include these sections:
- A high level overview of the project, and what the goals are.
- Where to download any needed data or materials.
- Installation instructions.
- How to install the requirements.
- Usage instructions.
- How to run the project.
- What you should see after each step.
- How to contribute to the project.
- Good next steps for extending the project.
[Here's][42] a sample README.md for this project.
### Next steps
Congratulations, you're done making an end-to-end machine learning project! You can find a complete example project [here][43]. It's a good idea to upload your project to [Github][44] once you've finished it, so others can see it as part of your portfolio.
There are still quite a few angles left to explore with this data. Broadly, we can split them into 3 categories: extending this project and making it more accurate, finding other columns to predict, and exploring the data. Here are some ideas:
- Generate more features in annotate.py.
- Switch algorithms in predict.py.
- Try using more data from Fannie Mae than we used in this post.
- Add in a way to make predictions on future data. The code we wrote will still work if we add more data, so we can add more past or future data.
- Try seeing if you can predict if a bank should have issued the loan originally (vs if Fannie Mae should have acquired the loan).
- Remove any columns from train that the bank wouldn't have known at the time of issuing the loan.
- Some columns are known when Fannie Mae bought the loan, but not before.
- Make predictions.
- Explore seeing if you can predict columns other than foreclosure_status.
- Can you predict how much the property will be worth at sale time?
- Explore the nuances between performance updates.
- Can you predict how many times the borrower will be late on payments?
- Can you map out the typical loan lifecycle?
- Map out data on a state by state or zip code by zip code level.
- Do you see any interesting patterns?
If you build anything interesting, please let us know in the comments!
If you liked this, you might like to read the other posts in our Build a Data Science Portfolio series:
- [Storytelling with data][45].
- [How to setup up a data science blog][46].
--------------------------------------------------------------------------------
via: https://www.dataquest.io/blog/data-science-portfolio-machine-learning/
作者:[Vik Paruchuri][a]
译者:[译者ID](https://github.com/译者ID)
校对:[校对ID](https://github.com/校对ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://www.dataquest.io/blog
[1]: https://www.dataquest.io/blog/data-science-portfolio-machine-learning/#email-signup
[2]: https://github.com/dataquestio/loan-prediction
[3]: https://www.dataquest.io/blog/data-science-portfolio-project/
[4]: https://atom.io/
[5]: https://www.jetbrains.com/pycharm/
[6]: https://github.com/
[7]: http://pandas.pydata.org/
[8]: http://scikit-learn.org/
[9]: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html
[10]: https://collegescorecard.ed.gov/data/
[11]: https://reddit.com/r/datasets
[12]: https://cloud.google.com/bigquery/public-data/#usa-names
[13]: https://github.com/caesar0301/awesome-public-datasets
[14]: http://www.fanniemae.com/portal/funding-the-market/data/loan-performance-data.html
[15]: http://www.fanniemae.com/portal/funding-the-market/data/loan-performance-data.html
[16]: https://loanperformancedata.fanniemae.com/lppub-docs/lppub_glossary.pdf
[17]: https://loanperformancedata.fanniemae.com/lppub-docs/lppub_faq.pdf
[18]: https://loanperformancedata.fanniemae.com/lppub-docs/lppub_file_layout.pdf
[19]: https://loanperformancedata.fanniemae.com/lppub-docs/acquisition-sample-file.txt
[20]: https://loanperformancedata.fanniemae.com/lppub-docs/performance-sample-file.txt
[21]: https://github.com/dataquestio/loan-prediction/blob/master/.gitignore
[22]: https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet
[23]: https://github.com/dataquestio/loan-prediction
[24]: https://github.com/dataquestio/loan-prediction/blob/master/requirements.txt
[25]: https://www.continuum.io/downloads
[26]: https://loanperformancedata.fanniemae.com/lppub/index.html
[27]: https://loanperformancedata.fanniemae.com/lppub-docs/lppub_file_layout.pdf
[28]: https://github.com/dataquestio/loan-prediction/blob/master/settings.py
[29]: https://loanperformancedata.fanniemae.com/lppub-docs/lppub_file_layout.pdf
[30]: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html
[31]: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html
[32]: https://github.com/dataquestio/loan-prediction/blob/master/assemble.py
[33]: https://docs.python.org/3/library/stdtypes.html#dict.get
[34]: https://github.com/dataquestio/loan-prediction/blob/master/annotate.py
[35]: http://scikit-learn.org/
[36]: http://scikit-learn.org/stable/modules/generated/sklearn.cross_validation.cross_val_predict.html
[37]: https://en.wikipedia.org/wiki/Binary_classification
[38]: https://en.wikipedia.org/wiki/Logistic_regression
[39]: http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html
[40]: http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html
[41]: https://github.com/dataquestio/loan-prediction/blob/master/predict.py
[42]: https://github.com/dataquestio/loan-prediction/blob/master/README.md
[43]: https://github.com/dataquestio/loan-prediction
[44]: https://www.github.com/
[45]: https://www.dataquest.io/blog/data-science-portfolio-project/
[46]: https://www.dataquest.io/blog/how-to-setup-a-data-science-blog/

View File

@ -1,100 +0,0 @@
How to Encrypt a Flash Drive Using VeraCrypt
============================================
Many security experts prefer open source software like VeraCrypt, which can be used to encrypt flash drives, because of its readily available source code.
Encryption is a smart idea for protecting data on a USB flash drive, as we covered in our piece that described [how to encrypt a flash drive][1] using Microsoft BitLocker.
But what if you do not want to use BitLocker?
You may be concerned that because Microsoft's source code is not available for inspection, it could be susceptible to security "backdoors" used by the government or others. Because source code for open source software is widely shared, many security experts feel open source software is far less likely to have any backdoors.
Fortunately, there are several open source encryption alternatives to BitLocker.
If you need to be able to encrypt and access files on any Windows machine, as well as computers running Apple OS X or Linux, the open source [VeraCrypt][2] offers an excellent alternative.
VeraCrypt is derived from TrueCrypt, a well-regarded open source encryption software product that has now been discontinued. But the code for TrueCrypt was audited and no major security flaws were found. In addition, it has since been improved in VeraCrypt.
Versions exist for Windows, OS X and Linux.
Encrypting a USB flash drive with VeraCrypt is not as straightforward as it is with BitLocker, but it still only takes a few minutes.
### Encrypting Flash Drive with VeraCrypt in 8 Steps
After [downloading VeraCrypt][3] for your operating system:
Start VeraCrypt, and click on Create Volume to start the VeraCrypt Volume Creation Wizard.
![](http://www.esecurityplanet.com/imagesvr_ce/6246/Vera0.jpg)
The VeraCrypt Volume Creation Wizard allows you to create an encrypted file container on the flash drive which sits along with other unencrypted files, or you can choose to encrypt the entire flash drive. For the moment, we will choose to encrypt the entire flash drive.
![](http://www.esecurityplanet.com/imagesvr_ce/6703/Vera1.jpg)
On the next screen, choose Standard VeraCrypt Volume.
![](http://www.esecurityplanet.com/imagesvr_ce/835/Vera2.jpg)
Select the drive letter of the flash drive you want to encrypt (in this case O:).
![](http://www.esecurityplanet.com/imagesvr_ce/9427/Vera3.jpg)
Choose the Volume Creation Mode. If your flash drive is empty or you want to delete everything it contains, choose the first option. If you want to keep any existing files, choose the second option.
![](http://www.esecurityplanet.com/imagesvr_ce/7828/Vera4.jpg)
This screen allows you to choose your encryption options. If you are unsure of which to choose, leave the default settings of AES and SHA-512.
![](http://www.esecurityplanet.com/imagesvr_ce/5918/Vera5.jpg)
After confirming the Volume Size screen, enter and re-enter the password you want to use to encrypt your data.
![](http://www.esecurityplanet.com/imagesvr_ce/3850/Vera6.jpg)
To work effectively, VeraCrypt must draw from a pool of entropy or "randomness." To generate this pool, you'll be asked to move your mouse around in a random fashion for about a minute. Once the bar has turned green, or preferably when it reaches the far right of the screen, click Format to finish creating your encrypted drive.
![](http://www.esecurityplanet.com/imagesvr_ce/7468/Vera8.jpg)
### Using a Flash Drive Encrypted with VeraCrypt
When you want to use an encrypted flash drive, first insert the drive in the computer and start VeraCrypt.
Then select an unused drive letter (such as z:) and click Auto-Mount Devices.
![](http://www.esecurityplanet.com/imagesvr_ce/2016/Vera10.jpg)
Enter your password and click OK.
![](http://www.esecurityplanet.com/imagesvr_ce/8222/Vera11.jpg)
The mounting process may take a few minutes, after which your unencrypted drive will become available with the drive letter you selected previously.
### VeraCrypt Traveler Disk Setup
If you set up a flash drive with an encrypted container rather than encrypting the whole drive, you also have the option to create what VeraCrypt calls a traveler disk. This installs a copy of VeraCrypt on the USB flash drive itself, so when you insert the drive in another Windows computer you can run VeraCrypt automatically from the flash drive; there is no need to install it on the computer.
You can set up a flash drive to be a Traveler Disk by choosing Traveler Disk SetUp from the Tools menu of VeraCrypt.
![](http://www.esecurityplanet.com/imagesvr_ce/5812/Vera12.jpg)
It is worth noting that in order to run VeraCrypt from a Traveler Disk on a computer, you must have administrator privileges on that computer. While that may seem to be a limitation, remember that no confidential files can be opened safely on a computer that you do not control, such as one in a business center.
>Paul Rubens has been covering enterprise technology for over 20 years. In that time he has written for leading UK and international publications including The Economist, The Times, Financial Times, the BBC, Computing and ServerWatch.
--------------------------------------------------------------------------------
via: http://www.esecurityplanet.com/open-source-security/how-to-encrypt-flash-drive-using-veracrypt.html
作者:[Paul Rubens ][a]
译者:[译者ID](https://github.com/译者ID)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: http://www.esecurityplanet.com/author/3700/Paul-Rubens
[1]: http://www.esecurityplanet.com/views/article.php/3880616/How-to-Encrypt-a-USB-Flash-Drive.htm
[2]: http://www.esecurityplanet.com/open-source-security/veracrypt-a-worthy-truecrypt-alternative.html
[3]: https://veracrypt.codeplex.com/releases/view/619351

View File

@ -1,96 +0,0 @@
MikeCoder Translating...
Doing for User Space What We Did for Kernel Space
=======================================================
I believe the best and worst thing about Linux is its hard distinction between kernel space and user space.
Without that distinction, Linux never would have become the most leveraged operating system in the world. Today, Linux has the largest range of uses for the largest number of users—most of whom have no idea they are using Linux when they search for something on Google or poke at their Android phones. Even Apple stuff wouldn't be what it is (for example, using BSD in its computers) were it not for Linux's success.
Not caring about user space is a feature of Linux kernel development, not a bug. As Linus put it on our 2003 Geek Cruise, "I only do kernel stuff...I don't know what happens outside the kernel, and I don't much care. What happens inside the kernel I care about." After Andrew Morton gave me additional schooling on the topic a couple years later on another Geek Cruise, I wrote:
>Kernel space is where the Linux species lives. User space is where Linux gets put to use, along with a lot of other natural building materials. The division between kernel space and user space is similar to the division between natural materials and stuff humans make out of those materials.
A natural outcome of this distinction, however, is for Linux folks to stay relatively small as a community while the world outside depends more on Linux every second. So, in hope that we can enlarge our number a bit, I want to point us toward two new things. One is already hot, and the other could be.
The first is [blockchain][1], made famous as the distributed ledger used by Bitcoin, but useful for countless other purposes as well. At the time of this writing, interest in blockchain is [trending toward the vertical][2].
![](http://www.linuxjournal.com/files/linuxjournal.com/ufiles/imagecache/large-550px-centered/u1000009/12042f1.png)
>Figure 1. Google Trends for Blockchain
The second is self-sovereign identity. To explain that, let me ask who and what you are.
If your answers come from your employer, your doctor, the Department of Motor Vehicles, Facebook, Twitter or Google, they are each administrative identifiers: entries in namespaces each of those organizations control, entirely for their own convenience. As Timothy Ruff of [Evernym][3] explains, "You don't exist for them. Only your identifier does." It's the dependent variable. The independent variable—the one controlling the identifier—is the organization.
If your answer comes from your self, we have a wide-open area for a new development category—one where, finally, we can be set fully free in the connected world.
The first person to explain this, as far as I know, was [Devon Loffreto][4]. He wrote "What is 'Sovereign Source Authority'?" in February 2012, on his blog, [The Moxy Tongue][5]. In "[Self-Sovereign Identity][6]", published in February 2016, he writes:
>Self-Sovereign Identity must emit directly from an individual human life, and not from within an administrative mechanism...self-Sovereign Identity references every individual human identity as the origin of source authority. A self-Sovereign identity produces an administrative trail of data relations that begin and resolve to individual humans. Every individual human may possess a self-Sovereign identity, and no person or abstraction of any type created may alter this innate human Right. A self-Sovereign identity is the root of all participation as a valued social being within human societies of any type.
To put this in Linux terms, only the individual has root for his or her own source identity. In the physical world, this is a casual thing. For example, my own portfolio of identifiers includes:
- David Allen Searls, which my parents named me.
- David Searls, the name I tend to use when I suspect official records are involved.
- Dave, which is what most of my relatives and old friends call me.
- Doc, which is what most people call me.
As the sovereign source authority over the use of those, I can jump from one to another in different contexts and get along pretty well. But, that's in the physical world. In the virtual one, it gets much more complicated. In addition to all the above, I am @dsearls (my Twitter handle) and dsearls (my handle in many other net-based services). I am also burdened by having my ability to relate contained within hundreds of different silos, each with their own logins and passwords.
You can get a sense of how bad this is by checking the list of logins and passwords on your browser. On Firefox alone, I have hundreds of them. Many are defunct (since my collection dates back to Netscape days), but I would guess that I still have working logins to hundreds of companies I need to deal with from time to time. For all of them, I'm the dependent variable. It's not the other way around. Even the term "user" testifies to the subordinate dependency that has become a primary fact of life in the connected world.
Today, the only easy way to bridge namespaces is via the compromised convenience of "Log in with Facebook" or "Log in with Twitter". In both of those cases, each of us is even less ourselves or in any kind of personal control over how we are known (if we wish to be knowable at all) to other entities in the connected world.
What we have needed from the start are personal systems for instantiating our sovereign selves and choosing how to reveal and protect ourselves when dealing with others in the connected world. For lack of that ability, we are deep in a metastasized mess that Shoshana Zuboff calls "surveillance capitalism", which she says is:
>...unimaginable outside the inscrutable high velocity circuits of Google's digital universe, whose signature feature is the Internet and its successors. While the world is riveted by the showdown between Apple and the FBI, the real truth is that the surveillance capabilities being developed by surveillance capitalists are the envy of every state security agency.
Then she asks, "How can we protect ourselves from its invasive power?"
I suggest self-sovereign identity. I believe it is only there that we have both safety from unwelcome surveillance and an Archimedean place to stand in the world. From that place, we can assert full agency in our dealings with others in society, politics and business.
I came to this provisional conclusion during [ID2020][7], a gathering at the UN in May. It was gratifying to see Devon Loffreto there, since he's the guy who got the sovereign ball rolling in 2013. Here's [what I wrote about][8] it at the time, with pointers to Devon's earlier posts (such as the one sourced above).
Here are three for the field's canon:
- "[Self-Sovereign Identity][9]" by Devon Loffreto.
- "[System or Human First][10]" by Devon Loffreto.
- "[The Path to Self-Sovereign Identity][11]" by Christopher Allen.
A one-pager from Evernym, [digi.me][12], [iRespond][13] and [Respect Network][14] also was circulated there, contrasting administrative identity (which it calls the "current model") with the self-sovereign one. In it is the graphic shown in Figure 2.
![](http://www.linuxjournal.com/files/linuxjournal.com/ufiles/imagecache/large-550px-centered/u1000009/12042f2.jpg)
>Figure 2. Current Model of Identity vs. Self-Sovereign Identity
The [platform][15] for this is Sovrin, explained as a "Fully open-source, attribute-based, sovereign identity graph platform on an advanced, dedicated, permissioned, distributed ledger". There's a [white paper][16] too. The code is called [plenum][17], and it's on GitHub.
Here—and places like it—we can do for user space what we've done for the last quarter century for kernel space.
--------------------------------------------------------------------------------
via: https://www.linuxjournal.com/content/doing-user-space-what-we-did-kernel-space
作者:[Doc Searls][a]
译者:[译者ID](https://github.com/译者ID)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://www.linuxjournal.com/users/doc-searls
[1]: https://en.wikipedia.org/wiki/Block_chain_%28database%29
[2]: https://www.google.com/trends/explore#q=blockchain
[3]: http://evernym.com/
[4]: https://twitter.com/nzn
[5]: http://www.moxytongue.com/2012/02/what-is-sovereign-source-authority.html
[6]: http://www.moxytongue.com/2016/02/self-sovereign-identity.html
[7]: http://www.id2020.org/
[8]: http://blogs.harvard.edu/doc/2013/10/14/iiw-challenge-1-sovereign-identity-in-the-great-silo-forest
[9]: http://www.moxytongue.com/2016/02/self-sovereign-identity.html
[10]: http://www.moxytongue.com/2016/05/system-or-human.html
[11]: http://www.lifewithalacrity.com/2016/04/the-path-to-self-soverereign-identity.html
[12]: https://get.digi.me/
[13]: http://irespond.com/
[14]: https://www.respectnetwork.com/
[15]: http://evernym.com/technology
[16]: http://evernym.com/assets/doc/Identity-System-Essentials.pdf?v=167284fd65
[17]: https://github.com/evernym/plenum

View File

@ -1,123 +0,0 @@
translating by cvsher
What is Git
===========
Welcome to my series on learning how to use the Git version control system! In this introduction to the series, you will learn what Git is for and who should use it.
If you're just starting out in the open source world, you're likely to come across a software project that keeps its code in Git, and quite possibly releases it for use by way of Git as well. In fact, whether you know it or not, you're certainly using software right now that is developed using Git: the Linux kernel (which drives the website you're on right now, if not the desktop or mobile phone you're accessing it on), Firefox, Chrome, and many more projects share their codebase with the world in a Git repository.
On the other hand, all the excitement and hype over Git tends to make things a little muddy. Can you only use Git to share your code with others, or can you use Git in the privacy of your own home or business? Do you have to have a GitHub account to use Git? Why use Git at all? What are the benefits of Git? Is Git the only option?
So forget what you know or what you think you know about Git, and let's take it from the beginning.
### What is version control?
Git is, first and foremost, a version control system (VCS). There are many version control systems out there: CVS, SVN, Mercurial, Fossil, and, of course, Git.
Git serves as the foundation for many services, like GitHub and GitLab, but you can use Git without using any other service. This means that you can use Git privately or publicly.
If you have ever collaborated on anything digital with anyone, then you know how it goes. It starts out simple: you have your version, and you send it to your partner. They make some changes, so now there are two versions, and send the suggestions back to you. You integrate their changes into your version, and now there is one version again.
Then it gets worse: while you change your version further, your partner makes more changes to their version. Now you have three versions; the merged copy that you both worked on, the version you changed, and the version your partner has changed.
As Jason van Gumster points out in his article, [Even artists need version control][1], this syndrome tends to happen in individual settings as well. In both art and science, it's not uncommon to develop a trial version of something; a version of your project that might make it a lot better, or that might fail miserably. So you create file names like project_justTesting.kdenlive and project_betterVersion.kdenlive, and then project_best_FINAL.kdenlive, but with the inevitable allowance for project_FINAL-alternateVersion.kdenlive, and so on.
Whether it's a change to a for loop or an editing change, it happens to the best of us. That is where a good version control system makes life easier.
### Git snapshots
Git takes snapshots of a project, and stores those snapshots as unique versions.
If you go off in a direction with your project that you decide was the wrong direction, you can just roll back to the last good version and continue along an alternate path.
If you're collaborating, then when someone sends you changes, you can merge those changes into your working branch, and then your collaborator can grab the merged version of the project and continue working from the new current version.
Git isn't magic, so conflicts do occur ("You changed the last line of the book, but I deleted that line entirely; how do we resolve that?"), but on the whole, Git enables you to manage the many potential variants of a single work, retains the history of all the changes, and even allows for parallel versions.
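When a conflict does occur, Git marks the disputed region directly in the file so you can resolve it by hand. Here is a sketch of what that looks like for the example above (the content and branch name are illustrative); you keep the side you want, delete the marker lines, and commit the result:

```
<<<<<<< HEAD
The last line of the book, as you rewrote it.
=======
>>>>>>> partner-version
```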
### Git distributes
Working on a project on separate machines is complex, because you want to have the latest version of a project while you work, make your own changes, and share your changes with your collaborators. The default methods of doing this tend to be clunky online file sharing services, or old school email attachments, both of which are inefficient and error-prone.
Git is designed for distributed development. If you're involved with a project, you can clone the project's Git repository and then work on it as if it were the only copy in existence. Then, with a few simple commands, you can pull in any changes from other contributors, and you can also push your changes over to someone else. Now there is no confusion about who has what version of a project, or whose changes exist where. It is all locally developed, and pushed and pulled toward a common target (or not, depending on how the project chooses to develop).
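In day-to-day use, that workflow is just a few commands; a minimal sketch (the URL and branch here are placeholders):

```
$ git clone https://example.com/project.git   # get your own complete copy
$ git pull origin master                      # bring in collaborators' changes
$ git push origin master                      # share your commits with the common target
```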
### Git interfaces
In its natural state, Git is an application that runs in the Linux terminal. However, as it is well-designed and open source, developers all over the world have designed other ways to access it.
It is free, available to anyone for $0, and comes in packages on Linux, BSD, Illumos, and other Unix-like operating systems. It looks like this:
```
$ git --version
git version 2.5.3
```
Probably the most well-known Git interfaces are web-based: sites like GitHub, the open source GitLab, Savannah, BitBucket, and SourceForge all offer online code hosting to maximise the public and social aspect of open source along with, in varying degrees, browser-based GUIs to minimise the learning curve of using Git. This is what the GitLab interface looks like:
![](https://opensource.com/sites/default/files/0_gitlab.png)
Additionally, it is possible that a Git service or independent developer may even have a custom Git frontend that is not HTML-based, which is particularly handy if you don't live with a browser eternally open. The most transparent integration comes in the form of file manager support. The KDE file manager, Dolphin, can show the Git status of a directory, and even generate commits, pushes, and pulls.
![](https://opensource.com/sites/default/files/0_dolphin.jpg)
[Sparkleshare][2] uses Git as a foundation for its own Dropbox-style file sharing interface.
![](https://opensource.com/sites/default/files/0_sparkleshare_1.jpg)
For more, see the (long) page on the official [Git wiki][3] listing projects with graphical interfaces to Git.
### Who should use Git?
You should! The real question is when? And what for?
### When should I use Git, and what should I use it for?
To get the most out of Git, you need to think a little bit more than usual about file formats.
Git is designed to manage source code, which in most languages consists of lines of text. Of course, Git doesn't know if you're feeding it source code or the next Great American Novel, so as long as it breaks down to text, Git is a great option for managing and tracking versions.
But what is text? If you write something in an office application like LibreOffice, then you're probably not generating raw text. Complex applications like that usually wrap the raw text in XML markup, and then in a zip container, as a way to ensure that all of the assets for your office file are available when you send that file to someone else. Strangely, though, files that you might expect to be very complex, like the save files for a [Kdenlive][4] project, or an SVG from [Inkscape][5], are actually raw XML files that can easily be managed by Git.
If you use Unix, you can check to see what a file is made of with the file command:
```
$ file ~/path/to/my-file.blah
my-file.blah: ASCII text
$ file ~/path/to/different-file.kra
different-file.kra: Zip data (MIME type "application/x-krita")
```
If unsure, you can view the contents of a file with the head command:
```
$ head ~/path/to/my-file.blah
```
If you see text that is mostly readable by you, then it is probably a file made of text. If you see garbage with some familiar text characters here and there, it is probably not made of text.
Make no mistake: Git can manage other formats of files, but it treats them as blobs. The difference is that in a text file, two Git snapshots (or commits, as we call them) might be, say, three lines different from each other. If you have a photo that has been altered between two different commits, how can Git express that change? It can't, really, because photographs are not made of any kind of sensible text that can just be inserted or removed. I wish photo editing were as easy as just changing some text from "<sky>ugly greenish-blue</sky>" to "<sky>blue-with-fluffy-clouds</sky>" but it truly is not.
People check in blobs, like PNG icons or a spreadsheet or a flowchart, to Git all the time, so if you're working in Git then don't be afraid to do that. Know that it's not sensible to do that with huge files, though. If you are working on a project that generates both text files and large blobs (a common scenario with video games, which have equal parts source code and graphical and audio assets), then you can do one of two things: either invent your own solution, such as pointers to a shared network drive, or use a Git add-on like Joey Hess's excellent [git annex][6], or the [Git-Media][7] project.
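If you do go the git annex route, day-to-day usage stays close to plain Git; a minimal sketch, assuming git annex is installed:

```
$ git annex init               # turn an existing Git repository into an annex
$ git annex add bigvideo.mp4   # track the large file by pointer instead of as a blob
$ git commit -m 'add video asset via git annex'
```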
So you see, Git really is for everyone. It is a great way to manage versions of your files, it is a powerful tool, and it is not as scary as it first seems.
--------------------------------------------------------------------------------
via: https://opensource.com/resources/what-is-git
作者:[Seth Kenlon ][a]
译者:[译者ID](https://github.com/译者ID)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://opensource.com/users/seth
[1]: https://opensource.com/life/16/2/version-control-isnt-just-programmers
[2]: http://sparkleshare.org/
[3]: https://git.wiki.kernel.org/index.php/InterfacesFrontendsAndTools#Graphical_Interfaces
[4]: https://opensource.com/life/11/11/introduction-kdenlive
[5]: http://inkscape.org/
[6]: https://git-annex.branchable.com/
[7]: https://github.com/alebedev/git-media

View File

@ -1,181 +0,0 @@
vim-kakali translating
Creating your first Git repository
======================================
![](https://opensource.com/sites/default/files/styles/image-full-size/public/images/life/open_abstract_pieces.jpg?itok=ZRt0Db00)
Now it is time to learn how to create your own Git repository, and how to add files and make commits.
In the previous installments in this series, you learned how to interact with Git as an end user; you were the aimless wanderer who stumbled upon an open source project's website, cloned a repository, and moved on with your life. You learned that interacting with Git wasn't as confusing as you may have thought it would be, and maybe you've been convinced that it's time to start leveraging Git for your own work.
While Git is definitely the tool of choice for major software projects, it doesn't only work with major software projects. It can manage your grocery lists (if they're that important to you, and they may be!), your configuration files, a journal or diary, a novel in progress, and even source code!
And it is well worth doing; after all, when have you ever been angry that you have a backup copy of something that you've just mangled beyond recognition?
Git can't work for you unless you use it, and there's no time like the present. Or, translated to Git, "There is no push like origin HEAD". You'll understand that later, I promise.
### The audio recording analogy
We tend to speak of computer imaging in terms of snapshots because most of us can identify with the idea of having a photo album filled with particular moments in time. It may be more useful, however, to think of Git more like an analogue audio recording.
A traditional studio tape deck, in case you're unfamiliar, has a few components: it contains the reels that turn either forward or in reverse, tape to preserve sound waves, and a playhead to record or detect sound waves on tape and present them to the listener.
In addition to playing a tape forward, you can rewind it to get back to a previous point in the tape, or fast-forward to skip ahead to a later point.
Imagine a band in the 1970s recording to tape. You can imagine practising a song over and over until all the parts are perfect, and then laying down a track. First, you record the drums, and then the bass, and then the guitar, and then the vocals. Each time you record, the studio engineer rewinds the tape and puts it into loop mode so that it plays the previous part as you play yours; that is, if you're on bass, you get to hear the drums in the background as you play, and then the guitarist hears the drums and bass (and cowbell) and so on. On each loop, you play over the part, and then on the following loop, the engineer hits the record button and lays the performance down on tape.
You can also copy and swap out a reel of tape entirely, should you decide to do a re-mix of something you're working on.
Now that I've hopefully painted a vivid Roger Dean-quality image of studio life in the 70s, let's translate that into Git.
### Create a Git repository
The first step is to go out and buy some tape for our virtual tape deck. In Git terms, that's the repository; it's the medium or domain where all the work is going to live.
Any directory can become a Git repository, but to begin with let's start a fresh one. It takes three commands:
- Create the directory (you can do that in your GUI file manager, if you prefer).
- Visit that directory in a terminal.
- Initialise it as a directory managed by Git.
Specifically, run these commands:
```
$ mkdir ~/jupiter # make directory
$ cd ~/jupiter # change into the new directory
$ git init . # initialise your new Git repo
```
In this example, the folder jupiter is now an empty but valid Git repository.
That's all it takes. You can clone the repository, you can go backward and forward in history (once it has a history), create alternate timelines, and everything else Git can normally do.
Working inside the Git repository is the same as working in any directory; create files, copy files into the directory, save files into it. You can do everything as normal; Git doesn't get involved until you involve it.
In a local Git repository, a file can have one of three states:
- Untracked: a file you create in a repository, but not yet added to Git.
- Tracked: a file that has been added to Git.
- Staged: a tracked file that has been changed and added to Git's commit queue.
Any file that you add to a Git repository starts life out as an untracked file. The file exists on your computer, but you have not told Git about it yet. In our tape deck analogy, the tape deck isn't even turned on yet; the band is just noodling around in the studio, nowhere near ready to record yet.
That is perfectly acceptable, and Git will let you know when it happens:
```
$ echo "hello world" > foo
$ git status
On branch master
Untracked files:
(use "git add <file>..." to include in what will be committed)
foo
nothing added but untracked files present (use "git add" to track)
```
As you can see, Git also tells you how to start tracking files.
### Git without Git
Creating a repository in GitHub or GitLab is a lot more clicky and pointy. It isn't difficult; you click the New Repository button and follow the prompts.
It is a good practice to include a README file so that people wandering by have some notion of what your repository is for, and it is a little more satisfying to clone a non-empty repository.
Cloning the repository is no different than usual, but obtaining permission to write back into that repository on GitHub is slightly more complex, because in order to authenticate to GitHub you must have an SSH key. If you're on Linux, create one with this command:
```
$ ssh-keygen
```
Then copy your new key, which is plain text. You can open it in a plain text editor, or use the cat command:
```
$ cat ~/.ssh/id_rsa.pub
```
Now paste your key into [GitHub's SSH configuration][1], or your [GitLab configuration][2].
As long as you clone your GitHub project via SSH, you'll be able to write back to your repository.
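Cloning over SSH looks like this (the user and repository names are placeholders):

```
$ git clone git@github.com:yourname/yourproject.git
```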
Alternately, you can use GitHub's file uploader interface to add files without even having Git on your system.
![](https://opensource.com/sites/default/files/2_githubupload.jpg)
### Tracking files
As the output of git status tells you, if you want Git to start tracking a file, you must git add it. The git add action places a file in a special staging area, where files wait to be committed, or preserved for posterity in a snapshot. The point of a git add is to differentiate between files that you want to have included in a snapshot, and the new or temporary files you want Git to, at least for now, ignore.
In our tape deck analogy, this action turns the tape deck on and arms it for recording. You can picture the tape deck with the record and pause button pushed, or in a playback loop awaiting the next track to be laid down.
Once you add a file, Git will identify it as a tracked file:
```
$ git add foo
$ git status
On branch master
Changes to be committed:
(use "git reset HEAD <file>..." to unstage)
new file: foo
```
Adding a file to Git's tracking system is not making a recording. It just puts a file on the stage in preparation for recording. You can still change a file after you've added it; it's being tracked and remains staged, so you can continue to refine it or change it before committing it to tape (but be warned; you're NOT recording yet, so if you break something in a file that was perfect, there's no going back in time yet, because you never got that perfect moment on tape).
If you decide that the file isn't really ready to be recorded in the annals of Git history, then you can unstage something, just as the Git message described:
```
$ git reset HEAD foo
```
This, in effect, disarms the tape deck from being ready to record, and you're back to just noodling around in the studio.
### The big commit
At some point, you're going to want to commit something; in our tape deck analogy, that means finally pressing record and laying a track down on tape.
At different stages of a project's life, how often you press that record button varies. For example, if you're hacking your way through a new Python toolkit and finally manage to get a window to appear, then you'll certainly want to commit so you have something to fall back on when you inevitably break it later as you try out new display options. But if you're working on a rough draft of some new graphics in Inkscape, you might wait until you have something you want to develop from before committing. Ultimately, though, it's up to you how often you commit; Git doesn't "cost" that much and hard drives these days are big, so in my view, the more the better.
A commit records all staged files in a repository. Git only records files that are tracked (that is, any file that you did a git add on at some point in the past) and that have been modified since the previous commit. If no previous commit exists, then all tracked files are included in the commit, because they went from not existing to existing, which is a pretty major modification from Git's point of view.
To make a commit, run this command:
```
$ git commit -m 'My great project, first commit.'
```
This preserves all files committed for posterity (or, if you speak Gallifreyan, they become "fixed points in time"). You can see not only the commit event, but also the reference pointer back to that commit in your Git log:
```
$ git log --oneline
55df4c2 My great project, first commit.
```
For a more detailed report, just use git log without the --oneline option.
The reference number for the commit in this example is 55df4c2. It's called a commit hash and it represents all of the new material you just recorded, overlaid onto previous recordings. If you need to "rewind" back to that point in history, you can use that hash as a reference.
You can think of a commit hash as [SMPTE timecode][3] on an audio tape, or if we bend the analogy a little, one of those big gaps between songs on a vinyl record, or track numbers on a CD.
As you change files further and add them to the stage, and ultimately commit them, you accrue new commit hashes, each of which serves as a pointer to a different version of your production.
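For example, you can revisit the commit from the log above and then return to the present (your hash will differ):

```
$ git checkout 55df4c2   # rewind to the old snapshot to look around
$ git checkout master    # jump back to the latest version
```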
And that's why they call Git a version control system, Charlie Brown.
In the next article, we'll explore everything you need to know about the Git HEAD, and we'll nonchalantly reveal the secret of time travel. No big deal, but you'll want to read it (or maybe you already have?).
--------------------------------------------------------------------------------
via: https://opensource.com/life/16/7/creating-your-first-git-repository
作者:[Seth Kenlon][a]
译者:[译者ID](https://github.com/译者ID)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://opensource.com/users/seth
[1]: https://github.com/settings/keys
[2]: https://gitlab.com/profile/keys
[3]: http://slackermedia.ml/handbook/doku.php?id=timecode

View File

@ -1,71 +0,0 @@
GNU KHATA: OPEN SOURCE ACCOUNTING SOFTWARE
============================================
Being an active Linux enthusiast, I usually introduce my friends to Linux, help them choose the best distro to suit their needs, and finally get them set with open source alternative software for their work.
But in one case, I was pretty helpless. My uncle, who is a freelance accountant, uses a set of some pretty sophisticated paid software for work. And I wasn't sure if I'd find anything under FOSS for him, until yesterday.
Abhishek suggested some [cool apps][1] for me to check out, and this particular one, GNU Khata, stuck out.
[GNU Khata][2] is an accounting tool. Or shall I say a collection of accounting tools? It is like the [Evernote][3] of economy management. It is so versatile that it can be used from personal Finance management to large scale business management, from store inventory management to corporate tax works.
One interesting fact for you. Khata in Hindi and other Indian languages means account and hence this accounting software is called GNU Khata.
### INSTALLATION
There are many installation instructions floating around the internet which actually install the older web app version of GNU Khata. Currently, GNU Khata is available only for Debian/Ubuntu and their derivatives. I suggest you follow the steps given in GNU Khata official Website to install the updated standalone. Let me give them out real quick.
- Download the installer [here][4].
- Open the terminal in download location.
- Copy and paste the below code in terminal and run.
```
sudo chmod 755 GNUKhatasetup.run
sudo ./GNUKhatasetup.run
```
- That's it. Open GNU Khata from the dash or the application menu.
### FIRST LAUNCH
GNU Khata opens up in the browser and displays the following page.
![](https://itsfoss.com/wp-content/uploads/2016/07/GNU-khata-1.jpg)
Fill in the Organization name, case and organization type, financial year and click on proceed to go to the admin setup page.
![](https://itsfoss.com/wp-content/uploads/2016/07/GNU-khata-2.jpg)
Carefully feed in your name, password, security question and the answer and click on “create and login”.
![](https://itsfoss.com/wp-content/uploads/2016/07/GNU-khata-3.jpg)
You're all set now. Use the menu bar to start using GNU Khata to manage your finances. It's that easy.
### DOES GNU KHATA REALLY RIVAL THE PAID ACCOUNTING SOFTWARE IN THE MARKET?
To begin with, GNU Khata keeps it all simple. The menu bar up top is very conveniently organized to help you work faster and better. You can choose to manage different accounts and projects and access them easily. [Their Website][5] states that GNU Khata can be “easily transformed into Indian languages”. Also, did you know that GNU Khata can be used on the cloud too?
All the major accounting tools like ledgers, project statements, statement of affairs etc are formatted in a professional manner and are made available in both instantly presentable as well as customizable formats. It makes accounting and inventory management look so easy.
The project is very actively evolving, and it seeks feedback and guidance from practicing accountants to make improvements in the software. Considering the maturity, ease of use and the absence of a price tag, GNU Khata can be the perfect assistant in bookkeeping.
Let us know what you think about GNU Khata in the comments below.
--------------------------------------------------------------------------------
via: https://itsfoss.com/using-gnu-khata/?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+ItsFoss+%28Its+FOSS%21+An+Open+Source+Blog%29
作者:[Aquil Roshan][a]
译者:[译者ID](https://github.com/译者ID)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://itsfoss.com/author/aquil/
[1]: https://itsfoss.com/category/apps/
[2]: http://www.gnukhata.in/
[3]: https://evernote.com/
[4]: https://cloud.openmailbox.org/index.php/s/L8ppsxtsFq1345E/download
[5]: http://www.gnukhata.in/

View File

@ -1,62 +0,0 @@
maywanting
5 tricks for getting started with Vim
=====================================
![](https://opensource.com/sites/default/files/styles/image-full-size/public/images/education/BUSINESS_peloton.png?itok=nuMbW9d3)
For years, I've wanted to learn Vim, now my preferred Linux text editor and a favorite open source tool among developers and system administrators. And when I say learn, I mean really learn. Master is probably too strong a word, but I'd settle for advanced proficiency. For most of my years using Linux, my skillset included the ability to open a file, use the arrow keys to navigate up and down, switch into insert mode, change some text, save, and exit.
But that's like minimum-viable-Vim. My skill level enabled me to edit text documents from the terminal, but hasn't actually empowered me with any of the text editing super powers I've always imagined were possible. And it didn't justify using Vim over the totally capable Pico or Nano.
So why learn Vim at all? Because I do spend an awful lot of time editing text, and I know I could be more efficient at it. And why not Emacs, or a more modern editor like Atom? Because Vim works for me, and at least I have some minimal experience in it. And, perhaps most importantly, because it's rare that I encounter a system that I'm working on which doesn't have Vim or its less-improved cousin (vi) available on it already. If you've always had a desire to learn Emacs, more power to you—I hope the Emacs-analog of these tips will prove useful to you, too.
A few weeks into this concentrated effort to up my Vim-use ability, the number one tip I have to share is that you actually must use the tool. While it seems like a piece of advice straight from Captain Obvious, I actually found it considerably harder than I expected to stay in the program. Most of my work happens inside of a web browser, and I had to untrain my trigger-like opening of Gedit every time I needed to edit a block of text outside of a browser. Gedit had made its way to my quick launcher, and so step one was removing this shortcut and putting Vim there instead.
I've tried a number of things that have helped me learn. Here's a few of them I would recommend if you're looking to learn as well.
### Vimtutor
Sometimes the best place to get started isn't far from the application itself. I found Vimtutor, a tiny application that is basically a tutorial in a text file that you edit as you learn, to be as helpful as anything else in showing me the basics of the commands I had skipped learning through the years. Vimtutor is typically found everywhere Vim is, and is an easy install from your package manager if it's not already on your system.
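On most systems it is just one command away; a quick sketch for a Debian-family distribution (package names may differ elsewhere, and vimtutor usually ships with the Vim packages):

```
$ sudo apt-get install vim
$ vimtutor    # opens the interactive tutorial inside Vim
```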
### GVim
I know not everyone will agree with this one, but I found it useful to stop using the version of Vim that lives in my terminal and start using GVim for my basic editing needs. Naysayers will argue that it encourages using the mouse in an environment designed for keyboards, but I found it helpful to be able to quickly find the command I was looking for in a drop-down menu, reminding myself of the correct command, and then executing it with a keyboard. The alternative was often frustration at the inability to figure out how to do something, which is not a good feeling to be under constantly as you struggle to learn a new editor. No, stopping every few minutes to read a man page or use a search engine to remind you of a key sequence is not the best way to learn something new.
### Keyboard maps
Along with switching to GVim, I also found it handy to have a keyboard "cheat sheet" handy to remind me of the basic keystrokes. There are many available on the web that you can download, print, and set beside your station, but I opted for buying a set of stickers for my laptop keyboard. They were less than ten dollars US and had the added bonus of being a subtle reminder every time I used the laptop to at least try out one new thing as I edited.
### Vimium
As I mentioned, I live in the web browser most of the day. One of the tricks I've found helpful to reinforce the Vim way of navigation is to use [Vimium][1], an open source extension for Chrome that makes Chrome mimic the shortcuts used by Vim. I've found the fewer times I switch contexts for the keyboard shortcuts I'm using, the more likely I am to actually use them. Similar extensions, like [Vimperator][2], exist for Firefox.
### Other human beings
Without a doubt, there's no better way to get help learning something new than to get advice, feedback, and solutions from other people who have gone down a path before you.
If you live in a larger urban area, there might be a Vim meetup group near you. Otherwise, the place to be is the #vim channel on Freenode IRC. One of the more popular channels on Freenode, the #vim channel is always full of helpful individuals willing to offer help with your problems. I find it interesting just to listen to the chatter and see what sorts of problems others are trying to solve to see what I'm missing out on.
------
And so what to make of this effort? So far, so good. The time spent has probably yet to pay for itself in terms of time saved, but I'm always mildly surprised and amused when I find myself with a new reflex, jumping words with the right keypress sequence, or some similarly small feat. I can at least see that every day, the investment is bringing itself a little closer to payoff.
These aren't the only tricks for learning Vim, by far. I also like to point people towards [Vim Adventures][3], an online game in which you navigate using the Vim keystrokes. And just the other day I came across a marvelous visual learning tool at [Vimgifs.com][4], which is exactly what you might expect it to be: illustrated examples with Vim so small they fit nicely in a gif.
Have you invested the time to learn Vim, or really, any program with a keyboard-heavy interface? What worked for you, and did you think the effort was worth it? Has your productivity changed as much as you thought it would? Let's share stories in the comments below.
--------------------------------------------------------------------------------
via: https://opensource.com/life/16/7/tips-getting-started-vim
作者:[Jason Baker ][a]
译者:[译者ID](https://github.com/译者ID)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://opensource.com/users/jason-baker
[1]: https://github.com/philc/vimium
[2]: http://www.vimperator.org/
[3]: http://vim-adventures.com/
[4]: http://vimgifs.com/

View File

@ -1,67 +0,0 @@
HOW TO CHANGE DEFAULT APPLICATIONS IN UBUNTU
==============================================
![](https://itsfoss.com/wp-content/uploads/2016/07/change-default-applications-ubuntu.jpg)
Brief: This beginner's guide shows you how to change the default applications in Ubuntu Linux.
Installing [VLC media player][1] is one of the first few [things to do after installing Ubuntu 16.04][2] for me. One thing I do after installing VLC is to make it the default application, so that I can open a video file with VLC when I double click it.
As a beginner, you may need to know how to change any default application in Ubuntu, and this is what I am going to show you in today's tutorial.
### CHANGE DEFAULT APPLICATIONS IN UBUNTU
The methods mentioned here are valid for all versions of Ubuntu, be it Ubuntu 12.04, Ubuntu 14.04 or Ubuntu 16.04. There are basically two ways you can change the default applications in Ubuntu:
- via system settings
- via right click menu
#### 1. CHANGE DEFAULT APPLICATIONS IN UBUNTU FROM SYSTEM SETTINGS
Go to Unity Dash and search for System Settings:
![](https://itsfoss.com/wp-content/uploads/2013/11/System_Settings_Ubuntu.jpeg)
In the System Settings, click on the Details option:
![](https://itsfoss.com/wp-content/uploads/2016/07/System-settings-detail-ubuntu.jpeg)
In here, from the left side pane, select Default Applications. You will see the option to change the default applications in the right side pane.
![](https://itsfoss.com/wp-content/uploads/2016/07/System-settings-default-applications.jpeg)
As you can see, there are only a few kinds of default applications that can be changed here. You can change the default applications for the web browser, email client, calendar app, music, video and photos here. What about other kinds of applications?
Don't worry. To change the default applications of other kinds, we'll use the option in the right click menu.
#### 2. CHANGE DEFAULT APPLICATIONS IN UBUNTU FROM RIGHT CLICK MENU
If you have ever used Windows, you might be aware of the “open with” option in the right click menu that allows changing the default applications. We have something similar in Ubuntu Linux as well.
Right click on the file that you want to open in a non-default application. Go to properties.
![](https://itsfoss.com/wp-content/uploads/2016/05/WebP-images-Ubuntu-Linux-3.png)
>Select Properties from Right Click menu
And in here, you can select the application that you want to use and set it as default.
![](https://itsfoss.com/wp-content/uploads/2016/05/WebP-images-Ubuntu-Linux-4.png)
>Making gThumb the default application for WebP images in Ubuntu
Easy peasy, isn't it? Once you do that, all the files of the same kind will be opened with your chosen default application.
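If you prefer the terminal, the same association can usually be made with the xdg-mime utility. A quick sketch — the .desktop file name and MIME type below are examples, so check /usr/share/applications for the exact names on your system:
```
xdg-mime default vlc.desktop video/mp4
```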
I hope you found this beginner's tutorial to change default applications in Ubuntu helpful. If you have any questions or suggestions, feel free to drop a comment below.
--------------------------------------------------------------------------------
via: https://itsfoss.com/change-default-applications-ubuntu/
作者:[Abhishek Prakash][a]
译者:[译者ID](https://github.com/译者ID)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://itsfoss.com/author/abhishek/
[1]: http://www.videolan.org/vlc/index.html
[2]: https://itsfoss.com/things-to-do-after-installing-ubuntu-16-04/

View File

@ -0,0 +1,86 @@
sevenot translating
Terminator A Linux Terminal Emulator With Multiple Terminals In One Window
=============================================================================
![](http://www.linuxandubuntu.com/uploads/2/1/1/5/21152474/lots-of-terminals-in-terminator_1.jpg?659)
Each Linux distribution ships with a default terminal emulator for interacting with the system through commands. But the default terminal app might not be perfect for you. There are many terminal apps that provide extra functionality so you can perform more tasks simultaneously and speed up your work. One such useful terminal emulator is Terminator, a free terminal emulator for your Linux system with support for multiple terminals in one window.
### What Is Linux Terminal Emulator?
A Linux terminal emulator is a program that lets you interact with the shell. All Linux distributions come with a default terminal app that lets you pass commands to the shell.
### Terminator, A Free Linux Terminal App
Terminator is a Linux terminal emulator that provides several features your default terminal app does not support. It provides the ability to create multiple terminals in one window, speeding up your work. Other than multiple windows, it allows you to change other properties such as the terminal font, font colour, background colour and so on. Let's see how we can install and use Terminator in different Linux distributions.
### How To Install Terminator In Linux?
#### Install Terminator In Ubuntu Based Distributions
Terminator is available in the default Ubuntu repository. So you don't need to add any additional PPA. Just use APT or the Software app to install it on Ubuntu.
```
sudo apt-get install terminator
```
In case Terminator is not available in your default repository, just compile Terminator from source code.
[DOWNLOAD SOURCE CODE][1]
Download Terminator source code and extract it on your desktop. Now open your default terminal & cd into the extracted folder.
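For example, assuming the archive was saved to your Downloads folder (the file name below is illustrative; adjust it to the version you actually downloaded):
```
cd ~/Downloads
tar -xzf terminator-*.tar.gz
cd terminator-*/
```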
Now use the following command to install Terminator -
```
sudo ./setup.py install
```
#### Install Terminator In Fedora & Other Derivatives
```
sudo dnf install terminator
```
#### Install Terminator In OpenSuse
[INSTALL IN OPENSUSE][2]
### How To Use Multiple Terminals In One Window?
After you have installed Terminator, simply open multiple terminals in one window. Simply right click and divide.
![](http://www.linuxandubuntu.com/uploads/2/1/1/5/21152474/multiple-terminals-in-terminator_orig.jpg)
![](http://www.linuxandubuntu.com/uploads/2/1/1/5/21152474/multiple-terminals-in-terminator-emulator.jpg?697)
You can create as many terminals as you want, if you can manage them.
![](http://www.linuxandubuntu.com/uploads/2/1/1/5/21152474/lots-of-terminals-in-terminator.jpg?706)
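If you would rather keep your hands on the keyboard, Terminator also ships with default shortcuts for managing the splits. These are the stock bindings (they can be changed in the preferences, so yours may differ):
```
Ctrl+Shift+O     split the current terminal horizontally
Ctrl+Shift+E     split the current terminal vertically
Ctrl+Shift+W     close the current terminal
Alt+Arrow keys   move focus between terminals
```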
### Customise Terminals
Right-click the terminal and click Properties. Now you can customise the font, font colour, title colour & background, and terminal font colour & background.
![](http://www.linuxandubuntu.com/uploads/2/1/1/5/21152474/customize-terminator-interface.jpg?702)
![](http://www.linuxandubuntu.com/uploads/2/1/1/5/21152474/free-terminal-emulator_orig.jpg)
### Conclusion & What Is Your Favorite Terminal Emulator?
Terminator is an advanced terminal emulator that also lets you customize the interface. If you have not yet switched from your default terminal emulator, just try this one. I know you'll like it. If you're using any other free terminal emulator, let us know your favorite. Also don't forget to share this article with your friends. Perhaps your friends are searching for something like this.
--------------------------------------------------------------------------------
via: http://www.linuxandubuntu.com/home/terminator-a-linux-terminal-emulator-with-multiple-terminals-in-one-window
作者:[author][a]
译者:[译者ID](https://github.com/译者ID)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: http://www.linuxandubuntu.com/home/terminator-a-linux-terminal-emulator-with-multiple-terminals-in-one-window
[1]: https://launchpad.net/terminator/+download
[2]: http://software.opensuse.org/download.html?project=home%3AKorbi123&package=terminator

View File

@ -1,185 +0,0 @@
Part 13 - LFCS: How to Configure and Troubleshoot Grand Unified Bootloader (GRUB)
=====================================================================================
Because of the changes in the LFCS exam requirements effective Feb. 2, 2016, we are adding the necessary topics to the [LFCS series][1] published here. To prepare for this exam, you are highly encouraged to use the [LFCE series][2] as well.
![](http://www.tecmint.com/wp-content/uploads/2016/03/Configure-Troubleshoot-Grub-Boot-Loader.png)
>LFCS: Configure and Troubleshoot Grub Boot Loader Part 13
In this article we will introduce you to GRUB and explain why a boot loader is necessary, and how it adds versatility to the system.
The [Linux boot process][3] from the time you press the power button of your computer until you get a fully-functional system follows this high-level sequence:
* 1. A process known as **POST** (**Power-On Self Test**) performs an overall check on the hardware components of your computer.
* 2. When **POST** completes, it passes the control over to the boot loader, which in turn loads the Linux kernel in memory (along with **initramfs**) and executes it. The most used boot loader in Linux is the **GRand Unified Boot loader**, or **GRUB** for short.
* 3. The kernel checks and accesses the hardware, and then runs the initial process (mostly known by its generic name “**init**”) which in turn completes the system boot by starting services.
In Part 7 of this series (“[SysVinit, Upstart, and Systemd][4]”) we introduced the [service management systems and tools][5] used by modern Linux distributions. You may want to review that article before proceeding further.
### Introducing GRUB Boot Loader
Two major **GRUB** versions (**v1** sometimes called **GRUB Legacy** and **v2**) can be found in modern systems, although most distributions use **v2** by default in their latest versions. Only **Red Hat Enterprise Linux 6** and its derivatives still use **v1** today.
Thus, we will focus primarily on the features of **v2** in this guide.
Regardless of the **GRUB** version, a boot loader allows the user to:
* 1). modify the way the system behaves by specifying different kernels to use,
* 2). choose between alternate operating systems to boot, and
* 3). add or edit configuration stanzas to change boot options, among other things.
Today, **GRUB** is maintained by the **GNU** project and is well documented in their website. You are encouraged to use the [GNU official documentation][6] while going through this guide.
When the system boots you are presented with the following **GRUB** screen in the main console. Initially, you are prompted to choose between alternate kernels (by default, the system will boot using the latest kernel) and are allowed to enter a **GRUB** command line (with `c`) or edit the boot options (by pressing the `e` key).
![](http://www.tecmint.com/wp-content/uploads/2016/03/GRUB-Boot-Screen.png)
>GRUB Boot Screen
One of the reasons why you would consider booting with an older kernel is a hardware device that used to work properly and has started “acting up” after an upgrade (refer to [this link][7] in the AskUbuntu forums for an example).
The **GRUB v2** configuration is read on boot from `/boot/grub/grub.cfg` or `/boot/grub2/grub.cfg`, whereas `/boot/grub/grub.conf` or `/boot/grub/menu.lst` are used in **v1**. These files are NOT to be edited by hand, but are modified based on the contents of `/etc/default/grub` and the files found inside `/etc/grub.d`.
In **CentOS 7**, here's the configuration file that is created when the system is first installed:
```
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="vconsole.keymap=la-latin1 rd.lvm.lv=centos_centos7-2/swap crashkernel=auto vconsole.font=latarcyrheb-sun16 rd.lvm.lv=centos_centos7-2/root rhgb quiet"
GRUB_DISABLE_RECOVERY="true"
```
In addition to the online documentation, you can also find the GNU GRUB manual using info as follows:
```
# info grub
```
If you're interested specifically in the options available for /etc/default/grub, you can invoke the configuration section directly:
```
# info -f grub -n 'Simple configuration'
```
Using the command above you will find out that `GRUB_TIMEOUT` sets the time between the moment when the initial screen appears and the system automatic booting begins unless interrupted by the user. When this variable is set to `-1`, boot will not be started until the user makes a selection.
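For illustration, either of the following lines in `/etc/default/grub` (alternatives, not meant to be combined) would change that behavior:
```
GRUB_TIMEOUT=10    # wait 10 seconds, then boot the default entry
GRUB_TIMEOUT=-1    # wait indefinitely until the user makes a selection
```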
When multiple operating systems or kernels are installed in the same machine, `GRUB_DEFAULT` requires an integer value that indicates which OS or kernel entry in the GRUB initial screen should be selected to boot by default. The list of entries can be viewed not only in the splash screen shown above, but also using the following command:
### In CentOS and openSUSE:
```
# awk -F\' '$1=="menuentry " {print $2}' /boot/grub2/grub.cfg
```
### In Ubuntu:
```
# awk -F\' '$1=="menuentry " {print $2}' /boot/grub/grub.cfg
```
In the example shown in the below image, if we wish to boot with the kernel version **3.10.0-123.el7.x86_64** (4th entry), we need to set `GRUB_DEFAULT` to `3` (entries are internally numbered beginning with zero) as follows:
```
GRUB_DEFAULT=3
```
![](http://www.tecmint.com/wp-content/uploads/2016/03/Boot-System-with-Old-Kernel-Version.png)
>Boot System with Old Kernel Version
One final GRUB configuration variable that is of special interest is `GRUB_CMDLINE_LINUX`, which is used to pass options to the kernel. The options that can be passed through GRUB to the kernel are well documented in the [Kernel Parameters file][8] and in [man 7 bootparam][9].
Current options in my **CentOS 7** server are:
```
GRUB_CMDLINE_LINUX="vconsole.keymap=la-latin1 rd.lvm.lv=centos_centos7-2/swap crashkernel=auto vconsole.font=latarcyrheb-sun16 rd.lvm.lv=centos_centos7-2/root rhgb quiet"
```
Why would you want to modify the default kernel parameters or pass extra options? In simple terms, there may be times when you need to tell the kernel certain hardware parameters that it may not be able to determine on its own, or to override the values that it would detect.
This happened to me not too long ago when I tried **Vector Linux**, a derivative of **Slackware**, on my 10-year old laptop. After installation it did not detect the right settings for my video card so I had to modify the kernel options passed through GRUB in order to make it work.
Another example is when you need to bring the system to single-user mode to perform maintenance tasks. You can do this by appending the word single to `GRUB_CMDLINE_LINUX` and rebooting:
```
GRUB_CMDLINE_LINUX="vconsole.keymap=la-latin1 rd.lvm.lv=centos_centos7-2/swap crashkernel=auto vconsole.font=latarcyrheb-sun16 rd.lvm.lv=centos_centos7-2/root rhgb quiet single"
```
After editing `/etc/default/grub`, you will need to run `update-grub` (Ubuntu) or `grub2-mkconfig -o /boot/grub2/grub.cfg` (**CentOS** and **openSUSE**) afterwards to update `grub.cfg` (otherwise, changes will be lost upon boot).
This command will process the boot configuration files mentioned earlier to update `grub.cfg`. This method ensures changes are permanent, while options passed through GRUB at boot time will only last during the current session.
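To recap, run the command that matches your distribution (the leading `#` indicates a root prompt, as in the other examples in this article):
```
# update-grub                               <-- Ubuntu and derivatives
# grub2-mkconfig -o /boot/grub2/grub.cfg    <-- CentOS and openSUSE
```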
### Fixing Linux GRUB Issues
If you install a second operating system or if your GRUB configuration file gets corrupted due to human error, there are ways you can get your system back on its feet and be able to boot again.
In the initial screen, press `c` to get a GRUB command line (remember that you can also press `e` to edit the default boot options), and use help to bring the available commands in the GRUB prompt:
![](http://www.tecmint.com/wp-content/uploads/2016/03/Fix-Grub-Issues-in-Linux.png)
>Fix Grub Configuration Issues in Linux
We will focus on **ls**, which will list the installed devices and filesystems, and we will examine what it finds. In the image below we can see that there are 4 hard drives (`hd0` through `hd3`).
Only `hd0` seems to have been partitioned (as evidenced by msdos1 and msdos2, where 1 and 2 are the partition numbers and msdos is the partitioning scheme).
Lets now examine the first partition on `hd0` (**msdos1**) to see if we can find GRUB there. This approach will allow us to boot Linux and there use other high level tools to repair the configuration file or reinstall GRUB altogether if it is needed:
```
# ls (hd0,msdos1)/
```
As we can see in the highlighted area, we found the `grub2` directory in this partition:
![](http://www.tecmint.com/wp-content/uploads/2016/03/Find-Grub-Configuration.png)
>Find Grub Configuration
Once we are sure that GRUB resides in (**hd0,msdos1**), lets tell GRUB where to find its configuration file and then instruct it to attempt to launch its menu:
```
set prefix=(hd0,msdos1)/grub2
set root=(hd0,msdos1)
insmod normal
normal
```
![](http://www.tecmint.com/wp-content/uploads/2016/03/Find-and-Launch-Grub-Menu.png)
>Find and Launch Grub Menu
Then in the GRUB menu, choose an entry and press **Enter** to boot using it. Once the system has booted you can issue the `grub2-install /dev/sdX` command (replacing `sdX` with the device you want to install GRUB on). The boot information will then be updated and all related files restored.
```
# grub2-install /dev/sdX
```
Other more complex scenarios are documented, along with their suggested fixes, in the [Ubuntu GRUB2 Troubleshooting guide][10]. The concepts explained there are valid for other distributions as well.
### Summary
In this article we have introduced you to GRUB, indicated where you can find documentation both online and offline, and explained how to approach a scenario where a system has stopped booting properly due to a bootloader-related issue.
Fortunately, GRUB is one of the best-documented tools, and you can easily find help either in the installed docs or online using the resources we have shared in this article.
Do you have questions or comments? Don't hesitate to let us know using the comment form below. We look forward to hearing from you!
--------------------------------------------------------------------------------
via: http://www.tecmint.com/linux-basic-shell-scripting-and-linux-filesystem-troubleshooting/
作者:[Gabriel Cánepa][a]
译者:[译者ID](https://github.com/译者ID)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创翻译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: http://www.tecmint.com/author/gacanepa/
[1]: http://www.tecmint.com/sed-command-to-create-edit-and-manipulate-files-in-linux/
[2]: http://www.tecmint.com/installing-network-services-and-configuring-services-at-system-boot/
[3]: http://www.tecmint.com/linux-boot-process/
[4]: http://www.tecmint.com/linux-boot-process-and-manage-services/
[5]: http://www.tecmint.com/best-linux-log-monitoring-and-management-tools/
[6]: http://www.gnu.org/software/grub/manual/
[7]: http://askubuntu.com/questions/82140/how-can-i-boot-with-an-older-kernel-version
[8]: https://www.kernel.org/doc/Documentation/kernel-parameters.txt
[9]: http://man7.org/linux/man-pages/man7/bootparam.7.html
[10]: https://help.ubuntu.com/community/Grub2/Troubleshooting

View File

@ -0,0 +1,120 @@
Being translated by ChrisLeeGit
Learn How to Use Awk Built-in Variables Part 10
=================================================
As we continue to uncover Awk's features, in this part of the series we shall walk through the concept of built-in variables in Awk. There are two types of variables you can use in Awk: user-defined variables, which we covered in Part 8, and built-in variables.
![](http://www.tecmint.com/wp-content/uploads/2016/07/Awk-Built-in-Variables-Examples.png)
>Awk Built in Variables Examples
Built-in variables have values already defined in Awk, but we can also carefully alter those values. The built-in variables include:
- `FILENAME` : current input file name (do not change the variable name)
- `NR` : number of the current input line (that is, input line 1, 2, 3… and so on; do not change the variable name)
- `NF` : number of fields in current input line (do not change variable name)
- `OFS` : output field separator
- `FS` : input field separator
- `ORS` : output record separator
- `RS` : input record separator
Let us proceed to illustrate the use of some of the Awk built-in variables above:
To read the filename of the current input file, you can use the `FILENAME` built-in variable as follows:
```
$ awk ' { print FILENAME } ' ~/domains.txt
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/Awk-FILENAME-Variable.png)
>Awk FILENAME Variable
You will notice that the filename is printed out for each input line; that is the default behavior of Awk when you use the `FILENAME` built-in variable.
Use `NR` to count the number of lines (records) in an input file; remember that it also counts empty lines, as we shall see in the example below.
When we view the file domains.txt using the cat command, it contains 14 lines of text and 2 empty lines:
```
$ cat ~/domains.txt
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/Print-Contents-of-File.png)
>Print Contents of File
```
$ awk ' END { print "Number of records in file is: ", NR } ' ~/domains.txt
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/Awk-Count-Number-of-Lines.png)
>Awk Count Number of Lines
To count the number of fields in a record or line, we use the NF built-in variable as follows:
```
$ cat ~/names.txt
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/List-File-Contents.png)
>List File Contents
```
$ awk '{ print "Record:",NR,"has",NF,"fields" ; }' ~/names.txt
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/Awk-Count-Number-of-Fields-in-File.png)
>Awk Count Number of Fields in File
Next, you can also specify an input field separator using the FS built-in variable; it defines how Awk divides input lines into fields.
The default value for FS is space and tab, but we can change the value of FS to any character that will instruct Awk to divide input lines accordingly.
There are two methods to do this:
- one method is to use the FS built-in variable
- and the second is to invoke the -F Awk option
Consider the file /etc/passwd on a Linux system, the fields in this file are divided using the : character, so we can specify it as the new input field separator when we want to filter out certain fields as in the following examples:
We can use the `-F` option as follows:
```
$ awk -F':' '{ print $1, $4 ;}' /etc/passwd
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/Awk-Filter-Fields-in-Password-File.png)
>Awk Filter Fields in Password File
Optionally, we can also take advantage of the FS built-in variable as below:
```
$ awk ' BEGIN { FS=":" ; } { print $1, $4 ; } ' /etc/passwd
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/Filter-Fields-in-File-Using-Awk.png)
>Filter Fields in File Using Awk
To specify an output field separator, use the OFS built-in variable; it defines the character used to separate output fields, as in the example below:
```
$ awk -F':' ' BEGIN { OFS="==>" ;} { print $1, $4 ;}' /etc/passwd
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/Add-Separator-to-Field-in-File.png)
>Add Separator to Field in File
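The record separators listed earlier work the same way. As a quick sketch using the same domains.txt file from above, setting ORS to two newlines makes Awk print a blank line after every output record:
```
$ awk 'BEGIN { ORS="\n\n" } { print }' ~/domains.txt
```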
In this Part 10, we have explored the idea of using Awk built-in variables, which come with predefined values. We can also change these values, though doing so is not recommended unless you know exactly what you are doing.
After this, we shall progress to cover how we can use shell variables in Awk command operations, therefore, stay connected to Tecmint.
--------------------------------------------------------------------------------
via: http://www.tecmint.com/awk-built-in-variables-examples/
作者:[Aaron Kili][a]
译者:[ChrisLeeGit](https://github.com/chrisleegit)
校对:[校对ID](https://github.com/校对ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: http://www.tecmint.com/author/aaronkili/

View File

@ -1,109 +0,0 @@
Translating by ivo-wang
What is good stock portfolio management software on Linux
linux上那些不错的管理股票组合投资软件
================================================================================
如果你在股票市场做投资那么你可能非常清楚管理组合投资的计划有多重要。管理组合投资的目标是依据你能承受的风险时间层面的长短和资金盈利的目标去为你量身打造的一种投资计划。鉴于这类软件的重要性难怪从不缺乏商业性质的app和股票行情检测软件每一个都可以兜售复杂的组合投资以及跟踪报告功能。
对于这些linux爱好者们我们找到了一些 **好用的开源组合投资管理工具** 用来在linux上管理和跟踪股票的组合投资这里高度推荐一个基于java编写的管理软件[JStock][1]。如果你不是一个java粉你不得不面对这样一个事实JStock需要运行在重型的JVM环境上。同时我相信许多人非常欣赏JStock安装JRE以后它可以非常迅速的安装在各个linux平台上。没有障碍能阻止你将它安装在你的linux环境中。
开源就意味着免费或标准低下的时代已经过去了。鉴于JStock只是一个个人完成的产物作为一个组合投资管理软件它最令人印象深刻的是包含了非常多实用的功能以上所有的荣誉属于它的作者Yan Cheng Cheok!例如JStock 支持通过监视列表去监控价格,多种组合投资,按习惯/按固定 做股票指示与相关扫描支持27个不同的股票市场和交易平台云端备份/还原。JStock支持多平台部署(Linux, OS X, Android 和 Windows)你可以通过云端保存你的JStock记录它可以无缝的备份还原到其他的不同平台上面。
现在我将向你展示如何安装以及使用过程的一些具体细节。
### 在Linux上安装JStock ###
因为JStock使用Java编写所以必须[安装 JRE][2]才能让它运行起来.小提示JStock 需要JRE1.7或更高版本。如你的JRE版本不能满足这个需求JStock将会安装失败然后出现下面的报错。
Exception in thread "main" java.lang.UnsupportedClassVersionError: org/yccheok/jstock/gui/JStock : Unsupported major.minor version 51.0
一旦你安装了JRE在你的linux上从官网下载最新的发布的JStock然后加载启动它。
$ wget https://github.com/yccheok/jstock/releases/download/release_1-0-7-13/jstock-1.0.7.13-bin.zip
$ unzip jstock-1.0.7.13-bin.zip
$ cd jstock
$ chmod +x jstock.sh
$ ./jstock.sh
教程的其他部分让我来给大家展示一些JStock的实用功能
### 监视监控列表股票价格的波动 ###
使用JStock你可以创建一个或多个监视列表它可以自动的监视股票价格的波动并给你提供相应的通知。在每一个监视列表里面你可以添加多个感兴趣的股票进去。之后添加你的警戒值在"Fall Below"和"Rise Above"的表格里,分别是在设定最低价格和最高价格。
![](https://c2.staticflickr.com/2/1588/23795349969_37f4b0f23c_c.jpg)
例如你设置了AAPL股票的最低/最高价格分别是$102 和 $115.50,你将在价格低于$102或高于$115.50的任意时间在桌面得到通知。
你也可以设置邮件通知,之后你将收到一些价格信息的邮件通知。设置邮件通知在栏的"Options"选项。在"Alert"标签,打开"Send message to email(s)"填入你的Gmail账户。一旦完成Gmail认证步骤JStock将开始发送邮件通知到你的Gmail账户也可以设置其他的第三方邮件地址
![](https://c2.staticflickr.com/2/1644/24080560491_3aef056e8d_b.jpg)
### 管理多个组合投资 ###
JStock能够允许你管理多个组合投资。这个功能对于股票经纪人是非常实用的。你可以为经纪人创建一个投资项去管理你的 买入/卖出/红利 用来了解每一个经纪人的业务情况。你也可以切换不同的组合项目通过选择一个特殊项目在"Portfolio"菜单里面。下面是一张截图用来展示一个意向投资
![](https://c2.staticflickr.com/2/1646/23536385433_df6c036c9a_c.jpg)
因为能够设置付给经纪人小费的选项所以你能付给经纪人任意的小费印花税以及清空每一比交易的小费。如果你非常懒你也可以在菜单里面设置自动计算小费和给每一个经纪人固定的小费。在完成交易之后JStock将自动的计算并发送小费。
![](https://c2.staticflickr.com/2/1653/24055085262_0e315c3691_b.jpg)
### 显示固定/自选股票提示 ###
如果你要做一些股票的技术分析你可能需要不同股票的指数这里叫做“平均股指”对于股票的跟踪JStock提供多个[预设技术指示器][3] 去获得股票上涨/下跌/逆转指数的趋势。下面的列表里面是一些可用的指示。
- 异同平均线MACD
- 相对强弱指数 (RSI)
- 货币流通指数 (MFI)
- 顺势指标 (CCI)
- 十字线
- 黄金交叉线, 死亡交叉线
- 涨幅/跌幅
开启预设指示器能需要在JStock中点击"Stock Indicator Editor"标签。之后点击右侧面板中的安装按钮。选择"Install from JStock server"选项,之后安装你想要的指示器。
![](https://c2.staticflickr.com/2/1476/23867534660_b6a9c95a06_c.jpg)
一旦安装了一个或多个指示器,你可以用他们来扫描股票。选择"Stock Indicator Scanner"标签,点击底部的"Scan"按钮,选择需要的指示器。
![](https://c2.staticflickr.com/2/1653/24137054996_e8fcd10393_c.jpg)
当你选择完需要扫描的股票(例如e.g., NYSE, NASDAQ)以后JStock将执行扫描并将捕获的结果通过列表的形式展现在指示器上面。
![](https://c2.staticflickr.com/2/1446/23795349889_0f1aeef608_c.jpg)
除了预设指示器以外你也可以使用一个图形化的工具来定义自己的指示器。下面这张图例中展示的是当前价格小于或等于60天平均价格
![](https://c2.staticflickr.com/2/1605/24080560431_3d26eac6b5_c.jpg)
### 云备份还原Linux 和 Android JStock ###
另一个非常棒的功能是JStock可以支持云备份还原。Jstock也可以把你的组合投资/监视列表备份还原在 Google Drive这个功能可以实现在不同平台例如Linux和Android上无缝穿梭。举个例子如果你把Android Jstock组合投资的信息保存在Google Drive上你可以在Linux班级本上还原他们。
![](https://c2.staticflickr.com/2/1537/24163165565_bb47e04d6c_c.jpg)
![](https://c2.staticflickr.com/2/1556/23536385333_9ed1a75d72_c.jpg)
如果你在从Google Drive还原之后不能看到你的投资信息以及监视列表请确认你的国家信息与“Country”菜单里面设置的保持一致。
JStock的安卓免费版可以从[Google Play Store][4]获取到。如果你需要完整的功能(比如云备份,通知,图表等),你需要一次性支付费用升级到高级版。我想高级版肯定有它的价值所在。
![](https://c2.staticflickr.com/2/1687/23867534720_18b917028c_c.jpg)
写在最后我应该说一下它的作者Yan Cheng Cheok他是一个十分活跃的开发者有bug及时反馈给他。最后多有的荣耀都属于他一个人
关于JStock这个组合投资跟踪软件你有什么想法呢
--------------------------------------------------------------------------------
via: http://xmodulo.com/stock-portfolio-management-software-linux.html
作者:[Dan Nanni][a]
译者:[ivo-wang](https://github.com/ivo-wang)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]:http://xmodulo.com/author/nanni
[1]:http://jstock.org/
[2]:http://ask.xmodulo.com/install-java-runtime-linux.html
[3]:http://jstock.org/ma_indicator.html
[4]:https://play.google.com/store/apps/details?id=org.yccheok.jstock.gui

View File

@ -0,0 +1,319 @@
如何用Python和Flask建立部署一个Facebook信使机器人教程
==========================================================================
这是我建立一个简单的 Facebook 信使机器人的记录。它的功能很简单,就是一个回显机器人,只会把用户发来的内容原样回复给用户。
回显服务器类似于服务器的“Hello World”例子。
这个项目的目的不是建立最好的信使机器人,而是获得建立一个小型机器人和每个事物是如何整合起来的感觉。
- [技术栈][1]
- [机器人架构][2]
- [机器人服务器][3]
- [部署到 Heroku][4]
- [创建 Facebook 应用][5]
- [结论][6]
### 技术栈
使用到的技术栈:
- [Heroku][7] 做后端主机。免费层足够这个等级的教程。回显机器人不需要任何种类的数据持久,所以不需要数据库。
- [Python][8] 是我选用的语言。版本选择 2.7,不过它移植到 Python 3 很容易,只需要很少的改动。
- [Flask][9] 作为网站开发框架。它是非常轻量的框架,用在小型工程或微服务是完美的。
- 最后 [Git][10] 版本控制系统用来维护代码和部署到 Heroku。
- 值得一提:[Virtualenv][11]。这个 python 工具是用来创建清洁的 python 库“环境”的,你只用安装必要的需求和最小化的应用封装。
### 机器人架构
信使机器人是由一个服务器组成,响应两种请求:
- GET 请求被用来认证。他们与你注册的 FB 认证码一同被信使发出。
- POST 请求被用来处理真正的通信。典型的工作流是:信使把用户发送的消息数据通过 POST 请求发给机器人,我们处理它,然后发送一个我们自己的 POST 请求回去;如果这个请求完全成功(返回 200 OK 状态),我们再以 200 OK 码回应最初的信使请求。
这个教程应用将托管到Heroku他提供了一个很好并且简单的接口来部署应用。如前所述,免费层可以满足这个教程。
在应用已经部署并且运行后,我们将创建一个 Facebook 应用然后连接它到我们的应用,以便信使知道发送请求到哪,这就是我们的机器人。
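在动手写代码之前,可以先在本地感受一下这两类请求的行为。下面是一个模拟信使认证请求的小示例(假设服务跑在本地 5000 端口,验证令牌与下文代码中的保持一致):
```
curl "http://localhost:5000/?hub.verify_token=my_voice_is_my_password_verify_me&hub.challenge=test123"
# 如果一切正常,服务器会原样返回 test123
```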
### 机器人服务器
基本的服务器代码可以在 Github 用户 [hult (Magnus Hult)][13] 的 [Chatbot][12] 项目上获取;我做了一些修改,让它只回显消息,并修正了几个我遇到的错误。最终版本的服务器代码如下:
```
from flask import Flask, request
import json
import requests

app = Flask(__name__)

# 这需要填写被授予的页面通行令牌
# 通过 Facebook 应用创建令牌。
PAT = ''

@app.route('/', methods=['GET'])
def handle_verification():
    print "Handling Verification."
    if request.args.get('hub.verify_token', '') == 'my_voice_is_my_password_verify_me':
        print "Verification successful!"
        return request.args.get('hub.challenge', '')
    else:
        print "Verification failed!"
        return 'Error, wrong validation token'

@app.route('/', methods=['POST'])
def handle_messages():
    print "Handling Messages"
    payload = request.get_data()
    print payload
    for sender, message in messaging_events(payload):
        print "Incoming from %s: %s" % (sender, message)
        send_message(PAT, sender, message)
    return "ok"

def messaging_events(payload):
    """Generate tuples of (sender_id, message_text) from the
    provided payload.
    """
    data = json.loads(payload)
    messaging_events = data["entry"][0]["messaging"]
    for event in messaging_events:
        if "message" in event and "text" in event["message"]:
            yield event["sender"]["id"], event["message"]["text"].encode('unicode_escape')
        else:
            yield event["sender"]["id"], "I can't echo this"

def send_message(token, recipient, text):
    """Send the message text to recipient with id recipient.
    """
    r = requests.post("https://graph.facebook.com/v2.6/me/messages",
                      params={"access_token": token},
                      data=json.dumps({
                          "recipient": {"id": recipient},
                          "message": {"text": text.decode('unicode_escape')}
                      }),
                      headers={'Content-type': 'application/json'})
    if r.status_code != requests.codes.ok:
        print r.text

if __name__ == '__main__':
    app.run()
```
让我们分解代码。第一部分是引入所需:
```
from flask import Flask, request
import json
import requests
```
接下来我们定义两个函数(使用 Flask 特定的 app.route 装饰器),用来处理到我们的机器人的 GET 和 POST 请求。
```
@app.route('/', methods=['GET'])
def handle_verification():
    print "Handling Verification."
    if request.args.get('hub.verify_token', '') == 'my_voice_is_my_password_verify_me':
        print "Verification successful!"
        return request.args.get('hub.challenge', '')
    else:
        print "Verification failed!"
        return 'Error, wrong validation token'
```
信使会把我们创建 Facebook 应用时声明的 verify_token 对象发送过来,我们据此完成认证,最后把 “hub.challenge” 返回给信使。
处理 POST 请求的函数更有趣
```
@app.route('/', methods=['POST'])
def handle_messages():
    print "Handling Messages"
    payload = request.get_data()
    print payload
    for sender, message in messaging_events(payload):
        print "Incoming from %s: %s" % (sender, message)
        send_message(PAT, sender, message)
    return "ok"
```
它被调用时,我们先抓取消息负载,再用 messaging_events 函数解析它,提取出发件人 ID 和真正的消息正文,并生成一个可供循环遍历的 Python 迭代器。请注意,信使发送的每个请求中有可能包含多于一条消息。
```
def messaging_events(payload):
    """Generate tuples of (sender_id, message_text) from the
    provided payload.
    """
    data = json.loads(payload)
    messaging_events = data["entry"][0]["messaging"]
    for event in messaging_events:
        if "message" in event and "text" in event["message"]:
            yield event["sender"]["id"], event["message"]["text"].encode('unicode_escape')
        else:
            yield event["sender"]["id"], "I can't echo this"
```
遍历完每条消息后,我们调用 send_message 函数,通过 Facebook Graph messages 接口发送一个 POST 请求把回复发回给信使。在这期间,我们一直没有回应原始的信使请求,也就是说阻塞了它;这会导致超时和 5XX 错误。
上述错误是我在调试期间偶然发现的:当用户发送表情符号时,它实际上是以 unicode 标识传过来的,而 Python 对它做了错误的编码,最终我们回发的是乱码。
这个发回信使的 POST 请求将永远不会完成,从而导致给原始请求返回 5XX 状态码,表示服务不可用。
通过使用`encode('unicode_escape')`转义消息然后在我们发送回消息前用`decode('unicode_escape')`解码消息就可以解决。
```
def send_message(token, recipient, text):
    """Send the message text to recipient with id recipient.
    """
    r = requests.post("https://graph.facebook.com/v2.6/me/messages",
                      params={"access_token": token},
                      data=json.dumps({
                          "recipient": {"id": recipient},
                          "message": {"text": text.decode('unicode_escape')}
                      }),
                      headers={'Content-type': 'application/json'})
    if r.status_code != requests.codes.ok:
        print r.text
```
### 部署到 Heroku
一旦代码已经建立成我想要的样子时就可以进行下一步。部署应用。
当然,但是怎么做?
我之前已经部署了应用到 Heroku (主要是 Rails然而我总是遵循某种教程所以配置已经创建。在这种情况下尽管我必须从头开始。
幸运的是有官方[Heroku文档][14]来帮忙。这篇文章很好地说明了运行应用程序所需的最低限度。
长话短说我们需要的除了我们的代码还有两个文件。第一个文件是“requirements.txt”他列出了运行应用所依赖的库。
需要的第二个文件是“Procfile”。这个文件通知 Heroku 如何运行我们的服务。此外这个文件最低限度如下:
>web: gunicorn echoserver:app
Heroku 解读它的方式是:我们的应用通过运行 echoserver.py 启动,并且应用将使用 gunicorn 作为 Web 服务器。我们使用一个额外的 Web 服务器是出于性能考虑,上面提到的 Heroku 文档中有解释:
>Web 应用程序并发处理传入的 HTTP 请求,比一次只处理一个请求的 Web 应用程序更有效地利用动态资源。由于这个原因,我们建议在开发和生产环境中都使用支持并发请求的 Web 服务器。
>Django 和 Flask 等 Web 框架为了方便而内建了 Web 服务器,但是这些阻塞式服务器一个时刻只处理一个请求。如果你把这种服务部署到 Heroku 上,你的动态资源不会被充分使用,你的应用也会显得迟钝。
>Gunicorn 是一个纯 Python 实现的 WSGI HTTP 服务器。它允许你在单独一个动态资源内,通过并发运行多个 Python 进程的方式运行任何 Python 应用。它在性能、弹性和配置简易性之间提供了完美的平衡。
回到我们提到的“requirements.txt”文件让我们看看它如何结合 Virtualenv 工具。
在任何时候,你的开发机器也许有若干已安装的 python 库。当部署应用时你不想这些库被加载因为很难辨认出你实际使用哪些库。
Virtualenv 创建一个新的空白虚拟环境,因此你可以只安装你应用需要的库。
你可以检查当前安装使用哪些库的命令如下:
```
kostis@KostisMBP ~ $ pip freeze
cycler==0.10.0
Flask==0.10.1
gunicorn==19.6.0
itsdangerous==0.24
Jinja2==2.8
MarkupSafe==0.23
matplotlib==1.5.1
numpy==1.10.4
pyparsing==2.1.0
python-dateutil==2.5.0
pytz==2015.7
requests==2.10.0
scipy==0.17.0
six==1.10.0
virtualenv==15.0.1
Werkzeug==0.11.10
```
注意pip 工具应该已经与 Python 一起安装在你的机器上。
如果没有,请查看[官方网站][15]了解如何安装它。
现在让我们使用 Virtualenv 来创建一个新的空白环境。首先我们给我们的工程创建一个新文件夹,然后进到目录下:
```
kostis@KostisMBP projects $ mkdir echoserver
kostis@KostisMBP projects $ cd echoserver/
kostis@KostisMBP echoserver $
```
现在来创建一个叫做 echobot 新的环境。运行下面的 source 命令激活它,然后使用 pip freeze 检查,我们能看到现在是空的。
```
kostis@KostisMBP echoserver $ virtualenv echobot
kostis@KostisMBP echoserver $ source echobot/bin/activate
(echobot) kostis@KostisMBP echoserver $ pip freeze
(echobot) kostis@KostisMBP echoserver $
```
我们可以安装需要的库。我们需要是 flaskgunicorn和 requests他们被安装完我们就创建 requirements.txt 文件:
```
(echobot) kostis@KostisMBP echoserver $ pip install flask
(echobot) kostis@KostisMBP echoserver $ pip install gunicorn
(echobot) kostis@KostisMBP echoserver $ pip install requests
(echobot) kostis@KostisMBP echoserver $ pip freeze
click==6.6
Flask==0.11
gunicorn==19.6.0
itsdangerous==0.24
Jinja2==2.8
MarkupSafe==0.23
requests==2.10.0
Werkzeug==0.11.10
(echobot) kostis@KostisMBP echoserver $ pip freeze > requirements.txt
```
上述命令都运行完之后,我们把前面的 Python 代码保存为 echoserver.py 文件,再按之前提到的内容创建 Procfile最终应该得到如下的文件/文件夹结构:
```
(echobot) kostis@KostisMBP echoserver $ ls
Procfile echobot echoserver.py requirements.txt
```
我们现在准备好上传到 Heroku 了。需要做两件事。第一是安装 Heroku Toolbelt如果你还没有安装到你的系统中详见 [Heroku][16])。第二是通过[网页接口][17]创建一个新的 Heroku 应用。
点击右上的大加号然后选择“Create new app”。
--------------------------------------------------------------------------------
via: http://tsaprailis.com/2016/06/02/How-to-build-and-deploy-a-Facebook-Messenger-bot-with-Python-and-Flask-a-tutorial/
作者:[Konstantinos Tsaprailis][a]
译者:[wyangsun](https://github.com/wyangsun)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创翻译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://github.com/kostistsaprailis
[1]: http://tsaprailis.com/2016/06/02/How-to-build-and-deploy-a-Facebook-Messenger-bot-with-Python-and-Flask-a-tutorial/#tech-stack
[2]: http://tsaprailis.com/2016/06/02/How-to-build-and-deploy-a-Facebook-Messenger-bot-with-Python-and-Flask-a-tutorial/#bot-architecture
[3]: http://tsaprailis.com/2016/06/02/How-to-build-and-deploy-a-Facebook-Messenger-bot-with-Python-and-Flask-a-tutorial/#the-bot-server
[4]: http://tsaprailis.com/2016/06/02/How-to-build-and-deploy-a-Facebook-Messenger-bot-with-Python-and-Flask-a-tutorial/#deploying-to-heroku
[5]: http://tsaprailis.com/2016/06/02/How-to-build-and-deploy-a-Facebook-Messenger-bot-with-Python-and-Flask-a-tutorial/#creating-the-facebook-app
[6]: http://tsaprailis.com/2016/06/02/How-to-build-and-deploy-a-Facebook-Messenger-bot-with-Python-and-Flask-a-tutorial/#conclusion
[7]: https://www.heroku.com
[8]: https://www.python.org
[9]: http://flask.pocoo.org
[10]: https://git-scm.com
[11]: https://virtualenv.pypa.io/en/stable
[12]: https://github.com/hult/facebook-chatbot-python
[13]: https://github.com/hult
[14]: https://devcenter.heroku.com/articles/python-gunicorn
[15]: https://pip.pypa.io/en/stable/installing
[16]: https://toolbelt.heroku.com
[17]: https://dashboard.heroku.com/apps

View File

@ -0,0 +1,193 @@
Python 101: urllib 简介
=================================
Python 3 的 urllib 模块是一堆可以处理 URL 的组件集合。如果你有 Python 2 背景,那么你就会注意到 Python 2 中有 urllib 和 urllib2 两个版本的模块。这些现在都是 Python 3 的 urllib 包的一部分。当前版本的 urllib 包括下面几部分:
- urllib.request
- urllib.error
- urllib.parse
- urllib.robotparser
接下来我们会分开讨论除了 urllib.error 以外的几部分。官方文档实际推荐你尝试第三方库, requests一个高级的 HTTP 客户端接口。然而我依然认为知道如何不依赖第三方库打开 URL 并与之进行交互是很有用的,而且这也可以帮助你理解为什么 requests 包是如此的流行。
---
### urllib.request
urllib.request 模块最初是用来打开和获取 URL 的。让我们看看用 urlopen 函数可以做的事:
```
>>> import urllib.request
>>> url = urllib.request.urlopen('https://www.google.com/')
>>> url.geturl()
'https://www.google.com/'
>>> url.info()
<http.client.HTTPMessage object at 0x7fddc2de04e0>
>>> header = url.info()
>>> header.as_string()
('Date: Fri, 24 Jun 2016 18:21:19 GMT\n'
'Expires: -1\n'
'Cache-Control: private, max-age=0\n'
'Content-Type: text/html; charset=ISO-8859-1\n'
'P3P: CP="This is not a P3P policy! See '
'https://www.google.com/support/accounts/answer/151657?hl=en for more info."\n'
'Server: gws\n'
'X-XSS-Protection: 1; mode=block\n'
'X-Frame-Options: SAMEORIGIN\n'
'Set-Cookie: '
'NID=80=tYjmy0JY6flsSVj7DPSSZNOuqdvqKfKHDcHsPIGu3xFv41LvH_Jg6LrUsDgkPrtM2hmZ3j9V76pS4K_cBg7pdwueMQfr0DFzw33SwpGex5qzLkXUvUVPfe9g699Qz4cx9ipcbU3HKwrRYA; '
'expires=Sat, 24-Dec-2016 18:21:19 GMT; path=/; domain=.google.com; HttpOnly\n'
'Alternate-Protocol: 443:quic\n'
'Alt-Svc: quic=":443"; ma=2592000; v="34,33,32,31,30,29,28,27,26,25"\n'
'Accept-Ranges: none\n'
'Vary: Accept-Encoding\n'
'Connection: close\n'
'\n')
>>> url.getcode()
200
```
在这里我们包含了需要的模块,然后告诉它打开 Google 的 URL。现在我们就有了一个可以交互的 HTTPResponse 对象。我们要做的第一件事是调用 geturl 方法,它会返回实际获取到的资源的 URL这可以让我们发现 URL 是否发生了重定向。
接下来调用 info ,它会返回网页的元数据,比如头信息。因此,我们可以将结果赋给我们的 headers 变量,然后调用它的方法 as_string 。就可以打印出我们从 Google 收到的头信息。你也可以通过 getcode 得到网页的 HTTP 响应码,当前情况下就是 200意思是正常工作。
如果你想看看网页的 HTML 代码,你可以调用变量 url 的方法 read。我不准备再现这个过程因为输出结果太长了。
请注意request 对象默认发起 GET 请求,除非你指定了它的 data 参数;一旦传入了 data 参数request 对象就会变成 POST 请求。
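下面是一个最小的 POST 请求示例草图(这里的 httpbin.org 是一个公共的 HTTP 测试服务,仅作演示;实际使用时请换成你的目标 URL
```
>>> import urllib.parse
>>> import urllib.request
>>> data = urllib.parse.urlencode({'q': 'Python'}).encode('ascii')
>>> request = urllib.request.Request('https://httpbin.org/post', data=data)
>>> response = urllib.request.urlopen(request)
>>> response.getcode()
200
```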
---
### 下载文件
urllib 一个典型的应用场景是下载文件。让我们看看几种可以完成这个任务的方法:
```
>>> import urllib.request
>>> url = 'http://www.blog.pythonlibrary.org/wp-content/uploads/2012/06/wxDbViewer.zip'
>>> response = urllib.request.urlopen(url)
>>> data = response.read()
>>> with open('/home/mike/Desktop/test.zip', 'wb') as fobj:
... fobj.write(data)
...
```
这个例子中我们打开一个保存在我的博客上的 zip 压缩文件的 URL。然后我们读出数据并将数据写到磁盘。一个替代方法是使用 urlretrieve
```
>>> import urllib.request
>>> url = 'http://www.blog.pythonlibrary.org/wp-content/uploads/2012/06/wxDbViewer.zip'
>>> tmp_file, header = urllib.request.urlretrieve(url)
>>> with open('/home/mike/Desktop/test.zip', 'wb') as fobj:
... with open(tmp_file, 'rb') as tmp:
... fobj.write(tmp.read())
```
urlretrieve 方法会把网络对象拷贝到本地文件。除非你用 urlretrieve 的第二个参数指定要保存文件的路径,否则这个文件将被随机命名并保存在临时文件夹中。指定路径可以为你节省一步操作,并使代码看起来更简单:
```
>>> import urllib.request
>>> url = 'http://www.blog.pythonlibrary.org/wp-content/uploads/2012/06/wxDbViewer.zip'
>>> urllib.request.urlretrieve(url, '/home/mike/Desktop/blog.zip')
('/home/mike/Desktop/blog.zip',
<http.client.HTTPMessage object at 0x7fddc21c2470>)
```
如你所见,它返回了文件保存的路径,以及从请求得来的头信息。
### 设置你的用户代理
当你使用浏览器访问网页时,浏览器会告诉网站它是谁。这就是所谓的 user-agent 字段。Python 的 urllib 会把自己标识为 Python-urllib/x.y其中 x 和 y 是你使用的 Python 的主、次版本号。有一些网站不认识这个用户代理字段,可能会表现得很奇怪或者根本不能正常工作。幸运的是,你可以很轻松地设置自己的 user-agent 字段。
```
>>> import urllib.request
>>> user_agent = ' Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:47.0) Gecko/20100101 Firefox/47.0'
>>> url = 'http://www.whatsmyua.com/'
>>> headers = {'User-Agent': user_agent}
>>> request = urllib.request.Request(url, headers=headers)
>>> with urllib.request.urlopen(request) as response:
... with open('/home/mdriscoll/Desktop/user_agent.html', 'wb') as out:
... out.write(response.read())
```
这里设置我们的用户代理为 Mozilla FireFox ,然后我们访问 <http://www.whatsmyua.com/> 它会告诉我们它识别出的我们的 user-agent 字段。之后我们将 url 和我们的头信息传给 urlopen 创建一个 Request 实例。最后我们保存这个结果。如果你打开这个结果,你会看到我们成功的修改了自己的 user-agent 字段。使用这段代码尽情的尝试不同的值来看看它是如何改变的。
---
### urllib.parse
urllib.parse 库是用来拆分和组合 URL 字符串的标准接口。比如,你可以使用它来转换一个相对的 URL 为绝对的 URL。让我们试试用它来转换一个包含查询的 URL
```
>>> from urllib.parse import urlparse
>>> result = urlparse('https://duckduckgo.com/?q=python+stubbing&t=canonical&ia=qa')
>>> result
ParseResult(scheme='https', netloc='duckduckgo.com', path='/', params='', query='q=python+stubbing&t=canonical&ia=qa', fragment='')
>>> result.netloc
'duckduckgo.com'
>>> result.geturl()
'https://duckduckgo.com/?q=python+stubbing&t=canonical&ia=qa'
>>> result.port
None
```
这里我们导入了 urlparse 函数,并把一个包含 duckduckgo 搜索查询的 URL 作为参数传给它。我查询的是关于 “python stubbing” 的文章。如你所见,它返回了一个 ParseResult 对象,你可以用这个对象了解关于这个 URL 的更多信息。举个例子,你可以获取到端口信息(本例中没有端口)、网络位置、路径和很多其他东西。
### 提交一个 Web 表单
这个模块还有一个方法 urlencode 可以向 URL 传输数据。 urllib.parse 的一个典型使用场景是提交 Web 表单。让我们通过搜索引擎 duckduckgo 搜索 Python 来看看这个功能是怎么工作的。
```
>>> import urllib.request
>>> import urllib.parse
>>> data = urllib.parse.urlencode({'q': 'Python'})
>>> data
'q=Python'
>>> url = 'http://duckduckgo.com/html/'
>>> full_url = url + '?' + data
>>> response = urllib.request.urlopen(full_url)
>>> with open('/home/mike/Desktop/results.html', 'wb') as f:
... f.write(response.read())
```
这个例子很直接。基本上我们想使用 Python 而不是浏览器向 duckduckgo 提交一个查询。要完成这个我们需要使用 urlencode 构建我们的查询字符串。然后我们把这个字符串和网址拼接成一个完整的正确 URL ,然后使用 urllib.request 提交这个表单。最后我们就获取到了结果然后保存到磁盘上。
---
### urllib.robotparser
robotparser 模块是由一个单独的类 RobotFileParser 构成的。这个类可以回答诸如“某个特定的用户代理能否抓取设置了 robots.txt 的网站上的某个 URL”之类的问题。robots.txt 文件会告诉网络爬虫或者机器人,当前网站的哪些部分是不允许被访问的。让我们看一个简单的例子:
```
>>> import urllib.robotparser
>>> robot = urllib.robotparser.RobotFileParser()
>>> robot.set_url('http://arstechnica.com/robots.txt')
None
>>> robot.read()
None
>>> robot.can_fetch('*', 'http://arstechnica.com/')
True
>>> robot.can_fetch('*', 'http://arstechnica.com/cgi-bin/')
False
```
这里我们导入了 robot 分析器类,然后创建一个实例。接着我们给它传递一个表明网站 robots.txt 位置的 URL然后告诉分析器读取这个文件。读取完成后我们给了它一组不同的 URL让它判断哪些可以爬取、哪些不能。很快就可以看到我们可以访问主站但不能访问 cgi-bin 路径。
---
### 总结一下
现在你就有能力使用 Python 的 urllib 包了。在这一节里,我们学习了如何下载文件、提交 Web 表单、修改自己的用户代理以及访问 robots.txt。urllib 还有一大堆这里没有提及的附加功能,比如网站认证。不过,在尝试用 urllib 做认证之前,你或许可以考虑切换到 requests 库,因为它以更易用、更易调试的方式实现了这些功能。我同时也想提醒你Python 已经通过 http.cookies 模块支持 Cookies 了,而 requests 包也很好地封装了这个功能。你可以两个都试试,再决定哪个最适合你。
--------------------------------------------------------------------------------
via: http://www.blog.pythonlibrary.org/2016/06/28/python-101-an-intro-to-urllib/
作者:[Mike][a]
译者:[Ezio](https://github.com/oska874)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创翻译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: http://www.blog.pythonlibrary.org/author/mld/

View File

@ -1,7 +1,7 @@
使用 Python 创建你自己的 Shell:Part II
===========================================
在[part 1][1] 中,我们已经创建了一个主要的 shell 循环、切分了的命令输入,以及通过 `fork``exec` 执行命令。在这部分,我们将会解决剩下的问题。首先,`cd test_dir2` 命令无法修改我们的当前目录。其次,我们仍无法优雅地从 shell 中退出。
[part 1][1] 中,我们已经创建了一个主要的 shell 循环、切分了的命令输入,以及通过 `fork``exec` 执行命令。在这部分,我们将会解决剩下的问题。首先,`cd test_dir2` 命令无法修改我们的当前目录。其次,我们仍无法优雅地从 shell 中退出。
### 步骤 4内置命令

View File

@ -0,0 +1,95 @@
在用户空间做我们会在内核空间做的事情
=======================================================
我相信Linux 最好也是最坏的事情,就是内核空间和用户空间之间的巨大差别。
但是如果抛开这个区别Linux 可能也不会成为世界上影响力最大的操作系统。如今Linux 已经拥有世界上最大数量的用户,和最大范围的应用。尽管大多数用户并不知道,当他们进行谷歌搜索,或者触摸安卓手机的时候,他们其实正在使用 Linux。如果不是 Linux 的巨大成功Apple 公司也可能并不会成为现在这样(苹果在他们的电脑产品中使用 BSD 发行版)。
请注意,用户空间是 Linux 内核开发中的一个特性,并不是一个缺陷。正如 Linus 在 2003 年的极客巡航中提到的那样:“我只做内核相关技术……我并不知道内核之外发生的事情,而且我并不关心。我只关注内核部分发生的事情。” 多年之后的另一次极客巡航上Andrew Morton 给我上了另外一课,我写道:
> 内核空间是 Linux 核心存在的地方。用户空间是使用 Linux 时使用的空间,和其他的自然的建筑材料一样。内核空间和用户空间的区别,和自然材料和人类从中生产的人造材料的区别很类似。
这个区别自然而然的结果就是:尽管外面的世界一刻也离不开 Linux但 Linux 社区还是保持得相对较小。所以,为了壮大我们的社区,我想指出两件事情。第一件已经非常火热,另一件则可能会火起来。
第一件事情就是 [blockchain][1],出自著名的分布式货币,比特币之手。当你正在阅读这篇文章的同时,对 blockchain 的[兴趣已经直线上升][2]。
![](http://www.linuxjournal.com/files/linuxjournal.com/ufiles/imagecache/large-550px-centered/u1000009/12042f1.png)
> 图1. 谷歌 Blockchain 的趋势
第二件事就是自主身份。为了解释这个,让我先来问你,你是谁或者你是什么。
如果你的答案来自你的雇主、你的医生、车管所、Facebook、Twitter 或者谷歌,你就会发现这些身份都带有明显的行政属性:为了它们自身的便利和控制,你只是它们各自命名空间里的一个名字。正如 Timothy Ruff 在 [Evernym][3] 中解释的:“你并不为了他们而存在,你只为了自己的身份而活。”你的身份可能会变化,但是唯一不变的是控制着这个身份的一方,也就是那个组织。
如果你的答案出自你自己,我们就有一个广大空间来发展一个新的领域,在这个领域中,我们完全自由。
第一个解释这个的人,据我所知,是 [Devon Loffreto][4]。在 2012 年 2 月,在的他的博客中,他写道 ”什么是' Sovereign Source Authority'?“,[Moxy Tongue][5]。在他发表在 2016 年 2 月的 "[Self-Sovereign Identity][6]" 中,他写道:
> 自主身份必须是独立个人提出的,并且不包含社会因素。。。自主身份源于每个个体对其自身本源的认识。 一个自主身份可以为个体带来新的社会面貌。每个个体都可能为自己生成一个自主身份,并且这并不会改变固有的人权。使用自主身份机制是所有参与者参与的基石,并且 依旧可以同各种形式的人类社会保持联系。
用 Linux 的术语来说,只有个人才能为他或她自己设定一个自主身份。而在现实世界中,这本来就是件很平常的事情。举个例子,我自己的身份包括:
- David Allen Searls我父母会这样叫我。
- David Searls正式场合下我会这么称呼自己。
- Dave我的亲戚和好朋友会这么叫我。
- Doc大多数人会这么叫我。
在上述这些身份之间,我可以根据不同的情景轻易地转换。但是,这只是在现实世界中;在虚拟世界中,这就变得非常困难。除了上述的身份之外,我还可以是 @dsearls我的 Twitter 账号)和 dsearls其他的网络账号。然而为了记住成百上千个不同账号的登录名和密码我已经不堪重负。
你可以在你的浏览器上感受到这个糟糕的体验。在火狐上,我有成百上千个用户名密码。很多已经废弃(很多都是从 Netscape 时代遗留下来的),但是我依旧假设我有时会有大量的工作账号需要处理。对于这些,我只是被动接受者。没有其他的解决方法。甚至一些安全较低的用户认证,已经成为了现实世界中不可缺少的一环。
现在,最简单的方式来联系账号,就是通过 "Log in with Facebook" 或者 "Login in with Twitter" 来进行身份认证。在这些例子中,我们中的每一个甚至并不是真正意义上的自己,或者某种程度上是我们希望被大家认识的自己(如果我们希望被其他人认识的话)。
我们从一开始就需要的,是一套能够体现我们自主身份、并让我们在交流时自行选择如何保护和展示自己的个人系统。因为缺少这种能力,我们现在陷入混乱。Shoshana Zuboff 称之为“监视资本主义”,她如此说道:
>……如果没有互联网,以及监视资本主义的先驱谷歌所取得的巨大成功,这一切都难以想象。世界正因 Apple 和 FBI 的对决而屏息关注。而真相是,热衷于监视的资本家们所开发出的监视能力,正是每一个国家安全机构所艳羡的。
然后,她问道,”我们怎样才能保护自己远离他人的影响?“
我建议使用自主身份。我相信这是我们唯一的方式,来保证我们从一个被监视的世界中脱离出来。以此为基础,我们才可以完全无顾忌的和社会,政治,商业上的人交流。
在五月联合国举行的 [ID2020][7] 会议上,我提出了这个暂时的结论。很高兴 Devon Loffreto 也在场,自 2013 年起我就一直关注他在这个领域的工作。这是[我当时写的一篇文章][8],其中引用了 Devon 的早期博客(比如上面引用的那篇)。
这有三篇这个领域的准则:
- "[Self-Sovereign Identity][9]" - Devon Loffreto.
- "[System or Human First][10]" - Devon Loffreto.
- "[The Path to Self-Sovereign Identity][11]" - Christopher Allen.
除了来自 Evernym 的简要说明外,[digi.me][12]、[iRespond][13] 和 [Respect Network][14] 也有相关介绍。自主身份和社会性身份(也被称为 “current model”的对比见图 2。
![](http://www.linuxjournal.com/files/linuxjournal.com/ufiles/imagecache/large-550px-centered/u1000009/12042f2.jpg)
> 图 2. Current Model 身份 vs. 自主身份
为此而生的[平台][15]就是 Sovrin它被描述为一个“完全开源的、基于标识的声明式身份图平台其授权机制依托于先进的分布式账本技术”。同时这还有一本[白皮书][16]。它的代号为 [plenum][17],代码托管在 GitHub 上。
在这里,或者其他类似的地方,我们就可以在用户空间中做我们在过去四分之一个世纪里在内核空间中做过的事情。
--------------------------------------------------------------------------------
via: https://www.linuxjournal.com/content/doing-user-space-what-we-did-kernel-space
作者:[Doc Searls][a]
译者:[译者ID](https://github.com/MikeCoder)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://www.linuxjournal.com/users/doc-searls
[1]: https://en.wikipedia.org/wiki/Block_chain_%28database%29
[2]: https://www.google.com/trends/explore#q=blockchain
[3]: http://evernym.com/
[4]: https://twitter.com/nzn
[5]: http://www.moxytongue.com/2012/02/what-is-sovereign-source-authority.html
[6]: http://www.moxytongue.com/2016/02/self-sovereign-identity.html
[7]: http://www.id2020.org/
[8]: http://blogs.harvard.edu/doc/2013/10/14/iiw-challenge-1-sovereign-identity-in-the-great-silo-forest
[9]: http://www.moxytongue.com/2016/02/self-sovereign-identity.html
[10]: http://www.moxytongue.com/2016/05/system-or-human.html
[11]: http://www.lifewithalacrity.com/2016/04/the-path-to-self-soverereign-identity.html
[12]: https://get.digi.me/
[13]: http://irespond.com/
[14]: https://www.respectnetwork.com/
[15]: http://evernym.com/technology
[16]: http://evernym.com/assets/doc/Identity-System-Essentials.pdf?v=167284fd65
[17]: https://github.com/evernym/plenum

View File

@ -0,0 +1,129 @@
bc : 一个命令行计算器
============================
![](https://cdn.fedoramagazine.org/wp-content/uploads/2016/07/bc-calculator-945x400.jpg)
假如你运行在一个图形桌面环境中当你需要一个计算器时你可能只需要一路进行点击便可以找到一个计算器。例如Fedora 工作站中就已经包含了一个名为 `Calculator` 的工具。它有着几种不同的操作模式,例如,你可以进行复杂的数学运算或者金融运算。但是,你知道吗,命令行也提供了一个与之相似的名为 `bc` 的工具?
`bc` 工具可以为你提供你期望一个科学计算器、金融计算器或者是简单的计算器所能提供的所有功能。另外,假如需要的话,它还可以从命令行中被脚本化。这使得当你需要做复杂的数学运算时,你可以在 shell 脚本中使用它。
因为 bc 被其他的系统软件所使用,例如 CUPS 打印服务,它可能已经在你的 Fedora 系统中被安装了。你可以使用下面这个命令来进行检查:
```
dnf list installed bc
```
假如因为某些原因你没有在上面命令的输出中看到它,你可以使用下面的这个命令来安装它:
```
sudo dnf install bc
```
### 用 bc 做一些简单的数学运算
使用 bc 的一种方式是进入它自己的 shell。在那里你可以在一行中做许多次计算。但在你键入 bc 后,首先出现的是有关这个程序的警告:
```
$ bc
bc 1.06.95
Copyright 1991-1994, 1997, 1998, 2000, 2004, 2006 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'.
```
现在你可以按照每行一个输入运算式或者命令了:
```
1+1
```
bc 会回答上面计算式的答案是:
```
2
```
在这里你还可以执行其他的命令。你可以使用加(+)、减(-)、乘(*)、除(/)、圆括号、指数符号(^)等等。请注意 bc 同样也遵循所有约定俗成的运算规则,例如运算的先后顺序。你可以试试下面的例子:
```
(4+7)*2
4+7*2
```
若要离开 bc 可以通过按键组合 `Ctrl+D` 来发送 “输入结束”信号给 bc 。
使用 bc 的另一种方式是使用 `echo` 命令来传递运算式或命令。下面这个示例类似于计算器中的 "Hello, world" 例子,使用 shell 的管道符(|)来将 `echo` 的输出传入 `bc` 中:
```
echo '1+1' | bc
```
使用 shell 的管道,你可以发送不止一个运算操作,你需要使用分号来分隔不同的运算。结果将在不同的行中返回。
```
echo '1+1; 2+2' | bc
```
### 精度
在某些计算中bc 会使用精度的概念,即小数点后面的数字位数。默认的精度是 0。除法操作总是使用精度的设定。所以如果你没有设置精度有可能会带来意想不到的答案
```
echo '3/2' | bc
echo 'scale=3; 3/2' | bc
```
乘法使用一个更复杂的精度选择机制:
```
echo '3*2' | bc
echo '3*2.0' | bc
```
同时,加法和减法的相关运算则与之相似:
```
echo '7-4.15' | bc
```
### 其他进制系统
bc 的另一个有用的功能是可以使用除十进制以外的其他计数系统。例如,你可以轻松地做十六进制或二进制的数学运算。可以使用 `ibase``obase` 命令来分别设定输入和输出的进制系统。需要记住的是,一旦你使用了 `ibase`,之后你输入的任何数字都将被认为是在新定义的进制系统中。
要做十六进制数到十进制数的转换或运算,你可以使用类似下面的命令。请注意大于 9 的十六进制数必须是大写的A-F
```
echo 'ibase=16; A42F' | bc
echo 'ibase=16; 5F72+C39B' | bc
```
若要使得结果是十六进制数,则需要设定 `obase`
```
echo 'obase=16; ibase=16; 5F72+C39B' | bc
```
下面是一个小技巧。假如你在 shell 中做这些运算,怎样才能使得输入重新为十进制数呢?答案是使用 `ibase` 命令,但你必须设定它为在当前进制中与十进制中的 10 等价的值。例如,假如 `ibase` 被设定为十六进制,你需要输入:
```
ibase=A
```
一旦你执行了上面的命令,所有输入的数字都将是十进制的了,接着你便可以输入 `obase=10` 来重置输出的进制系统。
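把这些规则拼起来,你还可以在任意两种进制之间直接转换,比如把十六进制数转成二进制。注意要先设定 `obase` 再设定 `ibase`,否则 obase 的值会被按新的输入进制来解释:
```
echo 'obase=2; ibase=16; FF' | bc
```
输出为 11111111。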
### 结论
上面所提到的只是 bc 所能做到的基础。它还允许你为某些复杂的运算和程序定义函数、变量和循环结构。你可以在你的系统中将这些程序保存为文本文件以便你在需要的时候使用。你还可以在网上找到更多的资源,它们提供了更多的例子以及额外的函数库。快乐地计算吧!
--------------------------------------------------------------------------------
via: http://www.tecmint.com/mandatory-access-control-with-selinux-or-apparmor-linux/
作者:[Paul W. Frields][a]
译者:[FSSlc](https://github.com/FSSlc)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://fedoramagazine.org/author/pfrields/
[1]: http://phodd.net/gnu-bc/

View File

@ -0,0 +1,199 @@
建立你的第一个仓库
======================================
![](https://opensource.com/sites/default/files/styles/image-full-size/public/images/life/open_abstract_pieces.jpg?itok=ZRt0Db00)
现在是时候学习怎样创建你自己的仓库了,还有怎样增加文件和完成提交。
在本系列前面文章的安装过程中,你已经学习了作为一个目标用户怎样与 Git 进行交互你就像一个漫无目的的流浪者一样偶然发现了一个开源项目网站然后克隆了仓库Git 走进了你的生活。学习怎样和 Git 进行交互并不像你想的那样困难,或许你并不确信现在是否应该使用 Git 完成你的工作。
Git 被认为是大多数软件项目的首选工具,它不仅能胜任大多数软件项目;它也能管理你乱糟糟的待办清单(如果它们对你很重要的话,也未尝不可!)、你的配置文件、日记、项目进展日志,甚至源代码!
使用 Git 是很有必要的,毕竟,你肯定有因为一个备份文件不能够辨认出版本信息而烦恼的时候。
你不使用 Git它也就不会为你工作或者也可以把 Git 理解为“没有任何推送就像源头指针一样”【译注: HEAD 可以理解为“头指针”,是当前工作区的“基础版本”,当执行提交时, HEAD 指向的提交将作为新提交的父提交。】。我保证,你很快就会对 Git 有所了解 。
### 类比于录音
我们更喜欢谈论快照上的图像,因为很多人都可以通过一个相册很快辨认出每个照片上特有的信息。这可能很有用,然而,我认为 Git 更像是在进行声音的记录。
传统的录音机,可能你对于它的部件不是很清楚:它包含转轴并且正转或反转,使用磁带保存声音波形,通过放音头记录声音并保存到磁带上然后播放给收听者。
除了往前退磁带,你也可以把磁带多绕几圈到磁带前面的部分,或快进跳过前面的部分到最后。
想象一下 70 年代的磁带录制的声音。你可能想到那会正在反复练习一首歌直到非常完美,它们最终被记录下来了。起初,你记录了鼓声,低音,然后是吉他声,还有其他的声音。每次你录音,工作人员都会把磁带重绕并设置为环绕模式,这样在你演唱的时候录音磁带就会播放之前录制的声音。如果你是低音歌唱,你唱歌的时候就需要把有鼓声的部分作为背景音乐,然后就是吉他声、鼓声、低音(和牛铃声【译注:一种打击乐器,状如四棱锥。】)等等。在每一环,你完成了整个部分,到了下一环,工作人员就开始在磁带上制作你的演唱作品。
你也可以拷贝或换出整个磁带,这是你需要继续录音并且进行多次混合的时候需要做的。
现在我希望对于上述 70 年代的录音工作的描述足够生动,我们就可以把 Git 的工作想象成一个录音磁带了。
### 新建一个 Git 仓库
首先得为我们的虚拟的录音机买一些磁带。在 Git 术语中,这就是仓库;它是完成所有工作的基础,也就是说这里是存放 Git 文件的地方(即 Git 工作区)。
任何目录都可以是一个 Git 仓库,但是在开始的时候需要进行一次更新。需要下面三个命令:
- 创建目录(如果你喜欢的话,你可以在你的 GUI 文件管理器里面完成。)
- 在终端里查看目录。
- 初始化这个目录,使它可以被 Git 管理。
特别是运行如下代码:
```
$ mkdir ~/jupiter # 创建目录
$ cd ~/jupiter # 进入目录
$ git init . # 初始化你的新 Git 工作区
```
在这个例子中,文件夹 jupiter 是空的但却成为了你的 Git 仓库。
有了仓库接下来的事件就按部就班了。你可以克隆项目仓库,你可以在一个历史点前后来回穿梭(前提是你有一个历史点),创建可交替时间线,然后剩下的工作 Git 就都能正常完成了。
在 Git 仓库里面工作和在任何目录里面工作都是一样的在仓库中新建文件复制文件保存文件。你可以像平常一样完成工作Git 并不复杂,除非你把它想复杂了。
在本地的 Git 仓库中,一个文件可以有下面这三种状态:
- 未跟踪文件:你在仓库里新建了一个文件,但是你没有把文件加入到 Git 的提交任务提交暂存区stage中。
- 已跟踪文件:已经加入到 Git 暂存区的文件。
- 暂存区文件:存在于暂存区的文件已经加入到 Git 的提交队列中。
任何你新加入到 Git 仓库中的文件都是未跟踪文件。文件还保存在你的电脑硬盘上,但是你没有告诉 Git 这是需要提交的文件,就像我们的录音机,如果你没有打开录音机;乐队开始演唱了,但是录音机并没有准备录音。
不用担心Git 会告诉你存在的问题并提示你怎么解决:
```
$ echo "hello world" > foo
$ git status
位于您当前工作的分支 master 上
未跟踪文件:
(使用 "git add <file>" 更新要提交的内容)
foo
没有任何提交任务,但是存在未跟踪文件(用 "git add" 命令加入到提交任务)
```
你看到了Git 会提醒你怎样把文件加入到提交任务中。
### 不使用 Git 命令进行 Git 操作
在 GitHub 或 GitLab译注GitLab 是一个用于仓库管理系统的开源项目。使用Git作为代码管理工具并在此基础上搭建起来的web服务。上创建一个仓库大多是使用鼠标点击完成的。这不会很难你单击 New Repository 这个按钮就会很快创建一个仓库。
在仓库中新建一个 README 文件是一个好习惯,这样人们在浏览你的仓库的时候就可以知道你的仓库基于什么项目,更有用的是通过 README 文件可以确定克隆的是否为一个非空仓库。
克隆仓库通常很简单,但是在 GitHub 上获取仓库改动权限就不简单了,为了进行用户验证你必须有一个 SSH 秘钥。如果你使用 Linux 系统,通过下面的命令可以生成一个秘钥:
```
$ ssh-keygen
```
复制纯文本文件里的秘钥。你可以使用一个文本编辑器打开它,也可以使用 cat 命令:
```
$ cat ~/.ssh/id_rsa.pub
```
现在把你的秘钥拷贝到 [GitHub SSH 配置文件][1] 中,或者 [GitLab 配置文件][2]。
如果你通过使用 SSH 模式克隆了你的项目,就可以在你的仓库开始工作了。
另外,如果你的系统上没有安装 Git 的话也可以使用 GitHub 的文件上传接口来克隆仓库。
![](https://opensource.com/sites/default/files/2_githubupload.jpg)
### 跟踪文件
命令 git status 的输出会告诉你如果你想让 git 跟踪一个文件,你必须使用命令 git add 把它加入到提交任务中。这个命令把文件存在了暂存区暂存区存放的都是等待提交的文件或者把仓库保存为一个快照。git add 命令的最主要目的是为了区分你已经保存在仓库快照里的文件,还有新建的或你想提交的临时文件,至少现在,你都不用为它们之间的不同之处而费神了。
类比大型录音机,这个动作就像打开录音机开始准备录音一样。你可以按已经录音的录音机上的 pause 按钮来完成推送,或者按下重置按钮等待开始跟踪下一个文件。
如果你把文件加入到提交任务中Git 会自动标识为跟踪文件:
```
$ git add foo
$ git status
位于您当前工作的分支 master 上
下列修改将被提交:
(使用 "git reset HEAD <file>..." 将下列改动撤出提交任务)
新增文件foo
```
加入文件到提交任务中并不会生成一个记录。这仅仅是为了之后方便记录而把文件存放到暂存区。在你把文件加入到提交任务后仍然可以修改文件;文件会被标记为跟踪文件并且存放到暂存区,所以你在最终提交之前都可以改动文件或撤出提交任务(但是请注意:你并没有记录文件,所以如果你完全改变了文件就没有办法撤销了,因为你没有记住最终修改的准确时间。)。
如果你决定不把文件记录到 Git 历史列表中,那么你可以撤出提交任务,在 Git 中是这样做的:
```
$ git reset HEAD foo
```
这实际上就是删除了录音机里面的录音,你只是在工作区转了一圈而已。
### 大型提交
有时候,你会需要完成很多提交;我们以录音机类比,这就好比按下录音键并最终按下保存键一样。
在一个项目从建立到完成,你会按记录键无数次。比如,如果你通过你的方式使用一个新的 Python 工具包并且最终实现了窗口展示,然后你就很肯定的提交了文件,但是不可避免的会发生一些错误,现在你却不能撤销你的提交操作了。
一次提交会记录仓库中所有的暂存区文件。Git 只记录加入到提交任务中的文件,也就是说在过去某个时刻你使用 git add 命令加入到暂存区的所有文件。还有从先前的提交开始被改动的文件。如果没有其他的提交,所有的跟踪文件都包含在这次提交中,因为在浏览 Git 历史点的时候,它们没有存在于仓库中。
完成一次提交需要运行下面的命令:
```
$ git commit -m 'My great project, first commit.'
```
这就保存了所有需要在仓库中提交的文件(或者,如果你说到 Gallifreyan【译注英国电视剧《神秘博士》里的时间领主使用的一种优雅的语言】,它们可能就是“固定的时间点” )。你不仅能看到整个提交记录,还能通过 git log 命令查看修改日志找到提交时的版本号
```
$ git log --oneline
55df4c2 My great project, first commit.
```
如果想浏览更多信息,只需要使用不带 --oneline 选项的 git log 命令。
在这个例子中提交时的版本号是 55df4c2。它被叫做 commit hash译注一个SHA-1生成的哈希码用于表示一个git commit对象。它表示着刚才你的提交包含的所有改动覆盖了先前的记录。如果你想要“倒回”到你的提交历史点上就可以用这个 commit hash 作为依据。
你可以把 commit hash 想象成一个声音磁带上的 [SMPTE timecode][3],或者再夸张一点,这就是好比一个黑胶唱片上两首不同的歌之间的不同点,或是一个 CD 上的轨段编号。
你在很久以前改动了文件并把它们加入到提交任务中,最终完成提交,这就会生成新的 commit hash每个 commit hash 标示的历史点都代表着你的产品的不同版本。
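举个例子(这里沿用上面示例中的版本号 55df4c2实际操作时请换成你自己仓库里的 hash你可以这样在历史点之间往返
```
$ git checkout 55df4c2    # 回到这次提交时的快照
$ git checkout master     # 回到 master 分支最新的历史点
```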
这就是 Charlie Brown 把 Git 称为版本控制系统的原因。
在接下来的文章中,我们将会讨论你需要知道的关于 Git HEAD 的一切,我们不准备讨论关于 Git 的提交历史问题。基本不会提及,但是你可能会需要了解它(或许你已经有所了解?)。
--------------------------------------------------------------------------------
via: https://opensource.com/life/16/7/creating-your-first-git-repository
作者:[Seth Kenlon][a]
译者:[vim-kakali](https://github.com/vim-kakali)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://opensource.com/users/seth
[1]: https://github.com/settings/keys
[2]: https://gitlab.com/profile/keys
[3]: http://slackermedia.ml/handbook/doku.php?id=timecode

View File

@ -0,0 +1,62 @@
Linux 密码管理器Keeweb
================================
![](http://www.linuxandubuntu.com/uploads/2/1/1/5/21152474/keeweb_1.png?608)
如今,我们依赖于越来越多的线上服务。我们每注册一个线上服务,就要设置一个密码;如此,我们就不得不记住数以百计的密码。这样对于每个人来说,都很容易忘记密码。我将在本文中介绍 Keeweb它是一款 Linux 密码管理器,可以将你所有的密码安全地存储在线上或线下。
当谈及 Linux 密码管理器时,我们会发现有很多这样的软件。我们已经在 LinuxAndUbuntu 上讨论过像 [Keepass][1] 和 [Encryptr一个基于零知识系统的密码管理器][2] 这样的密码管理器。Keeweb 则是另外一款我们将在本文讲解的 Linux 密码管理器。
### Keeweb 可以在线下或线上存储密码
Keeweb 是一款跨平台的密码管理器。它可以在线下存储你所有的密码,并且能够同步到你自己的云存储服务上,例如 OneDrive, Google Drive, Dropbox 等。Keeweb 并没有它自己的用于同步你密码的在线数据库。
要使用 Keeweb 连接你的线上存储服务,只需要点击更多,然后再点击你想要使用的服务即可。
![](http://www.linuxandubuntu.com/uploads/2/1/1/5/21152474/keeweb.png?685)
现在Keeweb 会提示你登录到你的云盘。登录成功后,给 Keeweb 授权使用你的账户。
![](http://www.linuxandubuntu.com/uploads/2/1/1/5/21152474/authenticate-dropbox-with-keeweb_orig.jpg?649)
### 使用 Keeweb 存储密码
使用 Keeweb 存储你的密码是非常容易的。你可以使用一个复杂的密码加密你的密码文件。Keeweb 也允许你使用一个秘钥文件来锁定密码文件,但是我并不推荐这种方式。如果某个家伙拿到了你的秘钥文件,他只需要简单点击一下就可以解锁你的密码文件。
#### 创建密码
想要创建一个新的密码,你只需要简单地点击 `+` 号,然后你就会看到所有需要填充的输入框。如果你想的话,可以创建更多的输入框。
#### 搜索密码
![](http://www.linuxandubuntu.com/uploads/2/1/1/5/21152474/search-passwords_orig.png)
Keeweb 拥有一个图标库,这样你就可以轻松地找到任何特定的密码入口。你可以改变图标的颜色、下载更多的图标,甚至可以直接从你的电脑中导入图标。这对于密码搜索来说,异常好使。
相似服务的密码可以分组,这样你就可以在一个文件夹的一个地方同时找到它们。你也可以给密码打上标签并把它们存放在不同分类中。
![](http://www.linuxandubuntu.com/uploads/2/1/1/5/21152474/tags-passwords-in-keeweb.png?283)
### 主题
![](http://www.linuxandubuntu.com/uploads/2/1/1/5/21152474/themes.png?304)
如果你喜欢类似于白色或者高对比度的亮色主题,你可以在“设置 > 通用 > 主题”中修改。Keeweb有四款可供选择的主题其中两款为暗色另外两款为亮色。
### 不喜欢 Linux 密码管理器?没问题!
我已经发表过文章介绍了另外两款 Linux 密码管理器,它们分别是 Keepass 和 Encryptr在 Reddit 和其它社交媒体上有一些关于它们的争论:有些人反对使用任何密码管理器,也有些人持相反意见。在本文中,我想要澄清的是,保管密码文件是我们自己的责任。我认为像 Keepass 和 Keeweb 这样的密码管理器是非常好用的,因为它们并没有自己的云来存放你的密码。这些密码管理器会创建一个文件,然后你可以将它存放在你的硬盘上,或者使用像 VeraCrypt 这样的应用给它加密。我个人不使用也不推荐使用那些将密码存储在它们自己数据库的服务。
--------------------------------------------------------------------------------
via: http://www.linuxandubuntu.com/home/keeweb-a-linux-password-manager
作者:[author][a]
译者:[ChrisLeeGit](https://github.com/chrisleegit)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: http://www.linuxandubuntu.com/home/keeweb-a-linux-password-manager
[1]: http://www.linuxandubuntu.com/home/keepass-password-management-tool-creates-strong-passwords-and-keeps-them-secure
[2]: http://www.linuxandubuntu.com/home/encryptr-zero-knowledge-system-based-password-manager-for-linux

View File

@ -0,0 +1,218 @@
如何在 Ubuntu Linux 16.04 LTS 中使用多条连接加速 apt-get/apt
=========================================================================================
我该如何在 Ubuntu Linux 16.04 或者 14.04 LTS 中通过多个连接同时下载软件包,以加速 apt-get 或者 apt 命令?
你需要使用 apt-fast 这个 shell 封装器。它会通过多个连接同时下载一个包来加速 apt-get/apt 和 aptitude 命令,所有的包都会被同时下载。它使用 aria2c 作为默认的下载加速器。
### 安装 apt-fast 工具
在Ubuntu Linux 14.04或者之后的版本尝试下面的命令:
```
$ sudo add-apt-repository ppa:saiarcot895/myppa
```
示例输出:
![](http://s0.cyberciti.org/uploads/faq/2016/07/install-apt-fast-repo.jpg)
更新你的仓库:
```
$ sudo apt-get update
```
或者
```
$ sudo apt update
```
![](http://s0.cyberciti.org/uploads/faq/2016/07/install-apt-fast-command.jpg)
安装 apt-fast
```
$ sudo apt-get -y install apt-fast
```
或者
```
$ sudo apt -y install apt-fast
```
示例输出:
```
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
aria2 libc-ares2 libssh2-1
Suggested packages:
aptitude
The following NEW packages will be installed:
apt-fast aria2 libc-ares2 libssh2-1
0 upgraded, 4 newly installed, 0 to remove and 0 not upgraded.
Need to get 1,282 kB of archives.
After this operation, 4,786 kB of additional disk space will be used.
Do you want to continue? [Y/n] y
Get:1 http://01.archive.ubuntu.com/ubuntu xenial/universe amd64 libssh2-1 amd64 1.5.0-2 [70.3 kB]
Get:2 http://ppa.launchpad.net/saiarcot895/myppa/ubuntu xenial/main amd64 apt-fast all 1.8.3~137+git7b72bb7-0ubuntu1~ppa3~xenial1 [34.4 kB]
Get:3 http://01.archive.ubuntu.com/ubuntu xenial/main amd64 libc-ares2 amd64 1.10.0-3 [33.9 kB]
Get:4 http://01.archive.ubuntu.com/ubuntu xenial/universe amd64 aria2 amd64 1.19.0-1build1 [1,143 kB]
54% [4 aria2 486 kB/1,143 kB 42%] 20.4 kB/s 32s
```
### 配置 apt-fast
你将会得到下面的提示,必须输入一个 5 到 16 之间的数值:
![](http://s0.cyberciti.org/uploads/faq/2016/07/max-connection-10.jpg)
并且
![](http://s0.cyberciti.org/uploads/faq/2016/07/apt-fast-confirmation-box.jpg)
你可以直接编辑设置:
```
$ sudo vi /etc/apt-fast.conf
```
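例如,要把每个包的最大并发连接数改为 10可以修改其中的 `_MAXNUM` 变量(下面的片段只是示意,具体变量名请以你安装版本的配置文件为准):
```
# /etc/apt-fast.conf 片段(示意)
_APTMGR=apt-get    # apt-fast 背后实际调用的包管理命令
_MAXNUM=10         # 每个包同时使用的最大连接数
```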
>**请注意这个工具并不是给慢速网络连接的,它是给快速网络连接的。如果你的网速慢,那么你将无法从这个工具中得到好处。**
### 我该怎么使用 apt-fast 命令?
语法是:
```
apt-fast command
apt-fast [options] command
```
#### 使用apt-fast取回新的包列表
```
sudo apt-fast update
```
#### 使用apt-fast执行升级
```
sudo apt-fast upgrade
```
#### 执行发行版升级(发布或者强制内核升级),输入:
```
$ sudo apt-fast dist-upgrade
```
#### 安装新的包
语法是:
```
sudo apt-fast install pkg
```
比如,要安装 nginx输入
```
$ sudo apt-fast install nginx
```
示例输出:
![](http://s0.cyberciti.org/uploads/faq/2016/07/sudo-apt-fast-install.jpg)
#### 删除包
```
$ sudo apt-fast remove pkg
$ sudo apt-fast remove nginx
```
#### 删除包和它的配置文件
```
$ sudo apt-fast purge pkg
$ sudo apt-fast purge nginx
```
#### 删除所有未使用的包
```
$ sudo apt-fast autoremove
```
#### 下载源码包
```
$ sudo apt-fast source pkgNameHere
```
#### 清理下载的文件
```
$ sudo apt-fast clean
```
#### 清理旧的下载文件
```
$ sudo apt-fast autoclean
```
#### 验证没有破坏的依赖
```
$ sudo apt-fast check
```
#### 下载二进制包到当前目录
```
$ sudo apt-fast download pkgNameHere
$ sudo apt-fast download nginx
```
示例输出:
```
[#7bee0c 0B/0B CN:1 DL:0B]
07/26 15:35:42 [NOTICE] Verification finished successfully. file=/home/vivek/nginx_1.10.0-0ubuntu0.16.04.2_all.deb
07/26 15:35:42 [NOTICE] Download complete: /home/vivek/nginx_1.10.0-0ubuntu0.16.04.2_all.deb
Download Results:
gid |stat|avg speed |path/URI
======+====+===========+=======================================================
7bee0c|OK | n/a|/home/vivek/nginx_1.10.0-0ubuntu0.16.04.2_all.deb
Status Legend:
(OK):download completed.
```
#### 下载并显示指定包的changelog
```
$ sudo apt-fast changelog pkgNameHere
$ sudo apt-fast changelog nginx
```
--------------------------------------------------------------------------------
via: https://fedoramagazine.org/introducing-flatpak/
作者:[VIVEK GITE][a]
译者:[geekpi](https://github.com/geekpi)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: http://www.cyberciti.biz/tips/about-us

View File

@ -0,0 +1,184 @@
LFCS 系列第十三讲:如何配置并排除 GNU 引导加载程序GRUB故障
=====================================================================================
由于 LFCS 考试需求的变动已于 2016 年 2 月 2 日生效,因此我们向 [LFCS 系列][1] 添加了一些必要的话题。为了准备认证考试,我们也强烈推荐你去看 [LFCE 系列][2]。
![](http://www.tecmint.com/wp-content/uploads/2016/03/Configure-Troubleshoot-Grub-Boot-Loader.png)
>LFCS 系列第十三讲:配置并排除 Grub 引导加载程序故障。
本文将会向你介绍 GRUB 的知识,并会说明你为什么需要一个引导加载程序,以及它是如何增强系统通用性的。
[Linux 引导过程][3] 从你按下电脑电源键开始,直到你拥有一个功能完备的系统为止,整个过程大致遵循以下顺序:
* 1. 一个叫做 **POST****上电自检**)的过程会对你的电脑硬件组件做全面的检查。
* 2. 当 **POST** 完成后,它会把控制权转交给引导加载程序,接下来引导加载程序会将 Linux 内核(以及 **initramfs**)加载到内存中并执行。
* 3. 内核首先检查并访问硬件,然后运行初始进程(主要以它的通用名 **init** 而为人熟知),接下来初始进程会启动一些服务,最后完成系统启动过程。
在该系列的第七讲(“[SysVinit, Upstart, 和 Systemd][4]”)中,我们介绍了现代 Linux 发行版使用的一些服务管理系统和工具。在继续学习之前,你可能想要回顾一下那一讲的知识。
### GRUB 引导装载程序介绍
在现代系统中,你会发现有两种主要的 **GRUB** 版本(一种是有时被称为 **GRUB Legacy** 的 **v1** 版本,另一种则是 **v2** 版本),虽说多数最新版本的发行版系统都默认使用了 **v2** 版本。如今,只有 **红帽企业版 Linux 6** 及其衍生系统仍在使用 **v1** 版本。
因此,在本指南中,我们将着重关注 **v2** 版本的功能。
不管 **GRUB** 的版本是什么,一个引导加载程序都允许用户:
* 1). 通过指定使用不同的内核来修改系统的表现方式;
* 2). 从多个操作系统中选择一个启动;
* 3). 添加或编辑配置节点来改变启动选项等。
如今,**GNU** 项目负责维护 **GRUB**,并在它们的网站上提供了丰富的文档。当你在阅读这篇指南时,我们强烈建议你看下 [GNU 官方文档][6]。
当系统引导时,你会在主控制台看到如下的 **GRUB** 画面。最开始,你可以根据提示在多个内核版本中选择一个内核(默认情况下,系统将会使用最新的内核启动),并且可以进入 **GRUB** 命令行模式(使用 `c` 键),或者编辑启动项(按下 `e` 键)。
![](http://www.tecmint.com/wp-content/uploads/2016/03/GRUB-Boot-Screen.png)
> GRUB 启动画面
你会考虑使用一个旧版内核启动的原因之一,是之前工作正常的某个硬件设备在一次升级后出现了"怪毛病"acting up例如你可以参考 AskUbuntu 论坛中的 [这条链接][7])。
**GRUB v2** 的配置文件会在启动时从 `/boot/grub/grub.cfg``/boot/grub2/grub.cfg` 文件中读取,而 **GRUB v1** 使用的配置文件则来自 `/boot/grub/grub.conf``/boot/grub/menu.lst`。这些文件不能直接手动编辑,而是根据 `/etc/default/grub` 的内容和 `/etc/grub.d` 目录中的文件来修改的。
**CentOS 7** 上,当系统最初完成安装后,会生成如下的配置文件:
```
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="vconsole.keymap=la-latin1 rd.lvm.lv=centos_centos7-2/swap crashkernel=auto vconsole.font=latarcyrheb-sun16 rd.lvm.lv=centos_centos7-2/root rhgb quiet"
GRUB_DISABLE_RECOVERY="true"
```
除了在线文档外,你也可以使用下面的命令查阅 GNU GRUB 手册:
```
# info grub
```
如果你对 `/etc/default/grub` 文件中的可用选项特别感兴趣的话,你可以直接查阅配置一节的帮助文档:
```
# info -f grub -n 'Simple configuration'
```
使用上述命令,你会发现 `GRUB_TIMEOUT` 用于设置启动画面出现和系统自动开始启动(除非被用户中断)之间的时间。当该变量值为 `-1` 时,除非用户主动做出选择,否则不会开始启动。
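例如,想让系统一直停在 GRUB 菜单、直到用户主动做出选择,可以这样设置(示意):
```
GRUB_TIMEOUT=-1
```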
当同一台机器上安装了多个操作系统或内核后,`GRUB_DEFAULT` 就需要用一个整数来指定 GRUB 启动画面默认选择启动的操作系统或内核条目。我们既可以通过上述启动画面查看启动条目列表,也可以使用下面的命令:
**在 CentOS 和 openSUSE 系统上:**
```
# awk -F\' '$1=="menuentry " {print $2}' /boot/grub2/grub.cfg
```
**在 Ubuntu 系统上:**
```
# awk -F\' '$1=="menuentry " {print $2}' /boot/grub/grub.cfg
```
如下图所示的例子中,如果我们想要使用版本为 `3.10.0-123.el7.x86_64` 的内核(第四个条目),我们需要将 `GRUB_DEFAULT` 设置为 `3`(条目从零开始编号),如下所示:
```
GRUB_DEFAULT=3
```
![](http://www.tecmint.com/wp-content/uploads/2016/03/Boot-System-with-Old-Kernel-Version.png)
> 使用旧版内核启动系统
最后一个需要特别关注的 GRUB 配置变量是 `GRUB_CMDLINE_LINUX`,它是用来给内核传递选项的。我们可以在 [内核变量文件][8] 和 [man 7 bootparam][9] 中找到能够通过 GRUB 传递给内核的选项的详细文档。
我的 **CentOS 7** 服务器上当前的选项是:
```
GRUB_CMDLINE_LINUX="vconsole.keymap=la-latin1 rd.lvm.lv=centos_centos7-2/swap crashkernel=auto vconsole.font=latarcyrheb-sun16 rd.lvm.lv=centos_centos7-2/root rhgb quiet"
```
为什么你希望修改默认的内核参数或者传递额外的选项呢?简单来说,在很多情况下,你需要告诉内核某些由内核自身无法判断的硬件参数,或者是覆盖一些内核会检测的值。
不久之前,就在我身上发生过这样的事情,当时我在自己已用了 10 年的老笔记本上尝试衍生自 **Slackware****Vector Linux**。完成安装后,内核并没有检测出我的显卡的正确配置,所以我不得不通过 GRUB 传递修改过的内核选项来让它工作。
另外一个例子是当你需要将系统切换到单用户模式以执行维护工作时。为此,你可以在 `GRUB_CMDLINE_LINUX` 变量中追加 `single`,然后重启即可:
```
GRUB_CMDLINE_LINUX="vconsole.keymap=la-latin1 rd.lvm.lv=centos_centos7-2/swap crashkernel=auto vconsole.font=latarcyrheb-sun16 rd.lvm.lv=centos_centos7-2/root rhgb quiet single"
```
编辑完 `/etc/default/grub` 之后,你需要运行 `update-grub` (在 Ubuntu 上)或者 `grub2-mkconfig -o /boot/grub2/grub.cfg` (在 **CentOS****openSUSE** 上)命令来更新 `grub.cfg` 文件(否则,改动会在系统启动时丢失)。
这条命令会处理早先提到的一些启动配置文件来更新 `grub.cfg` 文件。这种方法可以确保改动持久化,而在启动时刻通过 GRUB 传递的选项仅在当前会话期间有效。
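也就是说,每次修改完 `/etc/default/grub` 之后,都要按发行版执行对应的命令:
```
# Ubuntu 上
$ sudo update-grub

# CentOS 和 openSUSE 上
# grub2-mkconfig -o /boot/grub2/grub.cfg
```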
### 修复 Linux GRUB 问题
如果你安装了第二个操作系统,或者由于人为失误而导致你的 GRUB 配置文件损坏了,依然有一些方法可以让你恢复并能够再次启动系统。
在启动画面中按下 `c` 键进入 GRUB 命令行模式(记住,你也可以按下 `e` 键编辑默认启动选项),并可以在 GRUB 提示中输入 `help` 命令获得可用命令:
![](http://www.tecmint.com/wp-content/uploads/2016/03/Fix-Grub-Issues-in-Linux.png)
> 修复 Linux 的 Grub 配置问题
我们将会着重关注 **ls** 命令,它会列出已安装的设备和文件系统,并且我们将会看看它可以查找什么。在下面的图片中,我们可以看到有 4 块硬盘(`hd0` 到 `hd3`)。
貌似只有 `hd0` 已经分区了(这一点可以从 msdos1 和 msdos2 看出来,这里的 1 和 2 是分区号msdos 则是分区方案)。
现在我们来看看能否在第一个分区 `hd0`**msdos1**)上找到 GRUB。这种方法允许我们启动 Linux并且使用高级工具修复配置文件或者如果有必要的话干脆重新安装 GRUB
```
# ls (hd0,msdos1)/
```
从高亮区域可以发现,`grub2` 目录就在这个分区:
![](http://www.tecmint.com/wp-content/uploads/2016/03/Find-Grub-Configuration.png)
> 查找 Grub 配置
一旦我们确信了 GRUB 位于 (**hd0, msdos1**),那就让我们告诉 GRUB 该去哪儿查找它的配置文件并指示它去尝试启动它的菜单:
```
set prefix=(hd0,msdos1)/grub2
set root=(hd0,msdos1)
insmod normal
normal
```
![](http://www.tecmint.com/wp-content/uploads/2016/03/Find-and-Launch-Grub-Menu.png)
> 查找并启动 Grub 菜单
然后,在 GRUB 菜单中,选择一个条目并按下 **Enter** 键以使用它启动。一旦系统成功启动后,你就可以运行 `grub2-install /dev/sdX` 命令修复问题了(将 `sdX` 改成你想要安装 GRUB 的设备)。然后启动信息将会更新,并且所有相关文件都会得到恢复。
```
# grub2-install /dev/sdX
```
其它更加复杂的情景及其修复建议都记录在 [Ubuntu GRUB2 故障排除指南][10] 中。该指南中阐述的概念对于其它发行版也是有效的。
### 总结
本文向你介绍了 GRUB并指导你可以在何处找到线上和线下的文档同时说明了如何面对由于引导加载相关的问题而导致系统无法正常启动的情况。
幸运的是GRUB 是文档支持非常丰富的工具之一,你可以使用我们在文中分享的资源非常轻松地获取已安装的文档或在线文档。
你有什么问题或建议吗?请不要犹豫,使用下面的评论框告诉我们吧。我们期待着来自你的回复!
--------------------------------------------------------------------------------
via: http://www.tecmint.com/configure-and-troubleshoot-grub-boot-loader-linux/
作者:[Gabriel Cánepa][a]
译者:[ChrisLeeGit](https://github.com/chrisleegit)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创翻译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: http://www.tecmint.com/author/gacanepa/
[1]: http://www.tecmint.com/sed-command-to-create-edit-and-manipulate-files-in-linux/
[2]: http://www.tecmint.com/installing-network-services-and-configuring-services-at-system-boot/
[3]: http://www.tecmint.com/linux-boot-process/
[4]: http://www.tecmint.com/linux-boot-process-and-manage-services/
[5]: http://www.tecmint.com/best-linux-log-monitoring-and-management-tools/
[6]: http://www.gnu.org/software/grub/manual/
[7]: http://askubuntu.com/questions/82140/how-can-i-boot-with-an-older-kernel-version
[8]: https://www.kernel.org/doc/Documentation/kernel-parameters.txt
[9]: http://man7.org/linux/man-pages/man7/bootparam.7.html
[10]: https://help.ubuntu.com/community/Grub2/Troubleshooting

View File

@ -0,0 +1,275 @@
Awk 系列第 8 节:学习怎样使用 Awk 变量、数值表达式以及赋值运算符
=======================================================================================
我相信 [Awk 命令系列][1] 将会令人兴奋不已,在系列的前几节我们讨论了在 Linux 中处理文件和筛选字符串需要的基本 Awk 命令。
在这一部分,我们会对处理更复杂的文件和筛选字符串操作需要的更高级的命令进行讨论。因此,我们将会看到关于 Awk 的一些特性诸如变量,数值表达式和赋值运算符。
![](http://www.tecmint.com/wp-content/uploads/2016/07/Learn-Awk-Variables-Numeric-Expressions-Assignment-Operators.png)
>学习 Awk 变量,数值表达式和赋值运算符
你可能已经在很多编程语言中接触过它们,比如 shell、C、Python 等;这些概念在理解上和在那些语言中没有什么不同,所以你不用担心很难理解。在这一小节中,我们将会简短地讲解常用的一些 Awk 特性。
这一小节可能是 Awk 命令里最容易理解的部分,所以放松点,我们开始吧。
### 1. Awk 变量
在任何编程语言中,变量都是一个存储了值的占位符;当你在程序中新建一个变量时,程序运行时会为它占用一部分内存空间,你为变量赋的值就存储在这些内存空间上。
你可以像下面这样定义 shell 变量一样定义 Awk 变量:
```
variable_name=value
```
上面的语法:
- `variable_name`: 为定义的变量的名字
- `value`: 为变量赋的值
再看下面的一些例子:
```
computer_name="tecmint.com"
port_no="22"
email="admin@tecmint.com"
server=computer_name
```
观察上面的简单的例子,在定义第一个变量的时候,值 'tecmint.com' 被赋给了 'computer_name' 变量。
此外,值 22 也被赋给了变量 port_no把一个变量的值赋给另一个变量也是可以的在最后一个例子中我们把变量 computer_name 的值赋给了变量 server。注意这里不能加引号在 Awk 中,带引号的 "computer_name" 是字符串字面量,不带引号的才是对变量的引用。
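可以用下面的单行命令验证这种不带引号的赋值(示意):
```
$ echo | awk '{ computer_name="tecmint.com" ; server=computer_name ; print server }'
tecmint.com
```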
你可以看看 [本系列的第 2 节][2] 中提到的字段编辑,那里我们讨论了 Awk 怎样将输入行分隔为若干字段,以及怎样使用标准的字段访问操作符 `$` 来访问分割出来的各个字段。我们也可以像下面这样把字段的值赋给变量:
```
first_name=$2
second_name=$3
```
在上面的例子中,变量 first_name 的值设置为第二个字段second_name 的值设置为第三个字段。
再举个例子,有一个名为 names.txt 的文件,这个文件包含了一个应用程序的用户列表,列表中有用户的名字first name、姓氏last name以及性别。可以使用 [cat 命令][3] 查看文件内容:
```
$ cat names.txt
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/List-File-Content-Using-cat-Command.png)
>使用 cat 命令查看列表文件内容
然后,我们也可以使用下面的 Awk 命令,把列表中第一个用户的名字和姓氏分别存储到变量 first_name 和 second_name 上:
```
$ awk '/Aaron/{ first_name=$2 ; second_name=$3 ; print first_name, second_name ; }' names.txt
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/Store-Variables-Using-Awk-Command.png)
>使用 Awk 命令为变量赋值
再看一个例子,当你在终端运行 'uname -a' 时,它可以打印出所有的系统信息。
第二个字段包含了你的 'hostname',因此,我们可以像下面这样把它赋给一个叫做 hostname 的变量并且用 Awk 打印出来。
```
$ uname -a
$ uname -a | awk '{hostname=$2 ; print hostname ; }'
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/Store-Command-Output-to-Variable-Using-Awk.png)
>使用 Awk 把命令的输出赋给变量
### 2. 数值表达式
在 Awk 中,数值表达式使用下面的数值运算符组成:
- `*` : 乘法运算符
- `+` : 加法运算符
- `/` : 除法运算符
- `-` : 减法运算符
- `%` : 取模运算符
- `^` : 指数运算符
数值表达式的语法是:
```
operand1 operator operand2
```
上面的 operand1 和 operand2 可以是数值或变量,运算符可以是上面列出的任意一种。
下面是一些展示怎样使用数值表达式的例子:
```
counter=0
num1=5
num2=10
num3=num2-num1
counter=counter+1
```
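下面用一个可以直接运行的单行命令演示数值表达式的效果(示意):
```
$ echo "5 10" | awk '{ num3 = $2 - $1 ; print num3 }'
5
```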
理解了 Awk 中数值表达式的用法之后,我们就可以看下面的例子了。文件 domains.txt 里包含了所有属于 Tecmint 的域名。
```
news.tecmint.com
tecmint.com
linuxsay.com
windows.tecmint.com
tecmint.com
news.tecmint.com
tecmint.com
linuxsay.com
tecmint.com
news.tecmint.com
tecmint.com
linuxsay.com
windows.tecmint.com
tecmint.com
```
可以使用下面的命令查看文件的内容:
```
$ cat domains.txt
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/View-Contents-of-File.png)
>查看文件内容
如果想要计算出域名 tecmint.com 在文件中出现的次数,我们就可以通过写一个简单的脚本实现这个功能:
```
#!/bin/bash
for file in $@; do
if [ -f $file ] ; then
#print out filename
echo "File is: $file"
#print a number incrementally for every line containing tecmint.com
awk '/^tecmint.com/ { counter=counter+1 ; printf "%s\n", counter ; }' $file
else
#print error info incase input is not a file
echo "$file is not a file, please specify a file." >&2 && exit 1
fi
done
#terminate script with exit code 0 in case of successful execution
exit 0
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/Shell-Script-to-Count-a-String-in-File.png)
>计算一个字符串或文本在文件中出现次数的 shell 脚本
写完脚本后保存并赋予执行权限。当我们以文件 domains.txt 作为输入运行脚本时,会得到下面的输出:
```
$ ./script.sh ~/domains.txt
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/Script-To-Count-String.png)
>计算字符串或文本出现次数的脚本
从脚本执行后的输出中,可以看到在文件 domains.txt 中包含域名 tecmint.com 的地方有 6 行,你可以自己计算进行验证。
### 3. 赋值操作符
我们要说的最后的 Awk 特性是赋值运算符,下面列出的只是 Awk 中的部分赋值运算符:
- `*=` : 乘法赋值运算符
- `+=` : 加法赋值运算符
- `/=` : 除法赋值运算符
- `-=` : 减法赋值运算符
- `%=` : 取模赋值运算符
- `^=` : 指数赋值运算符
下面是 Awk 中最简单的一个赋值操作的语法:
```
variable_name=variable_name operator operand
```
例子:
```
counter=0
counter=counter+1
num=20
num=num-1
```
你可以在 Awk 中使用上面的赋值运算符让命令更简短。参照先前的例子,我们可以使用下面这种格式进行赋值操作:
```
variable_name operator=operand
counter=0
counter+=1
num=20
num-=1
```
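这种简写同样可以在一个单行命令中验证(示意):
```
$ echo "20" | awk '{ num = $1 ; num -= 1 ; print num }'
19
```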
因此,我们可以在 shell 脚本中改变 Awk 命令,使用上面提到的 += 操作符:
```
#!/bin/bash
for file in $@; do
if [ -f $file ] ; then
#print out filename
echo "File is: $file"
#print a number incrementally for every line containing tecmint.com
awk '/^tecmint.com/ { counter+=1 ; printf "%s\n", counter ; }' $file
else
#print error info incase input is not a file
echo "$file is not a file, please specify a file." >&2 && exit 1
fi
done
#terminate script with exit code 0 in case of successful execution
exit 0
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/Alter-Shell-Script.png)
>改变了的 shell 脚本
在 [Awk 系列][4] 的这一部分,我们讨论了一些有用的 Awk 特性:变量、数值表达式和赋值运算符,并给出了一些使用它们的实例。
这些概念和其他的编程语言没有任何不同,但是可能在 Awk 中有一些意义上的区别。
在本系列的第 9 节,我们会学习更多的 Awk 特性,比如特殊模式BEGIN 和 END。请继续关注 Tecmint。
--------------------------------------------------------------------------------
via: http://www.tecmint.com/learn-awk-variables-numeric-expressions-and-assignment-operators/
作者:[Aaron Kili][a]
译者:[vim-kakali](https://github.com/vim-kakali)
校对:[校对ID](https://github.com/校对ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: http://www.tecmint.com/author/aaronkili/
[1]: http://www.tecmint.com/category/awk-command/
[2]: http://www.tecmint.com/awk-print-fields-columns-with-space-separator/
[3]: http://www.tecmint.com/13-basic-cat-command-examples-in-linux/
[4]: http://www.tecmint.com/category/awk-command/

View File

@ -0,0 +1,166 @@
awk 系列:如何使用 awk 的特殊模式 BEGIN 和 END
===============================================================
在 awk 系列的第八节,我们介绍了一些强大的 awk 命令功能,它们是变量、数字表达式和赋值运算符。
本节我们将学习更多的 awk 功能,即 awk 的特殊模式:`BEGIN` 和 `END`
![](http://www.tecmint.com/wp-content/uploads/2016/07/Learn-Awk-Patterns-BEGIN-and-END.png)
> 学习 awk 的模式 BEGIN 和 END
随着我们逐渐深入,并探索出更多构建复杂 awk 操作的方法,你将会看到 awk 的这些特殊功能是多么强大。
开始前,先让我们回顾一下 awk 系列的介绍,记得当我们开始这个系列时,我就指出 awk 指令的通用语法是这样的:
```
# awk 'script' filenames
```
在上述语法中awk 脚本拥有这样的形式:
```
/pattern/ { actions }
```
当你查看脚本中的模式(`/pattern/`)时,你会发现它通常是一个正则表达式;此外,你也可以使用特殊模式 `BEGIN` 和 `END` 作为模式。
因此,我们也能按照下面的形式编写一条 awk 命令:
```
awk '
BEGIN { actions }
/pattern/ { actions }
/pattern/ { actions }
……….
END { actions }
' filenames
```
假如你在 awk 脚本中使用了特殊模式:`BEGIN` 和 `END`,以下则是它们对应的含义:
- `BEGIN` 模式:是指 awk 将在读取任何输入行之前立即执行 `BEGIN` 中指定的动作。
- `END` 模式:是指 awk 将在它正式退出前执行 `END` 中指定的动作。
含有这些特殊模式的 awk 命令脚本的执行流程如下:
- 当在脚本中使用了 `BEGIN` 模式,则 `BEGIN` 中所有的动作都会在读取任何输入行之前执行。
- 然后,读入一个输入行并解析成不同的段。
- 接下来,每一条你指定的非特殊模式都会和输入行进行比较匹配,当匹配成功后,就会执行该模式对应的动作;对你指定的所有模式,重复执行这一步骤。
- 再接下来,对于所有输入行重复执行步骤 2 和 步骤 3。
- 当读取并处理完所有输入行后,假如你指定了 `END` 模式,那么将会执行相应的动作。
当你使用特殊模式时,想要在 awk 操作中获得最好的结果,你应当记住上面的执行顺序。
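在继续之前,可以先用一个极简的例子直观验证上述执行顺序(示意):
```
$ printf "第一行\n第二行\n" | awk 'BEGIN { print "读取之前" } { print "正在处理:", $0 } END { print "全部读完" }'
读取之前
正在处理: 第一行
正在处理: 第二行
全部读完
```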
为了便于理解,让我们使用第八节的例子进行演示,那个例子是关于 Tecmint 拥有的域名列表,并保存在一个叫做 domains.txt 的文件中。
```
news.tecmint.com
tecmint.com
linuxsay.com
windows.tecmint.com
tecmint.com
news.tecmint.com
tecmint.com
linuxsay.com
tecmint.com
news.tecmint.com
tecmint.com
linuxsay.com
windows.tecmint.com
tecmint.com
```
```
$ cat ~/domains.txt
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/View-Contents-of-File.png)
> 查看文件内容
在这个例子中,我们希望统计出 domains.txt 文件中域名 `tecmint.com` 出现的次数。所以,我们编写了一个简单的 shell 脚本帮助我们完成任务,它使用了变量、数学表达式和赋值运算符的思想,脚本内容如下:
```
#!/bin/bash
for file in $@; do
if [ -f $file ] ; then
# 输出文件名
echo "File is: $file"
# 输出一个递增的数字记录包含 tecmint.com 的行数
awk '/^tecmint.com/ { counter+=1 ; printf "%s\n", counter ; }' $file
else
# 若输入不是文件,则输出错误信息
echo "$file 不是一个文件,请指定一个文件。" >&2 && exit 1
fi
done
# 成功执行后使用退出代码 0 终止脚本
exit 0
```
现在让我们像下面这样在上述脚本的 awk 命令中应用这两个特殊模式:`BEGIN` 和 `END`
我们应当把脚本:
```
awk '/^tecmint.com/ { counter+=1 ; printf "%s\n", counter ; }' $file
```
改成:
```
awk ' BEGIN { print "文件中出现 tecmint.com 的次数是:" ; }
/^tecmint.com/ { counter+=1 ; }
END { printf "%s\n", counter ; }
' $file
```
在修改了 awk 命令之后,现在完整的 shell 脚本就像下面这样:
```
#!/bin/bash
for file in $@; do
if [ -f $file ] ; then
# 输出文件名
echo "File is: $file"
# 输出文件中 tecmint.com 出现的总次数
awk ' BEGIN { print "文件中出现 tecmint.com 的次数是:" ; }
/^tecmint.com/ { counter+=1 ; }
END { printf "%s\n", counter ; }
' $file
else
# 若输入不是文件,则输出错误信息
echo "$file 不是一个文件,请指定一个文件。" >&2 && exit 1
fi
done
# 成功执行后使用退出代码 0 终止脚本
exit 0
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/Awk-BEGIN-and-END-Patterns.png)
> awk 模式 BEGIN 和 END
当我们运行上面的脚本时,它会首先输出 domains.txt 文件的位置,然后执行 awk 命令脚本,该命令脚本中的特殊模式 `BEGIN` 将会在从文件读取任何行之前帮助我们输出这样的消息“`文件中出现 tecmint.com 的次数是:`”。
接下来,我们的模式 `/^tecmint.com/` 会在每个输入行中进行比较,对应的动作 `{ counter+=1 ; }` 会在每个匹配成功的行上执行,它会统计出 `tecmint.com` 在文件中出现的次数。
最终,`END` 模式将会输出域名 `tecmint.com` 在文件中出现的总次数。
```
$ ./script.sh ~/domains.txt
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/Script-to-Count-Number-of-Times-String-Appears.png)
> 用于统计字符串出现次数的脚本
最后总结一下,我们在本节中演示了更多的 awk 功能,并学习了特殊模式 `BEGIN``END` 的概念。
正如我之前所言,这些 awk 功能将会帮助我们构建出更复杂的文本过滤操作。第十节将会给出更多的 awk 功能,我们将会学习 awk 内置变量的思想,所以,请继续保持关注。
--------------------------------------------------------------------------------
via: http://www.tecmint.com/learn-use-awk-special-patterns-begin-and-end/
作者:[Aaron Kili][a]
译者:[ChrisLeeGit](https://github.com/chrisleegit)
校对:[校对ID](https://github.com/校对ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: http://www.tecmint.com/author/aaronkili/