Merge branch 'master' of github.com:LCTT/TranslateProject

This commit is contained in:
cposture 2016-08-02 13:33:47 +08:00
commit 9fb9348519
81 changed files with 3337 additions and 1132 deletions


@ -0,0 +1,111 @@
JStock: A Good Stock Portfolio Management Tool for Linux
================================================================================
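If you invest in the stock market, you probably know how important a sound portfolio management plan is. The goal of portfolio management is to build an investment plan tailored to your risk tolerance, time horizon, and financial goals. Given its importance, there is no shortage of commercial apps and stock market monitoring software, each touting sophisticated portfolio management and reporting features.
For those of us who are Linux enthusiasts, I have also found some **good open-source portfolio management tools** for managing and tracking a stock portfolio on Linux, and among them I highly recommend [JStock][1], which is written in Java. If you are not a Java fan, you may be put off by the fact that JStock runs on the heavyweight JVM. At the same time, it runs right away in any environment that has a JRE installed, and it runs smoothly on Linux.
The days when "open source" meant "cheap" or "subpar" are gone. Considering that JStock is a one-man job, what is most impressive is how many useful features it packs in as a portfolio management tool, and all the credit goes to its author, Yan Cheng Cheok. For example, JStock supports price monitoring via watchlists, multiple portfolios, custom and built-in stock indicators with scanning, 27 different stock markets, and cross-platform cloud backup and restore. JStock is available on multiple platforms (Linux, OS X, Android and Windows), and you can save your JStock portfolios to the cloud and seamlessly back them up and restore them across platforms.
Now I will show you how to install JStock and go over some details of using it.
### Install JStock on Linux ###
Because JStock is written in Java, you must [install a JRE][2] to run it. Note that JStock requires JRE 1.7 or later. If your JRE version does not meet this requirement, JStock fails to start with the following error.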
Exception in thread "main" java.lang.UnsupportedClassVersionError: org/yccheok/jstock/gui/JStock : Unsupported major.minor version 51.0
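Class file version 51.0 in the error above corresponds to Java 7, so one way to catch the problem before launching JStock is to compare the installed Java version with that minimum. The following is only a sketch: `java_major` is a hypothetical helper name, and the snippet assumes that `java -version` prints the version as its first quoted field, which is common but not guaranteed for every JVM build.

```shell
# Hypothetical helper: turn a Java version string into its major number.
# "1.7.0_80" -> 7 (legacy numbering), "11.0.2" -> 11 (modern numbering).
java_major() {
    ver=$1
    case "$ver" in
        1.*) ver=${ver#1.} ;;    # drop the legacy "1." prefix
    esac
    echo "${ver%%[._-]*}"        # keep everything before the first separator
}

# Only query the real JVM if one is on the PATH.
if command -v java >/dev/null 2>&1; then
    # `java -version` writes to stderr; assume the version is the first quoted field.
    installed=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
    if [ -n "$installed" ] && [ "$(java_major "$installed")" -lt 7 ]; then
        echo "Java $installed is too old for JStock (needs 1.7 or later)" >&2
    fi
fi
```

If the check prints nothing, either no `java` binary was found or the installed version is new enough.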
Once you have installed a JRE on your Linux system, download the latest JStock release from its official website and launch it.
$ wget https://github.com/yccheok/jstock/releases/download/release_1-0-7-13/jstock-1.0.7.13-bin.zip
$ unzip jstock-1.0.7.13-bin.zip
$ cd jstock
$ chmod +x jstock.sh
$ ./jstock.sh
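In the rest of this tutorial, let me demonstrate some of JStock's useful features.
### Monitor Stock Price Movements via Watchlists ###
With JStock you can create one or more watchlists, which automatically monitor stock price movements and notify you accordingly. You can add multiple stocks you are interested in to each watchlist. Then enter your alert thresholds in the "Fall Below" and "Rise Above" columns, which set the lower and upper price limits for that stock.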
![](https://c2.staticflickr.com/2/1588/23795349969_37f4b0f23c_c.jpg)
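For example, if you set the lower and upper limits for AAPL to $102 and $115.50 respectively, you get a desktop notification as soon as the price falls below $102 or rises above $115.50.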
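The threshold rule in the AAPL example can be written down in a few lines. This is a hypothetical sketch of the rule as described here, not JStock's actual code; the function name `check_alert` and its output labels are made up for illustration.

```shell
# Hypothetical sketch of the watchlist alert rule; not JStock's implementation.
# Usage: check_alert PRICE FALL_BELOW RISE_ABOVE
check_alert() {
    awk -v p="$1" -v lo="$2" -v hi="$3" 'BEGIN {
        # "+ 0" forces numeric rather than string comparison.
        if (p + 0 < lo + 0)      print "fall-below"
        else if (p + 0 > hi + 0) print "rise-above"
        else                     print "ok"
    }'
}

check_alert 101.50 102 115.50   # below the lower limit
check_alert 110.00 102 115.50   # inside the band, no alert
check_alert 116.00 102 115.50   # above the upper limit
```

Prices exactly equal to a limit do not trigger an alert in this sketch, matching the "below $102 or above $115.50" wording.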
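You can also enable email alerts, so that you receive price notifications by email. To set this up, go to the "Options" menu, turn on "Send message to email(s)" under the "Alert" tab, and enter your Gmail account. Once you complete the Gmail authorization step, JStock starts sending email alerts to that Gmail account (and optionally to other third-party email addresses).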
![](https://c2.staticflickr.com/2/1644/24080560491_3aef056e8d_b.jpg)
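### Manage Multiple Portfolios ###
JStock lets you manage multiple portfolios. This is handy if you use more than one stock broker. You can create a separate portfolio for each broker and manage your buy/sell/dividend transactions per broker. You can switch between portfolios by choosing one under the "Portfolio" menu. The following screenshot shows a hypothetical portfolio.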
![](https://c2.staticflickr.com/2/1646/23536385433_df6c036c9a_c.jpg)
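You can also account for broker fees: for each buy or sell transaction you can enter the broker fee, stamp duty, and clearing fee. If you are lazy, you can enable automatic fee calculation in the Options menu and define a fee schedule for each brokerage firm in advance. Then, when you add transactions to your portfolio, JStock automatically calculates and records the fees.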
![](https://c2.staticflickr.com/2/1653/24055085262_0e315c3691_b.jpg)
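### Scan Stocks with Built-in/Custom Stock Indicators ###
If you do technical analysis, you may want to screen stocks against various criteria, known here as "stock indicators". For stock screening, JStock offers several [pre-built technical indicators][3] that capture upward, downward, and reversal trends. The following is a list of available indicators.
- Moving Average Convergence Divergence (MACD)
- Relative Strength Index (RSI)
- Money Flow Index (MFI)
- Commodity Channel Index (CCI)
- Doji
- Golden Cross, Death Cross
- Top Gainers/Losers
To enable the pre-built indicators, click the "Stock Indicator Editor" tab in JStock. Then click the Install button in the right-side panel. Choose the "Install from JStock server" option, and install the indicators you want.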
![](https://c2.staticflickr.com/2/1476/23867534660_b6a9c95a06_c.jpg)
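Once one or more indicators are installed, you can use them to scan stocks. Select the "Stock Indicator Scanner" tab, click the "Scan" button at the bottom, and choose the indicator you want.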
![](https://c2.staticflickr.com/2/1653/24137054996_e8fcd10393_c.jpg)
Once you select the stocks to scan (e.g., NYSE, NASDAQ), JStock performs the scan and lists the stocks captured by the indicator.
![](https://c2.staticflickr.com/2/1446/23795349889_0f1aeef608_c.jpg)
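Besides the pre-built indicators, you can also define your own with a GUI-based editor. The following example screens for stocks whose current price is less than or equal to their 60-day average price.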
![](https://c2.staticflickr.com/2/1605/24080560431_3d26eac6b5_c.jpg)
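The rule behind that custom indicator, "latest price at or below the 60-day average", is simple enough to sketch outside JStock. The snippet below is an illustration only: the function name, the one-price-per-line input format, and the choice to average whatever is available when fewer than 60 prices are supplied are all assumptions, not JStock behavior.

```shell
# Hypothetical sketch: reads closing prices (oldest first, one per line) and
# reports whether the latest price is at or below the average of the last
# WINDOW prices. Usage: at_or_below_average WINDOW < prices.txt
at_or_below_average() {
    awk -v n="$1" '
        { price[NR] = $1 + 0 }
        END {
            if (NR == 0) { print "no-data"; exit }
            start = (NR > n) ? NR - n + 1 : 1   # first line inside the window
            for (i = start; i <= NR; i++) sum += price[i]
            avg = sum / (NR - start + 1)
            print ((price[NR] <= avg) ? "match" : "no-match")
        }'
}

# Four sample prices: the average is 11.25 and the latest price is 9.
printf '%s\n' 10 12 14 9 | at_or_below_average 60
```

With real data you would feed it at least 60 daily closing prices per stock and keep the symbols that print `match`.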
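### Cloud Backup and Restore between Linux and Android JStock ###
Another great feature is JStock's support for cloud backup and restore. JStock can back up your portfolios and watchlists to Google Drive and restore them from there, which lets you move seamlessly between platforms. If you switch back and forth between JStock on two different platforms, this cross-platform backup and restore is very useful. I tested it with my JStock portfolio on a Linux desktop and an Android phone, and it worked beautifully. I saved my JStock portfolio to Google Drive on Android and then restored it in JStock on Linux. It would be even better if it synced to the cloud automatically, without my having to trigger backup and restore by hand; I hope this feature arrives.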
![](https://c2.staticflickr.com/2/1537/24163165565_bb47e04d6c_c.jpg)
![](https://c2.staticflickr.com/2/1556/23536385333_9ed1a75d72_c.jpg)
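If you cannot see your portfolios and watchlists after restoring from Google Drive, make sure that your country is set correctly under the "Country" menu.
The free version of JStock for Android is available from the [Google Play Store][4]. If you need the full feature set (e.g., cloud backup, alerts, charts), you need to upgrade to the premium version with a one-time payment. I think the premium version is well worth it.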
![](https://c2.staticflickr.com/2/1687/23867534720_18b917028c_c.jpg)
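As a final note, I should mention its author, Yan Cheng Cheok. He is a very active developer, so do report any bugs you find to him. All the credit goes to him!
What do you think of JStock as a portfolio tracking tool?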
--------------------------------------------------------------------------------
via: http://xmodulo.com/stock-portfolio-management-software-Linux.html
Author: [Dan Nanni][a]
Translator: [ivo-wang](https://github.com/ivo-wang)
Proofreader: [wxy](https://github.com/wxy)
This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://Linux.cn/)
[a]:http://xmodulo.com/author/nanni
[1]:http://jstock.org/
[2]:http://ask.xmodulo.com/install-java-runtime-Linux.html
[3]:http://jstock.org/ma_indicator.html
[4]:https://play.google.com/store/apps/details?id=org.yccheok.jstock.gui


@ -0,0 +1,101 @@
How to Install the Open-Source Discourse Forum on Ubuntu Linux 16.04
===============================================================================
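Discourse is an open-source discussion platform that can work as a mailing list, a chat room, or a forum. It is a popular, modern forum tool. On the server side it is built with Ruby on Rails and Postgres and uses Redis caching to reduce load times; on the client side it runs in any browser with JavaScript enabled. It is easy to customize and well structured, and it provides converter plugins to migrate your existing forums and bulletin boards, such as vBulletin, phpBB, Drupal, SMF, and so on. In this article we will learn how to install Discourse on Ubuntu.
It was designed with security in mind, so spammers and hackers cannot easily have their way. It works well on all modern devices and adjusts its display for phones and tablets accordingly.
### Installing Discourse on Ubuntu 16.04
Let's get started! You need at least 1 GB of RAM, and the officially supported installation procedure requires Docker to be installed. Alongside Docker, Git also needs to be installed. To meet both requirements, we only need to run the following command: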
```
wget -qO- https://get.docker.com/ | sh
```
![](http://linuxpitstop.com/wp-content/uploads/2016/06/124.png)
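Before moving on, it can be worth confirming that the prerequisites mentioned above are really in place. The check below is a hypothetical sketch, not part of the Discourse installer: the helper names are made up, and the 900000 kB threshold is an assumed stand-in for "about 1 GB" that leaves some slack for memory the kernel reserves.

```shell
# Hypothetical pre-flight helpers; not part of Discourse or Docker.

# Succeeds when TOTAL_KB is at least roughly 1 GB (assumed 900000 kB floor).
enough_memory() {
    [ "$1" -ge 900000 ]
}

# Prints the name of every given tool that is not found on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# /proc/meminfo is Linux-specific, so skip the check where it is absent.
if [ -r /proc/meminfo ]; then
    total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    if [ -n "$total_kb" ] && ! enough_memory "$total_kb"; then
        echo "warning: less than about 1 GB of RAM" >&2
    fi
fi

missing=$(missing_tools docker git)
[ -z "$missing" ] || echo "not installed yet:" $missing >&2
```

If the last line reports `git` as missing, install it from the Ubuntu repositories before cloning the Discourse repository.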
Before long, Docker and Git are installed. Once that finishes, create a folder for Discourse under /var on your system (you can of course choose a different location).
```
mkdir /var/discourse
```
Now clone the Discourse GitHub repository into this newly created folder.
```
git clone https://github.com/discourse/discourse_docker.git /var/discourse
```
Change into the cloned folder.
```
cd /var/discourse
```
![](http://linuxpitstop.com/wp-content/uploads/2016/06/314.png)
You will find a script named "discourse-setup" there; run it to start the Discourse setup.
```
./discourse-setup
```
**Note: Make sure you have a mail server set up before installing Discourse.**
The setup wizard will ask you the following six questions:
```
Hostname for your Discourse?
Email address for admin account?
SMTP server address?
SMTP user name?
SMTP port [587]:
SMTP password? []:
```
![](http://linuxpitstop.com/wp-content/uploads/2016/06/411.png)
After you submit this information, it asks you to confirm it. If everything looks right, press Enter and the installation begins.
![](http://linuxpitstop.com/wp-content/uploads/2016/06/511.png)
Now sit back and relax. The installation takes a while to complete, so grab a cup of coffee and keep an eye out for any error messages.
![](http://linuxpitstop.com/wp-content/uploads/2016/06/610.png)
A successful installation should look like this.
![](http://linuxpitstop.com/wp-content/uploads/2016/06/710.png)
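Now open your web browser. If DNS is already set up, you can reach the Discourse page with your domain name; otherwise use the IP address. You will see the following: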
![](http://linuxpitstop.com/wp-content/uploads/2016/06/85.png)
That's it. Click "Sign Up" to create a new account, and then proceed with your Discourse setup.
![](http://linuxpitstop.com/wp-content/uploads/2016/06/106.png)
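### Conclusion
It is easy to set up and works flawlessly. It has all the features a modern forum needs. It is released under the GPL and is fully open source. Simplicity, ease of use, and a rich feature set are its strongest points. We hope you enjoyed this article; if you have questions, leave us a comment.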
--------------------------------------------------------------------------------
via: http://linuxpitstop.com/install-discourse-on-ubuntu-linux-16-04/
Author: [Aun][a]
Translator: [kokialoves](https://github.com/kokialoves)
Proofreader: [wxy](https://github.com/wxy)
This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)
[a]: http://linuxpitstop.com/author/aun/


@ -0,0 +1,80 @@
GNU Khata: Open-Source Accounting Software
============================================
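Being an active Linux enthusiast, I often introduce my friends to Linux, help them choose the distribution that suits them best, and help them install open-source software that fits their work.
This time, though, I was stuck. My uncle is a freelance accountant, and he uses a set of polished, mature, paid software for his accounting work. I was not sure I could find an open-source replacement for it, until yesterday.
Abhishek suggested some [cool apps][1] to me, and GNU Khata stood out among them.
[GNU Khata][2] is an accounting tool. Or should I say a collection of accounting tools? It is like the [Evernote][3] of financial management. Its range of applications is so wide that it can handle anything from personal finance to large-scale business management, and from store inventory management to tax calculation.
A fun fact: the word "Khata" means "account" in India and other countries where Indic languages are spoken, hence the name GNU Khata.
### Installation
There are plenty of installation guides on the internet for the older, web-based version of Khata. At present, GNU Khata is available only for Debian/Ubuntu and their derivatives. I suggest you follow the steps given on the official GNU Khata website to install it. Let's go over them quickly.
- Download the installer from [here][4].
- Open a terminal in the download directory.
- Copy and paste the following commands into the terminal and run them.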
```
sudo chmod 755 GNUKhatasetup.run
sudo ./GNUKhatasetup.run
```
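If the `755` in the first command looks cryptic: each octal digit encodes the read (4), write (2), and execute (1) bits for the owner, the group, and everyone else, in that order. The small helper below spells that out; `perm_string` is a hypothetical name used only for this illustration.

```shell
# Hypothetical helper: expand an octal mode such as 755 into rwx notation.
perm_string() {
    mode=$1
    out=""
    while [ -n "$mode" ]; do
        digit=${mode%"${mode#?}"}   # first character of what is left
        mode=${mode#?}              # drop that character
        case "$digit" in
            0) out="${out}---" ;;
            1) out="${out}--x" ;;
            2) out="${out}-w-" ;;
            3) out="${out}-wx" ;;
            4) out="${out}r--" ;;
            5) out="${out}r-x" ;;
            6) out="${out}rw-" ;;
            7) out="${out}rwx" ;;
            *) echo "not an octal digit: $digit" >&2; return 1 ;;
        esac
    done
    printf '%s\n' "$out"
}

perm_string 755   # owner: rwx, group: r-x, others: r-x
```

So `chmod 755` makes the installer executable by everyone while leaving it writable only by its owner.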
That's it. Launch GNU Khata from the Dash or the application menu.
### First Start
GNU Khata opens in your browser and shows the following page.
![](https://itsfoss.com/wp-content/uploads/2016/07/GNU-khata-1.jpg)
Fill in the organization name, organization type, and financial year, then click the Proceed button to go to the admin setup page.
![](https://itsfoss.com/wp-content/uploads/2016/07/GNU-khata-2.jpg)
Carefully fill in your user name, password, security question and its answer, and click "create and login".
![](https://itsfoss.com/wp-content/uploads/2016/07/GNU-khata-3.jpg)
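You are all set. Use the menu bar to start managing your finances with GNU Khata. It is that easy.
### Removing GNU Khata
If you no longer want GNU Khata, you can remove it with the following command: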
```
sudo apt-get remove --auto-remove gnukhata-core-engine
```
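You can also remove it with the Synaptic package manager.
### Is GNU Khata really a competitor to the paid accounting apps on the market?
To begin with, GNU Khata is designed around simplicity. The menu bar at the top is conveniently organized and helps you work efficiently. You can manage different accounts and projects and switch between them easily. [Their website][5] states that GNU Khata can be "as convenient as speaking an Indian language" (LCTT translator's note: forgive me, the developers of this software and the author of this article are Indian...). Also, did you know that GNU Khata can be used in the cloud too?
All the mainstream accounting tools, such as ledgers, project statements, and financial statements, are organized in a professional manner, with support for customizable formats and instant display. It makes accounting and inventory management look easy.
The project is under active development and is looking for feedback from real-world use to help improve the software. Considering its maturity, its ease of use, and the fact that it is free, GNU Khata may become your best bookkeeping assistant.
Leave a comment and let us know what you think of GNU Khata.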
--------------------------------------------------------------------------------
via: https://itsfoss.com/using-gnu-khata/
Author: [Aquil Roshan][a]
Translator: [MikeCoder](https://github.com/MikeCoder)
Proofreader: [wxy](https://github.com/wxy)
This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)
[a]: https://itsfoss.com/author/aquil/
[1]: https://itsfoss.com/category/apps/
[2]: http://www.gnukhata.in/
[3]: https://evernote.com/
[4]: https://cloud.openmailbox.org/index.php/s/L8ppsxtsFq1345E/download
[5]: http://www.gnukhata.in/


@ -0,0 +1,36 @@
Experience Ubuntu in Your Browser
=====================================================
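[Canonical][1], the company behind [Ubuntu][2], has put a lot of effort into promoting Linux. No matter how much you may dislike Ubuntu, you have to acknowledge its impact on the ease of use of Linux. Ubuntu and its derivatives are the most widely used Linux distributions.
To promote Ubuntu Linux further, Canonical has put it in the browser, so you can use this [Ubuntu demo][0] from anywhere. It gives you a better feel for Ubuntu and makes it easier for newcomers to decide whether to use it.
You may argue that a live USB of Linux is better. I agree, but remember that you have to download an ISO, create a bootable USB drive, change configuration settings, and only then can you try it from the USB drive. Not everyone is willing to go through something that tedious. An online demo is a better option.
So what can you see in the Ubuntu online demo? Not much, actually.
You can browse files, use the Unity Dash, browse the Ubuntu Software Center, even install a few apps (they are not really installed, of course), and look at the file manager and a few other things. That is all. But in my opinion that is already well done: it shows you what Ubuntu is and gives you a direct feel for this popular operating system.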
![](https://itsfoss.com/wp-content/uploads/2016/07/Ubuntu-online-demo.jpeg)
![](https://itsfoss.com/wp-content/uploads/2016/07/Ubuntu-online-demo-1.jpeg)
![](https://itsfoss.com/wp-content/uploads/2016/07/Ubuntu-online-demo-2.jpeg)
If a friend or family member is interested in trying Linux but would like to experience it before installing, you can give them this link: [Ubuntu online tour][0].
--------------------------------------------------------------------------------
via: https://itsfoss.com/ubuntu-online-demo/
Author: [Abhishek Prakash][a]
Translator: [kokialoves](https://github.com/kokialoves)
Proofreader: [wxy](https://github.com/wxy)
This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)
[a]: https://itsfoss.com/author/abhishek/
[0]: http://tour.ubuntu.com/en/
[1]: http://www.canonical.com/
[2]: http://www.ubuntu.com/


@ -0,0 +1,90 @@
Set a Near Real-Time Photo of Earth as Your Linux Desktop Wallpaper
=================================================================
![](http://www.omgubuntu.co.uk/wp-content/uploads/2016/07/Screen-Shot-2016-07-26-at-16.36.47-1.jpg)
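Tired of looking at the same desktop background? Here is (possibly) the best thing in the world.
[Himawaripy][1] is a small Python 3 script that fetches near real-time photos of Earth taken by the [Japanese Himawari 8 weather satellite][2] and sets them as your desktop background.
Once installed, you can set it up as a cron job that runs every 10 minutes (in the background, naturally), so that it fetches near real-time photos of Earth and sets them as your background.
Because Himawari-8 is a geostationary satellite, you only ever see the view of Earth from above Australia. Still, its real-time weather patterns, cloud formations, and lighting make it spectacular, though for me a view over the UK would be even better!
Advanced settings let you configure the quality of the images fetched from the satellite, but keep in mind that higher quality means larger files and longer downloads!
Finally, although this script is similar to others we have covered, it is still maintained and working.
### Get Himawaripy
Himawaripy has been tested on a range of desktop environments, including Unity, LXDE, i3, MATE, and others. It is free, open-source software, but overall it is not entirely straightforward to install and configure.
You can find all the instructions for installing and setting up the app on the project's [GitHub page][0] (hint: there is no one-click install).
- [Real-time Earth wallpaper script on GitHub][0]
### Installation and Usage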
![](http://www.omgubuntu.co.uk/wp-content/uploads/2016/07/Screen-Shot-2016-07-26-at-16.46.13-750x143.png)
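Some readers asked me to add step-by-step installation instructions to this article. All of the steps below are on the GitHub page; I repeat them here.
1. Download and unzip Himawaripy
This is the easiest step. Click the download link below to get the latest version, then extract it into your Downloads folder.
- [Download Himawaripy master (.zip)][3]
2. Install python3-setuptools
You need to install this package by hand, because Ubuntu does not ship it by default: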
```
sudo apt install python3-setuptools
```
3. Install Himawaripy
In a terminal, change into the directory you extracted earlier and run the install command:
```
cd ~/Downloads/himawaripy-master
sudo python3 setup.py install
```
4. Check that it runs and downloads the latest real-time image:
```
himawaripy
```
5. Set up a cron job
Do this if you want the script to run and update automatically in the background (to update manually, just run himawaripy).
Run this in a terminal:
```
crontab -e
```
Add the following line to it (by default it runs every 10 minutes):
```
*/10 * * * * /usr/local/bin/himawaripy
```
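In that cron line, the five leading fields are minute, hour, day of month, month, and day of week; `*/10` in the minute field means "every 10 minutes". If you would rather not edit the crontab by hand, the entry can be appended with a small filter. This is a sketch under assumptions: `add_cron_line` is a hypothetical helper, and the final, commented-out command is the only part that would touch your real crontab.

```shell
# Hypothetical helper: copy a crontab from stdin to stdout and append LINE
# only if an identical line is not already present.
# Usage: add_cron_line 'LINE' < existing-crontab
add_cron_line() {
    awk -v line="$1" '
        $0 == line { seen = 1 }
        { print }
        END { if (!seen) print line }'
}

job='*/10 * * * * /usr/local/bin/himawaripy'

# Dry run against a made-up crontab with one unrelated entry.
printf '%s\n' '0 5 * * * /bin/true' | add_cron_line "$job"

# To apply it for real, uncomment the next line:
# crontab -l 2>/dev/null | add_cron_line "$job" | crontab -
```

Because the helper skips lines that already exist, running it twice does not create a duplicate entry.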
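You can find more information about [configuring cron jobs][4] on the Ubuntu Wiki.
Once the cron job is in place you do not need to keep running the script yourself; it runs in the background every ten minutes.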
--------------------------------------------------------------------------------
via: http://www.omgubuntu.co.uk/2016/07/set-real-time-earth-wallpaper-ubuntu-desktop
Author: [JOEY-ELIJAH SNEDDON][a]
Translator: [geekpi](https://github.com/geekpi)
Proofreader: [wxy](https://github.com/wxy)
This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)
[a]: https://plus.google.com/117485690627814051450/?rel=author
[1]: https://github.com/boramalper/himawaripy
[2]: https://en.wikipedia.org/wiki/Himawari_8
[0]: https://github.com/boramalper/himawaripy
[3]: https://github.com/boramalper/himawaripy/archive/master.zip
[4]: https://help.ubuntu.com/community/CronHowto


@ -1,29 +1,29 @@
用 VeraCrypt 加密闪存盘
============================================
很多安全专家偏好像 VeraCrypt 这类能够用来加密闪存盘的开源软件。因为获取它的源代码很简单
很多安全专家偏好像 VeraCrypt 这类能够用来加密闪存盘的开源软件,是因为可以获取到它的源代码
保护 USB闪存盘里的数据加密是一个聪明的方法正如我们在使用 Microsoft 的 BitLocker [加密闪存盘][1] 一文中提到的。
保护 USB 闪存盘里的数据,加密是一个聪明的方法,正如我们在使用 Microsoft 的 BitLocker [加密闪存盘][1] 一文中提到的。
但是如果你不想用 BitLocker 呢?
你可能有顾虑,因为你不能够查看 Microsoft 的程序源码,那么它容易被植入用于政府或其它用途的“后门”。由于开源软件的源码是公开的,很多安全专家认为开源软件很少藏有后门。
你可能有顾虑,因为你不能够查看 Microsoft 的程序源码,那么它容易被植入用于政府或其它用途的“后门”。由于开源软件的源码是公开的,很多安全专家认为开源软件很少藏有后门。
还好,有几个开源加密软件能作为 BitLocker 的替代。
要是你需要在 Windows 系统,苹果的 OS X 系统或者 Linux 系统上加密以及访问文件,开源软件 [VeraCrypt][2] 提供绝佳的选择。
VeraCrypt 源于 TrueCrypt。TrueCrypt是一个备受好评的开源加密软件尽管它现在已经停止维护了。但是 TrueCrypt 的代码通过了审核,没有发现什么重要的安全漏洞。另外,它已经在 VeraCrypt 中进行了改善。
VeraCrypt 源于 TrueCrypt。TrueCrypt 是一个备受好评的开源加密软件,尽管它现在已经停止维护了。但是 TrueCrypt 的代码通过了审核,没有发现什么重要的安全漏洞。另外,在 VeraCrypt 中对它进行了改善。
WindowsOS X 和 Linux 系统的版本都有。
用 VeraCrypt 加密 USB 闪存盘不像用 BitLocker 那么简单,但是它只要几分钟就好了。
用 VeraCrypt 加密 USB 闪存盘不像用 BitLocker 那么简单,但是它只要几分钟就好了。
### 用 VeraCrypt 加密闪存盘的 8 个步骤
对应操作系统 [下载 VeraCrypt][3] 之后:
对应你的操作系统 [下载 VeraCrypt][3] 之后:
打开 VeraCrypt点击 Create Volume进入 VeraCrypt 的创建卷的向导程序VeraCrypt Volume Creation WizardVeraCrypt Volume Creation Wizard 首字母全大写,不清楚是否需要翻译,之后有很多首字母大写的词,都以括号标出)
打开 VeraCrypt点击 Create Volume进入 VeraCrypt 的创建卷的向导程序VeraCrypt Volume Creation Wizard
![](http://www.esecurityplanet.com/imagesvr_ce/6246/Vera0.jpg)
@ -39,7 +39,7 @@ VeraCrypt 创建卷向导VeraCrypt Volume Creation Wizard允许你在闪
![](http://www.esecurityplanet.com/imagesvr_ce/9427/Vera3.jpg)
选择创建卷模式Volume Creation Mode。如果你的闪存盘是空的或者你想要删除它里面的所有东西选第一个。要么你想保持所有现存的文件选第二个就好了。
选择创建卷模式Volume Creation Mode。如果你的闪存盘是空的或者你想要删除它里面的所有东西选第一个。要么你想保持所有现存的文件选第二个就好了。
![](http://www.esecurityplanet.com/imagesvr_ce/7828/Vera4.jpg)
@ -47,7 +47,7 @@ VeraCrypt 创建卷向导VeraCrypt Volume Creation Wizard允许你在闪
![](http://www.esecurityplanet.com/imagesvr_ce/5918/Vera5.jpg)
确定了卷容量后,输入并确认你想要用来加密数据密码。
确定了卷容量后,输入并确认你想要用来加密数据密码。
![](http://www.esecurityplanet.com/imagesvr_ce/3850/Vera6.jpg)
@ -71,7 +71,7 @@ VeraCrypt 创建卷向导VeraCrypt Volume Creation Wizard允许你在闪
### VeraCrypt 移动硬盘安装步骤
如果你设置闪存盘的时候,选择的是加密过的容器而不是加密整个盘,你可以选择创建 VeraCrypt 称为移动盘Traveler Disk的设备。这会复制安装一个 VeraCrypt USB 闪存盘。当你在别的 Windows 电脑上插入 U 盘时,就能从 U 盘自动运行 VeraCrypt也就是说没必要在新电脑上安装 VeraCrypt。
如果你设置闪存盘的时候,选择的是加密过的容器而不是加密整个盘,你可以选择创建 VeraCrypt 称为移动盘Traveler Disk的设备。这会复制安装一个 VeraCrypt USB 闪存盘。当你在别的 Windows 电脑上插入 U 盘时,就能从 U 盘自动运行 VeraCrypt也就是说没必要在新电脑上安装 VeraCrypt。
你可以设置闪存盘作为一个移动硬盘Traveler Disk在 VeraCrypt 的工具栏Tools菜单里选择 Traveler Disk SetUp 就行了。
@ -79,15 +79,15 @@ VeraCrypt 创建卷向导VeraCrypt Volume Creation Wizard允许你在闪
要从移动盘Traveler Disk上运行 VeraCrypt你必须要有那台电脑的管理员权限这不足为奇。尽管这看起来是个限制机密文件无法在不受控制的电脑上安全打开比如在一个商务中心的电脑上。
>Paul Rubens 从事技术行业已经超过 20 年。这期间他为英国和国际主要的出版社,包括 《The Economist》《The Times》《Financial Times》《The BBC》《Computing》和《ServerWatch》等出版社写过文章
> 本文作者 Paul Rubens 从事技术行业已经超过 20 年。这期间他为英国和国际主要的出版社,包括 《The Economist》《The Times》《Financial Times》《The BBC》《Computing》和《ServerWatch》等出版社写过文章
--------------------------------------------------------------------------------
via: http://www.esecurityplanet.com/open-source-security/how-to-encrypt-flash-drive-using-veracrypt.html
作者:[Paul Rubens ][a]
作者:[Paul Rubens][a]
译者:[GitFuture](https://github.com/GitFuture)
校对:[校对者ID](https://github.com/校对者ID)
校对:[wxy](https://github.com/wxy)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出


@ -0,0 +1,118 @@
Git Series (1): What Is Git
===========
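Welcome to this series on how to use the Git version control system! In this introduction, you will learn what Git is for and who should use it.
If you are just starting out in the open-source world, you are likely to come across open-source software that hosts its code in Git or publishes releases from it. In fact, whether you know it or not, you are using software that is managed with Git: the Linux kernel (which runs the websites you visit, even if you do not use Linux on your phone or computer), Firefox, Chrome, and many more projects share their code with developers around the world through Git repositories.
On the other hand, can you share your own code with others using Git alone? Can you use Git privately, at home or at work? Do you have to have a GitHub account to use Git? Why use Git at all? What are its benefits? Is Git my only option? All these questions about Git can leave your head spinning.
So forget what you think you know about Git, and let's start afresh.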
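### What is a version control system?
Git is, first and foremost, a version control system. There are many version control systems out there: CVS, SVN, Mercurial, Fossil, and, of course, Git.
Many services, such as GitHub and GitLab, are built on top of Git, but you can use Git on its own without any additional service. This means you can use Git privately or publicly.
If you have ever collaborated with anyone on a digital file, you know how traditional version management goes. It starts out simple: you have your original version, you send it to your collaborators, and they make changes to the copy they received. Now there are two versions. They then send their modified version back to you. You merge their changes into your copy, and the two versions become one up-to-date version again.
Then you change your latest version while, at the same time, your collaborators change the pre-merge version they still have. Now there are three different versions: the merged one, the one you changed, and the one your collaborators kept changing. From here on, keeping track of versions gets more and more chaotic.
As Jason van Gumster points out in his article, [Even artists need version control][1], and this trend has already been observed in individual settings. Whether you are an artist or a scientist, it is not unusual to develop an experimental version of something; your project may have one version that succeeds spectacularly and takes the project to new heights, and another that fails miserably. So you inevitably end up with a pile of files named project\_justTesting.kdenlive, project\_betterVersion.kdenlive, project\_best\_FINAL.kdenlive, project\_FINAL-alternateVersion.kdenlive, and so on.
Whether you are changing a for loop or making some simple text edits, a good version control system makes life easier.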
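### Git snapshots
Git takes snapshots of a project and stores those snapshots as unique versions.
If you take your project in a wrong direction, you can roll back to the last good version and try another approach.
If you are collaborating, then when someone sends you their changes you can merge them into your working branch, and your collaborators can then grab the merged, latest version and continue working from there.
Git is not magic, so conflicts do happen ("You changed the last line of the file, but I deleted that whole line; how do we reconcile that?"), but on the whole Git keeps the full history of every change for you, and even allows parallel versions. This leaves you able to resolve conflicts in whatever way you choose.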
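### Distributed Git
Working on the same project from different machines is complicated. When you start working, you want the latest version of the project, you then make your changes on top of it, and finally you share those changes with your collaborators. The traditional way is to use clunky online file-sharing services or old-fashioned email attachments, both of which are inefficient and error-prone.
Git is designed for distributed work. If you want to take part in a project, you can clone the project's Git repository and then work on it as though your local copy were the only one in existence. Then, with a few simple commands, you can pull in other contributors' changes, or push your own changes to someone else. There is no more worrying about who has the latest version or where whose version lives. Everyone develops locally and pushes to or pulls from a common target (or not a common target, depending on how the project chooses to work).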
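The clone, push, and pull cycle described above can be tried entirely on one machine. The following is a minimal sketch under assumptions: "alice" and "bob" are made-up collaborators, `shared.git` is a local bare repository standing in for a hosted one, and everything lives in a throwaway temporary directory.

```shell
# Self-contained demo; prints Bob's copy of notes.txt after he pulls.
git_roundtrip_demo() {
    command -v git >/dev/null 2>&1 || { echo "git is not installed" >&2; return 0; }

    demo_dir=$(mktemp -d) || return 1
    # Identity passed per command so the demo does not depend on global config.
    as_alice="-c user.name=Alice -c user.email=alice@example.com"

    # The shared repository that both collaborators talk to.
    git init -q --bare "$demo_dir/shared.git"

    # Alice clones the (still empty) repository, adds a file, and pushes it.
    git clone -q "$demo_dir/shared.git" "$demo_dir/alice" 2>/dev/null
    (
        cd "$demo_dir/alice" || exit 1
        echo "first line" > notes.txt
        git add notes.txt
        git $as_alice commit -q -m "Add notes"
        git push -q origin HEAD
    ) || return 1

    # Bob clones afterwards and receives Alice's file.
    git clone -q "$demo_dir/shared.git" "$demo_dir/bob"

    # Alice keeps working and pushes a second change.
    (
        cd "$demo_dir/alice" || exit 1
        echo "second line" >> notes.txt
        git $as_alice commit -q -a -m "Extend notes"
        git push -q origin HEAD
    ) || return 1

    # Bob pulls to catch up; no files are mailed around.
    ( cd "$demo_dir/bob" && git pull -q --ff-only ) || return 1

    cat "$demo_dir/bob/notes.txt"
    rm -rf "$demo_dir"
}

git_roundtrip_demo
```

With a hosted service, the only difference is that the path to `shared.git` becomes the repository's URL.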
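### Git interfaces
In its original form, Git is an application that runs in a Linux terminal. However, because Git is open source and well designed, developers around the world have been able to build other ways of accessing it.
Git is completely free and comes packaged for Linux, BSD, Illumos, and other Unix-like systems. A Git command looks like this: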
```
$ git --version
git version 2.5.3
```
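Probably the best-known Git interfaces are web-based: sites such as GitHub, the open-source GitLab, Savannah, BitBucket, and SourceForge all offer a web front end to Git. These sites provide code hosting for public, socially developed open-source software. To a degree, a browser-based graphical interface (GUI) flattens Git's learning curve. Here is a screenshot of the GitLab interface: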
![](https://opensource.com/sites/default/files/0_gitlab.png)
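Furthermore, third-party Git service providers or independent developers can build custom, non-HTML front ends on top of Git. Such interfaces let you use Git for version control without opening a browser. The most transparent integration is direct support in a file manager. The KDE file manager, Dolphin, can show the Git status of a directory and even supports commit, push, and pull operations.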
![](https://opensource.com/sites/default/files/0_dolphin.jpg)
[Sparkleshare][2] uses Git as the foundation for its Dropbox-style file-sharing interface.
![](https://opensource.com/sites/default/files/0_sparkleshare_1.jpg)
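To learn more, see the [Git wiki][3], whose (long) page lists many Git GUI projects.
### Who should use Git?
You should! The real questions are when to use Git, and what to use it for.
### When should I use Git, and what should I use it for?
To get the most out of Git, we need to think a little more than usual about file formats.
Git is designed to manage source code, which in most programming languages means lines of text. Of course, Git does not know whether you regard that text as source code or as the next Great American Novel. So as long as a file consists of text, Git is a great choice for tracking and managing its versions.
But then again, what is text? If you write something in an office application such as Libre Office, you are usually not producing raw text. Complex applications generally wrap the raw text in a layer, for instance XML markup, and then package it in a Zip container. This wrapping ensures that when you send the file to someone else, they see what you wrote along with its formatting. Strangely, though, things you might expect to be very complex, such as a [Kdenlive][4] project file or an SVG exported from [Inkscape][5], are in fact raw XML text, which Git can manage easily.
If you use Unix, you can check what a file is made of with the `file` command: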
```
$ file ~/path/to/my-file.blah
my-file.blah: ASCII text
$ file ~/path/to/different-file.kra
different-file.kra: Zip data (MIME type "application/x-krita")
```
If you are still unsure, you can view the contents of the file with the `head` command:
```
$ head ~/path/to/my-file.blah
```
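If the output is text you can mostly read, the file is probably a text file. If you only spot the occasional familiar character amid a mass of garbage, it is probably not.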
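The same "does it look like text?" judgment can be approximated automatically by checking for NUL bytes, which plain text files normally do not contain. The helper below is a rough sketch: `looks_like_text` is a hypothetical name, the 8000-byte window is an assumption chosen for this example, and the check is similar in spirit to, but not a copy of, the heuristics that tools such as Git apply.

```shell
# Hypothetical helper: succeed when the first 8000 bytes contain no NUL byte.
looks_like_text() {
    [ -f "$1" ] || return 2
    total=$(head -c 8000 "$1" | wc -c)
    without_nul=$(head -c 8000 "$1" | tr -d '\000' | wc -c)
    # If deleting NUL bytes changed nothing, there were none to begin with.
    [ "$((total))" -eq "$((without_nul))" ]
}

# Try it on two throwaway files: one plain text, one with Zip-like bytes.
tmp=$(mktemp -d)
printf 'plain text\n' > "$tmp/notes.txt"
printf 'PK\003\004\000\000' > "$tmp/archive.bin"
looks_like_text "$tmp/notes.txt" && echo "notes.txt: text"
looks_like_text "$tmp/archive.bin" || echo "archive.bin: binary"
rm -rf "$tmp"
```

Files that pass this check are the ones for which Git can show meaningful line-by-line differences.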
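To be precise, Git can manage other file formats, but it treats them as binary large objects (blobs). The difference is that with a text file, Git can tell you exactly that, say, three lines changed between two snapshots (or commits). But if you edit a photo between two commits, how would Git express that change? It cannot, really, because an image is not made of meaningful text that can be added or removed. I dearly wish photo editing were as easy as changing the text "\<sky>ugly greenish-blue\</sky>" to "\<sky>blue with fluffy clouds\</sky>", but it is not.
People check binary blobs such as PNG icons, spreadsheets, or flowcharts into Git all the time. Managing such files in Git is not very intuitive, but if you need to do it, there is no need to worry too much. If your project produces both text files and large binary blobs (a common scenario in video games, where the graphic and audio assets matter as much as the source code), you have two options: either invent your own solution, such as pointers to a shared network drive, or use a Git add-on such as Joey Hess's [git annex][6] or the [Git-Media][7] project.
As you can see, Git really is for everyone. It is a powerful and capable tool for versioning your files, and it is not as scary as you may first have thought.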
--------------------------------------------------------------------------------
via: https://opensource.com/resources/what-is-git
Author: [Seth Kenlon][a]
Translator: [cvsher](https://github.com/cvsher)
Proofreader: [wxy](https://github.com/wxy)
This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)
[a]: https://opensource.com/users/seth
[1]: https://opensource.com/life/16/2/version-control-isnt-just-programmers
[2]: http://sparkleshare.org/
[3]: https://git.wiki.kernel.org/index.php/InterfacesFrontendsAndTools#Graphical_Interfaces
[4]: https://opensource.com/life/11/11/introduction-kdenlive
[5]: http://inkscape.org/
[6]: https://git-annex.branchable.com/
[7]: https://github.com/alebedev/git-media


@ -2,20 +2,23 @@
=========================
![](https://opensource.com/sites/default/files/styles/image-full-size/public/images/life/get_started_lead.jpeg?itok=r22AKc6P)
> 图片来源opensource.com
在这个系列的介绍中,我们学习到了谁应该使用 Git以及 Git 是用来做什么的。今天,我们将学习如何克隆公共的 Git 仓库,以及如何提取出独立的文件而不用克隆整个仓库。
*图片来源opensource.com*
在这个系列的[介绍篇][4]中,我们学习到了谁应该使用 Git以及 Git 是用来做什么的。今天,我们将学习如何克隆公共 Git 仓库,以及如何提取出独立的文件而不用克隆整个仓库。
由于 Git 如此流行,因而如果你能够至少熟悉一些基础的 Git 知识也能为你的生活带来很多便捷。如果你可以掌握 Git 基础(你可以的,我发誓!),那么你将能够下载任何你需要的东西,甚至还可能做一些贡献作为回馈。毕竟,那就是开源的精髓所在:你拥有获取你使用的软件代码的权利,拥有和他人分享的自由,以及只要你愿意就可以修改它的权利。只要你熟悉了 Git它就可以让这一切都变得很容易。
那么,让我们一起来熟悉 Git 吧。
### 读和写
一般来说,有两种方法可以和 Git 仓库交互:你可以从仓库中读取,或者你也能够向仓库中写入。它就像一个文件:有时候你打开一个文档只是为了阅读它,而其它时候你打开文档是因为你需要做些改动。
本文仅讲解如何从 Git 仓库读取。我们将会在后面的一篇文章中讲解如何向 Git 仓库写回的主题。
### Git 还是 GitHub
一句话澄清Git 不同于 GitHub或 GitLab或 Bitbucket。Git 是一个命令行程序,所以它就像下面这样:
```
@ -31,29 +34,32 @@ usage: Git [--version] [--help] [-C <path>]
我的文章系列将首先教你纯粹的 Git 知识,因为一旦你理解了 Git 在做什么,那么你就无需关心正在使用的前端工具是什么了。然而,我的文章系列也将涵盖通过流行的 Git 服务完成每项任务的常用方法,因为那些将可能是你首先会遇到的。
### 安装 Git
在 Linux 系统上,你可以从所使用的发行版软件仓库中获取并安装 Git。BSD 用户应当在 Ports 树的 devel 部分查找 Git。
对于闭源的操作系统,请前往 [项目网站][1] 并根据说明安装。一旦安装后,在 Linux、BSD 和 Mac OS X 上的命令应当没有任何差别。Windows 用户需要调整 Git 命令,从而和 Windows 文件系统相匹配,或者安装 Cygwin 以原生的方式运行 Git而不受 Windows 文件系统转换问题的羁绊。
对于闭源的操作系统,请前往其[项目官网][1],并根据说明安装。一旦安装后,在 Linux、BSD 和 Mac OS X 上的命令应当没有任何差别。Windows 用户需要调整 Git 命令,从而和 Windows 文件系统相匹配,或者安装 Cygwin 以原生的方式运行 Git而不受 Windows 文件系统转换问题的羁绊。
### Git 下午茶
### 下午茶和 Git
并非每个人都需要立刻将 Git 加入到我们的日常生活中。有些时候,你和 Git 最多的交互就是访问一个代码库,下载一两个文件,然后就不用它了。以这样的方式看待 Git它更像是下午茶而非一次正式的宴会。你进行一些礼节性的交谈获得了需要的信息然后你就会离开至少接下来的三个月你不再想这样说话。
当然,那是可以的。
一般来说,有两种方法访问 Git使用命令行或者使用一种神奇的因特网技术通过 web 浏览器快速轻松地访问。
假设你想要在终端中安装并使用一个回收站,因为你已经被 rm 命令毁掉太多次了。你已经听说过 Trashy 了,它称自己为「理智的 rm 命令媒介」,并且你想在安装它之前阅读它的文档。幸运的是,[Trashy 公开地托管在 GitLab.com][2]。
假设你想要给终端安装一个回收站,因为你已经被 rm 命令毁掉太多次了。你可能听说过 Trashy ,它称自己为「理智的 rm 命令中间人」,也许你想在安装它之前阅读它的文档。幸运的是,[Trashy 公开地托管在 GitLab.com][2]。
### Landgrab
我们工作的第一步是对这个 Git 仓库使用 landgrab 排序方法:我们会克隆这个完整的仓库,然后会根据内容排序。由于该仓库是托管在公共的 Git 服务平台上,所以有两种方式来完成工作:使用命令行,或者使用 web 界面。
要想使用 Git 获取整个仓库,就要使用 git clone 命令和 Git 仓库的 URL 作为参数。如果你不清楚正确的 URL 是什么仓库应该会告诉你的。GitLab 为你提供了 [Trashy][3] 仓库的拷贝-粘贴 URL。
要想使用 Git 获取整个仓库,就要使用 git clone 命令和 Git 仓库的 URL 作为参数。如果你不清楚正确的 URL 是什么仓库应该会告诉你的。GitLab 为你提供了 [Trashy][3] 仓库的用于拷贝粘贴 URL。
![](https://opensource.com/sites/default/files/1_gitlab-url.jpg)
你也许注意到了,在某些服务平台上,会同时提供 SSH 和 HTTPS 链接。只有当你拥有仓库的写权限时,你才可以使用 SSH。否则的话你必须使用 HTTPS URL。
一旦你获得了正确的 URL克隆仓库是非常容易的。就是 git clone 这个 URL 即可,可选项是可以指定要克隆到的目录。默认情况下会将 git 目录克隆到你当前所在的位置;例如,'trashy.git' 表示将仓库克隆到你当前位置的 'trashy' 目录。我使用 .clone 扩展名标记那些只读的仓库,使用 .git 扩展名标记那些我可以读写的仓库,但那无论如何也不是官方要求的。
一旦你获得了正确的 URL克隆仓库是非常容易的。就是 git clone 该 URL 即可,以及一个可选的指定要克隆到的目录。默认情况下会将 git 目录克隆到你当前所在的目录;例如,'trashy.git' 将会克隆到你当前位置的 'trashy' 目录。我使用 .clone 扩展名标记那些只读的仓库,使用 .git 扩展名标记那些我可以读写的仓库,不过这并不是官方要求的。
```
$ git clone https://gitlab.com/trashy/trashy.git trashy.clone
@ -68,30 +74,34 @@ Checking connectivity... done.
一旦成功地克隆了仓库,你就可以像对待你电脑上任何其它目录那样浏览仓库中的文件。
另外一种获得仓库拷贝的方式是使用 web 界面。GitLab 和 GitHub 都会提供一个 .zip 格式的仓库快照文件。GitHub 有一个大的绿色下载按钮,但是在 GitLab 中,可以浏览器的右侧找到并不显眼的下载按钮。
另外一种获得仓库拷贝的方式是使用 web 界面。GitLab 和 GitHub 都会提供一个 .zip 格式的仓库快照文件。GitHub 有一个大的绿色下载按钮,但是在 GitLab 中,可以浏览器的右侧找到并不显眼的下载按钮。
![](https://opensource.com/sites/default/files/1_gitlab-zip.jpg)
### 仔细挑选
另外一种从 Git 仓库中获取文件的方法是找到你想要的文件,然后把它从仓库中拽出来。只有 web 界面才提供这种方法,本质上来说,你看到的是别人仓库的克隆;你可以把它想象成一个 HTTP 共享目录。
另外一种从 Git 仓库中获取文件的方法是找到你想要的文件,然后把它从仓库中拽出来。只有 web 界面才提供这种方法,本质上来说,你看到的是别人的仓库克隆;你可以把它想象成一个 HTTP 共享目录。
使用这种方法的问题是,你也许会发现某些文件并不存在于原始仓库中,因为完整形式的文件可能只有在执行 make 命令后才能构建,那只有你下载了完整的仓库,阅读了 README 或者 INSTALL 文件,然后运行相关命令之后才会产生。不过,假如你确信文件存在,而你只想进入仓库,获取那个文件,然后离开的话,你就可以那样做。
在 GitLab 和 GitHub 中,单击文件链接,并在 Raw 模式下查看,然后使用你的 web 浏览器的保存功能,例如:在 Firefox 中,文件 > 保存页面为。在一个 GitWeb 仓库中(一些更喜欢自己托管 git 的人使用的私有 git 仓库 web 查看器Raw 查看链接在文件列表视图中。
在 GitLab 和 GitHub 中,单击文件链接,并在 Raw 模式下查看,然后使用你的 web 浏览器的保存功能,例如:在 Firefox 中,文件 \> 保存页面为。在一个 GitWeb 仓库中(这是个某些更喜欢自己托管 git 的人使用的私有 git 仓库 web 查看器Raw 查看链接在文件列表视图中。
![](https://opensource.com/sites/default/files/1_webgit-file.jpg)
### 最佳实践
通常认为,和 Git 交互的正确方式是克隆完整的 Git 仓库。这样认为是有几个原因的。首先,可以使用 git pull 命令轻松地使克隆仓库保持更新,这样你就不必在每次文件改变时就重回 web 站点获得一份全新的拷贝。第二,你碰巧需要做些改进,只要保持仓库整洁,那么你可以非常轻松地向原来的作者提交所做的变更。
现在,可能是时候练习查找感兴趣的 Git 仓库,然后将它们克隆到你的硬盘中了。只要你了解使用终端的基础知识,那就不会太难做到。还不知道终端使用基础吗?那再给多我 5 分钟时间吧。
现在,可能是时候练习查找感兴趣的 Git 仓库,然后将它们克隆到你的硬盘中了。只要你了解使用终端的基础知识,那就不会太难做到。还不知道基本的终端使用方式吗?那再给多我 5 分钟时间吧。
### 终端使用基础
首先要知道的是,所有的文件都有一个路径。这是有道理的;如果我让你在常规的非终端环境下为我打开一个文件,你就要导航到文件在你硬盘的位置,并且直到你找到那个文件,你要浏览一大堆窗口。例如,你也许要点击你的家目录 > 图片 > InktoberSketches > monkey.kra。
在那样的场景下,我们可以说文件 monkeysketch.kra 的路径是:$HOME/图片/InktoberSketches/monkey.kra。
在那样的场景下,文件 monkeysketch.kra 的路径是:$HOME/图片/InktoberSketches/monkey.kra。
在终端中,除非你正在处理一些特殊的系统管理员任务,你的文件路径通常是以 $HOME 开头的(或者,如果你很懒,就使用 ~ 字符),后面紧跟着一些列的文件夹直到文件名自身。
这就和你在 GUI 中点击各种图标直到找到相关的文件或文件夹类似。
如果你想把 Git 仓库克隆到你的文档目录,那么你可以打开一个终端然后运行下面的命令:
@ -100,6 +110,7 @@ Checking connectivity... done.
$ git clone https://gitlab.com/foo/bar.git
$HOME/文档/bar.clone
```
一旦克隆完成,你可以打开一个文件管理器窗口,导航到你的文档文件夹,然后你就会发现 bar.clone 目录正在等待着你访问。
如果你想要更高级点,你或许会在以后再次访问那个仓库,可以尝试使用 git pull 命令来查看项目有没有更新:
@ -111,7 +122,7 @@ bar.clone
$ git pull
```
到目前为止你需要初步了解的所有终端命令就是那些了那就去探索吧。你实践得越多Git 掌握得就越好(孰能生巧),那就是游戏的名称,至少给了或取了一个元音
到目前为止你需要初步了解的所有终端命令就是那些了那就去探索吧。你实践得越多Git 掌握得就越好(熟能生巧),这是重点,也是事情的本质
--------------------------------------------------------------------------------
@ -119,7 +130,7 @@ via: https://opensource.com/life/16/7/stumbling-git
作者:[Seth Kenlon][a]
译者:[ChrisLeeGit](https://github.com/chrisleegit)
校对:[校对者ID](https://github.com/校对者ID)
校对:[wxy](https://github.com/wxy)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创翻译,[Linux中国](https://linux.cn/) 荣誉推出
@ -127,4 +138,4 @@ via: https://opensource.com/life/16/7/stumbling-git
[1]: https://git-scm.com/download
[2]: https://gitlab.com/trashy/trashy
[3]: https://gitlab.com/trashy/trashy.git
[4]: https://linux.cn/article-7639-1.html


@ -0,0 +1,86 @@
Android vs. iPhone: Pros and Cons
===================================
>When comparing Android vs. iPhone, clearly Android has certain advantages even as the iPhone is superior in some key ways. But ultimately, which is better?
The question of Android vs. iPhone is a personal one.
Take myself, for example. I'm someone who has used both Android and the iPhone iOS. I'm well aware of the strengths of both platforms along with their weaknesses. Because of this, I decided to share my perspective regarding these two mobile platforms. Additionally, we'll take a look at my impressions of the new Ubuntu mobile platform and where it stacks up.
### What iPhone gets right
Even though I'm a full-time Android user these days, I do recognize the areas where the iPhone got it right. First, Apple has a better record in updating their devices. This is especially true for older devices running iOS. With Android, if it's not a "Google blessed" Nexus...it had better be a higher-end, carrier-supported phone. Otherwise, you're going to find updates are either sparse or non-existent.
Another area where the iPhone does well is apps availability. Expanding on that: iPhone apps almost always have a cleaner look to them. This isn't to say that Android apps are ugly, rather, they may not have an expected flow and consistency found with iOS. Two examples of exclusivity and great iOS-only layout would have to be [Dark Sky][1] (weather) and [Facebook Paper][2].
Then there is the backup process. Android can, by default, back stuff up to Google. But that doesn't help much with application data! By contrast, iCloud can essentially make a full backup of your iOS device.
### Where iPhone loses me
The biggest indisputable issue I have with the iPhone is more of a hardware limitation than a software one. That issue is storage.
Look, with most Android phones, I can buy a smaller capacity phone and then add an SD card later. This does two things: First, I can use the SD card to store a lot of media files. Second, I can even use the SD card to store "some" of my apps. Apple has nothing that will touch this.
Another area where the iPhone loses me is in the lack of choice it provides. Backing up your device? Hope you like iTunes or iCloud. For someone like myself who uses Linux, this means my ONLY option would be to use iCloud.
To be ultimately fair, there are additional solutions for your iPhone if you're willing to jailbreak it. But that's not what this article is about. Same goes for rooting Android. This article is addressing a vanilla setup for both platforms.
Finally, let us not forget this little treat: [iTunes decides to delete a user's music][3] because it was seen as a duplication of Apple Music contents...or something along those lines. Not iPhone specific? I disagree, as that music would have very well ended up on the iPhone at some point. I can say with great certainty that in no universe would I ever put up with this kind of nonsense!
![](http://www.datamation.com/imagesvr_ce/5552/mobile-abstract-icon-200x150.jpg)
>The Android vs. iPhone debate depends on what features matter the most to you.
### What Android gets right
The biggest thing Android gives me that the iPhone doesn't: choice. Choices in applications, devices and overall layout of how my phone works.
I love desktop widgets! To iPhone users, they may seem really silly. But I can tell you that they save me from opening up applications as I can see the desired data without the extra hassle. Another similar feature I love is being able to install custom launchers instead of my phone's default!
Finally, I can utilize tools like [Airdroid][4] and [Tasker][5] to add full computer-like functionality to my smart phone. Airdroid allows me to treat my Android phone like a computer, with file management and SMS with anyone; this becomes a breeze to use with my mouse and keyboard. Tasker is awesome in that I can set up "recipes" to connect/disconnect, put my phone into meeting mode or even put itself into power saving mode when I set the parameters to do so. I can even set it to launch applications when I arrive at specific destinations.
### Where Android loses me
Backup options are limited to specific user data, not a full clone of your phone. Without rooting, you're either left out in the wind or you must look to the Android SDK for solutions. Expecting casual users to either root their phone or run the SDK for a complete (I mean everything) Android backup is a joke.
Yes, Google's backup service will back up Google app data, along with other related customizations. But it's nowhere near as complete as what we see with the iPhone. To accomplish something similar to what the iPhone enjoys, I've found you're going to either be rooting your Android phone or connecting it to a Windows PC to utilize some random program.
To be fair, however, I believe Nexus owners benefit from a [full backup service][6] that is device specific. Sorry, but Google's default backup is not cutting it. The same applies to adb backups via your PC: they don't always restore things as expected.
Wait, it gets better. After a lot of letdowns and frustration, I found one app that looked like it "might" offer a glimmer of hope: Helium. Unlike other applications I found to be misleading and frustrating with their limitations, [Helium][7] initially looked like the backup application Google should have been offering all along, with the emphasis on "looked like." Sadly, it was a huge letdown. Not only did I need to connect it to my computer for a first run, it didn't even work using their provided Linux script. After removing their script, I settled for a good old-fashioned adb backup...to my Linux PC. Fun facts: You will need to turn on a laundry list of stuff in developer tools, plus if you run the Twilight app, that needs to be turned off. It took me a bit to put this together when the backup option for adb on my phone wasn't responding.
At the end of the day, Android has ample options for non-rooted users to back up superficial stuff like contacts, SMS and other data easily. But in my experience, a deep-down phone backup is best left to a wired connection and adb.
### Ubuntu will save us?
With the good and the bad examined between the two major players in the mobile space, there's a lot of hope that we're going to see good things from Ubuntu on the mobile front. Well, thus far, it's been pretty lackluster.
I like what the developers are doing with the OS and I certainly love the idea of a third option for mobile besides iPhone and Android. Unfortunately, though, it's not that popular on the phone and the tablet received a lot of bad press due to subpar hardware and a lousy demonstration that made its way onto YouTube.
To be fair, I've had subpar experiences with iPhone and Android, too, in the past. So this isn't a dig on Ubuntu. But until it starts showing up with a ready to go ecosystem of functionality that matches what Android and iOS offer, it's not something I'm terribly interested in yet. At a later date, perhaps, I'll feel like the Ubuntu phones are ready to meet my needs.
### Android vs. iPhone bottom line: Why Android wins long term
Despite its painful shortcomings, Android treats me like an adult. It doesn't lock me into only two methods for backing up my data. Yes, some of Android's limitations are due to the fact that it's focused on letting me choose how to handle my data. But I also get to choose my own device and add storage on a whim. Android enables me to do a lot of cool stuff that the iPhone simply isn't capable of doing.
At its core, Android gives non-root users greater access to the phone's functionality. For better or worse, it's a level of freedom that I think people are gravitating towards. Now there are going to be many of you who swear by the iPhone thanks to efforts like the [libimobiledevice][8] project. But take a long hard look at all the stuff Apple blocks Linux users from doing...then ask yourself is it really worth it as a Linux user? Hit the Comments, share your thoughts on Android, iPhone or Ubuntu.
------------------------------------------------------------------------------
via: http://www.datamation.com/mobile-wireless/android-vs.-iphone-pros-and-cons.html
Author: [Matt Hartley][a]
Translator: [Translator ID](https://github.com/译者ID)
Proofreader: [Proofreader ID](https://github.com/校对者ID)
This article was originally translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)
[a]: http://www.datamation.com/author/Matt-Hartley-3080.html
[1]: http://darkskyapp.com/
[2]: https://www.facebook.com/paper/
[3]: https://blog.vellumatlanta.com/2016/05/04/apple-stole-my-music-no-seriously/
[4]: https://www.airdroid.com/
[5]: http://tasker.dinglisch.net/
[6]: https://support.google.com/nexus/answer/2819582?hl=en
[7]: https://play.google.com/store/apps/details?id=com.koushikdutta.backup&hl=en
[8]: http://www.libimobiledevice.org/

View File

@ -1,3 +1,5 @@
MikeCoder Translating...
What containers and unikernels can learn from Arduino and Raspberry Pi
==========================================================================

View File

@ -1,3 +1,5 @@
translating by maywanting
5 SSH Hardening Tips
======================

View File

@ -0,0 +1,7 @@
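Team translation of 《Building a data science portfolio: Machine learning project》
Organizer for this round: @选题-oska874
Participating translators: @译者-vim-kakali @译者-Noobfish @译者-zky001 @译者-kokialoves @译者-ideas4u @译者-cposture
Allocation: the original is split into 6 parts of roughly equal length. Participants choose freely, first come first served. Contact @选题-oska874 with any questions.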

View File

@ -0,0 +1,79 @@
>This is the third in a series of posts on how to build a Data Science Portfolio. If you like this and want to know when the next post in the series is released, you can [subscribe at the bottom of the page][1].
Data science companies are increasingly looking at portfolios when making hiring decisions. One of the reasons for this is that a portfolio is the best way to judge someone's real-world skills. The good news for you is that a portfolio is entirely within your control. If you put some work in, you can make a great portfolio that companies are impressed by.
The first step in making a high-quality portfolio is to know what skills to demonstrate. The primary skills that companies want in data scientists, and thus the primary skills they want a portfolio to demonstrate, are:
- Ability to communicate
- Ability to collaborate with others
- Technical competence
- Ability to reason about data
- Motivation and ability to take initiative
Any good portfolio will be composed of multiple projects, each of which may demonstrate 1-2 of the above points. This is the third post in a series that will cover how to make a well-rounded data science portfolio. In this post, we'll cover how to make the second project in your portfolio, and how to build an end to end machine learning project. At the end, you'll have a project that shows your ability to reason about data, and your technical competence. [Here's][2] the completed project if you want to take a look.
### An end to end project
As a data scientist, there are times when you'll be asked to take a dataset and figure out how to [tell a story with it][3]. In times like this, it's important to communicate very well, and walk through your process. Tools like Jupyter notebook, which we used in a previous post, are very good at helping you do this. The expectation here is that the deliverable is a presentation or document summarizing your findings.
However, there are other times when you'll be asked to create a project that has operational value. A project with operational value directly impacts the day-to-day operations of a company, and will be used more than once, and often by multiple people. A task like this might be “create an algorithm to forecast our churn rate”, or “create a model that can automatically tag our articles”. In cases like this, storytelling is less important than technical competence. You need to be able to take a dataset, understand it, then create a set of scripts that can process that data. It's often important that these scripts run quickly, and use minimal system resources like memory. It's very common that these scripts will be run several times, so the deliverable becomes the scripts themselves, not a presentation. The deliverable is often integrated into operational flows, and may even be user-facing.
The main components of building an end to end project are:
- Understanding the context
- Exploring the data and figuring out the nuances
- Creating a well-structured project, so it's easy to integrate into operational flows
- Writing high-performance code that runs quickly and uses minimal system resources
- Documenting the installation and usage of your code well, so others can use it
In order to effectively create a project of this kind, we'll need to work with multiple files. Using a text editor like [Atom][4], or an IDE like [PyCharm][5] is highly recommended. These tools will allow you to jump between files, and edit files of different types, like markdown files, Python files, and csv files. Structuring your project so it's easy to version control and upload to collaborative coding tools like [Github][6] is also useful.
![](https://www.dataquest.io/blog/images/end_to_end/github.png)
>This project on Github.
We'll use our editing tools along with libraries like [Pandas][7] and [scikit-learn][8] in this post. We'll make extensive use of Pandas [DataFrames][9], which make it easy to read in and work with tabular data in Python.
### Finding good datasets
A good dataset for an end to end portfolio project can be hard to find. [The dataset][10] needs to be sufficiently large that memory and performance constraints come into play. It also needs to potentially be operationally useful. For instance, this dataset, which contains data on the admission criteria, graduation rates, and graduate future earnings for US colleges, would be a great dataset to use to tell a story. However, as you think about the dataset, it becomes clear that there isn't enough nuance to build a good end to end project with it. For example, you could tell someone their potential future earnings if they went to a specific college, but that would be a quick lookup without enough nuance to demonstrate technical competence. You could also figure out if colleges with higher admissions standards tend to have graduates who earn more, but that would be more storytelling than operational.
These memory and performance constraints tend to come into play when you have more than a gigabyte of data, and when you have some nuance to what you want to predict, which involves running algorithms over the dataset.
A good operational dataset enables you to build a set of scripts that transform the data, and answer dynamic questions. A good example would be a dataset of stock prices. You would be able to predict the prices for the next day, and keep feeding new data to the algorithm as the markets closed. This would enable you to make trades, and potentially even profit. This wouldn't be telling a story; it would be adding direct value.
Some good places to find datasets like this are:
- [/r/datasets][11]: a subreddit that has hundreds of interesting datasets.
- [Google Public Datasets][12]: public datasets available through Google BigQuery.
- [Awesome datasets][13]: a list of datasets, hosted on Github.
As you look through these datasets, think about what questions someone might want answered with the dataset, and think if those questions are one-time (“how did housing prices correlate with the S&P 500?”), or ongoing (“can you predict the stock market?”). The key here is to find questions that are ongoing, and require the same code to be run multiple times with different inputs (different data).
For the purposes of this post, we'll look at [Fannie Mae Loan Data][14]. Fannie Mae is a government sponsored enterprise in the US that buys mortgage loans from other lenders. It then bundles these loans up into mortgage-backed securities and resells them. This enables lenders to make more mortgage loans, and creates more liquidity in the market. This theoretically leads to more homeownership, and better loan terms. From a borrower's perspective, things stay largely the same, though.
Fannie Mae releases two types of data: data on loans it acquires, and data on how those loans perform over time. In the ideal case, someone borrows money from a lender, then repays the loan until the balance is zero. However, some borrowers miss multiple payments, which can cause foreclosure. Foreclosure is when the house is seized by the bank because mortgage payments cannot be made. Fannie Mae tracks which loans have missed payments on them, and which loans needed to be foreclosed on. This data is published quarterly, and lags the current date by 1 year. As of this writing, the most recent dataset that's available is from the first quarter of 2015.
Acquisition data, which is published when the loan is acquired by Fannie Mae, contains information on the borrower, including credit score, and information on their loan and home. Performance data, which is published every quarter after the loan is acquired, contains information on the payments being made by the borrower, and the foreclosure status, if any. A loan that is acquired may have dozens of rows in the performance data. A good way to think of this is that the acquisition data tells you that Fannie Mae now controls the loan, and the performance data contains a series of status updates on the loan. One of the status updates may tell us that the loan was foreclosed on during a certain quarter.
![](https://www.dataquest.io/blog/images/end_to_end/foreclosure.jpg)
>A foreclosed home being sold.
### Picking an angle
There are a few directions we could go in with the Fannie Mae dataset. We could:
- Try to predict the sale price of a house after it's foreclosed on.
- Predict the payment history of a borrower.
- Figure out a score for each loan at acquisition time.
The important thing is to stick to a single angle. Trying to focus on too many things at once will make it hard to make an effective project. It's also important to pick an angle that has sufficient nuance. Here are examples of angles without much nuance:
- Figuring out which banks sold loans to Fannie Mae that were foreclosed on the most.
- Figuring out trends in borrower credit scores.
- Exploring which types of homes are foreclosed on most often.
- Exploring the relationship between loan amounts and foreclosure sale prices
All of the above angles are interesting, and would be great if we were focused on storytelling, but aren't great fits for an operational project.
With the Fannie Mae dataset, we'll try to predict whether a loan will be foreclosed on in the future by only using information that was available when the loan was acquired. In effect, we'll create a “score” for any mortgage that will tell us if Fannie Mae should buy it or not. This will give us a nice foundation to build on, and will be a great portfolio piece.

View File

@ -0,0 +1,114 @@
### Understanding the data
Let's take a quick look at the raw data files. Here are the first few rows of the acquisition data from quarter 1 of 2012:
```
100000853384|R|OTHER|4.625|280000|360|02/2012|04/2012|31|31|1|23|801|N|C|SF|1|I|CA|945||FRM|
100003735682|R|SUNTRUST MORTGAGE INC.|3.99|466000|360|01/2012|03/2012|80|80|2|30|794|N|P|SF|1|P|MD|208||FRM|788
100006367485|C|PHH MORTGAGE CORPORATION|4|229000|360|02/2012|04/2012|67|67|2|36|802|N|R|SF|1|P|CA|959||FRM|794
```
Here are the first few rows of the performance data from quarter 1 of 2012:
```
100000853384|03/01/2012|OTHER|4.625||0|360|359|03/2042|41860|0|N||||||||||||||||
100000853384|04/01/2012||4.625||1|359|358|03/2042|41860|0|N||||||||||||||||
100000853384|05/01/2012||4.625||2|358|357|03/2042|41860|0|N||||||||||||||||
```
Before proceeding too far into coding, it's useful to take some time and really understand the data. This is more critical in operational projects: because we aren't interactively exploring the data, it can be harder to spot certain nuances unless we find them upfront. In this case, the first step is to read the materials on the Fannie Mae site:
- [Overview][15]
- [Glossary of useful terms][16]
- [FAQs][17]
- [Columns in the Acquisition and Performance files][18]
- [Sample Acquisition data file][19]
- [Sample Performance data file][20]
After reading through these files, we know some key facts that will help us:
- There's an Acquisition file and a Performance file for each quarter, starting from the year 2000 to present. There's a 1 year lag in the data, so the most recent data is from 2015 as of this writing.
- The files are in text format, with a pipe (|) as a delimiter.
- The files don't have headers, but we have a list of what each column is.
- All together, the files contain data on 22 million loans.
- Because the Performance files contain information on loans acquired in previous years, there will be more performance data for loans acquired in earlier years (i.e. loans acquired in 2014 won't have much performance history).
These small bits of information will save us a ton of time as we figure out how to structure our project and work with the data.
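As a quick illustration of the pipe-delimited, header-less layout, here is a minimal sketch that previews a few rows with Pandas. The three sample rows are trimmed to four fields, and the column names are assumptions made for this toy example rather than the full Fannie Mae layout.

```python
import io

import pandas as pd

# Made-up rows in the same pipe-delimited, header-less style
# (only four illustrative fields, not the real file layout).
sample = io.StringIO(
    "100000853384|R|OTHER|4.625\n"
    "100003735682|R|SUNTRUST MORTGAGE INC.|3.99\n"
    "100006367485|C|PHH MORTGAGE CORPORATION|4\n"
)

# Hypothetical column names for this toy sample.
names = ["id", "channel", "seller", "interest_rate"]

# sep="|" splits on the pipe, header=None says the first line is data,
# and nrows=2 stops early, which is handy for peeking at a large file.
preview = pd.read_csv(sample, sep="|", header=None, names=names, nrows=2)
print(preview)
```

Passing a real file path instead of the `StringIO` object should work the same way, as long as the number of names matches the number of fields.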
### Structuring the project
Before we start downloading and exploring the data, it's important to think about how we'll structure the project. When building an end-to-end project, our primary goals are:
- Creating a solution that works
- Having a solution that runs quickly and uses minimal resources
- Enabling others to easily extend our work
- Making it easy for others to understand our code
- Writing as little code as possible
In order to achieve these goals, we'll need to structure our project well. A well structured project follows a few principles:
- Separates data files and code files.
- Separates raw data from generated data.
- Has a README.md file that walks people through installing and using the project.
- Has a requirements.txt file that contains all the packages needed to run the project.
- Has a single settings.py file that contains any settings that are used in other files.
    - For example, if you are reading the same file from multiple Python scripts, it's useful to have them all import settings and get the file name from a centralized place.
- Has a .gitignore file that prevents large or secret files from being committed.
- Breaks each step in our task into a separate file that can be executed separately.
    - For example, we may have one file for reading in the data, one for creating features, and one for making predictions.
- Stores intermediate values. For example, one script may output a file that the next script can read.
    - This enables us to make changes in our data processing flow without recalculating everything.
Our file structure will look something like this shortly:
```
loan-prediction
├── data
├── processed
├── .gitignore
├── README.md
├── requirements.txt
├── settings.py
```
### Creating the initial files
To start with, we'll need to create a loan-prediction folder. Inside that folder, we'll need to make a data folder and a processed folder. The first will store our raw data, and the second will store any intermediate calculated values.
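If you prefer to script this step, a minimal sketch with the standard library's `pathlib` might look like the following. The `make_skeleton` helper is a hypothetical name introduced here for illustration; creating the folders by hand works just as well.

```python
from pathlib import Path


def make_skeleton(root="loan-prediction"):
    """Create the project folder with its data and processed subfolders."""
    root = Path(root)
    for name in ("data", "processed"):
        # parents=True also creates the root folder if needed;
        # exist_ok=True makes the function safe to run more than once.
        (root / name).mkdir(parents=True, exist_ok=True)
    return root
```

Calling `make_skeleton()` from the directory where the project should live creates all three folders in one go.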
Next, we'll make a .gitignore file. A .gitignore file will make sure certain files are ignored by git and not pushed to Github. One good example of such a file is the .DS_Store file created by OSX in every folder. A good starting point for a .gitignore file is here. We'll also want to ignore the data files because they are very large, and the Fannie Mae terms prevent us from redistributing them, so we should add two lines to the end of our file:
```
data
processed
```
[Here's][21] an example .gitignore file for this project.
Next, we'll need to create README.md, which will help people understand the project. .md indicates that the file is in markdown format. Markdown enables you to write plain text, but also add some fancy formatting if you want. [Here's][22] a guide on markdown. If you upload a file called README.md to Github, Github will automatically process the markdown, and show it to anyone who views the project. [Here's][23] an example.
For now, we just need to put a simple description in README.md:
```
Loan Prediction
-----------------------
Predict whether or not loans acquired by Fannie Mae will go into foreclosure. Fannie Mae acquires loans from other lenders as a way of inducing them to lend more. Fannie Mae releases data on the loans it has acquired and their performance afterwards [here](http://www.fanniemae.com/portal/funding-the-market/data/loan-performance-data.html).
```
Now, we can create a requirements.txt file. This will make it easy for other people to install our project. We don't know exactly what libraries we'll be using yet, but here's a good starting point:
```
pandas
matplotlib
scikit-learn
numpy
ipython
scipy
```
The above libraries are the most commonly used for data analysis tasks in Python, and it's fair to assume that we'll be using most of them. [Here's][24] an example requirements file for this project.
After creating requirements.txt, you should install the packages. For this post, we'll be using Python 3. If you don't have Python installed, you should look into using [Anaconda][25], a Python installer that also installs all the packages listed above.
Finally, we can just make a blank settings.py file, since we don't have any settings for our project yet.

View File

@ -0,0 +1,194 @@
### Acquiring the data
Once we have the skeleton of our project, we can get the raw data.
Fannie Mae has some restrictions around acquiring the data, so you'll need to sign up for an account. You can find the download page [here][26]. After creating an account, you'll be able to download as few or as many loan data files as you want. The files are in zip format, and are reasonably large after decompression.
For the purposes of this blog post, we'll download everything from Q1 2012 to Q1 2015, inclusive. We'll then need to unzip all of the files. After unzipping the files, remove the original .zip files. At the end, the loan-prediction folder should look something like this:
```
loan-prediction
├── data
│ ├── Acquisition_2012Q1.txt
│ ├── Acquisition_2012Q2.txt
│ ├── Performance_2012Q1.txt
│ ├── Performance_2012Q2.txt
│ └── ...
├── processed
├── .gitignore
├── README.md
├── requirements.txt
├── settings.py
```
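The unzip-and-remove step described above could also be scripted. The sketch below uses only the standard library. The `unzip_and_remove` helper is a hypothetical name, and it assumes the downloaded archives sit directly in the data folder with a `.zip` extension.

```python
import zipfile
from pathlib import Path


def unzip_and_remove(data_dir="data"):
    """Extract every .zip archive in data_dir, then delete the archive."""
    extracted = []
    for archive in sorted(Path(data_dir).glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(data_dir)
            extracted.extend(zf.namelist())
        # Only reached if extraction finished without raising.
        archive.unlink()
    return extracted
```

The function returns the names of the extracted files, which makes it easy to confirm that every quarter you downloaded is present.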
After downloading the data, you can use the head and tail shell commands to look at the lines in the files. Do you see any columns that aren't needed? It might be useful to consult the [pdf of column names][27] while doing this.
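If the shell commands aren't available (on Windows, for example), a rough Python equivalent can be sketched with the standard library. The `head` and `tail` functions below are illustrative helpers, not part of the project.

```python
from collections import deque
from itertools import islice


def head(path, n=5):
    """Return the first n lines of a file without reading the rest."""
    with open(path, "r") as f:
        return [line.rstrip("\n") for line in islice(f, n)]


def tail(path, n=5):
    """Return the last n lines, streaming the file and keeping only n lines in memory."""
    with open(path, "r") as f:
        return [line.rstrip("\n") for line in deque(f, maxlen=n)]
```

Both functions iterate over the file handle rather than loading the whole file, which matters for the larger Performance files.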
### Reading in the data
There are two issues that make our data hard to work with right now:
- The acquisition and performance datasets are segmented across multiple files.
- Each file is missing headers.
Before we can get started on working with the data, we'll need to get to the point where we have one file for the acquisition data, and one file for the performance data. Each of the files will need to contain only the columns we care about, and have the proper headers. One wrinkle here is that the performance data is quite large, so we should try to trim some of the columns if we can.
The first step is to add some variables to settings.py, which will contain the paths to our raw data and our processed data. We'll also add a few other settings that will be useful later on:
```
DATA_DIR = "data"
PROCESSED_DIR = "processed"
MINIMUM_TRACKING_QUARTERS = 4
TARGET = "foreclosure_status"
NON_PREDICTORS = [TARGET, "id"]
CV_FOLDS = 3
```
Putting the paths in settings.py will put them in a centralized place and make them easy to change down the line. When referring to the same variables in multiple files, it's easier to put them in a central place than edit them in every file when you want to change them. [Here's][28] an example settings.py file for this project.
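To make the benefit concrete, here is a small sketch of how other scripts might build paths from the centralized settings. So that it runs on its own, it uses a `SimpleNamespace` as a stand-in for `import settings`; the `raw_path` and `processed_path` helpers are hypothetical names added for illustration.

```python
import os
from types import SimpleNamespace

# Stand-in for `import settings`, so this sketch is self-contained.
settings = SimpleNamespace(DATA_DIR="data", PROCESSED_DIR="processed")


def raw_path(filename):
    """Build the path to a raw data file from the central setting."""
    return os.path.join(settings.DATA_DIR, filename)


def processed_path(filename):
    """Build the path to a generated file from the central setting."""
    return os.path.join(settings.PROCESSED_DIR, filename)


print(raw_path("Acquisition_2012Q1.txt"))
print(processed_path("Acquisition.txt"))
```

If the data folder is ever renamed, only the value in settings.py changes and every script picks up the new location.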
The second step is to create a file called assemble.py that will assemble all the pieces into 2 files. When we run python assemble.py, we'll get 2 data files in the processed directory.
We'll then start writing code in assemble.py. We'll first need to define the headers for each file, so we'll need to look at the [pdf of column names][29] and create lists of the columns in each Acquisition and Performance file:
```
HEADERS = {
"Acquisition": [
"id",
"channel",
"seller",
"interest_rate",
"balance",
"loan_term",
"origination_date",
"first_payment_date",
"ltv",
"cltv",
"borrower_count",
"dti",
"borrower_credit_score",
"first_time_homebuyer",
"loan_purpose",
"property_type",
"unit_count",
"occupancy_status",
"property_state",
"zip",
"insurance_percentage",
"product_type",
"co_borrower_credit_score"
],
"Performance": [
"id",
"reporting_period",
"servicer_name",
"interest_rate",
"balance",
"loan_age",
"months_to_maturity",
"maturity_date",
"msa",
"delinquency_status",
"modification_flag",
"zero_balance_code",
"zero_balance_date",
"last_paid_installment_date",
"foreclosure_date",
"disposition_date",
"foreclosure_costs",
"property_repair_costs",
"recovery_costs",
"misc_costs",
"tax_costs",
"sale_proceeds",
"credit_enhancement_proceeds",
"repurchase_proceeds",
"other_foreclosure_proceeds",
"non_interest_bearing_balance",
"principal_forgiveness_balance"
]
}
```
The next step is to define the columns we want to keep. Since all we're measuring on an ongoing basis about the loan is whether or not it was ever foreclosed on, we can discard many of the columns in the performance data. We'll need to keep all the columns in the acquisition data, though, because we want to maximize the information we have about when the loan was acquired (after all, we're predicting if the loan will ever be foreclosed or not at the point it's acquired). Discarding columns will enable us to save disk space and memory, while also speeding up our code.
```
SELECT = {
"Acquisition": HEADERS["Acquisition"],
"Performance": [
"id",
"foreclosure_date"
]
}
```
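To get a feel for the memory saved by discarding columns, the following sketch compares a toy DataFrame before and after selection. The three rows and the four column names are made up for illustration; the real Performance files have many more columns, so the difference would be larger.

```python
import io

import pandas as pd

# Made-up performance rows with four hypothetical columns.
raw = io.StringIO(
    "1|03/01/2012|OTHER|\n"
    "1|04/01/2012||\n"
    "2|03/01/2012|OTHER|05/01/2012\n"
)
names = ["id", "reporting_period", "servicer_name", "foreclosure_date"]
keep = ["id", "foreclosure_date"]

full = pd.read_csv(raw, sep="|", header=None, names=names)
# Same selection style as SELECT: index the DataFrame with a list of names.
trimmed = full[keep]

# deep=True also counts the bytes held by string values.
full_bytes = full.memory_usage(deep=True).sum()
trimmed_bytes = trimmed.memory_usage(deep=True).sum()
print(full_bytes, trimmed_bytes)
```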
Next, we'll write a function to concatenate the data sets. The below code will:
- Import a few needed libraries, including settings.
- Define a function concatenate, that:
    - Gets the names of all the files in the data directory.
    - Loops through each file.
        - If the file isn't the right type (doesn't start with the prefix we want), we ignore it.
        - Reads the file into a [DataFrame][30] with the right settings using the Pandas [read_csv][31] function.
            - Sets the separator to | so the fields are read in correctly.
            - The data has no header row, so sets header to None to indicate this.
            - Sets names to the right value from the HEADERS dictionary; these will be the column names of our DataFrame.
        - Picks only the columns from the DataFrame that we added in SELECT.
    - Concatenates all the DataFrames together.
    - Writes the concatenated DataFrame back to a file.
```
import os
import settings
import pandas as pd

def concatenate(prefix="Acquisition"):
    files = os.listdir(settings.DATA_DIR)
    full = []
    for f in files:
        # Skip files that belong to the other dataset.
        if not f.startswith(prefix):
            continue
        data = pd.read_csv(os.path.join(settings.DATA_DIR, f), sep="|", header=None, names=HEADERS[prefix], index_col=False)
        # Keep only the columns listed in SELECT.
        data = data[SELECT[prefix]]
        full.append(data)
    # Stack the per-quarter DataFrames into one.
    full = pd.concat(full, axis=0)
    full.to_csv(os.path.join(settings.PROCESSED_DIR, "{}.txt".format(prefix)), sep="|", header=SELECT[prefix], index=False)
We can call the above function twice with the arguments Acquisition and Performance to concatenate all the acquisition and performance files together. The below code will:
- Only execute if the script is called from the command line with python assemble.py.
- Concatenate all the files, and result in two files:
- `processed/Acquisition.txt`
- `processed/Performance.txt`
```
if __name__ == "__main__":
    concatenate("Acquisition")
    concatenate("Performance")
We now have a nice, compartmentalized assemble.py that's easy to execute, and easy to build off of. By decomposing the problem into pieces like this, we make it easy to build our project. Instead of one messy script that does everything, we define the data that will pass between the scripts, and make them completely separate from each other. When you're working on larger projects, it's a good idea to do this, because it makes it much easier to change individual pieces without having unexpected consequences on unrelated pieces of the project.
Once we finish the assemble.py script, we can run python assemble.py. You can find the complete assemble.py file [here][32].
This will result in two files in the processed directory:
```
loan-prediction
├── data
│ ├── Acquisition_2012Q1.txt
│ ├── Acquisition_2012Q2.txt
│ ├── Performance_2012Q1.txt
│ ├── Performance_2012Q2.txt
│ └── ...
├── processed
│ ├── Acquisition.txt
│ ├── Performance.txt
├── .gitignore
├── assemble.py
├── README.md
├── requirements.txt
├── settings.py
```

View File

@ -0,0 +1,84 @@
vim-kakali translating
### Computing values from the performance data
The next step we'll take is to calculate some values from processed/Performance.txt. All we want to do is to predict whether or not a property is foreclosed on. To figure this out, we just need to check if the performance data associated with a loan ever has a foreclosure_date. If foreclosure_date is None, then the property was never foreclosed on. In order to avoid including loans with little performance history in our sample, we'll also want to count up how many rows exist in the performance file for each loan. This will let us filter loans without much performance history from our training data.
One way to think of the loan data and the performance data is like this:
![](https://github.com/LCTT/wiki-images/blob/master/TranslateProject/ref_img/001.png)
As you can see above, each row in the Acquisition data can be related to multiple rows in the Performance data. In the Performance data, foreclosure_date will appear in the quarter when the foreclosure happened, so it should be blank prior to that. Some loans are never foreclosed on, so all the rows related to them in the Performance data have foreclosure_date blank.
We need to compute foreclosure_status, which is a Boolean that indicates whether a particular loan id was ever foreclosed on, and performance_count, which is the number of rows in the performance data for each loan id.
There are a few different ways to compute the counts we want:
- We could read in all the performance data, then use the Pandas groupby method on the DataFrame to figure out the number of rows associated with each loan id, and also if the foreclosure_date is ever not None for the id.
    - The upside of this method is that it's easy to implement from a syntax perspective.
    - The downside is that reading in all 129236094 lines in the data will take a lot of memory, and be extremely slow.
- We could read in all the performance data, then use apply on the acquisition DataFrame to find the counts for each id.
    - The upside is that it's easy to conceptualize.
    - The downside is that reading in all 129236094 lines in the data will take a lot of memory, and be extremely slow.
- We could iterate over each row in the performance dataset, and keep a separate dictionary of counts.
    - The upside is that the dataset doesn't need to be loaded into memory, so it's extremely fast and memory-efficient.
    - The downside is that it will take slightly longer to conceptualize and implement, and we need to parse the rows manually.
Loading in all the data will take quite a bit of memory, so let's go with the third option above. All we need to do is to iterate through all the rows in the Performance data, while keeping a dictionary of counts per loan id. In the dictionary, we'll keep track of how many times the id appears in the performance data, as well as if foreclosure_date is ever not None. This will give us foreclosure_status and performance_count.
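For comparison, the first option (the groupby approach that this post sets aside) might look roughly like the sketch below on a tiny made-up sample. It is only meant to show the two values being computed, and it would need the whole file in memory on the real data.

```python
import io

import pandas as pd

# Made-up stand-in for processed/Performance.txt: a header row plus four rows.
sample = io.StringIO(
    "id|foreclosure_date\n"
    "1|\n"
    "1|05/01/2013\n"
    "2|\n"
    "2|\n"
)
performance = pd.read_csv(sample, sep="|")

grouped = performance.groupby("id")["foreclosure_date"]
summary = pd.DataFrame({
    # size() counts every row for the id, blank date or not.
    "performance_count": grouped.size(),
    # count() only counts non-missing dates, so > 0 means "ever foreclosed".
    "foreclosure_status": grouped.count() > 0,
})
print(summary)
```

Loan 1 has one row with a foreclosure date, so its status is True; loan 2 never has one, so its status is False. Both loans have a count of 2.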
We'll create a new file called annotate.py, and add in code that will enable us to compute these values. In the below code, we'll:
- Import needed libraries.
- Define a function called count_performance_rows.
    - Open processed/Performance.txt. This doesn't read the file into memory, but instead opens a file handler that can be used to read in the file line by line.
    - Loop through each line in the file.
        - Split the line on the delimiter (|)
        - Check if the loan_id is not in the counts dictionary.
            - If not, add it to counts.
        - Increment performance_count for the given loan_id because we're on a row that contains it.
        - If date is not blank, then we know that the loan was foreclosed on, so set foreclosure_status appropriately.
```
import os
import settings
import pandas as pd

def count_performance_rows():
    counts = {}
    with open(os.path.join(settings.PROCESSED_DIR, "Performance.txt"), 'r') as f:
        for i, line in enumerate(f):
            if i == 0:
                # Skip header row
                continue
            loan_id, date = line.split("|")
            loan_id = int(loan_id)
            if loan_id not in counts:
                counts[loan_id] = {
                    "foreclosure_status": False,
                    "performance_count": 0
                }
            counts[loan_id]["performance_count"] += 1
            # A non-blank date means the loan was foreclosed on.
            if len(date.strip()) > 0:
                counts[loan_id]["foreclosure_status"] = True
    return counts
### Getting the values
Once we create our counts dictionary, we can make a function that will extract values from the dictionary if a loan_id and a key are passed in:
```
def get_performance_summary_value(loan_id, key, counts):
    # Fall back to "no foreclosure, zero rows" for loans missing from counts
    value = counts.get(loan_id, {
        "foreclosure_status": False,
        "performance_count": 0
    })
    return value[key]
```
The above function will return the appropriate value from the counts dictionary, and will enable us to assign a foreclosure_status value and a performance_count value to each row in the Acquisition data. The [get][33] method on dictionaries returns a default value if a key isn't found, so this enables us to return sensible default values if a key isn't found in the counts dictionary.
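As a quick illustration of that default behavior, the sketch below looks up one loan that is present and one that is not. The loan ids and counts are invented for the example.

```python
# Hypothetical counts for a single loan; the numbers are invented.
counts = {100000853384: {"foreclosure_status": True, "performance_count": 12}}
default = {"foreclosure_status": False, "performance_count": 0}

known = counts.get(100000853384, default)   # id is present, so its entry is returned
missing = counts.get(999, default)          # id is absent, so the default is returned

print(known["performance_count"])
print(missing["foreclosure_status"])
```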

View File

@ -0,0 +1,143 @@
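### Annotating the data
We've already added a few functions to annotate.py, so now we can turn to the data itself. We need to convert the Acquisition data into a training dataset that can be used to train a machine learning algorithm. This involves a few things:
- Converting all columns to numeric.
- Filling in any missing values.
- Assigning a performance_count and a foreclosure_status to each row.
- Removing any rows that don't have much performance history (where performance_count is low).
Several of our columns are strings, which don't look useful to a machine learning algorithm. However, they are actually categorical variables that hold a few different category codes, such as R, S, and so on. We can convert these category labels to numeric values.
Columns converted this way can be used in a machine learning algorithm.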
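To make the idea concrete, here is a minimal plain-Python sketch of label-to-integer encoding. The annotate function in this section relies on pandas (`astype('category').cat.codes`) for this; the `to_codes` helper and the sample values below are hypothetical and only show the idea.

```python
def to_codes(values):
    """Map each distinct label to an integer, numbering labels in sorted order."""
    categories = sorted(set(values))
    lookup = {label: code for code, label in enumerate(categories)}
    return [lookup[v] for v in values], categories

channel = ["R", "S", "R", "C", "S"]  # hypothetical channel values
codes, categories = to_codes(channel)
print(categories)
print(codes)
```

The integers carry no numeric meaning of their own; they are just stable identifiers for each category.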
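Some of the columns also contain dates (first_payment_date and origination_date). We can split each of these dates into two columns, one for the month and one for the year.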
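A minimal sketch of that split, assuming the `MM/YYYY` format seen in the raw Acquisition rows; the `split_month_year` helper is a hypothetical name used only here.

```python
def split_month_year(value):
    """Split an 'MM/YYYY' string into an integer (month, year) pair."""
    month, year = value.split("/")
    return int(month), int(year)

first_payment_date = "04/2012"  # format assumed from the sample Acquisition rows
month, year = split_month_year(first_payment_date)
print(month, year)
```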
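In the code below, we'll transform the Acquisition data. We'll define a function that:
- Creates a foreclosure_status column in acquisition.
- Creates a performance_count column in acquisition.
- Converts each of the following columns from a string column to an integer column:
    - channel
    - seller
    - first_time_homebuyer
    - loan_purpose
    - property_type
    - occupancy_status
    - property_state
    - product_type
- Converts first_payment_date and origination_date to two columns each:
    - Splits the column on the forward slash.
    - Assigns the first part of the split to a month column.
    - Assigns the second part of the split to a year column.
    - Deletes the original column.
    - At the end, we'll have first_payment_month, first_payment_year, origination_month, and origination_year.
- Fills any missing values with -1.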
```
def annotate(acquisition, counts):
    acquisition["foreclosure_status"] = acquisition["id"].apply(lambda x: get_performance_summary_value(x, "foreclosure_status", counts))
    acquisition["performance_count"] = acquisition["id"].apply(lambda x: get_performance_summary_value(x, "performance_count", counts))
    # Convert string (categorical) columns to integer codes
    for column in [
        "channel",
        "seller",
        "first_time_homebuyer",
        "loan_purpose",
        "property_type",
        "occupancy_status",
        "property_state",
        "product_type"
    ]:
        acquisition[column] = acquisition[column].astype('category').cat.codes
    # Split each MM/YYYY date column into separate month and year columns
    for start in ["first_payment", "origination"]:
        column = "{}_date".format(start)
        acquisition["{}_year".format(start)] = pd.to_numeric(acquisition[column].str.split('/').str.get(1))
        acquisition["{}_month".format(start)] = pd.to_numeric(acquisition[column].str.split('/').str.get(0))
        del acquisition[column]
    acquisition = acquisition.fillna(-1)
    # Keep only loans with enough performance history
    acquisition = acquisition[acquisition["performance_count"] > settings.MINIMUM_TRACKING_QUARTERS]
    return acquisition
```
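### Pulling everything together
We're almost ready to pull everything together; we just need to add a bit more code to annotate.py. In the code below, we:
- Define a function to read in the Acquisition data.
- Define a function to write the processed data to processed/train.csv.
- If this file is run from the command line, like python annotate.py:
    - Read in the Acquisition data.
    - Compute the counts for the Performance data.
    - Annotate the Acquisition data.
    - Write the annotated data to train.csv.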
```
def read():
    acquisition = pd.read_csv(os.path.join(settings.PROCESSED_DIR, "Acquisition.txt"), sep="|")
    return acquisition

def write(acquisition):
    acquisition.to_csv(os.path.join(settings.PROCESSED_DIR, "train.csv"), index=False)

if __name__ == "__main__":
    acquisition = read()
    counts = count_performance_rows()
    acquisition = annotate(acquisition, counts)
    write(acquisition)
```
Once you're done updating the file, make sure to run it with python annotate.py to generate the train.csv file. You can find the complete annotate.py file [here][34].
The folder should now look like this:
```
loan-prediction
├── data
│   ├── Acquisition_2012Q1.txt
│   ├── Acquisition_2012Q2.txt
│   ├── Performance_2012Q1.txt
│   ├── Performance_2012Q2.txt
│   └── ...
├── processed
│   ├── Acquisition.txt
│   ├── Performance.txt
│   ├── train.csv
├── .gitignore
├── annotate.py
├── assemble.py
├── README.md
├── requirements.txt
├── settings.py
```
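### Finding an error metric
We're done generating our training dataset, and now we just need the final step, generating predictions. We'll need to figure out an error metric, as well as how we want to evaluate our data. In this case, far more loans are not foreclosed on than are, so a plain accuracy measure doesn't make much sense.
If we read in the training data and check the counts in the foreclosure_status column, here's what we get: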
```
import os
import pandas as pd
import settings

train = pd.read_csv(os.path.join(settings.PROCESSED_DIR, "train.csv"))
train["foreclosure_status"].value_counts()
```
```
False    4635982
True        1585
Name: foreclosure_status, dtype: int64
```
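Since so few of the loans were foreclosed on, a machine learning model judged only on the percentage of correctly predicted labels could predict False for every row and still get a very high accuracy. Instead, we want a metric that takes the class imbalance into account and ensures that we predict foreclosures accurately. We don't want too many false positives, where we predict that a loan will be foreclosed on even though it won't be, or too many false negatives, where we predict that a loan won't be foreclosed on but it is. Of the two, false negatives are more costly for Fannie Mae, because it would be buying loans on which it may not be able to recoup its investment.
So we'll define the false negative rate as the number of loans where the model predicts no foreclosure but the loan was actually foreclosed on, divided by the total number of loans that were actually foreclosed on. This is the percentage of actual foreclosures that the model "missed". Here's a diagram:
In the diagram, 1 loan was predicted as not being foreclosed on, but it actually was. If we divide this by the number of loans that were actually foreclosed on, 2, we get a false negative rate of 50%. We'll use this as our error metric, so we can evaluate our model's performance.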
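The arithmetic can be sketched in a few lines of plain Python. The four labels below are invented to mirror the example (2 actual foreclosures, 1 of them missed), and the `false_negative_rate` helper is a hypothetical name that assumes at least one actual foreclosure is present.

```python
def false_negative_rate(actual, predicted):
    """Share of actual foreclosures that the model predicted as non-foreclosures."""
    missed = sum(1 for a, p in zip(actual, predicted) if a and not p)
    total_foreclosed = sum(1 for a in actual if a)
    return missed / total_foreclosed

# Hypothetical labels: two loans were actually foreclosed on, one was missed
actual = [True, True, False, False]
predicted = [True, False, False, True]

rate = false_negative_rate(actual, predicted)
print(rate)
```

Note that the false positive in the last position does not affect this metric; only actual foreclosures appear in the denominator.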
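### Setting up the classifier for machine learning
We'll use cross validation to make predictions. With cross validation, we'll divide our data into 3 groups. Then we'll do the following:
- Train a model on groups 1 and 2, and use the model to make predictions for group 3.
- Train a model on groups 1 and 3, and use the model to make predictions for group 2.
- Train a model on groups 2 and 3, and use the model to make predictions for group 1.
Splitting the data into groups this way means that we never train a model on the same data we're making predictions for. This avoids overfitting. If we overfit, we'll get a falsely low false negative rate, which makes it hard to improve our algorithm or use it in the real world.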
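The three-way rotation can be sketched without any libraries. This is a simplified illustration, not scikit-learn's exact splitting strategy; the `three_fold_splits` helper and the way rows are assigned to groups are assumptions made for the example.

```python
def three_fold_splits(n_rows, n_folds=3):
    """Yield (train_indices, test_indices) pairs, holding out one group per round."""
    indices = list(range(n_rows))
    # Assign rows to groups round-robin: 0, 1, 2, 0, 1, 2, ...
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for held_out in range(n_folds):
        test = folds[held_out]
        train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train, test

for train, test in three_fold_splits(6):
    print("train on", train, "predict for", test)
```

Every row ends up in the held-out group exactly once, so each prediction comes from a model that never saw that row during training.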
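[Scikit-learn][35] has a function called [cross_val_predict][36] that makes it easy to perform cross validation.
We'll also need to pick an algorithm to make predictions with. We need a classifier that can do [binary classification][37]. The target variable, foreclosure_status, only has two values, True and False.
We'll use [logistic regression][38], because it works well for binary classification, runs quickly, and uses little memory. This is due to how the algorithm works: instead of constructing many trees, like a random forest, or doing expensive transformations, like a support vector machine, logistic regression has far fewer steps involving fewer matrix operations.
We can use the [logistic regression classifier][39] that's implemented in scikit-learn. The only thing we need to pay attention to is the weight of each class. If we weight the classes equally, the algorithm will predict False for every row, because it is trying to minimize errors. However, we care much more about foreclosures than about loans that aren't foreclosed on. Thus, we'll pass balanced to the class_weight keyword argument of the [LogisticRegression][40] class, so that the algorithm weights the foreclosures more heavily to account for the difference in the counts of each class. This ensures that the algorithm doesn't simply predict False for every row.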
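To get a feel for what balanced weighting means here, the sketch below applies the formula scikit-learn documents for `class_weight="balanced"`, `n_samples / (n_classes * count_of_class)`, to the class counts shown earlier in this section. The `balanced_class_weights` helper is a hypothetical name; scikit-learn computes this internally.

```python
def balanced_class_weights(class_counts):
    """Weight each class inversely to its frequency: n_samples / (n_classes * count)."""
    n_samples = sum(class_counts.values())
    n_classes = len(class_counts)
    return {label: n_samples / (n_classes * count) for label, count in class_counts.items()}

# Class counts taken from the value_counts() output for foreclosure_status
class_counts = {False: 4635982, True: 1585}
weights = balanced_class_weights(class_counts)
print(weights)
```

With these weights, each class contributes the same total weight to the loss, so an error on a foreclosed loan costs far more than an error on a loan that was not foreclosed on.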

View File

@ -0,0 +1,156 @@
### Making predictions
Now that we have the preliminaries out of the way, we're ready to make predictions. We'll create a new file called predict.py that will use the train.csv file we created in the last step. The below code will:
- Import needed libraries.
- Create a function called cross_validate that:
    - Creates a logistic regression classifier with the right keyword arguments.
    - Creates a list of columns that we want to use to train the model, removing id and foreclosure_status.
    - Runs cross validation across the train DataFrame.
    - Returns the predictions.
```
import os
import settings
import pandas as pd
# sklearn.cross_validation was removed in scikit-learn 0.20;
# cross_val_predict now lives in sklearn.model_selection
from sklearn.model_selection import cross_val_predict
from sklearn.linear_model import LogisticRegression
from sklearn import metrics

def cross_validate(train):
    # class_weight="balanced" weights the rare foreclosure class more heavily
    clf = LogisticRegression(random_state=1, class_weight="balanced")
    predictors = train.columns.tolist()
    predictors = [p for p in predictors if p not in settings.NON_PREDICTORS]
    predictions = cross_val_predict(clf, train[predictors], train[settings.TARGET], cv=settings.CV_FOLDS)
    return predictions
```
### Predicting error
Now, we just need to write a few functions to compute error. The below code will:
- Create a function called compute_error that:
    - Uses scikit-learn to compute a simple accuracy score (the percentage of predictions that matched the actual foreclosure_status values).
- Create a function called compute_false_negatives that:
    - Combines the target and the predictions into a DataFrame for convenience.
    - Finds the false negative rate.
- Create a function called compute_false_positives that:
    - Combines the target and the predictions into a DataFrame for convenience.
    - Finds the false positive rate.
        - Finds the number of loans that weren't foreclosed on that the model predicted would be foreclosed on.
        - Divides by the total number of loans that weren't foreclosed on.
```
def compute_error(target, predictions):
    return metrics.accuracy_score(target, predictions)

def compute_false_negatives(target, predictions):
    df = pd.DataFrame({"target": target, "predictions": predictions})
    # Missed foreclosures divided by actual foreclosures (+1 avoids division by zero)
    return df[(df["target"] == 1) & (df["predictions"] == 0)].shape[0] / (df[(df["target"] == 1)].shape[0] + 1)

def compute_false_positives(target, predictions):
    df = pd.DataFrame({"target": target, "predictions": predictions})
    # False alarms divided by actual non-foreclosures (+1 avoids division by zero)
    return df[(df["target"] == 0) & (df["predictions"] == 1)].shape[0] / (df[(df["target"] == 0)].shape[0] + 1)
```
### Putting it all together
Now, we just have to put the functions together in predict.py. The below code will:
- Read in the dataset.
- Compute cross validated predictions.
- Compute the 3 error metrics above.
- Print the error metrics.
```
def read():
    train = pd.read_csv(os.path.join(settings.PROCESSED_DIR, "train.csv"))
    return train

if __name__ == "__main__":
    train = read()
    predictions = cross_validate(train)
    error = compute_error(train[settings.TARGET], predictions)
    fn = compute_false_negatives(train[settings.TARGET], predictions)
    fp = compute_false_positives(train[settings.TARGET], predictions)
    print("Accuracy Score: {}".format(error))
    print("False Negatives: {}".format(fn))
    print("False Positives: {}".format(fp))
```
Once you've added the code, you can run python predict.py to generate predictions. Running everything shows that our false negative rate is .26, which means that of the foreclosed loans, we missed predicting 26% of them. This is a good start, but can use a lot of improvement!
You can find the complete predict.py file [here][41].
Your file tree should now look like this:
```
loan-prediction
├── data
│   ├── Acquisition_2012Q1.txt
│   ├── Acquisition_2012Q2.txt
│   ├── Performance_2012Q1.txt
│   ├── Performance_2012Q2.txt
│   └── ...
├── processed
│   ├── Acquisition.txt
│   ├── Performance.txt
│   ├── train.csv
├── .gitignore
├── annotate.py
├── assemble.py
├── predict.py
├── README.md
├── requirements.txt
├── settings.py
```
### Writing up a README
Now that we've finished our end to end project, we just have to write up a README.md file so that other people know what we did, and how to replicate it. A typical README.md for a project should include these sections:
- A high level overview of the project, and what the goals are.
- Where to download any needed data or materials.
- Installation instructions.
- How to install the requirements.
- Usage instructions.
- How to run the project.
- What you should see after each step.
- How to contribute to the project.
- Good next steps for extending the project.
[Here's][42] a sample README.md for this project.
### Next steps
Congratulations, you're done making an end to end machine learning project! You can find a complete example project [here][43]. It's a good idea to upload your project to [Github][44] once you've finished it, so others can see it as part of your portfolio.
There are still quite a few angles left to explore with this data. Broadly, we can split them up into 3 categories: extending this project and making it more accurate, finding other columns to predict, and exploring the data. Here are some ideas:
- Generate more features in annotate.py.
- Switch algorithms in predict.py.
- Try using more data from Fannie Mae than we used in this post.
- Add in a way to make predictions on future data. The code we wrote will still work if we add more data, so we can add more past or future data.
- Try seeing if you can predict if a bank should have issued the loan originally (vs if Fannie Mae should have acquired the loan).
    - Remove any columns from train that the bank wouldn't have known at the time of issuing the loan.
        - Some columns are known when Fannie Mae bought the loan, but not before.
    - Make predictions.
- Explore seeing if you can predict columns other than foreclosure_status.
    - Can you predict how much the property will be worth at sale time?
- Explore the nuances between performance updates.
    - Can you predict how many times the borrower will be late on payments?
    - Can you map out the typical loan lifecycle?
- Map out data on a state by state or zip code by zip code level.
    - Do you see any interesting patterns?
If you build anything interesting, please let us know in the comments!
If you liked this, you might like to read the other posts in our Build a Data Science Portfolio series:
- [Storytelling with data][45].
- [How to set up a data science blog][46].

View File

@ -1,3 +1,5 @@
alim0x translating
Implementing Mandatory Access Control with SELinux or AppArmor in Linux
===========================================================================

View File

@ -0,0 +1,142 @@
vim-kakali translating
Scientific Audio Processing, Part I - How to read and write Audio files with Octave 4.0.0 on Ubuntu
================
Octave, the Linux counterpart to Matlab, has a number of functions and commands that allow the acquisition, recording, playback and digital processing of audio signals for entertainment, research, medical, or any other scientific applications. In this tutorial, we will use Octave V4.0.0 on Ubuntu, starting with reading audio files and moving on to writing and playing signals that emulate sounds used in a wide range of activities.
Note that the main focus of this tutorial is not to install or learn to use an established audio processing application, but rather to understand how audio processing works from the point of view of design and audio engineering.
### Prerequisites
The first step is to install Octave. Run the following commands in a terminal to add the Octave PPA in Ubuntu and install the software.
```
sudo apt-add-repository ppa:octave/stable
sudo apt-get update
sudo apt-get install octave
```
### Step 1: Opening Octave.
In this step we open the software by clicking on its icon. We can change the work directory by clicking on the File Browser dropdown.
![](https://www.howtoforge.com/images/how-to-read-and-write-audio-files-with-octave-4-in-ubuntu/initial.png)
### Step 2: Audio Info
The command "audioinfo" shows us relevant information about the audio file that we will process.
```
>> info = audioinfo ('testing.ogg')
```
![](https://www.howtoforge.com/images/how-to-read-and-write-audio-files-with-octave-4-in-ubuntu/audioinfo.png)
### Step 3: Reading an audio File
In this tutorial I will read and use ogg files, for which it is feasible to read characteristics like sampling, audio type (stereo or mono), number of channels, etc. I should mention that for purposes of this tutorial, all the commands used will be executed in the terminal window of Octave. First, we have to save the name of the ogg file in a variable. Note: it's important that the file is in the work path of Octave.
```
>> file='yourfile.ogg'
```
```
>> [M, fs] = audioread(file)
```
Where M is a matrix of one or two columns, depending on the number of channels and fs is the sampling frequency.
![](https://www.howtoforge.com/images/how-to-read-and-write-audio-files-with-octave-4-in-ubuntu/reading.png)
![](https://www.howtoforge.com/images/how-to-read-and-write-audio-files-with-octave-4-in-ubuntu/matrix.png)
![](https://www.howtoforge.com/images/how-to-read-and-write-audio-files-with-octave-4-in-ubuntu/big/frequency.png)
There are some options that we can use for reading audio files, such as:
```
>> [y, fs] = audioread (filename, samples)
>> [y, fs] = audioread (filename, datatype)
>> [y, fs] = audioread (filename, samples, datatype)
```
Where samples specifies starting and ending frames and datatype specifies the data type to return. We can assign values to any variable:
```
>> samples = [1, fs]
>> [y, fs] = audioread (filename, samples)
```
And about datatype:
```
>> [y,Fs] = audioread(filename,'native')
```
If the value is 'native' then the type of data depends on how the data is stored in the audio file.
### Step 4: Writing an audio file
Creating the ogg file:
For this purpose, we are going to generate an ogg file with values from a cosine. The sampling frequency that I will use is 44100 samples per second and the file will last for 10 seconds. The frequency of the cosine signal is 440 Hz.
```
>> filename='cosine.ogg';
>> fs=44100;
>> t=0:1/fs:10;
>> w=2*pi*440*t;
>> signal=cos(w);
>> audiowrite(filename, signal, fs);
```
This creates a file named 'cosine.ogg' in our workspace that contains the cosine signal.
![](https://www.howtoforge.com/images/how-to-read-and-write-audio-files-with-octave-4-in-ubuntu/cosinefile.png)
If we play the 'cosine.ogg' file then this will reproduce a 440Hz tone which is equivalent to an 'A' musical tone. If we want to see the values saved in the file we have to 'read' the file with the 'audioread' function. In a further tutorial, we will see how to write an audio file with two channels.
### Step 5: Playing an audio file
Octave, by default, has an audio player that we can use for testing purposes. Use the following functions as an example:
```
>> [y,fs]=audioread('yourfile.ogg');
>> player=audioplayer(y, fs, 8)
scalar structure containing the fields:
BitsPerSample = 8
CurrentSample = 0
DeviceID = -1
NumberOfChannels = 1
Running = off
SampleRate = 44100
TotalSamples = 236473
Tag =
Type = audioplayer
UserData = [](0x0)
>> play(player);
```
In the next parts of the tutorial, we will see advanced audio processing features and possible use cases for scientific and commercial use.
--------------------------------------------------------------------------------
via: https://www.howtoforge.com/tutorial/how-to-read-and-write-audio-files-with-octave-4-in-ubuntu/
Author: [David Duarte][a]
Translator: [译者ID](https://github.com/译者ID)
Proofreader: [校对ID](https://github.com/校对ID)
This article was originally translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)
[a]: https://twitter.com/intent/follow?original_referer=https%3A%2F%2Fwww.howtoforge.com%2Ftutorial%2Fhow-to-read-and-write-audio-files-with-octave-4-in-ubuntu%2F&ref_src=twsrc%5Etfw&region=follow_link&screen_name=howtoforgecom&tw_p=followbutton

View File

@ -0,0 +1,845 @@
Building a data science portfolio: Machine learning project
===========================================================
>This is the third in a series of posts on how to build a Data Science Portfolio. If you like this and want to know when the next post in the series is released, you can [subscribe at the bottom of the page][1].
Data science companies are increasingly looking at portfolios when making hiring decisions. One of the reasons for this is that a portfolio is the best way to judge someone's real-world skills. The good news for you is that a portfolio is entirely within your control. If you put some work in, you can make a great portfolio that companies are impressed by.
The first step in making a high-quality portfolio is to know what skills to demonstrate. The primary skills that companies want in data scientists, and thus the primary skills they want a portfolio to demonstrate, are:
- Ability to communicate
- Ability to collaborate with others
- Technical competence
- Ability to reason about data
- Motivation and ability to take initiative
Any good portfolio will be composed of multiple projects, each of which may demonstrate 1-2 of the above points. This is the third post in a series that will cover how to make a well-rounded data science portfolio. In this post, we'll cover how to make the second project in your portfolio, and how to build an end to end machine learning project. At the end, you'll have a project that shows your ability to reason about data, and your technical competence. [Here's][2] the completed project if you want to take a look.
### An end to end project
As a data scientist, there are times when you'll be asked to take a dataset and figure out how to [tell a story with it][3]. In times like this, it's important to communicate very well, and walk through your process. Tools like Jupyter notebook, which we used in a previous post, are very good at helping you do this. The expectation here is that the deliverable is a presentation or document summarizing your findings.
However, there are other times when you'll be asked to create a project that has operational value. A project with operational value directly impacts the day-to-day operations of a company, and will be used more than once, and often by multiple people. A task like this might be “create an algorithm to forecast our churn rate”, or “create a model that can automatically tag our articles”. In cases like this, storytelling is less important than technical competence. You need to be able to take a dataset, understand it, then create a set of scripts that can process that data. It's often important that these scripts run quickly, and use minimal system resources like memory. It's very common that these scripts will be run several times, so the deliverable becomes the scripts themselves, not a presentation. The deliverable is often integrated into operational flows, and may even be user-facing.
The main components of building an end to end project are:
- Understanding the context
- Exploring the data and figuring out the nuances
- Creating a well-structured project, so it's easy to integrate into operational flows
- Writing high-performance code that runs quickly and uses minimal system resources
- Documenting the installation and usage of your code well, so others can use it
In order to effectively create a project of this kind, we'll need to work with multiple files. Using a text editor like [Atom][4], or an IDE like [PyCharm][5] is highly recommended. These tools will allow you to jump between files, and edit files of different types, like markdown files, Python files, and csv files. Structuring your project so it's easy to version control and upload to collaborative coding tools like [Github][6] is also useful.
![](https://www.dataquest.io/blog/images/end_to_end/github.png)
>This project on Github.
We'll use our editing tools along with libraries like [Pandas][7] and [scikit-learn][8] in this post. We'll make extensive use of Pandas [DataFrames][9], which make it easy to read in and work with tabular data in Python.
### Finding good datasets
A good dataset for an end to end portfolio project can be hard to find. [The dataset][10] needs to be sufficiently large that memory and performance constraints come into play. It also needs to potentially be operationally useful. For instance, this dataset, which contains data on the admission criteria, graduation rates, and graduate future earnings for US colleges would be a great dataset to use to tell a story. However, as you think about the dataset, it becomes clear that there isn't enough nuance to build a good end to end project with it. For example, you could tell someone their potential future earnings if they went to a specific college, but that would be a quick lookup without enough nuance to demonstrate technical competence. You could also figure out if colleges with higher admissions standards tend to have graduates who earn more, but that would be more storytelling than operational.
These memory and performance constraints tend to come into play when you have more than a gigabyte of data, and when you have some nuance to what you want to predict, which involves running algorithms over the dataset.
A good operational dataset enables you to build a set of scripts that transform the data, and answer dynamic questions. A good example would be a dataset of stock prices. You would be able to predict the prices for the next day, and keep feeding new data to the algorithm as the markets closed. This would enable you to make trades, and potentially even profit. This wouldn't be telling a story; it would be adding direct value.
Some good places to find datasets like this are:
- [/r/datasets][11]: a subreddit that has hundreds of interesting datasets.
- [Google Public Datasets][12]: public datasets available through Google BigQuery.
- [Awesome datasets][13]: a list of datasets, hosted on Github.
As you look through these datasets, think about what questions someone might want answered with the dataset, and think if those questions are one-time (“how did housing prices correlate with the S&P 500?”), or ongoing (“can you predict the stock market?”). The key here is to find questions that are ongoing, and require the same code to be run multiple times with different inputs (different data).
For the purposes of this post, we'll look at [Fannie Mae Loan Data][14]. Fannie Mae is a government sponsored enterprise in the US that buys mortgage loans from other lenders. It then bundles these loans up into mortgage-backed securities and resells them. This enables lenders to make more mortgage loans, and creates more liquidity in the market. This theoretically leads to more homeownership, and better loan terms. From a borrower's perspective, things stay largely the same, though.
Fannie Mae releases two types of data: data on loans it acquires, and data on how those loans perform over time. In the ideal case, someone borrows money from a lender, then repays the loan until the balance is zero. However, some borrowers miss multiple payments, which can cause foreclosure. Foreclosure is when the house is seized by the bank because mortgage payments cannot be made. Fannie Mae tracks which loans have missed payments on them, and which loans needed to be foreclosed on. This data is published quarterly, and lags the current date by 1 year. As of this writing, the most recent dataset that's available is from the first quarter of 2015.
Acquisition data, which is published when the loan is acquired by Fannie Mae, contains information on the borrower, including credit score, and information on their loan and home. Performance data, which is published every quarter after the loan is acquired, contains information on the payments being made by the borrower, and the foreclosure status, if any. A loan that is acquired may have dozens of rows in the performance data. A good way to think of this is that the acquisition data tells you that Fannie Mae now controls the loan, and the performance data contains a series of status updates on the loan. One of the status updates may tell us that the loan was foreclosed on during a certain quarter.
![](https://www.dataquest.io/blog/images/end_to_end/foreclosure.jpg)
>A foreclosed home being sold.
### Picking an angle
There are a few directions we could go in with the Fannie Mae dataset. We could:
- Try to predict the sale price of a house after it's foreclosed on.
- Predict the payment history of a borrower.
- Figure out a score for each loan at acquisition time.
The important thing is to stick to a single angle. Trying to focus on too many things at once will make it hard to make an effective project. It's also important to pick an angle that has sufficient nuance. Here are examples of angles without much nuance:
- Figuring out which banks sold loans to Fannie Mae that were foreclosed on the most.
- Figuring out trends in borrower credit scores.
- Exploring which types of homes are foreclosed on most often.
- Exploring the relationship between loan amounts and foreclosure sale prices
All of the above angles are interesting, and would be great if we were focused on storytelling, but aren't great fits for an operational project.
With the Fannie Mae dataset, we'll try to predict whether a loan will be foreclosed on in the future by only using information that was available when the loan was acquired. In effect, we'll create a “score” for any mortgage that will tell us if Fannie Mae should buy it or not. This will give us a nice foundation to build on, and will be a great portfolio piece.
### Understanding the data
Let's take a quick look at the raw data files. Here are the first few rows of the acquisition data from quarter 1 of 2012:
```
100000853384|R|OTHER|4.625|280000|360|02/2012|04/2012|31|31|1|23|801|N|C|SF|1|I|CA|945||FRM|
100003735682|R|SUNTRUST MORTGAGE INC.|3.99|466000|360|01/2012|03/2012|80|80|2|30|794|N|P|SF|1|P|MD|208||FRM|788
100006367485|C|PHH MORTGAGE CORPORATION|4|229000|360|02/2012|04/2012|67|67|2|36|802|N|R|SF|1|P|CA|959||FRM|794
```
Here are the first few rows of the performance data from quarter 1 of 2012:
```
100000853384|03/01/2012|OTHER|4.625||0|360|359|03/2042|41860|0|N||||||||||||||||
100000853384|04/01/2012||4.625||1|359|358|03/2042|41860|0|N||||||||||||||||
100000853384|05/01/2012||4.625||2|358|357|03/2042|41860|0|N||||||||||||||||
```
Before proceeding too far into coding, it's useful to take some time and really understand the data. This is more critical in operational projects: because we aren't interactively exploring the data, it can be harder to spot certain nuances unless we find them upfront. In this case, the first step is to read the materials on the Fannie Mae site:
- [Overview][15]
- [Glossary of useful terms][16]
- [FAQs][17]
- [Columns in the Acquisition and Performance files][18]
- [Sample Acquisition data file][19]
- [Sample Performance data file][20]
After reading through these files, we know some key facts that will help us:
- There's an Acquisition file and a Performance file for each quarter, starting from the year 2000 to present. There's a 1 year lag in the data, so the most recent data is from 2015 as of this writing.
- The files are in text format, with a pipe (|) as a delimiter.
- The files don't have headers, but we have a list of what each column is.
- All together, the files contain data on 22 million loans.
- Because the Performance files contain information on loans acquired in previous years, there will be more performance data for loans acquired in earlier years (i.e. loans acquired in 2014 won't have much performance history).
These small bits of information will save us a ton of time as we figure out how to structure our project and work with the data.
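As an illustration of the delimiter and header facts above, here is a minimal sketch that parses pipe-delimited, headerless rows with the standard library. The rows are the Performance sample rows truncated to their first four fields, and the column names are hypothetical placeholders; the real names come from the Fannie Mae column documentation.

```python
import csv
import io

# Sample Performance rows, truncated to their first four fields for brevity
RAW = """100000853384|03/01/2012|OTHER|4.625
100000853384|04/01/2012||4.625
"""

# Hypothetical column names, since the files themselves carry no header row
COLUMNS = ["id", "reporting_period", "servicer_name", "interest_rate"]

reader = csv.reader(io.StringIO(RAW), delimiter="|")
rows = [dict(zip(COLUMNS, fields)) for fields in reader]

print(rows[0]["id"], rows[1]["reporting_period"])
```

Empty fields come through as empty strings, which is worth keeping in mind when deciding how to treat missing values later on.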
### Structuring the project
Before we start downloading and exploring the data, it's important to think about how we'll structure the project. When building an end-to-end project, our primary goals are:
- Creating a solution that works
- Having a solution that runs quickly and uses minimal resources
- Enabling others to easily extend our work
- Making it easy for others to understand our code
- Writing as little code as possible
In order to achieve these goals, we'll need to structure our project well. A well-structured project follows a few principles:
- Separates data files and code files.
- Separates raw data from generated data.
- Has a README.md file that walks people through installing and using the project.
- Has a requirements.txt file that contains all the packages needed to run the project.
- Has a single settings.py file that contains any settings that are used in other files.
    - For example, if you are reading the same file from multiple Python scripts, it's useful to have them all import settings and get the file name from a centralized place.
- Has a .gitignore file that prevents large or secret files from being committed.
- Breaks each step in our task into a separate file that can be executed separately.
    - For example, we may have one file for reading in the data, one for creating features, and one for making predictions.
- Stores intermediate values. For example, one script may output a file that the next script can read.
    - This enables us to make changes in our data processing flow without recalculating everything.
Our file structure will look something like this shortly:
```
loan-prediction
├── data
├── processed
├── .gitignore
├── README.md
├── requirements.txt
├── settings.py
```
### Creating the initial files
To start with, we'll need to create a loan-prediction folder. Inside that folder, we'll need to make a data folder and a processed folder. The first will store our raw data, and the second will store any intermediate calculated values.
Next, we'll make a .gitignore file. A .gitignore file will make sure certain files are ignored by git and not pushed to Github. One good example of such a file is the .DS_Store file created by OSX in every folder. A good starting point for a .gitignore file is here. We'll also want to ignore the data files because they are very large, and the Fannie Mae terms prevent us from redistributing them, so we should add two lines to the end of our file:
```
data
processed
```
[Here's][21] an example .gitignore file for this project.
Next, we'll need to create README.md, which will help people understand the project. .md indicates that the file is in markdown format. Markdown enables you to write plain text, but also add some fancy formatting if you want. [Here's][22] a guide on markdown. If you upload a file called README.md to Github, Github will automatically process the markdown, and show it to anyone who views the project. [Here's][23] an example.
For now, we just need to put a simple description in README.md:
```
Loan Prediction
-----------------------
Predict whether or not loans acquired by Fannie Mae will go into foreclosure. Fannie Mae acquires loans from other lenders as a way of inducing them to lend more. Fannie Mae releases data on the loans it has acquired and their performance afterwards [here](http://www.fanniemae.com/portal/funding-the-market/data/loan-performance-data.html).
```
Now, we can create a requirements.txt file. This will make it easy for other people to install our project. We don't know exactly what libraries we'll be using yet, but here's a good starting point:
```
pandas
matplotlib
scikit-learn
numpy
ipython
scipy
```
The above libraries are the most commonly used for data analysis tasks in Python, and it's fair to assume that we'll be using most of them. [Here's][24] an example requirements file for this project.
After creating requirements.txt, you should install the packages. For this post, we'll be using Python 3. If you don't have Python installed, you should look into using [Anaconda][25], a Python installer that also installs all the packages listed above.
Finally, we can just make a blank settings.py file, since we don't have any settings for our project yet.
### Acquiring the data
Once we have the skeleton of our project, we can get the raw data.
Fannie Mae has some restrictions around acquiring the data, so you'll need to sign up for an account. You can find the download page [here][26]. After creating an account, you'll be able to download as few or as many loan data files as you want. The files are in zip format, and are reasonably large after decompression.
For the purposes of this blog post, we'll download everything from Q1 2012 to Q1 2015, inclusive. We'll then need to unzip all of the files. After unzipping the files, remove the original .zip files. At the end, the loan-prediction folder should look something like this:
```
loan-prediction
├── data
│ ├── Acquisition_2012Q1.txt
│ ├── Acquisition_2012Q2.txt
│ ├── Performance_2012Q1.txt
│ ├── Performance_2012Q2.txt
│ └── ...
├── processed
├── .gitignore
├── README.md
├── requirements.txt
├── settings.py
```
After downloading the data, you can use the head and tail shell commands to look at the lines in the files. Do you see any columns that aren't needed? It might be useful to consult the [pdf of column names][27] while doing this.
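If you'd rather stay in Python, a rough equivalent of head can be sketched with the standard library. The sample rows and the file name below are made up for illustration; they only mimic the pipe-delimited, header-less shape of the real files.

```python
import itertools
import os
import tempfile

def head(path, n=5, sep="|"):
    """Return the first n rows of a delimited text file as lists of fields."""
    with open(path, "r") as f:
        # islice stops after n lines, so the rest of the file is never read.
        return [line.rstrip("\n").split(sep) for line in itertools.islice(f, n)]

# Hypothetical sample rows standing in for a real Acquisition file.
sample = "100001|R|SELLER A|3.5\n100002|C|SELLER B|4.0\n100003|R|SELLER A|3.75\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "Acquisition_sample.txt")
    with open(path, "w") as f:
        f.write(sample)
    rows = head(path, n=2)

for row in rows:
    print(len(row), row)
```

Printing the field count next to each row makes it easy to match positions against the column list in the pdf.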
### Reading in the data
There are two issues that make our data hard to work with right now:
- The acquisition and performance datasets are segmented across multiple files.
- Each file is missing headers.
Before we can get started on working with the data, we'll need to get to the point where we have one file for the acquisition data, and one file for the performance data. Each of the files will need to contain only the columns we care about, and have the proper headers. One wrinkle here is that the performance data is quite large, so we should try to trim some of the columns if we can.
The first step is to add some variables to settings.py, which will contain the paths to our raw data and our processed data. We'll also add a few other settings that will be useful later on:
```
DATA_DIR = "data"
PROCESSED_DIR = "processed"
MINIMUM_TRACKING_QUARTERS = 4
TARGET = "foreclosure_status"
NON_PREDICTORS = [TARGET, "id"]
CV_FOLDS = 3
```
Putting the paths in settings.py will put them in a centralized place and make them easy to change down the line. When referring to the same variables in multiple files, it's easier to put them in a central place than edit them in every file when you want to change them. [Here's][28] an example settings.py file for this project.
The second step is to create a file called assemble.py that will assemble all the pieces into 2 files. When we run python assemble.py, we'll get 2 data files in the processed directory.
We'll then start writing code in assemble.py. We'll first need to define the headers for each file, so we'll need to look at the [pdf of column names][29] and create lists of the columns in each Acquisition and Performance file:
```
HEADERS = {
"Acquisition": [
"id",
"channel",
"seller",
"interest_rate",
"balance",
"loan_term",
"origination_date",
"first_payment_date",
"ltv",
"cltv",
"borrower_count",
"dti",
"borrower_credit_score",
"first_time_homebuyer",
"loan_purpose",
"property_type",
"unit_count",
"occupancy_status",
"property_state",
"zip",
"insurance_percentage",
"product_type",
"co_borrower_credit_score"
],
"Performance": [
"id",
"reporting_period",
"servicer_name",
"interest_rate",
"balance",
"loan_age",
"months_to_maturity",
"maturity_date",
"msa",
"delinquency_status",
"modification_flag",
"zero_balance_code",
"zero_balance_date",
"last_paid_installment_date",
"foreclosure_date",
"disposition_date",
"foreclosure_costs",
"property_repair_costs",
"recovery_costs",
"misc_costs",
"tax_costs",
"sale_proceeds",
"credit_enhancement_proceeds",
"repurchase_proceeds",
"other_foreclosure_proceeds",
"non_interest_bearing_balance",
"principal_forgiveness_balance"
]
}
```
The next step is to define the columns we want to keep. Since all we're measuring on an ongoing basis about the loan is whether or not it was ever foreclosed on, we can discard many of the columns in the performance data. We'll need to keep all the columns in the acquisition data, though, because we want to maximize the information we have about when the loan was acquired (after all, we're predicting if the loan will ever be foreclosed or not at the point it's acquired). Discarding columns will enable us to save disk space and memory, while also speeding up our code.
```
SELECT = {
"Acquisition": HEADERS["Acquisition"],
"Performance": [
"id",
"foreclosure_date"
]
}
```
Next, we'll write a function to concatenate the data sets. The below code will:
- Import a few needed libraries, including settings.
- Define a function concatenate, that:
    - Gets the names of all the files in the data directory.
    - Loops through each file.
        - If the file isn't the right type (doesn't start with the prefix we want), we ignore it.
        - Reads the file into a [DataFrame][30] with the right settings using the Pandas [read_csv][31] function.
            - Sets the separator to | so the fields are read in correctly.
            - The data has no header row, so sets header to None to indicate this.
            - Sets names to the right value from the HEADERS dictionary; these will be the column names of our DataFrame.
        - Picks only the columns from the DataFrame that we added in SELECT.
    - Concatenates all the DataFrames together.
    - Writes the concatenated DataFrame back to a file.
```
import os
import settings
import pandas as pd

def concatenate(prefix="Acquisition"):
    files = os.listdir(settings.DATA_DIR)
    full = []
    for f in files:
        # Skip files that belong to the other dataset.
        if not f.startswith(prefix):
            continue

        # The raw files are pipe-delimited and have no header row.
        data = pd.read_csv(os.path.join(settings.DATA_DIR, f), sep="|", header=None, names=HEADERS[prefix], index_col=False)
        data = data[SELECT[prefix]]
        full.append(data)

    # Stack the per-quarter DataFrames into one and write it out.
    full = pd.concat(full, axis=0)

    full.to_csv(os.path.join(settings.PROCESSED_DIR, "{}.txt".format(prefix)), sep="|", header=SELECT[prefix], index=False)
```
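To see what those read_csv arguments do without touching the real files, here is a minimal sketch on an in-memory sample. The two rows and the shortened names list are made up for illustration; they are not the real Fannie Mae columns.

```python
import io

import pandas as pd

# Two made-up rows in the same shape as the raw files: pipe-delimited, no header row.
raw = io.StringIO("1|R|3.5\n2|C|4.0\n")
names = ["id", "channel", "interest_rate"]  # hypothetical, shortened header list

# header=None tells pandas the first line is data; names supplies the column labels.
df = pd.read_csv(raw, sep="|", header=None, names=names, index_col=False)

# Keep only the columns we care about, as SELECT does in assemble.py.
subset = df[["id", "interest_rate"]]

# Stacking two frames with axis=0 appends rows, as when combining quarters.
combined = pd.concat([subset, subset], axis=0)
print(combined.shape)
```

Selecting columns before concatenating keeps each intermediate DataFrame small, which matters most for the performance files.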
We can call the above function twice with the arguments Acquisition and Performance to concatenate all the acquisition and performance files together. The below code will:
- Only execute if the script is called from the command line with python assemble.py.
- Concatenate all the files, and result in two files:
- `processed/Acquisition.txt`
- `processed/Performance.txt`
```
if __name__ == "__main__":
    concatenate("Acquisition")
    concatenate("Performance")
```
We now have a nice, compartmentalized assemble.py that's easy to execute, and easy to build off of. By decomposing the problem into pieces like this, we make it easy to build our project. Instead of one messy script that does everything, we define the data that will pass between the scripts, and make them completely separate from each other. When you're working on larger projects, it's a good idea to do this, because it makes it much easier to change individual pieces without having unexpected consequences on unrelated pieces of the project.
Once we finish the assemble.py script, we can run python assemble.py. You can find the complete assemble.py file [here][32].
This will result in two files in the processed directory:
```
loan-prediction
├── data
│ ├── Acquisition_2012Q1.txt
│ ├── Acquisition_2012Q2.txt
│ ├── Performance_2012Q1.txt
│ ├── Performance_2012Q2.txt
│ └── ...
├── processed
│ ├── Acquisition.txt
│ ├── Performance.txt
├── .gitignore
├── assemble.py
├── README.md
├── requirements.txt
├── settings.py
```
### Computing values from the performance data
The next step we'll take is to calculate some values from processed/Performance.txt. All we want to do is to predict whether or not a property is foreclosed on. To figure this out, we just need to check if the performance data associated with a loan ever has a foreclosure_date. If foreclosure_date is None, then the property was never foreclosed on. In order to avoid including loans with little performance history in our sample, we'll also want to count up how many rows exist in the performance file for each loan. This will let us filter loans without much performance history from our training data.
One way to think of the loan data and the performance data is like this:
![](https://github.com/LCTT/wiki-images/blob/master/TranslateProject/ref_img/001.png)
As you can see above, each row in the Acquisition data can be related to multiple rows in the Performance data. In the Performance data, foreclosure_date will appear in the quarter when the foreclosure happened, so it should be blank prior to that. Some loans are never foreclosed on, so all the rows related to them in the Performance data have foreclosure_date blank.
We need to compute foreclosure_status, which is a Boolean that indicates whether a particular loan id was ever foreclosed on, and performance_count, which is the number of rows in the performance data for each loan id.
There are a few different ways to compute the counts we want:
- We could read in all the performance data, then use the Pandas groupby method on the DataFrame to figure out the number of rows associated with each loan id, and also if the foreclosure_date is ever not None for the id.
    - The upside of this method is that it's easy to implement from a syntax perspective.
    - The downside is that reading in all 129236094 lines in the data will take a lot of memory, and be extremely slow.
- We could read in all the performance data, then use apply on the acquisition DataFrame to find the counts for each id.
    - The upside is that it's easy to conceptualize.
    - The downside is that reading in all 129236094 lines in the data will take a lot of memory, and be extremely slow.
- We could iterate over each row in the performance dataset, and keep a separate dictionary of counts.
    - The upside is that the dataset doesn't need to be loaded into memory, so it's extremely fast and memory-efficient.
    - The downside is that it will take slightly longer to conceptualize and implement, and we need to parse the rows manually.
Loading in all the data will take quite a bit of memory, so let's go with the third option above. All we need to do is to iterate through all the rows in the Performance data, while keeping a dictionary of counts per loan id. In the dictionary, we'll keep track of how many times the id appears in the performance data, as well as if foreclosure_date is ever not None. This will give us foreclosure_status and performance_count.
We'll create a new file called annotate.py, and add in code that will enable us to compute these values. In the below code, we'll:
- Import needed libraries.
- Define a function called count_performance_rows.
    - Open processed/Performance.txt. This doesn't read the file into memory, but instead opens a file handler that can be used to read in the file line by line.
    - Loop through each line in the file.
        - Split the line on the delimiter (|).
        - Check if the loan_id is not in the counts dictionary.
            - If not, add it to counts.
        - Increment performance_count for the given loan_id because we're on a row that contains it.
        - If date is not None, then we know that the loan was foreclosed on, so set foreclosure_status appropriately.
```
import os
import settings
import pandas as pd

def count_performance_rows():
    counts = {}
    with open(os.path.join(settings.PROCESSED_DIR, "Performance.txt"), 'r') as f:
        for i, line in enumerate(f):
            if i == 0:
                # Skip header row
                continue
            loan_id, date = line.split("|")
            loan_id = int(loan_id)
            if loan_id not in counts:
                counts[loan_id] = {
                    "foreclosure_status": False,
                    "performance_count": 0
                }
            counts[loan_id]["performance_count"] += 1
            # A non-empty foreclosure_date means the loan was foreclosed on.
            if len(date.strip()) > 0:
                counts[loan_id]["foreclosure_status"] = True
    return counts
```
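To get a feel for what the counts dictionary ends up holding, here is a small sketch of the same streaming idea run over an in-memory sample instead of processed/Performance.txt. The function name count_rows and the sample rows are made up for illustration.

```python
import io

def count_rows(lines):
    """Count rows per loan id and flag ids that ever have a foreclosure date."""
    counts = {}
    for i, line in enumerate(lines):
        if i == 0:
            continue  # skip the header row
        loan_id, date = line.split("|")
        # setdefault creates the entry the first time an id is seen.
        entry = counts.setdefault(
            int(loan_id), {"foreclosure_status": False, "performance_count": 0}
        )
        entry["performance_count"] += 1
        if date.strip():
            entry["foreclosure_status"] = True
    return counts

# Hypothetical sample: loan 1 is never foreclosed on, loan 2 is in its second row.
sample = io.StringIO("id|foreclosure_date\n1|\n1|\n2|\n2|03/01/2014\n")
counts = count_rows(sample)
print(counts)
```

Because the loop only ever holds one line and the dictionary in memory, the same approach should scale to the full performance file.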
### Getting the values
Once we create our counts dictionary, we can make a function that will extract values from the dictionary if a loan_id and a key are passed in:
```
def get_performance_summary_value(loan_id, key, counts):
    value = counts.get(loan_id, {
        "foreclosure_status": False,
        "performance_count": 0
    })
    return value[key]
```
The above function will return the appropriate value from the counts dictionary, and will enable us to assign a foreclosure_status value and a performance_count value to each row in the Acquisition data. The [get][33] method on dictionaries returns a default value if a key isn't found, so this enables us to return sensible default values if a key isn't found in the counts dictionary.
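A quick sketch of that default behavior, using a made-up counts dictionary with a single loan id:

```python
# Hypothetical counts dictionary with one known loan id.
counts = {1: {"foreclosure_status": True, "performance_count": 12}}
default = {"foreclosure_status": False, "performance_count": 0}

# A known id returns its stored summary; an unknown id falls back to the default.
seen = counts.get(1, default)["performance_count"]
unseen = counts.get(999, default)["performance_count"]
print(seen, unseen)  # prints "12 0"
```

This means loans that never appear in the performance data are treated as not foreclosed, with a performance_count of 0.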
### Annotating the data
We've already added a few functions to annotate.py, but now we can get into the meat of the file. We'll need to convert the acquisition data into a training dataset that can be used in a machine learning algorithm. This involves a few things:
- Converting all columns to numeric.
- Filling in any missing values.
- Assigning a performance_count and a foreclosure_status to each row.
- Removing any rows that don't have a lot of performance history (where performance_count is low).
Several of our columns are strings, which aren't useful to a machine learning algorithm. However, they are actually categorical variables, where there are a few different category codes, like R, S, and so on. We can convert these columns to numeric by assigning a number to each category label:
![](https://github.com/LCTT/wiki-images/blob/master/TranslateProject/ref_img/002.png)
Converting the columns this way will allow us to use them in our machine learning algorithm.
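As a minimal sketch of that conversion, assuming a made-up column of category labels:

```python
import pandas as pd

# Made-up category labels; real columns such as channel hold similar short codes.
channel = pd.Series(["R", "S", "R", "C"])

# astype("category") sorts the distinct labels (C, R, S); cat.codes maps each to its position.
codes = channel.astype("category").cat.codes
print(codes.tolist())  # prints [1, 2, 1, 0]
```

The numbers are just labels: they carry no order or magnitude of their own, which is worth keeping in mind when interpreting a model trained on them.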
Some of the columns also contain dates (first_payment_date and origination_date). We can split these dates into 2 columns each:
![](https://github.com/LCTT/wiki-images/blob/master/TranslateProject/ref_img/003.png)
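The split can be sketched on a couple of made-up values, assuming the dates look like month/year strings:

```python
import pandas as pd

# Made-up month/year strings in the shape the date columns are assumed to have.
dates = pd.Series(["03/2012", "11/2013"])

# Split once on the slash, then pull out each part and convert it to a number.
parts = dates.str.split("/")
month = pd.to_numeric(parts.str.get(0))
year = pd.to_numeric(parts.str.get(1))
print(month.tolist(), year.tolist())  # prints [3, 11] [2012, 2013]
```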
In the below code, we'll transform the Acquisition data. We'll define a function that:
- Creates a foreclosure_status column in acquisition by getting the values from the counts dictionary.
- Creates a performance_count column in acquisition by getting the values from the counts dictionary.
- Converts each of the following columns from a string column to an integer column:
    - channel
    - seller
    - first_time_homebuyer
    - loan_purpose
    - property_type
    - occupancy_status
    - property_state
    - product_type
- Converts first_payment_date and origination_date to 2 columns each:
    - Splits the column on the forward slash.
    - Assigns the first part of the split list to a month column.
    - Assigns the second part of the split list to a year column.
    - Deletes the column.
    - At the end, we'll have first_payment_month, first_payment_year, origination_month, and origination_year.
- Fills any missing values in acquisition with -1.
- Keeps only the rows where performance_count is greater than MINIMUM_TRACKING_QUARTERS.
```
def annotate(acquisition, counts):
    # Look up the per-loan summary values computed from the performance data.
    acquisition["foreclosure_status"] = acquisition["id"].apply(lambda x: get_performance_summary_value(x, "foreclosure_status", counts))
    acquisition["performance_count"] = acquisition["id"].apply(lambda x: get_performance_summary_value(x, "performance_count", counts))
    # Convert string categories to integer codes.
    for column in [
        "channel",
        "seller",
        "first_time_homebuyer",
        "loan_purpose",
        "property_type",
        "occupancy_status",
        "property_state",
        "product_type"
    ]:
        acquisition[column] = acquisition[column].astype('category').cat.codes

    # Split each date string into numeric month and year columns.
    for start in ["first_payment", "origination"]:
        column = "{}_date".format(start)
        acquisition["{}_year".format(start)] = pd.to_numeric(acquisition[column].str.split('/').str.get(1))
        acquisition["{}_month".format(start)] = pd.to_numeric(acquisition[column].str.split('/').str.get(0))
        del acquisition[column]

    acquisition = acquisition.fillna(-1)
    # Drop loans without enough performance history.
    acquisition = acquisition[acquisition["performance_count"] > settings.MINIMUM_TRACKING_QUARTERS]
    return acquisition
```
### Pulling everything together
We're almost ready to pull everything together; we just need to add a bit more code to annotate.py. In the below code, we:
- Define a function to read in the acquisition data.
- Define a function to write the processed data to processed/train.csv.
- If this file is called from the command line, like python annotate.py:
    - Read in the acquisition data.
    - Compute the counts for the performance data, and assign them to counts.
    - Annotate the acquisition DataFrame.
    - Write the acquisition DataFrame to train.csv.
```
def read():
    acquisition = pd.read_csv(os.path.join(settings.PROCESSED_DIR, "Acquisition.txt"), sep="|")
    return acquisition

def write(acquisition):
    acquisition.to_csv(os.path.join(settings.PROCESSED_DIR, "train.csv"), index=False)

if __name__ == "__main__":
    acquisition = read()
    counts = count_performance_rows()
    acquisition = annotate(acquisition, counts)
    write(acquisition)
```
Once you're done updating the file, make sure to run it with python annotate.py, to generate the train.csv file. You can find the complete annotate.py file [here][34].
The folder should now look like this:
```
loan-prediction
├── data
│ ├── Acquisition_2012Q1.txt
│ ├── Acquisition_2012Q2.txt
│ ├── Performance_2012Q1.txt
│ ├── Performance_2012Q2.txt
│ └── ...
├── processed
│ ├── Acquisition.txt
│ ├── Performance.txt
│ ├── train.csv
├── .gitignore
├── annotate.py
├── assemble.py
├── README.md
├── requirements.txt
├── settings.py
```
### Finding an error metric
We're done with generating our training dataset, and now we'll just need to do the final step, generating predictions. We'll need to figure out an error metric, as well as how we want to evaluate our data. In this case, there are many more loans that aren't foreclosed on than are, so typical accuracy measures don't make much sense.
If we read in the training data, and check the counts in the foreclosure_status column, here's what we get:
```
import os
import pandas as pd
import settings

train = pd.read_csv(os.path.join(settings.PROCESSED_DIR, "train.csv"))
train["foreclosure_status"].value_counts()
```
```
False 4635982
True 1585
Name: foreclosure_status, dtype: int64
```
Since so few of the loans were foreclosed on, just checking the percentage of labels that were correctly predicted will mean that we can make a machine learning model that predicts False for every row, and still gets a very high accuracy. Instead, we'll want to use a metric that takes the class imbalance into account, and ensures that we predict foreclosures accurately. We don't want too many false positives, where we predict that a loan will be foreclosed on even though it won't, or too many false negatives, where we predict that a loan won't be foreclosed on, but it is. Of these two, false negatives are more costly for Fannie Mae, because they're buying loans where they may not be able to recoup their investment.
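To make the imbalance concrete, here is a rough back-of-the-envelope sketch using the two counts shown above. It scores a hypothetical model that predicts False for every row:

```python
# Class counts taken from the value_counts output above.
not_foreclosed = 4635982
foreclosed = 1585
total = not_foreclosed + foreclosed

# A model that always predicts False gets every non-foreclosed loan right...
baseline_accuracy = not_foreclosed / total

# ...and misses every loan that was actually foreclosed on.
baseline_fnr = foreclosed / foreclosed

print("accuracy: {:.4%}".format(baseline_accuracy))
print("false negative rate: {:.0%}".format(baseline_fnr))
```

The accuracy looks excellent while the model is useless for the thing we care about, which is why accuracy alone is a poor guide here.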
We'll define false negative rate as the number of loans where the model predicts no foreclosure but the loan was actually foreclosed on, divided by the number of total loans that were actually foreclosed on. This is the percentage of actual foreclosures that the model "missed". Here's a diagram:
![](https://github.com/LCTT/wiki-images/blob/master/TranslateProject/ref_img/004.png)
In the diagram above, 1 loan was predicted as not being foreclosed on, but it actually was. If we divide this by the number of loans that were actually foreclosed on, 2, we get the false negative rate, 50%. We'll use this as our error metric, so we can evaluate our model's performance.
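The definition can be sketched in plain Python. The labels below are made up so that, as in the diagram, 1 of the 2 actual foreclosures is missed; the helper name false_negative_rate is only for illustration.

```python
def false_negative_rate(actual, predicted):
    """Share of actual positives that the model predicted as negative."""
    # Predictions for the rows that really were foreclosed on.
    positives = [p for a, p in zip(actual, predicted) if a]
    if not positives:
        return 0.0
    missed = sum(1 for p in positives if not p)
    return missed / len(positives)

# Made-up labels: two actual foreclosures, one of which the model misses.
actual = [True, True, False, False]
predicted = [True, False, False, True]
print(false_negative_rate(actual, predicted))  # prints 0.5
```

Note that the false positive in the last row doesn't affect this number at all; only the rows that were actually foreclosed on count.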
### Setting up the classifier for machine learning
We'll use cross validation to make predictions. With cross validation, we'll divide our data into 3 groups. Then we'll do the following:
- Train a model on groups 1 and 2, and use the model to make predictions for group 3.
- Train a model on groups 1 and 3, and use the model to make predictions for group 2.
- Train a model on groups 2 and 3, and use the model to make predictions for group 1.
Splitting it up into groups this way means that we never train a model using the same data we're making predictions for. This avoids overfitting. If we overfit, we'll get a falsely low false negative rate, which makes it hard to improve our algorithm or use it in the real world.
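The grouping above can be sketched by hand in a few lines. This is only an illustration of the idea, not how scikit-learn assigns rows to folds; the interleaved grouping and the helper name three_fold_splits are assumptions made for the example.

```python
def three_fold_splits(rows, folds=3):
    """Yield (train, test) pairs where each group is held out exactly once."""
    # Deal rows into groups round-robin: rows 0, 3, ... go to group 0, and so on.
    groups = [rows[i::folds] for i in range(folds)]
    for held_out in range(folds):
        train = [r for g, group in enumerate(groups) if g != held_out for r in group]
        yield train, groups[held_out]

rows = list(range(6))  # stand-ins for six rows of training data
for train, test in three_fold_splits(rows):
    print("train:", train, "predict:", test)
```

Every row shows up in exactly one held-out group, so each prediction comes from a model that never saw that row during training.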
[Scikit-learn][35] has a function called [cross_val_predict][36] which will make it easy to perform cross validation.
We'll also need to pick an algorithm to use to make predictions. We need a classifier that can do [binary classification][37]. The target variable, foreclosure_status, only has two values, True and False.
We'll use [logistic regression][38], because it works well for binary classification, runs extremely quickly, and uses little memory. This is due to how the algorithm works: instead of constructing dozens of trees, like a random forest, or doing expensive transformations, like a support vector machine, logistic regression has far fewer steps involving fewer matrix operations.
We can use the [logistic regression classifier][39] algorithm that's implemented in scikit-learn. The only thing we need to pay attention to is the weights of each class. If we weight the classes equally, the algorithm will predict False for every row, because it is trying to minimize errors. However, we care much more about foreclosures than we do about loans that aren't foreclosed on. Thus, we'll pass balanced to the class_weight keyword argument of the [LogisticRegression][40] class, to get the algorithm to weight the foreclosures more to account for the difference in the counts of each class. This will ensure that the algorithm doesn't predict False for every row, and instead is penalized equally for making errors in predicting either class.
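To get a sense of how much more the foreclosures are weighted, here is a rough sketch of the heuristic that the scikit-learn documentation gives for the balanced mode, n_samples / (n_classes * count), applied to the class counts we saw earlier. It reproduces the arithmetic only and doesn't call scikit-learn.

```python
# Class counts from the training data shown earlier.
counts = {False: 4635982, True: 1585}
n_samples = sum(counts.values())
n_classes = len(counts)

# scikit-learn documents the "balanced" heuristic as n_samples / (n_classes * count).
weights = {label: n_samples / (n_classes * count) for label, count in counts.items()}
print(weights)

# The rare class is weighted more heavily by the inverse ratio of the counts.
print(weights[True] / weights[False])
```

Each foreclosure ends up counting a few thousand times more than each non-foreclosure, so both classes contribute roughly equally to the total error.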
### Making predictions
Now that we have the preliminaries out of the way, we're ready to make predictions. We'll create a new file called predict.py that will use the train.csv file we created in the last step. The below code will:
- Import needed libraries.
- Create a function called cross_validate that:
    - Creates a logistic regression classifier with the right keyword arguments.
    - Creates a list of columns that we want to use to train the model, removing id and foreclosure_status.
    - Runs cross validation across the train DataFrame.
    - Returns the predictions.
```
import os
import settings
import pandas as pd
# sklearn.cross_validation was removed in scikit-learn 0.20;
# cross_val_predict now lives in sklearn.model_selection.
from sklearn.model_selection import cross_val_predict
from sklearn.linear_model import LogisticRegression
from sklearn import metrics

def cross_validate(train):
    clf = LogisticRegression(random_state=1, class_weight="balanced")

    # Use every column except the target and the loan id as a predictor.
    predictors = train.columns.tolist()
    predictors = [p for p in predictors if p not in settings.NON_PREDICTORS]

    predictions = cross_val_predict(clf, train[predictors], train[settings.TARGET], cv=settings.CV_FOLDS)
    return predictions
```
### Predicting error
Now, we just need to write a few functions to compute error. The below code will:
- Create a function called compute_error that:
    - Uses scikit-learn to compute a simple accuracy score (the percentage of predictions that matched the actual foreclosure_status values).
- Create a function called compute_false_negatives that:
    - Combines the target and the predictions into a DataFrame for convenience.
    - Finds the false negative rate.
- Create a function called compute_false_positives that:
    - Combines the target and the predictions into a DataFrame for convenience.
    - Finds the false positive rate.
        - Finds the number of loans that weren't foreclosed on that the model predicted would be foreclosed on.
        - Divides by the total number of loans that weren't foreclosed on.
```
def compute_error(target, predictions):
    return metrics.accuracy_score(target, predictions)

def compute_false_negatives(target, predictions):
    df = pd.DataFrame({"target": target, "predictions": predictions})
    # The + 1 in the denominator avoids dividing by zero when there are no positives.
    return df[(df["target"] == 1) & (df["predictions"] == 0)].shape[0] / (df[(df["target"] == 1)].shape[0] + 1)

def compute_false_positives(target, predictions):
    df = pd.DataFrame({"target": target, "predictions": predictions})
    # The + 1 in the denominator avoids dividing by zero when there are no negatives.
    return df[(df["target"] == 0) & (df["predictions"] == 1)].shape[0] / (df[(df["target"] == 0)].shape[0] + 1)
```
### Putting it all together
Now, we just have to put the functions together in predict.py. The below code will:
- Read in the dataset.
- Compute cross validated predictions.
- Compute the 3 error metrics above.
- Print the error metrics.
```
def read():
    train = pd.read_csv(os.path.join(settings.PROCESSED_DIR, "train.csv"))
    return train

if __name__ == "__main__":
    train = read()
    predictions = cross_validate(train)
    error = compute_error(train[settings.TARGET], predictions)
    fn = compute_false_negatives(train[settings.TARGET], predictions)
    fp = compute_false_positives(train[settings.TARGET], predictions)
    print("Accuracy Score: {}".format(error))
    print("False Negatives: {}".format(fn))
    print("False Positives: {}".format(fp))
```
Once you've added the code, you can run python predict.py to generate predictions. Running everything shows that our false negative rate is .26, which means that of the foreclosed loans, we missed predicting 26% of them. This is a good start, but can use a lot of improvement!
You can find the complete predict.py file [here][41].
Your file tree should now look like this:
```
loan-prediction
├── data
│ ├── Acquisition_2012Q1.txt
│ ├── Acquisition_2012Q2.txt
│ ├── Performance_2012Q1.txt
│ ├── Performance_2012Q2.txt
│ └── ...
├── processed
│ ├── Acquisition.txt
│ ├── Performance.txt
│ ├── train.csv
├── .gitignore
├── annotate.py
├── assemble.py
├── predict.py
├── README.md
├── requirements.txt
├── settings.py
```
### Writing up a README
Now that we've finished our end to end project, we just have to write up a README.md file so that other people know what we did, and how to replicate it. A typical README.md for a project should include these sections:
- A high level overview of the project, and what the goals are.
- Where to download any needed data or materials.
- Installation instructions.
    - How to install the requirements.
- Usage instructions.
    - How to run the project.
    - What you should see after each step.
- How to contribute to the project.
    - Good next steps for extending the project.
[Here's][42] a sample README.md for this project.
### Next steps
Congratulations, you're done making an end to end machine learning project! You can find a complete example project [here][43]. It's a good idea to upload your project to [Github][44] once you've finished it, so others can see it as part of your portfolio.
There are still quite a few angles left to explore with this data. Broadly, we can split them up into 3 categories: extending this project and making it more accurate, finding other columns to predict, and exploring the data. Here are some ideas:
- Generate more features in annotate.py.
- Switch algorithms in predict.py.
- Try using more data from Fannie Mae than we used in this post.
- Add in a way to make predictions on future data. The code we wrote will still work if we add more data, so we can add more past or future data.
- Try seeing if you can predict if a bank should have issued the loan originally (vs if Fannie Mae should have acquired the loan).
    - Remove any columns from train that the bank wouldn't have known at the time of issuing the loan.
        - Some columns are known when Fannie Mae bought the loan, but not before.
    - Make predictions.
- Explore seeing if you can predict columns other than foreclosure_status.
    - Can you predict how much the property will be worth at sale time?
- Explore the nuances between performance updates.
    - Can you predict how many times the borrower will be late on payments?
    - Can you map out the typical loan lifecycle?
- Map out data on a state by state or zip code by zip code level.
    - Do you see any interesting patterns?
If you build anything interesting, please let us know in the comments!
If you liked this, you might like to read the other posts in our "Build a Data Science Portfolio" series:
- [Storytelling with data][45].
- [How to set up a data science blog][46].
--------------------------------------------------------------------------------
via: https://www.dataquest.io/blog/data-science-portfolio-machine-learning/
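Author: [Vik Paruchuri][a]
Translator: [Translator ID](https://github.com/译者ID)
Proofreader: [Proofreader ID](https://github.com/校对ID)
This article was originally translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux China](https://linux.cn/)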
[a]: https://www.dataquest.io/blog
[1]: https://www.dataquest.io/blog/data-science-portfolio-machine-learning/#email-signup
[2]: https://github.com/dataquestio/loan-prediction
[3]: https://www.dataquest.io/blog/data-science-portfolio-project/
[4]: https://atom.io/
[5]: https://www.jetbrains.com/pycharm/
[6]: https://github.com/
[7]: http://pandas.pydata.org/
[8]: http://scikit-learn.org/
[9]: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html
[10]: https://collegescorecard.ed.gov/data/
[11]: https://reddit.com/r/datasets
[12]: https://cloud.google.com/bigquery/public-data/#usa-names
[13]: https://github.com/caesar0301/awesome-public-datasets
[14]: http://www.fanniemae.com/portal/funding-the-market/data/loan-performance-data.html
[15]: http://www.fanniemae.com/portal/funding-the-market/data/loan-performance-data.html
[16]: https://loanperformancedata.fanniemae.com/lppub-docs/lppub_glossary.pdf
[17]: https://loanperformancedata.fanniemae.com/lppub-docs/lppub_faq.pdf
[18]: https://loanperformancedata.fanniemae.com/lppub-docs/lppub_file_layout.pdf
[19]: https://loanperformancedata.fanniemae.com/lppub-docs/acquisition-sample-file.txt
[20]: https://loanperformancedata.fanniemae.com/lppub-docs/performance-sample-file.txt
[21]: https://github.com/dataquestio/loan-prediction/blob/master/.gitignore
[22]: https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet
[23]: https://github.com/dataquestio/loan-prediction
[24]: https://github.com/dataquestio/loan-prediction/blob/master/requirements.txt
[25]: https://www.continuum.io/downloads
[26]: https://loanperformancedata.fanniemae.com/lppub/index.html
[27]: https://loanperformancedata.fanniemae.com/lppub-docs/lppub_file_layout.pdf
[28]: https://github.com/dataquestio/loan-prediction/blob/master/settings.py
[29]: https://loanperformancedata.fanniemae.com/lppub-docs/lppub_file_layout.pdf
[30]: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html
[31]: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html
[32]: https://github.com/dataquestio/loan-prediction/blob/master/assemble.py
[33]: https://docs.python.org/3/library/stdtypes.html#dict.get
[34]: https://github.com/dataquestio/loan-prediction/blob/master/annotate.py
[35]: http://scikit-learn.org/
[36]: http://scikit-learn.org/stable/modules/generated/sklearn.cross_validation.cross_val_predict.html
[37]: https://en.wikipedia.org/wiki/Binary_classification
[38]: https://en.wikipedia.org/wiki/Logistic_regression
[39]: http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html
[40]: http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html
[41]: https://github.com/dataquestio/loan-prediction/blob/master/predict.py
[42]: https://github.com/dataquestio/loan-prediction/blob/master/README.md
[43]: https://github.com/dataquestio/loan-prediction
[44]: https://www.github.com/
[45]: https://www.dataquest.io/blog/data-science-portfolio-project/
[46]: https://www.dataquest.io/blog/how-to-setup-a-data-science-blog/

View File

@ -1,123 +0,0 @@
translating by cvsher
What is Git
===========
Welcome to my series on learning how to use the Git version control system! In this introduction to the series, you will learn what Git is for and who should use it.
If you're just starting out in the open source world, you're likely to come across a software project that keeps its code in, and possibly releases it for use, by way of Git. In fact, whether you know it or not, you're certainly using software right now that is developed using Git: the Linux kernel (which drives the website you're on right now, if not the desktop or mobile phone you're accessing it on), Firefox, Chrome, and many more projects share their codebase with the world in a Git repository.
On the other hand, all the excitement and hype over Git tends to make things a little muddy. Can you only use Git to share your code with others, or can you use Git in the privacy of your own home or business? Do you have to have a GitHub account to use Git? Why use Git at all? What are the benefits of Git? Is Git the only option?
So forget what you know or what you think you know about Git, and let's take it from the beginning.
### What is version control?
Git is, first and foremost, a version control system (VCS). There are many version control systems out there: CVS, SVN, Mercurial, Fossil, and, of course, Git.
Git serves as the foundation for many services, like GitHub and GitLab, but you can use Git without using any other service. This means that you can use Git privately or publicly.
If you have ever collaborated on anything digital with anyone, then you know how it goes. It starts out simple: you have your version, and you send it to your partner. They make some changes, so now there are two versions, and send the suggestions back to you. You integrate their changes into your version, and now there is one version again.
Then it gets worse: while you change your version further, your partner makes more changes to their version. Now you have three versions: the merged copy that you both worked on, the version you changed, and the version your partner has changed.
As Jason van Gumster points out in his article, [Even artists need version control][1], this syndrome tends to happen in individual settings as well. In both art and science, it's not uncommon to develop a trial version of something; a version of your project that might make it a lot better, or that might fail miserably. So you create file names like project_justTesting.kdenlive and project_betterVersion.kdenlive, and then project_best_FINAL.kdenlive, but with the inevitable allowance for project_FINAL-alternateVersion.kdenlive, and so on.
Whether it's a change to a for loop or an editing change, it happens to the best of us. That is where a good version control system makes life easier.
### Git snapshots
Git takes snapshots of a project, and stores those snapshots as unique versions.
If you go off in a direction with your project that you decide was the wrong direction, you can just roll back to the last good version and continue along an alternate path.
If you're collaborating, then when someone sends you changes, you can merge those changes into your working branch, and then your collaborator can grab the merged version of the project and continue working from the new current version.
Git isn't magic, so conflicts do occur ("You changed the last line of the book, but I deleted that line entirely; how do we resolve that?"), but on the whole, Git enables you to manage the many potential variants of a single work, retaining the history of all the changes, and even allows for parallel versions.
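As a rough illustration of snapshots and rolling back, here is a minimal sketch run in a throwaway directory. The file name `story.txt`, the commit messages, and the user details are made up for the example, and it assumes `git` is installed:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# First snapshot: a version we are happy with.
echo "first draft" > story.txt
git add story.txt
git commit -q -m "First snapshot"

# Second snapshot: a direction we later regret.
echo "a direction I regret" >> story.txt
git commit -q -am "Second snapshot"

# Bring the file back to the way it was in the previous snapshot.
git checkout -q HEAD~1 -- story.txt
restored=$(cat story.txt)
count=$(git rev-list --count HEAD)
echo "$restored ($count snapshots kept)"
```

Note that both snapshots remain in the history; only the working copy of the file goes back to the earlier version.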
### Git distributes
Working on a project on separate machines is complex, because you want to have the latest version of a project while you work, make your own changes, and share your changes with your collaborators. The default method of doing this tends to be clunky online file sharing services, or old school email attachments, both of which are inefficient and error-prone.
Git is designed for distributed development. If you're involved with a project you can clone the project's Git repository, and then work on it as if it was the only copy in existence. Then, with a few simple commands, you can pull in any changes from other contributors, and you can also push your changes over to someone else. Now there is no confusion about who has what version of a project, or whose changes exist where. It is all locally developed, and pushed and pulled toward a common target (or not, depending on how the project chooses to develop).
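The clone, push, and pull cycle described above can be sketched on a single machine. This is only an illustration under assumptions: `shared.git`, `alice`, and `bob` are hypothetical names, and a local bare repository stands in for whatever common target a real project uses:

```shell
set -e
work=$(mktemp -d)
cd "$work"

# A bare repository plays the role of the common target.
git init -q --bare shared.git

# The first contributor clones, commits, and pushes.
git clone -q shared.git alice 2>/dev/null
cd alice
git config user.name "Alice"
git config user.email "alice@example.com"
echo "chapter one" > book.txt
git add book.txt
git commit -q -m "Add chapter one"
git push -q origin HEAD
cd ..

# The second contributor clones the shared history.
git clone -q shared.git bob

# Alice pushes another change, and Bob pulls it in.
cd alice
echo "chapter two" >> book.txt
git commit -q -am "Add chapter two"
git push -q origin HEAD
cd ../bob
git pull -q --ff-only
lines=$(wc -l < book.txt | tr -d ' ')
echo "bob now has $lines lines"
```

Each clone is a complete repository, so either contributor could keep working even if the shared copy were unavailable for a while.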
### Git interfaces
In its natural state, Git is an application that runs in the Linux terminal. However, as it is well-designed and open source, developers all over the world have designed other ways to access it.
It is free, available to anyone for $0, and comes in packages on Linux, BSD, Illumos, and other Unix-like operating systems. It looks like this:
```
$ git --version
git version 2.5.3
```
Probably the most well-known Git interfaces are web-based: sites like GitHub, the open source GitLab, Savannah, BitBucket, and SourceForge all offer online code hosting to maximise the public and social aspect of open source along with, in varying degrees, browser-based GUIs to minimise the learning curve of using Git. This is what the GitLab interface looks like:
![](https://opensource.com/sites/default/files/0_gitlab.png)
Additionally, it is possible that a Git service or independent developer may even have a custom Git frontend that is not HTML-based, which is particularly handy if you don't live with a browser eternally open. The most transparent integration comes in the form of file manager support. The KDE file manager, Dolphin, can show the Git status of a directory, and even generate commits, pushes, and pulls.
![](https://opensource.com/sites/default/files/0_dolphin.jpg)
[Sparkleshare][2] uses Git as a foundation for its own Dropbox-style file sharing interface.
![](https://opensource.com/sites/default/files/0_sparkleshare_1.jpg)
For more, see the (long) page on the official [Git wiki][3] listing projects with graphical interfaces to Git.
### Who should use Git?
You should! The real question is when? And what for?
### When should I use Git, and what should I use it for?
To get the most out of Git, you need to think a little bit more than usual about file formats.
Git is designed to manage source code, which in most languages consists of lines of text. Of course, Git doesn't know if you're feeding it source code or the next Great American Novel, so as long as it breaks down to text, Git is a great option for managing and tracking versions.
But what is text? If you write something in an office application like LibreOffice, then you're probably not generating raw text. Complex applications like that usually wrap the raw text in XML markup and then in a zip container, as a way to ensure that all of the assets for your office file are available when you send that file to someone else. Strangely, though, files that you might expect to be very complex, like the save files for a [Kdenlive][4] project, or an SVG from [Inkscape][5], are actually raw XML files that can easily be managed by Git.
If you use Unix, you can check to see what a file is made of with the file command:
```
$ file ~/path/to/my-file.blah
my-file.blah: ASCII text
$ file ~/path/to/different-file.kra
different-file.kra: Zip data (MIME type "application/x-krita")
```
If unsure, you can view the contents of a file with the head command:
```
$ head ~/path/to/my-file.blah
```
If you see text that is mostly readable by you, then it is probably a file made of text. If you see garbage with some familiar text characters here and there, it is probably not made of text.
Make no mistake: Git can manage other formats of files, but it treats them as blobs. The difference is that in a text file, two Git snapshots (or commits, as we call them) might be, say, three lines different from each other. If you have a photo that has been altered between two different commits, how can Git express that change? It can't, really, because photographs are not made of any kind of sensible text that can just be inserted or removed. I wish photo editing were as easy as just changing some text from "<sky>ugly greenish-blue</sky>" to "<sky>blue-with-fluffy-clouds</sky>" but it truly is not.
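To see the difference for yourself, the following sketch commits one text file and one small binary file, changes both, and asks Git to summarise the changes. The file names are invented, and a few raw bytes stand in for a real photo:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'line one\nline two\n' > notes.txt
printf '\000\001\002\003' > image.bin      # stand-in for a photo
git add notes.txt image.bin
git commit -q -m "Initial snapshot"

# Change one line of text and one byte of the binary file.
printf 'line one\nline 2\n' > notes.txt
printf '\000\001\002\004' > image.bin

# Columns are: lines added, lines removed, path.
# Git prints "-" in place of line counts for binary files.
stats=$(git diff --numstat)
echo "$stats"
```

For the text file Git can count exactly which lines changed; for the blob it can only report that the file is different.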
People check in blobs, like PNG icons or a spreadsheet or a flowchart, to Git all the time, so if you're working in Git then don't be afraid to do that. Know that it's not sensible to do that with huge files, though. If you are working on a project that does generate both text files and large blobs (a common scenario with video games, which have equal parts source code to graphical and audio assets), then you can do one of two things: either invent your own solution, such as pointers to a shared network drive, or use a Git add-on like Joey Hess's excellent [git annex][6], or the [Git-Media][7] project.
So you see, Git really is for everyone. It is a great way to manage versions of your files, it is a powerful tool, and it is not as scary as it first seems.
--------------------------------------------------------------------------------
via: https://opensource.com/resources/what-is-git
Author: [Seth Kenlon][a]
Translator: [translator ID](https://github.com/译者ID)
Proofreader: [proofreader ID](https://github.com/校对者ID)
This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)
[a]: https://opensource.com/users/seth
[1]: https://opensource.com/life/16/2/version-control-isnt-just-programmers
[2]: http://sparkleshare.org/
[3]: https://git.wiki.kernel.org/index.php/InterfacesFrontendsAndTools#Graphical_Interfaces
[4]: https://opensource.com/life/11/11/introduction-kdenlive
[5]: http://inkscape.org/
[6]: https://git-annex.branchable.com/
[7]: https://github.com/alebedev/git-media

View File

@ -1,3 +1,5 @@
FSSlc translating
bc: Command line calculator
============================

View File

@ -1,62 +0,0 @@
maywanting
5 tricks for getting started with Vim
=====================================
![](https://opensource.com/sites/default/files/styles/image-full-size/public/images/education/BUSINESS_peloton.png?itok=nuMbW9d3)
For years, I've wanted to learn Vim, now my preferred Linux text editor and a favorite open source tool among developers and system administrators. And when I say learn, I mean really learn. Master is probably too strong a word, but I'd settle for advanced proficiency. For most of my years using Linux, my skillset included the ability to open a file, use the arrow keys to navigate up and down, switch into insert mode, change some text, save, and exit.
But that's like minimum-viable-Vim. My skill level enabled me to edit text documents from the terminal, but it hadn't actually empowered me with any of the text editing super powers I'd always imagined were possible. And it didn't justify using Vim over the totally capable Pico or Nano.
So why learn Vim at all? Because I do spend an awful lot of time editing text, and I know I could be more efficient at it. And why not Emacs, or a more modern editor like Atom? Because Vim works for me, and at least I have some minimal experience in it. And perhaps, importantly, because it's rare that I encounter a system that I'm working on which doesn't have Vim or its less-improved cousin (vi) available on it already. If you've always had a desire to learn Emacs, more power to you—I hope the Emacs-analog of these tips will prove useful to you, too.
A few weeks into this concentrated effort to up my Vim-use ability, the number one tip I have to share is that you actually must use the tool. While it seems like a piece of advice straight from Captain Obvious, I actually found it considerably harder than I expected to stay in the program. Most of my work happens inside of a web browser, and I had to untrain my trigger-like opening of Gedit every time I needed to edit a block of text outside of a browser. Gedit had made its way to my quick launcher, and so step one was removing this shortcut and putting Vim there instead.
I've tried a number of things that have helped me learn. Here are a few of them I would recommend if you're looking to learn as well.
### Vimtutor
Sometimes the best place to get started isn't far from the application itself. I found Vimtutor, a tiny application that is basically a tutorial in a text file that you edit as you learn, to be as helpful as anything else in showing me the basics of the commands I had skipped learning through the years. Vimtutor is typically found everywhere Vim is, and is an easy install from your package manager if it's not already on your system.
### GVim
I know not everyone will agree with this one, but I found it useful to stop using the version of Vim that lives in my terminal and start using GVim for my basic editing needs. Naysayers will argue that it encourages using the mouse in an environment designed for keyboards, but I found it helpful to be able to quickly find the command I was looking for in a drop-down menu, reminding myself of the correct command, and then executing it with a keyboard. The alternative was often frustration at the inability to figure out how to do something, which is not a good feeling to be under constantly as you struggle to learn a new editor. No, stopping every few minutes to read a man page or use a search engine to remind you of a key sequence is not the best way to learn something new.
### Keyboard maps
Along with switching to GVim, I also found it useful to keep a keyboard "cheat sheet" handy to remind me of the basic keystrokes. There are many available on the web that you can download, print, and set beside your station, but I opted for buying a set of stickers for my laptop keyboard. They were less than ten dollars US and had the added bonus of being a subtle reminder every time I used the laptop to at least try out one new thing as I edited.
### Vimium
As I mentioned, I live in the web browser most of the day. One of the tricks I've found helpful to reinforce the Vim way of navigation is to use [Vimium][1], an open source extension for Chrome that makes Chrome mimic the shortcuts used by Vim. I've found the fewer times I switch contexts for the keyboard shortcuts I'm using, the more likely I am to actually use them. Similar extensions, like [Vimperator][2], exist for Firefox.
### Other human beings
Without a doubt, there's no better way to get help learning something new than to get advice, feedback, and solutions from other people who have gone down a path before you.
If you live in a larger urban area, there might be a Vim meetup group near you. Otherwise, the place to be is the #vim channel on Freenode IRC. One of the more popular channels on Freenode, the #vim channel is always full of helpful individuals willing to offer help with your problems. I find it interesting just to listen to the chatter and see what sorts of problems others are trying to solve to see what I'm missing out on.
------
And so what to make of this effort? So far, so good. The time spent has probably yet to pay for itself in terms of time saved, but I'm always mildly surprised and amused when I find myself with a new reflex, jumping words with the right keypress sequence, or some similarly small feat. I can at least see that every day, the investment is bringing itself a little closer to payoff.
These aren't the only tricks for learning Vim, by far. I also like to point people towards [Vim Adventures][3], an online game in which you navigate using the Vim keystrokes. And just the other day I came across a marvelous visual learning tool at [Vimgifs.com][4], which is exactly what you might expect it to be: illustrated examples with Vim so small they fit nicely in a gif.
Have you invested the time to learn Vim, or really, any program with a keyboard-heavy interface? What worked for you, and did you think the effort was worth it? Has your productivity changed as much as you thought it would? Let's share stories in the comments below.
--------------------------------------------------------------------------------
via: https://opensource.com/life/16/7/tips-getting-started-vim
Author: [Jason Baker][a]
Translator: [translator ID](https://github.com/译者ID)
Proofreader: [proofreader ID](https://github.com/校对者ID)
This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)
[a]: https://opensource.com/users/jason-baker
[1]: https://github.com/philc/vimium
[2]: http://www.vimperator.org/
[3]: http://vim-adventures.com/
[4]: http://vimgifs.com/

View File

@ -1,62 +0,0 @@
Keeweb A Linux Password Manager
================================
![](http://www.linuxandubuntu.com/uploads/2/1/1/5/21152474/keeweb_1.png?608)
Today we depend on more and more online services. Each online service we sign up for asks us to set a password, so we end up having to remember hundreds of passwords, and it is easy for anyone to forget some of them. In this article I am going to talk about Keeweb, a Linux password manager that can store all your passwords securely either online or offline.
There are many Linux password managers. Password managers like [Keepass][1] and [Encryptr, a Zero-knowledge system based password manager][2] have already been talked about on LinuxAndUbuntu. Keeweb is another password manager for Linux that we are going to see in this article.
### Keeweb can store passwords offline or online
Keeweb is a cross-platform password manager. It can store all your passwords offline and sync them with your own cloud storage services like OneDrive, Google Drive, Dropbox etc. Keeweb does not have an online database of its own to sync your passwords.
To connect your online storage with Keeweb, just click more and click the service that you want to use.
![](http://www.linuxandubuntu.com/uploads/2/1/1/5/21152474/keeweb.png?685)
Now Keeweb will prompt you to sign in to your drive. After signing in, authorize Keeweb to use your account.
![](http://www.linuxandubuntu.com/uploads/2/1/1/5/21152474/authenticate-dropbox-with-keeweb_orig.jpg?649)
### Store passwords with Keeweb
It is very easy to store your passwords with Keeweb. You can encrypt your password file with a complex password. Keeweb also allows you to lock the file with a key file, but I don't recommend it. If somebody gets your key file, it takes only a click to unlock your passwords file.
#### Create Passwords
To create a new password, simply click the '+' sign and you will be presented with all the entries to fill in. You can create more entries if you want.
#### Search Passwords
![](http://www.linuxandubuntu.com/uploads/2/1/1/5/21152474/search-passwords_orig.png)
Keeweb has a library of icons so that you can find any particular password entry easily. You can change the color of icons, download more icons and even import icons from your computer. When it comes to finding passwords, the search comes in very handy.
Passwords of similar services can be grouped so that you can find them all at one place in one folder. You can also tag passwords to store them all in different categories.
![](http://www.linuxandubuntu.com/uploads/2/1/1/5/21152474/tags-passwords-in-keeweb.png?283)
### Themes
![](http://www.linuxandubuntu.com/uploads/2/1/1/5/21152474/themes.png?304)
If you like light themes like white or high contrast then you can change the theme from Settings > General > Themes. There are four themes available, two are dark and two are light.
### Don't You Like Linux Password Managers? No problem!
I have already posted about two other Linux password managers, Keepass and Encryptr, and there were arguments about them on Reddit and other social media. Some people were against using any password manager, and others were for it. In this article I want to make clear that it is our own responsibility to protect the file that passwords are stored in. I think password managers like Keepass and Keeweb are good to use because they don't store your passwords in the cloud. These password managers create a file that you can keep on your hard drive or encrypt with apps like VeraCrypt. I myself don't use or recommend services that store passwords in their own database.
--------------------------------------------------------------------------------
via: http://www.tecmint.com/mandatory-access-control-with-selinux-or-apparmor-linux/
Author: [author][a]
Translator: [translator ID](https://github.com/译者ID)
Proofreader: [proofreader ID](https://github.com/校对者ID)
This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)
[a]: http://www.linuxandubuntu.com/home/keeweb-a-linux-password-manager
[1]: http://www.linuxandubuntu.com/home/keepass-password-management-tool-creates-strong-passwords-and-keeps-them-secure
[2]: http://www.linuxandubuntu.com/home/encryptr-zero-knowledge-system-based-password-manager-for-linux

View File

@ -1,3 +1,4 @@
sevenot translating
Terminator A Linux Terminal Emulator With Multiple Terminals In One Window
=============================================================================

View File

@ -1,187 +0,0 @@
Being translated by ChrisLeeGit
Part 13 - LFCS: How to Configure and Troubleshoot Grand Unified Bootloader (GRUB)
=====================================================================================
Because of the changes in the LFCS exam requirements effective Feb. 2, 2016, we are adding the necessary topics to the [LFCS series][1] published here. To prepare for this exam, you are highly encouraged to use the [LFCE series][2] as well.
![](http://www.tecmint.com/wp-content/uploads/2016/03/Configure-Troubleshoot-Grub-Boot-Loader.png)
>LFCS: Configure and Troubleshoot Grub Boot Loader Part 13
In this article we will introduce you to GRUB and explain why a boot loader is necessary, and how it adds versatility to the system.
The [Linux boot process][3] from the time you press the power button of your computer until you get a fully-functional system follows this high-level sequence:
* 1. A process known as **POST** (**Power-On Self Test**) performs an overall check on the hardware components of your computer.
* 2. When **POST** completes, it passes the control over to the boot loader, which in turn loads the Linux kernel in memory (along with **initramfs**) and executes it. The most used boot loader in Linux is the **GRand Unified Boot loader**, or **GRUB** for short.
* 3. The kernel checks and accesses the hardware, and then runs the initial process (mostly known by its generic name “**init**”) which in turn completes the system boot by starting services.
In Part 7 of this series (“[SysVinit, Upstart, and Systemd][4]”) we introduced the [service management systems and tools][5] used by modern Linux distributions. You may want to review that article before proceeding further.
### Introducing GRUB Boot Loader
Two major **GRUB** versions (**v1** sometimes called **GRUB Legacy** and **v2**) can be found in modern systems, although most distributions use **v2** by default in their latest versions. Only **Red Hat Enterprise Linux 6** and its derivatives still use **v1** today.
Thus, we will focus primarily on the features of **v2** in this guide.
Regardless of the **GRUB** version, a boot loader allows the user to:
* 1). modify the way the system behaves by specifying different kernels to use,
* 2). choose between alternate operating systems to boot, and
* 3). add or edit configuration stanzas to change boot options, among other things.
Today, **GRUB** is maintained by the **GNU** project and is well documented on its website. You are encouraged to use the [GNU official documentation][6] while going through this guide.
When the system boots you are presented with the following **GRUB** screen in the main console. Initially, you are prompted to choose between alternate kernels (by default, the system will boot using the latest kernel) and are allowed to enter a **GRUB** command line (with `c`) or edit the boot options (by pressing the `e` key).
![](http://www.tecmint.com/wp-content/uploads/2016/03/GRUB-Boot-Screen.png)
>GRUB Boot Screen
One of the reasons why you would consider booting with an older kernel is a hardware device that used to work properly and has started “acting up” after an upgrade (refer to [this link][7] in the AskUbuntu forums for an example).
The **GRUB v2** configuration is read on boot from `/boot/grub/grub.cfg` or `/boot/grub2/grub.cfg`, whereas `/boot/grub/grub.conf` or `/boot/grub/menu.lst` are used in **v1**. These files are NOT to be edited by hand, but are modified based on the contents of `/etc/default/grub` and the files found inside `/etc/grub.d`.
In **CentOS 7**, here's the `/etc/default/grub` configuration file that is created when the system is first installed:
```
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="vconsole.keymap=la-latin1 rd.lvm.lv=centos_centos7-2/swap crashkernel=auto vconsole.font=latarcyrheb-sun16 rd.lvm.lv=centos_centos7-2/root rhgb quiet"
GRUB_DISABLE_RECOVERY="true"
```
In addition to the online documentation, you can also find the GNU GRUB manual using info as follows:
```
# info grub
```
If you're interested specifically in the options available for `/etc/default/grub`, you can invoke the configuration section directly:
```
# info -f grub -n 'Simple configuration'
```
Using the command above you will find out that `GRUB_TIMEOUT` sets the time, in seconds, between the moment the initial screen appears and the moment automatic booting begins, unless the user interrupts it. When this variable is set to `-1`, boot will not be started until the user makes a selection.
When multiple operating systems or kernels are installed in the same machine, `GRUB_DEFAULT` indicates which OS or kernel entry in the GRUB initial screen should be selected to boot by default. It is usually set to an integer index, although, as in the file above, it may also be set to `saved`. The list of entries can be viewed not only in the splash screen shown above, but also using the following command:
### In CentOS and openSUSE:
```
# awk -F\' '$1=="menuentry " {print $2}' /boot/grub2/grub.cfg
```
### In Ubuntu:
```
# awk -F\' '$1=="menuentry " {print $2}' /boot/grub/grub.cfg
```
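Since `GRUB_DEFAULT` is typically given as an index, it can help to print each entry next to its number. The following is only a sketch: it runs the same `awk` filter against a small made-up sample instead of the real `grub.cfg`, and the kernel versions in the sample are illustrative:

```shell
# Hypothetical sample standing in for /boot/grub2/grub.cfg
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
menuentry 'CentOS Linux (3.10.0-327.el7.x86_64) 7 (Core)' --class centos {
    linux16 /vmlinuz-3.10.0-327.el7.x86_64
}
menuentry 'CentOS Linux (3.10.0-123.el7.x86_64) 7 (Core)' --class centos {
    linux16 /vmlinuz-3.10.0-123.el7.x86_64
}
EOF

# Print a zero-based index in front of every menu entry title.
entries=$(awk -F\' '$1=="menuentry " {printf "%d: %s\n", n++, $2}' "$cfg")
echo "$entries"
rm -f "$cfg"
```

On a real system you would point the same `awk` command at `/boot/grub2/grub.cfg` or `/boot/grub/grub.cfg` instead of the temporary file.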
In the example shown in the below image, if we wish to boot with the kernel version **3.10.0-123.el7.x86_64** (4th entry), we need to set `GRUB_DEFAULT` to `3` (entries are internally numbered beginning with zero) as follows:
```
GRUB_DEFAULT=3
```
![](http://www.tecmint.com/wp-content/uploads/2016/03/Boot-System-with-Old-Kernel-Version.png)
>Boot System with Old Kernel Version
One final GRUB configuration variable that is of special interest is `GRUB_CMDLINE_LINUX`, which is used to pass options to the kernel. The options that can be passed through GRUB to the kernel are well documented in the [Kernel Parameters file][8] and in [man 7 bootparam][9].
Current options in my **CentOS 7** server are:
```
GRUB_CMDLINE_LINUX="vconsole.keymap=la-latin1 rd.lvm.lv=centos_centos7-2/swap crashkernel=auto vconsole.font=latarcyrheb-sun16 rd.lvm.lv=centos_centos7-2/root rhgb quiet"
```
Why would you want to modify the default kernel parameters or pass extra options? In simple terms, there may be times when you need to tell the kernel certain hardware parameters that it may not be able to determine on its own, or to override the values that it would detect.
This happened to me not too long ago when I tried **Vector Linux**, a derivative of **Slackware**, on my 10-year old laptop. After installation it did not detect the right settings for my video card so I had to modify the kernel options passed through GRUB in order to make it work.
Another example is when you need to bring the system to single-user mode to perform maintenance tasks. You can do this by appending the word `single` to `GRUB_CMDLINE_LINUX` and rebooting:
```
GRUB_CMDLINE_LINUX="vconsole.keymap=la-latin1 rd.lvm.lv=centos_centos7-2/swap crashkernel=auto vconsole.font=latarcyrheb-sun16 rd.lvm.lv=centos_centos7-2/root rhgb quiet single"
```
After editing `/etc/default/grub`, you will need to run `update-grub` (Ubuntu) or `grub2-mkconfig -o /boot/grub2/grub.cfg` (**CentOS** and **openSUSE**) to regenerate `grub.cfg`; otherwise, your changes will not take effect at the next boot.
This command will process the boot configuration files mentioned earlier to update `grub.cfg`. This method ensures changes are permanent, while options passed through GRUB at boot time will only last during the current session.
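If you work across several distributions, a small dry-run check can tell you which of the two commands applies on the current machine. This is a sketch only: it prints the command instead of running it, and the variable name `regen_cmd` is made up for the example:

```shell
# Decide which regeneration command is available, without running it.
if command -v update-grub >/dev/null 2>&1; then
    regen_cmd="update-grub"
elif command -v grub2-mkconfig >/dev/null 2>&1; then
    regen_cmd="grub2-mkconfig -o /boot/grub2/grub.cfg"
else
    regen_cmd=""
fi
echo "Would run: ${regen_cmd:-no GRUB configuration tool found}"
```

Once you are happy with the reported command, run it as root to regenerate `grub.cfg`.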
### Fixing Linux GRUB Issues
If you install a second operating system or if your GRUB configuration file gets corrupted due to human error, there are ways you can get your system back on its feet and be able to boot again.
In the initial screen, press `c` to get a GRUB command line (remember that you can also press `e` to edit the default boot options), and use `help` to list the available commands in the GRUB prompt:
![](http://www.tecmint.com/wp-content/uploads/2016/03/Fix-Grub-Issues-in-Linux.png)
>Fix Grub Configuration Issues in Linux
We will focus on **ls**, which will list the installed devices and filesystems, and we will examine what it finds. In the image below we can see that there are 4 hard drives (`hd0` through `hd3`).
Only `hd0` seems to have been partitioned (as evidenced by msdos1 and msdos2, where 1 and 2 are the partition numbers and msdos is the partitioning scheme).
Lets now examine the first partition on `hd0` (**msdos1**) to see if we can find GRUB there. This approach will allow us to boot Linux and there use other high level tools to repair the configuration file or reinstall GRUB altogether if it is needed:
```
# ls (hd0,msdos1)/
```
As we can see in the highlighted area, we found the `grub2` directory in this partition:
![](http://www.tecmint.com/wp-content/uploads/2016/03/Find-Grub-Configuration.png)
>Find Grub Configuration
Once we are sure that GRUB resides in (**hd0,msdos1**), lets tell GRUB where to find its configuration file and then instruct it to attempt to launch its menu:
```
set prefix=(hd0,msdos1)/grub2
set root=(hd0,msdos1)
insmod normal
normal
```
![](http://www.tecmint.com/wp-content/uploads/2016/03/Find-and-Launch-Grub-Menu.png)
>Find and Launch Grub Menu
Then in the GRUB menu, choose an entry and press **Enter** to boot using it. Once the system has booted you can issue the `grub2-install /dev/sdX` command (replace `sdX` with the device you want to install GRUB on). The boot information will then be updated and all related files will be restored.
```
# grub2-install /dev/sdX
```
Other more complex scenarios are documented, along with their suggested fixes, in the [Ubuntu GRUB2 Troubleshooting guide][10]. The concepts explained there are valid for other distributions as well.
### Summary
In this article we have introduced you to GRUB, indicated where you can find documentation both online and offline, and explained how to approach a scenario where a system has stopped booting properly due to a bootloader-related issue.
Fortunately, GRUB is one of the tools that is best documented and you can easily find help either in the installed docs or online using the resources we have shared in this article.
Do you have questions or comments? Dont hesitate to let us know using the comment form below. We look forward to hearing from you!
--------------------------------------------------------------------------------
via: http://www.tecmint.com/linux-basic-shell-scripting-and-linux-filesystem-troubleshooting/
Author: [Gabriel Cánepa][a]
Translator: [ChrisLeeGit](https://github.com/chrisleegit)
Proofreader: [proofreader ID](https://github.com/校对者ID)
This article was originally translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)
[a]: http://www.tecmint.com/author/gacanepa/
[1]: http://www.tecmint.com/sed-command-to-create-edit-and-manipulate-files-in-linux/
[2]: http://www.tecmint.com/installing-network-services-and-configuring-services-at-system-boot/
[3]: http://www.tecmint.com/linux-boot-process/
[4]: http://www.tecmint.com/linux-boot-process-and-manage-services/
[5]: http://www.tecmint.com/best-linux-log-monitoring-and-management-tools/
[6]: http://www.gnu.org/software/grub/manual/
[7]: http://askubuntu.com/questions/82140/how-can-i-boot-with-an-older-kernel-version
[8]: https://www.kernel.org/doc/Documentation/kernel-parameters.txt
[9]: http://man7.org/linux/man-pages/man7/bootparam.7.html
[10]: https://help.ubuntu.com/community/Grub2/Troubleshooting

View File

@ -1,275 +0,0 @@
translating by vim-kakali
Learn How to Use Awk Variables, Numeric Expressions and Assignment Operators part8
=======================================================================================
The [Awk command series][1] is getting exciting, I believe. In the previous seven parts, we walked through some fundamentals of Awk that you need to master in order to perform basic text or string filtering in Linux.
Starting with this part, we shall dive into advanced areas of Awk to handle more complex text or string filtering operations. To that end, we are going to cover Awk features such as variables, numeric expressions and assignment operators.
![](http://www.tecmint.com/wp-content/uploads/2016/07/Learn-Awk-Variables-Numeric-Expressions-Assignment-Operators.png)
>Learn Awk Variables, Numeric Expressions and Assignment Operators
These concepts are not fundamentally different from the ones you have probably encountered in other programming languages such as shell, C, Python and many others, so there is no need to worry much about this topic; we are simply revisiting the common ideas behind these features.
This will probably be one of the easiest Awk sections to understand, so sit back and let's get going.
### 1. Awk Variables
In any programming language, a variable is a placeholder that stores a value. When you create a variable in a program file, some space is set aside in memory as the file is executed to store the value you specify for the variable.
You can define Awk variables in much the same way as you define shell variables:
```
variable_name=value
```
In the syntax above:
- `variable_name`: is the name you give a variable
- `value`: the value stored in the variable
Let's look at some examples below:
```
computer_name="tecmint.com"
port_no="22"
email="admin@tecmint.com"
server=computer_name
```
Take a look at the simple examples above. In the first variable definition, the value `tecmint.com` is assigned to the variable `computer_name`.
Furthermore, the value 22 is assigned to the variable port_no. It is also possible to assign the value of one variable to another variable, as in the last example, where we assigned the value of computer_name to the variable server.
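As a minimal sketch of the assignments described above, the commands below run them inside a `BEGIN` block (a special pattern covered later in this series) so that no input file is needed. The `literal` variable is our own addition, included only to show the difference between copying a variable and storing a quoted string.

```shell
# Assign values to user-defined Awk variables; BEGIN runs before any input is read.
result=$(awk 'BEGIN {
    computer_name = "tecmint.com"
    port_no = "22"
    server = computer_name       # copies the value held by computer_name
    literal = "computer_name"    # stores the text itself, not the variable
    print server, port_no, literal
}')
echo "$result"
```

Here `server` ends up holding `tecmint.com`, while `literal` holds the string `computer_name`, because quotes turn the name into plain text.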
If you recall, in [part 2 of this Awk series][2], where we covered field editing, we talked about how Awk divides input lines into fields and uses the standard field access operator, $, to read the different fields that have been parsed. We can also use variables to store the values of fields as follows.
```
first_name=$2
second_name=$3
```
In the examples above, the value of first_name is set to the second field and second_name is set to the third field.
As an illustration, consider a file named names.txt, which contains a list of an application's users, indicating their first and last names plus gender. Using the [cat command][3], we can view the contents of the file as follows:
```
$ cat names.txt
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/List-File-Content-Using-cat-Command.png)
>List File Content Using cat Command
Then, we can use the variables first_name and second_name to store the first and second names of the first user on the list by running the Awk command below:
```
$ awk '/Aaron/{ first_name=$2 ; second_name=$3 ; print first_name, second_name ; }' names.txt
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/Store-Variables-Using-Awk-Command.png)
>Store Variables Using Awk Command
Let us also take a look at another case. When you issue the command `uname -a` on your terminal, it prints out all your system information.
The second field contains your `hostname`, so we can store the hostname in a variable called hostname and print it using Awk as follows:
```
$ uname -a
$ uname -a | awk '{hostname=$2 ; print hostname ; }'
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/Store-Command-Output-to-Variable-Using-Awk.png)
>Store Command Output to Variable Using Awk
### 2. Numeric Expressions
In Awk, numeric expressions are built using the following numeric operators:
- `*` : multiplication operator
- `+` : addition operator
- `/` : division operator
- `-` : subtraction operator
- `%` : modulus operator
- `^` : exponentiation operator
The syntax for a numeric expression is:
```
operand1 operator operand2
```
In the form above, operand1 and operand2 can be numbers or variable names, and operator is any of the operators above.
Below are some examples to demonstrate how to build numeric expressions:
```
counter=0
num1=5
num2=10
num3=num2-num1
counter=counter+1
```
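To see these operators evaluated, the sketch below computes one expression per operator inside a `BEGIN` block, reusing the values of num1 and num2 from the examples above. It is only an illustration and needs no input file.

```shell
# Evaluate each numeric operator once, in the order *, +, /, -, %, ^.
result=$(awk 'BEGIN {
    num1 = 5
    num2 = 10
    print num2 * num1, num2 + num1, num2 / num1, num2 - num1, num2 % num1, num1 ^ 2
}')
echo "$result"
```

With these values the six results are 50, 15, 2, 5, 0 and 25, printed on one line separated by spaces.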
To understand the use of numeric expressions in Awk, consider the following example, with the file domains.txt, which contains all domains owned by Tecmint.
```
news.tecmint.com
tecmint.com
linuxsay.com
windows.tecmint.com
tecmint.com
news.tecmint.com
tecmint.com
linuxsay.com
tecmint.com
news.tecmint.com
tecmint.com
linuxsay.com
windows.tecmint.com
tecmint.com
```
To view the contents of the file, use the command below:
```
$ cat domains.txt
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/View-Contents-of-File.png)
>View Contents of File
If we want to count the number of times the domain tecmint.com appears in the file, we can write a simple script to do that as follows:
```
#!/bin/bash
for file in "$@"; do
    if [ -f "$file" ] ; then
        #print out filename
        echo "File is: $file"
        #print a number incrementally for every line beginning with tecmint.com
        awk '/^tecmint\.com/ { counter=counter+1 ; printf "%s\n", counter ; }' "$file"
    else
        #print error info in case input is not a file
        echo "$file is not a file, please specify a file." >&2 && exit 1
    fi
done
#terminate script with exit code 0 in case of successful execution
exit 0
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/Shell-Script-to-Count-a-String-in-File.png)
>Shell Script to Count a String or Text in File
After creating the script, save it and make it executable. When we run it with the file domains.txt as our input, we get the following output:
```
$ ./script.sh ~/domains.txt
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/Script-To-Count-String.png)
>Script to Count String or Text
From the output of the script, there are 6 lines in the file domains.txt that begin with tecmint.com; you can count them manually to confirm this.
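If only the total is of interest, one possible variation is to print the counter once after all lines have been read. This assumes the `END` special pattern, which belongs to a later part of the series. The sketch below recreates domains.txt in a temporary file so that it can be run on its own.

```shell
# Recreate the sample domains.txt from this article in a temporary file.
domains=$(mktemp)
cat > "$domains" <<'EOF'
news.tecmint.com
tecmint.com
linuxsay.com
windows.tecmint.com
tecmint.com
news.tecmint.com
tecmint.com
linuxsay.com
tecmint.com
news.tecmint.com
tecmint.com
linuxsay.com
windows.tecmint.com
tecmint.com
EOF
# Increment counter on every matching line; print it only once, at the end.
# Adding 0 makes Awk print 0 rather than an empty string when nothing matches.
count=$(awk '/^tecmint\.com/ { counter = counter + 1 } END { print counter + 0 }' "$domains")
echo "Lines beginning with tecmint.com: $count"
rm -f "$domains"
```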
### 3. Assignment Operators
The last Awk feature we shall cover is assignment operators. There are several assignment operators in Awk, including the following:
- `*=` : multiplication assignment operator
- `+=` : addition assignment operator
- `/=` : division assignment operator
- `-=` : subtraction assignment operator
- `%=` : modulus assignment operator
- `^=` : exponentiation assignment operator
The simplest syntax of an assignment operation in Awk is as follows:
```
variable_name=variable_name operator operand
```
Examples:
```
counter=0
counter=counter+1
num=20
num=num-1
```
You can use the assignment operators above to shorten assignment operations in Awk. The general form is `variable_name operator=operand`, so the previous examples can be written as follows:
```
counter=0
counter+=1
num=20
num-=1
```
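As a quick check of the shortened forms, the following sketch applies each assignment operator in turn to a single variable. The starting value and the operands are arbitrary choices made for the demonstration.

```shell
# Apply every compound assignment operator to the same variable, one after another.
result=$(awk 'BEGIN {
    num = 20
    num += 5    # 25
    num -= 1    # 24
    num *= 2    # 48
    num /= 4    # 12
    num %= 5    # 2
    num ^= 3    # 8
    print num
}')
echo "$result"
```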
Therefore, we can alter the Awk command in the shell script we just wrote above using += assignment operator as follows:
```
#!/bin/bash
for file in "$@"; do
    if [ -f "$file" ] ; then
        #print out filename
        echo "File is: $file"
        #print a number incrementally for every line beginning with tecmint.com
        awk '/^tecmint\.com/ { counter+=1 ; printf "%s\n", counter ; }' "$file"
    else
        #print error info in case input is not a file
        echo "$file is not a file, please specify a file." >&2 && exit 1
    fi
done
#terminate script with exit code 0 in case of successful execution
exit 0
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/Alter-Shell-Script.png)
>Alter Shell Script
In this segment of the [Awk series][4], we covered some powerful Awk features, namely variables, numeric expressions and assignment operators, plus a few illustrations of how we can actually use them.
These concepts are much the same as in other programming languages, although there may be some significant distinctions in Awk programming.
In part 9, we shall look at more Awk features, namely the special patterns BEGIN and END. Until then, stay connected to Tecmint.
--------------------------------------------------------------------------------
via: http://www.tecmint.com/learn-awk-variables-numeric-expressions-and-assignment-operators/
Author: [Aaron Kili][a]
Translator: [translator ID](https://github.com/译者ID)
Proofreader: [proofreader ID](https://github.com/校对ID)
This article was translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)
[a]: http://www.tecmint.com/author/aaronkili/
[1]: http://www.tecmint.com/category/awk-command/
[2]: http://www.tecmint.com/awk-print-fields-columns-with-space-separator/
[3]: http://www.tecmint.com/13-basic-cat-command-examples-in-linux/
[4]: http://www.tecmint.com/category/awk-command/

View File

@ -0,0 +1,120 @@
Being translated by ChrisLeeGit
Learn How to Use Awk Built-in Variables Part 10
=================================================
As we continue to uncover Awk features, in this part of the series we shall walk through the concept of built-in variables in Awk. There are two types of variables you can use in Awk: user-defined variables, which we covered in Part 8, and built-in variables.
![](http://www.tecmint.com/wp-content/uploads/2016/07/Awk-Built-in-Variables-Examples.png)
>Awk Built in Variables Examples
Built-in variables have values already defined in Awk, but we can also carefully alter those values. The built-in variables include:
- `FILENAME` : current input file name (do not change variable name)
- `NR` : number of the current input line (that is input line 1, 2, 3… and so on, do not change variable name)
- `NF` : number of fields in current input line (do not change variable name)
- `OFS` : output field separator
- `FS` : input field separator
- `ORS` : output record separator
- `RS` : input record separator
Let us proceed to illustrate the use of some of the Awk built-in variables above:
To read the filename of the current input file, you can use the `FILENAME` built-in variable as follows:
```
$ awk ' { print FILENAME } ' ~/domains.txt
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/Awk-FILENAME-Variable.png)
>Awk FILENAME Variable
You will notice that the filename is printed out for each input line; that is the default behavior of Awk when you use the `FILENAME` built-in variable.
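If printing the filename for every line is not what you want, one possible approach is to print it only for the first record by testing `NR`. The temporary file and its two lines below are made up for the demonstration.

```shell
# Create a small throwaway input file.
sample=$(mktemp)
printf '%s\n' 'tecmint.com' 'linuxsay.com' > "$sample"
# The pattern NR == 1 is true only for the first record, so FILENAME is printed once.
name=$(awk 'NR == 1 { print FILENAME }' "$sample")
echo "$name"
rm -f "$sample"
```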
Use `NR` to count the number of lines (records) in an input file. Remember that it also counts empty lines, as we shall see in the example below.
When we view the file domains.txt using the cat command, it contains 14 lines with text and 2 empty lines:
```
$ cat ~/domains.txt
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/Print-Contents-of-File.png)
>Print Contents of File
```
$ awk ' END { print "Number of records in file is: ", NR } ' ~/domains.txt
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/Awk-Count-Number-of-Lines.png)
>Awk Count Number of Lines
To count the number of fields in a record or line, we use the NF built-in variable as follows:
```
$ cat ~/names.txt
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/List-File-Contents.png)
>List File Contents
```
$ awk '{ print "Record:",NR,"has",NF,"fields" ; }' ~/names.txt
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/Awk-Count-Number-of-Fields-in-File.png)
>Awk Count Number of Fields in File
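`NF` can also be combined with the field operator: `$NF` refers to the last field of the current record, whatever its position. Here is a small sketch; the two input lines are invented and are not the contents of names.txt.

```shell
# For each record, print its number, its field count and its last field.
result=$(printf '%s\n' 'Aaron Kili M' 'Jane Doe' | awk '{ print NR, NF, $NF }')
echo "$result"
```

The first line has three fields and the second has two, so `$NF` picks `M` and `Doe` respectively.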
Next, you can also specify an input field separator using the FS built-in variable; it defines how Awk divides input lines into fields.
By default, Awk splits fields on spaces and tabs, but we can change the value of FS to any character to instruct Awk to divide input lines accordingly.
There are two methods to do this:
- one method is to use the FS built-in variable
- and the second is to invoke the -F Awk option
Consider the file /etc/passwd on a Linux system. The fields in this file are separated by the : character, so we can specify it as the new input field separator when we want to filter out certain fields, as in the following examples:
We can use the `-F` option as follows:
```
$ awk -F':' '{ print $1, $4 ;}' /etc/passwd
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/Awk-Filter-Fields-in-Password-File.png)
>Awk Filter Fields in Password File
Optionally, we can also take advantage of the FS built-in variable as below:
```
$ awk ' BEGIN { FS=":" ; } { print $1, $4 ; } ' /etc/passwd
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/Filter-Fields-in-File-Using-Awk.png)
>Filter Fields in File Using Awk
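The two methods can be compared side by side without touching /etc/passwd. The line below is a typical-looking passwd entry used purely as sample input; both commands should print the same two fields.

```shell
# Sample input in /etc/passwd format (not read from the real file).
line='root:x:0:0:root:/root:/bin/bash'
# Method 1: set the separator with the -F option.
with_option=$(printf '%s\n' "$line" | awk -F':' '{ print $1, $4 }')
# Method 2: set the FS built-in variable before any input is read.
with_variable=$(printf '%s\n' "$line" | awk 'BEGIN { FS = ":" } { print $1, $4 }')
echo "$with_option"
echo "$with_variable"
```

Setting `FS` inside `BEGIN` matters: the separator has to be in place before the first line is split into fields.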
To specify an output field separator, use the OFS built-in variable; it defines the character or string that separates the output fields, as in the example below:
```
$ awk -F':' ' BEGIN { OFS="==>" ;} { print $1, $4 ;}' /etc/passwd
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/Add-Separator-to-Field-in-File.png)
>Add Separator to Field in File
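The list of built-in variables above also includes `RS` and `ORS`, which are not illustrated in this part. The following is a small sketch, with made-up input, of how they could be used together: records are read separated by `;` and written out separated by ` | `.

```shell
# RS makes Awk treat ";" as the end of a record instead of a newline;
# ORS replaces the newline that print normally appends after each record.
result=$(printf 'alpha;beta;gamma' | awk 'BEGIN { RS = ";" ; ORS = " | " } { print NR ": " $0 }')
echo "$result"
```

Because `ORS` is written after every record, including the last one, the output ends with a trailing separator.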
In this Part 10, we have explored the idea of using Awk built-in variables, which come with predefined values. We can also change these values, though it is not recommended to do so unless you know what you are doing.
After this, we shall progress to cover how we can use shell variables in Awk command operations. Therefore, stay connected to Tecmint.
--------------------------------------------------------------------------------
via: http://www.tecmint.com/awk-built-in-variables-examples/?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+tecmint+%28Tecmint%3A+Linux+Howto%27s+Guide%29
Author: [Aaron Kili][a]
Translator: [ChrisLeeGit](https://github.com/chrisleegit)
Proofreader: [proofreader ID](https://github.com/校对ID)
This article was translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)
[a]: http://www.tecmint.com/author/aaronkili/

View File

@ -1,109 +0,0 @@
Translating by ivo-wang
What is good stock portfolio management software on Linux
================================================================================
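If you invest in the stock market, you probably know how important a sound portfolio management plan is. The goal of portfolio management is to build an investment plan tailored to your risk tolerance, time horizon and financial goals. Given its importance, it is no wonder that there is no shortage of commercial apps and stock market monitoring software, each touting sophisticated portfolio management and tracking and reporting features.
For Linux users, we have also found a number of **good open-source portfolio management tools** for managing and tracking a stock portfolio on Linux. One that is highly recommended is [JStock][1], which is written in Java. If you are not a Java fan, you have to accept that JStock runs on the heavyweight JVM. At the same time, I believe many people will appreciate that once a JRE is installed, JStock runs right away on any Linux platform, with no hoops to jump through to get it working in your Linux environment.
The days when open source meant cheap or subpar are over. Considering that JStock is a one-man job, it is impressively packed with useful features for a portfolio management tool, and all the credit goes to its author, Yan Cheng Cheok! For example, JStock supports price monitoring via watchlists, multiple portfolios, custom and built-in stock indicators and scanners, 27 different stock markets, and cross-platform cloud backup and restore. JStock is available on multiple platforms (Linux, OS X, Android and Windows), and you can save your JStock portfolios to the cloud and seamlessly back them up and restore them across platforms.
Now I will show you how to install it and walk through some of the details of using it.
### Install JStock on Linux ###
Since JStock is written in Java, you must [install a JRE][2] to run it. Note that JStock requires JRE 1.7 or higher. If your JRE version does not meet this requirement, JStock will fail to start with the following error.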
Exception in thread "main" java.lang.UnsupportedClassVersionError: org/yccheok/jstock/gui/JStock : Unsupported major.minor version 51.0
Once you have installed a JRE on your Linux system, download the latest JStock release from the official website and launch it.
$ wget https://github.com/yccheok/jstock/releases/download/release_1-0-7-13/jstock-1.0.7.13-bin.zip
$ unzip jstock-1.0.7.13-bin.zip
$ cd jstock
$ chmod +x jstock.sh
$ ./jstock.sh
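In the rest of this tutorial, let me demonstrate some of JStock's useful features.
### Monitor Stock Price Movements via Watchlist ###
With JStock you can create one or more watchlists, which automatically monitor stock price movements and notify you accordingly. You can add multiple stocks of interest to each watchlist. Then enter your alert thresholds in the "Fall Below" and "Rise Above" columns, which correspond to the minimum and maximum prices you want to set.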
![](https://c2.staticflickr.com/2/1588/23795349969_37f4b0f23c_c.jpg)
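For example, if you set the minimum/maximum prices of AAPL stock to $102 and $115.50, you will get a desktop notification whenever the price goes below $102 or above $115.50.
You can also enable email alerts, so that you receive price alerts by email. To set up email alerts, go to the "Options" menu. Under the "Alert" tab, turn on "Send message to email(s)" and enter your Gmail account. Once you complete the Gmail authorization steps, JStock will start sending email alerts to your Gmail account (and optionally to any third-party email address).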
![](https://c2.staticflickr.com/2/1644/24080560491_3aef056e8d_b.jpg)
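### Manage Multiple Portfolios ###
JStock allows you to manage multiple portfolios. This feature is useful if you use multiple stock brokers. You can create a separate portfolio for each broker and manage your buy/sell/dividend transactions on a per-broker basis. You can switch between portfolios by selecting one in the "Portfolio" menu. The following screenshot shows a hypothetical portfolio.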
![](https://c2.staticflickr.com/2/1646/23536385433_df6c036c9a_c.jpg)
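Because you can enable the fee option, you can enter any broker fee, stamp duty and clearing fee for each transaction. If you are lazy, you can enable automatic fee calculation in the options menu and enter a fee schedule for each broker in advance. JStock will then calculate and enter the fees automatically when you add a transaction.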
![](https://c2.staticflickr.com/2/1653/24055085262_0e315c3691_b.jpg)
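### Screen Stocks with Built-in/Custom Indicators ###
If you are doing technical analysis on stocks, you may want to screen stocks based on various criteria (so-called "stock indicators"). For stock screening, JStock offers several [pre-built technical indicators][3] that capture upward, downward and reversal trends of individual stocks. Some of the available indicators are listed below.
- Moving Average Convergence Divergence (MACD)
- Relative Strength Index (RSI)
- Money Flow Index (MFI)
- Commodity Channel Index (CCI)
- Doji
- Golden Cross, Death Cross
- Top Gainers/Losers
To install a pre-built indicator, go to the "Stock Indicator Editor" tab in JStock. Then click the "Install" button in the right-side panel. Choose the "Install from JStock server" option, and then install the indicators you want.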
![](https://c2.staticflickr.com/2/1476/23867534660_b6a9c95a06_c.jpg)
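Once one or more indicators are installed, you can scan stocks with them. Select the "Stock Indicator Scanner" tab, click the "Scan" button at the bottom, and choose the indicator you want.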
![](https://c2.staticflickr.com/2/1653/24137054996_e8fcd10393_c.jpg)
Once you select the stocks to scan (e.g., NYSE, NASDAQ), JStock will perform the scan and show the stocks captured by the indicator in a list.
![](https://c2.staticflickr.com/2/1446/23795349889_0f1aeef608_c.jpg)
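Besides pre-built indicators, you can also define your own indicators with a graphical tool. The following example screens for stocks whose current price is less than or equal to their 60-day average price.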
![](https://c2.staticflickr.com/2/1605/24080560431_3d26eac6b5_c.jpg)
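### Cloud Backup and Restore between Linux and Android JStock ###
Another nice feature of JStock is cloud backup and restore. JStock can back up your portfolios and watchlists to Google Drive and restore them from it, which lets you move seamlessly between platforms such as Linux and Android. For example, if you saved your Android JStock portfolios to Google Drive, you can restore them on your Linux laptop.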
![](https://c2.staticflickr.com/2/1537/24163165565_bb47e04d6c_c.jpg)
![](https://c2.staticflickr.com/2/1556/23536385333_9ed1a75d72_c.jpg)
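If you do not see your portfolios and watchlists after restoring from Google Drive, make sure that your country is correctly set under the "Country" menu.
The free version of JStock for Android is available from the [Google Play Store][4]. If you want its full features (e.g., cloud backup, alerts, charts), you will need to upgrade to the premium version for a one-time payment. I think the premium version is definitely worth it.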
![](https://c2.staticflickr.com/2/1687/23867534720_18b917028c_c.jpg)
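As a final note, I should mention that the author, Yan Cheng Cheok, is a very active developer and responds promptly to bug reports. All the credit goes to him!
What do you think of JStock as a portfolio tracking tool?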
--------------------------------------------------------------------------------
via: http://xmodulo.com/stock-portfolio-management-software-linux.html
Author: [Dan Nanni][a]
Translator: [ivo-wang](https://github.com/ivo-wang)
Proofreader: [proofreader ID](https://github.com/校对者ID)
This article was translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)
[a]:http://xmodulo.com/author/nanni
[1]:http://jstock.org/
[2]:http://ask.xmodulo.com/install-java-runtime-linux.html
[3]:http://jstock.org/ma_indicator.html
[4]:https://play.google.com/store/apps/details?id=org.yccheok.jstock.gui

View File

@ -1,101 +0,0 @@
How to Install the Open Source Discourse Forum on Ubuntu Linux 16.04
===============================================================================
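Discourse is an open source discussion platform that can work as a mailing list, a chat room or a forum. It is a popular, modern forum tool. On the server side it is built with Ruby on Rails and Postgres, and it uses Redis caching to reduce loading times; on the client side it runs in the browser using JavaScript. It is highly customizable and well structured. It also provides converter plugins for migrating existing forums such as vBulletin, phpBB, Drupal, SMF and others. In this article, we will learn how to install Discourse on the Ubuntu operating system.
It was developed with security in mind, so attackers cannot easily find vulnerabilities. It works well across platforms and adjusts its display for phones and tablets.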
### Installing Discourse on Ubuntu 16.04
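Let's get started! You need at least 1 GB of RAM, and Docker must be installed. Besides Docker, Git is also required. We can satisfy both requirements by running the following command.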
```
wget -qO- https://get.docker.com/ | sh
```
![](http://linuxpitstop.com/wp-content/uploads/2016/06/124.png)
It should not take long to install Docker and Git. Once the installation finishes, create a directory for Discourse under /var (you can of course choose another location).
```
mkdir /var/discourse
```
Now clone the Discourse GitHub repository into this new directory.
```
git clone https://github.com/discourse/discourse_docker.git /var/discourse
```
Change into the cloned directory.
```
cd /var/discourse
```
![](http://linuxpitstop.com/wp-content/uploads/2016/06/314.png)
You will see the "discourse-setup" script there. Run it to initialize Discourse.
```
./discourse-setup
```
**Side note: Make sure you have a mail server set up before installing Discourse.**
The setup wizard will ask you the following six questions.
```
Hostname for your Discourse?
Email address for admin account?
SMTP server address?
SMTP user name?
SMTP port [587]:
SMTP password? []:
```
![](http://linuxpitstop.com/wp-content/uploads/2016/06/411.png)
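Once you have supplied this information, it will ask you to confirm it. If everything looks correct, press Enter and the installation will start.
![](http://linuxpitstop.com/wp-content/uploads/2016/06/511.png)
Now sit back, pour yourself a cup of tea, and watch for any error messages.
![](http://linuxpitstop.com/wp-content/uploads/2016/06/610.png)
A successful installation should look like this.
![](http://linuxpitstop.com/wp-content/uploads/2016/06/710.png)
Now open a web browser. If DNS is already set up, you can use your domain name to reach the Discourse page; otherwise, use the IP address. You will see the following:
![](http://linuxpitstop.com/wp-content/uploads/2016/06/85.png)
That's it. Use the "Sign Up" option to create a new admin account.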
![](http://linuxpitstop.com/wp-content/uploads/2016/06/106.png)
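### Conclusion
Discourse is easy to install, secure and easy to use. It has all the features of a modern forum, and it is fully open source. Simplicity, ease of use and a long list of practical features are its strengths. We hope you enjoyed this article; feel free to leave us a comment.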
--------------------------------------------------------------------------------
via: http://linuxpitstop.com/install-discourse-on-ubuntu-linux-16-04/
Author: [Aun][a]
Translator: [kokialoves](https://github.com/kokialoves)
Proofreader: [proofreader ID](https://github.com/校对者ID)
This article was translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)
[a]: http://linuxpitstop.com/author/aun/

View File

@ -1,71 +0,0 @@
GNU Khata: Open Source Accounting Software
============================================
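As an active Linux enthusiast, I often introduce my friends to Linux, help them choose the distribution that suits them best, and help them install open source software for their work.
But this one time, I was stumped. My uncle, a freelance accountant, uses a set of paid software for his accounting work. I was not sure I could find an open source alternative for it, until yesterday.
Abhishek suggested some [cool apps][1] to me, and one in particular, GNU Khata, stood out.
[GNU Khata][2] is an accounting tool. Or should I say a collection of accounting tools? It is like the [Evernote][3] of financial management. Its applications are so broad that it can handle everything from personal finance to large-scale business management, and from store inventory management to tax calculations.
A fun fact for you: 'Khata' means account in Hindi and other Indian languages, which is why this accounting software is called GNU Khata.
### Installation
There are many guides on the internet for installing older versions of Khata. GNU Khata is currently available for Debian/Ubuntu and their derivatives. I suggest you follow the steps given on the GNU Khata website to install it. Here is a quick run-through.
- Download the installer from [here][4].
- Open a terminal in the download directory.
- Copy and paste the following commands into the terminal and run them.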
```
sudo chmod 755 GNUKhatasetup.run
sudo ./GNUKhatasetup.run
```
- That's it. Launch GNU Khata from the Dash or the application menu.
### First Launch
GNU Khata opens in your browser and shows the following page.
![](https://itsfoss.com/wp-content/uploads/2016/07/GNU-khata-1.jpg)
Fill in the organization name, organization type and financial year, then click the proceed button to go to the admin setup page.
![](https://itsfoss.com/wp-content/uploads/2016/07/GNU-khata-2.jpg)
Carefully fill in your name, password, security question and its answer, then click "create and login".
![](https://itsfoss.com/wp-content/uploads/2016/07/GNU-khata-3.jpg)
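You are all set. Use the menu bar to start managing your finances with GNU Khata. It is that easy.
### Does GNU Khata Really Rival the Paid Accounting Software on the Market?
To begin with, GNU Khata keeps everything simple. The menu bar at the top is conveniently organized to help you work quickly and efficiently. You can manage separate accounts and projects and switch between them easily. [Their website][5] states that GNU Khata can be "easily transformed into Indian languages". Also, did you know that GNU Khata can be used in the cloud too?
All the major accounting tools, such as ledgers, project statements and statements of affairs, are formatted professionally and are available in both ready-to-present and customizable formats. This makes accounting and inventory management look easy.
The project is under active development and seeks feedback from practicing accountants to improve the software. Considering its maturity, its ease of use and its price tag of zero, GNU Khata may be the perfect assistant for your bookkeeping.
Let us know what you think of GNU Khata in the comments below.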
--------------------------------------------------------------------------------
via: https://itsfoss.com/using-gnu-khata/?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+ItsFoss+%28Its+FOSS%21+An+Open+Source+Blog%29
Author: [Aquil Roshan][a]
Translator: [MikeCoder](https://github.com/MikeCoder)
Proofreader: [proofreader ID](https://github.com/校对者ID)
This article was translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)
[a]: https://itsfoss.com/author/aquil/
[1]: https://itsfoss.com/category/apps/
[2]: http://www.gnukhata.in/
[3]: https://evernote.com/
[4]: https://cloud.openmailbox.org/index.php/s/L8ppsxtsFq1345E/download
[5]: http://www.gnukhata.in/

View File

@ -0,0 +1,60 @@
5 Tips for Learning Vim
=====================================
![](https://opensource.com/sites/default/files/styles/image-full-size/public/images/education/BUSINESS_peloton.png?itok=nuMbW9d3)
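For years, I have wanted to learn Vim. Today Vim is my favorite Linux text editor, and a favorite open source tool among developers and system administrators. When I say learn, I mean really learn. Mastery is hard to achieve, so I only wanted to become proficient. For most of my years using Linux, my skill set amounted to opening a file, moving the cursor with the arrow keys, switching to insert mode, changing some text, saving, and exiting.
But that is only the bare minimum of Vim. It let me edit text from the terminal, but it did not give me any of the powerful text-processing features I had imagined. It certainly did not justify using Vim over the perfectly capable Pico or Nano.
So why learn Vim at all? Because I spend a fair amount of time editing text, and I know I could be more efficient at it. And why not Emacs, or a more modern editor like Atom? Because Vim works for me, and at least I have some minimal experience with it. And, importantly, I rarely come across a system that does not have Vim or its simpler cousin, Vi, installed. If you have a strong desire to learn Emacs instead, I hope these tips for a comparable editor will be helpful to you too.
After a few weeks of concentrating on improving my Vim skills, the first tip I want to share is that you have to actually use the tool. It seems like obvious advice, but it was harder than I expected. Most of my work happens in a web browser, and every time I needed to edit a piece of text outside the browser I had to deliberately stop myself from opening Gedit. Gedit had a shortcut in my launcher, so the first step was to remove that shortcut and put Vim there instead.
I tried a number of things that helped me learn Vim. If you are learning too, here are some I recommend.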
### Vimtutor
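Often the best place to start is with the application itself. I found Vimtutor, a tiny application that teaches you the basics while you edit a text document, and it showed me a lot of basic commands I had overlooked all these years. Vimtutor usually ships wherever Vim does, and if it is not on your system, it is easy to install from your package manager.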
### GVim
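I know not everyone will agree with this, but what helped me was switching from Vim in the terminal to GVim for my basic editing needs. Opponents argue that GVim encourages using the mouse, while Vim is designed mainly for keyboard users. But I could quickly find the command I was looking for in GVim's drop-down menus, which reminded me of the correct command so that I could then execute it from the keyboard. Struggling to learn a new editor and getting stuck with no way out is no fun. Reading the man page every few minutes, or using a search engine to remind yourself of a command, is not the best way to learn something new either.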
### Keyboard maps
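When I switched to GVim, I also found it handy to have a keyboard "cheat sheet" to remind me of the most basic keystrokes. There are many of these available online that you can download, print, and put up somewhere nearby. For my laptop keyboard, though, I opted to buy a set of stickers. They cost less than ten dollars in the US and remind me constantly as I use the keyboard to edit text and try out new commands.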
### Vimium
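As mentioned above, I do most of my work in the browser. One suggestion I found helpful was to use [Vimium][1] to reinforce my use of Vim. Vimium is an open source extension for Chrome that lets you drive Chrome with Vim-style keyboard shortcuts. I found that the less I had to consciously switch between sets of shortcuts, the more I used them. Similar extensions exist for Firefox, such as [Vimperator][2].
### People
Without a doubt, the best way to learn is to ask people who have been down the road before you for advice, feedback and solutions.
If you live in a larger city, there may be a Vim meetup group nearby. Otherwise, there is the #vim channel on Freenode IRC. It is one of the most active channels on Freenode, and people there can help with your specific questions. I find it interesting just to listen to the chatter and watch others work through problems I have not run into myself.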
------
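So what has come of all this? So far, so good. Whether the time spent was worthwhile depends on how much time it saves later. But I am always pleasantly surprised when I discover a new key sequence for copying, skipping over words, or some similar small trick. At the very least, I can see the investment coming a little closer to paying for itself every day.
These are not the only tips for learning Vim; there are many more. I also like to point people to [Vim Adventures][3], an online game in which you can only move using Vim keybindings. And the other day I came across a marvelous visual learning tool at [Vimgifs.com][4], which is exactly what you would expect: short GIFs that illustrate Vim operations.
Have you invested time in learning Vim, or in any other program with a keyboard-heavy interface? Do you think the effort was worth it for the tools you mastered? Has the gain in productivity met your expectations? Share your stories in the comments below.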
--------------------------------------------------------------------------------
via: https://opensource.com/life/16/7/tips-getting-started-vim
Author: [Jason Baker][a]
Translator: [maywanting](https://github.com/maywanting)
Proofreader: [proofreader ID](https://github.com/校对者ID)
This article was translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)
[a]: https://opensource.com/users/jason-baker
[1]: https://github.com/philc/vimium
[2]: http://www.vimperator.org/
[3]: http://vim-adventures.com/
[4]: http://vimgifs.com/

View File

@ -1,37 +0,0 @@
You Can Run Ubuntu in Your Browser
=====================================================
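Canonical, the parent company of Ubuntu, has put a lot of effort into popularizing Linux. No matter how much you dislike Ubuntu, you have to acknowledge its impact on making Linux easier to use. Ubuntu and its derivatives are the most widely used Linux distributions.
To further promote Ubuntu Linux, Canonical has put it in the browser, so you can use a [demo version of Ubuntu][1] from anywhere. It gives you a better feel for Ubuntu and makes it easier for newcomers to decide whether to use it.
You may argue that a live USB of Linux is better. I agree, but keep in mind that you have to download an ISO, create a bootable USB and change some settings. Not everyone is willing to do that. An online tour is a better option.
So, what can you see in the Ubuntu online tour? Not much, actually.
You can browse files, use the Unity Dash, browse the Ubuntu Software Center, even install a few apps (they will not really be installed, of course), and have a look at the file browser and a few other things. That is about it. But in my opinion, it looks really nice.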
![](https://itsfoss.com/wp-content/uploads/2016/07/Ubuntu-online-demo.jpeg)
![](https://itsfoss.com/wp-content/uploads/2016/07/Ubuntu-online-demo-1.jpeg)
![](https://itsfoss.com/wp-content/uploads/2016/07/Ubuntu-online-demo-2.jpeg)
If a friend or family member wants to try Linux but is reluctant to install it, you can give them the following link:
[Ubuntu Online Tour][0]
--------------------------------------------------------------------------------
via: https://itsfoss.com/ubuntu-online-demo/?utm_source=newsletter&utm_medium=email&utm_campaign=linux_and_open_source_stories_this_week
Author: [Abhishek Prakash][a]
Translator: [kokialoves](https://github.com/kokialoves)
Proofreader: [proofreader ID](https://github.com/校对ID)
This article was translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)
[a]: https://itsfoss.com/author/abhishek/
[0]: http://tour.ubuntu.com/en/
[1]: http://tour.ubuntu.com/en/

View File

@ -0,0 +1,62 @@
Keeweb: A Linux Password Manager
================================
![](http://www.linuxandubuntu.com/uploads/2/1/1/5/21152474/keeweb_1.png?608)
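Today we depend on more and more online services. Every time we sign up for one, we have to set a password, and so we end up having to remember hundreds of passwords. With that many, it is easy for anyone to forget some. In this article I will introduce Keeweb, a Linux password manager that can store all your passwords securely, either online or offline.
When it comes to Linux password managers, there are plenty of them. We have already discussed password managers such as [Keepass][1] and [Encryptr, a password manager based on a zero-knowledge system][2], on LinuxAndUbuntu. Keeweb is another Linux password manager, which we will look at in this article.
### Keeweb Can Store Passwords Offline or Online
Keeweb is a cross-platform password manager. It can store all your passwords offline and sync them to your own cloud storage, such as OneDrive, Google Drive or Dropbox. Keeweb does not have an online database of its own for syncing your passwords.
To connect Keeweb to your online storage, just click "More" and then click the service you want to use.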
![](http://www.linuxandubuntu.com/uploads/2/1/1/5/21152474/keeweb.png?685)
现在Keeweb 会提示你登录到你的云盘。登录成功后,给 Keeweb 授权使用你的账户。
![](http://www.linuxandubuntu.com/uploads/2/1/1/5/21152474/authenticate-dropbox-with-keeweb_orig.jpg?649)
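### Store Passwords with Keeweb
Storing your passwords with Keeweb is very easy. You can encrypt your password file with a strong password. Keeweb also allows you to lock the password file with a key file, but I do not recommend it. If somebody gets hold of your key file, unlocking your password file takes only a click.
#### Create Passwords
To create a new password, simply click the `+` sign and you will be presented with all the fields to fill in. You can create more fields if you want.
#### Search Passwords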
![](http://www.linuxandubuntu.com/uploads/2/1/1/5/21152474/search-passwords_orig.png)
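Keeweb has an icon library so that you can easily find any particular password entry. You can change the color of the icons, download more icons, and even import icons from your computer. This makes searching for passwords much easier.
Passwords for similar services can be grouped so that you find them all together in one folder. You can also tag passwords to store them in different categories.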
![](http://www.linuxandubuntu.com/uploads/2/1/1/5/21152474/tags-passwords-in-keeweb.png?283)
### Themes
![](http://www.linuxandubuntu.com/uploads/2/1/1/5/21152474/themes.png?304)
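If you prefer light themes such as white or high contrast, you can change the theme under "Settings > General > Themes". Keeweb has four themes to choose from: two dark and two light.
### Don't Like Linux Password Managers? No Problem!
I have already published posts about two other Linux password managers, Keepass and Encryptr, and there were debates about them on Reddit and other social media. Some people are against using any password manager, while others are in favor. In this article, I want to make it clear that keeping the password file safe is our own responsibility. I think password managers like Keepass and Keeweb are good to use because they do not store your passwords in a cloud of their own. They create a file, which you can keep on your own hard disk or encrypt with an application like VeraCrypt. Personally, I neither use nor recommend services that store passwords in their own database.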
--------------------------------------------------------------------------------
via: http://www.linuxandubuntu.com/home/keeweb-a-linux-password-manager
Author: [author][a]
Translator: [ChrisLeeGit](https://github.com/chrisleegit)
Proofreader: [proofreader ID](https://github.com/校对者ID)
This article was translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)
[a]: http://www.linuxandubuntu.com/home/keeweb-a-linux-password-manager
[1]: http://www.linuxandubuntu.com/home/keepass-password-management-tool-creates-strong-passwords-and-keeps-them-secure
[2]: http://www.linuxandubuntu.com/home/encryptr-zero-knowledge-system-based-password-manager-for-linux

View File

@ -1,39 +0,0 @@
Set a Real Photo of the Earth as Your Linux Desktop Wallpaper
=================================================================
![](http://www.omgubuntu.co.uk/wp-content/uploads/2016/07/Screen-Shot-2016-07-26-at-16.36.47-1.jpg)
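Tired of looking at the same desktop background? Here is something that is just about the best in the world.
[Himawaripy][1] is a Python 3 script that fetches near-real-time photos of the Earth taken by the [Japanese Himawari 8 weather satellite][2] and sets them as your desktop background.
Once installed, you can set it to run as a job every 10 minutes (naturally, it runs in the background), so that it fetches the latest picture of the Earth and sets it as your wallpaper.
Because Himawari 8 is a geostationary satellite, you will only ever see the Earth as seen from above Australia. But its real-time weather patterns, cloud formations and lighting still make it spectacular, even if I would prefer a view over the UK!
Advanced settings let you configure the quality of the images fetched from the satellite, but keep in mind that higher quality means larger files and longer download times!
Finally, although this script is similar to others we have mentioned before, it is still being updated and still works.
Get Himawaripy
Himawaripy has been tested on a range of desktop environments, including Unity, LXDE, i3, MATE and others. It is free, open source software, but it is not exactly straightforward to set up and configure.
You will find all the instructions for installing and setting up the application on GitHub (hint: there is no one-click install).
[Real-Time Earth Wallpaper Script on GitHub][0]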
--------------------------------------------------------------------------------
via: http://www.tecmint.com/mandatory-access-control-with-selinux-or-apparmor-linux/
Author: [JOEY-ELIJAH SNEDDON][a]
Translator: [translator ID](https://github.com/译者ID)
Proofreader: [proofreader ID](https://github.com/校对者ID)
This article was translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)
[a]: https://plus.google.com/117485690627814051450/?rel=author
[1]: https://github.com/boramalper/himawaripy
[2]: https://en.wikipedia.org/wiki/Himawari_8
[0]: https://github.com/boramalper/himawaripy

View File

@ -1,31 +1,29 @@
translating---geekpi
How to use multiple connections to speed up apt-get/apt on Ubuntu Linux 16.04 LTS server
如何在Ubuntu Linux 16.04 LTS中使用多条连接加速apt-get/apt
=========================================================================================
How do I speed up my apt-get or apt command to download packages from multiple repos on a Ubuntu Linux 16.04 or 14.04 LTS server?
我该如何在Ubuntu Linux 16.04或者14.04 LTS中从多个仓库中下载包来加速apt-get或者apt命令
You need to use apt-fast shell script wrapper. It should speed up apt-get command/apt command and aptitude command by downloading packages with multiple connections per package. All packages are downloaded simultaneously in parallel. It uses aria2c as default download accelerator.
你需要使用到apt-fast这个shell封装器。它会通过多个连接同时下载一个包来加速apt-get/apt和aptitude命令。所有的包都会同时下载。它使用aria2c作为默认的下载加速。
### Install apt-fast tool
### 安装 apt-fast 工具
Type the following command on Ubuntu Linux 14.04 and later versions:
在Ubuntu Linux 14.04或者之后的版本尝试下面的命令:
```
$ sudo add-apt-repository ppa:saiarcot895/myppa
```
Sample outputs:
示例输出:
![](http://s0.cyberciti.org/uploads/faq/2016/07/install-apt-fast-repo.jpg)
Update your repo:
更新你的仓库:
```
$ sudo apt-get update
```
OR
或者
```
$ sudo apt update
@ -33,19 +31,19 @@ $ sudo apt update
![](http://s0.cyberciti.org/uploads/faq/2016/07/install-apt-fast-command.jpg)
Install apt-fast shell wrapper:
安装 apt-fast
```
$ sudo apt-get -y install apt-fast
```
OR
或者
```
$ sudo apt -y install apt-fast
```
Sample outputs:
示例输出:
```
@ -69,122 +67,122 @@ Get:4 http://01.archive.ubuntu.com/ubuntu xenial/universe amd64 aria2 amd64 1.19
54% [4 aria2 486 kB/1,143 kB 42%] 20.4 kB/s 32s
```
### Configure apt-fast
### 配置 apt-fast
You will be prompted as follows (a value between 5 and 16 must be entered):
你将会得到下面的提示必须输入一个5到16的数值
![](http://s0.cyberciti.org/uploads/faq/2016/07/max-connection-10.jpg)
And:
并且
![](http://s0.cyberciti.org/uploads/faq/2016/07/apt-fast-confirmation-box.jpg)
You can edit settings directly too:
你可以直接编辑设置:
```
$ sudo vi /etc/apt-fast.conf
```
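Before changing anything, it can help to see which settings are currently active. The following is only a sketch: it assumes the file marks comments with `#`, and the helper name `show_active_settings` is hypothetical (it is not part of apt-fast).

```shell
# Hypothetical helper (not part of apt-fast): print only the lines of a
# config file that are neither blank nor comments.
show_active_settings() {
    grep -Ev '^[[:space:]]*(#|$)' "$1"
}
# Usage: show_active_settings /etc/apt-fast.conf
```

Reading the active lines first makes it easier to spot which value a later edit actually changes.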
>**Please note that this tool is not for slow network connections; it is for fast network connections. If you have a slow connection to the Internet, you are not going to benefit from this tool.**
>**请注意这个工具并不是给慢速网络连接的,它是给快速网络连接的。如果你的网速慢,那么你将无法从这个工具中得到好处。**
### How do I use apt-fast command?
### 我该怎么使用 apt-fast 命令?
The syntax is:
语法是:
```
apt-fast command
apt-fast [options] command
```
#### To retrieve new lists of packages using apt-fast
#### 使用apt-fast取回新的包列表
```
sudo apt-fast update
```
#### To perform an upgrade using apt-fast
#### 使用apt-fast执行升级
```
sudo apt-fast upgrade
```
#### To perform distribution upgrade (release or force kernel upgrade), enter:
#### 执行发行版升级(发布或者强制内核升级),输入:
```
$ sudo apt-fast dist-upgrade
```
#### To install new packages
#### 安装新的包
The syntax is:
语法是:
```
sudo apt-fast install pkg
```
For example, install nginx package, enter:
比如要安装nginx输入
```
$ sudo apt-fast install nginx
```
Sample outputs:
示例输出:
![](http://s0.cyberciti.org/uploads/faq/2016/07/sudo-apt-fast-install.jpg)
#### To remove packages
#### 删除包
```
$ sudo apt-fast remove pkg
$ sudo apt-fast remove nginx
```
#### To remove packages and its config files too
#### 删除包和它的配置文件
```
$ sudo apt-fast purge pkg
$ sudo apt-fast purge nginx
```
#### To remove automatically all unused packages, enter:
#### 删除所有未使用的包
```
$ sudo apt-fast autoremove
```
#### To Download source archives
#### 下载源码包
```
$ sudo apt-fast source pkgNameHere
```
#### To erase downloaded archive files
#### 清理下载的文件
```
$ sudo apt-fast clean
```
#### To erase old downloaded archive files
#### 清理旧的下载文件
```
$ sudo apt-fast autoclean
```
#### To verify that there are no broken dependencies
#### 验证没有破坏的依赖
```
$ sudo apt-fast check
```
#### To download the binary package into the current directory
#### 下载二进制包到当前目录
```
$ sudo apt-fast download pkgNameHere
$ sudo apt-fast download nginx
```
Sample outputs:
示例输出:
```
[#7bee0c 0B/0B CN:1 DL:0B]
@ -198,7 +196,7 @@ Status Legend:
(OK):download completed.
```
#### To download and display the changelog for the given package
#### 下载并显示指定包的changelog
```
$ sudo apt-fast changelog pkgNameHere
@ -212,7 +210,7 @@ $ sudo apt-fast changelog nginx
via: https://fedoramagazine.org/introducing-flatpak/
作者:[VIVEK GITE][a]
译者:[zky001](https://github.com/zky001)
译者:[geekpi](https://github.com/geekpi)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出

View File

@ -0,0 +1,184 @@
LFCS Series, Part 13: How to Configure and Troubleshoot the GRand Unified Bootloader (GRUB)
=====================================================================================
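Because the LFCS exam requirements changed effective February 2, 2016, we are adding the necessary topics to the [LFCS series][1]. To prepare for the certification exam, we also strongly recommend the [LFCE series][2].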
![](http://www.tecmint.com/wp-content/uploads/2016/03/Configure-Troubleshoot-Grub-Boot-Loader.png)
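>LFCS Series, Part 13: Configure and troubleshoot the GRUB boot loader.
This article introduces GRUB and explains why you need a boot loader and how it adds versatility to the system.
The [Linux boot process][3] runs from the moment you press your computer's power button until you have a fully functional system, and it follows this high-level sequence:
* 1. A process known as **POST** (**Power-On Self Test**) performs an overall check of the computer's hardware components.
* 2. When **POST** completes, it passes control to the boot loader, which in turn loads the Linux kernel (along with **initramfs**) into memory and executes it.
* 3. The kernel first checks and accesses the hardware, then runs the initial process (best known by its generic name, **init**), which completes the boot by starting services.
In Part 7 of this series ("[SysVinit, Upstart, and Systemd][4]"), we introduced the service management systems and tools used by modern Linux distributions. You may want to review that article before proceeding.
### Introducing the GRUB Boot Loader
Two major **GRUB** versions can be found in modern systems: **v1** (sometimes called **GRUB Legacy**) and **v2**, although most recent distributions use **v2** by default. Today, only **Red Hat Enterprise Linux 6** and its derivatives still use **v1**.
Thus, this guide focuses primarily on the features of **v2**.
Regardless of the **GRUB** version, a boot loader allows the user to:
* 1). modify the way the system behaves by specifying different kernels to use;
* 2). choose between alternative operating systems to boot;
* 3). add or edit configuration stanzas to change boot options, among other things.
Today, **GRUB** is maintained by the **GNU** project and is well documented on its website. You are encouraged to consult the [official GNU documentation][6] while going through this guide.
When the system boots, you are presented with the following **GRUB** screen on the main console. Initially, you are prompted to choose between alternative kernels (by default, the system boots using the latest kernel), and you can enter a **GRUB** command line (press `c`) or edit the boot options (press `e`).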
![](http://www.tecmint.com/wp-content/uploads/2016/03/GRUB-Boot-Screen.png)
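> GRUB boot screen
One reason to consider booting with an older kernel is a hardware device that used to work properly and started "acting up" after an upgrade (see [this link][7] on AskUbuntu for an example).
The **GRUB v2** configuration is read at boot from `/boot/grub/grub.cfg` or `/boot/grub2/grub.cfg`, whereas **GRUB v1** uses `/boot/grub/grub.conf` or `/boot/grub/menu.lst`. These files are not meant to be edited by hand; they are generated from the contents of `/etc/default/grub` and the files in the `/etc/grub.d` directory.
On **CentOS 7**, this is the configuration file created when the system is first installed: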
```
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="vconsole.keymap=la-latin1 rd.lvm.lv=centos_centos7-2/swap crashkernel=auto vconsole.font=latarcyrheb-sun16 rd.lvm.lv=centos_centos7-2/root rhgb quiet"
GRUB_DISABLE_RECOVERY="true"
```
In addition to the online documentation, you can read the GNU GRUB manual with the following command:
```
# info grub
```
If you are specifically interested in the options available for `/etc/default/grub`, you can jump directly to the configuration section of the manual:
```
# info -f grub -n 'Simple configuration'
```
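Using the command above, you will find that `GRUB_TIMEOUT` sets the time, in seconds, between the moment the boot screen appears and the moment the system starts booting automatically (unless interrupted by the user). When this variable is set to `-1`, boot does not start until the user makes a selection.
When multiple operating systems or kernels are installed on the same machine, `GRUB_DEFAULT` takes an integer that indicates which OS or kernel entry in the GRUB boot screen is selected by default. The list of entries can be viewed on the boot screen shown earlier, or with the following command:
### On CentOS and openSUSE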
```
# awk -F\' '$1=="menuentry " {print $2}' /boot/grub2/grub.cfg
```
### On Ubuntu
```
# awk -F\' '$1=="menuentry " {print $2}' /boot/grub/grub.cfg
```
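Because `GRUB_DEFAULT` counts entries from zero, it can be handy to print the index next to each title. The following is a minimal sketch built on the same `awk` filter; the function name `list_menu_entries` is hypothetical, and like the command above it only sees top-level `menuentry` lines.

```shell
# Hypothetical helper: list GRUB menu entries with their zero-based index.
# $1 is the path to grub.cfg (e.g. /boot/grub2/grub.cfg or /boot/grub/grub.cfg).
list_menu_entries() {
    awk -F\' '$1 == "menuentry " { printf "%d : %s\n", n++, $2 }' "$1"
}
# Usage (as root): list_menu_entries /boot/grub2/grub.cfg
```

The number printed in front of an entry is the value to use for `GRUB_DEFAULT`.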
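In the example shown in the image below, to boot with kernel version `3.10.0-123.el7.x86_64` (the fourth entry), we need to set `GRUB_DEFAULT` to `3` (entries are numbered starting from zero), as follows: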
```
GRUB_DEFAULT=3
```
![](http://www.tecmint.com/wp-content/uploads/2016/03/Boot-System-with-Old-Kernel-Version.png)
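> Boot the system with an old kernel version
The last GRUB configuration variable of special interest is `GRUB_CMDLINE_LINUX`, which is used to pass options to the kernel. The options that can be passed through GRUB to the kernel are documented in the [kernel parameters file][8] and in [man 7 bootparam][9].
The current options on my **CentOS 7** server are: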
```
GRUB_CMDLINE_LINUX="vconsole.keymap=la-latin1 rd.lvm.lv=centos_centos7-2/swap crashkernel=auto vconsole.font=latarcyrheb-sun16 rd.lvm.lv=centos_centos7-2/root rhgb quiet"
```
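Why would you want to modify the default kernel parameters or pass extra options? Simply put, there are times when you need to tell the kernel certain hardware parameters that it cannot determine on its own, or override values that it would otherwise detect.
This happened to me not long ago when I tried **Vector Linux**, a derivative of **Slackware**, on my 10-year-old laptop. After installation, the kernel did not detect the right settings for my video card, so I had to modify the kernel options passed through GRUB to make it work.
Another example is when you need to bring the system to single-user mode to perform maintenance. You can do this by appending `single` to `GRUB_CMDLINE_LINUX` and rebooting: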
```
GRUB_CMDLINE_LINUX="vconsole.keymap=la-latin1 rd.lvm.lv=centos_centos7-2/swap crashkernel=auto vconsole.font=latarcyrheb-sun16 rd.lvm.lv=centos_centos7-2/root rhgb quiet single"
```
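Changes to simple `KEY=VALUE` lines such as `GRUB_TIMEOUT` or `GRUB_DEFAULT` can also be scripted. The following is a sketch only: `set_grub_var` is a hypothetical helper, it assumes GNU `sed` (for `-i`), and it is best tried on a copy of the file first.

```shell
# Hypothetical helper: set KEY=VALUE in a GRUB defaults-style file.
# Assumes GNU sed; values containing "|", "&" or "\" would need escaping.
set_grub_var() {
    local file=$1 key=$2 value=$3
    if grep -q "^${key}=" "$file"; then
        # The key already exists: rewrite that line in place.
        sed -i "s|^${key}=.*|${key}=${value}|" "$file"
    else
        # The key is missing: append it at the end of the file.
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}
# Example, on a copy rather than the live file:
#   cp /etc/default/grub /tmp/grub.test
#   set_grub_var /tmp/grub.test GRUB_DEFAULT 3
```

Working on a copy keeps a typo from leaving the real `/etc/default/grub` in a state that produces a broken menu.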
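After editing `/etc/default/grub`, you need to run `update-grub` (on Ubuntu) or `grub2-mkconfig -o /boot/grub2/grub.cfg` (on **CentOS** and **openSUSE**) to update `grub.cfg`; otherwise, the changes will not take effect at boot.
This command processes the boot configuration files mentioned earlier to update `grub.cfg`. This method makes the changes permanent, whereas options passed through GRUB at boot time last only for the current session.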
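The choice between the two commands can be sketched as a small check. This is an illustration under assumptions: `grub_regen_command` is a hypothetical name, it only knows the two commands mentioned above, and it prints the command instead of running it so that nothing touches the boot configuration by accident.

```shell
# Hypothetical helper: print (do not run) the command that regenerates
# grub.cfg, based on which tool is installed on this system.
grub_regen_command() {
    if command -v update-grub >/dev/null 2>&1; then
        echo "update-grub"
    elif command -v grub2-mkconfig >/dev/null 2>&1; then
        echo "grub2-mkconfig -o /boot/grub2/grub.cfg"
    else
        echo "no known GRUB regeneration command found" >&2
        return 1
    fi
}
```

Run the printed command as root once you have reviewed it.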
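### Fixing Linux GRUB Issues
If you install a second operating system, or if your GRUB configuration file gets corrupted through human error, there are ways to recover and boot the system again.
At the boot screen, press `c` to get a GRUB command line (remember that you can also press `e` to edit the default boot options), and type `help` at the GRUB prompt to list the available commands: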
![](http://www.tecmint.com/wp-content/uploads/2016/03/Fix-Grub-Issues-in-Linux.png)
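> Fix GRUB configuration issues in Linux
We will focus on **ls**, which lists the installed devices and filesystems, and examine what it finds. In the image below we can see that there are 4 hard disks (`hd0` through `hd3`).
Only `hd0` appears to have been partitioned (as evidenced by msdos1 and msdos2, where 1 and 2 are the partition numbers and msdos is the partitioning scheme).
Let's now examine the first partition on `hd0` (**msdos1**) to see whether we can find GRUB there. This approach lets us boot Linux and then use higher-level tools to repair the configuration file or, if necessary, reinstall GRUB altogether: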
```
# ls (hd0,msdos1)/
```
As the highlighted area shows, the `grub2` directory is on this partition:
![](http://www.tecmint.com/wp-content/uploads/2016/03/Find-Grub-Configuration.png)
> Find GRUB configuration
Once we are sure that GRUB resides in (**hd0,msdos1**), let's tell GRUB where to find its configuration files and then instruct it to launch its menu:
```
set prefix=(hd0,msdos1)/grub2
set root=(hd0,msdos1)
insmod normal
normal
```
![](http://www.tecmint.com/wp-content/uploads/2016/03/Find-and-Launch-Grub-Menu.png)
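> Find and launch the GRUB menu
Then, in the GRUB menu, choose an entry and press **Enter** to boot with it. Once the system has booted, you can repair the problem by running `grub2-install /dev/sdX` (change `sdX` to the device on which you want to install GRUB). The boot information will then be updated and all related files restored.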
```
# grub2-install /dev/sdX
```
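Other more complex scenarios, along with suggested fixes, are documented in the [Ubuntu GRUB2 Troubleshooting guide][10]. The concepts explained there apply to other distributions as well.
### Summary
This article introduced GRUB, indicated where to find documentation both online and offline, and explained how to approach a scenario in which the system no longer boots properly because of a bootloader-related issue.
Fortunately, GRUB is one of the best-documented tools, and you can easily find help either in the installed docs or online using the resources shared in this article.
Do you have questions or comments? Don't hesitate to let us know using the comment form below. We look forward to hearing from you!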
--------------------------------------------------------------------------------
via: http://www.tecmint.com/configure-and-troubleshoot-grub-boot-loader-linux/
Author: [Gabriel Cánepa][a]
Translator: [ChrisLeeGit](https://github.com/chrisleegit)
Proofreader: [校对者ID](https://github.com/校对者ID)
This article was translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/).
[a]: http://www.tecmint.com/author/gacanepa/
[1]: http://www.tecmint.com/sed-command-to-create-edit-and-manipulate-files-in-linux/
[2]: http://www.tecmint.com/installing-network-services-and-configuring-services-at-system-boot/
[3]: http://www.tecmint.com/linux-boot-process/
[4]: http://www.tecmint.com/linux-boot-process-and-manage-services/
[5]: http://www.tecmint.com/best-linux-log-monitoring-and-management-tools/
[6]: http://www.gnu.org/software/grub/manual/
[7]: http://askubuntu.com/questions/82140/how-can-i-boot-with-an-older-kernel-version
[8]: https://www.kernel.org/doc/Documentation/kernel-parameters.txt
[9]: http://man7.org/linux/man-pages/man7/bootparam.7.html
[10]: https://help.ubuntu.com/community/Grub2/Troubleshooting

View File

@ -0,0 +1,275 @@
Part 8: Learn How to Use Awk Variables, Numeric Expressions, and Assignment Operators
=======================================================================================
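I believe the [Awk command series][1] is getting exciting. In the previous parts we covered the fundamental Awk commands for processing files and filtering strings in Linux.
In this part we look at more advanced Awk features for handling more complex text and string filtering operations: variables, numeric expressions, and assignment operators.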
![](http://www.tecmint.com/wp-content/uploads/2016/07/Learn-Awk-Variables-Numeric-Expressions-Assignment-Operators.png)
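>Learn Awk variables, numeric expressions, and assignment operators
You have probably already met these concepts in other programming languages such as shell, C, or Python, and they work much the same way here, so there is no need to worry about this topic; we will simply revisit the commonly used Awk features.
This is probably one of the easiest parts of the Awk series to understand, so sit back and let's get going.
### 1. Awk Variables
In any programming language, a variable is a placeholder that stores a value. When you create a variable in a program, a space is reserved in memory as the program runs, and the value you assign to the variable is stored there.
You can define Awk variables the same way you define shell variables: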
```
variable_name=value
```
In the syntax above:
- `variable_name`: the name you give the variable
- `value`: the value stored in the variable
Let's look at some examples:
```
computer_name="tecmint.com"
port_no="22"
email="admin@tecmint.com"
server=computer_name
```
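In the simple examples above, the first definition assigns the value 'tecmint.com' to the variable 'computer_name'.
Likewise, the value 22 is assigned to the variable port_no. It is also possible to assign the value of one variable to another, as in the last example, where the value of computer_name is assigned to the variable server.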
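The difference between copying a variable and storing a literal string can be sketched as follows. The `BEGIN` block is used here only so that the program runs without an input file, and the variable `label` is hypothetical, added for contrast.

```shell
awk 'BEGIN {
    computer_name = "tecmint.com"
    server = computer_name      # unquoted: copies the value of computer_name
    label  = "computer_name"    # quoted: stores the literal text instead
    print server                # prints: tecmint.com
    print label                 # prints: computer_name
}'
```

Quoting the right-hand side is therefore the thing to check when a variable unexpectedly contains another variable's name.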
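As covered in [Part 2 of this series][2] on field editing, Awk splits input lines into fields and uses the standard field access operator, $, to read the parsed fields. We can also use variables to store field values, as follows.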
```
first_name=$2
second_name=$3
```
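In the examples above, first_name is set to the second field and second_name to the third field.
As an illustration, consider a file named names.txt, which contains a list of an application's users with their first and last names plus gender. It can be viewed with the [cat command][3]: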
```
$ cat names.txt
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/List-File-Content-Using-cat-Command.png)
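>List file content using the cat command
We can then store the first and second names of the first user in the list in the variables first_name and second_name by running the Awk command below: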
```
$ awk '/Aaron/{ first_name=$2 ; second_name=$3 ; print first_name, second_name ; }' names.txt
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/Store-Variables-Using-Awk-Command.png)
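>Store variables using the Awk command
Let's look at another case: when you run 'uname -a' in a terminal, it prints all your system information.
The second field contains your 'hostname', so we can store it in a variable called hostname and print it with Awk as follows: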
```
$ uname -a
$ uname -a | awk '{hostname=$2 ; print hostname ; }'
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/Store-Command-Output-to-Variable-Using-Awk.png)
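>Store command output in a variable using Awk
### 2. Numeric Expressions
In Awk, numeric expressions are built using the following numeric operators:
- `*` : multiplication operator
- `+` : addition operator
- `/` : division operator
- `-` : subtraction operator
- `%` : modulus operator
- `^` : exponentiation operator
The syntax for a numeric expression is: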
```
operand1 operator operand2
```
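In the form above, operand1 and operand2 can be numbers or variable names, and operator is any of the operators listed above.
Below are some examples that demonstrate how to build numeric expressions: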
```
counter=0
num1=5
num2=10
num3=num2-num1
counter=counter+1
```
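The operators listed above can be tried directly from the command line. This is a small sketch; the `BEGIN` block is used only so that no input file is needed, and the values are arbitrary.

```shell
awk 'BEGIN {
    num1 = 5
    num2 = 10
    print num2 - num1    # subtraction:    5
    print num2 * num1    # multiplication: 50
    print num2 / num1    # division:       2
    print num2 % 3       # modulus:        1
    print num1 ^ 2       # exponentiation: 25
}'
```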
To see numeric expressions in use, consider the example below, where the file domains.txt contains all the domains owned by Tecmint.
```
news.tecmint.com
tecmint.com
linuxsay.com
windows.tecmint.com
tecmint.com
news.tecmint.com
tecmint.com
linuxsay.com
tecmint.com
news.tecmint.com
tecmint.com
linuxsay.com
windows.tecmint.com
tecmint.com
```
To view the contents of the file, use the command below:
```
$ cat domains.txt
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/View-Contents-of-File.png)
>View contents of file
If we want to count the number of times the domain tecmint.com appears in the file, we can write a simple script to do it:
```
#!/bin/bash
for file in "$@"; do
    if [ -f "$file" ] ; then
        # print out the filename
        echo "File is: $file"
        # print a number incrementally for every line starting with tecmint.com
        awk '/^tecmint.com/ { counter=counter+1 ; printf "%s\n", counter ; }' "$file"
    else
        # print an error message in case the input is not a file
        echo "$file is not a file, please specify a file." >&2 && exit 1
    fi
done
# terminate the script with exit code 0 in case of successful execution
exit 0
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/Shell-Script-to-Count-a-String-in-File.png)
>Shell script to count a string or text in a file
After creating the script, save it and make it executable. When we run it with the file domains.txt as input, we get the following output:
```
$ ./script.sh ~/domains.txt
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/Script-To-Count-String.png)
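>Script to count a string or text
From the output of the script, there are 6 lines in domains.txt that start with tecmint.com; you can count them manually to confirm.
### 3. Assignment Operators
The last Awk feature we cover is assignment operators. Awk has several; the ones listed below are only some of them:
- `*=` : multiplication assignment operator
- `+=` : addition assignment operator
- `/=` : division assignment operator
- `-=` : subtraction assignment operator
- `%=` : modulus assignment operator
- `^=` : exponentiation assignment operator
The simplest form of an assignment operation in Awk is: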
```
variable_name=variable_name operator operand
```
Examples:
```
counter=0
counter=counter+1
num=20
num=num-1
```
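You can use the assignment operators above to shorten assignment operations in Awk. Taking the previous examples, we could write the assignments in the following form: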
```
variable_name operator=operand
counter=0
counter+=1
num=20
num-=1
```
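The shortened forms can be checked with a quick run. This is a sketch with arbitrary values; the variable `x` is hypothetical and exists only to chain the remaining operators.

```shell
awk 'BEGIN {
    counter = 0; counter += 1           # same as counter = counter + 1
    num = 20;    num -= 1               # same as num = num - 1
    x = 6; x *= 2; x /= 3; x %= 3; x ^= 2
    # x goes 6 -> 12 -> 4 -> 1 -> 1
    print counter, num, x               # prints: 1 19 1
}'
```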
Therefore, we can alter the Awk command in the shell script to use the += operator mentioned above:
```
#!/bin/bash
for file in "$@"; do
    if [ -f "$file" ] ; then
        # print out the filename
        echo "File is: $file"
        # print a number incrementally for every line starting with tecmint.com
        awk '/^tecmint.com/ { counter+=1 ; printf "%s\n", counter ; }' "$file"
    else
        # print an error message in case the input is not a file
        echo "$file is not a file, please specify a file." >&2 && exit 1
    fi
done
# terminate the script with exit code 0 in case of successful execution
exit 0
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/Alter-Shell-Script.png)
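>Altered shell script
In this part of the [Awk series][4], we covered some useful Awk features: variables, building numeric expressions, and using assignment operators, along with a few examples of how to use them.
These concepts are no different from those in other programming languages, though there may be some significant distinctions in Awk programming.
In Part 9 of this series we will look at more Awk features, namely the special patterns BEGIN and END. Until then, stay connected to Tecmint.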
--------------------------------------------------------------------------------
via: http://www.tecmint.com/learn-awk-variables-numeric-expressions-and-assignment-operators/
Author: [Aaron Kili][a]
Translator: [vim-kakali](https://github.com/vim-kakali)
Proofreader: [校对ID](https://github.com/校对ID)
This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/).
[a]: http://www.tecmint.com/author/aaronkili/
[1]: http://www.tecmint.com/category/awk-command/
[2]: http://www.tecmint.com/awk-print-fields-columns-with-space-separator/
[3]: http://www.tecmint.com/13-basic-cat-command-examples-in-linux/
[4]: http://www.tecmint.com/category/awk-command/

View File

@ -0,0 +1,166 @@
Awk Series: How to Use the Awk Special Patterns BEGIN and END
===============================================================
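In Part 8 of this awk series, we introduced some powerful awk features: variables, numeric expressions, and assignment operators.
In this part, we cover more awk features, namely the special patterns `BEGIN` and `END`.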
![](http://www.tecmint.com/wp-content/uploads/2016/07/Learn-Awk-Patterns-BEGIN-and-END.png)
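> Learn the awk patterns BEGIN and END
These special features will prove helpful as we expand and explore more ways of building complex awk operations.
To get started, let's go back to the introduction of this awk series. When we began, I pointed out that the general syntax of an awk command is: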
```
# awk 'script' filenames
```
In the syntax above, the awk script has the form:
```
/pattern/ { actions }
```
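When you consider the pattern in the script (`/pattern/`), it is normally a regular expression; in addition, the pattern can also be one of the special patterns `BEGIN` and `END`.
Therefore, we can also write an awk command in the form below: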
```
awk '
BEGIN { actions }
/pattern/ { actions }
/pattern/ { actions }
……….
END { actions }
' filenames
```
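When the special patterns `BEGIN` and `END` are used in an awk script, this is what each of them means:
- `BEGIN` pattern: awk executes the action(s) specified in `BEGIN` once, before any input lines are read.
- `END` pattern: awk executes the action(s) specified in `END` before it actually exits.
The flow of execution of an awk script that contains these special patterns is as follows:
- When the `BEGIN` pattern is used in a script, all the actions for `BEGIN` are executed once, before any input line is read.
- Then an input line is read and parsed into fields.
- Next, each of the non-special patterns is compared with the input line; when a match is found, the action(s) for that pattern are executed. This step is repeated for all the patterns you have specified.
- Next, steps 2 and 3 are repeated for all input lines.
- When all input lines have been read and processed, the action(s) for the `END` pattern, if you specified one, are executed.
Keep this sequence of execution in mind when working with the special patterns to get the best results from an awk operation.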
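The execution order described above can be observed with a three-line input. This is only a sketch: the input text and the variable `matched` are made up for illustration, and `NR` is awk's built-in count of records read so far.

```shell
printf 'alpha\nbeta\ngamma\n' | awk '
BEGIN   { print "BEGIN: before any input is read" }
/^b/    { matched += 1; print "matched: " $0 }
END     { print "END: " matched " of " NR " lines matched" }
'
```

The `BEGIN` message appears first, the match on `beta` is reported while the input is being read, and the `END` line (`END: 1 of 3 lines matched`) comes last.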
To understand it all, let's illustrate with the example from Part 8: the list of domains owned by Tecmint, stored in a file named domains.txt.
```
news.tecmint.com
tecmint.com
linuxsay.com
windows.tecmint.com
tecmint.com
news.tecmint.com
tecmint.com
linuxsay.com
tecmint.com
news.tecmint.com
tecmint.com
linuxsay.com
windows.tecmint.com
tecmint.com
```
```
$ cat ~/domains.txt
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/View-Contents-of-File.png)
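> View contents of file
In that example, we wanted to count the number of times the domain `tecmint.com` is listed in domains.txt. So we wrote a small shell script to do that, using the ideas of variables, numeric expressions, and assignment operators, with the following content: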
```
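#!/bin/bash
for file in "$@"; do
    if [ -f "$file" ] ; then
        # print out the filename
        echo "File is: $file"
        # print a number incrementally for every line starting with tecmint.com
        awk '/^tecmint.com/ { counter+=1 ; printf "%s\n", counter ; }' "$file"
    else
        # print an error message if the input is not a file
        echo "$file is not a file, please specify a file." >&2 && exit 1
    fi
done
# terminate the script with exit code 0 on successful execution
exit 0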
```
Let's now apply the two special patterns `BEGIN` and `END` to the awk command in the script above, as follows.
We will change the command:
```
awk '/^tecmint.com/ { counter+=1 ; printf "%s\n", counter ; }' $file
```
to:
```
awk ' BEGIN { print "The number of times tecmint.com appears in the file is:" ; }
/^tecmint.com/ { counter+=1 ; }
END { printf "%s\n", counter ; }
' "$file"
```
After making the changes to the awk command, the complete shell script now looks like this:
```
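#!/bin/bash
for file in "$@"; do
    if [ -f "$file" ] ; then
        # print out the filename
        echo "File is: $file"
        # print the total number of times tecmint.com appears in the file
        awk ' BEGIN { print "The number of times tecmint.com appears in the file is:" ; }
        /^tecmint.com/ { counter+=1 ; }
        END { printf "%s\n", counter ; }
        ' "$file"
    else
        # print an error message if the input is not a file
        echo "$file is not a file, please specify a file." >&2 && exit 1
    fi
done
# terminate the script with exit code 0 on successful execution
exit 0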
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/Awk-BEGIN-and-END-Patterns.png)
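> awk BEGIN and END patterns
When we run the script above, it first prints the location of the file domains.txt, then runs the awk script, where the special pattern `BEGIN` prints the message "`The number of times tecmint.com appears in the file is:`" before any input line is read from the file.
Next, our pattern `/^tecmint.com/` is compared against every input line, and the action `{ counter+=1 ; }` is executed for each matching line, counting the number of times `tecmint.com` appears in the file.
Finally, the `END` pattern prints the total number of times the domain `tecmint.com` appears in the file.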
```
$ ./script.sh ~/domains.txt
```
![](http://www.tecmint.com/wp-content/uploads/2016/07/Script-to-Count-Number-of-Times-String-Appears.png)
> Script to count the number of times a string appears
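One detail worth noting: if no line matches, `counter` is never assigned, so `printf "%s\n", counter` prints an empty line instead of `0`. The following is a sketch of one way around that, initialising the counter in `BEGIN`. As an extra assumption beyond the script above, it also escapes the dot and anchors the end of the pattern so that only lines that are exactly `tecmint.com` are counted; the function name `count_domain` is hypothetical.

```shell
# Hypothetical helper: print how many lines of file $1 are exactly "tecmint.com".
count_domain() {
    awk '
    BEGIN              { counter = 0 }     # start from a numeric zero
    /^tecmint\.com$/   { counter += 1 }    # "\." matches a literal dot only
    END                { print counter }
    ' "$1"
}
# Usage: count_domain ~/domains.txt
```

With the domains.txt shown earlier this prints `6`, and with a file that has no matching line it prints `0`.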
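To conclude, we walked through more awk features in this part and explored the concepts of the special patterns `BEGIN` and `END`.
As I pointed out before, these awk features will help us build more complex text filtering operations. There is more to cover: in Part 10 we will approach the idea of awk built-in variables, so stay connected.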
--------------------------------------------------------------------------------
via: http://www.tecmint.com/learn-use-awk-special-patterns-begin-and-end/
Author: [Aaron Kili][a]
Translator: [ChrisLeeGit](https://github.com/chrisleegit)
Proofreader: [校对ID](https://github.com/校对ID)
This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/).
[a]: http://www.tecmint.com/author/aaronkili/