From 6f87757776f5c00d02a5470f3a00f826a5ec074d Mon Sep 17 00:00:00 2001
From: chenjintao_ii
Date: Sat, 19 Oct 2013 23:17:38 +0800
Subject: [PATCH 01/10] translated 4%
---
 translated/NoSQL comparison.md | 402 +++++++++++++++++++++++++++++++++
 1 file changed, 402 insertions(+)
 create mode 100644 translated/NoSQL comparison.md
diff --git a/translated/NoSQL comparison.md b/translated/NoSQL comparison.md
new file mode 100644
index 0000000000..9ded177e78
--- /dev/null
+++ b/translated/NoSQL comparison.md
@@ -0,0 +1,402 @@
+各种NoSQL的比较
+================
+
+即使关系型数据库依然是非常有用的工具,它们持续几十年的垄断地位就要走到头了。现在已经存在无数能撼动关系型数据库地位的NoSQL,当然,这些NoSQL还无法完全取代它们。(也就是说,关系型数据库还是处理关系型事务的最佳方式。)
+
+NoSQL与NoSQL之间的区别,要远大于SQL与SQL之间的区别。所以软件架构师必须要在项目一开始就选好一款合适的NoSQL。
+
+考虑到这种情况,本文为大家介绍以下几种NoSQL之间的区别:[Cassandra][], [Mongodb][], [CouchDB][], [Redis][], [Riak][], [Couchbase (ex-Membase)][], [Hypertable][], [ElasticSearch][], [Accumulo][], [VoltDB][], [Kyoto Tycoon][], [Scalaris][], [Neo4j][]和[HBase][]:
+
+##最流行的NoSQL
+
+###MongoDB (2.2)
+
+**编程语言:** C++
+
+**要点:** 保留SQL中一些用户友好的特性(查询、索引等)。
+
+**许可证:** AGPL (发起者: Apache)
+
+**支持的数据结构:** 自定义,二进制(BSON)
+
+- 主/从备份(支持自动故障切换功能)
+- 自带数据分片功能
+- 通过javascript表达式提供数据查询
+- 服务器端完全支持javascript脚本
+- 比CouchDB更好的升级功能
+- 数据存储使用内存映射文件技术
+- 功能丰富,性能不俗
+- 最好开启日志功能(使用--journal参数)
+- 在32位系统中,内存限制在2.5GB
+- 空数据库占用192MB空间
+- 使用GridFS(不是真正的文件系统)来保存大数据和元数据
+- 支持对数据建立索引
+- 数据中心意识
+
+**应用场景:**动态查询;需要定义索引而不是map/reduce功能;提高大数据库性能;想使用CouchDB但数据IO量太大,CouchDB无法满足要求。
+
+**For example:** For most things that you would do with MySQL or PostgreSQL, but having predefined columns really holds you back.
+**案例:**想布署MySQL或PostgreSQL,但它们存在的预定义处理语句和预定义变量让你望而却步。
+
+###Riak (V1.2)
+
+**Written in:** Erlang & C, some JavaScript
+
+**Main point:** Fault tolerance
+
+**License:** Apache
+
+**Protocol:** HTTP/REST or custom binary
+
+- Stores blobs
+- Tunable trade-offs for distribution and replication
+- Pre- and post-commit hooks in JavaScript or Erlang, for validation and security.
+- Map/reduce in JavaScript or Erlang
+- Links & link walking: use it as a graph database
+- Secondary indices: but only one at once
+- Large object support (Luwak)
+- Comes in "open source" and "enterprise" editions
+- Full-text search, indexing, querying with Riak Search
+- In the process of migrating the storing backend from "Bitcask" to Google's "LevelDB"
+- Masterless multi-site replication replication and SNMP monitoring are commercially licensed
+
+**Best used:** If you want something Dynamo-like data storage, but no way you're gonna deal with the bloat and complexity. If you need very good single-site scalability, availability and fault-tolerance, but you're ready to pay for multi-site replication.
+
+**For example:** Point-of-sales data collection. Factory control systems. Places where even seconds of downtime hurt. Could be used as a well-update-able web server.
+
+###CouchDB (V1.2)
+
+**Written in:** Erlang
+
+**Main point:** DB consistency, ease of use
+
+**License:** Apache
+
+**Protocol:** HTTP/REST
+
+- Bi-directional (!) replication,
+- continuous or ad-hoc,
+- with conflict detection,
+- thus, master-master replication. (!)
+- MVCC - write operations do not block reads
+- Previous versions of documents are available
+- Crash-only (reliable) design
+- Needs compacting from time to time
+- Views: embedded map/reduce
+- Formatting views: lists & shows
+- Server-side document validation possible
+- Authentication possible
+- Real-time updates via '_changes' (!)
+- Attachment handling
+- thus, CouchApps (standalone js apps)
+
+**Best used:** For accumulating, occasionally changing data, on which pre-defined queries are to be run. Places where versioning is important.
+
+**For example:** CRM, CMS systems. Master-master replication is an especially interesting feature, allowing easy multi-site deployments.
+
+###Redis (V2.4)
+
+**Written in:** C/C++
+
+**Main point:** Blazing fast
+
+**License:** BSD
+
+**Protocol:** Telnet-like
+
+- Disk-backed in-memory database,
+- Currently without disk-swap (VM and Diskstore were abandoned)
+- Master-slave replication
+- Simple values or hash tables by keys,
+- but complex operations like ZREVRANGEBYSCORE.
+- INCR & co (good for rate limiting or statistics)
+- Has sets (also union/diff/inter)
+- Has lists (also a queue; blocking pop)
+- Has hashes (objects of multiple fields)
+- Sorted sets (high score table, good for range queries)
+- Redis has transactions (!)
+- Values can be set to expire (as in a cache)
+- Pub/Sub lets one implement messaging (!)
+
+**Best used:** For rapidly changing data with a foreseeable database size (should fit mostly in memory).
+
+**For example:** Stock prices. Analytics. Real-time data collection. Real-time communication. And wherever you used memcached before.
+
+##Clones of Google's Bigtable
+
+###HBase (V0.92.0)
+
+**Written in:** Java
+
+**Main point:** Billions of rows X millions of columns
+
+**License:** Apache
+
+**Protocol:** HTTP/REST (also Thrift)
+
+- Modeled after Google's BigTable
+- Uses Hadoop's HDFS as storage
+- Map/reduce with Hadoop
+- Query predicate push down via server side scan and get filters
+- Optimizations for real time queries
+- A high performance Thrift gateway
+- HTTP supports XML, Protobuf, and binary
+- Jruby-based (JIRB) shell
+- Rolling restart for configuration changes and minor upgrades
+- Random access performance is like MySQL
+- A cluster consists of several different types of nodes
+
+**Best used:** Hadoop is probably still the best way to run Map/Reduce jobs on huge datasets. Best if you use the Hadoop/HDFS stack already.
+
+**For example:** Search engines. Analysing log data. Any place where scanning huge, two-dimensional join-less tables are a requirement.
+
+###Cassandra (1.2)
+
+**Written in:** Java
+
+**Main point:** Best of BigTable and Dynamo
+
+**License:** Apache
+
+**Protocol:** Thrift & custom binary CQL3
+
+- Tunable trade-offs for distribution and replication (N, R, W)
+- Querying by column, range of keys (Requires indices on anything that you want to search on)
+- BigTable-like features: columns, column families
+- Can be used as a distributed hash-table, with an "SQL-like" language, CQL (but no JOIN!)
+- Data can have expiration (set on INSERT)
+- Writes can be much faster than reads (when reads are disk-bound)
+- Map/reduce possible with Apache Hadoop
+- All nodes are similar, as opposed to Hadoop/HBase
+- Very good and reliable cross-datacenter replication
+
+**Best used:** When you write more than you read (logging). If every component of the system must be in Java. ("No one gets fired for choosing Apache's stuff.")
+
+**For example:** Banking, financial industry (though not necessarily for financial transactions, but these industries are much bigger than that.) Writes are faster than reads, so one natural niche is data analysis.
+
+###Hypertable (0.9.6.5)
+
+**Written in:** C++
+
+**Main point:** A faster, smaller HBase
+
+**License:** GPL 2.0
+
+**Protocol:** Thrift, C++ library, or HQL shell
+
+- Implements Google's BigTable design
+- Run on Hadoop's HDFS
+- Uses its own, "SQL-like" language, HQL
+- Can search by key, by cell, or for values in column families.
+- Search can be limited to key/column ranges.
+- Sponsored by Baidu
+- Retains the last N historical values
+- Tables are in namespaces
+- Map/reduce with Hadoop
+
+**Best used:** If you need a better HBase.
+
+**For example:** Same as HBase, since it's basically a replacement: Search engines. Analysing log data. Any place where scanning huge, two-dimensional join-less tables are a requirement.
+
+###Accumulo (1.4)
+
+**Written in:** Java and C++
+
+**Main point:** A BigTable with Cell-level security
+
+**License:** Apache
+
+**Protocol:** Thrift
+
+- Another BigTable clone, also runs of top of Hadoop
+- Cell-level security
+- Bigger rows than memory are allowed
+- Keeps a memory map outside Java, in C++ STL
+- Map/reduce using Hadoop's facitlities (ZooKeeper & co)
+- Some server-side programming
+
+**Best used:** If you need a different HBase.
+
+**For example:** Same as HBase, since it's basically a replacement: Search engines. Analysing log data. Any place where scanning huge, two-dimensional join-less tables are a requirement.
+
+##Special-purpose
+
+###Neo4j (V1.5M02)
+
+**Written in:** Java
+
+**Main point:** Graph database - connected data
+
+**License:** GPL, some features AGPL/commercial
+
+**Protocol:** HTTP/REST (or embedding in Java)
+
+- Standalone, or embeddable into Java applications
+- Full ACID conformity (including durable data)
+- Both nodes and relationships can have metadata
+- Integrated pattern-matching-based query language ("Cypher")
+- Also the "Gremlin" graph traversal language can be used
+- Indexing of nodes and relationships
+- Nice self-contained web admin
+- Advanced path-finding with multiple algorithms
+- Indexing of keys and relationships
+- Optimized for reads
+- Has transactions (in the Java API)
+- Scriptable in Groovy
+- Online backup, advanced monitoring and High Availability is AGPL/commercial licensed
+
+**Best used:** For graph-style, rich or complex, interconnected data. Neo4j is quite different from the others in this sense.
+
+**For example:** For searching routes in social relations, public transport links, road maps, or network topologies.
+
+###ElasticSearch (0.20.1)
+
+**Written in:** Java
+
+**Main point:** Advanced Search
+
+**License:** Apache
+
+**Protocol:** JSON over HTTP (Plugins: Thrift, memcached)
+
+- Stores JSON documents
+- Has versioning
+- Parent and children documents
+- Documents can time out
+- Very versatile and sophisticated querying, scriptable
+- Write consistency: one, quorum or all
+- Sorting by score (!)
+- Geo distance sorting
+- Fuzzy searches (approximate date, etc) (!)
+- Asynchronous replication
+- Atomic, scripted updates (good for counters, etc)
+- Can maintain automatic "stats groups" (good for debugging)
+- Still depends very much on only one developer (kimchy).
+
+**Best used:** When you have objects with (flexible) fields, and you need "advanced search" functionality.
+
+**For example:** A dating service that handles age difference, geographic location, tastes and dislikes, etc. Or a leaderboard system that depends on many variables.
+
+##The "long tail"
+
+(Not widely known, but definitely worthy ones)
+
+###Couchbase (ex-Membase) (2.0)
+
+**Written in:** Erlang & C
+
+**Main point:** Memcache compatible, but with persistence and clustering
+
+**License:** Apache
+
+**Protocol:** memcached + extensions
+
+- Very fast (200k+/sec) access of data by key
+- Persistence to disk
+- All nodes are identical (master-master replication)
+- Provides memcached-style in-memory caching buckets, too
+- Write de-duplication to reduce IO
+- Friendly cluster-management web GUI
+- Connection proxy for connection pooling and multiplexing (Moxi)
+- Incremental map/reduce
+- Cross-datacenter replication
+
+**Best used:** Any application where low-latency data access, high concurrency support and high availability is a requirement.
+
+**For example:** Low-latency use-cases like ad targeting or highly-concurrent web apps like online gaming (e.g. Zynga).
+
+###VoltDB (2.8.4.1)
+
+**Written in:** Java
+
+**Main point:** Fast transactions and rapidly changing data
+
+**License:** GPL 3
+
+**Protocol:** Proprietary
+
+- In-memory relational database.
+- Can export data into Hadoop
+- Supports ANSI SQL
+- Stored procedures in Java
+- Cross-datacenter replication
+
+**Best used:** Where you need to act fast on massive amounts of incoming data.
+
+**For example:** Point-of-sales data analysis. Factory control systems.
+
+###Scalaris (0.5)
+
+**Written in:** Erlang
+
+**Main point:** Distributed P2P key-value store
+
+**License:** Apache
+
+**Protocol:** Proprietary & JSON-RPC
+
+- In-memory (disk when using Tokyo Cabinet as a backend)
+- Uses YAWS as a web server
+- Has transactions (an adapted Paxos commit)
+- Consistent, distributed write operations
+- From CAP, values Consistency over Availability (in case of network partitioning, only the bigger partition - works)
+
+**Best used:** If you like Erlang and wanted to use Mnesia or DETS or ETS, but you need something that is accessible from more languages (and scales much better than ETS or DETS).
+
+**For example:** In an Erlang-based system when you want to give access to the DB to Python, Ruby or Java programmers.
+
+###Kyoto Tycoon (0.9.56)
+
+**Written in:** C++
+
+**Main point:** A lightweight network DBM
+
+**License:** GPL
+
+**Protocol:** HTTP (TSV-RPC or REST)
+
+- Based on Kyoto Cabinet, Tokyo Cabinet's successor
+- Multitudes of storage backends: Hash, Tree, Dir, etc (everything from Kyoto Cabinet)
+- Kyoto Cabinet can do 1M+ insert/select operations per sec (but Tycoon does less because of overhead)
+- Lua on the server side
+- Language bindings for C, Java, Python, Ruby, Perl, Lua, etc
+- Uses the "visitor" pattern
+- Hot backup, asynchronous replication
+- background snapshot of in-memory databases
+- Auto expiration (can be used as a cache server)
+
+**Best used:** When you want to choose the backend storage algorithm engine very precisely. When speed is of the essence.
+
+**For example:** Caching server. Stock prices. Analytics. Real-time data collection. Real-time communication. And wherever you used memcached before.
+
+Of course, all these systems have much more features than what's listed here. I only wanted to list the key points that I base my decisions on. Also, development of all are very fast, so things are bound to change.
+
+P.s.: And no, there's no date on this review. There are version numbers, since I update the databases one by one, not at the same time. And believe me, the basic properties of databases don't change that much.
+
+---
+
+via: http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
+
+本文由 [LCTT][] 原创翻译,[Linux中国][] 荣誉推出
+
+译者:[译者ID][] 校对:[校对者ID][]
+
+[LCTT]:https://github.com/LCTT/TranslateProject
+[Linux中国]:http://linux.cn/portal.php
+[chenjintao]:http://linux.cn/space/chenjintao
+[校对者ID]:http://linux.cn/space/校对者ID
+
+[Cassandra]:http://cassandra.apache.org/
+[Mongodb]:http://www.mongodb.org/
+[CouchDB]:http://couchdb.apache.org/
+[Redis]:http://redis.io/
+[Riak]:http://basho.com/riak/
+[Couchbase (ex-Membase)]:http://www.couchbase.org/membase
+[Hypertable]:http://hypertable.org/
+[ElasticSearch]:http://www.elasticsearch.org/
+[Accumulo]:http://accumulo.apache.org/
+[VoltDB]:http://voltdb.com/
+[Kyoto Tycoon]:http://fallabs.com/kyototycoon/
+[Scalaris]:https://code.google.com/p/scalaris/
+[Neo4j]:http://neo4j.org/
+[HBase]:http://hbase.apache.org/

From 7901b15e50fa1b6aba3d655d940e04d7a4499704 Mon Sep 17 00:00:00 2001
From: chenjintao_ii
Date: Wed, 23 Oct 2013 21:55:01 +0800
Subject: [PATCH 02/10] complete 30%
---
 sources/NoSQL comparison.md | 403 ---------------------------------
 translated/NoSQL comparison.md | 157 +++++++------
 2 files changed, 78 insertions(+), 482 deletions(-)
 delete mode 100644 sources/NoSQL comparison.md
diff --git a/sources/NoSQL comparison.md b/sources/NoSQL comparison.md
deleted file mode 100644
index 3f114372a7..0000000000
--- a/sources/NoSQL comparison.md
+++ /dev/null
@@ -1,403 +0,0 @@
-[这篇我领了]
-NoSQL comparison
-================
-
-While SQL databases are insanely useful tools, their monopoly in the last decades is coming to an end. And it's just time: I can't even count the things that were forced into relational databases, but never really fitted them. (That being said, relational databases will always be the best for the stuff that has relations.)
-
-But, the differences between NoSQL databases are much bigger than ever was between one SQL database and another. This means that it is a bigger responsibility on software architects to choose the appropriate one for a project right at the beginning.
-
-In this light, here is a comparison of [Cassandra][], [Mongodb][], [CouchDB][], [Redis][], [Riak][], [Couchbase (ex-Membase)][], [Hypertable][], [ElasticSearch][], [Accumulo][], [VoltDB][], [Kyoto Tycoon][], [Scalaris][], [Neo4j][] and [HBase][]:
-
-##The most popular ones
-
-###MongoDB (2.2)
-
-**Written in:** C++
-
-**Main point:** Retains some friendly properties of SQL. (Query, index)
-
-**License:** AGPL (Drivers: Apache)
-
-**Protocol:** Custom, binary (BSON)
-
-- Master/slave replication (auto failover with replica sets)
-- Sharding built-in
-- Queries are javascript expressions
-- Run arbitrary javascript functions server-side
-- Better update-in-place than CouchDB
-- Uses memory mapped files for data storage
-- Performance over features
-- Journaling (with --journal) is best turned on
-- On 32bit systems, limited to ~2.5Gb
-- An empty database takes up 192Mb
-- GridFS to store big data + metadata (not actually an FS)
-- Has geospatial indexing
-- Data center aware
-
-**Best used:** If you need dynamic queries. If you prefer to define indexes, not map/reduce functions. If you need good performance on a big DB. If you wanted CouchDB, but your data changes too much, filling up disks.
-
-**For example:** For most things that you would do with MySQL or PostgreSQL, but having predefined columns really holds you back.
-
-
-###Riak (V1.2)
-
-**Written in:** Erlang & C, some JavaScript
-
-**Main point:** Fault tolerance
-
-**License:** Apache
-
-**Protocol:** HTTP/REST or custom binary
-
-- Stores blobs
-- Tunable trade-offs for distribution and replication
-- Pre- and post-commit hooks in JavaScript or Erlang, for validation and security.
-- Map/reduce in JavaScript or Erlang
-- Links & link walking: use it as a graph database
-- Secondary indices: but only one at once
-- Large object support (Luwak)
-- Comes in "open source" and "enterprise" editions
-- Full-text search, indexing, querying with Riak Search
-- In the process of migrating the storing backend from "Bitcask" to Google's "LevelDB"
-- Masterless multi-site replication replication and SNMP monitoring are commercially licensed
-
-**Best used:** If you want something Dynamo-like data storage, but no way you're gonna deal with the bloat and complexity. If you need very good single-site scalability, availability and fault-tolerance, but you're ready to pay for multi-site replication.
-
-**For example:** Point-of-sales data collection. Factory control systems. Places where even seconds of downtime hurt. Could be used as a well-update-able web server.
-
-###CouchDB (V1.2)
-
-**Written in:** Erlang
-
-**Main point:** DB consistency, ease of use
-
-**License:** Apache
-
-**Protocol:** HTTP/REST
-
-- Bi-directional (!) replication,
-- continuous or ad-hoc,
-- with conflict detection,
-- thus, master-master replication. (!)
-- MVCC - write operations do not block reads
-- Previous versions of documents are available
-- Crash-only (reliable) design
-- Needs compacting from time to time
-- Views: embedded map/reduce
-- Formatting views: lists & shows
-- Server-side document validation possible
-- Authentication possible
-- Real-time updates via '_changes' (!)
-- Attachment handling
-- thus, CouchApps (standalone js apps)
-
-**Best used:** For accumulating, occasionally changing data, on which pre-defined queries are to be run. Places where versioning is important.
-
-**For example:** CRM, CMS systems. Master-master replication is an especially interesting feature, allowing easy multi-site deployments.
-
-###Redis (V2.4)
-
-**Written in:** C/C++
-
-**Main point:** Blazing fast
-
-**License:** BSD
-
-**Protocol:** Telnet-like
-
-- Disk-backed in-memory database,
-- Currently without disk-swap (VM and Diskstore were abandoned)
-- Master-slave replication
-- Simple values or hash tables by keys,
-- but complex operations like ZREVRANGEBYSCORE.
-- INCR & co (good for rate limiting or statistics)
-- Has sets (also union/diff/inter)
-- Has lists (also a queue; blocking pop)
-- Has hashes (objects of multiple fields)
-- Sorted sets (high score table, good for range queries)
-- Redis has transactions (!)
-- Values can be set to expire (as in a cache)
-- Pub/Sub lets one implement messaging (!)
-
-**Best used:** For rapidly changing data with a foreseeable database size (should fit mostly in memory).
-
-**For example:** Stock prices. Analytics. Real-time data collection. Real-time communication. And wherever you used memcached before.
-
-##Clones of Google's Bigtable
-
-###HBase (V0.92.0)
-
-**Written in:** Java
-
-**Main point:** Billions of rows X millions of columns
-
-**License:** Apache
-
-**Protocol:** HTTP/REST (also Thrift)
-
-- Modeled after Google's BigTable
-- Uses Hadoop's HDFS as storage
-- Map/reduce with Hadoop
-- Query predicate push down via server side scan and get filters
-- Optimizations for real time queries
-- A high performance Thrift gateway
-- HTTP supports XML, Protobuf, and binary
-- Jruby-based (JIRB) shell
-- Rolling restart for configuration changes and minor upgrades
-- Random access performance is like MySQL
-- A cluster consists of several different types of nodes
-
-**Best used:** Hadoop is probably still the best way to run Map/Reduce jobs on huge datasets. Best if you use the Hadoop/HDFS stack already.
-
-**For example:** Search engines. Analysing log data. Any place where scanning huge, two-dimensional join-less tables are a requirement.
-
-###Cassandra (1.2)
-
-**Written in:** Java
-
-**Main point:** Best of BigTable and Dynamo
-
-**License:** Apache
-
-**Protocol:** Thrift & custom binary CQL3
-
-- Tunable trade-offs for distribution and replication (N, R, W)
-- Querying by column, range of keys (Requires indices on anything that you want to search on)
-- BigTable-like features: columns, column families
-- Can be used as a distributed hash-table, with an "SQL-like" language, CQL (but no JOIN!)
-- Data can have expiration (set on INSERT)
-- Writes can be much faster than reads (when reads are disk-bound)
-- Map/reduce possible with Apache Hadoop
-- All nodes are similar, as opposed to Hadoop/HBase
-- Very good and reliable cross-datacenter replication
-
-**Best used:** When you write more than you read (logging). If every component of the system must be in Java. ("No one gets fired for choosing Apache's stuff.")
-
-**For example:** Banking, financial industry (though not necessarily for financial transactions, but these industries are much bigger than that.) Writes are faster than reads, so one natural niche is data analysis.
-
-###Hypertable (0.9.6.5)
-
-**Written in:** C++
-
-**Main point:** A faster, smaller HBase
-
-**License:** GPL 2.0
-
-**Protocol:** Thrift, C++ library, or HQL shell
-
-- Implements Google's BigTable design
-- Run on Hadoop's HDFS
-- Uses its own, "SQL-like" language, HQL
-- Can search by key, by cell, or for values in column families.
-- Search can be limited to key/column ranges.
-- Sponsored by Baidu
-- Retains the last N historical values
-- Tables are in namespaces
-- Map/reduce with Hadoop
-
-**Best used:** If you need a better HBase.
-
-**For example:** Same as HBase, since it's basically a replacement: Search engines. Analysing log data. Any place where scanning huge, two-dimensional join-less tables are a requirement.
-
-###Accumulo (1.4)
-
-**Written in:** Java and C++
-
-**Main point:** A BigTable with Cell-level security
-
-**License:** Apache
-
-**Protocol:** Thrift
-
-- Another BigTable clone, also runs of top of Hadoop
-- Cell-level security
-- Bigger rows than memory are allowed
-- Keeps a memory map outside Java, in C++ STL
-- Map/reduce using Hadoop's facitlities (ZooKeeper & co)
-- Some server-side programming
-
-**Best used:** If you need a different HBase.
-
-**For example:** Same as HBase, since it's basically a replacement: Search engines. Analysing log data. Any place where scanning huge, two-dimensional join-less tables are a requirement.
-
-##Special-purpose
-
-###Neo4j (V1.5M02)
-
-**Written in:** Java
-
-**Main point:** Graph database - connected data
-
-**License:** GPL, some features AGPL/commercial
-
-**Protocol:** HTTP/REST (or embedding in Java)
-
-- Standalone, or embeddable into Java applications
-- Full ACID conformity (including durable data)
-- Both nodes and relationships can have metadata
-- Integrated pattern-matching-based query language ("Cypher")
-- Also the "Gremlin" graph traversal language can be used
-- Indexing of nodes and relationships
-- Nice self-contained web admin
-- Advanced path-finding with multiple algorithms
-- Indexing of keys and relationships
-- Optimized for reads
-- Has transactions (in the Java API)
-- Scriptable in Groovy
-- Online backup, advanced monitoring and High Availability is AGPL/commercial licensed
-
-**Best used:** For graph-style, rich or complex, interconnected data. Neo4j is quite different from the others in this sense.
-
-**For example:** For searching routes in social relations, public transport links, road maps, or network topologies.
-
-###ElasticSearch (0.20.1)
-
-**Written in:** Java
-
-**Main point:** Advanced Search
-
-**License:** Apache
-
-**Protocol:** JSON over HTTP (Plugins: Thrift, memcached)
-
-- Stores JSON documents
-- Has versioning
-- Parent and children documents
-- Documents can time out
-- Very versatile and sophisticated querying, scriptable
-- Write consistency: one, quorum or all
-- Sorting by score (!)
-- Geo distance sorting
-- Fuzzy searches (approximate date, etc) (!)
-- Asynchronous replication
-- Atomic, scripted updates (good for counters, etc)
-- Can maintain automatic "stats groups" (good for debugging)
-- Still depends very much on only one developer (kimchy).
-
-**Best used:** When you have objects with (flexible) fields, and you need "advanced search" functionality.
-
-**For example:** A dating service that handles age difference, geographic location, tastes and dislikes, etc. Or a leaderboard system that depends on many variables.
-
-##The "long tail"
-
-(Not widely known, but definitely worthy ones)
-
-###Couchbase (ex-Membase) (2.0)
-
-**Written in:** Erlang & C
-
-**Main point:** Memcache compatible, but with persistence and clustering
-
-**License:** Apache
-
-**Protocol:** memcached + extensions
-
-- Very fast (200k+/sec) access of data by key
-- Persistence to disk
-- All nodes are identical (master-master replication)
-- Provides memcached-style in-memory caching buckets, too
-- Write de-duplication to reduce IO
-- Friendly cluster-management web GUI
-- Connection proxy for connection pooling and multiplexing (Moxi)
-- Incremental map/reduce
-- Cross-datacenter replication
-
-**Best used:** Any application where low-latency data access, high concurrency support and high availability is a requirement.
-
-**For example:** Low-latency use-cases like ad targeting or highly-concurrent web apps like online gaming (e.g. Zynga).
-
-###VoltDB (2.8.4.1)
-
-**Written in:** Java
-
-**Main point:** Fast transactions and rapidly changing data
-
-**License:** GPL 3
-
-**Protocol:** Proprietary
-
-- In-memory relational database.
-- Can export data into Hadoop
-- Supports ANSI SQL
-- Stored procedures in Java
-- Cross-datacenter replication
-
-**Best used:** Where you need to act fast on massive amounts of incoming data.
-
-**For example:** Point-of-sales data analysis. Factory control systems.
-
-###Scalaris (0.5)
-
-**Written in:** Erlang
-
-**Main point:** Distributed P2P key-value store
-
-**License:** Apache
-
-**Protocol:** Proprietary & JSON-RPC
-
-- In-memory (disk when using Tokyo Cabinet as a backend)
-- Uses YAWS as a web server
-- Has transactions (an adapted Paxos commit)
-- Consistent, distributed write operations
-- From CAP, values Consistency over Availability (in case of network partitioning, only the bigger partition - works)
-
-**Best used:** If you like Erlang and wanted to use Mnesia or DETS or ETS, but you need something that is accessible from more languages (and scales much better than ETS or DETS).
-
-**For example:** In an Erlang-based system when you want to give access to the DB to Python, Ruby or Java programmers.
-
-###Kyoto Tycoon (0.9.56)
-
-**Written in:** C++
-
-**Main point:** A lightweight network DBM
-
-**License:** GPL
-
-**Protocol:** HTTP (TSV-RPC or REST)
-
-- Based on Kyoto Cabinet, Tokyo Cabinet's successor
-- Multitudes of storage backends: Hash, Tree, Dir, etc (everything from Kyoto Cabinet)
-- Kyoto Cabinet can do 1M+ insert/select operations per sec (but Tycoon does less because of overhead)
-- Lua on the server side
-- Language bindings for C, Java, Python, Ruby, Perl, Lua, etc
-- Uses the "visitor" pattern
-- Hot backup, asynchronous replication
-- background snapshot of in-memory databases
-- Auto expiration (can be used as a cache server)
-
-**Best used:** When you want to choose the backend storage algorithm engine very precisely. When speed is of the essence.
-
-**For example:** Caching server. Stock prices. Analytics. Real-time data collection. Real-time communication. And wherever you used memcached before.
-
-Of course, all these systems have much more features than what's listed here. I only wanted to list the key points that I base my decisions on. Also, development of all are very fast, so things are bound to change.
-
-P.s.: And no, there's no date on this review. There are version numbers, since I update the databases one by one, not at the same time. And believe me, the basic properties of databases don't change that much.
-
----
-
-via: http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
-
-本文由 [LCTT][] 原创翻译,[Linux中国][] 荣誉推出
-
-译者:[译者ID][] 校对:[校对者ID][]
-
-[LCTT]:https://github.com/LCTT/TranslateProject
-[Linux中国]:http://linux.cn/portal.php
-[译者ID]:http://linux.cn/space/译者ID
-[校对者ID]:http://linux.cn/space/校对者ID
-
-[Cassandra]:http://cassandra.apache.org/
-[Mongodb]:http://www.mongodb.org/
-[CouchDB]:http://couchdb.apache.org/
-[Redis]:http://redis.io/
-[Riak]:http://basho.com/riak/
-[Couchbase (ex-Membase)]:http://www.couchbase.org/membase
-[Hypertable]:http://hypertable.org/
-[ElasticSearch]:http://www.elasticsearch.org/
-[Accumulo]:http://accumulo.apache.org/
-[VoltDB]:http://voltdb.com/
-[Kyoto Tycoon]:http://fallabs.com/kyototycoon/
-[Scalaris]:https://code.google.com/p/scalaris/
-[Neo4j]:http://neo4j.org/
-[HBase]:http://hbase.apache.org/
diff --git a/translated/NoSQL comparison.md b/translated/NoSQL comparison.md
index 9ded177e78..cdd576ea02 100644
--- a/translated/NoSQL comparison.md
+++ b/translated/NoSQL comparison.md
@@ -1,126 +1,125 @@
-各种NoSQL的比较
+各种 NoSQL 的比较 TODO: 中英文之间需要半角空格
 ================
-即使关系型数据库依然是非常有用的工具,它们持续几十年的垄断地位就要走到头了。现在已经存在无数能撼动关系型数据库地位的NoSQL,当然,这些NoSQL还无法完全取代它们。(也就是说,关系型数据库还是处理关系型事务的最佳方式。)
+即使关系型数据库依然是非常有用的工具,它们持续几十年的垄断地位就要走到头了。现在已经存在无数能撼动关系型数据库地位的 NoSQL,当然,这些 NoSQL 还无法完全取代它们。(也就是说,关系型数据库还是处理关系型事务的最佳方式。)
-NoSQL与NoSQL之间的区别,要远大于SQL与SQL之间的区别。所以软件架构师必须要在项目一开始就选好一款合适的NoSQL。
+NoSQL 与 NoSQL 之间的区别,要远大于 SQL 与 SQL 之间的区别。所以软件架构师必须要在项目一开始就选好一款合适的 NoSQL。
-考虑到这种情况,本文为大家介绍以下几种NoSQL之间的区别:[Cassandra][], [Mongodb][], [CouchDB][], [Redis][], [Riak][], [Couchbase (ex-Membase)][], [Hypertable][], [ElasticSearch][], [Accumulo][], [VoltDB][], [Kyoto Tycoon][], [Scalaris][], [Neo4j][]和[HBase][]:
+考虑到这种情况,本文为大家介绍以下几种 NoSQL 之间的区别:[Cassandra][], [Mongodb][], [CouchDB][], [Redis][], [Riak][], [Couchbase (ex-Membase)][], [Hypertable][], [ElasticSearch][], [Accumulo][], [VoltDB][], [Kyoto Tycoon][], [Scalaris][], [Neo4j][]和[HBase][]:
-##最流行的NoSQL
+##最流行的 NoSQL
-###MongoDB (2.2)
+###MongoDB 2.2版
-**编程语言:** C++
+**开发语言:** C++
-**要点:** 保留SQL中一些用户友好的特性(查询、索引等)。
+**主要特性:** 保留 SQL 中一些用户友好的特性(查询、索引等)。
 **许可证:** AGPL (发起者: Apache)
-**支持的数据结构:** 自定义,二进制(BSON)
+**数据传输、存储的格式:** 自定义,二进制( BSON 文档格式)
 - 主/从备份(支持自动故障切换功能)
 - 自带数据分片功能
-- 通过javascript表达式提供数据查询
-- 服务器端完全支持javascript脚本
-- 比CouchDB更好的升级功能
+- 通过 javascript 表达式提供数据查询
+- 服务器端完全支持 javascript 脚本
+- 比 CouchDB 更好的升级功能
 - 数据存储使用内存映射文件技术
 - 功能丰富,性能不俗
-- 最好开启日志功能(使用--journal参数)
-- 在32位系统中,内存限制在2.5GB
-- 空数据库占用192MB空间
-- 使用GridFS(不是真正的文件系统)来保存大数据和元数据
+- 最好开启日志功能(使用 --journal 参数)
+- 在 32 位系统中,内存限制在 2.5GB
+- 空数据库占用 192MB 空间
+- 使用 GridFS(不是真正的文件系统)来保存大数据和元数据
 - 支持对数据建立索引
 - 数据中心意识
-**应用场景:**动态查询;需要定义索引而不是map/reduce功能;提高大数据库性能;想使用CouchDB但数据IO量太大,CouchDB无法满足要求。
+**应用场景:**动态查询;需要定义索引而不是 map/reduce 功能;提高大数据库性能;想使用 CouchDB 但数据的 IO 吞吐量太大,CouchDB 无法满足要求。MongoDB 可以满足你的需求。
-**For example:** For most things that you would do with MySQL or PostgreSQL, but having predefined columns really holds you back.
-**案例:**想布署MySQL或PostgreSQL,但它们存在的预定义处理语句和预定义变量让你望而却步。
+**使用案例:**想布署 MySQL 或 PostgreSQL,但它们存在的预定义处理语句和预定义变量让你望而却步。这个时候,MongoDB 是你可以考虑的选项。
-###Riak (V1.2)
+###Riak 1.2版
-**Written in:** Erlang & C, some JavaScript
+**开发语言:** Erlang、C、以及一些 JavaScript
-**Main point:** Fault tolerance
+**主要特性:**容错机制(当一份数据失效,服务会自动切换到备份数据,保证服务一直在线 —— 译者注)
-**License:** Apache
+**许可证:** Apache
-**Protocol:** HTTP/REST or custom binary
+**数据传输、存储的格式:** HTTP/REST 架构,自定义二进制格式
-- Stores blobs
-- Tunable trade-offs for distribution and replication
-- Pre- and post-commit hooks in JavaScript or Erlang, for validation and security.
-- Map/reduce in JavaScript or Erlang -- Links & link walking: use it as a graph database -- Secondary indices: but only one at once -- Large object support (Luwak) -- Comes in "open source" and "enterprise" editions -- Full-text search, indexing, querying with Riak Search -- In the process of migrating the storing backend from "Bitcask" to Google's "LevelDB" -- Masterless multi-site replication replication and SNMP monitoring are commercially licensed +- 可存储 BLOB(binary large object,二进制大对象,比如一张图片、一个声音文件 —— 译者注)。 +- 可调节分布式存储与备份之间的权衡。 +- 为了保证可验证性和安全性,Riak 在 JS 和 Erlang 中提供提交前(pre-commit)和提交后(post-commit)钩子(hook)函数(你可以在提交数据前执行一个 hook,或者在提交数据后执行一个 hook —— 译者注)。 +- JS 和 Erlang 提供映射和简化(map/reduce)编程模型。 +- 通过 links 和 link walking 可将其当作图数据库使用(link 用于描述对象之间的关系,link walking 是一个用于查询对象关系的进程 —— 译者注)。 +- 二级索引(secondary indices,开发者在写数据时可用多个名称来标记一个对象 —— 译者注),一次只能用一个。 +- 支持大数据对象(Luwak)(Luwak 是 Riak 中的一个服务层,为大数据量对象提供简单的、面向文档的抽象,弥补了 Riak 的 Key/Value 存储格式在处理大数据对象方面的不足 —— 译者注)。 +- 提供“开源”和“企业”两个版本。 +- 提供“全文搜索”(可能就是允许用户在不提供 table/volume 等信息,对一个表进行文本字段的搜索,瞎猜的,望指正 —— 译者注)。 +- 正在将存储后端从“Bitcask”迁移到 Google 的“LevelDB”上。 +- 企业版本提供多点备份(各点地位平等,非主从架构)和 SNMP 监控功能。 -**Best used:** If you want something Dynamo-like data storage, but no way you're gonna deal with the bloat and complexity. If you need very good single-site scalability, availability and fault-tolerance, but you're ready to pay for multi-site replication. +**应用场景:**假如你想要类似 Dynamo 的数据库,但不想要它的庞大和复杂;假如你需要良好的单点可扩展性、可用性和容错能力,并且愿意为多点备份付费。 Riak 能满足你的需求。 -**For example:** Point-of-sales data collection. Factory control systems. Places where even seconds of downtime hurt. Could be used as a well-update-able web server. +**使用案例:**销售点数据收集;工厂控制系统;必须实时在线的系统;需要易于升级的网站服务器。 -###CouchDB (V1.2) +###CouchDB 1.2版 -**Written in:** Erlang +**开发语言:** Erlang -**Main point:** DB consistency, ease of use +**主要特性:**数据一致性;易于使用 -**License:** Apache +**许可证:** Apache -**Protocol:** HTTP/REST +**数据传输格式:** HTTP/REST -- Bi-directional (!) 
replication, -- continuous or ad-hoc, -- with conflict detection, -- thus, master-master replication. (!) -- MVCC - write operations do not block reads -- Previous versions of documents are available -- Crash-only (reliable) design -- Needs compacting from time to time -- Views: embedded map/reduce -- Formatting views: lists & shows -- Server-side document validation possible -- Authentication possible -- Real-time updates via '_changes' (!) -- Attachment handling -- thus, CouchApps (standalone js apps) +- 双向复制(一种同步技术,每个备份点都有一份它们自己的拷贝,允许用户在存储点断线的情况下修改数据,当存储节点重新上线时,CouchDB 会对所有节点同步这些修改 —— 译者注)。 +- 支持持续同步或者按需(ad-hoc)同步。 +- 支持冲突检测。 +- 支持主主互备(多个数据库实时同步数据,起到备份和分摊用户并行访问量的作用 —— 译者注)。 +- 多版本并发控制(MVCC),写操作时不需要阻塞读操作(或者说不需要锁住数据库)。 +- 可以访问文档的历史版本。 +- 可靠的 crash-only 设计(所谓 crash-only,就是程序出错时,只需重启下程序,丢弃内存的所有数据,不需要执行复杂的数据恢复操作 —— 译者注)。 +- 需要不时压缩数据。 +- 视图(文档是 CouchDB 的核心概念,CouchDB 中的视图声明了如何从文档中提取数据,以及如何对提取出来的数据进行处理 —— 译者注):内嵌映射和简化(map/reduce)编程模型。 +- 格式化的 views 字段:lists(包含把视图运行结果转换成非 JSON 格式的方法)和 shows(包含把文档转换成非 JSON 格式的方法)(在 CouchDB 中,一个 Web 应用是与一个设计文档相对应的。在设计文档中可以包含一些特殊的字段,views 字段包含永久的视图定义 —— 译者注)。 +- 可进行服务器端文档验证。 +- 可进行身份认证。 +- 通过 _changes 实时获取数据更新。 +- 附件处理(attachment:CouchDB 的每份文档都可以带有附件,就像一封 email 可以带附件一样 —— 译者注)。 +- 因此可以实现 CouchApps(独立的 JS 应用)。 -**Best used:** For accumulating, occasionally changing data, on which pre-defined queries are to be run. Places where versioning is important. +**应用场景:**用于不断积累、偶尔变化、并且需要在其上运行预定义查询的数据;用于版本控制比较重要的地方。 -**For example:** CRM, CMS systems. Master-master replication is an especially interesting feature, allowing easy multi-site deployments. 
+**使用案例:**可用于客户关系管理(CRM),内容管理系统(CMS);可用于主主互备甚至多机互备。 -###Redis (V2.4) +###Redis 2.4版 -**Written in:** C/C++ +**开发语言:** C/C++ -**Main point:** Blazing fast +**主要特性:**快到掉渣 -**License:** BSD +**许可证:** BSD -**Protocol:** Telnet-like +**数据传输方式:** 类似 Telnet -- Disk-backed in-memory database, -- Currently without disk-swap (VM and Diskstore were abandoned) -- Master-slave replication -- Simple values or hash tables by keys, -- but complex operations like ZREVRANGEBYSCORE. -- INCR & co (good for rate limiting or statistics) -- Has sets (also union/diff/inter) -- Has lists (also a queue; blocking pop) -- Has hashes (objects of multiple fields) -- Sorted sets (high score table, good for range queries) -- Redis has transactions (!) -- Values can be set to expire (as in a cache) -- Pub/Sub lets one implement messaging (!) +- Redis 是一个内存数据库(in-memory database,简称 IMDB,将数据放在内存进行读写,这才是“快到掉渣”的真正原因 —— 译者注),磁盘只是提供数据持久化(即将内存的数据写到磁盘)的功能(这类数据库被称为“disk backed”数据库)。 +- 当前不支持将磁盘作为 swap 分区,虚拟内存(VM)和 Diskstore 方式都没加到此版本(Redis 的数据持久化共有4种方式:定时快照、基于语句追加、虚拟内存、diskstore。其中 VM 方式由于性能不好以及不稳定的问题,已经被作者放弃,而 diskstore 方式还在实验阶段 —— 译者注)。 +- 主从备份 +- 存储结构为简单的 key/value 或 hash 表。 +- 但也支持像 ZREVRANGEBYSCORE 这样的复杂操作。 +- 支持 INCR(INCR key 就是将 key 中存储的数值加一 —— 译者注)等命令(对限速和统计有帮助)。 +- 支持 sets 数据类型(以及 union/diff/inter)。 +- 支持 lists (以及 queue/blocking pop)。 +- 支持 hashes(含多个字段的对象)。 +- 支持 sorted sets(可用于高分榜,在范围查找方面有优势)。 +- 支持事务处理。 +- 可为值设置过期时间(类似缓存)。 +- Pub/Sub 操作能让用户实现消息传递。 -**Best used:** For rapidly changing data with a foreseeable database size (should fit mostly in memory). +**应用场景:**适合快速变化、且数据库规模可预见的数据(数据应基本能放入内存中)。 -**For example:** Stock prices. Analytics. Real-time data collection. Real-time communication. And wherever you used memcached before. 
+**使用案例:**股价系统、分析系统、实时数据收集系统、实时通信系统、以及取代 memcached。 ##Clones of Google's Bigtable From d79ff11f8dad85c84333a62fbe18fd0220328cba Mon Sep 17 00:00:00 2001 From: geekpi Date: Fri, 25 Oct 2013 06:49:08 +0000 Subject: [PATCH 03/10] =?UTF-8?q?[=E7=BF=BB=E8=AF=91=E4=B8=AD]=2001=20The?= =?UTF-8?q?=20Linux=20Kernel--Introduction.md?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- sources/The Linux Kernel/01 The Linux Kernel--Introduction.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) mode change 100644 => 100755 sources/The Linux Kernel/01 The Linux Kernel--Introduction.md diff --git a/sources/The Linux Kernel/01 The Linux Kernel--Introduction.md b/sources/The Linux Kernel/01 The Linux Kernel--Introduction.md old mode 100644 new mode 100755 index 16e0118dd2..fdd678bada --- a/sources/The Linux Kernel/01 The Linux Kernel--Introduction.md +++ b/sources/The Linux Kernel/01 The Linux Kernel--Introduction.md @@ -1,3 +1,5 @@ +Translating----------geekpi + 01 The Linux Kernel: Introduction ================================================================================ In 1991, a Finnish student named Linus Benedict Torvalds made the kernel of a now popular operating system. He released Linux version 0.01 on September 1991, and on February 1992, he licensed the kernel under the GPL license. The GNU General Public License (GPL) allows people to use, own, modify, and distribute the source code legally and free of charge. This permits the kernel to become very popular because anyone may download it for free. Now that anyone can make their own kernel, it may be helpful to know how to obtain, edit, configure, compile, and install the Linux kernel. 
@@ -32,4 +34,4 @@ via: http://www.linux.org/threads/%EF%BB%BFthe-linux-kernel-introduction.4203/ 本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创翻译,[Linux中国](http://linux.cn/) 荣誉推出 -[1]:https://www.kernel.org/ \ No newline at end of file +[1]:https://www.kernel.org/ From 62526d2f24ad6e12dbd52761395b5dbf00eaf099 Mon Sep 17 00:00:00 2001 From: Luoxcat Date: Fri, 25 Oct 2013 19:17:33 +0800 Subject: [PATCH 04/10] =?UTF-8?q?=E6=89=93=E7=8C=8E=E4=B8=AD=20=20Luox?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- ... Now Possible with Debian-Based Clonezilla Live 2.2.0-13.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/sources/Metal Backup and Recovery Is Now Possible with Debian-Based Clonezilla Live 2.2.0-13.md b/sources/Metal Backup and Recovery Is Now Possible with Debian-Based Clonezilla Live 2.2.0-13.md index a63346f010..3168aaa480 100644 --- a/sources/Metal Backup and Recovery Is Now Possible with Debian-Based Clonezilla Live 2.2.0-13.md +++ b/sources/Metal Backup and Recovery Is Now Possible with Debian-Based Clonezilla Live 2.2.0-13.md @@ -1,3 +1,4 @@ + 翻译中 Luox Metal Backup and Recovery Is Now Possible with Debian-Based Clonezilla Live 2.2.0-13 ================================================================================ Clonezilla Live 2.2.0-13, a Linux distribution based on DRBL, Partclone, and udpcast that allows users to do bare metal backup and recovery, is now available for testing. 
@@ -35,4 +36,4 @@ via: http://news.softpedia.com/news/Metal-Backup-and-Recovery-Is-Now-Possible-wi [4]:http://downloads.sourceforge.net/clonezilla/clonezilla-live-2.1.2-53-amd64.iso [5]:http://sourceforge.net/projects/clonezilla/files/clonezilla_live_testing/2.2.0-8/clonezilla-live-2.2.0-13-i486.iso/download [6]:http://sourceforge.net/projects/clonezilla/files/clonezilla_live_testing/2.2.0-8/clonezilla-live-2.2.0-13-i686-pae.iso/download -[7]:http://sourceforge.net/projects/clonezilla/files/clonezilla_live_testing/2.2.0-8/clonezilla-live-2.2.0-13-amd64.iso/download \ No newline at end of file +[7]:http://sourceforge.net/projects/clonezilla/files/clonezilla_live_testing/2.2.0-8/clonezilla-live-2.2.0-13-amd64.iso/download From 05f9b4ceb8a8aa67894312293b2ca3fce96dea23 Mon Sep 17 00:00:00 2001 From: geekpi Date: Fri, 25 Oct 2013 21:33:53 +0800 Subject: [PATCH 05/10] =?UTF-8?q?[=E5=B7=B2=E7=BF=BB=E8=AF=91]=2001=20The?= =?UTF-8?q?=20Linux=20Kernel--=20Introduction.md?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- .../01 The Linux Kernel--Introduction.md | 37 ------------------- .../01 The Linux Kernel--Introduction.md | 36 ++++++++++++++++++ 2 files changed, 36 insertions(+), 37 deletions(-) delete mode 100755 sources/The Linux Kernel/01 The Linux Kernel--Introduction.md create mode 100644 translated/01 The Linux Kernel--Introduction.md diff --git a/sources/The Linux Kernel/01 The Linux Kernel--Introduction.md b/sources/The Linux Kernel/01 The Linux Kernel--Introduction.md deleted file mode 100755 index fdd678bada..0000000000 --- a/sources/The Linux Kernel/01 The Linux Kernel--Introduction.md +++ /dev/null @@ -1,37 +0,0 @@ -Translating----------geekpi - -01 The Linux Kernel: Introduction -================================================================================ -In 1991, a Finnish student named Linus Benedict Torvalds made the kernel of a now popular operating system. 
He released Linux version 0.01 on September 1991, and on February 1992, he licensed the kernel under the GPL license. The GNU General Public License (GPL) allows people to use, own, modify, and distribute the source code legally and free of charge. This permits the kernel to become very popular because anyone may download it for free. Now that anyone can make their own kernel, it may be helpful to know how to obtain, edit, configure, compile, and install the Linux kernel. - -A kernel is the core of an operating system. The operating system is all of the programs that manages the hardware and allows users to run applications on a computer. The kernel controls the hardware and applications. Applications do not communicate with the hardware directly, instead they go to the kernel. In summary, software runs on the kernel and the kernel operates the hardware. Without a kernel, a computer is a useless object. - -There are many reasons for a user to want to make their own kernel. Many users may want to make a kernel that only contains the code needed to run on their system. For instance, my kernel contains drivers for FireWire devices, but my computer lacks these ports. When the system boots up, time and RAM space is wasted on drivers for devices that my system does not have installed. If I wanted to streamline my kernel, I could make my own kernel that does not have FireWire drivers. As for another reason, a user may own a device with a special piece of hardware, but the kernel that came with their latest version of Ubuntu lacks the needed driver. This user could download the latest kernel (which is a few versions ahead of Ubuntu's Linux kernels) and make their own kernel that has the needed driver. However, these are two of the most common reasons for users wanting to make their own Linux kernels. - -Before we download a kernel, we should discuss some important definitions and facts. The Linux kernel is a monolithic kernel. 
This means that the whole operating system is on the RAM reserved as kernel space. To clarify, the kernel is put on the RAM. The space used by the kernel is reserved for the kernel. Only the kernel may use the reserved kernel space. The kernel owns that space on the RAM until the system is shutdown. In contrast to kernel space, there is user space. User space is the space on the RAM that the user's programs own. Applications like web browsers, video games, word processors, media players, the wallpaper, themes, etc. are all on the user space of the RAM. When an application is closed, any program may use the newly freed space. With kernel space, once the RAM space is taken, nothing else can have that space. - -The Linux kernel is also a preemptive multitasking kernel. This means that the kernel will pause some tasks to ensure that every application gets a chance to use the CPU. For instance, if an application is running but is waiting for some data, the kernel will put that application on hold and allow another program to use the newly freed CPU resources until the data arrives. Otherwise, the system would be wasting resources for tasks that are waiting for data or another program to execute. The kernel will force programs to wait for the CPU or stop using the CPU. Applications cannot unpause or use the CPU without the kernel allowing them to do so. - -The Linux kernel makes devices appear as files in the folder /dev. USB ports, for instance, are located in /dev/bus/usb. The hard-drive partitions are seen in /dev/disk/by-label. It is because of this feature that many people say "On Linux, everything is a file.". If a user wanted to access data on their memory card, for example, they cannot access the data through these device files. - -The Linux kernel is portable. Portability is one of the best features that makes Linux popular. Portability is the ability for the kernel to work on a wide variety of processors and systems. 
Some of the processor types that the kernel supports include Alpha, AMD, ARM, C6X, Intel, x86, Microblaze, MIPS, PowerPC, SPARC, UltraSPARC, etc. This is not a complete list. - -In the boot folder (/boot), users will see a "vmlinux" or a "vmlinuz" file. Both are compiled Linux kernels. The one that ends in a "z" is compressed. The "vm" stands for virtual memory. On systems with SPARC processors, users will see a zImage file instead. A small number of users may find a bzImage file; this is also a compressed Linux kernel. No matter which one a user owns, they are all bootable files that should not be changed unless the user knows what they are doing. Otherwise, their system can be made unbootable - the system will not turn on. - -Source code is the coding of the program. With source code, programmers can make changes to the kernel and see how the kernel works. - -### Downloading the Kernel: ### - -Now, that we understand more about the kernel, it is time to download the source code. Go to [kernel.org][1] and click the large download button. Once the download is finished, uncompress the downloaded file. - -For this article, I am using the source code for Linux kernel 3.9.4. All of the instructions in this article series are the same (or nearly the same) for all versions of the kernel. 
- --------------------------------------------------------------------------------- - -via: http://www.linux.org/threads/%EF%BB%BFthe-linux-kernel-introduction.4203/ - -译者:[译者ID](https://github.com/译者ID) 校对:[校对者ID](https://github.com/校对者ID) - -本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创翻译,[Linux中国](http://linux.cn/) 荣誉推出 - -[1]:https://www.kernel.org/ diff --git a/translated/01 The Linux Kernel--Introduction.md b/translated/01 The Linux Kernel--Introduction.md new file mode 100644 index 0000000000..c99e34fd7b --- /dev/null +++ b/translated/01 The Linux Kernel--Introduction.md @@ -0,0 +1,36 @@ +01 Linux 内核: 介绍 +================================================================================ +在1991年,一个叫林纳斯·本纳第克特·托瓦兹的芬兰学生制作了一个现在非常流行的操作系统内核。他于1991年9月发布了Linux 0.01并且于1992年2月以GPL许可证的方式授权了该内核。GNU通用公共许可证(GPL)允许人们使用、拥有、修改以及合法且免费地分发源代码。这使得内核变得非常流行,因为任何人都可以免费地下载。既然任何人都可以制作他们自己的内核,了解如何获取、编辑、配置、编译并且安装Linux内核或许是有帮助的。 + +内核是操作系统的核心。操作系统是一系列管理硬件和允许用户在一台电脑上运行应用的程序。内核控制着硬件和应用。应用并不直接和硬件打交道,而是先进入内核。总之,软件运行在内核上而内核操作着硬件。没有内核,电脑就是一个没用的物件。 + +有很多理由用户想制作他们自己的内核。许多用户也许想要一个只包含需要的代码来运行他们的系统的内核。比如说我的内核包含了火线设备驱动,但是我的电脑缺乏这些端口。当系统启动时,时间和内存就会浪费在那些我系统上并没有安装的设备上。如果我想要简化我的内核,我会制作自己不包含火线驱动的内核。至于另外一个理由,某个用户可能拥有一台有特殊硬件的设备,但是最新的Ubuntu版本中的内核缺乏所需的驱动。这个用户可以下载最新的内核(比当前Ubuntu的Linux内核更新几个版本)并制作他们自己的有相应驱动的内核。不管怎样,这两个是用户想要制作自己的Linux内核的普遍原因。 + +在下载内核前,我们应该讨论一些重要的术语和事实。Linux内核是一个宏内核,这意味着整个操作系统是作为内核空间保留在内存上。说得更清楚一些,内核是放在内存上。内核使用的空间是预留给内核的。只有内核可以使用预留的内核空间。内核拥有这些内存上的空间直到系统关闭。与内核空间相对应的还有用户空间。用户空间是内存上用户程序拥有的空间。比如浏览器、电子游戏、文字处理器、媒体播放器、壁纸、主题等都是内存上的用户空间。当一个程序关闭的时候,任何程序都可能使用新释放的空间。在内核空间,一旦内存被占用,没有任何其他程序可以使用这块空间。 + +Linux内核也是一个抢占式多任务内核。这意味着内核可以暂停一些任务来保证任何应用有机会来使用CPU。举个例子,如果一个应用正在运行但是正在等待一些数据,内核会把这个应用暂停并允许其他的程序使用新释放的CPU资源直到数据到来。否则,系统将会浪费资源给那些正在等待数据或者其他程序执行的任务。内核将会强制程序去等待或者停止使用CPU。没有内核的允许,应用程序不能取消暂停或者使用CPU。 + +Linux内核使得设备作为文件显示在/dev文件夹下。举个例子,USB端口位于/dev/bus/usb。硬盘分区则位于/dev/disk/by-label。正是因为这个特性,许多人说:“在Linux上,一切皆文件”。举个例子,如果一个用户想要访问在存储卡上的数据,他们不能通过设备文件访问这些数据。 + 
+Linux内核是可移植的。可移植性是使Linux流行其中一个最好的特性。可移植性使得内核可以工作在广泛的处理器和系统上。一些内核支持的处理器的型号包括:Alpha、AMD、ARM、C6X、Intel、x86、Microblaze、MIPS、PowerPC、SPARC、UltraSPARC等等。这还不是全部的列表。 + +在引导文件夹(/boot),用户会看到诸如“vmlinux”或者“vmlinuz”的文件。这两者都是已编译的Linux内核。以“z”结尾的是已压缩的。“vm”代表虚拟内存。在SPARC处理器的系统上,用户可以看见一个zImage文件。一小部分用户可以发现一个bzImage文件,这也是一个已压缩的Linux内核。无论用户有哪个文件,他们都是不可以被更改除非用户知道他们正在做什么的引导文件。否则系统会变成无法引导---系统无法开启。 + +源代码是程序的编码。有了源代码,程序员可以修改内核并能看到内核是如何工作的。 + +### 下载内核: ### + +现在我们更多地了解了内核,是时候下载内核源代码了。进入kernel.org并点击那个巨大的下载按钮。一旦下载完成,解压下载的文件。 + +对于本文,我使用的源代码是Linux kernel 3.9.4.这个文章系列的所有指导对于所有的内核版本是相同的(或者非常相似的) + +-------------------------------------------------------------------------------- + +via: http://www.linux.org/threads/%EF%BB%BFthe-linux-kernel-introduction.4203/ + +译者:[geekpi](https://github.com/geekpi) 校对:[校对者ID](https://github.com/校对者ID) + +本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创翻译,[Linux中国](http://linux.cn/) 荣誉推出 + +[1]:https://www.kernel.org/ + From c25fdc458f7610f686a6f2e3a83d4fc4bc17b6ae Mon Sep 17 00:00:00 2001 From: geekpi Date: Fri, 25 Oct 2013 21:41:41 +0800 Subject: [PATCH 06/10] =?UTF-8?q?[=E7=BF=BB=E8=AF=91=E4=B8=AD]=2000=20Abou?= =?UTF-8?q?t=20the=20Author?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- sources/The Linux Kernel/00 About the author.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/sources/The Linux Kernel/00 About the author.md b/sources/The Linux Kernel/00 About the author.md index 1add179002..8374986241 100644 --- a/sources/The Linux Kernel/00 About the author.md +++ b/sources/The Linux Kernel/00 About the author.md @@ -1,3 +1,5 @@ +Translating----------geekpi + 00 About the author ================================================================================ [![](http://www.linux.org/data/avatars/l/4/4843.jpg)][1] @@ -65,4 +67,4 @@ via: http://www.linux.org/members/devyncjohnson.4843/ [10]:http://stackoverflow.com/users/2354783/devyn-collier-johnson 
[11]:http://gnome-look.org/usermanager/search.php?username=DevynCJohnson [12]:http://www.creatity.com/?user=1449&action=detailUser -[13]:http://openclipart.org/user-detail/DevynCJohnson \ No newline at end of file +[13]:http://openclipart.org/user-detail/DevynCJohnson From 4d1451d7691f9b8b990c254fc5805b67018ddbfa Mon Sep 17 00:00:00 2001 From: wxy Date: Fri, 25 Oct 2013 21:43:16 +0800 Subject: [PATCH 07/10] =?UTF-8?q?=E5=8F=91=E5=B8=83=EF=BC=9AHow=20This=207?= =?UTF-8?q?5=20Year-Old=20Piece=20of=20Paper=20Started=20Modern=20Computin?= =?UTF-8?q?g?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- ...Piece of Paper Started Modern Computing.md | 30 +++++++++---------- 1 file changed, 15 insertions(+), 15 deletions(-) rename {translated => published}/How This 75 Year-Old Piece of Paper Started Modern Computing.md (87%) diff --git a/translated/How This 75 Year-Old Piece of Paper Started Modern Computing.md b/published/How This 75 Year-Old Piece of Paper Started Modern Computing.md similarity index 87% rename from translated/How This 75 Year-Old Piece of Paper Started Modern Computing.md rename to published/How This 75 Year-Old Piece of Paper Started Modern Computing.md index 2af50f5949..28b48d1492 100644 --- a/translated/How This 75 Year-Old Piece of Paper Started Modern Computing.md +++ b/published/How This 75 Year-Old Piece of Paper Started Modern Computing.md @@ -1,5 +1,5 @@ -一张75年前的纸,打开了现代计算机的新时代 -===== +一张在75年前开启了复印机时代的纸 +=========================== ![img](http://rack.1.mshcdn.com/media/ZgkyMDEzLzEwLzEzL2VhL1hlcm94LjM4ODIwLmpwZwpwCXRodW1iCTk1MHg1MzQjCmUJanBn/2f9f894a/5ef/Xerox.jpg) @@ -11,13 +11,13 @@ Ken Weilerstein说:“打印机改变了人们处理文档的方式,在人类的办公史上,这可是一件大事。” -###施乐的崛起 +##施乐的崛起 施乐(Xerox)称得上是最有名的打印机公司,在打印机市场持很大的占有率。就像人们谈搜索时会说“上网google一下”,谈到打印时会说“我需要xerox(复印)这份文档”,施乐公司已经渗入人们的日常生活中了。在科技领域,20世纪70年代,施乐由于在帕罗奥多研究中心(PARC)发明的用户图形界面和鼠标而声名远扬,乔帮主当年参观PARC后将很多点子用到了他的苹果机上(其中一个就是PARC研发的图形用户界面——译者注)。另外,激光打印机也借鉴了卡尔逊的发明。 
刚开始,施乐并不叫施乐。位于纽约市康涅狄格州罗切斯特市的哈洛伊德(Haloid)公司看中静电打印技术,收购了卡尔逊的这项发明,之后才更名为“施乐”(在这中间,哈洛伊德还曾更名为“哈洛伊德施乐”——译者注)。 -我们继续卡尔逊的故事,这位发明家和专利代理人对手工复印法律文件感到无比厌烦,他认为世上肯定有一种复印方法能让他摆脱油墨复写纸,或者那种传统使用湿的感光纸和水的又乱又慢还昂贵的照相法。 施乐公司历史档案管理者,Ray Brewer说卡尔逊的这个涉及锌片锌粉(估计是作为感光材料——译者注)的发明即简单又容易复制。 +我们继续卡尔逊的故事,这位发明家和专利代理人对手工复印法律文件感到无比厌烦,他认为世上肯定有一种复印方法能让他摆脱油墨复写纸,或者那种传统使用湿的感光纸和水的又乱又慢还昂贵的照相法。 施乐公司的历史档案管理员Ray Brewer说卡尔逊的这个涉及锌片锌粉(估计是作为感光材料——译者注)的发明即简单又容易复制。 (Weilerstein解释说,静电复印的处理过程就是这么个回事:一张纸放在光源底下,光源会扫描整张纸,记下复印原件的信息。光线通过一组透镜照到涂有光敏材料的静电成像硒鼓上,硒鼓被曝光部分会改变其静电荷,复印原件的信息被复制到硒鼓上。之后硒像旋转,静电荷所在的地方会吸引墨粉微粒,从而在硒鼓上画出原件图像。然后硒鼓将墨粉转移到一张纸上,并加压加热,就像熨斗一样。最后,硒鼓旋转,多余的墨粉被刮下来,下一页依此循环。) @@ -29,17 +29,17 @@ Ken Weilerstein说:“打印机改变了人们处理文档的方式,在人 在20世纪50到60年代之间,施乐打印机成为办公必备用品。这项改革节省了时间和金钱。以前,复印文档的唯一方法就是使用影印机,那是相当杂乱和昂贵的操作,更糟糕的是,那种油墨纸一次最多只能复制两份。假如你想复印更多份,你必须重新打印一份出来,“并且秘书和领导希望所有的复印件都一模一样,”Brewer又出来讲话了。这就是施乐复印术的又一个好处:多份一模一样的复印件。在email和即时通信出来之前,这种复印术为部门间交流提供机会。施乐复印术催生了备忘录、办公简讯以及生日贺卡等新鲜玩意儿。 -914型复印机是施乐公司最成功的产品。在60年代到70年代初,施乐共卖掉超过20万台这个型号的复印机,《财富》杂志将它评为“美国史上最成功的产品”。Weilerstein称施乐为“你后悔没买它的股票”的公司。 +914型复印机是施乐公司最成功的产品。在60年代到70年代初,施乐共卖掉了超过20万台这个型号的复印机,《财富》杂志将它[评为][1]“美国史上最成功的产品”。Weilerstein称施乐为“你后悔没买它的股票”的公司。 这是一部大概在1960年播出的广告:http://www.youtube.com/embed/kNGdqC7QJYI -###现代计算机时代 +##现代计算机时代 直到80年代PC机开始代替打字机,施乐复印机的疯狂时代才告终结。人们依靠卡尔逊的发明创造了激光打印机,从此淘汰了功能单一的复印机。这个时候,已经没有人会把施乐的机器简单地称为复印机了,它们早已变成多功能打印机。 而这还不是施乐公司生意上所面临的唯一挑战。70年代曰本公司提供了与施乐复印机同性能但更便宜的产品。Weilerstein说:施乐失去了复印机领域的垄断地位,但凭借其在激光打印机产业的发展,施乐依然站在科技的前沿。到90年代,施乐开发Docutech系统,这种技术让你从打印机年代直接进入到印刷机年代。之后施乐又研发了iGen2000,这是种能彩印的激光打印机,能在1分钟内打印100页复印件。“然而在2000年之后的一段时间内,施乐公司再没有突出的作为,”Weilerstein说道,“并且他们还遭遇经济危机。” -施乐没有像它在罗彻斯特的邻居——伊士曼柯达一样遭遇打击。现在的柯达[处于相当阴暗的时期][1]。 +施乐没有像它在罗彻斯特的邻居——伊士曼.柯达一样遭遇打击。其时的柯达[处于相当阴暗的时期][2]。 部分原因是,与胶卷不同,文档打印依然是一个高利润的生意。还是有些部门,包括联邦政府,这些部门的员工每个月需要打印上千份文档。Weilerstein 承认随着时代发展,打印业务将继续下降:“打印业务也许是老一代的产品”,但是他又说,“但它还不会轻易消失。” @@ -49,24 +49,23 @@ Ken Weilerstein说:“打印机改变了人们处理文档的方式,在人 “我认为IT部门实行无纸办公是一个非常大的需求”,来自IDC(国际数据公司)的Boyd说,“很多公司,包括施乐,都在积极为他们提供解决方案。” - 
Weilerstein的观点是当所有的交流都是依赖文档时,施乐有机会成为“内容管理服务(MCS)”生意的领导者。这种服务能让企业减少使用打印机。Weilerstein在白皮书上写到:“虽然纸质文档依旧是交流的有效载体,但是当员工将不同来源的信息打印成纸质文档,并想将它们用于不同目的时,太多的文档反而无法形成有效的交流。” 在广播信息都是靠传真接收的时代,MCS应该接受广告订单;或者提供一种方式,将化工厂员工随手记下的笔记传到可供查阅的数码产品中。 ![img](http://rack.3.mshcdn.com/media/ZgkyMDEzLzEwLzExLzUwL0NoZXN0ZXJDYXJsLjAyZTI2LmpwZwpwCXRodW1iCTEyMDB4OTYwMD4/a1da164c/352/Chester-Carlson.jpg) -卡尔逊和他的静电打印机 +##卡尔逊的遗产 -在推动无纸办公的过程中,施乐不应该只是将它的生意转型,还应该帮助企业内不必要的浪费。然而,施乐还面临大量的竞争对手,最大的对手就是Adobe公司,PDF文档格式的发明者。 +在推动无纸办公的过程中,施乐不应该只是将它的生意转型,还应该帮助企业内减少不必要的浪费。然而,施乐还面临大量的竞争对手,最大的对手就是Adobe公司,PDF文档格式的发明者。 Adobe过去是做桌面打印(即通过电脑等电子手段进行文档编辑——译者注)而非纸质打印的,似乎比施乐有天然的优势。然后是苹果公司,自己赖以生存的老技术被新技术取代后,能迅速变成新技术的领导者,这在历史上是很罕见的。 -即使施乐最终失败了,卡尔逊在科技史上的贡献也是毋庸置疑的。卡尔逊的其他发明:带有流水沟的雨衣、洗鞋器等,但是他最重要的发明是一个证据,证明市场是如何回报人的坚持和开阔的视野。他的死向我们阐明了我们对公众人物的内在生活了解是如此的少。 +即使施乐最终失败了,卡尔逊在科技史上的贡献也是毋庸置疑的。卡尔逊的其他发明:带有流水沟的雨衣、洗鞋器等,但是他最重要的发明是一个证据,证明市场是如何回报人的坚持和开阔的视野。他的去世向我们阐明了我们对公众人物的内在生活了解是如此的少。 -1968年,卡尔逊和柯乃伊在阿斯托里亚的公寓内辛辛苦苦地发明静电复印技术的30年之后,卡尔逊从罗彻斯特家里回到纽约,去参加一个商业会议。他发现离开会还有一些时间,于是他走进一家电影院观看《骑虎之人(He Who Rides a Tiger)》,主演是Tom Bell和Judi Dench。当电影结束时,一个服务员看见了卡尔逊,看起来似乎正在位子上睡觉,事实并非如此。卡尔逊因心脏病去世——那是他那年第二次心脏病发作。 +1968年,卡尔逊和柯乃伊在阿斯托里亚的公寓内辛辛苦苦地发明静电复印技术的30年之后,卡尔逊从罗彻斯特家里回到纽约,去参加一个商业会议。他发现离开会还有一些时间,于是他走进一家电影院观看《骑虎人(He Who Rides a Tiger)》,主演是Tom Bell和Judi Dench。当电影结束时,一个服务员看见了卡尔逊,看起来似乎正在位子上睡觉,事实并非如此。卡尔逊因心脏病去世——那是他那年第二次心脏病发作。 -他死后,人们估计他拥有大约1.5亿美元的遗产,这笔钱使他成为1968年全美最有钱的富翁之一。然而他们估计错了,卡尔逊已将他的绝大部分财产都捐了出去。他曾对妻子说他对那种成为商业巨头的野心表示很不理解,他只想作为一个穷人死去。 +他死后,人们估计他拥有大约1.5亿美元的遗产,这笔钱使他成为1968年全美最有钱的富翁之一。然而他们发现错了,卡尔逊已将他的绝大部分财产都捐了出去。他曾对妻子说他对那种成为商业巨头的野心表示很不理解,他只想作为一个穷人死去。 图片来源:施乐公司 @@ -79,6 +78,7 @@ via: http://mashable.com/2013/10/13/xerox-history-of-copying/ 译者:[chenjintao](https://github.com/chenjintao) 校对:[jasminepeng](https://github.com/jasminepeng) -[1]:http://www.usatoday.com/story/money/business/2013/09/03/kodak-bankruptcy-ends/2759965/ - +[1]:http://money.cnn.com/2010/01/21/technology/xerox_copiers.fortune/index.htm + 
+[2]:http://www.usatoday.com/story/money/business/2013/09/03/kodak-bankruptcy-ends/2759965/ From 3b5b26d5ec8c260ba15f58a0a39c1bd419e95c17 Mon Sep 17 00:00:00 2001 From: geekpi Date: Fri, 25 Oct 2013 22:34:12 +0800 Subject: [PATCH 08/10] =?UTF-8?q?[=E5=B7=B2=E7=BF=BB=E8=AF=91]=2000=20Abou?= =?UTF-8?q?t=20the=20Author?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- .../The Linux Kernel/00 About the author.md | 70 ------------------- translated/00 About the author.md | 68 ++++++++++++++++++ 2 files changed, 68 insertions(+), 70 deletions(-) delete mode 100644 sources/The Linux Kernel/00 About the author.md create mode 100644 translated/00 About the author.md diff --git a/sources/The Linux Kernel/00 About the author.md b/sources/The Linux Kernel/00 About the author.md deleted file mode 100644 index 8374986241..0000000000 --- a/sources/The Linux Kernel/00 About the author.md +++ /dev/null @@ -1,70 +0,0 @@ -Translating----------geekpi - -00 About the author -================================================================================ -[![](http://www.linux.org/data/avatars/l/4/4843.jpg)][1] - -Feel free to post or email me (DevynCJohnson@Gmail.com) suggestions for this series, both on topics for future articles in the series and on how the series can be made better and more interesting. - -I write two articles a week for Linux.org. One is the Linux kernel series and the other is any random Linux topic. Feel free to email suggestions on what you would like to read about in the "random article". I would like to write something that will draw numerous readers to the site. My goal is to write an article that has 10,000+ readers in one week. - -Soon, I will also write tutorials on how to install some of the popular Linux distros, so if there is a particular one you want to read about, email me. 
- -Check out my wallpapers on [http://gnome-look.org/usermanager/search.php?username=DevynCJohnson&action=contents&PHPSESSID=32424677ef4d9dffed020d06ef2522ac][2] - -My AI project: - -- [https://launchpad.net/neobot][3] - -Ubuntu 13.10 (AMD64) - -- [https://launchpad.net/~devyncjohnson-d][4] -- [DevynCJohnson@Gmail.com][5] - - - -**Gender**:Male - -**Birthday**:Aug 31, 1994 (Age: 19) - -**Home page**:https://launchpad.net/~devyncjohnson-d - -**Location**:United States - -Devyn Collier Johnson was home-schooled by his two wonderful parents and has graduated one university and is now attending another. His father, Jarret Wayne Buse, has many computer certifications, and Jarret has written and published many books on computers. He also does some programming, and he has given Devyn help and ideas for his artificial intelligence program. His mother, Cassandra Ann Johnson, is a stay-at-home mother, home-schooling his many siblings. Devyn Collier Johnson lives in Indiana with his parents and focuses his time on college and personal computer programming. - -Devyn Collier Johnson graduated high-school at age sixteen. He attends college as a commuting student maintaining the Dean's list. He majors in electrical technology engineering. Devyn Collier Johnson has learned many computer languages. Some he taught himself while others his father taught him and helped him understand. Some of the languages he knows include Xaiml, AIML, Unix Shell, Python3, VPython, PyQT, PyGTK, Coffeescript, GEL, SED, HTML4/5, CSS3, SVG, and XML. Devyn knows bits and pieces of some other languages. He earned four computer certifications in April 2012 and those four being NCLA, Linux+, LPIC-1, and DCTS. His Linux Professional ID is LPI000254694. - -In July 2012, Devyn Collier Johnson decided to make his chatterbot from scratch. He designed his own markup language (Xaiml) and AI engine (ProgramPY-SH or Pysh). On March 3, 2013, Devyn published his new chatterbot on Launchpad.net. 
The bot is named Neo which is from the Proto-Indo European word for "new". - -Devyn maintains a few other projects. He makes Opera and Firefox themes ([https://addons.mozilla.org/en-US/firefox/user/DevynCJohnson/][6]) ([https://my.opera.com/devyncjohnson/account/][7]); he also has many other graphic design projects. Most of his programming projects are hosted on [https://launchpad.net/~devyncjohnson-d][4], and some are mirrored on Sourceforge.net. Some other miscellaneous projects can be found in the links below. - -- [http://askubuntu.com/users/158340/devyn-collier-johnson][8] -- [http://unix.stackexchange.com/users/40770/devyn-collier-johnson][9] -- [http://stackoverflow.com/users/2354783/devyn-collier-johnson][10] -- [http://www.linux.org/members/devyncjohnson.4843/][1] -- [http://gnome-look.org/usermanager/search.php?username=DevynCJohnson][11] -- [http://www.creatity.com/?user=1449&action=detailUser][12] -- [http://openclipart.org/user-detail/DevynCJohnson][13] - --------------------------------------------------------------------------------- - -via: http://www.linux.org/members/devyncjohnson.4843/ - -译者:[译者ID](https://github.com/译者ID) 校对:[校对者ID](https://github.com/校对者ID) - -本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创翻译,[Linux中国](http://linux.cn/) 荣誉推出 - -[1]:http://www.linux.org/members/devyncjohnson.4843/ -[2]:http://gnome-look.org/usermanager/search.php?username=DevynCJohnson&action=contents&PHPSESSID=32424677ef4d9dffed020d06ef2522ac -[3]:https://launchpad.net/neobot -[4]:https://launchpad.net/~devyncjohnson-d -[5]:DevynCJohnson@Gmail.com -[6]:https://addons.mozilla.org/en-US/firefox/user/DevynCJohnson/ -[7]:https://my.opera.com/devyncjohnson/account/ -[8]:http://askubuntu.com/users/158340/devyn-collier-johnson -[9]:http://unix.stackexchange.com/users/40770/devyn-collier-johnson -[10]:http://stackoverflow.com/users/2354783/devyn-collier-johnson -[11]:http://gnome-look.org/usermanager/search.php?username=DevynCJohnson 
-[12]:http://www.creatity.com/?user=1449&action=detailUser -[13]:http://openclipart.org/user-detail/DevynCJohnson diff --git a/translated/00 About the author.md b/translated/00 About the author.md new file mode 100644 index 0000000000..5999dce680 --- /dev/null +++ b/translated/00 About the author.md @@ -0,0 +1,68 @@ +00 关于作者 +================================================================================ +[![](http://www.linux.org/data/avatars/l/4/4843.jpg)][1] + +随时可以发帖或者给我发邮件(DevynCJohnson@Gmail.com)提出对本系列的建议,无论是本系列后续文章的主题,还是如何使这个系列更好、更有趣。 + +我每周为Linux.org写两篇文章。一篇是Linux内核系列而另一篇是任何随机的Linux话题。随时可以发邮件建议你想看到的“随机文章”。我乐意在站上写一些能够吸引大量读者的东西。我的目标是每周写一篇拥有1万以上读者的文章。 + +很快我会写一些关于如何安装流行Linux发行版的文章,因此,如果你想要阅读某个特定发行版文章,请给我发邮件。 + +看看我的壁纸: [http://gnome-look.org/usermanager/search.php?username=DevynCJohnson&action=contents&PHPSESSID=32424677ef4d9dffed020d06ef2522ac][2] + +我的人工智能项目: + +- [https://launchpad.net/neobot][3] + +Ubuntu 13.10 (AMD64) + +- [https://launchpad.net/~devyncjohnson-d][4] +- [DevynCJohnson@Gmail.com][5] + + + +**性别**:男 + +**生日**:Aug 31, 1994 (Age: 19) + +**主页**:https://launchpad.net/~devyncjohnson-d + +**位置**:United States + +戴文.科利尔.约翰逊(Devyn Collier Johnson)在家接受他伟大的父母教育并已从一所大学毕业,现在就读于另外一所大学。他的父亲,杰瑞特.韦恩.布斯(Jarret Wayne Buse)拥有很多的计算机证书,并且他已经撰写并出版了许多关于计算机的书籍。他也做一些编程,并为戴文的人工智能程序提供过一些帮助和点子。他的妈妈,卡桑德拉.安.约翰逊(Cassandra Ann Johnson),是一名家庭主妇,在家教育他的许多兄弟姐妹。戴文.科利尔.约翰逊和他的父母住在印第安纳并把他的时间集中在大学和个人的电脑编程上。 + +戴文.科利尔.约翰逊十六岁毕业于一所高中。他作为一名走读生进入大学并一直保持在优秀学生名单上。他的专业是电气技术工程。戴文.科利尔.约翰逊已经学习了很多计算机语言。一些是他自学的而有的则是他父亲教导并且帮助他理解的。一些他了解的语言包括Xaiml、AIML、Unix Shell、Python3、VPython、PyQT、PyGTK、Coffeescript、GEL、SED、HTML4/5、CSS3、SVG和XML。戴文另外还了解一点其他的语言。他在2012年4月获取了4项计算机证书,它们是NCLA、Linux+、LPIC-1和DCTS。 他的Linux专业ID是LPI000254694。 + +在2012年7月,戴文.科利尔.约翰逊决定从头开始做他的聊天机器人。他设计了自己的标记语言(Xaiml)和AI引擎(ProgramPY-SH 或者 Pysh)。在2013年3月3日,戴文在Launchpad.net上发布了他的新聊天机器人。这个机器人名为Neo,取自原始印欧语中表示“new”的单词。 + +戴文还维护了其他几个项目。他制作Opera和Firefox的主题 ([https://addons.mozilla.org/en-US/firefox/user/DevynCJohnson/][6]) 
([https://my.opera.com/devyncjohnson/account/][7]); 他还有许多其他的图形设计项目。他的大多数编程项目托管在 [https://launchpad.net/~devyncjohnson-d][4], 另外在Sourceforge.net上也有镜像,其他的一些杂项可以通过下面的链接找到。 + +- [http://askubuntu.com/users/158340/devyn-collier-johnson][8] +- [http://unix.stackexchange.com/users/40770/devyn-collier-johnson][9] +- [http://stackoverflow.com/users/2354783/devyn-collier-johnson][10] +- [http://www.linux.org/members/devyncjohnson.4843/][1] +- [http://gnome-look.org/usermanager/search.php?username=DevynCJohnson][11] +- [http://www.creatity.com/?user=1449&action=detailUser][12] +- [http://openclipart.org/user-detail/DevynCJohnson][13] + +-------------------------------------------------------------------------------- + +via: http://www.linux.org/members/devyncjohnson.4843/ + +译者:[geekpi](https://github.com/geekpi) 校对:[校对者ID](https://github.com/校对者ID) + +本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创翻译,[Linux中国](http://linux.cn/) 荣誉推出 + +[1]:http://www.linux.org/members/devyncjohnson.4843/ +[2]:http://gnome-look.org/usermanager/search.php?username=DevynCJohnson&action=contents&PHPSESSID=32424677ef4d9dffed020d06ef2522ac +[3]:https://launchpad.net/neobot +[4]:https://launchpad.net/~devyncjohnson-d +[5]:DevynCJohnson@Gmail.com +[6]:https://addons.mozilla.org/en-US/firefox/user/DevynCJohnson/ +[7]:https://my.opera.com/devyncjohnson/account/ +[8]:http://askubuntu.com/users/158340/devyn-collier-johnson +[9]:http://unix.stackexchange.com/users/40770/devyn-collier-johnson +[10]:http://stackoverflow.com/users/2354783/devyn-collier-johnson +[11]:http://gnome-look.org/usermanager/search.php?username=DevynCJohnson +[12]:http://www.creatity.com/?user=1449&action=detailUser +[13]:http://openclipart.org/user-detail/DevynCJohnson From 20d8365f9f59eb43d6cf4700f4df548d3a90ded2 Mon Sep 17 00:00:00 2001 From: crowner Date: Sat, 26 Oct 2013 12:05:51 +0800 Subject: [PATCH 09/10] =?UTF-8?q?=E7=BF=BB=E8=AF=91=E5=AE=8C=E6=88=90?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 
Content-Transfer-Encoding: 8bit --- ... Understanding The App Menus And Buttons.md | 36 +++++++++++++++++++ 1 file changed, 36 insertions(+) create mode 100644 translated/Daily Ubuntu Tips – Understanding The App Menus And Buttons.md diff --git a/translated/Daily Ubuntu Tips – Understanding The App Menus And Buttons.md b/translated/Daily Ubuntu Tips – Understanding The App Menus And Buttons.md new file mode 100644 index 0000000000..bc100733b6 --- /dev/null +++ b/translated/Daily Ubuntu Tips – Understanding The App Menus And Buttons.md @@ -0,0 +1,36 @@ +Ubuntu每日贴士 – 深入理解应用菜单和按钮 +================================================================================ +Ubuntu是一款很不错的操作系统。它基本上可以做到任何现代操作系统能做的事情,甚至有时候能做的更好。如果你是一个ubuntu新手,那么你现在还有很多不知道的事情。对于那些专家级用户来说十分普通的事情课能对你来说可能就不太普通了,因此这个“ubuntu每日贴士”系列旨在帮助你和新用户轻松设置管理ubuntu。 + +Ubuntu有一个菜单栏。主菜单栏是在屏幕的顶端黑色条状栏,其包含了状态菜单或指示器和时间日期,音量键,应用菜单和窗口管理按钮。 + +窗口管理按钮在主菜单(黑色条状栏)的左上角。当年你打开一个程序的时候,主菜单左上角的按钮包括关闭,最小化,最大化,和保存按钮叫做窗口管理按钮。 + +应用按钮位于窗口管理按钮的右侧。当它打开时显示应用菜单。 + +默认情况下,ubuntu隐藏了窗口应用菜单和管理按钮,只有当你把鼠标放在左侧角里的时候才能看到。如果你打开一个程序但是找不到菜单,只需要把你的鼠标移动到屏幕左上角就可以使它显示出来。 + +如果这让你很困惑,而且你想关闭应用菜单而使每个程序都有自己的菜单的话,继续向下看。 + +运行以下命令以安装或删除应用菜单: + + sudo apt-get autoremove indicator-appmenu + +运行上面的命令将会删除应用菜单即全局菜单。现在,为了使改变生效,先退出然后再登陆回来。 + +现在,当你打开一个ubuntu里面的程序的时候,每个程序就会用显示自己的菜单代替把它隐藏在全局菜单或主菜单里。 + +![](http://www.liberiangeek.net/wp-content/uploads/2013/09/ubuntuappmenuglobalmenu.png) + +就是这样! 想返回原来的状态的话,运行下面的命令: + + sudo apt-get install indicator-appmenu + +使用愉快! 
+-------------------------------------------------------------------------------- + +via: http://www.liberiangeek.net/2013/09/daily-ubuntu-tips-understanding-app-menus-buttons/ + +本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创翻译,[Linux中国](http://linux.cn/) 荣誉推出 + +译者:[crowner](https://github.com/译者ID) 校对:[校对者ID](https://github.com/校对者ID) \ No newline at end of file From 4f742073ff5b8bfe2bd4c8dc7d41f05114104fed Mon Sep 17 00:00:00 2001 From: chenjintao_ii Date: Sat, 26 Oct 2013 15:32:03 +0800 Subject: [PATCH 10/10] work complete, done is better than perfect --- translated/NoSQL comparison.md | 431 +++++++++++++++++---------------- 1 file changed, 216 insertions(+), 215 deletions(-) diff --git a/translated/NoSQL comparison.md b/translated/NoSQL comparison.md index cdd576ea02..e09321e00d 100644 --- a/translated/NoSQL comparison.md +++ b/translated/NoSQL comparison.md @@ -1,11 +1,11 @@ -各种 NoSQL 的比较 TODO: 中英文之间需要半角空格 +各种 NoSQL 的比较 ================ 即使关系型数据库依然是非常有用的工具,它们持续几十年的垄断地位就要走到头了。现在已经存在无数能撼动关系型数据库地位的 NoSQL,当然,这些 NoSQL 还无法完全取代它们。(也就是说,关系型数据库还是处理关系型事务的最佳方式。) NoSQL 与 NoSQL 之间的区别,要远大于 SQL 与 SQL 之间的区别。所以软件架构师必须要在项目一开始就选好一款合适的 NoSQL。 -考虑到这种情况,本文为大家介绍以下几种 NoSQL 之间的区别:[Cassandra][], [Mongodb][], [CouchDB][], [Redis][], [Riak][], [Couchbase (ex-Membase)][], [Hypertable][], [ElasticSearch][], [Accumulo][], [VoltDB][], [Kyoto Tycoon][], [Scalaris][], [Neo4j][]和[HBase][]: +考虑到这种情况,本文为大家介绍以下几种 NoSQL 之间的区别:[Cassandra][], [Mongodb][], [CouchDB][], [Redis][], [Riak][], [Couchbase (ex-Membase)][], [Hypertable][], [ElasticSearch][], [Accumulo][], [VoltDB][], [Kyoto Tycoon][], [Scalaris][], [Neo4j][]和[HBase][]: ##最流行的 NoSQL @@ -13,7 +13,7 @@ NoSQL 与 NoSQL 之间的区别,要远大于 SQL 与 SQL 之间的区别。所 **开发语言:** C++ -**主要特性:** 保留 SQL 中一些用户友好的特性(查询、索引等)。 +**主要特性:** 保留 SQL 中一些用户友好的特性(查询、索引等) **许可证:** AGPL (发起者: Apache) @@ -33,344 +33,345 @@ NoSQL 与 NoSQL 之间的区别,要远大于 SQL 与 SQL 之间的区别。所 - 支持对数据建立索引 - 数据中心意识 -**应用场景:**动态查询;需要定义索引而不是 map/reduce 功能;提高大数据库性能;想使用 CouchDB 但数据的 IO 吞吐量太大,CouchDB 
无法满足要求。MongoDB 可以满足你的需求。 +**应用场景:** 动态查询;需要定义索引而不是 map/reduce 功能;提高大数据库性能;想使用 CouchDB 但数据的 IO 吞吐量太大,CouchDB 无法满足要求。MongoDB 可以满足你的需求 -**使用案例:**想布署 MySQL 或 PostgreSQL,但它们存在的预定义处理语句和预定义变量让你望而却步。这个时候,MongoDB 是你可以考虑的选项。 +**使用案例:** 想布署 MySQL 或 PostgreSQL,但它们存在的预定义处理语句和预定义变量让你望而却步。这个时候,MongoDB 是你可以考虑的选项 ###Riak 1.2版 **开发语言:** Erlang、C、以及一些 JavaScript -**主要特性:**容错机制(当一份数据失效,服务会自动切换到备份数据,保证服务一直在线 —— 译者注) +**主要特性:** 容错机制(当一份数据失效,服务会自动切换到备份数据,保证服务一直在线 —— 译者注) **许可证:** Apache **数据传输、存储的格式:** HTTP/REST 架构,自定义二进制格式 -- 可存储 BLOB(binary large object,二进制大对象,比如一张图片、一个声音文件 —— 译者注)。 -- 可在分部式存储和备份存储之间作协调。 -- 为了保证可验证性和安全性,Riak 在 JS 和 Erlaing 中提供提交前(pre-commit)和提交后(post-commit)钩子(hook)函数(你可以在提交数据前执行一个 hook,或者在提交数据后执行一个 hook —— 译者注)。 -- JS 和 Erlang 提供映射和简化(map/reduce)编程模型。 -- 使用 links 和 link walking 图形化数据库(link 用于描述对象之间的关系,link walking 是一个用于查询对象关系的进程 —— 译者注)。 -- 次要标记(secondaty indeces,开发者在写数据时可用多个名称来标记一个对象 —— 译者注),一次只能用一个。 -- 支持大数据对象(Luwak)(Luwak 是 Riak 中的一个服务层,为大数据量对象提供简单的、面向文档的抽象,弥补了 Riak 的 Key/Value 存储格式在处理大数据对象方面的不足 —— 译者注)。 -- 提供“开源”和“企业”两个版本。 -- 提供“全文搜索”(可能就是允许用户在不提供 table/volume 等信息,对一个表进行文本字段的搜索,瞎猜的,望指正 —— 译者注)。 -- 正在将存储后端从“Bitcask”迁移到 Google 的“LevelDB”上。 -- 企业版本提供多点备份(各点地位平等,非主从架构)和SNMP监控功能。 +- 可存储 BLOB(binary large object,二进制大对象,比如一张图片、一个声音文件 —— 译者注) +- 可在分部式存储和备份存储之间作协调 +- 为了保证可验证性和安全性,Riak 在 JS 和 Erlaing 中提供提交前(pre-commit)和提交后(post-commit)钩子(hook)函数(你可以在提交数据前执行一个 hook,或者在提交数据后执行一个 hook —— 译者注) +- JS 和 Erlang 提供映射和简化(map/reduce)编程模型 +- 使用 links 和 link walking 图形化数据库(link 用于描述对象之间的关系,link walking 是一个用于查询对象关系的进程 —— 译者注) +- 次要标记(secondaty indeces,开发者在写数据时可用多个名称来标记一个对象 —— 译者注),一次只能用一个 +- 支持大数据对象(Luwak)(Luwak 是 Riak 中的一个服务层,为大数据量对象提供简单的、面向文档的抽象,弥补了 Riak 的 Key/Value 存储格式在处理大数据对象方面的不足 —— 译者注) +- 提供“开源”和“企业”两个版本 +- 提供“全文搜索”(可能就是允许用户在不提供 table/volume 等信息,对一个表进行文本字段的搜索,瞎猜的,望指正 —— 译者注) +- 正在将存储后端从“Bitcask”迁移到 Google 的“LevelDB”上 +- 企业版本提供多点备份(各点地位平等,非主从架构)和SNMP监控功能 -**应用场景:**假如你想要类似 Dynamo 的数据库,但不想要它的庞大和复杂;假如你需要良好的单点可扩展性、可用性和容错能力,但不想为多点备份买单。 Riak 能满足你的需求。 +**应用场景:** 假如你想要类似 Dynamo 
的数据库,但不想要它的庞大和复杂;假如你需要良好的单点可扩展性、可用性和容错能力,但不想为多点备份买单。 Riak 能满足你的需求 -**使用案例:**销售点数据收集;工厂控制系统;必须实时在线的系统;需要易于升级的网站服务器。 +**使用案例:** 销售点数据收集;工厂控制系统;必须实时在线的系统;需要易于升级的网站服务器 ###CouchDB 1.2版 **开发语言:** Erlang -**主要特性:**数据一致性;易于使用 +**主要特性:** 数据一致性;易于使用 **许可证:** Apache **数据传输格式:** HTTP/REST -- 双向复制(一种同步技术,每个备份点都有一份它们自己的拷贝,允许用户在存储点断线的情况下修改数据,当存储节点重新上线时,CouchDB 会对所有节点同步这些修改 —— 译者注)。 -- 支持持续同步或者点对点同步。 -- 支持冲突检测。 -- 支持主主互备(多个数据库时时同步数据,起到备份和分摊用户并行访问量的作用 —— 译者注)。 -- 多版本并发控制(MVCC),写操作时不需要阻塞读操作(或者说不需要锁住数据库)。 -- 向下兼容。 -- 可靠的 crash-only 设计(所谓 crash-only,就是程序出错时,只需重启下程序,丢弃内存的所有数据,不需要执行复杂的数据恢复操作 —— 译者注)。 -- 需要实时压缩数据。 -- 视图(文档是 CouchDB 的核心概念,CouchDB 中的视图声明了如何从文档中提取数据,以及如何对提取出来的数据进行处理 —— 译者注):内嵌映射和简化(map/reduce)编程模型。 -- 格式化的views字段:lists(包含把视图运行结果转换成非 JSON 格式的方法)和 shows(包含把文档转换成非 JSON 格式的方法)(在 CouchDB 中,一个 Web 应用是与一个设计文档相对应的。在设计文档中可以包含一些特殊的字段,views 字段包含永久的视图定义 —— 译者注)。 -- 可能会提供服务器端文档验证的功能。 -- 可能提供身份认证功能。 -- 通过 _changes 函数实时更新数据。 -- 链接处理(attachment:couchDB 的每份文档都可以有一个 attachment,就像一份 email 有它的网址 —— 译者注)。 -- 有个 CouchApps(第三方JS的应用)。 +- 双向复制(一种同步技术,每个备份点都有一份它们自己的拷贝,允许用户在存储点断线的情况下修改数据,当存储节点重新上线时,CouchDB 会对所有节点同步这些修改 —— 译者注) +- 支持持续同步或者点对点同步 +- 支持冲突检测 +- 支持主主互备(多个数据库时时同步数据,起到备份和分摊用户并行访问量的作用 —— 译者注) +- 多版本并发控制(MVCC),写操作时不需要阻塞读操作(或者说不需要锁住数据库) +- 向下兼容 +- 可靠的 crash-only 设计(所谓 crash-only,就是程序出错时,只需重启下程序,丢弃内存的所有数据,不需要执行复杂的数据恢复操作 —— 译者注) +- 需要实时压缩数据 +- 视图(文档是 CouchDB 的核心概念,CouchDB 中的视图声明了如何从文档中提取数据,以及如何对提取出来的数据进行处理 —— 译者注):内嵌映射和简化(map/reduce)编程模型 +- 格式化的views字段:lists(包含把视图运行结果转换成非 JSON 格式的方法)和 shows(包含把文档转换成非 JSON 格式的方法)(在 CouchDB 中,一个 Web 应用是与一个设计文档相对应的。在设计文档中可以包含一些特殊的字段,views 字段包含永久的视图定义 —— 译者注) +- 可能会提供服务器端文档验证的功能 +- 可能提供身份认证功能 +- 通过 _changes 函数实时更新数据 +- 链接处理(attachment:couchDB 的每份文档都可以有一个 attachment,就像一份 email 有它的网址 —— 译者注) +- 有个 CouchApps(第三方JS的应用) -**应用场景:**用于随机数据量多、需要预定义查询的地方;用于版本控制比较重要的地方。 +**应用场景:** 用于随机数据量多、需要预定义查询的地方;用于版本控制比较重要的地方 -**使用案例:**可用于客户关系管理(CRM),内容管理系统(CMS);可用于主主互备甚至多机互备。 +**使用案例:** 可用于客户关系管理(CRM),内容管理系统(CMS);可用于主主互备甚至多机互备 ###Redis 2.4版 **开发语言:** C/C++ -**主要特性:**快到掉渣 +**主要特性:** 快到掉渣 
**许可证:** BSD **数据传输方式:** 类似 Telnet -- Redis 是一个内存数据库(in-memory database,简称 IMDB,将数据放在内存进行读写,这才是“快到掉渣”的真正原因 —— 译者注),磁盘只是提供数据持久化(即将内存的数据写到磁盘)的功能(这类数据库被称为“disk backed”数据库)。 -- 当前不支持将磁盘作为 swap 分区,虚拟内存(VM)和 Diskstore 方式都没加到此版本(Redis 的数据持久化共有4种方式:定时快照、基于语句追加、虚拟内存、diskstore。其中 VM 方式由于性能不好以及不稳定的问题,已经被作者放弃,而 diskstore 方式还在实验阶段 —— 译者注)。 +- Redis 是一个内存数据库(in-memory database,简称 IMDB,将数据放在内存进行读写,这才是“快到掉渣”的真正原因 —— 译者注),磁盘只是提供数据持久化(即将内存的数据写到磁盘)的功能(这类数据库被称为“disk backed”数据库) +- 当前不支持将磁盘作为 swap 分区,虚拟内存(VM)和 Diskstore 方式都没加到此版本(Redis 的数据持久化共有4种方式:定时快照、基于语句追加、虚拟内存、diskstore。其中 VM 方式由于性能不好以及不稳定的问题,已经被作者放弃,而 diskstore 方式还在实验阶段 —— 译者注) - 主从备份 -- 存储结构为简单的 key/value 或 hash 表。 -- 但是操作比较复杂,比如:ZREVRANGEBYSCORE。 -- 支持 INCR(INCR key 就是将key中存储的数值加一 —— 译者注)命令(对限速和统计有帮助)。 -- 支持sets数据类型(以及 union/diff/inter)。 -- 支持 lists (以及 queue/blocking pop)。 -- 支持 hash sets (多级对象)。 -- 支持 sorted sets(高效率的表,在范围查找方面有优势)。 -- 支持事务处理。 +- 存储结构为简单的 key/value 或 hash 表 +- 但是操作比较复杂,比如:ZREVRANGEBYSCORE +- 支持 INCR(INCR key 就是将key中存储的数值加一 —— 译者注)命令(对限速和统计有帮助) +- 支持sets数据类型(以及 union/diff/inter) +- 支持 lists (以及 queue/blocking pop) +- 支持 hash sets (多级对象) +- 支持 sorted sets(高效率的表,在范围查找方面有优势) +- 支持事务处理 - 缓存中的数据可被标记为过期 -- Pub/Sub 操作能让用户发送信息。 +- Pub/Sub 操作能让用户发送信息 -**应用场景:**适合布署快速多变的小规模数据(可以完全运行在存在中)。 +**应用场景:** 适合布署快速多变的小规模数据(可以完全运行在存在中) -**使用案例:**股价系统、分析系统、实时数据收集系统、实时通信系统、以及取代 memcached。 +**使用案例:** 股价系统、分析系统、实时数据收集系统、实时通信系统、以及取代 memcached -##Clones of Google's Bigtable +##Google Bigtable 的衍生品 -###HBase (V0.92.0) +###HBase 0.92.0 版 -**Written in:** Java +**开发语言:** Java -**Main point:** Billions of rows X millions of columns +**主要特性:** 支持几十亿行*几百万列的大表 -**License:** Apache +**许可证:** Apache -**Protocol:** HTTP/REST (also Thrift) +**数据传输方式:** HTTP/REST (也支持 Thrift 开发框架) -- Modeled after Google's BigTable -- Uses Hadoop's HDFS as storage -- Map/reduce with Hadoop -- Query predicate push down via server side scan and get filters -- Optimizations for real time queries -- A high performance Thrift gateway -- HTTP supports XML, Protobuf, and 
binary -- Jruby-based (JIRB) shell -- Rolling restart for configuration changes and minor upgrades -- Random access performance is like MySQL -- A cluster consists of several different types of nodes +- 仿造 Google 的 BigTable +- 使用 Hadoop 的 HDFS 文件系统作为存储 +- 使用 Hadoop 的映射和简化(map/reduce)编程模型 +- 查询条件被推送到服务器端,由服务器端执行扫描和过滤 +- 对实时查询进行优化 +- 高性能的 Thrift gateway(访问 HBase 的接口之一,特点是利用 Thrift 序列化支持多种语言,可用于异构系统在线访问 HBase 表数据 —— 译者注) +- 使用 HTTP 通信协议,支持 XML、Protobuf 以及一些二进制文档结构 +- 支持基于 Jruby(JIRB)的shell +- 当配置信息有更改时,支持 rolling restart(轮流重启数据节点) +- 随机读写性能与 MySQL 一样 +- 一个集群可由不同类型的结点组成 -**Best used:** Hadoop is probably still the best way to run Map/Reduce jobs on huge datasets. Best if you use the Hadoop/HDFS stack already. +**应用场景:** Hadoop 可能是在大数据上跑 Map/Reduce 业务的最佳选择;如果你已经搭建了 Hadoop/HDFS 架构,HBase 也是你最佳的选择。 -**For example:** Search engines. Analysing log data. Any place where scanning huge, two-dimensional join-less tables are a requirement. +**使用案例:** 搜索引擎;日志分析系统;扫描大型二维非关系型数据表。 -###Cassandra (1.2) +###Cassandra 1.2版 -**Written in:** Java +**开发语言:** Java -**Main point:** Best of BigTable and Dynamo +**主要特性:** BigTable 和 Dynamo的完美结合(Cassandra 以 Amazon 专有的完全分布式的 Dynamo 为基础,结合了Google BigTable基于 Column Family 的数据模型 —— 译者注) -**License:** Apache +**许可证:** Apache -**Protocol:** Thrift & custom binary CQL3 +**数据传输和存储方式:** Thrift 和自定义二进制 CQL3(即 Cassandra 查询语言第3版 —— 译者注) -- Tunable trade-offs for distribution and replication (N, R, W) -- Querying by column, range of keys (Requires indices on anything that you want to search on) -- BigTable-like features: columns, column families -- Can be used as a distributed hash-table, with an "SQL-like" language, CQL (but no JOIN!) 
-- Data can have expiration (set on INSERT) -- Writes can be much faster than reads (when reads are disk-bound) -- Map/reduce possible with Apache Hadoop -- All nodes are similar, as opposed to Hadoop/HBase -- Very good and reliable cross-datacenter replication +- 可以灵活调整对数据的分布式或备份式存储(通过设置N,R,W之间的关系)(NRW是数据库布署模型中的概念,N是存储网络中复制数据的节点数,R是网络中读数据的节点数,W是网络中写数据的节点数。一个环境中N值是固定的,设置不同的WR值组合能在数据可用性和数据一致性之间取得不同的平衡,可参考 CAP 定理 —— 译者注) +- 按列查询,按keys值排序后存储(需要包含你想要搜索的任何信息)(Cassandra 的数据模型借鉴自 BigTable 的列式存储,列式存储可以理解成这样,将行ID、列簇号,列号以及时间戳一起,组成一个Key,然后将Value按Key的顺序进行存储 —— 译者注) +- 类似 BigTable 的特性:列、列簇 +- 支持分布式 hash 表,使用“类 SQL” 语言 —— CQL(但没有 SQL 中的 JOIN 语句) +- 可以为数据设置一个过期时间(使用 INSERT 指令) +- 写性能远高于读性能(读性能的瓶颈是磁盘 IO) +- 可使用 Hadoop 的映射和简化(map/reduce)编程模型 +- 所有节点都相似,这点与 Hadop/HBase 架构不同 +- 可靠的跨数据中心备份解决方案 -**Best used:** When you write more than you read (logging). If every component of the system must be in Java. ("No one gets fired for choosing Apache's stuff.") +**应用场景:** 写操作多于读操作的环境(比如日志系统);如果系统全部由 JAVA 组成(“没人会因为使用了 Apache 许可下的产品而被炒鱿鱼”(此句貌似是网上有人针对“Apache considered harmful”一文所作的回应 —— 译者注)) -**For example:** Banking, financial industry (though not necessarily for financial transactions, but these industries are much bigger than that.) Writes are faster than reads, so one natural niche is data analysis. +**使用案例:** 银行、金融机构;写性能强于读性能,所以 Cassandra 天生就是用来作数据分析的。 -###Hypertable (0.9.6.5) +###Hypertable 0.9.6.5版 -**Written in:** C++ +**开发语言:** C++ -**Main point:** A faster, smaller HBase +**主要特性:** HBase 的精简版,但比 HBase 更快 -**License:** GPL 2.0 +**许可证:** GPL 2.0 -**Protocol:** Thrift, C++ library, or HQL shell +**数据传输和存储的方式:** Thrift,C++库,或者 HQL shell -- Implements Google's BigTable design -- Run on Hadoop's HDFS -- Uses its own, "SQL-like" language, HQL -- Can search by key, by cell, or for values in column families. -- Search can be limited to key/column ranges. 
-- Sponsored by Baidu -- Retains the last N historical values -- Tables are in namespaces -- Map/reduce with Hadoop +- 采用与 Google BigTable 相似的设计 +- 运行在 Hadoop HDFS 之上 +- 使用自己的“类 SQL”语言 —— HQL +- 可以根据 key 值、单元(cell)进行查找,可以在列簇上查找 +- 查询数据可以指定 key 或者列的范围 +- 由百度公司赞助(百度早在2009年就成为这个项目的赞助商了 —— 好吧译者表示有点大惊小怪了:P) +- 能保留一个值的 N 个历史版本 +- 表在命名空间内定义 +- 使用 Hadoop 的 Map/reduce 模型 -**Best used:** If you need a better HBase. +**应用场景:** 假如你需要一个更好的HBase,就用Hypertable吧。 -**For example:** Same as HBase, since it's basically a replacement: Search engines. Analysing log data. Any place where scanning huge, two-dimensional join-less tables are a requirement. +**使用案例:** 与HBase一样,就是搜索引擎被换了下;分析日志数据的系统;适用于浏览大规模二维非关系型数据表。 -###Accumulo (1.4) +###Accumulo 1.4版 -**Written in:** Java and C++ +**开发语言:** Java 和 C++ -**Main point:** A BigTable with Cell-level security +**主要特性:** 一个有着单元级安全的 BigTable -**License:** Apache +**许可证:** Apache -**Protocol:** Thrift +**数据传输和存储的方式:** Thrift -- Another BigTable clone, also runs of top of Hadoop -- Cell-level security -- Bigger rows than memory are allowed -- Keeps a memory map outside Java, in C++ STL -- Map/reduce using Hadoop's facitlities (ZooKeeper & co) -- Some server-side programming +- 另一个 BigTable 的复制品,也是跑在 Hadoop 的上层 +- 单元级安全保证 +- 允许使用比内存容量更大的数据列 +- 通过 C++ 的 STL 可保持数据从 JAVA 环境的内存映射出来 +- 使用 Hadoop 的 Map/reduce 模型 +- 支持在服务器端编程 -**Best used:** If you need a different HBase. +**应用场景:** HBase的替代品 -**For example:** Same as HBase, since it's basically a replacement: Search engines. Analysing log data. Any place where scanning huge, two-dimensional join-less tables are a requirement. 
+**使用案例:** 与HBase一样,就是搜索引擎被换了下;分析日志数据的系统;适用于浏览大规模二维非关系型数据表。 -##Special-purpose +##特殊用途 -###Neo4j (V1.5M02) +###Neo4j V1.5M02 版 -**Written in:** Java +**开发语言:** Java -**Main point:** Graph database - connected data +**主要特性:** 图形化数据库 -**License:** GPL, some features AGPL/commercial +**许可证:** GPL,AGPL(商业用途) -**Protocol:** HTTP/REST (or embedding in Java) +**数据传输和存储的方式:** HTTP/REST(或内嵌在 Java 中) -- Standalone, or embeddable into Java applications -- Full ACID conformity (including durable data) -- Both nodes and relationships can have metadata -- Integrated pattern-matching-based query language ("Cypher") -- Also the "Gremlin" graph traversal language can be used -- Indexing of nodes and relationships -- Nice self-contained web admin -- Advanced path-finding with multiple algorithms -- Indexing of keys and relationships -- Optimized for reads -- Has transactions (in the Java API) -- Scriptable in Groovy -- Online backup, advanced monitoring and High Availability is AGPL/commercial licensed +- 可独立存在,或内嵌在 JAVA 的应用中 +- 完全的 ACID 保证(包括正在处理的数据) +- 节点和节点的关系都可以拥有原数据 +- 集成基于“模式匹配”的查询语言(Cypher) +- 支持“Gremlin”图形转化语言 +- 可对节点与节点关系进行索引 +- 良好的自包含网页管理技术 +- 多个算法实现高级文件查找功能 +- 可对 key 与 key 的关系进行索引 +- 优化读性能 +- 在 JAVA API 中实现事务处理 +- 可运行脚本 Groovy 脚本 +- 在商用版本中提供在线备份,高级监控和高可用性功能 -**Best used:** For graph-style, rich or complex, interconnected data. Neo4j is quite different from the others in this sense. +**应用场景:** 适用于用图形显示复杂的交互型数据。 -**For example:** For searching routes in social relations, public transport links, road maps, or network topologies. 
+**使用案例:** 搜寻社交关系网、公共传输链、公路路线图、或网络拓扑结构 -###ElasticSearch (0.20.1) +###ElasticSearch 0.20.1 版 -**Written in:** Java +**开发语言:** Java -**Main point:** Advanced Search +**主要特性:** 高级搜索 -**License:** Apache +**许可证:** Apache -**Protocol:** JSON over HTTP (Plugins: Thrift, memcached) +**数据传输和存储的方式:** 通过 HTTP 使用 JSON 进行数据索引(插件:Thrift, memcached) -- Stores JSON documents -- Has versioning -- Parent and children documents -- Documents can time out -- Very versatile and sophisticated querying, scriptable -- Write consistency: one, quorum or all -- Sorting by score (!) -- Geo distance sorting -- Fuzzy searches (approximate date, etc) (!) -- Asynchronous replication -- Atomic, scripted updates (good for counters, etc) -- Can maintain automatic "stats groups" (good for debugging) -- Still depends very much on only one developer (kimchy). +- 以 JSON 形式保存数据 +- 提供版本升级功能 +- 有父文档和子文档功能 +- 文档有过期时间 +- 提供复杂多样的查询指令,可使用脚本 +- 支持写操作一致性的三个级别:ONE、QUORUM、ALL +- 支持通过分数排序 +- 支持通过地理位置排序 +- 支持模糊查询(通过近似数据查询等方式实现) +- 支持异步复制 +- 自动升级,也可通过设置脚本升级 +- 可以维持自动的“统计组”(对调试很有帮助) +- 只有一个开发者(kimchy) -**Best used:** When you have objects with (flexible) fields, and you need "advanced search" functionality. +**应用场景:** 当你有可伸缩性很强的项目并且想拥有“高级搜索”功能。 -**For example:** A dating service that handles age difference, geographic location, tastes and dislikes, etc. Or a leaderboard system that depends on many variables. 
+**使用案例:** 可布署一个约会服务,提供不同年龄、不同地理位置、不同品味的客户的交友需求。或者可以布署一个基于多项参数的排行榜。 -##The "long tail" +##其他 -(Not widely known, but definitely worthy ones) +(不怎么有名,但值得在这里介绍一下) -###Couchbase (ex-Membase) (2.0) +###Couchbase (ex-Membase) 2.0 版 -**Written in:** Erlang & C +**开发语言:** Erlang 和 C -**Main point:** Memcache compatible, but with persistence and clustering +**主要特性:** 兼容 Memcache,但数据是持久化的,并且支持集群 -**License:** Apache +**许可证:** Apache -**Protocol:** memcached + extensions +**数据传输和存储的方式:** 缓存和扩展(memcached + extensions) -- Very fast (200k+/sec) access of data by key -- Persistence to disk -- All nodes are identical (master-master replication) -- Provides memcached-style in-memory caching buckets, too -- Write de-duplication to reduce IO -- Friendly cluster-management web GUI -- Connection proxy for connection pooling and multiplexing (Moxi) -- Incremental map/reduce -- Cross-datacenter replication +- 通过 key 访问数据非常快(20万以上IOPS) +- 数据保存在磁盘(不像 Memcache 保存在内存中 —— 译者注) +- 在主主互备中,所有节点数据是一致的 +- 提供类似 Memcache 将数据保存在内存的功能 +- 支持重复数据删除功能 +- 友好的集群管理 Web 界面 +- 支持池和多丛结构的代理(利用 Moxi 项目) +- 支持 Map/reduce 模式 +- 支持跨数据中心备份 -**Best used:** Any application where low-latency data access, high concurrency support and high availability is a requirement. +**应用场景:** 适用于低延迟数据访问系统,高并发和高可用系统。 -**For example:** Low-latency use-cases like ad targeting or highly-concurrent web apps like online gaming (e.g. Zynga). +**使用案例:** 低延迟可用于广告定投;高并发可用于在线游戏(如星佳公司)。 -###VoltDB (2.8.4.1) +###VoltDB 2.8.4.1版 -**Written in:** Java +**开发语言:** Java -**Main point:** Fast transactions and rapidly changing data +**主要特性:** 快速的事务处理和数据变更 -**License:** GPL 3 +**许可证:** GPL 3 -**Protocol:** Proprietary +**数据传输和存储的方式:** 专有方式 -- In-memory relational database. -- Can export data into Hadoop -- Supports ANSI SQL -- Stored procedures in Java -- Cross-datacenter replication +- 运行在内存的关系型数据库 +- 可以将数据导入到 Hadoop +- 支持 ANSI SQL +- 在 JAVA 环境中保存操作过程 +- 支持跨数据中心备份 -**Best used:** Where you need to act fast on massive amounts of incoming data. 
+**应用场景:** 适用于在大量传入数据中保证快速反应能力的场合。 -**For example:** Point-of-sales data analysis. Factory control systems. +**使用案例:** 销售点数据分析系统;工厂控制系统。 -###Scalaris (0.5) +###Scalaris 0.5版 -**Written in:** Erlang +**开发语言:** Erlang -**Main point:** Distributed P2P key-value store +**主要特性:** 分布式 P2P 键值存储 -**License:** Apache +**许可证:** Apache -**Protocol:** Proprietary & JSON-RPC +**数据传输和存储的方式:** 自有方式和 基于JSON的远程过程调用协议 -- In-memory (disk when using Tokyo Cabinet as a backend) -- Uses YAWS as a web server +- 数据保存在内存中(使用 Tokyo Cabinet 作为后台时,数据可以持久化到磁盘中) +- 使用 YAWS 作为 Web 服务器 - Has transactions (an adapted Paxos commit) -- Consistent, distributed write operations -- From CAP, values Consistency over Availability (in case of network partitioning, only the bigger partition - works) +- 支持事务处理(基于 Paxos 提交)(Paxos 是一种基于消息传递模型的一致性算法 —— 译者注) +- 支持分布式数据的一致性写操作 +- 根据 CAP 定理,数据一致性要求高于数据可用性(前提是在一个比较大的网络分区环境下工作)(CAP 定理:数据一致性consistency、数据可用性availability、分隔容忍partition tolerance是分布式计算系统的三个属性,一个分布式计算系统不可能同时满足全部三项) -**Best used:** If you like Erlang and wanted to use Mnesia or DETS or ETS, but you need something that is accessible from more languages (and scales much better than ETS or DETS). +**应用场景:** 如果你喜欢 Erlang 并且想要使用 Mnesia 或 DETS 或 ETS,但你需要一个能使用多种语言(并且可扩展性强于 ETS 和 DETS)的技术,那就选它吧。 -**For example:** In an Erlang-based system when you want to give access to the DB to Python, Ruby or Java programmers. 
+**使用案例:** 使用基于 Erlang 的系统,但是想通过 Python、Ruby 或 JAVA 访问数据库 -###Kyoto Tycoon (0.9.56) +###Kyoto Tycoon 0.9.56版 -**Written in:** C++ +**开发语言:** C++ -**Main point:** A lightweight network DBM +**主要特性:** 轻量级网络数据库管理系统 -**License:** GPL +**许可证:** GPL -**Protocol:** HTTP (TSV-RPC or REST) +**数据传输和存储的方式:** HTTP (TSV-RPC or REST) -- Based on Kyoto Cabinet, Tokyo Cabinet's successor -- Multitudes of storage backends: Hash, Tree, Dir, etc (everything from Kyoto Cabinet) -- Kyoto Cabinet can do 1M+ insert/select operations per sec (but Tycoon does less because of overhead) -- Lua on the server side -- Language bindings for C, Java, Python, Ruby, Perl, Lua, etc -- Uses the "visitor" pattern -- Hot backup, asynchronous replication -- background snapshot of in-memory databases -- Auto expiration (can be used as a cache server) +- 基于 Kyoto Cabinet, 是 Tokyo Cabinet 的成功案例 +- 支持多种存储后端:Hash,树、目录等等(所有概念都是从 Kyoto Cabinet 那里来的) +- Kyoto Cabinet 可以达到每秒100万次插入/查询操作(但是 Tycoon 由于瓶颈问题,性能比 Cabinet 要差点) +- 服务器端支持 Lua 脚本语言 +- 支持 C、JAVA、Python、Ruby、Perl、Lua 等语言 +- 使用访问者模式开发(visitor patten:让开发者能在不修改类层次结构的前提下,定义该类层次结构的操作 —— 不明白就算了,译者也不明白) +- 支持热备、异步备份 +- 支持内存数据库在后端执行快照 +- 自动过期处理(可用来布署一个缓存服务器) -**Best used:** When you want to choose the backend storage algorithm engine very precisely. When speed is of the essence. +**应用场景:** 当你想要一个很精准的后端存储算法引擎,并且速度是刚需的时候,玩玩 Kyoto Tycoon 吧。 -**For example:** Caching server. Stock prices. Analytics. Real-time data collection. Real-time communication. And wherever you used memcached before. +**使用案例:** 缓存服务器;股价查询系统;数据分析系统;实时数据控制系统;实时交互系统;memcached的替代品。 -Of course, all these systems have much more features than what's listed here. I only wanted to list the key points that I base my decisions on. Also, development of all are very fast, so things are bound to change. +当然,上述系统的特点肯定不止列出来这么点。我只是列出了我认为很关键的信息。另外科技发展迅猛,技术改变得非常快。 -P.s.: And no, there's no date on this review. There are version numbers, since I update the databases one by one, not at the same time. 
And believe me, the basic properties of databases don't change that much. +附:现在下定论比较孰优孰劣还为时过早。上述数据库的版本号以及特性我会一个一个慢慢更新。相信我,这些数据库的特性不会变得很快。 ---