diff --git a/sources/talk/20150513 XML vs JSON.md b/sources/talk/20150513 XML vs JSON.md index 2b5f10e5d8..aa7b9be486 100644 --- a/sources/talk/20150513 XML vs JSON.md +++ b/sources/talk/20150513 XML vs JSON.md @@ -10,12 +10,12 @@ XML vs JSON ====== -### 简介 Introduction +### 简介 XML and JSON are the two most common formats for data interchange in the Web today. XML was created by the W3C in 1996, and JSON was publicly specified by Douglas Crockford in 2002. Although their purposes are not identical, they are frequently used to accomplish the same task, which is data interchange. Both have well-documented open standards on the Web ([RFC 7159][1], [RFC 4825][2]), and both are human and machine-readable. Neither one is absolutely superior to the other, as each is better suited for different use cases. 在今天,XML 和 JSON 是互联网上最常用的两种数据交换格式。XML 格式由 W3C 于 1996 年提出。JSON 格式由 Douglas Crockford 于 2020 年提出。虽然制作这两种格式的目的不同,但它们都是数据交换中常用的格式。这两种格式的文档都很完善([RFC 7159][1], [RFC 4825][2]),且都同时具有人类可读性human-readable机器可读性machine-readable。这两种格式并没有哪一个比另一个更强,只是各自适用的领域不用。(LCTT译注:W3C 是[互联网联盟](https://www.w3.org/),制定了各种 Web 相关的标准,如 HTML、CSS 等。Douglas Crockford 除了制定了 JSON 格式,还致力于改进 JavaScript,开发了 JavaScript 相关工具 [JSLint](http://jslint.com/) 和 [JSMin](http://www.crockford.com/javascript/jsmin.html)。) -### XML 的优点 Advantages +### XML 的优点 There are several advantages that XML has over JSON. One of the biggest differences between the two is that in XML you can put metadata into the tags in the form of attributes. With JSON, the programmer could accomplish the same goal that metadata achieves by making the entity an object and adding the attributes as members of the object. However, the way XML does it may often be preferable, given the slightly misleading nature of turning something into an object that is not one in the client program. For example, if your C++ program sends an int via JSON and needs metadata to be sent along with it, you would have to make it an object, with one name/value pair for the actual value of the int, and more name/value pairs for each attribute. The program that receives the JSON would read it as an object, when in fact it is not one. While this is a viable solution, it defies one of JSON’s key advantages: “JSON's structures look like conventional programming language structures. No restructuring is necessary.”[2] XML 与 JSON 相比有很多优点。二者间最大的不同在于使用 XML 时可以通过在标签中添加属性这一简单的方法来存储元数据metadata。而使用 JSON 时需要创建一个对象,把元数据当作对象的成员来存储。虽然二者都能达到存储元数据的目的,但在这一情况下通常选择 XML,因为 JSON 的表达形式会让客户端程序误以为要将数据转换成一个对象。举个例子,如果你的 C++ 程序需要使用 JSON 格式发送一个附带元数据的整型数据,需要创建一个对象,用对象中的一个名称/值对name/value pair来记录整型数据的值,再为每一个附带的属性添加一个名称/值对。接收到这个 JSON 的程序在读取后很可能把它当成一个对象,可事实并不是这样。虽然这是使用 JSON 传递元数据的一种变通方法,但他违背了 JSON 的核心优势:“JSON 的结构与常规的程序语言中的结构相对应,而无需修改。JSON's structures look like conventional programming language structures. No restructuring is necessary.”[2] @@ -29,13 +29,13 @@ XML 的另一个优势在于大多数的浏览器可以把它以具有高 One of the most significant advantages that XML has over JSON is its ability to communicate mixed content, i.e. strings that contain structured markup. In order to handle this with XML, the programmer need only put the marked-up text within a child tag of the parent in which it belongs. Similar to the metadata situation, since JSON only contains data, there is no such simple way to indicate markup. It would again require storing metadata as data, which could be considered an abuse of the format. XML 对比 JSON 有一个很重要的能力就是它可以混合多种内容mixed content。例如在 XML 中处理包含结构化标记的字符串时,程序员们只要把带有标记的文本放在一个标签内就可以了。而且和元数据的处理类似,由于 JSON 只包含数据,没有用于指明标签的简单方式。虽然这里还可以使用处理元数据的解决方法,但这总有点误用格式之嫌。 -### JSON 的优点 Advantages +### JSON 的优点 JSON has several advantages as well. One of the most obvious of these is that JSON is significantly less verbose than XML, because XML necessitates opening and closing tags (or in some cases less verbose self-closing tags), and JSON uses name/value pairs, concisely delineated by “{“ and “}” for objects, “[“ and “]” for arrays, “,” to separate pairs, and “:” to separate name from value. Even when zipped (using gzip), JSON is still smaller and it takes less time to zip it.[6] As determined by Sumaray and Makki as well as Nurseitov, Paulson, Reynolds, and Izurieta in their experimental findings, JSON outperforms XML in a number of ways. First, naturally following from its conciseness, JSON files that contain the same information as their XML counterparts are almost always significantly smaller, which leads to faster transmission and processing. Second, difference in size aside, both groups found that JSON was serialized and deserialized drastically faster than XML.[3][4] Third, the latter study determined that JSON processing outdoes XML in CPU resource utilization. They found that JSON used less total resources, more user CPU, and less system CPU. The experiment used RedHat machines, and RedHat claims that higher user CPU usage is preferable.[3] Unsurprisingly, the Sumaray and Makki study determined that JSON performance is superior to XML on mobile devices too.[4] This makes sense, given that JSON uses less resources, and mobile devices are less powerful than desktop machines. JSON 自身也有很多优点。其最显而易见的一点就是 JSON 比 XML 简洁得多。因为 XML 中需要标签的打开和关闭,而 JSON 使用名称/值对,使用简单的“{”和“}”来标记对象,“\[”和“\]”来标记数组,“,”来表示数据的分隔,“:”表示名称和值的分隔。就算是使用 gzip 压缩,JSON 还是比 XML 要小,而且耗时更少。[6]正如 Sumaray 和 Makki 在实验中指出的那样,JSON 在很多方面都比 XML 更具优势,得出同样结果的还有 Nurseitov、Paulson、Reynolds 和 Izurieta。首先,由于 JSON 文件天生的简洁性,与包含相同信息的 XML 相比,JSON 总是更小,这就意味着更快的传输和处理速度。第二,在不考虑大小的情况下,两组研究[3][4]表明使用 JSON 序列化和反序列化的速度显著优于使用 XML。第三,后续的研究指出 JSON 的处理会使用更多的 CPU 资源。他们发现 JSON 在总体上使用的资源更少,但在用户空间消耗更多的 CPU 资源,同时系统空间消耗更少的 CPU 资源。这一实验是在 RedHat 的设备上进行的,RedHat 更倾向于在用户空间消耗更多的 CPU 资源。[3]不出意外,Sumaray 和 Makki 的研究里还说明了在移动设备上 JSON 的性能也优于 XML。[4]这说得通,因为 JSON 在整体上消耗的资源更少,而且移动设备也没有台式机那么强劲。 Yet another advantage that JSON has over XML is that its representation of objects and arrays allows for direct mapping onto the corresponding data structures in the host language, such as objects, records, structs, dictionaries, hash tables, keyed lists, and associative arrays for objects, and arrays, vectors, lists, and sequences for arrays.[2] Although it is perfectly possible to represent these structures in XML, it is only as a function of the parsing, and it takes more code to serialize and deserialize properly. It also would not always be obvious to the reader of arbitrary XML what tags represent an object and what tags represent an array, especially because nested tags can just as easily be structured markup instead. The curly braces and brackets of JSON definitively show the structure of the data. However, this advantage does come with the caveat explained above, that the JSON can inaccurately represent the data if the need arises to send metadata. -JSON 的另一个优点在于其对对象和数组的描述允许宿主语言host language直接将它映射到对应数据结构上,例如对象object记录record结构体struct字典dictionary哈希表hash table键值列表keyed list还有对象组成的数组,以及数组array向量vector列表list等等。[2] 虽然 XML 里也能表达这些数据结构,也只需调用一个函数就能完成解析,但需要更多的代码才能正确的完成 XML 的序列化和反序列化处理。而且 XML 对于人类来说不如 JSON 那么直观,因为 XML 标准缺乏对象、数组的标签的明确定义,尤其是潜逃的标签可以简单的使用结构化的标记替代时。JSON 中的花括号和中括号则明确表示了数据的结构,当然这一优势也符合前文中的警告,在包含元数据时 JSON 的表示不如 XML 精确。 +JSON 的另一个优点在于其对对象和数组的描述允许宿主语言host language直接将它映射到对应数据结构上,例如对象object记录record结构体struct字典dictionary哈希表hash table键值列表keyed list还有对象组成的数组,以及数组array向量vector列表list等等。[2] 虽然 XML 里也能表达这些数据结构,也只需调用一个函数就能完成解析,但需要更多的代码才能正确的完成 XML 的序列化和反序列化处理。而且 XML 对于人类来说不如 JSON 那么直观,因为 XML 标准缺乏对象、数组的标签的明确定义,尤其是嵌套的标签可以简单的使用结构化的标记替代时。JSON 中的花括号和中括号则明确表示了数据的结构,当然这一优势也符合前文中的警告,在包含元数据时 JSON 的表示不如 XML 精确。 Although XML supports namespaces and prefixes, JSON’s handling of name collisions is less verbose than prefixes, and arguably feels more natural with the program using it; in JSON, each object is its own namespace, so names may be repeated as long as they are in different scopes. This may be preferable, as in most programming languages members of different objects can have the same name, because they are distinguished by the names of the objects to which they belong. 虽然 XML 支持命名空间namespace前缀prefix,但这不代表 JSON 没有处理命名冲突的能力。比起 XML 的前缀,它处理命名冲突的方式更简洁,在程序中的处理也更自然。在 JSON 里,每一个对象都在它自己的命名空间中,因此不同对象内的元素可以随意的重复。因为在大多数编程语言中,不同的对象中的成员可以包含相同的名字,所以 JSON 根据对象名称进行区分的规则在处理时更加自然。 @@ -43,7 +43,7 @@ Although XML supports namespaces and prefixes, JSON’s handling of name collisi Perhaps the most significant advantage that JSON has over XML is that JSON is a subset of JavaScript, so code to parse and package it fits very naturally into JavaScript code. This seems highly beneficial for JavaScript programs, but does not directly benefit any programs that use languages other than JavaScript. However, this drawback has been largely overcome, as currently the JSON website lists over 175 tools for 64 different programming languages that exist to integrate JSON processing. While I cannot speak to the quality of most of these tools, it is clear that the developer community has embraced JSON and has made it simple to use in many different platforms. 也许 JSON 比 XML 更优的部分是因为 JSON 是 JavaScript 的子集,所以在 JavaScript 代码中对它的解析或封装都非常的自然。虽然这看起来对 JavaScript 程序非常有用,而其他程序则不能直接从中获益,可实际上这一问题已经被很好的解决了。现在 JSON 的网站的列表上展示了 64 种不同语言的 175 个工具,它们都继承了 JSON 处理功能。虽然我不能评价大多数工具的质量,但它们的存在明确了开发者社区拥抱 JSON 这一现象,而且它们切实简化了在不同平台使用 JSON 的难度。 -### 目标 +### 二者的动机 Simply put, XML’s purpose is document markup. This is decidedly not a purpose of JSON, so XML should be used whenever this is what needs to be done. It accomplishes this purpose by giving semantic meaning to text through its tree-like structure and ability to represent mixed content. Data structures can be represented in XML, but that is not its purpose. 简单地说,XML 的目标是完成一种文档标记。这和 JSON 的目标想去甚远,所以只要用得到 XML 的地方就尽管用。它使用树形的结构和包含语义的文本来表达混合内容以达成这一目标。XML 可以表示数据的结构,但这并不是它的初衷。 @@ -69,25 +69,32 @@ The following major desktop software uses XML only: Microsoft Word, Apache OpenO 这些主流的桌面软件仍然只是用 XML:Microsoft Word,Apache OpenOffice,LibraOffice。 It makes sense for software that is mainly concerned with document creation, manipulation, and storage to use XML rather than JSON. Also, all three of these programs support mixed content, which JSON does not do well. For example, if a user is typing up an essay in Microsoft Word, they may put different font, size, color, positioning, and styling on different blocks of text. XML naturally represents these properties with nested tags and attributes. -因为这些软件需要考虑引用、格式、存储等等,所以比起 JSON,XML 优势更大。另外,这三款程序都支持混合内容,而 JSON 在这一点上做得并不如 XML 好。 For example, if a user is typing up an essay in Microsoft Word, they may put different font, size, color, positioning, and styling on different blocks of text. XML naturally represents these properties with nested tags and attributes. +因为这些软件需要考虑引用、格式、存储等等,所以比起 JSON,XML 优势更大。另外,这三款程序都支持混合内容,而 JSON 在这一点上做得并不如 XML 好。举例说明,当用户使用 Microsoft Word 编辑一篇论文时,用户需要使用不同的文字字形、文字大小、文字颜色、页边距、段落格式等,而 XML 结构化的组织形式与标签属性生来就是为了表达这些信息的。 The following major databases support XML: IBM DB2, Microsoft SQL Server, Oracle Database, PostgresSQL, BaseX, eXistDB, MarkLogic, MySQL. +这些主流的数据库支持 XML:IBM DB2,Microsoft SQL Server,Oracle Database,PostgresSQL,BaseX,eXistDB,MarkLogic,MySQL。 The following major databases support JSON: MongoDB, CouchDB, eXistDB, Elastisearch, BaseX, MarkLogic, OrientDB, Oracle Database, PostgresSQL, Riak. +这些是支持 JSON 的主流数据库:MongoDB,CouchDB,eXistDB,Elastisearch,BaseX,MarkLogic,OrientDB,Oracle Database,PostgresSQL,Riak。 For a long time, SQL and the relational database model dominated the market. Corporate giants like Oracle and Microsoft have always marketed such databases. However, in the last decade, there has been a major rise in popularity of NoSQL databases. As this has coincided with the rise of JSON, most NoSQL databases support JSON, and some, such as MongoDB, CouchDB, and Riak use JSON to store their data. These databases have two important qualities that make them better suited for modern websites: they are generally more scalable than relational SQL databases, and they are designed to the core to run in the Web.[10] Since JSON is more lightweight and a subset of JavaScript, it suits NoSQL databases well, and helps facilitate these two qualities. In addition, many older databases have added support for JSON, such as Oracle Database and PostgresSQL. Conversion between XML and JSON is a hassle, so naturally, as more developers use JSON for their apps, more database companies have incentive to support it.[7] +在很长一段时间里,SQL 和关系型数据库统治着整个数据库市场。像甲骨文Oracle微软Microsoft这样的软件巨头都提供这类数据库,然而近几年 NoSQL 数据库正逐步受到开发者的青睐。也许是正巧碰上了 JSON 的普及,大多数 NoSQL 数据库都支持 JSON,甚至像 MongoDB、CouchDB 和 Riak 这样的数据库使用 JSON 来存储数据。这些数据库有两个重要的品质是它们适用于现代网站:一是它们与关系型数据库相比更容易扩展more scalable;二是它们设计的目标 web 运行所需的核心组件。由于 JSON 更加轻量,又是 JavaScript 的子集,所以很适合 NoSQL 数据库,并且让这两个品质更容易实现。此外,许多旧的关系型数据库增加了 JSON 支持,例如 Oracle Database 和 PostgresSql。由于 XML 与 JSON 间的转换比较麻烦,所以大多数开发者会直接在他们的应用里使用 JSON,因此开发数据库的公司才有支持 JSON 的理由。(LCTT译注:NoSQL是对不同于传统的关系数据库的数据库管理系统的统称。[参考来源](https://zh.wikipedia.org/wiki/NoSQL)) -### Future +### 未来 One of the most heavily anticipated changes in the Internet is the “Internet of Things”, i.e. the addition to the Internet of non-computer devices such as watches, thermostats, televisions, refrigerators, etc. This movement is well underway, and is expected to explode in the near future, as predictions for the number of devices in the Internet of Things in 2020 range from about 26 billion to 200 billion.[12][13][13] Almost all of these devices are smaller and less powerful than laptop and desktop computers. Many of them only run embedded systems. Thus, when they have the need to exchange data with other entities in the Web, the lighter and faster JSON would naturally be preferable to XML.[10] Also, with the recent rapid rise of JSON use in the Web relative to XML, new devices may benefit more from speaking JSON. This highlights an example of Metcalf’s Law; whether XML or JSON or something entirely new becomes the most popular format in the Web, newly added devices and all existing devices will benefit much more if the newly added devices speak the most popular language. +对互联网的变革中,最让人期待的便是物联网Internet of Things。这会给互联网带来大量的非计算机设备,例如手表、温度计、电视、冰箱等等。这一势头的发展良好,预期将在不久的将来迎来爆发式的增长。据估计,到 2020 年时会有 260 亿 到 2000 亿的物联网设备被接入互联网。[12][13] 几乎所有的物联网设备都是小型设备,此外比笔记本或者台式电脑的性能要弱很多。大多数都是嵌入式系统。因此,当他们需要与 web 上的系统交换数据时,更轻量,更快速的 JSON 自然比 XML 更受青睐。[10] 受益于 JSON 在 web 上的快速普及,与 XML 相比,这些新的物联网设备更有可能从使用 JSON 中受益。这是一个典型的梅特卡夫定律的实例,无论是 XML 还是 JSON,抑或是什么其他全新的格式,现存的设备和新的设备都会从支持最广泛使用的格式中受益。 With the creation and recent rapid increase in popularity of Node.js, a server-side JavaScript framework, along with NoSQL databases like MongoDB, full-stack JavaScript development has become a reality. This bodes well for the future of JSON, as with these new apps, JSON is spoken at every level of the stack, which generally makes the apps very fast and lightweight. This is a desirable trait for any app, so this trend towards full-stack JavaScript is not likely to die out anytime soon.[10] +Node.js 是一款服务器端的 JavaScript 框架,随着她的诞生与快速成长,与 MongoDB 等 NoSQL 数据库一起,让全栈使用 JavaScript 开发成为可能。This bodes well for the future of JSON, as with these new apps, JSON is spoken at every level of the stack, which generally makes the apps very fast and lightweight. This is a desirable trait for any app, so this trend towards full-stack JavaScript is not likely to die out anytime soon.[10] Another existing trend in the world of app development is toward REST and away from SOAP.[11][15][16] Both XML and JSON can be used with REST, but SOAP exclusively uses XML. + The given trends indicate that JSON will continue to dominate the Web, and XML use will continue to decrease. This should not be overblown, however, because XML is still very heavily used in the Web, and it is the only option for apps that use SOAP. Given the widespread migration from SOAP to REST, the rise of NoSQL databases and full-stack JavaScript, and the far superior performance of JSON, I believe that JSON will soon be much more widely used than XML in the Web. There seem to be very few applications where XML is the better choice. -### References + +### 参考链接 1. [XML Tutorial](http://www.w3schools.com/xml/default.asp) 2. [Introducing JSON](http://www.json.org/)