mirror of
https://github.com/LCTT/TranslateProject.git
synced 2025-02-03 23:40:14 +08:00
Merge pull request #9517 from MjSeven/master
20180710 Python Sets What Why and How.md translation finished
commit
43218c69b9
@ -1,394 +0,0 @@
|
||||
MjSeven is translating
|
||||
|
||||
|
||||
Python Sets: What, Why and How
|
||||
============================================================
|
||||
|
||||
posted on 07/10/2018 by [wilfredinni][5]
|
||||
|
||||
![Python Sets: What, Why and How](https://raw.githubusercontent.com/wilfredinni/pysheetComments/master/2018-july/python_sets/sets.png)
|
||||
|
||||
Python comes equipped with several built-in data types to help us organize our data. These structures include lists, dictionaries, tuples and sets.
|
||||
|
||||
From the Python 3 documentation:
|
||||
|
||||
> A set is an _unordered collection_ with no _duplicate elements_. Basic uses include _membership testing_ and _eliminating duplicate entries_. Set objects also support mathematical operations like _union_, _intersection_, _difference_, and _symmetric difference_.
|
||||
|
||||
In this article, we are going to review and see examples of every one of the elements listed in the above definition. Let's start right away and see how we can create them.
|
||||
|
||||
### Initializing a Set
|
||||
|
||||
There are two ways to create a set: one is to pass an iterable of elements (a list, for example) to the built-in function `set()`, and the other is to use curly braces `{}`.
|
||||
|
||||
Initializing a set using the `set()` built-in function:
|
||||
|
||||
```
|
||||
>>> s1 = set([1, 2, 3])
|
||||
>>> s1
|
||||
{1, 2, 3}
|
||||
>>> type(s1)
|
||||
<class 'set'>
|
||||
|
||||
```
|
||||
|
||||
Initializing a set using curly braces `{}`:
|
||||
|
||||
```
|
||||
>>> s2 = {3, 4, 5}
|
||||
>>> s2
|
||||
{3, 4, 5}
|
||||
>>> type(s2)
|
||||
<class 'set'>
|
||||
>>>
|
||||
|
||||
```
|
||||
|
||||
As you can see, both options are valid. The problem comes when what we want is an empty one:
|
||||
|
||||
```
|
||||
>>> s = {}
|
||||
>>> type(s)
|
||||
<class 'dict'>
|
||||
|
||||
```
|
||||
|
||||
That's right, we will get a dictionary instead of a set if we use empty curly braces =)
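To get an empty set, call `set()` with no arguments. A minimal sketch (the variable names are only illustrative):

```python
# An empty pair of braces is a dict literal, so use set() for an empty set.
empty_dict = {}
empty_set = set()

print(type(empty_dict))  # <class 'dict'>
print(type(empty_set))   # <class 'set'>

# The empty set can then be filled as usual.
empty_set.add(1)
print(empty_set)         # {1}
```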
|
||||
|
||||
It's a good moment to mention that, for the sake of simplicity, all the examples in this article use single-digit integers, but sets can hold any [hashable][6] data type that Python supports. In other words, integers, strings and tuples, but not _mutable_ items like _lists_ or _dictionaries_:
|
||||
|
||||
```
|
||||
>>> s = {1, 'coffee', [4, 'python']}
|
||||
Traceback (most recent call last):
|
||||
File "<stdin>", line 1, in <module>
|
||||
TypeError: unhashable type: 'list'
|
||||
|
||||
```
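If the data you want to store arrives as a list, one common workaround is to convert it to a tuple first. The sketch below assumes the tuple itself contains only hashable values; the names `mixed`, `items` and `ok` are made up for this example:

```python
# Hashable values (ints, strings, tuples of hashables, frozensets) can be set elements.
mixed = {1, 'coffee', (4, 'python'), frozenset({7, 8})}
print(len(mixed))  # 4

# A list is mutable and therefore unhashable, so this raises TypeError.
items = [4, 'python']
try:
    bad = {1, 'coffee', items}
except TypeError as err:
    print(err)  # unhashable type: 'list'

# Converting the list to a tuple makes it usable as a set element.
ok = {1, 'coffee', tuple(items)}
print((4, 'python') in ok)  # True
```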
|
||||
|
||||
Now that you know how to create a set and what type of elements it can contain, let's continue and see _why_ we should always have them in our toolkit.
|
||||
|
||||
### Why You Should Use Them
|
||||
|
||||
When writing code, there is usually more than one way to do it. Some are considered pretty bad, and others _clear, concise and maintainable_, or "[_pythonic_][7]".
|
||||
|
||||
From [The Hitchhiker’s Guide to Python][8]:
|
||||
|
||||
> When a veteran Python developer (a Pythonista) calls portions of code not “Pythonic”, they usually mean that these lines of code do not follow the common guidelines and fail to express its intent in what is considered the best (hear: most readable) way.
|
||||
|
||||
Let's start exploring how Python sets can help us not just with readability, but also with speeding up our programs' execution time.
|
||||
|
||||
### Unordered Collection of Elements
|
||||
|
||||
First things first: you can't access a set element using indexes.
|
||||
|
||||
```
|
||||
>>> s = {1, 2, 3}
|
||||
>>> s[0]
|
||||
Traceback (most recent call last):
|
||||
File "<stdin>", line 1, in <module>
|
||||
TypeError: 'set' object does not support indexing
|
||||
|
||||
```
|
||||
|
||||
Or access them with slices:
|
||||
|
||||
```
|
||||
>>> s[0:2]
|
||||
Traceback (most recent call last):
|
||||
File "<stdin>", line 1, in <module>
|
||||
TypeError: 'set' object is not subscriptable
|
||||
|
||||
```
|
||||
|
||||
BUT, if what we need is to remove duplicates or do mathematical operations like combining lists (unions), we can, and _SHOULD_, use sets.
|
||||
|
||||
I have to mention that lists outperform sets when iterating over the elements, so prefer a list if iteration is what you need. Why? Well, this article does not intend to explain the inner workings of sets, but if you are interested, here are a few links where you can read about it:
|
||||
|
||||
* [TimeComplexity][1]
|
||||
|
||||
* [How is set() implemented?][2]
|
||||
|
||||
* [Python Sets vs Lists][3]
|
||||
|
||||
* [Is there any advantage or disadvantage to using sets over list comps to ensure a list of unique entries?][4]
|
||||
|
||||
### No Duplicate Items
|
||||
|
||||
While writing this, I cannot stop thinking of all the times I used a _for_ loop and an _if_ statement to check for and remove duplicate elements in a list. My face turns red remembering that, more than once, I wrote something like this:
|
||||
|
||||
```
>>> my_list = [1, 2, 3, 2, 3, 4]
>>> no_duplicate_list = []
>>> for item in my_list:
...     if item not in no_duplicate_list:
...         no_duplicate_list.append(item)
...
>>> no_duplicate_list
[1, 2, 3, 4]
```
|
||||
|
||||
Or used a list comprehension:
|
||||
|
||||
```
|
||||
>>> my_list = [1, 2, 3, 2, 3, 4]
|
||||
>>> no_duplicate_list = []
|
||||
>>> [no_duplicate_list.append(item) for item in my_list if item not in no_duplicate_list]
|
||||
[None, None, None, None]
|
||||
>>> no_duplicate_list
|
||||
[1, 2, 3, 4]
|
||||
|
||||
```
|
||||
|
||||
But it's OK, none of that matters anymore because we now have sets in our arsenal:
|
||||
|
||||
```
|
||||
>>> my_list = [1, 2, 3, 2, 3, 4]
|
||||
>>> no_duplicate_list = list(set(my_list))
|
||||
>>> no_duplicate_list
|
||||
[1, 2, 3, 4]
|
||||
>>>
|
||||
|
||||
```
|
||||
|
||||
Now let's use the _timeit_ module and see the execution time of lists and sets when removing duplicates:
|
||||
|
||||
```
>>> from timeit import timeit
>>> def no_duplicates(items):
...     no_duplicate_list = []
...     [no_duplicate_list.append(item) for item in items if item not in no_duplicate_list]
...     return no_duplicate_list
...
>>> # first, let's see how the list performs:
>>> print(timeit('no_duplicates([1, 2, 3, 1, 7])', globals=globals(), number=1000))
0.0018683355819786227
```
|
||||
|
||||
```
|
||||
>>> from timeit import timeit
|
||||
>>> # and the set:
|
||||
>>> print(timeit('list(set([1, 2, 3, 1, 2, 3, 4]))', number=1000))
|
||||
0.0010220493243764395
|
||||
>>> # faster and cleaner =)
|
||||
|
||||
```
|
||||
|
||||
Not only do we write _fewer lines_ with sets than with list comprehensions, we also get more _readable_ and _performant_ code.
|
||||
|
||||
Note: remember that sets are unordered, so there is no guarantee that when converting them back to a list the order of the elements is going to be preserved.
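When the original order matters, a set alone is not enough. Two common approaches are sketched below; both are standard Python, but the variable names are illustrative, and the first relies on dictionaries keeping insertion order (guaranteed since Python 3.7):

```python
my_list = [3, 1, 3, 2, 1, 4]

# Dict keys are unique and keep insertion order (Python 3.7+).
ordered_unique = list(dict.fromkeys(my_list))
print(ordered_unique)  # [3, 1, 2, 4]

# Alternative: keep a set of already-seen items for fast membership tests
# while building the result list in order.
seen = set()
ordered_unique_2 = []
for item in my_list:
    if item not in seen:
        seen.add(item)
        ordered_unique_2.append(item)
print(ordered_unique_2)  # [3, 1, 2, 4]
```

Both versions still require the items to be hashable, just like `list(set(my_list))`.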
|
||||
|
||||
From the [Zen of Python][9]:
|
||||
|
||||
> Beautiful is better than ugly.
|
||||
> Explicit is better than implicit.
|
||||
> Simple is better than complex.
|
||||
> Flat is better than nested.
|
||||
|
||||
Aren't sets just Beautiful, Explicit, Simple and Flat? =)
|
||||
|
||||
### Membership Tests
|
||||
|
||||
Every time we use an _if_ statement to check whether an element is, for example, in a list, we are doing a membership test:
|
||||
|
||||
```
>>> my_list = [1, 2, 3]
>>> if 2 in my_list:
...     print('Yes, this is a membership test!')
...
Yes, this is a membership test!
```
|
||||
|
||||
And sets are more performant than lists when doing them:
|
||||
|
||||
```
>>> from timeit import timeit
>>> def in_test(iterable):
...     for i in range(1000):
...         if i in iterable:
...             pass
...
>>> timeit('in_test(iterable)',
...        setup="from __main__ import in_test; iterable = list(range(1000))",
...        number=1000)
12.459663048726043
```
|
||||
|
||||
```
>>> from timeit import timeit
>>> def in_test(iterable):
...     for i in range(1000):
...         if i in iterable:
...             pass
...
>>> timeit('in_test(iterable)',
...        setup="from __main__ import in_test; iterable = set(range(1000))",
...        number=1000)
0.12354438152988223
>>>
```
|
||||
|
||||
Note: the above tests come from [this][10] StackOverflow thread.
|
||||
|
||||
So if you are doing membership tests like this against huge lists, converting the list into a set should speed things up a good deal.
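In practice the conversion pays off when it is done once and the resulting set is reused for many lookups. A small sketch, with made-up data (`allowed_ids` and `incoming` are hypothetical names):

```python
allowed_ids = list(range(1000))   # hypothetical large list
incoming = [5, 999, 1000, 2500]   # hypothetical values to check

# Build the set once (one pass over the list), then reuse it for every lookup.
allowed_set = set(allowed_ids)

matches = [value for value in incoming if value in allowed_set]
print(matches)  # [5, 999]
```

Converting inside the loop would rebuild the set on every test and throw the advantage away.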
|
||||
|
||||
### How to Use Them
|
||||
|
||||
Now that you know what a set is and why you should use them, let's do a quick tour and see how we can modify and operate on them.
|
||||
|
||||
### Adding Elements
|
||||
|
||||
Depending on the number of elements to add, we will have to choose between the `add()` and `update()` methods.
|
||||
|
||||
`add()` will add a single element:
|
||||
|
||||
```
|
||||
>>> s = {1, 2, 3}
|
||||
>>> s.add(4)
|
||||
>>> s
|
||||
{1, 2, 3, 4}
|
||||
|
||||
```
|
||||
|
||||
And `update()` multiple ones:
|
||||
|
||||
```
|
||||
>>> s = {1, 2, 3}
|
||||
>>> s.update([2, 3, 4, 5, 6])
|
||||
>>> s
|
||||
{1, 2, 3, 4, 5, 6}
|
||||
|
||||
```
|
||||
|
||||
Remember, sets remove duplicates.
|
||||
|
||||
### Removing Elements
|
||||
|
||||
If you want to be alerted when your code tries to remove an element that is not in the set, use `remove()`. Otherwise, `discard()` provides a good alternative:
|
||||
|
||||
```
|
||||
>>> s = {1, 2, 3}
|
||||
>>> s.remove(3)
|
||||
>>> s
|
||||
{1, 2}
|
||||
>>> s.remove(3)
|
||||
Traceback (most recent call last):
|
||||
File "<stdin>", line 1, in <module>
|
||||
KeyError: 3
|
||||
|
||||
```
|
||||
|
||||
`discard()` won't raise any errors:
|
||||
|
||||
```
|
||||
>>> s = {1, 2, 3}
|
||||
>>> s.discard(3)
|
||||
>>> s
|
||||
{1, 2}
|
||||
>>> s.discard(3)
|
||||
>>> # nothing happens!
|
||||
|
||||
```
|
||||
|
||||
We can also use `pop()` to remove and return an arbitrary element:
|
||||
|
||||
```
|
||||
>>> s = {1, 2, 3, 4, 5}
|
||||
>>> s.pop() # removes an arbitrary element
|
||||
1
|
||||
>>> s
|
||||
{2, 3, 4, 5}
|
||||
|
||||
```
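Two details about `pop()` are worth a quick check: the element it returns is arbitrary rather than random, so code should not depend on which one comes back, and calling it on an empty set raises `KeyError`. A minimal sketch:

```python
s = {1, 2, 3}
removed = s.pop()            # some element of s; which one is unspecified
print(removed in {1, 2, 3})  # True
print(len(s))                # 2

# pop() on an empty set raises KeyError, so guard for it when the set may be empty.
empty = set()
raised = False
try:
    empty.pop()
except KeyError:
    raised = True
print(raised)  # True
```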
|
||||
|
||||
Or `clear()` to remove all the values from a set:
|
||||
|
||||
```
|
||||
>>> s = {1, 2, 3, 4, 5}
|
||||
>>> s.clear() # discard all the items
|
||||
>>> s
|
||||
set()
|
||||
|
||||
```
|
||||
|
||||
### union()
|
||||
|
||||
`union()` or `|` will create a new set that contains all the elements from the sets we provide:
|
||||
|
||||
```
|
||||
>>> s1 = {1, 2, 3}
|
||||
>>> s2 = {3, 4, 5}
|
||||
>>> s1.union(s2) # or 's1 | s2'
|
||||
{1, 2, 3, 4, 5}
|
||||
|
||||
```
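The method and the operator are not fully interchangeable: `union()` accepts any iterable (and several of them at once), while `|` requires sets on both sides. A short sketch of the difference:

```python
s1 = {1, 2, 3}

# The method accepts any iterables, not only sets.
merged = s1.union([3, 4], (5, 6))
print(merged == {1, 2, 3, 4, 5, 6})  # True

# The operator only works between sets; a list on the right raises TypeError.
operator_failed = False
try:
    s1 | [3, 4]
except TypeError:
    operator_failed = True
print(operator_failed)  # True

# Converting the list first makes the operator form work.
combined = s1 | set([3, 4])
print(combined == {1, 2, 3, 4})  # True
```

The same rule applies to `intersection()`, `difference()` and `symmetric_difference()` versus `&`, `-` and `^`.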
|
||||
|
||||
### intersection()
|
||||
|
||||
`intersection()` or `&` will return a set containing only the elements that are common to all of them:
|
||||
|
||||
```
|
||||
>>> s1 = {1, 2, 3}
|
||||
>>> s2 = {2, 3, 4}
|
||||
>>> s3 = {3, 4, 5}
|
||||
>>> s1.intersection(s2, s3) # or 's1 & s2 & s3'
|
||||
{3}
|
||||
|
||||
```
|
||||
|
||||
### difference()
|
||||
|
||||
`difference()` or `-` creates a new set with the values that are in "s1" but not in "s2":
|
||||
|
||||
```
|
||||
>>> s1 = {1, 2, 3}
|
||||
>>> s2 = {2, 3, 4}
|
||||
>>> s1.difference(s2) # or 's1 - s2'
|
||||
{1}
|
||||
|
||||
```
|
||||
|
||||
### symmetric_difference()
|
||||
|
||||
`symmetric_difference()` or `^` will return all the values that are in exactly one of the two sets:
|
||||
|
||||
```
|
||||
>>> s1 = {1, 2, 3}
|
||||
>>> s2 = {2, 3, 4}
|
||||
>>> s1.symmetric_difference(s2) # or 's1 ^ s2'
|
||||
{1, 4}
|
||||
|
||||
```
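All four operations return a new set and leave the originals untouched. When you want to modify a set in place instead, the augmented operators (`|=`, `&=`, `-=`, `^=`) can be used. The sketch below also checks that the symmetric difference equals the union minus the intersection:

```python
s1 = {1, 2, 3}
s2 = {2, 3, 4}

# Symmetric difference is the union without the intersection.
print(s1 ^ s2 == (s1 | s2) - (s1 & s2))  # True

# In-place variants modify the set on the left.
s = {1, 2, 3}
s |= {4}        # like s.update({4})                       -> {1, 2, 3, 4}
s &= {2, 3, 4}  # like s.intersection_update(...)          -> {2, 3, 4}
s -= {2}        # like s.difference_update(...)            -> {3, 4}
s ^= {4, 5}     # like s.symmetric_difference_update(...)  -> {3, 5}
print(s == {3, 5})  # True
```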
|
||||
|
||||
### Conclusions
|
||||
|
||||
I hope that after reading this article you know what a set is, how to manipulate its elements and the operations it supports. Knowing when to use a set will definitely help you write cleaner code and speed up your programs.
|
||||
|
||||
If you have any doubts, please leave a comment and I will gladly try to answer them. Also, don't forget that if you already understand sets, they have their own [place][11] in the [Python Cheatsheet][12], where you can have a quick reference and refresh what you already know.
|
||||
|
||||
--------------------------------------------------------------------------------
|
||||
|
||||
via: https://www.pythoncheatsheet.org/blog/python-sets-what-why-how
|
||||
|
||||
Author: [wilfredinni][a]

Translator: [translator ID](https://github.com/译者ID)

Proofreader: [proofreader ID](https://github.com/校对者ID)

This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)
|
||||
|
||||
[a]:https://www.pythoncheatsheet.org/author/wilfredinni
|
||||
[1]:https://wiki.python.org/moin/TimeComplexity
|
||||
[2]:https://stackoverflow.com/questions/3949310/how-is-set-implemented
|
||||
[3]:https://stackoverflow.com/questions/2831212/python-sets-vs-lists
|
||||
[4]:https://mail.python.org/pipermail/python-list/2011-June/606738.html
|
||||
[5]:https://www.pythoncheatsheet.org/author/wilfredinni
|
||||
[6]:https://docs.python.org/3/glossary.html#term-hashable
|
||||
[7]:http://docs.python-guide.org/en/latest/writing/style/
|
||||
[8]:http://docs.python-guide.org/en/latest/
|
||||
[9]:https://www.python.org/dev/peps/pep-0020/
|
||||
[10]:https://stackoverflow.com/questions/2831212/python-sets-vs-lists
|
||||
[11]:https://www.pythoncheatsheet.org/#sets
|
||||
[12]:https://www.pythoncheatsheet.org/
|
392
translated/tech/20180710 Python Sets What Why and How.md
Normal file
@ -0,0 +1,392 @@
|
||||
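Python Sets: What They Are, Why You Should Use Them, and How

=====

Posted on 07/10/2018 by [wilfredinni][5]

![Python Sets: What, Why and How](https://raw.githubusercontent.com/wilfredinni/pysheetComments/master/2018-july/python_sets/sets.png)

Python comes with several built-in data types to help us organize our data. These structures include lists, dictionaries, tuples and sets.

From the Python 3 documentation:

> A set is an *unordered collection* with no *duplicate elements*. Basic uses include *membership testing* and *eliminating duplicate entries*. Set objects also support mathematical operations like *union*, *intersection*, *difference*, and *symmetric difference*.

In this article, we will review each of the elements listed in the definition above and look at examples. Let's start right away and see how to create a set.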
|
||||
|
||||
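### Initializing a Set

There are two ways to create a set: one is to give the built-in function `set()` a list of elements, and the other is to use curly braces `{}`.

Initializing a set using the built-in function `set()`: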
|
||||
|
||||
```
|
||||
>>> s1 = set([1, 2, 3])
|
||||
>>> s1
|
||||
{1, 2, 3}
|
||||
>>> type(s1)
|
||||
<class 'set'>
|
||||
|
||||
```
|
||||
|
||||
Using `{}`:
|
||||
|
||||
|
||||
```
|
||||
>>> s2 = {3, 4, 5}
|
||||
>>> s2
|
||||
{3, 4, 5}
|
||||
>>> type(s2)
|
||||
<class 'set'>
|
||||
>>>
|
||||
|
||||
```
|
||||
|
||||
As you can see, both methods are valid. But what if we want an empty set?
|
||||
|
||||
```
|
||||
>>> s = {}
|
||||
>>> type(s)
|
||||
<class 'dict'>
|
||||
|
||||
```
|
||||
|
||||
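That's right, if we use empty curly braces we get a dictionary instead of a set.

It is worth mentioning that, for simplicity, all the examples in this article use sets of integers, but a set can contain any [hashable][6] data type that Python supports. In other words, integers, strings and tuples, but not mutable types such as *lists* or *dictionaries*: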
|
||||
|
||||
```
|
||||
>>> s = {1, 'coffee', [4, 'python']}
|
||||
Traceback (most recent call last):
|
||||
File "<stdin>", line 1, in <module>
|
||||
TypeError: unhashable type: 'list'
|
||||
|
||||
```
|
||||
|
||||
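Now that you know how to create a set and what types of elements it can contain, let's continue and see *why* we should always have it in our toolbox.

### Why You Should Use Them

When writing code, there is more than one way to get it done. Some are considered pretty bad, and others are clear, concise and maintainable, or "[_pythonic_][7]".

From [The Hitchhiker’s Guide to Python][8]:

> When a veteran Python developer (a Pythonista) calls portions of code not “Pythonic”, they usually mean that these lines of code do not follow the common guidelines and fail to express its intent in what is considered the best (hear: most readable) way.

Let's start exploring the ways Python sets can help us not just with readability, but also with speeding up our programs' execution time.

### Unordered Collection of Elements

First things first: you can't access a set element using an index: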
|
||||
|
||||
```
|
||||
>>> s = {1, 2, 3}
|
||||
>>> s[0]
|
||||
Traceback (most recent call last):
|
||||
File "<stdin>", line 1, in <module>
|
||||
TypeError: 'set' object does not support indexing
|
||||
|
||||
```
|
||||
Or access them with slices:
|
||||
|
||||
```
|
||||
>>> s[0:2]
|
||||
Traceback (most recent call last):
|
||||
File "<stdin>", line 1, in <module>
|
||||
TypeError: 'set' object is not subscriptable
|
||||
|
||||
```
|
||||
|
||||
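But if we need to remove duplicates, or do mathematical operations like combining lists (unions), we can, and *should*, always use sets.

I have to mention that lists outperform sets when iterating, so prefer a list if iteration is what you need. Why? Well, this article does not intend to explain the inner workings of sets, but if you are interested, here are a few links where you can read about it:

* [TimeComplexity][1]
* [How is set() implemented?][2]
* [Python Sets vs Lists][3]
* [Is there any advantage or disadvantage to using sets over list comps to ensure a list of unique entries?][4]

### No Duplicate Items

While writing this, I cannot stop thinking of all the times I used a *for* loop and an *if* statement to check for and remove duplicate elements in a list. My face turns red remembering that, more than once, I wrote something like this: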
|
||||
|
||||
```
>>> my_list = [1, 2, 3, 2, 3, 4]
>>> no_duplicate_list = []
>>> for item in my_list:
...     if item not in no_duplicate_list:
...         no_duplicate_list.append(item)
...
>>> no_duplicate_list
[1, 2, 3, 4]
```
|
||||
|
||||
Or used a list comprehension:
|
||||
|
||||
```
|
||||
>>> my_list = [1, 2, 3, 2, 3, 4]
|
||||
>>> no_duplicate_list = []
|
||||
>>> [no_duplicate_list.append(item) for item in my_list if item not in no_duplicate_list]
|
||||
[None, None, None, None]
|
||||
>>> no_duplicate_list
|
||||
[1, 2, 3, 4]
|
||||
|
||||
```
|
||||
|
||||
But that's OK, none of that matters anymore because we now have sets in our arsenal:
|
||||
|
||||
```
|
||||
>>> my_list = [1, 2, 3, 2, 3, 4]
|
||||
>>> no_duplicate_list = list(set(my_list))
|
||||
>>> no_duplicate_list
|
||||
[1, 2, 3, 4]
|
||||
>>>
|
||||
|
||||
```
|
||||
|
||||
Now let's use the *timeit* module and see the execution time of lists and sets when removing duplicates:

```
>>> from timeit import timeit
>>> def no_duplicates(items):
...     no_duplicate_list = []
...     [no_duplicate_list.append(item) for item in items if item not in no_duplicate_list]
...     return no_duplicate_list
...
>>> # first, let's see how the list performs:
>>> print(timeit('no_duplicates([1, 2, 3, 1, 7])', globals=globals(), number=1000))
0.0018683355819786227
```
|
||||
|
||||
```
>>> from timeit import timeit
>>> # and the set:
>>> print(timeit('list(set([1, 2, 3, 1, 2, 3, 4]))', number=1000))
0.0010220493243764395
>>> # faster and cleaner =)
```
|
||||
|
||||
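Using sets instead of list comprehensions not only lets us write *less code*, it also gives us *more readable* and *more performant* code.

Note: remember that sets are unordered, so there is no guarantee that the order of the elements will be preserved when converting them back to a list.

From the [Zen of Python][9]:

(Note to the proofreader: I suggest keeping the English.)

> Beautiful is better than ugly.
> Explicit is better than implicit.
> Simple is better than complex.
> Flat is better than nested.

Aren't sets just beautiful, explicit, simple and flat?

### Membership Tests

Every time we use an *if* statement to check whether an element is, for example, in a list, we are doing a membership test: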
|
||||
|
||||
```
>>> my_list = [1, 2, 3]
>>> if 2 in my_list:
...     print('Yes, this is a membership test!')
...
Yes, this is a membership test!
```
|
||||
|
||||
Sets are more efficient than lists when performing these operations:

```
>>> from timeit import timeit
>>> def in_test(iterable):
...     for i in range(1000):
...         if i in iterable:
...             pass
...
>>> timeit('in_test(iterable)',
...        setup="from __main__ import in_test; iterable = list(range(1000))",
...        number=1000)
12.459663048726043
```
|
||||
|
||||
```
>>> from timeit import timeit
>>> def in_test(iterable):
...     for i in range(1000):
...         if i in iterable:
...             pass
...
>>> timeit('in_test(iterable)',
...        setup="from __main__ import in_test; iterable = set(range(1000))",
...        number=1000)
0.12354438152988223
>>>
```
|
||||
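Note: the tests above come from [this][10] StackOverflow thread.

So if you are doing comparisons like this in a huge list, try converting the list to a set; it should speed things up.

### How to Use Them

Now that you know what a set is and why you should use it, let's take a quick tour and see how we can modify and operate on it.

### Adding Elements

Depending on the number of elements to add, we have to choose between the `add()` and `update()` methods.

`add()` adds a single element: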
|
||||
|
||||
```
|
||||
>>> s = {1, 2, 3}
|
||||
>>> s.add(4)
|
||||
>>> s
|
||||
{1, 2, 3, 4}
|
||||
|
||||
```
|
||||
|
||||
`update()` adds multiple elements:
|
||||
|
||||
```
|
||||
>>> s = {1, 2, 3}
|
||||
>>> s.update([2, 3, 4, 5, 6])
|
||||
>>> s
|
||||
{1, 2, 3, 4, 5, 6}
|
||||
|
||||
```
|
||||
|
||||
Remember, sets remove duplicates.

### Removing Elements

If you want to be alerted when your code tries to remove an element that is not in the set, use `remove()`. Otherwise, `discard()` is a good alternative:
|
||||
|
||||
```
|
||||
>>> s = {1, 2, 3}
|
||||
>>> s.remove(3)
|
||||
>>> s
|
||||
{1, 2}
|
||||
>>> s.remove(3)
|
||||
Traceback (most recent call last):
|
||||
File "<stdin>", line 1, in <module>
|
||||
KeyError: 3
|
||||
|
||||
```
|
||||
|
||||
`discard()` won't raise any errors:
|
||||
|
||||
```
|
||||
>>> s = {1, 2, 3}
|
||||
>>> s.discard(3)
|
||||
>>> s
|
||||
{1, 2}
|
||||
>>> s.discard(3)
|
||||
>>> # nothing happens!
|
||||
|
||||
```
|
||||
|
||||
We can also use `pop()` to remove and return an arbitrary element:
|
||||
|
||||
```
|
||||
>>> s = {1, 2, 3, 4, 5}
|
||||
>>> s.pop() # removes an arbitrary element
|
||||
1
|
||||
>>> s
|
||||
{2, 3, 4, 5}
|
||||
|
||||
```
|
||||
|
||||
Or `clear()` to remove all the values from a set:
|
||||
|
||||
```
|
||||
>>> s = {1, 2, 3, 4, 5}
|
||||
>>> s.clear() # discard all the items
|
||||
>>> s
|
||||
set()
|
||||
|
||||
```
|
||||
|
||||
### union()
|
||||
|
||||
`union()` or `|` will create a new set that contains all the elements from the sets we provide:
|
||||
|
||||
```
|
||||
>>> s1 = {1, 2, 3}
|
||||
>>> s2 = {3, 4, 5}
|
||||
>>> s1.union(s2) # or 's1 | s2'
|
||||
{1, 2, 3, 4, 5}
|
||||
|
||||
```
|
||||
|
||||
### intersection()
|
||||
|
||||
`intersection()` or `&` will return a set containing only the elements that are common to all of the sets:
|
||||
|
||||
```
|
||||
>>> s1 = {1, 2, 3}
|
||||
>>> s2 = {2, 3, 4}
|
||||
>>> s3 = {3, 4, 5}
|
||||
>>> s1.intersection(s2, s3) # or 's1 & s2 & s3'
|
||||
{3}
|
||||
|
||||
```
|
||||
|
||||
### difference()
|
||||
|
||||
`difference()` or `-` creates a new set with the values that are in "s1" but not in "s2":
|
||||
|
||||
```
|
||||
>>> s1 = {1, 2, 3}
|
||||
>>> s2 = {2, 3, 4}
|
||||
>>> s1.difference(s2) # or 's1 - s2'
|
||||
{1}
|
||||
|
||||
```
|
||||
|
||||
### symmetric_difference()

`symmetric_difference()` or `^` will return all the values that are in exactly one of the two sets:
|
||||
|
||||
```
|
||||
>>> s1 = {1, 2, 3}
|
||||
>>> s2 = {2, 3, 4}
|
||||
>>> s1.symmetric_difference(s2) # or 's1 ^ s2'
|
||||
{1, 4}
|
||||
|
||||
```
|
||||
|
||||
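### Conclusions

I hope that after reading this article you know what a set is, how to manipulate its elements and the operations it can perform. Knowing when to use a set will definitely help you write cleaner code and speed up your programs.

If you have any questions, please leave a comment and I will gladly try to answer. Also, don't forget that if you already understand sets, they have their own [place][11] in the [Python Cheatsheet][12], where you can get a quick reference and refresh what you already know.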
|
||||
|
||||
--------------------------------------------------------------------------------
|
||||
|
||||
via: https://www.pythoncheatsheet.org/blog/python-sets-what-why-how
|
||||
|
||||
Author: [wilfredinni][a]

Translator: [MjSeven](https://github.com/MjSeven)

Proofreader: [proofreader ID](https://github.com/校对者ID)

This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)
|
||||
|
||||
[a]:https://www.pythoncheatsheet.org/author/wilfredinni
|
||||
[1]:https://wiki.python.org/moin/TimeComplexity
|
||||
[2]:https://stackoverflow.com/questions/3949310/how-is-set-implemented
|
||||
[3]:https://stackoverflow.com/questions/2831212/python-sets-vs-lists
|
||||
[4]:https://mail.python.org/pipermail/python-list/2011-June/606738.html
|
||||
[5]:https://www.pythoncheatsheet.org/author/wilfredinni
|
||||
[6]:https://docs.python.org/3/glossary.html#term-hashable
|
||||
[7]:http://docs.python-guide.org/en/latest/writing/style/
|
||||
[8]:http://docs.python-guide.org/en/latest/
|
||||
[9]:https://www.python.org/dev/peps/pep-0020/
|
||||
[10]:https://stackoverflow.com/questions/2831212/python-sets-vs-lists
|
||||
[11]:https://www.pythoncheatsheet.org/#sets
|
||||
[12]:https://www.pythoncheatsheet.org/
|
||||
|