mirror of
https://github.com/LCTT/TranslateProject.git
synced 2025-02-28 01:01:09 +08:00
commit
1d9e0ea1d7
@ -1,117 +0,0 @@
|
|||||||
[#]: subject: (Can Windows 11 Influence Linux Distributions?)
|
|
||||||
[#]: via: (https://news.itsfoss.com/can-windows-11-influence-linux/)
|
|
||||||
[#]: author: (Ankush Das https://news.itsfoss.com/author/ankush/)
|
|
||||||
[#]: collector: (lujun9972)
|
|
||||||
[#]: translator: (zz-air)
|
|
||||||
[#]: reviewer: ( )
|
|
||||||
[#]: publisher: ( )
|
|
||||||
[#]: url: ( )
|
|
||||||
|
|
||||||
Can Windows 11 Influence Linux Distributions?
|
|
||||||
======
|
|
||||||
|
|
||||||
Microsoft’s Windows 11 has finally been revealed. While some compare it to macOS, others compare the nitty-gritty details to find similarities with GNOME and KDE (which does not make much sense).
|
|
||||||
|
|
||||||
But, among all the buzz, I am curious about something else — **Can Microsoft’s Windows 11 influence the future design decisions of desktop Linux distributions?**
|
|
||||||
|
|
||||||
Here I shall mention some of my thoughts on why it might happen, whether it has happened before, and what the future holds for Linux distributions.
|
|
||||||
|
|
||||||
### Some Linux Distributions Already Focus on Windows-like Experience: But, why?
|
|
||||||
|
|
||||||
Microsoft’s Windows is the most popular desktop operating system with **88% of the market share** for its ease of use, software support, and hardware compatibility.
|
|
||||||
|
|
||||||
In contrast, Linux has **about 2% of the market share,** even with all the added [benefits of Linux over Windows][1].
|
|
||||||
|
|
||||||
So what can Linux do to convince more users to try Linux as their desktop operating system?
|
|
||||||
|
|
||||||
The main focus of every desktop operating system should be the user experience. While Microsoft and Apple have managed to provide a comfortable user experience for the masses, Linux distributions did not manage to get a big win on that front.
|
|
||||||
|
|
||||||
However, you will find several [Linux distributions that aim to replace Windows 10][2]. These Linux distributions try to provide a familiar user interface that could encourage a Windows user to consider switching to Linux.
|
|
||||||
|
|
||||||
And, due to the existence of such distributions, [switching to Linux in 2021][3] makes more sense than ever.
|
|
||||||
|
|
||||||
Hence, in the effort to get more users to jump ship to Linux, many distributions have been influenced by Microsoft Windows for years now.
|
|
||||||
|
|
||||||
### Is Windows 11 Better than Linux in Some Way?
|
|
||||||
|
|
||||||
The user interface is constantly evolving with Windows. Even if that’s subjective, it is what most desktop users seem to be going for.
|
|
||||||
|
|
||||||
So I’d say Windows 11 has made some attractive improvements on that front.
|
|
||||||
|
|
||||||
![][4]
|
|
||||||
|
|
||||||
Beyond the UI/UX, things like integrating Microsoft Teams’ chat features into the taskbar make it convenient for users to instantly connect with anyone.
|
|
||||||
|
|
||||||
**While Linux distributions do not have their own full-fledged services, more out-of-the-box integrations tailored like this should make the onboarding experience easier for new users.**
|
|
||||||
|
|
||||||
And that brings me to another aspect of Windows 11—a personalized news and information feed.
|
|
||||||
|
|
||||||
Sure, Microsoft collects data for that, and you may have to sign in using a Microsoft account. But this is yet another thing that reduces friction: users no longer have to go and look for a separate app to keep track of weather, news, and other daily information.
|
|
||||||
|
|
||||||
Linux does not force these choices on a user, but features and integrations like this can be offered as optional extras that users choose to enable.
|
|
||||||
|
|
||||||
**In other words, making things more accessible while integrated with the OS should get rid of a steep learning curve.**
|
|
||||||
|
|
||||||
And, the dreaded Microsoft Store has also got a serious upgrade with Windows 11.
|
|
||||||
|
|
||||||
![][5]
|
|
||||||
|
|
||||||
Unfortunately, for Linux distributions, I don’t see many meaningful upgrades to the app centers that make them more visually appealing or interesting.
|
|
||||||
|
|
||||||
elementaryOS is probably making a good effort to focus on the UX/UI and to evolve the app center experience, but most other distros have seen no significant upgrade.
|
|
||||||
|
|
||||||
![Software Manager in Linux Mint 20.1][6]
|
|
||||||
|
|
||||||
While I appreciate what Deepin Linux does in this regard, it isn’t a popular choice among users who try Linux for the first time.
|
|
||||||
|
|
||||||
### Windows 11 Introduces More Competition: Linux Has to Keep Up
|
|
||||||
|
|
||||||
With the launch of Windows 11, Linux as a desktop choice will get more competition.
|
|
||||||
|
|
||||||
While we do have some replacements for the Windows 10 experience in the Linux world, there’s nothing that targets Windows 11 yet.
|
|
||||||
|
|
||||||
But this brings us to the obvious counter-response from the Linux community: **a Linux distribution that takes a stab at Windows 11**.
|
|
||||||
|
|
||||||
No matter whether you hate or love Microsoft’s latest design approach to Windows 11, the masses will adopt it over the next few years.
|
|
||||||
|
|
||||||
And to keep Linux a compelling desktop alternative, the design language of Linux distributions must evolve as well.
|
|
||||||
|
|
||||||
It is not just the desktop market: laptop-specific design choices also need significant improvement in Linux distributions.
|
|
||||||
|
|
||||||
Some options like [Pop!_OS by System76][7] have been trying to offer that experience for Linux, which is a good start.
|
|
||||||
|
|
||||||
I think Zorin OS can be one of the distributions to introduce a “**Windows 11**” layout as an option to get more users to try Linux.
|
|
||||||
|
|
||||||
Not to forget—[Deepin Linux introduced Android app support][8] right after Windows 11 marketed it as a feature.
|
|
||||||
|
|
||||||
So, you see, when Microsoft’s Windows makes a move, it does have a ripple effect on Linux, too. Deepin Linux’s Android app support is just the start. Let’s see what comes next.
|
|
||||||
|
|
||||||
_What do you think about Windows 11 influencing the future of Linux desktop? Do we need to evolve as well? Or should we continue being different and not get influenced by what the masses choose?_
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
--------------------------------------------------------------------------------
|
|
||||||
|
|
||||||
via: https://news.itsfoss.com/can-windows-11-influence-linux/
|
|
||||||
|
|
||||||
Author: [Ankush Das][a]
Topic selection: [lujun9972][b]
Translator: [zz-air](https://github.com/zz-air)
Proofreader: [proofreader ID](https://github.com/校对者ID)

This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux China](https://linux.cn/)
|
|
||||||
|
|
||||||
[a]: https://news.itsfoss.com/author/ankush/
|
|
||||||
[b]: https://github.com/lujun9972
|
|
||||||
[1]: https://itsfoss.com/linux-better-than-windows/
|
|
||||||
[2]: https://itsfoss.com/windows-like-linux-distributions/
|
|
||||||
[3]: https://news.itsfoss.com/switch-to-linux-in-2021/
|
|
||||||
[4]: 
|
|
||||||
[5]: 
|
|
||||||
[6]: 
|
|
||||||
[7]: https://pop.system76.com
|
|
||||||
[8]: https://news.itsfoss.com/deepin-linux-20-2-2-release/
|
|
@ -1,542 +0,0 @@
|
|||||||
[#]: collector: (lujun9972)
|
|
||||||
[#]: translator: ( )
|
|
||||||
[#]: reviewer: ( )
|
|
||||||
[#]: publisher: ( )
|
|
||||||
[#]: url: ( )
|
|
||||||
[#]: subject: (An advanced guide to NLP analysis with Python and NLTK)
|
|
||||||
[#]: via: (https://opensource.com/article/20/8/nlp-python-nltk)
|
|
||||||
[#]: author: (Girish Managoli https://opensource.com/users/gammay)
|
|
||||||
|
|
||||||
An advanced guide to NLP analysis with Python and NLTK
|
|
||||||
======
|
|
||||||
Get deeper into the foundational concepts behind natural language
|
|
||||||
processing.
|
|
||||||
![Brain on a computer screen][1]
|
|
||||||
|
|
||||||
In my [previous article][2], I introduced natural language processing (NLP) and the Natural Language Toolkit ([NLTK][3]), the NLP toolkit created at the University of Pennsylvania. I demonstrated how to parse text and define stopwords in Python and introduced the concept of a corpus, a dataset of text that aids in text processing with out-of-the-box data. In this article, I'll continue utilizing datasets to compare and analyze natural language.
|
|
||||||
|
|
||||||
The fundamental building blocks covered in this article are:
|
|
||||||
|
|
||||||
* WordNet and synsets
|
|
||||||
* Similarity comparison
|
|
||||||
* Tree and treebank
|
|
||||||
* Named entity recognition
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
### WordNet and synsets
|
|
||||||
|
|
||||||
[WordNet][4] is a large lexical database corpus in NLTK. WordNet maintains cognitive synonyms (commonly called synsets) of words correlated by nouns, verbs, adjectives, adverbs, synonyms, antonyms, and more.
|
|
||||||
|
|
||||||
WordNet is a very useful tool for text analysis. It is available for many languages (Chinese, English, Japanese, Russian, Spanish, and more), under many licenses (ranging from open source to commercial). The first WordNet was created by Princeton University for English under an MIT-like license.
|
|
||||||
|
|
||||||
A word is typically associated with multiple synsets based on its meanings and parts of speech. Each synset usually provides these attributes:
|
|
||||||
|
|
||||||
**Attribute** | **Definition** | **Example**
---|---|---
Name | Name of the synset | The word "code" has five synsets with names `code.n.01`, `code.n.02`, `code.n.03`, `code.v.01`, `code.v.02`
POS | Part of speech of the word for this synset | The word "code" has three synsets in noun form and two in verb form
Definition | Definition of the word (in POS) | One of the definitions of "code" in noun form is: "(computer science) the symbolic arrangement of data or instructions in a computer program"
Examples | Examples of word's use | One of the examples of "code": "We should encode the message for security reasons"
Lemmas | Other word synsets this word+POS is related to (not strictly synonyms, but can be considered so); lemmas are related to other lemmas, not to words directly | Lemmas of `code.v.02` (as in "convert ordinary language into code") are `code.v.02.encipher`, `code.v.02.cipher`, `code.v.02.cypher`, `code.v.02.encrypt`, `code.v.02.inscribe`, `code.v.02.write_in_code`
Antonyms | Opposites | Antonym of lemma `encode.v.01.encode` is `decode.v.01.decode`
Hypernym | A broad category that other words fall under | A hypernym of `code.v.01` (as in "Code the pieces with numbers so that you can identify them later") is `tag.v.01`
Meronym | A word that is part of (or subordinate to) a broad category | A meronym of "computer" is "chip"
Holonym | The relationship between a parent word and its subordinate parts | A holonym of "window" is "computer screen"
|
|
||||||
|
|
||||||
There are several other attributes, which you can find in the `nltk/corpus/reader/wordnet.py` source file in `<your python install>/Lib/site-packages`.
|
|
||||||
|
|
||||||
Some code may help this make more sense.
|
|
||||||
|
|
||||||
This helper function:
|
|
||||||
|
|
||||||
|
|
||||||
```
from nltk.corpus import wordnet


def synset_info(synset):
    """Print the main attributes of a single synset."""
    print("Name", synset.name())
    print("POS:", synset.pos())
    print("Definition:", synset.definition())
    print("Examples:", synset.examples())
    print("Lemmas:", synset.lemmas())
    print("Antonyms:", [lemma.antonyms() for lemma in synset.lemmas() if len(lemma.antonyms()) > 0])
    print("Hypernyms:", synset.hypernyms())
    print("Instance Hypernyms:", synset.instance_hypernyms())
    print("Part Holonyms:", synset.part_holonyms())
    print("Part Meronyms:", synset.part_meronyms())
    print()


# Look up every synset of "code" and print its attributes
synsets = wordnet.synsets('code')
print(len(synsets), "synsets:")
for synset in synsets:
    synset_info(synset)
```
|
|
||||||
|
|
||||||
shows this:
|
|
||||||
|
|
||||||
|
|
||||||
```
5 synsets:
Name code.n.01
POS: n
Definition: a set of rules or principles or laws (especially written ones)
Examples: []
Lemmas: [Lemma('code.n.01.code'), Lemma('code.n.01.codification')]
Antonyms: []
Hypernyms: [Synset('written_communication.n.01')]
Instance Hypernyms: []
Part Holonyms: []
Part Meronyms: []

...

Name code.n.03
POS: n
Definition: (computer science) the symbolic arrangement of data or instructions in a computer program or the set of such instructions
Examples: []
Lemmas: [Lemma('code.n.03.code'), Lemma('code.n.03.computer_code')]
Antonyms: []
Hypernyms: [Synset('coding_system.n.01')]
Instance Hypernyms: []
Part Holonyms: []
Part Meronyms: []

...

Name code.v.02
POS: v
Definition: convert ordinary language into code
Examples: ['We should encode the message for security reasons']
Lemmas: [Lemma('code.v.02.code'), Lemma('code.v.02.encipher'), Lemma('code.v.02.cipher'), Lemma('code.v.02.cypher'), Lemma('code.v.02.encrypt'), Lemma('code.v.02.inscribe'), Lemma('code.v.02.write_in_code')]
Antonyms: []
Hypernyms: [Synset('encode.v.01')]
Instance Hypernyms: []
Part Holonyms: []
Part Meronyms: []
```
|
|
||||||
|
|
||||||
Synsets and lemmas follow a tree structure you can visualize:
|
|
||||||
|
|
||||||
|
|
||||||
```
from pprint import pprint

from nltk.corpus import wordnet


def hypernyms(synset):
    return synset.hypernyms()


# Use 'code' so the trees match the output shown below
synsets = wordnet.synsets('code')
for synset in synsets:
    print(synset.name() + " tree:")
    pprint(synset.tree(rel=hypernyms))
    print()
```

```
code.n.01 tree:
[Synset('code.n.01'),
 [Synset('written_communication.n.01'),
   ...

code.n.02 tree:
[Synset('code.n.02'),
 [Synset('coding_system.n.01'),
   ...

code.n.03 tree:
[Synset('code.n.03'),
   ...

code.v.01 tree:
[Synset('code.v.01'),
 [Synset('tag.v.01'),
   ...

code.v.02 tree:
[Synset('code.v.02'),
 [Synset('encode.v.01'),
   ...
```
|
|
||||||
|
|
||||||
WordNet does not cover all words and their information (there are about 170,000 words in English today and about 155,000 in the latest version of WordNet), but it's a good starting point. After you learn the concepts of this building block, if you find it inadequate for your needs, you can migrate to another. Or, you can build your own WordNet!
|
|
||||||
|
|
||||||
#### Try it yourself
|
|
||||||
|
|
||||||
Using the Python libraries, download Wikipedia's page on [open source][5] and list the synsets and lemmas of all the words.
|
|
||||||
|
|
||||||
### Similarity comparison
|
|
||||||
|
|
||||||
Similarity comparison is a building block that identifies similarities between two pieces of text. It has many applications in search engines, chatbots, and more.
|
|
||||||
|
|
||||||
For example, are the words "football" and "soccer" related?
|
|
||||||
|
|
||||||
|
|
||||||
```
from nltk.corpus import wordnet

syn1 = wordnet.synsets('football')
syn2 = wordnet.synsets('soccer')

# A word may have multiple synsets, so compare each synset of word1 with each synset of word2
for s1 in syn1:
    for s2 in syn2:
        print("Path similarity of: ")
        print(s1, '(', s1.pos(), ')', '[', s1.definition(), ']')
        print(s2, '(', s2.pos(), ')', '[', s2.definition(), ']')
        print("   is", s1.path_similarity(s2))
        print()
```

```
Path similarity of:
Synset('football.n.01') ( n ) [ any of various games played with a ball (round or oval) in which two teams try to kick or carry or propel the ball into each other's goal ]
Synset('soccer.n.01') ( n ) [ a football game in which two teams of 11 players try to kick or head a ball into the opponents' goal ]
   is 0.5

Path similarity of:
Synset('football.n.02') ( n ) [ the inflated oblong ball used in playing American football ]
Synset('soccer.n.01') ( n ) [ a football game in which two teams of 11 players try to kick or head a ball into the opponents' goal ]
   is 0.05
```
|
|
||||||
|
|
||||||
The highest path similarity score of the words is 0.5, indicating they are closely related.
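To see where a score such as 0.5 comes from: path similarity is 1 / (d + 1), where d is the number of edges on the shortest path between two synsets in the hypernym graph. The sketch below reproduces that arithmetic on a tiny hand-made graph. The `TOY_HYPERNYMS` mapping and both helper functions are hypothetical illustrations, not WordNet data and not NLTK's implementation.

```python
from collections import deque

# Hypothetical toy hypernym graph (child -> parents); NOT real WordNet data.
TOY_HYPERNYMS = {
    "soccer": ["football"],
    "rugby": ["football"],
    "football": ["field_game"],
    "field_game": ["game"],
    "chess": ["board_game"],
    "board_game": ["game"],
    "game": [],
}


def shortest_path_length(graph, start, goal):
    """Breadth-first search, treating hypernym links as undirected edges."""
    neighbours = {node: set(parents) for node, parents in graph.items()}
    for node, parents in graph.items():
        for parent in parents:
            neighbours.setdefault(parent, set()).add(node)
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # the two nodes are not connected


def toy_path_similarity(graph, a, b):
    """Score is 1 / (shortest path length + 1); None if there is no path."""
    dist = shortest_path_length(graph, a, b)
    return None if dist is None else 1 / (dist + 1)


# "soccer" sits one edge below "football", so the score is 1 / (1 + 1)
print(toy_path_similarity(TOY_HYPERNYMS, "soccer", "football"))
# Five edges apart, so the score is 1 / (5 + 1)
print(toy_path_similarity(TOY_HYPERNYMS, "soccer", "chess"))
```

A direct parent-child pair always scores 0.5 under this formula, which is why `football.n.01` and `soccer.n.01` score as they do above.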
|
|
||||||
|
|
||||||
What about "code" and "bug"? Similarity scores for these words used in computer science are:
|
|
||||||
|
|
||||||
|
|
||||||
```
Path similarity of:
Synset('code.n.01') ( n ) [ a set of rules or principles or laws (especially written ones) ]
Synset('bug.n.02') ( n ) [ a fault or defect in a computer program, system, or machine ]
   is 0.1111111111111111
...
Path similarity of:
Synset('code.n.02') ( n ) [ a coding system used for transmitting messages requiring brevity or secrecy ]
Synset('bug.n.02') ( n ) [ a fault or defect in a computer program, system, or machine ]
   is 0.09090909090909091
...
Path similarity of:
Synset('code.n.03') ( n ) [ (computer science) the symbolic arrangement of data or instructions in a computer program or the set of such instructions ]
Synset('bug.n.02') ( n ) [ a fault or defect in a computer program, system, or machine ]
   is 0.09090909090909091
```
|
|
||||||
|
|
||||||
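These are the highest similarity scores for the two words, but at roughly 0.1 they are far below the 0.5 scored by "football" and "soccer," so WordNet treats "code" and "bug" as only loosely related.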
|
|
||||||
|
|
||||||
NLTK provides several similarity scorers, such as:
|
|
||||||
|
|
||||||
* path_similarity
|
|
||||||
* lch_similarity
|
|
||||||
* wup_similarity
|
|
||||||
* res_similarity
|
|
||||||
* jcn_similarity
|
|
||||||
* lin_similarity
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
See the Similarity section of the [WordNet Interface][6] page to determine the appropriate one for your application.
|
|
||||||
|
|
||||||
#### Try it yourself
|
|
||||||
|
|
||||||
Using Python libraries, start from the Wikipedia [Category: Lists of computer terms][7] page and prepare a list of terminologies, then see how the words correlate.
|
|
||||||
|
|
||||||
### Tree and treebank
|
|
||||||
|
|
||||||
With NLTK, you can represent a text's structure in tree form to help with text analysis.
|
|
||||||
|
|
||||||
Here is an example:
|
|
||||||
|
|
||||||
A simple text pre-processed and part-of-speech (POS)-tagged:
|
|
||||||
|
|
||||||
|
|
||||||
```
import nltk

text = "I love open source"
# Tokenize to words
words = nltk.tokenize.word_tokenize(text)
# POS tag the words
words_tagged = nltk.pos_tag(words)
```
|
|
||||||
|
|
||||||
You must define a grammar to convert the text to a tree structure. This example uses a simple grammar based on the [Penn Treebank tags][8].
|
|
||||||
|
|
||||||
|
|
||||||
```
# A simple grammar to create tree
grammar = "NP: {<JJ><NN>}"
```
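The rule `NP: {<JJ><NN>}` says: wherever an adjective (`JJ`) is immediately followed by a singular noun (`NN`), group the two words under the label `NP`. As a rough illustration of what that rule means, here is a plain-Python sketch with no NLTK involved. The function `chunk_pairs` is a hypothetical helper written for this explanation; it is not how NLTK implements chunking, and it handles only this one two-tag pattern.

```python
def chunk_pairs(tagged_words, first_tag="JJ", second_tag="NN", label="NP"):
    """Group each adjacent (first_tag, second_tag) pair under `label`."""
    result = []
    i = 0
    while i < len(tagged_words):
        is_pair = (
            i + 1 < len(tagged_words)
            and tagged_words[i][1] == first_tag
            and tagged_words[i + 1][1] == second_tag
        )
        if is_pair:
            # Wrap the two words in a (label, [children]) node
            result.append((label, [tagged_words[i], tagged_words[i + 1]]))
            i += 2
        else:
            # Leave any other word untouched at the top level
            result.append(tagged_words[i])
            i += 1
    return result


tagged = [("I", "PRP"), ("love", "VBP"), ("open", "JJ"), ("source", "NN")]
print(chunk_pairs(tagged))
# [('I', 'PRP'), ('love', 'VBP'), ('NP', [('open', 'JJ'), ('source', 'NN')])]
```

The nested result has the same shape as the tree a chunk parser builds: unmatched words stay at the top level, and each match becomes a labeled subtree.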
|
|
||||||
|
|
||||||
Next, use the grammar to create a tree:
|
|
||||||
|
|
||||||
|
|
||||||
```
from pprint import pprint

# Create tree
parser = nltk.RegexpParser(grammar)
tree = parser.parse(words_tagged)
pprint(tree)
```
|
|
||||||
|
|
||||||
This produces:
|
|
||||||
|
|
||||||
|
|
||||||
```
Tree('S', [('I', 'PRP'), ('love', 'VBP'), Tree('NP', [('open', 'JJ'), ('source', 'NN')])])
```
|
|
||||||
|
|
||||||
You can see it better graphically.
|
|
||||||
|
|
||||||
|
|
||||||
```
tree.draw()
```
|
|
||||||
|
|
||||||
![NLTK Tree][9]
|
|
||||||
|
|
||||||
(Girish Managoli, [CC BY-SA 4.0][10])
|
|
||||||
|
|
||||||
This structure helps explain the text's meaning correctly. As an example, identify the [subject][11] in this text:
|
|
||||||
|
|
||||||
|
|
||||||
```
subject_tags = ["NN", "NNS", "NP", "NNP", "NNPS", "PRP", "PRP$"]

def subject(sentence_tree):
    for tagged_word in sentence_tree:
        # A crude logic for this case - first word with these tags is considered subject
        if tagged_word[1] in subject_tags:
            return tagged_word[0]

print("Subject:", subject(tree))
```
|
|
||||||
|
|
||||||
It shows "I" is the subject:
|
|
||||||
|
|
||||||
|
|
||||||
```
Subject: I
```
|
|
||||||
|
|
||||||
This is a basic text analysis building block that is applicable to larger applications. For example, when a user says, "Book a flight for my mom, Jane, to NY from London on January 1st," a chatbot using this block can interpret the request as:
|
|
||||||
|
|
||||||
**Action**: Book
|
|
||||||
**What**: Flight
|
|
||||||
**Traveler**: Jane
|
|
||||||
**From**: London
|
|
||||||
**To**: New York
|
|
||||||
**Date**: 1 Jan (of the next year)
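The "(of the next year)" part of the date is an inference: the user gave only a month and a day, so the bot has to pick the next time that date occurs. A minimal sketch of that one step, using only the standard library, could look like this. The function name `next_occurrence` and the sample "today" are assumptions made for illustration; a real chatbot would also need to handle time zones, February 29, and ambiguous phrasing.

```python
from datetime import date


def next_occurrence(month, day, today):
    """Return the first date with this month/day that is not before `today`."""
    candidate = date(today.year, month, day)
    if candidate < today:
        # This year's date has already passed, so roll over to next year
        candidate = date(today.year + 1, month, day)
    return candidate


# A request for "January 1st" made in mid-August resolves to the following year
print(next_occurrence(1, 1, today=date(2020, 8, 15)))  # 2021-01-01
```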
|
|
||||||
|
|
||||||
A treebank refers to a corpus with pre-tagged trees. Open source, conditional free-for-use, and commercial treebanks are available for many languages. The most commonly used one for English is Penn Treebank, extracted from the _Wall Street Journal_, a subset of which is included in NLTK. Some ways of using a treebank:
|
|
||||||
|
|
||||||
|
|
||||||
```
words = nltk.corpus.treebank.words()
print(len(words), "words:")
print(words)

tagged_sents = nltk.corpus.treebank.tagged_sents()
print(len(tagged_sents), "sentences:")
print(tagged_sents)
```

```
100676 words:
['Pierre', 'Vinken', ',', '61', 'years', 'old', ',', ...]
3914 sentences:
[[('Pierre', 'NNP'), ('Vinken', 'NNP'), (',', ','), ('61', 'CD'), ('years', 'NNS'), ('old', 'JJ'), (',', ','), ('will', 'MD'), ('join', 'VB'), ('the', 'DT'), ('board', 'NN'), ('as', 'IN'), ('a', 'DT'), ('nonexecutive', 'JJ'), ('director', 'NN'), ...]
```
|
|
||||||
|
|
||||||
See tags in a sentence:
|
|
||||||
|
|
||||||
|
|
||||||
```
from pprint import pprint

sent0 = tagged_sents[0]
pprint(sent0)
```

```
[('Pierre', 'NNP'),
 ('Vinken', 'NNP'),
 (',', ','),
 ('61', 'CD'),
 ('years', 'NNS'),
...
```
|
|
||||||
|
|
||||||
Create a grammar to convert this to a tree:
|
|
||||||
|
|
||||||
|
|
||||||
```
grammar = '''
Subject: {<NNP><NNP>}
SubjectInfo: {<CD><NNS><JJ>}
Action: {<MD><VB>}
Object: {<DT><NN>}
Stopwords: {<IN><DT>}
ObjectInfo: {<JJ><NN>}
When: {<NNP><CD>}
'''
parser = nltk.RegexpParser(grammar)
tree = parser.parse(sent0)
print(tree)
```

```
(S
  (Subject Pierre/NNP Vinken/NNP)
  ,/,
  (SubjectInfo 61/CD years/NNS old/JJ)
  ,/,
  (Action will/MD join/VB)
  (Object the/DT board/NN)
  as/IN
  a/DT
  (ObjectInfo nonexecutive/JJ director/NN)
  (Subject Nov./NNP)
  29/CD
  ./.)
```
|
|
||||||
|
|
||||||
See it graphically:
|
|
||||||
|
|
||||||
|
|
||||||
```
tree.draw()
```
|
|
||||||
|
|
||||||
![NLP Treebank image][12]
|
|
||||||
|
|
||||||
(Girish Managoli, [CC BY-SA 4.0][10])
|
|
||||||
|
|
||||||
The concept of trees and treebanks is a powerful building block for text analysis.
|
|
||||||
|
|
||||||
#### Try it yourself
|
|
||||||
|
|
||||||
Using the Python libraries, download Wikipedia's page on [open source][5] and represent the text in a presentable view.
|
|
||||||
|
|
||||||
### Named entity recognition
|
|
||||||
|
|
||||||
Text, whether spoken or written, contains important data. One of text processing's primary goals is extracting this key data. This is needed in almost all applications, such as an airline chatbot that books tickets or a question-answering bot. NLTK provides a named entity recognition feature for this.
|
|
||||||
|
|
||||||
Here's a code example:
|
|
||||||
|
|
||||||
|
|
||||||
```
sentence = 'Peterson first suggested the name "open source" at Palo Alto, California'
```
|
|
||||||
|
|
||||||
See if name and place are recognized in this sentence. Pre-process as usual:
|
|
||||||
|
|
||||||
|
|
||||||
```
import nltk

words = nltk.word_tokenize(sentence)
pos_tagged = nltk.pos_tag(words)
```
|
|
||||||
|
|
||||||
Run the named-entity tagger:
|
|
||||||
|
|
||||||
|
|
||||||
```
ne_tagged = nltk.ne_chunk(pos_tagged)
print("NE tagged text:")
print(ne_tagged)
print()
```

```
NE tagged text:
(S
  (PERSON Peterson/NNP)
  first/RB
  suggested/VBD
  the/DT
  name/NN
  ``/``
  open/JJ
  source/NN
  ''/''
  at/IN
  (FACILITY Palo/NNP Alto/NNP)
  ,/,
  (GPE California/NNP))
```
|
|
||||||
|
|
||||||
Name tags were added; extract only the named entities from this tree:
|
|
||||||
|
|
||||||
|
|
||||||
```
print("Recognized named entities:")
for ne in ne_tagged:
    # Only chunk subtrees carry a label; plain (word, tag) tuples do not
    if hasattr(ne, "label"):
        print(ne.label(), ne[0:])
```

```
Recognized named entities:
PERSON [('Peterson', 'NNP')]
FACILITY [('Palo', 'NNP'), ('Alto', 'NNP')]
GPE [('California', 'NNP')]
```
|
|
||||||
|
|
||||||
See it graphically:
|
|
||||||
|
|
||||||
|
|
||||||
```
ne_tagged.draw()
```
|
|
||||||
|
|
||||||
![NLTK Treebank tree][13]
|
|
||||||
|
|
||||||
(Girish Managoli, [CC BY-SA 4.0][10])
|
|
||||||
|
|
||||||
NLTK's built-in named-entity tagger, using PENN's [Automatic Content Extraction][14] (ACE) program, detects common entities such as ORGANIZATION, PERSON, LOCATION, FACILITY, and GPE (geopolitical entity).
|
|
||||||
|
|
||||||
NLTK can use other taggers, such as the [Stanford Named Entity Recognizer][15]. This trained tagger is built in Java, but NLTK provides an interface to work with it (See [nltk.parse.stanford][16] or [nltk.tag.stanford][17]).
|
|
||||||
|
|
||||||
#### Try it yourself
|
|
||||||
|
|
||||||
Using the Python libraries, download Wikipedia's page on [open source][5] and identify people who had an influence on open source and where and when they contributed.
|
|
||||||
|
|
||||||
### Advanced exercise
|
|
||||||
|
|
||||||
If you're ready for it, try building this superstructure using the building blocks discussed in these articles.
|
|
||||||
|
|
||||||
Using Python libraries, download Wikipedia's [Category: Computer science page][18] and:
|
|
||||||
|
|
||||||
  * Identify the most frequent unigrams, bigrams, and trigrams and publish them as a list of keywords or technologies that students and engineers need to be aware of in this domain.
|
|
||||||
* Show the names, technologies, dates, and places that matter in this field graphically. This can be a nice infographic.
|
|
||||||
* Create a search engine. Does your search engine perform better than Wikipedia's search?
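As a possible starting point for the first item, here is a small sketch of counting unigrams, bigrams, and trigrams with only the standard library. The sample sentence is made up for illustration, and the whitespace tokenization is deliberately naive; for the exercise itself you would feed in tokens from the downloaded page, ideally after removing stopwords.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical sample text; the exercise would use the downloaded page instead.
text = "open source software is software with source code anyone can inspect"
tokens = text.lower().split()  # naive whitespace tokenization

for n in (1, 2, 3):
    # most_common(2) lists the two most frequent n-grams with their counts
    print(n, Counter(ngrams(tokens, n)).most_common(2))
```

NLTK ships a comparable helper, `nltk.util.ngrams`, so the hand-written function is only there to show what an n-gram is.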
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
### What's next?
|
|
||||||
|
|
||||||
NLP is a quintessential pillar in application building. NLTK is a classic, rich, and powerful kit that provides the bricks and mortar to build practically appealing, purposeful applications for the real world.
|
|
||||||
|
|
||||||
In this series of articles, I explained what NLP makes possible using NLTK as an example. NLP and NLTK have a lot more to offer. This series is an inception point to help get you started.
|
|
||||||
|
|
||||||
If your needs grow beyond NLTK's capabilities, you could train new models or add capabilities to it. New NLP libraries that build on NLTK are coming up, and machine learning is being used extensively in language processing.
|
|
||||||
|
|
||||||
--------------------------------------------------------------------------------
|
|
||||||
|
|
||||||
via: https://opensource.com/article/20/8/nlp-python-nltk
|
|
||||||
|
|
||||||
Author: [Girish Managoli][a]
Topic selection: [lujun9972][b]
Translator: [translator ID](https://github.com/译者ID)
Proofreader: [proofreader ID](https://github.com/校对者ID)

This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux China](https://linux.cn/)
|
|
||||||
|
|
||||||
[a]: https://opensource.com/users/gammay
|
|
||||||
[b]: https://github.com/lujun9972
|
|
||||||
[1]: https://opensource.com/sites/default/files/styles/image-full-size/public/lead-images/brain_computer_solve_fix_tool.png?itok=okq8joti (Brain on a computer screen)
|
|
||||||
[2]: https://opensource.com/article/20/8/intro-python-nltk
|
|
||||||
[3]: http://www.nltk.org/
|
|
||||||
[4]: https://en.wikipedia.org/wiki/WordNet
|
|
||||||
[5]: https://en.wikipedia.org/wiki/Open_source
|
|
||||||
[6]: https://www.nltk.org/howto/wordnet.html
|
|
||||||
[7]: https://en.wikipedia.org/wiki/Category:Lists_of_computer_terms
|
|
||||||
[8]: https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html
|
|
||||||
[9]: https://opensource.com/sites/default/files/uploads/nltk-tree.jpg (NLTK Tree)
|
|
||||||
[10]: https://creativecommons.org/licenses/by-sa/4.0/
|
|
||||||
[11]: https://en.wikipedia.org/wiki/Subject_(grammar)
|
|
||||||
[12]: https://opensource.com/sites/default/files/uploads/nltk-treebank.jpg (NLP Treebank image)
|
|
||||||
[13]: https://opensource.com/sites/default/files/uploads/nltk-treebank-2a.jpg (NLTK Treebank tree)
|
|
||||||
[14]: https://www.ldc.upenn.edu/collaborations/past-projects/ace
|
|
||||||
[15]: https://nlp.stanford.edu/software/CRF-NER.html
|
|
||||||
[16]: https://www.nltk.org/_modules/nltk/parse/stanford.html
|
|
||||||
[17]: https://www.nltk.org/_modules/nltk/tag/stanford.html
|
|
||||||
[18]: https://en.wikipedia.org/wiki/Category:Computer_science
|
|
@ -1,97 +0,0 @@
|
|||||||
[#]: subject: (Write good examples by starting with real code)
|
|
||||||
[#]: via: (https://jvns.ca/blog/2021/07/08/writing-great-examples/)
|
|
||||||
[#]: author: (Julia Evans https://jvns.ca/)
|
|
||||||
[#]: collector: (lujun9972)
|
|
||||||
[#]: translator: (zepoch)
|
|
||||||
[#]: reviewer: ( )
|
|
||||||
[#]: publisher: ( )
|
|
||||||
[#]: url: ( )
|
|
||||||
|
|
||||||
Write good examples by starting with real code
|
|
||||||
======
|
|
||||||
|
|
||||||
When I write about programming, I spend a lot of time trying to come up with good examples. I haven’t seen a lot written about how to make examples, so here’s a little bit about my approach to writing examples!
|
|
||||||
|
|
||||||
The basic idea here is to start with real code that you wrote and then remove irrelevant details to make it into a self-contained example instead of coming up with examples out of thin air.
|
|
||||||
|
|
||||||
I’ll talk about two kinds of examples: realistic examples and surprising examples.
|
|
||||||
|
|
||||||
### good examples are realistic
|
|
||||||
|
|
||||||
To see why examples should be realistic, let’s first talk about an unrealistic example! Let’s say we’re trying to explain Python lambdas (which is just the first concept I thought of). You could give this example, of using `map` and a lambda to square a list of numbers.
|
|
||||||
|
|
||||||
```
|
|
||||||
numbers = [1, 2, 3, 4]
|
|
||||||
squares = map(lambda x: x * x, numbers)
|
|
||||||
```
|
|
||||||
|
|
||||||
I think this example is unrealistic for a couple of reasons:
|
|
||||||
|
|
||||||
* squaring a set of numbers isn’t something you’re super likely to do in a real program unless it’s for Project Euler or something (there are LOTS of operations on lists that are a lot more likely)
|
|
||||||
* This usage of `map` is not idiomatic Python. Even if you were doing this, I would write `[x*x for x in numbers]` instead
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
A more realistic example of Python lambdas is using them with `sort`, like this:
|
|
||||||
|
|
||||||
```
|
|
||||||
children = [{"name": "ashwin", "age": 12}, {"name": "radhika", "age": 3}]
|
|
||||||
sorted_children = sorted(children, key=lambda x: x['age'])
|
|
||||||
```
|
|
||||||
|
|
||||||
But this example is still pretty contrived (why exactly do we need to sort these children by age?). So how do we actually make realistic examples?
|
|
||||||
|
|
||||||
### how to make your examples realistic: look at actual code you wrote
|
|
||||||
|
|
||||||
I think the easiest way to make realistic examples is, instead of pulling an example out of thin air (like I did with that `children` example), to just start by looking at real code!
|
|
||||||
|
|
||||||
For example, if I grep a bunch of Python code I wrote for `sort.+key`, I find LOTS of real examples of me sorting a list by some criterion, like:
|
|
||||||
|
|
||||||
* `tasks.sort(key=lambda task: task['completed_time'])`
|
|
||||||
* `emails = reversed(sorted(emails, key=lambda x:x['receivedAt']))`
|
|
||||||
* `sorted_keysizes = sorted(scores.keys(), key=scores.get)`
|
|
||||||
* `shows = sorted(dates[date], key=lambda x: x['time']['performanceTime'])`
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
It’s pretty easy to see a pattern here – a lot of these are sorting by time! So you can see how you could easily put together a simple realistic example of sorting some objects (emails, events, etc) by time.
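Here's a minimal sketch of what such an example might look like. The `emails` data is made up for illustration (only the `receivedAt` key comes from the snippets above), and the ISO timestamp format is an assumption:

```python
from datetime import datetime

# Hypothetical inbox data; the field name mirrors the `receivedAt` key above.
emails = [
    {"subject": "Lunch?", "receivedAt": "2021-07-08T12:30:00"},
    {"subject": "Build failed", "receivedAt": "2021-07-08T09:15:00"},
    {"subject": "Weekly report", "receivedAt": "2021-07-07T17:00:00"},
]

# Sort newest first by parsing the timestamp inside the key function.
newest_first = sorted(
    emails,
    key=lambda email: datetime.fromisoformat(email["receivedAt"]),
    reverse=True,
)

for email in newest_first:
    print(email["receivedAt"], email["subject"])
```

Using `reverse=True` avoids wrapping the result in `reversed(...)`, which would return an iterator rather than a list.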
|
|
||||||
|
|
||||||
### realistic examples help “sell” the concept you’re trying to explain
|
|
||||||
|
|
||||||
When I’m trying to explain an idea (like Python lambdas), I’m usually also trying to convince the reader that it’s worth learning! Python lambdas are super useful! And to convince someone that lambdas are useful, it really helps to show someone how lambdas could help them do a task that they could actually imagine themselves doing, and ideally a task that they’ve done before.
|
|
||||||
|
|
||||||
### distilling down examples from real code can take a long time
|
|
||||||
|
|
||||||
The example I just gave of explaining how to use `sort` with `lambda` is pretty simple and it didn’t take me a long time to come up with, but turning real code into a standalone example can take a really long time!
|
|
||||||
|
|
||||||
For example, I was thinking of including an example of some weird CSS behaviour in this post to illustrate how it’s fun to create examples with weird or surprising behaviour. I spent 2 hours taking a real problem I had this week, making sure I understood what was actually happening with the CSS, and making it into a minimal example.
|
|
||||||
|
|
||||||
In the end it “just” took [5 lines of HTML and a tiny bit of CSS][1] to demonstrate the problem and it doesn’t really look like it took hours to write. But originally it was hundreds of lines of JS/CSS, and it takes time to untangle all that and come up with something small that gets at the heart of the issue!
|
|
||||||
|
|
||||||
But I think it’s worth it to take the time to make examples really clear and minimal – if hundreds of people are reading your example, you’re saving them all so much time!
|
|
||||||
|
|
||||||
### that’s all for now!
|
|
||||||
|
|
||||||
I think there’s a lot more to say about examples – for instance I think there are a few different types of useful examples, like:
|
|
||||||
|
|
||||||
* examples that are surprising to the reader, which are more about changing someone’s mental model than providing code to use directly
|
|
||||||
* examples that are easy to copy and paste to use as a starting point
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
but maybe I’ll write about that another day :)
|
|
||||||
|
|
||||||
--------------------------------------------------------------------------------
|
|
||||||
|
|
||||||
via: https://jvns.ca/blog/2021/07/08/writing-great-examples/
|
|
||||||
|
|
||||||
作者:[Julia Evans][a]
|
|
||||||
选题:[lujun9972][b]
|
|
||||||
译者:[译者ID](https://github.com/译者ID)
|
|
||||||
校对:[校对者ID](https://github.com/校对者ID)
|
|
||||||
|
|
||||||
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
|
|
||||||
|
|
||||||
[a]: https://jvns.ca/
|
|
||||||
[b]: https://github.com/lujun9972
|
|
||||||
[1]: https://codepen.io/wizardzines/pen/0eda7725a46c919dcfdd3fa80aff3d41
|
|
@ -1,99 +0,0 @@
|
|||||||
[#]: subject: (Converseen for Batch Processing Images on Linux)
|
|
||||||
[#]: via: (https://itsfoss.com/converseen/)
|
|
||||||
[#]: author: (Abhishek Prakash https://itsfoss.com/author/abhishek/)
|
|
||||||
[#]: collector: (lujun9972)
|
|
||||||
[#]: translator: (geekpi)
|
|
||||||
[#]: reviewer: ( )
|
|
||||||
[#]: publisher: ( )
|
|
||||||
[#]: url: ( )
|
|
||||||
|
|
||||||
Converseen for Batch Processing Images on Linux
|
|
||||||
======
|
|
||||||
|
|
||||||
Converseen is free and open source software for batch image conversion. With this tool, you can convert multiple images to another format, resize them, change their aspect ratio, and rotate or flip them all at once.
|
|
||||||
|
|
||||||
This is a handy tool for someone like me who has to deal with multiple screenshots of different sizes and resize them all before uploading them to the website.
|
|
||||||
|
|
||||||
Batch conversion tools help a lot in such cases. This could be done in the Linux command line with the wonderful [ImageMagick][1] but a GUI tool is a lot easier to use here. Actually, Converseen uses ImageMagick underneath the Qt-based GUI.
|
|
||||||
|
|
||||||
### Batch process images with Converseen
|
|
||||||
|
|
||||||
You can use [Converseen][2] to convert, resize, rotate and flip multiple images with a mouse click.
|
|
||||||
|
|
||||||
You have plenty of supporting options for the batch conversion. You can add additional images to your selection or remove some of them. You can choose to convert only a few of your selected images.
|
|
||||||
|
|
||||||
While resizing the images, you can choose to keep the aspect ratio. Keep in mind that out of width and height, the one you changed/typed last is the one controlling the aspect ratio. So, if you want to resize keeping the same aspect ratio but according to the width, don’t touch the height field.
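The arithmetic behind keeping the aspect ratio is simple. The sketch below only illustrates it; the function name is hypothetical and this is not Converseen's own code:

```python
def height_for_width(orig_width, orig_height, new_width):
    """Return the height that preserves the aspect ratio for a new width."""
    return round(orig_height * new_width / orig_width)


# A 1920x1080 screenshot resized to a width of 800 pixels.
print(height_for_width(1920, 1080, 800))
```

Rounding to the nearest pixel is an assumption; a real tool may round differently.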
|
|
||||||
|
|
||||||
![][3]
|
|
||||||
|
|
||||||
You can also choose to save the converted images with a different name in the same directory or in some other location. You may also overwrite the existing images.
|
|
||||||
|
|
||||||
You cannot add a folder, but you can select and add multiple images at once.
|
|
||||||
|
|
||||||
You can convert the images to a number of formats like JPEG/JPG, TIFF, SVG and more.
|
|
||||||
|
|
||||||
There is also an option to give the transparent background a certain color while changing the format. You can also set the compression quality.
|
|
||||||
|
|
||||||
![][4]
|
|
||||||
|
|
||||||
Converseen says that it can also import PDF files and convert the entire PDF or part of it into images. However, it crashed in Ubuntu 21.04 each time I tried to convert a PDF file.
|
|
||||||
|
|
||||||
### Install Converseen on Linux
|
|
||||||
|
|
||||||
Converseen is a popular application. It is available in the repositories of most Linux distributions.
|
|
||||||
|
|
||||||
You can search for it in your distribution’s software center:
|
|
||||||
|
|
||||||
![][5]
|
|
||||||
|
|
||||||
You may, of course, use your distribution’s package manager to install it via command line.
|
|
||||||
|
|
||||||
On Debian and Ubuntu-based distributions, use:
|
|
||||||
|
|
||||||
```
|
|
||||||
sudo apt install converseen
|
|
||||||
```
|
|
||||||
|
|
||||||
On Fedora, use:
|
|
||||||
|
|
||||||
```
|
|
||||||
sudo dnf install converseen
|
|
||||||
```
|
|
||||||
|
|
||||||
On Arch and Manjaro, use:
|
|
||||||
|
|
||||||
```
|
|
||||||
sudo pacman -S converseen
|
|
||||||
```
|
|
||||||
|
|
||||||
Converseen is also available for Windows and FreeBSD. You can get the instructions on the download page of the project website.
|
|
||||||
|
|
||||||
[Download Converseen][6]
|
|
||||||
|
|
||||||
Its source code is [available][7] on the project’s GitHub repository.
|
|
||||||
|
|
||||||
If you are looking for an even easier way to resize a single image, you can use this nifty trick and [resize and rotate images with right click context menu in Nautilus file manager][8].
|
|
||||||
|
|
||||||
Overall, Converseen is a useful GUI tool for batch image conversion. It’s not perfect but it works for the most part. Have you ever used Converseen or do you use a similar tool? How is your experience with it?
|
|
||||||
|
|
||||||
--------------------------------------------------------------------------------
|
|
||||||
|
|
||||||
via: https://itsfoss.com/converseen/
|
|
||||||
|
|
||||||
作者:[Abhishek Prakash][a]
|
|
||||||
选题:[lujun9972][b]
|
|
||||||
译者:[译者ID](https://github.com/译者ID)
|
|
||||||
校对:[校对者ID](https://github.com/校对者ID)
|
|
||||||
|
|
||||||
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
|
|
||||||
|
|
||||||
[a]: https://itsfoss.com/author/abhishek/
|
|
||||||
[b]: https://github.com/lujun9972
|
|
||||||
[1]: https://imagemagick.org/index.php
|
|
||||||
[2]: https://converseen.fasterland.net/
|
|
||||||
[3]: https://i1.wp.com/itsfoss.com/wp-content/uploads/2021/07/converseen-interface.png?resize=800%2C400&ssl=1
|
|
||||||
[4]: https://i1.wp.com/itsfoss.com/wp-content/uploads/2021/07/converseen-features-overview_copy.png?resize=800%2C497&ssl=1
|
|
||||||
[5]: https://i2.wp.com/itsfoss.com/wp-content/uploads/2021/07/install-converseen-linux.jpeg?resize=800%2C527&ssl=1
|
|
||||||
[6]: https://converseen.fasterland.net/download/
|
|
||||||
[7]: https://github.com/Faster3ck/Converseen
|
|
||||||
[8]: https://itsfoss.com/resize-images-with-right-click/
|
|
@ -2,7 +2,7 @@
|
|||||||
[#]: via: (https://fedoramagazine.org/getting-started-with-podman-in-fedora/)
|
[#]: via: (https://fedoramagazine.org/getting-started-with-podman-in-fedora/)
|
||||||
[#]: author: (Yazan Monshed https://fedoramagazine.org/author/yazanalmonshed/)
|
[#]: author: (Yazan Monshed https://fedoramagazine.org/author/yazanalmonshed/)
|
||||||
[#]: collector: (lujun9972)
|
[#]: collector: (lujun9972)
|
||||||
[#]: translator: ( )
|
[#]: translator: (geekpi)
|
||||||
[#]: reviewer: ( )
|
[#]: reviewer: ( )
|
||||||
[#]: publisher: ( )
|
[#]: publisher: ( )
|
||||||
[#]: url: ( )
|
[#]: url: ( )
|
||||||
|
@ -0,0 +1,114 @@
|
|||||||
|
[#]: subject: (Can Windows 11 Influence Linux Distributions?)
|
||||||
|
[#]: via: (https://news.itsfoss.com/can-windows-11-influence-linux/)
|
||||||
|
[#]: author: (Ankush Das https://news.itsfoss.com/author/ankush/)
|
||||||
|
[#]: collector: (lujun9972)
|
||||||
|
[#]: translator: (zz-air)
|
||||||
|
[#]: reviewer: ( )
|
||||||
|
[#]: publisher: ( )
|
||||||
|
[#]: url: ( )
|
||||||
|
|
||||||
|
Can Windows 11 Influence Linux Distributions?
|
||||||
|
======
|
||||||
|
|
||||||
|
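Microsoft's Windows 11 has finally been revealed. While some compare it to macOS, others compare the nitty-gritty details to find similarities with GNOME and KDE (which does not make much sense).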
|
||||||
|
|
||||||
|
But, among all the buzz, I am curious about something else: **Can Microsoft's Windows 11 influence the future design decisions of desktop Linux distributions?**
|
||||||
|
|
||||||
|
Here I shall mention some of my thoughts on why it might happen, whether it has happened before, and what the future holds for Linux distributions.
|
||||||
|
|
||||||
|
### Some Linux Distributions Already Focus on Windows-like Experience: But, why?
|
||||||
|
|
||||||
|
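Microsoft's Windows is the most popular desktop operating system with **88% of the market share** for its ease of use, software support, and hardware compatibility.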
|
||||||
|
On the contrary, Linux has **about 2% of the market share,** even with all the added [benefits of Linux over Windows][1].
|
||||||
|
|
||||||
|
So what can Linux do to convince more users to try Linux as their desktop operating system?
|
||||||
|
|
||||||
|
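The main focus of every desktop operating system should be the user experience. While Microsoft and Apple have managed to provide a comfortable user experience for the masses, Linux distributions did not manage to get a big win on that front.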
|
||||||
|
|
||||||
|
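However, you will find several [Linux distributions that aim to replace Windows 10][2]. These Linux distributions try to offer a familiar user interface to encourage Windows users to consider switching to Linux.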
|
||||||
|
|
||||||
|
And, because of these distributions, [switching to Linux in 2021][3] makes more sense than ever.
|
||||||
|
|
||||||
|
So, to get more users to jump to Linux, Microsoft's Windows has already influenced many distributions over the years.
|
||||||
|
### Windows 11 is Better Than Linux in Some Ways?
|
||||||
|
The user interface has evolved along with Windows. Even if this is subjective, it seems to be what most desktop users choose.
|
||||||
|
|
||||||
|
So I would say that Windows 11 has made some attractive improvements on that front.
|
||||||
|
|
||||||
|
![][4]
|
||||||
|
|
||||||
|
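It is not just limited to UI/UX: integrating Microsoft Teams chat into the taskbar, for example, makes it convenient for users to instantly connect with anyone.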
|
||||||
|
|
||||||
|
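**While Linux distributions do not have full-fledged services of their own, more tailored out-of-the-box integrations like this should make things easier for new users.**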
|
||||||
|
|
||||||
|
And this brings me to another aspect of Windows 11: a personalized feed of news and information.
|
||||||
|
|
||||||
|
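Of course, Microsoft collects data for this, and you may need to sign in with a Microsoft account. But it also saves users from hunting for separate applications to keep track of the weather, news, and other daily information.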
|
||||||
|
|
||||||
|
Linux does not force users to make those choices, but features/integrations like this could be added as extras and presented to the user as a choice.
|
||||||
|
|
||||||
|
**In other words, making things more accessible while integrating them with the operating system should get rid of the steep learning curve.**
|
||||||
|
|
||||||
|
And, the dreaded Microsoft Store has also received a major upgrade with Windows 11.
|
||||||
|
|
||||||
|
![][5]
|
||||||
|
|
||||||
|
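Unfortunately, for Linux distributions, I do not see meaningful upgrades to the software center that make it more visually appealing and more fun to use.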
|
||||||
|
|
||||||
|
|
||||||
|
elementary OS may be trying to focus on UX/UI and evolve the app center experience, but for most other distributions, there has been no significant upgrade.
|
||||||
|
|
||||||
|
![Software Manager in Linux Mint 20.1][6]
|
||||||
|
|
||||||
|
While I appreciate what Deepin Linux has done on that front, it is not a popular choice for many users trying Linux for the first time.
|
||||||
|
|
||||||
|
### Windows 11 Introduces More Competition: Linux Has to Keep Up
|
||||||
|
|
||||||
|
With the launch of Windows 11, Linux as a desktop choice will face more competition.
|
||||||
|
|
||||||
|
While we do have some alternatives to the Windows 10 experience in the Linux world, there are none for Windows 11 yet.
|
||||||
|
|
||||||
|
But this brings us to the obvious response from the Linux community: **Linux distributions taking a stab at Windows 11**.
|
||||||
|
|
||||||
|
Whether you hate or love Microsoft's latest design approach for Windows 11, the masses will get used to it over the next few years.
|
||||||
|
|
||||||
|
And, for Linux to be a compelling desktop alternative, the design language of Linux distributions has to evolve as well.
|
||||||
|
|
||||||
|
It is not just the desktop market; laptop-specific design choices also call for significant improvements in Linux distributions.
|
||||||
|
|
||||||
|
Some choices like [Pop!_OS by System76][7] have been trying to provide that kind of experience for Linux, which is a good start.
|
||||||
|
|
||||||
|
I think Zorin OS could be one of the distributions to introduce a “**Windows 11**” layout as an option to get more users to try Linux.
|
||||||
|
Let us not forget: [Deepin Linux introduced Android app support][8] right after Windows 11 marketed Android app support as a feature.
|
||||||
|
|
||||||
|
So, you see, when Microsoft's Windows makes a move, it has a ripple effect on Linux as well. And Deepin Linux's Android app support is just the beginning... let us see what else comes next.
|
||||||
|
|
||||||
|
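_What do you think about Windows 11 influencing the future of the Linux desktop? Do we need to evolve as well? Or should we stay different and not be swayed by what the masses choose?_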
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
--------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
via: https://news.itsfoss.com/can-windows-11-influence-linux/
|
||||||
|
|
||||||
|
作者:[Ankush Das][a]
|
||||||
|
选题:[lujun9972][b]
|
||||||
|
译者:[zz-air](https://github.com/zz-air)
|
||||||
|
校对:[校对者ID](https://github.com/校对者ID)
|
||||||
|
|
||||||
|
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
|
||||||
|
|
||||||
|
[a]: https://news.itsfoss.com/author/ankush/
|
||||||
|
[b]: https://github.com/lujun9972
|
||||||
|
[1]: https://itsfoss.com/linux-better-than-windows/
|
||||||
|
[2]: https://itsfoss.com/windows-like-linux-distributions/
|
||||||
|
[3]: https://news.itsfoss.com/switch-to-linux-in-2021/
|
||||||
|
[4]: 
|
||||||
|
[5]: 
|
||||||
|
[6]: 
|
||||||
|
[7]: https://pop.system76.com
|
||||||
|
[8]: https://news.itsfoss.com/deepin-linux-20-2-2-release/
|
@ -0,0 +1,534 @@
|
|||||||
|
[#]: collector: (lujun9972)
|
||||||
|
[#]: translator: (tanloong)
|
||||||
|
[#]: reviewer: ( )
|
||||||
|
[#]: publisher: ( )
|
||||||
|
[#]: url: ( )
|
||||||
|
[#]: subject: (An advanced guide to NLP analysis with Python and NLTK)
|
||||||
|
[#]: via: (https://opensource.com/article/20/8/nlp-python-nltk)
|
||||||
|
[#]: author: (Girish Managoli https://opensource.com/users/gammay)
|
||||||
|
|
||||||
|
An advanced guide to NLP analysis with Python and NLTK
|
||||||
|
======
|
||||||
|
Get deeper into the foundational concepts behind natural language processing.
|
||||||
|
![Brain on a computer screen][1]
|
||||||
|
|
||||||
|
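In my [previous article][2], I introduced natural language processing (NLP) and the Natural Language Toolkit ([NLTK][3]), the NLP toolkit created at the University of Pennsylvania. I demonstrated how to parse text and define stopwords in Python and introduced the concept of a corpus, a dataset of text that aids text processing by providing ready-made data. In this article, I'll continue using corpora to compare and analyze text.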
|
||||||
|
|
||||||
|
This article covers the following building blocks:
|
||||||
|
|
||||||
|
* WordNet and synsets
* Similarity comparison
* Tree and treebank
* Named entity recognition
|
||||||
|
|
||||||
|
|
||||||
|
### WordNet and synsets
|
||||||
|
|
||||||
|
[WordNet][4] is a large lexical database corpus in NLTK. WordNet maintains sets of cognitive synonyms (each such set is commonly called a synset) for words.
|
||||||
|
|
||||||
|
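WordNet is a very useful tool for text analysis. It is available for many languages (Chinese, English, Japanese, Russian, Spanish, and more) under many licenses (ranging from open source to commercial). The first WordNet was created by Princeton University for English under an MIT-like license.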
|
||||||
|
|
||||||
|
Because a word may have multiple meanings or parts of speech, it may be associated with multiple synsets. Each synset usually provides these attributes:
|
||||||
|
|
||||||
|
| **Attribute** | **Definition** | **Example** |
|---|---|---|
| Name | Name of the synset | The word "code" has five synsets, named `code.n.01`, `code.n.02`, `code.n.03`, `code.v.01`, and `code.v.02` |
| POS | Part of speech of the word for this synset | The word "code" has three synsets in noun form and two in verb form |
| Definition | Definition of the word (in that POS) | One definition of the noun "code" is: (computer science) the symbolic arrangement of data or instructions in a computer program |
| Examples | Examples of the word's use | One example of "code": We should encode the message for security reasons |
| Lemmas | Other synsets this word is related to (not strictly synonyms, but can be considered so); lemmas are related to other lemmas, not to words directly | Lemmas of `code.v.02` are `code.v.02.encipher`, `code.v.02.cipher`, `code.v.02.cypher`, `code.v.02.encrypt`, `code.v.02.inscribe`, and `code.v.02.write_in_code` |
| Antonyms | Words with the opposite meaning | The antonym of lemma `encode.v.01.encode` is `decode.v.01.decode` |
| Hypernym | A broader category that this word falls under | A hypernym of `code.v.01` is `tag.v.01` |
| Meronym | A word that is a part of this word | A meronym of "computer" is "chip" |
| Holonym | A word of which this word is a part | A holonym of "window" is "computer screen" |
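The synset names in the table appear to follow a `word.pos.number` pattern. As a rough sketch under that assumption, a name can be split into its parts with plain Python; `parse_synset_name` is a hypothetical helper and not part of NLTK:

```python
def parse_synset_name(name):
    """Split a name such as 'code.v.02' into (word, pos, sense_number)."""
    # Split from the right so that a word containing dots stays in one piece.
    word, pos, number = name.rsplit(".", 2)
    return word, pos, int(number)


print(parse_synset_name("code.v.02"))
print(parse_synset_name("written_communication.n.01"))
```

This only inspects the string; it does not check whether such a synset exists in WordNet.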
|
||||||
|
|
||||||
|
Synsets have several other attributes, which you can find in `nltk/corpus/reader/wordnet.py` under `<your Python install>/Lib/site-packages`.
|
||||||
|
|
||||||
|
Here's some code that may help clarify.
|
||||||
|
|
||||||
|
This function:
|
||||||
|
|
||||||
|
|
||||||
|
```
from nltk.corpus import wordnet


def synset_info(synset):
    print("Name", synset.name())
    print("POS:", synset.pos())
    print("Definition:", synset.definition())
    print("Examples:", synset.examples())
    print("Lemmas:", synset.lemmas())
    print("Antonyms:", [lemma.antonyms() for lemma in synset.lemmas() if len(lemma.antonyms()) > 0])
    print("Hypernyms:", synset.hypernyms())
    print("Instance Hypernyms:", synset.instance_hypernyms())
    print("Part Holonyms:", synset.part_holonyms())
    print("Part Meronyms:", synset.part_meronyms())
    print()


synsets = wordnet.synsets('code')
print(len(synsets), "synsets:")
for synset in synsets:
    synset_info(synset)
```
|
||||||
|
|
||||||
|
Shows this:
|
||||||
|
|
||||||
|
|
||||||
|
```
|
||||||
|
5 synsets:
|
||||||
|
Name code.n.01
|
||||||
|
POS: n
|
||||||
|
Definition: a set of rules or principles or laws (especially written ones)
|
||||||
|
Examples: []
|
||||||
|
Lemmas: [Lemma('code.n.01.code'), Lemma('code.n.01.codification')]
|
||||||
|
Antonyms: []
|
||||||
|
Hypernyms: [Synset('written_communication.n.01')]
|
||||||
|
Instance Hypernyms: []
|
||||||
|
Part Holonyms: []
|
||||||
|
Part Meronyms: []
|
||||||
|
|
||||||
|
...
|
||||||
|
|
||||||
|
Name code.n.03
|
||||||
|
POS: n
|
||||||
|
Definition: (computer science) the symbolic arrangement of data or instructions in a computer program or the set of such instructions
|
||||||
|
Examples: []
|
||||||
|
Lemmas: [Lemma('code.n.03.code'), Lemma('code.n.03.computer_code')]
|
||||||
|
Antonyms: []
|
||||||
|
Hypernyms: [Synset('coding_system.n.01')]
|
||||||
|
Instance Hypernyms: []
|
||||||
|
Part Holonyms: []
|
||||||
|
Part Meronyms: []
|
||||||
|
|
||||||
|
...
|
||||||
|
|
||||||
|
Name code.v.02
|
||||||
|
POS: v
|
||||||
|
Definition: convert ordinary language into code
|
||||||
|
Examples: ['We should encode the message for security reasons']
|
||||||
|
Lemmas: [Lemma('code.v.02.code'), Lemma('code.v.02.encipher'), Lemma('code.v.02.cipher'), Lemma('code.v.02.cypher'), Lemma('code.v.02.encrypt'), Lemma('code.v.02.inscribe'), Lemma('code.v.02.write_in_code')]
|
||||||
|
Antonyms: []
|
||||||
|
Hypernyms: [Synset('encode.v.01')]
|
||||||
|
Instance Hypernyms: []
|
||||||
|
Part Holonyms: []
|
||||||
|
Part Meronyms: []
|
||||||
|
```
|
||||||
|
|
||||||
Synsets and lemmas follow a tree structure in WordNet, which the code below visualizes:
|
||||||
|
|
||||||
|
```
from pprint import pprint

from nltk.corpus import wordnet


def hypernyms(synset):
    return synset.hypernyms()


# Print the hypernym tree of every synset of the word "code"
synsets = wordnet.synsets('code')
for synset in synsets:
    print(synset.name() + " tree:")
    pprint(synset.tree(rel=hypernyms))
    print()
```

```
code.n.01 tree:
[Synset('code.n.01'),
 [Synset('written_communication.n.01'),
   ...

code.n.02 tree:
[Synset('code.n.02'),
 [Synset('coding_system.n.01'),
   ...

code.n.03 tree:
[Synset('code.n.03'),
   ...

code.v.01 tree:
[Synset('code.v.01'),
 [Synset('tag.v.01'),
   ...

code.v.02 tree:
[Synset('code.v.02'),
 [Synset('encode.v.01'),
   ...
```
|
||||||
|
|
||||||
|
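WordNet does not cover all words and their information (there are about 170,000 words in English today and about 155,000 in the latest version of WordNet), but it's a good starting point. After you learn the concepts of this building block, if you find it inadequate for your needs, you can migrate to another tool. Or, you can build your own WordNet!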
|
||||||
|
|
||||||
|
#### Try it yourself
|
||||||
|
|
||||||
|
Using the Python libraries, download Wikipedia's page on [open source][5] and list the synsets and lemmas of all the words.
|
||||||
|
|
||||||
|
### Similarity comparison
|
||||||
|
|
||||||
|
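Similarity comparison is a building block that identifies how similar two pieces of text are. It has many applications in search engines, chatbots, and more.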
|
||||||
|
|
||||||
|
For example, similarity comparison can tell whether the words "football" and "soccer" are related.
|
||||||
|
|
||||||
|
```
from nltk.corpus import wordnet

syn1 = wordnet.synsets('football')
syn2 = wordnet.synsets('soccer')

# A word may have multiple synsets, so compare each synset of word1 with each synset of word2
for s1 in syn1:
    for s2 in syn2:
        print("Path similarity of: ")
        print(s1, '(', s1.pos(), ')', '[', s1.definition(), ']')
        print(s2, '(', s2.pos(), ')', '[', s2.definition(), ']')
        print("   is", s1.path_similarity(s2))
        print()
```

```
Path similarity of:
Synset('football.n.01') ( n ) [ any of various games played with a ball (round or oval) in which two teams try to kick or carry or propel the ball into each other's goal ]
Synset('soccer.n.01') ( n ) [ a football game in which two teams of 11 players try to kick or head a ball into the opponents' goal ]
   is 0.5

Path similarity of:
Synset('football.n.02') ( n ) [ the inflated oblong ball used in playing American football ]
Synset('soccer.n.01') ( n ) [ a football game in which two teams of 11 players try to kick or head a ball into the opponents' goal ]
   is 0.05
```
|
||||||
|
|
||||||
|
两个词各个 synset 之间<ruby>路径相似度<rt>path similarity</rt></ruby>最大的是 0.5,表明它们关联性很大 (路径相似度指两个词的意义在<ruby>上下义关系的词汇分类结构<rt>hypernym/hypnoym taxonomy</rt></ruby>中的最短距离)。
|
||||||
|
|
||||||
|
What about "code" and "bug"? The similarity scores for these words used in computer science are:
|
||||||
|
|
||||||
|
```
|
||||||
|
Path similarity of:
|
||||||
|
Synset('code.n.01') ( n ) [ a set of rules or principles or laws (especially written ones) ]
|
||||||
|
Synset('bug.n.02') ( n ) [ a fault or defect in a computer program, system, or machine ]
|
||||||
|
is 0.1111111111111111
|
||||||
|
...
|
||||||
|
Path similarity of:
|
||||||
|
Synset('code.n.02') ( n ) [ a coding system used for transmitting messages requiring brevity or secrecy ]
|
||||||
|
Synset('bug.n.02') ( n ) [ a fault or defect in a computer program, system, or machine ]
|
||||||
|
is 0.09090909090909091
|
||||||
|
...
|
||||||
|
Path similarity of:
|
||||||
|
Synset('code.n.03') ( n ) [ (computer science) the symbolic arrangement of data or instructions in a computer program or the set of such instructions ]
|
||||||
|
Synset('bug.n.02') ( n ) [ a fault or defect in a computer program, system, or machine ]
|
||||||
|
is 0.09090909090909091
|
||||||
|
```
|
||||||
|
|
||||||
|
These are the highest path similarity scores between the synsets of the two words, and they indicate that the words are related.
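Picking the highest score over all synset pairs can be sketched as a small helper. The `best_pair` function and the `scores` table are hypothetical stand-ins for illustration; the two scores are copied from the football/soccer output earlier in this section:

```python
from itertools import product


def best_pair(items1, items2, score):
    """Return (best_score, a, b) over all pairs, ignoring pairs that score None."""
    best = None
    for a, b in product(items1, items2):
        value = score(a, b)
        if value is not None and (best is None or value > best[0]):
            best = (value, a, b)
    return best


# Stand-in data: a lookup table of pairwise scores instead of real synsets.
scores = {
    ("football.n.01", "soccer.n.01"): 0.5,
    ("football.n.02", "soccer.n.01"): 0.05,
}
result = best_pair(
    ["football.n.01", "football.n.02"],
    ["soccer.n.01"],
    lambda a, b: scores.get((a, b)),
)
print(result)
```

With real synsets, the scoring function could be something like `lambda a, b: a.path_similarity(b)`. Skipping `None` matters because `path_similarity` returns `None` when no connecting path is found.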
|
||||||
|
|
||||||
|
NLTK provides several similarity scorers, such as:
|
||||||
|
|
||||||
|
* path_similarity
|
||||||
|
* lch_similarity
|
||||||
|
* wup_similarity
|
||||||
|
* res_similarity
|
||||||
|
* jcn_similarity
|
||||||
|
* lin_similarity
|
||||||
|
|
||||||
|
To learn more about these similarity scorers, see the Similarity section of the [WordNet Interface][6] page.
|
||||||
|
|
||||||
|
#### Try it yourself
|
||||||
|
|
||||||
|
Using the Python libraries, build a list of terms from Wikipedia's [Category: Lists of computer terms][7] page, then compute the similarity between the terms.
|
||||||
|
|
||||||
|
### Tree and treebank
|
||||||
|
|
||||||
|
With NLTK, you can represent a text's structure in tree form to help with text analysis.
|
||||||
|
|
||||||
|
Here is an example:
|
||||||
|
|
||||||
|
Take a short text, then pre-process it and tag its parts of speech (POS):
|
||||||
|
|
||||||
|
|
||||||
|
```
|
||||||
|
import nltk
|
||||||
|
|
||||||
|
text = "I love open source"
|
||||||
|
# Tokenize to words
|
||||||
|
words = nltk.tokenize.word_tokenize(text)
|
||||||
|
# POS tag the words
|
||||||
|
words_tagged = nltk.pos_tag(words)
|
||||||
|
```
|
||||||
|
|
||||||
|
To convert the text to a tree structure, you must define a grammar. This example uses a simple grammar based on the [Penn Treebank tags][8].
|
||||||
|
|
||||||
|
|
||||||
|
```
|
||||||
|
# A simple grammar to create tree
|
||||||
|
grammar = "NP: {<JJ><NN>}"
|
||||||
|
```
|
||||||
|
|
||||||
|
Next, use the grammar to create a tree:
|
||||||
|
|
||||||
|
|
||||||
|
```
from pprint import pprint

# Create tree
parser = nltk.RegexpParser(grammar)
tree = parser.parse(words_tagged)
pprint(tree)
```
|
||||||
|
|
||||||
|
Running the code above produces:
|
||||||
|
|
||||||
|
|
||||||
|
```
|
||||||
|
Tree('S', [('I', 'PRP'), ('love', 'VBP'), Tree('NP', [('open', 'JJ'), ('source', 'NN')])])
|
||||||
|
```
|
||||||
|
|
||||||
|
You can also display the result graphically.
|
||||||
|
|
||||||
|
```
|
||||||
|
tree.draw()
|
||||||
|
```
|
||||||
|
|
||||||
|
![NLTK Tree][9]
|
||||||
|
|
||||||
|
(Girish Managoli, [CC BY-SA 4.0][10])
|
||||||
|
|
||||||
|
This tree structure helps interpret the text's meaning correctly. As an example, use it to identify the [subject][11] of the text:
|
||||||
|
|
||||||
|
|
||||||
|
```
subject_tags = ["NN", "NNS", "NP", "NNP", "NNPS", "PRP", "PRP$"]
def subject(sentence_tree):
    for tagged_word in sentence_tree:
        # A crude logic for this case - first word with these tags is considered subject
        if tagged_word[1] in subject_tags:
            return tagged_word[0]

print("Subject:", subject(tree))
```
|
||||||
|
|
||||||
|
It shows that "I" is the subject:
|
||||||
|
|
||||||
|
```
|
||||||
|
Subject: I
|
||||||
|
```
|
||||||
|
|
||||||
|
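This is a basic text analysis building block that applies to larger applications. For example, when a user tells a chatbot, "Book a flight for my mom, Jane, from London to New York on January 1st," the chatbot can use this kind of analysis to interpret the request as: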
|
||||||
|
|
||||||
|
**Action**: Book

**What**: Flight

**Traveler**: Jane

**From**: London

**To**: New York

**Date**: January 1 (of the next year)
|
||||||
|
|
||||||
A treebank is a corpus made up of pre-tagged trees. Treebanks are available for many languages; some are open source, some are free to use under conditions, and some are commercial. The most widely used one for English is the Penn Treebank, which is drawn from the _Wall Street Journal_. NLTK includes the Penn Treebank as a sub-corpus. Here are some ways of using a treebank:
|
||||||
|
|
||||||
|
```
import nltk

words = nltk.corpus.treebank.words()
print(len(words), "words:")
print(words)

tagged_sents = nltk.corpus.treebank.tagged_sents()
print(len(tagged_sents), "sentences:")
print(tagged_sents)
```

```
100676 words:
['Pierre', 'Vinken', ',', '61', 'years', 'old', ',', ...]
3914 sentences:
[[('Pierre', 'NNP'), ('Vinken', 'NNP'), (',', ','), ('61', 'CD'), ('years', 'NNS'), ('old', 'JJ'), (',', ','), ('will', 'MD'), ('join', 'VB'), ('the', 'DT'), ('board', 'NN'), ('as', 'IN'), ('a', 'DT'), ('nonexecutive', 'JJ'), ('director', 'NN'), ...]
```
|
||||||
|
|
||||||
|
See the tags in a sentence:
|
||||||
|
|
||||||
|
```
from pprint import pprint

sent0 = tagged_sents[0]
pprint(sent0)
```

```
[('Pierre', 'NNP'),
 ('Vinken', 'NNP'),
 (',', ','),
 ('61', 'CD'),
 ('years', 'NNS'),
...
```
|
||||||
|
|
||||||
|
Define a grammar to convert this sentence to a tree:
|
||||||
|
|
||||||
|
```
grammar = '''
Subject: {<NNP><NNP>}
SubjectInfo: {<CD><NNS><JJ>}
Action: {<MD><VB>}
Object: {<DT><NN>}
Stopwords: {<IN><DT>}
ObjectInfo: {<JJ><NN>}
When: {<NNP><CD>}
'''
parser = nltk.RegexpParser(grammar)
tree = parser.parse(sent0)
print(tree)
```

```
(S
  (Subject Pierre/NNP Vinken/NNP)
  ,/,
  (SubjectInfo 61/CD years/NNS old/JJ)
  ,/,
  (Action will/MD join/VB)
  (Object the/DT board/NN)
  as/IN
  a/DT
  (ObjectInfo nonexecutive/JJ director/NN)
  (Subject Nov./NNP)
  29/CD
  ./.)
```
|
||||||
|
|
||||||
|
See it graphically:
|
||||||
|
|
||||||
|
|
||||||
|
```
|
||||||
|
tree.draw()
|
||||||
|
```
|
||||||
|
|
||||||
|
![NLP Treebank image][12]
|
||||||
|
|
||||||
|
(Girish Managoli, [CC BY-SA 4.0][10])
|
||||||
|
|
||||||
The concept of trees and treebanks is a powerful building block for text analysis.
|
||||||
|
|
||||||
|
#### Try it yourself
|
||||||
|
|
||||||
|
Using the Python libraries, download Wikipedia's page on [open source][5] and display the resulting text as a graphical tree.
|
||||||
|
|
||||||
|
### Named entity recognition
|
||||||
|
|
||||||
|
无论口语还是书面语都包含着重要数据。文本处理的主要目标之一,就是提取出关键数据。几乎所有应用场景所需要提取关键数据,比如航空公司的订票机器人或者问答机器人。 NLTK 为此提供了一个<ruby>命名实体识别<rt>named entity recognition</rt></ruby>的功能。
|
||||||
|
|
||||||
|
这里有一个代码示例:
|
||||||
|
|
||||||
|
|
||||||
|
```
sentence = 'Peterson first suggested the name "open source" at Palo Alto, California'
```
|
||||||
|
|
||||||
|
Check whether the name and the places in this sentence are recognized. As usual, begin by pre-processing:
|
||||||
|
|
||||||
|
```
import nltk

words = nltk.word_tokenize(sentence)
pos_tagged = nltk.pos_tag(words)
```
|
||||||
|
|
||||||
|
Run the named-entity tagger:
|
||||||
|
|
||||||
|
```
ne_tagged = nltk.ne_chunk(pos_tagged)
print("NE tagged text:")
print(ne_tagged)
print()
```

```
NE tagged text:
(S
  (PERSON Peterson/NNP)
  first/RB
  suggested/VBD
  the/DT
  name/NN
  ``/``
  open/JJ
  source/NN
  ''/''
  at/IN
  (FACILITY Palo/NNP Alto/NNP)
  ,/,
  (GPE California/NNP))
```
|
||||||
|
|
||||||
|
In the output above, the named entities have been recognized and tagged. Extract only the named entities from this tree:
|
||||||
|
|
||||||
|
```
print("Recognized named entities:")
for ne in ne_tagged:
    # Only subtrees (recognized entities) have a label; plain tokens are tuples.
    if hasattr(ne, "label"):
        print(ne.label(), ne[0:])
```

```
Recognized named entities:
PERSON [('Peterson', 'NNP')]
FACILITY [('Palo', 'NNP'), ('Alto', 'NNP')]
GPE [('California', 'NNP')]
```
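The same extraction loop can be exercised without the tagger models. This is a rough sketch, assuming NLTK is installed; the tree is built by hand to mimic the shape of a named-entity chunker result, and the helper name `named_entities` is hypothetical.

```python
from nltk import Tree

# A hand-built tree shaped like named-entity chunker output (an assumption for
# illustration; a real run would produce this tree from tagged text).
ne_tree = Tree("S", [
    Tree("PERSON", [("Peterson", "NNP")]),
    ("first", "RB"),
    ("suggested", "VBD"),
    Tree("GPE", [("California", "NNP")]),
])


def named_entities(tree):
    """Return (label, text) pairs for each labelled chunk directly under the root."""
    entities = []
    for node in tree:
        # Chunks are Tree objects; ordinary tokens are (word, tag) tuples.
        if isinstance(node, Tree):
            text = " ".join(word for word, _tag in node.leaves())
            entities.append((node.label(), text))
    return entities


print(named_entities(ne_tree))
```

Checking `isinstance(node, Tree)` is an alternative to the `hasattr(ne, "label")` test used in the article; both separate chunks from plain tokens.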
|
||||||
|
|
||||||
|
Display it graphically:
|
||||||
|
|
||||||
|
```
ne_tagged.draw()
```
|
||||||
|
|
||||||
|
![NLTK Treebank tree][13]
|
||||||
|
|
||||||
|
(Girish Managoli, [CC BY-SA 4.0][10])
|
||||||
|
|
||||||
|
NLTK's built-in named-entity tagger uses the University of Pennsylvania's [Automatic Content Extraction][14] (ACE) program. It recognizes common entities such as ORGANIZATION, PERSON, LOCATION, FACILITY, and GPE (geopolitical entity).

NLTK can also use other taggers, such as the [Stanford Named Entity Recognizer][15]. This trained tagger is written in Java, but NLTK provides an interface to it (see [nltk.parse.stanford][16] or [nltk.tag.stanford][17]).

#### Try it yourself

Using Python libraries, download Wikipedia's page on [open source][5] and identify the people who influenced open source, along with when and where they contributed.

### Advanced exercise

If you are ready for it, try building a superstructure from the building blocks covered in this article and the previous one.

Using Python libraries, download Wikipedia's [Category: Computer science page][18] and:

* Find the most frequent unigrams, bigrams, and trigrams, and publish them as a list of keywords or technologies that students and engineers in this field need to know.
* Display the important names, technologies, dates, and places in this field graphically. This could make a nice infographic.
* Build a search engine. Does your search engine perform better than Wikipedia's?
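As a starting point for the first item, n-gram frequencies can be counted with the standard library alone. This is only a sketch: the sample sentence stands in for the downloaded page text, the regular-expression tokenizer is a deliberate simplification, and the helper name `ngram_counts` is hypothetical.

```python
import re
from collections import Counter


def ngram_counts(text, n):
    """Count n-grams of lower-cased words in text (naive regex tokenization)."""
    words = re.findall(r"[a-z']+", text.lower())
    # Zipping n shifted copies of the word list yields consecutive n-word tuples.
    grams = zip(*(words[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams)


# Placeholder text; in the exercise this would be the downloaded page content.
sample = "open source software is software with open source code"

print(ngram_counts(sample, 1).most_common(3))
print(ngram_counts(sample, 2).most_common(3))
print(ngram_counts(sample, 3).most_common(3))
```

For real text, swapping the regex for `nltk.word_tokenize` and filtering stop words would likely give more useful keyword lists.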
|
||||||
|
|
||||||
|
|
||||||
|
### What's next?

NLP is a quintessential pillar of application building. NLTK is a classic, rich, and powerful toolkit that provides the building blocks for practical, purposeful real-world applications.

In this series of articles, I used NLTK as an example to show what natural language processing can do. NLP and NLTK have much more to offer, and this series is only an entry point to help you explore them.

If your needs grow beyond what NLTK can do, you can train new models or add new capabilities to NLTK. New NLP libraries built on NLTK keep emerging, and machine learning is being used extensively in natural language processing.
|
||||||
|
|
||||||
|
--------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
via: https://opensource.com/article/20/8/nlp-python-nltk
|
||||||
|
|
||||||
|
Author: [Girish Managoli][a]
Topic selection: [lujun9972][b]
Translator: [tanloong](https://github.com/tanloong)
Proofreader: [proofreader ID](https://github.com/校对者ID)

This article was translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux China](https://linux.cn/)
|
||||||
|
|
||||||
|
[a]: https://opensource.com/users/gammay
|
||||||
|
[b]: https://github.com/lujun9972
|
||||||
|
[1]: https://opensource.com/sites/default/files/styles/image-full-size/public/lead-images/brain_computer_solve_fix_tool.png?itok=okq8joti (Brain on a computer screen)
|
||||||
|
[2]: https://opensource.com/article/20/8/intro-python-nltk
|
||||||
|
[3]: http://www.nltk.org/
|
||||||
|
[4]: https://en.wikipedia.org/wiki/WordNet
|
||||||
|
[5]: https://en.wikipedia.org/wiki/Open_source
|
||||||
|
[6]: https://www.nltk.org/howto/wordnet.html
|
||||||
|
[7]: https://en.wikipedia.org/wiki/Category:Lists_of_computer_terms
|
||||||
|
[8]: https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html
|
||||||
|
[9]: https://opensource.com/sites/default/files/uploads/nltk-tree.jpg (NLTK Tree)
|
||||||
|
[10]: https://creativecommons.org/licenses/by-sa/4.0/
|
||||||
|
[11]: https://en.wikipedia.org/wiki/Subject_(grammar)
|
||||||
|
[12]: https://opensource.com/sites/default/files/uploads/nltk-treebank.jpg (NLP Treebank image)
|
||||||
|
[13]: https://opensource.com/sites/default/files/uploads/nltk-treebank-2a.jpg (NLTK Treebank tree)
|
||||||
|
[14]: https://www.ldc.upenn.edu/collaborations/past-projects/ace
|
||||||
|
[15]: https://nlp.stanford.edu/software/CRF-NER.html
|
||||||
|
[16]: https://www.nltk.org/_modules/nltk/parse/stanford.html
|
||||||
|
[17]: https://www.nltk.org/_modules/nltk/tag/stanford.html
|
||||||
|
[18]: https://en.wikipedia.org/wiki/Category:Computer_science
|
@ -3,19 +3,19 @@
|
|||||||
[#]: author: (Arindam https://www.debugpoint.com/author/admin1/)
|
[#]: author: (Arindam https://www.debugpoint.com/author/admin1/)
|
||||||
[#]: collector: (lujun9972)
|
[#]: collector: (lujun9972)
|
||||||
[#]: translator: (geekpi)
|
[#]: translator: (geekpi)
|
||||||
[#]: reviewer: ( )
|
[#]: reviewer: (turbokernel)
|
||||||
[#]: publisher: ( )
|
[#]: publisher: ( )
|
||||||
[#]: url: ( )
|
[#]: url: ( )
|
||||||
|
|
||||||
How to set up internet access on a minimal install of CentOS, RHEL, or Rocky Linux
|
How to set up internet access on a minimal install of CentOS, RHEL, or Rocky Linux
|
||||||
======
|
======
|
||||||
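Setting up internet or network access on a minimally installed server is very easy. In this guide, we explain how to set up the internet or network on a minimal install of CentOS, RHEL, or Rocky Linux.
|
Setting up internet or network access on a minimal server installation is very easy. In this guide, we explain how to set up the internet or network on a minimal install of CentOS, RHEL, or Rocky Linux.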
|
||||||
|
|
||||||
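Once you have installed the minimal install of any server distribution, you have no GUI or desktop environment for setting up your network or internet. So it is important to know how to set up the internet when all you have is a terminal. The NetworkManager utility provides the necessary tools and systemd services to get the job done. Here is how.
|
When you first finish a minimal installation of any server distribution, you have no graphical interface or desktop environment for setting up your network or internet. So it is important to understand how to get connected when all you have is a terminal. NetworkManager and the systemd service provide the necessary tools to get the job done. Here is how to use them.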
|
||||||
|
|
||||||
### Set up internet on a minimal install of CentOS, RHEL, or Rocky Linux
|
### Set up internet on a minimal install of CentOS, RHEL, or Rocky Linux
|
||||||
|
|
||||||
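* After the installation completes, boot into the server terminal. Ideally, you should see a message. Log in with the root or admin account.
|
* After the installation completes, boot into the server terminal. Ideally, you should see a prompt. Log in with the root or admin account.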
|
||||||
|
|
||||||
|
|
||||||
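* Next, check the status and details of the network interfaces with nmcli. nmcli is a command-line tool that controls the NetworkManager service. Use the following command to check.
|
* Next, check the status and details of the network interfaces with nmcli. nmcli is a command-line tool that controls the NetworkManager service. Use the following command to check.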
|
||||||
@ -30,7 +30,7 @@ nmcli device status
|
|||||||
|
|
||||||
![nmcli device status][1]
|
![nmcli device status][1]
|
||||||
|
|
||||||
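* Run the `nmtui` tool to configure the network interface. [nmtui][2] is part of the NetworkManager tooling and gives you a nice user interface for configuring the network. It ships in the NetworkManager-tui package, which should be installed by default once you have completed the minimal server install.
|
* Run the `nmtui` tool to configure the network interface. [nmtui][2] is part of the NetworkManager tooling and gives you a nice user interface for configuring the network. It ships in the NetworkManager-tui package, which should be installed by default when you complete the minimal server install.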
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
@ -44,7 +44,7 @@ nmtui
|
|||||||
|
|
||||||
![nmtui – Select options][3]
|
![nmtui – Select options][3]
|
||||||
|
|
||||||
* Select the interface name
|
* Select the network port name
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
@ -56,7 +56,7 @@ nmtui
|
|||||||
|
|
||||||
![nmtui – Edit Connection][5]
|
![nmtui – Edit Connection][5]
|
||||||
|
|
||||||
* Use the command below to restart the NetworkManager service via [systemd systemctl][6].
|
* Restart the NetworkManager service using the following [systemd systemctl][6] command.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
@ -64,15 +64,15 @@ nmtui
|
|||||||
systemctl restart NetworkManager
|
systemctl restart NetworkManager
|
||||||
```
|
```
|
||||||
|
|
||||||
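* If all goes well, you should now be connected to the network and the internet on your minimal install of CentOS, RHEL, or Rocky Linux server, provided that your network itself has an internet connection. You can verify it with ping.
|
* If all goes well, you should now be connected to the network and the internet on your minimal installation of CentOS, RHEL, or Rocky Linux server, provided that your network itself has an internet connection. You can verify it with ping.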
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
![setup internet minimal server – CentOS Rocky Linux RHEL][7]
|
![setup internet minimal server – CentOS Rocky Linux RHEL][7]
|
||||||
|
|
||||||
### Bonus tip: set a static IP on a minimal server
|
### Bonus tip: set a static IP on a minimal-install server
|
||||||
|
|
||||||
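When the network configuration is set to automatic, the interface is assigned an IP address dynamically when you connect to the internet. In some situations, such as when you are setting up a local area network (LAN), you may want to assign a static IP to your network interface. It is super easy.
|
When the network configuration is set to automatic, the network port is assigned an IP address dynamically when you connect to the internet. In some situations, such as when you are setting up a local area network (LAN), you may want to assign a static IP to your network port. It is super easy.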
|
||||||
|
|
||||||
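Open the network configuration script for your network. Modify the highlighted parts to match your device.
|
Open the network configuration script for your network. Modify the highlighted parts to match your device.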
|
||||||
|
|
||||||
@ -94,7 +94,7 @@ HOSTNAME=debugpoint
|
|||||||
GATEWAY=10.1.1.1
|
GATEWAY=10.1.1.1
|
||||||
```
|
```
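Before restarting the service, it can help to confirm that the gateway actually falls inside the subnet of the static address. The following is a small sketch using Python's standard `ipaddress` module; the gateway `10.1.1.1` comes from the configuration above, while the address `10.1.1.50/24` and the helper name `gateway_in_subnet` are hypothetical values chosen for illustration.

```python
import ipaddress


def gateway_in_subnet(ip_with_prefix, gateway):
    """Return True if the gateway lies inside the subnet of the given address."""
    interface = ipaddress.ip_interface(ip_with_prefix)
    return ipaddress.ip_address(gateway) in interface.network


# 10.1.1.50/24 is an assumed static address; 10.1.1.1 is the configured gateway.
print(gateway_in_subnet("10.1.1.50/24", "10.1.1.1"))
# An address on a different /24 would not be able to reach that gateway directly.
print(gateway_in_subnet("10.1.2.50/24", "10.1.1.1"))
```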
|
||||||
|
|
||||||
Add any public DNS server to resolv.conf, located at `/etc/resolv.conf`.
|
Add any public DNS server to resolv.conf at `/etc/resolv.conf`.
|
||||||
|
|
||||||
nameserver 8.8.8.8
|
nameserver 8.8.8.8
|
||||||
nameserver 8.8.4.4
|
nameserver 8.8.4.4
|
||||||
@ -105,9 +105,9 @@ nameserver 8.8.4.4
|
|||||||
systemctl restart NetworkManager
|
systemctl restart NetworkManager
|
||||||
```
|
```
|
||||||
|
|
||||||
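That completes the static IP setup. You can also check the details of the IP with the `ip addr` command.
|
That completes the static IP setup. You can also check the detailed IP information with the `ip addr` command.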
|
||||||
|
|
||||||
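I hope this guide helps you set up the network, internet, and a static IP on your minimal server. If you have any questions, let me know in the comments.
|
I hope this guide helps you set up the network, internet, and a static IP on your minimal-install server. If you have any questions, let me know in the comments.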
|
||||||
|
|
||||||
* * *
|
* * *
|
||||||
|
|
||||||
@ -118,7 +118,7 @@ via: https://www.debugpoint.com/2021/06/setup-internet-minimal-install-server/
|
|||||||
Author: [Arindam][a]
|
Author: [Arindam][a]
|
||||||
Topic selection: [lujun9972][b]
|
Topic selection: [lujun9972][b]
|
||||||
Translator: [geekpi](https://github.com/geekpi)
|
Translator: [geekpi](https://github.com/geekpi)
|
||||||
Proofreader: [proofreader ID](https://github.com/校对者ID)
|
Proofreader: [proofreader ID](https://github.com/turbokernel)
|
||||||
|
|
||||||
This article was translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux China](https://linux.cn/)
|
This article was translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux China](https://linux.cn/)
|
||||||
|
|
||||||
|
@ -3,7 +3,7 @@
|
|||||||
[#]: author: (Sumantro Mukherjee https://opensource.com/users/sumantro)
|
[#]: author: (Sumantro Mukherjee https://opensource.com/users/sumantro)
|
||||||
[#]: collector: (lujun9972)
|
[#]: collector: (lujun9972)
|
||||||
[#]: translator: (geekpi)
|
[#]: translator: (geekpi)
|
||||||
[#]: reviewer: ( )
|
[#]: reviewer: (turbokernel)
|
||||||
[#]: publisher: ( )
|
[#]: publisher: ( )
|
||||||
[#]: url: ( )
|
[#]: url: ( )
|
||||||
|
|
||||||
@ -12,26 +12,26 @@
|
|||||||
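age is a simple, easy-to-use tool that lets you encrypt and decrypt files with a single passphrase.
|
age is a simple, easy-to-use tool that lets you encrypt and decrypt files with a single passphrase.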
|
||||||
![Scissors cutting open access to files][1]
|
![Scissors cutting open access to files][1]
|
||||||
|
|
||||||
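Encryption and security for protecting files and sensitive documents have long been a concern for users. Even as more and more of our data is housed on websites and cloud services, protected by user accounts with ever more secure and challenging passwords, there is still great value in being able to store sensitive data on our own filesystems, especially when we can encrypt that data quickly and easily.
|
Protecting files and securely encrypting sensitive documents have long been a concern for users. Even as more and more data is housed on websites and cloud services, protected by user accounts with ever more secure, high-strength passwords, there is still great value in being able to store sensitive data on our own filesystems, especially when we can encrypt that data quickly and easily.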
|
||||||
|
|
||||||
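[age][2] lets you do this. It is a small, easy-to-use tool that allows you to encrypt a file with a single passphrase and decrypt it as needed.
|
[age][2] helps you do this. It is a small and easy-to-use tool that allows you to encrypt a file with a single passphrase and decrypt it as needed.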
|
||||||
|
|
||||||
### Install age
|
### Install age
|
||||||
|
|
||||||
age is [available to install][3] from most Linux repositories.
|
age is [available to install][3] from many Linux repositories.
|
||||||
|
|
||||||
To install it on Fedora:
|
Install it on Fedora:
|
||||||
|
|
||||||
|
|
||||||
```
|
```
|
||||||
$ sudo dnf install age -y
|
$ sudo dnf install age -y
|
||||||
```
|
```
|
||||||
|
|
||||||
On macOS, use [MacPorts][4] or [Homebrew][5]. On Windows, use [Chocolatey][6].
|
On macOS, install it with [MacPorts][4] or [Homebrew][5]. On Windows, install it with [Chocolatey][6].
|
||||||
|
|
||||||
### Encrypt and decrypt files with age
|
### Encrypt and decrypt files with age
|
||||||
|
|
||||||
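age can encrypt and decrypt files with either a public key or a passphrase set by the user.
|
age can encrypt and decrypt files with either a public key or a user-defined passphrase.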
|
||||||
|
|
||||||
#### Using a public key with age
|
#### Using a public key with age
|
||||||
|
|
||||||
@ -52,22 +52,22 @@ Public key: age16frc22wz6z206hslrjzuv2tnsuw32rk80pnrku07fh7hrmxhudawase896m9
|
|||||||
$ age -r age16frc22wz6z206hslrjzuv2tnsuw32rk80pnrku07fh7hrmxhudawase896m9 mypasswds.txt > mypass.tar.gz.age
|
$ age -r age16frc22wz6z206hslrjzuv2tnsuw32rk80pnrku07fh7hrmxhudawase896m9 mypasswds.txt > mypass.tar.gz.age
|
||||||
```
|
```
|
||||||
|
|
||||||
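In this example, the file `mypasswds.txt` is encrypted with the public key I generated and placed in an encrypted file called `mypass.tar.gz.age`.
|
In this example, the file `mypasswds.txt` is encrypted using the public key I generated and saved in an encrypted file named `mypass.tar.gz.age`.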
|
||||||
|
|
||||||
### Decrypt with a public key
|
### Decrypt with a public key
|
||||||
|
|
||||||
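To decrypt the information you have protected, use the `age` command with the `--decrypt` option:
|
To decrypt an encrypted file, use the `age` command with the `--decrypt` option: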
|
||||||
|
|
||||||
|
|
||||||
```
|
```
|
||||||
$ age --decrypt -i key.txt -o mypass.tar.gz mypass.tar.gz.age
|
$ age --decrypt -i key.txt -o mypass.tar.gz mypass.tar.gz.age
|
||||||
```
|
```
|
||||||
|
|
||||||
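In this example, age uses the key stored in `key.txt` and decrypts the file I created in the previous step.
|
In this example, age uses the key stored in `key.txt` and decrypts the encrypted file I created in the previous step.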
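When these public-key commands need to be scripted, one option is to build the argument lists in Python and hand them to `subprocess`. The sketch below only assembles and prints the command line; the helper names are hypothetical, and the flags (`-r`, `-o`, `--decrypt`, `-i`) are the ones used in the commands above.

```python
import shlex


def age_encrypt_cmd(recipient, src, dst):
    """Build the argument list for encrypting src to dst for a public-key recipient."""
    return ["age", "-r", recipient, "-o", dst, src]


def age_decrypt_cmd(identity_file, src, dst):
    """Build the argument list for decrypting src to dst using a key file."""
    return ["age", "--decrypt", "-i", identity_file, "-o", dst, src]


# Print the decrypt command as a shell-safe string for inspection.
cmd = age_decrypt_cmd("key.txt", "mypass.tar.gz.age", "mypass.tar.gz")
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would execute it, assuming `age` is installed and on the `PATH`.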
|
||||||
|
|
||||||
### Encrypt with a passphrase
|
### Encrypt with a passphrase
|
||||||
|
|
||||||
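Encrypting a file without a public key is called symmetric encryption. It lets a user set a passphrase to encrypt and decrypt a file. To do so:
|
Encrypting a file without using a public key is called symmetric encryption. It lets a user set a passphrase to encrypt and decrypt a file. To do so: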
|
||||||
|
|
||||||
|
|
||||||
```
|
```
|
||||||
@ -76,22 +76,22 @@ Enter passphrase (leave empty to autogenerate a secure one):
|
|||||||
Confirm passphrase:
|
Confirm passphrase:
|
||||||
```
|
```
|
||||||
|
|
||||||
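In this example, age prompts you for a passphrase, which it uses to encrypt the input file `mypasswd.txt` and produce the file `mypasswd-encrypted.txt`.
|
In this example, age prompts you for a passphrase, which it then uses to encrypt the input file `mypasswd.txt` and produce the encrypted file `mypasswd-encrypted.txt`.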
|
||||||
|
|
||||||
### Decrypt with a passphrase
|
### Decrypt with a passphrase
|
||||||
|
|
||||||
To decrypt a file encrypted with a passphrase, use the `age` command with the `--decrypt` option:
|
To decrypt a passphrase-encrypted file, use the `age` command with the `--decrypt` option:
|
||||||
|
|
||||||
|
|
||||||
```
|
```
|
||||||
$ age --decrypt --output passwd-decrypt.txt mypasswd-encrypted.txt
|
$ age --decrypt --output passwd-decrypt.txt mypasswd-encrypted.txt
|
||||||
```
|
```
|
||||||
|
|
||||||
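In this example, age prompts you for the passphrase, then decrypts the contents of the `mypasswd-encrypted.txt` file into `passwd-decrypt.txt`, as long as the passphrase you provide matches the one set during encryption.
|
In this example, age prompts you for the passphrase and, as long as it matches the one set during encryption, decrypts the contents of the encrypted file `mypasswd-encrypted.txt` into `passwd-decrypt.txt`.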
|
||||||
|
|
||||||
### Don't lose your keys
|
### Don't lose your keys
|
||||||
|
|
||||||
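Whether you encrypt with a passphrase or a public key, you _must not_ lose the credentials for your encrypted data. By design, a file encrypted with age cannot be decrypted without the key used to encrypt it. So back up your public key and remember those passphrases!
|
Whether you encrypt with a passphrase or a public key, you _must not_ lose the credentials for your encrypted data. By design, a file encrypted through age cannot be decrypted without the key used to encrypt it. So back up your public key and remember those passphrases!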
|
||||||
|
|
||||||
### Encryption made easy
|
### Encryption made easy
|
||||||
|
|
||||||
@ -104,7 +104,7 @@ via: https://opensource.com/article/21/7/linux-age
|
|||||||
Author: [Sumantro Mukherjee][a]
|
Author: [Sumantro Mukherjee][a]
|
||||||
Topic selection: [lujun9972][b]
|
Topic selection: [lujun9972][b]
|
||||||
Translator: [geekpi](https://github.com/geekpi)
|
Translator: [geekpi](https://github.com/geekpi)
|
||||||
Proofreader: [proofreader ID](https://github.com/校对者ID)
|
Proofreader: [proofreader ID](https://github.com/turbokernel)
|
||||||
|
|
||||||
This article was translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux China](https://linux.cn/)
|
This article was translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux China](https://linux.cn/)
|
||||||
|
|
||||||
|
@ -0,0 +1,95 @@
|
|||||||
|
[#]: subject: "Write good examples by starting with real code"
|
||||||
|
[#]: via: "https://jvns.ca/blog/2021/07/08/writing-great-examples/"
|
||||||
|
[#]: author: "Julia Evans https://jvns.ca/"
|
||||||
|
[#]: collector: "lujun9972"
|
||||||
|
[#]: translator: "zepoch"
|
||||||
|
[#]: reviewer: " "
|
||||||
|
[#]: publisher: " "
|
||||||
|
[#]: url: " "
|
||||||
|
|
||||||
|
Write good examples by starting with real code
|
||||||
|
======
|
||||||
|
|
||||||
|
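When I write about programming, I spend a lot of time trying to come up with good examples. I have never seen anyone write about how to make good examples, so here is a little bit about how to do it.

The basic idea is to start with real code that you wrote and then remove the irrelevant details to turn it into a self-contained example, instead of coming up with examples out of thin air.

I'll talk about two kinds of examples: realistic examples and surprising examples.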
|
||||||
|
|
||||||
|
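### Good examples are realistic

To see why examples should be realistic, let's first look at an unrealistic one. Say we are trying to explain Python lambdas (just the first concept I thought of). You could give an example that uses `map` and a lambda to square a list of numbers.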
|
||||||
|
|
||||||
|
```
numbers = [1, 2, 3, 4]
squares = map(lambda x: x * x, numbers)
```
|
||||||
|
|
||||||
|
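I think this example is unrealistic for two reasons:

* Squaring a list of numbers is not something you would do in a real program unless it is Project Euler or something similar (there are many other operations that are far more likely)
* `map` is not used much in Python; even for this task I would rather write `[x*x for x in numbers]`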
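To make the second point concrete, here is a short sketch comparing the two spellings. The variable names `squares_map` and `squares_comp` are illustrative; note that in Python 3 `map` returns a lazy iterator, so it has to be wrapped in `list` to compare the results.

```python
numbers = [1, 2, 3, 4]

# map returns a lazy iterator in Python 3, so materialize it with list().
squares_map = list(map(lambda x: x * x, numbers))

# The list comprehension expresses the same transformation without a lambda.
squares_comp = [x * x for x in numbers]

print(squares_map == squares_comp)
```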
|
||||||
|
|
||||||
|
A more realistic example of Python lambdas is sorting with a `key` function, like this:
|
||||||
|
|
||||||
|
```
children = [{"name": "ashwin", "age": 12}, {"name": "radhika", "age": 3}]
sorted_children = sorted(children, key=lambda x: x['age'])
```
|
||||||
|
|
||||||
|
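But this example is still contrived (why would we need to sort these children by age?). So how do we make a realistic example?

### How to make your examples realistic: look at real code you wrote

I think the easiest way to come up with an example is not to invent one out of thin air (as I did with the `children` example) but simply to start by looking at real code!

For example, if I search a bunch of Python code I have written for `sort.+key`, I find lots of real examples of sorting a list by some criterion, such as: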
|
||||||
|
|
||||||
|
* `tasks.sort(key=lambda task: task['completed_time'])`
* `emails = reversed(sorted(emails, key=lambda x:x['receivedAt']))`
* `sorted_keysizes = sorted(scores.keys(), key=scores.get)`
* `shows = sorted(dates[date], key=lambda x: x['time']['performanceTime'])`
|
||||||
|
|
||||||
|
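It is easy to see a pattern here: most of these sort by time! So you can see how easy it would be to put together a simple, realistic example that sorts some objects (emails, events, and so on) by time.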
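A standalone example distilled from that pattern might look like the sketch below. The email records and the ISO-8601 timestamp strings are made up for illustration; only the `receivedAt` key name is borrowed from the snippets above.

```python
from datetime import datetime

# Made-up records for illustration; real data would come from a mail API or database.
emails = [
    {"subject": "Invoice", "receivedAt": "2021-07-08T09:30:00"},
    {"subject": "Welcome", "receivedAt": "2021-07-01T12:00:00"},
    {"subject": "Reminder", "receivedAt": "2021-07-05T18:45:00"},
]

# Parse each timestamp in the key function and sort newest first.
newest_first = sorted(
    emails,
    key=lambda email: datetime.fromisoformat(email["receivedAt"]),
    reverse=True,
)

print([email["subject"] for email in newest_first])
```

Using `reverse=True` is a design choice here; it avoids the separate `reversed(...)` call seen in one of the real snippets.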
|
||||||
|
|
||||||
|
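### Realistic examples help "sell" the concept you are trying to explain

When I try to explain an idea (like Python lambdas), I am usually also trying to convince the reader that the idea is worth learning. Python lambdas are so useful! And when I am trying to convince someone that lambdas are useful, it helps a lot if they can picture how lambdas would help with a task they are about to do, or one they have done before.

### Distilling an example from real code can take a long time

The example I just gave of explaining how to use `sort` with `lambda` is pretty simple, and it didn't take me long to come up with, but turning real code into a standalone example can take a really long time!

For example, I wanted to include an example of some weird CSS behaviour in this post to illustrate how fun it is to create a weird or surprising example. I spent two hours taking a real problem I ran into this week, making sure I understood what was actually happening with the CSS, and turning it into a minimal example.

In the end it "just" took [five lines of HTML and a tiny bit of CSS][1] to demonstrate the problem, and it doesn't look like something that took hours to write. But it started out as hundreds of lines of JS/CSS/JavaScript, and it took a long time to boil all of that code down to a small core.

But I think it is worth taking the time to make an example really simple and clear: if hundreds or thousands of people read your example, you have saved all of them that time!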
|
||||||
|
|
||||||
|
### That's all for now!

I think there is more to say about examples. There are a few more kinds of useful examples, such as:

* Surprising examples that change how the reader thinks rather than giving them code to use directly
* Examples that are easy to copy and paste as a starting point

Maybe I'll write more someday? :)
|
||||||
|
|
||||||
|
--------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
via: https://jvns.ca/blog/2021/07/08/writing-great-examples/
|
||||||
|
|
||||||
|
Author: [Julia Evans][a]
Topic selection: [lujun9972][b]
Translator: [zepoch](https://github.com/zepoch)
Proofreader: [proofreader ID](https://github.com/校对者ID)

This article was translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux China](https://linux.cn/)
|
||||||
|
|
||||||
|
[a]: https://jvns.ca/
|
||||||
|
[b]: https://github.com/lujun9972
|
||||||
|
[1]: https://codepen.io/wizardzines/pen/0eda7725a46c919dcfdd3fa80aff3d41
|
@ -0,0 +1,99 @@
|
|||||||
|
[#]: subject: (Converseen for Batch Processing Images on Linux)
|
||||||
|
[#]: via: (https://itsfoss.com/converseen/)
|
||||||
|
[#]: author: (Abhishek Prakash https://itsfoss.com/author/abhishek/)
|
||||||
|
[#]: collector: (lujun9972)
|
||||||
|
[#]: translator: (geekpi)
|
||||||
|
[#]: reviewer: ( )
|
||||||
|
[#]: publisher: ( )
|
||||||
|
[#]: url: ( )
|
||||||
|
|
||||||
|
Converseen for Batch Processing Images on Linux
|
||||||
|
======
|
||||||
|
|
||||||
|
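Converseen is free and open source software for batch image conversion. With this tool, you can convert multiple images to another format, resize them, change their aspect ratio, and rotate or flip them, all in one go.

It is a handy tool for someone like me who has to deal with many screenshots of different sizes that must be resized before they are uploaded to the website.

A batch conversion tool helps a lot in such cases. This can be done on the Linux command line with the excellent [ImageMagick][1], but a GUI tool is much easier to use here. In fact, Converseen uses ImageMagick underneath its Qt-based graphical user interface.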
|
||||||
|
|
||||||
|
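### Batch processing images with Converseen

You can use [Converseen][2] to convert, resize, rotate, and flip multiple images with a few mouse clicks.

There are plenty of options to support batch conversion. You can add more images to your selection or remove some of them. You can also choose to convert only a few of the selected images.

When resizing images, you can choose to keep the aspect ratio. Keep in mind that, of width and height, whichever one you changed or entered last is the one that controls the aspect ratio. So if you want to resize based on width while keeping the aspect ratio, do not touch the height field.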
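The arithmetic behind an aspect-ratio-preserving resize is simple: the new height is the old height scaled by the same factor as the width. The sketch below illustrates that calculation only; it is not Converseen's code, and the function name `resize_to_width` is hypothetical.

```python
def resize_to_width(width, height, new_width):
    """Return (new_width, new_height) that keeps the original aspect ratio."""
    if width <= 0 or height <= 0 or new_width <= 0:
        raise ValueError("dimensions must be positive")
    # Scale the height by the same factor as the width; never go below 1 pixel.
    new_height = max(1, round(height * new_width / width))
    return new_width, new_height


# A 1920x1080 screenshot resized to a width of 800 pixels.
print(resize_to_width(1920, 1080, 800))
```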
|
||||||
|
|
||||||
|
![][3]
|
||||||
|
|
||||||
|
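You can also choose to save the converted images under a different name, either in the same directory or in another location. You can also overwrite the existing images.

You cannot add folders, but you can select and add multiple images at once.

You can convert images to many formats, such as JPEG, JPG, TIFF, SVG, and more.

While changing the format, there is also an option to give a transparent background a color. You can also set the quality of the compression level.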
|
||||||
|
|
||||||
|
![][4]
|
||||||
|
|
||||||
|
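Converseen can also import PDF files and convert the whole PDF, or part of it, into images. However, on Ubuntu 21.04 it crashed every time I tried to convert a PDF file.

### Installing Converseen on Linux

Converseen is a popular application. It is available in the repositories of most Linux distributions.

You can find it by searching in your distribution's software center: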
|
||||||
|
|
||||||
|
![][5]
|
||||||
|
|
||||||
|
Of course, you can also install it from the command line with your distribution's package manager.

On Debian and Ubuntu-based distributions, use:

```
sudo apt install converseen
```

On Fedora, use:

```
sudo dnf install converseen
```

On Arch and Manjaro, use:

```
sudo pacman -Sy converseen
```
|
||||||
|
|
||||||
|
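Converseen is also available for Windows and FreeBSD. You can find the instructions on the download page of the project website.

[Download Converseen][6]

Its source code is [available][7] in a GitHub repository.

If you are looking for a simpler way to resize a single image, you can use this neat trick to [resize and rotate images from the right-click menu in the Nautilus file manager][8].

Overall, Converseen is a useful GUI tool for batch image conversion. It is not perfect, but it works in most cases. Have you used Converseen, or do you use a similar tool? How has your experience been?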
|
||||||
|
|
||||||
|
--------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
via: https://itsfoss.com/converseen/
|
||||||
|
|
||||||
|
Author: [Abhishek Prakash][a]
Topic selection: [lujun9972][b]
Translator: [geekpi](https://github.com/geekpi)
Proofreader: [proofreader ID](https://github.com/校对者ID)

This article was translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux China](https://linux.cn/)
|
||||||
|
|
||||||
|
[a]: https://itsfoss.com/author/abhishek/
|
||||||
|
[b]: https://github.com/lujun9972
|
||||||
|
[1]: https://imagemagick.org/index.php
|
||||||
|
[2]: https://converseen.fasterland.net/
|
||||||
|
[3]: https://i1.wp.com/itsfoss.com/wp-content/uploads/2021/07/converseen-interface.png?resize=800%2C400&ssl=1
|
||||||
|
[4]: https://i1.wp.com/itsfoss.com/wp-content/uploads/2021/07/converseen-features-overview_copy.png?resize=800%2C497&ssl=1
|
||||||
|
[5]: https://i2.wp.com/itsfoss.com/wp-content/uploads/2021/07/install-converseen-linux.jpeg?resize=800%2C527&ssl=1
|
||||||
|
[6]: https://converseen.fasterland.net/download/
|
||||||
|
[7]: https://github.com/Faster3ck/Converseen
|
||||||
|
[8]: https://itsfoss.com/resize-images-with-right-click/
|