Mirror of https://github.com/LCTT/TranslateProject.git, synced 2025-01-19 22:51:41 +08:00, commit 94147961fd
sources/talk/20180710 How I Fully Quit Google And You Can Too.md
How I Fully Quit Google (And You Can, Too)
|
||||||
|
============================================================
|
||||||
|
|
||||||
|
>My enlightening quest to break free of a tech giant
|
||||||
|
|
||||||
|
Over the past six months, I have gone on a surprisingly tough, time-intensive, and enlightening quest — to quit using, entirely, the products of just one company — Google. What should be a simple task was, in reality, many hours of research and testing. But I did it. Today, I am Google free, part of the western world’s ultimate digital minority, someone who does not use products from the world’s two most valuable technology companies (yes, I don’t use [Facebook either][6]).
|
||||||
|
|
||||||
|
This guide is to show you how I quit the Googleverse, and the alternatives I chose based on my own research and personal needs. I’m not a technologist or a coder, but my work as a journalist requires me to be aware of security and privacy issues.
|
||||||
|
|
||||||
|
I chose all of these alternatives based solely on their merit, usability, cost, and whether or not they had the functionality I desired. My choices are not universal as they reflect my own needs and desires. Nor do they reflect any commercial interests. None of the alternatives listed below paid me or are giving me any commission whatsoever for citing their services.
|
||||||
|
|
||||||
|
### But First: Why?
|
||||||
|
|
||||||
|
Here’s the thing. I don’t hate Google. In fact, not too long ago, I was a huge fan of Google. I remember the moment when I first discovered one amazing search engine back in the late 1990s, when I was still in high school. Google was light years ahead of alternatives such as Yahoo, Altavista, or Ask Jeeves. It really did help users find what they were seeking on a web that was, at that time, a mess of broken websites and terrible indexes.
|
||||||
|
|
||||||
|
Google soon moved from just search to providing other services, many of which I embraced. I was an early adopter of Gmail back in 2005, when you could only join [via invites][7]. It introduced threaded conversations, archiving, labels, and was without question the best email service I had ever used. When Google introduced its Calendar tool in 2006, it was revolutionary in how easy it was to color code different calendars, search for events, and send shareable invites. And Google Docs, launched in 2007, was similarly amazing. During my first full time job, I pushed my team to do everything as a Google spreadsheet, document, or presentation that could be edited by many of us simultaneously.
|
||||||
|
|
||||||
|
Like many, I was a victim of Google creep. Search led to email, to documents, to analytics, photos, and dozens of other services all built on top of and connected to each other. Google turned from a company releasing useful products to one that has ensnared us, and the internet as a whole, into its money-making, data gathering apparatus. Google is pervasive in our digital lives in a way no other corporation is or ever has been. It’s relatively easy to quit using the products of other tech giants. With Apple, you’re either in the iWorld, or out. Same with Amazon, and even Facebook owns only a few platforms and quitting is more of a [psychological challenge][8] than actually difficult.
|
||||||
|
|
||||||
|
Google, however, is embedded everywhere. No matter what laptop, smartphone, or tablet you have, chances are you have at least one Google app on there. Google is synonymous with search, maps, email, our browser, and the operating system on most of our smartphones. It even provides the “[services][9]” and analytics that other apps and websites rely on, such as Uber’s use of Google Maps to operate its ride-hailing service.
|
||||||
|
|
||||||
|
Google is now a word in many languages, and its global dominance means there are not many well-known, or well-used alternatives to its behemoth suite of tools — especially if you are privacy minded. We all started using Google because it, in many ways, provided better alternatives to existing products. But now, we can’t quit because either Google has become a default, or because its dominance means that alternatives can’t get enough traction.
|
||||||
|
|
||||||
|
The truth is, alternatives do exist, many of which have launched in the years since Edward Snowden revealed Google’s participation in [Prism][10]. I embarked on this project late last year. After six months of research, testing, and a lot of trial and error, I was able to find privacy minded alternatives to all the Google products I was using. Some, to my surprise, were even better.
|
||||||
|
|
||||||
|
### A Few Caveats
|
||||||
|
|
||||||
|
One of the biggest challenges to quitting is the fact that most alternatives, particularly those in the open source or privacy space, are really not user friendly. I’m not a techie. I have a website, understand how to manage WordPress, and can do some basic troubleshooting, but I can’t use the command line or do anything that requires coding.
|
||||||
|
|
||||||
|
These alternatives are ones you can easily use with most, if not all, the functionality of their Google counterparts. For some, though, you’ll need your own web host or access to a server.
|
||||||
|
|
||||||
|
Also, [Google Takeout][11] is your friend. Being able to download my entire email history and load it onto my computer to access via Thunderbird means I have easy access to over a decade of emails. The same can be said about Calendar or Docs, the latter of which I converted to ODT format and now keep on my cloud alternative, further detailed below.
|
||||||
|
|
||||||
|
### The Easy Ones
|
||||||
|
|
||||||
|
#### Search
|
||||||
|
|
||||||
|
[DuckDuckGo][12] and [Startpage][13] are both privacy-centric search engines that do not collect any of your search data. Together, they take care of everything I was previously using Google search for.
|
||||||
|
|
||||||
|
_Other alternatives:_ Really not many when Google has 74% global market share, with the remainder mostly due to its being blocked in China. Ask.com is still around. And there’s Bing…
|
||||||
|
|
||||||
|
#### Chrome
|
||||||
|
|
||||||
|
[Mozilla Firefox][14]: it recently got [a big upgrade][15], which is a huge improvement over earlier versions. It’s created by a non-profit foundation that actively works to protect privacy. There’s really no reason at all to use Chrome.
|
||||||
|
|
||||||
|
_Other alternatives:_ Avoid Opera and Vivaldi, as they use Chrome as their base. [Brave][16] is my secondary browser.
|
||||||
|
|
||||||
|
#### Hangouts and Google Chat
|
||||||
|
|
||||||
|
[Jitsi Meet][17] — an open source, free alternative to Google Hangouts. You can use it directly from a browser or download the app. It’s fast, secure, and works on nearly every platform.
|
||||||
|
|
||||||
|
_Other alternatives:_ Zoom has become popular among those in the professional space, but requires you to pay for most features. [Signal][18], an open source, secure messaging app, also has a call function but only on mobile. Avoid Skype, as it’s both a data hog and has a terrible interface.
|
||||||
|
|
||||||
|
#### Google Maps
|
||||||
|
|
||||||
|
Desktop: [Here WeGo][19] — it loads faster and can find nearly everything that Google Maps can. For some reason, they’re missing some countries, like Japan.
|
||||||
|
|
||||||
|
Mobile: [Maps.me][20]. Here Maps was my initial choice here too, but it became less useful once they modified the app to focus on driver navigation. Maps.me is pretty good, and has far better offline functionality than Google, something very useful to a frequent traveler like me.
|
||||||
|
|
||||||
|
_Other alternatives:_ [OpenStreetMap][21] is a project I wholeheartedly support, but its functionality was severely lacking. It couldn’t even find my home address in Oakland.
|
||||||
|
|
||||||
|
### Easy but Not Free
|
||||||
|
|
||||||
|
Some of this was self-inflicted. For example, when looking for an alternative to Gmail, I did not just want to switch to an alternative from another tech giant. That meant no Yahoo Mail or Microsoft Outlook, as that would not address my privacy concerns.
|
||||||
|
|
||||||
|
Remember, the fact that so many of Google’s services are free (not to mention those of its competitors including Facebook) is because they are actively monetizing our data. For alternatives to survive without this level of data monetization, they have to charge us. I am willing to pay to protect my privacy, but do understand that not everyone is able to make this choice.
|
||||||
|
|
||||||
|
Think of it this way: Remember when you used to send letters and had to pay for stamps? Or when you bought weekly planners from the store? Essentially, this is the cost to use a privacy-focused email or calendar app. It’s not that bad.
|
||||||
|
|
||||||
|
#### Gmail
|
||||||
|
|
||||||
|
[ProtonMail][22] — it was founded by former CERN scientists and is based in Switzerland, a country with strong privacy protections. But what really appealed to me about ProtonMail was that it, unlike most other privacy minded email programs, was user friendly. The interface is similar to Gmail, with labels, filters, and folders, and you don’t need to know anything about security or privacy to use it.
|
||||||
|
|
||||||
|
The free version only gives you 500MB of storage space. I opted for a paid 5GB account along with their VPN service.
|
||||||
|
|
||||||
|
_Other alternatives_ : [Fastmail][23] is not as privacy oriented but also has a great interface. There’s also [Hushmail][24] and [Tutanota][25], both with similar features to ProtonMail.
|
||||||
|
|
||||||
|
#### Calendar
|
||||||
|
|
||||||
|
[Fastmail][26] Calendar: this was surprisingly tough, and brings up another issue. Google products have become so ubiquitous in so many spaces that start-ups don’t even bother to create alternatives anymore. After trying a few other mediocre options, I ended up getting a recommendation and chose Fastmail as a dual second-email and calendar option.
|
||||||
|
|
||||||
|
### More Technical
|
||||||
|
|
||||||
|
These require some technical knowledge or access to your web host service. I do include simpler alternatives that I researched but did not end up choosing.
|
||||||
|
|
||||||
|
#### Google Docs, Drive, Photos, and Contacts
|
||||||
|
|
||||||
|
[Nextcloud][27]: a fully featured, secure, open source cloud suite with an intuitive, user-friendly interface. The catch is that you’ll need your own host to use Nextcloud. I already had one for my own website and was able to quickly install Nextcloud using Softaculous on my host’s cPanel. You’ll need an HTTPS certificate, which I got for free from [Let’s Encrypt][28]. Not as easy as opening a Google Drive account but not too challenging either.
|
||||||
|
|
||||||
|
I also use Nextcloud as an alternative for Google’s photo storage and contacts, which I sync with my phone using CardDAV.
|
||||||
|
|
||||||
|
_Other alternatives:_ There are other open source options such as [ownCloud][29] or [OpenStack][30]. Some for-profit options are good too, as top choices Dropbox and Box are independent entities that don’t profit off of your data.
|
||||||
|
|
||||||
|
#### Google Analytics
|
||||||
|
|
||||||
|
[Matomo][31]: formerly called Piwik, this is a self-hosted analytics platform. While not as feature rich as Google Analytics, it is plenty fine for understanding basic website traffic, with the added bonus that you aren’t gifting that traffic data to Google.
|
||||||
|
|
||||||
|
_Other alternatives:_ Not much really. [OpenWebAnalytics][32] is another open source option, and there are some for-profit alternatives too, such as GoStats and Clicky.
|
||||||
|
|
||||||
|
#### Android
|
||||||
|
|
||||||
|
[LineageOS][33] + [F-Droid App Store][34]. Sadly, the smartphone world has become a literal duopoly, with Google’s Android and Apple’s iOS controlling the entire market. The few usable alternatives that existed a few years ago, such as Blackberry OS or Mozilla’s Firefox OS, are no longer being maintained.
|
||||||
|
|
||||||
|
So the next best option is Lineage OS: a privacy minded, open source version of Android that can be installed without Google services or Apps. It requires some technical knowledge as the installation process is not completely straightforward, but it works really well, and lacks the bloatware that comes with most Android installations.
|
||||||
|
|
||||||
|
_Other alternatives:_ Ummm…Windows 10 Mobile? [PureOS][35] looks promising, as does [Ubuntu Touch][36].
|
||||||
|
|
||||||
|
### Unexpected Challenges
|
||||||
|
|
||||||
|
Firstly, this took much longer than I planned due to the lack of good resources about usable alternatives, and the challenge in moving data from Google to other platforms.
|
||||||
|
|
||||||
|
But the toughest thing was email, and it has nothing to do with ProtonMail or Google.
|
||||||
|
|
||||||
|
Before I joined Gmail in 2004, I probably switched emails once a year. My first account was with Hotmail, and I then used Mail.com, Yahoo Mail, and long-forgotten services like Bigfoot. I never recall having an issue when I changed email providers. I would just tell all my friends to update their address books and change the email address on other web accounts. It used to be necessary to change email addresses regularly — remember how spam would take over older inboxes?
|
||||||
|
|
||||||
|
In fact, one of Gmail’s best innovations was its ability to filter out spam. That meant no longer needing to change emails.
|
||||||
|
|
||||||
|
Email is key to using the internet. You need it to open a Facebook account, to use online banking, to post on message boards, and much more. So when you switch accounts, you need to update your email address on all these different services.
|
||||||
|
|
||||||
|
To my surprise, changing from Gmail today is a major hassle because of all the places that require email addresses to set up an account. Several sites no longer let you do it from the backend on your own. One service actually required me to close my account and open a new one as they were unable to change my email, and then they transferred over my account data manually. Others forced me to call customer service and request an email account change, meaning time wasted on hold.
|
||||||
|
|
||||||
|
Even more amazingly, others accepted my change, and then continued to send messages to my old Gmail account, requiring another phone call. Others were even more annoying, sending some messages to my new email, but still using my old account for other emails. This became such a cumbersome process that I ended up leaving my Gmail account open for several months alongside my new ProtonMail account just to make sure important emails did not get lost. This was the main reason this took me six months.
|
||||||
|
|
||||||
|
People so rarely change their emails these days that most companies’ platforms are not designed to deal with the possibility. It’s a telling sign of the sad state of the web today that it was easier to change your email back in 2002 than it is in 2018. Technology does not always move forward.
|
||||||
|
|
||||||
|
### So, Are These Google Alternatives Any Good?
|
||||||
|
|
||||||
|
Some are actually better! Jitsi Meet runs smoother, requires less bandwidth, and is more platform friendly than Hangouts. Firefox is more stable and less of a memory suck than Chrome. Fastmail’s Calendar has far better time zone integration.
|
||||||
|
|
||||||
|
Others are adequate equivalents. ProtonMail has most of the features of Gmail but lacks some useful integrations, such as the Boomerang email scheduler I was using before. Its Contacts interface is also lacking, but I’m using Nextcloud for that. Speaking of Nextcloud, it’s great for hosting files and contacts, and has a nifty notes tool (and lots of other plug-ins). But it does not have the rich multi-editing features of Google Docs. I’ve not yet found a workable alternative in my budget. There is Collabora Office, but it requires me to upgrade my server, something that is not feasible for me.
|
||||||
|
|
||||||
|
Some depend on location. Maps.me is actually better than Google Maps in some countries (such as Indonesia) and far worse in others (including America).
|
||||||
|
|
||||||
|
Others require me to sacrifice some features or functionality. Piwik is a poor man’s Google Analytics, and lacks many of the detailed reports or search functions of the latter. DuckDuckGo is fine for general searches but has issues with specific searches, and both it and Startpage sometimes fail when I’m searching for non-English language content.
|
||||||
|
|
||||||
|
### In the End, I Don’t Miss Google at All
|
||||||
|
|
||||||
|
In fact, I feel liberated. To be so dependent on a single company for so many products is a form of servitude, especially when your data is what you’re often paying with. Moreover, many of these alternatives are, in fact, better. And there is real comfort in knowing you are in control of your data.
|
||||||
|
|
||||||
|
If we have no choice but to use Google products, then we lose what little power we have as consumers.
|
||||||
|
|
||||||
|
I want Google, Facebook, Apple, and other tech giants to stop taking users for granted, to stop trying to force us inside their all-encompassing ecosystems. I also want new players to be able to emerge and compete, just as, once upon a time, Google’s new search tool could compete with the then-industry giants Altavista and Yahoo, or Facebook’s social network was able to compete with MySpace and Friendster. The internet was a better place because Google gave us the opportunity to have a better search. Choice is good. As is portability.
|
||||||
|
|
||||||
|
Today, few of us even try other products because we’re just so used to Googling. We don’t change emails because it’s hard. We don’t even try to use a Facebook alternative because all of our friends are on Facebook. I understand.
|
||||||
|
|
||||||
|
You don’t have to quit Google entirely. But give other alternatives a chance. You might be surprised, and remember why you loved the web way back when.
|
||||||
|
|
||||||
|
* * *
|
||||||
|
|
||||||
|
#### Other Resources
|
||||||
|
|
||||||
|
I created this resource not to be an all-encompassing guide but a story of how I was able to quit Google. Here are some resources that show other alternatives. Some are far too technical for me, and others I just didn’t have time to explore.
|
||||||
|
|
||||||
|
* [Localization Lab][2] has a detailed list of open source or privacy-tech projects — some highly technical, others quite user friendly.
|
||||||
|
|
||||||
|
* [Framasoft][3] has an entire suite of mostly open-source Google alternatives, though many are just in French.
|
||||||
|
|
||||||
|
* Restore Privacy has also [collected a list of alternatives][4].
|
||||||
|
|
||||||
|
Your turn. Please share your favorite Google alternatives in the responses or via Twitter. I am sure there are many that I missed and would love to try. I don’t plan to stick with the alternatives listed above forever.
|
||||||
|
|
||||||
|
--------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
About the author:
|
||||||
|
|
||||||
|
Nithin Coca
|
||||||
|
|
||||||
|
Freelance journalist covering politics, environment & human rights + social impacts of tech globally. For more http://www.nithincoca.com
|
||||||
|
|
||||||
|
--------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
via: https://medium.com/s/story/how-i-fully-quit-google-and-you-can-too-4c2f3f85793a
|
||||||
|
|
||||||
|
Author: [Nithin Coca][a]
|
||||||
|
Translator: [translator ID](https://github.com/译者ID)
|
||||||
|
Proofreader: [proofreader ID](https://github.com/校对者ID)
|
||||||
|
|
||||||
|
This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国 (Linux China)](https://linux.cn/)
|
||||||
|
|
||||||
|
[a]:https://medium.com/@excinit
|
||||||
|
[1]:https://medium.com/@excinit
|
||||||
|
[2]:https://www.localizationlab.org/projects/
|
||||||
|
[3]:https://framasoft.org/?l=en
|
||||||
|
[4]:https://restoreprivacy.com/google-alternatives/
|
||||||
|
[5]:https://medium.com/@excinit
|
||||||
|
[6]:https://www.nithincoca.com/2011/11/20/7-months-no-facebook/
|
||||||
|
[7]:https://www.quora.com/How-long-was-Gmail-in-private-%28invitation-only%29-beta
|
||||||
|
[8]:https://www.theverge.com/2018/4/28/17293056/facebook-deletefacebook-social-network-monopoly
|
||||||
|
[9]:https://en.wikipedia.org/wiki/Google_Play_Services
|
||||||
|
[10]:https://www.theguardian.com/world/2013/jun/06/us-tech-giants-nsa-data
|
||||||
|
[11]:https://takeout.google.com/settings/takeout
|
||||||
|
[12]:https://duckduckgo.com/
|
||||||
|
[13]:https://www.startpage.com/
|
||||||
|
[14]:https://www.mozilla.org/en-US/firefox/new/
|
||||||
|
[15]:https://www.seattletimes.com/business/firefox-is-back-and-its-time-to-give-it-a-try/
|
||||||
|
[16]:https://brave.com/
|
||||||
|
[17]:https://jitsi.org/jitsi-meet/
|
||||||
|
[18]:https://signal.org/
|
||||||
|
[19]:https://wego.here.com/
|
||||||
|
[20]:https://maps.me/
|
||||||
|
[21]:https://www.openstreetmap.org/
|
||||||
|
[22]:https://protonmail.com/
|
||||||
|
[23]:https://www.fastmail.com/
|
||||||
|
[24]:https://www.hushmail.com/
|
||||||
|
[25]:https://tutanota.com/
|
||||||
|
[26]:https://www.fastmail.com/
|
||||||
|
[27]:https://nextcloud.com/
|
||||||
|
[28]:https://letsencrypt.org/
|
||||||
|
[29]:https://owncloud.org/
|
||||||
|
[30]:https://www.openstack.org/
|
||||||
|
[31]:https://matomo.org/
|
||||||
|
[32]:http://www.openwebanalytics.com/
|
||||||
|
[33]:https://lineageos.org/
|
||||||
|
[34]:https://f-droid.org/en/
|
||||||
|
[35]:https://puri.sm/posts/tag/pureos/
|
||||||
|
[36]:https://ubports.com/
|
sources/tech/20180609 Anatomy of a Linux DNS Lookup – Part I.md
Anatomy of a Linux DNS Lookup – Part I
|
||||||
|
============================================================
|
||||||
|
|
||||||
|
Since I [work][3] [a][4] [lot][5] [with][6] [clustered][7] [VMs][8], I’ve ended up spending a lot of time trying to figure out how [DNS lookups][9] work. I applied ‘fixes’ to my problems from StackOverflow without really understanding why they work (or don’t work) for some time.
|
||||||
|
|
||||||
|
Eventually I got fed up with this and decided to figure out how it all hangs together. I couldn’t find a complete guide for this anywhere online, and the colleagues I talked to didn’t know of any (or really what happens in detail).
|
||||||
|
|
||||||
|
So I’m writing the guide myself.
|
||||||
|
|
||||||
|
_If you’re looking for Part II, click [here][1]_
|
||||||
|
|
||||||
|
Turns out there’s quite a bit in the phrase ‘Linux does a DNS lookup’…
|
||||||
|
|
||||||
|
* * *
|
||||||
|
|
||||||
|
![linux-dns-0](https://zwischenzugs.files.wordpress.com/2018/06/linux-dns-0.png?w=121)
|
||||||
|
|
||||||
|
_“How hard can it be?”_
|
||||||
|
|
||||||
|
* * *
|
||||||
|
|
||||||
|
These posts are intended to break down how a program decides how it gets an IP address on a Linux host, and the components that can get involved. Without understanding how these pieces fit together, debugging and fixing problems with (for example) `dnsmasq`, `vagrant landrush`, or `resolvconf` can be utterly bewildering.
|
||||||
|
|
||||||
|
It’s also a valuable illustration of how something so simple can get so very complex over time. I’ve looked at over a dozen different technologies and their archaeologies so far while trying to grok what’s going on.
|
||||||
|
|
||||||
|
I even wrote some [automation code][10] to allow me to experiment in a VM. Contributions/corrections are welcome.
|
||||||
|
|
||||||
|
Note that this is not a post on ‘how DNS works’. This is about everything up to the call to the actual DNS server that’s configured on a Linux host (assuming it even calls a DNS server – as you’ll see, it need not), and how it might find out which one to go to, or how it gets the IP some other way.
|
||||||
|
|
||||||
|
* * *
|
||||||
|
|
||||||
|
### 1) There is no such thing as a ‘DNS Lookup’ call
|
||||||
|
|
||||||
|
* * *
|
||||||
|
|
||||||
|
![linux-dns-1](https://zwischenzugs.files.wordpress.com/2018/06/linux-dns-1.png?w=121)
|
||||||
|
|
||||||
|
_This is NOT how it works_
|
||||||
|
|
||||||
|
* * *
|
||||||
|
|
||||||
|
The first thing to grasp is that there is no single method of getting a DNS lookup done on Linux. It’s not a core system call with a clean interface.
|
||||||
|
|
||||||
|
There is, however, a standard C library call which many programs use: [`getaddrinfo`][2]. But not all applications use this!
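As a rough illustration of what going through `getaddrinfo` looks like from a program's point of view, here is a minimal Python sketch. Python's `socket.getaddrinfo` wraps the C library call on Linux; the helper name `resolve` is made up for this example and is not something from the post.

```python
import socket


def resolve(name):
    """Return the unique IPv4 addresses that getaddrinfo reports for `name`.

    `resolve` is a hypothetical helper for illustration only.
    """
    infos = socket.getaddrinfo(
        name, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    addresses = []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4 the sockaddr is an (ip, port) pair.
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]
        if ip not in addresses:
            addresses.append(ip)
    return addresses


if __name__ == "__main__":
    # A numeric address needs no lookup at all, so this works offline.
    print(resolve("127.0.0.1"))
```

Passing a hostname such as `bbc.co.uk` instead of a numeric address sends the request through whatever lookup machinery the C library is configured to use on that host.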
|
||||||
|
|
||||||
|
Let’s just take two simple standard programs: `ping` and `host`:
|
||||||
|
|
||||||
|
```
root@linuxdns1:~# ping -c1 bbc.co.uk | head -1
PING bbc.co.uk (151.101.192.81) 56(84) bytes of data.
```

```
root@linuxdns1:~# host bbc.co.uk | head -1
bbc.co.uk has address 151.101.192.81
```
|
||||||
|
|
||||||
|
They both get the same result, so they must be doing the same thing, right?
|
||||||
|
|
||||||
|
Wrong.
|
||||||
|
|
||||||
|
Here’s the files that `ping` looks at on my host that are relevant to DNS:
|
||||||
|
|
||||||
|
```
root@linuxdns1:~# strace -e trace=open -f ping -c1 google.com
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
open("/lib/x86_64-linux-gnu/libcap.so.2", O_RDONLY|O_CLOEXEC) = 3
open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
open("/etc/resolv.conf", O_RDONLY|O_CLOEXEC) = 4
open("/etc/resolv.conf", O_RDONLY|O_CLOEXEC) = 4
open("/etc/nsswitch.conf", O_RDONLY|O_CLOEXEC) = 4
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 4
open("/lib/x86_64-linux-gnu/libnss_files.so.2", O_RDONLY|O_CLOEXEC) = 4
open("/etc/host.conf", O_RDONLY|O_CLOEXEC) = 4
open("/etc/hosts", O_RDONLY|O_CLOEXEC) = 4
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 4
open("/lib/x86_64-linux-gnu/libnss_dns.so.2", O_RDONLY|O_CLOEXEC) = 4
open("/lib/x86_64-linux-gnu/libresolv.so.2", O_RDONLY|O_CLOEXEC) = 4
PING google.com (216.58.204.46) 56(84) bytes of data.
open("/etc/hosts", O_RDONLY|O_CLOEXEC) = 4
64 bytes from lhr25s12-in-f14.1e100.net (216.58.204.46): icmp_seq=1 ttl=63 time=13.0 ms
[...]
```
|
||||||
|
|
||||||
|
and the same for `host`:
|
||||||
|
|
||||||
|
```
$ strace -e trace=open -f host google.com
[...]
[pid 9869] open("/usr/share/locale/en_US.UTF-8/LC_MESSAGES/libdst.cat", O_RDONLY) = -1 ENOENT (No such file or directory)
[pid 9869] open("/usr/share/locale/en/libdst.cat", O_RDONLY) = -1 ENOENT (No such file or directory)
[pid 9869] open("/usr/share/locale/en/LC_MESSAGES/libdst.cat", O_RDONLY) = -1 ENOENT (No such file or directory)
[pid 9869] open("/usr/lib/ssl/openssl.cnf", O_RDONLY) = 6
[pid 9869] open("/usr/lib/x86_64-linux-gnu/openssl-1.0.0/engines/libgost.so", O_RDONLY|O_CLOEXEC) = 6
[pid 9869] open("/etc/resolv.conf", O_RDONLY) = 6
google.com has address 216.58.204.46
[...]
```
|
||||||
|
|
||||||
|
You can see that while my `ping` looks at `nsswitch.conf`, `host` does not. And they both look at `/etc/resolv.conf`.
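To make that comparison less of an eyeball exercise, a small Python sketch can pull the DNS-related file names out of saved `strace` output. The function name, the list of "interesting" files, and the trimmed sample traces are choices made for this illustration; the regular expression also accepts the `openat(AT_FDCWD, ...)` form that newer systems tend to print, which is an assumption beyond what the traces above show.

```python
import re

# Files the post treats as relevant to name resolution (an illustrative list).
DNS_FILES = ("/etc/resolv.conf", "/etc/nsswitch.conf", "/etc/host.conf", "/etc/hosts")

# Matches open("path", ...) and openat(AT_FDCWD, "path", ...).
OPEN_RE = re.compile(r'open(?:at)?\((?:AT_FDCWD, )?"([^"]+)"')


def dns_files_opened(strace_output):
    """Return the DNS-related files seen in strace output, in first-seen order."""
    seen = []
    for match in OPEN_RE.finditer(strace_output):
        path = match.group(1)
        if path in DNS_FILES and path not in seen:
            seen.append(path)
    return seen


# Trimmed-down copies of the traces shown above.
PING_TRACE = '''open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
open("/etc/resolv.conf", O_RDONLY|O_CLOEXEC) = 4
open("/etc/nsswitch.conf", O_RDONLY|O_CLOEXEC) = 4
open("/etc/hosts", O_RDONLY|O_CLOEXEC) = 4
'''
HOST_TRACE = '[pid 9869] open("/etc/resolv.conf", O_RDONLY) = 6\n'

if __name__ == "__main__":
    print("ping:", dns_files_opened(PING_TRACE))
    print("host:", dns_files_opened(HOST_TRACE))
```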
|
||||||
|
|
||||||
|
We’re going to take these two `.conf` files in turn.
|
||||||
|
|
||||||
|
* * *
|
||||||
|
|
||||||
|
### 2) NSSwitch, and `/etc/nsswitch.conf`
|
||||||
|
|
||||||
|
We’ve established that applications can do what they like when they decide which DNS server to go to. Many apps (like `ping` above) can refer (depending on the implementation (*)) to NSSwitch via its config file `/etc/nsswitch.conf`.
|
||||||
|
|
||||||
|
###### (*) There’s a surprising degree of variation in ping implementations. That’s a rabbit-hole I _didn’t_ want to get lost in.
|
||||||
|
|
||||||
|
NSSwitch is not just for DNS lookups. It’s also used for passwords and user lookup information (for example).
|
||||||
|
|
||||||
|
NSSwitch was originally created as part of the Solaris OS to allow applications to not have to hard-code which file or service they look these things up on, but defer them to this other configurable centralised place they didn’t have to worry about.
|
||||||
|
|
||||||
|
Here’s my `nsswitch.conf`:
|
||||||
|
|
||||||
|
```
passwd:         compat
group:          compat
shadow:         compat
gshadow:        files
hosts:          files dns myhostname
networks:       files
protocols:      db files
services:       db files
ethers:         db files
rpc:            db files
netgroup:       nis
```
|
||||||
|
|
||||||
|
The ‘hosts’ line is the one we’re interested in. We’ve shown that `ping` cares about `nsswitch.conf` so let’s fiddle with it and see how we can mess with `ping`.
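For readers who want to inspect that line programmatically, here is a minimal Python sketch that splits `nsswitch.conf`-style text into a mapping from database name to its list of sources. The function name is invented for this example, and the parser is deliberately naive: it treats everything after the colon as whitespace-separated tokens and does not interpret action items such as `[NOTFOUND=return]`.

```python
def parse_nsswitch(text):
    """Map each database name (e.g. 'hosts') to its ordered list of sources."""
    databases = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if not line or ":" not in line:
            continue
        name, sources = line.split(":", 1)
        databases[name.strip()] = sources.split()
    return databases


# A shortened copy of the file shown above.
SAMPLE = """\
passwd:         compat
hosts:          files dns myhostname
networks:       files
"""

if __name__ == "__main__":
    print(parse_nsswitch(SAMPLE)["hosts"])
```

On a real host the same function could be fed the contents of `/etc/nsswitch.conf` instead of the sample string.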
|
||||||
|
|
||||||
|
* ### Set `nsswitch.conf` to only look at ‘files’
|
||||||
|
|
||||||
|
If you set the `hosts` line in `nsswitch.conf` to be ‘just’ `files`:
|
||||||
|
|
||||||
|
`hosts: files`
|
||||||
|
|
||||||
|
Then a `ping` to google.com will now fail:
|
||||||
|
|
||||||
|
```
$ ping -c1 google.com
ping: unknown host google.com
```
|
||||||
|
|
||||||
|
but `localhost` still works:
|
||||||
|
|
||||||
|
```
$ ping -c1 localhost
PING localhost (127.0.0.1) 56(84) bytes of data.
64 bytes from localhost (127.0.0.1): icmp_seq=1 ttl=64 time=0.039 ms
```
|
||||||
|
|
||||||
|
and using `host` still works fine:
|
||||||
|
|
||||||
|
```
$ host google.com
google.com has address 216.58.206.110
```
|
||||||
|
|
||||||
|
since, as we saw, it doesn’t care about `nsswitch.conf`.
|
||||||
|
|
||||||
|
* ### Set `nsswitch.conf` to only look at ‘dns’
|
||||||
|
|
||||||
|
If you set the `hosts` line in `nsswitch.conf` to be ‘just’ `dns`:
|
||||||
|
|
||||||
|
`hosts: dns`
|
||||||
|
|
||||||
|
Then a `ping` to google.com will now succeed again:
|
||||||
|
|
||||||
|
```
$ ping -c1 google.com
PING google.com (216.58.198.174) 56(84) bytes of data.
64 bytes from lhr25s10-in-f174.1e100.net (216.58.198.174): icmp_seq=1 ttl=63 time=8.01 ms
```
|
||||||
|
|
||||||
|
But `localhost` is not found this time:
|
||||||
|
|
||||||
|
```
$ ping -c1 localhost
ping: unknown host localhost
```
|
||||||
|
|
||||||
|
Here’s a diagram of what’s going on with NSSwitch by default wrt `hosts` lookup:
|
||||||
|
|
||||||
|
* * *
|
||||||
|
|
||||||
|
![linux-dns-2 (1)](https://zwischenzugs.files.wordpress.com/2018/06/linux-dns-2-11.png?w=525)
|
||||||
|
|
||||||
|
_My default ‘`hosts:`‘ configuration in `nsswitch.conf`_
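The behaviour seen in the two experiments above can be mimicked with a toy model: walk the configured sources in order and stop at the first one that knows the name. This is only a sketch for intuition. The two dictionaries are stand-ins for `/etc/hosts` and a DNS server, `myhostname` is left as a source that knows nothing, and the real NSS machinery has status codes and actions that this ignores.

```python
# Stand-ins for the real data sources (illustrative values only).
HOSTS_FILE = {"localhost": "127.0.0.1"}      # plays the role of /etc/hosts
FAKE_DNS = {"google.com": "216.58.204.46"}   # plays the role of a DNS server


def lookup(name, sources):
    """Try each source in order; return the first address found, else None."""
    backends = {"files": HOSTS_FILE, "dns": FAKE_DNS}
    for source in sources:
        table = backends.get(source, {})  # unknown sources know no names
        if name in table:
            return table[name]
    return None


if __name__ == "__main__":
    for sources in (["files"], ["dns"], ["files", "dns", "myhostname"]):
        print(sources, lookup("google.com", sources), lookup("localhost", sources))
```

With `["files"]` only `localhost` resolves, with `["dns"]` only `google.com` does, and with the default order both do, which matches the `ping` results above.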
|
||||||
|
|
||||||
|
* * *
|
||||||
|
|
||||||
|
### 3) `/etc/resolv.conf`
|
||||||
|
|
||||||
|
We’ve seen now that `host` and `ping` both look at this `/etc/resolv.conf` file.
|
||||||
|
|
||||||
|
Here’s what my `/etc/resolv.conf` looks like:
|
||||||
|
|
||||||
|
```
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
# DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 10.0.2.3
```
|
||||||
|
|
||||||
|
Ignore the first two lines – we’ll come back to those (they are significant, but you’re not ready for that ball of wool yet).
|
||||||
|
|
||||||
|
The `nameserver` lines specify the DNS servers to query when looking up a host.
|
||||||
|
|
||||||
|
If you hash out that line:
|
||||||
|
|
||||||
|
```
|
||||||
|
#nameserver 10.0.2.3
|
||||||
|
```
|
||||||
|
|
||||||
|
and run:
|
||||||
|
|
||||||
|
```
|
||||||
|
$ ping -c1 google.com
|
||||||
|
ping: unknown host google.com
|
||||||
|
```
|
||||||
|
|
||||||
|
it fails, because there’s no nameserver to go to (*).
|
||||||
|
|
||||||
|
###### * Another rabbit hole: `host` appears to fall back to 127.0.0.1:53 if there’s no nameserver specified.
|
||||||
|
|
||||||
|
This file takes other options too. For example, if you add this line to the `resolv.conf` file:
|
||||||
|
|
||||||
|
```
|
||||||
|
search com
|
||||||
|
```
|
||||||
|
|
||||||
|
and then `ping google` (sic)
|
||||||
|
|
||||||
|
```
|
||||||
|
$ ping google
|
||||||
|
PING google.com (216.58.204.14) 56(84) bytes of data.
|
||||||
|
```
|
||||||
|
|
||||||
|
it will try the `.com` domain automatically for you.
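If you want to see at a glance what a `resolv.conf` provides, the relevant entries are easy to extract. This is only a sketch, run against a temporary sample file whose values mirror the examples above; the file and variable names are made up for illustration:

```shell
# Build a resolv.conf-style sample (values mirror the examples above)
sample=$(mktemp)
printf '%s\n' '# generated file' 'nameserver 10.0.2.3' 'search com' > "$sample"

# Active nameserver entries; hashed-out lines do not match because $1 differs
nameservers=$(awk '$1 == "nameserver" { print $2 }' "$sample")

# Domains that get appended to short names such as "google"
search_domains=$(awk '$1 == "search" { $1 = ""; sub(/^ +/, ""); print }' "$sample")

echo "nameservers: $nameservers"
echo "search: $search_domains"

rm -f "$sample"
```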
|
||||||
|
|
||||||
|
### End of Part I
|
||||||
|
|
||||||
|
That’s the end of Part I. The next part will start by looking at how that resolv.conf gets created and updated.
|
||||||
|
|
||||||
|
Here’s what you covered above:
|
||||||
|
|
||||||
|
* There’s no ‘DNS lookup’ call in the OS
|
||||||
|
|
||||||
|
* Different programs figure out the IP of an address in different ways
|
||||||
|
* For example, `ping` uses `nsswitch`, which in turn uses (or can use) `/etc/hosts`, `/etc/resolv.conf` and its own hostname to get the result
|
||||||
|
|
||||||
|
* `/etc/resolv.conf` helps decide:
|
||||||
|
* which addresses get called
|
||||||
|
|
||||||
|
* which DNS server to look up
|
||||||
|
|
||||||
|
If you thought that was complicated, buckle up…
|
||||||
|
|
||||||
|
--------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
via: https://zwischenzugs.com/2018/06/08/anatomy-of-a-linux-dns-lookup-part-i/
|
||||||
|
|
||||||
|
Author: [dmatech][a]
|
||||||
|
Translator: [Translator ID](https://github.com/译者ID)
|
||||||
|
Proofreader: [Proofreader ID](https://github.com/校对者ID)
|
||||||
|
|
||||||
|
This article was originally compiled and translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux China](https://linux.cn/)
|
||||||
|
|
||||||
|
[a]:https://twitter.com/dmatech2
|
||||||
|
[1]:https://zwischenzugs.com/2018/06/18/anatomy-of-a-linux-dns-lookup-part-ii/
|
||||||
|
[2]:http://man7.org/linux/man-pages/man3/getaddrinfo.3.html
|
||||||
|
[3]:https://zwischenzugs.com/2017/10/31/a-complete-chef-infrastructure-on-your-laptop/
|
||||||
|
[4]:https://zwischenzugs.com/2017/03/04/a-complete-openshift-cluster-on-vagrant-step-by-step/
|
||||||
|
[5]:https://zwischenzugs.com/2017/03/04/migrating-an-openshift-etcd-cluster/
|
||||||
|
[6]:https://zwischenzugs.com/2017/03/04/1-minute-multi-node-vm-setup/
|
||||||
|
[7]:https://zwischenzugs.com/2017/03/18/clustered-vm-testing-how-to/
|
||||||
|
[8]:https://zwischenzugs.com/2017/10/27/ten-things-i-wish-id-known-before-using-vagrant/
|
||||||
|
[9]:https://zwischenzugs.com/2017/10/21/openshift-3-6-dns-in-pictures/
|
||||||
|
[10]:https://github.com/ianmiell/shutit-linux-dns/blob/master/linux_dns.py
|
@ -1,3 +1,4 @@
|
|||||||
|
[Moelf](https://github.com/Moelf) Translating
|
||||||
Why is Arch Linux So Challenging and What are Its Pros & Cons?
|
Why is Arch Linux So Challenging and What are Its Pros & Cons?
|
||||||
======
|
======
|
||||||
|
|
||||||
|
@ -1,3 +1,5 @@
|
|||||||
|
translating---geekpi
|
||||||
|
|
||||||
Revisiting wallabag, an open source alternative to Instapaper
|
Revisiting wallabag, an open source alternative to Instapaper
|
||||||
======
|
======
|
||||||
|
|
||||||
|
@ -0,0 +1,543 @@
|
|||||||
|
FSSlc is translating
|
||||||
|
|
||||||
|
4 Essential and Practical Usage of Cut Command in Linux
|
||||||
|
============================================================
|
||||||
|
|
||||||
|
The cut command is the canonical tool to remove “columns” from a text file. In this context, a “column” can be defined as a range of characters or bytes identified by their physical position on the line, or a range of fields delimited by a separator.
|
||||||
|
|
||||||
|
I have written about [using AWK commands][13] earlier. In this detailed guide, I’ll explain four essential and practical examples of cut command in Linux that will help you big time.
|
||||||
|
|
||||||
|
![Cut Linux command examples](https://i1.wp.com/linuxhandbook.com/wp-content/uploads/2018/07/cut-command-linux.jpeg?resize=702%2C395&ssl=1)
|
||||||
|
|
||||||
|
### 4 Practical examples of Cut command in Linux
|
||||||
|
|
||||||
|
If you prefer, you can watch this video explaining the same practical examples of cut command that I have listed in the article.
|
||||||
|
|
||||||
|
|
||||||
|
Table of Contents:
|
||||||
|
|
||||||
|
* [Working with character ranges][8]
|
||||||
|
* [What’s a range?][1]
|
||||||
|
|
||||||
|
* [Working with byte ranges][9]
|
||||||
|
* [Working with multibyte characters][2]
|
||||||
|
|
||||||
|
* [Working with fields][10]
|
||||||
|
* [Handling lines not containing the delimiter][3]
|
||||||
|
|
||||||
|
* [Changing the output delimiter][4]
|
||||||
|
|
||||||
|
* [Non-POSIX GNU extensions][11]
|
||||||
|
|
||||||
|
### 1\. Working with character ranges
|
||||||
|
|
||||||
|
When invoked with the `-c` command line option, the cut command will remove character ranges.
|
||||||
|
|
||||||
|
Like any other filter, the cut command does not change the input file in place but it will copy the modified data to its standard output. It is your responsibility to redirect the command output to a file to save the result or to use a pipe to send it as input to another command.
|
||||||
|
|
||||||
|
If you’ve downloaded the [sample test files][26] used in the video above, you can see the `BALANCE.txt` data file, coming straight out of the accounting software my wife uses at work:
|
||||||
|
|
||||||
|
```
|
||||||
|
sh$ head BALANCE.txt
|
||||||
|
ACCDOC ACCDOCDATE ACCOUNTNUM ACCOUNTLIB ACCDOCLIB DEBIT CREDIT
|
||||||
|
4 1012017 623477 TIDE SCHEDULE ALNEENRE-4701-LOC 00000001615,00
|
||||||
|
4 1012017 445452 VAT BS/ENC ALNEENRE-4701-LOC 00000000323,00
|
||||||
|
4 1012017 4356 PAYABLES ALNEENRE-4701-LOC 00000001938,00
|
||||||
|
5 1012017 623372 ACCOMODATION GUIDE ALNEENRE-4771-LOC 00000001333,00
|
||||||
|
5 1012017 445452 VAT BS/ENC ALNEENRE-4771-LOC 00000000266,60
|
||||||
|
5 1012017 4356 PAYABLES ALNEENRE-4771-LOC 00000001599,60
|
||||||
|
6 1012017 4356 PAYABLES FACT FA00006253 - BIT QUIROBEN 00000001837,20
|
||||||
|
6 1012017 445452 VAT BS/ENC FACT FA00006253 - BIT QUIROBEN 00000000306,20
|
||||||
|
6 1012017 623795 TOURIST GUIDE BOOK FACT FA00006253 - BIT QUIROBEN 00000001531,00
|
||||||
|
```
|
||||||
|
|
||||||
|
This is a fixed-width text file since the data fields are padded with a variable number of spaces to ensure they are displayed as a nicely aligned table.
|
||||||
|
|
||||||
|
As a corollary, a data column always starts and ends at the same character position on each line. There is a little pitfall though: despite its name, the `cut` command actually requires you to specify the range of data you want to _keep_, not the range you want to _remove_. So, if I need _only_ the `ACCOUNTNUM` and `ACCOUNTLIB` columns in the data file above, I would write:
|
||||||
|
|
||||||
|
```
|
||||||
|
sh$ cut -c 25-59 BALANCE.txt | head
|
||||||
|
ACCOUNTNUM ACCOUNTLIB
|
||||||
|
623477 TIDE SCHEDULE
|
||||||
|
445452 VAT BS/ENC
|
||||||
|
4356 PAYABLES
|
||||||
|
623372 ACCOMODATION GUIDE
|
||||||
|
445452 VAT BS/ENC
|
||||||
|
4356 PAYABLES
|
||||||
|
4356 PAYABLES
|
||||||
|
445452 VAT BS/ENC
|
||||||
|
623795 TOURIST GUIDE BOOK
|
||||||
|
```
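If you don’t have the sample files at hand, the same idea can be tried on a tiny fixed-width table. This is a sketch with made-up data: `%-10s` pads each column to 10 characters, so the second column occupies characters 11 to 20.

```shell
# Two-row, three-column fixed-width table; each column is 10 characters wide
table=$(printf '%-10s%-10s%-10s\n' ID NAME CITY 1 alice paris)

# Keep only the second column (characters 11 through 20, padding included)
second=$(printf '%s\n' "$table" | cut -c 11-20)
printf '%s\n' "$second"
```

Note that the padding spaces are part of the range, so each output line is still 10 characters wide.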
|
||||||
|
|
||||||
|
#### What’s a range?
|
||||||
|
|
||||||
|
As we have just seen, the cut command requires that we specify the _range_ of data we want to keep. So, let’s define more formally what a range is: for the `cut` command, a range is defined by a starting and an ending position separated by a hyphen. Ranges are 1-based, that is, the first item of the line is item number 1, not 0\. Ranges are inclusive: the start and end will be preserved in the output, as well as all characters between them. It is an error to specify a range whose ending position is before (“lower than”) its starting position. As a shortcut, you can omit the start _or_ end value as described in the table below:
|
||||||
|
|
||||||
|
|
||||||
|
| Range | Meaning |
|--|--|
| `a-b` | the range between a and b (inclusive) |
| `a` | equivalent to the range `a-a` |
| `-b` | equivalent to `1-b` |
| `b-` | equivalent to `b-∞` |
|
||||||
|
|
||||||
|
The cut command allows you to specify several ranges by separating them with a comma. Here are a couple of examples:
|
||||||
|
|
||||||
|
```
|
||||||
|
# Keep characters from 1 to 24 (inclusive)
|
||||||
|
cut -c -24 BALANCE.txt
|
||||||
|
|
||||||
|
# Keep characters from 1 to 24 and 36 to 59 (inclusive)
|
||||||
|
cut -c -24,36-59 BALANCE.txt
|
||||||
|
|
||||||
|
# Keep characters from 1 to 24, 36 to 59 and 93 to the end of the line (inclusive)
|
||||||
|
cut -c -24,36-59,93- BALANCE.txt
|
||||||
|
```
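To get a feel for the shortcuts without the sample file, here is a small sketch on a ten-letter string (the string and the variable names are only for illustration):

```shell
line='abcdefghij'

single=$(printf '%s\n' "$line" | cut -c 3)          # one position, same as 3-3
from_start=$(printf '%s\n' "$line" | cut -c -4)     # open start, same as 1-4
to_end=$(printf '%s\n' "$line" | cut -c 7-)         # open end, 7 to the end of the line
combined=$(printf '%s\n' "$line" | cut -c -2,5,9-)  # several ranges at once

printf '%s\n' "$single" "$from_start" "$to_end" "$combined"
```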
|
||||||
|
|
||||||
|
One limitation (or feature, depending on the way you see it) of the `cut` command is that it will _never reorder the data_. So the following command will produce exactly the same result as the previous one, despite the ranges being specified in a different order:
|
||||||
|
|
||||||
|
```
|
||||||
|
cut -c 93-,-24,36-59 BALANCE.txt
|
||||||
|
```
|
||||||
|
|
||||||
|
You can check that easily using the `diff` command:
|
||||||
|
|
||||||
|
```
|
||||||
|
diff -s <(cut -c -24,36-59,93- BALANCE.txt) \
|
||||||
|
<(cut -c 93-,-24,36-59 BALANCE.txt)
|
||||||
|
Files /dev/fd/63 and /dev/fd/62 are identical
|
||||||
|
```
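The check above relies on process substitution, which is a bash feature. As a sketch of the same comparison in a plain POSIX shell, using a short made-up string instead of the sample file:

```shell
line='abcdefghij'

# Same three ranges, listed in two different orders
in_order=$(printf '%s\n' "$line" | cut -c 1-2,5-6,9-)
shuffled=$(printf '%s\n' "$line" | cut -c 9-,1-2,5-6)

echo "$in_order"
echo "$shuffled"

# The output follows the input order, not the order of the ranges
[ "$in_order" = "$shuffled" ] && echo "identical"
```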
|
||||||
|
|
||||||
|
Similarly, the `cut` command _never duplicates data_ :
|
||||||
|
|
||||||
|
```
|
||||||
|
# One might expect that could be a way to repeat
|
||||||
|
# the first column three times, but no...
|
||||||
|
cut -c -10,-10,-10 BALANCE.txt | head -5
|
||||||
|
ACCDOC
|
||||||
|
4
|
||||||
|
4
|
||||||
|
4
|
||||||
|
5
|
||||||
|
```
|
||||||
|
|
||||||
|
It is worth mentioning that there was a proposal for a `-o` option to lift those last two limitations, allowing the `cut` utility to reorder or duplicate data. But this was [rejected by the POSIX committee][14] _“because this type of enhancement is outside the scope of the IEEE P1003.2b draft standard.”_
|
||||||
|
|
||||||
|
As for me, I don’t know of any cut version implementing that proposal as an extension. But if you do, please share that with us using the comment section!
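When reordering or duplicating really is needed, a different tool has to step in. One option (not something the `cut` utility offers, and only a sketch) is `awk` with its `substr` function, which takes a start position and a length rather than a start and an end:

```shell
line='abcdefghij'

# substr(s, start, length): emit characters 9-10 first, then characters 1-2 twice
reordered=$(printf '%s\n' "$line" | awk '{ print substr($0, 9, 2) substr($0, 1, 2) substr($0, 1, 2) }')
echo "$reordered"
```

Like the `cut` implementations discussed in this article, different `awk` implementations may count bytes or characters in `substr`, so the same care is needed with multibyte data.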
|
||||||
|
|
||||||
|
### 2\. Working with byte ranges
|
||||||
|
|
||||||
|
When invoked with the `-b` command line option, the cut command will remove byte ranges.
|
||||||
|
|
||||||
|
At first sight, there is no obvious difference between _character_ and _byte_ ranges:
|
||||||
|
|
||||||
|
```
|
||||||
|
sh$ diff -s <(cut -b -24,36-59,93- BALANCE.txt) \
|
||||||
|
<(cut -c -24,36-59,93- BALANCE.txt)
|
||||||
|
Files /dev/fd/63 and /dev/fd/62 are identical
|
||||||
|
```
|
||||||
|
|
||||||
|
That’s because my sample data file is using the [US-ASCII character encoding][27] (“charset”) as the `file -i` command can correctly guess it:
|
||||||
|
|
||||||
|
```
|
||||||
|
sh$ file -i BALANCE.txt
|
||||||
|
BALANCE.txt: text/plain; charset=us-ascii
|
||||||
|
```
|
||||||
|
|
||||||
|
In that character encoding, there is a one-to-one mapping between characters and bytes. Using only one byte, you can theoretically encode up to 256 different characters (digits, letters, punctuation, symbols, …). In practice, that number is much lower since character encodings make provision for some special values (like the 32 or 65 [control characters][28] generally found). Anyway, even if we could use the full byte range, that would be far from enough to store the variety of human writing. So, today, the one-to-one mapping between characters and bytes is more the exception than the norm and is almost always replaced by the ubiquitous UTF-8 multibyte encoding. Let’s see now how the cut command handles that.
|
||||||
|
|
||||||
|
#### Working with multibyte characters
|
||||||
|
|
||||||
|
As I said previously, the sample data files used as examples in this article come from the accounting software used by my wife. It so happens that she updated that software recently and, after that, the exported text files were subtly different. I’ll let you try to spot the difference by yourself:
|
||||||
|
|
||||||
|
```
|
||||||
|
sh$ head BALANCE-V2.txt
|
||||||
|
ACCDOC ACCDOCDATE ACCOUNTNUM ACCOUNTLIB ACCDOCLIB DEBIT CREDIT
|
||||||
|
4 1012017 623477 TIDE SCHEDULE ALNÉENRE-4701-LOC 00000001615,00
|
||||||
|
4 1012017 445452 VAT BS/ENC ALNÉENRE-4701-LOC 00000000323,00
|
||||||
|
4 1012017 4356 PAYABLES ALNÉENRE-4701-LOC 00000001938,00
|
||||||
|
5 1012017 623372 ACCOMODATION GUIDE ALNÉENRE-4771-LOC 00000001333,00
|
||||||
|
5 1012017 445452 VAT BS/ENC ALNÉENRE-4771-LOC 00000000266,60
|
||||||
|
5 1012017 4356 PAYABLES ALNÉENRE-4771-LOC 00000001599,60
|
||||||
|
6 1012017 4356 PAYABLES FACT FA00006253 - BIT QUIROBEN 00000001837,20
|
||||||
|
6 1012017 445452 VAT BS/ENC FACT FA00006253 - BIT QUIROBEN 00000000306,20
|
||||||
|
6 1012017 623795 TOURIST GUIDE BOOK FACT FA00006253 - BIT QUIROBEN 00000001531,00
|
||||||
|
```
|
||||||
|
|
||||||
|
The title of this section might help you find what has changed. But, found or not, let’s now see the consequences of that change:
|
||||||
|
|
||||||
|
```
|
||||||
|
sh$ cut -c 93-,-24,36-59 BALANCE-V2.txt
|
||||||
|
ACCDOC ACCDOCDATE ACCOUNTLIB DEBIT CREDIT
|
||||||
|
4 1012017 TIDE SCHEDULE 00000001615,00
|
||||||
|
4 1012017 VAT BS/ENC 00000000323,00
|
||||||
|
4 1012017 PAYABLES 00000001938,00
|
||||||
|
5 1012017 ACCOMODATION GUIDE 00000001333,00
|
||||||
|
5 1012017 VAT BS/ENC 00000000266,60
|
||||||
|
5 1012017 PAYABLES 00000001599,60
|
||||||
|
6 1012017 PAYABLES 00000001837,20
|
||||||
|
6 1012017 VAT BS/ENC 00000000306,20
|
||||||
|
6 1012017 TOURIST GUIDE BOOK 00000001531,00
|
||||||
|
19 1012017 SEMINAR FEES 00000000080,00
|
||||||
|
19 1012017 PAYABLES 00000000080,00
|
||||||
|
28 1012017 MAINTENANCE 00000000746,58
|
||||||
|
28 1012017 VAT BS/ENC 00000000149,32
|
||||||
|
28 1012017 PAYABLES 00000000895,90
|
||||||
|
31 1012017 PAYABLES 00000000240,00
|
||||||
|
31 1012017 VAT BS/DEBIT 00000000040,00
|
||||||
|
31 1012017 ADVERTISEMENTS 00000000200,00
|
||||||
|
32 1012017 WATER 00000000202,20
|
||||||
|
32 1012017 VAT BS/DEBIT 00000000020,22
|
||||||
|
32 1012017 WATER 00000000170,24
|
||||||
|
32 1012017 VAT BS/DEBIT 00000000009,37
|
||||||
|
32 1012017 PAYABLES 00000000402,03
|
||||||
|
34 1012017 RENTAL COSTS 00000000018,00
|
||||||
|
34 1012017 PAYABLES 00000000018,00
|
||||||
|
35 1012017 MISCELLANEOUS CHARGES 00000000015,00
|
||||||
|
35 1012017 VAT BS/DEBIT 00000000003,00
|
||||||
|
35 1012017 PAYABLES 00000000018,00
|
||||||
|
36 1012017 LANDLINE TELEPHONE 00000000069,14
|
||||||
|
36 1012017 VAT BS/ENC 00000000013,83
|
||||||
|
```
|
||||||
|
|
||||||
|
I have copied the command output above _in extenso_, so it should be obvious that something has gone wrong with the column alignment.
|
||||||
|
|
||||||
|
The explanation is that the original data file contained only US-ASCII characters (symbols, punctuation, numbers and Latin letters without any diacritical marks).
|
||||||
|
|
||||||
|
But if you look closely at the file produced after the software update, you can see that the new export data file now preserves accented letters. For example, the company named “ALNÉENRE” is now properly spelled whereas it was previously exported as “ALNEENRE” (no accent).
|
||||||
|
|
||||||
|
The `file -i` utility did not miss that change since it now reports the file as being [UTF-8 encoded][15]:
|
||||||
|
|
||||||
|
```
|
||||||
|
sh$ file -i BALANCE-V2.txt
|
||||||
|
BALANCE-V2.txt: text/plain; charset=utf-8
|
||||||
|
```
|
||||||
|
|
||||||
|
To see how accented letters are encoded in a UTF-8 file, we can use the `[hexdump][12]` utility, which allows us to look directly at the bytes in a file:
|
||||||
|
|
||||||
|
```
|
||||||
|
# To reduce clutter, let's focus only on the second line of the file
|
||||||
|
sh$ sed '2!d' BALANCE-V2.txt
|
||||||
|
4 1012017 623477 TIDE SCHEDULE ALNÉENRE-4701-LOC 00000001615,00
|
||||||
|
sh$ sed '2!d' BALANCE-V2.txt | hexdump -C
|
||||||
|
00000000 34 20 20 20 20 20 20 20 20 20 31 30 31 32 30 31 |4 101201|
|
||||||
|
00000010 37 20 20 20 20 20 20 20 36 32 33 34 37 37 20 20 |7 623477 |
|
||||||
|
00000020 20 20 20 54 49 44 45 20 53 43 48 45 44 55 4c 45 | TIDE SCHEDULE|
|
||||||
|
00000030 20 20 20 20 20 20 20 20 20 20 20 41 4c 4e c3 89 | ALN..|
|
||||||
|
00000040 45 4e 52 45 2d 34 37 30 31 2d 4c 4f 43 20 20 20 |ENRE-4701-LOC |
|
||||||
|
00000050 20 20 20 20 20 20 20 20 20 20 20 20 20 30 30 30 | 000|
|
||||||
|
00000060 30 30 30 30 31 36 31 35 2c 30 30 20 20 20 20 20 |00001615,00 |
|
||||||
|
00000070 20 20 20 20 20 20 20 20 20 20 20 0a | .|
|
||||||
|
0000007c
|
||||||
|
```
|
||||||
|
|
||||||
|
On the line 00000030 of the `hexdump` output, after a bunch of spaces (byte `20`), you can see:
|
||||||
|
|
||||||
|
* the letter `A` is encoded as the byte `41`,
|
||||||
|
|
||||||
|
* the letter `L` is encoded as the byte `4c`,
|
||||||
|
|
||||||
|
* and the letter `N` is encoded as the byte `4e`.
|
||||||
|
|
||||||
|
But the uppercase [LATIN CAPITAL LETTER E WITH ACUTE][16] (as it is the official name of the letter _É_ in the Unicode standard) is encoded using the _two_ bytes `c3 89`.
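You can reproduce this without the sample file. In the sketch below, the company name is written with octal escapes so the bytes are explicit (`\303\211` is `c3 89`, the UTF-8 encoding of _É_); the variable names are only for illustration:

```shell
# "ALNÉENRE" with É spelled out as its two UTF-8 bytes (octal 303 211 = hex c3 89)
word='ALN\303\211ENRE'

# 8 characters, but 9 bytes
byte_count=$(( $(printf "$word" | wc -c) ))
echo "$byte_count"

# Bytes 1-4 stop in the middle of É: only its first byte (c3) survives
hex=$(printf "$word\n" | cut -b 1-4 | od -An -tx1 | tr -d ' \n')
echo "$hex"
```

The hex dump ends with `c3` followed by the newline (`0a`): a byte range that ends inside a multibyte character leaves an incomplete, invalid sequence in the output.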
|
||||||
|
|
||||||
|
And here is the problem: using the `cut` command with ranges expressed as byte positions works well for fixed length encodings, but not for variable length ones like UTF-8 or [Shift JIS][17]. This is clearly explained in the following [non-normative extract of the POSIX standard][18]:
|
||||||
|
|
||||||
|
> Earlier versions of the cut utility worked in an environment where bytes and characters were considered equivalent (modulo <backspace> and <tab> processing in some implementations). In the extended world of multi-byte characters, the new -b option has been added.
|
||||||
|
|
||||||
|
Hey, wait a minute! I wasn’t using the `-b` option in the “faulty” example above, but the `-c` option. So, _shouldn’t_ that have worked?!?
|
||||||
|
|
||||||
|
Yes, it _should_: it is unfortunate, but we are in 2018 and despite that, as of GNU Coreutils 8.30, the GNU implementation of the cut utility still does not handle multi-byte characters properly. To quote the [GNU documentation][19], the `-c` option is _“The same as -b for now, but internationalization will change that[… ]”_, a mention that has been present for more than 10 years now!
|
||||||
|
|
||||||
|
On the other hand, the [OpenBSD][20] implementation of the cut utility is POSIX compliant, and will honor the current locale settings to handle multi-byte characters properly:
|
||||||
|
|
||||||
|
```
|
||||||
|
# Ensure subsequent commands will know we are using UTF-8 encoded
|
||||||
|
# text files
|
||||||
|
openbsd-6.3$ export LC_CTYPE=en_US.UTF-8
|
||||||
|
|
||||||
|
# With the `-c` option, cut works properly with multi-byte characters
|
||||||
|
openbsd-6.3$ cut -c -24,36-59,93- BALANCE-V2.txt
|
||||||
|
ACCDOC ACCDOCDATE ACCOUNTLIB DEBIT CREDIT
|
||||||
|
4 1012017 TIDE SCHEDULE 00000001615,00
|
||||||
|
4 1012017 VAT BS/ENC 00000000323,00
|
||||||
|
4 1012017 PAYABLES 00000001938,00
|
||||||
|
5 1012017 ACCOMODATION GUIDE 00000001333,00
|
||||||
|
5 1012017 VAT BS/ENC 00000000266,60
|
||||||
|
5 1012017 PAYABLES 00000001599,60
|
||||||
|
6 1012017 PAYABLES 00000001837,20
|
||||||
|
6 1012017 VAT BS/ENC 00000000306,20
|
||||||
|
6 1012017 TOURIST GUIDE BOOK 00000001531,00
|
||||||
|
19 1012017 SEMINAR FEES 00000000080,00
|
||||||
|
19 1012017 PAYABLES 00000000080,00
|
||||||
|
28 1012017 MAINTENANCE 00000000746,58
|
||||||
|
28 1012017 VAT BS/ENC 00000000149,32
|
||||||
|
28 1012017 PAYABLES 00000000895,90
|
||||||
|
31 1012017 PAYABLES 00000000240,00
|
||||||
|
31 1012017 VAT BS/DEBIT 00000000040,00
|
||||||
|
31 1012017 ADVERTISEMENTS 00000000200,00
|
||||||
|
32 1012017 WATER 00000000202,20
|
||||||
|
32 1012017 VAT BS/DEBIT 00000000020,22
|
||||||
|
32 1012017 WATER 00000000170,24
|
||||||
|
32 1012017 VAT BS/DEBIT 00000000009,37
|
||||||
|
32 1012017 PAYABLES 00000000402,03
|
||||||
|
34 1012017 RENTAL COSTS 00000000018,00
|
||||||
|
34 1012017 PAYABLES 00000000018,00
|
||||||
|
35 1012017 MISCELLANEOUS CHARGES 00000000015,00
|
||||||
|
35 1012017 VAT BS/DEBIT 00000000003,00
|
||||||
|
35 1012017 PAYABLES 00000000018,00
|
||||||
|
36 1012017 LANDLINE TELEPHONE 00000000069,14
|
||||||
|
36 1012017 VAT BS/ENC 00000000013,83
|
||||||
|
```
|
||||||
|
|
||||||
|
As expected, when using the `-b` byte mode instead of the `-c` character mode, the OpenBSD cut implementation behaves like the legacy `cut`:
|
||||||
|
|
||||||
|
```
|
||||||
|
openbsd-6.3$ cut -b -24,36-59,93- BALANCE-V2.txt
|
||||||
|
ACCDOC ACCDOCDATE ACCOUNTLIB DEBIT CREDIT
|
||||||
|
4 1012017 TIDE SCHEDULE 00000001615,00
|
||||||
|
4 1012017 VAT BS/ENC 00000000323,00
|
||||||
|
4 1012017 PAYABLES 00000001938,00
|
||||||
|
5 1012017 ACCOMODATION GUIDE 00000001333,00
|
||||||
|
5 1012017 VAT BS/ENC 00000000266,60
|
||||||
|
5 1012017 PAYABLES 00000001599,60
|
||||||
|
6 1012017 PAYABLES 00000001837,20
|
||||||
|
6 1012017 VAT BS/ENC 00000000306,20
|
||||||
|
6 1012017 TOURIST GUIDE BOOK 00000001531,00
|
||||||
|
19 1012017 SEMINAR FEES 00000000080,00
|
||||||
|
19 1012017 PAYABLES 00000000080,00
|
||||||
|
28 1012017 MAINTENANCE 00000000746,58
|
||||||
|
28 1012017 VAT BS/ENC 00000000149,32
|
||||||
|
28 1012017 PAYABLES 00000000895,90
|
||||||
|
31 1012017 PAYABLES 00000000240,00
|
||||||
|
31 1012017 VAT BS/DEBIT 00000000040,00
|
||||||
|
31 1012017 ADVERTISEMENTS 00000000200,00
|
||||||
|
32 1012017 WATER 00000000202,20
|
||||||
|
32 1012017 VAT BS/DEBIT 00000000020,22
|
||||||
|
32 1012017 WATER 00000000170,24
|
||||||
|
32 1012017 VAT BS/DEBIT 00000000009,37
|
||||||
|
32 1012017 PAYABLES 00000000402,03
|
||||||
|
34 1012017 RENTAL COSTS 00000000018,00
|
||||||
|
34 1012017 PAYABLES 00000000018,00
|
||||||
|
35 1012017 MISCELLANEOUS CHARGES 00000000015,00
|
||||||
|
35 1012017 VAT BS/DEBIT 00000000003,00
|
||||||
|
35 1012017 PAYABLES 00000000018,00
|
||||||
|
36 1012017 LANDLINE TELEPHONE 00000000069,14
|
||||||
|
36 1012017 VAT BS/ENC 00000000013,83
|
||||||
|
```
|
||||||
|
|
||||||
|
### 3\. Working with fields
|
||||||
|
|
||||||
|
In some sense, working with fields in a delimited text file is easier for the `cut` utility, since it only has to locate the (one byte) field delimiters on each row and then copy the field content verbatim to the output, without bothering with any encoding issues.
|
||||||
|
|
||||||
|
Here is a sample delimited text file:
|
||||||
|
|
||||||
|
```
|
||||||
|
sh$ head BALANCE.csv
|
||||||
|
ACCDOC;ACCDOCDATE;ACCOUNTNUM;ACCOUNTLIB;ACCDOCLIB;DEBIT;CREDIT
|
||||||
|
4;1012017;623477;TIDE SCHEDULE;ALNEENRE-4701-LOC;00000001615,00;
|
||||||
|
4;1012017;445452;VAT BS/ENC;ALNEENRE-4701-LOC;00000000323,00;
|
||||||
|
4;1012017;4356;PAYABLES;ALNEENRE-4701-LOC;;00000001938,00
|
||||||
|
5;1012017;623372;ACCOMODATION GUIDE;ALNEENRE-4771-LOC;00000001333,00;
|
||||||
|
5;1012017;445452;VAT BS/ENC;ALNEENRE-4771-LOC;00000000266,60;
|
||||||
|
5;1012017;4356;PAYABLES;ALNEENRE-4771-LOC;;00000001599,60
|
||||||
|
6;1012017;4356;PAYABLES;FACT FA00006253 - BIT QUIROBEN;;00000001837,20
|
||||||
|
6;1012017;445452;VAT BS/ENC;FACT FA00006253 - BIT QUIROBEN;00000000306,20;
|
||||||
|
6;1012017;623795;TOURIST GUIDE BOOK;FACT FA00006253 - BIT QUIROBEN;00000001531,00;
|
||||||
|
```
|
||||||
|
|
||||||
|
You may know that file format as [CSV][29] (for Comma-Separated Values), even if the field separator is not always a comma. For example, the semi-colon (`;`) is frequently encountered as a field separator, and it is often the default choice when exporting data as “CSV” in countries already using the comma as the [decimal separator][30] (like we do in France, hence the choice of that character in my sample file). Another popular variant uses a [tab character][31] as the field separator, producing what is sometimes called a [tab-separated values][32] file. Finally, in the Unix and Linux world, the colon (`:`) is yet another relatively common field separator you may find, for example, in the standard `/etc/passwd` and `/etc/group` files.
|
||||||
|
|
||||||
|
When using a delimited text file format, you provide to the cut command the range of fields to keep using the `-f` option, and you have to specify the delimiter using the `-d` option (without the `-d` option, the cut utility defaults to a tab character for the separator):
|
||||||
|
|
||||||
|
```
|
||||||
|
sh$ cut -f 5- -d';' BALANCE.csv | head
|
||||||
|
ACCDOCLIB;DEBIT;CREDIT
|
||||||
|
ALNEENRE-4701-LOC;00000001615,00;
|
||||||
|
ALNEENRE-4701-LOC;00000000323,00;
|
||||||
|
ALNEENRE-4701-LOC;;00000001938,00
|
||||||
|
ALNEENRE-4771-LOC;00000001333,00;
|
||||||
|
ALNEENRE-4771-LOC;00000000266,60;
|
||||||
|
ALNEENRE-4771-LOC;;00000001599,60
|
||||||
|
FACT FA00006253 - BIT QUIROBEN;;00000001837,20
|
||||||
|
FACT FA00006253 - BIT QUIROBEN;00000000306,20;
|
||||||
|
FACT FA00006253 - BIT QUIROBEN;00000001531,00;
|
||||||
|
```
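The colon-separated files mentioned earlier work the same way. Here is a sketch using two made-up `passwd`-style records rather than the real `/etc/passwd`:

```shell
# passwd-style records (made-up accounts): name:password:uid:gid:comment:home:shell
records=$(printf '%s\n' 'root:x:0:0:root:/root:/bin/sh' 'alice:x:1000:1000:Alice:/home/alice:/bin/bash')

# Keep the login name (field 1) and the login shell (field 7)
name_shell=$(printf '%s\n' "$records" | cut -d: -f1,7)
printf '%s\n' "$name_shell"
```

The selected fields are joined with the input delimiter, so each output line still uses a colon.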
|
||||||
|
|
||||||
|
#### Handling lines not containing the delimiter
|
||||||
|
|
||||||
|
But what if some line in the input file does not contain the delimiter? It is tempting to imagine that as a row containing only the first field. But this is _not_ what the cut utility does.
|
||||||
|
|
||||||
|
By default, when using the `-f` option, the cut utility will always output verbatim a line that does not contain the delimiter (probably assuming this is a non-data row like a header or comment of some sort):
|
||||||
|
|
||||||
|
```
|
||||||
|
sh$ (echo "# 2018-03 BALANCE"; cat BALANCE.csv) > BALANCE-WITH-HEADER.csv
|
||||||
|
|
||||||
|
sh$ cut -f 6,7 -d';' BALANCE-WITH-HEADER.csv | head -5
|
||||||
|
# 2018-03 BALANCE
|
||||||
|
DEBIT;CREDIT
|
||||||
|
00000001615,00;
|
||||||
|
00000000323,00;
|
||||||
|
;00000001938,00
|
||||||
|
```
|
||||||
|
|
||||||
|
Using the `-s` option, you can reverse that behavior, so `cut` will always ignore such lines:
|
||||||
|
|
||||||
|
```
|
||||||
|
sh$ cut -s -f 6,7 -d';' BALANCE-WITH-HEADER.csv | head -5
|
||||||
|
DEBIT;CREDIT
|
||||||
|
00000001615,00;
|
||||||
|
00000000323,00;
|
||||||
|
;00000001938,00
|
||||||
|
00000001333,00;
|
||||||
|
```
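The two behaviors are easy to compare side by side on a few lines of made-up data (a sketch; the data and variable names are only for illustration):

```shell
# Three lines; the first one does not contain the ';' delimiter
data=$(printf '%s\n' '# comment without delimiter' 'a;b;c' 'd;e;f')

# Default: the line without delimiter is passed through verbatim
with_comment=$(printf '%s\n' "$data" | cut -d';' -f2)

# With -s: the line without delimiter is suppressed
without_comment=$(printf '%s\n' "$data" | cut -s -d';' -f2)

printf '%s\n' "$with_comment" '---' "$without_comment"
```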
|
||||||
|
|
||||||
|
If you are in a hackish mood, you can exploit that feature as a relatively obscure way to keep only lines containing a given character:
|
||||||
|
|
||||||
|
```
|
||||||
|
# Keep lines containing an `e`
|
||||||
|
sh$ printf "%s\n" {mighty,bold,great}-{condor,monkey,bear} | cut -s -f 1- -d'e'
|
||||||
|
```
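For the curious, here is roughly what that trick does on a shorter list, written without brace expansion so it also runs in a plain POSIX shell. The names come from the command above; the variable name is just for illustration:

```shell
# "mighty-condor" contains no "e", so -s suppresses it;
# the other lines keep all their fields (1-) and come out unchanged
kept=$(printf '%s\n' mighty-condor bold-monkey great-bear | cut -s -f 1- -d'e')
printf '%s\n' "$kept"
```

In practice, `grep e` says the same thing far more clearly.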
|
||||||
|
|
||||||
|
#### Changing the output delimiter
|
||||||
|
|
||||||
|
As an extension, the GNU implementation of cut allows you to use a different field separator for the output using the `--output-delimiter` option:
|
||||||
|
|
||||||
|
```
|
||||||
|
sh$ cut -f 5,6- -d';' --output-delimiter="*" BALANCE.csv | head
|
||||||
|
ACCDOCLIB*DEBIT*CREDIT
|
||||||
|
ALNEENRE-4701-LOC*00000001615,00*
|
||||||
|
ALNEENRE-4701-LOC*00000000323,00*
|
||||||
|
ALNEENRE-4701-LOC**00000001938,00
|
||||||
|
ALNEENRE-4771-LOC*00000001333,00*
|
||||||
|
ALNEENRE-4771-LOC*00000000266,60*
|
||||||
|
ALNEENRE-4771-LOC**00000001599,60
|
||||||
|
FACT FA00006253 - BIT QUIROBEN**00000001837,20
|
||||||
|
FACT FA00006253 - BIT QUIROBEN*00000000306,20*
|
||||||
|
FACT FA00006253 - BIT QUIROBEN*00000001531,00*
|
||||||
|
```
|
||||||
|
|
||||||
|
Notice, in that case, all occurrences of the field separator are replaced, and not only those at the boundary of the ranges specified on the command line arguments.
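If your `cut` lacks that extension, a similar result can be approximated by piping through `tr`. This is a swapped-in technique rather than an equivalent option, and it only works for single-character delimiters; the sample row is taken from the file above:

```shell
row='4;1012017;623477;TIDE SCHEDULE;ALNEENRE-4701-LOC;00000001615,00;'

# Select the fields with cut, then translate every remaining ';' into '*'
converted=$(printf '%s\n' "$row" | cut -d';' -f5- | tr ';' '*')
echo "$converted"
```

Since a field can never contain the delimiter, every `;` left in the output of `cut` is a field separator, so translating them all is safe here.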
|
||||||
|
|
||||||
|
### 4\. Non-POSIX GNU extensions
|
||||||
|
|
||||||
|
Speaking of non-POSIX GNU extensions, a couple of them can be particularly useful. It is worth mentioning that the following extensions work equally well with byte, character (for what that means in the current GNU implementation) or field ranges:
|
||||||
|
|
||||||
|
`--complement`: think of that option like the exclamation mark in a sed address (`!`); instead of keeping the data matching the given range, `cut` will keep the data NOT matching the range.
|
||||||
|
|
||||||
|
```
|
||||||
|
# Keep only field 5
|
||||||
|
sh$ cut -f 5 -d';' BALANCE.csv |head -3
|
||||||
|
ACCDOCLIB
|
||||||
|
ALNEENRE-4701-LOC
|
||||||
|
ALNEENRE-4701-LOC
|
||||||
|
|
||||||
|
# Keep all but field 5
|
||||||
|
sh$ cut --complement -f 5 -d';' BALANCE.csv |head -3
|
||||||
|
ACCDOC;ACCDOCDATE;ACCOUNTNUM;ACCOUNTLIB;DEBIT;CREDIT
|
||||||
|
4;1012017;623477;TIDE SCHEDULE;00000001615,00;
|
||||||
|
4;1012017;445452;VAT BS/ENC;00000000323,00;
|
||||||
|
```
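When the number of the unwanted field is known in advance, the same selection can be written portably by listing the ranges on either side of it. A sketch, using one row from the sample file:

```shell
row='4;1012017;623477;TIDE SCHEDULE;ALNEENRE-4701-LOC;00000001615,00;'

# Everything except field 5: fields 1 to 4, then field 6 to the end
all_but_fifth=$(printf '%s\n' "$row" | cut -d';' -f1-4,6-)
echo "$all_but_fifth"
```

This avoids the GNU-only option, at the price of spelling out the ranges by hand.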
|
||||||
|
|
||||||
|
`-z` (`--zero-terminated`): use the [NUL character][6] as the line terminator instead of the [newline character][7]. The `-z` option is particularly useful when your data may contain embedded newline characters, like when working with filenames (since newline is a valid character in a filename, but NUL isn’t).
|
||||||
|
|
||||||
|
|
||||||
|
To show you how the `-z` option works, let’s make a little experiment. First, we will create a file whose name contains embedded new lines:
|
||||||
|
|
||||||
|
```
|
||||||
|
bash$ touch $'EMPTY\nFILE\nWITH FUNKY\nNAME'.txt
|
||||||
|
bash$ ls -1 *.txt
|
||||||
|
BALANCE.txt
|
||||||
|
BALANCE-V2.txt
|
||||||
|
EMPTY?FILE?WITH FUNKY?NAME.txt
|
||||||
|
```
|
||||||
|
|
||||||
|
Let’s now assume I want to display the first 5 characters of each `*.txt` file name. A naive solution will miserably fail here:
|
||||||
|
|
||||||
|
```
|
||||||
|
sh$ ls -1 *.txt | cut -c 1-5
|
||||||
|
BALAN
|
||||||
|
BALAN
|
||||||
|
EMPTY
|
||||||
|
FILE
|
||||||
|
WITH
|
||||||
|
NAME.
|
||||||
|
```
|
||||||
|
|
||||||
|
You may have already read `[ls][21]` was designed for [human consumption][33], and using it in a command pipeline is an anti-pattern (it is indeed). So let’s use the `[find][22]` command instead:
|
||||||
|
|
||||||
|
```
|
||||||
|
sh$ find . -name '*.txt' -printf "%f\n" | cut -c 1-5
|
||||||
|
BALAN
|
||||||
|
EMPTY
|
||||||
|
FILE
|
||||||
|
WITH
|
||||||
|
NAME.
|
||||||
|
BALAN
|
||||||
|
```
|
||||||
|
|
||||||
|
and … that produced basically the same erroneous result as before (although in a different order because `ls` implicitly sorts the filenames, something the `find` command does not do).
|
||||||
|
|
||||||
|
The problem is that, in both cases, the `cut` command can’t make the distinction between a newline character being part of a data field (the filename), and a newline character used as an end of record marker. But using the NUL byte (`\0`) as the line terminator clears the confusion so we can finally obtain the expected result:
|
||||||
|
|
||||||
|
```
|
||||||
|
# I was told (?) some old versions of tr require using \000 instead of \0
|
||||||
|
# to denote the NUL character (let me know if you needed that change!)
|
||||||
|
sh$ find . -name '*.txt' -printf "%f\0" | cut -z -c 1-5| tr '\0' '\n'
|
||||||
|
BALAN
|
||||||
|
EMPTY
|
||||||
|
BALAN
|
||||||
|
```
|
||||||
|
|
||||||
|
With that latest example, we are moving away from the core of this article, which was the `cut` command. So, I will let you try to figure out for yourself the meaning of the funky `"%f\0"` after the `-printf` argument of the `find` command or why I used the `[tr][23]` command at the end of the pipeline.
|
||||||
|
|
||||||
|
### A lot more can be done with Cut command
|
||||||
|
|
||||||
|
I just showed the most common and, in my opinion, the most essential usages of the cut command. You can apply the command in even more practical ways. It depends on your logical reasoning and imagination.
|
||||||
|
|
||||||
|
Don’t hesitate to use the comment section below to post your findings. And, as always, if you like this article, don’t forget to share it on your favorite websites and social media!
|
||||||
|
|
||||||
|
--------------------------------------------------------------------------------

via: https://linuxhandbook.com/cut-command/

Author: [Sylvain Leroux][a]
Translator: [translator ID](https://github.com/译者ID)
Proofreader: [proofreader ID](https://github.com/校对者ID)

This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux China](https://linux.cn/)

[a]:https://linuxhandbook.com/author/sylvain/
[1]:https://linuxhandbook.com/cut-command/#_what_s_a_range
[2]:https://linuxhandbook.com/cut-command/#_working_with_multibyte_characters
[3]:https://linuxhandbook.com/cut-command/#_handling_lines_not_containing_the_delimiter
[4]:https://linuxhandbook.com/cut-command/#_changing_the_output_delimiter
[5]:http://click.linksynergy.com/deeplink?id=IRL8ozn3lq8&type=10&mid=39197&murl=https%3A%2F%2Fwww.udemy.com%2Fyes-i-know-the-bash-linux-command-line-tools%2F
[6]:https://en.wikipedia.org/wiki/Null_character
[7]:https://en.wikipedia.org/wiki/Newline
[8]:https://linuxhandbook.com/cut-command/#_working_with_character_ranges
[9]:https://linuxhandbook.com/cut-command/#_working_with_byte_ranges
[10]:https://linuxhandbook.com/cut-command/#_working_with_fields
[11]:https://linuxhandbook.com/cut-command/#_non_posix_gnu_extensions
[12]:https://linux.die.net/man/1/hexdump
[13]:https://linuxhandbook.com/awk-command-tutorial/
[14]:http://pubs.opengroup.org/onlinepubs/9699919799/utilities/cut.html#tag_20_28_18
[15]:https://en.wikipedia.org/wiki/UTF-8#Codepage_layout
[16]:https://www.fileformat.info/info/unicode/char/00c9/index.htm
[17]:https://en.wikipedia.org/wiki/Shift_JIS#Shift_JIS_byte_map
[18]:http://pubs.opengroup.org/onlinepubs/9699919799/utilities/cut.html#tag_20_28_16
[19]:https://www.gnu.org/software/coreutils/manual/html_node/cut-invocation.html#cut-invocation
[20]:https://www.openbsd.org/
[21]:https://linux.die.net/man/1/ls
[22]:https://linux.die.net/man/1/find
[23]:https://linux.die.net/man/1/tr
[24]:https://linuxhandbook.com/author/sylvain/
[25]:https://linuxhandbook.com/cut-command/#comments
[26]:https://static.yesik.it/EP22/Yes_I_Know_IT-EP22.tar.gz
[27]:https://en.wikipedia.org/wiki/ASCII#Character_set
[28]:https://en.wikipedia.org/wiki/Control_character
[29]:https://en.wikipedia.org/wiki/Comma-separated_values
[30]:https://en.wikipedia.org/wiki/Decimal_separator
[31]:https://en.wikipedia.org/wiki/Tab_key#Tab_characters
[32]:https://en.wikipedia.org/wiki/Tab-separated_values
[33]:http://lists.gnu.org/archive/html/coreutils/2014-02/msg00005.html

translating---geekpi

Incomplete Path Expansion (Completion) For Bash
======

![](https://4.bp.blogspot.com/-k2pRIKTzcBU/W1BpFtzzWuI/AAAAAAAABOE/pqX4XcOX8T4NWkKOmzD0T0OioqxzCmhLgCLcBGAs/s1600/Gnu-bash-logo.png)

[bash-complete-partial-path][1] enhances the path completion in Bash (on Linux, macOS with gnu-sed, and Windows with MSYS) by adding incomplete path expansion, similar to Zsh. This is useful if you want this time-saving feature in Bash, without having to switch to Zsh.

Here is how this works. When the `Tab` key is pressed, bash-complete-partial-path assumes each component is incomplete and tries to expand it. Let's say you want to navigate to `/usr/share/applications`. You can type `cd /u/s/app`, press `Tab`, and bash-complete-partial-path should expand it into `cd /usr/share/applications`. If there are conflicts, only the part of the path without conflicts is completed upon pressing `Tab`. For instance, Ubuntu users should have quite a few folders in `/usr/share` that begin with "app", so in this case typing `cd /u/s/app` will only expand the `/usr/share/` part.

Here is another example of deeper incomplete file path expansion. On an Ubuntu system, type `cd /u/s/f/t/u`, press `Tab`, and it should be automatically expanded to `cd /usr/share/fonts/truetype/ubuntu`.
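
To get a feel for the idea, here is a rough sketch that mimics it with a plain shell glob: every path component gets a `*` appended, so `u/s/app` becomes `u*/s*/app*`. This is only my own illustration, not how bash-complete-partial-path is implemented; the `expand_partial_path` function and the demo directory tree are hypothetical.

```shell
# Hypothetical helper, NOT part of bash-complete-partial-path.
# Prints every existing path matching the partial path given as $1.
expand_partial_path() {
    # Append * to every component: u/s/app -> u*/s*/app*
    pattern=$(printf '%s\n' "$1" | sed 's|/|*/|g; s|^\*/|/|')'*'
    # $pattern is left unquoted on purpose so the shell expands the glob.
    # (This simple sketch does not handle spaces in path components.)
    for match in $pattern; do
        if [ -e "$match" ]; then
            printf '%s\n' "$match"
        fi
    done
}

# Build a small made-up tree to try it on, instead of relying on the real /usr.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/usr/share/applications" \
         "$demo_root/usr/share/apport" \
         "$demo_root/usr/share/fonts/truetype/ubuntu"
cd "$demo_root" || exit 1

expand_partial_path u/s/f/t/u   # a single candidate
expand_partial_path u/s/app     # several candidates: the "conflict" case
```

When more than one path is printed, you are in the conflict situation described above, where the real tool only completes the unambiguous part of the path.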

Features include:

* Escapes special characters

* If the user starts the path with quotes, character escaping is not applied and instead, the quote is closed with a matching character after expanding the path

* Properly expands ~ expressions

* If the bash-completion package is already in use, this code will safely override its `_filedir` function. No extra configuration is required, just make sure you source this project after the main bash-completion.

Check out the [project page][2] for more information and a demo screencast.

### Install bash-complete-partial-path

The bash-complete-partial-path installation instructions specify downloading the bash_completion script directly. I prefer to grab the Git repository instead, so I can update it with a simple `git pull`, so the instructions below use this method of installing bash-complete-partial-path. You can use the [official][3] instructions if you prefer them.

1. Install Git (needed to clone the bash-complete-partial-path Git repository).

In Debian, Ubuntu, Linux Mint and so on, use this command to install Git:

```
sudo apt install git
```

2. Clone the bash-complete-partial-path Git repository in `~/.config/`:

```
cd ~/.config && git clone https://github.com/sio/bash-complete-partial-path
```

3. Source `~/.config/bash-complete-partial-path/bash_completion` in your `~/.bashrc` file:

Open `~/.bashrc` with a text editor. You can use Gedit for example:

```
gedit ~/.bashrc
```

At the end of the `~/.bashrc` file add the following (as a single line):

```
[ -s "$HOME/.config/bash-complete-partial-path/bash_completion" ] && source "$HOME/.config/bash-complete-partial-path/bash_completion"
```

I mentioned adding it at the end of the file because it needs to come below (after) the line that loads the main bash-completion in your `~/.bashrc` file. So make sure you don't add it above the original bash-completion, as that will cause issues.

4. Source `~/.bashrc`:

```
source ~/.bashrc
```

And you're done, bash-complete-partial-path should now be installed and ready to be used.
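
If you would rather not open an editor for step 3, the line can also be appended from the command line. The sketch below is just one way to do it, under my own assumptions: `append_once` is a hypothetical helper that adds a line only when it is not already present, so running it twice does not duplicate the entry. It is demonstrated on a scratch file rather than on your real `~/.bashrc`.

```shell
# Hypothetical helper: append a line to a file only if it is not already there.
append_once() {
    line=$1
    file=$2
    # -x: match whole lines, -F: fixed string (no regex), -q: quiet
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# The exact line from step 3, kept literal with single quotes.
source_line='[ -s "$HOME/.config/bash-complete-partial-path/bash_completion" ] && source "$HOME/.config/bash-complete-partial-path/bash_completion"'

# Demonstrated on a scratch file; point it at ~/.bashrc for real use.
scratch_rc=$(mktemp)
append_once "$source_line" "$scratch_rc"
append_once "$source_line" "$scratch_rc"   # second call changes nothing

# Show how many lines ended up in the scratch file.
wc -l < "$scratch_rc"
```

Because the line is always appended at the end of the file, it still lands after the main bash-completion, as required above.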

--------------------------------------------------------------------------------

via: https://www.linuxuprising.com/2018/07/incomplete-path-expansion-completion.html

Author: [Logix][a]
Topic selection: [lujun9972](https://github.com/lujun9972)
Translator: [translator ID](https://github.com/译者ID)
Proofreader: [proofreader ID](https://github.com/校对者ID)

This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux China](https://linux.cn/)

[a]:https://plus.google.com/118280394805678839070
[1]:https://github.com/sio/bash-complete-partial-path
[2]:https://github.com/sio/bash-complete-partial-path
[3]:https://github.com/sio/bash-complete-partial-path#installation-and-updating