Merge remote-tracking branch 'LCTT/master'

This commit is contained in:
Xingyu Wang 2019-11-06 10:41:41 +08:00
commit db225cb840
14 changed files with 1881 additions and 620 deletions


@ -0,0 +1,107 @@
[#]: collector: (lujun9972)
[#]: translator: ( )
[#]: reviewer: ( )
[#]: publisher: ( )
[#]: url: ( )
[#]: subject: (Open Source Big Data Solutions Support Digital Transformation)
[#]: via: (https://opensourceforu.com/2019/11/open-source-big-data-solutions-support-digital-transformation/)
[#]: author: (Vinayak Ramachandra Adkoli https://opensourceforu.com/author/vinayak-adkoli/)
Open Source Big Data Solutions Support Digital Transformation
======
[![][1]][2]
_The digital transformation (DT) of enterprises is enabled by the judicious use of Big Data. And it's open source technologies that are the driving force behind the power of Big Data and DT._
Digital Transformation (DT) and Big Data combine to offer several advantages. Big Data based, digitally transformed systems make life easier and smarter, whether in the field of home automation or industrial automation. The digital world tracks the Big Data generated by IoT devices and similar sources, and tries to make this data more productive; as the world progresses, DT can therefore be taken as a given.
For example, NASA's rover Curiosity is sending Big Data from Mars to the Earth. Like the data sent by NASA's satellites revolving around Mars, this is digitally transformed Big Data, which works with DT to provide a unique platform for open source applications. Today, Curiosity has its own Twitter account with four million followers.
A Digital Transformation isn't complete unless a business adopts Big Data. The phrase “Data is the new crude oil” is not new. However, crude oil itself has no value unless it is refined into petrol, diesel, tar, wax, etc. Similarly, in our daily lives, we deal with tons of data. If this data is refined into a useful form, only then is it of some real use.
As an example, consider the transformation televisions have undergone in appearance. We once had picture tube based TVs. Today, we have LED, OLED and LCD based TVs, curved TVs, Internet enabled TVs, and so on. Such transformation is also quite evident in the digital world.
In a hospital, several patients may be diagnosed with cancer each year. The patient data generated is voluminous, including treatment methods, diverse drug therapies, patient responses, genetic histories, etc. But such vast pools of information, i.e., Big Data, would serve no useful purpose without proper analysis. So DT, coupled with Big Data and open source applications, can create a more patient-focused and effective treatment, one that might have higher recovery rates.
Big Data combines structured data with unstructured data to give us new business insights that we've never had before. Structured data may be traditional spreadsheets, your customer list, information about your products and business processes, etc. Unstructured data may include Google Trends data, feeds from IoT sensors, etc. When a layer of unstructured data is placed on top of structured data and analysed, that's where the magic happens.
Let's look into a typical business situation. Let's suppose a century-old car-making company asks its data team to use Big Data concepts to find an efficient way to make safe sales forecasts. In the past, the team would look at the number of products it had sold in the previous month, as well as the number of cars it had sold a year ago, and use that data to make a safe forecast. But now the Big Data team uses sentiment analysis on Twitter and looks at what people are saying about its products and brand. It also looks at Google Trends to see which similar products and brands are being searched for the most. Then it correlates such data from the preceding few months with the actual current sales figures to check whether the former was predictive, i.e., had Google Trends over the past few months actually predicted the firm's current sales figures?
In the case of the car company, while making sales forecasts, the team used structured data (how many cars sold last month, a year ago, etc) and layers of unstructured data (sentiment analysis from Twitter and Google Trends) and it resulted in a smart forecast. Thus, Big Data is today becoming more effective in business situations like sales planning, promotions, market campaigns, etc.
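To make this idea concrete, here is a minimal sketch in Python (not from the article; the figures and variable names are purely illustrative) of how a data team might check whether a Google Trends style interest index actually tracks monthly sales before trusting it as a forecasting signal:

```
# Hypothetical monthly figures: structured sales data plus an
# unstructured-data signal (a Google Trends style interest index).
import numpy as np

sales = np.array([5200, 4800, 5100, 5600, 6100, 5900])   # cars sold per month
search_interest = np.array([62, 58, 60, 67, 74, 71])     # brand search index

# How strongly does search interest track actual sales?
r = np.corrcoef(search_interest, sales)[0, 1]
print(f"correlation between search interest and sales: {r:.2f}")

# Only if the signal looks predictive, use it to nudge the naive forecast
if r > 0.7:
    forecast = sales[-1] * (search_interest[-1] / search_interest[-2])
    print(f"next-month forecast: {forecast:.0f} cars")
```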
**Open source is the key to DT**
Open source, nowadays, clearly dominates domains like Big Data, mobile and cloud platforms. Once open source becomes a key component that delivers a good financial performance, the momentum is unstoppable. Open source (often coupled with the cloud) is giving Big Data based companies like Google, Facebook and other Web giants flexibility to innovate faster.
Big Data companies are using DT to understand their processes, so that they can employ technologies like IoT, Big Data analytics, AI, etc, better. The journey of enterprises migrating from old digital infrastructure to new platforms is an exciting trend in the open source environment.
Organisations are relying on data warehouses and business intelligence applications to help make important data-driven business decisions. Different types of data, such as audio, video or unstructured data, are organised in formats that help identify them for making future decisions.
**Open source tools used in DT**
Several open source tools are becoming popular for dealing with Big Data and DT. Some of them are listed below.
* **Hadoop** is known for its ability to process extremely large data volumes in both structured and unstructured formats, reliably distributing Big Data across the nodes of a cluster and making it available locally on the processing machines.
* **MapReduce** happens to be a crucial component of Hadoop. It processes vast amounts of data in parallel on large clusters of computer nodes (a minimal sketch of the model appears after this list). It was originally developed by Google.
* **Storm** stands apart from the other tools with its distributed, real-time, fault-tolerant processing model, in contrast to Hadoop's batch processing. It is fast and highly scalable; it was open sourced by Twitter and is now an Apache project.
* **Apache Cassandra** is used by many organisations with large, active data sets, including Netflix, Twitter, Urban Airship, Cisco and Digg. Originally developed by Facebook, it is now managed by the Apache Software Foundation.
* **Kaggle** is the world's largest Big Data and data science community. It helps organisations and researchers to post their data sets and analyses, and allows programmers to explore, query and model large data sets quickly.
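The MapReduce model mentioned above is simple enough to sketch in a few lines of Python. The following is a local, single-process simulation of its map, shuffle and reduce phases; it is not the Hadoop API itself, which distributes these phases across a cluster:

```
# Word count, the classic MapReduce example, simulated in one process.
from collections import defaultdict

documents = [
    "big data needs open source",
    "open source drives digital transformation",
]

def map_phase(doc):
    # The map step emits (key, value) pairs: one ("word", 1) per word
    for word in doc.split():
        yield word, 1

def shuffle(pairs):
    # The shuffle step groups values by key, as Hadoop does between map and reduce
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # The reduce step combines all values for a key
    return key, sum(values)

mapped = (pair for doc in documents for pair in map_phase(doc))
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(counts)  # e.g. {'big': 1, 'data': 1, 'open': 2, 'source': 2, ...}
```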
**DT: A new innovation**
DT is the result of IT innovation. It is driven by well-planned business strategies, with the goal of inventing new business models. Today, any organisation can undergo business transformation because of three main business-focused essentials — intelligence, the ability to decide more quickly and a customer-centric outlook.
DT, which includes establishing Big Data analytics capabilities, poses considerable challenges for traditional manufacturing organisations, such as car companies. The successful introduction of Big Data analytics often requires substantial organisational transformation including new organisational structures and business processes.
Retail is one of the most active sectors when it comes to DT. JLab is an innovative DT venture by retail giant John Lewis, which offers lots of creativity and entrepreneurial dynamism. It backs five startups each year and helps them to bring their technologies to market. For example, Digital Bridge, a startup promoted by JLab, has developed a clever e-commerce website that allows shoppers to snap photos of their rooms and see what furniture and other products would look like in their own homes. It automatically detects walls and floors, and creates a photo-realistic virtual representation of the customer's room. Here, lighting and decoration can be changed, and products can be placed, rotated and repositioned with a realistic perspective.
Companies across the globe are going through digital business transformation as it helps to improve their business processes and leads to new business opportunities. The importance of Big Data in the business world can't be ignored. Nowadays, it is a key factor for success. There is a huge amount of valuable data which companies can use to improve their results and strategies. Today, every important decision can and should be supported by the application of data analytics.
Big Data and open source help DT do more for businesses. DT helps companies become digitally mature and gain a solid presence on the Internet. It helps companies to identify any drawbacks that may exist in their e-commerce system.
**Big Data in DT**
Data is critical, but it can't be used as a replacement for creativity. In other words, DT is not all about creativity versus data; it's about creativity enhanced by data.
Companies gather data to analyse and improve the customer experience, and then to create targeted messages emphasising the brand promise. But emotion, story-telling and human connections remain as essential as ever. The DT world today is dominated by Big Data. This is inevitable, given that business organisations want DT built on Big Data, so that the data is innovative, appealing and useful for attracting customers and, in turn, increasing sales.
Tesla cars today are equipped with sensors and IoT connections to gather a vast amount of data. Improvements based on this data are then fed back into the cars, creating a better driving experience.
**DT in India**
DT can transform businesses across every vertical in India. Data analytics has changed from being a good-to-have to a must-have technology.
According to a survey by Microsoft in partnership with International Data Corporation (IDC), by 2021, DT will add an estimated US$ 154 billion to India's GDP and increase the growth rate by 1 per cent annually. Ninety per cent of Indian organisations are in the midst of their DT journey. India is the biggest user of and contributor to open source technology. DT has created a new ripple across the whole of India and is one of the major drivers for the growth of open source. The government of India has encouraged the adoption of this new technology in the Digital India initiative, and this has further encouraged the CEOs of enterprises and other government organisations to make a move towards this technology.
The continuous DT in India is being driven faster with the adoption of emerging technologies like Big Data. That's one of the reasons why organisations today are investing in these technological capabilities. Businesses in India are recognising the challenges of DT and embracing them. Overall, it may be said that the new DT concept is more investor and technology friendly, in tune with the Make in India programme of the present government.
From finding ways to increase business efficiency and trimming costs, to retaining high-value customers, determining new revenue opportunities and preventing fraud, advanced analytics is playing an important role in the DT of Big Data based companies.
**The way forward**
Access to Big Data has changed the game for small and large businesses alike. Big Data can help businesses to solve almost every problem. DT helps companies to embrace a culture of change and remain competitive in a global environment. Losing weight is a lifestyle change, and so is the incorporation of Big Data into business strategies.
Big Data is the currency of tomorrow, and today, it is the fuel running a business. DT can harness it to a greater level.
![Avatar][3]
[Vinayak Ramachandra Adkoli][4]
The author is a B.E. in industrial production, and has been a lecturer in the mechanical engineering department for ten years at three different polytechnics. He is also a freelance writer and cartoonist. He can be contacted at [karnatakastory@gmail.com][5] or [vradkoli@rediffmail.com][6].
--------------------------------------------------------------------------------
via: https://opensourceforu.com/2019/11/open-source-big-data-solutions-support-digital-transformation/
作者:[Vinayak Ramachandra Adkoli][a]
选题:[lujun9972][b]
译者:[译者ID](https://github.com/译者ID)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://opensourceforu.com/author/vinayak-adkoli/
[b]: https://github.com/lujun9972
[1]: https://i0.wp.com/opensourceforu.com/wp-content/uploads/2019/11/Big-Data-.jpg?resize=696%2C517&ssl=1 (Big Data)
[2]: https://i0.wp.com/opensourceforu.com/wp-content/uploads/2019/11/Big-Data-.jpg?fit=800%2C594&ssl=1
[3]: https://secure.gravatar.com/avatar/7b4383616c8708e3417051b3afd64bbc?s=100&r=g
[4]: https://opensourceforu.com/author/vinayak-adkoli/
[5]: mailto:karnatakastory@gmail.com
[6]: mailto:vradkoli@rediffmail.com


@ -0,0 +1,69 @@
[#]: collector: (lujun9972)
[#]: translator: ( )
[#]: reviewer: ( )
[#]: publisher: ( )
[#]: url: ( )
[#]: subject: (A Bird's Eye View of Big Data for Enterprises)
[#]: via: (https://opensourceforu.com/2019/11/a-birds-eye-view-of-big-data-for-enterprises-2/)
[#]: author: (Swapneel Mehta https://opensourceforu.com/author/swapneel-mehta/)
A Bird's Eye View of Big Data for Enterprises
======
[![][1]][2]
_Entrepreneurial decisions are made using data and business acumen. Big Data is today a tool that helps to maximise revenue and customer engagement. Open source tools like Hadoop, Apache Spark and Apache Storm are the popular choices when it comes to analysing Big Data. As the volume and variety of data in the world grows by the day, there is great scope for the discovery of trends as well as for innovation in data analysis and storage._
In the past five years, the spate of research focused on machine learning has resulted in a boom in the nature and quality of heterogeneous data sources that are being tapped by providers for their customers. Cheaper compute and widespread storage make it much easier to apply bulk data processing techniques and derive insights from existing and unexplored sources of rich user data, including logs and traces of activity whilst using software products. Business decision making and strategy have been primarily dictated by data and are usually supported by business acumen. But in recent times it has not been uncommon to see data providing conclusions seemingly in contrast with conventional business logic.
One could take the simple example of the baseball movie Moneyball, in which the protagonist defies all notions of popular wisdom by looking solely at performance statistics to evaluate player viability, eventually building a winning team of players, a team that would otherwise never have come together. The advantage of Big Data for enterprises, then, becomes a no-brainer for most corporate entities looking to maximise revenue and engagement. At the back-end, this is accomplished by popular combinations of existing tools specially designed for large scale, multi-purpose data analysis. Apache Hadoop and Apache Spark are some of the most widespread open source tools used in this space in the industry. Concomitantly, it is easy to imagine that there are a number of software providers offering B2B services to corporate clients looking to outsource specific portions of their analytics. Therefore, there is a bustling market with customisable, proprietary technological solutions in this space as well.
![Figure 1: A crowded landscape to follow \(Source: Forbes\)][3]
Traditionally, Big Data refers to the large volumes of unstructured and heterogeneous data that are often processed in order to provide insights and improve decision-making regarding critical business processes. The McKinsey Global Institute estimates that data volumes have been growing at 40 per cent per year and will grow 44x between the years 2009 and 2020. But there is more to Big Data than just its immense volume. The rate of data production is an important factor, given that smaller data streams generated at faster rates produce larger pools than their counterparts. Social media is a great example of how small networks can expand rapidly to become rich sources of information — up to massive, billion-node scales.
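As a quick sanity check on those numbers (my own arithmetic, not part of the McKinsey estimate), compounding 40 per cent annual growth over the eleven years from 2009 to 2020 gives a multiplier in the same ballpark as the quoted 44x:

```
# Compound 40 per cent annual growth from 2009 to 2020
growth_rate = 0.40
years = 2020 - 2009                      # 11 years
multiplier = (1 + growth_rate) ** years
print(f"{multiplier:.1f}x")              # roughly 40x, close to the quoted 44x
```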
Structure in data is a highly variable attribute given that data is now extracted from across the entire spectrum of user activity. Conventional formats of storage, including relational databases, have been virtually replaced by massively unstructured data pools designed to be leveraged in manners unique to their respective use cases. In fact, there has been a huge body of work on data storage in order to leverage various write formats, compression algorithms, access methods and data structures to arrive at the best combination for improving productivity of the workflow reliant on that data. A variety of these combinations has emerged to set the industry standards in their respective verticals, with the benefits ranging from efficient storage to faster access.
Finally, we have the latent value in these data pools that remains to be exploited by the use of emerging trends in artificial intelligence and machine learning. Personalised advertising recommendations are a huge factor driving revenue for social media giants like Facebook and companies like Google that offer a suite of products and an ecosystem to use them. The well-known Silicon Valley giant started out as a search provider, but now controls a host of apps and most of the entry points for the data generated in the course of people using a variety of electronic devices across the world. Established financial institutions are now exploring the possibility of a portion of user data being put on an immutable public ledger to introduce a blockchain-like structure that can open the doors to innovation. The pace is picking up as product offerings improve in quality and expand in variety. Let's get a bird's eye view of this subject to understand where the market stands.
The idea behind building better frameworks is increasingly turning into a race to provide more add-on features and simplify workflows for the end user to engage with. This means the categories have many blurred lines because most products and tools present themselves as end-to-end platforms to manage Big Data analytics. However, we'll attempt to divide this broadly into a few categories and examine some providers in each of these.
**Big Data storage and processing**
Infrastructure is the key to building a reliable workflow when it comes to enterprise use cases. Earlier, relational databases were worthwhile to invest in for small and mid-sized firms. However, when the data starts pouring in, it is usually the scalability that is put to the test first. Building a flexible infrastructure comes at the cost of complexity. It is likely to have more moving parts that can cause failure in the short term. However, if done right (something that will not be easy, because it has to be tailored exactly to your company), it can result in life-changing improvements for both users and the engineers working with the said infrastructure to build and deliver state-of-the-art products.
There are many alternatives to SQL, with the NoSQL paradigm being adopted and modified for building different types of systems. Cassandra, MongoDB and CouchDB are some well-known alternatives. Most emerging options can be distinguished based on their disruption, which is aimed at the fundamental ACID properties of databases. To recall, a transaction in a database system must maintain atomicity, consistency, isolation and durability, commonly known as the ACID properties, in order to ensure accuracy, completeness and data integrity (from Tutorialspoint). For instance, CockroachDB, an open source database inspired by Google's Spanner system, has gained traction due to its built-in support for distribution. Redis and HBase offer a sort of hybrid storage solution, while Neo4j remains a flag bearer for graph structured databases. However, traditional areas aside, there are always new challenges on the horizon for building enterprise software.
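To make the ACID recap above concrete, here is a small sketch of atomicity using Python's built-in sqlite3 module (chosen only because it ships with Python, not one of the NoSQL systems discussed): a transfer that violates a business rule is rolled back, so the database never shows a half-completed transaction.

```
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
con.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
con.commit()

try:
    with con:  # opens a transaction; commits on success, rolls back on error
        con.execute("UPDATE accounts SET balance = balance - 150 WHERE name = 'alice'")
        (balance,) = con.execute(
            "SELECT balance FROM accounts WHERE name = 'alice'").fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")  # aborts the whole transaction
        con.execute("UPDATE accounts SET balance = balance + 150 WHERE name = 'bob'")
except ValueError:
    pass

# Atomicity: both balances are unchanged, not just one of them
print(con.execute("SELECT * FROM accounts ORDER BY name").fetchall())
# [('alice', 100), ('bob', 0)]
```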
Backups are one such area where startups have found viable disruption points to enter the market. Cloud backups for enterprise software are expensive, non-trivial procedures and offloading this work to proprietary software offers a lucrative business opportunity. Rubrik and Cohesity are two companies that originally started out in this space and evolved to offer added services atop their primary offerings. Clumio is a recent entrant, purportedly creating a data fabric that the promoters expect will serve as a foundational layer to run analytics on top of. It is interesting to follow recent developments in this burgeoning space as we see competitors enter the market and attempt to carve a niche for themselves with their product offerings.
**Big Data analytics in the cloud**
Apache Hadoop remains the popular choice for many organisations. However, many successors have emerged to offer a set of additional analytical capabilities: Apache Spark, commonly hailed as an improvement to the Hadoop ecosystem; Apache Storm, which offers real-time data processing capabilities; and Google's BigQuery, which is supposedly a full-fledged platform for Big Data analytics.
Typically, cloud providers such as Amazon Web Services and Google Cloud Platform tend to build in-house products leveraging these capabilities, or replicate them entirely and offer them as hosted services to businesses. This helps them provide enterprise offerings that are closely integrated within their respective cloud computing ecosystem. There has been some discussion about the moral consequences of replicating open source products to profit off closed source versions of the same, but there has been no consensus on the topic, nor any severe consequences suffered on account of this questionable approach to boost revenue.
Another hosted service offering a plethora of Big Data analytics tools is Cloudera, which has an established track record in the market. It has been making waves since its merger with Hortonworks earlier this year, giving it added fuel to compete with the giants in its bid to become the leading enterprise cloud provider in the market.
Overall, we've seen interesting developments in the Big Data storage and analysis domain, and as the volume and variety of data grows, so do the opportunities to innovate in the field.
![Avatar][4]
[Swapneel Mehta][5]
The author has worked at Microsoft Research, CERN and startups in AI and cyber security. He is an open source enthusiast who enjoys spending time organising software development workshops for school and college students. You can contact him at <https://www.linkedin.com/in/swapneelm>; <https://github.com/SwapneelM> or <http://www.ccdev.in>.
--------------------------------------------------------------------------------
via: https://opensourceforu.com/2019/11/a-birds-eye-view-of-big-data-for-enterprises-2/
作者:[Swapneel Mehta][a]
选题:[lujun9972][b]
译者:[译者ID](https://github.com/译者ID)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://opensourceforu.com/author/swapneel-mehta/
[b]: https://github.com/lujun9972
[1]: https://i1.wp.com/opensourceforu.com/wp-content/uploads/2019/11/Figure-1-Big-Data-analytics-and-processing-for-the-enterprise.jpg?resize=696%2C449&ssl=1 (Figure 1 Big Data analytics and processing for the enterprise)
[2]: https://i1.wp.com/opensourceforu.com/wp-content/uploads/2019/11/Figure-1-Big-Data-analytics-and-processing-for-the-enterprise.jpg?fit=900%2C580&ssl=1
[3]: https://i0.wp.com/opensourceforu.com/wp-content/uploads/2019/11/Figure-2-A-crowded-landscape-to-follow.jpg?resize=350%2C254&ssl=1
[4]: https://secure.gravatar.com/avatar/2ba7abaf240a1f6166d506dccdcda00f?s=100&r=g
[5]: https://opensourceforu.com/author/swapneel-mehta/


@ -0,0 +1,94 @@
[#]: collector: (lujun9972)
[#]: translator: ( )
[#]: reviewer: ( )
[#]: publisher: ( )
[#]: url: ( )
[#]: subject: (AI and 5G: Entering a new world of data)
[#]: via: (https://www.networkworld.com/article/3451718/ai-and-5g-entering-a-new-world-of-data.html)
[#]: author: (Matt Conran https://www.networkworld.com/author/Matt-Conran/)
AI and 5G: Entering a new world of data
======
The deployment model of vendor-centric equipment cannot sustain this exponential growth in traffic.
[Stinging Eyes][1] [(CC BY-SA 2.0)][2]
Today the telecom industry has identified the need for faster end-user data rates. Previously, users were happy to call and text each other. However, mobile communication has now transformed our lives so dramatically that it is hard to imagine being limited to that type of communication anymore.
Nowadays, we are leaning more towards imaging and VR/AR video-based communication, and these applications call for a new type of network. Immersive experiences with 360° video applications require a lot of data and a zero-lag network.
To give you a quick idea, VR with a resolution equivalent to 4K TV would require a bandwidth of 1 Gbps for smooth play, or 2.5 Gbps for interactive use; both need a latency of no more than 10 ms, and that is round-trip time. Soon these applications will target the smartphone, putting additional strain on networks. As AR/VR services grow in popularity, the proposed 5G networks will yield the speed and the performance needed.
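For a sense of scale (back-of-the-envelope arithmetic in Python based on the quoted rates, not figures from the article), a sustained 1 Gbps VR stream works out to roughly 450 GB per hour per user:

```
# Hourly data volume implied by the quoted VR bit rates
BITS_PER_GBIT = 1e9
BITS_PER_BYTE = 8
SECONDS_PER_HOUR = 3600

for label, gbps in [("smooth 4K-equivalent VR", 1.0), ("interactive VR", 2.5)]:
    gigabytes_per_hour = gbps * BITS_PER_GBIT / BITS_PER_BYTE * SECONDS_PER_HOUR / 1e9
    print(f"{label}: ~{gigabytes_per_hour:.0f} GB per hour per user")
# smooth 4K-equivalent VR: ~450 GB per hour per user
# interactive VR: ~1125 GB per hour per user
```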
Every [IoT device][3] _[Disclaimer: The author works for Network Insight]_, no matter how dumb it is, will create data and this data is the fuel for the engine of AI. AI enables us to do more interesting things with the data. The ultimate goal of the massive amount of data we will witness is the ability to turn this data into value. The rise in data from the enablement of 5G represents the biggest opportunity for AI.
There will be unprecedented levels of data that will have to move across the network for processing and in some cases be cached locally to ensure low latency. For this, we primarily need to move the processing closer to the user to utilize ultra-low latency and ultra-high throughput.
### Some challenges with 5G
The introduction of 5G is not without challenges. It's expensive, and it is distributed in ways that networks have not been in the past. There is an extensive cost involved in building this type of network. Location is central to effective planning, deployment and optimization of 5G networks.
Also, the 5G millimeter wave comes with its own challenges. There are techniques that allow you to take the signal and send it towards a specific customer instead of sending it in every direction. The old way would be similar to a light bulb that reaches all parts of the room, as opposed to a flashlight that targets specific areas.
[The time of 5G is almost here][4]
So, choosing the right location plays a key role in the development and deployment of 5G networks. Therefore, you must analyze if you are building in the right place, and are marketing to the right targets. How many new subscribers do you expect to sign up for the services if you choose one area over the other? You need to take into account the population that travels around that area, the building structures and how easy it is to get the signal.
Moreover, we must understand the potential of flooding and analyze real-time weather to predict changes in traffic. So, if there is a thunderstorm, we need to understand how such events influence the needs of the networks and then make predictive calculations. AI can certainly assist in predicting these events.
### AI, a doorway to opportunity
5G is introducing new challenges, but integrating AI techniques into networks is one way the industry is addressing these complexities. AI is a key component that needs to be adopted by the network to help manage and control this change. Another important use case for AI is network planning and operations.
With 5G, we will have hundreds of thousands of small cells everywhere, with each cell connected to a fiber line. It has been predicted that we could have 10 million cells globally. Figuring out how to plan and design all these cells would be beyond human capability. This is where AI can do site evaluations and tell you what throughput you would get with certain designs.
AI can help build out the 5G infrastructure and map out the location of cell towers to pinpoint the best location for the 5G rollout. It can continuously monitor how the network is being used. If one of the cell towers is not functioning as expected, AI can signal to another cell tower to take over.
### Vendor-centric equipment cannot sustain 5G
With the enablement of 5G networks, we have a huge amount of data. In some cases, this could be high in the PB region per day; the majority of this will be due to video-based applications. A deployment model of vendor-centric equipment cannot sustain this exponential growth in traffic.
We will witness a lot of open source in this area, with processing and compute, storage and network functionality moving to the edge. Eventually, this will create a real-time network at the edge.
### More processing at the edge
Edge computing involves placing compute, servers and networking at the very edge of the network, closer to the user. It provides intelligence at the edge, thereby reducing the amount of traffic going to the backbone.
Edge computing can, for example, allow AI object identification to reach target recognition in under 0.35 seconds. Essentially, we have an image recognition deep learning algorithm that sits on the edge. The algorithm sitting on the edge of the network will help to reduce the traffic sent to the backbone.
However, this also opens up a new attack surface, and luckily AI plays well with cybersecurity. A closed-loop system will collect data at the network edge, identify threats and take real-time action.
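The closed-loop idea can be sketched roughly as follows in Python (everything here is hypothetical: the detection function is a stand-in for an on-edge model, and the thresholds are invented for illustration). The edge node runs detection locally, acts immediately on a hit, and forwards only flagged events, so most traffic never reaches the backbone.

```
import random
import time

def detect(frame_id):
    # Stand-in for an on-edge image-recognition model
    return random.random() > 0.8          # pretend ~20% of frames are threats

def act_locally(frame_id):
    print(f"frame {frame_id}: threat identified, real-time action taken")

def send_to_backbone(frame_id, latency_ms):
    print(f"frame {frame_id}: flagged event forwarded ({latency_ms:.2f} ms)")

for frame_id in range(10):                # stand-in for a camera feed
    start = time.perf_counter()
    hit = detect(frame_id)
    latency_ms = (time.perf_counter() - start) * 1000
    if hit:
        act_locally(frame_id)             # keep within the ~0.35 s budget
        send_to_backbone(frame_id, latency_ms)
    # non-threat frames are dropped at the edge instead of crossing the network
```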
### Edge and open source
We have a few popular open-source options available at our disposal. Some examples of open source edge computing are the Akraino Edge Stack, ONAP (the Open Network Automation Platform) and the Airship Open Infrastructure Project.
The Akraino Edge Stack creates an open-source software stack that supports high-availability cloud services. These services are optimized for edge computing systems and applications.
The Akraino R1 release includes 10 “ready and proven” blueprints and delivers a fully functional edge stack for edge use cases. These cover Industrial IoT, Telco 5G Core & vRAN, uCPE, SD-WAN, edge media processing and carrier edge media processing.
ONAP (the Open Network Automation Platform) provides a comprehensive platform for real-time, policy-driven orchestration and automation of physical and virtual network functions. It is an open-source networking project hosted by the Linux Foundation.
Finally, the Airship Open Infrastructure Project is a collection of open-source tools for automating cloud provisioning and management. These tools include OpenStack for virtual machines, Kubernetes for container orchestration and MaaS for bare metal, with planned support for OpenStack Ironic.
**This article is published as part of the IDG Contributor Network. [Want to Join?][5]**
Join the Network World communities on [Facebook][6] and [LinkedIn][7] to comment on topics that are top of mind.
--------------------------------------------------------------------------------
via: https://www.networkworld.com/article/3451718/ai-and-5g-entering-a-new-world-of-data.html
作者:[Matt Conran][a]
选题:[lujun9972][b]
译者:[译者ID](https://github.com/译者ID)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://www.networkworld.com/author/Matt-Conran/
[b]: https://github.com/lujun9972
[1]: https://www.flickr.com/photos/martinlatter/4233363677
[2]: https://creativecommons.org/licenses/by-sa/2.0/legalcode
[3]: https://network-insight.net/2017/10/internet-things-iot-dissolving-cloud/
[4]: https://www.networkworld.com/article/3354477/mobile-world-congress-the-time-of-5g-is-almost-here.html
[5]: https://www.networkworld.com/contributor-network/signup.html
[6]: https://www.facebook.com/NetworkWorld/
[7]: https://www.linkedin.com/company/network-world


@ -0,0 +1,61 @@
[#]: collector: (lujun9972)
[#]: translator: ( )
[#]: reviewer: ( )
[#]: publisher: ( )
[#]: url: ( )
[#]: subject: (Forrester: Edge computing is about to bloom)
[#]: via: (https://www.networkworld.com/article/3451532/forrester-edge-computing-is-about-to-bloom.html)
[#]: author: (Jon Gold https://www.networkworld.com/author/Jon-Gold/)
Forrester: Edge computing is about to bloom
======
2020 is set to be a “breakout year” for edge computing technology, according to the latest research from Forrester Research
Getty Images
The next calendar year will be the one that propels [edge computing][1] into the enterprise technology limelight for good, according to a set of predictions from Forrester Research.
While edge computing is primarily an [IoT][2]-related phenomenon, Forrester said that addressing the need for on-demand compute and real-time app engagements will also play a role in driving the growth of edge computing in 2020.
[[Get regularly scheduled insights by signing up for Network World newsletters.]][3]
What it all boils down to, in some ways, is that form factors will shift sharply away from traditional rack, blade or tower servers in the coming year, depending on where the edge technology is deployed. An autonomous car, for example, won't be able to run a traditionally constructed server.
It'll also mean that telecom companies will begin to feature a lot more heavily in the cloud and distributed-computing markets. Forrester said that CDNs and [colocation vendors][5] could become juicy acquisition targets for big telecom, which missed the boat on cloud computing to a certain extent, and is eager to be a bigger part of the edge. They're also investing in open-source projects like Akraino, an edge software stack designed to support carrier availability.
But the biggest carrier impact on edge computing in 2020 will undoubtedly be the growing availability of [5G][6] network coverage, Forrester says. While that availability will still mostly be confined to major cities, that should be enough to prompt reconsideration of edge strategies by businesses that want to take advantage of capabilities like smart, real-time video processing, 3D mapping for worker productivity and use cases involving autonomous robots or drones.
Beyond the carriers, there's a huge range of players in edge computing, all of which have their eyes firmly on the future. Operational-device makers in every field from medicine to utilities to heavy industry will need custom edge devices for connectivity and control, huge cloud vendors will look to consolidate their hold over that end of the market, and AI/ML startups will look to enable brand-new levels of insight and functionality.
What's more, the average edge-computing implementation will often use many of them at the same time, according to Forrester, which noted that integrators who can pull products and services from many different vendors into a single system will be highly sought-after in the coming year. Multivendor solutions are likely to be much more popular than single-vendor ones, in large part because few individual companies have products that address all parts of the edge and IoT stacks.
Join the Network World communities on [Facebook][7] and [LinkedIn][8] to comment on topics that are top of mind.
--------------------------------------------------------------------------------
via: https://www.networkworld.com/article/3451532/forrester-edge-computing-is-about-to-bloom.html
作者:[Jon Gold][a]
选题:[lujun9972][b]
译者:[译者ID](https://github.com/译者ID)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://www.networkworld.com/author/Jon-Gold/
[b]: https://github.com/lujun9972
[1]: https://www.networkworld.com/article/3224893/what-is-edge-computing-and-how-it-s-changing-the-network.html
[2]: https://www.networkworld.com/article/3207535/what-is-iot-how-the-internet-of-things-works.html
[3]: https://www.networkworld.com/newsletters/signup.html
[4]: https://www.networkworld.com/article/3440100/take-the-intelligent-route-with-consumption-based-storage.html?utm_source=IDG&utm_medium=promotions&utm_campaign=HPE20773&utm_content=sidebar ( Take the Intelligent Route with Consumption-Based Storage)
[5]: https://www.networkworld.com/article/3407756/colocation-facilities-buck-the-cloud-data-center-trend.html
[6]: https://www.networkworld.com/article/3203489/what-is-5g-how-is-it-better-than-4g.html
[7]: https://www.facebook.com/NetworkWorld/
[8]: https://www.linkedin.com/company/network-world


@ -1,452 +0,0 @@
[#]: collector: (lujun9972)
[#]: translator: ( )
[#]: reviewer: ( )
[#]: publisher: ( )
[#]: url: ( )
[#]: subject: (Understanding system calls on Linux with strace)
[#]: via: (https://opensource.com/article/19/10/strace)
[#]: author: (Gaurav Kamathe https://opensource.com/users/gkamathe)
Understanding system calls on Linux with strace
======
Trace the thin layer between user processes and the Linux kernel with
strace.
![Hand putting a Linux file folder into a drawer][1]
A system call is a programmatic way a program requests a service from the kernel, and **strace** is a powerful tool that allows you to trace the thin layer between user processes and the Linux kernel.
To understand how an operating system works, you first need to understand how system calls work. One of the main functions of an operating system is to provide abstractions to user programs.
An operating system can roughly be divided into two modes:
* **Kernel mode:** A privileged and powerful mode used by the operating system kernel
* **User mode:** Where most user applications run
Users mostly work with command-line utilities and graphical user interfaces (GUI) to do day-to-day tasks. System calls work silently in the background, interfacing with the kernel to get work done.
System calls are very similar to function calls, which means they accept and work on arguments and return values. The only difference is that system calls enter a kernel, while function calls do not. Switching from user space to kernel space is done using a special [trap][2] mechanism.
Most of this is hidden away from the user by using system libraries (aka **glibc** on Linux systems). Even though system calls are generic in nature, the mechanics of issuing a system call are very much machine-dependent.
This article explores some practical examples by using some general commands and analyzing the system calls made by each command using **strace**. These examples use Red Hat Enterprise Linux, but the commands should work the same on other Linux distros:
```
[root@sandbox ~]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.7 (Maipo)
[root@sandbox ~]#
[root@sandbox ~]# uname -r
3.10.0-1062.el7.x86_64
[root@sandbox ~]#
```
First, ensure that the required tools are installed on your system. You can verify whether **strace** is installed using the RPM command below; if it is, you can check the **strace** utility version number using the **-V** option:
```
[root@sandbox ~]# rpm -qa | grep -i strace
strace-4.12-9.el7.x86_64
[root@sandbox ~]#
[root@sandbox ~]# strace -V
strace -- version 4.12
[root@sandbox ~]#
```
If that doesn't work, install **strace** by running:
```
`yum install strace`
```
For the purpose of this example, create a test directory within **/tmp** and create two files using the **touch** command:
```
[root@sandbox ~]# cd /tmp/
[root@sandbox tmp]#
[root@sandbox tmp]# mkdir testdir
[root@sandbox tmp]#
[root@sandbox tmp]# touch testdir/file1
[root@sandbox tmp]# touch testdir/file2
[root@sandbox tmp]#
```
(I used the **/tmp** directory because everybody has access to it, but you can choose another directory if you prefer.)
Verify that the files were created using the **ls** command on the **testdir** directory:
```
[root@sandbox tmp]# ls testdir/
file1  file2
[root@sandbox tmp]#
```
You probably use the **ls** command every day without realizing system calls are at work underneath it. There is abstraction at play here; here's how this command works:
```
`Command-line utility -> Invokes functions from system libraries (glibc) -> Invokes system calls`
```
The **ls** command internally calls functions from system libraries (aka **glibc**) on Linux. These libraries invoke the system calls that do most of the work.
If you want to know which functions were called from the **glibc** library, use the **ltrace** command followed by the regular **ls testdir/** command:
```
`ltrace ls testdir/`
```
If **ltrace** is not installed, install it by entering:
```
`yum install ltrace`
```
A bunch of output will be dumped to the screen; don't worry about it—just follow along. Some of the important library functions from the output of the **ltrace** command that are relevant to this example include:
```
opendir("testdir/")                                  = { 3 }
readdir({ 3 })                                       = { 101879119, "." }
readdir({ 3 })                                       = { 134, ".." }
readdir({ 3 })                                       = { 101879120, "file1" }
strlen("file1")                                      = 5
memcpy(0x1665be0, "file1\0", 6)                      = 0x1665be0
readdir({ 3 })                                       = { 101879122, "file2" }
strlen("file2")                                      = 5
memcpy(0x166dcb0, "file2\0", 6)                      = 0x166dcb0
readdir({ 3 })                                       = nil
closedir({ 3 })                      
```
By looking at the output above, you probably can understand what is happening. A directory called **testdir** is being opened by the **opendir** library function, followed by calls to the **readdir** function, which is reading the contents of the directory. At the end, there is a call to the **closedir** function, which closes the directory that was opened earlier. Ignore the other **strlen** and **memcpy** functions for now.
You can see which library functions are being called, but this article will focus on system calls that are invoked by the system library functions.
Similar to the above, to understand what system calls are invoked, just put **strace** before the **ls testdir** command, as shown below. Once again, a bunch of gibberish will be dumped to your screen, which you can follow along with here:
```
[root@sandbox tmp]# strace ls testdir/
execve("/usr/bin/ls", ["ls", "testdir/"], [/* 40 vars */]) = 0
brk(NULL)                               = 0x1f12000
<<< truncated strace output >>>
write(1, "file1  file2\n", 13file1  file2
)          = 13
close(1)                                = 0
munmap(0x7fd002c8d000, 4096)            = 0
close(2)                                = 0
exit_group(0)                           = ?
+++ exited with 0 +++
[root@sandbox tmp]#
```
The output on the screen after running the **strace** command was simply system calls made to run the **ls** command. Each system call serves a specific purpose for the operating system, and they can be broadly categorized into the following sections:
* Process management system calls
* File management system calls
* Directory and filesystem management system calls
* Other system calls
An easier way to analyze the information dumped onto your screen is to log the output to a file using **strace**'s handy **-o** flag. Add a suitable file name after the **-o** flag and run the command again:
```
[root@sandbox tmp]# strace -o trace.log ls testdir/
file1  file2
[root@sandbox tmp]#
```
This time, no output dumped to the screen—the **ls** command worked as expected by showing the file names and logging all the output to the file **trace.log**. The file has almost 100 lines of content just for a simple **ls** command:
```
[root@sandbox tmp]# ls -l trace.log
-rw-r--r--. 1 root root 7809 Oct 12 13:52 trace.log
[root@sandbox tmp]#
[root@sandbox tmp]# wc -l trace.log
114 trace.log
[root@sandbox tmp]#
```
Take a look at the first line in the example's trace.log:
```
`execve("/usr/bin/ls", ["ls", "testdir/"], [/* 40 vars */]) = 0`
```
* The first word of the line, **execve**, is the name of a system call being executed.
* The text within the parentheses is the arguments provided to the system call.
* The number after the **=** sign (which is **0** in this case) is a value returned by the **execve** system call.
The output doesn't seem too intimidating now, does it? And you can apply the same logic to understand other lines.
Now, narrow your focus to the single command that you invoked, i.e., **ls testdir**. You know the directory name used by the command **ls**, so why not **grep** for **testdir** within your **trace.log** file and see what you get? Look at each line of the results in detail:
```
[root@sandbox tmp]# grep testdir trace.log
execve("/usr/bin/ls", ["ls", "testdir/"], [/* 40 vars */]) = 0
stat("testdir/", {st_mode=S_IFDIR|0755, st_size=32, ...}) = 0
openat(AT_FDCWD, "testdir/", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
[root@sandbox tmp]#
```
Thinking back to the analysis of **execve** above, can you tell what this system call does?
```
`execve("/usr/bin/ls", ["ls", "testdir/"], [/* 40 vars */]) = 0`
```
You don't need to memorize all the system calls or what they do, because you can refer to documentation when you need to. Man pages to the rescue! Ensure the following package is installed before running the **man** command:
```
[root@sandbox tmp]# rpm -qa | grep -i man-pages
man-pages-3.53-5.el7.noarch
[root@sandbox tmp]#
```
Remember that you need to add a **2** between the **man** command and the system call name. If you read **man**'s man page using **man man**, you can see that section 2 is reserved for system calls. Similarly, if you need information on library functions, you need to add a **3** between **man** and the library function name.
The following are the manual's section numbers and the types of pages they contain:
```
1\. Executable programs or shell commands
2\. System calls (functions provided by the kernel)
3\. Library calls (functions within program libraries)
4\. Special files (usually found in /dev)
```
Run the following **man** command with the system call name to see the documentation for that system call:
```
`man 2 execve`
```
As per the **execve** man page, this executes a program that is passed in the arguments (in this case, that is **ls**). There are additional arguments that can be provided to **ls**, such as **testdir** in this example. Therefore, this system call just runs **ls** with **testdir** as the argument:
```
'execve - execute program'
'DESCRIPTION
       execve()  executes  the  program  pointed to by filename'
```
The next system call, named **stat**, uses the **testdir** argument:
```
`stat("testdir/", {st_mode=S_IFDIR|0755, st_size=32, ...}) = 0`
```
Use **man 2 stat** to access the documentation. **stat** is the system call that gets a file's status—remember that everything in Linux is a file, including a directory.
Next, the **openat** system call opens **testdir.** Keep an eye on the **3** that is returned. This is a file descriptor, which will be used by later system calls:
```
`openat(AT_FDCWD, "testdir/", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3`
```
So far, so good. Now, open the **trace.log** file and go to the line following the **openat** system call. You will see the **getdents** system call being invoked, which does most of what is required to execute the **ls testdir** command. Now, **grep getdents** from the **trace.log** file:
```
[root@sandbox tmp]# grep getdents trace.log
getdents(3, /* 4 entries */, 32768)     = 112
getdents(3, /* 0 entries */, 32768)     = 0
[root@sandbox tmp]#
```
The **getdents** man page describes it as **get directory entries**, which is what you want to do. Notice that the argument for **getdents** is **3**, which is the file descriptor from the **openat** system call above.
Now that you have the directory listing, you need a way to display it in your terminal. So, **grep** for another system call, **write**, which is used to write to the terminal, in the logs:
```
[root@sandbox tmp]# grep write trace.log
write(1, "file1  file2\n", 13)          = 13
[root@sandbox tmp]#
```
In these arguments, you can see the file names that will be displayed: **file1** and **file2**. Regarding the first argument (**1**), remember in Linux that, when any process is run, three file descriptors are opened for it by default. Following are the default file descriptors:
* 0 - Standard input
* 1 - Standard out
* 2 - Standard error
So, the **write** system call is displaying **file1** and **file2** on the standard display, which is the terminal, identified by **1**.
Now you know which system calls did most of the work for the **ls testdir/** command. But what about the other 100+ system calls in the **trace.log** file? The operating system has to do a lot of housekeeping to run a process, so a lot of what you see in the log file is process initialization and cleanup. Read the entire **trace.log** file and try to understand what is happening to make the **ls** command work.
Now that you know how to analyze system calls for a given command, you can use this knowledge for other commands to understand what system calls are being executed. **strace** provides a lot of useful command-line flags to make it easier for you, and some of them are described below.
By default, **strace** does not include all system call information. However, it has a handy **-v verbose** option that can provide additional information on each system call:
```
`strace -v ls testdir`
```
It is good practice to always use the **-f** option when running the **strace** command. It allows **strace** to trace any child processes created by the process currently being traced:
```
`strace -f ls testdir`
```
Say you just want the names of system calls, the number of times they ran, and the percentage of time spent in each system call. You can use the **-c** flag to get those statistics:
```
`strace -c ls testdir/`
```
Suppose you want to concentrate on a specific system call, such as focusing on **open** system calls and ignoring the rest. You can use the **-e** flag followed by the system call name:
```
[root@sandbox tmp]# strace -e open ls testdir
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libselinux.so.1", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libcap.so.2", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libacl.so.1", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libpcre.so.1", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libdl.so.2", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libattr.so.1", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libpthread.so.0", O_RDONLY|O_CLOEXEC) = 3
open("/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
file1  file2
+++ exited with 0 +++
[root@sandbox tmp]#
```
What if you want to concentrate on more than one system call? No worries, you can use the same **-e** command-line flag with a comma between the two system calls. For example, to see the **write** and **getdents** system calls:
```
[root@sandbox tmp]# strace -e write,getdents ls testdir
getdents(3, /* 4 entries */, 32768)     = 112
getdents(3, /* 0 entries */, 32768)     = 0
write(1, "file1  file2\n", 13file1  file2
)          = 13
+++ exited with 0 +++
[root@sandbox tmp]#
```
The examples so far have traced explicitly run commands. But what about commands that have already been run and are in execution? What, for example, if you want to trace daemons that are just long-running processes? For this, **strace** provides a special **-p** flag to which you can provide a process ID.
Instead of running a **strace** on a daemon, take the example of a **cat** command, which usually displays the contents of a file if you give a file name as an argument. If no argument is given, the **cat** command simply waits at a terminal for the user to enter text. Once text is entered, it repeats the given text until a user presses Ctrl+C to exit.
Run the **cat** command from one terminal; it will show you a prompt and simply wait there (remember **cat** is still running and has not exited):
```
`[root@sandbox tmp]# cat`
```
From another terminal, find the process identifier (PID) using the **ps** command:
```
[root@sandbox ~]# ps -ef | grep cat
root      22443  20164  0 14:19 pts/0    00:00:00 cat
root      22482  20300  0 14:20 pts/1    00:00:00 grep --color=auto cat
[root@sandbox ~]#
```
Now, run **strace** on the running process with the **-p** flag and the PID (which you found above using **ps**). After running **strace**, the output states what the process was attached to along with the PID number. Now, **strace** is tracing the system calls made by the **cat** command. The first system call you see is **read**, which is waiting for input from 0, or standard input, which is the terminal where the **cat** command ran:
```
[root@sandbox ~]# strace -p 22443
strace: Process 22443 attached
read(0,
```
Now, move back to the terminal where you left the **cat** command running and enter some text. I entered **x0x0** for demo purposes. Notice how **cat** simply repeated what I entered; hence, **x0x0** appears twice. I input the first one, and the second one was the output repeated by the **cat** command:
```
[root@sandbox tmp]# cat
x0x0
x0x0
```
Move back to the terminal where **strace** was attached to the **cat** process. You now see two additional system calls: the earlier **read** system call, which now reads **x0x0** in the terminal, and another for **write**, which wrote **x0x0** back to the terminal, and again a new **read**, which is waiting to read from the terminal. Note that Standard input (**0**) and Standard out (**1**) are both in the same terminal:
```
[root@sandbox ~]# strace -p 22443
strace: Process 22443 attached
read(0, "x0x0\n", 65536)                = 5
write(1, "x0x0\n", 5)                   = 5
read(0,
```
Imagine how helpful this is when running **strace** against daemons to see everything they do in the background. Kill the **cat** command by pressing Ctrl+C; this also kills your **strace** session since the process is no longer running.
If you want to see a timestamp against all your system calls, simply use the **-t** option with **strace**:
```
[root@sandbox ~]#strace -t ls testdir/
14:24:47 execve("/usr/bin/ls", ["ls", "testdir/"], [/* 40 vars */]) = 0
14:24:47 brk(NULL)                      = 0x1f07000
14:24:47 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f2530bc8000
14:24:47 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
14:24:47 open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
```
What if you want to know the time spent between system calls? **strace** has a handy **-r** option that shows the time spent executing each system call. Pretty useful, isn't it?
```
[root@sandbox ~]#strace -r ls testdir/
0.000000 execve("/usr/bin/ls", ["ls", "testdir/"], [/* 40 vars */]) = 0
0.000368 brk(NULL)                 = 0x1966000
0.000073 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb6b1155000
0.000047 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
0.000119 open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
```
### Conclusion
The **strace** utility is very handy for understanding system calls on Linux. To learn about its other command-line flags, please refer to the man pages and online documentation.
--------------------------------------------------------------------------------
via: https://opensource.com/article/19/10/strace
作者:[Gaurav Kamathe][a]
选题:[lujun9972][b]
译者:[译者ID](https://github.com/译者ID)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://opensource.com/users/gkamathe
[b]: https://github.com/lujun9972
[1]: https://opensource.com/sites/default/files/styles/image-full-size/public/lead-images/yearbook-haff-rx-linux-file-lead_0.png?itok=-i0NNfDC (Hand putting a Linux file folder into a drawer)
[2]: https://en.wikipedia.org/wiki/Trap_(computing)


@ -1,168 +0,0 @@
[#]: collector: (lujun9972)
[#]: translator: (geekpi)
[#]: reviewer: ( )
[#]: publisher: ( )
[#]: url: ( )
[#]: subject: (Getting started with awk, a powerful text-parsing tool)
[#]: via: (https://opensource.com/article/19/10/intro-awk)
[#]: author: (Seth Kenlon https://opensource.com/users/seth)
Getting started with awk, a powerful text-parsing tool
======
Let's jump in and start using it.
![Woman programming][1]
Awk is a powerful text-parsing tool for Unix and Unix-like systems, but because it has programmed functions that you can use to perform common parsing tasks, it's also considered a programming language. You probably won't be developing your next GUI application with awk, and it likely won't take the place of your default scripting language, but it's a powerful utility for specific tasks.
What those tasks may be is surprisingly diverse. The best way to discover which of your problems might be best solved by awk is to learn awk; you'll be surprised at how awk can help you get more done but with a lot less effort.
Awk's basic syntax is:
```
`awk [options] 'pattern {action}' file`
```
To get started, create this sample file and save it as **colours.txt**
```
name       color  amount
apple      red    4
banana     yellow 6
strawberry red    3
grape      purple 10
apple      green  8
plum       purple 2
kiwi       brown  4
potato     brown  9
pineapple  yellow 5
```
This data is separated into columns by one or more spaces. It's common for data that you are analyzing to be organized in some way. It may not always be columns separated by whitespace, or even a comma or semicolon, but especially in log files or data dumps, there's generally a predictable pattern. You can use patterns of data to help awk extract and process the data that you want to focus on.
### Printing a column
In awk, the **print** function displays whatever you specify. There are many predefined variables you can use, but some of the most common are integers designating columns in a text file. Try it out:
```
$ awk '{print $2;}' colours.txt
color
red
yellow
red
purple
green
purple
brown
brown
yellow
```
In this case, awk displays the second column, denoted by **$2**. This is relatively intuitive, so you can probably guess that **print $1** displays the first column, and **print $3** displays the third, and so on.
To display _all_ columns, use **$0**.
The number after the dollar sign (**$**) is an _expression_, so **$2** and **$(1+1)** mean the same thing.
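For example, this command produces exactly the same output as the **print $2** example above:
```
`$ awk '{print $(1+1);}' colours.txt`
```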
### Conditionally selecting columns
The example file you're using is very structured. It has a row that serves as a header, and the columns relate directly to one another. By defining _conditional_ requirements, you can qualify what you want awk to return when looking at this data. For instance, to view items in column 2 that match "yellow" and print the contents of column 1:
```
$ awk '$2=="yellow"{print $1}' colours.txt
banana
pineapple
```
Regular expressions work as well. This conditional looks at **$2** for approximate matches to the letter **p** followed by any number of (one or more) characters, which are in turn followed by the letter **p**:
```
$ awk '$2 ~ /p.+p/ {print $0}' colours.txt
grape   purple  10
plum    purple  2
```
Numbers are interpreted naturally by awk. For instance, to print any row with a third column containing an integer greater than 5:
```
$ awk '$3>5 {print $1, $2}' colours.txt
name    color
banana  yellow
grape   purple
apple   green
potato  brown
```
### Field separator
By default, awk uses whitespace as the field separator. Not all text files use whitespace to define fields, though. For example, create a file called **colours.csv** with this content:
```
name,color,amount
apple,red,4
banana,yellow,6
strawberry,red,3
grape,purple,10
apple,green,8
plum,purple,2
kiwi,brown,4
potato,brown,9
pineapple,yellow,5
```
Awk can treat the data in exactly the same way, as long as you specify which character it should use as the field separator in your command. Use the **\--field-separator** (or just **-F** for short) option to define the delimiter:
```
$ awk -F"," '$2=="yellow" {print $1}' file1.csv
banana
pineapple
```
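If you prefer, the separator can also be set inside the program itself with a **BEGIN** rule; this is just an alternative spelling of the **-F** example above and produces the same result:
```
$ awk 'BEGIN {FS=","} $2=="yellow" {print $1}' colours.csv
banana
pineapple
```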
### Saving output
Using output redirection, you can write your results to a file. For example:
```
`$ awk -F, '$3>5 {print $1, $2}' colours.csv > output.txt`
```
This creates a file with the contents of your awk query.
You can also split a file into multiple files grouped by column data. For example, if you want to split colours.txt into multiple files according to what color appears in each row, you can cause awk to redirect _per query_ by including the redirection in your awk statement:
```
`$ awk '{print > $2".txt"}' colours.txt`
```
This produces files named **yellow.txt**, **red.txt**, and so on.
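A quick look at one of the generated files confirms what happened:
```
$ cat yellow.txt
banana     yellow 6
pineapple  yellow 5
```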
In the next article, you'll learn more about fields, records, and some powerful awk variables.
* * *
This article is adapted from an episode of [Hacker Public Radio][2], a community technology podcast.
--------------------------------------------------------------------------------
via: https://opensource.com/article/19/10/intro-awk
作者:[Seth Kenlon][a]
选题:[lujun9972][b]
译者:[译者ID](https://github.com/译者ID)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://opensource.com/users/seth
[b]: https://github.com/lujun9972
[1]: https://opensource.com/sites/default/files/styles/image-full-size/public/lead-images/programming-code-keyboard-laptop-music-headphones.png?itok=EQZ2WKzy (Woman programming)
[2]: http://hackerpublicradio.org/eps.php?id=2114

View File

@ -0,0 +1,61 @@
[#]: collector: (lujun9972)
[#]: translator: (geekpi)
[#]: reviewer: ( )
[#]: publisher: ( )
[#]: url: ( )
[#]: subject: (Cloning a MAC address to bypass a captive portal)
[#]: via: (https://fedoramagazine.org/cloning-a-mac-address-to-bypass-a-captive-portal/)
[#]: author: (Esteban Wilson https://fedoramagazine.org/author/swilson/)
Cloning a MAC address to bypass a captive portal
======
![][1]
If you ever connect to a WiFi system outside your home or office, you often see a portal page. This page may ask you to accept terms of service or some other agreement to get access. But what happens when you can't connect through this kind of portal? This article shows you how to use NetworkManager on Fedora to deal with some failure cases so you can still access the internet.
### How captive portals work
Captive portals are web pages offered when a new device is connected to a network. When the user first accesses the Internet, the portal captures all web page requests and redirects them to a single portal page.
The page then asks the user to take some action, typically agreeing to a usage policy. Once the user agrees, they may authenticate to a RADIUS or other type of authentication system. In simple terms, the captive portal registers and authorizes a device based on the device's MAC address and end user acceptance of terms. (The MAC address is [a hardware-based value][2] attached to any network interface, like a WiFi chip or card.)
Sometimes a device doesn't load the captive portal to authenticate and authorize the device to use the location's WiFi access. Examples of this situation include mobile devices and gaming consoles (Switch, Playstation, etc.). They usually won't launch a captive portal page when connecting to the Internet. You may see this situation when connecting to hotel or public WiFi access points.
You can use NetworkManager on Fedora to resolve these issues, though. Fedora will let you temporarily clone the connecting device's MAC address and authenticate to the captive portal on the device's behalf. You'll need the MAC address of the device you want to connect. Typically this is printed somewhere on the device and labeled. It's a six-byte hexadecimal value, so it might look like _4A:1A:4C:B0:38:1F_. You can also usually find it through the device's built-in menus.
### Cloning with NetworkManager
First, open _**nm-connection-editor**_, or open the WiFi settings via the Settings applet. You can then use NetworkManager to clone as follows (a command-line alternative with nmcli is sketched after the list below):
  * For Ethernet: Select the connected Ethernet connection. Then select the _Ethernet_ tab. Note or copy the current MAC address. Enter the MAC address of the console or other device in the _Cloned MAC address_ field.
  * For WiFi: Select the WiFi profile name. Then select the _WiFi_ tab. Note or copy the current MAC address. Enter the MAC address of the console or other device in the _Cloned MAC address_ field.
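If you would rather do this from a terminal, the same cloning can be done with **nmcli**. This is a sketch only: the connection profile name _hotel-wifi_ is hypothetical, and the MAC address is the example value from above, so substitute your own values:
```
# Clone the device's MAC onto a WiFi profile
# (use 802-3-ethernet.cloned-mac-address for a wired profile)
nmcli connection modify hotel-wifi 802-11-wireless.cloned-mac-address 4A:1A:4C:B0:38:1F
nmcli connection up hotel-wifi

# Later, clear the override to return to the interface's own MAC
nmcli connection modify hotel-wifi 802-11-wireless.cloned-mac-address ""
nmcli connection up hotel-wifi
```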
### **Bringing up the desired device**
Once the Fedora system connects with the Ethernet or WiFi profile, the cloned MAC address is used to request an IP address, and the captive portal loads. Enter the credentials needed and/or select the user agreement. The MAC address will then get authorized.
Now, disconnect the WiFi or Ethernet profile, and change the Fedora system's MAC address back to its original value. Then boot up the console or other device. The device should now be able to access the Internet, because its network interface has been authorized via your Fedora system.
This isn't all that NetworkManager can do, though. For instance, check out this article on [randomizing your system's hardware address][3] for better privacy.
> [Randomize your MAC address using NetworkManager][3]
--------------------------------------------------------------------------------
via: https://fedoramagazine.org/cloning-a-mac-address-to-bypass-a-captive-portal/
作者:[Esteban Wilson][a]
选题:[lujun9972][b]
译者:[译者ID](https://github.com/译者ID)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://fedoramagazine.org/author/swilson/
[b]: https://github.com/lujun9972
[1]: https://fedoramagazine.org/wp-content/uploads/2019/10/clone-mac-nm-816x345.jpg
[2]: https://en.wikipedia.org/wiki/MAC_address
[3]: https://fedoramagazine.org/randomize-mac-address-nm/

View File

@ -0,0 +1,252 @@
[#]: collector: (lujun9972)
[#]: translator: ( )
[#]: reviewer: ( )
[#]: publisher: ( )
[#]: url: ( )
[#]: subject: (Fields, records, and variables in awk)
[#]: via: (https://opensource.com/article/19/11/fields-records-variables-awk)
[#]: author: (Seth Kenlon https://opensource.com/users/seth)
Fields, records, and variables in awk
======
In the second article in this intro to awk series, learn about fields,
records, and some powerful awk variables.
![Man at laptop on a mountain][1]
Awk comes in several varieties: There is the original **awk**, written in 1977 at AT&T Bell Laboratories, and several reimplementations, such as **mawk**, **nawk**, and the one that ships with most Linux distributions, GNU awk, or **gawk**. On most Linux distributions, awk and gawk are synonyms referring to GNU awk, and typing either invokes the same awk command. See the [GNU awk user's guide][2] for the full history of awk and gawk.
The [first article][3] in this series showed that awk is invoked on the command line with this syntax:
```
`$ awk [options] 'pattern {action}' inputfile`
```
Awk is the command, and it can take options (such as **-F** to define the field separator). The action you want awk to perform is contained in single quotes, at least when it's issued in a terminal. To further emphasize which part of the awk command is the action you want it to take, you can precede your program with the **-e** option (but it's not required):
```
$ awk -F, -e '{print $2;}' colours.txt
yellow
blue
green
[...]
```
### Records and fields
Awk views its input data as a series of _records_, which are usually newline-delimited lines. In other words, awk generally sees each line in a text file as a new record. Each record contains a series of _fields_. A field is a component of a record delimited by a _field separator_.
By default, awk sees whitespace, such as spaces, tabs, and newlines, as indicators of a new field. Specifically, awk treats multiple _space_ separators as one, so this line contains two fields:
```
`raspberry red`
```
As does this one:
```
`tuxedo                  black`
```
Other separators are not treated this way. Assuming that the field separator is a comma, the following example record contains three fields, with one probably being zero characters long (assuming a non-printable character isn't hiding in that field):
```
`a,,b`
```
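You can confirm this quickly from the shell:
```
$ echo 'a,,b' | awk -F, '{print NF}'
3
```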
### The awk program
The _program_ part of an awk command consists of a series of rules. Normally, each rule begins on a new line in the program (although this is not mandatory). Each rule consists of a pattern and one or more actions:
```
`pattern { action }`
```
In a rule, you can define a pattern as a condition to control whether the action will run on a record. Patterns can be simple comparisons, regular expressions, combinations of the two, and more.
For instance, this will print a record _only_ if it contains the word "raspberry":
```
$ awk '/raspberry/ { print $0 }' colours.txt
raspberry red 99
```
If there is no qualifying pattern, the action is applied to every record.
Also, a rule can consist of only a pattern, in which case the entire record is written as if the action was **{ print }**.
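For example, this pattern-only rule is equivalent to the **/raspberry/ { print $0 }** rule shown above:
```
$ awk '/raspberry/' colours.txt
raspberry red 99
```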
Awk programs are essentially _data-driven_ in that actions depend on the data, so they are quite a bit different from programs in many other programming languages.
### The NF variable
Each field has a variable as a designation, but there are special variables for fields and records, too. The variable **NF** stores the number of fields awk finds in the current record. This can be printed or used in tests. Here is an example using the [text file][3] from the previous article:
```
$ awk '{ print $0 " (" NF ")" }' colours.txt
name       color  amount (3)
apple      red    4 (3)
banana     yellow 6 (3)
[...]
```
Awk's **print** function takes a series of arguments (which may be variables or strings) and concatenates them together. This is why, at the end of each line in this example, awk prints the number of fields as an integer enclosed by parentheses.
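**NF** works in tests just as well. For instance, this pattern-only rule prints any record that does _not_ contain exactly three fields, which makes it a handy sanity check for malformed lines (on this file, it prints nothing):
```
`$ awk 'NF != 3' colours.txt`
```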
### The NR variable
In addition to counting the fields in each record, awk also counts input records. The record number is held in the variable **NR**, and it can be used in the same way as any other variable. For example, to print the record number before each line:
```
$ awk '{ print NR ": " $0 }' colours.txt
1: name       color  amount
2: apple      red    4
3: banana     yellow 6
4: raspberry  red    3
5: grape      purple 10
[...]
```
Note that it's acceptable to write this command with no spaces other than the one after **print**, although it's more difficult for a human to parse:
```
`$ awk '{print NR": "$0}' colours.txt`
```
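Like **NF**, the **NR** variable can also be used in a pattern. For example, to print only the header record:
```
$ awk 'NR == 1' colours.txt
name       color  amount
```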
### The printf() function
For greater flexibility in how the output is formatted, you can use the awk **printf()** function. This is similar to **printf** in C, Lua, Bash, and other languages. It takes a _format_ argument followed by a comma-separated list of items. The argument list may be enclosed in parentheses.
```
`printf format, item1, item2, ...`
```
The format argument (or _format string_) defines how each of the other arguments will be output. It uses _format specifiers_ to do this, including **%s** to output a string and **%d** to output a decimal number. The following **printf** statement outputs the record followed by the number of fields in parentheses:
```
$ awk '{printf "%s (%d)\n",$0,NF}' colours.txt
name       color  amount (3)
apple      red    4 (3)
banana     yellow 6 (3)
[...]
```
In this example, **%s (%d)** provides the structure for each line, while **$0,NF** defines the data to be inserted into the **%s** and **%d** positions. Note that, unlike with the **print** function, no newline is generated without explicit instructions. The escape sequence **\n** does this.
### Awk scripting
All of the awk code in this article has been written and executed in an interactive Bash prompt. For more complex programs, it's often easier to place your commands into a file or _script_. The option **-f FILE** (not to be confused with **-F**, which denotes the field separator) may be used to invoke a file containing a program.
For example, here is a simple awk script. Create a file called **example1.awk** with this content:
```
/^a/ {print "A: " $0}
/^b/ {print "B: " $0}
```
It's conventional to give such files the extension **.awk** to make it clear that they hold an awk program. This naming is not mandatory, but it gives file managers and editors (and you) a useful clue about what the file is.
Run the script:
```
$ awk -f example1.awk colours.txt
A: apple      red    4
B: banana     yellow 6
A: apple      green  8
```
A file containing awk instructions can be made into a script by adding a **#!** line at the top and making it executable. Create a file called **example2.awk** with these contents:
```
#!/usr/bin/awk -f
#
# Print all but line 1 with the line number on the front
#
NR > 1 {
    printf "%d: %s\n",NR,$0
}
```
Arguably, there's no advantage to having just one line in a script, but sometimes it's easier to execute a script than to remember and type even a single line. A script file also provides a good opportunity to document what a command does. Lines starting with the **#** symbol are comments, which awk ignores.
Grant the file executable permission:
```
`$ chmod u+x example2.awk`
```
Run the script:
```
$ ./example2.awk colours.txt
2: apple      red    4
3: banana     yellow 6
4: raspberry red    3
5: grape      purple 10
[...]
```
An advantage of placing your awk instructions in a script file is that it's easier to format and edit. While you can write awk on a single line in your terminal, it can get overwhelming when it spans several lines.
### Try it
You now know enough about how awk processes your instructions to be able to write a complex awk program. Try writing an awk script with more than one rule and at least one conditional pattern. If you want to try more functions than just **print** and **printf**, refer to [the gawk manual][4] online.
Here's an idea to get you started:
```
#!/usr/bin/awk -f
#
# Print each record EXCEPT
# IF the first field is "raspberry",
# THEN replace "red" with "pi"
$1 == "raspberry" {
        gsub(/red/,"pi")
}
{ print }
```
Try this script to see what it does, and then try to write your own.
The next article in this series will introduce more functions for even more complex (and useful!) scripts.
* * *
_This article is adapted from an episode of [Hacker Public Radio][5], a community technology podcast._
--------------------------------------------------------------------------------
via: https://opensource.com/article/19/11/fields-records-variables-awk
作者:[Seth Kenlon][a]
选题:[lujun9972][b]
译者:[译者ID](https://github.com/译者ID)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://opensource.com/users/seth
[b]: https://github.com/lujun9972
[1]: https://opensource.com/sites/default/files/styles/image-full-size/public/lead-images/computer_laptop_code_programming_mountain_view.jpg?itok=yx5buqkr (Man at laptop on a mountain)
[2]: https://www.gnu.org/software/gawk/manual/html_node/History.html#History
[3]: https://opensource.com/article/19/10/intro-awk
[4]: https://www.gnu.org/software/gawk/manual/
[5]: http://hackerpublicradio.org/eps.php?id=2129

View File

@ -0,0 +1,308 @@
[#]: collector: (lujun9972)
[#]: translator: ( )
[#]: reviewer: ( )
[#]: publisher: ( )
[#]: url: ( )
[#]: subject: (How to Add Windows and Linux host to Nagios Server for Monitoring)
[#]: via: (https://www.linuxtechi.com/add-windows-linux-host-to-nagios-server/)
[#]: author: (James Kiarie https://www.linuxtechi.com/author/james/)
How to Add Windows and Linux host to Nagios Server for Monitoring
======
In the previous article, we demonstrated how to install [Nagios Core on CentOS 8 / RHEL 8][1] server. In this guide, we will dive deeper and add Linux and Windows hosts to the Nagios Core server for monitoring.
![Add-Linux-Windows-Host-Nagios-Server][2]
### Adding a Remote Windows Host to Nagios Server
In this section, you will learn how to add a **Windows host** system to the **Nagios server**. For this to be possible, you need to install **NSClient++** agent on the Windows Host system. In this guide, we are going to install the NSClient++ on a Windows Server 2019 Datacenter edition.
On the Windows host system, head to the download link <https://sourceforge.net/projects/nscplus/> and download the NSClient++ agent.
Once downloaded, double click on the downloaded installation file to launch the installation wizard.
[![NSClient-installer-Windows][2]][3]
On the first step of the installation procedure, click **Next**
[![click-nex-to-install-NSClient][2]][4]
In the next section, check off the **I accept the terms in the license Agreement** checkbox and click **Next**
[![Accept-terms-conditions-NSClient][2]][5]
Next, click on the **Typical** option from the list of options and click **Next**
[![click-on-Typical-option-NSClient-Installation][2]][6]
In the next step, leave the default settings as they are and click **Next**.
[![Define-path-NSClient-Windows][2]][7]
On the next page, specify your Nagios Core server's IP address, tick all the modules, and click **Next** as shown below.
[![Specify-Nagios-Server-IP-address-NSClient-Windows][2]][8]
Next, click on the **Install** option to commence the installation process.
[![Click-install-to-being-the-installation-NSClient][2]][9]
The installation process will start and will take a couple of seconds to complete. On the last step, click **Finish** to complete the installation and exit the wizard.
[![Click-finish-NSClient-Windows][2]][10]
To start the NSClient service, click on the **Start** menu and click on the **Start NSClient ++** option.
[![Click-start-NSClient-service-windows][2]][11]
To confirm that the service is indeed running, press **Windows Key + R**, type services.msc, and hit **ENTER**. Scroll to find the **NSClient** service and ensure it's running
[![NSClient-running-windows][2]][12]
At this point, we have successfully installed NSClient++ on the Windows Server 2019 host and verified that it's running.
### Configure Nagios Server to monitor Windows host
After the successful installation of NSClient++ on the Windows host PC, log in to the Nagios Core server system and configure it to monitor the Windows host system.
Open the windows.cfg file using your favorite text editor
```
# vim /usr/local/nagios/etc/objects/windows.cfg
```
In the configuration file, ensure that the host_name attribute matches the hostname of your Windows client system. In our case, the hostname for the Windows server PC is windows-server. This hostname applies to all the host_name attributes.
For the address attribute, specify your Windows host IP address. In our case, this was 10.128.0.52.
![Specify-hostname-IP-Windows][2]
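In plain text, the edited host definition ends up looking roughly like the block below (the _windows-server_ template referenced by the use directive comes from the stock windows.cfg shipped with Nagios Core, and the alias is just a free-form description, so adjust both to match your file):
```
define host{
        use             windows-server    ; inherit defaults from the Windows host template
        host_name       windows-server
        alias           Windows Server 2019
        address         10.128.0.52
        }
```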
After you are done, save the changes and exit the text editor.
Next, open the Nagios configuration file.
```
# vim /usr/local/nagios/etc/nagios.cfg
```
Uncomment the line below and save the changes.
cfg_file=/usr/local/nagios/etc/objects/windows.cfg
![Uncomment-Windows-cfg-Nagios][2]
Finally, to verify that Nagios configuration is free from any errors, run the command:
```
# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
```
Output
![Verify-configuration-for-errors-Nagios][2]
As you can see from the output, there are no warnings or errors.
Now browse to your Nagios server's IP address, log in, and click on Hosts. Your Windows hostname, in this case windows-server, will appear on the dashboard.
![Windows-Host-added-Nagios][2]
### Adding a remote Linux Host to Nagios Server
Having added a Windows host to the Nagios server, let's add a Linux host system. In our case, we are going to add an **Ubuntu 18.04 LTS** system to the Nagios monitoring server. To monitor a Linux host, we need to install an agent called **NRPE** on the remote Linux system. NRPE is short for **Nagios Remote Plugin Executor**. This is the plugin that allows you to monitor Linux host systems. It lets you monitor resources such as swap, memory usage, and CPU load, to mention a few, on remote Linux hosts. So the first step is to install NRPE on the remote Ubuntu 18.04 LTS system.
But first, update the Ubuntu system:
```
# sudo apt update
```
Next, install the Nagios NRPE agent and plugins by running the command shown:
```
# sudo apt install nagios-nrpe-server nagios-plugins
```
![Install-nrpe-server-nagios-plugins][2]
After the successful installation of NRPE and the Nagios plugins, configure NRPE by opening its configuration file, /etc/nagios/nrpe.cfg:
```
# vim /etc/nagios/nrpe.cfg
```
Set the **server_address** attribute to the Linux host's IP address. In this case, 10.128.0.53 is the IP address of the Ubuntu 18.04 LTS system.
![Specify-server-address-Nagios][2]
Next, add the Nagios server IP address to the allowed_hosts attribute, in this case 10.128.0.50
![Allowed-hosts-Nagios][2]
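For reference, the two edited directives in /etc/nagios/nrpe.cfg end up looking like this (the IP addresses are the ones used in this guide; you will typically want to keep 127.0.0.1 in allowed_hosts so local checks continue to work):
```
server_address=10.128.0.53
allowed_hosts=127.0.0.1,10.128.0.50
```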
Save and exit the configuration file.
Next, restart the NRPE service and verify its status:
```
# systemctl restart nagios-nrpe-server
# systemctl enable nagios-nrpe-server
# systemctl status nagios-nrpe-server
```
![Restart-nrpe-check-status][2]
### Configure Nagios Server to monitor Linux host
Having successfully installed NRPE and the Nagios plugins on the remote Linux server, log in to the Nagios server and install the EPEL (Extra Packages for Enterprise Linux) package.
```
# dnf install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
```
Next, install NRPE plugin on the server
```
# dnf install nagios-plugins-nrpe -y
```
After the installation of the NRPE plugin, open the Nagios configuration file “/usr/local/nagios/etc/nagios.cfg”
```
# vim /usr/local/nagios/etc/nagios.cfg
```
Next, uncomment the line below in the configuration file
cfg_dir=/usr/local/nagios/etc/servers
![uncomment-servers-line-Nagios-Server-CentOS8][2]
Next, create a configuration directory
```
# mkdir /usr/local/nagios/etc/servers
```
Then create client configuration file
```
# vim /usr/local/nagios/etc/servers/ubuntu-host.cfg
```
Copy and paste the configuration below to the file. This configuration monitors swap space, system load, total processes, logged in users, and disk usage.
```
define host{
use linux-server
host_name ubuntu-nagios-client
alias ubuntu-nagios-client
address 10.128.0.53
}
define hostgroup{
hostgroup_name linux-server
alias Linux Servers
members ubuntu-nagios-client
}
define service{
use local-service
host_name ubuntu-nagios-client
service_description SWAP Usage
check_command check_nrpe!check_swap
}
define service{
use local-service
host_name ubuntu-nagios-client
service_description Root / Partition
check_command check_nrpe!check_root
}
define service{
use local-service
host_name ubuntu-nagios-client
service_description Current Users
check_command check_nrpe!check_users
}
define service{
use local-service
host_name ubuntu-nagios-client
service_description Total Processes
check_command check_nrpe!check_total_procs
}
define service{
use local-service
host_name ubuntu-nagios-client
service_description Current Load
check_command check_nrpe!check_load
}
```
Save and exit the configuration file.
Next, verify that there are no errors in the Nagios configuration:
```
# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
```
Now restart the Nagios service and ensure that it is up and running.
```
# systemctl restart nagios
```
Remember to open port 5666, which is used by the NRPE plugin, on the firewall of the Nagios server.
```
# firewall-cmd --permanent --add-port=5666/tcp
# firewall-cmd --reload
```
![Allow-firewall-Nagios-server][2]
Likewise, head to your Linux host (Ubuntu 18.04 LTS) and allow the port on the UFW firewall:
```
# ufw allow 5666/tcp
# ufw reload
```
![Allow-NRPE-service][2]
Finally, head to the Nagios server's URL and click on **Hosts**. Your Ubuntu system will be displayed on the dashboard alongside the Windows host machine we added earlier.
![Linux-host-added-monitored-Nagios][2]
And this wraps up our 2-part series on Nagios installation and adding remote hosts. Feel free to get back to us with your feedback.
--------------------------------------------------------------------------------
via: https://www.linuxtechi.com/add-windows-linux-host-to-nagios-server/
作者:[James Kiarie][a]
选题:[lujun9972][b]
译者:[译者ID](https://github.com/译者ID)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://www.linuxtechi.com/author/james/
[b]: https://github.com/lujun9972
[1]: https://www.linuxtechi.com/install-nagios-core-rhel-8-centos-8/
[2]: data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7
[3]: https://www.linuxtechi.com/wp-content/uploads/2019/11/NSClient-installer-Windows.jpg
[4]: https://www.linuxtechi.com/wp-content/uploads/2019/11/click-nex-to-install-NSClient.jpg
[5]: https://www.linuxtechi.com/wp-content/uploads/2019/11/Accept-terms-conditions-NSClient.jpg
[6]: https://www.linuxtechi.com/wp-content/uploads/2019/11/click-on-Typical-option-NSClient-Installation.jpg
[7]: https://www.linuxtechi.com/wp-content/uploads/2019/11/Define-path-NSClient-Windows.png
[8]: https://www.linuxtechi.com/wp-content/uploads/2019/11/Specify-Nagios-Server-IP-address-NSClient-Windows.jpg
[9]: https://www.linuxtechi.com/wp-content/uploads/2019/11/Click-install-to-being-the-installation-NSClient.jpg
[10]: https://www.linuxtechi.com/wp-content/uploads/2019/11/Click-finish-NSClient-Windows.jpg
[11]: https://www.linuxtechi.com/wp-content/uploads/2019/11/Click-start-NSClient-service-windows.jpg
[12]: https://www.linuxtechi.com/wp-content/uploads/2019/11/NSClient-running-windows.jpg

View File

@ -0,0 +1,76 @@
[#]: collector: (lujun9972)
[#]: translator: ( )
[#]: reviewer: ( )
[#]: publisher: ( )
[#]: url: ( )
[#]: subject: (My first contribution to open source: Impostor Syndrome)
[#]: via: (https://opensource.com/article/19/11/my-first-open-source-contribution-impostor-syndrome)
[#]: author: (Galen Corey https://opensource.com/users/galenemco)
My first contribution to open source: Impostor Syndrome
======
A new open source contributor documents a series of five mistakes she
made starting out in open source.
![Dandelion held out over water][1]
The story of my first mistake goes back to the beginning of my learn-to-code journey. I taught myself the basics through online resources. I was working through tutorials and projects, making progress but also looking for the next way to level up. Pretty quickly, I came across a blog post that told me the best way for beginners _just like me_ to take their coding skills to the next level was to contribute to open source.
> "Anyone can do this," insisted the post, "and it is a crucial part of participating in the larger developer community."
My internal impostor (who, for the purpose of this post, is the personification of my impostor syndrome) latched onto this idea. "Look, Galen," she said. "The only way to be a real developer is to contribute to open source." "Alrighty," I replied, and started following the instructions in the blog post to make a [GitHub][2] account. It took me under ten minutes to get so thoroughly confused that I gave up on the idea entirely. It wasn't that I was unwilling to learn, but the resources that I was depending on expected me to have quite a bit of preexisting knowledge about [Git][3], GitHub, and how these tools allowed multiple developers to collaborate on a single project.
"Maybe I'm not ready for this yet," I thought, and went back to my tutorials. "But the blog post said that anyone can do it, even beginners," my internal impostor nagged. Thus began a multi-year internal battle between the idea that contributing to open source was easy and valuable and I should be doing it, and the impression that I was not yet _ready_ to write code for open source projects.
Even once I became comfortable with Git, my internal impostor was always eager to remind me of why I was not yet ready to contribute to open source. When I was in coding Bootcamp, she whispered: "Sure, you know Git and you write code, but you've never written real code before, only fake Bootcamp code. You're not qualified to contribute to real projects that people use and depend on." When I was working my first year as a Software Engineer, she chided, "Okay, maybe the code you write is 'real,' but you only work with one codebase! What makes you think you can write high-quality code somewhere else with different conventions, frameworks, or even languages?"
It took me about a year and a half of full-time work to finally feel confident enough to shut down my internal impostor's arguments and go for my first pull request (PR). The irony here is that my internal impostor was the one talking me both into and out of contributing to open source.
### Harmful myths
There are two harmful myths here that I want to debunk.
#### Myth 1: Contributing to open source is "easy"
Throughout this journey, I frequently ran across the message that contributing to open source was supposed to be easy. This made me question my own skills when I found myself unable to "easily" get started.
I understand why people might say that contributing to open source is easy, but I suspect what they actually mean is "it's an attainable goal," "it's accessible to beginners if they put in the work," or "it is possible to contribute to open source without writing a ton of really complex code."
All of these things are true, but it is equally important to note that contributing to open source is difficult. It requires you to take the time to understand a new codebase _and_ understand the tools that developers use.
I definitely don't want to discourage beginners from trying. It is just important to remember that running into challenges is an expected part of the process.
#### Myth 2: All "real" or "good" developers contribute to open source
My internal impostor was continually reminding me that my lack of open source contributions was a blight on my developer career. In fact, even as I write this post, I feel guilty that I have not contributed more to open source. But while working on open source is a great way to learn and participate in the broader community of developers, it is not the only way to do this. You can also blog, attend meetups, work on side projects, read, mentor, or go home at the end of a long day at work and have a lovely relaxing evening. Contributing to open source is a challenge that can be fun and rewarding if it is the challenge you choose.
Julia Evans wrote a blog post called [Don't feel guilty about not contributing to open source][4], which is a healthy reminder that there are many productive ways to use your time as a developer. I highly recommend bookmarking it for any time you feel that guilt creeping in.
### Mistake number one
Mistake number one was letting my internal impostor guide me. I let her talk me out of contributing to open source for years by telling me I was not ready. Instead, I just did not understand the amount of work I would need to put in to get to the level where I felt confident in my ability to write code for an unfamiliar project (I am still working toward this). I also let her talk me into it, with the idea that I had to contribute to open source to prove my worth as a developer. The end result was still my first merged pull request in a widely used project, but the insecurity made my entire experience less enjoyable.
### Don't let Git get you down
If you want to learn more about Git, or if you are a beginner and Git is a blocker toward making your first open-source contribution, don't panic. Git is very complicated, and you are not expected to know what it is already. Once you get the hang of it, you will find that Git is a handy tool that lets many different developers work on the same project at the same time, and then merge their individual changes together.
There are many resources to help you learn about Git and Github (a site that hosts code so that people can collaborate on it with Git). Here are some suggestions on where to start: [_Hello World_ intro to GitHub][5] and _[Resources to learn Git][6]_.
--------------------------------------------------------------------------------
via: https://opensource.com/article/19/11/my-first-open-source-contribution-impostor-syndrome
作者:[Galen Corey][a]
选题:[lujun9972][b]
译者:[译者ID](https://github.com/译者ID)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://opensource.com/users/galenemco
[b]: https://github.com/lujun9972
[1]: https://opensource.com/sites/default/files/styles/image-full-size/public/lead-images/dandelion_blue_water_hand.jpg?itok=QggW8Wnw (Dandelion held out over water)
[2]: https://github.com
[3]: https://git-scm.com
[4]: https://jvns.ca/blog/2014/04/26/i-dont-feel-guilty-about-not-contributing-to-open-source/
[5]: https://guides.github.com/activities/hello-world/
[6]: https://try.github.io/

View File

@ -0,0 +1,58 @@
[#]: collector: (lujun9972)
[#]: translator: ( )
[#]: reviewer: ( )
[#]: publisher: ( )
[#]: url: ( )
[#]: subject: (My first contribution to open source: Making a decision)
[#]: via: (https://opensource.com/article/19/11/my-first-open-source-contribution-mistake-decisions)
[#]: author: (Galen Corey https://opensource.com/users/galenemco)
My first contribution to open source: Making a decision
======
A new open source contributor documents a series of five mistakes she
made starting out in open source.
![Lightbulb][1]
Previously, I put a lot of [blame on impostor syndrome][2] for delaying my first open source contribution. But there was another factor that I can't ignore: I can't make a decision to save my life. And with [millions][3] of open source projects to choose from, choosing one to contribute to is overwhelming. So overwhelming that I would often end up closing my laptop, thinking, "Maybe I'll just do this another day."
Mistake number two was letting my fear of making a decision get in the way of making my first contribution. In an ideal world, perhaps I would have come into my open source journey with a specific project in mind that I genuinely cared about and wanted to work on, but all I had was a vague goal of contributing to open source somehow. For those of you in the same position, here are strategies that helped me pick out the right project (or at least a good one) for my contribution.
### Tools that I used frequently
At first, I did not think it would be necessary to limit myself to tools or projects with which I was already familiar. There were projects that I had never used before but seemed like appealing candidates because of their active community, or the interesting problems that they solved.
However, given that I had a limited amount of time to devote to this project, I decided to stick with a tool that I already knew. To understand what a tool needs, you need to be familiar with how it is supposed to work. If you want to contribute to a project that you are unfamiliar with, you need to complete an additional step of getting to know the functionality and goals of the code. This extra load can be fun and rewarding, but it can also double your work time. Since my goal was primarily to contribute, sticking to what I knew was a helpful way to narrow things down. It is also rewarding to give back to a project that you have found useful.
### An active and friendly community
When choosing my project, I wanted to feel confident that someone would be there to review the code that I wrote. And, of course, I wanted the person who reviewed my code to be a nice person. Putting your work out there for public scrutiny is scary, after all. While I was open to constructive feedback, there were toxic corners of the developer community that I hoped to avoid.
To evaluate the community that I would be joining, I checked out the _issues_ sections of the repos that I was considering. I looked to see if someone from the core team responded regularly. More importantly, I tried to make sure that no one was talking down to each other in the comments (which is surprisingly common in issues discussions). I also looked out for projects that had a code of conduct, outlining what was appropriate vs. inappropriate behavior for online interaction.
### Clear contribution guidelines
Because this was my first time contributing to open source, I had a lot of questions around the process. Some project communities are excellent about documenting the procedures for choosing an issue and making a pull request. Although I did not select them at the time because I had never worked with the product before, [Gatsby][4] is an exemplar of this practice.
This type of clear documentation helped ease some of my insecurity about not knowing what to do. It also gave me hope that the project was open to new contributors and would take the time to look at my work. In addition to contribution guidelines, I looked in the issues section to see if the project was making use of the "good first issue" flag. This is another indication that the project is open to beginners (and helps you discover what to work on).
### Conclusion
If you don't already have a project in mind, choosing the right place to make your first open source contribution can be overwhelming. Coming up with a list of standards helped me narrow down my choices and find a great project for my first pull request.
--------------------------------------------------------------------------------
via: https://opensource.com/article/19/11/my-first-open-source-contribution-mistake-decisions
作者:[Galen Corey][a]
选题:[lujun9972][b]
译者:[译者ID](https://github.com/译者ID)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://opensource.com/users/galenemco
[b]: https://github.com/lujun9972
[1]: https://opensource.com/sites/default/files/styles/image-full-size/public/lead-images/lightbulb-idea-think-yearbook-lead.png?itok=5ZpCm0Jh (Lightbulb)
[2]: https://opensource.com/article/19/10/my-first-open-source-contribution-mistakes
[3]: https://github.blog/2018-02-08-open-source-project-trends-for-2018/
[4]: https://www.gatsbyjs.org/contributing/

View File

@ -0,0 +1,221 @@
[#]: collector: (lujun9972)
[#]: translator: ( )
[#]: reviewer: ( )
[#]: publisher: ( )
[#]: url: ( )
[#]: subject: (Bash Script to Generate Patching Compliance Report on CentOS/RHEL Systems)
[#]: via: (https://www.2daygeek.com/bash-script-to-generate-patching-compliance-report-on-centos-rhel-systems/)
[#]: author: (Magesh Maruthamuthu https://www.2daygeek.com/author/magesh/)
Bash Script to Generate Patching Compliance Report on CentOS/RHEL Systems
======
If you are running a large Linux environment, you may have already integrated your Red Hat systems with Satellite.
If yes, there is a way to export this report from the Satellite server, so you don't have to worry about generating patching compliance reports yourself.
But if you are running a small Red Hat environment without Satellite integration, or if you are running CentOS systems, this script will help you create such a report.
The patching compliance report is usually created once a month or once every three months, depending on the company's needs.
Add a cronjob based on your needs to automate this.
This **[bash script][1]** is generally suited to environments with fewer than 50 systems, but there is no hard limit.
Keeping systems up to date is an important task for Linux administrators; it keeps your machines stable and secure.
The following articles may help you to learn more about installing security patches on Red Hat (RHEL) and CentOS systems.
* **[How to check available security updates on Red Hat (RHEL) and CentOS system][2]**
* **[Four ways to install security updates on Red Hat (RHEL) &amp; CentOS systems][3]**
* **[Two methods to check or list out installed security updates on Red Hat (RHEL) &amp; CentOS system][4]**
Four **[shell scripts][5]** are included in this tutorial; pick the one that suits your needs.
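All of the scripts below read the list of target hosts from /opt/scripts/server.txt, one hostname per line, and query each host over SSH, so passwordless (key-based) SSH from the reporting server to each client is assumed. A minimal server.txt matching the sample output further down would look like this:
```
# cat /opt/scripts/server.txt
server1
server2
server3
server4
```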
### Method-1: Bash Script to Generate Patching Compliance Report for Security Errata on CentOS/RHEL Systems
This script allows you to create a security errata patch compliance report only. It sends the output via email in plain text.
```
# vi /opt/scripts/small-scripts/sec-errata.sh
#!/bin/sh
> /tmp/sec-up.txt    # truncate the report file before each run
SUBJECT="Patching Reports on "`date`""
MESSAGE="/tmp/sec-up.txt"
TO="[email protected]"
echo "+---------------+-----------------------------+" >> $MESSAGE
echo "| Server_Name | Security Errata |" >> $MESSAGE
echo "+---------------+-----------------------------+" >> $MESSAGE
for server in `more /opt/scripts/server.txt`
do
sec=`ssh $server yum updateinfo summary | grep 'Security' | grep -vE 'Important|Moderate' | tail -1 | awk '{print $1}'`
echo "$server $sec" >> $MESSAGE
done
echo "+---------------------------------------------+" >> $MESSAGE
mail -s "$SUBJECT" "$TO" < $MESSAGE
```
Run the script file once you have added the above script.
```
# sh /opt/scripts/small-scripts/sec-errata.sh
```
You get an output like the one below.
```
# cat /tmp/sec-up.txt
+---------------+-------------------+
| Server_Name | Security Errata |
+---------------+-------------------+
server1
server2
server3 21
server4
+-----------------------------------+
```
Add the following cronjob to get the patching compliance report once a month.
```
# crontab -e
@monthly /bin/bash /opt/scripts/small-scripts/sec-errata.sh
```
### Method-1a: Bash Script to Generate Patching Compliance Report for Security Errata on CentOS/RHEL Systems
This script allows you to generate a security errata patch compliance report. It sends the output via email with a CSV file attached.
```
# vi /opt/scripts/small-scripts/sec-errata-1.sh
#!/bin/sh
echo "Server Name, Security Errata" > /tmp/sec-up.csv
for server in `more /opt/scripts/server.txt`
do
sec=`ssh $server yum updateinfo summary | grep 'Security' | grep -vE 'Important|Moderate' | tail -1 | awk '{print $1}'`
echo "$server, $sec" >> /tmp/sec-up.csv
done
echo "Patching Report for `date +"%B %Y"`" | mailx -s "Patching Report on `date`" -a /tmp/sec-up.csv [email protected]
rm /tmp/sec-up.csv
```
Run the script file once you have added the above script.
```
# sh /opt/scripts/small-scripts/sec-errata-1.sh
```
You get an output like the one below.
![][6]
### Method-2: Bash Script to Generate Patching Compliance Report for Security Errata, Bugfix, and Enhancement on CentOS/RHEL Systems
This script allows you to generate a patching compliance report covering Security Errata, Bugfix, and Enhancement counts. It sends the output via email in plain text.
```
# vi /opt/scripts/small-scripts/sec-errata-bugfix-enhancement.sh
#!/bin/sh
> /tmp/sec-up.txt    # truncate the report file before each run
SUBJECT="Patching Reports on "`date`""
MESSAGE="/tmp/sec-up.txt"
TO="[email protected]"
echo "+---------------+-------------------+--------+---------------------+" >> $MESSAGE
echo "| Server_Name | Security Errata | Bugfix | Enhancement |" >> $MESSAGE
echo "+---------------+-------------------+--------+---------------------+" >> $MESSAGE
for server in `more /opt/scripts/server.txt`
do
sec=`ssh $server yum updateinfo summary | grep 'Security' | grep -vE 'Important|Moderate' | tail -1 | awk '{print $1}'`
bug=`ssh $server yum updateinfo summary | grep 'Bugfix' | tail -1 | awk '{print $1}'`
enhance=`ssh $server yum updateinfo summary | grep 'Enhancement' | tail -1 | awk '{print $1}'`
echo "$server $sec $bug $enhance" >> $MESSAGE
done
echo "+------------------------------------------------------------------+" >> $MESSAGE
mail -s "$SUBJECT" "$TO" < $MESSAGE
```
Run the script file once you have added the above script.
```
# sh /opt/scripts/small-scripts/sec-errata-bugfix-enhancement.sh
```
You get an output like the one below.
```
# cat /tmp/sec-up.txt
+---------------+-------------------+--------+---------------------+
| Server_Name | Security Errata | Bugfix | Enhancement |
+---------------+-------------------+--------+---------------------+
server01 16
server02 5 16
server03 21 266 20
server04 16
+------------------------------------------------------------------+
```
Add the following cronjob to get the patching compliance report once every three months. This script is scheduled to run on the 1st of January, April, July, and October.
```
# crontab -e
0 0 01 */3 * /bin/bash /opt/scripts/small-scripts/sec-errata-bugfix-enhancement.sh
```
### Method-2a: Bash Script to Generate Patching Compliance Report for Security Errata, Bugfix, and Enhancement on CentOS/RHEL Systems
This script allows you to generate a patching compliance report covering Security Errata, Bugfix, and Enhancement counts. It sends the output via email with a CSV file attached.
```
# vi /opt/scripts/small-scripts/sec-errata-bugfix-enhancement-1.sh
#!/bin/sh
echo "Server Name, Security Errata,Bugfix,Enhancement" > /tmp/sec-up.csv
for server in `more /opt/scripts/server.txt`
do
sec=`ssh $server yum updateinfo summary | grep 'Security' | grep -vE 'Important|Moderate' | tail -1 | awk '{print $1}'`
bug=`ssh $server yum updateinfo summary | grep 'Bugfix' | tail -1 | awk '{print $1}'`
enhance=`ssh $server yum updateinfo summary | grep 'Enhancement' | tail -1 | awk '{print $1}'`
echo "$server,$sec,$bug,$enhance" >> /tmp/sec-up.csv
done
echo "Patching Report for `date +"%B %Y"`" | mailx -s "Patching Report on `date`" -a /tmp/sec-up.csv [email protected]
rm /tmp/sec-up.csv
```
Run the script file once you have added the above script.
```
# sh /opt/scripts/small-scripts/sec-errata-bugfix-enhancement-1.sh
```
You get an output like the one below.
![][6]
--------------------------------------------------------------------------------
via: https://www.2daygeek.com/bash-script-to-generate-patching-compliance-report-on-centos-rhel-systems/
作者:[Magesh Maruthamuthu][a]
选题:[lujun9972][b]
译者:[译者ID](https://github.com/译者ID)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://www.2daygeek.com/author/magesh/
[b]: https://github.com/lujun9972
[1]: https://www.2daygeek.com/category/bash-script/
[2]: https://www.2daygeek.com/check-list-view-find-available-security-updates-on-redhat-rhel-centos-system/
[3]: https://www.2daygeek.com/install-security-updates-on-redhat-rhel-centos-system/
[4]: https://www.2daygeek.com/check-installed-security-updates-on-redhat-rhel-and-centos-system/
[5]: https://www.2daygeek.com/category/shell-script/
[6]: data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7

View File

@ -0,0 +1,409 @@
[#]: collector: (lujun9972)
[#]: translator: (wxy)
[#]: reviewer: ( )
[#]: publisher: ( )
[#]: url: ( )
[#]: subject: (Understanding system calls on Linux with strace)
[#]: via: (https://opensource.com/article/19/10/strace)
[#]: author: (Gaurav Kamathe https://opensource.com/users/gkamathe)
在 Linux 上用 strace 来理解系统调用
======
> 使用 strace 跟踪用户进程和 Linux 内核之间的薄层。
![Hand putting a Linux file folder into a drawer][1]
<ruby>系统调用<rt>system call</rt></ruby>是程序从内核请求服务的一种编程方式,而 `strace` 是一个功能强大的工具,可让你跟踪用户进程与 Linux 内核之间的薄层。
要了解操作系统的工作原理,首先需要了解系统调用的工作原理。操作系统的主要功能之一是为用户程序提供抽象。
操作系统可以大致分为两种模式:
* 内核模式:操作系统内核使用的一种强大的特权模式
* 用户模式:大多数用户应用程序运行的地方
  
用户大多使用命令行实用程序和图形用户界面GUI来执行日常任务。系统调用在后台静默运行与内核交互以完成工作。
系统调用与函数调用非常相似,这意味着它们接受并处理参数然后返回值。唯一的区别是系统调用进入内核,而函数调用不进入。从用户空间切换到内核空间是使用特殊的 [trap][2] 机制完成的。
通过使用系统库(在 Linux 系统上又称为 glibc系统调用大部分对用户隐藏了。尽管系统调用本质上是通用的但是发出系统调用的机制在很大程度上取决于机器。
本文通过使用一些常规命令并使用 `strace` 分析每个命令进行的系统调用来探索一些实际示例。这些示例使用 Red Hat Enterprise Linux但是这些命令运行在其他 Linux 发行版上应该也是相同的:
```
[root@sandbox ~]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.7 (Maipo)
[root@sandbox ~]#
[root@sandbox ~]# uname -r
3.10.0-1062.el7.x86_64
[root@sandbox ~]#
```
首先,确保在系统上安装了必需的工具。你可以使用下面的 `rpm` 命令来验证是否安装了 `strace`。如果安装了,则可以使用 `-V` 选项检查 `strace` 实用程序的版本号:
```
[root@sandbox ~]# rpm -qa | grep -i strace
strace-4.12-9.el7.x86_64
[root@sandbox ~]#
[root@sandbox ~]# strace -V
strace -- version 4.12
[root@sandbox ~]#
```
如果没有安装,运行命令安装:
```
yum install strace
```
出于本示例的目的,在 `/tmp` 中创建一个测试目录,并使用 `touch` 命令创建两个文件:
```
[root@sandbox ~]# cd /tmp/
[root@sandbox tmp]#
[root@sandbox tmp]# mkdir testdir
[root@sandbox tmp]#
[root@sandbox tmp]# touch testdir/file1
[root@sandbox tmp]# touch testdir/file2
[root@sandbox tmp]#
```
(我使用 `/tmp` 目录是因为每个人都可以访问它,但是你可以根据需要选择另一个目录。)
`testdir` 目录下使用 `ls` 命令验证文件已经创建:
```
[root@sandbox tmp]# ls testdir/
file1  file2
[root@sandbox tmp]#
```
你可能每天都使用`ls`命令,而没有意识到系统调用在其下面发生的作用。这里有抽象作用。该命令的工作方式如下:
```
Command-line utility -> Invokes functions from system libraries (glibc) -> Invokes system calls
```
`ls` 命令在 Linux 上从系统库(即 glibc内部调用函数。这些库调用完成大部分工作的系统调用。
如果你想知道从 glibc 库中调用了哪些函数,请使用 `ltrace` 命令,然后跟上常规的 `ls testdir/`命令:
```
ltrace ls testdir/
```
如果没有安装 `ltrace`,键入如下命令安装:
```
yum install ltrace
```
一堆输出会被显示到屏幕上;不必担心,只需继续就行。`ltrace` 命令输出中与该示例有关的一些重要库函数包括:
```
opendir("testdir/") = { 3 }
readdir({ 3 }) = { 101879119, "." }
readdir({ 3 }) = { 134, ".." }
readdir({ 3 }) = { 101879120, "file1" }
strlen("file1") = 5
memcpy(0x1665be0, "file1\0", 6) = 0x1665be0
readdir({ 3 }) = { 101879122, "file2" }
strlen("file2") = 5
memcpy(0x166dcb0, "file2\0", 6) = 0x166dcb0
readdir({ 3 }) = nil
closedir({ 3 })                    
```
通过查看上面的输出,你或许可以了解正在发生的事情。`opendir` 库函数打开一个名为 `testdir` 的目录,然后调用 `readdir` 函数,该函数读取目录的内容。最后,有一个对 `closedir` 函数的调用,该函数将关闭先前打开的目录。现在先忽略其他 `strlen``memcpy` 功能。
你可以看到正在调用哪些库函数,但是本文将重点介绍由系统库函数调用的系统调用。
与上述类似,要了解调用了哪些系统调用,只需将 `strace` 放在 `ls testdir` 命令之前,如下所示。 再次,将一堆乱码丢到了你的屏幕上,你可以按照以下步骤进行操作:
```
[root@sandbox tmp]# strace ls testdir/
execve("/usr/bin/ls", ["ls", "testdir/"], [/* 40 vars */]) = 0
brk(NULL) = 0x1f12000
<<< truncated strace output >>>
write(1, "file1 file2\n", 13file1 file2
) = 13
close(1) = 0
munmap(0x7fd002c8d000, 4096) = 0
close(2) = 0
exit_group(0) = ?
+++ exited with 0 +++
[root@sandbox tmp]#
```
运行 `strace` 命令后屏幕上的输出只是运行 `ls` 命令的系统调用。每个系统调用都为操作系统提供特定的用途,可以将它们大致分为以下几个部分:
* 进程管理系统调用
* 文件管理系统调用
* 目录和文件系统管理系统调用
* 其他系统调用
分析显示到屏幕上的信息的一种更简单的方法是使用 `strace` 方便使用的 `-o` 标志将输出记录到文件中。在 `-o` 标志后添加一个合适的文件名,然后再次运行命令:
```
[root@sandbox tmp]# strace -o trace.log ls testdir/
file1  file2
[root@sandbox tmp]#
```
这次,没有任何输出干扰屏幕显示,`ls` 命令如预期般工作,显示了文件名并将所有输出记录到文件 `trace.log` 中。仅仅是一个简单的 `ls` 命令,该文件就有近 100 行内容:
```
[root@sandbox tmp]# ls -l trace.log
-rw-r--r--. 1 root root 7809 Oct 12 13:52 trace.log
[root@sandbox tmp]#
[root@sandbox tmp]# wc -l trace.log
114 trace.log
[root@sandbox tmp]#
```
让我们看一下这个示例的 `trace.log` 文件的第一行:
```
execve("/usr/bin/ls", ["ls", "testdir/"], [/* 40 vars */]) = 0
```
* 该行的第一个单词 `execve` 是正在执行的系统调用的名称。
* 括号内的文本是提供给该系统调用的参数。
* 符号 `=` 后的数字(在这种情况下为 `0`)是 `execve` 系统调用的返回值。
现在的输出似乎还不太吓人,不是吗?你可以应用相同的逻辑来理解其他行。
现在,将关注点集中在你调用的单个命令上,即 `ls testdir`。你知道命令 `ls` 使用的目录名称,那么为什么不在 `trace.log` 文件中使用 `grep` 查找 `testdir` 并查看得到的结果呢?让我们详细查看一下结果的每一行:
```
[root@sandbox tmp]# grep testdir trace.log
execve("/usr/bin/ls", ["ls", "testdir/"], [/* 40 vars */]) = 0
stat("testdir/", {st_mode=S_IFDIR|0755, st_size=32, ...}) = 0
openat(AT_FDCWD, "testdir/", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
[root@sandbox tmp]#
```
回顾一下上面对 `execve` 的分析,你能说一下这个系统调用的作用吗?
```
execve("/usr/bin/ls", ["ls", "testdir/"], [/* 40 vars */]) = 0
```
你无需记住所有系统调用或它们所做的事情,因为你可以在需要时参考文档。手册页可以解救你!在运行 `man` 命令之前,请确保已安装以下软件包:
```
[root@sandbox tmp]# rpm -qa | grep -i man-pages
man-pages-3.53-5.el7.noarch
[root@sandbox tmp]#
```
请记住,你需要在 `man` 命令和系统调用名称之间添加 `2`。如果使用 `man man` 阅读 `man` 命令的手册页,你会看到第 2 节是为系统调用保留的。同样,如果你需要有关库函数的信息,则需要在 `man` 和库函数名称之间添加一个 `3`
以下是手册的章节编号及其包含的页面类型:
* `1`:可执行的程序或 shell 命令
* `2`:系统调用(由内核提供的函数)
* `3`:库调用(在程序的库内的函数)
* `4`:特殊文件(通常出现在 `/dev`
使用系统调用名称运行以下 `man` 命令以查看该系统调用的文档:
```
man 2 execve
```
按照 `execve` 手册页,这将执行在参数中传递的程序(在本例中为 `ls`)。可以为 `ls` 提供其他参数,例如本例中的 `testdir`。因此,此系统调用仅以 `testdir` 作为参数运行 `ls`
```
execve - execute program
DESCRIPTION
execve() executes the program pointed to by filename
```
下一个系统调用,名为 `stat`,它使用 `testdir` 参数:
```
stat("testdir/", {st_mode=S_IFDIR|0755, st_size=32, ...}) = 0
```
使用 `man 2 stat` 访问该文档。`stat` 是获取文件状态的系统调用请记住Linux 中的一切都是文件,包括目录。
接下来,`openat` 系统调用将打开 `testdir`。密切注意返回的 `3`。这是一个文件描述符,将在以后的系统调用中使用:
```
openat(AT_FDCWD, "testdir/", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
```
到现在为止一切都挺好。现在,打开 `trace.log` 文件,并转到 `openat` 系统调用之后的行。你会看到 `getdents` 系统调用被调用,该调用完成了执行 `ls testdir` 命令所需的大部分操作。现在,从 `trace.log` 文件中用 `grep` 获取 `getdents`
```
[root@sandbox tmp]# grep getdents trace.log
getdents(3, /* 4 entries */, 32768)     = 112
getdents(3, /* 0 entries */, 32768)     = 0
[root@sandbox tmp]#
```
`getdents` 的手册页将其描述为 “获取目录项”,这就是你要执行的操作。注意,`getdents` 的参数是 `3`,这是来自上面 `openat` 系统调用的文件描述符。
现在有了目录列表,你需要一种在终端中显示它的方法。因此,在日志中用 `grep` 搜索另一个用于写入终端的系统调用 `write`
```
[root@sandbox tmp]# grep write trace.log
write(1, "file1  file2\n", 13)          = 13
[root@sandbox tmp]#
```
在这些参数中,你可以看到将要显示的文件名:`file1` 和 `file2`。关于第一个参数(`1`),请记住在 Linux 中,当运行任何进程时,默认情况下会为其打开三个文件描述符。以下是默认的文件描述符:
* `0`:标准输入
* `1`:标准输出
* `2`:标准错误
因此,`write` 系统调用将在标准显示(这就是终端,由 `1` 所标识的)上显示 `file1``file2`
Now you know which system calls did most of the work for the `ls testdir/` command. But what about the other 100+ system calls in the `trace.log` file? The operating system has to do a lot of housekeeping to run a process, so much of what you see in the log file is process initialization and cleanup. Read through the entire `trace.log` file and try to understand what makes the `ls` command work.
Now that you know how to analyze the system calls for a given command, you can use this knowledge on other commands to understand what system calls are being executed. `strace` provides a number of useful command-line flags to make this easier for you; some of them are described below.
By default, `strace` does not include all system call information. However, it has a handy `-v` (verbose) option that provides additional information for each system call:
```
strace -v ls testdir
```
It is good practice to always use the `-f` option when running `strace`. It allows `strace` to trace any child processes created by the process currently being traced:
```
strace -f ls testdir
```
Say you just want the name of each system call, the number of times it was run, and the percentage of time spent in each system call. You can use the `-c` flag to get these statistics:
```
strace -c ls testdir/
```
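The `-c` summary combines naturally with the other flags; for example, adding `-f` (shown earlier) folds the system calls made by any child processes into the same statistics:
```
strace -c -f ls testdir/
```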
Say you want to concentrate on a specific system call, such as focusing on `open` and ignoring the rest. You can use the `-e` flag followed by the system call name:
```
[root@sandbox tmp]# strace -e open ls testdir
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libselinux.so.1", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libcap.so.2", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libacl.so.1", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libpcre.so.1", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libdl.so.2", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libattr.so.1", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libpthread.so.0", O_RDONLY|O_CLOEXEC) = 3
open("/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
file1  file2
+++ exited with 0 +++
[root@sandbox tmp]#
```
What if you want to concentrate on more than one system call? No worries; you can use the same `-e` command-line flag with the system calls separated by a comma. For example, to see the `write` and `getdents` system calls:
```
[root@sandbox tmp]# strace -e write,getdents ls testdir
getdents(3, /* 4 entries */, 32768)     = 112
getdents(3, /* 0 entries */, 32768)     = 0
write(1, "file1  file2\n", 13file1  file2
)          = 13
+++ exited with 0 +++
[root@sandbox tmp]#
```
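Beyond naming individual system calls, `-e` also accepts the longer `-e trace=...` form as well as predefined groups of related calls; for example, the `file` group covers calls that take a file name as an argument. Check `man strace` for the exact groups your version supports. A short sketch:
```
strace -e trace=write,getdents ls testdir   # equivalent long form of -e write,getdents
strace -e trace=file ls testdir             # all calls that take a file name (open, stat, access, ...)
```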
The examples so far have traced commands that were explicitly run. But what about commands that are already running? What if, for example, you want to trace daemons, which are just long-running processes? For this, `strace` provides a special `-p` flag to which you can provide a process ID.
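As a sketch of what that looks like against a real daemon (this assumes an `sshd` service is running and that you are root; the output file name is arbitrary):
```
# attach to the oldest sshd process, follow its children, and log everything to a file
strace -f -o /tmp/sshd.trace -p $(pgrep -o sshd)
```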
Rather than experimenting on a daemon here, though, take the `cat` command as an example: it usually displays the contents of a file if you give it a file name as an argument. If no argument is given, the `cat` command simply waits at the terminal for the user to enter text. Once text is entered, it repeats the given text until the user presses `Ctrl+C` to exit.
Run the `cat` command from one terminal; it will simply sit there waiting for input (remember that `cat` is still running and has not yet exited):
```
[root@sandbox tmp]# cat
```
From another terminal, find the process identifier (PID) using the `ps` command:
```
[root@sandbox ~]# ps -ef | grep cat
root      22443  20164  0 14:19 pts/0    00:00:00 cat
root      22482  20300  0 14:20 pts/1    00:00:00 grep --color=auto cat
[root@sandbox ~]#
```
Now, run `strace` on the running process with the `-p` flag and the PID (which you found above using `ps`). After running `strace`, the output states which process it attached to along with its PID. Now, `strace` is tracing the system calls made by the `cat` command. The first system call you see is `read`, which is waiting for input from file descriptor `0`, i.e., standard input, which is the terminal where the `cat` command is running:
```
[root@sandbox ~]# strace -p 22443
strace: Process 22443 attached
read(0,
```
Now, go back to the terminal where you left the `cat` command running and enter some text. I entered `x0x0` for demo purposes. Notice how `cat` simply repeats what I entered, so `x0x0` appears twice: I typed the first one, and the second one is the output repeated by the `cat` command:
```
[root@sandbox tmp]# cat
x0x0
x0x0
```
Go back to the terminal where `strace` is attached to the `cat` process. You will now see two additional system calls: the earlier `read` system call, which now reads `x0x0` from the terminal, and another, `write`, which writes `x0x0` back to the terminal, followed by a new `read`, which is waiting to read from the terminal again. Note that standard input (`0`) and standard output (`1`) are both in the same terminal:
```
[root@sandbox ~]# strace -p 22443
strace: Process 22443 attached
read(0, "x0x0\n", 65536)                = 5
write(1, "x0x0\n", 5)                   = 5
read(0,
```
Imagine how helpful this is when running `strace` against daemons to see everything they do in the background. Kill the `cat` command by pressing `Ctrl+C`; since the process is no longer running, this also ends your `strace` session.
If you want a timestamp for every system call, simply use the `-t` option with `strace`:
```
[root@sandbox ~]#strace -t ls testdir/
14:24:47 execve("/usr/bin/ls", ["ls", "testdir/"], [/* 40 vars */]) = 0
14:24:47 brk(NULL)                      = 0x1f07000
14:24:47 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f2530bc8000
14:24:47 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
14:24:47 open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
```
What if you want to know the time spent between system calls? `strace` has a handy `-r` flag that prints a relative timestamp for each system call, i.e., the time elapsed since the start of the previous one. Pretty useful, isn't it?
```
[root@sandbox ~]#strace -r ls testdir/
0.000000 execve("/usr/bin/ls", ["ls", "testdir/"], [/* 40 vars */]) = 0
0.000368 brk(NULL)                 = 0x1966000
0.000073 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb6b1155000
0.000047 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
0.000119 open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
```
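Note that `-r` measures the gap between successive calls rather than the time spent inside a call; for the latter, `strace` also has a `-T` flag, which appends each call's duration in angle brackets:
```
strace -T ls testdir/   # each line ends with a duration, e.g. <0.000012>
```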
### Conclusion
The `strace` utility is very handy for understanding system calls on Linux. To learn about its other command-line flags, please refer to the man pages and online documentation.
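As a parting sketch, the flags covered in this article can be combined into a single invocation (the output file name here is arbitrary):
```
# follow children, timestamp each call, restrict to file-related calls, log to a file
strace -f -t -e trace=file -o /tmp/trace.log ls testdir/
```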
--------------------------------------------------------------------------------
via: https://opensource.com/article/19/10/strace
Author: [Gaurav Kamathe][a]
Selected by: [lujun9972][b]
Translated by: [wxy](https://github.com/wxy)
Proofread by: [校对者ID](https://github.com/校对者ID)
This article was translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)
[a]: https://opensource.com/users/gkamathe
[b]: https://github.com/lujun9972
[1]: https://opensource.com/sites/default/files/styles/image-full-size/public/lead-images/yearbook-haff-rx-linux-file-lead_0.png?itok=-i0NNfDC (Hand putting a Linux file folder into a drawer)
[2]: https://en.wikipedia.org/wiki/Trap_(computing)

View File

@ -0,0 +1,165 @@
[#]: collector: (lujun9972)
[#]: translator: (geekpi)
[#]: reviewer: ( )
[#]: publisher: ( )
[#]: url: ( )
[#]: subject: (Getting started with awk, a powerful text-parsing tool)
[#]: via: (https://opensource.com/article/19/10/intro-awk)
[#]: author: (Seth Kenlon https://opensource.com/users/seth)
Getting started with awk, a powerful text-parsing tool
======
Get started using it.
![Woman programming][1]
Awk is a powerful text-parsing tool for Unix and Unix-like systems, but because it has programmable functions that let you perform common parsing tasks, it is also considered a programming language. You probably won't be developing your next GUI application with awk, and it likely won't replace your default scripting language, but it is a powerful utility for specific tasks.
Those tasks can be surprisingly varied. The best way to discover which of your problems awk is well suited to solve is simply to learn awk; you will be amazed at how it can help you get more done with far less effort.
The basic syntax of awk is:
```
awk [options] 'pattern {action}' file
```
To start, create this sample file and save it as **colours.txt**:
```
name       color  amount
apple      red    4
banana     yellow 6
strawberry red    3
grape      purple 10
apple      green  8
plum       purple 2
kiwi       brown  4
potato     brown  9
pineapple  yellow 5
```
This data is separated into columns by one or more spaces. It is common for data you want to analyze to be organized in some way. It may not always be columns separated by whitespace, or even by commas or semicolons, but especially in log files or data dumps, there is usually a predictable format. You can use that format to help awk extract and process the data you care about.
### Printing a column
In awk, the **print** function displays whatever you specify. There are many predefined variables you can use, but some of the most common are integers designating columns in a text file. Try it out:
```
$ awk '{print $2;}' colours.txt
color
red
yellow
red
purple
green
purple
brown
brown
yellow
```
在这里awk 显示第二列,用 **$2** 表示。这是相对直观的,因此你可能会猜测 **print $1** 显示第一列,而 **print $3** 显示第三列,依此类推。
要显示_全部_列请使用 **$0**。
美元符号(**$**后的数字是_表达式_因此 **$2**和 **$(1+1)** 是同一意思。
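A quick sketch against the same **colours.txt** file shows both points (the first command simply reproduces the file):
```
$ awk '{print $0}' colours.txt      # prints each full line
$ awk '{print $(1+1)}' colours.txt  # evaluates to $2, so it matches the output shown above
```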
### Conditionally selecting columns
The example file you are using is very structured. It has a row that serves as a header, and the columns relate directly to one another. By defining _conditions_, you can qualify what you want awk to return when it finds this data. For instance, to view items in column 2 that match "yellow" and print the contents of column 1:
```
awk '$2=="yellow"{print $1}' colours.txt
banana
pineapple
```
Regular expressions work as well. This expression loosely matches values in **$2** that start with a **p**, followed by one or more characters, followed by another **p**:
```
$ awk '$2 ~ /p.+p/ {print $0}' colours.txt
grape   purple  10
plum    purple  2
```
Awk interprets numbers natively. For example, to print the rows whose third column contains an integer greater than 5:
```
awk '$3>5 {print $1, $2}' colours.txt
name    color
banana  yellow
grape   purple
apple   green
potato  brown
```
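Notice that the header row (`name color`) sneaks into that output: the third field of the header is the word `amount`, which is not numeric, so awk falls back to a string comparison, and the string `amount` happens to sort after the string `5`. You can filter it out by combining conditions with `&&` and the built-in record counter `NR` (covered more fully in the next article); a small sketch:
```
$ awk 'NR>1 && $3>5 {print $1, $2}' colours.txt   # skip the header (record 1)
banana yellow
grape purple
apple green
potato brown
```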
### Field delimiter
By default, awk uses whitespace as the field delimiter. Not all text files use whitespace to define fields, though. For example, create a file called **colours.csv** with this content:
```
name,color,amount
apple,red,4
banana,yellow,6
strawberry,red,3
grape,purple,10
apple,green,8
plum,purple,2
kiwi,brown,4
potato,brown,9
pineapple,yellow,5
```
Awk can process the data in exactly the same way, as long as you specify which character it should use as the field separator in your command. Use the **--field-separator** (or just **-F**) option to define the delimiter:
```
$ awk -F"," '$2=="yellow" {print $1}' colours.csv
banana
pineapple
```
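The same thing can be done by setting awk's built-in **FS** (field separator) variable in a **BEGIN** rule, which runs before any input is read; built-in variables get a fuller treatment in the next article, so treat this as a preview:
```
$ awk 'BEGIN {FS=","} $2=="yellow" {print $1}' colours.csv
banana
pineapple
```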
### Saving output
Using output redirection, you can write your results to a file. For example:
```
$ awk -F, '$3>5 {print $1, $2}' colours.csv > output.txt
```
This creates a file containing the results of your awk query.
You can also split a file into multiple files grouped by column data. For example, if you want to split colours.txt into multiple files according to the colour that appears in each row, you can include the redirection statement _within_ your awk command so that _each result_ is redirected:
```
$ awk '{print > $2".txt"}' colours.txt
```
This produces files named **yellow.txt**, **red.txt**, and so on.
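A quick way to verify the result is to `cat` one of the generated files; note that because the second field of the header row is the literal word `color`, a **color.txt** file containing just the header line is created as well (the whitespace below is preserved from the original file):
```
$ cat yellow.txt
banana     yellow 6
pineapple  yellow 5
```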
In the next article, you will learn more about fields, records, and some powerful awk variables.
* * *
This article is adapted from an episode of [Hacker Public Radio][2], a community technology podcast.
--------------------------------------------------------------------------------
via: https://opensource.com/article/19/10/intro-awk
Author: [Seth Kenlon][a]
Selected by: [lujun9972][b]
Translated by: [geekpi](https://github.com/geekpi)
Proofread by: [校对者ID](https://github.com/校对者ID)
This article was translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)
[a]: https://opensource.com/users/seth
[b]: https://github.com/lujun9972
[1]: https://opensource.com/sites/default/files/styles/image-full-size/public/lead-images/programming-code-keyboard-laptop-music-headphones.png?itok=EQZ2WKzy (Woman programming)
[2]: http://hackerpublicradio.org/eps.php?id=2114