[Translation complete] 20141117 Restricting process CPU usage using nice cpulimit and cgroups.md

This commit is contained in:
coloka 2014-11-30 11:37:20 +08:00
parent 8ee9eb4893
commit cd27963acc
2 changed files with 196 additions and 200 deletions


@ -1,200 +0,0 @@
Translation in progress by coloka
Restricting process CPU usage using nice, cpulimit, and cgroups
================================================================================
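Note: the images in this article appear to be reachable only through a proxy that bypasses the firewall; keep this in mind when publishing.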
![](https://dl.dropboxusercontent.com/u/468982/blog/cpu_usage_blog/juggle.jpg)
The Linux kernel is an incredible circus performer, carefully juggling many processes and their resource needs to keep your server humming along. The kernel is also all about equity: when there is competition for resources, the kernel tries to distribute those resources fairly.
However, what if you've got an important process that needs priority? What about a low-priority process? Or what about limiting resources for a group of processes?
**The kernel can't determine which processes are important without your help.**
Most processes are started at the same priority level and the Linux kernel schedules time for each task evenly on the processor. Have a CPU intensive process that can be run at a lower priority? Then you need to tell the scheduler about it!
There are at least three ways in which you can control how much CPU time a process gets:
- Use the nice command to manually lower the task's priority.
- Use the cpulimit command to repeatedly pause the process so that it doesn't exceed a certain limit.
- Use Linux's built-in **control groups**, a mechanism which tells the scheduler to limit the amount of resources available to the process.
Let's look at how these work and the pros and cons of each.
### Simulating high CPU usage ###
Before looking at these three techniques, we need to find a tool that will simulate high CPU usage on a system. We will be using CentOS as our base system, and to artificially load the processor we can use the prime number generator from the [Mathomatic toolkit][1].
There isn't a prebuilt package for CentOS, so you will need to build it yourself. Download the source code from http://mathomatic.orgserve.de/mathomatic-16.0.5.tar.bz2 and then unpack the archive file. Change directory into **mathomatic-16.0.5/primes**. Run **make** and **sudo make install** to build and install the binaries. You will now have the **matho-primes** binary in **/usr/local/bin**.
Run the command like this:
/usr/local/bin/matho-primes 0 9999999999 > /dev/null &
This will generate a list of prime numbers from zero to 9,999,999,999. Since we don't really want to keep the list, the output is redirected to /dev/null.
Now run top and you will see that the matho-primes process is using all the available CPU.
![](https://dl.dropboxusercontent.com/u/468982/blog/cpu_usage_blog/image00.jpg)
Exit top (press the q key) and kill the matho-primes process (fg to bring the process to the foreground and press CTRL+C).
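Instead of fg and CTRL+C, a background job can also be stopped by PID. The following is a minimal sketch; the `burn` function is a hypothetical stand-in for matho-primes so that the example runs without the Mathomatic build.

```shell
# Stand-in CPU burner (hypothetical helper); swap in matho-primes if installed.
burn() { while :; do :; done; }

burn > /dev/null &
pid=$!                            # PID of the most recent background job
sleep 1                           # let it run briefly; inspect with top if desired
kill "$pid"                       # sends SIGTERM (CTRL+C would send SIGINT)
wait "$pid" 2>/dev/null || true   # reap the job; its exit status is non-zero
```

Keeping the PID in a variable is convenient later on, because renice and cpulimit -p both take a PID.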
### nice ###
The nice command tweaks the priority level of a process so that it runs less frequently. **This is useful when you need to run a CPU intensive task as a background or batch job**. The niceness level ranges from -20 (most favorable scheduling) to 19 (least favorable). Processes on Linux are started with a niceness of 0 by default. The nice command (without any additional parameters) will start a process with a niceness of 10. At that level the scheduler will see it as a lower priority task and give it less CPU resources.
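The adjustment can be checked directly from the shell. This sketch relies on GNU coreutils behavior, where nice run with no command prints the niceness it is currently running at. It assumes the shell itself runs at niceness 0, which is typical but not guaranteed; the resulting value is always capped at 19.

```shell
# With no command, GNU nice prints the niceness it is running at.
base=$(nice)                # niceness of the current shell
default=$(nice nice)        # nice without -n adds 10
custom=$(nice -n 15 nice)   # -n sets the adjustment explicitly
echo "base=$base default=$default custom=$custom"
```

Negative adjustments (raising priority) normally require root privileges, so a sketch like this only lowers priority.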
Start two **matho-primes** tasks, one with nice and one without:
nice matho-primes 0 9999999999 > /dev/null &
matho-primes 0 9999999999 > /dev/null &
Now run top.
![](https://dl.dropboxusercontent.com/u/468982/blog/cpu_usage_blog/image05.jpg)
Observe that the process started without nice (at niceness level 0) gets more processor time, whereas the process with a niceness level of 10 gets less.
What this means in real terms is that if you want to run a CPU intensive task you can start it using nice and the scheduler will always ensure that other tasks have priority over it. This means that the server (or desktop) will remain responsive even when under heavy load.
nice has an associated command called renice. It changes the niceness level of an already running process. To use it, find out the PID of the process hogging all the CPU time (using ps) and then run renice:
renice +10 1234
Where 1234 is the PID.
Don't forget to kill the **matho-primes** processes once you have finished experimenting with the **nice** and **renice** commands.
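To confirm that renice took effect, the niceness can be read back before and after the call. This is a Linux-specific sketch: it uses a `sleep` job as a harmless stand-in for the CPU hog and reads field 19 of /proc/PID/stat, which holds the nice value. The `-n`/`-p` form of renice is used here instead of the shorthand shown above.

```shell
# Start a harmless background job to stand in for a CPU hog.
sleep 30 &
pid=$!

# Field 19 of /proc/PID/stat is the nice value (see proc(5)).
before=$(cut -d' ' -f19 "/proc/$pid/stat")
renice -n 10 -p "$pid" > /dev/null
after=$(cut -d' ' -f19 "/proc/$pid/stat")
echo "PID $pid: niceness $before -> $after"

kill "$pid"
```

An unprivileged user can only increase the niceness of their own processes; lowering it again generally needs root.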
### cpulimit ###
The **cpulimit** tool curbs the CPU usage of a process by pausing the process at different intervals to keep it under the defined ceiling. It does this by sending SIGSTOP and SIGCONT signals to the process. It does not change the **nice** value of the process, instead it monitors and controls the real-world CPU usage.
cpulimit **is useful when you want to ensure that a process doesn't use more than a certain portion of the CPU**. The disadvantage compared with nice is that the process can't use all of the available CPU time, even when the system is otherwise idle.
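The SIGSTOP/SIGCONT idea can be illustrated with a few lines of shell. This is a deliberately simplified sketch, not cpulimit's actual algorithm: the `throttle` helper is hypothetical, uses a fixed 100 ms slice, and does not measure real CPU usage the way cpulimit does. Fractional `sleep` arguments assume GNU coreutils.

```shell
# Hypothetical helper, not part of cpulimit: hold PID to roughly PCT percent
# of one core by alternating SIGCONT and SIGSTOP over CYCLES slices of 100 ms.
# Note: plain sh has no local variables, so these names are global.
throttle() {
    pid=$1 pct=$2 cycles=$3
    # Running and stopped portions of each 100 ms slice, in seconds.
    run=$(LC_ALL=C awk -v p="$pct" 'BEGIN { printf "%.3f", p / 1000 }')
    stop=$(LC_ALL=C awk -v p="$pct" 'BEGIN { printf "%.3f", (100 - p) / 1000 }')
    i=0
    while [ "$i" -lt "$cycles" ] && kill -0 "$pid" 2>/dev/null; do
        kill -CONT "$pid" 2>/dev/null || break
        sleep "$run"
        kill -STOP "$pid" 2>/dev/null || break
        sleep "$stop"
        i=$((i + 1))
    done
    kill -CONT "$pid" 2>/dev/null || true   # always leave the process running
}

# Demo on a stand-in busy loop: about 1 second at roughly 30% of a core.
( while :; do :; done ) &
target=$!
throttle "$target" 30 10
kill "$target"
wait "$target" 2>/dev/null || true
```

Because the process is fully stopped for part of every slice, it cannot borrow idle CPU time, which is the drawback described above.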
To install it on CentOS type:
wget -O cpulimit.zip https://github.com/opsengine/cpulimit/archive/master.zip
unzip cpulimit.zip
cd cpulimit-master
make
sudo cp src/cpulimit /usr/bin
The commands above will download the source code from GitHub, unpack the archive file, build the binary, and copy it to /usr/bin.
cpulimit is used in a similar way to nice; however, you need to explicitly define the maximum CPU limit for the process using the -l parameter. For example:
cpulimit -l 50 matho-primes 0 9999999999 > /dev/null &
![](https://dl.dropboxusercontent.com/u/468982/blog/cpu_usage_blog/image03.jpg)
Note how the matho-primes process is now only using 50% of the available CPU time. On my example system the rest of the time is spent in idle.
You can also limit a currently running process by specifying its PID using the -p parameter. For example:
cpulimit -l 50 -p 1234
Where 1234 is the PID of the process.
### cgroups ###
Control groups (cgroups) are a Linux kernel feature that allows you to specify how the kernel should allocate specific resources to a group of processes. With cgroups you can specify how much CPU time, system memory, network bandwidth, or combinations of these resources can be used by the processes residing in a certain group.
**The advantage of control groups over** nice **or** cpulimit **is that the limits are applied to a set of processes, rather than to just one**. Also, nice or cpulimit only limit the CPU usage of a process, whereas cgroups can limit other process resources.
By judiciously using cgroups, the resources of entire subsystems of a server can be controlled. For example, in CoreOS, the minimal Linux distribution designed for massive server deployments, the upgrade processes are controlled by a cgroup. This means the downloading and installing of system updates doesn't affect system performance.
To demonstrate cgroups, we will create two groups with different CPU resources allocated to each group. The groups will be called cpulimited and lesscpulimited.
The groups are created with the cgcreate command like this:
sudo cgcreate -g cpu:/cpulimited
sudo cgcreate -g cpu:/lesscpulimited
The "-g cpu" part of the command tells cgroups that the groups can place limits on the amount of CPU resources given to the processes in the group. Other controllers include cpuset, memory, and blkio. The cpuset controller is related to the cpu controller in that it allows the processes in a group to be bound to a specific CPU, or set of cores in a CPU.
The cpu controller has a property known as cpu.shares. It is used by the kernel to determine the share of CPU resources available to each process across the cgroups. The default value is 1024. By leaving one group (lesscpulimited) at the default of 1024 and setting the other (cpulimited) to 512, we are telling the kernel to split the CPU resources using a 2:1 ratio.
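The ratio follows from simple arithmetic: under full contention, each group's portion is its cpu.shares value divided by the sum of all competing groups' values. A small sketch, with `share_pct` as a hypothetical helper name and integer division rounding down:

```shell
# share_pct SHARES TOTAL: expected CPU percentage when every group is busy
share_pct() { echo $(( $1 * 100 / $2 )); }

total=$((512 + 1024))
echo "cpulimited: $(share_pct 512 "$total")%"        # prints "cpulimited: 33%"
echo "lesscpulimited: $(share_pct 1024 "$total")%"   # prints "lesscpulimited: 66%"
```

The values are relative weights, not hard caps, so only their ratio matters.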
To set the cpu.shares to 512 in the cpulimited group, type:
sudo cgset -r cpu.shares=512 cpulimited
To start a task in a particular cgroup you can use the cgexec command. To test the two cgroups, start matho-primes in the cpulimited group, like this:
sudo cgexec -g cpu:cpulimited /usr/local/bin/matho-primes 0 9999999999 > /dev/null &
If you run top you will see that the process is taking all of the available CPU time.
![](https://dl.dropboxusercontent.com/u/468982/blog/cpu_usage_blog/image01.jpg)
This is because when a single process is running, it uses as much CPU as necessary, regardless of which cgroup it is placed in. The CPU limitation only comes into effect when two or more processes compete for CPU resources.
Now start a second matho-primes process, this time in the lesscpulimited group:
sudo cgexec -g cpu:lesscpulimited /usr/local/bin/matho-primes 0 9999999999 > /dev/null &
The top command shows us that the process in the cgroup with the greater cpu.shares value is getting more CPU time.
![](https://dl.dropboxusercontent.com/u/468982/blog/cpu_usage_blog/image02.jpg)
Now start another matho-primes process in the cpulimited group:
sudo cgexec -g cpu:cpulimited /usr/local/bin/matho-primes 0 9999999999 > /dev/null &
![](https://dl.dropboxusercontent.com/u/468982/blog/cpu_usage_blog/image04.jpg)
Observe how the CPU is still being proportioned in a 2:1 ratio. Now the two matho-primes tasks in the cpulimited group are sharing the CPU equally, while the process in the other group still gets more processor time.
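The expected per-process figures can be worked out the same way. This sketch assumes a single CPU core and that all three processes are constantly runnable; `per_proc_pct` is a hypothetical helper that divides a group's portion evenly among its processes.

```shell
# per_proc_pct GROUP_SHARES TOTAL_SHARES NPROCS
per_proc_pct() {
    LC_ALL=C awk -v s="$1" -v t="$2" -v n="$3" \
        'BEGIN { printf "%.1f\n", s * 100 / t / n }'
}

per_proc_pct 512 1536 2    # each process in cpulimited: 16.7
per_proc_pct 1024 1536 1   # the process in lesscpulimited: 66.7
```

On a machine with more cores than busy processes there is no contention, and the shares would not visibly limit anything.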
You can [read the full control groups documentation from Red Hat][2] (which applies equally to CentOS 7).
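For reference, on a cgroup v1 system the cgcreate, cgset, and cgexec commands correspond roughly to plain file operations on the cgroup filesystem. The sketch below assumes the cpu controller is mounted at /sys/fs/cgroup/cpu (the path may differ on your system); `cg_make` and `cg_join` are hypothetical helper names, and writing to the real hierarchy requires root.

```shell
# Assumed cgroup v1 mount point for the cpu controller; override if different.
CG_ROOT=${CG_ROOT:-/sys/fs/cgroup/cpu}

# cg_make NAME SHARES: rough equivalent of cgcreate followed by cgset
cg_make() {
    mkdir -p "$CG_ROOT/$1" &&
    echo "$2" > "$CG_ROOT/$1/cpu.shares"
}

# cg_join NAME PID: move a process into the group, as cgexec does before exec
cg_join() {
    echo "$2" > "$CG_ROOT/$1/tasks"
}
```

As root, `cg_make cpulimited 512` followed by `cg_join cpulimited $$` would place the current shell, and any processes it starts afterwards, in the cpulimited group.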
### Monitoring process CPU usage with Scout ###
What's the easiest way to monitor process CPU usage? [Scout][3] automatically tracks process CPU and memory usage when our monitoring agent is installed on your servers.
![](https://dl.dropboxusercontent.com/u/468982/blog/server_view/processes.png)
You can then create triggers to alert you when processes exceed specific CPU + memory usage thresholds.
[Sign up for a free trial of Scout][4] to try process CPU monitoring.
### TL;DR ###
![](https://dl.dropboxusercontent.com/u/468982/blog/cpu_usage_blog/overview.png)
The finite resources of any server or desktop are a valuable commodity. The tools described above help you manage those resources, especially the CPU resource:
- **nice** is a great tool for 'one off' tweaks to a system.
- **cpulimit** is useful when you need to run a CPU intensive job and having free CPU time is essential for the responsiveness of a system.
- **cgroups** are the Swiss army knife of process limiting and offer the greatest flexibility.
--------------------------------------------------------------------------------
via: http://blog.scoutapp.com/articles/2014/11/04/restricting-process-cpu-usage-using-nice-cpulimit-and-cgroups
Translator: [translator ID](https://github.com/译者ID)
Proofreader: [proofreader ID](https://github.com/校对者ID)
This article was originally translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](http://linux.cn/)
[1]:http://www.mathomatic.org/
[2]:https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Resource_Management_and_Linux_Containers_Guide/chap-Introduction_to_Control_Groups.html
[3]:https://scoutapp.com/
[4]:https://scoutapp.com/


@ -0,0 +1,196 @@
Restricting CPU usage with nice, cpulimit, and cgroups
================================================================================
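Note: the images in this article appear to be reachable only through a proxy that bypasses the firewall; keep this in mind when publishing.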
![](https://dl.dropboxusercontent.com/u/468982/blog/cpu_usage_blog/juggle.jpg)
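The Linux kernel is a remarkable circus performer: it carefully juggles processes and system resources and keeps the system running properly. The kernel is also fair: it distributes resources evenly among processes.
But what should you do when an important process needs a higher priority? Or when a process should run at a lower priority? Or when you want to limit the resources used by a group of processes?
**The answer is that the user has to tell the kernel which priority a process should have.**
Most processes start at the same priority, so the Linux kernel schedules them evenly. If you want a CPU-intensive process to run at a lower priority, you have to tell the scheduler.
Here are three ways to control how much CPU time a process gets:
- Use the nice command to manually lower the priority of a task.
- Use the cpulimit command to cap the CPU time a process may use.
- Use Linux's built-in **control groups**, a mechanism for limiting the resources a process consumes.
Let's look at how these three tools work and at the pros and cons of each.
### Simulating high CPU usage ###
Before examining these three techniques, we need to install a tool that simulates high CPU usage. We will use CentOS as the test system and the prime number generator from the [Mathomatic toolkit][1] to load the CPU.
Unfortunately there is no prebuilt package of this tool for CentOS, so it has to be installed from source. Download the source archive from http://mathomatic.orgserve.de/mathomatic-16.0.5.tar.bz2 and unpack it. Then change into the **mathomatic-16.0.5/primes** directory and run **make** and **sudo make install** to build and install it. This installs the **matho-primes** program into **/usr/local/bin**.
Next, run it from the command line: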
/usr/local/bin/matho-primes 0 9999999999 > /dev/null &
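The program prints the prime numbers between 0 and 9999999999. Since we do not need this output, we simply redirect it to /dev/null.
Now run top and you will see that the matho-primes process is consuming all of your CPU.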
![](https://dl.dropboxusercontent.com/u/468982/blog/cpu_usage_blog/image00.jpg)
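Now exit top (press the q key) and kill the matho-primes process (use fg to bring it to the foreground, then press CTRL+C).
### The nice command ###
Next, the nice command. nice changes the priority of a process so that it runs less often. **This is especially useful for CPU-intensive background processes or batch jobs.** The nice value ranges from -20 to 19, where -20 is the highest priority and 19 the lowest. A Linux process has a nice value of 0 by default. Running the nice command without any additional parameters starts a process with a nice value of 10. The scheduler then treats the process as lower priority and allocates it less CPU.
Here is an example: we run two **matho-primes** processes at the same time, one started with nice and the other started normally: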
nice matho-primes 0 9999999999 > /dev/null &
matho-primes 0 9999999999 > /dev/null &
Now run top.
![](https://dl.dropboxusercontent.com/u/468982/blog/cpu_usage_blog/image05.jpg)
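Notice that the process started normally (nice value 0) gets more CPU time, while the process started with nice (nice value 10) gets less.
In practice, if you need to run a CPU-intensive program, it is best to start it with nice so that other processes get higher priority. That way your server or desktop stays responsive even under heavy load.
nice has a companion command called renice, which adjusts the nice value of a process that is already running. To use renice, first find the PID of the process. For example: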
renice +10 1234
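Where 1234 is the PID of the process.
After experimenting with the **nice** and **renice** commands, remember to kill all the **matho-primes** processes.
### The cpulimit command ###
Next, the **cpulimit** command. **cpulimit** works by setting a CPU usage threshold for a process and monitoring in real time whether the process exceeds it; if it does, the process is paused for a while. cpulimit controls the process with the SIGSTOP and SIGCONT signals. It does not change the nice value of the process; instead it adjusts dynamically based on the CPU usage it observes.
The advantage of cpulimit is that it can cap the CPU usage of a process. Compared with nice it also has a drawback: even when the CPU is idle, the process cannot use all of it.
On CentOS it can be installed as follows: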
wget -O cpulimit.zip https://github.com/opsengine/cpulimit/archive/master.zip
unzip cpulimit.zip
cd cpulimit-master
make
sudo cp src/cpulimit /usr/bin
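The commands above download the source code from GitHub, then unpack it, build it, and install the binary into /usr/bin.
cpulimit is used much like nice, but you must explicitly set the CPU usage cap for the process with the -l option. For example: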
cpulimit -l 50 matho-primes 0 9999999999 > /dev/null &
![](https://dl.dropboxusercontent.com/u/468982/blog/cpu_usage_blog/image03.jpg)
As the example shows, matho-primes uses only 50% of the CPU, and the rest of the CPU time is idle.
cpulimit can also limit a process that is already running; use the -p option to specify its PID. For example:
cpulimit -l 50 -p 1234
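Where 1234 is the PID of the process.
### The cgroups command set ###
Finally, the most powerful of the three: control groups (cgroups). cgroups are a mechanism provided by the Linux kernel for specifying how resources are allocated to a group of processes. Specifically, with cgroups you can limit the CPU usage, system memory consumption, and network bandwidth of a group of processes, or any combination of these resources.
Compared with nice and cpulimit, **the advantage of cgroups** is that they control a group of processes rather than a single process. In addition, nice and cpulimit can only limit CPU usage, whereas cgroups can limit other process resources as well.
Used well, cgroups can control the resource consumption of an entire subsystem. Take CoreOS, a minimal Linux distribution designed for large-scale server deployments: its upgrade process is managed by a cgroup, so downloading and installing system updates does not affect system performance.
To demonstrate, we will create two control groups and allocate different CPU resources to each. The groups are named "cpulimited" and "lesscpulimited".
Create the control groups with the cgcreate command, as follows: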
sudo cgcreate -g cpu:/cpulimited
sudo cgcreate -g cpu:/lesscpulimited
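The "-g cpu" option tells cgroups that the groups can limit the CPU resources given to their processes. Besides cpu, cgroups also provide controllers such as cpuset, memory, and blkio. The cpuset controller is related to the cpu controller: it binds the processes in a group to a specific CPU or to a set of cores.
The cpu.shares property of the cpu controller determines the share of CPU each group receives. Its default value is 1024. We leave the lesscpulimited group at 1024 (the default) and set cpulimited to 512; the kernel then allocates CPU to the two groups in a 2:1 ratio.
To set the cpu.shares to 512 in the cpulimited group, type: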
sudo cgset -r cpu.shares=512 cpulimited
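Use the cgexec command to start a task in a control group. To test the two groups, first start a matho-primes process in the cpulimited group with the following command: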
sudo cgexec -g cpu:cpulimited /usr/local/bin/matho-primes 0 9999999999 > /dev/null &
Run top and you will see that the matho-primes process is using all of the CPU.
![](https://dl.dropboxusercontent.com/u/468982/blog/cpu_usage_blog/image01.jpg)
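Because only one process is running, it uses as much CPU as it can, regardless of which control group it was started in. The CPU limit only takes effect when two or more processes compete for CPU.
Now start a second matho-primes process, this time in the lesscpulimited group: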
sudo cgexec -g cpu:lesscpulimited /usr/local/bin/matho-primes 0 9999999999 > /dev/null &
Run top again and you will see that the control group with the larger cpu.shares value gets more CPU time.
![](https://dl.dropboxusercontent.com/u/468982/blog/cpu_usage_blog/image02.jpg)
Now add another matho-primes process to the cpulimited group:
sudo cgexec -g cpu:cpulimited /usr/local/bin/matho-primes 0 9999999999 > /dev/null &
![](https://dl.dropboxusercontent.com/u/468982/blog/cpu_usage_blog/image04.jpg)
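Notice that the CPU is still split between the two control groups in a 2:1 ratio. The two matho-primes processes in the cpulimited group get roughly equal CPU time, while the matho-primes process in the other group clearly gets more.
For more details, see the full cgroups [documentation][2] from Red Hat (it applies to CentOS 7 as well).
### Monitoring CPU usage with Scout ###
What is the simplest way to monitor CPU usage? [Scout][3] can automatically track the CPU and memory usage of processes.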
![](https://dl.dropboxusercontent.com/u/468982/blog/server_view/processes.png)
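The trigger feature of [Scout][3] can also set CPU and memory usage thresholds and raise an alert automatically when they are exceeded.
A trial version of [Scout][4] is available here.
### Summary ###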
![](https://dl.dropboxusercontent.com/u/468982/blog/cpu_usage_blog/overview.png)
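System resources are precious. The three tools described above help you manage them effectively, especially the CPU:
- **nice** makes one-off adjustments to the priority of a process.
- **cpulimit** is useful when you run CPU-intensive tasks and need to keep the system responsive.
- **cgroups** are the Swiss army knife of resource management and are also very flexible to use.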
--------------------------------------------------------------------------------
via: http://blog.scoutapp.com/articles/2014/11/04/restricting-process-cpu-usage-using-nice-cpulimit-and-cgroups
Translator: [coloka](https://github.com/coloka)
Proofreader: [proofreader ID](https://github.com/校对者ID)
This article was originally translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](http://linux.cn/)
[1]:http://www.mathomatic.org/
[2]:https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Resource_Management_and_Linux_Containers_Guide/chap-Introduction_to_Control_Groups.html
[3]:https://scoutapp.com/
[4]:https://scoutapp.com/