[Translated] 08 The Linux Kernel--Configuring the Kernel Part 4

This commit is contained in:
geekpi 2013-11-04 06:39:34 +00:00
parent fda0a24bfe
commit b287cb50af
2 changed files with 151 additions and 155 deletions

View File

@ -1,155 +0,0 @@
Translating-------geekpi
08 The Linux Kernel: Configuring the Kernel Part 4
================================================================================
![](http://www.linux.org/attachments/slide-jpg.392/)
Here, in part four of “The Linux kernel: Configuring the Kernel”, we are continuing with more settings and features to configure.
Here, we are asked about “IBM Calgary IOMMU support (CALGARY_IOMMU)”. This option will enable support for IOMMUs that belong to the xSeries x366 and x460 by IBM. This will also allow 32-bit PCI that do not support DAC (Double Address Cycle) devices of such systems to run properly because this system setup will have issues accessing more than 3GB of RAM. If needed, these IOMMU devices can be turned off at boot time using the “iommu=off” parameter. (These kernel/module parameters will be discussed in a later article.)
An IOMMU (input/output memory management unit) is a memory management unit (MMU) that connects a DMA-capable I/O bus to the main memory. DMA (Direct Memory Access) is a feature of many computers that allows certain devices to access the memory without help from the CPU. Double Address Cycle (DAC) is 64-bit DMA; regular DMA uses 32-bits.
Next, we are asked about enabling Calgary by default (Should Calgary be enabled by default? (CALGARY_IOMMU_ENABLED_BY_DEFAULT)). Calgary is the same concept as the IOMMU support mentioned above. The difference between the two is this is support for such a feature on many devices while the one above is specifically for IOMMU IBM devices. If this is disabled and you need to use it later, use this kernel parameter (iommu=calgary).
Here is a question that should be handled carefully (Enable Maximum number of SMP Processors and NUMA Nodes (MAXSMP)). Only enable this if the system the kernel will run on has many SMP processors and NUMA nodes like the Core i7 and many AMD CPU chips. If such a system lacks SMP processors and NUMA nodes or has a very little amount, the kernel can be inefficient. It is best to select “No”.
Non-Uniform Memory Access (NUMA) is a system of memory where each part of the memory takes longer to access than others. A node is a set of memory. For instance, a NUMA system may have three RAM chips. Each chip is one node. There is a node/chip on the motherboard with the CPU (this is the fastest node). The other two nodes are on a different bus. These two nodes take more time to access than the first node.
NOTE: ccNUMA and NUMA are now the same or at least very similar.
Symmetric Multi-Processing (SMP) is an alternative to NUMA. The memory is on the same bus. Only so many CPUs can access a bus, so this limits the number of processors on a SMP system. However, the memory is equally fast.
REMEMBER: I am compiling a kernel for an AMD64 system, so I will tell you what choice I made to help readers to understand the process and choices. If I do not specify what I choose, then I choose the default choice. When you are compiling for a different system or you have different needs, you will need to make alternate decisions based on your situation.
Next, choose the maximum number of CPUs the kernel will support unless the configuration tool chose for you. This configuration can optimize the kernel for the given amount.
Then, enable or disable “SMT (Hyperthreading) scheduler support (SCHED_SMT)”. The SMT scheduler improves the CPU's decision-making on Pentium 4 processors that use HyperThreading. However, there will be extra overhead. For some systems, it is best to choose “no” as I have done.
HyperThreading is a proprietary SMT parallelization for microprocessors (Intel invented the implementation). This is a special form of multitasking/multithreading (doing many tasks at once). Simultaneous multithreading (SMT) improves multithreading efficiency.
After that, enable or disable “Multi-core scheduler support (SCHED_MC)”. This feature also improves the CPU's decision making on multi-core chips. However, the cost is extra overhead. I chose “No”.
Here, in this next option, the preemption model can be chosen.
Preemption Model
1. No Forced Preemption (Server) (PREEMPT_NONE)
> 2. Voluntary Kernel Preemption (Desktop) (PREEMPT_VOLUNTARY)
3. Preemptible Kernel (Low-Latency Desktop) (PREEMPT)
choice[1-3]: 2
Preemption is the process of pausing an interrupting task with the intent of allowing it to continue executing later. Preemption forces the task to pause. The task cannot ignore preemption.
Next, we are asked about “Reroute for broken boot IRQs (X86_REROUTE_FOR_BROKEN_BOOT_IRQS)”. This is simply a fix for spurious interrupts. A spurious interrupt is an unwanted hardware interrupt. These are usually triggered by electrical interference or improperly connected electronics. Remember, an interrupt is a signal to the processor that needs immediate attention.
This option is vital for every machine; I doubt anyone would have a reason for disabling this feature (Machine Check / overheating reporting (X86_MCE)). The kernel must be aware of overheating and data corruption, otherwise, the system will continue to operate only to receive further damage.
Next, users can enable/disable “Intel MCE features (X86_MCE_INTEL)”. This is extra support for Intel MCE features like a thermal monitor. I chose “no” since I am compiling for an AMD64 processor. Machine Check Exception (MCE) is a type of error produced when the processor finds a hardware issue. An MCE will usually cause a kernel panic (equivalent to the “Blue-Screen-of-Death” on Windows).
Here is the same question again except for AMD devices (AMD MCE features (X86_MCE_AMD)).
Next is a debugging feature that I will disable (Machine check injector support (X86_MCE_INJECT)). This allows injecting machine checks. If you do perform machine injections occasionally, it may be better to enable this as a module rather than build it into the kernel. Machine injection makes a device send an error message even though no real error exists. This is used to make sure the kernel and other processes act correctly towards errors. For instance, if the CPU over heats, then it should shutdown, but how would a developer test such code without harming the CPU. Injecting errors is the best way since it is just software that tells hardware to send an error signal.
NOTE: Modules are for features/drivers that may be used or are very rarely executed. Only add features/drivers to the kernel itself if it will be used by many systems that will use the kernel being built.
If the kernel will likely be used on Dell laptops, then enable this feature (Dell laptop support (I8K)). Otherwise, add it as a module if some users of this kernel may use it on a Dell laptop. If this kernel is not planned for Dell laptops, then disable this support as I have done. Specifically, this support is a driver that allows the System Management Mode of the processor to be accessed on the Dell Inspiron 8000. The system management mode's purpose is to get the processor's temperature and fan status which are needed information for controlling the fans on such systems.
Next, users can allow microcode loading support (CPU microcode loading support (MICROCODE)). This will allow users to update the microcode on AMD and Intel CPU chips that support such a feature.
NOTE: To load microcode, you must have a legal copy of the microcode binary-file that is designed for your processor.
For loading microcode patches (to fix bugs or add minor features) to Intel chips (Intel microcode loading support (MICROCODE_INTEL)), this must be enabled. I disabled this feature.
This is the same as above except for AMD chips (AMD microcode loading support (MICROCODE_AMD)).
Enabling this support (/dev/cpu/*/msr - Model-specific register support (X86_MSR)) will allow certain processes to have permission to use x86
Model-Specific Registers (MSRs). These registers are a character devices that include major 202 and minors 0 to 31 (/dev/cpu/0/msr to /dev/cpu/31/msr). This feature is used on multiprocessor systems. Each virtual character device is linked to a specific CPU.
NOTE: MSRs are used for changing CPU settings, debugging, performance monitoring, and execution tracing. MSRs use the x86 instruction set.
After that, we have an option for “CPU information support (X86_CPUID)”. Enabling this feature allows processes to access the x86 CPUID instructions needed to execute on a particular CPU through character devices. These character devices that include major 202 and minors 0 to 31 (/dev/cpu/0/msr to /dev/cpu/31/msr), just like x86_MSR support above.
For processors that support it, enable the kernel linear mapping to use 1GB pages (Enable 1GB pages for kernel pagetables (DIRECT_GBPAGES)). Enabling this feature helps performance by reducing TLB pressure.
A page is the basic unit of memory itself (bits are the basic units of data). Page size is determined by the hardware itself. A page table is the mapping between virtual and physical memory. Physical memory is the memory on devices. The virtual memory is the addresses to the memory. Depending on the architecture of a system, there may be more addresses the hardware has the ability to access than what there is to access. For instance, on a 64-bit system with a 6GB RAM chip, an administrator can add more RAM if needed. This is because there are still more virtual addresses. However, on many 32-bit systems, an administrator can add an 8GB RAM chip, but the system will not use all of it because there are not enough virtual addresses for the system to use to access the large amount of RAM. Translation Lookaside Buffer (TLB) is a cache system that improves virtual address translation speed. Reducing pressure on it keeps it from being so busy.
Next, we have a NUMA option (Numa Memory Allocation and Scheduler Support (NUMA)). This will allow the kernel to allocate memory used by the CPU on the local memory controller of the CPU. This support also makes the kernel more NUMA aware. Very few 32-bit systems need this feature, but some common 64-bit processors use this feature. I chose “no”.
For the system to detect AMD NUMA node topology using an old method, enable this feature (Old style AMD Opteron NUMA detection (AMD_NUMA)). Next is an option for a newer detection method (ACPI NUMA detection (X86_64_ACPI_NUMA)). If both are enabled, the newer one will dominate. Some hardware works better using one of the methods instead of the other.
For NUMA emulation for the purpose of debugging, enable this next feature (NUMA emulation (NUMA_EMU)).
NOTE: If you do not plan to do debugging and you need a fast, light-weight system, disable as many debugging features as you can.
In this next option, choose the maximum number of NUMA nodes your kernel is planned to handle. Then, choose the memory model. There may be only one choice for a memory model. The memory model specifies how the memory is stored.
Maximum NUMA Nodes (as a power of 2) (NODES_SHIFT) [6]
Memory model
> 1. Sparse Memory (SPARSEMEM_MANUAL)
choice[1]: 1
To help performance, there is this option for optimizing pfn_to_page and page_to_pfn operations via a virtually mapped memmap (Sparse Memory virtual memmap (SPARSEMEM_VMEMMAP)). Page Frame Number (pfn) is a number given to each page. These two operations get a page from a number or a number from a page.
Next is an option that allows a node to move memory (Enable to assign a node which has only movable memory (MOVABLE_NODE)). Kernel pages cannot normally be moved. When enabled, users can hotplug memory nodes. Also, movable memory allows memory defragmentation. As data goes in and out of memory, a set of data may get divided across the memory where ever there is space available.
Following the previous memory question, we have more. These may be preconfigured by the configuration tool. The third option (BALLOON_COMPACTION), when enabled, helps to lessen memory fragmentation. Fragmented memory can slow down the system. The fourth option (COMPACTION) allows memory to be compressed. The fifth option listed below (MIGRATION) allows pages to be moved.
Allow for memory hot-add (MEMORY_HOTPLUG)
Allow for memory hot remove (MEMORY_HOTREMOVE)
Allow for balloon memory compaction/migration (BALLOON_COMPACTION)
Allow for memory compaction ()
Page migration (MIGRATION)
NOTE: Enabling movable memory will enable the five features listed above.
Next, we can “Enable KSM for page merging (KSM)”. Kernel Samepage Merging (KSM) views memory that an application says can be merged. This saves memory because if two pages are identical, one can be deleted or merged and only one will be used.
The configuration tool may automatically choose how much memory to save for user allocation (Low address space to protect from user allocation (DEFAULT_MMAP_MIN_ADDR) [65536]).
This next option is important (Enable recovery from hardware memory errors (MEMORY_FAILURE)). If the memory fails and the system has MCA recovery and ECC memory, the system can continue to run and try to recover. To have such a feature, the hardware itself must be able to support it as well as the kernel.
Machine Check Architecture (MCA) is a feature of some CPUs where they can send hardware error messages to the operating system. Error-correcting code memory (ECC memory) is a form of a memory device with error detection and correction.
Next, the configuration tool automatically enables “HWPoison pages injector (HWPOISON_INJECT)”. This feature allows the kernel to mark a bad page as “poisoned”, and then the kernel kills the application that created the bad page. This helps to stop and correct errors.
To allow the kernel to use large pages (Transparent Hugepage Support (TRANSPARENT_HUGEPAGE)), enable this feature. This speeds up the system, but more memory is needed. Embedded system should not use this feature. Embedded systems generally have very small amounts of memory.
If the above is enabled, then the sysfs support for huge pages must be configured.
Transparent Hugepage Support sysfs defaults
1. always (TRANSPARENT_HUGEPAGE_ALWAYS)
> 2. madvise (TRANSPARENT_HUGEPAGE_MADVISE)
choice[1-2?]: 2
This next option is for adding system calls process_vm_readv and
process_vm_writev (Cross Memory Support (CROSS_MEMORY_ATTACH)). This allows privileged processes to access the address space of another application.
If tmem is present, enabling cleancache will generally be a good idea. (Enable cleancache driver to cache clean pages if Transcendent Memory (tmem) is present (CLEANCACHE)). When needed pages are removed from the memory, cleancache will place the page on cleancache-enabled filesystems. When the page is needed, it is placed back on memory. Transcendent Memory (tmem) is memory without a set known size. The kernel indirectly addresses such memory.
This next option allows caching swap pages if tmem is active (Enable frontswap to cache swap pages if tmem is present (FRONTSWAP)). Frontswap places data on swap partitions. This is needed for swap support.
It is best to enable this next feature (Check for low memory corruption (X86_CHECK_BIOS_CORRUPTION)). This will check the low memory for memory corruption. This feature is disabled at runtime. To use this feature, add “memory_corruption_check=1” to the kernel command-line (this will be discussed in a later article; this is not the same as any command-line). This feature, even when actively being executed, uses very little overhead (nearly none).
Next, we can “Set the default setting of memory_corruption_check (X86_BOOTPARAM_MEMORY_CORRUPTION_CHECK)”. This will set whether memory_corruption_check is on or off. It is best to have the memory checked otherwise data can be lost and the system can crash if an important part of the memory is corrupted.
This option concerns the BIOS (Amount of low memory, in kilobytes, to reserve for the BIOS (X86_RESERVE_LOW) [64]). The configuration tool usually knows the best amount of memory to reserve for the BIOS.
For Intel P6 processors, developers can enable memory type range registers (MTRR (Memory Type Range Register) support (MTRR)). This is used for AGP and PCI cards with a VGA card attached. Enabling this feature creates /proc/mtrr.
If X drivers need to add writeback entries, then enable this following option (MTRR cleanup support (MTRR_SANITIZER)). This will convert MTRR layout from continuous to discrete. Memory type range registers (MTRRs) provide software a way to access CPU cache.
Next, some MTRR options are set by the configuration tool.
MTRR cleanup enable value (0-1) (MTRR_SANITIZER_ENABLE_DEFAULT) [1]
MTRR cleanup spare reg num (0-7) (MTRR_SANITIZER_SPARE_REG_NR_DEFAULT) [1]
To set up page-level cache control, enable PAT attributes (x86 PAT support (X86_PAT)). Page Attribute Table (PATs) are the modern equivalents of MTRRs and are much more flexible than MTRRs. If you experience bootup issues with this enabled, then remake the kernel with this feature disabled. I chose “no”.
--------------------------------------------------------------------------------
via: http://www.linux.org/threads/the-linux-kernel-configuring-the-kernel-part-4.4392/
译者:[译者ID](https://github.com/译者ID) 校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创翻译,[Linux中国](http://linux.cn/) 荣誉推出

View File

@ -0,0 +1,151 @@
08 Linux 内核: 配置内核(Part 4)
================================================================================
![](http://www.linux.org/attachments/slide-jpg.392/)
在“The Linux kernel: Configuring the Kernel”的第四部分里我们将继续配置更多的设置和特性。
这里我们被问及关于"IBM Calgary IOMMU support (CALGARY_IOMMU)"。这个选项将会提供对IBM xSeries x366和x460的IOMMU的支持。这也将让那些32位PCI,这些系统上不支持双地址周期(DAC (Double Address Cycle))的设备正常运行,因为该系统设置在访问超过3GB内存的时候会有问题。如果需要这些IOMMU设备可以用"iommu=off"在启动时关闭。(这些内核/模块参数会在以后的文章中讨论)
IOMMU(input/output memory management unit)是一个内存管理单元(MMU),它连接具有DMA功能的I/O总线到主内存上。DMA(Direct Memory Access)是许多计算机支持的一种允许特定设备不借助CPU直接访问内存的特性。双地址周期(Double Address Cycle, DAC)是64位DMA;通常的DMA使用32位。
下面我们被问及是否默认启用Calgary(Should Calgary be enabled by default? (CALGARY_IOMMU_ENABLED_BY_DEFAULT))。Calgary与上面提到的IOMMU是同一个概念。这两者之间的不同是IOMMU可以支持许多设备而Calgary只能支持IBM IOMMU设备。如果禁用了它但是以后需要使用到它可以使用内核参数(iommu=calgary)。
这里有个问题需要小心处理(Enable Maximum number of SMP Processors and NUMA Nodes (MAXSMP))。只有在内核运行在拥有很多SMP处理器和NUMA节点的情况下如Core i7和许多AMD CPU芯片才启用它。如果系统缺乏或者只有少量的SMP处理器和NUMA节点内核就会变得低效。这个最好选择"No"。
非一致性内存访问(Non-Uniform Memory Access (NUMA))是一个每部分内存都需要花费更长时间访问其他部分内存的系统。一个节点就是一组内存。例如一个NUMA系统可能有三块内存芯片。每块芯片是一个节点。在带CPU的主板上有一个节点/芯片(这是最快的节点)。另外两个在不同的总线傻nag。这两个节点需要比第一个节点花费更长的时间去访问。
注意ccNUMA和NUMA目前是相同的至少是非常相似的。
对称多处理器(Symmetric Multi-Processing (SMP))是NUMA的替代品。它的内存在同一根总线上。只有这么多的CPU可以访问总线所以这限制了SMP系统上处理器的数量。然而它内存的访问速度一样块。
记住我是在为AMD64系统在编译内核所以我会告诉你我的选择来帮助读者理解过程和选择。如果我没有指出我的选择那么我用的就是默认选择。如果你在为不同的系统编译或者你有不同的需求你需要在你的情况下做出替代的选择。
接下来除非配置工具已经为你做了选择选择一个内核需要支持的最多CPU的数量。这个配置根据你给的数量优化内核。
接着启用或禁用"SMT (Hyperthreading) scheduler support (SCHED_SMT)"。SMT调度器提升了在使用了超线程技术的Pentium 4处理器上的CPU决策。然而这会带来额外的功耗在一些系统上最好像我一样选择"no"。
超线程一种专有的SMT并行微处理器(Intel 实现了它)。这是多任务/多线程(同时做许多任务)的一种特殊形式。同时多线程(Simultaneous multithreading (SMT))提升了多线程的效率。
在这之后,启用或者禁用"Multi-core scheduler support (SCHED_MC)"。这样也是一种提升多核CPU决策的特性。然而这回带来额外功耗。我选择了"No"。
在下一个选项中可以选择抢占式模式。
Preemption Model
1. No Forced Preemption (Server) (PREEMPT_NONE)
> 2. Voluntary Kernel Preemption (Desktop) (PREEMPT_VOLUNTARY)
3. Preemptible Kernel (Low-Latency Desktop) (PREEMPT)
choice[1-3]: 2
抢占就是暂停一个意图让它之后继续执行的中断任务的过程。抢占强制一个进程暂停。任务无法忽视抢占。
接着,我们被询问关于"Reroute for broken boot IRQs (X86_REROUTE_FOR_BROKEN_BOOT_IRQS)"。这是一个对于假中断的简单修复。假中断是一种不需要的硬件中断。这些通常是有电子干扰或者错误连接的电子产品触发。记住,中断是发送给处理器需要马上注意的信号。
这个选项对任何机器都很重要;我怀疑任何人可能都会有禁用这个特性的理由(Machine Check / overheating reporting (X86_MCE))。内核必须意识到过热和数据损坏,不然,系统将会继续操作,这样只会导致进一步的破坏。
下面,用户可以启用禁用"Intel MCE features (X86_MCE_INTEL)"这是一种额外的对像热度监控的Intel MCE特性的支持。因为我是为AMD64处理器编译内核所以我选择了"no"。机器检测异常(MCE)是一种当处理器发现硬件问题时的错误输出。MCE通常会导致内核严重错误(kernel panic)(相当于Windows中的"蓝屏")。
这个除了是AMD设备外是同一个问题Intel MCE features (X86_MCE_INTEL)。
下一个是我会禁用的调试特性(Machine check injector support (X86_MCE_INJECT))。这个会允许注射机检查。如果你偶尔执行机器注射那最好编译成模块而不是编译进内核。机器注射可以使设备即使实际没有错误也可以发送一个错误信息。这个用来确认内核和其他进程可以正常处理错误。比如如果CPU过热接着应该关机但是开发者如何在不损坏CPU的情况下测试代码。注射错误是一种最好的方法因为它只是一种告诉硬件发送错误信号的软件。
注:模块是对可能被使用或者很少执行的特性/驱动而言的。只加入在许多使用该内核的系统中用到的特性/驱动到内核中。
如果内核很可能用在Dell笔记本上那么启用这个特性(Dell laptop support (I8K))。否则,如果一些用户可能在戴尔笔记本电脑上使用这个内核将其作为一个模块加入。如果这个内核不打算支持Dell笔记本那就像我一样注释掉它。特别地这个支持是一个允许Dell Inspiron 8000系列笔记本访问处理器的系统管理模式的驱动。系统管理模式的目的是得到处理器的温度和风扇状态这对一些需要控制风扇的系统有用。
下面,用户可以选择微代码加载支持(CPU microcode loading support (MICROCODE))。这可以允许用户在支持这个特性的AMD或者Intel芯片上更新微代码。
注意:为了加载微代码,你必须拥有一个为你的处理器设计的合法的二进制微代码拷贝。
为了加载微代码补丁(修复bug或加入次要的特性)到intel芯片上(Intel microcode loading support (MICROCODE_INTEL)),这个必须启用。我禁用了它。
除了是AMD芯片外这个与上面的相同(AMD microcode loading support (MICROCODE_AMD))。
启用这个支持(/dev/cpu/*/msr - Model-specific register support (X86_MSR))可以允许某个处理器有权限使用x86特殊模块寄存器(Model-Specific Registers (MSRs))。这些寄存器是一些字符设备包括主要的202和次要的0到21((/dev/cpu/0/msr to /dev/cpu/31/msr))。这个特性用在多处理器系统上。每个虚拟字符设备都连接到一个特定的CPU。
注意MSRs是用来改变CPU设备、调试、性能监控和执行追踪。MSRs使用x86指令集。
在这之后,我们有一个选项"CPU information support (X86_CPUID"启用这个特性允许处理器访问x86 CPUID指令这需要通过字符设备在一个特定的CPU上执行。这些字符设备包括主要的202和次要的0到31(/dev/cpu/0/msr to /dev/cpu/31/msr),就像x86_MSR支持上面这些。
对于支持它的处理器启用内核线性映射来使用1GB的页(Enable 1GB pages for kernel pagetables (DIRECT_GBPAGES))。启用这个可以帮助减轻TLB的压力。
页是内存本身的基本单位(位是数据的基本单位)。页的大小是由硬件自身决定的。页码表是虚拟和物理内存间的映射。物理内存是设备上的内存。虚拟内存是到内存的地址。依赖于系统架构硬件可以访问大于实际内存地址的地址。举例来说一个64位系统拥有6GB内存管理员在需要时可以加上更多的内存。这是因为还有很多虚拟内存地址。然而在很多32位系统上系统管理员可以增加一条8GB的内存但是系统无法完全使用它因为系统中没有足够的虚拟内存地址去访问大容量的内存。转换后援缓冲器(Translation Lookaside Buffer (TLB))是一种提升虚拟内存转换速度的缓存系统。
下面我们有了一个NUMA选项(Numa Memory Allocation and Scheduler Support (NUMA))。这可以允许内核在CPU本地内存分配器上分配CPU使用的内存。这个支持同样内核更好感知到NUMA。很少的32位系统需要这个特性但是一些通用的645位处理器使用这个特性。我选择了"no"。
为了系统使用旧方式来检测AMD NUMA节点拓扑启用这个特性feature (Old style AMD Opteron NUMA detection (AMD_NUMA))。下一个选项是一种更新的检测方式(ACPI NUMA detection (X86_64_ACPI_NUMA))。如果两个都启用,新的方式将会占支配作用。一些硬件在使用其中一种方式而不是另外一个时工作得更好。
至于为了调试目的的NUMA仿真启用下一个特性(NUMA emulation (NUMA_EMU))。
注意:如果你不打算调试并且你需要一个快速、轻量级系统,那么禁用尽可能多的调试特性。
下一个选项中选择你的内核打算如何处理NUMA节点的最大数量。接下来选择内存模型。这里可能只有一个内存模型选择。内存模型指定了内存如何存储。
Maximum NUMA Nodes (as a power of 2) (NODES_SHIFT) [6]
Memory model
> 1. Sparse Memory (SPARSEMEM_MANUAL)
choice[1]: 1
为了有助于性能,这里有一个选项用通过虚拟内存映射(Sparse Memory virtual memmap (SPARSEMEM_VMEMMAP))来优化pfn_to_page和page_to_pfn操作。页帧号是每页被给定的号码。这两个操作用来从号码得到页或者从页得到号码。
下一个选项是允许一个节点可以移除内存(Enable to assign a node which has only movable memory (MOVABLE_NODE))。内核页通常无法移除。当启用后,用户可以热插拔内存节点。同样可移除内存允许内存整理。作为出入内存的数据,只要有可用空间一组数据可能被划分到不同内存。
接着前面的内存问题,我们还有更多的问题。这些可能已被配置工具预配置了。第三个选项(BALLOON_COMPACTION),当启用时可以帮助减少内存碎片。碎片内存会减慢系统速度。第四个选项(COMPACTION)允许内存压缩。下面列到的第五个选项(MIGRATION)允许页面被移动。
Allow for memory hot-add (MEMORY_HOTPLUG)
Allow for memory hot remove (MEMORY_HOTREMOVE)
Allow for balloon memory compaction/migration (BALLOON_COMPACTION)
Allow for memory compaction ()
Page migration (MIGRATION)
注意启用可移动内存会启用以上5个特性。
下一步,我们可以"Enable KSM for page merging (KSM)"。内核同页合并(Kernel Samepage Merging (KSM))会查看程序认为可以合并的内核。如果两页内存完全相同这可以节约内存。一块内存可以被删除或者被合并,并且只有一块可以使用。
配置工具可能会自动选择使用多少内存保存用户分配(Low address space to protect from user allocation (DEFAULT_MMAP_MIN_ADDR) [65536])。
下一个选项很重要(Enable recovery from hardware memory errors (MEMORY_FAILURE))。如果内存故障并且系统有MCA恢复或者ECC内存系统就可以继续运行并且恢复。要有这个特性硬件自身和内核都必须支持。
机器检测架构(Machine Check Architecture (MCA))是一个一些CPU上可以发送硬件错误信息给操作系统的特性。错误更正码内存(Error-correcting code memory (ECC memory))是一种内存设备检测和纠正错误的形式。
下面,配置工具会自动启用"HWPoison pages injector (HWPOISON_INJECT)"。这个特性允许内核标记一块坏页为"poisoned",接着内核会杀死创建坏页的程序。这有助于停止并纠正错误。
为了允许内核使用大页(Transparent Hugepage Support (TRANSPARENT_HUGEPAGE)),启用这个特性。这可以加速系统但是需要更多内存。嵌入式系统不必使用这个特性。嵌入式系统通常只有非常小的内存。
如果启用了上面的那么巨大页的sysfs支持必须配置。
Transparent Hugepage Support sysfs defaults
1. always (TRANSPARENT_HUGEPAGE_ALWAYS)
> 2. madvise (TRANSPARENT_HUGEPAGE_MADVISE)
choice[1-2?]: 2
下面的选项是增加process_vm_readv和process_vm_writev这两个系统调用(Cross Memory Support (CROSS_MEMORY_ATTACH))。这允许特权进程访问另外一个程序的地址空间。
如果有tmem启用缓存清理(cleancache)通常是一个好主意 (Enable cleancache driver to cache clean pages if Transcendent Memory (tmem) is present (CLEANCACHE))。当一些内存页需要从内存中移除时cleancache会将页面放在cleancache-enabled的文件系统上。当需要该页时页会被重新放回内存中。超绝记忆体(tmem)没有一组一直大小的内存。内核对此内存使用间接寻址。
下一个选项允许在tmen激活后缓存交换页(Enable frontswap to cache swap pages if tmem is present (FRONTSWAP))。frontswap在交换分区放置数据。交换特性的支持需要这个。超绝记忆体
最好启用下一个特性(Check for low memory corruption (X86_CHECK_BIOS_CORRUPTION))。这会检测低位内存的内存损坏情况。这个特性在执行期被禁止。为了启用这个特性,在内核命令行内加入"memory_corruption_check=1"(这会在以后的文章中讨论;这不同于任何命令行)。即使经常执行这个特性,也只使用非常小的开销(接近没有)。
接下来我门可以设置内存损坏检测的默认设置(“Set the default setting of memory_corruption_check (X86_BOOTPARAM_MEMORY_CORRUPTION_CHECK))。这可以选择是否开启或关闭memory_corruption_check。最好启用内存损坏检测不然如果一部分重要内存损坏后可能会导致数据丢失和系统崩溃。
这个选项关注的是BIOS(Amount of low memory, in kilobytes, to reserve for the BIOS (X86_RESERVE_LOW) [64])。配置工具通常知道给BIOS预留内存的最佳大小。
对于Intel P6处理器开发者可以启用存储区域类型寄存器(MTRR (Memory Type Range Register) support (MTRR))。这用于连接着VGA卡的AGP和PCI卡。启用这个特性内核会创建/proc/mtrr。
如果X驱动需要加入回写入口那么启用下面的选项(MTRR cleanup support (MTRR_SANITIZER))。这会将MTRR的布局从连续转换到离散。存储区域类型寄存器(Memory type range registers (MTRRs))提供了一种软件访问CPU缓存的方法。
下面配置工具已经设置了一些MTRR选项
MTRR cleanup enable value (0-1) (MTRR_SANITIZER_ENABLE_DEFAULT) [1]
MTRR cleanup spare reg num (0-7) (MTRR_SANITIZER_SPARE_REG_NR_DEFAULT) [1]
为了设置页级缓冲控制那就启用PAT属性(x86 PAT support (X86_PAT))。页属性表(Page Attribute Table (PATs))是现在版的MTRRs并比它更灵活。如果你经历过因启用它而引发的启动问题那么禁用这个特性后重新编译内核。我选择了"no"。
--------------------------------------------------------------------------------
via: http://www.linux.org/threads/the-linux-kernel-configuring-the-kernel-part-4.4392/
译者:[geekpi](https://github.com/geekpi) 校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创翻译,[Linux中国](http://linux.cn/) 荣誉推出