Merge pull request #26015 from lkxed/20220607-How-Garbage-Collection-works-inside-a-Java-Virtual-Machine

[Submit translation][tech]: 20220607 How Garbage Collection works inside a Java Virtual Machine.md
六开箱 2022-06-12 14:32:57 +08:00 committed by GitHub
commit 85cb5a784c
GPG Key ID: 4AEE18F83AFDEB23
2 changed files with 157 additions and 155 deletions

@ -1,155 +0,0 @@
[#]: subject: "How Garbage Collection works inside a Java Virtual Machine"
[#]: via: "https://opensource.com/article/22/6/garbage-collection-java-virtual-machine"
[#]: author: "Jayashree Huttanagoudar https://opensource.com/users/jayashree-huttanagoudar"
[#]: collector: "lkxed"
[#]: translator: "lkxed"
[#]: reviewer: " "
[#]: publisher: " "
[#]: url: " "
How Garbage Collection works inside a Java Virtual Machine
======
Understanding how Java handles memory isn't always necessary, but it can help you envision how the JVM deals with your variables and class instances.
![Coffee beans][1]
Image by: Pixabay. CC0.
Automatic Garbage Collection (GC) is one of the most important features that makes Java so popular. This article explains why GC is essential. It covers automatic and generational GC, how the Java Virtual Machine (JVM) divides heap memory, and finally, how GC works inside the JVM.
### Java memory allocation
Java memory is divided into four sections:
1. Heap: Memory for object instances is allocated in the heap. Declaring a variable of an object type does not allocate any heap memory; it only creates a reference in the stack. The object itself is allocated when it is created with `new`.
2. Stack: This section holds a frame for each method call, including its local variables and its references to objects. Instance variables are stored with their object in the heap.
3. Code: Bytecode resides in this section.
4. Static: Static data and methods are placed in this section.
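The difference between declaring a reference and allocating an object can be sketched in a few lines. This is a conceptual illustration only: the class and variable names are made up for the example, and a real JVM is free to optimize where short-lived objects actually end up.

```java
// Sketch: a declaration creates only a reference; `new` allocates the object.
class AllocationSketch {

    /** Returns the length of a buffer that is reached through two references. */
    static int demo() {
        StringBuilder buffer;              // declaration only: no object exists yet
        buffer = new StringBuilder();      // `new` allocates the object (conceptually on the heap)
        buffer.append("heap");             // the local reference is used to reach the object

        StringBuilder alias = buffer;      // a second reference to the same object, no new allocation
        alias.append("!");

        return buffer.length();            // both references see the same five characters
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because `alias` and `buffer` point at the same object, appending through one is visible through the other.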
### What is automatic Garbage Collection (GC)?
Automatic GC is a process in which the referenced and unreferenced objects in heap memory are identified, and then unreferenced objects are considered for deletion. The term *referenced objects* means some part of your program is using those objects. *Unreferenced objects* are not currently being used by the program.
Programming languages like C and C++ require manual allocation and deallocation of memory. In Java, GC handles this automatically, although you can request a collection with the `System.gc();` call in your code. The call is only a hint: the JVM decides whether and when the collector actually runs.
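As a rough illustration of referenced versus unreferenced objects, the sketch below uses a `WeakReference` as a probe and calls `System.gc()`. The class and method names are hypothetical. Only the first result is guaranteed: an object that is still strongly referenced is never collected. Whether the second object is gone after the request depends on the JVM and collector.

```java
import java.lang.ref.WeakReference;

// Sketch: observing reachability around a System.gc() request.
class GcRequestSketch {

    // A static field acts as a GC root, so the object it points to stays referenced.
    static Object held;

    /** A strongly referenced object must survive any collection. */
    static boolean referencedObjectSurvives() {
        held = new Object();
        WeakReference<Object> probe = new WeakReference<>(held);
        System.gc();                       // a request; the JVM may ignore it
        return probe.get() != null;        // still reachable through `held`
    }

    /** An unreferenced object is only eligible for collection, not guaranteed to be collected. */
    static boolean unreferencedObjectCollected() {
        WeakReference<Object> probe = new WeakReference<>(new Object());
        System.gc();
        return probe.get() == null;        // often true, but not guaranteed
    }

    public static void main(String[] args) {
        System.out.println("referenced survives:    " + referencedObjectSurvives());
        System.out.println("unreferenced collected: " + unreferencedObjectCollected());
    }
}
```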
The fundamental steps of GC are:
#### 1. Mark used and unused objects
In this step, the used and unused objects are marked separately. This is a time-consuming process, as all objects in memory must be scanned to determine whether they're in use or not.
![Marking used and unused objects][2]
#### 2. Sweep/Delete objects
There are two variations of sweep and delete.
**Simple deletion**: Only unreferenced objects are removed. However, the memory allocation for new objects becomes difficult as the free space is scattered across available memory.
![Normal deleting process][3]
**Deletion with compaction**: Apart from deleting unreferenced objects, referenced objects are compacted. Memory allocation for new objects is relatively easy, and memory allocation performance is improved.
![Deletion with compacting][4]
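The mark step and the two deletion variants can be sketched as a toy simulation. This is not how the JVM implements its collectors: the "heap" here is just an array of slots, and the `Obj` class, the root list, and the method names are assumptions made for the illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;

// Toy model of mark, simple deletion, and deletion with compaction.
class MarkSweepSketch {

    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();   // outgoing references
        boolean marked;
        Obj(String name) { this.name = name; }
    }

    /** Step 1: mark every object reachable from the roots. */
    static void mark(List<Obj> roots) {
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj current = pending.pop();
            if (current.marked) continue;           // already visited
            current.marked = true;
            pending.addAll(current.refs);
        }
    }

    /** Step 2, simple deletion: free unmarked slots, which leaves holes. */
    static void sweep(Obj[] heap) {
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && !heap[i].marked) heap[i] = null;
        }
    }

    /** Step 2, with compaction: slide the survivors to the front of the heap. */
    static void compact(Obj[] heap) {
        int next = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null) {
                Obj live = heap[i];
                heap[i] = null;
                heap[next++] = live;
            }
        }
    }

    /** Renders the heap as one character per slot, with "_" for a free slot. */
    static String layout(Obj[] heap) {
        StringBuilder sb = new StringBuilder();
        for (Obj o : heap) sb.append(o == null ? "_" : o.name);
        return sb.toString();
    }

    static String[] demo() {
        Obj a = new Obj("A"), b = new Obj("B"), c = new Obj("C"), d = new Obj("D");
        a.refs.add(c);                              // A -> C, and A is a root
        b.refs.add(d);                              // B -> D, but nothing references B
        Obj[] heap = {a, b, c, d};

        mark(Collections.singletonList(a));
        sweep(heap);
        String afterSweep = layout(heap);           // free space is scattered
        compact(heap);
        String afterCompact = layout(heap);         // free space is contiguous
        return new String[] {afterSweep, afterCompact};
    }

    public static void main(String[] args) {
        String[] result = demo();
        System.out.println(result[0]);              // A_C_
        System.out.println(result[1]);              // AC__
    }
}
```

After simple deletion the two free slots are separated by a live object; after compaction they form one contiguous block, which is what makes allocating new objects easier.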
### What is generational Garbage Collection (GC), and why is it needed?
As the mark and sweep model shows, scanning every object to reclaim memory from unused ones becomes expensive as the number of objects grows. Empirical studies show that most objects created during program execution are short-lived.
The existence of short-lived objects can be used to improve the performance of GC. For that, the JVM divides the memory into different generations. Next, it categorizes the objects based on these memory generations and performs the GC accordingly. This approach is known as *generational GC*.
### Heap memory generations and the generational Garbage Collection (GC) process
To improve the performance of the GC mark and sweep steps, the JVM divides the heap memory into three generations:
* Young Generation
* Old Generation
* Permanent Generation
![Hotspot heap structure][5]
Here is a description of each generation and its key features.
#### Young Generation
All newly created objects start here. The young generation is further divided into:
1. Eden: Memory for all newly created objects is allocated here.
2. Survivor space (S0 and S1): After surviving one GC, the live objects are moved to one of these survivor spaces.
![Object allocation][6]
The generational GC that happens in the Young Generation is known as *Minor GC*. All Minor GC cycles are "Stop the World" events that pause the application's threads until the GC cycle completes. Minor GC cycles are usually fast because the Young Generation is small compared to the whole heap and most of its objects are already dead.
To summarize: Eden space has all newly created objects. Once Eden is full, the first Minor GC cycle is triggered.
![Filling Eden space][7]
Minor GC: The live and dead objects are marked during this cycle. The live objects are moved to survivor space S0. Once all live objects are moved to S0, the unreferenced objects are deleted.
![Copying referenced objects][8]
The age of objects in S0 is 1 because they have survived one Minor GC. Now Eden and S1 are empty.
Once cleared, the Eden space is again filled with new live objects. As time elapses, some objects in Eden and S0 become dead (unreferenced), and Eden's space is full again, triggering the Minor GC.
![Object aging][9]
This time the dead and live objects in Eden and S0 are marked. The live objects from Eden are moved to S1, and their age becomes 1. The live objects from S0 are also moved to S1, and their age becomes 2 (because they've now survived two Minor GCs). At this point, S0 and Eden are empty. After every Minor GC, Eden and one of the survivor spaces are empty.
The same cycle of creating new objects in Eden continues. When the next Minor GC occurs, Eden and S1 are cleared by moving the aged objects to S0. The survivor spaces switch after every Minor GC.
![Additional aging][10]
This process continues until the age of a surviving object reaches a certain threshold, at which point it is moved to the Old Generation in a process called *promotion*.
Further, the `-Xmn` flag sets the Young Generation size.
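The Minor GC cycle described above (Eden, two alternating survivor spaces, aging, and promotion) can be sketched as a small simulation. It is a simplified model, not the JVM's implementation: liveness is a plain boolean flag instead of real reachability, and the threshold of 3 is an arbitrary value chosen to keep the demo short.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of Minor GC with object aging and promotion.
class MinorGcSketch {

    static final int TENURING_THRESHOLD = 3;        // hypothetical value for the demo

    static final class Obj {
        final String name;
        int age;
        boolean live = true;                         // stands in for "still referenced"
        Obj(String name) { this.name = name; }
    }

    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>();              // survivor space that currently holds objects
    List<Obj> to = new ArrayList<>();                // survivor space that is currently empty
    final List<Obj> old = new ArrayList<>();

    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o);                                 // new objects always start in Eden
        return o;
    }

    /** Copies the live objects out of one space, aging or promoting them, then empties it. */
    private void evacuate(List<Obj> space) {
        for (Obj o : space) {
            if (!o.live) continue;                   // dead objects are simply not copied
            o.age++;
            if (o.age >= TENURING_THRESHOLD) old.add(o); else to.add(o);
        }
        space.clear();
    }

    void minorGc() {
        evacuate(eden);
        evacuate(from);
        List<Obj> tmp = from;                        // the survivor spaces switch roles
        from = to;
        to = tmp;
    }

    static String demo() {
        MinorGcSketch heap = new MinorGcSketch();
        heap.allocate("a");
        Obj b = heap.allocate("b");
        b.live = false;                              // b becomes unreferenced
        heap.minorGc();                              // a survives with age 1, b is dropped
        heap.allocate("c");
        heap.minorGc();                              // a has age 2, c has age 1
        heap.minorGc();                              // a reaches the threshold and is promoted

        StringBuilder sb = new StringBuilder("old=[");
        for (Obj o : heap.old) sb.append(o.name);
        sb.append("] survivor=[");
        for (Obj o : heap.from) sb.append(o.name).append(':').append(o.age);
        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());                  // old=[a] survivor=[c:2]
    }
}
```

After each call to `minorGc()`, Eden and one survivor space are empty, which matches the behavior described in the text.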
### Old Generation (Tenured Generation)
This generation contains the objects that have survived several Minor GCs and reached the age threshold.
![Promotion][11]
In the example diagram above, the threshold is 8. The GC in the Old Generation is known as a *Major GC*. Use the flags `-Xms` and `-Xmx` to set the initial and maximum size of the heap memory.
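To see the effect of these flags, a program can ask the running JVM about its heap limits and collectors through the standard `Runtime` and `java.lang.management` APIs. The sketch below is one way to do that; the class name is made up, and the collector names it prints depend on which garbage collector the JVM selected. Running it with different sizes, for example `java -Xms256m -Xmx1g GcStatsSketch.java` (the sizes are arbitrary examples), should change the reported values.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Sketch: reporting heap limits and per-collector statistics of the running JVM.
class GcStatsSketch {

    static String report() {
        Runtime runtime = Runtime.getRuntime();
        StringBuilder sb = new StringBuilder();

        // maxMemory() reflects the upper heap limit (-Xmx); totalMemory() is the heap reserved so far.
        sb.append("max heap (bytes): ").append(runtime.maxMemory()).append('\n');
        sb.append("current heap (bytes): ").append(runtime.totalMemory()).append('\n');

        // Typically one bean covers young-generation collections and another old-generation ones.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(gc.getName())
              .append(": collections=").append(gc.getCollectionCount())
              .append(", time(ms)=").append(gc.getCollectionTime())
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(report());
    }
}
```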
### Permanent Generation
The Permanent Generation space stores metadata related to library classes and methods of an application, J2SE, and what's in use by the JVM itself. The JVM populates this data at runtime based on which classes and methods are in use. Once the JVM finds the unused classes, they are unloaded or collected, making space for used classes.
On JVMs that still have a Permanent Generation (Java 7 and earlier), use the flags `-XX:PermSize` and `-XX:MaxPermSize` to set its initial and maximum size.
#### Metaspace
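Metaspace was introduced in Java 8 and replaced PermGen. Its advantage is that it resizes automatically by default, which makes the OutOfMemory errors caused by a full PermGen much less likely. They can still occur if Metaspace is capped (for example with `-XX:MaxMetaspaceSize`) or native memory runs out.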
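One way to check which memory areas a given JVM actually has is to list its memory pools with the standard `MemoryPoolMXBean` API. The sketch below does that; the class name is an assumption, and the pool names vary by JVM version and collector. On a HotSpot JVM from Java 8 onward, the list is expected to include a non-heap pool named `Metaspace` rather than a Permanent Generation pool.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Sketch: listing the memory pools of the running JVM.
class MemoryPoolsSketch {

    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();             // may be null if the pool is no longer valid
            long used = usage == null ? -1 : usage.getUsed();
            long max = usage == null ? -1 : usage.getMax();  // -1 means no maximum is defined
            lines.add(pool.getName() + " [" + pool.getType() + "] used=" + used + " max=" + max);
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describePools()) {
            System.out.println(line);
        }
    }
}
```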
### Wrap up
This article discusses the various memory generations of JVM and how they are helpful for automatic generational Garbage Collection (GC). Understanding how Java handles memory isn't always necessary, but it can help you envision how the JVM deals with your variables and class instances. This understanding allows you to plan and troubleshoot your code and comprehend potential limitations inherent in a specific platform.
Image by: (Jayashree Huttanagoudar, CC BY-SA 4.0)
--------------------------------------------------------------------------------
via: https://opensource.com/article/22/6/garbage-collection-java-virtual-machine
Author: [Jayashree Huttanagoudar][a]
Topic selection: [lkxed][b]
Translator: [译者ID](https://github.com/译者ID)
Proofreader: [校对者ID](https://github.com/校对者ID)
This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)
[a]: https://opensource.com/users/jayashree-huttanagoudar
[b]: https://github.com/lkxed
[1]: https://opensource.com/sites/default/files/lead-images/java-coffee-beans.jpg
[2]: https://opensource.com/sites/default/files/2022-06/1Marking.png
[3]: https://opensource.com/sites/default/files/2022-06/2NormalDeletion.png
[4]: https://opensource.com/sites/default/files/2022-06/3DeletionwithCompacting.png
[5]: https://opensource.com/sites/default/files/2022-06/4Hotspot.png
[6]: https://opensource.com/sites/default/files/2022-06/5ObjAllocation.png
[7]: https://opensource.com/sites/default/files/2022-06/6FillingEden.png
[8]: https://opensource.com/sites/default/files/2022-06/7CopyingRefdObjs.png
[9]: https://opensource.com/sites/default/files/2022-06/8ObjAging.png
[10]: https://opensource.com/sites/default/files/2022-06/9AddlAging.png
[11]: https://opensource.com/sites/default/files/2022-06/10Promotion.png

@ -0,0 +1,157 @@
[#]: subject: "How Garbage Collection works inside a Java Virtual Machine"
[#]: via: "https://opensource.com/article/22/6/garbage-collection-java-virtual-machine"
[#]: author: "Jayashree Huttanagoudar https://opensource.com/users/jayashree-huttanagoudar"
[#]: collector: "lkxed"
[#]: translator: "lkxed"
[#]: reviewer: " "
[#]: publisher: " "
[#]: url: " "
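How JVM garbage collection works
======
Programmers don't strictly need to master Java's memory management, but understanding it helps you see how the JVM handles the variables and class instances in your program.
![Coffee beans][1]
Image source: Pixabay. CC0.
Automatic garbage collection (GC) deserves much of the credit for Java's popularity, and it is one of Java's most important features. In this article, I explain why garbage collection matters. The article covers automatic and generational garbage collection, how the JVM divides memory, and how garbage collection works inside the JVM.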
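### Java memory allocation
A Java program's memory is divided into the following four areas:
1. Heap: Object instances are allocated in this area. Declaring a variable of an object type does not allocate any heap memory, though; it only creates a reference in the stack.
2. Stack: This area holds a frame for each method call, including its local variables and its references to objects. Instance variables are stored with their object in the heap.
3. Code: This area holds the program's bytecode.
4. Static: This area holds the program's static data and static methods.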
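### What is automatic garbage collection?
Automatic garbage collection is a process in which all objects in the heap are first classified as "referenced" or "unreferenced", and the unreferenced objects are then marked for later deletion. A "referenced object" is one that some part of the program is still using. An "unreferenced object" is one that the program is no longer using.
Many programming languages, such as C and C++, require programmers to allocate and free memory manually. In Java, garbage collection does this automatically (although you can request a collection by calling `System.gc();` in your code, which the JVM treats only as a hint).
The basic steps of garbage collection are:
#### 1. Mark used and unused objects
In this step, used and unused objects are marked separately. This is a very time-consuming process, because every object in memory must be scanned to determine whether it is still in use.
![Marking used and unused objects][2]
#### 2. Sweep/delete objects
There are two sweep-and-delete algorithms:
**Simple deletion (mark-sweep)**: The process is simple: only the unreferenced objects are deleted. However, allocating memory for new objects afterward becomes difficult, because the free space is split into scattered fragments.
![The mark-sweep process][3]
**Deletion with compaction (mark-compact)**: In addition to deleting unreferenced objects, the referenced objects (the ones that were not deleted) are compacted. This makes memory allocation for new objects relatively easy and improves allocation performance.
![The mark-compact process][4]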
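### What is generational garbage collection, and why is it needed?
As the sweep-and-delete model shows, scanning every object to reclaim memory from unused ones becomes expensive as the number of objects grows. However, empirical studies show that most objects created during program execution are short-lived.
Because most objects are short-lived, this fact can be used to make garbage collection more efficient. To do so, the JVM first divides memory into different "generations". It then places each object in one of these generations and collects each generation separately. This approach is known as "generational garbage collection".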
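### Heap memory generations and the generational garbage collection process
To make the mark and sweep steps of garbage collection more efficient, the JVM divides heap memory into the following three generations:
* Young Generation
* Old Generation
* Permanent Generation
![Hotspot heap structure][5]
Below, I describe each generation and its key features.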
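#### Young Generation
All recently created objects live here. The Young Generation is further divided into two areas:
1. Eden: Memory for all newly created objects is allocated here.
2. Survivor space (S0 and S1): Objects that are still alive after one garbage collection are moved to one of the two survivor spaces.
![Object allocation][6]
The generational garbage collection that happens in the Young Generation is known as "Minor GC". Every Minor GC cycle is a "Stop the World" (STW) event that pauses the application's threads until the collection finishes. Minor GC cycles are usually fast because the Young Generation is small compared to the whole heap and most of its objects are already dead.
In one sentence: Eden holds all newly created objects, and when its free space runs out, the first Minor GC is triggered.
![Filling Eden space][7]
Minor GC: During this collection, all live and dead objects are marked. The live objects are moved to survivor space S0. Once all live objects have been moved to S0, the unreferenced objects are deleted.
![Copying referenced objects][8]
The objects in S0 have an age of 1 because they have survived one Minor GC. At this point, Eden and S1 are both empty.
Once it has been cleared, Eden accepts new live objects again. As time passes, some objects in Eden and S0 become dead (no longer referenced), and Eden fills up again, which triggers another Minor GC.
![Object aging][9]
This time, the dead and live objects in Eden and S0 are marked. The live objects from Eden are moved to S1, and their age becomes 1. The live objects from S0 are also moved to S1, and their age becomes 2 (because they have survived two Minor GCs). At this point, Eden and S0 are empty again. After every Minor GC, Eden and one of the two survivor spaces are empty.
New objects continue to be created in Eden, and the cycle repeats. When the next Minor GC occurs, Eden and S1 are both cleared, and their live objects are moved to S0. The two survivor spaces (S0 and S1) switch roles after every Minor GC.
![Additional aging][10]
This process continues until the age of a surviving object reaches a certain threshold, at which point it is moved to the Old Generation through a process called "promotion".
Use the `-Xmn` option to set the size of the Young Generation.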
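### Old Generation
This area holds the objects that have survived many Minor GCs and reached the age threshold.
![Promotion][11]
In the example diagram above, the age threshold for promotion is 8. The garbage collection that happens in the Old Generation is known as "Major GC".
Use the `-Xms` and `-Xmx` options to set the initial and maximum size of the heap, respectively. (LCTT translator's note: combined with the `-Xmn` option above, this indirectly sets the size of the Old Generation.)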
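### Permanent Generation
The Permanent Generation stores metadata about the library classes and methods used by the application, the Java standard environment, and the JVM itself. The JVM fills in this data at runtime according to which classes and methods are in use. When the JVM finds unused classes, it unloads or collects them to make room for the classes that are in use.
On JVMs that still have a Permanent Generation (Java 7 and earlier), use the `-XX:PermSize` and `-XX:MaxPermSize` options to set its initial and maximum size, respectively.
#### Metaspace
Java 8 introduced Metaspace, which replaced the Permanent Generation. Its advantage is that it resizes automatically by default, which makes OutOfMemory (OOM) errors caused by a full Permanent Generation much less likely.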
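### Wrap up
This article discussed the JVM's memory generations and the role they play in generational garbage collection. Programmers don't strictly need to master Java's memory management, but understanding it helps you see how the JVM handles the variables and class instances in your program. That understanding lets you plan and troubleshoot your code and recognize the potential limitations inherent in a specific platform.
Images in the article by: Jayashree Huttanagoudar, CC BY-SA 4.0
--------------------------------------------------------------------------------
via: https://opensource.com/article/22/6/garbage-collection-java-virtual-machine
Author: [Jayashree Huttanagoudar][a]
Topic selection: [lkxed][b]
Translator: [lkxed](https://github.com/lkxed)
Proofreader: [校对者ID](https://github.com/校对者ID)
This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)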
[a]: https://opensource.com/users/jayashree-huttanagoudar
[b]: https://github.com/lkxed
[1]: https://opensource.com/sites/default/files/lead-images/java-coffee-beans.jpg
[2]: https://opensource.com/sites/default/files/2022-06/1Marking.png
[3]: https://opensource.com/sites/default/files/2022-06/2NormalDeletion.png
[4]: https://opensource.com/sites/default/files/2022-06/3DeletionwithCompacting.png
[5]: https://opensource.com/sites/default/files/2022-06/4Hotspot.png
[6]: https://opensource.com/sites/default/files/2022-06/5ObjAllocation.png
[7]: https://opensource.com/sites/default/files/2022-06/6FillingEden.png
[8]: https://opensource.com/sites/default/files/2022-06/7CopyingRefdObjs.png
[9]: https://opensource.com/sites/default/files/2022-06/8ObjAging.png
[10]: https://opensource.com/sites/default/files/2022-06/9AddlAging.png
[11]: https://opensource.com/sites/default/files/2022-06/10Promotion.png