Introduction to Java Garbage Collector
In this post, we will explore what the Java garbage collector is and how it works. We will also cover the different garbage collectors available in Java and the enhancements introduced in Java 8. This post covers the basics of the Java garbage collector and lays the groundwork for understanding it; it does not go into the internals.
1. What is the Java Garbage Collector?
In simple terms, Java garbage collection is automatic memory management in Java. The garbage collector automatically determines which memory is no longer used by our application and recycles it so that it can be reused for new objects. Since this is an automatic process, developers are no longer required to write memory management code (though there are a few rules which should be followed). Java garbage collection works on the fundamental assumption that most objects used in a Java program are short-lived and can be reclaimed shortly after they are created. Let’s look at the following program for a better understanding.
// Person is a hypothetical class that exposes getName()
for (Person person : personList) {
    String s = person.getName();
}
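In the above code, we work with a new String reference on every iteration of the loop, and it is needed only until that iteration ends. Objects like this, once nothing refers to them any longer, are what the garbage collector reclaims.
To provide better memory management, the JVM comes with a garbage collector which performs automatic memory management (i.e. the JVM will invoke it whenever required to clean up memory).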
2. How Java Garbage Collection Really Works
One of the main misunderstandings about the Java garbage collector is that it removes dead objects (objects that are no longer referenced); in fact it works the opposite way 🙂. The garbage collector keeps track of all the live objects, and everything else is treated as garbage. In theory, the Java garbage collector works as follows:
- When an object is no longer in use, the garbage collector will reclaim the memory used by this object and reuse it for future object creation.
- There is no explicit object deletion, and the heap memory remains with the JVM.
To determine which objects are alive and which are garbage, the JVM uses special objects known as GC roots (garbage collection roots). The JVM treats an object as alive if it can be reached from a GC root through a chain of references. I will skip the discussion of how the JVM designates an object as a GC root; this topic needs a separate blog post.
2.1 Mark and Sweep Algorithm
The JVM uses the mark and sweep algorithm to determine which objects are in use and which are no longer in use. This algorithm works in 2 steps:
- In the first step, it processes all references and marks all the objects which are alive.
- In the second step, it reclaims the heap memory of all the objects which are not marked as alive.
While it seems simple, keep in mind that the garbage collector marks objects as alive based on references. If you created an object that is no longer in use but is still referenced by some instance, it will be treated as a live object (even though it is not in use).
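The two steps can be sketched with a toy model. This is only an illustration of the idea, not how the JVM implements it: ToyObject, ToyHeap and MarkSweepDemo are hypothetical names invented for this example, and the "heap" is just a list.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: these classes are hypothetical and are not part of the JVM.
class ToyObject {
    final String name;
    final List<ToyObject> references = new ArrayList<>();

    ToyObject(String name) {
        this.name = name;
    }
}

class ToyHeap {
    final List<ToyObject> objects = new ArrayList<>(); // everything allocated so far
    final List<ToyObject> roots = new ArrayList<>();   // stand-ins for GC roots

    ToyObject allocate(String name) {
        ToyObject o = new ToyObject(name);
        objects.add(o);
        return o;
    }

    // Step 1 (mark): follow references starting from the roots.
    Set<ToyObject> mark() {
        Set<ToyObject> marked = new HashSet<>();
        Deque<ToyObject> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            ToyObject current = pending.pop();
            if (marked.add(current)) {          // false if already visited (handles cycles)
                pending.addAll(current.references);
            }
        }
        return marked;
    }

    // Step 2 (sweep): drop every object that was not marked, return the reclaimed names.
    List<String> sweep(Set<ToyObject> marked) {
        List<String> reclaimed = new ArrayList<>();
        for (ToyObject o : objects) {
            if (!marked.contains(o)) {
                reclaimed.add(o.name);
            }
        }
        objects.removeIf(o -> !marked.contains(o));
        return reclaimed;
    }

    List<String> collect() {
        return sweep(mark());
    }
}

class MarkSweepDemo {
    static List<String> run() {
        ToyHeap heap = new ToyHeap();
        ToyObject cache = heap.allocate("cache");
        ToyObject stale = heap.allocate("staleEntry"); // unused, but still referenced
        heap.allocate("temp");                         // referenced by nothing
        heap.roots.add(cache);
        cache.references.add(stale);
        return heap.collect();
    }

    public static void main(String[] args) {
        System.out.println("Reclaimed: " + run()); // prints "Reclaimed: [temp]"
    }
}
```

Note that staleEntry survives the collection even though nothing uses it, because the cache still refers to it. This is the same situation as the unused-but-referenced object described above.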
3. Java Garbage Collector Types
Another major misunderstanding about the Java garbage collector is that the JVM has only one garbage collector; in truth there are around 5 garbage collectors (as of JDK 7). We will cover these garbage collectors in the next sections. All these GC algorithms work on the fundamental assumption that “objects in the heap are short-lived and should be recycled as quickly as possible.”
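If you are not sure which collector your JVM is actually running, the standard java.lang.management API can report it. The snippet below is a small sketch; the class name ActiveCollectors is made up for this example, and the names printed are implementation-specific (on HotSpot, Parallel GC typically shows up as "PS Scavenge" and "PS MarkSweep", and G1 as "G1 Young Generation" and "G1 Old Generation").

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// ActiveCollectors is a hypothetical helper class for this post.
class ActiveCollectors {
    // Returns the names the running JVM reports for its collectors.
    static List<String> names() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Collection count and accumulated time since JVM start.
            System.out.println(gc.getName()
                    + " collections=" + gc.getCollectionCount()
                    + " timeMs=" + gc.getCollectionTime());
        }
    }
}
```

Running this once with each of the JVM flags from the following sections is a quick way to confirm that a flag took effect.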
3.1 Serial GC
It’s the simplest collector and the least suitable for most applications. It was mainly designed for single-threaded environments. Do not use Serial GC for multi-threaded server applications. One of the main issues with Serial GC is that it does all of its work on a single thread and freezes all application threads whenever it is active (which is why it is called Serial GC); this can cause serious application performance issues. To enable Serial GC, pass the following parameter to the JVM:
-XX:+UseSerialGC
3.2 Parallel GC
This is the default GC in Java 7 and Java 8. Parallel GC uses multiple threads to scan the heap during the GC process. Using multiple threads makes this GC much faster; however, it still stops all application threads whenever it performs a GC operation (full or partial). Parallel GC is also known as the throughput collector.
Since this is the default GC for the JVM, you only need to specify the following JVM parameter if you want to switch back to Parallel GC from another collector:
-XX:+UseParallelGC
3.3 CMS GC
Concurrent Mark Sweep, also known as CMS GC, uses multiple threads to scan through the heap for the GC process. It works as follows:
- It uses multiple threads to scan through the heap and recycles unused objects.
Multiple threads denote concurrency, scanning the heap denotes marking (where it marks the live objects in the heap), and recycling unused objects denotes sweeping, hence Concurrent Mark Sweep. One of the main advantages of this algorithm is a very low pause time for application threads, because most of its work runs concurrently with the application threads (without stopping them).
This GC is best suited to applications where response time is critical. CMS GC has the following disadvantages:
- Since it works concurrently with the application, it usually requires more memory and CPU.
- If the running application changes the heap state while the GC is running, the collector is forced to redo its final steps to make sure it has up-to-date reference information.
- One of the main disadvantages of CMS GC is promotion failure (long pauses), which happens when the collector loses the race with the application: objects need to be promoted to the old generation before the collector has freed enough space for them.
This is not the default GC in the JVM. Use the following parameter to enable it:
-XX:+UseConcMarkSweepGC
3.4 G1 GC
JDK 7 introduced a new GC known as the Garbage-First (G1) collector. One of the fundamental differences between the G1 collector and the other collectors is that it divides the heap into multiple regions. The JVM targets around 2000 regions, varying in size from 1 MB to 32 MB.
The G1 collector normally uses multiple background threads to scan these regions and picks the regions with the most garbage first (that’s why it is called Garbage First). To enable this GC, pass the following parameter to the JVM:
-XX:+UseG1GC
G1 has a few advantages over the other collectors:
- It is fast compared to the other collectors because it targets the regions with the most garbage first.
- G1 GC compacts the heap on the go, which CMS, for example, does not.
- Since G1 splits the heap into multiple regions, one long “stop the world” pause (pausing all running application threads) over the whole heap is largely avoided; instead of scanning the entire heap, it works region by region.
- G1 GC is a real performance boost in today’s environments, where big heap sizes and multiple JVMs per machine are common.
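The "garbage first" idea of picking the regions with the most garbage can be illustrated with a short sketch. This is a deliberately simplified toy: Region and GarbageFirstSketch are hypothetical names, the byte counts are made up, and the real G1 collector also takes pause-time goals and other heuristics into account when it chooses regions.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: Region is a hypothetical stand-in for a G1 heap region.
class Region {
    final int id;
    final long capacityBytes;
    final long liveBytes;

    Region(int id, long capacityBytes, long liveBytes) {
        this.id = id;
        this.capacityBytes = capacityBytes;
        this.liveBytes = liveBytes;
    }

    long garbageBytes() {
        return capacityBytes - liveBytes;
    }
}

class GarbageFirstSketch {
    // Picks up to maxRegions region ids, the ones holding the most garbage first.
    static List<Integer> chooseRegions(List<Region> regions, int maxRegions) {
        List<Region> sorted = new ArrayList<>(regions);
        sorted.sort((a, b) -> Long.compare(b.garbageBytes(), a.garbageBytes()));
        List<Integer> chosen = new ArrayList<>();
        for (Region r : sorted) {
            if (chosen.size() == maxRegions || r.garbageBytes() == 0) {
                break; // budget used up, or nothing left worth collecting
            }
            chosen.add(r.id);
        }
        return chosen;
    }

    static List<Integer> demo() {
        List<Region> regions = new ArrayList<>();
        regions.add(new Region(0, 100, 90));  // 10 bytes of garbage
        regions.add(new Region(1, 100, 10));  // 90 bytes of garbage
        regions.add(new Region(2, 100, 50));  // 50 bytes of garbage
        regions.add(new Region(3, 100, 100)); // fully live
        return chooseRegions(regions, 2);
    }

    public static void main(String[] args) {
        System.out.println("Regions to collect: " + demo()); // prints "Regions to collect: [1, 2]"
    }
}
```

Collecting regions 1 and 2 frees the most memory for the least work, while the fully live region 3 is left alone.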
3.5 G1 GC and Java 8
Java 8 (since update 20) added a new feature called String Deduplication. Strings take up a lot of heap space, and this feature ensures that if a String is duplicated across the heap, the duplicates are automatically pointed to the same internal char[], thus avoiding multiple copies of the same content. Use the following JVM argument (together with G1) to enable this feature:
-XX:+UseStringDeduplication
It tries to reduce the Java heap live-data set by enhancing the G1 garbage collector so that duplicate instances of String are automatically and continuously de-duplicated. When G1 GC runs, it performs the following operations:
- It scans the objects in the heap and checks whether each one is a candidate for string de-duplication.
- It adds all such candidates to a queue, and a de-duplication thread processes this queue to make sure all duplicate instances point to the same internal char[].
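The queue-based process above can be sketched with a toy model. ToyString and DedupSketch are hypothetical classes written for this post: ToyString stands in for java.lang.String with its internal char[], and a plain HashMap stands in for the lookup table of already-seen contents. The real implementation inside the JVM is different.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Toy model only: ToyString mimics a String that owns a char[].
class ToyString {
    char[] value;

    ToyString(String text) {
        this.value = text.toCharArray();
    }
}

class DedupSketch {
    private final Queue<ToyString> candidates = new ArrayDeque<>();
    private final Map<String, char[]> canonical = new HashMap<>();

    // Step 1: the collector spots a candidate and puts it on the queue.
    void enqueue(ToyString s) {
        candidates.add(s);
    }

    // Step 2: the de-duplication thread drains the queue.
    // Returns how many strings were re-pointed to an existing array.
    int processQueue() {
        int replaced = 0;
        ToyString s;
        while ((s = candidates.poll()) != null) {
            // putIfAbsent returns the array already stored for this content, or null.
            char[] existing = canonical.putIfAbsent(new String(s.value), s.value);
            if (existing != null && existing != s.value) {
                s.value = existing; // share the first array seen with this content
                replaced++;
            }
        }
        return replaced;
    }

    static boolean demo() {
        ToyString a = new ToyString("hello");
        ToyString b = new ToyString("hello"); // same content, separate array
        ToyString c = new ToyString("world");
        DedupSketch dedup = new DedupSketch();
        dedup.enqueue(a);
        dedup.enqueue(b);
        dedup.enqueue(c);
        int replaced = dedup.processQueue();
        return replaced == 1 && a.value == b.value && c.value != a.value;
    }

    public static void main(String[] args) {
        System.out.println("Duplicates share one array: " + demo());
    }
}
```

After the queue is processed, the array that b originally owned is no longer referenced, so it becomes garbage like any other unreachable object.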
Another significant change in Java 8 memory management is the removal of PermGen. Class metadata now lives in Metaspace, which is allocated from native memory and sized by the JVM automatically by default; it’s a step towards handling those PermGen OutOfMemoryError failures.
Please read “Will Java 8 Solve PermGen OutOfMemoryError?” to get more details and the reasons for removing it in Java 8.
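One way to see this change on a running JVM is to list its memory pools through the standard java.lang.management API. This is a small sketch with a made-up class name (MemoryPools); pool names are implementation-specific, but on a HotSpot Java 8 JVM you would typically expect to find a pool called "Metaspace" and no PermGen pool.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.ArrayList;
import java.util.List;

// MemoryPools is a hypothetical helper class for this post.
class MemoryPools {
    // Returns one entry per memory pool, e.g. its name and whether it is heap or non-heap.
    static List<String> names() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            names.add(pool.getName() + " (" + pool.getType() + ")");
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : names()) {
            System.out.println(name);
        }
    }
}
```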
Summary
I hope this post gave you a high-level overview of the different garbage collectors available in Java, how they work, and how they can be configured in the JVM. We also got a high-level overview of the enhancements and changes introduced in Java 8 for the G1 collector.