In this post, I'll discuss why garbage collection is Java's trump card when talking about performance.
I recently conducted a technical interview for a Java developer. The candidate made a pretty bold statement: he believed that the biggest design error in Java was that garbage collection was automated. At first, I was stunned by this trash talk aimed at Java. What's more, those words were uttered in a Java-based shop, for a Java-based job, in front of Java-biased techies!
There are some things in life you just take for granted and stop questioning after a while. Things like:
- Gravity guarantees I will not be swept into outer space as I type this blog post.
- A continuous stream of electricity guarantees I can type this sentence.
- Garbage collection destroys my objects once I'm finished with them.
When someone comes along and asserts that a fundamental commodity (gravity, electricity or garbage collection) is a design error, you have to take a step back before you can even begin to formulate a coherent counter-argument.
There is still a contingent of people out there who will bring up C++ when discussing performance in Java. I believe that doing so leads to missing the forest for the trees. For the record, it's true that if all we're talking about is brute speed, in all likelihood, a sorting algorithm written in C++ will perform better than the same written in Java. However, automated garbage collection is the trump card that tips the balance in Java's favour.
We can almost take it for granted that automated garbage collection is a problem that was largely solved by the turn of the 21st century. This is one of those instances where, after decades of research, a mainstream language has come along to allow designers to stop thinking about memory management and move higher up the stack. At the very least, automated garbage collection eliminates the classic kind of memory leak, where memory is allocated and never freed. In large-scale applications with always-on uptime requirements, this alone creates an almost insurmountable advantage in favour of Java's garbage collection. Memory leaks still occur in Java, usually because a long-lived object such as a cache or a static collection keeps references to things it no longer needs, but it's a higher quality problem. It's the difference between "do I have enough money to put food on the table" vs "do I have enough money for marble counter tops in my yacht". Finding memory leaks in large-scale C++ applications is more like the first kind of problem. Even well-designed C++ applications can have hard-to-find leaks that threaten uptime. Applications with leaks eventually run out of memory and need to be restarted to free this lost memory. I don't know about you, but I'd rather be worried about marble counter tops than food.
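To make the "higher quality problem" concrete, here is a minimal sketch of what a leak typically looks like in Java. Nothing was forgotten in the C++ sense; a long-lived collection simply keeps objects reachable, so the collector is never allowed to reclaim them. The class and method names (`RetentionLeakDemo`, `handleRequest`) and the cap of 100 entries are made up for illustration, and the bounded map is only one of several possible fixes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical example: names and sizes are assumptions, not taken from a real application.
class RetentionLeakDemo {
    static final int MAX_ENTRIES = 100;

    // Leaky: grows forever, so every payload stays reachable and can never be collected.
    final List<byte[]> leaky = new ArrayList<>();

    // Bounded: evicts the least recently used entry once the cap is exceeded,
    // so this map stops keeping the evicted payload alive.
    final Map<Integer, byte[]> bounded =
            new LinkedHashMap<Integer, byte[]>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<Integer, byte[]> eldest) {
                    return size() > MAX_ENTRIES;
                }
            };

    void handleRequest(int id) {
        leaky.add(new byte[1024]);        // stand-in for per-request data, never removed
        bounded.put(id, new byte[1024]);  // same kind of data, but subject to eviction
    }

    public static void main(String[] args) {
        RetentionLeakDemo demo = new RetentionLeakDemo();
        for (int i = 0; i < 1_000; i++) {
            demo.handleRequest(i);
        }
        System.out.println("leaky retains:   " + demo.leaky.size());   // 1000
        System.out.println("bounded retains: " + demo.bounded.size()); // 100
    }
}
```

The garbage collector is doing its job in both cases. The difference is purely in what the program still considers reachable, which is why this kind of leak tends to be a design question rather than a bookkeeping one.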
More importantly, garbage collection is the crown jewel of Java when it comes to performance. Java's garbage collection enables efficiencies in heap management. In C++, objects are destroyed synchronously: the thread that deletes an object runs its destructor and waits while the allocator reclaims the memory, which can mean taking a lock on a shared heap and, occasionally, a call into the O/S. In Java, reclamation is handled by the garbage collector, typically on separate threads. This not only removes a potential bottleneck that can occur when many threads block on a centralized resource (memory management in this case), it also keeps the cost of destroying objects off the calling thread. Given that garbage collection is done by a separate, non-application component, much of that bottleneck goes away.
Garbage collectors can also take advantage of multi-CPU, multi-core machines to parallelize garbage collection. Depending on the GC strategy configured, collection can either run concurrently on GC threads alongside the application, or pause the application and run GC threads on several cores in parallel.
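If you are curious which collectors your own JVM is running and how much work they have done so far, the standard `java.lang.management` API will tell you. The following is a small sketch; the class name `GcInfoDemo` and the allocation loop are mine, and the collector names and counts it prints depend entirely on your JVM and its settings, so I am not showing any expected output.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper for illustration; output varies by JVM and GC configuration.
class GcInfoDemo {

    // Builds one line per collector registered in this JVM.
    static String describeCollectors() {
        List<GarbageCollectorMXBean> collectors = ManagementFactory.getGarbageCollectorMXBeans();
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : collectors) {
            sb.append(gc.getName())
              .append(": collections=").append(gc.getCollectionCount())
              .append(", time=").append(gc.getCollectionTime()).append(" ms")
              .append(System.lineSeparator());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Create some short-lived garbage so the collectors have a chance to run.
        long checksum = 0;
        for (int i = 0; i < 5_000_000; i++) {
            int[] scratch = new int[16];
            scratch[0] = i;
            checksum += scratch[0];
        }
        System.out.println("cores available: " + Runtime.getRuntime().availableProcessors());
        System.out.println("checksum: " + checksum);
        System.out.print(describeCollectors());
    }
}
```

On HotSpot, the strategy itself is chosen with command-line flags such as `-XX:+UseParallelGC` or `-XX:+UseG1GC`; other JVMs have their own options, so treat those two as examples rather than a complete list.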
Generational garbage collectors are predicated on the observation that most objects die young. By young, I mean a lifetime often measured in milliseconds. With this in mind, a garbage collector can partition the heap into multiple spaces or generations. These are known as the young generation (also called the nursery, where new objects are allocated in an area known as eden) and the old generation. As the names suggest, objects are segregated by age. Given that most objects die young, they can be created in the young generation and die without ever being touched by the garbage collector. It is then cheaper to copy only the minority of surviving objects into a survivor space within the young generation and to reclaim everything left behind in one fell swoop. This is more efficient than individually removing a majority of objects one entry at a time. With this in mind, programmers need not use error-prone techniques such as object pooling purely as a means to improve memory allocation efficiency. In fact, in Java it is usually better to do nothing than to implement object pooling. (Of course, if creating an object is time-consuming for reasons other than memory allocation, such as opening a database connection, then pooling is fine.)
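Here is what "doing nothing" looks like in practice. This is a minimal sketch with made-up names (`ShortLivedAllocationDemo`, `Vec`): every loop iteration creates a fresh object that becomes unreachable as soon as the iteration ends, which is exactly the die-young pattern a generational collector is built for. There is no pool, no reset method, and no risk of handing out an object that someone else is still using.

```java
// Hypothetical example: the class names are assumptions chosen for illustration.
class ShortLivedAllocationDemo {

    // Tiny immutable value object; instances are cheap to create and never reused.
    static final class Vec {
        final int x;
        final int y;

        Vec(int x, int y) {
            this.x = x;
            this.y = y;
        }

        int sum() {
            return x + y;
        }
    }

    // Allocates one Vec per iteration; each becomes garbage when the iteration ends.
    static long sumWithFreshObjects(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            Vec v = new Vec(i, i + 1);
            total += v.sum();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumWithFreshObjects(1_000)); // 1000000
    }
}
```

A pooled version of the same loop would need borrow and return calls plus a way to reset each `Vec`, and a single missed return would recreate the kind of leak discussed earlier. For plain data objects like this one, I would only reach for a pool after a profiler shows allocation is actually the problem.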
Garbage collection is a huge asset for Java. In fact, it is largely responsible for the success of Java in the enterprise application space. It allows Java to go head to head with the likes of C++ even with the latter's inherent advantage of having native access to the O/S. Looks like there's gold in that garbage.