Saturday, July 26, 2014

Java 7 Fork/Join and Java 8 Streams: Merge Sort

Fork/Join

The fork/join framework introduced in Java 7 has been on my list of things to check out for a while now. Basically, it provides an easy way to parallelize work that can be broken up into smaller parts. Work is distributed to threads in a pool, and idle threads can steal work from busy ones (work stealing). Of the many parallel algorithms out there, I decided to try parallel merge sort. I have two processors, so I would expect to see some speedup over the canonical top-down merge sort.
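A fork/join merge sort can be structured along these lines (a simplified sketch, not my exact code; the class name and threshold are illustrative). A RecursiveAction sorts small ranges sequentially and forks everything else:

```java
import java.util.Arrays;
import java.util.Random;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

// Simplified sketch of a fork/join merge sort. The RecursiveAction forks the
// two halves of its range and merges them once both have completed.
public class ParallelMergeSort extends RecursiveAction {
    // Below this size it is cheaper to sort sequentially than to fork.
    private static final int THRESHOLD = 1 << 13;

    private final int[] array;
    private final int lo, hi; // sorts array[lo..hi)

    public ParallelMergeSort(int[] array, int lo, int hi) {
        this.array = array;
        this.lo = lo;
        this.hi = hi;
    }

    @Override
    protected void compute() {
        if (hi - lo <= THRESHOLD) {
            Arrays.sort(array, lo, hi); // sequential base case
            return;
        }
        int mid = (lo + hi) >>> 1;
        // Fork both halves and wait for them before merging.
        invokeAll(new ParallelMergeSort(array, lo, mid),
                  new ParallelMergeSort(array, mid, hi));
        merge(array, lo, mid, hi);
    }

    // Standard two-run merge of array[lo..mid) and array[mid..hi).
    private static void merge(int[] a, int lo, int mid, int hi) {
        int[] left = Arrays.copyOfRange(a, lo, mid);
        for (int i = 0, j = mid, k = lo; i < left.length; k++) {
            a[k] = (j >= hi || left[i] <= a[j]) ? left[i++] : a[j++];
        }
    }

    public static void main(String[] args) {
        int[] data = new Random(42).ints(1_000_000).toArray();
        new ForkJoinPool().invoke(new ParallelMergeSort(data, 0, data.length));
        boolean sorted = true;
        for (int i = 1; i < data.length; i++) {
            sorted &= data[i - 1] <= data[i];
        }
        System.out.println("sorted: " + sorted);
    }
}
```

The threshold matters: forking all the way down to single elements drowns the sort in task-scheduling overhead, which is consistent with the timings below.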


My fork/join parallel implementation does not become faster than the non-parallel implementation until I sort more than roughly 25 million integers. Fork/join clearly comes with significant overhead, but I suspect that would be less noticeable if I had more processors to work with.

Streams

As I have been trying to become more adept at functional programming, and am also a Java fan, I have been starting to explore what's new in Java 8 as well. This probably is not at all practical, but I had the idea of re-implementing merge sort using streams. To do this I had to use the non-recursive, bottom-up merge sort algorithm.
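A bottom-up version driven by streams looks roughly like this (a simplified sketch, not the exact code from my experiment). Each pass merges adjacent runs of the current width, and the per-pass IntStream over run start indices is the natural place to try `.parallel()`:

```java
import java.util.Arrays;
import java.util.stream.IntStream;

// Simplified sketch of a bottom-up merge sort where each pass's merges are
// driven by an IntStream over the run start indices. The merges within one
// pass are independent, so that stream is where parallel() could be applied.
public class StreamMergeSort {

    public static void sort(int[] a) {
        for (int width = 1; width < a.length; width *= 2) {
            final int w = width; // effectively final copy for the lambdas
            long pairs = (a.length + 2L * w - 1) / (2L * w);
            IntStream.iterate(0, lo -> lo + 2 * w)
                     .limit(pairs)
                     .forEach(lo -> merge(a, lo,
                                          Math.min(lo + w, a.length),
                                          Math.min(lo + 2 * w, a.length)));
        }
    }

    // Standard two-run merge of a[lo..mid) and a[mid..hi).
    private static void merge(int[] a, int lo, int mid, int hi) {
        int[] left = Arrays.copyOfRange(a, lo, mid);
        for (int i = 0, j = mid, k = lo; i < left.length; k++) {
            a[k] = (j >= hi || left[i] <= a[j]) ? left[i++] : a[j++];
        }
    }

    public static void main(String[] args) {
        int[] data = {5, 3, 8, 1, 9, 2};
        sort(data);
        System.out.println(Arrays.toString(data)); // [1, 2, 3, 5, 8, 9]
    }
}
```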

This is faster than the fork/join version up to about 3 million elements, after which it becomes slower. It was not really clear to me how to properly parallelize this one: converting it to a parallel stream before doing the merge seems to improve performance only slightly. Again, maybe it is difficult to really see the benefit with only two processors.

It is also worth noting that since all parallel streams use the same fork/join thread pool, they will be affected by each other's performance if used concurrently as explained in Java Parallel Streams Are Bad for Your Health.

Notes

For completeness, here is the merge method I used in all three merge sort implementations:
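The original snippet is not reproduced here, but a standard merge along those lines combines two adjacent sorted runs, a[lo..mid) and a[mid..hi), in place using one temporary copy of the left run:

```java
import java.util.Arrays;

// Standalone sketch of a standard merge of two adjacent sorted runs,
// a[lo..mid) and a[mid..hi), in place via a temporary copy of the left run.
public class Merge {

    static void merge(int[] a, int lo, int mid, int hi) {
        int[] left = Arrays.copyOfRange(a, lo, mid);
        int i = 0;   // next element of the left copy
        int j = mid; // next element of the right run
        int k = lo;  // next write position
        while (i < left.length) {
            if (j >= hi || left[i] <= a[j]) {
                a[k++] = left[i++];
            } else {
                a[k++] = a[j++];
            }
        }
        // Any remaining right-run elements are already in place.
    }

    public static void main(String[] args) {
        int[] a = {1, 4, 7, 2, 3, 9}; // two sorted runs: [1,4,7] and [2,3,9]
        merge(a, 0, 3, 6);
        System.out.println(Arrays.toString(a)); // [1, 2, 3, 4, 7, 9]
    }
}
```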

Friday, July 25, 2014

Demystifying Java Dumps

I have never taken dumps of an application to help with debugging. I can think of a couple of instances when they would have been helpful, but at the time I did not know about them. Now I hear more about thread dumps and heap dumps and feel like they should be in my toolkit.

In this post I will just try to touch on what the dumps will contain, why they are useful, and how to get started interpreting them. This is by no means a comprehensive analysis. There are several different JVM and OS-specific ways to trigger dumps (signals, environment variables, JVM command-line options), their content varies by JVM, and there are multiple tools for analyzing them.

I am using the 64-bit Oracle JDK 1.8 on Windows 7 because that is what I already had available.

Thread Dumps

A thread dump is a snapshot of all threads in a process, including their stacks. It can help you diagnose problems like deadlocks. By taking several consecutive thread dumps you can also detect bottlenecks like multiple threads vying for the same lock.

To create my thread dump I simply pressed Ctrl+Break in the console window my server was running in.

To analyze my thread dump I tried out the Thread Dump Analyzer (TDA). I provided three thread dumps about five seconds apart and it found my deadlock:

Program that intentionally deadlocked two threads
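The deadlocking program itself is simple: two threads each take one lock and then try to take the other's. A sketch along those lines (the latch and daemon threads are only there so this demo can print the thread states and exit, instead of hanging like the real test program):

```java
import java.util.concurrent.CountDownLatch;

// Sketch of a program that intentionally deadlocks two threads by acquiring
// two locks in opposite orders. The latch guarantees each thread holds its
// first lock before reaching for the second; daemon threads let main exit.
public class DeadlockDemo {

    public static void main(String[] args) throws InterruptedException {
        final Object lockA = new Object();
        final Object lockB = new Object();
        final CountDownLatch bothLocked = new CountDownLatch(2);

        Thread t1 = new Thread(() -> {
            synchronized (lockA) {
                bothLocked.countDown();
                await(bothLocked);
                synchronized (lockB) { } // blocks forever: t2 holds lockB
            }
        });
        Thread t2 = new Thread(() -> {
            synchronized (lockB) {
                bothLocked.countDown();
                await(bothLocked);
                synchronized (lockA) { } // blocks forever: t1 holds lockA
            }
        });
        t1.setDaemon(true);
        t2.setDaemon(true);
        t1.start();
        t2.start();

        Thread.sleep(1000); // give both threads time to block on the second lock
        System.out.println("t1: " + t1.getState() + ", t2: " + t2.getState());
    }

    private static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Both threads end up in the BLOCKED state, which is exactly what shows up in the thread dump.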

Heap Dumps

A heap dump is a snapshot of the memory of a Java process, including everything in the JVM heap space. It is useful for figuring out why you are running out of memory.

To create my heap dump I set the -XX:+HeapDumpOnOutOfMemoryError JVM option; to create one interactively I could also have used the jmap tool: jmap.exe -dump:live,format=b,file=<file name> <pid>

To analyze my heap dump I used the Memory Analyzer Eclipse plugin which shows things like:
  • a histogram showing the number of instances per class
  • dominator tree - a list of the biggest objects and what they keep alive (shown below)
  • top consumers - the most expensive objects grouped by class and package
  • classes loaded by multiple class loaders
  • memory leak suspects
  • top components - components bigger than one percent of total heap
Program that intentionally caused an OutOfMemoryError by creating a large list
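That program is essentially a loop that keeps growing a list until the heap runs out. A sketch (the catch block is only there so the demo can report the error rather than die silently; the array size is arbitrary):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a program that causes an OutOfMemoryError by growing a list of
// large arrays. Run with -XX:+HeapDumpOnOutOfMemoryError (and a small -Xmx,
// e.g. -Xmx64m, to fail fast) to produce a heap dump for analysis.
public class OutOfMemoryDemo {

    public static void main(String[] args) {
        List<long[]> hog = new ArrayList<>();
        int allocations = 0;
        try {
            while (true) {
                hog.add(new long[1_000_000]); // ~8 MB per element
                allocations++;
            }
        } catch (OutOfMemoryError e) {
            hog = null; // drop the list so there is memory left to print with
            System.out.println("OutOfMemoryError after " + allocations + " allocations");
        }
    }
}
```

In the resulting dump, the ArrayList (and the long arrays it keeps alive) dominates the heap, which is what the dominator tree view makes obvious.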