Monday, March 24, 2014

Exploring Immutability

In Java theory and practice: To mutate or not to mutate?, Brian Goetz says immutable objects simplify programming. One reason is that they can be shared and cached without being cloned. Because their state never changes, correctly written immutable objects are also thread-safe, which makes it easier to take advantage of multiple cores. In Scala you could write something like List(1, 2, 3).par.map(_ * 2) and the map function is applied to the list elements in parallel. That would not work reliably if someone else were changing the contents of the list at the same time.

Thread safety and parallelization sound great, but one of the first things I wonder is whether large immutable collections would involve a lot of copying. They would if the entire collection were recreated every time you added or deleted an element. It turns out that does not need to happen, because the collection implementation can exploit the structure that the old and new versions have in common, which is safe to share precisely because it is immutable. This might mean sharing a sub-tree or sharing the tail of a list.1,2 Compilers can potentially do additional optimizations with immutable objects as well.3
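To make the tail-sharing idea concrete, here is a minimal sketch of an immutable singly linked list in Python. The names `Node`, `prepend`, and `to_list` are made up for illustration and are not taken from any library. Prepending builds one new node and reuses the old list as its tail, so nothing is copied.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    """One immutable cell of a singly linked list (hypothetical name)."""
    head: int
    tail: Optional["Node"]  # None marks the end of the list


def prepend(value: int, lst: Optional[Node]) -> Node:
    # Only one new cell is allocated; the existing list becomes the tail as-is.
    return Node(value, lst)


def to_list(lst: Optional[Node]) -> list:
    # Walk the cells to build an ordinary Python list for display.
    out = []
    while lst is not None:
        out.append(lst.head)
        lst = lst.tail
    return out


old = prepend(2, prepend(3, None))
new = prepend(1, old)

print(to_list(old))     # [2, 3]
print(to_list(new))     # [1, 2, 3]
print(new.tail is old)  # True: the new list shares the old one as its tail
```

Because a `NamedTuple` cannot be modified after creation, anyone still holding `old` is unaffected by the creation of `new`.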

As with anything else, immutable objects do not solve every problem and can be used incorrectly. At the end of the day I see them as another tool in the toolbox. To look at how (un)natural it is to work with immutable collections, below are some Scala examples.


1 http://en.wikipedia.org/wiki/Persistent_data_structure
2 http://pragprog.com/magazines/2012-01/scala-for-the-intrigued
3 http://www.drdobbs.com/architecture-and-design/optimizing-immutable-and-purity/228700592

Sunday, March 23, 2014

Closure Examples

A closure is a function plus a referencing environment that is remembered from when the function was created. Below is a simple example in three different languages for comparison. It adds two numbers, with the sum capped at a certain max value. The function adds x and y, and the referencing environment contains the max value.

Python
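A minimal sketch of the capped-sum closure described above. The names `make_capped_adder`, `max_value`, and `add_capped` are hypothetical and chosen only for this illustration.

```python
def make_capped_adder(max_value):
    """Return a function that adds two numbers, capping the sum at max_value."""

    def add(x, y):
        # max_value is not a parameter of add; it comes from the enclosing
        # scope and is remembered after make_capped_adder returns.
        return min(x + y, max_value)

    return add


add_capped = make_capped_adder(10)

print(add_capped(3, 4))  # 7
print(add_capped(8, 5))  # 10, because 13 is capped at the remembered max
```

Here `add` is the function and the remembered `max_value` is its referencing environment.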

Scala

JavaScript


In Scala and Python, closures are probably seen most often when a lambda uses values from outside its own scope. In JavaScript, closures are commonly used to create private-like variables and functions that do not pollute the global namespace.
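Both uses can be sketched in Python, assuming some made-up names (`threshold`, `make_counter`). The first part shows a lambda reading a value from outside its scope. The second shows the private-like state pattern, written in Python here rather than JavaScript.

```python
# A lambda that closes over a variable defined outside of it.
threshold = 10
numbers = [4, 12, 9, 30]
large = list(filter(lambda n: n > threshold, numbers))
print(large)  # [12, 30]


def make_counter():
    """Return a counter whose count is reachable only through the closure."""
    count = 0  # private-like: not visible as a global name

    def increment():
        nonlocal count  # rebind the variable in the enclosing scope
        count += 1
        return count

    return increment


counter = make_counter()
print(counter())  # 1
print(counter())  # 2
```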

Saturday, March 8, 2014

Asynchronous Non-blocking I/O Java Echo Server

After writing a Node.js blog post a couple of weeks ago, I wanted to revisit non-blocking I/O. I was first introduced to the topic in a college class where we used Java, but that was quite a while ago, so I wondered what a non-blocking I/O server looks like in Java these days. (Node.js handles a lot of these details for you, so it is probably not the best place to learn about them.) Non-blocking I/O (NIO) was added in Java 1.4, and asynchronous non-blocking I/O (NIO2) was added in Java 7.

I set out to write a simple echo server that can at least handle simultaneous connections, so that it is not completely trivial. I decided to use the NIO2 APIs so that it is completely asynchronous, which makes it most like the Node.js TCP echo server.

My Server Code


This code is disappointingly difficult to read. I think this is what JavaScript folks call callback hell: three levels of callbacks. Still, if you can get through all the boilerplate and exception handling code, you will see I only had to write a few lines of "real" code, which is pretty nice. It is also more concise than a similar server written using the older NIO APIs.

Differences Between Blocking I/O, Non-blocking I/O, and Asynchronous Non-blocking I/O

With blocking I/O, when a thread does a read or write, it is blocked until some data is read or the data is written. The canonical Java web server example spawns a new thread for each request so that the main thread will not be blocked from accepting new connections.
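The thread-per-connection pattern can be sketched as follows. This uses Python's standard `socket` and `threading` modules rather than Java, and `handle_client` and `serve_forever` are hypothetical names for this illustration.

```python
import socket
import threading


def handle_client(conn: socket.socket) -> None:
    """Echo until the client disconnects; each call below blocks this thread."""
    with conn:
        while True:
            data = conn.recv(4096)  # blocks until some data arrives (or EOF)
            if not data:            # empty bytes means the client closed
                break
            conn.sendall(data)      # blocks until all of the data is written


def serve_forever(listener: socket.socket) -> None:
    """Accept loop: a new thread per connection keeps accept() responsive."""
    while True:
        conn, _addr = listener.accept()  # blocks until a client connects
        threading.Thread(
            target=handle_client, args=(conn,), daemon=True
        ).start()
```

The cost of this design is one thread per open connection, even while that connection is idle.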

With non-blocking I/O you request a read and you get what is available (maybe nothing is available) and the thread can continue on. You request a write and whatever can be written is written and the thread continues on.1 In other words, a single thread can manage multiple connections, but it might have to call read and write multiple times to completely read and write for each request and response.
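A rough sketch of that single-threaded style, using Python's standard `selectors` module as a stand-in for a Java NIO selector. The `register` and `echo_pass` helpers are made-up names, and error handling such as connection resets is left out. Note how both `recv` and `send` may handle only part of the data, so unsent bytes are kept in a per-connection buffer.

```python
import selectors
import socket


def register(sel: selectors.BaseSelector, conn: socket.socket) -> None:
    """Put a connection in non-blocking mode and track its unsent bytes."""
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, data=bytearray())


def echo_pass(sel: selectors.BaseSelector, timeout: float = 0.1) -> None:
    """Handle every connection that is ready right now, then return."""
    for key, events in sel.select(timeout):
        conn, pending = key.fileobj, key.data
        if events & selectors.EVENT_READ:
            try:
                chunk = conn.recv(4096)  # whatever is available right now
            except BlockingIOError:
                chunk = None             # nothing available after all
            if chunk == b"":             # peer closed the connection
                sel.unregister(conn)
                conn.close()
                continue
            if chunk:
                pending += chunk         # in-place append to the buffer
        if pending:
            try:
                sent = conn.send(pending)  # may write only part of the buffer
                del pending[:sent]
            except BlockingIOError:
                pass                       # socket buffer full; retry later
        # Ask for write readiness only while unsent bytes remain.
        wanted = selectors.EVENT_READ
        if pending:
            wanted |= selectors.EVENT_WRITE
        sel.modify(conn, wanted, data=pending)
```

A server would call `echo_pass` in a loop from one thread, registering each newly accepted connection with `register`.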

With asynchronous non-blocking I/O an I/O operation always returns immediately--before anything is read/written--and the thread continues on. The I/O operation is handled in the background and the callback you provide eventually handles the result. This is how I wrote my server shown above.
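For comparison, here is a small callback-style echo server in Python's `asyncio`. It is used only to show the callback-driven shape: it is a different API from Java's NIO2 and is not the server code from this post. `EchoProtocol` and `demo` are hypothetical names, and binding to 127.0.0.1 on port 0 (any free port) is an assumption made for the demo.

```python
import asyncio


class EchoProtocol(asyncio.Protocol):
    """The event loop calls these methods; no thread waits on the socket."""

    def connection_made(self, transport):
        # Called once per accepted connection; keep the transport for replies.
        self.transport = transport

    def data_received(self, data):
        # Called when bytes arrive. write() queues the data and returns
        # immediately; the loop sends it in the background.
        self.transport.write(data)


async def demo(message: bytes) -> bytes:
    """Start the server, send one message as a client, return the echo."""
    loop = asyncio.get_running_loop()
    server = await loop.create_server(EchoProtocol, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    try:
        reader, writer = await asyncio.open_connection("127.0.0.1", port)
        writer.write(message)
        await writer.drain()
        echoed = await reader.readexactly(len(message))
        writer.close()
        await writer.wait_closed()
        return echoed
    finally:
        server.close()
```

Calling `asyncio.run(demo(b"hello"))` runs the whole exchange on a single thread.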

These real-world analogies are helpful to at least understand the differences between blocking and non-blocking.

Conclusion

Non-blocking I/O is a good fit when you need to manage many long-lived concurrent connections, as in a chat server or a P2P network. Blocking I/O is probably better when more data is transferred at once but connections are short-lived, as with a traditional web server.1

It also seems like web sockets would be implemented with non-blocking I/O. Can anyone point me to more information about that?

Update

I posted a Java 8 version here.


Resources

1 more info at http://java.dzone.com/articles/java-nio-vs-io