Friday, May 4, 2012

Java 7: Closing NIO.2 file channels without losing data

Closing an asynchronous file channel can be surprisingly difficult. If you have submitted I/O tasks to the asynchronous channel, you want to be sure that they are executed properly. This is a tricky requirement for asynchronous channels, for several reasons. The default channel group uses daemon threads as worker threads, which isn't a good choice because these threads are simply abandoned when the JVM exits. If you use a custom thread pool executor with non-daemon threads (see the last part of this series), you need to manage the lifecycle of the thread pool yourself. If you don't, the threads stay alive when the main thread exits. The JVM then does not exit at all, and all you can do is kill it.

Another issue when closing asynchronous channels is mentioned in the javadoc of AsynchronousFileChannel: "Shutting down the executor service while the channel is open results in unspecified behavior." This is because the close() operation on AsynchronousFileChannel issues tasks to the associated executor service that simulate the failure of pending I/O operations (in that same thread pool) with an AsynchronousCloseException. Hence, you'll get a RejectedExecutionException if you call close() on an asynchronous file channel after you have already shut down the associated executor service.

That all being said, the proposed way to safely configure the file channel and shut it down goes like this:


The custom thread pool executor service is defined in lines 6 and 7. The file channel is defined in lines 10 to 13. In lines 18 to 20 the asynchronous channel is closed in an orderly manner. First the channel itself is closed, then the executor service is shut down and, last but not least, the thread awaits termination of the thread pool executor.
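The ordering described above can be sketched roughly as follows. This is a minimal illustration, not the original listing, so the line numbers quoted in the text do not apply to it. The class name `OrderlyClose`, the pool size, the temp file and the 10 second timeout are assumptions made for the example.

```java
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.EnumSet;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class OrderlyClose {

    // Writes one message through a channel backed by a custom pool,
    // then shuts everything down in the order: channel, pool, await.
    static long writeAndClose(Path file, String message) throws Exception {
        // Custom pool with non-daemon threads; its lifecycle is our job.
        ExecutorService pool = Executors.newFixedThreadPool(2);
        AsynchronousFileChannel channel = AsynchronousFileChannel.open(
                file,
                EnumSet.of(StandardOpenOption.CREATE, StandardOpenOption.WRITE),
                pool);
        try {
            ByteBuffer data = ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8));
            // Blocking on the Future only keeps this sketch deterministic.
            channel.write(data, 0).get();
        } finally {
            channel.close();                              // 1. close the channel first
            pool.shutdown();                              // 2. then shut down the pool
            pool.awaitTermination(10, TimeUnit.SECONDS);  // 3. wait for the workers
        }
        return Files.size(file);
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("orderly-close", ".log");
        System.out.println("File size (bytes): " + writeAndClose(file, "hello"));
        Files.delete(file);
    }
}
```

Closing the channel before the pool matters because close() itself may still need the pool to fail pending operations.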

Although this is a safe way to close a channel with a custom executor service, it introduces a new issue. The clients submitted asynchronous write tasks (line 16) and may want to be sure that, once submitted successfully, those tasks will definitely be executed. Always waiting for Future.get() to return (line 23) isn't an option, because in many cases this would reduce *asynchronous* file channels ad absurdum. The snippet above will print lots of "Task wasn't executed!" messages because the channel is closed immediately after the write operations were submitted to it (line 18). To avoid such 'data loss' you can implement your own CompletionHandler and pass it to the write operation.
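A handler along these lines might look like the sketch below. It is not the original listing (the line numbers in the text refer to that one), and the names `CompensatingWrite`, `WriteHandler` and `unwritten` are hypothetical. The sketch uses the default channel group and passes the message itself as the attachment, so that failed() still has the payload available for compensation.

```java
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.channels.CompletionHandler;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class CompensatingWrite {

    // Hypothetical handler: counts successes and keeps failed payloads for a retry.
    static class WriteHandler implements CompletionHandler<Integer, String> {
        final AtomicInteger succeeded = new AtomicInteger();
        final Queue<String> unwritten = new ConcurrentLinkedQueue<>();
        private final CountDownLatch done;

        WriteHandler(int expectedWrites) {
            done = new CountDownLatch(expectedWrites);
        }

        @Override
        public void completed(Integer bytesWritten, String message) {
            succeeded.incrementAndGet();
            done.countDown();
        }

        @Override
        public void failed(Throwable cause, String message) {
            // Compensation hook: remember what could not be written.
            unwritten.add(message);
            done.countDown();
        }

        boolean await(long timeout, TimeUnit unit) throws InterruptedException {
            return done.await(timeout, unit);
        }
    }

    // Submits 'count' copies of 'message' and reports how each write ended.
    static WriteHandler writeAll(Path file, String message, int count) throws Exception {
        WriteHandler handler = new WriteHandler(count);
        byte[] bytes = message.getBytes(StandardCharsets.UTF_8);
        try (AsynchronousFileChannel channel = AsynchronousFileChannel.open(
                file, StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            for (int i = 0; i < count; i++) {
                channel.write(ByteBuffer.wrap(bytes), (long) i * bytes.length, message, handler);
            }
            // Waiting here only keeps the demo deterministic.
            handler.await(10, TimeUnit.SECONDS);
        }
        return handler;
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("compensating-write", ".log");
        WriteHandler result = writeAll(file, "log entry\n", 20);
        System.out.println("Succeeded: " + result.succeeded.get()
                + ", to compensate: " + result.unwritten.size());
        Files.delete(file);
    }
}
```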


The CompletionHandler.failed() method (line 16) catches any runtime exception during task processing. You can implement any compensation code here to avoid data loss. When you work with mission-critical data, using CompletionHandlers may be a good idea. But there is *still* another issue. Clients can submit tasks, but they don't know whether the pool will process them successfully. Successful in this context means that the submitted bytes actually reach their destination (the file on the hard disk). If you want to be sure that all submitted tasks are actually processed before closing, it gets a little trickier. You need a 'graceful' closing mechanism that waits until the work queue is empty *before* it actually closes the channel and the associated executor service (this isn't possible using the standard lifecycle methods).

Introducing GracefulAsynchronousFileChannel


My last snippets introduce the GracefulAsynchronousFileChannel. You can get the complete code here in my Git repository. The channel behaves like this: it guarantees to process all successfully submitted write operations and throws a NonWritableChannelException once it is preparing to shut down. It takes two things to implement that behaviour. First, you need to implement afterExecute() in an extension of ThreadPoolExecutor that sends a signal when the queue is empty. This is what DefensiveThreadPoolExecutor does.
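To illustrate the idea, here is a minimal sketch of an executor that signals from afterExecute(). It is not the DefensiveThreadPoolExecutor from the repository, and the line numbers in the text do not apply to it. The class name `SignallingPool` and the method `awaitEmpty()` are hypothetical, and instead of inspecting the work queue this variant counts tasks that were accepted but not yet finished, which is an assumption of the sketch.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class SignallingPool extends ThreadPoolExecutor {

    private final ReentrantLock lock = new ReentrantLock();
    private final Condition isEmpty = lock.newCondition();
    private int pending; // accepted but unfinished tasks, guarded by lock

    public SignallingPool(int threads) {
        // The default AbortPolicy is assumed; it rejects by throwing.
        super(threads, threads, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
    }

    @Override
    public void execute(Runnable task) {
        adjustPending(+1);
        try {
            super.execute(task);
        } catch (RejectedExecutionException e) {
            adjustPending(-1); // the task never entered the pool
            throw e;
        }
    }

    @Override
    protected void afterExecute(Runnable task, Throwable thrown) {
        super.afterExecute(task, thrown);
        // Runs on the worker thread after each task, failed or not.
        adjustPending(-1);
    }

    private void adjustPending(int delta) {
        lock.lock();
        try {
            pending += delta;
            if (pending == 0) {
                isEmpty.signalAll(); // nothing queued, nothing running
            }
        } finally {
            lock.unlock();
        }
    }

    // Blocks until all accepted tasks have finished; false on timeout.
    public boolean awaitEmpty(long timeout, TimeUnit unit) throws InterruptedException {
        long nanos = unit.toNanos(timeout);
        lock.lock();
        try {
            while (pending > 0) {
                if (nanos <= 0L) {
                    return false;
                }
                nanos = isEmpty.awaitNanos(nanos);
            }
            return true;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        SignallingPool pool = new SignallingPool(2);
        final AtomicInteger ran = new AtomicInteger();
        for (int i = 0; i < 50; i++) {
            pool.execute(new Runnable() {
                public void run() { ran.incrementAndGet(); }
            });
        }
        System.out.println("Drained: " + pool.awaitEmpty(10, TimeUnit.SECONDS)
                + ", tasks run: " + ran.get());
        pool.shutdown();
    }
}
```

Counting under the same lock that guards the condition avoids a lost signal: a waiter either sees the counter at zero or is already waiting when the last task signals.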


The afterExecute() method (line 12) is executed after each processed task, by the thread that processed that task. The implementation sends the isEmpty signal in line 18. The second thing you need to gracefully close a channel is a custom implementation of the close() method of AsynchronousFileChannel.


Study that code for a while. The interesting bits are in line 11, where the innerChannel gets replaced by a read-only channel. That causes any subsequent asynchronous write request to fail with a NonWritableChannelException. In line 16 the close() method waits for the isEmpty signal. When this signal is sent after the last write task, the close() method continues with an orderly shutdown procedure (line 27 ff.). Basically, the code adds a shared lifecycle state across the file channel and the associated thread pool. That way both objects can communicate during the shutdown procedure and avoid data loss.
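The shared lifecycle state can be illustrated with a much smaller sketch. Note that this is not the GracefulAsynchronousFileChannel from the repository: instead of swapping in a read-only channel and waiting for a pool signal, it uses a plain `closing` flag and counts in-flight writes in a wrapping CompletionHandler. The class name `GracefulWriter` and all field names are hypothetical, and partial writes are ignored for brevity.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.channels.CompletionHandler;
import java.nio.channels.NonWritableChannelException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.EnumSet;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class GracefulWriter {

    private final ExecutorService pool = Executors.newFixedThreadPool(2);
    private final AsynchronousFileChannel channel;
    private final Object monitor = new Object();
    private int inFlight;     // submitted but unfinished writes, guarded by monitor
    private boolean closing;  // set once close() has started, guarded by monitor

    public GracefulWriter(Path file) throws IOException {
        channel = AsynchronousFileChannel.open(file,
                EnumSet.of(StandardOpenOption.CREATE, StandardOpenOption.WRITE), pool);
    }

    // Accepts a write unless shutdown has begun.
    public void write(ByteBuffer src, long position) {
        synchronized (monitor) {
            if (closing) {
                throw new NonWritableChannelException();
            }
            inFlight++;
        }
        try {
            channel.write(src, position, null, new CompletionHandler<Integer, Void>() {
                public void completed(Integer bytes, Void attachment) { finished(); }
                public void failed(Throwable cause, Void attachment) { finished(); }
            });
        } catch (RuntimeException e) {
            finished(); // the write was never started
            throw e;
        }
    }

    private void finished() {
        synchronized (monitor) {
            if (--inFlight == 0) {
                monitor.notifyAll();
            }
        }
    }

    // Blocks new writes, waits for accepted ones, then closes channel and pool.
    // Must not be called from a completion handler thread.
    public void close() throws IOException, InterruptedException {
        synchronized (monitor) {
            closing = true;
            while (inFlight > 0) {
                monitor.wait();
            }
        }
        channel.close();
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
    }
}
```

A client would call write() from its logging thread and close() from another thread. Writes accepted before close() are completed, while later ones are rejected with NonWritableChannelException.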

Here is a logging client that uses the GracefulAsynchronousFileChannel.


The client starts two threads: one issues write operations in an infinite loop (line 6 ff.), the other closes the file channel asynchronously after one second of processing (line 25 ff.). If you run that client, the following output is produced:

Starting graceful shutdown ...
Deal with the fact that the channel was closed asynchronously ... java.nio.channels.NonWritableChannelException
Channel blocked for write access ...
Waiting for signal that queue is empty ...
Issueing signal that queue is empty ...
Received signal that queue is empty ... closing
File closed ...
Pool closed ...
Expected file size (bytes): 400020
Actual file size (bytes): 400020
No write operation was lost!

The output shows the orderly shutdown procedure of the participating threads. The logging thread needs to deal with the fact that the channel was closed asynchronously. After the queued tasks are processed, the channel resources are closed. No data was lost: everything that the client issued was really written to the file destination. There are no AsynchronousCloseExceptions or RejectedExecutionExceptions in such a graceful closing procedure.

That's all in terms of safely closing asynchronous file channels. The complete code is here in my Git repository. I hope you've enjoyed it a little. Looking forward to your comments.
Cheers, Niklas

The NIO.2 file channels series:
- Introduction
- Applying custom thread pools
- Closing file channels without losing data
- I/O operations are not atomic


