Tuesday, July 9, 2013

Threads: All about ThreadLocal

I wanted to improve my threading fundamentals, so I planned to make a few notes for myself that will help me in the future (obviously at interview time). The following questions and answers have been taken from stackoverflow.com. This post is a sticky-note post for me.

There is no single source for this post, but I got the question from this post. Source

What is ThreadLocal?
The Java docs provide the following information about ThreadLocal:

public class ThreadLocal<T> extends Object
This class provides thread-local variables. These variables differ from their normal counterparts in that each thread that accesses one (via its get or set method) has its own, independently initialized copy of the variable. ThreadLocal instances are typically private static fields in classes that wish to associate state with a thread (e.g., a user ID or Transaction ID).
For example, in the class below, the private static ThreadLocal instance (serialNum) maintains a "serial number" for each thread that invokes the class's static SerialNum.get() method, which returns the current thread's serial number. (A thread's serial number is assigned the first time it invokes SerialNum.get(), and remains unchanged on subsequent calls.)
 public class SerialNum {
     // The next serial number to be assigned
     private static int nextSerialNum = 0;

     private static ThreadLocal serialNum = new ThreadLocal() {
         protected synchronized Object initialValue() {
             return new Integer(nextSerialNum++);
         }
     };

     public static int get() {
         return ((Integer) (serialNum.get())).intValue();
     }
 }

Each thread holds an implicit reference to its copy of a thread-local variable as long as the thread is alive and the ThreadLocal instance is accessible; after a thread goes away, all of its copies of thread-local instances are subject to garbage collection (unless other references to these copies exist).
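
On Java 8 and later, the same idea can be written with generics and ThreadLocal.withInitial. What follows is only a minimal sketch and is not part of the Javadoc: the class name ThreadSerial is made up for illustration, and an AtomicInteger stands in for the synchronized initialValue() of the original.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical modern rewrite of the Javadoc SerialNum example (assumes Java 8+).
final class ThreadSerial {
    // The next serial number to be assigned; AtomicInteger keeps the increment thread-safe.
    private static final AtomicInteger NEXT = new AtomicInteger(0);

    // Each thread gets its own value, computed the first time that thread calls get().
    private static final ThreadLocal<Integer> SERIAL =
            ThreadLocal.withInitial(NEXT::getAndIncrement);

    private ThreadSerial() {}

    static int get() {
        return SERIAL.get();
    }

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() ->
                System.out.println("worker serial: " + get()));
        worker.start();
        worker.join();
        // Repeated calls on the same thread return the same number.
        System.out.println("main serial: " + get() + ", again: " + get());
    }
}
```

Using the type parameter removes the cast in get(), and the two threads in main end up with different serial numbers.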

ThreadLocal variables are variables whose values are not shared across threads. Each thread maintains its own copy instead of sharing a single resource (such as a globally declared field) across multiple threads, so no synchronization is required for such variables. In Java this is one way to achieve thread safety; another is to use immutable classes.

In most discussions, the advice is that when the use cases of the two can be clearly separated, instance variables are preferred over ThreadLocals.

ThreadLocals are used to preserve the state of a thread. When we do not want to write a synchronized block, we try to avoid shared instance variables. The alternative to instance variables is either to pass every member to each method the thread calls, or to define the specific member as a ThreadLocal. In both cases the state of the thread remains intact.

In Java, if you have a datum that can vary per-thread, your choices are to pass that datum around to every method that needs (or may need) it, or to associate the datum with the thread. Passing the datum around everywhere may be workable if all your methods already need to pass around a common "context" variable.
If that's not the case, you may not want to clutter up your method signatures with an additional parameter. In a non-threaded world, you could solve the problem with the Java equivalent of a global variable. In a threaded world, the equivalent of a global variable is a thread-local variable.

Essentially, use ThreadLocal when you need a variable's value to depend on the current thread and it isn't convenient to attach the value to the thread in some other way (for example, by subclassing Thread).
A typical case is where some other framework has created the thread that your code is running in, e.g. a servlet container, or where it just makes more sense to use ThreadLocal because your variable is then "in its logical place" (rather than a variable hanging from a Thread subclass or in some other hash map).
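
As a rough illustration of the "global variable per thread" idea above, here is a minimal sketch. The class names TransactionContext and AuditLog and the ID "tx-42" are invented for this example and do not come from any framework.

```java
// Hypothetical per-thread context holder; all names here are illustrative.
final class TransactionContext {
    // One slot per thread; threads that never call begin() see null.
    private static final ThreadLocal<String> CURRENT_ID = new ThreadLocal<>();

    private TransactionContext() {}

    static void begin(String transactionId) { CURRENT_ID.set(transactionId); }

    static String currentId() { return CURRENT_ID.get(); }

    static void end() { CURRENT_ID.remove(); }
}

final class AuditLog {
    private AuditLog() {}

    // No transactionId parameter: the value is looked up from the calling thread.
    static String describe(String action) {
        return "[" + TransactionContext.currentId() + "] " + action;
    }

    public static void main(String[] args) {
        TransactionContext.begin("tx-42");
        try {
            System.out.println(describe("debit account")); // prints "[tx-42] debit account"
        } finally {
            TransactionContext.end(); // clear the slot so the thread can be reused safely
        }
    }
}
```

The method signature of describe() stays free of the context argument, which is the convenience being traded against the hidden dependency on thread state.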

1) ThreadLocals are a good fit for per-thread singleton classes or per-thread context information such as a transaction ID.

2) You can wrap any non-thread-safe object in a ThreadLocal and its use becomes thread-safe, because each instance is only ever used by a single thread. One of the classic examples of ThreadLocal is sharing SimpleDateFormat. Since SimpleDateFormat is not thread-safe, a global formatter may not work, but a per-thread formatter will.

3) ThreadLocal provides another way to extend Thread. If you want to preserve or carry information from one method call to another, you can carry it in a ThreadLocal. This can provide immense flexibility, as you don't need to modify any method signatures.
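
The SimpleDateFormat case from point 2 could look roughly like the sketch below. The class name DateFormats, the "yyyy-MM-dd" pattern, and the UTC time zone are assumptions chosen for the example.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper: one SimpleDateFormat per thread instead of one shared instance.
final class DateFormats {
    private static final ThreadLocal<SimpleDateFormat> ISO_DAY =
            ThreadLocal.withInitial(() -> {
                // Created lazily, once per thread, on that thread's first call.
                SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT);
                format.setTimeZone(TimeZone.getTimeZone("UTC"));
                return format;
            });

    private DateFormats() {}

    static String formatDay(Date date) {
        // Safe without synchronization: the formatter is never shared between threads.
        return ISO_DAY.get().format(date);
    }
}
```

For example, DateFormats.formatDay(new Date(0L)) returns "1970-01-01". On Java 8 and later, java.time.format.DateTimeFormatter is immutable and thread-safe, so it may be a simpler choice when it is available.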

It can be difficult to write efficient code that is safe for multithreaded access. Java's ThreadLocal class provides a powerful, easy-to-use solution, while avoiding the drawbacks of other approaches. Plus, ThreadLocal implementations are more efficient, particularly in later JVMs. If you are trying to improve the performance of frequently used classes that use non-thread-safe resources that are expensive to create (such as XML parsers or connections to a database), try a ThreadLocal implementation. (Source)

ThreadLocals reduce reusability in much the same way that global variables do: when your method's computations depend on state that is external to the method but not passed as parameters (class fields, for example), your method is less reusable, because it is tightly coupled to the state of the object or class in which it resides (or worse, to a different class entirely).

ThreadLocal and Memory Leaks
Since a ThreadLocal is a reference to data within a given Thread, you can end up with classloading leaks when using ThreadLocals in application servers that use thread pools. You need to be very careful about cleaning up any ThreadLocals you get() or set() by calling the ThreadLocal's remove() method.
If you do not clean up when you're done, any references it holds to classes loaded as part of a deployed webapp will remain in the permanent generation (PermGen) and will never get garbage collected. Redeploying or undeploying the webapp will not clean up each Thread's reference to your webapp's classes, since the Thread is not something owned by your webapp. Each successive deployment will create a new instance of the class which will never be garbage collected.
You will end up with out-of-memory failures (java.lang.OutOfMemoryError: PermGen space) and, after some googling, will probably just increase -XX:MaxPermSize instead of fixing the bug.
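
A common way to apply the cleanup advice above is to call remove() in a finally block, so the value is cleared before the pooled thread picks up its next task. The sketch below is one possible shape under that assumption; CleanupDemo, handle, and leftoverAfter are hypothetical names, and a single-thread executor stands in for a container's thread pool.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical demo of set()/remove() pairing on a pooled thread.
final class CleanupDemo {
    private static final ThreadLocal<String> USER = new ThreadLocal<>();

    private CleanupDemo() {}

    // Sets the per-thread value, does the work, and always clears the value afterwards.
    static String handle(String user) {
        USER.set(user);
        try {
            return "handled for " + USER.get();
        } finally {
            USER.remove(); // without this, the value would stay attached to the pooled thread
        }
    }

    // Runs handle() on a one-thread pool, then reports what the next task on that thread sees.
    static String leftoverAfter(String user) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<String> first = () -> handle(user);
            Callable<String> probe = USER::get;
            pool.submit(first).get();
            return pool.submit(probe).get(); // null, because handle() removed the value
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

If the remove() call is deleted, the second task sees the first task's value, which is the kind of retention that keeps a webapp's classes reachable after it is undeployed.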

If you use ThreadLocal to store an object instance, there is a high risk that the stored object is never garbage collected when your app runs inside an app server such as WebLogic Server, which manages a pool of worker threads, even when the class that created the ThreadLocal instance is garbage collected.

Josh Bloch (co-author of java.lang.ThreadLocal along with Doug Lea) wrote:
"The use of thread pools demands extreme care. Sloppy use of thread pools in combination with sloppy use of thread locals can cause unintended object retention, as has been noted in many places."

People were complaining about the bad interaction of ThreadLocal with thread pools even then. But Josh did sanction:
"Per-thread instances for performance. Aaron's SimpleDateFormat example (above) is one example of this pattern."
Some Lessons
  1. If you put any kind of objects into any object pool, you must provide a way to remove them 'later'.
  2. If you 'pool' using a ThreadLocal, you have limited options for doing that. Either: a) you know that the Thread(s) where you put values will terminate when your application is finished; OR b) you can later arrange for the same thread that invoked ThreadLocal#set() to invoke ThreadLocal#remove() whenever your application terminates.
  3. As such, your use of ThreadLocal as an object pool is going to exact a heavy price on the design of your application and your class. The benefits don't come for free.
  4. As such, use of ThreadLocal is probably a premature optimization, even though Joshua Bloch urged you to consider it in 'Effective Java'.
