Usage of Singleton pattern in multithreaded applications

August 13, 2019

In this article, we would like to discuss code safety in enterprise solutions that have to run 24/7 with zero downtime and no crashes. There are many best practices for writing safe code, but here we want to look at one specific source of intermittent, hard-to-reproduce bugs in multithreaded solutions. Every software developer knows the Singleton pattern, and many use it often. Not everybody knows, however, which problems a singleton can cause in a multithreaded application. The traditional implementation of the singleton is based on making a pointer point to a new object the first time the object is requested.
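As a point of reference, the traditional implementation might look like the sketch below. The names mInstance and Instance() follow the names used in this article; the rest of the class layout (deleted copy operations, defaulted private constructor) is an assumption made to keep the example self-contained.

```cpp
// Classic lazy singleton. NOT thread-safe: see the discussion that follows.
class Singleton {
public:
    // Returns the single instance, creating it on the first request.
    static Singleton* Instance() {
        if (mInstance == nullptr) {       // test: has the object been created?
            mInstance = new Singleton();  // create: first request only
        }
        return mInstance;
    }

    Singleton(const Singleton&) = delete;
    Singleton& operator=(const Singleton&) = delete;

private:
    Singleton() = default;        // only Instance() may construct
    static Singleton* mInstance;  // null until the first request
};

Singleton* Singleton::mInstance = nullptr;
```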

In a single-threaded application, this generally works fine. But the world is not without its imperfections, and real production solutions can rarely stay single-threaded. Once a singleton instance can be accessed from several threads, its construction becomes a problem. If one thread is inside Singleton::Instance() when it is interrupted, and another thread then calls Singleton::Instance(), you can see how you’d get into trouble.

Let’s say that Thread A enters the instance function, executes the test of mInstance, and is then suspended. At the point where it is suspended, it has just determined that mInstance is null, which means that no Singleton object has been created yet. Thread B now enters the instance function and executes the same test. It sees that mInstance is null, so it proceeds to the next line and creates a Singleton for mInstance to point to. It then returns mInstance to the caller. At some point later, Thread A is allowed to continue running, and the first thing it does is execute that same creation line, where it creates another Singleton object and makes mInstance point to it. It should be clear that this violates the meaning of a singleton: there are now two Singleton objects, and the one created by Thread B is no longer reachable through mInstance, so it leaks.

Making the classic Singleton implementation thread-safe seems easy: we could just acquire a lock before testing mInstance.
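A minimal sketch of that idea, assuming std::mutex and std::lock_guard from C++11 as the lock (the article does not name a particular lock type, and mMutex is a name chosen here for illustration):

```cpp
#include <mutex>

class Singleton {
public:
    static Singleton* Instance() {
        // The lock is taken on every call, even long after construction.
        std::lock_guard<std::mutex> lock(mMutex);
        if (mInstance == nullptr) {
            mInstance = new Singleton();
        }
        return mInstance;
    }

    Singleton(const Singleton&) = delete;
    Singleton& operator=(const Singleton&) = delete;

private:
    Singleton() = default;
    static Singleton* mInstance;
    static std::mutex mMutex;  // hypothetical name; guards mInstance
};

Singleton* Singleton::mInstance = nullptr;
std::mutex Singleton::mMutex;
```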

This looks good and safe, but it can be expensive. Every call to Singleton::Instance() acquires the lock, although the lock is really needed only once, while the singleton instance is being constructed. Why should we pay for every later lock acquisition if we know the lock is needed only for the first call?

There is another solution to this problem called the Double-Checked Locking Pattern (DCLP).
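In its classic form, DCLP might be sketched as follows. This assumes the same hypothetical std::mutex member as a stand-in for the lock, and it is shown only to illustrate the pattern: with a plain pointer it is not safe to call from several threads, for the reasons discussed in the rest of this article.

```cpp
#include <mutex>

class Singleton {
public:
    static Singleton* Instance() {
        if (mInstance == nullptr) {                    // first test, no lock
            std::lock_guard<std::mutex> lock(mMutex);  // lock only if needed
            if (mInstance == nullptr) {                // second test, locked
                mInstance = new Singleton();
            }
        }
        return mInstance;
    }

    Singleton(const Singleton&) = delete;
    Singleton& operator=(const Singleton&) = delete;

private:
    Singleton() = default;
    static Singleton* mInstance;  // plain pointer: this is the weak spot
    static std::mutex mMutex;     // hypothetical name for the lock
};

Singleton* Singleton::mInstance = nullptr;
std::mutex Singleton::mMutex;
```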

With this approach we test mInstance for NULL twice: once before acquiring the lock and once after. The lock is acquired only if mInstance has not yet been initialized, and then the test is performed again to make sure that mInstance is still NULL. The second test is necessary because, as described above, another thread may have initialized mInstance between the first test and the moment the lock was acquired.

The papers defining DCLP discuss some implementation issues (e.g., the importance of volatile-qualifying the singleton pointer and the impact of separate caches on multiprocessor systems, both of which we will address in another article, as well as the need to ensure the atomicity of certain reads and writes, which we do not discuss in this article), but they fail to consider a much more fundamental problem: ensuring that the machine instructions executed during DCLP are executed in an acceptable order. It is this fundamental problem we focus on here.

Let’s see what happens when the instance is created by a statement such as mInstance = new Singleton. This statement causes three things to happen:

1. Allocate memory to hold the Singleton object
2. Construct the Singleton object in the allocated memory
3. Make mInstance point to the allocated memory

Of critical importance is the observation that compilers are not constrained to perform these steps in this order! In particular, compilers are sometimes allowed to swap steps 2 and 3. Why they might want to do that is a question we’ll address in a moment. For now, let’s focus on what happens if they do. Consider code in which mInstance’s initialization line is expanded into the three constituent tasks mentioned above, with steps 1 (memory allocation) and 3 (mInstance assignment) merged into a single statement that precedes step 2 (Singleton construction). The idea is not that a human would write this code. Rather, it’s that a compiler might generate code equivalent to this in response to the conventional DCLP source code that a human would write.
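Written out by hand, such an expansion might look like the sketch below. It uses a raw ::operator new call for the allocation and placement new for the construction; the std::mutex member is again an assumed stand-in for the lock.

```cpp
#include <mutex>
#include <new>

class Singleton {
public:
    static Singleton* Instance() {
        if (mInstance == nullptr) {
            std::lock_guard<std::mutex> lock(mMutex);
            if (mInstance == nullptr) {
                // Steps 1 and 3: allocate raw memory and publish the pointer.
                mInstance = static_cast<Singleton*>(
                    ::operator new(sizeof(Singleton)));
                // Step 2: construct the object in the memory already published.
                new (mInstance) Singleton();
            }
        }
        return mInstance;
    }

    Singleton(const Singleton&) = delete;
    Singleton& operator=(const Singleton&) = delete;

private:
    Singleton() = default;
    static Singleton* mInstance;
    static std::mutex mMutex;  // hypothetical name for the lock
};

Singleton* Singleton::mInstance = nullptr;
std::mutex Singleton::mMutex;
```

Between the two statements inside the inner test, mInstance is already non-null while the object it points to does not exist yet.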

In general, this is not a valid translation of the original DCLP source code, because the Singleton constructor called in step 2 might throw an exception, and if an exception is thrown, it’s important that mInstance not yet have been modified. That’s why, in general, compilers cannot move step 3 above step 2. However, there are conditions under which this transformation is legitimate. Perhaps the simplest condition is when a compiler can prove that the Singleton constructor cannot throw (e.g., via post-inlining flow analysis), but that is not the only condition. Some constructors that throw can also have their instructions reordered such that this problem arises. Given the above translation, consider the following sequence of events:

Thread A enters the method, performs the first test of mInstance, acquires the lock, and executes the statement made up of steps 1 and 3. It is then suspended. At this point mInstance is non-null, but no Singleton object has yet been constructed in the memory mInstance points to.
Thread B enters the method, determines that mInstance is non-null, and returns it to the caller. The caller then dereferences the pointer to access the Singleton that, oops, has not yet been constructed.

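DCLP will work only if steps 1 and 2 are completed before step 3 is performed, but a plain pointer and a lock give us no way to express this constraint in C or C++. Atomic operations with explicit memory ordering, which can express it, only became part of the language standards with C11 and C++11.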
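Since C++11, the required ordering can be stated with std::atomic. The following is a minimal sketch under that assumption, not part of the original DCLP papers: the release store guarantees that construction (steps 1 and 2) is visible to any thread whose acquire load sees the published pointer (step 3).

```cpp
#include <atomic>
#include <mutex>

class Singleton {
public:
    static Singleton* Instance() {
        // Acquire: if we see a non-null pointer, we also see the
        // fully constructed object it points to.
        Singleton* p = mInstance.load(std::memory_order_acquire);
        if (p == nullptr) {
            std::lock_guard<std::mutex> lock(mMutex);
            // Relaxed is enough here: the mutex already orders this load.
            p = mInstance.load(std::memory_order_relaxed);
            if (p == nullptr) {
                p = new Singleton();  // steps 1 and 2
                // Release: publish only after construction is complete.
                mInstance.store(p, std::memory_order_release);  // step 3
            }
        }
        return p;
    }

    Singleton(const Singleton&) = delete;
    Singleton& operator=(const Singleton&) = delete;

private:
    Singleton() = default;
    static std::atomic<Singleton*> mInstance;
    static std::mutex mMutex;  // hypothetical name for the lock
};

std::atomic<Singleton*> Singleton::mInstance{nullptr};
std::mutex Singleton::mMutex;
```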

As we can see, various scenarios can cause problems even with something as seemingly obvious as the Singleton pattern. Most of these scenarios may never show up in real life, but there is a chance that they will happen in your solution. And when they do, the issue is really hard to catch, because it occurs rarely or depends on the hardware configuration and similar factors. One workaround that avoids a race during Singleton initialization in a multithreaded application is simply to call Singleton::Instance() in the application’s main() function before any threads have been created.
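That workaround could be sketched as follows, using the classic unsynchronized singleton. StartApplication is a hypothetical stand-in for the body of main(), and std::thread is assumed as the threading facility; the counter exists only so the effect can be checked.

```cpp
#include <atomic>
#include <thread>
#include <vector>

class Singleton {
public:
    static Singleton* Instance() {
        if (mInstance == nullptr) {
            mInstance = new Singleton();
        }
        return mInstance;
    }

    Singleton(const Singleton&) = delete;
    Singleton& operator=(const Singleton&) = delete;

private:
    Singleton() = default;
    static Singleton* mInstance;
};

Singleton* Singleton::mInstance = nullptr;

// Hypothetical stand-in for main(). Returns how many workers
// observed the instance that was built before they started.
int StartApplication(int workerCount) {
    // Force construction while the program is still single-threaded.
    Singleton* const early = Singleton::Instance();

    std::atomic<int> matches{0};
    std::vector<std::thread> workers;
    for (int i = 0; i < workerCount; ++i) {
        workers.emplace_back([early, &matches] {
            // Workers only read mInstance; nobody writes it anymore.
            if (Singleton::Instance() == early) {
                ++matches;
            }
        });
    }
    for (std::thread& worker : workers) {
        worker.join();
    }
    return matches.load();
}
```

Because every worker thread starts after the instance exists, each call to Instance() in a worker only reads the pointer, so no lock is needed on that path.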