c++ - pointer - std::atomic_bool




C++:std::atomic<bool> and volatile bool (2)

A "classic" bool, as you put it, would not work reliably (if at all). One reason for this is that the compiler could (and most likely does, at least with optimizations enabled) load data_ready only once from memory, because there is no indication that it ever changes in the context of reader_thread.

You could work around this problem by using volatile bool to enforce loading it every time (which would probably seem to work) but this would still be undefined behavior regarding the C++ standard because the access to the variable is neither synchronized nor atomic.

You could enforce synchronization using the locking facilities from the mutex header, but this would introduce (in your example) unnecessary overhead (hence std::atomic).


The problem with volatile is that it only guarantees that instructions are not omitted and the instruction ordering is preserved. volatile does not guarantee a memory barrier to enforce cache coherence. What this means is that writer_thread on processor A can write the value to it's cache (and maybe even to the main memory) without reader_thread on processor B seeing it, because the cache of processor B is not consistent with the cache of processor A. For a more thorough explanation see memory barrier and cache coherence on Wikipedia.


There can be additional problems with more "complex" expressions then x = y (i.e. x += y) that would require synchronization through a lock (or in this simple case an atomic +=) to ensure the value of x does not change during processing.

x += y for example is actually:

  • read x
  • compute x + y
  • write result back to x

If a context switch to another thread occurs during the computation this can result in something like this (2 threads, both doing x += 2; assuming x = 0):

Thread A                 Thread B
------------------------ ------------------------
read x (0)
compute x (0) + 2
                 <context switch>
                         read x (0)
                         compute x (0) + 2
                         write x (2)
                 <context switch>
write x (2)

Now x = 2 even though there were two += 2 computations. This effect is known as tearing.

I'm just reading the C++ concurrency in action book by Anthony Williams. There is this classic example with two threads, one produce data, the other one consumes the data and A.W. wrote that code pretty clear :

std::vector<int> data;
std::atomic<bool> data_ready(false);

void reader_thread()
{
    while(!data_ready.load())
    {
        std::this_thread::sleep(std::milliseconds(1));
    }
    std::cout << "The answer=" << data[0] << "\n";
}

void writer_thread()
{
    data.push_back(42);
    data_ready = true;
}

And I really don't understand why this code differs from one where I'd use a classic volatile bool instead of the atomic one. If someone could open my mind on the subject, I'd be grateful. Thanks.


Ben Voigt's answer is completely correct, still a little theoretical, and as I've been asked by a colleague "what does this mean for me", I decided to try my luck with a little more practical answer.

With your sample, the "simplest" optimization problem that could occur is the following:

According to the Standard, an optimized execution order may not change the functionality of a program. Problem is, this is only true for single threaded programs, or single threads in multithreaded programs.

So, for writer_thread and a (volatile) bool

data.push_back(42);
data_ready = true;

and

data_ready = true;
data.push_back(42);

are equivalent.

The result is, that

std::cout << "The answer=" << data[0] << "\n";

can be executed without having pushed any value into data.

An atomic bool does prevent this kind of optimization, as per definition it may not be reordered. There are flags for atomic operations which allow statements to be moved in front of the operation but not to the back, and vice versa, but those require a really advanced knowledge of your programming structure and the problems it can cause...





multithreading