std::atomic - behaviour of relaxed ordering

Can the following call to print output stale or unintended values?

std::mutex g;
std::atomic<int> seq;
int g_s = 0;
int i = 0, j = 0, k = 0; // ignore the fact that these could easily be made atomic

// Thread 1
void do_work() // seldom called
{
    // avoid over
    std::lock_guard<std::mutex> lock{g};
    i++;
    j++;
    k++;
    seq.fetch_add(1, std::memory_order_relaxed);
}

// Thread 2
void consume_work() // spinning
{
    const auto s = g_s;
    // avoid overhead of constantly acquiring lock
    g_s = seq.load(std::memory_order_relaxed);
    if (s != g_s)
    { 
       // no lock guard
       print(i, j, k);
    }
}

CodePudding user response：

Even ignoring the staleness, this causes a data race and UB.

Thread 2 can read i, j, k while thread 1 is modifying them, because access to those variables is not synchronized. If thread 2 doesn't lock the mutex g, there's no point in locking it in thread 1.
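One way to keep the "don't take the lock unless something changed" idea while removing the race is sketched below. The relaxed load of seq is used only as a cheap hint, and the payload is copied under the same mutex the writer holds. The Snapshot type and the consume_work signature are assumptions made for this illustration, not part of the question's code.

```cpp
#include <atomic>
#include <mutex>

std::mutex g;
std::atomic<int> seq{0};
int i = 0, j = 0, k = 0; // plain ints, only touched while holding g

struct Snapshot { int i, j, k; }; // hypothetical helper type

// Thread 1: same shape as in the question.
void do_work()
{
    std::lock_guard<std::mutex> lock{g};
    ++i;
    ++j;
    ++k;
    seq.fetch_add(1, std::memory_order_relaxed);
}

// Thread 2: the relaxed load only answers "did anything change?".
// The payload itself is copied while holding the writer's mutex,
// so a half-finished update can never be observed.
bool consume_work(int& last_seen, Snapshot& out)
{
    if (seq.load(std::memory_order_relaxed) == last_seen)
        return false; // nothing new, lock not taken

    std::lock_guard<std::mutex> lock{g};
    last_seen = seq.load(std::memory_order_relaxed);
    out = Snapshot{i, j, k};
    return true;
}
```

The reader still pays for the mutex, but only after the counter has moved, which matches the "seldom called" writer in the question.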

CodePudding user response：

Yes, it can.

First of all, the lock guard does not have any effect on your code. A lock has to be used by at least two threads to have any effect.

Thread 2 can read at any moment. It can read an incremented i together with a not-yet-incremented j and k. In theory, it can even read a torn value by reading in between updates to the individual bytes that make up i: for example, an increment from 0xFF to 0x100 could be observed as 0x1FF or 0x0. That does not happen on x86, where aligned int loads and stores happen to be atomic.

CodePudding user response：

What you're trying to do looks like a broken version of a SeqLock. But you're reading the sequence counter twice in a row and then the i,j,k "payload". (Or something? Your writer never touches g_s). Anyway, you're feeding it to print directly, not reading into local temporaries and then re-checking the sequence counter, but the writer could be about to start changing one of those variables at any point.

So this is not at all safe or synchronized. It's obviously C++ undefined behaviour, and in practice, after compiling to machine code for normal CPUs, the writer's updates to some of the non-atomic variables can become visible without the others. If the reader sleeps at some point between reading i and j, it could miss many updates.

A SeqLock is good for cheap reads and occasional writes that make the reader retry. The Q&A "Implementing 64 bit atomic counter with 32 bit atomics" shows appropriate fencing.

It relies on non-atomic reads that may have a data race, but does not use the result if the sequence counter detects tearing. C++ doesn't define the behaviour in that case, but it works in practice on real implementations. (C++ is mostly keeping its options open in case of hardware race detection, which normal CPUs don't do.)
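A minimal single-writer SeqLock along these lines might look like the sketch below. Note one deliberate swap: instead of plain non-atomic payload variables, the payload members are std::atomic<int> accessed with memory_order_relaxed, which keeps the sketch inside defined behaviour while compiling to ordinary loads and stores on common targets. The names SeqLock, Payload, read and write are assumptions for illustration.

```cpp
#include <atomic>

struct Payload { int i, j, k; }; // hypothetical payload type

class SeqLock {
    std::atomic<unsigned> seq_{0};        // even: stable, odd: write in progress
    std::atomic<int> i_{0}, j_{0}, k_{0}; // payload, accessed with relaxed ordering

public:
    // Must only ever be called from ONE writer thread.
    void write(const Payload& p)
    {
        const unsigned s = seq_.load(std::memory_order_relaxed);
        seq_.store(s + 1, std::memory_order_relaxed); // mark write in progress
        // Keep the payload stores from becoming visible before the odd count.
        std::atomic_thread_fence(std::memory_order_release);
        i_.store(p.i, std::memory_order_relaxed);
        j_.store(p.j, std::memory_order_relaxed);
        k_.store(p.k, std::memory_order_relaxed);
        seq_.store(s + 2, std::memory_order_release); // publish, count even again
    }

    // Any number of readers; retries until it gets a consistent copy.
    Payload read() const
    {
        for (;;) {
            const unsigned s1 = seq_.load(std::memory_order_acquire);
            if (s1 & 1u)
                continue; // writer is mid-update, try again

            // Copy into local temporaries first; do not use them yet.
            Payload p{i_.load(std::memory_order_relaxed),
                      j_.load(std::memory_order_relaxed),
                      k_.load(std::memory_order_relaxed)};

            // Keep the payload loads ordered before the re-check of the count.
            std::atomic_thread_fence(std::memory_order_acquire);
            const unsigned s2 = seq_.load(std::memory_order_relaxed);
            if (s1 == s2)
                return p; // no writer interfered, the copy is consistent
        }
    }
};
```

The reader never blocks the writer and never writes to shared memory. The cost is that a reader may have to retry if a write overlaps its copy.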

If you have multiple writers, you'd still use a normal lock to give mutual exclusion between them, or use the sequence counter as a spinlock: a writer acquires it by making the count odd. Otherwise you just need the sequence counter.
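The "sequence counter as a spinlock" variant can be sketched as follows, showing only the writer side. A writer takes ownership by moving the counter from an even value to the next odd one with a compare-exchange, and releases it by storing the next even value. The function names writer_acquire and writer_release are hypothetical, and a real implementation would likely add backoff in the spin loop.

```cpp
#include <atomic>

std::atomic<unsigned> seq{0}; // even: idle, odd: some writer owns the payload

// Spin until this thread moves the counter from even to odd.
// Returns the odd value, which the caller passes to writer_release().
unsigned writer_acquire()
{
    unsigned s = seq.load(std::memory_order_relaxed);
    for (;;) {
        if (s & 1u) { // another writer is active, keep polling
            s = seq.load(std::memory_order_relaxed);
            continue;
        }
        if (seq.compare_exchange_weak(s, s + 1,
                                      std::memory_order_acquire,
                                      std::memory_order_relaxed)) {
            // Keep later payload stores from becoming visible
            // before the odd count does.
            std::atomic_thread_fence(std::memory_order_release);
            return s + 1;
        }
        // On failure compare_exchange_weak reloaded s, so just loop.
    }
}

// Publish the update: the counter becomes even again.
void writer_release(unsigned odd)
{
    seq.store(odd + 1, std::memory_order_release);
}
```

A writer would call writer_acquire(), update the payload, then call writer_release() with the returned value. Readers follow the usual retry protocol of checking the counter before and after copying the payload.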