How does a compare and swap loop achieve atomicity?

This page talks about a CAS loop in detail: https://preshing.com/20150402/you-can-do-any-kind-of-atomic-read-modify-write-operation/

An example of fetch_multiply in C++:

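#include <atomic>
#include <cstdint>

uint32_t fetch_multiply(std::atomic<uint32_t>& shared, uint32_t multiplier){
    uint32_t oldValue = shared.load();
    // On failure, compare_exchange_weak reloads oldValue with the current value of shared.
    while (!shared.compare_exchange_weak(oldValue, oldValue * multiplier)){}
    return oldValue;
}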

Essentially, if the *memory value matches our oldValue, then newValue gets stored atomically; otherwise oldValue gets updated with *memory.
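The two outcomes described above can be observed on a single thread. The following is a minimal sketch, not taken from the linked article: `CasResult` and `try_cas` are hypothetical names introduced only for illustration, and `compare_exchange_strong` is used instead of the weak variant so that the demonstration cannot fail spuriously.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper type: records what a single CAS attempt did.
struct CasResult {
    bool swapped;       // true if `desired` was stored
    uint32_t expected;  // value of `expected` after the call
    uint32_t memory;    // value of the atomic after the call
};

// Hypothetical helper: performs exactly one CAS attempt on `shared`.
CasResult try_cas(std::atomic<uint32_t>& shared, uint32_t expected, uint32_t desired) {
    // On success `desired` is stored and `expected` is left untouched.
    // On failure nothing is stored and `expected` is overwritten with
    // the value currently held by `shared`.
    bool ok = shared.compare_exchange_strong(expected, desired);
    return {ok, expected, shared.load()};
}
```

Starting from a value of 5, `try_cas(s, 5, 10)` swaps and leaves 10 in memory; a second `try_cas(s, 5, 20)` fails, leaves memory at 10, and reports 10 as the updated expected value.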

I have 2 questions:

1 - Why do we have to check if oldValue is still unchanged in memory? What happens if we just write newValue to memory? Are we trying to avoid overwriting or using an intermediate value from another thread?

2 - Suppose this scenario with 2 threads:

Thread B is trying to store a non-aligned value non-atomically. Store tearing occurs.
Thread A attempts a swap.
The swap fails as oldValue doesn't match. An intermediate (torn) value from memory gets loaded into our oldValue.
Thread A multiplies the intermediate value and attempts another swap, which succeeds.
Now Thread B writes the remainder of its value to the same location, partially overwriting our previous write.

I'm assuming Thread B could operate with that much delay. If so, not only did we multiply an intermediate value, the result even got partially overwritten afterwards, and the CAS did nothing to prevent it.

CodePudding user response:

I first convinced myself that the code is wrong and should look like this:

uint32_t fetch_multiply(std::atomic<uint32_t>& shared, uint32_t multiplier){
    uint32_t oldValue;
    do {
        oldValue = shared.load();
    } while (!shared.compare_exchange_weak(oldValue, oldValue * multiplier));
    return oldValue;
}

The explicit reload turns out to be unnecessary, though. When compare_exchange_weak fails, it writes the current value of shared into oldValue, so the original loop already retries with a fresh value and does not spin forever. Both versions are correct; the original just avoids a redundant load.
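A CAS loop like the ones above can be exercised from several threads to check that no multiplication is lost. This is a minimal sketch using the question's fetch_multiply; the driver `run_multipliers` is a hypothetical name added for illustration, and the choice of multiplying by 3 is arbitrary.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>
#include <vector>

uint32_t fetch_multiply(std::atomic<uint32_t>& shared, uint32_t multiplier) {
    uint32_t oldValue = shared.load();
    // Each failed attempt refreshes oldValue, so the product is always
    // recomputed from the value that is actually in memory.
    while (!shared.compare_exchange_weak(oldValue, oldValue * multiplier)) {}
    return oldValue;
}

// Hypothetical driver: starts `threads` threads that each multiply a
// shared counter (initially 1) by 3 exactly once, then returns the result.
uint32_t run_multipliers(unsigned threads) {
    std::atomic<uint32_t> shared{1};
    std::vector<std::thread> pool;
    for (unsigned i = 0; i < threads; ++i) {
        pool.emplace_back([&shared] { fetch_multiply(shared, 3); });
    }
    for (auto& t : pool) {
        t.join();
    }
    return shared.load();
}
```

With 8 threads the result is 3 to the power 8, which is 6561, regardless of how the threads interleave, because every successful swap multiplies the value that was in memory at that instant.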

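However, the point of the CAS construction is that you can never observe an intermediate value in the shared location. Every access to a std::atomic goes through an atomic operation, so shared.load() cannot return a torn value, and Thread B has no way to do a plain, non-atomic store through the std::atomic interface. When the type is lock-free, this is implemented in hardware.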

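"What happens if we just write newValue to memory?" Then the read-modify-write is no longer atomic: another thread can change shared between your load and your store, and your store silently overwrites its update. Always follow the pattern.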
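To make the difference concrete, the sketch below replays one fixed interleaving on a single thread. It is an illustrative simulation rather than real concurrency, and the function names and the values 2, 3 and 5 are assumptions chosen for the example: "thread A" reads the value, "thread B" multiplies it by 3 in between, and then A multiplies by 5.

```cpp
#include <atomic>
#include <cstdint>

// A blindly stores its result: B's update is lost.
uint32_t blind_store_interleaving() {
    std::atomic<uint32_t> shared{2};
    uint32_t seenByA = shared.load();  // A reads 2
    shared.store(shared.load() * 3);   // B runs in between: 2 -> 6
    shared.store(seenByA * 5);         // A writes 2 * 5, overwriting B's 6
    return shared.load();
}

// A uses a CAS loop: the stale read is detected and the multiply is redone.
uint32_t cas_interleaving() {
    std::atomic<uint32_t> shared{2};
    uint32_t seenByA = shared.load();  // A reads 2
    shared.store(shared.load() * 3);   // B runs in between: 2 -> 6
    // The attempt based on the stale 2 fails and sets seenByA to 6;
    // the retry then stores 6 * 5.
    while (!shared.compare_exchange_weak(seenByA, seenByA * 5)) {}
    return shared.load();
}
```

The blind store ends at 10, as if B had never run, while the CAS loop ends at 30, which reflects both multiplications.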

"non-aligned value": if shared is not aligned, you have already introduced undefined behavior into your code, even before std::atomic comes into it. Non-aligned pointers cannot be safely dereferenced. With a normal pointer you have merely taken a dependency on an architecture that tolerates unaligned access, but this is a std::atomic. If it's not aligned, you can fault even on x86.
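A std::atomic object that is declared normally is aligned by the compiler, so misalignment can only come from casting raw memory to an atomic type. The sketch below shows one way to check an address before doing anything like that; `aligned_for_atomic_u32` is a hypothetical helper, and it assumes the platform provides std::uintptr_t.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper: true if `p` satisfies the alignment that this
// implementation requires for std::atomic<uint32_t>.
bool aligned_for_atomic_u32(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(std::atomic<uint32_t>) == 0;
}
```

Passing the address of any ordinary `std::atomic<uint32_t>` variable returns true. An address one byte into an aligned buffer returns false on implementations where the required alignment is greater than 1, which is the case on mainstream platforms.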