Why is *ptr = (*ptr)++ Undefined Behavior

I am trying to learn how to explain the cause (if any) of undefined behavior in the following cases.

int i = 0, *ptr = &i;
i = ++i; // is this UB? If yes, then why, according to C++11?
*ptr = (*ptr)++; // I think this is UB, but I am unable to explain exactly why
*ptr = ++(*ptr); // I think this is not UB, but I can't explain why

I have looked at many SO posts describing UB for pointer cases similar to the ones above, but I am still unable to explain exactly why they result in UB (that is, which points from the standard prove that they do).

I am looking for explanations according to C++11 (or C++14), but not C++17 and not pre-C++11.

CodePudding user response：

Undefined behavior stems from this:

C++11 [intro.execution]/15 Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced... If a side effect on a scalar object is unsequenced relative to either another side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined.

C++17 [intro.execution]/17 Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced... If a side effect on a memory location (4.4) is unsequenced relative to either another side effect on the same memory location or a value computation using the value of any object in the same memory location, and they are not potentially concurrent (4.7), the behavior is undefined.

The two texts are similar. The main difference lies in the "except where noted" part: in C++17, the order of evaluation of operands is specified for more operators than in C++11. Thus:

C++17 [expr.ass]/1 In all cases, the assignment is sequenced after the value computation of the right and left operands, and before the value computation of the assignment expression. The right operand is sequenced before the left operand.

C++11 lacks the final sentence ("The right operand is sequenced before the left operand"). That sentence is what makes i = i++ well-defined in C++17, but undefined in C++11. The reason is that for postfix increment, the side effect is not part of the value computation of the expression:

C++11 and C++17 [expr.post.incr]/1 The value computation of the expression is sequenced before the modification of the operand object.

So "the assignment is sequenced after the value computation of the right and left operands" is not by itself sufficient: the assignment is sequenced after the value computation of i++, and the side effect of the increment is also sequenced after that same value computation, but nothing says how the two side effects are sequenced relative to each other. Therefore they are unsequenced, and both modify the same object (here, i). This is undefined behavior.
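To make the ambiguity concrete, here is a minimal sketch that models the two side effects of i = i++ as separate, explicitly ordered stores. The function name and the Order enum are invented for illustration and appear nowhere in the standard. Real undefined behavior is not limited to these two outcomes; the model only shows why the C++11 wording cannot single out one result.

```cpp
#include <cassert>

// Hypothetical model: the two possible orders of the two stores in `i = i++`.
enum class Order { IncrementThenAssign, AssignThenIncrement };

// Returns the final value of i under the chosen ordering of the two stores.
int model_assign_post_increment(int i, Order order) {
    const int old_value = i;               // value computation of i++ (happens first)
    const int incremented = old_value + 1; // value the increment will store
    if (order == Order::IncrementThenAssign) {
        i = incremented;                   // side effect of i++
        i = old_value;                     // side effect of the assignment
    } else {
        i = old_value;                     // side effect of the assignment
        i = incremented;                   // side effect of i++
    }
    return i;
}
```

Starting from 0, the first ordering leaves i at 0 and the second leaves it at 1, so the final value depends entirely on an order that C++11 does not specify.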

The addition of "the right operand is sequenced before the left operand" in C++17 means that the side effect of i++ is sequenced before the value computation of i, and both are sequenced before the assignment.

On the other hand, for pre-increment the side effect is necessarily part of the evaluation of the expression:

C++11 and C++17 [expr.pre.incr]/1 ... The result is the updated operand; it is an lvalue ...

So the value computation of ++i involves incrementing i first, and then applying an lvalue-to-rvalue conversion to obtain the updated value. This value computation is sequenced before the assignment in both C++11 and C++17, and so the two side effects on i are sequenced relative to each other; no undefined behavior.

Nothing changes in this analysis if i is replaced with (*ptr). That's just another way to refer to the same object or memory location.
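As a small sketch of the well-defined case, the two functions below (names invented for illustration) use the pre-increment form, once directly and once through a pointer to the same object. This assumes the code is compiled as C++11 or later; some compilers may still emit a sequence-point warning for it.

```cpp
#include <cassert>

// Well-defined in C++11 and later: the increment is part of the value
// computation of ++i, which is sequenced before the assignment.
int assign_pre_increment(int start) {
    int i = start;
    i = ++i;
    return i;
}

// The same expression through a pointer that refers to the same object.
int assign_pre_increment_via_pointer(int start) {
    int i = start;
    int* ptr = &i;
    *ptr = ++(*ptr);
    return i;
}
```

Both functions return start + 1, because the increment is complete before the assignment stores the updated value.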

CodePudding user response：

The C++ Standard is based upon the C Standard, whose authors didn't need any particular "reason" to say that implementations may process a construct in whatever fashion would be most useful to their customers [which is what they intended the phrase "Undefined Behavior" to mean]. Many platforms can cheaply guarantee, for small primitive types, that race conditions involving a read and a conflicting write to the same object will always yield either old or new data, and that race conditions involving conflicting writes will result in every individual subsequent read seeing one of the written values. Rather than trying to identify all of the cases where implementations should or should not be expected to uphold such guarantees, the Standard allows implementations to, at their leisure, process code "in a documented manner characteristic of the environment". Because it's not practical for all implementations to offer such guarantees in all possible scenarios, and because the range of scenarios where such guarantees would be practical differs between platforms, the authors of the Standard allowed implementations to weigh the pros and cons of offering various behavioral guarantees on their particular target platforms, rather than trying to write precise rules that would be appropriate for all possible implementations.

Note also that if one were to do something like:

*p = (*q)++;
return q[0] + q[i]; // where 'i' is some object of type `int`.

when p and q are equal and i is zero, a compiler might quite plausibly generate code where the assignment would undo the effect of the increment, but which would return the sum of the old value of *q, plus 1, plus the actual stored value of *q (which would be the old value, rather than the incremented value). Although this would be a logical consequence of the specified race-condition semantics, trying to specify it precisely would have been sufficiently awkward that the Standard simply allows implementations to specify the behavior as tightly or loosely as they see fit.
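The scenario can be written out as explicit, well-defined steps. The sketch below is a hypothetical model of such code generation, not the output of any particular compiler, and the function name is invented for illustration. It assumes the compiler keeps the incremented value in a register for q[0] but reloads q[i] from memory.

```cpp
#include <cassert>

// Hypothetical model of plausible generated code for:
//     *p = (*q)++;
//     return q[0] + q[i];
int model_plausible_codegen(int* p, int* q, int i) {
    const int old_value = *q;               // load for (*q)++
    const int incremented = old_value + 1;
    *q = incremented;                       // store from the increment
    *p = old_value;                         // store from the assignment; undoes the increment if p == q
    // q[0] is taken from the cached incremented value rather than reloaded,
    // while q[i] is read from memory because i is not known in advance.
    return incremented + q[i];
}
```

With p and q both pointing at an int holding 5 and i equal to 0, this model returns 11 (that is, 6 + 5) and leaves 5 in the object, which matches the outcome described above.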