I can understand that:
- One of the origins of UB is a performance increase (e.g. by removing code that is never executed, such as if (i + 1 < i) { /* never_executed_code */ }).
- UB can be triggered at compile time because C does not clearly distinguish between compile time and run time. The "whole language is based on the (rather unhelpful) concept of an 'abstract machine'" (link).
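To make the first bullet concrete, here is a minimal sketch of the overflow check it mentions, next to a well-defined alternative. The function names next_wraps_ub and next_wraps_safe are made up for illustration, and whether a given compiler actually removes the first check depends on the compiler and its optimization settings.

```c
#include <limits.h>
#include <stdbool.h>

/* Relies on signed overflow: when i == INT_MAX, evaluating i + 1 is
   undefined behavior, so an optimizer is allowed to assume the
   comparison is always false and drop any code guarded by it. */
static bool next_wraps_ub(int i) {
    return i + 1 < i;
}

/* Well-defined equivalent: compare against the limit instead of
   performing the addition that might overflow. */
static bool next_wraps_safe(int i) {
    return i == INT_MAX;
}
```

For any i below INT_MAX both functions return false; they can only disagree in the one case where the first version has no defined behavior at all.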
However, I still cannot understand why the C preprocessor is subject to undefined behavior. It is known that preprocessing directives are executed at compile time.
Consider C11, 6.10.3.3 The ## operator, 3:
If the result is not a valid preprocessing token, the behavior is undefined.
Why not make it a constraint? For example:
The result shall be a valid preprocessing token.
The same question goes for all the other occurrences of "the behavior is undefined" in 6.10 Preprocessing directives.
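For reference, a small sketch of the ## rule quoted above. CAT and paste_demo are hypothetical names chosen for this example. The two active pastes produce valid preprocessing tokens; the undefined case is left in a comment so that the snippet still compiles.

```c
/* Hypothetical helper macro for illustration. */
#define CAT(a, b) a ## b

static int paste_demo(void) {
    int x1 = 40;
    int n = 1;

    /* x ## 1 forms the identifier x1, a valid preprocessing token. */
    int a = CAT(x, 1);

    /* + ## + forms the ++ punctuator, also a valid token. */
    CAT(+, +) n;            /* expands to: ++ n; so n becomes 2 */

    /* CAT(+, -) would form "+-", which is not a single preprocessing
       token, so 6.10.3.3p3 makes the behavior undefined. GCC rejects
       it with an error, but the standard requires no diagnostic. */

    return a + n;           /* 40 + 2 */
}
```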
CodePudding user response:
Why is the C preprocessor a subject of undefined behavior?
When the C standard was created, there were some existing C preprocessors and there was some imaginary ideal C preprocessor in the minds of standardization committee members.
So there were gray areas where committee members weren't completely sure what they wanted to do, and/or where existing C preprocessor implementations differed from each other in behavior.
So these cases were left without defined behavior: because the C committee members were not completely sure what the behavior actually should be, there is no requirement on what it should be.
One of the origins of the UB
Yes, one of.
UB may exist to make the language easier to implement. For example, in the case of the preprocessor, preprocessor writers don't have to care about what happens when the result of ## is an invalid preprocessing token.
Or UB may exist to reconcile existing implementations with different behaviors, or as a point for extensions. So a preprocessor that segfaults in case of UB, a preprocessor that accepts and works in case of UB, and a preprocessor that formats your hard drive in case of UB can all be standard-conforming (but I wouldn't want to work on the one that formats your drive).
CodePudding user response:
Without getting into specifics, my guess is, there exist several preprocessor implementations which have bugs, but the Standard doesn't want to declare them non-conforming, for compatibility reasons.
In human language: if you write a program which has X in it, the preprocessor does weird stuff.
In standardese: the behavior of a program with X is undefined.
If the standard says something like "The result shall be a valid preprocessing token", it might be unclear what "shall" means in this context.
- The programmer shall write the program so that this condition holds? If so, the wording with "undefined behavior" is clearer and more uniform (it appears in other places too).
- The preprocessor shall make sure this condition holds? If so, this requires dedicated logic which checks the condition, which may be impractical to implement.
