Consider the following function:
void doMath(const std::vector<double> &v)
{
    double a = v[0];
    double b = v[1];
    double c = v[2];
    double d = v[3];
    double e = v[4];
    double f = v[5];
    // do math
}
If I have complex equations, giving the values short names often makes typing the equations much easier (fewer typos, faster typing, more readable).
- I know the computational cost is minor, but is there any additional computational cost compared to using v[0] directly in the operations instead of a = v[0]? For example, does the compiler know the assignment isn't necessary and optimize it away during compilation?
- Does the data type make a difference? I imagine that for more complex data types it does (right?). For example, if the input is a std::vector<std::vector<double>>, and I create std::vector<double> objects and assign each element of the vector of vectors to a separate vector, would the compiler optimize this?
CodePudding user response:
No, not in general. It could change the behavior of the function.
double a = v[0];
double b = v[1];
double c = v[2];
double d = v[3];
double e = v[4];
double f = v[5];
Each of a through f is a copy of the state of v[i] at the time the copy was made. The copy doesn't actually have to be made, but the compiler has to guarantee that if v[i] changes after that point, the change is not reflected in the a through f variables.
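To make the snapshot behaviour concrete, here is a minimal sketch. The helper name snapshotThenModify and the extra non-const parameter are my own assumptions for illustration; they are not part of the question's code.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: copies v[0] into a local, then changes the vector
// through a non-const reference, and returns the local.
double snapshotThenModify(std::vector<double>& storage)
{
    assert(!storage.empty());
    const std::vector<double>& v = storage; // read-only view, as in doMath
    double a = v[0];    // a is a copy of v[0] as it is right now
    storage[0] = 99.0;  // v[0] changes after the copy was made
    return a;           // a still holds the old value
}
```

Whatever the optimizer does with a, the value returned has to be the one v[0] had when the copy was taken.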
C++ compilers use techniques like "static single assignment" to break your code down before optimizing it. I mean, if you do this:
double a = v[0];
a = a + v[1];
a compiler might very well have two names for a:
double a' = v[0];
double a'' = a' + v[1];
Here, each named value is assigned to only once. The compiler then unravels your code to work with these "immutable" versions of your local variables.
This doesn't work with values passed by reference or pointer. There, the compiler has to either assume the reference/pointer is followed each time, or it has to prove that the indirection results in the exact same value at two adjacent lines.
double a = v[0];
double b = v[0];
the compiler is likely to be able to prove that a and b have the same value. So it can skip reading from the reference v and from its internal pointer to the buffer twice.
double a = v[0];
some_function_pointer();
double b = v[0];
in this case, the compiler cannot prove that a and b are the same value unless it works out what the some_function_pointer function does. Lacking that ability, it must independently calculate a and b.
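A small sketch of why the second read cannot simply be reused. The names Callback, bump and readTwice are hypothetical, and unlike some_function_pointer above, the callback here takes the vector as an argument so the example is self-contained.

```cpp
#include <cassert>
#include <vector>

// Hypothetical callback type: it receives a non-const reference to the data.
using Callback = void (*)(std::vector<double>&);

// One possible callback: modifies element 0 in place.
void bump(std::vector<double>& data) { data[0] += 1.0; }

// Reads v[0], calls the callback, reads v[0] again, returns the difference.
double readTwice(std::vector<double>& storage, Callback cb)
{
    assert(!storage.empty() && cb != nullptr);
    const std::vector<double>& v = storage; // const view of the same vector
    double a = v[0];
    cb(storage);      // opaque call: may change the vector's contents
    double b = v[0];  // has to be read again unless the call is proven harmless
    return b - a;
}
```

With bump as the callback, b differs from a even though v is a const reference. That is the situation the compiler has to allow for when it cannot see into the call.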
When you do
double a = v[0];
double b = v[1];
double c = v[2];
double d = v[3];
double e = v[4];
double f = v[5];
you "lock" the values of the vector you are reading, taking a single snapshot. So code afterwards that uses a through f may generate different assembly than code that uses v[0] through v[5], simply because it is easier for the compiler to prove that the vector and its contents were not modified.
double const& a = v[0];
double const& b = v[1];
double const& c = v[2];
double const& d = v[3];
double const& e = v[4];
double const& f = v[5];
this ends up being a bit closer to using v[0] at the point of use. Here, changes to the vector's buffer should be reflected in a through f, but changes to the vector itself (for example, a reallocation that moves the buffer) should not be (!).
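For contrast with the copying version, a minimal sketch of the reference case. The helper name aliasThenModify is an assumption made for illustration only.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: binds a const reference to v[0], then writes to the
// same element through a non-const reference, and returns what a refers to.
double aliasThenModify(std::vector<double>& storage)
{
    assert(!storage.empty());
    const std::vector<double>& v = storage;
    double const& a = v[0]; // a names the element itself; nothing is copied
    storage[0] = 99.0;      // in-place write into the same buffer
    return a;               // reads the new value through the reference
    // Anything that reallocates the buffer (e.g. push_back past capacity)
    // would invalidate a, so this sketch deliberately avoids doing that.
}
```

Here the write is visible through a, which is exactly what the copying version rules out.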
While you passed the vector by const&, that only promises that you won't modify v without a const_cast; it does not promise that the vector will not be modified while your function is running. Anyone holding a non-const reference to the vector (say, a callback you call, a system function, or anything else the compiler cannot inline) is free to modify the vector or its buffer.
As for more complex data types, a copy of a double is a noop unless the copy is modified differently than the source.
A copy of a vector<double> is not a noop in practice. A copy of a vector<double> requests an allocation of a new buffer.
A compiler is free to notice that both the copy and the source are used only in an immutable manner and eliminate the copy, but that elimination is much harder than eliminating a mere double. And if the elimination does not occur, the copy is more expensive.
A copy of a double is done every time you add doubles, in order to put it into a register. So a single "needless" copy isn't a real cost; this copy can be the same copy as the one you do when you add it to another number, as compilers are great at eliminating duplicates like this.
A copy of a vector isn't done when you do most operations on it. So a needless copy can have significant performance effects.
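For the vector-of-vectors case from the question, a rough sketch of the difference between copying a row and binding a reference to it. Matrix, copySharesBuffer and referenceSharesBuffer are hypothetical names introduced here.

```cpp
#include <cassert>
#include <vector>

using Matrix = std::vector<std::vector<double>>; // hypothetical alias

// Copying a row creates a new vector with its own buffer.
bool copySharesBuffer(const Matrix& m)
{
    assert(!m.empty() && !m[0].empty());
    std::vector<double> row = m[0];   // allocates and copies every element
    return row.data() == m[0].data(); // false: two distinct buffers
}

// Binding a const reference only gives the existing row a shorter name.
bool referenceSharesBuffer(const Matrix& m)
{
    assert(!m.empty() && !m[0].empty());
    const std::vector<double>& row = m[0]; // no allocation, no copy
    return row.data() == m[0].data();      // true: same buffer
}
```

If the goal is only a shorter name for m[0], a const reference gives you that without depending on the optimizer to remove an allocation.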
The elimination of the extra doubles is a matter of "as-if" optimization. The extra doubles really exist in the abstract machine whose behaviour C++ specifies.
But doubles don't have much in the way of side effects to their existence. As C++ specifies the behaviour of the abstract machine and only requires that the concrete machine code be consistent with it, removing things that exist in the abstract machine is perfectly legal, so long as the program behaves as if they existed (up to undefined behaviour).
The same would be true of temporary vectors, except that vectors have complex side effects in their existence, as they do things like allocate memory. Modern C++ compilers are free to eliminate needless memory allocations, but it is harder than eliminating a mere 64-bit flat value.
CodePudding user response:
Well no, but sometimes yes
On the one hand, assigning variables can be an expensive operation (for example, copying a std::string when it isn't needed); on the other hand, you want nice, readable code. The fact is that your question does not have a single correct answer. A lot depends on the specific task. In some cases the optimizer will remove the redundant copy; in others the redundant copy will still be there after optimization. Some suggest using references in this case, but nothing guarantees (correct me if I'm wrong) that they won't also result in unnecessary operations.
So in some cases there will be no difference, but you can't be sure before writing the code.
The main advice I can give you is to MEASURE the execution time of your code after each optimization, and to remember that premature optimization is the root of all evil.
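A rough sketch of what measuring could look like with std::chrono. The functions withLocals, withIndexing and timeNs are hypothetical, and the formula is just a placeholder for the real math. A serious benchmark needs more care (warm-up, many runs, checking the generated assembly), so treat this only as a starting point.

```cpp
#include <cassert>
#include <chrono>
#include <vector>

// Variant 1: name the elements first.
double withLocals(const std::vector<double>& v)
{
    assert(v.size() >= 3);
    double a = v[0], b = v[1], c = v[2];
    return a * b + c;
}

// Variant 2: index the vector directly.
double withIndexing(const std::vector<double>& v)
{
    assert(v.size() >= 3);
    return v[0] * v[1] + v[2];
}

// Calls f(v) reps times and returns the elapsed time in nanoseconds.
// The results are accumulated into sink so the calls are not trivially dead.
template <typename F>
long long timeNs(F f, const std::vector<double>& v, int reps, double& sink)
{
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < reps; ++i) sink += f(v);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

Call timeNs once with each variant on the same input and compare the two numbers, with optimizations enabled in the build. For a function this small I would expect no measurable difference, but that is precisely what the measurement is there to confirm.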
