How to increment values (multiple in a continuous sequence) in a std::vector in O(1) time?-CodePudding

A stack was to to implemented with a function taking in 2 arguments (number, increment) where it would update the 'number' of values from bottom of the stack by incrementing them by 'increment'.

In the code, I used std::vector for my convenience. However, the addition operation was done (first option for most) using a loop

for (int i = 0; i < number;   i) {
    vec.at(i) = vec.at(i)   increment;
}

How to do this operation in O(1) time? This probably is where I failed test cases.

vec = 1 2 3 4 5
      1 1 1 1             // how to in O(1) time instead of loop?
vec = 2 3 4 5 5

CodePudding user response：

The question lacks important details. In general, it’s impossible to increment an arbitrary range of elements in O(1) time. There must be additional constraints that make it possible.

The question mentions a stack but doesn’t go into further detail. Reading between the lines it could mean that the problem asks for a stack with regular push and pop operations plus bulk increment for a range. The stack “access pattern”, i.e. elements are consumed sequentially, enables increment in O(1) time.

The key idea is to maintain a parallel array of increments. I-th item, increments[i], encodes an argument to a bulk increment operation for the range [0, i]. It starts as 0 initially. Bulk increment operation simply updates a single item in increments, which is clearly O(1) operation.

Returning a container item a[i] clearly requires a summation of increments[i] till increments[N], where N is the total size of the container. Only the last item can be produced efficiently, but that’s ok for a stack.

The pop operation should increment increments[N-1] by increments[N] prior to reducing the container size by 1.

CodePudding user response：

You can do that (under certain circumstances) with SIMD. O(1) means constant time and you may be able to do that when operating a fixed amount of adjacent positions in the vector using vectorization in a single operation. It is probably not what you wanted, but it is worth mentioning it here. For the other cases, using SIMD - despite not being O(1) -, can speed up computation significantly and as a matter of fact, is frequently used for operating images in memory.

SSE and AVX (NEON too in ARM) instruction sets allow you to use a 128bits or 256bits register to place 2, 4 or 8 numbers and operate them simultaneously.

For example, if you have a vector of floats (32 bits) of size 4, you can operate them in a single operation placing them in a 128 bits register (since you mentioned you use your vector as a stack, which implies variable length, this is probably not for you, and could only work in O(1) if the number of elements fits in a single register).

You can do that using OMP SIMD (more high level) or intrinsics (low level). Or just rely on compiler optimization. If you want to use multiple threads, you can use OMP PARALLEL FOR (probably not worth in this case though).

Despite that, many people would not expect to be possible to do that in O(1), so I suspect that this is not the reason why your tests are failing. Loop through a vector is already extremely efficient computationally (compared to other data structures).

As some people also said here, you could try to postpone that and add when retrieving the values, skipping the loop entirely. Not exactly what you asked, but it would be very efficient since is taking a ride in a future operation.

CodePudding user response：

I believe this is possible, using arrays of fixed length. Imagine you have an array of three numbers, each of them consisting of two bytes, and their values are (in hexadecimal) 1A, 02 and 3F.

In memory this will look like:

Memory address 1 : 1A
Memory address 2 : 02
Memory address 3 : 3F.

As the numbers are of a type, consisting of two bytes, this will be the same as:

Memory address 1 : 1A023F.

In order to add 1 to each of these numbers, you just need to add the hexadecimal value 010101 to the value in memory address 1, so you will get:

Memory address 1 : 1B0340.

Which will be understood as an array of two-byte values:

Memory address 1 : 1B
Memory address 2 : 03
Memory address 3 : 40

As you see, you can do this with an array with a fix length, but why would you even want to do this, knowing that a regular size, like 1000, you would need a number with 1000 bytes, for which there most probably is no type definition?