Why is b.pop(0) over 200 times slower than del b[0] for bytearray?-CodePudding

Letting them compete three times (a million pops/dels each time):

from timeit import timeit

for _ in range(3):
    t1 = timeit('b.pop(0)', 'b = bytearray(1000000)')
    t2 = timeit('del b[0]', 'b = bytearray(1000000)')
    print(t1 / t2)

Time ratios (Try it online!):

274.6037053753368
219.38099365582403
252.08691226683823

Why is pop that much slower at doing the same thing?

CodePudding user response：

When you run b.pop(0), Python moves all the elements back by one as you might expect. This takes O(n) time.

When you del b[0], Python simply increases the start pointer of the object by 1.

In both cases, PyByteArray_Resize is called to adjust the size. When the new size is smaller than half the allocated size, the allocated memory will be shrunk. In the del b[0] case, this is the only point where the data will be copied. As a result, this case will take O(1) amortized time.

Relevant code:

bytearray_pop_impl function: Always calls

    memmove(buf   index, buf   index   1, n - index);

The bytearray_setslice_linear function is called for del b[0] with lo == 0, hi == 1, bytes_len == 0. It reaches this code (with growth == -1):

if (lo == 0) {
    /* Shrink the buffer by advancing its logical start */
    self->ob_start -= growth;
    /*
      0   lo               hi             old_size
      |   |<----avail----->|<-----tail------>|
      |      |<-bytes_len->|<-----tail------>|
      0    new_lo         new_hi          new_size
    */
}
else {
    /*
      0   lo               hi               old_size
      |   |<----avail----->|<-----tomove------>|
      |   |<-bytes_len->|<-----tomove------>|
      0   lo         new_hi              new_size
    */
    memmove(buf   lo   bytes_len, buf   hi,
            Py_SIZE(self) - hi);
}

CodePudding user response：

I have to admit, I was very surprised by the timings myself. After convincing myself that they were in fact correct, I took a dive into the CPython source code, and I think I found the answer- cpython optimizes del bytearr[0:x], by just incrementing the pointer to the start of the array:

    if (growth < 0) {
        if (!_canresize(self))
            return -1;

        if (lo == 0) {
            /* Shrink the buffer by advancing its logical start */
            self->ob_start -= growth;

You can find the del bytearray[...] logic here (implemented via bytearray_setslice, with values being NULL), which in turn calls bytearray_setslice_linear, which contains the above optimization.

For comparison, bytearray.pop does NOT implement this optimization- see here in the source code.