Numba performance issue with np.nan and np.inf-CodePudding

I am playing around with numba to accelerate my code. I notice that the performance varies significantly when using np.inf instead np.nan inside the function. Below I have attached three sample functions for illustration.

function1 is not accelerated by numba.
function2 and function3 are both accelerated by numba, but one uses np.nan while the other uses np.inf.

On my machine, the average runtime of the three functions are 0.032284s, 0.041548s and 0.019712s respectively. It appears that using np.nan is much slower than np.inf. Why does the performance vary significantly? Thanks in advance.

Edit: I am using Python 3.7.11 and Numba 0.55.Orc1.

import numpy as np
import numba as nb

def function1(array1, array2):
    nr, nc = array1.shape
    output1 = np.empty((nr, nc), dtype='float')
    output2 = np.empty((nr, nc), dtype='float')
    output1[:] = np.nan
    output2[:] = np.nan

    for r in range(nr):
        row1 = array1[r]
        row2 = array2[r]
        diff = row1 - row2
        id_threshold =np.nonzero( (row1 - row2) > 8 )
        output1[r][id_threshold] = 1
        output2[r][id_threshold] = 0

    output1 = output1.flatten()
    output2 = output2.flatten()
    id_keep = np.nonzero(output1 != np.nan)
    output1 = output1[id_keep]
    output2 = output2[id_keep]
    output = np.vstack((output1, output2))
    return output

@nb.njit('float64[:,::1](float64[:,::1], float64[:,::1])', parallel=True)
def function2(array1, array2):
    nr, nc = array1.shape
    output1 = np.empty((nr,nc), dtype='float')
    output2 = np.empty((nr, nc), dtype='float')
    output1[:] = np.nan
    output2[:] = np.nan

    for r in nb.prange(nr):
        row1 = array1[r]
        row2 = array2[r]
        diff = row1 - row2
        id_threshold =np.nonzero( (row1 - row2) > 8 )
        output1[r][id_threshold] = 1
        output2[r][id_threshold] = 0

    output1 = output1.flatten()
    output2 = output2.flatten()
    id_keep = np.nonzero(output1 != np.nan)
    output1 = output1[id_keep]
    output2 = output2[id_keep]
    output = np.vstack((output1, output2))
    return output

@nb.njit('float64[:,::1](float64[:,::1], float64[:,::1])', parallel=True)
def function3(array1, array2):
    nr, nc = array1.shape
    output1 = np.empty((nr,nc), dtype='float')
    output2 = np.empty((nr, nc), dtype='float')
    output1[:] = np.inf
    output2[:] = np.inf

    for r in nb.prange(nr):
        row1 = array1[r]
        row2 = array2[r]
        diff = row1 - row2
        id_threshold =np.nonzero( (row1 - row2) > 8 )
        output1[r][id_threshold] = 1
        output2[r][id_threshold] = 0
    output1 = output1.flatten()
    output2 = output2.flatten()
    id_keep = np.nonzero(output1 != np.inf)
    output1 = output1[id_keep]
    output2 = output2[id_keep]
    output = np.vstack((output1, output2))
    return output


array1 = 10*np.random.random((1000,1000))
array2 = 10*np.random.random((1000,1000))

output1 = function1(array1, array2)
output2 = function2(array1, array2)
output3 = function3(array1, array2)

CodePudding user response：

The second one is much slower because output1 != np.nan returns a copy output1 since np.nan != np.nan is True (like any other value -- v != np.nan is always true). Thus, the resulting array to compute are much bigger causing a slower execution.

The point is you must never compare a value to np.nan using comparison operators: use np.isnan(value) instead. In your case, you should use np.logical_not(np.isnan(output1)).

The second implementation may be slightly slower due to the temporary array created by np.logical_not (I did not see any statistically significant difference on my machine between using NaN or Inf once the code has been corrected).