NumPy random number generator latency

Why is generating random numbers with NumPy so much slower across many small calls than in a single call that produces the same total number of samples?

Example:

import timeit

if __name__ == '__main__':

    # timeit only sees names created in its setup string, so rng must be
    # built there; defining it in this module would raise a NameError.
    setup = 'import numpy as np; rng = np.random.default_rng()'

    # Legacy global API: one call for 100 samples vs. 100 calls for 1 sample each.
    # Each timing is the total for timeit's default of 1,000,000 runs, in seconds.
    latency_normal = timeit.timeit('np.random.uniform(size=(100,))', setup=setup)
    latency_normal_loop = timeit.timeit('[np.random.uniform(size=1) for _ in range(100)]', setup=setup)

    # Generator API: the same two access patterns.
    latency_generator = timeit.timeit('rng.uniform(size=(100,))', setup=setup)
    latency_generator_loop = timeit.timeit('[rng.uniform(size=1) for _ in range(100)]', setup=setup)

    print("latencies:\t normal: [{}, {}]\t generator: [{},{}]".format(latency_normal, latency_normal_loop, latency_generator, latency_generator_loop))

Output:

latencies:       normal: [2.7388298519999807, 31.694285575999857]        generator: [2.6634575979996953,31.0009219450003]

Are there any alternatives that perform better for repeated calls with small sample sizes?

CodePudding user response：

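Every call into numpy carries a fixed overhead (the function call itself, argument handling, and allocating the result array) that does not depend on how many samples you ask for. With size=1 that overhead dominates: in your timings, 100 calls of size 1 take more than 10 times as long as one call of size 100. To work around it, you can write a wrapper that fetches a batch of random numbers from numpy (e.g. 100) in a single call and then returns values from this cache one at a time. When the cache is depleted, it asks numpy for another 100 numbers, and so on.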
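A minimal sketch of such a wrapper could look like the following. The class name BufferedUniform, its next() method, and the default batch size are hypothetical choices for illustration, not part of numpy; the batch is converted to a Python list on the assumption that list indexing is cheaper than indexing a numpy array element by element.

```python
import numpy as np


class BufferedUniform:
    """Hypothetical helper: hands out uniform samples one at a time from a batch."""

    def __init__(self, batch_size=100, seed=None):
        self._rng = np.random.default_rng(seed)
        self._batch_size = batch_size
        self._refill()

    def _refill(self):
        # One numpy call per batch; tolist() turns the array into plain Python floats.
        self._buffer = self._rng.uniform(size=self._batch_size).tolist()
        self._index = 0

    def next(self):
        # Ask numpy for a fresh batch only when the current one is used up.
        if self._index >= self._batch_size:
            self._refill()
        value = self._buffer[self._index]
        self._index += 1
        return value


source = BufferedUniform(batch_size=100, seed=0)
samples = [source.next() for _ in range(250)]  # triggers three numpy calls in total
```

The batch size trades memory and up-front latency against how often numpy is called; it is worth timing a few values for your own workload rather than assuming 100 is best.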

Or you can simply use Python's built-in random module, which returns one plain float per call and never has to allocate an array.
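As a rough illustration of the standard-library route, the sketch below draws single samples with random.Random. The seed and the bounds 2.0 and 5.0 are arbitrary values picked for the example; whether this beats a batched numpy call on your machine is something to measure with timeit rather than assume.

```python
import random

# A dedicated Random instance keeps the example reproducible and independent
# of the module-level generator.
rng = random.Random(0)

# random() returns one float in [0.0, 1.0) per call.
unit_samples = [rng.random() for _ in range(100)]

# uniform(a, b) scales to an arbitrary interval (bounds here are arbitrary).
scaled_samples = [rng.uniform(2.0, 5.0) for _ in range(100)]
```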