Trying to generate a random prime p in the range [2,2147483647] using the C 11 std::uniform_int_distribution.
It's been commented that this approach might not be correct:
It is not immediately obvious that this p is uniformly distributed over the set of all primes <= 2^31 - 1. Whatever uniformity and bias guarantees the random-number generator has, they refer to all integers within the range, but the code is "sieving" just the primes out of it.
However, from another similar S.O. article, it notes
As long as the input random numbers are uniformly distributed in the range, the primes chosen by this method will be uniformly distributed among the primes in that range, too.
Question
Is this code truly correct to generate a random prime?
https://onlinegdb.com/FMzz78LBq
#include <stdio.h>
#include <stdint.h>
#include <math.h>
#include <time.h>
#include <random>
int isPrimeNumber (int num)
{
if (num == 1) return 0;
for (int i = 2; i <= sqrt (num); i )
{
if (num % i == 0)
{
// not prime
return 0;
}
}
// prime
return 1;
}
int main ()
{
std::random_device rd;
std::mt19937 rng (rd ());
// Define prime range from 2 to 2^31 - 1.
std::uniform_int_distribution<int>uni (2, 2147483647);
int prime;
// Generate a random prime.
do { prime = uni(rng); } while (!isPrimeNumber(prime));
printf ("prime = %d", prime);
return 0;
}
CodePudding user response:
The code you presented:
- Uniformly picks a random prime number in the range. Any given prime number in the range will have the same probability of coming up as any other prime in the range.
- will not produce numbers that are uniformly distributed around the range of integers. E.g. there will be much more numbers in the range 1-1000 than there will be in the range 1000001-1001000.
- is extremely inefficient (if you were to generate many numbers)
You should get clarity on exactly what you want. If you wanted 2. above you need something different.
CodePudding user response:
A sketch for a proof:
Assume for the purpose of contradiction that the algorithm produces a non-uniform distribution over the primes. Then there must be a prime p that has a higher probability than a prime q. As these are also integers, that means that in the larger range from 2 to N, sieving out non-primes had no effect on either p or q, hence the uniform distribution had to produce p with a higher probability than q, which is a direct contradiction.
Note: Observe that this proof would not hold if you compute a random number and search for the next prime after that number.
However, as pointed out by Jeffrey, this algorithm is rather inefficient. One alternative comes to mind: In the range up to 2**31 you have about 10**8 primes, according to the pi function. So it is close to feasible to just create a lookup table with all those primes and store it, if not in your binary, in a separate file (I expect about 400 MB), and just compute a uniform random number in the appropriate interval and pick that number out of the lookup table.
However, just to be safe, be aware that for any purpose close to security, std::mt19937 is not a cryptographically secure PRNG.
