Generating random floats, summing to 1, with minimum value

I have seen many solutions for generating random floats within a specific range (like this), which actually help me, and solutions for generating random floats summing to 1 (like this). Separately these solutions work perfectly, but I can't figure out how to merge them.

Currently my code is:

import random
def sample_floats(low, high, k=1):
    """ Return a k-length list of unique random floats
        in the range of low <= x <= high
    """
    result = []
    seen = set()
    for i in range(k):
        x = random.uniform(low, high)
        while x in seen:
            x = random.uniform(low, high)
        seen.add(x)
        result.append(x)
    return result

And still, applying

import numpy as np

weights = sample_floats(0.055, 1.0, 11)
weights /= np.sum(weights)

returns a weights array in which some floats are less than 0.055.

Should I somehow use np.random.dirichlet in the function above, or should the function be built on np.random.dirichlet with the condition > 0.055 applied afterwards? I can't figure out a solution.

Thank you in advance!

CodePudding user response：

IIUC, you want to generate an array of k values, with minimum value of low=0.055.

It is easier to generate numbers from 0 that sum up to 1-low*k, and then to add low so that the final array sums to 1. Thus, this guarantees both the lower bound and the sum.

Regarding high, I am pretty sure it is mathematically impossible to add this constraint, as once you fix the lower bound and the sum, there are not enough degrees of freedom to choose an upper bound. The upper bound will be 1-low*(k-1) (here 0.505).

Also, be aware that, with a minimum value, you necessarily enforce a maximum k of 1//low (here 18 values). If you set k higher, the low bound won't be correct.
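To make the two limits above concrete, here is a minimal sketch of a helper that checks whether k values of at least low can sum to the target at all, and returns the upper bound that follows from that. The name implied_upper_bound and the total parameter are hypothetical, chosen only for this illustration.

```python
def implied_upper_bound(low, k, total=1.0):
    """Largest value a single entry can take when k entries,
    each at least `low`, must sum to `total` (hypothetical helper)."""
    if low * k > total:
        # Even k copies of `low` already exceed the target sum.
        raise ValueError(f"k = {k} values of at least {low} cannot sum to {total}")
    # The other k - 1 entries take the minimum; one entry gets the rest.
    return total - low * (k - 1)


# With the numbers from this answer the bound is about 0.505.
print(implied_upper_bound(0.055, 10))
```

With low = 0.055 the check still passes for k = 18 and raises for k = 19, which matches the maximum k mentioned above.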

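import numpy as np

# parameters
low = 0.055
k = 10

# random non-negative values, rescaled so that they sum to 1 - low*k
a = np.random.rand(k)
a = a / a.sum() * (1 - low * k)
# shift every value up by low so that the total becomes 1
weights = a + low

# checking that the sum is 1
assert np.isclose(weights.sum(), 1)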

Example output:

array([0.13608635, 0.06796974, 0.07444545, 0.1361171 , 0.07217206,
       0.09223554, 0.12713463, 0.11012871, 0.1107402 , 0.07297022])
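Since the question asks about np.random.dirichlet, the same shift-and-scale idea can be sketched with a Dirichlet draw in place of the normalised uniform values. This is a variation on the code above, not the code itself: a flat Dirichlet (all concentration parameters equal to 1) is uniform over all non-negative vectors summing to 1, which normalised np.random.rand values are not. The function name dirichlet_with_minimum and the rng parameter are hypothetical.

```python
import numpy as np


def dirichlet_with_minimum(low, k, rng=None):
    """Return k floats, each at least `low`, that sum to 1 (hypothetical helper)."""
    if low * k > 1:
        raise ValueError("low * k must not exceed 1")
    rng = np.random.default_rng() if rng is None else rng
    # Flat Dirichlet: k non-negative values that sum to 1.
    base = rng.dirichlet(np.ones(k))
    # Scale into the free mass 1 - low*k, then shift every entry up by low.
    return low + (1 - low * k) * base


weights = dirichlet_with_minimum(0.055, 11, np.random.default_rng(0))
print(weights.min() >= 0.055, np.isclose(weights.sum(), 1))
```

Passing a seeded generator, as in the usage lines, keeps the draw reproducible.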

CodePudding user response：

You could generate k-1 numbers iteratively by varying the upper bound of the uniform random number generator, the constraint at any iteration being that the number generated allows the rest of the numbers to be at least low. The last number is then whatever remains to reach 1.

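import random

def sample_floats(low, high, k=1):
    # high is kept from the question's signature but is not used here
    result = []
    generated = 0
    while generated < k-1:
        # largest value that still leaves at least low for each remaining number
        current_higher_bound = max(low, 1 - (k - 1 - generated)*low - sum(result))
        next_num = random.uniform(low, current_higher_bound)
        result.append(next_num)
        generated += 1
    # the last number is the remainder, so the total is 1
    last_num = 1 - sum(result)
    result.append(last_num)
    return result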

print(sample_floats(0.01, 1, k=15))
#[0.08878760926151083,
# 0.17897435239586243,
# 0.5873150041878156,
# 0.021487776792166513,
# 0.011234379498998357,
# 0.012408564286727042,
# 0.015391011259745103,
# 0.01264921242128719,
# 0.010759267284382326,
# 0.010615007333002748,
# 0.010288605412288477,
# 0.010060487014659121,
# 0.010027216923973544,
# 0.010000064276203318,
# 0.010001441651377285]
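In the output above the later values sit very close to low, because every draw shrinks the room left for the following ones. If the position of a value should not carry that bias, one option (my assumption, not part of the code above) is to shuffle the list afterwards, which changes neither the sum nor the minimum. The sketch below restates the iterative idea in compact form so that it runs on its own; the function names are hypothetical.

```python
import random


def sample_floats_iterative(low, k):
    """Draw k floats of at least `low` that sum to 1, one after another."""
    result = []
    for drawn in range(k - 1):
        # Largest value that still leaves `low` for each number not yet drawn.
        upper = max(low, 1 - (k - 1 - drawn) * low - sum(result))
        result.append(random.uniform(low, upper))
    # The last number is the remainder.
    result.append(1 - sum(result))
    return result


def sample_floats_shuffled(low, k):
    """Same values as above, but in random order."""
    values = sample_floats_iterative(low, k)
    random.shuffle(values)  # in-place; sum and minimum are unchanged
    return values


print(sample_floats_shuffled(0.01, 15))
```

Shuffling only removes the dependence on position; the values themselves are still spread the same way as before.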

CodePudding user response：

The samples are correlated, so I believe you can't generate them in an IID way. You can, however, do it in an iterative manner, as I show in the code below. There are a few more special cases to check, such as the user passing low > high or high*k < sum, but I figured you can find and account for them using my modification to your code.

import random
import warnings
  

def sample_floats(low = 0.055, high = 1., x_sum = 1., k = 1):
    """ Return a k-length list of unique random floats
        in the range of 'low' <= x <= 'high' summing up to 'x_sum'.
    """
    sum_i = 0
    xs = []
    
    if x_sum - (k-1)*low < high:
        warnings.warn(f'high = {high} is too high to be generated under the'
            f' conditions set by k = {k}, sum = {x_sum}, and low = {low}.'
            f' high automatically set to {x_sum - (k-1)*low}.')
        # actually apply the cap announced in the warning
        high = x_sum - (k-1)*low

    if k == 1:
        if high < x_sum:
            raise ValueError(f'The parameter combination k = {k}, sum = {x_sum},'
                f' and high = {high} is impossible.')
        else: return [x_sum]  # a one-element list, as the docstring promises
    high_i = high
    for i in range(k-1):
        x = random.uniform(low, high_i)
        xs.append(x)
        sum_i = sum_i + x
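        # shrink the upper bound so the remaining numbers can still reach low,
        # without letting the bound itself drop below low
        if high < (x_sum - sum_i - (k-1-i)*low):
            high_i = high
        else: high_i = max(low, x_sum - sum_i - (k-1-i)*low)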

    xs.append(x_sum - sum_i)

    return xs

For example:

random.seed(0)
xs = sample_floats(low = 0.055, high = 0.5, x_sum = 1., k = 5)
print(xs)
print(sum(xs))

Output:

[0.43076772392864643, 0.27801464913542906, 0.08495210994346317, 0.06568433355884717, 0.14058118343361425]
1.0
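The special cases mentioned in this answer could be checked before any number is drawn. The following is a minimal sketch under that assumption; the function check_parameters is hypothetical and is not called by the code above.

```python
def check_parameters(low, high, x_sum, k):
    """Raise ValueError if no k floats in [low, high] can sum to x_sum."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if low > high:
        raise ValueError(f"low = {low} is larger than high = {high}")
    if low * k > x_sum:
        # Even k copies of the minimum overshoot the target sum.
        raise ValueError(f"{k} values of at least {low} exceed the sum {x_sum}")
    if high * k < x_sum:
        # Even k copies of the maximum fall short of the target sum.
        raise ValueError(f"{k} values of at most {high} cannot reach the sum {x_sum}")


# The parameters from the example pass without an error.
check_parameters(low=0.055, high=0.5, x_sum=1.0, k=5)
```

Calling it at the top of sample_floats would turn silently wrong results into explicit errors, at the cost of rejecting inputs that the warning in the code above currently adjusts instead.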