I am working on a data-driven robust optimization approach. In the numerical results section, I need to validate the method using sample data. I use np.random.multivariate_normal to generate the data, with the mean vector and covariance matrix given as follows:
mean = [-1000, 1,1,1]
cov = [[200,0,0,0],[0,0.001,0,0],[0,0,0.001,0],[0,0,0,0.001]]
After generating the data, the first components of the samples are not spread across the interval [-1200,-800]; instead, they fall in a much smaller interval (e.g., [-1003,-997]). I need to generate sample data whose first components are spread more widely across [-1200,-800]. The code that I use is as follows:
import numpy as np
import matplotlib.pyplot as plt
from numpy.random import multivariate_normal
# 4D Gaussian (mean has four components):
n=50 #number of samples
mean = [-1000, 1,1,1]
cov = [[200,0,0,0],[0,0.001,0,0],[0,0,0.001,0],[0,0,0,0.001]]
Samples = np.random.multivariate_normal(mean, cov, n).T
I do not know where I am going wrong.
CodePudding user response:
It seems that you are asking for a mean of -1000 and a variance of 1.
Note that we expect about 99.7% of the values in an infinite population to lie within 3 sigma of the mean (sigma is the standard deviation, $\sqrt{Var}$). With a variance of 1, sigma is 1, so 3 sigma is 3 and almost all values land in [-1003, -997]. So your results are expected.
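The three-sigma rule can be checked with a few lines of arithmetic. This is only a sketch: `three_sigma_interval` is a helper name made up for illustration, and the variances tried (1, 200, 4000) are taken from the covariance matrices that appear in this thread.

```python
import math


def three_sigma_interval(mean, variance):
    """Return (low, high) = mean -/+ 3 * sqrt(variance) (hypothetical helper)."""
    sigma = math.sqrt(variance)
    return mean - 3 * sigma, mean + 3 * sigma


# Compare the spread implied by a few variances for the first component.
for variance in (1, 200, 4000):
    low, high = three_sigma_interval(-1000, variance)
    print(f"variance={variance}: about 99.7% of samples in [{low:.1f}, {high:.1f}]")
```

A variance of 1 gives exactly the [-1003, -997] range reported in the question.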

If you want a wider spread, you need to specify a larger variance in cov, in the [0, 0] element (the variance of the first component). Here is some messing around I did to demonstrate this:
# increase n to get closer to the asymptotic behavior
>>> n = 1000
# what you had
>>> cov = [[1,0,0,0],[0,0.001,0,0],[0,0,0.001,0],[0,0,0,0.001]]
>>> Samples = np.random.multivariate_normal(mean, cov, n).T
>>> min(Samples[0,:])
-1003.1521026984535
# larger variance ([0, 0] element)
>>> cov = [[4000,0,0,0],[0,0.001,0,0],[0,0,0.001,0],[0,0,0,0.001]]
>>> Samples = np.random.multivariate_normal(mean, cov, n).T
>>> min(Samples[0,:])
-1247.6489017302786
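Going the other way, you can pick the variance from the interval you want to cover. The sketch below assumes you want [-1200, -800] to be the 3 sigma range around the mean, so the half-width 200 equals 3 sigma. The seed, the sample size, and the variable names are arbitrary choices for illustration.

```python
import numpy as np

low, high = -1200.0, -800.0       # target interval from the question
center = (low + high) / 2         # -1000
sigma = (high - low) / 2 / 3      # half-width = 3 sigma, so sigma is about 66.7
variance = sigma ** 2             # about 4444

mean = [center, 1, 1, 1]
cov = np.diag([variance, 0.001, 0.001, 0.001])

rng = np.random.default_rng(0)    # arbitrary seed, only for reproducibility
samples = rng.multivariate_normal(mean, cov, size=1000).T

first = samples[0]
inside = np.mean((first >= low) & (first <= high))  # fraction inside the interval
print(first.min(), first.max(), inside)
```

The samples are still normally distributed, so they cluster around -1000 and a small fraction can fall outside the interval. A larger variance spreads them further but pushes more of them outside.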
