I am normalizing some random data. I calculated the mean and standard deviation of the data and normalized it with the formula Norm = (data - mean) / std, but my result does not match what the numpy normalization function gives, and I am not getting a bell-shaped curve.
import numpy as np
from scipy.stats import norm
a = np.arange(-4, 4)
mean = sum(a/len(a))
std = (np.sqrt(sum((a-mean)**2)/(len(a)-1)))
y1 = norm.pdf(a, mean, std)
y2 = (a - mean) / std
print(y1)
print(y2)
Output:
y1 = [0.05868028 0.09674742 0.1350219 0.15950953 0.15950953 0.1350219
0.09674742 0.05868028]
y2 = [-1.42886902 -1.02062073 -0.61237244 -0.20412415 0.20412415 0.61237244
1.02062073 1.42886902]
What is the problem?
CodePudding user response:
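The y2 you calculate is the result of standardization, also called the z-score. norm.pdf() does something different: it evaluates the probability density function of a normal distribution at the given points, so the two results are not expected to match. To calculate the z-score with scipy, you should use scipy.stats.zscore() instead of scipy.stats.norm.pdf().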
import numpy as np
from scipy.stats import zscore
a = np.arange(-4, 4)
y1 = (a - a.mean()) / a.std()
y2 = zscore(a)
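y1 and y2 will be the same. Both use the population standard deviation (ddof=0), so they differ slightly from the y2 in the question, which divides by len(a)-1. Pass ddof=1 to both a.std() and zscore() to match it.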
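One detail worth checking is which standard deviation each call uses. The sketch below assumes the same array a as in the question and compares the default population form with the sample form that the question's hand-written formula uses. The names z_population, z_sample and z_manual are only illustrative.

```python
import numpy as np
from scipy.stats import zscore

a = np.arange(-4, 4)

# Population standard deviation (ddof=0): the default for a.std() and zscore().
z_population = zscore(a)

# Sample standard deviation (ddof=1): matches dividing by len(a) - 1,
# as the question's hand-written std does.
z_sample = zscore(a, ddof=1)
z_manual = (a - a.mean()) / a.std(ddof=1)

print(z_population)
print(z_sample)
print(np.allclose(z_sample, z_manual))
```

With ddof=1 the first element should agree with the -1.42886902 reported in the question. The default ddof=0 gives slightly larger magnitudes because it divides by a smaller standard deviation.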
CodePudding user response:
Put yet another way, the normal density is the standard normal density of the z-score divided by std, so adding this to your code will produce values equal to norm.pdf(a, mean, std):
pdf = np.exp(-y2*y2/2)/((2*np.pi)**0.5) / std
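As a quick check of that identity, here is a minimal sketch that rebuilds the density from the z-scores and compares it with norm.pdf. It assumes the sample standard deviation (ddof=1) used in the question. The denser grid x at the end is a hypothetical addition, not part of the original code: eight integer points only give a coarse outline, so a finer grid is one way to see the bell shape.

```python
import numpy as np
from scipy.stats import norm

a = np.arange(-4, 4)
mean = a.mean()
std = a.std(ddof=1)            # sample standard deviation, as in the question

z = (a - mean) / std           # z-scores (the question's y2)

# Standard normal density evaluated at z, rescaled by 1/std.
pdf_from_z = np.exp(-z * z / 2) / np.sqrt(2 * np.pi) / std
pdf_scipy = norm.pdf(a, mean, std)

print(np.allclose(pdf_from_z, pdf_scipy))

# Hypothetical denser grid (not in the original code) for a smoother curve.
x = np.linspace(-4, 4, 201)
density = norm.pdf(x, mean, std)
print(x[np.argmax(density)])   # location of the peak, close to the mean
```

Plotting density against x (for example with matplotlib, if it is available) should then show the familiar bell curve centred near the mean of a.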

