How to fit a normal distribution for scatter plot data

Time:01-21

I have a dataframe with x values (column x) and y values (column 1), shown below. I want to fit a normal distribution to them and get the mean and standard deviation.

Next I plot the data and the fitted curve together on one chart, but the result looks very wrong. It is not just that the fitted curve is shifted; I am not sure what is wrong with it.

import matplotlib.pyplot as plt
from scipy import stats
from scipy import optimize
import numpy as np

data_sample = {'x': [0,1,2,3,4,5,6,7,8,9,10], '1': [0,1,2,3,4,5,4,3,2,1,0]}  
def test_func(x, a, b): 
    return stats.norm.pdf(x,a,b)

params, cov_params = optimize.curve_fit(test_func, data_sample['x'], data_sample['1'])

print(params)

plt.scatter(data_sample['x'], data_sample['1'], label='Data')
plt.plot(data_sample['x'] , test_func(data_sample['x'], params[0], params[1]), label='Fitted function')

plt.legend(loc='best')

plt.show()


CodePudding user response:

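stats.norm.pdf is a probability density, so the area under it is always 1; it cannot reach y-values as high as 5 unless it becomes extremely narrow. The data therefore needs to be normalized so that the area under it is 1. When all x-values are 1 apart, the area is simply the sum of the y-values. If the spacing between the x-values is larger or smaller than 1, multiply the sum by that spacing. Another way to calculate the area is numerical integration, for example with the trapezoidal rule.

[Figure: fitting a normal curve to some points]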

PS: With the original code (without the normalization), evaluating the fitted function on a finer grid of x-values makes the narrow curve more apparent:

x_detailed = np.linspace(min(data_sample['x']) - 1, max(data_sample['x']) + 1, 500)
plt.plot(x_detailed, test_func(x_detailed, params[0], params[1]), color='m', label='Fitted function')

[Figure: narrow Gaussian curve for non-normalized data]
