Correlation coefficient calculation in Python
How would I calculate, in Python, the correlation coefficient between the spring training wins column (Spr.TR) and the regular-season wins column (Reg Szn)?
| Name | Spr.TR | Reg Szn |
|---|---|---|
| Team B | 0.429 | 0.586 |
| Team C | 0.417 | 0.646 |
| Team D | 0.569 | 0.6 |
| Team E | 0.569 | 0.457 |
| Team F | 0.533 | 0.563 |
| Team G | 0.724 | 0.617 |
| Team H | 0.5 | 0.64 |
| Team I | 0.577 | 0.649 |
| Team J | 0.692 | 0.466 |
| Team K | 0.5 | 0.477 |
| Team L | 0.731 | 0.699 |
| Team M | 0.643 | 0.588 |
| Team N | 0.448 | 0.531 |
CodePudding user response:
You can use the pandas Series.corr method (Pearson correlation is the default, so method='pearson' is optional):
df['Spr.TR'].corr(df['Reg Szn'], method='pearson')
Output: 0.10811116955657629
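The one-liner above assumes the table is already loaded into a DataFrame named df. The question does not show how the data is loaded, so the following is only a sketch that types the values in by hand from the table and then applies the same corr call:

```python
import pandas as pd

# Values copied from the table in the question. How the asker actually
# loads the data is not shown, so building df by hand is an assumption.
df = pd.DataFrame({
    "Name": ["Team B", "Team C", "Team D", "Team E", "Team F", "Team G",
             "Team H", "Team I", "Team J", "Team K", "Team L", "Team M",
             "Team N"],
    "Spr.TR": [0.429, 0.417, 0.569, 0.569, 0.533, 0.724, 0.5,
               0.577, 0.692, 0.5, 0.731, 0.643, 0.448],
    "Reg Szn": [0.586, 0.646, 0.6, 0.457, 0.563, 0.617, 0.64,
                0.649, 0.466, 0.477, 0.699, 0.588, 0.531],
})

# Series.corr uses Pearson correlation unless another method is given.
r = df["Spr.TR"].corr(df["Reg Szn"])
print(r)  # roughly 0.108, in line with the output quoted above
```

A value this close to zero suggests that, for these 13 teams, spring training record has only a very weak linear relationship with regular-season record.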
CodePudding user response:
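Assuming your data is in a pandas.DataFrame named df, you can use pearsonr from SciPy. Import it from scipy.stats; the scipy.stats.stats path is deprecated.
from scipy.stats import pearsonr
# pearsonr returns (correlation, p-value); [0] selects the correlation
correlation = pearsonr(df["Spr.TR"], df["Reg Szn"])[0]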
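To see what these library calls compute, Pearson's r can also be worked out directly from its definition: the sum of products of deviations from the mean, divided by the square root of the product of the two sums of squared deviations. The sketch below uses only the standard library; the helper name pearson_r and the hand-typed lists are illustrative, not part of either library.

```python
import math

# Values copied from the table in the question.
spring = [0.429, 0.417, 0.569, 0.569, 0.533, 0.724, 0.5,
          0.577, 0.692, 0.5, 0.731, 0.643, 0.448]
regular = [0.586, 0.646, 0.6, 0.457, 0.563, 0.617, 0.64,
           0.649, 0.466, 0.477, 0.699, 0.588, 0.531]


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance numerator).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(spring, regular)
print(round(r, 3))  # 0.108
```

This should agree with the pandas and SciPy results up to floating-point rounding.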
