Inconsistent results from pandas.rolling.std()

Time:01-08

The code below is taken from the official pandas example. The standard deviation of the last three numbers (5, 5, 5) should be 0, but the example output shows a small nonzero value instead.

In [1]: s = pd.Series([5,5,6,7,5,5,5])

In [2]: s.rolling(3).std()
Out[2]:
0             NaN
1             NaN
2    5.773503e-01
3    1.000000e+00
4    1.000000e+00
5    1.154701e+00
6    2.580957e-08
dtype: float64

If I reverse the array, the outcomes seem correct. I don't know why.

In [3]: s[::-1].rolling(3).std()
Out[3]:
6         NaN
5         NaN
4    0.000000
3    1.154701
2    1.000000
1    1.000000
0    0.577350
dtype: float64

CodePudding user response:

What you see is the result of small rounding errors in the floating-point arithmetic used to compute the standard deviation over a rolling window. Earlier versions of pandas automatically detected very small values in the standard deviation and variance code and rounded them to zero. This caused problems when calculating the standard deviation (or variance) of data whose values are themselves very small, so the automatic rounding was removed. The issue is discussed in:

https://github.com/pandas-dev/pandas/issues/37051

and the change was made in:

https://github.com/pandas-dev/pandas/pull/40505

Issue 37051 mentions the need to update the documentation, but the change does not appear to be reflected in the current online documentation.
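To illustrate where such rounding errors can come from, here is a minimal sketch of a sliding-window standard deviation that updates a running mean and a running sum of squared deviations as values enter and leave the window. It is a simplified illustration, not pandas' actual implementation, and the function name `rolling_std_online` is hypothetical. Because the running totals carry over from earlier windows, the last window (5, 5, 5) can end up with a tiny leftover value instead of an exact 0, and the leftover depends on the order in which values passed through the window.

```python
import math


def rolling_std_online(values, window):
    """Sample standard deviation (ddof=1) over a sliding window.

    Illustrative sketch only: the mean and the sum of squared
    deviations (m2) are updated incrementally instead of being
    recomputed for every window. Requires window >= 2.
    """
    out = []
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the mean
    for i, x in enumerate(values):
        # Add the value entering the window.
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)

        # Remove the value leaving the window, if there is one.
        if i >= window:
            old = values[i - window]
            n -= 1
            delta = old - mean
            mean -= delta / n
            m2 -= delta * (old - mean)

        if i < window - 1:
            out.append(math.nan)  # window not yet full
        else:
            # Rounding can push m2 slightly below zero, so clamp it.
            out.append(math.sqrt(max(m2, 0.0) / (window - 1)))
    return out


values = [5, 5, 6, 7, 5, 5, 5]
online = rolling_std_online(values, 3)
print(online)
```

The printed values agree with the exact standard deviations only up to floating-point rounding, which is the effect described in the answer.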

If you want to replicate the behavior of earlier pandas versions, you can manually set any result below a small threshold (here 1e-7) to 0.

In [10]: s_std = s.rolling(3).std()

In [11]: s_std
Out[11]:
0             NaN
1             NaN
2    5.773503e-01
3    1.000000e+00
4    1.000000e+00
5    1.154701e+00
6    2.580957e-08
dtype: float64

In [12]: s_std[s_std < 1e-7] = 0

In [13]: s_std
Out[13]:
0         NaN
1         NaN
2    0.577350
3    1.000000
4    1.000000
5    1.154701
6    0.000000
dtype: float64
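
If a fixed cutoff feels arbitrary for your data, one possible alternative is to recompute the standard deviation independently for every window, so that nothing carries over from earlier windows. The sketch below assumes NumPy is available and passes `np.std` with `ddof=1` (matching the pandas default) to `rolling(...).apply`. The name `exact_std` is only illustrative. This is likely to be slower than the built-in `std()` on large series.

```python
import numpy as np
import pandas as pd

s = pd.Series([5, 5, 6, 7, 5, 5, 5])

# Recompute the sample standard deviation from scratch for each window.
# raw=True passes each window to np.std as a plain NumPy array.
exact_std = s.rolling(3).apply(np.std, raw=True, kwargs={"ddof": 1})
print(exact_std)
```

For a window of identical values such as (5, 5, 5), the mean is exactly 5 and every deviation is exactly 0, so the result for the last row is 0.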