It looks like the pct_change function is only partially implemented in the pandas API on Spark, with some parameters missing.
- Using a PySpark pandas Series:
import pyspark.pandas as ps
data = ps.Series([90, 91, 85], index=[2, 4, 1])
print(type(data))
print(data.pct_change())
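For reference, with the default periods=1, pct_change divides each row's change from the previous row by the previous row's value, working in row order rather than sorted-index order. Here is a minimal plain-Python sketch of that arithmetic on the same values. The helper name pct_change_sketch is hypothetical and is not part of pandas or PySpark; it only illustrates the calculation.

```python
def pct_change_sketch(values, periods=1):
    """Illustrative stand-in for Series.pct_change on a plain list.

    Assumes periods >= 1 and no zero values in the input.
    """
    result = []
    for i, current in enumerate(values):
        if i < periods:
            # No earlier row to compare against; pandas reports NaN here.
            result.append(None)
        else:
            previous = values[i - periods]
            result.append((current - previous) / previous)
    return result


changes = pct_change_sketch([90, 91, 85])
print(changes)
```

The first entry is empty because there is no previous row; the second is (91 - 90) / 90 and the third is (85 - 91) / 91.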
UPDATE:
The error occurs because DataFrame.toPandas is not the same as DataFrame.toPandas(). Writing data.toPandas without parentheses does not call the method; it returns the method itself, an object of type method. That object has no pct_change attribute, so calling pct_change() on it raises an error.
- Calling DataFrame.toPandas() returns a DataFrame object on which you can use pct_change(). Modify the code as follows to get the expected result:
data_pd = data.toPandas()
print(type(data_pd))
op = data_pd.pct_change()
print(op)
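To see why the missing parentheses matter, here is a minimal sketch in plain Python that needs no Spark session. FakeFrame is a hypothetical stand-in for a Spark DataFrame, and its toPandas returns a list purely for illustration (the real method returns a pandas DataFrame).

```python
class FakeFrame:
    """Hypothetical stand-in for a Spark DataFrame; not a PySpark class."""

    def toPandas(self):
        # The real method returns a pandas DataFrame; a list stands in here.
        return [90, 91, 85]


data = FakeFrame()

without_call = data.toPandas   # bound method; nothing has been executed yet
with_call = data.toPandas()    # the value the method returns

print(type(without_call).__name__)          # method
print(type(with_call).__name__)             # list
print(hasattr(without_call, "pct_change"))  # False
```

The same applies to the real DataFrame: only the version with parentheses gives you an object that can have pct_change().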
CodePudding user response:
After a chat with @SaideepArik, we found that pandas_api() solves the problem.
# Convert the Spark DataFrame to a pandas-on-Spark DataFrame
data_pd = data.pandas_api()
data_pd.pct_change()




