I have some questions about combining the first 2 columns in pandas/python with n/a value long story: I need to read an excel and alter those changes. I can not change anything in excel, so any change has been done by python.
Here is the excel input
and the expected expect output will be
I manage to read it in, but when I try to combine the first 2 columns, I have some problems. since in excel, the first row is merged, so once it is read in. only one row has value, but the rest of the row is all N/A.
such as below:
Year number 2016
Month Jan
Month 2016-01
Grade 1 100
NaN 2 99
NaN 3 98
NaN 4 96
NaN 5 92
NaN Total 485
Is there any function that can easily help me to combine the first two columns and make it as below:
Year 2016
Month Jan
Month 2016-01
Grade 1 100
Grade 2 99
Grade 3 98
Grade 4 96
Grade 5 92
Grade Total 485
Anything will be really appreciated.
I searched and google the key word for so long but did not find any answer that fits my situation here.
CodePudding user response:
d = '''
Year,number,2016
Month,,Jan
Month,,2016-01
Grade,1, 100
NaN,2, 99
NaN,3, 98
NaN,4, 96
NaN,5, 92
NaN,Total,485
'''
df = pd.read_csv(StringIO(d))
df
df['Year'] = df.Year.fillna(method='ffill')
df = df.fillna('') # skip this step if your data from excel does not have nan in col 2.
df['Year'] = df.Year ' ' df.number.astype('str')
df = df.drop('number',axis=1)
df




