This is a time table, columns=hour, rows=weekday, data=subject [weekday x hour]
1 2 3 4 5 6 7
Name
Monday Project Project Project Data Science Embedded Systems Data Mining Industrial Psychology
Tuesday Project Project Project Project Data Science Industrial Psychology Embedded Systems
Wednesday Data Science Project Project Project Project Project Project
Thursday Data Mining Industrial Psychology Embedded Systems Data Mining Project Project Project
Friday Industrial Psychology Embedded Systems Data Science Data Mining Project Project Project
How do you generate a pandas.Dataframe where, rows=weekday, columns=subject, data = subject frequency in the corresponding weekday?
Required table: [weekday x subject]
Data Mining, Data Science, Embedded Systems, Industrial Psychology, Project
Name
Monday 1 1 1 1 3
Tuesday ...
Wednesday
Thursday
Friday
self.file = 'timetable.csv'
self.sdf = pd.read_csv(self.file, header=0, index_col="Name")
print(self.sdf.to_string())
self.subject_frequency = self.sdf.apply(pd.value_counts)
print(self.subject_frequency.to_string())
self.subject_frequency["sum"] = self.subject_frequency.sum(axis=1)
CodePudding user response:
Use melt to flatten your dataframe then pivot_table to reshape your dataframe:
out = (
df.melt(var_name='Freq', value_name='Data', ignore_index=False).assign(variable=1)
.pivot_table('Freq', 'Name', 'Data', fill_value=0, aggfunc='count')
.loc[df.index] # sort by original index: Monday > Thuesday > ...
)
Output:
>>> out
Data Data Mining Data Science Embedded Systems Industrial Psychology Project
Name
Monday 1 1 1 1 3
Tuesday 0 1 1 1 4
Wednesday 0 1 0 0 6
Thursday 2 0 1 1 3
Friday 1 1 1 1 3
