I need to create multiple pivottables wherein only part that changes is data passed to it. e.g.
Q1 = pd.pivot_table(df1, values='id', index='race', columns='gender', aggfunc='count')
Q2 = pd.pivot_table(df1[df1['state']=='NJ'], values='id', index='race', columns='gender', aggfunc='count')
Q3 = pd.pivot_table(df1[df1['zipcode']=='07030'], values='id', index='race', columns='gender', aggfunc='count')
Instead of writing each pivot statement multiple times, I would like to create a dictionary of questions:conditions and provide the condition of each question in the argument for pivottable, using a for loop that would look something like this:
filter = {"Q1": "", "Q2":"df1['state']=='NJ'", "Q3":"df1['zipcode']=='07030'" }
for qn, mask in filter.items():
Qx = pd.pivot_table(df1[mask], values='id', index='race', columns='gender', aggfunc='count')
Qx.to_excel('some.xlsx')
Obviously this throws error since mask is read-in as a string. How do you make it read as not a string? Thanks!
CodePudding user response:
You can use the condition alone (and df1.columns to select the entire DataFrame for Q1)
filter = {"Q1": df1.columns, "Q2":df1['state']=='NJ', "Q3":df1['zipcode']=='07030'}
