Home > Back-end >  How to convert two columns to list of values?
How to convert two columns to list of values?

Time:01-24

I have a dataframe like this below,

    A        B  C     D
0  A1    Egypt  10  Yes
1  A1  Morocco   5   No
2  A2  Algeria   4  Yes
3  A3    Egypt  45   No
4  A3    Egypt  17  Yes
5  A3  Tunisia   4  Yes
6  A3  Algeria  32   No
7  A4  Tunisia   7   No
8  A5    Egypt   6   No
9  A5  Morocco   1   No

I want to get the count of yes and no from the column D wrt column B. The expected output needs to be in the lists like this below which can help to plot the multivariable chart.

Exected output:

yes = [1,2,0,1]
no = [1,2,2,1]
country = ['Algeria', 'Egypt', 'Morocco','Tunisia']

I am not sure how to achieve this from the above dataframe. Any help will be appreciated.

Here is the minimum reproducible dataframe sample:

import pandas as pd

df = pd.DataFrame({'A': {0: 'A1',
                         1: 'A1',
                         2: 'A2',
                         3: 'A3',
                         4: 'A3',
                         5: 'A3',
                         6: 'A3',
                         7: 'A4',
                         8: 'A5',
                         9: 'A5'},
                   'B': {0: 'Egypt',
                         1: 'Morocco',
                         2: 'Algeria',
                         3: 'Egypt',
                         4: 'Egypt',
                         5: 'Tunisia',
                         6: 'Algeria',
                         7: 'Tunisia',
                         8: 'Egypt',
                         9: 'Morocco'},
                   'C ': {0: 10, 1: 5, 2: 4, 3: 45, 4: 17, 5: 4, 6: 32, 7: 7, 8: 6, 9: 1},
                   'D': {0: 'Yes',
                         1: 'No',
                         2: 'Yes',
                         3: 'No',
                         4: 'Yes',
                         5: 'Yes',
                         6: 'No',
                         7: 'No',
                         8: 'No',
                         9: 'No'}}
                  )

CodePudding user response:

Create new columns by counting "yes", "no"; then groupby "B" and use sum on the newly created columns:

country, yes, no = df.assign(Yes=df['D']=='Yes', No=df['D']=='No').groupby('B')[['Yes','No']].sum().reset_index().T.to_numpy().tolist()

Output:

['Algeria', 'Egypt', 'Morocco', 'Tunisia']
[1, 2, 0, 1]
[1, 2, 2, 1]

CodePudding user response:

Use crosstab:

df1 = pd.crosstab(df.B, df.D)
print (df1)
D        No  Yes
B               
Algeria   1    1
Egypt     2    2
Morocco   2    0
Tunisia   1    1

Then for plot use DataFrame.plot.bar

df1.plot.bar()

If need lists:

yes = df1['Yes'].tolist()
no = df1['No'].tolist()
country = df1.index.tolist()
  •  Tags:  
  • Related