Summarize the number of appearances of different values across different columns in a DataFrame

Time:02-06

For example, I have this data:

star1      star2    star3      star4
Francis    Ron      Robert     Hen
Francis    Ron      Shir       Coppola
Shir       Ron      Francis    Coppola

I want to get:

Francis,3             
Coppola,2 
Ron,3
Shir,2
Robert,1
Hen,1

How can I do this?

CodePudding user response:

As @Michael Szczesny suggested in the comments, you can count how often each distinct value appears across all columns of the DataFrame with df.stack().value_counts(). First, build the example data:

import pandas as pd

df = pd.DataFrame({
         'star1': ['Francis', 'Francis', 'Shir'],
         'star2': ['Ron', 'Ron', 'Ron'],
         'star3': ['Robert', 'Shir', 'Francis'],
         'star4': ['Hen', 'Coppola', 'Coppola']
})

print(df)
#    star1   star2   star3    star4
#0  Francis   Ron   Robert      Hen
#1  Francis   Ron     Shir  Coppola
#2     Shir   Ron  Francis  Coppola

Then stack all columns into a single Series and count the values:

df.stack().value_counts()
#Francis    3
#Ron        3
#Shir       2
#Coppola    2
#Robert     1
#Hen        1
#dtype: int64
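
To print the result in the Name,count form asked for in the question, one option is to loop over the counts. This is only a minimal sketch: it rebuilds the example df so it runs on its own, the variable name counts is just illustrative, and names with equal counts may come out in a different order than in the question.

```python
import pandas as pd

# Same example data as above, repeated so this snippet is self-contained.
df = pd.DataFrame({
    'star1': ['Francis', 'Francis', 'Shir'],
    'star2': ['Ron', 'Ron', 'Ron'],
    'star3': ['Robert', 'Shir', 'Francis'],
    'star4': ['Hen', 'Coppola', 'Coppola']
})

# Count every value across all columns (illustrative variable name).
counts = df.stack().value_counts()

# Print one "Name,count" line per distinct value.
for name, count in counts.items():
    print(f"{name},{count}")
```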

You can also count the distinct values within each column separately:

for col in df.columns:
    print(df[col].value_counts())
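
If you would rather see the per-column counts side by side in one table, a common idiom is to apply value_counts to every column. Again a minimal sketch that rebuilds the example df: per_column is a hypothetical name, and fillna(0) assumes that a name missing from a column should count as zero.

```python
import pandas as pd

# Same example data as above, repeated so this snippet is self-contained.
df = pd.DataFrame({
    'star1': ['Francis', 'Francis', 'Shir'],
    'star2': ['Ron', 'Ron', 'Ron'],
    'star3': ['Robert', 'Shir', 'Francis'],
    'star4': ['Hen', 'Coppola', 'Coppola']
})

# Rows are the distinct names, columns are star1..star4.
# A name that never occurs in a column yields NaN, replaced here by 0.
per_column = df.apply(pd.Series.value_counts).fillna(0).astype(int)
print(per_column)

# Summing across the columns gives the overall totals again.
print(per_column.sum(axis=1))
```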