Calculate the total occurrences of list values in a pandas column

Time:02-09

I want to count, for each row, how many values from a list occur in a pandas column:

lst = ['place','wait','ok','amazing','beautiful']
ID TEXT
1 beautiful place ,me
1 ok ,good work
2 wait for me ,ok
2 amazing place
3 amazing day
3 amazing country
3 amazing world
3 thank you

The output should look like this:

ID OCCURENCES
1 2
1 1
2 2
2 2
3 1
3 1
3 1
3 0

My solution:

df['occurences'] = pd.DataFrame([df['TEXT'].str.count(c) for c in lst]).sum()

CodePudding user response:

Split the text into words and use a set intersection for efficiency:

lst = ['place','wait','ok','amazing','beautiful']
words = set(lst)

df['OCCURENCES'] = [len(words.intersection(x)) for x in df['TEXT'].str.split(r'\W+')]

Output:

   ID                  TEXT  OCCURENCES
0    1  beautiful place ,me           2
1    1        ok ,good work           1
2    2      wait for me ,ok           2
3    2        amazing place           2
4    3          amazing day           1
5    3      amazing country           1
6    3        amazing world           1
7    3            thank you           0
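
For anyone who wants to reproduce this, here is a minimal self-contained sketch of the set-intersection approach. It assumes the DataFrame is built by hand from the sample rows in the question and that the text is split on runs of non-word characters (the pattern r'\W+'); the names `tokens` and `row` are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [1, 1, 2, 2, 3, 3, 3, 3],
    "TEXT": [
        "beautiful place ,me",
        "ok ,good work",
        "wait for me ,ok",
        "amazing place",
        "amazing day",
        "amazing country",
        "amazing world",
        "thank you",
    ],
})

lst = ["place", "wait", "ok", "amazing", "beautiful"]
words = set(lst)  # set membership tests are fast

# Split each row on runs of non-word characters (spaces, commas, ...).
# A multi-character pattern is treated as a regular expression by default.
tokens = df["TEXT"].str.split(r"\W+")

# For each row, count how many distinct target words appear among its tokens.
df["OCCURENCES"] = [len(words.intersection(row)) for row in tokens]

print(df)
```

Note that a set intersection counts each distinct word at most once per row, so a row such as "ok ok" would score 1, and the match is case-sensitive. If repeated words should be counted separately, something like sum(w in words for w in row) over the tokens would be one option.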