Calculate the total occurrences of list values in a pandas column

I want to calculate, for each row of a pandas column, the number of occurrences of the values in a list.

lst = ['place','wait','ok','amazing','beautiful']

ID	TEXT
1	beautiful place ,me
1	ok ,good work
2	wait for me ,ok
2	amazing place
3	amazing day
3	amazing country
3	amazing world
3	thank you

The output should look like this:

ID	OCCURENCES
1	2
1	1
2	2
2	2
3	1
3	1
3	1
3	0

My solution:

df['OCCURENCES'] = pd.DataFrame([df['TEXT'].str.count(c) for c in lst]).sum()

CodePudding user response:

Split each text into words and use a set intersection for efficiency:

lst = ['place','wait','ok','amazing','beautiful']
words = set(lst)

df['OCCURENCES'] = [len(words.intersection(x)) for x in df['TEXT'].str.split(r'\W+')]

Output:

   ID                  TEXT  OCCURENCES
0    1  beautiful place ,me           2
1    1        ok ,good work           1
2    2      wait for me ,ok           2
3    2        amazing place           2
4    3          amazing day           1
5    3      amazing country           1
6    3        amazing world           1
7    3            thank you           0
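
For anyone who wants to try this end to end, here is a minimal self-contained sketch that rebuilds the sample frame from the question and applies the set-intersection approach. Note that a set intersection counts each listed word at most once per row. If repeated words should each be counted, one possible variant is a single word-bounded regex passed to str.count; the TOTAL column name and the pattern variable are made up for illustration and are not part of the original answer.

```python
import re

import pandas as pd

lst = ['place', 'wait', 'ok', 'amazing', 'beautiful']

# Sample data copied from the question.
df = pd.DataFrame({
    'ID': [1, 1, 2, 2, 3, 3, 3, 3],
    'TEXT': [
        'beautiful place ,me',
        'ok ,good work',
        'wait for me ,ok',
        'amazing place',
        'amazing day',
        'amazing country',
        'amazing world',
        'thank you',
    ],
})

# Approach from the answer: split on runs of non-word characters,
# then count how many distinct listed words appear in each row.
words = set(lst)
df['OCCURENCES'] = [
    len(words.intersection(tokens))
    for tokens in df['TEXT'].str.split(r'\W+')
]

# Variant (assumption: repeated words should each be counted).
# The \b anchors keep 'ok' from matching inside a longer word like 'book'.
pattern = r'\b(?:' + '|'.join(map(re.escape, lst)) + r')\b'
df['TOTAL'] = df['TEXT'].str.count(pattern)  # 'TOTAL' is a hypothetical column name

print(df)
```

On the sample data both columns agree, since no row repeats a listed word. They would differ on a text such as 'ok ok', where the set intersection gives 1 and the regex count gives 2.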