!pip install emot
import re
from emot.emo_unicode import EMOTICONS_EMO

def convert_emoticons(text):
    for emot in EMOTICONS_EMO:
        text = re.sub(u'\(' + emot + '\)', "_".join(EMOTICONS_EMO[emot].replace(",","").split()), text)
    return text

text = "Hello :-) :-)"
convert_emoticons(text)
I'm trying to run the code above in Google Colab, but it raises the following error: unbalanced parenthesis at position 4
My understanding from the re module documentation is that '\(any_expression\)' is the correct form, but I still get the error. So I have tried replacing '\(' + emot + '\)' with:
'(' + emot + ')', which gives the same error
'[' + emot + ']', which gives the following output: Hello Happy_face_or_smiley-Happy_face_or_smiley Happy_face_or_smiley-Happy_face_or_smiley
The correct output should be Hello Happy_face_smiley Happy_face_smiley for text = "Hello :-) :-)"
Can someone help me fix the problem?
Answer:
This is tricky with a regex, because you first need to escape the regex metacharacters contained in the emoticons, such as the parentheses in :) and :(. Those unescaped parentheses are why you get the unbalanced parenthesis error. So you'd need to do something like this first:
>>> print(re.sub(r'([()...])', r'%s\1' % '\\\\', ':)'))
:\)
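If you do want to stay with re.sub, the standard library's re.escape does this escaping for you. Here is a minimal sketch, assuming a small hand-written SAMPLE_EMOTICONS dict as a stand-in for EMOTICONS_EMO (it is hypothetical, only there so the snippet runs without emot installed). It also drops the literal '\(' and '\)' around the emoticon, since the input text has no parentheses around :-).

```python
import re

# Hypothetical stand-in for emot's EMOTICONS_EMO (assumption, not the real table).
SAMPLE_EMOTICONS = {
    ":-)": "Happy face smiley",
    ":-(": "Frown, sad",
}

def convert_emoticons(text, mapping=SAMPLE_EMOTICONS):
    for emoticon, meaning in mapping.items():
        # Drop commas and join the words with underscores, as in the question.
        label = "_".join(meaning.replace(",", "").split())
        # re.escape backslash-escapes regex metacharacters such as ( and ).
        text = re.sub(re.escape(emoticon), label, text)
    return text

print(convert_emoticons("Hello :-) :-)"))
# Hello Happy_face_smiley Happy_face_smiley
```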
But I'd suggest doing a plain string replacement instead, since you already have a mapping that you're iterating through. So we'd have:
from emot.emo_unicode import EMOTICONS_EMO

def convert_emoticons(text):
    for emot in EMOTICONS_EMO:
        text = text.replace(emot, EMOTICONS_EMO[emot].replace(" ","_"))
    return text

text = "Hello :-) :-)"
convert_emoticons(text)
# 'Hello Happy_face_smiley Happy_face_smiley'
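One thing to watch with replacing one emoticon at a time: a shorter emoticon can match inside a longer one (for example :-) inside :-))), so the result can depend on the order of the mapping. A possible way around that, sketched here under assumptions, is to build one alternation pattern with the longest emoticons first and replace in a single pass. SAMPLE_EMOTICONS and build_converter are hypothetical names for illustration, not part of emot.

```python
import re

# Hypothetical stand-in for emot's EMOTICONS_EMO (assumption, not the real table).
SAMPLE_EMOTICONS = {
    ":-)": "Happy face smiley",
    ":-))": "Very happy",
}

def build_converter(mapping):
    # Precompute the underscore-joined label for each emoticon.
    labels = {
        emo: "_".join(desc.replace(",", "").split())
        for emo, desc in mapping.items()
    }
    # Longest first, so ":-))" is tried before ":-)".
    ordered = sorted(labels, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(emo) for emo in ordered))

    def convert(text):
        # A function replacement avoids any backslash handling in the labels.
        return pattern.sub(lambda m: labels[m.group(0)], text)

    return convert

convert_emoticons = build_converter(SAMPLE_EMOTICONS)
print(convert_emoticons("Hi :-)) and :-)"))
# Hi Very_happy and Happy_face_smiley
```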
