What is the simplest way to check whether a pandas DataFrame column contains Unicode (non-ASCII) characters?
I was going to try df['fieldname'].str.isascii, but that does not seem to exist.
CodePudding user response:
In Python 3, every str is Unicode.
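An element-wise check like df['fieldname'].str.isascii would give a Series with one True or False per row. Since that accessor method is not available to you, you can apply Python's built-in str.isascii (Python 3.7+) to each value instead, skipping values that are not str. To check whether the column has at least one ASCII str value, you can do the following: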
import pandas as pd

df = pd.DataFrame(
    {
        'text': [1, 2, '3'],  # only '3' is a str here
    }
)

# True for rows holding an ASCII-only str, False for everything else.
is_ascii_str = df['text'].apply(lambda x: isinstance(x, str) and x.isascii())

if is_ascii_str.any():
    print('at least one ascii')
else:
    print('no ascii')
# Prints 'at least one ascii', because '3' is an ASCII str.
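
If what you actually need is the opposite test, i.e. whether any value contains a character outside the ASCII range, the same idea can be inverted. This is only a sketch: has_non_ascii is a hypothetical helper name, not a pandas function, and it assumes non-str values (numbers, NaN) should simply be ignored.

```python
import pandas as pd


def has_non_ascii(series: pd.Series) -> bool:
    """Hypothetical helper: True if any str value holds a non-ASCII character."""
    # Non-str values (ints, floats, NaN) are treated as "not non-ASCII".
    flags = series.apply(lambda x: isinstance(x, str) and not x.isascii())
    return bool(flags.any())


df = pd.DataFrame({'text': [1, 'abc', 'café']})  # 'café' contains 'é'

if has_non_ascii(df['text']):
    print('column has non-ASCII text')
else:
    print('column is ASCII only')
```

Wrapping the check in a function keeps the "ignore non-str values" decision in one place, so you can change it (for example, to str(x) conversion) without touching the calling code.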
