I have a pandas DataFrame with a single column of strings. I need to turn it into two columns that hold every pair of values, for later pairwise comparison.
Example. What I have (one column):
Abc
Bdf
Ftp
...
What I need (two different columns):
Abc, Bdf
Abc, Ftp
Bdf, Ftp
...
I've searched a lot of sources and concluded that I need itertools, but how do I use it here?
CodePudding user response:
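Use shift.
The dataframe:
import pandas as pd
df = pd.DataFrame(columns=['col1'], data=['Abc', 'Bdf', 'Ftp'])
and the answer:
df['col2'] = df['col1'].shift()  # each row gets the value from the row above
df = df.ffill(axis=1)  # fill the NaN in the first row from col1; fillna(method=...) is deprecated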
output:
  col1 col2
0  Abc  Abc
1  Bdf  Abc
2  Ftp  Bdf
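Note that shift pairs each row only with the row directly above it, so it does not produce every pair (Abc, Ftp is missing above). A quick check, as a sketch with a made-up four-element column:

```python
import pandas as pd

# Sample data, made up for illustration (one more row than the example above).
df = pd.DataFrame({"col1": ["Abc", "Bdf", "Ftp", "Xyz"]})

df["col2"] = df["col1"].shift()  # value from the previous row, NaN in row 0
df = df.ffill(axis=1)            # row 0: copy col1 into the empty col2

# Collect the (col1, col2) pairs that shift produced.
pairs = set(zip(df["col1"], df["col2"]))

# Non-adjacent values never end up in the same row.
print(("Ftp", "Abc") in pairs)
print(len(pairs))
```

If only neighbouring rows need to be compared, this is enough; for all pairs, a combinations-based approach is the better fit.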
CodePudding user response:
You can use itertools.combinations:
from itertools import combinations
df2 = pd.DataFrame(combinations(df['col1'], 2),
                   columns=['col1', 'col2'])
Output:
  col1 col2
0  Abc  Bdf
1  Abc  Ftp
2  Bdf  Ftp
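For reference, a self-contained sketch of the same idea that can be run as is. It assumes the column is called col1 as above; the helper name all_pairs is made up for illustration and is not part of pandas or itertools.

```python
from itertools import combinations

import pandas as pd


def all_pairs(series, names=("col1", "col2")):
    """Return a DataFrame with one row per unordered pair of values in `series`.

    `all_pairs` is a hypothetical helper name used only for this example.
    """
    # combinations(..., 2) yields each pair exactly once, in order of appearance.
    return pd.DataFrame(list(combinations(series, 2)), columns=list(names))


df = pd.DataFrame({"col1": ["Abc", "Bdf", "Ftp"]})
df2 = all_pairs(df["col1"])
print(df2)
```

A column with n values gives n * (n - 1) / 2 rows, so the result grows quadratically with the length of the input.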
