Here is my code:
df = pd.read_csv('my_path\\zzounds.csv')
df.head()
variation_type main_image
['yellow', 'orange'] ['https://c1.zzounds.com/media/productmedia/fit,600by600/quality,85/GIGBAR_MOVE_ON_TRIPOD_812129-0eddea01276623f9a03cbfdd86eb3bd1.jpg', 'https://c1.zzounds.com/media/productmedia/fit,600by600/quality,85/GIGBAR_MOVE_ON_TRIPOD_812129-0eddea01276623f9a03cbfdd86eb3bd1.jpg']
I tried this code:
df.explode(['variation_type','main_image'])
But it's returning the original dataframe.
CodePudding user response:
I believe it's because pandas struggles to explode multiple columns in this manner.
You can use this code to get the results I believe you are expecting:
import pandas as pd
data = {
'variation_type' : [['yellow', 'orange']],
'main_image' : [['https://c1.zzounds.com/media/productmedia/fit,600by600/quality,85/GIGBAR_MOVE_ON_TRIPOD_812129-0eddea01276623f9a03cbfdd86eb3bd1.jpg', 'https://c1.zzounds.com/media/productmedia/fit,600by600/quality,85/GIGBAR_MOVE_ON_TRIPOD_812129-0eddea01276623f9a03cbfdd86eb3bd1.jpg']]
}
df = pd.DataFrame(data)
df.apply(pd.Series.explode)
Note: this will only work if all "list" fields in a row are the same length.
CodePudding user response:
Just to narrow down the issue, note that the dataframe displayed in your question does indeed work with explode(). If your values are strings that look like lists, then as suggested by @Ynjxsjmh, it may be necessary to convert them to list values first.
Sample test code:
import pandas as pd
df = pd.DataFrame({
'variation_type':[['yellow', 'orange']],
'main_image':[['https://c1.zzounds.com/media/productmedia/fit,600by600/quality,85/GIGBAR_MOVE_ON_TRIPOD_812129-0eddea01276623f9a03cbfdd86eb3bd1.jpg', 'https://c1.zzounds.com/media/productmedia/fit,600by600/quality,85/GIGBAR_MOVE_ON_TRIPOD_812129-0eddea01276623f9a03cbfdd86eb3bd1.jpg']]})
print(df.to_string())
df = df.explode(['variation_type','main_image'])
print(df.to_string())
Input:
variation_type main_image
0 [yellow, orange] [https://c1.zzounds.com/media/productmedia/fit,600by600/quality,85/GIGBAR_MOVE_ON_TRIPOD_812129-0eddea01276623f9a03cbfdd86eb3bd1.jpg, https://c1.zzounds.com/media/productmedia/fit,600by600/quality,85/GIGBAR_MOVE_ON_TRIPOD_812129-0eddea01276623f9a03cbfdd86eb3bd1.jpg]
Output:
variation_type main_image
0 yellow https://c1.zzounds.com/media/productmedia/fit,600by600/quality,85/GIGBAR_MOVE_ON_TRIPOD_812129-0eddea01276623f9a03cbfdd86eb3bd1.jpg
0 orange https://c1.zzounds.com/media/productmedia/fit,600by600/quality,85/GIGBAR_MOVE_ON_TRIPOD_812129-0eddea01276623f9a03cbfdd86eb3bd1.jpg
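To check which case applies, a quick diagnostic along these lines may help. It is only a sketch: the frame below is a hypothetical stand-in for what read_csv tends to return (cells holding strings that merely look like lists), and the short file names are placeholders rather than the real URLs.

```python
import pandas as pd

# Hypothetical stand-in for the CSV data: each cell is a str, not a list.
df_str = pd.DataFrame({
    'variation_type': ["['yellow', 'orange']"],
    'main_image': ["['a.jpg', 'b.jpg']"],  # placeholder values
})

# Inspect the Python type of one cell; a str here means explode() has nothing to split.
first_cell = df_str.loc[0, 'variation_type']
print(type(first_cell))

# explode() treats strings as scalars, so the row count does not change.
exploded = df_str.explode(['variation_type', 'main_image'])
print(len(exploded))
```

If the printed type is str rather than list, the values need to be parsed into real lists before explode() will do anything.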
CodePudding user response:
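Since you are reading from a CSV file, you need to convert the string representations of lists into actual lists first:
import numpy as np
import pandas as pd
# Note: applymap is deprecated since pandas 2.1; DataFrame.map is the replacement there.
df[['variation_type','main_image']] = df[['variation_type','main_image']].applymap(lambda x: pd.eval(x, local_dict={'nan': np.nan}))
# or
df[['variation_type','main_image']] = df[['variation_type','main_image']].applymap(lambda x: eval(x, {'nan': np.nan}))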
df = df.explode(['variation_type','main_image'])
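As an alternative to eval, the same conversion can be sketched with ast.literal_eval from the standard library, which only parses Python literals. This is a swapped-in technique, not the code above, and it assumes the cells contain no bare nan (literal_eval would reject that). The frame and the short file names are hypothetical placeholders for the CSV contents.

```python
import ast
import pandas as pd

# Hypothetical stand-in for the frame read from the CSV: cells are strings.
df = pd.DataFrame({
    'variation_type': ["['yellow', 'orange']"],
    'main_image': ["['a.jpg', 'b.jpg']"],  # placeholder values
})

cols = ['variation_type', 'main_image']

# Parse each string into a real Python list (assumes no bare nan in the cells).
for col in cols:
    df[col] = df[col].apply(ast.literal_eval)

# With real lists in place, the multi-column explode yields one row per element.
df = df.explode(cols)
print(df.to_string())
```

Exploding several columns at once needs pandas 1.3.0 or later and requires the lists in each row to have matching lengths.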
