I have a CSV file with multiple columns. One of the columns holds a string that mixes data types, letters and floats. These are the ProductName and Price, e.g. Coffee - 2.50, Tea - 3.00, etc. However, I cannot figure out how to separate the price (float) from the string. I believe putting it into dictionary format is best, i.e. {Product(str): Price(float)}.
Column example: "Large Flavoured iced latte - Caramel - 3.25, Regular Flavoured iced latte - Hazelnut - 2.75, Regular Flavoured iced latte - Caramel - 2.75, Large Flavoured iced latte - Hazelnut - 3.25, Regular Flavoured latte - Hazelnut - 2.55, Regular Flavoured iced latte - Hazelnut - 2.75"
I tried:
my_list=[i.split(',') for i in my_list]
print(my_list)
But after this I have a list like the one below, and I do not know how to process its elements further:
[['Large Flavoured iced latte - Caramel - 3.25', ' Regular Flavoured iced latte - Hazelnut - 2.75', ' Regular Flavoured iced latte - Caramel - 2.75', ' Large Flavoured iced latte - Hazelnut - 3.25', ' Regular Flavoured latte - Hazelnut - 2.55', ' Regular Flavoured iced latte - Hazelnut - 2.75']]
Thank you in advance
CodePudding user response:
Using re.findall here is one approach:
import re
inp = "Large Flavoured iced latte - Caramel - 3.25, Regular Flavoured iced latte - Hazelnut - 2.75, Regular Flavoured iced latte - Caramel - 2.75, Large Flavoured iced latte - Hazelnut - 3.25, Regular Flavoured latte - Hazelnut - 2.55, Regular Flavoured iced latte - Hazelnut - 2.75"
d = dict(re.findall(r'(.*?)\s*-\s*(\d+(?:\.\d+)?),?\s*', inp))
print(d)
This prints:
{'Large Flavoured iced latte - Caramel': '3.25',
'Regular Flavoured iced latte - Hazelnut': '2.75',
'Regular Flavoured latte - Hazelnut': '2.55',
'Regular Flavoured iced latte - Caramel': '2.75',
'Large Flavoured iced latte - Hazelnut': '3.25'}
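Note that the values in this dictionary are strings. If floats are wanted, as in the question, one possible sketch is to convert each captured price while building the dictionary. The input is shortened here, and the names `pattern` and `prices` are illustrative rather than part of the answer above:

```python
import re

inp = ("Large Flavoured iced latte - Caramel - 3.25, "
       "Regular Flavoured iced latte - Hazelnut - 2.75, "
       "Regular Flavoured latte - Hazelnut - 2.55")

# Lazily capture the product name, then the first '- <number>' that follows it
pattern = r'(.*?)\s*-\s*(\d+(?:\.\d+)?),?\s*'

# Convert each captured price string to float while building the dict
prices = {name: float(price) for name, price in re.findall(pattern, inp)}
print(prices)
```

This sketch assumes product names never contain a '-' followed directly by a digit; such a name would be cut short at that point.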
CodePudding user response:
Working from your mylist, you could do this:
mylist = [['Large Flavoured iced latte - Caramel - 3.25',
           ' Regular Flavoured iced latte - Hazelnut - 2.75',
           ' Regular Flavoured iced latte - Caramel - 2.75',
           ' Large Flavoured iced latte - Hazelnut - 3.25',
           ' Regular Flavoured latte - Hazelnut - 2.55',
           ' Regular Flavoured iced latte - Hazelnut - 2.75']]
# Split each entry on '-' into [product, flavour, price]
items = [thing.split('-') for thing in mylist[0]]
# Join product and flavour as the key; convert the price to float
items = [(thing[0] + thing[1], float(thing[2])) for thing in items]
mydict = {key: value for (key, value) in items}
This assumes that every entry uses '-' as the separator and always has exactly three parts. Hope it works! :)
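If the number of '-' separated parts can vary, a possible variant is to split only on the last ' - ' with str.rsplit, which also makes it easy to strip stray whitespace from the keys. A minimal sketch, assuming the price is always the final field (the names `entries` and `prices` are illustrative):

```python
entries = ['Large Flavoured iced latte - Caramel - 3.25',
           ' Regular Flavoured latte - Hazelnut - 2.55',
           ' Tea - 3.00']

prices = {}
for entry in entries:
    # Split once, from the right: everything before the last ' - ' is the name
    name, price = entry.strip().rsplit(' - ', 1)
    prices[name] = float(price)

print(prices)
```

Because the split happens only once, an entry with two parts (Tea) and one with three parts (the lattes) are handled the same way.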
CodePudding user response:
If the strings all share the same format, you can simply split on the separator characters. If the format is less regular, use regular expressions.
For example, with the format:
string - word - price
s = ' Regular Flavoured latte - Hazelnut - 2.55'
you could split this string as
s.split('-')
and get
[' Regular Flavoured latte ', ' Hazelnut ', ' 2.55']
Applying the same split to every entry (here mylist is taken to be a flat list of such strings), you can build the dictionary:
mylist2 = [x.split('-') for x in mylist]
mydict = {'-'.join(x[:2]):float(x[2]) for x in mylist2}
which gives:
{'Large Flavoured iced latte - Caramel ': 3.25, ' Regular Flavoured iced latte - Hazelnut ': 2.75, ' Regular Flavoured iced latte - Caramel ': 2.75, ' Large Flavoured iced latte - Hazelnut ': 3.25, ' Regular Flavoured latte - Hazelnut ': 2.55}
You can also use regular expressions (the 're' library).
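As a rough sketch of that route, each comma-separated entry can be checked against a pattern and skipped if it does not fit. This assumes the 'name - price' layout from the question; `ENTRY_RE` and `parse_entries` are hypothetical names chosen for illustration:

```python
import re

# Name (anything), then '-', then a price such as 2.55 or 3
ENTRY_RE = re.compile(r'\s*(.+?)\s*-\s*(\d+(?:\.\d+)?)\s*')

def parse_entries(text):
    """Return {name: price} for every comma-separated entry that matches."""
    result = {}
    for entry in text.split(','):
        match = ENTRY_RE.fullmatch(entry)
        if match:  # skip entries that do not look like 'name - price'
            result[match.group(1)] = float(match.group(2))
    return result

parsed = parse_entries(
    "Coffee - 2.50, Tea - 3.00, Regular Flavoured latte - Hazelnut - 2.55, oops")
print(parsed)
```

Using fullmatch on each entry means a malformed item such as 'oops' is dropped rather than raising an error; whether silent skipping is appropriate depends on how clean the CSV column is.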
