Separate CSV string of letters and floats

Time:01-28

I have a CSV file with multiple columns. One of the columns holds a string that mixes data types, letters and floats: the product name and its price, e.g. Coffee - 2.50, Tea - 3.00, etc. However, I cannot figure out how to separate the price (float) from the rest of the string. I believe putting it into a dictionary is best, i.e. {Product (str): Price (float)}.

Column example: "Large Flavoured iced latte - Caramel - 3.25, Regular Flavoured iced latte - Hazelnut - 2.75, Regular Flavoured iced latte - Caramel - 2.75, Large Flavoured iced latte - Hazelnut - 3.25, Regular Flavoured latte - Hazelnut - 2.55, Regular Flavoured iced latte - Hazelnut - 2.75"

I tried:

my_list=[i.split(',') for i in my_list]

print(my_list)

But after this I have the list below and do not know how to process its elements further:

[['Large Flavoured iced latte - Caramel - 3.25', ' Regular Flavoured iced latte - Hazelnut - 2.75', ' Regular Flavoured iced latte - Caramel - 2.75', ' Large Flavoured iced latte - Hazelnut - 3.25', ' Regular Flavoured latte - Hazelnut - 2.55', ' Regular Flavoured iced latte - Hazelnut - 2.75']]

Thank you in advance.

CodePudding user response:

One approach is to use re.findall:

import re

inp = "Large Flavoured iced latte - Caramel - 3.25, Regular Flavoured iced latte - Hazelnut - 2.75, Regular Flavoured iced latte - Caramel - 2.75, Large Flavoured iced latte - Hazelnut - 3.25, Regular Flavoured latte - Hazelnut - 2.55, Regular Flavoured iced latte - Hazelnut - 2.75"
# Each match captures (product name, price); dict() keeps one entry per distinct name
d = dict(re.findall(r'(.*?)\s*-\s*(\d+(?:\.\d+)?),?\s*', inp))
print(d)

This prints:

{'Large Flavoured iced latte - Caramel': '3.25',
 'Regular Flavoured iced latte - Hazelnut': '2.75',
 'Regular Flavoured latte - Hazelnut': '2.55',
 'Regular Flavoured iced latte - Caramel': '2.75',
 'Large Flavoured iced latte - Hazelnut': '3.25'}
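Note that the prices in that dictionary are still strings. If floats are needed, as the question asks, one possible sketch is to convert each captured price while building the dictionary. The shortened input and the names pairs and prices are only illustrative.

```python
import re

inp = ("Large Flavoured iced latte - Caramel - 3.25, "
       "Regular Flavoured iced latte - Hazelnut - 2.75, "
       "Regular Flavoured latte - Hazelnut - 2.55")

# Pattern: product name, then ' - ', then a number such as 3.25.
# Each match is a (name, price) tuple of strings.
pairs = re.findall(r'(.*?)\s*-\s*(\d+(?:\.\d+)?),?\s*', inp)

# Convert the price string to a float while building the dict.
prices = {name: float(price) for name, price in pairs}
print(prices)
```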

CodePudding user response:

Starting from the list you already have, you could do this:

mylist = [['Large Flavoured iced latte - Caramel - 3.25',
           ' Regular Flavoured iced latte - Hazelnut - 2.75',
           ' Regular Flavoured iced latte - Caramel - 2.75',
           ' Large Flavoured iced latte - Hazelnut - 3.25',
           ' Regular Flavoured latte - Hazelnut - 2.55',
           ' Regular Flavoured iced latte - Hazelnut - 2.75']]

# Split each entry into its three parts: [product, flavour, price]
items = [thing.split('-') for thing in mylist[0]]
# Rejoin the two name parts and convert the price to a float
items = [(thing[0] + thing[1], float(thing[2])) for thing in items]

mydict = {key: value for (key, value) in items}

This assumes that '-' separates the parts of every entry and that each entry always has exactly three parts. Hope it works! :)
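If a product name might contain more or fewer hyphens than that, a possible variation is to split only once, from the right, so that whatever follows the last ' - ' is treated as the price. This is a minimal sketch; the entries list and the prices name are illustrative, with 'Tea - 3.00' borrowed from the question's own example.

```python
entries = ['Large Flavoured iced latte - Caramel - 3.25',
           ' Regular Flavoured latte - Hazelnut - 2.55',
           ' Tea - 3.00']

prices = {}
for entry in entries:
    # Strip stray spaces, then split once from the right so that only
    # the final ' - ' separates the name from the price.
    name, price = entry.strip().rsplit(' - ', 1)
    prices[name] = float(price)

print(prices)
```

Stripping each entry first also removes the leading space left behind by split(',').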

CodePudding user response:

If the strings all follow a similar format, you can simply split on the separator characters. If the format is more complicated, use regular expressions.

For example, take an entry of the form:

string - word - price

s = ' Regular Flavoured latte - Hazelnut - 2.55'

You could split this string with

s.split('-')

and get

[' Regular Flavoured latte ', ' Hazelnut ', ' 2.55']

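Applied to the whole list (mylist[0] is the inner list from the question):

# Split every entry on '-', then rejoin the name parts and convert the price
mylist2 = [x.split('-') for x in mylist[0]]
mydict = {'-'.join(x[:2]): float(x[2]) for x in mylist2}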

{'Large Flavoured iced latte - Caramel ': 3.25, ' Regular Flavoured iced latte - Hazelnut ': 2.75, ' Regular Flavoured iced latte - Caramel ': 2.75, ' Large Flavoured iced latte - Hazelnut ': 3.25, ' Regular Flavoured latte - Hazelnut ': 2.55}

You can also use regular expressions (the 're' library).
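As a rough sketch of the regular-expression route, the following checks each entry against an assumed 'name - price' pattern and raises an error for anything that does not fit, rather than failing later inside float(). The helper parse_entry and the pattern ENTRY are hypothetical names, not part of any library.

```python
import re

# Assumed format: optional spaces, a name, a '-', then a price such as 2.55.
ENTRY = re.compile(r'\s*(.+?)\s*-\s*(\d+(?:\.\d+)?)\s*')

def parse_entry(entry):
    """Return (name, price) for one entry, or raise ValueError if malformed."""
    match = ENTRY.fullmatch(entry)
    if match is None:
        raise ValueError(f"unexpected entry format: {entry!r}")
    return match.group(1), float(match.group(2))

name, price = parse_entry(' Regular Flavoured latte - Hazelnut - 2.55')
print(name, price)
```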
