I am scraping web content with BeautifulSoup in Python, and I would like to manipulate the following strings:
'Induktora 28" 36V/14 Ah | 16.5" Bordo'
'Induktora 28" 36V/14 Ah | 18" Bordo'
'Induktora 26" 36V/14 Ah | 16.5" Black Matte/Red'
'Induktora 26" 36V/14 Ah | 18" Black Matte/Red'
I would like to get:
- the word after "|" that ends with a quote
- the word(s) after "|" and after that quote (if there are any)
Example:
str='Induktora 28" 36V/14 Ah | 16.5" Bordo'
size='16.5"'
color='Bordo'
newtitle='Induktora 28" 36V/14 Ah'
str='Induktora 26" 36V/14 Ah | 18" Black Matte/Red'
size='18"'
color='Black Matte/Red'
newtitle='Induktora 26" 36V/14 Ah'
CodePudding user response:
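You'd probably use the built-in re module for that. Your pattern would probably look something like \| ([\d\.]+)" (.*)$. The first group captures the size digits before the quote, and the second group captures whatever follows (the color).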
If your regex pattern isn't doing what you expect, you can debug it at a site like https://pythex.org/.
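As a rough sketch of how the re approach could be applied to the strings in the question: the names PATTERN and split_title are made up for illustration, and the pattern is a slightly extended variant that also captures the title before the "|" and keeps the trailing quote on the size. It assumes every string contains exactly one " | " separator followed by a numeric size ending in a double quote.

```python
import re

# Hypothetical pattern (an assumption, not the only way to write it):
#   title -> everything before the "|"
#   size  -> digits and dots followed by a double quote, e.g. 16.5"
#   color -> whatever remains after the size (may be empty)
PATTERN = re.compile(r'^(?P<title>.*?)\s*\|\s*(?P<size>[\d.]+")\s*(?P<color>.*)$')


def split_title(text):
    """Return (newtitle, size, color), or None if the text does not match."""
    match = PATTERN.match(text)
    if match is None:
        return None
    return (
        match.group("title"),
        match.group("size"),
        match.group("color").strip(),
    )


newtitle, size, color = split_title('Induktora 26" 36V/14 Ah | 18" Black Matte/Red')
print(newtitle)
print(size)
print(color)
```

Returning None for strings that do not match lets the caller decide how to handle titles without a size, rather than raising an error in the middle of a scrape.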
