In Python with BeautifulSoup, the title element selected by the code below has this value:
<span ><!--F#f_7[0]-->4K Photon MONO<!--F/--></span>
When I call title.get_text() to get the text 4K Photon MONO, it fails. Can anyone help? Thanks!
import requests
from bs4 import BeautifulSoup
url='https://www.ebay.com/itm/284163810059'
req=requests.get(url)
soup=BeautifulSoup(req.text,'lxml')
title=soup.select('div > div:nth-child(2) > div:nth-child(4) > div > span > div > span')
title_text= title.get_text()
CodePudding user response:
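This happens because select() returns a list of matching elements, not a single element, so the result has no get_text() method. To fix it, call get_text() on each element and join the results: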
import requests
from bs4 import BeautifulSoup
url='https://www.ebay.com/itm/284163810059'
req=requests.get(url)
soup=BeautifulSoup(req.text,'lxml')
title=soup.select('div > div:nth-child(2) > div:nth-child(4) > div > span > div > span')
text = ''.join(t.get_text() for t in title)
print(text)
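To see the cause without hitting the network, here is a minimal sketch that parses markup similar to the snippet in the question. The wrapping div and the shortened selector are assumptions made for illustration, and html.parser is used only to avoid the lxml dependency.

```python
from bs4 import BeautifulSoup

# Simplified stand-in for the real page (assumption: one span inside a div).
html = '<div><span><!--F#f_7[0]-->4K Photon MONO<!--F/--></span></div>'
soup = BeautifulSoup(html, 'html.parser')

# select() always returns a list-like ResultSet, even for a single match.
matches = soup.select('div > span')
print(len(matches))

# Calling get_text() on the ResultSet itself raises AttributeError.
error = None
try:
    matches.get_text()
except AttributeError as exc:
    error = exc
    print('AttributeError:', exc)

# Call get_text() on each element instead, or index into the list.
texts = [tag.get_text() for tag in matches]
first_text = matches[0].get_text() if matches else None
print(texts, first_text)
```

Note that get_text() skips the HTML comments inside the span, so only the visible text is returned.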
CodePudding user response:
It can also be done with the soup.find() function, which returns a single element (or None if nothing matches):
import requests
from bs4 import BeautifulSoup
url='https://www.ebay.com/itm/284163810059'
req=requests.get(url)
soup=BeautifulSoup(req.text,'lxml')
title=soup.find("span", {"itemprop" : "model"})
title_text= "" if title is None else title.get_text()
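As a rough illustration of how find() behaves, the sketch below runs it against a small inline document. The markup is an assumption (the span from the question with an itemprop="model" attribute added), and the "brand" lookup is a made-up example of a value that is not present.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the question's span with an itemprop attribute.
html = '<div><span itemprop="model"><!--F#f_7[0]-->4K Photon MONO<!--F/--></span></div>'
soup = BeautifulSoup(html, 'html.parser')

# find() with an attribute dict and select_one() with a CSS attribute
# selector locate the same element in the parsed tree.
by_find = soup.find('span', {'itemprop': 'model'})
by_css = soup.select_one('span[itemprop="model"]')
print(by_find is by_css)

# When nothing matches, find() returns None rather than raising.
missing = soup.find('span', {'itemprop': 'brand'})
missing_text = '' if missing is None else missing.get_text()
print(repr(missing_text))
```

This is why the None check before get_text() matters: without it, a page that lacks the element would raise AttributeError on None.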
CodePudding user response:
The main issue is that select() returns a ResultSet, so you cannot use get_text() or .text until you iterate over it and call the method on each element. Another issue is your selector; it could be more specific.
So how do you fix it?
Instead of select(), use select_one(), which returns a single element, so you can call get_text() on it directly:
soup.select_one('[itemprop="model"]')
Be aware that you should always check that the element you are trying to select is actually available:
title = e.get_text() if (e:= soup.select_one('[itemprop="model"]')) else None
Note: the walrus operator requires Python 3.8 or higher.
Alternative for Python < 3.8:
title = soup.select_one('[itemprop="model"]').get_text() if soup.select_one('[itemprop="model"]') else None
Example
import requests
from bs4 import BeautifulSoup
url='https://www.ebay.com/itm/284163810059'
req=requests.get(url)
soup=BeautifulSoup(req.text,'lxml')
title = e.get_text() if (e:= soup.select_one('[itemprop="model"]')) else None
print(title)
Output
4K Photon MONO
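One possible way to package this is to keep parsing separate from downloading, so the extraction can be checked on a string without a network call. This is only a sketch: the function names extract_model and fetch_model are hypothetical, the 10-second timeout is an arbitrary choice, and the sample markup is assumed rather than copied from the live page.

```python
from typing import Optional

import requests
from bs4 import BeautifulSoup


def extract_model(html: str) -> Optional[str]:
    """Return the text of the first [itemprop="model"] element, or None."""
    soup = BeautifulSoup(html, 'html.parser')
    element = soup.select_one('[itemprop="model"]')
    return element.get_text(strip=True) if element is not None else None


def fetch_model(url: str) -> Optional[str]:
    """Download the page and extract the model; raises on HTTP errors."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return extract_model(response.text)


# Offline check on assumed markup; no request is sent here.
sample = '<span itemprop="model"><!--F#f_7[0]-->4K Photon MONO<!--F/--></span>'
found = extract_model(sample)
not_found = extract_model('<p>no model here</p>')
print(found, not_found)
```

For the live page, calling fetch_model('https://www.ebay.com/itm/284163810059') would apply the same extraction, assuming the page still exposes an itemprop="model" element.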
