I am trying to scrape the price of an item from a website using Python.
import requests
from bs4 import BeautifulSoup
URL = "https://..."
result = requests.get(URL)
doc = BeautifulSoup(result.text, "html.parser")
prices = doc.find_all(???)
print(prices)
I know I can replace the question marks with the exact string to look for, but I want it to find every piece of text that starts with "$".
Is this possible, and if so, how?
CodePudding user response:
Use a regular expression to match the tags whose text starts with a certain character, as below:
import re
from bs4 import BeautifulSoup
html = """
<p>$Show me</p>
<p>I am invisible</p>
<p>me too</p>
<p>$Show me too</p>
"""
soup = BeautifulSoup(html, 'html.parser')
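# Raw string avoids an invalid-escape warning; string= is the current name for the older text= argument
result = soup.find_all("p", string=re.compile(r"^\$"))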
# -> [<p>$Show me</p>, <p>$Show me too</p>]
Note that I put a backslash (\) before the $ because the dollar sign is itself a special character in regular expressions (it normally matches the end of the string). See the regular expression syntax documentation for more information.
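If you don't know in advance which tag holds the price, here is a minimal sketch that skips the tag name and searches the text nodes directly. The sample HTML and the class name in it are made up for illustration, and I'm assuming a reasonably recent Beautiful Soup where find_all accepts string=. The pattern also allows leading whitespace, since real pages often pad the text.

```python
import re
from bs4 import BeautifulSoup

# Made-up sample markup; a real page will look different
html = """
<div><span class="price"> $19.99 </span><p>Shipping: free</p><b>$5</b><p>Was $30</p></div>
"""

soup = BeautifulSoup(html, "html.parser")

# Optional leading whitespace, then a literal dollar sign
pattern = re.compile(r"^\s*\$")

# With no tag name given, find_all returns the matching text nodes themselves
matches = soup.find_all(string=pattern)

# Trim the surrounding whitespace from each matched string
prices = [m.strip() for m in matches]
print(prices)

# Assumes plain amounts like "$19.99" with no thousands separators
amounts = [float(p.lstrip("$")) for p in prices]
print(amounts)
```

"Was $30" is not matched because the pattern is anchored to the start of the text. Drop the ^ if you want a "$" anywhere in the text to count.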
