I am trying to scrape the hourly temperature from weather.com using Beautiful Soup. If I open the URL and inspect the element, the text I am looking for, 8:00 pm, is on the page. However, soup.find returns None, so the code can't find any instance of the text. I also tried weather_entry=soup.find(text="8.00") and that didn't yield any results either.
import re

import requests
from bs4 import BeautifulSoup


def weather():
    url = 'https://weather.com/weather/hourbyhour/l/823266028e3362e3a9578cfe64cb1c6ac654c492d22b41dbe3ac567cd31e1083'
    # Open the page with the GET method
    resp = requests.get(url)
    # HTTP status 200 means OK
    if resp.status_code == 200:
        soup = BeautifulSoup(resp.text, 'html.parser')
        # This line is the problem; .find("8:00") and .find(text=re.compile("8:00")) don't work either
        weather_entry = soup.find(text=re.compile("8:00 pm"))
        # Prints "None" because no matching text node is found
        print(str(weather_entry) + "\n")
        # Guard against None so the script does not crash with AttributeError
        if weather_entry is not None:
            print(weather_entry.get_text())
    else:
        print("Error")


weather()
CodePudding user response:
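I think the weather information you are trying to find is loaded by JavaScript, so it is not in the HTML that requests downloads. The inspector shows the page after the scripts have run, which is why you can see the text there. If you switch to the Debugger tab in the developer tools (I'm using Firefox), you will see a folder called hourly/assets that contains a lot of JS scripts.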
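One way to check this is to look at where the text sits in the downloaded HTML. The sketch below is only an illustration: it uses the standard-library html.parser instead of Beautiful Soup, the names TextLocator and locate are made up for this example, and sample is a hypothetical stand-in for resp.text, not the real weather.com response.

```python
from html.parser import HTMLParser


class TextLocator(HTMLParser):
    """Collect text, keeping <script> contents separate from ordinary markup text."""

    def __init__(self):
        super().__init__()
        self._in_script = False
        self.markup_text = []
        self.script_text = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        target = self.script_text if self._in_script else self.markup_text
        target.append(data)


def locate(html, needle):
    """Return 'markup', 'script', or 'absent' depending on where needle appears."""
    parser = TextLocator()
    parser.feed(html)
    parser.close()
    if needle in "".join(parser.markup_text):
        return "markup"
    if needle in "".join(parser.script_text):
        return "script"
    return "absent"


# Hypothetical stand-in for resp.text: the time only exists inside a script.
sample = (
    '<html><body><div id="app"></div>'
    '<script>window.__data = {"time": "8:00 pm"};</script>'
    "</body></html>"
)
print(locate(sample, "8:00 pm"))  # script
```

If the result for the real page is "script" or "absent", the text is not in the static markup, and a search of the parsed HTML will not find it the way the inspector does.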
I've tried to use Beautiful Soup to read weather websites before and ran into the exact same problem. The solution I found (which may not be available to you) was to ask the website for access to the raw data as JSON or through an API.
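If you do get JSON access, reading a value out of it could look roughly like this. Everything specific here is an assumption for illustration: the URL is a placeholder, the "hours", "time" and "temperature" keys are an invented payload shape, and fetch_hourly and temperature_at are hypothetical helper names. A real provider's documentation defines the actual endpoint and fields. The sketch uses urllib from the standard library; requests.get(url).json() would do the same job.

```python
import json
from urllib.request import Request, urlopen

# Hypothetical endpoint; replace with whatever the data provider gives you.
API_URL = "https://api.example.com/v1/hourly?location=PLACE_ID"


def fetch_hourly(url=API_URL, timeout=10):
    """Download and decode a JSON forecast (needs network access and a real URL)."""
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def temperature_at(forecast, label):
    """Return the temperature of the entry whose 'time' equals label, else None."""
    for entry in forecast.get("hours", []):
        if entry.get("time") == label:
            return entry.get("temperature")
    return None


# Offline demonstration with an assumed payload instead of a live request.
sample = json.loads(
    '{"hours": [{"time": "7:00 pm", "temperature": 71},'
    ' {"time": "8:00 pm", "temperature": 68}]}'
)
print(temperature_at(sample, "8:00 pm"))  # 68
```

Because the data arrives already structured, there is no HTML to parse and no dependency on how the page is rendered.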
Another solution I have used is to find a website for an amateur weather station, which is far more likely to be written in pure HTML.
