I'm trying to scrape the data-ppu value from this line of HTML code called trade_data:
<input class="tradeForm" data-id="10397992" data-ppu="3893" data-toggle="tooltip" maximum="16450" name="rcustomamount" title="Enter Your Desired Amount" type="number" value="16450"/>
I'm using Python 3 and Beautiful Soup. Here's the code I've tried:
for index, trade_data in enumerate(trade_data):
    price = trade_data.find('data-ppu')
    print(price)
However, this returns nothing. Any help is greatly appreciated!
CodePudding user response:
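What you've asked for is a tag named <data-ppu>, of which there are none. data-ppu is an attribute, not a tag, so you need to search the attributes of the tag:
for part in trade_data:
    # find_all searches the descendants of part, so part must contain the <input> tags
    price = part.find_all(lambda tag: tag.name == 'input' and 'data-ppu' in tag.attrs)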
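If each item in trade_data is already the <input> tag itself, as the question suggests, the attribute can be read straight off the tag. The following is only a minimal sketch under that assumption: the markup is shortened from the question, the names html, soup and prices are made up for illustration, and html.parser is used just to avoid an extra dependency.

```python
from bs4 import BeautifulSoup

# Sample markup, assumed for illustration (shortened from the question).
html = '<input class="tradeForm" data-id="10397992" data-ppu="3893" type="number" value="16450"/>'
soup = BeautifulSoup(html, "html.parser")

# Keep only <input> tags that actually carry a data-ppu attribute.
trade_data = soup.find_all("input", attrs={"data-ppu": True})

prices = []
for tag in trade_data:
    # Attributes behave like dictionary keys; .get() returns None if the key is missing.
    prices.append(tag.get("data-ppu"))

print(prices)
```

Note that the values come back as strings, so they would need int() or float() before any arithmetic.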
CodePudding user response:
To get the attribute, do it like this:
- Since data-ppu is an attribute of the <input> tag, you need to first select the tag and then extract its attribute.
Selecting the <input> tag
x = soup.find('input')
Extracting the attribute data-ppu
x['data-ppu']
Here is the complete code:
from bs4 import BeautifulSoup
s = """
<input class="tradeForm" data-id="10397992" data-ppu="3893" data-toggle="tooltip" maximum="16450" name="rcustomamount" title="Enter Your Desired Amount" type="number" value="16450"/>
"""
soup = BeautifulSoup(s,'lxml')
x = soup.find('input')
print(x['data-ppu'])
3893
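When the page has several inputs and only some of them carry data-ppu, a CSS attribute selector may be more convenient than find. This is a rough sketch, not the only way to do it: the second <input> and its data-id are hypothetical, added only to show an element being skipped, and the names html, missing and ppu_by_id are assumptions for illustration.

```python
from bs4 import BeautifulSoup

# Two inputs: the second one (hypothetical) has no data-ppu attribute.
html = """
<input class="tradeForm" data-id="10397992" data-ppu="3893" type="number"/>
<input class="tradeForm" data-id="10397993" type="number"/>
"""
soup = BeautifulSoup(html, "html.parser")

# find() matches tag names, so this looks for a <data-ppu> element and finds none.
missing = soup.find("data-ppu")

# The selector input[data-ppu] matches only inputs that have the attribute.
ppu_by_id = {
    tag["data-id"]: int(tag["data-ppu"])
    for tag in soup.select("input[data-ppu]")
}

print(missing)
print(ppu_by_id)
```

Here missing is None, which is why the original attempt printed nothing useful, and ppu_by_id holds one integer price per matching input.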
