Trying to scrape a table from a website with <div> tags

Time:01-25

I am trying to scrape this table https://momentranks.com/topshot/account/mariodustice?limit=250

I have tried this:

import requests
from bs4 import BeautifulSoup
url = 'https://momentranks.com/topshot/account/mariodustice?limit=250'
page = requests.get(url)
soup = BeautifulSoup(page.content, 'lxml')
table = soup.find_all('table', attrs={'class':'Table_tr__1JI4P'})

But it returns an empty list. Can someone give advice on how to approach this?

CodePudding user response:

The data is loaded into the page by JavaScript, so requests alone cannot see it. You can use Selenium instead. Keep in mind that Selenium's driver.get doesn't wait for content that JavaScript adds after the initial page load, which means you need to wait before reading the page source.
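One way to check whether the table is really missing from the raw response is to count the <table> tags in the HTML that requests receives. The sketch below uses only the standard library's html.parser; the names TagCounter and count_tag are made up for this illustration and are not part of any library.

```python
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Count how often each start tag occurs in an HTML string."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def handle_starttag(self, tag, attrs):
        # html.parser passes tag names in lowercase
        self.counts[tag] = self.counts.get(tag, 0) + 1


def count_tag(html, tag):
    """Return the number of `tag` start tags found in `html`."""
    parser = TagCounter()
    parser.feed(html)
    parser.close()
    return parser.counts.get(tag, 0)
```

If count_tag(requests.get(url).text, 'table') returns 0 while the browser shows a table, the table is most likely built client-side after the page loads.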

Here is a starting point with Selenium:

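import time
from bs4 import BeautifulSoup
from selenium import webdriver
driver = webdriver.Chrome() # assumes Chrome is installed; any other Selenium driver works too
url = 'https://momentranks.com/topshot/account/mariodustice?limit=250'
driver.get(url) # returns None; the rendered HTML is in driver.page_source
time.sleep(5) # edit the time of this depending on your case (in seconds)
soup = BeautifulSoup(driver.page_source, 'lxml')
table = soup.find_all('table', attrs={'class':'Table_tr__1JI4P'})
driver.quit() # close the browser when done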
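A fixed time.sleep either wastes time or is too short on a slow connection. As a rough sketch, you could poll the page source until a known marker shows up. The helper name wait_for_marker is hypothetical, and it only assumes the driver object has a page_source attribute. Selenium also ships its own WebDriverWait for this purpose, which is the usual choice in real projects.

```python
import time


def wait_for_marker(driver, marker, timeout=10.0, poll=0.5):
    """Poll driver.page_source until `marker` appears in it.

    Returns the page source on success and raises TimeoutError if the
    marker has not appeared once `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        html = driver.page_source
        if marker in html:
            return html
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{marker!r} not found after {timeout} s")
        time.sleep(poll)
```

For example, html = wait_for_marker(driver, 'Table_tr__1JI4P') could replace the time.sleep(5) call, and html can then be passed to BeautifulSoup in place of the page source.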

CodePudding user response:

Selenium is a bit overkill when there is an API available. Just get the data directly:

import requests
import pandas as pd

url = 'https://momentranks.com/api/account/details'

rows = []
page = 0
while True:
    
    payload = {
        'filters': {'page': '%s' %page, 'limit': "250", 'type': "moments"},
        'flowAddress': "f64f1763e61e4087"}
    
    jsonData = requests.post(url, json=payload).json()
    
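    data = jsonData['data']
    rows += data  # accumulate this page's rows instead of overwriting them
    
    print('%s of %s' %(len(rows),jsonData['totalCount'] ))
    # stop once everything is fetched, or if a page comes back empty
    if not data or len(rows) >= jsonData['totalCount']:
        break
    
    page += 1  # move on to the next page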

df = pd.DataFrame(rows)

Output:

print(df)
                           _id    flowId  ...  challenges priceFloor
0     619d2f82fda908ecbe74b607  24001245  ...         NaN        NaN
1     61ba30837c1f070eadc0f8e4  25651781  ...         NaN        NaN
2     618d87b290209c5a51128516  21958292  ...         NaN        NaN
3     61aea763fda908ecbe9e8fbf  25201655  ...         NaN        NaN
4     60c38188e245f89daf7c4383  15153366  ...         NaN        NaN
                       ...       ...  ...         ...        ...
1787  61d0a2c37c1f070ead6b10a8  27014524  ...         NaN        NaN
1788  61d0a2c37c1f070ead6b10a8  27025557  ...         NaN        NaN
1789  61e9fafcd8acfcf57792dc5d  28711771  ...         NaN        NaN
1790  61ef40fcd8acfcf577273709  28723650  ...         NaN        NaN
1791  616a6dcb14bfee6c9aba30f9  18394076  ...         NaN        NaN

[1792 rows x 40 columns]
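The paging loop above can be separated from the HTTP call, which makes it easy to try out without hitting the site. This is only a sketch: fetch_all and its fetch_page argument are hypothetical names, and fetch_page is assumed to return a dict with a 'data' list and a 'totalCount' integer, like the JSON used in the answer.

```python
def fetch_all(fetch_page, max_pages=100):
    """Collect rows from a paginated source.

    fetch_page(page) must return a dict with a 'data' list and a
    'totalCount' integer. max_pages is a safety cap on requests.
    """
    rows = []
    for page in range(max_pages):
        result = fetch_page(page)
        data = result['data']
        rows += data
        # Stop on an empty page too, so a mismatched totalCount
        # cannot keep the loop running.
        if not data or len(rows) >= result['totalCount']:
            break
    return rows
```

With the endpoint from the answer, fetch_page would be a small function that builds the payload for the given page and returns requests.post(url, json=payload).json(). The result of fetch_all can then go straight into pd.DataFrame.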