Scrape pages with "load more" button

Time:02-02

I'm trying to scrape stock codes from my country but I'm stuck on a "load more" button on the website in question.


CodePudding user response:

Rather than clicking the button, make POST requests to the backend API directly. In your browser, open Developer Tools, go to the Network tab, filter by Fetch/XHR, then click "load more" and watch for the "scan" request. You can replicate that request in Python and get all the data you want by editing the POST payload, like this:

import json

import pandas as pd
import requests

rows_to_scrape = 1000  # upper bound of the "range" field below

# Request body copied from the "scan" request seen in the Network tab
payload = {
    "filter": [
        {"left": "name", "operation": "nempty"},
        {"left": "type", "operation": "equal", "right": "stock"},
        {"left": "subtype", "operation": "equal", "right": "common"},
        {"left": "typespecs", "operation": "has_none_of", "right": "odd"},
    ],
    "options": {"lang": "pt"},
    "markets": ["brazil"],
    "symbols": {"query": {"types": []}, "tickers": []},
    "columns": [
        "logoid", "name", "close", "change", "change_abs", "Recommend.All",
        "volume", "Value.Traded", "market_cap_basic", "price_earnings_ttm",
        "earnings_per_share_basic_ttm", "number_of_employees", "sector",
        "description", "type", "subtype", "update_mode", "pricescale",
        "minmov", "fractional", "minmove2", "currency",
        "fundamental_currency_code",
    ],
    "sort": {"sortBy": "name", "sortOrder": "asc"},
    "range": [0, rows_to_scrape],  # change this to get more/less data
}

headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36'}
url = 'https://scanner.tradingview.com/brazil/scan'

# Send the payload as a JSON string, exactly as the browser does
response = requests.post(url, headers=headers, data=json.dumps(payload), timeout=30)
response.raise_for_status()  # stop here if the server returned an HTTP error
resp = response.json()

# Each entry in resp['data'] keeps its row values in the 'd' list
output = [x['d'] for x in resp['data']]
print(len(output))

df = pd.DataFrame(output)
df.to_csv('tradingview_br.csv', index=False)
print('Saved to tradingview_br.csv')

Unfortunately, there aren't any headings in that data, but it should be pretty easy to figure out what each data point is: the values in each row should follow the order of the "columns" list in the payload.
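One way to get headings is to pair each row with the column names that were requested. The sketch below is not part of the original answer. It assumes that every "d" list holds one value per requested column, in the same order, which you should check against a real response. The helper name label_rows and the sample values are made up for illustration.

```python
def label_rows(resp, columns):
    """Turn each row's 'd' list into a dict keyed by the requested column names.

    Assumption: every 'd' list has exactly one value per column, in order.
    """
    records = []
    for item in resp["data"]:
        values = item["d"]
        if len(values) != len(columns):
            # Fail instead of silently mislabeling the values
            raise ValueError(
                f"expected {len(columns)} values, got {len(values)}"
            )
        records.append(dict(zip(columns, values)))
    return records


# Made-up sample with the same shape as the response used above
sample_columns = ["name", "close", "volume"]
sample_resp = {
    "data": [
        {"d": ["AAAA3", 10.0, 1000]},
        {"d": ["BBBB4", 20.5, 2500]},
    ]
}

records = label_rows(sample_resp, sample_columns)
print(records[0])
```

With the real response, label_rows(resp, payload["columns"]) would give a list of dicts that pd.DataFrame accepts directly, so the CSV is written with a header row.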
