How Can I Scrape Comments From This JSON Data

Time:01-21

How can I scrape the comments from this JSON data? Here is what I have so far. Thank you for your help.

import requests

headers = {
    'Accept-Encoding': 'gzip, deflate, sdch',
    'Accept-Language': 'en-US,en;q=0.8',
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'Referer': 'https://www.trendyol.com/',
    'Connection': 'keep-alive',
}

params = (
    ('boutiqueId', '594432'),
    ('merchantId', '968'),
    ('culture', 'tr-TR'),
    ('storefrontId', '1'),
    ('logged-in', 'false'),
    ('userId', '0'),
    ('isBuyer', 'false')
)

response = requests.get('https://public-mdc.trendyol.com/discovery-web-socialgw-service/reviews/denokids/yilbasi-kokos-elbise-p-3893218/yorumlar', headers=headers, params=params)
result_json = response.json()
# The reviews are embedded in this script text, not exposed as plain JSON fields
print(result_json['result']['hydrateScript'])

I couldn't get any further than that.

Answer:

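The value of hydrateScript is a block of JavaScript text, not JSON, but the review data is assigned to a variable inside it as a JSON literal. You can slice that literal out of the text and parse it with json.loads():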

import requests
import json

headers = {
    'Accept-Encoding': 'gzip, deflate, sdch',
    'Accept-Language': 'en-US,en;q=0.8',
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'Referer': 'https://www.trendyol.com/',
    'Connection': 'keep-alive',
}

params = (
    ('boutiqueId', '594432'),
    ('merchantId', '968'),
    ('culture', 'tr-TR'),
    ('storefrontId', '1'),
    ('logged-in', 'false'),
    ('userId', '0'),
    ('isBuyer', 'false')
)

response = requests.get('https://public-mdc.trendyol.com/discovery-web-socialgw-service/reviews/denokids/yilbasi-kokos-elbise-p-3893218/yorumlar', headers=headers, params=params)
result_json = response.json()
s = result_json['result']['hydrateScript']

start = "window.__REVIEW_APP_INITIAL_STATE__ = "
end = 'window.TYPageName="product_reviews"'
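# Slice out the text between the two markers; [:-1] drops the last character
# left after the JSON (presumably the ';' that ends the assignment)
dirty = s[s.find(start) + len(start):s.rfind(end)].strip()[:-1]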

review_data = json.loads(dirty)

print(review_data['ratingAndReviewResponse'])
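The slice above depends on the end marker and on there being exactly one trailing character after the JSON. If that turns out to be fragile, a possible alternative is to let the JSON decoder find the end of the object itself with json.JSONDecoder.raw_decode(), which parses one JSON value starting at a given index and ignores whatever follows. This is only a sketch: the helper name extract_initial_state and the sample string (including its totalCount field) are made up for illustration, and the real hydrateScript may look different.

```python
import json

START_MARKER = "window.__REVIEW_APP_INITIAL_STATE__ = "


def extract_initial_state(script, marker=START_MARKER):
    """Return the JSON value assigned right after `marker` in `script`.

    Hypothetical helper, not part of requests or the site's API.
    """
    pos = script.find(marker)
    if pos == -1:
        raise ValueError(f"marker not found: {marker!r}")

    # raw_decode does not skip leading whitespace, so do it by hand.
    idx = pos + len(marker)
    while idx < len(script) and script[idx].isspace():
        idx += 1

    # Parse a single JSON value starting at idx; anything after it
    # (the ';' and the following statements) is left untouched.
    obj, _end = json.JSONDecoder().raw_decode(script, idx)
    return obj


# Made-up stand-in for result_json['result']['hydrateScript']
sample = (
    'window.__REVIEW_APP_INITIAL_STATE__ = '
    '{"ratingAndReviewResponse": {"totalCount": 2}};'
    'window.TYPageName="product_reviews"'
)

state = extract_initial_state(sample)
print(state["ratingAndReviewResponse"])
```

With the real response you would call extract_initial_state(result_json['result']['hydrateScript']) instead of passing the sample string. If the marker is missing you get a ValueError rather than a confusing JSON parse error on the wrong slice.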