The website has 9 pages, but my code only adds the last page's elements to the list

Time:01-10

The website has 9 pages, but my code only adds the elements from the last page to the list. I want the elements from all pages collected together in one list.

alltitles = []
allnames = []
alllinks = []
allpeices = []
allstocks = []
for n in range(pagenum):
    pages_url = f"https://www.ispsupplies.com/manufacturers/TP~Link?order=relevance:asc&page={n + 1}&keywords=tp-link"
    driver.get(pages_url)
    html = driver.page_source
    soup = Soup(html)
    title = soup.find_all("span", itemprop="name")
    titleloop = [titles.text for titles in title]
    alltitles.append(titleloop)
    name = soup.find_all("div", class_="item-details-sku-container")
    nameloop = [names.text for names in name]
    allnames.append(nameloop)
    link = soup.find_all("a", class_="facets-item-cell-grid-title")
    linkloop = [links.text for links in link]
    alllinks.append(linkloop)
    price = soup.find_all("span", class_="item-views-price-lead")
    priceloop = [prices.text for prices in price]
    allpeices.append(priceloop)
    stock = soup.find_all("div", class_="item-details-stock")
    stockloop = [stocks.text for stocks in stock]
    allstocks.append(stockloop)


CodePudding user response:

What happens?

The code works, but it iterates too fast: the elements you're looking for are not yet present at the moment you try to find them.

How to fix?

Use selenium waits to check if elements are present in the DOM:

...
driver.get(pages_url)
WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CSS_SELECTOR, '[data-type="item"]')))
html = driver.page_source
...

Note: You have to add these imports:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
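To make the idea behind an explicit wait concrete, here is a minimal pure-Python sketch of the polling pattern: check a condition repeatedly until it holds or a timeout expires. `wait_until` and `items_present` are hypothetical names made up for this illustration; this is not Selenium's implementation, only the general technique.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it is truthy or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Stand-in for a page whose items only show up on the third check.
attempts = {"count": 0}


def items_present():
    attempts["count"] += 1
    return ["item"] if attempts["count"] >= 3 else []


found = wait_until(items_present, timeout=2.0, poll_interval=0.01)
print(found, "after", attempts["count"], "checks")
```

Reading the page source only after such a wait returns is what keeps the parser from seeing a half-rendered page.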

Example

Not sure why you decided on this bunch of separate lists; this example uses a single list of dicts instead:

data = []

for n in range(2):
    pages_url = f"https://www.ispsupplies.com/manufacturers/TP~Link?order=relevance:asc&page={n + 1}&keywords=tp-link"
    driver.get(pages_url)
    WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CSS_SELECTOR, '[data-type="item"]')))
    html = driver.page_source
    soup = Soup(html)
    
    for item in soup.select('[data-type="item"]'):
        data.append({
            'title' : item.find("span", itemprop="name").text,
            'name' : item.find("div", class_="item-details-sku-container").text,
            'link' : item.find("a", class_="facets-item-cell-grid-title")['href'],
            'price' : item.find("span", class_="item-views-price-lead").text,
            'stock' : item.find("div", class_="item-details-stock").text.strip()
        })
        
pd.DataFrame(data)

Output

title | name | link | price | stock
TP-Link AC750 Wireless Dual Band Router | SKU: Archer C20 | /TP-Link-Archer-C20 | US$34.99 | Direct Ship item Item usually ships directly from the manufacturer
TP-Link 16-Port Gigabit Unmanaged Pro Switch | SKU: TL-SG116E | /TP-Link-TL-SG116E | US$79.99 | 3 In Stock
TP-Link AC1200 Wireless MU-MIMO Gigabit Router Archer A6 | SKU: Archer A6_V3 | /TP-Link-Archer-A6 | US$49.99 | Direct Ship item Item usually ships directly from the manufacturer
TP-Link AC4000 MU-MIMO Tri-Band Wi-Fi Router Archer A20 | SKU: Archer A20 | /TP-Link-Archer-A20 | US$189.99 | Direct Ship item Item usually ships directly from the manufacturer
TP-Link AC5400 MU-MIMO Tri-Band Gaming Router | SKU: Archer C5400X | /TP-Link-Archer-C5400X | US$279.99 | Direct Ship item Item usually ships directly from the manufacturer
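If you would rather keep separate lists as in the question, note that `append` adds each page's list as one nested element, while `extend` adds its items one by one. A small sketch with made-up titles (the `page_results` data is purely illustrative, not scraped output):

```python
# Made-up titles standing in for the results of two pages.
page_results = [["router A", "router B"], ["switch C"]]

nested = []  # what list.append produces: one inner list per page
flat = []    # what list.extend produces: one flat list of items

for titles in page_results:
    nested.append(titles)
    flat.extend(titles)

print(nested)
print(flat)
```

With `extend`, the items from every page end up next to each other in a single list.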

CodePudding user response:

Any reason not to just go through the API? It's far more efficient, and you'll get more data. You can always filter out the columns you don't need.

import requests
import pandas as pd

items = []
page = 0
while True:
    url = 'https://www.ispsupplies.com/api/items'
    payload = {
    '_t': '1641815468877',
    'c': '393682',
    'country': 'US',
    'currency': 'USD',
    'custitem_disable_from_main_website': '0',
    'custitem_is_international': '0',
    'fieldset': 'search',
    'include': 'facets',
    'language': 'en',
    'limit': '100',
    'manufacturers': 'TP~Link',
    'n': '2',
    'nocache': 'T',
    'offset': str(page*100),
    'sort': 'quantityavailable:desc'}


    jsonData = requests.get(url, params=payload).json()
    
    items += jsonData['items']
    print('Page: %s' % (page + 1))
    
    if len(jsonData['items']) < 100:
        break
    page += 1
    
df = pd.DataFrame(items)

Output:

Full Output (just first 5 rows of the 199 products):

print(df.head(5).to_string())
  custitem88 custitem89 custitem83  custitem_is_international custitem_open_box_ids custitem_ns_pr_item_attributes  custitemnew  ispurchasable custitem_ns_pr_attributes_rating stockdescription  custitemclearance                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      itemimages_detail custitem_commercecategory_brand custitemwarehousemessage  custitem_incanada                                                    onlinecustomerprice_detail custitem71  weight custitem_ns_pr_rating_by_rate  internalid                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
                                                                                                                                                                                                                                                                                                                                                      itemoptions_detail outofstockmessage                                                                     custitemextralargeimage2 custitem_availableus                                              storedescription pricelevel1_formatted  isinstock custitem67  custitem20  custitem21  onlinecustomerprice  dontshowprice  custitemrefurbished  custitemonsale custitem68 manufacturer  custitem69  custitemfree_shipping         itemid  custitemondiscount  offersupport onlinecustomerprice_formatted nopricemessage  custitem_disable_from_main_website pricelevel66_formatted  isbackorderable  custitemtariff_item  custitemfree_shipping_cw                                                       custitem93 custitem94  custitem19  custitem18 custitem_st7 custitem_st6  showoutofstockmessage outofstockbehavior custitem_st8  itemtype  quantityavailable custitem_st3 custitem_st2 custitem_st5 displayname                                    storedisplayname2 custitem_st4 custitem_availableca  pricelevel1 custitem_st1  custitem_gpon                                         urlcomponent  pricelevel66 custitem_commerce_category_1 custitem_commerce_category_3 custitem_commerce_category_2
0                     0                                 False                                               &nbsp;        False           True                                                                False                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          {'5366': {'urls': [{'altimagetext': '', 'url': 'https://www.ispsupplies.com/SSP Applications/NetSuite Inc. - SCA Vinson/Development/product_images/images/TP-LINK-Gigabit-PCI-Express-Network-Adapter-TG-3468.5366-2.jpg'}]}}                         TP-Link                11/8/2021              False  {'onlinecustomerprice_formatted': 'US$‎14.99', 'onlinecustomerprice': 14.99}               0.50                                      5366  {'fields': [{'internalid': 'custcol19', 'label': 'Item Length', 'type': 'float'}, {'internalid': 'custcol20', 'label': 'Item Width', 'type': 'float'}, {'internalid': 'custcol21', 'label': 'Item Height', 'type': 'float'}, {'internalid': 'custcol_tariff_fee_option', 'label': 'Tariff Fee', 'type': 'currency'}, {'internalid': 'custcol_tariff_fee', 'label': 'Tariff Fee Custom', 'type': 'currency'}, {'internalid': 'custcol_is_tariff', 'label': 'Is Tariff', 'type': 'checkbox'}, {'internalid': 'custcol26', 'label': 'Purchase Price', 'type': 'currency'}, {'internalid': 'custcol36', 'label': 'Not Kit Component', 'type': 'checkbox'}, {'internalid': 'custcol67', 'label': 'Is Tariff (Webstore)', 'type': 'text'}, {'internalid': 'custcol_shiphawk_proposed_shipment_id', 
'label': 'ShipHawk Proposed Shipment ID', 'type': 'text'}, {'internalid': 'custcol_shiphawk_source_system_line_n', 'label': 'ShipHawk Source System Line Number', 'type': 'text'}, {'internalid': 'custcol_shiphawk_carrier', 'label': 'Carrier Name', 'type': 'text'}, {'internalid': 'custcol_shiphawk_carrier_service', 'label': 'Carrier Service', 'type': 'text'}]}                     /core/media/media.nl?id=922920&c=393682&h=1qP1ijidIPW2P4DK3Fi_jlV_N3UT-StJuJYKXsZSuMSrOrIn                  109                           32-bit Gigabit PCIe Network Adapter             US$‎14.99       True                   2.25       False                14.99          False                False           False                 TP-Link       False                   True        TG-3468               False         False                     US$‎14.99                                              False              US$‎14.99             True                False                     False  <div >In stock at College Station</div>                   5.50        6.25                                            False        - Default -               InvtPart              109.0                                                             TP-LINK 32-bit Gigabit PCIe Network Adapter                                          14.99                       False  TP-LINK-Gigabit-PCI-Express-Network-Adapter-TG-3468         14.99                 PCI Adapters                          NaN                          NaN
1                     0                                 False                                               &nbsp;        False           True                                                                False                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      {'urls': [{'altimagetext': '', 'url': 'https://www.ispsupplies.com/SSP Applications/NetSuite Inc. - SCA Vinson/Development/product_images/images/TP-LINK-TL-PA4010-KIT.01.jpg'}]}                         TP-Link                11/8/2021              False  {'onlinecustomerprice_formatted': 'US$‎39.99', 'onlinecustomerprice': 39.99}               1.00                                      5406  {'fields': [{'internalid': 'custcol19', 'label': 'Item Length', 'type': 'float'}, {'internalid': 'custcol20', 'label': 'Item Width', 'type': 'float'}, {'internalid': 'custcol21', 'label': 'Item Height', 'type': 'float'}, {'internalid': 'custcol_tariff_fee_option', 'label': 'Tariff Fee', 'type': 'currency'}, {'internalid': 'custcol_tariff_fee', 'label': 'Tariff Fee Custom', 'type': 'currency'}, {'internalid': 'custcol_is_tariff', 'label': 'Is Tariff', 'type': 'checkbox'}, {'internalid': 'custcol26', 'label': 'Purchase Price', 'type': 'currency'}, {'internalid': 'custcol36', 'label': 'Not Kit Component', 'type': 'checkbox'}, {'internalid': 'custcol67', 'label': 'Is Tariff (Webstore)', 'type': 'text'}, {'internalid': 'custcol_shiphawk_proposed_shipment_id', 
'label': 'ShipHawk Proposed Shipment ID', 'type': 'text'}, {'internalid': 'custcol_shiphawk_source_system_line_n', 'label': 'ShipHawk Source System Line Number', 'type': 'text'}, {'internalid': 'custcol_shiphawk_carrier', 'label': 'Carrier Name', 'type': 'text'}, {'internalid': 'custcol_shiphawk_carrier_service', 'label': 'Carrier Service', 'type': 'text'}]}                     /core/media/media.nl?id=875835&c=393682&h=blNs8_wT0YD2isH8-8LHyXuDVz82k4V5VxMsQVeVrrUeVsAE                   94  AV500 Nano Powerline Ethernet Adapter Starter Kit, Twin Pack             US$‎39.99       True                   4.00       False                39.99          False                False           False                 TP-Link       False                   True  TL-PA4010 KIT               False         False                     US$‎39.99                                              False              US$‎39.99             True                False                     False  <div >In stock at College Station</div>                   6.00        8.00                                            False        - Default -               InvtPart               94.0                                                                     TP-LINK AV600 Powerline Starter Kit                                          39.99                       False                                TP-LINK-TL-PA4010-KIT         39.99            Powerline Systems                          NaN                          NaN
2                     0                                 False                                               &nbsp;        False           True                                                                False  {'urls': [{'altimagetext': '', 'url': 'https://www.ispsupplies.com/SSP Applications/NetSuite Inc. - SCA Vinson/Development/product_images/images/TP-Link-UE300.01.jpg'}, {'altimagetext': '', 'url': 'https://www.ispsupplies.com/SSP Applications/NetSuite Inc. - SCA Vinson/Development/product_images/images/TP-Link-UE300.02.jpg'}, {'altimagetext': '', 'url': 'https://www.ispsupplies.com/SSP Applications/NetSuite Inc. - SCA Vinson/Development/product_images/images/TP-Link-UE300.03.jpg'}, {'altimagetext': '', 'url': 'https://www.ispsupplies.com/SSP Applications/NetSuite Inc. - SCA Vinson/Development/product_images/images/TP-Link-UE300.04.jpg'}, {'altimagetext': '', 'url': 'https://www.ispsupplies.com/SSP Applications/NetSuite Inc. - SCA Vinson/Development/product_images/images/TP-Link-UE300.05.jpg'}]}                         TP-Link                9/17/2021              False  {'onlinecustomerprice_formatted': 'US$‎12.99', 'onlinecustomerprice': 12.99}               0.25                                     20996  {'fields': [{'internalid': 'custcol19', 'label': 'Item Length', 'type': 'float'}, {'internalid': 'custcol20', 'label': 'Item Width', 'type': 'float'}, {'internalid': 'custcol21', 'label': 'Item Height', 'type': 'float'}, {'internalid': 'custcol_tariff_fee_option', 'label': 'Tariff Fee', 'type': 'currency'}, {'internalid': 'custcol_tariff_fee', 'label': 'Tariff Fee Custom', 'type': 'currency'}, {'internalid': 'custcol_is_tariff', 'label': 'Is Tariff', 'type': 'checkbox'}, {'internalid': 'custcol26', 'label': 'Purchase Price', 'type': 'currency'}, {'internalid': 'custcol36', 'label': 'Not Kit Component', 'type': 'checkbox'}, {'internalid': 'custcol67', 'label': 'Is Tariff (Webstore)', 'type': 'text'}, {'internalid': 'custcol_shiphawk_proposed_shipment_id', 
'label': 'ShipHawk Proposed Shipment ID', 'type': 'text'}, {'internalid': 'custcol_shiphawk_source_system_line_n', 'label': 'ShipHawk Source System Line Number', 'type': 'text'}, {'internalid': 'custcol_shiphawk_carrier', 'label': 'Carrier Name', 'type': 'text'}, {'internalid': 'custcol_shiphawk_carrier_service', 'label': 'Carrier Service', 'type': 'text'}]}                    /core/media/media.nl?id=7189171&c=393682&h=qYPfPWXvWc_Udet9IChlyz96qbiA25Y-jMsjg8svIFm-WHxm                   79                                                                           US$‎12.99       True                   0.67       False                12.99          False                False           False                 TP-Link       False                  False          UE300               False         False                     US$‎12.99                                              False              US$‎12.99             True                False                     False  <div >In stock at College Station</div>                   3.35        6.10                                            False        - Default -               InvtPart               79.0                                                     TP-Link USB 3.0 to Gigabit Ethernet Network Adapter                                          12.99                       False                                        TP-Link-UE300         12.99               USB Converters                          NaN                          NaN
3                     0                                 False                                               &nbsp;        False           True                                                                False                                                                                                                                                                                                                          {'urls': [{'altimagetext': '', 'url': 'https://www.ispsupplies.com/SSP Applications/NetSuite Inc. - SCA Vinson/Development/product_images/images/TP-LINK-2-4GHz-300Mbps-9dBi-Outdoor-CPE-CPE210.001.jpg'}, {'altimagetext': '', 'url': 'https://www.ispsupplies.com/SSP Applications/NetSuite Inc. - SCA Vinson/Development/product_images/images/TP-LINK-2-4GHz-300Mbps-9dBi-Outdoor-CPE-CPE210.002.jpg'}, {'altimagetext': '', 'url': 'https://www.ispsupplies.com/SSP Applications/NetSuite Inc. - SCA Vinson/Development/product_images/images/TP-LINK-2-4GHz-300Mbps-9dBi-Outdoor-CPE-CPE210.003.jpg'}]}                         TP-Link                9/22/2021              False  {'onlinecustomerprice_formatted': 'US$‎39.99', 'onlinecustomerprice': 39.99}               1.65                                      5319  {'fields': [{'internalid': 'custcol19', 'label': 'Item Length', 'type': 'float'}, {'internalid': 'custcol20', 'label': 'Item Width', 'type': 'float'}, {'internalid': 'custcol21', 'label': 'Item Height', 'type': 'float'}, {'internalid': 'custcol_tariff_fee_option', 'label': 'Tariff Fee', 'type': 'currency'}, {'internalid': 'custcol_tariff_fee', 'label': 'Tariff Fee Custom', 'type': 'currency'}, {'internalid': 'custcol_is_tariff', 'label': 'Is Tariff', 'type': 'checkbox'}, {'internalid': 'custcol26', 'label': 'Purchase Price', 'type': 'currency'}, {'internalid': 'custcol36', 'label': 'Not Kit Component', 'type': 'checkbox'}, {'internalid': 'custcol67', 'label': 'Is Tariff (Webstore)', 'type': 'text'}, {'internalid': 'custcol_shiphawk_proposed_shipment_id', 
'label': 'ShipHawk Proposed Shipment ID', 'type': 'text'}, {'internalid': 'custcol_shiphawk_source_system_line_n', 'label': 'ShipHawk Source System Line Number', 'type': 'text'}, {'internalid': 'custcol_shiphawk_carrier', 'label': 'Carrier Name', 'type': 'text'}, {'internalid': 'custcol_shiphawk_carrier_service', 'label': 'Carrier Service', 'type': 'text'}]}                     /core/media/media.nl?id=875579&c=393682&h=skaSM39aCBHsxoAkbixkUtedRt2h7qw6xp6EXKWbFg9QUAGA                   71       Outdoor 2.4GHz 300Mbps High power Wireless Access Point             US$‎39.99       True                   4.10       False                39.99          False                False           False                 TP-Link       False                   True         CPE210               False         False                     US$‎39.99                                              False              US$‎39.99             True                False                     False  <div >In stock at College Station</div>                   5.25       10.62                                            False        - Default -               InvtPart               71.0                                                          TP-LINK 2.4GHz 300Mbps 9dBi Outdoor CPE CPE210                                          39.99                       False       TP-LINK-2-4GHz-300Mbps-9dBi-Outdoor-CPE-CPE210         39.99                2GHz PTP/PTMP                          NaN                          NaN
4                     0                                 False                                               &nbsp;        False           True                                                                False                                                                                                                                                                                                                                                                                                              {'urls': [{'altimagetext': '', 'url': 'https://www.ispsupplies.com/SSP Applications/NetSuite Inc. - SCA Vinson/Development/product_images/images/TP-Link-TL-WR902AC.011.jpg'}, {'altimagetext': '', 'url': 'https://www.ispsupplies.com/SSP Applications/NetSuite Inc. - SCA Vinson/Development/product_images/images/TP-Link-TL-WR902AC.012.jpg'}, {'altimagetext': '', 'url': 'https://www.ispsupplies.com/SSP Applications/NetSuite Inc. - SCA Vinson/Development/product_images/images/TP-Link-TL-WR902AC.013.jpg'}]}                         TP-Link               11/29/2021              False  {'onlinecustomerprice_formatted': 'US$‎39.99', 'onlinecustomerprice': 39.99}               0.60                                      5512  {'fields': [{'internalid': 'custcol19', 'label': 'Item Length', 'type': 'float'}, {'internalid': 'custcol20', 'label': 'Item Width', 'type': 'float'}, {'internalid': 'custcol21', 'label': 'Item Height', 'type': 'float'}, {'internalid': 'custcol_tariff_fee_option', 'label': 'Tariff Fee', 'type': 'currency'}, {'internalid': 'custcol_tariff_fee', 'label': 'Tariff Fee Custom', 'type': 'currency'}, {'internalid': 'custcol_is_tariff', 'label': 'Is Tariff', 'type': 'checkbox'}, {'internalid': 'custcol26', 'label': 'Purchase Price', 'type': 'currency'}, {'internalid': 'custcol36', 'label': 'Not Kit Component', 'type': 'checkbox'}, {'internalid': 'custcol67', 'label': 'Is Tariff (Webstore)', 'type': 'text'}, {'internalid': 'custcol_shiphawk_proposed_shipment_id', 
'label': 'ShipHawk Proposed Shipment ID', 'type': 'text'}, {'internalid': 'custcol_shiphawk_source_system_line_n', 'label': 'ShipHawk Source System Line Number', 'type': 'text'}, {'internalid': 'custcol_shiphawk_carrier', 'label': 'Carrier Name', 'type': 'text'}, {'internalid': 'custcol_shiphawk_carrier_service', 'label': 'Carrier Service', 'type': 'text'}]}                    /core/media/media.nl?id=1056731&c=393682&h=che7-nic7o8Sln8Cl1UJWkH_DVUv7VRlcJi9_va_9WP4bFwv                   60                  AC750 Portable Wi-Fi Travel Router, 2.4/5GHz             US$‎39.99       True                   3.00       False                39.99          False                False           False                 TP-Link       False                   True     TL-WR902AC               False         False                     US$‎39.99                                              False              US$‎39.99             True                False                     False  <div >In stock at College Station</div>                   4.50        4.50                                            False        - Default -               InvtPart               60.0                                                           TP-Link AC750 Wireless Travel Router 2.4/5GHz                                          39.99                       False                                   TP-Link-TL-WR902AC         39.99             Wireless Routers                          NaN    

Or just what's seen on the site:

print(df[['storedisplayname2', 
          'itemid', 
          'urlcomponent',
          'onlinecustomerprice_formatted',
          'quantityavailable']].head(5).to_string())


                                     storedisplayname2         itemid                                         urlcomponent onlinecustomerprice_formatted  quantityavailable
0          TP-LINK 32-bit Gigabit PCIe Network Adapter        TG-3468  TP-LINK-Gigabit-PCI-Express-Network-Adapter-TG-3468                     US$‎14.99              109.0
1                  TP-LINK AV600 Powerline Starter Kit  TL-PA4010 KIT                                TP-LINK-TL-PA4010-KIT                     US$‎39.99               94.0
2  TP-Link USB 3.0 to Gigabit Ethernet Network Adapter          UE300                                        TP-Link-UE300                     US$‎12.99               79.0
3       TP-LINK 2.4GHz 300Mbps 9dBi Outdoor CPE CPE210         CPE210       TP-LINK-2-4GHz-300Mbps-9dBi-Outdoor-CPE-CPE210                     US$‎39.99               71.0
4        TP-Link AC750 Wireless Travel Router 2.4/5GHz     TL-WR902AC                                   TP-Link-TL-WR902AC                     US$‎39.99               60.0
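The offset-based paging used against the API can be tried offline. The sketch below swaps the HTTP request for a hypothetical `fake_fetch_page` function over an in-memory list of 199 made-up product names, so only the loop logic is shown; `fetch_all` is also an illustrative name, and nothing here is part of the site's API.

```python
def fetch_all(fetch_page, limit=100):
    """Collect items page by page until a page comes back shorter than `limit`."""
    items = []
    page = 0
    while True:
        batch = fetch_page(offset=page * limit, limit=limit)
        items += batch
        if len(batch) < limit:  # a short page means there is nothing left
            break
        page += 1
    return items


# Hypothetical stand-in for the remote catalogue.
FAKE_CATALOGUE = [f"product-{i}" for i in range(199)]


def fake_fetch_page(offset, limit):
    return FAKE_CATALOGUE[offset:offset + limit]


all_items = fetch_all(fake_fetch_page, limit=100)
print(len(all_items))
```

One consequence of this stop condition: if the total is an exact multiple of the limit, the loop makes one extra request that returns an empty page before it stops.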