一尘不染

Unable to scrape the Shopee e-commerce site with Selenium and Python

selenium

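I am unable to scrape product prices from Shopee (an e-commerce site). I looked at the question that @dmitrybelyakov answered.

That solution helped me get each product's "name" and "historical_sold", but not its price: I could not find a price value in the JSON string. So I tried using Selenium to extract the data via XPath, but that seems to fail as well.

Link to the e-commerce site: https://shopee.com.my/search?keyword=h370m

My code: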

import time

from selenium import webdriver

import pandas as pd

path = r'C:\Users\admin\Desktop\chromedriver_win32\Chromedriver'

chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('headless')
chrome_options.add_argument('window-size=1200x600')

browserdriver = webdriver.Chrome(executable_path = path,options=chrome_options)
link='https://shopee.com.my/search?keyword=h370m'
browserdriver.get(link)
productprice='//*[@id="main"]/div/div[2]/div[2]/div/div/div/div[2]/div/div/div[2]/div[1]/div/a/div/div[2]/div[1]'
productprice_printout=browserdriver.find_element_by_xpath(productprice).text
print(productprice_printout)

When I run this code, it fails with the following error:

selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"//*[@id="main"]/div/div[2]/div[2]/div/div/div/div[2]/div/div/div[2]/div[1]/div/a/div/div[2]/div[1]"}

Please help me get the product prices on Shopee!


2020-06-26

1 answer

一尘不染

To extract the product prices on Shopee using Selenium and Python, you can use the following solution:

  • Code block:

    from selenium import webdriver
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC

    options = webdriver.ChromeOptions()
    options.add_argument('--headless')
    options.add_argument('start-maximized')
    options.add_argument('disable-infobars')
    options.add_argument('--disable-extensions')
    browserdriver = webdriver.Chrome(options=options, executable_path=r'C:\WebDrivers\chromedriver.exe')
    browserdriver.get('https://shopee.com.my/search?keyword=h370m')
    # Dismiss the language-selection modal by clicking "English" once it is clickable
    WebDriverWait(browserdriver, 20).until(EC.element_to_be_clickable((By.XPATH, "//div[@class='shopee-modal__container']//button[text()='English']"))).click()
    # Wait until the prices are visible, then read the span that follows each "RM" currency label
    print([my_element.text for my_element in WebDriverWait(browserdriver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//span[text()='RM']//following::span[1]")))])
    # Close the browser and end the WebDriver session
    browserdriver.quit()
    print("Program Ended")

  • Console output:

    ['430.00', '385.00', '435.00', '409.00', '479.00', '439.00', '479.00', '439.00', '439.00', '403.20', '369.00', '420.00', '479.00', '465.00', '465.00']
    Program Ended
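
The prices in the console output are plain strings, so they usually need converting before they can be sorted or compared. The following is a minimal sketch of one way to do that, not part of the original answer. The helper name `parse_prices` is hypothetical, and the handling of thousands separators such as `1,299.00` is an assumption about how larger prices might be displayed on the page.

```python
from decimal import Decimal, InvalidOperation


def parse_prices(raw_prices):
    """Convert scraped price strings such as '430.00' into Decimal values.

    Hypothetical helper: assumes the strings look like the console output
    above, optionally with a comma as thousands separator.
    """
    parsed = []
    for raw in raw_prices:
        # Drop surrounding whitespace and any thousands separators.
        cleaned = raw.strip().replace(",", "")
        try:
            parsed.append(Decimal(cleaned))
        except InvalidOperation:
            # Skip entries that are not plain numbers (e.g. an empty string).
            continue
    return parsed


if __name__ == "__main__":
    sample = ['430.00', '385.00', '435.00']  # first three values from the output above
    prices = parse_prices(sample)
    print(min(prices), max(prices))
```

`Decimal` is used instead of `float` here so that currency amounts keep their exact two-decimal representation.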

2020-06-26