一尘不染

使用Selenium Python WebDriver滚动网页

selenium

我正在抓取此网页中的用户名,该用户名在滚动后会加载用户

转到页面的网址:“ http://www.quora.com/Kevin-
Rose/followers ”

我知道页面上的用户数量(在这种情况下,编号为43812)如何滚动页面,直到所有用户加载完毕?我在互联网上搜索了相同的代码,到处都可以找到几乎相同的代码行:

driver.execute_script(“ window.scrollTo(0,)”)

如何确定垂直位置以确保所有用户都被装载?还有其他选项可以实现相同的功能而无需实际滚动吗?

   from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time
import urllib

driver = webdriver.Firefox()
driver.get('http://www.quora.com/')
time.sleep(10)

wait = WebDriverWait(driver, 10)

form = driver.find_element_by_class_name('regular_login')
time.sleep(10)
#add explicit wait

username = form.find_element_by_name('email')
time.sleep(10)
#add explicit wait

username.send_keys('abc@gmail.com')
time.sleep(30)
#add explicit wait

password = form.find_element_by_name('password')
time.sleep(30)
#add explicit wait

password.send_keys('def')
#add explicit wait

password.send_keys(Keys.RETURN)
time.sleep(30)

#search = driver.find_element_by_name('search_input')
search = wait.until(EC.presence_of_element_located((By.XPATH, "//form[@name='search_form']//input[@name='search_input']")))

search.clear()
search.send_keys('Kevin Rose')
search.send_keys(Keys.RETURN)

link = wait.until(EC.presence_of_element_located((By.LINK_TEXT, "Kevin Rose")))
link.click()
#Wait till the element is loaded (Asynchronusly loaded webpage)

handle = driver.window_handles
driver.switch_to.window(handle[1])
#switch to new window

element = WebDriverWait(driver, 2).until(EC.presence_of_element_located((By.PARTIAL_LINK_TEXT, "Followers")))
element.click()

阅读 258

收藏
2020-06-26

共1个答案

一尘不染

由于在加载了最后一个关注者存储桶之后没有出现任何特殊情况,因此我将依赖于这样一个事实,即您知道用户拥有多少个关注者,并且您知道每次向下滚动时都加载了多少个关注者(我检查过-是18每卷)。因此,您可以计算将页面向下滚动多少次。

这是实现(我使用了只有53个关注者的其他用户来演示解决方案):

import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

followers_per_page = 18

driver = webdriver.Chrome()  # webdriver.Firefox() in your case
driver.get("http://www.quora.com/Andrew-Delikat/followers")

# get the followers count
element = WebDriverWait(driver, 2).until(EC.presence_of_element_located((By.XPATH, '//li[contains(@class, "FollowersNavItem")]//span[@class="profile_count"]')))
followers_count = int(element.text.replace(',', ''))
print followers_count

# scroll down the page iteratively with a delay
for _ in xrange(0, followers_count/followers_per_page + 1):
    driver.execute_script("window.scrollTo(0, 10000);")
    time.sleep(2)

另外,10000在跟随者数量众多的情况下,您可能需要根据循环变量增加此Y坐标值。

2020-06-26