一尘不染

How do I navigate to a new page in Selenium?

selenium

I have the following code:

driver.get(<some url>)
for element in driver.find_elements_by_class_name('thumbnail'):
    element.find_element_by_xpath(".//a").click() #this works and navigates to new page
    element.find_element_by_link_text('Click here').click() #this doesn't

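The code needs to work through the following HTML (simplified, of course): click a thumbnail, which leads to a new page, and then click the "Click here" link on that new page: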

<!DOCTYPE html>
<html lang="en-US" prefix="og: http://iuytp.me/ns# fb: http://iuytp.me/ns/fb#">
<head>
<meta charset="UTF-8" />
<title>Releases</title>
</head>

<body class="archive category category-releases category-4 custom-background">
    <div id="main">
        <div id="container" class="one-column">
            <div id="content" role="main">

                <h1 class="page-title">Releases</h1>

            <div id="thumbnail-post-display">
        <div id="thumbnail-post" class="post-7158 post type-post status-publish format-standard has-post-thumbnail hentry category-blog category-designer category-releases category-uncategorized">
            <div class="thumbnail"><a href="http://records.net/uncategorized/designer-7-inch-bufu-records-co-release/" title="Permanent link to Designer &#8211; 7 inch" rel="bookmark"><img width="300" height="300" src="http://records.net/dev/wp-content/uploads/2014/05/dboypledge32-300x300.png" class="attachment-thumbnail wp-post-image" alt="dboypledge3" /></a></div>
            <h2><a href="http://records.net/uncategorized/designer-7-inch-bufu-records-co-release/" title="Permanent link to Designer &#8211; 7 inch" rel="bookmark">Designer &#8211; 7 inch</a></h2>
        </div>
    </div><!--end thumbnail post display-->

            <div id="thumbnail-post-display">
        <div id="thumbnail-post" class="post-7107 post type-post status-publish format-standard has-post-thumbnail hentry category-blog category-releases">
            <div class="thumbnail"><a href="http://records.net/releases/people-2014-tour-demos/" title="Permanent link to All My People &#8211; 2014 Tour Demos" rel="bookmark"><img width="300" height="300" src="http://records.net/dev/wp-content/uploads/2014/04/01_Doubt-mp3-image-300x300.png" class="attachment-thumbnail wp-post-image" alt="" /></a></div>
            <h2><a href="http://records.net/releases/people-2014-tour-demos/" title="Permanent link to All My People &#8211; 2014 Tour Demos" rel="bookmark">All My People &#8211; 2014 Tour Demos</a></h2>
        </div>
    </div><!--end thumbnail post display-->

            <div id="thumbnail-post-display">
        <div id="thumbnail-post" class="post-7089 post type-post status-publish format-standard has-post-thumbnail hentry category-blog category-releases">
            <div class="thumbnail"><a href="http://records.net/releases/sirens-blossom-talk/" title="Permanent link to Syrins &#8211; Boss Talk" rel="bookmark"><img width="300" height="300" src="http://records.net/dev/wp-content/uploads/2014/04/sirens_final_smaller-300x300.jpg" class="attachment-thumbnail wp-post-image" alt="sirens_final_smaller" /></a></div>
            <h2><a href="http://records.net/releases/sirens-blossom-talk/" title="Permanent link to Syrins &#8211; Boss Talk" rel="bookmark">Syrins &#8211; Boss Talk</a></h2>
        </div>
    </div><!--end thumbnail post display-->

            <div id="thumbnail-post-display">
        <div id="thumbnail-post" class="post-7073 post type-post status-publish format-standard has-post-thumbnail hentry category-blog category-releases">
            <div class="thumbnail"><a href="http://records.net/releases/worlds-strongest-man-scares/" title="Permanent link to World&#8217;s Tough Man &#8211; Sorry Scares You" rel="bookmark"><img width="300" height="300" src="http://records.net/dev/wp-content/uploads/2014/03/a2312749950_10-300x300.jpg" class="attachment-thumbnail wp-post-image" alt="a2312749950_10" /></a></div>
            <h2><a href="http://records.net/releases/worlds-strongest-man-scares/" title="Permanent link to World&#8217;s Tough Man &#8211; Sorry Scares You" rel="bookmark">World&#8217;s Tough Man &#8211; Sorry Scares You</a></h2>
        </div>
    </div><!--end thumbnail post display-->

            <div id="thumbnail-post-display">
        <div id="thumbnail-post" class="post-7046 post type-post status-publish format-standard has-post-thumbnail hentry category-blog category-releases">
            <div class="thumbnail"><a href="http://records.net/releases/sundog-space-criminal/" title="Permanent link to Dog &#8211; Space Criminal" rel="bookmark"><img width="300" height="300" src="http://records.net/dev/wp-content/uploads/2014/03/Sundog_cover_high_res-300x300.jpg" class="attachment-thumbnail wp-post-image" alt="dog_cover_high_res" /></a></div>
            <h2><a href="http://records.net/releases/sundog-space-criminal/" title="Permanent link to Dog &#8211; Space Criminal" rel="bookmark">Dog &#8211; Space Criminal</a></h2>
        </div>
    </div><!--end thumbnail post display-->

<div style="clear:both"></div>


        </div><!-- #container -->

    </div><!-- #main -->
</div><!-- #wrapper -->

</div><!--#bg-wrapper-->

</body>
</html>

However, my code produces the following error:

Traceback (most recent call last):
  ...
  File "crawler.py", line 17, in main
    driver.find_element_by_link_text('Click here').click()
  File "/Library/Python/2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 254, in find_element_by_link_text
    return self.find_element(by=By.LINK_TEXT, value=link_text)
  File "/Library/Python/2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 662, in find_element
    {'using': by, 'value': value})['value']
  File "/Library/Python/2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 173, in execute
    self.error_handler.check_response(response)
  File "/Library/Python/2.7/site-packages/selenium/webdriver/remote/errorhandler.py", line 164, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: u'no such element\n  (Session info: chrome=35.0.1916.153)\n  (Driver info: chromedriver=2.10.267517,platform=Mac OS X 10.9.3 x86_64)'

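The problem seems to be that the element is not updated with the new page's content. Replacing element with driver on the offending line doesn't work either. What am I doing wrong?

Note that I need to be able to do this for every thumbnail (hence the for loop).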


2020-06-26

1 answer

一尘不染

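It turns out you need to store the links you want to navigate to in advance. This is what ended up working for me (I found this thread helpful):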

driver.get(<some url>)

# Collect the hrefs as plain strings first: WebElement references go stale
# as soon as the browser navigates away from the listing page.
elements = driver.find_elements_by_xpath("//h2/a")
links = [element.get_attribute('href') for element in elements]

for link in links:
    print('navigating to: ' + link)
    driver.get(link)

    # do stuff within that page here...

    driver.back()
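
For anyone on a newer Selenium release, where the find_elements_by_* helpers are no longer available, the same idea can be sketched with the generic find_elements(by, value) call. This is only a rough sketch, not a drop-in fix: the function names collect_links and visit_each are made up for illustration, and the locator strings "xpath" and "link text" are used directly because they are the values behind By.XPATH and By.LINK_TEXT. The "Click here" step assumes that link really exists on each target page.

```python
def collect_links(driver, xpath="//h2/a"):
    """Return the href of every element matching `xpath` as plain strings.

    Strings survive navigation; WebElement objects do not.
    `collect_links` is a hypothetical helper name, not a Selenium API.
    """
    # "xpath" is the string value of selenium's By.XPATH constant.
    anchors = driver.find_elements("xpath", xpath)
    hrefs = (anchor.get_attribute("href") for anchor in anchors)
    return [href for href in hrefs if href]  # skip anchors without an href


def visit_each(driver, links, action):
    """Load each link in turn and call `action(driver)` on the loaded page.

    Returns the list of values returned by `action`.
    `visit_each` is likewise a hypothetical helper name.
    """
    results = []
    for link in links:
        driver.get(link)  # direct navigation, so no driver.back() is needed
        results.append(action(driver))
    return results


def click_here(driver):
    """Example action: click the 'Click here' link on the current page.

    Assumes such a link exists; "link text" is the value of By.LINK_TEXT.
    """
    driver.find_element("link text", "Click here").click()
    return driver.current_url
```

With a real browser session this would be used roughly as: create the driver, load the listing page, then call visit_each(driver, collect_links(driver), click_here). Because each page is opened with driver.get, nothing depends on elements found before the navigation.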
2020-06-26