一尘不染

Accessing the Selenium API in R

selenium

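I am interested in using Selenium with R. I noticed that the WebDriver (Selenium 2) API is documented for various languages. Has any work been done on an implementation for R, and how would I go about this? The documentation describes running a Selenium server and notes that the API can be queried using JavaScript. Any help would be much appreciated.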


2020-06-26

1 answer

一尘不染

Selenium can be accessed through the JsonWireProtocol.

First, start the Selenium server from the command line:

java -jar selenium-server-standalone-2.25.0.jar
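
Before creating a session, it can help to confirm that the server is actually listening. The following is only a sketch: it assumes the default port 4444 and uses the protocol's status command, and serverIsUp is a hypothetical helper name, not part of any package.

```r
library(RCurl)
library(RJSONIO)

# Assumed default address of the standalone server's status endpoint
statusURL <- "http://localhost:4444/wd/hub/status"

# Hypothetical helper: TRUE if the server replies with status 0 (success),
# FALSE if it is unreachable or the reply cannot be read
serverIsUp <- function(url = statusURL) {
  tryCatch({
    reply <- fromJSON(getURL(url))
    isTRUE(reply[["status"]] == 0)
  }, error = function(e) FALSE)
}
```

Calling serverIsUp() right after launching the jar may return FALSE for a moment while the server is still starting.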

A new Firefox browser can then be opened as follows:

library(RCurl)    # HTTP requests
library(RJSONIO)  # JSON encoding and decoding
library(XML)      # HTML parsing

baseURL<-"http://localhost:4444/wd/hub/"
# Capabilities requested for the new session
server<-list(desiredCapabilities=list(browserName='firefox',javascriptEnabled=TRUE))

# POST /session starts the browser
getURL(paste0(baseURL,"session"),
       customrequest="POST",
       httpheader=c('Content-Type'='application/json;charset=UTF-8'),
       postfields=toJSON(server))

# List the active sessions and keep the ID of the first one
serverDetails<-fromJSON(rawToChar(getURLContent(paste0(baseURL,"sessions"),binary=TRUE)))
serverId<-serverDetails$value[[1]]$id
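
Every command in this answer repeats the same URL building, header, and JSON encoding, so the pattern can be wrapped once. This is a rough sketch rather than an established API: wdUrl and wdRequest are hypothetical names, and the empty-reply check is an assumption made so that a command returning no body does not break the JSON parser.

```r
library(RCurl)
library(RJSONIO)

baseURL <- "http://localhost:4444/wd/hub/"

# Hypothetical helper: join path pieces onto the hub URL
wdUrl <- function(...) paste0(baseURL, paste(..., sep = "/"))

# Hypothetical helper: send one command and decode the JSON reply, if any
wdRequest <- function(method, url, body = NULL) {
  args <- list(url = url,
               customrequest = method,
               httpheader = c('Content-Type' = 'application/json;charset=UTF-8'))
  # Only commands that carry parameters get a JSON body
  if (!is.null(body)) args$postfields <- toJSON(body)
  reply <- do.call(getURL, args)
  if (nzchar(reply)) fromJSON(reply) else NULL
}
```

With these in place, a navigation call could be written as wdRequest("POST", wdUrl("session", serverId, "url"), list(url="http://www.google.com")).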

Navigate to Google:

getURL(paste0(baseURL,"session/",serverId,"/url"),
       customrequest="POST",
       httpheader=c('Content-Type'='application/json;charset=UTF-8'),
       postfields=toJSON(list(url="http://www.google.com")))

Get the ID of the search box:

elementDetails<-fromJSON(rawToChar(getURLContent(paste0(baseURL,"session/",serverId,"/element"),
       customrequest="POST",
       httpheader=c('Content-Type'='application/json;charset=UTF-8'),
       postfields=toJSON(list(using="xpath",value="//*[@id=\"gbqfq\"]")),
       binary=TRUE)))

# The reply's value holds the element reference used in later calls
elementId<-elementDetails$value
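
In the JsonWireProtocol, a reply carries a numeric status (0 means success) and, for an element lookup, a value of the form {"ELEMENT": "<id>"}. The sketch below checks the status before using the ID. The sample reply is made up for illustration, and extractElementId is a hypothetical helper name.

```r
library(RJSONIO)

# Hypothetical helper: stop on a non-zero status, otherwise return the element ID
extractElementId <- function(reply) {
  if (reply[["status"]] != 0) {
    stop("element lookup failed with status ", reply[["status"]])
  }
  as.character(unname(reply[["value"]][["ELEMENT"]]))
}

# Made-up reply in the shape the server sends for a successful lookup
sampleReply <- fromJSON('{"sessionId":"abc","status":0,"value":{"ELEMENT":"0"}}')
elementId <- extractElementId(sampleReply)
```

A lookup that finds nothing comes back with status 7 (NoSuchElement), which this helper turns into an R error instead of a confusing failure in the next request.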

Type the search query:

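# "\uE009" is the Control key: Ctrl+A selects the existing text, the second
# "\uE009" releases Control, '\b' deletes the selection, then the query is typed
rawToChar(getURLContent(paste0(baseURL,"session/",serverId,"/element/",elementId,"/value"),
       customrequest="POST",
       httpheader=c('Content-Type'='application/json;charset=UTF-8'),
       postfields=toJSON(list(value=list("\uE009","a","\uE009",'\b','Selenium api in R'))),
       binary=TRUE))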
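
The key sequence is easier to read with the special keys named. This sketch builds the same kind of payload; keys and selectAllAndReplace are hypothetical names, and it swaps in the protocol's own Backspace code ("\uE003") where the call above uses '\b'.

```r
# WebDriver key codes used here (private-use Unicode code points)
keys <- list(control = "\uE009", backspace = "\uE003")

# Hypothetical helper: Ctrl+A, release Ctrl, Backspace, then the new text
selectAllAndReplace <- function(text) {
  list(value = list(keys$control, "a", keys$control, keys$backspace, text))
}

payload <- selectAllAndReplace("Selenium api in R")
# The payload would then be sent as postfields=toJSON(payload)
```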

Retrieve the HTML of the results page:

googData<-fromJSON(rawToChar(getURLContent(paste0(baseURL,"session/",serverId,"/source"),
       customrequest="GET",
       httpheader=c('Content-Type'='application/json;charset=UTF-8'),
       binary=TRUE)))

Extract the suggested links:

# Parse the page source as text and collect the href of every node with class "l"
gxml<-htmlParse(googData$value,asText=TRUE)
urls<-unname(xpathSApply(gxml,"//*[@class='l']/@href"))
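
The extraction step can be tried without a browser. The sketch below runs the same XPath against a small made-up HTML string standing in for the page source; the markup and URLs are invented for illustration.

```r
library(XML)

# Made-up stand-in for the page source returned by the server
sampleHtml <- paste0(
  "<html><body>",
  "<a class='l' href='http://a.example/'>First</a>",
  "<a class='other' href='http://skip.example/'>Ignored</a>",
  "<a class='l' href='http://b.example/'>Second</a>",
  "</body></html>")

doc <- htmlParse(sampleHtml, asText = TRUE)
# Each match is an href attribute; unname() drops the "href" labels
urls <- unname(xpathSApply(doc, "//*[@class='l']/@href"))
```

Only the two links with class "l" are kept; the link with a different class is skipped by the XPath predicate.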

Close the session:

getURL(paste0(baseURL,"session/",serverId),
       customrequest="DELETE",
       httpheader=c('Content-Type'='application/json;charset=UTF-8')
       )
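
If one of the earlier steps stops with an error, the DELETE request is never sent and the browser stays open. One way to guard against that, sketched here with hypothetical names (withSession, openFn, closeFn), is to register the cleanup with on.exit so it runs however the work ends.

```r
# Hypothetical helper: openFn() returns a session ID, body(id) does the work,
# and closeFn(id) runs afterwards even if body(id) stops with an error
withSession <- function(openFn, closeFn, body) {
  id <- openFn()
  on.exit(closeFn(id), add = TRUE)
  body(id)
}

# Sketch of use with the calls from this answer (not run here):
# withSession(
#   openFn  = function() serverId,
#   closeFn = function(id) getURL(paste0(baseURL,"session/",id), customrequest="DELETE"),
#   body    = function(id) { ... navigate, search, parse ... })
```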
2020-06-26