Locating Web Page Elements with XPath: Worked Examples

Summary: to be written

Quick reference

Paths

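An XPath that begins with a single slash (/) is an absolute path: it is evaluated from the document root, and each step matches only direct children of the previous step. It matches nothing if any step is missing.

An XPath that begins with two slashes (//), often called a relative path, selects every element in the document that matches the pattern, at any depth, so it may return several elements.

Both separators can also appear in the middle of an expression, where they control how the following step is selected: / selects direct children, // selects descendants at any depth.

With either form the result may contain several elements; a positional predicate [x] picks the x-th one, counting from 1. Note that //a[2] selects every a that is the second a child of its parent; write (//a)[2] for the second match in the whole document.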
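The path rules above can be tried without a browser. The following is a minimal sketch, assuming made-up markup and using the JDK's built-in javax.xml.xpath engine (XPath 1.0) in place of Selenium; the class name PathDemo, the PAGE constant and the count helper are hypothetical names chosen for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class PathDemo {
    // Made-up markup: two links directly under one div, one link nested deeper.
    static final String PAGE =
        "<html><body>"
        + "<div id='nav'><a>one</a><a>two</a></div>"
        + "<div id='main'><p><a>three</a></p></div>"
        + "</body></html>";

    // Count the nodes that an XPath expression selects in PAGE.
    static int count(String expr) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(PAGE)));
        NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
            .evaluate(expr, doc, XPathConstants.NODESET);
        return hits.getLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(count("/html/body/div/a")); // 2: only <a> directly under a div
        System.out.println(count("/html/a"));          // 0: no <a> directly under <html>
        System.out.println(count("//a"));              // 3: every <a>, at any depth
        System.out.println(count("//a[1]"));           // 2: first <a> of each parent
        System.out.println(count("(//a)[1]"));         // 1: first <a> in the document
    }
}
```

The last two lines show why the parentheses matter: without them the index is applied separately under each parent.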

Attributes

When too many elements share the same path, use attributes to tell them apart.

XPath 1:  //*[@id="su"]
XPath 2:  //*[@value="百度一下"]
XPath 3:  //input[@class="bg s_btn"]
XPath 4:  //*[@id="lg"]/img
XPath 5:  //a[@class="mnav"]
XPath 6:  //a[@class="mnav"][2]

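Use = to test an attribute against an exact value.

Functions can be used to match values that an exact comparison cannot.

//div[contains(@class, "input-group mb")] matches any div whose class attribute value contains the string input-group mb.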
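The difference between an exact = comparison and contains() is easiest to see on a class attribute that holds several class names. The sketch below assumes invented markup loosely modelled on the locators above and uses the JDK's javax.xml.xpath engine rather than a browser; AttributeDemo, PAGE and count are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class AttributeDemo {
    // Invented markup; the attribute values only resemble the ones in the notes.
    static final String PAGE =
        "<form>"
        + "<input id='su' class='bg s_btn' value='Search'/>"
        + "<input id='kw' class='s_ipt'/>"
        + "<div class='input-group mb-3'/>"
        + "<div class='input-group'/>"
        + "</form>";

    // Count the nodes that an XPath expression selects in PAGE.
    static int count(String expr) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(PAGE)));
        NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
            .evaluate(expr, doc, XPathConstants.NODESET);
        return hits.getLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(count("//*[@id='su']"));              // 1
        System.out.println(count("//input[@class='bg s_btn']")); // 1: whole value matches
        System.out.println(count("//input[@class='s_btn']"));    // 0: = compares the whole value
        System.out.println(count("//div[contains(@class,'input-group mb')]")); // 1
        System.out.println(count("//div[contains(@class,'input-group')]"));    // 2
    }
}
```

Because contains() is a plain substring test, it also matches longer class names that merely include the given text, so the search string should be as specific as the page allows.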

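Other useful functions

starts-with: matches a string at the beginning of an attribute value
contains: matches a string anywhere inside an attribute value
text(): selects the element's visible text, so it can also be used to locate elements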

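//input[starts-with(@name,'name1')]     finds input elements whose name attribute starts with 'name1'
//input[contains(@name,'na')]           finds input elements whose name attribute contains 'na'

For the link <a href="http://www.test.com">内容</a>
the XPath is //a[text()='内容']
or //a[contains(text(),"内容")], although the second form is fragile: in XPath 1.0, contains() only looks at the first text node that text() returns, so it misses text that follows a child element.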
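The three functions can be compared side by side on a small document. This is a sketch under stated assumptions: the markup is invented, the link text Docs is an ASCII stand-in for the link text in the example above, and the JDK's javax.xml.xpath engine stands in for the browser; FunctionDemo, PAGE and count are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class FunctionDemo {
    // Invented markup: three inputs with different name values and one link.
    static final String PAGE =
        "<form>"
        + "<input name='name1_first'/>"
        + "<input name='name1_last'/>"
        + "<input name='nickname'/>"
        + "<a href='http://www.test.com'>Docs</a>"
        + "</form>";

    // Count the nodes that an XPath expression selects in PAGE.
    static int count(String expr) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(PAGE)));
        NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
            .evaluate(expr, doc, XPathConstants.NODESET);
        return hits.getLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(count("//input[starts-with(@name,'name1')]")); // 2
        System.out.println(count("//input[contains(@name,'na')]"));       // 3: 'nickname' contains 'na' too
        System.out.println(count("//a[text()='Docs']"));                  // 1: exact text match
        System.out.println(count("//a[contains(text(),'Do')]"));          // 1: partial text match
    }
}
```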

Multiple attribute conditions can be joined with and.
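A short sketch of and inside a predicate, assuming an invented form with three inputs and using the JDK's javax.xml.xpath engine; AndDemo, PAGE and count are hypothetical names. Writing two predicates back to back is a common alternative with the same effect for plain attribute tests.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class AndDemo {
    // Invented markup: type and name each repeat, only the pair is unique.
    static final String PAGE =
        "<form>"
        + "<input type='text' name='user'/>"
        + "<input type='text' name='email'/>"
        + "<input type='password' name='user'/>"
        + "</form>";

    // Count the nodes that an XPath expression selects in PAGE.
    static int count(String expr) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(PAGE)));
        NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
            .evaluate(expr, doc, XPathConstants.NODESET);
        return hits.getLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(count("//input[@type='text']"));                  // 2
        System.out.println(count("//input[@name='user']"));                  // 2
        System.out.println(count("//input[@type='text' and @name='user']")); // 1
        System.out.println(count("//input[@type='text'][@name='user']"));    // 1: chained predicates
    }
}
```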

Some other attribute functions

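By.xpath("//input[starts-with(@id,'nice')]")
By.xpath("//input[ends-with(@id,'很漂亮')]")
By.xpath("//input[contains(@id,'那么美')]")

Note: ends-with() was introduced in XPath 2.0. Browsers implement XPath 1.0, so the second locator is expected to fail there, while starts-with() and contains() work.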
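Where only XPath 1.0 is available, a suffix test can be emulated with substring() and string-length(). The sketch below is one common workaround, not the only one; it assumes invented id values and runs against the JDK's javax.xml.xpath engine, which also implements XPath 1.0. EndsWithDemo, PAGE, endsWithId and count are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class EndsWithDemo {
    // Invented markup: two ids end in "_btn", one is shorter than the suffix.
    static final String PAGE =
        "<form>"
        + "<input id='nice_btn'/>"
        + "<input id='nice_box'/>"
        + "<input id='ok_btn'/>"
        + "<input id='btn'/>"
        + "</form>";

    // Build an XPath 1.0 expression that behaves like ends-with(@id, suffix).
    // The suffix must not contain a single quote, since it is pasted into the expression.
    static String endsWithId(String suffix) {
        return String.format(
            "//input[substring(@id, string-length(@id) - string-length('%1$s') + 1) = '%1$s']",
            suffix);
    }

    // Count the nodes that an XPath expression selects in PAGE.
    static int count(String expr) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(PAGE)));
        NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
            .evaluate(expr, doc, XPathConstants.NODESET);
        return hits.getLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(count("//input[starts-with(@id,'nice')]")); // 2
        System.out.println(count(endsWithId("_btn")));                 // 2: nice_btn and ok_btn
        System.out.println(count(endsWithId("_box")));                 // 1
    }
}
```

The generated string can be passed to By.xpath(...) unchanged, assuming the target page uses comparable id values.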

Tag content

https://stackoverflow.com/a/3655588/2000468

//li[text()[contains(., 'fed-domain2')]]

This selects li elements that have a text node containing fed-domain2.

//*[text()[contains(.,'ABC')]]

* is a selector that matches any element (i.e. tag); it returns a node-set. The outer [] are a conditional that operates on each individual node in that node set; here it operates on each element in the document.

text() is a selector that matches all of the text nodes that are children of the context node; it returns a node set.

The inner [] are a conditional that operates on each node in that node set; here each individual text node. Each individual text node is the starting point for any path in the brackets, and can also be referred to explicitly as . within the brackets. It matches if any of the individual nodes it operates on match the conditions inside the brackets.

contains is a function that operates on a string. Here it is passed an individual text node (.). Since it is passed the second text node in the <Comment> tag (from the linked question's example) individually, it will see the 'ABC' string and be able to match it.
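The explanation above can be checked on an element whose text is split by a child tag. This is a sketch with invented Comment elements (not the data from the linked question), evaluated with the JDK's javax.xml.xpath engine under XPath 1.0 rules; TextNodeDemo, PAGE and count are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class TextNodeDemo {
    // Invented markup: <br/> splits each Comment into two text nodes.
    static final String PAGE =
        "<root>"
        + "<Comment>first line<br/>has ABC here</Comment>"  // ABC in the second text node
        + "<Comment>ABC up front<br/>rest</Comment>"        // ABC in the first text node
        + "<Comment>nothing</Comment>"
        + "</root>";

    // Count the nodes that an XPath expression selects in PAGE.
    static int count(String expr) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(PAGE)));
        NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
            .evaluate(expr, doc, XPathConstants.NODESET);
        return hits.getLength();
    }

    public static void main(String[] args) throws Exception {
        // Each text node is tested on its own, so both Comment elements match.
        System.out.println(count("//*[text()[contains(.,'ABC')]]")); // 2
        // Only the first text node of each element is converted to a string.
        System.out.println(count("//*[contains(text(),'ABC')]"));    // 1
        // The element's full string value is used, so <root> matches as well.
        System.out.println(count("//*[contains(.,'ABC')]"));         // 3
    }
}
```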

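Combining conditions

An example:

The XPath //div[contains(@class, "ember-power-select-trigger")] selects three elements. To narrow the result, constrain the element that contains them, as in the following XPath: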

//div[contains(@class, "input-group mb")]//div[contains(@class, "ember-power-select-trigger")]

Now only one element is selected, because the other two are not descendants of that div.
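The narrowing step can be reproduced on a reduced page. The sketch below assumes invented markup with three trigger divs, only one of which sits inside the input-group container, and uses the JDK's javax.xml.xpath engine; CombineDemo, PAGE, GROUP, TRIGGER and count are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class CombineDemo {
    // Invented markup: three triggers, one wrapped in a span inside the input group.
    static final String PAGE =
        "<body>"
        + "<div class='input-group mb-3'><span>"
        + "<div class='ember-power-select-trigger'/></span></div>"
        + "<div class='toolbar'><div class='ember-power-select-trigger'/></div>"
        + "<div class='ember-power-select-trigger'/>"
        + "</body>";

    static final String GROUP = "//div[contains(@class,'input-group mb')]";
    static final String TRIGGER = "//div[contains(@class,'ember-power-select-trigger')]";

    // Count the nodes that an XPath expression selects in PAGE.
    static int count(String expr) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(PAGE)));
        NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
            .evaluate(expr, doc, XPathConstants.NODESET);
        return hits.getLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(count(TRIGGER));         // 3: every trigger on the page
        System.out.println(count(GROUP + TRIGGER)); // 1: only the one inside the group
        // A single slash requires a direct child, and here a span sits in between.
        System.out.println(count(GROUP + "/div[contains(@class,'ember-power-select-trigger')]")); // 0
    }
}
```

The last line is why the combined expression uses // between the two steps: it tolerates wrapper elements between the container and the target.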

Other examples

By.xpath("//input[@id='kw1']//input[starts-with(@id,'nice')]/div[1]/form[3]")