5.5 Robust XPath
Below are a few XPath patterns that can be used to make your XPath expressions more robust.
Using starts-with: checks whether the text of the td starts with 'Returns'.
.//*[@id='content']/table[1]/tbody/tr/td[starts-with(text(),'Returns')]
Using contains: checks whether the text of the td contains 'number'.
.//*[@id='content']/table[1]/tbody/tr/td[contains(text(),'number')]
Using a wildcard: selects all immediate child elements of the element that precedes /* (here, html).
html/*
Finding matches by comparing attributes
.//*[@id='content']/table[@class='code-snippet']
A looser match can be written using contains, which accepts any class value that contains the given substring
.//*[@id='content']/table[contains(@class, 'de-sni')]
And using starts-with, which accepts any class value that begins with the given prefix
.//*[@id='content']/table[starts-with(@class, 'code')]
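The difference between the three attribute matches can be seen on a small sample. This is a minimal sketch, not a Selenium example: it assumes Python's standard xml.etree.ElementTree module, which implements only a subset of XPath (exact attribute match, but no contains() or starts-with()), so the two looser matches are emulated with plain Python string checks. The page fragment and its class names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical page fragment, invented for illustration only.
HTML = """
<html>
  <body>
    <div id="content">
      <table class="code-snippet"><tbody><tr><td>first</td></tr></tbody></table>
      <table class="code-output"><tbody><tr><td>second</td></tr></tbody></table>
      <table class="plain"><tbody><tr><td>third</td></tr></tbody></table>
    </div>
  </body>
</html>
"""

root = ET.fromstring(HTML)

# Exact attribute match: ElementTree evaluates this XPath directly.
exact = root.findall(".//*[@id='content']/table[@class='code-snippet']")

# ElementTree has no contains() or starts-with(), so the two looser
# matches are emulated with Python string methods on the same tables.
tables = root.findall(".//*[@id='content']/table")
by_contains = [t for t in tables if "de-sni" in t.get("class", "")]
by_prefix = [t for t in tables if t.get("class", "").startswith("code")]

print([t.get("class") for t in exact])
print([t.get("class") for t in by_contains])
print([t.get("class") for t in by_prefix])
```

On this sample the prefix check also picks up the second table, which is the trade-off of a looser match: it survives small changes to the class name but may select more elements than intended.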
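Using axes
div elements that have a class attribute
.//*[@id='main']/div[attribute::class]
div elements that have at least one child element
.//*[@id='main']/div[child::*]
div elements that have at least one text node as a child (the div is selected, not the text node)
.//*[@id='main']/div[child::text()]
div elements that have a child element which itself has a price child
.//*[@id='main']/div[child::*/child::price]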
Using axes in conjunction with contains
.//*[@id='main']/div[contains(attribute::class,'chapter')]
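Using operators: the predicate below strings together arithmetic (+, -, *, div, mod), comparison (=, !=, <, >, >=) and logical (and, or) operators purely to show the syntax. Because and binds tighter than or, and 4>=4 is always true, the predicate as written matches whatever the text length is.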
.//*[@id='main']/table/tbody/tr[4]/td[2][string-length(text())-1+1*1 div 1=12 and 3<4 and 4>3 or 4>=4 or 5!=6 and 5 mod 3 = 2]
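String handling: substring-after returns the part of the text that follows 'Sub', so this matches a cell whose text is, for example, 'Subtraction'.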
.//*[@id='main']/table/tbody/tr[4]/td[2][substring-after(text(),'Sub') = 'traction']
Using the union operator |: selects all the title and price elements of all book elements
//book/title | //book/price
Selects all the child nodes of the bookstore element
/bookstore/*
Selects all elements in the document
//*
Selects all title elements which have at least one attribute
//title[@*]
Child with child: all <book> elements that have an <author> child which itself has a <degree> child.
book[author/degree]
The second text node in each <p> element in the context node.
p/text()[2]
The first two <degree> elements that are children of the context node.
degree[position() < 3]
The first three books (1, 2, 3).
book[position() <= 3]
All <author> elements that contain at least one <degree> element child and at least one <award> element child.
author[degree][award]
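Stacked predicates act as a logical AND: each one filters the result of the previous one. The sketch below tries this with Python's standard xml.etree.ElementTree module, which supports this particular form; the sample bookstore data is invented for illustration, and the expressions are evaluated with <bookstore> as the context node, hence the book/ prefix.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, invented for illustration only.
XML = """
<bookstore>
  <book>
    <author><name>A</name><degree>PhD</degree><award>Prize</award></author>
    <author><name>B</name><degree>MSc</degree></author>
  </book>
  <book>
    <author><name>C</name><award>Medal</award></author>
  </book>
</bookstore>
"""

store = ET.fromstring(XML)  # the context node is <bookstore>

# One predicate: authors with at least one <degree> child.
with_degree = [a.findtext("name") for a in store.findall("book/author[degree]")]

# Two predicates: authors with both a <degree> and an <award> child.
with_both = [a.findtext("name")
             for a in store.findall("book/author[degree][award]")]

print(with_degree)
print(with_both)
```

Author B has a degree but no award and author C has an award but no degree, so only author A passes both predicates.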
The last <book> element of the current context node.
book[last()]
The last <author> child of each <book> element of the current context node.
book/author[last()]
The last <author> element from the entire set of <author> children of <book> elements of the current context node.
(book/author)[last()]
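The three last() forms above are easy to confuse, so here is a minimal sketch that runs them side by side. It assumes Python's standard xml.etree.ElementTree module, which handles the first two expressions but cannot parse the parenthesised form; that one is emulated by taking the last item of the full result list. The sample data is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, invented for illustration only.
XML = """
<bookstore>
  <book><title>One</title><author>A</author><author>B</author></book>
  <book><title>Two</title><author>C</author><author>D</author><author>E</author></book>
</bookstore>
"""

store = ET.fromstring(XML)  # the context node is <bookstore>

# book[last()]: the last <book> child of the context node.
last_book = [b.findtext("title") for b in store.findall("book[last()]")]

# book/author[last()]: last() is evaluated per <book>, so one author per book.
last_per_book = [a.text for a in store.findall("book/author[last()]")]

# (book/author)[last()]: ElementTree cannot parse the parenthesised form,
# so the grouping is emulated by taking the last item of the whole list.
last_overall = [a.text for a in store.findall("book/author")[-1:]]

print(last_book)
print(last_per_book)
print(last_overall)
```

Without parentheses the predicate applies within each parent, which yields one author per book; with parentheses the whole node set is built first and only then filtered, which yields a single author.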
All div elements that do not have a style attribute
//div[not(@style)]