5.5 Robust Xpath

Below are few xpath combinations that can be used to make your xpaths more robust.

Using starts-with : it checks if text in td starts with ‘Returns’
.//*[@id='content']/table[1]/tbody/tr/td[starts-with(text(),'Returns')]

Using contains : it checks if the td contains text ‘number’.
.//*[@id='content']/table[1]/tbody/tr/td[contains(text(),'number')]

Using wildcard : Select all immediate child elements of the tag preceding /*
html/*

Finding matches by comparing attributes
.//*[@id='content']/table[@class='code-snippet']

Same can be written using contains
.//*[@id='content']/table[contains(@class, 'de-sni')]
And starts-with
.//*[@id='content']/table[starts-with(@class, 'code')]

Using axes
.//*[@id='main']/div[attribute::class]
.//*[@id='main']/div[child::*]
.//*[@id='main']/div[child::text()] select only text nodes
.//*[@id='main']/div[child::*/child::price]

Using axes in conjuction with contains
.//*[@id='main']/div[contains(attribute::class,'chapter')]

Using operators
.//*[@id='main']/table/tbody/tr[4]/td[2][string-length(text())-1+1*1 div 1=12 and 3<4 and 4>3 or 4>=4 or 5!=6 and 5 mod 3 = 2]

String handling :
.//*[@id='main']/table/tbody/tr[4]/td[2][substring-after(text(),'Sub') = 'traction']

Selects all the title AND price elements of all book elements
//book/title | //book/price

Selects all the child nodes of the bookstore element
/bookstore/*

//* Selects all elements in the document
//title[@*] Selects all title elements which have any attribute

child with child
book[author/degree]

The second text node in each <p> element in the context node.
p/text()[2]

The first two <degree> elements that are children of the context node.
degree[position() &lt; 3]

The first three books (1, 2, 3).
book[position() &lt;= 3]

All <author> elements that contain at least one <degree> element child and at least one <award> element child.
author[degree][award]

The last <book> element of the current context node.
book[last()]

The last <author> child of each <book> element of the current context node.
book/author[last()]

The last <author> element from the entire set of <author> children of <book> elements of the current context node.
(book/author)[last()]

all div that do not contain style attribute
//div[not(@style)]

Last updated

Was this helpful?