XPath

XPath is a flexible and powerful query language for selecting elements in XML and (X)HTML documents. It is also used in XSLT transformations and for navigating the DOM. It is a standard created by the W3C consortium.

What is XPath for in ZennoPoster?

For parsing data from websites (action ❗→ Parse data)
For searching for and interacting with web page elements
❗→ Perform action
❗→ Set value
❗→ Get value
You can use it in the ❗→ Action Builder.

With XPath, you can create a more universal and robust data search algorithm that is less sensitive to website layout changes compared to ❗→ regular expressions. This query language allows you to significantly simplify parser logic and speed up development.

Testing queries as you create them

ZennoPoster comes with a built-in ❗→ X/Json Path Tester that lets you test the expressions you've created.
You can also build and test XPath expressions in the ❗→ Web Developer Tools (DevTools) window: open DevTools, press Ctrl+F to open the search bar, and enter your XPath expression there:

For example, to get the names of events on the http://w3.org website, you can use the following expression:

//*[@id="w3c_home_upcoming_events"]/ul/li//a
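
To show roughly how this expression selects nodes, the sketch below runs an equivalent query with Python's standard xml.etree.ElementTree module against a small hand-written XHTML fragment. The fragment and the event names in it are invented for the example (the real w3.org markup may differ). ElementTree implements only a subset of XPath, so the path is written with a leading dot.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment that mimics the structure the expression expects;
# the real page markup and event names may differ.
PAGE = """
<html>
  <body>
    <div id="w3c_home_upcoming_events">
      <ul>
        <li><p><a href="/e/1">Event One</a></p></li>
        <li><a href="/e/2">Event Two</a></li>
      </ul>
    </div>
    <div id="other"><ul><li><a href="/x">Not an event</a></li></ul></div>
  </body>
</html>
"""

def event_names(xhtml):
    root = ET.fromstring(xhtml)
    # ElementTree paths must be relative, hence the leading "." before "//".
    links = root.findall(".//*[@id='w3c_home_upcoming_events']/ul/li//a")
    return [link.text for link in links]

print(event_names(PAGE))
```

The link inside the second div is skipped because its container has a different id, and the link wrapped in a paragraph is still found because li//a matches at any depth.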

Basic syntax

Paths

Expression	Description
.	current context
.//	recursive descent (zero or more levels down from the current context)
/html/body	absolute path
a	relative path
//*	all elements in the document
li//a	links that are descendants of li at any depth
//a|//button	links and buttons (union of two node sets)
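
The difference between a child step (a) and a descendant step (.//a) is easy to miss, so here is a minimal sketch using Python's standard xml.etree.ElementTree module. The markup is made up for the example. ElementTree supports only part of XPath and has no | operator, so the union is approximated in plain Python.

```python
import xml.etree.ElementTree as ET

# Made-up document used only for this illustration.
DOC = ET.fromstring(
    "<html><body>"
    "<ul><li><a>direct</a><span><a>nested</a></span></li></ul>"
    "<button>Send</button>"
    "</body></html>"
)

li = DOC.find("./body/ul/li")                   # step-by-step path from the root

direct = [a.text for a in li.findall("a")]      # a    -> child links only
nested = [a.text for a in li.findall(".//a")]   # .//a -> links at any depth

# ElementTree has no "|" operator, so //a|//button is approximated by
# filtering a document-order walk for both tag names.
union = [el.tag for el in DOC.iter() if el.tag in ("a", "button")]

print(direct, nested, union)
```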

Relationships

Expression	Description
a/i/parent::p	immediate parent <p>
p/ancestor::*	all ancestors
p/following-sibling::*	all following siblings
p/preceding-sibling::*	all preceding siblings
p/following::*	all elements after except descendants
p/preceding::*	all elements before except ancestors
p/descendant-or-self::*	the context node and all its descendants
p/ancestor-or-self::*	the context node and all its ancestors
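
The axes above can be pictured as walks through the document tree. Python's standard xml.etree.ElementTree module does not implement XPath axes, so the sketch below emulates three of them by hand to show which nodes each one covers. The helper names (ancestors, following_siblings, preceding_siblings) and the markup are hypothetical and exist only for this illustration.

```python
import xml.etree.ElementTree as ET

# Made-up document; the <p id="ctx"> element plays the role of the context node.
ROOT = ET.fromstring(
    "<body><div><h1>t</h1><p id='ctx'><i>x</i></p><ul/><hr/></div></body>"
)

# ElementTree elements do not store their parent, so build a lookup first.
PARENT = {child: parent for parent in ROOT.iter() for child in parent}

def ancestors(node):
    """Rough equivalent of node/ancestor::* (nearest ancestor first)."""
    found = []
    while node in PARENT:
        node = PARENT[node]
        found.append(node)
    return found

def following_siblings(node):
    """Rough equivalent of node/following-sibling::*."""
    siblings = list(PARENT[node])
    return siblings[siblings.index(node) + 1:]

def preceding_siblings(node):
    """Rough equivalent of node/preceding-sibling::* (in document order)."""
    siblings = list(PARENT[node])
    return siblings[:siblings.index(node)]

p = ROOT.find(".//p[@id='ctx']")
print([e.tag for e in ancestors(p)])
print([e.tag for e in following_siblings(p)])
print([e.tag for e in preceding_siblings(p)])
```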

Getting nodes

Expression	Description
/div/text()	get text nodes
/div/text()[1]	get the first text node
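
A text node is a piece of text that sits directly inside an element, between its child tags. The sketch below approximates /div/text() with Python's standard xml.etree.ElementTree module, which has no text() step and instead stores such text in the text and tail fields. The text_nodes helper and the sample markup are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

def text_nodes(element):
    """Approximate element/text(): the element's direct text chunks, in order."""
    chunks = [element.text] + [child.tail for child in element]
    return [c for c in chunks if c]  # drop missing (None) and empty chunks

div = ET.fromstring("<div>Hello <b>bold</b> world<br/>!</div>")
nodes = text_nodes(div)
print(nodes)      # every text node that belongs directly to the div
print(nodes[0])   # counterpart of /div/text()[1]
```

The word inside the b element is not returned because it belongs to b, not to the div.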

Element position

Expression	Description
a[1]	first element
a[last()]	last element
a[2]	second link
a[position() <= 3]	first 3 links
ul[li[1]="OK"]	list (UL) whose first item is 'OK'
tr[position() mod 2 = 1]	odd elements
tr[position() mod 2 = 0]	even elements
p/text()[2]	second text node
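
Positions in XPath start at 1, not 0. The sketch below checks a few of the positional expressions with Python's standard xml.etree.ElementTree module on made-up markup. ElementTree understands numeric indexes and last(), but not position(), so those rows are emulated with list slicing.

```python
import xml.etree.ElementTree as ET

# Made-up markup with four sibling links.
NAV = ET.fromstring("<nav><a>one</a><a>two</a><a>three</a><a>four</a></nav>")

def texts(elements):
    return [e.text for e in elements]

first = texts(NAV.findall("a[1]"))        # a[1]
second = texts(NAV.findall("a[2]"))       # a[2]
last = texts(NAV.findall("a[last()]"))    # a[last()]

# position() is not available in ElementTree, so these are emulated with
# Python slicing; note that XPath counts from 1 and Python from 0.
links = NAV.findall("a")
first_three = texts(links[:3])            # a[position() <= 3]
odd = texts(links[0::2])                  # a[position() mod 2 = 1]
even = texts(links[1::2])                 # a[position() mod 2 = 0]

print(first, second, last, first_three, odd, even)
```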

Attributes and filters (square brackets [] filter the selected elements)

Expression	Description
input[@type="text"]	<input> tag with type attribute equal to text
input[@class='OK']	<input> tag with class attribute equal to OK
p[not(@*)]	paragraphs with no attributes
*[@style]	all elements with a style attribute
a[. = "OK"]	links with value "OK"
a/@id	link IDs
a/@*	all link attributes
a[@id and @rel], a[@id][@rel]	links that have both id and rel attributes (two equivalent forms)
a[i or b]	links containing an <i> or <b> element
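
Several of these filters can be tried with Python's standard xml.etree.ElementTree module, as in the sketch below. The form markup is invented for the example. ElementTree covers only part of XPath: it has no and/or operators and cannot return attribute nodes, so a/@id is read with get() instead, and the [.='OK'] form needs Python 3.7 or newer.

```python
import xml.etree.ElementTree as ET

# Made-up markup that contains one match for each filter.
FORM = ET.fromstring(
    "<form>"
    "<input type='text' name='q'/>"
    "<input type='submit' class='OK'/>"
    "<a id='home' rel='nofollow'>OK</a>"
    "<a id='about'><i>About</i></a>"
    "<p style='color:red'>styled</p>"
    "</form>"
)

text_inputs = FORM.findall(".//input[@type='text']")    # input[@type="text"]
styled = FORM.findall(".//*[@style]")                    # *[@style]
both = FORM.findall(".//a[@id][@rel]")                   # a[@id][@rel]
ok_links = FORM.findall(".//a[.='OK']")                  # a[. = "OK"]
with_i = FORM.findall(".//a[i]")                         # links containing <i>
link_ids = [a.get("id") for a in FORM.findall(".//a")]   # a/@id

print([e.get("name") for e in text_inputs], [e.tag for e in styled], link_ids)
```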

Functions

Core XPath functions: http://www.w3.org/TR/xpath/#corelib

Function	Description	Example
name()	Returns the element name	[name()='a']
string(val)	Converts a value to a string, for example to get the value of an attribute	string(a[1]/@id)
substring(val, from, length)	Returns length characters of val starting at position from (positions are counted from 1)	substring(@id, 1, 6)
substring-before(val, to)	Return the part of val before the string to	substring-before('12-May-1998', '-') = '12'
substring-after(val, from)	Return the part of val after the string from	substring-after('12-May-1998', '-') = 'May-1998'
string-length()	Returns the number of characters in a string	[string-length(text()) > 5]
count()	Returns the number of nodes in a node set
concat()	Takes two or more strings and returns the concatenation of its arguments.
normalize-space()	Like Trim, and also collapses runs of whitespace inside the string into single spaces	[normalize-space(text())='SEARCH']
starts-with()	Starts with	[starts-with(text(), 'SEARCH')]
contains()	Contains	[contains(name(), 'SEARCH')]
translate(val, from, to)	Replaces the characters in the first string argument that appear in the second argument with the corresponding characters in the third argument.	translate("bar", "abc", "ABC") = "BAr"
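
The string functions are easier to reason about next to plain code. The sketch below re-implements a few of them in Python purely to illustrate their behaviour; the Python function names are hypothetical, substring is simplified to integer arguments, and none of this is how an XPath engine is actually written.

```python
import re

def substring(value, start, length):
    """XPath substring() for integer arguments: 1-based start, then a length."""
    begin = max(start - 1, 0)
    end = max(start - 1 + length, 0)
    return value[begin:end]

def substring_before(value, marker):
    """Part of value before the first marker, or "" when marker is absent."""
    if not marker:
        return ""
    head, found, _ = value.partition(marker)
    return head if found else ""

def substring_after(value, marker):
    """Part of value after the first marker, or "" when marker is absent."""
    if not marker:
        return value
    _, found, tail = value.partition(marker)
    return tail if found else ""

def normalize_space(value):
    """Trim both ends and collapse inner runs of space, tab, CR and LF."""
    return re.sub(r"[ \t\r\n]+", " ", value).strip(" ")

def translate(value, source, target):
    """Map source[i] to target[i]; source characters with no partner are dropped."""
    table = {}
    for i, ch in enumerate(source):
        # The first occurrence of a repeated source character wins.
        table.setdefault(ord(ch), target[i] if i < len(target) else None)
    return value.translate(table)

print(substring("12-May-1998", 1, 6))
print(substring_before("12-May-1998", "-"), substring_after("12-May-1998", "-"))
print(normalize_space("  SEARCH \n me "), translate("bar", "abc", "ABC"))
```

Note that the third argument of substring is a length, not an end position, which is a common source of off-by-one mistakes.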

Grouping

Expression	Description
(table/tbody/tr)[last()]	the last <tr> row among all tables (one row in total, not one per table)
(//h1|//h2)[contains(text(), 'Text')]	heading 1 or 2 containing 'Text'
a[//tr/@data-id=@data-id]	all links whose data-id attribute matches the data-id attribute of a table row
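
Parentheses change what a positional predicate applies to. The sketch below contrasts the grouped and ungrouped forms using Python's standard xml.etree.ElementTree module on made-up markup. ElementTree does not support parenthesised expressions, so the grouped form is emulated by indexing the full result list.

```python
import xml.etree.ElementTree as ET

# Made-up markup: two tables with two rows each.
DOC = ET.fromstring(
    "<body>"
    "<table><tbody><tr id='a1'/><tr id='a2'/></tbody></table>"
    "<table><tbody><tr id='b1'/><tr id='b2'/></tbody></table>"
    "</body>"
)

# table/tbody/tr[last()]: the predicate is applied per parent,
# so every table contributes its own last row.
last_per_table = [tr.get("id") for tr in DOC.findall("table/tbody/tr[last()]")]

# (table/tbody/tr)[last()]: the parentheses collect all rows first and the
# predicate then picks one node; emulated here by indexing the full list.
last_overall = DOC.findall("table/tbody/tr")[-1].get("id")

print(last_per_table, last_overall)
```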

Useful links

https://ru.wikipedia.org/wiki/XPath
https://www.w3schools.com/xml/xpath_syntax.asp
