Getting Your Way Around With Common XPath Methods

by Eugene Truuts, October 18th, 2023

This tutorial walks through common methods for selecting elements and data in XML or HTML documents, with a short explanation and an example XPath expression for each.

Node Selection

Node selection in XPath refers to choosing specific elements, attributes, or nodes within an XML or HTML document based on their type or location in the document’s hierarchy.

//img
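To try this without a browser, here is a minimal sketch using Python's standard-library ElementTree, which supports a limited subset of XPath. The sample markup is made up for illustration, and ".//img" is ElementTree's relative spelling of "//img".

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup, not taken from a real page.
doc = ET.fromstring(
    "<div>"
    "<img src='a.png'/>"
    "<p><img src='b.png'/></p>"
    "</div>"
)

# ".//img" finds every img element below the root, at any depth.
images = doc.findall(".//img")
sources = [img.get("src") for img in images]
print(sources)
```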

Attribute Selection

Attribute selection in XPath involves choosing elements within an XML or HTML document based on their attributes’ values.

//*[@id = 'gridItemRoot']
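The same kind of attribute test can be sketched with Python's ElementTree, which understands the [@attribute='value'] form. The markup below is an invented stand-in for a product grid; only the id value comes from the expression above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup; only the id value mirrors the article's example.
doc = ET.fromstring(
    "<body>"
    "<div id='gridItemRoot'>Book A</div>"
    "<div id='sidebar'>Ads</div>"
    "<section id='gridItemRoot'>Book B</section>"
    "</body>"
)

# "*" matches any element name, so only the id attribute decides the match.
matches = doc.findall(".//*[@id='gridItemRoot']")
tags = [el.tag for el in matches]
print(tags)
```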

Predicate Filtering

Predicate filtering in XPath narrows a selection with conditions written inside square brackets. The example below selects span elements whose class contains 'sc-price' and whose text, with the dollar sign stripped, is a number below 10.00.

//span[contains(@class, 'sc-price') and number(translate(., '$', '')) < 10.00]
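Python's built-in ElementTree does not implement contains(), translate(), or number(), so the sketch below applies the same two conditions in plain Python instead of evaluating the expression itself. The helper name is_cheap_price and the sample markup are assumptions made for illustration; a full XPath 1.0 engine such as lxml could run the expression directly.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup.
doc = ET.fromstring(
    "<ul>"
    "<li><span class='sc-price bold'>$8.99</span></li>"
    "<li><span class='sc-price'>$14.50</span></li>"
    "<li><span class='title'>$1.00</span></li>"
    "</ul>"
)

def is_cheap_price(span, limit=10.00):
    # Mirrors contains(@class, 'sc-price').
    if "sc-price" not in span.get("class", ""):
        return False
    # Mirrors number(translate(., '$', '')) < limit.
    text = "".join(span.itertext()).replace("$", "")
    try:
        return float(text) < limit
    except ValueError:
        # XPath's number() would yield NaN here, which fails the comparison.
        return False

cheap = [span.text for span in doc.iter("span") if is_cheap_price(span)]
print(cheap)
```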

Positional Selection

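Positional selection in XPath chooses elements by their position (index) in the document's structure. Indexes start at 1, and a positional predicate counts matches within each parent; to take the fourth match in the whole document, wrap the path in parentheses: (//*[@id = 'gridItemRoot'])[4].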

//*[@id = 'gridItemRoot'][4]
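A small sketch of how a positional predicate is counted, using Python's ElementTree on invented markup. The predicate [2] is applied per parent, while indexing the Python list of all matches plays the role of a parenthesized expression such as (//li)[3]. Note that Python lists start at 0 and XPath positions start at 1.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup: two lists with two items each.
doc = ET.fromstring(
    "<div>"
    "<ul><li>A1</li><li>A2</li></ul>"
    "<ul><li>B1</li><li>B2</li></ul>"
    "</div>"
)

# li[2]: the second li inside each parent, so one hit per list.
second_in_each = [li.text for li in doc.findall(".//li[2]")]

# li[3]: no list has three items, so nothing matches.
third_in_each = doc.findall(".//li[3]")

# Third li in the whole document, comparable to (//li)[3].
third_overall = doc.findall(".//li")[2].text

print(second_in_each, third_in_each, third_overall)
```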

Text Content Selection

Text content selection in XPath refers to choosing elements within an XML or HTML document based on the textual content contained within those elements.

//*[text()='The 48 Laws of Power']
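ElementTree has no text() node test, but on Python 3.7 or later it accepts [.='value'], which compares an element's complete text content. For leaf elements like the ones in this invented sample, that behaves like the expression above; treat it as an approximation rather than an exact equivalent.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup.
doc = ET.fromstring(
    "<div>"
    "<a>The 48 Laws of Power</a>"
    "<a>Atomic Habits</a>"
    "</div>"
)

# [.='...'] keeps elements whose full text equals the given string.
hits = doc.findall(".//*[.='The 48 Laws of Power']")
hit_tags = [el.tag for el in hits]
print(hit_tags)
```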

Logical Operators

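Logical operators in XPath (and, or, and the not() function) combine or negate conditions within a predicate to make more complex selections. The example below selects each gridItemRoot div that contains both a small star icon and a price under 10.00; the leading dot in .// keeps each condition scoped to that div's own descendants.

//div[@id='gridItemRoot' and .//*[contains(@class, 'a-icon-star-small')] and .//span[contains(@class, 'sc-price') and number(translate(., '$', '')) < 10.00]]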
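To show how the combined conditions interact, the following sketch re-creates them in plain Python on top of ElementTree, since ElementTree cannot evaluate contains() or number(). Each condition is checked against the descendants of the candidate div only. The helper names has_class and price_below and the sample markup are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup: only the first item is both rated and cheap.
doc = ET.fromstring(
    "<body>"
    "<div id='gridItemRoot'><i class='a-icon a-icon-star-small'/>"
    "<span class='sc-price'>$7.50</span></div>"
    "<div id='gridItemRoot'><span class='sc-price'>$5.00</span></div>"
    "<div id='gridItemRoot'><i class='a-icon-star-small'/>"
    "<span class='sc-price'>$19.99</span></div>"
    "</body>"
)

def has_class(el, name):
    # Mirrors contains(@class, name).
    return name in el.get("class", "")

def price_below(el, limit):
    # Mirrors number(translate(., '$', '')) < limit.
    return float("".join(el.itertext()).replace("$", "")) < limit

matches = []
for div in doc.findall(".//div[@id='gridItemRoot']"):
    descendants = list(div.iter())[1:]  # skip the div itself
    rated = any(has_class(d, "a-icon-star-small") for d in descendants)
    cheap = any(
        d.tag == "span" and has_class(d, "sc-price") and price_below(d, 10.00)
        for d in descendants
    )
    if rated and cheap:  # the XPath "and"
        matches.append(div)

prices = [m.find("span").text for m in matches]
print(prices)
```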

Axis Selection

Axis selection in XPath involves navigating the document’s hierarchy based on the relationships between elements and nodes, allowing you to select elements related to a specific context node.

Parent Selection

The parent axis selects the parent of the context node, that is, its immediate enclosing element. It can be written in full as parent::* or abbreviated as two dots (..).

//li[contains(.//span, 'Comics & Graphic Novels')]/parent::*
or
//li[contains(.//span, 'Comics & Graphic Novels')]/..
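ElementTree supports the abbreviated ".." step, so the parent lookup can be sketched as below. Two assumptions: the markup is invented, and because ElementTree lacks contains(), the predicate uses an exact match on a child span (span='...') instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup; "&amp;" is the XML escape for "&".
doc = ET.fromstring(
    "<nav><ul id='genres'>"
    "<li><span>Biographies</span></li>"
    "<li><span>Comics &amp; Graphic Novels</span></li>"
    "</ul></nav>"
)

# Find the matching li, then step up to its parent with "..".
parents = doc.findall(".//li[span='Comics & Graphic Novels']/..")
parent_ids = [p.get("id") for p in parents]
print(parent_ids)
```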

Preceding Sibling Selection

Preceding sibling selection in XPath allows you to select nodes that share a parent with the context node and appear before it in document order.

//li[contains(.//span, 'Comics & Graphic Novels')]/preceding-sibling::*

Following Sibling Selection

Following sibling selection in XPath allows you to select nodes that share a parent with the context node and appear after it in document order.

//li[contains(.//span, 'Comics & Graphic Novels')]/following-sibling::*
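ElementTree does not implement the preceding-sibling or following-sibling axes, so this sketch reproduces both by slicing the parent's list of children around the context node. The markup and the shortened category names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup.
doc = ET.fromstring(
    "<ul>"
    "<li>Arts</li><li>Biographies</li><li>Comics</li><li>History</li>"
    "</ul>"
)

siblings = list(doc)  # children of ul, in document order
idx = next(i for i, li in enumerate(siblings) if li.text == "Comics")

# Everything before the context node: preceding-sibling::*
preceding = [li.text for li in siblings[:idx]]
# Everything after the context node: following-sibling::*
following = [li.text for li in siblings[idx + 1:]]

print(preceding, following)
```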

Child Selection

Child selection in XPath involves selecting elements that are direct children of a given parent element or context node within the XML or HTML document.

//li[contains(.//span, 'Comics & Graphic Novels')]/../child::*
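The "go up, then list the children" pattern can be sketched with ElementTree, where a trailing "/*" selects the child elements. As before, the markup is invented and the predicate uses an exact match on the span because contains() is not available in ElementTree.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup.
doc = ET.fromstring(
    "<div><ul>"
    "<li><span>Comics</span></li>"
    "<li><span>History</span></li>"
    "<li><span>Travel</span></li>"
    "</ul></div>"
)

# Matching li -> its parent (..) -> all children of that parent (*).
children = doc.findall(".//li[span='Comics']/../*")
labels = [li.find("span").text for li in children]
print(labels)
```

The result includes the starting li itself, since it is also a child of its own parent.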

Wildcards

Wildcard selection in XPath uses the * symbol to match elements or attributes regardless of their names: * matches any element and @* matches any attribute. The expression below selects every element in the document.

//*
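A short sketch of the element wildcard in ElementTree, on invented markup. One difference worth knowing: ElementTree paths are relative to the element you call findall() on, so ".//*" returns its descendants but not the element itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup.
doc = ET.fromstring(
    "<html><body><div><p>one</p></div><p>two</p></body></html>"
)

# Every element below the root, at any depth.
all_descendants = [el.tag for el in doc.findall(".//*")]

# Only the direct children of body, whatever their names.
body_children = [el.tag for el in doc.findall("./body/*")]

print(all_descendants, body_children)
```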

Functions

Functions in XPath are predefined operations or calculations that you can use within an XPath expression to manipulate or evaluate nodes, attributes, or values in XML or HTML documents.

//*[@class = 'a-size-small']//child::*[starts-with(text(), 'It')]
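XPath functions such as starts-with() have close counterparts in most host languages. Because ElementTree does not evaluate XPath functions, this sketch uses Python's str.startswith() on each descendant's leading text as a stand-in; the sample markup is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup.
doc = ET.fromstring(
    "<div class='a-size-small'>"
    "<span>It Ends with Us</span>"
    "<span>Iron Flame</span>"
    "<a>It Starts with Us</a>"
    "</div>"
)

# Keep descendants whose own text begins with "It", like starts-with(text(), 'It').
hits = [
    el.text
    for el in doc.iter()
    if el is not doc and (el.text or "").startswith("It")
]
print(hits)
```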

These methods offer versatile ways to locate specific elements, attributes, or data within XML and HTML documents, making XPath a powerful tool for tasks such as web scraping, data extraction, and test automation.