
The DOMXPath class

(PHP 5, PHP 7)


Supports XPath 1.0

Class synopsis

DOMXPath {
/* Properties */
/* Methods */
public __construct ( DOMDocument $doc )
public mixed evaluate ( string $expression [, DOMNode $contextnode [, bool $registerNodeNS = true ]] )
public DOMNodeList query ( string $expression [, DOMNode $contextnode [, bool $registerNodeNS = true ]] )
public bool registerNamespace ( string $prefix , string $namespaceURI )
public void registerPhpFunctions ([ mixed $restrict ] )
}
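
As a rough illustration of how query() and evaluate() from the synopsis differ, here is a minimal sketch. The sample XML and the variable names are made up for this example. query() returns a DOMNodeList, while evaluate() can also return a typed result such as a number or a string.

```php
<?php
// Sample document (an assumption for illustration only).
$doc = new DOMDocument();
$doc->loadXML(
    '<library>'
    . '<book id="b1"><title>First</title></book>'
    . '<book id="b2"><title>Second</title></book>'
    . '</library>'
);
$xpath = new DOMXPath($doc);

// query() always yields a node list.
$titleTexts = [];
foreach ($xpath->query('//book/title') as $title) {
    $titleTexts[] = $title->textContent;
}

// evaluate() returns a typed value when the expression is not a node set.
$bookCount = $xpath->evaluate('count(//book)');

// Both methods accept a context node for relative expressions.
$firstBook  = $xpath->query('//book')->item(0);
$firstTitle = $xpath->evaluate('string(title)', $firstBook);

echo implode(', ', $titleTexts), "\n";
echo $bookCount, ' ', $firstTitle, "\n";
```

Passing the context node keeps the second evaluate() call relative to the first book rather than to the document root.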





User Contributed Notes 6 notes

Mark Omohundro, ajamyajax dot com
8 years ago
// to retrieve selected html data, try these DomXPath examples:

// $DOCUMENT_ROOT only exists with register_globals; read it from $_SERVER instead
$file = $_SERVER['DOCUMENT_ROOT'] . "/test.html";
$doc = new DOMDocument();
$doc->loadHTMLFile($file);

$xpath = new DOMXPath($doc);

// example 1: for everything with an id
//$elements = $xpath->query("//*[@id]");

// example 2: for node data in a selected id
//$elements = $xpath->query("/html/body/div[@id='yourTagIdHere']");

// example 3: same as above with wildcards in place of html and body
$elements = $xpath->query("/*/*/div[@id='yourTagIdHere']");

// query() returns false (not null) when the expression is malformed
if ($elements !== false) {
  foreach ($elements as $element) {
    echo "<br/>[". $element->nodeName. "]";

    $nodes = $element->childNodes;
    foreach ($nodes as $node) {
      echo $node->nodeValue. "\n";
    }
  }
}
archimedix32783262 at mailinator dot com
2 years ago
Note that evaluate() will use the same encoding as the XML document.

So if you have a UTF-16 XML, you will have to query using UTF-16 strings.

You can use iconv() to convert from your code's encoding to the target encoding for better legibility.
peter at softcoded dot com
22 days ago
You may not always know at runtime whether your file has
a namespace or not. This can make it difficult to create
XPath queries. Use the seriously underdocumented
"namespaceURI" property of the documentElement of a
DOMDocument to determine if there is a namespace.
Use code such as the following:

$doc = new DOMDocument();
$doc->load($file); // $file: path to your XML document
$xpath = new DOMXPath($doc);
$ns = $doc->documentElement->namespaceURI;
if($ns) {
  $xpath->registerNamespace("ns", $ns);
  $nodes = $xpath->query("//ns:em[@class='glossterm']");
} else {
  $nodes = $xpath->query("//em[@class='glossterm']");
}
//look at nodes here
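
To make the namespace check above easy to try, here is a minimal self-contained sketch that runs the same logic against two inline documents. The helper name findGlossTerms, the sample XML and the http://example.com/ns URI are assumptions for illustration, and the type declarations need PHP 7.

```php
<?php
// Hypothetical helper: returns the text of every glossterm <em>, whether or
// not the document element is in a namespace.
function findGlossTerms(string $xml): array
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);
    $xpath = new DOMXPath($doc);

    // namespaceURI is null when the root element has no namespace.
    $ns = $doc->documentElement->namespaceURI;
    if ($ns) {
        $xpath->registerNamespace('ns', $ns);
        $nodes = $xpath->query("//ns:em[@class='glossterm']");
    } else {
        $nodes = $xpath->query("//em[@class='glossterm']");
    }

    $terms = [];
    foreach ($nodes as $node) {
        $terms[] = $node->textContent;
    }
    return $terms;
}

$plain      = '<doc><em class="glossterm">XPath</em></doc>';
$namespaced = '<doc xmlns="http://example.com/ns"><em class="glossterm">XPath</em></doc>';

$plainTerms      = findGlossTerms($plain);
$namespacedTerms = findGlossTerms($namespaced);
echo count($plainTerms), ' ', count($namespacedTerms), "\n";
```

Both calls should find the same single term, since the prefix registered with registerNamespace() only has to map to the right URI and does not need to appear in the document itself.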
peter at softcoded dot com
22 days ago
Using XPath expressions can save a lot of programming
and allow you to home in on only the nodes you want.
Suppose you want to delete all empty <p> tags.
If you create a query using the following XPath expression,
you can find <p> tags that do not have any text
(other than spaces), any attributes,
any children or comments:

$expression = "//p[not(@*) 
   and not(*)
   and not(./comment())
   and normalize-space(text())='']";
This expression will only find para tags that look like:

<p>[any number of spaces]</p>

Imagine the code you would have to add if you used
DOMDocument::getElementsByTagName("p") instead.
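
For what it is worth, here is a minimal sketch of using that expression to actually delete the empty paragraphs. The sample markup is made up for the example; the expression is the one given above.

```php
<?php
// Sample markup (an assumption): only the first and last <p> are truly empty.
$doc = new DOMDocument();
$doc->loadXML(
    '<div>'
    . '<p>   </p>'
    . '<p>kept: has text</p>'
    . '<p class="x"> </p>'
    . '<p><b/></p>'
    . '<p><!-- note --></p>'
    . '<p></p>'
    . '</div>'
);
$xpath = new DOMXPath($doc);

$expression = "//p[not(@*)
   and not(*)
   and not(./comment())
   and normalize-space(text())='']";

// Copy the matches into an array before changing the tree.
$emptyParagraphs = iterator_to_array($xpath->query($expression));
foreach ($emptyParagraphs as $p) {
    $p->parentNode->removeChild($p);
}

$removed   = count($emptyParagraphs);
$remaining = $doc->getElementsByTagName('p')->length;
echo "removed $removed, remaining $remaining\n";
```

Copying the result into an array first is a cautious choice, so the loop never iterates over a list while the document is being modified.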
6 years ago
I just spent far too much time chasing this one....

When running an XPath query on a table, be careful about the table's internal nodes (i.e. <tr></tr> and <td></td>). If the enclosing <table> tag is missing, query() (and likely evaluate() too) will return unexpected results.

I had a DOMNode with a structure like this:


Upon which I was trying to do a relative query (ie: <?php $xpath_obj->query('my/x/path', $relative_node); ?>).

But because of the lone outer <td></td> tags, the inner tags were being invalidated, while the nodes were still recognized.  Meaning that the following query would work:

<?php $xpath_obj->query('*[2]/*[*[2]]', $relative_node); ?>

But when replacing any of the "*" tokens with the corresponding (and valid) "table", "tr", or "td" tokens the query would inexplicably break.
david at lionhead dot nl
7 years ago
When using DOMXPath on a document that has a default namespace, consider using an intermediate function to add the default namespace prefix to all queries:

// The default namespace: x:xmlns="http://..."

// Result: /x:Book/x:Title
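
One way such an intermediate function might look is sketched below. The function name addDefaultNamespace, the regular expression, the sample XML and the http://example.com/ns/book URI are all assumptions for illustration, not the original author's code. It only handles simple location paths: axes such as child::, names inside predicates, and string literals containing "/" are not covered.

```php
<?php
// Hypothetical helper: prefixes every unprefixed element name that directly
// follows a "/" with the given namespace prefix. Names already followed by
// ":" (prefixed) or "(" (functions and node tests) are left alone.
function addDefaultNamespace(string $query, string $prefix = 'x'): string
{
    return preg_replace(
        '#(?<=/)[A-Za-z_][\w.-]*(?![\w.-]*[:(])#',
        $prefix . ':$0',
        $query
    );
}

$doc = new DOMDocument();
$doc->loadXML(
    '<Book xmlns="http://example.com/ns/book"><Title>XPath Basics</Title></Book>'
);
$xpath = new DOMXPath($doc);
$xpath->registerNamespace('x', 'http://example.com/ns/book');

$prefixed        = addDefaultNamespace('/Book/Title');
$withoutPrefix   = $xpath->query('/Book/Title')->length;
$titles          = $xpath->query($prefixed);
$firstTitleText  = $titles->item(0)->textContent;

echo $prefixed, ' => ', $firstTitleText, "\n";
```

The unprefixed query finds nothing because XPath 1.0 treats unprefixed names as belonging to no namespace, which is why the rewrite step is needed at all.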