Tutorial - Path Expressions in Candle

Introduction

In this tutorial, we'll go through path expressions in Candle. Because Candle path expressions are closely based on XPath, this tutorial will be straightforward if you already understand XPath.

XPath-like expressions provide a very powerful way to select nodes from a hierarchical data source. They are much more expressive and convenient than other means such as DOM functions or SAX parsers. Hierarchical data sources are all around us: XML, XHTML and HTML documents, JSON data, and MIME messages are common examples. XPath-like expressions are a fundamental building block of Candle, and probably of any serious Internet scripting language.

We will use the following XML document in the examples below:
<?xml version="1.0" encoding="ISO-8859-1"?>
<bookstore>
  <book category="COOKING">
    <book-title lang="en">Everyday Italian</book-title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="CHILDREN">
    <book-title lang="en">Harry Potter</book-title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="WEB">
    <book-title lang="en">XQuery Kick Start</book-title>
    <author>James McGovern</author>
    <author>Per Bothner</author>
    <author>Kurt Cagle</author>
    <author>James Linn</author>
    <author>Vaidyanathan Nagarajan</author>
    <year>2003</year>
    <price>49.99</price>
  </book>
  <book category="WEB">
    <book-title lang="en">Learning XML</book-title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>

Steps and Axes

Like XPath, Candle path expressions resemble two other well-known path notations: the paths used in traditional file systems, and URIs. This is not by accident but, I think, an intentional design by the XPath designers.

There are two kinds of path expressions: absolute and relative. As in file systems and URIs, an absolute path in Candle starts with the root operator '/', and a relative path starts with a step.

A step in a path expression selects nodes relative to the current node-set. A step consists of an axis, a node test, and an optional predicate. The syntax for a step is:

axis-name::node-test[predicate]
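
To make the anatomy of a step concrete, here is a minimal Python sketch (not Candle code, and not a description of how Candle parses paths) that splits a simple path expression into its steps. The names parse_path and STEP_RE are hypothetical. The sketch assumes there is no '//' abbreviation, no '/' inside a predicate, and that a step without an axis name uses the child axis.

```python
import re

# Hypothetical helper for illustration only: one step is
#   [axis-name::]node-test[[predicate]]
STEP_RE = re.compile(
    r"^(?:(?P<axis>[a-z-]+)::)?(?P<test>[^\[\]]+)(?:\[(?P<pred>.*)\])?$"
)

def parse_path(path):
    """Return (is_absolute, steps) for a simple path expression.

    Each step is a tuple (axis, node_test, predicate_or_None).
    Assumes no '//' abbreviation and no '/' inside predicates.
    """
    is_absolute = path.startswith("/")  # absolute paths start with the root operator
    steps = []
    for raw in path.strip("/").split("/"):
        match = STEP_RE.match(raw)
        if match is None:
            raise ValueError(f"cannot parse step: {raw!r}")
        axis = match.group("axis") or "child"  # child axis assumed when omitted
        steps.append((axis, match.group("test"), match.group("pred")))
    return is_absolute, steps

absolute = parse_path("/bookstore/child::book[1]/attribute::category")
relative = parse_path("book/book-title")
print(absolute)
print(relative)
```

The first call yields an absolute path with three steps, the second a relative path with two steps that both fall back to the child axis.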

There are 12 types of axes defined in Candle:
Axis Name           Result
ancestor            Selects all ancestors (parent, grandparent, etc.) of the current node
ancestor-or-self    Selects all ancestors (parent, grandparent, etc.) of the current node and the current node itself
attribute           Selects all attributes of the current node
child               Selects all children of the current node
descendant          Selects all descendants (children, grandchildren, etc.) of the current node
descendant-or-self  Selects all descendants (children, grandchildren, etc.) of the current node and the current node itself
following           Selects everything in the document after the closing tag of the current node
following-sibling   Selects all siblings after the current node
parent              Selects the parent of the current node
preceding           Selects everything in the document that is before the start tag of the current node
preceding-sibling   Selects all siblings before the current node
self                Selects the current node

The namespace axis defined in XPath is not supported in Candle, as namespace nodes are not considered part of the data model in Candle.
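
The axes are easier to picture when they are spelled out as plain tree walks. The following Python sketch uses the standard xml.etree.ElementTree module on a shortened copy of the sample document. It is an approximation for illustration, not Candle's implementation: the helper names are hypothetical, only element nodes are considered, and siblings are returned in document order.

```python
import xml.etree.ElementTree as ET

XML = """<bookstore>
  <book category="COOKING"><book-title lang="en">Everyday Italian</book-title><price>30.00</price></book>
  <book category="CHILDREN"><book-title lang="en">Harry Potter</book-title><price>29.99</price></book>
  <book category="WEB"><book-title lang="en">Learning XML</book-title><price>39.95</price></book>
</bookstore>"""

root = ET.fromstring(XML)
# ElementTree keeps no parent pointers, so build a child -> parent map once.
parent_of = {child: parent for parent in root.iter() for child in parent}

def ancestors(node):
    """ancestor axis: parent, grandparent, ... up to the root element."""
    result = []
    while node in parent_of:
        node = parent_of[node]
        result.append(node)
    return result

def descendants(node):
    """descendant axis: children, grandchildren, ... in document order."""
    return list(node.iter())[1:]  # iter() yields the node itself first

def following_siblings(node):
    """following-sibling axis: siblings after the node."""
    parent = parent_of.get(node)
    if parent is None:
        return []
    siblings = list(parent)
    return siblings[siblings.index(node) + 1:]

def preceding_siblings(node):
    """preceding-sibling axis: siblings before the node (document order)."""
    parent = parent_of.get(node)
    if parent is None:
        return []
    siblings = list(parent)
    return siblings[:siblings.index(node)]

first_book = root.find("book")
title = first_book.find("book-title")
print([n.tag for n in ancestors(title)])                            # ['book', 'bookstore']
print([n.tag for n in descendants(first_book)])                     # ['book-title', 'price']
print([n.get("category") for n in following_siblings(first_book)])  # ['CHILDREN', 'WEB']
print([n.tag for n in preceding_siblings(first_book)])              # []
```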

There are 5 types of node tests in Candle:
Node Test  Result
nodename   Selects attribute nodes or child element nodes with the specified name
text()     Selects all text nodes among the children of the current node
comment()  Selects all comment nodes among the children of the current node
data()     Selects all data nodes among the children of the current node; a text node is a special kind of data node
node()     Selects all children of the current node

Processing instructions are treated as comments in Candle, and the data node is a new node type introduced in the Candle data model.
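
As a rough analogue of these node tests, the Python sketch below filters the children of an element by node type using the standard xml.dom.minidom module. The children helper and the small inline document are made up for this illustration. Only the child-element case of the name test is shown, and data() is left out because it is specific to the Candle data model.

```python
from xml.dom import Node, minidom

doc = minidom.parseString(
    '<book><!-- on sale -->Intro '
    '<book-title lang="en">Learning XML</book-title>'
    '<price>39.95</price></book>'
)
book = doc.documentElement

def children(node, node_type=None, name=None):
    """Return child nodes, optionally filtered by DOM node type and node name."""
    selected = []
    for child in node.childNodes:
        if node_type is not None and child.nodeType != node_type:
            continue
        if name is not None and child.nodeName != name:
            continue
        selected.append(child)
    return selected

named = children(book, Node.ELEMENT_NODE, "price")  # like the name test: price
texts = children(book, Node.TEXT_NODE)              # like text()
comments = children(book, Node.COMMENT_NODE)        # like comment()
everything = children(book)                         # like node()

print([n.nodeName for n in named])   # ['price']
print([t.data for t in texts])       # ['Intro ']
print([c.data for c in comments])    # [' on sale ']
print(len(everything))               # 4
```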

Below is a script that uses path expressions to select nodes out of the sample XML document:
<?csp1.0?>
function main(input) {
  <style>
  <<<
  book { display: block; border: 1px solid #00f; margin:3px; }
  book-title { background-color: #ccf; margin:3px; }
  author { background-color: #ff9; margin:3px; }
  year { background-color: #afa; margin:3px; }
  price { background-color: #fce; margin:3px; }
  #bookstore { border-collapse: collapse; }
  #bookstore td { border: 1px solid black; }
  >>>
  </style>
  let store = input/bookstore;
  <table id="bookstore">
    <tr>
      <td> "Selects all book nodes that are children of the current node" </td>
      <td> copy(store/child::book); </td>
    </tr>
    <tr>
      <td> "Selects the lang attribute of the current node"</td>
      <td> {store/book/book-title/attribute::lang} </td>
    </tr>
    <tr>
      <td> "Selects all text node children of the current node" </td>
      <td> {store/book/book-title/.text} </td>
    </tr>
    <tr>
      <td> "Selects all price descendants of the current node" </td>
      <td> copy(store/descendant::price); </td>
    </tr>
  </table>
}

Abbreviated and Wildcard Path Expressions

The abbreviated and wildcard path expressions defined in XPath are also supported in Candle.

Expression    Description
name          Selects all child elements with the specified name, as the child axis is the default axis in Candle
/             Selects the root node
//            Selects the descendant nodes of the current node and the current node itself
.             Selects the current node
..            Selects the parent of the current node
@             Shorthand for the attribute axis
*             Selects all child element nodes regardless of their names
*:name        Selects any child elements with the specified local name, regardless of their namespace
namespace:*   Selects any child elements under the specified namespace
@*            Selects all attribute nodes regardless of their names
@*:name       Selects any attribute nodes with the specified local name, regardless of their namespace
@namespace:*  Selects any attribute nodes under the specified namespace
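
Several of these abbreviations also exist in the limited XPath subset of Python's standard xml.etree.ElementTree module, which makes it a convenient place to try them out. The sketch below is Python, not Candle, and runs on a shortened copy of the sample document. ElementTree has no '@' step, so attribute access goes through get() and attrib instead.

```python
import xml.etree.ElementTree as ET

XML = """<bookstore>
  <book category="COOKING">
    <book-title lang="en">Everyday Italian</book-title>
    <author>Giada De Laurentiis</author>
    <price>30.00</price>
  </book>
  <book category="WEB">
    <book-title lang="en">Learning XML</book-title>
    <author>Erik T. Ray</author>
    <price>39.95</price>
  </book>
</bookstore>"""

root = ET.fromstring(XML)  # root is the <bookstore> element

books = root.findall("book")              # name : child elements named book
all_children = root.findall("*")          # *    : all child elements
authors = root.findall(".//author")       # //   : author elements at any depth
self_node = root.find(".")                # .    : the current node
parents = root.findall("./book/price/..") # ..   : parents of the price elements

# Stand-ins for @lang and @*, which ElementTree paths cannot select directly.
langs = [t.get("lang") for t in root.findall("book/book-title")]
all_attrs = dict(root.find("book").attrib)

print([b.get("category") for b in books])  # ['COOKING', 'WEB']
print([a.text for a in authors])           # ['Giada De Laurentiis', 'Erik T. Ray']
print(langs)                               # ['en', 'en']
print(all_attrs)                           # {'category': 'COOKING'}
```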

Predicates

There are 3 types of predicates in Candle:

Predicate             Description
step[predicate-expr]  The filter predicate filters the nodes selected by the path step. If the predicate expression evaluates to an integer N, it selects the Nth node among the selected nodes; otherwise, the result is cast to its effective boolean value, and the context node is kept if that value is true and filtered out if it is false.
step?property         The property predicate returns a specific property value of the context item.
step?routine()        The routine predicate invokes a member routine defined on the context item.
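
The two behaviours of the filter predicate (positional selection for an integer result, boolean filtering otherwise) can be sketched in Python as follows. This is an illustration under stated assumptions rather than Candle's evaluator: apply_filter is a hypothetical helper, the predicate is passed as a Python function, and Python truthiness stands in for the effective boolean value.

```python
import xml.etree.ElementTree as ET

XML = """<bookstore>
  <book><book-title>Everyday Italian</book-title><year>2005</year><price>30.00</price></book>
  <book><book-title>Harry Potter</book-title><year>2005</year><price>29.99</price></book>
  <book><book-title>XQuery Kick Start</book-title><year>2003</year><price>49.99</price></book>
  <book><book-title>Learning XML</book-title><year>2003</year><price>39.95</price></book>
</bookstore>"""

def apply_filter(nodes, predicate):
    """Keep nodes the way a filter predicate would.

    An integer result keeps the node at that 1-based position;
    any other result is treated as a boolean.
    """
    kept = []
    for position, node in enumerate(nodes, start=1):
        value = predicate(node)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, int) and not isinstance(value, bool):
            if value == position:
                kept.append(node)
        elif value:
            kept.append(node)
    return kept

def titles(nodes):
    return [n.findtext("book-title") for n in nodes]

books = ET.fromstring(XML).findall("book")

first = apply_filter(books, lambda b: 1)                                  # like book[1]
expensive = apply_filter(books, lambda b: float(b.findtext("price")) > 35)
recent = apply_filter(books, lambda b: int(b.findtext("year")) > 2004)

print(titles(first))      # ['Everyday Italian']
print(titles(expensive))  # ['XQuery Kick Start', 'Learning XML']
print(titles(recent))     # ['Everyday Italian', 'Harry Potter']
```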

Here are some more examples of path expressions in Candle:
<?csp1.0?>
function main(input) {
  <style>
  <<<
  book { display: block; border: 1px solid #00f; margin:3px; }
  book-title { background-color: #ccf; margin:3px; }
  author { background-color: #ff9; margin:3px; }
  year { background-color: #afa; margin:3px; }
  price { background-color: #fce; margin:3px; }
  #bookstore { border-collapse: collapse; }
  #bookstore td { border: 1px solid black; }
  >>>
  </style>
  let store = input/bookstore;
  <table id="bookstore">
    <tr>
      <td> "Selects the title of the first book node under the bookstore element" </td>
      <td> copy(store/book[1]/book-title); </td>
    </tr>
    <tr>
      <td> "Selects all the price nodes of books with a price higher than 35"</td>
      <td> copy(store/book[number(child::price)>35]/price); </td>
    </tr>
    <tr>
      <td> "Selects all the title nodes of books published after year 2004" </td>
      <td> copy(store/book[number(child::year)>2004]/book-title); </td>
    </tr>
    <tr>
      <td> "Selects all author names that contain capital letter K" </td>
      <td> copy(store//author[contain(., "K")]); </td>
    </tr>
  </table>
}