Disclaimer : This blog space does not necessarily reflect my views/ideas on the technology and beyond doubt, never reflects the views of my employer.

Wednesday, September 24, 2008

Revisiting XPath

I was reading w3schools.com to learn more about XPath and thought I would write a post summarizing the tutorial. Here is some ready-to-use information on XPath.

What is XPath?
  • XPath is a syntax for defining parts of an XML document
  • XPath uses path expressions to navigate in XML documents
  • XPath contains a library of standard functions
  • XPath is a major element in XSLT
  • XPath is a W3C Standard

Put another way, XPath is a language for finding information in an XML document. It is based on a tree representation of the XML document and lets you navigate that tree through elements and attributes, selecting nodes by a variety of criteria.

In addition, XPath may be used to compute values (strings, numbers, or boolean values) from the content of an XML document. At the time of writing, the current version of the language is XPath 2.0.


XPath 1.0 became a W3C Recommendation on 16 November 1999.


XPath was designed to be used by XSLT, XPointer and other XML parsing software.
In XPath, there are seven kinds of nodes: element, attribute, text, namespace, processing-instruction, comment, and document (root) nodes.


Suppose we have books.xml as below:
<?xml version="1.0" encoding="ISO-8859-1"?>
<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="CHILDREN">
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="WEB">
    <title lang="en">XQuery Kick Start</title>
    <author>James McGovern</author>
    <author>Per Bothner</author>
    <author>Kurt Cagle</author>
    <author>James Linn</author>
    <author>Vaidyanathan Nagarajan</author>
    <year>2003</year>
    <price>49.99</price>
  </book>
  <book category="WEB">
    <title lang="en">Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>

If you look carefully, the first book has the title Everyday Italian and the second book is Harry Potter. To extract these values, create an HTML page as below, in the same folder where you created books.xml above:
<html> <body>
<script type="text/javascript">
function loadXMLDoc(fname)
{
var xmlDoc;
// code for IE
if (window.ActiveXObject)
{
xmlDoc=new ActiveXObject("Microsoft.XMLDOM");
}
// code for Mozilla, Firefox, Opera, etc.
else if (document.implementation
&& document.implementation.createDocument)
{
xmlDoc=document.implementation.createDocument("","",null);
}
else
{
alert('Your browser cannot handle this script');
}
xmlDoc.async=false;
xmlDoc.load(fname);
return(xmlDoc);
}
xml=loadXMLDoc("books.xml");
path="/bookstore/book[1]/title";

// code for IE
if (window.ActiveXObject)
{
xml.setProperty("SelectionLanguage","XPath");
var nodes=xml.selectNodes(path);
for (i=0;i<nodes.length;i++)
{
document.write(nodes[i].childNodes[0].nodeValue);
document.write("<br />");
}
}
// code for Mozilla, Firefox, Opera, etc.
else if (document.implementation && document.implementation.createDocument)
{
var nodes=document.evaluate(path, xml, null, XPathResult.ANY_TYPE,null);
var result=nodes.iterateNext();
while (result)
{
document.write(result.childNodes[0].nodeValue);
document.write("<br />");
result=nodes.iterateNext();
}
}
</script>
</body> </html>

Open the HTML file in a browser and check the output. It should be "Everyday Italian".
Please observe the following line in the HTML page:
path="/bookstore/book[1]/title";
This points to a specific book element in the XML file. XPath positions start at 1, so book[1] is the first book element in the document. Change the value from book[1] to book[2] and you will get the output Harry Potter instead of Everyday Italian.
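
To make the 1-based indexing concrete, here is a toy path walker over a plain JavaScript object tree. It is only a sketch: the names bookstore, selectSimple and titles are made up for this post, the tree is a shortened stand-in for books.xml, and it handles nothing beyond child steps with an optional position. Real XPath engines do far more.

```javascript
// Hypothetical mini-tree standing in for the first two books of books.xml.
const bookstore = {
  name: "bookstore",
  children: [
    { name: "book", children: [{ name: "title", text: "Everyday Italian" }] },
    { name: "book", children: [{ name: "title", text: "Harry Potter" }] },
  ],
};

// Walk an absolute path such as "/bookstore/book[2]/title".
// Supports only child steps with an optional positional predicate.
function selectSimple(root, path) {
  const steps = path.split("/").filter(Boolean);
  // Start from a virtual document node whose only child is the root element.
  let current = [{ children: [root] }];
  for (const step of steps) {
    const match = /^([^\[\]]+)(?:\[(\d+)\])?$/.exec(step);
    if (!match) throw new Error("Unsupported step: " + step);
    const name = match[1];
    const position = match[2] ? Number(match[2]) : null;
    const next = [];
    for (const node of current) {
      const sameName = (node.children || []).filter((c) => c.name === name);
      if (position === null) {
        next.push(...sameName);
      } else if (sameName[position - 1]) {
        // XPath positions are 1-based, so book[1] is array index 0 here.
        next.push(sameName[position - 1]);
      }
    }
    current = next;
  }
  return current;
}

// Convenience wrapper returning just the text of the matched nodes.
const titles = (path) => selectSimple(bookstore, path).map((n) => n.text);

console.log(titles("/bookstore/book[1]/title")); // [ 'Everyday Italian' ]
console.log(titles("/bookstore/book[2]/title")); // [ 'Harry Potter' ]
```

Leaving out the position, as in /bookstore/book/title, returns both titles in document order.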

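Also, observe the following line in the HTML page:
var nodes=xml.selectNodes(path);
In the IE branch, selectNodes evaluates the path and returns the matching nodes; the other branch does the same through document.evaluate. The rest of the script is self-explanatory.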
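
Note that ActiveXObject is specific to Internet Explorer, and the xmlDoc.load() call may not be available in current browsers, so the page above may print nothing there. Below is a minimal sketch of the same lookup using the standard DOMParser and evaluate APIs. The helper name selectText and the inline sample string are my own assumptions for illustration, and the demo line runs only where a DOM is present (for example, a browser console).

```javascript
// Hypothetical helper: returns the text of every node matched by an XPath
// expression. Assumes a browser-like environment with DOMParser and XPathResult.
function selectText(xmlText, path) {
  const doc = new DOMParser().parseFromString(xmlText, "application/xml");
  // A snapshot result can be indexed freely, unlike an iterator result.
  const snapshot = doc.evaluate(
    path, doc, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null);
  const values = [];
  for (let i = 0; i < snapshot.snapshotLength; i++) {
    values.push(snapshot.snapshotItem(i).textContent);
  }
  return values;
}

// Shortened stand-in for books.xml so the sketch needs no file or server.
const sampleXml =
  '<bookstore>' +
  '<book category="COOKING"><title lang="en">Everyday Italian</title></book>' +
  '<book category="CHILDREN"><title lang="en">Harry Potter</title></book>' +
  '</bookstore>';

// Run the demo only where the DOM APIs exist.
if (typeof DOMParser !== "undefined" && typeof XPathResult !== "undefined") {
  console.log(selectText(sampleXml, "/bookstore/book[1]/title"));
}
```

To work against the real books.xml instead of the inline string, the file contents would first have to be fetched as text and then passed to the same helper.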

I hope this XPath article helps. I hope to write more about XPath nodes in detail in the near future.
Many thanks to w3schools.com

Cheers
Nirav
