I ran into a problem when I tried to get a node from an RDF file. Normally, I would do the following to get a node:

xmllint --xpath '//foo/bar' test.xml

But this does not work on RDF because namespaces are involved. You need to tell xmllint about the namespaces:

$ xmllint --shell test.xml
/ > setrootns
/ > cd rdf:RDF
RDF > dir
ELEMENT rdf:RDF
  namespace rdf href=http://www.w3.org/1999/02/22-rdf-syntax-...
  default namespace href=http://purl.org/rss/1.0/
  namespace taxo href=http://purl.org/rss/1.0/modules/taxonomy...
  namespace syn href=http://purl.org/rss/1.0/modules/syndicat...
RDF > setns a=http://purl.org/rss/1.0/
RDF > cd a:item[1]
item > dir
ELEMENT item
  ATTRIBUTE about
    TEXT
      content=http://example.com
item >

The setrootns command is for selecting the <rdf:RDF/> node: it has xmllint register the rdf namespace prefix for you. It is not necessary if you are not interested in that node. To select the node you want, you need to assign a prefix to the default namespace with setns. The prefix name isn't important; any name will do.

Unfortunately, I don't see any way to do this without --shell. So you need to run xmllint like this:

echo -e 'setns a=http://purl.org/rss/1.0/\ncat //a:item[1]/a:description/text()' | xmllint --shell test.xml

Then parse the output and remove the unwanted lines, such as the shell prompts.
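The cleanup step could look something like the sketch below. The function name strip_xmllint_shell is made up for illustration. It assumes that prompts look like "/ > " or "item > " and that separator lines contain only dashes, which may vary between libxml2 versions.

```shell
# Hypothetical helper: strips interactive residue from `xmllint --shell` output.
# Assumption: prompts have the form "<name> > " and may repeat at the start
# of a line; separator lines consist only of dashes and spaces.
strip_xmllint_shell() {
  sed -E \
    -e 's#^([^ ]* > )+##' \
    -e '/^ *-+ *$/d' \
    -e '/^$/d'
}

# Usage (not run here):
#   printf '%s\n' 'setns a=http://purl.org/rss/1.0/' \
#     'cat //a:item[1]/a:description/text()' |
#     xmllint --shell test.xml | strip_xmllint_shell
```

Note that this also drops blank lines and would eat a leading "word > " in real content, so treat it as a starting point rather than a robust parser.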

If you don't want to go through the above, you can select by position with a wildcard such as *[1], or remove the namespace declarations from the source, if you prefer that approach.
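A related wildcard trick is to match on the element's local name with the standard XPath local-name() function, which ignores the namespace entirely. A rough sketch follows. The function name first_item_description is made up, and it assumes your xmllint build supports --xpath.

```shell
# Hypothetical helper: prints the text of the first item's description,
# matching elements by local name so no prefix has to be registered.
# Assumption: the installed xmllint supports the --xpath option.
first_item_description() {
  xmllint --xpath \
    "//*[local-name()='item'][1]/*[local-name()='description']/text()" \
    "$1"
}

# Usage (not run here):
#   first_item_description test.xml
```

The expression is wordier than a prefixed one, and it will also match same-named elements from any other namespace in the document.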


2012-06-29T17:03:45Z: You may want to try xmlstarlet; it's easier to query with namespaces.
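For comparison, a minimal sketch of the same query with xmlstarlet, which binds a prefix on the command line with -N. The function name is made up, and it assumes the binary is installed as xmlstarlet (some systems call it xml).

```shell
# Hypothetical helper: same query as the xmllint shell example, via xmlstarlet.
# -N binds the prefix "a" to the RSS 1.0 default namespace,
# -t -v prints the string value of the match, -n appends a newline.
first_item_description_xs() {
  xmlstarlet sel -N a=http://purl.org/rss/1.0/ \
    -t -v '//a:item[1]/a:description' -n "$1"
}

# Usage (not run here):
#   first_item_description_xs test.xml
```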