The Taxonomy XML schema makes use of XML Namespaces. If you are familiar with XML namespaces you may skip to the next section. Continue reading for an introduction to namespaces.
Sometimes it is useful to combine XML data content from different sources in a
single XML document. Suppose you have a lot of fish photos you shot last time
you went diving with your waterproof camera in your grandma's fishtank. You
uploaded your photos to a photo sharing service which provides photo metadata
in XML format. In this format, the
<date> element contains
the date when you took the photo,
<focalLength> gives technical data about the photo, and so on.
In addition to the photo sharing service you are also using a biology site
to get information about fish you photographed, also in XML. This XML format
contains elements such as
Over the weekend you wrote a little application that combines data from both
sources and consolidates them in a single XML file. Your plan is to use the combined
XML data over the next couple of dozen weekends to build an enterprise-strength
application that produces a beautiful illustrated catalog of fish in PDF format.
So far so good. Now, you know what's coming: the biology XML format also uses
an element called
<date>, which this time means something
completely different: the date the fish was first observed. This is a problem,
because now it is impossible to distinguish between these two different dates.
The solution is to use prefixes to distinguish your photo sharing service's XML
vocabulary from the one used by the biology site. For example you could use
<photo:date> for the first element and
<bio:date> for the second element. Colon is a valid
character to use in XML identifiers so everything is fine. You don't really
need fancy namespaces to build this renaming into your application. However,
using the W3C Namespaces in XML recommendation
has the advantage that the XML tools and libraries that you are using will
take care of namespace issues automatically. The Taxonomy of Human Services XML
format uses namespaces for the same reason: to make it easier for different
vendors to integrate the Taxonomy data with other data sources while using
widely available processing tools.
Here's what your XML might look like using W3C namespaces:
<?xml version="1.0"?>
<fc:fishPic xmlns:fc="http://joe.example.org/namespaces/fishphotos"
    xmlns:bio="http://biology.example.org/classification.html"
    xmlns:photo="http://photosh.example.org/metadata.xsd">
  <fc:title>Photo of a deepwater stingray from my grandmother's fishtank</fc:title>
  <photo:date>2006-08-30</photo:date>
  <photo:exposure>1/15</photo:exposure>
  <photo:focalLength>50 mm</photo:focalLength>
  ...
  <bio:date>1899-06-06</bio:date>
  <bio:species>Plesiobatis daviesi</bio:species>
  ...
</fc:fishPic>
In this example we are using three different namespaces. Now, pay close attention: the names of these namespaces are:
http://joe.example.org/namespaces/fishphotos
http://biology.example.org/classification.html
http://photosh.example.org/metadata.xsd
Each of these namespaces is declared using the special
xmlns attribute and associated with one of the namespace prefixes
fc, bio and photo.
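To see how a namespace-aware library keeps the two dates apart, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The document is a trimmed copy of the example above. The NS lookup table is an assumption of this sketch: its keys are local to the Python code and only the URIs have to match the document.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the combined fish photo document from the text.
DOC = """<?xml version="1.0"?>
<fc:fishPic xmlns:fc="http://joe.example.org/namespaces/fishphotos"
            xmlns:bio="http://biology.example.org/classification.html"
            xmlns:photo="http://photosh.example.org/metadata.xsd">
  <fc:title>Photo of a deepwater stingray from my grandmother's fishtank</fc:title>
  <photo:date>2006-08-30</photo:date>
  <bio:date>1899-06-06</bio:date>
  <bio:species>Plesiobatis daviesi</bio:species>
</fc:fishPic>"""

# Lookup table used only by this script; the keys need not match the
# prefixes used in the document, but the namespace names (URIs) must.
NS = {
    "photo": "http://photosh.example.org/metadata.xsd",
    "bio": "http://biology.example.org/classification.html",
}

root = ET.fromstring(DOC)

# Both elements have the local name "date", yet they are distinct
# because each belongs to a different namespace.
photo_date = root.find("photo:date", NS).text
bio_date = root.find("bio:date", NS).text

print("photo taken:", photo_date)
print("species first observed:", bio_date)
```

The library never compares prefixes. Internally it identifies each element by the pair of namespace name and local name, which is why the two date elements cannot be confused.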
If these names look like something you can type in your browser that's... a pure accident! Well, not exactly: the W3C recommendation says that namespace names must be URIs. This means that other kinds of URI, such as a tel: URI or a URN, would be just as suitable as namespace names as the ones above.
The idea behind using URIs for namespace names is that URIs are designed to be
unique and persistent. For example, if 212-555-0123 were your phone number it is
highly unlikely that anybody else would use
tel:+1-212-555-0123 as a namespace name. Also, if an XML vocabulary
is described in a book, using the book's ISBN in the form of its URN (also a form
of URI, proposed in RFC 3187)
would make it unlikely that another person uses the same string for describing
their XML vocabulary of things completely unrelated to the content of that
book.
People usually use a URL starting with
http:// as I did in the
original example. It is important to note that using a namespace name such as
http://joe.example.org/namespaces/fishphotos does not
mean that there is a page retrievable under that URL or even that the domain
joe.example.org exists. It is a good idea, however, to use a
domain name that you control. If everybody acts that way namespace clashes
can't happen. While you are at it, you might as well put a file behind the
URL. You can use an HTML page explaining the vocabulary or another
useful file related to the vocabulary, such as an XML schema definition file
that formalizes the vocabulary.
Namespace names are URIs in order to be globally unique. Unfortunately,
that usually makes them quite long. It would be very clumsy if we had to prefix
each XML element with the full namespace name. It would even lead to invalid
XML because element and attribute names must not contain characters such as
slashes. That's why we have prefixes. In our example fc, bio and
photo stand for their respective namespaces.
Prefixes have meaning only inside the document where they are declared. In the above
example I could have used a different set of prefixes, for example
p instead of photo. The resulting document would be completely
equivalent.
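A short sketch of why the choice of prefix does not matter, again assuming Python's xml.etree.ElementTree. The two one-element documents are made up for this illustration, with p as a stand-in for a shorter prefix. After parsing, both elements carry the same name, written by the library as {namespace name}local name.

```python
import xml.etree.ElementTree as ET

# The same element written with two different prefixes (illustrative data).
LONG_PREFIX = (
    '<photo:date xmlns:photo="http://photosh.example.org/metadata.xsd">'
    "2006-08-30</photo:date>"
)
SHORT_PREFIX = (
    '<p:date xmlns:p="http://photosh.example.org/metadata.xsd">'
    "2006-08-30</p:date>"
)

a = ET.fromstring(LONG_PREFIX)
b = ET.fromstring(SHORT_PREFIX)

# ElementTree drops the prefix and keeps the namespace name in braces.
print(a.tag)
same_name = a.tag == b.tag
print("same element name:", same_name)
```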
So far we have talked about namespaces without mentioning XML schemas. We could do that because XML namespaces can be used without schemas. The Taxonomy of Human Services uses an XML schema, so that's what we'll talk about next.
In our example we had XML elements belonging to different
vocabularies. For example, the elements with the photo prefix
belonged to the vocabulary of our fictitious photo sharing service. Any
constraints on the form and content of those vocabularies were implicit, i.e.
in the programmer's head, in code, or maybe in documentation. For example we never
formally specified that a
<fishPic> element from the namespace
http://joe.example.org/namespaces/fishphotos may contain a
<title> element from the same namespace and an
<exposure> element from the namespace
http://photosh.example.org/metadata.xsd. Also, we never said
in what format the two
date elements should be represented.
XML schema languages exist for specifying this kind of constraint. The Taxonomy of Human Services uses the XML Schema language from the W3C. The current version of the XML Schema Definition file for the Taxonomy can always be found at http://www.211taxonomy.org/resources/xml_schema/taxonomy.xsd.
W3C's XML Schema definition language is itself an XML vocabulary which makes heavy use of namespaces. We'll explain the XML Schema basics as we explain the way it is used in Taxonomy of Human Services.
Open the taxonomy.xml file and you'll notice that it contains an
xmlns attribute on the top-level element:
<taxonomy name="Taxonomy of Human Services"
    releaseDate="2006-08-14T18:27:31Z"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.211taxonomy.org/resources/xml_schema/taxonomy.xsd taxonomy.xsd"
    xmlns="http://www.211taxonomy.org/resources/xml_schema/taxonomy.xsd">
  <record code="B">
    <name>Basic Needs</name>
    <definition>Programs that furnish survival level resources...</definition>
    <createdDate>1992-03-10</createdDate>
    <lastModifiedDate>2005-03-02</lastModifiedDate>
    ...
</taxonomy>
Notice two aspects of this declaration: first, for the namespace name we are using the URL of the XSD file. Although this is quite a common practice, remember that we could just as well have used the URL of the schema documentation, the home page or any other valid URI. Second, notice that we are not using a prefix. This is a feature of XML Namespaces that we didn't mention before: a namespace can be declared without a prefix, which makes it the default namespace for the element carrying the declaration and for the elements inside it. This way our document is more readable.
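The effect of the prefix-less declaration can be checked with a small sketch, assuming Python's xml.etree.ElementTree and a cut-down taxonomy document. The prefix t in the lookup table is a hypothetical name that exists only in the script. It is needed because the elements are in the Taxonomy namespace even though they are written without a prefix.

```python
import xml.etree.ElementTree as ET

TAXONOMY_NS = "http://www.211taxonomy.org/resources/xml_schema/taxonomy.xsd"

# Cut-down instance document using the default (prefix-less) namespace.
DOC = """<taxonomy name="Taxonomy of Human Services"
    xmlns="http://www.211taxonomy.org/resources/xml_schema/taxonomy.xsd">
  <record code="B">
    <name>Basic Needs</name>
  </record>
</taxonomy>"""

root = ET.fromstring(DOC)

# Unprefixed elements inherit the default namespace...
print(root.tag)

# ...so a lookup has to name that namespace explicitly.
names = [e.text for e in root.findall("t:record/t:name", {"t": TAXONOMY_NS})]

# A lookup that ignores the namespace finds nothing.
unqualified = root.findall("record")

# The default namespace does not apply to attributes: "name" stays unqualified.
title = root.get("name")

print(names, unqualified, title)
```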
Now, let's take a look at these two lines of the
<taxonomy> start tag:
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.211taxonomy.org/resources/xml_schema/taxonomy.xsd taxonomy.xsd"
Here, we are declaring a special namespace
http://www.w3.org/2001/XMLSchema-instance and immediately using the attribute
schemaLocation from that namespace. This is one of
the few attributes defined
in the XML Schema specification intended for (optional) use in the document
instance (as opposed to the schema definition file). As you might
have guessed this attribute gives a hint at where the actual XML Schema
definition file is located.
Notice that the value of our
schemaLocation attribute consists of a
two-element space-separated list. The first list element is:
http://www.211taxonomy.org/resources/xml_schema/taxonomy.xsd
While the second list element is:
taxonomy.xsd
According to the spec, the first element is the name of the namespace for which
we are locating the XSD file. The second element is the file's URI. The above value
for the file's URI may seem wrong at first. Shouldn't we put the full URL of
the file as well? That would certainly be a valid choice. So why did we just write
taxonomy.xsd and how is that to be interpreted? This is simply
a relative URI. If you have ever worked with HTML you have used relative URIs
all the time in the href attributes of anchor elements. Here it means that
taxonomy.xsd will be
retrieved from the directory where the instance document
taxonomy.xml is stored. It is a good idea to store the XSD file
locally so that it works even if the 211taxonomy.org website isn't reachable.
Putting the XSD file next to the instance document is an easy way to achieve
this without special configuration steps.
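The lookup described above can be sketched in a few lines of Python using only the standard library. This is an illustration of the idea, not a description of how any particular schema processor works. INSTANCE_URI is a hypothetical location for taxonomy.xml, chosen only to show how the relative URI is resolved against it.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

# Cut-down instance document carrying only the attributes discussed above.
DOC = """<taxonomy
    xmlns="http://www.211taxonomy.org/resources/xml_schema/taxonomy.xsd"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.211taxonomy.org/resources/xml_schema/taxonomy.xsd taxonomy.xsd"/>"""

# Hypothetical place where the instance document is stored.
INSTANCE_URI = "file:///data/taxonomy/taxonomy.xml"

root = ET.fromstring(DOC)

# The attribute value is a whitespace-separated list of
# (namespace name, schema location) pairs.
tokens = root.get(f"{{{XSI_NS}}}schemaLocation").split()
pairs = dict(zip(tokens[0::2], tokens[1::2]))

# A relative location is resolved against the instance document's own URI.
resolved = {ns: urljoin(INSTANCE_URI, loc) for ns, loc in pairs.items()}

for ns, location in resolved.items():
    print(ns, "->", location)
```

With the assumed INSTANCE_URI, the schema is looked up in the same directory as the instance document, which is the behaviour the text describes.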
What about users who still want to retrieve the XSD file over the Internet
rather than keep a local copy? They should be fine as well: the XML Schema spec
says that schema-aware processors may try to resolve the namespace URI and grab
the XSD from there. If
taxonomy.xsd is nowhere to be found
locally, most such processors will probably try to do that. All this isn't that
important, really, as every schema-aware XML processor should let you
configure these things separately so that any undesired effects of these
choices can be overridden.
Finally, let's take a look at the Taxonomy XML Schema itself.
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema"
    xmlns:tx="http://www.211taxonomy.org/resources/xml_schema/taxonomy.xsd"
    targetNamespace="http://www.211taxonomy.org/resources/xml_schema/taxonomy.xsd"
    elementFormDefault="qualified"
    attributeFormDefault="unqualified"
    version="8">
  <annotation>
    <appinfo>
      <csv-id>$Id: explanation.adp,v 1.2 2014/01/13 18:05:02 dbauer Exp $</csv-id>
    </appinfo>
    <documentation>
      The AIRS/INFO LINE Taxonomy of Human Services. For schema documentation see
      <a href="http://211taxonomy.org/resources/xml_schema/docs/docs.html">http://211taxonomy.org/resources/xml_schema/docs/docs.html</a>
      as well as
      <a href="http://211taxonomy.org/resources/xml_schema/explanation">http://211taxonomy.org/resources/xml_schema/explanation</a>
    </documentation>
  </annotation>
  <element name="bibliographicReference" type="string">
    <annotation>
      <documentation>A list of references which credits sources used in writing taxonomy definitions or structuring taxonomy sections.</documentation>
    </annotation>
  </element>
  <element name="comments" type="string">
    <annotation>
      <documentation>Comments on the term in plain text.</documentation>
    </annotation>
  </element>
  <element name="createdDate" type="date">
    <annotation>
      <documentation>The date a term was first added to the taxonomy.</documentation>
    </annotation>
  </element>
  <element name="definition" type="string">
    <annotation>
      <documentation>A plain text description of the meaning of the taxonomy term.</documentation>
    </annotation>
  </element>
  <element name="externalCode" type="token">
    <annotation>
      <documentation>A code in an external classification system. Not to be confused with a taxonomy code.</documentation>
    </annotation>
  </element>
  <element name="externalTerm">
    <annotation>
      <documentation>A term in another classification system which corresponds to this taxonomy term.</documentation>
    </annotation>
    <complexType>
      <sequence>
        <element ref="tx:system">
          <annotation>
            <documentation>Values like NPC, NTEE, UWASIS go here.</documentation>
          </annotation>
        </element>
        <element ref="tx:externalCode">
          <annotation>
            <documentation>Code in the external system. Not to be confused with the code attribute of a taxonomy record.</documentation>
          </annotation>
        </element>
        <element ref="tx:name">
          <annotation>
            <documentation/>
          </annotation>
        </element>
      </sequence>
    </complexType>
  </element>
  ...
</schema>
On the top element we have two
xmlns declarations. The namespace
http://www.w3.org/2001/XMLSchema represents the vocabulary of the XML Schema language. We chose this namespace to be the default so that all the elements and attributes can be used without prefixes. This makes the file much easier to read. The second namespace is the familiar
http://www.211taxonomy.org/resources/xml_schema/taxonomy.xsd. For this one we chose the prefix
tx.
It may seem odd that we had to declare this second namespace here. We are
certainly not using any elements and attributes from that namespace in this
file, since this file is used to define those elements and attributes!
That's one of the peculiarities of XML Schema: in the schema definition
namespaces are used not only for correct resolution of element and attribute
names but also for resolving the content of certain attributes. For example,
<element ref="tx:system"> has to use
tx:system in order to make it clear that it is referring to the
system element defined in our namespace further down in the file.
On the other hand,
type="token" refers to the
token data type which
belongs to the XML Schema namespace. There is no prefix before
token because its namespace,
http://www.w3.org/2001/XMLSchema, happens to be the default namespace.
Had we chosen a prefix such as
xsd we would have had to write
type="xsd:token".
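The resolution of prefixed attribute values can be mimicked with a short Python sketch. It assumes a cut-down schema and the simplification that all namespace declarations sit on the top element, so one flat prefix table is enough. The helper names collect_prefixes and resolve_qname are invented for this illustration and are not part of any library.

```python
import io
import xml.etree.ElementTree as ET

# Cut-down schema with only the declarations needed for the illustration.
SCHEMA = """<schema xmlns="http://www.w3.org/2001/XMLSchema"
    xmlns:tx="http://www.211taxonomy.org/resources/xml_schema/taxonomy.xsd">
  <element name="externalCode" type="token"/>
  <element name="externalTerm">
    <complexType>
      <sequence>
        <element ref="tx:system"/>
      </sequence>
    </complexType>
  </element>
</schema>"""


def collect_prefixes(xml_text):
    """Return a prefix -> namespace name table; '' is the default namespace."""
    prefixes = {}
    source = io.BytesIO(xml_text.encode("utf-8"))
    for _event, (prefix, uri) in ET.iterparse(source, events=("start-ns",)):
        prefixes[prefix] = uri
    return prefixes


def resolve_qname(qname, prefixes):
    """Expand 'prefix:local' (or 'local') to '{namespace name}local'."""
    prefix, _, local = qname.rpartition(":")
    return f"{{{prefixes[prefix]}}}{local}"


prefixes = collect_prefixes(SCHEMA)
ref_name = resolve_qname("tx:system", prefixes)   # value of a ref attribute
type_name = resolve_qname("token", prefixes)      # value of a type attribute

print(ref_name)
print(type_name)
```

The unprefixed value token lands in the XML Schema namespace only because that namespace is declared as the default in this file.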
There is one more attribute to explain: the value of the
targetNamespace attribute is
http://www.211taxonomy.org/resources/xml_schema/taxonomy.xsd.
This means that the elements and attributes defined in this XML Schema
definition file belong to that namespace. This is the
way XML Schema processors know which namespace the definitions in this
file belong to.
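As a rough illustration of what targetNamespace does, the sketch below (Python standard library, cut-down schema and instance fragment) combines each top-level element name in the schema with the target namespace and checks that an element from an instance document has one of the resulting names. This is only a name comparison under those assumptions. It is not schema validation, which the standard library does not provide.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Cut-down schema: two of the top-level element definitions from the text.
SCHEMA = """<schema xmlns="http://www.w3.org/2001/XMLSchema"
    xmlns:tx="http://www.211taxonomy.org/resources/xml_schema/taxonomy.xsd"
    targetNamespace="http://www.211taxonomy.org/resources/xml_schema/taxonomy.xsd"
    elementFormDefault="qualified">
  <element name="createdDate" type="date"/>
  <element name="definition" type="string"/>
</schema>"""

# Fragment of an instance document using the Taxonomy namespace as default.
INSTANCE = """<definition
    xmlns="http://www.211taxonomy.org/resources/xml_schema/taxonomy.xsd"
    >Programs that furnish survival level resources...</definition>"""

schema = ET.fromstring(SCHEMA)
target_ns = schema.get("targetNamespace")

# Every top-level element definition gets the target namespace.
declared = {
    f"{{{target_ns}}}{el.get('name')}"
    for el in schema.findall(f"{{{XSD_NS}}}element")
}

instance = ET.fromstring(INSTANCE)
is_declared = instance.tag in declared

print(sorted(declared))
print("instance element is declared by the schema:", is_declared)
```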
For further documentation on the Taxonomy XML schema, see the Schema documentation.