For practical exercises, the XML software is installed in the IFI CIP
Pool.
The small "books" example (books.xml, books.dtd) of the lecture is
available in the directory
/afs/informatik.uni-goettingen.de/course/xml-lecture/XML-DTD
Data-centric XML: The Mondial XML database
The course uses a nested XML version of the Mondial
database.
All Mondial XML files are also available in the directory
/afs/informatik.uni-goettingen.de/course/xml-lecture/Mondial
Text-oriented XML: Shakespeare's Plays
A more text-oriented example uses
Shakespeare's plays, marked up in XML by Jon Bosak.
The files (XML and DTD) are installed
at /afs/informatik.uni-goettingen.de/course/xml-lecture/Shakespeare/
Mixed Text- and Database-oriented XML: DBIS publication list
The corresponding files are available in the directory
/afs/informatik.uni-goettingen.de/course/xml-lecture/XML-DTD
Browsing, editing
- Browsing: use mozilla/firefox for looking at XML
instances.
XML instances that are not HTML/XHTML are presented
by their XML structure; you can click subtrees to open or close them.
- Editing: kxmleditor.
Validation, Navigation, Exploration
- xmllint
- xmllint --loaddtd --valid --noblanks --noout filename
validates a given file against the DTD given in its DocTypeDecl
(--noblanks drops ignorable (indentation) whitespace between the
child elements of non-mixed content).
- xmllint --noout --schema schemafilename file.xml
validates a given file against an XML Schema.
- xmllint can also be used for actually navigating inside
an XML instance, e.g. with its interactive shell: xmllint --shell filename
- Validation is also possible with saxon (see the section on XQuery):
call saxonXQ -dtd:on -qs:. -s:filename.xml.
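To see what the "ignorable (indentation) whitespace" mentioned for --noblanks looks like to a parser, the following Python sketch (standard library only, not one of the course tools) parses a small indented document and prints the whitespace text that surrounds the child elements. The element names book, title and year are made up for this illustration and are not taken from books.xml.

```python
import xml.etree.ElementTree as ET

# Hypothetical indented document with element-only (non-mixed) content.
indented = "<book>\n  <title>XML</title>\n  <year>2002</year>\n</book>"

book = ET.fromstring(indented)

# The indentation shows up as pure-whitespace text around the children.
leading = book.text        # whitespace before <title>
between = book[0].tail     # whitespace between <title> and <year>
trailing = book[1].tail    # whitespace before </book>

for label, value in (("leading", leading), ("between", between), ("trailing", trailing)):
    print(label, repr(value), "whitespace only:", value.strip() == "")
```

A parser that knows from the DTD that book has element-only content may treat these three strings as ignorable; without that knowledge they are ordinary text.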
- XHTML 1.0/HTML 4.01 is described at
http://www.w3.org/TR/html/.
- DTDs of XHTML 1.0 for downloading can be found
here.
- Usually, the DTD on the W3C server is referenced by
<!DOCTYPE html PUBLIC
"-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
- Note that this DTD associates a namespace with XHTML documents
via
<!ELEMENT html (head, body)>
<!ATTLIST html
xmlns %URI; #FIXED 'http://www.w3.org/1999/xhtml'
>
(see comments on querying XHTML below for the consequences of this).
- The W3C provides a
validator that checks
the validity of HTML/XHTML documents and reports problems.
- Here is an example
XHTML document (that validated in early 2002).
- HTML documents can be transformed (more or less ...)
to XHTML by
tidy -asxhtml html-filename > out-filename
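The namespace that the XHTML DTD attaches to the html element matters as soon as such documents are queried: a path step like body no longer matches, because the element's name is really the pair (namespace, body). The following Python sketch (standard library only) illustrates this on a made-up minimal document. It assumes that xmlns is written out in the document, since ElementTree does not read the external DTD that would otherwise supply the attribute; the prefix h is an arbitrary choice.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical minimal XHTML document with the namespace stated explicitly.
doc = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example</title></head>
  <body><p>Hello</p></body>
</html>"""

root = ET.fromstring(doc)

# An unqualified step finds nothing: there is no element named plain "body".
plain = root.find("body")

# Binding a prefix to the XHTML namespace makes the same step match.
ns = {"h": XHTML_NS}
qualified = root.find("h:body", ns)
title = root.findtext("h:head/h:title", namespaces=ns)

print(plain)
print(qualified.tag)
print(title)
```

The same effect appears in XPath and XQuery tools: queries against XHTML need a namespace declaration and prefixed (or otherwise namespace-aware) name tests.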
Parts of XPath are also supported by xmllint (see above).
XPath queries can be executed using several tools:
Used in this course:
- The saxon lightweight tool:
- For use in the IFI CIP pool, the following aliases should be set in the file .bashrc:
alias saxonValid='java -cp /afs/informatik.uni-goettingen.de/group/dbis/public/saxon/saxon9.jar net.sf.saxon.Query -qs:. -dtd:on '
alias saxonXQ='java -cp /afs/informatik.uni-goettingen.de/group/dbis/public/saxon/saxon9.jar net.sf.saxon.Query'
alias saxonXSL='java -cp /afs/informatik.uni-goettingen.de/group/dbis/public/saxon/saxon9.jar net.sf.saxon.Transform'
- Call saxonValid -s:filename.xml to validate an XML file that contains a DTD reference.
- Call saxonXQ filename.xq to
execute an XQuery (or XPath query) from a file. The query in filename.xq
must address a document by using the
doc('filename.xml or url') function.
Here, filename.xml is the name of a file in the local file system;
saxonXQ also accepts URLs for addressing documents from the web.
- To call saxonXQ for a local XML source file, the option "-s" can additionally be used:
saxonXQ -s:/afs/informatik.uni-goettingen.de/course/xml-lecture/Mondial/mondial.xml filename.xq
In this case, it is not necessary to specify the document in the query file. The call can also be
defined as an alias:
alias saxonXQMondial='saxonXQ -s:/afs/informatik.uni-goettingen.de/course/xml-lecture/Mondial/mondial.xml'
- Further command-line options for XQuery (see
here for the complete list):
- !indent=yes (bash: \!indent=yes) for indented output
- -update:on for XQuery Update (Saxon EE required)
- To install it yourself: download Saxon, unpack it, set
CLASSPATH accordingly, and set the aliases for saxon-XQuery and saxon-XSL accordingly.
- The XQuery/XSLT-Demo WebService. This
WebService also uses Saxon internally.
The WebService is also available for local installation, either for
download or for copying in
the CIP Pool from
/afs/informatik.uni-goettingen.de/course/xml-lecture/XML-Tools/xquery-demo.war
It can be used with any standard web application server, e.g. tomcat (see installation).
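To get a feel for the kind of path expressions these tools evaluate, here is a small Python sketch using only the standard library. It is not a replacement for saxon: ElementTree supports just a limited XPath subset, and the inline document is a hypothetical, heavily abridged stand-in for mondial.xml (the real file has a richer structure).

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily abridged stand-in for mondial.xml.
MONDIAL = """<mondial>
  <country car_code="D">
    <name>Germany</name>
    <city><name>Berlin</name></city>
    <city><name>Hamburg</name></city>
  </country>
  <country car_code="F">
    <name>France</name>
    <city><name>Paris</name></city>
  </country>
</mondial>"""

# Plays the role of the context item that -s: supplies to a query.
root = ET.fromstring(MONDIAL)

# Roughly /mondial/country[@car_code='D']/city/name, relative to the root element.
german_cities = [n.text for n in root.findall("country[@car_code='D']/city/name")]

# Roughly //country/name: the names of all countries.
countries = [n.text for n in root.findall(".//country/name")]

print(german_cities)
print(countries)
```

With saxonXQ, the same paths would be written as full XPath/XQuery expressions, either starting from doc('...') inside the query or from the document given with -s.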
The main use of XSLT stylesheets is the transformation from XML to XML or to HTML (see also below).
- saxon (supports XSLT 2.0):
call saxonXSL xml-file stylesheet.xsl
In order to install it yourself:
download Saxon, unpack it, set
CLASSPATH accordingly, set aliases for saxon-XQuery and saxon-XSL accordingly.
- The XQuery/XSLT-Demo WebService. This
WebService also uses Saxon internally.
- xalan (implements XPath/XSLT 1.0):
xalan -xsl stylesheet.xsl -in xml-input
- xsltproc (belongs to the XSLT library for Gnome):
xsltproc stylesheet.xsl xml-input
xsltproc --debug stylesheet.xsl xml-input
produces a "sax-style" depth-first traversal of the input tree.
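What a "sax-style" depth-first traversal means can be illustrated independently of xsltproc: each element is reported once when it is opened and once when it is closed, in document order. The following Python sketch (standard library only) prints such an event sequence for a made-up fragment; it does not reproduce the output format of xsltproc --debug.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical input fragment.
XML_BYTES = b"<country><name>Germany</name><city><name>Berlin</name></city></country>"

def traversal_events(data):
    """Return the (event, tag) pairs of a depth-first walk over the document."""
    events = []
    for event, elem in ET.iterparse(io.BytesIO(data), events=("start", "end")):
        events.append((event, elem.tag))
    return events

for event, tag in traversal_events(XML_BYTES):
    print(event, tag)
```

Every start event is matched by a later end event, and children are completely reported between the start and end of their parent.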
Transformation Stylesheets
- if a stylesheet is given in the document header, most browsers
do not show the XML source, but directly apply the stylesheet and
show the result:
mondial-with-stylesheet.xml
is processed by
mondial-simple.xsl that
shows a table of the countries. If you execute "view page source" in
your browser, you see the original XML file.
- redirected-input.xsl
is a stylesheet that gets its input from another file. Try this out
by viewing dummy.xml in your
browser:
viewing this file starts redirected-input.xsl, which in turn
issues a query against mondial-europe.xml.
- other example stylesheets can be found in
/afs/informatik.uni-goettingen.de/course/xml-lecture/XSLT:
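A stylesheet is usually "given in the document header" by an xml-stylesheet processing instruction placed before the root element. The Python sketch below (standard library only) shows such a header and how a program can read it. The exact header of mondial-with-stylesheet.xml is an assumption here; only the stylesheet name mondial-simple.xsl is taken from the description above.

```python
from xml.dom import minidom

# Assumed shape of a document that names its stylesheet in the header.
DOC_TEXT = """<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="mondial-simple.xsl"?>
<mondial/>"""

dom = minidom.parseString(DOC_TEXT)

# Collect the data of all xml-stylesheet processing instructions
# that appear at the top level of the document.
stylesheet_pis = [
    node.data
    for node in dom.childNodes
    if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
    and node.target == "xml-stylesheet"
]

print(stylesheet_pis)
```

A browser that finds such an instruction loads the referenced stylesheet and displays the transformation result instead of the XML tree.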
- Example: mondial.xsd, can also be found here:
/afs/informatik.uni-goettingen.de/course/xml-lecture/Mondial/mondial.xsd
- SQC (XML Schema Quality Checker)
checks the quality of a schema, i.e., it validates the schema itself against the XML Schema specification
(it does not validate a document against an XML Schema).
- Change to the directory /afs/informatik.uni-goettingen.de/course/xml-lecture/XML-Tools/SQC and then
call ./SQC.sh any-xsd-filename.xsd.
- xmllint can be used for XML Schema validation:
xmllint -noout -schema <schemafilename> <file.xml>
- Saxon EE supports XML Schema.
Download an evaluation license and the Saxon EE version from
http://www.saxonica.com.
Call java -cp path-to-saxon9ee.jar
com.saxonica.Validate -xsd:xsd-filename xml-filename, e.g.
java -cp ~dbis/XML-Tools/saxon/saxon9ee.jar com.saxonica.Validate -xsd:mondial.xsd mondial.xml
- SQL Web interface (in addition to the
Mondial tables, the tables mondial, countryXML, and cityXML used on the slides exist)
- Online documentation can be found
here.
Note: tomcat is not a full-fledged Web Server (like the Apache Web Server), but a
Web Servlet Container with a simple Web Server. It does not host (plain HTML)
Web pages, but Web Services. Tomcat can be run together with an Apache Web Server to have both.
Here, we deal with tomcat only.
Students can install tomcat on their computers at home, and even install/run it in their accounts on the computers in
the CIP Pool:
- Download Apache tomcat here,
- unpack it in your home directory (it will create a directory tomcatX-Y-Z/), rename the directory just to "tomcat" (or
put a softlink by ln -s tomcatX-Y-Z tomcat),
- first test:
cd tomcat/bin
./startup.sh
look into the logfile: tail -f ~/tomcat/logs/catalina.out
- tomcat is now accessible on port 8080:
Browser: localhost:8080 should show an empty page (while e.g. localhost:8081 says "not found")
- copy e.g. the XQuery/XSLT Demo Servlet
(download)
into tomcat's webapps directory:
cp /afs/informatik.uni-goettingen.de/course/xml-lecture/XML-Tools/xquery-demo.war tomcat/webapps/
cd tomcat/bin
./shutdown.sh (don't forget this: starting tomcat while another instance is still running fails.
If you forgot it, run shutdown.sh and startup.sh again until it works.)
./startup.sh (this unpacks the .war automatically)
Watch the logfile: tail -f ~/tomcat/logs/catalina.out
- Browser: localhost:8080/xquery-demo ... and there it is!
- Your own servlets are not developed inside tomcat, but elsewhere (e.g. using eclipse).
Go to your home directory (or wherever you maintain your source code) and
get the ServletDemo
(download):
cp /afs/informatik.uni-goettingen.de/course/xml-lecture/servletdemo.zip .
unzip servletdemo.zip
cd servletdemo
There is an ant file "build.xml" in it (by the way, ant build files are written in XML, but now we just want to use it).
Set the CATALINA_HOME environment variable to your tomcat. You can do this by adding
export CATALINA_HOME="/home/YOURNAME/tomcat"
(note: use the absolute path; ~/tomcat will NOT work, since the tilde is not expanded inside the quotes)
to your .bashrc file, or simply (temporarily) by executing that command now.
Run the build.xml with the deploy target:
ant deploy (this copies the built servlet into tomcat/webapps).
cd ~/tomcat/bin
./shutdown.sh
./startup.sh
Watch the logfile: tail -f ~/tomcat/logs/catalina.out
Browser: localhost:8080/servletdemo ... and there it is!
Note: so far you have not even needed to edit the configuration file tomcat/conf/server.xml. If the servlet name
is the same as the name of the deployed code in webapps, it is found automatically.
- For Web Service communication between different computers in the CIP Pool, Tomcats running on other
computers in the CIP Pool can be addressed by cip0XX.cip.loc:8080.
You can also run two tomcats on the same computer by installing another one in a directory
tomcat2/. Set the Connector port of the second one to 8081, and its shutdown port (by default set to 8005)
to 8006 in its tomcat2/conf/server.xml.
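The two port changes for a second tomcat can be sketched as follows. The sketch assumes the usual default layout of server.xml (a Server element carrying the shutdown port, containing a Connector for HTTP) and works on an abridged inline copy instead of the real file; the function name move_ports is made up. Note that ElementTree drops XML comments, so for the real, heavily commented file, editing the two numbers by hand is just as reasonable.

```python
import xml.etree.ElementTree as ET

# Abridged stand-in for tomcat2/conf/server.xml (assumed default layout).
SERVER_XML = """<Server port="8005" shutdown="SHUTDOWN">
  <Service name="Catalina">
    <Connector port="8080" protocol="HTTP/1.1" redirectPort="8443"/>
  </Service>
</Server>"""

def move_ports(xml_text, shutdown_port, http_port):
    """Return server.xml text with new shutdown and HTTP connector ports."""
    server = ET.fromstring(xml_text)
    # The shutdown port is the port attribute of the root Server element.
    server.set("port", str(shutdown_port))
    # Only HTTP connectors are changed; other connectors keep their ports.
    for connector in server.iter("Connector"):
        if connector.get("protocol", "").startswith("HTTP"):
            connector.set("port", str(http_port))
    return ET.tostring(server, encoding="unicode")

second_config = move_ports(SERVER_XML, shutdown_port=8006, http_port=8081)
print(second_config)
```

After such a change, the second instance answers on localhost:8081 and can be started and stopped independently of the first one.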