Part one of this article looked at how Cascading Style Sheets (CSS) can be used to make XML documents look good in a web browser. In part two, I’ll explore the more complex eXtensible Stylesheet Language (XSL) and how it can be used to transform XML into HTML and PDF documents.
By the end of part one of this article I had my tasty pico de gallo recipe marked up with XML tags and nicely styled using CSS. It looked great in my Firefox browser. Unfortunately, one of the major problems with using XML/CSS is that it doesn’t work for everyone. Older, proprietary browsers and text-only browsers can only understand HTML and get terribly confused when trying to interpret XML/CSS. If I want to share my delicious recipes with someone using Netscape 4 or Lynx, I’m going to have to convert my XML into a format that their browser can handle. This means leaving behind CSS and constructing a new style sheet using the eXtensible Stylesheet Language (XSL).
As long as I’m in the mood for creating new things, I may as well create a new XML recipe to use for the examples in this article. You already have a recipe for a nice appetizer from part one, and now, for part two, you just need a drink to go with it. I can’t think of anything better to have with pico de gallo and tortilla chips than a cold margarita so that’s what I’ll use in the examples. By the way, the margarita recipe and all of the style sheets in this article are available as a compressed download.
An XSL Transformation, or XSLT, is the process of transforming an XML document into another type of XML document. Now why on earth would someone want to transform one XML document into another XML document? I’ll give you a hint, Grasshopper: HTML is a very close cousin of XML, and an XSLT processor can write it too. I can build a custom XSL style sheet, apply it with a tool called an XSLT processor and presto, my XML is magically transformed into HTML.
In the examples, I will be using the XSLT processor called xsltproc to process the XSL style sheets. The xsltproc tool should be available as a package for most GNU/Linux distributions and Apple’s OS X, if you want to follow along. The basic syntax for the command is xsltproc -o [output-file] [style-sheet] [input-file] or, in the case of this article’s examples, xsltproc -o output.html recipe-style.xsl margarita.xml. If you don’t have access to xsltproc, or if you’re just feeling a little apathetic about typing all these commands, the output files from each of the examples are included alongside the other files in the compressed download.
The most basic XSL style sheet is one that does nothing at all. Of course, there are a few headers that are required, but for the most part the style sheet is devoid of any template rules. When using a blank style sheet like this, it appears that xsltproc simply strips the tags from the XML recipe and dumps it as plain text to the output file. Close, but not quite. Take a look at the first line of the output and you will see a document-type declaration for HTML. This was added during processing, because the second line of recipe-style.xsl specifies HTML as the output format. So I’m on the right track, but output.html displayed in a browser looks really bad. That’s because the only output I have produced so far is a plain-text file masquerading as HTML.
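For readers without the download to hand, a blank style sheet of this kind might look something like the sketch below. This is an assumption about its general shape, not a copy of the actual recipe-style.xsl; the only details taken from the text are that it has no template rules and that its second line asks for HTML output.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <!-- No template rules yet: the processor falls back on its built-in rules,
       which walk the whole document and copy only the text to the output. -->
</xsl:stylesheet>
```

Those built-in rules are why the tags seem to be stripped: nothing in the style sheet says what to do with an element, so only the text between the tags survives.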
A real HTML document should have tags and currently output.html has none. To make a valid HTML file I’ll need to fix up the XSL style sheet so that it at least produces the basic tags that every HTML document needs. I can do that by adding a template rule.
It wasn’t hard to add the template rule to my recipe-style.xsl style sheet and get the HTML tags I wanted to see. Unfortunately, although I am one step closer to a valid HTML document, it seems that all of the recipe’s content has now disappeared. This is because I have started using processing rules in my style sheet, but I have not followed through by specifying where to place the recipe’s content. Adding a simple <xsl:apply-templates /> element between the HTML tags in the template rule tells the processor where the content belongs, and the recipe reappears in the output.
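As a rough illustration, a template rule of this kind might look like the sketch below. The choice of html and body as the wrapper tags is my assumption, and the style sheet in the download may be arranged differently.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Match the document root and wrap everything in a basic HTML skeleton. -->
  <xsl:template match="/">
    <html>
      <body>
        <!-- Without this line the body stays empty and the recipe text vanishes. -->
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

Deleting the <xsl:apply-templates/> line and running xsltproc again is a quick way to see the disappearing content for yourself.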
Now that I have created some basic HTML output, my recipe just needs a little aesthetic improvement. OK, so it needs a lot of improvement, especially in terms of appropriate line breaks and whitespace, but this is something I can easily do by employing more XSL templates. So far I’ve used a single XSL template in my recipe-style.xsl file. There’s no reason I should stop at just one. In fact, I can create a template for each XML element in my margarita.xml source file to give the HTML output a nice look.
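To make the idea of one template per element concrete, here is a hypothetical stand-in for a recipe source file. The real margarita.xml is in the download; every element name below except title, and the recipe details themselves, are invented purely for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical stand-in for margarita.xml: element names are illustrative only. -->
<recipe>
  <title>Margarita</title>
  <ingredients>
    <heading>Ingredients</heading>
    <ingredient>Tequila</ingredient>
    <ingredient>Lime juice</ingredient>
    <ingredient>Orange liqueur</ingredient>
  </ingredients>
  <directions>
    <heading>Directions</heading>
    <step>Shake everything with ice.</step>
    <step>Strain into a glass.</step>
  </directions>
</recipe>
```

Each distinct element here (title, heading, ingredient, step) is a candidate for its own template.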
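You can think of an XSLT template as being like a word processor’s find-and-replace feature: the match attribute names the element to find, and the body of the template supplies what replaces it.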
Since I am catering to people with older, proprietary browsers and text-only browsers, I’ll use standard HTML tags to achieve the basic style I want. A big improvement to my HTML output can be made just by adding <br> tags to get line breaks in the correct places. I can also change the font size for the title and section headings by using heading tags. Together, these additions to recipe-style.xsl give the HTML output a more appealing look.
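The line breaks and headings described above could be produced with templates along these lines. Treat this as a sketch under assumptions: title is the only element name taken from the article, while heading and ingredient are hypothetical names, and the level-two heading is my own choice.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <xsl:template match="/">
    <html>
      <body>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>

  <!-- "Find" every title element and "replace" it with a level-one heading. -->
  <xsl:template match="title">
    <h1><xsl:apply-templates/></h1>
  </xsl:template>

  <!-- Hypothetical element name: a section heading becomes a smaller heading. -->
  <xsl:template match="heading">
    <h2><xsl:apply-templates/></h2>
  </xsl:template>

  <!-- Hypothetical element name: each ingredient is followed by a line break. -->
  <xsl:template match="ingredient">
    <xsl:apply-templates/><br/>
  </xsl:template>
</xsl:stylesheet>
```

Because the output method is html, the processor writes the empty br element in the form that older browsers expect.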
My recipe already has the title displayed as a nice big level-one heading, but it really should have a title between the HTML <head> tags as well. It’s not a big deal in terms of the look of the document, but it is required for my output to be considered valid HTML.
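I already have a template for the XML title element that displays it between level-one heading tags. To repeat the same text inside the head of the document, I can use the <xsl:value-of /> element.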
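Remember when I said that templates were like using find-and-replace in a word processor? Well, <xsl:value-of /> is like using just the find without the replace, and it uses a select attribute to specify what to find. For example, if I want to know what text is between the title tags in my XML recipe, I can point the select attribute at the title element and the processor will copy that text into the output.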
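Here is a small sketch of <xsl:value-of /> filling in the HTML title. The select path assumes a hypothetical root element named recipe with a title child; adjust it to match the real structure of the source file.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <xsl:template match="/">
    <html>
      <head>
        <!-- Look up the title text and copy it here. The path assumes a
             hypothetical root element named recipe. -->
        <title><xsl:value-of select="recipe/title"/></title>
      </head>
      <body>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>

  <!-- The title template still produces the visible heading in the body. -->
  <xsl:template match="title">
    <h1><xsl:apply-templates/></h1>
  </xsl:template>
</xsl:stylesheet>
```

The same title text now appears twice in the output: once in the head, fetched by <xsl:value-of />, and once in the body, produced by the template.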
With just a few simple templates, XSL does a great job of transforming my XML-based margarita recipe into HTML, but that’s not all it can do. With a little more work I can also produce documents for Adobe Acrobat or OpenOffice all from the same XML source file. I’ll need to construct a new XSL style sheet and install a new tool called a Formatting Objects processor, but otherwise the procedure is very similar to what we’ve covered so far.
The reason for the new style sheet and the new tool is that transforming to PDF is a two-step process. First, I will use XSL templates to transform my XML-based margarita recipe into another type of XML document. Sounds familiar, right? Only this time I’m not transforming into HTML, but rather into a markup language called Formatting Objects (FO). The second step is to take the FO markup and run it through the FO processor to get a PDF document.
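To give a feel for the first step, here is a minimal sketch of a style sheet that emits Formatting Objects instead of HTML. It is not the sample from the download: the page size, the margins, the master name "page" and the ingredient element name are all assumptions made for illustration.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <fo:root>
      <!-- Describe the page: one simple master with a body region. -->
      <fo:layout-master-set>
        <fo:simple-page-master master-name="page"
            page-height="11in" page-width="8.5in" margin="1in">
          <fo:region-body/>
        </fo:simple-page-master>
      </fo:layout-master-set>
      <!-- Pour the recipe content into the body region of that page. -->
      <fo:page-sequence master-reference="page">
        <fo:flow flow-name="xsl-region-body">
          <fo:block>
            <xsl:apply-templates/>
          </fo:block>
        </fo:flow>
      </fo:page-sequence>
    </fo:root>
  </xsl:template>

  <!-- The title becomes a large bold block, much like a heading in HTML. -->
  <xsl:template match="title">
    <fo:block font-size="18pt" font-weight="bold" space-after="12pt">
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>

  <!-- Hypothetical element name: each ingredient gets a block (line) of its own. -->
  <xsl:template match="ingredient">
    <fo:block><xsl:apply-templates/></fo:block>
  </xsl:template>
</xsl:stylesheet>
```

Running this through xsltproc yields an FO file rather than an HTML file. For the second step, with Apache FOP the command is typically something like fop -fo margarita.fo -pdf margarita.pdf, where both file names are placeholders of my own.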
Formatting Objects markup may look scary at first, but it’s not bad after you play around with it a bit. To help you get comfortable I have included a sample XSL style sheet and Formatting Objects output as part of the files in the compressed download.
This article barely scratches the surface of everything that can be done with XSL and FO. For those of you who want to go further, I have listed some helpful references below.
You can also learn a lot from using the various tools. Available XSLT processors include xsltproc and Saxon. A couple of my favorite Formatting Objects processors are Apache FOP and XMLMind’s FO Converter. All but xsltproc are Java-based and may be used on a variety of platforms.