ONJava.com -- The Independent Source for Enterprise Java

Java and XML: Web Publishing Frameworks

Using a Publishing Framework

Using a good publishing framework like Cocoon doesn't require any special instruction; it is not a complex application that users must learn to adapt to. In fact, all the uses of Cocoon are based on simple URLs entered into a standard web browser. Generating dynamic HTML from XML, viewing XML transformed into PDF files, and even seeing VRML applications generated from XML is simply a matter of typing the URL to the desired XML file into your browser and watching Cocoon and the power of XML take action.

Viewing XML Converted to HTML

Now that our framework is in place and correctly handling requests that end in .xml, we can begin to see it publish our XML files. Cocoon comes with several sample XML files and associated XSL stylesheets in the project's samples/ subdirectory. However, we have our own XML and XSL from earlier chapters, so let's transform the partial XML table of contents for our book with the XSL stylesheet we built in Chapter 6, Transforming XML. The XML file should be named contents.xml (and is also available from the book's web site). Locate where you saved this file, and copy it into the servlet engine's document root. On a default installation of Tomcat, this is under <TOMCAT_ROOT>/webapps/ROOT/. The document refers to the stylesheet XSL/JavaXML.html.xsl. Create the XSL directory in your web document root, and copy the stylesheet we built in Chapter 6 into that directory. You should make sure that the DTD referred to in the XML document is commented out (remember, validation should rarely occur in production); also convert the OReillyCopyright entity reference to HTML as discussed in Chapter 6. Although validation and external entity references are supported by Cocoon, it is easier to view our XML without worrying about those details for now.

Once you have the XML document and its stylesheet in place, you should be able to access it with the URL http://<hostname>:<port>/contents.xml in your web browser. If you made all the modifications discussed in Chapter 6, the transformed XML should look like Figure 9-2.

Figure 9-2. Transformed XML from Chapter 6


This should have seemed almost trivial to you; once Cocoon is set up and configured, serving up dynamic content is a piece of cake! The mapping from XML extensions to Cocoon should work across your entire servlet engine.

Viewing PDFs from XML

So far, we have talked almost exclusively about converting XML documents to HTML; when not looking at this, we have assumed our data was being used in an application-to-application manner. The format was entirely arbitrary, as both the sending and receiving applications parsed the XML using the specified DTD or schema. However, a publishing framework offers many more possibilities. Not only are a variety of markup languages supported as final document formats, but in addition, Java provides libraries for converting XML to some non-markup-based formats. The most popular and stable library in this category is the Apache XML group's Formatting Objects Processor, FOP, which we discussed briefly in Chapter 6. This gives Cocoon or any other publishing framework the ability to turn XML documents into Portable Document Format (PDF) documents, which are generally viewed with Adobe Acrobat (http://www.adobe.com/).

The importance of being able to convert a document from XML into a PDF cannot be overstated; particularly for document-driven web sites, such as print media or publishing companies, this could revolutionize web delivery of data. Consider the XML-formatted excerpt from Chapter 1, Introduction, shown in Example 9-1.

Example 9-1: XML Version of Chapter 1

<?xml version="1.0"?>
 
<?cocoon-process type="xslt"?>
<?xml-stylesheet href="XSL/JavaXML.fo.xsl" type="text/xsl"?>
 
<book>
 <cover>
  <title>Java and XML</title>
  <author>Brett McLaughlin</author>
 </cover>
 
 <contents>
  <chapter id="chapterOne">
   <title>Chapter 1: Introduction</title>
      
   <paragraph>XML.  These three letters have brought shivers to 
   almost every developer in the world today at some point in the
   last two years.  While those shivers were often fear at another
   acronym to memorize, excitement at the promise of a new technology,
   or annoyance at another source of confusion for today's 
   developer, they were shivers all the same.  Surprisingly, almost every
   type of response was well merited with regard to XML.  It is another 
   acronym to memorize, and in fact brings with it a dizzying array of 
   companions: XSL, XSLT, PI, DTD, XHTML, and more.  It also brings with 
   it a huge promise -- what Java did for portability of code, XML claims 
   to do for portability of data.  Sun has even been touting the 
   rather ambitious slogan "Java + XML = Portable Code + Portable 
   Data" in recent months.  And yes, XML does bring with it a 
   significant amount of confusion.  We will seek to unravel and 
   demystify XML, without being so abstract and general as to be 
   useless, and without diving in so deeply that this becomes just 
   another droll specification to wade through.  This 
   is a book for you, the Java developer, who wants to understand the 
   hype and use the tools that XML brings to the table.</paragraph>
 
   <paragraph>Today's web application now faces a wealth of problems
   that were not even considered ten years ago.  Systems that are 
   distributed across thousands of miles must perform quickly and 
   flawlessly.  Data from heterogeneous systems, databases, directory 
   services, and applications must be transferred without a single 
   decimal place being lost.  Applications must be able to communicate 
   not only with other business components, but other business systems 
   altogether, often across companies as well as technologies.  Clients 
   are no longer limited to thick clients, but can be web browsers that 
   support HTML, mobile phones that support Wireless Application 
   Protocol (WAP), or handheld organizers with entirely different markup 
   languages altogether. Data, and the transformation of that data, has 
   become the crucial centerpiece of every application being developed 
   today.</paragraph>
  </chapter>
 
 </contents>
</book>

We have already seen how XSL stylesheets allow us to transform this document into HTML. But converting an entire chapter of a book into HTML could result in a gigantic HTML document, and certainly an unreadable format; potential readers wanting online delivery of a book generally would prefer a PDF document. On the other hand, generating PDF statically from the chapter means changes to the chapter must be matched with subsequent PDF file generation. Keeping a single XML document format means the chapter can be easily updated (with any XML editor), formatted into SGML for printing hard copy, transferred to other companies and applications, and included in other books or compendiums. Now add to this robust set of features the ability for web users to type in a URL and access the book in PDF format, and you have a complete publishing system.

Although we don't have the time to cover formatting objects and the FOP for Java libraries in detail, you can review the entire formatting objects definition within the XSL specification at the W3C at http://www.w3.org/TR/xsl/. Example 9-2 is an XSL stylesheet that uses formatting objects to specify a transformation from XML to a PDF document, appropriate for our XML version of Chapter 1.

Example 9-2: XSL Stylesheet to Transform Example 9-1 into a PDF Document

<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:fo="http://www.w3.org/1999/XSL/Format">
 
  <xsl:template match="book">
    <xsl:processing-instruction name="cocoon-format">
      type="text/xslfo"
    </xsl:processing-instruction>
    <fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
      <fo:layout-master-set>
      <fo:simple-page-master
        page-master-name="right"
        margin-top="75pt"
        margin-bottom="25pt"
        margin-left="100pt"
        margin-right="50pt">
        <fo:region-body margin-bottom="50pt"/>
        <fo:region-after extent="25pt"/>
      </fo:simple-page-master>
      <fo:simple-page-master
        page-master-name="left"
        margin-top="75pt"
        margin-bottom="25pt"
        margin-left="50pt"
        margin-right="100pt">
        <fo:region-body margin-bottom="50pt"/>
        <fo:region-after extent="25pt"/>
      </fo:simple-page-master>
      </fo:layout-master-set>
 
      <fo:page-sequence>
 
        <fo:sequence-specification>
          <fo:sequence-specifier-alternating
            page-master-first="right"
            page-master-odd="right"
            page-master-even="left"/>
        </fo:sequence-specification>
 
        <fo:static-content flow-name="xsl-after">
          <fo:block text-align-last="centered" font-size="10pt">
            <fo:page-number/>
          </fo:block>
        </fo:static-content>
 
        <fo:flow>
          <xsl:apply-templates/>
        </fo:flow>
      </fo:page-sequence>
 
    </fo:root>
  </xsl:template>
 
  <xsl:template match="cover/title">
    <fo:block font-size="36pt" text-align-last="centered" 
              space-before.optimum="24pt">
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>
 
  <xsl:template match="author">
    <fo:block font-size="24pt" text-align-last="centered" 
              space-before.optimum="24pt">
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>
 
  <xsl:template match="chapter">
    <xsl:apply-templates/>
  </xsl:template>
 
  <xsl:template match="chapter/title">
    <fo:block font-size="24pt" text-align-last="centered" 
              space-before.optimum="24pt">
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>
 
  <xsl:template match="paragraph">
    <fo:block font-size="12pt" space-before.optimum="12pt" 
              text-align="justified">
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>
</xsl:stylesheet>

If you create both of these files, saving the chapter as chapterOne.xml, and the XSL stylesheet as JavaXML.fo.xsl within a subdirectory called XSL/, you can see the result of the transformation in a web browser. Make sure you have the Adobe Acrobat Reader and plug-in for your web browser, and then access the XML document just created. Figure 9-3 shows the results.

Figure 9-3. PDF document from Example 9-1 and Example 9-2


Browser-Dependent Styling

In addition to specifically requesting certain types of transformations, such as a conversion to a PDF, Cocoon allows for dynamic processing to occur based on the request. A common example of this is applying different formatting based on the client's media type. In a traditional web environment, this allows an XML document to be transformed differently depending on the browser being used. A client using Internet Explorer could be served a different presentation than a client using Netscape; given the ongoing differences in HTML, DHTML, and JavaScript support between Netscape and Microsoft browsers, this is a powerful feature to have available. Cocoon provides built-in support for many common browser types. Locate the cocoon.properties file you referenced earlier, open it, and scroll to the bottom of the file. You will see the following section (this may be slightly different for newer versions):

##########################################
# User Agents (Browsers)                 #
##########################################
 
# NOTE: numbers indicate the search order. This is very important since
# some words may be found in more than one browser description. (MSIE is
# presented as "Mozilla/4.0 (Compatible; MSIE 4.01; ...")
#
# for example, the "explorer=MSIE" tag indicates that the XSL stylesheet
# associated to the media type "explorer" should be mapped to those 
# browsers that have the string "MSIE" in their "user-Agent" HTTP header.
 
browser.0 = explorer=MSIE
browser.1 = opera=Opera
browser.2 = lynx=Lynx
browser.3 = java=Java
browser.4 = wap=Nokia
browser.5 = wap=UP
browser.6 = netscape=Mozilla

The keywords after the first equals sign are the items to take note of: explorer, lynx, java, and netscape, for example, all differentiate between user-agents, the identifying strings that browsers send with requests for URLs. As an example of applying stylesheets based on this property, let's create a sample XSL stylesheet to apply when the client accesses our XML table of contents document with Internet Explorer. Copy our original stylesheet, JavaXML.html.xsl, to JavaXML.explorer-html.xsl. Then make the modifications shown in Example 9-3.

Example 9-3: Internet Explorer XSL Stylesheet

<?xml version="1.0"?>
 
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:JavaXML="http://www.oreilly.com/catalog/javaxml/"
>
 
  <xsl:template match="JavaXML:Book">
    <html>
      <head>
        <title>
          <xsl:value-of select="JavaXML:Title" /> (Explorer Version)
        </title>
      </head>
      <body>
        <xsl:apply-templates select="*[not(self::JavaXML:Title)]" />
      </body>
    </html>
  </xsl:template>
 
  <xsl:template match="JavaXML:Contents">
    <center>
     <h2>Table of Contents (Explorer Version)</h2>
     <small>
       Try <a href="http://www.netscape.com">Netscape</a> today!
     </small>
    </center>
...

While this is a trivial example, dynamic HTML could be inserted for Internet Explorer 5.0, and standard HTML could be used for Netscape Navigator, which has less DHTML support. With this in place, we need to let our XML document know that if the media type (or user-agent) matches up with the explorer type defined in the properties file, a different XSL stylesheet should be used. The additional processing instruction shown in Example 9-4 handles this, and can be added to the contents.xml file.

Example 9-4: XML Document with Multiple Stylesheets Based on Media Type

<?xml version="1.0"?>
<?xml-stylesheet href="XSL/JavaXML.html.xsl" type="text/xsl"?>
<?xml-stylesheet href="XSL/JavaXML.explorer-html.xsl" type="text/xsl"
                 media="explorer"?>
<?xml-stylesheet href="XSL/JavaXML.wml.xsl" type="text/xsl" 
                 media="wap"?>
<?cocoon-process type="xslt"?>
...
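
To make the role of the media pseudo-attribute concrete, here is a minimal Java sketch of how a framework might pick one stylesheet from several xml-stylesheet processing instructions. It is not Cocoon's actual code: the StylesheetChooser class and its methods are hypothetical names, and the rule that an instruction without a media pseudo-attribute acts as the default is an assumption drawn from the behavior described in this section. Only standard SAX and regular-expression classes from the JDK are used.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Cocoon: picks a stylesheet href by media type.
final class StylesheetChooser {
    // Matches pseudo-attributes such as href="..." or media="..." in the PI data.
    private static final Pattern PSEUDO_ATTR =
            Pattern.compile("(\\w+)\\s*=\\s*\"([^\"]*)\"");

    /** Maps media type to href; the empty key holds the default stylesheet. */
    static Map<String, String> stylesheetsByMedia(String xml) {
        final Map<String, String> byMedia = new LinkedHashMap<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void processingInstruction(String target, String data) {
                if (!"xml-stylesheet".equals(target) || data == null) {
                    return; // ignore cocoon-process and other instructions
                }
                String href = null;
                String media = ""; // assumption: no media means "default"
                Matcher m = PSEUDO_ATTR.matcher(data);
                while (m.find()) {
                    if ("href".equals(m.group(1))) href = m.group(2);
                    if ("media".equals(m.group(1))) media = m.group(2);
                }
                if (href != null) byMedia.putIfAbsent(media, href);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse document", e);
        }
        return byMedia;
    }

    /** Returns the stylesheet for the media type, or the default if none is declared. */
    static String choose(Map<String, String> byMedia, String media) {
        return byMedia.getOrDefault(media, byMedia.get(""));
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\"?>\n"
                + "<?xml-stylesheet href=\"XSL/JavaXML.html.xsl\" type=\"text/xsl\"?>\n"
                + "<?xml-stylesheet href=\"XSL/JavaXML.wml.xsl\" type=\"text/xsl\"\n"
                + "                 media=\"wap\"?>\n"
                + "<?cocoon-process type=\"xslt\"?>\n"
                + "<doc/>";
        Map<String, String> sheets = stylesheetsByMedia(xml);
        System.out.println(choose(sheets, "wap"));      // stylesheet declared for wap
        System.out.println(choose(sheets, "netscape")); // falls back to the default
    }
}
```

A media type with no matching instruction falls back to the stylesheet declared without a media pseudo-attribute, which is why a Netscape client still receives the original HTML stylesheet.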

Accessing the XML in your Netscape browser yields the same results as before; however, if you access the page in Internet Explorer, you will see that the document has been transformed with the alternate stylesheet, and looks like Figure 9-4.

Figure 9-4. Internet Explorer version of generated HTML
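
The note in cocoon.properties about search order is easier to appreciate with a small experiment. The following Java sketch is an illustration only, not Cocoon's implementation: the UserAgentMedia class is a hypothetical name, and it assumes the matching is a plain substring test on the User-Agent header, as the comment in the properties file suggests. The sample user-agent strings are typical of the browsers named and are used purely as test data.

```java
// Hypothetical illustration of ordered user-agent matching; not Cocoon code.
final class UserAgentMedia {
    // {media type, search string}, in the order given in cocoon.properties.
    static final String[][] BROWSERS = {
        {"explorer", "MSIE"}, {"opera", "Opera"}, {"lynx", "Lynx"},
        {"java", "Java"}, {"wap", "Nokia"}, {"wap", "UP"},
        {"netscape", "Mozilla"}
    };

    /** Returns the media type of the first entry whose string occurs in the header. */
    static String mediaFor(String userAgent, String[][] table) {
        for (String[] entry : table) {
            if (userAgent.contains(entry[1])) {
                return entry[0];
            }
        }
        return ""; // assumption: no match means the default stylesheet is used
    }

    public static void main(String[] args) {
        String msie = "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98)";
        System.out.println(mediaFor(msie, BROWSERS)); // prints "explorer"

        // With "Mozilla" searched first, Internet Explorer is misidentified.
        String[][] wrongOrder = { {"netscape", "Mozilla"}, {"explorer", "MSIE"} };
        System.out.println(mediaFor(msie, wrongOrder)); // prints "netscape"
    }
}
```

Because Internet Explorer also announces itself as Mozilla, the netscape entry has to come last; moving it earlier would send the Netscape stylesheet to every Internet Explorer client.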


WAP and WML

One of the real powers in this dynamic application of stylesheets lies in the use of wireless devices. Remember our properties file?

browser.0 = explorer=MSIE
browser.1 = opera=Opera
browser.2 = lynx=Lynx
browser.3 = java=Java
browser.4 = wap=Nokia
browser.5 = wap=UP
browser.6 = netscape=Mozilla

The two wap entries (browser.4 and browser.5) detect that a wireless agent, such as an Internet-capable phone, is being used to access content. Just as Cocoon detected whether the incoming web browser was Internet Explorer or Netscape, responding with the correct stylesheet, a WAP device can be handled by yet another stylesheet. So far we have looked at the line that specifies a stylesheet to use for WAP media in our contents.xml file without paying it much attention:

<?xml-stylesheet href="XSL/JavaXML.html.xsl" type="text/xsl"?>
<?xml-stylesheet href="XSL/JavaXML.explorer-html.xsl" type="text/xsl"
                 media="explorer"?>
<?xml-stylesheet href="XSL/JavaXML.wml.xsl" type="text/xsl" 
                 media="wap"?>
<?cocoon-process type="xslt"?>

Now let's look at this in more detail. When building a stylesheet for a WAP device, the Wireless Markup Language (WML) is typically used. WML is a variant on HTML, but has a slightly different method of representing different pages. When a wireless device requests a URL, the returned response must be within a wml element. Within that root element, several cards can be defined, each through the WML card element. The device downloads multiple cards at one time (often referred to as a deck) so that it does not have to go back to the server for the additional screens. Example 9-5 shows a simple WML page using these constructs.

Example 9-5: A Simple WML Page

<wml>
 <card id="index" title="Home Page">
  <p align="left">
   <i>Main Menu</i><br />
   <a href="#title">Title Page</a><br />
   <a href="#myPage">My Page</a><br />
  </p>
 </card>
 
 <card id="title" title="My Title Page">
  <p>
   Welcome to my Title Page!<br />
   So happy to see you.
  </p>
 </card>
 
 <card id="myPage" title="Hello World">
  <p align="center">
   Hello World!
  </p>
 </card>
</wml>

This simple example would serve requests with a menu, and two screens that could be accessed from links within that menu. The complete WML 1.1 specification is available online at http://updev.phone.com/dev/ts/ by signing up for a free membership to phone.com's developer website, located at http://updev.phone.com/. Additionally, the UP.SDK can be downloaded; this is a software emulation of a wireless device that allows testing of your WML pages. With this software, we can develop an XSL stylesheet to output WML for WAP devices, and test the results by pointing our UP.SDK browser to http://<hostname>:<port>/contents.xml.
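
WAP browsers are strict about markup, so it can save time to check a deck before loading it into an emulator. The Java sketch below is a hypothetical helper (DeckChecker is not part of any WAP toolkit) that uses the JDK's DOM parser to confirm that a deck is well-formed XML and that every in-deck link of the form #id points at a card that actually exists. It does not validate against the WML DTD.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical pre-flight check for a WML deck; not a full WML validator.
final class DeckChecker {
    /** Returns the "#id" links that do not match any card id in the same deck. */
    static List<String> danglingLinks(String wml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(wml)));
        } catch (Exception e) {
            // An unclosed <p> or similar slip ends up here.
            throw new IllegalArgumentException("Deck is not well-formed XML", e);
        }

        // Collect the id of every card in the deck.
        Set<String> cardIds = new HashSet<>();
        NodeList cards = doc.getElementsByTagName("card");
        for (int i = 0; i < cards.getLength(); i++) {
            cardIds.add(((Element) cards.item(i)).getAttribute("id"));
        }

        // Report links such as href="#title" that have no matching card.
        List<String> dangling = new ArrayList<>();
        NodeList links = doc.getElementsByTagName("a");
        for (int i = 0; i < links.getLength(); i++) {
            String href = ((Element) links.item(i)).getAttribute("href");
            if (href.startsWith("#") && !cardIds.contains(href.substring(1))) {
                dangling.add(href);
            }
        }
        return dangling;
    }

    public static void main(String[] args) {
        String deck = "<wml>"
                + "<card id=\"index\" title=\"Home Page\"><p>"
                + "<a href=\"#title\">Title Page</a><br/>"
                + "<a href=\"#myPage\">My Page</a><br/>"
                + "</p></card>"
                + "<card id=\"title\" title=\"My Title Page\"><p>Welcome!</p></card>"
                + "</wml>";
        // The deck above has no card with id "myPage", so that link is reported.
        System.out.println(danglingLinks(deck));
    }
}
```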

Because phone displays are much smaller, we only want to show a subset of the information in our XML table of contents. Example 9-6 is an XSL stylesheet that outputs three cards in WML. The first is a menu with links to the other two cards. The second card generates a table of contents listing from our contents.xml document. The third card is a simple copyright screen. This stylesheet can be saved as JavaXML.wml.xsl in the XSL/ subdirectory of your web server's document root.

Example 9-6: XSL Stylesheet to Output WML from XML Table of Contents

<?xml version="1.0"?>
 
<xsl:stylesheet version="1.0"                 
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"                
                xmlns:JavaXML="http://www.oreilly.com/catalog/javaxml/"
                exclude-result-prefixes="JavaXML"
>
 
 <xsl:template match="JavaXML:Book">
  <xsl:processing-instruction name="cocoon-format">
   type="text/wml"
  </xsl:processing-instruction>
 
  <wml>
   <card id="index" title="{JavaXML:Title}">
    <p align="center">
     <i><xsl:value-of select="JavaXML:Title" /></i><br />
     <a href="#contents">Contents</a><br/>
     <a href="#copyright">Copyright</a><br/>
    </p>
   </card>
  
   <xsl:apply-templates select="JavaXML:Contents" />
 
   <card id="copyright" title="Copyright">
    <p align="center">
     Copyright 2000, O&apos;Reilly &amp; Associates
    </p>
   </card>
  </wml>
 </xsl:template>
 
 <xsl:template match="JavaXML:Contents">
  <card id="contents" title="Contents">
   <p align="center">
    <i>Contents</i><br />
    <xsl:for-each select="JavaXML:Chapter">
     <xsl:number value="position(  )" format="1: " />
     <xsl:value-of select="JavaXML:Heading" /><br />
    </xsl:for-each>
   </p>
  </card>
 </xsl:template>
 
</xsl:stylesheet>

Other than the WML tags, most of this example should look familiar. A new XSL function is introduced, position( ), and a new XSL element, xsl:number, displays it. Together they output each element's position within the xsl:for-each loop; the format attribute allows the specification of the output format. In our case, we want output similar to this:

1: Introduction
2: Creating XML
...
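
The numbering can also be tried in isolation, outside Cocoon, with the XSLT support in the JDK's javax.xml.transform package. The sketch below is an illustration under stated assumptions: NumberedContents is a hypothetical class name, the stylesheet is a cut-down version of the one above that writes plain text rather than WML, and the two-chapter input document is invented sample data using the same element names.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo of xsl:number with position(); not part of Cocoon.
final class NumberedContents {
    private static final String NS = "http://www.oreilly.com/catalog/javaxml/";

    // Cut-down stylesheet: numbers each chapter heading, one per line, as text.
    static final String XSL =
          "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
        + " xmlns:JavaXML=\"" + NS + "\">"
        + "<xsl:output method=\"text\"/>"
        + "<xsl:template match=\"JavaXML:Contents\">"
        + "<xsl:for-each select=\"JavaXML:Chapter\">"
        + "<xsl:number value=\"position()\" format=\"1: \"/>"
        + "<xsl:value-of select=\"JavaXML:Heading\"/>"
        + "<xsl:text>&#10;</xsl:text>"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Invented sample input with two chapters.
    static final String SAMPLE =
          "<JavaXML:Contents xmlns:JavaXML=\"" + NS + "\">"
        + "<JavaXML:Chapter><JavaXML:Heading>Introduction</JavaXML:Heading></JavaXML:Chapter>"
        + "<JavaXML:Chapter><JavaXML:Heading>Creating XML</JavaXML:Heading></JavaXML:Chapter>"
        + "</JavaXML:Contents>";

    /** Applies the stylesheet to the given XML and returns the text output. */
    static String transform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.print(transform(SAMPLE));
    }
}
```

The trailing ": " in the format attribute is emitted after each number, which is what produces the "1: Introduction" style of listing.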

We also added a processing instruction for Cocoon, with the target specified as cocoon-format. The data sent, type="text/wml", instructs Cocoon to deliver the transformed output with a content header specifying that it is text/wml (instead of the normal text/html or text/plain). The last new construct is an important one, and is seen as an attribute added to the root element of the stylesheet:

<xsl:stylesheet version="1.0"                 
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"                
                xmlns:JavaXML="http://www.oreilly.com/catalog/javaxml/"
                exclude-result-prefixes="JavaXML"
>

By default, any XML namespace declarations in the stylesheet other than the XSL namespace are added to the root element of the transformation output. In our example, the root element of our transformed output, wml, would have the JavaXML namespace declaration added to it:

<wml xmlns:JavaXML="http://www.oreilly.com/catalog/javaxml/">
...
</wml>

This would cause a WAP browser to report an error, as xmlns:JavaXML is not an allowed attribute for the wml element. The browser is not as forgiving as an HTML browser, and the rest of our content would not be shown. However, we must declare the namespace so our XSL stylesheet can handle template matching for the input document, which does use the JavaXML namespace. To handle this problem, XSL allows the attribute exclude-result-prefixes to be added to the xsl:stylesheet element. Any namespace prefix listed in this attribute will not be added to the transformed output, which is exactly what we want. Our output would now look like this:

<wml>
...
</wml>

This is understood perfectly by a WAP browser. If you've downloaded the UP.SDK browser, you can point it to our XML table of contents, and see the results. Figure 9-5 shows the main menu that results from the transformation using our WML stylesheet when a WAP device requests our contents.xml file through Cocoon.

Figure 9-5. WML main menu


Figure 9-6 shows the generated table of contents, accessed by clicking the "Link" button when the "Contents" link is indicated in the display.

Figure 9-6. WML table of contents


For more information on WML and WAP, visit http://www.phone.com/ and http://www.wapforum.org/ ; both sites have extensive online resources for wireless device development.
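
The effect of exclude-result-prefixes on the wml root element, described earlier in this section, can be reproduced without Cocoon or a WAP emulator. The following Java sketch uses the JDK's javax.xml.transform API to run a reduced version of the WML stylesheet twice, once with the attribute and once without. ExcludePrefixDemo is a hypothetical class name, the one-card stylesheet and the input document are simplified stand-ins for the real files, and the exact serialization may vary slightly between XSLT processors.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo of exclude-result-prefixes; not part of Cocoon.
final class ExcludePrefixDemo {
    private static final String NS = "http://www.oreilly.com/catalog/javaxml/";

    // Simplified input: a book with only a title.
    private static final String INPUT =
          "<JavaXML:Book xmlns:JavaXML=\"" + NS + "\">"
        + "<JavaXML:Title>Java and XML</JavaXML:Title>"
        + "</JavaXML:Book>";

    /** Builds a one-card WML stylesheet, optionally excluding the JavaXML prefix. */
    static String stylesheet(boolean exclude) {
        return "<xsl:stylesheet version=\"1.0\""
             + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
             + " xmlns:JavaXML=\"" + NS + "\""
             + (exclude ? " exclude-result-prefixes=\"JavaXML\"" : "")
             + ">"
             + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
             + "<xsl:template match=\"JavaXML:Book\">"
             + "<wml><card id=\"index\" title=\"{JavaXML:Title}\"/></wml>"
             + "</xsl:template>"
             + "</xsl:stylesheet>";
    }

    /** Transforms the sample input and returns the serialized result. */
    static String transform(boolean exclude) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(stylesheet(exclude))));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(INPUT)),
                                  new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Without the attribute: " + transform(false));
        System.out.println("With the attribute:    " + transform(true));
    }
}
```

Comparing the two results should show the xmlns:JavaXML declaration on the wml element only in the first case, which is the form a WAP browser would reject.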

By now, you should have a pretty good idea of the variety of output that can be created with Cocoon. With a minimal amount of effort and an extra stylesheet, the same XML document can be served in multiple formats to multiple types of clients; this is one of the reasons the web publishing framework is such a powerful tool. Without XML and a framework like this, separate sites would have to be created for each type of client. Now that you have seen how flexible the generation of output is when using Cocoon, we move on to looking at how Cocoon provides technology that allows for dynamic creation and customization of the input to these transformations.
