Cocoon 3 Without Tears

The theme of this article will be familiar to readers of Cocoon 2.2 without tears: We will explore ways to configure the latest version of cocoon to allow interactive development on a live site. Just like the good old days.

For readers who stumble upon this site, Apache Cocoon is a minimal but powerful framework for building web applications that process and assemble content from distributed sources and document types.

A little Cocoon history

Cocoon has a long and complex history with tidal flows of waxing and waning popularity. With Cocoon currently at ebb tide, this site is both instructional and promotional: If you need to do single-source publishing or manage complex document transformations to create web content, cocoon may be exactly what you're looking for.

Cocoon was originally promoted as a collection of components to process exposed content: A cocoon website combined and filtered local and remote resources guided by user requests. These resources usually came from files or database systems. Developers could add, modify, or delete resources on a live site: the effects of changes were instantly visible.

Beginning with Cocoon 2.2, the official documentation began to promote a "block centric" view of cocoon: A block is a jar file that appears to be a regular java servlet. In fact, it may be implemented entirely by a cocoon sitemap using the built-in components or using any combination of java classes and built-in components.

Blocks are a revolutionary idea because users can create software tools implemented as servlets without traditional java programming. So it is by no means the case that cocoon 2.2 (or 3) is "java centric" and requires programming skills to operate. But unfortunately, the documentation (or lack of it) led many to conclude that cocoon was no longer accessible to people other than java software engineers. And by providing examples that packaged content as well as tools inside jars, the documentation made interactive and incremental development seem impossible.

In fact, cocoon 2.2 retained nearly all the capability of its predecessor in a far more elegant package. It's possible to use cocoon 2.2 with any combination of exposed resources and block servlets. But this fact passed unnoticed and the user base for cocoon shrank at an alarming speed.

Certainly another factor in the "demise" of Cocoon was its perceived status as a "one size fits all" web application framework. And for a time this did seem to be the case: Cocoon contained support for every sort of content processing and presentation commonly used in the early years of the century.

But such a large architecture couldn't be sustained: The pace of web development increased and the field became increasingly specialized and sophisticated. It was simply not feasible to maintain a large monolithic framework.

At this juncture, key Cocoon developers made a difficult and courageous decision: Cocoon was contracted to become a minimal framework: A tool for integrating and directing the flow of information between software components without trying to provide all the components. As a consequence, it's now perfectly feasible to document and understand cocoon with reasonable effort.

While this decision was almost certainly the right direction to take the project, it has left many with the perception that the day of Cocoon has passed. The goal of this very brief tutorial is to show how Cocoon 3 works with exposed content (or with blocks) and still empowers non-programmer web developers.

The importance of interactive development

The motivation for this arrangement seems obvious to me, but since it totally escapes most cocoon developers, I'll be more specific: It is time-consuming and aggravating to constantly copy files, package wars, and wait for servlet engines to restart. And even if these processes could be made instantaneous, service to users would still be interrupted.

The whole concept of development mode vs production mode is an artificial limitation on developer productivity. It's an artifact of the old compile/link/run/test cycle of the punched card era. A fine thing when machine time was more valuable than developer time.

Why have modes if modes can be avoided? In complex applications, a great deal of time is spent bringing the state of a system to a point where some new feature or unwanted behavior is demonstrated. This state restoration activity has to be repeated for each turn of the development cycle. But when changes can be made on the fly without disturbing the context, the development cycle is greatly shortened.

RANT (May be skipped by sensitive souls)

Cocoon succeeded and grew despite incomplete, inaccurate, and non-existent documentation. And it grew despite interwoven and interdependent examples (Dare I say malevolent?) that couldn't be understood until the totality of cocoon was understood.

Several 75mm books were published and a great many public websites were deployed with cocoon prior to 2.2. A few large institutions realized the full potential of cocoon and more would have followed... But then the brakes were put on:

Cocoon 2.2 introduced Spring and Maven. These tools were daunting to some users, but that decision was both correct and far-sighted: Both technologies have been broadly accepted in the java world.

And the idea of blocks packed in jars that implement servlets is a great idea: Developers can create new servlets by combining existing tools with pipelines specified in a sitemap: All without using Java! I often hear critics of cocoon 2.2 and 3 lament that the framework has become java centric: This may be true for some of the developers, but it's still possible to create web tools without java and package them so others can use them. This was a major innovation unmatched by any other web technology.

But blocks, in my opinion, should be used for tools, not as containers for content. And the documentation for cocoon 2.2, such as it is, doesn't reveal any other way to create a webapp. The abandonment of interactive development was the real cocoon killer: Only a handful of cocoon 2 sites exist in the world. And just as a few intrepid souls tried to learn and use Cocoon 2.2, the developers en masse abandoned the project in favor of the even-more-undocumented cocoon 3. This "took the wind out of the sails" of the early adopters who might have joined the effort to document and promote the future of cocoon. Instead, most users simply moved on to alternative frameworks. In fact, this may be the only public site implemented with cocoon 3.

But there is another way. Both cocoon 2.2 and cocoon 3.x can be used productively to manage live sites. And I agree with the developers that cocoon 3 is a superior and greatly simplified framework with better potential than cocoon 2.2: Earlier versions of cocoon simply contained too many components that were never documented and no longer maintained. Getting back to the basic component and pipeline architecture was a good move. But it was also strong medicine that nearly killed the user community.

Overview

It's possible to use cocoon 2 in a highly productive, interactive manner. Details of this setup are outlined in the above-mentioned Cocoon 2.2 without tears. Cocoon 3 presents additional challenges: It is no longer possible to interactively modify the sitemap and see instant results. But it is possible to reload the whole spring context when classes or selected files are modified. And we can keep content on the site outside the scope of any block.

Project directories and files

pom.xml
parent/
	pom.xml
	
src/main/webapp/
	sitemap.xmap
	stylesheet.xsl
	stylesheet.css
	index.xml
	GravenImages/
	WEB-INF/
		applicationContext.xml
		log4j.xml
		web.xml
		lib/

Maven project object model (pom.xml)

<?xml version="1.0" encoding="UTF-8"?>
<project
	xmlns="http://maven.apache.org/POM/4.0.0"
	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xsi:schemaLocation="
		http://maven.apache.org/POM/4.0.0
		http://maven.apache.org/maven-v4_0_0.xsd"
>

<modelVersion>4.0.0</modelVersion>
<packaging>war</packaging>

<parent>
	<groupId>org.apache.cocoon.parent</groupId>
	<artifactId>cocoon-parent</artifactId>
	<version>3.0.0-alpha-3-SNAPSHOT</version>
	<relativePath>parent/pom.xml</relativePath>
</parent>

<groupId>com.csparks.demo</groupId>
<artifactId>c3demo</artifactId>
<version>1.0-SNAPSHOT</version>

<name>c3demo</name>
<description>A minimalist cocoon 3 webapp</description>

<dependencies>
	<dependency>
		<groupId>log4j</groupId>
		<artifactId>log4j</artifactId>
	</dependency>
	<dependency>
		<groupId>org.apache.cocoon.servlet</groupId>
		<artifactId>cocoon-servlet</artifactId>
	</dependency>
	<dependency>
		<groupId>org.apache.cocoon.sitemap</groupId>
		<artifactId>cocoon-sitemap</artifactId>
	</dependency>
</dependencies>

</project>

Web deployment descriptor (web.xml)

<?xml version="1.0" encoding="UTF-8"?>
<web-app version="2.4"
         xmlns="http://java.sun.com/xml/ns/j2ee"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="
	 	http://java.sun.com/xml/ns/j2ee
		http://java.sun.com/xml/ns/j2ee/web-app_2_4.xsd"
>

<!-- Define the webapp root path referenced in applicationContext.xml -->

	<context-param>
		<param-name>webAppRootKey</param-name>
		<param-value>c3.root</param-value>
	</context-param>

<!-- Configure Spring logging -->

	<!-- This must be done before the Spring ContextLoaderListener is registered -->
	
	<context-param>
		<param-name>log4jConfigLocation</param-name>
		<param-value>/WEB-INF/log4j.xml</param-value>
	</context-param>

	<listener>
		<listener-class>org.springframework.web.util.Log4jConfigListener</listener-class>
	</listener>

<!-- Process Spring beans -->

	<listener>
		<listener-class>org.springframework.web.context.ContextLoaderListener</listener-class>
	</listener>

<!-- Define the Spring request listener -->

	<listener>
		<listener-class>org.springframework.web.context.request.RequestContextListener</listener-class>
	</listener>

<!-- Install cocoon blocks -->

	<listener>
		<listener-class>org.apache.cocoon.blockdeployment.BlockDeploymentServletContextListener</listener-class>
	</listener>

<!-- Cocoon blocks dispatcher  -->

	<servlet>
		<servlet-name>DispatcherServlet</servlet-name>
		<servlet-class>org.apache.cocoon.servletservice.DispatcherServlet</servlet-class>
		<load-on-startup>1</load-on-startup>
	</servlet>

<!-- Cocoon sitemap processor -->

	<servlet>
		<servlet-name>XMLSitemapServlet</servlet-name>
		<servlet-class>org.apache.cocoon.servlet.XMLSitemapServlet</servlet-class>
		<init-param>
			<param-name>sitemap-path</param-name>
			<param-value>/sitemap.xmap</param-value>
		</init-param>
		<load-on-startup>2</load-on-startup>
	</servlet>

<!-- URL space mapping -->

	<servlet-mapping>
		<servlet-name>DispatcherServlet</servlet-name>
		<url-pattern>/*</url-pattern>
	</servlet-mapping>

</web-app>

Spring application context (applicationContext.xml)

<?xml version="1.0" encoding="UTF-8"?>

<beans
	xmlns="http://www.springframework.org/schema/beans"
	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xmlns:util="http://www.springframework.org/schema/util"
	xmlns:configurator="http://cocoon.apache.org/schema/configurator"
	xmlns:servlet="http://cocoon.apache.org/schema/servlet"
	xsi:schemaLocation="
		http://www.springframework.org/schema/beans
		http://www.springframework.org/schema/beans/spring-beans-2.5.xsd
		http://www.springframework.org/schema/util
		http://www.springframework.org/schema/util/spring-util-2.5.xsd
		http://cocoon.apache.org/schema/configurator
		http://cocoon.apache.org/schema/configurator/cocoon-configurator-1.0.1.xsd
		http://cocoon.apache.org/schema/servlet
		http://cocoon.apache.org/schema/servlet/cocoon-servlet-1.0.xsd"
>

<!-- Activate Cocoon Spring Configurator -->

	<configurator:settings/>

<!-- Configure Log4j -->

	<bean
		name="org.apache.cocoon.spring.configurator.log4j"
		class="org.apache.cocoon.spring.configurator.log4j.Log4JConfigurator"
		scope="singleton"
	>
		<property name="settings" ref="org.apache.cocoon.configuration.Settings"/>
		<property name="resource" value="/WEB-INF/log4j.xml"/>
	</bean>

<!-- Sitemap servlet -->

	<bean id="c3demo.servlet.service" class="org.apache.cocoon.servlet.XMLSitemapServlet">
		<servlet:context mount-path="" context-path="file:///${c3.root}"/>
	</bean>

</beans>
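
Both web.xml and applicationContext.xml point at /WEB-INF/log4j.xml, which isn't listed in this article. As a rough placeholder, here is a minimal sketch assuming the log4j 1.2 XML configuration format. The appender name, the pattern, and the log level are my own assumptions, so adjust them to taste:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
<!-- Hypothetical minimal log4j.xml: the real file for this site is not shown in the article -->
<log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/">

	<!-- Send everything to the servlet container's console -->
	<appender name="console" class="org.apache.log4j.ConsoleAppender">
		<layout class="org.apache.log4j.PatternLayout">
			<param name="ConversionPattern" value="%d %-5p %c - %m%n"/>
		</layout>
	</appender>

	<!-- Assumed default level; lower it to debug when chasing sitemap problems -->
	<root>
		<priority value="info"/>
		<appender-ref ref="console"/>
	</root>

</log4j:configuration>
```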

The cocoon sitemap (sitemap.xmap)

This is a simple sitemap that highlights a few surprises: Cocoon 3 uses jexl expressions to access and pass parameters.

<?xml version="1.0" encoding="UTF-8"?>
<map:sitemap xmlns:map="http://apache.org/cocoon/sitemap">

<map:pipelines>
<map:pipeline>

	<map:match pattern="">
	        <map:redirect-to uri="index.xml"/>
	</map:match>

	<map:match pattern="{name}.xml">
		<map:generate src="{map:name}.xml"/>
		<map:transform src="stylesheet.xsl">
			<map:parameter name="section" value="{jexl:cocoon.request.section}"/>
		</map:transform>
		<map:serialize type="html"/>
	</map:match>

	<map:match pattern="{name}.jpg">
	    <map:read src="GravenImages/{map:name}.jpg"/>
	</map:match>

	<map:match pattern="{name}.css">
	    <map:read src="{map:name}.css"/>
	</map:match>

</map:pipeline>
</map:pipelines>

</map:sitemap>

In this example, note that pattern components can be named. The xslt stylesheet has a parameter and we get the value from the request, again using a jexl expression. The last component of the term cocoon.request.section is the actual request parameter name.
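
To make the parameter passing concrete, here is a minimal sketch of what the receiving end in stylesheet.xsl might look like. This is not the actual stylesheet used on this site: the page layout, the title, and the assumption that a request such as index.xml?section=intro is being served are all hypothetical. The only part that matters is that the xsl:param name matches the name given in map:parameter:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<!-- Receives the value supplied by <map:parameter name="section" .../> in the sitemap -->
<xsl:param name="section"/>

<!-- Hypothetical layout: wrap whatever the source document contains in a bare HTML page -->
<xsl:template match="/">
	<html>
		<head>
			<title>c3demo</title>
			<link rel="stylesheet" type="text/css" href="stylesheet.css"/>
		</head>
		<body>
			<!-- Echo the request parameter so the wiring can be checked by eye -->
			<p>Requested section: <xsl:value-of select="$section"/></p>
			<xsl:apply-templates/>
		</body>
	</html>
</xsl:template>

</xsl:stylesheet>
```

The stylesheet.css link resolves through the {name}.css matcher in the sitemap above. If the request carries no section parameter, I would expect the echoed value to come out empty, but I haven't verified how cocoon 3 passes a missing value.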

Deployment

The goal here is to make the servlet container notice changes to our sitemap.xmap file and reload the context. Unfortunately, the method for doing this varies depending on the container and method of deployment.

The webapp can have a context file inside its own META-INF directory that specifies reloading when the sitemap changes. The path will be:

<?xml version="1.0"?>
<Context reloadable="true">
	<WatchedResource>sitemap.xmap</WatchedResource>
</Context>

Linux users can put the webapp anywhere and add a symbolic link to the tomcat/webapps directory. Windows users can either keep their applications under tomcat/webapps or alternatively create a context file:

In this case, the context must use absolute paths for both the docBase attribute and the WatchedResource element:

<?xml version="1.0"?>
<Context docBase="${myappRoot}" reloadable="true">
	<WatchedResource>${webroot}/sitemap.xmap</WatchedResource>
</Context>

The variable webroot may be defined as a JVM startup parameter, or you can just type in the full path.
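
For the "just type in the full path" option, the context file might look like the following sketch. The directory C:/projects/c3demo is purely hypothetical; substitute wherever your webapp actually lives:

```xml
<?xml version="1.0"?>
<!-- Hypothetical absolute paths: replace C:/projects/c3demo with your own project location -->
<Context docBase="C:/projects/c3demo/src/main/webapp" reloadable="true">
	<WatchedResource>C:/projects/c3demo/src/main/webapp/sitemap.xmap</WatchedResource>
</Context>
```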

To test with the maven jetty plugin, add the following elements to the project pom.xml:

<build>
<plugins>
	<plugin>
		<groupId>org.mortbay.jetty</groupId>
		<artifactId>maven-jetty-plugin</artifactId>
		<configuration>
			<connectors>
				<connector implementation="org.mortbay.jetty.nio.SelectChannelConnector">
					<port>8890</port>
					<maxIdleTime>30000</maxIdleTime>
				</connector>
			</connectors>
			<webAppSourceDirectory>${webroot}</webAppSourceDirectory>
			<contextPath>/</contextPath>
			<scanIntervalSeconds>5</scanIntervalSeconds>
			<scanTargets>
				<scanTarget>${webroot}/sitemap.xmap</scanTarget>
			</scanTargets>
		</configuration>
	</plugin>
</plugins>
</build>

<properties>
	<webroot>src/main/webapp</webroot>
</properties>

It's really too bad there are so many "webroots" around, but I don't know how to define it all in one place so it's visible to the pom as well as web.xml and applicationContext.xml.