Martin Bauer

A case study of open source content management using eZ publish

In March 2002, we were asked by the Graduate Careers Council of Australia (GCCA) to look at redesigning the gradlink website and implementing a content management system (CMS). This case study covers the rebuild of the gradlink website using the eZ publish open source content management system.

The Site

The gradlink site is a resource designed primarily to help graduates find jobs. It also helps employers with graduate recruitment and helps careers services assist graduates in finding jobs.

The site was first built in the mid-90s with its own job vacancy system. This was later outsourced and is now Seek Campus. The rest of the site consisted of static HTML and underwent a number of redesigns. Prior to this project, the most recent redesign was in 1998.

Initially, all maintenance of the site was outsourced to web development firms, but over time GCCA developed enough in-house skills to manage the majority of the site themselves. Unfortunately, over the four-year period since the last redesign and rebuild, the site had become difficult to manage and maintain. It was time for an overhaul.

The Brief

Before even considering designs or CMS packages, GCCA conducted a user survey with our assistance to get an idea of who was using gradlink, what they were using it for and what improvements they would like to see. The results of the survey were eye-opening. The site had always been split into three main sections: one for students, one for employers and one for careers services. The survey showed that over 80% of people visiting gradlink were students, approximately 10% were careers services and only 2% were employers. The overwhelming reason students came to the site was to find a job, a facility that GCCA had outsourced to Seek.

There were four key outcomes that GCCA wanted from the rebuild of gradlink.

- give the site a modern, clean, professional look
- integrate a job search facility that used the Seek Campus job database
- implement a CMS that would enable GCCA to maintain the site in house
- ensure any change to content had global impact on the site

Defining the solution

In order to choose a CMS, we started with a requirements gathering phase that was conducted over a series of long meetings (approximately 3 to 4 hours each). This produced a document listing written statements of what GCCA wanted for the new gradlink site. The next step was to translate the requirements into a functional specification. We did this by creating a site map and then producing wire frames for each main section of the site, defining each of the elements of the screen and any business rules.

The requirements gave us what the site was supposed to achieve.

The functional specifications gave us how we were going to do it.

Staged Approach

Once we had the full requirements and functional specifications, the scope of the project (and budget) was much larger than expected. The problem was that all of the functionality was important and necessary but could not be afforded at this point in time. So we prioritized the requirements to establish the minimum functionality required for the initial launch and called this Stage 1. The rest of the functionality was then prioritized into Stages 2 and 3 to be implemented later.

This prioritized approach was primarily driven by budget, but we also felt it was important to get the CMS up and running and have GCCA work with it for a while, after which we could review Stage 2 and 3 with more knowledge and experience with the system.

Choosing the CMS

There were two key elements that we had to consider in choosing a CMS. The first and most important was budget: the more the CMS was going to cost, the less could be spent on design and implementation. The second element was functionality. Ideally, we were hoping to find an open source solution that had all the functionality but no license fee associated with it. Of course, there is no such thing as a free lunch, and the chances were that an open source solution would require more work to implement than a commercial product. It was a matter of balancing the effort required to implement against the gain of an open source product with no license fee.

Our first step in choosing a CMS was to distill the requirements and functional specification into a list of functions. We found there were 18 areas of functionality that the CMS had to provide. We then went searching online for potential solutions and found there were dozens of open source options.

In selecting the end solution, the first step was to see if each candidate provided the full set of functionality. This eliminated many of the solutions we found and we ended up with a shortlist of 10 possible options that we documented in an evaluation matrix. We then looked more closely at each of the 10 shortlisted solutions, considering factors such as:

- was it actively maintained by the open source community
- how mature was the solution, e.g. had it evolved over several versions
- had the solution been used for commercial sites
- was there good documentation and support

This shortened the list to 3 possible solutions. We then got a copy of each solution and installed it to get a closer look at the underlying architecture and structure, as well as an idea of how easy the solution was to use from the client's perspective. In the end, the solution that we thought was the best option on all fronts was eZ publish, produced by eZ systems in Norway.
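The two-pass selection described above (first eliminate anything with functional gaps, then compare the survivors on softer factors) can be sketched as a small evaluation matrix. This is only an illustration: the article does not list the 18 functional areas or the products that were evaluated, so the area names, the candidate names and the simple "count the satisfied factors" scoring below are all assumptions.

```python
from dataclasses import dataclass, field

# ASSUMPTION: stand-ins for the 18 functional areas, which the article
# does not enumerate.
REQUIRED_AREAS = {"templating", "user roles", "search"}

# The four follow-up factors named in the text.
FACTORS = ("actively_maintained", "mature", "used_commercially", "documented")


@dataclass
class Candidate:
    name: str
    areas: set                                    # functional areas covered
    factors: dict = field(default_factory=dict)   # factor name -> True/False

    def covers_requirements(self) -> bool:
        # Pass 1: the product must cover every required area.
        return REQUIRED_AREAS <= self.areas

    def factor_score(self) -> int:
        # Pass 2: count how many follow-up factors are satisfied.
        return sum(1 for f in FACTORS if self.factors.get(f, False))


def shortlist(candidates, keep):
    """Drop candidates with functional gaps, then keep the best-scoring few."""
    complete = [c for c in candidates if c.covers_requirements()]
    ranked = sorted(complete, key=lambda c: c.factor_score(), reverse=True)
    return ranked[:keep]


# Invented candidates, purely for illustration.
candidates = [
    Candidate("cms-a", {"templating", "user roles", "search"},
              {"actively_maintained": True, "mature": True,
               "used_commercially": True, "documented": True}),
    Candidate("cms-b", {"templating", "search"},   # missing "user roles"
              {"actively_maintained": True, "mature": True}),
    Candidate("cms-c", {"templating", "user roles", "search", "workflow"},
              {"actively_maintained": True}),
]

for c in shortlist(candidates, keep=2):
    print(c.name, c.factor_score())
```

In practice the final choice also relied on installing and trying each product, which a score like this cannot capture.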

Rebuilding Gradlink

Interface Design Solution

designIT mocked up many ‘look and feel’ designs to gain feedback from gradlink on which one best represented their organisation. As the site is the organisation's main front door and resource, it had to represent both their intention and attitude, as well as be user friendly to all of their ‘markets’. While 80% of the users were found to be students, the funding for the organisation comes from institutions, so their needs are just as important even though they are a minority of users.

A case study of open source content management using eZ publish Interface Design Solution Test Implementation and which version? Real Implementation Content Population Job Search Integration Support and Documentation The End Result

Test Implementation and which version?

As we had not implemented eZ publish before, we knew there was going to be a steep learning curve on the first implementation. Rather than face that learning curve on the gradlink site, we decided to invest our own time and resources in implementing our own site so that we could be sure that it was the right solution for GCCA. What we discovered in implementing eZ publish for our own site was that the actual implementation was the easy part! Defining the structure and how it was to work was the hard part. Basically, we had to deconstruct our site into its elements and then configure eZ publish to assemble these elements to produce the website. We defined these elements as follows:

a) content types - the structure of the actual content within the site, each with their own attributes

e.g. one content type was an article, which had a heading, author, date, body copy and image

b) templates - how the content was to be displayed, i.e. the look and feel: the HTML and graphics that provided the structure for the page and determined how the content would look.

c) rules - what content was to be displayed with which template in which sections of the site, e.g. staff profiles only appeared in the "about" section of the site.
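To make the three elements concrete, here is a minimal Python sketch of how content types, templates and rules might fit together. It is not how eZ publish stores or renders anything; the class names, the "news" section, the staff profile attributes and the HTML are all invented for illustration. Only the article attributes and the "staff profiles appear only in the about section" rule come from the text.

```python
from dataclasses import dataclass
from string import Template


@dataclass
class ContentType:
    name: str
    attributes: tuple      # names of the fields an item of this type carries


@dataclass
class ContentItem:
    content_type: ContentType
    values: dict           # attribute name -> value


@dataclass
class Rule:
    content_type: str      # which content type the rule applies to
    section: str           # which section of the site it may appear in
    template: str          # which template renders it there


# a) content types
ARTICLE = ContentType("article", ("heading", "author", "date", "body", "image"))
STAFF_PROFILE = ContentType("staff_profile", ("name", "role"))  # hypothetical fields

# b) templates (presentation only, no content)
TEMPLATES = {
    "article_full": Template("<h1>$heading</h1><p>By $author, $date</p><div>$body</div>"),
    "profile_card": Template("<div class='profile'>$name ($role)</div>"),
}

# c) rules (what is shown, with which template, in which section)
RULES = [
    Rule("article", "news", "article_full"),          # "news" is an assumed section
    Rule("staff_profile", "about", "profile_card"),   # from the example in the text
]


def render(item, section):
    """Return HTML for the item in this section, or None if no rule allows it."""
    for rule in RULES:
        if rule.content_type == item.content_type.name and rule.section == section:
            return TEMPLATES[rule.template].substitute(item.values)
    return None


profile = ContentItem(STAFF_PROFILE, {"name": "A. Person", "role": "Editor"})
print(render(profile, "about"))   # rendered with the profile_card template
print(render(profile, "news"))    # None: no rule places profiles in "news"
```

The point of the separation is that changing a template or a rule never requires touching the content itself.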

Which version?

At the time we did the test implementation, the latest version (3.0) was supposed to have been released, but delays meant that only a beta version was available. This changed our plans significantly, as we had planned around the features provided by the new version. We didn’t want to implement an old version and then have to upgrade within a matter of months, which would cost more in the long run.

We discussed this at length with GCCA and looked at timelines for implementation with the previous version and the latest version. In the end, in line with our recommendation, GCCA decided to delay the deadline by 2 weeks to give us the extra time required to wait for the release of the latest version.

When the time came to start work on the real implementation, the final version was still not ready and we were forced to go forward with the latest release candidate. The final version was released during our real implementation, but by then it was too late to migrate to it. This caused two minor yet annoying issues. Firstly, we had to implement some workarounds for bugs that would not be fixed until the final release. Secondly, the WYSIWYG content editor would not be available until the final release. This meant that GCCA would have to do some manual content formatting using tags, e.g. for bold text and bulleted lists.

Real Implementation

Having learnt much about eZ publish from our test implementation, we went about defining the content types, templates and rules for the gradlink site, which we compiled into what we called the "information architecture" document.

It took designIT many iterations to get the information architecture to the point that we felt ready to start working on the site. The hardest part about defining the information architecture was abstracting content from presentation and understanding how to construct the site from the combination of content types and templates. It was a difficult shift in mindset from simply constructing sites based on pages. For example, a piece of content entered once could appear in different places on the site in different ways. We had to capture all of these possibilities to ensure the site would work as expected.
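The idea that one piece of content, entered once, can appear in several places in several forms is easier to see in a small example. The Python sketch below is purely illustrative: the article text, the view names and the HTML are invented, and it says nothing about how eZ publish implements this internally.

```python
from string import Template

# One piece of content, entered once (sample values are made up).
article = {
    "heading": "Preparing for interviews",
    "author": "Careers team",
    "body": "First paragraph of the article. Further paragraphs follow.",
}

# Three presentations of the same content for different parts of a site.
VIEWS = {
    "home_teaser": Template("<li><a href='/articles/$slug'>$heading</a></li>"),
    "section_summary": Template("<h2>$heading</h2><p>$summary</p>"),
    "full_page": Template("<h1>$heading</h1><p>By $author</p><div>$body</div>"),
}


def present(content, view):
    """Render the same content through the chosen view."""
    derived = dict(content)
    # Values derived from the content rather than entered separately.
    derived["slug"] = content["heading"].lower().replace(" ", "-")
    derived["summary"] = content["body"].split(". ")[0].rstrip(".") + "."
    return VIEWS[view].substitute(derived)


for view in VIEWS:
    print(present(article, view))
```

Editing the heading in the one stored article changes all three outputs at once, which is the "global impact" that a page-by-page static site cannot offer.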

What took us approximately 10 months to define and plan, from the user survey through to the information architecture, took us just under 2 months to actually implement. But it must be remembered that the implementation only took 2 months because we had invested in that valuable planning phase!

Content Population

This was the real test of how well we had done on the information architecture. Once GCCA started to enter content, we found that we had got most of it right, but there were some content types that we hadn’t accommodated, e.g. articles that spanned more than one page, or contact details. Also, we thought that the article content type would be flexible enough to handle event information, but when we saw how it looked on the site, we found that it didn't quite work and that we needed to provide another content type just for events. There were other things, like page anchors, that we take for granted in static sites but that aren't as easy to accommodate in a CMS.

GCCA had prepared for the content population phase by taking all the content on their current site and saving it in Microsoft Word files in folders to reflect the new structure. But in this process they discovered that it wasn't as easy as cutting and pasting from one site into the other. They found replication in their existing site and had to edit some of their content to make sure it fitted.

To get GCCA started, we conducted a training session on how eZ publish worked. The administration screens were very user friendly and we found that it took only an hour or two for people to be comfortable enough to find their way around and enter content. The difficulty was not so much entering content as understanding how that content would end up in the website, i.e. understanding the separation of content from presentation.

During the content population phase we had a few issues to resolve but, all in all, it went quite smoothly. The biggest issues were the amount of HTML formatting that had to be applied to the content and the bugs in the version of eZ publish being used (release candidate 2). This version lacked the online WYSIWYG editor and had other minor bugs, such as adding extra white space. That made previewing difficult: getting an article right meant publishing it, reviewing it and re-editing it.

All of these problems were solved when the code base was upgraded to the final version of eZ publish 3 a few months later.

Job Search Integration

One of the requirements of the site was a job search facility that published graduate job advertisements sourced from 2 different websites, Seek Campus and Graduate Opportunities (GO). Each website returned a file of results, and the two sets had to be displayed together in a particular priority order. We combined the external data and then integrated it into gradlink according to the required business logic.
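As a rough illustration of combining two result sets under a priority rule, consider the Python sketch below. The article does not describe the actual file format or the real priority logic, so everything here is an assumption: the `Job` fields, the choice to show Seek Campus results first, and the decision to treat adverts with the same title and employer as duplicates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    title: str
    employer: str
    source: str   # "seek_campus" or "go"


# ASSUMPTION: lower number is shown first. The real ordering is not stated.
SOURCE_PRIORITY = {"seek_campus": 0, "go": 1}


def merge_jobs(*feeds):
    """Combine feeds, drop duplicate adverts, and order by source priority."""
    best = {}
    for feed in feeds:
        for job in feed:
            # ASSUMPTION: same title + employer means the same advert.
            key = (job.title.lower(), job.employer.lower())
            current = best.get(key)
            if current is None or SOURCE_PRIORITY[job.source] < SOURCE_PRIORITY[current.source]:
                best[key] = job
    return sorted(best.values(),
                  key=lambda j: (SOURCE_PRIORITY[j.source], j.title.lower()))


# Invented sample data.
seek = [Job("Graduate Engineer", "Acme", "seek_campus"),
        Job("Analyst", "Initech", "seek_campus")]
go = [Job("Graduate Engineer", "Acme", "go"),
      Job("Trainee Accountant", "Globex", "go")]

for job in merge_jobs(seek, go):
    print(job.source, job.title)
```

Keeping the priority table separate from the merge function means the business rule can change without rewriting the merging code.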

We built this as a separate project but had to integrate it into eZ publish so that, from the user's perspective, it was seamless.

Initially, we integrated the job search directly into the eZ publish code. While this worked, it meant that the job search functionality would have to be reintegrated with each upgrade.

The release of eZ publish 3 included examples and documentation detailing how to create custom modules. Armed with this information, we recreated the job search functionality as a custom module. This keeps the functionality independent of the core eZ publish distribution and makes the upgrade process much simpler.
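The benefit of moving custom functionality out of the core and into a module can be shown with a generic sketch. The Python below is not the eZ publish module API (which is PHP and is not described in this article); `Core`, `register` and `handle` are hypothetical names used only to show the principle that the core dispatches to modules it knows nothing about.

```python
class Core:
    """Stand-in for a CMS core: it only knows how to dispatch to modules."""

    def __init__(self):
        self._modules = {}

    def register(self, name, handler):
        # Modules plug themselves in; the core's own code is never edited.
        self._modules[name] = handler

    def handle(self, path):
        # "/jobsearch/engineer" -> module "jobsearch", argument "engineer"
        name, _, argument = path.strip("/").partition("/")
        handler = self._modules.get(name)
        if handler is None:
            return "404 not found"
        return handler(argument)


# The job search lives in its own unit, separate from the core.
def job_search_module(query):
    return f"results for '{query}'"


core = Core()
core.register("jobsearch", job_search_module)
print(core.handle("/jobsearch/engineer"))   # prints: results for 'engineer'
```

If the `Core` class were replaced by a newer version with the same `register` and `handle` behaviour, the job search module would carry over unchanged, which is the upgrade benefit described above.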

Support and Documentation

We purchased enterprise support from eZ systems to ensure we could get our questions answered within 24 hours. As eZ publish is built by eZ systems, who are based in Norway, we couldn't rely on a local contact or on phone calls, so email support was very important.

On the whole we were impressed with the quality and professionalism of the support service. There were two occasions where the service did not live up to expectations. In both incidents the support questions were answered within a reasonable time frame but not within the advertised 24 hours. The first was due to the support staff being overloaded; when we queried it, we were credited with an hour of support time and promised that it would not happen again. The second incident was due to a Norwegian public holiday. As a result, we contacted the eZ systems management team outlining our concerns and received a response stating that they were working on improving their support procedures so that they could better accommodate overseas clients.

What we did find frustrating, however, was the level of documentation. As we did not have access to training courses, our learning and understanding of the system relied heavily on experimentation, user forums and reading the source code. When the documentation wasn't sufficient, we basically had to learn the hard way by trying things out ourselves, which resulted in a fair amount of trial and error to work out the best way to implement things.

The End Result

The new gradlink site went live on 21 March 2003, the launch date planned 4 months earlier and approximately a year after GCCA first spoke to us about a rebuild. Since then, the site has worked well and GCCA are now in full control of their site. About a month after we went live, we conducted a project debrief with GCCA to find out what we all thought went well and what could have been improved.

What went well

- delivered on time
- looks great
- have received positive feedback
- designIT and GCCA worked well as a team
- good planning upfront
- user survey helped focus project
- open communication on both sides, able to give and receive advice
- high levels of trust
- integration with Seek and GO was straightforward

What could have been improved

- closer review of content to see how it fits into the CMS and the content types defined
- selection of a mature product (e.g. not in beta) to prevent the risk of delays and difficulty in usage

5 months later

About 2 months after the initial launch, designIT planned and worked on a much reduced Stage 2. This was mostly a cosmetic update but, more importantly, it also included the upgrade to the final version (3.0) of eZ publish, which replaced some of the temporary solutions we had implemented to launch on time and improved the performance of the site through better caching and template management. Other than that, everything is working fine and we are now planning the next stage to add the next set of functionality.

In Conclusion

The project as a whole was a great success.

Each stage of the project was valuable in shaping the following stage and, in turn, the end result. The user survey helped shape the requirements, which helped shape the functional specifications, and so on. There were definitely times when we all felt that we were spending too much time documenting and not enough time building. But the speed and ease of implementation, and the small number of changes needed since the launch, proved that the effort spent upfront was well worth it, and it is standing us in good stead as we plan Stage 3.

As a product, eZ publish has proven to be a quality solution and eZ systems a professional organization to deal with. The product is well supported and future releases will add valuable features.

The hardest part of the project was understanding the difference between content and the presentation of that content as a page on a website, i.e. how to break a site down into its elements (content, templates) and then reconstruct them as a page on a website based on a series of rules.

Defining the information architecture is where the real challenge and real value lies in implementing a CMS.