What Goes Where: WCMS Implementation in the Next 10 Years.

Martin Bauer
05/05/2005

Ten years ago, the number of commercial Web content management systems (WCMSs) available could be counted on one hand. Now the market is saturated with offerings, not to mention the in-house solutions that most Web development firms maintain. These developments have given customers a much wider and better range of products to choose from. However, the process of implementing a WCMS is still difficult, time-consuming, and costly.

The main problems with WCMS implementations are content input (acquisition and importation) and output (publication). The first is often a manual task that frequently takes more time and resources than planned; the second is a technical task requiring expertise that takes time to acquire. Both problems add cost and time to every WCMS implementation.

Over the next 10 years, we will see a number of changes that will improve the implementation process. The most important changes will be the emergence of better tools that will make the process more accessible to business people. The result will be better support for how content is input and output, without the cost and effort of many current implementations.

CONTENT ACQUISITION

Acquiring content in the right format has proven to be much more difficult than expected. It is also one of the key factors in projects going over time and budget. There are many reasons for this: lack of preparation, lack of understanding, lack of clarity of purpose, and sometimes simply the lack of actual content.

In an ideal world, all the content would exist in a consistent Web-friendly format and be properly organized before the implementation starts, making it a relatively simple task. In reality this rarely happens. Sometimes the project is dealing with content that is yet to be created or only exists in hard copy. Also, it’s very hard to predict how best to organize the content when you don’t know how it’s going to be output.

The key to improving content acquisition is being able to gather all the content in one place and in a consistent format. In future, the goal is to move toward the concept of a single source of content that then can be output in a range of formats best suited to the medium in which the content is to be viewed (e.g., via a Web browser, PDA, printer, or other device).

Formats and Sources

Most businesses will have content from a variety of sources. This is because they have never had the need to bring all the content together in a single place. For instance, the marketing department organizes a corporate brochure or annual report, the product department drafts technical details on a product, and never the twain have met. Because there has never been a need to combine them, different types of content will be in different formats. With each format, organizations face challenges in converting the content into a format that can then be imported into the WCMS. Below I will examine the main formats.

Hard Copy

At the moment, the main technology that can assist with converting hard copy to soft copy is scanning combined with optical character recognition. A step forward would be to integrate this feature into the WCMS. This would mean a hard copy document could be scanned directly into the WCMS for editing rather than having to go via a graphics application first and then being input into the WCMS. This would work well for text-based content but would be impractical for other types of content that would need to be formatted prior to input into the WCMS.

Soft Copy

The problem with soft copy content is the wide range of available formats, not all of which are appropriate for importing into a WCMS. For instance, on a recent project for a car parts manufacturer, all product information was provided in soft copy; however, it was spread across the following formats:

  • InDesign files

  • Word documents

  • XML

  • Excel spreadsheets

  • PDFs

Despite the product information being well structured, no one source had all the information for a particular product — it was spread across three different file types.

Graphics Files. Most organizations will have their graphics in a format for printing, and there are many formats available. Digital editing applications such as Adobe Photoshop allow for the conversion of graphics from one format to another. It is possible that WCMSs will introduce tools to manage this conversion. There are only a few graphics formats currently suitable for the Web (GIF, JPEG, and PNG). The ability to take a print image and upload it to the WCMS, which automatically converts it to the appropriate format, will mean that Web designers will spend less time converting files from one format to another, potentially saving time and money.

Text Files. The popularity of Microsoft Office means that the majority of text files are likely to be MS Word documents. This consistency means that managing text-based content is a straightforward task.

Data Files. Unlike text documents, there is a wide range of data file formats, as there are many data storage systems, each with its own proprietary format. As long as there is an identifiable structure to the data, it is possible to import that content. However, complications quickly arise when the data structures go beyond two dimensions; for instance, when an Excel document has multiple worksheets that share cells.

Databases. Similar to data files, there are many different databases out there, even though there is some consistency in the use of SQL. The difficulty is not so much the format of the data; if the database is in SQL, it can be exported and imported. What can’t be easily automated or managed is the meaning of the structure — why the tables are related in a particular way and how to bring all the data together in a coherent fashion.

Existing Systems. One of the major sources of content in future will be existing systems. Almost every business will have information in some type of electronic system, ranging from a simple CRM system through to ERP and financial systems. A WCMS that is integrated with an ERP system to get product information and publish that on a company’s Web site will be invaluable.

Gathering Content

Once the content is in a consistent Web-friendly format, the next issue is ensuring it is well organized. Along with the creation of content is the need to gather the content together in a coherent fashion to help with content input. For some projects, the content is simply a collection of Word documents filed in folders on a file server. However, when graphics and other assets are involved, it becomes more complex. Which image goes with which article? The answer to this is almost another content management system that allows the content creators to gather and store content in preparation for input into the CMS once it’s ready.

The structure of this system would be different to the final WCMS, as it will have to deal with multiple content formats as outlined above. What we will see is a form of simple document/data management used to gather and store content prior to entry into the CMS. To start with, these systems will be hand built and standalone, but in time they are likely to become integrated into the WCMS as a holding area for content.

CONTENT INPUT

Once the content has been gathered and organized, it needs to be entered into the WCMS. This is where considerable gains can be made in automating the process. What we will see is the improvement of importing functions to the point that they will be able to guess the type of content being imported and suggest the structure for the incoming content. We can already see this in the migration of content from one application to another (e.g., from one mail client to another and from one database to another). Another advance will be the ability to integrate content from other systems.

Importing Content

Assuming the content has a consistent structure, importing it into the WCMS should not be complicated. We are likely to see tools for handling common file types such as Word documents, Excel spreadsheets, or tab/comma delimited files.
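As a rough illustration of an import function that guesses the structure of incoming content, the following Python sketch inspects a tab- or comma-delimited file and suggests a type for each column. The function names and the sample data are hypothetical, and the type guessing is deliberately naive; a real import tool would need to handle dates, quoting problems, and inconsistent rows.

```python
import csv


def guess_delimiter(header_line):
    """Pick tab or comma, whichever occurs more often in the header line."""
    return "\t" if header_line.count("\t") > header_line.count(",") else ","


def guess_type(values):
    """Suggest 'integer', 'number', 'text', or 'empty' for a column of strings."""
    non_empty = [v for v in values if v.strip()]
    if not non_empty:
        return "empty"
    for name, caster in (("integer", int), ("number", float)):
        try:
            for value in non_empty:
                caster(value)
            return name
        except ValueError:
            continue
    return "text"


def suggest_structure(text):
    """Return {field_name: suggested_type} for delimited text with a header row."""
    lines = text.strip().splitlines()
    if not lines:
        return {}
    rows = list(csv.reader(lines, delimiter=guess_delimiter(lines[0])))
    header, body = rows[0], rows[1:]
    suggestion = {}
    for index, field_name in enumerate(header):
        # Skip rows that are too short rather than failing on them.
        column = [row[index] for row in body if index < len(row)]
        suggestion[field_name] = guess_type(column)
    return suggestion


sample = "part_no,name,price\n1001,Brake pad,24.50\n1002,Oil filter,9.95\n"
print(suggest_structure(sample))
```

For the sample above, the suggestion is that part_no is an integer, name is text, and price is a number. The suggestion is only a starting point for the person doing the import, who still has to confirm that it matches what the content means.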

Although this will save time in one area, it will raise issues in others; namely, the quality and meaning of the content. Although the content might have a consistent structure, that doesn’t mean it makes sense. Automation will simply act to highlight content that doesn’t make sense or is inconsistent. Also, the meaning and relevance of the content depends on the context. The contents of a standalone Word document change when that content is placed into a page on a site with surrounding content. The same applies to content migrated from an existing site; unless the new site has exactly the same structure, there will be a need to review the content.

As I mentioned earlier, our car parts manufacturer client had information on a particular product in three different formats: InDesign files (for print), XML files (from the previous site), and Excel spreadsheets (exported from the company’s ERP system). The problem was that the format of the information was slightly different in each file, and none of them contained the full information. So even though we could import the XML, it didn’t have all the data we needed. We still required the graphics from the InDesign file and some information from the Excel spreadsheet to have the full set of data for a particular product.
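The problem of partial records can be sketched in Python as follows. This is not how the project described above was implemented; it is a minimal illustration that assumes each source has already been parsed into a dictionary keyed by a shared product ID. The field names, product ID, and values are invented for the example.

```python
def merge_product_records(sources):
    """Combine partial product records from several sources.

    sources is a list of (source_name, {product_id: {field: value}}) pairs in
    priority order: the first source to supply a field wins, and any later
    disagreement is recorded as a conflict for a person to review.
    """
    merged = {}
    conflicts = []
    for source_name, records in sources:
        for product_id, fields in records.items():
            product = merged.setdefault(product_id, {})
            for field_name, value in fields.items():
                if field_name not in product:
                    product[field_name] = value
                elif product[field_name] != value:
                    conflicts.append(
                        (product_id, field_name, product[field_name], value, source_name)
                    )
    return merged, conflicts


# Hypothetical data standing in for the three parsed sources.
from_xml = {"P-100": {"name": "Brake pad", "description": "Front axle"}}
from_excel = {"P-100": {"name": "Brake Pad", "price": "24.50"}}
from_indesign = {"P-100": {"image": "p100.tif"}}

merged, conflicts = merge_product_records(
    [("xml", from_xml), ("excel", from_excel), ("indesign", from_indesign)]
)
```

Here the merged record for P-100 has all five fields, and the conflict list holds one entry because the XML and Excel sources spell the product name differently. The code can flag that difference, but deciding which spelling is correct remains an editorial task.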

Another example involves articles. An organization will often have a number of articles on a particular subject. Typically, these articles will only have been viewed on their own (e.g., in a magazine or newsletter). When they are imported into a WCMS and seen next to each other, suddenly the overlap becomes obvious, and it no longer makes sense to have two articles covering the same subject matter. It’s even possible that the articles contradict each other! Although there will be advances in technology to assist with the importation of editorial content, once the content has been imported, it will still have to be manually reviewed, edited, and organized. This will place greater importance on the role of Web editors.

System Integration

Not all content presented will be stored in the WCMS. There will be a move to use existing sources of data rather than replicate them. We are already starting to see this happen with modules CMS vendors are building to allow for this integration of systems. Over time, this will become one of the most significant factors in the selection of a WCMS. It will be important to ensure that this integration is a straightforward matter.

CONTENT PUBLICATION

Once the content is in the system, the next step is deciding what appears where on the site; that is, defining the rules for context and display. This is a complex task, as it requires the ability to think of the content in various dimensions. Here we will consider two levels: the site structure and the page layout.

Site Structure

In constructing a site, my colleagues and I commonly use tools such as Inspiration, Visio, or Excel to define site maps. Figure 1 shows a site map created in Visio for ACFS, a food manufacturer.

Once the site map has been constructed, entering it into a WCMS is a relatively simple matter, for which most solutions provide a reasonable interface. However, the process could be improved. The ability to generate the site map visually within the WCMS would provide some great advantages in the initial generation of the site structure. It would also provide some powerful management tools for dealing with the arrangement of content and association of one item to another. In databases, it is possible to use visual tools to create relationships between data tables; the ability to do this for sections and content within sections of Web sites would be a great step forward. It would enable people to work in a far more visual manner and assist with relating content.

The real benefit of a visual site structuring tool would not just be in the initial construction but in the refinement and ongoing management of content. For example, the final location of content is something that often needs refinement, and this is difficult to manage until the content is brought together in the one place. A particular article might be earmarked for one section but actually turn out to be more suitable in another section. The ordering of articles in a section is another issue that is not always considered up front. Being able to visually drag and drop content from one section to another and see how it fits together would be a very useful feature for Web editors. Once again, it comes back to context; a visual site mapping tool makes it much easier to understand and better manage the context of content.
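Underneath any visual tool of this kind there would need to be a simple model of sections and the content they hold. The Python sketch below is one possible shape for that model, with a move operation standing in for drag and drop. The class, function names, and section names are hypothetical and not taken from any particular WCMS.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    """A site section holding an ordered list of article titles and subsections."""
    name: str
    articles: list = field(default_factory=list)
    children: list = field(default_factory=list)


def move_article(title, source, target, position=None):
    """Move an article between sections, optionally to a given position."""
    source.articles.remove(title)  # raises ValueError if the article is not there
    if position is None:
        target.articles.append(title)
    else:
        target.articles.insert(position, title)


def outline(section, depth=0):
    """Return the site map as indented lines, articles before subsections."""
    lines = ["  " * depth + section.name]
    lines += ["  " * (depth + 1) + "- " + title for title in section.articles]
    for child in section.children:
        lines += outline(child, depth + 1)
    return lines


news = Section("News", ["Plant opening"])
products = Section("Products", ["Sauce range"])
home = Section("Home", children=[news, products])

# The editor decides the article fits better at the top of Products.
move_article("Plant opening", news, products, position=0)
print("\n".join(outline(home)))
```

Printing the outline after each move gives the editor a crude text version of what a visual site mapping tool would show graphically: where each item now sits and in what order.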

The other advantage visual site mapping offers is content association. We tend to think of content only within its displayed context (e.g., the article on the page, the product in a product listing). Understanding the relationship between different items of content requires a single person to have an understanding of all content within the site.

Page Layout

Next the Web editor must decide how the content should appear on the site. As I discussed in relation to the content location, the display will differ based on where the content is to appear. There are already some simple conventions on views of content that can be used (e.g., full view, summary view, line view). However, which view to choose will depend on the location and the other content that will appear on the page.

At the moment, there are a number of tools the Web editor can use to define how the content is to be displayed, an activity often known as creating wireframes. These tools include flowchart applications such as Visio and Inspiration and graphics applications such as Illustrator, Quark, InDesign, and Photoshop. Figure 2 is an example of a wireframe created in Visio. It shows the ACFS home page, which has the main navigation across the top with a member login and search function on the right. The content is a summary view of the different categories of food offerings. In the right-hand column, there are promotional items.

Creating this page requires someone with technical expertise in a particular WCMS to construct the page and to define the rules that specify what content is to appear where on the page and in what order based on the wireframe he has been given. As with any technical task, there is a cost involved. There is also the issue of understanding; the technician will have to interpret what the person who created the wireframe meant, as it’s not possible to capture all the details in a wireframe.

In the future, we will see this activity begin to be integrated into WCMS offerings so that wireframes can be created within the WCMS and applied immediately, as opposed to the current process of defining the wireframe in one application and then applying it to the WCMS. This will speed up the process considerably and allow for changes to be made more quickly and easily. It will mean the people making decisions on what content is to appear where will be the site editors rather than technicians.

The downside is that less thought will be put into this stage up front, and people will go straight into development rather than thinking through the design first. This is a significant problem in software development. Forcing developers to do a formal design before writing the code has proven to be the most successful method for reducing bugs, thus providing a much better long-term solution.

Permissions

On the surface, this is a straightforward concept: a permission specifies who has access to what content and at what level. It takes the same concept of users and groups that has been well defined in operating systems and applies it to the world of WCMS. However, it has a slightly greater level of complexity as workflow is introduced.

The users and groups approach can define access levels in an operating system, but it doesn’t define the workflow and collaboration rules that are required in a CMS. What has already started to happen and will continue is that permissions — at least for intranet access levels — will be defined externally. For example, integration with LDAP is becoming a standard feature of WCMS offerings, and this type of integration with existing systems will continue until the WCMS is a standard part of the business infrastructure rather than a bolt-on, as it tends to be.
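The difference between plain access levels and workflow can be made concrete with a small Python sketch. It assumes a three-state workflow (draft, in review, published) and group names that would, in practice, come from an external directory such as LDAP. The states, actions, and groups are all illustrative assumptions, not a description of any specific product.

```python
# Each (current_state, action) pair maps to the next state and the groups
# allowed to perform that action. All names here are hypothetical.
WORKFLOW = {
    ("draft", "submit"): ("in_review", {"authors", "editors"}),
    ("in_review", "approve"): ("published", {"editors"}),
    ("in_review", "reject"): ("draft", {"editors"}),
}


def apply_action(state, action, user_groups):
    """Return the next workflow state, or raise if the action is not allowed."""
    try:
        next_state, allowed_groups = WORKFLOW[(state, action)]
    except KeyError:
        raise ValueError(f"action {action!r} is not valid in state {state!r}")
    if not allowed_groups & set(user_groups):
        raise PermissionError(f"none of {sorted(user_groups)} may {action} content")
    return next_state


state = apply_action("draft", "submit", ["authors"])
state = apply_action(state, "approve", ["editors"])
```

A users-and-groups check on its own would only answer whether an author may edit the item. The table above also captures the order in which things must happen: an author can submit a draft, but only an editor can move it from review to published.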

CONTENT REUSE

There is some debate in the industry over the practice of content reuse. There are some who advocate it as a considerable benefit to the site owner, as an organization would be able to leverage its content to a greater degree and provide a much richer experience [2].

However, others claim that content reuse is a myth, arguing that while on the surface it appears to be of value, in practice it is very hard to implement and maintain [1].

We can think of content reuse on two levels: publishing content in more than one place (distributed publishing) and relating one piece of content to another. I will discuss these two aspects of reuse below.

Distributed Publishing

Figure 3 helps to illustrate the idea of distributed publishing. It shows a content model that was implemented for the Web site of RMIT University’s Centre for Design (Melbourne, Australia).

Across the top are the content types. Below that are the main areas of the site. From each content type are lines to show where that content can and is to appear. As you can see from the diagram, some content types appear in many different sections. My colleagues and I configured the system so that this works automatically. For instance, when a news item is published in the news area, it automatically appears on the homepage. When a training program is added to the training area, it is automatically added to the program to which it is related.
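The rules in a content model like this amount to a mapping from content type to the sections in which that type appears. The Python sketch below shows the idea under simplified assumptions; the content types and section names are placeholders loosely based on the examples above, not the actual configuration of the Centre for Design site.

```python
# Hypothetical publication rules: content type -> sections where it appears.
PUBLICATION_RULES = {
    "news": ["news", "homepage"],
    "training_program": ["training", "programs"],
}


def destinations(content_type, rules=PUBLICATION_RULES):
    """Return every section in which an item of this type will be published."""
    return list(rules.get(content_type, []))


def publication_preview(title, content_type, rules=PUBLICATION_RULES):
    """Describe, before saving, where an item will appear on the site."""
    sections = destinations(content_type, rules)
    if not sections:
        return f"'{title}' will not be published anywhere"
    return f"'{title}' will appear in: " + ", ".join(sections)


print(publication_preview("New course dates", "news"))
```

Listing the destinations at the moment of entry is one simple way to keep the rules in front of the person entering the content, who otherwise only sees the section being edited.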

This diagram was initially constructed by hand on a large piece of paper before being created in a flowchart application (Inspiration). Then the developer applied it in the WCMS by setting the rules for publication of each content type. This is a conceptually difficult task for all involved. What we found after the site was implemented was that users quickly forgot the rules defined in the content model, an outcome that supports the argument that content reuse is a myth.

What would happen is that the user would enter content into one part of the site and forget that it automatically appeared in another part of the site. She would review how the content looked in the section in which it was entered but fail to check the other sections where the content appeared. This created problems when the content was written and structured for the main section, because when that content appeared in the other sections, it didn’t make sense.

The alternative is to make the act of distributed publishing a more manual task. However, this strategy also has its downsides, as it relies on a human process. People simply forget that once the content is entered in one section, it also has to be published in other areas. For example, a feature story added to the feature story section might also need to be added to the homepage. The person entering the content will add it to the feature story area, as that’s where it belongs, but then forget to add it to the homepage.

In future, it will be important for WCMS vendors to recognize the complexity of these content models and somehow allow for the rules to be defined visually so that the users of the CMS can see where the content will be published after they enter it. A visual content modeling tool will be an essential part of future WCMS developments. These tools will rely on established conventions that people can apply to help reduce the complexity of content models and the time it takes to implement them.

Related Content

Similar to the concept of distributed publishing is the idea of related content. We see this all the time on Web sites. When you look at a particular page on a site, you’ll be pointed toward content that is related to the content on that page.

One way to handle this is to manually associate a particular piece of content with another and link them in the WCMS. This requires the ability to think of the content in different contexts and then connect the relevant content items, most probably using links within the WCMS. It is a similar task to creating the overall site structure and creating links between each item; however, it takes the concept to a greater level. The person creating the links has to know all of the content in the system at an in-depth level, not just a structural level. For example, at a higher level, linking a page with particular subject matter to similar subject matter is easy. For instance, linking a page that contains information on a product to a page that has details on where to buy that item is easy enough. It becomes more complex when the linking is based on the details of what is in the content.

In conventional media, this is accomplished through a few well practiced and understood conventions (table of contents, glossary, index, etc.). We have seen these conventions translated to the Web with hyperlinks and glossaries. But in conventional media, the linkage is established once and then the content is printed, never to be changed. On the Web, however, content is constantly being added, and tools are required that allow new terms to be defined and related to other content. The challenge is providing tools that enable these connections to grow.

There would be an advantage in being able to define metadata for each item of content that would then automatically make the associations between relevant items of content. That way the task wouldn’t require a human being to remember and know every piece of content on the site. The next step would be for such associations to occur automatically based on the content of the article itself. Say, for example, that a user of a tourism Web site was looking at an article on a resort in Fiji. If the WCMS had an index of other items of content that contained the word “Fiji,” it could automatically display them as links at the bottom of the page.
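A metadata-driven association of this kind can be sketched in Python with a simple keyword index. The sketch assumes each content item carries a set of keywords, whether assigned by hand or extracted from the text. The item IDs and keywords are invented, and ranking by the number of shared keywords is just one plausible choice.

```python
from collections import defaultdict


def build_keyword_index(items):
    """Map each lowercased keyword to the set of content IDs that carry it."""
    index = defaultdict(set)
    for content_id, keywords in items.items():
        for keyword in keywords:
            index[keyword.lower()].add(content_id)
    return index


def related_content(content_id, items, index):
    """Return other content IDs sharing keywords, most shared keywords first."""
    scores = defaultdict(int)
    for keyword in items[content_id]:
        for other in index[keyword.lower()]:
            if other != content_id:
                scores[other] += 1
    # Ties are broken alphabetically so the result is stable.
    return sorted(scores, key=lambda other: (-scores[other], other))


items = {
    "fiji-resort": {"Fiji", "resort"},
    "fiji-diving": {"Fiji", "diving"},
    "bali-resort": {"Bali", "resort"},
    "fiji-resort-deals": {"Fiji", "resort", "deals"},
    "paris-guide": {"France"},
}
index = build_keyword_index(items)
links = related_content("fiji-resort", items, index)
```

For the Fiji resort article, the deals page comes first because it shares two keywords, followed by the two items that share one. When a new item is added, rebuilding the index is enough for it to start appearing as a related link, with no one needing to remember it exists.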

Another advance would involve adding terms to a glossary. Each time a term appears on a page, it would automatically be linked back to the glossary. Similarly, say new content is added that contains the term in question. When the term is viewed in the glossary, the WCMS would automatically add a link to the new content. This approach would make connections between content easier to manage and would require less human involvement.
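Both directions of the glossary idea can be sketched in a few lines of Python. The sketch assumes the page text is plain text rather than existing HTML, and the glossary URL and anchor scheme are made up for the example; linking terms inside marked-up content would need a proper HTML parser.

```python
import re


def link_glossary_terms(text, glossary_terms, glossary_url="/glossary"):
    """Wrap each whole-word occurrence of a glossary term in a link."""
    if not glossary_terms:
        return text
    # Longest terms first so "site map" is preferred over "site".
    ordered = sorted(glossary_terms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(term) for term in ordered) + r")\b",
        re.IGNORECASE,
    )

    def replace(match):
        term = match.group(1)
        anchor = term.lower().replace(" ", "-")
        return f'<a href="{glossary_url}#{anchor}">{term}</a>'

    return pattern.sub(replace, text)


def pages_using_term(term, pages):
    """Return the IDs of pages whose text mentions the term, sorted by ID."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return sorted(page_id for page_id, text in pages.items() if pattern.search(text))


linked = link_glossary_terms("The site map is ready.", ["site", "site map"])
```

The first function handles the link from page to glossary; the second gives the glossary entry its list of pages, so that a newly added page mentioning the term shows up without anyone editing the glossary by hand.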

CONCLUSION

The WCMS Holy Grail is the single source that allows the client to output content in different formats (e.g., Web, PDF, print, handheld). In reality, it will take years before we move to this concept of a single source, but for those that make the transition, there will be incredible gains in efficiency. There is little doubt that we will see improvements that will make the implementation of a WCMS a less technically complex task.

From a technology perspective, we will see significant improvements in two areas:

  1. The procedure for importing content (assisting with content from multiple formats, integrating content from other systems)

  2. The tools for helping people organize and implement content output

What these improvements will lead to is a movement away from the manual and technical aspects of implementation toward a focus on value and meaning. Clients will be able to concentrate more on the content itself and how best to leverage it, as they will have more options and greater flexibility in what they do. This will increase the pressure and accountability on the decisions that they make. All of these changes should result in better structured, more effective solutions.

REFERENCES

  1. Robertson, James. “Content Reuse in Practice.” KM Column, 3 September 2004 (www.steptwo.com.au/papers/kmc_contentreuse/index.html).

  2. Rockley, Ann. Reuse: A Substantial Factor in Determining ROI for Content Management. Data Conversion Laboratory, 2005 (www.dclab.com/ann_rockley_roi.asp).