Change Management:
Containing the Web Crisis
by Susan Dart
Dart Technology Strategies, Inc.
1280 Bison, Suite B9-510,
Newport Beach, CA. 92660. USA
Phone: +1 (949) 224-9929
Fax: +1 (949) 515-4442
Email: sdart@susandart.com
Web: www.susandart.com
Please note: This is a draft document to be published in the International Proceedings of the Software Configuration Management Symposium, 1999, Toulouse, France, later this year.
ABSTRACT
The web is one of the most powerful technologies for enabling business to be done at the speed of thought. It is changing the way that people do business, communicate and live. In order to survive beyond the first decade of the new millennium, companies must transform themselves into e-businesses, where all communications, transactions and work are done via the web. But behind the often garish or entertaining facade of a web site lies the challenge of managing its infrastructure and its content. It is the latter that is generating The Web Crisis - the proliferation of "hacked together" web-based systems kept running via a continual stream of patches developed without any rigorous or systematic approach.
The software community has experienced a similar crisis and knows that software configuration management (software CM) is a key player in resolving it. For web systems, it will be change content management (CCM) that companies will turn to for diminishing the Web Crisis. Is CCM really software CM applied to web technology tools? Yes, and it is even more. There are exceptional challenges presented by web systems compared to traditional software systems. These challenges include issues regarding the dynamic nature of the content, the variant explosion problem, the free-form style of web development, the performance effect of content, scalability, the urgency and frequency of change, the outsourcing of content, the immaturity of tools, techniques, standards and skills, and corporate politics.
As the entire world becomes "webified", the CCM problems will be magnified and the Web Crisis will escalate. Web engineering techniques will evolve to support the dynamic nature of web systems. For instance, while software CM provides a static solution (such as via a centralized development methodology creating batched, planned releases), CCM will have to provide a dynamic solution (via distributed, real-time releases) in response to user traffic monitoring. In order to find a good CCM solution, it is imperative that the lessons learned from software CM be applied to web technology tools. Otherwise, the Web community is doomed to experience all the delivery, quality and complexity problems that have plagued the software community. And, the consequences of those - legal, financial, emotional - will be more devastating because of the real-time, global nature of the Web.
KEYWORDS
Configuration management, content change management, content development tools, web technology, commercial-off-the-shelf tools
INTRODUCTION
The World Wide Web (WWW) is a unifying force bringing the world closer together: regardless of race, color, creed, skills, educational background, computer platform, browser, nature of business, geographical location, and job position, we all "look" the same. E-commerce revenue is expected to hit $220 billion by 2001 (says International Data Corp.). Behind the facade of e-commerce though, the Web Crisis is looming (1). That crisis is the exponential proliferation of web content that was created, and is maintained, without any expertise in data management techniques. Companies are desperate to "webify" their business applications. With the advent of many low-cost publishing tools that are very easy to use, web system creation is now so simple that anyone without programming skills can create one.
The demand for content creation and maintenance is escalating at an unmanageable rate. Some analysts (Merrill Lynch Co. 1999) have predicted that by the year 2002, the market revenue from content management tools will be around $5 billion. And, even when we have it under control, content has a multiplier effect, a snowball effect, where we will further exploit new ways of using content. First generation web systems have focused on providing access to any piece of information around the world. The next generation web systems will focus on knowledge management -- managing the semantics of, or concepts of, content, rather than just the raw information.
For now though, we see the shortcomings of first generation web systems. There are problems with information being published on the web site at the wrong time and information that is inaccurate, top secret, corrupt, inconsistent, unauthorized, unchecked, garbage, stale, or inappropriate. These can have devastating consequences for companies. The causes are easily linked back to lack of: well-defined processes, testing, cross-checking of information, authorized changes, security checking, or responsibility for co-ordinated changes. Essentially, the problems stem from poor configuration management (CM) practices. The first generation of web systems was crafted with immature tools and languages by inexperienced staff. To properly provide Change Content Management (CCM) -- CM for web systems -- we will have to go beyond the capabilities traditionally provided by industrial-strength software CM tools because the challenges presented by the emerging web economy are exceptional.
This paper is designed to raise questions about CCM for the web so that we can understand the new demands placed on companies by web systems. A web system is a generic term for an application that can be accessed via the WWW. It fundamentally consists of content (its data, such as a document), an application server (for executing actions on the data, such as updating a document), access (its interface, such as the client's browser) and the web server (supporting the applications; common ones include Apache, Internet Information Server and Enterprise Server). This paper defines the nature of the WWW environment, specifies the classes of web systems being developed, identifies the many challenges that companies are facing in their efforts to understand CCM, and highlights capabilities provided by software CM and web CCM tools.
THE WORLD WIDE WEB ENVIRONMENT
Web systems can be huge, with millions of pages, many interconnections and with incredibly high hit rates. Consider the many kinds of resources across the WWW that can be a component of a web system. Users can be connected to the network via a thin client or a fat client. A thin client means application code is resident on the server, rather than on the client (fat client). A firewall determines the kind of access, encryption and security levels. Web servers provide much of the application code and can have accelerators for caching dynamic pages in order to improve user access time. The network can be specialized into an Intranet, Extranet or Virtual Private Network (VPN). An Intranet is an internal network behind a firewall that allows only users within the company to access it. An Extranet allows outside partners to have access to the Intranet. A VPN is a secure and encrypted connection between two points across the Internet. It acts as an Intranet or Extranet except it uses the public Internet as the networking connection rather than a company's own wiring. This enables, for instance, a company's branch offices to be inexpensively connected via the Internet.
Attached to the network can be other types of networks such as Storage Area Networks (SANs) and Portals. SANs are networks that pool resources for centralized data storage. They may include multiple servers working against a centralized data store built with redundant hardware such as RAID (high volume storage) devices. Portals (such as Yahoo!, AOL) are full-service hubs of e-commerce, mail, online communities, customized news, search engines and directories, all suited to the particular needs of an audience. Portals are evolving into corporate enterprise portals. Such portals for instance, enhance corporate decision making by integrating the company's applications thereby removing barriers that exist between business units.
Other resources that can make up web systems are: DBMS (Data Base Management Systems); workflow applications used for optimizing business processes, such as ERP (Enterprise Resource Planning) tools (e.g., SAP, PeopleSoft, Baan); database applications such as OLAP (OnLine Analytical Processing) systems, which allow users to perform "multi-dimensional" analysis on data via their browsers; document management tools for providing access into shared libraries of documents (9); imaging systems for optical character recognition of documents; data warehouses, which contain thousands of gigabytes of data and provide common interfaces to variant databases; multi-media databases for holding archives of music, speech and videos; mainframes, which contain approximately 70% of legacy data for large companies; data-marts, which are data warehouses with their own unique interpretation of business data to suit certain functional needs of a business unit; and non-PC devices, such as pagers, personal digital assistants (PDAs) and smart phones, which are also being connected to the WWW.
Web systems are made up of various combinations of these resources. Each of the resources implies content that can be dynamically added, changed, deleted, accessed and manipulated, along with its relationships and hyperlinks. CCM will need to control the static content which goes into the creation of the web systems along with the dynamic content that is created during execution of the web system. Different kinds of web systems are being developed, and these affect the nature of CCM.
TYPES OF WEB SYSTEMS
It is difficult to classify the types of web systems being built today because, of course, there is no universal blueprint for such systems, the design is still an immature art and the systems themselves are evolving fast. But, for the purposes of opening up discussions about CCM, we need to understand the types of "architecture" of web systems with respect to content creation. A site can have a static dimension (nothing can be altered by the user) or dynamic dimension (there is content creation between the user and/or the connected resource). From a content perspective, we are interested in types of web systems which have data points where data can be added, changed, deleted, accessed or accumulated. Given that, a web system can be categorized as having the properties of one or more of the following classes:
- Informational: information sites with read-only usage, commonly called "Brochureware" e.g., information presented on a site that gives details about a company and its products. First-generation web systems are this type and are static.
- Delivery system: download content to user or resource e.g., download upgrades or plug-ins
- Customized access: access is via a customized interface or based on user's preferences e.g., my customized view of my ISP's (Internet Service Provider's) home page, or favorite portal
- User-provided information: user provides content by filling in a form e.g., subscription to a magazine or registering for a company's seminar
- Interactive: Two-way interaction between sites, users and resources e.g., business-to-business
- Transaction oriented: user buys something e.g., buys books or travel tickets
- Service provider: rentable applications; user rents an application on a per user, per month basis e.g., virus scan program
- Database access: user makes queries into a database e.g., supplier looks up catalog of parts
- Document access: libraries of online documents are available e.g., view corporate standards
- Workflow oriented: a process has to be followed e.g., order entry automation
- Automatic content generator: robots or agents automatically generate content e.g., "bots" scour the WWW to bring back specific information such as best price on products.
Given these classes, it becomes obvious that content can essentially be created by anyone or any other resource: from the content designer, the webmaster, any user or another database or device or web system. From a CCM perspective, it is straightforward to capture content that makes up a released baseline since that is static content, but what about content that is created or changed dynamically? This raises four key questions then: (1) What constitutes a configuration item for a baseline with static and dynamic objects? (2) How can dynamic baselines be captured? (3) Now that the user of the web system participates in the creation or changing of a baseline, how does that affect the definition of the CCM lifecycle? (4) Are CCM requirements different for each class of web system? These are some of the questions being asked by webmasters, developers and CM managers.
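One possible way to approach question (2) is to record, for every dynamically generated page, the exact inputs that produced it. The following is a minimal sketch under stated assumptions: `render_page`, `BaselineRecord`, the template text and the sample data are all hypothetical names invented for illustration, not part of any CCM tool, and content hashes stand in for real version identifiers.

```python
import hashlib
import json
from dataclasses import dataclass
from string import Template
from typing import Tuple


def digest(text: str) -> str:
    """Return a short, stable fingerprint for a piece of content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]


@dataclass(frozen=True)
class BaselineRecord:
    """What was used to produce one generated page (hypothetical structure)."""
    template_id: str  # fingerprint of the static template
    data_id: str      # fingerprint of the dynamic data at request time
    page_id: str      # fingerprint of the page the user actually saw


def render_page(template_text: str, data: dict) -> Tuple[str, BaselineRecord]:
    """Generate a page and record which inputs it was built from."""
    page = Template(template_text).substitute(data)
    record = BaselineRecord(
        template_id=digest(template_text),
        data_id=digest(json.dumps(data, sort_keys=True)),
        page_id=digest(page),
    )
    return page, record


# The same static template combined with different dynamic data
# yields different records, so each generated page stays traceable.
template = "<h1>$title</h1><p>Price: $price</p>"
page_a, rec_a = render_page(template, {"title": "Widget", "price": "9.99"})
page_b, rec_b = render_page(template, {"title": "Widget", "price": "10.49"})
```

In this sketch the static part of the baseline (the template) keeps one identifier across requests, while the dynamic part changes with the data. Whether such records are kept per request, per session or only sampled is a policy decision the sketch does not address.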
ENTERPRISE CHALLENGES FOR WEB SYSTEMS
CCM is not really a problem for small, static web systems managed by a few developers. But it is for medium and large enterprise systems that involve many content developers creating many pages that will have a high hit rate involving high-volume database accesses and updates every minute. For instance, the NASDAQ stock exchange system (2) is a web system of types 1, 4, 5, 6, 8, 9, and 11 (counting the classes in the order listed in the previous section), and was built to sustain 12 million hits per day with 8 web servers per database server. When the stock market goes "crazy", the NASDAQ site gets 20 million hits per day. Its content must be completely accurate, and it changes within seconds. Boeing (3), with a web system of types 1, 2, 4, 5, 8, 9, and 10, has 1 million pages hosted by 2300 Intranet sites on more than 1000 web servers.
Developing and maintaining such large systems with large volumes of content offers many challenges to companies. These challenges span technical, people, process and political issues. The major ones today are the following, and are described in detail below.
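To put the NASDAQ figures in perspective, the daily hit counts can be converted to an average rate per second. This is a rough illustration only: it assumes hits are spread evenly over 24 hours, which real traffic is not, so actual peaks would be higher. The function name is made up for this example.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_hits_per_second(hits_per_day: int) -> float:
    """Average request rate, assuming an even spread over the day."""
    return hits_per_day / SECONDS_PER_DAY


design_load = average_hits_per_second(12_000_000)
heavy_day = average_hits_per_second(20_000_000)

print(f"{design_load:.0f} hits/s at the design load")  # 139 hits/s
print(f"{heavy_day:.0f} hits/s on a heavy day")        # 231 hits/s
```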
- The dynamic, active nature of content
- Variant explosion
- The free-form style of development
- The performance effect of content
- Scalability of content
- The urgency and frequency of change to content
- The outsourcing and ownership of content
- The immaturity of tools, techniques, standards and skills
- Corporate politics.
THE DYNAMIC, ACTIVE NATURE OF CONTENT
Web content is dynamic because it is created on-the-fly based on a user's or agent's request. It is active because programs are being created and executed in response to the request and to the user's environment (the browser and plug-ins that are available on the client side). For example, HTML is static but when combined with active controls (such as ActiveX), it becomes dynamic. Such content, for instance, gives users feedback on the type of data they are supposed to enter at the site and applies constraints to make sure the input complies.
Content is made up of data objects, component libraries and code. These can be static and dynamic, singular or a collection, compiled or interpreted, source or binary code. Objects include documents, images, streaming video and audio, and data. Code can be active controls and scripts such as: ActiveX controls, Java, C++, VisualBasic, HTML, DHTML, XML, VRML, OLE controls, Active Server Pages (ASP), Java applets, VBScript, JavaScript, ISAPI, CGI, and Perl. Behaviours, or scripting, can be attached to web objects allowing, for instance, the user to change attributes, such as color, positioning and font size on objects. Typical component libraries or toolkits are JavaBeans and Lotus' eSuite of business applets. Content can be generated and changed in real-time such as with tables, forms, database queries, documents and code. An applet or a control is a compiled binary file that a field in the HTML tag references. A script is executable code in a readable source language that can be embedded directly in the HTML tag. In essence, an object becomes a container for various pieces of content, all of which need to be under CM control.
We are moving towards a container-based, or a component bundling, approach to software development. This means CM techniques need to account for embedded scripts and customized components. Also, the executing environment needs to be taken into account. For example, if a browser doesn't support a certain scripting language, then the behavior of the web system will be different. HTML files can be manually touched up. Scripts can be easily changed because they are interpreted, whereas applets or controls typically need to be compiled. This assumes, of course, that the source code can be accessed, which isn't the case when components are bought and reused in their binary form. So, changing or recompiling sometimes isn't an option. Executing code may require a series of steps. For example: compile a Java file into platform-independent bytecodes, which are then processed by a JIT (Just In Time) compiler to yield fast native instructions for a particular platform. All these object types, their relationships to intermediate forms, and all the tools need to be tracked for good CM practices. Embedded documents (Word, Excel) with full editing, formatting and processing power are available with an ActiveX-enabled browser. How do we track all these changing objects?
Web pages are dynamically created. For example, an .asp file (Active Server Page which is a combination of static HTML and VBScript) is recognized; the VBScript is interpreted and any database or related files are accessed; then the server creates the full HTML on the fly, thereby dynamically generating the web page which is then displayed. How is all this data tracked for CM purposes? How is a dynamic baseline captured? Hyperlinks can be created on the fly to point to documents. This then changes the baseline. Also, dynamically generated pages can be customized to the user's request using ActiveX, CGI scripts, JavaScript and DHTML frames. So, just the dynamic nature of the content itself provides challenges for CM. But there is more. Content has other properties associated with it that must be captured in the CCM solution. They include:
- The separation of content and format. Companies have standard templates into which content is published. These templates are part of the released baseline
- External structure information, such as the hierarchy and relationship of web pages. We are moving beyond flat file-systems
- Internal structure information, such as embedded objects
- Hyperlinks to internal or external pages, static or dynamic
- Task objects that indicate some activity must happen to an object, such as updating the content
- Transaction information, such as whether the data is involved in carrying out an e-commerce activity
- Security information attached to each object
- Audit logs related to the activity on each object
- Tool compatibility information, such as the browser for which this object is valid
- Bill of materials: what pieces were used to create the baseline (tools, tool options, data, files)
- Generated or converted files, such as a Word document that is converted into HTML
- Validation rules, such as a form requiring input validation for each field
- Handler rules, such as a data base access request invoking certain tools and operations.
There are obviously many properties of content that need to be captured. Ideally, a company should have a well-defined CM data model that defines all the properties and relationships of content. With that, configuration items, baselines and releases can be defined.
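To make the idea of a CM data model more concrete, the sketch below models a handful of the properties listed above. It is an illustrative assumption, not a proposed standard: the class names, field names and the sample file names are all hypothetical, and a real model would cover many more properties (security, audit logs, validation rules, and so on).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ContentItem:
    """One versioned piece of content with a few of the properties listed above."""
    name: str
    version: int
    template: Optional[str] = None        # separation of content and format
    hyperlinks: List[str] = field(default_factory=list)      # internal or external
    valid_browsers: List[str] = field(default_factory=list)  # tool compatibility
    generated_from: Optional[str] = None  # source of a generated or converted file

    @property
    def item_id(self) -> str:
        return f"{self.name}@v{self.version}"


@dataclass
class Baseline:
    """A labeled set of specific item versions."""
    label: str
    items: Dict[str, ContentItem] = field(default_factory=dict)

    def add(self, item: ContentItem) -> None:
        self.items[item.name] = item

    def bill_of_materials(self) -> List[str]:
        """Every item version in the baseline, plus sources of generated files."""
        ids = {item.item_id for item in self.items.values()}
        ids |= {i.generated_from for i in self.items.values() if i.generated_from}
        return sorted(ids)

    def dangling_links(self) -> List[str]:
        """Internal hyperlinks whose target is not part of this baseline."""
        missing = []
        for item in self.items.values():
            for link in item.hyperlinks:
                if not link.startswith("http") and link not in self.items:
                    missing.append(f"{item.name} -> {link}")
        return sorted(missing)


home = ContentItem("index.html", 3, template="corporate.tpl",
                   hyperlinks=["products.html", "http://www.example.com/"])
products = ContentItem("products.html", 7, generated_from="products.doc@v2")

release = Baseline("release-1")
release.add(home)
release.add(products)
```

With such a model, `release.bill_of_materials()` lists the pieces that went into the baseline, and `release.dangling_links()` reports internal links that point outside it.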
VARIANT EXPLOSION
Web systems imply an immediate variant explosion problem. For instance, web systems are either created from scratch or they are transmuted from a legacy application. Different tool technologies need to be maintained in parallel for these. A company could have a nightmarish number of versions of a baseline. For example, assume 4 standard baselines per release: (1) partial "browserized" baseline: legacy application that is web-enabled, but only a minimal set of functionality is supported since this is the textual version (2) full browserized application: the same but all functionality is available since it is the graphical version (3) completely redesigned for the web baseline (4) legacy system for non-web use.
On top of these, each web variant must work with 2 different browsers (Internet Explorer and Netscape Navigator), including the latest three versions of those browsers, and every variant must support 5 different languages for international use. Hence, we have (1 * 5) + (3 * 2 * 3 * 5) = 95 potential variants: 5 for the non-web legacy system and 90 for the three web baselines. Then add in variants for different devices such as pagers, PDAs and smart phones, and the number of variants escalates further. Most companies have different teams working on separate variants without much communication, reuse or change propagation across common code. With the variants comes all the complexity of parallel development support for simultaneous changes and concurrent baselines, along with significant change propagation to selected variants, thereby demanding change set support (10), more sophisticated change tracking along with help desk support and, of course, much better release planning and change scheduling. The ramifications are dramatic. Variant management and change propagation have long plagued software companies.
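The variant arithmetic above can be checked by enumerating the combinations. In this sketch the browser version labels and language names are placeholders invented for illustration; only the counts (3 web baselines, 2 browsers, 3 versions, 5 languages, 1 non-web legacy baseline) come from the text.

```python
from itertools import product

web_baselines = ["partial browserized", "full browserized", "redesigned for web"]
browsers = ["Internet Explorer", "Netscape Navigator"]
browser_versions = ["latest", "previous", "oldest supported"]      # placeholder labels
languages = ["lang-1", "lang-2", "lang-3", "lang-4", "lang-5"]   # placeholder labels

# Each web baseline must be supported on every browser, version and language.
web_variants = list(product(web_baselines, browsers, browser_versions, languages))

# The non-web legacy system has no browser dimension, only languages.
legacy_variants = list(product(["legacy non-web"], languages))

total = len(web_variants) + len(legacy_variants)
print(len(legacy_variants), len(web_variants), total)  # 5 90 95
```

Adding one more dimension, such as a list of device types, multiplies the web variant count by the length of that list, which is why the number escalates so quickly.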
THE FREE-FORM STYLE OF DEVELOPMENT
Web system development is different from traditional software development (7,8). This is due to the nature of the tools, the languages, the skills of the developers and the dynamic nature of the Web environment. There is tremendous pressure on developers to "code-and-publish". And the tools support this free-form style of development. Also, the skill set of the developers is quite varied, with typically no experience in software engineering. They are guided by the capabilities of the tools and languages.
Scripting languages (such as JavaScript, JScript, Tcl, VBScript) are changing the way that applications are developed. Most of these are interpreted languages or use JIT (Just In Time) compilers. This leads to a style of "change on the fly". There is no process in between creating content and publishing it. Libraries of components are evolving that enable reuse and customization of scripts. Programming has gone from a process-oriented compiler-based approach, to "combine components, mix in some new code and go!" Essentially, this squeezes down the change cycle time dramatically because all sense of process is eliminated. This enables a faster rate of change, which suits the current modus operandi for web sites but provides greater opportunity for errors through lack of testing, content co-ordination and authorization of change. The question becomes: how can testing, system integration, load testing and release management processes be inserted into the code-and-go paradigm to enable proper CM? Some use staging areas for testing before publishing to a live site.
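The staging-area idea can be sketched as a gate between "code" and "go". The example below is a simplified illustration, not a description of any particular tool: sites are modeled as in-memory dictionaries from page path to HTML, the only check is for broken internal links, and the function names are hypothetical. A real gate would add further checks (browser compatibility, load testing, authorization).

```python
import re
from typing import Dict, List, Tuple

LINK_PATTERN = re.compile(r'href="([^"]+)"')


def broken_internal_links(site: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (page, target) pairs where an internal link points to a missing page."""
    broken = []
    for page, html in sorted(site.items()):
        for target in LINK_PATTERN.findall(html):
            is_external = target.startswith(("http://", "https://", "mailto:"))
            if not is_external and target not in site:
                broken.append((page, target))
    return broken


def publish(staging: Dict[str, str], live: Dict[str, str]) -> bool:
    """Copy staging to live only if every check passes; otherwise leave live untouched."""
    if broken_internal_links(staging):
        return False
    live.clear()
    live.update(staging)
    return True


live_site = {"index.html": '<a href="about.html">About</a>', "about.html": "About us"}
staged = {"index.html": '<a href="news.html">News</a>', "about.html": "About us"}

first_try = publish(staged, live_site)   # rejected: news.html does not exist yet
staged["news.html"] = "Latest news"
second_try = publish(staged, live_site)  # accepted: all internal links resolve
```

The design choice here is that a failed check leaves the live site exactly as it was, so a rushed change cannot partially overwrite published content.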
The complexity of web system development can be seen from Table 1. The major phases are highlighted along with who assumes responsibility for those steps. There are at least 8 key steps involved in getting the web system functioning. At each point, CM issues come into play such as, which release or version of the web site is being changed or published or tested or registered or validated for security purposes or being monitored for hits or performance improvements. Without CCM practices and tool support, all these activities become fraught with errors. Automated workflow along with role based activities must be supported in CCM tools.
| MAJOR ACTIVITY IN WEB SYSTEM | WHO TYPICALLY DOES THE WORK |
| --- | --- |
| Design and creation | Web Team or IT Dept. or Outsourced |
| Infrastructure support: servers, network connections, databases | Outsourced to network management company, or hosted by IT Dept. |
| Testing e.g., compatibility of content, link accuracy, viewable by all kinds of browsers | Web Team or IT Dept. |
| Publishing of content | Business Units or Web Team or IT Dept. |
| Registering of sites on search engines | Web Team or IT Dept. |
| Security checking: access control, hacker analysis, virus detection | Web Team or IT Dept or Security Consultant |
| Monitoring of traffic and performance: intelligent load balancing and web page redesign; replication; web accelerators/caching; traffic shaping; capacity planning | Web Team or IT Dept. |
| Maintenance: content evolution via changes, enhancements, deletions, redesign | Content experts or Web Team or IT Dept. |
THE PERFORMANCE EFFECT ON CONTENT
Performance -- particularly response time to a user's request -- plays a major role in influencing content design. High performance web systems continuously monitor the traffic to their site. They want users to get quick access to the content under any load. If access times are not acceptable, a company either installs web accelerators, which enable caching to improve performance, or redesigns the content for better access. For instance, at the Olympics site (11), traffic monitoring showed that users hit bottlenecks because they had to navigate too many pages to get to the right content. Consequently, the pages were redesigned on-the-fly to make access much easier.
There are Internet companies whose sole business is monitoring response times to web sites. Weekly lists of the top 40 sites with the best response times are published. Sites such as AltaVista, Yahoo and Charles Schwab consistently rate high because their content is "clean" i.e., not too heavy on graphics and they have a server infrastructure geared to performance.
Web accelerators are beginning to play bigger roles in performance enhancement with content being designed to take into account caching techniques for accelerators. Hence, a dependency exists between the content baseline and the version of the caching algorithm and server that are used. Also, server crashes (such as with the E*trade brokerage site crashes which shut out users who lost money through lack of trading access) must be accounted for in contingency plans, which means content must be replicated across servers. This in turn, means synchronization and distribution of real-time updates. Redesigning content of course raises all the typical CM change tracking and propagation issues.
SCALABILITY OF CONTENT
The Olympic (11) and NASDAQ (12) web systems are huge in terms of number of pages (millions), amount of traffic (millions of hits per day), and number of database and web servers. Millions of pages cannot reasonably be stored in a flat file system and still meet the real-time needs of dynamic page creation. Databases are obviously required for storage and are being redesigned to suit web access. Some database companies are redesigning their products so that web applications are stored directly in the database, such as Oracle's WebDB. This helps with scalability, reliability and administration. It is likely that first generation web systems will be redesigned to use web-enabled databases. This means that CM capabilities must be integrated and synchronized with database facilities.
THE URGENCY AND FREQUENCY OF CHANGE
The web enables change at the speed of thought. The mind set is typically: "I see a problem and fix it immediately because it is globally visible." Corporate embarrassment or, even worse, litigation needs to be avoided. There may be no time to follow through a normal change lifecycle (such as with a change request, Change Control Board, change authorization, edit, testing and re-release). Because the change can be done so easily, process is often bypassed. All the benefits of change tracking are then lost. Repeatability will be a difficult benefit to achieve. Roll-back of a site may be the only option for companies. But the corporate need to keep the web site accurate takes top priority. There are changes that may need to be propagated across all pages of a web site, or just a few pages. For example, simply changing a copyright notice may involve changing all one million pages, whereas other changes may involve a select set of pages, so an incremental publishing capability is required along with ways of organizing files into partitions to enable incremental updates. A company needs to define its classes of change and decide what process should be followed for each type of change.
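Incremental publishing, as described above, amounts to working out which pages differ between what is live and what is about to be published. The sketch below is one simple way to do that, under the assumption that a site can be represented as a mapping from page path to content and that a content hash is an adequate change detector; `manifest` and `plan_update` are hypothetical names.

```python
import hashlib
from typing import Dict, List, Tuple


def manifest(site: Dict[str, str]) -> Dict[str, str]:
    """Map each page path to a fingerprint of its content."""
    return {path: hashlib.sha256(text.encode("utf-8")).hexdigest()
            for path, text in site.items()}


def plan_update(live_manifest: Dict[str, str],
                staged_site: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Return (pages to publish, pages to remove) for an incremental update."""
    staged_manifest = manifest(staged_site)
    changed = sorted(p for p, h in staged_manifest.items() if live_manifest.get(p) != h)
    removed = sorted(p for p in live_manifest if p not in staged_manifest)
    return changed, removed


live_site = {
    "index.html": "Welcome. Copyright 1998",
    "products.html": "Catalog. Copyright 1998",
    "old.html": "Retired page. Copyright 1998",
}

# A targeted change: one page edited, one added, one dropped.
staged_site = {
    "index.html": "Welcome to our site. Copyright 1998",
    "products.html": "Catalog. Copyright 1998",
    "news.html": "News. Copyright 1998",
}
changed, removed = plan_update(manifest(live_site), staged_site)

# A site-wide change: the copyright notice touches every page.
copyright_edit = {p: t.replace("1998", "1999") for p, t in live_site.items()}
all_changed, none_removed = plan_update(manifest(live_site), copyright_edit)
```

The two cases correspond to the two classes of change mentioned above: the targeted edit republishes only the affected pages, while the copyright change republishes the whole site.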
OUTSOURCING
Outsourcing is a significant trend for most companies. They are outsourcing web system creation, and sometimes maintenance, for many reasons: to reduce operating costs, share risks with others, access leading-edge technology without having to purchase the infrastructure for it, use expertise not found in-house, do things more quickly, and focus more on their own core competencies. While content still has to be created by the experts, outsourcing raises distributed management issues as well as the question of how to do CM with a third party.
Commercial-off-the-shelf tools (OLAP, ERP, CCM, etc.) are helping to change the political infrastructure of companies. For instance, business units are no longer forced to rely on Information Technology (IT) in order to get things done. They go out and buy the best tool that suits their need, bypassing IT. They can even go out and rent the infrastructure for supporting the tools and outsource its administration. This complicates issues of who has responsibility for what, how to maintain control and visibility over outsourced changes, and how to guarantee that a quality process was followed for outsourced work.
THE IMMATURITY OF TOOLS, TECHNIQUES, STANDARDS AND SKILLS
Engineering techniques for web systems are still in their infancy. Tools, standards and experience are maturing, albeit slowly. Each month, new tools and new versions of tools are being released that support easier ways of building web systems. Standards such as XML (eXtensible Markup Language) from the World Wide Web Consortium, or WebDAV from the Internet Engineering Task Force, are slowly being developed, which in turn will affect the tools. There are many web technology tools that enable easy publishing of content without team co-ordination or process. Because of the many choices, large, decentralized companies will end up having their business units using different tools. To get some control over how content is developed, and to ensure that quality processes are followed in publishing content, companies will have to define standards. These standards will pertain to style templates, component libraries, tools, languages, servers, testing processes, and CM.
Many web developers have little background in software engineering. Content creators can be human resources personnel, marketing people, accounting staff, etc., people whose core competence is not software. Their web skills are totally dependent on knowledge gleaned from the web tool set and any training class they attended. This implies that the tools need to have interfaces that suit the content writer yet have excellent CCM processes embedded to make up for the lack of software skills.
CORPORATE POLITICS
There is confusion in companies these days as to who has the right to publish content to the web site: business units publish independently of the IT department, which could be independent of the web design group, the marketing group, and so on. Essentially, there is a lack of control as to what goes up, when, how it has been tested and whether it conforms to standards. This is particularly a problem when the web system has content that must be co-ordinated and validated as a whole with other departments or with other applications. Who assumes responsibility for the accuracy of the information on the web site? Who assures that quality control processes have been followed before information is published to the site? Who is responsible for making changes? Who assumes the cost of change? The role of the IT department is changing dramatically, from that of an infrastructure provider to that of a strategic advisor and standards producer. Many functions traditionally done by IT (such as network administration) are now being outsourced. Outsourcing will significantly change the modus operandi of IT departments. And web creation is mostly outsourced these days.
All in all, companies face a delicate balancing act in trying to rein in proliferation of web systems while still leaving employees freedom to meet their business needs.
SOFTWARE CM IS A MAJOR PART OF CONTENT CHANGE MANAGEMENT
Software CM tools and techniques need to be applied to the CCM problems. Software CM spans a significant spectrum of activities and roles within a company (5, 6). Table 2 highlights the main goals that companies have in following CM practices. These goals are clearly applicable to web systems. The software CM tool vendors are adding CCM capabilities to their tools. Web tool vendors are beginning to realize that CM practices need to be incorporated into their tools. Advice on good web design (4) is beginning to highlight the importance of CM, but only in the sense of version control of files. On the other hand, web engineering advice (12) completely ignores CM. Table 3 lists some of the commercial software CM tools.
| GOAL | EXPLANATION |
| Identification | Identifying uniquely all content |
| Control | Version control of all objects including baselines |
| Status accounting | Tracking the status of all work on all objects |
| Audit and review | Keeping an audit trail, confirming all processes followed |
| Cost-effective production | Fast builds of software releases |
| Quality/Process automation | Ensuring all testing, notifications, signoffs, reviews are done |
| Teamwork assistance | Enabling teams to work in parallel in the most effective manner |
| Complexity management | Containing the explosion of changes |
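To make a few of the goals in Table 2 concrete, the following is a minimal Python sketch and not a description of any commercial tool. It assumes a hypothetical in-memory `ContentStore` in which each version is identified by a SHA-1 digest of its bytes (identification), versions are kept per path with named baselines (control), and every operation is appended to a log (audit). All class, method and file names are illustrative assumptions.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class ContentStore:
    """Hypothetical in-memory store illustrating a few CM goals."""
    versions: dict = field(default_factory=dict)   # path -> list of content ids
    blobs: dict = field(default_factory=dict)      # content id -> bytes
    baselines: dict = field(default_factory=dict)  # name -> {path: content id}
    audit_log: list = field(default_factory=list)  # (author, action, detail)

    def check_in(self, path: str, data: bytes, author: str) -> str:
        # Identification: the digest names this exact content.
        content_id = hashlib.sha1(data).hexdigest()
        self.blobs[content_id] = data
        # Control: every check-in extends the version history of the path.
        self.versions.setdefault(path, []).append(content_id)
        # Audit: record who did what.
        self.audit_log.append((author, "check_in", path))
        return content_id

    def baseline(self, name: str, author: str) -> None:
        # A baseline freezes the latest version of every path under one name.
        self.baselines[name] = {p: ids[-1] for p, ids in self.versions.items()}
        self.audit_log.append((author, "baseline", name))

    def checkout(self, baseline: str, path: str) -> bytes:
        # Retrieve the content exactly as it was when the baseline was taken.
        return self.blobs[self.baselines[baseline][path]]


store = ContentStore()
store.check_in("index.html", b"<h1>Welcome</h1>", author="marketing")
store.baseline("release-1", author="webmaster")
store.check_in("index.html", b"<h1>Welcome back</h1>", author="marketing")
```

In this sketch, `store.checkout("release-1", "index.html")` still returns the content as it stood when the baseline was taken, even though a newer version has since been checked in. A real CCM tool would add persistence, locking or merging, and status tracking on top of this.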
Software CM vendors are taking different approaches to CCM support in their tools. Some, such as the vendor of StarTeam, offer a web-enabled tool and have purchased web technology companies with the intention of tool integration. Others, such as TrueChange, have decided to build a completely new software CM tool for CCM. Some tools, such as RCE, were originally developed for software CM of web systems. Others, such as Continuus and MKS Source Integrity, have added on CCM support. The former's WebSynergy provides a web front-end that integrates into all of its existing CM process-oriented capabilities along with web authoring tools. The latter's WebIntegrity integrates its version control facilities with an authoring tool.
| CM TOOL | VENDOR | WEBSITE |
| Continuus | Continuus | www.continuus.com |
| ClearCase | Rational | www.rational.com |
| Harvest | Platinum Technology | www.platinum.com |
| Perforce | Perforce Software | www.perforce.com |
| PVCS®, Dimensions | Merant | www.merant.com |
| Source Integrity | MKS | www.mks.com |
| SourceSafe | Microsoft | www.microsoft.com |
| StarTeam | Starbase Corp. | www.starbase.com |
| TeamConnection | IBM | www.ibm.com |
| TrueChange | True Software | www.truesoft.com |
WEB TECHNOLOGY TOOLS
Web tools are marketed for web authors or web developers. What constitutes a CCM tool is not clear, and there is no consistency in functionality across the tools. Suitability for large-scale development seems to determine whether a tool is a CCM tool or just an authoring tool. With respect to CCM support, the tools are first generation, and only one product (DynaBase) claims to provide configuration management facilities. Some tools are geared to large-scale web production, although it is not yet clear how scaleable these tools are. Half a million components seems to be the yardstick so far. Their in-house repositories, though, are likely to limit scaleability. Table 4 lists some commercial CCM tools. If there are any similarities or trends, they would be:
- support for web languages
- command line interfaces
- templates for separating content from formatting
- version control of files
- roll-back of complete sites
- minimal workflow support for publishing authorization
- audit logging
- event triggers
- commercial database interfacing
- drag-and-drop component reuse so that minimal programming is required
- role support for authorizations
- minimal change tracking and concurrent site production (for multiple releases)
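As an illustration of one item in the list above, templates for separating content from formatting, here is a minimal Python sketch using the standard library's `string.Template`. The page layout, the field names (`title`, `body`) and the `render` function are assumptions made for this example and do not reflect how any of the listed tools implement templates.

```python
from string import Template

# Formatting lives in the template; content authors never edit it.
PAGE_TEMPLATE = Template(
    "<html><head><title>$title</title></head>"
    "<body><h1>$title</h1><p>$body</p></body></html>"
)


def render(content: dict) -> str:
    """Merge author-supplied content into the shared page layout."""
    # substitute() raises KeyError if a required field is missing,
    # which catches incomplete content before it is published.
    return PAGE_TEMPLATE.substitute(content)


# Content is plain data that a non-programmer could supply through a form.
press_release = {"title": "New Office", "body": "We are opening in Toulouse."}
html = render(press_release)
```

Because the layout is held in one place, a style change is made once in the template instead of in every page, which is one way of containing the number of changes that must be tracked.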
| CONTENT TOOL | VENDOR | WEB SITE |
| ArticleBase | Running Start | www.runningstart.com |
| DreamWeaver | Macromedia | www.dreamweaver.com |
| Drumbeat 2000 | Elemental Software | www.drumbeat.com |
| DynaBase | Inso | www.inso.com |
| Frontier | Userland | www.userland.com |
| FrontPage98 | Microsoft | www.microsoft.com |
| Fusion | NetObjects | www.netobjects.com |
| Raveler | Platinum Technology | www.raveler.com |
| StoryServer | Vignette Corp. | www.vignette.com |
| TeamSite | Interwoven | www.interwoven.com |
Some noteworthy features are as follows. TeamSite provides visual differencing for examining two versions of content side by side. Tasks can be assigned to authors using notifications, and authors can be notified when content is published on the web. Content is moved to a staging area each time it is changed or receives approval to be published. Drumbeat gives developers guidance on targeting code to specific browsers, thereby providing variant creation support. In Raveler, teams can be set up with pre-configured workflows. StoryServer supports static and dynamic versioning. Overall, more CM support needs to be provided to meet CCM needs.
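The idea of comparing a staged version of content against the published one can be sketched in Python with the standard library's `difflib`. This is a plain textual diff standing in for the visual, side-by-side differencing described above; the file name, the sample content and the `diff_versions` helper are assumptions for illustration only.

```python
import difflib


def diff_versions(old: str, new: str, name: str) -> str:
    """Return a unified diff between two versions of a piece of content."""
    lines = difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        fromfile=f"{name} (published)",
        tofile=f"{name} (staged)",
        lineterm="",
    )
    return "\n".join(lines)


published = "<h1>Prices</h1>\n<p>Widget: $10</p>"
staged = "<h1>Prices</h1>\n<p>Widget: $12</p>"
report = diff_versions(published, staged, "prices.html")
```

A reviewer could read `report` before approving the move from the staging area to the live site; lines prefixed with `-` come from the published version and lines prefixed with `+` from the staged one.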
CONCLUSION
The web environment provides the opportunity to connect many different resources. Whilst the resultant web systems are easily created, they are complex systems offering many challenges for CCM. We need to understand the problems that companies are having with web systems in order to properly define their CCM requirements. We then still need answers to questions such as: What are good content development and change processes for teams developing large-scale web systems? Are there different processes depending on the type of web system, size of company, volume of web data? Can the types of web systems be categorized into classes or architectures? Will component libraries be indicative of these architectures? What factors affect the definition of the CM process and CM items? Do we need system models, data models, architectures of web sites in order to fully capture the appropriate CM meta information?
Second-generation web systems will focus on knowledge management and hence need sound engineering principles, such as CCM, behind them. Given the many challenges, much of the solution will have to be embedded in the tools because the skill set of the developer cannot be guaranteed. This means that the CM processes will have to be implemented in the web tools rather than relying on manual procedures, along with excellent variant support, change tracking, and change propagation (especially via change sets). CCM is becoming an issue for all companies because, in order to survive beyond the first decade of the new millennium, companies must "webify" their applications.
REFERENCES
1. Proceedings of ICSE99 Web Engineering Workshop.
2. Hutcheson "The NT Application That Wouldn't Die (NASDAQ.COM)", Enterprise Development, 1 (1), Dec. 1998.
3. C. Sliwa "Maverick Intranets A Challenge for IT", Computerworld, Mar. 15, 1999.
4. D. Siegel "Secrets of Successful Web Sites: Project Management on the World Wide Web", 1997.
5. S. Dart "The Agony and Ecstasy of CM", a half-day tutorial given at the 8th International Workshop on Software CM, Brussels, Belgium, July 20-21, 1998. http://www.cs.colorado.edu/~andre/SCM8/dart.html
6. S. Dart "Not All Tools are Created Equal", Application Development Trends, Oct. 1996, 7pp. http://www.adtmag.com/pub/oct96/fe1002.htm
7. H. Gellersen and M. Gaedke "Object-oriented Web Application Development", IEEE Internet Computing, Jan/Feb 1999, pp. 60-68.
8. L. Lockwood "Taming Web Development", Software Development magazine, April 1999.
9. S. Dart "The Dawn of Document Management", Application Development Trends, Aug. 1997, 6pp.
10. S. Dart "To Change Or Not To Change", Application Development Trends, June 1997, 7pp.
11. Iyengar et al. "Techniques for Designing High-Performance Web Sites", IBM Research, March 1999, 17pp.
12. T. Powell "Web Site Engineering", Prentice Hall, 1998.
CONTENT CHANGE MANAGEMENT: PROBLEMS FOR WEB SYSTEMS (DRAFT)
(c) 1999 SUSAN DART. ALL RIGHTS RESERVED