The Tangled Web: Designing for Maintenance
Jeni Li Shoecraft, Webmaster, Arizona State University West.
4701 W Thunderbird Rd; PO Box 37100; Phoenix, AZ 85069-7100; USA.
Phone (602) 543-8282. Fax (602) 543-3260.
jeni.li@asu.edu
http://www.west.asu.edu/jenili [HREF1]
Keywords
WorldWideWeb, Planning, Design, Development, Maintenance, Management, Automation, Tools
Abstract
Web sites have a way of growing... and of getting out of hand. This article discusses Web site maintenance and redesign issues, drawing on the author's research and experience redesigning the Web site of Arizona State University West [HREF2].
The article offers a general strategy toward a flexible design that lengthens the maintenance cycle and reduces the frequency of redesign -- including processes for site planning, automation, and review.
Topics discussed include style sheets, dynamic documents, and server-side includes. The article includes a wealth of links to software products designed to make Web site development and maintenance quicker and easier.
As a communications medium, the Web is in a strange phase of development. URLs appear on billboards, TV commercials, magazine ads, and business cards. People who try to work the word "ubiquitous" into conversations claim that the Web is becoming ubiquitous. Well... almost, but not quite. True, Web access is pretty easy to get, and Web publishing just keeps getting easier to do. But in many cases, the medium still obscures, or even precludes, the message. This isn't the Web's fault; it's ours!
The last couple of years have seen a rush of institutions to "establish a Web presence" -- often without planning or attention that comes anywhere near their approach to print publications and other media. It seems that just "being on the Web" is the initial goal, and the sooner the better. This can lead to unplanned, slapdash sites that do little to advance the institution's mission -- and to a complete site overhaul in a very short time.
Even when a site has been carefully planned from the start, major change is not uncommon within a year or so of the initial design. Visitors may express different needs than the institution anticipated. The institution may become more concerned with its "Web image". The institution's goals may change as individuals explore the Web's communications potential. As Web development responsibilities spread to non-technical staff, or simply outgrow the development staff, simplicity and automation may become more important. All of these have happened at Arizona State University West in the last two years, motivating a site overhaul that's still in progress.
The following scenario will seem familiar to a good many Web developers. Indeed, it's not too far from our own experience at ASU West.
January
|
"We need a Web site!"
|
March
|
"We're up! Hooray!"
|
April through August
|
"This needs to be on the Web too..."
"So-and-so will be responsible for these pages..."
"Oops, I deleted your directory by mistake... sorry..."
"Make a new button on the home page for this..."
|
September through December
|
"How come I can't find what I need on our Web?"
"Our pages look like heck!"
"Gosh, our Web server is slow!"
"Whose bright idea was it to get on the Web, anyway?"
|
January
|
"We need a whole new Web site!"
|
When thinking about changes to a Web site, it's useful to make a distinction between maintenance and redesign. Maintenance has to do with the content of the site, while redesign has more to do with its structure and interface. Maintenance happens regularly and frequently. Unfortunately, in many cases so does redesign!
When you update existing content, or add new content to the existing site structure, that's maintenance. Maintenance is a routine and necessary part of Web site administration; visitors often expect up-to-the-minute information on the Web. Haven't we all groaned on finding a site with flashing "NEW!" icons... and a tag line at the bottom: "Last updated: July, 1995"? Many sites need maintenance at least weekly, if not daily, as information changes and new services become available.
Redesign includes re-organizing existing content, creating new or different categories, changing the underlying directory structure, and making significant changes to the site's interface or technology. You almost certainly will redesign your site at some point; that's the nature of the Web. It could be something as simple as a cosmetic overhaul, or as complex as tossing it all out and starting over (yikes!). But too-frequent redesigns can result in "bad links" from commercial search engines, confused and frustrated visitors, and -- of course -- a good deal of unnecessary extra work for the site's developers.
Over time, your Web site may evolve: Its purpose may change, or visitor feedback may indicate the need for serious restructuring, or a new content delivery mechanism may prove to be so compelling and so natural to your application that you just have to incorporate it into the site. At such times, redesign is desirable. A site design at its best, however, will be flexible enough to allow a fairly long maintenance cycle before redesign becomes necessary.
If any step in Web site development consistently gets short shrift, it is site planning. In the hurry-up-and-do-more environment of many institutions today, Action is King -- and reflection is for weenies. The frantic pace and intense competition of Web tool development only serve to reinforce this approach. But the cliché, "If you don't know where you're going, you're going to get nowhere fast," applies here. An up-front investment of the time it takes to identify your purpose and target your audience sets the foundation for all development tasks to come -- making design decisions quicker and easier, and forestalling redesign.
The first step in clarifying your vision is identifying the site's purpose. This will drive your choices of content as well as some decisions about the back-end technology of your Web services. "We need to be on the Web" isn't a reason.
Why do you want to be on the Web? To provide contact information? To conduct distance education and administrative functions? To enhance internal communication? To recruit students? To espouse a particular philosophical view?
Whom do you intend to reach? Existing students, prospective students, members of the institution, faculty of other institutions, secondary school counselors, community members?
What information do you want to make available? Location and contact information? A tour of your facility? Internal support documents? Admission requirements? Course and program descriptions? Actual courseware?
What image do you want to convey? A consistent, conservative image across all parts of the site that says you really have your act together? A modern, "hip" image that says you're on top of the latest in technology?
Once you've identified the types of visitors you want to reach, it's time to "think user" for a while. This will inform content choices and drive your site's structure and front-end technology. What information and services will those visitors want to see on your Web site? In terms of categories, where would they look for a given page? And what methods are they likely to use to connect to your site?
What browser, platform, plug-ins, and connection speed are they likely to have? If your primary audience consists of students with low-end equipment and slow dial-up connections, a home page loaded with "fat" graphics or a 3MB QuickTime video tour of your campus probably won't make you Mr. Popular or get your site onto their bookmark lists. At ASU, the only "free" (non-subscription, provided by the institution) Internet access our students have from home is a dial-in UNIX shell account -- so text-only support becomes important, even more so when we account for our visually impaired students.
Getting a comprehensive site structure in place -- before you start creating pages in a frenzy -- guides site development and maintenance, and reduces your chances of ending up with duplicate content maintained by different developers. This is a good time to involve other developers in the organization, if there will be others.
Your content's organization will have a profound effect on your visitors' browsing experience -- and on the Webmaster's workload, and the overall success of your site. Our existing structure, for example, is department-centric: Information is organized along departmental lines, and people pretty much have to know what department offers a service before they can find that service on our Web site. This was appropriate for an "about our institution" site, which was the site's initial purpose, but is much less appropriate for the "virtual university" environment we're emphasizing this fall.
When organizing your content, think "user"! Web site visitors are looking to complete certain tasks online. When they can't find what they're after quickly, they click the Webmaster's email link and ask for personal service. This creates more work for the Webmaster (especially in a large institution) and degrades service for the visitors.
A typical site will have between six and eight major categories -- maybe four to ten in extreme cases. If you have more, consider grouping some of those categories at a second level, for a more granular approach. The "Affinity Grouping Process" associated with Total Quality Management -- labeling pieces of content on Post-It notes and grouping the Post-It notes on a wall -- works very well for this exercise. Participation from key members is important in this step. For our site redesign we included technology staff, public relations staff, and a librarian (because their background includes an understanding of how people classify information).
This is also a good time to discuss division of responsibilities. Who will be responsible for what content, especially where content is presented in a way that crosses departmental boundaries? The directory structure and access privileges can be addressed at this time; however, if this is a redesign rather than an initial design exercise, remember the hundreds of user bookmarks and search engine entries out there that link to your existing content. You might handle this smoothly with directory aliases or with a very good custom "404 Not Found" page that offers direct links to the site's most frequently requested pages (or their equivalents) within the new structure.
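As a rough illustration of the alias-plus-404 idea, here is a minimal Python sketch. The old and new paths, the list of popular pages, and the function name resolve are all made up for the example; a real server would normally handle this through its own alias or error-document configuration.

```python
from html import escape

# Hypothetical mapping from old paths to their equivalents in the new structure.
MOVED = {
    "/admissions/apply.html": "/students/apply.html",
    "/library/hours.html": "/services/library-hours.html",
}

# Hypothetical list of the most frequently requested pages (URL, link text).
POPULAR = [
    ("/students/apply.html", "Apply for admission"),
    ("/services/library-hours.html", "Library hours"),
]


def resolve(path):
    """Return (status, headers, body) for a requested path that no longer exists."""
    if path in MOVED:
        # A permanent redirect keeps old bookmarks and search engine entries working.
        return 301, {"Location": MOVED[path]}, ""
    links = "\n".join(
        f'<li><a href="{escape(url)}">{escape(label)}</a></li>'
        for url, label in POPULAR
    )
    body = (
        "<html><head><title>Page not found</title></head><body>\n"
        f"<p>We could not find {escape(path)}. These pages may help:</p>\n"
        f"<ul>\n{links}\n</ul>\n</body></html>"
    )
    return 404, {"Content-Type": "text/html"}, body
```

A request for a moved page gets sent straight to its new home; anything else gets a "Not Found" page that at least points somewhere useful.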
Especially in a large site, coding and updating pages gets old fast. In a large institution, where much of the content may be developed by non-technical staff, design should be as quick, easy, and error-proof as possible. Templates (for both pages and graphics) can help streamline the design process. Style sheets offer consistent text formatting with minimal effort, and without making a page's structure inaccessible to text-only browsers, search engines, and the like.
Consider using templates -- "shell" documents with the site's standard layout tags, such as a background, header, and footer. Most recent Web design packages support templates, or you can create and manage them as straight text files if you're a stubborn fan of notepad and vi.
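For developers who manage templates as straight text files, the "shell" idea can be sketched in a few lines of Python. The layout, the placeholder names, and the default maintainer address below are assumptions for illustration only, not our actual template.

```python
from string import Template

# A hypothetical "shell" document: standard background, header, and footer,
# with placeholders for the parts that differ from page to page.
PAGE_SHELL = Template("""<html>
<head><title>$title</title></head>
<body bgcolor="#ffffff">
<h1>$title</h1>
$content
<hr>
<address>$maintainer</address>
</body>
</html>
""")


def build_page(title, content, maintainer="webmaster@example.edu"):
    """Fill the shell with one page's title, body content, and maintainer."""
    return PAGE_SHELL.substitute(title=title, content=content, maintainer=maintainer)
```

A content author then supplies only the title and the body text; the standard layout comes along for free.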
For sites that use "buttons" or other graphics with text, keep a set of "blank" graphics with guides for text placement and fonts used (Adobe Photoshop's [HREF3] layers are great for this). Spidersoft's [HREF4] WebGAL takes things a step further, organizing a set of "Web resources" (HTML snippets, graphics, Java applets, et cetera) in a single file.
Cascading style sheets [HREF5], which Netscape is finally supporting in Communicator 4 [HREF6], can be used to modify standard text styles and automate text formatting across an entire site (just think, no more <font> tags!). Style sheets are external documents (or headers within each document, or inline formatting commands) that contain formatting specs for standard and user-defined tag elements. To change the appearance of an entire set of documents based on a style sheet, you simply change the style sheet.
Style sheets are browser-dependent. When a page makes reference to an external style sheet, the browser downloads the CSS file and interprets the page according to the style information it finds there. Browsers that lack style-sheet support will display the standard styles instead of your custom formatting. Depending on your application, this can be a boon or a bane. It's also worth noting that using style sheets will result in a marginal increase in server load, due to extra hits from browsers downloading the style sheets themselves. Using style sheets also requires a new MIME type in the server software's configuration files: .css as text/css.
Many current Web development packages support style sheets as well, including:
Documents can be automated on the server side in a variety of ways -- for instance, with CGI applications, database connectivity, batched changes to static files, and server-side includes.
Dynamic documents -- Web pages that change without manual intervention -- can be generated in a variety of ways. They may be generated by programs on the fly, as in the case of CGI applications that return different output based on HTTP session conditions. An example of this is the opening screen of ASU's FASTTWeb Interactive [HREF12] site, which checks the current time against hours of system availability and responds accordingly.
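The time-of-day check can be sketched as a small Python function. This is not the FASTTWeb code; the hours of availability and the wording of the two screens are assumptions made up for the example.

```python
from datetime import time

# Assumed hours of system availability (the real hours are not given here).
OPEN_AT = time(7, 0)
CLOSE_AT = time(22, 0)


def opening_screen(now):
    """Return the HTML to show for a request arriving at clock time `now`."""
    if OPEN_AT <= now < CLOSE_AT:
        return '<p>The system is open. <a href="login">Continue</a></p>'
    return (
        "<p>The system is unavailable right now. "
        f"It is open from {OPEN_AT:%H:%M} to {CLOSE_AT:%H:%M}.</p>"
    )
```

A CGI program would call something like this with the current time (for instance, datetime.now().time()) and print the result after its HTTP headers.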
Dynamic documents may be generated from databases, as in the case of a staff directory or catalog application. ASU's Online Directory [HREF13] is an example, as is the previously mentioned FASTTWeb Interactive. Database connectivity tools range from the expensive-and-simple to the free-and-complex. There are two very general approaches to Web/database tools: a data-access approach, which gives the basic tools for accessing data in an application but leaves the processing and output to you; and a presentation-logic approach, in which you create HTML template files with extended tags to indicate database functions, and a server-side process handles those extended tags before returning the output to the browser. Several database connectivity products are listed below, along with a couple of resource lists.
Document delivery can also be automated by periodic batch processes that simply change static files on the server. ASU's Online Schedule of Courses [HREF19] is an example of this. Registration transactions are processed nightly on a batch mainframe system, which FTPs updated course listings to a directory within the Web space. When a browser requests a particular set of course listings, a CGI script locates the appropriate file and returns it line for line (surrounded by a standard header and footer). The timed banner-switching on the new ASU West Web [HREF2] is another (simpler) example.
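The "locate the file and wrap it" step might look something like the Python sketch below. The file naming scheme (one .txt file per subject code), the header and footer text, and the function name are assumptions for illustration; the actual script is not shown in this article.

```python
from html import escape
from pathlib import Path

# Hypothetical standard header and footer wrapped around every listing.
HEADER = "<html><body>\n<pre>\n"
FOOTER = "</pre>\n</body></html>\n"


def course_listing_page(listing_dir, subject):
    """Return the page for one subject's listing, or None if there is no file."""
    # Accept only simple alphabetic codes so a request cannot escape the directory.
    if not subject.isalpha():
        return None
    listing = Path(listing_dir) / f"{subject.upper()}.txt"
    if not listing.is_file():
        return None
    text = listing.read_text(encoding="utf-8")
    # Escape markup characters so the plain-text listing displays as written.
    return HEADER + escape(text) + FOOTER
```

Because the nightly batch job only replaces text files, the Web side never needs to know anything about the mainframe.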
Server-side includes (SSIs) can be used to include files containing standard information -- such as a header, footer, or disclaimer -- in multiple Web pages without duplicating the content. For instance, if I have a header file header.html and want to include that file in my page index.shtml, index.shtml will look something like this:
<html>
<head> blah blah blah... </head>
<body>
<!--#include virtual="header.html" -->
blah blah blah...
</body>
</html>
Because of the .shtml extension, the Web server will understand that it's supposed to parse index.shtml, looking for and acting on directives it finds. It will output each line of index.shtml as it normally would; but when it gets to the include directive, it will open the file header.html and send each line of that file before continuing with the rest of index.shtml. In my example above, if my file header.html looked like this:
<img src="logo.gif">
<br>
Copyright information and other oh-so-important stuff
<p>
the browser, on requesting the file index.shtml, would see this:
<html>
<head> blah blah blah... </head>
<body>
<img src="logo.gif">
<br>
Copyright information and other oh-so-important stuff
<p>
blah blah blah...
</body>
</html>
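To make the substitution concrete, here is a minimal Python sketch of what the server does with the include directive. It is a simplification under stated assumptions: it treats the virtual value as a file name under a document root, handles only this one directive, and does not process includes inside included files. Real servers do considerably more.

```python
import re
from pathlib import Path

# Matches the include directive shown above, with or without a space before "-->".
INCLUDE = re.compile(r'<!--#include\s+virtual="([^"]+)"\s*-->')


def expand_includes(text, doc_root):
    """Replace each include directive with the contents of the named file."""
    root = Path(doc_root)

    def replace(match):
        # Assumption: the virtual path names a file relative to doc_root.
        return (root / match.group(1)).read_text(encoding="utf-8")

    return INCLUDE.sub(replace, text)
```

Calling expand_includes on the text of index.shtml, with header.html in the same directory, yields the merged page the browser sees.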
SSIs are also useful for incorporating volatile (or manually updated) information into an otherwise static page. For instance, a "News" page may have formatting in straight HTML, but include text-based news articles from separately maintained files. This is handy for Web content maintained by non-technical staff who don't want to bother with HTML or Web authoring tools; they can work strictly with the text.
As the name implies, server-side includes are server-dependent. If the Web server software supports them (as most do), there is an option to enable SSIs and to specify the file extension the server will parse. For instance, the server might be directed to parse files with a .html extension -- that is, all HTML documents. Different directories may be configured differently as well. In addition, the syntax of the include directive may vary depending on your Web server software. SSIs can have a marginal effect on Web server performance, as they require greater processing overhead for each file to be parsed.
The less time and effort it takes to maintain a site, the better are its chances of being maintained. Many software products are available to take the tedium out of site maintenance tasks, automating such functions as:
- validating (checking) your HTML
- checking for broken links
- checking for missing or extraneous documents
- managing "NEW!" flags, "last modified" dates, etc.
- performing extended search & replace functions across directories
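To give a feel for what such tools do, here is a minimal Python sketch of one function from the list: checking for broken links. It is not how any of the products mentioned here work. It assumes a site made of static .html files under one directory, looks only at href and src attributes, and checks only local links (external URLs are skipped).

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect the values of href and src attributes in one page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def broken_local_links(doc_root):
    """Return (page, link) pairs whose local target file does not exist."""
    root = Path(doc_root)
    broken = []
    for page in sorted(root.rglob("*.html")):
        parser = LinkCollector()
        parser.feed(page.read_text(encoding="utf-8", errors="replace"))
        for link in parser.links:
            parts = urlparse(link)
            # Skip external URLs, mailto: links, and same-page anchors.
            if parts.scheme or parts.netloc or not parts.path:
                continue
            if parts.path.startswith("/"):
                target = root / parts.path.lstrip("/")
            else:
                target = page.parent / parts.path
            if not target.exists():
                broken.append((str(page.relative_to(root)), link))
    return broken
```

Run against a copy of the document root, this returns a list of pages and the links in them that point nowhere.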
Some notable maintenance tools for the Windows platform include:
Some tools are available for the Macintosh platform as well, including:
Some maintenance functions are built right into newer Web server software. For example, Netscape's LiveWire-based "Site Manager" will check the entire site for broken links. Look through your server documentation for site-management features -- in some cases, you need look no further.
A periodic top-to-bottom review of the site can point up the need for redesign and give you a chance to do it in a planned, more or less controlled fashion. After the initial design has been up and in use for a while, your goals may change -- or visitor feedback may indicate the need to reorganize. As with any planning process, a redesign will have better results if it's based on data from the previous design. If you start collecting data now, you'll know how to reorganize when the time comes.
Electronic mail inquiries offer a visitor's-eye-view of your Web site. Are you consistently fielding requests for the same piece of information that's not on your Web site? Put it up there! Are you consistently fielding requests for the same piece of information that is on your Web site? It's in the wrong place! A simple email inquiry log can tell you more about how visitors approach your site than almost anything else.
"Exit interviews" or user surveys can yield valuable information in a similar vein -- as long as enough visitors are willing to take the time to fill them out. To increase your response rate, keep the survey short and easy, make it accessible by one link from any page on your site, and consider offering some sort of incentive for filling it out.
Web server logs and statistics can show you the most frequently used files on your site, patterns of documents requested and not found, performance data on your server, and even the type of browser people are using to access your site (and therefore what browsers to tweak your code for -- handy when deciding what to do about the Browser Wars). Some servers have built-in log analysis tools. Ours does, but we use WebTrends [HREF28] because it gives us nicely formatted reports for the academics and higher-ups. Lars-Owe Ivarsson maintains a massive list of log analysis tools [HREF29] at Uppsala University.
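If you have no log analysis tool handy, the two most useful tallies (most requested files, and documents requested but not found) can be sketched in a few lines of Python. This sketch assumes the server writes the widely used Common Log Format; the function name and the regular expression are illustrative, and lines that do not match are simply ignored.

```python
import re
from collections import Counter

# Pulls the requested path and the status code out of a Common Log Format line.
LOG_LINE = re.compile(r'"(?:GET|HEAD|POST) (\S+)[^"]*" (\d{3}) ')


def summarize(lines, top=5):
    """Return (most requested paths, most frequent 'not found' paths)."""
    hits = Counter()
    not_found = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue
        path, status = match.group(1), match.group(2)
        if status == "404":
            not_found[path] += 1
        elif status.startswith("2"):
            hits[path] += 1
    return hits.most_common(top), not_found.most_common(top)
```

Passing an open log file to summarize gives two short lists: the pages worth keeping front and center, and the missing pages visitors keep asking for.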
Benchmarking with other, similar institutions lets you learn from others' experiences and examples. Compare their information structure with your own, and try to navigate their site as a user. According to a study conducted at Teachers College, Columbia University [HREF30], an average Web site visit lasts no longer than three to five minutes before the visitor moves on. Can you find a given piece of information on their site within that time? On your site?
Change is in the essential nature of the Web. Technology changes; visitors change; the purpose of an organization's Web presence changes. Most of us will redesign our Web sites several times. With careful site planning and review, however, it's possible to spend longer in "maintenance mode" -- forestalling redesign until external circumstances make it absolutely necessary.
In addition, using a variety of automation techniques and tools, site development and maintenance can be streamlined dramatically. This helps to "lower the bar" of technical knowledge required to publish on the Web and reduces the overall workload of Web development and support staff.
- HREF1
- http://www.west.asu.edu/jenili
-- Jeni Li's personal home page
- HREF2
- http://www.west.asu.edu
-- Arizona State University West
- HREF3
- http://www.adobe.com/prodindex/photoshop/main.html
-- Adobe Photoshop
- HREF4
- http://www.spidersoft.com
-- Spidersoft
- HREF5
- http://www.w3.org/pub/WWW/Style
-- W3C's cascading style sheets reference
- HREF6
- http://home.netscape.com/comprod/products/communicator
-- Netscape Communicator 4
- HREF7
- http://www.sausage.com/soft1.htm#hotdog
-- Sausage Software's HotDog Pro
- HREF8
- http://www.allaire.com/products/homesite
-- Allaire's HomeSite 2.5
- HREF9
- http://www.deltapoint.com/qsdeved/index.htm
-- DeltaPoint's QuickSite 2.0
- HREF10
- http://www.softquad.com/hip
-- SoftQuad's HoTMetaL Intranet Publisher
- HREF11
- http://interaction.in-progress.com
-- Interaction in*Progress
- HREF12
- http://www.asu.edu/fastt
-- ASU's FASTTWeb Interactive
- HREF13
- http://www.asu.edu/asuweb/directory
-- ASU's Online Directory
- HREF14
- http://www.stormcloud.com/wdbc3/default.htm
-- Stormcloud's WebDBC
- HREF15
- http://www.allaire.com/products/coldfusion
-- Allaire's Cold Fusion
- HREF16
- http://www.sybase.com/products/internet/websql/index.html
-- Sybase's web.sql
- HREF17
- http://www.starnine.com/development/extendingwebstar/database.html
-- Database connectivity tools for Macintosh-based servers
- HREF18
- http://webdev.indiana.edu/cgi-bin/jsissom/dbgate
-- The Database Gateway Selector, by Jay Sissom of Indiana University, Bloomington
- HREF19
- http://www.asu.edu/registrar/schedule
-- ASU's Online Schedule of Courses
- HREF19
- http://www.opposite.com
-- HTML Power Tools
- HREF20
- http://www.kagi.com/bungalow/html/xplus.html
-- Xpire Plus
- HREF21
- http://www.spyglass.com/products/validator
-- Spyglass HTML Validator
- HREF22
- http://www.biggbyte.com
-- InfoLink Link Checker
- HREF23
- http://www.tcb.ac.il/~lazar/lazar.html
-- CleanUp
- HREF24
- http://www.microsoft.com/frontpage
-- Microsoft's FrontPage
- HREF25
- http://www.matterform.com/grinder/index.html
-- Matterform's HTML Grinder
- HREF26
- http://pauillac.inria.fr/~fpottier/brother.html.en
-- Big Brother
- HREF27
- http://www.adobe.com/prodindex/sitemill/main.html
-- Adobe SiteMill
- HREF28
- http://www.webtrends.com
-- WebTrends
- HREF29
- http://www.uu.se/software/analyzers
-- Lars-Owe Ivarsson's massive list of log analysis tools
- HREF30
- http://www.ilt.columbia.edu/academic/classes/tu5020/projects/he/higher_ed.html
-- A study of higher-education institutions' Web sites, conducted at Teachers College, Columbia University
Copyright
Jeni Li Shoecraft ©, 1997. The author assigns to Southern Cross
University and other educational and non-profit
institutions a non-exclusive licence to use this document
for personal use and in courses of instruction provided that
the article is used in full and this copyright statement is reproduced.
The author also grants a non-exclusive licence to Southern Cross
University to publish this document in full on the World Wide Web
and on CD-ROM and in printed form with the conference papers, and
for the document to be published on mirrors on the World Wide Web.
Any other usage is prohibited without the express permission of the
author.
AusWeb97
Third Australian World Wide Web Conference, 5-9 July 1997,
Southern Cross University, PO Box 157, Lismore NSW 2480, Australia
Email: AusWeb97@scu.edu.au