A stable view of the hyperactive web
George Buchanan, Gil Marsden, Thomas Tan, Yin Leng Theng & Harold Thimbleby
Middlesex University
Bounds Green Road
LONDON, N11 2NQ
Abstract
When we started developing organisational web sites in 1994, we adopted an integrated approach that effectively imposed our concepts on to the web. Our approach was to involve web site designers, and automate as much as possible of the repetitive aspects of authoring, with the aim of improving quality (design, usability, corporate value) and maintainability. However, the diversity and rapid obsolescence of web technologies soon far outstripped our resources to manage and develop a "fully-featured" web site development tool. Today, instead, we are developing specialist tools to address quality issues that are outside the scope of commercial web development tools. These tools are developed in parallel with usability studies that direct the tool development in the most productive directions. Because our tools are now smaller and more focused, their development is no longer on critical paths of overall site development: we are no longer "in control" of the web, but flow with it — and in some cases, ahead of it. Indeed our tools have a serendipitous synergy, which opens up new ways of improving the quality of web sites and similar complex systems.
Introduction
How do we live with a technology that could change the world for the better, which is tremendously exciting, yet has all the worst aspects of the fashion industry? In the fashion industry, this week we wear brown, next week blue is the rage; but at least the colour we fancy makes little difference to how we relate to one another. The web, however, shares a standard infrastructure which is itself an object of fashion. When enough people follow a trend that relies on shared infrastructure, we have to use it. If we are to look at their web pages and they at ours, we are effectively forced to comply with the hyperactivity.
We have been developing systems for helping organisations to develop web sites. In particular, we have worked with the Royal Society of Arts on a 500+ page web site (including databases and other active features), involving over 20 authors of material, and with the Friends of Benjamin Franklin House, a smaller site that we authored on our own. In combination with examining the design of sites and the social role of sites within organisations, we have had as a primary goal developing systems to facilitate the authoring of quality, multi-author web sites.
We have found developing our own tools frustrating because web technologies advance so fast that it is too hard to remain current. Our tools, such as Gentler (Thimbleby, 1997a), HyperAT (Theng, 1997), Siteview (Thimbleby, 1997b) and StyleGeezer (Marsden, Palmer & Thimbleby, 1997), each tried to provide an integrated and effective environment. Even though we had made a decision to separate authoring user interfaces, the expectations of users exceeded our implementation resources. We have now turned to exploring specialist tools, which each do one thing well and work independently of one another. Indeed, our notion of "one thing" is becoming more restricted than it was! In particular, there is no need to try to do things that commercial enterprises are already doing effectively: making web sites attractive and fashionable. What commercial systems are not doing is making high quality, maintainable and reliable sites.
The unreasonable cost of quality if done the wrong way
This section, motivating what follows, argues that sites at even small scales are too complex to be managed without special approaches.
A web site contains numerous links, often represented as highlighted natural language phrases, that when clicked take the user to other parts of the site. A human must read the links to see what they mean, and click to see whether the natural language description fits the target page. Although some link connections can be checked automatically, most (indeed, the interesting ones) have to be checked manually.
The question arises, what is the shortest way around a site that covers every link? This is a graph theoretic problem, the Chinese Postman Problem (named after its Chinese originator, Mei-Ko, 1962).
For concreteness we use the web site for Ben Franklin’s London house, at http://www.rsa.org.uk/franklin. The site has 66 pages and 1191 links. The optimal Chinese postman tour of this site is 2248 clicks. This is manifestly too lengthy for a human to do unaided. Moreover, if mistakes are made, or if for any reason a different route is taken, the checking must take more than this minimum of 2248 steps. Realistically, a human checker would have to spend days checking a site, and the start-up times of each session would also need to be accounted for.
Thus checking even a modest web site seems like an impossible burden for an unaided (patient and long-lived!) human: the human tester needs tool support, or alternative methods must be used. Sharing the work between several people creates management overheads, and still does not overcome the complexity. Worse, this checking is only checking the static links, the easy bit! Sites typically have images, programmed material, database back ends, and other effects. Do they work with particular browsers, or conform with the latest standards? Do images correspond with their captions? All these and more issues also require checking. Thus, if a small and "easily understood" part of the quality control process of a modest site is unmanageable, we face a really serious problem!
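To make the tour-length calculation concrete, the following Python sketch computes the length of an optimal postman tour for a tiny, invented site. It is an illustration under stated assumptions, not the code used to obtain the figures above: it assumes the site is a strongly connected directed graph, that the checker follows links only (no Back button), and that every click costs the same. The function names and the example site are hypothetical. The brute-force matching step is practical only for very small examples; a tool for real sites would use a minimum-cost flow algorithm instead.

```python
from collections import deque
from itertools import permutations


def shortest_paths(links, start):
    """Breadth-first search: fewest clicks from start to each reachable page."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in dist:
                dist[target] = dist[page] + 1
                queue.append(target)
    return dist


def postman_tour_length(links):
    """Clicks in a shortest closed tour that follows every link at least once.

    Assumption: the site is strongly connected and only links are followed.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n_links = sum(len(targets) for targets in links.values())

    # balance = links leaving a page minus links arriving at it
    balance = {page: 0 for page in pages}
    for page, targets in links.items():
        for target in targets:
            balance[page] += 1
            balance[target] -= 1

    # A page with more arrivals than departures must be left again by
    # repeating links; each such extra trip ends at a page with spare departures.
    starts = [p for p in pages for _ in range(-balance[p])]
    ends = [p for p in pages for _ in range(balance[p])]

    dist = {p: shortest_paths(links, p) for p in set(starts)}
    # Brute force over all pairings of starts with ends (tiny examples only).
    extra = min(
        sum(dist[s][e] for s, e in zip(starts, order))
        for order in permutations(ends)
    )
    return n_links + extra


# Hypothetical three-page site with four links.
example_site = {"home": ["a", "b"], "a": ["home"], "b": ["a"]}
tour_clicks = postman_tour_length(example_site)
```

For this invented site the result is 5 clicks for 4 links: the tour home, a, home, b, a, home has to repeat the link from a to home once. The same effect, multiplied across a real site, is why the tour for the Franklin site is so much longer than its count of links.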
In fact, the Benjamin Franklin site was generated by a web site compiler, Siteview, which in this case compiled 78 ‘source’ pages (some are template files, and therefore do not appear in the final site) with only 201 links. Most links can be checked by the compiler automatically — for instance, the source has to be connected. The source site has a postman tour of only 241 links, and that is including the links the compiler can interpret itself (as if the site was to be carefully prepared entirely by hand without any help from the compiler). This is a scale of task far more easily undertaken by hand.
Sites created by Siteview also have automatically generated linear files. The advantage of linear files is that they can easily be read sequentially, for instance by a proof-reader. Off-site references are converted to links plus explicit URLs, so that these can be visually checked easily.
One disadvantage of the linearised site is that it is enormous, and crashes some browsers! We describe below a way of linearising parts of sites.
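The conversion of off-site references for linear files can be sketched as follows. This is a minimal illustration, not Siteview's implementation: the function name, the output format and the way the site's host is passed in are all assumptions made for the example.

```python
from html import escape
from urllib.parse import urlparse


def linearised_link(text, href, site_host):
    """Render a link for a linear file (hypothetical helper).

    Off-site links keep the link and also show the URL in plain text,
    so that a proof-reader can check it visually.
    """
    anchor = f'<a href="{escape(href, quote=True)}">{escape(text)}</a>'
    host = urlparse(href).netloc
    if host and host != site_host:
        return f"{anchor} ({escape(href)})"
    return anchor


on_site = linearised_link("History", "history.html", "www.rsa.org.uk")
off_site = linearised_link("Example", "http://example.org/", "www.rsa.org.uk")
```

Relative links have no host and are treated as on-site, so only genuinely external references gain the explicit URL.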
Three approaches to quality in an active area
Given that web sites as such are too complex to manage, we need to look at new approaches to authoring, usability and managing web sites. This section gives three ideas, and we then show how the ideas can be combined, to gain a further level of effectiveness.
For the purposes of this paper, submitted to the Active Web conference, we have not taken space to contextualise and explain at length our approaches. We therefore provide brief outlines, to give an indication of their potential.
Webtree: a site simplifier, a site searcher…
Webtree can be explained in many different ways, because it is based on a particularly powerful concept.
Imagine performing a search on the web in a conventional search engine. The engine retrieves many pages, which it summarises. An alternative approach would be to identify the structures of sites that match the query and treat the matching parts as a virtual site. The virtual site would then be presented as a coherent document, perhaps by showing structured headings, as if from the table of contents of a book on the specialist subject that had been searched for: in fact, as an interactive outline. This is one idea behind Webtree (see Figure). Another is to combine the ideas of personalised information services (such as newspapers; Bharat, Kamba and Albers, 1998) with information retrieval.

Webtree screen shot (running in a standard web browser, which is not shown). Queries are entered in the left frame, the right frame shows search results within an interactive outliner. Clicking one of the arrow buttons in the outline reveals more detail in the usual way; clicking the L buttons at the top reveals the same amount of information in all sections.
Imagine a site manager (possibly an author, but possibly a person responsible for site quality): they wish to proof-read parts of the site. They perform a search on relevant terms (e.g., to search for legal disclaimers). Webtree then presents the result of their site-wide search as a small, manageable document, that is only a part of the whole site. This document has a visible and conventional structure, which reveals the context of the material retrieved by the search. Thus the manager can proof-read those parts of the site they are interested in, in context, and without the (unmanageable) overhead of the rest of the actual site. Thus Webtree solves some issues of web site management (as well as stimulating new ideas in search and retrieval).
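The idea of presenting search results in context can be sketched in a few lines of Python. This is not Webtree's algorithm, which is not described here; it is a minimal illustration that assumes each page is identified by a slash-separated path reflecting the site hierarchy, and that a plain substring match stands in for the search. All names and the example pages are hypothetical.

```python
def search_outline(pages, query):
    """Return indented outline lines: matching pages plus their ancestors."""
    query = query.lower()
    keep = set()
    for path, (title, text) in pages.items():
        if query in title.lower() or query in text.lower():
            parts = path.split("/")
            # Keep the matching page and every ancestor, to show context.
            for depth in range(1, len(parts) + 1):
                keep.add("/".join(parts[:depth]))

    lines = []
    # Sorting by path components keeps each page directly above its children.
    for path in sorted(keep, key=lambda p: p.split("/")):
        depth = path.count("/")
        title = pages[path][0] if path in pages else path.rsplit("/", 1)[-1]
        lines.append("  " * depth + title)
    return lines


# Hypothetical site: path -> (title, text)
example_pages = {
    "home": ("Home", "Welcome to the society."),
    "home/events": ("Events", "Bookings are subject to our legal disclaimer."),
    "home/events/lectures": ("Lectures", "Programme of lectures."),
    "home/about": ("About", "History of the society."),
    "home/about/terms": ("Terms", "Legal disclaimer and copyright."),
}
outline = search_outline(example_pages, "disclaimer")
```

Here the outline contains Home, About, Terms and Events, indented by depth, and omits the Lectures page, which does not mention the search term. The manager reads a four-line document rather than the whole site.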
Currently, Webtree is a research tool that is being developed with management goals in mind. We plan, however, to make a version of it available to ordinary users, since it is clearly a very effective web site search tool.
Docman: documenting the required changes…
Having found parts of a site that require editing (or may require editing in the future), there is now a communication problem between the manager and the actual authors of those parts of the document that should be updated. Moreover, updates may remain as "to do" items indefinitely — for example, non-urgent changes and ideas need to be recorded, but may never actually get done.
Docman is our tool to present a site so that authors and others can communicate requirements and ideas for change in the web site (see Figure). It is a little like the "to do" list in Microsoft FrontPage, but is properly integrated with the site structure, and is specifically designed to cater for multiple authoring, and for a hierarchy of authors (e.g., a manager can comment to authors). Docman relates each comment/todo item to a specific object (page, design template or database content). Thus it is possible to undertake a formal review of usability commentary on the site, and to create more formal metrics. Patterns of error can be used for taxonomic reasons and to help investigate patterns of author and design error.

Docman screen shot (running in a standard web browser, which is not shown). The top frame is commentary, the bottom frame is the RSA site page being referred to.
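The way Docman ties each comment or "to do" item to a specific object can be pictured with a small data structure. The sketch below is an assumption for illustration only: the field names, the string form of object identifiers and the grouping function are hypothetical and do not describe Docman's actual design.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Comment:
    target: str         # object commented on: a page, template or database record
    author: str         # who wrote the comment (manager, author or user)
    text: str
    done: bool = False  # "to do" items may stay open indefinitely


def open_items_by_target(comments):
    """Group unresolved comments by the object they refer to."""
    grouped = defaultdict(list)
    for comment in comments:
        if not comment.done:
            grouped[comment.target].append(comment)
    return dict(grouped)


# Hypothetical comments on two objects.
example_comments = [
    Comment("page:/lectures.html", "manager", "Update the lecture dates"),
    Comment("template:header", "author1", "Logo too large", done=True),
    Comment("page:/lectures.html", "user7", "Image does not match caption"),
]
open_items = open_items_by_target(example_comments)
```

Grouping by object is what makes a per-page or per-template review possible, and counting items per group is one simple route to the more formal metrics mentioned above.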
Usability of the current site…
Few authors of a large site wish to evaluate the entire site, and indeed it is not realistic to do so. Instead, by working closely with designers and performing conventional usability studies, we are aiming to improve the quality of the site (Theng and Thimbleby, in press).
Essentially, this is adopting the Taguchi method of quality control (Peace, 1993). Briefly, any quality checking you have to do after building a web site is wasted effort and (in Taguchi terms) lost quality. Thus the more the designers and authors understand the issues, the higher the quality.
The two tools we have developed acknowledge that quality checking has to be done, and of course that a web site is a "living thing" that requires continual monitoring. The complementary approach is that every contribution to a site has to be of initially highest quality — otherwise, we just waste time later fixing problems, or handling user problems with a web site in a different part of the organisation.
In contrast to the two previous approaches, so far our usability studies have not yet been tool-based.
We have done both conventional usability studies and formal analyses of the web structure. Although usability evaluations have become common practice in many organisations, they are still novel and typical development cycles do not accommodate these practices. Instead of just using design principles and guidelines for reference and guidance in the design and development processes, we use a framework describing how design principles and guidelines can be used for system and interface usability evaluation of web sites. A detailed description of the framework is found in (Theng and Thimbleby, submitted).
Adapting Lindgaard’s (1994) classification of usability defects in interactive systems to web design, we identified seven areas as design guidelines for evaluating web sites:
G1. Overall reactions to hypertext. Users’ overall perception of the performance of hypertext in terms of satisfaction, completion of tasks and appeal.
G2. Screen display. How clearly information is organised and displayed on the screen.
G3. Terminology and system information. Consistency in the use of terminology, word and format.
G4. Learning. Ease of use and effectiveness of hypertext for its purpose.
G5. System capabilities and user control. Response time, reliability and recovery.
G6. Navigation. How clearly navigational elements are displayed and the reason(s) for users feeling lost.
G7. Completing tasks. Usefulness of facilities provided in helping users to complete tasks.
We formulated these design guidelines using questionnaires to measure users’ subjective opinions. (Questionnaires were chosen since they provide a rich source of data fairly quickly compared with other methods such as interviews.) The formulation of the questionnaire was influenced by the Questionnaire for User Interface Satisfaction (QUIS). QUIS measures users' subjective ratings of the interface of an interactive system (Chin, Diehl and Norman, 1988).
Four subjects were selected to evaluate the RSA web site. All were Fellows of the RSA, and had experience using the web. The subjects were asked to use the RSA site by answering questions that involved tasks such as browsing, seeking references, information search and revision. They were asked to complete a questionnaire commenting on how satisfied they were with the design and structure of the web site in helping them to complete the tasks successfully. Subjects were interviewed after they had completed the questionnaire to ask for further comments and/or clarification regarding their responses on the structure of the questionnaire and the web site.
The table compiles subjects’ evaluation of the site. It is open to debate, but we have chosen 75% as a benchmark score to decide whether an area is perceived by the subjects to be well-implemented. It would seem that except for G3, all areas were well-implemented in the RSA web site.
User responses to questionnaires
Area   %        Remarks
G1     95.83    well-implemented
G2     100.00   well-implemented
G3     72.73    not well-implemented
G4     91.67    well-implemented
G5     84.62    well-implemented
G6     75.00    well-implemented
G7     92.31    well-implemented
Compilation of ratings of the success of implementation of design principles.
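The remarks in the table follow mechanically from the 75% benchmark, as the short Python sketch below shows. The scores are those in the table; the names are illustrative. Note that G6, at exactly 75.00, is marked as well-implemented, so the benchmark is treated as inclusive.

```python
BENCHMARK = 75.0  # benchmark score chosen in the text (open to debate)

# Percentage scores per guideline area, copied from the table.
scores = {
    "G1": 95.83, "G2": 100.00, "G3": 72.73, "G4": 91.67,
    "G5": 84.62, "G6": 75.00, "G7": 92.31,
}


def remark(score, benchmark=BENCHMARK):
    """Classify an area; a score equal to the benchmark counts as passing."""
    return "well-implemented" if score >= benchmark else "not well-implemented"


remarks = {area: remark(score) for area, score in scores.items()}
below_benchmark = [area for area, score in scores.items() if score < BENCHMARK]
```

Only G3 falls below the benchmark, in agreement with the table.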
Drawing it all together
Nielsen (1998) and others have argued for user testing of sites, by which he means users in structured experiments. Docman is a tool that users — whether subjects in usability experiments, or actual users — can use to annotate parts of a site, so that authors can (if they wish) draw on users’ expertise and experience. With Webtree’s interface, managers and authors responsible for many pages can easily obtain overviews of comments on their work.
Comments could be fed back from agents, not just users, so results of formal and informal tools are both associated with the web object (e.g., the page or the database records). This raises an interesting research avenue: as users don’t see agent comments (only authors and/or managers do), then we can investigate the correlation between an agent and ‘soft’ comments from reviewers and/or users.
At the moment metadata is used for quality scheduling. This is an example of the sort of information that can be managed by Docman to enhance help for better search targeting. Docman will be capable of executing any agent programs to automatically generate quality metrics across both live objects and objects submitted for pre-release review (testing or as a final release check).
By using Java Beans, we plan that extensions to Docman’s criteria will be readily reflected in Webtree’s user interface; for example, if an agent identifies central pages, then Webtree can have a check box to direct searches to central or non-central pages. If users identify usability catastrophes, then these can be found by Docman; if (and only if) there are catastrophes, Docman’s user interface would provide the option to search for them.
In general, the approach will provide usability results on many dimensions (e.g., web page layout, terminology and web site data, ease of learning, web site capabilities, navigation, completing tasks) and will allow this information to be managed without authors being overwhelmed.
If a usability issue concerns poor site structure, then Docman can identify the collection of objects that affect poor structure and Webtree can find the pages and generate a good structure to browse the problem. We already have this in place.
So, to cope with the active part of the web and to provide designers with a manageable view of it, tools such as Docman, Webtree and Usermetric [currently being implemented] can be integrated within the same environment to help designers. Pleasingly, all of the features we describe to help managers and authors can also be reused as tools to help users, whether to find what they want, or to comment effectively on what they have found.
Automating usability results and incorporating them into Docman and Webtree will mean that changes to a web site can be automated, directed and managed. Designers of a site would be able to incorporate feedback received from users (real and formal) into the design cycle to enhance design, thus closing the design loop but without overwhelming designers with unnecessary detail. One reason for software projects failing is that designers often fail to take into account their evolving nature (Landauer, 1995). Our approach to the active web is also applicable to the wider context of general interactive systems design where requirements, expectations and technologies are evolving.
Acknowledgements
This work was supported by EPSRC Grant GR/K79376. Thomas Tan is a PhD student partially funded by Macmillan, Ltd.
References
K. Bharat, T. Kamba and M. Albers (1998), "Personalized, Interactive News on the Web," ACM Multimedia Systems, 6(5), pp349–358.
J.P. Chin, V.A. Diehl and K.L. Norman (1988), "Development of an instrument measuring user satisfaction of the human-computer interface," CHI’88, pp213–218.
K. Mei-Ko (1962), "Graphic Programming Using Odd or Even Points," Chinese Mathematics, 1, pp273–277.
T. Landauer (1995), The trouble with computers: Usefulness, usability and productivity, MIT Press.
G. E. Marsden, G. J. Palmer & H. W. Thimbleby (1997), "Benjamin Franklin House: An Illustration of a Site Management and Visual Design Tool for Complex, Multi-authored Web Sites," (abstract only), in S. Lobodzinski and I. Tomek, editors, WebNet’97, World Conference of the WWW, Internet, & Intranet, Toronto, pp688, Association for the Advancement of Computing in Education (AACE).
J. Nielsen (1998), "Cost of User Testing a Website,"
G. S. Peace (1993), Taguchi Methods: A Hands-On Approach, Addison-Wesley Publishing Company, Reading MA.
Y. L. Theng (1997), Addressing the "lost in hyperspace" problem in hypertext, PhD Thesis, Middlesex University, London.
Y. L. Theng and H. W. Thimbleby (in press), "Addressing design and usability issues in hypertext and on the World Wide Web by re-examining the ‘lost in hyperspace’ problem," Journal of Universal Computer Science.
Y. L. Theng and H. W. Thimbleby (submitted), "Design principles and guidelines as usability measures for interactive systems," Submitted to CHI’99.
H. W. Thimbleby (1997a), "Gentler: A Tool for Systematic Web Authoring," International Journal of Human Computer Studies, 47(1), pp139–168.
H. W. Thimbleby (1997b), "Distributed Web Authoring," in S. Lobodzinski and I. Tomek, editors, WebNet’97, World Conference of the WWW, Internet, & Intranet, Toronto, pp1056–1083, Association for the Advancement of Computing in Education (AACE).