Formality Considered Harmful: Experiences, Emerging Themes, and Directions

Frank M. Shipman III and Catherine C. Marshall

Xerox Palo Alto Research Center
3333 Coyote Hill Road
Palo Alto, CA 94304
(415) 812 - 4740 
E-mail: {shipman, marshall}


This paper reviews experiences in designing, developing, and working with the users of a variety of interactive computer systems. The authors suggest, based on a number of experiences, that the cause of a number of unexpected difficulties in human-computer interaction is users' unwillingness to make structure, content, or procedures explicit. Besides recounting experiences with system use, this paper discusses why users are often justified in rejecting formalisms and how system designers can anticipate and compensate for problems users have in making implicit aspects of their tasks explicit. Incremental and system-assisted formalization mechanisms, as well as techniques to evaluate the task situation, are proposed as approaches to this problem.

KEYWORDS: Formalization, structure, hypermedia, argumentation, knowledge-based systems, groupware, representation.


When people use computer systems, their interaction is usually mediated by abstract representations that describe and constrain some aspect of their work or its content. Computer systems use these abstract representations to support their users' activities in a variety of ways: by structuring a task or users' work practices, by providing users with computational services such as information management and retrieval, or by simply making it possible for the system to process users' data. These abstractions are frequently referred to as formalisms.

When formalisms are embedded in computer systems, users must often engage in activities that might not ordinarily be part of their tasks: breaking information into chunks, characterizing information via keywords, categorizing information, or specifying relations between pieces of information. For example, in the Unix operating system, these activities might correspond to creating files, naming them, putting them in a directory structure, or describing dependencies in "make" files or making symbolic links between directories or files.

The abstract representations that computer systems impose on users may involve varying degrees and types of formalization beyond those that users are accustomed to. In some instances, little additional formalization is necessary to use a computer-based tool; text editors, such as vi or emacs, do not require additional formalization much beyond that demanded by other mechanisms for aiding in the production of linear text. Correspondingly, the computer can perform little additional processing without applying sophisticated content analysis techniques. In other cases, more formalization brings more computational power to bear on the task; idea processors and hypermedia writing tools demand more specification of structure, but they also provide functionality that allows users to reorganize text or present it on-line as a non-linear work. These systems and their embedded representations are referred to as semi-formal since they require some -- but not complete -- encoding of information into a schematic form. At the other end of the spectrum, formal systems require people to encode materials in a representation that can be fully interpreted by a computer program. When the degree or type of formalization demanded by a computer system exceeds that which a user expects, needs, or is willing to tolerate, the user will often reject the system.

In this paper, we suggest that creators of systems that support intellectual work like design, writing, or organizing and interpreting information are particularly at risk of expecting too great a level of formality from their users. To understand the effects of imposing or requiring formality, we draw on our own experiences designing and using such systems.

First, we draw lessons from some anecdotal accounts of our own experiences with these types of systems as well as corroborative reports by others. We discuss possible reasons why users reject formalisms, including issues associated with cognitive overhead, tacit knowledge, premature structure, and situational structure. We then propose some solutions for system designers who are trying to avoid making these same mistakes. In particular, we focus our proposals on mechanisms that are based on incremental system-assisted formalization and restructuring as people reconceptualize their tasks; we also consider ways designers can work with users to evaluate appropriate formalisms for the task at hand.


To understand how formalization influences system use and acceptance, we examine five different kinds of systems that support intellectual work: general purpose hypermedia systems, systems for capturing argumentation and design rationale, knowledge-based systems, groupware systems, and software engineering tools. Some of these systems, such as those designed to capture design rationale, are based on specific formalisms that reflect a prescriptive method or approach to the work; others, such as hypermedia systems, require that an arbitrary formal structure be developed given more abstract, less prescriptive, building blocks. Each type of system addresses a very different aspect of a user's work, but all advance our analysis of the underlying problems developers encounter when they field systems to support intellectual work.

The systems we use to frame our discussion have been successful by many metrics; yet they have all exhibited similar problems with user interaction that may be attributed to their underlying formalisms. To focus the discussion on these formalisms, we deliberately omit any description of the interfaces by which users interact with the formalisms. In so doing, we hope to expose a dangerously seductive line of reasoning: if you build the right interface to an embedded representation, users will formalize the desired aspect of their task domain.

What do systems supporting intellectual work require users to formalize? First, many hypermedia systems try to coerce their users into making structure explicit. With few exceptions, they provide facilities for users to divide text or other media into chunks (usually referred to as nodes), and define the ways in which these chunks are interconnected (as links). This formalism is intended as either an aid for navigation, or as a mechanism for expressing how information is organized without placing any formal requirements on content.

Systems that support argumentation and the capture of design rationale go a step further than general-purpose hypertext systems: they usually require the categorization of content within a prescriptive framework (for example, Rittel's Issue-Based Information Systems (IBIS) [Kunz, Rittel 70]) and the corresponding formalization of how these pieces of content are organized. This prescriptive framework is seen as providing a facilitative methodology for the task.

Knowledge-based systems are built with the expectation of processing content [Waterman 86]. Thus, to add or change knowledge that the system processes, users are required to encode domain structure and content in a well-defined representational scheme. This level of formalization is built into a system with the argument that users will receive significant payback for this extra effort.

Groupware systems supporting coordination are an interesting counterpoint to knowledge-based systems: they may not require users to formalize the structure of their information or its content, but rather their own interactions. This type of formalization allows the system to help coordinate activities between users, such as scheduling meetings or distributing information along a workflow [Ellis et al. 91].

Formalisms used in each of these types of systems involve computer-mediated communication or coordination with other humans, or the capture and organization of information. As a point of comparison, we also discuss efforts to support the software engineering process, systems designed to help people produce a formal artifact, a computer program.

By looking at this range of systems, we found that formalisms structure the activity in unexpected or unintended ways, are understood differently by different people, may be an unfamiliar addition to a formerly familiar task, may cause people to lose information that doesn't fit, and in general require people to make knowledge explicit that may be difficult or undesirable to articulate. Our discussions delve into these lessons, and illustrate them with specific anecdotes.

2.1 General Purpose Hypermedia

Hypermedia systems generally provide a semi-formal representation where chunks of text or other media, called nodes, can be connected via navigational links. An important goal of these links is to accommodate individualized reading patterns by supporting non-linear traversal of the document. Authors must formalize structure during the creation of such hyperdocuments.
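To make the node-and-link formalism concrete, the following is a minimal sketch and does not reproduce the data model of any system discussed here. The names Node, reachable, and doc are hypothetical. Even this three-page document requires the author to decide where each chunk ends and which links to create.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A chunk of text (or other media) that the author chose to separate out."""
    name: str
    text: str
    links: list[str] = field(default_factory=list)  # names of target nodes


def reachable(nodes: dict[str, Node], start: str) -> list[str]:
    """Return node names reachable from `start`, in depth-first visiting order."""
    seen: list[str] = []
    stack = [start]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.append(name)
        # Push targets in reverse so that the first link is followed first.
        stack.extend(reversed(nodes[name].links))
    return seen


# Hypothetical hyperdocument: three chunks and three navigational links.
doc = {
    "intro": Node("intro", "Overview of the report.", ["methods", "results"]),
    "methods": Node("methods", "How the data was gathered.", ["results"]),
    "results": Node("results", "What was found."),
}

print(reachable(doc, "intro"))
```

A reader may enter at any node, so the traversal order depends on the author's linking decisions and not on the order in which the text was written.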

Learning how to write, and to a lesser extent learning how to read, in a hypermedia system takes time. As users became accustomed to KMS [Akscyn et al. 88] during its use at Baylor College of Medicine, it became apparent that people do not easily accept new authoring modes. KMS is a page-based hypermedia system; authors record information on electronic pages which can be linked together with navigational links. Some beginning users would write hierarchical outlines and full pages of text which they would connect by a single link to the next page of text, as if they were still using an outlining tool and word processor. By defaulting to the authoring practices of systems previously experienced, they avoided the decision of what information should be chunked together or what links should be created. Information that fit on a page became a chunk with a link to the next page. Thus the new medium, with its unfamiliar formalism, combined with existing practice to yield unexpected results.

Experiences with internal use of early prototypes of the Virtual Notebook System (VNS) [Shipman et al. 89], another page-based hypermedia system, also illustrated the added difficulties of chunking and linking information. Organizational conventions were decided upon within groups sharing "notebooks." These high-level conventions aided in understanding information from other users' notebooks, but there was still a lot of variation between individuals in the amount of information on a page and the number of links created. More recent usage of the VNS outside of the development community shows a large variance in use of the system [Brunet et al. 91]. The heavy users build up sets of structured templates for reuse, thus reducing the overhead involved in adding structure to the information they are entering. In this case, the new medium placed additional requirements on the task; the use of formalisms had to be negotiated within the workgroup.

NoteCards [Halasz et al. 87] is a hypermedia environment that uses an index card metaphor. Many NoteCards users reported similar problems writing non-linear narrative in the new medium. The second author's experiences training information analysts to use NoteCards revealed that they had difficulties chunking information into cards ("How big is an idea? Can I put more than one paragraph on a card?"), naming cards ("What do I call this?"), and filing cards ("Where do I put this?"). Typed links -- the strongest formalization mechanism provided -- were rarely used, and when they were, they were seldom used consistently. Both link direction and link semantics proved to be problematic. For example, links nominalized as "explanations" sometimes connected explanatory text with the cards being explained; other times, the direction was reversed. Furthermore, the addition of "example" links confounded the semantics of earlier explanation links; an example could easily be thought of as an explanation. Monty documents similar problems in her observations of a single analyst structuring information in NoteCards in [Monty 90].

Unlike NoteCards, which supported the capture of specific interconnections between chunks of information, Aquanet [Marshall et al. 91] has a substantially more complex model of hypertext that involves a user-defined frame-like knowledge representation scheme [Minsky 75] with a graphical presentation component. We observed that even sophisticated users with a background in knowledge representation had problems formalizing previously implicit structures. A case study of a large-scale analysis task documents these experiences in [Marshall, Rogers 92].

2.2 Argumentation and Design Rationale

Recently there have been many different proposals for embedding specific representations in systems to capture argumentation and design rationale. Some of them use variations on Toulmin's micro-argument structure [Toulmin 58] or Rittel's issue-based information system (IBIS) [Kunz, Rittel 70]; others invent new schemes like Lee's design representation language [Lee 90] or MacLean and colleagues' Question-Option-Criteria [MacLean et al. 91].

Some design theorists claim that records of formal argumentation or design rationale will yield far-reaching benefits: shorter production time, lower maintenance costs on products, and better designs [Jarczyk et al. 92]. Experiments with mechanisms to capture design rationale -- from McCall et al.'s use of PHI [McCall et al. 83] to Yakemovic and Conklin's use of itIBIS [Yakemovic, Conklin 90] -- can be interpreted both as successes and as failures. The methods resulted in long-term cost reductions, but success relied on severe social pressure, extensive training, or continuing human facilitation. In fact, Conklin and Yakemovic reported that they had little success in persuading other groups to use itIBIS outside of Yakemovic's development team, and that meeting minutes had to be converted to a more conventional prose form to engage any of these outside groups [Conklin, Yakemovic 91]. Like the general-purpose hypermedia systems, argumentation and design rationale systems force their users to chunk and categorize information according to its role, such as issue, position, or argument. Users of these methods must then specify connections between chunks, such as answers, supports, or contradicts links. We have noticed several problems users have in effectively formalizing their design rationale or argumentation in this type of system; these problems can be predicted from our experiences with hypermedia.
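The categorization burden can be illustrated with a small sketch of an IBIS-like scheme. It is a simplification written for this discussion, not the representation used by gIBIS, itIBIS, or PHI. The class names, the ALLOWED table, and the sample statements are all hypothetical. The assumption built in is that positions answer issues and that arguments support or contradict positions.

```python
from enum import Enum


class Role(Enum):
    ISSUE = "issue"
    POSITION = "position"
    ARGUMENT = "argument"


# Assumed, simplified rules: (source role, link type) -> required target role.
ALLOWED = {
    (Role.POSITION, "answers"): Role.ISSUE,
    (Role.ARGUMENT, "supports"): Role.POSITION,
    (Role.ARGUMENT, "contradicts"): Role.POSITION,
}


class RationaleMap:
    """Stores statements by role and only accepts links the scheme permits."""

    def __init__(self) -> None:
        self.roles: dict[str, Role] = {}
        self.links: list[tuple[str, str, str]] = []

    def add(self, text: str, role: Role) -> None:
        # The user must commit to exactly one role for every statement.
        self.roles[text] = role

    def link(self, source: str, link_type: str, target: str) -> None:
        expected = ALLOWED.get((self.roles[source], link_type))
        if expected is not self.roles[target]:
            raise ValueError(
                f"{self.roles[source].value} cannot {link_type!r} "
                f"a {self.roles[target].value}"
            )
        self.links.append((source, link_type, target))


# Hypothetical design discussion.
issue = "Which database should we use?"
position = "Use the existing relational store."
argument = "The team already knows it."

m = RationaleMap()
m.add(issue, Role.ISSUE)
m.add(position, Role.POSITION)
m.add(argument, Role.ARGUMENT)
m.link(position, "answers", issue)
m.link(argument, "supports", position)
```

A statement such as "Use the relational store because the team knows it" has no natural home in this scheme. The user must either split it into a position and an argument or file the whole statement under a single role.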

First, people aren't always able to chunk intertwined ideas; we have observed, for example, positions with arguments embedded in them. Second, people seldom agree on how information can be classified and related in this general scheme; what one person thinks is an argument may be an issue to someone else. Both authors have engaged in extended arguments with their collaborators on how pieces of design rationale or arguments were interrelated, and about the general heuristics for encoding statements in the world as pieces of one of these representation schemes (see [Marshall et al. 91] for a short discussion of collaborative experiences using Toulmin structures). Finally, there is always information that falls between the cracks, no matter how well thought out the formal representation is. Conklin and Begeman document this latter problem in their experiences with gIBIS [Conklin, Begeman 88].

2.3 Knowledge-Based Systems

Knowledge-based systems have long endorsed the goal of having users add or correct knowledge in the system. End-user knowledge acquisition imposes a number of formalization requirements on users. Users must learn the knowledge representation used by the system, even if it is hidden by a good interface, or else they will not fully understand the effects of their changes.

Peper and colleagues took a different approach to the problem of creating expert systems that users can modify [Peper et al. 89]. They eliminated the inference engine, leaving a hypermedia interface in which users were asked questions and, based on their answers, were directed to a new point in the document. For example, a user might see a page asking the question, "Did the warning light come on?" with two answers, "Yes" and "No". Each answer is a link to further questions or information based upon the answer to the previous question. With this system, users could add new questions or edit old questions in English since the computer was not processing the information. By reducing the need for formalized knowledge, they achieved an advantage in producing a modifiable system, though at the cost of sacrificing inferencing.
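The flavor of this approach can be conveyed with a short sketch in which each page holds free text and each answer is simply a link to another page. This is an illustration under our own assumptions, not Peper and colleagues' implementation. Apart from the warning-light question, the pages, the Page class, and the walk function are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    text: str  # free-form English; the system never interprets it
    answers: dict[str, str] = field(default_factory=dict)  # answer label -> next page id


def walk(pages: dict[str, Page], start: str, replies: list[str]) -> list[str]:
    """Follow one answer link per reply and return the ids of the pages visited."""
    visited = [start]
    current = start
    for reply in replies:
        current = pages[current].answers[reply]
        visited.append(current)
    return visited


pages = {
    "light": Page("Did the warning light come on?", {"Yes": "cable", "No": "ok"}),
    "cable": Page("Is the power cable firmly seated?", {"Yes": "service", "No": "reseat"}),
    "ok": Page("No fault indicated."),
    "service": Page("Call for service."),
    "reseat": Page("Reseat the cable and try again."),
}

# Editing a question is ordinary text editing; there are no rules or classes to learn.
pages["cable"].text = "Is the power cable plugged in at both ends?"

print(walk(pages, "light", ["Yes", "No"]))
```

Because the program only follows links, it cannot combine answers or draw conclusions on its own. That is the inferencing given up in exchange for modifiability.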

Another approach is demonstrated by the end-user modifiability (EUM) tools developed to support designers in modifying and creating formal domain knowledge with task agendas, explanations, and examples [Fischer, Girgensohn 90]. In a description of user studies on EUM tools, Girgensohn notes that most of the problems found in the last round of testing "were related to system concepts such as classes or rules" [Girgensohn 92]. In short, these user studies revealed that, although the EUM tools made the input of knowledge significantly easier, users still had problems manipulating the formalisms imposed by the underlying system.

2.4 Groupware Systems

Groupware systems that require the formalization of procedure and interaction have suffered many of the same problems as systems that enforce formalization of structure and content. For example, systems that extend electronic mail by attaching properties or types to messages require their users to classify exactly what type of message they are sending or what type of reply is acceptable. Experiences with systems like the Coordinator [Winograd, Flores 86] and Information Lens [Malone et al. 86] point out that many users ignore the formal aspects of such systems, and generally use them as basic electronic mail systems [Bullen, Bennett 90].

Coordination-oriented systems have the additional burden of formalizing social practices which are largely left implicit in normal human-human interactions. Automatic scheduling systems, for example, have met with limited acceptance [Grudin 88]; users have proven to be unwilling to describe their normal decision methods for whether and when to schedule a meeting with other people. The same rules of scheduling that apply to your boss do not apply to a stranger, but making such differences explicit is not only difficult, but also socially undesirable.

Experiences with workflow systems, systems which automatically route information and work through defined procedures, show that systems without the ability to handle exceptions to the formalized procedure cannot support the large number of cases when exceptional procedures are required [Ellis et al. 91]. Arguably, almost all office procedures turn out to be exceptions to the prescribed form [Suchman 87]. Also, determining what procedures to encode can be difficult. Do the procedures written in the corporate manual get encoded, or those that are actually followed? How are the actual procedures obtained? Does the encoding of the actual procedures give them legitimacy that will be resisted by those pushing the corporate procedures? Formalization of such information can quickly lead to a political battle whose first casualty is the workflow system.

2.5 The Lesson of Software Engineering

Tools to support the software engineering process echo many of the difficulties we have already described. As in the above situations, people (in this case including programmers) are required to explicitly communicate information to a computer. The interfaces through which this communication occurs, in the form of specification tools or programming languages, are often part of the problem, but they only contribute what Brooks calls "accidental complexity" [Brooks 87] to the overall task. Whether a person uses popup menus, dialog boxes, "English-like" formal languages, or low level programming languages to state the information explicitly, the person must still know what they want to state, be it a relationship between two pieces of text or a complex algorithm.

In software engineering, deciding what needs to be stated explicitly (the specification and actual program code) has been termed "up-stream activity" to distinguish it from the "down-stream activity" of instantiating the specification [Myers 85]. Software engineering tools that focus on the up-stream activity are meant to support the process of coming up with a specification, the storage and retrieval of information associated with this process, and visualization of the result. While the results are still out on the success of up-stream design tools, their methods could be used to support formalization in the other four classes of systems.


The broad range of examples we discuss in the previous section highlights the ubiquity of the problems associated with enforced formalization. In this section we explain why we believe that the users are making the right decisions, in some sense, by resisting premature, unnecessary, meaningless, or cognitively expensive formalization.

From the user's perspective formalization poses many risks. "What if I commit to this formalization only to later find out it is wrong?" "What do I do when the ideas or knowledge is tacit and I cannot formalize it?" "Why should I spend my time formalizing this when I have other things to be doing?" "Why should I formalize this when I cannot agree with anyone else on what the formalization should be?" These are all valid questions and the answers that systems provide are often insufficient to convince people to use a system's formal aspects.

3.1 Cognitive overhead

There are many cognitive costs associated with adding formalized information to a computer system. Foremost, users must learn a system's formal language. Some domains, such as circuit design, have specific formal languages (e.g., circuit diagrams) to describe a certain type of information. More generic formal languages, such as production rules or frames, are almost never used for tasks not dealing with a computer. While knowledge-based support mechanisms and interfaces can improve the ability of users to successfully use formal languages, Girgensohn's experience shows that system concepts related to underlying representations still pose major obstacles for their use [Girgensohn 92].

Even knowing a system's formal language, users face a mismatch between their understanding of the information and the system's formal representation; they face a conceptual gap between the goals of the user and the interface provided by the system. Norman describes the requirements to bridge this gap or "gulf of execution":

"The gap from goals to physical system is bridged in four segments: intention formation, specifying the action sequence, executing the action, and, finally, making contact with the input mechanisms of the interface." [Norman 86] (page 39)

As this implies, formalisms are often difficult for people to use because of the many extra steps required to specify anything. Many of these extra decisions concern chunking, linking, and labeling, where formal languages require much more explicitly defined boundaries, connections between chunks, and labels for such connections than their informal counterparts.

The obstacle created by this conceptual gap between users' goals and systems' formal languages was observed in an early prototype of the Virtual Notebook System's "interest profile matcher." The goal of the profile matcher was to enable users of the system to locate other users with certain interests and expertise. The vocabulary used in profiles was the Medical Subject Headings (MeSH), a set of around 20,000 terms divided into about twelve interconnected trees (forming a directed acyclic graph) which is used by medical journals to index articles. Defining an interest profile required choosing terms out of the hierarchies of concepts which best described one's interests. Queries for locating people also required choosing terms from MeSH and attaching "matching ranges" so that all terms in a given range in the MeSH hierarchies would be considered a match. The matching ranges were necessary because MeSH was large enough to experience the vocabulary problem [Furnas et al. 87], in which people use different terms to describe the same topic. With the increase in expressiveness in queries came an increase in the difficulty of defining queries. Work on the profile matcher was discontinued because the effort required to define interests and queries of sufficient clarity outweighed the usefulness of the service the system was to provide.
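The kind of decision a matching range asks of the user can be sketched as follows. The prototype's actual matching algorithm is not described here, so this sketch assumes that a range is a maximum number of parent or child steps from the query term. The vocabulary is a hypothetical toy hierarchy, not actual MeSH content, and all function names are ours.

```python
from collections import deque

# Hypothetical toy vocabulary (not actual MeSH): child term -> parent terms.
PARENTS = {
    "Diseases": [],
    "Cardiovascular Diseases": ["Diseases"],
    "Heart Diseases": ["Cardiovascular Diseases"],
    "Arrhythmia": ["Heart Diseases"],
    "Neoplasms": ["Diseases"],
}


def neighbors(term: str) -> list[str]:
    """Terms one step away: the term's parents plus its children."""
    children = [t for t, parents in PARENTS.items() if term in parents]
    return PARENTS[term] + children


def within_range(term: str, max_steps: int) -> set[str]:
    """Return every term at most `max_steps` edges from `term` (assumed semantics)."""
    found = {term}
    queue = deque([(term, 0)])
    while queue:
        current, dist = queue.popleft()
        if dist == max_steps:
            continue
        for nxt in neighbors(current):
            if nxt not in found:
                found.add(nxt)
                queue.append((nxt, dist + 1))
    return found


def matching_users(profiles: dict[str, set[str]], query: list[tuple[str, int]]) -> list[str]:
    """A user matches if any profile term falls inside any (term, range) of the query."""
    hits = [
        user
        for user, terms in profiles.items()
        if any(terms & within_range(term, steps) for term, steps in query)
    ]
    return sorted(hits)


profiles = {"alice": {"Arrhythmia"}, "bob": {"Neoplasms"}}

print(matching_users(profiles, [("Heart Diseases", 1)]))
```

Even in this toy setting, the person writing the query must decide how far up and down the hierarchy a match should reach. A range of 0 finds no one, a range of 1 finds one colleague, and a range of 3 matches everyone.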

In an experiment in applying Assumption-based Truth Maintenance Systems (ATMS) derived dependency analysis (described in [de Kleer 86]) to networks of Toulmin micro-argument structures in NoteCards, one of the authors came to the conclusion that the cognitive cost was not commensurate with the results, even though dependency analysis had long been a goal of explicitly representing the reasoning in arguments. Although the hypertext representation of the informal syllogistic reasoning inherent to Toulmin structures (the data-claim-warrant triple) captures a dependency relationship, additional formalization is necessary to perform automated analysis by an ATMS model. In particular, it was important to identify assumptions and contradictory nodes. Not only was it difficult to identify contradictions in real data (belief was qualified rather than absolute) and impossible to track relative truth values over time, but also, and most importantly, by the time contradictions had been specified and relative truth values had been determined, a significant portion of the network evaluation had been done by the user. In this case, the additional processing done by the ATMS mechanism added little value.

3.2 Tacit knowledge

Tacit knowledge is knowledge users employ without being conscious of its use [Polanyi 66]. Tacit knowledge poses a particularly challenging problem for adding information to any system since it is not explicitly acknowledged by users. The problem of tacit knowledge has resulted in knowledge engineering methods aimed at exposing expertise not normally conscious in experts, such as one described by Mittal and Dym:

"We believe that experts cannot reliably give an account of their expertise: We have to exercise their expertise on real problems to extract and model their knowledge." [Mittal, Dym 85] (page 34)

When such introspection becomes necessary to produce and apply a formal representation during a task, it necessarily interrupts, structures, and changes the task. These changes may be detrimental to the user's ability to perform their task. Hutchins et al. are discussing such a modification of the task when they say:

"While moving the interface closer to the user's intentions may make it difficult to realize some intentions, changing the user's conception of the domain may prevent some intentions from arising at all. So while a well designed special purpose language may give the user a powerful way of thinking about the domain, it may also restrict the user's flexibility to think about the domain in different ways." [Hutchins et al. 86] (page 108)

An example of this interference is McCall's observation that design students have difficulty producing IBIS-style argumentation even though videotapes of their design sessions show that their naturally occurring discussions follow this structure [Fischer et al. 91]. A physiological example of the interference that making tacit knowledge conscious can cause is breathing (also from McCall). When a person is asked to breathe normally, their normal breathing will be interrupted. Furthermore, chances are that introspection about what normal breathing means will cause the person's breathing to become abnormal -- exaggeratedly shallow, overly deep, irregular.

Many of the representations that designers have embedded in systems to capture tacit knowledge are the result of an analysis of existing material. For example, argument representations are often derived from analyzing naturally occurring argumentative discourse: speech or text is broken into discourse units; the discourse units are categorized according to their functional roles; then the relationship between discourse units is described in general terms. But, as we can see from the IBIS example above, post hoc analysis is very different from generation. When these descriptive models are given to users, they find it very difficult to formalize knowledge as they are generating or producing it.

3.3 Premature Structure

One well-known reason why users will not formalize is the negative effect of prematurely or unnecessarily imposing a structure [Halasz 88][Shum 91]. Migrating information from one formalization to another may be more difficult than formalizing the information from an informal state.

In his studies of how people organized information in their offices Malone found that office workers perceived the negative effects of prematurely structuring information [Malone 83]. In particular, one of the subjects in Malone's study said of a pile of papers waiting to be filed:

"You don't want to put it away because that way you'll never come across it again. ... it's almost like leaving them out means I don't have to characterize them. ... Leaving them out means that I defer for now having to decide--either having to make use of, decide how to use them, or decide where to put them." [Malone 83] (page 107)

This quote points out the perception that information formalized incorrectly or inconsistently will be more difficult to use or simply be of less use than information not formalized. This problem can also be observed in the directory structures of UNIX, Mac OS, or DOS. Many users have large numbers of disassociated files at the top level directory (or folder) of their machine or account. Many of these users know how to create subdirectories or folders to organize their files but postpone classification until they "have more time" or "the mess gets too bad." For these users the perceived benefit of organizing their files does not make up for the effort required to organize the files and the possible cost of mischaracterizing the files.

3.4 Different people, different tasks: situational structure

The difficulties of creating useful formalizations to support individuals are compounded when different people must share the formalization. An analogy can be drawn between collaborative formalization and writing a legal document for multiple parties who have different goals. The best one can hope for in either case is a result sufficiently vague that it can be interpreted in an acceptable way to all the participants; ambiguity and imprecision are used in a productive way. Formalization makes such agreements difficult because it requires the formalized information to be stated explicitly so that there is little room for different interpretations.

For different people to agree on a formalization they must agree on the chunking, the labelling, and the linking of the information. As has been discussed in the context of earlier examples in the use of tools to capture design rationale, negotiating how information is encoded in a fixed representation is at best difficult.

Differences occur not just within a group of users but between groups as well. A study of the communication patterns in biomedical research groups showed that the characteristics of the research being performed influenced the organization and communication of the research groups [Gorry et al. 78]. A system which attempts to impose a particular structure on communication will likely not match the appropriate communication structure for any given group.

The problem of situational structure does not occur only when multiple people use the same structure but can also occur when the user's task changes. The context of the new task may not match the existing structuring scheme. In listing what are commonly considered the most important properties of a formal system, Winograd and Flores include:

"There is a mapping through which the relevant properties of the domain can be represented by symbol structures. This mapping is systematic in that a community of programmers can agree as to what a given structure represents." [Winograd, Flores 86] (page 85)

Experience and intuition seem to indicate that domains for which this is true may be quite small and task dependent. Anecdotal evidence shows that a representation that is suitable for one task may not be appropriate for a very similar related task. For example, Marshall and Rogers describe [Marshall, Rogers 92] how a representation developed for the process of assessing foreign machine translation efforts proved to be of limited value in the closely related task of evaluating Spanish-English machine translation software. The second task shared a subset of the content with the first task, but the representation did not formalize appropriate aspects of the material. Attributes like speed and accuracy as well as cost and computer platform turned out to be very important in evaluating software, but only of secondary importance in a general assessment of the field, while in the general assessment of the field, the technical approach of the various systems was deemed important.

In short, different situations require different user support and thus different formalized structures [Suchman 87].


Although difficulties caused by formalization are widespread and users are justified in their resistance to or rejection of some formalization tasks, there are some partial solutions to this system design dilemma. First, designers should decide what information must be formalized for the task to be performed and provide for that. Second, designers should identify what other services or user benefits the computer can provide based on trade-offs introduced by additional formalization. Finally, and most importantly, designers should expect, allow, and support reconceptualization and incremental formalization in longer tasks.

4.1 Essentials for Task

Some information must be formalized for a computer system to perform almost any task. A word processor must know the order of characters, a drawing program must know the color and shape of objects being drawn, and a circuit analyzer must know the logical circuit design. Interaction based on a limited-domain formalism can become transparent when the user has become skillful in expressing information in the formalism. Failure to get the user to formalize information that is essential for the central task means rejection of the system.

But what is the central task for more general-purpose systems to support intellectual work and, informationally, what does it require? What must be formalized for a system to support the organization and sharing of information? Does the content just have to be entered into the system, or does extra information, such as hypertext links and labels, need to be specified for the system to work? To answer these questions, participatory design techniques can be applied to gain an understanding of the users' work practices and the formalisms necessary to support these practices [Greenbaum, Kyng 91].

4.2 The Non-Essential Cost/Benefit Trade-off

Many systems provide features which are not necessary for some uses of the system but are available for users who want the added benefits of providing more information. Font style and size could be considered such information in a word processor. Users can accept the default style and size to write a paper, and thus never have to explicitly state their preference, but they have the option to specify different fonts. Certainly many people seemed very happy to take advantage of this particular feature, placing many fonts on every page until some notion of aesthetics became common.

Other features may be much less widely used. Spreadsheet programs include many features which are used only by a small percentage of the user community [Nardi, Miller 90]. The rest of the users either get by without using the features, or ask for help when they cannot avoid using them. In designing information systems where formalization is required for use of some of the features, systems designers must balance the effects of cognitive inflation, which can leave services worth little compared to the cost of formalization.

Problems of scale, i.e., too many inferences for users to acknowledge each piece of inferred structure, lead to more automatic reasoning by the system instead of suggestions. This approach provides services to the user based on informally represented information; structures can be inferred from textual, spatial, temporal, or other patterns (see for example [Marshall, Shipman 93]). The system's inferences will be incorrect at times but, as long as the inferences are right part of the time and it is apparent to the user when the system has made the wrong inference, these features will cost the user little for the benefit they provide.
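As a minimal sketch of what inferring structure from a spatial pattern might look like, the following Python fragment groups notes whose left edges roughly align into implied lists. The Note type, the infer_columns function, and the x_tolerance parameter are hypothetical names introduced here for illustration; this simple alignment heuristic is an assumption and is not the recognition method of [Marshall, Shipman 93].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Note:
    """A hypothetical piece of informally placed text in a 2-D workspace."""
    text: str
    x: int
    y: int


def infer_columns(notes, x_tolerance=10):
    """Group notes whose x positions roughly align into implied lists.

    Assumption: alignment within x_tolerance of the first note in a
    column is treated as evidence that the notes belong together.
    """
    columns = []
    for note in sorted(notes, key=lambda n: (n.x, n.y)):
        for column in columns:
            if abs(column[0].x - note.x) <= x_tolerance:
                column.append(note)
                break
        else:
            # No existing column is close enough; start a new one.
            columns.append([note])
    # A lone note implies no structure, so only multi-note columns are kept.
    return [sorted(c, key=lambda n: n.y) for c in columns if len(c) > 1]


if __name__ == "__main__":
    notes = [Note("a", 0, 0), Note("b", 3, 40), Note("c", 100, 0), Note("d", 5, 20)]
    for column in infer_columns(notes):
        print([n.text for n in column])
```

Because the grouping is returned rather than stored, a wrong inference costs the user nothing beyond ignoring it, which is the property the paragraph above relies on.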

4.3 Gradual Formalization and Restructuring

Longer tasks necessarily involve reconceptualization; the gradual evolution of human understanding during task performance underlies many of the problems associated with formalization. Providing mechanisms for incremental formalization and restructuring is thus a fundamental way system designers can support intellectual activities with computational tools.

Incremental formalization has been demonstrated as a means of addressing problems we identified earlier that were associated with cognitive overhead, tacit knowledge, premature structure, and situational differences [Shipman 93]. We describe how each is addressed, and offer specific incremental formalization strategies: informal information entry and subsequent addition of structure; the use of informal media such as videotape; tailorable, situation-specific formalisms; and finally, user-directed recognition of implied structure.

Cognitive overhead

Incremental formalization strategies seek to reduce the overhead of entering information, and defer formalization of that information until later in the task. Initial expression is thus changed from a formal language description to an informal representation. Such a difference increased users' ability to modify the system in Peper's use of a hyperdocument instead of expert system rules [Peper et al. 89].

Incremental formalization divides up the overhead associated with formalizing information in the system by dividing up the process. By formalizing less information at a time the number of decisions concerning chunking, linking, and labelling that the users must make is reduced. The overhead associated with having to be explicit still exists for incremental formalization, but individual steps require less explicit information.
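The division of effort described above can be illustrated with a small Python sketch, assuming a hypothetical Workspace in which text is first entered informally and labels or links are added later, one decision at a time. The names Chunk, Workspace, jot, label, link, and unformalized are invented for this example and do not come from any system discussed in this paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(eq=False)
class Chunk:
    """A piece of information; only the informal text is required."""
    text: str
    label: Optional[str] = None  # formal type, added later if ever
    links: Dict[str, "Chunk"] = field(default_factory=dict)  # relation -> target


class Workspace:
    """Hypothetical store that never forces structure at entry time."""

    def __init__(self) -> None:
        self.chunks: List[Chunk] = []

    def jot(self, text: str) -> Chunk:
        # Informal entry: no label or link decisions are demanded here.
        chunk = Chunk(text)
        self.chunks.append(chunk)
        return chunk

    def label(self, chunk: Chunk, label: str) -> None:
        # One small formalization step: classify a single chunk.
        chunk.label = label

    def link(self, source: Chunk, relation: str, target: Chunk) -> None:
        # Another small step: relate two chunks that already exist.
        source.links[relation] = target

    def unformalized(self) -> List[Chunk]:
        # Chunks that still carry no formal structure at all.
        return [c for c in self.chunks if c.label is None and not c.links]


if __name__ == "__main__":
    ws = Workspace()
    a = ws.jot("Budget is tight this quarter")
    b = ws.jot("Ask Pat about travel")
    ws.label(a, "constraint")
    print([c.text for c in ws.unformalized()])
```

Each call to label or link is a separate, optional step, so the user decides about chunking, linking, and labelling only when ready; the cost of being explicit remains, but it is spread over the task.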

In determining the chunking, linking, and labelling of information, the user crosses the first of the four segments bridging Norman's gulf of execution, intention formation [Norman 86]. Support for the second and third segments of Norman's bridge, specifying the action sequence and executing the action, can be given by offering suggested specifications and easy execution of actions. These suggestions can be produced using patterns in textual [Shipman 93] or spatial [Marshall, Shipman 93] information.

Tacit knowledge

The problem of representing tacit knowledge is difficult for all systems requesting information from their users. Informally represented information can capture some of the users' tacit assumptions through their use of language and other informal media like video. While it is not likely that the computer will be able to process much of such implicit information (beyond providing tangential methods of retrieval and access), other users can interpret it.

Suggested formalizations also have the possibility of bringing previously tacit knowledge to consciousness. Recognized patterns in informal information may be the result of tacit knowledge, triggering the recognition that this information is important. Such an occurrence was reported in the use of Infoscope [Stevens 93], a system that suggests information filters based on the users' reading patterns of Usenet News. In this account, a particular suggestion triggered a user to better understand his or her goals.
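To make the idea of suggestions derived from reading patterns concrete, here is a rough Python sketch that proposes keyword filters from the subject lines of messages a user has chosen to read. The function name suggest_filters, the min_count threshold, and the stopword list are all assumptions made for this example; the sketch is not a description of how Infoscope actually computes its suggestions.

```python
from collections import Counter

# Hypothetical list of words too common to be useful as filters.
STOPWORDS = frozenset({"the", "a", "of", "re", "and", "to", "in", "for", "on"})


def suggest_filters(read_subjects, min_count=3):
    """Suggest keywords that recur in the subjects a user actually read.

    Each keyword is counted at most once per subject, so one long
    subject line cannot dominate the suggestion.
    """
    counts = Counter()
    for subject in read_subjects:
        words = {w.strip(".,:;!?").lower() for w in subject.split()}
        counts.update(w for w in words if w and w not in STOPWORDS)
    return sorted(w for w, n in counts.items() if n >= min_count)


if __name__ == "__main__":
    subjects = [
        "Re: hypertext links",
        "Hypertext conference",
        "hypertext tools and links",
        "lunch",
    ]
    print(suggest_filters(subjects))
```

The result is only a suggestion for the user to accept or reject; seeing a recurring term proposed as a filter is the kind of prompt that might bring a tacit interest to the user's attention.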

Premature structure

Systems that employ incremental formalization strategies do not require people to impose premature structure when they record information. Like the desks of the office workers in Malone's study [Malone 83], information in such systems can be kept without structure until the user wants to add structure. Also, since structure does not need to be added all at once, the user can add just the structure they feel comfortable adding, deferring other possible structuring until later.

Resistance to premature structure can still be a problem if users never feel ready to formalize. This problem is related to the problem of intention formation, which is discussed above as part of the problem of cognitive overhead.

Situated structure

The problem of using formally represented information for different situations is only partially addressed by incremental formalization. Incremental formalization requires that formal representations be able to evolve. By adding situation-specific formalisms, the user may modify existing situation-specific structures or add new ones as needed.

Structure imposed on information for one task may not be of use for a new task without major changes to the chunking, linking, and labelling of information. In the cases when there is no way to resolve the different structures necessary for the different situations, access to informally represented information, or methods for communicating with the original author [Ackerman 94], may still be useful in deriving the new required structures.


We have tried to describe the extent of the difficulties caused by systems that require users to formalize information. These problems are pervasive in systems designed to support intellectual work such as hypermedia, argumentation, knowledge-based systems, and groupware.

The difficulties that users have in formalizing information are not just an interface problem. Users are hesitant about formalization because of a fear of prematurely committing to a specific perspective on their tasks; it may also be difficult for them to formalize knowledge that is usually tacit. The added cost of formalizing information over using informal information makes formalized information far less attractive for users to provide. In a collaborative setting, people must agree on a formalization and the heuristics for encoding information into it.

There are decisions that system designers can make to reduce the need for formal information by systems and also methods to make it easier for users to provide this information. Systems should use the domain-oriented representations used to communicate unambiguously between humans when possible. Systems should provide services based on inferred structure in informally represented information. Finally, systems should support the process of incremental formalization and structure evolution as tasks are reconceptualized.

As system designers, it is tempting to add more whiz-bang features that rely on formalized information. We must temper that urge and consider the difficulty that the user will have providing that information before counting on it for the success of our systems.


We thank the members of the HCC group at the University of Colorado and the Collaborative Systems Area at Xerox PARC for discussions that have aided in the formation of these ideas. We also thank Jonathan Grudin and Tom Moran for reading and providing comments on versions of this paper.


[Ackerman 94] Ackerman, M.S. Definitional and Contextual Issues in Organizational and Group Memories. To appear in Proceedings of the Twenty-seventh Hawaii International Conference on System Sciences (HICSS'94) 1994.

[Akscyn et al. 88] Akscyn, R.M., McCracken, D.L., and Yoder, E.A. KMS: A Distributed Hypermedia System for Managing Knowledge in Organizations. Communications of the ACM 31, 7 (July 1988), pp. 820-835.

[Brooks 87] Brooks Jr., F.P. No Silver Bullet: Essence and Accidents of Software Engineering. IEEE Computer 20, 4 (April 1987), pp. 10-19.

[Brunet et al. 91] Brunet, L.W., Morrissey, C.T., and Gorry, G.A. Oral History and Information Technology: Human Voices of Assessment. Journal of Organizational Computing 1, 3 (1991), pp. 251-274.

[Bullen, Bennett 90] Bullen, C.V., and Bennett, J.L. Learning From User Experience With Groupware. In Proceedings of the Conference on Computer-Supported Cooperative Work (CSCW'90) (Los Angeles, Calif., Oct. 7-10). ACM, New York, 1990, pp. 291-302.

[Conklin, Begeman 88] Conklin, J., and Begeman, M. gIBIS: A Hypertext Tool for Exploratory Policy Discussion. In Proceedings of the Conference on Computer-Supported Cooperative Work (CSCW'88) (Portland, Oregon, Sept. 26-28). ACM, New York, 1988, pp. 140-152.

[Conklin, Yakemovic 91] Conklin, E.J., and Yakemovic, K.C. A Process-Oriented Approach to Design Rationale. Human Computer Interaction 6, 3-4 (1991), pp. 357-391.

[Ellis et al. 91] Ellis, C., Gibbs, S., and Rein, G. GroupWare: Some Issues and Experiences. Communications of the ACM 34, 1 (Jan. 1991), pp. 38-58.

[Fischer et al. 91] Fischer, G., Lemke, A.C., McCall, R., and Morch, A. Making Argumentation Serve Design. Human Computer Interaction 6, 3-4 (1991), pp. 393-419.

[Fischer, Girgensohn 90] Fischer, G., and Girgensohn, A. End-User Modifiability in Design Environments. In Human Factors in Computing Systems, CHI'90 Conference Proceedings (Seattle, Wash., April). ACM, New York, 1990, pp. 183-191.

[Furnas et al. 87] Furnas, G.W., Landauer, T.K., Gomez, L.M., and Dumais, S.T. The Vocabulary Problem in Human-System Communication. Communications of the ACM 30, 11 (Nov. 1987), pp. 964-971.

[Girgensohn 92] Girgensohn, A. End-User Modifiability in Knowledge-Based Design Environments. Ph.D. Dissertation, Department of Computer Science, University of Colorado, Boulder, Colorado, 1992.

[Gorry et al. 78] Gorry, G.A., Chamberlain, R.M., Price, B.S., DeBakey, M.E., and Gotto, A.M. Communication Patterns in a Biomedical Research Center. Journal of Medical Education 53, (1978), pp. 206-208.

[Greenbaum, Kyng 91] Greenbaum, J., and Kyng, M. (Eds.) Design at Work: Cooperative Design of Computer Systems. Lawrence Erlbaum Associates, Hillsdale, NJ, 1991.

[Grudin 88] Grudin, J. Why CSCW Applications Fail: Problems in the Design and Evaluation of Organizational Interfaces. In Proceedings of the Conference on Computer-Supported Cooperative Work (CSCW'88) (Portland, Oregon, Sept. 26-28). ACM, New York, 1988, pp. 85-93.

[Halasz et al. 87] Halasz, F.G., Moran, T.P., and Trigg, R.H. NoteCards in a Nutshell. In Human Factors in Computing Systems and Graphics Interface, CHI+GI'87 Conference Proceedings (Toronto, Canada, April). ACM, New York, 1987, pp. 45-52.

[Halasz 88] Halasz, F. Reflections on NoteCards: Seven Issues for the Next Generation of Hypermedia Systems. Communications of the ACM 31, 7 (July 1988), pp. 836-852.

[Hutchins et al. 86] Hutchins, E., Hollan, J., and Norman, D. Direct Manipulation Interfaces. In User Centered System Design, D. Norman, S. Draper, Eds. Lawrence Erlbaum Associates, Hillsdale, New Jersey, 1986, pp. 87-124.

[Jarczyk et al. 92] Jarczyk, A., Löffler, P., and Shipman, F. Design Rationale for Software Engineering: A Survey. In Proceedings of the 25th Annual Hawaii International Conference on System Sciences Vol. 2 (HICSS-92) (Jan.). IEEE, 1992, pp. 577-586.

[de Kleer 86] de Kleer, J. An Assumption-based TMS. Artificial Intelligence 28, (1986), pp. 127-162.

[Kunz, Rittel 70] Kunz, W., and Rittel, H.W.J. Issues as Elements of Information Systems. Working Paper 131, Center for Planning and Development Research, University of California, Berkeley, Calif., 1970.

[Lee 90] Lee, J. SIBYL: A Qualitative Decision Management System. In Artificial Intelligence at MIT: Expanding Frontiers, P. Winston, S. Shellard, Eds. The MIT Press, Cambridge, Mass., 1990, pp. 104-133.

[MacLean et al. 91] MacLean, A., Young, R., Bellotti, V., and Moran, T. Questions, Options, and Criteria: Elements of a Design Rationale for User Interfaces. Human Computer Interaction 6, 3-4 (1991), pp. 201-250.

[Malone 83] Malone, T.W. How do People Organize their Desks? Implications for the Design of Office Information Systems. ACM Transactions on Office Information Systems 1, 1 (January 1983), pp. 99-112.

[Malone et al. 86] Malone, T.W., Grant, K.R., Lai, K.-Y., Rao, R., and Rosenblitt, D. Semi-Structured Messages are Surprisingly Useful for Computer-Supported Coordination. In Proceedings of the Conference on Computer-Supported Cooperative Work (CSCW'86) (Austin, Texas, Dec.). 1986, pp. 102-114.

[Marshall et al. 91] Marshall, C., Halasz, F., Rogers, R., and Janssen, W. Aquanet: a hypertext tool to hold your knowledge in place. In Proceedings of Hypertext '91 (San Antonio, Texas, Dec. 15-18). ACM, New York, 1991, pp. 261-275.

[Marshall, Rogers 92] Marshall, C.C., and Rogers, R.A. Two Years before the Mist: Experiences with Aquanet. In Proceedings of European Conference on Hypertext (ECHT '92) (Milano, Italy, Dec. 1992). pp. 53-62.

[Marshall, Shipman 93] Marshall, C.C., and Shipman, F.M. Searching for the Missing Link: Implicit Structure in Spatial Hypertext. To appear in Proceedings of Hypertext '93 (Seattle, Wash., Nov. 14-18). ACM, New York, 1993.

[McCall et al. 83] McCall, R., Schaab, B., and Schuler, W. An Information Station for the Problem Solver: System Concepts. In Applications of Mini- and Microcomputers in Information, Documentation and Libraries, C. Keren, L. Perlmutter, Eds. New York: Elsevier, 1983.

[Minsky 75] Minsky, M. A Framework for Representing Knowledge. In The Psychology of Computer Vision, P. Winston, Ed. McGraw-Hill Book Company, New York, 1975, pp. 211-277.

[Mittal, Dym 85] Mittal, S., and Dym, C.L. Knowledge Acquisition from Multiple Experts. The AI Magazine (Summer, 1985), pp. 32-36.

[Monty 90] Monty, M.L. Issues for Supporting Notetaking and Note Using in the Computer Environment. Dissertation, Department of Psychology, University of California, San Diego, 1990.

[Myers 85] Myers, W. MCC: Planning the Revolution in Software. IEEE Software (November 1985), pp. 68-73.

[Nardi, Miller 90] Nardi, B.A., and Miller, J.R. An Ethnographic Study of Distributed Problem Solving in Spreadsheet Development. In Proceedings of the Conference on Computer-Supported Cooperative Work (CSCW'90) (Los Angeles, Calif., Oct. 7-10). ACM, New York, 1990, pp. 197-208.

[Norman 86] Norman, D. Cognitive Engineering. In User Centered System Design, D. Norman, S. Draper, Eds. Lawrence Erlbaum Associates, Hillsdale, New Jersey, 1986, pp. 31-61.

[Peper et al. 89] Peper, G., MacIntyre, C., and Keenan, J. Hypertext: A New Approach for Implementing an Expert System. In Proceedings of 1989 ITL Expert Systems Conference, 1989.

[Polanyi 66] Polanyi, M. The Tacit Dimension, Doubleday, Garden City, NY, 1966.

[Shipman et al. 89] Shipman, F.M., Chaney, R.J., and Gorry, G.A. Distributed Hypertext for Collaborative Research: The Virtual Notebook System. In Proceedings of Hypertext '89 (Pittsburgh, Penn., Nov. 5-8). ACM, New York, 1989, pp. 129-135.

[Shipman 93] Shipman, F.M. Supporting Knowledge-Base Evolution with Incremental Formalization. Ph.D. Dissertation, Department of Computer Science, University of Colorado, Boulder, Colorado, 1993.

[Shum 91] Shum, S. Cognitive Dimensions of Design Rationale. In People and Computers VI, D. Diaper and N.V. Hammond, Eds. Cambridge University Press, Cambridge, UK, 1991.

[Stevens 93] Stevens, C. Helping Users Locate and Organize Information. Ph.D. Dissertation, Department of Computer Science, University of Colorado, Boulder, Colorado, 1993.

[Suchman 87] Suchman, L.A. Plans and Situated Actions: The problem of human-machine communication. Cambridge University Press, Cambridge, UK, 1987.

[Toulmin 58] Toulmin, S. (Ed.) The Uses of Argument. Cambridge University Press, UK, 1958.

[Waterman 86] Waterman, D.A. A Guide to Expert Systems. Addison-Wesley, 1986, pp. 3-11.

[Winograd, Flores 86] Winograd, T., and Flores, F. Understanding Computers and Cognition: A New Foundation for Design. Ablex Publishing Corporation, Norwood, NJ, 1986.

[Yakemovic, Conklin 90] Yakemovic, K.C., and Conklin, E.J. Report of a Development Project Use of an Issue-Based Information System. In Proceedings of the Conference on Computer-Supported Cooperative Work (CSCW'90) (Los Angeles, Calif., Oct. 7-10), ACM, New York, 1990, pp. 105-118.