Lucy Suchman, Plans and Situated Actions: The Problem of Human-Machine Communication (Learning in Doing: Social, Cognitive and Computational Perspectives). New York: Cambridge U Press, 1987 (summary)
Are human intelligence and directed action similar to the ones ascribed to
the Trukese or to the European navigators?
European: the plan is derived from Universal Principles
Trukese: the objective is clear, but the course is contingent on the circumstances.
Looking at the difference between the representations of Trukese and European actions, we can infer one of the following:
1-The difference in acting is favored by the culture:
European culture favors abstract analytical thinking.
The Trukese learn a cumulative range of concrete, embodied responses. They favor wisdom and experience.
2-Our actions are ad hoc or planned according to the type of action and/or our expertise.
3-All actions are "situated actions". Plans are a weak resource, invoked afterwards to account for an activity.
In this book Suchman supports the third hypothesis.
The European navigator's view of action is reified in the design of intelligent machines. The view that purposeful action is determined by plans is deeply rooted in the Western human sciences as the correct model of the rational actor. The cognitive sciences are also based on this tradition.
The problem of mutual intelligibility, or shared understanding, has been a topic of the social sciences for the past hundred years. Recently, with the coming of new technologies, a new notion has emerged: interaction between humans and machines. But this talk of interaction has not considered recent developments in the social sciences regarding the foundations of human interaction.
Goal: what constitutes purposeful action?
What is interaction?
An old problem is "the relationship between observable behavior and the processes, not available to direct observation, that make behavior meaningful."
For psychology these processes are cognitive. For sociology they are interactional. But the final result is that behavior must be meaningful. A new manifestation of this old problem arises in AI. The purpose of AI is to build machines that simulate cognitive processes. But in the end they must produce something that is accountably rational in the eyes of another (Turing test).
Computer artifacts are built on an underlying conception: the planning model of human action. She analyzes one such artifact and proposes an alternative view of plans, not as something in the actor's head but as formulations of antecedent conditions and consequences of action that account for action in a plausible way.
She argues against the planning model, and proposes a situated action model.
Computers challenge the distinction between the physical and the social, between:
-things that one designs, builds, and uses
-things with which one communicates
Interaction between people and machines implies mutual intelligibility, or shared understanding. So main questions are:
How do we account for the shared understanding, or mutual intelligibility?
How could there be mutual intelligibility between people and machines?
Cognitive scientists contend that the mind is neither substantial nor insubstantial: it is an abstractable structure implementable in any number of possible physical substrates. This decouples intelligence from things uniquely human and opens the way for the construction of intelligent artifacts.
This position is important for cognitive science because:
-At the beginning of the 20th century the method for studying human mental life was introspection, but introspection is not scientific.
-Behaviorists (1920s) posited that all human action should be understandable in terms of observable and mechanistically describable relationships between the organism and its environment.
-Cognitive science: mentalist constructs may now be studied using technology and AI. If a theory of underlying mental processes can be modeled on the computer so as to produce the right outward behavior, then the theory can be considered valid. Cognitive scientists put a "mental process" between the environment and the organism. People act on the basis of symbolic representations. Cognition is not like computation, it is computation.
The idea of Human computer interaction
The attribution of purpose to artifacts or use of intentional vocabulary derives from:
1-The computer is interactive: it reacts in real time to each user action.
2- The means for controlling computer and the results are linguistic.
Some researchers have tried to clarify the differences between computer use and human conversation: robustness, sensitivity to user expectations, ability to resolve ambiguous input with questions, limited knowledge of the domain, and so on. The problem is that this list is ad hoc and has not been tested against the nature of human communication. The basic question of what human interaction comprises is left out.
"To conduct a dialogue in as human-like a way as possible" is a controversial goal: if users see a hint of human dialogue they expect the system to exhibit its full range, while in reality the system's capabilities are very limited.
3-Internal complexity and opacity of the computer invites an intentional stance: it is our inability to see inside each other's head that makes intentional explanation so powerful in the interpretation of human action.
Self explanatory artifacts
Designers have always tried to build tools that explain themselves: the user should be able to understand the artifact's use, or the designer's intention regarding its use.
Computers are a particular kind of tool: they may actually "explain" themselves. This adds another dimension: computers should be intelligent, that is, able to understand the actions of the user and to provide for the rationality of their own. From here it is a short step to the idea that a computer could actually instruct, approximating the behavior of a human coach.
The computer as an artifact designed for a purpose
The computer as an artifact having purposes
Interaction imposes the new idea that intelligibility of an
artifact is not just a matter of the availability to the user of the designer's
intentions for the artifact, but of the intentions of the artifact itself.
The designer must now imbue the machine with the grounds for behaving in ways that are accountably rational.
1950: Turing proposes a test for machine intelligence based on a view of intelligence as accountably rational behavior, independent of the mechanisms used.
Controversy surrounding Weizenbaum's ELIZA: Weizenbaum declares
the program not intelligent, but it fulfills Turing's test. Why? The design of
ELIZA and similar programs exploits the natural tendency of humans to find sense
in actions that are assumed to be purposeful or meaningful. This has been termed
the documentary method of interpretation.
Nonetheless, even though Weizenbaum unmasks the apparent intelligence of his program, he thinks it can be improved to make conversation with a computer possible. He thinks that, intelligence aside, the shortcoming of his program is its lack of robustness: the program needs to make its misunderstandings evident and ask the user for clarification.
So, for Weizenbaum, human machine communication is possible even without intelligence.
Communication and actions are strictly related: communication involves assumptions about the intelligibility of actions.
There are two main views of action:
1-the organization and significance of actions rely on underlying plans.
2-actions depend on local interactions contingent on the actor's particular circumstances.
The planning model draws on three theories:
1-The planning model
2-Speech act theory
3-Shared Background knowledge
The planning model
The planning model in cognitive science treats a plan as a
sequence of actions designed to accomplish some preconceived ends.
Action is a form of problem solving, where the actor's problem is to find a path from an initial state to a desired goal.
-Plan generation. The first programs were robots in an impoverished environment: the robots follow the plan and once in a while check their position in the environment. A later program (NOAH) monitors user actions: it has a network of partially ordered actions, submits an action to the user, and monitors the answer. A positive answer means the user understood the instruction. A negative answer is taken as a request for information. The computer's environment is a social one and requires interpretation of the user's actions.
-Interaction and plan recognition. In AI, interaction was accomplished by treating other actors as additional variables in the environment: the conditions of the environment now include other actors. The problem of interaction, in this view, is to recognize the actions of others as the expression of their underlying plans.
-The status of plans. There is confusion in the literature about how a plan is treated. Sometimes it is a framework for the analysis of action, used to understand the goals and actions of the actors. At other times plans are treated as psychological procedures that directly control behavior: "A plan is any hierarchical process in the organism that can control the order in which a sequence of operations is to be performed." This is a psychological "process theory".
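The planning model's picture of action as problem solving, that is, finding a path from an initial state to a desired goal, can be sketched as a small state-space search. This is only an illustration, not a reconstruction of any program discussed in the book: the function name `plan`, the breadth-first strategy, and the toy world of positions 0 to 3 are all assumptions made for the example.

```python
from collections import deque


def plan(initial, goal, actions):
    """Breadth-first search for a sequence of action names from initial to goal.

    `actions` maps an action name to a function state -> next state,
    returning None when the action cannot be applied in that state.
    """
    frontier = deque([(initial, [])])
    seen = {initial}
    while frontier:
        state, path = frontier.popleft()
        if state == goal:
            return path  # the "plan": an ordered list of action names
        for name, apply_action in actions.items():
            nxt = apply_action(state)
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, path + [name]))
    return None  # no path exists in this model of the world


# Hypothetical toy world: a robot on a line of positions 0..3.
toy_actions = {
    "step_right": lambda s: s + 1 if s < 3 else None,
    "step_left": lambda s: s - 1 if s > 0 else None,
}

route = plan(0, 2, toy_actions)
```

The plan is computed entirely in advance from a model of the world. Nothing in it responds to circumstances that arise while it is being carried out, which is the feature of the planning model that the situated action view questions.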
Language is a form of action: language understanding involves an analysis of a speaker's utterances in terms of the plans those utterances serve. The research problem of language understanding is therefore the same as that of the planning model. Example: A has a goal. A's plan involves asking B for some information that A needs to reach the goal. A executes the plan and asks B the question. B tries to understand A's plan.
What we understand of other people's actions depends, among other things, on assumptions, or background knowledge. For cognitive science the background of action is not the world but knowledge about the world. But the representation of this knowledge has turned out to be problematic in AI. Common-sense knowledge remains intractable: there are infinitely many details to consider, the list is never complete, and the details are not really taken into consideration at the moment of acting. Nevertheless, the image of "shared knowledge" as an enumerable body of implicit assumptions is assumed to stand behind every action. But is this assumption correct?
The phenomenological tradition and Garfinkel challenge this position: "a background assumption is generated by the activity of accounting for an action when the premise of the action is called in question. But there is no particular reason to believe that the assumption actually characterizes the actor's mental state prior to the act." Something is taken for granted when it is not problematic. When there are problems, the context provides resources for solving them.
The problem with cognitive science is its project of substituting definite procedures for vague plans, and representations of the situation of action for action's actual circumstances.
She shows problems with all three theories: the planning model itself, speech act theory and background knowledge.
Recent efforts within anthropology and sociology challenge the traditional assumption regarding purposeful action and shared understanding.
A new branch of sociology, called ethnomethodology, treats practical reasoning about action as a subject matter of social studies: something to be investigated, something created by people. Previous theory held that plans were scientific models of action, and that sociology's objective was to improve these models and transform them into an axiomatic theory of action.
"Situated actions": every course of action depends in essential ways upon its material and social circumstances. Rather than abstracting action away from its circumstances and representing it as a rational plan, the approach is to study how people use their circumstances to achieve intelligent action. How do people produce and find evidence for plans in the course of situated action?
Plans are representation of actions
There are two views of actions:
1-The actor makes a choice among alternative courses of action, based upon the anticipated consequences of each. The course of action is just the playing out of these antecedent factors. Accounts of the action taken are just a report on the choices made.
2-Plans are resources for situated action but do not in any strong sense determine its course. Plans presuppose the embodied practices and changing circumstances of situated action; they do not represent those practices and circumstances in all of their concrete detail. Plans are also used afterwards: we can perform a post hoc analysis of situated action that makes it appear to have followed a rational plan.
Rationality anticipates action before the fact and reconstructs it afterwards.
Representation and breakdown
When action is proceeding smoothly it is essentially transparent to us, even if we can always construct rational accounts before and after the fact.
Equipment also tends to disappear when it is "ready-to-hand", in Heidegger's terms.
When there is a breakdown, possibly with the equipment, inspection and practical problem solving occur. At such times our use of the equipment becomes explicitly manifest as a goal-oriented activity, and we may try to formulate rules or procedures (if-then).
Breakdowns also happen when the equipment is unfamiliar.
So action is transparent to us, and only when there is a breakdown do we explicate rules and procedures for the situation that has now become noticeable. In these cases rules are explicated for the purposes of deliberation; and the action, which is otherwise neither rule-based nor procedural, is then made accountable to them.
The practical objectivity of situations
Social studies in the early century stated that there is an objective world
of social facts, or received norms, and people's attitudes and actions are a
response to those. Sociology's fundamental principle is the objective reality of
social facts. People respond to two types of rules: environment and
With this assumption sociology could have been considered a science: men respond to social facts, so it is scientific to study these responses.
Ethnomethodology instead assumes that our everyday social practices render the world publicly available and mutually intelligible. Ethnomethodology studies how people use common sense to make sense of the world: which methods the members of a society use to make sense of talk and actions.
Ethnomethodology is interested in how the mutual intelligibility and objectivity of the social world are achieved. It locates this achievement in situated action: our common sense of the world is the product of the social world, not its precondition.
The indexicality of language
1-expressions have conventional meanings assigned to them
2-on some occasions the significance of an expression lies in its relationship to circumstances.
Indexical expressions: expressions that rely upon their situation for significance.
Language is a form of situated action: expression and interpretation involve an active process of pointing to and searching the situation of talk.
Even if some expressions are more indexical than others, the communicative significance of a linguistic expression is always dependent on the circumstances of its use.
It is impossible to enumerate those circumstances, since every utterance's situation comprises an indefinite range of relevant circumstances. There is no finite set of assumptions that underlies a given statement.
The problem of language carries over to instructions: the indexicality of instructions means that the meaning does not inhere in the instruction, but must be found by the instruction follower with reference to the situation of its use.
Instructions necessarily rely upon an implicit et cetera clause in order to be called complete (they can never actually be complete).
The mutual intelligibility of action
Language is indexical.
But not only that: language participates in the action, and it constitutes the situation of its use.
One step further for ethnomethodology: the purposefulness of action is recognizable in virtue of the methodic, skillful practices whereby we establish the rational properties of actions in a particular context.
Garfinkel proposes that the stability of the social world is not
the consequence of a "cognitive consensus", or a stable body of shared
meanings, but of our tacit use of the documentary method of
interpretation to find the coherence of situations and actions.
As a general process, the documentary method describes a search for uniformities that underlie unique appearances. In the social world it describes the process whereby actions are taken as evidence of underlying plans or intent, which in turn fill in the sense of actions.
Given the lack of universal rules for the interpretation of action, the program of ethnomethodology is to investigate and describe the use of the documentary method in particular situations. It looks for the processes whereby particular, uniquely constituted circumstances are systematically interpreted so as to render meaning shared and action accountably rational.
Action interpretation is inherently uncertain; nonetheless, action description is sufficient for its task. People are engaged in the everyday business of making sense of each other's actions.
What resources do people use to manage this inherent uncertainty?
For social science these resources are not only cognitive but interactional: interpreting the significance of action is an essentially collaborative achievement.
Mutual intelligibility turns on the availability of communicative resources to detect, remedy, and even exploit the inevitable uncertainties of action's significance.
Conversation as "ensemble" work
Speakers and listeners during a conversation do not engage in an alternating sequence of action and response.
Conversation is better seen as a joint action accomplished through the participants' continuous engagement in speaking and listening: the listener gives cues and the speaker reacts to them.
These contextualization cues consist of the organization of speech prosody, body position and gesture, gaze, and collaboratively accomplished timing.
As the basic system for situated communication, conversation is characterized by:
1-an organization designed to support local control over the development of topics or activities, and to maximize accommodation to unforeseeable circumstances that arise
2-resources for locating and remedying communicative troubles
The organization of conversation maximizes local control over the distribution of turns and the direction of subject matter. That is, who talks and what gets talked about is decided then and there, by the participants, through their collaborative construction of the conversation's course.
Turn taking is a collaborative achievement, rather than a simple alternation of intrinsically bounded segments of talk. The turn is not something that can first be defined and then examined for how it is passed back and forth.
Sequential organization and coherence
In general, a coherent conversation is one in which each thing said can be heard as relevant to what has come before. The adjacency need not be immediate: there may be other utterances in between, or the relevant response may come after considerable time, producing an embedding of turns.
The expectations linking what is said to what counts as an appropriate answer guide inferences about the conversation's content: the answer, the lack of one, or its difference from what was expected all help in building the meaning.
Sometimes answers that are not tied to the previous questions
are interpreted nonetheless as relevant to the previous utterance.
The overall coherence of a conversation is accomplished through the development and elaboration of a local coherence.
Locating and remedying communicative trouble
Communication takes place in real environments and is vulnerable to internal and external troubles.
Our communication succeeds in the face of these troubles not because we reliably predict what will happen and thereby avoid problems, but because we work, moment by moment, to identify and remedy the inevitable troubles that arise.
In addition to controlling turn taking, participants must be able to alert each other to possible problems or misunderstandings. Sometimes, even when problems are detected, participants may choose to let them pass. But if problems are not detected until it is too late, communication can fail.
Specialized forms of interaction
There are many cases where the organization of turns and the subject matter are prescribed (by institutions for example.)
Preallocation of turn types: for example in the courtroom or in doctor-patient communication. Even in these cases, where turn taking is subject to rules, participants still apply those rules in such a way as to convey their meaning or to make sense of it. So the interaction is still situated.
Agendas: various settings also prescribe the subject of the communication, for example doctor-patient or counselor-patient talk. Even in these cases the coherence of the talk is not guaranteed by the agenda but is achieved moment by moment as a local, collaborative accomplishment.
Face-to-face interaction is a system that has evolved to provide for orderly, concerted action. It masters its constraints and leaves open questions of control and direction, while providing mechanisms for recovery from trouble and error.
This chapter analyses the resources available in a form of communication that is more restricted than the forms analysed in the preceding chapter: communication with an expert system.
The expert system has a set of propositions or
"knowledge" and rules about a domain, in this case a copying machine.
The copier expert system should observe user actions, infer user's plan and
suggest the next instruction.
This chapter analyses the problem of the system's recognition of the user action.
The expert help system
Job specification: the user inputs answers to questions about the original documents and the desired copies. The system identifies a plan (connected to the goal).
Plan: the system maps the job specification to a plan.
The plan is now ascribed to the user as a basis for understanding her actions. The system then delivers step-by-step instructions.
The user's actions are mapped to a place in the system's plan.
The design assumes that it is the correspondence of the system's plan to the user's purpose that enables the interaction.
But the analysis shows that this is not the case.
The problem of following instructions
The problem of the instruction follower is to turn essentially partial descriptions of objects and actions into concrete practical activities with predictable outcome. Instructions rely on the ability to do the implicit work. Successful instruction-following is a matter of constructing a particular course of action that is accountable to the general description that the instruction provides.
Afterwards instructions serve as a resource for describing what was done, not only because they guide the course of action, but also because they filter out of the retrospective account of the action everything that was done and not mentioned in the instruction.
If the outcome is not what was expected, the user tends to look for problems in the execution, especially in the actions that were not stated in the instruction. Also, looking at the outcome of the actions, the user makes inferences about the principles behind it.
Some experiments were done by people in AI with the purpose of building a computer-based consultant. In the experiments a novice had to build a tool while an expert communicated with him in different ways: face to face, by telephone, through a keyboard, or with written instructions. The results showed that the main difference between the types of communication was interactive vs. non-interactive. Non-interactive communication (e.g. the expert provided written instructions) involved a greater degree of planning. In interaction, speakers plan only at a general level, whereas non-interactive discourse can be planned in advance.
Non-interactive communication also has a bigger problem describing the objects and actions involved, and there is a tendency to overload the description. In the interactive modality there is minimal description, and the action is monitored to verify the adequacy of the description.
Interactive communication supports the collaborative construction of a useful description of the objects and actions in question, through practical analysis of the communication's success at each turn.
The Basic Interaction
The expert system works in the following way. The machine shows on a video display either the machine's behavior or the next instruction.
The instructional sequence:
Machine presents instruction
User takes action
Machine presents next instruction
The system is able to detect key user actions, but not all of them. Some instructions result in a detectable action; the outcomes of others are invisible to the system.
Before proceeding to the next instruction, the system detects that the previous detectable instructions have been accomplished, but it cannot know about the undetectable ones, and simply assumes that the user carried them out.
The system controls the turn taking reactively, through the relationship between user actions and machine responses.
To avoid certain problems related to the sequentiality of instructions, the system uses a partial order: it starts from the last instruction; if that instruction has not been accomplished, it checks whether its preconditions are met; if they are not, the system selects a previous instruction. Checking preconditions helps in dealing with cases where the user undoes a precondition, or where some conditions were met in a previous procedure and are still valid.
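The backward check over the instruction sequence can be sketched as follows. This is a minimal sketch under stated assumptions, not the actual system's code: the `Step` type, the `next_instruction` function, the three copier steps and the state flags (`original_on_glass`, `cover_closed`, `copies_made`) are all hypothetical names invented for the illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

State = Dict[str, bool]  # only what the machine can detect right now


@dataclass
class Step:
    instruction: str
    accomplished: Callable[[State], bool]   # is the step's effect detectable?
    preconditions: Callable[[State], bool]  # can the step be carried out now?


def next_instruction(steps: List[Step], state: State) -> Optional[str]:
    """Work backwards from the last step to the first one that can be shown."""
    if steps and steps[-1].accomplished(state):
        return None  # the final effect is detected: nothing left to instruct
    for step in reversed(steps):
        if step.accomplished(state):
            continue  # effect still holds, possibly from an earlier procedure
        if step.preconditions(state):
            return step.instruction
        # preconditions not met: fall back to a previous instruction
    return None


# Hypothetical copier procedure (step names and flags are assumptions).
copier_steps = [
    Step("Place the original on the glass",
         accomplished=lambda s: s["original_on_glass"],
         preconditions=lambda s: True),
    Step("Close the document cover",
         accomplished=lambda s: s["cover_closed"],
         preconditions=lambda s: s["original_on_glass"]),
    Step("Press Start",
         accomplished=lambda s: s["copies_made"],
         preconditions=lambda s: s["original_on_glass"] and s["cover_closed"]),
]

# The user has removed the original again while the cover is still closed.
undone = {"original_on_glass": False, "cover_closed": True, "copies_made": False}
shown = next_instruction(copier_steps, undone)
```

The function reads only the current detectable state. It has no record of what the user did to get there, so an undone precondition simply causes an earlier instruction to be shown again.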
The study uses novice users of the copier. They work in
pairs so that their dialogue is accessible to the researcher.
She uses videotape to record their actions.
The study does not use examples constructed by the researchers. The researchers only choose the tasks and leave the users free to accomplish them in their own way.
The study was guided by two methodological commitments:
1-An empirical approach: do not use contrived examples, and do not prescribe in advance what the description of the action should look like, since that would limit the types of action described.
-consider the fleeting circumstances that our interpretation of action relies upon.
-study the relationship between the interpretation of action and the action's circumstances.
2-The aim of the analysis was to find the sense of "shared understanding" in human-machine communication. She wanted to compare the user's and the system's respective views of the interaction.
She uses the following framework:
-Actions not available to the machine
-Actions available to the machine
-Effects available to the user
The purpose of this chapter is to consider communication between
a person and a machine in terms of the nature of their respective situations.
The situation of action is the full range of resources that the actor has available to convey the significance of her own actions and to interpret the actions of others.
Engineering an appropriate response
The designer's problem is to ensure that the machine responds
appropriately to the user's actions. There are two possible approaches:
1-participants anticipate each other's actions
2-participants respond to occasioned and unanticipated actions of the other.
Expert systems use the first approach, building a model of the user and taking his goal as an ascribed plan to interpret his actions.
The system's situation: plans and detectable states
The system's resources for interpreting the actions of the user are plans and states. Not all user actions are available to the system, and neither is the history of the action. If a user undoes a step, the system keeps no record of it: it only checks the current state and the position in the plan.
The user's resources: the situated inquiry
Problems posed for the designer by the user's principal resource.
If the user follows the plan suggested by the expert system, the system can anticipate the user's question "What next?", meaning "what is the next instruction?". But when "What next?" is a matter of repairing or abandoning the plan, problems occur: the request is for a remedy to the current trouble. Because the situation of the inquiry is not what the system anticipates, the answer that the system offers is inappropriate.
Sometimes the system offers a response about the motivation for an action when the user is interested in identifying the object of the action. On other occasions the system anticipates "What is the object?" but the user wants to know "How do I do the action?".
While instructions can answer questions about objects and actions, they also pose problems of interpretation that are solved in and through the objects and actions to which the instructions refer.
The user brings the description that the system provides to bear on the material circumstances of her situation, and brings those circumstances to bear on her interpretation of the descriptions.
Conditional relevance of response
Problems posed for the designer by the user's ability to find the relevance of the system's responses to the user's inquiries.
Both user and designer share the expectation that the relevance of each utterance is conditional on the last: given an action by one party that calls for an answer, whatever the other party does next will be read as that answer. If the response is not actually an answer, the user will try to interpret it as one anyway.
Given some instruction to which the user responds with an action,
the user has the following expectations with respect to the system's response:
1-If the system's response is a new instruction, that confirms the adequacy of the user's action.
2-If the system does not respond, the user's action is incomplete.
3-If the system responds by repeating an instruction, either the action must be repeated or there is some trouble in the previous action.
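The three expectations can be written down as a small decision function. This is purely an illustrative sketch: the function `user_reading` and its return strings are hypothetical, and real users' interpretations are of course not limited to a lookup like this.

```python
from typing import Optional


def user_reading(previous_instruction: str, response: Optional[str]) -> str:
    """Return the user's likely reading of the system's response to her action.

    `response` is None when the system shows nothing new.
    """
    if response is None:
        return "action incomplete"
    if response == previous_instruction:
        return "repeat the action, or repair trouble in the previous action"
    return "action adequate"


reading = user_reading("Close the document cover", "Close the document cover")
```

Note that the user applies this reading to whatever the system does next, whether or not the system "meant" its output as a response to her last action.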
The false alarm:
A misconception on the user's part leads her to find evidence of an error in her actions where none exists.
The garden path:
A misconception on the user's part produces an error in her action, the presence of which is masked. At the point where the trouble is discovered by the user, its source is difficult or impossible to reconstruct.
In neither case is the breakdown available as such to the system.
She proposes an alternative view to the cognitive science view of action.
Cognitive: abstract structural account as the ideal representation of actions.
Recent developments in the social sciences propose not to produce formal models of knowledge and action, but to explore the relationship of knowledge and action to the particular circumstances in which acting takes place.
This view presupposes:
1-the contingency of action on a complex world is an essential resource that makes knowledge possible.
2-theories of action should be grounded in empirical evidence.
3-the organization of action is an emergent property of moment-by-moment interaction between actors, and between actors and their environment.
Her goal is to describe human-machine communication. She bases her study on studies of human conversation: she applies the insights gained from face-to-face human communication to human-computer communication, which she treats as a special case of human communication in which the resources available to participants are limited.
The limited resources pose the following problems:
1-extending the machine's access to the actions and circumstances of the user.
2-making the limits of the machine clear to the user.
3-finding ways to compensate for the machine's lack of access to the user's situation.
In recent efforts, the most common way to address problem 1, the machine's limited access to the user and her circumstances, has been the introduction of a user model.
Plans are a representation of action. They don't control action but they are a resource (like a map). The action is in the interaction of representation and represented.
Theoretically, understanding the limits of machine behavior could contribute to an account of situated human action and shared understanding (in the same way that AI contributes to the theory of mind.)
Situated action and embodied action.
Situated action is about "how do people find meaning in actions, or how do they construct it." It is mainly based on the documentary method, and its framework is communication.
Paul Dourish's embodied action is more about how people find "meanings", or better uses of tools, by using them. They have a goal, and in pursuing it they use, modify and adapt the tools to their own situation or purpose.
The communication aspect is minimal: it is restricted to what the designer communicates to the user.
Suchman's study mainly concerns expert systems. Expert systems were fashionable at that time, but there were already serious criticisms of them. Expert systems in particular act like "experts", use dialogue and in some cases, like the one studied by Suchman, are designed to "coach". This one, besides coaching, leads the user step by step.
Most programs today are not expert systems, but they still interact with the user. The user usually leads the action, but sometimes needs help with some procedures.