Open architectures for integrated, hypermedia-based information systems

John L. Schnase
Advanced Technology Group
School of Medicine Library and Biomedical Communications Center
Washington University School of Medicine
660 South Euclid Avenue (Box 8132)
St. Louis, Missouri 63110 USA
schnase@medicine.wustl.edu

John J. Leggett, David L. Hicks, Peter J. Nürnberg, J. Alfredo Sánchez
Hypermedia Research Laboratory
Department of Computer Science
Texas A&M University
College Station, Texas 77843 USA
{leggett, hicks, pnuern, sanchez}@bush.tamu.edu

Abstract: Most hypermedia systems are currently implemented as closed, monolithic applications that effectively isolate information and functionality from other applications in a computing environment. Open architectures, on the other hand, tend to promote a higher degree of information and tool integration. In this paper, we report on the design and implementation of an open, multi-user hypermedia system called System Prototype 2 (SP2).

SP2 implements an object-based architecture for distributed, inter-application linking. It is an instantiation of a conceptual model that abstracts information, structure, and behavior from hypermedia. SP2 allows hypermedia connections to be forged and navigated among applications that are able to participate in a distributed link services protocol. The functionality of the system is realized by client/server relationships among several software components: Participating Applications, a Link Services Manager, and the HB2 hyperbase management system which provides persistent storage for objects and structural data.

SP2 allows application-level extensibility, the independent extensibility of hypermedia functionality, and extensibility at the data management level. We discuss our experiences with SP2 and describe why this approach to architectural openness can significantly improve the applicability of hypermedia technology to many large-scale, information intensive domains.

Keywords: Open hypermedia system architecture, hyperbase management system, hypermedia reference model.

Introduction

Many of today's hypermedia systems are monolithic, stand-alone applications. The behaviors that make them distinctive are encapsulated within each system. The hypermedia structures that they manipulate cannot be shared, and the information they manage is relatively isolated from other applications. They do not easily integrate with other systems, and they generally require that users disown their customary computing environments in order to avail themselves of hypermedia functionality. Functionality and data are accessible only from within these closed applications, and extensibility, when possible, requires adherence to the abstractions, data models, and operations already defined within the system. This monolithic orientation has been identified as perhaps the single most important hindrance to the proliferation of hypermedia technology [Malcolm et al. 1991; Meyrowitz 1989].

Recently, however, recognition of this limitation has motivated efforts to design and implement non-monolithic, open hypermedia system architectures. Open systems are an extensible collection of independently-written applications that cooperate to function as an integrated system [Ghezzi et al. 1991]. In this paper, we report on the design and prototypic implementation of an open, multi-user hypermedia system called System Prototype 2 (SP2). We believe that SP2 introduces an architectural approach that significantly improves the applicability of hypermedia technology to large-scale, information intensive domains such as digital libraries, medical information systems, and computer-aided design, instruction, and software engineering.

SP2 implements an object-based architecture for distributed, inter-application linking. It is an instantiation of a conceptual model that abstracts information, structure, and behavior from hypermedia. SP2 allows hypermedia connections to be forged and navigated among applications that are able to participate in a distributed link services protocol. The functionality of the system is realized by client/server relationships among several software components: Participating Applications, a Link Services Manager, and the HB2 hyperbase management system which provides persistent storage for objects and structural data.

SP2 is distinctive in the way it separates hypermedia functionality from information, and in the way structural data are further separated into a link services subsystem. This approach to architectural openness allows application- and hyperbase-level extensibility and tailorability. It also supports extensibility and tailorability of hypermedia functionality within the client space of a computing environment.

The remainder of the paper is organized as follows: Section 2 presents the specific design goals for SP2 and the rationale behind these goals; Section 3 describes SP2's underlying model, its software architecture, implementation, and operation; in Section 4, the research is compared to related work, implications of the architecture are discussed, and we provide an overview of future plans; Section 5 concludes the paper with a summary.

Design goals

The overall goal for SP2 is to produce an effective open, extensible architectural framework for advanced hypermedia environments. The following are among the specific design objectives for SP2:

Hypermedia consists of: (1) information, (2) the structural connectivity data that tie information together, and (3) a suite of behaviors that affect information and structure [Schnase et al. 1992]. Within monolithic systems, these elements are packaged together in one way or another. Defining an open, non-monolithic organization begins with an understanding of how these fundamental elements can be abstracted apart to yield extensibility. Abstraction ultimately influences the degree and nature of the openness allowed by a system and the characteristics of its open interfaces.

Formal modeling is one of the primary means by which people reason about alternative abstractions, and there is currently widespread interest in establishing formal reference models for hypermedia [Afrati and Koutras 1990; Delisle and Schwartz 1986; Furuta and Stotts 1990; Garg 1988; Halasz and Schwartz 1990; Kacmar and Leggett 1991; Lange 1990; Schnase et al. 1992; Schütt and Streitz 1990]. Models, even when implicit or informally composed, are at the nucleus of every system design. Unfortunately, most of the models for hypermedia that have been proposed remain distinctly monolithic in their overall philosophical approach. Advanced hypermedia system architectures clearly must be built upon abstractions and models that effectively address open system requirements.

It is important that SP2 not be unduly constrained by a model for hypermedia that is inherently unable to address advanced system requirements. We intend that SP2 distinguish among the structural, information, and behavioral components of hypermedia at an architectural level, so that effective system support for each can be provided.

The conceptual organization of a monolithic hypermedia system can be viewed as consisting of three layers: (1) a front-end, or presentational, layer that allows users to interact with the system; (2) a back-end layer that provides persistent storage for the system; and (3) an intermediary application layer that implements the hypermedia and other behaviors of the system. This application layer contains the operations and data structures required to translate user requests at the front-end into actions in the back-end.

The idea of a layered architecture leads to two observations. First, it is important to recognize that in considering open architectures for hypermedia systems, notions of openness, extensibility, and independence can be applied to each of these three conceptual layers [Wiil and Leggett 1992]. It is also important to realize that, in designing hypermedia systems, there is often a range of options as to where and how a particular behavior might be implemented. For example, composite objects can be assembled by the back-end or within the applications layer. Link data can be served to the application layer at the time a link is traversed, or, alternatively, these data can be assembled into memory-resident data structures within the application layer prior to requests for traversal. It is clear that successful, non-monolithic architectures ultimately depend upon an effective partitioning of functionality and data among subsystems and effective mechanisms for their access.

One form of extensibility that SP2 must support is the capacity to link across application boundaries. Inter-application linking provides application-level extensibility by allowing a diverse array of independently developed applications to be added to a system at any time. We also intend that SP2 allow extensibility at other architectural levels. In particular, SP2 must allow hypermedia functionality to be extensible within the client space independent of the applications utilizing the functionality. Such an approach allows behaviors that are commonly associated with hypermedia to be conveniently augmented or modified. It should be possible to modify these behavioral components without affecting the underlying structure of hypermedia.

In general, distributed software architectures tend to foster openness. This is because network distribution is achieved by having clearly defined interfaces among software subsystems. For example, in the classic client/server relationship, server functionality is made available to distributed clients by way of interprocess communication (IPC) protocols. The functionality of the system can be extended at any time by adding new clients that are able to participate in the IPC. This ease of extensibility is one of the motivations underlying the movement toward network- and object-based operating environments [Accetta et al. 1986; Ousterhout et al. 1988; Tanenbaum et al. 1990].

However, while distribution may foster openness, distribution alone does not assure it. One must look carefully at the specific nature of a system's functional partitioning and its distribution in order to understand the forms of extensibility they yield. KMS [Akscyn et al. 1988] and Intermedia [Garrett et al. 1986; Yankelovich et al. 1988], for example, are both extensible, distributed hypermedia systems. In each case, however, distribution refers to network access to distributed data, and both are extensible in only a limited, monolithic sense of the term.

Few systems have achieved the types of functional partitioning and distribution required to fully realize the integrative potential of network computing. In order to be successful, SP2's architecture must promote the availability and utilization of hypermedia on the next generation of high performance computers and communication networks.

Open architectures must be massively scalable in order to work with the information repositories contemplated by future hypermedia applications. This means they must be able to accommodate diverse forms of data, such as digital audio, video, and images, in terabyte to petabyte capacities. Issues relating to architectural scalability tend to converge in the data management subsystem, because effective support at this level is essential and contingent upon having effective policies and mechanisms in place.

Many existing hypermedia systems implement storage functionality themselves, without the support of a distinct back-end or commercial DBMS. Data are stored in proprietary formats that are largely inaccessible to other applications. There is growing interest, however, in the design and implementation of general-purpose hypermedia database ("hyperbase") management systems [Schnase et al. 1992, 1993; Schütt and Streitz 1990; Smith and Smith 1991; Wiil 1990; Wiil and Leggett 1992; Wiil and Oesterbye 1990; Zobel et al. 1991].

Hyperbase management systems (HBMSs) are network-accessible servers that provide the back-end functionality required to support hypermedia operations. Ideally, they provide a well-defined application interface as well as additional features such as multi-user access, transaction management, concurrency control, version control, support for novel data types, and efficient management of physical data storage. In most cases, an effort is made to impose as little format as possible on the objects managed by an HBMS, so they can be used by a variety of applications. As we search for more effective ways of incorporating hypermedia into advanced information systems, it is important to understand and refine the interplay of hyperbase management systems with the other architectural components of these environments.

SP2's architecture should be based on a scalable HBMS and allow for the eventual inclusion of multiple homogeneous and heterogeneous hyperbases and archival storage. Most important, SP2 must provide a framework that embraces new concepts and emerging technologies.

A major objective of this effort is to provide a resource for demonstrating and testing new principles and ideas about hypermedia system architectures. Experimentation with the architecture will be facilitated by abstracting its essential functionality into easily accessible programming tools. SP2 should be modular, and there must be clear boundaries and interfaces among the subsystems that comprise its software architecture. A modular approach allows design, implementation, operation, and performance issues to be addressed at a subsystem level, thereby increasing the usefulness of the system as a research vehicle.

SP2 should also provide a framework for examining the incorporation of hypermedia functionality into the base computing environment. The clear evolutionary trajectory of this work is toward the eventual inclusion of hypermedia functionality in the kernel of a network-based, real-time multimedia operating system.

Implementation

We now show how these ideas and goals have been translated into an open, extensible hypermedia system. We begin by describing the model for hypermedia upon which SP2 is based. We then present the system architecture for SP2 and describe its implementation. The section closes with a brief overview of SP2's operation.

We have based SP2 on a model for hypermedia that we feel is particularly well suited to achieve these objectives. As shown in Figure 1, the model consists of six elements: applications, components, persistent selections, anchors, links, and associations. The first three describe the information that is managed by the system. Applications are programs. Components are the data or information manipulated by applications. For example, we would say text editor applications generally manipulate ASCII components. Persistent selections are selections within components that persist between application sessions and can be accessed at a later time.

The remaining three elements of the model are essential for hypermedia functionality. Anchors and links are processes in the operating-system sense of the word. They are programs or program units that can be independently scheduled by the underlying operating system in order to accomplish a task. In SP2, they implement the behaviors that characterize hypermedia, such as customized views or traversal behaviors. As shown in Figure 1, anchors are associated with persistent selections, and links are associated with anchors, thereby completing connections among persistent selections. The relationships among these elements, depicted as arcs in Figure 1, are associations. Unlike anchors and links, associations are structural entities, collections of identifiers that tie elements together. In this model, the network structure of hypermedia results from the connections forged among persistent selections.
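
The distinction between behavioral and structural elements can be illustrated with a minimal sketch. It is not SP2 code: the class names, the `behavior` hooks, and the function `selections_of_link` are hypothetical, and anchors and links are represented here as plain objects rather than as operating system processes. The point it shows is that associations carry only identifiers, and that the network structure can be recovered from those identifiers alone.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# A persistent selection is addressed by three identifiers (assumed layout).
PSRef = Tuple[int, int, int]  # (PSID, CompID, AppID)


@dataclass
class Anchor:
    """Behavioral element; the 'behavior' hook is a hypothetical stand-in for a process."""
    anchor_id: int
    behavior: Callable[[PSRef], str]


@dataclass
class Link:
    """Behavioral element; decides how a traversal between two selections is carried out."""
    link_id: int
    behavior: Callable[[PSRef, PSRef], str]


@dataclass(frozen=True)
class AnchorAssociation:
    """Structural element: ties an anchor to a persistent selection by identifier only."""
    anchor_id: int
    selection: PSRef


@dataclass(frozen=True)
class LinkAssociation:
    """Structural element: ties a link to an anchor by identifier only."""
    link_id: int
    anchor_id: int


def selections_of_link(link_id: int,
                       link_assocs: List[LinkAssociation],
                       anchor_assocs: List[AnchorAssociation]) -> List[PSRef]:
    """Return the persistent selections connected through one link."""
    anchor_ids = {la.anchor_id for la in link_assocs if la.link_id == link_id}
    return sorted(aa.selection for aa in anchor_assocs if aa.anchor_id in anchor_ids)
```

Because the association records hold no behavior, the anchor and link objects can be replaced or extended without touching the stored structure.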

We will comment further on this model later in the paper. For now, however, note that an information system based on this view of hypermedia allows the integration of diverse applications, anchors, and links under a common hypermedia model. Non-monolithic, inter-application linking can be realized by moving hypermedia connectivity data (associations) and hypermedia functionality (anchors and links) into a "link services" subsystem of a computing environment (Figure 1). Services can then be provided to participating applications through interprocess communication (IPC). Such a framework allows application-level extensibility as well as independent extensibility of hypermedia elements. The result is a potentially higher degree of openness than allowed by most other models. In addition, the network-optimized distribution of architectural components is greatly facilitated, because application, anchor, and link processes can reside anywhere within the IPC domain of the system.

System Architecture

As shown in Figure 2, SP2's software architecture consists of three major components: Participating Applications, the Link Services Manager, and the HB2 hyperbase management system.

HB2 is a centralized, multi-user HBMS consisting of five subsystems: the Hyperbase Session Manager (HSM), Object Manager (OM), Association Set Manager (ASM), Off-line Services Manager (OSM), and the Storage Manager (SM) (Figure 2). OM implements the notion of a large, shared repository of simple, unstructured objects. ASM provides persistent and sharable storage for the connectivity data that link information together to form hypermedia. ASM also manages separate contexts for these structural data. Together OM and ASM implement HB2's data model, which abstracts inter-object connectivity, behaviors, and information from hypermedia. SM maps HB2's data model into physical storage.

Distribution of OM and ASM functionality across a range of platforms is achieved by a client/server model using interprocess communication (IPC) facilities. HSM controls access to the OM and ASM servers and manages sessions. Transaction management instantiates separate and tailored mechanisms for deadlock-free concurrency control over objects, structural data, and contexts. OSM asynchronously enforces integrity constraints over OM and ASM data. Additional information about HB2 can be found in Schnase [1992], and an earlier, single-user version of the HBMS is described in Schnase et al. [1993].

The Link Services Manager (LSM) is a server process that provides run-time support for inter-application linking. It coordinates the interprocess communication required to implement the hypermedia functionality of Participating Applications. These activities include the attachment and detachment of anchors and links, conveying requests to the applications that they display anchored or linked persistent selections, browsing operations, and context operations. LSM instantiates the dynamic, real-time manifestation of hypermedia with which a user or user processes may interact.

Any number of client applications can be registered for LSM services. In the current implementation, only one LSM process is allowed to provide link services per hyperbase session. This is not, however, a limitation of the architecture, and it is envisioned that someday users might work in an environment where multiple LSMs will be running on a workstation. As seen in Figure 2, LSM is ASM's only client at the moment. As we describe below, HB2's ASM server and LSM are central to SP2's link services subsystem.

Applications that wish to participate in link services must be able to interact with the LSM server. These applications are referred to as Participating Applications (Apps) (Figure 2). Participation requires that Apps be able to handle persistent selection and respond appropriately to messages from the LSM.

Recall that in our model of hypermedia, components are the data or information manipulated by Apps, and persistent selections are selections within components that persist between application sessions. In order to implement persistent selection, Apps append a Persistent Selection Table (PSTable), or some other data structure, to components they manipulate. An example PSTable is shown in Figure 3. In this case, the table carries entries for each persistent selection within a component, and contains: (1) a unique ID for each persistent selection (PSID field of the table); (2) a persistent selection value that provides the information required by the application to address the persistent selection (PSValue field); and (3) bits that can be set to indicate whether the persistent selection has an associated anchor (AnchorBit field), and, if so, whether a link is associated with that anchor (LinkBit field).
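
A table of this kind might be sketched as follows. The four fields mirror those named above. Everything else is an assumption made for illustration: the method names are hypothetical, the PSValue is taken to be a pair of character offsets as a text editor might use, and PSIDs are drawn from a local counter, whereas in SP2 identifiers are managed by HB2's Object Manager.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class PSEntry:
    ps_id: int                   # PSID field
    ps_value: Tuple[int, int]    # PSValue field; assumed (start, end) character offsets
    anchor_bit: bool = False     # AnchorBit field
    link_bit: bool = False       # LinkBit field


class PSTable:
    """Per-component table of persistent selections (illustrative only)."""

    def __init__(self) -> None:
        self._entries: Dict[int, PSEntry] = {}
        self._next_id = 1        # assumption: local IDs instead of Object Manager IDs

    def create(self, ps_value: Tuple[int, int]) -> int:
        entry = PSEntry(self._next_id, ps_value)
        self._entries[entry.ps_id] = entry
        self._next_id += 1
        return entry.ps_id

    def delete(self, ps_id: int) -> None:
        del self._entries[ps_id]

    def lookup(self, ps_id: int) -> PSEntry:
        return self._entries[ps_id]

    def mark_anchored(self, ps_id: int) -> None:
        self._entries[ps_id].anchor_bit = True

    def mark_linked(self, ps_id: int) -> None:
        entry = self._entries[ps_id]
        # The LinkBit is meaningful only when an anchor is already attached.
        if not entry.anchor_bit:
            raise ValueError("a link can be recorded only for an anchored selection")
        entry.link_bit = True
```

Only the PSID ever needs to leave the application; the PSValue remains private to whichever application understands how to address the selection.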

Applications are responsible for providing mechanisms to create, delete, and display persistent selections and manipulate the PSTable. It is also assumed that the applications' interfaces will allow users to reference persistent selections in such a way that their identities can be determined by the applications. Finally, as described below, Apps must be able to respond to messages from the LSM that implement hypermedia functionality. Text, graphics, and bitmap editors have been developed as participating applications. The Athena Text Widget has also been modified to make X Window System's Xedit an SP2 App [Drufke et al. 1991; Nye and O'Reilly 1990].

Implementation

SP2's software components were written in C and C++. They easily port to UNIX variants on a variety of platforms. LSM is a decentralized server and can be allocated to any physical processor in a local area network. Participating Application processes can be distributed as well. HB2's managers are implemented as independent processes; however, they are centralized on a dedicated node on which the Storage Manager's repository resides. HB2 incorporates the POSTGRES extended relational database system into its architecture at the Storage Manager level. POSTGRES provides features of semantic and object-oriented database methodologies as extensions to the relational model [Schnase 1992; Stonebraker and Kemnitz 1991].

In order to achieve heterogeneous distribution, message-passing IPC protocols were implemented using X Window System interclient communication facilities. It is possible for client processes on networked workstations to access LSM and HB2 servers remotely by using these protocols. Communication between clients and servers follows a mailbox model in which "in" and "out" mailboxes are owned by each client application and shared by the servers. X Properties are used to implement mailboxes [Nye and O'Reilly 1990]. An X application can create and delete properties that are conceptually associated with specific application windows. It can then receive notification about, and respond to, events affecting those properties. In SP2, the event of interest to servers and applications is the movement of data into and out of an X Property. All messaging is bidirectional.
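
The mailbox model can be approximated in a few lines. The sketch below makes no X Window System calls: in-process queues stand in for X Properties, and an explicit `poll` call stands in for property-change notification. The class and method names are hypothetical.

```python
from collections import deque
from typing import Callable, Dict


class Mailboxes:
    """One client's pair of mailboxes (a stand-in for two X Properties)."""

    def __init__(self) -> None:
        self.out_box: deque = deque()  # messages from client to server
        self.in_box: deque = deque()   # messages from server to client


class Server:
    """Shares every registered client's mailboxes and replies to each request."""

    def __init__(self, handler: Callable[[int, tuple], tuple]) -> None:
        self._handler = handler
        self._clients: Dict[int, Mailboxes] = {}

    def register(self, client_id: int) -> Mailboxes:
        boxes = Mailboxes()
        self._clients[client_id] = boxes
        return boxes

    def poll(self) -> None:
        """Drain each client's out mailbox and post a reply to its in mailbox."""
        for client_id, boxes in self._clients.items():
            while boxes.out_box:
                request = boxes.out_box.popleft()
                boxes.in_box.append(self._handler(client_id, request))
```

A client registers once, appends a request to its out mailbox, and later reads the reply from its in mailbox; the server never needs to know how the client is implemented.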

To facilitate construction of the various client and server modules, we developed the X Link Services (Xlt) Toolkit. Processes that wish to communicate with one another bind liaison procedures from the Xlt Toolkit into their virtual memory space. Clients call on these routines in order to communicate with servers, and servers, in turn, use Xlt routines to reply.

As indicated earlier, three Participating Applications have been developed so far. V is an X-based text editor modeled after vi. PSText Widget is a modified Athena Text Widget whose functionality can be inherited by several other X-based editors. HyperDraw is a graphics and bitmap editor. These applications range in size from 5,000 to 12,000 lines of code. Using the Xlt Toolkit, only about ten percent of the source code is devoted to satisfying the requirements for participation in the SP2 framework.

Operation

In the SP2 environment, hypermedia connections are authored and browsed through coordinated events between Participating Applications and the Link Services Manager. As shown in Figure 2, Apps communicate with HB2's Object Manager and the LSM, and the LSM communicates with HB2's Object Manager and Association Set Manager. Table 1 provides a summary of the messages that pass between Apps and the LSM. The information contained in these messages, as well as other client/server messages, consists of type information and IDs. In order to explain how the system works, we refer to Figure 4 and Table 1.

Figure 4 shows an example screen layout of three processes that are currently displaying their output in (dark bordered) windows. Two of these processes are Participating Applications (AppIDs 2 and 3) that are currently manipulating components (CompIDs 1 and 4 respectively). The third process is the Link Services Manager (LSM). LSM has a menu interface through which users invoke its functionality. It is assumed that a session has been opened with HB2 and that an OM/ASM process pair is running in the background on the database machine somewhere in the network. The IDs in this example (AppID, CompID, etc.) are the unique identifiers for these objects, which are managed by HB2's Object Manager server.

The dashed arc in Figure 4 depicts a hypermedia association between two persistent selections across application boundaries. The authoring of such associations in SP2 is a two-step process and would proceed as follows. First, anchors are attached to persistent selections to form "sides" of an association. If the left-hand side in the example (Figure 4, Label A) were the first to be authored, the user would select the oval highlight in the application, then pick Attach Anchor on LSM's menu. LSM would respond by broadcasting an Attach Anchor request message to all Apps (Table 1). Apps with selected persistent selections would respond to the LSM by returning a message containing a {PSID, CompID, AppID} triple, identifying each persistent selection the user has chosen. In this case, AppID 2 would send {PSID 3, CompID 1, AppID 2} to the LSM. When all Apps have responded, LSM makes an Attach Anchor call on ASM and sends it all the triples it has received along with an AnchorID. ASM makes the appropriate entries in its underlying data repository. A similar sequence of events would occur to author other sides. In this example, attaching AnchorID 5 to {PSID 28, CompID 4, AppID 3} forms the only other side (Figure 4, Label B).

Once sides have been defined, authoring simply involves attaching a link to these sides. Picking Attach Link on LSM's menu causes LSM to send an Attach Link message along with a LinkID to the ASM. ASM responds by associating any un-associated sides in the database with the LinkID. This results in a completed association (Figure 4, Label C). The complete communications protocol, including calls on HB2, can be found in Schnase [1992].
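
The two-step authoring sequence can be sketched with three small classes. This is an illustration under stated assumptions rather than the SP2 protocol itself: the message exchange is reduced to direct method calls, the class and method names are hypothetical, and identifiers are passed in by the caller instead of being obtained from HB2's Object Manager. The AnchorID 4 and LinkID 6 used in the usage note are invented; the remaining identifiers follow the example above.

```python
from typing import Dict, List, Tuple

Triple = Tuple[int, int, int]  # (PSID, CompID, AppID)


class App:
    """Participating Application that reports its currently selected persistent selections."""

    def __init__(self, app_id: int, comp_id: int) -> None:
        self.app_id = app_id
        self.comp_id = comp_id
        self.selected: List[int] = []  # PSIDs the user has selected

    def on_attach_anchor(self) -> List[Triple]:
        return [(ps_id, self.comp_id, self.app_id) for ps_id in self.selected]


class ASM:
    """Stores sides (anchor to triples) and links (link to anchors) as identifiers."""

    def __init__(self) -> None:
        self.sides: Dict[int, List[Triple]] = {}
        self.links: Dict[int, List[int]] = {}

    def attach_anchor(self, anchor_id: int, triples: List[Triple]) -> None:
        self.sides[anchor_id] = list(triples)

    def attach_link(self, link_id: int) -> List[int]:
        # Associate every side that does not yet belong to a link with link_id.
        linked = {a for anchors in self.links.values() for a in anchors}
        free = [a for a in self.sides if a not in linked]
        self.links[link_id] = free
        return free


class LSM:
    """Broadcasts Attach Anchor to all Apps and forwards the collected triples to ASM."""

    def __init__(self, asm: ASM, apps: List[App]) -> None:
        self.asm = asm
        self.apps = apps

    def attach_anchor(self, anchor_id: int) -> None:
        triples: List[Triple] = []
        for app in self.apps:
            triples.extend(app.on_attach_anchor())
        self.asm.attach_anchor(anchor_id, triples)

    def attach_link(self, link_id: int) -> List[int]:
        return self.asm.attach_link(link_id)
```

With Apps 2 and 3 registered, selecting PSID 3 in App 2 and calling `attach_anchor(4)`, then selecting PSID 28 in App 3 and calling `attach_anchor(5)`, leaves two sides in the store; `attach_link(6)` then ties both sides to LinkID 6.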

An important aspect of authoring in the SP prototypes is that any number and arrangement of {Participating Application, Component, Persistent Selection} triples can be involved in an Attach Anchor operation, and any number of sides can be involved in an Attach Link operation. This means that arbitrarily complex inter-application linking can occur at the level of anchors and links (Figure 5).

Development of anchor and link processes has just begun. Among other things, these will implement a variety of traversal behaviors and filtering operations. At the present time, functionality residing within the LSM supports rudimentary reference traversal semantics. If the user picks LSM's Turn on Browsing menu selection, messages are sent to Apps letting them know the user wishes to navigate. Now, mouse selections on link markers cause Apps to send LSM Follow Association messages. A Follow Association message carries a {PSID, CompID, AppID} triple identifying the "source" persistent selection whose association the user wishes to follow (Table 1). LSM relays the request to ASM, and, in response, ASM provides a list of the LinkID, AnchorIDs, and {PSID, CompID, AppID} triples reachable from the input triple.

Traversal is finally accomplished by LSM creating "destination" application processes and sending them a Goto Persistent Selection message. This message carries one or more "destination" triples, and Apps respond to it by loading components and displaying the designated persistent selections. As anchor and link processes are introduced into the architecture, this scenario will require richer communication among App, anchor, link, and HB2 processes. Note that during operation of SP2, hypermedia data are maintained under database control at all times -- no other data structures are used to represent the address information required for inter-application linking.
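
Reference traversal of this kind can be sketched as a lookup followed by a grouping step. The function names and the dictionary layout of sides and links are assumptions made for illustration; in SP2 the lookup is performed by ASM against the hyperbase, and the resulting messages are delivered through IPC.

```python
from typing import Dict, List, Tuple

Triple = Tuple[int, int, int]  # (PSID, CompID, AppID)


def follow_association(source: Triple,
                       sides: Dict[int, List[Triple]],
                       links: Dict[int, List[int]]) -> List[Tuple[int, int, Triple]]:
    """Return (LinkID, AnchorID, triple) for each selection reachable from source."""
    source_anchors = {a for a, triples in sides.items() if source in triples}
    reachable = []
    for link_id, anchor_ids in links.items():
        if source_anchors.isdisjoint(anchor_ids):
            continue  # this link does not touch the source selection
        for anchor_id in anchor_ids:
            for triple in sides[anchor_id]:
                if triple != source:
                    reachable.append((link_id, anchor_id, triple))
    return reachable


def goto_messages(reachable: List[Tuple[int, int, Triple]]) -> Dict[int, List[Triple]]:
    """Group destination triples by AppID, one Goto Persistent Selection message per App."""
    messages: Dict[int, List[Triple]] = {}
    for _link_id, _anchor_id, triple in reachable:
        messages.setdefault(triple[2], []).append(triple)
    return messages
```

Grouping by AppID reflects the statement above that a single Goto Persistent Selection message may carry more than one destination triple.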

Other operations are supported by LSM, including detach operations and context operations. Context operations allow users to create and manipulate named collections of associations that can be applied independently to SP2 objects. Details about these operations will not be provided here, but their implementation is straightforward and readily understood by examining Table 1.

Discussion

In this section, we briefly review related work, discuss implications of the SP2 architecture, and describe our experiences with SP2 and plans for the future.

Related Work

Work on open hypermedia systems is being influenced by a wide range of activities. We have already mentioned work on formal models. In addition, research on hyperbase management systems, multimedia information systems, decision support systems, etc. is contributing to the dialog. Several projects relate closely to the work we are doing.

Sun Microsystems' Link Service is a product that is shipped with Sun's programming-in-the-large software development environment called Network Software Environment (NSE) [Pearl 1989; Pearl and Walsh 1988]. NSE allows users to make and maintain explicit and persistent links between objects managed by autonomous front-end applications. Its capacity to support inter-application linking makes NSE's notion of extensibility similar to that seen in SP2.

The NSE Link Service includes a communication protocol, a link server program, a library for integrating new applications, and utilities for managing link databases. The server is centralized and can be accessed over a network by multiple applications. Applications interact with the server to create, query, and follow links among applications. Anchor and link information is stored by the Link Service while applications provide the functionality for manipulating and acting on anchor and link data. Object data are stored and managed separately from link data. In contrast to SP2, it is the application's responsibility to invoke "destination" applications that are not running when a link is followed. We feel that it is important to abstract process control away from applications in order to effectively coordinate complex, multi-process information system environments. Sun's Link Service also differs from SP2 in that it does not provide multi-user support or concurrency control.

USC's Distributed Hypertext (DHT) architecture illustrates an interesting move toward an open systems architecture combined with distributed, heterogeneous database techniques [Noll and Scacchi 1991]. DHT is based on a client/server model and includes four components: a common hypermedia data model, a communication protocol, servers, and client applications. All components communicate via a common application protocol that implements data model operations and provides a mechanism for moving objects between clients and servers.

The heart of the system is its collection of database servers. Each server consists of a gateway process and an information repository. The gateway process transforms the hypermedia operations of distributed clients into local access operations. Gateways also assemble local information objects into DHT nodes or links. The information repositories are databases, file systems, or special-purpose storage managers. From the repository's view, the gateway process appears as another local application accessing the repository's data. In this way, DHT can incorporate existing databases without having to copy their data or modify their schemas.

DHT illustrates some of the advantages that can come from having a simple, unifying data model. In this case, the system is able to integrate diverse information repositories within a distributed architectural framework. On the other hand, while a significant degree of application-independence from the back-end has been obtained in DHT, control over multi-user activities would be difficult without incorporating a global coordinator. In a sense, DHT has opted for enhanced access to diverse data at the expense of effective support for cooperative sharing and a simple model for hypermedia.

Microcosm is an open hypermedia system that consists of a set of autonomous communicating processes that supplement the facilities provided by the operating system [Davis et al. 1992; Fountain et al. 1990]. In Microcosm, users interact with browser applications, or "viewers." Viewers translate user actions into messages that pass through one or more filters. Messages requesting that a link be followed are handled by linkbases and link dispatchers which create destination processes.

Like SP2, Microcosm places a strong emphasis on computation and the separation of connectivity data from information. However, Microcosm's model for hypermedia forces the linkbases to maintain application-dependent address information. As a result, an application that changes a document containing links to other documents can create an inconsistent database state. In SP2, applications are responsible for maintaining persistent selections and intra-document address consistency; only persistent selection IDs are passed to the external environment for storage. Editing an SP2 document will not yield an inconsistent hyperbase state as long as an application correctly maintains its persistent selections.
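
The division of responsibility between application and external environment can be sketched in Python. This is an illustrative model, not SP2 code: the Application and Hyperbase classes, their method names, and the offset-based representation of a selection are assumptions introduced here. The application keeps the mapping from persistent selection ID to document address and repairs it on every edit; the external store sees only IDs.

```python
import itertools
from dataclasses import dataclass, field

# Hypothetical source of unique persistent selection IDs.
_next_id = itertools.count(1)


@dataclass
class Hyperbase:
    """Hypothetical external store: holds only selection IDs, never addresses."""
    links: list = field(default_factory=list)

    def add_link(self, source_id: int, dest_id: int) -> None:
        self.links.append((source_id, dest_id))


@dataclass
class Application:
    """Hypothetical participating application that owns its persistent selections."""
    text: str
    selections: dict = field(default_factory=dict)  # selection ID -> (start, end)

    def create_selection(self, start: int, end: int) -> int:
        sel_id = next(_next_id)
        self.selections[sel_id] = (start, end)
        return sel_id  # only this ID leaves the application

    def insert_text(self, pos: int, new_text: str) -> None:
        # Edit the document, then repair every selection the edit affects.
        self.text = self.text[:pos] + new_text + self.text[pos:]
        shift = len(new_text)
        for sel_id, (start, end) in list(self.selections.items()):
            if start >= pos:
                self.selections[sel_id] = (start + shift, end + shift)
            elif end > pos:
                self.selections[sel_id] = (start, end + shift)

    def resolve(self, sel_id: int) -> str:
        start, end = self.selections[sel_id]
        return self.text[start:end]


app = Application(text="See the HB2 design for details.")
hyperbase = Hyperbase()
src = app.create_selection(8, 11)    # covers "HB2"
dst = app.create_selection(12, 18)   # covers "design"
hyperbase.add_link(src, dst)
app.insert_text(0, "Note: ")         # the edit moves both selections
```

After the edit, the stored link is untouched and both IDs still resolve to the same text, because the address bookkeeping never left the application.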

DHM is an open hypermedia system based on the Dexter model [Grønbæk and Trigg 1992; Halasz and Schwartz 1990]. It allows the integration of applications and data not "owned" by the hypermedia. In contrast to Microcosm, DHM (by virtue of the Dexter model) introduces a notion of anchor as an endpoint of a link. Anchors consist of an identifier that can be referred to by links and values that identify the anchored part of some material. While this approach does introduce a valuable abstraction for dealing with link endpoints independent of links themselves, anchors in this setting are still essentially structural elements. SP2, however, goes one step further: it places this level of structural specification in application-dependent persistent selections and makes SP2 anchors computational entities that are capable of a wide range of application-independent behaviors.

Max is another hypermedia system based on the Dexter model [Bieber 1991]. Emphasis in Max is on augmenting non-hypertext systems, such as decision support and expert systems, with hypertext functionality. While currently implemented as a monolithic system, Max's layered, modular architecture makes it amenable to an open, communications-based implementation. Among its notable features is a strong emphasis on computation and the notion of bridge laws that are used to map application abstractions to hypertext entities.

The distinction between Max and SP2 is subtle but significant. In SP2, the only application-level abstraction about which the external environment is aware is the persistent selection identifier. The precise notion of persistent selection for a given application is defined by that application. That is to say, the policies and mechanisms relating to persistent selection are implemented and managed entirely by applications. As a result, link service in SP2 is a "lightweight" service that can be used as the basis for additional behavioral layers, such as those introduced in Max to solve the problem of incorporating non-hypertext applications.

Finally, we consider the developmental history of the current prototype. SP2 grows out of work on an earlier system called System Prototype 0 (SP0). SP0, or PROXHY as it was later called, demonstrated the feasibility of integrating diverse applications under a common hypermedia model [Kacmar and Leggett 1991]. Hypermedia services were provided to applications through distributed processes within an object-oriented paradigm. In PROXHY, non-monolithic, inter-application linking was realized by moving hypermedia data and hypermedia functionality into a distinct hypermedia layer. The layer comprised anchor and link object classes that were implemented as communicating processes.

PROXHY is unusual in that its architecture integrates principles from hypermedia, process, and object-oriented models of software construction. Doing so significantly increases the power of the hypermedia model because it allows computation, concurrency, and arbitrary complexity to be incorporated into hypermedia components. In addition, hypermedia data and functionality are separated from applications, thus enabling an open architectural organization for the system.

Two subsequent prototypes, SP1 and SP2, extended this earlier work in several ways. Perhaps most important, they were built on top of separable, network-accessible hyperbase management systems. SP1 utilized HB1 and was a single-user system [Schnase et al. 1993]; the current prototype, SP2, was built on top of HB2 and can accommodate multiple users.

PROXHY abstracted hypermedia functionality and connectivity data away from information into a distinct hypermedia layer or subsystem. Anchors and links shared responsibility for managing the structural, or connectivity, data that associated application-level objects to form hypermedia. In PROXHY, applications availed themselves of hypermedia services through a distributed, message-based model of object-oriented computation. Inheritance hierarchies and method invocations were managed by a message router.

In SP2, data abstraction goes one step further. Structural data are abstracted from hypermedia functionality, and both are abstracted away from the underlying information. Structure management is provided by a link services subsystem. The elaborate communications and inheritance mechanisms seen in PROXHY are not incorporated into SP2's link services subsystem. Philosophically, this approach makes link services a simpler and more broadly applicable utility. As indicated above, link services are relatively lightweight in comparison with hypermedia services, and provide a substrate upon which more elaborate layers can be extended.

Implications of the Architecture

We said in our introduction that SP2's architecture can enhance the applicability of hypermedia technology to many large-scale, information intensive applications. There are several reasons why we believe this is so:

SP2 is based on a powerful and flexible model for hypermedia. It differs from most models in three important ways: (1) the model defines a computational view of hypermedia, (2) it has a strong notion of anchor and link, and (3) it abstracts information, behavior, and structure from hypermedia.

Only a few existing systems incorporate a notion of anchor, and in most cases these are merely data structures that specify the endpoint of a link. Anchors in this traditional view correspond closely to persistent selections in SP2's model. Anchors in our model, however, embody behavioral rather than structural characteristics. "Attaching an anchor" means that a (potentially application-independent) behavior is associated with a persistent selection.

The rationale for a strong, computational view of anchor is simple: we believe anchors are an appropriate locus for several classes of behaviors that are only indirectly related to traversal. Anchors can cooperate with application processes to customize views, filter information, coordinate searches, or even monitor events that may happen in the future. Since they are arbitrary processes, anchors can be as complex as needed -- they can, for example, be interfaces to information retrieval systems, expert systems, or database management systems. In fact, anchors could actually be systems such as these.
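
As a rough illustration of "attaching an anchor," the Python sketch below associates behaviors with a persistent selection ID. All names (AnchorRegistry, attach, activate, and the two sample behaviors) are hypothetical, and for brevity anchors are modeled as in-process callables rather than the separate processes the model describes.

```python
from typing import Callable, Dict, List

# Hypothetical type: an anchor behavior receives the selection's current
# content and returns some result for the application to present.
AnchorBehavior = Callable[[str], str]


class AnchorRegistry:
    """Hypothetical registry associating behaviors with persistent selection IDs."""

    def __init__(self) -> None:
        self._anchors: Dict[int, List[AnchorBehavior]] = {}

    def attach(self, selection_id: int, behavior: AnchorBehavior) -> None:
        # "Attaching an anchor": bind a behavior to a persistent selection.
        self._anchors.setdefault(selection_id, []).append(behavior)

    def activate(self, selection_id: int, content: str) -> List[str]:
        # Run every behavior attached to the selection, in attachment order.
        return [behavior(content) for behavior in self._anchors.get(selection_id, [])]


def uppercase_view(content: str) -> str:
    """Sample behavior: customize the view of the selected content."""
    return content.upper()


def keyword_filter(content: str) -> str:
    """Sample behavior: filter the content down to words longer than three letters."""
    return " ".join(word for word in content.split() if len(word) > 3)


registry = AnchorRegistry()
registry.attach(1, uppercase_view)
registry.attach(1, keyword_filter)
results = registry.activate(1, "open hypermedia is fun")
```

Note that neither behavior knows how selection 1 is addressed inside its document; that detail stays with the owning application, which is what makes such behaviors potentially application-independent.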

Links in our model are also behavioral entities. We see links being primarily responsible for behaviors related to the traversal of associations. Links can cooperate with anchor processes to create "destination" processes, disambiguate multiple alternative "destinations", or perform other activities generally associated with navigation.
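
One way to picture a link as a behavioral entity is the following Python sketch, in which a link chooses among several candidate destinations and then asks for a destination process to be started. The Destination and Link classes, the choose and launch callbacks, and the application names are hypothetical; a real system would start an operating system process rather than return a string.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Destination:
    """Hypothetical link endpoint: an application name plus a persistent selection ID."""
    application: str
    selection_id: int


class Link:
    """Hypothetical link: candidate destinations plus a traversal behavior."""

    def __init__(self, destinations: Sequence[Destination],
                 choose: Callable[[Sequence[Destination]], Destination]) -> None:
        self.destinations = list(destinations)
        self.choose = choose

    def follow(self, launch: Callable[[Destination], str]) -> str:
        # Disambiguate among alternatives, then create the "destination" process.
        target = self.choose(self.destinations)
        return launch(target)


def prefer_editor(candidates: Sequence[Destination]) -> Destination:
    """Sample disambiguation policy: favor the editor, else take the first candidate."""
    for candidate in candidates:
        if candidate.application == "editor":
            return candidate
    return candidates[0]


def launch(destination: Destination) -> str:
    """Stand-in for process creation; only reports what would be started."""
    return f"started {destination.application} at selection {destination.selection_id}"


link = Link([Destination("viewer", 7), Destination("editor", 3)], prefer_editor)
outcome = link.follow(launch)
```

Because the disambiguation policy is passed in rather than built into the link, different traversal behaviors can be substituted without touching the stored endpoints.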

When anchors and links become arbitrary processes in a hypermedia system architecture, the potential of these systems is enormously increased. Hypermedia systems move from the class of "interesting" applications toward operating systems and other software in the class of systems support. We believe that a strong emphasis on computation will ultimately allow hypermedia systems to achieve their potential as a new operating paradigm for computing.

We have said repeatedly that this model is distinctive in the way it abstracts structure, information, and behavior from hypermedia. Until recently, hypermedia systems have tended to focus on abstracting only structure and information. They have done so in order to implement contexts -- separate collections of links that can be applied to the same information. Since these systems have typically been implemented as monolithic applications, behavioral abstraction has largely been ignored. The behaviors that characterize a given system become encapsulated within the systems themselves.

There is an important rationale for removing addressing data from the behavioral characteristics of anchors and links: doing so promotes extensibility of hypermedia functionality through an open systems approach. In addition, it supports the massive distribution of hypermedia functionality required of network-optimized computing environments. The abstractions provided by most existing hypermedia data models often predispose system designers toward monolithic implementations, and, as a consequence, the global structure of hypermedia tends to be relatively inaccessible and behaviors are not easily modified or extended [Schnase et al. 1993]. Our data model is intended to encourage implementation approaches that avoid encapsulating structural information with structure-independent behaviors. It also facilitates the implementation of operations that are essentially structural, such as computing virtual structures, constructing graphical overviews, and performing structural searches.

In order for hypermedia technology to be useful in large-scale, real-world settings, it must facilitate the integration of diverse tools, data, and services. Users do not want to be confined to a particular application; they want to be able to create information using any application they wish, then build links among pieces of related information [Davis et al. 1992; Malcolm et al. 1991].

SP2's architecture instantiates a framework for extensibility and tailorability at all levels within the conceptual organization of a hypermedia environment. New applications can be added to the system at any time, as can components that implement hypermedia functionality. Due to its highly modular organization and simple inter-module interfaces, entire new subsystems can be readily incorporated into the architecture. Since information can be linked to information across application boundaries, tool integration is achieved without the requirement that tools "understand" each other's data formats. We see this form of information integration as critical to advanced information systems.

It is also clear that platform independence and network distribution are important for advanced hypermedia systems. Users often wish to seamlessly link from applications and information on one machine to information and applications on another [Davis et al. 1992; Malcolm et al. 1991]. SP2's process architecture, client/server approach, and the strongly computational orientation of its underlying model provide a framework for widespread distribution of functionality. Use of the X Window System for IPC enables distribution across physical platforms and operating systems. Although other approaches to distribution are allowed by the architecture, few accommodate as wide a range of platforms as the X Window System.

Conceptually, SP2's design places no limit on the ways in which its basic process architecture can grow. In particular, we believe that the HB2 hyperbase management system can evolve to support massively distributed, heterogeneous data stores. Its modular and flexible architecture is designed to assimilate emerging software and hardware innovations. This will ultimately allow the HBMS to manage the data capacities envisioned for future hypermedia systems [Leggett et al. 1993; Wiil and Leggett 1993; Schnase 1992].

SP2's modularity and the clear boundaries and interfaces among its subsystems facilitate the development of programming tools that simplify construction of client and server processes. One of the important results of this work was the creation of the X Link Services Toolkit. The Toolkit provides a convenient application development environment, thereby encouraging experimentation and further development of the architecture. SP2 is proving to be a useful environment for research on advanced hypermedia system designs.

Experiences and Future Work

SP2 provides a proof of principle for the system's architecture and underlying model of hypermedia. Performance has not been a major concern to us during this initial phase of research, although preliminary results indicate that entirely satisfactory response times and behaviors are achievable. To date, small, experimental information spaces have been manipulated, but we feel that the system is sufficiently mature to transition to more challenging, real-world applications.

Research on the architecture is continuing in two laboratories. The Hypermedia Research Lab at Texas A&M University is continuing hyperbase management system development. The next version, HB3, will provide support for object composition, version management, and enhanced support for collaborative use [Wiil and Leggett 1993]. In addition, a series of new participating applications is being designed, and work continues on the development of anchors, links, and other forms of agency.

The Advanced Technology Group at the Washington University School of Medicine is using the SP2 architecture in its StudySpace project. StudySpace is combining large, interactive display technology (Xerox PARC's LiveBoards [Elrod et al. 1992]), portable computers, and workstations to enhance highly focused, task-oriented individual and group learning over digital library information. A major focus of the project is the development of annotation and link services software that allows personalization of large, shared, read-only information repositories and the integration of these repositories with personal digital libraries [Schnase and Frisse 1993].

Summary

The requirements of advanced information systems demand innovative new approaches to software design. In this paper, we have described an open hypermedia system architecture that is intended to address some of these requirements. The prototypic implementation of this architecture, called System Prototype 2 (SP2), incorporates several distinguishing features. For example, SP2:

Experiences gained so far have been used to identify several important issues to be addressed in future research. SP2 has demonstrated its effectiveness in the laboratory and is being incorporated into several real-world applications. We believe that the work introduces an architectural approach that will facilitate greater use of hypermedia in advanced information systems.

Acknowledgments

Thanks to Ed Cunnius, Mark Frisse, and Ted Metcalfe for helpful advice on early drafts of the paper. We also appreciate input from Michael Bieber, Tomás Isakowitz, and four anonymous referees.

References

Accetta, M., Baron, R., Bolosky, W., Golub, D., Rashid, R., Tevanian, A., and Young, M. 1986. Mach: A new kernel foundation for Unix development. In Proceedings of the Summer Usenix Conference (June), pp. 93--112.

Afrati, F., and Koutras, C. 1990. A hypertext model supporting query mechanisms. In Hypertext: Concepts, Systems, and Applications. Proceedings of the European Conference on Hypertext (France, Nov.), A. Rizk, N. Streitz, and J. Andre, Eds. Cambridge University Press, Cambridge, UK, pp. 52--66.

Akscyn, R., McCracken D., and Yoder, E. 1988. KMS: A distributed hypermedia system for managing knowledge in organizations. Commun. ACM 31, 7 (July), 820--835.

Bieber, M. 1991. Issues in modeling a "dynamic" hypertext interface for non-hypertext systems. In Proceedings of the Third ACM Conference on Hypertext (Hypertext '91), (San Antonio, TX, Dec.), pp. 203--217.

Davis, H., Hall, W., Heath, I., Hill, G., and Wilkins, R. 1992. Towards an integrated information environment with open hypermedia systems. In Proceedings of the Fourth ACM Conference on Hypertext (European Conference on Hypertext (ECHT '92)), (Milano, Italy, Nov.), pp. 181--190.

Delisle, N., and Schwartz, M. 1986. Neptune: A hypertext system for CAD applications. In Proceedings of the ACM International Conference on the Management of Data (SIGMOD), pp. 132--143.

Drufke, J. E., Leggett, J. J., Hicks, D. L., and Schnase, J. L. 1991. The derivation of a hyperText widget class from the Athena text widget. Department of Computer Science Technical Report No. TAMU-HRL 91-002, Texas A&M University, College Station, TX.

Elrod, S., Bruce, R., Gold, R., Goldberg, D., Halasz, F., Janssen, W., Lee, D., McCall, K., Pedersen, E., Pier, K., Tang, J., and Welch, B. 1992. LiveBoard: A large interactive display supporting group meetings, presentations and remote collaboration. In Proceedings of the Computer-Human Interaction Conference (CHI '92), (Monterey, CA, May), pp. 599--607.

Fountain, A. M., Hall, W., Heath, I., and Davis, H. C. 1990. MICROCOSM: An open model for hypermedia with dynamic linking. In Hypertext: Concepts, Systems and Applications, Proceedings of the European Conference on Hypertext, (INRIA, France, Nov.), A. Rizk, N. Streitz, and J. Andre, Eds. Cambridge University Press, Cambridge, UK, pp. 298--311.

Furuta, R., and Stotts, P. D. 1990. The trellis hypertext reference model. In Proceedings of the Hypertext Standardization Workshop, (Gaithersburg, MD, Jan.), pp. 83--93.

Garg, P. K. 1988. Abstraction mechanisms in hypertext. Commun. ACM 31, 7 (July), 862--870.

Garrett, L., Smith, K., and Meyrowitz, N. 1986. Intermedia: Issues, strategies, and tactics in the design of a hypermedia document system. In Proceedings of the CSCW '86 Conference, (Austin, TX, Dec.), pp. 163--174.

Ghezzi, C., Jazayeri, M., and Mandrioli, D. 1991. Fundamentals of Software Engineering. Prentice-Hall, Inc., Englewood Cliffs, NJ.

Grønbæk, K., and Trigg, R. 1992. Design issues for a Dexter-based hypermedia system. In Proceedings of the Fourth ACM Conference on Hypertext (European Conference on Hypertext (ECHT '92)), (Milano, Italy, Nov.), pp. 191--200.

Halasz, F., and Schwartz, M. 1990. The Dexter hypertext reference model. In Proceedings of the NIST Hypertext Standardization Workshop, NIST, Gaithersburg, MD, pp. 95--133.

Kacmar, C. J., and Leggett, J. J. 1991. PROXHY: A process-oriented extensible hypertext architecture. ACM Trans. Inf. Syst. 9, 4 (Oct.), 399--419.

Lange, D. 1990. A formal model for hypertext. In Proceedings of the NIST Hypertext Standardization Workshop, NIST, Gaithersburg, MD, pp. 145--166.

Leggett, J. J., Schnase, J. L., Fox, E., and Smith, J. 1993. Proceedings of the NSF Workshop on Hyperbase Management Systems, Washington, DC.

Malcolm, K. C., Poltrock, S. E., and Schuler, D. 1991. Industrial strength hypermedia: Requirements for a large engineering enterprise. In Proceedings of the Third ACM Conference on Hypertext (Hypertext '91), (San Antonio, TX, Dec.), pp. 13--24.

Meyrowitz, N. 1989. The missing link: Why we're all doing hypertext wrong. In The Society of Text: Hypertext, Hypermedia, and the Social Construction of Information, Edward Barrett, Ed. The MIT Press, Cambridge, MA, pp. 107--114.

Noll, J., and Scacchi, W. 1991. Integrating diverse information repositories: A distributed hypertext approach. IEEE Comput., (Dec.), 38--45.

Nye, A. and O'Reilly, T. 1990. X Toolkit Intrinsics Reference Manual. O'Reilly & Associates Inc., Sebastopol, CA.

Ousterhout, J. K., Cherenson, A. R., Douglis, F., Nelson, M. N., and Welch, B. B. 1988. The Sprite network operating system. IEEE Comput. 21, 2 (Feb.), 23--36.

Pearl, A. 1989. Sun's link service: A protocol for open linking. In Proceedings of the Second ACM Conference on Hypertext (Hypertext '89), (Pittsburgh, PA, Nov.), pp. 137--146.

Pearl, A., and Walsh, D. 1988. Sun's NSE Link Service: A Broker for Integrating Autonomous CASE Tools. Sun Microsystems, Inc., Mountain View, CA.

Schnase, J. L. 1992. HB2: A Hyperbase Management System for Open, Distributed Hypermedia System Architectures. PhD dissertation (University Microfilms #9300506). Texas A&M University, College Station, TX.

Schnase, J. L., and Frisse, M. E. 1993. The StudySpace Project: Design and implementation of a learning environment based on collaborative use of digital medical libraries. School of Medicine Library and Biomedical Communications Center Technical Report No. WUMS 93-01, Washington University School of Medicine, St. Louis, MO.

Schnase, J. L., Leggett, J. J., Hicks, D. L., and Szabo, R. L. 1992. Semantic data modeling of hypermedia associations. ACM Trans. Inf. Syst., 11, 1, 27--50.

Schnase, J.L., Leggett, J.J., Hicks, D.L., Nuernberg, P.J., and Sánchez, J.A., 1993. Design and implementation of the HB1 hyperbase management system. Electronic Publishing-Origination, Dissemination and Design, 6, 1, 125--150.

Schütt, H. A., and Streitz, N. A. 1990. Hyperbase: A hypermedia engine based on a relational database management system. In Hypertext: Concepts, Systems, and Applications. Proceedings of the European Conference on Hypertext (France, Nov.), A. Rizk, N. Streitz, and J. Andre, Eds. Cambridge University Press, Cambridge, UK, pp. 95--108.

Smith, J. B., and Smith, F. D. 1991. ABC: A hypermedia system for artifact-based collaboration. In Proceedings of the Third ACM Conference on Hypertext (Hypertext '91), (San Antonio, TX, Dec.), pp. 179--192.

Stonebraker, M., and Kemnitz, G. 1991. The POSTGRES next-generation database management system. Commun. ACM 34, 2, 78--92.

Tanenbaum, A. S., van Renesse, R., van Staveren, H., Sharp, G. J., Mullender, S. J., Jansen, J., and van Rossum, G. 1990. Experiences with the AMOEBA distributed operating system. Commun. ACM 33, 12 (Dec.), 46--63.

Wiil, U. K. 1990. Design and implementation of a hyperbase. Department of Mathematics and Computer Science, Institute for Electronic Systems Technical Report No. IR 90-03, The University of Aalborg, Aalborg, Denmark.

Wiil, U. K., and Leggett, J. J. 1992. Hyperform: Using extensibility to develop dynamic, open and distributed hypertext systems. In Proceedings of the Fourth ACM Conference on Hypertext (European Conference on Hypertext (ECHT '92)), (Milano, Italy, Nov.), pp. 251--261.

Wiil, U. K., and Leggett, J. J. 1993. Concurrency control in collaborative hypertext systems. In Proceedings of the Fifth ACM Conference on Hypertext (Hypertext '93), (Seattle, WA, Nov.), (to appear).

Wiil, U. K., and Østerbye, K. 1990. Experiences with HyperBase -- A multi-user back-end for hypertext applications with emphasis on collaboration support. Department of Mathematics and Computer Science, Institute for Electronic Systems Technical Report No. IR 90-38, The University of Aalborg, Aalborg, Denmark.

Yankelovich, N., Haan, B., Meyrowitz, N., and Drucker, S. 1988. Intermedia: The concept and the construction of a seamless information environment. IEEE Comput. 21, 1 (Jan.), 81--96.

Zobel, J. Wilkinson, R., Thom, J., Mackie, E., Sacks-Davis, R., Kent, A., and Fuller, M. 1991. An architecture for hyperbase systems. In Proceedings of the First Australian Multi-Media Communications, Applications and Technology Workshop, pp. 152--161.