Peter J. Nürnberg
Department of Computer Science, Aarhus University
phone: +45 8942 3281; fax: +45 8942 3255
John J. Leggett
Center for the Study of Digital Libraries, Texas A&M University
phone: +1 (409) 845 0298; fax: +1 (409) 847 8578
Abstract: In this paper, we consider one possible strategy for planning a World Wide Web research effort in the face of the rapid change that characterizes our field. We begin by providing an overview of the basic WWW model and some of the types of extensibility currently widely available. Then, we consider three closely related areas of research with respect to identifying both solutions we can adopt and problematic or difficult research questions we will need to face. We conclude with some observations on the general characteristics that many potentially interesting and worthwhile WWW research efforts share and on how we can learn in a principled way from our colleagues in other fields.
Keywords: WWW, research agenda, rhetorics, aesthetics, open hypermedia system (OHS), component-based open hypermedia system (CB-OHS), operating system
Trying to plot a course for WWW research can feel like trying to find an unfamiliar freeway exit at night. Things are moving too quickly to get a good idea of all that is going on around you or to pay attention to all that you feel you should. How can you start to tackle a research problem without feeling like there is a good chance someone else or some company will announce a solution before you finish or, worse still, already has, and you haven't yet heard?
Of course, surprises, competition, and a need to keep current are a part of every field, but the magnitude of these issues surrounding WWW research can seem particularly daunting. How can we avoid duplicating old work, avoid already discovered problems, and learn from our colleagues in this field and others in a principled way? What we need is a "roadmap" that will allow us to navigate the research field at the high rate of change that characterizes the WWW. Nothing can replace the need to keep abreast of breaking news and cutting edge results, but we can augment this by partially filling in our roadmap with the landmarks described by researchers in closely related fields.
Although there are many types of landmarks we could identify, there are two types we will consider here that we find especially useful. The "good road" type marks the path others took in similar fields that led to successes. We can try to adopt and adapt these success stories to our own field. This allows us to avoid "reinventing wheels". The "rough road" type marks difficult and problematic paths in similar fields that still point toward uncharted territories. We can get a feeling for what the "hard issues" in our field will be by seeing what problems others have worked on for a long time with no clear solution.
The body of this paper is divided into two parts. First, we briefly review where we as WWW researchers have been. This allows us to define what part of our roadmap we have already filled in. Second, we discuss three related fields, describing some representative examples of landmarks we can use, both of the good and rough road variety. This allows us to get a feel for some of the still unexplored areas of our map and how we might begin to address them.
In this section, we sketch the known parts of our roadmap by briefly reviewing the progress of WWW research. First, a small description of our starting point is in order. We take this starting point to be the state of the WWW in 1991. We can consider the computing infrastructure to be divided into two parts: client and server. The basic WWW client is a browser. It has two main functions. Firstly, it acts as an HTML renderer by interpreting markup and displaying file contents. Secondly, it acts as an http client, making requests of http servers, such as requesting the contents of a file. The basic WWW server is an http daemon or server, responding to service requests made by clients (e.g., browsers). In principle, one can liken this architecture to any distributed file service, such as that implemented by ftp clients and servers. The major differences lie in the interface that WWW browsers supply and the notion of URL interpretation that both clients and servers can perform (albeit in different ways and for different purposes). A more detailed description of this basic model is available in many places, among them .
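The request half of this client-server exchange can be sketched in a few lines. The following is illustrative only: the HTTP/1.0 request format is used for concreteness, and `build_get_request` is a hypothetical helper, not part of any browser.

```python
from urllib.parse import urlparse

def build_get_request(url):
    """Form the request a basic WWW browser would send to an http
    server when asking for the contents of a file."""
    parts = urlparse(url)
    path = parts.path or "/"          # an empty path means the server root
    if parts.query:
        path += "?" + parts.query
    return "GET {} HTTP/1.0\r\nHost: {}\r\n\r\n".format(path, parts.hostname)
```

A browser would send this string over a socket to the server named in the URL and then render whatever document comes back.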
In the remainder of this section, we consider several different kinds of extensibility that have been added to this basic model and become widely adopted. Of course, an exhaustive review is outside the scope of this paper (see  for more complete reviews and surveys), but we hope to provide a feel for some of the well-trodden areas in WWW research. We divide our review into three categories: frontend, middleware, and backend. In each category, we mention some of the improvements and progress made that best inform our discussion in later sections.
Figure 1. Basic WWW architecture and some standard extensibility options. The highlighted boxes are those present in the basic WWW model of 1991. The remainder are types of extensibility we discuss here.
By frontend (of the WWW), we essentially mean browsers. However, we also wish to include more generally other software entities and abstractions managed by the browser. Modern browsers have significantly evolved from the completely monolithic clients of the earliest WWW days. In this section, we will consider five types of frontend extensibility that have been made: helper applications, plug-ins, increased browser functionality, style sheets, and applets.
Helper applications are programs that are registered with the browser as new kinds of rendering servers. For example, a Graphics Interchange Format (GIF) picture viewer may be registered as a helper application with a browser, indicating to the browser that certain types of "documents" that the user requests to be displayed (namely, in this case, GIF files) should not be rendered by the browser itself (even if the browser can render the document), but that an instance of the helper application should be started to display the document. The notion of "document" can be broadly interpreted in this context. For example, telnet clients are often registered with browsers as helper applications that are responsible for the display of "telnet session" documents. The notion of helper application registration provides a type of run-time dynamic rendering extensibility. However, these helper applications are in general "hypermedia dead-ends" (i.e., documents rendered by helper applications cannot normally make requests of WWW backends to enable further browsing).
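The registration mechanism can be pictured as a run-time table mapping content types to external viewer commands. This is a toy sketch, not any particular browser's implementation; the registry name and commands are hypothetical.

```python
# Hypothetical run-time registry: content types the browser should hand
# off to an external viewer process rather than render itself.
helper_registry = {}

def register_helper(content_type, command):
    helper_registry[content_type] = command

def dispatch(content_type):
    """Return the external command to spawn, or None to render in-browser."""
    return helper_registry.get(content_type)

register_helper("image/gif", "xv")                  # a GIF picture viewer
register_helper("application/x-telnet", "telnet")   # a "telnet session" document
```

Because the table can be updated while the browser runs, new document types can be handled without rebuilding the browser, which is the run-time dynamic rendering extensibility described above.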
Plug-ins are related to helper applications, in that they provide a dynamic rendering extensibility. There are some differences between helper applications and plug-ins, however. The major difference is that plug-ins are not simply spawned as completely separate processes on behalf of the browser, but actually remain under the control of the browser process itself. Plug-ins take the form of extensions that are dynamically loaded at browser start-up, providing the browser with subroutines that can render documents. Plug-ins are as a rule hypermedia dead-ends as well.
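The contrast with helper applications can be made concrete: a plug-in is a subroutine bound into the browser process rather than a separately spawned program. A minimal sketch follows, with a Python decorator standing in for dynamic loading at browser start-up; all names are hypothetical.

```python
# Plug-ins stay under the control of the browser process: at start-up the
# browser binds rendering subroutines it can then call directly.
plugins = {}

def plugin(content_type):
    """Decorator standing in for dynamic loading of a rendering subroutine."""
    def bind(render_fn):
        plugins[content_type] = render_fn
        return render_fn
    return bind

@plugin("image/png")
def render_png(data):
    # a real plug-in would draw into the browser's own window
    return "rendered {} bytes of PNG in-process".format(len(data))

def render(content_type, data):
    fn = plugins.get(content_type)
    return fn(data) if fn else "no plug-in; spawn a helper or save to disk"
```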
Browsers are also taking on an increasing number of different functions, which can be seen as a kind of "static rendering extensibility." For example, many browsers can interact with servers other than WWW servers, such as ftp daemons. They can make the appropriate service requests of these different servers and then interpret (i.e., render) the data as if it were retrieved from a WWW backend. Additionally, many browsers now also function as mail tools and news readers. This is different from the extensibility effected by plug-ins and helper applications due to its static nature and the different frameworks and protocols from which the rendered "documents" (i.e., ftp directories, e-mail messages, news articles, etc.) are retrieved.
Style sheets are a kind of "meta-document" for browsers. They provide instructions on how other documents should be rendered. This allows the abstraction of certain types of rendering information from the documents themselves. From the WWW frontend point of view, they provide a kind of static rendering knowledge extensibility.
Applets provide a client-side computation facility for the WWW. They are program segments retrieved from WWW backends and interpreted (executed, or even "rendered") by the browser process. Although there are security restrictions on what instructions or actions applets are allowed to perform, in principle, they provide an open and arbitrary computation facility, thus providing a radical form of extensibility, not only for document rendering, but for other arbitrary actions. Applets are loaded dynamically at run-time and thus bear some resemblance to helper applications in this regard. However, the origin of the code, where (how) it is executed, and what it does all differentiate applets from helper applications and other types of extensibility discussed above.
By middleware, we mean those software entities that reside between the frontend and backend. We are not including those entities such as network managers, file caching and replication managers, etc., which are considered general computing infrastructure on which the WWW resides. In this section, we consider two types of extensibility to which the appellation "WWW middleware" could apply: proxy servers and the object request brokers that are being integrated into some WWW software .
Proxy servers (or simply, proxies) intercept http requests sent by clients and process them in some way (usually differently than standard servers). Proxies can be seen as a type of dynamic retrieval extensibility, since proxies may be added to a system (i.e., may begin to intercept requests otherwise destined for a "normal" server) at any time. Proxies may "massage" service requests and then simply pass them along to a normal http server, or may dispatch them entirely independently. The presence or absence of a proxy at a particular WWW server site is transparent to WWW front-ends.
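A proxy's behavior can be sketched as a function wrapped around the normal server: it intercepts requests, may answer them itself (here, from a cache), and otherwise passes them along unchanged. This is a toy model of the control flow, not a real http implementation.

```python
def origin_server(path):
    # stand-in for a "normal" http server
    return "contents of " + path

def make_caching_proxy(upstream):
    """A proxy intercepts requests otherwise destined for the normal
    server; this one answers repeat requests from a local cache."""
    cache = {}
    def proxy(path):
        if path not in cache:
            cache[path] = upstream(path)   # pass the request along
        return cache[path]                 # transparent to the frontend
    return proxy
```

Because the frontend receives an ordinary response either way, the proxy's presence stays transparent, as noted above.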
The Object Management Group (OMG) has recently defined the Common Object Request Broker Architecture (CORBA) standard for object-oriented component frameworks . In this framework, objects communicate and use the services of one another through a common infrastructure. This infrastructure should define parameter marshalling and unmarshalling schemes, protocol formats, naming formats, and other standards to ensure interoperability between compliant components. One aspect of the CORBA framework is the so-called Object Request Broker (ORB), which allows components to register services or look up information about the services of other components. Some WWW software now has embedded ORBs, allowing it to act in essence as a part of a CORBA component framework. These ORBs may be used to collect and use information about the dynamically changing services available in the environment.
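Stripped of marshalling and protocol concerns, the core of an ORB is a registry that components can update and query at run time. The following toy sketch is illustrative only; it is not the CORBA API, whose interfaces are considerably richer.

```python
class RequestBroker:
    """Toy object request broker: components register services under a
    name, and other components look them up at run time as the set of
    available services changes."""
    def __init__(self):
        self._services = {}

    def register(self, name, provider):
        self._services[name] = provider

    def withdraw(self, name):
        self._services.pop(name, None)

    def resolve(self, name):
        if name not in self._services:
            raise LookupError("no component offers " + name)
        return self._services[name]
```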
By backend, we essentially mean http daemons (or http servers). However, we also wish to include more generally other software entities and abstractions managed by the server. As is the case with browsers, servers have also evolved from monolithic entities. In this section, we consider one type of backend extension: Common Gateway Interface scripts.
Common Gateway Interface (CGI) scripts provide a server-side computation facility. CGIs may be seen to provide a type of run-time dynamic document retrieval extensibility, since from the server point of view, they generate documents in a way analogous to the file system from which "normal" documents are retrieved. These scripts are program segments with a well-defined interface that allows the http server to pass parameters to them and retrieve the results. CGIs are referenced by clients just as documents are referenced (i.e., by URLs). From the client point of view, the URL in some sense names the document that is returned by the http server after the CGI is executed. This transparency between dynamically generated and statically stored documents has advantages in that users of browsers may not need or want to know how documents are generated or stored. Some potential disadvantages of this transparency are described in the next section.
In this section, we examine some landmarks on the maps of other research fields. We focus on fields that are closely related to our own so that we can recognize some similar territory and borrow some of these landmarks. As discussed above, we identify two types of landmarks in this paper: good road and rough road. Good roads mark paths that have led to successes that we may be able to adapt into our field. Rough roads mark still uncharted territories that have proven resistant to being fully mapped and understood. Both kinds of landmarks can be useful. Good road landmarks can save us time by preventing the constant reinvention of the wheel. Rough road landmarks most likely identify fundamental problems that will require a concerted, long-term research effort.
In this section we consider three related fields: hypermedia rhetorics and aesthetics, component-based open hypermedia systems, and operating systems. For each field, we give a short description of the subject matter it addresses and some ways in which it is related to our own, before proceeding to identify the good and rough roads marked by researchers in these fields. For each landmark we identify, we discuss how it may be adapted to our map.
A rich history of research exists in the rhetorical aspects of hypermedia writing and aesthetic aspects of hypermedia art [12, 16, 20, 24]. Much of this work builds upon schools of critical thinking that developed independently of hypermedia and have only since been adopted or have found new possibilities for expression and experimentation in hypermedia technologies. Although various forms of postmodernism (as articulated by Baudrillard, Derrida, and others) have been the inspiration for much of the work done in this area, older modern schools of thought such as structuralism and entirely new hypermedial interpretations (e.g., [4, 18]) have played their parts as well.
Figure 2. Partial roadmap of the hypermedia rhetorics and aesthetics field.
Many researchers have pointed out the need for rhetorics of arrival and departure in hypermedia (e.g., ). In other media forms, we have developed sets of conventions and cues to signal readers (or viewers, or, in general, consumers) to the starts and ends of "pieces" or units. For example, we often boldface section titles to signal to readers where new sections begin. We write introductory and summary paragraphs in major sections to provide the reader with senses of introduction and closure. Similar types of cues have been developed in many hypermedia systems. For example, KMS () provides a simple "warp" effect when navigation of a link occurs. This allows the reader to gain a better sense of when navigation occurs, especially when it is not explicitly initiated (e.g., server-pushed pages). One might argue that "net delay" provides a way to sense when links are traversed. While this may be true in many circumstances, it is clearly not true in all. Consider using a WWW browser to view local or cached files, or updates of only parts of a rendered page (e.g., frames) in response to a link traversal. We could borrow the idea of effecting rhetorics of arrival and departure by building browsers that provide some audio or visual cue when links are traversed.
The need for overviews or visualizations of hypermedia spaces and structures has long been the subject of research (see  for a review). A number of different aids have been proposed and implemented, including fish-eye views  and "overview cards" in NoteCards . These have proven to be useful to many hypermedia readers. There are many issues that make it difficult to provide such overviews on the WWW. Firstly, many overview construction methods used in early hypermedia systems simply do not scale well to the WWW. Secondly, no single software entity in the WWW actually has the information necessary to build even an extremely small subset of an overview, since structure is not explicit in WWW backends, and data (and implicit structure) are widely distributed. Nonetheless, there are valuable lessons to be learned about the effectiveness of providing such overviews. There has recently been some excellent work on the WWW in this area, especially with augmentation of the standard stack backtracking model, but this work provides these overviews through a mechanism external to the WWW itself. That is, servers do not calculate overview information on behalf of WWW clients, and browsers do not provide overview presentation as built-in functionality. It would be most helpful to integrate this functionality into the WWW infrastructure itself, guaranteeing its availability if desired and perhaps distributing the computation and network traffic load required. For example, each server could be responsible for providing overviews of information stored locally to it.
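The per-server approach just suggested can be sketched simply: each server scans its locally stored pages for anchors and publishes the resulting adjacency map as its contribution to an overview. This is a toy illustration only; the regular expression is far too naive for real HTML.

```python
import re

HREF = re.compile(r'href="([^"]+)"', re.IGNORECASE)

def local_overview(site):
    """Per-server overview: an adjacency map from each locally stored
    page to the URLs it links to. `site` maps a path to its HTML source."""
    return {path: HREF.findall(html) for path, html in site.items()}
```

A browser (or a coordinating service) could then merge the maps published by several servers, distributing the computation and network load as described above.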
Explainers, a notion first introduced in Intermedia , allow users to see some information about the destination of a link without actually traversing it. For example, moving the cursor over the anchor of a link might produce a small pop-up window that describes the destination document, provides its title, or explains its relationship to the source document. The WWW provides a very rudimentary mechanism similar to this in the display of the URL associated with an anchor. However, this URL is often not meaningful to human users, since it is meant as a machine-readable unique identifier. Many experienced WWW users may be able to glean some information about the destination of a link by examining URLs, but this is clearly an insufficient solution in general. To provide this type of explainer, browsers could either request from servers the title or some summary information for each document to which a link in the current source document points, or perhaps read the value of a special field in the anchor tag.
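Either variant reduces to the same small routine: prefer a human-readable field on the anchor when present, and fall back to the machine-readable URL otherwise. A sketch, assuming a title-like attribute on the anchor tag; the extraction is deliberately simplistic.

```python
import re

def explainer_text(anchor_html):
    """Prefer a human-readable title field in the anchor tag; fall back
    to the machine-readable URL, as WWW browsers do today."""
    m = re.search(r'title="([^"]*)"', anchor_html, re.IGNORECASE)
    if m:
        return m.group(1)
    m = re.search(r'href="([^"]*)"', anchor_html, re.IGNORECASE)
    return m.group(1) if m else ""
```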
One important issue that has confounded hypermedia researchers is that of computation presentation. Links may have arbitrary computations associated with their destination, such as CGI scripts. In some cases, it may be desirable to make this fact transparent to the user. In other cases, however, the user may want to know if there is a computation associated with a link destination, and even some details about the computation itself. For example, a user may wish to know if a computation is expected to take a long or short time, who the author of the computation is, or what version of the script the server has. Of course there are a host of technical problems associated with this issue, but the aesthetic issues of computation presentation are even more difficult. Few implementation attempts have been made in this regard, but it has been discussed theoretically for some time (see  for a good review). This promises to be a relevant and difficult issue for the WWW community as well.
Component-based open hypermedia systems (CB-OHS's) represent the latest thinking in the non-WWW hypermedia systems field. Historically, hypermedia systems started out as monolithic systems, with frontend, middleware, and backend all instantiated in one process. The advent of open link services in the late 1980's (e.g., PROXHY  and Sun Link Service ) saw the abstraction of the frontend into an open, distributed set of applications. Later, open hyperbase or client-server based systems opened and distributed the middleware/backend part of the architecture. Traditional open hypermedia systems or OHS's (e.g., Chimera , DHM , Microcosm ) added the opening of computations (or "behaviors") associated with link traversal and in some cases the distribution of the link service middleware itself. Most recently, component-based OHS's or CB-OHS's (e.g., HOSS  and HyperDisco ) have opened the link service (middleware) layer to include structure servers that serve structural abstractions other than the nodes and links of "traditional" associative hypermedia systems. Examples of such other kinds of structure are spatial hypermedia structures (e.g., VIKI ), taxonomic hypermedia (e.g., ) or issue-based hypermedia (). A more detailed review of CB-OHS work can be found in .
Figure 3. Partial roadmap of the component-based open hypermedia systems field.
As mentioned above, since the late 1980's, most hypermedia systems that have been developed have been open. Hypermedia researchers use the term open in a somewhat special way. Normally, any system that publishes its API and allows an open set of clients is considered open. By this definition, the WWW is open. However, an open hypermedia system is one that furthermore imposes no particular data model on its clients. By this definition, the WWW is not an open hypermedia system, since it requires its clients to use HTML as their data format if they want to participate in hypermedia services. The case for open hypermedia systems was made first by Meyrowitz . Meyrowitz pointed out that people do not want to forsake their favorite editing and browsing tools just to use hypermedia functionality. Hypermedia services are most helpful if they are integrated into the computing environment itself, allowing users to create and manipulate structure over data handled by any application, even creating structures that span data handled by different applications. Note that the presence of plug-ins or helper applications does not make the WWW an open hypermedia system, because the data handled by plug-ins and helpers in general are hypermedia dead-ends. Structure services are provided only to applications that use HTML as their data format (i.e., browsers). If the WWW is to become an open hypermedia system that supplies its services to arbitrary applications, it must separate structure representation from data and provide structure services orthogonal to storage services .
Contemporary CB-OHS's now serve open sets of structural abstractions instead of a fixed set of abstractions, as is the case with the WWW. The "node-link" structure model used by the WWW and many other hypermedia systems is very powerful. It allows one to build information spaces that can be traversed quickly and easily by following links between nodes of information. In essence, one builds these associative structures to aid in information location and retrieval. In his seminal article , Vannevar Bush described hypermedia as a retrieval aid, allowing external representation of the kinds of idiosyncratic structures we use in our memories to store and retrieve data. However, people use structure in other tasks besides simply information retrieval. Hypermedia researchers have been considering computer support for manipulation of structure in many other problem domains as well, including information analysis , argumentation support , and taxonomic work . Each of these problem areas calls for specialized and tailored structural abstractions above and beyond the node-link model, such as dynamic composites, Toulmin structures , or taxonomic hierarchies. The WWW could provide such different kinds of structures most effectively by allowing a kind of middleware layer extensibility, allowing the insertion of middle layer structure server components that tailor the basic abstractions provided by WWW servers to meet the specific needs of these other domains.
Hypermedia researchers have also had good results in the area of behaviors, or traversal computations. The WWW provides computational facilities at both the client and server ends of the system, but these facilities only allow computation to take place after a link is traversed, and not during the actual traversal itself. For many purposes, this distinction may not be relevant, but there are actions that endpoint computation alone cannot model. For example, consider the notion of "professional trailblazers", mentioned as far back as 1945 by Bush. Trailblazers mark up existing data with useful links. One reasonable economic model that could make the profession of trailblazing feasible is one in which users are charged a small fee (fractions of pennies) to traverse these value-added links. (For more details on one such model, see .) Note that these fees should be assessed only upon link traversal, not upon arrival at a given data node. Consider another example in which a link traversal computation chooses the server from which data should be retrieved. Such a computation could analyze network load to choose a server that contains a copy of the data based on an estimate of minimum retrieval time. Many other uses for traversal time computational facilities have been discussed in the literature (see [9, 27, 35] for more examples). Providing such facilities on the WWW could have many of the same benefits for WWW users as those described in these sources. Such traversal computations would constitute a new kind of backend extensibility.
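The replica-selection example reduces to a small traversal-time computation: given estimated retrieval times (however obtained, e.g., from network-load probes) for the servers holding copies of the destination, pick the server expected to answer fastest. A sketch, with hypothetical server names:

```python
def choose_replica(estimates):
    """A traversal-time computation: pick the server expected to deliver
    the destination document fastest. `estimates` maps server name to an
    estimated retrieval time in seconds."""
    return min(estimates, key=estimates.get)
```

Such a computation would run during traversal, after the link is activated but before any endpoint is contacted, which is precisely what the current WWW model cannot express.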
Separation of structure representation from data (as mentioned above in the discussion of openness) has had many benefits for those hypermedia systems that have implemented it. However, these benefits have come at a price. Surrendering control of the data format and storage facilities used by clients brings with it problems of link integrity and consistency. Link integrity with respect to the destination side of links is already a problem on the WWW, but source side integrity is trivially guaranteed. The source of a link is well-known and is persistently consistent, since it is embedded in the node data with which it is associated. However, link endpoints in open systems are not embedded in data; they are merely associated with parts of data files. There has been much work on how to keep these associations consistent across data modifications that can occur outside of the hypermedia system (see  for an excellent review). Nonetheless, the problem has not been (and by its nature, cannot be) completely solved. At best, open hypermedia system designers can implement heuristics to cope with such modifications. If future versions of the WWW become open, the difficult problems surrounding link integrity and consistency will need to be addressed.
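One common family of heuristics for coping with outside modifications is to record an anchor's text and offset, then search near the recorded offset before falling back to a whole-document search. The following is a simplified sketch; real systems keep richer context than this, and the function names are illustrative.

```python
def reanchor(text, anchor_text, old_offset, window=32):
    """Heuristic repair of a link endpoint after the underlying data was
    edited outside the hypermedia system: look near the recorded offset
    first, then anywhere. A return value of -1 means the link dangles."""
    start = max(0, old_offset - window)
    end = old_offset + window + len(anchor_text)
    pos = text.find(anchor_text, start, end)
    if pos == -1:
        pos = text.find(anchor_text)   # fall back to a whole-document search
    return pos
```

As the text notes, no heuristic of this kind can be complete: if the anchor text itself was rewritten, the association is lost.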
Operating system research concerns the design and implementation of basic computing infrastructure at the software level. Operating systems provide the environment in which all other computations occur. Silberschatz and Galvin  have divided the functionality provided by operating systems into two categories: that functionality which ensures efficient computer operation and resource allocation and that which ensures a convenient environment in which to develop and execute other computations. Concerns such as efficient use of primary memory through caching and pre-fetching are examples of functionality that belongs in the efficiency category, while those that concern provision of utility programs to users or development libraries to programmers are examples of functionality in the convenience category.
Figure 4. Partial roadmap of the operating systems field.
Much work has been done in operating systems in general and distributed file systems in particular on caching and replication [11, 33, 37, 42, 43]. Caching involves keeping local client copies of information that logically resides on remote servers. Despite the well-known cache consistency problems, caching has proven to be very useful in practice. Replication involves keeping copies of information that resides on one server at other servers. Replication can circumvent unavailability problems due to temporary network partitioning and can improve performance by lessening net delay. Consistency problems in replication are in essence the same as cache consistency problems. Most WWW browsers provide very simple caching functionality. WWW cache management could be improved in a number of ways by borrowing certain techniques from distributed file systems, such as client registration of change event notifications, more intelligent replacement policies that account for structural distance between nodes instead of a simple least-recently-referenced scheme, and recognition of data that was generated by state-dependent server-side computations (which need never be cached). WWW servers could also benefit from the introduction of a replication scheme. How can these best be provided? Again, operating systems provide us a good example. Many modern research operating systems are so-called "micro-kernel" systems . The core of the operating system is very small. Modules that handle file or memory management, for example, can simply be "plugged in". This suggests a radical open approach in which new caching and replication modules can be plugged into browsers and servers, providing a new kind of extensibility.
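The structural-distance replacement policy mentioned above can be sketched by measuring distance as the number of link traversals between nodes and evicting the cached node farthest from the one currently being read. A toy illustration, assuming the cache somehow knows the local link graph (a strong assumption on the real WWW):

```python
from collections import deque

def link_distance(links, src, dst):
    """Number of link traversals from src to dst (breadth-first search)."""
    seen, frontier = {src}, deque([(src, 0)])
    while frontier:
        node, d = frontier.popleft()
        if node == dst:
            return d
        for nxt in links.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, d + 1))
    return float("inf")   # unreachable: a prime candidate for eviction

def evict_candidate(links, cached, current):
    """Structure-aware replacement: drop the cached node that is
    structurally farthest from the node the reader is on now."""
    return max(cached, key=lambda n: link_distance(links, current, n))
```

Compared with a least-recently-referenced scheme, this keeps the pages a reader is most likely to reach next, at the cost of needing structural information the WWW does not currently expose.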
One area within operating systems that has seen a great deal of improvement in the last several decades is the user interface. Operating systems, like most computer programs, have user interfaces. Many operating systems have evolved beyond simple command-line oriented interfaces to sophisticated graphical user interfaces. Operating systems have benefited from adding useful and common functionality into the operating system interface itself. Perhaps the most widespread example of such functionality is cut-and-paste. This interaction could be implemented by each separate program. However, providing this functionality in the operating system ensures that all applications share a common and consistent implementation and also relieves application programmers from much of the burden of implementing this feature. Cut-and-paste was deemed such a basic editing function that it has migrated into the operating system. Is hypermedia linking also this fundamental? Is the act of linking information so basic that we should expect it to be available to all users of all programs at all times? Certainly, this has been suggested by numerous researchers (see, e.g., ). Perhaps moving the WWW into the operating system itself is a logical and reasonable next step in operating system and WWW development. The integration of MS Internet Explorer into the MS Windows operating system interface as the primary directory browser is a step in this direction, albeit only as frontend WWW integration. Experience suggests that a backend integration of WWW servers into the operating system may be fruitful as well.
One area in which operating system research has not progressed particularly far is the use of structure awareness in the operating system. We spoke above of managing data (caching and replication) and presenting structure, but we can also think of using structure in the operating system. One example of structure awareness in an operating system involves the notion of semantic locality. Researchers are trying to use semantic locality measures to pre-fetch server files into client caches (e.g., ), especially in the context of network partitioning (i.e., downloading "important" files onto a portable before disconnecting it from the network and taking it on a trip). Nürnberg et al.  suggest using structure traversal measures to help calculate such semantic distances. Other possibilities for structure awareness include providing new kinds of access control that include the notion of reference permissions. Consider allowing a set of users permission to read the data on the various sides of a link, but not permission to build or follow a particular link between them. This could be useful in the trailblazing example discussed above. All of these structure awareness examples call for operating system level extensions, moving the WWW further "down" into the computing environment.
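The reference-permission idea can be made concrete: reading an endpoint and following the link between endpoints are checked separately, so a user may hold read rights on both endpoints of a link yet lack the right to follow it (e.g., an unpaid value-added trail). This is a hypothetical sketch; no existing access-control model is implied.

```python
def may_follow(user, link, read_acl, traverse_acl):
    """Access control with a separate reference permission. `link` is a
    (source, destination) pair; `read_acl` maps a user to the nodes she
    may read, `traverse_acl` to the links she may follow."""
    src, dst = link
    return (src in read_acl.get(user, set())
            and dst in read_acl.get(user, set())
            and link in traverse_acl.get(user, set()))
```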
We started this paper by considering the roadmap of WWW research. We provided the state of the WWW in 1991 as a starting point and then described some of the kinds of extensibility mechanisms that have come into widespread use since then. We did not describe any specific extensions, but only where and how extensibility can be effected.
In looking at the fields of hypermedia rhetorics and aesthetics, component-based open hypermedia systems, and operating systems, we pointed to a number of landmarks or lessons we can learn from these other fields. We did not discuss specific extensions per se. Instead, we looked at where and how further types of extensibility can and should be added to both the WWW and to its environment. Let us briefly reconsider the extensibility mechanisms we discussed.
There will always be a place for interesting extensions to the WWW. However, we have argued in this paper that adding the ability to create new extensions can be an even more rewarding undertaking. Of course, such undertakings require concerted research and standardization efforts (e.g., OHSWG  and W3C ). Our roadmap suggests, however, that they have large potential payoffs and are not as subject to the phenomena we discussed at the beginning of this paper: namely, being "beaten to the punch" by other groups of researchers or commercial concerns. "Meta-extensibility" efforts require large numbers of researchers and research groups working together over multiple years. We believe, however, that such efforts constitute a sound and principled research agenda for the WWW.