e-drexler.com

Hypertext Publishing and the Evolution of Knowledge
(1987/1991)




Architectural sketch

These functions can be mapped onto a set of levels, sketching a system architecture. A more detailed exploration of constraints and design approaches may be found in Hanson [8].

Database level

At the core of a hypertext publishing system will be a network of library machines holding overlapping portions of the hypertext literature. These machines will have a database level (designed to store hypertext data, not traditional database data). This level will support distributed, fine-grained, full-hypertext service, together with triggers to notify users when specific changes occur. It seems desirable to seek general standards for representing links, text, graphics (at least for simple graphs and diagrams), and access and accounting information.

In developing these standards, one should avoid trying to standardize too much, lest the result be unimplementable, inefficient, or excessively restrictive. Conversely, one should avoid standardizing too little, lest the result be a set of incompatible publishing systems. Careful definition of a database interface and a few basic representation schemes seems a good compromise, leaving decisions about representational idioms, user interfaces, and much else free to evolve.
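
As a rough illustration of the database level described above, the following sketch models items, typed links, and triggers in memory. It is not drawn from any existing system: the names `Link`, `HypertextStore`, `on_new_link_to`, and the link kinds are all hypothetical, and a real library machine would add persistence, distribution, and access information.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Link:
    """A directed, typed link between two items (field names are illustrative)."""
    source: str
    target: str
    kind: str  # e.g. "supports", "criticizes", "cites" (assumed vocabulary)


@dataclass
class HypertextStore:
    """Toy in-memory stand-in for one library machine's database level."""
    items: Dict[str, str] = field(default_factory=dict)
    links: List[Link] = field(default_factory=list)
    # item id -> callbacks to run when a new link points at that item
    triggers: Dict[str, List[Callable[[Link], None]]] = field(default_factory=dict)

    def put_item(self, item_id: str, text: str) -> None:
        self.items[item_id] = text

    def on_new_link_to(self, item_id: str, callback: Callable[[Link], None]) -> None:
        """Register a trigger: notify when something links to item_id."""
        self.triggers.setdefault(item_id, []).append(callback)

    def add_link(self, link: Link) -> None:
        if link.source not in self.items or link.target not in self.items:
            raise KeyError("both ends of a link must be stored items")
        self.links.append(link)
        for callback in self.triggers.get(link.target, []):
            callback(link)

    def links_to(self, item_id: str) -> List[Link]:
        """Follow links backwards: everything that points at item_id."""
        return [link for link in self.links if link.target == item_id]


store = HypertextStore()
store.put_item("claim-1", "Hypertext lowers publication costs.")
store.put_item("reply-1", "Only if telecommunications costs fall too.")

notifications: List[Link] = []
store.on_new_link_to("claim-1", notifications.append)
store.add_link(Link("reply-1", "claim-1", "criticizes"))
```

Here the author of `claim-1` learns of the critical reply through the trigger, and any reader can find the reply by following links backwards from the claim.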

Access and accounting level

Closely related to the database level is the access and accounting level. This level ensures that authors are who they say they are, that royalties are paid when documents are read, that readers can see only public documents or documents to which they have been given access, and so forth. These constraints will reflect access and accounting information stored at the database level.
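
A minimal sketch of two of these checks, access control and royalty payment on reading, might look as follows. Everything here is an assumption made for illustration: the `Document` and `AccessAndAccounting` classes, the per-read royalty in cents, and the rule that authors read their own work free. Authentication (ensuring that authors are who they say they are) is left out entirely.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Document:
    author: str
    text: str
    royalty_cents: int = 0               # charge per read (hypothetical unit)
    readers: Optional[Set[str]] = None   # None means the document is public


@dataclass
class AccessAndAccounting:
    documents: Dict[str, Document] = field(default_factory=dict)
    balances: Dict[str, int] = field(default_factory=dict)  # user -> cents

    def read(self, user: str, doc_id: str) -> str:
        doc = self.documents[doc_id]
        # Access: public documents, or documents the user was given access to.
        if doc.readers is not None and user not in doc.readers and user != doc.author:
            raise PermissionError(f"{user} has no access to {doc_id}")
        # Accounting: move the royalty from reader to author before returning text.
        if user != doc.author and doc.royalty_cents:
            if self.balances.get(user, 0) < doc.royalty_cents:
                raise ValueError(f"{user} cannot cover the royalty")
            self.balances[user] -= doc.royalty_cents
            self.balances[doc.author] = self.balances.get(doc.author, 0) + doc.royalty_cents
        return doc.text


system = AccessAndAccounting(balances={"reader": 10})
system.documents["essay"] = Document(author="author", text="...", royalty_cents=3)
system.documents["draft"] = Document(author="author", text="...", readers={"editor"})

system.read("reader", "essay")       # public: allowed, and a royalty is paid
try:
    system.read("reader", "draft")   # restricted: this reader was not given access
    denied = False
except PermissionError:
    denied = True
```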

Agent level

The agent level consists of a computational environment near the database (in a cost-of-communication sense); this environment can contain agents to which users delegate rights, resources, and tasks. In particular, agents can examine large numbers of links and items at low cost, apply filter functions, and send users only those most likely to be of interest. Agents can also implement social software functions - for example, applying voting-and-rating algorithms to sets of reader evaluations and publishing the results.
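
The text does not specify a voting-and-rating algorithm, so the sketch below picks about the simplest one imaginable: average each item's scores, count only each reader's latest evaluation, and ignore items with too few votes. The function name `rate_items`, the `(reader, item, score)` tuple format, and the `min_votes` threshold are all assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def rate_items(
    evaluations: Iterable[Tuple[str, str, int]],
    min_votes: int = 2,
) -> List[Tuple[str, float]]:
    """Return (item, mean score) pairs, best first.

    Each evaluation is (reader, item, score). A reader's latest score for an
    item replaces any earlier one, so nobody is counted twice.
    """
    latest: Dict[Tuple[str, str], int] = {}
    for reader, item, score in evaluations:
        latest[(reader, item)] = score

    scores: Dict[str, List[int]] = defaultdict(list)
    for (_reader, item), score in latest.items():
        scores[item].append(score)

    rated = [
        (item, sum(vals) / len(vals))
        for item, vals in scores.items()
        if len(vals) >= min_votes
    ]
    # Highest mean first; ties broken by item name for a stable result.
    return sorted(rated, key=lambda pair: (-pair[1], pair[0]))


evaluations = [
    ("ann", "essay-a", 5),
    ("bob", "essay-a", 3),
    ("ann", "essay-b", 2),
    ("bob", "essay-b", 4),
    ("bob", "essay-b", 1),   # bob revises his score for essay-b
    ("cat", "essay-c", 5),   # a single vote: below min_votes, so not ranked
]
ranking = rate_items(evaluations)
```

An agent could publish `ranking` as a document in its own right, open to the same linking and criticism as any other.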

An agent level might use a secure, general-purpose language running under an accounting interpreter and accessing a set of secure, pre-compiled software tools. The latter could perform standard operations (such as reading, filtering, sorting, and merging) in a series of increments of bounded size. Secure in this context means able to operate only on data objects to which access has been given (the core of the Scheme language appears to have this property; for further discussion of language security, see [10,11]). To serve its essential function, accounting need only keep charges roughly proportional to incurred costs.
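
One way to picture a pre-compiled tool working in bounded increments under accounting is sketched below. The `Meter` class, the `metered_filter` function, and the charge of one unit per item examined are hypothetical; the only point illustrated is that work proceeds in small batches, each paid for in advance, and stops when the delegated budget runs out.

```python
from typing import Callable, Iterable, Iterator, List


class BudgetExhausted(Exception):
    """Raised when an agent's delegated resources run out."""


class Meter:
    """Charges roughly in proportion to work done (units are arbitrary)."""

    def __init__(self, budget: int) -> None:
        self.budget = budget
        self.spent = 0

    def charge(self, units: int) -> None:
        if self.spent + units > self.budget:
            raise BudgetExhausted(f"needs {units}, has {self.budget - self.spent}")
        self.spent += units


def metered_filter(
    items: Iterable[str],
    keep: Callable[[str], bool],
    meter: Meter,
    increment: int = 2,
) -> Iterator[List[str]]:
    """Filter items in increments of bounded size, charging before each one."""
    batch: List[str] = []
    for item in items:
        batch.append(item)
        if len(batch) == increment:
            meter.charge(len(batch))
            yield [x for x in batch if keep(x)]
            batch = []
    if batch:  # final, possibly smaller, increment
        meter.charge(len(batch))
        yield [x for x in batch if keep(x)]


titles = ["hypertext links", "cooking", "hypertext agents", "gardening", "hypertext cost"]
meter = Meter(budget=4)
kept: List[str] = []
try:
    for result in metered_filter(titles, lambda t: "hypertext" in t, meter):
        kept.extend(result)
except BudgetExhausted:
    pass  # the agent simply stops when its delegated budget is gone
```

With a budget of four units the agent examines the first four titles and never touches the fifth, so a runaway task cannot consume more than its user delegated.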

Telecommunications level

The telecommunications level consists of facilities for communications among library machines and with users. This involves interfaces to existing networks and protocols for identifying communicating parties, accessing remote data, and so forth.

User-machine level

The user-machine level should include a local database acting as a local cache and workspace for hypertext material, together with support for local filtering and display agents. Any of a variety of user interface engines might reside here. Different forms of hypertext might require different interfaces; a modular design in which these interfaces could be downloaded during a session could be very useful.
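
The local cache could take many forms; the sketch below assumes a small least-recently-used cache in front of a remote fetch. `LocalCache`, its `capacity`, and the `fetch_remote` callable (standing in for a request over the telecommunications level) are illustrative names, not part of any described design.

```python
from collections import OrderedDict
from typing import Callable


class LocalCache:
    """Keep recently read items on the user's machine; evict the least recent."""

    def __init__(self, fetch_remote: Callable[[str], str], capacity: int = 2) -> None:
        self.fetch_remote = fetch_remote
        self.capacity = capacity
        self.store: "OrderedDict[str, str]" = OrderedDict()
        self.remote_fetches = 0

    def get(self, item_id: str) -> str:
        if item_id in self.store:
            self.store.move_to_end(item_id)   # mark as most recently used
            return self.store[item_id]
        self.remote_fetches += 1              # cache miss: go to a library machine
        text = self.fetch_remote(item_id)
        self.store[item_id] = text
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)    # drop the least recently used item
        return text


cache = LocalCache(fetch_remote=lambda item_id: f"text of {item_id}")
for item_id in ["a", "b", "a", "c", "a", "b"]:
    cache.get(item_id)
```

Of the six reads in this example, two are served locally; a larger cache or a more clustered reading pattern would raise that share.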



Existing work

Jeff Conklin's Survey of Hypertext [4] describes existing work at some length. Only Memex [12], NLS/Augment [13], Xanadu [6], and Textnet [5] are described as being (existing or proposed) 'macro literary systems', a category that includes hypertext publishing systems. None has yet been implemented as an open system with true links and filtering. A recent, evolving design aimed at meeting all the conditions described above is the Linktext proposal [8].

Much work has been done on issues such as versions and the semantics of hypertext links [7,14,15], mechanisms for support of argumentation [5,16,17], and user interfaces [16,18,19,20,21,22]. This work will shape both the nature of hypertext publications and the software used to manipulate and display them. A goal for design of a database level is to enable use of the full range of higher-level facilities and idioms that have evolved in the current generation of private hypertext systems. This will enable us to use knowledge we have already evolved.



Advantages and problems

How can one judge the advantages and problems of a nonexistent medium? Perhaps not very well, yet an attempt may be worth the effort. Several approaches seem reasonable. One is to reason by means of analogies to present paper media, seeking analogous problems and solutions. Another is to apply solid, elementary economic principles: lower cost draws greater use, higher cost reduces it; greater reward draws greater contribution, reduced reward reduces it. Another is to place advantages and problems within the analytical framework of expression, transmission, and evaluation, relating them to critical discussion and the evolution of knowledge.



Expression

Hypertext publishing will aid expression in several ways. It will lower per-word publication costs by orders of magnitude: the cost of publishing a book's worth of material will fall from tens of thousands of dollars to tens of dollars. It will lower per-idea writing costs, sometimes by orders of magnitude: by enabling writers to link pithy ideas to an existing context, rather than forcing them to use hundreds or thousands of words to establish a context, it will make single-sentence publications useful and practical. Further, it will allow writers to express networks of facts and relationships by building corresponding networks of statements and links [23], extending the range of what can readily be said. These advantages in cost and quality of expression seem great.

Some problems:

'Who will write for it?'
Hypertext won't be a powerful medium of expression if no one writes for it. One might object that there will (at least initially) be too few authors, in part because the market will at first be too small to reward authors with either money or recognition. Can this start-up problem be overcome?

Given permission or lapsed copyright, hypertext can carry existing paper works. Scanners can input text with adequate accuracy, and readers can mark errors for correction. Royalties will encourage authors to give permissions. Hence the world of documents on the system need not be limited to those written for the system. Further, hypertext can be connected to the paper literature as well as that literature is to itself: Given unique names, each medium can reference works in the other. Thus, the hypertext literature need not suffer greatly from its initial small size.

In fact, one should expect to see much new writing from the start. Small, computer-oriented communities will be early users of hypertext publishing. New, interdisciplinary topics will be good candidates for the early elaboration of a hypertext literature. (One early topic will be hypertext itself.) Communities interested in such topics will generate their own incentives for publication and recognition. The amount of discussion that already occurs over computer networks suggests that a hypertext publishing medium need not starve for lack of material.

'Copyright won't work'
Copyright is intended to provide incentives for writing - but can copyright work in a computer-based medium, given the ease of copying? For all but bestsellers [8], it seems that it might. A hypertext system would sell not information, but the service of providing information, complete with current evaluations, links to further information, and so forth. To duplicate this service would require copying, advertising, and selling large bodies of information; since this can't be kept secret, reasonably effective enforcement of copyright seems practical.

Experience with expression

Of the advantages cited above, perhaps the least quantifiable was that of extending expressiveness by enabling authors to represent more complex relationships. Is this advantage real and substantial? In 'Theory Reform Caused by an Argumentation Tool' [23], Kurt VanLehn reports his experience with expressing and developing theories in cognitive psychology using the NoteCards hypertext system [16]. This medium enabled him to play with organizations of facts and theories in ways that revealed (and helped correct) serious flaws. Reflecting on his experience with hypertext, he writes:

The NoteCards database is about as close as any written artifact can get to expressing a whole theory. . . .we can expect NoteCards to help theorists clarify their ideas and make them rigorous. . . .Nowadays, I view my work as building a NoteCards database qua theory. To theorize without building a NoteCards database seems like programming without building a program. One can do it, but it's harder. . . .Because NoteCards databases are accurate representations of theories, they have excellent potential as vehicles for collaboration. . . .Being halfway between lab notes and journal articles may also make NoteCards a unique aid to graduate-level teaching. A NoteCards database would allow young theorists to crawl around inside a classic theory, getting to understand it more deeply than they could from journal articles. Incidentally, this is one answer to what could happen to NoteCard databases after their active development ceases. They might rest in graduate schools, embalmed in computational display cases for students to dissect.

This illustrates the utility of hypertext for expressing (and hence evaluating) complex ideas. And with a hypertext publishing medium available, a theory expressed in hypertext need not be embalmed, but can become part of a living, evolving literature.



Transmission

Hypertext publishing will aid transmission in several ways. It will reduce delays in distribution (by orders of magnitude), placing published material in front of readers in under one day, instead of hundreds. It will eventually reduce the cost of placing the capabilities of a research library at a site (by orders of magnitude), from tens of millions of dollars to the cost of some user machines. It will increase the speed of accessing referenced material (again, by orders of magnitude), retrieving it in seconds rather than the minutes or days required in a paper library system. It will increase the ease of finding a reference (by an additional, hard-to-guess factor), because it will encourage a more reference-dense writing style and provide a market for free-lance indexers [6]. These advantages in transmission seem great.

Some problems:

'The public won't be interested'
All these advantages would be of no value if no one used them, and most people won't use hypertext publishing any time soon. Most may never use it.

Likewise, most people don't read scientific journals, and most never will. Nonetheless, journals influence scientists, and their content spreads outward through books, magazines, newspapers, television, and conversation, ultimately having broad effects. Likewise, hypertext publishing might reach only a tiny minority directly, yet greatly affect the evolution of ideas and the course of events.

'Experts won't be interested'
It might seem that leading experts in a field will have little use for hypertext publishing, since their colleagues keep them well informed. And this might be so, if all fields were narrow and well-established. But many fields are broad and interconnected, and of interest not only to their experts, but to other scientists, engineers, policy-makers, scholars, and students. And even experts can benefit from ideas and criticism from foreign fields.

'Readers will get lost'
Transmission will fail if readers get lost; this has been a problem in experimental hypertext systems [4]. It might seem that the vast amounts of material in a publishing medium must worsen the problem.

But styles of use would evolve to suit readers, and success requires only that one scheme work well, even if most schemes have fatal problems. A conservative approach would emphasize hierarchical index structures, like those found in outline processors, allowing evolution of competing hierarchies suited to different fields and perspectives. Lowe's SYNVIEW work [17] suggests how a hierarchical structure can integrate indexing with critical discussion.

It may be that most users of hypertext publishing will chiefly read fairly conventional overview documents, dipping into tangled networks of representation and debate only in their own field, or in search of deeper understanding. Though these summaries might resemble conventional documents, their quality would reflect criticism based on knowledge evolved in the underlying hypertext debate.

'Reading will be too difficult'
Hypertext publishing has obvious problems with equipment cost, and with the speed and cost of telecommunications. If this made reading too difficult or expensive, it would create disadvantages in transmission.

These problems clearly limit the value of hypertext publishing, but by how much? The price/performance ratio of personal computers is already impressive, and improving rapidly, so equipment cost is a modest and declining problem. Telecommunications speed and cost are tolerable for serious work, but they remain the major problem. This motivates a search for ways to minimize the telecommunications bottleneck.

Some of the following approaches might eventually be worth pursuing: Monitor reading-frequency of works on the library machines, and observe which are read frequently over several months or more; then (subject to authors' permissions) copy the most-read half-gigabyte or so of this material and sell it as a CD-ROM, paying royalties in proportion to previous on-line readerships. This would let users buy personal databases containing several hundred books' worth of the most-read material on the system. In a complementary approach, one would assemble information in a similar way, but with selection biased by a user's interest profile and filter criteria; the result might be distributed on floppy disks and stored on writable media. Users could of course cache downloaded material on a local disk. In on-line use, a user's agent might pre-fetch material to a user's machine when a popular link came into view.
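
The CD-ROM approach can be sketched as a selection step followed by a royalty split. The greedy selection rule, the `Work` record, the sizes in megabytes, and the royalty pool in cents are assumptions chosen to keep the example small; the text specifies only that the most-read material is chosen, subject to authors' permissions, and that royalties follow previous readership.

```python
from typing import Dict, List, NamedTuple


class Work(NamedTuple):
    title: str
    size_mb: int
    reads: int        # on-line reads observed over the monitoring period
    permitted: bool   # whether the author allows redistribution


def select_for_disc(works: List[Work], capacity_mb: int) -> List[Work]:
    """Greedily take the most-read permitted works that still fit on the disc."""
    chosen: List[Work] = []
    used = 0
    for work in sorted(works, key=lambda w: w.reads, reverse=True):
        if work.permitted and used + work.size_mb <= capacity_mb:
            chosen.append(work)
            used += work.size_mb
    return chosen


def split_royalties(chosen: List[Work], pool_cents: int) -> Dict[str, int]:
    """Divide a royalty pool in proportion to previous on-line readership."""
    total_reads = sum(w.reads for w in chosen)
    if total_reads == 0:
        return {w.title: 0 for w in chosen}
    return {w.title: pool_cents * w.reads // total_reads for w in chosen}


works = [
    Work("A", 300, 900, True),
    Work("B", 250, 600, True),    # popular, but does not fit after A
    Work("C", 100, 500, False),   # author has not given permission
    Work("D", 150, 300, True),
]
chosen = select_for_disc(works, capacity_mb=500)
royalties = split_royalties(chosen, pool_cents=1200)
```

The personalized variant described in the text would differ only in the sort key, weighting each work's reads by its match to a user's interest profile.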

Together, strategies like these might greatly reduce the costs and delays of following a typical link, even without improved telecommunications services. And when compared to a conventional library system, even a fairly awkward and expensive system would shine.

'Who wants to read at a computer?'
Who wants to read at any workstation set up for businesslike typing? Experience shows that a Macintosh set up for reading-chair (rather than secretarial) ergonomics is an acceptable reading device (lean back, swing in the Mac, put up your feet. . . aah!), and reasonable for writing as well.

Experience with transmission

In speed, hypertext publishing will resemble electronic mail, and this is its least exotic advantage in transmission. The design of Common Lisp suggests how much speed alone can matter; the language is widely considered to be an excellent design (at least for a committee-designed standard). In Common LISP: The Language, Guy Steele writes:

The development of Common Lisp would most probably not have been possible without the electronic message system provided by the ARPANET. Design decisions were made on several hundred distinct points, for the most part by consensus, and by simple majority vote when necessary. Except for two one-day face-to-face meetings, all of the language design and discussion was done through the ARPANET message system, which permitted effortless dissemination of messages to dozens of people, and several interchanges per day. . . .It would have been substantially more difficult to have conducted this discussion by any other means, and would have required much more time. [24]

Simply speeding transmission can make a difference in problem solving.






© Copyright 1987, K. Eric Drexler, all rights reserved.
Original web version prepared by Russell Whitaker.