Introduction

The main contention of this paper is that today's Web infrastructure and practice do not exist in a vacuum, but in the context of a long and valuable history of literate systems, and that there are still many ways to benefit from that history right now. The notion of hypertext includes the phenomena of reference and cross-reference, annotation, quotation, and multiple co-ordinated views (themselves perhaps a special case of cross-reference). These have thus been with us for millennia, since nearly the advent of literacy itself. The advent of printing, then electronic documents, then networking, and finally near-universal access, have each been important turning points. In some ways, modern document systems merely take operations that have always been central, and make them faster and more widely available; and yet the change at each of these turning points has been so great as to make huge practical differences.[2]

In the course of these changes, it is easy to focus on valuable things that are or seem new, to the extent of overlooking valuable things that have been lost. Not all old things are worth keeping; not all new things are worth adding. But a close look at today's hypertextualized world, especially in the long context of literate systems that have gone before, reveals many features and capabilities that have gone missing, despite being no less valuable than they were before the advent of the Web.

This paper, therefore, begins with a very brief overview of some highlights of the origins of hypertext, first in pre-computer history, and then in the early days of computing with text and other media. This overview calls out a variety of capabilities that have not survived on the Web, or more precisely, survive in specialized applications and environments, rather than as ubiquitous functionality available to all readers and writers.

Working around the lack of such literate features is always possible — we have computers, after all. But in today's environment it requires having or finding a great deal of expertise. Even more daunting for the vast majority of users is figuring out which technologies to learn, and being repeatedly surprised by the sudden need for a whole new technology because the last one only almost gets you to your goal. HTML, CSS, Javascript, event models, AJAX, DOM, WebDAV, Apache, PHP, XInclude, XSLT, JSON, HTTP, Unicode, SVG, MathML, REST, the APIs of social media companies; the list never really ends. When beginning the journey from "reader" to "author", a few technologies may promise to suffice; but in the end you never know how many you will need, often because of edge cases that you would expect to be covered, but are not.

We can do better for users, and we do keep moving forward. But those who ignore the history of literacy are, as the saying goes, doomed to reimplement it, to do so far later than need be, and to implement it poorly at first.

Preamble on the peculiarities of hypertext history

Hypertext, being all about connecting pieces of information to each other, has been around since the advent of literacy (and even before, since a great deal of reference occurs in speech and even gesture). Even before many computers could handle lower-case letters (much less non-Latin alphabets), people began using them to create, analyze, read, search, and even link documents. Yet as Ka-Ping Yee points out, The World-Wide Web implements some of the ideas of hypertext, but the original hypertext vision has yet to be completely realized (Yee02). Computer-based hypermedia systems have been with us since the 60s, and imagined even earlier; the underlying notions go back much further, for many were done, if slowly, long before automation:

  • For somewhere he has spoken... (Hebrews 4:4; the somewhere is Genesis 2:2)

  • Pliny, Tacitus, and others often referred to other parts of the same work.

  • Now the rest of the acts of Ahaziah which he did, are they not written in the book of the chronicles of the kings of Israel? (2 Kings 1:18)

  • The Talmud is an enormous collection of rabbinic teachings and opinions, and is probably the most heavily linked and interconnected hypertext ever created (for example, see the English-language Talmud resources at http://www.halakhah.com).

  • Concordances (inverted indexes by content words) were produced for works that engendered the motivation; these served much the same role as search engines. James Strong produced an exhaustive concordance to the King James Bible translation in 1890 (Str90), and Roberto Busa (Bus80) launched the field of humanities computing in 1946 with his Index Thomisticus, a concordance to the works of Thomas Aquinas (available at http://itreebank.marginalia.it).

  • Euclid's geometry included fold-out 3-D models of polyhedra (see DeR94, p. 19), which it is fair to consider a case of multimedia publishing.

  • National Geographic occasionally used to include vinyl records, tucked between magazine pages.

The precision of references to documents and places in them has gradually improved through history. In oral tradition reference was necessarily difficult. In scrolls, it was common to include a summary when referring back to prior materials, perhaps because of the very difficulty of access. The change to codices made page numbers an obvious if imprecise enhancement, and may in itself have contributed to heavier and more accurate use of cross-references. Pliny's Natural History (~78 CE) included not only many cross-references, but also a Table of Contents (Sta81), though at the end of the Preface Pliny credits Quintus Valerius Soranus with having created one earlier.

Even before Gutenberg some parallel texts were produced, facilitating comparative studies. However, with the printing press cross-reference became more useful, since many copies of a work shared the exact same pagination (not to mention content). According to Boardley (Boa14), the book Sermo in festo praesentationis beatissimae Mariae virginis printed by Arnold Ther Hoernen (Cologne, 1470) is the first extant book to include printed foliation (leaf-numbering).

Polyglot Bibles provided aligned text in multiple languages, and sometimes multiple dense streams of footnotes, cross-references, and other annotations (not to mention illuminations and other artwork). The near-ubiquitous Biblical chapter and verse numbers originated with Stephen Langton (1150?-1228) and Robert Stephanus (1503-1559), respectively.

Romanello and Pasin (2011) describe a similar process for Classical literature:

Historically, canonical references are the result of an effort — whose origins can be traced back to the Renaissance (Martin 2003; Berra 2011) — made by the scholarly community as a whole to provide a precise, stable and shared way to refer to Classical texts. Since the early stages of Humanities Computing and Digital Humanities (Bolter 1993; Crane 1987; McCarty 2005), canonical references were regarded as the ideal candidate on which to experiment the potentialities of hypertext: indeed they can be seen as hyperlinks in potentia pointing [to] a text from within another. More recently (Crane et al. 2009) they were considered as a discipline-specific kind of named entities that Classics scholars should be provided with tools to search for within their texts.

How was hypertext possible (before the Web)?

Of course, hypertext as we usually think of it uses computers to take the reader from one location to another without extensive indirection or search. Navigating is a matter of seconds (or even less), rather than of minutes to months (for example, if the reader had to travel to Alexandria to find a copy of the referenced work). But this isn't the whole story, because it is a one-way street. On paper it is hard to support much more, although hypertext pioneer Andries van Dam famously commented that when in college he always bought the dirtiest used copies of textbooks he could find: dirtier meant more highlights, marginalia, and other input from prior readers (van87, Mar98). This perspective of readers contributing to an ongoing dialog with authors, facilitated by networking, influenced many hypertext researchers.

Computer-based hypertext systems before the Web sought synergy between established literate practices and the novel dynamic capabilities of computers with ever-growing storage and speed. The first such systems were developed in the mid-60s: NLS/Augment and HES/FRESS. All these systems had extensive features automating the kinds of connective practices known for millennia on paper, as well as making use of the speed and communication capabilities of computing.

Hypertext involves many things, but perhaps the single most fundamental concept is that it blurs the distinction between writer and reader. A reader writes in the margin, becoming an author; perhaps a sub-Creator in much the sense Tolkien described (Tol47). This much has always been true (though it has become more awkward on ebooks and on the Web). In contrast, entering into a full-fledged dialog with the author was very difficult on paper, but should be easy now that readers can respond quickly and visibly. A reader creates a new link, or crafts a guided tour, or writes an overview which transcludes (transparently includes or embeds) quotations from a difficult source (making it more accessible). This should initiate a cycle of creativity; but it does so far less frequently on the Web than it could.

In such acts the reader becomes something more — but what, exactly? Possibly a reader who merely responds is different; but that seems to me a matter of quantity, not nature. Certainly anyone who writes is an author. All authors are, in the end, responding to others who went before. Whether a reader responds by scribbling privately in the margins, recommending edits to a co-worker, or writing a best-selling full-length critique that touches off a new field of study, they are an author. But essentially zero-cost networked publication brings a new dynamic: the ability to sustain a cyclical, multi-party, rapid, interactive, visible conversation — as Kant (though far more slowly) published Critique of Pure Reason (Kan81) and others responded, in turn prompting Kant to re-respond with the Prolegomena (Kan83). In the world I am trying to sketch, this is the norm. The size, value, and influence of each contribution need not be constrained by its author's predefined role.

Current technology is better characterized as permitting such interactions, rather than actively enabling and encouraging them. And yet hypertext pioneers were intensely focused on readers as active rather than passive agents, and on collaborative work in which the roles of author and reader blur, even when there is a particular work by a particular author being discussed. An adequate history even of early hypertext is far beyond the scope of this article; but a few highlights may illustrate this fundamental focus:

  • In 1945 Vannevar Bush envisioned (but did not build) electronic libraries with high-speed search; the ability to make bookmarks, annotations, trails; data entry via a forehead-mounted camera; and information sharing (Bus45).

  • Doug Engelbart built NLS and Augment, which were used for collaborative document work, outlining, shared links, and more; not to mention inventing the mouse and the window (Eng73).

  • Ted Nelson and Andries van Dam designed HES, with implementation by van Dam's team of undergraduates (led by Steve Carmody), while van Dam and his team designed and implemented FRESS. Both systems supported large documents with styling, reflowing, linking, and more; FRESS was the first system to implement undo. They were used for book production, collaborative writing, and for primary texts and student discussions in classes (Yan85, DeR99, van87, Bar13).

  • Ted Nelson's Xanadu (Nel81, Nel99) aimed for a world where works are primarily electronic, with persistent identifiers that can be precisely referenced without breakage — even in the face of edits. This single feature greatly mitigates several of the most vexing problems in hypertext functionality: link breakage, universal identifiers, transclusion, micropayments, and even some security issues. Xanadu (through a data structure called the enfilade (Uda99)) had the extraordinary property that the address of a given piece of text remained unchanged even when the text was edited. This requires overhead that seemed infeasible at the time, and the enfilade never caught on. But in principle it is feasible; strong document-versioning systems are described in Dur08, Ven12, Nic95, and others, as well as a wide variety of more recent and/or less document-oriented systems.

The Engelbart, van Dam, and Nelson systems bring us only to the late 1960s. The following 20+ years brought tremendous activity and creativity. Systems structured around collaborative reasoning and decision making such as gIBIS (Con87b, Con01) appeared; distributed hypermedia systems such as Intermedia (Yan85); and much more. Very few of these systems were delivery-only; most envisioned very active, productive, collaborative readers. Hal87, Hal01, Wei87, and Con87a provide extensive review and analysis, still valuable today. Bar13 provides a highly-regarded historical survey of the hypertext field.

These issues remain critical today, and have had much attention in some quarters. Yet online publication rarely takes much account of versioning (Wikipedia being one of the exceptions). HTML's <del> and <ins> elements remain essentially unchanged, lacking even conventions for version-identification, much less integration with version control systems, end-user view control, and linking. Trivial and obvious improvements suggest themselves at every level: in browsers, a view switch that responds to those elements; at the CSS level, convenient means of supporting the equivalent of red-lining; at the HTML level, a means for links to refer to named change-sets and versions (and, at a minimum, to warn if the target has changed, which can be done merely by allowing checksums to be kept with links in a standard fashion). At the standards level, WebDAV (Dus07) already provides some versioning mechanisms (Cle02, sometimes known as DeltaV), but support remains limited (SVN hooks are available for some features).
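To make the first of these concrete, here is a minimal reader-side sketch in Javascript. Nothing is assumed beyond the <del> and <ins> elements themselves; the function name and the mode names are invented for illustration.

  // Sketch: a reader-side "view switch" for pages that use <del> and <ins>.
  // Modes: "redline" (show both, marked), "new" (hide deletions), "old" (hide insertions).
  function setRevisionView(mode) {
    const style = document.getElementById("revision-view-style") ||
      document.head.appendChild(Object.assign(
        document.createElement("style"), { id: "revision-view-style" }));
    const rules = {
      redline: "del { color: red; text-decoration: line-through; }\n" +
               "ins { color: green; text-decoration: underline; }",
      new:     "del { display: none; } ins { text-decoration: none; }",
      old:     "ins { display: none; } del { text-decoration: none; }"
    };
    style.textContent = rules[mode] || rules.redline;
  }

  // A browser menu or bookmarklet could expose this directly:
  setRevisionView("new");      // read the current text
  setRevisionView("redline");  // see what changed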

The impact of word processing

As personal computers and word-processing spread, Ted Nelson noted that To the best of my knowledge most of the present generation of word-processing programs derived from the Brown System [HES] — but lost the hypertext part (Nel87b, p. 28). However, Chuk Moran, in his 2013 UCSD dissertation, disputes this (as noted by Rosemary Simpson in personal communication). A number of hypertext experts later worked on important WP systems, but somehow virtually no WP systems have made non-trivial use of links. Because of the ubiquity of word-processing, many users already hold notions and models from word processors when they begin learning how to author hypertext; thus when WPs lack key features, or make them needlessly difficult, those features become much harder to sell downstream.

Some word processors have, indeed, provided section cross-references (but very few can do them easily across files); many provide tables of contents (but even now they are commonly dead). Once saved to PDF, ToCs frequently become non-functional, and even the go to page N feature found in PDF viewers is badly broken, because it counts only physical page images, not logical page numbers. Properly speaking, this page-counting is more the fault of applications that generate PDF than of PDF itself; it could be fixed.

Along with links, word processors lost the idea that documents have some hierarchy. Chapters, sections, and the like were eliminated as structures, in favor of having only headings. Certain aspects of SGML (Int86), such as the Annex D sample schema (of which HTML is reminiscent) and certain minimization rules, may have done too little to push back against this problem. Lists in word processors are rarely first-class objects; only list items are. This facilitates the fiction that a document is little more than a sequence of paragraphs and character-runs within them; everything larger was decoration (such as page-breaks) or a special case (footers, tables, etc.). HTML5 (Ber14) is considerably better in this respect. The word-processing model fails to account for many real phenomena, and therefore leads to anomalies such as the varying but ever-present problems of editing nested auto-numbered lists, odd behavior at the boundary between paragraphs of different styles, quirks of editing in outline views, and so on. Natural units larger than paragraphs are not first-class objects in the software's model, so WPs can only support them with (generally hidden) heuristic rules, leading to opaque and unintuitive behavior.

It is not difficult to find correspondences between the weak models applied in word processors and limitations in the current Web. While appearances are very important, a naive dedication to appearance at the cost of ontology is self-defeating; most modern authors have experienced this when, having spent many hours refining the appearance of their writings, they are blind-sided by new costs:

  • Commonly, the appearances must suddenly change due to publisher dictates, delivery medium limitations, or the need to represent distinctions previously missed (say, bolding option names while italicizing option values in documentation, rather than treating both the same).

  • In more severe situations, achieving a desired appearance with inadequate tools to map structure to appearance leads authors to conscript irrelevant structures to achieve their goals. That is not the authors' fault when it is difficult to achieve the desired appearance perspicuously. Rather, authors have fallen victim to one of the classic blunders — the most famous of which is using tables for formatting.

  • With some authoring systems, details of their model

HTML has gradually been stepping back from this precipice, and HTML5 continues that movement. Yet even now we seem almost to have a parody of Kant's contention in the Critique (Kan81, sec. 9): What may be the nature of objects considered as things in themselves and without reference to the receptivity of our sensibility is quite unknown to us. But that is not the case here. Web pages do not purport to be "things in themselves", but the products of cognition. Also our perceptions of documents are not merely of layout and typography — we perceive a great deal from the language used, the organizational conventions, the author's use of our prior knowledge of the topic, and much more. With documents, our goal is not so much to engage with a "true structure" of the text, as to engage with the propositions asserted of the text by prior contributors (including the nominal author(s), but as easily editors, annotators, or respondents). See for example (Spe00).

The further that document models depart from such cognitions (or, more precisely, the further that systems push contributors away from direct expression of their propositions), the less intuitive our systems become. This can be a virtue, just as subverting literary conventions can be; but that takes great skill. Far more commonly, we simply encounter bizarre behaviors: problems selecting exactly a paragraph or section; difficulty of accessing multiple links from a single origin anchor (or such opportunities being absent); the inability to get and use a precise reference to a given selection; the difficulty and even occasional risk of quoting or linking other works; the near-impossibility of engaging with an author at any granularity better than "web page"; seeing our responses (or even our purchased ebooks!) simply disappear from sites beyond our control.

How is it that such useful, proven features as links, live ToCs, and bookmarks were almost entirely lost in the WP domain? Or that chapters, sections, and lists were reduced to epiphenomena? It has often been suggested that an intense fixation on WYSIWYG appearance left out anything that wasn't fully paper-like (such as automated links, hot ToCs, and so on). That is likely a factor, but does not explain the omission of bookmarks, scribbling in the margin, and so on — unless one views what you see exclusively from the nominal author's perspective. Late in the history of WPs, rudimentary change-tracking and bookmarks were added, which at least facilitate primitive collaboration; but these remain mere shadows of functionality from the 60s, or even from the paper era.

Ted Nelson characterized the early Macintosh, with its WYSIWYG approach now characteristic of most commodity word processors, as a paper simulator — millions of acres of virtual pressed wood-pulp. It's the deforestation of the American mind (Inf87).

The Web

The Web began in the context of extensive hypertext research, ubiquitous word-processing, widespread network access, and SGML. It became ubiquitous almost overnight. It nicely leverages the Internet itself, hierarchical filing systems, personal computers capable of good rendering, and a markup system which (while rudimentary at first) was abstract enough to permit data portability. If you didn't have italics, or color, or a certain size screen, or even a screen at all (for example, having voice synthesis instead), you could still use the Web. Putting up a Web server required little more than getting a domain name and copying over some directories and files.

Like much else, hypertext R&D was redirected almost overnight to use the Web as its underlying infrastructure. Before long even ACM SIGHyper was renamed SIGWeb. But for all its strengths, all its success, and very substantial growth in functionality, the Web in general — and HTML in particular — still lack some compelling, useful, democratizing, and powerful features of earlier systems.

However we explain this historically and technically, it seems odd that some very basic features (present in nearly every prior hypertext system) were left out of the Web, and remain missing even today except for specialized, custom, private additions. A few of these existed or were planned in some early Web tools, but quickly disappeared (Cai99). Many others have been repeatedly rebuilt, but failed to become widely available. This could be caused variously by technopolitical conflicts, insufficient perceived return on investment, a sense that content shared on a site not your own may be lost, or, perhaps most likely of all, the ease of motivating (and advertising) building a new ride rather than building, as Renear aptly put it, the tunnels under Disneyland®.

I am not speaking here of the famous breakability of links specified by URIs (on which see below). Nor am I speaking of the arguably superior aesthetic qualities of paper (resolution, color, texture, reflectance, etc.). Rather, I am speaking mainly of the failure to share power for readers as opposed to only (nominal) writers. For millennia readers have had highly flexible, trivially simple tools and capabilities for responding to what they read, although sharing their responses was far more difficult before computers and networking. Those same capabilities, plus far more practical sharing, were central to the visions of all the earliest hypertext pioneers and were reflected throughout their systems. Yet open a page on the Web, and nearly all such capability is gone, or available only in a few special places.

Although improvements have been made (and still can be), link breakage has never been fully solved, and may indeed be insoluble.

Readers on the Web have the freedom to choose among whatever links the author thought to provide, or to travel further afield via a search engine; but they have little capability to respond, much less to share:

  • It is difficult for readers even to accomplish the most basic response of the paper world: scribbling in the margin. They can, of course, put their notes in a separate file — but that is not at all the same thing, for the note will not be there when the reader returns to the relevant Web page, and the Web page may have changed in the meantime, possibly making the note obsolete or even nonsensical; readers cannot even verify whether the Web page has changed or not.

  • Many individual websites do address the demand for reader response, at least in simplistic fashion: Amazon reviews, comment threads on news sites and blogs, Facebook comment threads, etc. However, these are highly constrained (in length, formatting and linking capability, and accessibility), constrained in site-specific ways, isolated from the real content (perhaps exiled to a small scrolling box, with no means of attachment to any place more precise than an entire article), and supported only at the pleasure of the site owner. Readers may even be asked to give up legal rights to their writings in order to post. With popular sites and topics, comments rapidly disappear no matter how cogent (some sites such as stackoverflow.com, for which high-quality answers are a core value, address the last-mentioned problem by moving up-voted postings to the top, and in other ways).

  • With sufficient effort and skill (far, far greater than that of the average reader of paper or screens), readers can create their own sites, just as print readers could publish a rebuttal or paean to a printed work.

The relationship of author and reader is often as asymmetric on the Web as in the paper world. It is not my goal to discuss the causes of this, but possibilities to consider might include companies feeling a need to protect their image (as perhaps in a lawsuit over a negative review posted on Amazon — Vas14); pessimism over the value (economic or other) of reader contributions; fear of enabling "trolls"; an online culture now used to such limitations; and limited technical sophistication or tools.

Perhaps the saddest demonstration of reader disenfranchisement is that there have been numerous lawsuits for linking to things. Not copying; simply linking. This goes far beyond failing to give readers what one gives writers; it takes away one of the most basic freedoms one has on paper. Suing someone for linking to your site is in my opinion exactly like suing someone for citing your article in a footnote or bibliography — that is, completely nonsensical. Fortunately a variety of courts agree (summaries of several cases are in (Bod04) and Wikipedia topic Copyright aspects of hyperlinking and framing).

Wikis are a notable and very important exception, where readers do have great freedom to contribute. On the other hand, Wikis might be said to overshoot the concept of dynamic dialog between authors and readers, because they not only empower readers but almost lose the notion of authorship at the same time. While many users can make changes, there is little or no provision for distinguishing the text from the response (and of course, the response to the response, and so on). Wikis that retain the change history could support such dialog even in the face of change to the underlying text; but I know of none that integrates a notion of discussing the text with the editing or structure of the text. For example, Wikipedia provides an immensely useful discussion page for each main-entry page, but the discussions there have little provision for temporal, topical, or rhetorical structure. Points made in the discussion cannot easily be connected with the portion and version of the text they refer to, and a Wiki provides little help for reading the discussion in context of the text portions being discussed.

Users are resourceful, and find ways to work around limitations: They quote (usually without linkage, for precise anchors are unaccountably difficult to achieve), even though their quotes become obsolete. They may pull up a page's version history and seek out changes that were made around the time of a given discussion point (surely a task crying out for automation!). Some sites archive historical versions of Web pages, which also helps. But there is clearly no technical reason contributors can't be enabled to comment on specific places in an article, or to see the discussion right alongside the text it involves, with integrated viewing of suggested or executed changes. Such interfaces have existed for centuries in the humanities, though the cycle of republication and distribution of annotations via paper was very slow.

A fundamental issue

Electronic publication raises one special problem that is far less apparent in the paper world. Information has no stable form. An inscription, scroll, or codex is difficult to change without it being obvious. An electronic document can be changed or removed at any moment, with essentially no trace. We have become used to links breaking, but also to text changing out from under us. This leads to many problems, and is very hard to solve definitively. Some of the responses include:

  • Saving a local copy of things you may want later (but this may not be "connected" to the original; Mac OS X at least stashes the source URI of downloaded files as file-system metadata, which helps a little).

  • Taking a screenshot. With AJAX the data you "save" may not resemble what you see on-screen, nor include enough information to re-fetch what you saw. One amusing case is mis-placement of ads; since ads are usually served dynamically, going to the same URI later, or re-opening a saved copy of the page, won't reproduce the mistake.

  • Archive sites such as The Wayback Machine (http://archive.org/web/) try to keep old versions of Web pages.

  • Trying to keep links to documents on sites that seem trustworthy, such as long-established publishers, archives, or institutions, or sites that offer "permalinks"; rather than linking to copies on transient sites.

  • A very few sites, thankfully including Wikipedia, maintain and provide extensive version-history information.

  • Some authors attempt to keep section numbering, section titles, or IDs as stable as practical across revisions. Regulatory and legal information sometimes involves elaborate systems to enable durable reference.

  • Many software distribution sites provide a checksum for each download, enabling users to check that what they got is actually what they asked for. This is not perfect security, but does guard against many problems.

  • Simplest of all, many sites make a practice of including a "last modified" date near the top of every page. If bookmarks and links also included such a date, at least in good-faith situations the returning reader could be warned that the page has changed (whether or not the change matters to the reader is, of course, another matter).

The problem of transient information does not only cause "404: Page not found" errors. Much more pernicious is getting a page which seems the same, but includes changes that make your link or annotation nonsensical. We cannot keep pages from going away or being rewritten; nor would we want to. However, it is entirely practical to make sure people can tell when that has happened. Stashing a checksum and date with a bookmark, or with a URI copied from the browser, requires neither difficult technology, nor participation by ISPs, websites, or anyone else; it can simply be done in browsers and other user agents. For sites that support versioning, of course, stashing the version identifier of the linked page is also helpful.

This protects against changes that make your link irrelevant even if not strictly broken. It can also be used to protect against certain kinds of spoofing, such as substitution of malicious data in place of transcluded content.
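A minimal sketch of such a bookmark record follows, assuming only the standard fetch() and Web Crypto APIs available in current browsers; the record format itself is invented for illustration.

  // Sketch: a bookmark record carrying a date and a content checksum, so a
  // user agent can warn when the target has changed. The record shape is hypothetical.
  async function sha256Hex(text) {
    const bytes = new TextEncoder().encode(text);
    const digest = await crypto.subtle.digest("SHA-256", bytes);
    return Array.from(new Uint8Array(digest))
      .map(b => b.toString(16).padStart(2, "0")).join("");
  }

  async function makeBookmark(uri) {
    const body = await (await fetch(uri)).text();
    return { uri, saved: new Date().toISOString(), sha256: await sha256Hex(body) };
  }

  async function checkBookmark(bookmark) {
    const body = await (await fetch(bookmark.uri)).text();
    const changed = (await sha256Hex(body)) !== bookmark.sha256;
    return changed
      ? `Warning: ${bookmark.uri} has changed since ${bookmark.saved}.`
      : "Target unchanged.";
  }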

Not all links should point to the original version of their target. A link to a stock-price could be intended to refer to the price as it was when the link was made, or to the price as it is whenever the link is later followed. This is a legitimate distinction of intent, so forcing either choice will be wrong sometimes. At present, however, our technology forces the latter.

How does electronic publishing fit in?

Publishing in the electronic medium, for all its advantages, loses some benefits of prior media: Computer files do not last nearly as long as paper (much less granite), at least without active maintenance. CDs and flash drives don't smell nice or have engraved leather covers that may themselves be works of art. Ebook readers are not so cheap that dropping one in the bathtub is stress-free. Undesired Kindle ebooks do not make good kindling. And no ebook platform yet makes it as easy to annotate, dog-ear, or draw circles and arrows and a paragraph on each page, as it is with a $0.99 paperback and a $0.25 pencil.

The obvious response to such limitations is to make the most of advantages electronically-published works have in other areas. And indeed this does happen in a few ways: ebook readers (especially networked ones) provide ready access to far more books than one can carry on paper. When provided, cross-references are far quicker. And accessibility for visually impaired readers is often far better. Occasionally an electronic publication may contain video or audio.

Ebooks have implemented some significant improvements: changeable font size, immediate downloading, and the occasional audio, video, and links are very helpful. Nevertheless, electronic publishing has made fairly little use of the potential advantages of new media. I suspect this is due to a few simple but pervasive factors:

  • Many publishers still view a print edition as the primary thing. They spend a great deal of time and money carefully crafting details of appearance, with tools optimized for manual attention to every aesthetic detail, by skilled craftspeople (the Typographers Guild?). Features specific to electronic books have no clear place in such a process. Low customer expectations for the aesthetics and functionality of ebooks likely help support this dysfunction.

  • Most authors use popular word processors for writing, and even today most such tools actively discourage practices that would help. Thus, authors have less motivation to create useful structure in their works; even if they do so, the conversion to many publishing systems discards nearly everything but the raw text content and basic formatting (say, italics, bold, headings, tables, and footnotes).

  • At some point, the electronic version forks from that destined for paper. Many, many publishers base the electronic version on page-production files that may have already lost or obviated structural information the author put in (in other cases, the author provided little such value to begin with); however, they tend not to base the electronic publishing version on the final version of those files. Countless small changes are made afterward, and publishers find themselves in the untenable situation where the paper and electronic versions simply do not match (in content — readers expect differences in layout and rendering). Indeed, this problem was discussed in The Economist (Fle14) recently (though without discussing alternatives). Ill-suited as they may be for non-line-based documents, even basic source-code version control systems would make this process somewhat easier, as would wider support for the V of WebDAV (Cle02).

These days, word-processing and hypertext have partly converged: Most word processors can import and export XML or HTML (though it may be unconscionably messy), and many even let you click on inline URIs. Very few capabilities clearly belong in only WPs or only hypertext systems; yet word processors typically support only the most trivial hypertext features, and even the main Web technologies (short of custom programming) remain very limited with respect to hypertext authoring. The first Web browser was also an editor (Berners-Lee, undated), and Netscape Composer seemed a promising integration of reading and writing. It is not clear just why editing and browsing have come to require separate software, but this split surely contributes to the separation of readers from authors; fluent browser users can't just learn a few new things to become full-fledged authors; they must choose and learn an entirely new application and interface, and may be tripped up by the fact that some authoring systems create HTML that is difficult to deal with outside the originating program.

As for hypertext per se, present users may be so conditioned to what a link is in a Web browser (or even a word processor) that it has become hard to imagine that anything could be different. But much could be different; much is still missing.

A caveat

Before I discuss specific missing features, let me anticipate a response. To almost any lack, a Web developer could respond but you can build that on top! That would be right: There are Turing-complete programming languages, most obviously Javascript, so you can build anything. But of course that tells us nothing — it was just as true before the Web existed (that's how the Web was built, of course). The difficulty is that with the Web being the ubiquitous environment, anything that it doesn't provide (via at least several browsers) has little chance of broad general use. A feature that only works on certain sites, or with a certain set of plug-ins, or if you follow special conventions, is a 2nd-class citizen in the world of information.

Because you can't count on a significant share of users having a given plugin, access to a special proxy or annotation server, etc., it is very hard for such features to be widely used. Consider the erstwhile impoverishment of Math on the Web: until nearly every browser got MathML support, it wasn't feasible to use math except via images (<object>'s fallback mechanism helped, but wasn't around very much earlier).

On the Bounds of the current Web

In this section I discuss several basic capabilities that are largely absent from the current Web, though they are fundamental parts of hypertext and computer-supported cooperative work (CSCW) in general. Yee provides an informative discussion of several of the same issues (Yee02), and Bieber et al. (Bie97, Section 3) provide an invaluable, detailed analysis of many such missing features, categorized under these headings (p. 36):

  • Typed nodes and links

  • Link attributes and structure-based query

  • Transclusions, warm links and hot links

  • Annotation and public vs. private links

  • Computed personalized links

  • External link databases and link update mechanisms

  • Global and local overviews

  • Trails and guided tours

  • Backtracking and history-based navigation

  • Other features

The reader may find it useful to compare their analysis to that of this paper. I enumerated the categories below before encountering Bie97, yet the concepts called out are quite similar in both.

On the current Web, users do have a few degrees of freedom they lack with paper publication: they can resize the window (and most sites will re-wrap appropriately); they can adjust the font-size. In Safari at least, they can switch to an alternate layout that may be better for reading. If they are sophisticated enough to find, install, and use add-ons, they can do more, but rarely cross-browser and cross-site. And that's about it, unless they are quite fluent in Javascript, DOM, CSS, HTML, and several other technologies. Should a reader master all those, few of the skills transfer to common authoring environments (for example, customizing OpenOffice requires different languages and radically different APIs from handling DOM via Javascript).

What could readers do in a better world, that they basically cannot now?

Collaboration

As noted earlier, a key goal of hypertext has long been collaboration: readers are also authors, and they naturally should have the privileges of authors: to share their creations with others; to be able to use prior authors' work in all the usual ways (some of which require permission or compensation, and some of which don't); and so on.

Nearly all of the following features facilitate collaboration. Some of them (such as trails) can be provided by authors; but that is not the point. We have all seen books for which a good index would have been invaluable, but the author didn't provide one. No doubt they could have, given time, money, skill, and/or inclination; but many have not. As well, it is simply not possible for authors to anticipate all the needs of a diverse readership. It is in the very nature of hypertext, that readers must be able to do authorial things too. We all should benefit from the annotations of others.

Annotation

The current Web essentially lacks annotation. Users cannot make notes, highlights, or links even for their own private use, without finding specialized software. Current bookmarks are not a solution: they only point to the whole document (comparable to having no margins to scribble in on paper, but only the inside front cover). Bookmarks, especially when AJAX is involved, may also exclude state information required to get "back" to where you are. Bookmarking not only fails to do what the user expects, but fails quietly. One can easily make any number of bookmarks while browsing an AJAX-heavy site, only to discover months later that they all lead to the same cover page.

The AJAX "state" problem is hard to solve in the general case. However, browsers could trivially let users attach notes for their own use (without, of course, having to modify or copy the original document). Reader-created links are also easy. Shared annotations are harder, but not that hard. The usual model is to support some kind of subscription to shared annotations databases, so users need not be overwhelmed by every 9th-grader's insights on Orwell's 1984.

Several projects have built Web annotation systems, such as CritLink (Yee02), Annotea (Koi05), SharedCopy (http://sharedcopy.com), and Marky (http://sing.ei.uvigo.es/marky/index.html). However, most have been short-lived, and none have gained a substantial following. To my knowledge, most or all of them have required inconvenient schemes such as centralized annotation servers, using a proxy or proxy-like 3rd-party server for sharing, copying documents that you annotate, or similar. All have shared the problem of being add-ons. How many people would scribble in the margins of paper books if doing so required a special pen, available only at specialty stationery stores?

Precise linking

Far more useful than notes attached just to pages, are notes, bookmarks, and/or highlights attached to precise locations.

The technology for linking to precise locations has been around for a long time (XPointer (DeR02)), and is trivial to implement; I've done it in well under 100 lines of Javascript. During XPointer development there was opposition from a few who claimed that browsers would never be able to implement keeping track of an arbitrary location in a document — usually the impossible case cited involved the user dragging from the middle of one paragraph to the next. Of course browsers have all had a selection notion since the beginning, which does exactly this; most have by now implemented Javascript access to it.
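The following sketch illustrates the point, using only the selection and Range APIs that modern browsers expose through Javascript. The "#quote=" fragment syntax is invented for illustration and is not XPointer itself, and a real implementation would need to handle selections that span elements.

  // Sketch: derive a sharable pointer from the current selection, and
  // re-highlight it on a later visit.
  function pointerFromSelection() {
    const sel = window.getSelection();
    if (sel.isCollapsed) return location.href;
    const quote = sel.toString().trim();
    return location.href.split("#")[0] + "#quote=" + encodeURIComponent(quote);
  }

  function highlightPointer() {
    const match = location.hash.match(/^#quote=(.+)$/);
    if (!match) return;
    const quote = decodeURIComponent(match[1]);
    const walker = document.createTreeWalker(document.body, NodeFilter.SHOW_TEXT);
    for (let node = walker.nextNode(); node; node = walker.nextNode()) {
      const start = node.data.indexOf(quote);
      if (start < 0) continue;              // quotes spanning nodes are ignored in this sketch
      const range = document.createRange();
      range.setStart(node, start);
      range.setEnd(node, start + quote.length);
      const mark = document.createElement("mark");
      range.surroundContents(mark);
      mark.scrollIntoView({ block: "center" });   // show some surrounding context
      return;
    }
  }
  document.addEventListener("DOMContentLoaded", highlightPointer);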

In my opinion, this is the single most blatant and single most damaging omission in current Web technology. Without a widely understood and used way for readers to point specifically where they want to, readers are limited to linking only to points that have IDs — and the author is in total control of that. Not only that, but even finding the IDs that do exist (if any) requires considerable expertise, and Cailliau and Ashman (Cai99) point out that browsers have, inexplicably, ceased even to highlight the target element, making precise linking far less useful even when IDs allow it. Basic UI principles such as not scrolling the target to the very beginning of the window, but rather showing a bit of prior context, seem beyond reach.

The single most fundamental interface, common to nearly every hypertext system ever built (except the Web), is to select a range in a document, and make a link to or from it. But users of the Web cannot do this even now. Even when authors provide IDs, readers cannot use or even see them without unusual sophistication; and even then HTML link targets are at best points or possibly elements, not scopes or selections.

Having a server handle this is possible, and many specific sites have implemented gadgets for special cases of it. But that does not touch the fundamental problem. Besides working only on a few sites, or with certain browsers or add-ons, or on alternate Tuesdays, it still leaves those sites in control — not the reader, who clearly owns any annotations they create (and may wish to keep them private).

Bidirectional linking

Bidirectional linking is considerably harder than annotation or precise linking. If a million people create their own links to a certain page, whoever serves that page cannot be expected or compelled to add a million links back. However, bidirectional linking can be very valuable in more controlled contexts, such as within a workgroup or any other community with shared interests. There should be trivial, standard ways to set up such a group, share links with members, and ensure that such links can be seen from both ends. This is not hard technologically; but it is hard for publishers to like, as it limits their control.
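A sketch of what such a shared link might look like as data; both the record shape and the group-run link store it presupposes are hypothetical, but because the link lives outside both documents, either end can ask the store which links touch it.

  // Sketch: a standoff (external) link record for a workgroup link store.
  const link = {
    id: "grp-2024-0042",
    created: "2024-05-01T12:00:00Z",
    creator: "alice@example.org",
    ends: [
      { uri: "https://example.org/spec.html",   pointer: "#section-3" },
      { uri: "https://example.org/review.html", pointer: "#finding-7" }
    ],
    note: "The review finding contradicts this clause."
  };

  // Because the store is external, "which links touch this page?" is a
  // simple query from either end: the basis of bidirectional display.
  function linksTouching(store, uri) {
    return store.filter(l => l.ends.some(end => end.uri === uri));
  }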

Bidirectional linking makes it obvious that links must be able to reside outside the document(s) being linked. However, that is already necessary for any non-authorial linking — unless you copy the document, which introduces many other problems, although it avoids (without entirely solving) the problem of the document changing. Cailliau and Ashman (Cai99) note that link authoring was made to appear difficult and to require specialized tools, due to shortcomings in the Mosaic interface, but that arbitrary link authoring has been considered a key feature of hypertext systems, ever since their first inception, when Bush described the creation of associative links to record one's path of reasoning through a library of information.... They also point out that

Third-party linking epitomizes the contest between free speech and reputation, or perhaps between graffiti and propaganda. However, there is clearly much benefit in the technology. The Web has no quality control office, so anyone can (and frequently will) put up materials of questionable integrity and authenticity. The very vastness of the Web now means that it may be impossible to find evidence to support or deny questionable claims, so properly-used, a third-party linking technology can effect a highly democratic form of quality control for the Web.

Transclusion

Transclusion is simply transparent inclusion: Embedding part or all of one document into another. In the early Web, I was frankly shocked when I discovered that <img> (which did embedding just fine for pictures) simply did nothing if I pointed it at an HTML file instead of an image. Far more shocking is that 20+ years later there is still no easy way to do this. HTML 5's seamless iframe makes a start, and specifies a number of important semantic details. On the other hand, without precise linking seamless iframes are much less useful, and the interaction even with trivial fragment identifiers seems unclear.

When transcluding a document, there are three levels of "scope" that might be specified: The destination anchor itself, to which the user's attention should be directed, for example by highlighting; the amount of surrounding scope that should be available (the size, zoom, and scrollability of the display area can serve as a proxy for this); and the total amount of the target document that should be accessible (for example via scrolling). More sophisticated details such as whether/how to show author and version information about the target, give direct access to other versions, and so on, may be more than need be standardized.
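A sketch of client-side transclusion along these lines, assuming the target is fetchable (same-origin or CORS-permitted) and using an invented data-transclude attribute to name the destination:

  // Sketch: replace <div data-transclude="uri#id"> with the identified element
  // of the target document, highlighting it and keeping a little surrounding context.
  async function transcludeAll() {
    for (const slot of document.querySelectorAll("[data-transclude]")) {
      const [uri, id] = slot.dataset.transclude.split("#");
      const html = await (await fetch(uri)).text();
      const doc = new DOMParser().parseFromString(html, "text/html");
      const target = doc.getElementById(id);
      if (!target) continue;
      const context = target.parentElement || target;     // surrounding scope
      const copy = document.importNode(context, true);
      slot.replaceChildren(copy);
      const anchor = copy.querySelector("#" + CSS.escape(id)) || copy;
      anchor.style.background = "#ffffcc";                 // direct attention to the anchor
      slot.insertAdjacentHTML("beforeend",
        `<p><small>Transcluded from <a href="${uri}#${id}">${uri}</a></small></p>`);
    }
  }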

Transclusion, particularly when formatted to appear seamless, makes issues of attribution, legitimacy, and spoofing more obviously important, but likely does not actually introduce any new problems along those lines. The same effects can be achieved (like most anything) with sufficient server- or client-side programming. People doing malicious spoofing are unlikely to be daunted by having to write Javascript rather than insert an iframe.

Dynamic views

Dynamic views are alternate layouts that can be switched at will. Style sheets are the obvious way to achieve this on the Web, but have not nearly achieved their potential, which is far wider than their present use. Probably their first important selling point, starting in the late 80s, was as a way to factor out formatting. Authors would instead identify the kinds of document components, enabling publishing professionals to do layout far faster, and to respond to last-minute changes (Coo87). This let publishing staff change a definition instead of manually changing every instance (in this, it is like many uses of indirection in programming). This the Web understands well, even if many authors (and some authoring software) do not.

Stylesheets became more important when large documents began to be delivered in electronic as well as paper form. Notable early examples included aircraft repair manuals, and online documentation for SGI, Novell, DEC, Sun, and others in the early 90's. In such settings documents came to be multipurposed: for print, for CD and online delivery, and perhaps other media as well. A document with its print layout hard-coded required almost as much work to convert for online delivery as it took to make in the first place. A structured document with a separable style specification need only have a new style sheet installed. CSS now provides much of this benefit, particularly for the browser/print split, even though print formatting remains limited, as do features needed for tablets, dedicated ebook readers, and smart phones.

The gradual removal of purely format-oriented elements from HTML, and the addition of new constructs that make it easier to avoid tag abuse (most notably overloading of tables, lack of math, etc.) have been excellent moves forward. Structural markup as opposed to format-oriented markup provides the "hooks" that make dynamic views far more valuable. Microformats for ubiquitous Web components such as addresses, product information, and many more would be useful additions, but there is plenty of descriptive information already in place that can be leveraged.

As a side note, it seems amazing that HTML and its added functionality have not taken over *nix man pages. There remain many viewers, though the pager less is probably still by far the most common. Few viewers even support rudimentary links within and between man pages (info does a little). Imagine if man pages were marked up better, for example by identifying option names and types (an achievable though not trivial conversion). It would then be simple (among many other things) to auto-generate forms from the documentation. You could not only read about options whose names or magic values you've forgotten, but click on them right there in order to interactively assemble the command-line you need, or sort them by categories, alphabetically, by interdependencies, or other ways as desired. Most of this could be done with existing style technologies once the source data was cleaner (an eminently affordable or even crowd-sourcable task) — though see also the section on Integration of Linking and Style, below.
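For instance, if each option were marked up with (hypothetical) data-option and data-type attributes, a short script could turn the documentation itself into an interactive command-line builder, sketched here:

  // Sketch: auto-generate a command builder from documentation in which each
  // option is marked up, e.g. <span data-option="--lines" data-type="number">--lines</span>.
  // The attribute names are hypothetical.
  function buildCommandForm(commandName) {
    const form = document.createElement("form");
    for (const opt of document.querySelectorAll("[data-option]")) {
      const label = document.createElement("label");
      const input = document.createElement("input");
      input.type = opt.dataset.type === "flag"   ? "checkbox" :
                   opt.dataset.type === "number" ? "number"   : "text";
      input.name = opt.dataset.option;
      label.append(opt.dataset.option + " ", input, document.createElement("br"));
      form.append(label);
    }
    form.addEventListener("input", () => {
      const parts = [commandName];
      for (const input of form.querySelectorAll("input")) {
        if (input.type === "checkbox" ? input.checked : input.value !== "")
          parts.push(input.name + (input.type === "checkbox" ? "" : "=" + input.value));
      }
      console.log(parts.join(" "));   // the assembled command line
    });
    document.body.append(form);
  }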

A third value of stylesheets remains missing. We still tend to think in terms of one style sheet per medium. We know that if you're on paper, you deserve a special style sheet; likewise if you're visually impaired. Some websites provide an alternate view for phones (usually via browser sniffing or a special sub-domain). But the mechanisms for publishing to cell phones remain awkward, and almost entirely outside the reader's control — I cannot rely on any particular accommodation when on my smart phone: it all depends on which sites I go to. I can't even change font size readily as I can on a laptop (though it's even more important on the cell phone) — other than by zooming, which breaks all the lines and makes the page unreadable. Ironically, this is almost the same problem many of us fought with for years, when publishers demanded multi-column layouts and pageless scrolling at the same time.

We must ask ourselves at least two questions: 1: Why should there be only one stylesheet per medium? And 2: Why should it only be practical for authors, not readers, to create and swap stylesheets at will? In some browsers I can install my own stylesheet that overlays everything; nice for a few things like font size. But the flexibility is pathetic unless I combine that with extensive Javascript work. In other browsers the View menu provides for choosing among multiple stylesheets if available, though it's not obvious whether or how end users can add their own. As noted before, the reader is largely a second-class citizen, expected for the most part to passively accept what websites provide. Web designers, in turn, can be under pressure to design for the least-skilled user, leading to a pernicious cycle of low expectations.

Systems as early as FRESS provided easy ways to change styles at will, and not merely by swapping entire stylesheets. In short, any element in a FRESS document could have a list of keywords attached, and users could at any time set a keyword request string, which was a Boolean expression over such keywords. Only objects which satisfied the request string were operative. This was not nearly so easy to author as it would be now, since FRESS did not have stylesheets as separate first-class objects. Nevertheless, once set up it was easy to use, and many users found the capability useful enough to keyword their documents heavily, allowing many of the same effects as swapping stylesheets on the fly. See DeRose and van Dam (DeR99) for further details.
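Nothing like FRESS's internals would be needed to approximate this today. A sketch, assuming elements carry a (hypothetical) data-keywords attribute, and using a deliberately simplified request syntax in place of FRESS's full Boolean expressions:

  // Sketch: FRESS-style filtering over elements tagged with keywords, e.g.
  //   <p data-keywords="student comment draft">...</p>
  // The request is a list of required and forbidden keywords ("comment -draft"),
  // a simplification of FRESS's Boolean request strings.
  function applyKeywordRequest(request) {
    const wanted = [], excluded = [];
    for (const term of request.split(/\s+/).filter(Boolean)) {
      (term.startsWith("-") ? excluded : wanted).push(term.replace(/^-/, ""));
    }
    for (const el of document.querySelectorAll("[data-keywords]")) {
      const keys = el.dataset.keywords.split(/\s+/);
      const show = wanted.every(k => keys.includes(k)) &&
                   !excluded.some(k => keys.includes(k));
      el.style.display = show ? "" : "none";
    }
  }

  applyKeywordRequest("comment -draft");  // show comments, hide drafts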

Roderick Chisholm wrote many of his books in FRESS. He commented in the Preface to his classic treatise Person and Object that The book would not have been completed without the epoch-making File Retrieval and Editing System... (Chi76). In classes taught with FRESS, keywords were used to hide students' comments and annotations from each other (but not from their TAs and professors) until they had completed a given section of work; and then to open up the discussion.

In fact FRESS's filtering supported not only keywords, but also variables with values and even weights. One user even created a faculty salary database (and there was no scripting language in FRESS, so it was entirely declarative). Imagine if it were standard functionality on the Web, to be able to create live tables much like spreadsheets. Yes, it can be done by attaching handlers and Javascript functions to everything; but doing simple tasks that countless spreadsheet users already understand and do regularly requires substantial additional expertise (or commitment to specialty add-ons); it is extremely difficult if you aren't the author of each particular document. It would not be hard to enable spreadsheet-like tables in HTML (for example, by some new CSS property) — but real usefulness would require a bit more, such as simple forms integration (where simple means no programming).
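As a sketch of how little would be required, consider a (hypothetical) data-sum attribute naming the cells a total depends on; this captures the declarative spirit of the FRESS example, not its mechanism.

  // Sketch: minimal "live" tables. A cell such as <td data-sum="q1 q2 q3"> is
  // recomputed whenever any cell whose data-name it lists is edited.
  // Both attribute names are hypothetical; cells are assumed contenteditable.
  function enableLiveTables() {
    function recompute() {
      for (const cell of document.querySelectorAll("td[data-sum]")) {
        const total = cell.dataset.sum.split(/\s+/).reduce((sum, name) => {
          const src = document.querySelector(`td[data-name="${name}"]`);
          return sum + (parseFloat(src ? src.textContent : "") || 0);
        }, 0);
        cell.textContent = total;
      }
    }
    document.addEventListener("input", recompute);
    recompute();
  }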

Merely being able to attach all the style sheets you want to documents and switch between them at will would be a huge step forward, yet trivial to implement. In practice, some would be from publishers, some of your own making, and you might choose sets to apply by domain, something like cookie white-listing. As noted earlier, various browsers have some parts of this functionality, but little to none of it can simply be expected to work as a matter of course.
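The browser side of such switching is nearly free; a sketch follows, in which the list of sheets would come from the reader rather than the page (the URIs shown are placeholders).

  // Sketch: let the reader attach any number of stylesheets and switch among them.
  const readerSheets = {
    "publisher":   null,    // leave the page's own styles alone
    "large-print": "https://example.org/my/large-print.css",
    "night":       "https://example.org/my/night.css"
  };

  function useReaderSheet(name) {
    document.querySelectorAll("link.reader-sheet").forEach(l => l.remove());
    const href = readerSheets[name];
    if (!href) return;      // fall back to the publisher's styles
    const link = document.createElement("link");
    link.rel = "stylesheet";
    link.href = href;
    link.className = "reader-sheet";
    document.head.append(link);
  }

  useReaderSheet("large-print");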

Ted Nelson (Nel81) proposed an even nicer (though more challenging) feature: A StretchText control (perhaps a knob or slider?) to dynamically adjust the level of detail desired in content itself. If we had several of the other capabilities described here, this would be fairly easy to build atop automatic text summarization systems.

Dynamic (rule-based or implicit) linking

DeR89 discussed the division of links into explicit vs. implicit. HTML <a> is the quintessential explicit link: there is an explicit representation of the origin and destination (though both appear only at the origin). But situations often arise when it is valuable to have all occurrences of something linked. Implicit links address this need without having to manually link every occurrence of a word (or phrase, or regular expression match,...). Instead, a pattern is specified, and the system finds all matches and links them.

For example, when reading a Wikipedia page on an unfamiliar technical subject, having links from jargon terms would be very helpful. But whether such links are valuable or merely add clutter depends on the reader, not mainly the writer. Because explicit links cannot adjust for this, and readers vary, Wikipedia articles often have too many links for some readers and too few for others (although Dynamic Views, as discussed in the previous section, can be used to similar effect).

In much the fashion of regular expression captures, a rule can incorporate part of the matched text into the URI generated for the implicit links, allowing you to define large sets of links with a single rule. This is especially useful for linking things such as Named Entities (for whose detection much research and software exists), technical terminology, and so on. Given such functionality, one could also avoid endless repetition when the anchor text of a link is the same as the tail of its URI, a simplification featured in Wiki markup, POD, and other systems.
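
A rough client-side sketch of such rule-based linking follows; the rule list and URI templates are invented for illustration, and $1 stands for the first regex capture, as in the prose above.

    // Rule-based (implicit) linking: each rule pairs a pattern with a URI
    // template in which $1, $2, ... refer to that rule's regex captures.
    const linkRules = [
      { pattern: /\bRFC\s?(\d+)\b/g, uri: 'https://www.rfc-editor.org/rfc/rfc$1' },
      { pattern: /\bissue\s+#(\d+)\b/gi, uri: 'https://example.org/issues/$1' },
    ];

    function applyImplicitLinks(root) {
      const walker = document.createTreeWalker(root, NodeFilter.SHOW_TEXT);
      const textNodes = [];
      while (walker.nextNode()) textNodes.push(walker.currentNode);
      for (const node of textNodes) {
        const parent = node.parentElement;
        if (!parent || parent.closest('a, script, style')) continue;  // no nested links
        // Escape the raw text, then wrap each match in an <a>.
        let html = node.textContent.replace(/[&<>]/g,
          c => ({ '&': '&amp;', '<': '&lt;', '>': '&gt;' }[c]));
        let changed = false;
        for (const { pattern, uri } of linkRules) {
          html = html.replace(pattern, (match, ...groups) => {
            changed = true;
            const href = uri.replace(/\$(\d+)/g, (_, n) => groups[n - 1]);
            return '<a href="' + href + '">' + match + '</a>';
          });
        }
        if (changed) {
          const span = document.createElement('span');
          span.innerHTML = html;
          node.replaceWith(span);
        }
      }
    }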

Implicit links make inevitable a useful but rarely seen capability: being able to reach multiple end-points from a single source anchor. If many implicit links are enabled, this becomes especially important; but even with only a few, and even with purely explicit links, it is often useful to have multiple options. Do I just want a reminder of who Neville Chamberlain was, or a whole article on him, or an explanation of how he relates to whatever article I am reading now? Intermedia (Yan85) made extensive use of this capability; the basic interface was a pop-up menu of the links available from the current point. Similarly, on seeing a reference to a supporting article I may want to see the citation information, or the abstract (perhaps in a small pop-up), or actually go to the full article. This capability is synergistic with precise linking and with transclusion.

It would not be difficult to let links be annotated (manually or even automatically) as, for example, expert vs. beginner (this could even be done with existing attributes such as class). Empowering the user to turn the visibility of links up or down to match their expertise level would let them customize and improve the reading experience. Users could even set up profiles indicating their level of expertise in various domains, and articles could adjust automatically (as always, a manual override should be available).
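
If links carried such class values, reader-side control could be as small as the following sketch; the class names "expert" and "beginner" and the setLinkLevel function are assumptions, not existing conventions.

    // Reader-controlled link density, assuming links have been classed (by
    // hand or by a tool) as "expert" or "beginner".
    const levelSheet = document.createElement('style');
    document.head.appendChild(levelSheet);
    function setLinkLevel(level) {            // 'beginner', 'expert', or 'all'
      if (level === 'all') { levelSheet.textContent = ''; return; }
      const hide = (level === 'beginner') ? 'expert' : 'beginner';
      // Render out-of-level links as plain text rather than removing them.
      levelSheet.textContent =
        'a.' + hide + ' { color: inherit; text-decoration: none; pointer-events: none; }';
    }
    // e.g. setLinkLevel('beginner');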

Trails

One more strong example of the pioneering hypertext systems' focus on enabling readers is FRESS's block trail feature. A block trail was a reified list of document locations. Readers could add the current selection to a block trail. A block trail could be a document in its own right, or be inserted into another document, and of course could be shared with others. It could be displayed as a list of links (something like a site map or navigation bar), or use transclusion to splice the link targets (not the entire documents in which the targets reside!) end-to-end to create a new, virtual document, one in which all the component parts know where they came from. This was an ideal tool for creating guides to other complex documents, overviews, and other navigational aids.

Today, one can easily author a Web page that does a little of this: a list of links, perhaps with explanatory text in between. However, the lack of precise linking makes creating accurate trails infeasible, and visual transclusion remains difficult to achieve. Yet it need not be difficult; Cailliau and Ashman (Cai99) note that the NeXTStep browser already had features in 1990 for creating path documents with essentially this functionality. Simple interface changes such as "Copy selection with reference" and "Paste as new waypoint" would be enough to be very useful; accomplishing the same thing with the dozen mouse actions it takes now is simply too tedious, particularly given the lack of precise linking.
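
To show how little machinery "Copy selection with reference" would need, here is a sketch that records the current selection as a trail waypoint; the storage format is mine, and the locator is only a crude text-fragment reference rather than a real precise link.

    // Record the current selection as a waypoint in a locally stored trail.
    function addWaypoint() {
      const sel = window.getSelection();
      if (!sel || sel.isCollapsed) return;
      const text = sel.toString();
      const trail = JSON.parse(localStorage.getItem('trail') || '[]');
      trail.push({
        uri: location.href.split('#')[0] +
             '#:~:text=' + encodeURIComponent(text.slice(0, 80)),
        excerpt: text,
        added: new Date().toISOString(),
      });
      localStorage.setItem('trail', JSON.stringify(trail));
    }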

Orientation

Hypertext systems gradually developed interfaces for helping keep the reader oriented in the complex space of documents and links. Breadcrumbs on many Websites can help with this, as can site maps. However, graphical overviews, trails such as just discussed, and other navigation tools are uncommon.

Given the huge scope of the Web, an additional feature could be very useful for helping individual readers stay oriented, one not anticipated by early hypertext systems (so far as I know): being able to record preferences about where to go. Many large news sites include content from many staff, affiliated, or syndicated reporters. A reader will likely prefer some over others. It may be difficult to ensure that a reader notices new stories by a favored author (though authors or aggregators can offer a subscription push service, or readers can set up a search agent); but avoiding a reporter you have decided you dislike, or who writes at a level inappropriate for you, is difficult. This could be done entirely client-side by storing appropriate information locally. Since there is, perhaps surprisingly, no standard way in HTML to mark up the author, normative URI/DOI/etc., or other often-crucial metadata of a page, avoiding an author would sometimes be heuristic, but likely fairly accurate nevertheless.
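
A sketch of such client-side filtering follows. Because author markup is not standardized, it relies on heuristics (a meta tag naming the author, or a byline class); the selectors and the blockedAuthors list are assumptions.

    // Hide articles by authors the reader has chosen to avoid.
    const blockedAuthors = JSON.parse(localStorage.getItem('blockedAuthors') || '[]');
    function authorOf(article) {
      const byline = article.querySelector('.byline, [rel="author"]');
      const meta = document.querySelector('meta[name="author"]');
      return (byline && byline.textContent.trim()) || (meta && meta.content) || '';
    }
    document.querySelectorAll('article').forEach(article => {
      if (blockedAuthors.includes(authorOf(article))) {
        article.style.display = 'none';      // or collapse to its heading
      }
    });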

Integration of linking and style

While the Web has a reasonably clear model of what is document content, what is style, and how the two are handled (essentially by HTML and CSS, respectively), there is no comparably clear Web model for linking. The behavior of links is purely magical in browsers: browser code knows the <a> element (as well as <img>, <object>, <link>, etc.); knows that it is overloaded for source and destination (and how to tell which you've got); knows the special expected behaviors. CSS can do little but change a link's appearance with :link, :visited, etc.

At the 2nd WWW Conference (WWW94), one of the CSS principals commented to me that CSS could handle all the formatting requirements for arbitrary SGML. This was clearly wrong, but CSS has made continuous progress toward being able to accomplish the formatting requirements of HTML. By now the expected effects of nearly all HTML elements (not to mention attributes) can be achieved with CSS instead. However, exceptions remain.[3] In certain specific cases, leaving behavioral semantics outside CSS is, in my opinion, unproblematic. For example, there seems no point in trying to fit the functionality of HTML5 <canvas> or SVG into CSS, rather than leaving them as magic. In particular, this seems so because

  1. the semantics are specialized, and only overlap slightly with those of other HTML constructs (for example, in the styling of text within the drawing);

  2. we could create all the CSS properties needed to make such objects declarative, but I do not think those properties would see much use anywhere else;

  3. nor would many of the existing CSS properties apply to drawings, beyond those that do now for positioning them as blocks;

  4. few people are likely to make tag-sets with such semantics for themselves; and

  5. drawings and their semantics are fairly self-contained.

Similar arguments apply to MathML, and perhaps to <object> or <form>.

On the other hand, linking semantics remain largely unaddressed in both HTML and CSS. Unlike the previous examples, linking has no compelling reason to be kept out of CSS. Linking behavior is not nearly so specialized as vector graphics or equations. It would fit naturally in CSS because it can reasonably attach to most or all HTML elements (or to selector-specifiable groups of elements), and many of the existing CSS formatting properties already apply meaningfully to linking elements. Moreover, many, many people already have other tag-sets with other linking elements, which are notoriously difficult to get browsers to handle even when they are isomorphic to HTML (which many are not).

Integrating a set of link-behavior properties into CSS would take the magic out. Thus:

  • Browsers could support the linking behavior of other tag-sets with minimal extra effort, rather than supporting only their formatting.

  • Linking would gain all the same separation of concerns advantages that formatting gets by being factored out of documents into stylesheets.

  • Many asymmetries would go away. For example, right now <img> entails embedding, and fails if it points to a non-image MIME type. On the other hand <a> is agnostic about MIME type and can quite happily refer to images, but it implies non-embedding. <object> is essentially like <img> except that it supports more types and has a fallback mechanism (which, indeed, would be massively useful for <a> and <img> to have!). And HTML5 has several similar elements, which it seems ought to share a factored-out set of linking semantics.

  • There are countless scenarios where particular components should be able to change between being links and being regular displayed elements: <blockquote>, alt attributes, <del> and <ins>, <nav>, <aside>, or even <dt> (especially on smartphones). This is synergistic with the notion of dynamic views discussed earlier.
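
A taste of what factored-out link behavior would buy can be had even now with one generic script, which makes any element carrying a designated attribute traversable or transcludable regardless of its tag. The data-href and data-embed attribute names below stand in for what declarative properties could express; they are not existing conventions.

    // One generic handler provides link behavior for arbitrary tag-sets.
    document.addEventListener('click', async (ev) => {
      if (!(ev.target instanceof Element)) return;
      const el = ev.target.closest('[data-href]');
      if (!el) return;
      ev.preventDefault();
      if (el.dataset.embed === 'true') {
        // Transclude in place (no sanitization here; a real version needs it).
        const resp = await fetch(el.dataset.href);
        el.insertAdjacentHTML('afterend',
          '<blockquote>' + await resp.text() + '</blockquote>');
      } else {
        location.href = el.dataset.href;     // ordinary traversal
      }
    });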

Traversal semantics

The semantics of link traversal usually involve only a few basic properties. Some of these properties can have arbitrarily complex values in principle, but in practice most needs are covered by a small number of cases, as is true of much that CSS already addresses. CSS could provide the key properties, as valuable for linking as those it provides for other aspects of presentation. For example:

  • How the origin anchor appears. CSS has this, although it is at best awkward for non-text ends such as icons, hot regions in graphics, menus for multi-destination links, etc.

  • What interaction event triggers traversal. The set of onXXX event attributes addresses this, but such behavior specifications seem out of place in HTML rather than in CSS.

  • Possibly, some visual effects to go with the transition.

  • What happens to the origin location. Should it still be displayed or not? Should it be rendered according to a new style, or collapsed to a compact form such as its top heading?

  • Where the destination should appear. A few choices, such as those already defined via HTML attributes, should cover the bulk of needs.

  • How to determine the link's destination. Currently, names such as "a", "img", and the like are magic, built into browser code; so are the "href" and "src" attributes, the meaning of "base", and so on. Any kind of document that does not precisely copy all these names and their complex relationships will not work in a browser except via special programming. Even changes as simple as adding another linking element, or moving the URI (or base, or fragment identifier, etc.) to a different attribute, element, or content, exceed the available declarative (non-programming) flexibility. Perhaps surprisingly, a new CSS property for this would need little more than CSS's "content" property already provides: the ability to refer to attributes and concatenate them with constants. Adding a function like attr(), but one that takes an XPath or XPointer expression rather than merely a name, would vastly expand functionality within and beyond HTML, for minimal effort.

  • Parameters for how the destination should be viewed. This is not presently a part of linking on the Web, but has been implemented in some other hypermedia systems, and is especially important for transclusion. For example,

    • Traversal could set the stylesheet to be used at the destination, thus customizing the appearance as appropriate for the context.

    • In a system with keywords that parameterize viewing (which could easily be enabled), it is very desirable to support setting a viewing specification, such as an expression over keywords, to apply at the destination.

XLink provides many of the needed linking semantics, though not all, and those it provides are common across the diverse systems that use XLink (SVG, XBRL, LC METS, etc.). The remainder crop up in many familiar applications, such as presentation software.

One obvious way (but not the only way) to accomplish this integration would be to allow the right-hand side of CSS properties to be specified via a Javascript expression evaluated in the context of the element to which the CSS is being applied. As noted earlier, the content property already provides function-call syntax such as attr(), so the required syntax is not a problem. Full Javascript code need not appear inline; a single function call, such as often appears on HTML event attributes, would suffice, though more would of course be possible. This capability might be supplied for only a handful of properties, but that seems to me as short-sighted as the early practice of giving DOM access only to a few selected items such as links and form elements, or the pre-HTML5 limitations on the use of attributes such as id, style, lang, and the like. DynaText got a great deal of mileage from allowing expressions as the value of all style properties (DeR99).
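
Today this can only be approximated, for instance by smuggling an expression through a CSS custom property and evaluating it from script. In the sketch below, the --display-when property name and the tiny evaluator are mine, not part of CSS; each expression is evaluated with the element bound to el.

    // Approximate "expression-valued style" with a custom property.
    // Note that custom properties inherit, so set --display-when only on the
    // selectors you actually mean.
    function applyConditionalDisplay(root) {
      for (const el of root.querySelectorAll('*')) {
        const expr = getComputedStyle(el).getPropertyValue('--display-when').trim();
        if (!expr) continue;
        let show = true;
        try {
          show = Boolean(new Function('el', 'return (' + expr + ')')(el));
        } catch (e) { /* ill-formed expression: leave the element visible */ }
        el.style.display = show ? '' : 'none';
      }
    }
    // A stylesheet might then say:
    //   .note { --display-when: el.dataset.level === 'expert'; }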

Another application of this kind of integration would be bibliographic references such as those in this paper. A key such as Chi76 appears in content, but is also sufficient to locate the target entry in the Bibliography. If you put the key on an attribute, you can copy it into content with CSS's content:attr() feature (although prefixed with #). But if the key is actually in content, you cannot use it as part of the URI without resorting to programming. In practice, such cases lead to much duplication of data.
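
Absent such integration, a small script is the usual workaround. The sketch below assumes the key lives in element content, inside a hypothetical <cite class="bibref"> element, and generates the link from it rather than duplicating the key on an attribute.

    // Turn in-content bibliographic keys into links to the matching entries.
    document.querySelectorAll('cite.bibref').forEach(cite => {
      const key = cite.textContent.trim();   // e.g. "Chi76"
      const a = document.createElement('a');
      a.href = '#' + key;                    // points at the bibliography entry
      a.textContent = key;
      cite.replaceChildren(a);
    });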

This paper is not the place to propose a full set of semantic properties out of which the various linking behaviors can be built. However, that is hardly a new question. Past hypertext systems have explored many possibilities (as has XLink), and coming up with reasonable declarative semantics, plus clean scripting hooks, is not that technically difficult. There will always be a small remainder of cases where arbitrarily complex programming is required to get link behavior just so for special applications — but that is no argument against integration, because it is equally true for formatting such as currently done by CSS.

Chunking

We have all encountered countless Web pages where someone decided that there was too much information to show on one physical page: forum comments on a controversial subject; a large textbook such as at http://pubmedcentral.gov; the list of people eager to meet you on a dating site. There are often good reasons to break such lists up, but only rarely does a site provide any way to see all the data at once. Bandwidth is rarely the issue, because transmitted page size often includes more script than HTML (Eve13, based on data from http://httparchive.org/interesting.php).

On the present Web, however, you never know what you will find in terms of document granularity. At some sites you can click a thoughtfully provided "view as one page" button; at others you can adjust the number of entries shown per page; at still others you must page through 10 or 100 or 1000 pages manually. And if you want to see the whole thing at once, you cannot simply concatenate the files, either.

This is a significant practical problem: sending all the data at once is unwieldy, but splitting it up into many parts is, as they say at ISO, obfuscatory. However, chunking (and un-chunking) could easily be standardized, letting readers view things how they want. Of course, sites that split things up for some special reason could still do so; it would be quite difficult to prevent that even if one tried.
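
Even without standardization, the reader-side half can be sketched: fetch the parts and splice them together. The sketch below assumes the site paginates with a ?page=N query parameter and keeps each page's content in <main>; both assumptions would need adjusting for any particular site.

    // Reader-side un-chunking: fetch successive pages and concatenate them.
    async function unchunk(baseUri, maxPages = 50) {
      const merged = document.createElement('div');
      for (let page = 1; page <= maxPages; page++) {
        const resp = await fetch(baseUri + '?page=' + page);
        if (!resp.ok) break;
        const doc = new DOMParser().parseFromString(await resp.text(), 'text/html');
        const main = doc.querySelector('main');
        if (!main || !main.textContent.trim()) break;
        merged.append(...main.children);     // splice this chunk onto the rest
      }
      return merged;                         // one element holding everything
    }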

It is perhaps worth emphasizing that many sites implement such chunking via AJAX. This works fine except that if you save the page, current browsers quietly save the page source, which may contain absolutely none of the content you wanted. There is generally no option to save the data as currently shown (except through developer tools). What You See, in this case, bears little relation to What You Get.
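
The missing "save what I am actually looking at" operation is similarly small, as this sketch of serializing the current DOM (rather than re-fetching the original source) suggests; the function name is mine.

    // Save the page as currently shown, after any AJAX changes.
    function saveCurrentView(filename = 'page-as-shown.html') {
      const html = '<!DOCTYPE html>\n' + document.documentElement.outerHTML;
      const blob = new Blob([html], { type: 'text/html' });
      const a = document.createElement('a');
      a.href = URL.createObjectURL(blob);
      a.download = filename;
      a.click();
      URL.revokeObjectURL(a.href);
    }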

In addition to the granularity problem, there exists information that is best delivered in "card"-shaped chunks. This includes many simple database-like applications, and also presentations. The hypertext research community long debated the merits of chunky vs. creamy hypertext systems: the former broke information into small (card-sized) pieces, while the latter kept it in continuous scrolling documents. Both approaches have virtues and limitations. Chunk-style information is tedious to achieve on the Web, mainly because there are few provisions for making "cards" uniform; better integration of XForms capabilities would help, chiefly because of the notion of a fixed template form that is filled out with different sets of data.

On what can be done to make hypertext actual as a System

The reader will likely have noticed that these issues are synergistic, not independent. They combine in a variety of ways to add extensive functionality, particularly for active readers. I thus propose the following as some initial steps toward enhancing HTML, CSS, and the Web in general to enable such functionality, and to come closer to achieving one of the fundamental goals of hypertext systems: the emancipation of the reader.

  • Protect the right to link.

  • Make private, persistent annotation ubiquitous, and at least as good as annotation on ebook platforms. Better: Add sharing.

  • Make ubiquitous a way to point precisely without IDs. XPointer is the obvious choice for representation; select/make-link is the obvious interface.

  • Make it trivial to transclude a specified portion of another document (at least HTML and XML) such that it appears inline, and gives ready access to the full original. Use this to enable reader-created trails. One way to do this is to extend the CSS content property, so it can insert not merely strings, counters, and attribute values, but DOM subtrees, or the result of a specified Javascript expression (as string or DOM).

  • Integrate linking semantics into CSS — for all the same reasons. As a first step, add all the HTML event attributes as CSS properties.

  • Make it trivial to attach a checksum to a link or bookmark, so the browser can report whether the destination has changed when you come back (there are some interesting issues with ephemera such as ads and clocks, although some HTML5 additions such as the distinction of <main> help considerably; see the sketch after this list).

  • Make it much easier to attach style and script to documents in the wild, and to switch stylesheets at will.

  • Introduce a keyword filtering capability, allowing stretchable, filterable displays from a single source, under reader control. This could probably be done merely by allowing expressions as values for the CSS display property.

  • Provide built-in chunking (and un-chunking) capability that would make it easy to build card-oriented sites when appropriate.
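
A minimal sketch of the checksum idea mentioned in the list above: hash a page's main content when bookmarking it, and compare on return. Hashing only <main> (when present) is a rough way of ignoring ads, clocks, and similar ephemera; the localStorage scheme is mine.

    // Record and compare content hashes for bookmarked pages.
    async function contentHash(uri) {
      const doc = new DOMParser().parseFromString(
        await (await fetch(uri)).text(), 'text/html');
      const text = (doc.querySelector('main') || doc.body).textContent;
      const digest = await crypto.subtle.digest('SHA-256',
        new TextEncoder().encode(text));
      return Array.from(new Uint8Array(digest))
        .map(b => b.toString(16).padStart(2, '0')).join('');
    }
    async function hasChanged(uri) {
      const key = 'hash:' + uri;
      const previous = localStorage.getItem(key);
      const current = await contentHash(uri);
      localStorage.setItem(key, current);
      return previous !== null && previous !== current;
    }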

Most of these capabilities have been implemented and proven in multiple previous systems. Many have also been tried on a small scale on the Web (via private scripts, add-ons, special proxy servers, etc.); but that isn't good enough for basic functionality like this, on which readers so fundamentally depend.

While the Web has made huge strides in many areas, the humble <a> element has barely changed in 20+ years. URLs have barely changed either (beyond the renaming to URI and the addition of IRIs). CSS, despite great strides of its own, still cannot reconstruct HTML's default behaviors (especially for links!). And massively useful functionality such as expandable sections, keyword filtering, alternative views, synchronized scrolling, table sorting, and many kinds of horizontal layout can be had only with special Javascript, placing it beyond the average reader (one obvious example being the many un-sortable data tables on the Web).

Some of this, I believe, arises from an artificial separation between layout and interaction. Interaction must (with a few edge-case exceptions) affect what is on the screen: a fully integrated synergy of content and layout. Yet interaction is presently segregated: to affect it (other than the seeming anomaly of coloring links) one must learn and use not only CSS but also Javascript. Interaction requires programming skills and tradeoffs: attaching onXXX attributes, which are at least as anomalous in HTML as the long-deprecated hard-coding of style properties; or attaching handlers via Javascript (which then involves knowledge of event triggering, timing relative to loading, event bubbling and canceling, and at least a little DOM). This puts even the most basic, style-like operations in the domain of programmers rather than of non-programmer authors or readers: collapsing sections to a summary such as their title, selecting a column to copy, saving the current view (rather than the original source, which with Ajax may be irrelevant), and so on. There are many simple enhancements that could improve this situation, many exactly parallel to historical improvements in the handling of style, with similar benefits such as the greater precision of CSS selectors.

In my opinion, we must rid ourselves of the notion that the reader is of a different class than the author: that the Web is largely a delivery system, with complex rites of passage required to join the Guild of Authors. At present readers can fill out a form, and sometimes add a forum comment; on Wikis they can do more. But they have no better navigation tools than search engines and unwieldy bookmark lists. They cannot get, on the Web, the grubby textbooks that van Dam and countless others benefit from. They cannot share their insights with others in the context where they found them. They cannot even make private comments on a page that they themselves will see when they come back. If they want upward mobility (to become an author), they have many hoops to jump through, or they must live in a walled city such as a blog site.

At present, the Wikipedia article First-class citizen begins "For the usage in society, see Second-class citizen." The latter in turn says: "For the usage in computer science, see First-class citizen." Can we (first-class?) denizens of Computer Science not enfranchise the society of readers? I say, make readers first-class citizens on the Web.

References

[Aks88] Akscyn, Robert M., Donald L. McCracken, Elise A. Yoder. 1988. KMS: a distributed hypermedia system for managing knowledge in organizations. Communications of the ACM 31(7): 820-835, July 1988. doi:https://doi.org/10.1145/48511.48513. http://dl.acm.org/citation.cfm?id=48513

[Bar13] Barnet, Belinda. 2013. Memory Machines: The Evolution of Hypertext. London: Anthem Press. ISBN 9780857280602.

[Ber] Berners-Lee, Tim. (undated) The WorldWideWeb browser. http://www.w3.org/People/Berners-Lee/WorldWideWeb.html

[Ber14] Berjon, Robin, Steve Faulkner, Travis Leithead, Erika Doyle Navara, Edward O'Connor, Silvia Pfeiffer, Ian Hickson. 13 July 2014. HTML5: A vocabulary and associated APIs for HTML and XHTML. Editor's Draft. http://www.w3.org/html/wg/drafts/html/CR/

[Bie97] Bieber, Michael, Fabio Vitali, Helen Ashman, V. Balasubramanian, Harri Oinas-Kukkonen. 1997. Fourth Generation Hypermedia: Some Missing Links for the World Wide Web. In International Journal of Human Computer Studies 47 (1): 31-65, July 1997. doi:https://doi.org/10.1006/ijhc.1997.0130. http://www.researchgate.net/publication/220107979_Fourth_generation_hypermedia_some_missing_links_for_the_World_Wide_Web/file/50463515b46de9b93b.pdf

[Boa14] Boardley, John. 2014. The First Printed Page Numbers. In I Love Typography (blog). http://ilovetypography.com/2014/02/21/the-first-printed-page-numbers/

[Bod04] Bodard, Katia, Bruno de Vuyst, Gunther Meyer. March 2004. Deep Linking, Framing, Inlining and Extension of Copyrights: Recent Cases in Common Law Jurisdictions. In Murdoch University Electronic Journal of Law 11(1). http://www.austlii.edu.au/au/journals/MurUEJL/2004/2.html

[Bus80] Busa, Roberto. 1980. The Annals of Humanities Computing: The Index Thomisticus. Computers and the Humanities 14:83-90. doi:https://doi.org/10.1007/BF02403798.

[Bus45] Bush, Vannevar. July 1945. As We May Think. The Atlantic. http://www.theatlantic.com/magazine/archive/1945/07/as-we-may-think/303881/

[Cai99] Cailliau, Robert and Helen Ashman. 1999. Hypertext in the Web - a History. ACM Computing Surveys 31(4), December 1999. doi:https://doi.org/10.1145/345966.346036. http://www.acm.org/surveys/Formatting.html

[Chi76] Chisholm, Roderick. 1976. Person and Object.

[Cle02] Clemm, G., J. Amsden, T. Ellison, C. Kaler, J. Whitehead. March 2002. Versioning Extensions to WebDAV (Web Distributed Authoring and Versioning). IETF RFC 3253. http://www.ietf.org/rfc/rfc3253.txt

[Coo87] Coombs, James H., Allen H. Renear, Steven J. DeRose. 1987. Markup systems and the future of scholarly text processing. Communications of the ACM 30(11): 933-947. doi:https://doi.org/10.1145/32206.32209. http://dl.acm.org/citation.cfm?id=32209

[Con87a] Conklin, Jeff. 1987. Hypertext: An Introduction and Survey. IEEE Computer 20(9), September, 1987: 17-41. doi:https://doi.org/10.1109/MC.1987.1663693. http://www.computer.org/csdl/mags/co/1987/09/01663693-abs.html

[Con87b] Conklin, Jeff, Michael L. Begeman. 1987. gIBIS: a hypertext tool for team design deliberation. HYPERTEXT '87: Proceedings of the ACM conference on Hypertext: 247-251. New York: ACM. doi:https://doi.org/10.1145/317426.317444. http://dl.acm.org/citation.cfm?id=317426.317444

[Con01] Conklin, Jeff, Albert Selvin, Simon Buckingham Shum, Maarten Sierhuis. 2001. Facilitated hypertext for collective sensemaking: 15 years on from gIBIS. HYPERTEXT '01: Proceedings of the 12th ACM conference on Hypertext and Hypermedia. New York: ACM. doi:https://doi.org/10.1145/504216.504246. http://dl.acm.org/citation.cfm?id=504216.504246

[DeR89] DeRose, Steven. 1989. Expanding the Notion of Links. In Proceedings of Hypertext '89, Pittsburgh, PA. Baltimore, MD: Association for Computing Machinery Press. doi:https://doi.org/10.1145/74224.74245.

[DeR91] DeRose, Steven. 1991. Biblical studies and hypertext. Hypermedia and literary studies: 185-204.

[DeR97] DeRose, Steven. 1997. Navigation, access, and control using structured information. American Archivist 60 (3), 298-309.

[DeR94] DeRose, Steven and David G Durand. 1994. Making Hypermedia Work: A User's Guide to HyTime. Kluwer Academic Publishers. ISBN 978-0-7923-9432-7

[DeR99] DeRose, Steven and Andries van Dam. 1999. Document structure and markup in the FRESS hypertext system. Markup Languages: Theory & Practice 1(1), January 1999: 7-32. doi:https://doi.org/10.1162/109966299751940814.

[DeR02] DeRose, Steven, Eve Maler, Ron Daniel Jr. 2002. XPointer xpointer() Scheme. W3C Working Draft, 19 December 2002 http://www.w3.org/TR/xptr-xpointer/

[Dur08] Durand, David G. 2008. Palimpsest: Change-Oriented Concurrency Control For The Support Of Collaborative Applications. Dissertation, Boston University Department of Computer Science. http://www.lulu.com/us/en/shop/david-durand/palimpsest/paperback/product-3723032.html

[Dur96] Durand, David G., Elli Mylonas, Steven DeRose. 1996. What Should Markup Really Be: Applying theories of text to the design of markup systems. ALLC/ACH.

[Dus07] Dusseault, L. (ed). June 2007. HTTP Extensions for Web Distributed Authoring and Versioning (WebDAV). IETF RFC 4918 (obsoletes 2518). http://www.ietf.org/rfc/rfc4918.txt

[Eng73] Engelbart, Douglas C., Richard W. Watson, James C. Norton. June 4–8, 1973. The Augmented Knowledge Workshop. Proceedings of the national computer conference and exposition (AFIPS): 9–12. doi:https://doi.org/10.1145/1499586.1499593

[Eve13] Everts, Tammy. June 5, 2013. The average web page has almost doubled in size since 2010. In Web Performance Today. http://www.webperformancetoday.com/2013/06/05/web-page-growth-2010-2013/

[Fle14] Fleishman, Glenn. Apr 15, 2014. Papering over e-books. The Economist. Blog: Babbage: Science and technology. http://www.economist.com/blogs/babbage/2014/04/book-production

[Hal87] Halasz, Frank G. 1987. Reflections on NoteCards: Seven Issues for the Next Generation of Hypermedia Systems. Communications of the ACM 31(7), July 1988: 836-852. doi:https://doi.org/10.1145/48511.48514. http://dl.acm.org/citation.cfm?id=48514

[Hal01] Halasz, Frank G. 2001. Reflections on 'Seven Issues': hypertext in the era of the web. ACM Journal of Computer Documentation 25(3): 109-114. doi:https://doi.org/10.1145/507317.507328. http://dl.acm.org/citation.cfm?id=507317.507328

[Inf87] InfoWorld. November 23, 1987. p. 30. http://books.google.com/books?id=BT8EAAAAMBAJ&pg=PA30

[Int86] International Organisation for Standardization. 1986. ISO 8879:1986 Information processing — Text and office systems — Standard Generalized Markup Language (SGML).

[Kan81] Kant, Immanuel. 1781. Critique of Pure Reason. Edition 2013, tr. J. M. D. Meiklejohn. "An Electronic Classics Series Publication," Jim Manis, series ed. http://www2.hn.psu.edu/faculty/jmanis/kant/critique-pure-reason6x9.pdf

[Kan83] Kant, Immanuel. 1783. Prolegomena to Any Future Metaphysics That Will Be Able to Present Itself as a Science. Tr. Paul Carus, 1902. http://web.mnstate.edu/gracyk/courses/phil%20306/kant_materials/prolegomena2.htm

[Koi05] Koivunen, Marja-Riitta. 2005. Annotea and Semantic Web Supported Collaboration. In 2nd Annual European Semantic Web Conference. Heraklion, Crete, May 29 – June 1, 2005. http://www.annotea.org/eswc2005/01_koivunen_final.pdf

[Mar98] Marshall, Catherine C. 1998. Toward an ecology of hypertext annotation. In Hypertext 98: Proceedings of the ninth ACM conference on Hypertext and hypermedia: 40-49. ISBN 0-89791-972-6. doi:https://doi.org/10.1145/276627.276632. http://dl.acm.org/citation.cfm?id=276632

[Myl01] Mylonas, Elli. 2001. A commentary on Frank Halasz's 'Reflections on NoteCards: Seven Issues for the Next Generation of Hypertext Systems.' Journal of Computer Documentation (JCD) 25(3): 104-108. doi:https://doi.org/10.1145/507317.507326. http://dl.acm.org/citation.cfm?id=507317.507326

[WWW94] NCSA? 1994. Mosaic and the web: advance proceedings: the second International WWW Conference '94. October 17 through 20, 1994, Chicago, IL. http://books.google.com/books/about/Mosaic_and_the_web.html?id=T9dFAQAAIAAJ

[Nel87a] Nelson, Ted. 1987. Computer Lib. Revised edition, 1987 (original: 1974). Bound with Nel87b. Microsoft Press: ISBN 0-914845-49-7. Sausalito, California: Mindful Press.

[Nel87b] Nelson, Ted. 1987. Dream Machines. Revised edition, 1987.(original: 1974). Bound with Nel87a. Microsoft Press: ISBN 0-914845-49-7. Sausalito, California: Mindful Press.

[Nel81] Nelson, Ted. 1981. Literary Machines. Sausalito, California: Mindful Press.

[Nel99] Nelson, Theodor Holm. 1999. Xanalogical structure, needed now more than ever: parallel documents, deep links to content, deep versioning, and deep re-use. In ACM Computing Surveys (CSUR) 31(4es), Dec. 1999. Article No. 33. doi:https://doi.org/10.1145/345966.346033. http://dl.acm.org/citation.cfm?id=346033

[Nic95] Nicol, Gavin Thomas. 1995. DynaWeb: Interfacing large SGML repositories and the WWW. In Proceedings of The Web Revolution: Fourth International World Wide Web Conference, December 11-14, 1995, Boston, Massachusetts. http://www.w3.org/Conferences/WWW4/Papers/112/

[Pli] Pliny the Elder. Natural History. Preface 33. http://www.perseus.tufts.edu/hopper/text?doc=Perseus:text:1999.02.0138:book=preface:chapter=7 or http://penelope.uchicago.edu/Thayer/L/Roman/Texts/Pliny_the_Elder/praefatio*.html

[Rom11] Romanello, Matteo, Michele Pasin. June 2011. An Ontological View of Canonical Citations. Digital Humanities 2011. http://dh2011abstracts.stanford.edu/xtf/view?docId=tei/ab-143.xml

[Spe00] Sperberg-McQueen, Michael, Claus Huitfeldt, and Allen H. Renear. 2000. Meaning and Interpretation of Markup. In Markup Languages: Theory & Practice 2(3): 215-234. doi:https://doi.org/10.1162/109966200750363599.

[Sta81] Starr, Raymond J. 1981. Cross-References in Roman Prose. The American Journal of Philology 102(4), Winter, 1981: 431-437. Baltimore: Johns Hopkins University Press. http://www.jstor.org/stable/294331

[Str90] Strong, James. 1890. The Exhaustive Concordance to the Bible. NY: Abingdon Press. ISBN 0-687-40028-7.

[Tol47] Tolkien, J.R.R. 1947. On Fairy Stories. In C. S. Lewis, ed. Essays Presented to Charles Williams. Grand Rapids: Wm. B Eerdmans. ISBN 0-8028-1117-5.

[Uda99] Udanax.com and Project Xanadu. 1999. A Joint Disclosure by Udanax.com and Project Xanadu as of August 23, 1999: to accompany our presentation at the O'Reilly Open Source Conference. http://xanadu.com/tech/

[van87] van Dam, Andries. 1987. Hypertext '87 Keynote Address. Communications of the ACM 31(7), July, 1988: 887-895. doi:https://doi.org/10.1145/48511.48519. http://cs.brown.edu/memex/HT_87_Keynote_Address.html

[Vas14] Vasilogambros, Matt. May 7, 2014. This Guy May Get Sued Over an Amazon Review. National Journal. http://www.nationaljournal.com/tech/this-guy-may-get-sued-over-an-amazon-review-20140507

[Ven12] Venkata, Veera, Ravi Kumar Geddam, Sambasiva Rao Maddali, Devaraja Holla Vaderahobli, Narendhar Rao Soma, Rajesh Balakrishnan, Venugopal Subbarao, Sandeep Kumar Dewangan. 2012. Framework for supporting repair processes of aircraft. US Patent 20120143908 A1. Also published as US8670893, US20100318396.

[Wei87] Weiss, Stephen, Mayer Schwartz. 1987. Proceedings of the ACM Conference on Hypertext and Hypermedia. Chapel Hill, North Carolina, November 13-15, 1987.

[Wik14] Wikipedia. Copyright aspects of hyperlinking and framing. http://en.wikipedia.org/wiki/Copyright_aspects_of_hyperlinking_and_framing

[Yan85] Yankelovich, Nicole, Norman Meyrowitz, and Andries van Dam. 1985. Reading and Writing the Electronic Book. IEEE Computer 18(10): 15-30. doi:https://doi.org/10.1109/MC.1985.1662710. http://dl.acm.org/citation.cfm?id=4407

[Yee02] Yee, Ka-ping. 2002. CritLink: Advanced Hyperlinks Enable Public Annotation on the Web. In Computer Supported Cooperative Work (CSCW). http://zesty.ca/crit/yee-crit-cscw2002-demo.pdf



[1] The subtitle alludes to Kant (Kan83), a short, relatively accessible work which analyzes many particular capabilities and relations of mind and metaphysics (as opposed to the more synthetic approach of his Critique of Pure Reason, Kan81). The present article similarly takes an analytic approach, to a more mundane topic, and similarly proposes some minimal requirements for truly adequate systems — in this case hypermedia information systems.

[2] The author wishes to acknowledge many valuable and helpful comments from Rosemary Simpson of Brown University, as well as the anonymous reviewers.

[3] A notable current exception, other than linking, is what linguists call interlinear layout, where multiple streams must be aligned token-wise. For example, each word in a sentence may be annotated with a rough translation, a part-of-speech tag, and a base or root word it comes from; each word should then have three tokens aligned under it. Thanks to the addition of display:inline-block, this is achievable in many cases. However, there appears to be no way to get the effect within HTML+CSS if the lines differ in text direction, such as Hebrew text with English glosses.

Author's keywords for this paper:
Hypertext; Hypermedia; HTML5; Markup Systems

Steven J. DeRose

Consultant

Steve DeRose has been working with electronic document and hypertext systems since joining Andries van Dam's FRESS project in 1979. He holds degrees in Computer Science and in Linguistics and a Ph.D. in Computational Linguistics from Brown University.

He co-founded Electronic Book Technologies in 1989 to build the first SGML browser and retrieval system, DynaText, and has been deeply involved in document standards including XML, TEI, HyTime, HTML 4, XPath, XPointer, EAD, Open eBook, OSIS, NLM and others. He has served as Chief Scientist of Brown University's Scholarly Technology Group and Adjunct Associate Professor of Computer Science. He has written many papers, two books, and eleven patents. Most recently he has been working as a consultant in text analytics.