Efforts to collect and curate important intellectual documents have focused on books and journals, reports, and other forms of communication inherited from the past. Efforts to currate new media are not so common or so mature. In our present moment, slide sets are one of the most common newer forms for presenting information. Despite their ubiquity and frequent reproduction and circulation, there has been little effort to collect and curate slide sets. In this paper, I propose a model for encoding slides that can be used to collect and curate them. I will explore why slide sets should be curated, what elements of slide sets should be encoded, and what guidelines for encoding them might look like.
As I will propose, for purposes of curation, the essential intellectual content of a slide set can be identified as an abstract version of the information that survives translation of the slide set into PDF: text, formulae, important graphics, but no animations, minimal if any description of style, and no distinction among the forty different ways slide software may progress from one slide to another. TEI offers a suitable XML vocabulary for this purpose--even if this might be the first application of TEI to born-digital documents. Others are also feasible, but the differences among them are less important than the difference between some representation of slide sets we can collect and curate and no such representation.
A note on terminology: Collections of slides prepared, most often (although not always), to be displayed for audiences of presentations are most commonly called--after the most-used software to produce them--“PowerPoints,” but “slide decks” or “slide sets” might be more appropriate, since sets of slides can be produced in any number of ways: by PowerPoint, KeyNote, Prezi, TeX with Beamer, or hand-rolled XML or HTML, or acetate, et cetera. I will use “slide set” throughout this talk, and my proposed model should work for slide sets regardless of the technology with which they are produced.
Why curate slide sets at all? PowerPoint belongs to a select group of software that is so disliked, one can buy tee shirts making fun of it. Edward Tufte has famously argued that PowerPoint impedes communication and makes for ineffective, even irresponsible presentations. He concludes that slides should be replaced by handouts and PowerPoint is bad at producing those documents as well. From the curatorial point of view, however, the first question to ask is what functions slide sets serve; the answers can help us recognize the needs and potential value of curation.
Documents often serve functions and have significance beyond their original design. A driver’s license, for example, is more often produced in order to purchase alcohol and cold medicine, to cash checks, or to board an aircraft than as proof that the bearer is licensed to operate a motor vehicle. Similarly, the primary purpose that a slide set is generally understood to serve is as a visual aid during a presentation. Just like a driver’s license, however, slides are pressed into service in all sorts of ways.
Slides can be surrogates for presentations where they had served as visual aids or they can be their own complete texts. Rather than only being tied to a specific presentation or event, slide sets may record details of which audience members will want a record in the future. Speakers will often say,
Don’t worry about copying down these references [or URLs, statistics, etc.], you’ll have them in the slides later, and so on. I have PowerPoints on my computer that I keep for reference long after I attended the talks: they have statistics and phone numbers and URLs that I don’t record anywhere else in any other form. Sets of slides often circulate as complete documents in their own right, as containers of information.
Plato's Socrates was deeply troubled by the practice of creating a written record of a speech and circulating the text beyond the original setting. But today we do it all the time. By convention, slides have come to function, more or less explicitly, as records of face-to-face talks. It’s common to seek out the slides from presentations we missed. Thus they can be surrogates for the talks themselves.
As a chance to provide the first draft of notes, they can be artifacts of the speaker's intention. They can also reflect what speakers consider important, as some points would be committed to writing on a slide and others wouldn’t. (This practice, it’s worth noting, is so well established that some speakers flout it, and use slides to display text they will argue with, or that will otherwise comment on the main points they make.) In this way, they are significant and worth attention even if a complete transcript or video or audio is available.
Slides have particular significance in the context of higher education, as they strain under the load of educational content. Campus disability programs mandate that printouts of slide sets be provided to some students before class. This requirement depends on the assumptions that slides are used in most, if not every, meeting of every class and that much of the content to be learned can be found on the slides. They have also become centerpieces of online education. The instruction in MOOCs like the Khan Academy and Coursera consists mostly of desktop capture with audio, and these tend to center on slide decks.
Figure 1: Figure 1: Sometimes, slides are leaked (
As a genre of document that many people in our current historical moment spend time creating and consuming, slide sets are worth attention. However, when we imagine the potential value of large collections of slides, their value and the importance of curation becomes clearer.
For example, in higher education, we can imagine collecting all the slide sets used in online courses by one university over the course of so many years, or all resident and online courses of one term, or all the slides used in iSchools or CIC schools, or all the slides used for courses in Coursera, or all the slides used by professors in the University of Chicago’s department of economics, or all the slides created by a given prominent scholar. Other collections are also possible, such as all the slide decks used in mission briefings in a given U.S. Army brigade while in Afghanistan over a given span of time.
We can leverage large collections to ask interesting questions. We could mine all citations and all hyperlinks. Someone might want to see all slide sets from iSchool courses that have cited Noam Chomsky on, or all the individual slides in the collection where Chomsky is mentioned, or those that mention both Chomsky and Dublin Core. How many have linked to a W3C Website? How frequently do slide sets use media with Creative Commons licenses? What kinds of maps are used in Army mission briefings?
The peculiarities of slides can express the conventions of audience expectations in specific disciplines, as well as the style of individuals--as is true of other textual forms. Tardy finds that members of disciplines quickly learn how to conform their slide decks to their discipline and audiences. Collections of slide sets would thus present important opportunities for those who study communication or information practices in various communities.
If the collection is large enough, we can ask linguistic and other structural questions. Text-mining could lead to linguistic discoveries, through techniques like topic-modeling. We could also make discoveries about the use of the form itself. How often are previews and other metalinguistic techniques used overall? What does a typical first slide look like? How many slides would it take to make a slide set too long?
When we consider that slide sets are common currency in business, public life, and intellectual life, and the many kinds of information that collections of these sets could yield, the need to collect and curate slide sets should be more apparent.
Beyond an assertion like, “every slide set contains at least one slide,” it's hard to generalize about all slide sets. There are a dizzying number of possibilities to account for.
Slides are produced by many different methods, each with vast feature sets producing their own versions of elements. The mind boggles when we contemplate the many possible combinations of settings and options that are theoretically possible. Stepping back from theory, we can reflect on the great diversity in actually existing slide sets in our experience. There is a lot to account for: slides, headings, text, text which might be formatted like a heading but is not serving semantically as a heading, numbers, formulae, citations, names of persons, names of places, hyperlinks, tables, graphics illustrative and decorative, images layered on images (e.g., circles and arrows), audio, animations, video, transitions, automated timed transitions and so on.
What are slides, really? To convey information, they function very much like comic books (or graphic novels). Slides contain graphics such as diagrams or photographs, limited text—-which can be a word or two—-and sometimes text quoted from another, and graphics that comment on other graphics, such as circled words and arrows. Sometimes the images convey a lot of content, and other times they are merely illumination (or, "clip art”). I would suggest that at bottom slides work through three relationships: elements in spatial relationships with each other on a slide, slides with chronological relation to each other, and slide sets in relation to the speaker.
The last of these relationships, I would argue, should be set aside for our purposes here. On one hand, I would agree with Gross and Harmon and Knoblauch, among others, when they say that we can only understand slides as parts of a whole: individual elements on individual slides forming a slide deck and all aspects of speakers' delivery in interacting with them. But on the other hand, this view ignores the many uses that slides are put to outside of presentations.
In deciding what counts as content to be encoded, one choice we can make is a choice made by many creators of slide sets. Slide sets are often distributed as PDFs. So, we can conclude that many rhetors and audiences (senders and receivers) are content with the amount of information that survives after conversion to PDF: text and images. This does more than boil things down to a manageable scale. This choice points to a discrete, coherent document to model.
What is to be encoded, then, are the spatial arrangement of elements on slides and the chronological arrangement of slides in a deck. I consider these two sets of characteristics as essential because this is what is preserved when slides are circulated outside of a presentation, and because most of what is left out of this model are aspects of delivery and would be best captured in audiovisual media.
Although there are many more nuances to discuss about the rhetorical and semantic work of slides, the important point is that a functional model of slides would allow us to describe them independently of the software used to create them.
A number of vocabularies could be used to encode slides. Presentation slides are often authored in XML in a number of vocabularies. There is already a customization of DocBook that provides for the authoring of new slides (http://www.docbook.org/schemas/5x-custom.html#slides). Use of DITA (http://docs.oasis-open.org/dita/v1.2/os/spec/DITA1.2-spec.html) is another possibility--a point I’ll return to later.
I suggest that it would be best to encode slides in TEI, which offers a finely developed vocabulary to account for the structural and semantic content of documents that would be of most use to curators, archivists, and other scholars.
For the sake of elegance and parsimony, and also for ease of adoption by encoders and collectors, I believe the best model would be one that leaves TEI’s DTD unmodified, that is, without creating any new elements or significantly modifying any existing ones. Thus semantic customization could be laid on top of the structure of the TEI P5 documents.
Because slide sets present only one required element (at least one slide) beyond what TEI always requires, the rest of this model is going to look like suggestions, with a surprisingly short list of TEI elements needed to represent slide sets.
So, from the top to the bottom, or outside-in:
TEI’s root element,
<text> , will serve to contain a sequence of slides.
Two elements in the standard TEI header will be particularly important. First, the
<publicationStmt> contains a
<settingDesc>. This can describe the setting, or the occasion, or the rhetorical situation:
<publicationStmt> <profileDesc> <settingDesc> . . . Presented when-where-why? Is occasion/context/situation an (FRBRish) event? </settingDesc> </profileDesc> </publicationDesc>
The encoding at first might seem simple, but on further consideration might appear truly distressing. If I use the set of slides corresponding to this paper to present my model to my seminar on document modeling, and then I close this file I’m working from and don’t open it again until I present to Balisage, or email it to someone who missed my talk, are these the same document? I am tempted to say no, two events, two documents. I must leave the implications for accounting for documents to be discussed at another time.
Since the most meaningful encoding will be not single sets but collections, we should include collection or series of information in the TEI header:
<fileDesc> >seriesStmt> . . . E.g., University of Illinois, LIS 590 DM 2012, Meeting 5, 2 September 2012. . . . >/seriesStmt> >/fileDesc>
What is included here will depend on the goals of the collection and how the collectors keep track of slide sets.
Slides themselves can be encoded as
<div type="slide”>. They can be named or numbered, or not, to meet the needs of a given collection.
Blocks of text can be marked up with the
<div> and paragraph-level elements.
Position is important, and TEI provides a mechanism for locating elements in the original text. For our purposes X and Y coordinates in relative units like ems would do the job.
TEI provides a means of accounting for common features of slides, such as citations, both full and partial (
<cit> and <bibl>), personal and place names (
<name> and <placeName>), and dates (
Hyperlinks are common, and TEI provides elements not merely for noting them, but for glossing them:
<ref type="url"> http://www. . . . </ref> <!– or --> <ref target="http://www. . . . "> <!-- . . . . . --> <gloss> . . . To identify where a link is going. Particularly important for short URLs. . . . </gloss> </ref>
Resolving and recording URLs is very important. Often tiny URLs are used in slide sets, to make it easy for listeners to take them down. The trouble is that these URLs take users to a third-party database (which may not be in business six months from now, or even if it survives may not retain all those user-created URLs), which in turn redirects to the full URL. In encoding slides we should capture the destination, so as to record that “http:tin.y/blAh7” takes people to the Smithsonian Institution’s historic flashlight exhibit.
Images of many sorts are common:
<figure> <graphic url=". . . "/> <head> . . . e.g. Figure 5 . . .</head> <figDesc> . . . </figDesc> <!-- or --> <floatingText> . . . </floatingText> </figure>
TEI provides for encoding the label (
<head>), the text contained in a graphic (
<floatingText>), and the textual equivalent of the image (
Collectors' policies might decide whether or not to include speakers' notes. Authors may or may not want them to be public. But accomplishing this in TEI is simple enough: (
<note place="margin">), or perhaps (
What about content the encoder adds or subtracts? Should we note multimedia, transitions, and other content that we are not converting, like transitions and animations and multimedia and presenter’s notes? Probably:
<gap>. TEI, of course, also allows us to mark tags we add, such as parts of speech, with
This model makes assertions about slides and sets of slides: Every slide set contains at least one slide. Sets of slides have authors and situations. Individual slides do not have contexts or author independent of the slide set containing them. The order of slides is also determined only by the larger document containing them. Placing
<settingDesc> in the header of a set of slides commits us to hierarchy. This move will help to answer some broader questions about slides and how we should be modeling them.
We can move on to general questions about this proposed model and about this species of document. Asking what is and is not accounted for can lead to a general discussion of the means and ends of this or another model.
Reducing slides to static text and images does leaves out a good deal from the original, including audio and video; animations; transitions; interactions between speaker and slides; and version history.
If a particular collector decides that it is valuable and feasible to include media (ideally in a form that is accessible to all users and will be for as long as the rest of the collection might be used), there is nothing in my proposed model to prohibit that.
As we consider other aspects of slide sets--for example, the automatic, timed transitions of pecha kucha presentations (with only images on slides and regimented time on each slide before it automatically advances)--these elements of the presentation interpenetrate delivery, and anything short of a complete transcript, or audio and video recording of the talk, will fail to capture every aspect of the presentation. I’ve suggested that slides make meaning not only through elements in spatial relationships on slides and slides' chronological relation to each other, but also through the interaction of speaker to slides. I am merely suggesting here a model for preserving the first two relations.
Are we interested in encoding slide-sets or presentations? Although my case below for not encoding more is for the most part pragmatic, I think that the proposed model is theoretically consistent.
Encoding presentations may be a worthy goal. But at present there is no effective method of encoding delivery into XML. So the only viable course here is to capture the video of performances.
Slides, I have argued, are worth collecting, and that is what I am limiting myself to here.
The case for including the version history of slides is more compelling, but it would set us on an even more difficult path. This view would start by observing that the slides we see have a long and often significant version history. They are revised, corrected, distributed, revised after they’re distributed, revised during presentations, copied, borrowed, repurposed, stolen, corrected, reshuffled, added, subtracted, modified, and forked.
Consider a scenario that represents a common kind of circulation of slides: graduate student X is teaching for professor Y, and she uses a slide originally from professor Z. Afterwards, professor Y adds the slide to his lectures. This can be quite significant to anyone following the flow of ideas and the research of those involved. If this still seems like trivia, suppose for example, that X and Y go on to win the Nobel Prize in economics.
Minor revision can include many changes to one slide. Here is an example of two versions of one slide:
Figure 2: Figure 2: Early version of a slide
Let’s imagine that when this slide in Figure 2 was produced, the creator had little experience with PowerPoint, and used a built-in template to add text and image to a slide. Design elements adopted here unconsciously end up carrying meaning to an audience. The text above the image, for example, was styled as a header for the entire slide, and this was not the intention. This points to a broader set of issues when it comes to encoding and real-world documents being created right now in our digital age. Semantic choices are often obscured by users relying on procedural markup, and vice versa. Most users of Microsoft word use formatting such as bold type and larger font-size to indicate headings, even though they have the option of using tags like
<H1> and <H2>. On the other hand, many slides produced in PowerPoint and other programs are done using templates, where semantics elements are supplied. In one case, procedural markup indicates genuine semantic choices, where in the other semantic markup is used largely for cosmetic effect.
Later, the creator made changes mainly with the intention of not saying what was not intended:
Figure 3: Figure 3: a later version of the same slide
Now in Figure 3 the first two sentences are distinct from the quotation, but are given equal stress. The size and placement of every block of text and the image were altered slightly. Whether all the changes have meaning is arguable, but I would admit that in principle it is less than ideal to cut off such an argument before it happens by not capturing the changes.
Capturing all the changes could be very valuable. It would also probably be a practical impossibility. To begin with, noting all the changes on a single slide and encoding them in a way that can be handled by machines and useful to humans would be difficult. TEI provides for storing multiple versions at the phrase level, with
<rdg> and <lem>. As these small changes mount, the temptation grows to encode multiple versions of a slide complete, as changes fork in multiple directions. And this would require significantly more work of encoding, and also creating a distinct schema, no matter which vocabulary we start with.
Faced with two versions of one slide, it might be easier to encode and include both versions. This might require a modification of TEI's basic document model, to alter the behavior of the
<app> element, along with
<rdg>, which as I read the guidelines can only contain phrase-level elements and cannot contain
<div> or <p> or <ab>. There is no reason why a
<rdg> could not contain the same elements as were in the
But I would suggest that a commitment to capture the addition, deletion, and reordering lies on a path to ruin. We have by this point left TEI behind, and I would find it difficult even to compose a workable model in the abstract that could encode all the revision history of slide decks that could be handled mechanically and understood by humans, even if we would all agree on their value.
Accounting for revision history of slides would be a valuable effort, but also a much more ambitious one, and it would require a significant modification of TEI, perhaps importing more from a vocabulary like DITA. If we were sufficiently ambitious, DITA might allow for a model that allows individual slides to float free as
<topic>s, gathered into this or that
<map>, representing a deck of slides for a given presentation.
DITA, however, does not provide the vocabulary for marking all the historical elements of the text. We would either be importing a number of TEI elements into a modified DITA, or the reverse, and both choices are frightening. I must respond to this alternative view on pragmatic grounds.
The hierarchy inherent in our model encourages parsimony. If I use the slides from another professor's lecture in a lecture of my own, with limited alteration, the slides are all mine, delivered in my class.
<SettingDesc> resides in the header and is inherited by the slides contained in the document. The provenance of individual slides is a matter for a footnote.
In his late-breaking paper at this conference, Eliot Kimber reported that the newly-developed Apache POI library (http://poi.apache.org/) allows for converting Microsoft Powerpoint documents (both the current .pptx format and the previous .ppt format) into XML in useable syntax, and vice versa. I would like to see large collections of slides. I am working towards a model that would be taken up easily and quickly by presenters and encoders.
My goal in this presentation is also simple enough: I want to start a conversation about the curation of slide sets. In our present moment, vast collections of slides are created that could be meaningful in understanding the moment we live in, to both our present and future selves.
Gross, Alan and Joseph E. Harmon.
The Structure of PowerPoint Presentations: The Art of Grasping Things Whole.
IEEE Transactions on Professional Communication
121-137. Print. doi:10.1109/TPC.2009.2020889.
General Architecture for Generation of Slide Presentations,
including PowerPoint, from arbitrary XML Documents.
Presented at Balisage: The Markup Conference 2013,
Montréal, Canada, August 6 - 9, 2013.
In Proceedings of Balisage: The Markup Conference 2013.
Balisage Series on Markup Technologies, vol. 10 (2013). doi:10.4242/BalisageVol10.Kimber01.
The Performance of Knowledge: Pointing and Knowledge in PowerPoint Presentations.
75-97. Print. doi:10.1177/1749975507086275.
NSA Slides Explain the PRISM Data-Collection Program.
6 June 2013.
http://www.washingtonpost.com/wp-srv/special/politics/prism-collection-documents/ . Web.
Expressions of Disciplinarity and Individuality in a Multimodal Genre.
Computers and Composition: An International Journal for Teachers of Writing
319-336. Print. doi:10.1016/j.compcom.2005.05.004.
Tufte, Edward. The Cognitive Style of PowerPoint: Pitching Out Corrupts Within, Second Edition. New York: Graphics Press, 2006. Print.