An Experiment in Hybrid Open-Access Online Scholarly Publishing: Regenerations

Susan Brown
University of Guelph

Linda Cameron
University of Alberta Press

Anita Cutic, Mihaela Ilovan, Olga Ivanova, & Ruth Knechtel
University of Alberta

Andrew MacDonald
Independent software developer

Brent Nelson
University of Saskatchewan

Stan Ruecker
Illinois Institute of Technology

Stéfan Sinclair
McGill University

INKE Research Group


Background: The history of reading, writing, and the dissemination of technology is one of epochal change, and each transition – indeed the history of the book – is marked by hybridity. In the mature years of print, publishers, librarians, and scholars had clearly defined and segregated roles. In the digital realm, the boundaries have broken down. Just now we have hybridity of form and of roles in the implementation of new reading environments.

Analysis: This article provides: 1) an overview of e-reading environments; 2) a survey of the Dynamic Table of Contexts interface; and 3) a report on the hybrid production process of a particular online text, Regenerations.

Conclusion and implications: Regenerations could only have emerged from a collaboration among a digital infrastructure project, research project, university press, and digital humanities tool suite.

Keywords: Interface; E-reading; Indexing; Text markup; E-publishing

Susan Brown is Canada Research Chair in Collaborative Digital Scholarship and Professor of English at the University of Guelph, and Visiting Professor at the University of Alberta. She directs the Orlando Project and the Canadian Writing Research Collaboratory. Email: .

Linda Cameron (MSc) is a publisher and author. She is well known as an advocate for publishing and in 2015 received the Association of Canadian Publishers President’s Award “In recognition of ongoing exemplary work on behalf of Canadian publishers, and superlative service to Canadian book publishing industry organizations.” Email:

Mihaela Ilovan is the Project Manager of the Canadian Writing Research Collaboratory and a student in the combined MA in Humanities Computing and School of Library and Information Science at the University of Alberta. Email:

Olga Ivanova is a PhD candidate in Translation Studies at the University of Alberta. Her research interests centre around film dubbing, second language acquisition, humanities computing, and writing. Email:

Ruth Knechtel is the Manager, Institutional Research, in the Office of Research at the University of Waterloo, and formerly Project Manager of the Canadian Writing Research Collaboratory. Email:

Andrew MacDonald is a web developer with a focus on creating digital humanities applications and visualizations. He is one of the principal developers of Voyant Tools. Email:

Brent Nelson is Professor of English at the University of Saskatchewan, where he works at the intersection of digital humanities and early modern literature and culture. Email:

Stan Ruecker is an Associate Professor at Illinois Institute of Technology Institute of Design in Chicago. He works in humanities visualization, the future of reading, and information design, focusing on the design of experimental prototypes, both virtual and physical, to support the interpretive process and the encouragement of multiple perspectives. Email:

Stéfan Sinclair is at McGill University where he teaches and researches Digital Humanities, especially the design, development, theorization and use of text analysis and visualization tools. He is co-creator of Voyant Tools and co-author with Geoffrey Rockwell of Hermeneutica: Computer-Assisted Interpretation in the Humanities (MIT 2016). Email:

INKE (Implementing New Knowledge Environments) is a collaborative research group exploring electronic text, digital humanities, and scholarly communication. The international team involves over 42 researchers, 53 graduate research assistants, 4 staff, 19 postdoctoral fellows, and 30 partners. Email: .

Overview and context for the Dynamic Table of Contexts

The Dynamic Table of Contexts (DToC) is a reading interface that attempts to leverage some of the new affordances of digital books for people who are familiar with some of the useful features of conventional print books. In particular, it grows from a realization that the standard table of contents and index are two different ways of providing overviews of the material in a book. It allows the reader to dynamically combine the two, according to her own needs and interests. Experimentation with various designs and prototypes began in 2005 (Nelson, Sinclair, Brown, Radzikowska, Bieber, & Ruecker, 2013). The production version of the DToC used for the work in 2015 described in this article was based on the sixth design. The DToC has been incorporated into the Voyant Tools suite developed by Stéfan Sinclair and Geoffrey Rockwell (2016), which makes it available to scholars and members of the general public to enable the creation of editions of their own. It was of considerable interest, therefore, to the Canadian Writing Research Collaboratory (CWRC), both as a possible means of providing an interface for particular documents or collections within CWRC’s virtual research environment, and to respond to the request from CWRC members to enable them to create their own anthologies of materials.

We used the DToC to investigate the relationship between what might be considered as two types of indexing: the scholarly index, as produced by a professional book indexer, which emerged from centuries of the book as printed object, and the semantic markup of digital texts using established standards, especially the Text Encoding Initiative (n.d.) guidelines.

The question of the relationship of indexes to e-books, and of indexing to the affordances of digital textuality, is of considerable interest within and beyond the community of people who are interested in the impact of the digital turn on reading environments and reading practices. Publishers of e-books have had difficulty incorporating indexes into the new medium, and readers have missed them. As Peter Meyers (2015) put it in an early blog post on e-books: “why has the ebook index gone AWOL?” Only in August 2015 was a specification recommended “to define a consistent way of encoding the structure and content of indexes in EPUB Publications” (International Digital Publishing Forum, 2010–2015), despite the fact that the International Digital Publishing Forum had launched the EPUB format as an official standard in 2007 (Bogaty, n.d.; International Digital Publishing Forum, 2010–2015; Wikipedia, n.d.). The DToC seeks to make the index more than an afterthought within an e-reading environment designed for scholars in the digital age.

The Regenerations project situates itself at the intersection of print and digital forms within the shifting publication landscape that has resulted from a “digital, networked, open approach” (Weller, 2011, p. 11) to scholarship. At this moment, the model for scholarly knowledge dissemination is typically described as “hybrid” (Brown, Griffiths, Rascoff, & Guthrie, 2007; Owen, 2011). Our publication of Regenerations using the DToC is hybrid in a conventional sense of combining print and digital, or paper and networked, modes of publication, but is also hybrid in terms of the players in the production process and the formal features or rhetoric of the digital publication. Whereas the overwhelming majority of the digital components of hybrid publications still mimic print, Regenerations was an experiment in more fully exploiting the affordances of digital media. As Douglas Eyman and Cheryl Ball (2015) argue:

as we develop scholarly approaches and platforms that further these practices [of digital rhetoric], it is important to pay attention to the affordances and constraints of these platforms and to carefully consider the intellectual, social, and technological support structures that need to be used in the construction and dissemination of scholarly multimedia work. (p. 66)

Hybridity of form

In the shifting of conditions and possibilities for publication, and given the unevenness of support for publication in print and digital media, the current age is experiencing a halting progress similar to that evident from studies in book history. Old forms persist and new ones are slow in developing. We are still in the incunabula period of digital publication and, not surprisingly, most attempts at electronic book publication still owe a great deal to their print predecessors.

The relationship between print and digital can be thought of in terms of three categories: remediated form, parallel form, and hybrid form.

Remediated form is by far the most common form, whereby the print artefact is replicated in an electronically delivered form, typically as a Portable Document Format (PDF) file. This is what constitutes a typical volume from Google Books and, in some cases, a publisher’s offering as an e-book (see Figure 1). If we are lucky, that PDF was generated from a machine-readable file that produces clean, searchable text, but commonly, as in the case of Google Books, the ability to search the text is achieved only through “dirty,” which is to say quite inaccurate, optical character recognition (OCR) of image-generated files. The only additional affordances over a printed copy are easier sharing, accessing, and copying (if Digital Rights Management has not been used) of the document through electronic means.

 Figure 1

Parallel form publication, where an electronic form is developed to match the printed form of the content, is also quite common. In most cases the electronic form is a fairly simple Web page with a few added features that are native to the Web. Recently, Oxford University Press has been producing electronic versions of this type, for example, Daniel Starza Smith’s (2014) John Donne and the Conway Papers, with the added “Find it” functionality of an OpenURL link resolver (see Figure 2).

 Figure 2

A hybrid form of publication puts the advantageous forms and functions of the two supports together into a new form. The premise here is that there are some forms and functions that are developed in the printed form that bring great utility and should not be abandoned in an attempt to completely reinvent the book as container. It is hard to imagine, for example, a collection of essays that does not have a table of contents, or at least some tool that provides the same function of identifying discrete divisions in the collection (whether they are called chapters or something else) and points readers to where they can be found (whether by page number or by a direct hyperlink, or both).

In the case of the Blackwell Companion to Digital Humanities (Schreibman, Siemens, & Unsworth, 2008), the Web version contains elements that remind us of a parallel printed version: not only a page facsimile of the cover but also a table of contents that looks much like that of the printed form, except the page numbers have been replaced by hyperlinks (see Figure 3). There are also elements here that are native to Web-based documents, such as a search bar. In fact, once the reader opens a chapter the Web page seems to be the document’s native form. This is an example of limited hybridity.

 Figure 3

The Web version of Debates in the Digital Humanities (Gold, 2012) describes itself as “a hybrid print/digital publication stream that will explore new debates as they emerge.” This publication represents, in the words of a blog post from the GC Digital Scholarship Lab (2013):

the debut of a custom-built social reading platform. Going beyond the basic task of making the contents of the printed edition accessible, the open access (OA) platform makes the text interactive, with key features that allow readers to interact with the text by marking passages as interesting and adding terms to a crowdsourced index.

The emphasis here is on adding a new, social function to the act of annotation to facilitate the development of “debates” in the digital humanities that the lab hopes will contribute to new iterations of the book (see Figure 4). This hybrid includes an enhanced table of contents that can be expanded and contracted and also retains a key feature of the printed page: fixed and justified margins that fall within the range of standard line lengths. The signature feature, however, attempts to address a long-standing complaint about online reading: that, in contrast to the printed book, it curtails the ability to annotate, although some digital platforms such as Kindle are beginning to offer annotation affordances. This version is still not able to replicate the ease and flexibility of annotation one enjoys in a print environment, but it does add a new capacity to this very old function of annotation: the possibility for annotators across vast geographical space to see and respond to each other’s annotation of the text.

 Figure 4

In one sense, our case study represents a parallel form of publication to the extent that there is a print volume and also a Web-based version that, in many respects, attempts to emulate the print version. The Web-based volume, however, is itself a hybrid form, retaining forms and functions of the print object, but adding to these new forms that are native to the Web-based publication.

Here is an overview of some of the features in the DToC edition of Regenerations (Carrière & Demers, 2014) that represent the categories described above (see Figure 5).

Figure 5

Carry-over features

  • Running titles/persistent title. One of the challenges in reading extensive Web page content is a lack of orientation. This is something the print form, with discrete page architecture, does very well. The running title, for example, gives a persistent representation of the chapter one is reading. In the DToC we have a number of devices that provide this function, in addition to the persistent title at the top of the reading pane. The table of contents to the left, for example, is always in view, with the current chapter highlighted in orange.
  • Drop initials. This is a simple and under-considered navigational device. A crucial piece of information is where a major section of text, in this case a chapter, begins. Common in medieval manuscripts and carrying over into print, large, decorated initials were used to mark the beginning of major sections of a document (Koroniak, Geddes, & Randall, 2013). Simpler versions persist into contemporary printed books (see Figure 6). In a scrolling Web page, by contrast, it can be unclear just where one is situated in a document: a reader arriving at a screen with a paragraph at the top of the reading pane might assume it is the start of the chapter. The drop initial unambiguously signals the start of the chapter.
  • Table of contents. The table of contents, or some other form for identifying and pointing to divisions in a document, has had a long and varied history as it has been modified and adapted to reflect different kinds of content and meet the changing needs of readers. It has lent itself to hybridity, often combined with or coordinated with other means of navigating a text (Nelson, 2013). As we explain below, while the table of contents has an important presence in our reading environment, it works in tandem with other navigational features to create new affordances for readers.
Figure 6

Compensatory features

  • Annotator. As mentioned above, there are some features and functionalities in the print environment that are difficult to achieve in the digital. As scholars, we recognize the importance of annotation. A printed book is eminently tractable in this respect. The digital environment, less so. Annotation is not our focus, but recognizing that readers require this function, we have incorporated a third-party system provided by Open Knowledge Foundation’s Annotator.
  • Orientation. Another function that is inherent in the codex form is the ability to quickly visualize, indeed feel, where one is in the context of the whole. The eye can see it and the thumb can feel it along the fore-edge. Some printings have implemented divisional tabs. Students around the world apply sticky tabs. We preserve some of this functionality with our document model, which flags certain kinds of content within a graphical representation of the full context of the book, similar to viewing sticky notes protruding along the edge of a printed book (Figure 5).

Features unique to a digital environment

There is, of course, functionality in the digital reading environment that is not possible in print.

  • Search. The familiar Ctrl-F “find” search is a standard affordance (Figure 5).
  • Analysis and visualization. The DToC Statistics Panel enables a kind of reading that is not possible in a print environment (see Figure 7). If not exactly distant reading in Franco Moretti’s (2000) sense, it certainly provides a different kind of lens on the text.
  • Semantic text encoding. Another feature native to many scholarly digital texts is semantic encoding or tagging, usually in eXtensible Markup Language (XML). One of the challenges and frustrations of scholars using sophisticated encoding such as Text Encoding Initiative (TEI) markup for texts is that so often this enhancement is not easily passed on to the user. Many interfaces provide no immediate access to the XML text. This important form of knowledge representation may support a few affordances in the text, such as searching within particular sections or for a particular type of entity, but otherwise it often remains invisible to the average reader. As the following section explains, the DToC integrates the encoding as a core feature of the interface navigation.

Enhancement of inherited features: Hybridity

An interface approaches true hybridity when inherited features are not simply carried over to the digital environment but used to create new kinds of digital affordances. The name and functionality of the Dynamic Table of Contexts allude to, but depart from, the table of contents, which users in one study of e-books tried to use as a “hyperlinked complement or alternative to both the index and the search tool” (Barnum, 2004, p. 202). The table of contents is present in our interface in a modified form as a table of contexts that makes the interface uniquely dynamic. That is, we mobilize a number of contexts at once that interact with and enhance the inherited print form:

  • The index and the semantic XML tags form complementary navigational features. Two panels, one labelled “Index” and one “Tags,” indicate how many instances of the index term or tag are present in the text, and allow the reader to click on one or more terms. The index and tag panels provide several innovative affordances (see Figure 8):
    • The table of contents expands to include, below the title of each chapter in which it occurs, each instance of the term or tag that occurs in that chapter, along with a snippet of the text associated with the index term or the beginning of the textual contents of the tag;
    • Each term or tag is indicated on the document model, providing a sense of the overall distribution and any clusters of terms; and
    • Users can navigate to the point in the text where a particular instance of either a tag or a term occurs by clicking on the snippet in the expanded table of contents or on the bar in the document model, which provides a “rich prospect” on the contents of the whole volume (Ruecker, Radzikowska, & Sinclair, 2011).
Figure 8
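
The behaviour of the index and tag panels described above can be illustrated with a minimal sketch. It is not the DToC implementation (which is part of Voyant Tools); it only assumes a TEI-like document in which chapters are `div` elements of type `chapter`, and it uses invented sample content. The function names `tag_counts` and `contexts` are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

TEI = "{http://www.tei-c.org/ns/1.0}"

# Invented sample text; the names and chapter titles are placeholders only.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <div type="chapter"><head>Chapter One</head>
      <p>An essay mentions <persName>Writer A</persName> and <placeName>Town B</placeName>.</p>
      <p>Later it returns to <persName>Writer A</persName>.</p>
    </div>
    <div type="chapter"><head>Chapter Two</head>
      <p>This chapter discusses <persName>Writer C</persName>.</p>
    </div>
  </body></text>
</TEI>"""


def tag_counts(root, tags):
    """Count how often each tag of interest occurs in the whole document."""
    counts = Counter()
    for el in root.iter():
        name = el.tag.split("}", 1)[-1]  # strip the namespace prefix
        if name in tags:
            counts[name] += 1
    return counts


def contexts(root, tag, width=30):
    """List, per chapter, a short snippet for every instance of `tag`."""
    result = []
    for div in root.iter(TEI + "div"):
        if div.get("type") != "chapter":
            continue
        head = div.find(TEI + "head")
        title = head.text if head is not None else ""
        snippets = ["".join(el.itertext())[:width] for el in div.iter(TEI + tag)]
        if snippets:  # only chapters in which the tag occurs are expanded
            result.append((title, snippets))
    return result


root = ET.fromstring(SAMPLE)
counts = tag_counts(root, {"persName", "placeName"})
toc = contexts(root, "persName")
print(counts)
print(toc)
```

The counts correspond to the numbers shown beside each entry in the panels, and the per-chapter snippet lists correspond to the entries that appear under each chapter title when a tag is selected.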

This interface is both interactive and dynamic in a number of ways. The reader is able to interact with the various components of the tool for a very rich and versatile reading experience. Moving between searching, statistics, the index, the tags, the annotations, the table of contents, the document view, and the reading panel in itself provides a wide range of permutations to match various preferences in navigating texts. The view of the index and of tags can also be filtered by chapter. If a reader wants a “pure” reading experience untrammelled by these aids, she can also just close the panels altogether. This feature emerged from our work on supporting tablets.

The reader can also alter the reading environment more dramatically by moving into “curation mode.” Curation mode allows for the following:

  • Inclusion and exclusion of particular tags;
  • Relabelling of those tags within the interface so that, for instance, <persName> might be relabelled “person” or “personal name”; and
  • Reordering of chapters within the table of contents.
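
As a rough illustration of these three operations, the following sketch models a curation setting as a small data structure and applies it to a list of tags and a list of chapter titles. The `Curation` class, its field names, and the sample values are assumptions made for this example; they do not describe how the DToC stores curation settings.

```python
from dataclasses import dataclass, field


@dataclass
class Curation:
    """Hypothetical container for a reader's curation choices."""
    include: set = field(default_factory=set)         # empty set means "show all tags"
    exclude: set = field(default_factory=set)         # tags hidden from the interface
    labels: dict = field(default_factory=dict)        # tag name -> display label
    chapter_order: list = field(default_factory=list)  # titles moved to the front, in order


def curate_tags(tags, curation):
    """Filter tags by inclusion and exclusion, then apply display labels."""
    visible = []
    for tag in tags:
        if curation.include and tag not in curation.include:
            continue
        if tag in curation.exclude:
            continue
        visible.append(curation.labels.get(tag, tag))
    return visible


def curate_chapters(chapters, curation):
    """Reorder chapters; titles not listed keep their original relative order."""
    rank = {title: i for i, title in enumerate(curation.chapter_order)}
    # sorted() is stable, so unlisted chapters stay in document order at the end.
    return sorted(chapters, key=lambda title: rank.get(title, len(rank)))


curation = Curation(
    exclude={"note"},
    labels={"persName": "person"},
    chapter_order=["Chapter Three", "Chapter One"],
)
tags = curate_tags(["persName", "placeName", "note"], curation)
chapters = curate_chapters(["Chapter One", "Chapter Two", "Chapter Three"], curation)
print(tags)
print(chapters)
```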

Hybridity of process

The Regenerations essay collection came out of the first conference of the Canadian Writing Research Collaboratory (CWRC) – Canadian Women Writers Conference: “Connecting Texts and Generations.” The University of Alberta Press launched the printed edition together with the by now more traditional remediated digital forms (PDF, EPUB, Kindle) in the fall of 2014, while the CWRC launched the Dynamic Table of Contexts edition in the spring of 2015.

The process of collaboration between a large publishing house and an open access digital humanities project such as the CWRC was a new challenge for both partners. The University of Alberta Press had some prior involvement in digital publishing through online projects such as the groundbreaking solely digital Atlas of Alberta Railways (Lester, 2005), and its first textbook published solely in digital form on the Web, Ukrainian for Professional Communication (Nedashkivska, 2016), is forthcoming. Like most presses, it now routinely publishes digital editions of its print texts. While the press had, in the past, released some of its out-of-print publications under open access licenses, Regenerations was a freshly published in-print collection. On the other hand, digital humanities projects have been more likely to disseminate public domain materials and open access content.

The press had, of course, a well-established publication process: once a manuscript is peer reviewed and approved for publication by the UAlberta Press (UAP) Committee, various members of the publishing team, including the editor, designer, and indexer, collaborate closely with the author in the production of the final text. This time, however, the team involved in the publication process had to be extended to include the members of the Dynamic Table of Contexts team. Whereas the press was confronted with the messiness of an academic team engaging in publishing as a form of research, as well as with having to coordinate with the DToC team for the purposes of this collaboration, the DToC team had to gain an understanding of the UAP publishing process workflow to avoid parallel workflows and divergent versions of the collection.

As many iterative changes in the text take place during the copyediting, book design, proofreading, indexing, and final review stages of the “regular” publishing process, the DToC publication team could not receive the text before it was ready to print, since the agreement specified that the content of the print and DToC editions would be identical; indeed the research team sought a partnership with the University of Alberta Press for this project precisely because of their excellent production values. In addition, the DToC edition required a fully linked conceptual and thematic index. While the usual practice for the UAP was to omit the index from electronic editions with reflowable text (i.e., no page numbers), this was not an option in this case. The DToC team, in conjunction with the indexers (French and English essays had separate indexes) and the press, had to come up with a workflow that would ensure the DToC edition could reuse as much as possible of the indexing work that had gone into the printed edition.

Last but not least, the Microsoft Word versions of the book (easy to translate into XML – the necessary format for the DToC edition) stopped in the copyediting phase, as is customary in the publication process. After that, all the edits were made in an Adobe InDesign document, which is what the DToC publishing team received from the UAP in the fall of 2014. This is when the DToC publication process started in full.

In order to process the collection in the Dynamic Table of Contexts and render it in the interactive manner described earlier, the plain text had to be marked up using TEI XML to encode the text in both machine-readable and human-readable format. To cut down on the work of encoding the structure of the documents, we performed a number of step-by-step operations that moved the text of the collection across a number of different applications and formats into TEI. While the press was completing its customary publication process, it took the DToC team a full year to develop, test, and refine the workflow and to fine-tune the Dynamic Table of Contexts interface.

Since the result was a machine-generated XML document, and hence riddled with markup noise, the first stage of the DToC editing process was to remove this noise and align the document with the TEI schema. In the next stage of the DToC editing, unique identifiers were generated for each paragraph and note in the document. Thus prepared, Regenerations was returned to the same indexers who had worked on the printed edition, since paragraph numbers cannot be deduced from page numbers. The indexers in the second round replaced the page references with the corresponding paragraph unique identifiers provided by our team. The revised plain-text index document they provided was then encoded in TEI and reincorporated into the master document. This completed the initial clean-up phase of the editing process. We proceeded then to the next stage: markup enrichment.
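
The identifier and index steps can be sketched as follows. This is an illustration under stated assumptions rather than the project's actual scripts: the identifier scheme (`p1`, `note1`, and so on), the `term: id, id` layout of the revised plain-text index, the sample terms, and the function names `assign_ids` and `parse_index` are all invented for the example.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"  # serialized as xml:id

# Invented, namespace-free sample standing in for the cleaned-up document.
SAMPLE = """<body>
  <div><p>First paragraph.<note>A note.</note></p><p>Second paragraph.</p></div>
  <div><p>Third paragraph.</p></div>
</body>"""


def assign_ids(root):
    """Give every paragraph and note a unique identifier, in document order."""
    counters = {"p": 0, "note": 0}
    ids = []
    for el in root.iter():
        if el.tag in counters:
            counters[el.tag] += 1
            el.set(XML_ID, f"{el.tag}{counters[el.tag]}")
            ids.append(el.get(XML_ID))
    return ids


def parse_index(lines, known_ids):
    """Read 'term: id, id' lines and check that each target exists in the text."""
    index = {}
    for line in lines:
        term, _, refs = line.partition(":")
        targets = [ref.strip() for ref in refs.split(",") if ref.strip()]
        missing = [t for t in targets if t not in known_ids]
        if missing:
            raise ValueError(f"unknown target(s) for {term.strip()!r}: {missing}")
        index[term.strip()] = targets
    return index


root = ET.fromstring(SAMPLE)
ids = assign_ids(root)
index = parse_index(["archives: p1, note1", "memory: p3"], set(ids))
print(ids)
print(index)
```

Validating each index target against the generated identifiers is one way to catch transcription slips before the index is encoded and merged back into the master document.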

The production of the DToC edition was a very laborious process. The silver lining is the fact that, unlike the print version of the book where all the preparation produced a single visual expression of the text and its remediated forms, the kind of editing done in preparation for the DToC edition opens up an array of possibilities for further exploration of the text using other digital tools capable of processing XML. To that end, we decided to annotate the text with named entity references that link the names of persons, organizations, places, and titles with international authority files such as the Virtual International Authority File (OCLC, 2010–2016) and GeoNames. By identifying these references in the text in a machine-readable way, we are opening the volume to further exploration and analysis beyond the constraints of the Dynamic Table of Contexts, inviting further hybridization of the text.

The second part of the markup enrichment stage consisted of adding a semantic, interpretive layer to the markup using a set of tags based on those of the Orlando Project, an online history of British women’s writing. Unlike the mainly structural tags of TEI, Orlando uses for its entries a series of tags meant to encode different aspects of a writer’s life and work (Brown, Clements, & Grundy, 2006–2016). Specific references to these aspects of literary and biographical analysis were encoded into the Regenerations DToC edition. Since tagging a text using semantic tags consists, to some extent, of a series of subjective decisions, we had the three encoders engaged in the process check each other’s work to promote tagging consistency and to cut down on the final editorial review (see Figure 9).

 Figure 9

The amount of work put into this hybrid publication is staggering and without a doubt prohibitive to most solitary scholars who would be interested in repeating the experiment (see Figure 10). To give some numbers: the more than 300 pages of the volume were encoded (via automatic processes or through individual labour) with more than 15,000 tags:

  • 310 pages (print edition);
  • 1,003 paragraphs;
  • 15,000 XML tags;
  • 2,001 named entity references (541 persons; 144 organizations; 258 places; 1,058 titles); and
  • 1,136 semantic tags.
Figure 10
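
To put these figures in proportion, a short calculation using only the numbers listed above gives the approximate encoding density. Because the text reports "more than 15,000" tags, the per-page and per-paragraph values should be read as lower bounds.

```python
# Figures as reported in the list above.
pages = 310
paragraphs = 1003
xml_tags = 15000  # reported as "more than 15,000", so a lower bound
entities = {"persons": 541, "organizations": 144, "places": 258, "titles": 1058}
semantic_tags = 1136

entity_total = sum(entities.values())    # should match the reported 2,001
tags_per_page = xml_tags / pages          # roughly 48 tags per printed page
tags_per_paragraph = xml_tags / paragraphs  # roughly 15 tags per paragraph

print(entity_total)
print(round(tags_per_page, 1), round(tags_per_paragraph, 1))
```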

Though the project might be considered a failure from a business model perspective based on these numbers, this would neglect to take into account the exploratory, research-oriented nature of the edition, which combined a highly experimental interface with multilayered markup and a one-of-a-kind collaboration process hybridizing two very distinct publication models.

Regenerations might be launched, but the research it grounds is just taking off in terms of user testing, consequent revisions to the interface, and further editions, including the second volume of CWRC conference essays, Cultural Mapping and the Digital Sphere (Panofsky, Kellett, Brown, & Romaniuk, 2015), which will be published in a second open access edition by the CWRC in the Dynamic Table of Contexts in 2016. We also plan to integrate the Dynamic Table of Contexts into the CWRC online research platform as part of the CWRC-Voyant 2.0 bridge into the CWRC repository, where it can be made available for other volumes and for researcher-produced essay collections, anthologies, and experimental editions.

Conclusions and implications

At this point, the impact on print sales for the University of Alberta Press is difficult to assess due to the relatively low number of sales for publications of this type. A good indicator of the type of impact this joint model could have on sales is provided by the case of Sarah Carter’s (2008) The Importance of Being Monogamous, published by the University of Alberta Press in collaboration with Athabasca University Press. In that particular case, one of the derivative forms of the printed book – the PDF version – was made publicly available through a Creative Commons licence. This freely available publication did not seem to affect the sales of the printed version, and Athabasca University Press receives requests for print copies from people who have come into contact with the free edition. Print and electronic publication can thus be complementary.

On a different note, as an indication of how swiftly the field of electronic publishing is changing, the indexers’ experience with our hybrid model of indexing reflowable text made them confident in the feasibility of such an endeavour overall, and this seems to be consistent with a general move toward indexing e-publications. As of 2015, the University of Alberta Press began to index, via links, all digitized books, since the cost for their third-party vendor to do so had become reasonable.

The collaboration described here brought together the expertise of fine academic book production with that of experimental interface research. The financial risks on both sides were unequal; the press’s risk was underwritten by a financial subvention from the University of Alberta Libraries. This made the project possible and permitted a more fertile negotiation that produced an e-publication in an enriched rather than derivative form. The result is two forms of publication that complement each other: the Dynamic Table of Contexts Browser edition is not trying to be a book but rather to leverage the capacity of the digital medium to offer new affordances. As such it contributes to a rise in hybrid print-digital publication ventures, for instance, the Manifold Scholarship series, which is dedicated to “iterative, networked monographs” (University of Minnesota Press, 2015).

As with the Andrew W. Mellon Foundation’s support of Manifold Scholarship, however, it is clear that this initiative was only possible because of additional investment on both sides, not just of money but of time, attention, and imagination. Public funding of research through the Social Sciences and Humanities Research Council (SSHRC)-funded Implementing New Knowledge Environments project, along with support from the University Libraries, gave us the space to think, to explore, to test, to iterate, to train students, to encode text, to produce two parallel indexes for the two editions, and to report on our research. The final result is 1) a more polished and testable interface than we would otherwise have; 2) an object with high production values, quality content, and an existing user community that will allow for superior testing; and 3) an important intervention in emergent modes of scholarship and e-publishing that could not have emerged from the commercial publishing sector. The market is currently driving e-book interfaces that tend to be skeuomorphic imitations of print reading environments. This hybrid intervention in e-book interfaces, enabled by grant funding structures, indicates the importance of multi-sector partnerships and research support for experimentation in relation to major developments in the knowledge environment.


This work would not have been possible without the support of the Social Sciences and Humanities Research Council (SSHRC) and the Canada Foundation for Innovation.



Canadian Writing Research Collaboratory,


Implementing New Knowledge Environments,

The Orlando Project,


