Beyond Browsing and Reading: The Open Work of Digital Scholarly Editions

Jon Saklofske & Jake Bruce

Acadia University with the Modelling and Prototyping Team and the INKE Team

Jon Saklofske is Associate Professor in the Department of English and Theatre at Acadia University, 10 Highland Avenue, Wolfville, NS B4P 2R6. Email: jon.saklofske@acadiau.ca.

Jake Bruce is an Honours Graduate in Computer Science from Acadia University, 27 University Avenue, Wolfville, NS B4P 2R6. Email: 096334b@acadiau.ca.


Abstract: INKE’s Modelling and Prototyping Team is currently motivated by the following research questions: How do we model and enable context within the electronic scholarly edition? And how do we engage knowledge-building communities and capture process, dialogue, and connections in and around the electronic scholarly edition? NewRadial is a prototype scholarly edition environment developed to address such queries. It argues for the unification of primary texts, secondary scholarship, and related knowledge communities, and re-presents the digital scholarly edition as a social edition — an open work and shared space where users collaboratively explore, sort, group, annotate, and contribute to secondary scholarship creation.

Keywords: INKE; NewRadial; Social edition; Prototype; Scholarly edition; Environment; Visualization; Adapter; Database; Node; Edge; Group


Every performance explains the composition but does not exhaust it. Every performance makes the work an actuality, but is itself only complementary to all possible other performances of the work. In short, we can say that every performance offers us a complete and satisfying version of the work, but at the same time makes it incomplete for us, because it cannot simultaneously give all the other artistic solutions which the work may admit.
– Umberto Eco (1989)

Prototype motivations

How can digital scholarly editions take full advantage of environmentally-generated opportunities to focus on process, collaboration, and distributed control without losing the traditional affordances that make an edition “scholarly?” The Modelling and Prototyping Team of the Implementing New Knowledge Environments (INKE) project is currently exploring ways in which the scholarly edition can be re-imagined within digital settings. Our prototypes function as virtual environments that encourage play within their designed frames, and as Galey and Ruecker (2010) have argued, the trajectory of prototype iterations establishes a valuable record of critical enquiry. In this spirit, we wonder whether the digital scholarly edition (in addition to being perceived as an environment that is a trace record of the theoretical and argumentative motivations that inform the editorial processes of selection, organization, and design) could actively and dynamically host the formation of multiple, simultaneous, and community-generated editions and offshoots. Material print editions are records, artifacts that efface the process of their formation, version-objects that assert an argument and establish a historical position through the printed finality of their collation and production. Yet this apparent finality is interrupted as soon as editions are circulated, because the transmission of printed copies introduces new ideas into the ecosystem of ideas that already exists around a topic, essentially disrupting and challenging the existing system with new or alternative paradigms. These catalyze reactive extensions of varying magnitude, and such reactions eventually mature into secondary scholarship that chronicles the impact of such versions on the knowledge communities that receive them. 
The extent of the responsive ripples to these already-fossilized edition statements signifies the magnitude and influence of such an edition relative to scholarly conversation over time, though correlating the collective extent of an edition’s dispersed effect on the scholarly community is often difficult.

However, digital editions are different. They do not have to be materially cloned and widely scattered to reach a broad readership. Rather, they are usually installed as centralized databases, which a community of interested readers is invited to explore. The unfortunate result, though, is that due to the inherent structure of the World Wide Web, multiple users can be served pages related to a specific database without ever knowing that there are others browsing the same material, without ever being able to explore that material in tandem, or to hail one another from within the edition environment itself. This reproduction of print-based isolation and technologically-reinforced disability neglects the essential differences between printed and digital edition environments. Print editions are islands inhabited by single, stranded readers who send messages to each other in bottles across oceans of slow moving communication currents. Digital editions, due to the different set of affordances and constraints established by the compound platforms of the modern computer and the World Wide Web, are learning commons, parliamentary hubs, urban squares within which active processes of scholarly debate and exchange can be broadcast, recorded, collected, preserved, and shared in instantaneous, high-speed ways.

If digital editions are to take full advantage of their environments (rather than simply emulating print traditions) they need to visibly include both process and product, and offer opportunities for editorial diligence, contribution, perspective, control, and debate to their users. Top-down forms of authoritative and exclusive editorial selectivity become ironic and anachronistic in dynamic digital environments, which privilege “a new kind of scholarly discourse network that eschews traditional, institutionally-reinforced, hierarchical structures” (Siemens, Timney, Leitch, Koolen, & Garnett, with the ETCL, INKE, & PKP Research Groups, 2012, p. 453). We are exploring models of the digital scholarly edition that capitalize on this opportunity to establish social edition workspaces in which communities of users can contribute content.

The NewRadial knowledge environment

In this spirit, and to provide essential opportunities for user-based contributions and scholarship within digital edition environments, the INKE Modelling and Prototyping Team is currently developing a software prototype called NewRadial, which is an evolution of an earlier Java prototype designed by Jon Saklofske and Jean-Marc Giffin for use with William Blake’s composite art. This collaborative, visual environment reimagines the digital scholarly edition as a transparent workspace layer in which established primary objects from existing databases can be gathered, organized, correlated, annotated, and augmented by multiple users in a dynamic environment that also features centralized margins for secondary scholarship and debate. The NewRadial prototype is free and open source, its code is licensed under the GNU General Public License, and it runs in current versions of Google Chrome, Firefox, Safari, Opera, and Internet Explorer. It queries sites via an application programming interface (API) for specific results, then uses those results to harvest representations of the objects (i.e., thumbnails), which iconically populate its browser-based workspace. Linked data and annotations produced by a community of users within NewRadial’s workspace can then be exported to other applications, structured according to the Resource Description Framework (RDF) data model. Although NewRadial was initially designed to work with image repositories, we are currently working on alternative ways to adjust its interface to effectively display a broader range of texts, including audio, video, and writing objects.

NewRadial browser frontend

As an HTML5 web application, the NewRadial frontend consists of HTML and CSS documents that determine the layout of the components in the window, and a set of JavaScript files for logic and control. All of the components that compose the frontend are interpreted by the user’s Web browser, entirely on the client machine.

NewRadial makes heavy use of the canvas element; in fact, the application’s entire view consists only of overlays on top of an edge-to-edge canvas surface, adjusted to fit the whole window. Canvas operations as well as several utility features are provided by third-party libraries (jQuery, EaselJS, LABjs, and jsSHA). Libraries are loaded from a Content Delivery Network (CDN) if possible, to allow new clients to use a previously cached copy of the library. However, there are stability concerns involved when using third-party resources, so all externally-hosted scripts are confirmed after being included. If the confirmation is successful and the script has been properly loaded, the application continues; if not, the application loads a local copy of the library from the NewRadial server instead. Libraries that are not available from a CDN are loaded from the NewRadial server using LABjs, which then loads the frontend source files in the order specified in the load script.

Interface

Display field

The three main types of objects shown in the display field are nodes, edges, and groups. Each object is selectable, with an information overlay displaying the attributes of the selected object. For the purposes of the current NewRadial prototype, a node is any object that can be represented by an icon. A node consists of an icon to be shown in the display field, an enlarged image to be shown when the node is selected, a title, and a configurable set of name-value pairs. Nodes are displayed in one or more circular arrangements called radials, which can be used to visualize discrete collections of nodes, separated based on arbitrary criteria. When multiple nodes are selected, the user can pull the selected nodes into a new, separate radial. Users do not have the ability to create and delete nodes — these are obtained directly from the data source (see Figure 1).

Figure 1: Multiple nodes configured into radial groupings

An edge is, as in the field of graph theory, a connection between one node and another. In NewRadial, edges are undirected and consist of a title, user-created content, creator username, creation date, two endpoint nodes, and a comment thread. An edge is shown in the display field as a line between its two endpoint nodes. The application supports multiple sub-edges between the same two nodes, resulting in a thicker line between the endpoints. When a node is selected, all edges that do not contain that node fade. Users can create edges, as well as delete their own edges and edit their title and content, but only admin users have the ability to delete and edit edges created by other users (see Figure 2).

Figure 2: Edges between nodes host user-generated commentary and discussion
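
The node and edge descriptions above can be sketched as a small data model. This is a minimal illustration in Python rather than NewRadial's own JavaScript, and every class and field name in it (Node, SubEdge, EdgeIndex, line_weight, touching) is an assumption introduced here, not taken from the NewRadial source.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(frozen=True)
class Node:
    """An object supplied by the data source (hypothetical field names)."""
    node_id: str
    title: str
    icon_url: str
    image_url: str
    attributes: tuple = ()  # configurable name-value pairs


@dataclass
class SubEdge:
    """One user-created connection between two nodes."""
    title: str
    content: str
    creator: str
    created: datetime
    comments: list = field(default_factory=list)


class EdgeIndex:
    """Bundles sub-edges by their unordered pair of endpoint nodes."""

    def __init__(self):
        self._edges = {}

    def add(self, a: Node, b: Node, sub_edge: SubEdge) -> None:
        # frozenset makes (a, b) and (b, a) the same key: edges are undirected.
        key = frozenset((a.node_id, b.node_id))
        self._edges.setdefault(key, []).append(sub_edge)

    def line_weight(self, a: Node, b: Node) -> int:
        # More sub-edges between the same endpoints means a thicker drawn line.
        return len(self._edges.get(frozenset((a.node_id, b.node_id)), []))

    def touching(self, node: Node) -> list:
        # Edges containing the selected node stay visible; all others fade.
        return [key for key in self._edges if node.node_id in key]
```

Keying the index on an unordered pair is one simple way to guarantee that a connection drawn from A to B and one drawn from B to A land in the same bundle of sub-edges.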

A group is a collection of nodes, analogous to a hyper-edge in graph theory. Groups have similar attributes to edges: title, content, creator username, creation date, comment thread, and a set of member nodes. A group is shown in the display field as a line around the convex hull of the nodes in the group. To reduce visual clutter, a group’s hyper-edge is shown in the display field only when the group is selected by the user. Groups are creatable, editable, and able to be deleted in the same way as edges (see Figure 3).

Figure 3: An illustration of the grouping function in NewRadial
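
The group outline described above, a line around the convex hull of the member nodes, can be computed with a standard algorithm. The sketch below uses Andrew's monotone chain method in Python. The source does not say which hull algorithm NewRadial uses, so this is a swapped-in illustration, and the Group class, its fields, and the positions mapping are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    """A user-created collection of nodes (hypothetical field names)."""
    title: str
    content: str
    creator: str
    member_ids: set = field(default_factory=set)


def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Drop the last vertex while it would make a clockwise or straight turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each half is the first point of the other half.
    return lower[:-1] + upper[:-1]


def group_outline(group, positions):
    """Hull of the on-screen (x, y) positions of a group's member nodes."""
    return convex_hull([positions[m] for m in group.member_ids])
```

A node that sits inside the hull, like an interior member of a large group, does not appear in the outline, so the drawn line always encloses every member without touching all of them.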

Overlays

In addition to the display field, some interface elements are required to provide information and aid in navigation. The interface includes overlays to display information about the selected object, provide a list of all objects, and offer search functions for filtering objects in the visual field.

When a node is selected, the information panel displays its name, a table of its name-value pairs, and the full size image if possible, or a scaled-down version if the image is too large. The image also links to the original image, so users can inspect the image as hosted at its original source. When an edge or group is selected, the information panel shows a list of the nodes in the collection, and in the case of an edge, it shows the images of the two connected nodes in a manner similar to the image of a selected node. Edges show a selectable list of the individual sub-edges that connect the two nodes spanned by the selected edge. Both sub-edges and groups also show the object’s title, content, creator, creation date, and comment thread.

Users are able to comment on particular sub-edges or groups to facilitate discussion and to review user contributions. Comments include the creator’s username, creation date, and text content, and are arranged in a nested style, where each comment is either commenting on the sub-edge or group or replying to a previous comment. Users have the ability to delete and edit their own comments, and the date of creation or latest edit is displayed as part of the comment, whichever date is more recent. Admin users are able to edit or delete any comment, and when a comment is edited by a user other than the original creator, the editor is named along with the edit date.
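
The comment rules above can be sketched as a small nested structure. This Python illustration is only one possible reading of the description; the Comment class, its fields, and the flatten helper are names assumed here for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class Comment:
    author: str
    text: str
    created: datetime
    edited: Optional[datetime] = None
    edited_by: Optional[str] = None
    replies: list = field(default_factory=list)

    def display_date(self) -> datetime:
        # Show whichever is more recent: creation or latest edit.
        return max(self.created, self.edited) if self.edited else self.created

    def byline(self) -> str:
        # Name the editor only when someone other than the author edited.
        if self.edited_by and self.edited_by != self.author:
            return f"{self.author} (edited by {self.edited_by})"
        return self.author


def flatten(thread, depth=0):
    """Yield (depth, comment) pairs in display order for a nested thread."""
    for comment in thread:
        yield depth, comment
        yield from flatten(comment.replies, depth + 1)
```

The depth value returned by flatten is what an interface could use to indent each reply beneath the comment it answers.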

The interface also includes a list panel, which contains a list of all of the objects in the display field. Objects are selectable in the list panel, which performs the same selection as clicking on the object in the display field. Both this list panel and the information panel are collapsible, which saves screen space.

In addition to the two panels, the interface includes a search bar, which performs a text-based search of all objects in the display field. The application searches all objects by username, date, title, metadata content, comments, and name/value pairs for nodes. Objects that do not match the search string change in appearance, to differentiate negative from positive search results. Negative result nodes shrink, edges fade, and if a group is currently selected while searching, then its hyper-edge fades as well. These affordances allow a community of users to manipulate primary material and contribute to a field of secondary scholarship in a shared and centralized environment (see Figure 4).

Figure 4: NewRadial’s information panel, list panel, and search bar
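
The search behaviour described above amounts to splitting the displayed objects into positive and negative results. The Python sketch below assumes a case-insensitive substring match and represents each object as a plain dictionary. Both choices, along with the key names, are assumptions made for illustration, since the source does not specify the matching rule.

```python
def matches(obj: dict, query: str) -> bool:
    """Case-insensitive substring match over an object's searchable text."""
    needle = query.lower()
    fields = [
        obj.get("title", ""),
        obj.get("creator", ""),
        obj.get("date", ""),
        obj.get("content", ""),
    ]
    fields += obj.get("comments", [])
    # Nodes also carry name-value pairs; both halves are searchable.
    for name, value in obj.get("attributes", {}).items():
        fields += [name, value]
    return any(needle in str(text).lower() for text in fields)


def partition(objects, query):
    """Split objects into (hits, misses); misses are the ones to shrink or fade."""
    hits, misses = [], []
    for obj in objects:
        (hits if matches(obj, query) else misses).append(obj)
    return hits, misses
```

Returning the misses explicitly, rather than only the hits, mirrors the interface design: nothing is removed from the display field, and non-matching objects are merely de-emphasized.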

Login and security

In order to enable this online community of scholarship, user contributions are associated with a named user account. In addition to a username, user accounts consist of a password and a permission level of guest, user, or admin. Guest users, who have no username or password, have read-only permission in accessing the application; they are not able to create, delete, or edit objects. User-level users have all of the permissions of a guest, plus a chosen username, password, and full permission to create edges, groups, and comments, and to edit and delete their own contributions. Admin users have the permissions of the user-level, plus the ability to edit and delete any edge, group, or comment. Typical users have the ability to browse as a guest or create a user account, but admin accounts are only provided to the NewRadial server administrators.
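
The three permission levels above form a simple hierarchy, which can be sketched as follows. The Level enumeration and the two function names are hypothetical and stand in for whatever checks the NewRadial server actually performs.

```python
from enum import IntEnum


class Level(IntEnum):
    GUEST = 0  # read-only, no username or password
    USER = 1   # may create, and edit or delete their own contributions
    ADMIN = 2  # may edit or delete any contribution


def can_create(level: Level) -> bool:
    """Edges, groups, and comments require at least a user-level account."""
    return level >= Level.USER


def can_modify(level: Level, username, creator: str) -> bool:
    """Admins may modify anything; users only what they created."""
    if level >= Level.ADMIN:
        return True
    return level == Level.USER and username == creator
```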

For security purposes, NewRadial passwords are neither transmitted nor stored in plain text. Passwords are converted into an equivalent digest by a one-way hash function such that the data stored on the server can be compared to submitted login attempts, but not reverse-engineered into raw passwords in the event of a malicious entity gaining access to the server. In order to further protect against advanced techniques such as pre-computation (rainbow table) and replay attacks, the stored hashes are salted with a random stored token and transmitted passwords are salted with a one-time token provided by the server in a challenge-response model.
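
One way to combine a stored salt with a one-time server token is sketched below in Python. The source does not give the hash function or the exact message layout, so SHA-256, the order of concatenation, and all function names here are assumptions. The sketch illustrates the general challenge-response idea and is not a description of NewRadial's actual protocol.

```python
import hashlib
import hmac
import secrets


def digest(*parts: str) -> str:
    """SHA-256 over the concatenated parts, as a hex string (assumed hash)."""
    h = hashlib.sha256()
    for part in parts:
        h.update(part.encode("utf-8"))
    return h.hexdigest()


def register(password: str) -> dict:
    """Server side: store a random salt and the salted hash, never the password."""
    salt = secrets.token_hex(16)
    return {"salt": salt, "hash": digest(salt, password)}


def issue_challenge() -> str:
    """Server side: a fresh one-time token for each login attempt."""
    return secrets.token_hex(16)


def client_response(password: str, salt: str, nonce: str) -> str:
    """Client side: rebuild the salted hash, then bind it to the one-time token."""
    return digest(digest(salt, password), nonce)


def verify(record: dict, nonce: str, response: str) -> bool:
    """Server side: compare in constant time against the expected response."""
    expected = digest(record["hash"], nonce)
    return hmac.compare_digest(expected, response)
```

Because the response depends on the one-time token, a captured response is useless for a later login, and because the stored value depends on a per-account salt, a table pre-computed for unsalted hashes does not apply.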

Adapter system

Data published by web services can vary greatly in format and API specification, complicating the problem of integrating many unrelated data sources into a single application. In order to deal with the problem of arbitrary data sources, the NewRadial web application makes use of modular adapters, simplifying the future integration of new data sources into NewRadial. Each adapter is responsible for converting the output of a particular web collection into the native format of NewRadial. Users are able to choose between data sets, determining the adapter through which to fetch the data. The system exposes a simple interface to adapter developers, providing common functions such as database access and HTTP requests. Adapters that deal with large data sources such as the Networked Infrastructure for Nineteenth-Century Electronic Scholarship (NINES) specify a set of filter or search options to enable the study of proper subsets of large archives.

In order to demonstrate compatibility with a variety of data sources, the current version of the NewRadial web application prototype includes adapters for the Networked Infrastructure for Nineteenth-Century Electronic Scholarship (NINES) catalog, ArchBook, INKE researcher Jon Bath’s Canterbury Tales image database, and Google Image Search. Each of these sources provides metadata suitable for integration with NewRadial, but each operates using a different API and response format. These few test cases represent a small sample of the heterogeneity of access protocol among web services on the Internet, but demonstrate the capability of NewRadial’s modular adapter system to work with such differences.

NewRadial’s adapter system has been designed such that a single adapter can be written for each unique data source to provide the frontend of the application with consistently-formatted data regardless of the response format of the source. In addition, each adapter includes a model of the search options of its data source, if any, so the frontend can provide the user with a set of search controls appropriate to the source. Though each adapter is tailored to a specific data source, these adapters have several common attributes, including a function to communicate with the data source and specific metadata describing the adapter, as well as filtering and search capabilities. Adapter modules follow this specification in order to be properly registered and called by the system, and when invoked, produce a list of the retrieved nodes organized into radials. Importantly, once separate adapters are written for two independent data sources, a meta-adapter can also be written, which allows data from both sources to be worked on at the same time in the NewRadial frontend. This opportunity – to explore and annotate two independent datasets simultaneously without the need to standardize their metadata – is unique to NewRadial and is certainly one of its advantages.
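
The adapter and meta-adapter arrangement described above can be sketched as follows. This Python illustration uses in-memory records in place of real web services, and every name in it (Adapter, InMemoryAdapter, MetaAdapter, fetch, the record keys) is hypothetical. The actual NewRadial adapter specification is not reproduced in this article.

```python
class Adapter:
    """Common shape every adapter follows (hypothetical specification)."""
    name = "base"
    note = ""            # optional copyright or source notice
    search_options = ()  # model of the source's search controls

    def fetch(self, query=""):
        """Return a mapping of radial name to a list of native-format nodes."""
        raise NotImplementedError


class InMemoryAdapter(Adapter):
    """Stand-in for a web source whose records use their own key names."""
    search_options = ("query",)

    def __init__(self, name, records, title_key, icon_key):
        self.name = name
        self.records = records
        self.title_key = title_key
        self.icon_key = icon_key

    def fetch(self, query=""):
        # Convert each source record into the one format the frontend expects.
        nodes = [
            {"title": r[self.title_key], "icon": r[self.icon_key], "source": self.name}
            for r in self.records
            if query.lower() in r[self.title_key].lower()
        ]
        return {"results": nodes}


class MetaAdapter(Adapter):
    """Presents several adapters as one, without merging their metadata schemas."""
    name = "meta"

    def __init__(self, *adapters):
        self.adapters = adapters

    def fetch(self, query=""):
        radials = {}
        for adapter in self.adapters:
            for radial_name, nodes in adapter.fetch(query=query).items():
                # Prefix with the adapter name so radials from sources stay distinct.
                radials[f"{adapter.name}: {radial_name}"] = nodes
        return radials
```

In this sketch the meta-adapter never inspects the source records themselves. It relies only on each adapter already emitting the native node format, which is why no shared metadata standard is needed.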

RDF export functionality

Just as NewRadial has the capability to draw data from arbitrary sources on the web, it also facilitates integration with other web services by publishing its user-created content as RDF resources. By using the RDF data model to represent users, edges, groups, and comments, the application enables the development of third-party web services that would draw from NewRadial’s content to provide alternative environments in which to study the same data. NewRadial publishes its content to the semantic web by hosting HTTP URIs on the NewRadial server. Each, when requested, responds with an RDF description of the object requested and the objects to which it is related. The RDF description of an edge, for example, links to the two endpoint nodes that it connects, as well as its creator and its comments.
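
To make the edge example concrete, the sketch below serializes one edge as N-Triples, a line-based RDF syntax. The base URI, the dictionary keys, and the choice of Dublin Core terms as the vocabulary are all assumptions made for illustration. The source does not state which vocabulary or serialization NewRadial publishes.

```python
DCT = "http://purl.org/dc/terms/"
BASE = "http://example.org/newradial/"  # hypothetical server address


def literal(text: str) -> str:
    """Quote and escape a string as an N-Triples literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def uri(kind: str, ident: str) -> str:
    """Resource URI for a user, edge, group, or comment (assumed layout)."""
    return f"<{BASE}{kind}/{ident}>"


def edge_to_ntriples(edge: dict) -> str:
    """Describe an edge and link it to its endpoints, creator, and comments."""
    subject = uri("edge", edge["id"])
    triples = [
        (subject, f"<{DCT}title>", literal(edge["title"])),
        (subject, f"<{DCT}creator>", uri("user", edge["creator"])),
        (subject, f"<{DCT}created>", literal(edge["created"])),
    ]
    # Endpoint nodes live in the source data set, so they keep their own URLs.
    triples += [(subject, f"<{DCT}relation>", f"<{n}>") for n in edge["endpoints"]]
    triples += [(subject, f"<{DCT}hasPart>", uri("comment", c)) for c in edge["comments"]]
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)
```

Each emitted line is one subject-predicate-object statement, so a third-party service can follow the creator and comment URIs to retrieve further descriptions.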

Data persistence

User contributions persist between sessions, and are visible to other NewRadial users. This is accomplished by storing users, edges, groups, and comments on the NewRadial server. The system acts as a layer on top of existing data sets, where the contributions of NewRadial users are stored separately from the source data set.

NewRadial backend

Although the HTML5 document that hosts the client-side frontend of the web application contains essentially all of the functionality with which the end user will interact, its capabilities rest on the foundation of the server-side backend. NewRadial uses the popular web server software Apache, which, in addition to serving static files, also supports browser caching, response chunking, and many other useful features for modern webpages.

NewRadial’s data storage software is PostgreSQL, an open source SQL-based relational database. Apache serves files to clients, and PostgreSQL stores data, but an intermediate link is required for client-side code running in a webpage to communicate with a server-side database. NewRadial fills this role with Node.js, a server-side scripting system that runs JavaScript code and is focused on speed and scalability. Node.js was chosen over PHP, the server-side embedded scripting language typically paired with Apache, because a unified frontend and backend language and format avoids a conversion step and simplifies both interaction between components and maintenance of the system as a whole (see Figure 5).

Figure 5: Illustrative diagram of NewRadial’s server-side architecture

The system supports user-generated content, error logging, user authentication, and custom file caching to deal with JavaScript’s same-origin policy (SOP). NewRadial overcomes the SOP problem by maintaining a server-side cache of all images required on the client and providing links to the cached versions rather than the originals. This way, the images originate from the same source as the JavaScript, from the point of view of the client’s web browser. Cached images are checked daily by sending an HTTP HEAD request to the image source location and are updated in the case of a changed or deleted source image. In addition, when the number of images reaches a configurable maximum number, each new requested image replaces the least recently used image in the cache, to avoid growing beyond the specified maximum size. The ImageBorrower performs these automated duties and provides an interface for adapters to request images from the cache.
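
The cache policy described above, least-recently-used replacement plus a periodic freshness check, can be sketched in Python as follows. This is not the ImageBorrower itself: the ImageCache class, the fetch and probe callables, and the idea of comparing a version token returned by the check are assumptions standing in for the real download and HTTP HEAD logic. The sketch also assumes a maximum of at least one image.

```python
from collections import OrderedDict


class ImageCache:
    """Bounded image cache with least-recently-used replacement."""

    def __init__(self, max_images, fetch, probe):
        self.max_images = max_images
        self.fetch = fetch  # url -> image bytes (stand-in for a download)
        self.probe = probe  # url -> version token, or None if the source is gone
        self._entries = OrderedDict()  # url -> (token, data), oldest use first

    def get(self, url):
        if url in self._entries:
            # A hit makes this image the most recently used.
            self._entries.move_to_end(url)
            return self._entries[url][1]
        if len(self._entries) >= self.max_images:
            # Evict the least recently used image to stay within the limit.
            self._entries.popitem(last=False)
        entry = (self.probe(url), self.fetch(url))
        self._entries[url] = entry
        return entry[1]

    def refresh(self):
        """Periodic check: drop deleted sources, re-download changed ones."""
        for url in list(self._entries):
            token = self.probe(url)
            if token is None:
                del self._entries[url]
            elif token != self._entries[url][0]:
                self._entries[url] = (token, self.fetch(url))

    def urls(self):
        return list(self._entries)
```

Running refresh on a daily schedule would reproduce the behaviour described in the text, with probe playing the part of the HEAD request.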

This type of server-side image harvesting raises some copyright issues, however. NewRadial only performs this caching to solve the technical problem of the SOP, but it does require the application to download remotely hosted files and host them on its domain instead. This could be troublesome to data source owners, who may not want their content served from an alternate location. NewRadial takes several steps to avoid such copyright issues. During the automatic update of the image cache, if a source image has been removed or changed, the cache is immediately updated by replacing or deleting the old image. Image caching is only performed for thumbnails to be drawn to the canvas; full size images, if available, link to the image from the data source. Finally, for some adapters, a copyright notice will be required for the legal use of the data. A special note field is included in the attributes of each adapter that, if specified, causes a message to appear during the selection of the adapter that can contain copyright notices or other data source information. The developers of NewRadial cannot ensure that adapter developers have rights to the content to which they are adapting – this responsibility is left to them.

Lastly, the backend system must be able to host both static webpages and dynamic URLs, in order to serve the application’s HTML, CSS, and JavaScript files as well as the resource URLs of the RDF system. One important advantage of Node.js over PHP as a server-side scripting solution is its ability to deal with dynamic URLs. All requests to the Node.js server go through the same module, so it is simple to redirect incoming requests based on the content of the URL to provide a set of dynamically located resources. In NewRadial, the ability to host dynamic URLs is used for internal application communication, and also to host a web of RDF resources, accessible over HTTP. An RDF resource maps to a user, edge, group, or comment in the NewRadial system. The representation of the entity includes the same metadata as in the frontend of the application, plus hyperlinks to the RDF resources that are related to the entity.
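
The single-entry-point routing described above can be sketched as one dispatch function. The Python below is only an illustration of the idea. The /rdf/ URL layout, the dispatch function, and the handlers and static_files mappings are hypothetical, and NewRadial's own routing is written for Node.js.

```python
import re

# Hypothetical URL layout: /rdf/<kind>/<id> for the four RDF resource types.
ROUTE = re.compile(r"^/rdf/(user|edge|group|comment)/([A-Za-z0-9_-]+)$")


def dispatch(path, handlers, static_files):
    """Route every request through one function, as a single module would."""
    match = ROUTE.match(path)
    if match:
        kind, ident = match.groups()
        # Dynamic URL: build the response from the identifier in the path.
        return 200, handlers[kind](ident)
    if path in static_files:
        # Static URL: serve the stored HTML, CSS, or JavaScript file.
        return 200, static_files[path]
    return 404, "not found"
```

Because the resource type and identifier are read from the path itself, new resources become addressable as soon as they exist in the database, with no per-resource file on disk.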

NewRadial’s affordances are being designed to introduce a dynamic multiplicity of vision into what has traditionally been a reductive, oppositional, and snail’s pace process of inter-edition debate and evolution. The development of this digital edition environment prototype is the first step toward creating inclusive editorial workspaces, which draw from broad data foundations and encourage knowledge-building communities to actively reimagine edition-building processes. It is a multi-functional space in which existing databases can be correlated and sorted into specific edition environments that support annotations and associations between individual objects or object groups. These edition environments can then be subsequently browsed and worked on by communities of scholars, defining edition workspaces that emerge from editorial work as centralized sites that encourage and host further scholarship. Digital scholarly editions, as imagined through NewRadial’s frame, are beginnings, possibility fields of scholarly inquiry and conversation, generative hubs that become both the centre and circumference of scholarly work.

Prototypical argumentation

If a prototype is an argument, as Galey and Ruecker (2010) suggest, then NewRadial argues that both the generation and use of digital scholarly editions should take advantage of the ways that the Web can centralize a distributed community of scholars. NewRadial is a site for the generation of social editions, for a more public and open process of edition formation, pluralization, and persistent growth. It can serve as an environment in which contributing users can generate, collect, and sort primary objects out of larger databases into editions, then share those editions with a knowledge community that can use those sites as further databases for exploration and correlation. In such spaces, hosted and centralized by the NewRadial server, communities can contribute commentary and perform secondary scholarship within the space of a particular user’s edition spaces, or return to the larger database and begin correlative projects of their own. Additionally, NewRadial is a site of scholarly process, discussion, and development. It has the ability to re-present database material in a sandbox environment, encouraging iterative experimentation, hosting methodological and interpretative debate, and supporting innovative combinations and connections. These opportunities can serve as the raw processes, the activated complex from which more traditional scholarly print projects (collaborative or otherwise) can precipitate. In summary, NewRadial is a social edition space that encourages three types of work:

Future visualizations

NewRadial emerged as a prototype solution to the question of how to better interact with databases of images, but through its development as an INKE prototype, it has become a statement, a performative argument that is helping to redefine and visualize the ways in which scholarly editions can evolve in digital environments. However, the question of how applicable it might be to a broader consideration of digitized text or other media objects is one that we are now turning to. How could we represent text or other media objects in NewRadial so that a user could be afforded the same kinds of manipulation, exploration, and annotation opportunities that they have with images in this environment? If this could be done, then NewRadial would be able to demonstrate that the media-rich opportunities of our 21st-century socially-networked digital environments should serve as the model for an evolution of the scholarly edition into dynamic social edition systems. One of the databases that we imported into the environment while testing the NewRadial adapter system was Jon Bath’s database of scanned pages from an edition of The Canterbury Tales. When these pages appear in the NewRadial display frame as thumbnails, they are not visually differentiated enough from one another to justify re-presentation via a NewRadial paradigm (see Figure 6).

Figure 6: The problem of text-rich pages in NewRadial

However, the metadata tags that Bath has associated with each image in the original database make it easy to do quick keyword searches and to see the results of those searches (and to pull pages that contain search terms out of the main radial as a new radial or group to annotate and discuss). From this, it has become evident that the image-based nature of NewRadial’s display field retains the potential for the use of text-rich objects, if the already-existing metadata for such objects has been richly populated before that database is imported (see Figure 7).

Figure 7: The visual benefits of metadata search results demonstrate that even text-rich source material can be usefully explored in the NewRadial environment.

Alternatively, the definition of a NewRadial object could become more particular. We envision an evolution of NewRadial in which users could establish edge and group connections between parts of an image, or specific words or phrases within and between pages. This possibility is perhaps best suited for databases of digitized, pre-digital written publications or digitized art objects. Essentially, this is a higher-resolution version of the operations that NewRadial already enables and would require some significant changes to the program’s current display interface and affordances. One question that would accompany such a conversion is what to do with text-based objects. Could the NewRadial window, instead of displaying images, display a collection of .pdf or .txt files that are searchable, annotatable, and visually relatable (via edge connections) at the level of specific phrases or words? While the answer might be affirmative, a better question would be whether the NewRadial prototype should simply be understood as one member of a larger family of digital software environments that has a specific set of display options, or whether prototyping efforts should work toward an all-in-one tool for database manipulation and augmentation. NewRadial’s most important function is not its display configuration or representative abilities (though these are unique arguments for the ways that alternative visualization methods multiply critical perspectives relating to an original database). Its main assertion as a prototype is that it models a social edition environment in which knowledge communities contribute to primary editions (or create specific editions out of existing databases).

A further (and perhaps more robust) possibility is preserving the NewRadial idea of allowing users to connect, group, and comment on objects (crowd-sourcing the relational fields rather than automating them), but jettisoning the thumbnail standard and replacing thumbnails with iconic symbols (or even labelled mindmap-like bubbles) of one sort or another. The implicit question at the heart of this possibility is: how flexible should the definition of a NewRadial “object” be (as objects become the site where connections can be made)? In a similar way to deciding how high the resolution should be for the definition of an RDF object, would it be useful to go beyond the page as the definition of a NewRadial object? The answer is a definite “yes.” In the same way that NewRadial was designed to offer an alternative to book-based paradigms of presentation, organization, and access, it has also been designed with flexibility in mind. Instead of being limited by the page object, the program could be customized to display icons that represent characters in a play, places in a narrative, or ideas/themes across an author’s corpus so that a knowledge community could start to map connections and begin dialogue between these kinds of elements (rather than just pages or image thumbnails). Given that NewRadial has been designed to import different database types into its display via specific, customizable adapters, it is quite possible to repurpose its environment and affordances for use with other relational and annotative projects.

Conclusion

The NewRadial prototype argues that the creative re-presentation of media objects in a digital environment (that transcends familiar paradigms of organization) will not only encourage an extension of traditional scholarship and dialogue relating to the formation of scholarly editions, but will also establish unique perceptual opportunities and broader ways of creating, perceiving, relating to, and interacting with such editions. It is offered as a threshold experience, one that connects traditional and transitional approaches to humanities scholarship by providing a rich environment in which we can both engage with our cultural and creative past in new ways (which will continue to seed traditional modes of scholarship), as well as innovatively move beyond the print-based metaphors and paradigms that still necessarily influence, but also limit, current editorial scholarship and practice.

References

Eco, Umberto. (1989). The open work. Translated by Anna Cancogni. Cambridge, MA: Harvard University Press.

Galey, A., & Ruecker, S. (2010). How a prototype argues. Literary and Linguistic Computing, 25(4), 405-424.

Siemens, R., Timney, M., Leitch, C., Koolen, C., & Garnett, A., with the ETCL, INKE, & PKP Research Groups. (2012). Toward modeling the social edition: An approach to understanding the electronic scholarly edition in the context of new and emerging social media. Literary and Linguistic Computing, 27(4), 445-461.


CCSP Press
Scholarly and Research Communication
Volume 4, Issue 3, Article ID 0301119, 13 pages
Journal URL: www.src-online.ca

Received March 3, 2013, Accepted April 22, 2013, Published December 18, 2013

Saklofske, Jon, & Bruce, Jake. (2013). Beyond Browsing and Reading: The Open Work of Digital Scholarly Editions. Scholarly and Research Communication, 4(3): 0301119, 13 pp.

© 2013 Jon Saklofske & Jake Bruce. This Open Access article is distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc-nd/2.5/ca), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.