Reading Thomas Jefferson with TopicViz: Towards a Thematic Method for Exploring Large Cultural Archives
In spite of what Ed Folsom has called the “epic transformation of archives,” referring to the shift from print to digital archival form, methods for exploring these digitized collections remain underdeveloped. One method prompted by digitization is the application of automated text mining techniques such as topic modeling, a computational method for identifying the themes that recur across an archive of documents. We review the nascent literature on topic modeling of literary archives and present a case study that applies a topic model to the Papers of Thomas Jefferson. The lessons from this work suggest that the way forward is to give scholars more holistic support for visualizing and exploring topic model output, while integrating topic models with more traditional workflows oriented around assembling and refining sets of relevant documents. We describe our ongoing effort to develop a novel software system that implements these ideas.