1:20pm Richard Lane

(Machine) Reading One Million Texts: Technological Innovation through Topic Modelling in the Digital Humanities. Theme: Innovation, Entrepreneurship & Social Change
In our world of big data (extremely large data sets that are beyond the capacity of human readability), the automation of data analysis is essential for research breakthroughs (such as NASA's recent Kepler data analysis using neural networks, which in turn led to the discovery of new planets). Using the machine learning toolkit MALLET, our MeTA DH Lab computers are exploring a million texts, automatically and autonomously developing topic models that in effect read these texts, producing as outputs focused clusters of data, or themes/topics, for human researchers to analyze further. Specific research questions can thus be asked across an extremely large body of humanistic evidence: in this instance, downloaded sets of JSTOR database metadata from 1850 to the current day. In this presentation I will not only share some of the preliminary research outcomes from the early stages of this project, but also discuss how machine reading is a positive example of interdisciplinarity, drawing together computer science, technological innovation, the digital humanities, and more traditional humanistic approaches and methodologies. I will also discuss how the MeTA DH Lab has provided not just research infrastructure, but a space for interdisciplinary collaboration and knowledge mobilization.