Why did I decide to learn data science?
Several years ago, when I was just beginning coursework for my PhD in Medieval History at Saint Louis University, I had the privilege to take a course on Medieval Italy, taught by Prof. Thomas F. Madden. In this course, we covered a wide variety of subjects and cities, with each student being responsible for reading a different book each week on that week's topic, then reporting back a summary of the book's contents. Late in the semester, the week on the maritime communes (Amalfi, Venice, Pisa, Genoa) came around on the syllabus and I happened to select a book by Quentin van Doosselaere entitled Commercial Agreements and Social Dynamics in Medieval Genoa (Cambridge University Press, 2009).
In his monograph, van Doosselaere examined the history of commenda contracts (long-distance commercial agreements) over the span of nearly three centuries. By applying social network analysis to a corpus of approximately 20,000 notarial records, van Doosselaere was able to show that the development of equity, credit, and insurance instruments (the building blocks of capitalism) within Genoa did not stem from any strategic machinations of merchants, but was rather an unconscious response to the changing structure of Genoese society itself. Already armed with an interest in trade from my undergraduate degree in Economics, I was immediately enthralled by van Doosselaere's work.
Though a background in Economics helped with the broad concepts, I honestly had no practical experience with data analysis, and certainly not with anything as complex as historical social network analysis (SNA). Thankfully, despite the rather esoteric nature of the subject, there were a few scattered resources for humanists who wished to learn it. The bibliography from van Doosselaere's book made for a decent starting point, introducing me to the landmark 1993 historical study on Medici marriages by John F. Padgett and Christopher K. Ansell (citation). It was, however, the discovery of Scott B. Weingart's blog, the scottbot irregular, that enabled me to start wrapping my head around SNA applications to historical data. From there, the best practical instruction I found was in Borgatti, Everett, and Johnson's Analyzing Social Networks (SAGE, 2013), with occasional supplementary materials from the Historical Network Research website's resources page.
Though teaching myself was a challenge, somehow I managed to get through it. After compiling, processing, re-processing, and re-re-processing my data set once again, I was finally able to start using social network analysis software packages (primarily UCINET, RStudio, and NetMiner 4, but Tore Opsahl's tnet package for R was also particularly useful for my circumstances). It took about a month of playing around with the data and tweaking things here and there, but in the end I produced a series of social network graphs and measures that laid the foundation for my dissertation.
Figure: Late 13th-century Pisan social network as compiled from notarial records (blue squares = notarial records, red circles = persons)
In retrospect, the entire project had been an enormous gamble on my part. When I began down this path, I had no idea what data I might glean from the notarial records in Pisa, I had no practical knowledge of social network analysis, and certainly no guarantees that years of effort would actually produce significant results. Once the social network analysis portion of my dissertation (the first chapter I wrote) was complete, I was astounded at my own findings, and even more so with the untapped potential that data science held for humanistic inquiry.
This revelation ultimately shifted my career trajectory somewhat. Pursuing a doctorate in medieval history had always carried with it the intended end goal of becoming a university professor. Though this has not appreciably changed, the tribulations of the academic job market have left their mark. A combination of political and societal factors has resulted in the gradual evaporation of non-STEM academic postings at American universities over the past two decades, though there has not been any concomitant slowing in the production of new History PhDs. This, understandably, has resulted in a glut of applicants for ever-fewer positions. In the field of medieval history, most tenure-track assistant professor job searches receive 200-300 applicants for a single position, which is rather competitive, to put it mildly.
Additionally, academic positions almost invariably have late-August start dates, though the job postings themselves typically come in two waves (usually Fall for permanent positions and Spring for shorter-term assignments). This state of affairs means it may well take one, two, or three years after earning a PhD to find stable employment. If one is unsuccessful in finding a job by the summer, then it will be another year at the earliest before one could possibly start teaching.
This, understandably, can be rather financially straining. Having just graduated with my PhD two weeks ago, this is precisely the situation in which I find myself as I compose this blog entry.
As I pondered my existence, I realized that there existed a solution with far-reaching positive implications: become a data scientist. Not only is data science one of the fastest-growing and highest-paying career fields today (and tomorrow!), but there are multiple ways in which it could bolster my own involvement in academia. Though the most immediate advantage of guaranteed employability is certainly attractive, the longer-term benefits are what truly drew me to learning to code.
My own historical research has already shown me the potential power of data-driven approaches to historical study; by gaining true fluency in cutting-edge data science techniques, I open a massive door to potential future research topics. Enter the Digital Humanities (DH). Digital Humanities has been a growing component of most academic research institutions worldwide in the past decade. As more and more scholars recognize the potential applications of computers to the humanities, there has been an ever-growing demand for instructional resources for educating undergraduates and established scholars alike.
Therefore, by becoming a data scientist, I open myself to a whole range of digital humanities positions which ought to have a significantly smaller applicant pool. Even more, there are numerous exciting DH projects, ranging from training machines to transcribe medieval Latin documents to making virtual city maps with layers spanning centuries, and from mapping Early Modern correspondence networks to reconstructing ancient trade patterns using archaeological data.
Whether I find short-term employment as a data scientist before entering the professoriate or remain permanently on the business side of data science, I will always seek occasions to combine historical research and digital approaches (either full-time or just in the occasional article). The unifying feature of both fields of study is my passion for solving complex problems with innovative approaches, something for which there are ample opportunities in both the academic and private sectors.


