Philology 2.0
May 9, 2007 - mhemment
Imagine the following scenario:
Background information and the text [of a Greek author like Homer] [...] are translated into the Chinese or Arabic. The inquirer has developed a profile, not unlike her medical history, which can record the classes she has taken, the books she has read, the movies she has seen, the games she has played, and the questions that she has posed. The personal reading agent can compare this profile, eagerly developed and shared only in part and under strict conditions, against the cultural referents implicit in the author or text of interest, then produce not only translations but personalized briefing materials – maps, timelines, diagrams, simulations, glossary entries – to help that reader contextualize what she has encountered. As the reader begins to ask questions, the system refines its initial hypotheses, quickly adapting itself to her needs. As the system changes, it inspires new kinds of inquiry in the reader, creating a feedback loop that encourages their conversation to evolve.
A compelling essay by Gregory Crane, David Bamman and Alison Babeu of the Perseus Digital Library project at Tufts University, “ePhilology: When the Books Talk to Their Readers” (.pdf), uses the above scenario to illustrate the “optimal digital future” of philological research. The article, part of the forthcoming Blackwell Companion to Digital Literary Studies (Ray Siemens and Susan Schreibman, eds., 2007), explores topics such as text mining, “smart books” (documents that learn from each other), texts that adapt themselves to their users, and tools that help scholars identify trends in the secondary literature.

An example of personalization from the Perseus Digital Library: “Once a user has asked for information on four or five words in a three hundred word passage of Ovid, we can then predict two thirds of the subsequent words that will elicit queries. This recommender system is similar in principle to the systems which Amazon and other e-commerce systems use to show consumers new products based on the products purchased by people who also bought product X. The application, however, reduces the search space of a language passage, suggesting words for study rather than products for purchase.”
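To make the recommender idea concrete, here is a minimal sketch of an "readers who looked up X also looked up Y" approach applied to word lookups. It is not the Perseus implementation, which the essay does not spell out here; it is a plain co-occurrence count, and the function names (`build_cooccurrence`, `suggest_words`) and the sample lookup sessions are hypothetical, invented only for illustration.

```python
from collections import Counter
from itertools import permutations


def build_cooccurrence(sessions):
    """Count how often each pair of words was looked up in the same session."""
    co = {}
    for words in sessions:
        # Each ordered pair (a, b) adds one vote for "b goes with a".
        for a, b in permutations(set(words), 2):
            co.setdefault(a, Counter())[b] += 1
    return co


def suggest_words(co, queried, k=3):
    """Rank words the reader has not asked about by co-occurrence with those she has."""
    scores = Counter()
    for word in queried:
        for other, count in co.get(word, {}).items():
            if other not in queried:
                scores[other] += count
    # Highest score first; ties broken alphabetically so the result is stable.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:k]]


# Hypothetical lookup histories from earlier readers of the same passage.
past_sessions = [
    ["fert", "animus", "mutatas", "formas"],
    ["animus", "mutatas", "coeptis"],
    ["fert", "formas", "coeptis", "adspirate"],
    ["mutatas", "formas", "adspirate"],
]

cooccurrence = build_cooccurrence(past_sessions)
print(suggest_words(cooccurrence, {"fert", "animus"}))
# prints ['formas', 'mutatas', 'coeptis']
```

A real system would presumably weigh more than raw counts (word frequency, the reader's profile, the passage itself), but the shape is the same: a few lookups narrow down which of the remaining words are likely to prompt the next query.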
The authors identify six features that distinguish emerging digital resources:
1. they can be delivered to any point on the earth and at any time
2. they can be fundamentally hypertextual, supporting comprehensive links between assertions and their evidence
3. they dynamically recombine small, well defined units of information to serve particular people at particular times
4. they learn on their own and apply as many automated processes as possible, not only automatic indexing but morphological and syntactic analysis, named entity recognition, knowledge extraction, machine translation etc., with changes in automatically generated results tracked over time
5. they learn from their human readers and can make effective use of contributions, explicit and implicit, from a range of users in real time
6. they automatically adapt themselves to the general background and current purposes of their users.
Beyond a discussion of the latest digital technologies for textual analysis, the essay also contemplates larger issues related to ePhilology: the creation of new “spaces” that will advance the study of literature; systems that will inspire new forms of inquiry for readers; and unexpected discoveries that will arise from repurposing, sharing, and enriching the great texts of the ancients.

