The Ancient Greek Dependency Treebank

Francesco Mambrini

The aim of this presentation is to introduce the audience to the practice of linguistic annotation by focusing on the Ancient Greek Dependency Treebank (AGDT), promoted by the Perseus Project [2]. Treebanks are text corpora in which each word is annotated with information on its morphology and syntactic relations. The recent appearance of a syntactically annotated corpus of Greek and Latin texts is a unique opportunity for scholars. On the one hand, some of the most sophisticated technologies for corpus-based research can be made available to the community of classicists. On the other hand, the expertise of philologists can be put to use for the task of word-by-word annotation. With treebanks, new editors will be able to encode their interpretation of ancient texts in a machine-actionable format, producing texts that can be searched for specific syntactic phenomena. Although the literature on treebanks is vast (see [1, 4] for reference), the potential relation between annotated corpora and critical editions is still virtually unexplored (but see [2, 3]).
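To make the notion of a machine-actionable, searchable annotation more concrete, the following minimal sketch represents one annotated sentence as a list of tokens, each pointing to its syntactic head, and runs a simple query over it. Everything in it is an assumption made for illustration: the English toy sentence, the field names, the part-of-speech values, the relation labels, and the helper functions `dependents` and `find_pairs` are hypothetical and do not reproduce the actual AGDT data format or tagset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """One annotated word; all field names are illustrative assumptions."""
    id: int        # 1-based position in the sentence
    form: str      # the word as it appears in the text
    lemma: str     # dictionary form
    pos: str       # part of speech (hypothetical values)
    head: int      # id of the governing token, 0 for the root
    relation: str  # syntactic label (hypothetical values)


# Toy English sentence standing in for an annotated line of text.
SENTENCE = [
    Token(1, "the", "the", "article", 2, "ATR"),
    Token(2, "poet", "poet", "noun", 3, "SBJ"),
    Token(3, "praises", "praise", "verb", 0, "PRED"),
    Token(4, "the", "the", "article", 5, "ATR"),
    Token(5, "city", "city", "noun", 3, "OBJ"),
]


def dependents(sentence, head_id, relation=None):
    """Return the tokens governed by head_id, optionally filtered by label."""
    return [
        t for t in sentence
        if t.head == head_id and (relation is None or t.relation == relation)
    ]


def find_pairs(sentence, head_pos, relation):
    """Return (head form, dependent form) pairs matching a syntactic pattern."""
    by_id = {t.id: t for t in sentence}
    return [
        (by_id[t.head].form, t.form)
        for t in sentence
        if t.head in by_id            # skips the root, whose head is 0
        and by_id[t.head].pos == head_pos
        and t.relation == relation
    ]


if __name__ == "__main__":
    # Search for every object governed by a verb.
    print(find_pairs(SENTENCE, "verb", "OBJ"))
```

Because every token records its head explicitly, an editor's decision about how a word attaches becomes part of the data, and a query such as "all objects of verbs" reduces to a filter over head and label.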

The work of annotation will be illustrated by means of a concrete example. We will take Sophocles, Trachiniae 962-3 as a case study. This 11-word sentence is seemingly harmless, yet an annotator is immediately confronted with a series of questions, for some of which different solutions have already been debated in previous scholarship. The issues that an annotator will face range from fine-grained grammatical details to fundamental problems, such as whether a model of annotation designed for spoken corpora, where utterances are often interrupted and restarted, might be more useful for a performance-oriented genre like Greek tragedy than a paradigm built around strict syntactic coherence.

This exposition will thus show that the work of treebanking has two great advantages to offer: 1. it forces annotators to engage with centuries of classical scholarship on a word-by-word basis and on a wide range of questions (from philology to literary criticism); 2. it can challenge them to rethink the model of linguistic interpretation of literary texts. Most importantly, syntactic annotation is, by nature, a collaborative enterprise that involves expertise at different levels. From scholars interested in critical editions and linguistic interpretation to students working to improve their linguistic skills by tackling the ancient texts in their original form, all may be involved in the task.

