Review: Guidelines for Encoding Critical Editions for the Library of Digital Latin Texts

Donald Mastronarde and Richard J. Tarrant

December 4, 2017

Editor’s note: The guidelines under review here, while publicly available for comment, represent a pre-release version.

It has been a long-deferred hope for classicists that there would one day be a comprehensive databank of reliable Latin texts providing ease of consultation and searching of Latin literature. Various imperfect substitutes have, of course, existed, such as the PHI database of Latin texts (reviewed here), The Latin Library, Brepols’s Library of Latin Texts, the Perseus Digital Library, De Gruyter’s Bibliotheca Teubneriana Latina Online, and (the most recent addition) the Loeb Digital Library. More recently, those interested in taking advantage of the latest tools and techniques of digital humanities and “big data” have desiderated more than just a top-notch collection of texts: they also need what may be called actionable texts for use in digital research. To be actionable for such a purpose, the texts must be

of an adequate standard of scholarly accuracy
equipped with an apparatus criticus
in a format suited to the strengths and limitations of digital tools (which are simultaneously extremely powerful and, compared to the human mind’s ability to use language and interpret conventional symbols, rather dumb)
free of the traditional restrictions of licensed intellectual property

Combining these requirements is no easy matter.

The Digital Latin Library (DLL) project is based at the University of Oklahoma and has been developed under the leadership of Samuel Huskey. Although it has other components (the project intends to curate versions of some of the open-access Latin texts already available on the internet), the main thrust is to provide the infrastructure for a new collection of texts, the Library of Digital Latin Texts (LDLT), that will serve the future of both traditional and digital scholarship. The end of the development phase, funded by a grant from the Andrew W. Mellon Foundation, is now near, and the three professional societies that have sponsored and advised the project will soon be announcing procedures for interested scholars to propose new editions for the LDLT, along with plans for peer review of proposals and final submissions. The Latin texts of antiquity will be handled by the SCS (Division of Publications and Research), while Medieval Latin texts will fall to the Medieval Academy of America, and neo-Latin ones to the Renaissance Society of America.

The Guidelines under review here are a pre-release version (72479d0) accessible in August and September 2017, and revision is an ongoing process. Samuel Huskey and Hugh Cayless wrote them with the help and contributions of several other scholars, who are credited in the Acknowledgments (section 1). The key portions are sections 2–10, and these are the sections we studied closely for this review. The final section (11) is a technical Schema description, that is, reference documentation. This was very extensive when viewed earlier in 2017, but since the update of November 1, 2017, it is much shorter.

The LDLT guidelines are based on a customized subset of the immense array of XML text markup tags and attributes developed over many years by the Text Encoding Initiative (TEI). Huskey and Cayless also acknowledge a debt to the Epidoc “dialect” of the TEI guidelines designed for epigraphy and papyrology. The goal has been to accommodate all the usual elements of critical editions, with ample flexibility for individual editors to include or omit, or expand or contract, different aspects. The full potential of a digital edition depends on the precise tagging of elements (and, of course, a platform like LDLT is designed to make proper use of that precision).

In the case of manuscript witnesses, scholars’ names, and bibliographic items, this enables a digital display in which the user can move effortlessly from a reference or siglum in the apparatus to the bibliographic listing that elucidates it. When it comes to the different forms of editorial brackets used in textual editions, or the compressed language and conventions used for giving variants or conjectures in an apparatus criticus, an experienced human reader of critical editions can recognize (or learn from the edition’s preface, if it is properly informative) how to interpret them. A computer requires explicit tagging, which in some cases can be quite complex. The payoff of the tagging comes in the possibility of offering the user a manipulable text: a different reading can be substituted for the one that the editor has preferred, and different users can adjust the display to the level of detail and complexity that they desire for different purposes.
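The idea of a manipulable text can be made concrete with a small sketch. It assumes a TEI-style apparatus entry built from the standard <app>, <lem>, and <rdg> elements; the one-line fragment, the sigla A and B, the variant reading, and the render function are all invented for illustration and do not reproduce the LDLT platform or the details of its schema (namespaces are omitted for brevity).

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the sigla A and B and the variant reading are
# invented for illustration and come from no real edition.
LINE = (
    '<l n="1">arma <app>'
    '<lem wit="#A">uirumque</lem>'
    '<rdg wit="#B">uirosque</rdg>'
    '</app> cano</l>'
)


def render(line_xml, prefer=None):
    """Return the line as plain text.

    Where an <app> has a reading attested by the witness `prefer`, that
    reading is shown; otherwise the editor's lemma is shown.
    """
    line = ET.fromstring(line_xml)
    parts = [line.text or ""]
    for child in line:
        if child.tag == "app":
            chosen = child.find("lem")  # default: the editor's preferred text
            for rdg in child.findall("rdg"):
                if prefer and prefer in (rdg.get("wit") or "").split():
                    chosen = rdg
            parts.append("".join(chosen.itertext()))
        else:
            parts.append("".join(child.itertext()))
        parts.append(child.tail or "")  # text following the element
    return "".join(parts)


print(render(LINE))        # the text as the editor constituted it
print(render(LINE, "#B"))  # the same line with witness B's reading swapped in
```

Because the lemma and each variant are tagged separately, the same encoded line can yield either display without any change to the underlying file.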

We examined the Guidelines from the point of view of textual editors and connoisseurs of critical editions. One of us had previous knowledge of TEI XML and one did not. In our judgment, the Guidelines in general do a fine job of defining the different elements in a logical sequence, explaining the rationale of the recommended structure, suggesting best practices, and offering at intervals illustrative examples[1] of encoded segments. These examples are often followed by the human-readable rendering of that segment, very similar to what readers of printed critical editions expect to see.

A good example of clear and effective presentation can be seen in section 8.11.6, concerning the reporting of corrections in the manuscripts. This section lays out the complexity of the concept of correction, and explains the logical categories (and associated encodings) under which aspects of an altered reading may be considered. Thanks to the careful thought that has gone into TEI, Epidoc, and the customization for LDLT, the Guidelines succeed in covering a wide variety of needs for editors of various genres of texts, and if future editors come up with additional needs for unusual circumstances, workarounds or extensions can be considered. For many details, the Guidelines are permissive rather than prescriptive, making it possible for some editors to enable fewer features and others to enable more. Enabling more may take more work, but also allows fuller exploitation of the special capabilities of the online display platform.

We offer here a few queries or suggestions for clarification that occurred to us in reading through the document.

3.3: “It is recommended to compile the bibliography first, to facilitate linking to the individual entries as they are mentioned in the preface.” In practice, although a good deal of the main bibliography may be compiled at the outset, it is almost inevitable that the bibliography will keep growing as one works and revises. The Guidelines strongly encourage the development of a Zotero bibliography corresponding to the edition, and promise to help editors in setting one up.

4.1: It seems odd to include manuscripts under the Bibliography, since in most editions they are part of the Preface. On the positive side, it is an excellent idea to list scholars who are cited in the apparatus but who do not have a bibliographical entry (e.g., scholars who have contributed conjectures by correspondence with the editor).

4.1 and 4.5 (Sources): It would be helpful to be explicit about how references to other ancient texts (quoted in the edited text, or cited in the apparatus or elsewhere for whatever reason) are to be treated. Are they simply treated as books, since the category of editions appears to apply only to previous editions of the text being edited?

4.4.1: In encoding the list of manuscript witnesses and their sigla, the example shows an explicit equal-sign between the siglum abbreviation and the explicit name (<abbr type="siglum">N</abbr> = Codex Neapolitanus V A 8, saec. XV</witness>). If the explicit name were also tagged (without the equal-sign), there would be some flexibility in how the siglum and name were displayed, rather than requiring the single format “N = Codex Neapolitanus V A 8.”
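To illustrate the kind of flexibility we have in mind, here is a minimal sketch. The <name> wrapper around the manuscript's designation is our own hypothetical choice, not an element prescribed by the Guidelines for this purpose, and the display function and its templates are likewise invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical encoding: the <name> wrapper stands for "the explicit name
# is also tagged" and is not taken from the Guidelines.
WITNESS = (
    '<witness>'
    '<abbr type="siglum">N</abbr>'
    '<name>Codex Neapolitanus V A 8, saec. XV</name>'
    '</witness>'
)


def display(witness_xml, template="{siglum} = {name}"):
    """Format a witness entry according to a display template."""
    witness = ET.fromstring(witness_xml)
    siglum = witness.findtext("abbr[@type='siglum']")
    name = witness.findtext("name")
    return template.format(siglum=siglum, name=name)


print(display(WITNESS))                        # the traditional "N = ..." format
print(display(WITNESS, "{name} ({siglum})"))   # an alternative layout
```

With the equal-sign left out of the encoded text, the separator becomes a decision for the display layer rather than something fixed in the file.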

4.4.1: The elements provided for manuscript description are numerous. There are, however, matters one would expect an editor to mention, and these perhaps would be included in untagged textual elements: e.g., whether a given manuscript has been fully collated or only checked in selected places, and whether the manuscript has been consulted in situ, on microfilm, or in a digital reproduction. Adding some statement about this to the Guidelines would be useful. It might also be helpful to indicate whether an editor may choose to describe some manuscripts in greater detail than others, since many editors give full descriptions of the manuscripts they consider most important and provide less information about other manuscripts they have consulted.

4.4.1 (Manuscript Description): With a view to the future of the availability of manuscript images on the internet (and the exploitation of the recently-developed International Image Interoperability Framework), it would be desirable to have as part of the manuscript description an optional pointer to the location of images of a manuscript. Note that such a pointer is provided in the reference for a book in the bibliography if it has a URL.

4.5.1 (Bibliography of Editions): “To make the most of the functionality supported by the LDLT and to remain true to its data model, previous editions should be classified in one of two categories: early editions based on a single manuscript (witness), and modern critical editions based on more than one external source (source).” A little more elaboration of the definitions might help here. There appears to be an inconsistency in the treatment prescribed for incunabula. In section 4.2 we are told that “it is up to the editor to determine whether an incunabulum is a witness to a single manuscript,” and therefore should be treated as a witness. Yet 4.5.1 states that, “since it is often the case that an early edition is a witness to a single manuscript, early editions should be encoded in the bibliography with <witness>.” The earlier statement seems preferable as a general policy. How often, after all, is it known that the earliest edition is based on a single manuscript source? (Studies of some surviving manuscript sheets that were prepared for the typesetter have shown that a base copy was often modified by collation with another witness or by scholarly interventions like adjusting spelling or changing the text by conjecture.) In addition, this initial sentence draws a contrast with “modern critical editions,” but the examples below it (4.5.1.1-2) show editions as early as 1519 listed under “Modern Critical Editions.” Thus “modern” appears to mean “from the beginning of printed books onward” and its juxtaposition with “critical” does not imply modern stemmatic method.

7 (Parallel Passages): The encoding allows for a register of parallel passages for those editors who wish to include such a collection. First, we note that the word testimonium/a does not occur in the Guidelines, but apparently this is where they would be encoded, if a separate apparatus testimoniorum is desired (of course, some editors cite testimonia only in the main apparatus and only when they offer additional evidence for a reading mentioned in the apparatus). We would prefer that the Guidelines mention that listing merely parallel passages is not a normal or expected component of a critical edition. In that respect, Gelsomino’s edition of Vibius Sequester is an unhelpful model. For a technical text of limited scope, it may be feasible and useful to collect parallels, but that is not the case for most Latin texts. In addition, if an editor is going to offer a large number of items in such a register, it might be helpful to be able to discriminate between different types (direct quotation, paraphrase, allusion, shared subject matter that may have no close genetic relationship to the author’s passage) in the encoding (compare 8.13 on optionally tagging variant readings for analysis).

8 (Apparatus Criticus): The examples of bits of apparatus criticus indicate that the expected language is the traditional Latin, but this is not stated explicitly, so perhaps this will be up to future editors. Various edition prefaces in the OCT series have appeared in English, and the advisers of the Teubner series have considered the matter, but that series maintains the instruction that a preface should be in Latin.[2] So far there has not been a corresponding move in printed editions toward replacing the traditional Latin style of the apparatus.

8.11.4.1 (Transposition): There are two different ways of encoding transpositions: (a) with semantic markup, which allows users of the LDLT to swap the transposition in and out of the displayed text, or (b) with a prose description, which does not allow such functionality for the user. This raises a larger question: are there specific criteria that determine which variants (including conjectures) are eligible for swapping? Or (perhaps the better way of posing the question) is it possible to specify what types of variant are not eligible? Since the potential for swapping is one of the most distinctive features of the digital edition (and the one that most troubles traditionalists), greater clarity on this point would be helpful.

The Guidelines document and the associated platform represent important scholarly work. The question to be answered now, as the LDLT officially opens for business, is who will edit texts for the collection, and how slowly or quickly will it be populated with new editions. Experienced editors may be daunted by the apparent complexity, and in particular see the encoding as adding substantial overhead to an already laborious task. There is, however, good news even for those who are reluctant on that basis. In the last months, as we have learned from Sam Huskey, scripts have been developed that will allow the automation of a great deal of the XML encoding. In essence, DLL will provide exact guidance to the editor for a specific format for entering the text in a human-readable text document and for entering other material (like the apparatus) in spreadsheets, and the scripts will be able to process those files, turning them into the LDLT-compliant XML. So the overhead may be much less than feared, and worth the added benefits of a well-formed digital edition.
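We have not seen the DLL scripts themselves, but the general approach can be sketched as follows. Everything here is an assumption made for illustration: the column names, the sample row, and the rows_to_apps function are invented, and the output is a bare TEI-style <app> entry rather than complete LDLT-compliant XML.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Stand-in for an apparatus spreadsheet exported as CSV. The column names
# and the sample row are hypothetical, not the format the DLL prescribes.
SHEET = """line,lemma,lemma_wit,reading,reading_wit
1,uirumque,A,uirosque,B
"""


def rows_to_apps(csv_text):
    """Turn each spreadsheet row into a serialized <app> element."""
    apps = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        app = ET.Element("app", {"loc": row["line"]})
        lem = ET.SubElement(app, "lem", {"wit": "#" + row["lemma_wit"]})
        lem.text = row["lemma"]
        rdg = ET.SubElement(app, "rdg", {"wit": "#" + row["reading_wit"]})
        rdg.text = row["reading"]
        apps.append(ET.tostring(app, encoding="unicode"))
    return apps


for entry in rows_to_apps(SHEET):
    print(entry)
```

The editor's work in such a scheme is confined to filling in rows in a familiar tool; the angle brackets are generated mechanically.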

(Header Image: Marble left hand holding a scroll. Roman. 1st or 2nd century A.D. Metropolitan Museum of Art 21.88.10. Licensed under CC0 1.0. Public Domain.)

Metadata

Title: Guidelines for Encoding Critical Editions for the Library of Digital Latin Texts

Description: Guidelines for creation of a critical edition for the Digital Latin Library's Library of Digital Latin Texts, based on the standard established by the Text Encoding Initiative (TEI).

URL: https://digitallatin.github.io/guidelines/LDLT-Guidelines.html

Name: Samuel J. Huskey (University of Oklahoma) and Hugh Cayless (Duke Collaboratory for Classical Computing)

Publisher: Digital Latin Library

Place: University of Oklahoma, Norman, Oklahoma

Date Created: 2015–2017

Date Accessed: August and September, 2017

Availability: Free

Rights: Creative Commons Attribution 4.0

Classification: databases, digitization, Latin, linked open data, manuscripts, texts

[1] In some text boxes containing examples, the lines are not correctly wrapped, which forces the reader to scroll horizontally (sometimes a lot). We are told this is a bug in the CSS for the documentation module, and there is hope that it will be fixed.

[2] Note, however, one exceptional case of an English preface in the 2017 Teubner edition of Augustine, Contra Academicos, De beata vita, De ordine, edited by Therese Fuhrer and Simone Adam.

Authors

Donald J. Mastronarde is Professor of the Graduate School, and Melpomene Professor of Classics Emeritus, at the University of California, Berkeley and was founding Director of the Center for the Tebtunis Papyri. He is currently working on a digital edition of the scholia on Euripides. He can be reached via email at djmastronarde@berkeley.edu.
