Learning from Git: Critical Editions as Version Control

Peter Heslin

University of Durham

Discussions of the digital critical edition of the future often promise the ability to swap alternative readings out of the apparatus and to view them as part of the text; the website of the Digital Latin Library lists this as one of its aims. Another feature often promised is the ability to encode all transmitted variants, not just a select few.  But are these two aims compatible in practical terms?

Encoding a complete apparatus of a classical text reveals the limits of the TEI’s model when applied to ancient texts.  That model works well for ancient documents, such as inscriptions and papyri (i.e. EpiDoc), because, like modern literary texts, these objects exhibit a relatively limited number of errors and alternative readings. But for a text transmitted via a diverse manuscript tradition, the resulting XML is so complex as to be unfeasible.

I will suggest a different data model for such complex traditions, based upon a metaphor derived from source code version control.  Version control used to be based upon a conceptual model consisting of a current, canonical version of a file, plus its history in the form of differences from that version, attributed to different hands.  This is a model very similar to the traditional print edition of a classical text with apparatus.  But this model does not scale well.  In particular, the problem of merge conflicts in source code is analogous to the problem of non-nested, overlapping textual variants that are awkward to encode in any dialect of XML.

The solution to the problem for source code was devised by Linus Torvalds, the creator of Git, who turned the fundamental data representation of version control upside down.  Instead of storing a canonical version and a complex encoding of variants against that version, Git just stores multiple, slightly different, copies of the file.  The problem of representing the differences between different versions is simply deferred until the user asks for such a representation.
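The idea of storing whole copies and deferring the comparison can be sketched in a few lines of Python. This is only an illustration of the principle, not of Git's actual storage format. The sigla, the sample readings (invented for the example), and the function name `diff_on_demand` are all assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical witnesses: each siglum maps to a complete copy of the line.
# The variant reading in "B" is invented purely for illustration.
witnesses = {
    "A": "arma uirumque cano Troiae qui primus ab oris",
    "B": "arma uirumque cano Troiae qui primis ab oris",
}


def diff_on_demand(store, left, right):
    """Compute word-level differences between two stored copies.

    Nothing about the differences is stored in advance; they are
    derived from the full texts only when a reader asks for them.
    """
    a = store[left].split()
    b = store[right].split()
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (" ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"  # keep only the places where the copies disagree
    ]


print(diff_on_demand(witnesses, "A", "B"))
```

The stored data contain no markup for variants at all; the pair of differing readings exists only in the result of the comparison.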

This suggests a model for the critical edition whereby the text is represented as a set of collations of MSS and prior editions.  The apparatus would not be created by the editor; it would instead be generated on demand from those collations. Multiple interfaces would be possible, permitting the user to compare a selection of witnesses.  Such an edition would have the (salutary?) effect of destabilizing the authority of the editor, but are we ready to abandon that final bastion of critical positivism?
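One way to picture such an interface is a function that builds an apparatus for whichever base text and selection of witnesses the user chooses. The sketch below is a simplification under stated assumptions: the witnesses are taken to be already aligned word by word, and the sigla, the invented readings, and the name `apparatus` are hypothetical.

```python
def apparatus(collation, base, selected):
    """Build apparatus entries for `selected` witnesses against `base`.

    `collation` maps each siglum to a list of words. It is assumed that
    the lists are already aligned, so that position i refers to the same
    place in the text in every witness.
    """
    entries = []
    for i, lemma in enumerate(collation[base]):
        variants = {}
        for siglum in selected:
            reading = collation[siglum][i]
            if reading != lemma:
                # Group the sigla that share the same variant reading.
                variants.setdefault(reading, []).append(siglum)
        if variants:
            entries.append((i, lemma, variants))
    return entries


# Hypothetical, pre-aligned collation with invented variants.
collation = {
    "A": ["qui", "primus", "ab", "oris"],
    "B": ["qui", "primis", "ab", "oris"],
    "C": ["qui", "primus", "ab", "horis"],
}

# The same data yield a different apparatus for a different base text.
print(apparatus(collation, "A", ["B", "C"]))
print(apparatus(collation, "B", ["A", "C"]))
```

Choosing a different base witness is the equivalent of swapping a reading out of the apparatus and into the text: no reading is privileged in the stored data.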

Session/Panel Title

Digital Textual Editions and Corpora

© 2020, Society for Classical Studies