Opening up the Ancient Mediterranean World (through Unicode and Fonts)

Deborah (Debbie) W Anderson

UC Berkeley

Character encoding (via the Unicode Standard) and standardized fonts are often viewed as unimportant by those studying texts from the ancient Mediterranean world. Yet as text corpora and publications increasingly move online, characters not in the Unicode Standard cannot be represented in off-the-shelf software, searched, or archived in a stable, internationally recognized format. While Unicode now covers over 136,000 characters, no new proposals have been put forward for the Classical languages since 2014, though it is likely that some Classical Greek and Latin characters are still missing. For the recent script additions outside of Greek and Latin (such as Linear A, Linear B, Cypriot, Lycian, Lydian, Carian, and Anatolian Hieroglyphs), only one error has ever been reported to the Unicode Consortium for any of the 1,291 characters.
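
As a rough illustration of what "not in the Unicode Standard" means in practice, the following Python sketch flags characters in a transcription that have no standard encoding. It relies only on the standard-library `unicodedata` module; the function name `unencoded_report` and the sample string are hypothetical, chosen for illustration, and the result depends on the Unicode version bundled with the Python interpreter (see `unicodedata.unidata_version`).

```python
import unicodedata


def unencoded_report(text):
    """List characters that have no standard Unicode identity.

    Category "Cn" marks code points that are unassigned in the bundled
    Unicode database; "Co" marks Private Use code points, whose meaning
    is defined only by private agreement (for example, by one font).
    """
    problems = []
    for ch in text:
        category = unicodedata.category(ch)
        if category == "Cn":
            status = "unassigned"
        elif category == "Co":
            status = "private use"
        else:
            continue  # the character is encoded in the standard
        problems.append((ch, f"U+{ord(ch):04X}", status))
    return problems


# Hypothetical sample: a Linear B syllable, a Greek alpha,
# a Private Use code point, and an unassigned code point.
sample = "\U00010000\u03b1\ue000\u0378"
for _, label, status in unencoded_report(sample):
    print(label, status)
```

A check of this kind only shows that a code point is unassigned or private; it cannot tell which missing character an editor intended, which is why scholarly input on proposals is still required.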

Hence, input from students and scholars on characters in Unicode is needed: it will enable the representation of texts from the ancient Mediterranean in fonts and software, since font providers and software developers rely on Unicode. Support in fonts is particularly important, as publishers such as Oxford University Press now specify that authors should submit their manuscripts in a Unicode font (Oxford University Press 2018).

The first portion of this talk will address the lack of information about Unicode: it will provide an update on the current status of historic scripts of the Mediterranean in Unicode, outline how to report an error in Unicode, and describe how to propose a new character.

The final portion of the talk will present a little-known problem that affects the work of Classicists and others working on texts of the ancient world: how to handle the variants of specific letters in fonts. As a case study, the handling of the non-Latin historic alphabets of Italy and their letter variants will be discussed. In Unicode, the related alphabets of Italy, such as Etruscan, Faliscan, Oscan, and Raetic, are unified in a single set of “Old Italic” characters. Separate fonts were originally recommended as the way to handle the different alphabets (i.e., one for Etruscan, one for Faliscan, etc.) (Unicode Consortium 2017: 349), although this approach has not been adopted by the New Athena Unicode font or most other fonts. A different approach has been incorporated in the recent Italica Vetus font by David Perry, a free font funded through an NEH grant (Perry 2017). The Italica Vetus font provides access to the various letter variants for all the alphabets in one font by means of a specific OpenType font feature. Since support for OpenType features is restricted to certain software, the font also provides an alternative: access to the variants through the Private Use Area of Unicode. A similar approach has been used in the Athena Ruby Inscription Font by Dumbarton Oaks (Dumbarton Oaks 2017).
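
The consequence of this unification for plain text can be sketched in a few lines of Python. The sketch assumes only the published range of the Old Italic block (U+10300 to U+1032F) and the standard-library `unicodedata` module; the helper names and the sample string are hypothetical, and U+E000 merely stands in for a font-specific variant, since the actual Private Use assignments of the Italica Vetus font are not given here.

```python
import unicodedata

# Old Italic block: U+10300..U+1032F (range() excludes the upper bound).
OLD_ITALIC = range(0x10300, 0x10330)


def in_old_italic(ch):
    """True if the character lies in the unified Old Italic block."""
    return ord(ch) in OLD_ITALIC


def private_use_points(text):
    """Return labels for Private Use code points found in the text."""
    return [f"U+{ord(ch):04X}" for ch in text
            if unicodedata.category(ch) == "Co"]


# The same code point serves Etruscan, Faliscan, Oscan, and Raetic alike:
# the character name identifies the letter, not the alphabet.
letter_a = "\U00010300"
print(unicodedata.name(letter_a), in_old_italic(letter_a))

# Hypothetical transcription: one standard letter followed by a
# placeholder Private Use code point standing in for a letter variant.
transcription = letter_a + "\ue000"
print(private_use_points(transcription))
```

Because the encoded text carries no information about which alphabet or variant shape is meant, that distinction has to come from the font, from an OpenType feature, or from markup. A variant stored as a Private Use code point displays as intended only with the font that defines it, so a check like `private_use_points` can help locate such characters before a text is exchanged or archived.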

In sum, the talk will demonstrate why a basic understanding of character encoding and fonts is important, and how it will open up the ancient Mediterranean world by making texts accessible, searchable, and archivable for the long term.

