Recogito is a software platform that facilitates annotation of text and images. Through both automatic annotation and manual annotation by users, the software links uploaded files to geographic data and facilitates the sharing and downloading of this data in various formats. The software is freely available for download through GitHub, and a version is also hosted online. In the online version, users have a private workspace as well as the ability to share documents among a group or publicly. Recogito was developed from 2013 to 2018 as part of the Pelagios network, a much wider project dedicated to creating gazetteers and tools for annotation, visualization, pedagogy, collaboration, and registering linked data.
Recogito facilitates research and teaching by making it quick and easy to annotate, link, visualize, and export place, person, and event data found in historical texts and maps. As test cases, I uploaded a .txt file of the second book of Procopius’ On Buildings, based on source text from the Loeb edition in English available through Lacus Curtius, and a .jpg image of Joan Blaeu’s 1659 Morea olim Peloponnesus map, sourced from Wikimedia Commons (see images below).
After uploading a text, the user can run an automatic annotation search. This feature automatically assigns annotations to words in the text that appear to be names of people or places. The geographic data can be linked to one or many online gazetteers, including Pleiades.
Once a text is annotated, it is easy to visualize the corresponding geographic data on a map inside the program or export it for use in other software. An entry can be annotated further with comments and tags, which are made available publicly. In addition to text annotation, it is possible to annotate place names on scanned maps and, again, link that data to online gazetteers. A unique feature allows the user to annotate over the text at any angle and link the word on the page to corresponding locations on a map.
Places on more recent maps can be linked consistently to their ancient equivalents. One can, for example, annotate instances of the term Morea with the entry for Peloponnese, Bodrum with the entry for Halicarnassus, or Iznik with the entry for Nicaea, and so on. Accordingly, Recogito makes it easy to keep track of a single place across multiple maps, even when those maps diverge in their spelling or terminology.
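The idea of tracking one place under several names can be sketched as a simple lookup table. This is only an illustration under stated assumptions, not a description of how Recogito stores its links: the table name, the function name, the map names, and the `example:` identifiers are all hypothetical placeholders rather than real gazetteer URIs.

```python
# Hypothetical lookup table: the "example:" identifiers are placeholders,
# not real gazetteer URIs.
CANONICAL_ENTRY = {
    "morea": "example:peloponnese",
    "peloponnesus": "example:peloponnese",
    "bodrum": "example:halicarnassus",
    "halicarnassus": "example:halicarnassus",
    "iznik": "example:nicaea",
    "nicaea": "example:nicaea",
}


def group_by_place(labels_by_map):
    """Group map labels by the canonical entry they are linked to."""
    grouped = {}
    for map_name, labels in labels_by_map.items():
        for label in labels:
            entry = CANONICAL_ENTRY.get(label.casefold())
            if entry is None:
                continue  # no entry found: the label stays unlinked in this sketch
            grouped.setdefault(entry, []).append((map_name, label))
    return grouped


# Two invented maps that name the same places differently.
maps = {
    "map_a": ["Morea", "Bodrum"],
    "map_b": ["Peloponnesus", "Iznik"],
}
grouped = group_by_place(maps)
print(grouped["example:peloponnese"])
```

Because both spellings resolve to the same identifier, the two labels end up in one group even though the maps never use the same word.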
Example of the Recogito text annotation interface. The word Amida has been automatically matched to its entry in the Pleiades database.
The user interface is well designed, and the software functions well. Using the online version, parsing a text of 10,700 words took less than one minute on a standard laptop. This frees the user from considerable legwork without requiring excessive computing power.
Naturally, given the variety of forms of ancient place names, there were errors: at least half of the automatically generated annotations in my sample text were inaccurate. The most common problems were failing to match a place because of a slightly different spelling and confusing the name of a place with that of a person. However, the software is designed so that, once the user annotates a word fully, that annotation can be extended to all other instances of the word. It is also important for users to check their own documents for formatting errors before uploading, since the text cannot be edited inside Recogito.
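A rough sketch of what extending one annotation to every other instance of a word involves, and of why clean source text matters, might look like the following. It assumes exact, whole-word matching, which is my simplification and not a documented account of Recogito's matching; the function name, the `example:amida` identifier, and the sample sentence (including the OCR-style misspelling "Arnida") are invented for illustration.

```python
import re


def propagate(text, word, uri):
    """Return (start, end, uri) for every whole-word, exact-spelling match."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b")
    return [(m.start(), m.end(), uri) for m in pattern.finditer(text)]


# Invented sample: "Arnida" imitates an OCR error ("rn" read for "m").
sample = "Amida was fortified. The walls of Amida were raised; Arnida is a misprint."
annotations = propagate(sample, "Amida", "example:amida")

for start, end, uri in annotations:
    print(start, end, sample[start:end], uri)
```

Under this assumption the two correctly spelled instances are picked up, while the misspelled one is silently skipped, which is the kind of error that has to be caught in the file before upload.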
Example of the Recogito map annotation interface that shows an annotation of the city of Athens on a 17th century map.
FEATURES AND FUNCTIONALITY
In addition to annotation itself, Recogito provides an extensive array of features, including annotation statistics, automatic saving, and sorting of annotations by type and verification status. Once a text or map has been annotated, it is also easy to export the data in a variety of formats, including CSV (spreadsheet) and KML (Google Earth, currently in beta).
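Once annotations are exported as CSV, they can be summarized outside the program with a few lines of code. The sketch below tallies annotations by type and verification status. The column names (`QUOTE`, `TYPE`, `VERIFICATION_STATUS`), the status values, and the sample rows are assumptions made for illustration, so they should be checked against the header of an actual export before use.

```python
import csv
import io
from collections import Counter

# Invented sample data; the column names and values are assumptions,
# not a documented export schema.
SAMPLE_CSV = """QUOTE,TYPE,VERIFICATION_STATUS
Amida,PLACE,VERIFIED
Dara,PLACE,UNVERIFIED
Justinian,PERSON,UNVERIFIED
"""


def summarize(csv_text):
    """Count annotations per (type, verification status) pair."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return Counter((row["TYPE"], row["VERIFICATION_STATUS"]) for row in rows)


counts = summarize(SAMPLE_CSV)
for (kind, status), n in sorted(counts.items()):
    print(kind, status, n)
```

To run this on a real export, the same function could be given the contents of the downloaded file in place of `SAMPLE_CSV`, with the column names adjusted to match.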
Recogito is designed for easy and efficient collaboration. For example, project leaders and their teams can set their own privacy protections and track changes by user. It also helps that basic training takes only a couple of hours.
The software’s user-friendly design also encourages individual use in research and pedagogy. The map visualization features combined with automatic annotation can be especially valuable in the classroom, since they allow teachers to make maps from texts in quite a short time. They also benefit researchers who wish to generate maps or compile databases of places in their texts but do not have training in other GIS software.
The geographic data comes from a variety of gazetteers, each of which has its own area of specialty. It is possible to choose the place data from any one (or several) of the gazetteers, as long as that database includes an entry for the location. Pleiades and Digital Atlas of the Roman Empire (DARE) data is extremely well documented and detailed for ancient places in the Mediterranean, but the variety of gazetteers makes Recogito valuable to scholars working on a range of other areas, including China (CHGIS), Early Modern London (MoEML), and Spanish America (HGIS de las Indias). Recogito also links to the gazetteer of place names in Hebrew (Kima), the various databases of the Digitizing Patterns of Power project (DPP Places), and the more modern set of place names available in GeoNames.
Example of the mapping feature, showing places mentioned in the sample text (On Buildings, Book 2). Automatic annotation and editing can be carried out very quickly to produce a visualization, which makes the tool especially appealing for use in the classroom or for researchers without extensive training in GIS programs.
There are some caveats to using the software. We do not know the location of many ancient places, and thus not every place can be assigned a specific location. In my case study, many forts and settlements had no corresponding entry and had to be left unidentified, though they could still be tagged. To address this and other problematic cases, the user can flag a place that is not found to the broader community.
Geographic features that are not cities or discrete sites (e.g. rivers or mountain ranges) are difficult to annotate and map. Scale is generally focused on settlements that are the size of cities and towns, rather than places within a city or landscape. Person data is not presently linked to an online encyclopedia or prosopography.
The online platform provides 200 MB of space, which is enough for a handful of average-sized images. Projects that require more images or significantly larger images can store them on an IIIF (International Image Interoperability Framework) repository. Recogito makes this relatively easy to set up, and it is thus unlikely to present an obstacle to most teams. Overall, the limits of Recogito are the limits of the data available more broadly, rather than of the program itself. Rather than reasons not to use the software, in my view these caveats only highlight how powerful and useful the core Recogito software is. The more work completed on geographic databases, the more powerful Recogito will become.
In sum, this program is well designed and easy to use [see tutorial]. It makes annotating texts for people, places, and events a simple task. It provides a considerable range of features and tools for working with texts and images. The user interface is intuitive and one of the most user-friendly I have seen. The project website showcases a number of successful projects that have made use of the Recogito tools. On the whole, it is clear that the software has been thoughtfully and painstakingly developed to facilitate individual and collaborative projects and to make a powerful yet accessible toolkit for researchers and teachers.
DESCRIPTION: Text and image annotation; linking of annotations to gazetteers
NAMES (alphabetical order): Elton Barker, Leif Isaksen, Rebecca Kahn, Andrew Lindley, Rainer Simon, Pau de Soto, Valeria Vitale
PLACE: Austrian Institute of Technology
COLLECTION TITLE: Pelagios Network
DATE CREATED: 2013–2018
DATE ACCESSED: July and August 2019
RIGHTS: Software available under Apache 2 license; copyright law applies for sharing of documents; users consent to have all of their annotations released publicly under the Creative Commons CC0 ('CC Zero') 1.0 Universal license
CLASSIFICATION: databases, images, linked open data, mapping, pedagogy, texts.