OEDUc: EDH and Pelagios location disambiguation Working Group

Wednesday, July 5th, 2017

From the beginning of the unconference, a number of participants seemed to share an interest in linked open geodata. Moreover, attention to gazetteers and alignment was among the desiderata for the event expressed by the EDH developers. So, in the second part of the unconference, we had a look at what sort of geographic information can be found in the EDH and what could be added.

The discussion, of course, involved Pelagios and Pleiades and their different but related roles in establishing links between sources of geographical information. EDH is already one of the contributors to the Pelagios LOD ecosystem. Because it uses Pleiades IDs to identify places, it was relatively easy for the EDH to make its database compatible with Pelagios and discoverable on Peripleo, Pelagios’s search and visualisation engine.

However, looking into the data available for download, we focused on a couple of things. One is that each of the epigraphic texts in the EDH has, of course, a unique identifier (EDH text IDs). The other is that each of the places mentioned also has a unique identifier (EDH geo IDs), besides the Pleiades ID. As one can imagine, the relationships between texts and places can be one to one or one to many in either direction (a place can be related to more than one text, and a text can be related to more than one place). All places mentioned in the EDH database have an EDH geo ID, and this information becomes especially relevant for those places that do not already have an ID in Pleiades or GeoNames. From this perspective, EDH geo IDs fill the gaps left by the other two gazetteers and meet the specific needs of the EDH.

Exploring Peripleo to see what information from the EDH can be found in it and how it gets visualised, we noticed that only the information about the texts appears as resources (identified by the diamond icon), while the EDH geo IDs do not show up as a gazetteer-like reference, as happens for other databases such as Trismegistos or Vici.

So we decided to do a little work on the EDH geo IDs, more specifically:

  1. To extract them and treat them as a small, internal gazetteer that could be contributed to Pelagios. Such a feature wouldn’t represent a substantial change in the way EDH is used, or in how the data are found in Peripleo, but we thought it could improve the visibility of the EDH in the Pelagios panorama and, possibly, act as an intermediate step for the matching of different gazetteers that focus on the ancient world.
  2. The idea of using the EDH geo IDs as bridges sounded especially interesting when thinking of the possible interaction with the Trismegistos database, so we wondered whether a closer collaboration between the two projects could benefit them both. Trismegistos, in fact, is another project with substantial geographic information: about 50,000 place-names mapped against Pleiades, Wikipedia and GeoNames. Since the last Linked Past conference, they have tried to align their place-names with Pelagios, but the operation was successful for only 10,000 of them. We believe that enhancing the links between Trismegistos and EDH could make them better connected to each other and both more effectively present in the LOD ecosystem around the ancient world.

With these two objectives in mind, we downloaded the GeoJSON dump from the EDH website and extracted the text IDs, the geo IDs, and their relationships. Once the lists (which can be found in the GitHub repository) had been created, it became relatively straightforward to try to match the EDH geo IDs with the Trismegistos GeoIDs. In this way, through the intermediate step of the geographical relationships between text IDs and geo IDs in EDH, Trismegistos also gains a better and more informative connection with the EDH texts.
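The extraction step can be illustrated with a minimal Python sketch. It is not the script used at the unconference: the property names "text_id" and "geo_id", the sample values, and the function name extract_relations are all assumptions made for illustration, and the real EDH dump may name or nest these fields differently.

```python
import json
from collections import defaultdict

# Hypothetical sample shaped like a GeoJSON FeatureCollection.
# "text_id" and "geo_id" are assumed property names, not the
# documented EDH field names.
SAMPLE = """
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature", "geometry": null,
     "properties": {"text_id": "T1", "geo_id": "G1"}},
    {"type": "Feature", "geometry": null,
     "properties": {"text_id": "T2", "geo_id": "G1"}},
    {"type": "Feature", "geometry": null,
     "properties": {"text_id": "T2", "geo_id": "G2"}},
    {"type": "Feature", "geometry": null,
     "properties": {"text_id": "T3"}}
  ]
}
"""


def extract_relations(geojson_text):
    """Collect text/place relationships in both directions."""
    data = json.loads(geojson_text)
    texts_by_place = defaultdict(set)
    places_by_text = defaultdict(set)
    for feature in data.get("features", []):
        props = feature.get("properties") or {}
        text_id = props.get("text_id")
        geo_id = props.get("geo_id")
        # Skip features that lack either identifier.
        if text_id is None or geo_id is None:
            continue
        texts_by_place[geo_id].add(text_id)
        places_by_text[text_id].add(geo_id)
    return dict(texts_by_place), dict(places_by_text)


texts_by_place, places_by_text = extract_relations(SAMPLE)
```

Keeping both lookup directions makes it cheap to go from a place to its texts and from a text to its places, which is what lets a place-level match carry over to the related texts.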

This first, quick attempt at aligning geodata through their common references might help test how good the automatic matches are and prompt some thinking about how to troubleshoot mismatches and other errors. This closer look at geographical information also brought up a small bug in the EDH interface: in the internal EDH search, when there is a connection to a place that does not have a Pleiades ID, the website treats it as an error instead of, for example, referring to the internal EDH geo IDs. This may be worth flagging to the EDH developers and, in a way, it underlines another benefit of treating the EDH geo IDs as a small gazetteer of its own.

In the end, we used the common IDs (either in Pleiades or GeoNames) to do a first alignment between the Trismegistos and EDH place IDs. We didn’t have time to check the accuracy (you are welcome to take this experiment one step further!), but we fully expect to get quite a few positive results. And we have a list of EDH geo IDs ready to be re-used for other purposes and maybe to make its debut on the Peripleo scene.
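As a rough illustration of this kind of alignment, the sketch below joins two place lists on a shared external identifier. Everything in it is hypothetical: the dictionary layout, the keys "pleiades" and "geonames", the placeholder IDs, and the function name align_by_shared_ids are assumptions, not the format of the actual EDH or Trismegistos data.

```python
def align_by_shared_ids(edh_places, tm_places, keys=("pleiades", "geonames")):
    """Return sorted (edh_id, tm_id, key) triples for places that share
    the same external identifier under one of the given keys."""
    matches = set()
    for key in keys:
        # Index the Trismegistos places by their external ID for this key.
        index = {}
        for tm_id, refs in tm_places.items():
            external_id = refs.get(key)
            if external_id:
                index.setdefault(external_id, []).append(tm_id)
        # Look up each EDH place in that index.
        for edh_id, refs in edh_places.items():
            for tm_id in index.get(refs.get(key), []):
                matches.add((edh_id, tm_id, key))
    return sorted(matches)


# Placeholder identifiers, invented for this example only.
edh_places = {
    "edh-1": {"pleiades": "p-100", "geonames": "g-500"},
    "edh-2": {"geonames": "g-501"},
    "edh-3": {},  # no external reference: cannot be matched this way
}
tm_places = {
    "tm-10": {"pleiades": "p-100"},
    "tm-11": {"geonames": "g-501"},
    "tm-12": {"pleiades": "p-999"},
}

matches = align_by_shared_ids(edh_places, tm_places)
matched_edh = {edh_id for edh_id, _, _ in matches}
unmatched_edh = sorted(set(edh_places) - matched_edh)
```

Recording which key produced each match, and listing the places left unmatched, gives a starting point for the accuracy check that remained to be done.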

Pleiades sprint on Pompeian buildings

Tuesday, June 20th, 2017

Casa della Statuetta Indiana, Pompei.

On Monday the 26th of June, from 15:00 to 17:00 BST, Pleiades is organising an editing sprint to create additional URIs for Pompeian buildings, preferably those located in Regio I, Insula 8.

Participants will meet remotely on the Pleiades IRC chat. Providing monument-specific IDs will enable a more efficient and granular use and organisation of Linked Open Data related to Pompeii, and will support the work of digital projects such as the Ancient Graffiti.

Everyone is welcome to join, but a Pleiades account is required to edit the online gazetteer.

Archaeological and Epigraphic interchange and e-Science

Thursday, January 29th, 2009

Workshop at the e-Science Institute, Edinburgh, February 10-11, 2009 (see programme and registration):

Rationale: The meeting will bring technical and editorial researchers participating in, or otherwise engaged with, the IOSPE (Inscriptiones Orae Septentrionalis Ponti Euxini = Ancient Inscriptions of the Northern Black Sea Coast) project together with researchers in related fields, both historical and computational. Existing projects, such as the Inscriptions of Roman Cyrenaica and Inscriptions of Aphrodisias, have explored the digitization of ancient inscriptions from their regions, and employed the EpiDoc schema as markup. IOSPE plans to expand this sphere of activity, in conjunction with a multi-volume publication of inscription data. This event is a joint workshop funded in part by a Small Research Grant from the British Academy, and in part by the eSI through the Arts and Humanities e-Science theme. The workshop will bring together domain experts in epigraphy, specialists in digital humanities, and e-science researchers, in order to provide a detailed scoping of the research questions and of the research methods needed to investigate them from an historical/epigraphic point of view.

The success of previous projects, and the opportunities identified by the IOSPE research team, raise questions of significant interest for the e-science community. Great interpretive value can be attached to datasets such as these if they are linked, both with each other, and with other relevant datasets. The LaQuaT project at King’s, part of ENGAGE, is addressing this. There is also an important adjunct research area in the field of digital geographic analysis of these datasets: again, this can only be achieved if disparate data collections can be meaningfully cross-walked.