Archive for the ‘report’ Category

OEDUc: Disambiguating EDH person RDF working group

Tuesday, July 25th, 2017

One of the working groups at the Open Epigraphic Data Unconference (OEDUc) meeting in London (May 15, 2017) focussed on disambiguating EDH person RDF. Since the Epigraphic Database Heidelberg (EDH) has made all of its data available to download in various formats in an Open Data Repository, it is possible to extract the person data from the EDH Linked Data RDF.

A first step in enriching this prosopographic data might be to link the EDH person names with PIR and Trismegistos (TM) references. At this moment the EDH person RDF only contains links to attestations of persons, rather than unique individuals (although it attaches only one REF entry to persons who have multiple occurrences in the same text), so we cannot use the EDH person URI to disambiguate persons from different texts.

Given that EDH already contains links to PIR in its bibliography, we could start with extracting (this should be possible using a simple Python script) and linking these to the EDH person REF. In the case where there is only one person attested in a text, the PIR reference can be linked directly to the RDF of that EDH person attestation. If, however (and probably in most cases), there are multiple person references in a text, we should try another procedure (possibly by looking at the first letter of the EDH name and matching it to the alphabetical PIR volume).

A second way of enriching the EDH person RDF could be done by using the Trismegistos People portal. At the moment this database of persons and attestations of persons in texts consists mostly of names from papyri (from Ptolemaic Egypt), but TM is in the process of adding all names from inscriptions (using an automated NER script on the textual data from EDCS via the EAGLE project). Once this is completed, it will be possible to use the stable TM PER ID (for persons) and TM person REF ID (for attestations of persons) identifiers (and URIs) to link up with EDH.

The recommended procedure to follow would be similar to the one of PIR. Whenever there’s a one-to-one relationship with a single EDH person reference the TM person REF ID could be directly linked to it. In case of multiple attestations of different names in an inscription, we could modify the TM REF dataset by first removing all double attestations, and secondly matching the remaining ones to the EDH RDF by making use of the order of appearance (in EDH the person that occurs first in an inscription receives a URI (?) that consists of the EDH text ID and an integer representing the place of the name in the text (e.g., is the first appearing person name in text HD000001). Finally, we could check for mistakes by matching the first character(s) of the EDH name with the first character(s) of the TM REF name. Ultimately, by using the links from the TM REF IDs with the TM PER IDs we could send back to EDH which REF names are to be considered the same person and thus further disambiguating their person RDF data.

This process would be a good step in enhancing the SNAP:DRGN-compliant RDF produced by EDH, which was also addressed in another working group: recommendations for EDH person-records in SNAP RDF.

OEDUc: EDH and Pelagios location disambiguation Working Group

Wednesday, July 5th, 2017

From the beginning of the un-conference, an interest in linked open geodata seemed to be shared by a number of participants. Moreover, an attention towards gazetteers and alignment appeared among the desiderata for the event, expressed by the EDH developers. So, in the second part of the unconference, we had a look at what sort of geographic information can be found in the EDH and what could be added.

The discussion, of course, involved Pelagios and Pleiades and their different but related roles in establishing links between sources of geographical information. EDH is already one of the contributors of the Pelagios LOD ecosystem. Using the Pleiades IDs to identify places, it was relatively easy for the EDH to make its database compatible with Pelagios and discoverable on Peripleo, Pelagios’s search and visualisation engine.

However, looking into the data available for downloads, we focused on a couple things. One is that each of the epigraphic texts in the EDH has, of course, a unique identifier (EDH text IDs). The other is that each of the places mentioned has, also, a unique identifier (EDH geo IDs), besides the Pleiades ID. As one can imagine, the relationships between texts and places can be one to one, or one to many (as a place can be related to more than one text and a text can be related to more than one place). All places mentioned in the EDH database have an EDH geo ID, and the information becomes especially relevant in the case of those places that do not have already an ID in Pleiades or GeoNames. In this perspective, EDH geo IDs fill the gaps left by the other two gazetteer and meet the specific needs of the EDH.

Exploring Peripleo to see what information from the EDH can be found in it and how it gets visualised, we noticed that only the information about the texts appear as resources (identified by the diamond icon), while the EDH geo IDs do not show as a gazetteer-like reference, as it happen for other databases, such as Trismegistos or Vici.

So we decided to do a little work on the EDH geo IDs, more specifically:

  1. To extract them and treat them as a small, internal gazetteer that could be contributed to Pelagios. Such feature wouldn’t represent a substantial change in the way EDH is used, or how the data are found in Peripleo, but we thought it could  improve the visibility of the EDH in the Pelagios panorama, and, possibly, act as an intermediate step for the matching of different gazetteers that focus in the ancient world.
  2. The idea of using the EDH geo IDs as bridges sounded interesting especially when thinking of the possible interaction with the Trismegistos database, so we wondered if a closer collaboration between the two projects couldn’t benefit them both. Trismegistos, in fact, is another project with substantial geographic information: about 50.000 place-names mapped against Pleiades, Wikipedia and GeoNames. Since the last Linked Past conference, they have tried to align their place-names with Pelagios, but the operation was successful only for 10,000 of them. We believe that enhancing the links between Trismegistos and EDH could make them better connected to each other and both more effectively present in the LOD ecosystem around the ancient world.

With these two objectives in mind, we downloaded the geoJSON dump from the EDH website and extracted the texts IDs, the geo IDs, and their relationships. Once the lists (that can be found on the git hub repository) had been created, it becomes relatively straightforward to try and match the EDH geoIDs with the Trismegistos GeoIDs. In this way, through the intermediate step of the geographical relationships between text IDs and geo IDs in EDH, Trismegistos also gains a better and more informative connection with the EDH texts.

This first, quick attempt at aligning geodata using their common references, might help testing how good the automatic matches are, and start thinking of how to troubleshoot mismatches and other errors. This closer look at geographical information also brought up a small bug in the EDH interface: in the internal EDH search, when there is a connection to a place that does not have a Pleiades ID, the website treats it as an error, instead of, for example, referring to the internal EDH geoIDs. Maybe something that is worth flagging to the EDH developers and that, in a way, underlines another benefit of treating the EDH geo IDs as a small gazetteer of its own.

In the end, we used the common IDs (either in Pleiades or GeoNames) to do a first alignment between the Trismegistos and EDH places IDs. We didn’t have time to check the accuracy (but you are welcome to take this experiment one step further!) but we fully expect to get quite a few positive results. And we have a the list of EDH geoIDs ready to be re-used for other purposes and maybe to make its debut on the Peripleo scene.

OEDUc: recommendations for EDH person-records in SNAP RDF

Monday, July 3rd, 2017

At the first meeting of the Open Epigraphic Data Unconference (OEDUc) in London in May 2017, one of the working groups that met in the afternoon (and claim to have completed our brief, so do not propose to meet again) examined the person-data offered for download on the EDH open data repository, and made some recommendations for making this data more compatible with the SNAP:DRGN guidelines.

Currently, the RDF of a person-record in the EDH data (in TTL format) looks like:

    a lawd:Person ;
    lawd:PersonalName "Nonia Optata"@lat ;
    gndo:gender <> ;
    nmo:hasStartDate "0071" ;
    nmo:hasEndDate "0130" ;
    snap:associatedPlace <> ,
        <> ;
    lawd:hasAttestation <> .

We identified a few problems with this data structure, and made recommendations as follows.

  1. We propose that EDH split the current person references in edh_people.ttl into: (a) one lawd:Person, which has the properties for name, gender, status, membership, and hasAttestation, and (b) one lawd:PersonAttestation, which has properties dct:Source (which points to the URI for the inscription itself) and lawd:Citation. Date and location etc. can then be derived from the inscription (which is where they belong).
  2. A few observations:
    1. Lawd:PersonalName is a class, not a property. The recommended property for a personal name as a string is foaf:name
    2. the language tag for Latin should be @la (not lat)
    3. there are currently thousands of empty strings tagged as Greek
    4. Nomisma date properties cannot be used on person, because the definition is inappropriate (and unclear)
    5. As documented, Nomisma date properties refer only to numismatic dates, not epigraphic (I would request a modification to their documentation for this)
    6. the D-N.B ontology for gender is inadequate (which is partly why SNAP has avoided tagging gender so far); a better ontology may be found, but I would suggest plain text values for now
    7. to the person record, above, we could then add dct:identifier with the PIR number (and compare discussion of plans for disambiguation of PIR persons in another working group)

SNAP:DRGN introduction

Thursday, May 8th, 2014

Standards for Networking Ancient Prosopography: Data and Relations in Greco-roman Names (SNAP:DRGN) is a one-year pilot project, based at King’s College London in collaboration with colleagues from the Lexicon of Greek Personal Names (Oxford), Trismegistos (Leuven), (Duke) and Pelagios (Southampton), and hopes to include many more data partners by the end of this first year. Much of the early discussion of this project took place at the LAWDI school in 2013. Our goal is to recommend standards for sharing relatively minimalist data about classical and other ancient prosopographical and onomastic datasets in RDF, thereby creating a huge graph of person-data that scholars can:

  1. query to find individuals, patterns, relationships, statistics and other information;
  2. follow back to the richer and fuller source information in the contributing database;
  3. contribute new datasets or individual persons, names and textual references/attestations;
  4. annotate to declare identity between persons (or co-reference groups) in different source datasets;
  5. annotate to express other relationships between persons/entities in different or the same source dataset (such as familial relationships, legal encounters, etc.)
  6. use URIs to annotate texts and other references to names with the identity of the person to whom they refer (similar to Pelagios’s model for places using Pleiades).

More detailed description (plus successful funding bid document, if you’re really keen) can be found at <>.

Our April workshop invited a handful of representative data-holders and experts in prosopography and/or linked open data to spend two days in London discussing the SNAP:DRGN project, their own data and work, and approaches to sharing and linking prosopographical data in general. We presented a first draft of the SNAP:DRGN “Cookbook”, the guidelines for formatting a subset of prosopographical data in RDF for contribution to the SNAP graph, and received some extremely useful feedback on individual technical issues and the overall approach. A summary of the workshop, and slides from many of the presentations, can be found at <>.

In the coming weeks we shall announce the first public version of the SNAP ontology, the Cookbook, and the graph of our core and partner datasets and annotations. For further discussion about the project, and linked data for prosopography in general, you can also join the Ancient-People Googlegroup (where I posted a summary similar to this post earlier today).

Report on Digital Classics panel, Classical Association 2013

Wednesday, April 17th, 2013

(Report by Bartolo Natoli on the digital classics panel, April 6, 2013.)

Last week, I had the privilege of participating in the Classical Association Annual Conference at the University of Reading, UK. One of the panels that I attended struck me as particularly intriguing and important in today’s world of Higher Education: the Digital Classics panel. In fact, at the same time at which the CA Conference was occurring, the first annual Digital Classics Association Conference was happening at the University of Buffalo, a fact that further underlines the growing importance of this emerging side of Classics. (more…)

Report on e-learning panel, Classical Association 2013

Saturday, April 6th, 2013

(Report by Bartolo Natoli on the e-learning panel, April 5, 2013.)

Earlier today, the Classical Association’s Annual Conference, hosted by the University of Reading, presented two panels on ‘New Approaches to e-Learning’, a topic of growing interest in Classical Studies. The two panels boasted papers full of insights and suggestions for incorporating educational technology into both Latin and Classical Civilization classes. The first panel, consisting of papers by Jonathan Eaton and Alex Smith, focused more on how technology could be employed in classroom instruction on a macro-level. Eaton’s talk provided examples of how Virtual Learning Environments (VLEs) could be used to enhance student learning and touched on the controversial topic of massive open online courses (MOOCs). Eaton suggested that VLEs be used to offer resources to students asynchronically, whereas evaluation and direct instruction be employed in a f2f setting: blended learning was a key means of maximizing learning potential. An example of such blended learning was Alex Smith’s discussion of using technology to provide students with collaborative and higher-level learning activities based on synthesis through the creation of a website in eXeLearning that was based on set lines of Latin. Students worked through both Latin content and 21st century, real-world skills such as collaboration and web design. Technology provided the medium, but was not the goal. (more…)