Posts Tagged ‘epigraphy’

OEDUc: Disambiguating EDH person RDF working group

Tuesday, July 25th, 2017

One of the working groups at the Open Epigraphic Data Unconference (OEDUc) meeting in London (May 15, 2017) focussed on disambiguating EDH person RDF. Since the Epigraphic Database Heidelberg (EDH) has made all of its data available to download in various formats in an Open Data Repository, it is possible to extract the person data from the EDH Linked Data RDF.

A first step in enriching this prosopographic data might be to link the EDH person names with PIR and Trismegistos (TM) references. At this moment the EDH person RDF only contains links to attestations of persons, rather than unique individuals (although it attaches only one REF entry to persons who have multiple occurrences in the same text), so we cannot use the EDH person URI to disambiguate persons from different texts.

Given that EDH already contains links to PIR in its bibliography, we could start by extracting these (which should be possible using a simple Python script) and linking them to the EDH person REF. In the case where there is only one person attested in a text, the PIR reference can be linked directly to the RDF of that EDH person attestation. If, however (and probably in most cases), there are multiple person references in a text, we should try another procedure (possibly by looking at the first letter of the EDH name and matching it to the alphabetical PIR volume).
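As a rough illustration, such an extraction script might look like the sketch below. The bibliography string format and the regular expression are assumptions for illustration only, not the actual EDH data format.

```python
import re

# Hypothetical pattern for PIR citations such as "PIR² N 132";
# the real EDH bibliography strings may be formatted differently.
PIR_PATTERN = re.compile(r"PIR\S*\s+([A-Z])\s+(\d+)")

def extract_pir_refs(bibliography: str):
    """Return (volume letter, number) pairs for every PIR citation found."""
    return PIR_PATTERN.findall(bibliography)

def link_single_person(bibliography: str, person_refs: list):
    """If the text attests exactly one person and cites exactly one
    PIR entry, the link is unambiguous; otherwise return None and
    fall back to another procedure (e.g. the first-letter heuristic)."""
    pir = extract_pir_refs(bibliography)
    if len(person_refs) == 1 and len(pir) == 1:
        return (person_refs[0], pir[0])
    return None
```

In the multi-person case, the volume letter returned by `extract_pir_refs` could then be compared against the initial letter of each EDH name, as suggested above.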

A second way of enriching the EDH person RDF could be done by using the Trismegistos People portal. At the moment this database of persons and attestations of persons in texts consists mostly of names from papyri (from Ptolemaic Egypt), but TM is in the process of adding all names from inscriptions (using an automated NER script on the textual data from EDCS via the EAGLE project). Once this is completed, it will be possible to use the stable TM PER ID (for persons) and TM person REF ID (for attestations of persons) identifiers (and URIs) to link up with EDH.

The recommended procedure to follow would be similar to the one for PIR. Whenever there is a one-to-one relationship with a single EDH person reference, the TM person REF ID could be directly linked to it. In the case of multiple attestations of different names in an inscription, we could modify the TM REF dataset by first removing all double attestations, and then matching the remaining ones to the EDH RDF by making use of the order of appearance (in EDH, the person who occurs first in an inscription receives a URI consisting of the EDH text ID and an integer representing the position of the name in the text: e.g., the first person name appearing in text HD000001). Finally, we could check for mistakes by matching the first character(s) of the EDH name with the first character(s) of the TM REF name. Ultimately, by using the links from the TM REF IDs to the TM PER IDs, we could send back to EDH which REF names are to be considered the same person, thus further disambiguating their person RDF data.
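The order-of-appearance matching with a first-character sanity check could be sketched as below. The name lists are toy data: both are assumed to be in textual order, with duplicate TM attestations already removed.

```python
def match_by_order(edh_names, tm_names):
    """Pair EDH person references with TM REF names by order of
    appearance in the inscription, flagging any pair whose initial
    characters disagree (a likely alignment mistake)."""
    matches = []
    for edh, tm in zip(edh_names, tm_names):
        ok = edh[:1].upper() == tm[:1].upper()  # first-character check
        matches.append((edh, tm, ok))
    return matches
```

Pairs flagged `False` would then be reviewed by hand before any TM PER ID is sent back to EDH.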

This process would be a good step in enhancing the SNAP:DRGN-compliant RDF produced by EDH, which was also addressed in another working group: recommendations for EDH person-records in SNAP RDF.

OEDUc: recommendations for EDH person-records in SNAP RDF

Monday, July 3rd, 2017

At the first meeting of the Open Epigraphic Data Unconference (OEDUc) in London in May 2017, one of the working groups that met in the afternoon (and which claims to have completed its brief, so does not propose to meet again) examined the person-data offered for download on the EDH open data repository, and made some recommendations for making this data more compatible with the SNAP:DRGN guidelines.

Currently, the RDF of a person-record in the EDH data (in TTL format) looks like:

    a lawd:Person ;
    lawd:PersonalName "Nonia Optata"@lat ;
    gndo:gender <> ;
    nmo:hasStartDate "0071" ;
    nmo:hasEndDate "0130" ;
    snap:associatedPlace <> ,
        <> ;
    lawd:hasAttestation <> .

We identified a few problems with this data structure, and made recommendations as follows.

  1. We propose that EDH split the current person references in edh_people.ttl into: (a) one lawd:Person, which has the properties for name, gender, status, membership, and hasAttestation, and (b) one lawd:PersonAttestation, which has properties dct:Source (which points to the URI for the inscription itself) and lawd:Citation. Date and location etc. can then be derived from the inscription (which is where they belong).
  2. A few observations:
    1. lawd:PersonalName is a class, not a property. The recommended property for a personal name as a string is foaf:name.
    2. the language tag for Latin should be @la (not lat)
    3. there are currently thousands of empty strings tagged as Greek
    4. Nomisma date properties cannot be used on person, because the definition is inappropriate (and unclear)
    5. As documented, Nomisma date properties refer only to numismatic dates, not epigraphic (I would request a modification to their documentation for this)
    6. the D-N.B ontology for gender is inadequate (which is partly why SNAP has avoided tagging gender so far); a better ontology may be found, but I would suggest plain text values for now
    7. to the person record, above, we could then add dct:identifier with the PIR number (and compare discussion of plans for disambiguation of PIR persons in another working group)
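Under these recommendations, the split proposed in (1) might look something like the following Turtle sketch. All URIs and the PIR number are invented for illustration, and the class and property names simply follow the suggestions above rather than any published EDH output.

```turtle
@prefix lawd: <http://lawd.info/ontology/> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

# Hypothetical person record (URI invented)
<https://example.org/edh/person/HD000001-1> a lawd:Person ;
    foaf:name "Nonia Optata"@la ;
    dct:identifier "PIR N 000" ;   # placeholder PIR reference
    lawd:hasAttestation <https://example.org/edh/attestation/HD000001-1> .

# Hypothetical attestation record; date and place are derived
# from the inscription this points to, not repeated here
<https://example.org/edh/attestation/HD000001-1> a lawd:PersonAttestation ;
    dct:source <https://example.org/edh/inscription/HD000001> .
```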

OEDUc: EDH and Pelagios NER working group

Monday, June 19th, 2017

Participants:  Orla Murphy, Sarah Middle, Simona Stoyanova, Núria Garcia Casacuberta


The EDH and Pelagios NER working group was part of the Open Epigraphic Data Unconference held on 15 May 2017. Our aim was to use Named Entity Recognition (NER) on the text of inscriptions from the Epigraphic Database Heidelberg (EDH) to identify placenames, which could then be linked to their equivalent terms in the Pleiades gazetteer and thereby integrated with Pelagios Commons.

Data about each inscription, along with the inscription text itself, is stored in one XML file per inscription. In order to perform NER, we therefore first had to extract the inscription text from each XML file (contained within <ab></ab> tags), then strip out any markup from the inscription to leave plain text. There are various Python libraries for processing XML, but most of these turned out to be a bit too complex for what we were trying to do, or simply returned the identifier of the <ab> element rather than the text it contained.

Eventually, we found the Python library Beautiful Soup, which converts an XML document to structured text, from which you can identify your desired element, then strip out the markup to convert the contents of this element to plain text. It is a very simple and elegant solution with only eight lines of code to extract and convert the inscription text from one specific file. The next step is to create a script that will automatically iterate through all files in a particular folder, producing a directory of new files that contain only the plain text of the inscriptions.
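A minimal sketch of that extraction step might look as follows, assuming (as above) that the inscription text sits inside an `<ab>` element; the function name and the simplified input handling are illustrative rather than the group's actual script.

```python
from bs4 import BeautifulSoup

def inscription_text(xml_string: str) -> str:
    """Extract the contents of the <ab> element and strip all markup,
    returning whitespace-normalised plain text."""
    soup = BeautifulSoup(xml_string, "html.parser")
    ab = soup.find("ab")
    if ab is None:
        return ""
    # get_text() drops nested tags such as <lb/> or <supplied>
    return " ".join(ab.get_text().split())
```

The planned iteration over a folder of files could then wrap this in a loop over, say, `pathlib.Path("edh_xml").glob("*.xml")`, writing one plain-text file per inscription.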

Once we have a plain text file for each inscription, we can begin the process of named entity extraction. We decided to follow the methods and instructions shown in the two Sunoikisis DC classes on Named Entity Extraction:

Here is a short outline of the steps this might involve when it is done in the future.

  1. Extraction
    1. Split text into tokens, make a python list
    2. Create a baseline
      1. cycle through each token of the text
      2. if the token starts with a capital letter it’s a named entity (only one type, i.e. Entity)
    3. Classical Language Toolkit (CLTK)
      1. for each token in a text, the tagger checks whether that token is contained within a predefined list of possible named entities
      2. Compare to baseline
    4. Natural Language Toolkit (NLTK)
      1. Stanford NER Tagger for Italian works well with Latin
      2. Differentiates between different kinds of entities: place, person, organization or none of the above, more granular than CLTK
      3. Compare to both baseline and CLTK lists
  2. Classification
    1. Part-Of-Speech (POS) tagging – precondition before you can perform any other advanced operation on a text, information on the word class (noun, verb etc.); TreeTagger
    2. Chunking – sub-dividing a section of text into phrases and/or meaningful constituents (which may include 1 or more text tokens); export to IOB notation
    3. Computing entity frequency
  3. Disambiguation
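The baseline step in the outline above can be sketched in a few lines of Python. This is the deliberately naive heuristic described under 1.2, against which the CLTK and NLTK taggers would be compared:

```python
def baseline_entities(text: str):
    """Naive baseline: split the text into tokens, then treat every
    token starting with a capital letter as a named entity of a
    single generic type, 'Entity'."""
    tokens = text.split()  # step 1.1: split text into a list of tokens
    return [(tok, "Entity") for tok in tokens if tok[:1].isupper()]
```

On real inscriptions this over-generates (sentence-initial words, abbreviations), which is exactly why it serves only as a baseline for the CLTK and NLTK comparisons.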

Although we didn’t make as much progress as we would have liked, we have achieved our aim of creating a script to prepare individual files for NER processing, and have therefore laid the groundwork for future developments in this area. We hope to build on this work to successfully apply NER to the inscription texts in the EDH in order to make them more widely accessible to researchers and to facilitate their connection to other, similar resources, like Pelagios.

OEDUc: Images and Image metadata working group

Tuesday, June 13th, 2017

Participants: Sarah Middle, Angie Lumezeanu, Simona Stoyanova


The Images and Image Metadata working group met at the London meeting of the Open Epigraphic Data Unconference on May 15, 2017, and discussed the issues of copyright, metadata formats, image extraction and licence transparency in the Epigraphik Fotothek Heidelberg, the database which contains images and metadata relating to nearly forty thousand Roman inscriptions from collections around the world. Were the EDH to lose its funding and the website its support, one of the biggest and most useful digital epigraphy projects would start disintegrating. While its data is available for download, its usability would be greatly compromised. Thus, this working group focused on issues pertaining to the EDH image collection. The materials we worked with are the JPG images as seen on the website, and the image metadata files which are available as XML and JSON data dumps on the EDH data download page.

The EDH Photographic Database index page states: “The digital image material of the Photographic Database is with a few exceptions directly accessible. Hitherto it had been the policy that pictures with unclear utilization rights were presented only as thumbnail images. In 2012 as a result of ever increasing requests from the scientific community and with the support of the Heidelberg Academy of the Sciences this policy has been changed. The approval of the institutions which house the monuments and their inscriptions is assumed for the non commercial use for research purposes (otherwise permission should be sought). Rights beyond those just mentioned may not be assumed and require special permission of the photographer and the museum.”

During a discussion with Frank Grieshaber we found out that the information in this paragraph is available only on this webpage, with no individual licence details in the metadata records of the images, either in the XML or the JSON data dumps. It would be useful for this information to be included in the records, though it is not clear how to accomplish this efficiently for each photograph, since all photographers would need to be contacted first. Currently, the rights information in the XML records says “Rights Reserved – Free Access on Epigraphischen Fotothek Heidelberg”, which presumably points to the “research purposes” part of the statement on the EDH website.

All other components of EDH – inscriptions, bibliography, geography and people RDF – have been released under Creative Commons Attribution-ShareAlike 3.0 Unported license, which allows for their reuse and repurposing, thus ensuring their sustainability. The images, however, will be the first thing to disappear once the project ends. With unclear licensing and the impossibility of contacting every single photographer, some of whom are not alive anymore and others who might not wish to waive their rights, data reuse becomes particularly problematic.

One possible way of figuring out the copyright of individual images is to check the reciprocal links to the photographic archive of the partner institutions who provided the images, and then read through their own licence information. However, these links are only visible from the HTML and not present in the XML records.

Given that the image metadata in the XML files is relatively detailed and already in place, we decided to focus on the task of image extraction for research purposes, which is covered by the general licensing of the EDH image databank. We prepared a Python script for batch download of the entire image databank, available on the OEDUc GitHub repo. Each image has a unique identifier which is the same as its filename and the final string of its URL. This means that when an inscription has more than one photograph, each one has its individual record and URI, which allows for complete coverage and efficient harvesting. The images are numbered sequentially, and in the case of a missing image, the process skips that entry and continues on to the next one. Since the databank includes upwards of 37,530 images, the script pauses for 30 seconds after every 200 files to avoid a timeout. We don’t have access to the high resolution TIFF images, so this script downloads the JPGs from the HTML records.
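The skip-and-pause logic just described might be sketched as below. The URL pattern and filename scheme here are assumptions for illustration; the actual script on the OEDUc GitHub repo differs in its details.

```python
import time
import urllib.request
from urllib.error import HTTPError

# Hypothetical URL pattern; the real EDH image URLs may differ.
BASE_URL = "https://edh-www.adw.uni-heidelberg.de/fotos/F{:06d}.jpg"

def fetch_image(image_id: int):
    """Return the image bytes, or None when the id has no image."""
    try:
        with urllib.request.urlopen(BASE_URL.format(image_id)) as resp:
            return resp.read()
    except HTTPError:
        return None  # missing image: skip this entry and continue

def save_image(image_id: int, data: bytes):
    with open(f"F{image_id:06d}.jpg", "wb") as f:
        f.write(data)

def download_all(ids, fetch=fetch_image, save=save_image,
                 pause_every=200, pause_secs=30, sleep=time.sleep):
    """Download sequentially numbered images, skipping gaps and
    pausing after every batch of files to avoid a timeout."""
    saved = 0
    for count, image_id in enumerate(ids, start=1):
        data = fetch(image_id)
        if data is not None:
            save(image_id, data)
            saved += 1
        if count % pause_every == 0:
            sleep(pause_secs)
    return saved
```

The `fetch`, `save` and `sleep` parameters are injectable mainly so the loop can be exercised without touching the network; a real run would simply call `download_all(range(1, 37531))`.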

The EDH images included in the EAGLE MediaWiki are all under an open licence and link back to the EDH databank. A task for the future will be to compare the two lists to get a sense of the EAGLE coverage of EDH images and feed back their licensing information to the EDH image records. One issue is the lack of file-naming conventions in EAGLE, where some photographs carry a publication citation (CIL_III_14216,_8.JPG, AE_1957,_266_1.JPG), others a random name (DR_11.jpg), and still others a descriptive filename which may contain an EDH reference (Roman_Inscription_in_Aleppo,_Museum,_Syria_(EDH_-_F009848).jpeg). Matching these to the EDH databank will have to be done by cross-referencing the publication citations either in the filename or in the image record.

A further future task could be to embed the image metadata into the image itself. The EAGLE MediaWiki images already have the Exif data (added automatically by the camera) but it might be useful to add descriptive and copyright information internally following the IPTC data set standard (e.g. title, subject, photographer, rights, etc.). This will help bring the inscription file, image record and image itself back together, in the event of data scattering after the end of the project. Currently, linkage exists between the inscription files and the image records. Embedding at least the HD number of the inscription directly into the image metadata will allow us to gradually bring the resources back together, following changes in copyright and licensing.

Out of the three tasks we set out to discuss, one turned out to be impractical and unfeasible, one we accomplished and published the code for, and one remains to be worked on in the future. Ascertaining the copyright status of all images is physically impossible, so all future experiments will be done on the EDH images in EAGLE MediaWiki. The script for extracting JPGs from the HTML is available on the OEDUc GitHub repo. We have drafted a plan for embedding metadata into the images, following the IPTC standard.

Open Epigraphic Data Unconference report

Wednesday, June 7th, 2017

Last month, a dozen or so scholars met in London (and were joined by a similar number via remote video-conference) to discuss and work on the open data produced by the Epigraphic Database Heidelberg. (See call and description.)

Over the course of the day seven working groups were formed, two of which completed their briefs within the day, but the other five will lead to ongoing work and discussion. Fuller reports from the individual groups will follow here shortly, but here is a short summary of the activities, along with links to the pages in the Wiki of the OEDUc Github repository.

Useful links:

  1. All interested colleagues are welcome to join the discussion group:!forum/oeduc
  2. Code, documentation, and other notes are collected in the Github repository:

1. Disambiguating EDH person RDF
(Gabriel Bodard, Núria García Casacuberta, Tom Gheldof, Rada Varga)
We discussed and broadly specced out a couple of steps in the process for disambiguating PIR references for inscriptions in EDH that contain multiple personal names, for linking together person references that cite the same PIR entry, and for using Trismegistos data to further disambiguate EDH persons. We haven’t written any actual code to implement this yet, but we expect a few Python scripts would do the trick.

2. Epigraphic ontology
(Hugh Cayless, Paula Granados, Tim Hill, Thomas Kollatz, Franco Luciani, Emilia Mataix, Orla Murphy, Charlotte Tupman, Valeria Vitale, Franziska Weise)
This group discussed the various ontologies available for encoding epigraphic information (LAWD, Nomisma, EAGLE Vocabularies) and ideas for filling the gaps between them. This is a long-standing desideratum of the EpiDoc community, and will be an ongoing discussion (perhaps the most important of the workshop).

3. Images and image metadata
(Angie Lumezeanu, Sarah Middle, Simona Stoyanova)
This group attempted to write scripts to track down copyright information on images in EDH (too complicated, but EAGLE may have more of this), download images and metadata (scripts in Github), and explored the possibility of embedding metadata in the images in IPTC format (in progress).

4. EDH and SNAP:DRGN mapping
(Rada Varga, Scott Vanderbilt, Gabriel Bodard, Tim Hill, Hugh Cayless, Elli Mylonas, Franziska Weise, Frank Grieshaber)
In this group we reviewed the status of the SNAP:DRGN recommendations for person-data in RDF, and then looked in detail at the person list exported from the EDH data. A list of suggestions for improving this data was produced for EDH to consider. This task was considered to be complete. (Although Frank may have feedback or questions for us later.)

5. EDH and Pelagios NER
(Orla Murphy, Sarah Middle, Simona Stoyanova, Núria Garcia Casacuberta, Thomas Kollatz)
This group explored the possibility of running machine named entity extraction on the Latin texts of the EDH inscriptions, in two stages: extracting plain text from the XML (code in Github); applying CLTK/NLTK scripts to identify entities (in progress).

6. EDH and Pelagios location disambiguation
(Paula Granados, Valeria Vitale, Franco Luciani, Angie Lumezeanu, Thomas Kollatz, Hugh Cayless, Tim Hill)
This group aimed to work on disambiguating location information in the EDH data export, for example making links between Geonames place identifiers, TMGeo places, Wikidata and Pleiades identifiers, via the Pelagios gazetteer or other linking mechanisms. A pathway for resolving these was identified, but work is still ongoing.

7. Exist-db mashup application
(Pietro Liuzzo)
This task, which Dr Liuzzo carried out alone, since his network connection didn’t allow him to join any of the discussion groups on the day, was to create an implementation of existing code for displaying and editing epigraphic editions (using Exist-db, Leiden+, etc.) and offer a demonstration interface by which the EDH data could be served up to the public and contributions and improvements invited. (A preview “” perhaps?)

Reflecting on our (first ever) Digital Classicist Wiki Sprint

Wednesday, July 16th, 2014

From (Print) Encyclopedia to (Digital) Wiki

According to Denis Diderot and Jean le Rond d’Alembert the purpose of an encyclopedia in the 18th century was ‘to collect knowledge disseminated around the globe; to set forth its general system to the people with whom we live, and transmit it to those who will come after us, so that the work of preceding centuries will not become useless to the centuries to come’.  Encyclopedias have existed for around 2,000 years; the oldest is in fact a classical text, Naturalis Historia, written ca 77 CE by Pliny the Elder.

Following the (recent) digitisation of raw data, new, digital forms of encyclopedia have emerged. In our very own digital era, a Wiki is a wider, electronic encyclopedia that is open to contributions and edits by interested parties. It contains concept analyses, images, media, and so on, and it is freely available, thus making the creation, recording, and dissemination of knowledge a democratised process, open to everyone who wishes to contribute.


A Sprint for Digital Classicists

For us Digital Classicists, scholars and students interested in the application of humanities computing to research in the ancient and Byzantine worlds, the Digital Classicist Wiki, composed and edited by a community of scholars and students, serves as a hub. This wiki collects guidelines and suggestions on major technical issues, and catalogues digital projects and tools of relevance to classicists. The wiki also lists events, bibliographies and publications (print and electronic), and other developments in the field. A discussion group serves as grist for a list of FAQs. As members of the community provide answers and other suggestions, some of these may evolve into independent wiki articles providing work-in-progress guidelines and reports. The scope of the Wiki follows the interests and expertise of collaborators, in general, and of the editors, in particular. The Digital Classicist is hosted by the Department of Digital Humanities at King’s College London, and the Stoa Consortium, University of Kentucky.

So how did we end up editing this massive piece of work? On Tuesday July 1, 2014, at around 16:00 GMT (or 17:00 CET), a group of interested parties gathered on several digital platforms. The idea was that most of the action would take place in the DigiClass chatroom on IRC, our very own channel called #digiclass. Alongside the traditional chat window, there was also a Skype voice call to get us started and discuss approaches before editing. On the side, we had a GoogleDoc where people simultaneously added what they thought should be improved or created. I was very excited to interact with members old and new. It was a fun break during my mini trip to the Netherlands and, as it proved, very much in keeping with the general attitude of the Digital Classicist team: knowledge is open to everyone who wishes to learn, and can be the outcome of a joyful collaborative process.


The Technology Factor

As a researcher of digital history, and I suppose most information system scholars would agree, technology is never neutral in the process of ‘making’. The magic of the Wiki consists in the fact that it is a rather simple platform that can be easily tweaked. All users were invited to edit any page or to create new pages within the wiki website, using only a regular web browser without any extra add-ons. Wiki makes page link creation easy by showing whether an intended target page exists or not. A wiki enables communities to write documents collaboratively, using a simple markup language and a web browser. A single page in a wiki website is referred to as a wiki page, while the entire collection of pages, which are usually well interconnected by hyperlinks, is ‘the wiki’. A wiki is essentially a database for creating, browsing, and searching through information. A wiki allows non-linear, evolving, complex and networked text, argument and interaction. Edits can be made in real time and appear almost instantly online. This can facilitate abuse of the system. Private wiki servers (such as the Digital Classicist one) require user identification to edit pages, thus keeping the process mildly controlled. Most importantly, as researchers of the digital we understood in practice that a wiki is not a carefully crafted site for casual visitors. Instead, it seeks to involve the visitor in an ongoing process of creation and collaboration that constantly changes the website landscape.


Where Technology Shapes the Future of Humanities

In terms of human resources, some people with little prior involvement in the Digital Classicist community got themselves involved in several tasks, including correcting pages, suggesting new projects, adding pages to the wiki, helping others with information and background, and approaching project owners and leaders in order to suggest adding or improving information. Collaboration, a practice usually reserved for science scholars, made the process easier and intellectually stimulating. Moreover, within these overt cyber-spaces of ubiquitous interaction one could identify a strong sense of productive diversity within our own scholarly community; it was visible both in the IRC chat channel and over Skype. Scholars with several different accents and spellings (British and American English, and several continental traditions) gathered to expand this incredibly fast-paced process. There was a need to address research projects, categories, and tools found in non-English-speaking academic cultures. As a consequence of this multivocal procedure, more interesting questions arose, not least methodological ones: ‘What projects are defined as digital, really?’, ‘Isn’t everything a database?’, ‘What is a prototype?’, ‘Shouldn’t there be a special category for dissertations, or visualisations?’. The beauty of collaboration in all its glory, plus expanding our horizons with technology! And so much fun!

MediaWiki recorded almost 250 changes made on 1 July 2014!

The best news, however, is that this first-ever wiki sprint was not the last. In the words of the organisers, Gabriel Bodard and Simon Mahony,

‘We have recently started a programme of short intensive work-sprints to
improve the content of the Digital Classicist Wiki
( A small group of us this week made
about 250 edits in a couple of hours in the afternoon, and added dozens
of new projects, tools, and other information pages.

We would like to invite other members of the Digital Classicist community to
join us for future “sprints” of this kind, which will be held on the
first Tuesday of every month, at 16h00 London time (usually =17:00
Central Europe; =11:00 Eastern US).

To take part in a sprint:

1. Join us in the DigiClass chatroom (instructions at
<>) during the
scheduled slot, and we’ll decide what to do there;

2. You will need an account on the Wiki–if you don’t already have one,
please email one of the admins to be invited;

3. You do not need to have taken part before, or to come along every
month; occasional contributors are most welcome!’

The next few sprints are scheduled for:
* August 5th
* September 2nd
* October 7th
* November 4th
* December 2nd

Please, do join us, whenever you can!



Postdoc position: Imaging and Ancient Documents (Oxford)

Monday, May 10th, 2010

Forwarded for Charles Crowther:

Post-doctoral Research Assistant – Reflectance Transformation Imaging Systems for Ancient Documentary Artefacts (RTISAD)
Academic-related Grade 7, Salary: £28,983.00 – £35,646.00 pro rata per annum

The Reflectance Transformation Imaging Systems for Ancient Documentary Artefacts (RTISAD) project is seeking to appoint a Post-doctoral Research Assistant for a three-quarter-time, nine-month fixed term post from 1 June 2010 or as soon as possible thereafter. The project is funded by an Arts and Humanities Research Council Grant, under the Digital Equipment and Database Enhancement for Impact scheme. The person appointed will be responsible for organising a trial programme of photographing ancient documentary material using the Reflectance Transformation Imaging systems built by the project. Applicants should have a completed D.Phil., Ph.D. or equivalent, together with a competence in cuneiform studies, and/or Greek and Latin papyrology and epigraphy, or another related discipline, and have proven IT skills.


Practical Epigraphy Workshop

Monday, November 9th, 2009

Forwarded for Charlotte Tupman.

Practical Epigraphy Workshop

22-24 June 2010, Great North Museum, Newcastle

A Practical Epigraphy Workshop is taking place for those who are interested in developing hands-on skills in working with epigraphic material. The workshop is aimed at graduate students, but other interested parties are welcome to apply, whether or not they have previous experience. With expert tuition, participants will learn the practical aspects of how to record and study inscriptions. The programme will include the making of squeezes; photographing and measuring inscribed stones; and the production of transcriptions, translations and commentaries. Participants may choose to work on Latin or Greek texts.

The course fee is £100 but we hope to be able to provide bursaries to participants to assist with the cost. Accommodation will be extra, but we are arranging B&B nearby for around £30-40.

Places on the workshop are limited and applications will be accepted until 31st March. For further details please contact Dr. Charlotte Tupman:

The Practical Epigraphy Workshop is sponsored by The British Epigraphy Society, an independent ‘chapter’ of the Association Internationale d’Épigraphie Grecque et Latine:

EpiDoc Training Sessions 2009

Wednesday, May 20th, 2009

EpiDoc Training Sessions 2009
London 20-24 July
Rome 21-25 September

The EpiDoc community has been developing protocols for the publication of inscriptions, papyri, and other documentary Classical texts in TEI-compliant XML: for details see the community website at

Over the last few years there has been increasing demand for training by scholars wishing to use EpiDoc. We are delighted to be able to announce two training workshops, which will be offered in 2009. Both will be led by Dr Gabriel Bodard. These sessions will benefit scholars working on Greek or Latin documents with an interest in developing skills in the markup, encoding, and exploitation of digital editions. Competence in Greek and/or Latin, and knowledge of the Leiden Conventions will be assumed; no particular computer skills are required.

London session, 20-24 July 2009. This will take place at the Centre for Computing in the Humanities, King’s College London, 26-29 Drury Lane. The cost of attendance will be £50 for students; £100 for employees of universities or other non-profit institutions; £200 for employees of commercial institutions. Those interested in enrolling should apply to Dr Bodard, by 20 June 2009.

We hope to be able to offer some follow-up internships after the session, to enable participants to consolidate their experience under supervision; please let us know if that would be of interest to you.

Rome session, 21-25 September 2009. This will take place at the British School at Rome. Thanks to the generous support of the International Association of Greek and Latin Epigraphy, the British School and Terra Italia Onlus, attendance will be free.

Those interested in enrolling should apply to Dr Silvia Orlandi, by 30 June 2009.

Practical matters
Both courses will run from Monday to Friday starting at 10:00 am and ending at 16:00 each day.

Participants should bring a wireless-enabled laptop. You should acquire and install a copy of Oxygen *and* either an educational licence ($48) or a 30-day trial licence (free). Don’t worry if you don’t know how to use it!

Archaeological and Epigraphic interchange and e-Science

Thursday, January 29th, 2009

Workshop at the e-Science Institute, Edinburgh, February 10-11, 2009 (see programme and registration):

Rationale: The meeting will bring technical and editorial researchers participating in, or otherwise engaged with, the IOSPE (Inscriptiones Orae Septentrionalis Ponti Euxini = Ancient Inscriptions of the Northern Black Sea Coast) project together with researchers in related fields, both historical and computational. Existing projects, such as the Inscriptions of Roman Cyrenaica and Inscriptions of Aphrodisias, have explored the digitization of ancient inscriptions from their regions, and employed the EpiDoc schema as markup. IOSPE plans to expand this sphere of activity, in conjunction with a multi-volume publication of inscription data. This event is a joint workshop funded in part by a Small Research Grant from the British Academy, and in part by the eSI through the Arts and Humanities e-Science theme. The workshop will bring together domain experts in epigraphy, specialists in digital humanities, and e-science researchers, to provide a detailed scoping of the research questions, and of the research methods needed to investigate them from an historical/epigraphic point of view.

The success of previous projects, and the opportunities identified by the IOSPE research team, raise questions of significant interest for the e-science community. Great interpretive value can be attached to datasets such as these if they are linked, both with each other, and with other relevant datasets. The LaQuaT project at King’s, part of ENGAGE, is addressing this. There is also an important adjunct research area in the field of digital geographic analysis of these datasets: again, this can only be achieved if disparate data collections can be meaningfully cross-walked.

Digital Classicist Occasional Seminars: Lamé on digital epigraphy

Tuesday, November 4th, 2008

For those who are not subscribed to the Digital Classicist podcast RSS, I’d like to call attention to the latest “occasional seminar” audio and slides online: Marion Lamé spoke about “Epigraphical encoding: from the Stone to Digital Edition” in the international video-conference series European Culture and Technology. Marion discussed her PhD project, which uses an XML-encoded edition of the Res Gestae Diui Augusti as an exercise in the digital recording and presentation of an extremely important and rich historical text, and in encoding historical features in the markup.

We shall occasionally record and upload (with permission) presentations of interest to digital classicists that are presented in other venues and series. If you would be interested in contributing a presentation to this series, please contact me or someone else at the Digital Classicist.

EpiDoc Summer School, 11-15 June, 2007

Sunday, May 13th, 2007

Over the last few years an international group of scholars has been developing a set of conventions for marking up ancient documents in XML for publication and interchange. The EpiDoc Guidelines started from the case of inscriptions, but the principles are also being applied to papyri and coins, and the aim has always been to produce standards consistent with those of the Text Encoding Initiative, used for all literary and linguistic texts.

Following on from the interest we have seen in EpiDoc training events (including recent sessions in Rome and San Diego) and the success of the London EpiDoc Summer School over several years now, we shall be holding another week-long workshop here at King’s College London, from 11th to 15th June this year.

* The EpiDoc Guidelines provide a schema and associated tools and recommendations for the use of XML to publish epigraphic and papyrological texts in interchangeable format. For a fuller description of the project and links to tools and guidelines see
* The Summer School will offer an in-depth introduction to the use of XML and related technologies for publication and interchange of epigraphic and papyrological editions.
* The event will be hosted by the Centre for Computing in the Humanities, King’s College London, which will provide the venue and tuition. The school is free of charge, but attendees will need to fund their own travel, accommodation, and subsistence. (There may be cheap accommodation available through KCL; please inquire.)
* The summer school is targeted at epigraphic and papyrological scholars (including professors, post-docs, and advanced graduate students) with an interest and willingness to learn some of the hands-on technical aspects necessary to run a digital project (even if they would not be marking up texts by hand very much themselves). Knowledge of Greek/Latin, the Leiden Conventions and the distinctions expressed by them, and the kinds of data and metadata that need to be recorded by philologists and ancient historians, will be an advantage. Please enquire if you’re unsure. No particular technical expertise is required.
* Attendees will require the use of a relatively recent laptop computer (Win XP+ or Mac OSX 10.3+), with up-to-date Java installation, and should acquire a copy of the oXygen XML editor (educational discount and one-month free trial available); they should also have the means to enter Unicode Greek from the keyboard. Full technical specifications and advice are available on request. (CCH may be able to arrange the loan of a prepared laptop for the week; please inquire asap.)
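As a flavour of the markup taught at the school, here is a minimal, hypothetical sketch of an EpiDoc transcription; the element names are standard TEI, but the text and attribute values are purely illustrative:

```xml
<!-- A one-line edition: "Imp(erator) Caesar [Augustus] [---]" in Leiden terms -->
<div type="edition" xml:lang="la">
  <ab>
    <expan><abbr>Imp</abbr><ex>erator</ex></expan> Caesar
    <supplied reason="lost">Augustus</supplied>
    <gap reason="lost" extent="unknown" unit="character"/>
  </ab>
</div>
```

Each distinction of the Leiden Conventions (an abbreviation expanded, text restored by the editor, a lacuna of unknown length) is captured by an element rather than by parentheses, square brackets, or dashes.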

Places on the workshop will be limited so if you are interested in attending the summer school, or have a colleague or student who might be interested, please contact as soon as possible with a brief statement of qualifications and interest.

EpiDoc: Epigraphic Documents in TEI XML

Monday, April 24th, 2006

There’s a new home on SourceForge for EpiDoc, and the EpiDoc Guidelines themselves are available here on the Stoa server.


Five important principles have governed the elaboration of EpiDoc techniques and tools from the beginning:

  • EpiDoc and its tools should be open and available to the widest possible range of individuals and groups; therefore, all documents and software produced by the EpiDoc Community are released under the GNU General Public License
  • Insofar as possible, EpiDoc should be compliant or compatible with other published standards: we should strive to avoid re-inventing wheels or creating data silos
  • Insofar as possible, EpiDoc projects should work collaboratively and supportively with other digital epigraphy initiatives, especially those sanctioned by the Association Internationale d’Épigraphie Grecque et Latine
  • In the arena of transcription, EpiDoc must facilitate the encoding of all editorial observations and distinctions signaled in traditional print editions through the use of sigla and typographic indicia
  • We avoid encoding the appearance of these sigla and indicia; rather, we encode the character (or semantics) of the distinction or observation the human editor is making. The rendering of typographic representations of these distinctions is accomplished using XSLTs or other methods.
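By way of a hypothetical sketch of this last principle: the Leiden square brackets for an editorial restoration are not typed into the transcription; the restoration is encoded semantically, and a stylesheet reintroduces the brackets at display time:

```xml
<!-- In the edition: the editor restores a lost word (Leiden: [Augusto]) -->
<supplied reason="lost">Augusto</supplied>

<!-- In an XSLT stylesheet: the print convention is applied on output -->
<xsl:template match="supplied[@reason='lost']">
  <xsl:text>[</xsl:text>
  <xsl:apply-templates/>
  <xsl:text>]</xsl:text>
</xsl:template>
```

The same encoded edition can thus be rendered with different sets of sigla (or with none at all) without touching the underlying data.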

EpiDoc Development Sprint: more

Friday, April 7th, 2006

As a follow-up to my previous post detailing the achievements of the first half of an “EpiDoc Development Sprint” in London (20-24 March 2006), I would like to offer the following summary of achievements during the second (and final) two-day sprint (below). Participants are currently gathering up loose ends and completing finishing touches to portions of the work. Further updates, including announcement of the next major release of tools and guidelines, will be made via the Stoa-sponsored Markup List.


EpiDoc Development Sprint

Wednesday, March 22nd, 2006

During the week of 20-24 March, the Centre for Computing in the Humanities at King’s College (London) is playing host to a group of EpiDoc practitioners and Text Encoding Initiative experts for the purpose of an “EpiDoc Development Sprint.” This event proceeds under the auspices of the Inscriptions of Aphrodisias Project (also at King’s) and with funding support from the Arts and Humanities Research Council. Other participants represent the U.S. Epigraphy Project and the Scholarly Technology Group (both at Brown University), and the Ancient World Mapping Center (UNC-Chapel Hill). Sprint participants are collaborating to achieve major advances in published guidelines and free tooling to support the encoding of Greek and Latin inscriptions using the TEI tagset.

The week’s efforts are organized into a pair of two-day sprint sessions, the first of which has now closed. Herewith a brief summary of accomplishments to date …