Workshop in Computational Linguistics and Latin Philology

Sunday, September 28th, 2008

Posted on behalf of David Bamman:

Place: University of Innsbruck, 15. International Colloquium on Latin Linguistics
Date: April 6, 2009

Workshop organizers: David Bamman (Perseus Project, Tufts University), Dag Haug (University of Oslo), Marco Passarotti (Catholic University of Milan)
Invited speaker: Roberto Busa, S.J.

Classical Studies has long had a history of driving pioneering research in linguistics and literary studies. The great Classical philologists and lexicographers of the 19th century are arguably some of the world’s earliest and finest corpus linguists – but we find ourselves now lagging behind the achievements of other languages due in large part to the absence of structured digital resources on which to base our research. While the TLG and the Packard Humanities Institute each released their respective Greek and Latin corpora in the 1970s (only shortly after the release of the Brown Corpus of English in 1967), they remain today – almost 40 years later – two of our most widely used electronic resources. Those ensuing 40 years have seen the rise and widespread development of structured knowledge bases, such as huge treebanks to encode syntactic information in English, Czech, Arabic and over twenty other languages, lexical ontologies such as WordNet, and new corpora being annotated not just with their semantics and syntax disambiguated, but their named entities and propositional data made explicit as well.

We are, however, now beginning to see these same resources being developed for Latin, along with the automatic tools that can exploit them (such as automatic syntactic parsers and morphological taggers) and a new interest in quantitative research that can only exist as a result. As we enter this new era, we must take care to work together as a community going forward – the three organizers, for instance, are each leading the development of independent treebank projects for different eras of Latin (Classical, Biblical and Thomistic) and we recognize that the value of each project is exponentially greater when compatible with the others. This workshop aims to bring together scholars working in the field – both those developing such resources and those conducting linguistic research using them – to share such work and experience.

We invite presentations including the following:
* Electronic resources for Latin in development
* Corpus linguistic research
* Application and evaluation of NLP tools on Latin texts
* Development of corpus-driven lexica
* Standards and standardization of annotation styles on different linguistic layers (e.g., morphological, syntactic, semantic, propositional)

Please submit abstracts of up to two A4 pages to Dag Haug at before December 1, 2008. Notifications will be sent before January 1, 2009.

Classical panels at DRHA

Sunday, September 21st, 2008

This year’s Digital Resources for the Humanities and Arts conference (Cambridge, September 14-17) included a two-part panel on Digital Classicist (sadly divided over two days), organized by Simon Mahony, Stuart Dunn, and myself. Despite some apparently last-minute (and unannounced) scheduling changes, the panel was very successful. I post here only my brief notes on the papers involved, and hope that some of my colleagues may post more detailed reactions or reports either in comments, or as posts to this or other blogs.

Gabriel Bodard

I kicked off the first Classicists’ session on Monday morning with a brief history of the Digital Classicist community and a discussion of the different approaches to studying the use of digital methods in the study of the ancient world (contrasting the historical approach of Solomon 1993 with the forward-looking theme of Crane/Terras 2008, for which authors were asked to imagine their field within Classics in 2018). I talked in general terms about the different trajectories of two very early digital classical projects, the TLG and LGPN, both of which were founded in 1972. The TLG, while a technologically innovative project from the get-go, and one which changed (and continues to be indispensable to) the study of Greek literature, has not made a great contribution to the Digital Humanities because of its closed, for-profit, and self-sufficient strategy. The LGPN on the other hand began life as a very technologically conservative project, geared to the production of paper volumes of the Lexicon, and has always been reactive to changes in technology rather than proactive as the TLG was; as a result of this, however, they have been able to change with the times, adopt new database and web technologies as they appeared, and are now actively contributing to the development of standards in XML, onomastics, and geo-tagging, and sharing data and tools widely. Finally I argued that any study of the community of digital Classics needs both to consider history (lessons to be learned from projects such as those discussed above, and other venerable projects that are still currently innovative such as Perseus and the DDbDP), and consider the newest technologies, standards, and cyberinfrastructures that will drive our work forward in the future.

(David Robey pointed out that Classics has an important and unique position within the UK arts and humanities community in that the subject associations give validity and respectability by their support of and recognition for digital resources and research.)

Stuart Dunn

In a paper titled The UK’s evolving e-infrastructure and the study of the past, Stuart discussed the national e-Science agenda and how it relates to the practices and needs of the humanities scholar, using as a basis the research process of data collection, analysis, and publication/dissemination. The essential definition of e-Science is that it centres around scholarly collaboration across and between disciplines, and the advanced computational infrastructure that enables this collaboration. e-Science often involves working with huge bodies of data or processing-intensive operations on complex material, and the example of this kind of research Stuart offered was not Classical but Byzantine: the use of agent-based modelling by colleagues in Birmingham to simulate the climactic battle of Manzikert. After some general conclusions on the opportunities for advanced e-infrastructure to be used in the study of the ancient world, there was some lively discussion of geospatial resources in the British and European academic spheres.

Simon Mahony

Simon gave a detailed presentation of the Humslides 2.0 project that he is conducting with the Classics department at King’s College London. Building upon the work carried out in a pilot project in 2006-7 to digitise the teaching slide collections of the Classics department (as a pilot study for the School of Humanities), which adopted a free trial version of the ContentDM management system (trial license now expired, and not renewed), the new project will utilize Web 2.0 tools to present and organize some 7000 slides with more metadata and more input from students and other contributors. A Humslides Flickr group has been established, inspired in part by the Commons group set up by Library of Congress and now contributed to by several other major institutions. As well as providing a teaching resource (currently restricted to KCL students until some thorny copyright issues have been ironed out), students will be set assessed coursework tasks to contribute to the tagging and annotating of images in this collection.

Elpiniki Fragkouli

Due to illness, Elpiniki’s paper on Training, Communities of Practice, and Digital Humanities was not delivered at this conference. We shall see whether she would be willing to upload her slides on the Digital Classicist website for discussion.

Amy Smith (Leif Isaksen, Brian Fuchs)

The paper on Lightweight Reuse of Digital Resources with VLMA: perspectives and challenges, originally commissioned for the Digital Classicist panel, was at the last minute and for unknown reasons switched over into a panel on Digital Humanities on Tuesday morning. Amy presented this paper, which discussed lessons learned from the Virtual Lightbox for Museums and Archives project (discussed in detail in their article in the special issue of Digital Medievalist journal we edited). Some conclusions and discussion followed on the topic of RDF and other metadata standards, and on browser-based versus desktop applications for viewing and organizing remote objects.

John Pybus (Alan Bowman, Charles Crowther and Ruth Kirkham)

John’s presentation on A Virtual Research Environment for the Study of Documents and Manuscripts gave a succinct and very useful summary of the history of the VRE research that has been carried out by the Centre for the Study of Ancient Documents and the humanities VRE team in Oxford. The project is one of four demo projects conducted by the second phase of work that began with a user requirements survey in 2006-7. Built using uPortal, the VRE allows remote, parallel, and dynamic consultation and annotation of texts, images, and other resources by multiple scholars simultaneously. John showed some examples of the functionality of the VRE platform, including: the ability to show side-by-side parallel views of a tablet (different images or different renderings of the same image); the juxtaposition of multiple fragments in a lightbox; the ability to share views and exchange instant messages between scholars.

Emma O’Riordan (Michael Fulford, et al.)

In a paper that discussed another project related to the Oxford VRE programme, the Virtual Environment for Research in Archaeology: a Roman case study at Silchester, Emma discussed the origins of the VERA system in the Integrated Archaeological Database (IADB) that has been in use at Silchester for several years. The VERA system allows almost instant publication of the year’s results (as compared to waiting several months for paper notes to be transcribed); is cheaper and more reliable than manual transcription; and, perhaps most importantly, enables live communication and collaboration between the archaeologists in the field and scholars in other parts of the world. Emma stressed one lesson from this project which was the importance of working alongside computer scientists, so that development of functionality can take into consideration the needs of the archaeologists as well as the research and interests of the programmers. It was interesting, however, that she also noted the potential pitfalls of too much tinkering with a tool while at work in the field.

Claire Warwick (Melissa Terras, et al.)

Originally scheduled in the second “Digital Humanities” panel on Tuesday morning, this paper followed logically on from Emma’s, and discussed Virtual Environments for Research in Archaeology (VERA): Use and Usability of Integrated Virtual Environments in Archaeological Research. Claire focussed on the evaluation of documentation of the unique needs of archaeologists in the field, and some conclusions the VERA team have been able to draw by the use of questionnaires, diaries, and anonymized interviews with the Silchester workers. Learning new IT skills was considered to be a burden by students who were already having to learn fieldwork skills on the job; there were also new problems with the technology, as compared to the “pencil and paper” methods for which workflow and solutions had been developed over time. We look forward to a full report on the feedback and usability study that the UCL participants in the VERA project are conducting.

Leif Isaksen

Originally scheduled for the “Digital Tools” panel, in this paper, Building a Virtual Community: The Antiquist Experience, Leif spoke to a Digital Classicist audience about a parallel community, Antiquist (who focus on digital approaches to cultural heritage and archaeology). The Antiquist community has an active mailing list (a Google group), a moribund blog, and a wiki whose main function is announcements of events. Antiquist boasts multiple moderators, many of whom try to keep the list active, and from the start they actively invited heritage professionals who were known to them to join the community. There is no set agenda, and membership is from a wide range of industries. Over time, traffic on the list has remained steady, with an unusually high percentage of active participants, but the content of the list traffic has tended recently to become more announcement-focussed rather than long threads and discussions. They are currently considering inviting new moderators to join the team, in the hope of injecting fresh blood and enthusiasm into a team who now rarely innovate and introduce new discussions to the group. Compared to many mailing lists, the community is still very active and very healthy, however. (Leif has usefully uploaded his slideshow and commented in a thread on the Antiquist email group.)

Legal guide to GPL compliance

Sunday, September 14th, 2008

I posted a few weeks ago on a guide to citing Creative Commons works, and just a short while later I saw this not directly related story about a Practical Guide to GPL Compliance, from the Software Freedom Law Center. Where the CC-guide is primarily about citation, and therefore of interest to many Digital Humanists/Classicists who work with these licenses, the GPL-guide is a subtly different animal. Free and Open Source Software licensing is a more fraught area, since in most cases software is re-used (if at all) and embedded in a new product that includes new code as well as the re-used FOSS parts. In some cases this new software may be sold or licensed for financial gain, or attached to services that are charged for, or otherwise part of a commercial product. It is therefore extremely useful to have this practical guide to issues of legality (including documentation and availability of license information) available to programmers and to companies that make use of FOSS code. One worth bookmarking.

2009 Conference of Computer Applications to Archaeology (CFP)

Sunday, September 14th, 2008

Via Centernet:

CALL FOR PAPERS AND PROPOSALS FOR SESSIONS, WORKSHOPS, AND ROUNDTABLES at the 2009 Conference of Computer Applications to Archaeology (CAA)
Deadline: October 15, 2008

The 37th annual conference on Computer Applications to Archaeology (CAA) will take place at the Colonial Williamsburg Foundation in Williamsburg, Virginia from March 22 to 26, 2009. The conference will bring together students and scholars to explore current theory and applications of quantitative methods and information technology in the field of archaeology. CAA members come from a diverse range of disciplines, including archaeology, anthropology, art and architectural history, computer science, geography, geomatics, historic preservation, museum studies, and urban history.

The full CFP is available here:

In defence of biblioclasm

Saturday, September 13th, 2008

Charlotte Roueché pointed me to this transcript of a piece from ABC Radio’s Perspective slot: ‘Our Biblioclastic Century’. The author, Robin Derricourt, an academic publisher with a background in archaeology and history, makes some well-observed points about online publication and the need for sustainability of publication and citation if we are to retain the intellectual and academic output of our culture. With none of this can I disagree. However, he then ends this short, pithy piece with the somewhat knee-jerk conclusion:

I know that my grandchildren will be able to go into a library and read an article by Einstein, a book by Newton, or a manuscript by Captain James Cook, and those by their minor contemporaries. I do not know that they will be able to access the reports, documents and articles that I can read today only on some present day institution’s website. In fact I can be pretty sure most of this will not survive.

And when our own civilisation finally ends, as each civilisation does, where will be the repository that maintains what we now have as knowledge, perhaps even through some future dark ages, for later societies to inherit? They will still have Aristotle, or Darwin, but they may not have the 21st century equivalents to read.

It is important to recognise that this is the well-thought-out fear of an informed and intelligent person, and that those of us working for digital sustainability therefore need to communicate our aims and achievements more widely. I cannot help, however, but point out a logical fallacy in this argument: Derricourt assumes the existence of the physical library full of books (as well he might, the library is an institution that will not go away any time soon). But the library has not always existed, and it was by no means automatic or self-evident that the library would come into existence.

If these cultural and academic institutions had not come into being at several points in history (often associated with the courts of kings or religious communities), then books would be in no better shape than websites are now (or rather websites in the world that still exists in Derricourt’s imagination, which was the world of the early Web of the 1990s). Individual copies would have circulated in private collections, some would occasionally have been copied, but not on the scale and with the rigour that we saw in Mediaeval monasteries, for example. The idea of the repository that holds a copy of everything published in a certain domain, whatever its perceived worth, would not exist. A private collection or library could easily be burned or thrown into the trash at the end of its owner’s life, or when moving residence (and not all trash-heaps are as future-friendly as the sand at Oxyrhynchus). The library changed all this, and thanks to the libraries and scriptoria, and later printing houses and repositories, copies were made and works were preserved in multiple places, on durable materials, and with rigorous standards.

On the Web, some might say, we do not have libraries to do this job for us, and so when one private collection (a privately registered web domain, say) disappears due to its owner moving residence or losing interest or failing to keep up payments on the domain registration or service provision, all will be lost. Irrevocably and permanently. (No great loss, others would argue.) However, this is not true. There are libraries in the online world. There are digital archives and repositories; the Internet Archive and various search engine caches (among other entities) may be able to recover the lost website from 1998 that Derricourt mourns. Digital libraries set out to make multiple, well-archived, backed-up copies, in open standards and formats and registered with Digital Object Identifiers, of all works in their purview. In short, there are libraries on the web. And it is not therefore true that, as Derricourt argues:

Let’s be realistic – all [sc. online content] will disappear, because no web site is permanent. Only a physical library can maintain and transmit to future generations our heritage of ideas, knowledge, discovery, speculation, literature. I can more easily find an 1898 print article than a 1998 document published on the Web.

In fact, as the world becomes more connected and the Internet becomes the source and the repository for more and more of our information, libraries are going to come under increasing pressure to cut back their accessions, to digitize and archive (or even destroy) their paper collections, and to become custodians of digital rather than physical artefacts. (Don’t get me wrong: I will be in the front line of the fight to defend libraries against this offensive, but the pressure will be there.) It is by no means automatic that physical libraries will always be the best source of cultural and literary preservation in our grandchildren’s time. If no one has bothered to digitize even a 2008 print article, then the 1998 website will be easier to find in one hundred years’ time. I don’t fear for websites. I fear for paper archives that no one is digitizing.

CFP: Digital Humanities 09

Thursday, September 11th, 2008

The Call for Papers for Digital Humanities 09, scheduled for 22-25 June at the University of Maryland, has just been issued. Abstracts are due on 31 October 2008.

MITH’s Digital Dialogues schedule

Thursday, September 4th, 2008

The Maryland Institute for Technology in the Humanities (MITH) has released the fall schedule for their “digital dialogues” lecture series. There are a number of interesting talks. I wonder if any of these will be podcast?

Since the full schedule is only available as a PDF at the moment, I’m taking the liberty of pasting the contents here:

Maryland Institute for Technology in the Humanities
an applied think tank for the digital humanities
Digital Dialogues Schedule
Tuesdays @12:30-1:45
Fall 2008 in MITH’s Conference Room
B0135 McKeldin Library, U. Maryland

  • 9.9 Doug Reside (MITH and Theatre), “The MITHological AXE: Multimedia Metadata Encoding with the Ajax XML Encoder”
  • 9.16 Stanley N. Katz (Princeton University), “Digital Humanities 3.0: Where We Have Come From and Where We Are Now?”
  • 9.23 Joyce Ray (Institute of Museum and Library Services), “Digital Humanities and the Future of Libraries”
  • 9.30 Tom Scheinfeldt and Dave Lester (George Mason University), “Omeka: Easy Web Publishing for Scholarship and Cultural Heritage”
  • 10.7 Brent Seales (University of Kentucky), “EDUCE: Enhanced Digital Unwrapping for Conservation and Exploration”
  • 10.14 Zachary Whalen (University of Mary Washington), “The Videogame Text”
  • 10.21 Kathleen Fitzpatrick (Pomona College), “Planned Obsolescence: Publishing, Technology, and the Future of the Academy”
  • 10.28 “War (and) Games” (a discussion in conjunction with the ARHU semester on War and Representations of War, facilitated by Matthew Kirschenbaum [English and MITH])
  • 11.4 Bethany Nowviskie (University of Virginia), “New World Ordering: Shaping Geospatial Information for Scholarly Use”
  • 11.11 Merle Collins (English), Saraka and Nation (film screening and discussion)
  • 11.18 Ann Weeks (iSchool and HCIL), “The International Children’s Digital Library: An Introduction for Scholars”
  • 11.25 Clifford Lynch (Coalition for Networked Information), title TBA
  • 12.2 Elizabeth Bearden (English), “Renaissance Moving Pictures: From Sidney’s Funeral materials to Collaborative, Multimedia Nachleben”
  • 12.9 Katie King (Women’s Studies), “Flexible Knowledges, Reenactments, New Media”

All talks are free and open to the public!

University of Maryland
McKeldin Library B0131
College Park, MD 20742

Neil Fraistat, Director

