Archive for the ‘Digital library’ Category

Cataloguing Open Access Classics Serials

Friday, March 17th, 2017

The Institute of Classical Studies is pleased to announce the appointment of Simona Stoyanova for one year as a new Research Fellow in Library and Information Science on the Cataloguing Open Access Classics Serials (COACS) project, funded by a development grant from the School of Advanced Study.

COACS will leverage various sites that list or index open access (OA) publications, especially journals and serials, in classics and ancient history, so as to produce a resource that subject libraries may use to automatically catalogue the publications and articles therein. The project is based in the ICS, supervised by the Reader in Digital Classics, Gabriel Bodard, and the Combined Library, with the support of Paul Jackson and Sue Willetts. Other digital librarians and scholars including Richard Gartner and Raphaële Mouren in the Warburg Institute; Patrick Burns and Tom Elliott from the Institute for the Study of the Ancient World (NYU); Charles Jones from Penn State; and Matteo Romanello from the German Archaeological Institute are providing further advice.

Major stages of work will include:

  1. Survey of AWOL: We shall assess the regularity of metadata in the open access journals listed at AWOL (which currently lists 1521 OA periodicals, containing a little over 50,000 articles), and estimate what proportion of these titles expose metadata in standard formats that would enable harvesting in a form amenable to import into library catalogues. A certain amount of iteration and even manual curation of data is likely to be necessary. The intermediate dataset will need to be updated and incremented over time, rather than overwritten entirely on each import.
  2. Intermediate data format: We will also decide on the intermediate format (containing MARC data), which in addition to being ingested by the Combined Library will be made available for use by other libraries (e.g. NYU Library and the German Archaeological Institute’s Zenon catalogue). The addition of catalogued OA serials and articles to the library catalogue will significantly contribute to the research practice of scholars and other library users, enabling new research outputs from the Institute and enhancing the open access policy of the School.
  3. Further OA indexes: Once the proof-of-concept is in place and data is being harvested from AWOL (and we have tested that each harvest updates pre-existing titles rather than overwriting or duplicating them), we shall experiment with harvesting similar data from other indexes of OA content, such as DOAJ, OLH, Persée, DialNet, TOCS-IN, and perhaps even institutional repositories.
  4. Publish open access software: All code for harvesting OA serials and articles, and for ingest by library catalogues, will be made available through GitHub. This code will then be available for updating the intermediate data to take advantage of new titles that are added to AWOL and other resources, and new issues of serials that are already listed. This will enable reuse of our scripts and data by other libraries and similar institutions.
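
To make the "update rather than overwrite or duplicate" requirement in stages 1 and 3 concrete, here is a minimal sketch in Python. It is not the COACS code: the `SerialRecord` fields, the use of the serial's URL as a stable key, and the `merge_harvest` function are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class SerialRecord:
    """Hypothetical intermediate record for one OA serial (not the COACS schema)."""
    url: str                      # assumed to be a stable key for the serial
    title: str
    issn: Optional[str] = None
    articles: Set[str] = field(default_factory=set)  # article identifiers


def merge_harvest(existing: Dict[str, SerialRecord],
                  harvested: List[SerialRecord]) -> Dict[str, int]:
    """Fold a new harvest into `existing` in place.

    New serials are added; known serials gain any new articles and any
    missing ISSN, but values already present are never overwritten.
    """
    counts = {"added": 0, "updated": 0, "unchanged": 0}
    for rec in harvested:
        current = existing.get(rec.url)
        if current is None:
            existing[rec.url] = rec
            counts["added"] += 1
            continue
        changed = False
        # Fill a gap, but keep any value that may have been manually curated.
        if current.issn is None and rec.issn is not None:
            current.issn = rec.issn
            changed = True
        new_articles = rec.articles - current.articles
        if new_articles:
            current.articles |= new_articles
            changed = True
        counts["updated" if changed else "unchanged"] += 1
    return counts
```

Keying on one identifier is what prevents duplicates, and merging field by field is what protects manual curation from being lost on re-import. A real pipeline would also need to handle serials whose URLs change, which this sketch ignores.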

By the end of the pilot project, we will have: made available and documented the intermediate dataset and harvesting and ingest code; performed a test ingest of the data into the ICS library catalogue; engaged known (NYU, Zenon, BL) and newly discovered colleagues in potentially adding to and using this data; explored the possibility of seeking external funding to take this project further.

We consider this project to be a pilot for further work, for which we intend to seek external funding once a proof of concept is in place. We hope to be able to build on this first phase of work by: extending the methods to other disciplines, especially those covered by the other institute libraries in SAS; enabling the harvest of full text from serials whose licenses permit it, for search and other textual research such as text-mining and natural language processing; disambiguating and enhancing the internal and external bibliographical references to enable hyperlinks to primary and secondary sources where available.

CFP: Seminar on Latin textual criticism in the digital age

Wednesday, October 1st, 2014

The Digital Latin Library, a joint project of the Society for Classical Studies, the Medieval Academy of America, and the Renaissance Society of America, with funding from the Andrew W. Mellon Foundation, announces a seminar on Latin textual criticism in the digital age. The seminar will take place on the campus of the University of Oklahoma, the DLL’s host institution, on June 25–26, 2015.

We welcome proposals for papers on all subjects related to the intersection of modern technology with traditional methods for editing Latin texts of all eras. Suggested topics:

  • Keeping the “critical” in digital critical editions
  • The scholarly value of editing texts to be read by humans and machines
  • Extending the usability of critical editions beyond a scholarly audience
  • Visualizing the critical apparatus: moving beyond a print-optimized format
  • Encoding different critical approaches to a text
  • Interoperability between critical editions and other digital resources
  • Dreaming big: a wishlist of features for the optimal digital editing environment

Of particular interest are proposals that examine the scholarly element of preparing a digital edition.

The seminar will be limited to ten participants. Participants will receive a stipend, and all travel and related expenses will be paid by the DLL.

Please send proposals of no more than 650 words to Samuel J. Huskey by December 1, 2014. Notification of proposal status will be sent in early January.

Editing Texts and Digital Libraries: 2 seminars in Leipzig

Thursday, May 15th, 2014

Posted for Greta Franzini:

Next week the Humboldt Chair of Digital Humanities is hosting two seminars as part of its Digital Philology course:

1) Monday May 19th, 3:15-4:45pm, University of Leipzig (Paulinum, room P801)
“Editing Texts in Context: Two Case Studies” by Rebecca Finnigan, Christine Bannan and Prof. Neel D. Smith, College of the Holy Cross

2) Tuesday May 20th, 9:15-10:45am, University of Leipzig (Paulinum, room P801)
“digilibLT – a Digital Library of Late Latin Texts” by Prof. Maurizio Lana, Università del Piemonte Orientale (Italy)

For more information, please visit

Perseus Catalog Released

Friday, June 21st, 2013

From Lisa Cerrato via the Digital Classicist List:

The Perseus Digital Library is pleased to announce the 1.0 Release of the Perseus Catalog.

The Perseus Catalog is an attempt to provide systematic catalog access to at least one online edition of every major Greek and Latin author (both surviving and fragmentary) from antiquity to 600 CE. Still a work in progress, the catalog currently includes 3,679 individual works (2,522 Greek and 1,247 Latin), with over 11,000 links to online versions of these works (6,419 to Google Books, 5,098 to the Internet Archive, 593 to the Hathi Trust). The Perseus interface now includes links to the Perseus Catalog from the main navigation bar, and also from within the majority of texts in the Greco-Roman collection.

The metadata contained within the catalog has utilized the MODS and MADS standards developed by the Library of Congress as well as the Canonical Text Services and CTS-URN protocols developed by the Homer Multitext Project.  The Perseus catalog interface uses the open source Blacklight Project interface and Apache Solr. Stable, linkable canonical URIs have been provided for all textgroups, works, editions and translations in the Catalog for both HTML and ATOM output formats. The ATOM output format provides access to the source CTS, MODS and MADS metadata for the catalog records. Subsequent releases will make all catalog data available as RDF triples.
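
For readers unfamiliar with CTS URNs, the sketch below shows how such an identifier breaks down into namespace, textgroup, work, version and passage. It is a simplified illustration and not code from the Perseus Catalog: the `parse_cts_urn` function and `CtsUrn` type are hypothetical, the sample URN is given only as an example of the general shape, and the parser ignores the exemplar level and subreferences.

```python
from typing import NamedTuple, Optional


class CtsUrn(NamedTuple):
    """Components of a CTS URN (simplified; hypothetical helper type)."""
    namespace: str
    textgroup: str
    work: Optional[str]
    version: Optional[str]
    passage: Optional[str]


def parse_cts_urn(urn: str) -> CtsUrn:
    """Split a URN of the form urn:cts:NAMESPACE:TEXTGROUP[.WORK[.VERSION]][:PASSAGE]."""
    parts = urn.split(":")
    if len(parts) not in (4, 5) or parts[0] != "urn" or parts[1] != "cts":
        raise ValueError("not a CTS URN: %r" % urn)
    namespace = parts[2]
    hierarchy = parts[3].split(".")       # textgroup, then work, then version
    textgroup = hierarchy[0]
    work = hierarchy[1] if len(hierarchy) > 1 else None
    version = hierarchy[2] if len(hierarchy) > 2 else None
    # The passage component is optional and may be empty.
    passage = parts[4] if len(parts) == 5 and parts[4] else None
    return CtsUrn(namespace, textgroup, work, version, passage)


# Illustrative URN: a textgroup, a work, one edition, and a passage reference.
example = parse_cts_urn("urn:cts:greekLit:tlg0012.tlg001.perseus-grc1:1.1")
print(example.textgroup, example.work, example.version, example.passage)
```

Because each level of the hierarchy is optional from the right, the same scheme can point to a textgroup, a work, a particular edition or translation, or a passage within one, which matches the levels at which the Catalog exposes canonical URIs.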

Other major plans for the future of the catalog include not only adding more authors and works, as well as links to online versions, but also opening up the catalog to contributions from users. Currently the catalog does not include any user contribution or social features other than standard email contact information, but the goal is to soon support the creation of user accounts and the contribution of recommendations, corrections and/or new metadata.

The Perseus Catalog blog features documentation, a user guide, and contact information as well as comments from Editor-in-Chief Gregory Crane on the history and purpose of the catalog.

The Perseus Digital Library Team

Duke Collaboratory for Classics Computing (DC3)

Wednesday, May 8th, 2013


We are very pleased to announce the creation of the Duke Collaboratory for Classics Computing (DC3), a new Digital Classics R&D unit embedded in the Duke University Libraries, whose start-up has been generously funded by the Andrew W. Mellon Foundation and Duke University’s Dean of Arts & Sciences and Office of the Provost.

The DC3 goes live 1 July 2013, continuing a long tradition of collaboration between the Duke University Libraries and papyrologists in Duke’s Department of Classical Studies. The late Professors William H. Willis and John F. Oates began the Duke Databank of Documentary Papyri (DDbDP) more than 30 years ago, and in 1996 Duke was among the founding members of the Advanced Papyrological Information System (APIS). In recent years, Duke led the Mellon-funded Integrating Digital Papyrology effort, which brought together the DDbDP, Heidelberger Gesamtverzeichnis der Griechischen Papyrusurkunden Ägyptens (HGV), and APIS in a common search and collaborative curation environment, and which collaborates with other partners, including Trismegistos, Bibliographie Papyrologique, Brussels Coptic Database, and the Arabic Papyrology Database.

The DC3 team will see to the maintenance and enhancement of data and tooling, cultivate new partnerships in the papyrological domain, experiment in the development of new complementary resources, and engage in teaching and outreach at Duke and beyond.

The team’s first push will be in the area of Greek and Latin Epigraphy, where it plans to leverage its papyrological experience to serve a much larger community. The team brings a wealth of experience in fields like image processing, text engineering, scholarly data modeling, and building scalable web services. It aims to help create a system in which the many worldwide digital epigraphy projects can interoperate by linking into the graph of scholarly relationships while maintaining the full force of their individuality.

The DC3 team is:

Ryan BAUMANN: Has worked on a wide range of Digital Humanities projects, from applying advanced imaging and visualization techniques to ancient artifacts, to developing systems for scholarly editing and collaboration.

Hugh CAYLESS: Has over a decade of software engineering expertise in both academic and industrial settings. He also holds a Ph.D. in Classics and a Master’s in Information Science. He is one of the founders of the EpiDoc collaborative and currently serves on the Technical Council of the Text Encoding Initiative.

Josh SOSIN: Associate Professor of Classical Studies and History, Co-Director of the DDbDP, Associate editor of Greek, Roman, and Byzantine Studies; an epigraphist and papyrologist interested in the intersection of ancient law, religion, and the economy.


Open Philology Project Announced

Thursday, April 4th, 2013

Via Marco Büchler, Greg Crane has just posted “The Open Philology Project and Humboldt Chair of Digital Humanities at Leipzig” at Perseus Digital Library Updates.

Abstract: The Humboldt Chair of Digital Humanities at the University of Leipzig sees in the rise of Digital Technologies an opportunity to re-assess and re-establish how the humanities can advance the understanding of the past and to support a dialogue among civilizations. Philology, which uses surviving linguistic sources to understand the past as deeply and broadly as possible, is central to these tasks, because languages, present and historical, are central to human culture. To advance this larger effort, the Humboldt Chair focuses upon enabling Greco-Roman culture to realize the fullest possible role in intellectual life. Greco-Roman culture is particularly significant because it contributed to both Europe and the Islamic world and the study of Greco-Roman culture and its influence thus entails Classical Arabic as well as Ancient Greek and Latin. The Humboldt Chair inaugurates an Open Philology Project with three complementary efforts that produce open philological data, educate a wide audience about historical languages, and integrate open philological data from many sources: the Open Greek and Latin Project organizes content (including translations into Classical Arabic and modern languages); the Historical Language e-Learning Project explores ways to support learning across barriers of language and culture as well as space and time; the Scaife Digital Library focuses on integrating cultural heritage sources available under open licenses.

Details of the project, its components, and rationale are provided in the original post.

BMCR review of Berti/Costa

Monday, February 14th, 2011

In BMCR 2011-02-24 last week, Alexandra Trachsel reviews:

Monica Berti, Virgilio Costa, La Biblioteca di Alessandria: storia di un paradiso perduto. Ricerche di filologia, letteratura e storia 10. Roma: Edizioni Tored, 2010. Pp. xvi, 279. ISBN 9788888617343. €30.00 (pb).

Of particular interest to Stoans and Digital Classicists is the final section on massive digital libraries such as Google Books and Europeana, and lessons both draw (and should learn) from the ancient Library of Alexandria.

BSR Digital Collections online

Friday, November 27th, 2009

Alessandra Giovenco writes to announce the following:

The British School at Rome Library & Archive is pleased to announce the launch of a new website:

In July 2007, the Getty Foundation awarded a second generous grant to the British School at Rome Archive (BSR) to support the arrangement and description of part of the John Bryan Ward-Perkins photographic collection. As a result of this 2-year project, a website of the BSR digital collections was created to present not only the photographic material (Photographs) but also other types of resources which fall into different categories: Maps, Prints, Documents, Postcards, Drawings, Paintings and Manuscripts.

The majority of the digital images displayed on this website are represented by the photographs catalogued during the second Getty Foundation funded project.

(Note: I should also add that the BSR’s Ward-Perkins collection provided most of the photographs for the recently published Inscriptions of Roman Tripolitania.)

New Digital Humanities/Libraries/Museums Calendar

Friday, February 27th, 2009

Amanda French has started a publicly accessible calendar of conferences and events related to “Digital Humanities, Digital Libraries and Digital Museums.”

The Digital Archimedes Palimpsest Released

Wednesday, October 29th, 2008

Very exciting news – the complete dataset of the Archimedes Palimpsest project (ten years in the making) has been released today. The official announcement is copied below, but I’d like to point out what I think it is that makes this project so special. It isn’t the object – the manuscript – or the content – although I’m sure the previously unknown texts are quite exciting for scholars. It isn’t even the technology, which includes multispectral imaging used to separate out the palimpsest from the overlying text and the XML transcriptions mapped to those images (although that’s a subject close to my heart).

What’s special about this project is its total dedication to open access principles, and an implied trust in the way it is being released that open access will work. There is no user interface. Instead, all project data is being released under a Creative Commons 3.0 attribution license. Under this license, anyone can take this data and do whatever they want to with it (even sell it), as long as they attribute it to the Archimedes Palimpsest project. The thinking behind this is that, by making the complete project data available, others will step up and build interfaces… create searches… make visualizations… do all kinds of cool stuff with the data that the developers might not even consider.

To be fair, this isn’t the only project I know of that is operating like this; the complete high-resolution photographs and accompanying metadata for manuscripts digitized through the Homer Multitext project are available freely, as the other project data will be when it’s completed, although the HMT as far as I know will also have its own user interface. There may be others as well. But I’m impressed that the project developers are releasing just the data, and trusting that scholars and others will create user environments of their own.

The Stoa was founded on principles of open access. It’s validating to see a high-visibility project such as the Archimedes Palimpsest take those principles seriously.

Ten years ago today, a private American collector purchased the Archimedes Palimpsest. Since that time he has guided and funded the project to conserve, image, and study the manuscript. After ten years of work, involving the expertise and goodwill of an extraordinary number of people working around the world, the Archimedes Palimpsest Project has released its data. It is a historic dataset, revealing new texts from the ancient world. It is an integrated product, weaving registered images in many wavebands of light with XML transcriptions of the Archimedes and Hyperides texts that are spatially mapped to those images. It has pushed boundaries for the imaging of documents, and relied almost exclusively on current international standards. We hope that this dataset will be a persistent digital resource for the decades to come. We also hope it will be helpful as an example for others who are conducting similar work. It is published under a Creative Commons 3.0 attribution license, to ensure ease of access and the potential for widespread use. A complete facsimile of the revealed palimpsested texts is available on Googlebooks as “The Archimedes Palimpsest”. It is hoped that this is the first of many uses to which the data will be put.

For information on the Archimedes Palimpsest Project, please visit:

For the dataset, please visit:

We have set up a discussion forum on the Archimedes Palimpsest Project. Any member can invite anybody else to join. If you want to become a member, please email:

I would be grateful if you would circulate this to your friends and colleagues.

Thank you very much

Will Noel
The Walters Art Museum
October 29th, 2008.

In defence of biblioclasm

Saturday, September 13th, 2008

Charlotte Roueché pointed me to this transcript of a piece from ABC Radio’s Perspective slot: ‘Our Biblioclastic Century’. The author, Robin Derricourt, an academic publisher with a background in archaeology and history, makes some well-observed points about online publication and the need for sustainability of publication and citation if we are to retain the intellectual and academic output of our culture. With none of this can I disagree. However, he then ends this short, pithy piece with the somewhat knee-jerk conclusion:

I know that my grandchildren will be able to go into a library and read an article by Einstein, a book by Newton, or a manuscript by Captain James Cook, and those by their minor contemporaries. I do not know that they will be able to access the reports, documents and articles that I can read today only on some present day institution’s website. In fact I can be pretty sure most of this will not survive.

And when our own civilisation finally ends, as each civilisation does, where will be the repository that maintains what we now have as knowledge, perhaps even through some future dark ages, for later societies to inherit? They will still have Aristotle, or Darwin, but they may not have the 21st century equivalents to read.

It is important to recognise that this is the well-thought out fear of an informed and intelligent person, and that those of us working for digital sustainability therefore need to communicate our aims and achievements more widely. I cannot help, however, but point out a logical fallacy in this argument: Derricourt assumes the existence of the physical library full of books (as well he might, the library is an institution that will not go away any time soon). But the library has not always existed, and it was by no means automatic or self-evident that the library would come into existence.

If these cultural and academic institutions had not come into being at several points in history (often associated with the courts of kings or religious communities), then books would be in no better shape than websites are now (or rather websites in the world that still exists in Derricourt’s imagination, which was the world of the early Web of the 1990s). Individual copies would have circulated in private collections, some would occasionally have been copied, but not on the scale and with the rigour that we saw in Mediaeval monasteries, for example. The idea of the repository that holds a copy of everything published in a certain domain, whatever its perceived worth, would not exist. A private collection or library could easily be burned or thrown into the trash at the end of its owner’s life, or when moving residence (and not all trash-heaps are as future-friendly as the sand at Oxyrhynchus). The library changed all this, and thanks to the libraries and scriptoria, and later printing houses and repositories, copies were made and works were preserved in multiple places, on durable materials, and with rigorous standards.

On the Web, some might say, we do not have libraries to do this job for us, and so when one private collection (a privately registered web domain, say) disappears due to its owner moving residence or losing interest or failing to keep up payments on the domain registration or service provision, all will be lost. Irrevocably and permanently. (No great loss, others would argue.) However this is not true. There are libraries in the online world. There are digital archives and repositories; the Internet Archive and various search engine caches (among other entities) may be able to recover the lost website from 1998 that Derricourt mourns. Digital libraries set out to make multiple, well-archived, backed-up copies, in open standards and formats and registered with Digital Object Identifiers, of all works in their purview. In short, there are libraries on the web. And it is not therefore true that, as Derricourt argues:

Let’s be realistic – all [sc. online content] will disappear, because no web site is permanent. Only a physical library can maintain and transmit to future generations our heritage of ideas, knowledge, discovery, speculation, literature. I can more easily find an 1898 print article than a 1998 document published on the Web.

In fact, as the world becomes more connected and the Internet becomes the source and the repository for more and more of our information, libraries are going to come under increasing pressure to cut back their accessions, to digitize and archive (or even destroy) their paper collections, and to become custodians of digital rather than physical artefacts. (Don’t get me wrong: I will be in the front line of the fight to defend libraries against this offensive, but the pressure will be there.) It is by no means automatic that physical libraries will always be the best source of cultural and literary preservation in our grandchildren’s time. If no one has bothered to digitize even a 2008 print article, then the 1998 website will be easier to find in one hundred years’ time. I don’t fear for websites. I fear for paper archives that no one is digitizing.


Tuesday, July 8th, 2008

Michael E. Smith has just blogged an opinion piece on self-archiving.

Article on PSWPC in LLC June 2008

Monday, May 26th, 2008

David Pritchard, “Working Papers, Open Access, and Cyber-infrastructure in Classical Studies” Literary and Linguistic Computing 2008 23: 149-162; doi:10.1093/llc/fqn005.

Princeton—Stanford Working Papers in Classics (PSWPC) is a web-based series of work-in-progress scripts by members of two leading departments of classics. It introduces the humanities to a new form of scholarly communication and represents a major advance in the free availability of classical-studies scholarship in cyberspace. This article both reviews the initial performance of this open-access experiment and the benefits and challenges of working papers more generally for classical studies. After 2 years of operation PSWPC has proven to be a clear success. This series has built up a large international readership and a sizeable body of pre-prints and performs important scholarly and community-outreach functions. As this performance is largely due to its congruency with the working arrangements of ancient historians and classicists and the global demand for open-access scholarship, the series confirms the viability of this means of scholarly communication, and the likelihood of its expansion in our discipline. But modifications are required to increase the benefits this series brings and the amount of scholarship it makes freely available online. Finally, departments wishing to replicate its success will have to consider other important developments, such as the increasing availability of post-prints, the linking of research funding to open access, and the emergence of new cyber-infrastructure.

Microsoft Ends Book and Article Scanning

Saturday, May 24th, 2008

Miguel Helft, writing in the New York Times, reports:

Microsoft said Friday that it was ending a project to scan millions of books and scholarly articles and make them available on the Web … Microsoft’s decision also leaves the Internet Archive, the nonprofit digital archive that was paid by Microsoft to scan books, looking for new sources of support.

The blog post in question (by Satya Nadella, Senior vice president, search, portal and advertising) indicates that both Live Search Books and Live Search Academic (the latter being Microsoft’s competitor with Google Scholar) will be shut down next week:

Books and scholarly publications will continue to be integrated into our Search results, but not through separate indexes. This also means that we are winding down our digitization initiatives, including our library scanning and our in-copyright book programs.

For its part, the Internet Archive has posted a short response addressing the situation, and focusing on the status of the out-of-copyright works Microsoft scanned and the scanning equipment they purchased (both have been donated to IA restriction-free), and on the need for eventual public funding of the IA’s work.

This story is being widely covered and discussed elsewhere; a Google News Search rounds up most sources.

Whither scholarly digitization efforts?

Thursday, April 3rd, 2008

One of the authors at Thoughts on Antiquity has posted a provocative reflection on a long-standing effort to digitize an out-of-copyright translation of Cyril of Alexandria’s Commentary on Luke. In light of technological change, the big book-scanning projects and the continued operation of APh, the author expresses uncertainty about how or whether to proceed.

What is the role of the humanist scholar (and his home institution, and her professional society) in the era of big digitization? Readers of this blog know about the on-going Million Books discussions. I’ve opined elsewhere that the creation of stable, sustainable, massively interlinked scholarly reference works is a critical contribution. The issue also surfaces regularly in attempts to define “digital scholarship in the humanities” and to organize funding for it. Yet, clearly the questions are arising spontaneously in many quarters and there is not yet a field-wide dialog on the subject.

We may agree with Steven Wheatley that:

The day will come, not that far off, when modifying humanities with ‘digital’ will make no more sense than modifying humanities with ‘print.’ (in A. Guess, “Rise of the Digital NEH,” Inside Higher Ed, 3 April 2008).

Ask your colleagues: what is your role in getting there and how will you work when we’ve arrived? Comments welcome.