Archive for the ‘Open Source’ Category

Creative Commons and research

Tuesday, May 8th, 2007

A post on the Creative Commons blog draws together four articles on the value of Creative Commons licensing for newspapers, scientists, film students, and Wikipedia “SEOers” respectively. All are worth reading, but it is the article on scientists that is of most interest here. This article, posted at ScienceBlogs on 1st May by Rob Knop, makes the case that:

Scientists do not need, and indeed should not have, exclusive (or any) control over who can copy their papers, and who can make derivative works of their papers.

The very progress of science is based on derivative works! It is absolutely essential that somebody else who attempts to reproduce your experiment be able to publish results that you don’t like if those are the results they have. Standard copyright, however, gives the copyright holders of a paper at least a plausible legal basis on which to challenge the publication of a paper that attempts to reproduce the results— clearly a derivative work!

I would extend this argument (and indeed have done so repeatedly and vocally) to assert that it applies equally to all academic research, including the Humanities. This is a key part of the philosophy behind the Open Source Critical Editions network that I helped convene last year. All published research includes the requirement to publish the “source code” (by way of citations, arguments, primary and secondary references, retraceable argumentation), and the expectation that others will use this “source” to verify, reproduce, modify, or refute your work. Copyright, and especially digital copyright and crippleware, should not be allowed to get in the way of this process because without this freedom a publication cannot be considered research.

Junicode update released

Wednesday, May 2nd, 2007

Keeping with the Unicode/font theme, here is the announcement of the latest release of the useful Junicode font package. Although it focuses on Medieval characters (Correction: and *does* now cover polytonic Greek) it has very good coverage of Latin, symbols, and ligatures, as well as Runic, etc.

Junicode 0.6.13 is now available. Here are the release notes:

This release continues to add characters from the MUFI recommendation to benefit medievalists. Many messy outlines have been cleaned up, improving efficiency and reducing the likelihood of bugs. Most of the goodies in this release are for users of OpenType-aware programs such as InDesign and XeTeX: the OpenType features list has been thoroughly worked over and rationalized, and consistency imposed across all four faces (though it is still true that there are more OpenType features in Regular than in the other three). Use of ccmp, mark and mkmk has been greatly expanded, making use of combining diacritics more practical than before. Many MUFI glyphs have been made accessible via OpenType features, especially ccmp (for glyph+diacritic combination) and hlig (Historical Ligatures). Fractions, Roman numbers, subscripts and the various “Enclosed Alphanumerics” have been made accessible as ligatures (either liga, Standard Ligatures, or dlig, Discretionary Ligatures).

Open Source OCR

Wednesday, April 11th, 2007

Seen in Slashdot and Google Code updates:

Google has just announced work on OCRopus, which it says it hopes will ‘advance the state of the art in optical character recognition and related technologies.’ OCRopus will be available under the Apache 2.0 License. Obviously, there may be search and image search implications from OCRopus. ‘The goal of the project is to advance the state of the art in optical character recognition and related technologies, and to deliver a high quality OCR system suitable for document conversions, electronic libraries, vision impaired users, historical document analysis, and general desktop use. In addition, we are structuring the system in such a way that it will be easy to reuse by other researchers in the field.’


The project is expected to run for three years and support three Ph.D. students or postdocs. We are announcing a technology preview release of the software under the Apache license (English-only, combining the Tesseract character recognizer with IUPR layout analysis and language modeling tools), with additional recognizers and functionality in future releases.

It would be interesting to learn how this application compares in accuracy and power with commercial OCR systems (which have apparently gotten much better since the days when I used to get very frustrated with Omnipage and the like).

ImaNote – Image and Map Annotation Notebook

Wednesday, April 4th, 2007

This looks a useful tool. Anyone tried it? Claims to allow annotation and links to be added to images with RSS to keep track of everything.

Following text copied from Humanist:

We are really happy to announce the release of ImaNote 1.0 version.

ImaNote – (Image and Map Annotation Notebook) is a web-based multi-user tool that allows you, and your friends, to display a high-resolution image or a collection of images online and add annotations and links to them. You simply mark an area on an image (e.g. a map) and write an annotation related to the point.

You can keep track of the annotations using RSS (Really Simple Syndication) or link to them from your own blog/web site/email. The links lead right to the points in the image.

The user management features include resetting lost passwords and account email verification. Through the group management features you can create communities that share images and publish annotations.

ImaNote is Open Source and Free Software released under the GNU General Public Licence (GPL).

ImaNote is a Zope product, written in Python, with a javascript-enhanced interface. Zope and ImaNote run on almost all Operating Systems (GNU/Linux, MacOS X, *BSD, etc.) and Microsoft Windows. It currently works with most modern browsers including Mozilla Firefox, IE7 and Opera.

ImaNote was developed as a collaboration between the Systems of Representation and the Learning Environments research groups of the Media Lab at the University of Art and Design Helsinki, Finland.

For more information go to

Why Blogs should use Creative Commons

Thursday, March 29th, 2007

An interesting discussion on the iCommons blog. Excerpt:

If your intention, as a blogger, is to have your content and your thoughts distributed as widely as possible, then reserving all your rights to your content is counterproductive. A more effective way of distributing your content and still retaining some control over how your content is distributed is using Creative Commons licenses.

This does, as the author argues, remove the need to walk the fine “Fair Use” line when another blogger wants to reproduce, quote from, and engage with your text. It’s an argument worth reading. I suppose that if your blog consists in large part of quotations from other sources there is less of an obvious advantage in this, as your quotations are obviously exempt from the (cc) licence. I should speak to my co-editors both here and over at CE about doing this.

Your data is the next big battle

Wednesday, March 28th, 2007

The trendspotters are saying (rhetorically) “Open Source is dead” and “Open data matters more than Open Source.” What’s clearly meant is: “open data formats matter more …”

Open access — over which critical battles important to readers of this blog continue to rage — is completely overlooked.

Join the Wikipedia Debate

Wednesday, March 28th, 2007

Seen at Academic Commons:

This coming Thursday (29 March 2007), the first Language Lab Unleashed! of the spring will feature Don Wyatt (chair of the Department of History at Middlebury College), Elizabeth Colantoni (Professor of Classics at Oberlin College), Laura Blankenship (Senior Instructional Technologist at Bryn Mawr), and Bryan Alexander (Director of Research at NITLE) for a discussion on the potential uses and abuses of Wikipedia in the educational arena.

The show will begin promptly at 8pm … for details on how to join the live conversation, please visit:

That time’s 20:00 EST = 01:00 UTC. I’m not sure I’ll be able to make it…
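For anyone who wants to check the arithmetic, here is a minimal sketch of the conversion in Python. The two fixed offsets are my own assumptions for illustration: the announcement only says "8pm", and I have read that as Eastern Standard Time (UTC-5). If Eastern Daylight Time (UTC-4) is meant, which was already in force in the US by late March 2007, the start would fall an hour earlier in UTC.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC offsets, assumed for illustration (not stated in the announcement).
EST = timezone(timedelta(hours=-5), "EST")  # Eastern Standard Time
EDT = timezone(timedelta(hours=-4), "EDT")  # Eastern Daylight Time


def to_utc(local: datetime) -> datetime:
    """Convert a timezone-aware local time to UTC."""
    return local.astimezone(timezone.utc)


# 8pm on Thursday 29 March 2007, under each assumption.
show_est = datetime(2007, 3, 29, 20, 0, tzinfo=EST)
show_edt = datetime(2007, 3, 29, 20, 0, tzinfo=EDT)

print(to_utc(show_est).strftime("%Y-%m-%d %H:%M UTC"))  # 2007-03-30 01:00 UTC
print(to_utc(show_edt).strftime("%Y-%m-%d %H:%M UTC"))  # 2007-03-30 00:00 UTC
```

Either way the show starts on Friday 30 March by UTC reckoning.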

Citizendium debuts

Tuesday, March 27th, 2007

from the CHE:

Citizendium Starts With a Little Knowledge

Citizendium, the peer-reviewed “progressive fork” of Wikipedia (The Chronicle, October 18, 2006), has opened for business. The site unveiled its public face on Sunday and as of this afternoon boasts more than 1,100 articles — a far cry from the more than 1.6 million entries in Wikipedia’s English version, but a decent start.

So far the new encyclopedia has a fairly random smattering of material: articles on topics relevant to scholars, like Jacques Derrida and the First Punic War, mingle with puzzling entries on the Bruneian dollar and Don MacLean (the basketball player, not the songwriter responsible for “Vincent.”) And while some pieces — like an essay on autism — seem to be well fleshed out, others — like a write-up on dachshunds — are mere placeholders for more-thorough articles.

None of this is meant as criticism. In fact, it’s fascinating to watch an encyclopedia start from the ground up. It will be worth watching to see whether the encyclopedia’s embrace of soft hierarchy — unlike Wikipedia, Citizendium requires contributors to identify themselves, and it lets a panel of scholars make final decisions on edits — slows its growth. –Brock Read

The entry on the Greek alphabet looks substantial as well.

Wikipedia editing as teaching tool

Sunday, March 25th, 2007

A wonderful suggestion in a comment on Cathy Davidson’s letter (that Tom blogged here a few days ago):

Thanks for your great column. I’ve used the “stubs” feature of Wikipedia to generate a list of 120 topics relating to ancient Roman civilization that need full articles. Then I’m requiring the 120 students in my upcoming Roman Civilization class to each write one article. This will hopefully teach them how to do original research in the library on obscure, narrowly focused topics and then create something of lasting value to others. The students will also be required to each review three of their fellow students’ articles in order to learn about the collaborative editing process. I’m a little nervous about its success, but I’m hoping to be part of the solution to the issues raised by Wikipedia, rather than contributing to the problems.

I’ve heard suggestions of this kind before, but this is one of the coolest implementations of it I’ve come across recently. This makes me wish once again that I was teaching a large class this year so I could do something similar. Kudos to JuliaFelix; please let us know how the experiment works.

Wikis and Blogs in Education

Sunday, March 25th, 2007

Seen in the Creative Commons Feed:

“The wiki is the center of my classroom”

That’s a quote from Wikis and Blogs in Education, one of three educational remixes from students of open content pioneer David Wiley.

The other two are Interviewing Basics and the Open Water Project, an excellent disaster preparedness video that probably everyone should watch.

Each project is licensed under CC Attribution-NonCommercial-ShareAlike and incorporates CC licensed and public domain audio, images, and video as well as original materials.

Wikis and Blogs in Education, potentially the most interesting site for readers of this forum, is a site that combines text and video in an animated Flash and JavaScript framework. It seems to run smoothly, but I don’t know if that would have implications for the free reuse of the material.

MIT Faculty and Libraries Refuse DRM

Wednesday, March 21st, 2007

Seen in Slashdot, MIT Libraries:

The MIT Libraries have canceled access to the Society of Automotive Engineers’ web-based database of technical papers, rejecting the SAE’s requirement that MIT accept the imposition of Digital Rights Management (DRM) technology.

SAE’s DRM technology severely limits use of SAE papers and imposes unnecessary burdens on readers. With this technology, users must download a DRM plugin, Adobe’s “FileOpen,” in order to read SAE papers. This plugin limits use to on-screen viewing and making a single printed copy, and does not work on Linux or Unix platforms.

MIT faculty respond

“It’s a step backwards,” says Professor Wai Cheng, SAE fellow and Professor of Mechanical Engineering at MIT, who feels strongly enough about the implications of DRM that he has asked to be added to the agenda of the upcoming SAE Publication Board meeting in April, when he will address this topic.

It will be interesting to see how publishers respond as this sort of user-revolt escalates.

Middlebury Wikigate Revisited

Tuesday, March 20th, 2007

Back in January, I made some hooting noises and pointed at Jimmy Wales in the context of the tempest-in-a-teapot that erupted after the Middlebury College History Department added Wikipedia to its list of works students may not cite in papers.

One of the more useful published reactions to the whole affair (certainly more useful than mine) seems to me to be Cathy Davidson’s Op-Ed in the Chronicle of Higher Education, “We Can’t Ignore the Influence of Digital Technologies” (53:29, 23 March 2007 [sic!]).
Among other provocative suggestions, she asks:

Rather than banning Wikipedia, why not make studying what it does and does not do part of the research-and-methods portion of our courses? Instead of resorting to the “Delete” button for new forms of collaborative knowledge made possible by the Internet, why not make the practice of research in the digital age the object of study?

Those, like me, who don’t subscribe to the Chronicle can read the letter via Davidson’s blog at HASTAC.

Mellon Award for Technology Collaboration

Thursday, March 15th, 2007

From the Mellon Award for Technology Collaboration website (where you’ll find all the details):

The Program in Research in Information Technology of the Andrew W. Mellon Foundation invites nominations for the 2007 Mellon Awards for Technology Collaboration (MATC). In support of the Program’s mission to encourage collaborative, open source software development within traditional Mellon constituencies, these awards recognize not-for-profit organizations that are making substantial contributions of their own resources toward the development of open source software and the fostering of collaborative communities to sustain open source development.

CC Learn

Friday, March 9th, 2007

Seen in the Creative Commons blog today:

A new division of Creative Commons, provisionally called CC Learn, will focus on education, broadly defined — from kindergarten to graduate school, to lifelong learning. The mission of this new division will be to promote vigorous networks of Open Educational Resources: materials offered freely and openly for educators, students and self-learners to use, modify and re-use for teaching, learning and research. CC Learn is looking for an Executive Director.

What is interesting is not the possibility that someone reading this blog might be interested in applying for an executive director’s position, but that CC are creating a new division especially for educational materials. I have always assumed that teaching materials and other educational resources were the most obvious candidates for CC licensing, so I am now moved to wonder: what particular requirements do educators have of Creative Commons or Open Source licenses?

Gentium resurgens: refined Cyrillic, Unicode 5, smart rendering

Monday, February 19th, 2007

From Victor Gaultney, on the latest regarding the Gentium font by way of the Gentium-Announce List (links mine):

Update #4 – Gentium project revived, Cyrillic, Charis

Dear friends of Gentium,

No – there’s not a new version out yet. :-) But we’re pleased to report that Gentium is under development again after a while in hibernation. We’re actively refining the Cyrillic, adding support for Unicode 5, and preparing the font for the addition of smart rendering support using three different smart font technologies – OpenType, Graphite and Apple AAT.

If you want to see the target character, glyph set and behavior we’ll be supporting in the next version, you can take a look at our Doulos SIL and Charis SIL fonts:

Gentium will support every character and behavior that these fonts do, plus Greek. This also means that if you’re wondering whether the next version will support a specific character, see if it’s in Doulos SIL or Charis SIL. Note that since these fonts do not support full Greek, some of the Greek improvements (digamma, etc.) will not be there, but will be in Gentium.

Because we want to get this major upgrade to you as soon as possible, the next version will still be only regular and italic. We hope, however, to get it to you sometime mid-year (that’s 2007, if we’re able to keep on track).

One more little note: Since Gentium has been released under the SIL Open Font License, it has gained lots of support in the GNU/Linux community. It has also made its way into some Linux distributions, and even has been shown on the OLPC (One Laptop Per Child). There’s a good pic (in both large and small resolutions) of Gentium Greek on the OLPC at:

Thanks for your continued interest in Gentium!

PDF Specification released to AIIM/ISO

Wednesday, January 31st, 2007

Seen (via Slashdot) in Technoracle:

Adobe announced it will release the entire PDF specification (current version 1.7) to the International Standards Organization (ISO) via AIIM. PDF has reached a point in its maturity cycle where maintaining it in an open standards manner is the next logical step in evolution. Not only does this reinforce Adobe’s commitment to open standards (see also my earlier blog on the release of flash runtime code to the Tamarin open source project at Sourceforge), but it demonstrates that open standards and open source strategies are really becoming a mainstream concept in the software industry.

So what does this really mean? Most people know that PDF is already a standard so why do this now? This event is very subtle yet very significant. PDF will go from being an open standard/specification and de facto standard to a full-blown de jure standard. The difference will not affect implementers much given PDF has been a published open standard for years. There are some important distinctions however. First – others will have a clearly documented process for contributing to the future of the PDF specification.

(See full article at source.)

Does this have implications for the takeup of XPS in the future?

Second Life experiment in social copyright

Sunday, January 21st, 2007

I spotted this several weeks ago in Wired magazine, but have only just gotten around to taking it in fully. The scenario:

Businesses in Second Life are in an uproar over a rogue [ed. note: modified from Open Source] software program that duplicates “in world” items. They should be. But the havoc sown by Copybot promises to transform the virtual world into a bold experiment in protecting creative work without the blunt instrument of copyright law.

Linden Lab, the owner of Second Life, decided against employing DRM (which “won’t work”) or adjudicating copyright disputes itself, and has instead added creator and creation-date indicators to all items.

The next phase of Linden’s response is more interesting. The company plans to develop an infrastructure to enable Second Life residents and landowners to enforce IP-related covenants within certain areas, or as a prerequisite for joining certain groups. In effect, Second Life’s inhabitants will self-police their world, according to rules and social norms they develop themselves.

There are some interesting comments in the full article about the innovation-incentive value of copyright, and the possible success of social norms, as against enforceable law, as a means of controlling this.

Another Reason for Opening Access to Research

Wednesday, January 17th, 2007

Seen in the Creative Commons feed, an article in the British Medical Journal by John Wilbanks, executive director of the Science Commons, on why scientific research needs to be Open Access (and his arguments apply to all academic research, of course):

Summary points

Authors should be prioritising open access to their works—for the good of other scientists and to ensure that the full benefits of the internet and advanced technology may be realised

Open access is rapidly becoming a mainstream idea in scholarly publishing, with more than 2000 open access journals and more than a million author self archived open access papers

Legal and technical barriers to open access are easily overcome using freely available tools

Full article at

New Journal: Open Access Research

Thursday, January 11th, 2007

A new journal entitled Open Access Research (OAR) is now accepting submissions and plans its first issue (thereafter, thrice a year) in August 2007. It’s described as “a peer-reviewed, open-access journal that will enable greater interaction and facilitate a deeper conversation about open access.”

By way of Dorothea Salo’s Caveat Lector and

Open Course Ware

Thursday, January 11th, 2007

Seen in Slashdot, a comment by Kent Simon:

“Many people may not know that MIT has initiated OpenCourseWare, an initiative to share all of their educational resources with the public. This generous act is intended (in classical MIT style) to make knowledge free, open, and available. It’s a great resource for people looking to improve their knowledge of our world. OpenCourseWare should prove exceptionally beneficial to those who may not be able to afford the quality of education offered at a school like MIT. Here’s a link to all currently available courses. It is expected that by the end of the year every course offered at MIT will be available on the OpenCourseWare site, including lecture notes, homework assignments, and exams. OpenCourseWare is not offered to replace collegiate education, but rather to spread knowledge freely.”

Second Life to open code

Wednesday, January 10th, 2007

I’ve posted here several times about the educational fun to be had with ancient and other reconstructions in Second Life (see e.g. 3D Egyptian Archaeology in Second Life). Now more good news from Linden Lab, which may make SL an even more user-designed and progressive virtual world environment. This announcement seen in Lawrence Lessig’s blog:

I’ve been a long time supporter of SecondLife. Yesterday, they made me proud. SecondLife announced it will GPL its client software. And it committed itself to freeing the back-end as well. How significant is SecondLife? Here’s a really interesting empirical study by Tristan Louis about SecondLife activity.

Integration Proclamation

Tuesday, January 9th, 2007

Many authors and readers of content on this blog are deeply concerned about issues of interoperability, data integration (and similar terms) as applied to humanities computing. Greg Crane’s recent response to the draft statement of the joint APA/AIA task force on Electronic Publication puts this issue center-stage. Under the leadership of Neel Smith and Chris Blackwell, the Technical Working Group of the Center for Hellenic Studies is pushing forward efforts to identify and develop (when necessary) mechanisms for data interchange and actionable digital citation. My own work, with many collaborators, on EpiDoc and Pleiades places a high value on interoperability. I could go on with more examples …

In light of all this concern, it’s worth noting a signature drive that has just started: The Integration Proclamation. It looks like this effort originates from the community action, advocacy and non-governmental organization community, but the basic issue is the same: too many of our systems and datasets are walled gardens; until we can share data and behaviors seamlessly, we’ll be hobbled in our attempts to do good stuff.

Grassroots book-scanning for uncompromising OA

Saturday, December 23rd, 2006

As complaints multiply about quality control in the Google book scanning initiative, this sort of approach begun by Nicholas Hodson looks increasingly promising to me. (Had to laugh about the blue and the pink coding, though!)

Call for examples from “TEI by Example”

Saturday, December 16th, 2006

The Centre for Scholarly Editing and Document Studies (CTB) of the Royal Academy of Dutch Language and Literature, the Centre for Computing in the Humanities (CCH) of King’s College London, and the School for Library, Archive, and Information Studies (SLAIS) of University College London, are involved in the joint project “TEI by Example”.

Featuring freely available online tutorials walking individuals through the different stages in marking up a document in TEI (Text Encoding Initiative), these online tutorials will provide examples for users of all levels. Examples will be provided of different document types, with varying degrees in the granularity of markup, to provide a useful teaching and reference aid for those involved in the marking up of texts.

Eight tutorial modules will address a wide range of issues in text encoding with TEI:

1. Introduction to text encoding with TEI
2. The TEI header
3. Prose
4. Poetry
5. Drama
6. Manuscript Transcription
7. Scholarly Editing
8. Customizing TEI, ODD, Roma

To build as much as possible on available sources of existing practice in the field and to be able to present a broad view on the wide variety of encoding practices, we warmly welcome you to contribute TEI-encoded examples (either fragments or complete texts) that are applicable to any of these subjects. Examples are preferably encoded as TEI P5 XML texts, but also texts encoded in TEI P4 XML, other XML formats, or other (documented) electronic formats are of interest. Even examples of less-ideal encoding practices are welcome, since the idea of learning by error is a valuable didactic principle. Please do provide some indication of the errors or controversies in such examples when appropriate. After selection and editing, the example fragments will be incorporated in the freely available online deliverables, which will be issued under a Creative Commons Attribution ShareAlike license. All contributors will be credited.

The examples can be sent (preferably compressed in .zip format and with an indication of applicability and credits due) to the project team. Please do not hesitate to contact us for any inquiries regarding copyright issues or any more general issues.

Kind regards,

The project team:

Ron Van den Branden, Melissa Terras, Edward Vanhoutte

Grumentum: latest from Troels Myrup

Saturday, December 16th, 2006

Troels Myrup has added new images to the Stoa Image Gallery:

I’ve added photos from Grumentum in Lucania (Southern Italy) to the Stoa Gallery. Charming place but fairly remote. It’s kind of hard to think that tourists will ever come here in hordes as the local council seems to believe. Recently, a new project has excavated the baths and parts of the forum, shedding new light on the city’s late antique development. The project’s website is currently offline.