Code4Lib North Meeting at Ryerson

I attended my first Code4Lib North conference this past week, something I’ve never quite felt “geeky” enough to do in the past.  This year an implicit invitation to technical services librarians encouraged us non-coding types to participate.  The result was a gathering rather similar to what I’ve experienced at Access conferences.

Kudos to the folks at Ryerson for hosting and putting on such a well-organized event.  It was great to meet and learn from participants from the local and regional community, and the unconference/unthemed style worked very well.

There was a wonderful flow that emerged on the first day, beginning with Alan Harnum’s talk on the idea of Computational Thinking.  This set up the inclusive tone of the conference nicely, noting that although some of us are not coders, our activities as librarians, such as cataloguing and project management, draw on and apply computational thinking processes.

Alan drew on Jonathan Rochkind’s blog post as inspiration for his talk and outlined the steps of the computational thinking process: decomposition; pattern recognition; pattern generalization and abstraction; and algorithm design.  He emphasized the importance of analysis during the decomposition phase, noting that because failure can surface late in a project, considerable work may be done before anyone realizes that the idea was not such a great one.  Breaking the problem down allows potentially fatal flaws to be identified and corrected earlier in the life of the project.

Peter Senge’s book The Fifth Discipline came up during the Q & A and, although it comes out of the computing environment of the 1980s and is therefore pre-internet, it sounds like it might still be worth investigating.

Alison Hitchens then took the floor and talked about RDA.  She reminded us that the purpose of RDA is to create “data for discovery” and that the rewriting of the cataloguing code has resulted in a clearer identification of the bibliographic data elements.  She talked a bit about the RDA Vocabularies and how they have been used by the Bibliothèque nationale de France in their linked data experiments.

MJ Suhonos, one of the conference organizers and a fellow linked data enthusiast, then talked about metadata formats.  He evaluated a number of formats against the criteria of serialization, encoding, and schema.  The JSON MARC-HASH format seemed to satisfy all three and is considered one of the better formats to use.
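For the curious, here is a rough sketch of what a record in a MARC-HASH-style layout might look like and how the three criteria play out: JSON handles the serialization, JSON's Unicode text handles the encoding, and a small fixed layout of leader plus fields acts as the schema.  The layout below follows my understanding of the MARC-HASH convention (control fields as tag/value pairs, data fields as tag, two indicators, and a list of subfield pairs); treat the exact key names as an assumption, and note that the record contents and the `subfield_values` helper are made up for illustration and were not part of MJ's talk.

```python
import json

# A made-up bibliographic record in a MARC-HASH-style layout (assumed shape).
record = {
    "type": "marc-hash",
    "version": [1, 0],
    "leader": "00000nam a2200000 a 4500",
    "fields": [
        # Control field: [tag, value]
        ["001", "example0001"],
        # Data field: [tag, indicator 1, indicator 2, [[code, value], ...]]
        ["245", "1", "0", [["a", "An example title /"], ["c", "A. N. Author."]]],
    ],
}


def subfield_values(rec, tag, code):
    """Return every value of subfield `code` found in data fields with `tag`."""
    values = []
    for field in rec["fields"]:
        # Skip other tags and control fields (which have no subfield list).
        if field[0] != tag or len(field) != 4:
            continue
        for sub_code, value in field[3]:
            if sub_code == code:
                values.append(value)
    return values


# Serialization: the whole record round-trips through plain JSON text.
serialized = json.dumps(record)
restored = json.loads(serialized)

print(subfield_values(restored, "245", "a"))
```

Because everything is ordinary lists and strings, any language with a JSON parser can read the record without a MARC-specific library, which is presumably part of the format's appeal.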

Cynthia Ng, another conference organizer and our host during the event, talked about the importance of web accessibility and accessible user-interfaces on different platforms.

Katie Legere from Queen’s presented a very interesting session on the “sonification of data”, a perspective I had not considered despite my own musical background.  This is a relatively new area of research and Katie provided an example based on reference question statistics from a multi-branched library, using snippets from Beethoven’s Third Symphony to sonify and express the data.  It struck me that combining this with data visualization might also be applicable in a future navigation scenario.  I’ll talk a bit more about this idea in a subsequent post.
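To make the idea a little more concrete, here is a minimal sketch of one very simple kind of sonification: mapping a series of counts onto musical pitches so that busier periods sound higher.  This is not Katie's method (she worked with snippets of Beethoven); the function names, the two-octave MIDI range, and the sample numbers are all assumptions of mine.

```python
def counts_to_midi_notes(counts, low=60, high=84):
    """Scale each count linearly into the MIDI note range [low, high].

    60 is middle C and 84 is two octaves above it (assumed defaults).
    """
    if not counts:
        return []
    smallest, largest = min(counts), max(counts)
    if smallest == largest:
        # No variation in the data: play the same note throughout.
        return [low] * len(counts)
    span = largest - smallest
    return [round(low + (c - smallest) / span * (high - low)) for c in counts]


def midi_to_hz(note):
    """Convert a MIDI note number to a frequency in hertz (A4 = note 69 = 440 Hz)."""
    return 440.0 * 2 ** ((note - 69) / 12)


# Hypothetical reference-question counts for five days at one branch.
daily_questions = [12, 30, 45, 30, 8]
notes = counts_to_midi_notes(daily_questions)
frequencies = [midi_to_hz(n) for n in notes]
print(notes)
```

Feeding the resulting frequencies to any tone generator would let you hear the week's rise and fall rather than read it off a chart.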

The final “formal” presentation of the day was by Nick Ruest who talked about his recent work archiving web information from York University’s news source Y-File and resources associated with the Dale Askey/Edwin Mellen controversy.  Nick introduced us to web archiving and the use of WARC files and their importance as historical sources for researchers in the future.  He also talked about the importance of advocacy and the librarian’s role as the “social conscience of the information world.”

Nick’s session inspired a reply and very informative talk on hashing and digital file identification/verification by Mat Trudel.  That Mat was able to quickly throw a presentation together and deliver it to the group is a testament to the real beauty of the unconference format.
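As a rough illustration of the file verification idea (not taken from Mat's slides), the sketch below computes a SHA-256 digest of a file and compares it with a previously recorded value.  The choice of SHA-256 and the function names `sha256_of_file` and `verify_file` are my own assumptions; only Python's standard `hashlib` module is used.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file at `path`.

    The file is read in chunks so that large archive files
    do not have to fit in memory all at once.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_file(path, expected_hex):
    """Return True if the file's current digest matches the recorded one."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

The typical pattern would be to record the digest when a file is first archived and call `verify_file` later: if even a single byte has changed, the digests will no longer match.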

The afternoon began with a detailed overview of SFX by Dana Thomas and then discussion groups and hackfest sessions began.  I sat in on the discussion about University Librarians and technology led by Mike Ridley.  Mike asked two main questions: what should the UL know about technology, and what should the UL know about those who work with technology?  Some highlights from the discussion: trust your staff, especially when weighing information provided by vendors; also trust reports from staff at other institutions who have first-hand experience with products you are considering; find ways to attract and keep “talent” when money is not available as an incentive; foster a culture of “yes”, i.e. provide freedom to experiment and participate in projects that might not benefit the library directly; and, if you are operating in a unionized environment, consider the union a partner rather than an adversary.  It was a good exchange of ideas.

That evening was spent in good company. Bill Denton took us over to the fabulous Arts and Letters Club on Elm Street for drinks.  Some computational drinking ensued with Ian Milligan, Alan Harnum, Giles Orr, Abe de Jesus, Anna St. Onge, Nick Ruest and Bill. We migrated to the Queen and Beaver for dinner managing to grab a table for nine in a small private room. Excellent!


Day two began with Andrew McAlorum and Ken Yang talking about a couple of digital collections maintained at the University of Toronto:  DEEDS (Documents of Early English Data Set) [still in development] and the RPO (Representative Poetry Online) Anthology.

We then saw a demonstration of an app called Book Finder that Ryerson has developed to help library users find physical books on the shelf. Here’s an example of the app at work:

The following session was about developing a “responsive catalogue” that renders well on any device that might be used to access it.  This is something that will be launched at U of T next month.  If you want to test your own site they recommended a device emulator called the Responsive Design Bookmarklet.  They also showed us how Chrome can be used to do some of this testing through the built-in user agent/device metrics feature.  And they noted that it is best to try things out on the actual devices.

Giles Orr from TPL talked about a number of “credit card sized” computers including the Raspberry Pi, the MK802, the BeagleBone Black, and the TP-Link MR-3020 wireless router. A fascinating collection of gizmos if ever I saw one …

John Fink showed us a way to do storage on the cheap with an array of consumer grade hard drives and a BackBlaze Pod. Effective but apparently a tad noisy.

Tim Ribaric and Jonathan Younker, both from Brock University, talked about the recent acquisition of EZProxy by OCLC and what this means for the authentication of off-campus access to library licensed digital resources.  I hadn’t heard about this development and was glad to see this discussion of alternatives to paying the small annual fee that OCLC will now be collecting.

David Fiander then looked at three ISBN APIs, comparing LibraryThing, xISBN and Open Library. Interesting to see the different results obtainable through each of these services.  I believe it was Open Library that demonstrated the most flexibility.

After lunch there was a series of Lightning Talks beginning with a short discussion that I “led” on navigation and visualization in a linked data environment.  It was a half-baked idea to be sure and I’ll elaborate on it and the discussion in a later post.  Other talks included MJ Suhonos on a very promising project he is working on called Ladder; David Fiander on using WebDAV with Zotero; and Dale Askey on using Passpack and YubiKeys to manage passwords and authentication.

I had suggested a hackfest dealing with the conversion of MARC records into a format (or formats) that could be employed in a linked data environment and was lucky enough to spend some time with MJ and see some of the inner workings of his Ladder project and the conversion processes he’s used there.  We also took a quick look at a couple of open source conversion tools: marc2rdf and Ross Singer’s marc2rdf-modeler.  Alison Hitchens, Mat Trudel, and Liam Whalen also took part in this discussion.  Many thanks to everyone for bringing me another conceptual step closer to bridging the gap that exists between our bibliographic data and the semantic web.

A fantastic couple of days!  Thanks again to Ryerson for the wonderful hospitality!!


End of Second Day at LVI 2012 – more pics

It’s over! It was a great conference and congratulations to John Joergensen and Núria Casellas for putting together a great collection of presentations for Track 5, Data Organization and Legal Informatics.

Clay Shirky delivered a great opening plenary this morning talking about crowdsourcing, openness and “cognitive surplus.” Time to stop watching TV and start working on shaping the web, at least that’s one conclusion to draw (Wikipedia=100 million hours of work; U.S. TV viewing=200 billion hours per year).

Clay Shirky delivering a great plenary.

Then it was Jerry Goldman and Matt Gruhn talking about the multimedia Oyez Project and some fascinating work on machine readable access to U.S. Supreme Court information. Broccoli was high on the word cloud for this session.

Jerry Goldman

Matt Gruhn

Had a chance to take a short walk around Ithaca at lunch and check out more of the gorge.

Hey, that’s me at the top of Catherine St.!


The gorge at Stewart St.

After lunch Track 5 started to bunch up a bit with two half-hour programs; although some seemed like hour-long programs compressed into that half hour. Yoshiharu Matsuura and Amy Huey-Ling Shee talked about translation issues between Japan, Taiwan, Korea and China. Interesting to see the different interpretations and English translations of the same and similar ideograms.

In the second half hour Michael Curtotti shared his research on visualizing the law. I think visualization of data is an area that will become increasingly important especially as we start negotiating our way through linked data on the semantic web.

Michael Curtotti

The next double whammy began with a write-in presentation with Susan Newell Hart comparing Lexis and Westlaw and their development of the digest and citator components of their platforms. Very interesting to hear about Lexis using machine algorithms to catch up with the legacy of human-created digests built up by Westlaw.

Susan Newell Hart

The second half of this hour featured Pompeu Casanovas presenting his research on crowdsourcing “relational law”, and I was really disappointed that we didn’t have time to hear more about this very interesting area of research. He raised some great questions: how do you define knowledge when you can connect everything together?; what is law today?; what is a legal document?

Pompeu Casanovas

The final pair of presentations began with Lee Hollaar and his statutory “time machine.” This was an interesting report on an older project and I would have liked to have had an opportunity to see this in action.

Lee Hollaar

Søren Nielsen and Rasmus Lohals shared their experience with optimizing Danish statutory law so that it has better exposure in general search engines on the web. Loved the “extreme search” option that they provided on their own site.

Søren Nielsen

Rasmus Lohals

Thanks for the hospitality, Cornell Law School and Ithaca. Enjoyed the conference!

Sunset over the Ithaca gorge.

P.S. Gotta get me a real camera …

Some Pics from the Law via the Internet Conference

So I’m down in Ithaca attending the Law via the Internet conference. It’s a beautiful little city surrounding Cornell University and nestled against Cayuga Lake. It being October, the leaves are starting to change colour, which provides some wonderful vistas.

I’m essentially following Track 5 which is focused on Data Organization and Legal Informatics. Here are some shots of the speakers I had an opportunity to hear today.

Richard Susskind

Richard Susskind delivering this morning’s opening plenary at the Schwartz Center for the Performing Arts

Mr. Susskind delivered a talk almost identical to the one he gave at AALL in Boston this past summer. Still good to hear it again. Although, as my colleague Louis Mirando noted, he did not mention his view that, “Law schools have always been on the cutting edge of tradition.” 🙂

After the plenary we trooped over to Myron Taylor Hall crossing “the gorge” with this wonderful view from the bridge:

View from the bridge over “the gorge” at Cornell University.

Beautiful fall colours.

I then heard Anurag Acharya, the founding engineer for Google Scholar, talk about how they are providing access to U.S. case law. Very interesting, but I’m still wondering how they define “significance” without any reference to a classification structure.

Anurag Acharya

Anurag Acharya talking about legal search and Google Scholar.

After lunch I heard Phillipe Grand’Maison and Daniel Poulin talk about statistical analysis of Supreme Court of Canada decisions and the idea of the “half-life” of a digital document.

Daniel Poulin

Daniel Poulin discussing statistical perspective of SCC decisions.

I then enjoyed hearing Philip Chung from AustLII talk about citation searching in a session provocatively titled, “Searching Without Search Terms.”

Philip Chung

Philip Chung, one of the AustLII developers.

And the last session was delivered by Enrico Francesconi from the Institute of Theory and Techniques of Legal Information. A fascinating talk on the impact of semantic web technology on legal information.

Enrico Francesconi

Enrico Francesconi talking about legal information and the semantic web.

A great first day at LVI 2012. Looking forward to tomorrow!

RDA IFLA Satellite Conference in Quebec City

It was great to attend the RDA Conference last Friday in Quebec City. We drove up to Quebec via Ottawa and stayed a couple of nights in the posh dorms of the University of Laval. Friday was a rainy day which made sitting in the windowless conference room just that much more bearable. The day ran very smoothly and delegates had come from as far away as Iceland, Singapore and Australia.

Barbara Tillett set the stage with a wonderful survey of the development of RDA in her presentation entitled, “Resource Description and Access: Overview: History, Principles, Conceptual Models”. This provided a great introduction for those who may have been new to RDA and was also a clear review for the experienced RDA follower. She traced the history from the British Museum rules of 1841 through the Paris Principles, card catalogues, the development of the ISBD, OPACs and the current web environment, and showed how the FRBR principles were drawn directly from this evolution.

All of the presentations were informative but the two highlights for me were Gord Dunsire’s, “RDA Vocabularies and Concepts” and Chris Oliver’s run through the RDA online prototype.

Dunsire’s presentation was particularly interesting to me because he spoke about the connection of RDA to some of the other players in the bibliographic universe including ONIX, FRBRoo, OWL, RDF and the Semantic Web. Things seem very promising with comments from the communities like: “Why haven’t we sat down and talked about this stuff together before?” Dunsire expressed the importance of enabling the ‘machine’ in this rapidly changing technological environment: “We don’t have to understand it, we’re just humans … it needs to be this complicated so that the machines can understand it … we should just keep talking and let the machines know what we’ve decided.” A brave new bibliographic world to be sure.

Oliver walked us through some screenshots of the online prototype which, unfortunately, was still not quite ready for prime time due to the delay of the final draft. It looks pretty good. Conceptually it seems to be well thought out and includes features like annotating, commenting and workflow creation that will be potentially very useful. The principal developer Nannette Naught was praised very highly and you might be interested in taking a look at her presentation from the RDA Forum at the ALA annual conference this past June, “Product Development Snapshot: A Visual Tour of the Development Process”; especially the diagram that shows the RDA entity-relationships on slide 8 which was included in Oliver’s presentation. A little weeny to actually see, but interesting nonetheless.

There were some criticisms from the European library community, who to some degree have felt a bit left out of the process. Anders Cato from the National Library of Sweden outlined the concerns of the international community, but it seemed that many of them had been addressed and dealt with earlier in the day. Deirdre Kiorgaard, Chair of the JSC, assured everyone that all of the submitted comments had been considered by the Committee, but decisions had not been reached for all of them.

Another issue of concern raised during Chris Oliver’s Q&A was the publishing/business model for RDA. How will RDA be developed? Will there be considerations for small libraries, independent cataloguer/indexers and possibly educational access packages for teachers and students? Some wondered about the accessibility of the online version in rural areas and underdeveloped communities and expressed a desire for a print version. There was a representative from the publisher’s group who said they were aware of most of these issues and will address them once the first version of RDA has been issued.

Implementation of RDA also looms as a big question. The Library of Congress, Library and Archives Canada, and the Australian and British national libraries have agreed to take the lead. Once RDA is ready, likely mid-2009, plans for implementation will be prepared with the goal that libraries will start adopting and using RDA sometime in 2010.

It was a great day overall. I had a chance to speak with a number of interesting folks and came away feeling generally positive about the whole endeavour. I’m looking forward to reviewing the final draft of RDA which is due out in mid-October. The conference presentations haven’t surfaced yet but will likely appear on the IFLA website or the JSC presentations page shortly.

Here are a few pictures from the conference.