Discovery

Published on May 10th, 2013 | by Adrian Stevenson

Final features, Trends and Future Work

(Image: Victory Loan Poster © IWM (Q 61339) )

Many of my previous articles reflected the state of play at the time they were written, so I wanted to write one final article that summarises the API, reviews what it does, and sets out various directions in which interested parties might like to see it develop.

Final Features

Our API consists of a Perl core which emulates Solr query and response syntaxes and performs federated searching across any number of other APIs which are available on the internet.
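As a rough illustration of the federated approach described above, here is a minimal sketch in Python (not the project's Perl code). The two provider functions and their results are hypothetical stand-ins for remote APIs, and the response shape simply follows the usual Solr JSON layout.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for remote provider APIs. A real provider
# function would make an HTTP request and parse the response.
def search_provider_a(query):
    return [{"id": "a1", "title": "Victory Loan poster"}]

def search_provider_b(query):
    return [{"id": "b1", "title": "Recruitment poster"},
            {"id": "b2", "title": "Trench map"}]

PROVIDERS = {"provider_a": search_provider_a,
             "provider_b": search_provider_b}

def federated_search(query, providers=PROVIDERS):
    """Send the query to every provider in parallel and merge the
    results into a Solr-style response dictionary."""
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(fn, query)
                   for name, fn in providers.items()}
    docs = []
    for name, future in futures.items():
        for doc in future.result():
            # Record which provider each document came from.
            docs.append({**doc, "source": name})
    return {
        "responseHeader": {"status": 0, "params": {"q": query}},
        "response": {"numFound": len(docs), "start": 0, "docs": docs},
    }

result = federated_search("poster")
```

The caller only ever sees one Solr-shaped response, however many providers sit behind it.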

We have developed an extensible XSLT mapping routine to which API types and instances can be easily added. At the moment, plugin code has been written to enable our mapping routine to cope with five types of API data output:

Solr
Oxford Continuations And Beginnings RSS
Europeana OpenSearch RSS
The Victoria and Albert Museum’s in-house API format
OCLC CONTENTdm
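To give a feel for how a pluggable mapping layer of this kind can work, here is a small sketch. It swaps the project's XSLT mappings for plain Python functions, and the registry, the type names "solr" and "rss" and the common fields "id" and "title" are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Registry of mapping plugins, keyed by API type (hypothetical names).
MAPPERS = {}

def mapper(api_type):
    """Decorator that registers a mapping function for one API type."""
    def register(fn):
        MAPPERS[api_type] = fn
        return fn
    return register

@mapper("solr")
def map_solr(raw):
    # raw is an already-parsed Solr JSON response.
    return [{"id": d["id"], "title": d.get("title", "")}
            for d in raw["response"]["docs"]]

@mapper("rss")
def map_rss(raw):
    # raw is an RSS document as a string; each <item> becomes one record.
    root = ET.fromstring(raw)
    return [{"id": item.findtext("link", default=""),
             "title": item.findtext("title", default="")}
            for item in root.iter("item")]

def to_common(api_type, raw):
    """Convert a provider response into the shared record format."""
    return MAPPERS[api_type](raw)
```

Supporting a new API type then means writing one more decorated function, without touching the code that calls to_common.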

From those types, we have been able to include the following sources of WW1 Digital Content which were already live with their own API:

The University of Leicester Manufacturing Pasts
The catalogo_unico collection via Europeana
Culture Grid
The ersterWeltKrieg collection via Europeana
European Library collections via Europeana
The Europeana 1914-1918 collection
The Imperial War Museum
The National Maritime Museum
The Oxford University Continuations And Beginnings Project
The Oxford Great War Archive via Europeana
The Victoria and Albert Museum

We were also able to assist the following institutions by getting samples of their data into our own multicore Solr instance as a searchable API

Our API supports several features via a Solr-style query string, as detailed in my API syntax post.
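As a hedged example of what a Solr-style query string looks like, the sketch below builds a request URL in Python. The parameters q, rows and start are standard Solr parameters; whether our API honours each of them exactly this way is an assumption here, and the API syntax post remains the reference.

```python
from urllib.parse import urlencode

# Base URL of the running instance mentioned in this post.
API_BASE = "http://discovery.ac.uk/ww1/api"

def build_query_url(terms, rows=10, start=0, base=API_BASE):
    """Build a Solr-style search URL. q, rows and start are standard
    Solr parameters; their support by this API is assumed."""
    params = {"q": terms, "rows": rows, "start": start}
    return base + "?" + urlencode(params)

url = build_query_url("victory loan")
```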

We added our own stylesheet to format the results of searches, but also worked with two consultancies to build example applications using our API interface:

We are running an instance of our API at http://discovery.ac.uk/ww1/api, but the Perl core and mappings infrastructure are also downloadable for users to run locally, either to improve performance or to build upon further.

Trends and Future Work

Timelines, Geodata and enriching the content

In working with other providers, it became clear that the applications designers are most interested in building on top of this kind of data right now centre on timelines and geodata, neither of which was of a high or consistent quality across all the datasets we pulled into our project. Europeana have addressed issues like this to some extent with their enrichment scheme, whereby they have re-processed data submitted to their project and looked for meaning within that data, in an effort to bring out translations and geodata.

For a federated project, making enrichments to data on the fly represents much more of a challenge, and it was not something we managed to get around to within the scope that we had. It would certainly be interesting to explore this kind of enrichment further!

Facets and relevance ranking

Another limitation we found was in faceting and relevance ranking across federated searches. From a faceting perspective, we currently pass facet requests through to the other providers, and not all of them support faceting, which immediately limits the responses. The next issue is that the facets returned by providers rarely match up, and so faceted responses tend to be a concatenation of each provider’s individual facets.
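One way to go beyond simple concatenation would be to map each provider's facet fields onto shared names and add up the counts. This is only a sketch of that idea, not something the API does today, and the alias table and field names in it are hypothetical.

```python
from collections import Counter

# Hypothetical alias table mapping each provider's facet field name
# onto a shared field name.
FIELD_ALIASES = {
    "provider": "institution",
    "dataProvider": "institution",
    "type": "type",
}

def merge_facets(provider_facets, aliases=FIELD_ALIASES):
    """Merge per-provider facet counts into one set of shared facets.
    Fields without an alias keep their original name."""
    merged = {}
    for facets in provider_facets:
        for field, counts in facets.items():
            shared = aliases.get(field, field)
            # Counter.update adds counts for values seen before.
            merged.setdefault(shared, Counter()).update(counts)
    return {field: dict(counter) for field, counter in merged.items()}

facets_a = {"provider": {"IWM": 4}, "type": {"image": 3}}
facets_b = {"dataProvider": {"IWM": 1, "V&A": 2}}
merged = merge_facets([facets_a, facets_b])
```

The counts are only as good as the alias table, and providers that return no facets at all still contribute nothing.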

From a relevance ranking perspective, again, not all providers return a relevance rank in their API (although some of them clearly have one internally). One possible solution to both of these problems which we considered was adding a Lucene library to the API and running the results through it in order to get faceting and relevance ranking from a single, consistent source. In the end, time didn’t permit us to look into this much further.
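To illustrate the idea of ranking merged results from a single, consistent source, here is a deliberately simple sketch that scores every record by counting query terms. It is a stand-in for what a Lucene library would do far more thoroughly, and the "title" and "description" field names are assumptions.

```python
import re

def score(query, doc):
    """Count how often each query term appears in the record's title
    and description (a crude term-frequency score)."""
    terms = re.findall(r"\w+", query.lower())
    text = doc.get("title", "") + " " + doc.get("description", "")
    words = re.findall(r"\w+", text.lower())
    return sum(words.count(term) for term in terms)

def rerank(query, docs):
    """Return the records ordered by descending score, regardless of
    which provider they came from. Ties keep their original order."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)

docs = [
    {"id": 1, "title": "Trench map"},
    {"id": 2, "title": "Victory Loan poster", "description": "A loan appeal"},
    {"id": 3, "title": "Loan notice"},
]
ranked = rerank("victory loan", docs)
```

Because every record passes through the same scoring function, results from providers that expose no relevance rank can still be ordered alongside the rest.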

The End!

That concludes my set of blog posts on this project! It’s been an interesting piece of work, and it would be great to work on some more of the features I have mentioned, but for now the project has come to a close. Take a look at Adrian’s posts for more of a summary about the project as a whole, and I’ll update this page and the API syntax page with any last-minute amendments!

Tags: api, british library india office, british postal museum, catalogo unico, culture grid, digital content, enrichment, erster weltkrieg, european library, europeana, faceting, features, future work, geodata, imperial war museum, iwm, manufacturing pasts, mappings, mickey & mallory, national maritime, opensearch, outputs, oxford continuations and beginnings, oxford great war archive, relevance ranking, rss, search, solr, syntax, timeline, trends, ukdiscovery, victoria and albert, we are what we do, welsh voices of the great war, ww1, ww1mimas
