Posted by: Ian | October 2, 2009

The first OARJ project meeting – our ‘reality check’

On Wednesday 30th Sept, Theo and I went down to London for the day for a wee (Ha! We were away for 14 hours! That’s not “wee”) project meeting with some of our respected colleagues, to get a reality check on our plans for the OARJ functionality.

The peers we sought feedback from were Ben O’Steen (Oxford), Jim Downing (Cambridge) and Richard Jones (Symplectic). The group gave us valuable feedback, which lets us go forward with the project with confidence.

In essence, the OA-RJ functionality we discussed falls into two strands:

  1. Giving users information about repositories.
  2. Being a broker and depositing objects.

Summary

The OA-RJ project is looking at something that can be really useful, but it has some BIG concepts in it, and some MAJOR social issues to address. Below is a summary of the main points we found useful:

Give me Information

This is, in essence, what the Depot currently does in its reception area, but it would be useful to extend this in many ways:

  • International in scope,
  • return more information:
    • Name & URL of the repository
    • Name & URL of the Institution
    • OAI base URL
    • Query URIs for stuff:
      • Author searches
      • Title searches
      • … etc
    • Atom/RSS news feed URIs
    • SWORD Deposit Base URL
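
As a rough illustration, the fields listed above could be gathered into a single record per repository. This is only a sketch: the class name, field names and example.org URLs are made up for illustration and are not part of any agreed OA-RJ format.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class RepositoryInfo:
    """One suggested repository (hypothetical field names)."""
    repository_name: str
    repository_url: str
    institution_name: str
    institution_url: str
    oai_base_url: Optional[str] = None
    # Query URIs keyed by search type, e.g. "author" or "title".
    query_uris: Dict[str, str] = field(default_factory=dict)
    # Atom/RSS news feed URIs.
    feed_uris: List[str] = field(default_factory=list)
    sword_deposit_url: Optional[str] = None


# Placeholder values only; none of these are real services.
example = RepositoryInfo(
    repository_name="Example Research Archive",
    repository_url="https://repository.example.org/",
    institution_name="University of Example",
    institution_url="https://www.example.org/",
    oai_base_url="https://repository.example.org/oai",
    query_uris={
        "author": "https://repository.example.org/search?author=",
        "title": "https://repository.example.org/search?title=",
    },
    feed_uris=["https://repository.example.org/feed.atom"],
    sword_deposit_url="https://repository.example.org/sword/deposit",
)
```

Optional fields default to empty, since not every repository will expose an OAI base URL, feeds or a SWORD endpoint.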

In addition, noting WHY a particular entry was included/suggested would be very useful:

  • There is a name match between an OpenDOAR/ROAR record and the name of the Institution/IR,
  • The IP number of the repository is in the same network range as the client making the query,
  • There is a geo-location closeness,
  • The funder mandates a deposit in a specific Repository,
  • There are Subject repositories for the subject area of the research,
  • The publisher mandates a deposit somewhere,
  • An author has been traced to a specific institution, thus IR,
  • … etc

When making decisions, one may have just the IP address or domain name of the client, or one may be given some metadata to work with (the more info there is, the more scope OA-RJ has for making decisions).
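
A small sketch of how the reasons listed above might travel with each suggestion, plus one way of doing the network-range check using Python's standard `ipaddress` module. The names `Reason`, `Suggestion` and `same_network`, and the default /16 prefix, are assumptions for illustration; we did not settle on how "same network range" should actually be decided.

```python
import ipaddress
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Reason(Enum):
    """Why a repository was suggested (hypothetical labels)."""
    NAME_MATCH = "name match with an OpenDOAR/ROAR record"
    SAME_NETWORK = "repository IP is in the client's network range"
    GEO_CLOSE = "geo-location closeness"
    FUNDER_MANDATE = "funder mandates deposit here"
    SUBJECT_MATCH = "subject repository for the research area"
    PUBLISHER_MANDATE = "publisher mandates a deposit"
    AUTHOR_AFFILIATION = "author traced to this institution"


@dataclass
class Suggestion:
    repository_name: str
    reasons: List[Reason] = field(default_factory=list)


def same_network(client_ip: str, repo_ip: str, prefix_len: int = 16) -> bool:
    """True if repo_ip lies in the network of the given prefix length
    around client_ip. The /16 default is an arbitrary assumption."""
    network = ipaddress.ip_network(f"{client_ip}/{prefix_len}", strict=False)
    return ipaddress.ip_address(repo_ip) in network


# Documentation-range addresses, used purely as placeholders.
suggestion = Suggestion("Example Research Archive")
if same_network("192.0.2.17", "192.0.77.5"):
    suggestion.reasons.append(Reason.SAME_NETWORK)
```

With only an IP address to go on, just the network and geo-location reasons can apply; the mandate, subject and author reasons need the extra metadata.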

Deposit Broker

The broker side is also very interesting… here the client elects to give the item to be deposited to the broker and leaves the broker to handle both the where and the how of the deposit.

The broker does, however, have a responsibility to the client – it has to be able to tell the client (probably any client) where the deposited item has gone to.

Here the logic we came up with is pretty simple.

We assume that OARJ is an opt-in/elect-in system… in that repositories come to the OA-RJ system, request to be informed of any relevant items, and state who has the authority to act on behalf of their repository.

OARJ/Broker receives an item, and deduces some targets for it:

  1. OA-RJ, having determined some appropriate repositories, informs those repositories that information is available.
  2. One or more repositories harvest the data from OA-RJ
    • The repository may return a URI to the “media edit” location for that article.
  3. The repository processes that item in some period of time, and then makes the item publicly available
    • The repository should inform OA-RJ of the public URI for the item, which OA-RJ notes in its internal tables (but does not make available to subsequent harvesters).
    • The repository may declare that it will accept “persistent responsibility” for the item (ie, the URI for the object will persist for a reasonable amount of time, and other sites will be able to harvest the content)
  4. If a repository declares they will have “persistent responsibility” for an item, then OA-RJ redirects any request for that item to the appropriate repository. If multiple repositories take “persistent responsibility”, then a list is made available instead.
    • After some period of time, the OA-RJ Broker deletes the object from its store, and relies purely on the redirect to the repository with the “persistent responsibility”.
  5. If no repository claims “persistent responsibility” within (some pre-determined) “timeout period”, then the broker transfers the item to the Depot, and the Depot takes “persistent responsibility” for the item.
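
The “persistent responsibility” bookkeeping in steps 3 to 5 can be sketched as a small per-item record. This is a minimal sketch under assumptions: the class and method names are invented, the notification and harvesting in steps 1 and 2 are left out, and the timing of the timeout and the purge is left to the caller.

```python
from dataclasses import dataclass, field
from typing import Dict, List

DEPOT = "depot"  # hypothetical identifier for the Depot as fallback target


@dataclass
class BrokeredItem:
    item_id: str
    # Repository -> public URI, kept in the broker's internal tables.
    public_uris: Dict[str, str] = field(default_factory=dict)
    # Repositories that have accepted "persistent responsibility".
    responsible: List[str] = field(default_factory=list)
    held_by_broker: bool = True

    def record_public_uri(self, repo: str, uri: str,
                          persistent: bool = False) -> None:
        """Step 3: a repository reports where the item now lives."""
        self.public_uris[repo] = uri
        if persistent and repo not in self.responsible:
            self.responsible.append(repo)

    def resolve(self) -> List[str]:
        """Step 4: one URI means redirect; several mean show a list."""
        return [self.public_uris[repo] for repo in self.responsible]

    def on_timeout(self, depot_uri: str) -> None:
        """Step 5: nobody claimed the item, so the Depot takes it."""
        if not self.responsible:
            self.record_public_uri(DEPOT, depot_uri, persistent=True)

    def purge_broker_copy(self) -> None:
        """Drop the broker's copy only once someone is responsible."""
        if self.responsible:
            self.held_by_broker = False


item = BrokeredItem("item-1")
item.record_public_uri("repo-a", "https://repo-a.example.org/id/1",
                       persistent=True)
item.on_timeout("https://depot.example.org/id/1")  # no effect: already claimed
item.purge_broker_copy()
```

The one rule this tries to hold on to is that the broker never deletes its copy until at least one party (a repository or the Depot) has taken “persistent responsibility”, so the broker can always tell a client where the item has gone.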

When transferring an item, don’t package it up into a JSON encoded set of metadata wrapped in a METS manifest… use an ORE resource map (“The world does not need another packaging standard”).

In terms of the clients for pushing and/or pulling to/from the depot, the most sensible solution we agreed on was to get the repository vendors to write the code (even if it’s “by proxy”) and make that code part of the core distribution bundle… thus making the OA-RJ/Broker part of the core infrastructure in the Repository world.
