Posted by: Ian | February 17, 2010

Sense and sensibility (for repository deposit)

We have previously talked about how the Broker would make deposits, and even had some thoughts on how the process should be handled (see The first OARJ project meeting).

Most repository deposit tools we have seen so far only look at deposit into a single target. The reality can be a lot more complicated.

If we consider the SONEX general case [a multi-authored, multi-institutional (multi-national) paper], there are some interesting high-level discussions that should happen, discussions that would guide us in the right direction for managing this process.

We would like a client to be able to deposit an item with our broker, and the broker to make some deductions about where that item should live based on the metadata provided for that item (title, authors, institutions, etc.). Generally speaking, it should be that:

  • Authors can be resolved to organisations, and thus institutional repositories; however,
    • not all authors will resolve to institutions,
    • two authors may be at one institution,
    • sometimes authors can be affiliated with multiple institutions.
  • Funder codes can be resolved to a funding council, which often has a mandate to deposit in its own funder repository (see JULIET),
  • Publishers may have their own repository, or a mandate to deposit somewhere specific (see RoMEO), and
  • Subject classifications can be matched to known subject repositories.
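
The resolution rules above can be sketched in a few lines of Python. This is only a rough illustration, not how the OARJ Broker is (or will be) implemented: the lookup tables, repository URLs, and the names Author, Item and resolve_targets are all made up for the example, and a real broker would presumably populate the tables from registries such as JULIET and RoMEO.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical lookup tables: every key and URL below is invented.
INSTITUTION_REPOS: Dict[str, str] = {
    "University of A": "https://ir.a.example",
    "University of B": "https://ir.b.example",
}
FUNDER_REPOS: Dict[str, str] = {"FC1": "https://funder.example"}
PUBLISHER_REPOS: Dict[str, str] = {"Publisher X": "https://publisher-x.example"}
SUBJECT_REPOS: Dict[str, str] = {"physics": "https://physics.example"}


@dataclass
class Author:
    name: str
    # An author may have zero, one, or several affiliations.
    institutions: List[str] = field(default_factory=list)


@dataclass
class Item:
    title: str
    authors: List[Author]
    funder_codes: List[str] = field(default_factory=list)
    publisher: Optional[str] = None
    subjects: List[str] = field(default_factory=list)


def resolve_targets(item: Item) -> Tuple[Set[str], List[str]]:
    """Return (candidate repositories, names of authors we could not place)."""
    targets: Set[str] = set()
    unresolved: List[str] = []

    for author in item.authors:
        repos = [INSTITUTION_REPOS[i] for i in author.institutions
                 if i in INSTITUTION_REPOS]
        if repos:
            # A set means two authors at one institution yield one target.
            targets.update(repos)
        else:
            unresolved.append(author.name)

    for code in item.funder_codes:
        if code in FUNDER_REPOS:
            targets.add(FUNDER_REPOS[code])

    if item.publisher in PUBLISHER_REPOS:
        targets.add(PUBLISHER_REPOS[item.publisher])

    for subject in item.subjects:
        if subject in SUBJECT_REPOS:
            targets.add(SUBJECT_REPOS[subject])

    return targets, unresolved


paper = Item(
    title="An example paper",
    authors=[
        Author("A. Smith", ["University of A"]),
        Author("B. Jones", ["University of A", "University of B"]),
        Author("C. Brown"),  # no known affiliation
    ],
    funder_codes=["FC1", "UNKNOWN"],
    subjects=["physics"],
)
targets, unresolved = resolve_targets(paper)
```

Note that this only produces a list of candidate targets. It says nothing about which of them should receive the full text, which is exactly where the difficult questions start.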

Interesting observations

There is a definite time-lag between Deposit and Available, due to repositories taking time to verify deposits, correct the metadata, and add missing information.

Different repositories also have different rules about what to accept, and what not to accept. Different repositories catalogue things in different ways.

Much as we would like it to be otherwise, the vast majority of data in our repositories is “assisted deposits”, where someone other than one of the authors has done the hard work of getting the item into the repository.

Interesting questions

Now, some interesting questions arise from this:

  1. Taking a journal-article centric view, do the Funder or Subject repositories want the full text of the article, or will they be happy with the metadata and a link to the full text in another repository?
  2. What about the Institutional repositories? Do they both get full text, or should just one get the full record and the other just get metadata and a reference to the binary object(s)?
    • What happens when any of the repositories claim to be “100% full text”? Do they have a priority for the binary object?
    • What happens when multiple repositories claim to be “100% full text”?
  3. Does the OARJ Broker make assumptions if it can determine that a Principal Investigator is an author who resolves to an institutional repository?
  4. If we assume that the OARJ Broker is not going to be a central archive too (meaning that it deletes records at some point in time), when does the Broker delete records?
    • Should there be some form of “ping back” to tell the OARJ Broker when a site takes “persistent responsibility” for an item?
  5. Does the OARJ Broker maintain a reference to where the item has gone?
    • Does it need to keep a list of the places the item has gone to?
    • Does this list of places get updated if a repository moves the item to somewhere else?
  6. How do we ensure that the OARJ Broker can deposit into any repository (with minimal effort on the part of the non-techie managers)?
  7. How does “Open Deposit” work?
    • Should anyone be allowed to deposit, or just those authenticated to do so?
    • What can projects such as EM-LOADER, OfficeSword, PEER and the Facebook SWORD Deposit App teach us?
    • Rather than the OARJ Broker attempting to contact any repository it can, should individual repositories register to receive data from the Broker?
    • Should clients that use the Broker be registered, and target repositories assume there has been some form of vetting by the Broker?
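
To make questions 4 and 5 a little more concrete, here is a minimal sketch of one possible answer, purely as a talking point. It assumes a policy in which the Broker keeps a record until every target repository has "pinged back" to take persistent responsibility. That policy, and the names BrokerRecord, record_deposit, ping_back and can_delete, are assumptions made for the example and not anything the project has agreed.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class BrokerRecord:
    item_id: str
    # Maps each target repository to whether it has confirmed
    # persistent responsibility for the item.
    targets: Dict[str, bool] = field(default_factory=dict)

    def record_deposit(self, repo: str) -> None:
        """Note that the item was sent to repo (not yet confirmed)."""
        self.targets.setdefault(repo, False)

    def ping_back(self, repo: str) -> None:
        """Called when repo tells the Broker it now holds the item."""
        if repo not in self.targets:
            raise KeyError(f"no deposit recorded for {repo}")
        self.targets[repo] = True

    def can_delete(self) -> bool:
        """Assumed policy: delete only once every target has confirmed."""
        return bool(self.targets) and all(self.targets.values())


record = BrokerRecord("item-1")
record.record_deposit("https://ir.a.example")
record.record_deposit("https://funder.example")
record.ping_back("https://ir.a.example")
still_needed = not record.can_delete()  # the funder repository has not confirmed
```

A looser policy (delete as soon as any one repository confirms) would only need all() swapped for any(). Neither version handles a repository later moving the item elsewhere, which is the open part of question 5.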

This is a particularly interesting time to ask these questions, as I hear JISC will be making another call for deposit tools (i.e., clients).

I think this would be an interesting discussion to have around a table, with some form of agreement (or consensus) reached at the end of it.

