Open Access and Research Data Management Roadshow

Alistair Fitt,
Pro Vice-Chancellor,
Research and Knowledge Exchange

Set of slides presented at REF Roadshows, October – December 2014

Plan

We aim to cover the following today:

1) HEFCE Open Access requirements for “REF 2020”:

what were the surprises of the final guidance?

what you have to do and how you can do it

demo of what you have to do

any more points – questions/discussion?

2) Research data management and curation:

what we are trying to achieve

why this is a really tough (but exciting) problem

what help we hope to offer (e.g. people/resources/guidance)

any more points – questions/discussion?

Please complete the sign-in sheet !

HEFCE REF 2020 OA Guidance

  • Published 28/03/14
  • Lots of consultation went into the recommendations
  • Speaks of “research assessments after the 2014 REF”
  • How did this compare to ham-fisted RCUK guidance?
  • How will we get researchers to satisfy these mandates?
  • How much will it all cost us?
  • In the end, largely seen as sensible and proportionate

THERE WERE STILL SOME SURPRISES THOUGH – 11 OF THEM AS FAR AS I WAS CONCERNED!

Surprise #1: acceptance not publication

Most people thought that the guidance would essentially require the POSTPRINT to be deposited in the Institutional Repository (IR)

They were right – but the postprint has to be deposited WHEN THE OUTPUT IS ACCEPTED and not when it is published. This has some good and bad consequences:

  • GOOD: more in the spirit of Open Access
  • GOOD: probably a better “trigger” for people to remember to do it
  • GOOD: gets round problem of “when exactly was publication?”
  • GOOD: policy applies only after 1/4/2016 – but we expected that
  • GOOD/BAD(?): meant to make it YOUR RESPONSIBILITY (OA “mindset”)
  • BAD: the postprint might not be the final version
  • BAD: can be unclear to people if postprint = galley proof

Surprise #2: three months’ grace allowed

Most people thought that the “grace” period allowed would be very small, some suggesting as little as a week. However, HEFCE have gone for 3 months.

(Note, though, that the grace period for embargoes is only 1 month – one hopes that the IR and the CRIS will sort that out automatically.)

“18. The output must have been deposited as soon after the point of acceptance as possible, and no later than three months after this date (as given in the acceptance letter or e-mail from the publication to the author)”

(In some ways 3 months may be worse, as it may make researchers “put it off as they still have lots of time”.)
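The three-month rule in paragraph 18 can be sketched as a small date check. This is only an illustration: the function names are made up, and reading “three months” as three calendar months (clamped to the end of a shorter month) is an assumption, since the guidance quoted here does not define it.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Same day-of-month `months` later, clamped to the last day of that month."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def deposit_deadline(accepted_on: date, grace_months: int = 3) -> date:
    """Latest deposit date, ASSUMING 'three months' means calendar months."""
    return add_months(accepted_on, grace_months)


def deposited_in_time(accepted_on: date, deposited_on: date) -> bool:
    """True if the deposit falls between acceptance and the deadline, inclusive."""
    return accepted_on <= deposited_on <= deposit_deadline(accepted_on)
```

For example, `deposit_deadline(date(2016, 4, 15))` gives 15 July 2016 under this reading. The safest interpretation is still the one on the slide: deposit on the day of acceptance.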

Surprise #3: ISBN = BOOK

Most people thought that the guidance would only apply to “journal papers and conference proceedings” and they were right – but how would these be defined? HEFCE found a simple and imaginative solution:

“11. The requirement to comply with the open access policy applies only to particular outputs, as defined below.

(a) The type of output is a journal article or the type of output is a conference proceeding with an International Standard Serial Number (ISSN).

(b) The output is accepted for publication after 1 April 2016.

Any output that fits both aspects of this definition will need to meet the open access criteria outlined in paragraphs 16 to 34, unless an exception applies.

12. Conference proceedings published with an International Standard Book Number (ISBN) or as part of a book series with an ISSN do not meet this definition.”
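The definition in paragraphs 11 and 12 reads almost like a decision rule, so here is a rough sketch of it. The `Output` class, its field names and the type labels are hypothetical. Two readings are assumptions: “after 1 April 2016” is taken literally (acceptance on 1 April itself is out of scope), and a proceeding carrying both an ISBN and an ISSN is treated as out of scope.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

POLICY_START = date(2016, 4, 1)  # "accepted for publication after 1 April 2016"


@dataclass
class Output:
    """Hypothetical record of a research output (field names are illustrative)."""
    output_type: str                  # e.g. "journal_article", "conference_proceeding", "book"
    accepted_on: date
    issn: Optional[str] = None
    isbn: Optional[str] = None
    in_book_series: bool = False


def in_scope(output: Output) -> bool:
    """True if the output falls under the OA policy as quoted in paragraphs 11-12."""
    if output.accepted_on <= POLICY_START:
        return False                  # ASSUMPTION: literal reading of "after"
    if output.output_type == "journal_article":
        return True
    if output.output_type == "conference_proceeding":
        # Paragraph 12: an ISBN, or a book series with an ISSN, takes it out of scope.
        if output.isbn is not None or output.in_book_series:
            return False
        return output.issn is not None
    return False                      # books, monographs and everything else
```

Anything for which `in_scope` returns `False` can still be made open access; it just is not required to be under this definition.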

Surprise #4: CC BY-NC-ND allowed!

The CC (Creative Commons) licence arrangements were one of the most hotly-debated parts of the guidance. I thought that a CC-BY licence would be mandated: this allows valuable searches to be carried out and is completely in the spirit of OA, but it also denies the author most of their rights and allows unlimited “hashing and mashing”.

Wrong again!......

“25. The output must be presented in a form that allows anyone with internet access to search electronically within the text, read it and download it without charge, while respecting any constraints on timing (as detailed in paragraphs 27 to 33). While we do not request that outputs are made available under any particular licence, we advise that outputs licensed under a Creative Commons Attribution Non-Commercial Non-Derivative (CC BY-NC-ND) licence would meet this requirement.”

Surprise #5: non-compliant embargoes allowed !

We always thought that the embargoes would be “like RCUK” i.e.

  • 12 months for REF main panels A & B, 24 months for C & D

And so it proved. Your CRIS should deal with all of this automatically.

However, we did not anticipate:

“37. The following exceptions deal with cases where deposit of the output is possible, but there are issues to do with meeting the access requirements. In the following cases, the output will still be required to meet the deposit and discovery requirements, but not the access requirements. A closed-access deposit will be required, and the open access requirements should be met as soon as possible.

b. The publication concerned requires an embargo period that exceeds the stated maxima*, and was the most appropriate publication for the output.”

* by an infinite amount?
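The embargo maxima and the paragraph 37b exception can be put together in a short sketch. The function name, the status strings and the `most_appropriate_venue` flag are all illustrative assumptions, not part of any real CRIS.

```python
# Maximum embargo in months by REF main panel, as given on this slide.
MAX_EMBARGO_MONTHS = {"A": 12, "B": 12, "C": 24, "D": 24}


def access_status(main_panel: str, embargo_months: int,
                  most_appropriate_venue: bool = False) -> str:
    """Classify an embargo against the stated maxima (illustrative labels only)."""
    limit = MAX_EMBARGO_MONTHS[main_panel]
    if embargo_months <= limit:
        return "compliant"
    if most_appropriate_venue:
        # Paragraph 37b: deposit and discovery still required, access is not.
        return "exception 37b: closed-access deposit"
    return "non-compliant"
```

Note that nothing in the exception caps how far the embargo may exceed the maximum, which is the point of the footnote above.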

Surprise #6: crafty credit for OA books and monographs !

It always seemed to be too complicated to require OA for books and monographs.

No solution seems to be in sight for this difficult problem – the basic reason being that the whole publishing industry needs to change if we are to be able to cope with OA for books and monographs.

So it was cunning of HEFCE to allow us to “have a go at solving the problem ourselves”.............

“15. Where a higher education institution (HEI) can demonstrate that it has taken steps towards enabling open access for outputs outside the scope of this definition, credit will be given in the research environment component of the post-2014 REF.”

Surprise #7: it needn’t cost anything !

There was a general worry that the HEFCE OA guidance might in some way require gold OA (and all the APC ££££ associated with this). However, the arrangements that will be required seem to have been created with a clear eye on not costing us money. The “non-compliant embargo rule” (surprise #5) has helped a great deal with this.

“5. Higher education institutions are now advised to implement processes and procedures to comply with this policy, which may include using a combination of the ‘green’ and ‘gold’ routes to open access. Institutions can achieve full compliance without incurring any additional publication costs through article processing charges. We will be working closely with Jisc to support repositories in implementing this policy, and will issue further information on this work in due course.”

Surprise #8: it’s not just IRs that meet the criteria

Most people thought that OA for a post-2014 REF would be done entirely via IRs. This was because doing it this way would make it much easier for HEFCE to check (only ~135 IRs to look at). But they have allowed shared IRs AND subject repositories as well.

“17. The output must have been deposited in an institutional repository, a repository service shared between multiple institutions, or a subject repository such as arXiv*.

*Individuals depositing their outputs in a subject repository are advised to ensure that their chosen repository meets the requirements set out in this policy. HEFCE will be working to support institutional repositories who may wish to populate their systems with records of externally held deposits.”

Surprise #9: text-mining is not mandatory

One of the main ideas of OA is that “companies from UK PLC should be able to send data-mining robots out on the web to harvest data”. So text-mining is right at the heart of the OA mission. However, it is not required (but note – extra credit again will be given in the environment statement if text-mining is allowed).

“34. Outputs do not need to allow automated tools to perform in-text search and download (those activities commonly known as text-mining) to meet the access requirement. However, where an HEI can demonstrate that outputs are presented in a form that allows re-use of the work, including via text-mining, credit will be given in the research environment component of the post-2014 REF. We further recommend that institutions fully consider the extent to which they currently retain or transfer the copyright of works published by their researchers, as part of creating a healthy research environment. For further information on text-mining, see Annex A.”

Surprise #10: a light-touch approach to compliance

How would all of this be enforced? This was a big question.

The answers are below – essentially it’s a “light-touch” process

“40. Evidence for outputs meeting the criteria, the definition, or any of the allowed exceptions will not be required to be submitted to the post-2014 REF.

41. We will establish the detailed verification and audit process as part of the implementation of the post-2014 REF, but we initially intend that any audit will require institutions to provide assurance about their processes and systems for recording open-access information, as well as taking a light-touch approach to verifying supporting information. Some parts of the audit, including of the deposit requirements, are expected to take place at the repository level, not the output level. We will be working with Jisc on establishing a metadata profile that institutions will be advised to adopt; as a minimum this is likely to include a record of the dates of acceptance, initial deposit, and the start and end dates of any embargo period.”
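Paragraph 41 names the minimum dates a record is likely to need. A hypothetical record type for them might look like the sketch below. The class and field names are invented for illustration, since the Jisc metadata profile had not been published when these slides were written.

```python
from dataclasses import asdict, dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass
class OpenAccessRecord:
    """Hypothetical minimal OA record; field names are NOT the Jisc profile."""
    accepted_on: date
    first_deposited_on: date
    embargo_start: Optional[date] = None
    embargo_end: Optional[date] = None

    def problems(self) -> List[str]:
        """Basic consistency checks on the embargo dates."""
        issues = []
        if (self.embargo_start is None) != (self.embargo_end is None):
            issues.append("embargo needs both a start and an end date")
        elif self.embargo_start is not None and self.embargo_end < self.embargo_start:
            issues.append("embargo ends before it starts")
        return issues

    def to_dict(self) -> Dict[str, Optional[str]]:
        """Dates as ISO strings, ready to store or export."""
        return {key: (value.isoformat() if value else None)
                for key, value in asdict(self).items()}
```

The idea is simply that if these four dates are captured at deposit time, an audit at the repository level has what it needs.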

Surprise #11: closed access deposit allowed where access rights are an issue

How would we have to deal with outputs where the rights to parts of the output had not been completely nailed down? (For example, some History of Art papers contain low-res images at acceptance whose rights have not yet been cleared – the image rights are paid for later.) This was a difficult question.

The answer: if you have to, you can deposit, but not as OA:

37. The following exceptions deal with cases where deposit of the output is possible, but there are issues to do with meeting the access requirements. In the following cases, the output will still be required to meet the deposit and discovery requirements, but not the access requirements. A closed-access deposit will be required, and the open access requirements should be met as soon as possible.

a. The output depends on the reproduction of third party content for which open access rights could not be granted (either within the specified timescales, or at all)

(So we deposit the postprint with images, but nobody can see that version)

OK – but what do *I* have to do?

ANSWER – basically just one thing:

When your paper or conference proceeding output is accepted, post the postprint on the CRIS

In a minute, I’ll show you how to do this, but first...

Pre-demo FAQs

  • If it’s got an ISBN it’s a book. End of argument.
  • This all starts for acceptances after 1/4/2016 – BUT DO IT NOW
  • The postprint is the final version that you sent to the publishers after acceptance.
  • You can change this later if you are worried about posting the correct VOR (Version Of Record).
  • If you have worries about embargoes then either ask somebody or let us worry about it.
  • When you do this it will all be recorded and date/time stamped. No need to record it (but you might do all the same).
  • It’ll get from the CRIS to RADAR on its own. Don’t worry.
  • If you are not the corresponding author then MAKE SURE they tell you when it’s accepted.

HARD CORE

You have 3 months after the day of acceptance to post the postprint on the CRIS. But DON’T PUT IT OFF – DO IT NOW!

If after all this you don’t do what’s required correctly:

  • You will not be allowed to use that output in a future REF
  • No retrospective OA will be allowed. That paper’s DEAD.
  • It’s your responsibility – there will be no category of “people who would have been returned but forgot to do the OA properly”. It’s YOUR FAULT.
  • HELP YOUR COLLEAGUES. REMIND THEM.
  • WORK TOGETHER

DOING IT

Here’s a postprint of a great paper “just accepted” by IMA TEAMAT (bet you’ve never had a paper that cites “The Mature Man’s Guide to Style”…..)

Note: it’s camera-ready, produced in dreamy LaTeX – looks exactly like the final printed version will look but minus the page numbering, journal livery and branding

This was the FAVSTJ

Doing it

  • Log on to CRIS www.brookes.ac.uk/go/cris
  • Top RH corner “logon to CRIS”
  • Top RH corner “login”
  • 4th icon down – “outputs”: “add new”…”output”…”journal article”
  • Sort out the metadata, adding what you can, scroll down
  • SUBMIT POSTPRINT: “click on this icon to select document…”
  • “for validation by library”
  • Logout from CRIS

Rowena’s bit

Time for Rowena to do some work......

What the Scholarly Communications Team does:

  • Check the record for accuracy, add any further information if known

  • Has a version of the paper been uploaded?

  • What do the publishers allow us to make publicly available and when?

Sherpa record for ‘Teaching Mathematics and its Applications’

Post-print in Institutional repositories or Central repositories

Embargo: Authors may upload their accepted manuscript PDF to an institutional repository, provided that public availability is delayed until 12 months after first online publication in the journal.

http://www.sherpa.ac.uk/romeo/

  • If incorrect version, contact author asap and request the required version
  • If correct version, look after accepted version until embargo has elapsed; email author and ask them to let us know the date of publication
  • Do we require anything else?
  • Yes – evidence of acceptance date, such as email from publisher
  • Has that been uploaded?
  • All information complete, and publication date known?
  • Then we add coversheet and it is automatically passed to RADAR
  • Inform author and issue a deposit licence
  • Acceptance date and publication date are very important
  • Example of a complete record in RADAR

In Summary

Output made publicly available via the CRIS, on open access on RADAR – the institutional repository of Oxford Brookes

Compliance with funder mandates

Record with a link to RADAR record on the CRIS

CRIS will feed information to staff web profiles (expected sometime in Summer 2015)

JISC OA Pathfinder – Making Sense: a researcher centred approach to funder mandates

Making Sense – a researcher centred approach to funder mandates

www.brookes.ac.uk/library

Stuart Hunt, Rowena Rouse


Need help?

One last time:

When your journal paper or conference proceeding output is accepted, post the postprint on the CRIS !!!!

  • Questions?
  • Points?
  • Discussion?
  • Arguments?

Data bit starts now………….

Part 2: Research data management and curation

Data is deluging us

  • Big data – sick of hearing about it!
  • 90% of the data “ever” was generated in the last 2 years
  • Electronic data collection has vast potential
  • The data that we generate here is publicly funded
  • ........ (you may argue about this of course)

Given that we are drowning in data, how can we cope?

Research data – BIG challenges

How can we:

  • Even decide what IS data?
  • Decide what data is worth keeping?
  • Keep our data secure?
  • Store our data in a way so it will be readable in 30 years?
  • Agree on metadata standards?
  • Keep/index our data so that it can be used by others?
  • Open our data to the public (with suitable embargoes)?
  • Allow data-mining and analytics automatically?

What is research data?

  • Wind tunnel results
  • Skin
  • Interviews
  • Sculpture
  • Photographs
  • Algorithms
  • Notebooks
  • Workflows
  • Protocols
  • Etc. etc. ............. the only limit is your imagination!

Data, metadata or information?

  • Data – the plain facts or records
  • Metadata – data about data

  • Information – data that have been processed, organized, structured or presented in a given context so as to make them useful
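A tiny made-up example may help separate the three terms. The readings, field names and units below are invented purely for illustration and are not real results.

```python
from statistics import mean

# Data: the plain records (made-up wind tunnel readings, for illustration only).
readings = [12.0, 11.5, 12.5, 12.0]

# Metadata: data about the data (hypothetical field names).
metadata = {
    "quantity": "drag force",
    "unit": "N",
    "source": "wind tunnel run",
    "samples": len(readings),
}

# Information: the data processed and presented in context so that it is useful.
information = (
    f"Mean {metadata['quantity']} over {metadata['samples']} samples: "
    f"{mean(readings):.2f} {metadata['unit']}"
)
print(information)
```

Without the metadata, the four numbers are almost useless to anyone else, which is why metadata standards appear on the list of challenges.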

Why are we bothered about this?

  • Mandates will come soon
  • We know that there will be no data mandates for the next REF
  • ....... but some RCUK policies are in place already
  • Wellcome and other funders are also talking about this
  • In any case, it’s the RIGHT thing to do

BTW – the UK is a world leader in this

We want to keep it this way!

What have we done about this?

  • We have been considering this problem for a while
  • We have done a range of data audits
  • We have taken a pragmatic “solve what you can” approach
  • We have a POLICY and that’s what we want to tell you about now
  • The next step is to move to an OPERATIONAL PLAN
  • And for this we need to tell you what we can offer you

OBU Research Data Management Policy

Here’s where you can find it:

The Policy

Here’s what it looks like: (6pp)

What’s the next step?

It’s fine having a policy, but now we must IMPLEMENT IT

This will involve a number of things. We need to tell you:

  • Who can help you and where you can get help
  • What resources we can provide for you (e.g. data storage)
  • How you can organise your metadata
  • How we can communicate this to staff and research students
  • How we can build on the good practice that already exists

What can we offer?

We are working on this at the moment, but our offer will include:

  • A data storage solution (probably Arkivum)
  • People that can help you
  • University guidance about mandates and open data requirements
  • Metadata standards (maybe from JISC)
  • Lots of other things as well

Let’s be clear now…….

Phrases like “open data mandate” and “data sharing” can cause alarm, so let me stress:

  • You can embargo your data from sharing, so there’s no danger of “I collected the data – but if I share it then everybody else will write my papers before I can”
  • Things that need to be secure will still be secure – so if data is anonymous (for instance) then there is no sense in which this will be overridden
  • Ethical requirements will always trump openness of data

Final RDM thoughts

We KNOW this is a really hard problem. There are no easy solutions – but we will HAVE to do it – there is not a choice

National resources exist e.g. DCC (Digital Curation Centre), ESRC data repository (U. Essex)

We need to work together on this exciting problem

The End

QUESTIONS?