Open research data can save lives

15 November 2016

Paul Boersma is professor of Phonetic Sciences and director of the Amsterdam Center for Language and Communication, which is part of the Faculty of Humanities. He designed the faculty’s data protocol and represents UvA researchers on the Research Data Management (RDM) Programme Board.

More meta-analyses are possible

‘Open research data as a default, unless there is a strong case against it. This is the EU’s requirement from 2017, and I think that’s an excellent guideline. Taxpayers need to have access to the data. This also makes research data much more valuable. Its value to society increases exponentially, as among other things it makes more meta-analyses possible, which are extremely useful. A key text on the subject reports that, if a meta-analysis had been conducted into infant death in the 1970s, it would have resulted in 60,000 fewer deaths. In other words, open research data can save lives. Right now it’s often still very tricky to gather data for meta-analyses.’

Sharing research data as much as possible

‘One of the four aims of the protocol that has been developed for the Faculty of Humanities is that we share research data as much as possible. Researchers often worry about publishing their data while the study they are working on is still in progress, as it means others could start doing analyses as well. That’s why we don’t publish the data until a research project has been completed. The necessity of open data has come to be generally accepted here. What we’re still debating is after how many years the data should be made public. What do you do if you want to continue working with your data after completing your thesis?’

What counts as data?

‘We have discussed the faculty’s data protocol with the research directors on the Research Council, the advisory council and the group meetings. The Research Council has already approved it, and it needs to be formally adopted by the dean before the end of this year. The talks revealed that some types of research aren’t well suited to RDM, and that a number of groups are already very active with RDM and therefore aren’t keen on yet another guideline to comply with. We also talked extensively about what counts as data, and what data need to be stored.’

Notes in the margin

‘For example, in the case of book reviews by literature scholars, do their notes in the book’s margins count as research data, and should they be made publicly available? Researchers could view their notes as interpretation, not data. And what about linguists doing participatory research on languages in the Amazon, for example? It’s not always possible to publish recordings of people who hold their language to be sacred. In situations like this, what do you store, and how?’

Which raw, derivative and analysis data should be stored?

‘Our protocol contains a description of the raw and derivative data that have to be stored. For historical research, for example, all sources that are difficult to access need to be scanned or photographed and added to the dataset. That’s important with a view to replication. The protocol also states which analysis data need to be stored.’

Organise your research data with other people in mind

‘The protocol’s main objective is the safe storage of data. Currently, our facilities for this are not ideal, so we are very pleased about the arrival of the centralised RDM system. Provided that it’s sufficiently user-friendly, our researchers will certainly be using it. The second and third objectives – accountability and data organisation – refer to researchers’ responsibility to ensure that others can easily consult their data. This means they have to document all of the steps in the analysis, and clearly describe and code their research data and organise it into folders.’

Concerns about the retention period

‘My primary remaining concern is the retention period. The central UvA-AUAS guidelines stipulate a minimum period of ten years. For many of our disciplines – take history, for example! – that’s very short. I had hoped that, when research data is made public, it would remain so indefinitely. I hope long-term retention in the centralised RDM system will be possible, because we want to store our data for a very long time. People often claim that data storage is expensive, but that’s not true. In fact, it’s becoming more and more affordable.’

Leap forward

‘Everyone at my research institute is aware that the data protocol has been developed. Soon everyone will be sent a copy, after which we need to go on to talk about possible interpretations and how to put it into practice at the institute. Research data management is also discussed in staff annual consultations, but it could be a more explicit facet of the review. I am expecting that we will be able to make a significant leap forward in 2017 with our protocol and the central RDM system in place.’

Faculty of Humanities’ RDM Protocol

The main aims of the Faculty of Humanities’ RDM Protocol are:

  1. you must not lose your data
  2. you are accountable for your data
  3. you must organise your data with other people in mind
  4. you must share your data with others as much as possible

These aims of the Faculty of Humanities’ RDM Protocol are derived from the main aims of the data protocol used by UvA psychologists: safety, accountability, efficiency and data-sharing. After the protocol has been formally adopted, it will be published on the Faculty of Humanities homepage.

Interview: Anneke de Maat

Published by RDM support