‘The world will be better off if we share knowledge and data; that’s why I am a strong supporter of open access and open data. Science is discourse. You have to be able to exchange ideas and collaborate – and to do so you need access. As a lawyer, I am happy to be a part of that effort’, Lucie Guibault says enthusiastically.
Guibault is an associate professor at the UvA Institute for Information Law. Together with Andreas Wiebe, she published the OpenAIRE report Safe to be open: Study on the protection of research data and recommendations for access and usage in 2013, the result of an EU-funded research consortium on various aspects of intellectual property in research data. Another report on privacy in research data is forthcoming.
‘Raw data are not protected by copyright because copyright does not protect ideas, but rather only original expressions that bear the mark of the author. In experimental research you have to abide by so many rules that your selection of data is almost never original. Researchers do hold copyright on the publications based on their research data. At 80% of the large publishers, however, they must transfer that copyright to the publisher upon publication.’
‘Publication via open access is still limited, leading to a scarcity of material available for text and data mining. That’s a shame because there is a wealth of valuable data and no one has the time to read all the research worldwide, so the only way to unlock it is through data mining. With open data you can link, compare and analyse files in order to gain new insights; this makes them even more valuable.’
‘In addition to copyright, we have also had database rights in Europe since 1996. These rights apply to a database if you can prove that you have made a substantial investment in collecting the data. The specifics of database rights are still vague, so we often have to turn to the European Court of Justice for its interpretation. In the United States there are no database rights; there the free flow of information is thought to be more important. The problem is that a practice has arisen in Europe in which large publishers charge extra to make publications and data available for Text and Data Mining (TDM).’
Guibault is involved in two large EU Horizon 2020 research projects, both of which deal with Text and Data Mining (TDM): FutureTDM and OpenMinTeD. ‘European legislators are reviewing European copyright law. With FutureTDM our aim is to play a role in that review, so that a mandatory limitation for TDM is added to European copyright and database rights. In 2016 the European Commission published a draft directive on copyright in the Digital Single Market, which included a copyright exception for TDM. It is a step forward, but we are not yet 100% satisfied with this proposal because it only applies to TDM activities carried out by research organisations. We think TDM should be possible for everyone, not just staff members at research organisations.’
‘The purpose of OpenMinTeD is to create an infrastructure to better enable TDM. Both technical and legal aspects are involved. We are building a platform that offers knowledge as well as software tools in order to promote the interoperability of data, so that data can be combined more easily. Nearly every country has its own specific rules regarding copyright, image rights and listening rights. This currently causes a great deal of legal uncertainty. It is therefore my wish that when scientists publish their data, they do so under a Creative Commons (CC) licence.’
‘CC offers scientists and all other producers of creative works the freedom to manage their copyright in a flexible manner. By selecting one of six (free) available standard licences, copyright holders can determine the extent to which their works may be distributed. This immediately makes clear to everyone the conditions under which they may use a work, without having to ask for permission each time. I am a strong supporter of the least restrictive variant of this: Attribution only. To be able to combine or compare data sets, the licences must of course be compatible. In OpenMinTeD we are also working on a tool which compares these licences across a broad range of aspects in order to quickly assess whether they are compatible.’
Interview: Anneke de Maat