EU Open Science

The replication crisis is a widely acknowledged and damaging failure in the communication of scientific research: the results of a large proportion of studies across fields cannot easily be replicated, and often cannot be replicated at all. Some behaviours that have led to this crisis include:

  • not making research data available with published results

  • not making intermediary processing steps available as scripts or workflows, nor fully describing them

  • not making adopted research software available for replication

  • using proprietary and closed source software and platforms

  • not allowing others to use or build on data or software developed during the research, through omission of a license or use of restrictive licenses

  • manual manipulation of data and figures

The idea of Open Science is to develop and implement recommended practices, funding stipulations and tools that improve the communication of scientific research, so that results can be replicated and stakeholders, including taxpayers, researchers themselves, charity donors and industry backers, derive more value from the resources invested in scientific work. Open Science has recently been identified as a priority of the European Commission, which is increasingly reflected in funding decisions and stipulations.

One of the pillars of Open Science is the communication of scientific data. The FAIR principles are a set of attributes of scientific data that make it compatible with Open Science goals. They are:

  • Findable: Data is described or annotated such that it can be found by humans or machines via search, and is uniquely and persistently identifiable.

  • Accessible: There is a clear, machine-parseable mechanism by which the data can be retrieved. This often amounts to storing it in a publicly accessible repository.

  • Interoperable: The data is made available in a format, and with sufficient annotation, that allows it to be combined with other data or consumed by analytics tooling without further human annotation. This often amounts to using self-describing storage formats.

  • Reusable: Data is formatted and licensed in such a way that it can be reused by others without further interaction with the producer.
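
As a rough illustration of how these four attributes might map onto a dataset's metadata, the sketch below checks a record for fields associated with each principle. The field names, the `FAIR_FIELDS` mapping, the identifier and the URL are all hypothetical and chosen for illustration only; they do not follow any particular metadata standard.

```python
# Hypothetical mapping from each FAIR principle to metadata fields
# that would support it. Not taken from any formal schema.
FAIR_FIELDS = {
    "findable": ["identifier", "title", "keywords"],
    "accessible": ["access_url"],
    "interoperable": ["media_type"],
    "reusable": ["license"],
}


def missing_fair_fields(record: dict) -> dict:
    """Return, per principle, the fields that are absent or empty."""
    gaps = {}
    for principle, names in FAIR_FIELDS.items():
        missing = [name for name in names if not record.get(name)]
        if missing:
            gaps[principle] = missing
    return gaps


# Illustrative record; every value here is made up.
record = {
    "identifier": "doi:10.0000/hypothetical-dataset",
    "title": "Example sea surface temperature grid",
    "keywords": ["oceanography", "temperature"],
    "access_url": "https://repository.example.org/datasets/42",
    "media_type": "application/x-netcdf",  # a self-describing format
    "license": "",  # deliberately left empty
}

print(missing_fair_fields(record))
```

Here the record would be flagged only under "reusable", since no license is given. A check like this covers only the presence of metadata; whether a dataset is genuinely FAIR also depends on the quality of the annotation and the repository it lives in.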

The Research Data Alliance is a community focused on best practice in research data handling, with an emphasis on FAIR data. Although it is not a standards development organisation, it does maintain a collection of ICT Technical Specifications and many supporting resources.

The FAIR principles were originally designed with scientific data in mind; however, they have been extended to other areas, including:

  • web services

  • research software and source code

  • scientific workflows

An Open Science-aligned movement related to FAIR is reproducible computing, which aims to make the full computing environment used to develop scientific results reproducible. This is becoming increasingly feasible with the development of immutable operating systems and compute environments, because the entire environment used to produce scientific results can be described, reproduced and verified with a hash comparison.
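
The hash comparison can be sketched in a few lines. The example below is a simplified assumption of how such a check might look: it hashes a pinned package list, whereas immutable systems typically hash the full set of build inputs or outputs. The function name `environment_digest` and the package lists are illustrative only.

```python
import hashlib


def environment_digest(spec_text: str) -> str:
    """Return a SHA-256 digest of an environment description.

    Lines are stripped and sorted first, so that ordering and
    whitespace differences do not change the digest.
    """
    lines = sorted(
        line.strip() for line in spec_text.splitlines() if line.strip()
    )
    normalised = "\n".join(lines)
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


# Illustrative pinned package lists (not from any real study).
published = "numpy==1.26.4\nscipy==1.13.0\n"
rebuilt = "scipy==1.13.0\nnumpy==1.26.4\n"
drifted = "numpy==1.26.4\nscipy==1.12.0\n"

# Same packages in a different order: the digests match.
print(environment_digest(published) == environment_digest(rebuilt))

# One version differs: the digests no longer match.
print(environment_digest(published) == environment_digest(drifted))
```

Publishing the digest alongside the results would let a reader confirm that a rebuilt environment matches the one described, without comparing it package by package.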

The application of Open Science principles brings opportunities for collaboration and innovation in the EU. In the area of scientific data, the European Strategy for Data aims to create a single market for data in support of Europe’s global competitiveness and data sovereignty. It focuses on the development of Common European Data Spaces, which together form that single market. The concept is supported by the Data Spaces Support Centre and, in software, by the Smart Open-source Middleware (SIMPl). There is also a published list of planned and active data spaces. The SIMPl middleware was developed based on the experience of developing some of these data spaces, allowing tighter integration on a common platform.

In terms of consumption or application of open data, the European Open Science Cloud (EOSC) aims to provide European researchers, innovators, companies and citizens with a federated and open multi-disciplinary environment where they can publish, find and reuse data, tools and services for research, innovation and educational purposes.

Further Reading

Events