Open Science and Reproducibility

A key concept in science is reproducibility. Imagine that a researcher or research group carries out a study, reaches a conclusion and publishes it. We call the work reproducible if the publication describes the study in such a way that another research group can follow the description, repeat the study and come to the same conclusion.

The issue of reproducibility is particularly acute when computer-based work is involved, for example to generate or analyse data. Full reproducibility requires archiving the software and recording exactly which processing steps were carried out, in which order and on what data, and how figures, tables and other results were obtained from them. Ideally, any software used should be open source so that there are no ‘hidden steps’ in processing.
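As a rough illustration of what recording "which steps, in which order, on what data" can look like in practice, the sketch below builds a small provenance record using only the Python standard library. The function names (`sha256_of_bytes`, `make_provenance_record`) and the record layout are hypothetical choices made for this example, not part of any OpenDreamKit tool.

```python
import hashlib
import json
import platform
import sys


def sha256_of_bytes(data: bytes) -> str:
    """Return a hex digest that identifies the exact input data."""
    return hashlib.sha256(data).hexdigest()


def make_provenance_record(input_data: bytes, steps: list[str]) -> dict:
    """Collect the environment, a fingerprint of the data, and the ordered steps.

    The layout of this record is an assumption made for illustration.
    """
    return {
        "python_version": sys.version.split()[0],
        "platform": platform.platform(),
        "input_sha256": sha256_of_bytes(input_data),
        "steps": list(steps),  # order matters, so keep it as a list
    }


if __name__ == "__main__":
    raw = b"1.0,2.0,3.0\n"  # stand-in for the contents of a data file
    record = make_provenance_record(
        raw, ["load CSV", "compute mean", "plot figure 1"]
    )
    print(json.dumps(record, indent=2))
```

Storing such a record next to the published results lets a reader check that they are starting from the same data and following the same sequence of steps.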

Reproducibility raises a number of challenges, both technical and social.

On the technical side, the OpenDreamKit project supports more reproducible research by making it possible to drive computational science from within the Jupyter Notebook, which provides a full and detailed record of the computational steps taken, together with the results obtained. The notebook also allows the results to be annotated, thus capturing the key elements of a scientific study: experiment, processing and interpretation. This makes it easier for scientists to log their actions and share them with others, for example as an appendix to a publication. Further technical work includes the creation of the NBVAL package and support of the Binder project, which archives the computational environment in which such a notebook can execute successfully.
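The idea of checking that a notebook still reproduces its recorded results can be sketched in a few lines. The example below is not NBVAL's implementation. It is a simplified, standard-library-only illustration that assumes a toy notebook structure (a dictionary with `cells`, each holding a `source` string and stored `stream` outputs) and a hypothetical helper `validate_notebook`. It re-runs each code cell and reports the cells whose fresh output differs from the stored one.

```python
import contextlib
import io


def run_cell(source: str, namespace: dict) -> str:
    """Execute one code cell and return what it printed to stdout."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, namespace)  # cells share one namespace, as in a notebook
    return buffer.getvalue()


def validate_notebook(notebook: dict) -> list[int]:
    """Return the indices of code cells whose output no longer matches."""
    namespace: dict = {}
    mismatches = []
    for index, cell in enumerate(notebook["cells"]):
        if cell["cell_type"] != "code":
            continue  # markdown cells carry annotation, not computation
        actual = run_cell(cell["source"], namespace)
        stored = "".join(
            out["text"]
            for out in cell["outputs"]
            if out.get("output_type") == "stream"
        )
        if actual != stored:
            mismatches.append(index)
    return mismatches


# Toy notebook (simplified structure, assumed for this sketch).
notebook = {
    "cells": [
        {"cell_type": "markdown", "source": "Compute a sum."},
        {
            "cell_type": "code",
            "source": "total = sum(range(5))\nprint(total)",
            "outputs": [{"output_type": "stream", "name": "stdout", "text": "10\n"}],
        },
        {
            "cell_type": "code",
            "source": "print(total * 2)",
            # Deliberately stale stored output, so validation flags this cell.
            "outputs": [{"output_type": "stream", "name": "stdout", "text": "21\n"}],
        },
    ]
}

print(validate_notebook(notebook))  # [2]
```

With the actual NBVAL package installed as a pytest plugin, a comparable check on a real notebook is typically run from the command line as `pytest --nbval notebook.ipynb`, where `notebook.ipynb` is a placeholder file name.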

On the social side, it is important to convince scientists, and even more so journal editors and policy makers, that reproducibility is important for better science and a more effective use of taxpayers' money. OpenDreamKit runs a wide range of workshops to spread this message and to train scientists in the use of the relevant tools.

Mixing Data and Computation to explore mathematical data sets: Knowledge to the rescue with LMFDB + SageMath + Pari + MitM

Diffing and Merging Jupyter Notebooks with nbdime
