Progress report for Sheffield/Leeds
Tania Allard and Michael Croucher
Reporting period from March 2017 to June 2018
Finance and administration
Hiring
Achievements
- Supported a number of lecturers in Sheffield in migrating to Jupyter and CoCalc (formerly SageMathCloud), and provided continued support to those who were already using CoCalc and Jupyter notebooks for their courses. These included lecturers from Computer Science, Physics, Biomedical Science, Bioinformatics, and Materials Science (D2.17, T2.6).
- Extended the existing CoCalc tutorial with sections for students taking courses in CoCalc and with a hands-on tutorial to help lecturers get started. The material is available as a website at https://tutorial.cocalc.com/ (repository: https://github.com/sagemathinc/cocalc_tutorial) (D2.17, T2.6).
- Developed a Jekyll template for academics and researchers who use Jupyter notebooks for course materials and dissemination. The template allows the creation of websites based on Jupyter notebooks using Jekyll, the default static website framework supported by GitHub. It also allows easy display of notebooks connected to cloud computing resources such as Microsoft Azure Notebooks (D2.17, T2.6).
- Examples of courses using such templates can be found at:
- The documentation for the template can be found at: http://trallard.github.io/Modules-template-docs/
- Developed a Python package, nbjekyll, that complements the Jekyll template described above. The package converts Jupyter notebooks into .md files that Jekyll can use directly (the conversion uses nbformat). It also uses nbval to validate the notebooks and adds custom headers indicating the last update date, version and test status of each notebook (D2.17, T2.6).
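As an illustration of the kind of conversion described above, the following is a minimal sketch using only the Python standard library. It is not the nbjekyll implementation or API: the function name `notebook_to_markdown` and the front matter fields `last_updated` and `tested` are hypothetical, and the sketch assumes a version 4 notebook that has already been loaded as a dictionary (for example with `json.load`).

```python
import json
from datetime import date

FENCE = "`" * 3  # Markdown code fence, built here to avoid nesting fences


def notebook_to_markdown(nb: dict, title: str, updated: date, tested: bool) -> str:
    """Render a v4 notebook dict as Markdown with Jekyll-style front matter.

    Hypothetical helper for illustration only; field names are assumptions.
    """
    # Front matter block read by Jekyll; json.dumps quotes the title safely
    lines = [
        "---",
        f"title: {json.dumps(title)}",
        f"last_updated: {updated.isoformat()}",
        f"tested: {'true' if tested else 'false'}",
        "---",
        "",
    ]
    # Use the notebook's declared language for code fences, if present
    language = nb.get("metadata", {}).get("language_info", {}).get("name", "")
    for cell in nb.get("cells", []):
        source = cell.get("source", "")
        # The notebook format stores cell source as a string or a list of strings
        if isinstance(source, list):
            source = "".join(source)
        if cell.get("cell_type") == "markdown":
            lines.append(source)
        elif cell.get("cell_type") == "code":
            lines.extend([FENCE + language, source, FENCE])
        else:
            continue  # skip raw cells and anything unrecognised
        lines.append("")
    return "\n".join(lines)
```

A notebook file could then be converted by loading it with `json.load` and writing the returned string to a .md file in the Jekyll site. Validation with nbval is a separate step and is not shown here.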
Work in progress
- Involvement in the Sheffield Machine Learning Network: working with the Machine Learning group at Sheffield to leverage the use of Jupyter notebooks and related technologies (e.g. code cafe, coding dojos) (T2.3)
- Creation of an online journal based on Jupyter notebook submissions, with content review performed on GitHub (the first volume will be generated from a sprint/hackathon) (D2.17, T2.6, T2.3)
- The infrastructure for the journal itself is now a web app deployed on Heroku
- The contents for the journal can be found at: https://github.com/MCNotes
- The review process will be fully automated via GitHub issues and pull requests, and a bot has been created to handle most of the review administration and validation tasks (https://github.com/MCNotes/wintermute)
- Tania Allard will attend Women in Sage in Montreal from July 8th to 13th. She will give a talk on practical best practices for computational sciences and will also participate in the sprints taking place during the event.
Workshops and dissemination activities
- Developed and delivered a workshop on using Jupyter notebooks for reproducible research at the 2nd international Research Software Engineering conference. The workshop was one of the most popular of the entire conference and we were asked to deliver it twice to meet demand. The workshop materials are at https://github.com/trallard/JNB_reproducible, and a blog post on the RSE Sheffield blog was reposted on the Software Carpentry blog (D2.17, T2.6).
- Developed and delivered Bioinformatics Awareness Days (https://github.com/trallard/BAD_days) in collaboration with Prof. Luisa Cutillo of Parthenope University of Naples (D2.17, T2.6).
- Developed training materials and provided training for over 130 women in the last 12 months at Sheffield and Manchester in partnership with CodeFirstGirls (D2.17, T2.6, T2.5).
- Participated in the Diversity and Inclusion in Scientific Computing unconference at the direct invitation of NumFOCUS (T2.5).
- Workshop and open materials on reproducible analysis in Python: these materials cover the essentials of developing workflows with reproducibility in mind
- The workshop was first delivered at PyCon 2018 to over 60 attendees from all over the world
- The materials were later extended with more content on Jupyter notebook validation using nbval and on property-based testing
- All the materials are licensed under CC-BY, can be found at https://github.com/trallard/ReproduciblePython and are also shared using Binder
- As a follow-up to this workshop, Tania Allard has been invited to give a talk about reproducibility in data pipelines at the RAPIDS conference in London
- Web data in Python for non computer science people: a set of Jupyter notebooks and materials on how to use Python to collect, clean and analyse web data, developed in conjunction with the Sheffield Research Methods Institute. In addition to developing and open sourcing the materials, a workshop based on them was taught on May 18th, 21st, 22nd and 23rd, along with other Software Carpentry workshops.
- The materials can be found at https://github.com/trallard/WebData_Python. They are licensed under CC-BY and are also shared using Binder
</section>