What is ♻️ Reproducibility?

and why should I care?

👨‍🏫 Dr David Wilby (he/him)

Open Data Science Summer School
Wellcome Trust DTC in Public Health, Economics & Decision Science

Thurs 4th Aug 2022

Schedule

09:30 Welcome and Orientation
09:40 TALK: David Wilby: Research Reproducibility.
10:00 Tips and Tricks for Reproducing and Reviewing.
10:15 Select Papers, Team Formation, Chat and Coffee
10:45 Round I of ReproHacking
12:15 Re-group and sharing of experiences
12:30 LUNCH
13:30 Round II of ReproHacking
14:45 Coffee break & TALK: Bob Turner: FAIR 4 Research Software
15:15 Round III of ReproHacking - Complete Feedback form
16:15 Re-group and sharing of experiences
16:30 Feedback and Closing

Event page: reprohack.org/event/18

How do we do research?

The Turing Way project illustration by Scriberia. Used under a CC-BY 4.0 licence. DOI: 10.5281/zenodo.3332807

How do we do research?

Image: Rebus community

What is reproducibility?

How the Turing Way defines reproducible research

Image: The Turing Way

Why is reproducibility important for research?


  • Verification ✔️

  • Re-use ♻️

  • Longevity ⏳

  • Efficiency ⌚

Is there a problem with reproducibility?

Ioannidis (2005) PLoS Medicine

Different kinds of reproducibility


  • Experimental 🧑‍🔬

    • Resource-intensive
    • Costly
    • Harder to describe
  • Analytical 🧑‍💻

    • Well-represented by code
    • Probably easier to reproduce

What isn’t reproducibility?

Definitely not ❌

“…results were produced using custom MATLAB scripts…”

“Code/data available on reasonable request…”

Up for debate ❓

  • Equations only?
  • Proprietary software?
  • Spreadsheets?

only those who are faultless have the right to pass judgment on others

🧔‍♀️ [paraphrasing] Jesus, John 8:7, Christian Bible

How can we do it? (1/2)


  • Research data management 💾
  • Version control 🐙
  • Open source development 👩‍💻
  • Documentation 📜
  • Tutorials 👩‍🏫
  • Literate programming (e.g. R Markdown, Quarto, Jupyter notebooks)
  • Environment management (e.g. Conda 🐍, renv, Docker 🐳)
  • Talk to each other! 👥
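
As a small illustration of the environment management point, here is a minimal sketch (not a replacement for Conda, renv or Docker) that records the Python version, platform, installed packages and a fixed random seed alongside an analysis. The function name `capture_environment` and the shape of the record are assumptions made for this example.

```python
import json
import platform
import random
import sys
from importlib import metadata


def capture_environment(seed=0):
    """Return a JSON-serialisable record of the current run environment.

    Hypothetical helper for illustration: the name and record layout are
    assumptions, not part of any tool mentioned in the slides.
    """
    # Fix the seed so steps that use the `random` module can be repeated.
    random.seed(seed)
    # List installed distributions as "name==version", skipping any
    # entries whose metadata has no name.
    packages = sorted(
        f"{dist.metadata.get('Name')}=={dist.version}"
        for dist in metadata.distributions()
        if dist.metadata.get("Name")
    )
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "seed": seed,
        "packages": packages,
    }


if __name__ == "__main__":
    # Print the record; in practice it could be saved next to the results.
    print(json.dumps(capture_environment(seed=42), indent=2))
```

Saving a record like this next to your results gives a reader a starting point for rebuilding the environment, though a lock file or container image is usually more robust.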

How can we do it? (2/2)

Barriers to reproducibility 🚧


sensu The Turing Way

  • Requires additional skills
  • May make you (feel) vulnerable
  • Publication bias toward novel findings
  • Limited benefit to career advancement
  • Need to support other users of your code
  • Can make your mistakes public

Advantages 🎉


  • Validation
  • Reusability
  • Longevity
  • Efficiency
  • Facilitate collaboration and review process
  • Publish validated research and avoid misinformation
  • Write your papers, thesis and reports efficiently
  • Get fair credit for your work
  • Ensure continuity of your work

Now it’s time to…

  1. 📃 Find an interesting paper

  2. 🎯 Try to follow the authors’ instructions to reproduce it!

  3. Give friendly feedback

  4. 🦸‍♀️ Learn!

Choosing a paper

  1. Get up and look around the room 👀
  2. Choose from the ReproHack database: reprohack.org/paper 📚

Consider:

  • programming language & tools
  • topic
  • computational intensity
  • previous reproducibility score
  • use the tags or search to narrow down
  • could I learn something new?

Reproducing the paper


  • where is the code?

  • where is the data?

  • what hardware do you need to run the code?

  • what software do you need to run the code?

  • are there instructions to follow?
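
A first pass over these questions can be partly automated. The sketch below checks a downloaded project folder for a few conventional files and directories. The names in `CHECKS` are assumptions based on common practice, not a ReproHack requirement, and the hardware question still needs a human to read the paper.

```python
from pathlib import Path

# Hypothetical file and folder names: common conventions only,
# chosen for illustration rather than taken from the slides.
CHECKS = {
    "instructions": ["README.md", "README.rst", "README.txt", "README"],
    "software": ["environment.yml", "requirements.txt", "renv.lock", "Dockerfile"],
    "data": ["data"],
    "code": ["src", "scripts", "analysis", "code"],
}


def checklist(project_dir):
    """Map each checklist item to the matching entries found in project_dir."""
    root = Path(project_dir)
    return {
        item: [name for name in candidates if (root / name).exists()]
        for item, candidates in CHECKS.items()
    }


if __name__ == "__main__":
    # Report what was found in the current directory.
    for item, found in checklist(".").items():
        status = ", ".join(found) if found else "not found"
        print(f"{item:>12}: {status}")
```

An empty result for an item is not a failure in itself, since the material may live elsewhere (for example in a data repository linked from the paper), but it is worth noting in your feedback.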

Reviewing

Think about:

  • how easy was it to get the data? 💾
  • how easy was it to get the code? 💻
  • what was or wasn’t documented well? 📖
  • reproducibility score (?/10) 🎰

Be kind!

👀 ReproHack Code of Conduct

Additional Considerations

  • Reproducibility is hard!

  • Submitting authors are incredibly brave!

Thank you Authors! 🙌


  • Without them there would be no ReproHack


  • Show gratitude and appreciation for their effort and courage 🙏


  • Constructive criticism only please!

How to ReproHack

  1. Choose a paper
  2. Find some team mates (if you like!)
  3. Have a go at reproducing the results
  4. Feed back to the author(s) - nicely!
  5. Optionally repeat! 😀