Combating the Reproducibility Crisis in Computational Proteomics


Jan 22, 2025 (28 mins)

On this episode of Translating Proteomics, co-hosts Parag Mallick and Andreas Huhmer of Nautilus Biotechnology discuss the reproducibility crisis in biology and specifically focus on how we can enhance reproducibility in computational proteomics. Key topics they cover include:

• What the reproducibility crisis is

• Factors that make it difficult to replicate multiomics research

• Steps we can take to make biology research more reproducible

Chapters 

00:00 – 01:20 – Introduction

01:20 – 03:10 – What is reproducibility in research and why is it important?

03:10 – 05:42 – Recent work from the Mallick Lab focused on computational proteomics reproducibility

05:42 – 09:32 – Ways to help improve reproducibility in computational proteomics – More detailed documentation, moving beyond papers as our main form of documentation, and ensuring computational workflows are available

09:32 – 11:30 – Why Parag got interested in reproducibility – Attempts to build AI layers on top of current workflows

11:30 – 14:00 – The need to create repositories of analytical workflows codified in a structured way that AI can learn from

14:00 – 15:24 – A role for dedicated data curators

15:24 – 18:31 – Moving beyond the idea of study endpoints and recognizing data as part of a larger whole

18:31 – 21:32 – How does AI fit into the continuous analysis and incorporation of new datasets?

21:32 – 23:36 – The role of AI in helping researchers design experiments

23:36 – 27:25 – Three things we can do today to increase the reproducibility of computational proteomics experiments:

· Be clear about the stated hypothesis

· Document analyses through workflow engines and containerized workflows

· Advocate for funding for reproducibility and reproducibility tools

27:25 – End – Outro

Resources

Parag’s Gilbert S. Omenn Computational Proteomics Award Lecture

o   In this lecture, Parag describes his vision for a more reproducible future in proteomics

Nature Special on “Challenges in irreproducible research”

o   A list of articles and perspective pieces discussing the “reproducibility crisis” in research

Why Most Published Research Findings Are False (Ioannidis 2005)

o   Article outlining many of the issues that make it difficult to reproduce research findings

Reproducibility Project: Cancer Biology

o   eLife initiative investigating reproducibility in preclinical cancer research

Center for Open Science Preregistration Initiative

o   Resources for preregistering a hypothesis as part of a study

National Institute of Standards and Technology (NIST)

o   US government agency that aims to be “the world’s leader in creating critical measurement solutions and promoting equitable standards.”

MSstats

o   Open source software for mass spec data analysis from Bioconductor

National Institute of General Medical Sciences

o   US government agency focused on “basic research that increases understanding of biological processes and lays the foundation for advances in disease diagnosis, treatment, and prevention.”

Chan Zuckerberg Initiative – Essential Open Source Software for Science

o   CZI program supporting “software maintenance, growth, development, and community engagement for critical open source tools.”