Veronika Cheplygina - Shortcuts and other shortcomings in machine learning for medical imaging - Perspectives on Scientific Error 2024
For slides, see osf.io/ayfek/
The application of machine learning (ML) to medical imaging diagnosis has attracted a lot of attention in recent years, with numerous reports of algorithms recognising medical images more accurately than human experts (for an overview see Liu et al., 2019). Yet progress in clinical practice has not been proportional to these claims. For example, Roberts et al. (2021) found that none of 62 published studies on ML for COVID-19 had potential for clinical use. Reviews of other clinical applications of ML have similarly failed to find reliable published prediction models.
The increased popularity of ML in recent years is often explained by two developments. First, there are several large publicly available datasets. Second, open-source deep-learning toolboxes allow development of algorithms without specialised domain knowledge, bringing more researchers into the field. Despite these seemingly ideal conditions for reproducibility, the state of ML in medical imaging is not as positive as one might think. We outline various reasons for this in Varoquaux and Cheplygina (2022); here we highlight two.
One reason is that large sample sizes are not a panacea. There is a tendency to expect that a clinical task can be "solved" if the dataset is large enough. However, not all clinical tasks translate neatly into ML tasks. Furthermore, creating larger datasets often comes at the expense of quality, leading algorithms to learn spurious correlations or "shortcuts". For example, an algorithm might learn that if a patient's chest x-ray shows a drain (a treatment for a collapsed lung), then that patient likely has a collapsed lung (Oakden-Rayner, 2020). Similarly, our recent results (in preparation) show that lung diseases can be diagnosed with high accuracy even when the lungs themselves are hidden from the x-ray.
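The drain example can be sketched with a toy simulation. The data below is entirely synthetic (the disease prevalence, drain rates, and dataset sizes are illustrative assumptions, not figures from the talk): a "classifier" that only checks for the spurious drain feature looks accurate on data that shares the bias, but collapses on data where the correlation is broken.

```python
import random

def make_dataset(n, p_drain_given_disease, seed):
    # Synthetic records: (has_drain, has_disease) pairs.
    # Assumptions (hypothetical): 50% disease prevalence,
    # 5% of healthy patients also show a drain.
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        has_disease = rng.random() < 0.5
        p_drain = p_drain_given_disease if has_disease else 0.05
        data.append((rng.random() < p_drain, has_disease))
    return data

def shortcut_classifier(has_drain):
    # The "shortcut": predict disease whenever a drain is visible,
    # ignoring the lungs entirely.
    return has_drain

def accuracy(data):
    return sum(shortcut_classifier(d) == y for d, y in data) / len(data)

# Training-like distribution: 90% of diseased patients have a drain.
biased = make_dataset(10_000, 0.9, seed=0)
# Deployment-like distribution: patients imaged before treatment, no drains.
shifted = make_dataset(10_000, 0.0, seed=1)

print(f"accuracy on biased data:  {accuracy(biased):.2f}")
print(f"accuracy on shifted data: {accuracy(shifted):.2f}")
```

On the biased data the shortcut scores well above 90%, while on the shifted data it drops below chance, even though nothing about the disease itself changed.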
A second reason is that the availability of data and code, plus the theoretical option to "infinitely" repeat experiments (for example, with different subsets of data, different initialization points of the algorithms, and so forth), creates an illusion of generalization. Since there are many degrees of freedom in how such repetitions can be done, researchers cannot, for practical reasons, explore them exhaustively, yet may be tempted to state their conclusions more generally than the experiments support.
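One facet of this can be seen in a minimal simulation (my own illustrative sketch, with hypothetical numbers): evaluating a classifier with no real skill on many small random test subsets produces a spread of accuracy estimates, and reporting a favourable run rather than the spread makes chance-level performance look like signal.

```python
import random

def evaluate_once(rng, n_test=50):
    # Toy evaluation: a chance-level classifier (true accuracy
    # exactly 0.5) scored on a random test subset of n_test cases.
    return sum(rng.random() < 0.5 for _ in range(n_test)) / n_test

rng = random.Random(42)
# 100 repetitions with different "splits" (here: fresh random draws).
runs = [evaluate_once(rng) for _ in range(100)]

mean_acc = sum(runs) / len(runs)
best_acc = max(runs)
print(f"mean over 100 runs: {mean_acc:.2f}")  # hovers near 0.50
print(f"best single run:    {best_acc:.2f}")  # noticeably higher
```

The mean stays near chance, but the best single run can exceed 60% accuracy purely from sampling noise; conclusions drawn from a favourable subset of repetitions generalize less than they appear to.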
In this talk I dive deeper into these problems and hopefully, with the help of the audience, also explore some solutions.