Shuvom Sadhuka
📍 SF
I am a PhD student at MIT CSAIL, advised by Bonnie Berger. I also collaborate with Emma Pierson. Broadly, I am interested in evaluation, uncertainty estimation, and decision-making, often in the context of biomedical problems. To me, evaluation is a two-way street:
- Develop and use new tools to evaluate human decision-makers and data. Some past and ongoing work includes evaluating privacy risks in “anonymous” genomic datasets (link) and building Bayesian models of clinical decision-making (link).
- Develop new metrics and methods to analyze ML systems themselves. Given thorny issues in our data (noisy labels, sparse labels, and so on), it is unsurprising that evaluations of performance are often unreliable. On this front, I’ve investigated how to use unlabeled data to estimate performance of models (link) and repurposed sequential hypothesis testing ideas to verify agent trajectories (link).
I previously interned at Genentech, where I built statistical methods for sequentially monitoring AI agents.
I am grateful for the support of the Hertz Fellowship and the NSF GRFP. You can find my CV here. I enjoy writing, and you can find my blog posts here. You can contact me at ssadhuka [at] mit [dot] edu.
Recent Work
E-valuator: Reliable Agent Verifiers with Sequential Hypothesis Testing. Shuvom Sadhuka, Drew Prinster, Clara Fannjiang, Gabriele Scalia, Aviv Regev, Hanchen Wang (Preprint)
A Bayesian Model for Multi-stage Censoring. Shuvom Sadhuka, Sophia Lin, Bonnie Berger, Emma Pierson (ML4H 2025)
Evaluating multiple models using labeled and unlabeled data. Divya Shanmugam*, Shuvom Sadhuka*, Manish Raghavan, John Guttag, Bonnie Berger, Emma Pierson (NeurIPS 2025)
Non-research, non-writing projects
A small website to visualize embeddings of song lyrics for my own Spotify. Also includes a sparse autoencoder to interpret the embeddings.
latest posts
| Date | Post |
|---|---|
| Feb 11, 2025 | Measuring Entropy |
| Oct 21, 2024 | Fellowship Applications |
| Jun 24, 2024 | A running list of writing ideas |