Best of NeurIPS - Feb 6

Feb 6, 2025 at 9 AM Pacific

Best of NeurIPS 2024 event

Welcome to the Best of NeurIPS virtual series, which highlights some of the groundbreaking research, insights, and innovations that defined this year’s conference, streamed live from the authors to you.

Intrinsic Self-Supervision for Data Quality Audits

Fabian Gröger
University of Basel

Benchmark datasets in computer vision often contain issues such as off-topic samples, near-duplicates, and label errors, compromising model evaluation accuracy. This talk will discuss SelfClean, a data-cleaning framework that leverages self-supervised representation learning and distance-based indicators to detect these issues effectively.

By framing the task as a ranking or scoring problem, SelfClean minimizes human effort while outperforming competing methods in identifying synthetic and natural contamination across natural and medical domains. With this methodology, we identified up to 16% of problematic samples in current benchmark datasets and enhanced the reliability of model performance evaluation.
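To make the ranking idea concrete, here is a minimal sketch of how near-duplicate candidates could be ordered by distance in an embedding space. It is not the SelfClean implementation: the toy embeddings, the choice of cosine distance, and the function names are all assumptions for illustration. In practice the vectors would come from a self-supervised encoder.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """Cosine distance between two non-zero vectors (0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def rank_near_duplicate_candidates(embeddings):
    """Return (distance, i, j) for every pair of samples, closest pairs first.

    Hypothetical helper: a reviewer would inspect the top of this list
    instead of checking every pair by hand.
    """
    pairs = [
        (cosine_distance(embeddings[i], embeddings[j]), i, j)
        for i, j in combinations(range(len(embeddings)), 2)
    ]
    return sorted(pairs)


# Toy 2-D embeddings (made up); samples 0 and 2 are almost identical.
embeddings = [[1.0, 0.0], [0.0, 1.0], [0.99, 0.05], [-1.0, 0.2]]
ranking = rank_near_duplicate_candidates(embeddings)
closest_distance, first, second = ranking[0]
print(f"Most likely near-duplicates: samples {first} and {second}")
```

Sorting all pairs is quadratic in the number of samples, so a real audit of a large dataset would likely need an approximate nearest-neighbor search instead.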

Read the paper, “Intrinsic Self-Supervision for Data Quality Audits”

About the Speaker

Fabian Gröger is a second-year PhD student supervised by Alexander A. Navarini and Marc Pouly at the University of Basel. His research interests include self-supervised learning, data-centric machine learning research, and medical imaging.

CLIP: Insights into Zero-Shot Image Classification with Mutual Knowledge

Fawaz Sammani
Vrije Universiteit Brussel

We interpret CLIP’s zero-shot image classification by examining shared textual concepts learned by its vision and language encoders. We analyze 13 CLIP models across various architectures, sizes, and datasets. The approach highlights a human-friendly way to understand CLIP’s classification decisions.
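For readers unfamiliar with the zero-shot setup being interpreted, the sketch below shows the general decision rule: compare an image embedding with one text embedding per class prompt and take a softmax over the scaled cosine similarities. It does not load a real CLIP model and is not the authors' interpretation method. The embeddings are made-up toy vectors, and the function names and the `logit_scale` value of 100 are assumptions.

```python
import math


def normalize(vector):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def zero_shot_probs(image_emb, text_embs, logit_scale=100.0):
    """Softmax over scaled cosine similarities between one image and each class prompt."""
    image = normalize(image_emb)
    logits = []
    for text_emb in text_embs:
        text = normalize(text_emb)
        logits.append(logit_scale * sum(a * b for a, b in zip(image, text)))
    # Subtract the maximum logit for numerical stability.
    max_logit = max(logits)
    exps = [math.exp(logit - max_logit) for logit in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy stand-ins for the encoder outputs (hypothetical values).
class_names = ["cat", "dog", "car"]
text_embs = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0], [0.0, 0.1, 0.9]]
image_emb = [0.8, 0.2, 0.1]

probs = zero_shot_probs(image_emb, text_embs)
predicted = class_names[probs.index(max(probs))]
print(predicted)
```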

Read the paper, “Interpreting and Analysing CLIP’s Zero-Shot Image Classification via Mutual Knowledge”

About the Speaker

Fawaz Sammani is a second-year PhD student at the Vrije Universiteit Brussel. His research focuses on human-friendly interpretability and explainability of deep neural networks.

Multiview Scene Graph

Juexiao Zhang
NYU

Motivated by how humans perceive scenes, we propose the Multiview Scene Graph (MSG) as a general topological scene representation. MSG constructs a place+object graph from unposed RGB images, and we provide novel metrics to evaluate the graph quality. We combine visual place recognition and object association to build MSG in one Transformer decoder model. We believe MSG can connect dots across classic vision tasks to promote spatial intelligence and open new doors for topological 3D scene understanding.
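As a rough illustration of what a place+object graph could look like as a data structure, the sketch below stores two kinds of edges: links between images judged to show the same place, and links from an image to the objects seen in it. The class, its methods, and the identifiers are hypothetical, and the edge semantics are an assumption based on the description above. The sketch only stores associations that a model would have to predict; it does not perform place recognition or object association.

```python
from dataclasses import dataclass, field


@dataclass
class SceneGraph:
    """Toy place+object graph (hypothetical structure, not the MSG code)."""

    place_edges: set = field(default_factory=set)   # undirected image-image links
    object_edges: set = field(default_factory=set)  # (image, object) links

    def link_places(self, image_a, image_b):
        """Record that two images were judged to show the same place."""
        self.place_edges.add(frozenset((image_a, image_b)))

    def observe(self, image, obj):
        """Record that an object instance appears in an image."""
        self.object_edges.add((image, obj))

    def images_seeing(self, obj):
        """Return all images in which the given object instance appears."""
        return sorted(img for img, seen in self.object_edges if seen == obj)


graph = SceneGraph()
graph.link_places("img_0", "img_1")
graph.observe("img_0", "chair_1")
graph.observe("img_1", "chair_1")
graph.observe("img_2", "table_1")
print(graph.images_seeing("chair_1"))
```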

Read the paper, “Multiview Scene Graph”

About the Speaker

Juexiao Zhang is a second-year PhD student in computer science at NYU Courant, advised by Professor Chen Feng. He is interested in learning scene representations that are useful for robots to understand the world and interact with it.