July 11, 2025 | 9 AM Pacific
Online. Register for the Zoom!
OTH Regensburg
As AI becomes more prevalent in fields like healthcare, ensuring its reliability under unexpected inputs is essential. We present OpenMIBOOD, a benchmarking framework for evaluating out-of-distribution (OOD) detection methods in medical imaging. It includes 14 datasets across three medical domains and categorizes them into in-distribution, near-OOD, and far-OOD groups to assess 24 post-hoc methods. Results show that OOD detection approaches effective in natural images often fail in medical contexts, highlighting the need for domain-specific benchmarks to ensure trustworthy AI in healthcare.
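To make "post-hoc OOD detection" concrete, here is a minimal sketch of one widely known baseline, maximum softmax probability (MSP). The abstract does not list which 24 methods OpenMIBOOD evaluates, so the choice of MSP, the function names, and the threshold value are all assumptions made for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def msp_score(logits):
    """Maximum softmax probability: higher means 'looks in-distribution'."""
    return max(softmax(logits))

def is_ood(logits, threshold=0.5):
    """Flag an input as OOD when the classifier's top probability is low.

    The threshold is a hypothetical value; in practice it would be
    calibrated on held-out in-distribution data.
    """
    return msp_score(logits) < threshold
```

The score is computed from a trained classifier's outputs without any retraining, which is what makes the method post-hoc.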
Khalifa University
Semi-supervised medical image segmentation often suffers from class imbalance and high uncertainty due to pathology variability. We propose DyCON, a Dynamic Uncertainty-aware Consistency and Contrastive Learning framework that addresses these challenges via two novel losses: UnCL and FeCL. UnCL adaptively weights voxel-wise consistency based on uncertainty, initially focusing on uncertain regions and gradually shifting to confident ones. FeCL improves local feature discrimination under imbalance by applying dual focal mechanisms and adaptive entropy-based weighting to contrastive learning.
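The idea of shifting emphasis from uncertain to confident voxels can be illustrated with a toy weighting schedule. This is not DyCON's actual UnCL loss, whose formula is not given in the abstract: the linear schedule, the use of normalized entropy as the uncertainty measure, and all function names below are assumptions.

```python
import math

def normalized_entropy(probs):
    """Entropy of a class distribution, scaled to [0, 1]. Assumes >= 2 classes."""
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return h / math.log(len(probs))

def voxel_weight(probs, progress):
    """Hypothetical schedule: progress=0 favors uncertain voxels,
    progress=1 favors confident ones, with linear interpolation between."""
    u = normalized_entropy(probs)
    return (1.0 - progress) * u + progress * (1.0 - u)

def weighted_consistency(student, teacher, progress):
    """Weighted mean squared difference between student and teacher
    per-voxel class probabilities, weighted by teacher uncertainty."""
    total, weight_sum = 0.0, 0.0
    for s, t in zip(student, teacher):
        w = voxel_weight(t, progress)
        total += w * sum((a - b) ** 2 for a, b in zip(s, t))
        weight_sum += w
    return total / weight_sum if weight_sum > 0 else 0.0
```

Here `progress` would run from 0 at the start of training to 1 at the end; each element of `student` and `teacher` is the class-probability list for one voxel.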
Washington University in St. Louis
The choice of representation for geographic location significantly impacts the accuracy of models for a broad range of geospatial tasks, including fine-grained species classification, population density estimation, and biome classification. Recent works learn such representations by contrastively aligning geolocation (latitude, longitude) with co-located images. While these methods work exceptionally well, in this paper we posit that current training strategies fail to fully capture important visual features. We provide an information-theoretic perspective on why the embeddings produced by these methods discard visual information that is crucial for many downstream tasks. To solve this problem, we propose a novel retrieval-augmented strategy called RANGE. We build our method on the intuition that the visual features of a location can be estimated by combining the visual features from multiple similar-looking locations. We show that this retrieval strategy outperforms existing state-of-the-art models by significant margins in most tasks.
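The intuition of estimating a location's visual features from similar-looking locations can be sketched as a nearest-neighbor lookup. This is a simplified illustration, not the RANGE implementation: the top-k averaging rule, the cosine similarity, and the names `keys`, `values`, and `k` are assumptions.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k_visual_estimate(query, keys, values, k=2):
    """Average the stored visual features of the k database entries
    whose key embeddings are most similar to the query embedding."""
    ranked = sorted(range(len(keys)),
                    key=lambda i: cosine(query, keys[i]),
                    reverse=True)
    chosen = ranked[:k]
    dim = len(values[0])
    return [sum(values[i][d] for i in chosen) / len(chosen) for d in range(dim)]

def augmented_embedding(query, keys, values, k=2):
    """Concatenate the location embedding with its retrieved visual estimate."""
    return list(query) + top_k_visual_estimate(query, keys, values, k)
```

In this sketch, `keys` would hold location embeddings for a database of places and `values` the image features observed there; the query location needs no image of its own.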
Technical University of Munich
CLIP excels at global image-text alignment but struggles with fine-grained visual understanding. In this talk, I present FLAIR—Fine-grained Language-informed Image Representations—which leverages long, detailed captions to learn localized image features. By conditioning attention pooling on diverse sub-captions, FLAIR generates text-specific image embeddings that enhance retrieval of fine-grained content. Our model outperforms existing methods on standard and newly proposed fine-grained retrieval benchmarks, and even enables strong zero-shot semantic segmentation—despite being trained on only 30M image-text pairs.
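Text-conditioned attention pooling can be illustrated with a single-head, projection-free sketch in which the text embedding acts as the attention query over image patch tokens. FLAIR's actual architecture is not specified in the abstract, so the absence of learned projections, the scaling factor, and the function name are assumptions.

```python
import math

def attention_pool(text_query, patch_tokens):
    """Pool patch tokens into one image embedding, weighted by how well
    each patch matches the text query (scaled dot-product attention)."""
    d = len(text_query)
    scores = [sum(q * p for q, p in zip(text_query, tok)) / math.sqrt(d)
              for tok in patch_tokens]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    attn = [e / z for e in exps]
    pooled = [sum(a * tok[i] for a, tok in zip(attn, patch_tokens))
              for i in range(d)]
    return pooled, attn
```

Because the attention weights depend on the query, the same image yields a different pooled embedding for each sub-caption, which is the sense in which the embedding is text-specific.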
Join the AI and ML enthusiasts who have already become members
The goal of the AI, Machine Learning, and Computer Vision Meetup network is to bring together a community of data scientists, machine learning engineers, and open source enthusiasts who want to share and expand their knowledge of AI and complementary technologies. If that’s you, we invite you to join the Meetup closest to your timezone.