Virtual

3 of 4

Americas

Conferences

Best of ICCV - November 21, 2025

This event has ended, but you can still catch up! Watch the on-demand recordings and register for our future events.

Nov 21, 2025

9 AM Pacific

Online. Register for the Zoom!

About this event

Welcome to the Best of ICCV series, your virtual pass to some of the groundbreaking research, insights, and innovations that defined this year’s conference. Talks are streamed live from the authors to you.

Schedule

Proactive Comorbidity Prediction in HIV: Towards Fair and Trustworthy Care

HIV is a chronic infection that weakens the immune system and exposes patients to a high burden of comorbidities. While antiretroviral therapy has improved life expectancy, comorbidities remain a major challenge, and traditional screening protocols often fail to capture subtle risk patterns early enough. To address this, we develop a novel method trained on lab tests and demographic data from 2,200 patients in SE London. The method integrates feature interaction modeling, attention mechanisms, residual fusion and label-specific attention heads, outperforming TabNet, MLPs and classical machine learning models. Our experiments show that incorporating demographic information improves predictive performance, though demographic recoverability analyses reveal that age and gender can still be inferred from lab data alone, raising fairness concerns. Finally, robustness checks confirm stable feature importance across cross-validation folds, reinforcing the trustworthiness of our approach.

Resources

Paper

About the Speaker

Dimitrios Kollias is an Associate Professor in Multimodal AI at Queen Mary University of London, specializing in machine and deep learning, trustworthy AI, computer vision, medical imaging and healthcare, behavior analysis, and HMI. He has published 80+ papers (h-index 39; 6100+ citations) in top venues (e.g., CVPR, ICCV, ECCV, AAAI, IJCV, ECAI) and is the inventor of a patent in behavior analysis (Huawei), and his research is widely adopted by academia and industry. He also serves as an AI consultant and advisor to global companies and has played leading roles in major international AI workshops and competitions.

GECO: Geometrically Consistent Embedding with Lightspeed Inference

Recent advances in feature learning have shown that self-supervised vision foundation models can capture semantic correspondences but often lack awareness of underlying 3D geometry. GECO addresses this gap by producing geometrically coherent features that semantically distinguish parts based on geometry (e.g., left/right eyes, front/back legs).

We propose a training framework based on optimal transport, enabling supervision beyond keypoints, even under occlusions and disocclusions. With a lightweight architecture, GECO runs at 30 fps, 98.2% faster than prior methods, while achieving state-of-the-art performance on PFPascal, APK, and CUB, improving PCK by 6.0%, 6.2%, and 4.1%, respectively. Finally, we show that PCK alone is insufficient to capture geometric quality and introduce new metrics and insights for more geometry-aware feature learning.
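For readers unfamiliar with the PCK metric mentioned above, here is a minimal sketch of how Percentage of Correct Keypoints is commonly computed: a predicted keypoint counts as correct when it lies within `alpha * max(height, width)` of its ground-truth location. The function name `pck`, the default `alpha=0.1`, the bounding-box normalization, and the sample coordinates are illustrative assumptions, not taken from the GECO paper, whose evaluation protocol may differ.

```python
import math


def pck(pred, gt, bbox_size, alpha=0.1):
    """Fraction of predicted keypoints within alpha * max(bbox_size) of ground truth.

    pred, gt: equal-length lists of (x, y) tuples.
    bbox_size: (height, width) of the object box used for normalization (assumed convention).
    """
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same number of keypoints")
    if not gt:
        raise ValueError("at least one keypoint is required")
    threshold = alpha * max(bbox_size)
    correct = sum(
        1
        for (px, py), (gx, gy) in zip(pred, gt)
        if math.hypot(px - gx, py - gy) <= threshold
    )
    return correct / len(gt)


# Made-up coordinates for illustration only.
pred = [(10.0, 10.0), (52.0, 48.0), (90.0, 20.0), (30.0, 70.0)]
gt = [(12.0, 11.0), (50.0, 50.0), (60.0, 20.0), (30.0, 95.0)]
score = pck(pred, gt, bbox_size=(100.0, 80.0), alpha=0.1)
print(score)  # two of the four points fall within the 10-pixel threshold
```

Because the metric only checks a distance threshold at annotated keypoints, it says nothing about whether features at other locations are geometrically consistent, which is in line with the abstract's point that PCK alone is insufficient.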

Resources

About the Speaker

Regine Hartwig is a PhD student in the Computer Vision Group at the Technical University of Munich (TUM), where she conducts research under the supervision of Prof. Daniel Cremers. Her work centers on computer vision, drawing from her background in electrical engineering, machine learning, and robotics.

Before starting her PhD in 2022, Regine was a Research and Development Assistant at TUM’s Klinikum Rechts der Isar, contributing to applied research at the intersection of engineering and healthcare. She previously completed both her Bachelor’s and Master’s degrees in Electrical and Electronics Engineering at TUM, with academic focus areas spanning robotics, deep learning, embedded systems, and cybersecurity.

Regine continues to pursue research that advances the foundations and real-world applications of modern computer vision.

DRaM-LHM: A Quaternion Framework for Iterative Camera Pose Estimation

We explore a quaternion adjugate matrix-based representation for rotational motion in the Perspective-n-Point (PnP) problem. Leveraging quadratic quaternion terms within a Determinant Ratio Matrix (DRaM) estimation framework, we extend its application to perspective scenarios, providing a robust and efficient initialization for iterative PnP pose estimation. Notably, by solving the orthographic projection least-squares problem, DRaM provides a reliable initialization that enhances the accuracy and stability of iterative PnP solvers. Experiments on synthetic and real data demonstrate its efficiency, accuracy, and robustness, particularly under high noise conditions. Furthermore, our nonminimal formulation ensures numerical stability, making it effective for real-world applications.
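As background for the "quadratic quaternion terms" mentioned above, the sketch below shows the textbook conversion from a unit quaternion to a 3x3 rotation matrix, in which every matrix entry is a quadratic expression in the quaternion components. This is standard material rather than the DRaM procedure itself; the function names `quat_to_rotmat` and `rotate` and the (w, x, y, z) component ordering are assumptions made for illustration.

```python
import math


def quat_to_rotmat(q):
    """Convert a quaternion (w, x, y, z) to a 3x3 rotation matrix.

    The quaternion is normalized first, so any nonzero input is accepted.
    """
    w, x, y, z = q
    norm = math.sqrt(w * w + x * x + y * y + z * z)
    if norm == 0.0:
        raise ValueError("zero quaternion does not represent a rotation")
    w, x, y, z = w / norm, x / norm, y / norm, z / norm
    # Each entry is a sum of products of two quaternion components.
    return [
        [w * w + x * x - y * y - z * z, 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), w * w - x * x + y * y - z * z, 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), w * w - x * x - y * y + z * z],
    ]


def rotate(R, v):
    """Apply a 3x3 matrix R to a 3-vector v."""
    return [sum(R[i][j] * v[j] for j in range(3)) for i in range(3)]


# A 90-degree rotation about the z-axis should map the x-axis onto the y-axis.
half_angle = math.pi / 4
q_z90 = (math.cos(half_angle), 0.0, 0.0, math.sin(half_angle))
R = quat_to_rotmat(q_z90)
v = rotate(R, [1.0, 0.0, 0.0])
print([round(c, 6) for c in v])
```

Since the rotation matrix is linear in the ten distinct products of quaternion components, a least-squares problem in the rotation can be posed over those quadratic terms, which is the general idea the abstract alludes to.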

Resources

Paper

About the Speaker

Chen Lin was a Research Fellow at the Simons Foundation, where she specialized in 3D computer vision and visual(-inertial) SLAM. Her research spans from classical multiview geometry to learning-based pose estimation and scene understanding. Her ICCV 2025 paper introduces a new framework for rotation and pose estimation built on advanced algebraic paradigms.

Chen is now a Senior Computer Vision Engineer at Midea, focusing on industrial robotics. She remains passionate about 3D vision and continues to contribute to advancing the field.

Toward Trustworthy Embodied Agents: From Individuals to Teams

Modern intelligent embodied agents, such as service robots and autonomous vehicles, interact frequently with humans in dynamic, uncertain environments. They may also collaborate with each other as a team through effective communication to enhance task success, safety, and efficiency. This brings several significant challenges. First, building reliable agents that safely navigate multi-agent scenarios requires scalable and generalizable prediction of surrounding agents’ behaviors and robust decision making under environmental uncertainty in out-of-distribution (OOD) scenarios. Second, effective cooperation between agents requires efficient communication and information fusion strategies and reliable task planning for complex long-horizon tasks. In this talk, I will introduce a series of our recent works that address these challenges to enable safe and trustworthy embodied agents, along with their application to autonomous driving and service robots. Specifically, I will first demonstrate principled uncertainty quantification techniques and how they enable generalizable prediction and planning in OOD scenarios. Then, I will talk about effective approaches to enable efficient multi-agent communication and cooperation in centralized and decentralized settings.

Resources

GitHub

About the Speaker

Dr. Jiachen Li is an Assistant Professor in the Department of Electrical and Computer Engineering (ECE) and a cooperating faculty member in the Department of Computer Science and Engineering (CSE) at the University of California, Riverside. He is the Director of the Trustworthy Autonomous Systems Laboratory and is affiliated with the Riverside Artificial Intelligence Research Institute (RAISE), the Center for Robotics and Intelligent Systems (CRIS), and the Center for Environmental Research and Technology (CE-CERT). Before joining UCR, he was a postdoctoral scholar at Stanford University and earned his Ph.D. from the University of California, Berkeley.