The Best of ICRA is a two-day virtual meetup series featuring researchers presenting their accepted papers from the 2026 International Conference on Robotics and Automation (ICRA).
Register for either day to get access to both days of the Best of ICRA.
Each session features a curated lineup of speakers sharing cutting-edge research across robotics, computer vision, and AI, straight from papers accepted at one of the field's top conferences.
Whether you're a researcher, engineer, or practitioner, you'll leave with a sharper view of where the field is heading.
Schedule
Contrastive Learning on 3D Point Clouds for Geometric Defect Detection
Reliable 3D defect detection in manufacturing is hard: the input is a point cloud, an unordered set that standard neural backbones cannot process directly; high-quality training data is scarce; and real scans are noisy and arrive in arbitrary orientations. We address these challenges in COSARAD, a contrastive learning framework that learns highly discriminative representations of object surface geometry under weak supervision.
When a test object arrives, we extract its features and compare them against a library of defect-free reference shapes for precise, interpretable defect localization, achieving state-of-the-art accuracy on industrial benchmarks such as Real3D-AD. In my talk, I'll cover the design choices behind the system, why contrastive representation learning is the right fit for sparse 3D data, and open problems in scaling inspection to production.
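To make the reference-comparison step concrete, here is a minimal sketch of one common way to score points against a defect-free library: each test feature gets the distance to its nearest reference feature, and large distances are flagged. This is an illustration under assumptions, not COSARAD's actual pipeline; the function names, the toy 2-D features, and the threshold are all hypothetical.

```python
import math

def nearest_distance(feature, reference_bank):
    """Euclidean distance from one feature vector to its closest reference vector."""
    return min(math.dist(feature, ref) for ref in reference_bank)

def score_points(test_features, reference_bank):
    """Per-point anomaly score: distance to the nearest defect-free reference feature."""
    return [nearest_distance(f, reference_bank) for f in test_features]

def flag_defects(scores, threshold):
    """Indices of points whose score exceeds the (hypothetical) threshold."""
    return [i for i, s in enumerate(scores) if s > threshold]

# Toy 2-D features standing in for learned per-point descriptors (assumption).
reference_bank = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
test_features = [(0.1, 0.0), (0.9, 0.1), (3.0, 3.0)]

scores = score_points(test_features, reference_bank)
defect_indices = flag_defects(scores, threshold=0.5)
```

Because every score traces back to a specific reference feature, a flagged point can be explained by showing which part of the reference shape it was closest to, which is one reason this style of comparison is considered interpretable.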
A Semantic and Occlusion-Aware Gaussian Mixture Probability Hypothesis Density Filter
Reliable and resilient multi-target tracking is foundational for safe autonomous driving, yet most perception pipelines struggle with sensor noise, heavy clutter, and severe environmental occlusions. To address these limitations, this talk presents a novel Semantic-Occlusion Aware (S-OA) Gaussian Mixture Probability Hypothesis Density (GM-PHD) filter.
By combining geometric occlusion reasoning with deep learning-derived environmental semantics, the proposed framework adaptively initializes target tracking in regions where new targets are likely to appear. Evaluations demonstrate that this context-aware tracking system minimizes track initiation latency and preserves high tracking precision even under intense clutter.
Ultimately, this work demonstrates how embedding spatial and semantic structure into filtering yields a significantly more robust and resilient perception stack for autonomous navigation.
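As a rough illustration of adaptive track initialization, the sketch below builds the birth components of a Gaussian-mixture intensity from grid cells, scaling each weight by a semantic prior and boosting cells that border an occluded region. It is a simplified sketch under assumptions, not the S-OA GM-PHD filter itself; the cell format, the `semantic_prior` score, the 2x occlusion boost, and all default values are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class GaussianComponent:
    weight: float     # expected number of targets this component represents
    mean: tuple       # (x, y) position in metres
    variance: float   # isotropic position variance

def birth_components(cells, base_weight=0.1, variance=4.0):
    """Build birth components from grid cells.

    Each cell is (x, y, semantic_prior, near_occlusion_edge), where
    semantic_prior in [0, 1] is a hypothetical score for how likely a new
    target is to appear there, and near_occlusion_edge marks cells that
    border an occluded region.
    """
    components = []
    for x, y, semantic_prior, near_occlusion_edge in cells:
        weight = base_weight * semantic_prior
        if near_occlusion_edge:
            weight *= 2.0  # assumed boost: targets often emerge from occlusions
        if weight > 0.0:
            components.append(GaussianComponent(weight, (x, y), variance))
    return components

cells = [
    (0.0, 0.0, 0.9, False),  # open road
    (5.0, 0.0, 0.9, True),   # road next to an occluding vehicle
    (0.0, 5.0, 0.0, False),  # building interior: no births
]
births = birth_components(cells)
expected_new_targets = sum(c.weight for c in births)
```

In a PHD filter the weights of the mixture sum to the expected number of targets, so `expected_new_targets` is the expected number of newborn targets per step under these assumed priors. Concentrating that mass where targets plausibly appear is what lets a filter confirm new tracks sooner without raising the birth rate everywhere.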
An Annotation-to-Detection Framework for Autonomous and Robust Vine Trunk Localization in the Field by Mobile Agricultural Robots
Autonomous robots struggle to detect objects in unstructured fields, requiring in-domain tuning with laborious manual data collection. In this work, we introduce a comprehensive annotation-to-detection framework designed to train a robust multi-modal detector using limited and partially labeled training data.
Our method combines cross-modal annotation transfer, early sensor fusion, and a multi-stage detection architecture to train and improve a multi-modal detector. Validated on vineyard trunk detection and paired with a custom LOAM algorithm, it localized over 70% of trees in one pass with under 0.37 m mean error.
Our system demonstrated that robust detection is achievable even with minimal initial annotations and human intervention.