Recent advances in feature learning have shown that self-supervised vision foundation models can capture semantic correspondences but often lack awareness of underlying 3D geometry. GECO addresses this gap by producing geometrically coherent features that semantically distinguish parts based on geometry (e.g., left/right eyes, front/back legs).
We propose a training framework based on optimal transport, enabling supervision beyond keypoints, even under occlusions and disocclusions. With a lightweight architecture, GECO runs at 30 fps, 98.2% faster than prior methods, while achieving state-of-the-art performance on PFPascal, APK, and CUB, improving PCK by 6.0%, 6.2%, and 4.1%, respectively. Finally, we show that PCK alone is insufficient to capture geometric quality and introduce new metrics and insights for more geometry-aware feature learning.
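For readers unfamiliar with the PCK (Percentage of Correct Keypoints) metric referenced above, here is a minimal sketch of how it is commonly computed. This is not taken from the GECO code. The function name `pck`, the `bbox_size` parameter, and the threshold convention `alpha * max(height, width)` are assumptions for illustration, and benchmarks differ on whether they normalize by image size or bounding-box size.

```python
import numpy as np

def pck(pred, gt, bbox_size, alpha=0.1):
    """Fraction of predicted keypoints that lie within
    alpha * max(bbox_size) pixels of their ground-truth location.

    pred, gt:  (N, 2) arrays of (x, y) coordinates (hypothetical layout).
    bbox_size: (height, width) used to normalize the threshold (assumed convention).
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    threshold = alpha * max(bbox_size)
    # Euclidean distance between each predicted and ground-truth keypoint.
    dists = np.linalg.norm(pred - gt, axis=1)
    return float(np.mean(dists <= threshold))

# Toy example: three keypoints, the second one is off by 15 pixels.
gt = [[10, 10], [50, 40], [80, 90]]
pred = [[12, 11], [50, 55], [79, 92]]
score = pck(pred, gt, bbox_size=(100, 100), alpha=0.1)
```

Because the metric only checks the distance to the annotated keypoint, it does not record what kind of error occurred, for example whether a miss landed on a mirrored part such as the opposite eye. That is one way to read the abstract's point that PCK alone may not capture geometric quality.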
Resources
About the Speaker
Regine Hartwig is a PhD student in the Computer Vision Group at the Technical University of Munich (TUM), where she conducts research under the supervision of Prof. Daniel Cremers. Her work centers on computer vision, drawing from her background in electrical engineering, machine learning, and robotics.
Before starting her PhD in 2022, Regine was a Research and Development Assistant at TUM’s Klinikum Rechts der Isar, contributing to applied research at the intersection of engineering and healthcare. She previously completed both her Bachelor’s and Master’s degrees in Electrical and Electronics Engineering at TUM, with academic focus areas spanning robotics, deep learning, embedded systems, and cybersecurity.
Regine continues to pursue research that advances the foundations and real-world applications of modern computer vision.