July 10, 2025 | 9 AM Pacific
Online. Register for the Zoom!
Max Planck Institute for Intelligent Systems
Reconstructing 3D face models from a single image is an inherently ill-posed problem, which becomes even more challenging in the presence of occlusions, where multiple reconstructions can be equally valid. Despite the ubiquity of the problem, very few methods address its multi-hypothesis nature. In this paper, we introduce OFER, a novel approach for single-image 3D face reconstruction that can generate plausible, diverse, and expressive 3D faces by training two diffusion models to generate the shape and expression coefficients of a parametric face model, conditioned on the input image. Maintaining consistency across diverse expressions requires selecting the best-matching shape. To achieve this, we propose a novel ranking mechanism that sorts the outputs of the shape diffusion network based on predicted shape accuracy scores.
University of Washington
Video anomaly detection is crucial for ensuring safety and security, yet existing benchmarks overlook the unique context of smart home environments. We introduce SmartHome-Bench, a dataset of 1,203 smart home videos annotated according to a novel taxonomy of seven anomaly categories, such as Wildlife, Senior Care, and Baby Monitoring. We evaluate state-of-the-art closed- and open-source multimodal LLMs with various prompting techniques, revealing significant performance gaps. To address these limitations, we propose the Taxonomy-Driven Reflective LLM Chain (TRLC), which boosts detection accuracy by 11.62%.
University of Adelaide
What if you could tell an AI model exactly “where to focus” and “where to ignore” on a medical image? Our work enables radiologists to interactively guide AI models at test time for more transparent and trustworthy decision-making. This paper introduces the novel Concept-based Similarity Reasoning network (CSR), which offers (i) patch-level prototypes with intrinsic concept interpretation, and (ii) spatial interactivity. First, the proposed CSR provides localized explanations by grounding the prototypes of each concept on image regions. Second, our model introduces novel spatial-level interaction, allowing doctors to engage directly with specific image areas, making it an intuitive and transparent tool for medical imaging.
Leibniz Universität Hannover
The advent of 3D Gaussian Splatting has revitalized interest in multi-view image data. Applying these techniques to fields such as anomaly detection has been a logical next step. However, some limitations of these models may warrant a return to previously applied probabilistic techniques. This talk explores new approaches, difficulties, and possibilities in this field.
Join the AI and ML enthusiasts who have already become members
The goal of the AI, Machine Learning, and Computer Vision Meetup network is to bring together a community of data scientists, machine learning engineers, and open source enthusiasts who want to share and expand their knowledge of AI and complementary technologies. If that’s you, we invite you to join the Meetup closest to your timezone.