AI, ML and Computer Vision Meetup – March 5, 2026
This event has ended, but you can still catch up! Watch the on-demand recordings and register for our future events.
Mar 05, 2026
9 - 11 AM Pacific
Online. Register for the Zoom!
About this event
Join our virtual meetup to hear talks from experts on cutting-edge topics across AI, ML, and computer vision.
Schedule
MOSPA: Human Motion Generation Driven by Spatial Audio
Enabling virtual humans to dynamically and realistically respond to diverse auditory stimuli remains a key challenge in character animation. This problem demands the tight integration of perceptual modeling and motion synthesis, yet despite its importance, it remains largely unexplored.
Most prior work has focused on mapping modalities such as speech, audio, or music to generate human motion. However, these approaches typically overlook the role of spatial features encoded in spatial audio signals, and how those features influence human movement.
To bridge this gap and enable high-quality modeling of human motion in response to spatial audio, we will introduce the first comprehensive Spatial Audio-Driven Human Motion (SAM) dataset. SAM contains diverse, high-quality spatial audio paired with corresponding human motion data.
For benchmarking, we will develop a simple yet effective diffusion-based generative framework for human motion generation driven by spatial audio, termed MOSPA. MOSPA faithfully captures the relationship between body motion and spatial audio through an effective multimodal fusion mechanism. Once trained, the model can generate diverse and realistic human motions conditioned on varying spatial audio inputs.
Finally, we will conduct a thorough investigation of the proposed dataset and perform extensive benchmarking experiments. Our approach achieves state-of-the-art performance on this task, demonstrating the effectiveness of both the dataset and the proposed framework.
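The core idea of conditioning a diffusion model on spatial audio can be sketched in a few lines. The following toy example is illustrative only, not the actual MOSPA architecture: the feature dimensions, the linear noise schedule, and the fusion-by-concatenation scheme are all assumptions made for the sake of a runnable sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

T = 1000                               # number of diffusion steps
betas = np.linspace(1e-4, 0.02, T)     # linear noise schedule (assumed)
alpha_bars = np.cumprod(1.0 - betas)   # cumulative signal retention

def q_sample(x0, t, eps):
    """Forward process: noise a clean motion sequence x0 to step t."""
    return (np.sqrt(alpha_bars[t]) * x0
            + np.sqrt(1.0 - alpha_bars[t]) * eps)

def fuse(motion_t, audio_feat):
    """Naive multimodal fusion: concatenate each motion frame with a
    broadcast spatial-audio feature (a stand-in for learned fusion)."""
    frames = motion_t.shape[0]
    return np.concatenate(
        [motion_t, np.tile(audio_feat, (frames, 1))], axis=1)

# 60 frames of 69-D pose parameters, 32-D spatial audio embedding
# (dimensions are made up for illustration)
x0 = rng.standard_normal((60, 69))
audio = rng.standard_normal(32)
eps = rng.standard_normal(x0.shape)

x_t = q_sample(x0, t=500, eps=eps)      # noised motion at step 500
denoiser_input = fuse(x_t, audio)       # what a denoiser would consume
print(denoiser_input.shape)             # (60, 101)
```

In a real model, the denoiser network would take `denoiser_input` and predict the noise `eps`, so that sampling can iteratively recover motion consistent with the audio condition.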
About the Speaker
Zhiyang (Frank) Dou is a Ph.D. student at MIT CSAIL, advised by Prof. Wojciech Matusik. He works with the Computational Design and Fabrication Group and the Computer Graphics Group.
Securing the Autonomous Future: Navigating the Intersection of Agentic AI, Connected Devices, and Cyber Resilience
With billions of connected devices in our infrastructure and AI emerging in the form of autonomous agents, we face a very real question: How can we create intelligent systems that are both secure and trusted? This talk will explore the intersection of agentic AI and IoT and demonstrate how the same AI systems can provide robust defense mechanisms. At its core, however, this is a challenge about trusting people with technology, ensuring their safety, and providing accountability. It demands a new way of thinking: one in which security is built in, autonomous action is subject to oversight, and innovation ultimately serves human well-being.
About the Speaker
Samaresh Kumar Singh is an engineering principal at HP Inc. with more than 21 years of experience designing and implementing large-scale distributed systems, cloud-native platforms, and edge AI/ML systems. His expertise includes agentic AI systems, GenAI/LLMs, edge AI, federated and privacy-preserving learning, and secure hybrid cloud/edge computing.
Transforming Business with Agentic AI
Agentic AI is reshaping business operations by employing autonomous systems that learn, adapt, and optimize processes independently of human input. This session examines the essential differences between traditional AI agents and Agentic AI, emphasizing their significance for project professionals overseeing digital transformation initiatives. Real-world examples from eCommerce, insurance, and healthcare illustrate how autonomous AI achieves measurable outcomes across industries. The session addresses practical orchestration patterns in which specialized AI agents collaborate to resolve complex business challenges and enhance operational efficiency. Attendees will receive a practical framework for identifying high-impact use cases, developing infrastructure, establishing governance, and scaling Agentic AI within their organizations.
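The orchestration pattern mentioned above, in which specialized agents collaborate under a coordinator, can be illustrated with a minimal sketch. Everything here is hypothetical: the "agents" are plain functions and the keyword-based router is a deliberately simple stand-in for a real routing policy, not any specific framework's API.

```python
# Specialized "agents" (toy stand-ins for real agentic components)
def claims_agent(task):
    return f"claims: triaged '{task}'"

def fraud_agent(task):
    return f"fraud: scored '{task}'"

def support_agent(task):
    return f"support: drafted reply for '{task}'"

# Coordinator routing table: keyword -> specialist agent
ROUTES = {
    "claim": claims_agent,
    "fraud": fraud_agent,
}

def orchestrate(tasks):
    """Route each task to the first matching specialist; fall back to a
    generalist agent when no specialist applies."""
    results = []
    for task in tasks:
        agent = next(
            (fn for key, fn in ROUTES.items() if key in task.lower()),
            support_agent,
        )
        results.append(agent(task))
    return results

print(orchestrate([
    "New claim #123",
    "Possible fraud on acct 9",
    "Where is my card?",
]))
```

Real agentic systems replace the functions with LLM-backed agents and the routing table with a learned or policy-driven coordinator, but the division of labor is the same.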
About the Speaker
Joyjit Roy is a senior technology and program management leader with over 21 years of experience delivering enterprise digital transformation, cloud modernization, and applied AI programs across insurance, financial services, and global eCommerce. He is a keynote speaker at international conferences and regularly presents on AI, digital engineering, and enterprise transformation at industry and academic forums.
Plugins as Products: Bringing Visual AI Research into Real-World Workflows with FiftyOne
Visual AI research often introduces new datasets, models, and analysis methods, but integrating these advances into everyday workflows can be challenging. FiftyOne is a data-centric platform designed to help teams explore, evaluate, and improve visual AI, and its plugin ecosystem is how the platform scales beyond the core. In this talk, we explore the FiftyOne plugin ecosystem from both perspectives: how users apply plugins to accelerate data-centric workflows, and how researchers and engineers can package their work as plugins to make it easier to share, reproduce, and build upon. Through practical examples, we show how plugins turn research artifacts into reusable components that integrate naturally into real-world visual AI workflows.
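Concretely, a FiftyOne plugin is typically a directory containing a `fiftyone.yml` manifest plus the Python code implementing its operators. A minimal manifest might look like the following; the field names follow the documented FiftyOne plugin format, but the plugin and operator names here are made up for illustration:

```yaml
# fiftyone.yml — minimal plugin manifest (illustrative example)
name: "@example/hello-plugin"
description: "A toy plugin that registers one operator"
operators:
  - hello_world
```

Each operator listed in the manifest is implemented in the plugin's Python code as a subclass of FiftyOne's operator base class, which is what lets the platform discover and expose it in the App.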
About the Speaker
Adonai Vera is a Machine Learning Engineer and DevRel at Voxel51, with over 7 years of experience building computer vision and machine learning models using TensorFlow, Docker, and OpenCV.
VAND 4.0 - The Visual Anomaly and Novelty Detection Challenge at CVPR26
Deep learning–based anomaly detection has become a cornerstone of modern visual quality inspection and continues to attract increasing interest from the research community. Yet, achieving robust and reliable anomaly detection remains challenging, especially under real-world conditions such as varying illumination or when only a small number of reference images are available.
The latest edition of the Visual Anomaly and Novelty Detection Challenge - VAND 4.0 - directly targets these pressing issues. It introduces new tasks and evaluation settings designed to spark innovation and drive progress on these difficult but practically relevant problems.
In this talk, I will present the key ideas behind this year’s challenge design and outline its central objectives, offering insights into how VAND 4.0 aims to push the boundaries of anomaly detection research.
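The few-reference-image setting mentioned above can be illustrated with a generic memory-bank scheme: score a test sample by its distance to the nearest feature from a small bank of normal references (in the spirit of methods like PatchCore). This is a toy sketch with random features, not the VAND 4.0 evaluation protocol itself.

```python
import numpy as np

rng = np.random.default_rng(42)

# A handful of feature vectors extracted from normal reference images
# (8 references, 16-D features — dimensions are made up)
normal_bank = rng.normal(0.0, 1.0, size=(8, 16))

def anomaly_score(feat, bank):
    """Score = Euclidean distance to the nearest normal reference."""
    dists = np.linalg.norm(bank - feat, axis=1)
    return float(dists.min())

normal_query = rng.normal(0.0, 1.0, size=16)     # in-distribution
anomalous_query = rng.normal(5.0, 1.0, size=16)  # shifted distribution

s_normal = anomaly_score(normal_query, normal_bank)
s_anom = anomaly_score(anomalous_query, normal_bank)
assert s_anom > s_normal  # anomalies lie farther from the bank
```

With only a few reference images the bank is tiny, which is exactly what makes thresholding and robustness hard in the scenarios the challenge targets.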
About the Speaker
Lars Heckler-Kram is a PhD student in image processing at Technical University of Munich.