Boston AI, ML and Computer Vision Meetup - February 18, 2026
Feb 18, 2026
5:30 - 8:30 PM EST
Microsoft Research Lab – New England (NERD) at MIT, Deborah Sampson Conference Room, One Memorial Drive, Cambridge, MA 02142
About this event
Join the Meetup to hear talks from experts on cutting-edge topics across AI, ML, and computer vision.
Schedule
SNAP: Towards Segmenting Anything in Any Point Cloud
Segmenting objects in 3D point clouds is a core problem in 3D scene understanding and scalable data annotation. In this talk, I will present SNAP: Segmenting Anything in Any Point Cloud, a unified framework for interactive point cloud segmentation that supports both point-based and text-based prompts across indoor, outdoor, and aerial domains. SNAP is trained jointly on multiple heterogeneous datasets and achieves strong cross-domain generalization through domain-adaptive normalization. The model enables both spatially prompted instance segmentation and text-prompted panoptic and open-vocabulary segmentation directly on point clouds. Extensive experiments demonstrate that SNAP matches or outperforms domain-specific methods on a wide range of zero-shot benchmarks.
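The abstract credits SNAP's cross-domain generalization to domain-adaptive normalization. As a rough illustration of that idea (not SNAP's actual implementation; the class and parameter names below are hypothetical), one common form keeps a shared normalization but learns a separate scale and shift per domain:

```python
import torch
import torch.nn as nn

class DomainAdaptiveNorm(nn.Module):
    """Sketch: layer norm with per-domain learnable scale/shift.

    Assumes "domain-adaptive normalization" means domain-specific affine
    parameters on top of a shared, affine-free normalization. This is an
    illustration of the general technique, not SNAP's published design.
    """

    def __init__(self, dim, domains=("indoor", "outdoor", "aerial")):
        super().__init__()
        # Normalize statistics the same way for every domain...
        self.norm = nn.LayerNorm(dim, elementwise_affine=False)
        # ...but let each domain learn its own affine transform.
        self.scale = nn.ParameterDict(
            {d: nn.Parameter(torch.ones(dim)) for d in domains})
        self.shift = nn.ParameterDict(
            {d: nn.Parameter(torch.zeros(dim)) for d in domains})

    def forward(self, x, domain):
        return self.norm(x) * self.scale[domain] + self.shift[domain]

# Per-point features for a batch of outdoor scans (shapes are illustrative).
feats = torch.randn(2, 1024, 256)
dan = DomainAdaptiveNorm(256)
out = dan(feats, "outdoor")
```

The shared trunk then sees features on a comparable scale regardless of whether the input came from an indoor, outdoor, or aerial sensor.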
Culturally Adaptive AI
AI can now generate videos, images, speech, and text that are almost indistinguishable from human-created content. As generative AI systems become more sophisticated, we increasingly question whether what appears in our feeds is real. There is a need, now more than ever, for models that help humans distinguish between real and AI-generated content. How can we shape the next generation of AI models to be more explainable, safe, and creative? How can we make these models teach humans about different cultures, strengthening human-AI collaboration? This talk highlights emerging techniques, and the future of AI, that will improve trust in generative AI systems by integrating insights from multimodality, reasoning, and factuality. Tomorrow's AI won't just process data and generate content; we imagine it will amplify our creativity, extend our compassion, and help us rediscover what makes us fundamentally human.
Data Foundations for Vision-Language-Action Models
Model architectures get the papers, but data decides whether robots actually work. This talk introduces vision-language-action (VLA) models from a data-centric perspective: what makes robot datasets fundamentally different from image classification or video understanding, how the field is organizing its data (Open X-Embodiment, LeRobot, RLDS), and what evaluation benchmarks actually measure. We'll examine the unique challenges of robot data, such as temporal structure, proprioceptive signals, and heterogeneity of embodiment, and discuss why addressing them matters more than the next architectural innovation.
Better Together: Leveraging Unpaired Multimodal Data for Stronger Unimodal Models
Traditional multimodal learners find unified representations for tasks like visual question answering, but rely heavily on paired datasets. However, an overlooked yet potentially powerful question is: can one leverage auxiliary unpaired multimodal data to directly enhance representation learning in a target modality? We introduce UML: Unpaired Multimodal Learner, a modality-agnostic training paradigm in which a single model alternately processes inputs from different modalities while sharing parameters across them. This design exploits the assumption that different modalities are projections of a shared underlying reality, allowing the model to benefit from cross-modal structure without requiring explicit pairs. Theoretically, under linear data-generating assumptions, we show that unpaired auxiliary data can yield representations strictly more informative about the data-generating process than unimodal training. Empirically, we show that using unpaired data from auxiliary modalities (such as text, audio, or images) consistently improves downstream performance across diverse unimodal targets such as images and audio. Our project page: https://unpaired-multimodal.github.io/
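The core training loop the abstract describes, alternating unpaired batches from different modalities through shared parameters, can be sketched roughly as follows. This is a minimal illustration of the paradigm, not the UML implementation; the module names, dimensions, and two-modality setup are all hypothetical:

```python
import torch
import torch.nn as nn

class SharedTrunkModel(nn.Module):
    """Sketch: modality-specific input projections into one shared trunk.

    Illustrates the alternating-modality idea from the abstract; the
    architecture and names here are hypothetical, not UML's actual design.
    """

    def __init__(self, input_dims, hidden=128, n_classes=10):
        super().__init__()
        # One lightweight projection per modality maps each input space
        # into a common hidden space.
        self.proj = nn.ModuleDict(
            {m: nn.Linear(d, hidden) for m, d in input_dims.items()})
        # The trunk's parameters are shared across all modalities.
        self.trunk = nn.Sequential(
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_classes))

    def forward(self, x, modality):
        return self.trunk(self.proj[modality](x))

# Unpaired setup: image and audio batches never need to correspond.
dims = {"image": 512, "audio": 64}
model = SharedTrunkModel(dims)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for step in range(4):
    # Alternate which modality supplies the batch at each step.
    modality = "image" if step % 2 == 0 else "audio"
    x = torch.randn(8, dims[modality])          # stand-in unpaired batch
    y = torch.randint(0, 10, (8,))              # stand-in labels
    loss = loss_fn(model(x, modality), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Because the trunk is updated by gradients from both modalities, structure learned from the auxiliary modality can shape the representation used for the target one, without any paired examples.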