We just wrapped up the November ’24 AI, Machine Learning and Computer Vision Meetup, and if you missed it or want to revisit it, here’s a recap! In this blog post you’ll find the playback recordings, highlights from the presentations and Q&A, as well as the upcoming Meetup schedule so that you can join us at a future event.

Human-in-the-loop: Practical Lessons for Building Comprehensive AI Systems

AI systems often struggle with data limitations, data distribution shift over time, and a poor user experience. Human-in-the-loop design offers a solution by placing users at the center of AI systems and leveraging human feedback for continuous improvement. In this talk, we’ll dive deeply into a recent project at Merantix Momentum: an interactive tool for automatic rodent behavior analysis in videos at a large scale. We’ll discuss the machine learning components, including pose estimation, behavior classification, and active learning, and talk about the technical challenges and the impact of the project.

Speaker: Adrian Loy has an MSc in IT Systems Engineering and has spent the last 5 years at Merantix Momentum planning and executing computer vision projects for a variety of clients. He currently leads the Machine Learning Engineering team at Momentum.

Q&A

Are you using imagery from Maxar, Sentinel, Airbus, and/or another source?
What is meant by image binning?
Did you explore semantic planning to address the Dijkstra issue?

Resource Links

The Regularization Cookbook
Vincent’s blogs on Medium

Deploying ML models on Edge Devices using Qualcomm AI Hub

In this talk, we address the common challenges faced by developers migrating AI workloads from the cloud to edge devices. Qualcomm aims to democratize AI at the edge, easing the transition by supporting familiar frameworks and data types. This is where Qualcomm AI Hub comes in. Developers can follow along, gaining the knowledge and tools to efficiently deploy optimized models on real devices using Qualcomm AI Hub. We’ll walk through how to get started with Qualcomm AI Hub, go through examples of how to optimize models and bundle the downloadable target asset into your application, and talk through iterating on your model to meet performance requirements for on-device deployment.

Speaker: Bhushan Sonawane has optimized and deployed thousands of AI models on-device across the iOS and Android ecosystems. Currently, he is building AI Hub at Qualcomm to make the on-device journey on Android and Snapdragon platforms as seamless as possible.

Q&A

If a 4th channel is lidar, it’s more sparse than the RGB resolution usually. Have you seen this in use?
Are there data or modeling considerations you can mention about handling this "sparse 4th channel" case?
When building the model, while feeding in more information, is there a general concern for overfitting the model and straying away from generalization? With imaging, I used to believe that more data is better to train the model, so in this case, would more information refer to more data in this context, please?
Would it be correct to see the X in RGB-X model as different channels in a CNN model?

Resource Links

Tutorial: Monocular Depth Estimation with FiftyOne
Paper: Sapiens: Foundation for Human Vision Models

Curating Excellence: Strategies for Optimizing Visual AI Datasets

In this talk, Harpreet will discuss common challenges plaguing visual AI datasets and their impact on model performance, and share tips and tricks for curating datasets to make the most of any compute budget or network architecture.

Speaker: Harpreet Sahota is a hacker-in-residence and machine learning engineer with a passion for deep learning and generative AI. He’s got a deep interest in RAG, Agents, and Multimodal AI.

Resource Links

Demo Repo on GitHub
Slides from the talk

Join the AI, Machine Learning and Computer Vision Meetup!

The goal of the Meetups is to bring together communities of data scientists, machine learning engineers, and open source enthusiasts who want to share and expand their knowledge of AI and complementary technologies. Join whichever of the 12 Meetup locations is closest to your time zone.

What’s Next?

Missed ECCV 2024 in Milan? We’ve lined up some top-notch speakers and presentations from the conference as part of our ECCV Redux series. Check them out below:

𝗗𝗮𝘆 𝟭: 𝗡𝗼𝘃𝗲𝗺𝗯𝗲𝗿 𝟭𝟵

Fast and Photo-Realistic Novel View Synthesis from Sparse Images by Avinash Paliwal from Texas A&M University
Robust Calibration of Large Vision-Language Adapters by Balamurali Murugesan from École de technologie supérieure
Tree-of-Life Meets AI by Mridul Khurana from Virginia Tech & NSF

𝗗𝗮𝘆 𝟯: 𝗡𝗼𝘃𝗲𝗺𝗯𝗲𝗿 𝟮𝟭

Closing the Gap Between Satellite and Street-View Imagery Using Generative Models by Ningli Xu from The Ohio State University
High-Efficiency 3D Scene Compression Using Self-Organizing Gaussians by Wieland Morgenstern from Fraunhofer Heinrich Hertz Institute HHI
Skeleton Recall Loss for Connectivity Conserving and Resource Efficient Segmentation of Thin Tubular Structures by Maximilian R. and Yannick Kirchhoff from Medical Image Computing (MIC) @DKFZ German Cancer Research Center

𝗗𝗮𝘆 𝟰: 𝗡𝗼𝘃𝗲𝗺𝗯𝗲𝗿 𝟮𝟮

Zero-shot Video Anomaly Detection: Leveraging Large Language Models for Rule-Based Reasoning by Yuchen Yang from Johns Hopkins Whiting School of Engineering
Open-Vocabulary 3D Semantic Segmentation with Text-to-Image Diffusion Models by Xiaoyu Zhu from Carnegie Mellon University

Dive into the groundbreaking research, all from the comfort of your own space. Register for the Zoom here. You can find a complete schedule of upcoming Meetups on the Voxel51 Events page.

Get Involved!

There are a lot of ways to get involved in the Computer Vision Meetups. Reach out if you identify with any of these:

You’d like to speak at an upcoming Meetup
You have a physical meeting space in one of the Meetup locations and would like to make it available for a Meetup
You’d like to co-organize a Meetup
You’d like to co-sponsor a Meetup

Reach out to Meetup co-organizer Jimmy Guerrero on Meetup.com or ping me over LinkedIn to discuss how to get you plugged in.

These Meetups are sponsored by Voxel51, the company behind the open source FiftyOne computer vision toolset. FiftyOne enables data science teams to improve the performance of their computer vision models by helping them curate high quality datasets, evaluate models, find mistakes, visualize embeddings, and get to production faster. It’s easy to get started in just a few minutes.

Talk to a computer vision expert
