AI, Machine Learning and Computer Vision Meetup
Dec 12, 2024 at 10 AM Pacific
Register for the Zoom
How We Built CoTracker3: Simpler and Better Point Tracking by Pseudo-Labeling Real Videos
Nikita Karaev
Meta AI and Oxford
CoTracker3 is a state-of-the-art point tracking model that introduces significant improvements in tracking points through video sequences. Its key innovations include:
- Semi-supervised training on real videos, reducing reliance on synthetic data
- Pseudo-labels generated by existing tracking models acting as teachers (sketched conceptually below)
- A simplified architecture compared to previous trackers
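To make the pseudo-labeling idea concrete, here is a toy PyTorch loop, not the actual CoTracker3 training code: a frozen stand-in "teacher" tracker predicts tracks on an unlabeled video, and those predictions supervise a student tracker. The `ToyTracker` module, shapes, and loss are all simplified placeholders.

```python
import torch
import torch.nn as nn

class ToyTracker(nn.Module):
    """Stand-in for a point tracker: maps query points to per-frame track positions."""
    def __init__(self):
        super().__init__()
        self.head = nn.Linear(2, 2)

    def forward(self, video, queries):
        # video: (T, C, H, W), queries: (N, 2) -> tracks: (T, N, 2)
        num_frames = video.shape[0]
        return self.head(queries).unsqueeze(0).expand(num_frames, -1, -1)

teacher = ToyTracker().eval()   # stands in for a pretrained teacher tracker
student = ToyTracker()          # the model being trained on pseudo-labels
optimizer = torch.optim.AdamW(student.parameters(), lr=1e-4)

# One unlabeled "real" video (random pixels here) and a handful of query points
video = torch.rand(8, 3, 64, 64)
queries = torch.rand(16, 2)

with torch.no_grad():
    pseudo_tracks = teacher(video, queries)   # teacher predictions become the labels

pred_tracks = student(video, queries)
loss = torch.nn.functional.l1_loss(pred_tracks, pseudo_tracks)  # simplified loss
optimizer.zero_grad()
loss.backward()
optimizer.step()
```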
About the Speaker
Nikita Karaev is currently doing a PhD at Meta AI and Oxford, where he's working on dynamic reconstruction and motion estimation (CoTracker) with Andrea Vedaldi and Christian Rupprecht. Before that, he did his master's at École Polytechnique (Paris) and his undergrad in cold Siberia (Novosibirsk). He was also an early employee at two startups that were acquired by Snapchat and Farfetch.
Hands-On with Meta AI's CoTracker3: Parsing and Visualizing Point Tracking Output
Harpreet Sahota
Voxel51
In this presentation, Harpreet Sahota explores CoTracker3, a state-of-the-art point tracking model that effectively leverages real-world videos during training. He dives into the practical aspects of running inference with CoTracker3 and parsing its output into FiftyOne, a powerful open-source tool for dataset curation, analysis, and visualization. Through a hands-on demonstration, Harpreet shows how to prepare a video for inference, run the model, examine its output, and parse the model’s output into FiftyOne’s keypoint format for seamless integration and visualization within the FiftyOne app.
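As a rough preview of that workflow, the sketch below runs CoTracker3 on a local video and loads its tracks into FiftyOne as per-frame keypoints. It assumes a local `video.mp4`, the `cotracker3_offline` torch.hub entry point published in the co-tracker repository, and FiftyOne's keypoint format (normalized [0, 1] coordinates, 1-indexed frames); exact output shapes may differ by release.

```python
import torch
import imageio.v3 as iio
import fiftyone as fo

device = "cuda" if torch.cuda.is_available() else "cpu"

# Read all frames of a local video: (T, H, W, C) uint8
frames = iio.imread("video.mp4", plugin="FFMPEG")
video = torch.tensor(frames).permute(0, 3, 1, 2)[None].float().to(device)  # (1, T, C, H, W)

# Load CoTracker3 (offline mode) from torch.hub and track a grid of points
cotracker = torch.hub.load("facebookresearch/co-tracker", "cotracker3_offline").to(device)
pred_tracks, pred_visibility = cotracker(video, grid_size=10)  # (1, T, N, 2), (1, T, N)

num_frames, height, width = frames.shape[0], frames.shape[1], frames.shape[2]
tracks = pred_tracks[0].cpu().numpy()                              # (T, N, 2) pixel coords
vis = pred_visibility[0].cpu().numpy().reshape(num_frames, -1)     # (T, N)

# Convert pixel coordinates to FiftyOne's normalized keypoint format, per frame
sample = fo.Sample(filepath="video.mp4")
for t in range(num_frames):
    keypoints = [
        fo.Keypoint(points=[(float(x) / width, float(y) / height)], label=f"point_{n}")
        for n, (x, y) in enumerate(tracks[t])
        if vis[t, n] > 0.5  # keep only points the model marks as visible
    ]
    sample.frames[t + 1]["cotracker3"] = fo.Keypoints(keypoints=keypoints)  # frames are 1-indexed

dataset = fo.Dataset("cotracker3-demo")
dataset.add_sample(sample)
session = fo.launch_app(dataset)
```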
About the Speaker
Harpreet Sahota is a hacker-in-residence and machine learning engineer with a passion for deep learning and generative AI. He has a deep interest in RAG, agents, and multimodal AI.
Streamlined Retail Product Detection with YOLOv8 and FiftyOne
Vanshika Jain
UNAR Labs
In the fast-paced retail environment, automation at checkout is increasingly essential to enhance operational efficiency and improve the customer experience. This talk will demonstrate a streamlined approach to retail product detection using the Retail Product Checkout (RPC) dataset, which includes 200 SKUs across 17 meta-categories such as puffed food, dried food, and drinks. By leveraging YOLOv8, renowned for its speed and accuracy in real-time object detection, and FiftyOne, an open-source toolset for computer vision, we can simplify data loading, training, evaluation, and visualization for effective product detection and classification. Attendees will gain insights into how these tools can be applied to optimize checkout automation.
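For a sense of what that pipeline looks like in code, here is a condensed sketch of the loading-to-evaluation loop with FiftyOne and Ultralytics. The file paths and the `yolov8n.pt` checkpoint are placeholders, and the RPC annotations are assumed to already be available in COCO format.

```python
import fiftyone as fo
from ultralytics import YOLO

# Load RPC images and COCO-format annotations into FiftyOne (paths are placeholders)
dataset = fo.Dataset.from_dir(
    dataset_type=fo.types.COCODetectionDataset,
    data_path="rpc/images",
    labels_path="rpc/annotations.json",
    label_field="ground_truth",
    name="rpc-checkout",
)

# Run a YOLOv8 checkpoint and store predictions on each sample
model = YOLO("yolov8n.pt")  # swap in a checkpoint fine-tuned on the 200 RPC SKUs
dataset.apply_model(model, label_field="predictions")

# Evaluate predictions against ground truth and inspect the results in the App
results = dataset.evaluate_detections(
    "predictions", gt_field="ground_truth", eval_key="eval", compute_mAP=True
)
print("mAP:", results.mAP())
session = fo.launch_app(dataset)
```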
About the Speaker
Vanshika Jain is a Data Engineer Intern at UNAR Labs, a startup focused on making information accessible for the blind. She holds a Master’s degree in Machine Learning and Computer Vision from Northeastern University and is passionate about applying AI and computer vision to real-world problems, with a focus on automation and accessibility.
Find a Meetup Near You
Join 12,000+ AI and ML enthusiasts who have already become members
The goal of the AI, Machine Learning, and Data Science Meetup network is to bring together a community of data scientists, machine learning engineers, and open source enthusiasts who want to share and expand their knowledge of AI and complementary technologies. If that’s you, we invite you to join the Meetup closest to your timezone.