We just wrapped up the Jan 25, 2024 AI, Machine Learning and Data Science Meetup, and if you missed it or want to revisit it, here’s a recap! In this blog post you’ll find the playback recordings, highlights from the presentations and Q&A, as well as the upcoming Meetup schedule so that you can join us at a future event.
First, Thanks for Voting for Your Favorite Charity!
In lieu of swag, we gave Meetup attendees the opportunity to help guide a $200 donation to charitable causes. The charity that received the highest number of votes this month was Heart to Heart International, an organization that ensures quality care is provided equitably in medically under-resourced communities and in disaster situations. We are sending this event’s charitable donation of $200 to Heart to Heart International on behalf of the Meetup members!
Missed the Meetup? No problem. Here are playbacks and talk abstracts from the event.
SANPO: A Scene Understanding, Accessibility, Navigation, Pathfinding, Obstacle Avoidance Dataset
In this talk we introduce SANPO, a large-scale egocentric video dataset focused on dense prediction in outdoor environments. It contains stereo video sessions collected across diverse outdoor environments, as well as rendered synthetic video sessions. To our knowledge, this is the first human egocentric video dataset with both large scale dense panoptic segmentation and depth annotations.
Speaker: Kimberly Wilber is a computer vision researcher at Google Research NYC. She previously studied tasks at the intersection of computer vision and crowdsourcing at Cornell Tech.
- Are you aware of any indoor datasets covering home/office room environments?
- Is private information, like people’s faces or car license plates, filtered from the dataset?
- What’s the difference between stereo and depth, exactly?
- Did participants record data and then mail the entire capture platform back to you so that you could retrieve the data?
- How can we help with data diversity and international expansion?
- What does AOT stand for?
- Anything to be said about the AOT component which reduces manual annotation by 500%?
- Does applying a “human annotated frame” correct AOT interpolations both backward and forward in time?
- Was annotating every fifth frame a decision based on the precision of AOT at that sample rate, or was it a hard limit based on your resources?
- How does SANPO perform across different weather conditions, like sunny, rainy, and cloudy?
- I thought event cameras mixed with RGB (a Zurich/IBM project) were the most promising future. What do you think?
- Did you manage to quantify the complementary benefits of using synthetic data?
Setmlvis: Object Detection Comparison Using Set Visualization
In this talk we introduce Setmlvis, a novel tool employing set theory and visualization for object detection model comparison. It efficiently aggregates and matches detection data across multiple models, highlighting where models align or diverge in object detection. This approach allows for analysis of each model’s unique capabilities and common strengths. Through our system, we demonstrate how set theory and visualization can be used as a valuable asset in the model evaluation process.
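The set-based comparison the abstract describes can be sketched in a few lines: match each model’s predicted boxes to ground truth by IoU, record which models found each object, then take set intersections and differences to find where models align or diverge. This is a simplified illustration, not Setmlvis itself; the IoU threshold, matching scheme, and example boxes are all assumptions.

```python
def iou(a, b):
    # Boxes as (x1, y1, x2, y2); returns intersection-over-union.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def detected(preds, gt, thresh=0.5):
    """Set of ground-truth indices matched by any prediction."""
    return {i for i, g in enumerate(gt)
            for p in preds if iou(p, g) >= thresh}

# Toy scene: three ground-truth objects, two models' predictions
gt = [(0, 0, 10, 10), (20, 20, 30, 30), (50, 50, 60, 60)]
models = {
    "A": [(1, 1, 10, 10), (21, 21, 30, 30)],
    "B": [(0, 0, 9, 9), (49, 50, 60, 61)],
}
hits = {name: detected(preds, gt) for name, preds in models.items()}

both = hits["A"] & hits["B"]                         # found by both models
only_a = hits["A"] - hits["B"]                       # unique to model A
only_b = hits["B"] - hits["A"]                       # unique to model B
missed = set(range(len(gt))) - hits["A"] - hits["B"] # found by neither
print(both, only_a, only_b, missed)
```

From these sets, a visualization layer can then render the alignment/divergence regions, which is the part the talk’s tool adds on top.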
Speaker: Liudas Panavas is a CS PhD student at Northeastern University’s Data Visualization Lab, where they focus on explainable AI for object detection algorithms. They specialize in developing visual analytics software and conducting comprehensive user experience evaluations.
- If a moving car is lost behind some sort of wall and reappears in the next camera, can SetMLvis compare and classify whether those two are the same car or different ones?
- SetML is very impressive. How did you code the black stack visuals in the Notebook?
- Is there a research paper?
- Can all these models be combined for a better prediction? And can they be further trained on rural landscapes, where all these models made false predictions?
- Is the imagery from satellites, drones or both?
How well does the Segment Anything Model work on a Fisheye lens?
The Segment Anything Model has revolutionized zero-shot segmentation, especially for rectilinear images. But how well does it work with fisheye lenses? These lenses produce wide panoramic or hemispherical images with visual distortion, especially at the edges of the image. In this talk, we look into the inner workings of the Segment Anything Model, its usage, and its zero-shot capabilities on fisheye images.
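The edge distortion the abstract mentions can be illustrated with two classic projection models: a rectilinear lens maps an incoming angle θ to image radius r = f·tan(θ), while an equidistant fisheye maps it to r = f·θ, compressing the periphery. This is an illustrative sketch only; the focal length and sample angles are arbitrary choices, not values from the talk.

```python
import math

f = 1.0  # focal length (arbitrary units)

def rectilinear(theta):
    # Pinhole/rectilinear projection: r = f * tan(theta)
    return f * math.tan(theta)

def equidistant_fisheye(theta):
    # Equidistant fisheye projection: r = f * theta
    return f * theta

for deg in (10, 45, 80):
    theta = math.radians(deg)
    r_rect = rectilinear(theta)
    r_fish = equidistant_fisheye(theta)
    # The ratio falls toward 0 at wide angles: the fisheye
    # squeezes more and more scene into less image radius.
    print(f"{deg:>2} deg  rectilinear={r_rect:.3f}  "
          f"fisheye={r_fish:.3f}  compression={r_fish / r_rect:.3f}")
```

Near the image center (small θ) the two projections nearly agree, so models trained mostly on rectilinear imagery degrade primarily at the periphery, which is where the talk probes SAM’s behavior.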
Speaker: Nahid Alam is a Staff ML Engineer working on video understanding on the Cisco Meraki Camera Intelligence team.
- Did you measure improvement in SAM performance if images were preprocessed by applying a dewarping algorithm?
- Does the attention output correspond to the query boxes?
- Have you used camera calibration to simply adjust the fisheye image, converting it to a “normal” one?
- Have you quantified the degradation impact relative to the distance from the camera center or by any other creative measure?
- Is there an open standard for meta-scene information labeling?
- Are the fisheye images generated by traffic enforcement cameras? From what other sources are the samples generated?
- The SAM dataset surely includes images with different distortion coefficients, and given SAM’s ability to zero-shot infer on arbitrary image content, one might expect zero-shot performance on extremely distorted images as well. Could you give us your intuition as to why it doesn’t work as well as expected?
- Are these processed as RGB images, or might other color spaces be useful for the algorithm?
- Do you use the full image as input or did you tile it?
Lightning Talk: Next-Generation Image/Video Editor Built with Generative AI
Storia Lab is a next-generation image/video editor built on the state-of-the-art in generative AI offering seamless visual capabilities including generation, style transfer, artifact cleanup, and more. Since launching in October, Storia Lab has already been used by tens of thousands of marketers, designers, and creatives to 100x their speed and quality of visual asset creation.
Speaker: Mihail Eric is a cofounder of Storia AI. He has over a decade of experience researching and engineering AI systems at scale. Previously he built the first deep learning dialogue systems at the Stanford NLP group. He was also a founding member of Amazon Alexa’s first special projects team where he built the organization’s earliest large language models.
- Is there an API for Storia?
- What’s the limitation concerning maximum resolution?
- What sort of compute power is needed for Storia?
Join the AI, Machine Learning and Data Science Meetup!
The AI, Machine Learning and Data Science Meetup membership has grown to almost 12,000 members! The goal of the Meetups is to bring together communities of data scientists, machine learning engineers, and open source enthusiasts who want to share and expand their knowledge of AI and complementary technologies.
Join one of the 12 Meetup locations; pick the one closest to your timezone:
- New York
- San Francisco
- Silicon Valley
Up next, on Feb 15 at 10 AM Pacific, we have a great lineup of speakers including:
- Towards Fair Computer Vision: Discover the Hidden Biases of an Image Classifier – Chenliang Xu at University of Rochester
- Food Waste Classification with AI – Luka Posilović, PhD at Kitro
- Objects and Image Geo-localization from Visual Data – Safwan Wshah at University of Vermont
- Lightning talk: Next Generation Video Understanding – James Le at Twelve Labs
There are a lot of ways to get involved in the Computer Vision Meetups. Reach out if you identify with any of these:
- You’d like to speak at an upcoming Meetup
- You have a physical meeting space in one of the Meetup locations and would like to make it available for a Meetup
- You’d like to co-organize a Meetup
- You’d like to co-sponsor a Meetup
Reach out to Meetup co-organizer Jimmy Guerrero on Meetup.com or LinkedIn to discuss how to get plugged in.
These Meetups are sponsored by Voxel51, the company behind the open source FiftyOne computer vision toolset. FiftyOne enables data science teams to improve the performance of their computer vision models by helping them curate high-quality datasets, evaluate models, find mistakes, visualize embeddings, and get to production faster. It’s easy to get started in just a few minutes.