We just wrapped up the July 13, 2023 Computer Vision Meetup, and if you missed it or want to revisit it, here’s a recap! In this blog post you’ll find the playback recordings, highlights from the presentations and Q&A, as well as the upcoming Meetup schedule so that you can join us at a future event.
First, Thanks for Voting for Your Favorite Charity!
In lieu of swag, we gave Meetup attendees the opportunity to help guide our monthly donation to charitable causes. The charity that received the highest number of votes this month was Drink Local Drink Tap, an international non-profit focused on solving water equity and quality issues through education, advocacy, and affordable, safe, clean water sources. We are sending this event’s charitable donation of $200 to Drink Local Drink Tap on behalf of the computer vision community.
Missed the Meetup? No problem. Here are playbacks and talk abstracts from the event.
Unleashing the Potential of Visual Data: Vector Databases in Computer Vision
Discover the game-changing role of vector databases in computer vision applications. These specialized databases excel at handling unstructured visual data, thanks to their robust support for embeddings and lightning-fast similarity search. Join us as we explore advanced indexing algorithms and showcase real-world examples in healthcare, retail, finance, and more using the FiftyOne engine combined with the Milvus vector database. See how vector databases unlock the full potential of your visual data.
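The core operation behind this talk, similarity search over embeddings, can be sketched in a few lines of NumPy. A vector database such as Milvus does this at scale with approximate indexes (e.g. HNSW or IVF); the brute-force version below just illustrates the idea, using random vectors in place of real model embeddings.

```python
import numpy as np

# Toy stand-in for real embeddings: in practice these come from a model
# such as CLIP or a ResNet; here they are random, purely for illustration.
rng = np.random.default_rng(0)
image_embeddings = rng.normal(size=(1000, 512))  # 1000 images, 512-dim

def top_k_similar(query, embeddings, k=5):
    """Return indices of the k most similar embeddings by cosine similarity."""
    q = query / np.linalg.norm(query)
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    scores = e @ q
    return np.argsort(-scores)[:k]

query = image_embeddings[42]  # use one image as the query
print(top_k_similar(query, image_embeddings))  # index 42 ranks first
```

A vector database replaces the exhaustive `argsort` with an approximate index so the same query stays fast at hundreds of millions of vectors.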
Speaker: Filip Haltmayer is a Software Engineer at Zilliz working in both software and community development.
Resource links
Q&A from the talk
- In the example where you searched for similar shoes, did you have embeddings generated for the individual objects (I think “patches” in FiftyOne) too, not just image-wide embeddings?
- Does FiftyOne’s similarity checker work on Milvus?
Computer Vision Applications at Scale with Vector Databases
Vector Databases enable semantic search at scale over hundreds of millions of unstructured data objects. In this talk Zain will introduce how you can use multi-modal encoders with the Weaviate vector database to semantically search over images and text. This will include demos across multiple domains including e-commerce and healthcare.
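The key idea in cross-modal search is that a jointly trained encoder (CLIP-style) maps images and text into the same vector space, so a text query can retrieve images directly. The sketch below fakes the encoders with random vectors; names and dimensions are illustrative, not Weaviate's API.

```python
import numpy as np

# Stand-in for a joint image/text encoder: in reality both modalities are
# embedded by a shared model so that related items land near each other.
rng = np.random.default_rng(1)

image_vectors = rng.normal(size=(100, 64))  # "encoded" images
labels = [f"image_{i}" for i in range(100)]

# Pretend the encoded text query lands near image 7 in the shared space.
text_vector = image_vectors[7] + 0.01 * rng.normal(size=64)

def nearest(query, vectors):
    """Index of the most similar vector by cosine similarity."""
    sims = (vectors @ query) / (
        np.linalg.norm(vectors, axis=1) * np.linalg.norm(query)
    )
    return int(np.argmax(sims))

print(labels[nearest(text_vector, image_vectors)])
```

This is also the answer shape for the lion's-roar question below: audio and images end up near each other only if some model was trained to embed both modalities (or a bridging modality like text) into one space.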
Speaker: Zain Hasan is a senior developer advocate at Weaviate, an open source vector database.
Q&A from the talk
- In regards to multimodal applications, can you discuss a bit more how audio and image can be embedded in the same vector space, e.g. how do you make a lion’s roar be in proximity to the image of a lion? Or is a text label used to connect them?
- What do you mean by short term recommendation and long term recommendations?
- Is it possible to embed movements in a database like Weaviate, e.g. search through security video to find someone hitting someone else?
- Can you explain how the symbolic graphs are used in recommendation systems?
- What is the typical response time?
- What are some of the challenges you faced while trying to combine embeddings from different modalities?
- Can you suggest some popular models for fashion and garments use cases?
Reverse Image Search for Ecommerce Without Going Crazy
Traditional full-text search engines have been on the market for a while, and we are all currently trying to extend them with semantic search. Still, it might be more beneficial for some ecommerce businesses to introduce reverse image search capabilities instead of relying on text alone. However, semantic search and reverse image search can and should coexist! You may encounter common pitfalls while implementing both, so why don’t we discuss the best practices? Let’s learn how to extend your existing search system with reverse image search, without getting lost in the process!
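One common pattern when text search and reverse image search coexist is score fusion: each engine returns its own ranked, scored list, and a weighted sum decides the final order. The sketch below is a minimal illustration; the weighting scheme and the SKU names are assumptions, not Qdrant-specific behavior.

```python
# Fuse full-text relevance with image-similarity scores.
# Assumes both score sets are already normalized to [0, 1].
def fuse_scores(text_scores, image_scores, alpha=0.5):
    """alpha=1.0 ranks by text only; alpha=0.0 by image similarity only."""
    fused = {}
    for doc_id in set(text_scores) | set(image_scores):
        t = text_scores.get(doc_id, 0.0)
        v = image_scores.get(doc_id, 0.0)
        fused[doc_id] = alpha * t + (1 - alpha) * v
    return sorted(fused, key=fused.get, reverse=True)

text_scores = {"sku_1": 0.9, "sku_2": 0.4}   # hypothetical text hits
image_scores = {"sku_2": 0.95, "sku_3": 0.7}  # hypothetical image hits
print(fuse_scores(text_scores, image_scores))  # ['sku_2', 'sku_1', 'sku_3']
```

Tuning `alpha` is also one answer to the ranking question below: boosting one search over the other is just moving the weight.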
Speaker: Kacper Łukawski is a Developer Advocate at Qdrant.
Resource link
Q&A from the talk
- Can we use vector embeddings to track objects moving from one frame to another? If there are n objects in one frame and their positions have moved in the next frame, can we compare the embeddings of these objects and find the exact location of each object in the new frame?
- How do you rank between textual and vector search? In other words, can we boost one search over the other?
- Do you think there would be some benefits of using some kind of RLHF or RL(KPI)F to decide when/how to update the embedding models?
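The first question above, matching detections across frames by embedding similarity, can be sketched as a one-to-one assignment on a cosine similarity matrix. This is a toy greedy matcher on synthetic vectors; production trackers add motion cues, thresholds, and a proper assignment solver.

```python
import numpy as np

def match_detections(prev_emb, curr_emb):
    """Greedily match each previous detection to its most similar
    unclaimed detection in the current frame (prev index -> curr index)."""
    prev = prev_emb / np.linalg.norm(prev_emb, axis=1, keepdims=True)
    curr = curr_emb / np.linalg.norm(curr_emb, axis=1, keepdims=True)
    sim = prev @ curr.T
    matches, used = {}, set()
    for i in np.argsort(-sim.max(axis=1)):  # most confident objects first
        for j in np.argsort(-sim[i]):
            if int(j) not in used:
                matches[int(i)] = int(j)
                used.add(int(j))
                break
    return matches

# Frame 2 holds the same three objects, shuffled and slightly perturbed.
rng = np.random.default_rng(3)
frame1 = rng.normal(size=(3, 32))
frame2 = frame1[[2, 0, 1]] + 0.01 * rng.normal(size=(3, 32))
print(match_detections(frame1, frame2))
```

Because the embeddings barely change between frames, each object's true counterpart dominates the similarity matrix and the greedy pass recovers the permutation.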
Fast and Flexible Data Discovery & Mining for Computer Vision at Petabyte Scale
Improving model performance requires methods to discover computer vision data, sometimes from large repositories, whether it’s examples similar to previously seen errors, new examples and scenarios, or more advanced techniques such as active learning and RLHF. LanceDB makes this fast and flexible for multi-modal data, with support for vector search, SQL, Pandas, Polars, Arrow, and a growing ecosystem of tools that you’re already familiar with. We’ll walk through some common search examples and show how you can find needles in a haystack to improve your metrics!
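The combination of SQL-style filtering and vector search that the talk describes can be sketched with plain pandas and NumPy: narrow by metadata first, then rank the survivors by embedding similarity. Engines like LanceDB push both steps into a single query; the column names and data here are made up for illustration.

```python
import numpy as np
import pandas as pd

# A tiny mock dataset: each row has metadata plus an embedding vector.
rng = np.random.default_rng(2)
df = pd.DataFrame({
    "id": range(8),
    "scene": ["night"] * 4 + ["day"] * 4,
    "embedding": [rng.normal(size=16) for _ in range(8)],
})

query = df.loc[5, "embedding"]  # use a "day" image as the query

# Step 1: SQL-style metadata filter. Step 2: rank by cosine similarity.
candidates = df[df["scene"] == "day"]
sims = candidates["embedding"].apply(
    lambda e: float(e @ query / (np.linalg.norm(e) * np.linalg.norm(query)))
)
result = candidates.assign(score=sims).nlargest(3, "score")["id"].tolist()
print(result)  # the query image (id 5) ranks first
```

Filtering before the similarity step is what makes "find more examples like this failure, but only from night scenes" a single cheap query.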
Speakers: Jai Chopra, founding product lead, and Ayush Chaurasia, founding engineer, at LanceDB.
Resource links
- Get started with LanceDB by reading the LanceDB Documentation
- Join the LanceDB community on Discord
- Read the LanceDB blog for product updates and technical discussions
- Check out YoloExplorer for practical examples for querying computer vision data
- Read the Voxel51 LanceDB Integration documentation for how to use LanceDB with Voxel51
Q&A from the talk
- In regards to autonomous vehicles, it’s probably one of the “edge computing” cases of a CV environment, where one has little to no cloud resources. Are there any strategies you can recommend for narrowing down the tech stack or algorithms to account for this?
How to Build Scalable Image and Text Search for Computer Vision Data Using Pinecone & Qdrant
Have you ever wanted to find the images most similar to an image in your dataset? What if you haven’t picked out an illustrative image yet, but you can describe what you are looking for using natural language? And what if your dataset contains millions, or tens of millions, of images? In this talk Jacob will show you step-by-step how to integrate all the technology required to enable search for similar images and search with natural language, plus how to scale those searches with Pinecone and Qdrant. He’ll dive deep into the tech and show you a variety of practical examples that can help transform the way you manage your image data.
Speaker: Jacob Marks is a Machine Learning Engineer and Developer Evangelist at Voxel51.
Resource links
- Try FiftyOne, no install required!
- “How I Turned My Company’s Docs into a Searchable Database with OpenAI”
- “The Computer Vision Interface for Vector Search”
- How-to: Sorting by Similarity and Similarity with the FiftyOne Brain
Q&A from the talk
- Can the FiftyOne platform be used for non-computer vision applications?
- Is there an LLM behind VoxelGPT?
- The vector embeddings from an image could be way different from the vector embeddings of an audio that is related/same context as the image. How do those two vectors end up close to each other in the embedding space?
Join the Computer Vision Meetup!
Computer Vision Meetup membership has grown to almost 5,000 members in just under a year! The goal of the Meetups is to bring together communities of data scientists, machine learning engineers, and open source enthusiasts who want to share and expand their knowledge of computer vision and complementary technologies.
Join one of the 13 Meetup locations closest to your timezone.
- Ann Arbor
- Austin
- Bangalore
- Boston
- Chicago
- London
- New York
- Peninsula
- San Francisco
- Seattle
- Silicon Valley
- Singapore
- Toronto
We have exciting speakers already signed up over the next few months! Become a member of the Computer Vision Meetup closest to you, then register for the Zoom.
What’s Next?
Up next on July 20 at 10 AM IST we have a great lineup of speakers, including:
- MARLIN: Masked Autoencoder for facial video Representation LearnINg – Zhixi Cai, Monash University
- Unleashing the Potential of Visual Data: Vector Databases in Computer Vision – Filip Haltmayer, Zilliz
- DreamSim: Learning New Dimensions of Human Visual Similarity using Synthetic Data – Stephanie Fu and Shobhita Sundaram (MIT) and Netanel Tamir (Weizmann Institute of Science)
Register for the Zoom here. You can find a complete schedule of upcoming Meetups on the Voxel51 Events page.
Get Involved!
There are a lot of ways to get involved in the Computer Vision Meetups. Reach out if you identify with any of these:
- You’d like to speak at an upcoming Meetup
- You have a physical meeting space in one of the Meetup locations and would like to make it available for a Meetup
- You’d like to co-organize a Meetup
- You’d like to co-sponsor a Meetup
Reach out to Meetup co-organizer Jimmy Guerrero on Meetup.com or ping him over LinkedIn to discuss how to get you plugged in.
The Computer Vision Meetup network is sponsored by Voxel51, the company behind the open source FiftyOne computer vision toolset. FiftyOne enables data science teams to improve the performance of their computer vision models by helping them curate high quality datasets, evaluate models, find mistakes, visualize embeddings, and get to production faster. It’s easy to get started in just a few minutes.