Image similarity search in the FiftyOne App using a MongoDB Atlas Vector Search backend.
The vast majority of the world’s data is unstructured, nestled within images, videos, audio files, and text. Whether you’re developing application-specific business solutions or trying to train a state-of-the-art machine learning model, understanding and extracting insights from unstructured data is more important than ever.
Without the right tools, interpreting features in unstructured data can feel like looking for a needle in a haystack. Fortunately, the integration between FiftyOne and MongoDB Atlas enables the processing and analysis of visual data with unparalleled efficiency!
In this post, we will show you how to use FiftyOne and MongoDB Atlas Vector Search to streamline your data-centric workflows and interact with your visual data like never before.
What is FiftyOne?
Filtering a demo image dataset by class label and prediction confidence score in the FiftyOne App.
FiftyOne is the leading open-source toolkit for the curation and visualization of unstructured data, built on top of MongoDB. It leverages the non-relational nature of MongoDB to provide an intuitive interface for working with datasets consisting of images, videos, point clouds, PDFs, and more.
You can install FiftyOne from PyPI by running pip install fiftyone.
The core data structure in FiftyOne is the Dataset, which consists of samples: collections of labels, metadata, and other attributes associated with a media file. You can access, query, and run computations on this data either programmatically, with the FiftyOne Python software development kit, or visually via the FiftyOne App.
Once you have a fiftyone.Dataset instance, you can create a view into your dataset (DatasetView) by applying view stages. These view stages allow you to perform common operations like filtering, matching, sorting, and selecting by using arbitrary attributes on your samples.
To programmatically isolate all high-confidence predictions of an airplane, for instance, we could run:
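A minimal sketch of such a query, assuming the predictions live in a label field named predictions and that a confidence above 0.8 counts as high (the helper name, field name, and threshold are all illustrative choices):

```python
def high_confidence_airplanes(dataset, field="predictions", threshold=0.8):
    """Return a view keeping only `airplane` labels scored above `threshold`.

    `field` and `threshold` are illustrative defaults, not values from the post.
    """
    # Imported lazily so this module still loads where FiftyOne is not installed.
    from fiftyone import ViewField as F

    # Boolean expression evaluated against every label stored in `field`.
    is_confident_airplane = (F("label") == "airplane") & (F("confidence") > threshold)

    # filter_labels() drops non-matching labels and, by default, also drops
    # samples that are left with no matching labels.
    return dataset.filter_labels(field, is_confident_airplane)
```

With the quickstart dataset from the FiftyOne Dataset Zoo, for example, loading it with fiftyone.zoo.load_zoo_dataset("quickstart") and passing it to high_confidence_airplanes() should yield a DatasetView that can be handed to fiftyone.launch_app().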
Note that this achieves the same result as the UI-based filtering shown in the GIF above.
This querying functionality is incredibly powerful. For a full list of supported view stages, check out this View Stages cheat sheet. What’s more, these operations readily scale to billions of samples. How? Simply put, they are built on MongoDB aggregation pipelines!
When you print out the DatasetView, you can see a summary of the applied aggregation under “View stages”.
We can explicitly obtain the MongoDB aggregation pipeline behind a view directly with the _pipeline() method:
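As a rough sketch, both inspections can be wrapped in one helper. The helper name describe_view is hypothetical, and because _pipeline() is a private method, its exact output may differ between FiftyOne releases:

```python
def describe_view(view):
    """Return the printable summary and the raw aggregation pipeline of a view."""
    # str(view) is the text that print(view) displays, including the
    # "View stages" section that lists each applied stage.
    summary = str(view)

    # _pipeline() is private API: it returns the MongoDB aggregation
    # pipeline for the view as a list of stage dictionaries.
    pipeline = view._pipeline()
    return summary, pipeline
```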
You can also inspect the underlying MongoDB document for a sample with the to_mongo() method.
You can even create a DatasetView by applying a MongoDB aggregation pipeline directly to your dataset using the Mongo view stage and the add_stage() method:
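One way this might look, assuming a pipeline that keeps samples carrying a given tag and orders them by file path (the pipeline contents and both helper names are illustrative):

```python
def tagged_samples_pipeline(tag):
    """Build a raw MongoDB aggregation pipeline as a plain list of stage dicts."""
    return [
        {"$match": {"tags": tag}},    # keep samples whose `tags` list contains `tag`
        {"$sort": {"filepath": 1}},   # order the matches by file path, ascending
    ]


def apply_pipeline(dataset, pipeline):
    """Wrap `pipeline` in a Mongo view stage and apply it to `dataset`."""
    # Imported lazily so this module still loads where FiftyOne is not installed.
    import fiftyone as fo

    return dataset.add_stage(fo.Mongo(pipeline))
```

Because the pipeline is ordinary Python data, it can be built, logged, and unit tested without a database connection, and only apply_pipeline() needs a live dataset.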
Vector Search With FiftyOne and MongoDB Atlas
Searching images with text in the FiftyOne App using multimodal vector embeddings and a MongoDB Atlas Vector Search backend.
Vector search is a technique for indexing unstructured data like text and images by representing them with high-dimensional numerical vectors called embeddings, generated from a machine learning model. This makes the unstructured data searchable, as inputs can be compared and assigned similarity scores based on the alignment between their embedding vectors. The indexing and searching of these vectors are efficiently performed by purpose-built vector databases like MongoDB Atlas Vector Search.
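To make the idea of scoring by alignment concrete, here is a toy sketch in plain Python that ranks a few made-up three-dimensional embeddings by cosine similarity to a query vector. It is an exhaustive comparison for illustration only; real embeddings have hundreds of dimensions, and a vector database relies on specialized indexes rather than scoring every vector.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between vectors `a` and `b` (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, embeddings, k=3):
    """Return the `k` (name, score) pairs most similar to `query`, best first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in embeddings.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Made-up embeddings, purely for illustration.
embeddings = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "dog.jpg": [0.8, 0.3, 0.1],
    "car.jpg": [0.0, 0.2, 0.9],
}
query = [1.0, 0.0, 0.0]

top_two = rank_by_similarity(query, embeddings, k=2)
print([name for name, _ in top_two])
```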
Vector search is an essential ingredient in retrieval-augmented generation (RAG) pipelines for LLMs. Additionally, it enables a plethora of visual and multimodal applications in data understanding, like finding similar images, searching for objects within your images, and even semantically searching your visual data using natural language.
Now, with the integration between FiftyOne and MongoDB Atlas, it is easier than ever to apply vector search to your visual data! When you use FiftyOne and MongoDB Atlas, your traditional queries and vector search queries are connected by the same underlying data infrastructure. This streamlines development, leaving you with fewer services to manage and less time spent on tedious ETL tasks. Just as importantly, when you mix and match traditional queries with vector search queries, MongoDB can optimize efficiency over the entire aggregation pipeline.
Connecting FiftyOne and MongoDB Atlas
To get started, first configure a MongoDB Atlas cluster:
Then, set MongoDB Atlas as your default vector search backend:
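Both steps can be sketched from Python through environment variables. The variable names below follow FiftyOne's configuration conventions as we understand them, so check them against the documentation for your installed version. The connection string is a placeholder and the database name is an arbitrary choice. The variables should be set before FiftyOne is imported, since it reads its configuration at import time.

```python
import os


def configure_atlas(database_uri, database_name="fiftyone"):
    """Point FiftyOne at an Atlas cluster and default similarity search to MongoDB."""
    # Connection string of the Atlas cluster that will back FiftyOne.
    os.environ["FIFTYONE_DATABASE_URI"] = database_uri
    # Name of the database to use on that cluster (arbitrary choice here).
    os.environ["FIFTYONE_DATABASE_NAME"] = database_name
    # Make MongoDB the default backend for FiftyOne Brain similarity indexes.
    os.environ["FIFTYONE_BRAIN_DEFAULT_SIMILARITY_BACKEND"] = "mongodb"


# Placeholder values: substitute your own credentials and cluster host.
configure_atlas(
    "mongodb+srv://<username>:<password>@<cluster-host>/?retryWrites=true&w=majority"
)
```

Exporting the same three variables in your shell profile is equivalent and keeps credentials out of source code.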
Generating the Similarity Index
You can then create a similarity index on your dataset (or dataset view) by using the FiftyOne Brain’s compute_similarity() method. To do so, you can provide any of the following:
- An array of embeddings for your samples
- The name of a field on your samples containing embeddings
- The name of a model from the FiftyOne Model Zoo (CLIP, OpenCLIP, DINOv2, etc.), to use to generate embeddings
- A fiftyone.Model instance to use to generate embeddings
- A Hugging Face transformers model to use to generate embeddings
When you generate the similarity index, you can also pass in configuration parameters for the MongoDB Atlas Vector Search index: the index_name and what metric to use to measure similarity between vectors.
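A sketch of one such call, assuming the CLIP model from the FiftyOne Model Zoo, an embeddings field named embeddings, and hypothetical brain_key and index_name values. The set of accepted metric names is our assumption about the MongoDB backend, so confirm it for your version:

```python
# Assumed metric names for the MongoDB backend; verify against the docs.
SUPPORTED_METRICS = {"cosine", "dotproduct", "euclidean"}


def similarity_index_options(brain_key="mongodb_index",
                             index_name="fiftyone_vector_index",
                             metric="cosine"):
    """Collect the keyword arguments that configure the Atlas index."""
    if metric not in SUPPORTED_METRICS:
        raise ValueError(f"unsupported metric: {metric!r}")
    return {
        "backend": "mongodb",      # use MongoDB Atlas Vector Search
        "brain_key": brain_key,    # hypothetical key for retrieving the index later
        "index_name": index_name,  # hypothetical name of the Atlas search index
        "metric": metric,          # how similarity between vectors is measured
    }


def build_similarity_index(dataset, model="clip-vit-base32-torch",
                           embeddings="embeddings", **options):
    """Compute embeddings with `model`, store them in `embeddings`, and index them."""
    # Imported lazily so this module still loads where FiftyOne is not installed.
    import fiftyone.brain as fob

    return fob.compute_similarity(
        dataset, model=model, embeddings=embeddings, **options
    )
```

Calling build_similarity_index(dataset, **similarity_index_options()) would then create the index under the brain key mongodb_index.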
Sorting by Similarity
Once you have run compute_similarity() to generate the index, you can sort by similarity using the MongoDB Atlas Vector Search engine with the sort_by_similarity() view stage. In Python, you specify the sample whose image you want to match by passing in its ID:
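For instance, a minimal sketch that uses the first sample in the dataset as the query, with a hypothetical brain key of mongodb_index and an arbitrary k:

```python
def most_similar_to_first(dataset, k=10, brain_key="mongodb_index"):
    """Return a view of the `k` samples most similar to the dataset's first sample."""
    # Any sample ID works as the query; the first sample is just a convenient pick.
    query_id = dataset.first().id

    # `brain_key` names the similarity index to search (hypothetical value here).
    return dataset.sort_by_similarity(query_id, k=k, brain_key=brain_key)
```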
If you only have one similarity index on your dataset, you don’t need to specify the brain_key.
We can achieve the same result with the UI alone by selecting an image and then pressing the button with the image icon in the menu bar:
Searching by similarity in the FiftyOne App using vector embeddings and indexing with a MongoDB Atlas Vector Search backend.
The coolest part is that sort_by_similarity() can be interleaved with other view stages, so there is no need to write custom pre- and post-processing scripts. Keep everything in the same query language and underlying data model. Here’s a simple example, just to get the point across:
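One possible chain, assuming samples tagged validation, a label field named ground_truth, and a hypothetical brain key (all three are illustrative, as is the helper name):

```python
def similar_validation_samples(dataset, query_id, k=20, brain_key="mongodb_index"):
    """Filter by tag, rank by vector similarity, then trim fields, in one view."""
    return (
        dataset
        .match_tags("validation")                                # traditional filter
        .sort_by_similarity(query_id, k=k, brain_key=brain_key)  # vector search
        .select_fields("ground_truth")                           # traditional projection
    )
```

Each call returns a view, so the vector search step sits in the chain like any other stage.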
But wait, there’s so much more! The FiftyOne and MongoDB Atlas Vector Search integration also natively supports semantically searching your data with natural language queries. As long as the model you specify can embed both text and images (think CLIP, OpenCLIP models, and any of the zero-shot classification or detection models from Hugging Face’s transformers library), you can pass a string in as a query:
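A short sketch, assuming the index behind the hypothetical brain key mongodb_index was built with a model that can embed text:

```python
def search_by_text(dataset, prompt, k=10, brain_key="mongodb_index"):
    """Return a view of the `k` samples whose images best match `prompt`."""
    # A string query is embedded by the index's model and then compared
    # against the stored image embeddings.
    return dataset.sort_by_similarity(prompt, k=k, brain_key=brain_key)
```

For example, search_by_text(dataset, "kites high in the air") would return the ten closest matches for that phrase.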
The same search is available in the FiftyOne App via the button with the magnifying glass icon.
Conclusion
Filtering, querying, and visualizing your unstructured data doesn’t have to be hard.
Together, MongoDB and FiftyOne offer a flexible and powerful yet still remarkably simple and efficient way to get the most out of your visual data!