Voxel51 & Google Collaborate to Make Downloading & Visualizing Open Images A Breeze


Ann Arbor, MI – Voxel51 today announced a collaboration with Google to support Google’s Open Images Dataset, one of the largest visual datasets in the world, used by AI researchers and the machine learning community for common object detection and other computer vision tasks. Through the collaboration, Open Images users will gain access to Voxel51’s free, open source machine learning developer tool, “FiftyOne,” to more efficiently and effectively load, visualize and evaluate datasets from Open Images’ annotated collection of over 9 million images. Machine learning’s massive rise in adoption is powered by the widespread availability of large, manually labeled datasets, but what truly makes this data useful in complex, real-world applications is the ability to easily and rapidly analyze and experiment with it at scale.

Starting today, when users visit the Open Images download page, they will be directed to FiftyOne, where they can easily download all or part of Open Images, visualize its rich annotations and evaluate models with the official Open Images protocol. FiftyOne enables users to rapidly analyze and improve the quality, accuracy and diversity of computer vision datasets in order to improve model performance. Users can also select specific dataset subsets, types of annotations and classes of objects to download.

Google reached out to Voxel51 about the dataset analysis capabilities of FiftyOne following an error analysis study that Voxel51 performed on Google’s Open Images Dataset. In a blog post, Voxel51 detailed how it used FiftyOne’s visualization and model analysis features to observe patterns in dataset errors that frequently stifle model performance. In this case, even though the object detection ground truth in Open Images is 98% correct, nearly a third of the “false positive errors” among modern models were due to the small number of ground-truth imperfections rather than model errors.

“FiftyOne enables researchers to analyze and improve the quality of their datasets rapidly, replacing the weeks of manual labor that would otherwise be required without this technology,” said Jordi Pont-Tuset, Research Scientist at Google. “High-quality data is critical to the success of machine learning systems. Without the right tools to analyze and curate datasets, machine learning development can be inefficient and ineffective.”

“In order to push the frontier of what’s possible in machine learning, model output must be closely analyzed in conjunction with the datasets in a detailed manner,” said Jason Corso, co-founder and CEO of Voxel51 and director of the Stevens Institute for Artificial Intelligence. “The classical aggregate analysis methods most commonly used in machine learning are only part of what is necessary to develop high performing datasets and models. We created FiftyOne to fill a gap in developer tools for machine learning and we’re proud to be empowering others to build better datasets and to train better models through this collaboration with Google.”

Since launching in August 2020, FiftyOne has transformed the machine learning and computer vision development lifecycle. FiftyOne provides a flexible and open core software ecosystem to build machine learning workflows, with an emphasis on the important role that high-quality data plays at every stage of the workflow. Among its state-of-the-art features are a novel query language to easily search and filter images and videos based on their content, a user-friendly web-based application for visualizing data, access to powerful tools like embeddings that can uncover hidden patterns and automatically label data, and the flexibility to work locally, remotely, in the cloud, or in Jupyter/Colab notebooks. FiftyOne is available at

About Voxel51
Headquartered in Ann Arbor, Michigan, and founded in 2016 by Dr. Jason Corso and Dr. Brian Moore, Voxel51 is an AI software company that is democratizing access to software 2.0 by providing the open core software building blocks that enable computer vision and machine learning engineers to rapidly engineer data-powered workflows. Learn more at