The open-source tool for building high-quality datasets and computer vision models
Nothing hinders the success of machine learning systems more than poor quality data. And without the right tools, improving a model can be time-consuming and inefficient.
FiftyOne supercharges your machine learning workflows by enabling you to visualize datasets and interpret models faster and more effectively.
Improving data quality and understanding your model’s failure modes are the most impactful ways to boost the performance of your model.
FiftyOne provides the building blocks for optimizing your dataset analysis pipeline. Use it to get hands-on with your data, including visualizing complex labels, evaluating your models, exploring scenarios of interest, identifying failure modes, finding annotation mistakes, and much more!
FiftyOne integrates naturally with your favorite tools. Click on a logo to learn how:
FiftyOne is growing! Sign up for the mailing list to learn about new features as they come out.
Surveys show that machine learning engineers spend over half of their time wrangling data, but it doesn't have to be that way. Use FiftyOne's powerful dataset import and manipulation capabilities to manage your data with ease.
Aggregate metrics alone don’t give you the full picture of your ML models. In practice, the limiting factor on your model’s performance is often data quality issues that you need to see to address. FiftyOne makes it easy to do just that.
Are you using embeddings to analyze your data and models? Use FiftyOne's embeddings visualization capabilities to reveal hidden structure in you data, mine hard samples, pre-annotate data, recommend new samples for annotation, and more.
Working with geolocation
Many datasets have location metadata, but visualizing location-based datasets has traditionally required closed source or cloud-based tools. FiftyOne provides native support for storing, visualizing, and querying datasets by location.
Finding annotation mistakes
Annotations mistakes create an artificial ceiling on the performance of your model. However, finding these mistakes by hand is not feasible! Use FiftyOne to automatically identify possible label mistakes in your datasets.
Removing redundant images
During model training, the best results will be seen when training on unique data. Use FiftyOne to automatically remove duplicate or near-duplicate images from your datasets and curate diverse training datasets from your raw data.
The FiftyOne tool has three components: the Python library, the App, and the Brain.
FiftyOne’s core library provides a structured yet dynamic representation to explore your datasets. You can efficiently query and manipulate your dataset by adding custom tags, model predictions and more.
1 2 3 4 5 6 7 8 9 10 11 12 13 14
import fiftyone as fo dataset = fo.Dataset("my_dataset") sample = fo.Sample(filepath="/path/to/image.png") sample.tags.append("train") sample["custom_field"] = 51 dataset.add_sample(sample) view = dataset.match_tags("train").sort_by("custom_field").limit(10) for sample in view: print(sample)
FiftyOne is designed to be lightweight and flexible, making it easy to load your datasets. FiftyOne supports loading datasets in a variety of common formats out-of-the-box, and it also provides the extensibility to load datasets in custom formats.
Check out loading datasets to see how to load your data into FiftyOne.
The FiftyOne App is a graphical user interface that makes it easy to explore and rapidly gain intuition into your datasets. You can visualize labels like bounding boxes and segmentations overlaid on the samples; sort, query and slice your dataset into any subset of interest; and more.
The FiftyOne Brain is a library of powerful machine learning-powered capabilities that provide insights into your datasets and recommend ways to modify your datasets that will lead to measurably better performance of your models.
1 2 3 4
import fiftyone.brain as fob fob.compute_uniqueness(dataset) rank_view = dataset.sort_by("uniqueness")
Where should you go from here? You could…