“Rapidly experiment with your datasets”
If you are looking to boost the performance of your machine learning models, chances are improving the quality of your dataset will provide the highest return on your investment. Enter FiftyOne. FiftyOne is a Python-based tool for machine learning/computer vision engineers and scientists that enables you to curate better datasets. Work efficiently with FiftyOne to achieve better models with dependable performance.
“Become one with your data”
FiftyOne does more than improve your dataset; it gets you closer to your data. Rapidly gain insight by visualizing samples overlaid with dynamic and queryable fields such as ground truth and predicted labels, dataset splits, and much more!
FiftyOne is rapidly growing. Sign up for the mailing list so we can keep you posted on new features as they come out!
FiftyOne provides advanced capabilities that will turbocharge your machine learning workflows.
Finding annotation mistakes
Annotation mistakes create an artificial ceiling on the performance of your model. However, finding these mistakes by hand is not feasible! Use FiftyOne to automatically identify possible label mistakes in your datasets. Check out the label mistakes tutorial.
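To make the idea concrete, here is a minimal sketch of one common heuristic: flag a sample when a model confidently predicts a label that differs from its annotation. This is not FiftyOne's method or API; the function name `flag_possible_mistakes`, the record layout, and the 0.9 threshold are all assumptions chosen for illustration.

```python
def flag_possible_mistakes(samples, min_confidence=0.9):
    """Return ids of samples whose annotation disagrees with a confident prediction.

    Illustrative heuristic only; not how FiftyOne computes label mistakes.
    """
    flagged = []
    for sample_id, annotated, predicted, confidence in samples:
        # A confident disagreement suggests the annotation deserves a second look
        if predicted != annotated and confidence >= min_confidence:
            flagged.append(sample_id)
    return flagged


# Hypothetical records: (sample id, annotated label, predicted label, confidence)
samples = [
    ("img_001", "cat", "cat", 0.98),
    ("img_002", "dog", "cat", 0.97),
    ("img_003", "dog", "cat", 0.55),
    ("img_004", "cat", "dog", 0.93),
]

suspects = flag_possible_mistakes(samples)
print(suspects)
```

Flagged samples are candidates for human review, not confirmed errors: the model may simply be wrong.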
Removing redundant images
Models train best on unique data. Use FiftyOne to automatically remove duplicate or near-duplicate images from your datasets and curate diverse training datasets from your raw data. Try the image uniqueness tutorial.
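As a rough illustration of near-duplicate removal, the sketch below greedily keeps an image only if its embedding is not too similar to any image already kept. It assumes each image already has a non-zero embedding vector from some model; the helper names, the toy vectors, and the 0.99 similarity cutoff are hypothetical, and this is not FiftyOne's uniqueness algorithm.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def drop_near_duplicates(embeddings, max_similarity=0.99):
    """Greedily keep each image unless it is too similar to one already kept."""
    kept = []
    for name, vector in embeddings.items():
        if all(cosine_similarity(vector, embeddings[k]) < max_similarity for k in kept):
            kept.append(name)
    return kept


# Hypothetical embeddings; a real pipeline would compute these with a model
embeddings = {
    "a.png": [1.0, 0.0, 0.0],
    "b.png": [0.999, 0.01, 0.0],  # nearly identical to a.png
    "c.png": [0.0, 1.0, 0.0],
    "d.png": [0.0, 0.0, 1.0],
}

unique_images = drop_near_duplicates(embeddings)
print(unique_images)
```

The greedy pass compares every candidate against every kept image, so it is quadratic in the worst case; larger datasets would typically use an approximate nearest-neighbor index instead.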
Bootstrapping datasets from raw images
"What data should I select to annotate?" Use FiftyOne to automatically recommend unlabeled samples from your dataset to send for annotation, enabling you to bootstrap a training dataset that leads to demonstrably better model performance. Tutorial coming soon.
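One standard way to choose what to annotate is uncertainty sampling: label the samples the current model is least sure about. The sketch below ranks unlabeled samples by the entropy of their predicted class probabilities. It is an assumption about how such a recommendation could work, not a description of FiftyOne's approach, and `select_for_annotation` and the sample data are hypothetical.

```python
import math


def entropy(probabilities):
    """Shannon entropy (in nats) of a class-probability distribution."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)


def select_for_annotation(predictions, budget):
    """Return the `budget` sample names with the most uncertain predictions."""
    ranked = sorted(predictions, key=lambda name: entropy(predictions[name]), reverse=True)
    return ranked[:budget]


# Hypothetical model outputs for unlabeled images (one probability per class)
predictions = {
    "u1.jpg": [0.98, 0.01, 0.01],
    "u2.jpg": [0.34, 0.33, 0.33],
    "u3.jpg": [0.70, 0.20, 0.10],
    "u4.jpg": [0.50, 0.49, 0.01],
}

to_label = select_for_annotation(predictions, budget=2)
print(to_label)
```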
Adding optimal samples to your dataset
"What new samples should I add to my training dataset to see the largest improvement in my model?" FiftyOne provides methods for mining hard samples from your datasets, a tried and true measure of mature machine learning processes. Tutorial coming soon.
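Hard sample mining is often approximated by ranking labeled samples by their loss under the current model and keeping the highest-loss ones. The following sketch does this with per-sample cross-entropy. The record format and the `mine_hard_samples` helper are assumptions for illustration and do not reflect FiftyOne's implementation.

```python
import math


def cross_entropy(probabilities, true_index):
    """Negative log-likelihood of the true class, clamped to avoid log(0)."""
    return -math.log(max(probabilities[true_index], 1e-12))


def mine_hard_samples(records, k):
    """Return the ids of the k samples with the highest loss."""
    ranked = sorted(
        records,
        key=lambda r: cross_entropy(r["probs"], r["label"]),
        reverse=True,
    )
    return [r["id"] for r in ranked[:k]]


# Hypothetical records: predicted class probabilities and the true class index
records = [
    {"id": "s1", "probs": [0.9, 0.1], "label": 0},
    {"id": "s2", "probs": [0.4, 0.6], "label": 0},
    {"id": "s3", "probs": [0.2, 0.8], "label": 0},
    {"id": "s4", "probs": [0.05, 0.95], "label": 1},
]

hard = mine_hard_samples(records, k=2)
print(hard)
```

High loss can also indicate a mislabeled sample, so it is worth inspecting the mined samples before adding similar data to the training set.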
The FiftyOne tool has three components: the core library, the App, and the Brain.
FiftyOne’s core library provides a structured yet dynamic representation to explore your datasets. You can efficiently query and manipulate your dataset by adding custom tags, model predictions and more.
    import fiftyone as fo

    # Create a dataset and a sample that points to an image on disk
    dataset = fo.Dataset("my_dataset")

    sample = fo.Sample(filepath="path/to/img.png")
    sample.tags.append("train")
    sample["custom_field"] = 51  # custom fields can be added dynamically

    dataset.add_sample(sample)

    # Build a view: samples tagged "train", sorted by the custom field, first 10
    view = dataset.match_tag("train").sort_by("custom_field").limit(10)

    for sample in view:
        print(sample)
FiftyOne is designed to be lightweight and flexible, making it easy to load your datasets. FiftyOne supports loading datasets in a variety of common formats out-of-the-box, and it also provides the extensibility to load datasets in custom formats.
Check out loading datasets to see how to load your data into FiftyOne!
The FiftyOne App is a graphical user interface (GUI) that makes it easy to rapidly gain intuition into your datasets. You can visualize labels, bounding boxes, and segmentations overlaid on the samples; sort, query, and slice your dataset into any view you need; and more.
The FiftyOne Brain is a library of machine learning-powered capabilities that provide insights into your datasets and recommend modifications that will lead to measurably better model performance.
    import fiftyone.brain as fob

    # Score each sample's uniqueness, then sort the dataset by that score
    fob.compute_uniqueness(dataset)
    rank_view = dataset.sort_by("uniqueness")
The FiftyOne Brain is a separate Python package that is bundled with FiftyOne. Although it is closed-source, it is licensed as freeware, and you have permission to use it for commercial or non-commercial purposes. See the license for more details.
Where should you go from here? You could…