FiftyOne Dataset Basics

A FiftyOne Dataset is the format that FiftyOne understands and that can be visualized in the FiftyOne App.

What is a FiftyOne Dataset?

The FiftyOne Dataset class lets you load, modify, and visualize your data along with any related labels (classification, detection, segmentation, etc.). It brings images, annotations, and model predictions into a single format that can be visualized in the FiftyOne App.

If you have your own collection of data, loading it as a Dataset will allow you to easily search and sort your samples. You can use FiftyOne to identify unique samples as well as possible mistakes in Sample labels.

If you are training a model, the output predictions and logits can be loaded into your Dataset. The FiftyOne App makes it easy to visually debug what your model has learned, even for complex label types like detections and segmentation masks. With this knowledge, you can update your training set to include more representative samples, as well as samples that your model found difficult.

Note

Check out our dataset loading guide to load your dataset into FiftyOne.

Dataset Details

A Dataset is composed of multiple Sample objects, which contain Field attributes, all of which can be dynamically created, modified, and deleted. FiftyOne uses a lightweight non-relational database to store a Dataset, so it is light on your computer’s memory and scales well.

A Dataset should be thought of as an unordered collection. Samples can be added to it and accessed by key. Slicing and sorting, however, are done through a DatasetView, which provides an ordered look into the Dataset, or a subset of it, along user-specified axes.
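The relationship between an unordered, keyed collection and an ordered view over it can be sketched in plain Python. This is a toy model with hypothetical names (`samples`, `sorted_view`), not FiftyOne's implementation or API. It only illustrates that a view holds an ordering of keys while the underlying collection stays unchanged.

```python
# Toy model only: plain dicts stand in for a Dataset and its Samples.
# None of these names are part of the FiftyOne API.

# The "dataset": samples stored by key, with no meaningful order.
samples = {
    "id_b": {"filepath": "/data/b.png", "confidence": 0.4},
    "id_a": {"filepath": "/data/a.png", "confidence": 0.9},
    "id_c": {"filepath": "/data/c.png", "confidence": 0.7},
}


def sorted_view(dataset, field, reverse=False, limit=None):
    """Return an ordered list of keys; the dataset itself is not modified."""
    keys = sorted(dataset, key=lambda k: dataset[k][field], reverse=reverse)
    return keys[:limit]


# Access by key works directly on the collection.
print(samples["id_a"]["filepath"])

# Sorting and slicing happen in the view, which is just an ordering of keys.
view = sorted_view(samples, "confidence", reverse=True, limit=2)
print(view)
```

Because the view stores only keys, building several differently sorted or sliced views costs little and never changes what the collection contains.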

Samples

A Sample is an element of a Dataset that stores all the information related to a given image. Every Sample must include a file path to an image:

import fiftyone as fo

sample = fo.Sample(filepath="/path/to/image.png")

Fields

A Field is a special attribute of a Sample that is shared across all samples in a Dataset. If a Dataset were a table where each row is a Sample, then each column would be a Field.

Fields can be dynamically created, modified, and deleted. When a new Field is assigned to a Sample in a Dataset, it is automatically added to the dataset’s schema and is therefore accessible on every other Sample in the Dataset. A Field that has not been set on a Sample has the default value None.
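The schema behavior described above can be illustrated with a small toy model. The `ToyDataset` class and its methods are hypothetical names invented for this sketch. It does not reflect how FiftyOne stores data, only the rule that a field set on one sample becomes visible, with a None default, on all the others.

```python
# Toy model only: illustrates a shared schema, not FiftyOne's implementation.
# ToyDataset and its methods are hypothetical names.


class ToyDataset:
    def __init__(self):
        self.schema = {"filepath"}  # field names shared by all samples
        self.rows = []              # each row holds explicitly set values

    def add_sample(self, filepath):
        self.rows.append({"filepath": filepath})
        return len(self.rows) - 1   # the row index acts as the sample key

    def set_field(self, key, name, value):
        self.schema.add(name)       # a new field becomes part of the schema
        self.rows[key][name] = value

    def get_field(self, key, name):
        if name not in self.schema:
            raise AttributeError(f"no field named {name!r}")
        # Fields in the schema but unset on this sample default to None.
        return self.rows[key].get(name)


dataset = ToyDataset()
first = dataset.add_sample("/path/to/image1.png")
second = dataset.add_sample("/path/to/image2.png")

# Setting "quality" on one sample adds it to the schema for every sample.
dataset.set_field(first, "quality", 0.8)

print(dataset.get_field(first, "quality"))   # 0.8
print(dataset.get_field(second, "quality"))  # None
```

In table terms, `set_field` adds a column, and rows that never received a value for that column read back as None.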

Tags

Sample.tags is a default Field of every Sample. Tags are simply a list of strings and can be used to mark a Sample as part of a train/test split or to group samples in any other way you like:

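import fiftyone as fo

# Create a sample that belongs to the training split
sample = fo.Sample(filepath="path/to/image.png", tags=["train"])

# Tags are a plain list of strings, so they can be extended in place
sample.tags += ["my_favorite_samples"]
print(sample.tags)
# ['train', 'my_favorite_samples']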
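Because tags are plain strings, selecting a split amounts to a membership check. The sketch below uses plain dictionaries in place of Sample objects, an assumption made so that it runs without FiftyOne installed. `with_tag` is a hypothetical helper, not a FiftyOne function.

```python
# Plain dicts stand in for Samples here; with_tag is a hypothetical helper.
samples = [
    {"filepath": "/data/1.png", "tags": ["train"]},
    {"filepath": "/data/2.png", "tags": ["test"]},
    {"filepath": "/data/3.png", "tags": ["train", "my_favorite_samples"]},
]


def with_tag(samples, tag):
    """Return the samples whose tag list contains the given tag."""
    return [s for s in samples if tag in s["tags"]]


train = with_tag(samples, "train")
favorites = with_tag(samples, "my_favorite_samples")

print([s["filepath"] for s in train])
print([s["filepath"] for s in favorites])
```

A sample can carry several tags at once, so the same image may appear in both the training split and a hand-picked group.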

DatasetViews

A DatasetView is a powerful and fast tool for looking at subsets of your Dataset without modifying the Dataset itself.