Creating a Dataset

FiftyOne supports automatic creation of datasets stored in various common formats. If your dataset is stored in a custom format, don’t worry: FiftyOne also makes it easy to load data in custom formats.


When you create a FiftyOne Dataset, its samples and all of their fields (metadata, labels, custom fields, etc.) are written to FiftyOne’s backing database.

Note, however, that samples only store the filepath to the media, not the raw media itself. FiftyOne does not create duplicate copies of your data!

Dataset formats

There are three basic ways to get data into FiftyOne:

FiftyOne natively supports creating datasets in a variety of common formats, including COCO, VOC, CVAT, BDD, TFRecords, and more.

FiftyOne provides a Dataset Zoo that contains a variety of popular open source datasets like CIFAR-10, COCO, and ImageNet that can be downloaded and loaded into FiftyOne with a single line of code.

If your data is stored in a custom format, you can get it into FiftyOne by adding the samples and their fields directly to a FiftyOne Dataset. You can even provide your own sample parser to automate this process.
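
The third approach, adding samples directly, can be sketched as follows. This is a minimal illustration and not the only way to do it. It assumes a hypothetical custom format consisting of `filepath,label` CSV rows. The helper names `parse_records` and `build_dataset` and the field name `ground_truth` are placeholders chosen for this example, not part of FiftyOne.

```python
import csv
import io


def parse_records(csv_text):
    """Parse rows of a hypothetical 'filepath,label' CSV format.

    Returns a list of (filepath, label) tuples; blank or short rows are skipped.
    """
    reader = csv.reader(io.StringIO(csv_text))
    return [(row[0].strip(), row[1].strip()) for row in reader if len(row) >= 2]


def build_dataset(records, name=None):
    """Create a FiftyOne dataset from (filepath, label) records."""
    # Imported lazily so that the parsing helper works without FiftyOne installed
    import fiftyone as fo

    samples = []
    for filepath, label in records:
        # Only the path to the media is stored; the image itself is not copied
        sample = fo.Sample(filepath=filepath)

        # "ground_truth" is an arbitrary field name chosen for this sketch
        sample["ground_truth"] = fo.Classification(label=label)
        samples.append(sample)

    dataset = fo.Dataset(name)
    dataset.add_samples(samples)
    return dataset
```

Calling `build_dataset(parse_records(text))` on the contents of such a file would return a dataset with one sample per row, which can then be opened in the App like any other dataset.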


Ingest a directory of images into FiftyOne and explore them in the FiftyOne App:

import fiftyone as fo

dataset_dir = "/path/to/images-dir"

# Visualize a directory of images in the FiftyOne App
dataset = fo.Dataset.from_dir(dataset_dir, fo.types.ImageDirectory)
session = fo.launch_app(dataset=dataset)

Or do the same from the command line:

fiftyone app view \
    --dataset-dir /path/to/images-dir --type fiftyone.types.ImageDirectory