# fiftyone.utils.data.base

Data utilities.

Functions:

• download_image_classification_dataset(csv_path, dataset_dir[, …]) – Downloads the classification dataset specified by the given CSV file.

• download_images(image_urls, output_dir[, …]) – Downloads the images from the given URLs.

• parse_image_classification_dir_tree(dataset_dir) – Parses the contents of the given image classification dataset directory tree.

• parse_images_dir(dataset_dir[, recursive]) – Parses the contents of the given directory of images.

• parse_videos_dir(dataset_dir[, recursive]) – Parses the contents of the given directory of videos.

fiftyone.utils.data.base.parse_images_dir(dataset_dir, recursive=True)

Parses the contents of the given directory of images.

Parameters
• dataset_dir – the dataset directory

• recursive (True) – whether to recursively traverse subdirectories

Returns

a list of image paths
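
To illustrate what the recursive flag controls, here is a minimal stdlib-only sketch. It is not FiftyOne's implementation: the helper `list_images`, the extension set `IMAGE_EXTS`, and the sorted return order are all assumptions made for illustration.

```python
import os
import tempfile

# Assumed set of image extensions; the library's own filter may differ.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp", ".gif", ".tif", ".tiff"}


def list_images(dataset_dir, recursive=True):
    """Hypothetical stand-in: return sorted image paths under dataset_dir."""
    paths = []
    for root, _dirs, files in os.walk(dataset_dir):
        for name in files:
            if os.path.splitext(name)[1].lower() in IMAGE_EXTS:
                paths.append(os.path.join(root, name))
        if not recursive:
            break  # os.walk yields the top-level directory first; stop there
    return sorted(paths)


# Small demonstration on a throwaway directory tree
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "sub"))
    for rel in ("a.jpg", "notes.txt", os.path.join("sub", "b.png")):
        open(os.path.join(tmp, rel), "w").close()

    all_images = [os.path.relpath(p, tmp) for p in list_images(tmp)]
    top_level = [os.path.relpath(p, tmp) for p in list_images(tmp, recursive=False)]
```

With `recursive=False` only files directly inside the directory are considered, so the image in `sub/` is omitted from `top_level`.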

fiftyone.utils.data.base.parse_videos_dir(dataset_dir, recursive=True)

Parses the contents of the given directory of videos.

Parameters
• dataset_dir – the dataset directory

• recursive (True) – whether to recursively traverse subdirectories

Returns

a list of video paths

fiftyone.utils.data.base.parse_image_classification_dir_tree(dataset_dir)

Parses the contents of the given image classification dataset directory tree, which should have the following format:

    <dataset_dir>/
        <classA>/
            <image1>.<ext>
            <image2>.<ext>
            ...
        <classB>/
            <image1>.<ext>
            <image2>.<ext>
            ...

Parameters

dataset_dir – the dataset directory

Returns

samples: a list of (image_path, target) pairs

classes: a list of class label strings
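
The class-per-folder layout described above can be walked with a few lines of standard-library code. The sketch below is illustrative only and is not the library's implementation: `parse_classification_tree` is a hypothetical name, and it assumes that each target is the class folder name and that results are sorted alphabetically.

```python
import os
import tempfile


def parse_classification_tree(dataset_dir):
    """Hypothetical stand-in: return (samples, classes) for a class-per-folder tree."""
    # Each immediate subdirectory name is treated as a class label
    classes = sorted(
        d for d in os.listdir(dataset_dir)
        if os.path.isdir(os.path.join(dataset_dir, d))
    )
    samples = []
    for label in classes:
        class_dir = os.path.join(dataset_dir, label)
        for name in sorted(os.listdir(class_dir)):
            path = os.path.join(class_dir, name)
            if os.path.isfile(path):
                # Assumption: the target is the class folder name
                samples.append((path, label))
    return samples, classes


# Small demonstration on a throwaway directory tree
with tempfile.TemporaryDirectory() as tmp:
    for rel in ("cat/1.jpg", "cat/2.jpg", "dog/1.jpg"):
        path = os.path.join(tmp, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()

    samples, classes = parse_classification_tree(tmp)
    labels = [label for _, label in samples]
```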

fiftyone.utils.data.base.download_image_classification_dataset(csv_path, dataset_dir, classes=None, num_workers=None)

Downloads the classification dataset specified by the given CSV file, which should have the following format:

<label1>,<image_url1>
<label2>,<image_url2>
...


The image filenames are the basenames of the URLs, which are assumed to be unique.

The dataset is written to disk in fiftyone.types.dataset_types.FiftyOneImageClassificationDataset format.

Parameters
• csv_path – a CSV file containing the labels and image URLs

• dataset_dir – the directory to write the dataset

• classes (None) – an optional list of classes. By default, this will be inferred from the contents of csv_path

• num_workers (None) – the number of processes to use to download images. By default, multiprocessing.cpu_count() is used
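
As a rough illustration of the CSV format and of how a class list could be inferred when `classes` is not supplied, consider the following sketch. It performs no downloads and is not the library's code: `read_label_url_rows` is a hypothetical helper, the `example.com` URLs are placeholders, and sorting the inferred classes is an assumption.

```python
import csv
import io


def read_label_url_rows(csv_file):
    """Hypothetical helper: parse `<label>,<image_url>` rows into (label, url) pairs."""
    rows = []
    for row in csv.reader(csv_file):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        rows.append((row[0].strip(), row[1].strip()))
    return rows


# In-memory CSV with placeholder URLs (illustrative only)
demo_csv = io.StringIO(
    "cat,https://example.com/images/001.jpg\n"
    "dog,https://example.com/images/002.jpg\n"
    "cat,https://example.com/images/003.jpg\n"
)
rows = read_label_url_rows(demo_csv)

# When no class list is supplied, one can be inferred from the labels present
classes = sorted({label for label, _ in rows})
```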

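fiftyone.utils.data.base.download_images(image_urls, output_dir, num_workers=None)

Downloads the images from the given URLs.

The filenames in output_dir are the basenames of the URLs, which are assumed to be unique.

Parameters
• image_urls – a list of image URLs

• output_dir – the directory to write the images

• num_workers (None) – the number of processes to use. By default, multiprocessing.cpu_count() is used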
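
Because output filenames are taken from URL basenames and assumed to be unique, it can be worth checking a URL list for collisions before downloading. The following is a minimal sketch of such a check; `find_basename_collisions` is a hypothetical helper that is not part of FiftyOne, and the URLs are placeholders.

```python
import posixpath
from collections import Counter
from urllib.parse import urlparse


def find_basename_collisions(image_urls):
    """Hypothetical helper: return basenames shared by more than one URL."""
    # Take the last path component of each URL, ignoring query strings
    names = [posixpath.basename(urlparse(url).path) for url in image_urls]
    counts = Counter(names)
    return sorted(name for name, n in counts.items() if n > 1)


urls = [
    "https://example.com/a/cat.jpg",
    "https://example.com/b/cat.jpg",  # same basename as the first URL
    "https://example.com/a/dog.jpg",
]
collisions = find_basename_collisions(urls)
```

A non-empty result signals that two or more URLs would map to the same filename in the output directory.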