Welcome to our weekly FiftyOne tips and tricks blog where we recap interesting questions and answers that have recently popped up on Slack, GitHub, Stack Overflow, and Reddit.
Wait, what’s FiftyOne?
FiftyOne is an open source machine learning toolset that enables data science teams to improve the performance of their computer vision models by helping them curate high quality datasets, evaluate models, find mistakes, visualize embeddings, and get to production faster.
- If you like what you see on GitHub, give the project a star.
- Get started! We’ve made it easy to get up and running in a few minutes.
- Join the FiftyOne Slack community. We’re always happy to help.
Ok, let’s dive into this week’s tips and tricks!
Computing embeddings on a PyTorch model
Community Slack member Pedro asked,
“How can I compute embeddings using my PyTorch model? I have trained using ResNet-18 and want to also use it for the embeddings computation. Is this possible?”
FiftyOne provides powerful embeddings visualization capabilities via the FiftyOne Brain. With it you can generate low-dimensional representations of the samples and objects in your datasets.
Check out the Using Image Embeddings tutorial in the FiftyOne Docs for step-by-step instructions on how to:
- Load a dataset into FiftyOne
- Use compute_visualization() to generate 2D representations of images
- Provide custom embeddings to compute_visualization()
- Visualize embeddings via interactive plots
- Identify anomalous/incorrect image labels
- Find examples of scenarios of interest
- Pre-annotate unlabeled data for training
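For Pedro's specific case, one possible approach is to strip the classification head from the trained ResNet-18, run the images through it, and hand the resulting vectors to compute_visualization() as custom embeddings. The sketch below is illustrative only: the helper names, the `weights_path` and `num_classes` parameters, the ImageNet-style preprocessing, and the `resnet18_viz` brain key are all assumptions, and it assumes the checkpoint is a plain `state_dict` saved from a torchvision ResNet-18.

```python
from typing import Iterator, List, Sequence


def batched(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def compute_resnet18_embeddings(filepaths, weights_path, num_classes, batch_size=32):
    """Return an (N, 512) NumPy array of ResNet-18 penultimate-layer features.

    Assumes `weights_path` holds a state_dict for a torchvision ResNet-18
    trained with `num_classes` outputs (an assumption about your checkpoint).
    """
    # Heavy dependencies are imported lazily so `batched` stays lightweight
    import numpy as np
    import torch
    import torchvision
    from PIL import Image
    from torchvision import transforms

    model = torchvision.models.resnet18(num_classes=num_classes)
    model.load_state_dict(torch.load(weights_path, map_location="cpu"))
    model.fc = torch.nn.Identity()  # drop the classifier, keep the 512-d features
    model.eval()

    # Assumed preprocessing; replace with whatever was used during training
    preprocess = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])

    chunks = []
    with torch.no_grad():
        for paths in batched(filepaths, batch_size):
            images = [preprocess(Image.open(p).convert("RGB")) for p in paths]
            chunks.append(model(torch.stack(images)).numpy())
    return np.concatenate(chunks, axis=0)


def visualize_with_custom_embeddings(dataset, embeddings, brain_key="resnet18_viz"):
    """Run compute_visualization() on precomputed embeddings."""
    import fiftyone.brain as fob

    return fob.compute_visualization(dataset, embeddings=embeddings, brain_key=brain_key)


# Example usage (the checkpoint path and class count are placeholders):
# embeddings = compute_resnet18_embeddings(
#     dataset.values("filepath"), "/path/to/resnet18.pt", num_classes=10
# )
# results = visualize_with_custom_embeddings(dataset, embeddings)
```

The embeddings must be in the same order as the samples in the dataset, which is why the file paths are read with dataset.values("filepath") rather than from a separate listing.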
Exporting with absolute paths vs filenames
Community Slack member ZKW asked,
“Is there a method in FiftyOne to add absolute paths in annotation.xml rather than providing a file name?”
Yes! You can pass the optional abs_paths=True option to export absolute paths rather than filenames. For example:

    dataset.export(
        ...
        dataset_type=fo.types.CVATImageDataset,
        abs_paths=True,
    )

For more information on the options available when exporting FiftyOne datasets, check out the FiftyOne Docs.
Simplest way to get data into the FiftyOne App
Community Slack member J asked,
“What is the simplest way to get example datasets into the FiftyOne App so I can start experimenting with the tool?”
The easiest way to get a dataset into the FiftyOne App is to load the quickstart dataset, which consists of 200 images from the validation split of COCO-2017, with model predictions generated by an out-of-the-box Faster R-CNN model from torchvision.models.

    import fiftyone as fo
    import fiftyone.zoo as foz

    dataset = foz.load_zoo_dataset("quickstart")
    session = fo.launch_app(dataset)

Alternatively, you can load popular datasets you may already be familiar with from the FiftyOne Dataset Zoo, such as ActivityNet, COCO, ImageNet, Kinetics, and Open Images. Any of these datasets can be loaded (and downloaded if necessary) using the load_zoo_dataset() command.
Check out all the available datasets in the FiftyOne Docs.
Downloading annotated videos stored locally or remotely
Community Slack member Patrick asked,
“Is it possible to download an annotated version of a video from FiftyOne? I have a dataset I created with many videos and their associated object detection predictions. How can I download this annotated video locally?”
FiftyOne supports rendering annotated versions of datasets and views via the draw_labels() method in the SDK, and support for this directly in the App is coming very soon! You can use the export() method on your dataset to download your annotated videos. Exports with this method follow one of the basic patterns detailed below:
- Provide export_dir and dataset_type to export the content to a directory in the default layout for the specified format, as documented on this page
- Provide dataset_type along with data_path, labels_path, and/or export_media to directly specify where to export the source media and/or labels (if applicable) in your desired format; this syntax provides the flexibility to, for example, perform workflows like labels-only exports
- Provide a dataset_exporter to which to feed samples to perform a fully-customized export
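As a rough illustration of the second pattern, a labels-only export might look like the sketch below. The `export_labels_only` wrapper is a hypothetical helper, not part of FiftyOne; it simply forwards labels_path, dataset_type, and label_field to export(), and it assumes `samples` is a Dataset or DatasetView.

```python
def export_labels_only(samples, labels_path, dataset_type, label_field):
    """Export only the labels of `samples` to `labels_path`.

    Because neither export_dir nor data_path is given, this is intended
    as a labels-only export (an assumption worth checking for your format).
    """
    samples.export(
        labels_path=labels_path,
        dataset_type=dataset_type,
        label_field=label_field,
    )
    return labels_path


# Example usage (path and field name are placeholders):
# import fiftyone as fo
# export_labels_only(
#     dataset, "/path/for/labels.json", fo.types.COCODetectionDataset, "ground_truth"
# )
```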
If the dataset is local, pass in your local directory. If the dataset is remote, pass in the directory of your remote machine and then download the video. Here’s an example:

    import fiftyone as fo

    # The Dataset or DatasetView containing the samples you wish to export
    dataset_or_view = fo.Dataset(...)

    # The directory to which to write the exported dataset
    export_dir = "/path/for/export"

    # The name of the sample field containing the label that you wish to export
    # Used when exporting labeled datasets (e.g., classification or detection)
    label_field = "ground_truth"  # for example

    # The type of dataset to export
    # Any subclass of `fiftyone.types.Dataset` is supported
    dataset_type = fo.types.COCODetectionDataset  # for example

    # Export the dataset
    dataset_or_view.export(
        export_dir=export_dir,
        dataset_type=dataset_type,
        label_field=label_field,
    )

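If what you want is the videos with the predictions drawn on top, the draw_labels() method mentioned above is likely the more direct route. The following is a minimal sketch, assuming `samples` is a Dataset or DatasetView; the `render_annotated_media` wrapper and the output directory are hypothetical names used only for illustration.

```python
def render_annotated_media(samples, output_dir, label_fields=None):
    """Render annotated copies of the media in `samples` into `output_dir`.

    With label_fields=None, every label field on the samples is drawn.
    Returns whatever draw_labels() returns (the paths of the rendered media).
    """
    return samples.draw_labels(output_dir, label_fields=label_fields)


# Example usage (dataset name and directory are placeholders):
# import fiftyone as fo
# dataset = fo.load_dataset("my-videos")
# paths = render_annotated_media(dataset, "/path/for/annotated")
```

If the dataset lives on a remote machine, the rendered files are written there, so they still need to be copied to your local machine afterwards.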
Deleting a batch of samples using a DatasetView
Community Slack member Dan asked,
“How can I delete a group of samples that I have selected with a view?”
You can easily remove a batch of samples from a Dataset by constructing a DatasetView that contains the samples, and then deleting them from the dataset as follows:

    # Choose 10 samples at random
    unlucky_samples = dataset.take(10)

    dataset.delete_samples(unlucky_samples)

For more information on removing a batch of samples from a dataset, check out the FiftyOne Docs.
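The same pattern works for views built from a selection criterion rather than a random draw. As a small sketch, assuming the samples to remove have been given a sample tag (the tag name "duplicate" and the `delete_tagged_samples` helper are hypothetical), you could write:

```python
def delete_tagged_samples(dataset, tag):
    """Delete every sample in `dataset` that carries `tag`.

    Returns the number of samples that were in the matching view.
    """
    # match_tags() returns a DatasetView with only the samples that have the tag
    doomed = dataset.match_tags(tag)
    # Count before deleting, since the view is backed by the dataset
    num_deleted = len(doomed)
    dataset.delete_samples(doomed)
    return num_deleted


# Example usage:
# removed = delete_tagged_samples(dataset, "duplicate")
# print("Deleted %d samples" % removed)
```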
Join the FiftyOne community!
Join the thousands of engineers and data scientists already using FiftyOne to solve some of the most challenging problems in computer vision today!
- 1,600+ FiftyOne Slack members
- 3,000+ stars on GitHub
- 4,000+ Meetup members
- Used by 266+ repositories
- 58+ contributors