Welcome to our weekly FiftyOne tips and tricks blog where we give practical pointers for using FiftyOne on topics inspired by discussions in the open source community. This week we’ll cover labels.
Wait, What’s FiftyOne?
FiftyOne is an open source machine learning toolset that enables data science teams to improve the performance of their computer vision models by helping them curate high quality datasets, evaluate models, find mistakes, visualize embeddings, and get to production faster.
- If you like what you see on GitHub, give the project a star.
- Get started! We’ve made it easy to get up and running in a few minutes.
- Join the FiftyOne Slack community. We’re always happy to help.
Ok, let’s dive into this week’s tips and tricks!
A primer on labels
In FiftyOne, labels store semantic information about a sample, such as ground truth annotations or model predictions. FiftyOne provides a Label subclass for many common tasks, including detection, classification, segmentation, and keypoints. Using FiftyOne’s Label types lets you visualize your labels in the FiftyOne App, and a number of methods are designed specifically to make working with labels easier.
Continue reading for some tips and tricks to help you master labels in FiftyOne!
Only download desired labels
The FiftyOne Dataset Zoo contains dozens of the most common computer vision datasets, allowing you to easily load these datasets into FiftyOne with a single line of code. Some of these datasets include multiple types of labels. MS COCO, for instance, supports both detections and segmentations.
If you are working on a task that only uses some of the available label types, you can pass the label_types keyword argument with a list of the types you need when loading the dataset. Only those labels are downloaded and loaded, which makes both steps faster.
If we only need the segmentation masks for the COCO dataset, we can load the data with:
import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset(
    "coco-2014",
    label_types=["segmentations"],
)
To see which label types are available for a dataset, check out the section detailing that dataset in the FiftyOne Dataset Zoo documentation.
Learn more about the FiftyOne Dataset Zoo in the FiftyOne Docs.
Manage class names by mapping labels
Suppose you’ve downloaded a dataset that has ground truth detections for birds, cats, and dogs, but you want to test out a model that is trained to detect the much broader class animal.
Once you’ve added your model’s predictions to the dataset, you need to rename the bird, cat, and dog ground truth classes to animal before you can use FiftyOne’s evaluation API.
Rather than loop through all detections on all samples, you can use FiftyOne’s map_labels() method to create a new view in which these labels are renamed:
import fiftyone as fo
import fiftyone.zoo as foz

# load in dataset
dataset = foz.load_zoo_dataset("quickstart")

ANIMALS = ["bird", "cat", "dog"]

# Replace the labels of all animal ground truth detections with "animal"
mapping = {k: "animal" for k in ANIMALS}
animals_view = dataset.map_labels("ground_truth", mapping)
This approach is not only faster and more efficient than iterating over all samples, it also preserves the original class names on the dataset: the renamed labels exist only on this view. We get the best of both worlds.
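To make the renaming rule concrete, here is a minimal plain-Python sketch of what a label mapping does to a list of class names. It is an illustration only and not FiftyOne’s implementation; the helper apply_mapping and the sample class names are hypothetical. The point it shows is that classes missing from the mapping pass through unchanged and the input is left untouched.

```python
# Illustrative only: a plain-Python stand-in for the renaming rule,
# not FiftyOne's implementation. `apply_mapping` is a hypothetical helper.
ANIMALS = ["bird", "cat", "dog"]
mapping = {name: "animal" for name in ANIMALS}


def apply_mapping(labels, mapping):
    """Return a new list in which mapped class names are replaced.

    Class names that are not keys of `mapping` are left unchanged.
    """
    return [mapping.get(label, label) for label in labels]


original = ["cat", "person", "dog", "bird", "car"]
renamed = apply_mapping(original, mapping)

print(renamed)   # ['animal', 'person', 'animal', 'animal', 'car']
print(original)  # the input list is not modified
```

Because the helper builds a new list instead of editing the input, it mirrors the idea of a view that leaves the underlying dataset as it was.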
Learn more about map_labels() in the FiftyOne Docs.
Add custom attributes to labels
Once you have imported or loaded a labeled dataset into FiftyOne, you can add whatever custom attributes you like!
As an example, the Detection label class stores a bounding box in the bounding_box field, specified in the [top-left-x, top-left-y, width, height] format, with values expressed relative to the image dimensions. If we want to add a custom bbox_area attribute to each detection representing the area of its bounding box, we can do so as follows:
import fiftyone as fo
import fiftyone.zoo as foz

# load in dataset
dataset = foz.load_zoo_dataset("quickstart")

# get sample to which we will add attribute
sample = dataset.first()

# add attribute to each `Detection` label in the predictions field
detections = sample.predictions.detections
for detection in detections:
    # bounding_box is [top-left-x, top-left-y, width, height]
    bounding_box = detection["bounding_box"]
    detection["bbox_area"] = bounding_box[2] * bounding_box[3]

sample.predictions.detections = detections
sample.save()
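Since FiftyOne bounding box coordinates are relative values in [0, 1], the product of width and height is the fraction of the image that the box covers, not a pixel count. The sketch below shows one way to convert that fraction to pixels in plain Python. The helper names and the 640 x 480 image size are assumptions made for illustration; in practice the image dimensions would come from the sample’s metadata.

```python
def relative_bbox_area(bounding_box):
    """Area of a [top-left-x, top-left-y, width, height] box as a
    fraction of the image area (coordinates are relative, in [0, 1])."""
    _, _, width, height = bounding_box
    return width * height


def pixel_bbox_area(bounding_box, image_width, image_height):
    """Area of the same box in pixels, given the image size in pixels."""
    return relative_bbox_area(bounding_box) * image_width * image_height


# Hypothetical box covering the central quarter of a 640 x 480 image
box = [0.25, 0.25, 0.5, 0.5]

rel_area = relative_bbox_area(box)
px_area = pixel_bbox_area(box, image_width=640, image_height=480)

print(rel_area)  # 0.25
print(px_area)   # 76800.0
```

Either value could be stored as a custom attribute; the relative area is convenient for comparing boxes across images of different sizes, while the pixel area is useful when absolute object size matters.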
Learn more about custom attributes in the FiftyOne Docs.
Customize rendering of labels
When you load a labeled dataset into the FiftyOne App, you’ll notice that by default, the labels appear in lowercase text, bounding boxes for detections are all the same color, and confidence scores are displayed. But did you know that these details can be changed?
You can control how labels are rendered on images using the annotation utilities in fiftyone.utils.annotations.
To change the line width of bounding boxes and give each bounding box its own color, you can create a DrawConfig and pass it to the draw_labeled_image() function:
import fiftyone as fo
import fiftyone.zoo as foz
import fiftyone.utils.annotations as foua

# load in dataset
dataset = foz.load_zoo_dataset("quickstart")

# Pick a sample
sample = dataset.first()

config = foua.DrawConfig(
    {
        "bbox_linewidth": 5,
        "per_object_label_colors": True,
    }
)

# The path to write the annotated image
outpath = "/path/for/image-annotated.jpg"

# Render the annotated image
foua.draw_labeled_image(sample, outpath, config=config)
Learn more about FiftyOne’s Annotation API in the FiftyOne Docs.
Update dataset by merging labels
As a tool made to curate and improve dataset quality, FiftyOne integrates seamlessly with labeling services like CVAT, Labelbox, and Label Studio. A crucial part of many common computer vision workflows is identifying and tagging errors in labeling, sending these samples out for re-annotation, and updating the dataset with the improved data.
FiftyOne makes it easy to iterate on your datasets by merging updated labels into an existing label field using the merge_labels() method.
Suppose that we have sent out a batch of detections to Labelbox for edits to the “ground truth”, using a label_schema:
view = dataset.match_tags("reannotate")

label_schema = {
    "ground_truth_edits": {
        "type": "detections",
        "classes": dataset.distinct("ground_truth.detections.label"),
    }
}

anno_key = "fix_labels"

results = view.annotate(
    anno_key,
    label_schema=label_schema,
    backend="labelbox",
)
Here, view is a DatasetView consisting of the images we have tagged for re-annotation. Once the edited annotations have been loaded back into FiftyOne, our updated “ground truth” labels for the samples in this view are stored in the temporary ground_truth_edits label field.
We can merge these revisions into the ground_truth label field with a single line of code:
view.merge_labels("ground_truth_edits", "ground_truth")
The result is an improved dataset!
Learn more about merge_labels() and annotating datasets with Labelbox in the FiftyOne Docs.
Join the FiftyOne community!
Join the thousands of engineers and data scientists already using FiftyOne to solve some of the most challenging problems in computer vision today!
- 1,250+ FiftyOne Slack members
- 2,300+ stars on GitHub
- 2,400+ Meetup members
- Used by 215+ repositories
- 52+ contributors