Welcome to our weekly FiftyOne tips and tricks blog where we recap interesting questions and answers that have recently popped up on Slack, GitHub, Stack Overflow, and Reddit.
Wait, what’s FiftyOne?
FiftyOne is an open source machine learning toolset that enables data science teams to improve the performance of their computer vision models by helping them curate high quality datasets, evaluate models, find mistakes, visualize embeddings, and get to production faster.
- If you like what you see on GitHub, give the project a star.
- Get started! We’ve made it easy to get up and running in a few minutes.
- Join the FiftyOne Slack community, we’re always happy to help.
Ok, let’s dive into this week’s tips and tricks!
Sorting samples by filepath
Community Slack member Oğuz Hanoğlu asked,
“Is it possible to have the samples in my dataset appear sorted by their filepath?”
The samples in a FiftyOne `Dataset` are always enumerated in the same order in which they were added to the `Dataset`. However, if you want to create a new dataset where the samples appear in the desired order, you can use the `sort_by()` method, passing in the `filepath` field. This creates a reindexed view into the original dataset. You can then use the `clone()` method to create a new dataset from this view. As the samples appear in the desired order in the view, they will be inserted into the new dataset in this order as well.
Putting it all together, your workflow may look like this:
```python
import fiftyone as fo

dataset = fo.Dataset(...)
view = dataset.sort_by("filepath")
dataset1 = view.clone()
```
Using the same idea, you could similarly sort by other properties of the samples in the dataset, such as `id`. With the `ViewField` to handle expressions, you can even sort by the number of ground truth object detections in the sample:
```python
from fiftyone import ViewField as F

view = dataset.sort_by(
    F("ground_truth.detections").length(),
    reverse=True,
)
dataset2 = view.clone()
```
Learn more about sort_by() and DatasetViews in the FiftyOne Docs.
Viewing images with many small objects
Community Slack member Dan Erez asked,
“Is there a way to not show the labels for the detections? I have an image with a lot of little detection bounding boxes, and it’s hard to see.”
There are a few ways you might be able to work around this. First, if you have a lot of overlapping bounding boxes, you can use FiftyOne’s IoU utils to identify and remove duplicate objects. In a similar vein, if a lot of your predictions are low confidence, you can reduce clutter by filtering for objects with confidence values above some threshold:
```python
from fiftyone import ViewField as F

conf_thresh = 0.9  # example threshold
view = dataset.filter_labels(
    "predictions", F("confidence") > conf_thresh
)
```
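For reference, the IoU (intersection over union) criterion that duplicate detection relies on can be sketched in plain Python. The function below is an illustrative standalone implementation, not FiftyOne’s own code; boxes are assumed to be in FiftyOne’s relative `[top-left-x, top-left-y, width, height]` format:

```python
def iou(box_a, box_b):
    """Compute intersection over union of two bounding boxes.

    Boxes are [top-left-x, top-left-y, width, height], as in FiftyOne.
    """
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b

    # dimensions of the intersection rectangle (zero if disjoint)
    ix = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy

    # union = sum of areas minus the doubly counted intersection
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0
```

Any pair of detections whose IoU exceeds a chosen threshold can be treated as duplicates and one of them discarded; FiftyOne’s `fiftyone.utils.iou` module implements this kind of logic at dataset scale.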
You can also reduce clutter by setting attributes in your FiftyOne App config. For instance, you can set `show_confidence=False` and `show_label=False` to get rid of all the text above the bounding boxes. Additionally, you can set `color_by="label"` to have the color of each bounding box set by the label class of the object. This may help distinguish classes for smaller objects.
To make these changes in Python, you can run the following:
```python
import fiftyone as fo

fo.app_config.show_label = False
fo.app_config.show_confidence = False
fo.app_config.color_by = "label"
```
Learn more about configuring FiftyOne in the FiftyOne Docs.
Assigning the same CVAT job to multiple users
Community Slack member Bahram Marami asked,
“If I want the same dataset to be annotated by two users with FiftyOne’s CVAT integration, how can I create jobs for two users and retrieve/load annotations individually for them?”
Making a single call to FiftyOne’s annotation API with multiple users listed will result in each job only being assigned to one of the listed users. For instance, the snippet below will assign the CVAT job to either `user1` or `user2`.
```python
dataset.annotate(
    ...,
    job_assignees=["user1", "user2"],
    backend="cvat",
)
```
If you would like for each annotation task to be performed by both users, then one way to achieve this is by creating separate fields in the dataset for each user, and publishing the jobs twice, as in the code below.
```python
dataset.clone_sample_field(
    "ground_truth", "gt_user1"
)
dataset.annotate(
    ...,
    label_field="gt_user1",
    job_assignees=["user1"],
)

dataset.clone_sample_field(
    "ground_truth", "gt_user2"
)
dataset.annotate(
    ...,
    label_field="gt_user2",
    job_assignees=["user2"],
)
```
Here we have cloned the ground truth field twice, once for each user, and made separate API calls to assign the jobs to each.
Learn more about FiftyOne’s annotation API in the FiftyOne Docs.
Converting instance segmentation mask to full image mask
Community Slack member Onuralp Sezer asked,
“How do I use the instance segmentation masks in FiftyOne to generate analogous masks that span the entire image?”
In FiftyOne, instance segmentations are represented with a bounding box and a two-dimensional array. The bounding box, which is in the format `[top-left-x, top-left-y, width, height]` with coordinates relative to the image dimensions, specifies what portion of the image the grid lies on, and the two-dimensional array specifies which pixels within that portion of the image are part of the object. Converting from this into full image scale instance segmentation masks requires using image metadata (width and height) and bounding box coordinates to place the object’s mask grid onto the appropriate pixels in the image.
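As a rough sketch of what this placement involves, the NumPy snippet below paints a per-object mask onto a blank full-image canvas. It assumes (for simplicity) that the mask array’s shape already matches the bounding box’s pixel dimensions; the function name and signature are illustrative, not part of FiftyOne:

```python
import numpy as np

def instance_to_full_mask(mask, bbox, img_h, img_w):
    """Place a per-object mask onto a full-image canvas.

    mask: 2D boolean array whose shape matches the bbox's pixel region
    bbox: [top-left-x, top-left-y, width, height] in relative [0, 1]
          coordinates, as FiftyOne stores bounding boxes
    """
    x, y, _, _ = bbox

    # convert the relative top-left corner to pixel coordinates
    x0 = int(round(x * img_w))
    y0 = int(round(y * img_h))

    # paint the object's mask grid onto a blank full-image canvas
    full = np.zeros((img_h, img_w), dtype=bool)
    mh, mw = mask.shape
    full[y0:y0 + mh, x0:x0 + mw] = mask
    return full
```

In practice the stored mask may need to be resized to the bounding box’s pixel dimensions first, and FiftyOne’s utilities handle such details for you, as described next.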
However, you can use FiftyOne’s utilities to handle these details for you! If all you require is semantic segmentation masks, then you can use the FiftyOne utils method `objects_to_segmentations()` out of the box. For instance, to generate a semantic segmentation mask for COCO 2014 samples, accounting for dogs and cats, the following code suffices:
```python
import fiftyone as fo
import fiftyone.zoo as foz
import fiftyone.utils.labels as foul

# load dataset
dataset = foz.load_zoo_dataset(
    "coco-2014",
    split="validation",
    label_types=["segmentations"],
    classes=["cat", "dog"],
    label_field="instances",
    max_samples=25,
    only_matching=True,
)

# specify pixel values for classes
mask_targets = {100: "cat", 200: "dog"}

# generate semantic segmentation masks
foul.objects_to_segmentations(
    dataset,
    "instances",
    "segmentations",
    mask_targets=mask_targets,
)
```
To generate full image instance segmentation masks with this method, you have to get a bit craftier. You can iterate through the objects in a given image, filtering for each one individually, creating a view for this single-image, single-object pair, and using the `objects_to_segmentations()` method.

Learn more about `objects_to_segmentations()` and label type coercion in the FiftyOne Docs.
Viewing in Jupyter notebooks
Community Slack member Adrian Tofting asked,
“Does anyone know how I can make a Jupyter notebook show the FiftyOne output cell view in full height? Mine is cropped on the bottom.”
Absolutely! You can specify this directly in the `launch_app()` command when you launch the FiftyOne App in a Jupyter notebook with the `height` argument. To set the height to 600 pixels, for example, the following would work:
```python
import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")
session = fo.launch_app(dataset, height=600)
```
Learn more about running FiftyOne in a notebook in the FiftyOne Docs.
Join the FiftyOne community!
Join the thousands of engineers and data scientists already using FiftyOne to solve some of the most challenging problems in computer vision today!
- 1,275+ FiftyOne Slack members
- 2,450+ stars on GitHub
- 2,750+ Meetup members
- Used by 231+ repositories
- 55+ contributors