Welcome to our weekly FiftyOne tips and tricks blog where we recap interesting questions and answers that have recently popped up on Slack, GitHub, Stack Overflow, and Reddit.
Wait, what’s FiftyOne?
FiftyOne is an open source machine learning toolset that enables data science teams to improve the performance of their computer vision models by helping them curate high quality datasets, evaluate models, find mistakes, visualize embeddings, and get to production faster.
- If you like what you see on GitHub, give the project a star.
- Get started! We’ve made it easy to get up and running in a few minutes.
- Join the FiftyOne Slack community; we’re always happy to help.
Ok, let’s dive into this week’s tips and tricks!
Converting FiftyOne label schema to CVAT
Community Slack member Vinicius Madureira asked,
“I’m trying to create a project with FiftyOne’s CVAT utils and initialize it with certain classes. I tried to use a FiftyOne label_schema, but it is not working. What would be a correct label_schema?”
To create a CVAT project using FiftyOne’s CVAT utils, you need to pass in a label schema in CVAT’s format. You can generate this from a FiftyOne label_schema using the _get_cvat_schema() method of the CVATAnnotationAPI in the CVAT utils:
import fiftyone as fo
import fiftyone.utils.cvat as fouc

# Connect to your CVAT server (fill in your credentials)
cvat = fouc.CVATAnnotationAPI(...)

project_name = "my_project_name"
project_id = "my_project_id"

# Example FiftyOne label schema
label_schema = {
    "detections": {
        "type": "detections",
        "classes": [
            "class1",
            "class2",
            "class3",
        ],
    }
}

# Example attribute names; these embed concepts like occlusion and
# group IDs from CVAT into FiftyOne
occluded_attr = "occluded"
group_id_attr = "group_id"

# Convert the FiftyOne label schema to CVAT's format
cvat_schema, assign_scalar_attrs, occluded_attrs, group_id_attrs = cvat._get_cvat_schema(
    label_schema,
    project_id=project_id,
    occluded_attr=occluded_attr,
    group_id_attr=group_id_attr,
)
Once the label schema has been converted, you can pass it into the create_project() function to create the project in CVAT:
cvat.create_project(project_name, schema=cvat_schema)
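Alternatively, if your end goal is simply to send samples for annotation in a new CVAT project, the higher-level annotate() method accepts a FiftyOne label schema directly and handles the conversion for you. Here is a minimal sketch, assuming you have a dataset loaded and the CVAT backend configured with your credentials (the annotation key and project name below are just example values):

# Hypothetical example: upload samples for annotation via the CVAT backend,
# passing the FiftyOne label schema directly
anno_key = "cvat_example"  # example annotation run key

dataset.annotate(
    anno_key,
    label_schema=label_schema,
    project_name="my_project_name",  # CVAT project to create
    launch_editor=True,
)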
Learn more about FiftyOne’s integration with CVAT in the FiftyOne Docs.
Understanding the support metric in evaluations
Community Slack member Isaac Padberg asked,
“I’m hoping to get some insight into the support metric shown when printing out an evaluation report. What does this mean?”
The support in FiftyOne’s evaluation API refers to the number of ground truth instances of each class in a classification or multi-class object detection task. As such, the support does not depend on the model predictions you are evaluating.
Nevertheless, it can be useful in assessing the model’s performance. When the support for a particular class is small, evaluation metrics for that class can have high variance.
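For instance, here is a minimal sketch (using the quickstart dataset, which ships with predictions) that prints an evaluation report in which the rightmost column is the support for each class:

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

# Evaluate the objects in the `predictions` field against the `ground_truth` field
results = dataset.evaluate_detections(
    "predictions",
    gt_field="ground_truth",
    eval_key="eval",
)

# Prints per-class precision, recall, f1-score, and support
results.print_report()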
Learn more about FiftyOne’s evaluation API in the FiftyOne Docs.
Lazy-loading large media files in the FiftyOne App
Community Slack member Mohamed Serrari asked,
“Is there a way to lazy-load high-resolution media files like satellite images into the FiftyOne App?”
The way the FiftyOne App is set up, a media file associated with each sample is displayed in the grid view. By default, the media file displayed in the grid view is the same file used to define the Sample (located at sample.filepath). This works well for small-to-medium sized images.
If you’re dealing with ultra-HD images or satellite imagery, however, this may lead to slow loading in the FiftyOne App. The best practice in these scenarios is to generate a lower resolution thumbnail image for each sample, and configure the app to display the thumbnails in the grid view:
import fiftyone as fo
import fiftyone.utils.image as foui
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

# Generate some thumbnail images
foui.transform_images(
    dataset,
    size=(-1, 32),
    output_field="thumbnail_path",
    output_dir="/tmp/thumbnails",
)

# Display the thumbnails in the App's grid view
dataset.app_config.media_fields = ["filepath", "thumbnail_path"]
dataset.app_config.grid_media_field = "thumbnail_path"
dataset.save()
The thumbnails can be any size, so long as the aspect ratio matches the original media file. With these settings, the media file that appears in the expanded modal will still be the original, full-resolution version.
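After saving the app config, just launch (or refresh) the App and the grid will render the lightweight thumbnails:

# The grid now loads the thumbnails, while clicking into a sample's expanded
# modal still shows the full-resolution image from `filepath`
session = fo.launch_app(dataset)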
Learn more about multiple media fields and configuring the FiftyOne App in the FiftyOne Docs.
Filtering bounding boxes by width and height
Community Slack member George Pearse asked,
“Is there a way to filter bounding boxes based on their absolute width and height?”
Yes! In FiftyOne, object detections are stored on samples in Detection objects, whose bounding_box attributes store coordinates in [top-left-x, top-left-y, width, height] format.
These coordinates are relative to the image dimensions, in the range [0, 1]. To convert them into absolute widths and heights, you can use the image’s width and height in pixels, which are stored in the metadata field on the sample.
import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

# One-time computation of the metadata for all samples in the dataset
dataset.compute_metadata()

# Get the absolute width and height for the first sample
sample = dataset.first()
img_width = sample.metadata.width
img_height = sample.metadata.height
Relative to the detections on an image, this metadata is stored at the parent (sample) level. Fortunately, FiftyOne lets you use this parent-level information in filters by prepending the field name with the $ character when constructing a ViewField expression.
For example, to get ground truth bounding boxes with width < 400 and height < 600 pixels, we can create the following filter:
from fiftyone import ViewField as F

rel_width = F("bounding_box")[2]
img_width = F("$metadata.width")
abs_width = rel_width * img_width

rel_height = F("bounding_box")[3]
img_height = F("$metadata.height")
abs_height = rel_height * img_height

size_filter = (abs_width < 400) & (abs_height < 600)
Finally, we can apply this filter to our dataset:
small_boxes_view = dataset.filter_labels("ground_truth", size_filter)
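To sanity-check the result, you can count how many boxes survive the filter; a quick sketch:

# Number of ground truth detections remaining after the size filter
print(small_boxes_view.count("ground_truth.detections"))

# Compare to the total number of ground truth detections in the dataset
print(dataset.count("ground_truth.detections"))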
Learn more about the MongoDB syntax underlying FiftyOne expressions in the FiftyOne Docs.
Selecting IDs from session
Community Slack member Dan Erez asked,
“When using the selection plot tool, is there a way to extract the list of sample ids that I selected?”
If you’re using the FiftyOne App to plot some of your samples, for instance visualizing UMAP embeddings, then you can access the currently lassoed samples using plot.selected_ids. This can be useful if, for instance, you want to pre-annotate a selection of samples by cluster.
This feature exemplifies how easy it is to move back and forth between the Python SDK and the FiftyOne App. As another example, if you click on an image in the sample grid and select a detection bounding box, you can retrieve the sample and prediction info in the Python SDK using session.selected_labels.
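Here is a minimal sketch of that workflow, assuming the FiftyOne Brain is installed and you are working with the quickstart dataset (the brain key below is just an example name):

import fiftyone as fo
import fiftyone.brain as fob
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

# Compute a 2D embeddings visualization for the dataset
results = fob.compute_visualization(dataset, brain_key="img_viz")

session = fo.launch_app(dataset)

# Create an interactive scatterplot and attach it to the session
plot = results.visualize()
plot.show()
session.plots.attach(plot)

# After lassoing points in the plot, grab the selected sample IDs
selected_view = dataset.select(plot.selected_ids)

# Labels currently selected in the App's sample modal
print(session.selected_labels)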
Learn more about sessions and the FiftyOne App in the FiftyOne Docs.
Join the FiftyOne community!
Join the thousands of engineers and data scientists already using FiftyOne to solve some of the most challenging problems in computer vision today!
- 1,200+ FiftyOne Slack members
- 2,300+ stars on GitHub
- 2,000+ Meetup members
- Used by 208+ repositories
- 51+ contributors
What’s next?
- If you like what you see on GitHub, give the project a star.
- Get started! We’ve made it easy to get up and running in a few minutes.
- Join the FiftyOne Slack community; we’re always happy to help.