We recently released FiftyOne 0.19, which is packed with new features that make it even easier and faster to visualize your computer vision datasets and boost the performance of your machine learning models. How? Read on!

Voxel51 Co-Founder and CTO Brian Moore walked us through the new features in a live webinar, with plenty of live demos and code examples to show them in action. You can watch the playback on YouTube, take a look at the slides, read the transcript, and read the recap below for the highlights. Enjoy!

https://www.youtube.com/watch?v=6hlJO8wW0YU

What Is FiftyOne?

Brian starts with a quick overview of what FiftyOne is for those who might be new to it. It’s a data-centric open source toolset that enables you to:

Visualize, query, and analyze computer vision datasets
Streamline annotation workflows
Identify and correct labeling mistakes
Analyze model performance, both visually and programmatically

… And dozens of additional workflows to help you curate high quality computer vision datasets and improve model performance. As a tool for visual data (images, videos, and 3D data), it supports all popular computer vision tasks, including classification, detection, instance segmentation, polygons and polylines, keypoints, point clouds and annotations, geolocation, embeddings, and multiview datasets.

Before diving into the fresh new features, Brian gives a quick shout out to a couple of other capabilities:

FiftyOne is more than just an application. It's also a very powerful Python API. This means you can move seamlessly between code and the UI.
FiftyOne comes with the Brain, a component that provides powerful machine learning techniques and helps you automatically surface potential issues in your datasets.

What Are the New Features in FiftyOne 0.19?

Here are the new features in FiftyOne 0.19 that were demonstrated in the webinar and are explained in the sections below:

Spaces: a new feature that enables you to organize panels of information within the FiftyOne App
In-App embeddings visualization: the ability to visualize embeddings directly in the App through a panel made possible by Spaces
Saved views: the ability to save views under a name of your choice and recall those later either in the App or through code
On-disk segmentations: as of this release you can store semantic segmentation masks and heatmaps on disk, including as RGB images, rather than in the database
New UI filtering options: we're continually bringing new filtering options to the App’s sidebar, which now contains upgraded options for filtering datasets

1. Spaces

Before this release, the App already let you look at your data in different ways (on a map, for example) through a persistent row of tabs at the top of the window. Those tabs have now been converted into Panels that you can arrange horizontally and vertically within your App window. You can drag Panels to reorganize them, and you can have multiple tabs open in a Panel to toggle between them.

Brian provides an analogy for the Spaces feature: “it's like having a code editor for your data, so when you're looking at FiftyOne now, you can think more like VSCode, and less like a grid of images alone.” FiftyOne natively includes these Panel types:

Samples panel: the media grid that loads by default when you launch the App
Histograms panel: a dashboard of histograms for the fields of your dataset
Embeddings panel: a canvas for working with embeddings visualizations (detailed below!)
Map panel: visualizes the geolocation data of datasets that have a GeoLocation field

Other nifty things to know about Spaces: You can configure custom Panels via plugins. And, not only can you configure Spaces through the App, you can also configure them through code. Learn more about configuring Spaces from ~13:15 - 21:37 in the webinar replay, in the release announcement blog post, and in the FiftyOne Spaces docs.
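To make the "configure them through code" point concrete, here is a minimal sketch of building a Spaces layout in Python. It assumes FiftyOne 0.19 or later is installed; the helper name build_spaces_layout and the brain key "img_viz" are illustrative choices, not names from the webinar.

```python
def build_spaces_layout(brain_key="img_viz"):
    """Return a Spaces layout with Samples, Histograms, and Embeddings panels.

    Hypothetical helper for illustration; `brain_key` is assumed to name an
    existing compute_visualization() run on your dataset.
    """
    import fiftyone as fo  # assumes `pip install fiftyone` (0.19 or later)

    samples_panel = fo.Panel(type="Samples", pinned=True)
    histograms_panel = fo.Panel(type="Histograms")
    embeddings_panel = fo.Panel(
        type="Embeddings", state=dict(brainResult=brain_key)
    )

    # One Space holds Samples and Histograms; a second holds Embeddings
    return fo.Space(
        children=[
            fo.Space(
                children=[
                    fo.Space(children=[samples_panel]),
                    fo.Space(children=[histograms_panel]),
                ],
                orientation="horizontal",
            ),
            fo.Space(children=[embeddings_panel]),
        ],
        orientation="vertical",
    )
```

With a dataset in hand, you could then pass the layout at launch time, for example fo.launch_app(dataset, spaces=build_spaces_layout()).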

2. In-App Embeddings Visualization

Brian walks us through how to interactively explore embeddings scatterplots in the App through the Embeddings Panel mentioned above. The Embeddings Panel:

Supports any visualization generated by compute_visualization()
Supports image and object-level embeddings
Enables you to lasso points in the plot to show only the corresponding samples/patches
Enables you to show/hide specific classes via the legend
Automatically updates when your view changes
Gracefully handles new/missing data

To show us the Embeddings Panel in action, Brian demos a subset of the Berkeley Deep Drive dataset, using embeddings he had generated ahead of time with the compute_visualization() method available in the FiftyOne Brain. In the demo, he color codes the samples by time of day: red for night, pink for day. Looking at the Samples Panel and Embeddings Panel side by side enables you to pull out interesting findings. For example, Brian turns off all time-of-day labels except daytime labels, then lassos the samples that are labeled as daytime but clustered with nighttime samples in the Embeddings Panel, then explores the lassoed samples in the Samples Panel. Some of these samples are incorrectly labeled. Others were captured while driving through a tunnel, so they are correctly labeled as daytime but are visually darker.
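For readers who want to prepare a dataset the same way, the following is a rough sketch of the "ahead of time" step. The wrapper name compute_image_visualization and the brain key "img_viz" are made up for illustration; fob.compute_visualization() is the Brain method named above.

```python
def compute_image_visualization(dataset, brain_key="img_viz", patches_field=None):
    """Compute an embeddings visualization that the Embeddings Panel can load.

    Hypothetical wrapper. Leave `patches_field` as None for image-level
    embeddings, or pass the name of a label field for object-level embeddings.
    """
    import fiftyone.brain as fob  # ships with `pip install fiftyone`

    # The results are stored on the dataset under `brain_key`
    return fob.compute_visualization(
        dataset, patches_field=patches_field, brain_key=brain_key
    )
```

After this runs, launching the App on the same dataset and opening an Embeddings Panel should let you pick the stored brain key. Depending on the dimensionality reduction method used, additional packages may need to be installed.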

How might you act on findings like these in FiftyOne? Brian describes some of the ways: “This could be a powerful workflow to identify label mistakes. If these were instead model predictions, it could be an equally interesting way to pick out particular samples that were classified incorrectly. And then maybe grab close samples, and use them as hard examples, especially if you're viewing a data set that contains a bunch of points, only some of which were in your training dataset.” Brian shows a few other examples from this dataset of exploring outliers in the embeddings clusters to find mislabeled samples (images with a highly visible dashboard, images in the rain, etc.). Finally, he shows a dataset with geolocation data in order to show the Samples Panel, Map Panel, and Embeddings Panel in a single view and all working in concert with each other as you filter and explore. Learn more about working with the Embeddings Panel from ~21:38 - 29:34 in the webinar replay, in the release announcement blog post, and in the Embeddings Panel docs.

3. Saved Views

Next, Brian explains the new saved views features. As of FiftyOne 0.19, you can save any view. Simply give your saved view a name, and then you can load that saved view later, either through the App or through code. What kind of workflows can you accomplish with saved views? Brian explains, “I've seen workflows where users tag data and then create a saved view that simply matches a certain tag, which would give you a quick way to pull up specific subsets of your data set encoded by tags. Or you might use saved views to remember what subset of your data set you trained a model on by creating the view, and then naming it the name of the model you're training. Maybe I want to use a saved view as a shorthand to pull up the samples in my dataset that were the nighttime images in the example before, or really anything else you can imagine.” In the demo, Brian shows how to save a view of samples in a dataset that are labeled as cats, sorted by the number of cats with the most cats shown first, and then load that saved view in the App.

You may notice in the screenshot above that the name of the saved view is in the URL bar in the App. This gives you a one-click way to pull up a specific subset of a dataset. (Additionally, for anyone using FiftyOne Teams to securely collaborate on datasets, you could share this link with members of your team.) Get the details and demo on saved views in the webinar replay from ~29:37 - 34:53, in the release announcement blog post, and also in the Saving Views docs.
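The cats demo described above can be approximated in code as follows. This is a sketch rather than the exact code from the webinar: the helper name save_cats_view, the view name "cats-view", and the assumption that the labels live in a Detections field called "ground_truth" are all illustrative.

```python
def save_cats_view(dataset, name="cats-view", field="ground_truth"):
    """Save a view of cat detections, most cats first, and load it back.

    Hypothetical helper; `field` is assumed to be a Detections field.
    """
    from fiftyone import ViewField as F  # assumes FiftyOne 0.19 or later

    view = (
        dataset
        # Keep only the objects labeled "cat"
        .filter_labels(field, F("label") == "cat")
        # Sort by the number of remaining objects, largest first
        .sort_by(F(f"{field}.detections").length(), reverse=True)
    )

    dataset.save_view(name, view)

    # Later, in any session, the view can be recalled by name
    return dataset.load_saved_view(name)
```

Calling dataset.list_saved_views() afterwards should show the names of the views saved on the dataset.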

4. On-Disk Segmentations

It has long been possible to work with semantic segmentations in FiftyOne, but until now the masks were stored in the database, which was convenient but not efficient for large datasets. Many people prefer to work with segmentations stored on disk, so as of FiftyOne 0.19 you can! Brian walks us through some notable characteristics of the new on-disk segmentation feature:

Storing segmentations on disk is significantly more performant
RGB segmentation masks are now also supported
The entire API has been upgraded to support on-disk and RGB segmentations, including:
- evaluate_segmentations()
- apply_model()
- export_segmentations()

Did you know? Before showing on-disk segmentations, Brian gives a shout out to a powerful feature in FiftyOne you may not yet know about: FiftyOne provides a number of utility methods to convert between different representations of certain label types, such as converting between instance segmentations, semantic segmentations, and polylines, each implemented with a simple command. Learn more about converting label types in the docs.

Now, to show on-disk segmentations, Brian first loads 25 samples from the COCO 2017 validation split. The samples in this dataset have a field called instances (for instance segmentations) that contains detection objects with masks. Brian converts the instance segmentations to semantic segmentations and passes the output_dir argument so that the resulting masks are stored on disk.
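A sketch of that conversion is shown below. The dataset size, split, and the instances field come from the demo description; the cat/dog class filter, the mask target values, the output field name "segmentations", and the default output directory are assumptions made for illustration.

```python
def convert_instances_to_on_disk_segmentations(output_dir="/tmp/segmentations"):
    """Load 25 COCO validation samples and write semantic masks to disk.

    Hypothetical helper; the class list and mask targets are assumptions.
    """
    import fiftyone.zoo as foz  # assumes FiftyOne 0.19 or later
    import fiftyone.utils.labels as foul

    dataset = foz.load_zoo_dataset(
        "coco-2017",
        split="validation",
        label_types=["segmentations"],
        classes=["cat", "dog"],  # assumed classes, for a small download
        label_field="instances",
        max_samples=25,
        only_matching=True,
    )

    # Pixel value -> class name for the generated semantic masks
    mask_targets = {1: "cat", 2: "dog"}

    # Passing output_dir writes the masks to disk instead of the database
    foul.objects_to_segmentations(
        dataset,
        "instances",
        "segmentations",
        output_dir=output_dir,
        mask_targets=mask_targets,
    )
    return dataset
```

Note that calling this function downloads data from the FiftyOne Dataset Zoo the first time it runs.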

Then Brian shows the resulting segmentations on the dataset, each of which now has a mask_path pointing to a file on disk. You can learn more about the new on-disk feature in the demo from ~38:19 - 42:05 in the webinar replay, in the release announcement blog post, and in the docs: instance segmentations, semantic segmentation, and heatmaps.

5. New UI Filtering Options

Next, Brian covers the new built-in UI filtering options added in the App’s sidebar as of FiftyOne 0.19:

Only show objects with the specified labels (omitting samples with no matching objects)
Exclude objects with the specified labels
Show samples that contain the specified labels (without filtering)
Omit samples that contain the specified labels

All applicable filtering options are available from both the grid view and the sample modal, and for all field types, including top-level fields and dynamic label attributes. This has always been possible through code, but now you can do it through the App as well! To learn more about the new filtering options, check out the demo from ~42:05 - 45:30 in the webinar replay and in the release announcement blog post.
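Since these filters have always been possible through code, here is one way the four sidebar options might map onto view stages. Treat it as an approximation: the helper name sidebar_filter_equivalents is hypothetical, the field is assumed to be a Detections field, and the App may build its views differently under the hood.

```python
def sidebar_filter_equivalents(dataset, labels, field="ground_truth"):
    """Return views that approximate the four sidebar filtering options.

    Hypothetical helper; `field` is assumed to be a Detections field.
    """
    from fiftyone import ViewField as F  # assumes `pip install fiftyone`

    # True for objects whose label is one of `labels`
    is_match = F("label").is_in(labels)
    # True for samples that contain at least one of `labels`
    has_match = F(f"{field}.detections.label").contains(labels)

    return {
        # Only matching objects; samples with no matches are omitted
        "only_show_matching_objects": dataset.filter_labels(field, is_match),
        # Matching objects removed; all samples are kept
        "exclude_matching_objects": dataset.filter_labels(
            field, ~is_match, only_matches=False
        ),
        # Whole samples that contain a match, with labels left unfiltered
        "show_samples_with_labels": dataset.match(has_match),
        # Whole samples that contain no match
        "omit_samples_with_labels": dataset.match(~has_match),
    }
```

For example, sidebar_filter_equivalents(dataset, ["cat"])["omit_samples_with_labels"] would be a view of the samples that contain no cats.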

Other Notes

After demonstrating the new features, Brian shares a few additional points before concluding the presentation. Open source software like FiftyOne doesn’t happen without an amazing community supporting it. Brian gives a shout out to the community members who contributed to FiftyOne 0.19. In addition to FiftyOne, Voxel51 also builds FiftyOne Teams, which adds collaboration and security features built specifically for teams, including cloud-backed media, dataset permissions, versioning, sharing, and more, all on top of the goodness available in open source FiftyOne.

Q&A from the Webinar

You mentioned an analogy: FiftyOne can be the pandas for computer vision. Can you elaborate?

Yes! While they apply to different types of data, the pandas DataFrame and FiftyOne Dataset classes share many similar functionalities. As a result, we prepared a side-by-side comparison of common operations in the two libraries in a tutorial, blog post, and cheat sheet.

Are there any plans to add support for text annotation visualizations for scene text recognition problems?

There are already FiftyOne users working on scene text recognition today! You can store StringFields on your samples or labels that you can use to store and visualize arbitrary text. One example is storing the bounding boxes as Detection labels, then storing the result of your text recognition model as a string attribute of your detection.

Adding fields to sample: https://docs.voxel51.com/user_guide/using_datasets.html#adding-fields-to-a-sample
Label attributes: https://docs.voxel51.com/user_guide/using_datasets.html#labels

Does FiftyOne have support for 3D images other than video? (i.e. time is not one of the dimensions).

Yes we support 3D point clouds, polylines, bounding boxes, etc. But to clarify: FiftyOne’s support for 3D point clouds does not support any specific visualizer for 3D volumetric images (e.g. MRI). For this, the current practice is to slice the volume according to the dimension of minimum size and render these into frames of a video. FiftyOne does support plugins based on media type. So one could envision adding a 3D volumetric imaging visualizer via a custom plugin.

So, I could build my own visualization tool as a plugin? Cool!

Yes! You could write your own custom plugin of any kind and expose it directly in the FiftyOne App. If you need any help along the way, please reach out to us in Slack individually (Jacob, Brian, anyone at Voxel51) or generally in the #help channel where we hang out and we would be happy to assist you.
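As a footnote to the volumetric imaging answer above, the slicing step can be sketched in a few lines of plain Python. This is a toy illustration that is not part of FiftyOne: the function name is made up, the volume is assumed to be a nested list indexed as volume[i][j][k], and encoding the resulting frames into a video file is left out.

```python
def slice_along_smallest_axis(volume):
    """Split a 3D volume into 2D frames along its smallest dimension.

    `volume` is a nested list indexed as volume[i][j][k]. If two dimensions
    tie for smallest, the first one is used.
    """
    shape = (len(volume), len(volume[0]), len(volume[0][0]))
    axis = shape.index(min(shape))

    if axis == 0:
        # Each frame is one plane volume[i]
        return [[row[:] for row in plane] for plane in volume]

    if axis == 1:
        # Frame j collects row j from every plane
        return [[plane[j][:] for plane in volume] for j in range(shape[1])]

    # Frame k collects element k from every row of every plane
    return [
        [[row[k] for row in plane] for plane in volume]
        for k in range(shape[2])
    ]
```

Each returned frame is a 2D nested list that could then be written out as one frame of a video with the imaging library of your choice.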
