Getting Started with FiftyOne Workshop – April 26 Recap

Jacob Marks, PhD, Machine Learning Engineer at Voxel51, recently presented the Getting Started with FiftyOne Workshop, part of a series of hands-on educational events that show you, step by step, how to use the open source FiftyOne computer vision toolset. In this blog post, we summarize the highlights, recap the questions and answers from the event, and share the upcoming schedule of events. We’d love to see you at a future event!

First, Thanks for Voting for Your Favorite Charity!

In lieu of swag, we gave attendees the opportunity to help guide our monthly donation to charitable causes. The charity that received the highest number of votes was Wildlife AI. We were first introduced to Wildlife AI through the FiftyOne community! They are using FiftyOne to enable their users to easily analyze the camera data and create their own models. We are sending a charitable donation of $100 to Wildlife AI on behalf of the computer vision community who participated in this event!

Wait, what’s FiftyOne?

The Getting Started with FiftyOne Workshop was created to help you get up-to-speed on the basics of using open source FiftyOne in your computer vision workflows. If you’re new to FiftyOne – you may be wondering, what is it? FiftyOne enables data science teams to improve the performance of their computer vision models by helping them curate high quality datasets, evaluate models, find mistakes, visualize embeddings, and get to production faster.

Workshop summary

Half lecture, half lab, the workshop covered these essentials:

Lecture:

  • FiftyOne Basics (terms, architecture, installation, and general usage)
  • An overview of useful workflows to explore, understand, and curate your data
  • How FiftyOne represents and semantically slices unstructured computer vision data

Lab:

  • Loading datasets from the FiftyOne Dataset Zoo
  • How to easily navigate the FiftyOne App’s features
  • Programmatically inspecting attributes of a dataset
  • Adding new samples and custom attributes to a dataset
  • Generating and evaluating model predictions
  • How to save insightful views into the data

… All with the goal of helping you gain greater visibility into the quality of your computer vision datasets and models.

Lecture: get up-to-speed on the basics

Jacob walked us through some popular ways to use FiftyOne.

Curate data 

  • Find: filter, match, sort, select
  • Remove: duplicates
  • Add: tags, metadata, predictions
  • Correct: annotation mistakes
  • Save: interesting “views”

Understand data

  • Aggregate statistics: FiftyOne makes it easy to compute summary statistics about your datasets and views, including histograms and all of the traditional aggregations for numerical quantities you would expect: min, max, mean, standard deviation, and more.
  • Embeddings: Visualizing your dataset in a low-dimensional embedding space is a powerful workflow that can help you uncover hidden patterns and clusters in your data so you can take action to improve the quality of your datasets and models.
  • Interactive visualization: All these visualizations are interactive. Simply lasso points in an embeddings plot to see just those samples, explore a cell in a confusion matrix to see just those samples, and more.  

Evaluate data

FiftyOne has support for tons of one-number metrics: precision, recall, F1 score, intersection over union, and more. There’s support for all of your favorite plots including PR curves and confusion matrices. You can also perform analysis on samples, labels, and entire datasets.

Tap into the flexibility of FiftyOne

Because computer vision is not one-size-fits-all, FiftyOne is designed for flexibility and customizability, across all these categories:

  • Datasets
  • Models
  • Media types
  • Labels
  • Plugins
  • More!

Key components of FiftyOne

Jacob provided a few additional tips to prepare everyone as they geared up for the hands-on lab. First, Jacob described the three main components of FiftyOne: the FiftyOne Library, the FiftyOne App, and the FiftyOne Brain.

He then covered two additional helpful concepts:

  • A description of working with tabular data (structured data) vs computer vision data (unstructured data), and how FiftyOne can be thought of as the pandas of computer vision
  • A look under the hood of a schema – including a dataset, samples, fields, metadata, filepath, labels, media type, and more

Lab: fire up FiftyOne and experience it for yourself! 

The hands-on lab was the star of the second half, enabling you to put your learnings from the lecture into action. The outcome – everyone fired up FiftyOne, explored datasets and models firsthand, and experienced how to:

  • Install FiftyOne
  • Load datasets and models from the FiftyOne Dataset Zoo and FiftyOne Model Zoo
  • Easily navigate the FiftyOne App’s features
  • Programmatically inspect attributes of a dataset
  • Add new samples and custom attributes to a dataset
  • Evaluate model predictions
  • Save insightful views into the data
  • More!

Q&A recap

Does FiftyOne support segmentation?

Yes! Learn more about FiftyOne’s support for instance segmentation and semantic segmentation masks in the FiftyOne Docs.

Does FiftyOne work with the Segment Anything Model?

Yes! Look for a blog on this topic in the near future.

Can I import a 3D point cloud dataset into FiftyOne?

Absolutely! Check out this blog and this tutorial to learn more about how to work with 3D point cloud data in FiftyOne. If you want to work with point clouds as part of grouped datasets, see the FiftyOne User Guide.

Is FiftyOne similar to YOLO?

YOLO (You Only Look Once) is a family of object detection models, while FiftyOne is a toolset for managing and curating your computer vision data. In a typical workflow, the two are used together, with FiftyOne boosting the performance of the YOLO model by improving the quality of the data going in. Check out this YOLO tutorial for additional details.

In a dataset, can multiple samples point to the same source media file?

Yes!

Do the uniqueness and similarity features in FiftyOne use localized distance metrics?

Out of the box, FiftyOne selects a default metric, as well as a default model, for FiftyOne Brain methods like uniqueness and similarity. However, you can also specify these via keyword arguments. For instance, with a Qdrant similarity backend, you can request cosine similarity with `metric="cosine"`. Different vector search backends support different metrics. Learn more about how to use uniqueness and similarity in the FiftyOne Docs.

What types of integrations does FiftyOne support?

FiftyOne supports a variety of popular platforms and tools including COCO, PyTorch, AWS, Google Cloud, Qdrant, and more. For a complete list of documented integrations, check out the integration Docs.

What options does FiftyOne support for model evaluations?

FiftyOne provides a variety of builtin methods for evaluating your model predictions, including regressions, classifications, detections, polygons, instance and semantic segmentations, on both image and video datasets. It also supports custom metrics. For a variety of tips and tricks concerning evaluations, check out this blog and these Docs.

What backend database does FiftyOne use?

FiftyOne uses MongoDB as a backend. Learn more about configuring a MongoDB backend in the Docs.
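
By default, FiftyOne launches its own bundled MongoDB instance, but you can point it at your own deployment via the `database_uri` setting in `~/.fiftyone/config.json` (the URI below is a placeholder):

```json
{
    "database_uri": "mongodb://localhost:27017"
}
```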

Does FiftyOne have utilities for dataset transformations and/or conversions?

Yes! A good place to start is with the documentation concerning Custom Importers and Custom Exporters. For a deeper dive, check out the FiftyOne data utils.

Join an upcoming event

We are excited to have a number of other events already lined up and hope to see you there!

Upcoming Computer Vision Meetups:

  • May ’23 Computer Vision Meetup (Americas & EMEA)
  • May ’23 Computer Vision Meetup (APAC)
  • June ’23 Computer Vision Meetup (Americas, EMEA)

Additional dates & times for the Getting Started with FiftyOne Workshop:

  • May 31 @ 4 PM BST [11 AM EDT / 15:00 UTC]
  • June 28 @ 10 AM PDT [1 PM EDT / 17:00 UTC]

See the full schedule.