FiftyOne Sample Fields Tips and Tricks – Aug 11, 2023

Welcome to our weekly FiftyOne tips and tricks blog where we recap interesting questions and answers that have recently popped up on Slack, GitHub, Stack Overflow, and Reddit. Recently, many users have been interested in learning more about fields so we will be shining a spotlight on them today!

Wait, what’s FiftyOne?

FiftyOne is an open source machine learning toolset that enables data science teams to improve the performance of their computer vision models by helping them curate high quality datasets, evaluate models, find mistakes, visualize embeddings, and get to production faster.

Ok, let’s dive into this week’s tips and tricks!

Managing your sample fields

One of the great things about FiftyOne datasets is that your data is more than just an image directory. With FiftyOne, you are able to store metadata, scalar fields, labels, or tags — all within a single sample. Let’s start by viewing a sample to see what a default sample may look like.

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset(
    "coco-2017",
    split="validation",
    dataset_name="tips+tricks",
)
dataset.persistent = True

sample = dataset.first()
print(sample)
<Sample: {
    'id': '64d47b3ec4b549ae3a13ac47',
    'media_type': 'image',
    'filepath': '/home/dan/fiftyone/coco-2017/validation/data/000000000139.jpg',
    'tags': ['validation'],
    'metadata': <ImageMetadata: {
        'size_bytes': None,
        'mime_type': None,
        'width': 640,
        'height': 426,
        'num_channels': None,
    }>,
    'ground_truth':
    ...

We can see several main fields right away: filepath, tags, metadata, and our ground truth labels with all of our detections! The available fields depend on the dataset you loaded, so it’s great to see how much is already stored just by loading it into FiftyOne. The fun doesn’t stop there, as there are plenty of ways to add to our samples!

Adding predictions to your samples

One of the most useful fields to add to your samples is your model’s predictions. This lets you not only visualize your predictions next to your ground truth in the FiftyOne App, but also evaluate your model to get key metrics such as accuracy or mAP (mean average precision). A quick way to do this is shown below and in our docs!

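import torch
import torchvision
from PIL import Image
from torchvision.transforms import functional as func

import fiftyone as fo

# Assumed setup (not part of the original snippet): a pretrained torchvision
# detector, a device, and a small view of the dataset to run inference on
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.to(device)
model.eval()
predictions_view = dataset.take(100, seed=51)

# Get class list
classes = dataset.default_classes

# Add predictions to samples
with fo.ProgressBar() as pb:
    for sample in pb(predictions_view):
        # Load image
        image = Image.open(sample.filepath)
        image = func.to_tensor(image).to(device)
        c, h, w = image.shape

        # Perform inference; torchvision detectors take a list of images
        # and return one dict of predictions per image
        preds = model([image])[0]
        labels = preds["labels"].cpu().detach().numpy()
        scores = preds["scores"].cpu().detach().numpy()
        boxes = preds["boxes"].cpu().detach().numpy()

        # Convert detections to FiftyOne format
        detections = []
        for label, score, box in zip(labels, scores, boxes):
            # Convert absolute [x1, y1, x2, y2] pixels to FiftyOne's
            # relative [top-left-x, top-left-y, width, height] in [0, 1]
            x1, y1, x2, y2 = box
            rel_box = [x1 / w, y1 / h, (x2 - x1) / w, (y2 - y1) / h]

            detections.append(
                fo.Detection(
                    label=classes[label],
                    bounding_box=rel_box,
                    confidence=score
                )
            )

        # Save predictions to dataset
        sample["my_model"] = fo.Detections(detections=detections)
        sample.save()

session = fo.launch_app(predictions_view)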
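
One detail worth calling out when converting detections: FiftyOne stores bounding boxes as [top-left-x, top-left-y, width, height] in relative coordinates between 0 and 1, while torchvision detection models return absolute [x1, y1, x2, y2] pixel boxes. Here is a minimal, standalone sketch of that conversion. The helper name to_relative_xywh and the example box are made up for illustration; the 640 x 426 image size matches the sample printed earlier.

```python
def to_relative_xywh(box, width, height):
    """Convert an absolute [x1, y1, x2, y2] pixel box into a relative
    [top-left-x, top-left-y, width, height] box with values in [0, 1].

    Hypothetical helper for illustration; it is not part of FiftyOne.
    """
    x1, y1, x2, y2 = box
    return [
        x1 / width,
        y1 / height,
        (x2 - x1) / width,
        (y2 - y1) / height,
    ]


# Illustrative box centered in a 640 x 426 image
rel_box = to_relative_xywh([160.0, 106.5, 480.0, 319.5], width=640, height=426)
print(rel_box)  # [0.25, 0.25, 0.5, 0.5]
```

The resulting list is what you would pass as bounding_box when building each fo.Detection.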

Adding strings or scalars to samples

With FiftyOne samples, you also have the flexibility to easily add fields of several basic data types. You can use this to keep track of where the data came from, who added it to the dataset, or why it is there. There are many ways to do this, but here are two easy ones:

  1. Set the value directly on the sample, as in the int_field example.
  2. Declare a new field of a specific type on the dataset with add_sample_field(), then set the value for that field on each sample.

For a full list of basic field types, refer to the docs!

sample = dataset.first()

## option 1
sample["int_field"] = 51

## option 2
dataset.add_sample_field("location", fo.StringField)
sample["location"] = "outdoor"

sample.save()

How to map original labels to super categories

Another cool use case you can achieve easily with FiftyOne is adding something like super categories to your samples. Fine-grained labels can be hard to reason about, and sometimes a coarser grouping is clearer. Because a sample can hold multiple label fields, you can keep both: clone the original ground truth field with clone_sample_fields() and then map the labels in the new field to their super categories. Here is an example!

import json

with open('annotation.json') as f:
    data = json.load(f)

# Create a dictionary mapping category names to supercategory names
class_to_supercategory = {}

for category in data['categories']:
    class_to_supercategory[category['name']] = category['supercategory']

# Duplicate the ground truth field (named ground_truth in the dataset
# loaded above; adjust the name to match your own dataset)
clone_f = {
    "ground_truth": "gt_super",
}

dataset.clone_sample_fields(clone_f)

# Remap the category names to supercategory names in the new field and save
dataset.map_labels("gt_super", class_to_supercategory).save()
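
To see what the mapping step does without needing an annotation file on disk, here is a small standalone sketch in plain Python. The categories list is an illustrative excerpt in COCO annotation format, and remap_labels is a hypothetical stand-in that mimics the effect of map_labels() on a list of label strings; it is not a FiftyOne function. Labels that are missing from the mapping are left unchanged.

```python
# Illustrative excerpt of the "categories" section of a COCO-format file
categories = [
    {"id": 1, "name": "person", "supercategory": "person"},
    {"id": 2, "name": "bicycle", "supercategory": "vehicle"},
    {"id": 18, "name": "dog", "supercategory": "animal"},
]

# Same mapping as above: category name -> supercategory name
class_to_supercategory = {
    category["name"]: category["supercategory"] for category in categories
}


def remap_labels(labels, mapping):
    """Replace each label with its mapped value, keeping unmapped labels as-is.

    Hypothetical helper that only illustrates the remapping; not FiftyOne API.
    """
    return [mapping.get(label, label) for label in labels]


original = ["dog", "bicycle", "person", "kite"]
remapped = remap_labels(original, class_to_supercategory)
print(remapped)  # ['animal', 'vehicle', 'person', 'kite']
```

Because the remapping happens on the cloned field, the original labels stay available for comparison.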

To learn more about fields, samples, and other FiftyOne features, head over to our User Guide!

Join the FiftyOne community!

Join the thousands of engineers and data scientists already using FiftyOne to solve some of the most challenging problems in computer vision today!