Welcome to our weekly FiftyOne tips and tricks blog where we cover interesting workflows and features of FiftyOne! This week is part two of a two-part series exploring FiftyOne’s grouped datasets. If you missed last week, you can still catch it here. Today we dive into dynamic grouped datasets and how to create powerful new ways to organize and view your data.
Wait, What’s FiftyOne?
FiftyOne is an open source machine learning toolset that enables data science teams to improve the performance of their computer vision models by helping them curate high quality datasets, evaluate models, find mistakes, visualize embeddings, and get to production faster.
- If you like what you see on GitHub, give the project a star.
- Get started! We’ve made it easy to get up and running in a few minutes.
- Join the FiftyOne Slack community. We’re always happy to help.
What Is a Dynamic Group?
Let’s look at what dynamic grouping is and when you might want to use it. Dynamic grouping is a feature in FiftyOne that lets you group the samples in your dataset by a particular field or expression. In the most basic example, we can take a standard classification dataset and dynamically group it to create a new view. This is all possible with the `group_by` method:
import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

dataset = foz.load_zoo_dataset("cifar10", split="test")

# Take 100 samples and group by ground truth label
view = dataset.take(100, seed=51).group_by("ground_truth.label")

print(view.media_type)  # group
print(len(view))  # 10
Our new view is a dynamic group view: instead of showing all 100 images individually, it organizes them by label. Opening the FiftyOne App lets us see our 10 groups and click into each one to see all of the samples with that label.
A dynamic group view is a collection of samples that has been grouped based on a specified matching condition. You can group by scene number, label, view expression, or more generally any field on your samples. We will walk through a couple of examples of how this can be done and how to work with the dataset afterwards.
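To make the idea of grouping by a field or an expression concrete, here is a minimal plain-Python analogy. It is only a sketch of the concept, not how FiftyOne implements `group_by`; the helper name `group_samples` and the toy sample dictionaries are hypothetical.

```python
from collections import defaultdict


def group_samples(samples, key):
    """Group dict-like samples by a field name or by a callable expression.

    Hypothetical helper for illustration only; not part of FiftyOne.
    """
    key_fn = key if callable(key) else (lambda sample: sample[key])
    groups = defaultdict(list)
    for sample in samples:
        groups[key_fn(sample)].append(sample)
    return dict(groups)


# Toy stand-ins for samples in a classification dataset
samples = [
    {"filepath": "img0.png", "label": "cat"},
    {"filepath": "img1.png", "label": "dog"},
    {"filepath": "img2.png", "label": "cat"},
]

# Group by a field
by_label = group_samples(samples, "label")
print(len(by_label))  # 2
print([s["filepath"] for s in by_label["cat"]])  # ['img0.png', 'img2.png']

# Group by an expression (here, the file extension)
by_ext = group_samples(samples, lambda s: s["filepath"].rsplit(".", 1)[-1])
print(list(by_ext))  # ['png']
```

The number of groups equals the number of distinct key values, which is why the CIFAR-10 view above has a length of 10.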
Grouping Together Video Frames
Here’s another way to use `group_by`. In this example, we have a set of videos that we need to break down into frames for our model and use case. We can load the videos with the `quickstart-video` zoo dataset and create frames from them with `to_frames`:
dataset2 = (
    foz.load_zoo_dataset("quickstart-video")
    .to_frames(sample_frames=True)
    .clone()
)

print(dataset2)  # 1279 samples

session = fo.launch_app(dataset2)
Previously, the App greeted us with each individual video in our dataset. Now we are instead met with the first several frames of the first video, and to see a sample from the next video we have to scroll a long way. No worries: dynamic group views can clean this up for us. We can group and order our data by calling `group_by` with the `sample_id` of the original video and setting `order_by` to the frame number.
view2 = dataset2.group_by("sample_id", order_by="frame_number")

print(len(view2))  # 10
print(view2.values("frame_number"))

session.view = view2
This can also be done within the App by using the dynamic group button! Take a look below on how to do the same only using the UI!
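Grouping frames by their source video and ordering each group by frame number can be pictured with the plain-Python sketch below. It is an analogy under simplifying assumptions, not FiftyOne code: the `group_and_order` helper and the toy frame records are hypothetical, and only the field names `sample_id` and `frame_number` come from the example above.

```python
from collections import defaultdict


def group_and_order(frames, group_field, order_field):
    """Group records by one field and sort each group by another.

    Hypothetical helper for illustration only; not part of FiftyOne.
    """
    groups = defaultdict(list)
    for frame in frames:
        groups[frame[group_field]].append(frame)
    return {
        key: sorted(members, key=lambda f: f[order_field])
        for key, members in groups.items()
    }


# Toy frame records arriving in no particular order
frames = [
    {"sample_id": "video_b", "frame_number": 2},
    {"sample_id": "video_a", "frame_number": 2},
    {"sample_id": "video_b", "frame_number": 1},
    {"sample_id": "video_a", "frame_number": 1},
]

grouped = group_and_order(frames, "sample_id", "frame_number")
print(len(grouped))  # 2
print([f["frame_number"] for f in grouped["video_a"]])  # [1, 2]
```

One group per video, with frames in playback order inside each group, is the shape that makes a frame-level dataset easy to browse again.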
Working with Dynamically Grouped Views
Now that our data is grouped, we can use the `get_dynamic_group` method to grab the group we want. `get_dynamic_group` works just like `get_group` does in grouped datasets, only this time on our dynamic group view. It returns a view containing the selected group and all of its samples. For example, we can grab the group for one of the videos in our dataset.
sample_id = dataset2.take(1).first().sample_id
video = view2.get_dynamic_group(sample_id)

print(video.values("frame_number"))
Iterating over datasets and views is another common operation. When you iterate over a dynamic group view, each iteration yields one sample representing a group in the view. If we look at the CIFAR-10 view again, we can see this in action:
# Sort the groups by label
sorted_view = view.sort_by("ground_truth.label")

for sample in sorted_view:
    print(sample.ground_truth.label)
airplane
automobile
bird
cat
deer
dog
frog
horse
ship
truck
To unwind a view that you created using dynamic groups, use the `flatten` method, which undoes the grouping and returns a flat collection of samples.
# Unwind the sorted groups back into a flat collection
flat_sorted_view = sorted_view.flatten()

print(len(flat_sorted_view))  # 100
print(flat_sorted_view.values("ground_truth.label"))
# ['airplane', 'airplane', 'airplane', ..., 'truck']
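The relationship between sorting groups, iterating over them, and flattening them can be sketched in plain Python as follows. This is a conceptual analogy with hypothetical toy data, not FiftyOne's implementation: a dictionary of lists stands in for the dynamic group view.

```python
# Groups keyed by label (a stand-in for a dynamic group view)
groups = {
    "truck": ["img3.png"],
    "airplane": ["img0.png", "img2.png"],
    "bird": ["img1.png"],
}

# Sorting orders the groups themselves, not the samples inside each group
sorted_groups = dict(sorted(groups.items()))

# Iterating yields one representative per group
representatives = [(label, paths[0]) for label, paths in sorted_groups.items()]

# Flattening concatenates the groups, in order, back into one flat list
flat = [
    (label, path)
    for label, paths in sorted_groups.items()
    for path in paths
]

print(len(sorted_groups))  # 3
print(len(flat))  # 4
print([label for label, _ in flat])  # ['airplane', 'airplane', 'bird', 'truck']
```

Note that the flat list has as many entries as there are samples, while the grouped collection has as many entries as there are distinct labels.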
Applying Dynamic Groups to Your Data
When working with FiftyOne, it helps to think of dynamic groups as one of the many tools in your toolbox. Organizing your data and clarifying how it is related is highly valuable in an ML workflow. Pairing related samples can not only bring you insights into your data, it can also speed up your workflow by making training easier. Instead of managing multiple input streams from multiple datasets, with groups you can limit it to just one.
Dynamic groups can be used alongside sensor data to pair every instance, say a frame, with the output of any number of sensors. This lets you easily track sequences in your data and brings a more temporal feel to FiftyOne. Dynamic group views are also great for quick exploration of your dataset. For example, you can look at the groups with the highest number of detections to make sure there are no cluttered or crowded boxes. We can easily group by number of detections.
dataset3 = foz.load_zoo_dataset("quickstart")

# Group samples by the number of ground truth objects they contain
expr = F("ground_truth.detections").length()
view3 = dataset3.group_by(expr)

print(len(view3))  # 26
print(len(dataset3.distinct(expr)))  # 26
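As a rough illustration of why grouping by a computed value is handy for spotting crowded images, the plain-Python sketch below groups toy images by their detection count and picks out the most crowded group. The data and variable names are hypothetical, and this is an analogy rather than FiftyOne code.

```python
# Hypothetical per-image detection labels
detections = {
    "img0.png": ["person", "dog"],
    "img1.png": [],
    "img2.png": ["car", "car", "person", "bike", "person"],
    "img3.png": ["cat", "cat"],
}

# Group image paths by the number of detections they contain
groups = {}
for path, labels in detections.items():
    groups.setdefault(len(labels), []).append(path)

# The group with the largest key holds the most crowded images
most_crowded = max(groups)

print(sorted(groups))  # [0, 2, 5]
print(groups[most_crowded])  # ['img2.png']
```

As in the FiftyOne example above, the number of groups equals the number of distinct detection counts, so checking the largest key is a quick way to find candidates for cluttered annotations.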
For a great use case of a sophisticated view created from dynamic groups, check out Spring, a dataset hosted on the try.fiftyone.ai website so you can instantly see it in action! Explore this dataset to see each frame and all its associated sensor data grouped nicely. This type of grouping can be applied across industries, from automotive and agriculture to aerospace and more!
Wrapping up today’s Tips and Tricks, I hope you now feel comfortable using groups in FiftyOne, whether through a grouped dataset or a dynamic group view. With these features, organizing multiview or multimodal data becomes easy and can be done on the fly. Accessing, changing, iterating over, or reverting dynamic group views is quick and easy to learn! As always, if you are looking to learn more about dynamic groups or have any questions, I highly encourage you to head over to the community Slack channel for more information!
Join the FiftyOne Community!
Join the thousands of engineers and data scientists already using FiftyOne to solve some of the most challenging problems in computer vision today!
- 2,000+ FiftyOne Slack members
- 4,000+ stars on GitHub
- 5,000+ Meetup members
- Used by 370+ repositories
- 60+ contributors