
Forsight Finds a Centralized Dataset Management Solution in FiftyOne Teams

Success Story at a Glance

Challenge

  • As their R&D team grew, manual dataset management was no longer going to work
  • Forsight needed an easier, more transparent way to share image and video datasets among team members

Solution

  • Forsight chose FiftyOne Teams as their central computer vision dataset management system where all the data is aggregated and consumed by all of their machine learning pipelines

Results

  • 1.5TB of visual data is managed daily through FiftyOne Teams
  • Half an engineer’s time is saved
  • Datasets and model performance are improved
  • Costs are straightforward, not complex

How Forsight Uses AI

Forsight is on a mission to help keep workers and their jobsites safe and secure. Forsight does this by using cutting-edge AI vision technology to build solutions for dynamic environments that address challenges regarding safety, security, management, and more.

“Missing safety equipment or not adhering to safety regulations are common but preventable occurrences on jobsites in the US today. We founded Forsight around the idea that if we could save just one person’s life with AI and software, then our efforts would be well worth it,” said Ivan Ralašić, CTO and co-founder of Forsight.

Source: forsight.ai

Although Forsight started with an initial focus on safety in the construction industry, the company now also works with industrial plant owners and operators, manufacturers, mining operations, and other customers to make their sites safer and day to day tasks easier through the use of real-time, vision-powered technology.

Forsight develops AI software that uses CCTV cameras to detect and predict safety incidents, security threats, and management issues in real time. Their solution monitors jobsites for missing personal protective equipment, no-go zones established for safety or security reasons, vehicles, license plates, and more. It does all of this in real time, which means processing each camera feed ten times a second, with many cameras handled in parallel.

Scaling Dataset Management as Business Grows

Forsight’s AI technology is now deployed at hundreds of customer sites, and as Forsight’s business grew, so did their R&D team.

Ivan noted, “Initially, dataset management was a manual process, and the ingestion of data to the training instances was cumbersome with all the syncing. Luckily, it was only me training the models at the time, and I somehow managed to keep track of everything. As we added more team members, we needed a more transparent and easier way to share datasets among team members.”

Ivan wanted to scale and streamline dataset management as the R&D team grew, as well as avoid the pitfalls of manual dataset versioning.

Source: @petergyang on Twitter

Forsight’s primary requirements for a dataset management solution were that it had to fit into their existing cloud infrastructure and be cost-effective. The Forsight engineering team tested a few different products and then found open source FiftyOne, which in turn led them to FiftyOne Teams.

>> FiftyOne is the open source tool for building high-quality datasets and computer vision models.

>> FiftyOne Teams inherits all the goodness of open source FiftyOne and adds collaborative features built specifically for teams, including cloud-backed media, dataset permissions, versioning, sharing, and more.

“The open source FiftyOne version stood out from other solutions because it had much more powerful features, and the FiftyOne Brain was a big bonus. FiftyOne worked really great, but as a team we needed additional features like central permission management and access control, and that’s where FiftyOne Teams comes into the picture,” explained Ivan.

FiftyOne Teams Makes Dataset Management Easy

Forsight is using FiftyOne Teams as their central dataset management system where all the data is aggregated and consumed by all of their machine learning pipelines. “It’s a great thing to have centralized dataset management in the form of FiftyOne Teams, which really makes our lives easier when curating datasets and training new models. All the team members have the same view on the datasets, which ensures that everyone understands the data used to train the models. This really saves a few hours of back and forth between team members,” said Ivan.

Forsight’s R&D team trains different models to perform tasks such as detection of personal protective equipment, intrusion detection, construction vehicle detection, fire detection, semantic segmentation, and more.

Source: Forsight’s FiftyOne Teams App

Their lightweight CV and ML algorithms run on-premises on edge devices that process the CCTV camera streams in real time. For typical deployments, the edge devices are powered by NVIDIA’s embedded Jetson technology, and for larger deployments, Forsight uses NVIDIA GPUs that can process 25+ camera streams in parallel in real time. Video feeds and inferred metadata are then sent to AWS, Forsight’s cloud provider. They use Amazon EC2 instances to train their ML models and Amazon S3 buckets for media storage. Ivan added, “We wanted to have high throughput between our S3 buckets and the EC2 instances, which we are able to get from FiftyOne Teams.”
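To put the real-time requirement in numbers, the short Python sketch below combines two figures from this story: the rate of ten processed frames per second per camera and the 25 parallel camera streams of a larger deployment. Applying the ten-per-second rate to a 25-stream deployment is an assumption made for illustration, and `frame_budget_ms` is a hypothetical helper, not part of any Forsight or FiftyOne API.

```python
def frame_budget_ms(num_streams: int, frames_per_second_per_stream: float) -> float:
    """Average time available per frame, in milliseconds, when a single
    device has to keep up with every stream (illustrative helper only)."""
    if num_streams <= 0 or frames_per_second_per_stream <= 0:
        raise ValueError("stream count and frame rate must be positive")
    total_frames_per_second = num_streams * frames_per_second_per_stream
    return 1000.0 / total_frames_per_second


# Assumed workload: 25 camera streams, each processed 10 times a second.
num_streams = 25
rate_per_stream = 10

total_frames_per_second = num_streams * rate_per_stream
budget = frame_budget_ms(num_streams, rate_per_stream)

print(f"{total_frames_per_second} frames/s, {budget} ms per frame on average")
```

Under these assumptions a single device has to sustain 250 frames per second, or 4 ms per frame on average. Batching frames across streams changes the latency of an individual frame but not the average throughput the device must deliver.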

In addition to integrating with AWS, Forsight also integrated FiftyOne Teams with CVAT for annotation, PyTorch for model training loops, and ClearML for experiment tracking.

Overall, Forsight’s R&D team works with about 1.5 TB of data (mostly images, but some video datasets) on a daily basis, and all that data goes through FiftyOne Teams.

Better data, better visibility, better models

FiftyOne exists to give developers and scientists comprehensive visibility into their datasets, so they can curate better data and build better models. Because FiftyOne Teams is built on top of open source FiftyOne, it does all that and adds collaboration and access features that teams need.

Forsight uses FiftyOne to automate dataset management, for example by auto-tagging suspicious samples according to rules based on mistakenness scores and embeddings. They also use FiftyOne to auto-balance their datasets and help reduce bias in their models. Adding the collaboration features of FiftyOne Teams extends dataset visibility team-wide.
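As a rough illustration of rule-based auto-tagging, the Python sketch below flags samples whose scores cross fixed thresholds. It is not Forsight's pipeline and does not use the FiftyOne API: the `Sample` class, the `tag_suspicious` function, the tag names, and the threshold values are all hypothetical, and the scores are assumed to have been computed beforehand (for example, a mistakenness score and an embedding-based uniqueness score, each in the range 0 to 1).

```python
from dataclasses import dataclass, field


@dataclass
class Sample:
    """Stand-in for a dataset sample; not a FiftyOne class."""
    filepath: str
    mistakenness: float  # assumed in [0, 1]; higher means the label is more likely wrong
    uniqueness: float    # assumed in [0, 1]; higher means less similar to other samples
    tags: set = field(default_factory=set)


def tag_suspicious(samples, mistakenness_min=0.8, uniqueness_min=0.9):
    """Tag samples that trip either rule and return the tagged samples."""
    flagged = []
    for sample in samples:
        if sample.mistakenness >= mistakenness_min:
            sample.tags.add("possible_label_error")
        if sample.uniqueness >= uniqueness_min:
            sample.tags.add("outlier")
        if sample.tags & {"possible_label_error", "outlier"}:
            flagged.append(sample)
    return flagged


# Made-up scores, for illustration only.
samples = [
    Sample("a.jpg", mistakenness=0.95, uniqueness=0.10),
    Sample("b.jpg", mistakenness=0.05, uniqueness=0.97),
    Sample("c.jpg", mistakenness=0.10, uniqueness=0.20),
]

flagged = tag_suspicious(samples)
print([s.filepath for s in flagged])  # ['a.jpg', 'b.jpg']
```

The flagged samples could then be sent for relabeling or review. The thresholds here are arbitrary; in practice they would be tuned against how many samples a team can realistically re-inspect.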

“The quality of our datasets is a key component of the performance of our models. FiftyOne helps us improve our datasets by identifying mislabeled data, adversarial samples, and unusually sized samples. FiftyOne Teams helps us track our datasets across multiple model runs and evaluate our models. We can see how our models are improving as we track model predictions on certain samples over multiple runs,” said Daniel Reiff, Machine Learning Engineer at Forsight.

“We have seen performance improvements in our models directly due to using FiftyOne Teams for dataset management. FiftyOne Teams has greatly improved the visibility of our datasets across our entire R&D team; it has made it extremely easy for multiple team members to access and collaborate on datasets,” explained Ivan.

Source: Forsight’s FiftyOne Teams App

Cost effective dataset management

The pricing model for other dataset management solutions can be complex, but with FiftyOne Teams, pricing is straightforward and includes unlimited data.

“Some companies charge per gigabyte transferred, while others charge per image stored, per model trained, and per hour — it’s so complex. The costs are maybe manageable in the beginning to attract you to the product, but as you scale and have more and more data, then you run into problems. With FiftyOne Teams, pricing is straightforward, includes unlimited data, and that’s what we like,” explained Ivan.

FiftyOne Teams also streamlines the process of working with data, freeing up valuable engineering time. Ivan estimates that the time saved with FiftyOne Teams is equivalent to half an engineer, which means the team can spend more time building new features and less time wrangling data.

Enabling quick iteration

The Forsight R&D team iterates quickly on their models and uses FiftyOne Teams to help.

“There are a ton of features in FiftyOne Teams that I use to iterate quickly on our models. For example, there are a lot of features of the FiftyOne Brain that I find incredibly useful, including mistakenness and uniqueness that help me quickly identify suspicious samples to send to our labelers, get things turned around, and retrain the model.”

Daniel gave a shout out to another one of his favorite features — heatmaps. “My favorite FiftyOne Teams feature is heatmaps! At Forsight we love using Grad-CAM heatmaps to interpret what our models have learned. Intuitively, we use these heatmaps to build more adversarial samples into our datasets so we can rapidly improve our models. Being able to view these heatmaps in FiftyOne Teams is a great advantage because we can tag samples where the model has not learned the critical regions and build sets of adversarial samples,” said Daniel.

Check out this article to learn more about how Forsight uses Grad-CAM heatmaps.

Conclusion

Forsight uses cutting-edge AI vision technology to build solutions for jobsites that address challenges regarding safety, security, management, and more. The Forsight R&D team needed a dataset management solution and tested out a few products before finding open source FiftyOne, which ultimately led them to FiftyOne Teams. FiftyOne Teams meets all their requirements, including integrating with their existing cloud infrastructure and being cost-effective. In addition, FiftyOne Teams has delivered measurable results: about 1.5 TB of visual data managed daily, engineering time savings equivalent to half an engineer, and improved datasets and model performance.
