For years, the conventional wisdom in computer vision has been simple: more labeled data equals better models. As a result, teams have spent millions annotating every visual data point, wasting budgets on labels that don't actually improve performance. Yet recent research from Apple ML and MIT CSAIL shows that 6-10% error rates persist even in production systems, and teams implementing manual quality control go through an average of 5-7 review cycles before datasets are ready.
What's silently slowing visual AI development, however, is siloed data annotation workflows, which compound the problem of over-annotation. Visual AI stacks are getting crowded, with teams stitching together multiple ML tools. This fragmented pipeline creates overhead and loses context as data moves between labeling tools and the development environment, with no clear view of whether data collection and labeling campaigns are focused on the right data rather than just more data.
Today, Voxel51 is announcing new capabilities in FiftyOne Annotation that rethink these workflows, enabling faster iteration, lower costs, and more accurate models. ML teams already rely on FiftyOne—the only end-to-end open source ML platform—for auto-labeling and error detection, alongside data curation and model evaluation. Now we're adding manual annotation capabilities and automation to ease label prioritization, creation, and QA.
In open source FiftyOne, users can now create 2D and 3D labels and QA them directly within the platform — no coordination overhead or handoffs between external vendors and ML workflow tools. Understanding which data is most valuable for model training is critical for reducing annotation costs and improving model performance, so we're also releasing smart data selection capabilities that guide users in prioritizing unlabeled data for downstream model training.
Why annotation workflow gaps hinder visual AI velocity
Three critical gaps turn data annotation workflows into a development bottleneck:
No systematic data selection: Teams that rely on random sampling, guesswork, or label-everything approaches waste annotation dollars on data that doesn't necessarily improve model performance. These approaches also miss the rare scenarios and underrepresented samples crucial for production-ready models. Data and ML engineers already grapple with enormous volumes of visual data; without understanding which data actually needs labels, it's easy to overspend on low-value samples.
No methodical error detection and QA strategy: Even well-curated benchmark datasets contain 3-6% labeling errors that go undetected, and real-world annotation pipelines typically fare worse. Teams discover labeling mistakes weeks into training, forcing expensive rework that carries through the entire pipeline. The back-and-forth between ML workflow tools and annotators across multiple correction rounds compounds the problem.
No end-to-end unified workflow: Black-box annotation services are siloed from the rest of the ML workflow tooling. This fragmentation makes coordination and systematic quality control very difficult. When annotation lives in one tool and curation and model evaluation in another, it's easy for teams to lose critical context.
As a result, dev teams end up wasting time on excessive handoffs and iterations: coordinating across annotators, domain experts, and tools. And each cycle compounds those delays, turning annotation into a development bottleneck.
Treating data annotation as a core ML data understanding challenge, tightly integrated with the rest of the ML workflow, drives development efficiency. Strategic selection of which data samples to label and validate lets teams handle end-to-end ML workflows systematically in a single platform.
FiftyOne Annotation: Automated and human-in-the-loop ML workflows that deliver reliable model improvements
FiftyOne Annotation eliminates these development gaps by fundamentally rethinking how annotation fits into the ML workflow and tooling. Instead of treating annotation as a standalone task, FiftyOne unifies strategic data selection, intelligent automation, and human expertise with data curation and model evaluation.
Join us on Feb 18, 2026 @ 10 am PT to learn how feedback-driven annotation pipelines can reduce labeling effort and downstream model failures.
Save your seat →
ML data management: Strategic data selection to label what matters
The most expensive annotation mistake in ML data management is labeling data that doesn't improve your model. Traditional approaches randomly sample datasets or label everything. Both waste budget on redundant images while missing the edge cases that actually matter.
FiftyOne's embedding and model failure analysis workflows help teams break this pattern by surfacing underrepresented samples, edge cases, and unique scenarios worth labeling. For teams working with large unlabeled datasets, zero-shot coreset selection offers an even earlier intervention, automatically prioritizing which samples will contribute the most to model performance before any labeling begins.
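As a concrete illustration, here is a minimal sketch of embedding-driven selection using the FiftyOne Brain's uniqueness scores; the dataset name and the 500-sample batch size are placeholders for your own project.

```python
# A minimal sketch: rank unlabeled samples by how much unique information
# they add, then send only the top candidates to annotation.
import fiftyone as fo
import fiftyone.brain as fob

dataset = fo.load_dataset("my-unlabeled-dataset")  # assumed to already exist

# Score each sample's uniqueness using embeddings from a pre-trained model
fob.compute_uniqueness(dataset)

# Prioritize the most unique samples for the next labeling batch
to_label = dataset.sort_by("uniqueness", reverse=True).limit(500)

# Review the candidates in the App before kicking off annotation
session = fo.launch_app(to_label)
```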
Our newly released ML research paper, Zero-Shot Coreset Selection via Iterative Subspace Sampling, demonstrates the approach. Using pre-trained foundation models to analyze unlabeled data, the technique scores each image based on the unique information it contributes—then filters out redundant examples you would otherwise pay to label unnecessarily.
Benchmarks on ImageNet indicate that this technique achieves the same model accuracy with just 10% of the training data, eliminating annotation costs for over 1.15 million images.
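The sketch below conveys the flavor of zero-shot coreset selection under stated assumptions: it embeds unlabeled images with a pre-trained foundation model and then applies a simple k-center greedy selection, which is a stand-in for, not a reimplementation of, the paper's iterative subspace sampling. The dataset name, model choice, and 10% budget are illustrative.

```python
# A simplified stand-in for zero-shot coreset selection: embed unlabeled data
# with a pre-trained foundation model, then greedily pick a small, diverse
# subset to label (k-center greedy, not the paper's iterative subspace sampling).
import numpy as np
import fiftyone as fo
import fiftyone.zoo as foz

dataset = fo.load_dataset("my-unlabeled-dataset")  # assumed to already exist
model = foz.load_zoo_model("clip-vit-base32-torch")
embeddings = dataset.compute_embeddings(model)  # (num_samples, dim) array

budget = int(0.10 * len(dataset))  # e.g., label only 10% of the data

# k-center greedy: repeatedly add the sample farthest from the current coreset
selected = [0]
dists = np.linalg.norm(embeddings - embeddings[0], axis=1)
for _ in range(budget - 1):
    idx = int(np.argmax(dists))
    selected.append(idx)
    dists = np.minimum(dists, np.linalg.norm(embeddings - embeddings[idx], axis=1))

ids = dataset.values("id")
coreset = dataset.select([ids[i] for i in selected])  # the samples to annotate
```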
Manual annotation: Create 2D & 3D labels and QA them without leaving your workflow
Foundation model-based auto-labeling handles common labeling tasks, but complex scenarios still require human expertise. The problem is what happens next: most teams export data to external annotation tools, wait for labels, import them back, discover mistakes, and then repeat the cycle. Each handoff introduces delays, format conversions, and lost context.
FiftyOne Annotation removes this friction by integrating manual labeling and QA directly into the platform where you visualize data and evaluate ML models, eliminating the tool-switching that slows annotation.
Create and modify 2D and 3D labels
Annotating 2D and 3D data together in a single interface ensures label accuracy and consistency. FiftyOne's 3D annotation workflow supports creating cuboids with depth-of-field control and polylines, with precise, real-time rendering. The sketch after the list below shows how these label types are represented in code.
- Annotate across 2D and 3D: create new classification and detection labels, including 2D bounding boxes and 3D polylines and cuboids for point clouds, with the ability to move between 2D and 3D modalities with full spatial context preserved.
- Define annotation schemas and configure which fields and attributes can be viewed or edited.
- Perform editing tasks directly in the FiftyOne App, such as adjusting label properties, modifying non-label metadata, or deleting labels entirely.
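For orientation, here is a minimal sketch of the label types involved, created programmatically with FiftyOne's existing Python API; the new Annotation UI creates and edits the same kinds of labels interactively. File paths, field names, and attribute values are illustrative.

```python
# A minimal sketch of 2D and 3D label types in FiftyOne; paths and values
# are illustrative, not from a real dataset.
import fiftyone as fo

# 2D detection: bounding box in relative [x, y, width, height] coordinates
image_sample = fo.Sample(filepath="/path/to/image.jpg")
image_sample["ground_truth"] = fo.Detections(
    detections=[fo.Detection(label="car", bounding_box=[0.1, 0.2, 0.3, 0.4])]
)

# 3D cuboid on a 3D scene sample: center, size, and rotation in scene units
scene_sample = fo.Sample(filepath="/path/to/scene.fo3d")
scene_sample["ground_truth"] = fo.Detections(
    detections=[
        fo.Detection(
            label="pedestrian",
            location=[1.0, 2.0, 0.5],    # x, y, z center of the cuboid
            dimensions=[0.6, 0.6, 1.8],  # box size along x, y, z
            rotation=[0.0, 0.0, 0.3],    # rotation around x, y, z axes
        )
    ]
)

dataset = fo.Dataset("annotation-sketch")
dataset.add_samples([image_sample, scene_sample])
```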
Review and fix labeling mistakes
Labeling mistakes are one of the most prominent ML model performance bottlenecks. FiftyOne Annotation closes the loop between model failure analysis and label correction. Integrated data exploration and model evaluation let you surface samples with low confidence scores, misclassifications, or poor localization, then filter and query for similar errors across your dataset. Redraw 2D boxes, adjust 3D polylines and cuboids, modify classification labels, or delete incorrect annotations—all in place, without leaving your ML workflow.
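As a rough sketch of that loop, the snippet below assumes a dataset with "ground_truth" and "predictions" detection fields and uses FiftyOne's evaluation API to surface the samples most likely to need correction; the dataset name and confidence threshold are placeholders.

```python
# A minimal sketch: evaluate predictions, then surface likely errors for QA.
import fiftyone as fo
from fiftyone import ViewField as F

dataset = fo.load_dataset("my-dataset")  # assumed to already exist

# Mark true/false positives and false negatives per sample
dataset.evaluate_detections("predictions", gt_field="ground_truth", eval_key="eval")

# Samples with the most false positives are a good place to start review
worst_first = dataset.sort_by("eval_fp", reverse=True)

# Low-confidence predictions are another common source of mistakes
low_conf = dataset.filter_labels("predictions", F("confidence") < 0.3)

# Open the App to inspect and fix labels in place
session = fo.launch_app(worst_first)
```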
Data control for production-level annotation
FiftyOne's UX lets users easily manage their annotation schemas with fine-grained control over which fields and attributes can be viewed or edited. ML dataset versioning lets users capture the state of a dataset at a point in time (e.g., before model training or after updated annotations), making it easy to maintain audit trails as versions evolve.
Methodical error detection: Uncover labeling mistakes before they hurt performance
Even with strategic data curation and automation, annotation errors silently degrade model performance. Missing or incorrect annotations weaken model learning. FiftyOne's ML-backed techniques, including visual clustering on embeddings, similarity analysis, and mistakenness scoring, surface labeling inconsistencies and outliers. Errors are detected and prioritized for correction directly in FiftyOne.
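For example, mistakenness scoring in the FiftyOne Brain uses model predictions to estimate how likely each ground-truth label is wrong; the sketch below assumes fields named "ground_truth" and "predictions" and a placeholder dataset name.

```python
# A minimal sketch of mistakenness scoring with the FiftyOne Brain.
import fiftyone as fo
import fiftyone.brain as fob

dataset = fo.load_dataset("my-dataset")  # assumed to already exist

# Estimate, per sample, how likely the ground-truth labels contain mistakes
fob.compute_mistakenness(dataset, "predictions", label_field="ground_truth")

# Review the most suspicious samples first
suspects = dataset.sort_by("mistakenness", reverse=True).limit(200)
session = fo.launch_app(suspects)
```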
Why curation before annotation drives lower costs and better model performance
Effective data curation transforms annotation from a blind expense into a strategic investment. Before labeling anything, data curation workflows reveal which samples are redundant, which edge cases are missing, and where quality issues hide. This intelligence ensures annotation budgets target the data that enables models to reach target performance, often with 60-80% less annotated data and better real-world generalization.
After annotation, teams immediately transition to model evaluation within the same environment. FiftyOne goes beyond aggregate metrics like precision and recall to reveal where models actually fail.
Scenario analysis breaks down performance across meaningful data slices, e.g., weather conditions, lighting, object sizes, and occlusion, exposing the edge cases that impact performance.
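As a sketch of what slice-level analysis can look like, the snippet below re-runs detection evaluation on a hypothetical slice; the "weather" sample field is an assumed attribute your dataset would need to carry, and the dataset and field names are placeholders.

```python
# A minimal sketch: compare overall metrics to a difficult slice (rainy scenes).
import fiftyone as fo
from fiftyone import ViewField as F

dataset = fo.load_dataset("my-dataset")  # assumed to already exist

# Overall detection metrics
results = dataset.evaluate_detections("predictions", gt_field="ground_truth", eval_key="eval")
results.print_report()

# Metrics restricted to a hypothetical "weather" slice
rainy = dataset.match(F("weather") == "rain")
rainy_results = rainy.evaluate_detections(
    "predictions", gt_field="ground_truth", eval_key="eval_rain"
)
rainy_results.print_report()
```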
When evaluation uncovers systematic failures, teams trace them directly back to data gaps: missing scenarios, underrepresented conditions, or labeling errors. This intelligence feeds immediately back into curation and annotation, creating a tight loop that systematically improves model accuracy with each iteration, without switching ML tools or losing context.
Get started with FiftyOne
Siloed annotation compounds the over-annotation problem, driving up costs and slowing dev iterations. FiftyOne Annotation unifies strategic data selection, label creation, and error detection with data curation and model evaluation in a single platform workflow.
Built on an open source foundation, FiftyOne gives ML and data engineers complete transparency into how data is processed, the flexibility to customize and extend data curation and model workflows, and access to a thriving community of ML practitioners sharing best practices.
Join us on Feb 18, 2026 @ 10 am PT to learn how feedback-driven annotation pipelines can reduce labeling effort and downstream model failures.
Save your seat →