How FiftyOne and its plugin system turned a stranger's 3,000-image search into a few minutes of review
Part of my job as a developer advocate at Voxel51 is to keep an eye out for real, messy, unglamorous computer vision problems — the kind that never show up in benchmarks but are exactly what practitioners actually face. So when this post crossed my feed, I couldn't resist:
Hey guys. I recently lost a turbine RC plane over the desert and ran an aerial mapping grid over the search area. I now have about 3,000 high-res images to comb through and was thinking that maybe someone knew how to cook up a quick cv script that could flag stuff for manual review. I built the plane myself and it would be sick to find it.
Note: one of the images is renamed "0000" — it's a false positive that has a shadow on the top left that looks EXACTLY like the plane... (I went and checked it out irl).
Three thousand near-identical desert tiles. A handmade carbon-fiber airframe somewhere in the frame. And a known decoy — a shadow that fooled the pilot in person. This is a search problem, a data problem, and a tooling problem all at once. It's also a perfect showcase for the workflow I evangelize every day: stop writing one-off scripts, and start treating your dataset as the thing you build, query, and share.
Encapsulating the Workflow with FiftyOne
The instinct most people have here is to write a standalone OpenCV script that dumps a CSV file and a contact sheet JPEG. That gets you exactly one answer, once, with no way to iterate. FiftyOne flips the model: the dataset is the source of truth, every computation writes back onto it, and the App becomes your review surface. Tooling becomes reusable, results become inspectable, and the whole thing becomes shareable.
Concretely, three pieces of the platform carried this project: the FiftyOne Brain for embeddings and uniqueness, the Hugging Face integration for distribution, and — the part I'm most excited about — the plugin system for packaging the actual detector.
From a folder of images to a queryable dataset
Loading the survey is one call, and the FiftyOne Brain turns raw pixels into something you can search and visualize — CLIP embeddings, a similarity index, and a 2D UMAP map you can lasso in the App:
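A minimal sketch of that step, assuming FiftyOne, the Brain, and the CLIP zoo model's dependencies (PyTorch, umap-learn) are installed. The directory path, dataset name, embeddings field, and brain keys are placeholders chosen for illustration, not the names used in the actual project.

```python
# Illustrative names only; the real dataset may use different ones
EMBEDDINGS_FIELD = "clip_embeddings"
SIMILARITY_KEY = "img_sim"
VISUALIZATION_KEY = "img_viz"


def build_survey_dataset(image_dir, name="desert-survey"):
    """Load a folder of images and attach CLIP-based Brain results."""
    # Imported lazily so this module can be read without FiftyOne installed
    import fiftyone as fo
    import fiftyone.brain as fob
    import fiftyone.zoo as foz

    # One call turns the folder into a dataset
    dataset = fo.Dataset.from_dir(
        dataset_dir=image_dir,
        dataset_type=fo.types.ImageDirectory,
        name=name,
    )

    # Compute CLIP embeddings once and store them on the samples
    model = foz.load_zoo_model("clip-vit-base32-torch")
    dataset.compute_embeddings(model, embeddings_field=EMBEDDINGS_FIELD)

    # Reuse the stored embeddings for every Brain run
    fob.compute_similarity(
        dataset, embeddings=EMBEDDINGS_FIELD, brain_key=SIMILARITY_KEY
    )
    fob.compute_visualization(
        dataset,
        embeddings=EMBEDDINGS_FIELD,
        method="umap",
        num_dims=2,
        brain_key=VISUALIZATION_KEY,
    )
    fob.compute_uniqueness(dataset, embeddings=EMBEDDINGS_FIELD)
    return dataset
```

Calling `fo.launch_app(build_survey_dataset("/path/to/survey"))` would then open the App with the embeddings plot available for lassoing.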
The detector, packaged as a plugin
Here's the heart of it. The desert background is almost uniform mid-gray, and the only dark features are scrubby vegetation — so tone alone can't separate the plane from a bush. The detector instead scores each dark blob on shape and surface character: a carbon-fiber airframe is elongated, fills its bounding box, has a smooth low-variance interior, and is solid and convex. Vegetation is none of those.
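To make the scoring idea concrete, here is a hedged, pure-Python sketch of how the four cues could be combined into one score. It is not the plugin's actual formula: the geometric mean, the `aspect_target` and `std_max` parameters, and the `extent` feature name are assumptions made for illustration, while `rect_aspect`, `interior_std`, and `solidity` mirror attribute names the detector writes.

```python
from dataclasses import dataclass


@dataclass
class BlobFeatures:
    rect_aspect: float   # long side / short side of the fitted rectangle (>= 1)
    extent: float        # blob area / bounding-box area, in [0, 1] (hypothetical name)
    interior_std: float  # std-dev of gray levels inside the blob (0-255 scale)
    solidity: float      # blob area / convex-hull area, in [0, 1]


def _clamp01(x):
    return max(0.0, min(1.0, x))


def plane_score(f, aspect_target=3.0, std_max=25.0):
    """Combine the four cues into a score in [0, 1].

    aspect_target and std_max are hypothetical tuning knobs.
    """
    elongation = _clamp01((f.rect_aspect - 1.0) / (aspect_target - 1.0))
    smoothness = _clamp01(1.0 - f.interior_std / std_max)
    terms = (elongation, _clamp01(f.extent), smoothness, _clamp01(f.solidity))

    # Geometric mean: a blob that fails any single cue scores near zero
    product = 1.0
    for t in terms:
        product *= t
    return product ** (1.0 / len(terms))


# Made-up feature values for an airframe-like blob and a bush-like blob
airframe = BlobFeatures(rect_aspect=3.2, extent=0.8, interior_std=6.0, solidity=0.95)
bush = BlobFeatures(rect_aspect=1.2, extent=0.45, interior_std=30.0, solidity=0.6)
```

A multiplicative combination is one reasonable design choice here because vegetation tends to fail several cues at once, so it is pushed well below anything that passes all four.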
I could have left that as a script. Instead, I wrapped it as a FiftyOne plugin, and that decision is the whole point of this post. A plugin exposes the logic as operators that run identically from the Python SDK and from the App's UI, can be delegated to run in the background, and can be installed by anyone with one command:
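On the command line that is `fiftyone plugins download <github-url>`. For anyone who prefers to stay in Python, a rough equivalent is sketched below; the function name is hypothetical and the repository URL is left as a parameter.

```python
def install_plugin(github_url):
    """Download a plugin from GitHub, mirroring `fiftyone plugins download`."""
    # Imported lazily so this module can be read without FiftyOne installed
    import fiftyone.plugins as fop

    fop.download_plugin(github_url)

    # Report what is now available locally
    return fop.list_downloaded_plugins()
```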
The plugin ships two operators:
- preview_dark_mask — a calibration helper that writes the thresholded dark mask back as a heatmap overlay, so you can tune the threshold by eye in the App before committing to a full run.
- find_plane — the detector, which writes a plane_candidates detections field (score in confidence, plus per-detection attributes like rect_aspect and interior_std) and a per-image max_plane_candidates_score for ranking.
Driving them from the SDK is just get_operator and a call:
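A hedged sketch of what that looks like. The plugin name and any parameter names are placeholders, and calling the operator directly assumes the plugin defines `__call__` on its operators, which is a common convention rather than something every operator provides.

```python
def operator_uri(plugin_name, operator_name):
    """Operator URIs take the form <plugin-name>/<operator-name>."""
    return f"{plugin_name}/{operator_name}"


def run_operator(dataset, plugin_name, operator_name, **params):
    """Look up an installed operator and call it on a dataset or view."""
    # Imported lazily so this module can be read without FiftyOne installed
    import fiftyone.operators as foo

    op = foo.get_operator(operator_uri(plugin_name, operator_name))

    # Assumes the operator implements __call__(sample_collection, **params)
    return op(dataset, **params)
```

Usage would be along the lines of `run_operator(dataset, "@example/plane-finder", "find_plane")`, with the real plugin name substituted for the made-up one.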
The exact same operator, with the exact same form inputs, is one backtick-keystroke away in the App — every parameter rendered as a labeled, described field, with a checkbox to delegate execution for the full 6,500-image run. I didn't write that UI. The plugin system generated it from the operator's input schema. That parity between code and UI is, to me, the single most underrated thing about building on FiftyOne.
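To show where that generated form comes from, here is a stripped-down operator sketch. The `dark_threshold` parameter, its default, and the labels are hypothetical, and the real `find_plane` operator certainly takes more inputs than this.

```python
# Hypothetical default; the real plugin's parameters may differ
DEFAULT_DARK_THRESHOLD = 60


def define_find_plane_operator():
    """Build an illustrative operator class.

    In a real plugin the class sits at the top level of the plugin's
    __init__.py; it is wrapped in a function here only to keep the
    FiftyOne import lazy.
    """
    import fiftyone.operators as foo
    import fiftyone.operators.types as types

    class FindPlane(foo.Operator):
        @property
        def config(self):
            return foo.OperatorConfig(
                name="find_plane",
                label="Find plane candidates",
                allow_delegated_execution=True,
            )

        def resolve_input(self, ctx):
            # Each declared property becomes a labeled field in the App form
            inputs = types.Object()
            inputs.int(
                "dark_threshold",
                label="Dark threshold",
                description="Gray level below which a pixel counts as dark",
                default=DEFAULT_DARK_THRESHOLD,
            )
            return types.Property(inputs, view=types.View(label="Find plane"))

        def execute(self, ctx):
            threshold = ctx.params.get("dark_threshold", DEFAULT_DARK_THRESHOLD)
            # Detection logic would run here and write labels onto ctx.dataset
            return {"dark_threshold": threshold}

    return FindPlane
```

In the plugin's `__init__.py`, a `register(p)` function that calls `p.register(FindPlane)` is what makes the operator discoverable from both the SDK and the App.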
Turning 3,000 images into ranked shortlists
Once the scores are on the dataset, review is just querying. Saved views persist on the dataset and show up in the App's Saved Views menu, so the community member opens the data and immediately has a set of curated entry points — each attacking the search from a different angle:
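A sketch of how four of those views could be built and saved. The field names come from the detector's output, but every numeric cut-off except the shortlist size is an assumption to be tuned against the actual score distribution.

```python
SHORTLIST_SIZE = 100        # size of the first-pass shortlist
HIGH_CONFIDENCE_MIN = 0.6   # hypothetical cut-off
CARBON_MIN_ASPECT = 2.0     # hypothetical cut-off
CARBON_MAX_STD = 10.0       # hypothetical cut-off


def save_review_views(dataset):
    """Create ranked views from the detector's fields and save them."""
    # Imported lazily so this module can be read without FiftyOne installed
    from fiftyone import ViewField as F

    # Every flagged sample, best score first
    ranked = dataset.match(F("max_plane_candidates_score") > 0).sort_by(
        "max_plane_candidates_score", reverse=True
    )

    views = {
        "candidates_by_score": ranked,
        "top_shortlist": ranked.limit(SHORTLIST_SIZE),
        # Keep only strong boxes; samples left with no boxes drop out
        "high_confidence": ranked.filter_labels(
            "plane_candidates", F("confidence") > HIGH_CONFIDENCE_MIN
        ),
        # Elongated and smooth-surfaced detections
        "carbon_like": ranked.filter_labels(
            "plane_candidates",
            (F("rect_aspect") > CARBON_MIN_ASPECT)
            & (F("interior_std") < CARBON_MAX_STD),
        ),
    }

    for name, view in views.items():
        dataset.save_view(name, view, overwrite=True)

    return list(views)
```

Setting `dataset.persistent = True` beforehand keeps the dataset, and the saved views with it, around between sessions.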
Being able to filter on the detector's own per-detection attributes (rect_aspect, interior_std, solidity, bbox_diag) is what makes this expressive. I saved eight views in total:
- candidates_by_score — every flagged sample, ranked by its best score.
- top_shortlist — the top 100 candidates — the fastest first pass.
- high_confidence — only the strong detections, weak boxes dropped.
- carbon_like — the most airframe-like: elongated and smooth-surfaced.
- best_geometry — strongest pure shape, independent of surface smoothness.
- review_band — the mid-scoring "maybe" pile, so nothing borderline is missed.
- thermal_frames — the small thermal/IR frames, reviewed apart from the RGB.
- unusual_candidates — candidates ranked by Brain uniqueness — the plane is an anomaly in a sea of identical desert.
Zooming in: patch embeddings
The sharpest tool is to explode every candidate into its own crop with to_patches, then run the Brain on the crops instead of the whole frames. Each embedding now describes just the blob, so look-alikes cluster tightly: find one true patch and a similarity search pulls the rest, and ranking by uniqueness floats the real airframe above the repetitive bush and shadow patches.
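One way to sketch this with the Brain's documented patch workflow: index the detections through `patches_field`, then query the patches view. This may differ from the exact calls in the project's script, and the brain key, model choice, and `k` are illustrative assumptions.

```python
PATCHES_FIELD = "plane_candidates"
PATCH_SIMILARITY_KEY = "patch_sim"  # hypothetical brain key


def index_candidate_patches(dataset, brain_key=PATCH_SIMILARITY_KEY):
    """Embed each candidate crop and return the exploded patches view."""
    # Imported lazily so this module can be read without FiftyOne installed
    import fiftyone.brain as fob

    # patches_field makes the Brain embed each detection crop, not whole frames
    fob.compute_similarity(
        dataset,
        patches_field=PATCHES_FIELD,
        model="clip-vit-base32-torch",
        brain_key=brain_key,
    )
    return dataset.to_patches(PATCHES_FIELD)


def patches_like(patches, query_patch_id, k=25, brain_key=PATCH_SIMILARITY_KEY):
    """Return the k patches most similar to a known-good patch."""
    return patches.sort_by_similarity(query_patch_id, k=k, brain_key=brain_key)
```

Once a reviewer confirms a single true patch in the App, passing its ID to `patches_like` pulls its nearest look-alikes to the top.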
I bundled the view-building and patch-embedding step into a single publish_views.py script that recomputes everything and pushes the enriched dataset back to the Hub. One note worth flagging for anyone adapting it: because compute_embeddings spins up a multiprocessing DataLoader, the script body lives inside a main() guarded by if __name__ == "__main__" — otherwise the spawned workers re-import and re-run the whole pipeline.
Why the plugin system is the real story
Strip away the desert and the drone, and what this project really demonstrates is what the FiftyOne plugin system unlocks for developers:
- Write once, run anywhere — one operator definition runs from the SDK in a notebook, from a script, and as a point-and-click panel in the App — with the UI generated from the input schema.
- Scale without rewrites — the same operator runs inline on a small view or as a delegated background job over thousands of full-res images, just by toggling a flag.
- Shareable by design — fiftyone plugins download <github-url> means the community member runs my exact detector without copying code or matching my environment by hand.
- Composable with the rest of the platform — the plugin's output is ordinary FiftyOne labels, so it immediately works with views, the Brain, similarity search, and the Hub.
That's the pitch I make to developers all the time, and this is it in miniature: the gap between "I hacked together a CV script" and "I shipped a reusable, UI-enabled, shareable tool" is a thin plugin wrapper around code you've already written.
Did it find the plane?
The tooling did its job: it collapsed a 3,000-image haystack into a handful of short, ranked lists a human can review in minutes, kept the notorious "0000" shadow decoy honest, and shipped as a dataset and a plugin anyone can pull down and rerun. Whether the airframe turns up is now a question of eyes on a shortlist rather than endurance — which is exactly where you want a search-and-rescue problem to land.
If you want to try it yourself: pull the dataset from the Hub, install the plugin from GitHub, and start at the top of top_shortlist. And if you build something like this, come share it with the Voxel51 community; the next messy real-world problem is always around the corner.
And if the plane does turn up, somewhere a Redditor owes the internet fifty bucks.