Two things happened in Earth observation that, together, are quietly a big deal. Allen AI shipped OlmoEarth v1.2 — a foundation model for satellite imagery that got dramatically cheaper without getting worse. And FiftyOne turns the embeddings those models produce into something you can actually see, search, and trust. Pair them and you get planetary-scale insight on a laptop — and there's a notebook to prove it.
Key Takeaways
- OlmoEarth v1.2 reduces training compute 3× and inference operations 2.9× vs. v1, with no accuracy loss across 13 benchmark tasks
- It supports Sentinel-1, Sentinel-2, Landsat, and derived maps, and runs frozen on Apple Silicon with no discrete GPU required
- FiftyOne visualizes OlmoEarth embeddings with UMAP, enabling unsupervised clustering of satellite tiles without labels
- FiftyOne links map, image grid, and embedding space in a live three-way view — filter by geography and all three update in sync
- The full workflow is open: open weights, open training code, open dataset (CC BY 4.0), and a companion notebook using EuroSAT benchmark data
OlmoEarth v1.2: Same Accuracy, 3× Less Compute
OlmoEarth is a family of multimodal, spatio-temporal foundation models for satellite imagery — Sentinel-1, Sentinel-2, Landsat, plus a stack of derived maps. v1 already hit state-of-the-art across a wide battery of Earth-observation tasks. v1.2's trick is making that performance affordable:
- 3× fewer GPU-hours to train. A v1 Base run took ~2,989 GPU-hours; v1.2 Base needs ~1,012. Same data, same step count — the savings come from smarter engineering, not shortcuts.
- 2.9× fewer operations at inference on Sentinel-2 tasks. When you're mapping a continent, inference is the cost. Cutting it nearly 3× is the difference between "research demo" and "we can actually run this."
- Performance holds. Averaged across 13 benchmark tasks, scores stayed flat or nudged up. Cheaper and as good.
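As a quick sanity check, the training speedup follows directly from the two GPU-hour figures quoted above. This is only a back-of-the-envelope sketch; the function and variable names are ours, not from the OlmoEarth release.

```python
# GPU-hour figures quoted in this post for a Base-size training run.
V1_BASE_GPU_HOURS = 2989
V12_BASE_GPU_HOURS = 1012


def speedup(old_cost, new_cost):
    """Return how many times cheaper new_cost is than old_cost."""
    return old_cost / new_cost


def fraction_saved(old_cost, new_cost):
    """Return the share of the old cost that is no longer needed."""
    return 1 - new_cost / old_cost


train_speedup = speedup(V1_BASE_GPU_HOURS, V12_BASE_GPU_HOURS)
saved = fraction_saved(V1_BASE_GPU_HOURS, V12_BASE_GPU_HOURS)

print(f"{train_speedup:.2f}x fewer GPU-hours")  # 2.95x fewer GPU-hours
print(f"{saved:.0%} of training compute saved")  # 66% of training compute saved
```

So "3×" is a rounded 2.95×, or roughly two thirds of the training bill gone.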
How? A few elegant moves: collapsing each modality into a single "bandset" (fewer tokens), then recovering the lost signal with random band dropout and a nonlinear projection; a smarter masking strategy that finally uses all those information-dense map layers as training targets; and rotary positional encodings (RoPE) that erase the ugly grid-stripe artifacts v1 left in its embeddings. That last one matters more than it sounds — cleaner embeddings mean cleaner everything downstream.
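To make the RoPE idea concrete, here is a minimal one-dimensional sketch in plain Python. It is not OlmoEarth's implementation (this post does not describe how positions are laid out across space and time); the `rope` helper and the base of 10000 are assumptions borrowed from the common formulation. What it shows is the property that matters: after rotation, the dot product between two vectors depends only on how far apart their positions are, not on where they sit in the grid.

```python
import math


def rope(vec, position, base=10000.0):
    """Rotate consecutive pairs of `vec` by position-dependent angles."""
    dim = len(vec)
    if dim % 2 != 0:
        raise ValueError("RoPE needs an even number of dimensions")
    rotated = []
    for i in range(0, dim, 2):
        # Each pair gets its own frequency; later pairs rotate more slowly.
        angle = position * base ** (-i / dim)
        cos_a, sin_a = math.cos(angle), math.sin(angle)
        x, y = vec[i], vec[i + 1]
        rotated.extend((x * cos_a - y * sin_a, x * sin_a + y * cos_a))
    return rotated


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


query = [0.3, -1.2, 0.8, 0.5]
key = [1.0, 0.4, -0.7, 0.2]

# Same offset (two positions apart) at very different absolute positions.
score_near_origin = dot(rope(query, 5), rope(key, 3))
score_far_away = dot(rope(query, 105), rope(key, 103))

print(math.isclose(score_near_origin, score_far_away, abs_tol=1e-9))  # True
```

Because the score ignores absolute position, there is no fixed per-location signal baked into the embedding, which is one plausible reading of why position-locked stripe patterns would fade.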
And it's genuinely open: open weights across four sizes (Nano to Base), open training code, an open 285K-sample global dataset, and a license that bars military and extractive-industry use. You can run the whole thing frozen, as an embedding extractor, on Apple Silicon — no GPU farm required.
FiftyOne: Visualize and Search Satellite Imagery Embeddings
A foundation model gives you a pile of high-dimensional vectors. Useful — but you can't look at a 768-dimensional vector. This is where FiftyOne, Voxel51's open-source toolkit for dataset curation and visual-AI development, earns its keep.
FiftyOne is the refinery between raw data and a working model. Tens of thousands of engineers use it to do the unglamorous work that actually moves model accuracy: find and fix label errors, surface edge cases, weed out duplicates, and understand what's really in a dataset.
Embeddings you can explore. Drop OlmoEarth's vectors into the Embeddings panel, reduce with UMAP, and watch your data self-organize into clusters — without ever using a label. Lasso a cluster, and the matching images light up instantly.
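In code, the step above might look like the following sketch. It assumes `fiftyone` and `umap-learn` are installed, that `dataset` is a FiftyOne dataset, and that `embeddings` is a NumPy array with one row per sample in dataset order. The function name and the `olmoearth_umap` brain key are illustrative choices, not names from the notebook.

```python
def add_umap_projection(dataset, embeddings, brain_key="olmoearth_umap"):
    """Reduce precomputed embeddings to 2D so the Embeddings panel can plot them."""
    # Imported inside the function so this file still imports without FiftyOne.
    import fiftyone.brain as fob

    return fob.compute_visualization(
        dataset,
        embeddings=embeddings,  # shape: (num_samples, embedding_dim)
        method="umap",
        num_dims=2,
        brain_key=brain_key,  # the key you select in the Embeddings panel
    )
```

After this runs, opening the App with `fo.launch_app(dataset)` and choosing the stored brain key in the Embeddings panel should show the 2D scatterplot.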
"Find me more like this." Build a similarity index and sort the entire dataset by nearness to any sample. Mining a data lake, hunting down a model's failure neighbors — one click.
Mistakes become insight. Filter to where predictions disagree with ground truth and discover the "errors" are usually genuinely hard examples — the curation gold.
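The last two ideas, similarity search and error mining, can both be sketched in a few lines. The plain-Python helpers show the underlying logic on made-up toy data; the two functions at the bottom show what the corresponding FiftyOne calls might look like. The helper names, the `olmoearth_sim` brain key, the `predictions` and `ground_truth` field names, and the choice of cosine similarity are all assumptions for illustration.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, embeddings):
    """Return tile ids ordered from most to least similar to `query`."""
    return sorted(
        embeddings,
        key=lambda tile_id: cosine_similarity(query, embeddings[tile_id]),
        reverse=True,
    )


def find_disagreements(records):
    """Return ids of (tile_id, predicted, actual) records where the labels differ."""
    return [tile_id for tile_id, predicted, actual in records if predicted != actual]


# Toy 3-dimensional embeddings keyed by hypothetical tile ids.
tiles = {
    "forest_a": [0.9, 0.1, 0.0],
    "forest_b": [0.8, 0.2, 0.1],
    "river": [0.1, 0.9, 0.2],
    "highway": [0.0, 0.2, 0.9],
}
ranking = rank_by_similarity(tiles["forest_a"], tiles)
print(ranking)  # ['forest_a', 'forest_b', 'river', 'highway']

# Made-up predictions using EuroSAT class names.
records = [
    ("tile_001", "Forest", "Forest"),
    ("tile_002", "River", "Highway"),
    ("tile_003", "Pasture", "HerbaceousVegetation"),
]
print(find_disagreements(records))  # ['tile_002', 'tile_003']


def similar_tiles_view(dataset, embeddings, query_id, k=25, brain_key="olmoearth_sim"):
    """FiftyOne version: index precomputed embeddings, then sort by similarity."""
    import fiftyone.brain as fob  # lazy import; requires FiftyOne

    fob.compute_similarity(dataset, embeddings=embeddings, brain_key=brain_key)
    return dataset.sort_by_similarity(query_id, k=k, brain_key=brain_key)


def misclassified_view(dataset, pred_field="predictions", gt_field="ground_truth"):
    """FiftyOne version: keep samples whose predicted label differs from the truth."""
    from fiftyone import ViewField as F  # lazy import; requires FiftyOne

    return dataset.match(F(f"{pred_field}.label") != F(f"{gt_field}.label"))
```

The two views compose naturally: take a sample from `misclassified_view` and pass its id to `similar_tiles_view` to pull up the neighbors of a failure.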
The geospatial three-way link. Every tile carries a real-world coordinate, so FiftyOne's Map panel plots them on the globe. Here's the moment that lands in a live demo: we save a view filtered to Sweden — dataset.geo_within(sweden_polygon) — and instantly the map, the image grid, and the embedding scatterplot all snap to just the Swedish tiles, perfectly in sync. Color the points by land cover to read geographic patterns, or by uniqueness to see where the anomalies physically sit. Draw a box on the map by hand and everything filters live. Why it's awesome: map ↔ grid ↔ embedding space, linked three ways, is genuinely hard to replicate anywhere else.
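For reference, a geographic filter like the one above could be set up roughly as follows. This sketch stands in a crude bounding box for Sweden: the coordinates are approximate, the box also covers parts of neighboring countries, and the notebook's actual `sweden_polygon` is not shown here. The helper names and the `sweden` view name are likewise assumptions.

```python
def bbox_polygon(min_lon, min_lat, max_lon, max_lat):
    """Build a closed polygon of [longitude, latitude] pairs for a bounding box."""
    ring = [
        [min_lon, min_lat],
        [max_lon, min_lat],
        [max_lon, max_lat],
        [min_lon, max_lat],
        [min_lon, min_lat],  # GeoJSON rings repeat the first point to close
    ]
    return [ring]  # a polygon is a list of rings; the first is the outer boundary


# Very rough bounding box around Sweden (illustrative, not a real border).
sweden_bbox = bbox_polygon(11.0, 55.3, 24.2, 69.1)


def save_region_view(dataset, boundary, name="sweden"):
    """Filter a FiftyOne dataset to samples inside `boundary` and save the view."""
    # geo_within needs a GeoLocation field on each sample, for example:
    #   sample["location"] = fo.GeoLocation(point=[longitude, latitude])
    view = dataset.geo_within(boundary)
    dataset.save_view(name, view)
    return view
```

Note the coordinate order: GeoJSON, and therefore FiftyOne's `GeoLocation`, puts longitude first and latitude second.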
Explore the OlmoEarth + FiftyOne Notebook
Talk is cheap, so we built it.
The companion notebook uses OlmoEarth v1 as a frozen embedding extractor on a sample of labeled, georeferenced Sentinel-2 tiles (the EuroSAT land-cover benchmark, the same m-eurosat task from the paper), then turns FiftyOne loose on the results. It runs end-to-end on a MacBook via MPS, with no discrete GPU required.
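The frozen-extractor pattern itself is simple. The sketch below shows it in generic PyTorch terms: `model` stands for any `torch.nn.Module` that maps a batch tensor to embeddings, which is a simplification, since this post does not show OlmoEarth's loading or input API. The function names are hypothetical.

```python
def choose_device_name(mps_available, cuda_available=False):
    """Prefer Apple's MPS backend, then CUDA, then fall back to the CPU."""
    if mps_available:
        return "mps"
    if cuda_available:
        return "cuda"
    return "cpu"


def extract_embeddings(model, batches):
    """Run a frozen model over an iterable of batch tensors and stack the outputs."""
    import torch  # lazy import; requires PyTorch

    device = torch.device(
        choose_device_name(
            torch.backends.mps.is_available(),
            torch.cuda.is_available(),
        )
    )

    # "Frozen" means inference mode with no gradients and no weight updates.
    model = model.to(device).eval()
    for param in model.parameters():
        param.requires_grad_(False)

    outputs = []
    with torch.no_grad():
        for batch in batches:
            # Assumption: the model accepts a single tensor per batch.
            outputs.append(model(batch.to(device)).cpu())
    return torch.cat(outputs)
```

The resulting tensor, converted with `.numpy()`, has the one-row-per-sample shape that FiftyOne's embedding tools expect.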
Get Started with OlmoEarth and FiftyOne
OlmoEarth
FiftyOne
FAQ
What is OlmoEarth?
OlmoEarth is a family of multimodal, spatio-temporal foundation models from Allen AI built for satellite imagery — including Sentinel-1, Sentinel-2, and Landsat data.
How does OlmoEarth v1.2 differ from v1?
v1.2 reaches the same benchmark accuracy as v1 using 3× fewer GPU-hours to train and 2.9× fewer operations at inference, thanks to smarter token collapsing, improved masking, and rotary positional encodings (RoPE).
Can OlmoEarth run without a GPU?
Yes, in the sense that no discrete GPU is needed. OlmoEarth can run as a frozen embedding extractor on Apple Silicon through MPS, which uses the chip's integrated GPU. The companion notebook in this post runs end-to-end on a MacBook.
How does FiftyOne work with OlmoEarth embeddings?
FiftyOne ingests OlmoEarth's output vectors, reduces them with UMAP for visualization, and links the embedding scatterplot to an image grid and geospatial map — all in sync. You can filter by geography, sort by similarity, or lasso a cluster to surface matching satellite tiles instantly.
Is OlmoEarth open source?
Yes — open weights (Nano to Base), open training code, and an open 285K-sample global pretraining dataset under CC BY 4.0. The license explicitly bars military and extractive-industry use.