In this hands-on workshop, you'll use
FiftyOne and the
High Quality Invoice Images for OCR dataset to run the full data-centric loop end-to-end: embed invoices with a modern visual document model, cluster them by structure, run LightOnOCR as your base model, and layer per-sample evaluation scores onto the embedding space to find *where* and *why* it fails. You'll then turn those insights into a curated fine-tuning view that combines low ANLS (Average Normalized Levenshtein Similarity) scores with representativeness and uniqueness filters, fine-tune LightOnOCR on it, and come back to FiftyOne to verify that the failure clusters actually got fixed.
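The curated view described above filters on three per-sample signals at once: a low evaluation score, high representativeness, and high uniqueness. Here is a minimal sketch of that selection logic in plain Python; the field names and thresholds are illustrative assumptions, not FiftyOne's API (in FiftyOne you would express the same idea as a dataset view over Brain-computed fields):

```python
# Sketch of the curation logic: keep samples the base model gets wrong
# (low ANLS) that are also representative of a cluster and not near-duplicates.
# Field names ("anls", "representativeness", "uniqueness") and thresholds
# are hypothetical, chosen for illustration only.

def curate(samples, anls_max=0.5, rep_min=0.7, uniq_min=0.6):
    """Select fine-tuning candidates from per-sample scores."""
    return [
        s for s in samples
        if s["anls"] < anls_max                  # base model fails here
        and s["representativeness"] >= rep_min   # typical of a real cluster
        and s["uniqueness"] >= uniq_min          # not a near-duplicate
    ]

invoices = [
    {"id": "inv-001", "anls": 0.32, "representativeness": 0.81, "uniqueness": 0.70},
    {"id": "inv-002", "anls": 0.95, "representativeness": 0.90, "uniqueness": 0.65},
    {"id": "inv-003", "anls": 0.41, "representativeness": 0.40, "uniqueness": 0.88},
    {"id": "inv-004", "anls": 0.28, "representativeness": 0.77, "uniqueness": 0.61},
]

selected = curate(invoices)
print([s["id"] for s in selected])  # → ['inv-001', 'inv-004']
```

Only the samples that fail all three gates at once make it into the fine-tuning set, which is exactly why the resulting subset can be small yet high-leverage.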
The punchline: a few hundred invoices chosen by combining embedding signals with evaluation metrics consistently beat thousands of randomly sampled ones.