Virtual
Americas
Workshops
Text industry
Document Visual AI with FiftyOne—When a Pixel is Worth a Thousand Tokens - November 14, 2025
Nov 14, 2025
9:00-10:30 AM Pacific
Online. Register for the Zoom!
About this event
In document understanding, a pixel is worth a thousand tokens. Traditional text-extraction pipelines tokenize and process documents sequentially, whereas modern visual AI approaches read document structure, layout, and content directly from images, which can make them more efficient, more accurate, and more robust across diverse document formats.
This hands-on workshop introduces you to document visual AI workflows using FiftyOne, the leading open-source toolkit for computer vision datasets. You'll learn how to:
  • Load and organize document datasets in FiftyOne for visual exploration and analysis
  • Compute visual embeddings using state-of-the-art document retrieval models to enable semantic search and similarity analysis
  • Leverage FiftyOne workflows including similarity search, clustering, and quality assessment to gain insights from your document collections
  • Deploy modern vision-language models for OCR and document understanding tasks that go beyond simple text extraction
  • Evaluate and compare different OCR models to select the best approach for your specific use case
Whether you're working with invoices, receipts, forms, scientific papers, or mixed document types, this workshop will equip you with practical skills to build robust document AI pipelines that harness the power of visual understanding. Walk away with reproducible notebooks and best practices for tackling real-world document intelligence challenges.
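The description does not say how the workshop scores OCR models, so the following is only a minimal sketch of one common approach: ranking models by character error rate (CER), the edit distance between the reference text and the model output divided by the reference length. The function names, the dictionary layout, and the model names in the usage note are hypothetical and are not taken from the workshop notebooks or from FiftyOne.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, using two rolling rows."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a character from ref
                curr[j - 1] + 1,     # insert a character into ref
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """CER for one document: edits needed divided by reference length."""
    if not ref:
        # Assumption: an empty reference scores 0 only if the output is empty too.
        return 0.0 if not hyp else 1.0
    return edit_distance(ref, hyp) / len(ref)


def rank_models(ground_truth: dict, predictions: dict) -> list:
    """Rank models by corpus-level CER, lowest (best) first.

    ground_truth: {doc_id: reference_text}
    predictions:  {model_name: {doc_id: ocr_text}}
    A document missing from a model's output counts as an empty string.
    """
    total_chars = sum(len(ref) for ref in ground_truth.values())
    scores = []
    for model, outputs in predictions.items():
        edits = sum(
            edit_distance(ref, outputs.get(doc_id, ""))
            for doc_id, ref in ground_truth.items()
        )
        scores.append((model, edits / total_chars if total_chars else 0.0))
    return sorted(scores, key=lambda item: item[1])
```

Calling `rank_models(ground_truth, {"model_a": {...}, "model_b": {...}})` returns `(model_name, cer)` pairs sorted from best to worst. CER weights every character equally, so for invoices or receipts where a single wrong digit matters, a field-level exact-match score may be a useful complement.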