AI, Machine Learning and Computer Vision Meetup

February 20, 2025 | 10 AM Pacific

Register for the Zoom

Exploring DeepSeek’s Janus-Pro Visual Question Answer Capabilities

Harpreet Sahota

Voxel51

DeepSeek’s Janus-Pro is an advanced multimodal model designed for both multimodal understanding and visual generation, with particular emphasis on improved performance in understanding tasks. Its architecture is built on decoupled visual encoding, which lets it handle the differing representation needs of these two types of tasks more effectively.

In this talk, we’ll explore Janus-Pro’s visual question answering (VQA) capabilities using FiftyOne’s Janus-Pro VQA Plugin.

The plugin provides a seamless interface to Janus-Pro’s visual question understanding capabilities within FiftyOne, offering:

    • Vision-language tasks
    • Hardware acceleration (CUDA/MPS) when available
    • Dynamic version selection from HuggingFace
    • Full integration with FiftyOne’s Dataset and UI
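
As a rough illustration of the "hardware acceleration when available" item above, the following sketch shows one common way to pick a device in Python. It is not taken from the plugin's source: the function names `pick_device` and `detect_device` are hypothetical, and the CUDA-then-MPS-then-CPU preference order is an assumption.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Return the preferred device name, falling back to CPU.

    The preference order (CUDA, then MPS, then CPU) is an assumption
    for illustration, not the plugin's documented behavior.
    """
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Probe PyTorch for accelerators; report "cpu" if PyTorch is missing."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # Older PyTorch builds may not expose the MPS backend at all.
    mps = getattr(torch.backends, "mps", None)
    mps_ok = mps is not None and mps.is_available()
    return pick_device(torch.cuda.is_available(), mps_ok)


if __name__ == "__main__":
    print(f"Selected device: {detect_device()}")
```

Keeping the decision logic in `pick_device` separate from the PyTorch probing makes the fallback order easy to check on a machine without a GPU.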

Can’t wait to see it for yourself? Check out the FiftyOne Quickstart with Janus-Pro.

Getting the Most Out of FiftyOne Open-Source for Gen AI Workflows

Maxime Brénon

Finegrain

In this talk, we’ll explore how we maximize the potential of the FiftyOne open source SDK and App to efficiently store and annotate the training data critical to Finegrain’s Generative AI workflows. We will provide an overview of our cloud-based storage and hosting architecture, showcase how we leverage FiftyOne to train and apply models for semi-automatic data annotation, and demonstrate how we extend the CVAT integration to enable pixel-perfect side-by-side evaluation of our Generative AI models.

BIOSCAN-5M: A Multimodal Dataset for Insect Biodiversity

Scott C. Lowe

Vector Institute

Measuring biodiversity is crucial for understanding global ecosystem health, especially in the face of anthropogenic environmental changes. Rates of data collection are ever increasing, but access to expert human annotation is limited, making this an ideal use case for machine learning solutions. The newly released BIOSCAN-5M dataset features five million specimens from 47 countries around the world, with paired high-resolution images and DNA barcodes for every sample.

The dataset’s hierarchical taxonomic labels, geographic data, and long-tail distribution of rare species offer valuable resources for ecological research and AI model training. The dataset enables large-scale multimodal modelling for insect biodiversity and poses challenging machine learning problems in fine-grained classification, both for recognising known species of insects (closed-world) and for handling novel species (open-world). BIOSCAN-5M represents a significant advancement in biodiversity informatics, facilitated by the International Barcode of Life and the BIOSCAN project, and is publicly available for download via Hugging Face and PyPI.
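
To make the closed-world versus open-world distinction above concrete, here is a minimal sketch of one simple baseline: thresholding the maximum softmax probability and flagging low-confidence predictions as possibly novel. This is an illustrative assumption, not the method used by the BIOSCAN-5M authors; the species names, logits, and the 0.5 threshold are all hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_open_world(logits, known_species, threshold=0.5):
    """Return (label, confidence) for one specimen.

    Closed-world behavior: always return the best known species.
    Open-world behavior (sketched here): if the best probability is
    below `threshold`, return "novel" instead of forcing a known label.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return "novel", probs[best]
    return known_species[best], probs[best]


# Hypothetical labels and scores, for illustration only.
species = ["species_a", "species_b", "species_c"]
confident_label, _ = classify_open_world([5.0, 0.0, 0.0], species)
uncertain_label, _ = classify_open_world([0.1, 0.0, 0.05], species)
print(confident_label, uncertain_label)
```

With a long-tailed label distribution, a single global threshold is a crude choice; it is shown here only because it is the shortest way to express the idea.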

Fine Tuning Moondream2

Parsa Khazaeepou

Moondream AI

Stay tuned for the talk abstract!

Find a Meetup Near You

Join the AI and ML enthusiasts who have already become members

The goal of the AI, Machine Learning, and Computer Vision Meetup network is to bring together a community of data scientists, machine learning engineers, and open source enthusiasts who want to share and expand their knowledge of AI and complementary technologies. If that’s you, we invite you to join the Meetup closest to your timezone.