Recapping the Computer Vision Meetup — July 20, 2023

We just wrapped up the July 20, 2023 Computer Vision Meetup, and if you missed it or want to revisit it, here’s a recap! In this blog post you’ll find the playback recordings, highlights from the presentations and Q&A, as well as the upcoming Meetup schedule so that you can join us at a future event. 

First, Thanks for Voting for Your Favorite Charity!

In lieu of swag, we gave Meetup attendees the opportunity to help guide a $200 donation to charitable causes. There was a two-way tie for the highest number of votes received, so we’ll be making donations of $100 to each of these organizations!

Drink Local Drink Tap is an international non-profit focused on solving water equity and quality issues through education, advocacy, and affordable, safe clean water sources.

Education Development Center advances lasting solutions to the most pressing educational, health, and workforce challenges across the globe.

Missed the Meetup? No problem. Here are playbacks and talk abstracts from the event.

DreamSim: Learning New Dimensions of Human Visual Similarity using Synthetic Data

Current perceptual similarity metrics compare images in terms of their low-level colors and textures, but fail to capture mid-level similarities in image layout, object pose, and semantic content. To address this gap, we introduce NIGHTS, a synthetic image dataset labeled with human similarity judgments, and DreamSim, a metric tuned to better align with human perception. We analyze how our metric is affected by different visual attributes, and show that it outperforms prior learned metrics and recent large vision models on retrieval and reconstruction tasks.
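To make the idea of "aligning with human perception" concrete, here is a minimal sketch of how a similarity metric can be scored against human judgments. It is not the DreamSim implementation. It assumes each judgment is a triplet (a reference image plus two candidates, with a human label saying which candidate looks more like the reference), and it uses tiny hand-made vectors in place of real image embeddings. The names `cosine_distance`, `agreement_with_humans`, and `toy_triplets` are hypothetical and exist only for this illustration.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def agreement_with_humans(triplets):
    """Fraction of triplets where the metric picks the same candidate as the human.

    Each triplet is (reference, candidate_a, candidate_b, human_choice),
    where human_choice is "a" or "b". This layout is an assumption made
    for the sketch, not a documented dataset schema.
    """
    correct = 0
    for ref, cand_a, cand_b, human_choice in triplets:
        dist_a = cosine_distance(ref, cand_a)
        dist_b = cosine_distance(ref, cand_b)
        metric_choice = "a" if dist_a < dist_b else "b"
        correct += metric_choice == human_choice
    return correct / len(triplets)


# Toy 2-D "embeddings" standing in for features from a real vision model.
toy_triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0], "a"),
    ([0.0, 1.0], [1.0, 0.0], [0.1, 0.9], "b"),
    ([1.0, 1.0], [1.0, 0.0], [0.9, 1.0], "a"),  # metric disagrees with the human here
]

score = agreement_with_humans(toy_triplets)
print(f"Agreement with human judgments: {score:.2f}")
```

A metric that is better aligned with human perception would score closer to 1.0 on held-out triplets. In practice the embeddings would come from a trained vision backbone rather than hand-written vectors.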

Stephanie Fu recently graduated from MIT with bachelor’s degrees in computer science and music and an M.Eng in computer science. Her research interests include computer vision, representation learning, and the connections between human and machine perception.

Shobhita Sundaram is a PhD student at MIT in computer science. She is interested in computer vision, particularly generative models and representation learning. Previously, she obtained her bachelor’s degree in computer science and mathematics from MIT while researching biologically inspired models for computer vision.

Netanel Tamir is an MSc student at the Weizmann Institute of Science, studying computer science. He’s interested in computer vision, representation learning and psychophysics.

Resource links

NIGHTS Synthetic Dataset and FiftyOne Demo

In this impromptu demo, we’ll use the open source FiftyOne computer vision toolset to explore NIGHTS, a synthetic image dataset labeled with human similarity judgments.

Jacob Marks is a Machine Learning Engineer and Developer Evangelist at Voxel51.

Resource links

MARLIN: Masked Autoencoder for facial video Representation LearnINg

This talk proposes a self-supervised approach to learning universal facial representations from videos that can transfer across a variety of facial analysis tasks, such as Facial Attribute Recognition (FAR), Facial Expression Recognition (FER), DeepFake Detection (DFD), and Lip Synchronization (LS). Our proposed framework, named MARLIN, is a facial video masked autoencoder that learns highly robust and generic facial embeddings from abundantly available, non-annotated, web-crawled facial videos. As a challenging auxiliary task, MARLIN reconstructs the spatio-temporal details of the face from densely masked facial regions, which mainly include the eyes, nose, mouth, lips, and skin, to capture local and global aspects that in turn help in encoding generic and transferable features. Through a variety of experiments, we demonstrate MARLIN to be an excellent facial video encoder and feature extractor that performs consistently well across diverse downstream tasks, including FAR (1.13% gain over supervised benchmark), FER (2.64% gain over unsupervised benchmark), DFD (1.86% gain over unsupervised benchmark), and LS (29.36% gain for Fréchet Inception Distance), even in the low-data regime.

Zhixi Cai is a Ph.D. student in the Data Science and Artificial Intelligence Department of Monash University IT Faculty, supervised by Dr. Munawar Hayat, Dr. Kalin Stefanov, and Dr. Abhinav Dhall.

Resource links

Unleashing the Potential of Visual Data: Vector Databases in Computer Vision

Discover the game-changing role of vector databases in computer vision applications. These specialized databases excel at handling unstructured visual data, thanks to their robust support for embeddings and lightning-fast similarity search. Join us as we explore advanced indexing algorithms and showcase real-world examples in healthcare, retail, finance, and more using the FiftyOne engine combined with the Milvus vector database. See how vector databases unlock the full potential of your visual data.
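As a rough illustration of what "similarity search over embeddings" means, the sketch below runs a brute-force cosine similarity search over a handful of toy vectors. It does not use the Milvus or FiftyOne APIs. The function names, item IDs, and three-dimensional embeddings are all hypothetical, and a vector database would typically replace the exhaustive scan with an approximate index so that search stays fast at scale.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def top_k_similar(query, index, k=2):
    """Return the IDs of the k stored embeddings most similar to the query.

    `index` maps an item ID to its embedding. Every entry is scanned,
    which is fine for a toy example but is exactly the cost that
    dedicated indexing algorithms are designed to avoid.
    """
    q = normalize(query)
    scored = []
    for item_id, embedding in index.items():
        e = normalize(embedding)
        similarity = sum(a * b for a, b in zip(q, e))
        scored.append((similarity, item_id))
    scored.sort(reverse=True)  # highest similarity first
    return [item_id for _, item_id in scored[:k]]


# Hypothetical embeddings; real ones would come from a vision model.
toy_index = {
    "xray_001": [0.9, 0.1, 0.0],
    "xray_002": [0.8, 0.2, 0.1],
    "shoe_001": [0.0, 0.1, 0.9],
    "receipt_001": [0.1, 0.9, 0.1],
}

results = top_k_similar([1.0, 0.0, 0.0], toy_index, k=2)
print(results)
```

With these toy vectors, the two X-ray embeddings rank highest because they point in nearly the same direction as the query.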

Filip Haltmayer is a Software Engineer at Zilliz working in both software and community development. His contributions mainly revolve around the Milvus and Towhee projects, where he helps develop applications and grow their respective user bases through client interaction, integrations, and technical talks.

Resource links

Join the Computer Vision Meetup!

Computer Vision Meetup membership has grown to almost 5,000 members in just under a year! The goal of the Meetups is to bring together communities of data scientists, machine learning engineers, and open source enthusiasts who want to share and expand their knowledge of computer vision and complementary technologies. 

Join whichever of the 13 Meetup locations is closest to your time zone.

We have exciting speakers already signed up over the next few months! Become a member of the Computer Vision Meetup closest to you, then register for the Zoom. 

What’s Next?

Up next, on Aug 10 at 10 AM Pacific, we have a great lineup of speakers, including:

  • Neural Congealing: Aligning Images to a Joint Semantic Atlas – Dolev Ofri-Amar at Weizmann Institute of Science
  • Advancing Personalized Medicine and Radiotherapy through AI-Enabled Computer Vision – Roushanak Rahmat, PhD, AI Researcher
  • A Practical Approach to Deep Learning for Computer Vision with TensorFlow 2 – Folefac Martins at Vinsight and ML Instructor

Register for the Zoom here. You can find a complete schedule of upcoming Meetups on the Voxel51 Events page.

Get Involved!

There are a lot of ways to get involved in the Computer Vision Meetups. Reach out if you identify with any of these:

  • You’d like to speak at an upcoming Meetup
  • You have a physical meeting space in one of the Meetup locations and would like to make it available for a Meetup
  • You’d like to co-organize a Meetup
  • You’d like to co-sponsor a Meetup

Reach out to me, Meetup co-organizer Jimmy Guerrero, or ping me on LinkedIn to discuss how to get you plugged in.

The Computer Vision Meetup network is sponsored by Voxel51, the company behind the open source FiftyOne computer vision toolset. FiftyOne enables data science teams to improve the performance of their computer vision models by helping them curate high-quality datasets, evaluate models, find mistakes, visualize embeddings, and get to production faster. It’s easy to get started in just a few minutes.