February AI, ML, and Data Science Meetup

February 15, 2024 at 10 AM Pacific

Talks and Speakers

Lightning talk: The Next Generation of Video Understanding with Twelve Labs

James Le
at Twelve Labs

The evolution of video understanding has followed a trajectory similar to that of language and image understanding, with the rise of large pre-trained foundation models trained on huge amounts of data. Given the recent surge of multimodal research, video foundation models are becoming even more powerful at deciphering the rich visual information embedded in videos. This talk will explore diverse use cases of video understanding and provide a glimpse of Twelve Labs’ offerings.

About the Speaker

James Le is the Head of Developer Experience at Twelve Labs, a startup building multimodal foundation models for video understanding. Previously, he worked at ML Infrastructure startups such as Superb AI and Snorkel AI, while contributing to the popular Full-Stack Deep Learning course series. He is also the host of Datacast, a podcast featuring conversations with founders, investors, and operators in the data and AI infrastructure space to unpack the narrative journeys of their careers.

Towards Fair Computer Vision: Discover the Hidden Biases of an Image Classifier

Chenliang Xu
at University of Rochester

Recent work finds that AI algorithms learn biases from data, so identifying those biases is urgent and vital. However, previous bias identification methods rely too heavily on human experts to conjecture potential biases, which may overlook underlying biases that humans have not recognized. Is there an automatic way to assist human experts in finding biases across a broad domain of image classifiers? In this talk, I will introduce solutions.

About the Speaker

Chenliang Xu is an Associate Professor in the Department of Computer Science at the University of Rochester. His research originates in computer vision and tackles interdisciplinary topics, including video understanding, audio-visual learning, vision and language, and methods for trustworthy AI. He has authored over 90 peer-reviewed papers in computer vision, machine learning, multimedia, and AI venues.

Food Waste Classification with AI

Luka Posilović
at Kitro

One third of all food gets wasted, with millions of tons of food being thrown away each day. Food does not mean the same thing everywhere in the world: there are thousands of different meals across the world, and therefore a lot of different classes to distinguish between. In this talk, we’ll walk through the challenges of food-waste classification and see how foundation models can be useful for this task. We will also explore how we use FiftyOne to test models during development.

About the Speaker

Luka Posilović is a computer scientist with a PhD from FER, Zagreb, Croatia, working as Head of Machine Learning at Kitro. Luka and the team are working to reduce global food waste using AI.

Objects and Image Geo-localization from Visual Data

Safwan Wshah
at University of Vermont

Localizing images and objects from visual information stands out as one of the most challenging and dynamic topics in computer vision, owing to its broad applications across different domains. In this talk, we will introduce and delve into several research directions aimed at advancing solutions to these complex problems.

About the Speaker

Safwan Wshah is an Associate Professor in the Department of Computer Science at the University of Vermont. His research interests encompass the intersection of machine learning theory and application, with a particular emphasis on geo-localization from visual information. Additionally, he maintains broader interests in deep learning, computer vision, data analytics, and image processing.