June Computer Vision Meetup

June 8 – 10AM PDT [1:00PM EDT]

When

June 8, 2023 – 10AM PDT [1:00PM EDT, 17:00 UTC]

Where

Virtual / Zoom

Agenda

Housekeeping
Redefining State-of-the-Art with YOLOv5 and YOLOv8
Re-Annotating MS COCO, an Exploration of Pixel Tolerance
Plug-and-Play Diffusion Features for Text-Driven Image-to-Image Translation
Closing Remarks

Redefining State-of-the-Art with YOLOv5 and YOLOv8

In recent years, object detection has been one of the most challenging and demanding tasks in computer vision. YOLO (You Only Look Once) has become one of the most popular and widely used object detection algorithms because of its speed and accuracy. YOLOv5 and YOLOv8 are the latest versions of this algorithm released by Ultralytics, and they redefine what “state-of-the-art” means in object detection. In this talk, we will discuss the new features of YOLOv5 and YOLOv8, which include a new backbone network, a new anchor-free detection head, and a new loss function. These features enable faster and more accurate object detection, segmentation, and classification in real-world scenarios. We will also discuss the results of the latest benchmarks and show how YOLOv8 outperforms previous versions of YOLO and other state-of-the-art object detection algorithms. Finally, we will discuss the potential for this technology to “do good” in real-world scenarios and across various fields, such as autonomous driving, surveillance, and robotics.

Speaker: Glenn Jocher is the founder and CEO of Ultralytics. In 2014, Glenn founded Ultralytics to lead the United States National Geospatial-Intelligence Agency (NGA) antineutrino analysis efforts, culminating in the miniTimeCube experiment and the world’s first-ever Global Antineutrino Map published in Nature. Today he’s driven to build the world’s best vision AI as a building block to a future AGI, and YOLOv5, YOLOv8, and Ultralytics HUB are the spearheads of this obsession.

Re-Annotating MS COCO, an Exploration of Pixel Tolerance

The release of the COCO dataset has served as a foundation for many computer vision tasks, including object and people detection. In this session, we’ll introduce the Sama-Coco dataset, a re-annotated version of COCO focused on fine-grained annotations. We’ll also cover interesting insights and lessons learned during the annotation phase, illustrative examples, and results of some of our experiments on annotation quality, as well as how changes in labels affect model performance and prediction style.
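As a rough illustration of why pixel tolerance in annotations can matter, the sketch below measures how a small, fixed pixel offset changes the intersection-over-union (IoU) between a box and a shifted copy of itself. This is not taken from the Sama-Coco work or the speakers' experiments; the function names (box_iou, iou_after_shift) and the box sizes are hypothetical and chosen only for the example.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the overlapping region (may be empty).
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iou_after_shift(size, shift):
    """IoU between a square box of side `size` and a copy moved by `shift`
    pixels along both axes (a stand-in for a slightly looser annotation)."""
    box = (0.0, 0.0, float(size), float(size))
    moved = (box[0] + shift, box[1] + shift, box[2] + shift, box[3] + shift)
    return box_iou(box, moved)


if __name__ == "__main__":
    # Same 2-pixel offset applied to a small and a large object.
    for size in (20, 200):
        print(f"side={size:>3}px  shift=2px  IoU={iou_after_shift(size, 2):.3f}")
```

Under these assumptions, the same two-pixel offset costs a small object far more IoU than a large one, which is one reason fine-grained labels can change how a model is scored.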

Speakers: Jerome Pasquero is Principal Product Manager at Sama. Jerome holds a Ph.D. in electrical engineering, is listed as an inventor on more than 120 US patents, and has published more than 10 peer-reviewed journal and conference articles. Eric Zimmermann is an Applied Scientist at Sama helping to redefine annotation quality guidelines. He is also responsible for building internal curation tools that aim to improve how clients and annotators interact with their data.

Plug-and-Play Diffusion Features for Text-Driven Image-to-Image Translation

Large-scale text-to-image generative models have been a revolutionary breakthrough in the evolution of generative AI, allowing us to synthesize diverse images that convey highly complex visual concepts. However, a pivotal challenge in leveraging such models for real-world content creation tasks is providing users with control over the generated content. In this paper, we present a new framework that takes text-to-image synthesis to the realm of image-to-image translation — given a guidance image and a target text prompt, our method harnesses the power of a pre-trained text-to-image diffusion model to generate a new image that complies with the target text, while preserving the semantic layout of the source image. Specifically, we observe and empirically demonstrate that fine-grained control over the generated structure can be achieved by manipulating spatial features and their self-attention inside the model.

Speakers: Michal Geyer and Narek Tumanyan are master’s students at the Weizmann Institute of Science in the Computer Vision department.

Don’t Forget

Voxel51 will make a donation on behalf of the Meetup members to the charity that gets the most votes this month.
Can’t make the date and time? No problem! Just make sure to register here so we can send you links to the playbacks.

Register now to receive your invite link

Find a Meetup Near You

Join 4,200+ computer vision enthusiasts who have already become members

The goal of the Computer Vision Meetup network is to bring together a community of data scientists, machine learning engineers, and open source enthusiasts who want to share and expand their knowledge of computer vision and complementary technologies. If that’s you, we invite you to join the Meetup closest to your time zone.