June 8 2023 Computer Vision Meetup

When

June 8, 2023 – 10AM PDT [1:00PM EDT, 17:00 UTC]

Where

Virtual / Zoom

Agenda

  • Housekeeping
  • Redefining State-of-the-Art with YOLOv5 and YOLOv8
  • Plug-and-Play Diffusion Features for Text-Driven Image-to-Image Translation
  • Performant ML Models for Edge Applications Using OpenVINO
  • Closing Remarks

Redefining State-of-the-Art with YOLOv5 and YOLOv8

In recent years, object detection has been one of the most challenging and demanding tasks in computer vision. YOLO (You Only Look Once) has become one of the most popular and widely used algorithms for object detection due to its fast speed and high accuracy. YOLOv5 and YOLOv8 are the latest versions of this algorithm released by Ultralytics, which redefine what “state-of-the-art” means in object detection. In this talk, we will discuss the new features of YOLOv5 and YOLOv8, which include a new backbone network, a new anchor-free detection head, and a new loss function. These new features enable faster and more accurate object detection, segmentation, and classification in real-world scenarios. We will also discuss the results of the latest benchmarks and show how YOLOv8 outperforms the previous versions of YOLO and other state-of-the-art object detection algorithms. Finally, we will discuss the potential for this technology to “do good” in real-world scenarios and across various fields, such as autonomous driving, surveillance, and robotics.

Speaker: Glenn Jocher is founder and CEO of Ultralytics. In 2014 Glenn founded Ultralytics to lead the United States National Geospatial-Intelligence Agency (NGA) antineutrino analysis efforts, culminating in the miniTimeCube experiment and the world’s first-ever Global Antineutrino Map published in Nature. Today he’s driven to build the world’s best vision AI as a building block to a future AGI, and YOLOv5, YOLOv8, and Ultralytics HUB are the spearheads of this obsession.

Plug-and-Play Diffusion Features for Text-Driven Image-to-Image Translation

Large-scale text-to-image generative models have been a revolutionary breakthrough in the evolution of generative AI, allowing us to synthesize diverse images that convey highly complex visual concepts. However, a pivotal challenge in leveraging such models for real-world content creation tasks is providing users with control over the generated content. In this paper, we present a new framework that takes text-to-image synthesis to the realm of image-to-image translation — given a guidance image and a target text prompt, our method harnesses the power of a pre-trained text-to-image diffusion model to generate a new image that complies with the target text, while preserving the semantic layout of the source image. Specifically, we observe and empirically demonstrate that fine-grained control over the generated structure can be achieved by manipulating spatial features and their self-attention inside the model.

Speakers: Michal Geyer and Narek Tumanyan are Master's students at the Weizmann Institute of Science in the Computer Vision department.

Performant ML Models for Edge Applications Using OpenVINO

One of the most important technology shifts of recent years is the use of edge processing to collect, process, and analyze data, and then either make decisions on-site or send data to the cloud. As millions or even billions of sensors connect to the cloud, bandwidth consumption will grow sharply, making low-latency applications infeasible in the cloud. Performing intermediate or full processing at the edge and sending only reduced information to the cloud would lessen the load on data transfer and processing; emerging 5G networks are another possible solution. Intel contributes through its OpenVINO toolkit, which helps build low-latency computing systems at the edge while retaining the accuracy of the original models.

Speaker: Paula Ramos, PhD, is an AI Evangelist at Intel. She has been developing novel integrated engineering technologies in Colombia since the early 2000s, mainly in computer vision, robotics, and machine learning applied to agriculture.

Don’t Forget

  • Voxel51 will make a donation on behalf of the Meetup members to the charity that gets the most votes this month.
  • Can’t make the date and time? No problem! Just make sure to register here so we can send you links to the playbacks.

Find a Meetup Near You

Join 3,300+ computer vision enthusiasts who have already become members

Computer Vision Meetups Worldwide, sponsored by Voxel51

The goal of the Computer Vision Meetup network is to bring together a community of data scientists, machine learning engineers, and open source enthusiasts who want to share and expand their knowledge of computer vision and complementary technologies. If that’s you, we invite you to join the Meetup closest to your timezone: