April ’23 Computer Vision Meetup
April 13, 2023 – 10AM PT

When
April 13, 2023 – 10AM PT
Where
Virtual / Zoom
Agenda
- Housekeeping
- Lightning Talk – Generating Diverse and Natural 3D Human Motions from Texts – Chuan Guo (University of Alberta)
- Emergence of Maps in the Memories of Blind Navigation Agents – Dhruv Batra (Meta & Assoc. Professor, Georgia Tech)
- Using Computer Vision to Understand Biological Vision – Benjamin Lahner (MIT)
- Closing Remarks
Lightning Talk: Generating Diverse and Natural 3D Human Motions from Texts
Automated generation of 3D human motions from text is an interesting yet challenging problem with a broad range of applications, such as VR/AR and 3D content creation. Specifically, the generated motions are expected to be diverse enough to explore the text-grounded motion space and, more importantly, to accurately depict the content of the prescribed text descriptions. Here we tackle this problem with a two-stage approach: text2length sampling and text2motion generation. Text2length involves sampling from the learned distribution function of motion lengths conditioned on the input text. This is followed by our text2motion module, which uses a temporal variational autoencoder to synthesize a diverse set of human motions of the sampled lengths. Moreover, a large-scale dataset of scripted 3D human motions, HumanML3D, is constructed, consisting of 14,616 motion clips and 44,970 text descriptions.
Chuan Guo is a fourth-year ECE PhD student at the University of Alberta.
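For readers who want a feel for the two-stage idea in the abstract above before the talk, here is a minimal toy sketch in Python. It is not the authors' implementation: the function names (`sample_length`, `generate_motion`, `text_to_motions`), the hand-written length distribution, and the random-walk "decoder" are all assumptions made for illustration. The method described in the abstract learns the length distribution from data and uses a temporal variational autoencoder for the second stage.

```python
import random


def sample_length(text, rng, unit=4, max_units=10):
    """Stage 1 (text2length): sample a motion length conditioned on the text.

    Toy assumption: the distribution is a hand-written rule in which longer
    descriptions favor longer motions. In the talk's method it is learned.
    """
    n_words = len(text.split())
    candidates = range(1, max_units + 1)
    weights = [1.0 / (1 + abs(k - n_words)) for k in candidates]
    units = rng.choices(candidates, weights=weights, k=1)[0]
    return units * unit  # length in frames


def generate_motion(text, length, rng, pose_dim=3):
    """Stage 2 (text2motion): stand-in for the temporal VAE decoder.

    Toy assumption: a Gaussian random walk over a small pose vector. A real
    decoder would also condition on the text, which this stand-in ignores.
    """
    pose = [0.0] * pose_dim
    motion = []
    for _ in range(length):
        # Fresh noise at every frame gives each sample a different trajectory.
        pose = [p + rng.gauss(0.0, 0.1) for p in pose]
        motion.append(pose)
    return motion


def text_to_motions(text, num_samples=3, seed=0):
    """Run both stages several times to get a diverse set of motions."""
    rng = random.Random(seed)
    motions = []
    for _ in range(num_samples):
        length = sample_length(text, rng)
        motions.append(generate_motion(text, length, rng))
    return motions
```

Calling `text_to_motions("a person walks forward and waves")` returns a list of motions whose lengths can differ from sample to sample, which mirrors the point of sampling the length first and only then generating a motion of that length.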
Emergence of Maps in the Memories of Blind Navigation Agents
Animal navigation research posits that organisms build and maintain internal spatial representations, or maps, of their environment. We ask if machines, specifically artificial intelligence (AI) navigation agents, also build implicit (or ‘mental’) maps. A positive answer to this question would (a) explain the surprising phenomenon in recent literature of ostensibly map-free neural networks achieving strong performance, and (b) strengthen the evidence of mapping as a fundamental mechanism for navigation by intelligent embodied agents, whether they be biological or artificial. Learn more on arXiv.
Dhruv Batra is an Associate Professor in the School of Interactive Computing at Georgia Tech (leading the Machine Learning & Perception Lab) and a Research Director in the Fundamental AI Research (FAIR) team at Meta (leading the embodied AI and robotics efforts at FAIR).
Using Computer Vision to Understand Biological Vision
In the past decade, deep neural networks (DNNs) have become a leading choice among neuroscientists to model the visual brain. While DNNs are often celebrated for their biological inspiration, they are also criticized for their lack of interpretability. In this talk we will discuss how DNNs help us understand biological intelligence, their promises and pitfalls as models of the brain, and what may be in store for the future.
Benjamin Lahner is a PhD student at MIT studying computational neuroscience.
Don’t Forget
- Voxel51 will make a donation on behalf of the Meetup members to the charity that gets the most votes this month.
- Can’t make the date and time? No problem! Just make sure to register here so we can send you links to the playbacks.
Find a Meetup Near You
Join 3,000+ computer vision enthusiasts who have already become members

The goal of the Computer Vision Meetup network is to bring together a community of data scientists, machine learning engineers, and open source enthusiasts who want to share and expand their knowledge of computer vision and complementary technologies. If that’s you, we invite you to join the Meetup closest to your time zone.