Chicago AI, ML, and CV Meetup - July 23, 2026
Jul 23, 2026
5:30 PM - 8:30 PM CT
231 S LaSalle St 20th Floor Chicago, IL 60604
About this event
Join our in-person meetup in Chicago to hear talks from experts on cutting-edge topics across AI, ML, and computer vision.
Schedule
Building Real-World Computer Vision Systems with Voxel51
This talk will explore practical workflows for building, evaluating, and improving modern computer vision systems. We’ll dive into real-world approaches to dataset curation, model analysis, multimodal AI workflows, and production-ready vision pipelines using open-source technologies.
The session is designed for engineers, researchers, and AI practitioners looking to better understand how teams are developing and scaling computer vision applications today. Expect practical demos, technical insights, and discussions around the evolving AI tooling ecosystem.
Teaching Vision-Language Models to Read Spine MRIs
Radiologists face mounting report volumes and rising burnout, creating demand for assistive tools that can draft structured findings directly from imaging. We fine-tuned Qwen2.5-VL with LoRA on roughly 5,000 paired lumbar spine MRI studies and reports, training the model to generate findings specific to each spinal level. We evaluated the generated findings with lexical, semantic, and clinical metrics. We are extending the work with self-supervised pretraining on an additional 20,000 unlabeled studies to build domain-specific backbones for downstream tasks, including lumbar spine MRI segmentation and classification. The talk shares current results, the challenges we faced, and why evaluating structured radiology reports is harder than standard metrics suggest.
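For readers new to LoRA, the idea behind the fine-tuning step can be sketched in a few lines. This is a minimal illustration in plain Python, not the speakers' training code: the function names, the layer sizes, and the `alpha` scaling value are all assumptions chosen for the example. LoRA keeps a pretrained weight matrix W frozen and learns a low-rank update B·A, so the layer output becomes W·x + (alpha / rank)·B·(A·x).

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha):
    """Output of a linear layer with a LoRA update (hypothetical helper).

    W: frozen weights, shape d_out x d_in
    A: trainable down-projection, shape rank x d_in
    B: trainable up-projection, shape d_out x rank
    """
    rank = len(A)
    base = matvec(W, x)                # frozen pretrained path
    update = matvec(B, matvec(A, x))   # low-rank trainable path
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, update)]


def lora_param_counts(d_out, d_in, rank):
    """Return (full fine-tuning params, LoRA params) for one linear layer."""
    return d_out * d_in, rank * (d_out + d_in)


# Illustrative sizes only; these are not taken from the talk.
full, lora = lora_param_counts(4096, 4096, 8)
print(full, lora)  # 16777216 65536
```

With B initialized to zeros, as is common for LoRA, the adapted layer starts out identical to the pretrained one, and only the small A and B matrices need gradients during fine-tuning.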
Improving Efficiency of DNN Stereo Depth Estimation Models
Stereo depth estimation is a core perception capability in robotics and autonomous systems, converting rectified stereo image pairs into dense disparity and depth maps. While modern deep stereo methods achieve strong accuracy, state-of-the-art models, especially transformer-based architectures, often incur high computational and energy costs, limiting deployment on resource-constrained devices. This talk studies stereo depth estimation and proposes an efficiency-oriented modification to a transformer-based stereo pipeline that incorporates Walsh–Hadamard Transform (WHT) operations into the feature extraction stage. Specifically, we experiment with a WHT-based convolutional substitute (WHTConv2D) to reduce multiply-accumulate operations while preserving representational capacity via structured ±1 transforms. We ran inference with classical and neural stereo models on a common dataset to compare them, culminating in a STereo TRansformer (STTR) baseline and a WHTConv2D-enhanced variant. The proposed design achieves an observed 18.33% efficiency improvement relative to the baseline configuration while maintaining competitive long-range disparity accuracy.
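As background for the "structured ±1 transforms" mentioned above, here is a minimal sketch of the fast Walsh–Hadamard transform on a 1-D sequence, in plain Python. It is not the WHTConv2D layer from the talk, whose design is not described here; the function name `fwht` is an assumption for illustration. The point it shows is that the transform needs only additions and subtractions, which is why WHT-based layers can reduce multiply-accumulate counts.

```python
def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform (Hadamard ordering).

    Uses only additions and subtractions: n * log2(n) of them for an
    input of length n, where n must be a power of two.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = list(values)
    h = 1
    while h < n:
        # Butterfly step: combine elements that are h positions apart.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


signal = [1, 0, 1, 0, 0, 1, 1, 0]
print(fwht(signal))  # [4, 2, 0, -2, 0, 2, 0, 2]
```

Applying the unnormalized transform twice returns the input scaled by n, so the same routine divided by n serves as the inverse.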