San Diego AI, ML and Computer Vision Meetup - December 4, 2025
Dec 4, 2025
5:30 - 8:30 PM
Hilton San Diego Bayfront
(Across the street from NeurIPS)
Elevation Room
1 Park Blvd
San Diego, CA
About this event
Join the Meetup to hear talks from experts on cutting-edge topics across AI, ML, and computer vision.
Schedule
Extending RT-DETR for Line-Based Object Detection: Paddle Spine Estimation in Pickleball Serve Analysis
We present a modified vision transformer–based detection model for estimating the spine line of a pickleball paddle from video data, developed to support automated serve legality analysis and motion coaching. Building on the RT-DETR architecture, we reformulated the detection head to predict two keypoints representing the endpoints of the paddle’s longitudinal axis rather than a bounding box, enabling a general framework for regressing an arbitrary number of vertices defining lines or polygons. To stabilize training, we defined a loss combining a line-IoU term with a cosine-angle regularizer that enforces geometric consistency between predicted and ground-truth orientations. Dataset curation and qualitative validation were performed using FiftyOne, allowing visual inspection of data diversity before training and of model quality after training. The model was trained and deployed end-to-end on the EyePop.ai platform, which provided data management, training orchestration, and model hosting for integration into a third-party application that performs real-time serve evaluation and feedback.
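As a rough illustration of the kind of loss described above, the following sketch combines a position term with a cosine-angle regularizer for a line given by two endpoints. It is not the speakers' implementation: the abstract does not define the line-IoU term, so a simple endpoint L1 distance stands in for it here, and the function names, the order-invariant handling of endpoints, and the `angle_weight` value are all assumptions made for this example.

```python
import math

def direction(line):
    """Unit direction vector of a line given as ((x1, y1), (x2, y2))."""
    (x1, y1), (x2, y2) = line
    dx, dy = x2 - x1, y2 - y1
    norm = math.hypot(dx, dy)
    if norm == 0.0:
        raise ValueError("degenerate line: endpoints coincide")
    return dx / norm, dy / norm

def cosine_angle_penalty(pred, target):
    """1 - |cos(theta)|: 0 for parallel lines, 1 for perpendicular lines.

    The absolute value makes the penalty independent of endpoint order
    (an assumption; the abstract does not say how ordering is handled).
    """
    px, py = direction(pred)
    tx, ty = direction(target)
    return 1.0 - abs(px * tx + py * ty)

def endpoint_l1(pred, target):
    """Mean absolute endpoint error, minimized over both endpoint orderings.

    Stand-in for the line-IoU term, which the abstract does not define.
    """
    def l1(a, b):
        return sum(abs(p - q) for pa, pb in zip(a, b) for p, q in zip(pa, pb)) / 4.0
    return min(l1(pred, target), l1(pred[::-1], target))

def line_loss(pred, target, angle_weight=0.5):
    """Position term plus weighted angle regularizer (weight is hypothetical)."""
    return endpoint_l1(pred, target) + angle_weight * cosine_angle_penalty(pred, target)
```

In this sketch a prediction that matches the ground-truth line scores zero regardless of which endpoint is listed first, while a prediction rotated away from the target is penalized by the angle term even if its endpoints stay close to the target's.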
Visual Agents: What it takes to build an agent that can navigate GUIs like humans
We’ll examine conceptual frameworks, potential applications, and future directions of technologies that can “see” and “act” with increasing independence. The discussion will touch on both current limitations and promising horizons in this evolving field.
Edge AI for Biofluid Analysis
This talk explores how compact neural networks running on low-power devices can detect and classify biological materials, with applications ranging from salt crystals in sweat and cell types in saliva to sperm motility and morphology and particle counting. The approach uses affordable research-grade microscopes along with accessible hardware such as a Raspberry Pi, microcontrollers, AI accelerators, and FPGAs. The talk will demonstrate that meaningful bioanalysis can occur entirely at the edge, lowering costs, protecting privacy, and opening the door to new home-diagnostic and health-monitoring tools.
Structured Zero-Shot Vision-Based LLM Grounding for Driving Video Reasoning
Grounding large language models (LLMs) for post-hoc dash-cam video analysis is challenging due to their lack of domain-specific inductive biases and structured reasoning. I will present iFinder, a modular, training-free framework that decouples perception from reasoning by converting dash-cam videos into hierarchical, interpretable data structures.
Using pretrained vision models and a three-block prompting strategy, iFinder enables step-wise, grounded reasoning. Evaluations on four public benchmarks show an improvement of up to 39% in accident reasoning accuracy over end-to-end V-VLMs, along with more interpretable and reliable performance.