Join our virtual meetup to hear talks from experts on cutting-edge topics across AI, ML, and computer vision.
Schedule
UISurf: Toward Universal UI Automation with Cross-Environment Agents
In this talk, we introduce UISurf, an open-source multimodal agentic UI automation platform in which agents can perceive, reason, and collaborate across browser and desktop environments to complete end-to-end tasks that require interaction with multiple user interfaces. UISurf comprises three main components: uisurf-agent, the runtime for UI automation agents; uisurf-admin, the session orchestration and management service; and uisurf-app, the full-stack user application. Its multi-agent architecture includes a planning_agent that transforms natural-language requests into structured execution plans, specialized Browser and Desktop Agents for environment-specific interaction, an automation_agent that coordinates execution and inter-agent handoff through Agent-to-Agent (A2A) communication, and a summarization_agent that produces the final task summary for the user. UISurf supports both autonomous execution and human-in-the-loop supervision, offering a practical and extensible framework for studying and deploying cross-environment UI automation.
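As a rough illustration of the control flow described above (plan, route each step to a browser or desktop agent, then summarize), here is a minimal Python sketch. It is not UISurf's actual API: the function signatures, the `Step` type, and the keyword-based routing are all assumptions made for illustration, and plain function calls stand in for A2A communication.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    """One unit of a structured execution plan (hypothetical shape)."""
    environment: str  # "browser" or "desktop"
    action: str       # natural-language description of the step


def planning_agent(request: str) -> List[Step]:
    """Toy planner: split the request on ' then ' and route by keyword.

    A real planner would presumably use a multimodal model; this stand-in
    only shows the request -> structured plan transformation.
    """
    plan = []
    for part in request.split(" then "):
        text = part.strip()
        lowered = text.lower()
        env = "browser" if ("web" in lowered or "browser" in lowered) else "desktop"
        plan.append(Step(environment=env, action=text))
    return plan


def browser_agent(step: Step) -> str:
    """Placeholder for browser-specific interaction."""
    return f"[browser] done: {step.action}"


def desktop_agent(step: Step) -> str:
    """Placeholder for desktop-specific interaction."""
    return f"[desktop] done: {step.action}"


def automation_agent(plan: List[Step],
                     workers: Dict[str, Callable[[Step], str]]) -> List[str]:
    """Coordinate execution by handing each step to the matching agent."""
    return [workers[step.environment](step) for step in plan]


def summarization_agent(results: List[str]) -> str:
    """Produce the final task summary for the user."""
    return f"Completed {len(results)} step(s): " + "; ".join(results)


def run_task(request: str) -> str:
    """End-to-end flow: plan, execute with handoffs, summarize."""
    workers = {"browser": browser_agent, "desktop": desktop_agent}
    plan = planning_agent(request)
    results = automation_agent(plan, workers)
    return summarization_agent(results)


if __name__ == "__main__":
    print(run_task(
        "download the report from the web portal "
        "then open it in the spreadsheet app"
    ))
```

Keeping the plan as plain data, separate from the agents that execute it, is one way a human-in-the-loop checkpoint could be added: the plan can be shown to a supervisor for approval before `automation_agent` runs.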
From Manual Workflows to AI-Assisted Skills: Building Reliable Internal Automation
In this session, I will discuss how teams can turn repetitive manual workflows into reliable AI-assisted and automation-driven “skills.” I will share practical lessons from building internal tools for CAD and engineering workflows, including how automation can reduce manual effort, improve consistency, and support better process control. The talk will also cover why many AI/agent experiments fail when they are not connected to real team workflows, standards, and validation steps. Attendees will walk away with a practical framework for identifying repeatable workflows, designing useful internal tools, and adopting AI assistance without losing accuracy or trust.
Building Real-World Computer Vision Systems with Voxel51
This talk will explore practical workflows for building, evaluating, and improving modern computer vision systems. We’ll dive into real-world approaches to dataset curation, model analysis, multimodal AI workflows, and production-ready vision pipelines using open-source technologies.
The session is designed for engineers, researchers, and AI practitioners looking to better understand how teams are developing and scaling computer vision applications today. Expect practical demos, technical insights, and discussions around the evolving AI tooling ecosystem.