Computer vision has rapidly become one of the most transformative
technologies in modern manufacturing and industrial automation. Applications range from robotic guidance in bin picking and assembly, to
defect detection in quality assurance, to condition monitoring and predictive maintenance of machinery. These systems leverage advances in deep learning, edge deployment, and real-time visual AI processing to enable faster cycle times, reduce scrap rates, and improve overall equipment effectiveness.
In this article, we’ll explore some of the most impactful computer vision applications in industry, with a focus on manufacturing. We’ll also
highlight leading companies building computer vision solutions for manufacturing, along with datasets and resources that developers can use to build and optimize their own systems.
Current Challenges in Manufacturing
Despite rapid innovation, the manufacturing industry is grappling with significant challenges that threaten efficiency, safety, and long-term growth. These pressures are driving interest in computer vision applications in industry, as companies look for scalable automation solutions.
- Labor shortages: A 2024 report by Deloitte and The Manufacturing Institute projects that U.S. manufacturing will need as many as 3.8 million new employees between 2024 and 2033, driven by growth, retirements, and new investments, and that 1.9 million of those jobs could go unfilled.
- Rising costs and inflation: Material, energy, and transportation costs continue to climb, forcing manufacturers to do more with less. The pressure to optimize workflows and eliminate inefficiencies has never been greater.
- Complexity and diversity: Manufacturing touches nearly every product we use—from screws and shoes to cars and electronics. Each product requires unique processes, tools, and quality controls. Adding to the challenge, every factory uses its own mix of sensors, cameras, and machines, meaning that one-size-fits-all solutions rarely work.
These challenges highlight why computer vision has become a necessity for manufacturing businesses. From improving worker safety to automating complex tasks, computer vision solutions are helping manufacturers adapt to an increasingly demanding industrial landscape.
Applications of computer vision in manufacturing
1. Bin Picking: The foundational computer vision industrial application
A common industrial robotics application,
bin picking is the action of selecting an object from a bin, picking it up, and placing it in another location.
For a robot to be successful with this task, it needs to precisely navigate and interact with objects of different shapes, sizes, and materials in a potentially cluttered, occluded, and poorly lit environment. Machine vision systems make this possible by mapping the environment and guiding the robotic arm’s motion.
In the simplest cases, camera images are passed into object detection routines. In many cases, however, grabbing an object from a bin requires mapping depth information and performing 3D object detection to situate the object in three-dimensional space. Some computer vision systems use point clouds from LiDAR sensors to generate these 3D representations. One additional complication is that some objects can only be easily grabbed and held in certain orientations. To overcome this, some machine vision systems
estimate the “pose” of objects and use this information to orient the robotic arm for picking.
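As a rough illustration of pose estimation, the sketch below estimates an object's position and dominant orientation from a 3D point cloud using principal component analysis (PCA). It assumes the points have already been segmented down to a single object. The function name `estimate_pose_pca` and the synthetic point cloud are hypothetical, and production systems typically rely on more robust techniques such as model-based registration.

```python
import numpy as np


def estimate_pose_pca(points):
    """Return (centroid, axes) for an (N, 3) point cloud.

    The rows of `axes` are unit vectors ordered from the direction of
    greatest spread to the direction of least spread.
    """
    points = np.asarray(points, dtype=float)
    centroid = points.mean(axis=0)
    centered = points - centroid
    # The right singular vectors of the centered points are the principal axes.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centroid, vt


# Synthetic elongated part: long along x, thin along y and z (hypothetical data).
rng = np.random.default_rng(0)
spread = np.array([50.0, 5.0, 1.0])
center = np.array([100.0, 20.0, 300.0])
cloud = rng.normal(size=(500, 3)) * spread + center

centroid, axes = estimate_pose_pca(cloud)
long_axis = axes[0]  # candidate direction along which to align the gripper
```

The sign of each principal axis is arbitrary, so a real system would need an extra convention (for example, pointing the axis toward the camera) before commanding the arm.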
Here are a few papers on computer vision in bin picking:
2. Palletizing and Depalletizing: Streamline industry practices through computer vision
Pallets have been called the unsung heroes of our modern age. These flat, typically wooden, platforms are essential to the scale and economy of global logistics and transportation. Before pallets make their way to shipping containers, they are loaded up with goods. These stacks of goods can reach
up to 15 feet tall and weigh more than two tons. The process of loading and stacking products onto a pallet is known as
palletizing. Similarly, once products have reached their destination, they must be unloaded from the pallet. This unloading process is referred to as
depalletizing.
To reduce injury and error, manufacturers have been automating palletizing and depalletizing tasks through the use of robotic arms equipped with computer vision systems. Computer vision is a great fit for this manufacturing problem, as the items being loaded and unloaded are typically nearly identical, so object detection models can be trained to very high accuracy.
Another crucial enabling element is calibration. As the robot arm loads or unloads items, it takes note of the disparity between where it estimated the object to be located and where it was actually located as feedback to refine its future predictions.
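One simple way to picture this feedback loop is a running estimate of the systematic error between predicted and actual positions. The sketch below is a minimal illustration, assuming the error is a constant offset and using an exponential moving average; the class `OffsetCorrector`, the smoothing factor, and the numbers are all hypothetical, and real calibration routines are usually more involved.

```python
import numpy as np


class OffsetCorrector:
    """Tracks a running estimate of the systematic error in predicted positions."""

    def __init__(self, smoothing=0.2):
        self.smoothing = smoothing   # weight given to each new observation
        self.offset = np.zeros(3)    # current estimate of (actual - predicted)

    def correct(self, predicted):
        """Apply the learned offset to a raw vision estimate."""
        return np.asarray(predicted, dtype=float) + self.offset

    def update(self, predicted, actual):
        """Blend the newly observed error into the running offset."""
        error = np.asarray(actual, dtype=float) - np.asarray(predicted, dtype=float)
        self.offset = (1 - self.smoothing) * self.offset + self.smoothing * error


corrector = OffsetCorrector(smoothing=0.2)
true_bias = np.array([2.0, -1.0, 0.5])  # hypothetical systematic error, in mm

for _ in range(30):
    predicted = np.array([100.0, 200.0, 50.0])  # raw estimate from the vision system
    actual = predicted + true_bias              # where the item was actually found
    corrector.update(predicted, actual)

corrected = corrector.correct([100.0, 200.0, 50.0])
```

A smaller smoothing factor makes the correction less sensitive to one-off measurement noise, at the cost of adapting more slowly when the setup drifts.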
A few resources to get you started:
3. Machine Tending: Using computer vision to reduce risk and improve consistency
Machine tending is the process of automatically loading raw material or components for input into a machine. This includes placing parts on a conveyor belt, as well as preparing materials for welding, grinding, milling, or injection molding.
Automated machine tending has multiple advantages, from reducing injury risk to improving consistency. Computer vision-enabled automation in machine tending also allows for higher precision in these applications. With real-time monitoring, object localization can be used to precisely situate the input materials relative to the machine being tended, and a robotic arm can adjust positioning accordingly.
As with many computer vision applications in manufacturing, data availability and quality are significant challenges in machine tending. Machine vision systems for machine tending often need to be built with limited labeled training data. This means that data cleaning and curation are essential, and techniques like data augmentation and transfer learning can be incredibly important.
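To make data augmentation concrete, here is a minimal NumPy sketch that produces randomly flipped and brightness-shifted copies of an image. The function name `augment`, the flip probabilities, and the brightness range are illustrative assumptions, and the random array merely stands in for a real photo.

```python
import numpy as np


def augment(image, rng):
    """Return a randomly flipped and brightness-shifted copy of a uint8 image."""
    out = image.astype(np.float32)
    if rng.random() < 0.5:
        out = out[:, ::-1]             # horizontal flip
    if rng.random() < 0.5:
        out = out[::-1, :]             # vertical flip
    out = out * rng.uniform(0.8, 1.2)  # brightness scaled by up to +/- 20%
    return np.clip(out, 0, 255).astype(np.uint8)


rng = np.random.default_rng(42)
image = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)  # stand-in for a real photo

augmented = [augment(image, rng) for _ in range(8)]
```

For detection or localization tasks, any bounding boxes or keypoints would need to be flipped along with the image, and flips only make sense when the flipped scene is still physically plausible.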
Here are a few preliminary resources:
4. Defect Detection: Supporting quality control with computer vision
Computer vision has already become indispensable for quality control throughout industrial processes. Instance segmentation, for example, is used in conjunction with high-resolution sensor data to check whether a manufactured part has the desired spatial dimensions within an allowed tolerance.
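A dimensional check of this kind can be sketched as follows, assuming a segmentation model has already produced a binary mask for the part and that the camera's scale (millimeters per pixel) is known. The function names, the scale, and the nominal dimensions are hypothetical values chosen for illustration.

```python
import numpy as np


def mask_extent_mm(mask, mm_per_pixel):
    """Return (width_mm, height_mm) of the bounding box around a binary mask."""
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:
        raise ValueError("mask is empty")
    height_px = rows[-1] - rows[0] + 1
    width_px = cols[-1] - cols[0] + 1
    return width_px * mm_per_pixel, height_px * mm_per_pixel


def within_tolerance(measured, nominal, tolerance):
    """True if the measurement is within +/- tolerance of the nominal value."""
    return abs(measured - nominal) <= tolerance


# Hypothetical segmentation mask: a 120 x 40 pixel part in a 200 x 200 image.
mask = np.zeros((200, 200), dtype=bool)
mask[80:120, 30:150] = True

width_mm, height_mm = mask_extent_mm(mask, mm_per_pixel=0.25)
width_ok = within_tolerance(width_mm, nominal=30.0, tolerance=0.5)
height_ok = within_tolerance(height_mm, nominal=12.0, tolerance=0.5)
```

Here the part measures 30 mm by 10 mm, so it passes the width check but fails the height check against a 12 mm nominal.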
One area of quality control where computer vision features prominently is defect detection. Manufacturers want to identify defective parts and products as early in their journey as possible. In some applications, where the possible varieties of defects are known, object detection and classification are used to identify problems.
In other cases, the full variety of possible defects is either unknown or too complicated to easily categorize. In manufacturing, defects can range from minutiae like small scratches to entirely missing components, like a missing screw. The problem is often exacerbated by class imbalance, where there are many more examples of “normal” products than “defective” products.
Anomaly detection provides an alternative, unsupervised approach that takes only normal examples as training input and classifies a new product as normal, or “nominal”, if it is likely to have come from the same
distribution as the previously seen normal samples. To make this determination, anomaly detection models learn an approximate representation of this nominal distribution, which may involve using density-based models like
DBSCAN,
support vector machines, or deep learning models like
autoencoders and generative adversarial networks (
GANs). Libraries like Intel’s
Anomalib provide tools for implementing and benchmarking anomaly detection algorithms.
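To show the idea of learning a nominal distribution in its simplest form, the sketch below fits a single multivariate Gaussian to feature vectors from normal parts and scores new parts by Mahalanobis distance. This is a deliberately simple stand-in for the models named above, not how Anomalib or any particular product works; the class name, the 99% threshold, and the random features (standing in for image embeddings) are assumptions.

```python
import numpy as np


class GaussianAnomalyDetector:
    """Models nominal feature vectors with a single multivariate Gaussian."""

    def fit(self, nominal_features):
        x = np.asarray(nominal_features, dtype=float)
        self.mean_ = x.mean(axis=0)
        cov = np.cov(x, rowvar=False)
        # A small ridge term keeps the covariance invertible.
        cov += 1e-6 * np.eye(cov.shape[0])
        self.precision_ = np.linalg.inv(cov)
        # Flag anything farther out than 99% of the nominal training samples.
        self.threshold_ = np.quantile(self.score(x), 0.99)
        return self

    def score(self, features):
        """Mahalanobis distance to the nominal distribution (higher = more anomalous)."""
        d = np.atleast_2d(np.asarray(features, dtype=float)) - self.mean_
        return np.sqrt(np.einsum("ij,jk,ik->i", d, self.precision_, d))

    def is_anomalous(self, features):
        return self.score(features) > self.threshold_


rng = np.random.default_rng(0)
nominal = rng.normal(loc=0.0, scale=1.0, size=(1000, 4))  # stand-in for image embeddings

detector = GaussianAnomalyDetector().fit(nominal)
typical_part = np.zeros(4)
odd_part = np.full(4, 6.0)
flags = detector.is_anomalous(np.stack([typical_part, odd_part]))
```

Note that only nominal samples are used for fitting, which is what makes this approach workable when defective examples are rare or unknown in advance.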
Here are a few papers on defect detection and anomaly detection in manufacturing:
5. Predictive Maintenance: Building smarter industrial processes
Tools and machinery in manufacturing plants gradually accumulate wear and tear which, if left untreated, could lead to complete breakdown, as well as potential injuries and lost productivity. To avoid such failure, manufacturers have historically performed routine maintenance, cleaning, refurbishing, and changing out old parts on a regular basis.
But routine maintenance has two predominant downsides: first, it introduces downtime and costly upkeep when maintenance may not be necessary; second, it is not able to pick up on issues that arise and escalate between maintenance intervals.
Predictive maintenance (PdM) seeks to address these issues with more active monitoring. Computer vision and predictive analytics help manufacturers to save on maintenance costs and avert catastrophe. PdM is already applied to a wide variety of machines and mechanical parts, from blades and bearings to gears and gaskets.
In some cases, as with the saw blade pictured above, computer vision techniques like segmentation, object detection, and classification suffice to identify and predict all likely failure modes. Common signs include cracks, corrosion, and leaks. As with defect detection, when the spectrum of failure modes is complex or unknown, anomaly detection is applied to the state of the machines.
Companies driving industrial innovation with computer vision
Mech-Mind Robotics
With more than 700 employees, 1000+ customers, and more than $200M in funding,
Mech-Mind Robotics is the largest 3D vision company in China, and one of the world’s largest providers of 3D vision cameras and machine vision software for robotic automation.
Mech-Mind’s integrated hardware and software solutions are used in a wide range of manufacturing applications, including bin picking, machine tending, palletizing and depalletizing, assembly, and gluing. The Mech-Eye industrial 3D camera uses
structured light technology to generate high resolution, high accuracy point clouds.
Mech-Mind’s Mech-Vision machine vision software provides a platform for customers to build industrial computer vision applications with Mech-Eye cameras and their own robots. Mech-Vision has built-in support for common computer vision tasks like pose adjustment and 2D and 3D matching. In matching, the customer generates a point cloud model of the object to be recognized, either from a CAD file or directly from a camera image, and the software locates this model in the scene. Built-in support for common industrial robots means that the
robot calibration process,
while traditionally time-consuming (on the scale of hours), can be completed in less than 20 minutes.
To round things out, Mech-Mind's deep learning software allows customers to fine-tune computer vision models for their specific use cases. Customers load in their own data, which are automatically pre-labeled, and can then be rapidly edited and revised. Typically, Mech-Mind’s deep learning software only needs 20-50 images of an object to train a model to recognize it in a scene.
Instrumental
Founded by two Stanford and MIT grads and former Apple employees in 2015 and based in Palo Alto, CA,
Instrumental is leading the way in ensuring product quality in electronics manufacturing. They use computer vision in conjunction with predictive analytics to provide real-time monitoring and alerts, as well as root cause analysis for prior failures.
Instrumental's AI-based computer vision suite supports both new product introduction (NPI) manufacturing, which is characterized by low volume, and mass production (MP) manufacturing, which is high volume.
Even within electronics manufacturing, the wide variety of products and substantial variation from product to product mean that general-purpose computer vision models have seen very little success. Nevertheless, manufacturers want to detect defects and issues in their particular use case as quickly as possible.
Instrumental's suite of computer vision tools is designed to achieve high-performance, application-specific defect detection from as few samples as possible. To do this, they use techniques like data augmentation, transfer learning, and active learning to build a robust dataset that they use to train an anomaly detection model. The models can be built without writing any code. Once deployed, these models run real-time inference on the edge in the factory and create a record that can be shared, inspected, and evaluated.
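For readers unfamiliar with active learning, one common generic strategy is uncertainty sampling: ask a human to label the images the current model is least sure about. The sketch below illustrates that general idea only and makes no claim about Instrumental's actual implementation; the function name and the probabilities are hypothetical.

```python
import numpy as np


def select_for_labeling(probabilities, budget):
    """Return indices of the `budget` samples with the highest predictive entropy."""
    p = np.clip(np.asarray(probabilities, dtype=float), 1e-12, 1.0)
    entropy = -(p * np.log(p)).sum(axis=1)
    return np.argsort(entropy)[::-1][:budget]


# Hypothetical model outputs: P(normal), P(defective) for five unlabeled images.
probs = np.array([
    [0.99, 0.01],
    [0.55, 0.45],
    [0.90, 0.10],
    [0.50, 0.50],
    [0.05, 0.95],
])
to_label = select_for_labeling(probs, budget=2)
```

The two images selected are the ones whose predictions sit closest to a coin flip, which is where a new label is likely to be most informative.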
Voxel51
Founded in Ann Arbor, Michigan, Voxel51 is the company behind
FiftyOne, designed to help data scientists, ML engineers, and enterprises build better computer vision models. Unlike many providers of computer vision solutions for manufacturing who focus solely on hardware or model deployment, Voxel51 addresses one of the industry’s biggest bottlenecks: managing and curating large, complex datasets to maximize model performance.
In manufacturing environments, where visual data is captured constantly—from defect detection cameras on assembly lines to predictive maintenance sensors—FiftyOne enables teams to filter, explore, and label data efficiently. The platform supports workflows for training, validating, and benchmarking models across a variety of computer vision applications in industry, including quality inspection, anomaly detection, and safety monitoring.
By giving manufacturers the ability to organize and understand their data, Voxel51 makes it easier to fine-tune models for application-specific use cases. With growing adoption among Fortune 500 companies and research labs, Voxel51 is helping manufacturers transform raw visual data into deployable computer vision solutions that drive efficiency, reduce downtime, and improve product quality.
Protex AI
Founded in 2020 and backed by Y Combinator, Notion Capital, and Playfair Capital, Irish startup
Protex AI is helping enterprise safety teams to revolutionize how they make proactive safety decisions that contribute to a safer work environment.
Their AI-powered technology is enabling businesses to gain greater visibility of unsafe behaviors in their facilities. The privacy-preserving platform plugs into existing CCTV infrastructure and uses its computer vision technologies to capture unsafe events autonomously in settings such as warehouses, manufacturing facilities, and ports.
Protex AI provides a simple interface so that each user can create their own “rules”, including setting exclusion zones, speed limits for forklifts, or even minimum distances workers must maintain between themselves and machines. Protex then uses computer vision techniques, including object detection, object tracking, and pose estimation, in order to check these rules. For rules involving speeds or distances, the vision system employs calibration. Typically, calibration is performed using inputs from multiple cameras, but Protex uses special routines to estimate calibration from a sole CCTV camera.
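The kinds of rules described above can be sketched in a few lines once detections have been tracked and mapped into floor-plane coordinates. The example below is a generic illustration, not Protex AI's implementation: it assumes positions are already calibrated to meters, and every name, zone, and threshold in it is hypothetical.

```python
import math


def point_in_polygon(point, polygon):
    """Ray-casting test: is (x, y) inside the polygon given as a list of vertices?"""
    x, y = point
    inside = False
    for i in range(len(polygon)):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % len(polygon)]
        # Does a horizontal ray from the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def speed_m_per_s(track):
    """Average speed over a track of (time_s, x_m, y_m) samples."""
    distance = sum(math.dist(a[1:], b[1:]) for a, b in zip(track, track[1:]))
    elapsed = track[-1][0] - track[0][0]
    return distance / elapsed


# Hypothetical rule set and tracked positions, already in floor-plane meters.
exclusion_zone = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
forklift_speed_limit = 2.5  # m/s
min_worker_distance = 2.0   # m

forklift_track = [(0.0, 10.0, 1.0), (1.0, 13.0, 1.0), (2.0, 16.0, 1.0)]
worker_position = (2.0, 1.5)
machine_position = (3.0, 1.5)

violations = []
if point_in_polygon(worker_position, exclusion_zone):
    violations.append("worker in exclusion zone")
if speed_m_per_s(forklift_track) > forklift_speed_limit:
    violations.append("forklift over speed limit")
if math.dist(worker_position, machine_position) < min_worker_distance:
    violations.append("worker too close to machine")
```

The hard part in practice is the step this sketch assumes away: turning pixel coordinates from a camera into reliable real-world distances, which is why calibration matters for speed and distance rules.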
Due to privacy concerns surrounding customer image and video data, Protex AI runs all of its models on the edge on NVIDIA-powered devices. As use cases can differ greatly, Protex AI deploys custom models for each customer. Their base model is trained on hundreds of thousands of images, and then a unique version is fine-tuned on a given customer’s data. In their line of work, data quantity is not an issue. The most important factor in model performance is having a clean, high-quality dataset.
Cognex
More than forty years old but still on the cutting edge, Nasdaq-listed (CGNX)
Cognex is a world leader in machine vision for industrial automation. Their 2000+ employee team has a hand in almost every step of industrial automation processes, from sensors and barcode scanners to industrial cameras and fully integrated vision systems.
Cognex has machine vision tools for rule-based applications, such as monitoring object location and detecting edges, as well as deep learning tools for cloud-connected and edge devices. Their VisionPro Deep Learning software supports standard tasks like defect detection, segmentation, and assembly verification, as well as burgeoning tasks like material classification.
Beyond specific tasks, Cognex’s VisionPro software expedites time to deployment with
AutoML capabilities. The
label checker automatically verifies the vast majority of labels and flags the remaining images for manual review, minimizing the number of samples a user needs to assess.
During training,
parameter autotune will use input example images to determine the optimal set of hyperparameters. In
optical character recognition (OCR), for instance, it can be difficult to recognize text due to the wide spectrum of fonts and potential distortions. Traditional OCR systems require that users specify segmentation hyperparameters to achieve high precision and recall. Cognex Blue Read eliminates this requirement by comparing an input image to the library of hundreds of fonts on which it was trained, and automatically selecting the best hyperparameters.
Berkshire Grey
Founded in 2013 and headquartered in Bedford, Massachusetts, Berkshire Grey is a leader in
AI-enabled robotic automation for supply chain and logistics. The company develops robotic systems that combine computer vision, advanced machine learning, and proprietary hardware to automate tasks such as picking, packing, sorting, and moving goods in warehouses, distribution centers, and retail backrooms.
Berkshire Grey’s robotic solutions are designed to address the growing demands of e-commerce and omnichannel fulfillment, where high order volumes and labor shortages have put pressure on traditional operations. Their systems integrate robotic arms, mobility platforms, and computer vision to handle a wide variety of SKUs, from small electronics to apparel and groceries, with minimal need for human intervention.
Unlike traditional fixed automation, Berkshire Grey’s approach emphasizes flexibility and scalability. Their robots use AI-driven perception systems to adapt to unpredictable product shapes, sizes, and packaging, enabling automation across mixed inventory environments that have historically been too complex for conventional robotics.
With customers including global retailers, e-commerce companies, and logistics providers, and having raised more than $200M in funding prior to going public via SPAC in 2021, Berkshire Grey is positioned as a major force in transforming how products move through the modern supply chain. By blending computer vision with robotics, they are building fully automated, intelligent fulfillment systems that reduce costs, increase throughput, and help companies meet rising customer expectations.
RIOS Intelligent Machines
RIOS Intelligent Machines is on a mission to transform labor-intensive factories into smart factories powered by robotics and AI. The company helps its global customers automate their factories, warehouses, and supply chain operations by deploying AI-powered end-to-end robotic workcells that integrate within existing workflows. The Menlo Park, CA-based company was founded by former Xerox PARC engineers who saw traditional robots failing on a massive scale and predicted that factories' overreliance on labor would soon reach a breaking point.
RIOS has developed some of the most advanced hardware and AI/software platforms in robotics, including human-like tactile sensors for robots, a haptics intelligence platform, and high-performance end-of-arm tooling and food-grade grippers. Their AI Controlled Robotics platform delivers fixed, programmable, flexible, and integrated automation. They also offer palletizing robots to load and unload products on or off of pallets, plus robotic packaging systems.
A few more
It’s impossible to highlight every company doing amazing work at the intersection of computer vision and manufacturing and industrial automation. Here are a few more companies that are pushing the boundaries:
- PreML GmbH: German startup focused on automated visual quality inspection.
- Prophesee: French Series C startup pioneering event-based vision.
- Datalogic: Italy-based leader in automated data capture, barcode readers, sensors, and vision systems.
- Stemmer Imaging: Europe’s largest imaging technology provider, with a hand in everything from photography to factory floor vision systems.
- Pickit 3D: 2016 spinout of NASA robotics software provider Intermodalics.
- Matroid: End-to-end no-code computer vision solutions for quality assurance, assembly verification, and safety and compliance.
Examples of industrial computer vision datasets
Due to the highly proprietary nature of manufacturing and industrial automation processes, public computer vision datasets are few and far between. Here are a few datasets that will help you get started:
If you would like to see any of these, or other computer vision manufacturing datasets added to the
FiftyOne Dataset Zoo, get in touch and we can work together to make this happen.
What’s next for computer vision solutions for manufacturing?
The rise of computer vision in manufacturing signals a major shift in how industrial processes are designed, monitored, and optimized. Computer vision technologies are enabling factories to become smarter, safer, and more efficient. Companies at the forefront are already proving the enormous value of computer vision applications in industry.
As more manufacturers adopt AI-driven systems, we’ll continue to see innovations that reduce downtime, enhance product quality, and streamline operations. The future of industrial automation is being built today, and computer vision solutions for manufacturing are at the heart of it.
If you’re building or researching industrial AI systems, now is the time to dive deeper into this space—and with tools like
FiftyOne, you can curate, analyze, and optimize the massive datasets that fuel these cutting-edge applications.