How robots are learning to see

For end-to-end automated pick-and-place operations, it is important that robots can reliably grip differently shaped and even transparent objects. Deep learning methods in powerful machine vision software enable safe gripping even on complex surfaces. TEKVISA has implemented such a demanding application.


Plastic bags with assembly components can have many different shapes. Industrial processes therefore often struggle to identify such bags precisely and grip them automatically, especially if the bags lie in disarray or are made of translucent material. As a result, productivity suffers, for example in pick-and-place activities. We have tackled precisely this challenge with machine vision.

Precisely identifying and positioning bags with accessories

For a leading manufacturer of wall panels for offices, we developed a robot-assisted picking system based on machine vision with deep learning algorithms. The objective was to precisely detect plastic bags with accessories for the wall panels so that robots can reliably grip them. In order to significantly increase productivity, the entire workflow was to be automated end-to-end. The aim was also to relieve the employees of monotonous routine tasks so that they could devote themselves to more demanding activities.

A special feature had to be taken into account when developing an appropriate automation solution: The bags contain a variety of different types of accessories. These include, for example, fastening materials such as screws, nuts, and dowels, pens and highlighters for writing on the white wall panels, pins for the cork wall panels or even sponges for wiping the wall surfaces. Consequently, the bags can vary in size and weight, as well as have many different appearances. Moreover, they are randomly shaped and, because of their elasticity, may additionally be compressed, pulled apart, or deformed in some other way.

High product variance posed challenges for developers

This enormously high product variance presented the automation engineers at TEKVISA with major challenges: The goal was to develop a flexible solution based on machine vision that reliably detects all conceivable types and shapes of accessory bags and thus enables safe gripping processes. The important thing here was that the system should identify those bags on the conveyor belt that could best be picked up by the robot arm based on their position and orientation.

The complete setup consists of a high-resolution color area scan camera and special lighting that minimizes reflections, paving the way for precise detection of each bag’s contents. At the heart of the application is an innovative machine vision system. It precisely identifies the plastic bags lying on a conveyor belt so that a robot can pick them up accurately. The robot then places them with high precision on the wall panel to be packaged, shortly before the final packaging process.

Combination of classic machine vision and deep learning

Based on the many different appearances and positions of the bags, the machine vision solution selects the optimal candidates for picking in each case. The machine vision standard software MVTec HALCON is used for this.

Using the deep learning algorithms it contains, the system is first comprehensively trained with sample images. In this way, the software learns the numerous different characteristics the bags can have. This leads to very robust recognition, even though the variety of objects is practically unlimited. The bags not selected for gripping are sorted out and then fed back into the system. This repositioning gives them a more favorable position on the conveyor, allowing the robot to pick them up more easily and place them for shipping. In this way, even overlapping and stacked bags can be gripped and picked. Using the integrated machine vision software, the system can analyze and precisely identify up to 60 bags per minute.
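
The selection step can be illustrated with a short Python sketch. It is not the TEKVISA implementation and does not use the HALCON API. The Detection fields, the weights, the reach radius, and the score threshold are all hypothetical values chosen for illustration. It shows one plausible way to rank detected bags by position, orientation, and visibility, and to split them into bags to pick and bags to feed back onto the conveyor.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Detection:
    """One detected bag. All fields are hypothetical, for illustration only."""
    x_mm: float        # bag center in robot coordinates
    y_mm: float
    angle_deg: float   # in-plane angle relative to the gripper's preferred axis
    confidence: float  # detector confidence, 0..1
    overlap: float     # fraction of the bag covered by other bags, 0..1


def pick_score(d, reach_center=(0.0, 0.0), reach_radius_mm=400.0):
    """Return a score in [0, 1]; higher means easier to grip (assumed weights)."""
    dist = hypot(d.x_mm - reach_center[0], d.y_mm - reach_center[1])
    if dist > reach_radius_mm:
        return 0.0  # outside the assumed working range of the robot
    position_term = 1.0 - dist / reach_radius_mm
    # Fold the angle into [0, 90] so that 0 and 180 degrees count as equally good.
    folded = abs(d.angle_deg) % 180.0
    folded = min(folded, 180.0 - folded)
    orientation_term = 1.0 - folded / 90.0
    visibility_term = 1.0 - d.overlap
    return d.confidence * visibility_term * (
        0.5 * position_term + 0.5 * orientation_term
    )


def split_candidates(detections, min_score=0.4):
    """Partition detections into bags to pick (best first) and bags to recirculate."""
    ranked = sorted(detections, key=pick_score, reverse=True)
    to_pick = [d for d in ranked if pick_score(d) >= min_score]
    to_recirculate = [d for d in ranked if pick_score(d) < min_score]
    return to_pick, to_recirculate
```

A bag lying flat in the middle of the working range scores close to 1, while a half-covered bag rotated by 90 degrees falls below the threshold and is recirculated. At the stated rate of up to 60 bags per minute, the whole analysis has on average about one second per bag, so a scoring rule of this kind needs to stay cheap.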

Camera and robot harmonize perfectly thanks to hand-eye calibration

Alongside the deep learning technologies, classic machine vision methods, which are also an integral part of MVTec HALCON, contribute to the robust recognition rates. Besides image acquisition and various tools for pre-processing the images, hand-eye calibration is an important feature here. It is performed in advance so that, during operation, the robot can accurately grip and place the bags observed by a stationary 2D camera. During the hand-eye calibration, a calibration plate is attached to the robot’s gripper arm and brought into the camera’s field of view. Several images are then taken with the robot in different positions and correlated with the robot’s axis positions at each one. The result is a “common” coordinate system for camera and robot. This allows the robot to grip the components at the positions that the camera detected immediately beforehand. Determining the object position with an accuracy of 0.1 millimeters allows a hit rate of 99.99 percent to be achieved during the gripping process.
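
The idea of a common coordinate system can be sketched in a few lines of Python. This is a deliberately simplified stand-in, not HALCON's hand-eye calibration, which works with full calibration plate poses and robot poses. The sketch assumes a flat conveyor surface, images that are already corrected for lens distortion, and image and robot axes with the same handedness (otherwise one image axis would have to be flipped first). Under those assumptions it fits a 2D similarity transform (rotation, uniform scale, shift) from matching pixel and robot positions by least squares. The function names and sample values are hypothetical.

```python
def fit_pixel_to_robot(pixel_pts, robot_pts):
    """Least-squares 2D similarity transform from pixel to robot coordinates.

    Points are (x, y) tuples. Treating each point as a complex number turns
    the model into robot = a * pixel + b, with complex a (rotation and scale)
    and complex b (shift).
    """
    if len(pixel_pts) != len(robot_pts) or len(pixel_pts) < 2:
        raise ValueError("need at least two matching point pairs")
    z = [complex(*p) for p in pixel_pts]
    w = [complex(*p) for p in robot_pts]
    z_mean = sum(z) / len(z)
    w_mean = sum(w) / len(w)
    # Closed-form least-squares solution on the centered point sets.
    num = sum((wi - w_mean) * (zi - z_mean).conjugate() for zi, wi in zip(z, w))
    den = sum(abs(zi - z_mean) ** 2 for zi in z)
    if den == 0:
        raise ValueError("pixel points must not all coincide")
    a = num / den
    b = w_mean - a * z_mean
    return a, b


def pixel_to_robot(a, b, pixel_xy):
    """Map one pixel position into robot coordinates using the fitted transform."""
    w = a * complex(*pixel_xy) + b
    return (w.real, w.imag)


# Hypothetical calibration data: four plate positions seen in the image (pixels)
# and the matching robot positions (millimeters).
pixels = [(0, 0), (100, 0), (0, 100), (100, 100)]
robot = [(100, 50), (100, 75), (75, 50), (75, 75)]
a, b = fit_pixel_to_robot(pixels, robot)
grip_xy = pixel_to_robot(a, b, (40, 20))  # where the robot should grip
```

With the sample data, abs(a) is the scale in millimeters per pixel. Using more than the minimum of two point pairs lets measurement noise average out, which is why several robot positions are recorded during calibration.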

Numerous different bags with varying sizes, deformation, and contents, as well as their overlapping positions on the conveyor belt: all these factors presented us with enormous challenges in this project. We were unable to address these with purely classic machine vision methods. By combining them with modern deep learning technologies, MVTec HALCON proved to be the ideal solution for us. With it, we achieve outstanding recognition rates despite the large variety of objects. This paves the way for our goal, the end-to-end automation of the entire process surrounding the packaging of wall panels. In addition, we benefit from higher productivity and from the flexibility to handle differing scenarios within the same application.