Machine Vision Insights

Do you need to be a professional programmer to use deep learning? Spoiler alert: No

Ever since the emergence of the new generative AI models such as ChatGPT, everyone has been talking about artificial intelligence (AI). But the technology has been used profitably in machine vision for many years now. Here, it is applied in tasks ranging from defect detection to position detection of complex objects. But how do you get started and what kind of knowledge is required? Ulf Schulmeyer explains how even beginners can benefit from deep learning quickly and easily.

In discussions of artificial intelligence or deep learning, you’ll often encounter terms such as neural networks, black box, and labeling. This jargon can be incomprehensible to the layperson. It also leads people to believe that they need solid programming skills to truly master the technology. Unfortunately, this assumption keeps many people from seeing the technology’s potential for practical industrial use. When it comes to machine vision in particular, artificial intelligence offers enormous benefits – and not just for professionals.

Let’s start at the beginning:
What is deep learning?

As a subset of machine learning, deep learning is based on multi-layered neural networks that are capable of realistically emulating complex structures and processes of the human brain and making independent decisions. During a comprehensive training process using large amounts of data, deep learning models learn to identify certain patterns and relationships through analysis.

So much for the theoretical side. But why is the technology so successful in the area of machine vision?

Because machine vision produces an extremely large amount of image data, which forms the perfect basis for the effective training of neural networks. That’s the technical side.

At the same time, users also benefit from the technology. The recognition rates that deep learning can deliver reach new levels of quality. This also allows the automation of entirely new applications based on machine vision. Deep learning is an advancement that gives new impetus to machine vision as a whole.

When planning a new machine vision application, it’s possible to continue to rely exclusively on classic machine vision methods or entirely on deep learning methods. However, the ideal approach is to combine classic methods with deep learning. No matter which approach is chosen, more and more people are finding the use of deep learning worthwhile. As we’ve seen over and over in conversations with customers, many companies, both large and small, are exploring the idea of introducing artificial intelligence or deep learning. Frequently, however, they have certain reservations that keep them from taking this step. But using the technology is not as complicated as they might think. There are also tools that make it easier to work with deep learning.

The right deep learning method for each application

When it comes to implementation, the most important question is this: What exactly do you want to automate? The range of deep learning methods available to integrators, plant operators, and machine manufacturers – in short, to everyone who deals with this question – is constantly growing. One of the most common applications is quality inspection. Anomaly detection and global context anomaly detection are methods that identify faults, defects, scratches, and other deviations. Another application is locating objects. As the name implies, deep-learning-based object detection locates objects, and it can also be used for completeness checks and automated counting. One technology that makes counting large quantities of objects particularly easy and robust is Deep Counting. Methods such as segmentation and instance segmentation are suitable for pixel-precise object localization. In addition to determining where an object is situated in the image, the two methods are also important as preliminary stages for further machine vision steps. Finally, Deep OCR (optical character recognition) enables the robust reading of text using deep learning.

Although this discussion covers only some of the available methods, it does demonstrate the diversity of the potential applications.

How do you get started?

To run an application, you first need to have a typical machine vision setup consisting of a camera, appropriate illumination, and suitable computer hardware, such as an industrial PC equipped with a high-performance CPU and (even better) a GPU. But at the heart of any machine vision setup lies powerful machine vision software. In addition to HALCON, our powerful standard machine vision software, MVTec also offers MERLIC, a machine vision software program that even beginners can use without having to forgo high-performance deep learning technologies.

The benefit for beginners is that the software allows them to handle machine vision applications without any programming skills, thanks to an image-centric user interface and an intuitive operating concept. This significantly simplifies and speeds up the creation and commissioning of machine vision applications.

Optimal image data preparation for training

To use deep learning applications, you first have to label the image data. Tools such as the Deep Learning Tool (DLT) from MVTec are helpful here. The goal of labeling is to give the machine vision system further information about each image. Such information can be the image class or the object’s position within the image. Software that provides an intuitive user interface makes labeling very easy even for beginners and can be used without any programming skills. A particularly practical aspect is that only good images are required for training certain deep learning models. These are easy to obtain. Moreover, only about 30 to 100 good images are needed, depending on the condition of the object to be inspected. The training process itself happens at the press of a button.
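For readers curious about what a label looks like as data, here is a minimal Python sketch. It is not the format used by the Deep Learning Tool or any other MVTec product. The ImageLabel record, its field names, and the file names are hypothetical and serve only to illustrate the two kinds of information mentioned above (image class and object position), as well as the idea of training on good images only.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ImageLabel:
    """Hypothetical label record for a single image (illustration only)."""
    image_path: str
    # Image-level class, for example "good" or "scratch".
    image_class: Optional[str] = None
    # Object positions as (row1, col1, row2, col2) pixel rectangles.
    boxes: List[Tuple[int, int, int, int]] = field(default_factory=list)


def select_good_images(labels: List[ImageLabel]) -> List[str]:
    """Return the paths of all images labeled as good."""
    return [label.image_path for label in labels if label.image_class == "good"]


labels = [
    ImageLabel("img_001.png", image_class="good"),
    ImageLabel("img_002.png", image_class="good"),
    ImageLabel("img_003.png", image_class="scratch", boxes=[(40, 60, 80, 120)]),
]

# Only the good images would go into training for a model of this kind.
training_set = select_good_images(labels)
print(training_set)
```

In a graphical labeling tool, records like these are created by clicking and drawing, so no code of this kind has to be written by the user.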

A glimpse into the deep learning black box

One criticism of deep learning is the lack of transparency in decision-making processes. While the latest developments can’t completely illuminate what goes on inside this black box, they do provide certain insights into the inner workings of the neural networks. There are tools that use a heat map to highlight the image areas relevant for decision-making. It’s also possible to influence the deep learning results with the aid of the threshold value. For example, if you set the threshold value for anomaly detection very high, you’ll obtain only “OK” results that fully match the trained images. If you lower the threshold value, images are also output that deviate more significantly from the OK results. This means that you can flexibly and individually adjust the sensitivity of the model’s response to irregularities.
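The effect of a threshold can be illustrated with a few lines of Python. This is a sketch under stated assumptions, not the behavior of a specific product: it assumes that the model returns one anomaly score per image and that an image is flagged as "NOK" when its score exceeds the threshold. The classify function and the score values are made up for illustration, and a given tool may define its threshold the other way around, in which case the comparison flips.

```python
from typing import List


def classify(scores: List[float], threshold: float) -> List[str]:
    """Flag an image as "NOK" if its anomaly score exceeds the threshold.

    Assumed convention for this sketch: higher score = larger deviation
    from the good training images.
    """
    return ["NOK" if score > threshold else "OK" for score in scores]


# Made-up anomaly scores for four images.
scores = [0.05, 0.20, 0.45, 0.90]

# Under this convention, a high threshold lets every image pass ...
high_threshold_result = classify(scores, threshold=0.95)

# ... while a lower threshold flags more and more deviating images.
medium_threshold_result = classify(scores, threshold=0.40)
low_threshold_result = classify(scores, threshold=0.10)

print(high_threshold_result)
print(medium_threshold_result)
print(low_threshold_result)
```

The same scores lead to different decisions depending on a single number, which is what makes the threshold a simple way to tune how sensitively the inspection reacts.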

What are you waiting for?

Companies shouldn’t hesitate to enter the world of deep learning. Suitable tools are available on the market that allow them to take this step successfully and benefit from reliable detection processes. At the same time, these so