Solving Problems with AI Object Detection in Digital Asset Management
Over the last few years, it has become evident that many difficult tasks once considered unsolvable can be accomplished with the help of Artificial Intelligence, or AI for short. AI will eventually be found in every industry, including Digital Asset Management (DAM).
DAM is the business process of storing, organizing, searching and retrieving rich media assets and managing digital rights and permissions. The definition hints that AI can become an irreplaceable tool in several real-world scenarios. AI-integrated DAMs have proven to be extremely helpful in content workflows, helping companies save time and money by automatically removing image backgrounds, intelligently compressing images and videos, and classifying and tagging them accurately.
For example, a company may want to group or filter digital assets based on features such as visible logos, specific human faces, dominant colors, and so on. And while most of those areas of interest are rather broad, sometimes our customers have more specific needs that require special attention.
What is Object Detection?
Object detection is one of the most fascinating areas of artificial intelligence. It is tempting to think that our brain instantly recognizes whatever we look at. But that’s not entirely true: it takes a lot of processing power in the brain to spot an object and then identify and classify it.
And this is where object detection and artificial intelligence come in! Object detection is the process of replicating this ability in a computer.
Object detection is a computer vision technique that locates instances of objects in images or video frames. This technology has a wide range of applications, including self-driving cars that detect pedestrians, robots that detect objects in their environment, security systems, and image and video analysis.
The process of object detection relies on purpose-built machine learning models.
In layperson’s terms, a machine-learning model is a file trained to recognize specific patterns. Machine learning starts with developers training a model on a large data set. This data can include labeled images, where each image is annotated with the location and class of every object it contains.
Training on this data lets the model learn the patterns it later uses to reason over new input. Besides detecting the presence of an object, the model can also learn to classify images. Once trained with deep learning algorithms, the model can be used to reason over data it has never seen before and make predictions about it. In general, the more representative training data you provide, the more accurate your model will be.
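To make the idea of a labeled image more concrete, here is a minimal sketch of what a single annotation might look like, together with intersection over union (IoU), a standard way to score how closely a predicted box matches an annotated one. The `BoxAnnotation` name, its fields, and the corner-based pixel layout are assumptions made for illustration; the article does not describe a specific annotation format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoxAnnotation:
    """One labeled object: class name plus pixel box corners (assumed layout)."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def area(self) -> float:
        return max(0.0, self.x_max - self.x_min) * max(0.0, self.y_max - self.y_min)


def iou(a: BoxAnnotation, b: BoxAnnotation) -> float:
    """Intersection over union: 1.0 for identical boxes, 0.0 for disjoint ones."""
    overlap_w = min(a.x_max, b.x_max) - max(a.x_min, b.x_min)
    overlap_h = min(a.y_max, b.y_max) - max(a.y_min, b.y_min)
    if overlap_w <= 0 or overlap_h <= 0:
        return 0.0  # the boxes do not overlap at all
    intersection = overlap_w * overlap_h
    union = a.area() + b.area() - intersection
    return intersection / union


# Hypothetical example: a human-drawn box and a model prediction for the same car.
annotated = BoxAnnotation("car", 0, 0, 10, 10)
predicted = BoxAnnotation("car", 5, 0, 15, 10)
print(round(iou(annotated, predicted), 3))
```

A score near 1.0 means the prediction almost coincides with the annotation, which is the kind of signal used to judge whether more or better training data is improving a detector.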
How does AI Object Detection solve a real-world problem?
In this blog post, we would like to show you how AI object detection helped us solve a time-consuming task for our clients: positioning an existing towbar image onto a photo of a motor vehicle.
Of course, the resulting image had to look as accurate as possible, and the whole process was to be completely automatic.
Defining the obstacles
While this may seem like a tedious yet trivial task if performed manually, doing it automatically in a way that can be applied to virtually endless variations of car makes and models captured from arbitrary shooting angles presents a plethora of challenges, including:
- How can we be sure that the image contains a vehicle at all?
- Even if we know that there is a vehicle, how can we deduce the most appropriate towbar position?
- Even if we somehow find the perfect location, how can we detect the angle at which the car photo was taken, so the towbar image can be adjusted to look natural in place?
- How can we measure the size of the vehicle so the towbar can be scaled up or down accordingly?
The process of designing and implementing an automated system that overcomes the challenges mentioned above and provides excellent value for our customers is a tall order. Many considerations must be taken into account, which requires an enormous amount of research and planning. Let’s take a closer look at how we approached the problem and the solution we came up with.
Designing the solution
The process begins with the magic of object detection. In our case, we need to know whether the image depicts a motor vehicle and, if so, where exactly it is located. While a good model can answer this quickly, having the coordinates of our target object still leaves us with a lot of work to do.
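As a rough illustration of this first step, the sketch below picks the most confident vehicle from a list of detections. The `Detection` structure, the label names in `VEHICLE_LABELS`, and the 0.5 score threshold are all assumptions for the example; a real detector would define its own output format and classes.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class Detection(NamedTuple):
    """One detector result (hypothetical shape): class label, confidence, pixel box."""
    label: str
    score: float
    box: Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


# Assumed label names; they depend entirely on how the model was trained.
VEHICLE_LABELS = frozenset({"car", "truck", "bus"})


def find_vehicle(detections: Sequence[Detection], min_score: float = 0.5) -> Optional[Detection]:
    """Return the highest-scoring vehicle detection, or None if there is none."""
    candidates = [
        d for d in detections
        if d.label in VEHICLE_LABELS and d.score >= min_score
    ]
    return max(candidates, key=lambda d: d.score, default=None)


# Made-up detector output for a single photo.
results = [
    Detection("person", 0.91, (10, 20, 60, 180)),
    Detection("car", 0.88, (100, 80, 520, 360)),
    Detection("car", 0.32, (0, 0, 40, 30)),
]
vehicle = find_vehicle(results)
print(vehicle.box if vehicle else "no vehicle found")
```

Returning `None` when nothing qualifies gives the rest of the pipeline a clear signal to stop early for photos that do not contain a vehicle.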
Since photographs of the cars may differ every time, a single image of a towbar will not suffice in most cases. To overcome this challenge, we had to accumulate high-quality towbar images in various orientations. This, combined with several image transformation techniques, ensures that no matter how unorthodox the vehicle image is, we are always able to provide an adequately oriented towbar.
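One of the simpler transformations is scaling. The sketch below sizes a towbar image relative to the detected vehicle box while preserving its aspect ratio. The function name and the `width_ratio` value are hypothetical; the article does not state how the towbar size is actually derived.

```python
from typing import Tuple


def scaled_towbar_size(
    vehicle_box: Tuple[float, float, float, float],
    towbar_size: Tuple[int, int],
    width_ratio: float = 0.1,
) -> Tuple[int, int]:
    """Return (width, height) for the towbar image, keeping its aspect ratio.

    width_ratio is an assumed fraction of the vehicle box width that the
    towbar should occupy; it is a placeholder, not a measured value.
    """
    x_min, _, x_max, _ = vehicle_box
    vehicle_width = x_max - x_min
    towbar_width, towbar_height = towbar_size
    if vehicle_width <= 0 or towbar_width <= 0 or towbar_height <= 0:
        raise ValueError("vehicle box and towbar image must have positive size")
    scale = (vehicle_width * width_ratio) / towbar_width
    # Never shrink below one pixel in either dimension.
    return max(1, round(towbar_width * scale)), max(1, round(towbar_height * scale))


# A 1000 px wide vehicle box and a 200x100 px towbar image.
print(scaled_towbar_size((0, 0, 1000, 500), (200, 100)))
```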
Implementing it on the images
But then again, how exactly do we infer the way our vehicle is positioned? The answer is simple - more object detection.
This time, we are looking for something smaller: the license plate. Knowing the precise location of the plate with respect to the automobile as a whole, we can use some fancy mathematics to determine which towbar image we should use and what additional operations we should perform so the end result looks as genuine as possible.
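The exact mathematics is not spelled out here, but a heavily simplified sketch of the idea is to measure how far the plate sits from the horizontal center of the vehicle box and use that offset to pick a pre-rendered towbar view. The function names, the view names, and the 0.15 tolerance below are assumptions for illustration only.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def plate_offset(vehicle_box: Box, plate_box: Box) -> float:
    """Horizontal plate offset in [-1, 1]: 0 means centered, negative means left."""
    vx_min, _, vx_max, _ = vehicle_box
    px_min, _, px_max, _ = plate_box
    half_width = (vx_max - vx_min) / 2
    if half_width <= 0:
        raise ValueError("vehicle box must have positive width")
    vehicle_center = (vx_min + vx_max) / 2
    plate_center = (px_min + px_max) / 2
    return (plate_center - vehicle_center) / half_width


def choose_towbar_view(offset: float, tolerance: float = 0.15) -> str:
    """Map the offset to one of three hypothetical towbar image variants."""
    if offset < -tolerance:
        return "plate_left"
    if offset > tolerance:
        return "plate_right"
    return "centered"


vehicle = (0.0, 0.0, 200.0, 100.0)
straight_on = plate_offset(vehicle, (90.0, 60.0, 110.0, 70.0))
angled = plate_offset(vehicle, (20.0, 60.0, 60.0, 70.0))
print(choose_towbar_view(straight_on), choose_towbar_view(angled))
```

A production system would likely use more cues than a single horizontal offset, such as the apparent shape of the plate, but the principle of turning relative box positions into a view choice stays the same.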
Next comes the most crucial part of the process: finding the best possible position for the towbar attachment point. We accomplish this with a set of computer vision techniques.
To put it simply, we run some advanced algorithms that process images and return information based on the visual features that we are interested in. Using the embedded color information allows us to detect where the bumper of the vehicle starts and the exact point where the towbar (if it were a real one) would have been attached. After some additional checks for common pitfalls and edge cases, we arrive at a point that, by all accounts, is a perfectly suitable place for a towbar.
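To give a feel for how color information can reveal a boundary, here is a deliberately simple sketch that scans a vertical strip of pixels and reports the first row where the color changes sharply. This is a stand-in for the idea, not the algorithms described above; the function names and the threshold of 60 are assumptions.

```python
from typing import Optional, Sequence, Tuple

RGB = Tuple[int, int, int]


def color_distance(a: RGB, b: RGB) -> float:
    """Euclidean distance between two RGB colors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def find_color_edge(column: Sequence[RGB], start: int = 0, threshold: float = 60.0) -> Optional[int]:
    """Return the index of the first pixel below `start` whose color differs
    sharply from the pixel directly above it, or None if no such jump exists."""
    for row in range(start + 1, len(column)):
        if color_distance(column[row - 1], column[row]) > threshold:
            return row
    return None


# Made-up strip: five rows of red paintwork followed by three rows of dark bumper.
strip = [(200, 0, 0)] * 5 + [(30, 30, 30)] * 3
print(find_color_edge(strip))
```

Real photographs contain noise, reflections and shadows, so a practical version would smooth the image and check several neighboring columns before trusting a single jump.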
Finally, we composite the processed towbar image, with its transparent background, onto the photograph of the vehicle at the correct coordinates. From then on, all of this is done automatically and quickly for every new image.
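This final step is standard alpha compositing. The sketch below shows the principle on plain nested lists of pixels, assuming 8-bit RGBA for the towbar and RGB for the photo. The function names are hypothetical, and a real pipeline would rely on an image library rather than Python loops.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]
RGBA = Tuple[int, int, int, int]


def blend_pixel(foreground: RGBA, background: RGB) -> RGB:
    """Blend one RGBA pixel over an RGB pixel using the foreground alpha."""
    r, g, b, a = foreground
    alpha = a / 255
    return tuple(
        round(fg * alpha + bg * (1 - alpha))
        for fg, bg in zip((r, g, b), background)
    )


def paste(background: List[List[RGB]], overlay: List[List[RGBA]], left: int, top: int) -> List[List[RGB]]:
    """Return a copy of `background` with `overlay` composited at (left, top).

    Overlay pixels that fall outside the background are skipped.
    """
    result = [list(row) for row in background]
    for dy, overlay_row in enumerate(overlay):
        for dx, pixel in enumerate(overlay_row):
            y, x = top + dy, left + dx
            if 0 <= y < len(result) and 0 <= x < len(result[y]):
                result[y][x] = blend_pixel(pixel, result[y][x])
    return result


# A 2x2 white photo and a 1x2 overlay: one opaque black pixel, one fully transparent.
photo = [[(255, 255, 255)] * 2 for _ in range(2)]
towbar = [[(0, 0, 0, 255), (0, 0, 0, 0)]]
combined = paste(photo, towbar, left=0, top=1)
print(combined)
```

Fully transparent overlay pixels leave the photo untouched, which is why the towbar image needs a clean transparent background before it is placed.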
Solving problems with the help of an AI-integrated DAM
At Scaleflex, we ensure that all our machine learning models and procedures are not only highly accurate but also as fast as possible. All our clients have to do after we solve these challenges is simply to upload their photographs onto our DAM and the post-processing takes place almost immediately. After all, we all know that time really does matter. Don’t believe us? Explore our AI playground demo page!
The goal of this article was to give you a peek at some of the challenges we developers at Scaleflex face in helping our customers regain control over their digital assets, and at the approaches we take to meet their requirements.
By now, we hope that you are convinced of the important role AI plays in elevating our customized DAM solutions into a dynamic and agile platform. Using AI helps us extract the hidden value of the assets we manage, which in turn allows us to deliver a richer customer experience, enhanced with speed, automation, and intelligence.
Intrigued and want to know how we can help you solve your time-consuming content challenges? Speak to our friendly DAM experts to discuss your project.