Artificial intelligence is a prevalent topic in the consumer packaged goods (CPG) industry, and in pretty much every other industry, with good reason. AI promises to automate manual tasks so people can work faster and better, to help people make better business decisions, and even to automate decision-making without human intervention.
With that said, it’s commonplace to see the AI label applied to many different technologies and approaches, with varying levels of capability and benefit. It is good practice to peel back the layers behind that label to separate hype from commercial reality.
Expectations for Computer Vision + AI
CVAI refers to using a combination of computer vision and AI to rapidly assess CPG inventory on the physical retail shelf. CVAI can “see” and interpret how the retail shelf is organized and automatically determine what is running low or out of stock. In this sense, it replaces tedious tasks that otherwise consume time for store workers, brand teams and third-party merchandisers. The commercial benefit is clear: teams can spend their time fixing inventory and merchandising problems to drive incremental growth, rather than staring at the shelf to find out whether and where those problems exist.
Not all approaches are the same. They differ both in data capture technique and in the AI used to interpret the visual input. Those differences translate into speed of capture, accuracy of the outputs, and the depth and breadth of the AI takeaways.
Let’s take the data capture process first
Legacy image-based approaches essentially train the person capturing the data to play robot: take one step down the aisle, snap a photo, very slowly sidestep another pace, snap again, and so on. The software behind the scenes must then carefully “stitch” the photos of a long aisle together into one continuous panorama.
Video-based approaches instead capture full-motion video from a quick, normal walk down the aisle, rather than one image at a time. Pensa pioneered this approach, which is designed to be fast and to provide immediate labor efficiency over traditional manual shelf inspection.
Comparison of shelf inventory data capture approaches
For more on this data capture comparison, see here.
Now to the AI to process the vision input – let’s look behind the curtain
AI for legacy image-based capture generally operates on a single image, typically the stitched panorama “pasted” together from the independent photos gathered by the user sidestepping down the aisle. Because it only has to analyze one panoramic photo of the entire aisle, the AI behind the capture can be simpler technology. The downsides are the tedium of collection (which hurts the business case), degraded recognition accuracy, and longer AI training cycles.
AI for video-based capture is designed to process native moving frames from a natural walk down the aisle. While the capture is dramatically faster, extracting digital understanding from it requires substantially more AI. Without more advanced AI behind video capture, the digital takeaways are quite limited; with it, they are much richer.
Pensa’s advanced AI works in concert with our video-based data capture. From the fast walk down the aisle, Pensa AI processes motion input from the camera sensor in much the same way as the AI used in autonomous driving technology. In effect, Pensa’s AI uses the video frames to evaluate each item on the shelf from many angles, triangulating as it moves down the aisle just like an autonomous car navigating the streets.
Pensa’s Visual AI evaluates each item from many perspectives
Pensa’s AI then localizes items and placement across an entire aisle, building a full three-dimensional internal digital model of the shelf and how the shelf space is organized and managed.
This 3D model is not a photograph. It’s generated by Pensa’s patented AI.
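To make the triangulation idea concrete, here is a minimal sketch in Python. It is not Pensa’s patented method, only a textbook illustration under simplifying assumptions: a top-down 2D view, a camera that walks along a straight line, and a known bearing to the item from each camera position. The names `bearing_to` and `triangulate` and the example coordinates are hypothetical.

```python
import math


def bearing_to(cam_x, item_x, item_depth):
    """Angle (radians) from the walking direction to the item, seen from cam_x."""
    return math.atan2(item_depth, item_x - cam_x)


def triangulate(observations):
    """Least-squares intersection of bearing rays in a top-down view.

    observations: list of (cam_x, bearing) pairs. The camera is assumed to
    sit at (cam_x, 0) and each ray points along (cos(bearing), sin(bearing)).
    Returns the estimated (x, depth) of the item.
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for cam_x, bearing in observations:
        # Unit normal to the ray; every point p on the ray satisfies n.p = n.origin.
        nx, ny = -math.sin(bearing), math.cos(bearing)
        rhs = nx * cam_x  # origin is (cam_x, 0), so n.origin = nx * cam_x
        a11 += nx * nx
        a12 += nx * ny
        a22 += ny * ny
        b1 += nx * rhs
        b2 += ny * rhs
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("need at least two views from different positions")
    # Solve the 2x2 normal equations with Cramer's rule.
    x = (a22 * b1 - a12 * b2) / det
    depth = (a11 * b2 - a12 * b1) / det
    return x, depth


# Hypothetical item 2.0 m along the aisle and 1.5 m away from the walking line.
views = [(cam_x, bearing_to(cam_x, 2.0, 1.5)) for cam_x in (0.0, 1.0, 3.0)]
estimate = triangulate(views)
```

With noise-free bearings the estimate recovers the item position. With real, noisy observations, each additional view from a different position tightens the least-squares estimate, which is why seeing an item from many angles helps.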
Why is this AI combination of triangulation and 3D reconstruction so important?
First, it delivers dramatically higher accuracy in identifying the items on the shelf, down to small differences such as low-sodium versus regular or organic chicken broth.
Second, it learns and trains automatically in situ, or in place, from what it sees on the shelf: it recognizes new products and packaging changes and learns the catalog automatically from the natural turns and placements on the shelves. Once Pensa’s AI has detected a new product or packaging change, that item is immediately recognized globally going forward. This approach is many times faster and more robust than matching images captured at the shelf against a reference database.
Third, and most important, Pensa’s AI understands and interprets how video frames relate to each other in the context of the shelf. Pensa’s AI-generated 3D digital model of the entire aisle enables our analytics to answer a rich set of questions about shelf conditions in near real time. For example, it can determine whether a product is running low or out of stock, whether a new item is on the shelf and in the right place, whether the correct number of “facings” is on the shelf, and how the aisle is organized.
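The kinds of questions listed above can be illustrated with a small Python sketch. It assumes, purely for illustration, that the shelf model has already been reduced to a list of located facings and that a planogram gives the expected number of facings per SKU. The `Facing` type, the `audit_shelf` function, the status labels and the SKU names are all hypothetical and do not describe Pensa’s actual data model.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Facing:
    sku: str     # product identifier
    shelf: int   # shelf index, counted from the bottom
    x_m: float   # position along the aisle, in meters


def audit_shelf(facings, planogram):
    """Compare observed facings with the expected count per SKU.

    planogram: dict mapping sku -> expected number of facings.
    Returns a dict mapping sku -> (observed, expected, status).
    """
    seen = Counter(f.sku for f in facings)
    report = {}
    for sku, expected in planogram.items():
        observed = seen.get(sku, 0)
        if observed == 0:
            status = "out_of_stock"
        elif observed < expected:
            status = "short_facings"
        else:
            status = "ok"
        report[sku] = (observed, expected, status)
    # Items on the shelf that the planogram does not list at all.
    for sku in seen.keys() - planogram.keys():
        report[sku] = (seen[sku], 0, "unexpected")
    return report


observed = [
    Facing("broth-regular", 2, 0.10),
    Facing("broth-regular", 2, 0.20),
    Facing("broth-low-sodium", 2, 0.30),
    Facing("mystery-item", 3, 0.50),
]
planogram = {"broth-regular": 2, "broth-low-sodium": 2, "broth-organic": 1}
report = audit_shelf(observed, planogram)
```

This sketch only counts facings. Judging whether a product is “running low” behind its front facing would need depth information from the shelf model, which is outside what this example covers.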
Without advanced AI such as Pensa’s, a video taken of the shelf is only a jumble of images. It is a series of independent snapshots, in essence a one-dimensional view that may reveal whether a product is present in the aisle but cannot assess stockouts, number of facings, product positions or the de facto planogram for how the shelves are organized. These are all critical benefits expected from CVAI to digitize the shelf and automate otherwise tedious manual activity.
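A short Python sketch shows why treating frames as independent snapshots breaks facing counts. The same physical facing shows up in many consecutive frames, so a naive tally overcounts. The sketch assumes each detection already carries an estimated position along the aisle, and it merges detections of the same SKU on the same shelf that fall within a small distance of each other. The function name `count_facings`, the 5 cm tolerance and the sample data are illustrative assumptions, not a description of how Pensa relates frames.

```python
def count_facings(detections, tolerance_m=0.05):
    """Count distinct facings from detections gathered across many frames.

    detections: iterable of (sku, shelf, x_m) tuples, one per frame sighting.
    Sightings of the same sku on the same shelf whose positions lie within
    tolerance_m of the previous sighting are treated as one physical facing.
    """
    counts = {}
    last_x = {}
    # Sorting groups sightings by sku and shelf, then orders them along the aisle.
    for sku, shelf, x_m in sorted(detections):
        key = (sku, shelf)
        if key not in last_x or x_m - last_x[key] > tolerance_m:
            counts[sku] = counts.get(sku, 0) + 1
        last_x[key] = x_m
    return counts


# Two broth facings and one soup facing, each seen in several frames.
sightings = [
    ("broth", 1, 0.10), ("broth", 1, 0.11), ("broth", 1, 0.09),
    ("broth", 1, 0.20), ("broth", 1, 0.21),
    ("soup", 2, 0.50),
]
naive_total = len(sightings)
merged = count_facings(sightings)
```

Here the naive tally reports six items while the merged count reports two broth facings and one soup facing. The merge is only possible because every sighting has a position in a shared aisle coordinate frame, which is exactly what independent snapshots lack.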
Putting it together
CVAI is a critically important use case for digitizing the shelf at retail. It holds the promise of increasing labor efficiency, reducing tedious manual activity, assessing store conditions more accurately and quickly, and then identifying the highest-priority areas and actions for improvement.
CVAI can be a game-changer at retail. But the speed of the data capture and the AI behind it are the magic that determines the real capabilities that lead to growth and business transformation.
If you’d like to learn more about the “how” behind Pensa’s CVAI, please reach out.