Grey markets/parallel imports are a significant problem for many brand-name manufacturers. A grey market is a market in which goods have been manufactured by or with the consent of the brand owner but are sold outside of the brand owner's approved distribution channels—an activity that can be perfectly legal in some countries, making this problem particularly difficult to tackle.
However, this is far less a matter of law than of lost profit for manufacturers, since 9 out of 10 products are impacted. Grey market product diversion costs brand owners hundreds of billions of dollars worldwide and puts brand reputation at risk. Brand-name manufacturers therefore see the existence of grey markets as a problem, one they would like to eliminate.
Implementing end-to-end traceability up to the consumer level, leveraging new technologies such as blockchain and big data, is a solid way to gather essential data about a product, for example through the scans recorded along its journey. This data is the foundation needed to start tackling the problem with machine learning, which is at the heart of our new Grey Market Detection module.
This Grey Market Detection module will soon be integrated into Tilkal's solution. The module is already operational, however, and can be plugged into any data visualization tool.
At Tilkal, one of our customers was concerned about grey markets affecting the price of their product and they believed that products were being sold where they shouldn’t be. They asked us if it was possible to identify grey markets based only on the locations of scans of products. Here we present our first step to solving this problem.
Our goal is to speed up the "time-to-control" by sending inspectors as soon as possible to the right place when suspicious scans are detected outside the planned distribution/circulation circuit of the goods. Field control capacity is currently limited:
by insufficient means
by the lack of precise, anticipated visibility of risk areas: checks are carried out at random and are therefore unlikely to find anything.
But how do we identify suspicious scans in the first place? Imagine our customer is a clothing brand employing Tilkal’s consumer app with a QR code that consumers can scan to get information about the product. The app gives customers the choice of declaring their geographical position while remaining anonymous. Most scans should happen in the market/country for which the product is destined and be clustered around store locations (more often than not unknown to the brand). But what if they are not? Can we be sure that they originate from an illegal market, or could they be harmless scans inside someone’s home? We further refined our challenge: using these geographical locations only, is it possible to determine whether the product was purchased via grey or illegal markets?
Why Machine Learning?
One might wonder why exactly we need machine learning to solve this problem. Is it not possible to have a rule-based algorithm do this? Let’s consider the following schematic of a rule-based algorithm.
At Tilkal, we value using the right technology for the right problem. In this case, we found that while rule-based algorithms are easy to implement, they only work when we understand and can define the rules accurately. In the real world, reality is often hard to describe with a finite set of rules: how do we pinpoint the locations of illegal markets? And even if we could, can we be sure of our predictions?
Solving a hard problem like grey markets involves getting feedback and using it to improve the system. For example, suppose our customer suspects that grey markets only operate during certain time periods or are sensitive to the market price of the product. How do we integrate this insight into the system without knowing precisely what the influences are?
In fact, Machine Learning (ML) algorithms are meant to do that for us: uncover underlying patterns in unseen data. ML gives us the flexibility to work both with data that follow a decipherable rule-based approach (for example, linear models and decision trees) and with data for which the rules are not as easily decipherable. Here is a simplified version of our grey market ML stack:
Machine Learning details
We make use of both unsupervised and supervised learning in our solution.
Unsupervised learning The first step of the grey market module is to identify illegal markets by their locations. In order to do this, we use a density-based clustering approach to cluster the scans.
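The post does not name a specific clustering algorithm, so here is a minimal sketch of this step assuming DBSCAN (a common density-based method available in scikit-learn); the toy coordinates and the 1 km neighbourhood radius are illustrative choices, not Tilkal's production parameters.

```python
# Density-based clustering of scan locations (sketch).
# DBSCAN, the haversine metric, and the eps radius are assumptions:
# the post only says "density-based clustering approach".
import numpy as np
from sklearn.cluster import DBSCAN

# Toy scan coordinates (lat, lon in degrees): two tight groups plus one outlier.
scans = np.array([
    [48.8566, 2.3522], [48.8570, 2.3530], [48.8560, 2.3515],   # group A
    [45.7640, 4.8357], [45.7645, 4.8360], [45.7638, 4.8350],   # group B
    [50.0000, 8.0000],                                          # isolated scan
])

# The haversine metric expects radians; eps is an angular distance,
# so eps = 1 km / Earth radius (~6371 km) groups scans within ~1 km.
db = DBSCAN(eps=1.0 / 6371.0, min_samples=3, metric="haversine")
labels = db.fit_predict(np.radians(scans))
print(labels)  # -1 marks noise (isolated scans); 0..k are cluster ids
```

DBSCAN is a natural fit here because it does not require the number of markets to be known in advance and it naturally flags isolated scans as noise.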
Supervised learning Clusters are then fed into a regression model to mark them with a score based on several features including their size and distance from the nearest store. We then use a threshold to determine which clusters are safe ‘0’ or suspect ‘1’. This gives us an initial map of illegal markets.
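The supervised step can be sketched as follows. The two features (cluster size, distance to the nearest store), the training labels, and the choice of a gradient-boosting regressor are illustrative assumptions; the post only says the score is based on "several features including their size and distance from the nearest store".

```python
# Score clusters with a regressor, then threshold into safe (0) / suspect (1).
# Features, targets, and model choice below are illustrative assumptions.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Per-cluster features: [cluster size, distance to nearest store (km)]
X_train = np.array([
    [50, 0.2], [80, 0.5], [30, 0.1],     # near a store: labelled safe
    [40, 25.0], [60, 40.0], [25, 18.0],  # far from any store: labelled suspect
])
y_train = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])  # grey score targets

model = GradientBoostingRegressor(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

new_clusters = np.array([[70, 0.3], [35, 30.0]])
scores = model.predict(new_clusters)       # continuous grey scores
labels = (scores >= 0.5).astype(int)       # threshold at 50%
print(scores, labels)
```

The continuous score, rather than a hard class, is what later allows each alert to carry a confidence value.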
We have found that ensemble methods such as bagging and boosting regression techniques work best on our dataset. We would like to call out the Lazy Predict module, which compares performance across all the regression and classification models available in scikit-learn and helps us choose the right model for our data. Our Proof of Concept model performs at greater than 95% accuracy.
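A minimal hand-rolled version of what Lazy Predict automates is shown below: the same (synthetic) cluster features are scored with several scikit-learn regressors via cross-validation. The toy dataset and the model shortlist are illustrative assumptions.

```python
# Compare several regressors on the same toy cluster features,
# mimicking what Lazy Predict does across all scikit-learn models.
import numpy as np
from sklearn.ensemble import BaggingRegressor, GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 200
X = np.column_stack([rng.integers(5, 100, n),     # cluster size
                     rng.uniform(0, 50, n)])      # distance to nearest store (km)
y = (X[:, 1] > 10).astype(float)                  # toy grey score: far == suspect

models = {
    "linear":   LinearRegression(),
    "bagging":  BaggingRegressor(random_state=0),
    "boosting": GradientBoostingRegressor(random_state=0),
    "forest":   RandomForestRegressor(random_state=0),
}
results = {name: cross_val_score(m, X, y, cv=5, scoring="r2").mean()
           for name, m in models.items()}
for name, r2 in sorted(results.items(), key=lambda kv: -kv[1]):
    print(f"{name:10s} R^2 = {r2:.3f}")
```

On this toy target (a sharp distance cutoff), tree ensembles unsurprisingly beat the linear baseline, which echoes the post's finding that bagging and boosting work best.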
Individual scans inherit the grey value of the closest cluster. Isolated scans are marked as safe.
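This inheritance rule can be sketched directly. The 5 km cutoff for "isolated", the centroid coordinates, and the rough planar distance are all illustrative assumptions.

```python
# Each scan inherits the grey label of its nearest cluster; scans with no
# cluster within max_km are treated as isolated and marked safe (0).
import numpy as np

centroids = np.array([[48.8566, 2.3522], [45.7640, 4.8357]])  # cluster centers
cluster_grey = np.array([0, 1])   # 0 = safe, 1 = suspect (from the regressor)

def grey_value(scan, centroids, cluster_grey, max_km=5.0):
    # Rough planar distance in km (~111 km/deg lat, ~74 km/deg lon at these
    # latitudes); fine at city scale, haversine would be used in production.
    d = np.linalg.norm((centroids - scan) * np.array([111.0, 74.0]), axis=1)
    i = int(np.argmin(d))
    return int(cluster_grey[i]) if d[i] <= max_km else 0  # isolated -> safe

print(grey_value(np.array([48.8570, 2.3530]), centroids, cluster_grey))  # near the safe cluster
print(grey_value(np.array([45.7650, 4.8360]), centroids, cluster_grey))  # near the suspect cluster
print(grey_value(np.array([50.0, 8.0]), centroids, cluster_grey))        # isolated scan
```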
Note that we use a regression model followed by a threshold comparison to produce grey scores, instead of a direct classification. The reason is twofold:
Features such as distance and size affect suspicion on a continuous scale and not discrete jumps.
The regression value allows us to attach a confidence score to each scan. For example, if we set our threshold at 50% and a scan has a regression score of 51%, it will be marked as suspect ‘1’, raising an alert. However, the person viewing this alert will also see that the score is only just above the threshold, so they know how much weight to give the alert. As with all good ML models, ours is always a work in progress. Based on the specific domain and customer needs, this model will evolve to include different features.
VIDEO: "An example of scan data visualization, produced in a few minutes using the Dash Plotly tool"
Next Step: Machine Learning Interpretability
An often-heard complaint about Machine Learning is that it is difficult to understand and feels a bit like a black box. At Tilkal we are cognizant of this issue and believe that it is our responsibility to make sure that our processes are transparent to the end user. As a next step in our grey market stack, we plan to implement the LIME technique (Local Interpretable Model-agnostic Explanations). Without getting into details, it shows the user which features in their data influenced a particular prediction. Here is an example of a LIME explanation: how much the different factors influenced the grey value for a particular sample.
Fig 3: LIME - Local Interpretable Model-agnostic Explanations (an example)
The grey market is a challenging problem for the supply chain industry, and this is our first step to solving it using Machine Learning. We believe it is possible to tackle the problem incrementally, gradually increasing the scope and complexity of our solution. Our ultimate goal is to make supply chains transparent and give our customers the ability to react swiftly to market challenges such as these.
Supply Chain Leakages influencing grey market - https://www.industryweek.com/leadership/companies-executives/article/21956474/driving-growth-by-controlling-grey-market-leakage
PwC anecdote of Australian products being sold in China - https://www.pwc.com.au/publications/the-press/trust-your-crust.html
An article by Rishika Rupam, Data & AI Research Engineer @Tilkal