In-App Fraud Detection
- #Data analytics
- #Fraud detection
- #Machine learning
About the Client
Wi-Fi Map is the largest crowd-sourced Wi-Fi community in the world.
More than 50 million users have downloaded the application from Google Play.
Wi-Fi Map simplifies Wi-Fi spot use for millions of people, who use the application daily to get accurate data about Wi-Fi hotspots worldwide.
However, some users submit fake hotspot data to the network.
The client wanted to detect fraudulent information since it harms the user experience and overall brand loyalty.
Finding a Wi-Fi hotspot to avoid data roaming charges isn’t always easy.
But the crowdsourced Wi-Fi Map app highlights hotspots around your current location and provides you with the password to use them.
You’ll want to be careful about what data you access while connected to public Wi-Fi, so choose wisely both which hotspot you use and what you do on it.
Quantum’s task was to enhance the solution with automatic fraud detection.
The component we developed flags possible fraud when a new Wi-Fi spot is registered.
It sends an alert to the manager and informs the final decision.
We worked with the user data and their actions in the system.
First, we tried to cluster all users into two groups according to the data we had, but this approach failed.
After more in-depth data analysis, we found a parameter that shows if a user was banned or not.
We used this parameter to split users into two groups and trained an ML model on that split.
When we finished the training, our team wrote a document with instructions on how to implement our results in code.
In a single month, we solved a problem the client had been struggling with for several years.
Our first approach was unsupervised: we tried clustering and one-class classifiers to separate fraudulent hotspot tips from genuine ones. For this task, we used the one-class SVM from scikit-learn, clustering algorithms such as k-means, and t-SNE embeddings (t-SNE is a dimensionality-reduction technique rather than a clustering algorithm).
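A minimal sketch of what such an unsupervised pass might look like with scikit-learn. The two per-user features, the synthetic data, and the parameter values (`nu`, `n_clusters`) are assumptions made for illustration only; they are not the client's data or the settings used in the project.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Hypothetical per-user features: hotspots added per day and the share of
# tips later reported as wrong. Synthetic values, not real user data.
regular = rng.normal(loc=[2.0, 0.1], scale=[1.0, 0.05], size=(500, 2))
suspicious = rng.normal(loc=[15.0, 0.7], scale=[3.0, 0.1], size=(25, 2))
X = StandardScaler().fit_transform(np.vstack([regular, suspicious]))

# One-class SVM: learns a boundary around the bulk of the data.
# nu is an upper bound on the fraction of training points left outside it.
ocsvm = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05).fit(X)
svm_flags = ocsvm.predict(X) == -1  # True where a user looks like an outlier

# k-means: force the users into two groups and inspect the assignment.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
cluster_ids = kmeans.labels_

print("users flagged by one-class SVM:", int(svm_flags.sum()))
print("k-means cluster sizes:", np.bincount(cluster_ids))
```

On real data the groups are rarely this well separated, which is consistent with the unsupervised route not producing usable results here.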
When none of the clustering methods produced usable results, we took a feature from the dataset, the one showing whether a user had been banned, and used it as the fraud label. With a label in place, we could engineer new features from the existing ones. We then trained an XGBoost classifier, which solved the client’s challenge.
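The supervised step can be sketched roughly as follows. To keep the example dependent on scikit-learn alone, it uses `GradientBoostingClassifier` as a stand-in for XGBoost; `xgboost.XGBClassifier` follows the same `fit` / `predict_proba` interface and could be swapped in. All column names, the derived features, and the synthetic data are hypothetical.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 2000

# Hypothetical raw columns, generated synthetically for illustration.
banned = rng.random(n) < 0.1  # label: was the user banned?
hotspots_added = rng.poisson(np.where(banned, 20, 4)) + 1
tips_reported_wrong = rng.binomial(hotspots_added, np.where(banned, 0.6, 0.1))
account_age_days = rng.integers(1, 1000, size=n)

# New features derived from the existing ones (assumed examples).
wrong_ratio = tips_reported_wrong / hotspots_added
adds_per_day = hotspots_added / account_age_days

X = np.column_stack([hotspots_added, tips_reported_wrong,
                     account_age_days, wrong_ratio, adds_per_day])
y = banned.astype(int)

# Hold out a stratified test set so the rare "banned" class appears in both.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42)

# Gradient-boosted trees trained on the banned / not-banned label.
model = GradientBoostingClassifier(random_state=42).fit(X_train, y_train)
fraud_scores = model.predict_proba(X_test)[:, 1]  # probability of "banned"
auc = roc_auc_score(y_test, fraud_scores)

print(f"ROC AUC on held-out users: {auc:.3f}")
```

In a setup like this, the per-registration score could feed the alert sent to the manager, with the threshold chosen according to how many false alarms are acceptable.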