Fraud Detection in Social Networks
About the Client
WiFi Map is the largest crowd-sourced Wi-Fi community in the world.
More than 50 million users have downloaded its application from Google Play.
WiFi Map simplifies WiFi spot use for millions of people.
Every day, they use this application to get accurate data about WiFi around the world.
However, some people register fake data in the network.
The client wanted to detect fraudulent information since it harms the user experience and overall brand loyalty.
Finding a WiFi hotspot to avoid data roaming charges isn’t always easy.
But the crowdsourced WiFi Map app highlights hotspots around your current location and provides you with the password to use them.
Be careful about what data you access while connected to public WiFi: choose wisely not only which hotspot you use but also what you do on it.
Quantum’s task was to enhance the solution with automatic fraud detection.
The component we developed flags possible fraud while a new WiFi spot is being registered.
It sends an alert to the manager and informs the final decision.
We worked with the user data and their actions in the system.
First, we tried to cluster all users into two groups according to the data we had, but this approach failed.
After more in-depth data analysis, we found a parameter that shows whether a user had been banned.
We used this parameter to split users into two groups on which an ML model could be trained.
When training was finished, our team wrote a document with instructions on how to integrate our results into the client's code.
In a single month, we solved a problem the client had been struggling with for several years.
The first approach was to use clustering and one-class classifiers to distinguish fraudulent hotspot tips. For this task, we used the one-class SVM from scikit-learn as well as k-means clustering and t-SNE, a dimensionality-reduction technique often paired with clustering.
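The unsupervised attempt can be illustrated with a minimal sketch, assuming scikit-learn and NumPy are installed. The data below is synthetic and the two feature names are hypothetical, chosen only for illustration; the project's real features are not listed in this case study. The synthetic groups are deliberately well separated, which was evidently not the case for the real data.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Hypothetical per-user features (assumption, not from the project):
# [hotspots_added_per_day, share_of_tips_rejected]
normal = rng.normal(loc=[2.0, 0.1], scale=[1.0, 0.05], size=(200, 2))
suspicious = rng.normal(loc=[30.0, 0.8], scale=[5.0, 0.1], size=(10, 2))

# Scale features so that neither one dominates the distance computations.
X = StandardScaler().fit_transform(np.vstack([normal, suspicious]))

# One-class SVM: learns a boundary around the bulk of the data.
# fit_predict() returns +1 for inliers and -1 for outliers.
ocsvm = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05)
svm_flags = ocsvm.fit_predict(X)

# k-means with two clusters, hoping that one cluster captures the fraud.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X)

print("flagged by one-class SVM:", int((svm_flags == -1).sum()))
print("k-means cluster sizes:", np.bincount(cluster_ids).tolist())
```

Neither method sees any labels, so whether the flagged users or the smaller cluster actually correspond to fraud has to be checked by hand.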
When none of the clustering methods gave usable results, we chose a feature from the dataset to serve as the fraud label. With a label in place, we also derived new features from the existing ones. We then trained an XGBoost classifier through its scikit-learn-compatible interface, which solved the client’s challenge.
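The supervised step might look roughly like the sketch below. It is an illustration under stated assumptions, not the project's code: the data is synthetic, the column names (`hotspots_added`, `hotspots_rejected`, `rejection_rate`) and the hyperparameters are hypothetical, and if the `xgboost` package is not installed the sketch falls back to scikit-learn's `GradientBoostingClassifier` as a stand-in.

```python
import numpy as np
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

try:
    # XGBoost ships a scikit-learn-compatible estimator.
    from xgboost import XGBClassifier as Booster
except ImportError:
    # Stand-in when xgboost is unavailable (a different implementation).
    from sklearn.ensemble import GradientBoostingClassifier as Booster

rng = np.random.default_rng(0)
n_users = 1000

# Label: whether the user was banned (synthetic, about 10% positives).
banned = rng.random(n_users) < 0.1

# Hypothetical raw features, drawn differently for banned users.
hotspots_added = rng.poisson(np.where(banned, 20, 4))
hotspots_rejected = rng.binomial(hotspots_added, np.where(banned, 0.7, 0.1))

# A new feature derived from the existing ones.
rejection_rate = hotspots_rejected / np.maximum(hotspots_added, 1)

X = np.column_stack([hotspots_added, hotspots_rejected, rejection_rate])
y = banned.astype(int)

# Stratify so both splits keep the same share of banned users.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = Booster(n_estimators=100, max_depth=3, learning_rate=0.1)
model.fit(X_train, y_train)

# Probability that each held-out user belongs to the "banned" class.
fraud_proba = model.predict_proba(X_test)[:, 1]
print(classification_report(y_test, model.predict(X_test)))
```

In a setup like the one described above, a probability such as `fraud_proba` could be compared against a threshold to decide when to alert the manager; the threshold itself would be a business decision.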