To get the best possible results, the project was split into three stages:
Data labeling and preparation
For better results, we used videos recorded on the target platform. Each video was labeled with the handwashing steps it contains. Because the steps are assumed not to overlap, each step is labeled by its start and end time in the video. Since collecting videos takes significant effort, we used around 100 videos for baseline development, with a 60/40 train/test split.
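The labeling scheme and split described above can be sketched in Python as follows. This is a minimal illustration, not the project's actual tooling: the `Segment` record, the `"none"` background label, the step names, and the choice to split at the video level with a fixed seed are all assumptions made for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class Segment:
    step: str       # hypothetical step name, e.g. "wet" or "soap"
    start_s: float  # start time in the video, in seconds
    end_s: float    # end time in the video, in seconds


def frame_labels(segments, duration_s, fps, background="none"):
    """Expand non-overlapping (start, end) segments into one label per frame."""
    n_frames = int(duration_s * fps)
    labels = [background] * n_frames
    for seg in segments:
        first = int(seg.start_s * fps)
        last = min(int(seg.end_s * fps), n_frames)
        for i in range(first, last):
            labels[i] = seg.step
    return labels


def split_videos(video_ids, train_fraction=0.6, seed=0):
    """Shuffle video ids reproducibly and split them into train and test sets."""
    ids = sorted(video_ids)
    random.Random(seed).shuffle(ids)
    cut = round(len(ids) * train_fraction)
    return ids[:cut], ids[cut:]
```

Splitting by video rather than by frame keeps frames from the same recording out of both sets, which would otherwise inflate test scores.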
Modeling
We built a handwashing event detector. The events are the different stages of the handwashing process. For each detected event, our system outputs its timestamps and class, so that our client’s systems can compare a handwashing procedure with the World Health Organization’s recommended procedure.
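As a rough illustration of how a downstream system might use this output, the sketch below compares a list of detected events with a reference list of steps. The `(label, start_s, end_s)` tuple format, the function name, and the placeholder step names are assumptions for the example; the client's actual comparison logic is not described here.

```python
def compare_to_reference(events, reference_steps):
    """Report which reference steps are missing and whether order is respected.

    events: list of (label, start_s, end_s) tuples, sorted by start time.
    reference_steps: ordered list of expected step labels.
    """
    detected = [label for label, _start, _end in events]
    missing = [step for step in reference_steps if step not in detected]

    # Order of first occurrence of each detected reference step.
    seen = []
    for label in detected:
        if label in reference_steps and label not in seen:
            seen.append(label)

    # The same steps, in the order the reference prescribes.
    expected_order = [step for step in reference_steps if step in seen]
    return {"missing": missing, "in_order": seen == expected_order}
```

For example, with a reference of `["step_a", "step_b", "step_c", "step_d"]` and detected events for `step_a`, `step_c`, `step_b` in that order, the result reports `step_d` as missing and the order as not respected.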
At the core of the event detector is a neural network that uses both spatial and temporal information to give an accurate event classification for each timestep. The predictions are then filtered based on the distribution of handwashing event times for each event class.
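The filtering step could look something like the sketch below. The exact rule is not specified above, so this example makes several assumptions: "event times" is read as event durations, the filter is a per-class lower bound taken from a low quantile of training-set durations, and a `"none"` background label is never filtered. All function names are hypothetical.

```python
from itertools import groupby


def to_events(frame_preds, fps):
    """Collapse per-frame class predictions into (label, start_s, end_s) runs."""
    events, idx = [], 0
    for label, run in groupby(frame_preds):
        n = len(list(run))
        events.append((label, idx / fps, (idx + n) / fps))
        idx += n
    return events


def min_durations(train_durations, quantile=0.05):
    """Per-class lower duration bound (seconds) from training-set durations."""
    bounds = {}
    for label, durations in train_durations.items():
        ordered = sorted(durations)
        k = int(quantile * (len(ordered) - 1))
        bounds[label] = ordered[k]
    return bounds


def filter_events(events, bounds, background="none"):
    """Drop non-background events shorter than their class's lower bound."""
    return [
        e for e in events
        if e[0] == background or (e[2] - e[1]) >= bounds.get(e[0], 0.0)
    ]
```

A rule like this removes short, isolated misclassifications (a few frames of one class inside a long run of another) that are implausible given how long each step usually takes.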
Deployment to the hardware platform
The algorithm is deployed in two environments. An AWS g3s.xlarge instance handles on-demand cloud processing. A Raspberry Pi 3B with an Intel Neural Compute Stick 2 delivers real-time processing on the hand hygiene station itself, which captures RGB video with a Pi Camera v2.