Remote video streaming with face detection

July 12, 2019

I want to share some notes on an interesting project I worked on recently. The goal was to build an asynchronous server, connect a camera for video streaming, detect faces in the video stream, and show the result on the Web. I used OpenCV and Tornado.

I won’t describe the front-end part, and you can easily find how to write a basic Tornado server on my GitHub, so I will focus on the server part.

Let’s look at the implementation of the code that streams MJPEG video to our Web page.

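import tornado.gen
import tornado.web


class StreamHandler(tornado.web.RequestHandler):
    # `cam` is the camera wrapper (defined elsewhere) that provides get_frame().
    # The asynchronous decorator is only needed on Tornado < 6; it was removed in 6.0.
    @tornado.web.asynchronous
    @tornado.gen.coroutine
    def get(self):
        self.set_header('Cache-Control',
                        'no-store, no-cache, must-revalidate, pre-check=0, post-check=0, max-age=0')
        self.set_header('Connection', 'close')
        self.set_header('Content-Type', 'multipart/x-mixed-replace;boundary=boundarydonotcross')
        # ?fd=true turns face detection on; it defaults to off.
        detect_faces = self.get_argument('fd', 'false') == 'true'
        while True:
            img = cam.get_frame(detect_faces)  # one JPEG-encoded frame as bytes
            # Each part starts with "--" followed by the boundary from the header.
            self.write("--boundarydonotcross\r\n")
            self.write("Content-Type: image/jpeg\r\n")
            self.write("Content-Length: %s\r\n\r\n" % len(img))
            self.write(img)
            self.write("\r\n")
            # Wait until the frame is sent before producing the next one.
            yield self.flush()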

MJPEG is a stream of separate JPEG images: every frame is compressed on its own and does not depend on the frames around it.

pros:

simple to use
good picture quality
low latency

cons:

introduces a lot of traffic.

The method shown is a generator: on every cycle it calls the get_frame function and writes one part of the HTTP response, an MJPEG fragment with the detected face drawn on it. As you can see, the @tornado.web.asynchronous decorator marks this method as asynchronous, so waiting for a frame to be sent does not block our server.
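
To make the framing concrete, here is a minimal sketch of how one part of the multipart response could be assembled. The helper name build_mjpeg_part and the BOUNDARY constant are hypothetical and not part of the original handler. The sketch assumes the boundary declared in the Content-Type header is boundarydonotcross, so every part begins with that value prefixed by two dashes.

```python
# Assumed to match the boundary declared in the Content-Type header.
BOUNDARY = b"boundarydonotcross"


def build_mjpeg_part(jpeg_bytes):
    """Wrap one encoded JPEG frame in multipart/x-mixed-replace framing."""
    header_lines = [
        b"--" + BOUNDARY,
        b"Content-Type: image/jpeg",
        b"Content-Length: " + str(len(jpeg_bytes)).encode("ascii"),
        b"",  # joining with CRLF makes this the end of the last header line
    ]
    # The extra CRLF after the headers is the blank line before the body;
    # the trailing CRLF separates the image data from the next boundary.
    return b"\r\n".join(header_lines) + b"\r\n" + jpeg_bytes + b"\r\n"
```

A handler could then write build_mjpeg_part(img) in a single call instead of writing the boundary, headers, and image separately.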

Let’s look at the get_frame implementation. First of all, we need to get a picture from the camera. The -1 argument means that we want to capture pictures from any camera found in the system.

self.cam = cv2.VideoCapture(-1)
success, image = self.cam.read()
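
The read() call returns a success flag together with the frame, and it is worth checking that flag before processing the image. Below is a small sketch of such a check. The function name read_frame_or_raise is hypothetical; it accepts any object with a read() method that returns (success, image), which is what cv2.VideoCapture provides.

```python
def read_frame_or_raise(capture):
    """Return the next frame from a cv2.VideoCapture-like object.

    `capture` only needs a read() method returning (success, image).
    """
    success, image = capture.read()
    if not success or image is None:
        # No camera, camera busy, or the stream ended.
        raise RuntimeError("could not read a frame from the camera")
    return image
```

With OpenCV installed, this could be called as read_frame_or_raise(self.cam) inside get_frame.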

Once we have a picture, let’s try to find a face in it. To simplify processing, we convert it to grayscale by calling the following OpenCV method.

gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

Now we can try to find a face. There are many different algorithms and methods for this task. One of the simplest is Haar cascade detection. A Haar cascade can detect the kind of objects it was trained on, so first we need a cascade trained on our type of object, in our case human faces. We will take the standard one from the OpenCV library. Let’s load it using the standard cv2 method.

self.face_cascade = cv2.CascadeClassifier('face.xml')

Now that we have initialized the Haar cascade classifier, let’s pass an image to it and see what we get.

faces = self.face_cascade.detectMultiScale(gray)
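
In practice it helps to confirm that the cascade file was actually loaded and to tune the detector a little. The sketch below wraps the call in a hypothetical helper, detect_faces. It relies on two members of cv2.CascadeClassifier: empty(), which reports whether a cascade is loaded, and detectMultiScale with its scaleFactor and minNeighbors parameters. The values 1.1 and 5 are illustrative starting points, not settings taken from this project.

```python
def detect_faces(cascade, gray, scale_factor=1.1, min_neighbors=5):
    """Run a loaded Haar cascade and return faces as (x, y, w, h) int tuples.

    `cascade` is expected to behave like cv2.CascadeClassifier and
    `gray` to be a grayscale image.
    """
    # A missing or unreadable XML file typically shows up as empty().
    if cascade.empty():
        raise ValueError("cascade file was not loaded")
    rects = cascade.detectMultiScale(
        gray, scaleFactor=scale_factor, minNeighbors=min_neighbors
    )
    # Normalise the result to plain Python ints for easier handling.
    return [tuple(int(v) for v in rect) for rect in rects]
```

A higher minNeighbors value usually means fewer false detections at the cost of missing some faces, so it is a reasonable first knob to adjust.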

You can find an explanation of the algorithm in “Face Detection using Haar Cascades”, so I won’t go into detail here. The function returns the coordinates of a rectangle around each face, as x, y, width and height.

All we need now is to draw these rectangles on the source image and encode it in jpeg format.

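for f in faces:
    # Each face is (x, y, width, height). `scale` is defined elsewhere and
    # presumably maps the detection coordinates back onto the source image.
    x, y, w, h = [int(float(v) * scale) for v in f]
    cv2.rectangle(image, (x, y), (x + w, y + h), (255, 255, 255), 2)
# Encode the annotated frame as JPEG and return its raw bytes.
ret, jpeg = cv2.imencode('.jpg', image)
return jpeg.tobytes()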
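
The loop above multiplies every coordinate by scale, which this excerpt does not define. One plausible reading, and it is only an assumption, is that detection runs on a downscaled copy of the frame and the rectangles are mapped back to the full-size image. The hypothetical helper rect_to_corners below sketches that mapping and returns the two corner points that cv2.rectangle expects.

```python
def rect_to_corners(rect, scale=1.0):
    """Convert an (x, y, w, h) rectangle into two corner points.

    `scale` is an assumed ratio between the source image size and the
    size of the image the detector ran on (1.0 means no resizing).
    """
    x, y, w, h = [int(float(v) * scale) for v in rect]
    top_left = (x, y)
    bottom_right = (x + w, y + h)
    return top_left, bottom_right
```

For example, a face found at (10, 20, 30, 40) on a half-size image maps to the corners (20, 40) and (80, 120) on the original when scale is 2.0.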

Now we have a byte array that represents one MJPEG frame. All that is left is to send it back to the client. The result is shown below:

A few words about containerizing the application. I used Docker Compose for this. I’d like to focus on one point: the container needs access to a Web camera that is connected to the host, outside the container. The solution to this problem is shown below.