In this article, we'll develop Python code for sending detected face images from an edge device to the recognition server, wrap the AI face recognition with a simple web API that receives those images, and show how the parts work together.
Introduction
Face recognition is one area of artificial intelligence (AI) where the modern approaches of deep learning (DL) have had great success during the last decade. The best face recognition systems can recognize people in images and video with the same precision humans can – or even better.
Our series of articles on this topic is divided into two parts:
- Face detection, where the client-side application detects human faces in images or in a video feed, aligns the detected face pictures, and submits them to the server.
- Face recognition (this part), where the server-side application performs face recognition.
We assume that you are familiar with deep neural networks (DNNs), Python, Keras, and TensorFlow. You are welcome to download this project code to follow along.
In the previous articles, we learned how to detect faces with the MTCNN library on a Raspberry Pi device and how to identify faces with the FaceNet model. In this article, we’ll see how these components can be used together in a simple web client-server system.
Client-Side Application
Let’s start with the client part running on an edge device. First, we write code for a simple class that will send face images to the server:
import json
import time

import cv2
import requests

class ImgSend:
    def __init__(self, host, port, debug_mode=False):
        self.host = host
        self.port = port
        self.url = host + ":" + str(port) + "/api/faceimg"
        self.dbg_mode = debug_mode

    def send(self, img):
        # Encode the image as PNG and send the raw bytes to the server
        (_, encoded) = cv2.imencode(".png", img)
        data = encoded.tobytes()
        headers = {"content-type": "image/png"}
        if self.dbg_mode:
            print("Sending request... ")
        t1 = time.time()
        response = requests.post(self.url, data=data, headers=headers)
        t2 = time.time()
        dt = t2 - t1
        if self.dbg_mode:
            print("Request processed: " + str(dt) + " sec")
        # The server replies with JSON; parse it into a dictionary
        result = json.loads(response.text)
        return result
The constructor receives the host and port parameters and forms the final URL with the special path /api/faceimg to route the request to the face identification method. In the send method, we encode the image to the PNG format, convert it to a byte buffer, and send it to the server with the requests.post function.
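Here is a minimal sketch of how the class can be used on its own. The host and port match the launch code shown later in this article; face.png is a hypothetical cropped face picture:

import cv2

# Illustrative values; "face.png" is a hypothetical cropped face image
sender = ImgSend("http://192.168.2.135", 5050, debug_mode=True)
face_img = cv2.imread("face.png")
reply = sender.send(face_img)
print(reply["message"])  # "RECOGNIZED" or "UNRECOGNIZED"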
We must also modify the face detector described in the Face Detection on Raspberry Pi part of this series:
# Uses cv2, os, and time (imported above); Face_Align_Mouth and Utils
# come from the face detection part of this series
class VideoWFR:
    def __init__(self, detector, sender):
        self.detector = detector
        self.sender = sender

    def process(self, video, align=False, save_path=None):
        detection_num = 0
        rec_num = 0
        capture = cv2.VideoCapture(video)
        dname = 'AI face recognition'
        cv2.namedWindow(dname, cv2.WINDOW_NORMAL)
        cv2.resizeWindow(dname, 960, 720)
        frame_count = 0
        dt = 0
        if align:
            fa = Face_Align_Mouth(160)
        while True:
            (ret, frame) = capture.read()
            if frame is None:
                break
            frame_count = frame_count + 1
            t1 = time.time()
            faces = self.detector.detect(frame)
            f_count = len(faces)
            detection_num += f_count
            names = None
            if (f_count > 0) and (self.sender is not None):
                names = [None] * f_count
                for (i, face) in enumerate(faces):
                    # Crop the face, aligning it if requested
                    if align:
                        (f_cropped, f_img) = fa.align(frame, face)
                    else:
                        (f_cropped, f_img) = self.detector.extract(frame, face)
                    if (f_img is not None) and (f_img.size != 0):
                        # Send the face picture to the server and parse the reply
                        response = self.sender.send(f_img)
                        is_recognized = response["message"] == "RECOGNIZED"
                        print(response["message"])
                        if is_recognized:
                            print(response["name"] + ": " + response["percent"])
                            rec_num += 1
                            name = response["name"]
                            percent = int(response["percent"])
                            conf = percent * 0.01
                            names[i] = (name, conf)
                            if save_path is not None:
                                # Save the recognized face picture to disk
                                ps = ("%03d" % rec_num) + "_" + name + "_" + ("%03d" % percent) + ".png"
                                ps = os.path.join(save_path, ps)
                                cv2.imwrite(ps, f_img)
            t2 = time.time()
            dt = dt + (t2 - t1)
            if len(faces) > 0:
                Utils.draw_faces(faces, (0, 0, 255), frame, True, True, names)
            cv2.imshow(dname, frame)
            if cv2.waitKey(1) & 0xFF == ord('q'):
                break
        capture.release()
        cv2.destroyAllWindows()
        if dt > 0:
            # Average frames per second over the whole run
            fps = frame_count / dt
        else:
            fps = 0
        return (detection_num, rec_num, fps)
Now we have a face recognizer, not just a face detector. It combines an internal MTCNN detector with the image sender: when a face is detected in a frame, it is sent to the server, and when the server's response arrives, it is parsed and the recognized face image is saved to the specified folder.
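For reference, the server we build below replies with one of two JSON payloads, which the client code above parses (the name and percent values here are illustrative):

# Successful identification ("Jane" and "93" are illustrative values)
{"message": "RECOGNIZED", "name": "Jane", "percent": "93"}
# The face did not match anyone in the database
{"message": "UNRECOGNIZED"}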
Server-Side Application
Let’s move to the server application. We use the Flask microframework to wrap our face identification code with a web API:
import os
import sys
import time

import cv2
import jsonpickle
import numpy as np
import flask
from flask import Flask, request, Response

print(flask.__version__)

app = Flask(__name__)
rec = None
f_db = None
rec_data = None
save_path = None

@app.route("/api/faceimg", methods=['POST'])
def test():
    response = {}
    r_status = 200
    r = request
    print("Processing recognition request... ")
    t1 = time.time()
    # Decode the raw PNG bytes back into an OpenCV image
    nparr = np.frombuffer(r.data, np.uint8)
    img = cv2.imdecode(nparr, cv2.IMREAD_COLOR)
    # Compute the FaceNet embeddings and look up the face database
    embds = rec.embeddings(img)
    data = rec.recognize(embds, f_db)
    t2 = time.time()
    dt = t2 - t1
    print("Recognition request processed: " + str(dt) + " sec")
    rec_data.count()
    ps = ""
    info = ""
    if data is not None:
        (name, dist, p_photo) = data
        conf = 1.0 - dist
        percent = int(conf * 100)
        info = "Recognized: " + name + " " + str(conf)
        ps = ("%03d" % rec_data.get_count()) + "_" + name + "_" + ("%03d" % percent) + ".png"
        response = {"message": "RECOGNIZED",
                    "name": name,
                    "percent": str(percent)}
    else:
        info = "UNRECOGNIZED"
        ps = ("%03d" % rec_data.get_count()) + "_unrecognized" + ".png"
        response = {"message": "UNRECOGNIZED"}
    print(info)
    if save_path is not None:
        # Keep a copy of the received face picture for inspection
        ps = os.path.join(save_path, ps)
        cv2.imwrite(ps, img)
    response_pickled = jsonpickle.encode(response)
    return Response(response=response_pickled, status=r_status, mimetype="application/json")
After initializing the Flask application, we apply the route decorator to trigger the test method with the specified URL (the same one the client application uses). In this method, we decode the received PNG image of a face, get the embeddings, recognize the face, and send the response back to the client.
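As a quick sanity check, the endpoint can also be exercised without the edge device. This is a minimal sketch assuming the server is reachable at localhost:5050 (see the port mapping below) and that test_face.png is any cropped face picture:

import requests

# test_face.png is a hypothetical cropped face image
with open("test_face.png", "rb") as f:
    data = f.read()
resp = requests.post("http://localhost:5050/api/faceimg",
                     data=data,
                     headers={"content-type": "image/png"})
print(resp.json())  # e.g. {"message": "UNRECOGNIZED"}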
Running the System in a Container
Finally, here is the code for running our web application:
if __name__ == "__main__":
    # Host and port are passed on the command line
    host = str(sys.argv[1])
    port = int(sys.argv[2])
    m_file = r"/home/pi_fr/net/facenet_keras.h5"
    # FaceNetRec, RecData, and FaceDB come from the previous article
    rec = FaceNetRec(m_file, 0.5)
    rec_data = RecData()
    print("Recognizer loaded.")
    print(rec.get_model().inputs)
    print(rec.get_model().outputs)
    save_path = r"/home/pi_fr/rec"
    db_path = r"/home/pi_fr/db"
    # Load the database of known faces
    f_db = FaceDB()
    f_db.load(db_path, rec)
    db_f_count = len(f_db.get_data())
    print("Face DB loaded: " + str(db_f_count))
    print("Face recognition running")
    app.run(host=host, port=port, threaded=False)
As we're going to run the web application in the Docker container created in the previous part, we need to start this container with the appropriate network settings. Use the following commands to create a new container from the image:
c:\>docker network create my-net
c:\>docker create --name FR_2 --network my-net --publish 5050:50 sergeylgladkiy/fr:v1
When the FR_2 container starts, it forwards port 5050 of the host machine to the internal port 50 of the container.
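If you want to double-check the mapping, Docker can list the container's port bindings; for the container created above, the output should look like this:

c:\>docker port FR_2
50/tcp -> 0.0.0.0:5050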
Now we can run the application in the container (note that because it is inside the container, we specify internal port 50):
# python /home/pi_fr/pi_fr_facenet.run_align_dock_flask.lnx.py 0.0.0.0 50
When the server starts, we can run the client on a Raspberry Pi device. Here is the code we used to launch the application:
if __name__ == "__main__":
    v_file = r"/home/pi/Desktop/PI_FR/video/5_2.mp4"
    host = "http://192.168.2.135"
    port = 5050
    save_path = r"/home/pi/Desktop/PI_FR/rec"
    # MTCNN detector configured as in the face detection part of this series
    d = MTCNN_Detector(50, 0.95)
    sender = ImgSend(host, port, True)
    vr = VideoWFR(d, sender)
    # Process the video with face alignment enabled
    (f_count, rec_count, fps) = vr.process(v_file, True, save_path)
    print("Face detections: " + str(f_count))
    print("Face recognitions: " + str(rec_count))
    print("FPS: " + str(fps))
Note that the client uses the IP address and port (5050) of the host machine, not the container's IP address and internal port (50).
The following two videos show how our client-server system works:
As you can see, the recognition request was processed very quickly: it took only about 0.07 seconds. This confirms that the system architecture is sound. The client sends the server only the cropped and aligned face images, which keeps the network load low, while the identification algorithm runs on a powerful server computer.
Next Steps
In the next article of the series, we’ll show how to run the face recognition servers on Kubernetes. Stay tuned!