In this article, we'll develop Python code for sending detected face images from an edge device to the recognition server, wrap the AI face recognition with a simple web API that receives those images, and show how the parts work together.
Introduction
Face recognition is one area of artificial intelligence (AI) where the modern approaches of deep learning (DL) have had great success during the last decade. The best face recognition systems can recognize people in images and video with the same precision humans can – or even better.
Our series of articles on this topic is divided into two parts:
- Face detection, where the client-side application detects human faces in images or in a video feed, aligns the detected face pictures, and submits them to the server.
- Face recognition (this part), where the server-side application performs face recognition.
We assume that you are familiar with deep neural networks (DNNs), Python, Keras, and TensorFlow. You are welcome to download this project code to follow along.
In the previous articles, we learned how to detect faces with the MTCNN library on a Raspberry Pi device and how to identify faces with the FaceNet model. In this article, we’ll see how these components can be used together in a simple web client-server system.
Client-Side Application
Let’s start with the client part running on an edge device. First, we write code for a simple class that will send face images to the server:
import json
import time

import cv2
import requests

class ImgSend:
    def __init__(self, host, port, debug_mode=False):
        self.host = host
        self.port = port
        self.url = host + ":" + str(port) + "/api/faceimg"
        self.dbg_mode = debug_mode

    def send(self, img):
        # Encode the image as PNG and send the raw bytes to the server
        (_, encoded) = cv2.imencode(".png", img)
        data = encoded.tobytes()
        headers = {"content-type": "image/png"}
        if self.dbg_mode:
            print("Sending request... ")
        t1 = time.time()
        response = requests.post(self.url, data=data, headers=headers)
        t2 = time.time()
        dt = t2 - t1
        if self.dbg_mode:
            print("Request processed: " + str(dt) + " sec")
        # The server replies with JSON; parse it into a dictionary
        result = json.loads(response.text)
        return result
The constructor receives the host and port parameters and forms the final URL with the special path /api/faceimg to route the request to the face identification method. In the send method, we encode the image to the PNG format, convert it to a byte buffer, and send it to the server with the requests.post function.
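Here is a minimal sketch of how the class can be used on its own. The host and port match the launch code shown later in this article; face.png is a hypothetical cropped face picture:

import cv2

# Illustrative values; "face.png" is a hypothetical cropped face image
sender = ImgSend("http://192.168.2.135", 5050, debug_mode=True)
face_img = cv2.imread("face.png")
reply = sender.send(face_img)
print(reply["message"])  # "RECOGNIZED" or "UNRECOGNIZED"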
We must also modify the face detector described in the Face Detection on Raspberry Pi part of this series:
# Uses cv2, os, and time (imported above); Face_Align_Mouth and Utils
# come from the face detection part of this series
class VideoWFR:
    def __init__(self, detector, sender):
        self.detector = detector
        self.sender = sender

    def process(self, video, align=False, save_path=None):
        detection_num = 0
        rec_num = 0
        capture = cv2.VideoCapture(video)
        dname = 'AI face recognition'
        cv2.namedWindow(dname, cv2.WINDOW_NORMAL)
        cv2.resizeWindow(dname, 960, 720)
        frame_count = 0
        dt = 0
        if align:
            fa = Face_Align_Mouth(160)
        while True:
            (ret, frame) = capture.read()
            if frame is None:
                break
            frame_count = frame_count + 1
            t1 = time.time()
            faces = self.detector.detect(frame)
            f_count = len(faces)
            detection_num += f_count
            names = None
            if (f_count > 0) and (self.sender is not None):
                names = [None] * f_count
                for (i, face) in enumerate(faces):
                    # Crop the face, aligning it if requested
                    if align:
                        (f_cropped, f_img) = fa.align(frame, face)
                    else:
                        (f_cropped, f_img) = self.detector.extract(frame, face)
                    if (f_img is not None) and (f_img.size != 0):
                        # Send the face picture to the server and parse the reply
                        response = self.sender.send(f_img)
                        is_recognized = response["message"] == "RECOGNIZED"
                        print(response["message"])
                        if is_recognized:
                            print(response["name"] + ": " + response["percent"])
                            rec_num += 1
                            name = response["name"]
                            percent = int(response["percent"])
                            conf = percent * 0.01
                            names[i] = (name, conf)
                            if save_path is not None:
                                # Save the recognized face picture to disk
                                ps = ("%03d" % rec_num) + "_" + name + "_" + ("%03d" % percent) + ".png"
                                ps = os.path.join(save_path, ps)
                                cv2.imwrite(ps, f_img)
            t2 = time.time()
            dt = dt + (t2 - t1)
            if len(faces) > 0:
                Utils.draw_faces(faces, (0, 0, 255), frame, True, True, names)
            cv2.imshow(dname, frame)
            if cv2.waitKey(1) & 0xFF == ord('q'):
                break
        capture.release()
        cv2.destroyAllWindows()
        if dt > 0:
            # Average frames per second over the whole run
            fps = frame_count / dt
        else:
            fps = 0
        return (detection_num, rec_num, fps)
Now we have a face recognizer, not just a face detector. It combines an internal MTCNN detector with the image sender: when a face is detected in a frame, it is sent to the server, and when the server's response arrives, it is parsed and the recognized face image is saved to the specified folder.
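For reference, the server we build below replies with one of two JSON payloads, which the client code above parses (the name and percent values here are illustrative):

# Successful identification ("Jane" and "93" are illustrative values)
{"message": "RECOGNIZED", "name": "Jane", "percent": "93"}
# The face did not match anyone in the database
{"message": "UNRECOGNIZED"}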
Server-Side Application
Let’s move to the server application. We use the Flask microframework to wrap our face identification code with a web API:
import os
import sys
import time

import cv2
import jsonpickle
import numpy as np
import flask
from flask import Flask, request, Response

print(flask.__version__)

app = Flask(__name__)
rec = None
f_db = None
rec_data = None
save_path = None

@app.route("/api/faceimg", methods=['POST'])
def test():
    response = {}
    r_status = 200
    r = request
    print("Processing recognition request... ")
    t1 = time.time()
    # Decode the raw PNG bytes back into an OpenCV image
    nparr = np.frombuffer(r.data, np.uint8)
    img = cv2.imdecode(nparr, cv2.IMREAD_COLOR)
    # Compute the FaceNet embeddings and look up the face database
    embds = rec.embeddings(img)
    data = rec.recognize(embds, f_db)
    t2 = time.time()
    dt = t2 - t1
    print("Recognition request processed: " + str(dt) + " sec")
    rec_data.count()
    ps = ""
    info = ""
    if data is not None:
        (name, dist, p_photo) = data
        conf = 1.0 - dist
        percent = int(conf * 100)
        info = "Recognized: " + name + " " + str(conf)
        ps = ("%03d" % rec_data.get_count()) + "_" + name + "_" + ("%03d" % percent) + ".png"
        response = {"message": "RECOGNIZED",
                    "name": name,
                    "percent": str(percent)}
    else:
        info = "UNRECOGNIZED"
        ps = ("%03d" % rec_data.get_count()) + "_unrecognized" + ".png"
        response = {"message": "UNRECOGNIZED"}
    print(info)
    if save_path is not None:
        # Keep a copy of the received face picture for inspection
        ps = os.path.join(save_path, ps)
        cv2.imwrite(ps, img)
    response_pickled = jsonpickle.encode(response)
    return Response(response=response_pickled, status=r_status, mimetype="application/json")
After initializing the Flask application, we apply the route decorator to trigger the test method with the specified URL (the same one the client application uses). In this method, we decode the received PNG image of a face, get the embeddings, recognize the face, and send the response back to the client.
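As a quick sanity check, the endpoint can also be exercised without the edge device. This is a minimal sketch assuming the server is reachable at localhost:5050 (see the port mapping below) and that test_face.png is any cropped face picture:

import requests

# test_face.png is a hypothetical cropped face image
with open("test_face.png", "rb") as f:
    data = f.read()
resp = requests.post("http://localhost:5050/api/faceimg",
                     data=data,
                     headers={"content-type": "image/png"})
print(resp.json())  # e.g. {"message": "UNRECOGNIZED"}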
Running the System in a Container
Finally, here is the code for running our web application:
if __name__ == "__main__":
    # Host and port are passed on the command line
    host = str(sys.argv[1])
    port = int(sys.argv[2])
    m_file = r"/home/pi_fr/net/facenet_keras.h5"
    # FaceNetRec, RecData, and FaceDB come from the previous article
    rec = FaceNetRec(m_file, 0.5)
    rec_data = RecData()
    print("Recognizer loaded.")
    print(rec.get_model().inputs)
    print(rec.get_model().outputs)
    save_path = r"/home/pi_fr/rec"
    db_path = r"/home/pi_fr/db"
    # Load the database of known faces
    f_db = FaceDB()
    f_db.load(db_path, rec)
    db_f_count = len(f_db.get_data())
    print("Face DB loaded: " + str(db_f_count))
    print("Face recognition running")
    app.run(host=host, port=port, threaded=False)
As we're going to run the web application in the Docker container created in the previous part, we need to start this container with the appropriate network settings. Use the following commands to create a new container from the image:
c:\>docker network create my-net
c:\>docker create --name FR_2 --network my-net --publish 5050:50 sergeylgladkiy/fr:v1
When the FR_2 container starts, it forwards port 5050 of the host machine to the internal port 50 of the container.
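If you want to double-check the mapping, Docker can list the container's port bindings; for the container created above, the output should look like this:

c:\>docker port FR_2
50/tcp -> 0.0.0.0:5050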
Now we can run the application in the container (note that because it is inside the container, we specify internal port 50):
# python /home/pi_fr/pi_fr_facenet.run_align_dock_flask.lnx.py 0.0.0.0 50
When the server starts, we can run the client on a Raspberry Pi device. Here is the code we used to launch the application:
if __name__ == "__main__":
    v_file = r"/home/pi/Desktop/PI_FR/video/5_2.mp4"
    host = "http://192.168.2.135"
    port = 5050
    save_path = r"/home/pi/Desktop/PI_FR/rec"
    # MTCNN detector configured as in the face detection part of this series
    d = MTCNN_Detector(50, 0.95)
    sender = ImgSend(host, port, True)
    vr = VideoWFR(d, sender)
    # Process the video with face alignment enabled
    (f_count, rec_count, fps) = vr.process(v_file, True, save_path)
    print("Face detections: " + str(f_count))
    print("Face recognitions: " + str(rec_count))
    print("FPS: " + str(fps))
Note that the client uses the IP address and port (5050) of the host machine, not the container's IP address and internal port (50).
The following two videos show how our client-server system works:
As you can see, the recognition request was processed very quickly: it took only about 0.07 seconds. This confirms that the system architecture is sound. The client sends the server only the cropped and aligned face images, which keeps the network load low, while the identification algorithm runs on a powerful server computer.
Next Steps
In the next article of the series, we’ll show how to run the face recognition servers on Kubernetes. Stay tuned!