Posts 20 Days of openCV - Day 1
Post
Cancel

20 Days of openCV - Day 1

20 Days of openCV - Day 1

This blog is based on Ardrian R. Face detection with OpenCV and deep learning - PyImageSearch post

Introduction

We already know that openCV ships out-of-the-box with pretrained haar cascades which are used for face detection. But, only few known that it also has hidden dnn module which uses deep learning models for object detection

In this post we will use openCV’s dnn module for face detection.

Object Detection

dnn module was introduced in openCV 3.3. This module supports number of Deep learning frameworks like tensorflow, caffe and torch/pytorch

Usally there are 3 main models used for object detection task.

  1. RCNN - Regions with CNN (used in traditional Models)
  2. YOLO - DNN based hence more accurate, but not very fast
  3. SDD - Mix between high accuracy and fast performance.

Code

In this blog we will use SSD with mobile-net. Now lets get coding.

Load the model

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
# read the model arch and load weights
net = cv2.dnn.readNetFromCaffe(args["prototxt"], args["model"])

# load the input image and construct an input blob for the image
# by resizing to a fixed 300x300 pixels and then normalizing it
image = cv2.imread(args["image"])
(h, w) = image.shape[:2]


blob = cv2.dnn.blobFromImage(cv2.resize(image, resize=(300, 300)), scaled=1.0, normalize=(300, 300), (104.0, 177.0, 123.0))

# set input
net.setInput(blob)
# get predictions
detections = net.forward()

Now we need to read the predictions

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
# iterate through all predictions
for i in range(0, detections.shape[2]):
    
    # get confidence for that detection
    confidence = detections[0, 0, i, 2]

    if confidence > args['confidence']:
        # descale the cordinated
        box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
        
        # destructure the points from array
        (startX, startY, endX, endY) = box.astype("int")

        # Add text and rectangle
        text = "{:.2f}%".format(confidence * 100)
        y = startY - 10 if startY - 10 > 10 else startY + 10
        cv2.rectangle(image, (startX, startY), (endX, endY), (0, 0, 255), 2)
        cv2.putText(image, text, (startX, y),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.45, (0, 0, 255), 2)

Now show the final image

1
2
3
cv2.imshow("Output", image)
cv2.waitKey(0)

Now we go one step further we will feed the input from our webcam to get real time predictions.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
cap = cv2.VideoCapture(0)

while True:

    # Capture frame-by-frame
    ret, frame = cap.read()
    # resize frame for prediction
    image = cv2.resize(frame,(300,300)) 
    
    
    # copy the above script here ......
    
    cv2.namedWindow("frame", cv2.WINDOW_NORMAL)
    cv2.imshow("frame", image)
    if cv2.waitKey(1) >= 0:  # Break with ESC 
        break

This post is licensed under CC BY 4.0 by the author.