Preface

On the train back from Sichuan to Guangdong, I was bored and hit on the idea of writing an OpenCV face recognition program, so that whenever someone else uses my computer, I will know!

Latest progress: the core functionality is done; what's left is packaging it up so it actually runs on my computer. Along the way I will probably run into cross-platform compatibility issues and the problem of sharing the camera between processes. I might drop the project, or I might keep documenting it, but either way, let me get this written down first.

The main text is guided by the Jupyter notebook I already wrote, and has roughly the following parts:

  1. Environment setup (Environment)
  2. Some predefined parameters (Configuration)
  3. Model training, saving, loading, and prediction
  4. And the use of OpenCV

Main Text

1. Environment

# env: Python38
# OpenCV packages
import cv2

# Tensorflow-keras training
from tensorflow.keras.layers import Conv2D, MaxPooling2D, ZeroPadding2D, Reshape, Dense, Dropout, Activation, Flatten
from tensorflow.keras.preprocessing import image
from tensorflow.keras import Sequential, layers
import tensorflow as tf
import keras

# sklearn
# from sklearn.svm import SVC

# Tools for drawing and array
import matplotlib.pyplot as plt
import numpy as np

We can see that the Python version is 3.8, where (a quick way to verify the installed versions is sketched after this list):

  1. cv2 is the OpenCV package
  2. tensorflow and keras are the neural-network packages
  3. matplotlib and numpy are the packages for handling images and arrays, respectively
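
If you want to double-check your own setup before running anything, a quick version check might look like this (the versions in the comments are just what I happen to use):

# Print the versions actually installed in the current environment
import sys
import cv2
import tensorflow as tf

print('Python:', sys.version.split()[0])   # expect 3.8.x
print('OpenCV:', cv2.__version__)
print('TensorFlow:', tf.__version__)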

2. Configuration

#####################
# 4. For training
#####################
# Whether to train at all, or just load a saved model
is_training = True

# Is training from scratch or fine-tuning?
fine_tune_start = 550 # index of the first fine-tune image
is_fine_tune = False

# Number of training samples per class (from-scratch and fine-tune alike)
data_num = 550

# Number of test samples per class (from-scratch and fine-tune alike)
test_data_num = 50

#####################
# 5. For predicting
#####################
# The model load path
model_load_path = 'models/model_before604.h5'

# Whether to save frames flagged as abnormal (for later fine-tuning)
is_catching_abnormal = True

The Configuration section defines all the parameters used later on. For example, is_training indicates whether to train at all or simply load a model; is_fine_tune indicates whether, after the pre-trained model is done, it should be fine-tuned further; data_num and test_data_num are the number of training samples and the number of test samples taken from them, respectively. This could really be done with a percentage instead (a sketch of that is given right below), but I was too lazy, so, you know, it is what it is :D. Finally, is_catching_abnormal indicates whether to check the results for anomalies and save the abnormal frames for analysis and later fine-tuning.
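
For reference, the percentage version only takes a couple of lines with scikit-learn's train_test_split (a sketch, assuming the samples_X and samples_y arrays built in the next section):

# Sketch: split by percentage instead of a fixed test_data_num
# (assumes samples_X / samples_y from section 3.1 already exist)
from sklearn.model_selection import train_test_split

train_X, test_X, train_y, test_y = train_test_split(
    samples_X, samples_y, test_size=0.1, shuffle=True, random_state=42)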

3. Model training

3.1 Data Loading

if is_training:
    print('Training...')
    classes = ['others','seanzou']

    train_path = '/Users/seanzou/jupyter_notebook'
    sample0_data = []
    sample1_data = []
    sample0_labels = []
    sample1_labels = []

    for i in range(data_num):
        img_name = '%s/0_sample/%d.jpg' % (train_path, i)
        try:
            data = np.array(tf.keras.utils.load_img(img_name)) / 255
        except Exception as e:
            print(e)
            print(i)
            continue  # skip images that fail to load
        data = tf.image.resize(data, [255, 255])
        data = np.array(data)
        sample0_data.append(data)
        sample0_labels.append(0)

    if is_fine_tune:
        for i in range(data_num):
            img_name = '%s/running/%d.jpg' % (train_path, fine_tune_start+i)
            try:
                data = np.array(tf.keras.utils.load_img(img_name)) / 255
            except Exception as e:
                print(e)
                print(i)
                continue  # skip images that fail to load
            data = tf.image.resize(data, [255, 255])
            data = np.array(data)
            sample1_data.append(data)
            sample1_labels.append(1)
    else:
        for i in range(data_num):
            img_name = '%s/1_sample/%d.jpg' % (train_path, i)
            data = np.array(tf.keras.utils.load_img(img_name)) / 255
            data = tf.image.resize(data, [255, 255])
            data = np.array(data)
            sample1_data.append(data)
            sample1_labels.append(1)


    is_one_hot = True

    samples_data = sample0_data + sample1_data
    samples_X = np.stack(samples_data, axis=0)

    samples_labels = sample0_labels + sample1_labels
    samples_labels = np.stack(samples_labels, axis=0)

    if is_one_hot:
        samples_y = np.zeros((samples_labels.shape[0],len(classes)))
        for i in range(samples_y.shape[0]):
            samples_y[i][samples_labels[i]] = 1  # set the one-hot position
    else:
        samples_y = samples_labels
        
    # Separate the dataset: the test set is the last test_data_num images
    # of class 0 plus the first test_data_num images of class 1
    low = data_num - test_data_num
    high = data_num + test_data_num

    train_X = np.concatenate((samples_X[:low],samples_X[high:]))
    train_y = np.concatenate((samples_y[:low],samples_y[high:]))
    test_X = samples_X[low:high,]
    test_y = samples_y[low:high,]

else:
    print('Not training, just use the model.')

First, before training, all the image data has to be loaded. The block above shows how to load the images from the different folders and put them into Python's memory.

Here we can use either a one-hot or a non-one-hot dataset. Note, however, that one-hot and non-one-hot datasets use different loss functions, so think carefully when choosing.
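
Concretely, one-hot labels of shape (N, num_classes) pair with categorical_crossentropy, while plain integer labels of shape (N,) need sparse_categorical_crossentropy:

# is_one_hot = True: targets look like [0, 1] per sample
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# is_one_hot = False: targets are plain integers like 0 or 1
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])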

3.2 Model compilation and training

if is_training:
    print('Training model.')
    # model
    model = Sequential()
    #
    model.add(Conv2D(16, (3, 3), name='conv1', padding='same', activation='relu', kernel_initializer='glorot_uniform', input_shape=(255,255,3)))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=None, padding='same', data_format=None))
    #
    model.add(Conv2D(32, (3, 3), name='conv2', padding='same', activation='relu', kernel_initializer='glorot_uniform'))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=None, padding='same', data_format=None))
    #
    model.add(Conv2D(64, (3, 3), name='conv3', padding='same', activation='relu', kernel_initializer='glorot_uniform'))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=None, padding='same', data_format=None))
    #
    model.add(Flatten())
    #
    model.add(Dense(512, name='dense1', kernel_initializer='he_normal', activation='relu'))
    # only binary
    model.add(Dense(2, name='dense2', kernel_initializer='he_normal', activation='softmax'))

else:
    print('Not training, just use the model.')

Once the data is loaded, we can start building our model. Here it is a multi-layer CNN (a simply constructed model with no deep thought behind it; it worked well on one of my other datasets, so I just brought it over).
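
To sanity-check the architecture, model.summary() prints each layer's output shape and parameter count:

if is_training:
    model.summary()  # lists conv1/conv2/conv3, the poolings, dense1 and dense2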

Next comes compiling the model:

if is_training:
    # categorical_crossentropy matches the one-hot labels built above
    model.compile(
        loss='categorical_crossentropy',
        optimizer='adam',
        metrics=['accuracy']
    )

else:
    print('Not training, just use the model.')

Finally, training the model:

if is_training:
    history = model.fit(
        train_X,
        train_y,
        epochs=10,
        verbose=1,
        validation_split=0.2
    )

else:
    print('Not training, just use the model.')

4. Model evaluation

4.1 plot_confusion_matrix

def plot_confusion_matrix(cm, title='Confusion matrix', cmap=plt.cm.Blues, labels=[], scale=(1,1)):
    plt.figure(figsize=scale)
    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    plt.colorbar()
    tick_marks = np.arange(len(labels))
    plt.xticks(tick_marks, labels, rotation=45)
    plt.yticks(tick_marks, labels)
    
    plt.tight_layout()
    plt.ylabel('True label')
    plt.xlabel('Predicted label')

First, grab a confusion matrix plotting function straight from the Keras website; it will be needed for the evaluation below.

4.2 Evaluating the model

score = model.evaluate(test_X, test_y, verbose=1)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

This block evaluates the model's loss and accuracy.

# confusion matrix
conf = np.zeros([len(classes),len(classes)])
confnorm1 = np.zeros([len(classes),len(classes)])

test_pred_labels = model.predict(test_X, batch_size=2)

for i in range(0,test_X.shape[0]):
    j = list(test_y[i,:]).index(1)
    k = int(np.argmax(test_pred_labels[i,:]))
    conf[j,k] = conf[j,k] + 1
    
for i in range(0,len(classes)):
    confnorm1[i,:] = conf[i,:] / np.sum(conf[i,:])

print('Confusion matrix [number]:')
print(conf)

print('Accuracy for each class:')
for i in range(len(classes)):
    print('accuracy of class[%s] :%s' % (classes[i], confnorm1[i,i]))
    
plot_confusion_matrix(confnorm1, labels=classes, scale=(5,5))

Then the model's confusion matrix. This one is written for a binary classification task; for multi-class tasks you will need to adapt it yourself.

if is_training:
    model_save_path = 'models/model_before' + str(fine_tune_start + data_num) +  '.h5'
    model.save(model_save_path)
else:
    print('Not training, just use the model.')

Finally, if everything looks fine, the model can be saved.

5. Model prediction

model = tf.keras.models.load_model(model_load_path)

This is how to load a model with tensorflow-keras using load_model. I got lazy about the rest; prediction is really much the same as evaluation, just with model.evaluate swapped for model.predict, for example:

train_data_path = '1_sample/'
img1 = cv2.imread(train_data_path + '200.jpg')
# cv2.imread returns BGR, but the model was trained on RGB images
# (tf.keras.utils.load_img), so convert before displaying and predicting
img1 = cv2.cvtColor(img1, cv2.COLOR_BGR2RGB)
plt.imshow(img1)
img1_after = np.array(img1) / 255
img1_after = tf.image.resize(img1_after, [255, 255])
img1_after = np.array(img1_after)
img1_after = np.stack([img1_after], axis=0)
pred = model.predict(img1_after)
print(pred)

That is an example of taking a single image from a folder and running a prediction on it.
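
To turn the raw softmax vector into something readable, it can be mapped back through the classes list from section 3.1 (assuming it is still in scope):

# Map the softmax output to a class name plus a confidence score
pred_class = classes[int(np.argmax(pred[0]))]
confidence = float(np.max(pred[0]))
print('Predicted: %s (%.2f)' % (pred_class, confidence))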

6. OpenCV capture and model integration

cap = cv2.VideoCapture(0)
num = 605
path_name = '/Users/seanzou/jupyter_notebook/running'
# Load the Haar cascade once, outside the loop
face_detect = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")
while True:
    ret, frame = cap.read()
    # cv2.imshow('frame', frame)
    img = frame
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # face = face_detect.detectMultiScale(gray)
    # constrain detections: minimum 300x300, maximum 700x700
    face = face_detect.detectMultiScale(gray, minSize=(300, 300), maxSize=(700, 700))
    if len(face) > 0:
        for x, y, w, h in face:
            try:
                img1 = img[y - 10: y + h + 10, x - 10: x + w + 10]
                # the frame is BGR, but the model was trained on RGB images
                img1_after = np.array(cv2.cvtColor(img1, cv2.COLOR_BGR2RGB)) / 255
                img1_after = tf.image.resize(img1_after, [255, 255])
                img1_after = np.array([img1_after])
                pred = model.predict(img1_after)
                # print(pred)
                if pred[0][1] > 0.5:
                    indicator = "Safe"
                    color = (0, 255, 0)
                else:
                    indicator = "Not Safe"
                    color = (0, 0, 255)
                    if is_catching_abnormal:
                        img_name = '%s/%d.jpg' % (path_name, num)
                        # plt.imshow(img)
                        cv2.imwrite(img_name, img1)  # imwrite expects BGR
                        num += 1
                        print('Abnormal sample saving in ' + img_name)
                    else:
                        print('Detect the abnormal sample, but not catching!')

                # Output
                font = cv2.FONT_HERSHEY_COMPLEX
                cv2.putText(img, indicator, (int(x), int(y + h / 2)), font, 5, color, 3)
                cv2.rectangle(img, (x, y), (x + w, y + h), color=color, thickness=2)
            except Exception as e:
                print(e)

    cv2.imshow("result", img)

    if cv2.waitKey(1) == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
# One extra waitKey fixes the window freezing after cv2.destroyAllWindows()
# ref: https://blog.csdn.net/qq_35164554/article/details/120061896
cv2.waitKey(1)

This shows how to use OpenCV to grab frames from the camera, and, if fine-tuning is needed, how to save the frames detected as abnormal.

One thing to watch out for: if you simply run

cap.release()
cv2.destroyAllWindows()

as a Python snippet, the OpenCV video window may freeze and fail to close properly; you usually need to add one more cv2.waitKey(1) afterwards for it to close normally.
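
If this cleanup happens in more than one place, a tiny helper avoids forgetting the extra call (just a convenience sketch; close_capture is my own name for it):

def close_capture(cap):
    # Release the camera, destroy all HighGUI windows, and give the event
    # loop one extra tick so the window actually disappears (reference [1])
    cap.release()
    cv2.destroyAllWindows()
    cv2.waitKey(1)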

7. Results

The results are actually pretty good: it can correctly tell me apart from not-me. :D

For privacy reasons I won't post the results here, hehe!

Summary

Purely for fun; I hope you enjoyed reading it. I scribbled this together out of boredom on the train, haha :D

References

[1] 一行代码完美解决cv2.destroyAllWindows()运行后窗口卡死问题 (One line of code to fix the window freeze after cv2.destroyAllWindows(); in Chinese), https://blog.csdn.net/qq_35164554/article/details/120061896
