Preface
On the train back from Sichuan to Guangdong, I was bored, so the idea came to me to write an OpenCV face-recognition program; that way, when someone else uses my computer, I'll know!
Latest progress: the core functionality is done; what I'm working on now is packaging it up so it actually runs on the computer. That will probably involve compatibility issues across platforms and the problem of sharing the camera with other programs. I might bail on it, or I might keep writing it up, but either way, here is a record for now.
The main text is guided by the Jupyter notebook I wrote, and consists of roughly the following parts:
- Environment setup (Environment)
- Some predefined parameters (Configuration)
- Model training, saving, loading, and prediction
- And the usage of OpenCV
Main text
1. Environment
```python
# env: Python 3.8
# OpenCV package
import cv2
# TensorFlow / Keras for training
from tensorflow.keras.layers import Conv2D, MaxPooling2D, ZeroPadding2D, Reshape, Dense, Dropout, Activation, Flatten
from tensorflow.keras.preprocessing import image
from tensorflow.keras import Sequential, layers
import tensorflow as tf
# sklearn (optional, not used below)
# from sklearn.svm import SVC
# Tools for drawing and arrays
import matplotlib.pyplot as plt
import numpy as np
```
As you can see, the Python version is 3.8. Among the imports: `cv2` is the OpenCV package; `tensorflow` and its built-in `keras` API are the neural-network packages; `matplotlib` and `numpy` are the packages for handling images and arrays, respectively.
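If you want to double-check that your environment matches, a quick version check (not in the original notebook) will do:

```python
import sys
print(sys.version)      # expect 3.8.x
print(cv2.__version__)  # installed OpenCV version
print(tf.__version__)   # installed TensorFlow version
```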
2. Configuration
```python
#####################
# 4. For training
#####################
# Train at all, or just load a saved model?
is_training = True
# Is training from scratch, or a fine-tune?
is_fine_tune = False
fine_tune_start = 550  # index of the first fine-tune sample
# Number of training samples per class (from scratch and fine-tune alike)
data_num = 550
# Number of test samples per class (from scratch and fine-tune alike)
test_data_num = 50
#####################
# 5. For predicting
#####################
# The model load path
model_load_path = 'models/model_before604.h5'
# Collect abnormal samples for later fine-tuning?
is_catching_abnormal = True
```
Configuration defines the various parameters used later. For example, `is_training` says whether to actually train or just load a saved model; `is_fine_tune` says whether, after the base model is trained, to fine-tune on top of it; `data_num` and `test_data_num` are the number of training samples and the number of test samples taken from them. (This could be a percentage instead, but I'm too lazy, so, you know, it is what it is :D.) `is_catching_abnormal` says whether results flagged as abnormal should be collected for analysis and subsequent fine-tuning.
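For reference, the percentage-based split mentioned above really only takes a couple of lines (a sketch; `test_ratio` is a name I made up, not part of the original config):

```python
test_ratio = 0.1                            # hold out 10% of each class for testing
test_data_num = int(data_num * test_ratio)  # 55 instead of the hard-coded 50
```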
3. Model training
3.1 Data Loading
```python
if is_training:
    print('Training...')
    classes = ['others', 'seanzou']
    train_path = '/Users/seanzou/jupyter_notebook'
    sample0_data = []
    sample1_data = []
    sample0_labels = []
    sample1_labels = []
    # Class 0 ("others"): load, normalise to [0, 1], resize to 255x255
    for i in range(data_num):
        img_name = '%s/0_sample/%d.jpg' % (train_path, i)
        try:
            data = np.array(tf.keras.utils.load_img(img_name)) / 255
        except Exception as e:
            print(e)
            print(i)
            continue  # skip images that fail to load
        data = tf.image.resize(data, [255, 255])
        data = np.array(data)
        sample0_data.append(data)
        sample0_labels.append(0)
    # Class 1 ("seanzou"): either the abnormal samples caught at runtime
    # (fine-tune) or the original samples (from scratch)
    if is_fine_tune:
        for i in range(data_num):
            img_name = '%s/running/%d.jpg' % (train_path, fine_tune_start + i)
            try:
                data = np.array(tf.keras.utils.load_img(img_name)) / 255
            except Exception as e:
                print(e)
                print(i)
                continue  # skip images that fail to load
            data = tf.image.resize(data, [255, 255])
            data = np.array(data)
            sample1_data.append(data)
            sample1_labels.append(1)
    else:
        for i in range(data_num):
            img_name = '%s/1_sample/%d.jpg' % (train_path, i)
            data = np.array(tf.keras.utils.load_img(img_name)) / 255
            data = tf.image.resize(data, [255, 255])
            data = np.array(data)
            sample1_data.append(data)
            sample1_labels.append(1)
    is_one_hot = True
    samples_data = sample0_data + sample1_data
    samples_X = np.stack(samples_data, axis=0)
    samples_labels = sample0_labels + sample1_labels
    samples_labels = np.stack(samples_labels, axis=0)
    if is_one_hot:
        # Turn integer labels into one-hot rows, e.g. 1 -> [0, 1]
        samples_y = np.zeros((samples_labels.shape[0], len(classes)))
        for i in range(samples_y.shape[0]):
            samples_y[i][samples_labels[i]] += 1
    else:
        samples_y = samples_labels
    # Separate the dataset: the test set is the last test_data_num samples
    # of class 0 plus the first test_data_num samples of class 1
    low = data_num - test_data_num
    high = data_num + test_data_num
    train_X = np.concatenate((samples_X[:low], samples_X[high:]))
    train_y = np.concatenate((samples_y[:low], samples_y[high:]))
    test_X = samples_X[low:high, ]
    test_y = samples_y[low:high, ]
else:
    print('Not training, just use the model.')
```
Before training, all the image data needs to be loaded. This block shows how to load the images from the different folders into Python memory.
The labels can be in one-hot or plain integer form, but note that the two require different loss functions; choose carefully.
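Concretely, this is how the label format maps to the loss (a sketch of standard tf.keras behaviour, not code from the original notebook); the compile step below hard-codes `categorical_crossentropy`, which matches `is_one_hot = True`:

```python
if is_one_hot:
    loss_fn = 'categorical_crossentropy'         # labels shaped (N, 2), e.g. [0, 1]
else:
    loss_fn = 'sparse_categorical_crossentropy'  # integer labels shaped (N,), e.g. 1
```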
3.2 Model compile and training
```python
if is_training:
    print('Training model.')
    # model
    model = Sequential()
    # input_shape is only needed on the first layer
    model.add(Conv2D(16, (3, 3), name='conv1', padding='same', activation='relu', kernel_initializer='glorot_uniform', input_shape=(255, 255, 3)))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=None, padding='same', data_format=None))
    #
    model.add(Conv2D(32, (3, 3), name='conv2', padding='same', activation='relu', kernel_initializer='glorot_uniform'))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=None, padding='same', data_format=None))
    #
    model.add(Conv2D(64, (3, 3), name='conv3', padding='same', activation='relu', kernel_initializer='glorot_uniform'))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=None, padding='same', data_format=None))
    #
    model.add(Flatten())
    #
    model.add(Dense(512, name='dense1', kernel_initializer='he_normal', activation='relu'))
    # only binary
    model.add(Dense(2, name='dense2', kernel_initializer='he_normal', activation='softmax'))
else:
    print('Not training, just use the model.')
```
With the data loaded, we can build the model. Here it is a simple multi-layer CNN (nothing clever about it; it happened to work well on another dataset of mine, so I reused it).
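Before compiling, it can be worth a quick sanity check of the architecture (not in the original notebook; only works when `is_training` is True, since that is when `model` is built):

```python
model.summary()
# Expect: three conv/pool stages roughly halving the 255x255 input each time,
# then Flatten -> Dense(512) -> Dense(2) with softmax.
```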
Next comes compiling the model:
```python
if is_training:
    model.compile(
        loss='categorical_crossentropy',
        optimizer='adam',
        metrics=['accuracy']
    )
else:
    print('Not training, just use the model.')
```
And finally, training:
```python
if is_training:
    history = model.fit(
        train_X,
        train_y,
        epochs=10,
        verbose=1,
        validation_split=0.2
    )
else:
    print('Not training, just use the model.')
```
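If you want to see how training went, the returned `History` object can be plotted (a sketch; the `'accuracy'` keys assume the metric configured above, and `history` only exists when `is_training` is True):

```python
plt.plot(history.history['loss'], label='train loss')
plt.plot(history.history['val_loss'], label='val loss')
plt.plot(history.history['accuracy'], label='train acc')
plt.plot(history.history['val_accuracy'], label='val acc')
plt.xlabel('epoch')
plt.legend()
plt.show()
```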
4. Model evaluation
4.1 plot_confusion_matrix
```python
def plot_confusion_matrix(cm, title='Confusion matrix', cmap=plt.cm.Blues, labels=[], scale=(1, 1)):
    plt.figure(figsize=scale)
    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    plt.colorbar()
    tick_marks = np.arange(len(labels))
    plt.xticks(tick_marks, labels, rotation=45)
    plt.yticks(tick_marks, labels)
    plt.tight_layout()
    plt.ylabel('True label')
    plt.xlabel('Predicted label')
```
First, grab a confusion-matrix plotting function straight from the Keras website; we'll use it for the evaluation below.
4.2 Evaluating the model
```python
score = model.evaluate(test_X, test_y, verbose=1)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
```
This evaluates the model's loss and accuracy on the test set.
```python
# confusion matrix
conf = np.zeros([len(classes), len(classes)])
confnorm1 = np.zeros([len(classes), len(classes)])
test_pred_labels = model.predict(test_X, batch_size=2)
for i in range(0, test_X.shape[0]):
    j = list(test_y[i, :]).index(1)             # true class (from one-hot)
    k = int(np.argmax(test_pred_labels[i, :]))  # predicted class
    conf[j, k] = conf[j, k] + 1
for i in range(0, len(classes)):
    confnorm1[i, :] = conf[i, :] / np.sum(conf[i, :])
print('Confusion matrix [number]:')
print(conf)
print('Accuracy for each class:')
for i in range(len(classes)):
    print('accuracy of class[%s] :%s' % (classes[i], confnorm1[i, i]))
plot_confusion_matrix(confnorm1, labels=classes, scale=(5, 5))
```
Next, the model's confusion matrix. The code here is for the binary task; for multi-class you would need to adapt it yourself.
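As a cross-check (and one that generalises to multi-class without modification), the same matrix can be computed with scikit-learn, which the imports above already list as an optional dependency; this is a sketch, not part of the original notebook:

```python
from sklearn.metrics import confusion_matrix

true_idx = np.argmax(test_y, axis=1)            # undo the one-hot encoding
pred_idx = np.argmax(test_pred_labels, axis=1)  # predicted class indices
print(confusion_matrix(true_idx, pred_idx))     # should match conf above
```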
```python
if is_training:
    model_save_path = 'models/model_before' + str(fine_tune_start + data_num) + '.h5'
    model.save(model_save_path)
else:
    print('Not training, just use the model.')
```
Finally, if everything looks good, the model can be saved.
5. Model prediction
```python
network = tf.keras.models.load_model(model_load_path)
model = network
```
This loads a saved model with tensorflow-keras's `load_model` function. I got lazy about the rest; prediction is really about the same as evaluation, just with `model.evaluate` replaced by `model.predict`, for example:
```python
train_data_path = '1_sample/'
img1 = cv2.imread(train_data_path + '200.jpg')
# cv2.imread gives BGR; convert to RGB so it matches the training data
# (loaded with tf.keras.utils.load_img) and displays correctly with plt
img1 = cv2.cvtColor(img1, cv2.COLOR_BGR2RGB)
plt.imshow(img1)
img1_after = np.array(img1) / 255
img1_after = tf.image.resize(img1_after, [255, 255])
img1_after = np.array(img1_after)
img1_after = np.stack([img1_after], axis=0)
pred = model.predict(img1_after)
print(pred)
```
That is an example of taking a single image from a folder and running a prediction on it.
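To run the same thing over a whole folder, a small loop will do; this is a sketch (the `glob` pattern and the 16-image cap are my own choices, and `classes` comes from the training cell above):

```python
import glob

paths = sorted(glob.glob(train_data_path + '*.jpg'))[:16]  # first 16 images
batch = []
for p in paths:
    im = cv2.cvtColor(cv2.imread(p), cv2.COLOR_BGR2RGB)    # BGR -> RGB
    im = np.array(tf.image.resize(np.array(im) / 255, [255, 255]))
    batch.append(im)
preds = model.predict(np.stack(batch, axis=0))
for p, pr in zip(paths, preds):
    print(p, classes[int(np.argmax(pr))])
```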
6. OpenCV capture and model deployment
```python
cap = cv2.VideoCapture(0)
num = 605
path_name = '/Users/seanzou/jupyter_notebook/running'
# Load the Haar cascade once, outside the capture loop
face_detect = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")
while True:
    ret, frame = cap.read()
    # cv2.imshow('frame', frame)
    img = frame
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # face = face_detect.detectMultiScale(gray)  # unconstrained variant
    # Constrain detections: minimum 300x300, maximum 700x700
    face = face_detect.detectMultiScale(gray, minSize=(300, 300), maxSize=(700, 700))
    for x, y, w, h in face:
        try:
            # Crop the face with a 10-pixel margin
            img1 = img[y - 10: y + h + 10, x - 10: x + w + 10]
            # The frame is BGR; convert to RGB to match the training data
            img1_after = np.array(cv2.cvtColor(img1, cv2.COLOR_BGR2RGB)) / 255
            img1_after = tf.image.resize(img1_after, [255, 255])
            img1_after = np.array([img1_after])
            pred = model.predict(img1_after)
            # print(pred)
            if pred[0][1] > 0.5:
                indicator = "Safe"
                color = (0, 255, 0)
            else:
                indicator = "Not Safe"
                color = (0, 0, 255)
                if is_catching_abnormal:
                    # Save the abnormal face crop for later fine-tuning
                    img_name = '%s/%d.jpg' % (path_name, num)
                    # plt.imshow(img)
                    cv2.imwrite(img_name, img1)
                    num += 1
                    print('Abnormal sample saving in ' + img_name)
                else:
                    print('Detect the abnormal sample, but not catching!')
            # Output: label and box the face in the preview window
            font = cv2.FONT_HERSHEY_COMPLEX
            cv2.putText(img, indicator, (int(x), int(y + h / 2)), font, 5, color, 3)
            cv2.rectangle(img, (x, y), (x + w, y + h), color=color, thickness=2)
        except Exception as e:
            print(e)
    cv2.imshow("result", img)
    if cv2.waitKey(1) == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
# One extra waitKey fixes the window hanging after cv2.destroyAllWindows()
# ref: https://blog.csdn.net/qq_35164554/article/details/120061896
cv2.waitKey(1)
```
This shows how to use OpenCV to grab frames from the camera and, when fine-tuning is wanted, how to save the images detected as abnormal.
One thing to watch out for: if you simply end with a snippet like

```python
cap.release()
cv2.destroyAllWindows()
```

the video window may hang and fail to close properly; you usually need to append one more `cv2.waitKey(1)` for it to close cleanly.
7. Result
The results are actually pretty good: it correctly distinguishes me from people who are not me. :D
For privacy reasons I won't post the results here, hehe!
Summary
Purely for fun; I hope you enjoyed it. I threw this together out of boredom on the train, haha :D
References
[1] One line of code to fix the window freezing after cv2.destroyAllWindows(). https://blog.csdn.net/qq_35164554/article/details/120061896