Deep Learning with Python: Feature Extraction with Data Augmentation and a Pretrained Network

This content is licensed under CC 4.0 BY-SA.

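This post shows how to use the Keras library in Python to extract features with the VGG16 convolutional network and combine that with data augmentation to improve model performance. It loads the pretrained VGG16 weights, augments the image data with rotations, shifts, and similar transformations, and extracts features from the augmented images. The features are then flattened and fed to a simple fully connected network for classification. Finally, the model is trained and its accuracy and loss are evaluated on the validation set.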

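The second feature-extraction example in Section 5.3.1 (Chapter 5) of François Chollet's Deep Learning with Python performs feature extraction with data augmentation. As I read it, the author probably meant augmenting the data first and then extracting features with the VGG16 convolutional base. Following that idea, I modified the book's source code for training a model with a frozen convolutional base. Without further ado, here is the code.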

Import the necessary packages

import os  # needed for os.path.join below

import numpy as np
from keras import layers
from keras import models
from keras import optimizers
from keras.applications import VGG16
from keras.preprocessing.image import ImageDataGenerator
from matplotlib import pyplot as plt

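Define the directory paths (adjust to where your data is stored)

# VGG16 convolutional base pretrained on ImageNet, without the dense classifier on top
conv_base = VGG16(weights='imagenet', include_top=False, input_shape=(150, 150, 3))
base_dir = 'E:/pydata/cats_and_dogs_small'
train_dir = os.path.join(base_dir, 'train')
test_dir = os.path.join(base_dir, 'test')
validation_dir = os.path.join(base_dir, 'validation')
# Validation and test images are only rescaled, never augmented
datagen = ImageDataGenerator(rescale=1./255)
batch_size = 20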
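The feature arrays later in this post have shape (4, 4, 512). The sketch below shows where those numbers come from, assuming the standard VGG16 layout: five convolutional blocks, each ending in a 2x2 max-pooling layer with stride 2 that floors odd sizes, and 512 channels in the last block. The helper name `vgg16_feature_shape` is made up for this illustration and is not part of Keras.

```python
def vgg16_feature_shape(height, width, n_pool_blocks=5, last_block_channels=512):
    """Output shape of the convolutional base for a given input size.

    Assumes the convolutions use 'same' padding (size unchanged) and that
    each block ends in a 2x2 max pooling with stride 2 (size floor-halved).
    """
    for _ in range(n_pool_blocks):
        height //= 2
        width //= 2
    return (height, width, last_block_channels)


# 150 -> 75 -> 37 -> 18 -> 9 -> 4
feature_shape = vgg16_feature_shape(150, 150)
flat_size = feature_shape[0] * feature_shape[1] * feature_shape[2]
print(feature_shape, flat_size)
```

If you change `input_shape`, the feature array shape and the flattened size used further down would have to change accordingly.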

Image data augmentation

# Random rotations, shifts, shears, zooms and flips are applied to training images only
train_datagen = ImageDataGenerator(
    rescale=1./255, rotation_range=40, width_shift_range=0.2, height_shift_range=0.2,
    shear_range=0.2, zoom_range=0.2, horizontal_flip=True, fill_mode='nearest'
)
train_generator = train_datagen.flow_from_directory(
    train_dir, target_size=(150, 150), batch_size=batch_size, class_mode='binary'
)
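To make two of these options concrete, here is a toy NumPy illustration of a horizontal flip and of a shift whose gap is filled with the nearest edge pixels, which is the idea behind `horizontal_flip=True` and `fill_mode='nearest'`. This is not how Keras implements augmentation internally, and the function names are made up for the example. The real generator also picks a random transformation for each image.

```python
import numpy as np


def horizontal_flip(image):
    """Mirror an (H, W) or (H, W, C) image left to right."""
    return image[:, ::-1]


def shift_right_nearest(image, pixels):
    """Shift an image right by `pixels` columns.

    The columns that open up on the left are filled by repeating the
    nearest edge column, mimicking a 'nearest' fill.
    """
    if pixels == 0:
        return image.copy()
    pad_width = [(0, 0), (pixels, 0)] + [(0, 0)] * (image.ndim - 2)
    padded = np.pad(image, pad_width, mode='edge')
    return padded[:, :image.shape[1]]


toy = np.array([[1, 2, 3],
                [4, 5, 6]])
flipped = horizontal_flip(toy)
shifted = shift_right_nearest(toy, 1)
print(flipped)
print(shifted)
```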

Extract features

def extract_features(directory, sample_count):
    features = np.zeros(shape=(sample_count, 4, 4, 512))
    labels = np.zeros(sample_count)
    if directory == train_dir:
        # Training images come from the augmenting generator
        generator = train_generator
    else:
        generator = datagen.flow_from_directory(
            directory, target_size=(150, 150), batch_size=batch_size, class_mode='binary'
        )
    i = 0
    for inputs_batch, labels_batch in generator:
        features_batch = conv_base.predict(inputs_batch)
        features[i * batch_size:(i + 1) * batch_size] = features_batch
        labels[i * batch_size:(i + 1) * batch_size] = labels_batch
        i += 1
        # The generator yields batches indefinitely, so stop once enough samples are collected
        if i * batch_size >= sample_count:
            break
    return features, labels

train_features, train_labels = extract_features(train_dir, 2000)
test_features, test_labels = extract_features(test_dir, 1000)
validation_features, validation_labels = extract_features(validation_dir, 1000)
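The slicing and the `break` in the extraction loop can be checked without Keras. The sketch below replaces the directory iterator with an endlessly cycling toy generator and replaces `conv_base.predict` with a stand-in that returns one number per image. `endless_batches`, `fake_conv_base`, and `collect_features` are hypothetical names used only for this illustration.

```python
import itertools

import numpy as np


def endless_batches(images, labels, batch_size):
    """Stand-in for a Keras directory iterator: cycles over the data forever."""
    n_batches = len(images) // batch_size
    for b in itertools.cycle(range(n_batches)):
        window = slice(b * batch_size, (b + 1) * batch_size)
        yield images[window], labels[window]


def fake_conv_base(batch):
    """Stand-in for conv_base.predict: one 'feature' per image (its mean)."""
    return batch.reshape(len(batch), -1).mean(axis=1, keepdims=True)


def collect_features(generator, sample_count, batch_size):
    features = np.zeros((sample_count, 1))
    labels = np.zeros(sample_count)
    i = 0
    for inputs_batch, labels_batch in generator:
        features[i * batch_size:(i + 1) * batch_size] = fake_conv_base(inputs_batch)
        labels[i * batch_size:(i + 1) * batch_size] = labels_batch
        i += 1
        if i * batch_size >= sample_count:  # without this the loop never ends
            break
    return features, labels


images = np.arange(6 * 2 * 2, dtype=float).reshape(6, 2, 2)  # six 2x2 toy "images"
labels = np.array([0, 1, 0, 1, 0, 1])
feats, labs = collect_features(endless_batches(images, labels, 2),
                               sample_count=6, batch_size=2)
print(feats.ravel(), labs)
```

Because the generator cycles, a `sample_count` larger than the dataset simply takes another pass over it. With an augmenting generator such as `train_generator`, a second pass would presumably yield differently transformed copies of the same images, so requesting more than 2000 training samples is one way to get more out of the augmentation. The reshape sizes further down would then need to match the new count.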

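Flatten the features so they can be fed into a densely connected classifier

# Each sample goes from shape (4, 4, 512) to a vector of 4 * 4 * 512 = 8192 values
train_features = np.reshape(train_features, (2000, 4 * 4 * 512))
test_features = np.reshape(test_features, (1000, 4 * 4 * 512))
validation_features = np.reshape(validation_features, (1000, 4 * 4 * 512))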
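As a quick sanity check of the flattening step, the sketch below uses a small dummy array (three samples instead of thousands) to confirm that each row of the reshaped array is just that sample's features laid out in order, and that passing `-1` lets NumPy infer the 8192 on its own. The array contents are placeholders, not real VGG16 features.

```python
import numpy as np

n_samples = 3
# Placeholder values standing in for extracted features of shape (n, 4, 4, 512)
features = np.arange(n_samples * 4 * 4 * 512, dtype=float).reshape(n_samples, 4, 4, 512)

flat = np.reshape(features, (n_samples, 4 * 4 * 512))
flat_inferred = features.reshape(n_samples, -1)  # -1 lets NumPy compute 8192

print(flat.shape)
```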

Define the model

model = models.Sequential()
# conv_base is not added to this model: the inputs are features it has already produced,
# so its weights are never updated and only the classifier below is trained
model.add(layers.Dense(256, activation='relu', input_dim=4 * 4 * 512))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(1, activation='sigmoid'))
# np.exp(-5) is about 6.7e-3 (the book's listing uses 2e-5);
# recent Keras versions name this argument learning_rate instead of lr
model.compile(optimizer=optimizers.RMSprop(lr=np.exp(-5)), loss='binary_crossentropy', metrics=['acc'])
history = model.fit(
    train_features, train_labels, epochs=30, batch_size=20,
    validation_data=(validation_features, validation_labels)
)
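To get a feel for the size of the classifier, the trainable parameters of the two Dense layers above can be counted by hand: each unit has one weight per input plus one bias, and Dropout adds no parameters. The helper `dense_params` is a made-up name for this illustration.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units


hidden = dense_params(4 * 4 * 512, 256)  # Dense(256) on the 8192 flattened features
output = dense_params(256, 1)            # Dense(1) with sigmoid
total = hidden + output
print(hidden, output, total)
```

Almost all of the parameters sit in the first Dense layer, which is why Dropout is placed right after it.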

Plotting

acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs = range(1, len(acc) + 1)

# Accuracy curves: dots for training, solid line for validation
plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and Validation Acc')
plt.legend()
plt.figure()

# Loss curves in a second figure
plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and Validation Loss')
plt.legend()
plt.show()
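Besides reading the curves, it can help to pull the best validation epoch out of the history dictionary directly. The sketch below assumes the metric keys used in this post (`'acc'`, `'val_acc'`); newer Keras versions may report `'accuracy'` and `'val_accuracy'` instead. The numbers in `toy_history` are invented purely to exercise the function and are not results from this model.

```python
def best_epoch(history_dict, metric='val_acc'):
    """Return (1-based epoch, value) at which `metric` is highest."""
    values = history_dict[metric]
    index = max(range(len(values)), key=values.__getitem__)
    return index + 1, values[index]


# Invented placeholder values, not measured results
toy_history = {'acc': [0.70, 0.80, 0.85], 'val_acc': [0.72, 0.81, 0.79]}
epoch, value = best_epoch(toy_history)
print(epoch, value)
```

With a real run, `best_epoch(history.history)` would be the call to make.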

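Much of the code keeps the style of the original book, and the approach combines the advantages of VGG16 feature extraction and image augmentation.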
