一尘不染

Unable to predict the label of a single image with VGG19 in Keras

python

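Following this tutorial, I am using transfer learning with a pre-trained VGG19 model in Keras. The tutorial shows how to train the model, but not how to prepare a test image for prediction.

The comments section says:

Take the image, preprocess it with the same preprocess_image function, and then call model.predict(image). This gives you the model's prediction for that image. With argmax(prediction) you can find the class the image belongs to.

I cannot find a function named preprocess_image anywhere in the code. I did some searching and considered using the method proposed in this tutorial.

But that gives an error saying: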

decode_predictions expects a batch of predictions (i.e. a 2D array of shape (samples, 1000)). Found array with shape: (1, 12)

My dataset has 12 classes. Here is the complete code that trains the model, followed by the session that produces the error:

import pandas as pd
import numpy as np
import os
import keras
import matplotlib.pyplot as plt

from keras.layers import Dense, GlobalAveragePooling2D
from keras.applications.vgg19 import VGG19
from keras.preprocessing import image
from keras.applications.vgg19 import preprocess_input
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Model
from keras.optimizers import Adam

base_model = VGG19(weights='imagenet', include_top=False)

x=base_model.output
x=GlobalAveragePooling2D()(x)
x=Dense(1024,activation='relu')(x)
x=Dense(1024,activation='relu')(x)
x=Dense(512,activation='relu')(x)

preds=Dense(12,activation='softmax')(x)
model=Model(inputs=base_model.input,outputs=preds)

# view the layer architecture
# for i,layer in enumerate(model.layers):
#   print(i,layer.name)

for layer in model.layers:
    layer.trainable=False

for layer in model.layers[:20]:
    layer.trainable=False

for layer in model.layers[20:]:
    layer.trainable=True

train_datagen=ImageDataGenerator(preprocessing_function=preprocess_input)

train_generator=train_datagen.flow_from_directory('dataset',
                    target_size=(96,96), # 224, 224
                    color_mode='rgb',
                    batch_size=64,
                    class_mode='categorical',
                    shuffle=True)

model.compile(optimizer='Adam',loss='categorical_crossentropy',metrics=['accuracy'])

step_size_train=train_generator.n//train_generator.batch_size

model.fit_generator(generator=train_generator,
    steps_per_epoch=step_size_train,
    epochs=5)

# model.predict(new_image)

IPython:

In [3]: import classify_tl                                                                                                                                                   
Found 4750 images belonging to 12 classes.
Epoch 1/5
74/74 [==============================] - 583s 8s/step - loss: 2.0113 - acc: 0.4557
Epoch 2/5
74/74 [==============================] - 576s 8s/step - loss: 0.8222 - acc: 0.7170
Epoch 3/5
74/74 [==============================] - 563s 8s/step - loss: 0.5875 - acc: 0.7929
Epoch 4/5
74/74 [==============================] - 585s 8s/step - loss: 0.3897 - acc: 0.8627
Epoch 5/5
74/74 [==============================] - 610s 8s/step - loss: 0.2689 - acc: 0.9071

In [6]: model = classify_tl.model

In [7]: print(model)                                                                                                                                                         
<keras.engine.training.Model object at 0x7fb3ad988518>

In [8]: from keras.preprocessing.image import load_img

In [9]: image = load_img('examples/0021e90e4.png', target_size=(96,96))

In [10]: from keras.preprocessing.image import img_to_array

In [11]: image = img_to_array(image)

In [12]: image = image.reshape((1, image.shape[0], image.shape[1], image.shape[2]))

In [13]: from keras.applications.vgg19 import preprocess_input

In [14]: image = preprocess_input(image)

In [15]: yhat = model.predict(image)

In [16]: print(yhat)                                                                                                                                                         
[[1.3975363e-06 3.1069856e-05 9.9680350e-05 1.7175063e-03 6.2767825e-08
  2.6133494e-03 7.2859187e-08 6.0187017e-07 2.0794137e-06 1.3714411e-03
  9.9416250e-01 2.6067207e-07]]

In [17]: from keras.applications.vgg19 import decode_predictions

In [18]: label = decode_predictions(yhat)

The last line of the IPython session raises the following error:

ValueError: `decode_predictions` expects a batch of predictions (i.e. a 2D array of shape (samples, 1000)). Found array with shape: (1, 12)

How should I correctly feed in a test image and get a prediction?


2021-01-20

1 answer

一尘不染

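decode_predictions decodes a model's predictions into class labels for the 1000 classes of the ImageNet dataset. Your fine-tuned model, however, has only 12 classes, so using decode_predictions here makes no sense. You must, of course, know what the labels of those 12 classes are. So simply take the index of the highest score in the prediction and look up its label: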

import numpy as np

# create a list containing the class labels, ordered by the model's output index
class_labels = ['class1', 'class2', 'class3', ...., 'class12']

# find the index of the class with maximum score
# (yhat is the output of model.predict(image) from the question)
pred = np.argmax(yhat, axis=-1)

# print the label of the class with maximum score
print(class_labels[pred[0]])
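If the label order is not known offhand, it can usually be recovered from the training generator: the iterator returned by flow_from_directory exposes a class_indices dictionary that maps each class name to its output index. Below is a minimal sketch of a small replacement for decode_predictions built on such a dictionary. The helper name decode_custom_predictions and the 'class1' ... 'class12' names are hypothetical placeholders, not part of Keras or of the question's dataset; the scores are the ones printed in the question.

```python
import numpy as np


def decode_custom_predictions(preds, class_indices, top=3):
    """Map each row of softmax scores to its `top` (label, score) pairs.

    preds: array of shape (samples, num_classes), e.g. from model.predict.
    class_indices: dict of label -> column index, in the form that
        train_generator.class_indices provides.
    """
    preds = np.asarray(preds)
    if preds.ndim != 2 or preds.shape[1] != len(class_indices):
        raise ValueError(
            "expected shape (samples, %d), got %s"
            % (len(class_indices), preds.shape)
        )
    # Invert the mapping so a column index looks up its label.
    index_to_label = {index: label for label, index in class_indices.items()}
    results = []
    for row in preds:
        # Indices of the highest scores, best first.
        best = np.argsort(row)[::-1][:top]
        results.append([(index_to_label[int(i)], float(row[i])) for i in best])
    return results


# Hypothetical stand-in for train_generator.class_indices (12 classes).
class_indices = {"class%d" % (i + 1): i for i in range(12)}

# The prediction printed in the question.
yhat = np.array([[1.3975363e-06, 3.1069856e-05, 9.9680350e-05, 1.7175063e-03,
                  6.2767825e-08, 2.6133494e-03, 7.2859187e-08, 6.0187017e-07,
                  2.0794137e-06, 1.3714411e-03, 9.9416250e-01, 2.6067207e-07]])

top3 = decode_custom_predictions(yhat, class_indices, top=3)
print(top3[0][0][0])  # label of the highest-scoring class
```

In the real session you would pass classify_tl.train_generator.class_indices instead of the placeholder dictionary, so the labels come from the same directory scan that defined the model's 12 outputs.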
2021-01-20