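A Practical Guide to Object Detection Projects with TensorFlow

1. Data Annotation and Preparation

Before training, annotate your own dataset and organize the files. Store the images in the JPEGImages directory and the XML annotation files in the Annotations directory. Make sure the directory structure looks like this: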

VOC2007/
├── Annotations/
│   └── *.xml
├── ImageSets/
│   └── Main/
│       ├── train.txt
│       ├── val.txt
│       ├── test.txt
│       └── trainval.txt
└── JPEGImages/
    └── *.jpg
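
Before moving on, it can help to confirm that every image has a matching annotation file. The following is a minimal sketch, not part of the original workflow: the helper name find_unpaired is hypothetical, and it assumes images end in .jpg and annotations in .xml, with matching base names.

```python
import os


def find_unpaired(images_dir, annotations_dir):
    """Return (images without an XML file, XML files without an image).

    Both results are sorted lists of base names (file names without extension).
    """
    image_stems = {os.path.splitext(name)[0]
                   for name in os.listdir(images_dir)
                   if name.lower().endswith('.jpg')}
    xml_stems = {os.path.splitext(name)[0]
                 for name in os.listdir(annotations_dir)
                 if name.lower().endswith('.xml')}
    return sorted(image_stems - xml_stems), sorted(xml_stems - image_stems)
```

Calling find_unpaired('JPEGImages', 'Annotations') from inside VOC2007 should return two empty lists when the dataset is consistent.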

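2. Splitting into Training and Test Sets

Use a script to split the dataset proportionally into training, validation, and test sets. An example split script: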

import os
import random

trainval_percent = 0.66  # share of all samples used for training + validation
train_percent = 0.5      # share of trainval used for training

xmlfilepath = 'Annotations'
total_xml = os.listdir(xmlfilepath)
num = len(total_xml)

# Randomly split the dataset
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(list(range(num)), tv)
train = random.sample(trainval, tr)

# Create and write the split files
with open('ImageSets/Main/trainval.txt', 'w') as ftrainval, \
        open('ImageSets/Main/train.txt', 'w') as ftrain, \
        open('ImageSets/Main/val.txt', 'w') as fval, \
        open('ImageSets/Main/test.txt', 'w') as ftest:
    for i in range(num):
        filename = total_xml[i][:-4] + '\n'  # strip the '.xml' extension
        if i in trainval:
            ftrainval.write(filename)
            if i in train:
                ftrain.write(filename)
            else:
                fval.write(filename)
        else:
            ftest.write(filename)

After it runs, train.txt, val.txt, test.txt, and trainval.txt are generated under ImageSets/Main/.
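
To see what these two ratios mean in practice, the arithmetic can be reproduced on its own. This is a small sketch using the same formulas as the split script; the function name split_sizes is hypothetical.

```python
def split_sizes(num, trainval_percent=0.66, train_percent=0.5):
    """Return how many samples land in each split for num annotation files."""
    tv = int(num * trainval_percent)   # train + val
    tr = int(tv * train_percent)       # train only
    return {'trainval': tv, 'train': tr, 'val': tv - tr, 'test': num - tv}


print(split_sizes(100))
```

With 100 annotation files, this gives 66 trainval samples, split into 33 for training and 33 for validation, and leaves 34 for testing.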

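3. Converting the Data to TFRecord Format

There are typically two ways to convert the XML annotation files to TFRecord format:

Method 1: XML → CSV → TFRecord

First, convert the XML files to a CSV file: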

import os
import glob
import pandas as pd
import xml.etree.ElementTree as ET


def xml_to_csv(path):
    """Collect one row per annotated object from every XML file in path."""
    xml_list = []
    # Note: a pattern of '/*.xml' would make os.path.join discard path
    for xml_file in glob.glob(os.path.join(path, '*.xml')):
        tree = ET.parse(xml_file)
        root = tree.getroot()
        for member in root.findall('object'):
            # Positional access assumes the usual VOC element order:
            # size = (width, height, ...), object = (name, ..., bndbox)
            value = (
                root.find('filename').text,
                int(root.find('size')[0].text),
                int(root.find('size')[1].text),
                member[0].text,
                int(member[4][0].text),
                int(member[4][1].text),
                int(member[4][2].text),
                int(member[4][3].text),
            )
            xml_list.append(value)
    column_name = ['filename', 'width', 'height', 'class', 'xmin', 'ymin', 'xmax', 'ymax']
    xml_df = pd.DataFrame(xml_list, columns=column_name)
    return xml_df


def main():
    # Expects the XML files of each split in <project_path>/train and <project_path>/test
    for directory in ['train', 'test']:
        project_path = 'E:/gitcode/tensorflow-model/VOCPolice/VOC2007'
        image_path = os.path.join(project_path, directory)
        xml_df = xml_to_csv(image_path)
        xml_df.to_csv(os.path.join(project_path, f'{directory}_labels.csv'), index=False)
        print('Successfully converted XML to CSV.')


if __name__ == '__main__':
    main()

After it runs, train_labels.csv and test_labels.csv are generated under the project path.
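
The conversion script reads fields by position (for example member[4][0]), which breaks if an annotation tool writes the elements in a different order. As a hedged alternative, the sketch below looks fields up by tag name instead. The sample XML and the function name parse_voc_xml are illustrative assumptions, not taken from the dataset above.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation, shaped like a typical Pascal VOC file
SAMPLE_XML = """<annotation>
  <filename>image1.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>police</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox><xmin>48</xmin><ymin>60</ymin><xmax>200</xmax><ymax>410</ymax></bndbox>
  </object>
</annotation>"""


def parse_voc_xml(xml_text):
    """Return one (filename, width, height, class, xmin, ymin, xmax, ymax) row per object."""
    root = ET.fromstring(xml_text)
    filename = root.find('filename').text
    size = root.find('size')
    width = int(size.find('width').text)
    height = int(size.find('height').text)
    rows = []
    for obj in root.findall('object'):
        box = obj.find('bndbox')
        rows.append((
            filename, width, height, obj.find('name').text,
            int(box.find('xmin').text), int(box.find('ymin').text),
            int(box.find('xmax').text), int(box.find('ymax').text),
        ))
    return rows


print(parse_voc_xml(SAMPLE_XML))
```

The rows have the same column order as the CSV produced above, so they could be passed to the same DataFrame constructor.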

Method 2: Use TensorFlow's create_pascal_tf_record.py script

The script provided by TensorFlow converts the dataset to TFRecord format directly.
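
As a rough sketch of how the script is typically invoked, the helper below assembles the command line. The flags shown (data_dir, year, set, output_path, label_map_path) are the ones the script defines; the helper name build_tfrecord_command, the default script location, and the example paths are assumptions, since the script's location differs between releases of the models repository. Note that some releases read the image list from a class-specific file such as aeroplane_<set>.txt, so a custom dataset may need that line in the script adjusted.

```python
import sys


def build_tfrecord_command(data_dir, split, output_path, label_map_path,
                           script='create_pascal_tf_record.py', year='VOC2007'):
    """Build the argument list for create_pascal_tf_record.py.

    data_dir is the directory that CONTAINS the VOC2007 folder, and
    split is one of the list names in ImageSets/Main (train, val, ...).
    """
    return [
        sys.executable, script,
        '--data_dir=' + data_dir,
        '--year=' + year,
        '--set=' + split,
        '--output_path=' + output_path,
        '--label_map_path=' + label_map_path,
    ]


if __name__ == '__main__':
    # Hypothetical paths; adjust them to your own layout
    cmd = build_tfrecord_command('VOCPolice', 'train', 'train.record',
                                 'data/pascal_label_map.pbtxt')
    print(' '.join(cmd))
    # To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Running it once per split (for example train and val) would produce one .record file for each.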

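4. Preparing for Training

1. Download a pretrained model

Download a pretrained model from the TensorFlow Model Zoo and extract it into the object_detection directory.

2. Edit the label file

Edit pascal_label_map.pbtxt as needed. You can also create your own label file in the following format: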

# Pascal VOC class labels
item {
  id: 1
  name: 'police'
}
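
A label map file is a text-format protobuf with one item block per class, and ids start at 1 because 0 is reserved for the background class. For datasets with many classes, writing the file by hand gets tedious; the sketch below generates the text from a list of class names. The function name label_map_text is hypothetical.

```python
def label_map_text(class_names):
    """Return label map text with one item block per class, ids starting at 1."""
    items = []
    for idx, name in enumerate(class_names, start=1):
        items.append("item {\n  id: %d\n  name: '%s'\n}\n" % (idx, name))
    return "\n".join(items)


print(label_map_text(['police']))
```

The returned string can be written to a file such as data/pascal_label_map.pbtxt with a plain open(..., 'w').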

3. Edit the training configuration file

Adjust the training parameters to your own setup, for example:

# Example configuration file: only the fields to edit are shown
# (in the full pipeline config they sit inside the model, train_config, and input reader blocks)
num_classes: 1  # number of classes
batch_size: 2  # training batch size
fine_tune_checkpoint: "E:/gitcode/tensorflow-model/chde222-models-master-MyData1230/models/research/object_detection/ssd_mobilenet_v1_coco_2017_11_17/model.ckpt"  # path to the pretrained model
num_steps: 50000  # number of training steps
train_input_reader {
    label_map_path: "E:/gitcode/tensorflow-model/chde222-models-master-MyData1230/models/research/object_detection/data/pascal_label_map.pbtxt"
    tf_record_input_reader {
        input_path: "E:/gitcode/tensorflow-model/chde222-models-master-MyData1230/models/research/object_detection/train.record"
    }
}
eval_input_reader {
    label_map_path: "E:/gitcode/tensorflow-model/chde222-models-master-MyData1230/models/research/object_detection/data/pascal_label_map.pbtxt"
    shuffle: false
    num_readers: 1
    tf_record_input_reader {
        input_path: "E:/gitcode/tensorflow-model/chde222-models-master-MyData1230/models/research/object_detection/test.record"
    }
}
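
When choosing num_steps and batch_size, it can be useful to know roughly how many passes over the training set they amount to. The sketch below does that arithmetic; the function name epochs_seen and the figure of 1000 training images are illustrative assumptions, not values from this project.

```python
def epochs_seen(num_steps, batch_size, num_train_images):
    """Approximate number of passes over the training set."""
    return num_steps * batch_size / num_train_images


# 50000 steps at batch size 2, with a hypothetical 1000 training images
print(epochs_seen(50000, 2, 1000))
```

With the values above, the model sees 100000 samples in total, which is 100 passes over a 1000-image training set.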

5. Start Training

Run train.py, making sure the following flags are set correctly:

# Example train.py configuration
flags.DEFINE_string('train_dir', 'E:/gitcode/tensorflow-model/chde222-models-master-MyData1230/models/research/object_detection/training',
                    'Directory where training results are stored')
flags.DEFINE_string('pipeline_config_path', 'E:/gitcode/tensorflow-model/chde222-models-master-MyData1230/models/research/object_detection/data/pascal_config.pbtxt',
                    'Path to the training configuration file')

Once started, training begins and the results are saved in the specified train_dir directory.

6. Viewing the Training Curves

Start TensorBoard from the object_detection directory:

tensorboard --logdir=E:/gitcode/tensorflow-model/chde222-models-master-MyData1230/models/research/object_detection

7. Exporting the Model

Edit export_inference_graph.py and run it:

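flags.DEFINE_string('pipeline_config_path', 'E:/gitcode/tensorflow-model/chde222-models-master-MyData1230/models/research/object_detection/data/pascal_config.pbtxt',
                    'Path to the configuration file used for export')
# trained_checkpoint_prefix must point at a checkpoint in train_dir, not at a .record file.
# The step number 50000 assumes training ran to num_steps; use the checkpoint you actually have.
flags.DEFINE_string('trained_checkpoint_prefix', 'E:/gitcode/tensorflow-model/chde222-models-master-MyData1230/models/research/object_detection/training/model.ckpt-50000',
                    'Prefix of the trained checkpoint to export')
flags.DEFINE_string('output_directory', 'E:/gitcode/tensorflow-model/chde222-models-master-MyData1230/models/research/object_detection/exported',
                    'Directory where the exported model is stored')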

After it runs, the model is exported to the specified directory.
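
The checkpoint prefix passed to the export script has to name a checkpoint that exists in the training directory. TensorFlow 1.x writes each checkpoint as several files sharing a prefix such as model.ckpt-1234 (with .index, .meta, and .data-* suffixes). The sketch below finds the prefix with the highest step number using only the standard library; the function name latest_checkpoint_prefix is hypothetical, and TensorFlow's own tf.train.latest_checkpoint offers similar functionality.

```python
import os
import re


def latest_checkpoint_prefix(train_dir):
    """Return the model.ckpt-<step> prefix with the highest step, or None."""
    pattern = re.compile(r'^(model\.ckpt-(\d+))\.index$')
    best = None  # (step, prefix) of the newest checkpoint seen so far
    for name in os.listdir(train_dir):
        match = pattern.match(name)
        if match:
            step = int(match.group(2))
            if best is None or step > best[0]:
                best = (step, match.group(1))
    if best is None:
        return None
    return os.path.join(train_dir, best[1])
```

The returned path can be used as the value of trained_checkpoint_prefix.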

8. Testing the Model

Create a myTest.py file in the object_detection directory and run it:

# Written against the TensorFlow 1.x API (tf.Session, tf.GraphDef, tf.gfile)
import os
import sys

import numpy as np
import tensorflow as tf
from matplotlib import pyplot as plt
from PIL import Image

# Add the project root to the import path
sys.path.append("..")
from object_detection.utils import ops as utils_ops
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as vis_util


def load_image_into_numpy_array(image):
    (im_width, im_height) = image.size
    return np.array(image.getdata()).reshape(
        (im_height, im_width, 3)).astype(np.uint8)


def run_inference_for_single_image(image, graph):
    """Run the detector on one (height, width, 3) image and return its outputs."""
    with graph.as_default():
        with tf.Session(config=tf.ConfigProto()) as sess:
            # Collect the output tensors that exist in this graph
            ops = tf.get_default_graph().get_operations()
            all_tensor_names = {output.name for op in ops for output in op.outputs}
            tensor_dict = {}
            for key in ['num_detections', 'detection_boxes', 'detection_scores',
                        'detection_classes', 'detection_masks']:
                tensor_name = key + ':0'
                if tensor_name in all_tensor_names:
                    tensor_dict[key] = tf.get_default_graph().get_tensor_by_name(tensor_name)
            if 'detection_masks' in tensor_dict:
                # Mask models only: reframe box-relative masks to image size
                detection_boxes = tf.squeeze(tensor_dict['detection_boxes'], [0])
                detection_masks = tf.squeeze(tensor_dict['detection_masks'], [0])
                real_num_detection = tf.cast(tensor_dict['num_detections'][0], tf.int32)
                detection_boxes = tf.slice(detection_boxes, [0, 0], [real_num_detection, -1])
                detection_masks = tf.slice(detection_masks, [0, 0, 0], [real_num_detection, -1, -1])
                detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
                    detection_masks, detection_boxes, image.shape[0], image.shape[1])
                detection_masks_reframed = tf.cast(
                    tf.greater(detection_masks_reframed, 0.5), tf.uint8)
                tensor_dict['detection_masks'] = tf.expand_dims(
                    detection_masks_reframed, 0)
            image_tensor = tf.get_default_graph().get_tensor_by_name('image_tensor:0')
            # The batch dimension is added here, so callers pass an unexpanded image
            output_dict = sess.run(tensor_dict,
                                   feed_dict={image_tensor: np.expand_dims(image, 0)})
            # Drop the batch dimension and convert types
            output_dict['num_detections'] = int(output_dict['num_detections'][0])
            output_dict['detection_classes'] = output_dict['detection_classes'][0].astype(np.uint8)
            output_dict['detection_boxes'] = output_dict['detection_boxes'][0]
            output_dict['detection_scores'] = output_dict['detection_scores'][0]
            if 'detection_masks' in output_dict:
                output_dict['detection_masks'] = output_dict['detection_masks'][0]
            return output_dict


def main(_):  # tf.app.run() calls main(argv), so main takes one argument
    # Change the model path as needed
    MODEL_NAME = './Police_detection1231/'
    PATH_TO_FROZEN_GRAPH = os.path.join(MODEL_NAME, 'frozen_inference_graph.pb')
    PATH_TO_LABELS = os.path.join('data', 'pascal_label_map.pbtxt')
    NUM_CLASSES = 1
    # Change the test image path as needed
    PATH_TO_TEST_IMAGES_DIR = 'test_police'
    TEST_IMAGE_PATHS = [os.path.join(PATH_TO_TEST_IMAGES_DIR, f'image{i}.jpg') for i in range(1, 22)]

    # Load the frozen model
    detection_graph = tf.Graph()
    with detection_graph.as_default():
        od_graph_def = tf.GraphDef()
        with tf.gfile.GFile(PATH_TO_FROZEN_GRAPH, 'rb') as fid:
            serialized_graph = fid.read()
            od_graph_def.ParseFromString(serialized_graph)
            tf.import_graph_def(od_graph_def, name='')

    # Load the label file
    label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
    categories = label_map_util.convert_label_map_to_categories(
        label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
    category_index = label_map_util.create_category_index(categories)

    # Iterate over the test images
    for image_path in TEST_IMAGE_PATHS:
        image = Image.open(image_path)
        image_np = load_image_into_numpy_array(image)
        # Pass the unexpanded image; the batch dimension is added during inference
        output_dict = run_inference_for_single_image(image_np, detection_graph)
        # Draw the boxes and labels onto the image array (in place)
        vis_util.visualize_boxes_and_labels_on_image_array(
            image_np,
            output_dict['detection_boxes'],
            output_dict['detection_classes'],
            output_dict['detection_scores'],
            category_index,
            instance_masks=output_dict.get('detection_masks'),
            use_normalized_coordinates=True,
            line_thickness=8)
        # Plot the annotated array before saving; otherwise the saved figure is empty
        plt.figure()
        plt.imshow(image_np)
        plt.savefig(image_path + '_labeled.jpg')
        plt.close()
        print("Done")


if __name__ == '__main__':
    tf.app.run()
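
The detection boxes returned by the model are normalized to the range 0 to 1 and ordered as [ymin, xmin, ymax, xmax], which is why the visualization call sets use_normalized_coordinates=True. To use the results outside the visualization helper, they usually need to be filtered by score and scaled to pixels. The following is a minimal sketch of that step in plain Python; the function name detections_to_pixels is hypothetical, and the 0.5 threshold mirrors the default used by the visualization utility.

```python
def detections_to_pixels(boxes, scores, classes, image_width, image_height,
                         min_score=0.5):
    """Keep detections scoring at least min_score and scale boxes to pixels.

    boxes holds normalized [ymin, xmin, ymax, xmax] entries; each result
    box is returned as (xmin, ymin, xmax, ymax) in pixel coordinates.
    """
    results = []
    for box, score, cls in zip(boxes, scores, classes):
        if score < min_score:
            continue
        ymin, xmin, ymax, xmax = box
        results.append({
            'class_id': int(cls),
            'score': float(score),
            'box': (int(round(xmin * image_width)),
                    int(round(ymin * image_height)),
                    int(round(xmax * image_width)),
                    int(round(ymax * image_height))),
        })
    return results
```

It accepts the detection_boxes, detection_scores, and detection_classes arrays from the output dictionary directly, since it only iterates over them.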

9. Evaluating the Model

Edit eval.py and run it:

flags.DEFINE_string('checkpoint_dir', 'E:/gitcode/tensorflow-model/chde222-models-master-MyData1230/models/research/object_detection/training',
                    'Directory of the model checkpoints to evaluate')
flags.DEFINE_string('eval_dir', 'E:/gitcode/tensorflow-model/chde222-models-master-MyData1230/models/research/object_detection/eval',
                    'Directory where evaluation results are stored')
flags.DEFINE_string('pipeline_config_path', 'E:/gitcode/tensorflow-model/chde222-models-master-MyData1230/models/research/object_detection/data/pascal_config.pbtxt',
                    'Path to the configuration file used for evaluation')

After it runs, the model is evaluated and the results are saved in the specified directory.