
Research on Video Pedestrian Re-identification Technology Based on Deep Learning


Abstract

In recent years, with the continued development of the information society and the public's growing concern for public safety, video surveillance equipment has been widely deployed in urban public spaces. As a key component of video surveillance systems, pedestrian re-identification has become a research hotspot in computer vision. Pedestrian re-identification must cope with variations in illumination, viewpoint, and occlusion, and video-based pedestrian re-identification additionally faces the challenge of exploiting the temporal information in video sequences. This thesis studies and analyzes current video pedestrian re-identification methods in depth and proposes improvements to their outstanding problems.

To address the weak spatio-temporal feature extraction of current video pedestrian re-identification models, a re-identification network based on non-local attention modules and multi-level feature fusion is designed. Non-local attention modules are embedded in a ResNet-50 backbone to extract global features, and a multi-layer feature fusion network is constructed to obtain salient pedestrian features; the resulting features are then compared by similarity measurement and ranked for matching to obtain the accuracy. The proposed model improves performance noticeably on every dataset: on the large MARS dataset it reaches 81.4% mAP and 88.7% Rank-1, on DukeMTMC-VideoReID it reaches 93.4% mAP and 95.3% Rank-1, and on the small PRID2011 dataset it reaches 94.8% Rank-1.
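Since the abstract only summarizes the architecture, the PyTorch sketch below is purely illustrative: it embeds a standard embedded-Gaussian non-local block after layer3 of ResNet-50 and aggregates per-frame features by temporal average pooling. The block placement, the channel-reduction factor, and the omission of the multi-layer feature fusion branch are all assumptions, not the thesis's actual design.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50


class NonLocalBlock(nn.Module):
    """Embedded-Gaussian non-local block: every spatial position attends to all others."""

    def __init__(self, in_channels, reduction=2):
        super().__init__()
        inter = in_channels // reduction
        self.theta = nn.Conv2d(in_channels, inter, kernel_size=1)
        self.phi = nn.Conv2d(in_channels, inter, kernel_size=1)
        self.g = nn.Conv2d(in_channels, inter, kernel_size=1)
        self.out = nn.Conv2d(inter, in_channels, kernel_size=1)
        nn.init.zeros_(self.out.weight)  # start as an identity mapping
        nn.init.zeros_(self.out.bias)

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.theta(x).flatten(2).transpose(1, 2)  # (b, hw, c')
        k = self.phi(x).flatten(2)                    # (b, c', hw)
        v = self.g(x).flatten(2).transpose(1, 2)      # (b, hw, c')
        attn = torch.softmax(q @ k, dim=-1)           # pairwise relations between positions
        y = (attn @ v).transpose(1, 2).reshape(b, -1, h, w)
        return x + self.out(y)                        # residual connection


class NonLocalReID(nn.Module):
    """ResNet-50 with a non-local block after layer3 (placement is an assumption)
    and temporal average pooling over the frames of a tracklet."""

    def __init__(self):
        super().__init__()
        backbone = resnet50(weights=None)
        self.stem = nn.Sequential(backbone.conv1, backbone.bn1, backbone.relu,
                                  backbone.maxpool, backbone.layer1,
                                  backbone.layer2, backbone.layer3)
        self.non_local = NonLocalBlock(1024)   # layer3 of ResNet-50 outputs 1024 channels
        self.layer4 = backbone.layer4
        self.pool = nn.AdaptiveAvgPool2d(1)

    def forward(self, clips):                  # clips: (batch, frames, 3, H, W)
        b, t = clips.shape[:2]
        x = clips.flatten(0, 1)                # fold frames into the batch dimension
        x = self.layer4(self.non_local(self.stem(x)))
        x = self.pool(x).flatten(1)            # (b*t, 2048) per-frame features
        return x.view(b, t, -1).mean(dim=1)    # temporal average -> one clip descriptor
```

In a complete pipeline, the clip descriptors returned here would be compared with a distance metric against a gallery and ranked to produce the Rank-1 and mAP figures quoted above.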

To address the problem that images produced by data augmentation in video pedestrian re-identification contain noise that prevents salient features from being extracted, a blind-denoising, self-supervised-compression generative adversarial network is proposed for data augmentation. A blind-denoising GAN augments the original dataset to enlarge the training set while denoising the generated pedestrian images, and self-supervised compression is applied to the GAN to reduce the computational cost. Experimental results show that the proposed model yields clear gains on the MARS and DukeMTMC-VideoReID datasets, reaching 89.1% and 96.7% Rank-1 and 82.4% and 94.1% mAP, respectively.
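The abstract does not detail the GAN architecture, the blind-denoising network, or the self-supervised compression scheme, so the sketch below only illustrates the general augmentation pipeline it implies: generate synthetic pedestrian images, pass them through a denoiser, and mix them into the training batch. Every module name, layer size, and the residual denoising formulation are hypothetical stand-ins, not the method proposed in the thesis.

```python
import torch
import torch.nn as nn


class Generator(nn.Module):
    """Tiny DCGAN-style generator: latent vector -> 3x64x32 pedestrian-shaped image."""

    def __init__(self, z_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(z_dim, 256, kernel_size=(4, 2), stride=1),                      # 4x2
            nn.BatchNorm2d(256), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.ReLU(inplace=True),  # 8x4
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.ReLU(inplace=True),    # 16x8
            nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.BatchNorm2d(32), nn.ReLU(inplace=True),     # 32x16
            nn.ConvTranspose2d(32, 3, 4, 2, 1), nn.Tanh(),                                      # 64x32
        )

    def forward(self, z):
        return self.net(z.view(z.size(0), -1, 1, 1))


class Denoiser(nn.Module):
    """Small residual CNN that predicts and subtracts the noise in a generated image,
    standing in for the blind-denoising step described above."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 3, 3, padding=1),
        )

    def forward(self, x):
        return x - self.net(x)   # residual learning: output = input - predicted noise


def augment_batch(generator, denoiser, real_images, n_fake):
    """Mix real training images with denoised synthetic ones (illustrative only)."""
    z = torch.randn(n_fake, 128, device=real_images.device)
    with torch.no_grad():
        fake = denoiser(generator(z))
    return torch.cat([real_images, fake], dim=0)


if __name__ == "__main__":
    g, d = Generator(), Denoiser()
    real = torch.rand(8, 3, 64, 32) * 2 - 1   # stand-in for a batch of real frames in [-1, 1]
    batch = augment_batch(g, d, real, n_fake=4)
    print(batch.shape)                        # torch.Size([12, 3, 64, 32])
```

Training of the generator, the adversarial loss, and the self-supervised compression of the GAN are omitted here; the snippet only shows how denoised synthetic images would be appended to a real batch to enlarge the training set.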

Keywords: video pedestrian re-identification; non-local attention; feature fusion; blind denoising; self-supervised compression
