Semantics-Consistent Representation Learning for Remote Sensing Image-Voice Retrieval
Ning, Hailong (1); Zhao, Bin (2); Yuan, Yuan (3)
Department: Ocean Optics Technology Research Laboratory
Journal: IEEE Transactions on Geoscience and Remote Sensing
ISSN: 0196-2892; 1558-0644
Ownership rank: 1
Abstract

With the development of earth observation technology, massive amounts of remote sensing (RS) images are acquired. Cross-modal RS image-voice retrieval offers a new way to find useful information in these images. This article studies the task of RS image-voice retrieval so as to retrieve useful information from massive amounts of RS data. Existing methods for RS image-voice retrieval rely primarily on the pairwise relationship to narrow the heterogeneous semantic gap between images and voices. However, apart from the pairwise relationship included in the data sets, the intramodality and nonpaired intermodality relationships should also be considered simultaneously, since the semantic consistency among nonpaired representations plays an important role in the RS image-voice retrieval task. Inspired by this, a semantics-consistent representation learning (SCRL) method is proposed for RS image-voice retrieval. The main novelty is that the proposed method takes the pairwise, intramodality, and nonpaired intermodality relationships into account simultaneously, thereby improving the semantic consistency of the learned representations for RS image-voice retrieval. The proposed SCRL method consists of two main steps: 1) semantics encoding and 2) SCRL. First, an image encoding network is adopted to extract high-level image features with a transfer learning strategy, and a voice encoding network with dilated convolution is devised to obtain high-level voice features. Second, a consistent representation space is constructed by modeling the three kinds of relationships to narrow the heterogeneous semantic gap and learn semantics-consistent representations across the two modalities. Extensive experimental results on three challenging RS image-voice data sets (Sydney, UCM, and RSICD) show the effectiveness of the proposed method.
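
The abstract mentions a voice encoding network built on dilated convolution but gives no architectural details. As a rough illustration of that one building block only, the sketch below implements a plain 1-D dilated convolution and computes the receptive field of a stack of such layers. It is not the authors' network: the function names, the toy signal, the kernel, and the dilation rates are all assumptions chosen for illustration.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (no padding) 1-D dilated convolution.

    Uses the cross-correlation form common in deep-learning libraries:
    out[t] = sum_k kernel[k] * signal[t + k * dilation].
    """
    # Number of input samples covered by one output sample.
    span = (len(kernel) - 1) * dilation + 1
    if len(signal) < span:
        return []
    return [
        sum(w * signal[start + k * dilation] for k, w in enumerate(kernel))
        for start in range(len(signal) - span + 1)
    ]


def receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 dilated convolution layers."""
    field = 1
    for d in dilations:
        field += (kernel_size - 1) * d
    return field


# Hypothetical toy input standing in for a short audio feature sequence.
signal = [1, 2, 3, 4, 5, 6, 7, 8]
kernel = [1, 0, -1]

dense = dilated_conv1d(signal, kernel, dilation=1)    # taps 1 sample apart
dilated = dilated_conv1d(signal, kernel, dilation=2)  # taps 2 samples apart

print(dense)
print(dilated)
print(receptive_field(3, [1, 2, 4]))
```

With the same three-tap kernel, a larger dilation makes each output sample cover a longer stretch of the input without adding weights, which is the usual motivation for dilated convolution on long audio sequences.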

Keywords: Heterogeneous semantic gap; remote sensing (RS) image–voice retrieval; semantics-consistent representation
DOI: 10.1109/TGRS.2021.3060705
Indexed by: SCI; EI
Language: English
WOS accession number: WOS:000728266600060
Publisher: Institute of Electrical and Electronics Engineers Inc.
EI accession number: 20211110072662
Document type: Journal article
Item identifier: http://ir.opt.ac.cn/handle/181661/94569
Collection: Ocean Optics Technology Research Laboratory
Author affiliations:
1. Shaanxi Key Laboratory of Ocean Optics, Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi'an 710119, China
2. School of Artificial Intelligence, Optics and Electronics (iOPEN), Northwestern Polytechnical University, Xi'an 710072, China
3. School of Artificial Intelligence, Optics and Electronics (iOPEN), Northwestern Polytechnical University, Xi'an 710072, China (e-mail: y.yuan1.ieee@qq.com)
Recommended citation
GB/T 7714: Ning, Hailong, Zhao, Bin, Yuan, Yuan. Semantics-Consistent Representation Learning for Remote Sensing Image-Voice Retrieval[J]. IEEE Transactions on Geoscience and Remote Sensing.
APA: Ning, Hailong, Zhao, Bin, & Yuan, Yuan.
MLA: Ning, Hailong, et al. "Semantics-Consistent Representation Learning for Remote Sensing Image-Voice Retrieval". IEEE Transactions on Geoscience and Remote Sensing
Files in this item
File name/size: Semantics-Consistent (4915KB)
Document type: Journal article
Version: Published version
Access type: Open access
License: CC BY-NC-SA
File name: Semantics-Consistent Representation Learning for Remote Sensing Image-Voice Retrieval.pdf
Format: Adobe PDF
Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.