RAW image reconstruction based on multi-scale attention mechanism

ZHANG Ke; LIU Yu; HU Kai

doi:10.13700/j.bh.1001-5965.2022.0959

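ZHANG Ke^{1},
LIU Yu^{1},
HU Kai^{1, 2}

1. School of Microelectronics, Tianjin University, Tianjin 300072, China
2. International Institute for Innovative Design and Intelligent Manufacturing of Zhejiang, Tianjin University, Shaoxing 312000, China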

Funds:

National Natural Science Foundation of China (61771338); Key Project of Tianjin Science and Technology Program (18ZXRHSY00190)

Corresponding author: E-mail: liuyu@tju.edu.cn

CLC number: TN911.73
Publication history
- Received: 2022-12-03
- Accepted: 2023-02-03
- Published online: 2023-05-05
- Issue published: 2025-01-31

Abstract:
Traditional image signal processing (ISP) pipelines are cumbersome. Building on PyNET, a network model that can replace the ISP pipeline, this paper proposes Py-CBAM, an end-to-end RAW image reconstruction method. The method introduces an efficient attention mechanism and uses it to redesign the multi-level, multi-scale structure of PyNET, so that features at different scales are weighted adaptively and reconstruction performance improves. On the publicly available ZRR dataset, Py-CBAM improves the peak signal-to-noise ratio (PSNR) by 0.37 dB and the structural similarity index measure (SSIM) by 0.0018 compared with PyNET. After Py-CBAM is retrained jointly on the ZRR dataset and the newly constructed NRR dataset, the PSNR and SSIM reach 25.73 dB and 0.9654, respectively. Visually, the proposed method addresses the high noise, color distortion, and geometric distortion that arise in RAW image reconstruction, and it strengthens the model's reconstruction ability across multiple scenes and lighting conditions. The reconstructed images are more realistic and have better visual quality, with the most noticeable gains in overexposed and overly dark regions.
- image signal processing /
- image reconstruction /
- augmentation network /
- attention mechanism /
- deep learning
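
The abstract reports results as PSNR in dB. As a reminder of what that number measures, here is a minimal NumPy sketch of the standard PSNR definition. The function name `psnr` and the default peak value of 255 (8-bit images) are assumptions made for illustration; the page does not say how the authors computed the metric.

```python
import numpy as np


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images.

    max_value is the largest possible pixel value (assumed 255 for 8-bit data).
    """
    ref = np.asarray(reference, dtype=np.float64)
    rec = np.asarray(reconstructed, dtype=np.float64)
    if ref.shape != rec.shape:
        raise ValueError("images must have the same shape")
    mse = np.mean((ref - rec) ** 2)  # mean squared error over all pixels
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))
```

Because the scale is logarithmic, a gain of 0.37 dB corresponds to a reduction of roughly 8% in mean squared error.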

Figure 1. Visualized RAW data and reconstructed image

Figure 2. Convolutional block attention module
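
Since the figure itself is not reproduced here, the following NumPy sketch shows the general structure of a convolutional block attention module as described by Woo et al.: channel attention from average- and max-pooled descriptors passed through a shared two-layer MLP, followed by spatial attention from a convolution over channel-wise average and max maps. All weights are random placeholders, biases are omitted, and the names `channel_attention`, `spatial_attention`, and `cbam` are illustrative assumptions. How the module is wired into PyNET is not specified on this page, so this is not the authors' implementation.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(x, w1, w2):
    """x: (C, H, W); w1: (C // r, C); w2: (C, C // r). Returns (C, 1, 1) weights."""
    avg = x.mean(axis=(1, 2))  # global average pooling per channel
    mx = x.max(axis=(1, 2))    # global max pooling per channel

    def mlp(v):
        # Shared two-layer MLP with ReLU, applied to both descriptors.
        return w2 @ np.maximum(w1 @ v, 0.0)

    return sigmoid(mlp(avg) + mlp(mx))[:, None, None]


def spatial_attention(x, kernel):
    """x: (C, H, W); kernel: (2, k, k) with odd k. Returns (1, H, W) weights."""
    pooled = np.stack([x.mean(axis=0), x.max(axis=0)])  # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))
    h, w = x.shape[1:]
    out = np.empty((h, w))
    for i in range(h):  # plain sliding-window convolution, zero padded
        for j in range(w):
            out[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return sigmoid(out)[None]


def cbam(x, w1, w2, kernel):
    """Apply channel attention, then spatial attention, to a feature map."""
    x = x * channel_attention(x, w1, w2)
    return x * spatial_attention(x, kernel)


# Example with placeholder weights: 8 channels, reduction ratio 4, 7x7 kernel.
rng = np.random.default_rng(0)
features = rng.standard_normal((8, 16, 16))
w1 = rng.standard_normal((2, 8))
w2 = rng.standard_normal((8, 2))
kernel = 0.1 * rng.standard_normal((2, 7, 7))
refined = cbam(features, w1, w2, kernel)
```

Both attention maps lie between 0 and 1, so the module rescales features rather than adding new ones; the output has the same shape as the input.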

Figure 3. Multi-convolution block attention module

Figure 4. Structure of the proposed Py-CBAM model

Figure 5. Visual comparison of ablation experiment results

Figure 6. Comparison of experimental results in night scenes

Table 1. Comparison of ablation experiment results

| Model | Channel attention | Spatial attention | PSNR/dB | SSIM |
| --- | --- | --- | --- | --- |
| PyNET | | | 20.77 | 0.8601 |
| Py-CA | √ | | 20.89 | 0.8597 |
| Py-SA | | √ | 20.97 | 0.8610 |
| Py-CBAM | √ | √ | 21.14 | 0.8619 |
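
The gains over PyNET can be read directly off Table 1. The short Python sketch below recomputes them from the tabulated values; the dictionary layout and variable names are illustrative only.

```python
# (PSNR in dB, SSIM) for each model, copied from Table 1.
results = {
    "PyNET": (20.77, 0.8601),
    "Py-CA": (20.89, 0.8597),
    "Py-SA": (20.97, 0.8610),
    "Py-CBAM": (21.14, 0.8619),
}

base_psnr, base_ssim = results["PyNET"]

# Gain of each variant over the PyNET baseline, rounded to table precision.
gains = {
    name: (round(p - base_psnr, 2), round(s - base_ssim, 4))
    for name, (p, s) in results.items()
    if name != "PyNET"
}

for name, (d_psnr, d_ssim) in gains.items():
    print(f"{name}: PSNR {d_psnr:+.2f} dB, SSIM {d_ssim:+.4f}")
```

For Py-CBAM this gives +0.37 dB and +0.0018, matching the figures quoted in the abstract. Channel attention alone (Py-CA) raises PSNR but slightly lowers SSIM relative to PyNET.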

Table 2. Comparison of brightness variance results

| Model | Variance |
| --- | --- |
| PyNET | 1108.6 |
| Py-CA | 1042.0 |
| Py-SA | 1022.2 |
| Py-CBAM | 994.3 |
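
This page does not define how the brightness variance in Table 2 is computed. One plausible reading, sketched below, is the variance of per-pixel luminance of an RGB image. The Rec. 601 luma weights, the 0 to 255 value range, the population variance, and the function name `brightness_variance` are all assumptions, so the numbers it produces are not guaranteed to be comparable with the table.

```python
import numpy as np

# Rec. 601 luma weights for R, G, B (an assumed choice of brightness measure).
LUMA_WEIGHTS = np.array([0.299, 0.587, 0.114])


def brightness_variance(rgb):
    """Population variance of per-pixel luminance for an (H, W, 3) RGB image."""
    rgb = np.asarray(rgb, dtype=np.float64)
    if rgb.ndim != 3 or rgb.shape[-1] != 3:
        raise ValueError("expected an (H, W, 3) RGB array")
    luma = rgb @ LUMA_WEIGHTS  # (H, W) luminance map
    return float(luma.var())
```

Under this reading, an image with large overexposed and very dark regions yields a high variance, while a uniformly lit image yields a value near zero.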

Table 3. Comparison of test results on NRR dataset

| Model | PSNR/dB | SSIM |
| --- | --- | --- |
| PyNET | 13.67 | 0.5537 |
| PyNET (night) | 20.22 | 0.7059 |
| Py-CBAM | 20.54 | 0.7158 |
| Py-CBAM (night) | 25.73 | 0.9654 |
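
Table 3 reports SSIM alongside PSNR. For orientation, the sketch below evaluates the SSIM formula once over the whole image. This is a simplification: standard SSIM averages the same expression over local windows, and the page does not say which variant or settings the authors used. The name `global_ssim` and the constants K1 = 0.01, K2 = 0.03 with peak value 255 are the conventional defaults, assumed here.

```python
import numpy as np


def global_ssim(x, y, max_value=255.0):
    """Single-window SSIM computed from whole-image statistics (a simplification)."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    if x.shape != y.shape:
        raise ValueError("images must have the same shape")
    c1 = (0.01 * max_value) ** 2  # stabilizes the luminance term
    c2 = (0.03 * max_value) ** 2  # stabilizes the contrast/structure term
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(numerator / denominator)
```

Identical images score 1, and any mismatch in mean brightness, contrast, or structure pulls the score below 1.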
