Image Super-Resolution: Fine-Tuning Real-ESRGAN-x4plus on Your Own Dataset
| Low-resolution test image → super-resolved reconstruction |
|---|
| ![]() |




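Preface
- My expertise is limited, so errors and omissions are inevitable; corrections are welcome.
- For more content, see the columns Python Daily Tips, OpenCV-Python Mini Applications, YOLO Series, Natural Language Processing, and AI Mixed-Language Programming in Practice, or my personal homepage.
- Ultralytics: Speed Estimation with YOLO11
- Ultralytics: Object Tracking with YOLO11
- Ultralytics: Object Counting with YOLO11
- Ultralytics: Object Blurring with YOLO11
- AI Mixed-Language Programming in Practice: Calling Python ONNX from C++ for YOLOv8 Inference
- AI Mixed-Language Programming in Practice: Calling a Packaged DLL from C++ for YOLOv8 Instance Segmentation
- AI Mixed-Language Programming in Practice: Calling Python ONNX from C++ for Image Super-Resolution Reconstruction
- AI Mixed-Language Programming in Practice: Calling Python AgentOCR from C++ for Text Recognition
- Understanding PatchCore Anomaly Detection Through a Worked Example
- Converting a YOLO-Format Instance Segmentation Dataset to COCO Format in Python
- YOLOv8 Ultralytics: Training the RT-DETR Real-Time Object Detector with the Ultralytics Framework
- Face Disguise Detection with DETR
- Training YOLOv7 on Your Own Dataset (Mask Detection)
- Training YOLOv8 on Your Own Dataset (Football Detection)
- YOLOv5: Accelerating YOLOv5 Inference with TensorRT
- YOLOv5: IoU, GIoU, DIoU, CIoU, EIoU
- Playing with Jetson Nano (5): Accelerating YOLOv5 Object Detection with TensorRT
- YOLOv5: Adding SE, CBAM, CoordAtt, and ECA Attention Mechanisms
- YOLOv5: Reading the yolov5s.yaml Config File and Adding a Small-Object Detection Layer
- Converting a COCO-Format Instance Segmentation Dataset to YOLO Format in Python
- YOLOv5: Training Your Own Instance Segmentation Model with Version 7.0 (Vehicles, Pedestrians, Road Signs, Lane Lines, etc.)
- Trying the Open-Source Stable Diffusion Project for Free with Kaggle GPU Resources
- Stable Diffusion: Deploying Stable Diffusion WebUI on a Server for AI Image Generation (v2.0)
- Stable Diffusion: Fine-Tuning a LoRA Model on Your Own Dataset (v2.0)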
Environment Requirements
Package Version Editable project location
----------------------- --------------- -------------------------------------------
python 3.8.13
absl-py 2.1.0
addict 2.4.0
autocommand 2.2.2
backports.tarfile 1.2.0
basicsr 1.4.2
cachetools 5.5.1
certifi 2025.1.31
charset-normalizer 3.4.1
contourpy 1.1.1
cycler 0.12.1
facexlib 0.3.0
filterpy 1.4.5
flatbuffers 25.2.10
fonttools 4.56.0
future 1.0.0
gfpgan 1.3.8
google-auth 2.38.0
google-auth-oauthlib 1.0.0
grpcio 1.70.0
idna 3.10
imageio 2.35.1
importlib_metadata 8.5.0
importlib_resources 6.4.5
inflect 7.3.1
jaraco.collections 5.1.0
jaraco.context 5.3.0
jaraco.functools 4.0.1
jaraco.text 3.12.1
kiwisolver 1.4.7
lazy_loader 0.4
llvmlite 0.41.1
lmdb 1.6.2
Markdown 3.7
MarkupSafe 2.1.5
matplotlib 3.7.5
more-itertools 10.3.0
networkx 3.1
numba 0.58.1
numpy 1.24.4
oauthlib 3.2.2
onnxruntime-gpu 1.11.0
opencv-python 4.10.0.84
packaging 24.2
pillow 10.4.0
pip 24.2
platformdirs 4.3.6
protobuf 5.29.3
pyasn1 0.6.1
pyasn1_modules 0.4.1
pyparsing 3.1.4
python-dateutil 2.9.0.post0
PyWavelets 1.4.1
PyYAML 6.0.2
realesrgan 0.3.0
requests 2.32.3
requests-oauthlib 2.0.0
rsa 4.9
scikit-image 0.21.0
scipy 1.10.1
setuptools 75.1.0
six 1.17.0
tb-nightly 2.14.0a20230808
tensorboard-data-server 0.7.2
tifffile 2023.7.10
tomli 2.2.1
torch 1.9.1+cu111
torchaudio 0.9.1
torchvision 0.10.1+cu111
tqdm 4.67.1
typeguard 4.3.0
typing_extensions 4.12.2
urllib3 2.2.3
Werkzeug 3.0.6
wheel 0.44.0
yapf 0.43.0
zipp 3.20.2
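Before training, it can help to confirm that the key packages match the list above. The following is a minimal sketch using only the standard library; the selection of packages in `REQUIRED` and the helper names are my own choices for illustration, not part of the Real-ESRGAN project.

```python
from importlib.metadata import PackageNotFoundError, version

# A few pins copied from the package list above; extend as needed.
REQUIRED = {
    "basicsr": "1.4.2",
    "realesrgan": "0.3.0",
    "torch": "1.9.1+cu111",
    "torchvision": "0.10.1+cu111",
    "numpy": "1.24.4",
}


def installed_version(name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def find_mismatches(required, lookup=installed_version):
    """Map package name -> (wanted, found) for every pin that is not met."""
    mismatches = {}
    for name, wanted in required.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


def report(required=REQUIRED):
    """Print one line per mismatch, or a confirmation if all pins are met."""
    mismatches = find_mismatches(required)
    if not mismatches:
        print("all listed packages match")
    for name, (wanted, found) in mismatches.items():
        print(f"{name}: wanted {wanted}, found {found or 'not installed'}")
```

Calling `report()` inside the training environment prints any package whose version differs from the pinned one. An exact match is a strict criterion; nearby versions may work as well.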
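Background
- Python is a cross-platform programming language: a high-level scripting language that combines interpreted, compiled, interactive, and object-oriented characteristics. It was originally designed for writing automation scripts (shell), and as new versions added language features it has increasingly been used for standalone, large-scale projects.
- PyTorch is a deep learning framework that packages many networks and deep-learning utilities so they can be called directly instead of being written one by one. It comes in CPU and GPU builds; other frameworks include TensorFlow and Caffe. PyTorch was released by Facebook AI Research (FAIR) on the basis of Torch. It is a Python-based scientific computing package with two high-level features: (1) tensor computation (like NumPy) with strong GPU acceleration; (2) automatic differentiation for building deep neural networks.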
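- Real-ESRGAN is an image super-resolution method built on ESRGAN (Enhanced Super-Resolution Generative Adversarial Networks) and designed for the complex, varied degradations found in real-world images.
- Core features
  - Improved GAN-based model: refines ESRGAN and uses adversarial training to reconstruct high-quality images
  - Pure synthetic training data: trained entirely on synthetic data that simulates the blur, noise, and compression artifacts real-world images can suffer
  - 4x upscaling: raises the resolution of a low-resolution image by a factor of 4
  - U-Net discriminator with skip connections: the discriminator (UNetDiscriminatorSN in the training log later in this post) downsamples and then upsamples its features, so it judges the image at several scales and gives feedback on local detail
- Technical advantages: compared with traditional super-resolution techniques, Real-ESRGAN
  - handles complex degradations in real-world images effectively
  - generates high-quality, detail-rich high-resolution images
  - restores fine details such as facial texture and architectural lines
  - works on many image types, including photos and animation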
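- Application scenarios
  - Image restoration: restoring old photos, removing scratches and noise
  - Image enhancement: raising the resolution and quality of low-quality pictures
  - Video restoration: building video restoration systems on top of image restoration
  - Professional image processing: meeting the needs of design, film, and television production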
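- Difference from traditional techniques: unlike traditional super-resolution methods, Real-ESRGAN performs "blind super-resolution". It handles a range of unknown degradations rather than one specific type. Because it is trained on a model of complex real-world degradation, it restores details more accurately when faced with genuinely degraded images.
- Strengths
  - High-quality output: compared with earlier models such as ESRGAN, Real-ESRGAN produces sharper, more detailed images and preserves the texture and structure of the original better.
  - Real-world degradation handling: Real-ESRGAN is an improved version of ESRGAN, optimized specifically for the complex degradations in real-world images; it handles blur, noise, and compression artifacts effectively.
  - 4x upscaling: the released RealESRGAN_x4plus.pth model performs 4x super-resolution.
  - Broad applicability: Real-ESRGAN is listed among the "built-in super-resolution algorithms" of upscaling tools alongside Waifu2x and SRMD, and is used for both anime artwork and everyday photos and videos.
  - Open-source community support: Real-ESRGAN is an open-source project with an active community that keeps updating and improving it.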
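- Limitations
  - High compute requirements: compared with other super-resolution algorithms, Real-ESRGAN needs considerable compute, especially on high-resolution images.
  - Slow processing: it is slower than lightweight models, especially on video.
  - Limited scale factors: the released models mainly target 4x upscaling (an x2 model is also available); higher factors cannot be reached without losing quality.
  - Limited effect on extreme degradation: on severely blurred or very low signal-to-noise images the result may fall short of expectations.
  - Not a fix for every image problem: Real-ESRGAN focuses on super-resolution and cannot repair missing content or structural errors; those need other restoration techniques alongside it.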
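- Directions for improvement
  - Faster processing: model compression, quantization, and similar techniques could make real-time use, such as video processing, feasible.
  - Larger scale factors: reaching 8x or even 16x while keeping image quality.
  - Lower compute requirements: lightweight versions that run efficiently on mobile devices would widen the range of applications.
  - Better handling of specific degradations: further optimizing for particular degradation types (such as motion blur and lens distortion), for example by simulating them during training.
  - Multi-task learning: combining super-resolution with other tasks (denoising, deblurring, color enhancement) for all-in-one quality improvement.
  - Better training data: more diverse training data to improve generalization across degradation types.
  - Better architecture: further refining the model, for example with a reverse-estimation step that better fuses the low-resolution and high-resolution forward processes.
- Real-ESRGAN is a widely used, practical super-resolution method that provides a strong tool for image restoration, enhancement, and reconstruction. It is applied in personal image processing, professional film and television production, and the digitization of cultural relics, and further improvement should add value in areas such as video restoration and old-photo restoration.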
- Official source code: https://github.com/xinntao/Real-ESRGAN.git
- Xintao Wang, Liangbin Xie, Chao Dong, Ying Shan. Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data. 2021
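The "pure synthetic data" idea above means each training sample is degraded with randomly drawn settings. The sketch below only draws those settings for one degradation stage; it does no image processing. The values are the first-stage options from the fine-tuning config later in this post, while the function name and the exact sampling order are my own reading of those option names, not code taken from the project.

```python
import random

# First-stage values copied from the fine-tuning config later in this post.
STAGE1 = {
    "resize_prob": [0.2, 0.7, 0.1],  # up, down, keep
    "resize_range": [0.15, 1.5],
    "gaussian_noise_prob": 0.5,
    "noise_range": [1, 30],
    "poisson_scale_range": [0.05, 3],
    "gray_noise_prob": 0.4,
    "jpeg_range": [30, 95],
}


def sample_stage(opt, rng):
    """Draw one set of degradation parameters (hypothetical helper)."""
    mode = rng.choices(["up", "down", "keep"], weights=opt["resize_prob"])[0]
    if mode == "up":
        scale = rng.uniform(1.0, opt["resize_range"][1])
    elif mode == "down":
        scale = rng.uniform(opt["resize_range"][0], 1.0)
    else:
        scale = 1.0

    # Either Gaussian noise (with a sigma) or Poisson noise (with a scale).
    if rng.random() < opt["gaussian_noise_prob"]:
        noise = ("gaussian", rng.uniform(*opt["noise_range"]))
    else:
        noise = ("poisson", rng.uniform(*opt["poisson_scale_range"]))

    return {
        "resize_mode": mode,
        "resize_scale": scale,
        "noise": noise,
        "gray_noise": rng.random() < opt["gray_noise_prob"],
        "jpeg_quality": rng.uniform(*opt["jpeg_range"]),
    }


rng = random.Random(0)
for _ in range(3):
    print(sample_stage(STAGE1, rng))
```

In the real pipeline a blur kernel is also drawn and the whole stage runs twice (the options ending in `2`), which is how a wide variety of degraded inputs comes out of a small set of clean images.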
Fine-Tuning the Real-ESRGAN-x4plus Model
Download the Real-ESRGAN Project
- Official source code: https://github.com/xinntao/Real-ESRGAN.git
Windows

After downloading and extracting the archive, the project directory looks as shown in the figure below.

Linux
git clone https://github.com/xinntao/Real-ESRGAN.git
Prepare the Dataset

[Optional] Generate Multi-Scale Images
python scripts/generate_multiscale_DF2K.py --input datasets/train_mydatas/trainset_HR --output datasets/train_mydatas/trainset_HR_multiscale
datasets/train_mydatas/trainset_HR/1.png
0.75
0.50
0.33
datasets/train_mydatas/trainset_HR/2.png
0.75
0.50
0.33
datasets/train_mydatas/trainset_HR/3.png
0.75
0.50
0.33
datasets/train_mydatas/trainset_HR/4.png
0.75
0.50
0.33
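To see what sizes this step produces, here is a small sketch that computes the output dimensions for one input image. It assumes the three printed factors are 0.75, 0.5, and 1/3 (shown as 0.33 above), and that a fourth image with a shortest edge of 400 px is written as well; that last step is inferred from the T3 files in the meta-info list later in this post, so check it against the script before relying on it. The function name is hypothetical.

```python
def multiscale_sizes(width, height, scales=(0.75, 0.5, 1 / 3), shortest_edge=400):
    """Return {suffix: (width, height)} for the images derived from one input."""
    sizes = {}
    for idx, scale in enumerate(scales):
        sizes[f"T{idx}"] = (int(width * scale), int(height * scale))
    # Assumed extra step: one more image whose shortest edge is `shortest_edge`.
    if width < height:
        new_w = shortest_edge
        new_h = int(new_w * height / width)
    else:
        new_h = shortest_edge
        new_w = int(new_h * width / height)
    sizes[f"T{len(scales)}"] = (new_w, new_h)
    return sizes


for suffix, size in multiscale_sizes(1000, 800).items():
    print(suffix, size)
```

Each input therefore contributes four extra images (T0 to T3), which matches the sixteen trainset_HR_multiscale entries listed for the four source images.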

[Optional] Crop into Sub-Images
python scripts/extract_subimages.py --input datasets/train_mydatas/trainset_HR --output datasets/train_mydatas/trainset_HR_multiscale_sub --crop_size 400 --step 200
mkdir datasets/train_mydatas/trainset_HR_multiscale_sub ...
Extract: 100%|█████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 5.47image/s]
All processes done.
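How many sub-images come out of one picture depends on its size, `--crop_size`, and `--step`. The sketch below assumes a plain sliding window with one extra crop flush against the border when pixels are left over (controlled by a threshold that I assume defaults to 0). The helper names are hypothetical and the logic is my understanding of the script, not a copy of it.

```python
def crop_origins(length, crop_size, step, thresh_size=0):
    """Top-left offsets of the crops along one image axis."""
    origins = list(range(0, length - crop_size + 1, step))
    # Add one final crop flush with the border if enough pixels are left over.
    if origins and length - (origins[-1] + crop_size) > thresh_size:
        origins.append(length - crop_size)
    return origins


def count_subimages(width, height, crop_size=400, step=200, thresh_size=0):
    """Number of crops produced for one image."""
    xs = crop_origins(width, crop_size, step, thresh_size)
    ys = crop_origins(height, crop_size, step, thresh_size)
    return len(xs) * len(ys)


print(count_subimages(1200, 1200))  # 5 offsets per axis
print(crop_origins(1100, 400, 200))
```

Under these assumptions a 1200x1200 image gives a 5x5 grid, the same count as the 25 sub-images per source image (s001 to s025) in the meta-info list later in this post. The actual image sizes are not stated here, so that size is only an example. Images smaller than the crop size produce no crops at all.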

Prepare the Meta Information
python scripts/generate_meta_info.py --input datasets/train_mydatas/trainset_HR datasets/train_mydatas/trainset_HR_multiscale datasets/train_mydatas/trainset_HR_multiscale_sub --root datasets/train_mydatas datasets/train_mydatas datasets/train_mydatas --meta_info datasets/train_mydatas/meta_info/meta_info.txt
trainset_HR/1.png
trainset_HR/2.png
trainset_HR/3.png
trainset_HR/4.png
trainset_HR_multiscale/1T0.png
trainset_HR_multiscale/1T1.png
trainset_HR_multiscale/1T2.png
trainset_HR_multiscale/1T3.png
trainset_HR_multiscale/2T0.png
trainset_HR_multiscale/2T1.png
trainset_HR_multiscale/2T2.png
trainset_HR_multiscale/2T3.png
trainset_HR_multiscale/3T0.png
trainset_HR_multiscale/3T1.png
trainset_HR_multiscale/3T2.png
trainset_HR_multiscale/3T3.png
trainset_HR_multiscale/4T0.png
trainset_HR_multiscale/4T1.png
trainset_HR_multiscale/4T2.png
trainset_HR_multiscale/4T3.png
trainset_HR_multiscale_sub/1_s001.png
trainset_HR_multiscale_sub/1_s002.png
trainset_HR_multiscale_sub/1_s003.png
trainset_HR_multiscale_sub/1_s004.png
trainset_HR_multiscale_sub/1_s005.png
trainset_HR_multiscale_sub/1_s006.png
trainset_HR_multiscale_sub/1_s007.png
trainset_HR_multiscale_sub/1_s008.png
trainset_HR_multiscale_sub/1_s009.png
trainset_HR_multiscale_sub/1_s010.png
trainset_HR_multiscale_sub/1_s011.png
trainset_HR_multiscale_sub/1_s012.png
trainset_HR_multiscale_sub/1_s013.png
trainset_HR_multiscale_sub/1_s014.png
trainset_HR_multiscale_sub/1_s015.png
trainset_HR_multiscale_sub/1_s016.png
trainset_HR_multiscale_sub/1_s017.png
trainset_HR_multiscale_sub/1_s018.png
trainset_HR_multiscale_sub/1_s019.png
trainset_HR_multiscale_sub/1_s020.png
trainset_HR_multiscale_sub/1_s021.png
trainset_HR_multiscale_sub/1_s022.png
trainset_HR_multiscale_sub/1_s023.png
trainset_HR_multiscale_sub/1_s024.png
trainset_HR_multiscale_sub/1_s025.png
trainset_HR_multiscale_sub/2_s001.png
trainset_HR_multiscale_sub/2_s002.png
trainset_HR_multiscale_sub/2_s003.png
trainset_HR_multiscale_sub/2_s004.png
trainset_HR_multiscale_sub/2_s005.png
trainset_HR_multiscale_sub/2_s006.png
trainset_HR_multiscale_sub/2_s007.png
trainset_HR_multiscale_sub/2_s008.png
trainset_HR_multiscale_sub/2_s009.png
trainset_HR_multiscale_sub/2_s010.png
trainset_HR_multiscale_sub/2_s011.png
trainset_HR_multiscale_sub/2_s012.png
trainset_HR_multiscale_sub/2_s013.png
trainset_HR_multiscale_sub/2_s014.png
trainset_HR_multiscale_sub/2_s015.png
trainset_HR_multiscale_sub/2_s016.png
trainset_HR_multiscale_sub/2_s017.png
trainset_HR_multiscale_sub/2_s018.png
trainset_HR_multiscale_sub/2_s019.png
trainset_HR_multiscale_sub/2_s020.png
trainset_HR_multiscale_sub/2_s021.png
trainset_HR_multiscale_sub/2_s022.png
trainset_HR_multiscale_sub/2_s023.png
trainset_HR_multiscale_sub/2_s024.png
trainset_HR_multiscale_sub/2_s025.png
trainset_HR_multiscale_sub/3_s001.png
trainset_HR_multiscale_sub/3_s002.png
trainset_HR_multiscale_sub/3_s003.png
trainset_HR_multiscale_sub/3_s004.png
trainset_HR_multiscale_sub/3_s005.png
trainset_HR_multiscale_sub/3_s006.png
trainset_HR_multiscale_sub/3_s007.png
trainset_HR_multiscale_sub/3_s008.png
trainset_HR_multiscale_sub/3_s009.png
trainset_HR_multiscale_sub/3_s010.png
trainset_HR_multiscale_sub/3_s011.png
trainset_HR_multiscale_sub/3_s012.png
trainset_HR_multiscale_sub/3_s013.png
trainset_HR_multiscale_sub/3_s014.png
trainset_HR_multiscale_sub/3_s015.png
trainset_HR_multiscale_sub/3_s016.png
trainset_HR_multiscale_sub/3_s017.png
trainset_HR_multiscale_sub/3_s018.png
trainset_HR_multiscale_sub/3_s019.png
trainset_HR_multiscale_sub/3_s020.png
trainset_HR_multiscale_sub/3_s021.png
trainset_HR_multiscale_sub/3_s022.png
trainset_HR_multiscale_sub/3_s023.png
trainset_HR_multiscale_sub/3_s024.png
trainset_HR_multiscale_sub/3_s025.png
trainset_HR_multiscale_sub/4_s001.png
trainset_HR_multiscale_sub/4_s002.png
trainset_HR_multiscale_sub/4_s003.png
trainset_HR_multiscale_sub/4_s004.png
trainset_HR_multiscale_sub/4_s005.png
trainset_HR_multiscale_sub/4_s006.png
trainset_HR_multiscale_sub/4_s007.png
trainset_HR_multiscale_sub/4_s008.png
trainset_HR_multiscale_sub/4_s009.png
trainset_HR_multiscale_sub/4_s010.png
trainset_HR_multiscale_sub/4_s011.png
trainset_HR_multiscale_sub/4_s012.png
trainset_HR_multiscale_sub/4_s013.png
trainset_HR_multiscale_sub/4_s014.png
trainset_HR_multiscale_sub/4_s015.png
trainset_HR_multiscale_sub/4_s016.png
trainset_HR_multiscale_sub/4_s017.png
trainset_HR_multiscale_sub/4_s018.png
trainset_HR_multiscale_sub/4_s019.png
trainset_HR_multiscale_sub/4_s020.png
trainset_HR_multiscale_sub/4_s021.png
trainset_HR_multiscale_sub/4_s022.png
trainset_HR_multiscale_sub/4_s023.png
trainset_HR_multiscale_sub/4_s024.png
trainset_HR_multiscale_sub/4_s025.png
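The meta-info file is just a text file with one image path per line, relative to the dataset root. A minimal sketch that writes the same kind of list with the standard library follows; `build_meta_info` and `IMAGE_EXTS` are names I made up, and unlike the project script this version does not open the images to check that they are readable.

```python
import glob
import os

IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".bmp"}


def build_meta_info(folders, root, meta_path):
    """Write image paths under `folders`, relative to `root`, one per line."""
    lines = []
    for folder in folders:
        for path in sorted(glob.glob(os.path.join(folder, "*"))):
            if os.path.splitext(path)[1].lower() in IMAGE_EXTS:
                # Forward slashes keep the file identical on Windows and Linux.
                lines.append(os.path.relpath(path, root).replace(os.sep, "/"))
    os.makedirs(os.path.dirname(meta_path) or ".", exist_ok=True)
    with open(meta_path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")
    return lines
```

For the layout used in this post, `root` would be `datasets/train_mydatas` and `folders` the three image directories under it. The number of lines written is the number of training images the data loader will see.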

Download the Pre-Trained Models
Download the pre-trained models into the experiments/pretrained_models directory.
- RealESRNet_x4plus.pth:
wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.1/RealESRNet_x4plus.pth -P experiments/pretrained_models
- RealESRGAN_x4plus.pth:
wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.0/RealESRGAN_x4plus.pth -P experiments/pretrained_models
- RealESRGAN_x4plus_netD.pth:
wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.3/RealESRGAN_x4plus_netD.pth -P experiments/pretrained_models
- RealESRGAN_x2plus.pth:
wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.1/RealESRGAN_x2plus.pth -P experiments/pretrained_models
- RealESRGAN_x2plus_netD.pth:
wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.3/RealESRGAN_x2plus_netD.pth -P experiments/pretrained_models
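On systems without wget (a default Windows install, for example), the same files can be fetched with Python's standard library. This is a sketch: the URLs and release tags are the ones in the commands above, while the helper names `weight_url` and `download_weights` are my own.

```python
import os
import urllib.request

BASE = "https://github.com/xinntao/Real-ESRGAN/releases/download"
# Release tag for each weight file, copied from the wget commands above.
WEIGHTS = {
    "RealESRNet_x4plus.pth": "v0.1.1",
    "RealESRGAN_x4plus.pth": "v0.1.0",
    "RealESRGAN_x4plus_netD.pth": "v0.2.2.3",
    "RealESRGAN_x2plus.pth": "v0.2.1",
    "RealESRGAN_x2plus_netD.pth": "v0.2.2.3",
}


def weight_url(filename):
    """Build the release URL for one weight file."""
    return f"{BASE}/{WEIGHTS[filename]}/{filename}"


def download_weights(filenames, dest_dir="experiments/pretrained_models"):
    """Download the given weight files, skipping any that already exist."""
    os.makedirs(dest_dir, exist_ok=True)
    for filename in filenames:
        target = os.path.join(dest_dir, filename)
        if os.path.exists(target):
            continue  # already downloaded
        urllib.request.urlretrieve(weight_url(filename), target)


# Example (needs network access):
# download_weights(["RealESRNet_x4plus.pth", "RealESRGAN_x4plus_netD.pth"])
```

The x4 fine-tuning config in this post references only RealESRNet_x4plus.pth and RealESRGAN_x4plus_netD.pth; the x2 files matter only if you fine-tune the x2 model.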
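Configuration File
Prepare a finetune_realesrgan_x4plus.yml configuration file with the following content. The training command later in this post expects it under datasets/train_mydatas/.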
# general settings
name: finetune_RealESRGANx4plus_10k_lr5e-5_train_mydatas
model_type: RealESRGANModel
scale: 4
num_gpu: auto
manual_seed: 0
# ----------------- options for synthesizing training data in RealESRGANModel ----------------- #
# USM the ground-truth
l1_gt_usm: True
percep_gt_usm: True
gan_gt_usm: False
# the first degradation process
resize_prob: [0.2, 0.7, 0.1]  # up, down, keep
resize_range: [0.15, 1.5]
gaussian_noise_prob: 0.5
noise_range: [1, 30]
poisson_scale_range: [0.05, 3]
gray_noise_prob: 0.4
jpeg_range: [30, 95]
# the second degradation process
second_blur_prob: 0.8
resize_prob2: [0.3, 0.4, 0.3]  # up, down, keep
resize_range2: [0.3, 1.2]
gaussian_noise_prob2: 0.5
noise_range2: [1, 25]
poisson_scale_range2: [0.05, 2.5]
gray_noise_prob2: 0.4
jpeg_range2: [30, 95]
gt_size: 256
queue_size: 180
# dataset and data loader settings
datasets:
  train:
    # name: DF2K+OST
    # type: RealESRGANDataset
    # dataroot_gt: datasets/DF2K
    # meta_info: datasets/DF2K/meta_info/meta_info_DF2Kmultiscale+OST_sub.txt
    # io_backend:
    #   type: disk
    name: train_mydatas
    type: RealESRGANDataset
    dataroot_gt: datasets/train_mydatas
    meta_info: datasets/train_mydatas/meta_info/meta_info.txt
    io_backend:
      type: disk
    blur_kernel_size: 21
    kernel_list: ['iso', 'aniso', 'generalized_iso', 'generalized_aniso', 'plateau_iso', 'plateau_aniso']
    kernel_prob: [0.45, 0.25, 0.12, 0.03, 0.12, 0.03]
    sinc_prob: 0.1
    blur_sigma: [0.2, 3]
    betag_range: [0.5, 4]
    betap_range: [1, 2]
    blur_kernel_size2: 21
    kernel_list2: ['iso', 'aniso', 'generalized_iso', 'generalized_aniso', 'plateau_iso', 'plateau_aniso']
    kernel_prob2: [0.45, 0.25, 0.12, 0.03, 0.12, 0.03]
    sinc_prob2: 0.1
    blur_sigma2: [0.2, 1.5]
    betag_range2: [0.5, 4]
    betap_range2: [1, 2]
    final_sinc_prob: 0.8
    gt_size: 256
    use_hflip: True
    use_rot: False
    # data loader
    use_shuffle: true
    num_worker_per_gpu: 5
    batch_size_per_gpu: 6
    # num_worker_per_gpu: 0
    # batch_size_per_gpu: 2
    dataset_enlarge_ratio: 1
    prefetch_mode: ~
  # Uncomment these for validation
  # val:
  #   name: validation
  #   type: PairedImageDataset
  #   dataroot_gt: path_to_gt
  #   dataroot_lq: path_to_lq
  #   io_backend:
  #     type: disk
# network structures
network_g:
  type: RRDBNet
  num_in_ch: 3
  num_out_ch: 3
  num_feat: 64
  num_block: 23
  num_grow_ch: 32
network_d:
  type: UNetDiscriminatorSN
  num_in_ch: 3
  num_feat: 64
  skip_connection: True
# path
path:
  # use the pre-trained Real-ESRNet model
  pretrain_network_g: experiments/pretrained_models/RealESRNet_x4plus.pth
  param_key_g: params_ema
  strict_load_g: true
  pretrain_network_d: experiments/pretrained_models/RealESRGAN_x4plus_netD.pth
  param_key_d: params
  strict_load_d: true
  resume_state: ~
# training settings
train:
  ema_decay: 0.999
  optim_g:
    type: Adam
    # lr: !!float 1e-4
    lr: !!float 5e-5
    weight_decay: 0
    betas: [0.9, 0.99]
  optim_d:
    type: Adam
    # lr: !!float 1e-4
    lr: !!float 5e-5
    weight_decay: 0
    betas: [0.9, 0.99]
  scheduler:
    type: MultiStepLR
    milestones: [400000]
    gamma: 0.5
  # total_iter: 400000
  # total_iter: 20000
  total_iter: 10000
  warmup_iter: -1  # no warm up
  # losses
  pixel_opt:
    type: L1Loss
    loss_weight: 1.0
    reduction: mean
  # perceptual loss (content and style losses)
  perceptual_opt:
    type: PerceptualLoss
    layer_weights:
      # before relu
      'conv1_2': 0.1
      'conv2_2': 0.1
      'conv3_4': 1
      'conv4_4': 1
      'conv5_4': 1
    vgg_type: vgg19
    use_input_norm: true
    perceptual_weight: !!float 1.0
    style_weight: 0
    range_norm: false
    criterion: l1
  # gan loss
  gan_opt:
    type: GANLoss
    gan_type: vanilla
    real_label_val: 1.0
    fake_label_val: 0.0
    loss_weight: !!float 1e-1
  net_d_iters: 1
  net_d_init_iters: 0
# Uncomment these for validation
# validation settings
# val:
#   val_freq: !!float 5e3
#   save_img: True
#   metrics:
#     psnr: # metric name
#       type: calculate_psnr
#       crop_border: 4
#       test_y_channel: false
# logging settings
logger:
  print_freq: 100
  save_checkpoint_freq: !!float 5e3
  use_tb_logger: true
  wandb:
    project: ~
    resume_id: ~
# dist training settings
dist_params:
  backend: nccl
  port: 29500
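With a small dataset, it is worth checking how `total_iter` and `batch_size_per_gpu` translate into epochs. The sketch below uses the numbers from this post: 120 lines in the meta-info file (4 originals, 16 multi-scale images, 100 sub-images), batch size 6, and 10,000 iterations on one GPU. The function names are mine, and `lr_at` models a step schedule in the style of MultiStepLR rather than calling PyTorch.

```python
import math


def training_schedule(num_images, batch_size_per_gpu, total_iter,
                      num_gpu=1, dataset_enlarge_ratio=1):
    """Return (iterations per epoch, total epochs) for a given setup."""
    iters_per_epoch = math.ceil(
        num_images * dataset_enlarge_ratio / (batch_size_per_gpu * num_gpu))
    total_epochs = math.ceil(total_iter / iters_per_epoch)
    return iters_per_epoch, total_epochs


def lr_at(iteration, base_lr=5e-5, milestones=(400000,), gamma=0.5):
    """Step schedule: multiply by gamma once for every milestone already passed."""
    passed = sum(1 for m in milestones if iteration >= m)
    return base_lr * gamma ** passed


print(training_schedule(num_images=120, batch_size_per_gpu=6, total_iter=10000))
print(lr_at(10000))
```

This gives 20 iterations per epoch and 500 epochs, the same figures the training log reports. Because the only milestone (400000) lies far beyond `total_iter`, the learning rate stays at 5e-5 for the whole run. Lower the milestone if you want a decay within 10,000 iterations.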
Run Training
python realesrgan/train.py -opt datasets/train_mydatas/finetune_realesrgan_x4plus.yml --auto_resume
Version Information:
BasicSR: 1.4.2
PyTorch: 1.9.1+cu111
TorchVision: 0.10.1+cu111
INFO: Dataset [RealESRGANDataset] - train_mydatas is built.
INFO: Training statistics:
Number of train images: 120
Dataset enlarge ratio: 1
Batch size per gpu: 6
World size (gpu number): 1
Require iter number per epoch: 20
Total epochs: 500; iters: 10000.
INFO: Network [RRDBNet] is created.
INFO: Network: RRDBNet, with parameters: 16,703,171
INFO: RRDBNet(
(conv_first): Conv2d(12, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(body): Sequential(
(0): RRDB(
(rdb1): ResidualDenseBlock(
(conv1): Conv2d(64, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(conv2): Conv2d(96, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(conv3): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(conv4): Conv2d(160, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(conv5): Conv2d(192, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(lrelu): LeakyReLU(negative_slope=0.2, inplace=True)
)
...
(rdb3): ResidualDenseBlock(
(conv1): Conv2d(64, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(conv2): Conv2d(96, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(conv3): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(conv4): Conv2d(160, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(conv5): Conv2d(192, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(lrelu): LeakyReLU(negative_slope=0.2, inplace=True)
)
)
...
(22): RRDB(
(rdb1): ResidualDenseBlock(
(conv1): Conv2d(64, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(conv2): Conv2d(96, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(conv3): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(conv4): Conv2d(160, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(conv5): Conv2d(192, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(lrelu): LeakyReLU(negative_slope=0.2, inplace=True)
)
...
(rdb3): ResidualDenseBlock(
(conv1): Conv2d(64, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(conv2): Conv2d(96, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(conv3): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(conv4): Conv2d(160, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(conv5): Conv2d(192, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(lrelu): LeakyReLU(negative_slope=0.2, inplace=True)
)
)
)
(conv_body): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(conv_up1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(conv_up2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(conv_hr): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(conv_last): Conv2d(64, 3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(lrelu): LeakyReLU(negative_slope=0.2, inplace=True)
)
INFO: Loading RRDBNet model from experiments/pretrained_models/RealESRGAN_x4plus.pth, with param key: [params_ema].
INFO: Use Exponential Moving Average with decay: 0.999
INFO: Network [RRDBNet] is created.
INFO: Loading RRDBNet model from experiments/pretrained_models/RealESRNet_x4plus.pth, with param key: [params_ema].
INFO: Network [UNetDiscriminatorSN] is created.
INFO: Network: UNetDiscriminatorSN, with parameters: 4,376,897
INFO: UNetDiscriminatorSN(
(conv0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(conv1): Conv2d(64, 128, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(conv2): Conv2d(128, 256, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(conv3): Conv2d(256, 512, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(conv4): Conv2d(512, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(conv5): Conv2d(256, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(conv6): Conv2d(128, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(conv7): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(conv8): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(conv9): Conv2d(64, 1, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
)
INFO: Loading UNetDiscriminatorSN model from experiments/pretrained_models/RealESRGAN_x4plus_netD.pth, with param key: [params].
INFO: Loss [L1Loss] is created.
INFO: Loss [PerceptualLoss] is created.
INFO: Loss [GANLoss] is created.
INFO: Model [RealESRGANModel] is created.
INFO: Start training from epoch: 0, iter: 0
INFO: [finet..][epoch: 4, iter: 100, lr:(5.000e-05,)] [eta: 1:24:45, time (data): 0.545 (0.041)] l_g_pix: 3.9798e-02 l_g_percep: 1.0801e+01 l_g_gan: 1.6555e-01 l_d_real: 4.4597e-01 out_d_real: 1.8479e+00 l_d_fake: 4.4126e-01 out_d_fake: -1.2139e+00
...
INFO: [finet..][epoch:249, iter: 5,000, lr:(5.000e-05,)] [eta: 0:47:01, time (data): 0.586 (0.038)] l_g_pix: 4.3422e-02 l_g_percep: 9.5577e+00 l_g_gan: 2.2636e-01 l_d_real: 3.3816e-01 out_d_real: 2.1100e+00 l_d_fake: 4.0207e-01 out_d_fake: -1.8612e+00
INFO: Saving models and training states.
INFO: [finet..][epoch:254, iter: 5,100, lr:(5.000e-05,)] [eta: 0:46:06, time (data): 0.564 (0.033)] l_g_pix: 3.9790e-02 l_g_percep: 9.2760e+00 l_g_gan: 2.4556e-01 l_d_real: 4.6562e-01 out_d_real: 1.5839e+00 l_d_fake: 2.5859e-01 out_d_fake: -2.1963e+00
...
INFO: [finet..][epoch:499, iter: 10,000, lr:(5.000e-05,)] [eta: 0:00:00, time (data): 0.534 (0.041)] l_g_pix: 4.8842e-02 l_g_percep: 1.0380e+01 l_g_gan: 3.5410e-01 l_d_real: 1.2939e-01 out_d_real: 4.0833e+00 l_d_fake: 2.1729e-01 out_d_fake: -3.3225e+00
INFO: Saving models and training states.
INFO: End of training. Time consumed: 1:29:29
INFO: Save the latest model.
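The parameter totals in the log can be reproduced by hand from the printed layer shapes, which is a quick way to confirm that the network you built is the one you intended. This sketch takes the shapes exactly as printed, including the 12-channel conv_first, and assumes each RRDB holds three dense blocks (the log elides rdb2). The helper names are hypothetical.

```python
def conv_params(c_in, c_out, kernel, bias=True):
    """Weights (+ biases) of one Conv2d with a square kernel."""
    return c_in * c_out * kernel * kernel + (c_out if bias else 0)


def rrdbnet_params(first_in_ch=12, num_feat=64, num_block=23,
                   num_grow_ch=32, num_out_ch=3):
    """Parameter count of the generator from its layer shapes."""
    # One ResidualDenseBlock: conv1..conv4 grow the features, conv5 fuses them.
    rdb = sum(conv_params(num_feat + i * num_grow_ch, num_grow_ch, 3)
              for i in range(4))
    rdb += conv_params(num_feat + 4 * num_grow_ch, num_feat, 3)
    body = num_block * 3 * rdb  # assumed: three dense blocks per RRDB
    head = conv_params(first_in_ch, num_feat, 3)  # conv_first
    # conv_body, conv_up1, conv_up2, conv_hr
    tail = 4 * conv_params(num_feat, num_feat, 3)
    tail += conv_params(num_feat, num_out_ch, 3)  # conv_last
    return head + body + tail


def unet_discriminator_params():
    """Parameter count of the discriminator from its layer shapes."""
    # (in, out, kernel, bias) for conv0..conv9, read off the log.
    layers = [
        (3, 64, 3, True), (64, 128, 4, False), (128, 256, 4, False),
        (256, 512, 4, False), (512, 256, 3, False), (256, 128, 3, False),
        (128, 64, 3, False), (64, 64, 3, False), (64, 64, 3, False),
        (64, 1, 3, True),
    ]
    return sum(conv_params(*layer) for layer in layers)


print(f"{rrdbnet_params():,}")             # 16,703,171
print(f"{unet_discriminator_params():,}")  # 4,376,897
```

Both totals match the log. With `first_in_ch=3` the generator total would be 16,697,987, so the logged figure goes with the 12-channel conv_first that is printed. As far as I know, a 12-channel first convolution is what the RRDBNet produces on its pixel-unshuffle path for scale 2, so this excerpt may come from an x2 run; treat that as something to verify against your own log rather than a conclusion.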
Training Results


Run Super-Resolution
python inference_realesrgan.py --input datasets/train_mydatas/test_imgs --model_name RealESRGAN_x4plus --output datasets/train_mydatas/test_imgs_x4_res --model_path experiments/finetune_RealESRGANx4plus_10k_lr5e-5_train_mydatas/models/net_g_latest.pth --outscale 4
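The command above uses net_g_latest.pth. The log shows checkpoints being saved at iterations 5,000 and 10,000, so you may want to compare a numbered checkpoint as well. The sketch below picks the highest-iteration generator file in a models directory. It assumes the files are named net_g_<iter>.pth, which is my understanding of the BasicSR convention, and the function name is my own.

```python
import os
import re


def latest_generator_checkpoint(models_dir):
    """Return the net_g_<iter>.pth file with the highest iteration, or None."""
    pattern = re.compile(r"^net_g_(\d+)\.pth$")
    best_iter, best_path = -1, None
    for name in os.listdir(models_dir):
        match = pattern.match(name)
        if match and int(match.group(1)) > best_iter:
            best_iter = int(match.group(1))
            best_path = os.path.join(models_dir, name)
    return best_path
```

Pass the returned path to `--model_path` in place of net_g_latest.pth to run inference with a specific saved iteration.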
Super-Resolution Results
| LR | HR | SR |
|---|---|---|
| ![]() | ![]() | ![]() |
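Since an HR reference is available for the test images, the SR output can also be scored numerically. The commented-out validation block in the config names PSNR as its metric. Below is a minimal pure-Python sketch of PSNR over flattened 8-bit pixel values; it ignores border cropping and Y-channel conversion, so its numbers will not exactly match a library implementation that applies those options.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """PSNR in dB between two equally sized sequences of pixel values."""
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same number of pixels")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


hr = [100, 100, 100, 100]
sr = [110, 110, 110, 110]
print(round(psnr(hr, sr), 2))
```

Higher is better. Keep in mind that GAN-trained models are tuned for perceptual quality, so a visually sharper SR image does not necessarily score a higher PSNR than a smoother one.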
References
[1] Xintao Wang, Liangbin Xie, Chao Dong, Ying Shan. Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data. 2021
[2] https://github.com/xinntao/Real-ESRGAN.git