Image Super-Resolution: Fine-Tuning Real-ESRGAN-x4plus on Your Own Dataset


Low-resolution test image ---------------------------------------------> Super-resolved reconstruction
[Figures: low-resolution test images and their super-resolved reconstructions]
Preface

Environment Requirements

Package                 Version         Editable project location
----------------------- --------------- -------------------------------------------
python                  3.8.13
absl-py                 2.1.0
addict                  2.4.0
autocommand             2.2.2
backports.tarfile       1.2.0
basicsr                 1.4.2
cachetools              5.5.1
certifi                 2025.1.31
charset-normalizer      3.4.1
contourpy               1.1.1
cycler                  0.12.1
facexlib                0.3.0
filterpy                1.4.5
flatbuffers             25.2.10
fonttools               4.56.0
future                  1.0.0
gfpgan                  1.3.8
google-auth             2.38.0
google-auth-oauthlib    1.0.0
grpcio                  1.70.0
idna                    3.10
imageio                 2.35.1
importlib_metadata      8.5.0
importlib_resources     6.4.5
inflect                 7.3.1
jaraco.collections      5.1.0
jaraco.context          5.3.0
jaraco.functools        4.0.1
jaraco.text             3.12.1
kiwisolver              1.4.7
lazy_loader             0.4
llvmlite                0.41.1
lmdb                    1.6.2
Markdown                3.7
MarkupSafe              2.1.5
matplotlib              3.7.5
more-itertools          10.3.0
networkx                3.1
numba                   0.58.1
numpy                   1.24.4
oauthlib                3.2.2
onnxruntime-gpu         1.11.0
opencv-python           4.10.0.84
packaging               24.2
pillow                  10.4.0
pip                     24.2
platformdirs            4.3.6
protobuf                5.29.3
pyasn1                  0.6.1
pyasn1_modules          0.4.1
pyparsing               3.1.4
python-dateutil         2.9.0.post0
PyWavelets              1.4.1
PyYAML                  6.0.2
realesrgan              0.3.0          
requests                2.32.3
requests-oauthlib       2.0.0
rsa                     4.9
scikit-image            0.21.0
scipy                   1.10.1
setuptools              75.1.0
six                     1.17.0
tb-nightly              2.14.0a20230808
tensorboard-data-server 0.7.2
tifffile                2023.7.10
tomli                   2.2.1
torch                   1.9.1+cu111
torchaudio              0.9.1
torchvision             0.10.1+cu111
tqdm                    4.67.1
typeguard               4.3.0
typing_extensions       4.12.2
urllib3                 2.2.3
Werkzeug                3.0.6
wheel                   0.44.0
yapf                    0.43.0
zipp                    3.20.2

Background

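  • Python is a cross-platform programming language: a high-level scripting language that combines interpreted, compiled, interactive, and object-oriented features. It was originally designed for writing automation scripts (shell); as new versions added language features, it has increasingly been used for standalone, large-scale projects.
  • PyTorch is a deep learning framework that packages many networks and deep-learning utilities so they can be called directly instead of being written from scratch. It comes in CPU and GPU builds; other frameworks include TensorFlow and Caffe. PyTorch was released by Facebook AI Research (FAIR) on the basis of Torch. It is a Python-based scientific computing package with two high-level features: (1) tensor computation with strong GPU acceleration (similar to NumPy); (2) automatic differentiation for building deep neural networks.
  • Real-ESRGAN (a practical extension of ESRGAN, the Enhanced Super-Resolution Generative Adversarial Network) is an image super-resolution method designed for the complex and varied degradations found in real-world images.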
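  • Key features
    • Improved GAN-based model: builds on ESRGAN and uses adversarial training to reconstruct high-quality images
    • Trained on purely synthetic data: training pairs are synthesized by simulating the degradations real images may suffer, such as blur, noise, and compression artifacts
    • 4x upscaling: increases the resolution of a low-resolution image by a factor of 4 and noticeably improves image quality
    • Network design: the generator is an RRDBNet built from residual-in-residual dense blocks, and the discriminator is a U-Net with spectral normalization (both appear in the configuration file later in this article)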
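  • Technical advantages: compared with traditional super-resolution techniques, Real-ESRGAN
    • handles the complex degradations of real-world images effectively
    • produces high-resolution images that are high in quality and rich in detail
    • restores fine details such as facial texture and architectural lines
    • works on many image types, including photos and animation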
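  • Application scenarios
    • Image restoration: repairing old photos, removing scratches and noise
    • Image enhancement: raising the resolution and quality of low-quality pictures
    • Video restoration: building video restoration systems on top of image restoration
    • Professional image processing: design, film and television production, and similar professional needs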
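  • Differences from traditional techniques
    Unlike traditional super-resolution methods, Real-ESRGAN performs "blind super-resolution": it handles unknown degradations rather than one specific degradation type. It simulates complex real-world degradation during training (the configuration file later in this article applies a first and a second degradation process in sequence), so the model restores detail more accurately when it meets genuinely damaged images.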
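  • Strengths
    • High-quality output: compared with earlier models such as ESRGAN, Real-ESRGAN produces sharper, more detailed images and better preserves the texture and structure of the original.
    • Handling of real-world degradation: Real-ESRGAN is an improved version of ESRGAN that is optimized for the complex degradations of real-world images and copes well with blur, noise, and compression artifacts.
    • 4x upscaling: Real-ESRGAN provides the RealESRGAN_x4plus.pth model for 4x super-resolution.
    • Broad applicability: some upscaling tools ship Real-ESRGAN as a built-in super-resolution algorithm alongside Waifu2x and SRMD, and it suits both anime-style images and everyday photos and videos.
    • Open-source community: Real-ESRGAN is an open-source project with an active community.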
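  • Limitations
    • High compute requirements: compared with other super-resolution algorithms, Real-ESRGAN needs substantial compute, especially for high-resolution images.
    • Slow processing: it is slower than lightweight models, especially on video.
    • Limited scale factors: Real-ESRGAN mainly provides 4x upscaling (a 2x model is also released); larger factors are not available without loss of quality.
    • Limited results on extreme degradation: for severely blurred or very low signal-to-noise inputs, results may fall short of expectations.
    • Not a cure-all: Real-ESRGAN focuses on super-resolution and cannot fix missing content or structural errors in an image; those cases need other restoration techniques in combination.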
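  • Directions for improvement
    • Faster processing: model compression, quantization, and similar techniques could raise throughput toward real-time use, for example in video processing.
    • Larger scale factors: study how to reach 8x or even 16x while preserving image quality.
    • Lower compute requirements: develop lightweight variants that run efficiently on mobile devices.
    • Better handling of specific degradations: optimize further for particular degradation types such as motion blur and lens distortion by simulating them during training.
    • Multi-task learning: combine super-resolution with other tasks (denoising, deblurring, color enhancement) for an all-in-one quality improvement.
    • Better training data: use more diverse training data and degradation simulation to improve generalization across degradation types.
    • Architecture improvements: for example, a reverse-estimation procedure that better fuses the low-resolution and high-resolution forward processes could further improve results.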
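  • Real-ESRGAN is a widely used, practical super-resolution method that provides a strong tool for image restoration, enhancement, and reconstruction. It is applied in personal image processing, professional film and television production, the digitization of cultural relics, and other fields, particularly video restoration and old-photo restoration.
  • Official source code: https://github.com/xinntao/Real-ESRGAN.git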
  • Xintao Wang, Liangbin Xie, Chao Dong, Ying Shan. Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data. 2021

Fine-Tuning the Real-ESRGAN-x4plus Model

Download the Real-ESRGAN Project

Windows

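After downloading and extracting the archive, the project directory looks as shown in the figure.
[Figure: project directory after extraction]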

Linux

git clone https://github.com/xinntao/Real-ESRGAN.git

Prepare the Dataset

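[Figure: dataset directory. The commands in this article read the high-resolution training images from datasets/train_mydatas/trainset_HR (1.png to 4.png here) and the test images from datasets/train_mydatas/test_imgs.]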
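
Before running the preparation scripts, it can help to confirm that the training images are where the commands expect them. The following is a minimal sketch, assuming the `datasets/train_mydatas/trainset_HR` layout used by the commands in this article; the helper name `list_hr_images` and the extension set are illustrative and not part of Real-ESRGAN.

```python
from pathlib import Path

# Extensions treated as images here; this set is an assumption of the sketch.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".bmp"}


def list_hr_images(dataset_root):
    """Return the sorted image file names found in <dataset_root>/trainset_HR."""
    hr_dir = Path(dataset_root) / "trainset_HR"
    if not hr_dir.is_dir():
        raise FileNotFoundError(f"missing folder: {hr_dir}")
    return sorted(
        p.name
        for p in hr_dir.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTS
    )
```

Calling `list_hr_images("datasets/train_mydatas")` from the project root lists the files that the later steps will pick up.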

[Optional] Generate Multi-Scale Images

python scripts/generate_multiscale_DF2K.py --input datasets/train_mydatas/trainset_HR --output datasets/train_mydatas/trainset_HR_multiscale
datasets/train_mydatas/trainset_HR/1.png
        0.75
        0.50
        0.33
datasets/train_mydatas/trainset_HR/2.png
        0.75
        0.50
        0.33
datasets/train_mydatas/trainset_HR/3.png
        0.75
        0.50
        0.33
datasets/train_mydatas/trainset_HR/4.png
        0.75
        0.50
        0.33
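
The output above shows three scale factors per image (0.75, 0.50, 0.33), and the meta information later lists four files per image (`T0` to `T3`). The sketch below computes the sizes such a step would produce. It is an illustration only: the fourth variant, resized so that its shortest edge is 400 pixels, is my understanding of the upstream script and should be checked against your copy, and the function name `multiscale_sizes` is hypothetical.

```python
def multiscale_sizes(width, height, scales=(0.75, 0.5, 1 / 3), shortest_edge=400):
    """Return {suffix: (w, h)} for the downscaled copies of one image."""
    sizes = {}
    for idx, scale in enumerate(scales):
        sizes[f"T{idx}"] = (int(width * scale), int(height * scale))

    # Assumed extra variant: keep the aspect ratio, shortest edge = shortest_edge.
    if width < height:
        new_w = shortest_edge
        new_h = int(new_w * height / width)
    else:
        new_h = shortest_edge
        new_w = int(new_h * width / height)
    sizes[f"T{len(scales)}"] = (new_w, new_h)
    return sizes
```

For a hypothetical 1200x800 input, `multiscale_sizes(1200, 800)` returns four entries keyed `T0` to `T3`, matching the four files per image in the meta information.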


[Optional] Crop into Sub-Images

python scripts/extract_subimages.py --input datasets/train_mydatas/trainset_HR --output datasets/train_mydatas/trainset_HR_multiscale_sub --crop_size 400 --step 200
mkdir datasets/train_mydatas/trainset_HR_multiscale_sub ...
Extract: 100%|█████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00,  5.47image/s]
All processes done.
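
With `--crop_size 400 --step 200`, each image is cut by a sliding window, and the meta information later lists 25 sub-images per image (`_s001` to `_s025`), which is consistent with a 5x5 grid. The sketch below shows one way such window offsets can be computed. The source image sizes are not stated in this article, so the 1200-pixel example is hypothetical, and the helper names and the `thresh_size` handling are assumptions of this sketch rather than a copy of the upstream script.

```python
def crop_origins(length, crop_size=400, step=200, thresh_size=0):
    """Top-left offsets of the sliding windows along one image axis."""
    origins = list(range(0, length - crop_size + 1, step))
    # Add a final window flush with the border when uncovered pixels remain.
    if origins and length - (origins[-1] + crop_size) > thresh_size:
        origins.append(length - crop_size)
    return origins


def count_subimages(width, height, crop_size=400, step=200, thresh_size=0):
    """Number of sub-images produced for one image."""
    cols = crop_origins(width, crop_size, step, thresh_size)
    rows = crop_origins(height, crop_size, step, thresh_size)
    return len(cols) * len(rows)


print(crop_origins(1200))           # offsets along one axis
print(count_subimages(1200, 1200))  # sub-images for a 1200x1200 image
```

Because the step is half the crop size, neighbouring sub-images overlap by 200 pixels, which yields more training patches from a small dataset.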


Prepare the Meta Information

python scripts/generate_meta_info.py --input datasets/train_mydatas/trainset_HR datasets/train_mydatas/trainset_HR_multiscale datasets/train_mydatas/trainset_HR_multiscale_sub --root datasets/train_mydatas datasets/train_mydatas datasets/train_mydatas --meta_info datasets/train_mydatas/meta_info/meta_info.txt
trainset_HR/1.png
trainset_HR/2.png
trainset_HR/3.png
trainset_HR/4.png
trainset_HR_multiscale/1T0.png
trainset_HR_multiscale/1T1.png
trainset_HR_multiscale/1T2.png
trainset_HR_multiscale/1T3.png
trainset_HR_multiscale/2T0.png
trainset_HR_multiscale/2T1.png
trainset_HR_multiscale/2T2.png
trainset_HR_multiscale/2T3.png
trainset_HR_multiscale/3T0.png
trainset_HR_multiscale/3T1.png
trainset_HR_multiscale/3T2.png
trainset_HR_multiscale/3T3.png
trainset_HR_multiscale/4T0.png
trainset_HR_multiscale/4T1.png
trainset_HR_multiscale/4T2.png
trainset_HR_multiscale/4T3.png
trainset_HR_multiscale_sub/1_s001.png
trainset_HR_multiscale_sub/1_s002.png
trainset_HR_multiscale_sub/1_s003.png
trainset_HR_multiscale_sub/1_s004.png
trainset_HR_multiscale_sub/1_s005.png
trainset_HR_multiscale_sub/1_s006.png
trainset_HR_multiscale_sub/1_s007.png
trainset_HR_multiscale_sub/1_s008.png
trainset_HR_multiscale_sub/1_s009.png
trainset_HR_multiscale_sub/1_s010.png
trainset_HR_multiscale_sub/1_s011.png
trainset_HR_multiscale_sub/1_s012.png
trainset_HR_multiscale_sub/1_s013.png
trainset_HR_multiscale_sub/1_s014.png
trainset_HR_multiscale_sub/1_s015.png
trainset_HR_multiscale_sub/1_s016.png
trainset_HR_multiscale_sub/1_s017.png
trainset_HR_multiscale_sub/1_s018.png
trainset_HR_multiscale_sub/1_s019.png
trainset_HR_multiscale_sub/1_s020.png
trainset_HR_multiscale_sub/1_s021.png
trainset_HR_multiscale_sub/1_s022.png
trainset_HR_multiscale_sub/1_s023.png
trainset_HR_multiscale_sub/1_s024.png
trainset_HR_multiscale_sub/1_s025.png
trainset_HR_multiscale_sub/2_s001.png
trainset_HR_multiscale_sub/2_s002.png
trainset_HR_multiscale_sub/2_s003.png
trainset_HR_multiscale_sub/2_s004.png
trainset_HR_multiscale_sub/2_s005.png
trainset_HR_multiscale_sub/2_s006.png
trainset_HR_multiscale_sub/2_s007.png
trainset_HR_multiscale_sub/2_s008.png
trainset_HR_multiscale_sub/2_s009.png
trainset_HR_multiscale_sub/2_s010.png
trainset_HR_multiscale_sub/2_s011.png
trainset_HR_multiscale_sub/2_s012.png
trainset_HR_multiscale_sub/2_s013.png
trainset_HR_multiscale_sub/2_s014.png
trainset_HR_multiscale_sub/2_s015.png
trainset_HR_multiscale_sub/2_s016.png
trainset_HR_multiscale_sub/2_s017.png
trainset_HR_multiscale_sub/2_s018.png
trainset_HR_multiscale_sub/2_s019.png
trainset_HR_multiscale_sub/2_s020.png
trainset_HR_multiscale_sub/2_s021.png
trainset_HR_multiscale_sub/2_s022.png
trainset_HR_multiscale_sub/2_s023.png
trainset_HR_multiscale_sub/2_s024.png
trainset_HR_multiscale_sub/2_s025.png
trainset_HR_multiscale_sub/3_s001.png
trainset_HR_multiscale_sub/3_s002.png
trainset_HR_multiscale_sub/3_s003.png
trainset_HR_multiscale_sub/3_s004.png
trainset_HR_multiscale_sub/3_s005.png
trainset_HR_multiscale_sub/3_s006.png
trainset_HR_multiscale_sub/3_s007.png
trainset_HR_multiscale_sub/3_s008.png
trainset_HR_multiscale_sub/3_s009.png
trainset_HR_multiscale_sub/3_s010.png
trainset_HR_multiscale_sub/3_s011.png
trainset_HR_multiscale_sub/3_s012.png
trainset_HR_multiscale_sub/3_s013.png
trainset_HR_multiscale_sub/3_s014.png
trainset_HR_multiscale_sub/3_s015.png
trainset_HR_multiscale_sub/3_s016.png
trainset_HR_multiscale_sub/3_s017.png
trainset_HR_multiscale_sub/3_s018.png
trainset_HR_multiscale_sub/3_s019.png
trainset_HR_multiscale_sub/3_s020.png
trainset_HR_multiscale_sub/3_s021.png
trainset_HR_multiscale_sub/3_s022.png
trainset_HR_multiscale_sub/3_s023.png
trainset_HR_multiscale_sub/3_s024.png
trainset_HR_multiscale_sub/3_s025.png
trainset_HR_multiscale_sub/4_s001.png
trainset_HR_multiscale_sub/4_s002.png
trainset_HR_multiscale_sub/4_s003.png
trainset_HR_multiscale_sub/4_s004.png
trainset_HR_multiscale_sub/4_s005.png
trainset_HR_multiscale_sub/4_s006.png
trainset_HR_multiscale_sub/4_s007.png
trainset_HR_multiscale_sub/4_s008.png
trainset_HR_multiscale_sub/4_s009.png
trainset_HR_multiscale_sub/4_s010.png
trainset_HR_multiscale_sub/4_s011.png
trainset_HR_multiscale_sub/4_s012.png
trainset_HR_multiscale_sub/4_s013.png
trainset_HR_multiscale_sub/4_s014.png
trainset_HR_multiscale_sub/4_s015.png
trainset_HR_multiscale_sub/4_s016.png
trainset_HR_multiscale_sub/4_s017.png
trainset_HR_multiscale_sub/4_s018.png
trainset_HR_multiscale_sub/4_s019.png
trainset_HR_multiscale_sub/4_s020.png
trainset_HR_multiscale_sub/4_s021.png
trainset_HR_multiscale_sub/4_s022.png
trainset_HR_multiscale_sub/4_s023.png
trainset_HR_multiscale_sub/4_s024.png
trainset_HR_multiscale_sub/4_s025.png
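
The listing above contains 4 + 16 + 100 = 120 entries, each a path relative to `datasets/train_mydatas`. As a rough sketch of what such a file contains, the following writes the same kind of listing with the standard library only. The function name `build_meta_info` is hypothetical, and unlike the project script this sketch does not open the images to verify them.

```python
from pathlib import Path


def build_meta_info(root, folders, meta_path):
    """Write one path per line, relative to `root`, for every PNG in `folders`."""
    root = Path(root)
    lines = []
    for folder in folders:
        for img in sorted((root / folder).glob("*.png")):
            lines.append(img.relative_to(root).as_posix())

    meta_path = Path(meta_path)
    meta_path.parent.mkdir(parents=True, exist_ok=True)
    meta_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines
```

The number of returned lines should equal the "Number of train images" value that the training log reports.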


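Download the Pre-trained Models

Download the pre-trained models into the experiments/pretrained_models directory. The configuration file in the next section loads RealESRNet_x4plus.pth for the generator and RealESRGAN_x4plus_netD.pth for the discriminator.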

  • RealESRNet_x4plus.pth:
wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.1/RealESRNet_x4plus.pth -P experiments/pretrained_models
  • RealESRGAN_x4plus.pth:
wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.0/RealESRGAN_x4plus.pth -P experiments/pretrained_models
  • RealESRGAN_x4plus_netD.pth:
wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.3/RealESRGAN_x4plus_netD.pth -P experiments/pretrained_models
  • RealESRGAN_x2plus.pth:
wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.1/RealESRGAN_x2plus.pth -P experiments/pretrained_models
  • RealESRGAN_x2plus_netD.pth:
wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.3/RealESRGAN_x2plus_netD.pth -P experiments/pretrained_models
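
On systems without `wget` (for example a plain Windows shell), the same files can be fetched from Python. This is a minimal sketch using only the standard library; the URLs are copied from the commands above, only the three x4 files are included, and the names `weight_url` and `download_weights` are illustrative.

```python
import urllib.request
from pathlib import Path

BASE = "https://github.com/xinntao/Real-ESRGAN/releases/download"
# File name -> release tag, copied from the wget commands above.
WEIGHTS = {
    "RealESRNet_x4plus.pth": "v0.1.1",
    "RealESRGAN_x4plus.pth": "v0.1.0",
    "RealESRGAN_x4plus_netD.pth": "v0.2.2.3",
}


def weight_url(name):
    """Build the release URL for one weight file."""
    return f"{BASE}/{WEIGHTS[name]}/{name}"


def download_weights(target_dir="experiments/pretrained_models"):
    """Download every file in WEIGHTS that is not already present."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    for name in WEIGHTS:
        dest = target / name
        if not dest.exists():
            urllib.request.urlretrieve(weight_url(name), str(dest))
    return sorted(p.name for p in target.glob("*.pth"))
```

Run `download_weights()` from the project root so the files land in the directory the configuration file points to.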

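Configuration File

Prepare a configuration file named finetune_realesrgan_x4plus.yml with the following content. The training command in the next section expects it at datasets/train_mydatas/finetune_realesrgan_x4plus.yml.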

# general settings
name: finetune_RealESRGANx4plus_10k_lr5e-5_train_mydatas
model_type: RealESRGANModel
scale: 4
num_gpu: auto
manual_seed: 0

# ----------------- options for synthesizing training data in RealESRGANModel ----------------- #
# USM the ground-truth
l1_gt_usm: True
percep_gt_usm: True
gan_gt_usm: False

# the first degradation process
resize_prob: [0.2, 0.7, 0.1]  # up, down, keep
resize_range: [0.15, 1.5]
gaussian_noise_prob: 0.5
noise_range: [1, 30]
poisson_scale_range: [0.05, 3]
gray_noise_prob: 0.4
jpeg_range: [30, 95]

# the second degradation process
second_blur_prob: 0.8
resize_prob2: [0.3, 0.4, 0.3]  # up, down, keep
resize_range2: [0.3, 1.2]
gaussian_noise_prob2: 0.5
noise_range2: [1, 25]
poisson_scale_range2: [0.05, 2.5]
gray_noise_prob2: 0.4
jpeg_range2: [30, 95]

gt_size: 256
queue_size: 180

# dataset and data loader settings
datasets:
  train:
    # name: DF2K+OST
    # type: RealESRGANDataset
    # dataroot_gt: datasets/DF2K
    # meta_info: datasets/DF2K/meta_info/meta_info_DF2Kmultiscale+OST_sub.txt
    # io_backend:
    #   type: disk
    name: train_mydatas
    type: RealESRGANDataset
    dataroot_gt: datasets/train_mydatas
    meta_info: datasets/train_mydatas/meta_info/meta_info.txt
    io_backend:
      type: disk

    blur_kernel_size: 21
    kernel_list: ['iso', 'aniso', 'generalized_iso', 'generalized_aniso', 'plateau_iso', 'plateau_aniso']
    kernel_prob: [0.45, 0.25, 0.12, 0.03, 0.12, 0.03]
    sinc_prob: 0.1
    blur_sigma: [0.2, 3]
    betag_range: [0.5, 4]
    betap_range: [1, 2]

    blur_kernel_size2: 21
    kernel_list2: ['iso', 'aniso', 'generalized_iso', 'generalized_aniso', 'plateau_iso', 'plateau_aniso']
    kernel_prob2: [0.45, 0.25, 0.12, 0.03, 0.12, 0.03]
    sinc_prob2: 0.1
    blur_sigma2: [0.2, 1.5]
    betag_range2: [0.5, 4]
    betap_range2: [1, 2]

    final_sinc_prob: 0.8

    gt_size: 256
    use_hflip: True
    use_rot: False

    # data loader
    use_shuffle: true
    num_worker_per_gpu: 5
    batch_size_per_gpu: 6
    # num_worker_per_gpu: 0
    # batch_size_per_gpu: 2
    dataset_enlarge_ratio: 1
    prefetch_mode: ~

  # Uncomment these for validation
  # val:
  #   name: validation
  #   type: PairedImageDataset
  #   dataroot_gt: path_to_gt
  #   dataroot_lq: path_to_lq
  #   io_backend:
  #     type: disk

# network structures
network_g:
  type: RRDBNet
  num_in_ch: 3
  num_out_ch: 3
  num_feat: 64
  num_block: 23
  num_grow_ch: 32

network_d:
  type: UNetDiscriminatorSN
  num_in_ch: 3
  num_feat: 64
  skip_connection: True

# path
path:
  # use the pre-trained Real-ESRNet model
  pretrain_network_g: experiments/pretrained_models/RealESRNet_x4plus.pth
  param_key_g: params_ema
  strict_load_g: true
  pretrain_network_d: experiments/pretrained_models/RealESRGAN_x4plus_netD.pth
  param_key_d: params
  strict_load_d: true
  resume_state: ~

# training settings
train:
  ema_decay: 0.999
  optim_g:
    type: Adam
    # lr: !!float 1e-4
    lr: !!float 5e-5
    weight_decay: 0
    betas: [0.9, 0.99]
  optim_d:
    type: Adam
    # lr: !!float 1e-4
    lr: !!float 5e-5
    weight_decay: 0
    betas: [0.9, 0.99]

  scheduler:
    type: MultiStepLR
    milestones: [400000]
    gamma: 0.5

  # total_iter: 400000
  # total_iter: 20000
  total_iter: 10000
  warmup_iter: -1  # no warm up

  # losses
  pixel_opt:
    type: L1Loss
    loss_weight: 1.0
    reduction: mean
  # perceptual loss (content and style losses)
  perceptual_opt:
    type: PerceptualLoss
    layer_weights:
      # before relu
      'conv1_2': 0.1
      'conv2_2': 0.1
      'conv3_4': 1
      'conv4_4': 1
      'conv5_4': 1
    vgg_type: vgg19
    use_input_norm: true
    perceptual_weight: !!float 1.0
    style_weight: 0
    range_norm: false
    criterion: l1
  # gan loss
  gan_opt:
    type: GANLoss
    gan_type: vanilla
    real_label_val: 1.0
    fake_label_val: 0.0
    loss_weight: !!float 1e-1

  net_d_iters: 1
  net_d_init_iters: 0

# Uncomment these for validation
# validation settings
# val:
#   val_freq: !!float 5e3
#   save_img: True

#   metrics:
#     psnr: # metric name
#       type: calculate_psnr
#       crop_border: 4
#       test_y_channel: false

# logging settings
logger:
  print_freq: 100
  save_checkpoint_freq: !!float 5e3
  use_tb_logger: true
  wandb:
    project: ~
    resume_id: ~

# dist training settings
dist_params:
  backend: nccl
  port: 29500
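
To make the degradation options more concrete, the sketch below shows how a pair such as `resize_prob: [0.2, 0.7, 0.1]` and `resize_range: [0.15, 1.5]` can be turned into a random resize factor: first pick up, down, or keep with the given weights, then draw a factor from the matching part of the range. This is my reading of how these two options are used, written with the standard library only; the function name `sample_resize_scale` is hypothetical, and the real training code applies the factor to image tensors.

```python
import random


def sample_resize_scale(resize_prob, resize_range, rng=random):
    """Pick a resize mode by weight, then draw a scale factor for it."""
    mode = rng.choices(["up", "down", "keep"], weights=resize_prob)[0]
    if mode == "up":
        scale = rng.uniform(1.0, resize_range[1])
    elif mode == "down":
        scale = rng.uniform(resize_range[0], 1.0)
    else:
        scale = 1.0
    return mode, scale


rng = random.Random(0)  # fixed seed so repeated runs draw the same samples
for _ in range(3):
    print(sample_resize_scale([0.2, 0.7, 0.1], [0.15, 1.5], rng))
```

With the first-stage values from the configuration, about 70% of the draws shrink the image, which is why the synthesized low-quality inputs are mostly downscaled.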

Run Training

python realesrgan/train.py -opt datasets/train_mydatas/finetune_realesrgan_x4plus.yml --auto_resume
Version Information: 
	BasicSR: 1.4.2
	PyTorch: 1.9.1+cu111
	TorchVision: 0.10.1+cu111

INFO: Dataset [RealESRGANDataset] - train_mydatas is built.
INFO: Training statistics:
	Number of train images: 120
	Dataset enlarge ratio: 1
	Batch size per gpu: 6
	World size (gpu number): 1
	Require iter number per epoch: 20
	Total epochs: 500; iters: 10000.
INFO: Network [RRDBNet] is created.
INFO: Network: RRDBNet, with parameters: 16,703,171
INFO: RRDBNet(
  (conv_first): Conv2d(12, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (body): Sequential(
    (0): RRDB(
      (rdb1): ResidualDenseBlock(
        (conv1): Conv2d(64, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (conv2): Conv2d(96, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (conv3): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (conv4): Conv2d(160, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (conv5): Conv2d(192, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (lrelu): LeakyReLU(negative_slope=0.2, inplace=True)
      )
      ...
      (rdb3): ResidualDenseBlock(
        (conv1): Conv2d(64, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (conv2): Conv2d(96, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (conv3): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (conv4): Conv2d(160, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (conv5): Conv2d(192, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (lrelu): LeakyReLU(negative_slope=0.2, inplace=True)
      )
    )
    ...
    (22): RRDB(
      (rdb1): ResidualDenseBlock(
        (conv1): Conv2d(64, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (conv2): Conv2d(96, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (conv3): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (conv4): Conv2d(160, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (conv5): Conv2d(192, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (lrelu): LeakyReLU(negative_slope=0.2, inplace=True)
      )
      ...
      (rdb3): ResidualDenseBlock(
        (conv1): Conv2d(64, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (conv2): Conv2d(96, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (conv3): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (conv4): Conv2d(160, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (conv5): Conv2d(192, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (lrelu): LeakyReLU(negative_slope=0.2, inplace=True)
      )
    )
  )
  (conv_body): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (conv_up1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (conv_up2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (conv_hr): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (conv_last): Conv2d(64, 3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (lrelu): LeakyReLU(negative_slope=0.2, inplace=True)
)
INFO: Loading RRDBNet model from experiments/pretrained_models/RealESRGAN_x4plus.pth, with param key: [params_ema].
INFO: Use Exponential Moving Average with decay: 0.999
INFO: Network [RRDBNet] is created.
INFO: Loading RRDBNet model from experiments/pretrained_models/RealESRNet_x4plus.pth, with param key: [params_ema].
INFO: Network [UNetDiscriminatorSN] is created.
INFO: Network: UNetDiscriminatorSN, with parameters: 4,376,897
INFO: UNetDiscriminatorSN(
  (conv0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (conv1): Conv2d(64, 128, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
  (conv2): Conv2d(128, 256, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
  (conv3): Conv2d(256, 512, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
  (conv4): Conv2d(512, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
  (conv5): Conv2d(256, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
  (conv6): Conv2d(128, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
  (conv7): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
  (conv8): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
  (conv9): Conv2d(64, 1, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
)
INFO: Loading UNetDiscriminatorSN model from experiments/pretrained_models/RealESRGAN_x4plus_netD.pth, with param key: [params].
INFO: Loss [L1Loss] is created.
INFO: Loss [PerceptualLoss] is created.
INFO: Loss [GANLoss] is created.
INFO: Model [RealESRGANModel] is created.
INFO: Start training from epoch: 0, iter: 0
INFO: [finet..][epoch:  4, iter:     100, lr:(5.000e-05,)] [eta: 1:24:45, time (data): 0.545 (0.041)] l_g_pix: 3.9798e-02 l_g_percep: 1.0801e+01 l_g_gan: 1.6555e-01 l_d_real: 4.4597e-01 out_d_real: 1.8479e+00 l_d_fake: 4.4126e-01 out_d_fake: -1.2139e+00 
...
INFO: [finet..][epoch:249, iter:   5,000, lr:(5.000e-05,)] [eta: 0:47:01, time (data): 0.586 (0.038)] l_g_pix: 4.3422e-02 l_g_percep: 9.5577e+00 l_g_gan: 2.2636e-01 l_d_real: 3.3816e-01 out_d_real: 2.1100e+00 l_d_fake: 4.0207e-01 out_d_fake: -1.8612e+00 
INFO: Saving models and training states.
INFO: [finet..][epoch:254, iter:   5,100, lr:(5.000e-05,)] [eta: 0:46:06, time (data): 0.564 (0.033)] l_g_pix: 3.9790e-02 l_g_percep: 9.2760e+00 l_g_gan: 2.4556e-01 l_d_real: 4.6562e-01 out_d_real: 1.5839e+00 l_d_fake: 2.5859e-01 out_d_fake: -2.1963e+00 
...
INFO: [finet..][epoch:499, iter:  10,000, lr:(5.000e-05,)] [eta: 0:00:00, time (data): 0.534 (0.041)] l_g_pix: 4.8842e-02 l_g_percep: 1.0380e+01 l_g_gan: 3.5410e-01 l_d_real: 1.2939e-01 out_d_real: 4.0833e+00 l_d_fake: 2.1729e-01 out_d_fake: -3.3225e+00 
INFO: Saving models and training states.
INFO: End of training. Time consumed: 1:29:29
INFO: Save the latest model.
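
The training statistics in the log follow from the dataset size and the configuration: 120 images at a batch size of 6 on one GPU give 20 iterations per epoch, and 10000 iterations then span 500 epochs. A small sketch of that arithmetic, with an illustrative function name:

```python
import math


def training_schedule(num_images, batch_size_per_gpu, total_iter,
                      num_gpu=1, enlarge_ratio=1):
    """Return (iterations per epoch, total epochs) for a training run."""
    iters_per_epoch = math.ceil(
        num_images * enlarge_ratio / (batch_size_per_gpu * num_gpu)
    )
    total_epochs = math.ceil(total_iter / iters_per_epoch)
    return iters_per_epoch, total_epochs


print(training_schedule(120, 6, 10000))  # (20, 500)
```

The same function shows the effect of changing `batch_size_per_gpu` or `total_iter` before committing to a long run.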

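Training Results

The fine-tuned weights are saved under experiments/finetune_RealESRGANx4plus_10k_lr5e-5_train_mydatas/models/; the inference step below uses net_g_latest.pth from that directory.
[Figures: training result screenshots]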

Run Super-Resolution

python inference_realesrgan.py --input datasets/train_mydatas/test_imgs --model_name RealESRGAN_x4plus --output datasets/train_mydatas/test_imgs_x4_res --model_path experiments/finetune_RealESRGANx4plus_10k_lr5e-5_train_mydatas/models/net_g_latest.pth --outscale 4
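
When the inference step has to be repeated for several checkpoints or folders, the command above can be assembled from Python. This is a minimal sketch that only wraps the flags already shown in this article; the helper names are hypothetical, and it must be run from the Real-ESRGAN project root.

```python
import subprocess
import sys


def build_inference_cmd(input_dir, output_dir, model_path, outscale=4,
                        model_name="RealESRGAN_x4plus"):
    """Assemble the argument list for inference_realesrgan.py."""
    return [
        sys.executable, "inference_realesrgan.py",
        "--input", input_dir,
        "--model_name", model_name,
        "--output", output_dir,
        "--model_path", model_path,
        "--outscale", str(outscale),
    ]


def run_inference(**kwargs):
    """Run the inference script and raise if it exits with an error."""
    subprocess.run(build_inference_cmd(**kwargs), check=True)
```

For example, `run_inference(input_dir="datasets/train_mydatas/test_imgs", output_dir="datasets/train_mydatas/test_imgs_x4_res", model_path="experiments/finetune_RealESRGANx4plus_10k_lr5e-5_train_mydatas/models/net_g_latest.pth")` reproduces the command above.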

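Super-Resolution Results

[Figure table with three columns: LR (low-resolution input), HR (high-resolution reference), SR (super-resolved output)]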

References

[1] Xintao Wang, Liangbin Xie, Chao Dong, Ying Shan. Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data. 2021
[2] https://github.com/xinntao/Real-ESRGAN.git
