YOLO26 Improvement | DownSample | A Downsampling Module Based on the Discrete Wavelet Transform (DWT)

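💡💡💡 All programs in this column have been tested and run successfully 💡💡💡

This tutorial replaces the downsampling layers of YOLO26 with WaveletPool for feature extraction. After a brief look at the underlying idea, it walks step by step through adding and modifying the module code. The complete modified code is provided at the end of the article so that you can run it directly, and beginners should be able to follow along. The aim is to help you get more out of studying the YOLO family of deep-learning object detectors.

Column: YOLO26 Improvements, paper-driven accuracy gains (click through to see all posts, and follow so you don't lose track!)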

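1. Paper

2. WaveletPool code implementation

2.1 Adding WaveletPool to YOLO26

2.2 Updating the __init__.py file

2.3 Adding the yaml file

2.4 Registering the module in tasks.py

2.5 Running the program

3. Complete code

4. GFLOPs

5. Going further

6. Summary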

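1. Paper

Paper: WAVELET POOLING FOR CONVOLUTIONAL NEURAL NETWORKS
Official code: official code repository (click to open)

2. WaveletPool code implementation

2.1 Adding WaveletPool to YOLO26

Key step 1: Under ultralytics\ultralytics\nn\modules, create a new folder named models, create WaveletPool.py inside it, and paste in the code below.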

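    import numpy as np
    import torch
    import torch.nn as nn
    import torch.nn.functional as F


    class WaveletPool(nn.Module):
        """Downsample by 2 with a fixed 2x2 Haar DWT; the output has 4x the input channels."""

        def __init__(self):
            super().__init__()
            # 2x2 Haar analysis filters: one low-pass (LL) and three high-pass (LH, HL, HH)
            ll = np.array([[0.5, 0.5], [0.5, 0.5]])
            lh = np.array([[-0.5, -0.5], [0.5, 0.5]])
            hl = np.array([[-0.5, 0.5], [-0.5, 0.5]])
            hh = np.array([[0.5, -0.5], [-0.5, 0.5]])
            # Flip each filter because F.conv2d computes a cross-correlation;
            # the stacked filter bank has shape (4, 1, 2, 2)
            filts = np.stack(
                [
                    ll[None, ::-1, ::-1],
                    lh[None, ::-1, ::-1],
                    hl[None, ::-1, ::-1],
                    hh[None, ::-1, ::-1],
                ],
                axis=0,
            )
            # Fixed (non-trainable) weights, kept as a parameter so they follow .to() / .half()
            self.weight = nn.Parameter(
                torch.tensor(filts).to(torch.get_default_dtype()),
                requires_grad=False,
            )

        def forward(self, x):
            C = x.shape[1]
            # Repeat the 4 filters for every input channel: (4 * C, 1, 2, 2)
            filters = torch.cat([self.weight] * C, dim=0)
            # Depthwise stride-2 convolution: (B, C, H, W) -> (B, 4 * C, H / 2, W / 2)
            y = F.conv2d(x, filters, groups=C, stride=2)
            return y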
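To make the filter bank more concrete, here is a minimal pure-Python sketch of what the module computes on a single channel. It is only an illustration under stated assumptions: the helper name haar_dwt2, the KERNELS table and the 4x4 feature_map are made up for this example, and the kernels are written in their already flipped form, which is what the stride-2 cross-correlation in forward effectively applies.

```python
# Flipped 2x2 Haar kernels (illustrative copy of the module's filter bank)
KERNELS = {
    "LL": [[0.5, 0.5], [0.5, 0.5]],    # local sum (low-pass)
    "LH": [[0.5, 0.5], [-0.5, -0.5]],  # top row minus bottom row
    "HL": [[0.5, -0.5], [0.5, -0.5]],  # left column minus right column
    "HH": [[0.5, -0.5], [-0.5, 0.5]],  # diagonal difference
}


def haar_dwt2(x):
    """Apply each 2x2 kernel with stride 2 (no padding) to a 2D list of floats."""
    h, w = len(x), len(x[0])
    bands = {}
    for name, k in KERNELS.items():
        bands[name] = [
            [
                sum(k[i][j] * x[r + i][c + j] for i in range(2) for j in range(2))
                for c in range(0, w - 1, 2)
            ]
            for r in range(0, h - 1, 2)
        ]
    return bands


# Hypothetical single-channel 4x4 feature map
feature_map = [
    [1.0, 1.0, 2.0, 4.0],
    [1.0, 1.0, 2.0, 4.0],
    [3.0, 5.0, 0.0, 0.0],
    [7.0, 9.0, 0.0, 8.0],
]

bands = haar_dwt2(feature_map)
for name, band in bands.items():
    print(name, band)
```

Each of the four sub-bands is a 2x2 map, so one 4x4 input channel becomes four channels at half the resolution. In WaveletPool these four maps are stacked per input channel in the order LL, LH, HL, HH, which is why the output has four times as many channels as the input.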

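2.2 Updating the __init__.py file

Key step 2: In the ultralytics\ultralytics\nn\modules\models folder, create an __init__.py file and first import the class, for example with from .WaveletPool import WaveletPool.

Then declare it in __all__ below the import, for example __all__ = ["WaveletPool"].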

2.3 Adding the yaml file

Key step 3: Under /ultralytics/ultralytics/cfg/models/26, create a new file named yolo26_WaveletPool.yaml and paste in the content below.

Object detection

    # Ultralytics 🚀 AGPL-3.0 License - https://ultralytics.com/license

    # Ultralytics YOLO26 object detection model with P3/8 - P5/32 outputs
    # Model docs: https://docs.ultralytics.com/models/yolo26
    # Task docs: https://docs.ultralytics.com/tasks/detect

    # Parameters
    nc: 80 # number of classes
    end2end: True # whether to use end-to-end mode
    reg_max: 1 # DFL bins
    scales: # model compound scaling constants, i.e. 'model=yolo26n.yaml' will call yolo26.yaml with scale 'n'
      # [depth, width, max_channels]
      n: [0.50, 0.25, 1024] # summary: 260 layers, 2,572,280 parameters, 2,572,280 gradients, 6.1 GFLOPs
      s: [0.50, 0.50, 1024] # summary: 260 layers, 10,009,784 parameters, 10,009,784 gradients, 22.8 GFLOPs
      m: [0.50, 1.00, 512] # summary: 280 layers, 21,896,248 parameters, 21,896,248 gradients, 75.4 GFLOPs
      l: [1.00, 1.00, 512] # summary: 392 layers, 26,299,704 parameters, 26,299,704 gradients, 93.8 GFLOPs
      x: [1.00, 1.50, 512] # summary: 392 layers, 58,993,368 parameters, 58,993,368 gradients, 209.5 GFLOPs

    # YOLO26n backbone
    backbone:
      # [from, repeats, module, args]
      - [-1, 1, Conv, [64, 3, 2]] # 0-P1/2
      - [-1, 1, Conv, [128, 3, 2]] # 1-P2/4
      - [-1, 2, C3k2, [256, False, 0.25]] # 2-P2/4
      - [-1, 1, WaveletPool, []] # 3-P3/8
      - [-1, 2, C3k2, [512, False, 0.25]] # 4-P3/8
      - [-1, 1, WaveletPool, []] # 5-P4/16
      - [-1, 2, C3k2, [512, True]] # 6-P4/16
      - [-1, 1, WaveletPool, []] # 7-P5/32
      - [-1, 2, C3k2, [1024, True]] # 8-P5/32
      - [-1, 1, SPPF, [1024, 5, 3, True]] # 9-P5/32
      - [-1, 2, C2PSA, [1024]] # 10-P5/32

    # YOLO26n head
    head:
      - [-1, 1, nn.Upsample, [None, 2, "nearest"]] # 11-P4/16
      - [[-1, 6], 1, Concat, [1]] # 12-P4/16
      - [-1, 2, C3k2, [512, True]] # 13-P4/16

      - [-1, 1, nn.Upsample, [None, 2, "nearest"]] # 14-P3/8
      - [[-1, 4], 1, Concat, [1]] # 15-P3/8
      - [-1, 2, C3k2, [256, True]] # 16-P3/8

      - [-1, 1, WaveletPool, []] # 17-P4/16
      - [[-1, 13], 1, Concat, [1]] # 18-P4/16
      - [-1, 2, C3k2, [512, True]] # 19-P4/16

      - [-1, 1, WaveletPool, []] # 20-P5/32
      - [[-1, 10], 1, Concat, [1]] # 21-P5/32
      - [-1, 1, C3k2, [1024, True, 0.5, True]] # 22-P5/32

      - [[16, 19, 22], 1, Detect, [nc]] # 23-P3/8,P4/16,P5/32

Instance segmentation

    # Ultralytics 🚀 AGPL-3.0 License - https://ultralytics.com/license

    # Ultralytics YOLO26 instance segmentation model with P3/8 - P5/32 outputs
    # Model docs: https://docs.ultralytics.com/models/yolo26
    # Task docs: https://docs.ultralytics.com/tasks/segment

    # Parameters
    nc: 80 # number of classes
    end2end: True # whether to use end-to-end mode
    reg_max: 1 # DFL bins
    scales: # model compound scaling constants, i.e. 'model=yolo26n.yaml' will call yolo26.yaml with scale 'n'
      # [depth, width, max_channels]
      n: [0.50, 0.25, 1024] # summary: 260 layers, 2,572,280 parameters, 2,572,280 gradients, 6.1 GFLOPs
      s: [0.50, 0.50, 1024] # summary: 260 layers, 10,009,784 parameters, 10,009,784 gradients, 22.8 GFLOPs
      m: [0.50, 1.00, 512] # summary: 280 layers, 21,896,248 parameters, 21,896,248 gradients, 75.4 GFLOPs
      l: [1.00, 1.00, 512] # summary: 392 layers, 26,299,704 parameters, 26,299,704 gradients, 93.8 GFLOPs
      x: [1.00, 1.50, 512] # summary: 392 layers, 58,993,368 parameters, 58,993,368 gradients, 209.5 GFLOPs

    # YOLO26n backbone
    backbone:
      # [from, repeats, module, args]
      - [-1, 1, Conv, [64, 3, 2]] # 0-P1/2
      - [-1, 1, Conv, [128, 3, 2]] # 1-P2/4
      - [-1, 2, C3k2, [256, False, 0.25]] # 2-P2/4
      - [-1, 1, WaveletPool, []] # 3-P3/8
      - [-1, 2, C3k2, [512, False, 0.25]] # 4-P3/8
      - [-1, 1, WaveletPool, []] # 5-P4/16
      - [-1, 2, C3k2, [512, True]] # 6-P4/16
      - [-1, 1, WaveletPool, []] # 7-P5/32
      - [-1, 2, C3k2, [1024, True]] # 8-P5/32
      - [-1, 1, SPPF, [1024, 5, 3, True]] # 9-P5/32
      - [-1, 2, C2PSA, [1024]] # 10-P5/32

    # YOLO26n head
    head:
      - [-1, 1, nn.Upsample, [None, 2, "nearest"]] # 11-P4/16
      - [[-1, 6], 1, Concat, [1]] # 12-P4/16
      - [-1, 2, C3k2, [512, True]] # 13-P4/16

      - [-1, 1, nn.Upsample, [None, 2, "nearest"]] # 14-P3/8
      - [[-1, 4], 1, Concat, [1]] # 15-P3/8
      - [-1, 2, C3k2, [256, True]] # 16-P3/8

      - [-1, 1, WaveletPool, []] # 17-P4/16
      - [[-1, 13], 1, Concat, [1]] # 18-P4/16
      - [-1, 2, C3k2, [512, True]] # 19-P4/16

      - [-1, 1, WaveletPool, []] # 20-P5/32
      - [[-1, 10], 1, Concat, [1]] # 21-P5/32
      - [-1, 1, C3k2, [1024, True, 0.5, True]] # 22-P5/32

      - [[16, 19, 22], 1, Segment, [nc, 32, 256]] # 23-P3/8,P4/16,P5/32

Oriented object detection (OBB)

    # Ultralytics 🚀 AGPL-3.0 License - https://ultralytics.com/license

    # Ultralytics YOLO26 oriented bounding box (OBB) model with P3/8 - P5/32 outputs
    # Model docs: https://docs.ultralytics.com/models/yolo26
    # Task docs: https://docs.ultralytics.com/tasks/obb

    # Parameters
    nc: 80 # number of classes
    end2end: True # whether to use end-to-end mode
    reg_max: 1 # DFL bins
    scales: # model compound scaling constants, i.e. 'model=yolo26n.yaml' will call yolo26.yaml with scale 'n'
      # [depth, width, max_channels]
      n: [0.50, 0.25, 1024] # summary: 260 layers, 2,572,280 parameters, 2,572,280 gradients, 6.1 GFLOPs
      s: [0.50, 0.50, 1024] # summary: 260 layers, 10,009,784 parameters, 10,009,784 gradients, 22.8 GFLOPs
      m: [0.50, 1.00, 512] # summary: 280 layers, 21,896,248 parameters, 21,896,248 gradients, 75.4 GFLOPs
      l: [1.00, 1.00, 512] # summary: 392 layers, 26,299,704 parameters, 26,299,704 gradients, 93.8 GFLOPs
      x: [1.00, 1.50, 512] # summary: 392 layers, 58,993,368 parameters, 58,993,368 gradients, 209.5 GFLOPs

    # YOLO26n backbone
    backbone:
      # [from, repeats, module, args]
      - [-1, 1, Conv, [64, 3, 2]] # 0-P1/2
      - [-1, 1, Conv, [128, 3, 2]] # 1-P2/4
      - [-1, 2, C3k2, [256, False, 0.25]] # 2-P2/4
      - [-1, 1, WaveletPool, []] # 3-P3/8
      - [-1, 2, C3k2, [512, False, 0.25]] # 4-P3/8
      - [-1, 1, WaveletPool, []] # 5-P4/16
      - [-1, 2, C3k2, [512, True]] # 6-P4/16
      - [-1, 1, WaveletPool, []] # 7-P5/32
      - [-1, 2, C3k2, [1024, True]] # 8-P5/32
      - [-1, 1, SPPF, [1024, 5, 3, True]] # 9-P5/32
      - [-1, 2, C2PSA, [1024]] # 10-P5/32

    # YOLO26n head
    head:
      - [-1, 1, nn.Upsample, [None, 2, "nearest"]] # 11-P4/16
      - [[-1, 6], 1, Concat, [1]] # 12-P4/16
      - [-1, 2, C3k2, [512, True]] # 13-P4/16

      - [-1, 1, nn.Upsample, [None, 2, "nearest"]] # 14-P3/8
      - [[-1, 4], 1, Concat, [1]] # 15-P3/8
      - [-1, 2, C3k2, [256, True]] # 16-P3/8

      - [-1, 1, WaveletPool, []] # 17-P4/16
      - [[-1, 13], 1, Concat, [1]] # 18-P4/16
      - [-1, 2, C3k2, [512, True]] # 19-P4/16

      - [-1, 1, WaveletPool, []] # 20-P5/32
      - [[-1, 10], 1, Concat, [1]] # 21-P5/32
      - [-1, 1, C3k2, [1024, True, 0.5, True]] # 22-P5/32

      - [[16, 19, 22], 1, OBB, [nc, 1]] # 23-P3/8,P4/16,P5/32

Note: this article only adds the module on top of the base yolo26 configuration. To apply it to a specific size (yolo26 n/s/m/l/x), you only need to use the corresponding depth and width multipliers (depth_multiple and width_multiple in older YOLO configs), which are the [depth, width, max_channels] entries under scales:

    end2end: True # whether to use end-to-end mode
    reg_max: 1 # DFL bins
    scales: # model compound scaling constants, i.e. 'model=yolo26n.yaml' will call yolo26.yaml with scale 'n'
      # [depth, width, max_channels]
      n: [0.50, 0.25, 1024] # summary: 260 layers, 2,572,280 parameters, 2,572,280 gradients, 6.1 GFLOPs
      s: [0.50, 0.50, 1024] # summary: 260 layers, 10,009,784 parameters, 10,009,784 gradients, 22.8 GFLOPs
      m: [0.50, 1.00, 512] # summary: 280 layers, 21,896,248 parameters, 21,896,248 gradients, 75.4 GFLOPs
      l: [1.00, 1.00, 512] # summary: 392 layers, 26,299,704 parameters, 26,299,704 gradients, 93.8 GFLOPs
      x: [1.00, 1.50, 512] # summary: 392 layers, 58,993,368 parameters, 58,993,368 gradients, 209.5 GFLOPs
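As a rough illustration of what these three numbers do, the sketch below applies them to the channel counts and repeat counts written in the YAML. The rule used here (cap at max_channels, multiply by width, round up to a multiple of 8; multiply repeats by depth and round, with a minimum of 1) is my understanding of how Ultralytics' parse_model scales a config and should be read as an assumption, not as a copy of the library code. The helper names are made up for this example.

```python
import math

# [depth, width, max_channels] per scale, copied from the YAML above
SCALES = {
    "n": (0.50, 0.25, 1024),
    "s": (0.50, 0.50, 1024),
    "m": (0.50, 1.00, 512),
    "l": (1.00, 1.00, 512),
    "x": (1.00, 1.50, 512),
}


def make_divisible(x, divisor=8):
    """Round x up to the nearest multiple of divisor."""
    return math.ceil(x / divisor) * divisor


def scaled_channels(c2, scale):
    """Output channels of a width-scaled layer (assumed rule, see the text above)."""
    _, width, max_channels = SCALES[scale]
    return make_divisible(min(c2, max_channels) * width, 8)


def scaled_repeats(n, scale):
    """Repeat count of a depth-scaled layer (assumed rule, see the text above)."""
    depth, _, _ = SCALES[scale]
    return max(round(n * depth), 1) if n > 1 else n


for c2 in (64, 128, 256, 512, 1024):
    print(f"scale n: {c2} -> {scaled_channels(c2, 'n')} channels")
print("scale n: repeats 2 ->", scaled_repeats(2, "n"))
```

For scale n this gives 16, 32, 64, 128 and 256 channels and a single repeat, which agrees with the arguments column of the training log in section 2.5. WaveletPool itself takes no channel argument, so it is not scaled directly; its output is always four times whatever the preceding layer produces.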

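2.4 Registering the module in tasks.py

Key step 4: Register WaveletPool in the parse_model function.

First import the class in tasks.py.

Then find the parse_model function in tasks.py and add a branch for WaveletPool:

    elif m is WaveletPool:  # downsample_modules
        c1 = ch[f]  # input channels come from the layer this module reads
        c2 = c1 * 4  # four sub-bands per input channel
        args = []  # WaveletPool takes no constructor arguments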
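The effect of c2 = c1 * 4 on the rest of the network can be checked by hand. The following sketch is only a bookkeeping exercise: the function name wavelet_pool_out_channels is made up, and the channel counts are read from the scale n training log shown in section 2.5.

```python
def wavelet_pool_out_channels(c1):
    """Mirror of the parse_model branch: four sub-bands per input channel."""
    return c1 * 4


# Output channels of the layers feeding each WaveletPool at scale n
# (layer index -> channels, read from the log in section 2.5)
feeding_layers = {2: 64, 4: 128, 6: 128, 16: 64, 19: 128}

after_pool = {i: wavelet_pool_out_channels(c) for i, c in feeding_layers.items()}

# In the head, each Concat joins a pooled map with an earlier feature map
concat_18 = after_pool[16] + 128  # layer 17 output + layer 13 output
concat_21 = after_pool[19] + 256  # layer 20 output + layer 10 output

print(after_pool)
print(concat_18, concat_21)
```

The pooled maps have 256 or 512 channels, which are the input widths the log reports for the C3k2 blocks at layers 4, 6 and 8, and the two head concatenations come out at 384 and 768 channels, matching layers 19 and 22. If parse_model were not told about the factor of 4, the following layers would be built with the wrong input width.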

2.5 Running the program

Key step 5: Create train.py in the ultralytics directory and set the model argument to the path of yolo26_WaveletPool.yaml. (Note: create train.py in the outer ultralytics directory.)

    import warnings
    from pathlib import Path

    from ultralytics import YOLO

    warnings.filterwarnings("ignore")

    if __name__ == "__main__":
        # Load the model from the yaml file you want to use
        model = YOLO("ultralytics/cfg/models/26/yolo26_WaveletPool.yaml")

        # Train the model
        results = model.train(
            data=r"path/to/your/dataset.yaml",  # path to your dataset yaml file
            epochs=100,
            batch=16,
            imgsz=640,
            workers=4,
            name=Path(model.cfg).stem,  # name the run after the model yaml
        )

🚀 Run the program. If output like the following appears, the module has been added successfully 🚀

                       from  n    params  module                                       arguments
      0                  -1  1       464  ultralytics.nn.modules.conv.Conv             [3, 16, 3, 2]
      1                  -1  1      4672  ultralytics.nn.modules.conv.Conv             [16, 32, 3, 2]
      2                  -1  1      6640  ultralytics.nn.modules.block.C3k2            [32, 64, 1, False, 0.25]
      3                  -1  1        16  ultralytics.nn.models.WaveletPool.WaveletPool []
      4                  -1  1     38368  ultralytics.nn.modules.block.C3k2            [256, 128, 1, False, 0.25]
      5                  -1  1        16  ultralytics.nn.models.WaveletPool.WaveletPool []
      6                  -1  1    136192  ultralytics.nn.modules.block.C3k2            [512, 128, 1, True]
      7                  -1  1        16  ultralytics.nn.models.WaveletPool.WaveletPool []
      8                  -1  1    411648  ultralytics.nn.modules.block.C3k2            [512, 256, 1, True]
      9                  -1  1    164608  ultralytics.nn.modules.block.SPPF            [256, 256, 5, 3, True]
     10                  -1  1    249728  ultralytics.nn.modules.block.C2PSA           [256, 256, 1]
     11                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']
     12             [-1, 6]  1         0  ultralytics.nn.modules.conv.Concat           [1]
     13                  -1  1    119808  ultralytics.nn.modules.block.C3k2            [384, 128, 1, True]
     14                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']
     15             [-1, 4]  1         0  ultralytics.nn.modules.conv.Concat           [1]
     16                  -1  1     34304  ultralytics.nn.modules.block.C3k2            [256, 64, 1, True]
     17                  -1  1        16  ultralytics.nn.models.WaveletPool.WaveletPool []
     18            [-1, 13]  1         0  ultralytics.nn.modules.conv.Concat           [1]
     19                  -1  1    119808  ultralytics.nn.modules.block.C3k2            [384, 128, 1, True]
     20                  -1  1        16  ultralytics.nn.models.WaveletPool.WaveletPool []
     21            [-1, 10]  1         0  ultralytics.nn.modules.conv.Concat           [1]
     22                  -1  1    561408  ultralytics.nn.modules.block.C3k2            [768, 256, 1, True, 0.5, True]
     23        [16, 19, 22]  1    309656  ultralytics.nn.modules.head.Detect           [80, 1, True, [64, 128, 256]]
    YOLO26_WaveletPool summary: 255 layers, 2,157,384 parameters, 2,157,304 gradients, 5.2 GFLOPs
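One detail in the summary line is worth a quick check: the parameter count and the gradient count differ. A small arithmetic sketch, using only numbers from the log and from the WaveletPool code, suggests this is simply the frozen Haar filters, which are stored as parameters with requires_grad=False. The variable names are made up for this example.

```python
# Fixed Haar filter bank stored by each WaveletPool: 4 filters x 1 channel x 2 x 2 taps
params_per_pool = 4 * 1 * 2 * 2

# The yaml uses WaveletPool at layers 3, 5, 7 (backbone) and 17, 20 (head)
num_pools = 5
frozen_params = params_per_pool * num_pools

# Totals copied from the summary line of the log above
total_parameters = 2_157_384
total_gradients = 2_157_304

print("parameters per WaveletPool:", params_per_pool)
print("frozen parameters in total:", frozen_params)
print("parameters minus gradients:", total_parameters - total_gradients)
```

Each WaveletPool row in the log reports 16 parameters, and five such modules account for the 80 parameters that do not receive gradients.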

3. Complete code

Available from the sidebar of the author's homepage.

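4. GFLOPs

For how GFLOPs are calculated, see: Algorithm Engineer Interview Series | Convolution Basics: Convolution

GFLOPs of the unmodified YOLO26n

GFLOPs after the modification (the training log in section 2.5 reports 5.2 GFLOPs)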
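To give a feel for where the savings come from, here is a minimal sketch that compares one stride-2 3x3 convolution with one WaveletPool layer using the usual convolution FLOPs estimate. Everything in it is an assumption for illustration: the function conv2d_flops, the convention of counting a multiply-add as 2 FLOPs, and the 64-channel, 80x80-output example are not taken from Ultralytics, whose profiler may count differently.

```python
def conv2d_flops(c_in, c_out, k, h_out, w_out, groups=1):
    """Approximate FLOPs of one conv layer, counting a multiply-add as 2 FLOPs (bias ignored)."""
    macs = c_out * h_out * w_out * (c_in // groups) * k * k
    return 2 * macs


# Hypothetical example: a 64-channel 160x160 feature map downsampled to 80x80
c, h_out, w_out = 64, 80, 80

# Standard stride-2 3x3 convolution that doubles the channel count (64 -> 128)
strided_conv = conv2d_flops(c, 2 * c, k=3, h_out=h_out, w_out=w_out)

# WaveletPool: depthwise 2x2 convolution with 4 fixed filters per channel (64 -> 256)
wavelet_pool = conv2d_flops(c, 4 * c, k=2, h_out=h_out, w_out=w_out, groups=c)

print(f"stride-2 conv: {strided_conv / 1e9:.3f} GFLOPs")
print(f"WaveletPool:   {wavelet_pool / 1e9:.3f} GFLOPs")
```

The depthwise 2x2 filter bank is far cheaper than a dense 3x3 convolution because each output value only looks at four inputs from a single channel. Keep in mind that the layer after WaveletPool receives four times as many channels, so part of the saving is spent there; the overall figure is the one printed in the model summary.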

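5. Going further

WaveletPool can be combined with other attention mechanisms, loss functions and similar changes to further improve detection performance.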

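6. Summary

With the changes above, we have improved the model. This is only a start, and there is plenty of room left for further optimization and deeper technical work. I would also like to recommend my column, YOLO26 Improvements, paper-driven accuracy gains (click through to see all posts). The column focuses on cutting-edge deep-learning techniques, especially the latest progress in object detection. Besides in-depth analysis of YOLO26 and strategies for improving it, it is regularly updated with paper reproductions and hands-on write-ups from top conferences such as CVPR and NeurIPS.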
