
YOLOv11 vs. YOLOv8: An Analysis of the Network Structure Changes, to Help You Truly Understand and Learn the YOLO Framework

Source: https://blog.csdn.net/qq_39128381/article/details/142788296 2025/4/3 11:56:30

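This article adds some supplementary notes to an expert's earlier post, "YOLOv11 | An In-Depth Look at the Innovations in YOLOv11, the Latest Work from ultralytics | Training, Inference, Validation, Export (with Network Structure Diagrams)".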

I. YOLOv11 vs. YOLOv8

[Figure: comparison of YOLOv11 and YOLOv8]

II. The YOLOv11 Network Structure

The figure below shows the YOLOv11 network structure.
[Figure: YOLOv11 network structure]

III. New Modules Introduced in YOLOv11

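1. The C3k2 block. C3k2 takes a parameter c3k, which is set to False in the shallow layers of the network (in the figure below, the second argument of C3k2 is False; that argument is c3k).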

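[Figure: C3k2 entries in the model configuration, with the second argument set to False]
With c3k=False, C3k2 corresponds to C2f in YOLOv8 and has the same network structure. The structure of the C3k block is shown in the figure below.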

A trained YOLOv11 .pt model can be converted to an ONNX model with the following command:

yolo task=detect mode=export model=runs/detect/train/weights/best.pt format=onnx

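Then open the ONNX file on the Netron website to view the model structure.
To see the feature-map size at every layer, as in the figure below, first simplify the exported ONNX model and then open the simplified model in Netron.
The simplification commands are: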

pip install onnx-simplifier

python -m onnxsim runs/detect/train/weights/best.onnx runs/detect/train/weights/best_sim.onnx
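The same two steps can also be driven from Python. The following is only a sketch: it assumes the ultralytics, onnx, and onnxsim packages are installed, that YOLO(...).export(format="onnx") returns the path of the exported file, and that onnxsim.simplify returns the simplified model together with a validation flag. The helper names simplified_path and export_and_simplify are hypothetical.

```python
from pathlib import Path


def simplified_path(onnx_path: str) -> Path:
    """Return the output path used above: best.onnx -> best_sim.onnx."""
    p = Path(onnx_path)
    return p.with_name(f"{p.stem}_sim{p.suffix}")


def export_and_simplify(pt_path: str) -> Path:
    """Export a .pt checkpoint to ONNX, then simplify it with onnxsim."""
    # Imported lazily so this file still loads when the packages are missing.
    import onnx
    from onnxsim import simplify
    from ultralytics import YOLO

    # Assumption: export() returns the path of the exported ONNX file.
    onnx_path = YOLO(pt_path).export(format="onnx")
    model_simplified, ok = simplify(onnx.load(onnx_path))
    if not ok:
        raise RuntimeError("onnxsim could not validate the simplified model")
    out_path = simplified_path(onnx_path)
    onnx.save(model_simplified, str(out_path))
    return out_path


print(simplified_path("runs/detect/train/weights/best.onnx").as_posix())
```

Calling export_and_simplify("runs/detect/train/weights/best.pt") would then write best_sim.onnx next to the exported file, ready to open in Netron.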

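Below is the ONNX graph of layer 3 of yolov11s, a C3k2 whose second argument is False.
[Figure: ONNX graph of the C3k2 block with c3k=False]
Because the s scale uses a width multiplier of 0.5, the convolution channel counts listed in the YAML file are halved. (The depth multiplier, also 0.5 for s, scales the number of repeats, not the channels.)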
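As a rough illustration of how the scale multipliers act on the YAML values, here is a minimal sketch. It assumes the s scale is defined as depth=0.50, width=0.50, max_channels=1024, that channels are rounded up to a multiple of 8, and that repeat counts never drop below 1; the function names are illustrative, not a copy of the library code.

```python
import math


def make_divisible(x: float, divisor: int = 8) -> int:
    """Round x up to the nearest multiple of divisor."""
    return math.ceil(x / divisor) * divisor


def scale_channels(c2: int, width: float, max_channels: int) -> int:
    """Scale a YAML channel count by the width multiplier."""
    return make_divisible(min(c2, max_channels) * width, 8)


def scale_repeats(n: int, depth: float) -> int:
    """Scale a YAML repeat count by the depth multiplier (never below 1)."""
    return max(round(n * depth), 1) if n > 1 else n


# Assumed values for the s scale: depth=0.50, width=0.50, max_channels=1024.
depth, width, max_channels = 0.50, 0.50, 1024
for yaml_channels in (64, 128, 256, 512, 1024):
    print(yaml_channels, "->", scale_channels(yaml_channels, width, max_channels))
print("repeats 2 ->", scale_repeats(2, depth))
```

Under these assumptions every channel count in the YAML is halved for the s model, and a block listed with 2 repeats is built once.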

Below is the ONNX graph of layer 6, a C3k2 whose second argument is True.

YOLOv11 module code:

class C3k2(C2f):
    """Faster Implementation of CSP Bottleneck with 2 convolutions."""

    def __init__(self, c1, c2, n=1, c3k=False, e=0.5, g=1, shortcut=True):
        """Initializes the C3k2 module, a faster CSP Bottleneck with 2 convolutions and optional C3k blocks."""
        super().__init__(c1, c2, n, shortcut, g, e)
        self.m = nn.ModuleList(
            C3k(self.c, self.c, 2, shortcut, g) if c3k else Bottleneck(self.c, self.c, shortcut, g) for _ in range(n)
        )

class C3k(C3):
    """C3k is a CSP bottleneck module with customizable kernel sizes for feature extraction in neural networks."""

    def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5, k=3):
        """Initializes the C3k module with specified channels, number of layers, and configurations."""
        super().__init__(c1, c2, n, shortcut, g, e)
        c_ = int(c2 * e)  # hidden channels
        # self.m = nn.Sequential(*(RepBottleneck(c_, c_, shortcut, g, k=(k, k), e=1.0) for _ in range(n)))
        self.m = nn.Sequential(*(Bottleneck(c_, c_, shortcut, g, k=(k, k), e=1.0) for _ in range(n)))

class Bottleneck(nn.Module):
    """Standard bottleneck."""

    def __init__(self, c1, c2, shortcut=True, g=1, k=(3, 3), e=0.5):
        """Initializes a standard bottleneck module with optional shortcut connection and configurable parameters."""
        super().__init__()
        c_ = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c_, k[0], 1)
        self.cv2 = Conv(c_, c2, k[1], 1, g=g)
        self.add = shortcut and c1 == c2

    def forward(self, x):
        """Applies the YOLO FPN to input data."""
        return x + self.cv2(self.cv1(x)) if self.add else self.cv2(self.cv1(x))
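To make the channel bookkeeping of C3k2 easier to follow, the sketch below traces channel counts through the block without building any layers. The parent class C2f is not shown in this article, so the layout used here is an assumption based on my understanding of C2f: a 1x1 conv to 2*c channels, a split into two chunks of c, n inner blocks that each contribute c channels, and a final 1x1 conv over the concatenation. The function names and the example sizes are hypothetical.

```python
def c3k2_channels(c1: int, c2: int, n: int = 1, e: float = 0.5) -> dict:
    """Trace channel counts through a C2f-style block such as C3k2."""
    c = int(c2 * e)  # hidden channels per branch (self.c)
    return {
        "cv1_in": c1,           # input channels of the first 1x1 conv
        "cv1_out": 2 * c,       # split into two chunks of c channels
        "block_io": c,          # each inner block maps c -> c channels
        "concat": (2 + n) * c,  # two chunks plus one output per inner block
        "cv2_out": c2,          # final 1x1 conv restores c2 channels
    }


def inner_block(c3k: bool) -> str:
    """Name the inner block that C3k2 builds for a given c3k flag."""
    return "C3k(c, c, 2)" if c3k else "Bottleneck(c, c)"


# Hypothetical example: 64 input channels, 128 output channels, e=0.25.
print(c3k2_channels(64, 128, n=1, e=0.25))
print(inner_block(False), "|", inner_block(True))
```

The c3k flag only changes what sits inside each branch: a single Bottleneck when False, or a C3k block (itself containing two Bottlenecks) when True. The surrounding split-and-concatenate layout stays the same.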

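2. The second innovation is the C2PSA block: a C2 block (the predecessor of C2f) with a multi-head attention mechanism embedded inside it. Along the way I also noticed that the authors experimented with a C2fPSA block, which presumably did not work as well as C2PSA; whether a mechanism helps is sometimes genuinely hard to justify in theory. The figure below shows how C2PSA works. Look closely: if the Attention part is removed, C2PSA reduces to C2, which is why I said above that C2PSA is C2 with a PSA mechanism embedded in it.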

[Figure: C2PSA block structure]

Below is the ONNX graph of the layer-10 C2PSA.


class C2PSA(nn.Module):
    """
    C2PSA module with attention mechanism for enhanced feature extraction and processing.

    This module implements a convolutional block with attention mechanisms to enhance feature extraction and processing
    capabilities. It includes a series of PSABlock modules for self-attention and feed-forward operations.

    Attributes:
        c (int): Number of hidden channels.
        cv1 (Conv): 1x1 convolution layer to reduce the number of input channels to 2*c.
        cv2 (Conv): 1x1 convolution layer to reduce the number of output channels to c.
        m (nn.Sequential): Sequential container of PSABlock modules for attention and feed-forward operations.

    Methods:
        forward: Performs a forward pass through the C2PSA module, applying attention and feed-forward operations.

    Notes:
        This module essentially is the same as PSA module, but refactored to allow stacking more PSABlock modules.

    Examples:
        >>> c2psa = C2PSA(c1=256, c2=256, n=3, e=0.5)
        >>> input_tensor = torch.randn(1, 256, 64, 64)
        >>> output_tensor = c2psa(input_tensor)
    """

    def __init__(self, c1, c2, n=1, e=0.5):
        """Initializes the C2PSA module with specified input/output channels, number of layers, and expansion ratio."""
        super().__init__()
        assert c1 == c2
        self.c = int(c1 * e)
        self.cv1 = Conv(c1, 2 * self.c, 1, 1)
        self.cv2 = Conv(2 * self.c, c1, 1)

        self.m = nn.Sequential(*(PSABlock(self.c, attn_ratio=0.5, num_heads=self.c // 64) for _ in range(n)))

    def forward(self, x):
        """Processes the input tensor 'x' through a series of PSA blocks and returns the transformed tensor."""
        a, b = self.cv1(x).split((self.c, self.c), dim=1)
        b = self.m(b)
        return self.cv2(torch.cat((a, b), 1))

class PSABlock(nn.Module):
    """
    PSABlock class implementing a Position-Sensitive Attention block for neural networks.

    This class encapsulates the functionality for applying multi-head attention and feed-forward neural network layers
    with optional shortcut connections.

    Attributes:
        attn (Attention): Multi-head attention module.
        ffn (nn.Sequential): Feed-forward neural network module.
        add (bool): Flag indicating whether to add shortcut connections.

    Methods:
        forward: Performs a forward pass through the PSABlock, applying attention and feed-forward layers.

    Examples:
        Create a PSABlock and perform a forward pass
        >>> psablock = PSABlock(c=128, attn_ratio=0.5, num_heads=4, shortcut=True)
        >>> input_tensor = torch.randn(1, 128, 32, 32)
        >>> output_tensor = psablock(input_tensor)
    """

    def __init__(self, c, attn_ratio=0.5, num_heads=4, shortcut=True) -> None:
        """Initializes the PSABlock with attention and feed-forward layers for enhanced feature extraction."""
        super().__init__()

        self.attn = Attention(c, attn_ratio=attn_ratio, num_heads=num_heads)
        self.ffn = nn.Sequential(Conv(c, c * 2, 1), Conv(c * 2, c, 1, act=False))
        self.add = shortcut

    def forward(self, x):
        """Executes a forward pass through PSABlock, applying attention and feed-forward layers to the input tensor."""
        x = x + self.attn(x) if self.add else self.attn(x)
        x = x + self.ffn(x) if self.add else self.ffn(x)
        return x
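The way C2PSA.__init__ derives its hidden width and head count can be checked with a few lines of arithmetic. This is a minimal sketch that mirrors only the two expressions visible in the constructor above, int(c1 * e) and self.c // 64; the helper name c2psa_config and the sample channel counts are hypothetical.

```python
def c2psa_config(c1: int, e: float = 0.5, head_dim: int = 64) -> tuple:
    """Return (hidden channels, attention heads) as C2PSA.__init__ derives them."""
    c = int(c1 * e)            # self.c: channels in each half after the split
    num_heads = c // head_dim  # num_heads=self.c // 64 in the constructor
    return c, num_heads


# Hypothetical input channel counts, for illustration only.
for c1 in (256, 512, 1024):
    c, heads = c2psa_config(c1)
    print(f"c1={c1}: split into {c}+{c} channels, {heads} heads in each PSABlock")
```

Only one half of the split (b in forward) passes through the PSABlock stack; the other half (a) is concatenated back unchanged. Note that with e=0.5 an input narrower than 128 channels would give zero heads under this formula.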

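3. The third innovation is that the classification branch of the original decoupled head gains two DWConv layers. For a direct comparison, see the two figures below: the upper one is the YOLOv8 decoupled head and the lower one is the YOLOv11 decoupled head.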

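[Figure: YOLOv8 decoupled head (top) and YOLOv11 decoupled head (bottom)]
Definition of the classification branch in the head:
v8:

v11:

DWConv code:
[Figure: DWConv source code]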

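As shown above, YOLOv11 inserts two DWConv layers into the classification branch of the detection head, which substantially reduces the parameter count and the computation. (Note that the kernel size of the two original ordinary Conv layers changes from 3 to 1; combined with the DWConv layers, they form two depthwise separable convolutions.)
See also the blog post https://blog.csdn.net/m0_56563749/article/details/133150979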
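The saving from replacing an ordinary 3x3 Conv with a depthwise 3x3 conv plus a 1x1 Conv can be estimated by counting weights. This is a back-of-the-envelope sketch, not the head code itself: it ignores bias and BatchNorm parameters, assumes equal input and output channels, and uses a hypothetical channel count of 256.

```python
def conv_weights(c_in: int, c_out: int, k: int, groups: int = 1) -> int:
    """Weight count of a k x k convolution (bias and BatchNorm ignored)."""
    return (c_in // groups) * c_out * k * k


def standard_branch(c: int) -> int:
    """One ordinary 3x3 Conv, c -> c channels."""
    return conv_weights(c, c, 3)


def separable_branch(c: int) -> int:
    """A depthwise 3x3 conv (groups=c) followed by a 1x1 Conv, c -> c channels."""
    return conv_weights(c, c, 3, groups=c) + conv_weights(c, c, 1)


c = 256  # hypothetical channel count
std, sep = standard_branch(c), separable_branch(c)
print(f"standard 3x3: {std} weights")
print(f"depthwise 3x3 + 1x1: {sep} weights ({std / sep:.1f}x fewer)")
```

For these assumed sizes the ordinary layer has 589824 weights and the separable pair has 67840. The multiply-accumulate count per output position scales the same way, since both layers produce feature maps of the same spatial size.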

4. The SPPF module (not a new addition)

[Figure: SPPF module structure]
ONNX graph of the layer-9 SPPF.

SPPF module code: