
PyTorch Implementation of the ODConv Dynamic Convolution Module

news 2026/5/12 16:57:28

The ODConv Dynamic Convolution Module

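ODConv can be seen as a continuation of CondConv. Where CondConv makes the kernel dynamic along a single dimension, ODConv also makes it dynamic along the spatial, input-channel, and output-channel dimensions, hence the name omni-dimensional dynamic convolution. Using a parallel strategy, ODConv applies a multi-dimensional attention mechanism that learns complementary attentions along all four dimensions of the kernel space. As a plug-and-play operation, it can be dropped into existing CNNs with little effort. Experiments on ImageNet classification and COCO detection back this up: ODConv improves large models and lightweight models alike, which makes it a real all-rounder. Notably, thanks to its stronger feature extraction, ODConv with a single kernel can still match or even outperform existing multi-kernel dynamic convolutions.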
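
As a compact summary of the four attentions described above, the computation performed by the implementation in this post can be written as a single expression. The notation is an assumption made here for illustration, not taken from the paper: $x$ is the input feature map, $W_i$ is the $i$-th of $n$ candidate kernels, $\alpha_{w,i}$ is the kernel attention (a softmax over the $n$ kernels), $\alpha_s$ is the spatial attention over the $k \times k$ positions, $\alpha_c$ is the input-channel attention, $\alpha_f$ is the output-channel (filter) attention, $\odot$ is broadcast multiplication, and $*$ is convolution.

```latex
y \;=\; \alpha_f \odot \Bigl( \Bigl( \sum_{i=1}^{n} \alpha_{w,i}\, \bigl( \alpha_s \odot W_i \bigr) \Bigr) * \bigl( \alpha_c \odot x \bigr) \Bigr)
```

In this code, $\alpha_c$ and $\alpha_f$ are applied to the feature maps rather than to the weights. The comment in the code notes that the two placements are equivalent and that this one runs faster and uses less GPU memory.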

Original paper: Omni-Dimensional Dynamic Convolution

[Figure: ODConv architecture diagram]
Code implementation:

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.autograd
# Conv and autopad come from the host project's models/common.py
# (presumably a YOLOv5-style codebase; they are not part of PyTorch).
from models.common import Conv, autopad


class Attention(nn.Module):
    def __init__(self, in_planes, out_planes, kernel_size, groups=1, reduction=0.0625, kernel_num=4, min_channel=16):
        super(Attention, self).__init__()
        attention_channel = max(int(in_planes * reduction), min_channel)
        self.kernel_size = kernel_size
        self.kernel_num = kernel_num
        self.temperature = 1.0

        # Shared squeeze step: global average pooling followed by a reduction layer.
        self.avgpool = nn.AdaptiveAvgPool2d(1)
        self.fc = Conv(in_planes, attention_channel, act=nn.ReLU(inplace=True))

        # Input-channel attention is always computed.
        self.channel_fc = nn.Conv2d(attention_channel, in_planes, 1, bias=True)
        self.func_channel = self.get_channel_attention

        if in_planes == groups and in_planes == out_planes:  # depth-wise convolution
            self.func_filter = self.skip
        else:
            self.filter_fc = nn.Conv2d(attention_channel, out_planes, 1, bias=True)
            self.func_filter = self.get_filter_attention

        if kernel_size == 1:  # point-wise convolution
            self.func_spatial = self.skip
        else:
            self.spatial_fc = nn.Conv2d(attention_channel, kernel_size * kernel_size, 1, bias=True)
            self.func_spatial = self.get_spatial_attention

        if kernel_num == 1:
            self.func_kernel = self.skip
        else:
            self.kernel_fc = nn.Conv2d(attention_channel, kernel_num, 1, bias=True)
            self.func_kernel = self.get_kernel_attention

        self._initialize_weights()

    def _initialize_weights(self):
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
                if m.bias is not None:
                    nn.init.constant_(m.bias, 0)
            if isinstance(m, nn.BatchNorm2d):
                nn.init.constant_(m.weight, 1)
                nn.init.constant_(m.bias, 0)

    def update_temperature(self, temperature):
        self.temperature = temperature

    @staticmethod
    def skip(_):
        # Stand-in for a disabled attention branch: multiplying by 1.0 is a no-op.
        return 1.0

    def get_channel_attention(self, x):
        channel_attention = torch.sigmoid(self.channel_fc(x).view(x.size(0), -1, 1, 1) / self.temperature)
        return channel_attention

    def get_filter_attention(self, x):
        filter_attention = torch.sigmoid(self.filter_fc(x).view(x.size(0), -1, 1, 1) / self.temperature)
        return filter_attention

    def get_spatial_attention(self, x):
        spatial_attention = self.spatial_fc(x).view(x.size(0), 1, 1, 1, self.kernel_size, self.kernel_size)
        spatial_attention = torch.sigmoid(spatial_attention / self.temperature)
        return spatial_attention

    def get_kernel_attention(self, x):
        kernel_attention = self.kernel_fc(x).view(x.size(0), -1, 1, 1, 1, 1)
        kernel_attention = F.softmax(kernel_attention / self.temperature, dim=1)
        return kernel_attention

    def forward(self, x):
        x = self.avgpool(x)
        x = self.fc(x)
        return self.func_channel(x), self.func_filter(x), self.func_spatial(x), self.func_kernel(x)


class ODConv2d(nn.Module):
    def __init__(self, in_planes, out_planes, k, s=1, p=None, g=1, act=True, d=1,
                 reduction=0.0625, kernel_num=1):
        super(ODConv2d, self).__init__()
        self.in_planes = in_planes
        self.out_planes = out_planes
        self.kernel_size = k
        self.stride = s
        self.padding = autopad(k, p)
        self.dilation = d
        self.groups = g
        self.kernel_num = kernel_num
        self.attention = Attention(in_planes, out_planes, k, groups=g,
                                   reduction=reduction, kernel_num=kernel_num)
        # Bank of kernel_num candidate kernels.
        self.weight = nn.Parameter(torch.randn(kernel_num, out_planes, in_planes//g, k, k),
                                   requires_grad=True)
        self._initialize_weights()
        self.bn = nn.BatchNorm2d(out_planes)
        self.act = nn.SiLU() if act is True else (act if isinstance(act, nn.Module) else nn.Identity())

        # A 1x1 convolution with a single kernel needs no spatial or kernel attention.
        if self.kernel_size == 1 and self.kernel_num == 1:
            self._forward_impl = self._forward_impl_pw1x
        else:
            self._forward_impl = self._forward_impl_common

    def _initialize_weights(self):
        for i in range(self.kernel_num):
            nn.init.kaiming_normal_(self.weight[i], mode='fan_out', nonlinearity='relu')

    def update_temperature(self, temperature):
        self.attention.update_temperature(temperature)

    def _forward_impl_common(self, x):
        # Multiplying channel attention (or filter attention) to weights and feature maps are equivalent,
        # while we observe that when using the latter method the models will run faster with less gpu memory cost.
        channel_attention, filter_attention, spatial_attention, kernel_attention = self.attention(x)
        batch_size, in_planes, height, width = x.size()
        x = x * channel_attention
        # Fold the batch into the channel axis so one grouped conv can apply a different kernel per sample.
        x = x.reshape(1, -1, height, width)
        aggregate_weight = spatial_attention * kernel_attention * self.weight.unsqueeze(dim=0)
        aggregate_weight = torch.sum(aggregate_weight, dim=1).view(
            [-1, self.in_planes // self.groups, self.kernel_size, self.kernel_size])
        output = F.conv2d(x, weight=aggregate_weight, bias=None, stride=self.stride, padding=self.padding,
                          dilation=self.dilation, groups=self.groups * batch_size)
        output = output.view(batch_size, self.out_planes, output.size(-2), output.size(-1))
        output = output * filter_attention
        return output

    def _forward_impl_pw1x(self, x):
        channel_attention, filter_attention, spatial_attention, kernel_attention = self.attention(x)
        x = x * channel_attention
        output = F.conv2d(x, weight=self.weight.squeeze(dim=0), bias=None, stride=self.stride, padding=self.padding,
                          dilation=self.dilation, groups=self.groups)
        output = output * filter_attention
        return output

    def forward(self, x):
        return self.act(self.bn(self._forward_impl(x)))
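
The least obvious step in `_forward_impl_common` is reshaping the input to a batch of one and calling `F.conv2d` with `groups=self.groups * batch_size`. The sketch below isolates that trick and checks it against a plain per-sample loop. It is a minimal illustration rather than part of the original module: the tensor sizes and the names `per_sample_weight`, `folded_x`, `folded_w`, and `reference` are assumptions chosen for the example, and it covers only the `groups=1` case.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Illustrative sizes (assumptions, not values from the post).
batch, c_in, c_out, k, h, w = 3, 4, 6, 3, 8, 8

x = torch.randn(batch, c_in, h, w)
# One aggregated kernel per sample, standing in for the attention-weighted sum.
per_sample_weight = torch.randn(batch, c_out, c_in, k, k)

# Reference: convolve each sample with its own kernel in a Python loop.
reference = torch.cat(
    [F.conv2d(x[i:i + 1], per_sample_weight[i], padding=k // 2) for i in range(batch)],
    dim=0,
)

# Grouped-convolution trick: fold the batch into the channel axis and set
# groups=batch, so each sample's channels only meet that sample's kernel.
folded_x = x.reshape(1, batch * c_in, h, w)
folded_w = per_sample_weight.reshape(batch * c_out, c_in, k, k)
folded_out = F.conv2d(folded_x, folded_w, padding=k // 2, groups=batch)
output = folded_out.view(batch, c_out, h, w)

# The two results should agree up to floating-point rounding.
max_abs_diff = (output - reference).abs().max().item()
print(tuple(output.shape), max_abs_diff)
```

Because every sample gets its own kernel, a standard batched convolution cannot be used directly. Folding the batch into groups turns the whole batch into a single `F.conv2d` call instead of a loop.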
