[Deep Learning] Understanding the ResNet and ResNeXt Architectures
Posted by: shili8
Posted: 2025-01-17 13:34
**Deep Learning Series: ResNet and ResNeXt**
In deep learning, the design of the network architecture is a crucial part of the overall system. Among recent advances, the residual connection is a key technique that has greatly improved the performance of deep neural networks. This article focuses on two architectures built on residual connections: ResNet and ResNeXt.
**1. ResNet**
### 1.1 Overview

ResNet (Residual Network) was proposed by Kaiming He et al. in 2015 to address the difficulty of training very deep neural networks. By introducing residual connections, each block only has to learn a residual F(x) = H(x) - x rather than the full mapping H(x), which makes the network substantially easier to optimize.
### 1.2 Architecture

The basic structure of a ResNet building block is as follows:
```markdown
# ResNet
## Block
### Convolutional Layer
- Input: x
- Output: F(x)
### Identity Mapping
- Input: x
- Output: x
### Residual Connection
- Inputs: F(x), x
- Output: H(x) = F(x) + x
```
Here, `F(x)` denotes the output of the convolutional branch in the forward pass, and `H(x) = F(x) + x` is the output after the residual connection.
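As a quick illustration (a minimal standalone sketch with made-up channel counts, not part of the reference implementation below), the residual connection is nothing more than an element-wise addition of the branch output and the input:

```python
import torch
import torch.nn as nn

# Toy residual branch F(x): a single shape-preserving 3x3 convolution
# (illustrative only; the real blocks below use two convolutions plus batch norm)
conv = nn.Conv2d(16, 16, kernel_size=3, padding=1)

x = torch.randn(1, 16, 8, 8)   # dummy feature map
h = conv(x) + x                # H(x) = F(x) + x -- the residual connection

print(h.shape)                 # torch.Size([1, 16, 8, 8]), same shape as the input
```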
### 1.3 Implementation

The following is a compact ResNet-34-style implementation in PyTorch:
```python
import torch
import torch.nn as nn


class ResidualBlock(nn.Module):
    """Basic residual block: two 3x3 convolutions plus an identity shortcut."""

    def __init__(self, in_channels, out_channels, stride=1):
        super(ResidualBlock, self).__init__()
        # Main branch F(x): conv -> BN -> ReLU -> conv -> BN
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size=3,
                      stride=stride, padding=1, bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_channels, out_channels, kernel_size=3,
                      stride=1, padding=1, bias=False),
            nn.BatchNorm2d(out_channels)
        )
        # Shortcut branch: identity, or a 1x1 projection when the shape changes
        self.shortcut = nn.Sequential()
        if stride != 1 or in_channels != out_channels:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=1,
                          stride=stride, bias=False),
                nn.BatchNorm2d(out_channels)
            )

    def forward(self, x):
        # H(x) = F(x) + x, followed by a ReLU
        out = self.conv(x) + self.shortcut(x)
        return torch.relu(out)


class ResNet(nn.Module):
    def __init__(self, num_classes=10):
        super(ResNet, self).__init__()
        self.in_channels = 64
        # Stem: 7x7 convolution followed by max pooling
        self.conv1 = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        )
        # Four stages with [3, 4, 6, 3] blocks (ResNet-34 layout)
        self.layer1 = self._make_layer(ResidualBlock, 64, 3)
        self.layer2 = self._make_layer(ResidualBlock, 128, 4, stride=2)
        self.layer3 = self._make_layer(ResidualBlock, 256, 6, stride=2)
        self.layer4 = self._make_layer(ResidualBlock, 512, 3, stride=2)
        self.avgpool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(512, num_classes)

    def _make_layer(self, block, out_channels, blocks, stride=1):
        # Only the first block of a stage downsamples; the rest use stride 1
        strides = [stride] + [1] * (blocks - 1)
        layers = []
        for s in strides:
            layers.append(block(self.in_channels, out_channels, s))
            self.in_channels = out_channels
        return nn.Sequential(*layers)

    def forward(self, x):
        out = self.conv1(x)
        out = self.layer1(out)
        out = self.layer2(out)
        out = self.layer3(out)
        out = self.layer4(out)
        out = self.avgpool(out)
        out = torch.flatten(out, 1)
        out = self.fc(out)
        return out
```
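As a quick sanity check of the sketch above (the 224×224 input resolution and batch size of 2 are arbitrary choices, not from the original post):

```python
model = ResNet(num_classes=10)
x = torch.randn(2, 3, 224, 224)   # a random batch of 2 RGB images
logits = model(x)
print(logits.shape)               # torch.Size([2, 10])
```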
### 1.4 Summary

By introducing residual connections, ResNet makes very deep networks much easier to learn and train, effectively addressing the degradation problem that plagues plain deep networks.
**2. ResNeXt**
### 2.1 Overview

ResNeXt was proposed by Saining Xie et al. in 2017 as a further improvement on ResNet. It combines residual connections with grouped convolutions, splitting each block into a set of parallel transformation paths (the block's "cardinality"), which lets the network learn more effectively for a given parameter budget.
### 2.2 Architecture

The basic structure of a ResNeXt building block is as follows:
```markdown
# ResNeXt
## Block
### Grouped Convolutional Layer
- Input: x
- Output: F(x)
### Identity Mapping
- Input: x
- Output: x
### Residual Connection
- Inputs: F(x), x
- Output: H(x) = F(x) + x
```
Here, `F(x)` denotes the output of the grouped-convolution branch in the forward pass, and `H(x) = F(x) + x` is the output after the residual connection.
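To make the effect of grouped convolution concrete, the following standalone sketch (with arbitrarily chosen channel counts) compares the parameter count of an ordinary 3×3 convolution with a grouped one; splitting into g groups cuts the weight count by roughly a factor of g:

```python
import torch.nn as nn

dense   = nn.Conv2d(256, 256, kernel_size=3, padding=1, bias=False)
grouped = nn.Conv2d(256, 256, kernel_size=3, padding=1, groups=32, bias=False)

print(sum(p.numel() for p in dense.parameters()))    # 589824 = 256 * 256 * 3 * 3
print(sum(p.numel() for p in grouped.parameters()))  # 18432  = 256 * (256/32) * 3 * 3
```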
### 2.3 Implementation

The sketch below keeps each block simple (a single grouped 3×3 convolution per block); the original paper instead uses a 1×1–3×3–1×1 bottleneck with the grouped convolution in the middle.
```python
import torch
import torch.nn as nn


class ResNeXtBlock(nn.Module):
    """Simplified ResNeXt block: a grouped 3x3 convolution plus a residual connection."""

    def __init__(self, in_channels, out_channels, stride=1, groups=32):
        super(ResNeXtBlock, self).__init__()
        # Main branch F(x): the grouped convolution splits the channels into
        # `groups` parallel paths -- the "cardinality" dimension of ResNeXt
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=stride,
                      padding=1, groups=groups, bias=False),
            nn.BatchNorm2d(out_channels)
        )
        # Shortcut branch: identity, or a 1x1 projection when the shape changes
        self.shortcut = nn.Sequential()
        if stride != 1 or in_channels != out_channels:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=1,
                          stride=stride, bias=False),
                nn.BatchNorm2d(out_channels)
            )

    def forward(self, x):
        # H(x) = F(x) + x, followed by a ReLU
        out = self.conv(x) + self.shortcut(x)
        return torch.relu(out)


class ResNeXt(nn.Module):
    def __init__(self, num_classes=10, groups=32):
        super(ResNeXt, self).__init__()
        self.in_channels = 64
        self.groups = groups
        # Stem: identical to the ResNet stem
        self.conv1 = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        )
        # Four stages with [3, 4, 6, 3] blocks
        self.layer1 = self._make_layer(ResNeXtBlock, 64, 3)
        self.layer2 = self._make_layer(ResNeXtBlock, 128, 4, stride=2)
        self.layer3 = self._make_layer(ResNeXtBlock, 256, 6, stride=2)
        self.layer4 = self._make_layer(ResNeXtBlock, 512, 3, stride=2)
        self.avgpool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(512, num_classes)

    def _make_layer(self, block, out_channels, blocks, stride=1):
        # Only the first block of a stage downsamples; the rest use stride 1
        strides = [stride] + [1] * (blocks - 1)
        layers = []
        for s in strides:
            layers.append(block(self.in_channels, out_channels,
                                stride=s, groups=self.groups))
            self.in_channels = out_channels
        return nn.Sequential(*layers)

    def forward(self, x):
        out = self.conv1(x)
        out = self.layer1(out)
        out = self.layer2(out)
        out = self.layer3(out)
        out = self.layer4(out)
        out = self.avgpool(out)
        out = torch.flatten(out, 1)
        out = self.fc(out)
        return out
```
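As with ResNet, a quick forward-pass check (input shape chosen arbitrarily):

```python
model = ResNeXt(num_classes=10)
x = torch.randn(2, 3, 224, 224)
logits = model(x)
print(logits.shape)   # torch.Size([2, 10])
```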
### 2.4 Summary

By combining grouped convolutions with residual connections, ResNeXt learns more effectively for a given parameter budget, further improving on ResNet's performance.
**3. Conclusion**
This article introduced two architectures built on residual connections: ResNet and ResNeXt. Residual connections allow ResNet to overcome the difficulty of training very deep networks, and ResNeXt improves on ResNet further by adding grouped convolutions. Both designs are widely applied across many domains.
**4. References**
[1] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016.
[2] Saining Xie, Ross Girshick, Piotr Dollár, Zhuowen Tu, and Kaiming He. Aggregated residual transformations for deep neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 5987–5995, 2017.