【Dive into Deep Learning】-- 12. Deep Convolutional Neural Networks: AlexNet
**Deep Convolutional Neural Networks: AlexNet**
In earlier chapters, we saw how to use convolutional neural networks (CNNs) for image classification. However, real-world image data is often high-dimensional, structurally complex, and highly varied, which makes it difficult for simple CNNs to learn and generalize effectively. We therefore need a more powerful model to meet these challenges.
**AlexNet**
AlexNet was proposed by Alex Krizhevsky et al. in 2012 for the ImageNet image classification challenge (ILSVRC). It is a deep convolutional neural network (Deep Convolutional Neural Network, DCNN) whose architecture is deeper and more powerful than that of traditional CNNs.
**The Architecture of AlexNet**
The architecture of AlexNet is as follows:
* **Input layer**: 224x224 RGB image
* **Convolutional layer 1**: 96 filters of size 11x11, stride 4, padding 2, output size 55x55
* **Activation**: ReLU (Rectified Linear Unit)
* **Pooling layer**: 3x3 max pooling, stride 2, output size 27x27
* **Convolutional layer 2**: 256 filters of size 5x5, stride 1, padding 2, output size 27x27
* **Activation**: ReLU
* **Pooling layer**: 3x3 max pooling, stride 2, output size 13x13
* **Convolutional layers 3-5**: 3x3 filters (384, 384, and 256 channels), stride 1, padding 1, output size 13x13, each followed by a ReLU activation
* **Pooling layer**: 3x3 max pooling, stride 2, output size 6x6
* **Fully connected layers**: two layers of 4096 neurons each, with ReLU activations
* **Dropout**: 0.5
* **Output layer**: 1000 neurons, one per ImageNet class

The spatial output sizes listed above can be verified with the standard convolution output formula, as sketched below.
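The formula for the spatial output size of a convolution or pooling layer is floor((n + 2p - k) / s) + 1, where n is the input size, k the kernel size, s the stride, and p the padding. The short Python sketch below is my own illustration of that calculation (not from the original post); the kernel, stride, and padding values are taken from the PyTorch implementation in the next section.

```python
def out_size(n, k, s=1, p=0):
    """Spatial output size of a conv/pool layer: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * p - k) // s + 1

n = 224                     # input height/width
n = out_size(n, 11, 4, 2)   # conv1: 11x11, stride 4, padding 2 -> 55
n = out_size(n, 3, 2)       # pool1: 3x3, stride 2              -> 27
n = out_size(n, 5, 1, 2)    # conv2: 5x5, padding 2             -> 27
n = out_size(n, 3, 2)       # pool2                             -> 13
n = out_size(n, 3, 1, 1)    # conv3: 3x3, padding 1             -> 13
n = out_size(n, 3, 1, 1)    # conv4                             -> 13
n = out_size(n, 3, 1, 1)    # conv5                             -> 13
n = out_size(n, 3, 2)       # pool3                             -> 6
print(256 * n * n)          # 9216 = 256 * 6 * 6, the input size of the first fully connected layer
```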
**Implementing AlexNet**
Below is a PyTorch implementation of AlexNet:
```python
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms


class AlexNet(nn.Module):
    def __init__(self, num_classes=1000):
        super(AlexNet, self).__init__()
        # Feature extractor: five convolutional layers with three max-pooling stages
        self.features = nn.Sequential(
            nn.Conv2d(3, 96, kernel_size=11, stride=4, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(96, 256, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(256, 384, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(384, 384, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(384, 256, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=3, stride=2),
        )
        # Classifier: two 4096-unit fully connected layers with dropout, then the output layer
        self.classifier = nn.Sequential(
            nn.Linear(256 * 6 * 6, 4096),
            nn.ReLU(),
            nn.Dropout(p=0.5),
            nn.Linear(4096, 4096),
            nn.ReLU(),
            nn.Dropout(p=0.5),
            nn.Linear(4096, num_classes),
        )

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, 1)  # flatten to (batch_size, 256 * 6 * 6)
        x = self.classifier(x)
        return x


# Initialize the AlexNet model
model = AlexNet()

# Load the ImageNet dataset (the data must already be present under ./data;
# torchvision.datasets.ImageNet does not support automatic download)
transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
trainset = torchvision.datasets.ImageNet(root='./data', split='train', transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=32, shuffle=True)

# Train the AlexNet model
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

for epoch in range(10):
    for i, data in enumerate(trainloader):
        inputs, labels = data
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
```
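Before launching a full training run, it can help to sanity-check the network with random data. The snippet below is my own quick check (not part of the original code); it reuses the `model` defined above and confirms that a batch of 224x224 inputs produces one logit per ImageNet class.

```python
# Smoke test: a random batch of two 224x224 RGB images should yield shape (2, 1000).
dummy = torch.randn(2, 3, 224, 224)
with torch.no_grad():
    logits = model(dummy)
print(logits.shape)  # torch.Size([2, 1000])
```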
**Summary**
AlexNet is a representative deep convolutional neural network (DCNN). It learns image features with a stack of convolutional and pooling layers and performs classification with fully connected layers regularized by dropout. Its architecture is deeper and more powerful than that of traditional CNNs, allowing it to learn from and generalize to real-world image data effectively.
**References**
* Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems, 25.
* Simonyan, K., & Zisserman, A. (2015). Very Deep Convolutional Networks for Large-Scale Image Recognition. International Conference on Learning Representations.