Real CNN architectures can be considerably more complex, for example with branches or with outputs that are reused later in the network.
- Branching networks:

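If we kept instantiating every layer one by one as before, the code of the Net model would be full of redundancy, which is not what we want. It is easy to see that each block has a similar structure: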

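We can therefore wrap these blocks into a single class (called an Inception block) and chain the blocks together, which removes the repeated work.
Some convolution hyperparameters are hard to choose, kernel size being one of them. GoogLeNet therefore tries several of them in parallel and lets training pick whichever works best (through the sizes of the learned weights):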

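Concatenate means joining several tensors together. Average Pooling is mean pooling; you can set w and d by hand (or use padding and stride instead) so that the input and output images have the same size. The depth of each 1×1 kernel is determined by the number of channels of the input tensor.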
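The two operations above can be sketched as follows. This is a minimal illustration that assumes PyTorch is installed; the tensor sizes are made up for the example and are not taken from the notes.

```python
import torch
import torch.nn.functional as F

# Hypothetical input: batch of 2, 10 channels, 12x12 feature maps.
x = torch.randn(2, 10, 12, 12)

# 3x3 average pooling with stride 1 and padding 1 keeps width and height.
pooled = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1)

# Two tensors that agree on batch, height and width but not on channels.
a = torch.randn(2, 16, 12, 12)
b = torch.randn(2, 24, 12, 12)

# Concatenate along dim=1, the channel dimension: 16 + 24 = 40 channels.
merged = torch.cat([a, b], dim=1)

print(pooled.shape)  # torch.Size([2, 10, 12, 12])
print(merged.shape)  # torch.Size([2, 40, 12, 12])
```

Concatenation along the channel dimension only works when every other dimension matches, which is why each branch has to preserve width and height.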

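As can be seen, this fuses the information from all input channels of the feature map into one value per position. This is information fusion (think of adding up exam scores and then ranking by the total).
A 1×1 convolution can also change the number of channels, which reduces the computational cost: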

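The calculation here applies to every convolution, not only to 1×1 convolutions. The number of kernels determines the number of output channels, and the number of input channels determines the depth (number of layers) of each kernel.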
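To make the saving concrete, here is a rough count of multiplications. The sizes (192 input channels, a 28×28 feature map, 32 output channels, a 16-channel bottleneck) are assumed for illustration, and the count assumes padding keeps the output at 28×28.

```python
def conv_multiplications(k, height, width, c_in, c_out):
    """Multiplications for a k x k convolution whose output is height x width."""
    return k * k * height * width * c_in * c_out


# Assumed example sizes: 192 input channels, 28x28 feature map, 32 output channels.
direct = conv_multiplications(5, 28, 28, 192, 32)

# Same output shape, but a 1x1 convolution first reduces 192 channels to 16.
reduced = (conv_multiplications(1, 28, 28, 192, 16)
           + conv_multiplications(5, 28, 28, 16, 32))

print(direct)   # 120422400
print(reduced)  # 12443648
```

Under these assumptions the 1×1 bottleneck cuts the cost to roughly one tenth of the direct 5×5 convolution.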
Implementing the Inception Module in code
# Branch 1: the pooling branch
# This line goes in __init__
self.branch_pool = nn.Conv2d(in_channels, 24, kernel_size=1)
# These lines go in forward
branch_pool = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1)
# Feed the pooled result through the 1x1 convolution defined above
branch_pool = self.branch_pool(branch_pool)
# Branch 2: a single 1x1 convolution (definition goes in __init__)
self.branch1x1 = nn.Conv2d(in_channels, 16, kernel_size=1)
# In forward, just pass x through the 1x1 convolution
branch1x1 = self.branch1x1(x)
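# Branch 3: a 1x1 convolution followed by a 5x5 convolution
self.branch5x5_1 = nn.Conv2d(in_channels, 16, kernel_size=1)
# padding=2 keeps width and height unchanged for a 5x5 kernel
self.branch5x5_2 = nn.Conv2d(16, 24, kernel_size=5, padding=2)
branch5x5 = self.branch5x5_1(x)
branch5x5 = self.branch5x5_2(branch5x5)
# At this point batch, w and h are the same for every branch; only the channel count c differs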
# Branch 4: a 1x1 convolution followed by two 3x3 convolutions
self.branch3x3_1 = nn.Conv2d(in_channels, 16, kernel_size=1)
self.branch3x3_2 = nn.Conv2d(16, 24, kernel_size=3, padding=1)
self.branch3x3_3 = nn.Conv2d(24, 24, kernel_size=3, padding=1)
branch3x3 = self.branch3x3_1(x)
branch3x3 = self.branch3x3_2(branch3x3)
branch3x3 = self.branch3x3_3(branch3x3)
# Finally, concatenate the four branches
# Put the four branch outputs in a list and join them with torch.cat
outputs = [branch1x1, branch5x5, branch3x3, branch_pool]
# dim=1 because we concatenate along the channel dimension
return torch.cat(outputs, dim=1)
Putting the branches above into one class, together with the corresponding Net model, gives the following:
import torch
import torch.nn as nn
import torch.nn.functional as F


class InceptionA(nn.Module):
    def __init__(self, in_channels):
        super(InceptionA, self).__init__()
        # The input channel count is left as a parameter so the block can be reused
        self.branch1x1 = nn.Conv2d(in_channels, 16, kernel_size=1)

        self.branch5x5_1 = nn.Conv2d(in_channels, 16, kernel_size=1)
        self.branch5x5_2 = nn.Conv2d(16, 24, kernel_size=5, padding=2)

        self.branch3x3_1 = nn.Conv2d(in_channels, 16, kernel_size=1)
        self.branch3x3_2 = nn.Conv2d(16, 24, kernel_size=3, padding=1)
        self.branch3x3_3 = nn.Conv2d(24, 24, kernel_size=3, padding=1)

        self.branch_pool = nn.Conv2d(in_channels, 24, kernel_size=1)

    def forward(self, x):
        branch1x1 = self.branch1x1(x)

        branch5x5 = self.branch5x5_1(x)
        branch5x5 = self.branch5x5_2(branch5x5)

        branch3x3 = self.branch3x3_1(x)
        branch3x3 = self.branch3x3_2(branch3x3)
        branch3x3 = self.branch3x3_3(branch3x3)

        branch_pool = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1)
        branch_pool = self.branch_pool(branch_pool)

        # Concatenate along the channel dimension: 16 + 24 + 24 + 24 = 88 channels
        outputs = [branch1x1, branch5x5, branch3x3, branch_pool]
        return torch.cat(outputs, dim=1)
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        # 24 * 3 + 16 = 88, so each Inception block outputs 88 channels
        self.conv2 = nn.Conv2d(88, 20, kernel_size=5)

        self.incep1 = InceptionA(in_channels=10)
        self.incep2 = InceptionA(in_channels=20)

        self.mp = nn.MaxPool2d(2)
        # 1408 = 88 * 4 * 4, the number of elements left after an MNIST image passes through the network
        self.fc = nn.Linear(1408, 10)

    def forward(self, x):
        in_size = x.size(0)
        x = F.relu(self.mp(self.conv1(x)))
        x = self.incep1(x)
        x = F.relu(self.mp(self.conv2(x)))
        x = self.incep2(x)
        # Flatten to (batch, features) before the fully connected layer
        x = x.view(in_size, -1)
        x = self.fc(x)
        return x
Running the code above on the MNIST dataset gives the following result:

As can be seen, training for too many epochs also leads to overfitting. We can save the model with the highest accuracy to disk so it can be used later.
- ResNet:
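Saving the best model could look roughly like the sketch below. The helper name `save_if_best`, the stand-in `nn.Linear` model, the file name, and the accuracy values are all hypothetical placeholders, not results from the notes; in practice the accuracy would come from your own test loop.

```python
import os
import tempfile

import torch
import torch.nn as nn


def save_if_best(model, accuracy, best_accuracy, path):
    """Save the model weights when accuracy beats the best seen so far.

    Returns the (possibly updated) best accuracy.
    """
    if accuracy > best_accuracy:
        torch.save(model.state_dict(), path)
        return accuracy
    return best_accuracy


# Stand-in model and placeholder accuracies, NOT measured results.
model = nn.Linear(4, 2)
path = os.path.join(tempfile.mkdtemp(), "best_model.pt")

best = 0.0
for epoch_accuracy in [0.90, 0.95, 0.93]:
    best = save_if_best(model, epoch_accuracy, best, path)

# Later: restore the best weights into a model of the same architecture.
restored = nn.Linear(4, 2)
restored.load_state_dict(torch.load(path))
```

Saving `state_dict()` rather than the whole model object keeps the file independent of how the class is defined on disk, at the cost of having to rebuild the architecture before loading.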

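As can be seen, if we simply keep stacking 3×3 convolutions, the 20-layer network performs better than the 56-layer one.
Vanishing gradients: backpropagation multiplies many gradients together, and when they are all smaller than 1 the product approaches 0. The weights of the early layers then barely get updated, so they cannot be trained effectively.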
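A tiny numeric illustration of that product, using an assumed local gradient of 0.5 at every layer (a made-up value chosen only to show the trend):

```python
def chained_gradient(local_gradient, depth):
    """Product of `depth` identical local gradients, as in the chain rule."""
    result = 1.0
    for _ in range(depth):
        result *= local_gradient
    return result


# Assumed local gradient of 0.5 per layer, for 20 and 56 layers.
plain_20 = chained_gradient(0.5, 20)
plain_56 = chained_gradient(0.5, 56)

print(plain_20)
print(plain_56)
```

With 20 layers the product is already below one millionth, and with 56 layers it is many orders of magnitude smaller still.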

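ResNet adds the original x to the result of the convolutions and only then applies the activation. Writing the block as H(x) = F(x) + x, the derivative makes the benefit easy to see: ∂H/∂x = ∂F/∂x + 1, so even when ∂F/∂x is very small the gradient stays close to 1 and the product does not tend to 0.
In practice, the implementation only needs one addition with x at the end.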
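The "+ 1" in the gradient can be checked with autograd. This is a small sketch assuming PyTorch; the weight 0.001 is a hypothetical stand-in for a layer whose own gradient is tiny.

```python
import torch

x = torch.tensor(2.0, requires_grad=True)
w = torch.tensor(0.001)  # hypothetical tiny weight, so dF/dx is tiny

f = w * x   # F(x), with dF/dx = 0.001
h = f + x   # H(x) = F(x) + x, the residual form
h.backward()

# The gradient of H with respect to x is dF/dx + 1, about 1.001.
print(x.grad)
```

Without the added x the gradient would be 0.001; with it, the gradient stays near 1.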

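ResNet contains many skip connections. Some of them are drawn as dashed lines in the enlarged view because the input x and the output tensor have different dimensions there, so x needs separate handling, such as a pooling layer, before the addition.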
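One way to handle such a mismatch is sketched below. Note the swap: instead of the pooling mentioned above, this sketch reshapes x with a strided 1×1 convolution on the shortcut, which is a common alternative. The class name `DownsampleResidualBlock` and all sizes are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DownsampleResidualBlock(nn.Module):
    """Hypothetical block whose output has more channels and half the size."""

    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, out_channels,
                               kernel_size=3, stride=2, padding=1)
        self.conv2 = nn.Conv2d(out_channels, out_channels,
                               kernel_size=3, padding=1)
        # Assumption: a strided 1x1 convolution reshapes x to match the main path.
        self.shortcut = nn.Conv2d(in_channels, out_channels,
                                  kernel_size=1, stride=2)

    def forward(self, x):
        y = F.relu(self.conv1(x))
        y = self.conv2(y)
        # Both terms are now (batch, out_channels, h/2, w/2), so they can be added.
        return F.relu(self.shortcut(x) + y)


block = DownsampleResidualBlock(16, 32)
out = block(torch.randn(2, 16, 8, 8))
print(out.shape)  # torch.Size([2, 32, 4, 4])
```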

Code implementation:
class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super(ResidualBlock, self).__init__()
        self.channels = channels
        # padding=1 keeps the image size unchanged; input and output channel counts must be equal
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x):
        y = F.relu(self.conv1(x))
        y = self.conv2(y)
        # Remember to add x before the final activation
        return F.relu(x + y)
With that, the complete model class can be written:
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # Channel sizes are chosen to match ResidualBlock(16), ResidualBlock(32)
        # and the 512 inputs of fc on 28x28 MNIST images (32 * 4 * 4 = 512)
        self.conv1 = nn.Conv2d(1, 16, kernel_size=5)
        self.conv2 = nn.Conv2d(16, 32, kernel_size=5)
        self.mp = nn.MaxPool2d(2)

        self.rblock1 = ResidualBlock(16)
        self.rblock2 = ResidualBlock(32)

        self.fc = nn.Linear(512, 10)

    def forward(self, x):
        in_size = x.size(0)
        x = self.mp(F.relu(self.conv1(x)))
        x = self.rblock1(x)
        x = self.mp(F.relu(self.conv2(x)))
        x = self.rblock2(x)
        # Flatten to (batch, features) before the fully connected layer
        x = x.view(in_size, -1)
        x = self.fc(x)
        return x
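Remember: if the network structure is very complex, wrap the repeated part in a new class, as ResidualBlock does here.
A simple way to test: keep only the first n lines of forward and comment out the rest, then check whether the output matches what you expect. If it does, keep one more line and check again.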
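The incremental check could look like the sketch below, assuming PyTorch and an MNIST-shaped dummy batch. The layer sizes (1 to 16 and 16 to 32 channels, 5×5 kernels) are assumptions chosen to end at 512 features; substitute the layers of your own forward.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Assumed MNIST-shaped dummy batch: 4 grayscale images of 28x28.
x = torch.randn(4, 1, 28, 28)

conv1 = nn.Conv2d(1, 16, kernel_size=5)
conv2 = nn.Conv2d(16, 32, kernel_size=5)
mp = nn.MaxPool2d(2)

# Step 1: keep only the first line of forward and check its shape.
x = mp(F.relu(conv1(x)))
assert x.shape == (4, 16, 12, 12)

# Step 2: add the next line and check again.
x = mp(F.relu(conv2(x)))
assert x.shape == (4, 32, 4, 4)

# Step 3: flatten; 32 * 4 * 4 = 512 is what the final Linear layer must accept.
flat = x.view(x.size(0), -1)
assert flat.shape == (4, 512)
```

Checking one step at a time pinpoints the first layer whose output shape differs from what you expected, which is usually where the bug is.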

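The road ahead
- Study the principles of deep learning at the theoretical level (read more books);
- Read the PyTorch documentation. It is not that long, so read through it at least once;
- Reproduce classic models: read other people's code and learn from it, then write your own. After that you can pick a specific area that interests you and go deeper (if you cannot reproduce a result, go back to the original code and learn from its techniques);
- Broaden your horizons: keep closing the gaps in your knowledge so that, in time, you can assemble models of your own;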