
Deep Learning

CNN Architectures

  • AlexNet
    • Max pooling, ReLU nonlinearity
    • More data and a bigger model (7 hidden layers, 650K units, 60M params)
    • GPU implementation (50x speedup over CPU)
      • Trained on two GPUs for a week
    • Dropout regularization
    • ~61M parameters in total (see the parameter-count sketch below)
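
As a quick sanity check on that parameter count, here is a minimal sketch using torchvision's built-in AlexNet (assuming PyTorch and torchvision are installed); the exact total depends on the implementation, but with 1000 output classes it lands around 61M.

```python
import torchvision.models as models

# Instantiate AlexNet without pretrained weights; we only need the parameter count.
alexnet = models.alexnet()

# Sum the number of elements in every learnable tensor (~61M in total).
n_params = sum(p.numel() for p in alexnet.parameters())
print(f"AlexNet parameters: {n_params / 1e6:.1f}M")
```
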
  • VGG Net
    • Small filters, Deeper networks
    • AlexNet (8 layers) vs. VGG16/VGG19 (16-19 layers)
    • Only 3x3 convolutions (stride 1, pad 1) and 2x2 max pooling (stride 2)
    • Why 3x3 stacks?
      • Stacking small convolution layers grows the effective receptive field:
        • Two 3x3 layers => 5x5 receptive field
        • Three 3x3 layers => 7x7 receptive field
      • More non-linearity
      • Fewer parameters to learn (still ~140M for the whole network)
      • ** A stack of smaller convolution layers has the same effective receptive field as a single larger convolution layer,
      • ** but it is deeper, has more non-linearities, and uses fewer parameters (see the comparison sketch below)
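
To make the parameter argument concrete, here is a minimal sketch (PyTorch, with a made-up channel count C = 64) comparing one 7x7 convolution against three stacked 3x3 convolutions with the same effective receptive field.

```python
import torch.nn as nn

C = 64  # hypothetical channel count (same for input and output)

def count_params(m: nn.Module) -> int:
    return sum(p.numel() for p in m.parameters())

# One 7x7 convolution: 7*7*C*C weights + C biases
single_7x7 = nn.Conv2d(C, C, kernel_size=7, padding=3)

# Three stacked 3x3 convolutions: 3 * (3*3*C*C) weights, same 7x7 receptive field,
# plus two extra ReLU non-linearities in between.
stacked_3x3 = nn.Sequential(
    nn.Conv2d(C, C, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(C, C, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(C, C, kernel_size=3, padding=1),
)

print(count_params(single_7x7))   # 200,768  (49*C*C + C)
print(count_params(stacked_3x3))  # 110,784  (27*C*C + 3*C)
```
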
  • GoogLeNet
    • No need to decide which single type of convolution to use at each layer: the Inception module applies each convolution in parallel and concatenates the resulting feature maps before passing them on to the next layer (sketched below)
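
A minimal sketch of that idea (channel sizes are made up for illustration, not GoogLeNet's exact configuration): every branch runs on the same input in parallel and the feature maps are concatenated along the channel dimension.

```python
import torch
import torch.nn as nn

class NaiveInception(nn.Module):
    """Run several convolution types in parallel and concatenate their feature maps.
    Channel sizes here are illustrative, not the paper's values."""
    def __init__(self, in_ch):
        super().__init__()
        self.branch1x1 = nn.Conv2d(in_ch, 64, kernel_size=1)
        self.branch3x3 = nn.Conv2d(in_ch, 128, kernel_size=3, padding=1)
        self.branch5x5 = nn.Conv2d(in_ch, 32, kernel_size=5, padding=2)
        self.branch_pool = nn.MaxPool2d(kernel_size=3, stride=1, padding=1)

    def forward(self, x):
        outs = [self.branch1x1(x), self.branch3x3(x),
                self.branch5x5(x), self.branch_pool(x)]
        return torch.cat(outs, dim=1)  # concatenate along the channel dimension

x = torch.randn(1, 192, 28, 28)
print(NaiveInception(192)(x).shape)  # (1, 64 + 128 + 32 + 192, 28, 28) = (1, 416, 28, 28)
```
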
  • ResNet
    • Simply making the network deeper makes the performance worse (degradation), which is quite counter-intuitive
    • Because deeper networks fail to learn even the identity mapping
    • So, use identity mapping!
      • Rather than just expecting the deeper network to learn the identity mapping as a new function, we can give it a hint.
    • First trial
      • Case 1. When the convolution keeps its input dimension (256 -> 256)
        • Merge: add the convolution output back to the input X
        • Assuming the 3x3 convolution outputs the same spatial size as its input
          • Without a non-linear function like ReLU, the process would be meaningless (stacked linear layers collapse into a single linear map)
          • But because of ReLU, the value range of the ReLU output (non-negative) differs from that of X
        • Solution: add one more 3x3 convolution layer after the ReLU before merging (see the sketch below)
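
A minimal sketch of this two-convolution residual block for the same-dimension case (the published ResNet blocks also include batch normalization after each convolution, omitted here for brevity):

```python
import torch
import torch.nn as nn

class BasicResidualBlock(nn.Module):
    """Two 3x3 convolutions plus an identity shortcut; input and output share
    the same spatial and channel dimensions (e.g. 256 -> 256)."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.relu(self.conv1(x))   # non-linearity between the two convs
        out = self.conv2(out)            # second conv: output is no longer clipped to non-negative values
        return self.relu(out + x)        # add the identity (skip) connection, then ReLU

x = torch.randn(1, 256, 28, 28)
print(BasicResidualBlock(256)(x).shape)  # (1, 256, 28, 28)
```
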
      • Case 2. When the convolution changes the spatial dimension
        • 28x28 => 3x3, stride 2 => 14x14
        • When we try the identity mapping, the spatial dimensions don't match
        • So when the layer changes the spatial size, the identity path is resized as well (using a 1x1 layer)
      • Case 3. What if we want the channel dimension to change?
        • 256x28x28 => 3x3,32 => 32x28x28
        • Identity path: a 1x1 convolution with 32 filters (stride /2 only when the spatial size is also halved), as sketched below
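
A minimal sketch covering Cases 2 and 3 (again without batch normalization): when the block changes the channel count and/or the spatial size, the shortcut uses a 1x1 convolution, with stride 2 when the spatial size is halved, so the two tensors can still be added. Shapes follow the examples above.

```python
import torch
import torch.nn as nn

class ProjectionResidualBlock(nn.Module):
    """Residual block whose shortcut uses a 1x1 convolution so the identity
    path matches the main path in both channel count and spatial size."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=stride, padding=1)
        self.conv2 = nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1)
        self.relu = nn.ReLU()
        # Projection shortcut: 1x1 conv changes channels (and stride halves the spatial size).
        self.shortcut = nn.Conv2d(in_ch, out_ch, kernel_size=1, stride=stride)

    def forward(self, x):
        out = self.conv2(self.relu(self.conv1(x)))
        return self.relu(out + self.shortcut(x))

x = torch.randn(1, 256, 28, 28)

# Case 3 from the notes: 256x28x28 -> 32x28x28 (channels change, spatial size kept)
print(ProjectionResidualBlock(256, 32, stride=1)(x).shape)  # (1, 32, 28, 28)

# Case 2: halving the spatial size as well (28x28 -> 14x14)
print(ProjectionResidualBlock(256, 32, stride=2)(x).shape)  # (1, 32, 14, 14)
```
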
      • Case 4. However, if we keep stacking residual layers very deeply, the computation cost also increases
      • “Reduce the computation”
        • From ResNet-50 onward, use a 1x1 - 3x3 - 1x1 bottleneck block (sketched below)
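
A minimal sketch of that bottleneck block (channel sizes follow the typical 256 -> 64 -> 64 -> 256 pattern): the first 1x1 convolution reduces the channel count before the expensive 3x3, and the last 1x1 restores it so the identity shortcut still adds up.

```python
import torch
import torch.nn as nn

class BottleneckBlock(nn.Module):
    """1x1 -> 3x3 -> 1x1 residual block: the 3x3 convolution only runs on the
    reduced channel count, which cuts the computation cost."""
    def __init__(self, channels, reduced):
        super().__init__()
        self.reduce = nn.Conv2d(channels, reduced, kernel_size=1)    # e.g. 256 -> 64
        self.conv3x3 = nn.Conv2d(reduced, reduced, kernel_size=3, padding=1)
        self.expand = nn.Conv2d(reduced, channels, kernel_size=1)    # e.g. 64 -> 256
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.relu(self.reduce(x))
        out = self.relu(self.conv3x3(out))
        out = self.expand(out)
        return self.relu(out + x)  # identity shortcut, dimensions unchanged

x = torch.randn(1, 256, 28, 28)
print(BottleneckBlock(256, 64)(x).shape)  # (1, 256, 28, 28)
```
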
    • Shortcut connection 1
      • The residual block can be applied directly when the input and output have the same spatial and channel dimensions
    • Shortcut connection 2 : Channel dimension increase

To be Continued . . .
