Convolutional Layer

Motivating Convolutional Networks

  • Up until now, we've convolved an image with two different filters to get a new image
  • Specifically, our convolution process has the following properties:

    • The image and each filter are represented as 3D volumes
    • Each channel in the new image corresponds to a filter output
  • Now, we'll see how convolution can relate to neural networks

A Single Layer of a Convolutional Network

(figure: a single layer of a convolutional network)

Benefit of Convolutional Networks

  • The number of learnable parameters depends on the following:

    • The number of filters in our network
    • The dimensions of each filter
  • Therefore, the size of an input image could grow extremely large, but the number of parameters will remain fixed
  • Meaning, convolutional networks are less prone to overfitting
  • For example, if we have 10 filters that are 3 \times 3 \times 3 in one layer, then each filter has 3 \times 3 \times 3 = 27 weights, giving the following number of parameters:
270 \text{ weights} + 10 \text{ biases} = 280 \text{ total parameters}
  • In this case, we could have a 1000 \times 1000 or a 5000 \times 5000 image, but the number of parameters remains fixed at 280
  • We just need to learn the weights and biases of these 10 filters to detect vertical edges, horizontal edges, and other features
  • Then, we can apply these filters to very large images, while the number of parameters always remains fixed and relatively small
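The parameter count above can be checked with a few lines of Python (the helper name is ours, for illustration):

```python
# Parameter count for one convolutional layer: each of the
# n_filters filters has f*f*n_c_prev weights plus one bias.
def conv_layer_params(n_filters, f, n_c_prev):
    weights = n_filters * f * f * n_c_prev
    biases = n_filters
    return weights + biases

# 10 filters of size 3x3x3: 270 weights + 10 biases = 280 parameters,
# regardless of whether the input image is 1000x1000 or 5000x5000.
print(conv_layer_params(10, 3, 3))  # → 280
```

Note that the input image's size appears nowhere in the formula, which is exactly why the count stays fixed.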

Summarizing Notation

  • General notation:

    • l: The l^{th} convolution layer in a network
    • n_{h}^{[l]}: The height of the output of the l^{th} layer
    n_{h}^{[l]} = \lfloor \frac{n_{h}^{[l-1]}+2p^{[l]}-f^{[l]}}{s^{[l]}} + 1 \rfloor
    • n_{w}^{[l]}: The width of the output of the l^{th} layer
    n_{w}^{[l]} = \lfloor \frac{n_{w}^{[l-1]}+2p^{[l]}-f^{[l]}}{s^{[l]}} + 1 \rfloor
  • Hyperparameters:

    • f^{[l]}: The size of the filter in the l^{th} layer
    • p^{[l]}: The amount of padding in the l^{th} layer
    • s^{[l]}: The stride in the l^{th} layer
    • n_{c}^{[l]}: The number of filters in the l^{th} layer
  • Dimensions:

    • input: n_{h}^{[l-1]} \times n_{w}^{[l-1]} \times n_{c}^{[l-1]}
    • filter: f^{[l]} \times f^{[l]} \times n_{c}^{[l-1]}
    • output: n_{h}^{[l]} \times n_{w}^{[l]} \times n_{c}^{[l]}
    • a^{[l]}: n_{h}^{[l]} \times n_{w}^{[l]} \times n_{c}^{[l]}
    • w^{[l]}: f^{[l]} \times f^{[l]} \times n_{c}^{[l-1]} \times n_{c}^{[l]}
    • b^{[l]}: 1 \times 1 \times 1 \times n_{c}^{[l]}
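These shapes can be sketched in Python (a minimal helper of our own naming, built directly from the formulas above):

```python
from math import floor

# Shapes for one conv layer, following the notation above.
# Takes the previous layer's output size (n_h_prev, n_w_prev, n_c_prev)
# and this layer's hyperparameters f, p, s, n_c.
def conv_layer_shapes(n_h_prev, n_w_prev, n_c_prev, f, p, s, n_c):
    n_h = floor((n_h_prev + 2 * p - f) / s + 1)
    n_w = floor((n_w_prev + 2 * p - f) / s + 1)
    return {
        "input": (n_h_prev, n_w_prev, n_c_prev),
        "filter": (f, f, n_c_prev),
        "output (a)": (n_h, n_w, n_c),
        "w": (f, f, n_c_prev, n_c),
        "b": (1, 1, 1, n_c),
    }

# 39x39x3 input, ten 3x3 filters, no padding, stride 1
shapes = conv_layer_shapes(39, 39, 3, f=3, p=0, s=1, n_c=10)
print(shapes["output (a)"])  # → (37, 37, 10)
```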

Implementing Layers in a Convolutional Network

  • An output image is used as the input image in the next layer
  • Typically, the dimensions of an output image shrink as we go deeper into our network
  • On the other hand, the number of channels tends to increase as we go deeper into our network
  • In other words, filters deeper within our network tend to have more channels
  • Also, there are three types of layers in a convolutional network:

    • Convolutional layers
    • Pooling layers
    • Fully connected layers
  • So far, we've only used convolutional layers in our examples
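The shrinking spatial dimensions and growing channel counts can be traced layer by layer with the output-size formula; the hyperparameters below are illustrative, not taken from any particular network:

```python
from math import floor

# Output height/width of a conv layer: floor((n + 2p - f)/s + 1)
def conv_output_dim(n_prev, f, p, s):
    return floor((n_prev + 2 * p - f) / s + 1)

# Hypothetical hyperparameters per layer:
# (filter size f, padding p, stride s, number of filters n_c)
layers = [(3, 0, 1, 10), (5, 0, 2, 20), (5, 0, 2, 40)]

n_h, n_c = 39, 3  # a 39x39x3 input image
for f, p, s, n_c_next in layers:
    n_h = conv_output_dim(n_h, f, p, s)
    n_c = n_c_next
    print(f"{n_h}x{n_h}x{n_c}")
# prints 37x37x10, then 17x17x20, then 7x7x40:
# spatial dimensions shrink while channels grow
```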

Example of Network using Convolutional Layers

(figure: example network using convolutional layers)


TLDR

  • In convolutional networks, we can relate our input image to a^{[l-1]}
  • We can also relate the filters to w^{[l]}
  • Then, the convolved image relates to w^{[l]} a^{[l-1]}
  • Applying a bias term and an activation function to the convolved image gives the activations a^{[l]}
  • This output can then be fed forward into the next convolutional layer
  • Typically, the dimensions of an output image shrink as we go deeper into our network
  • On the other hand, the number of channels tends to increase as we go deeper into our network
  • In other words, filters deeper within our network tend to have more channels
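The steps above can be sketched as one layer's forward pass in NumPy. This is a naive loop-based version for clarity (stride 1, no padding), not the vectorized implementation real frameworks use, and the function name and random inputs are ours:

```python
import numpy as np

# One conv layer's forward pass: a[l] = g(w[l] * a[l-1] + b[l])
def conv_layer_forward(a_prev, w, b):
    f = w.shape[0]                  # filter size
    n_c = w.shape[3]                # number of filters
    n_h = a_prev.shape[0] - f + 1   # stride 1, no padding
    n_w = a_prev.shape[1] - f + 1
    z = np.zeros((n_h, n_w, n_c))
    for i in range(n_h):
        for j in range(n_w):
            for c in range(n_c):
                # convolve one filter with one image patch, then add the bias
                z[i, j, c] = np.sum(a_prev[i:i+f, j:j+f, :] * w[:, :, :, c]) + b[c]
    return np.maximum(z, 0)         # ReLU activation g

a_prev = np.random.randn(39, 39, 3)   # input image a[l-1]
w = np.random.randn(3, 3, 3, 10)      # ten 3x3x3 filters w[l]
b = np.random.randn(10)               # one bias per filter b[l]
a = conv_layer_forward(a_prev, w, b)
print(a.shape)  # → (37, 37, 10)
```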
