Lecture 07

2023-11-21

Recap
Today's topic: Convolutional Neural Networks (CNNs)
CNN Visualization
Image representation:
- A 1920x1080 RGB image has 1920 x 1080 x 3 = 6,220,800 values
- Feeding such an image to an MLP with one hidden layer of 100 units requires 100 x 6,220,800 = 622,080,000 weights in the first layer alone!
- The model should also be robust to pixel translation, noise, etc.
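The arithmetic above can be checked with a few lines of Python. The convolutional comparison at the end is an assumption added for contrast (100 kernels of size 3x3, biases ignored); it is not a configuration taken from the lecture.

```python
# First-layer weight count for a fully connected layer on a raw image.
width, height, channels = 1920, 1080, 3
hidden_units = 100

input_values = width * height * channels      # one value per pixel per channel
mlp_weights = input_values * hidden_units     # every input connects to every unit

# Hypothetical contrast: a convolutional layer with 100 kernels of size 3x3.
# Its weight count does not depend on the image size at all.
kernel_size = 3
num_kernels = 100
conv_weights = kernel_size * kernel_size * channels * num_kernels

print(input_values)   # 6220800
print(mlp_weights)    # 622080000
print(conv_weights)   # 2700
```

The gap between the last two numbers is the main motivation for sharing small kernels across the whole image.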
Convolutional network architecture
- CNN = feature learning + classification
- Feature learning = convolution + pooling
  - Convolution typically makes the data deeper (increases the number of channels)
  - Pooling shrinks the width and height
  - The number of output channels equals the number of kernels
- Analogy to the human brain
- Components summary: kernel size, stride, pooling method, ...
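As a small illustration of "pooling shrinks width and height", here is a minimal sketch of max pooling on a single channel, written with plain Python lists. The function name `max_pool2d` and the 2x2 window with stride 2 are assumptions chosen for the example; other pooling methods (e.g. average pooling) would replace `max` below.

```python
def max_pool2d(image, window=2, stride=2):
    """Max pooling over one channel given as a 2D list of numbers."""
    rows = (len(image) - window) // stride + 1
    cols = (len(image[0]) - window) // stride + 1
    return [
        [
            # Largest value inside the window whose top-left corner
            # sits at (r * stride, c * stride).
            max(
                image[r * stride + i][c * stride + j]
                for i in range(window)
                for j in range(window)
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


feature_map = [
    [1, 3, 2, 0],
    [4, 6, 1, 2],
    [7, 2, 9, 4],
    [0, 1, 3, 5],
]
pooled = max_pool2d(feature_map)
print(pooled)  # [[6, 2], [7, 9]]
```

The 4x4 map becomes 2x2. Pooling is applied to each channel separately, so the number of channels stays the same; only convolution (through the number of kernels) changes it.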
Filter types:
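- median filter: used for denoising; noise usually shows up as extreme values, so taking the median avoids it
- mean filter: used to produce a smoothed image
- gaussian filter: used to produce an even smoother image
- edge filter: used to extract edge pixels
- learnable filter: lets the weights adapt automatically to the shape of the features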
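The hand-designed filters above can be tried on tiny examples. This is a minimal sketch with plain Python lists: `filter2d` and `median_filter` are hypothetical helper names, the edge kernel is a Sobel kernel chosen as one common example of an edge filter, and the sliding window does not flip the kernel (cross-correlation, as deep learning libraries usually do).

```python
from statistics import median


def filter2d(image, kernel):
    """Slide a square kernel over a 2D image (stride 1, no padding)
    and sum the element-wise products at each position."""
    k = len(kernel)
    rows = len(image) - k + 1
    cols = len(image[0]) - k + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(k) for j in range(k))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def median_filter(image, size=3):
    """Replace each window by its median (not expressible as a kernel)."""
    rows = len(image) - size + 1
    cols = len(image[0]) - size + 1
    return [
        [
            median(image[r + i][c + j]
                   for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# A flat patch with one extreme "noise" pixel in the middle.
noisy = [[10, 10, 10], [10, 255, 10], [10, 10, 10]]
mean_kernel = [[1 / 9] * 3 for _ in range(3)]
mean_value = filter2d(noisy, mean_kernel)[0][0]   # about 37.2, noise leaks in
median_value = median_filter(noisy)[0][0]         # 10, noise is ignored

# A vertical edge between dark (0) and bright (10) columns.
edge_image = [[0, 0, 0, 10, 10, 10]] * 3
sobel_x = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
edge_response = filter2d(edge_image, sobel_x)     # [[0, 40, 40, 0]]

print(mean_value, median_value, edge_response)
```

A Gaussian filter would reuse `filter2d` with a kernel that weights the centre more than the border. A learnable filter has the same form, except that the kernel entries are trained instead of written by hand.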
Convolutional layer
Relationship among layer sizes
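The notes do not spell the relationship out, so the sketch below uses the standard formula as an assumption: for input size n, kernel size k, stride s and padding p, the output size is floor((n + 2p - k) / s) + 1, and the same formula covers pooling windows. The layer stack in the trace is a hypothetical example, not one from the lecture.

```python
def output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


# Hypothetical stack on a 32x32 input: (name, kernel, stride, padding).
layers = [
    ("conv 5x5", 5, 1, 0),
    ("pool 2x2", 2, 2, 0),
    ("conv 5x5", 5, 1, 0),
    ("pool 2x2", 2, 2, 0),
]

size = 32
trace = [size]
for name, kernel, stride, padding in layers:
    size = output_size(size, kernel, stride, padding)
    trace.append(size)

print(trace)  # [32, 28, 14, 10, 5]
```

Width and height follow this formula, while the depth of each convolution output is simply the number of kernels in that layer.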
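Feature locality: a convolution extracts features from a small area. As the network gets deeper, each neuron still looks at only a small neighbourhood of the previous layer, but that neighbourhood corresponds to a larger and larger region of the input image, so the captured features become more and more "global"/"abstract".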
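How quickly the "local" area grows can be estimated with the usual receptive-field recurrence: each layer adds (kernel - 1) times the current step between neighbouring neurons (measured in input pixels), and each stride multiplies that step. This is a minimal sketch; the function name and the example stack are assumptions for illustration.

```python
def receptive_fields(layers):
    """Receptive field (in input pixels, along one axis) after each layer.

    layers: list of (kernel, stride) pairs, in order from input to output.
    """
    rf = 1      # a single input pixel sees only itself
    jump = 1    # distance in input pixels between neighbouring neurons
    sizes = []
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
        sizes.append(rf)
    return sizes


# Hypothetical stack: conv 3x3, conv 3x3, pool 2x2 (stride 2), conv 3x3.
stack = [(3, 1), (3, 1), (2, 2), (3, 1)]
fields = receptive_fields(stack)
print(fields)  # [3, 5, 6, 10]
```

Every kernel here is at most 3 pixels wide, yet a neuron in the last layer already depends on a 10x10 region of the input.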
