Lecture 07
2023-11-21
- Recap
- Today's topic: Convolutional Neural Networks (CNNs)
- CNN Visualization
- Image representation:
- A 1920x1080 RGB image has 1920 x 1080 x 3 = 6,220,800 values
- Feeding such an image to an MLP with one hidden layer of 100 units requires 100 x 6,220,800 = 622,080,000 weights (not counting biases)!
- The model should also be robust to pixel translation, noise, etc.
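The arithmetic above can be checked with a short sketch. The convolutional comparison at the end is an assumption added for illustration (100 kernels of size 3x3 are not from the lecture); it only shows why sharing a small kernel across the image needs far fewer weights.

```python
# Input size of a 1920x1080 RGB image (3 color channels).
width, height, channels = 1920, 1080, 3
n_inputs = width * height * channels

# Fully connected hidden layer with 100 units:
# one weight per (input value, hidden unit) pair, biases ignored.
hidden_units = 100
mlp_weights = n_inputs * hidden_units

# Hypothetical comparison: a convolutional layer with 100 kernels of size 3x3.
# Each kernel spans all input channels and is reused at every image position.
n_kernels, kernel_size = 100, 3
conv_weights = n_kernels * kernel_size * kernel_size * channels

print(f"input values : {n_inputs:,}")
print(f"MLP weights  : {mlp_weights:,}")
print(f"conv weights : {conv_weights:,}")
```

The weight count of the convolutional layer does not depend on the image resolution, whereas the MLP count grows linearly with it.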
- Convolutional network architecture
- CNN = Feature learning + classification
- Feature learning = convolution + pooling
- convolution makes the data deeper (increases the number of channels)
- pooling shrinks the width and height
- the number of output channels equals the number of kernels in the layer
- Analogy to the human brain
- Components summary: kernel size, stride, pooling method, ...
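As a rough illustration of how convolution deepens the data while pooling shrinks it, the sketch below tracks only the tensor shape through a small stack of layers. The helper names `conv_shape` and `pool_shape`, the 32x32 input, and the layer settings are all assumptions chosen for the example. The size rule used is the standard one: output = floor((input + 2 x padding - kernel) / stride) + 1.

```python
def conv_shape(h, w, c, n_kernels, kernel, stride=1, padding=0):
    """Shape after a convolution: the channel count becomes the number of kernels."""
    out_h = (h + 2 * padding - kernel) // stride + 1
    out_w = (w + 2 * padding - kernel) // stride + 1
    return out_h, out_w, n_kernels


def pool_shape(h, w, c, size=2, stride=2):
    """Shape after pooling: width and height shrink, channels are unchanged."""
    out_h = (h - size) // stride + 1
    out_w = (w - size) // stride + 1
    return out_h, out_w, c


# Hypothetical network: (height, width, channels) of a small RGB input.
shape0 = (32, 32, 3)
shape1 = conv_shape(*shape0, n_kernels=16, kernel=3, padding=1)  # (32, 32, 16)
shape2 = pool_shape(*shape1)                                     # (16, 16, 16)
shape3 = conv_shape(*shape2, n_kernels=32, kernel=3, padding=1)  # (16, 16, 32)
shape4 = pool_shape(*shape3)                                     # (8, 8, 32)

for name, shape in [("input", shape0), ("conv1", shape1), ("pool1", shape2),
                    ("conv2", shape3), ("pool2", shape4)]:
    print(f"{name:6s} {shape}")
```

With padding 1 and a 3x3 kernel the convolution keeps width and height, so only the pooling layers reduce the spatial size in this configuration.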
- Filter types:
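- median filter: used for denoising; noise usually shows up as extreme values, so taking the median of the neighborhood avoids it
- mean filter: used to produce a smoothed image
- gaussian filter: used to produce an even smoother image
- edge filter: used to extract the pixels that lie on edges
- learnable filter: lets the weights adapt automatically to match the shape of the features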
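The fixed filters listed above can be sketched on a 1D signal, which keeps the code short; images apply the same idea over 2D neighborhoods. The signal, the window size of 3, the kernel weights, and the helper names are assumptions made for this example. A learnable filter would use the same weighted-sum operation, with the kernel values found by training instead of being written by hand.

```python
from statistics import median


def sliding_windows(signal, size):
    """All contiguous windows of length `size` (no padding at the borders)."""
    return [signal[i:i + size] for i in range(len(signal) - size + 1)]


def median_filter(signal, size=3):
    """Replace each window by its median; extreme values are discarded."""
    return [median(window) for window in sliding_windows(signal, size)]


def correlate(signal, kernel):
    """Weighted sum of each window with the kernel (the operation a CNN layer applies)."""
    return [sum(x * k for x, k in zip(window, kernel))
            for window in sliding_windows(signal, len(kernel))]


# Flat region with one noise spike (255), followed by a step from 10 to 50.
signal = [10, 10, 10, 255, 10, 10, 50, 50, 50]

mean_kernel = [1 / 3, 1 / 3, 1 / 3]   # equal weights
gaussian_kernel = [0.25, 0.5, 0.25]   # center-weighted, Gaussian-like
edge_kernel = [-1, 0, 1]              # difference between right and left neighbor

denoised = median_filter(signal)            # spike removed, step preserved
smoothed = correlate(signal, mean_kernel)   # spike spread over its neighbors
gauss = correlate(signal, gaussian_kernel)
edges = correlate(denoised, edge_kernel)    # nonzero only around the step

print("median  :", denoised)
print("mean    :", [round(v, 2) for v in smoothed])
print("gaussian:", gauss)
print("edges   :", edges)
```

In this example the median filter removes the spike completely, while the mean filter only spreads it over neighboring positions, which matches the note that the median is the better choice for extreme-value noise.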
- Convolutional layer
- Relationship among layer sizes
- Feature locality: a convolution extracts features from a small local area. As the network gets deeper, each neuron's receptive field covers a larger region of the input, so the captured features become progressively more "global" and "abstract"
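One way to make the growth of locality concrete is to compute the receptive field, meaning how many input pixels along one axis can influence a single neuron at a given depth. The sketch below uses the usual recurrence for stacked layers; the function name `receptive_field` and the layer list are assumptions for illustration, not a configuration from the lecture.

```python
def receptive_field(layers):
    """Receptive field size (along one axis) after each layer.

    `layers` is a list of (kernel_size, stride) pairs, applied in order.
    """
    rf = 1     # a single input pixel sees only itself
    jump = 1   # distance in input pixels between neighboring neurons
    sizes = []
    for kernel, stride in layers:
        rf += (kernel - 1) * jump   # each extra kernel tap reaches `jump` pixels further
        jump *= stride              # striding spreads neighboring neurons further apart
        sizes.append(rf)
    return sizes


# Hypothetical stack: conv 3x3, pool 2x2 (stride 2), conv 3x3, pool 2x2, conv 3x3.
layers = [(3, 1), (2, 2), (3, 1), (2, 2), (3, 1)]
sizes = receptive_field(layers)
print(sizes)  # [3, 4, 8, 10, 18]
```

A neuron in the first layer sees a 3-pixel-wide patch, while a neuron after the fifth layer is influenced by an 18-pixel-wide patch, even though every individual kernel stays small.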