2017 Winter Project Week/DeepLearningMethodology

This is a 3-hour introductory course on Deep Learning Methodology for Project Week #24.

Instructor: Mohsen Ghafoorian


==Basic concepts (60-75 min)==

* loss functions (categorical cross-entropy, MSE)
* stochastic gradient descent
* update rules (limitations of plain SGD, Momentum, Nesterov momentum, Adadelta, RMSProp, Adam); see the sketch after this list
* learning rate
* activation functions
* why non-linearities?
* sigmoid (vanishing gradient problem, non-zero-centered outputs), tanh
* ReLU (dead ReLU issue), leaky ReLU, PReLU
* weight initialization
* regularization
* augmentation
* L1/L2
* dropout
* batch normalization
* network babysitting (bad learning rate, bad initialization, overfitting)
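
As a rough illustration of the update-rule topic above, the following minimal NumPy sketch compares plain SGD with the momentum update on a toy quadratic loss; the loss, learning rate, and momentum coefficient are arbitrary example values, not part of the course material.

<pre>
import numpy as np

# Toy loss: L(w) = 0.5 * ||w - w_star||^2, so grad L(w) = w - w_star.
w_star = np.array([3.0, -2.0])

def grad(w):
    return w - w_star

def sgd(w, lr=0.1, steps=50):
    """Plain SGD: w <- w - lr * grad(w)."""
    for _ in range(steps):
        w = w - lr * grad(w)
    return w

def sgd_momentum(w, lr=0.1, mu=0.9, steps=50):
    """Momentum update: v <- mu * v - lr * grad(w); w <- w + v."""
    v = np.zeros_like(w)
    for _ in range(steps):
        v = mu * v - lr * grad(w)
        w = w + v
    return w

w0 = np.zeros(2)
print("plain SGD:   ", sgd(w0))           # converges toward w_star
print("momentum SGD:", sgd_momentum(w0))  # oscillates, then settles toward w_star
</pre>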

==State-of-the-art CNN methods (60 min)==

* AlexNet
* VGG net
* GoogLeNet
* ResNet (identity skip connections; see the sketch after this list)
* highway nets
* dense nets
* GANs
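
As a rough sketch of the central idea behind ResNets (and, in spirit, highway nets), the PyTorch module below adds the block's input back onto the output of two convolution layers; the channel count and layer sizes are illustrative assumptions, not taken from any specific architecture in the list.

<pre>
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    """Basic residual block: output = relu(F(x) + x), where F is conv-BN-relu-conv-BN."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return F.relu(out + x)  # identity skip connection eases gradient flow in deep nets

x = torch.randn(1, 16, 32, 32)
print(ResidualBlock(16)(x).shape)  # torch.Size([1, 16, 32, 32])
</pre>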

==Biomedical segmentation==

* sliding window
* fully convolutional nets
* U-Net (encoder-decoder with skip connections; see the sketch after this list)
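
To make the U-Net / fully convolutional idea concrete, here is a deliberately tiny PyTorch sketch with a single down/up level and one skip connection; the channel counts and the 1-input-channel, 2-class setup are assumptions for illustration only, not the architecture used in the course.

<pre>
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyUNet(nn.Module):
    """One-level U-Net-style encoder/decoder with a single skip connection."""
    def __init__(self, in_ch=1, out_ch=2):
        super().__init__()
        self.enc = nn.Conv2d(in_ch, 16, kernel_size=3, padding=1)
        self.down = nn.MaxPool2d(2)
        self.bottleneck = nn.Conv2d(16, 32, kernel_size=3, padding=1)
        self.up = nn.ConvTranspose2d(32, 16, kernel_size=2, stride=2)
        self.dec = nn.Conv2d(32, 16, kernel_size=3, padding=1)  # 32 = 16 (upsampled) + 16 (skip)
        self.head = nn.Conv2d(16, out_ch, kernel_size=1)        # per-pixel class scores

    def forward(self, x):
        e = F.relu(self.enc(x))                         # encoder features at full resolution
        b = F.relu(self.bottleneck(self.down(e)))       # bottleneck at half resolution
        u = self.up(b)                                  # upsample back to full resolution
        d = F.relu(self.dec(torch.cat([u, e], dim=1)))  # concatenate the skip connection
        return self.head(d)                             # dense (fully convolutional) prediction

x = torch.randn(1, 1, 64, 64)
print(TinyUNet()(x).shape)  # torch.Size([1, 2, 64, 64])
</pre>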