Deep Learning

1. Base

1.1. What does it mean to go deep?

Refers to working with neural networks that have more than two intermediate (hidden) layers. Some staple problems one runs into during back-propagation as networks deepen are:

  1. exploding gradients

    These are easier to deal with, since the direction of the gradient is retained even though its magnitude is large. Techniques such as gradient clipping and regularization help here (see the gradient-clipping sketch after this list).

  2. vanishing gradients

    When back-propagating via the chain rule, the gradients for the earlier layers' parameters are products of the intermediate layers' local gradients. The derivative of tanh, for instance, lies between 0 and 1, so back-propagating through n such layers multiplies together n factors that are each at most 1, producing exponentially diminishing gradients and virtually no learning for the initial layers. Some common techniques used to deal with this issue are:

    • ReLU activations, whose derivative does not shrink towards zero on the positive side, are more robust against diminishing gradients
    • Skip connections, as in Residual Neural Networks, also help by letting gradients bypass layers (see the residual-block sketch in the next subsection)
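
A minimal sketch of gradient clipping in a PyTorch training step; the model, loss, and data here are purely illustrative placeholders, not anything prescribed by the book:

#+begin_src python
import torch
import torch.nn as nn

# hypothetical model, loss, and optimizer purely for illustration
model = nn.Sequential(nn.Linear(10, 64), nn.Tanh(), nn.Linear(64, 1))
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

def training_step(x, y, max_norm=1.0):
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    # rescale gradients so their total norm is at most max_norm,
    # guarding against exploding gradients
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)
    optimizer.step()
    return loss.item()

# usage with random data
x, y = torch.randn(32, 10), torch.randn(32, 1)
training_step(x, y)
#+end_src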

1.1.1. Present situation

The above issues with deep networks have been addressed satisfactorily, and networks with hundreds of hidden layers (layers that are neither the input nor the output layer) can now be trained successfully.
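
A minimal sketch of a skip connection as used in residual networks; the block structure, channel counts, and names are illustrative assumptions rather than a specific architecture from the book:

#+begin_src python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Two conv layers whose output is added back to the input (the skip connection)."""

    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.relu(self.conv1(x))
        out = self.conv2(out)
        # the identity path lets gradients flow around the convolutions,
        # which is what makes very deep stacks trainable
        return self.relu(out + x)

# stacking many such blocks remains trainable because gradients
# can always flow through the identity paths
net = nn.Sequential(*[ResidualBlock(16) for _ in range(50)])
x = torch.randn(1, 16, 32, 32)
print(net(x).shape)  # torch.Size([1, 16, 32, 32])
#+end_src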

1.2. Applications of DL

1.2.1. NLP

  • Question answering
  • Speech recognition
  • Summarization
  • Document classification

1.2.2. Computer Vision

  • Satellite and Drone Imagery Interpretation
  • Face Recognition
  • Image Captioning
  • Reading Traffic Signs
  • Autonomous driving

1.2.3. Medicine

  • Anomaly detection (in CT, MRI, and X-ray imaging, for instance)
  • detecting features in pathology slides
  • measuring features in ultrasounds
  • diagnosing diabetic retinopathy

1.2.4. Biology

  • Folding, classifying, … proteins
  • genomic tasks
  • cell classification
  • analysing protein/protein interaction

1.2.5. Image Generation

  • colorization
  • upscaling resolution
  • denoising
  • stylistic adaptation

1.2.6. Recommendation Systems

  • web search
  • product recommendations
  • landing page layouts

1.2.7. Games

  • Chess
  • Go
  • complex RTS

1.2.8. Robotics

  • handling objects that are challenging to locate (shiny, unusually textured, etc.)

1.3. Towards SOTA: squeezing out performance in DL

This section (inspired by DL for Coders : fastai + pytorch, and written with Computer Vision in mind) logs tweaks and techniques that are usually employed to squeeze out performance from a Deep Learning model.

Do note that this section is about training a model from scratch: either without employing Transfer Learning at all, or employing it only when the domain is so distinct that the pretraining task is not closely related to the desired task.

Regularization is a tool that allows one to use overly capable models while avoiding overfitting the data. Prefer data augmentation before actually doing anything to the model, and rely on techniques that directly manipulate the weights only after you've explored data augmentation strategies (a sketch of a typical augmentation pipeline follows).
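
A minimal sketch of an image-augmentation pipeline using torchvision transforms; the particular transforms and parameters are illustrative assumptions, not prescriptions from the book:

#+begin_src python
from torchvision import transforms

# augmentations applied only to the training set
train_tfms = transforms.Compose([
    transforms.RandomResizedCrop(224),                      # random crop, then resize
    transforms.RandomHorizontalFlip(),                      # mirror images half the time
    transforms.ColorJitter(brightness=0.2, contrast=0.2),   # mild photometric noise
    transforms.ToTensor(),
])

# the validation set gets only deterministic preprocessing
valid_tfms = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])
#+end_src

These would typically be passed to a Dataset/DataLoader pair; fastai's own augmentation API differs, so this is just the plain torchvision form.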

2. References

2.0.1. DL for Coders : fastai + pytorch

  • https://course.fast.ai/Resources/book.html
  • upgrading skills : specializing further in fastai and pytorch
  • will populate notes here in accordance with what I learn there
  • will also be coding along in python in org-babel cells for a comprehensive pass of the book
Tags::ml:ai: