Transfer Learning

Introducing Transfer Learning

  • Suppose someone has trained a deep network on an NLP task
  • We may be interested in using the network if:

    • The network was trained on very large dataset
    • The network provides a very good accuracy
  • If we're interested in a similar NLP task, then we may want to use this pre-trained network for our particular task

    • Then, we'll only need to train this network on a smaller set of data
  • At a high level, we're essentially taking a network with parameters that are close enough to those parameters in our optimal network

    • To focus the network on our particular task, we just slightly tune the parameters by training the model on a small set of relevent data
  • We may be interested in using a pre-trained network if:

    • The network was pre-trained for a task similar to ours
    • The network was pre-trained on a very large dataset
    • The network is very accurate
  • The potential benefits of transfer learning are the following:

    • Reduced training time
    • Improvement in predictions
    • Training only needs a small dataset
  • In general, the following rules of thumb hold true in transfer learning:

    • As our model grows larger, the accuracy improves
    • As our data grows larger, the accuracy improves
  • The process of using a pre-trained network is called transfer learning

Different Options and Types of Transfer Learning

  • In general, there are two types of transfer learning

    • Feature-based transfer learning
    • Fine-tuning transfer learning
  • Feature-based transfer learning takes the weights from the pre-trained model, then uses these weights as input into our own (new) model

    • Impying, we typically train the data on an entirely new model
  • Fine-tuning transfer learning inputs our own data set into the pre-trained model

    • Implying, we rarely make changes to the pre-trained model
    • Sometimes, we may adjust the output layer

      • e.g. adding a softmax layer

References

Previous
Next

Transformer Models

BERT