- Framework
- Pre-training
- trained on unlabeled data over different pre-training tasks (see the masked-LM sketch after this list)
- Fine-tuning
- parameters are fine-tuned using labeled data from the downstream tasks
- for example, a classification task on an NLI dataset requires supervised fine-tuning of the pre-trained BERT model (see the fine-tuning sketch below)
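
A minimal sketch of the masked-language-modeling pre-training objective on unlabeled text, assuming the Hugging Face transformers library and `bert-base-uncased`; the example sentence and masked position are illustrative assumptions, not taken from these notes.

```python
import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

text = "The cat sat on the mat."          # unlabeled sentence; no human annotation needed
inputs = tokenizer(text, return_tensors="pt")

# Copy the token ids as labels, then corrupt one position with [MASK];
# in real pre-training roughly 15% of tokens are chosen at random.
labels = inputs["input_ids"].clone()
masked = inputs["input_ids"].clone()
masked[0, 4] = tokenizer.mask_token_id                # mask one token (index chosen for illustration)
labels[masked != tokenizer.mask_token_id] = -100      # compute the loss only on masked positions

outputs = model(input_ids=masked,
                attention_mask=inputs["attention_mask"],
                labels=labels)
print(outputs.loss)   # cross-entropy over the masked token(s)
```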
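
A minimal sketch of supervised fine-tuning for an NLI-style classification task on top of pre-trained BERT, again assuming Hugging Face transformers; the model name, the three-way label count, the label mapping, and the sentence pair are illustrative assumptions.

```python
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
# A randomly initialized classification head is added on top of the pre-trained encoder.
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=3)

premise = "A man is playing a guitar."
hypothesis = "A person is making music."
inputs = tokenizer(premise, hypothesis, return_tensors="pt")   # sentence pair packed into one input sequence
label = torch.tensor([0])                                      # e.g. 0 = entailment (assumed label map)

outputs = model(**inputs, labels=label)
outputs.loss.backward()   # gradients flow through all parameters, not just the new head
```

In practice this forward/backward step would run inside a standard training loop with an optimizer over the labeled downstream dataset.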