Transfer Learning

Learn when to use transfer learning vs. training from scratch by aligning your approach with the task’s data size, domain similarity, compute budget, and performance needs.

In modern artificial intelligence interviews, especially those focused on deep learning and generative AI, employers love to probe not just which methods you know, but when and why you would choose one over another. A favorite among such questions is the choice between transfer learning and training from scratch, because it reveals your practical understanding of machine learning strategy. “Should we fine-tune an existing model or build a new one from zero?” is a common dilemma in real projects, especially in GenAI-related roles. By asking “When would you prefer transfer learning over training from scratch, and vice versa?” interviewers want to see that you grasp the concepts and the trade-offs, and that you won’t naively train everything from scratch when a smarter approach exists.

The interviewer expects you to know what transfer learning is and how it differs from training a model from scratch. They will check that you understand key factors like data availability, computational cost, and model performance. Essentially, they want to hear that you can weigh the pros and cons of, say, fine-tuning a pretrained GPT model for your task versus training a brand-new model. To answer effectively, you must show that you understand both approaches and can articulate when each is appropriate. This tells them you can make informed decisions when building GenAI systems.

What is transfer learning?

Transfer learning is a machine learning technique where you start with a model already trained on one problem, then adapt it to a new but related problem. Instead of training a brand new model from random initialization, you reuse the knowledge the pretrained model has gained. In practice, this often means taking a model trained on a large dataset (the source task) and fine-tuning it on your smaller, specific dataset (the target task). The preexisting model provides a head start: it has learned useful features or representations that can apply to your new task.

It is important to clarify that while fine-tuning is one of the most common transfer learning methods in deep learning, it is not the only approach. Under the umbrella of transfer learning, there is also the concept of feature extraction. In feature extraction, you use the pretrained model as a fixed feature extractor—its weights remain unchanged, and you only train new layers on top of these extracted features. In contrast, fine-tuning involves continuing the training process on the pretrained model, adjusting some or all of the parameters alongside the new layers to adapt the model more precisely to your task.
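To make the distinction concrete, here is a minimal sketch assuming a PyTorch/torchvision setup; the ResNet-18 backbone, the class count, and the layer names are illustrative placeholders rather than a prescribed recipe.

```python
# Sketch: feature extraction vs. fine-tuning with a pretrained backbone.
# Assumes torch and torchvision are installed; ResNet-18 and the class count
# are placeholder choices for illustration.
import torch.nn as nn
from torchvision import models

num_classes = 2  # hypothetical number of classes in the target task

# Load a model pretrained on a large source dataset (ImageNet).
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Option 1: feature extraction -- freeze all pretrained weights and train
# only a new classification head on top of the extracted features.
for param in model.parameters():
    param.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, num_classes)  # new head stays trainable

# Option 2: fine-tuning -- keep the new head but also unfreeze the pretrained
# layers so their weights are updated (typically with a small learning rate).
for param in model.parameters():
    param.requires_grad = True
```

In either case you would then train on your target dataset; with feature extraction only the new head's weights change, while fine-tuning updates the backbone as well.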

Using a pretrained model is like borrowing a toolkit: feature extraction uses the tools as-is, while fine-tuning adapts them to better fit your task. The right choice depends on your dataset size, task similarity, and available resources.

If someone has learned Spanish, they will likely learn Italian faster than someone with no background in learning languages. Similarly, a neural network that has learned to recognize cats and dogs can be fine-tuned to recognize lions and wolves more easily than a new network without prior knowledge. The pretrained model serves as a foundation that already understands generic patterns (like shapes and edges, or, in NLP, general language structure). You then train it further on your specific data, tweaking its parameters to specialize in your task. This adaptation process is usually called fine-tuning.
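As a hypothetical illustration of that adaptation step, the sketch below fine-tunes a pretrained image classifier on a new, small target task (standing in for the lions-and-wolves example); the dummy data, class count, and hyperparameters are placeholder assumptions used only to keep the snippet self-contained.

```python
# Sketch: fine-tuning a pretrained backbone on a small target task.
# The "dataset" is a random dummy batch so the example runs end to end;
# all hyperparameters are illustrative placeholders.
import torch
import torch.nn as nn
from torchvision import models

num_classes = 2  # e.g., lions vs. wolves in the target task

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, num_classes)  # new task-specific head

# A small learning rate nudges the pretrained weights instead of overwriting them.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

# Dummy batch standing in for the real (small) labeled target dataset.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, num_classes, (8,))

model.train()
for step in range(3):  # tiny training loop, purely for illustration
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```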