Training Methodologies of LLMs

Large language models can be pre-trained, fine-tuned, instruction-tuned, or RL-tuned. This write-up explains what each of these terms implies.

Pretrained LLMs

Pretrained models are foundational models trained on vast text datasets. In the course of pretraining they learn language patterns, grammar, factual knowledge, and some reasoning abilities.

Pretraining is the starting point: it lets the model absorb knowledge accumulated over years of human writing and ensures it has mastered the nuances of language before any specialization begins.

A pretrained model is like a mind holding a library of many books: a broad repository of knowledge, not yet organized for any particular task.
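To make that concrete, here is a minimal sketch that loads a publicly available pretrained model and lets it complete a prompt. The choice of gpt2, the prompt text, and the generation settings are illustrative assumptions, not a recommendation.

```python
# Loading a pretrained model and sampling a completion.
# gpt2 is an assumed example of a small, publicly available model.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("The heart pumps blood through", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```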

Fine-tuned LLMs

In fine-tuning, a pre-trained model is further trained on a smaller, task-specific dataset, making it ready for that task. A fine-tuned model retains its vast general knowledge but also becomes a specialist in a specific domain, such as healthcare. Imagine a general physician taking further training in cardiology.
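As a rough illustration, the sketch below fine-tunes a small pretrained model on a domain text file using the Hugging Face transformers library. The model name, the hypothetical cardiology_notes.txt file, and the hyperparameters are all assumptions chosen for illustration, not a tested recipe.

```python
# Minimal supervised fine-tuning sketch with Hugging Face transformers.
# The dataset file and hyperparameters are illustrative assumptions.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

model_name = "gpt2"  # assumed small base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # gpt2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# Hypothetical domain corpus: plain text, one passage per line.
dataset = load_dataset("text", data_files={"train": "cardiology_notes.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset["train"].map(tokenize, batched=True,
                                 remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="ft-model",
                           num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```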

Instruction-tuned LLMs

These models are fine-tuned on examples of textual instructions paired with desired responses. Rather than relying on sheer volume of data, they learn to follow the instructions given to them, forming a bridge between generic completions and task-specific outputs. Their answers are aligned with the intent of the user. Imagine a model that teaches a cooking recipe: given a clear instruction, it can walk anyone through the art of making a good dish. It is like directing the narrative.
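The heart of instruction tuning is the training data. Below is a hedged sketch of what a single instruction-response example and its prompt template might look like; the field names and template are assumptions loosely modeled on common instruction datasets, and real ones vary.

```python
# One illustrative instruction-tuning example. Field names and the
# prompt template are assumptions; actual datasets differ in format.
example = {
    "instruction": "Explain how to make a simple tomato soup.",
    "response": "Saute onions and garlic, add chopped tomatoes and stock, "
                "simmer for 20 minutes, then blend and season to taste.",
}

# During training, instruction and response are joined into one sequence;
# the loss is typically computed only on the response tokens.
def format_example(ex):
    return (f"### Instruction:\n{ex['instruction']}\n\n"
            f"### Response:\n{ex['response']}")

print(format_example(example))
```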

RL-tuned LLMs

RL stands for reinforcement learning. Here the model learns from feedback: while interacting with its environment, it receives rewards or penalties based on its actions and refines its behaviour over time. This feedback forms an iterative loop, which can even be adapted in real time, so responses are honed and performance steadily improves.

A musician may occasionally hit a wrong note, but uses that error to make the next rendition better. RL-tuned LLMs work along similar lines, refining their output in light of the feedback they receive.
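For intuition, here is a toy sketch of the underlying idea using a REINFORCE-style update in PyTorch. The tiny policy network and the hand-written reward function are illustrative assumptions; real RL-tuned LLMs use a learned reward model and more sophisticated algorithms such as PPO.

```python
# Toy REINFORCE-style update sketching the reward-driven feedback loop.
# The policy and reward function are illustrative assumptions only.
import torch
import torch.nn as nn

vocab_size, hidden = 10, 32
policy = nn.Sequential(
    nn.Embedding(vocab_size, hidden),  # token -> vector
    nn.Flatten(),                      # (1, 1, hidden) -> (1, hidden)
    nn.Linear(hidden, vocab_size),     # scores over next tokens
)
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-2)

def reward(token: int) -> float:
    # Hypothetical reward standing in for human feedback: token 7 is "good".
    return 1.0 if token == 7 else -0.1

for step in range(200):
    prompt = torch.tensor([[0]])       # fixed one-token "prompt"
    dist = torch.distributions.Categorical(logits=policy(prompt))
    action = dist.sample()             # the model's "response"
    # REINFORCE: raise the log-probability of rewarded actions.
    loss = -(dist.log_prob(action) * reward(action.item())).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(policy(torch.tensor([[0]])).argmax().item())  # likely 7 after training
```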
