LLMs: Pros and Cons

Large language models (LLMs) have changed a great deal about the way we interact with software. They are the point where deep learning and large-scale computational resources come together.

Still, LLMs can generate false, outdated, and problematic information. They can even hallucinate, producing information that does not exist.

First, let us understand language models. They generate responses much as we humans do. Trained on a large corpus of data, they pick up the nuances of the language. These models are neural networks with many layers that learn complex patterns and relationships. They generalize and understand context. They are not restricted to pre-defined rules and patterns; instead, they learn from massive data to develop their own understanding of the language.

As a result, they generate coherent and contextually relevant responses.
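As an illustration, here is a minimal sketch of querying a pre-trained language model through the Hugging Face transformers library. The small GPT-2 checkpoint is used purely as a stand-in for a much larger LLM, and the prompt is made up for the example.

```python
# Minimal sketch: generate a continuation with a small pre-trained model.
# GPT-2 stands in for a larger LLM; prompt and settings are illustrative.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "Large language models have changed the way we"
outputs = generator(prompt, max_new_tokens=30, num_return_sequences=1)

# The model continues the prompt with tokens that are statistically likely
# given its training corpus, which is why the text reads as coherent.
print(outputs[0]["generated_text"])
```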

Deep learning is a game-changer. The precursors to neural language models relied on pre-defined rules and patterns. Deep learning invests models with the capability to understand language naturally, in a human-like way.

Deep learning networks have many layers, which let them analyze and learn complex patterns and relationships.
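The toy PyTorch sketch below (not an actual LLM) shows the basic idea of depth: each layer transforms the output of the layer before it, so stacking layers lets the network build progressively more abstract representations. Real LLMs stack transformer blocks rather than plain fully connected layers.

```python
# Toy sketch of a deep (multi-layer) network; sizes are arbitrary.
import torch
import torch.nn as nn

class TinyDeepNet(nn.Module):
    def __init__(self, dim: int = 64, num_layers: int = 6):
        super().__init__()
        # Each block is a simple linear layer plus a non-linearity.
        # LLMs use transformer blocks (attention + feed-forward) instead.
        self.layers = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, dim), nn.ReLU()) for _ in range(num_layers)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for layer in self.layers:
            x = layer(x)  # each pass refines the representation
        return x

net = TinyDeepNet()
print(net(torch.randn(1, 64)).shape)  # torch.Size([1, 64])
```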

Broad language understanding is developed in the pre-training stage; fine-tuning then adjusts the pre-trained weights on task-specific data, which makes the model versatile and adaptable. A specific task can also be specified directly in the prompt: with a task description plus a few examples (few-shot learning), or with a task description alone (zero-shot learning).
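The difference between the two prompting styles can be sketched with a hypothetical sentiment-classification task; the prompt wording below is illustrative only.

```python
# Zero-shot: only a task description. Few-shot: description plus examples.
zero_shot_prompt = (
    "Classify the sentiment of the following review as Positive or Negative.\n"
    "Review: The battery dies within an hour.\n"
    "Sentiment:"
)

few_shot_prompt = (
    "Classify the sentiment of each review as Positive or Negative.\n"
    "Review: Absolutely love the camera quality.\nSentiment: Positive\n"
    "Review: The screen cracked on day one.\nSentiment: Negative\n"
    "Review: The battery dies within an hour.\nSentiment:"
)

# With zero-shot, the model relies on the task description alone;
# with few-shot, the in-context examples steer it toward the expected answer format.
print(zero_shot_prompt)
print(few_shot_prompt)
```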

Though deep learning with multiple layers and an attention mechanism enables the model to generate human-like text, it can overgeneralize: the responses may not be contextually relevant, accurate, or up to date.
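For reference, the attention mechanism mentioned above can be sketched as scaled dot-product attention, which lets each token weigh every other token in the context when building its representation. The tensor shapes here are illustrative.

```python
# Sketch of scaled dot-product attention; shapes and values are illustrative.
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v):
    d_k = q.size(-1)
    # Similarity of each query with every key, scaled for numerical stability.
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5
    # Softmax turns the scores into attention weights that sum to 1.
    weights = F.softmax(scores, dim=-1)
    # Each output token is a weighted mix of the value vectors.
    return weights @ v

q = k = v = torch.randn(1, 5, 16)  # batch of 1, 5 tokens, 16-dim vectors
print(scaled_dot_product_attention(q, k, v).shape)  # torch.Size([1, 5, 16])
```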

An LLM's capabilities are bounded by its training data, and that data may be out of date. The input text, too, may be ambiguous or short on detail, which can lead the model to the wrong context.

The training data may contain incorrect information or biases, especially on sensitive and controversial topics. The model picks up these patterns as shorthand, and when the patterns encode prejudice, the responses reflect that prejudice.

LLMs have no ability to check the correctness of the information they generate, and the confidence with which a response is worded may mislead users.

Hallucinations are more likely when queries are not correctly framed. The model does not produce false information intentionally; it has to generate a response according to the patterns it has learnt.

LLMs are not trained to reason. They are not students of any subject, be it science, literature, or computer code. They are simply trained to predict the next token in the text.
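A minimal sketch of what "predicting the next token" looks like in practice, again using the small GPT-2 checkpoint as a stand-in: the model assigns a score to every token in its vocabulary, and the most likely continuations are picked, with no check on whether they are true.

```python
# Sketch: inspect the most likely next tokens after a prompt.
# GPT-2 stands in for a larger LLM; the prompt is illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("The capital of France is", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # a score for every vocabulary token at each position

# Scores for the position after the last input token, then the top candidates.
next_token_logits = logits[0, -1]
top = torch.topk(next_token_logits, k=5)
print([tokenizer.decode(i) for i in top.indices])  # e.g. ' Paris', ' the', ...
```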
