Large Language Models (LLMs) of 2023

We have been hearing about LLMs for a while, but they entered the public consciousness in 2023. LLMs are the foundation of chatbots, and many big tech companies are now in a race to build them.

LLMs are advanced AI models that perform natural language processing (NLP). They are trained on a massive corpus of data, from which they learn the relationships between words. They can answer queries, translate from one language to another, and summarize a voluminous document into a concise format. They can also generate text, which makes them harbingers of generative AI.

LLMs are now becoming multi-modal: they are trained not only on text but also on images and audio.

Let us learn about the LLMs available in 2023.

GPT-4: It was released by OpenAI in March 2023 and has become the current benchmark. It processes both text and images. Its training methodology has not been revealed. It is reported, though not confirmed by OpenAI, to have more than a trillion parameters, roughly six times the 175 billion parameters of GPT-3. It has been fine-tuned using Reinforcement Learning from Human Feedback (RLHF): human ratings of the model's responses are fed back into training, which enhances its performance. Among the models discussed here, it hallucinates the least. In November 2023, a new version called GPT-4 Turbo was released. Its training data extends to April 2023, and it can handle larger prompts.
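The "six times" comparison above is simple arithmetic, and a short sketch makes it easy to redo with other figures. The GPT-4 number used here is the unconfirmed trillion-parameter estimate quoted in this article, not an official figure, and `param_ratio` is only an illustrative helper name.

```python
def param_ratio(larger: float, smaller: float) -> float:
    """Return how many times larger one parameter count is than another."""
    return larger / smaller


GPT3_PARAMS = 175e9           # 175 billion, as published for GPT-3
GPT4_PARAMS_REPORTED = 1e12   # assumption: unconfirmed "trillion plus" estimate

ratio = param_ratio(GPT4_PARAMS_REPORTED, GPT3_PARAMS)
print(f"GPT-4 (reported) is about {ratio:.1f}x the size of GPT-3")
```

The ratio comes out a little under six, which is why the comparison is best read as "roughly six times".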

Gemini: Google announced its Gemini multi-modal LLMs in December 2023 in three versions: Nano, Pro and Ultra. Its chatbot Bard runs on Gemini Pro, while Nano is designed for on-device use. A separate article has been written on Gemini.

GPT-3.5: It was released towards the end of November 2022 by OpenAI. It is the underlying model for the free version of ChatGPT, while ChatGPT Plus works on GPT-4. GPT-3.5 handles only text, and it hallucinates more than GPT-4. Gemini Pro sits between GPT-3.5 and GPT-4 in capability. With Gemini Ultra, a new version of Bard called Bard Advanced is set to appear.

Llama 2: It was released by Meta (Facebook) in July 2023. It is an openly available AI model whose weights can be downloaded. It comes in several sizes, ranging from 7 billion to 70 billion parameters. GPT-4 outperforms both Llama 2 and Google’s PaLM-2.

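PaLM-2: It was launched by Google in May 2023. It is very powerful, with strong reasoning capabilities. Google has not disclosed its parameter count; the often-quoted figure of 540 billion parameters belongs to its predecessor, PaLM. It has been trained on text in more than 100 languages. The older version of Bard was based on PaLM-2.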

Claude-2: It has been developed by Anthropic, a company founded by former OpenAI employees. The first Claude was released in March 2023, and Claude 2 followed in July 2023. It has a huge context length (the amount of text, measured in tokens, that a model considers in its input). Claude 2.1 is a newer version released in November 2023. It has a longer context length than GPT-4.
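To make the idea of context length concrete, here is a minimal sketch that estimates whether a document fits in a model's context window. It rests on several assumptions: the rule of thumb that one token is roughly 0.75 English words (real tokenizers vary), the vendor-published window sizes of 200,000 tokens for Claude 2.1 and 128,000 for GPT-4 Turbo, and the helper names `estimate_tokens` and `fits_in_context`, which are illustrative and not part of any library.

```python
# Assumption: context windows in tokens, as published by the vendors in late 2023.
CONTEXT_WINDOWS = {
    "claude-2.1": 200_000,
    "gpt-4-turbo": 128_000,
}


def estimate_tokens(text: str, words_per_token: float = 0.75) -> int:
    """Rough token estimate from a whitespace word count (rule of thumb only)."""
    word_count = len(text.split())
    return round(word_count / words_per_token)


def fits_in_context(text: str, model: str, reserved_for_reply: int = 1_000) -> bool:
    """Check whether the text, plus room for a reply, fits in the model's window."""
    needed = estimate_tokens(text) + reserved_for_reply
    return needed <= CONTEXT_WINDOWS[model]


# A hypothetical document of 120,000 words.
document = "word " * 120_000
for model in CONTEXT_WINDOWS:
    print(model, fits_in_context(document, model))
```

For an exact count, the model's own tokenizer should be used instead of the word-based estimate.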

Mistral 7B: The Paris-based startup Mistral AI has built not a larger language model, but a nimbler one. Mistral 7B was released in September 2023. Another model, Mixtral 8x7B, followed in December 2023. It uses a mixture-of-experts design and is sometimes loosely described as a scaled-down GPT-4. It competes with Meta's Llama 2.
