RAG Enhances LLMs

Retrieval-augmented generation, or RAG, enhances the capabilities of an LLM. LLMs are tools that transform vast amounts of unstructured data into usable information, but their knowledge can already be outdated by the time a model is put to use, and they may be unable to address tasks that require information outside their training data. An LLM consists of billions or even trillions of parameters, the learned weights of its neural network. RAG optimizes an LLM's output by drawing on an external knowledge base in addition to the information the model was trained on. This external information could come from an organization's proprietary data or from any other content to which the system is directed.

In short, RAG expands an LLM's knowledge base, improving both the accuracy and the contextual relevance of its answers.

RAG uses a search function to retrieve relevant data and adds that data to the prompt to produce better generative output. It can retrieve public data from the internet as well as data from private sources, as the sketch below illustrates.
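To make the retrieve-then-augment flow concrete, here is a minimal Python sketch. The keyword-overlap retriever and the prompt template are purely illustrative assumptions, not any vendor's API; a production system would use a proper search index or vector store and whatever prompt format its LLM expects.

# Minimal RAG sketch: retrieve relevant passages, then prepend them to the prompt.
# retrieve() and build_rag_prompt() are hypothetical helpers for illustration.

def retrieve(query: str, documents: list[str], top_k: int = 3) -> list[str]:
    """Naive keyword-overlap search; a real system would use a search or vector index."""
    query_terms = set(query.lower().split())
    scored = [(len(query_terms & set(doc.lower().split())), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:top_k] if score > 0]

def build_rag_prompt(query: str, documents: list[str]) -> str:
    """Augment the user's question with retrieved context before calling the LLM."""
    context = "\n".join(retrieve(query, documents))
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"

docs = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support hours are 9am to 5pm, Monday to Friday.",
]
prompt = build_rag_prompt("What is the refund window?", docs)
print(prompt)  # this augmented prompt is then sent to any LLM's completion endpoint

The point of the sketch is the shape of the pipeline: retrieval happens first, and the generator only ever sees the question together with the retrieved context.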

The term RAG was coined in a 2020 paper whose lead author, Patrick Lewis, is now a research scientist at the startup Cohere. Because LLMs cannot expand or revise their memory after training, they sometimes hallucinate. RAG is one way to reduce hallucinations in generative AI results.

Apart from Cohere, other vendors and frameworks that support building RAG-based applications include Vectara, OpenAI, Microsoft Azure, Google Vertex AI, LangChain, Databricks and LlamaIndex.

Vector databases and graph technologies are used to retrieve proprietary data. A vector database stores, indexes and manages massive amounts of vector data, and many companies now build vector search capabilities into their databases; the toy example below shows the idea behind similarity search. By 2026, more than 30 per cent of enterprises are expected to adopt vector databases to ground their foundation models.
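The following toy Python example illustrates the core idea behind vector search: texts are represented as embedding vectors, and retrieval ranks stored entries by cosine similarity to a query vector. The three-dimensional vectors here are invented for illustration; a real deployment would use an embedding model and a vector database's index rather than a brute-force scan.

# Toy vector search: rank stored entries by cosine similarity to a query vector.
# The 3-dimensional "embeddings" are made up for illustration only.
import numpy as np

store = {
    "refund policy":  np.array([0.9, 0.1, 0.0]),
    "support hours":  np.array([0.1, 0.8, 0.2]),
    "shipping rates": np.array([0.2, 0.1, 0.9]),
}

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def nearest(query_vec: np.ndarray, k: int = 2) -> list[str]:
    """Return the k stored entries most similar to the query vector."""
    ranked = sorted(store, key=lambda key: cosine(query_vec, store[key]), reverse=True)
    return ranked[:k]

# A query vector close to "refund policy" in this toy space.
print(nearest(np.array([0.85, 0.15, 0.05])))  # ['refund policy', 'support hours']

Grounding a foundation model with this kind of search means the model answers from the retrieved entries rather than from its frozen training data alone.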
