Happy Makar Sankranti: Vector Databases

Chatbots of previous years were fluent but forgetful. Vector embeddings changed that. An embedding represents a word or phrase as a vector: a sequence of numbers that encodes information about its meaning.
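To make this concrete, here is a toy sketch. The words and the 4-dimensional vectors below are invented for illustration; real embedding models produce vectors with hundreds or thousands of dimensions.

```python
# Hypothetical 4-dimensional embeddings (real models use far more dimensions).
embeddings = {
    "cat": [0.8, 0.1, 0.9, 0.2],
    "dog": [0.7, 0.2, 0.8, 0.3],
    "car": [0.1, 0.9, 0.2, 0.8],
}

# A word is represented as nothing more than a list of numbers.
print(embeddings["cat"])
```

Notice that "cat" and "dog" were given similar numbers, while "car" was not; the next sections show why that matters.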

In math, we also have a notion of proximity: two vectors can be close together or far apart in space. Geometry lets us encode properties as positions, so that relationships between things become distances.

In natural language processing, the idea is to encode semantic similarity as distance between embeddings in the representation space: words with similar meanings get embeddings that sit close together.
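A common way to measure this closeness is cosine similarity, which compares the angle between two vectors. This minimal sketch reuses the toy 4-dimensional vectors from above (again, invented values, not a real model's output):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

cat = [0.8, 0.1, 0.9, 0.2]
dog = [0.7, 0.2, 0.8, 0.3]
car = [0.1, 0.9, 0.2, 0.8]

# "cat" is closer to "dog" than to "car" in this toy space.
print(cosine_similarity(cat, dog))  # high similarity
print(cosine_similarity(cat, car))  # low similarity
```

Other distance measures (Euclidean distance, dot product) are also used; cosine similarity is popular because it ignores vector length and compares direction only.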

Vector databases store embeddings of words and phrases, enabling LLMs to quickly fetch contextually relevant information. When an LLM encounters a term, it can retrieve similar embeddings from the database, maintaining context and coherence.
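The retrieval step above can be sketched as a tiny in-memory store. This is a brute-force illustration only: the class name and API here are invented, and production vector databases use approximate nearest-neighbor indexes (such as HNSW) rather than scanning every entry.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

class ToyVectorStore:
    """Hypothetical minimal store; real databases use ANN indexes, not full scans."""

    def __init__(self):
        self.items = {}  # text -> embedding

    def add(self, text, embedding):
        self.items[text] = embedding

    def query(self, embedding, k=2):
        # Brute-force scan, ranked by cosine similarity to the query vector.
        ranked = sorted(self.items.items(),
                        key=lambda kv: cosine_similarity(embedding, kv[1]),
                        reverse=True)
        return [text for text, _ in ranked[:k]]

store = ToyVectorStore()
store.add("cats are pets",  [0.8, 0.1, 0.9])
store.add("dogs are pets",  [0.7, 0.2, 0.8])
store.add("cars need fuel", [0.1, 0.9, 0.2])

# A query vector near the "pet" region retrieves the two pet sentences.
print(store.query([0.75, 0.15, 0.85], k=2))
```

In a real pipeline, the query vector would be the embedding of the user's prompt, and the retrieved texts would be fed back into the LLM's context window.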

Vector databases can scale to accommodate vast numbers of embeddings. This scalability is vital for applications such as chatbots, content generation and question answering.

Challenges remain: retrieved content still needs a safety check, and systems must handle ethical and cultural nuances, industry-specific jargon, and ambiguity resolution.

