Overcoming the GPT Token Limit

The context window is the amount of information the model can receive plus its response to the user. The sum of the input and the generated output is the context that the model can operate with.

ChatGPT has a context window of 4 thousand tokens, GPT-4 has 8 thousand, and GPT-3.5 Turbo has 16 thousand. That is not enough to load a book or a website. Tokens are pieces of words that are used as inputs to the AI model. Before processing a prompt, the model breaks the input down into tokens. Tokens do not necessarily correspond to the start and end of a word: they can include trailing spaces and even sub-words. The number of tokens processed in a single API request depends on the length of the input and output text. One token is roughly equivalent to 4 characters or 0.75 words of English text.
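The rules of thumb above can be turned into a quick token estimate. This is only a heuristic sketch based on the ~4 characters and ~0.75 words per token figures; for exact counts you would use the model's actual tokenizer (e.g. OpenAI's tiktoken library).

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate for English text, averaging two rules of thumb:
    ~4 characters per token and ~0.75 words per token."""
    by_chars = len(text) / 4
    by_words = len(text.split()) / 0.75
    return round((by_chars + by_words) / 2)

print(estimate_tokens("Tokens are pieces of words used as model inputs."))  # → 12
```

Because both heuristics are calibrated for ordinary English prose, the estimate degrades on code, URLs, or non-English text, where tokenizers behave quite differently.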

The context window for GPT is the number of previous tokens the model factors in while generating the next token. The larger the context window, the more context the model has for generating the next token. Earlier GPT models, such as GPT-2, had a context window of 1024 tokens.

How can this limit be overcome?

Vector Indexes

Suppose you have 50 documents containing 50 thousand tokens of information, which is roughly 37 thousand words. This is the information the model should use to answer the query.

To accomplish this, all these documents are split into chunks of, say, 2,000 characters each. Each chunk is then converted into a vector (an embedding). When the user asks a question, it is transformed into a vector as well. We then use cosine distance to find the chunk vectors closest to the question vector. The search looks for the vectors most likely to contain information on the topic of the question.
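The chunking and nearest-vector search described above can be sketched end to end. Real systems use learned embeddings from an embedding model; here a simple bag-of-words vector stands in so the cosine-similarity idea is runnable on its own, and the sample documents are invented for illustration.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy stand-in for a real embedding model: word-count vector.
    return Counter(text.lower().split())

def cosine_similarity(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def chunk(text: str, size: int = 2000) -> list[str]:
    # Split a document into fixed-size character chunks.
    return [text[i:i + size] for i in range(0, len(text), size)]

documents = [
    "GPT models have a fixed context window measured in tokens.",
    "Vector search finds the chunks most relevant to a query.",
    "Cats are popular pets and enjoy sleeping in the sun.",
]
chunks = [c for doc in documents for c in chunk(doc)]
chunk_vectors = [embed(c) for c in chunks]

query = "How does vector search find relevant chunks?"
query_vector = embed(query)
scores = [cosine_similarity(query_vector, v) for v in chunk_vectors]
best = max(range(len(chunks)), key=lambda i: scores[i])
print(chunks[best])  # the chunk closest to the question
```

In production this brute-force loop is replaced by an approximate nearest-neighbor index (e.g. FAISS or a vector database), which scales the same idea to millions of chunks.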

The last step is to convert the selected chunks back into text and add that text to the GPT context. Then the user's question is asked again, together with this context.
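This final step amounts to pasting the retrieved chunks into the prompt ahead of the user's question. A minimal sketch, where the function name and prompt wording are illustrative assumptions rather than any fixed API:

```python
def build_prompt(retrieved_chunks: list[str], question: str) -> str:
    # Place the retrieved text before the question so the model
    # answers from the supplied context.
    context = "\n\n".join(retrieved_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

prompt = build_prompt(
    ["Vector search selects only the most relevant chunks."],
    "Why does vector search help with the context window limit?",
)
print(prompt)
```

The resulting string is what actually gets sent to the model, so its total token count (context plus question plus expected answer) must still fit within the context window.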

In short, vector search is a tool that lets you add only the relevant portion of all the loaded data to the model's context. This works around the context window limit in GPT.
