To Peak, AI Should Go Beyond the Transformer

We have been in a post-ChatGPT world for almost a year as we enter 2024. That is too short a time to ask the question: has generative AI already peaked?

Instead, this is the time to apply generative AI in different fields. Google released its much-awaited AI model, Gemini, in December 2023, some nine months after GPT-4. Gemini was expected to push the envelope further, yet Gemini Ultra barely inches ahead of GPT-4 on performance benchmarks. No model has yet clearly beaten GPT-4. Is this the limit of LLMs? How do we jump from here to artificial general intelligence (AGI), a model whose cognitive ability is on par with that of human beings? LLMs have taken us this far, but no further. Of course, there is a chance that such a model will eventually emerge.

The transformer architecture, in use since 2017, scales up by increasing the number of parameters. OpenAI published GPT-3's size of 175 billion parameters, though it has not disclosed the size of its later models. Training compute grows roughly in proportion to the number of parameters times the number of training tokens, and the gains from scaling show diminishing returns. Such scaling is not practical indefinitely; it is an expensive proposition. Thus, the transformer architecture has limitations that keep LLMs from reaching AGI.
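To see why this gets expensive, here is a rough back-of-the-envelope sketch in Python using the widely cited approximation that training cost is about 6 FLOPs per parameter per token. The token counts and the 150 TFLOP/s per-accelerator throughput are illustrative assumptions, not disclosed figures.

```python
# Back-of-the-envelope estimate of transformer training cost, using the
# commonly cited approximation: training FLOPs ~ 6 * parameters * tokens.
# Parameter counts, token counts, and throughput below are illustrative.

def training_flops(params: float, tokens: float) -> float:
    """Approximate total training FLOPs for a dense transformer."""
    return 6 * params * tokens

def gpu_days(flops: float, flops_per_gpu_per_s: float = 150e12) -> float:
    """Convert FLOPs to GPU-days at an assumed sustained throughput
    (~150 TFLOP/s per accelerator, a rough ballpark)."""
    return flops / flops_per_gpu_per_s / 86_400

for name, params, tokens in [
    ("GPT-3-scale", 175e9, 300e9),    # 175B params, ~300B training tokens
    ("10x larger",  1.75e12, 3e12),   # naive 10x scale-up in both axes
]:
    c = training_flops(params, tokens)
    print(f"{name}: {c:.2e} FLOPs, ~{gpu_days(c):,.0f} GPU-days")
```

Even with these generous assumptions, a tenfold scale-up in both parameters and data multiplies the compute bill by roughly a hundred.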

Transformers are not good at generalizing beyond their training distribution, and this limits their path to AGI.

Something has to be built on top of the transformer that gives it a genuine capacity for reasoning.
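One illustration of what "reasoning on top" could mean is the toy sketch below: a propose-and-verify loop in which the transformer only proposes candidate answers and an external check decides whether to accept them. Everything here, including the generate and verify functions, is a hypothetical stand-in, not any particular system's API.

```python
import random

# Minimal sketch of an external reasoning loop around a language model.
# `generate` is a toy stand-in for an LLM call: it proposes answers to an
# arithmetic question and is sometimes wrong, the way a model might be.

def generate(question: str) -> str:
    """Toy proposer: returns the right answer only some of the time."""
    a, b = map(int, question.split("+"))
    return str(a + b + random.choice([0, 0, 1, -1]))  # occasionally off by one

def verify(question: str, answer: str) -> bool:
    """Independent check; in practice this could be a solver, unit tests,
    or a critic model rather than exact recomputation."""
    a, b = map(int, question.split("+"))
    return answer == str(a + b)

def solve(question: str, max_attempts: int = 5) -> str | None:
    """Propose-and-verify: the model proposes, the outer loop decides."""
    for _ in range(max_attempts):
        candidate = generate(question)
        if verify(question, candidate):
            return candidate
    return None  # no verified answer found

print(solve("17+25"))  # usually prints "42" after a few verified attempts
```

Swap the toy proposer for a real model call and the check for a solver, tests, or a critic, and the same control structure applies: the reasoning lives in the loop, not in the weights.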
