Amazon’s AI Model with Emergent Abilities

Amazon’s new AI model is showing linguistic abilities for which it was not trained. Its speech has the kind of naturalness associated with human-level AI, or artificial general intelligence (AGI). The paper describing it has not yet been peer reviewed.

The model meets all the criteria set up by an expert linguist. It makes the leaps that a human learner makes naturally but that are difficult for a model.

The model is called Big Adaptive Streamable TTS with Emergent Abilities, or BASE TTS. The initial model was trained on 100,000 hours (1 lakh hours) of speech data, of which 90 per cent was in English. Amazon’s AGI team also trained two smaller models, one on 1,000 hours of speech and another on 10,000 hours. They wanted to find out which of the three models showed the type of naturalness they were looking for. In particular, they were looking for emergent abilities, that is, abilities the models were not explicitly trained on.

The model trained on 10,000 hours of speech scored the highest on the list of emergent-ability criteria. The criteria included the ability to handle punctuation, non-English words and emotions. The model voices sounds that come naturally to human readers, such as "shh", which is not a word. It also handled internet jargon such as ASAP (as soon as possible). The model was never told to come up with such surprising outputs. It produces emotional or whispered speech and pronounces foreign words correctly. It was not trained for any of this. It may not strictly constitute AGI, but it is a step on the path towards that goal, especially since it got there without a huge amount of training data.

Evaluation and testing of the model should continue in order to establish its true capabilities and how well they generalize. It is too early to anthropomorphize the model. Its outputs are based on statistical patterns, not on genuine understanding or sentience.
