Superhuman AI: Some Observations

There were rumours that OpenAI was close to developing superhuman AI when Sam Altman was unceremoniously dismissed from the organization. OpenAI has built a Superalignment team to control AI that surpasses human capabilities. Its members include Collin Burns, Pavel Izmailov and Leopold Aschenbrenner, and their job is to see to it that AI systems behave as intended. The team was formed in July 2023 to steer, regulate and govern superintelligent AI systems.

These days we align models that are dumber than we are. The challenge is to find ways to align models that are smarter than we are.

We know that Ilya Sutskever played an active role in Altman's ouster. Since Altman's return, Sutskever has been in a state of limbo, though he still heads the Superalignment team.

To the AI community, superalignment is a sensitive subject. To some, it is a red herring. To others, it is a premature subfield.

Surprisingly, Altman compares OpenAI to the Manhattan Project. Both are treated as projects that require protection against catastrophic risks. Many scientists, however, are skeptical that AI will acquire world-ending capabilities anytime soon, or for that matter ever.

They argue that attention should instead be focused on AI bias and toxicity. Sutskever, for his part, believes that AI, whether from OpenAI or elsewhere, could threaten humanity. OpenAI has committed 20% of its compute to the Superalignment team's research.

The team is currently developing frameworks for the governance and control of superintelligent AI.

How to define superintelligence, and whether a particular AI system has reached that level, remains an open question. The present approach is to use a less sophisticated model, such as GPT-2, to guide a more sophisticated model in the desired direction.

The research will also focus on preventing a model's egregious behaviour. In this setup, the weak model stands in for human supervisors, who will themselves be weak relative to a superhuman system. But can a grade-school student direct a college student? The weak-to-strong approach may nonetheless lead to some breakthroughs, particularly as far as hallucinations are concerned.
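A minimal sketch of this weak-to-strong setup is below, using scikit-learn stand-ins rather than GPT-2 and GPT-4; the model choices, the feature split that handicaps the weak supervisor, and the dataset are illustrative assumptions, not OpenAI's actual method.

```python
# Weak-to-strong supervision on a toy classification task: a handicapped
# "weak" model labels data, and a more capable "strong" model learns only
# from those imperfect labels. The question is whether the student can
# surpass its teacher on held-out ground truth.

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Toy data standing in for a task with known ground-truth labels.
X, y = make_classification(n_samples=5000, n_features=20,
                           n_informative=15, random_state=0)
X_weak, X_rest, y_weak, y_rest = train_test_split(X, y, test_size=0.8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

# 1. Train the weak supervisor on a small labelled set and only 5 of 20 features.
weak = LogisticRegression(max_iter=1000).fit(X_weak[:, :5], y_weak)

# 2. The weak model produces the (noisy) labels the strong model will learn from.
weak_labels = weak.predict(X_train[:, :5])

# 3. Train the strong model on the weak labels only, never on ground truth.
strong = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500, random_state=0)
strong.fit(X_train, weak_labels)

# 4. Compare both against held-out ground truth.
print("weak supervisor accuracy:", accuracy_score(y_test, weak.predict(X_test[:, :5])))
print("strong student accuracy: ", accuracy_score(y_test, strong.predict(X_test)))
```

The analogy is deliberately loose: the weak model plays the role of a human overseer, and any accuracy the student recovers beyond its teacher's is the phenomenon the Superalignment team hopes scales up to superhuman models.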

Internally, a model often "knows" whether what it says is fact or fiction. However, models are trained on human feedback, a thumbs up or a thumbs down, and they are sometimes rewarded even for false statements. Research should enable us to summon a model's internal knowledge and let it discriminate, on the basis of that knowledge, whether a statement is fact or fiction. This would reduce hallucinations.
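One way to make this concrete is a linear probe over a model's hidden activations. The sketch below is only an illustration of the idea, assuming GPT-2 via the Hugging Face transformers library; the tiny statement list, the choice of layer, and the probe are assumptions for demonstration, not the Superalignment team's method.

```python
# Probe a model's internal activations to separate true from false statements.
# If a simple linear classifier on hidden states can tell fact from fiction,
# the model "knows" more than its sampled text reveals.

import torch
from transformers import GPT2Tokenizer, GPT2Model
from sklearn.linear_model import LogisticRegression

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2").eval()

# A handful of labelled statements (1 = true, 0 = false), purely illustrative.
statements = [
    ("The capital of France is Paris.", 1),
    ("The capital of France is Rome.", 0),
    ("Water boils at 100 degrees Celsius at sea level.", 1),
    ("Water boils at 10 degrees Celsius at sea level.", 0),
    ("The Earth orbits the Sun.", 1),
    ("The Sun orbits the Earth.", 0),
]

def hidden_state(text):
    """Return the final-layer activation at the last token of the statement."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs)
    return out.last_hidden_state[0, -1].numpy()

X = [hidden_state(s) for s, _ in statements]
y = [label for _, label in statements]

# Fit the linear probe on the activations and check how separable they are.
probe = LogisticRegression(max_iter=1000).fit(X, y)
print("probe accuracy on training statements:", probe.score(X, y))
```

A real study would use thousands of labelled statements and held-out evaluation, but the design choice is the same: read truthfulness off the model's internal state rather than off the text it happens to emit.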

As AI reshapes our culture and society, it is necessary to align it with human values. Most important is a readiness to share such research publicly.
