Sam Altman and Brockman Join Microsoft's AI Team

Sam Altman, former OpenAI CEO, and Greg Brockman, former OpenAI President, are joining Microsoft as members of its new team for advanced AI research. Microsoft's partnership with OpenAI will continue. In the meantime, the Board has appointed Emmett Shear as interim CEO of OpenAI, effective 19th November, 2023. Shear formerly led Twitch Interactive and has ties to the effective altruism movement.

The colleagues who left OpenAI along with Altman and Brockman will also join the new Microsoft team, and they will soon be provided with the resources required to run the show successfully. Sam Altman will take over as CEO of this new group.

Zephyr 7B Language Model

Zephyr 7B is an LLM developed by Hugging Face, with 7 billion parameters. It is a GPT-style model, fine-tuned to be more helpful and informative. It has been trained on public and synthetic datasets using direct preference optimization (DPO).
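For intuition, here is a minimal sketch of the DPO loss, assuming toy log-probabilities rather than Zephyr's actual training setup:

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Direct preference optimization loss on one preference pair.

    Each argument is the summed log-probability a model assigns to a
    response; 'chosen' is the human-preferred answer, 'rejected' the other.
    """
    # How much more (or less) likely each response is under the policy
    # than under the frozen reference model
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    # Push the preferred response's margin above the rejected one's
    return -F.logsigmoid(chosen_reward - rejected_reward).mean()

# Toy tensors standing in for real model outputs
loss = dpo_loss(torch.tensor([-12.0]), torch.tensor([-15.0]),
                torch.tensor([-13.0]), torch.tensor([-14.0]))
print(loss)  # a small positive loss; training drives it down
```

The key design point of DPO is that it optimizes directly on preference pairs, with no separately trained reward model.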

It outperforms many comparable models: it generates more fluent and informative text, and it follows instructions better.

It can be used for a range of natural language processing (NLP) tasks.

It is still a work in progress and should be used for academic purposes only, since it can generate problematic text.

Issues in Pharma Marketing

A brand is the promise it makes. People love brands because they are sure they will get what is promised. Legally speaking, a brand represents intellectual property: patents, copyrights, trademarks, designs, or a combination of these.

Branded products are expensive, sometimes more expensive than they should be. No doubt, a brand distinguishes a company's product and differentiates it from other products.

It should be noted that no product is an absolute monopoly; every product can be substituted to varying degrees.

To the extent a brand differentiates a product, the producer can charge a mark-up for the added value. The added value can be real or perceived.

A brand, as we noted, is a promise — and it is basically a promise of quality. This builds brand loyalty.

Pharma marketing has distinctive features: clinical trials, drug side effects, patents that create temporary monopolies, and the so-called ethical promotion through doctors.

Here, in the name of promotion, doctors could be bribed to prescribe particular medicines.

In other words, in pharma marketing, there is asymmetric information — there is, therefore, no informed choice.

This escalates the healthcare costs for the patients. Mostly, these costs are out-of-pocket.

Broadly, there are three types of medicines: on-patent medicines, which are brands; off-patent medicines that are generic but produced by a reputed company (branded generics); and off-patent generics that are unbranded. In treatment, these three substitute for one another, and there are price differences among them.

Wholesale pharma markets such as Bhagirath Palace in Delhi and Princess Street in Mumbai reveal non-adherence to existing regulations and a lack of quality controls.

In India, pharma manufacturing consists of 3,000 drug companies and 10,500 manufacturing units. Of these 10,500 units, about 8,500 are MSMEs. Good manufacturing practices (GMP) have been prescribed since the late 1980s, yet only about 2,000 units are GMP-compliant. Implementation of the Drugs and Cosmetics Act is tardy.

The Mashelkar Committee (2003) quoted a range of 0.5 to 35 per cent for the extent of spurious drugs. Regulatory authorities report sub-standard drugs to the extent of 8.19 to 10.64 per cent, and spurious drugs to the extent of 0.24 to 0.47 per cent. 'Spurious' means fake or counterfeit; 'sub-standard' means failing quality specifications.

Spurious and sub-standard drugs do not facilitate treatment.

GMP, it seems, is directed towards exports, but it should cover production for domestic consumption too. MSME units that are not GMP-compliant must be closed down.

Jan Aushadhi stores sell unbranded generics. There was a recommendation that doctors should prescribe unbranded generics; it has since been withdrawn.

Pharma companies influence chemists and retailers just as they influence doctors. A generic prescription gives the pharmacist the choice of selling a brand, a branded generic or an unbranded generic.

Ilya Sutskever, Chief Scientist, OpenAI

Sutskever is the chief scientist at OpenAI and a Board member. He is Israeli-Canadian. His focus is on preventing artificial superintelligence, which could outmatch humans, from causing harm. Sutskever was born in Soviet Russia but was raised in Jerusalem from the age of five. He studied at the University of Toronto, Canada, under Geoffrey Hinton, a pioneer of AI. Hinton was at Google; he left the company in early 2023 to warn the world about the perils of generative AI. It should be noted that Hinton and two of his graduate students, one of them Sutskever, developed a neural network in 2012 to identify objects in photos. The software, called AlexNet, worked by recognizing patterns. Google acquired Hinton's startup DNNresearch and hired Sutskever, who extended this pattern recognition from images to words and sentences.

Elon Musk, the Tesla CEO, noticed Sutskever, and at Musk's instance he left Google and co-founded OpenAI with him in 2015. Musk later fell out with OpenAI as it tended towards a for-profit structure, accepting heavy investment from Microsoft.

At OpenAI, Sutskever contributed significantly to the development of LLMs, including GPT-2, GPT-3 and DALL-E (text-to-image). In 2022, OpenAI released a conversational bot called ChatGPT.

Sutskever had concerns about the potential perils of AI, especially about superintelligence. He had disagreements with Altman about the pace of introducing AI products, since there were issues of safety. He was concerned about superintelligence going rogue, no matter who built it.

Sutskever tended to think in terms of aligning the development of AI with ethical principles. He wanted this 'superalignment' to extend to superintelligence too.

Creativity at Stake

We know WPP has decided to merge Wunderman Thompson and VMLY&R to create VML. This retirement of the Wunderman Thompson name is one of the clearest signs of the demise of traditional creativity.

Creativity has stood at a crossroads since the 1990s, when ad agencies commenced what is called 'unbundling'. The full-agency format offered services like media planning and buying, creative execution and marketing research all under one roof. Later, the industry recast the individual functions of the full agency as strategic business units (SBUs), and later still as independent businesses. That led to the emergence of media agencies, creative agencies and research organisations.

Since this unbundling, creative agencies have never been able to price their offerings independently; within advertising networks, they were treated as 'cost centres'. In the last few years, reaching people through the right media has become more important than getting the creative right, which pushes creative output towards 'poor quality'. The point is that even if the quality is poor, the ads reach more people; what is the point of good-quality ads that reach fewer people? Creative talent in ad agencies will now aim to reach more people quickly with creative work of average quality.

Veterans in the field lament the lack of value attached to creativity today. Global networks are driven by profitability rather than creativity. It must be accepted that new entrants in the advertising industry are less passionate about creativity than the old-timers who joined a decade or more ago. Agencies such as DDB Mudra and Ogilvy still exhibit this passion, and Wunderman Thompson too was passionate about creativity.

All said and done, the global revenues of advertising have fallen. Mergers are theoretically good, but whether they work in practice is a moot point. Organisational culture is at stake, there is insecurity at the top, and top-tier talent tends to exit.

Digital advertising has disrupted the industry, as it accounts for 40 per cent of total ad spends in India.

LLMs: Pros and Cons

Large language models (LLMs) have changed the way we interact with software a great deal. LLMs are where deep learning and massive computational resources combine.

Still, LLMs can generate false, outdated and problematic information. They even hallucinate — generate information that does not exist.

First, let us understand language models. These generate responses much as humans do. They have been trained on a large corpus of data, which makes them grasp the nuances of language. These models are neural networks with many layers that learn complex patterns and relationships. They generalize and understand context. They do not remain restricted to pre-defined rules and patterns, but learn from massive data to develop their own understanding of the language.

As a result, these generate coherent and contextually relevant responses.

Deep learning is a game-changer. The precursors of these neural networks relied on pre-defined rules and patterns. Deep learning gives models the capability to handle language naturally, in a human-like way.

Deep learning networks have many layers, which let them analyze and learn complex patterns and relationships.
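As a toy sketch of what 'many layers' looks like in code (the layer sizes below are arbitrary, not those of any real LLM):

```python
import torch.nn as nn

# A minimal deep network: each stacked layer can compose the previous
# layer's simple features into progressively more complex patterns.
model = nn.Sequential(
    nn.Linear(128, 256), nn.ReLU(),  # layer 1: low-level features
    nn.Linear(256, 256), nn.ReLU(),  # layer 2: combinations of features
    nn.Linear(256, 10),              # output layer: final predictions
)
print(model)
```

Real LLMs stack dozens of far larger transformer layers, but the principle of depth is the same.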

Broad language understanding is developed in the pre-training stage, and later fine-tuning makes the model versatile and adaptable. To perform a specific task, the model can be given a task description with examples (few-shot learning) or a task description alone (zero-shot learning). In fine-tuning, the pre-trained weights are adjusted; in few-shot and zero-shot prompting, the model works from the context alone.
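To illustrate the difference, here are two hypothetical prompts for the same sentiment-classification task (the wording is illustrative, not taken from any model's documentation):

```python
# Zero-shot: a task description only
zero_shot = (
    "Classify the sentiment of this review as positive or negative.\n"
    "Review: The battery dies within an hour.\n"
    "Sentiment:"
)

# Few-shot: the same task description plus worked examples
few_shot = (
    "Classify the sentiment of each review as positive or negative.\n"
    "Review: I love the crisp display. Sentiment: positive\n"
    "Review: Shipping took three weeks. Sentiment: negative\n"
    "Review: The battery dies within an hour. Sentiment:"
)
print(zero_shot, few_shot, sep="\n\n")
```

Either string would be sent to the model as-is; no weights change in prompting.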

Though deep learning with multiple layers and the attention mechanism enables the model to generate human-like text, there can be overgeneralization: responses that are not contextually relevant, accurate or up to date.

An LLM's capabilities are bounded by its training data, which may be out of date. The input text may also be ambiguous or insufficiently detailed, leading the model to the wrong context.

The training data may contain incorrect information or biases; this is especially true for sensitive and controversial topics. The models use patterns in the data as shorthand, and where those patterns are based on prejudices, the responses reflect those prejudices too.

LLMs do not have the ability to check the correctness of the information they generate, and the confidence with which a response is delivered may mislead users.

Hallucinations are possible when queries are not framed correctly. Models do not produce false information intentionally; a model has to generate a response according to the patterns it has learnt.

LLMs are not trained to reason. They are not students of any subject, be it science, literature or computer code. They are simply trained to predict the next token in the text.
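A minimal sketch of that training objective, assuming a toy five-word vocabulary and made-up model scores:

```python
import torch
import torch.nn.functional as F

# Toy vocabulary and a model's raw scores (logits) for the next token
vocab = ["the", "cat", "sat", "mat", "ran"]
logits = torch.tensor([0.2, 0.1, 2.5, 0.3, 1.1])  # pretend model output

# Training pushes up the probability of the token that actually came next
target = torch.tensor([2])  # suppose the real next token was "sat"
loss = F.cross_entropy(logits.unsqueeze(0), target)

probs = torch.softmax(logits, dim=-1)
print({w: round(p.item(), 3) for w, p in zip(vocab, probs)})
print("loss:", loss.item())
```

Everything an LLM appears to 'know' emerges from minimizing this one loss over enormous text corpora.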

Good Bye, Altman! See You Soon!

Sam Altman's ouster from OpenAI this weekend (18-19 November, 2023) is surprising. The Board tacitly says that Altman's communications with it were not candid, affecting its ability to exercise its responsibilities. The statement is very ambiguous.

There was OpenAI DevDay, where OpenAI announced build-your-own ChatGPT and GPT-4 Turbo. Microsoft had restricted the use of ChatGPT of late, but soon lifted the restriction. Altman also announced that OpenAI was pausing signups for ChatGPT Plus due to capacity challenges. Did the new releases raise security concerns?

OpenAI's future too has been compartmentalised into two distinct visions: a commercial approach and a not-for-profit approach (seconded by Ilya Sutskever).

On November 17, Altman joined a Google Meet at Ilya Sutskever's instance, where the news of his removal as CEO was broken. President Brockman too was informed of the news by Sutskever through another Google Meet; Brockman was also told that he would no longer remain chairman of the Board.

It is not at all clear why the ouster happened now, or what the reasoning behind it was. But the decision has wider implications, both for the future direction of the company and for the future of AI as a technology.

The future of the company will be shaped by Sutskever and the remaining Board members. Safety of the products is likely to take priority over fast release of new features and commercialization.

Microsoft, the major financial investor, came to know about the Board's decision just a minute before the public announcement. Since its own direction will also be affected by these developments, it is likely in future to be more involved in the governance of OpenAI. Microsoft's competitors will take advantage of this situation; perhaps there has been too much reliance on one company and one product.

Agreed, Sam Altman was the public face of OpenAI. But neither Altman nor Brockman is a novice; they could well get support from other industry stalwarts.

Generative AI and Coding

Generative AI is a blessing for coders. One has only to give a prompt in natural language, and the model writes the code. In addition, it can detect bugs in the written code.
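As a hypothetical illustration (the prompt and the completion below are invented for this article, not the output of any particular assistant), the exchange looks like this:

```python
# The natural-language prompt a coder might type:
prompt = "Write a Python function that returns the n-th Fibonacci number."

# The kind of code an assistant might generate in response:
def fibonacci(n: int) -> int:
    """Return the n-th Fibonacci number (0-indexed: fibonacci(0) == 0)."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

print(fibonacci(10))  # 55
```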

All this raises concerns about coding as a job. Could there be job losses as coders adopt AI programming assistants such as Copilot and Code Assist?

Copilot helps coders write code, suggests code snippets and provides real-time hints. The coding process is thus streamlined: the assistant facilitates code writing, but it does not replace programmers.

There is a limit to what these AI models can do: context length, i.e. the amount of code the model can take in and comprehend at once. As we know, enterprise-level software consists of millions of lines of code, far beyond any context window, so it is necessary to have a human in the loop while developing code.
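A back-of-the-envelope check makes the point about context limits; the token counts below are assumptions for illustration, not any real model's limits:

```python
# Does a codebase fit in a model's context window? (Illustrative numbers.)
AVG_TOKENS_PER_LINE = 10          # assumed average for source code
CONTEXT_WINDOW_TOKENS = 128_000   # assumed context window size

def fits_in_context(lines_of_code: int) -> bool:
    """Crude estimate of whether the whole codebase fits in one prompt."""
    return lines_of_code * AVG_TOKENS_PER_LINE <= CONTEXT_WINDOW_TOKENS

print(fits_in_context(5_000))      # True: a small project fits
print(fits_in_context(2_000_000))  # False: enterprise-scale code does not
```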

AI coding assistants improve the productivity of programmers, but that does not simply mean each programmer does more work. As programming involves ever more complex issues, we have to increase the number of coders.

In the early days, coding was done in low-level languages, such as assembly. In the 1960s and 1970s, Fortran was taught at educational institutes; coders no longer spent all their time writing low-level code and could develop large-scale distributed systems in Fortran. However, new developments bring a new set of complexities: public and private clouds, scaling up and down, and managing dependencies across distributed systems.

AI coding assistants also learn from the code that human coders write, which means human programmers and domain experts are still needed.

AI coding assistants facilitate automation and testing, but they only automate repetitive tasks. AI cannot reach human levels of problem solving.

Chinese LLM Yi-34B

As we know by now, an LLM is a computer algorithm trained on massive datasets to understand and process natural language. This is what we require: AI that generates text, video and audio.

The Chinese company 01.AI has released its new LLM, Yi-34B, so called because it has 34 billion parameters. Parameters, as we know, are the weights a model learns so it can predict what comes next in a sequence.
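To give a sense of that scale, here is a back-of-the-envelope calculation of the memory needed just to hold the weights (arithmetic for illustration only; actually serving the model needs more memory):

```python
# Memory footprint of 34 billion parameters stored in half precision
params = 34e9        # 34 billion parameters
bytes_per_param = 2  # fp16: 2 bytes per parameter

gigabytes = params * bytes_per_param / 1e9
print(f"~{gigabytes:.0f} GB in fp16")  # ~68 GB
```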

Kai-Fu Lee, a Taiwanese computer scientist, founded the company in March 2023. Its LLM is open source and is available to developers in English and Chinese.

Hugging Face, the open-source developer community platform, ranked Yi-34B first on its leaderboard of pre-trained LLMs. Though the model is smaller than Falcon-180B and Meta's Llama 2 70B, it still beats them. Lee sees it as a 'gold standard' on key metrics.

Lee has worked for American Big Tech companies and is considered an AI pioneer. He has authored two books on the subject.