Machine Takeover : Just Prevent It

Geoffrey Hinton, formerly of Google and now at the University of Toronto, is one of the godfathers of AI, along with LeCun and Bengio. He was speaking at the Collision tech conference in Toronto to a packed audience of 30,000 startup founders, investors and techies. It was Wednesday, the 28th June, 2023. He urged governments across the world to make sure that machines do not take control of society. The audience had come to explore how to ride the AI wave and was not much interested in hearing about the dangers of AI.

Hinton strongly feels that AI can take control away from human beings. Critics may feel he is overplaying the risks, but he considers the risk real. Besides, AI deepens inequality: it would make the rich richer and the poor poorer. He was also worried about fake news spread by ChatGPT-like bots.

Hinton feels AI-generated content should be marked out with something like the watermark used on currency notes. The European Union may consider such a move in its legislation.

The conference discussions stayed far from the threats posed by AI; they were about the opportunities created by the new technology. For many attendees it is premature to treat AI as an existential threat. As Andrew Ng puts it, it is ‘like talking about overpopulation on Mars.’

Facebook’s Recommendation Models of AI

Facebook made a bold claim on 28th June, 2023 that the recommendation models it is working on could surpass today’s biggest LLMs. Facebook is researching multi-modal AI, combining, say, visual and auditory signals to comprehend a piece of content better. Some such models are in the public domain, and some are used internally to improve the relevance or targeting of messages. These advanced models understand people’s preferences and have tens of trillions of parameters, orders of magnitude more than the biggest language models of today. Is the company talking about a theoretical possibility, the potential of a model? It maintains that such very large models can already be trained and deployed efficiently at scale. Is it ready to create the infrastructure for such a model? Perhaps what they are aiming at is aspirational.

Preference understanding and modelling is a sort of behavioral analysis. Are they aiming at training the models on practically every written work available?

The 100 trillion parameter claim, though somewhat exaggerated, still shows that Facebook is aiming at something scarily big.

Facebook is conceiving a model larger than anything yet created, and it would like to dazzle advertisers with science. There would be large-scale attention models; there could be graph neural networks, few-shot learning and other techniques, organised into a hierarchical deep neural retrieval architecture, as the sketch below illustrates.
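
To make ‘deep neural retrieval’ a little more concrete, here is a hypothetical, minimal two-tower sketch in PyTorch. Facebook’s actual architecture is not public, so every name, size and layer choice below is illustrative, not a description of their system.

```python
# A hypothetical two-tower retrieval sketch, not Facebook's actual model:
# a user tower and an item tower map ids into the same embedding space,
# and the recommended items are those scoring highest against the user.
import torch
import torch.nn as nn

class Tower(nn.Module):
    def __init__(self, n_ids: int, dim: int = 32):
        super().__init__()
        self.embed = nn.Embedding(n_ids, dim)
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, ids: torch.Tensor) -> torch.Tensor:
        return self.mlp(self.embed(ids))

user_tower, item_tower = Tower(n_ids=1_000), Tower(n_ids=5_000)

user = torch.tensor([42])                          # one illustrative user id
items = torch.arange(5_000)                        # the whole item catalogue
scores = item_tower(items) @ user_tower(user).T    # dot-product relevance
top10 = scores.squeeze().topk(10).indices          # ten most relevant item ids
print(top10)
```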

Researchers may not be impressed; they are familiar with such ideas. Users either do not understand or do not care. However, an advertiser does want to put money on media where it is well spent, and Facebook is trying to convince advertisers that it excels at understanding consumer behaviour. The primary aim of social media and tech platforms is to sell ads with more granular and precise targeting. Even as users revolt against all this, the platforms try to impress upon advertisers the value and legitimacy of targeting. Advertising becomes more prolific, but the issue is whether it improves.

These platforms do not do market research to help their users. Have they ever done research to tell us which ten advertising books are the best for media students? Instead, they look over our shoulders while we surf the net; buy some toffees today and they bombard us with toffee ads the next day.

Do we really need a model with 10 trillion parameters just to tell us what people like? And spend a hell of an amount on building it?

Media Agencies

Media agencies are younger than the creative agencies and are therefore nimble and agile. They provide value for money to their clients, the advertisers. They are also quick to detect new opportunities; they recruit the appropriate manpower to exploit these opportunities and evolve the organisational structure to accommodate it. Thus the organisation structure of a media agency is complex. Today’s media agencies tend to have generalists who know marketing, advertising and media, along with specialists across disciplines such as strategic planning, buying, analytics, digital branding, sports, retail, content and so on. It goes without saying that media agencies will take some time before they provide a seamless, integrated service to advertisers.

Clients have become demanding and want more out of the media agencies. Media agencies cannot afford to hire high-quality talent at the entry level.

Media agencies will have to develop expertise in areas such as e-commerce, technology and the use of automation while handling clients at scale. Clients are tired of dealing with a variety of agencies. It augurs well for the future of the large, integrated media agency. There are many successful creative agencies around the world, but not many large media agencies.

Digital media overtook TV in 2022. The market share of the other media is declining; at the same time, in absolute rupee terms, the other media in India are still growing. Each of them has a unique role to play in a brand’s life at different points of time.

Taara : Lasers to Beam Internet

Google conceived a plan to use high-altitude balloons in the stratosphere to bring internet access to rural and remote areas. However, the project was given up due to high costs. Google then worked on laser internet technology in its innovation lab, X, nicknamed the ‘Moonshot Factory’.

In the project, traffic-light-sized machines beam lasers carrying data, essentially fibre-optic data without the cable. The idea emerged from the failed balloon internet project, Loon, which also used lasers to relay data between balloons.

The project has been named Taara and is helping to link up internet in 13 countries, including Australia, Fiji and Kenya.

It will cost a dollar per gigabyte. Even in urban areas the internet can be delivered faster, since it is less expensive to beam data between buildings than to bury cables. Reusing technology from an abandoned project in this way is what X calls ‘moonshot composting’.

Bharti Airtel may tie up with Google for this project in India.

Phi-1 : Small Language Model from Microsoft

In AI language models, we have large language models and small language models. Microsoft recently revealed its model Phi-1, with 1.3 billion parameters. Traditionally, there is a feeling that larger models are superior. However, Microsoft focussed on the quality of the data used to train the model. Phi-1 has been trained on a curated, textbook-quality dataset. It has already outperformed the far larger GPT-3.5 on coding benchmarks.

Phi-1 uses the transformer architecture. The crucial part is its training on textbook-level data. The training process was completed on 8 Nvidia A100 GPUs in just four days. Instead of parameter count, the focus was on the quality of the training data. Phi-1’s accuracy score is 50.6 per cent, surpassing GPT-3.5’s 47 per cent, even though the latter runs with 175 billion parameters.

Microsoft wants to open-source Phi-1 on Hugging Face. This is not the first time Microsoft has worked on a small language model. It previously introduced Orca, a 13-billion-parameter model trained on synthetic data generated using GPT-4. Orca too has surpassed ChatGPT.
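
If the model does land on Hugging Face, loading and prompting it should look roughly like the sketch below. The model id "microsoft/phi-1" and the prompt are assumptions for illustration; check the hub for the actual identifier.

```python
# A minimal sketch of loading and prompting Phi-1 via the Hugging Face
# transformers library. The model id "microsoft/phi-1" is assumed here.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/phi-1"  # assumed identifier
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Phi-1 was trained mainly on textbook-style Python, so a code prompt suits it.
prompt = 'def fibonacci(n):\n    """Return the n-th Fibonacci number."""\n'
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```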

The belief that ever-greater model size is essential for better performance has been dismissed. High-quality data vests a smaller model with accuracy.

2400th Write-up, AI’s New Algorithm : Gemini

DeepMind’s AlphaGo software defeated a champion of the board game Go in 2016. The techniques behind AlphaGo can be borrowed to build a new AI system, called Gemini, to compete with OpenAI’s ChatGPT.

Gemini is a large language model and a work in progress. It is similar to ChatGPT, but its capabilities will be enhanced further, such as the ability to plan and the ability to solve problems. It would combine the wonderful language capabilities of large models with the strengths of AlphaGo. At the same time, there are some new innovations.

AlphaGo is based on a technique called reinforcement learning (RL). The software learns from the actions it takes in games, through repeated attempts and feedback. It also uses tree search to explore and evaluate possible moves on the board.
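
A minimal sketch of the RL idea in general terms (not AlphaGo’s actual algorithm): tabular Q-learning on a made-up five-cell corridor game, where the agent improves through repeated attempts and reward feedback.

```python
# A toy illustration of reinforcement learning, not AlphaGo's algorithm:
# tabular Q-learning on a hypothetical 5-cell corridor where the agent
# earns a reward of 1 for reaching the rightmost cell.
import random

n_states, actions = 5, [-1, +1]            # move left or move right
Q = {(s, a): 0.0 for s in range(n_states) for a in actions}
alpha, gamma, epsilon = 0.5, 0.9, 0.1      # learning rate, discount, exploration

for episode in range(500):                 # repeated attempts
    s = 0
    while s != n_states - 1:
        # Explore occasionally, otherwise pick the action with the best value.
        a = random.choice(actions) if random.random() < epsilon \
            else max(actions, key=lambda act: Q[(s, act)])
        s_next = min(max(s + a, 0), n_states - 1)
        reward = 1.0 if s_next == n_states - 1 else 0.0    # feedback
        best_next = max(Q[(s_next, act)] for act in actions)
        Q[(s, a)] += alpha * (reward + gamma * best_next - Q[(s, a)])
        s = s_next

# After training, the learnt policy is to move right from every cell.
print({s: max(actions, key=lambda act: Q[(s, act)]) for s in range(n_states - 1)})
```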

Language models will take a big leap by performing more tasks. Gemini will take some time before it is ready. It could be Google’s response to ChatGPT and other generative AI technology.

Language models are trained on massive amounts of data. The patterns learnt make them proficient at predicting the letters and words (tokens) that follow a piece of text. The technology can be further enhanced by reinforcement learning based on feedback from humans, which will give Gemini additional capabilities. So far, language models have learnt about the world only through text. This indirect learning is their major limitation. Ideas from neuroscience and robotics could let such systems learn to manipulate things in the real world. Of course, the technology could then be abused and would be difficult to control.

GPT : How It Makes an Inference

Autoregressive language modelling in GPT uses a sequence of tokens to predict the next token in the sequence. The model is trained on a huge dataset of text and learns to predict the next token based on the context of the previous tokens.

The GPT model’s architecture is a transformer. In NLP, the transformer enables the model to learn long-range dependencies between tokens, and this understanding is vital for capturing the context of a text sequence.

In training a GPT model, a causal masking technique is used: the tokens that come after the current position are hidden, and the model is asked to predict the next token from the ones it can see. This helps the model learn the relationships between different tokens in a sequence.
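
A small illustration of such a causal mask in PyTorch (an illustration only, not GPT’s actual code): each position may attend to itself and to earlier positions, never to later ones.

```python
# Illustration of a causal attention mask: position i may attend only to
# positions <= i, so future tokens stay hidden during training.
import torch

seq_len = 5
# True marks the attention links that are allowed.
causal_mask = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))
print(causal_mask)

# In attention, the disallowed positions are set to -inf before the softmax,
# so they receive zero weight.
scores = torch.randn(seq_len, seq_len)                   # raw attention scores
scores = scores.masked_fill(~causal_mask, float("-inf"))
weights = torch.softmax(scores, dim=-1)
print(weights)   # row i has non-zero weights only in columns 0..i
```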

A trained GPT model is used for inference. The model can generate text, translate languages or answer questions.

It can be fine-tuned to perform specific language tasks.

The GPT model’s transformer layers process the sequence of tokens. Each token, as we know, is a piece of text, say a word or a character. The model assigns a numerical vector, called an embedding, to each token; the embedding represents its meaning and context. These embeddings are processed through the transformer layers, and ultimately the output sequence is produced.
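
To make one inference step concrete, here is a minimal sketch using the openly available GPT-2 model as a stand-in for the GPT models discussed above (the prompt is made up): the prompt is tokenized, the token embeddings pass through the transformer layers, and the logits give the most likely next token.

```python
# A minimal sketch of a single inference step, using the openly available
# GPT-2 checkpoint as a stand-in for the GPT models described above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Advertising on social media works because"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids   # text -> token ids

with torch.no_grad():
    # Inside the model, each token id is mapped to an embedding vector and
    # passed through the stack of transformer layers.
    logits = model(input_ids).logits        # scores over the whole vocabulary

next_id = int(logits[0, -1].argmax())       # most likely next token
print(tokenizer.decode([next_id]))          # one token of the output sequence
```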

BERT and GPT Models

At present, for natural language processing (NLP), there are two models in wide use: BERT and GPT. Both are large language models (LLMs) based on the transformer architecture. However, there are some key differences between them.

BERT stands for Bidirectional Encoder Representations from Transformers. The model is bidirectional, i.e. it processes text in both directions, which makes it appropriate for question answering and sentiment analysis, where the model has to comprehend the full context. BERT has been trained on a large dataset of text and has 340 million parameters. It is open source and relatively easy to fine-tune for specific tasks. Architecturally, BERT is a transformer encoder: it relies on the encoding mechanism to represent language, and the input and output positions of each token are the same.

GPT stands for Generative Pre-trained Transformer. It is an autoregressive model, i.e. it processes text in one direction only, which suits tasks such as text summarization and translation, where the model has to generate new text from a given prompt. GPT has been trained on a dataset of text only and, in its GPT-2 version, has 1.5 billion parameters; the later versions are proprietary. Architecturally, GPT is a transformer decoder: it is unidirectional and meant for autoregressive inference, generating output one token at a time, with the probability distribution over the next token depending on the previous tokens. In short, these models generate output by predicting one token at a time based on the previous tokens, as the sketch below shows.
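
A small way to see the difference in practice, using openly available checkpoints through the Hugging Face transformers pipelines (the example sentence is invented): BERT fills in a masked token using context from both sides, while GPT-2 continues a prompt left to right.

```python
# BERT vs GPT in two lines of use: fill-mask (bidirectional) versus
# text-generation (autoregressive), with openly available checkpoints.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")
print(fill_mask("Advertising on social media is [MASK]."))     # BERT predicts the gap

generate = pipeline("text-generation", model="gpt2")
print(generate("Advertising on social media is", max_new_tokens=20))  # GPT continues
```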

Dark Patterns

ASCI processed advertising complaints in 2021-22; some 29 per cent were about dark patterns used to lure customers. These ads were promoted by influencers and came from sectors such as e-commerce, fashion, personal care, crypto, food and beverage, and finance.

Let us understand dark patterns. A dark pattern is an attempt by a user interface to trick users into making choices that are detrimental to their interest, e.g. buying an expensive product, paying more than what was initially declared, or making choices based on false or paid-for feedback. Dark patterns impede a customer’s right to be well-informed and constitute unfair trade practices prohibited under the Consumer Protection Act, 2019. The problem can be addressed through self-regulation.

Types of Dark Patterns

Nagging : It consists of persistent, annoying and repetitive prompts and requests for action.

Bait and switch : Here what is delivered is different from what is advertised. Often, there is a switch to a lower quality product or another product.

Disguised ads : Ads are so designed that these look like content, say news articles or user-generated content.

Urgency : Here a sense of scarcity or urgency is created. It is a pressure tactic goading consumers into making a purchase or taking action.

Basket sneaking : Here some additional products/services sneak into the shopping cart without the user’s consent.

Forced action : Consumers are coerced into taking an action which they would not otherwise have taken, e.g. signing up for a service just to access content.

Subscription traps : It is easy to sign up for a service but difficult to cancel it; cancellation is a cumbersome, multi-step process.

Hidden costs : Some costs are kept hidden until the consumer is already well into making a purchase.

Pre-ticked boxes : Some checkboxes are already checked. It is assumed that you will not bother to uncheck them. These boxes are for opting into email newsletters or agreeing to receive promotional material.

Misleading buttons : A button says one thing but does another, such as a ‘cancel’ button that does not cancel or a ‘no thanks’ button that signs you up for something. Such buttons should be done away with.

Disabled links : You click a link to close a pop-up, but it does nothing; you either do whatever the pop-up asks or close the browser tab altogether. The link has been deliberately left non-functional.

Utterly Butterly Amul Girl

Sylvester daCunha, the man behind the ‘utterly butterly’ Amul girl, passed away on 21st June, 2023 in Mumbai.

daCunha was a veteran advertising man who had been associated with the Amul brand since the 1960s. He co-created the Amul girl with his art director, Eustace Fernandes. The Amul girl’s cheeky one-liners created a memorable campaign for the brand, one that celebrated its golden jubilee in 2016.

It is the longest-running ad campaign in the world, built around a single character and topical themes. The whole thing started in 1966, when Kurien’s dairy movement was transforming Gujarat.

The Amul girl is the blue-haired, noseless girl who tossed chucklesome lines into the social flow and made Amul butter a household staple.

The ‘utterly butterly delicious’ qualifier was the contribution of daCunha’s wife, Nisha daCunha.

The style and technique of the Amul campaign have remained unchanged over the years. daCunha in fact pioneered ‘moment marketing’. The catchphrase ‘utterly butterly delicious’ became unforgettable.

The mascot has stood the test of time and is still relevant 57 years after it was conceived.

daCunha is survived by his wife and a son, Rahul daCunha, also an adman. He was a brother of the late Gerson daCunha, another advertising veteran.

We are left with the simple Amul girl: big eyes, a dress with red polka dots, a matching ribbon in her hair and red shoes. She delights us with her witticisms and turns of phrase, her clever and at times cringeworthy wordplay, and her on-the-button references to topical events. She is adorable.