As we know, Apple's iPhones are manufactured in China by Foxconn. We also know that the chips powering generative AI are made by the American company Nvidia. The two companies would like to tie up to create 'AI factories': powerful data processing centres that will drive next-generation products such as electric cars. These centres will speed the transition to the new AI era, which includes the digitalisation of manufacturing and inspection workflows, AI-powered electric vehicles and robotics, and language-based generative AI services.
-
Adobe’s New AI Tools
Adobe, famous for Photoshop, entered the AI race in early 2023 by introducing a new family of generative AI tools.
Adobe's AI tool is Firefly, which creates images, text effects, audio, vectors and 3D content for users. Competing with tools such as Midjourney, DALL-E and Stable Diffusion, Adobe has integrated AI into its mainline products, ensuring that this integration fuels creativity and delivers power and precision.
People have generated more than 3 billion images with Adobe Firefly since it was announced in early 2023. Adobe is taking measured steps to roll out AI features across its popular products, engaging actively with the creative community through beta releases in the hope that feedback will make future versions better.
Firefly is trained on content from Adobe's stock library, together with openly licensed content and public-domain content whose copyright has lapsed.
Adobe Firefly is a generative AI imaging tool. Adobe has also added AI features to Illustrator and Adobe Express, and has improved Photoshop's text-to-image capabilities.
Adobe announced all this at its annual MAX event in Los Angeles, California. The keynote was held in the Peacock Theatre.
-
Co-pilot and Microsoft
Microsoft approached AI through the lens of how people really want to interact with this emerging technology, and arrived at the concept of a co-pilot. The first was GitHub Copilot, a programming partner that helped developers write code more efficiently and effectively. It should be noted that a co-pilot is not an auto-pilot: it keeps humans at the centre.
Microsoft then introduced a co-pilot for the web. Microsoft 365 brings co-pilots to Teams and Word, and there are co-pilots in PowerPoint and Excel. There could be a co-pilot in security as well.
The intent is to have co-pilots in products across the spectrum, whether in Windows or Microsoft 365.
Microsoft had an AI layer with Cortana for a few years, so the present rendezvous with AI is not sudden; it is the result of more than ten years of work and research. There was AI in Word as autocorrect, and there was Designer in PowerPoint. LLMs, however, are more powerful and have given us a new interface, enhancing our ability to talk to the computer. Such question-and-answer sessions were not possible in the past.
Humans are well-versed in asking questions; computers have not been great at answering them. Search results make you hop from one link to another. Now we can ask specific questions, and if the answer does not satisfy us, we can ask follow-up questions. The same is possible with a Word document. It has become a natural process.
Microsoft is a tool company and a platform company. Its security co-pilot allows security researchers to move at machine speed.
Microsoft has evolved and published two versions of its Responsible AI Standard. Technology is both a tool and a weapon.
These are the views of Frank X. Shaw, chief communications officer of Microsoft, paraphrased here for the benefit of readers.
-
Biased Bots
In recruitment and selection, employers in countries such as the US use some form of AI to screen and rank candidates for hiring. Several Black candidates have observed bias against them in these AI algorithms, and bias has also been seen against disabled candidates and candidates over 40. One algorithm discriminated against CVs in which the word 'women's' occurred.
Many of these AI tools have proved unduly invasive of workers' privacy and discriminatory against women, people with disabilities and people of colour.
Federal agencies are looking at potential discrimination arising from the datasets that train AI systems and at the opaque 'black box' models that make it difficult to exercise anti-bias diligence.
Is this ‘responsible AI’? Can we indulge in automation in the recruitment and selection market without any restriction? The issue is how to regulate the use of AI in hiring and guard against algorithmic bias.
-
Implications of AI
As we already know, three AI scientists, Geoffrey Hinton, Yann LeCun and Yoshua Bengio, won the Turing Award in 2019 for their outstanding work in deep learning and neural networks.
Despite their collaboration, Bengio and LeCun hold different opinions on AI's potential risks.
In October 2023, there was a debate about the potential risks of AI between Yann LeCun and Yoshua Bengio. As we know, LeCun is Facebook's chief AI scientist. He rolled out the debate on his Facebook page, calling on the silent majority of AI scientists to express their opinions about the reliability of AI. It gave rise to a lively discussion, eliciting comments from respected members of the AI community.
Bengio, of the University of Montreal, responded to LeCun's post. He did not agree with LeCun's perspective on AI safety and advised prudence in designing AI systems. He was not in favour of open-source AI systems, comparing their release to the free distribution of dangerous weapons.
LeCun focused on building safe systems but advised against dwelling on catastrophic scenarios. He feels there is enough funding to make AI safe and reliable, and he does not accept the comparison of open AI systems with the free distribution of dangerous weapons. In his view, AI is meant to enhance human intelligence, not to cause harm.
Eisner from Microsoft also contributed to the debate, supporting Bengio's weaponry analogy. It was agreed that though there cannot be a zero-risk situation, access could be restricted to minimize harms.
The AI debate is no longer restricted to academics; it has drawn the attention of thinkers and policymakers. With the field advancing so fast, there is a need for fruitful debate about the implications of AI.
-
Important Concepts of Transformer Architecture
Add and Normalize
In transformer architecture, we come across the term 'add and normalize'. The first step is 'add': a residual connection adds the input of a sublayer, such as self-attention or the feedforward network, to its output. This prevents the vanishing-gradient problem and makes it possible for the model to learn deeper representations. The second step is 'normalize': the result is layer-normalized across the feature dimension, which stabilizes the training process and reduces dependency on initialization.
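As a rough illustration, here is a minimal NumPy sketch of the 'add and normalize' step (illustrative only; real implementations also learn per-feature scale and bias parameters, omitted here):

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    # Normalize each position across the feature (last) dimension.
    mean = x.mean(axis=-1, keepdims=True)
    std = x.std(axis=-1, keepdims=True)
    return (x - mean) / (std + eps)

def add_and_norm(sublayer_input, sublayer_output):
    # 'Add': the residual connection; 'Normalize': layer normalization.
    return layer_norm(sublayer_input + sublayer_output)
```

Here `sublayer_output` would come from self-attention or the feedforward network acting on `sublayer_input`.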
Multi-head Attention
Multi-head attention enables a neural network to learn different aspects of the input sequence by applying multiple attention functions in parallel. The idea is that different queries, keys and values can capture different semantic information from the same input. To illustrate, one attention head can focus on the syntactic structure of a sentence, while another focuses on the semantics of the words.
There are four steps in multi-head attention.
1. First, the input queries, keys and values are projected into h subspaces using linear transformations, where h is the number of attention heads. Each subspace has a lower dimension than the original input space.
2. Second, each projected query, key and value is fed into a scaled dot-product attention function, which computes the attention weights and outputs for each subspace independently.
3. Third, the outputs of the h attention heads are concatenated and linearly transformed into the final output dimension.
4. Last, the final output is optionally passed through layer normalization and a feedforward network.
Multi-head attention has several advantages: it can learn more complex and diverse patterns from the input sequence by combining multiple attention functions; it is cost-effective and improves memory usage by reducing the dimensionality of each subspace; and it makes the model more robust by introducing more parameters. Multi-head attention can be implemented from scratch in TensorFlow and Keras; a sketch of the same idea follows below.
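To make the four steps concrete, here is a minimal from-scratch sketch in plain NumPy rather than TensorFlow/Keras, to keep it dependency-light. It is a simplified illustration, not a library API: the weight matrices `w_q`, `w_k`, `w_v` and `w_o` are assumed to have been learned elsewhere, `d_model` is assumed divisible by `h`, and the optional fourth step is omitted.

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    # Step 2: softmax(QK^T / sqrt(d_k)) applied to V, per head.
    d_k = q.shape[-1]
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def multi_head_attention(x, w_q, w_k, w_v, w_o, h):
    # x: (seq_len, d_model); each weight matrix: (d_model, d_model).
    seq_len, d_model = x.shape
    d_head = d_model // h

    # Step 1: project q, k, v and split into h lower-dimensional subspaces.
    def project(w):
        return (x @ w).reshape(seq_len, h, d_head).transpose(1, 0, 2)

    q, k, v = project(w_q), project(w_k), project(w_v)

    # Step 2: scaled dot-product attention in each subspace independently.
    heads = scaled_dot_product_attention(q, k, v)  # (h, seq_len, d_head)

    # Step 3: concatenate the heads and apply the final linear transformation.
    concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return concat @ w_o
```

For example, with `d_model = 8` and `h = 2`, each head attends within a 4-dimensional subspace.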
Multi-head attention and Self-attention
These two are related, yet distinct, concepts in transformer architecture.
Attention is the ability of the network to attend to different parts of another sequence while making predictions.
Self-attention is the ability of the network to attend to different parts of the same sequence while making predictions.
Multi-head attention makes it possible for the neural network to learn different aspects of the input or output sequence by applying multiple attention functions in parallel.
Self-attention can capture long-range dependencies and contextual information from the input sequence. It can be combined with multi-head attention, and it can be regularized by applying dropout or other methods to the attention weights, which reduces overfitting.
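As a toy sketch of that regularization point (single-head, NumPy, training-time only), dropout can be applied directly to the attention weights. Note that the queries, keys and values are all projections of the same sequence `x`, which is what makes this self-attention:

```python
import numpy as np

def self_attention_with_dropout(x, w_q, w_k, w_v, p_drop=0.1, rng=None):
    # Self-attention: q, k and v all come from the same sequence x.
    rng = rng or np.random.default_rng(0)
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Regularization: randomly drop attention weights during training.
    mask = rng.random(weights.shape) >= p_drop
    weights = weights * mask / (1.0 - p_drop)
    return weights @ v
```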
-
Private Large Language Models (LLMs) in Indian Banking
LLMs power generative AI applications such as ChatGPT. They facilitate communication and provide clarity of information, but they are cost- and time-intensive to develop.
Banking leadership cannot be achieved solely on the basis of deposit mobilisation and treasury operations. Technology is a vital ingredient that helps build non-replicable customer relationships, which in turn build the coveted competitive advantage.
HDFC Bank and its rival Axis Bank are contemplating the adoption of private LLMs trained on their internal data. Such LLMs would bring generative AI to customers, offering a better interface and a more intuitive experience.
HDFC Bank will launch a private LLM-powered website in the next six months; currently, the site is in beta. The LLM would provide the ability to convert buying decisions using a large number of data points. Through simple prompts, a customer could quickly access the information he is seeking about any product, and ultimately get details of his own bank account.
A private LLM will also be leveraged to write credit assessment reports, business requirement documents and so on.
Axis Bank is contemplating generative AI-based virtual assistants for customers. On the operations side, it would use the models' inference capabilities to automate work. It plans to use private LLMs for specific use cases by the end of 2024 and is engaging with cloud service providers (CSPs) and software-as-a-service (SaaS) providers to explore various options.
-
Mind Your Language
Autocorrect, a spell-check feature, is pre-installed on most virtual keyboards on Android and iOS phones. It also identifies misspelt words in apps such as MS Word and Google Docs, correcting spellings even as we type.
There is a predictive typing feature that infers the word or words likely to appear next in a sentence, and an autocomplete feature that predicts the words that will complete it.
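As a toy illustration of the autocorrect idea, a crude corrector can be built from string similarity alone. This is a sketch using Python's standard library, not how production autocorrect engines work, and the tiny `DICTIONARY` is made up:

```python
import difflib

DICTIONARY = ["language", "grammar", "spelling", "keyboard", "predictive"]

def autocorrect(word):
    # Suggest the closest dictionary word by string similarity.
    matches = difflib.get_close_matches(word.lower(), DICTIONARY, n=1, cutoff=0.7)
    return matches[0] if matches else word

print(autocorrect("langauge"))  # -> "language"
```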
All this appeared in the early 1990s, when the Internet was nascent, to enable faster and error-free typing.
There is a large market for writing-enhancement software. Grammarly is an AI grammar-checking tool; after the US, India is its largest market. Google's Proofread, part of Duet AI, its collaborative offering for Google Workspace, competes with Grammarly and operates on a subscription model. Grammarly offers a free tier in addition to subscription plans.
Students run ChatGPT-generated material through AI paraphrasing tools such as QuillBot to avoid plagiarism detection.
Previously, people were judged on the language they used, their grammar and spelling. By using such software, more and more people make sure that what they write stands out.
Audio-visual media these days teaches us new words. Previously we learnt new words from text, with a good chance of getting the pronunciation wrong; video teaches us the correct phonetics of new words. The earlier generations had good writing skills, whereas the new generation has good speaking skills. People now even love to write the way they speak.
LLMs can now be trained to understand how the new generation uses the language.
-
Predictive AI
All of us are now aware of generative AI, where the model is trained on massive data so that it can generate derivatives of the data, such as a summary.
Another type of AI is predictive AI, which also uses a lot of data, subjecting it to statistical techniques such as clustering and regression to predict an outcome.
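As a toy sketch of the regression idea (entirely made-up numbers; this is not NetApp's actual pipeline), one could fit historical telemetry against observed outcomes and predict new cases:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical disk telemetry: [temperature_C, reallocated_sectors],
# with observed remaining lifetime in hours as the target.
X = np.array([[35, 0], [42, 3], [50, 12], [55, 30], [60, 55]])
y = np.array([9000, 7000, 4000, 1500, 300])

model = LinearRegression().fit(X, y)
print(model.predict(np.array([[48, 10]])))  # rough remaining-lifetime estimate
```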
NetApp works in this area. They have been working on it since 2018 in collaboration with Nvidia.
They use predictive analytics to predict failures in all types of systems. Clinical-trial data for drugs can be compared with other datasets to detect anomalies, which speeds up the presentation of data to regulatory authorities. Predictive AI can also be used to predict forest fires and in medical imaging.
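The anomaly-detection side can be sketched the same way (invented readings; an isolation forest is one standard technique for this, not necessarily the one NetApp uses):

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Invented lab readings from a trial; one value is clearly out of range.
readings = np.array([[5.1], [5.0], [5.2], [4.9], [5.1], [9.8]])
clf = IsolationForest(contamination=0.2, random_state=0).fit(readings)
print(clf.predict(readings))  # -1 flags the anomalous 9.8 reading
```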
NetApp also helps customers store their images and documents. Google's and Nvidia's generative AI layers can then be used to search these documents, while excluding some documents from search so that they remain private.
NetApp also helps with 'model traceability': keeping track of the documents and datasets used for individual models, so that models can then be compared in terms of accuracy.
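One hypothetical way to picture such a traceability record (field names are illustrative, not NetApp's actual schema):

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ModelRecord:
    # Hypothetical traceability entry: which data trained which model version.
    model_name: str
    version: str
    training_datasets: List[str] = field(default_factory=list)
    accuracy: Optional[float] = None  # filled in after evaluation

record = ModelRecord("credit-risk", "1.2", ["trials_2023.csv", "claims.parquet"])
record.accuracy = 0.91  # models can then be compared on accuracy
```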
It is difficult to monetize generative AI. However, predictive AI is easily monetizable, as its outcome is powerful and impactful.
-
Google’s Anti-trust Case
We have already discussed the case accusing Google of anti-competitive practices to maintain its dominant position. As we have observed, the Justice Department's case against Google centres on a series of contracts under which Google pays web-browser and smartphone makers to make Google the default search engine.
Michael Roszak, a senior Google executive (vice-president for finance), wrote notes on communications for a training programme in 2017. In the notes, Roszak writes that search advertising is one of the greatest business models ever created. This business, he continues, ignores one of the fundamental laws of economics: it ignores the demand side of the equation (users and queries) and focuses only on the supply side of advertisers. He goes on to compare the search advertising business to the illicit businesses of cigarettes and drugs.
The document was used as a piece of evidence in the case, and Roszak testified at the trial in September 2023. However, at Google's request, the government removed public web access to emails, chats and internal presentations. The exhibits were reposted after the judge brokered a compromise creating a procedure for their posting, and Roszak's notes were made publicly available on 28 September 2023.
The document is full of exaggeration and hyperbole. Roszak testified that he could not recall any presentation on the subject, said the document was never sent to anyone else at Google, and explained that he was saying things he did not believe as part of the course presentation.