Author: Shabbir Chunawalla

  • AI Voice Cloning

    A few seconds of audio (even a sample shorter than five seconds) is enough to train AI to replicate a person’s voice, including the accent, pauses and speech patterns. Text-to-speech software then renders any input text in the cloned voice.

    Cloning the voice of celebrities after their death, or giving an audio outlet to people who are unable to speak, could be legitimate uses of voice cloning. However, such cloning could also be used for criminal purposes. Voice samples are easy to obtain. It could be an answer to a wrong number. Or a video uploaded by a person about his life and work. Or a public speech or an interview. Or hosting a programme.

    A cloned voice may sound robotic or lapse into gibberish, and the caller may seem to be in a hurry. It is not wise to reveal confidential data on such a call. One can stay alert and cross-examine the caller. However, the voice at the other end can sound very convincing.

    Voice cloning is a serious risk.

  • OpenAI’s Talent Hunt

    OpenAI is always on the lookout for talent. Of late, it has hired three experts in the area of computer vision (CV). They have been pulled away from its rival DeepMind. They will be based in its Zurich, Switzerland office. These individuals are Alexander Kolesnikov, Lucas Beyer and Xiaohua Zhai. They will work on multi-modal AI.

    To begin with, ChatGPT was a text model. Later the company added voice and image features. It then became multi-modal. DALL-E, OpenAI’s image model, is available within ChatGPT. As we have observed in a separate blog, OpenAI has also introduced Sora, an AI video model.

    There is fierce competition among the AI companies for top talent. The compensation offered runs to seven figures or more. This is not unusual, as AI experts are the most sought after. Google recently hired Noam Shazeer, co-founder of Character.AI, in a deal reported at $2.7 billion.

    Beyer has 70,000 followers on X. Key people have also left OpenAI in recent times, e.g. Ilya Sutskever, who launched Safe Superintelligence (SSI). Mira Murati, the former CTO, left the company in September 2024. She is raising money for a new AI venture.

    OpenAI is expanding globally. It has set up a Zurich office. It intends to open outposts in New York, Singapore and Seattle. It already has outposts in London and Tokyo.

    The newly recruited talent will be based in Zurich. Zurich has ETH Zurich, a research university with a reputed computer science department. Apple too has set up a laboratory in Zurich.

  • AI and Sentience

    We have referred in previous write-ups to the issue of consciousness. Can consciousness be infused into an AI system? Can AI too have feelings and emotions? Can it feel happy and sad? In other words, can it have sentience? Some predict that AI will have a semblance of consciousness by 2035. Such a system would have rights just like human beings.

    AI safety bodies are meeting in November 2024 to develop stronger safety frameworks.

    This is quite a divisive issue. It can fragment society, with drastically different views on each side. Birch, an expert on animal sentience, can foresee AI systems of the future having interests and moral significance. There could be different sub-cultures. A new model of AI could be seen as a product or as a new form of conscious entity.

    Scientists have developed some markers for animal consciousness, e.g. an octopus has greater consciousness than a snail or an oyster.

    Some feel further research on AI must consider its potential risks for human beings.

    Consciousness could be looming over AI systems. It could lie in the distant future and may seem impractical. Even then, the possibility cannot be dismissed altogether.

    By the way, intelligence and consciousness are different. The prospect of sentience raises ethical and philosophical questions. Some consider sentient AI to be very dangerous. Some say there is no need to anthropomorphize AI. As we advance towards AGI, the stage where AI matches human cognition, scientists assure us that an AGI-powered machine is possible without being sentient.

    Sentient AI is an AI system capable of thinking and feeling like a human. It can perceive the world around it and have emotions. It is aware of itself and its state of being, and is able to sense or predict the feelings of others. Instead of saying ‘I am hungry’, it says ‘I know I am hungry’ or ‘I want to eat dosa and idlis for breakfast.’ The current AI we have is not capable of experiencing sentience.

  • The Crux of Singularity

    I have already written a blog on Singularity where it was defined in terms of rapid change in different disciplines. As the most profound change could be brought about by AI, we can redefine Singularity in terms of AI. Thus, Singularity is going to be the moment when AI will no longer be under human control. Are we ready for such an eventuality? One expert has put an exact date on this happening.

    Singularity happens when AI surpasses human control. It may be just a few years away. That is a much shorter estimate than most of the current projections.

    Some estimate that AGI could be achieved within 3 to 8 years. AGI is the ability to perform tasks as well as humans. AGI is a precursor to Singularity. There is global excitement about AGI. It means capital investment and human commitment. Talent is being channelled towards achieving AGI.

    Still, the technology has not matured enough to achieve AGI. The pursuit is underway. Maybe it will take many years. Maybe it is within our reach in the near future.

    Asimov, a sci-fi writer, writes about faster-than-light travel navigated by robots, with a hyperspatial engine designed by a supercomputer called the Brain without any human intervention. This elimination of human creators could be described as Singularity. Asimov talks of robots outthinking humans in this work. The idea is superintelligence that exceeds human cognitive abilities. This is the crux of Singularity.

  • Biopharmaceuticals and Patents

    Biopharmaceuticals are medicines made from living cells (such as yeast and bacteria). Conventional drugs, by contrast, are made from chemicals. Biopharmaceuticals are used in the treatment of chronic diseases such as cancer, diabetes, cardiovascular disease and auto-immune diseases. Biopharmaceuticals fall into two categories: biologics and biosimilars. Biosimilars closely resemble biologics already cleared by the authorities. In fact, they are follow-on biologics. They are as efficacious and safe as the first biologic version.

    India’s biopharma industry is the fastest growing globally. India is a pioneer in biosimilars; it was, for instance, the first country to approve a biosimilar for Hepatitis B.

    In India, there are 98 approved biosimilars, of which at least 50 are available in the market.

    Many biologic products will lose patent protection between now and 2030. Thus, there is an opportunity for Indian biopharma to launch biosimilars.

    The government promotes acceleration of the production of biopharmaceuticals. Despite this, India has only a 3 per cent share of the global biosimilar market.

    Patenting Process

    To encourage innovation and to allow the recovery of drug development costs, a 20-year patent is given for exclusive marketing of patented products. As soon as the patent expires, low-cost versions of the patented drug can enter the market. Biosimilars are such low-cost drugs.

    However, the exclusivity of patented products is often lengthened by making minor modifications to the patented drug and obtaining a fresh patent. This blocks the entry of low-cost versions of patented drugs, including biosimilars. To illustrate, trastuzumab (Herceptin), a biologic used to treat breast cancer, got a new patent by introducing a subcutaneous (SC) version. This is patent evergreening.

    India guards against patent evergreening through Section 3(d) of the Patents Act, 1970. It rejects renewal of patents for small innovations that lack substantive improvement. Glivec (imatinib) from Novartis, used to treat leukemia, was denied a patent under this clause. The courts later upheld the decision. Section 3(e) of the Act restricts patenting mixtures of known compounds unless a synergistic effect is proven. Section 3(i) of the Act prevents patents on treatment methods.

    Despite these safeguards, there are hurdles in launching biosimilars. Pertuzumab, used for certain breast cancers, has become controversial. Even in India, more than 70 per cent of granted patents are for products with minor or secondary modifications. In the US, 74 per cent of patents are for existing drugs. Many drugs receive patents to lengthen their life. The European Union has framed guidelines and has seen significant biosimilar adoption. India should be more proactive in this respect.

  • ChatGPT Enriches the Rich

    We are celebrating the second anniversary of ChatGPT, which was introduced in November 2022. It quickly became the fastest growing app in history, with about 200 million active users today. In its early days, it was hailed as a big leap forward, with varied capabilities to generate text, images and code. It had the capacity to influence film making and the education system. It has fulfilled some of these expectations (it is, for instance, a good educational tool), but its influence on other areas of life and business is still an open question.

    In these two years, a handful of tech firms have benefited greatly: the big six tech firms grew in aggregate by more than $8 trillion since ChatGPT’s birth. Enterprise-level adoption, though slow, has been growing continuously. The app must facilitate the transformation of business in general, rather than enriching big tech.

    Consultancy firms promoted generative AI to their clients. The AI business of IT service firms too is going up. We expect to see the benefits of AI percolate down to small businesses and startups. It is expensive to be in the business of building ‘foundational models’. These have to compete with OpenAI, Google and Facebook models. The smaller businesses focus on building services that act as wrappers around existing models. Even here, big tech can introduce a model that snuffs out the models of smaller firms, e.g. speech recognition models suffered when OpenAI introduced Whisper in 2022.

    Maybe the stranglehold of big tech will loosen in future, as specific needs could drive a shift to smaller models. These smaller models are easier and cheaper to use.

    Of late, AI has plateaued. It is, in fact, an opportunity for businesses to introduce generative AI into their processes and assess it for a return on investment.

    Two years after its appearance, ChatGPT’s legacy is that it has enriched big tech. It should lead to the next phase of AI, in which the technology becomes democratic. It should benefit smaller businesses. It should have smaller models to remove entry barriers. Till now, the revolutionary technology has just extended the wealth and influence of big tech.

  • Satcom and Mobile Services

    Satellite communication services and terrestrial networks (mobile services) are pitted against one another. There is an issue of spectrum allocation. The mobile services companies expect the spectrum to be allocated by auction, while satcom companies expect it to be allocated administratively.

    Are the two on par? Terrestrial networks can serve both hand-held and rooftop devices. Satellite services face significant hurdles with mobile devices. They cannot match the terrestrial companies there. The two are on par only in fixed wireless access (FWA) or broadband services.

    Satellite companies cannot replicate the services of terrestrial networks for cellphones, which cannot efficiently receive satellite signals at high frequencies. Thus there is no direct competition with traditional mobile networks.

    The area of overlap is fixed wireless access (FWA). Here too satcom companies face significant challenges compared with their terrestrial counterparts. They have to track fast-moving satellites and hence their terminals are complex. Cellular base transceiver stations (BTS), by contrast, are stationary.

    In addition, satellites operate 600-1200 km away. Base stations (BTS) are just a few hundred metres to a few km away. Thus, satcoms offer lower data speeds than their terrestrial counterparts. Satellite terminals are also more expensive. These are used where terrestrial networks are not available. India has about 29 lakh (2.9 million) BTSs and 8 lakh towers. Starlink, by contrast, has only 7,000 satellites globally and could expand to 40,000 satellites.
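
    One reason distance matters can be shown with the standard free-space path loss formula: a radio signal weakens with the square of the distance travelled. The sketch below is only illustrative. The 2 km terrestrial distance and the 3.5 GHz frequency are assumptions chosen for the comparison, the function names are hypothetical, and real links also depend on antenna gain, transmit power and atmospheric effects.

```python
import math

SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in km per second


def fspl_db(distance_km: float, freq_ghz: float) -> float:
    """Free-space path loss in dB for a distance in km and a frequency in GHz."""
    return 20 * math.log10(distance_km) + 20 * math.log10(freq_ghz) + 92.45


def extra_loss_db(sat_km: float, bts_km: float, freq_ghz: float = 3.5) -> float:
    """Additional path loss of the satellite link over the terrestrial link.

    The frequency term cancels out, so the result depends only on the
    ratio of the two distances.
    """
    return fspl_db(sat_km, freq_ghz) - fspl_db(bts_km, freq_ghz)


def one_way_delay_ms(distance_km: float) -> float:
    """One-way propagation delay in milliseconds over a straight path."""
    return distance_km / SPEED_OF_LIGHT_KM_S * 1000


if __name__ == "__main__":
    # Assumed terrestrial distance of 2 km versus the 600-1200 km quoted above.
    for sat_km in (600, 1200):
        print(sat_km, "km:",
              round(extra_loss_db(sat_km, 2), 1), "dB extra loss,",
              round(one_way_delay_ms(sat_km), 2), "ms one-way delay")
```

    Under these assumptions a satellite 600 km away suffers roughly 50 dB more path loss than a base station 2 km away, which the satellite terminal has to make up with larger or more complex antennas.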

    Mobile operators work on a model of high volumes and low average revenue per user (ARPU). It is the opposite for satellite players. Starlink’s ARPU is $100, compared with $10-15 for mobile operators’ FWA services in India.

    The two kinds of companies are complementary and do not compete with each other. The issue of a level playing field is not relevant.

    Regulatory provisions should not stifle innovation. There should not be lobbying on the basis of a level playing field.

  • AI-driven Drug Discoveries

    Earlier, pharma companies largely relied on time-consuming and expensive methods of drug discovery. The advent of AI changed this. AI now supports research to achieve breakthroughs. It helps in manufacturing and testing novel therapies. Merck has developed an in-house clone of ChatGPT called myGPT with 30000 users. It has been used for automation and efficiency in day-to-day activities, e.g. content writing, translation, content revision, coding assistance and so on.

    Merck also uses SYNTHIA, an AI-powered drug discovery platform. It also uses AIDDISON, the first AI-powered software-as-a-service platform that facilitates screening of billions of molecules. It accelerates the process of drug discovery and reduces its cost.

  • A Fateful Knock

    These are the days of large language models (LLMs). This is the story of two important men in the field of AI and how they started working together.

    Geoffrey Hinton, now a Nobel laureate, was teaching in 2007 at the University of Toronto. He describes how he first met Ilya Sutskever, formerly Chief Scientist at OpenAI. Ilya had completed his master’s in computer science. On a Sunday, Ilya knocked at the door of Hinton’s office. Hinton answered the knock. Ilya asked to work in Hinton’s lab. This is a common practice amongst students: they approach the faculty for lab work. Hinton passed on a paper on backpropagation to Ilya and set another meeting with him a week later. At the next meeting, Ilya came back and said he did not understand the paper. Hinton was disappointed. In fact, Ilya had understood it thoroughly but had an issue with not giving the gradient to a sensible function optimizer. He wanted to improve on the paper. Hinton realized Ilya was special. Ilya did research under Hinton to get his PhD in 2013.

    As we have observed in a previous blog, Alex Krizhevsky, Hinton and Sutskever developed AlexNet (a convolutional neural network, or CNN). AlexNet became a precursor to modern AI models.

    In 2013, Hinton and Ilya began working at Google Brain (after a company they had started was acquired by Google). Hinton continued with Google, while Ilya joined Greg Brockman and Sam Altman to launch OpenAI in 2015.

    We know OpenAI initiated the AI revolution by launching ChatGPT in late 2022. Ilya was overseeing the research at OpenAI as the Chief Scientist.

    Hinton was awarded the Nobel Prize in Physics in 2024 for inventions in ML. Ilya left OpenAI and is now working on SSI.

    It was a knock at Hinton’s door on a placid Sunday that changed the world forever.

  • Vanilla RAG and Agentic RAG

    We have already discussed RAG (retrieval augmented generation). It accesses external knowledge sources to respond to user queries. However, we do need a more nuanced, complex and adaptive RAG. The traditional vanilla RAG has its limitations. Agentic RAG has now emerged. It is an advanced architecture that combines the foundational principles of RAG with the autonomy and flexibility of AI agents.

    Vanilla RAG is a linear pipeline. User queries are processed through a single retrieval step. It struggles with flexibility. There is no iterative refinement. Agentic RAG addresses these shortcomings. Agents act autonomously. They coordinate complex tasks: planning, multi-step reasoning and tool utilization. The retrieval system becomes dynamic.
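
    The linear vanilla pipeline can be sketched in a few lines. This is a toy illustration under stated assumptions, not a production design: the corpus `DOCS`, the word-overlap `retrieve` function and the `generate` stand-in (which only assembles a prompt instead of calling an LLM) are all hypothetical.

```python
import re
from typing import List

# Hypothetical toy corpus; a real system would query a vector store.
DOCS = [
    "RAG retrieves external documents before generating an answer.",
    "Agentic RAG adds autonomous agents to the retrieval pipeline.",
    "Zurich hosts ETH Zurich, a research university.",
]


def tokens(text: str) -> set:
    """Lower-case word set, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, docs: List[str], k: int = 2) -> List[str]:
    """Rank documents by the number of words they share with the query."""
    q = tokens(query)
    ranked = sorted(docs, key=lambda d: len(q & tokens(d)), reverse=True)
    return ranked[:k]


def generate(query: str, context: List[str]) -> str:
    """Stand-in for an LLM call: it only builds the prompt text."""
    return "Context:\n" + "\n".join(context) + "\nQuestion: " + query


def vanilla_rag(query: str) -> str:
    # One pass only: retrieve, then generate. Nothing is checked or retried.
    return generate(query, retrieve(query, DOCS))


if __name__ == "__main__":
    print(vanilla_rag("what is agentic rag"))
```

    Whatever the retriever returns goes straight into the prompt, which is why this design has no room for refinement.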

    Agents are incorporated at various stages of the RAG pipeline. Agents decide whether external knowledge is required. They select apt retrieval tools such as vector search, web search and APIs. They formulate queries customized for the task. After retrieving data, agents validate it. Agents can resolve queries with accuracy and speed from internal sources, documentation and community fora. It is similar to the fine-tuning of an LLM.
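
    The agent steps listed above (decide, select a tool, formulate the query, validate) can be sketched as a small control loop. Everything here is an assumption made for illustration: the two tools return canned text, and the rules in `needs_retrieval`, `formulate` and `is_valid` are simple heuristics standing in for decisions an LLM-based agent would make.

```python
import re
from typing import Callable, Dict, List, Optional

STOP_WORDS = {"what", "is", "does", "the", "a", "an", "for", "of"}


def tokens(text: str) -> set:
    """Lower-case word set, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def vector_search(query: str) -> List[str]:
    """Hypothetical internal knowledge base lookup."""
    kb = ["Agentic RAG combines RAG with autonomous AI agents."]
    return [doc for doc in kb if tokens(query) & tokens(doc)]


def web_search(query: str) -> List[str]:
    """Hypothetical web search; returns a canned result."""
    return ["Web result: RIG stands for retrieval interleaved generation."]


# Tools are tried in this order: internal source first, then the web.
TOOLS: Dict[str, Callable[[str], List[str]]] = {
    "vector_search": vector_search,
    "web_search": web_search,
}


def needs_retrieval(query: str) -> bool:
    """Step 1: decide whether external knowledge is required (toy rule)."""
    return query.lower().strip() not in {"hi", "hello", "thanks"}


def formulate(query: str) -> str:
    """Step 3: rewrite the query for retrieval by dropping stop words."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    return " ".join(w for w in words if w not in STOP_WORDS)


def is_valid(search_query: str, passages: List[str]) -> bool:
    """Step 4: accept results only if they share a term with the query."""
    return any(tokens(search_query) & tokens(p) for p in passages)


def agentic_rag(query: str) -> Dict[str, object]:
    """Return the tool that produced valid context, plus that context."""
    result: Dict[str, object] = {"tool": None, "context": []}
    if not needs_retrieval(query):
        return result
    search_query = formulate(query)
    for name, tool in TOOLS.items():  # Step 2: select a tool, fall back if needed
        passages = tool(search_query)
        if is_valid(search_query, passages):
            return {"tool": name, "context": passages}
    return result


if __name__ == "__main__":
    for q in ("hello", "What is agentic RAG?", "What does RIG stand for?"):
        print(q, "->", agentic_rag(q)["tool"])
```

    Unlike a single-pass pipeline, the loop can skip retrieval altogether or fall back to a second tool when the first one returns nothing useful.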

    The architecture is not confined to a single agent. It can use multiple agents.

    Agentic RAG has its limitations too. It may fail to respond when the information is not available in the database. That is a waste of compute. In addition, it does not scale with more compute.

    Google has moved to RIG (retrieval interleaved generation).