Blog

  • Issues in Chip Push

    India has a cumbersome administrative structure, a shortage of experienced engineers, high tariffs on imports of electronic components, and inadequate infrastructure.

    India could consider signing a free trade agreement (FTA) with economies such as Taiwan. That would facilitate imports of components, equipment and materials.

    It is necessary to look at the whole supply chain: IC design, testing, packaging and material supply.

    Taiwan is the world’s leading supplier of advanced chips — it caters to about 90 per cent of the market. Given US-China tensions, Taiwanese firms would like to diversify their supply chains to alternative destinations. India has attracted PSMC into a tie-up, but it is a technical tie-up; PSMC has only a marginal role in financing the plant.

  • Tata Electronics and Chips

    Tata Electronics has partnered with Powerchip Semiconductor Manufacturing Corporation (PSMC) to produce high-end 14 nm chips. They also plan to manufacture 28 nm chips at their fabrication unit at Dholera, Gujarat.

    The 14 nm chips can be incorporated into very small components — these become more efficient, powerful and faster. This is expected to attract a huge order book.

    Currently, 28-90 nm chips have a 50 per cent market share. Tata’s fabrication unit at Dholera will have a capacity of 50,000 wafers per month and will produce 3 billion chips every year. These chips will cater to high-performance computing, electric vehicles and consumer electronics. The fab will serve both the commercial and the strategic sectors.

    The first chip from the plant is expected to hit the market by the end of 2026.

    Apart from the fab unit, the Group is investing in an ATMP (assembly, test, marking and packaging) project in Assam. It is expected to start commercial production by late 2025 or early 2026.

  • Semi-Conductor Laboratory (SCL)

    In Mohali, Punjab, the Semi-Conductor Laboratory (SCL) was set up in 1984 on a 51-acre campus, three years before Taiwan Semiconductor Manufacturing Company (TSMC) was founded. A mysterious fire some 35 years ago destroyed its facilities. SCL, a government-owned company, resumed its operations in 1995.

    SCL is trying again to earn its place in the sun. It has received Rs 10,000 crore from the government for its modernization plan.

    The US-based Micron has set up an ATMP facility at Sanand, Gujarat, and SCL is facilitating Micron’s operations.

    SCL has two fabrication lines — for 6-inch and 8-inch wafers. It has an assembly, test, marking and packaging (ATMP) unit and a compound semiconductor unit. SCL will support startups and industry with R&D and prototyping.

    SCL has been serving strategic sectors such as space and satellites, railways and telecom, supplying them 180 nm chips.

    Once it moves to 28 nm technology, it aims to increase its capacity to 24,000 wafers per month; at present, it rolls out 700 wafers per month. A wafer acts as the foundation for creating chips; chip manufacturing starts with wafer preparation.

    The market for 180 nm chips is diminishing, though they still meet critical requirements of the automotive industry. SCL successfully produces chips from the 1-2 micron range down to 180 nm; its partner for 180 nm chips is Tower Semiconductor of Israel.

    SCL collaborated with ISRO on the Chandrayaan-3 mission by fabricating the Vikram processor. It also makes CCDs and image sensors for ISRO.

    SCL is progressing towards a 12-inch wafer fab with 28 nm technology, for which it will have to find a technical partner. Global companies are reluctant to share their technical know-how; the Tata Group was fortunate to get it from PSMC. There is a ray of hope — IBM and IMEC of Belgium have expressed interest in partnering to develop 28 nm technology for R&D purposes. The Tata Group too has expressed interest in modernizing SCL.

  • Towards AGI

    At a recent talk at Stanford eCorner, Sam Altman said, ‘Whether OpenAI burns a few billion a year, I don’t care… (the focus is) to stay on a trajectory to create more value for society.’

    According to Sam Altman, they are building AGI, and that makes all this spending worth it. GPT-4, by comparison, is the dumbest model. GPT-5 is going to be smarter than GPT-4, and GPT-6 is going to be smarter than GPT-5. Right now, they are not at the top of the curve, but each model is always going to be better.

    At the outset, OpenAI did not realize it would require funding on this scale. The idea was to push the frontiers of AI research.

    There are discussions on X about OpenAI’s burn rate, which exceeds the GDP of some countries. However, once AI reaches AGI, it would be a minuscule price to pay for something that could transform humanity.

    Earlier this year, Altman was reported to be seeking capital investment of $5 to $7 trillion to reshape the business of chips and AI.

    AGI matches the full cognitive abilities of human beings, and at times even surpasses them. Human beings learn from the environment, observation, reading, interactions, experiences and so on. Today’s AI is narrow intelligence; the journey is towards general intelligence. The concept goes back to Alan Turing, who proposed that a computer whose responses cannot be distinguished from a human being’s can be considered intelligent.

    AGI will have a wide variety of applications. At the same time, AGI could turn rogue, take charge of things and act in ways detrimental to human life. We should be cautious and regulate the journey towards AGI so that it remains aligned with ethical values.

  • Autonomous Cars

    Musk met the Chinese authorities in April 2024 to receive tentative approval for Tesla’s Full Self-Driving (FSD) software in China. Tesla may continue teaming up with Baidu, which licenses mapping data and a lane-level navigation service, for FSD. Tesla has been using these services since 2020.

    Tesla’s FSD is the most autonomous version of its advanced driver assistance system (ADAS). Tesla vehicles have the Autopilot feature (Tesla’s trademarked ADAS suite) and are equipped with multiple cameras and vision-processing software. Autopilot includes Traffic-Aware Cruise Control (to match the vehicle’s speed to surrounding traffic) and Autosteer (to assist in steering within a clearly marked lane). FSD, released in 2020, is an upgrade on Autopilot requiring minimal driver intervention. Auto Lane Change assists in moving to an adjacent lane on the highway (when Autosteer is engaged). Autopark helps with parallel and perpendicular parking. Summon moves the vehicle in and out of a tight space (using the mobile app or key). Stop Sign Control identifies stop signs and traffic lights, and the vehicle automatically slows down and stops on approach.

    The software is regularly updated over the air.

    Still, doing away with the driver has many issues. A decade has elapsed between 2014 and 2024, and driverless cars remain elusive. There are issues such as jumping red signals and failing to recognize pedestrians (say, a cyclist disappearing behind a parked car). Google’s Waymo and GM’s Cruise too have had limited success, and that too in ring-fenced, geofenced areas.

    Tesla has been beta testing its full self-driving system since 2020.

    Carmakers have yet to figure out the right technology mix for autonomous cars — lidar, radar, sensors, cameras. Tesla relies on cameras, whereas other carmakers depend on multiple sensors. On-device computers therefore require huge processing power.

    AI, especially LLM-based generative AI, will be helpful in future as on-board computers learn from the data aggregated by sensors and cameras.

  • Types of Generative AI

    When AI creates human-like content, it is called generative AI. It is a relatively new field. The content could be text, pictures, videos, poetry or computer code. This is achieved by four techniques that have evolved over the last ten years, drawing inspiration from deep learning, transformers and neural networks. These techniques rely on data to learn how to generate content.

    LLMs are foundation models — neural networks trained on huge amounts of data so as to learn the relationship between words. This enables them to predict the next word that should appear in any sequence of words. They are further trained on specific data — fine-tuned to carry out specific tasks.

    In LLMs, the first step is tokenization (splitting text into words, parts of words, prefix-suffix combinations and other linguistic elements). The next step is a matrix transformation that converts the tokens into numerical data analyzable by computers.

    LLMs are useful in natural language processing (NLP).
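
    As a rough illustration of these two steps, here is a toy sketch in Python (NumPy). The vocabulary, tokenizer and weight matrices are invented for illustration; a real LLM uses a learned sub-word tokenizer and billions of trained parameters.

        import numpy as np

        # Toy vocabulary and tokenizer (purely illustrative; real LLMs use
        # sub-word tokenizers with vocabularies of tens of thousands of tokens).
        vocab = ["<pad>", "the", "cat", "sat", "on", "mat"]
        token_to_id = {tok: i for i, tok in enumerate(vocab)}

        def tokenize(text):
            # Split on whitespace and map each word to its integer id.
            return [token_to_id[w] for w in text.lower().split()]

        rng = np.random.default_rng(0)
        d_model = 8                                   # tiny embedding size for the demo
        embeddings = rng.normal(size=(len(vocab), d_model))   # token id -> vector
        output_proj = rng.normal(size=(d_model, len(vocab)))  # hidden state -> logits

        def next_token_distribution(text):
            ids = tokenize(text)                      # step 1: tokenization
            vectors = embeddings[ids]                 # step 2: tokens -> numerical data
            hidden = vectors.mean(axis=0)             # stand-in for the transformer layers
            logits = hidden @ output_proj             # a score for every vocabulary token
            probs = np.exp(logits - logits.max())     # softmax: scores -> probabilities
            return probs / probs.sum()

        probs = next_token_distribution("the cat sat on the")
        print(vocab[int(np.argmax(probs))])           # most likely next token (untrained, so arbitrary)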

    Diffusion models are another approach to generative AI. They follow a process called iterative denoising. Given a text prompt, the computer has to create an image, and it starts from an image of pure random noise. It is like drawing an image by scribbling randomly on a piece of paper. These scribbles are then refined using what was learnt from the training data: at each step, some noise is removed and the image is adjusted towards the desired characteristics. The model thus generates an entirely new image that matches the text prompt (one not found in the training data).

    Stable Diffusion and DALL-E follow this process to create photo-realistic images; it can generate videos too, as demonstrated by Sora.
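
    The sketch below is a heavily simplified illustration of iterative denoising. It starts from pure noise and repeatedly nudges the image towards a target; in a real diffusion model the nudge comes from a trained neural network conditioned on the text prompt, whereas here the prompt is just a fixed target array and the denoiser is a hand-written update.

        import numpy as np

        rng = np.random.default_rng(0)
        H, W = 8, 8
        target = np.full((H, W), 0.5)          # stand-in for "what the prompt describes"
        image = rng.normal(size=(H, W))        # start from pure random noise

        def denoise_step(img, step, total_steps):
            # One iteration: estimate the noise still present and remove a little of it.
            # In a real diffusion model this estimate comes from a trained network.
            predicted_noise = img - target
            return img - predicted_noise / (total_steps - step)

        steps = 50
        for t in range(steps):
            image = denoise_step(image, t, steps)

        print(np.abs(image - target).max())    # ~0.0: the noise has been removed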

    Generative Adversarial Networks (GANs) emerged in 2014 to generate synthetic content (both text and images). Here two different algorithms are pitted against each other: one is called the generator and the other the discriminator, and each has to outdo the other. The generator tries to generate realistic content, while the discriminator tries to decide whether that content is real or not. Each learns from the other, and there is constant improvement until the generator learns to create content that is as close as possible to the real thing.

    GANs are versatile tools for generating pictures, video, text and sound. They are extensively used for computer vision and NLP tasks.
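
    Below is a minimal adversarial training loop in Python (PyTorch) for a toy task: generating one-dimensional samples that resemble a Gaussian "real" distribution. The network sizes, learning rate and data are arbitrary choices for illustration, not any particular published GAN.

        import torch
        import torch.nn as nn

        torch.manual_seed(0)
        noise_dim, data_dim = 4, 1

        # Generator: noise -> fake sample.  Discriminator: sample -> probability it is real.
        G = nn.Sequential(nn.Linear(noise_dim, 16), nn.ReLU(), nn.Linear(16, data_dim))
        D = nn.Sequential(nn.Linear(data_dim, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())

        opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
        opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
        bce = nn.BCELoss()

        for step in range(2000):
            real = torch.randn(64, data_dim) * 0.5 + 2.0      # "real" data drawn from N(2, 0.5)
            fake = G(torch.randn(64, noise_dim))

            # Discriminator update: label real samples 1 and fake samples 0.
            d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
            opt_d.zero_grad(); d_loss.backward(); opt_d.step()

            # Generator update: try to make the discriminator call the fakes real.
            g_loss = bce(D(fake), torch.ones(64, 1))
            opt_g.zero_grad(); g_loss.backward(); opt_g.step()

        print(G(torch.randn(1000, noise_dim)).mean().item())   # drifts towards 2.0 as G improves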

    Neural Radiance Fields (NeRFs) are a more recent technique, which emerged in 2020. They are used to create representations of 3D objects and scenes using deep learning.

    Certain portions of the image are not seen, say an object in the background which an object in the foreground obscures, or the rear aspects of an object being photographed from the front.

    Here the model has to predict the volumetric properties of objects. These are mapped onto 3D spatial coordinates using neural networks that capture the geometry and the reflection of light around an object.

    This technique has been pioneered by Nvidia. NeRFs are used in simulations, video games, robotics, architecture and urban planning.
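
    At its core, a NeRF is a function that maps a 3D point (and viewing direction) to a colour and a density, which are then accumulated along camera rays by volume rendering. The sketch below shows that accumulation step in NumPy, with a hand-written sphere standing in for the trained neural network.

        import numpy as np

        def radiance_field(points):
            # Stand-in for the trained MLP: returns (rgb, density) for each 3D point.
            # Here: a solid red sphere of radius 1 centred at the origin.
            inside = (np.linalg.norm(points, axis=-1) < 1.0).astype(float)
            density = inside * 5.0                                  # opaque inside the sphere
            rgb = np.stack([inside, 0.1 * inside, 0.1 * inside], axis=-1)
            return rgb, density

        def render_ray(origin, direction, n_samples=64, near=0.0, far=4.0):
            # Sample points along the ray and query the field at each of them.
            t = np.linspace(near, far, n_samples)
            points = origin + t[:, None] * direction
            rgb, sigma = radiance_field(points)

            # Classical volume rendering: alpha-composite the samples front to back.
            delta = t[1] - t[0]
            alpha = 1.0 - np.exp(-sigma * delta)                    # opacity of each segment
            transmittance = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
            weights = transmittance * alpha
            return (weights[:, None] * rgb).sum(axis=0)             # final pixel colour

        # A ray shot from z = -3 straight through the sphere comes out mostly red.
        print(render_ray(np.array([0.0, 0.0, -3.0]), np.array([0.0, 0.0, 1.0])))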

    Hybrid models of generative AI combine the various techniques described above to enable more innovative content generation.

    DeepMind’s AlphaCode combines LLMs with reinforcement learning (RL) to generate high-quality computer code.

  • Old Pals Turn Rivals in AI Race

    Demis Hassabis and Mustafa Suleyman were both London residents and now compete with each other from rival companies in AI — one at Google and the other at Microsoft.

    Mustafa Suleyman is the son of a Syrian taxi-driver father and a nurse mother. At the age of 11, he was accepted as a student at Queen Elizabeth’s School. The family moved from one of the roughest areas of London to a safer locality in the north.

    Demis Hassabis and Mustafa Suleyman came to know each other because Demis’s younger brother was a friend of Suleyman. In those days, Demis was 20, a chess player and a video game designer. His parents ran a toy shop in London.

    Dr. Hassabis, now 47, is the Chief Executive of Google DeepMind, Google’s division devoted to AI. Suleyman, now 39, has been appointed the Chief Executive of Microsoft AI.

    Their path from London to Big Tech was quite unusual. In 2010, they co-founded DeepMind, a seminal AI research lab. Their paths diverged after Google acquired DeepMind in 2014 for $650 million.

    ChatGPT from OpenAI arrived in 2022, and that kicked off an AI race. Google put Dr. Hassabis in charge of its AI research. Suleyman established another startup, Inflection AI, which struggled to gain traction. Unexpectedly, Microsoft hired him and most of his team.

    Google was already rattled by Microsoft-backed OpenAI, which had introduced ChatGPT. Microsoft and Google are now rivals competing in the field of AI, with Suleyman heading Microsoft’s AI division.

    In an interview, Dr. Hassabis claimed that what Mustafa Suleyman has learnt about AI is a result of his association with DeepMind over the years.

    Hassabis studied for a computer science degree, while Suleyman studied philosophy and theology at Oxford. Suleyman dropped out to set up a helpline for Muslim teenagers and later worked as a human rights officer for the mayor of London. Dr. Hassabis went on to a PhD in neuroscience.

    In 2010, they discussed how they could change the world.

    Dr. Hassabis was on the verge of completing his post-doctoral work at the Gatsby Computational Neuroscience Unit (a University College London lab) that combined neuroscience with AI. A hard-working young man, he invited Suleyman to build a startup with him. Shane Legg, an AI researcher, also joined. The three met at an Italian restaurant and agreed that AI could change the world.

    They obtained funding from Peter Thiel, a venture capitalist from Silicon Valley, and set up DeepMind by the end of 2010. Its stated mission was to develop AGI.

    Dr. Hassabis and Dr. Legg pursued intelligent machines, while Suleyman’s task was to build products. As competition became fierce and their manpower was being poached, they decided to sell DeepMind to Google.

    Mustafa Suleyman’s leadership style was aggressive, and he was placed on leave in 2019; he too needed a break after a hectic decade of work. He moved to Google’s California office but was not happy, and left to set up Inflection AI. He remained independent for some time, but in March 2024 Inflection AI vanished into Microsoft, with Suleyman put in charge of a new Microsoft AI business.

    Suleyman now splits his time between Silicon Valley and London and is officially a rival to DeepMind.

    Suleyman and Hassabis still text each other and occasionally meet over dinner. Dr. Hassabis says he is not much worried about any rivals.

  • Small Language Models (SLMs)

    In April 2024, Microsoft released Phi-3 mini, the first of three small language models (SLMs) the company plans to release.

    SLMs are compact versions of LLMs. Phi-3 mini is an SLM with 3.8 billion parameters, trained on a much smaller dataset than GPT-4.

    It supports a context window of up to 1.28 lakh (128,000) tokens.

    The company will add more models including Phi-3 small and Phi-3 medium to the Phi-3 family.

    These models are cost-effective to operate and perform better on smaller devices such as laptops and smartphones.

    SLMs are fine-tuned and customized for specific tasks. They undergo targeted training, which requires less computing power and energy.

    The time a model takes to make predictions after receiving a prompt is called inference latency. Being smaller, SLMs process prompts quickly; they are more responsive and therefore suitable for real-time applications.
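
    A rough way to see inference latency in practice is to time a model call, as in the sketch below. The generate function here is only a placeholder for whatever model you actually run (a local SLM, an API client, and so on); only the timing pattern matters.

        import time

        def generate(prompt):
            # Placeholder for a real model call (local SLM, API client, etc.).
            time.sleep(0.05)                      # simulate the model's processing time
            return "response to: " + prompt

        start = time.perf_counter()
        reply = generate("Summarize this paragraph in one sentence.")
        latency_ms = (time.perf_counter() - start) * 1000
        print(f"inference latency: {latency_ms:.1f} ms")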

    According to Microsoft, Phi-3 mini has outperformed models of the same size and the next size up across a variety of benchmarks (language, reasoning, coding, math).

    Phi-3 mini is ideal for analytical tasks, as it has strong reasoning and logic capabilities.

  • Speculation about GPT-5

    As we all know, OpenAI keeps working on its GPT series and is currently working on GPT-5. OpenAI is a comparatively small organization when pitted against tech giants such as Facebook, Google, Apple and Amazon. However, it stole a march on them by releasing ChatGPT in late November 2022, and its tie-up with Microsoft does not rob it of all that it has achieved. Many have jumped on the AI bandwagon since then, and OpenAI is striving to maintain its lead. Maybe GPT-5 is the answer.

    Undoubtedly, GPT-5 will have many new features that its predecessors lacked. However, what these new features will be is a matter of guesswork. GPT-5 has been in training since 2023 and may have new parameters or a new architecture.

    GPT-5 is likely to be multi-modal — text, images, voice, video. It may have internet access by default. (GPT-3’s knowledge was limited to a training cut-off date, with no internet access.)

    GPT-5 could be a smart agent with new capabilities. It could perform some tasks autonomously, say ordering things, taking calls and so on.

    Sam Altman is not sure about the timing of the launch. However, he has confirmed the organization will launch an amazing new model this year. It is not clear whether it will be an upgrade of the existing model or something new, say GPT-4.5 or GPT-5 itself.

    One thing is certain. It will pave the way for more versatile and capable AI models.

  • Key Terminology of LLMs

    We shall try to understand the key words associated with large language models.

    LLM (Large Language Model): It is a neural network, also called a foundation model, that understands and generates human-like text. The text generated is contextually relevant. Examples of LLMs are the GPT series, Gemini, Claude and Llama.

    Training: An LLM is trained on a vast corpus of data. The model learns to predict the next word in a sequence, and its precision is enhanced by adjusting its parameters.

    Fine-tuning: A pre-trained model performs a broad range of tasks. It is fine-tuned to perform certain specific tasks or to operate in a specific domain, by training it on specific data not covered in the original training data.

    Parameter: A parameter is a variable part of the model’s architecture, e.g. the weights in a neural network. Parameters are adjusted during training to minimize the difference between the predicted and actual output.

    Vector: In ML, vectors are arrays of numbers representing data that algorithms can process. In LLMs, words or phrases are converted into vectors (called embeddings) which capture semantic meaning.

    Embeddings: These are dense vector representations of text, in which words with similar meanings have similar representations in vector space. Embeddings capture context and semantic similarity between words. The technique is useful in machine translation and text summarization.
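
    A minimal sketch of this idea: the three embedding vectors below are made up by hand, and cosine similarity is used as the usual measure of closeness in vector space.

        import numpy as np

        # Made-up 4-dimensional embeddings; real embeddings have hundreds of
        # dimensions and are learned during training, not written by hand.
        embeddings = {
            "king":  np.array([0.90, 0.80, 0.10, 0.20]),
            "queen": np.array([0.85, 0.75, 0.15, 0.30]),
            "apple": np.array([0.10, 0.20, 0.90, 0.80]),
        }

        def cosine_similarity(a, b):
            return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

        print(cosine_similarity(embeddings["king"], embeddings["queen"]))   # close to 1
        print(cosine_similarity(embeddings["king"], embeddings["apple"]))   # much lower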

    Tokenization: Text is split into tokens — words, sub-words or characters.

    Transformers: This neural network architecture relies on self-attention to weigh the influence of different parts of the input data. It is useful in NLP tasks and is at the core of modern LLMs.

    Attention: The attention mechanism enables models to focus on different segments of the input sequence while generating a response, making the response contextual and coherent.
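
    The standard form of this mechanism is scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V. The NumPy sketch below runs it on random matrices simply to show how each position ends up assigning a weight to every other position.

        import numpy as np

        def softmax(x, axis=-1):
            x = x - x.max(axis=axis, keepdims=True)
            e = np.exp(x)
            return e / e.sum(axis=axis, keepdims=True)

        def scaled_dot_product_attention(Q, K, V):
            # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
            d_k = Q.shape[-1]
            scores = Q @ K.T / np.sqrt(d_k)      # how strongly each query matches each key
            weights = softmax(scores, axis=-1)   # each row sums to 1
            return weights @ V, weights

        rng = np.random.default_rng(0)
        seq_len, d_k = 5, 16                     # 5 tokens, 16-dimensional vectors (arbitrary)
        Q = rng.normal(size=(seq_len, d_k))
        K = rng.normal(size=(seq_len, d_k))
        V = rng.normal(size=(seq_len, d_k))

        output, weights = scaled_dot_product_attention(Q, K, V)
        print(weights.round(2))                  # row i: attention token i pays to the 5 positions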

    Inference: Here the trained model makes predictions — it generates text based on input data, using the knowledge gained during training.

    Temperature: It is a hyperparameter that controls the randomness of predictions. A higher temperature produces more random outputs, while a lower temperature makes the output more deterministic. The logits are divided by the temperature before applying softmax.
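
    A small sketch of the effect: the same logits are divided by different temperatures before softmax, so a low temperature sharpens the distribution and a high one flattens it (the values are arbitrary).

        import numpy as np

        def softmax(x):
            x = x - x.max()
            e = np.exp(x)
            return e / e.sum()

        logits = np.array([2.0, 1.0, 0.5, 0.1])        # raw scores for four candidate tokens

        for temperature in (0.5, 1.0, 2.0):
            probs = softmax(logits / temperature)      # divide logits by T before softmax
            print(temperature, probs.round(3))
        # T = 0.5 gives a sharply peaked (near-deterministic) distribution;
        # T = 2.0 gives a flatter (more random) one.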

    Frequency: The probability of tokens is adjusted based on their frequency of occurrence. This helps balance the generation of common and less common words.

    Sampling: In generating text, the next word is picked randomly based on its probability distribution. This makes the output varied and creative.

    Top-K sampling: The choice of the next word is limited to the K most likely candidates. This reduces the randomness of text generation while maintaining variability in the output.
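
    A minimal top-K sampling sketch over a toy distribution: only the K most likely tokens are kept, their probabilities are renormalized, and the next token is drawn from that restricted set.

        import numpy as np

        rng = np.random.default_rng(0)

        def top_k_sample(probs, k):
            # Sample a token index from the k most likely tokens only.
            top_ids = np.argsort(probs)[-k:]                    # indices of the k highest probabilities
            top_probs = probs[top_ids] / probs[top_ids].sum()   # renormalize over the kept tokens
            return int(rng.choice(top_ids, p=top_probs))

        vocab = ["the", "cat", "dog", "sat", "flew"]
        probs = np.array([0.40, 0.25, 0.20, 0.10, 0.05])        # toy next-token distribution
        print(vocab[top_k_sample(probs, k=3)])                  # always one of: the / cat / dog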

    RLHF – Reinforcement Learning from Human Feedback: Here the model is fine-tuned based on human feedback.

    Decoding strategies: In decoding, output sequences are chosen. In greedy decoding, the most likely next word is chosen at each step. Beam search expands on this by considering multiple candidate sequences at the same time. The choice of strategy affects the diversity and coherence of the output.
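
    The sketch below contrasts the two strategies on a toy "model" (a fixed, made-up distribution over five token ids for every prefix): greedy decoding keeps a single sequence and always takes the locally best token, while beam search keeps the few most probable partial sequences at each step.

        import numpy as np

        VOCAB_SIZE, LENGTH, BEAM_WIDTH = 5, 4, 3

        def next_token_probs(prefix):
            # Toy stand-in for the model: a fixed distribution for each prefix.
            rng = np.random.default_rng(hash(tuple(prefix)) % (2**32))
            logits = rng.normal(size=VOCAB_SIZE)
            e = np.exp(logits - logits.max())
            return e / e.sum()

        def greedy_decode():
            seq = []
            for _ in range(LENGTH):
                seq.append(int(np.argmax(next_token_probs(seq))))   # locally best token
            return seq

        def beam_search():
            beams = [([], 0.0)]                                     # (sequence, log-probability)
            for _ in range(LENGTH):
                candidates = []
                for seq, logp in beams:
                    probs = next_token_probs(seq)
                    for tok in range(VOCAB_SIZE):
                        candidates.append((seq + [tok], logp + np.log(probs[tok])))
                # Keep only the BEAM_WIDTH most probable partial sequences.
                beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:BEAM_WIDTH]
            return beams[0]

        print("greedy:", greedy_decode())
        print("beam  :", beam_search())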

    Prompting: Here inputs or prompts are designed to guide the model to generate specific outputs.

    Transformer-XL: The transformer architecture is extended so that the model can learn dependencies beyond a fixed length without compromising coherence. It is useful for long documents or sequences.

    Masked Language Modelling (MLM): Certain segments of the input data are masked during training, and the model is expected to predict the concealed words. It is used in BERT to enhance its effectiveness.

    Sequence-to-sequence Models — Seq2Seq: Here sequences are converted from one domain to another, for example translating from one language to another, or converting questions into answers. Both an encoder and a decoder are involved.

    Generative Pre-trained Transformer (GPT): A family of LLMs developed by OpenAI.

    Multi-Head Attention: This is a component of the transformer model in which the model focuses on various representation perspectives simultaneously.

    Contextual Embeddings: Here the context of the word is considered. These are dynamic, and change based on the surrounding text.

    Auto-regressive Models: These models predict the next word based on the previous words in a sequence, and are used in GPTs. Each output word becomes part of the next input, which facilitates coherent long-form generation.