Neural Networks and Deep Learning

Neural networks are a technique for solving machine learning (ML) problems such as supervised learning. A network consists of interconnected processing nodes, loosely inspired by the way neurons work in the human brain. The nodes are arranged in layers, and a weight is assigned to each connection between them. The weights are arrived at over many iterations of feedforward passes, which compute an output, and feedback (backpropagation) passes, which adjust the weights to reduce the error.
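As a rough illustration of how weights are arrived at over repeated feedforward and feedback passes, here is a minimal sketch of a single sigmoid neuron learning the logical OR function. The toy data, the learning rate and all function names are assumptions chosen for illustration; they do not come from the text.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def forward(weights, bias, inputs):
    """Feedforward pass: weighted sum of the inputs followed by a sigmoid."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)

def train_step(weights, bias, inputs, target, lr=0.5):
    """One feedback (gradient) step on the squared error of a single neuron."""
    out = forward(weights, bias, inputs)
    # Gradient of 0.5 * (out - target)^2 with respect to the pre-activation.
    delta = (out - target) * out * (1.0 - out)
    new_weights = [w - lr * delta * x for w, x in zip(weights, inputs)]
    new_bias = bias - lr * delta
    return new_weights, new_bias

# Toy dataset (assumed): the logical OR of two inputs.
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]

weights, bias = [0.0, 0.0], 0.0
for _ in range(2000):                      # repeated iterations over the data
    for inputs, target in data:
        weights, bias = train_step(weights, bias, inputs, target)

predictions = [round(forward(weights, bias, x)) for x, _ in data]
```

A real network stacks many such neurons into layers, but the loop has the same shape: compute an output, measure the error, and nudge the weights.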

Deep learning is a more sophisticated form of neural networks, in which a very large number of nodes (often millions) are arranged into many layers to solve complex problems.

When deep learning is applied to reinforcement learning (RL), the combination is referred to as deep reinforcement learning.

The sequential decision process in RL consists of changing environment states and a system of rewards and penalties. This setup is formalized as a Markov decision process.

A dog being trained in a field to fetch a thrown ball either gets a pat on the back as a reward or is ignored as a penalty. This is an example of reinforcement learning.

Deep Reinforcement Learning (DRL) means that multiple layers of artificial neural networks are present in the architecture of the RL agent, typically to choose actions or to estimate their value.

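A Markov Decision Process (MDP) is not an algorithm in itself but the mathematical framework on which RL algorithms are built. It gives us a way to formalize sequential decision making in terms of states, actions, transition probabilities and rewards.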
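To make the idea of formalizing sequential decisions concrete, here is a minimal sketch of a Markov decision process written as plain data, loosely modelled on the dog-and-ball example above. The state names, probabilities, rewards and the discount factor of 0.9 are all assumptions made for illustration.

```python
# Hypothetical two-state MDP: transitions[state][action] is a list of
# (probability, next_state, reward) tuples.
transitions = {
    "field": {
        "fetch": [(0.8, "ball_returned", 1.0), (0.2, "field", 0.0)],
        "wait": [(1.0, "field", 0.0)],
    },
    "ball_returned": {
        "fetch": [(1.0, "field", 0.0)],
        "wait": [(1.0, "field", 0.0)],
    },
}

def is_valid_mdp(transitions):
    """Check that outgoing probabilities sum to 1 for every state-action pair."""
    return all(
        abs(sum(p for p, _, _ in outcomes) - 1.0) < 1e-9
        for actions in transitions.values()
        for outcomes in actions.values()
    )

def discounted_return(rewards, gamma=0.9):
    """Sum of rewards, each discounted by gamma once per time step."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))

valid = is_valid_mdp(transitions)
# A reward of 1 received two steps in the future is worth 0.9^2 today.
g = discounted_return([0.0, 0.0, 1.0])
```

The discounted return is the quantity an RL agent tries to maximize: rewards that arrive later count for less than rewards that arrive sooner.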

Reinforcement Learning (RL)

Reinforcement learning is an area of Machine Learning (ML). Its aim is to maximize the cumulative reward in a given situation. In RL, a software agent or a machine has to choose the best possible behaviour or decide which path to take.

Reinforcement learning (RL) differs from supervised learning, where the training data comes with an answer key, i.e. the model is trained on the correct answers. There is no labelled training dataset in RL, so the agent is bound to learn from experience.

In reinforcement learning, the agent has to decide what to do in a particular situation to perform a given task. Between the agent and the reward there are hurdles. The agent is supposed to find the best possible path to reach the reward.

Thus RL is a feedback-based ML technique. An agent learns to behave in a particular environment by performing actions and observing the results of those actions.

Good actions earn positive feedback and bad actions earn negative feedback, or a penalty. Thus, unlike in supervised learning, the agent learns automatically from feedback without any labelled data; it learns by experience only.

RL solves a specific type of problem where decision making is sequential and the goal is long term, e.g. game playing and robotics. In RL, the agent interacts with the environment and explores it by itself, with a view to achieving the best performance by collecting the maximum possible reward. It learns by trial and error, based on experience. The agent is an intelligent agent, i.e. a computer programme.

The core part of AI here is that the agent works on the concept of RL. There is no need to pre-programme it. It learns by experience. And there is no human intervention.

A typical RL problem is that of a maze. A robot wants to get the Kohinoor diamond. It has to avoid hurdles of fire. It tries out the possible paths and has to choose the one with the fewest hurdles. A right step earns it a reward, and a wrong one subtracts from its reward.
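The maze problem above can be sketched with tabular Q-learning, one common RL algorithm (the text does not name a specific one). The 3×3 grid, the position of the fire, the reward values and the hyperparameters below are all assumptions chosen for illustration.

```python
import random

ROWS, COLS = 3, 3
START, DIAMOND, FIRE = (0, 0), (2, 2), (1, 1)   # hypothetical layout
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

def step(state, action):
    """Return (next_state, reward, done) for one move; walls block movement."""
    dr, dc = ACTIONS[action]
    row = min(max(state[0] + dr, 0), ROWS - 1)
    col = min(max(state[1] + dc, 0), COLS - 1)
    nxt = (row, col)
    if nxt == DIAMOND:
        return nxt, 10.0, True      # reward for reaching the diamond
    if nxt == FIRE:
        return nxt, -10.0, True     # penalty for stepping into the fire
    return nxt, -1.0, False         # small cost for every other step

def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn Q-values by trial and error with an epsilon-greedy agent."""
    rng = random.Random(seed)
    q = {((r, c), a): 0.0
         for r in range(ROWS) for c in range(COLS) for a in ACTIONS}
    for _ in range(episodes):
        state = START
        for _ in range(100):        # cap on the length of one episode
            if rng.random() < epsilon:
                action = rng.choice(list(ACTIONS))                 # explore
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(nxt, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (
                reward + gamma * best_next - q[(state, action)]
            )
            state = nxt
            if done:
                break
    return q

def greedy_path(q, max_steps=20):
    """Follow the best-known action from the start until the episode ends."""
    state, path = START, [START]
    for _ in range(max_steps):
        action = max(ACTIONS, key=lambda a: q[(state, a)])
        state, _, done = step(state, action)
        path.append(state)
        if done:
            break
    return path

q_table = train()
path = greedy_path(q_table)
```

After training, following the highest Q-value at each cell leads the robot around the fire to the diamond. Nobody told it the route; it emerged from rewards and penalties alone.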

Terms used

An agent is an entity that can perceive and/or explore the environment.

The environment is the situation in which the agent is present or by which it is surrounded. The environment in RL is usually assumed to be stochastic, i.e. random in nature.

Actions are the moves taken by an agent within the environment.

State is the situation returned by the environment after each action taken by an agent.

Reward is the feedback returned to the agent from the environment to evaluate the action of the agent.

Policy is the strategy applied by the agent to choose the next action based on the current state.

Value is the expected long-term return from a state under a given policy. Q-value is similar to value but takes one more parameter, the current action.
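The relationship between these terms can be shown in a few lines. This is a sketch only: the Q-values below are made-up numbers for a single hypothetical state, and it assumes the usual convention that a greedy policy picks the action with the largest Q-value, so the value of the state equals that largest Q-value.

```python
# Hypothetical Q-values Q(s, a) for one state s and three possible actions.
q_values = {"left": 1.0, "right": 4.0, "stay": 2.5}

# Policy: the greedy strategy picks the action with the highest Q-value.
greedy_action = max(q_values, key=q_values.get)

# Value: under that greedy policy, the value of the state is the best Q-value.
state_value = q_values[greedy_action]
```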

Key Features

In RL, the agent is not instructed about the environment or about what actions to take. It learns by trial and error. The agent takes an action, moves to the next state and adjusts its behaviour in the light of the feedback.

Salient Features

The input is the initial state from which the model starts. There can be more than one output, as there are many possible solutions to a problem. Training is based on the input: the model is either rewarded or punished depending upon the state it returns. The model continues to learn, and the best solution is the one that gives the maximum reward.

Distinction between RL and SL

In RL, decisions are made sequentially. In SL, the decision is based on the input given at the start.

In RL, each decision depends on the ones before it. In SL, decisions are independent of each other.

A game of chess illustrates RL. Object identification illustrates SL.

Types of Reinforcement

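Positive reinforcement occurs when an event that follows a particular behaviour strengthens that behaviour. Negative reinforcement occurs when a behaviour is strengthened because a negative condition is stopped or avoided.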

Implementation of RL

Value-based: Find the optimal value function, i.e. the maximum value at a state under any policy. Here the agent expects a long-term return at any state under policy π.

Policy-based: Find the optimal policy for maximum future rewards without using a value function. Here the agent tries to apply a policy such that the action performed at each step facilitates the maximum future reward.

The subcategories are deterministic, where the same action is produced by the policy (π) at any given state, and stochastic, where probability determines the action produced.

Model-based: Here a virtual model of the environment is created. The agent explores it to learn about the environment. There is no particular solution or algorithm for this approach, as the model representation is different for each environment.
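The difference between the deterministic and stochastic policies mentioned above can be sketched as follows. The state names, action names and probabilities are hypothetical and serve only to illustrate the two kinds of policy.

```python
import random

def deterministic_policy(state):
    """pi(s): always returns the same action for a given state."""
    table = {"near_fire": "back", "clear": "forward"}
    return table[state]

def stochastic_policy(state, rng):
    """pi(a | s): samples an action from a probability distribution."""
    probs = {
        "near_fire": {"back": 0.9, "forward": 0.1},
        "clear": {"back": 0.2, "forward": 0.8},
    }[state]
    actions = list(probs)
    return rng.choices(actions, weights=[probs[a] for a in actions], k=1)[0]

rng = random.Random(42)
samples = [stochastic_policy("clear", rng) for _ in range(1000)]
# Share of "forward" among the samples; it should be close to 0.8.
forward_share = samples.count("forward") / len(samples)
```

A deterministic policy is a lookup; a stochastic policy is a weighted coin toss whose weights depend on the state.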

Applications

RL is applied in robotics, automation, ML and data processing and customised training.

ChatGPT: ‘Code Red’ for Google’s Search Engine

The world has witnessed several disruptive products over the past three decades: first Netscape’s web browser, then Google’s search engine and then Apple’s iPhone.

In November 2022, OpenAI’s ChatGPT became the next big disruptor. Unlike a search engine, it provides information in clear, simple sentences rather than a list of internet links. It explains concepts in a way that people can understand.

ChatGPT can generate ideas from scratch, e.g. appropriate gifts for Eid and Diwali, topics for blogs, different types of plans or business strategies.

Of course, ChatGPT still has to improve a lot. Even so, its release is a ‘code red’ for Google. Google’s search engine has been the gateway to the internet for the last twenty years. However, ChatGPT has the potential to replace the traditional Google search engine.

Google too has worked on chatbots and has done research on AI. It can produce something resembling ChatGPT. Its LaMDA, or Language Model for Dialogue Applications, was claimed to be sentient by Google engineer Blake Lemoine, a claim that Google rejected. How far Google is willing to deploy this technology as a replacement for Google search is a moot point.

Encryption of Communication

Big Tech uses encryption of communication to protect the privacy of users’ data. However, governments are insisting on having the decryption keys on grounds of national security.

Apple already protects 14 categories of iCloud data with end-to-end encryption. In addition, under its Advanced Data Protection feature, it has announced end-to-end encryption for 9 more categories of data on iCloud, e.g. photos, notes and backups. Google too is testing end-to-end encryption for group chats in its Messages app. Twitter too is contemplating encryption for direct messages between users.

There are two grounds for governments’ opposition to encryption. The first is national security and the second is investigation into criminal matters. In fact, government agencies insist on having the decryption keys to unscramble a message. Big Tech says that, with end-to-end encryption, even it does not have control over the keys. In the absence of keys, unscrambling a message may take years.
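A rough calculation shows why unscrambling a message without the key is impractical. This is a back-of-the-envelope sketch: the 128-bit key length and the rate of one trillion guesses per second are assumptions for illustration, not figures from the text, and the function name is hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def brute_force_years(key_bits: int, guesses_per_second: float) -> float:
    """Expected years to find a key by trying, on average, half the key space."""
    expected_guesses = 2 ** (key_bits - 1)
    return expected_guesses / guesses_per_second / SECONDS_PER_YEAR

# Assumed: a 128-bit key and an attacker testing 10^12 keys per second.
years_128 = brute_force_years(128, 1e12)
```

Under these assumptions the answer is on the order of 10^18 years, so "years" is, if anything, an understatement for modern key lengths.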

Even if encryption is kept intact, it is possible to provide information such as the name of the suspect, the date on which the service started, last seen date, IP address, device type, email identity, contacts etc.

Experts hold that the origin of the messages can be traced through source code.

Many firms provide just communication services, and these are not licensed services; they are add-on services. Here government control is exercised by blocking accounts, switching off the internet service or directing the removal of content. It is a moot point whether such firms can be compelled to break encryption. Of course, governments can seek their assistance and co-operation.

Comic Journalism

World Comics India (WCI), an organisation of comic artists and journalists, has been founded by Sharad Sharma. It publishes ‘newspapers on walls’. Each consists of a single sheet of comics, drawn and written by members of marginalised communities in the vernacular language, and is put up on walls to spread awareness about the issues they face in their localities. These comic sheets are called grassroots comics. They are visual-heavy news pages drawn in a simple 2×2 panel format, and a comic may be as simple as a child’s doodle. They are photocopied, or colour-printed on A2 sheets, and posted on walls. WCI trains volunteers to do this, and the movement is carried forward by former volunteers and participants.

Nuclear Fusion Breakthrough

Scientists at the Lawrence Livermore National Laboratory (LLNL) in California have made a breakthrough in nuclear fusion technology. Nuclear fusion is brought about by fusing two hydrogen atoms to create helium. In the process, a large amount of energy is released.

Lasers were used to bombard hydrogen isotopes held in a superheated plasma state in order to fuse them into helium. A neutron is released along with carbon-free, clean energy.

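Scientists wanted a fusion reaction to produce more energy than it consumes, which had been elusive so far. In the current experiment, about 2.5 megajoules of energy was reportedly produced, whereas the laser energy delivered to the fuel target was about 2.1 megajoules. (LLNL’s official figures are 3.15 megajoules of output from 2.05 megajoules of laser energy.) The comparison covers only the laser energy reaching the target, not the much larger amount of electricity drawn to power the lasers. Even so, it is a breakthrough.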
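The arithmetic behind this claim is a simple ratio of energy out to energy in. The sketch below uses the 2.5 and 2.1 megajoule figures quoted above; the function name target_gain is an assumption made for illustration.

```python
def target_gain(energy_out_mj: float, laser_energy_mj: float) -> float:
    """Ratio of fusion energy released to laser energy supplied."""
    return energy_out_mj / laser_energy_mj

# Figures as quoted in the text, in megajoules.
gain = target_gain(2.5, 2.1)
net_energy_mj = 2.5 - 2.1
```

A gain above 1 means the reaction released more energy than the lasers supplied; here the gain is roughly 1.19.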

In nuclear fusion, the reaction creates new atoms from old. Lighter nuclei, say of hydrogen, are smashed together to make heavier nuclei, releasing a great deal of energy. The energy comes from Einstein’s equation, E = mc²: the products have slightly less mass than the ingredients, and the missing mass appears as energy. Fusion is significant. It makes the Universe light up, as it powers the stars. The reaction creates most of the elements we are made of. Fusion could provide clean, safe, low-radioactivity energy for everyone.
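The role of Einstein's equation can be illustrated with a worked example for the fusion of deuterium and tritium, two hydrogen isotopes, into helium-4 plus a neutron. This is a sketch using standard atomic masses rounded to six decimal places; the function name is hypothetical.

```python
# Energy equivalent of one atomic mass unit (u), in MeV, from E = m * c^2.
U_TO_MEV = 931.494

# Atomic masses in u, rounded to six decimal places.
MASS_U = {
    "deuterium": 2.014102,
    "tritium": 3.016049,
    "helium4": 4.002603,
    "neutron": 1.008665,
}

def dt_fusion_energy_mev() -> float:
    """Energy released by D + T -> He-4 + n, computed from the mass defect."""
    mass_in = MASS_U["deuterium"] + MASS_U["tritium"]
    mass_out = MASS_U["helium4"] + MASS_U["neutron"]
    mass_defect = mass_in - mass_out      # mass that "disappears"
    return mass_defect * U_TO_MEV         # converted to energy

energy_mev = dt_fusion_energy_mev()
```

The products are lighter than the ingredients by about 0.019 u, which corresponds to roughly 17.6 MeV per reaction.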

This has the potential to change the global energy landscape.

The experiment used Inertial Confinement Fusion, with three key parts: a gold cavity with an opening at either end, called a hohlraum; a fuel capsule at the hohlraum’s centre; and a layer of deuterium-tritium fuel inside the capsule.

Laser beams (up to 192) are fired into the hohlraum. This creates a superhot plasma (a gas stripped of electrons). X-rays (from the effect of the lasers on the hohlraum) blow off the surface of the capsule. The resulting implosion compresses and heats the fuel, creating the conditions for fusion. If the heat spreads rapidly enough through the fuel, the energy yield can exceed the input. This is called ignition.

It is the Holy Grail of nuclear science, as it imitates the reactions in the sun. Fusion is the process that powers the sun and other stars. It involves two atoms joining, or fusing, together to form an atom of a heavier element. Inside the sun, hydrogen nuclei fuse to produce helium. It could lead to endless, cheap, clean, carbon-free energy. However, this is not going to happen in the short term. Commercialisation would take some years, even decades. Doing this on a near-continuous basis, so as to make it a viable source of energy, is a big challenge.

Nuclear energy is obtained either by fission or by fusion. Fission-based power plants have been around since the 1950s. Scientists have been working on a reactor that uses nuclear fusion. It could be a clean, abundant and safe source of energy. It could ultimately help humanity break its dependence on fossil fuels.

A nuclear fission reactor uses uranium, which is relatively scarce. The uranium atom is exposed to neutron radiation. It becomes excited and unstable, and splits into smaller atoms of elements like barium and krypton. It also releases neutrons, which break apart further uranium atoms, causing a chain reaction. In the process, energy is released. It is used to produce steam and run turbines to generate electricity. Some by-products of fission remain radioactive for a long time. Fission generates about 10 per cent of the world’s electricity. Fusion scores over fission because it can yield several times more energy without long-lived radioactive waste. The fusion experiment at LLNL used hot fusion at ultra-high temperature. Some scientists have theorized that cold fusion is possible at or near room temperature, but this has never been reliably demonstrated.

There are two types of technologies being researched for creating conditions for fusion.

Tokamak: This is a Soviet-origin design and the most widely used approach to magnetic confinement. The nuclei here are heated into a plasma and then confined and compressed magnetically.

Inertial Confinement: It compresses and heats targets filled with thermonuclear fuel, typically using lasers. This technology has been used by LLNL in California to achieve ignition.

AI Implementation

Corporates have taken to AI in a big way, and have invested heavily in AI systems. Some corporates are able to convert data science insights into business benefits, but many are not. Corporates with old legacy systems do not get much benefit out of AI even when they adopt it.

AI is a niche technology. To many corporates, it is a new area. Many corporates have AI practitioners who do not have top-tier AI capabilities. In such a situation, AI does not come close to predicting the future. Corporates must employ top-tier, advanced AI to get results. Most finance companies use AI profitably.

It is also true that nimble, IT-savvy companies get better results from AI than big corporates, e.g. food delivery companies use AI to predict a lot of challenges in real time, especially last-mile hiccups. Edtech companies use AI profitably. Hospitals make use of AI for health risk assessment as a part of preventive healthcare.

Legacy systems have flawed datasets, which generate bias. AI should be deployed only after data verification. Top and senior executives must be involved in AI implementation. The data input for AI must be validated and trustworthy. There should be hub-and-spoke data management, in which the platform is centralised but the teams have the flexibility to operate.

Thus in essence, successful AI implementation requires improved data practices, trust in advanced AI and integration of AI with business operations.

Digital Twins of Factories

We have already examined the concept of digital twins. In Industry 4.0 there are digitised plants, and costs are reduced on account of that. The interconnected machinery of a digitised plant facilitates predictive maintenance. In addition, the shop floor is combined with simulation technologies so as to create digital twins of the factories. What is simulated is the flow of goods and people. A digital twin shows us how the factory would work in real time and in the real world. The simulation can be updated with data from the real world.
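The idea of a simulation that is updated with real-world data can be sketched in a few lines. This is a toy model only: the class name LineTwin, the throughput figures and the blending weight are all hypothetical, and a real digital twin would model far more than a single number.

```python
class LineTwin:
    """Toy digital twin of one production line (hypothetical model)."""

    def __init__(self, units_per_hour: float):
        self.units_per_hour = units_per_hour   # simulated throughput

    def simulate(self, hours: float) -> float:
        """Predict how many units the line produces in the given time."""
        return self.units_per_hour * hours

    def update_from_sensor(self, observed_units_per_hour: float,
                           weight: float = 0.5) -> None:
        """Blend a real-world measurement into the simulated throughput."""
        self.units_per_hour = (
            (1 - weight) * self.units_per_hour
            + weight * observed_units_per_hour
        )

twin = LineTwin(units_per_hour=100.0)
before = twin.simulate(8)        # prediction from the initial model
twin.update_from_sensor(90.0)    # shop-floor data shows a slower line
after = twin.simulate(8)         # prediction after the update
```

The twin first predicts 800 units for an eight-hour shift; once it learns that the real line is slower, the prediction drops to 760.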

Digital twins make it simpler to reproduce factories across geographies. One does not have to start from scratch for every new plant. All plants could be connected so that they share one large digital twin. This saves a lot of learning time. Any change can be tested online before it is introduced on the ground. The change can then be introduced across all the plants.

There are multi-physics simulations as well. These can be used in a factory setup to create better digital twins.

Money Transfer Modes

There are several methods of money transfer. The most popular method today is the Unified Payments Interface (UPI). Another method is the Immediate Payment Service (IMPS). Both these methods have been developed by the National Payments Corporation of India. These are instantaneous. IMPS is available round the clock. Banks also offer two other methods of money transfer: NEFT, or National Electronic Funds Transfer, and RTGS, or Real-Time Gross Settlement. A customer can choose among these, as the transfer methods exist side by side.

Central bank digital currency (CBDC) has been introduced of late. It appears as a liability on the central bank’s balance sheet. There is no merchant discount rate (MDR) involved here; the cost is borne by the RBI. Cash transfers are not always convenient, so CBDC scores over cash. CBDC transfers are made to digital wallets. The RBI is working on an offline mode of CBDC transfers, in which a sub-wallet can be used. The accounts will be balanced after connectivity is restored.
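The offline sub-wallet idea can be sketched as follows. This is purely illustrative: the text does not describe the RBI's actual design, so the class, its method names and the settlement logic are all assumptions.

```python
class Wallet:
    """Toy CBDC wallet with an offline sub-wallet (hypothetical design)."""

    def __init__(self, balance: int):
        self.balance = balance       # main wallet balance, in rupees
        self.offline_balance = 0     # funds set aside for offline use
        self.pending = []            # offline payments awaiting settlement

    def load_offline(self, amount: int) -> None:
        """Move funds from the main wallet into the offline sub-wallet."""
        if amount > self.balance:
            raise ValueError("insufficient balance")
        self.balance -= amount
        self.offline_balance += amount

    def pay_offline(self, payee: "Wallet", amount: int) -> None:
        """Record a payment made without connectivity."""
        if amount > self.offline_balance:
            raise ValueError("offline limit exceeded")
        self.offline_balance -= amount
        self.pending.append((payee, amount))

    def settle(self) -> None:
        """Balance the accounts once connectivity is restored."""
        for payee, amount in self.pending:
            payee.balance += amount
        self.pending.clear()

payer, shop = Wallet(1000), Wallet(0)
payer.load_offline(200)          # done while still online
payer.pay_offline(shop, 150)     # done without connectivity
payer.settle()                   # done after connectivity is restored
```

The payer can never spend more offline than was loaded into the sub-wallet beforehand, which is one simple way to limit double spending while there is no connection.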

As the exchange is wallet-to-wallet, there is anonymity equal to that of cash transactions. The transactions are not reflected in the core banking system (CBS). The CBS captures only the transaction in which the bank transfers CBDCs to a wallet, which is just like a withdrawal transaction. All UPI transactions are captured by the CBS.

Of course, there are digital trails of wallet-to-wallet transactions. The RBI is working on this too, with the aim of erasing the trail completely. The effort is to mimic cash.

CBDCs could reduce the cost of sending remittances from abroad.

AIIMS Cyber Attack

AIIMS was subjected to a cyber attack in November 2022. The attack was followed by similar attacks on Safdarjang Hospital and ICMR.

As we are aware, this is done through ransomware. Ransomware is a kind of malware that cyber criminals introduce into a system to seize sensitive data so as to demand ransom money. The attack can be initiated by something as simple as plugging in a device for charging or clicking on a link. These links are generally sent through phishing emails and carry the ransomware. Once activated, the malware begins to encrypt the data on the infected server or device. In other words, the data can then be accessed only by the hacker. Victims are asked to pay for the decryption key to regain access.

In the AIIMS case, the data was encrypted. In addition, it was also copied. In such cases, if the ransom is not paid, there is a threat of the data being made public. Generally, this is done on the dark web.

AIIMS systems are old and archaic. Perhaps there were no security patches from the software suppliers. Perhaps the antivirus programmes too did not function properly.

In such a malware attack, the data is recovered from a backup. Otherwise, an attempt is made to find a decryption key. In most cases, there is only one decryption key, so decrypting the data without it is difficult. It is not known whether there was a data backup.

There could be efforts to apprehend the hackers. In the case of foreign hackers, that can be done if there is collaboration with foreign governments. Sometimes, a state actor could be involved too.