The terms deep learning, machine learning (ML) and artificial intelligence (AI) are often used interchangeably, and ML and AI are treated as equivalent. Anyone practising data science today comes across these terms regularly. Each of them, however, has its own meaning.
Deep learning is a subset of ML, and ML is a subset of AI. The three can be pictured as concentric circles: AI is the outermost and largest circle, ML lies inside it, and deep learning lies inside ML.
AI takes on tasks that previously required human intelligence. ML and deep learning are both parts of AI.
ML is adaptive AI: it can work with little or no human intervention. Deep learning is a subset of ML that uses artificial neural networks to mimic the learning process of the human brain.
Neural networks are computing systems of interconnected nodes whose operation resembles that of the neurons in the brain. These systems use algorithms that recognize hidden patterns and correlations in raw data, cluster the data and classify it. They keep doing so over time, learning and improving as they go.
Neural networks arrange their interconnected nodes, or neurons, in a layered structure resembling the human brain. Learning with many such layers is called deep learning.
There is considerable research on artificial neural networks modelled on the networks of neurons in the human nervous system.
Dr. Robert Hecht-Nielsen invented the first neural computer. According to him, a neural network consists of highly interconnected processing elements, which process information through their dynamic response to external inputs.
The human nervous system consists of billions of nerve cells, or neurons, which are connected to other cells by axons. Stimuli are received by dendrites and produce electric impulses, which travel through the neural network. A neuron then either passes the message on to another neuron or stops it from going any further.
Artificial neural networks (ANNs) have multiple nodes, which interact with each other. These nodes receive data and perform simple operations on it. The result is passed on to other nodes. The output of each node is called its activation or node value.
Each link is assigned a weight. ANNs learn by altering these weight values. In a feed-forward ANN, information flows in one direction only; there are no feedback loops. Such networks are used in pattern generation, recognition and classification, and they have fixed inputs and outputs. In feedback ANNs, feedback loops are allowed; these networks are used in content-addressable memories.
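As a minimal sketch of what a single node does with its weighted links, the snippet below multiplies each input by the weight of its link, sums the results and passes the sum through a sigmoid. The sigmoid is an assumption (the text does not say which "simple operation" a node performs), and the function name node_activation, the bias term and all numeric values are purely illustrative.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def node_activation(inputs, weights, bias=0.0):
    """Return the activation (node value) of one node."""
    # Multiply each input by the weight of its link and add everything up.
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    # The assumed "simple operation": a sigmoid applied to the weighted sum.
    return sigmoid(z)

inputs = [1.0, 0.5]     # illustrative input values
weights = [0.4, -0.2]   # illustrative link weights
activation = node_activation(inputs, weights, bias=0.1)
print(round(activation, 4))
```

In a feed-forward network this value would simply be handed on as an input to the nodes of the next layer.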
If the network generates a desirable output, the weights are kept as they are. If the output is poor or in error, the system alters the weights in order to improve the results.
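One simple way to illustrate this "keep the weights when the output is good, change them when it is wrong" idea is the classic perceptron learning rule. The sketch below is an illustration under that assumption, not necessarily the rule the text has in mind; the AND-gate data set, the function names and the learning rate are chosen only for the example.

```python
def predict(weights, bias, x):
    """Fire (1) if the weighted sum is positive, otherwise stay off (0)."""
    total = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if total > 0 else 0

def train(samples, epochs=20, lr=1.0):
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)
            if error == 0:
                continue  # good output: the weights are kept as they are
            # Wrong output: nudge each weight in the direction that reduces the error.
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

# Illustrative training set: the logical AND of two binary inputs.
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train(and_gate)
outputs = [predict(weights, bias, x) for x, _ in and_gate]
print(outputs)
```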
ANNs can be trained and are capable of learning. In supervised learning, the correct responses are provided by a teacher. The ANN makes a guess, compares its own answer with that of the teacher, and makes adjustments.
In unsupervised learning, there is no example data set with known answers. Instead, hidden patterns in the data are recognised. Clustering is a typical case: a set of elements is divided into groups according to some pattern that is not known in advance.
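Clustering without known answers can be sketched with k-means on one-dimensional data. Note that k-means is a general clustering algorithm picked here for brevity, not a neural network, and the values, the starting centres and the name k_means_1d are all illustrative assumptions.

```python
def k_means_1d(values, centers, iterations=10):
    """Group numbers around the nearest centre, then move each centre to its group's mean."""
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            # Assign the value to the closest centre; no labels are involved.
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Move each centre to the mean of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

values = [1.0, 1.2, 0.8, 8.0, 8.3, 7.9]   # illustrative, unlabeled data
centers, groups = k_means_1d(values, centers=[0.0, 5.0])
print(groups)
```

The algorithm is never told which values belong together; the two groups emerge from the distances alone.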
In reinforcement learning, the ANN makes decisions in response to observations of its environment. If the observation is negative, the network adjusts its weights so that it can make a different decision next time.
Back Propagation Algorithm
This is the name given to a training algorithm that learns by example. The algorithm is fed examples from which it is expected to learn. The error at the output is propagated backwards through the layers to work out how much each weight should change, and the weights are adjusted so that, once training is finished, the network produces the desired output for a given input.
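The following is a minimal sketch of backpropagation on a tiny network with two inputs, two hidden nodes and one output. It assumes sigmoid activations, a squared-error loss and plain gradient descent, and it omits bias terms for brevity; the starting weights, the single training example and the learning rate are illustrative.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, w_out):
    """Forward pass: inputs -> hidden activations -> single output."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))
    return hidden, output

def train_step(x, target, w_hidden, w_out, lr=0.5):
    hidden, output = forward(x, w_hidden, w_out)
    # Error signal at the output node for the loss 0.5 * (output - target)^2.
    delta_out = (output - target) * output * (1.0 - output)
    # Propagate the error back to each hidden node through its outgoing weight.
    delta_hidden = [delta_out * w * h * (1.0 - h) for w, h in zip(w_out, hidden)]
    # Gradient descent: move every weight against its gradient.
    new_w_out = [w - lr * delta_out * h for w, h in zip(w_out, hidden)]
    new_w_hidden = [[w - lr * d * xi for w, xi in zip(row, x)]
                    for row, d in zip(w_hidden, delta_hidden)]
    return new_w_hidden, new_w_out

x, target = [1.0, 0.0], 1.0             # one illustrative training example
w_hidden = [[0.1, -0.2], [0.3, 0.4]]    # illustrative starting weights
w_out = [0.2, -0.1]

loss_before = 0.5 * (forward(x, w_hidden, w_out)[1] - target) ** 2
for _ in range(200):
    w_hidden, w_out = train_step(x, target, w_hidden, w_out)
loss_after = 0.5 * (forward(x, w_hidden, w_out)[1] - target) ** 2
print(loss_before, loss_after)
```

After training, the loss on the example should be smaller than before, which is the sense in which the network has "learned" from it.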
Distinction between ML and Deep Learning
ML can work with smaller data sets, whereas deep learning requires big data. ML requires more human intervention to adjust and to learn; in deep learning, the computer learns from its own environment and past experience. ML requires short training, whereas deep learning requires longer training. ML models are often linear, whereas deep learning models are non-linear and complex. ML training can run on a CPU, whereas deep learning generally requires a GPU for training.
ML algorithms often require human correction. Deep learning algorithms can improve their outcomes through repetition, without human intervention.
Deep learning requires huge, and at times unstructured, data sets. It is an evolution of ML: a process that layers algorithms and computing units (or neurons) into an artificial neural network.
While ML uses simpler concepts such as predictive models, deep learning uses artificial neural networks.
Bayesian Networks (BN)
As in decision trees, these networks represent the probabilistic relationship of a set of random variables. These are also called Bayes Nets or Belief Networks.
In such networks, a node represents a random variable; for example, the node Cancer represents the proposition that a patient has cancer.
The edges connect the nodes and represent the probabilistic dependence among the random variables. The strength of the relationship between variables is quantified by the conditional probabilities associated with each node.
The constraint in a BN is that the graph must be acyclic: following the directed edges from a node must never lead back to that same node.
BNs are capable of handling multivalued variables simultaneously.
A knowledge engineer builds a Bayesian network by first defining the problem and identifying the variables of interest. A node can take one of three types of value at a time: binary values, ordered values or integral values. Arcs are then created between the nodes. Finally, conditional probabilities are assigned to each node to quantify the relationships among the nodes.
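To make the role of the conditional probabilities concrete, here is a sketch of a hypothetical two-node network, Smoker -> Cancer, with binary nodes. The Smoker node and every probability below are invented for illustration and carry no medical meaning. The joint probability is the product of each node's probability given its parent, and Bayes' rule is then used to reason backwards from the effect to the cause.

```python
# Hypothetical network: Smoker -> Cancer (all numbers are made up).
p_smoker = 0.3                                # P(Smoker = true)
p_cancer_given = {True: 0.10, False: 0.02}    # P(Cancer = true | Smoker)

def joint(smoker, cancer):
    """P(Smoker = smoker, Cancer = cancer) as a product along the arc."""
    p_s = p_smoker if smoker else 1.0 - p_smoker
    p_c = p_cancer_given[smoker] if cancer else 1.0 - p_cancer_given[smoker]
    return p_s * p_c

# Marginal: sum the joint probability over the parent's two values.
p_cancer = joint(True, True) + joint(False, True)
# Bayes' rule: probability of the cause given the observed effect.
p_smoker_given_cancer = joint(True, True) / p_cancer

print(round(p_cancer, 3), round(p_smoker_given_cancer, 3))
```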
Application of Neural Networks
They are used in aerospace, automobile guidance systems, electronics, finance, production, medicine (cancer cell analysis, EEG, ECG analysis, prosthetic design, transplant time optimizer), speech, telecom, transport, software, time series prediction, signal processing and control, anomaly detection.
The first neural network was conceived by Warren McCulloch and Walter Pitts in 1943. They wrote a paper on how neurons might work and modelled a simple neural network using electric circuits.
In 1975, Fukushima developed the first multilayered neural network. Such networks have since been used to perform diverse tasks.
Later, deep learning systems were developed as Big Data, both structured and unstructured, became available.
Weights are numeric values that are multiplied by inputs. In backpropagation, they are modified to reduce the loss.
The network adjusts itself according to the difference between its predicted outputs and the target outputs in the training data.
An activation function is a mathematical formula that decides whether the neuron switches on or off.
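Three commonly used activation functions are sketched below for illustration; the text does not say which one is meant. The step function is the literal on/off switch, while the sigmoid and ReLU are smoother variants that are widely used in practice.

```python
import math

def step(z):
    """Hard switch: fully on for non-negative input, off otherwise."""
    return 1.0 if z >= 0 else 0.0

def sigmoid(z):
    """Soft switch: moves gradually from 0 (off) to 1 (on)."""
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    """Rectified linear unit: off for negative input, passes positive input through."""
    return max(0.0, z)

for z in (-2.0, 0.0, 2.0):
    print(z, step(z), round(sigmoid(z), 3), relu(z))
```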
A network has an input layer, one or more hidden layers of intermediary nodes, which take the weighted input and produce an output through activation, and an output layer.
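The layered structure can be sketched as a chain of function calls: the input values go into a hidden layer, and the hidden layer's activations go into the output layer. The helper name dense, the choice of ReLU for the hidden layer, the layer sizes and all weights and biases are assumptions made for the example.

```python
def dense(inputs, weights, biases, activation):
    """One layer: each row of weights belongs to one node of the layer."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def relu(z):
    return max(0.0, z)

def identity(z):
    return z

x = [1.0, 2.0]  # input layer: the raw values
# Hidden layer with three nodes (illustrative weights and biases).
hidden = dense(x, [[0.5, -1.0], [1.0, 1.0], [0.0, 0.5]], [0.0, -1.0, 0.5], relu)
# Output layer with a single node and no squashing.
output = dense(hidden, [[1.0, 0.5, -1.0]], [0.1], identity)
print(hidden, output)
```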
Convolutional neural networks are used in image processing, computer vision, speech recognition and machine translation.
In facial recognition, the brain first quickly resolves whether a face is male or female, or whether it is black or white. It is a matter of perception. In an ANN, the corresponding model could be a multi-layer perceptron.
Such networks also include Long Short-Term Memory (LSTM) networks, sequence models and modular models.
Types of Deep Learning Neural Networks
We come across three major categories of neural networks: convolutional neural networks (CNNs), recurrent neural networks (RNNs) and generative adversarial networks (GANs).
CNNs combine convolutional layers with fully connected layers. They are used to process images, as they can extract essential features from images at lower computational cost and in less time. Typical uses are image classification and object detection.
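The feature extraction performed by a convolutional layer can be sketched as a small kernel sliding over the image. The example below uses a hand-written vertical-edge kernel on a tiny illustrative image; in a real CNN the kernel values are learned during training. As in most deep learning libraries, the kernel is not flipped, so strictly speaking this is a cross-correlation.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

# Illustrative 3x4 image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-written kernel that responds where brightness increases left to right.
vertical_edge = [[-1, 1],
                 [-1, 1]]
feature_map = conv2d(image, vertical_edge)
print(feature_map)
```

The feature map is non-zero only in the column where the dark and bright regions meet. The same small kernel is reused at every position, which is what keeps the computation cheap.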
RNNs use feedback connections to learn patterns, and are used where the context of previous results helps to predict the next result. A common illustration is natural language processing (NLP).
Long short-term memory (LSTM) networks belong to this category. RNNs are used in voice recognition and in the analysis of time series.
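The feedback connection of an RNN can be sketched with a single recurrent unit whose hidden state is fed back in at every step. This is a plain recurrent unit, not an LSTM, which adds gates on top of the same idea. The scalar weights w_x and w_h, the tanh activation and the input sequences are illustrative assumptions.

```python
import math

def rnn_step(x, h_prev, w_x, w_h, b):
    """New hidden state from the current input and the previous hidden state."""
    return math.tanh(w_x * x + w_h * h_prev + b)

def run_rnn(sequence, w_x=0.5, w_h=0.8, b=0.0):
    h = 0.0  # the hidden state starts empty
    states = []
    for x in sequence:
        h = rnn_step(x, h, w_x, w_h, b)  # feedback: h depends on the earlier h
        states.append(h)
    return states

states_a = run_rnn([1.0, 0.0, 0.0])   # a pulse followed by silence
states_b = run_rnn([0.0, 0.0, 0.0])   # silence only
print(states_a)
print(states_b)
```

Although the last two inputs are identical in both sequences, the hidden states differ because the first sequence still "remembers" its initial pulse. This is the context that an RNN carries forward.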
GANs are used in instances where sufficient training data is not available: they generate new data similar to the input. There are two components, the generator and the discriminator. The generator produces data that follows the pattern of the original data, while the discriminator tries to distinguish the original data from the generated data. Both are trained in parallel.
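The interplay between the two components can be sketched with a deliberately tiny one-dimensional example: a linear generator, a logistic discriminator and one hand-derived gradient step for each. The standard GAN losses are used, but everything else (the fixed "noise" values, the real samples, the parameter names a, b, w, c and the learning rate) is an illustrative assumption. A practical GAN would use neural networks, random noise and many alternating steps.

```python
import math

def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))

def generate(z, a, b):
    """Generator: turns a noise value into a fake sample."""
    return a * z + b

def discriminate(x, w, c):
    """Discriminator: estimated probability that x is a real sample."""
    return sigmoid(w * x + c)

def d_loss(real, fake, w, c):
    # Low when real samples score near 1 and fake samples score near 0.
    return (-sum(math.log(discriminate(x, w, c)) for x in real)
            - sum(math.log(1.0 - discriminate(x, w, c)) for x in fake))

def g_loss(noise, a, b, w, c):
    # Low when the discriminator mistakes the generated samples for real ones.
    return -sum(math.log(discriminate(generate(z, a, b), w, c)) for z in noise)

real = [2.0, 2.5, 3.0]      # illustrative "original" data
noise = [-1.0, 0.0, 1.0]    # fixed noise values, for reproducibility
a, b = 1.0, 0.0             # generator parameters
w, c = 0.1, 0.0             # discriminator parameters
lr = 0.05

# Discriminator step: the generator's output is held fixed.
fake = [generate(z, a, b) for z in noise]
d_before = d_loss(real, fake, w, c)
grad_w = (sum(-(1 - discriminate(x, w, c)) * x for x in real)
          + sum(discriminate(x, w, c) * x for x in fake))
grad_c = (sum(-(1 - discriminate(x, w, c)) for x in real)
          + sum(discriminate(x, w, c) for x in fake))
w, c = w - lr * grad_w, c - lr * grad_c
d_after = d_loss(real, fake, w, c)

# Generator step: the discriminator is held fixed.
g_before = g_loss(noise, a, b, w, c)
grad_a = sum(-(1 - discriminate(generate(z, a, b), w, c)) * w * z for z in noise)
grad_b = sum(-(1 - discriminate(generate(z, a, b), w, c)) * w for z in noise)
a, b = a - lr * grad_a, b - lr * grad_b
g_after = g_loss(noise, a, b, w, c)

print(d_before, d_after, g_before, g_after)
```

Each step lowers the loss of the component being updated, and repeating the two steps in alternation is what training the pair in parallel amounts to.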