What is Reinforcement Learning?
Reinforcement Learning (commonly abbreviated as RL) is an area of Machine Learning. As the name suggests, it is about taking suitable actions to maximize reward in a particular situation. Software agents and machines use it to find, through repeated trial and error, the best possible behavior or path to take in a specific situation.

The main characteristics of Reinforcement Learning are summarized as follows:

  • Input: The input is the initial state from which the model starts.
  • Output: There are many possible outputs, because a particular task can have a variety of solutions.
  • Training: Training is based on the input. The model returns a state, and the user then decides whether to reward or punish the model based on its output.
  • The model keeps learning.
  • The maximum reward determines the best solution.
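The points above can be illustrated with a minimal sketch of tabular Q-learning, one common Reinforcement Learning algorithm. The corridor environment, the reward values, and every name and parameter below are assumptions made for illustration only; they do not come from the article.

```python
import random

# Hypothetical toy environment: a corridor of 5 cells.
# The agent starts in cell 0 and the goal is cell 4.
N_STATES = 5
ACTIONS = (-1, +1)  # move left, move right
GOAL = N_STATES - 1


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), GOAL)
    if next_state == GOAL:
        return next_state, 1.0, True   # reward for reaching the goal
    return next_state, -0.1, False     # small penalty for every other step


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Run tabular Q-learning and return the table q[state][action_index]."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False  # input: the initial state
        while not done:
            # Explore occasionally, otherwise take the best-known action.
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
            next_state, reward, done = step(state, ACTIONS[a])
            # Move the estimate toward reward + discounted best future value.
            future = 0.0 if done else gamma * max(q[next_state])
            q[state][a] += alpha * (reward + future - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action for every non-goal state."""
    return [
        ACTIONS[max(range(len(ACTIONS)), key=lambda i: q[s][i])]
        for s in range(GOAL)
    ]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this sketch the environment, rather than a human user, hands out the reward and the penalty, and the action with the highest learned value in each state is taken as the best solution.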

How is it different from Supervised Learning?
Supervised Learning relies on a training set that acts as an answer key, so the model is trained on the correct answers themselves. In Reinforcement Learning, there is no answer key; instead, the reinforcement agent decides what to do to perform the given task. In the absence of a training dataset, it is bound to learn from experience. Each right step earns a reward, while each wrong step incurs a penalty.

 

REINFORCEMENT LEARNING vs. SUPERVISED LEARNING

  • Decision making: Reinforcement Learning is about making decisions sequentially. Put simply, the output depends on the state of the current input, and the next input depends on the output of the previous one. In Supervised Learning, the decision is made on the initial data or the feedback given.
  • Dependency: In Reinforcement Learning, decisions depend on one another, so labels are given to sequences of dependent decisions. In Supervised Learning, decisions are independent of each other, so a label is assigned to each decision.
  • Example: A chess game (Reinforcement Learning); object recognition (Supervised Learning).

 

Applications and Use Cases of Reinforcement Learning
In the era of Convolutional Neural Networks (CNNs), Reinforcement Learning as a framework seems to be undervalued. Yet this branch of Machine Learning is unique and important in its own right. Some uses of Reinforcement Learning are as follows:

  • Resource Management in Computer Clusters:

Reinforcement Learning can be used to learn automatically how to allocate and schedule computer resources for waiting jobs, with the primary objective of minimizing the average job slowdown.
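As a rough illustration of that objective, the sketch below computes the average job slowdown and turns it into a reward. It assumes the common definition of slowdown as the time a job spends in the system divided by its ideal running time; the article does not define the term, and the tuple layout and function names are hypothetical.

```python
def average_slowdown(jobs):
    """Average slowdown over jobs given as (arrival, finish, duration) tuples.

    Assumed definition: slowdown = (finish - arrival) / duration, so a job
    that never waits has a slowdown of exactly 1.0.
    """
    slowdowns = [
        (finish - arrival) / duration for arrival, finish, duration in jobs
    ]
    return sum(slowdowns) / len(slowdowns)


def scheduling_reward(jobs):
    """Negate the average slowdown so that maximizing reward minimizes it."""
    return -average_slowdown(jobs)


if __name__ == "__main__":
    # Hypothetical finished jobs: (arrival time, finish time, duration).
    finished = [(0, 10, 5), (2, 6, 4)]
    print(average_slowdown(finished))   # (2.0 + 1.0) / 2 = 1.5
    print(scheduling_reward(finished))  # -1.5
```

Negating the metric is a design choice: a Reinforcement Learning agent maximizes reward, so a quantity to be minimized is usually passed to it with the sign flipped.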

  • Traffic Light Control:

Researchers have used Reinforcement Learning to tackle the traffic congestion problem. Although the approach was tested only in a simulated environment, it showed a significant improvement over conventional traffic control methods.

Example: The figure below depicts five agents placed in a five-intersection traffic network, with a Reinforcement Learning agent at the central intersection to control traffic signaling. The state was defined as an eight-dimensional vector, with each element representing the relative traffic flow of one lane. Eight choices were available to the agent, each representing a phase combination, and the reward function was defined as the reduction in delay compared with the previous time step.
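The state, action, and reward in the example above might be encoded roughly as follows. This is a sketch under stated assumptions: "relative traffic flow" is taken here to mean each lane's share of the total flow, which the article does not specify, and all names are hypothetical.

```python
N_LANES = 8    # state: one relative-flow value per lane
N_PHASES = 8   # actions: indices 0..7, one per phase combination


def make_state(lane_flows):
    """Turn raw per-lane flows into an eight-dimensional state vector.

    Assumption: each element is the lane's share of the total flow.
    """
    if len(lane_flows) != N_LANES:
        raise ValueError(f"expected {N_LANES} lane flows")
    total = sum(lane_flows)
    if total == 0:
        return (0.0,) * N_LANES
    return tuple(flow / total for flow in lane_flows)


def delay_reward(previous_delay, current_delay):
    """Reward is the reduction in delay compared with the previous time step."""
    return previous_delay - current_delay


if __name__ == "__main__":
    state = make_state([12, 4, 8, 8, 0, 4, 2, 2])  # hypothetical vehicle counts
    print(state)
    print(delay_reward(previous_delay=30.0, current_delay=24.0))  # 6.0
```

A positive reward means the chosen phase combination reduced the delay; a negative reward means the delay grew.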

  • Robotics:

A tremendous amount of work in robotics applies Reinforcement Learning. It can help a robot learn policies that map raw video images to the robot’s actions.

  • Web System Configuration:

Reinforcement Learning can help tune a web system's configuration so that the measured response time approaches the targeted response time.

  • Chemistry:

Reinforcement Learning can also be applied to optimizing chemical reactions. Combined with an LSTM to model the policy function, the Reinforcement Learning agent can optimize a chemical reaction modeled as a Markov decision process (MDP) characterized by {S, A, P, R}, where S is the set of experimental conditions (such as temperature and pH), A is the set of all possible actions that can change the experimental conditions, P is the transition probability from the current experimental condition to the next, and R is the reward, which is a function of the state.
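To make the {S, A, P, R} notation concrete, here is a toy MDP written out as plain Python data. Every state, action, probability, and reward value is made up for illustration. The one-step lookahead used to pick an action is a deliberately simple stand-in, not the LSTM-based policy mentioned above.

```python
# S: hypothetical experimental conditions (temperature settings only).
S = ("low", "medium", "high")

# A: hypothetical actions that change the experimental condition.
A = ("cool", "heat")

# P: transition probabilities, P[(state, action)][next_state].
P = {
    ("low", "cool"): {"low": 1.0},
    ("low", "heat"): {"medium": 0.9, "low": 0.1},
    ("medium", "cool"): {"low": 0.9, "medium": 0.1},
    ("medium", "heat"): {"high": 0.9, "medium": 0.1},
    ("high", "cool"): {"medium": 0.9, "high": 0.1},
    ("high", "heat"): {"high": 1.0},
}

# R: reward as a function of the state (made-up reaction yields).
R = {"low": 0.2, "medium": 1.0, "high": 0.4}


def expected_next_reward(state, action):
    """Expected reward of the state reached after taking action in state."""
    return sum(prob * R[nxt] for nxt, prob in P[(state, action)].items())


def best_action(state):
    """One-step greedy choice: the action with the highest expected reward."""
    return max(A, key=lambda action: expected_next_reward(state, action))


if __name__ == "__main__":
    for s in S:
        print(s, "->", best_action(s))
```

A full Reinforcement Learning agent would look further ahead than one step and would usually not be given P directly; it would have to estimate the effect of its actions from experiments.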

In conclusion, Reinforcement Learning is highly helpful in industry and daily life, though its industrial use has drawn criticism. Despite its weaknesses, Reinforcement Learning is useful in corporate research given its vast potential in decision making.

In the future, Reinforcement Learning is expected to assist humans and may evolve toward Artificial General Intelligence (AGI). Imagine a robot assisting you in your work. Amazing, isn’t it?