Differentiate between Batch Gradient Descent, Mini-Batch Gradient Descent, and Stochastic Gradient Descent.
Gradient descent is one of the most widely used optimization algorithms in machine learning and deep learning for updating a model's parameters. It has three variants, which differ in how much of the training data is used to compute each parameter update:

Batch Gradient Descent: the gradient is computed over the entire dataset, so every parameter update uses all training samples.

Stochastic Gradient Descent: the gradient is computed over a single training sample, so the parameters are updated once per sample.

Mini-Batch Gradient Descent: the gradient is computed over a small batch of training samples, striking a balance between the other two.

For example, if a dataset has 1000 data points, batch GD computes each update on all 1000 points, stochastic GD updates the parameters one sample at a time, and mini-batch GD uses a batch of, say, 100 points per update.
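To make the 1000-point example concrete, here is a minimal NumPy sketch of the three update loops on a toy one-parameter linear-regression problem; the data, learning rate, epoch counts, and batch size are illustrative assumptions, not part of the original answer.

```python
import numpy as np

# Toy data: y = 3x + noise, 1000 samples (illustrative).
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 1))
y = 3 * X[:, 0] + 0.1 * rng.standard_normal(1000)

def gradient(w, X_b, y_b):
    """Gradient of mean squared error w.r.t. the single weight w,
    computed on whatever subset of samples is passed in."""
    preds = X_b[:, 0] * w
    return 2 * np.mean((preds - y_b) * X_b[:, 0])

lr = 0.1  # illustrative learning rate

# Batch GD: one update per epoch, each using all 1000 samples.
w = 0.0
for epoch in range(50):
    w -= lr * gradient(w, X, y)

# Stochastic GD: one update per individual sample.
w = 0.0
for epoch in range(5):
    for i in rng.permutation(len(X)):
        w -= lr * gradient(w, X[i:i + 1], y[i:i + 1])

# Mini-batch GD: one update per batch of, say, 100 samples.
w, batch_size = 0.0, 100
for epoch in range(20):
    idx = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        b = idx[start:start + batch_size]
        w -= lr * gradient(w, X[b], y[b])

print(f"learned w = {w:.3f}")  # should end up close to the true value 3
```

All three loops apply the same update rule; only the number of samples fed to gradient() per step changes, which is exactly the distinction between the three variants.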