Stochastic Gradient Descent (SGD): A variant of gradient descent that updates the model's parameters after each training example or small mini-batch of examples, rather than after a full pass over the entire dataset. Each update is much cheaper to compute, so training often progresses faster in wall-clock time, and the noise in the per-batch gradient estimates can help the optimizer escape shallow local minima (see the sketch below).
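As a minimal illustration of the idea, the following sketch fits a linear regression with mini-batch SGD using NumPy. The function name, hyperparameters, and toy data are all illustrative assumptions, not part of any particular library; the point is simply that the parameter update happens per batch rather than per dataset pass.

```python
import numpy as np

def sgd_linear_regression(X, y, lr=0.01, epochs=50, batch_size=8, seed=0):
    """Fit y ~ X @ w + b by minimizing mean squared error with mini-batch SGD."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        # Shuffle once per epoch so each mini-batch is a fresh random sample.
        order = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            Xb, yb = X[idx], y[idx]
            # Gradient of the MSE on this batch only -- the "stochastic" part.
            err = Xb @ w + b - yb
            grad_w = 2.0 * Xb.T @ err / len(idx)
            grad_b = 2.0 * err.mean()
            # Update parameters after each mini-batch, not after the full dataset.
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b

# Toy usage: recover w = [3, -2], b = 0.5 from noisy samples.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = X @ np.array([3.0, -2.0]) + 0.5 + rng.normal(scale=0.1, size=200)
w, b = sgd_linear_regression(X, y)
print(w, b)  # close to [3, -2] and 0.5
```

Setting `batch_size=1` gives per-example SGD, while `batch_size=n` recovers full-batch gradient descent; the mini-batch setting shown here is the usual compromise between gradient noise and compute efficiency.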