TD3 (Twin Delayed Deep Deterministic Policy Gradient): A DRL algorithm that uses neural networks to approximate the action-value function and a deterministic policy. A replay buffer and target networks stabilize training, and twin critics, whose smaller estimate forms the learning target, reduce overestimation of the action-value function.
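
As a rough illustration of how the target networks and the overestimation control fit together, the sketch below computes a single TD3-style learning target in plain Python. It is a minimal sketch, not a reference implementation: the function name `td3_target`, its parameter names and default values, and the toy stand-in "networks" are all assumptions made for this example, and scalar states and actions are used for brevity.

```python
import random


def clip(x, low, high):
    """Clamp a scalar x to the interval [low, high]."""
    return max(low, min(high, x))


def td3_target(reward, next_state, done,
               target_actor, target_critic_1, target_critic_2,
               gamma=0.99, policy_noise=0.2, noise_clip=0.5,
               max_action=1.0, rng=random):
    """Compute one TD3-style target value (hypothetical helper).

    target_actor(s) -> a and target_critic_i(s, a) -> q are plain callables
    standing in for the target networks.
    """
    # Target policy smoothing: perturb the target action with clipped noise.
    noise = clip(rng.gauss(0.0, policy_noise), -noise_clip, noise_clip)
    next_action = clip(target_actor(next_state) + noise,
                       -max_action, max_action)

    # Clipped double Q-learning: take the smaller of the two target critics
    # to counteract overestimation of the action-value function.
    q_next = min(target_critic_1(next_state, next_action),
                 target_critic_2(next_state, next_action))

    # One-step bootstrapped target; no bootstrapping past a terminal state.
    return reward + gamma * (1.0 - float(done)) * q_next


# Toy stand-ins (assumptions for illustration only, not trained networks).
toy_actor = lambda s: 0.5 * s
toy_critic_1 = lambda s, a: s + a
toy_critic_2 = lambda s, a: s + a + 1.0  # deliberately more optimistic

if __name__ == "__main__":
    y = td3_target(1.0, 2.0, False, toy_actor, toy_critic_1, toy_critic_2)
    print(y)
```

In a full training loop, both critics would be regressed toward this target on minibatches drawn from the replay buffer, while the policy and the target networks would be updated less often than the critics.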