KL (Kullback-Leibler) divergence is another way to compare two probability distributions. It measures how much one distribution diverges from the other; note that it is asymmetric, so D(P || Q) generally differs from D(Q || P), and it is therefore not a true distance metric.
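As a minimal sketch, KL divergence between two discrete distributions P and Q is the sum of p * log(p / q) over all outcomes. The distributions below are hypothetical, chosen only to illustrate the computation and the asymmetry:

```python
import numpy as np

# Hypothetical example distributions over the same 4 outcomes
p = np.array([0.1, 0.4, 0.3, 0.2])
q = np.array([0.25, 0.25, 0.25, 0.25])  # uniform reference

def kl_divergence(p, q):
    """D(p || q) = sum over outcomes of p * log(p / q).

    Assumes p and q are proper distributions with no zero entries
    where p is positive (otherwise the log term is undefined).
    """
    return np.sum(p * np.log(p / q))

print(kl_divergence(p, q))  # how much p diverges from q
print(kl_divergence(q, p))  # a different value: KL is asymmetric
print(kl_divergence(p, p))  # 0.0: a distribution does not diverge from itself
```

Using the natural logarithm gives the divergence in nats; swapping in `np.log2` would give bits.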
Mutual information measures the mutual dependency between two random variables: it tells us how much information about one variable is carried by the other. It is more general than the ordinary correlation coefficient, which captures only linear relationships, because mutual information also detects nonlinear dependence.
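As a sketch of the computation, mutual information can be read directly off a joint distribution: it is the KL divergence between the joint p(x, y) and the product of the marginals p(x)p(y), so it is zero exactly when the variables are independent. The 2x2 joint table below is hypothetical:

```python
import numpy as np

# Hypothetical joint distribution of two binary variables X and Y
joint = np.array([[0.3, 0.1],
                  [0.1, 0.5]])

px = joint.sum(axis=1)  # marginal distribution of X
py = joint.sum(axis=0)  # marginal distribution of Y

# I(X; Y) = sum over (x, y) of p(x,y) * log( p(x,y) / (p(x) * p(y)) ),
# i.e. the KL divergence from the product of marginals to the joint
mi = np.sum(joint * np.log(joint / np.outer(px, py)))
print(mi)  # positive: X and Y are dependent

# For an independent pair, the joint equals the product of marginals,
# so every log term is log(1) = 0 and the mutual information vanishes
independent = np.outer(px, py)
print(np.sum(independent * np.log(independent / np.outer(px, py))))  # 0.0
```

A correlation coefficient on the same data could be near zero even when mutual information is large, which is the sense in which it is the more general dependency measure.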