Mutual information coming of age in 2012?

LIONblog

An interesting paper [2] from this period describes the wide applicability of information-theoretic techniques for detecting arbitrary relationships between variables. It took seventeen years from my 1994 paper [1] on using mutual information to identify informative features to reach this wider level of adoption.

It is absolutely clear, but still unknown to a large public, that one variable may have zero correlation with another while still containing the complete information needed to predict it!

The cause of the problem: correlation measures only linear relationships. The lesson: do not use correlation between variables if you suspect nonlinear relationships in your system. It is like using a hammer to drive screws.
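A minimal numpy sketch of the point, assuming a textbook example (the choice of y = x² and the histogram-based plug-in estimator are illustrative, not taken from the papers cited): y is fully determined by x, yet their Pearson correlation is essentially zero, while the estimated mutual information is clearly positive.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 100_000)
y = x ** 2  # y is completely determined by x, but the relation is nonlinear

# Pearson correlation: near zero, because the parabola is symmetric around 0
r = np.corrcoef(x, y)[0, 1]

def mutual_information(a, b, bins=32):
    """Plug-in mutual information estimate (in bits) from a 2D histogram."""
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    p_xy = joint / joint.sum()          # joint distribution
    p_x = p_xy.sum(axis=1, keepdims=True)   # marginal of a
    p_y = p_xy.sum(axis=0, keepdims=True)   # marginal of b
    nz = p_xy > 0                       # avoid log(0) on empty cells
    return float((p_xy[nz] * np.log2(p_xy[nz] / (p_x @ p_y)[nz])).sum())

mi = mutual_information(x, y)
print(f"correlation = {r:.3f}, mutual information = {mi:.2f} bits")
```

Correlation, blind to the nonlinearity, reports "no relationship"; mutual information does not. The histogram estimator is crude (its value depends on the bin count), but it suffices to make the point.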

Mutual information and related measures can identify relationships that would remain hidden if you limited your tools to linear ones.

If you are interested in predictions and rely on correlation alone to select features, you may throw away very informative, and therefore very useful, variables.