An interesting paper from this period demonstrates the wide applicability of information-theoretic techniques for detecting arbitrary relationships between variables. It took seventeen years from my 1994 paper on using mutual information to identify informative features for the idea to reach this wider level of adoption.
It is absolutely clear - but still unknown to a large public - that one variable can have zero correlation with another while still containing the complete information needed to predict it!
The cause of the problem: correlation measures only linear relationships. The lesson: do not use correlation between variables if you suspect nonlinear relationships in your system. It is like using a hammer to drive screws.
Mutual information and related measures can identify relationships that remain hidden when you limit your tools to linear ones. If you are interested in prediction and rely on correlation alone, you may throw away very informative, and therefore very useful, variables.
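A minimal sketch of the zero-correlation-with-full-information phenomenon: take y = x² with x drawn symmetrically around zero. Pearson correlation between x and y is essentially zero, yet y is completely determined by x, so their mutual information is large. The histogram-based estimator below is a simple illustration chosen for self-containedness, not the estimator from either cited paper.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 100_000)
y = x ** 2  # fully determined by x, yet linearly uncorrelated with it

# Pearson correlation is near zero: the relationship is symmetric, not linear.
corr = np.corrcoef(x, y)[0, 1]

def mutual_information(a, b, bins=32):
    """Crude histogram estimate of mutual information, in bits."""
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    pxy = joint / joint.sum()            # joint distribution over bins
    px = pxy.sum(axis=1, keepdims=True)  # marginal of a
    py = pxy.sum(axis=0, keepdims=True)  # marginal of b
    nz = pxy > 0                         # skip empty cells (0 * log 0 = 0)
    return float((pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])).sum())

mi = mutual_information(x, y)
print(f"correlation = {corr:.4f}, mutual information = {mi:.2f} bits")
```

A correlation-based feature selector would discard x as useless for predicting y; a mutual-information-based one would rank it as maximally informative.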
- R. Battiti. Using the mutual information for selecting features in supervised neural net learning. IEEE Transactions on Neural Networks, 5(4):537-550 (1994).
- D. N. Reshef et al. Detecting Novel Associations in Large Data Sets. Science, 334:1518 (2011).