Sunday, March 25, 2007

Friday, March 09, 2007

What is the difference between correlation and linear regression?

FAQ# 1143

Correlation and linear regression are not the same.

Correlation quantifies the degree to which two variables are related. With correlation, you are not drawing a best-fit line (that is regression). You are simply computing a correlation coefficient (r) that tells you how closely one variable tends to change when the other one does. When r is 0.0, there is no linear relationship. When r is positive, one variable tends to go up as the other goes up. When r is negative, one variable tends to go up as the other goes down.
With correlation, you don't have to think about cause and effect. It doesn't matter which of the two variables you call "X" and which you call "Y". You'll get the same correlation coefficient if you swap the two.
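
As a rough illustration of the symmetry described above, here is a minimal Python sketch that computes the Pearson correlation coefficient both ways. The data values and the helper name pearson_r are made up for this example and are not part of the original FAQ.

```python
import numpy as np

# Made-up illustrative measurements (an assumption, not from the FAQ).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

def pearson_r(a, b):
    """Pearson correlation coefficient of two equal-length arrays."""
    a_dev = a - a.mean()
    b_dev = b - b.mean()
    return float((a_dev * b_dev).sum()
                 / np.sqrt((a_dev ** 2).sum() * (b_dev ** 2).sum()))

r_xy = pearson_r(x, y)   # treat x as "X" and y as "Y"
r_yx = pearson_r(y, x)   # swap the roles of the two variables

# Both values agree: the formula treats the two variables symmetrically.
print(r_xy, r_yx)
```

Negating one of the variables (for example pearson_r(x, -y)) flips the sign of r, which matches the "one goes up as the other goes down" case.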

Correlation is almost always used when you measure both variables. It is rarely appropriate when one variable is something you experimentally manipulate.

Linear regression finds the best line that predicts Y from X. The X variable is usually something you experimentally manipulate (time, concentration...) and the Y variable is something you measure. The decision of which variable you call "X" and which you call "Y" matters, as you'll get a different best-fit line if you swap the two. The line that best predicts Y from X is not the same as the line that best predicts X from Y (although both lines have the same value of R²).
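
The asymmetry can be checked numerically. The sketch below fits a least-squares line in both directions with NumPy's polyfit and compares the results; the data values and the helper name fit_line are assumptions made up for this example.

```python
import numpy as np

# Made-up illustrative data (an assumption, not from the FAQ).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.2, 2.9, 2.7, 4.8, 4.1, 6.3])

def fit_line(pred, resp):
    """Least-squares line resp = intercept + slope * pred, plus its R^2."""
    slope, intercept = np.polyfit(pred, resp, 1)
    residuals = resp - (intercept + slope * pred)
    total = ((resp - resp.mean()) ** 2).sum()
    r_squared = 1.0 - (residuals ** 2).sum() / total
    return float(slope), float(intercept), float(r_squared)

slope_yx, intercept_yx, r2_yx = fit_line(x, y)   # predict Y from X
slope_xy, intercept_xy, r2_xy = fit_line(y, x)   # predict X from Y

# Re-express the X-from-Y line on the same axes (Y as a function of X)
# so the two slopes can be compared directly.
slope_xy_as_y = 1.0 / slope_xy

print("Y from X slope:", slope_yx)
print("X from Y slope, drawn on the same axes:", slope_xy_as_y)
print("R^2 of each fit:", r2_yx, r2_xy)
```

The two slopes differ unless the points fall exactly on a line, while the two R² values agree. The product of the two fitted slopes (slope_yx * slope_xy) equals that shared R².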

Thursday, March 01, 2007

Significant paper

@article { Lee2005,
title = "Real observations of coastal algal blooms by an early warning system",
author = "J.H.W. Lee and I.J. Hodgkiss and I.H.Y. Lam",
year = "2005",
journal = "Estuarine, Coastal and Shelf Science",
volume = "65",
pages = "172--190"
}

Paper regarding the Hong Kong evaborate study.