Ever tried to program an R-Squared, Y-Intercept, and Slope calculator via linear regression -- the good old sums of squares and codeviates from math class so many years ago? It sounds like an academic exercise, but this was an actual task I had to tackle for a custom reporting engine we wrote a while back. I had a rough time finding the nuts and bolts of the algorithm -- many online resources point you to Excel functions or a graphing calculator, but that wouldn't cut it for our app. In case there's another poor soul out there looking, let me post the foundation:
First, we create a ReportPoint object:
public class ReportPoint
{
    private double _dblX ;
    private double _dblY ;

    public double X_Coord
    {
        get{ return _dblX ; }
        set{ _dblX = value ; }
    }

    public double Y_Coord
    {
        get{ return _dblY ; }
        set{ _dblY = value ; }
    }

    public ReportPoint( double X_Coordinate, double Y_Coordinate )
    {
        _dblX = X_Coordinate ;
        _dblY = Y_Coordinate ;
    }
}
Nothing extraordinary there. Here is the good part, assuming you pass in an ArrayList of our ReportPoints above:
// Needs "using System ;" and "using System.Collections ;" at the top of the file.
public static void calcValues( ArrayList alPoints )
{
    double sumOfX = 0 ;
    double sumOfY = 0 ;
    double sumOfXSq = 0 ;
    double sumOfYSq = 0 ;
    double ssX = 0 ;
    double ssY = 0 ;
    double sumCodeviates = 0 ;
    double sCo = 0 ;
    // Accumulate the running sums that the regression formulas need.
    for( int ctr = 0; ctr < alPoints.Count; ctr++ )
    {
        ReportPoint objPoint = ( ReportPoint ) alPoints[ ctr ] ;
        double x = objPoint.X_Coord ;
        double y = objPoint.Y_Coord ;
        sumCodeviates += ( x*y ) ;
        sumOfX += x ;
        sumOfY += y ;
        sumOfXSq += ( x*x ) ;
        sumOfYSq += ( y*y ) ;
    }
    // Sums of squares of the deviations from the mean.
    ssX = sumOfXSq - ( ( sumOfX*sumOfX ) / alPoints.Count ) ;
    ssY = sumOfYSq - ( ( sumOfY*sumOfY ) / alPoints.Count ) ;
    // Correlation coefficient: r = RNumerator / sqrt( RDenom ).
    double RNumerator = ( alPoints.Count * sumCodeviates ) - ( sumOfX * sumOfY ) ;
    double RDenom = ( alPoints.Count*sumOfXSq - ( Math.Pow( sumOfX, 2 ) ) )
        * ( alPoints.Count*sumOfYSq - ( Math.Pow( sumOfY, 2 ) ) ) ;
    // Sum of codeviates, then slope = sCo / ssX.
    sCo = sumCodeviates - ( ( sumOfX*sumOfY ) / alPoints.Count ) ;
    double dblSlope = sCo / ssX ;
    double meanX = sumOfX / alPoints.Count ;
    double meanY = sumOfY / alPoints.Count ;
    // The fitted line passes through ( meanX, meanY ).
    double dblYIntercept = meanY - ( dblSlope * meanX ) ;
    double dblR = RNumerator / Math.Sqrt( RDenom ) ;
    Console.WriteLine( "R-Squared: {0}", Math.Pow( dblR, 2 ) ) ;
    Console.WriteLine( "Y-Intercept: {0}", dblYIntercept ) ;
    Console.WriteLine( "Slope: {0}", dblSlope ) ;
    Console.ReadLine() ; // keep the console window open
}
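For reference, the quantities computed in calcValues correspond to the usual least-squares formulas. Here n is the number of points, and SS_x, SS_y, and S_co are just shorthand for ssX, ssY, and sCo in the code:

```latex
\begin{align*}
SS_x &= \sum x^2 - \frac{\left(\sum x\right)^2}{n} \\
SS_y &= \sum y^2 - \frac{\left(\sum y\right)^2}{n} \\
S_{co} &= \sum xy - \frac{\left(\sum x\right)\left(\sum y\right)}{n} \\
\text{slope} &= \frac{S_{co}}{SS_x} \\
\text{intercept} &= \bar{y} - \text{slope} \cdot \bar{x} \\
r &= \frac{n \sum xy - \left(\sum x\right)\left(\sum y\right)}
          {\sqrt{\left(n \sum x^2 - \left(\sum x\right)^2\right)\left(n \sum y^2 - \left(\sum y\right)^2\right)}} \\
R^2 &= r^2
\end{align*}
```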
Yes, yes, yes, I know a typed collection instead of an ArrayList would be better; I moved this code into a console program to make an easy-to-follow demo of the logic and wanted to keep non-essentials to a minimum. Let's say I'm saving myself for Generics! So, in our Main method we'd have:
[STAThread]
static void Main( string[] args )
{
    ArrayList al = new ArrayList() ;
    al.Add( new ReportPoint( 3, 2.6 ) ) ;
    al.Add( new ReportPoint( 5.6, 20 ) ) ;
    al.Add( new ReportPoint( 8.2, 30 ) ) ;
    al.Add( new ReportPoint( 8.4, 50.7 ) ) ;
    al.Add( new ReportPoint( 9, 51.4 ) ) ;
    al.Add( new ReportPoint( 10, 37.9 ) ) ;
    calcValues( al ) ;
}
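There you have it. You really need to watch your order of operations in these formulas: a misplaced parenthesis around one of the divisions by the point count still compiles, but quietly gives the wrong answer.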
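For anyone who does want the typed-collection flavor, here is a minimal sketch of the same calculation using List<ReportPoint>, returning the values instead of printing them. RegressionResult, RegressionSketch, and Fit are hypothetical names made up for this illustration, not part of the original reporting engine, and ReportPoint is repeated in trimmed-down form only so the block compiles on its own. R-Squared is computed as sCo squared over ssX times ssY, which is algebraically the same value as squaring r from the method above.

```csharp
using System;
using System.Collections.Generic;

// Trimmed-down copy of the post's ReportPoint, repeated so the sketch compiles alone.
public class ReportPoint
{
    public double X_Coord;
    public double Y_Coord;

    public ReportPoint(double x, double y)
    {
        X_Coord = x;
        Y_Coord = y;
    }
}

// Hypothetical result holder; not part of the original code.
public class RegressionResult
{
    public double Slope;
    public double YIntercept;
    public double RSquared;
}

public static class RegressionSketch
{
    // Hypothetical helper: least-squares fit of y = Slope * x + YIntercept.
    public static RegressionResult Fit(List<ReportPoint> points)
    {
        int n = points.Count;
        if (n < 2)
            throw new ArgumentException("Need at least two points.", "points");

        double sumX = 0, sumY = 0, sumXSq = 0, sumYSq = 0, sumXY = 0;
        foreach (ReportPoint p in points)
        {
            sumX += p.X_Coord;
            sumY += p.Y_Coord;
            sumXSq += p.X_Coord * p.X_Coord;
            sumYSq += p.Y_Coord * p.Y_Coord;
            sumXY += p.X_Coord * p.Y_Coord;
        }

        // Divide by n before subtracting; this is where order of operations bites.
        double ssX = sumXSq - (sumX * sumX) / n;
        double ssY = sumYSq - (sumY * sumY) / n;
        double sCo = sumXY - (sumX * sumY) / n;

        // Exact comparison for simplicity; real data may call for a tolerance.
        if (ssX == 0)
            throw new ArgumentException("All X values are identical; the slope is undefined.", "points");

        RegressionResult result = new RegressionResult();
        result.Slope = sCo / ssX;
        result.YIntercept = (sumY / n) - result.Slope * (sumX / n);
        // Not meaningful when every Y value is identical (ssY is 0).
        result.RSquared = (sCo * sCo) / (ssX * ssY);
        return result;
    }
}
```

Fed the six sample points from Main, Fit should come back with a slope of about 6.39, a Y-intercept of about -14.98, and an R-Squared of about 0.773. Returning the values rather than writing them to the console also makes it easy to hand them to whatever draws the report.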
Happy .Netting!
Posted 05-21-2004 8:36 AM by grant.killian