Writing Your Own Function Objects

If we interpret model selection as search, we can interpret the template parameter F as the search evaluation criterion. When you instantiate the model selection classes, F should name a function object type that returns a numerical value whenever it is given a subset of predictor variables. For the class RWLinRegModelSelector<F>, the function object F should define an operator() method that takes a matrix composed of some number of columns from a regression matrix, an observation vector, and a vector of calculated parameters. It should also define a default constructor from which valid function objects can be created.

class F {
public:
    F();
    double operator()( const RWGenMat<double>& regressionMatrixColumns,
                       const RWMathVec<double>& observationVector,
                       const RWMathVec<double>& parameters );
};

The implementation of operator() should expect:

the number of rows in the matrix to equal the length of the observation vector

the number of columns in the matrix to equal the length of the parameter vector

the order of parameter values to correspond with the matrix column ordering.
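The expectations above can be made explicit inside operator(). The following is a minimal sketch rather than SourcePro code: Matrix and SumSquaredErrorEval are hypothetical stand-ins built on std::vector so that the example compiles without the library, and the criterion (a plain residual sum of squares) is chosen only for illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a dense matrix; not a SourcePro type.
struct Matrix {
    std::size_t rows, cols;
    std::vector<double> data;  // row-major, rows * cols entries
    double operator()(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Hypothetical criterion: residual sum of squares, with the shape
// expectations checked before any arithmetic is done.
struct SumSquaredErrorEval {
    SumSquaredErrorEval() {}

    double operator()(const Matrix& x,
                      const std::vector<double>& y,
                      const std::vector<double>& params) const
    {
        if (x.rows != y.size())
            throw std::invalid_argument("matrix rows != observation count");
        if (x.cols != params.size())
            throw std::invalid_argument("matrix columns != parameter count");

        double sse = 0.0;
        for (std::size_t i = 0; i < x.rows; ++i) {
            double prediction = 0.0;
            for (std::size_t j = 0; j < x.cols; ++j)
                prediction += x(i, j) * params[j];  // params[j] pairs with column j
            const double error = y[i] - prediction;
            sse += error * error;
        }
        return sse;
    }
};
```

A function object written against the real interface would take RWGenMat<double> and RWMathVec<double> arguments instead; whether to check shapes at run time or simply assume them is a design choice left to the implementer.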

For the class RWLogRegModelSelector<F>, the function object F should have the same interface as above, except that operator() takes an observation vector consisting of Boolean elements, rather than double-precision elements.

class F {
public:
    F();
    double operator()( const RWGenMat<double>& regressionMatrixColumns,
                       const RWMathVec<bool>& observationVector,
                       const RWMathVec<double>& parameters );
};
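As an illustration of the Boolean-observation interface, here is a minimal sketch of a criterion based on the deviance (minus twice the log-likelihood) of a logistic model. It is an assumption-laden example, not library code: Matrix and LogisticDevianceEval are hypothetical names, std::vector stands in for the SourcePro vector types, and this section does not say whether the selector prefers smaller or larger criterion values.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a dense matrix; not a SourcePro type.
struct Matrix {
    std::size_t rows, cols;
    std::vector<double> data;  // row-major, rows * cols entries
    double operator()(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Hypothetical criterion: deviance of a logistic model on Boolean outcomes.
struct LogisticDevianceEval {
    LogisticDevianceEval() {}

    double operator()(const Matrix& x,
                      const std::vector<bool>& y,
                      const std::vector<double>& params) const
    {
        double logLikelihood = 0.0;
        for (std::size_t i = 0; i < x.rows; ++i) {
            // Linear predictor for observation i.
            double eta = 0.0;
            for (std::size_t j = 0; j < x.cols; ++j)
                eta += x(i, j) * params[j];

            // Predicted probability that observation i is true.
            const double p = 1.0 / (1.0 + std::exp(-eta));
            logLikelihood += y[i] ? std::log(p) : std::log(1.0 - p);
        }
        return -2.0 * logLikelihood;
    }
};
```

The sketch makes no attempt to guard against probabilities that round to exactly 0 or 1; a production criterion would probably need to.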

The following example shows how you might implement a function object that computes Mallows' Cp statistic for subsets in linear regression. Note that you must take extra steps to run the example in a multithreaded environment, because access to the static variable fullModelMSE is not thread-safe.
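Before reading the listing, it may help to recall the form of the statistic that its return statement computes. In the notation assumed here (the symbols are not taken from this guide), SSE_p is the residual sum of squares of a subset model with p coefficients, MSE_full is the mean squared residual of the full model, and n is the number of observations:

```latex
C_p = \frac{\mathrm{SSE}_p}{\mathrm{MSE}_{\mathrm{full}}} + 2p - n
```

In the listing these quantities appear as SSE, fullModelMSE, numCoeffs, and numObs, respectively.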

#include <rw/math/genmat.h>
#include <rw/math/mathvec.h>

class RWLinMallowsCpEval {
public:
    RWLinMallowsCpEval() {}

    double operator()( const RWGenMat<double>& xdata,
                       const RWMathVec<double>& ydata,
                       const RWMathVec<double>& params ) const
    {
        const size_t numObs = ydata.length();
        const size_t numCoeffs = params.length();

        // Residuals of the subset model.
        RWMathVec<double> predictions = product(xdata, params);
        RWMathVec<double> errors = ydata - predictions;

        // Sum of squared errors of the subset model.
        double SSE = 0.0;
        for ( size_t i = 0; i < numObs; i++ )
            SSE += errors(i)*errors(i);

        return SSE / fullModelMSE + 2*numCoeffs - numObs;
    }

    // Mean squared residual of the full model; set it before selection.
    static double fullModelMSE;
};

// Static initialization is needed for some linkers.
double RWLinMallowsCpEval::fullModelMSE = 0.0;

#include <iostream>
#include <rw/analytics/linregress.h>
#include <rw/analytics/lranova.h>
#include <rw/analytics/lnrmodsel.h>

// This function (implementation not provided) reads in some data.
// The matrix and vector are passed by reference so it can fill them.
extern void getDataFromFile(const char* fileName,
                            RWGenMat<double>& predMat,
                            RWMathVec<double>& obsVec);

int main() {
    RWGenMat<double> predictorMatrix;
    RWMathVec<double> observationVector;
    getDataFromFile("regdata", predictorMatrix, observationVector);

    // Fit the full model and record its mean squared residual.
    RWLinearRegression lr(predictorMatrix, observationVector);
    RWLinearRegressionANOVA anova(lr);
    RWLinMallowsCpEval::fullModelMSE = anova.meanSquareResidual();

    RWLinRegModelSelector<RWLinMallowsCpEval> cpsel(lr,
                                                    rwForwardSelection);

    std::cout << "Selected variable subset according to Mallows Cp: "
              << cpsel.selectedParamIndices() << std::endl;
    return 0;
}
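One possible extra step for multithreaded use is to make the shared value atomic. The following is a minimal sketch under several assumptions: a C++11 compiler is available, MallowsCpSketch is a hypothetical stand-in for the evaluator above, and it takes precomputed residuals so that it compiles without SourcePro.

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the evaluator in the listing above.
class MallowsCpSketch {
public:
    MallowsCpSketch() {}

    // errors: residuals of the subset model; numCoeffs: its parameter count.
    double operator()(const std::vector<double>& errors,
                      std::size_t numCoeffs) const
    {
        double sse = 0.0;
        for (double e : errors)
            sse += e * e;

        const double mse = fullModelMSE.load();  // atomic read of the shared value
        return sse / mse
             + 2.0 * static_cast<double>(numCoeffs)
             - static_cast<double>(errors.size());
    }

    // Shared by all instances; reads and writes are individually atomic.
    static std::atomic<double> fullModelMSE;
};

std::atomic<double> MallowsCpSketch::fullModelMSE(1.0);
```

An atomic only removes the data race on individual reads and writes. If different threads run selections on different full models, each thread needs its own value; declaring the static member thread_local is one option, on the assumption that the selector invokes the function object on the same thread that set the value.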