Writing Your Own Function Objects
If we interpret model selection as search, the template parameter F is the search evaluation criterion. When you instantiate the model selection classes, F should denote a function object type that returns a numerical value whenever it is given a subset of predictor variables. For the class RWLinRegModelSelector<F>, F should define an operator() method that takes a matrix composed of some number of columns from a regression matrix, an observation vector, and a vector of calculated parameters. It should also define a default constructor from which valid function objects can be created.
 
class F {
public:
    F();
    double operator()( const RWGenMat<double>& regressionMatrixColumns,
                       const RWMathVec<double>& observationVector,
                       const RWMathVec<double>& parameters );
};
The implementation of operator() should expect:
* the number of rows in the matrix to equal the length of the observation vector
* the number of columns in the matrix to equal the length of the parameter vector
* the order of parameter values to correspond with the matrix column ordering.
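The expectations listed above can be checked defensively inside operator(). The following is a minimal sketch, not part of the library. It uses std::vector stand-ins (the hypothetical names Vector, Matrix, and SumSquaredErrorEval) in place of RWMathVec<double> and RWGenMat<double> so that it compiles without the library headers, and it returns the residual sum of squares as the evaluation criterion.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for RWMathVec<double> and RWGenMat<double>.
typedef std::vector<double> Vector;
typedef std::vector<Vector> Matrix;   // Matrix[i] is row i

class SumSquaredErrorEval {
public:
    SumSquaredErrorEval() {}

    double operator()(const Matrix& x, const Vector& y,
                      const Vector& params) const
    {
        // One matrix row per observation.
        assert(x.size() == y.size());

        double sse = 0.0;
        for (std::size_t i = 0; i < x.size(); ++i) {
            // One matrix column per parameter, in the same order.
            assert(x[i].size() == params.size());

            double prediction = 0.0;
            for (std::size_t j = 0; j < params.size(); ++j)
                prediction += x[i][j] * params[j];

            const double error = y[i] - prediction;
            sse += error * error;
        }
        return sse;
    }
};
```

In this sketch, params[j] multiplies column j of every row, which is why the parameter order has to follow the column order of the matrix.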
For the class RWLogRegModelSelector<F>, the function object F should have the same interface as above, except that operator() takes an observation vector consisting of Boolean elements rather than double-precision elements.
 
class F {
public:
    F();
    double operator()( const RWGenMat<double>& regressionMatrixColumns,
                       const RWMathVec<bool>& observationVector,
                       const RWMathVec<double>& parameters );
};
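As an illustration of a criterion that works with Boolean observations, the sketch below computes minus twice the log-likelihood of a logistic model for the given columns and parameters. It is an assumption-laden example rather than library code: Vector, Matrix, BoolVector, and LogisticDevianceEval are hypothetical names, and std::vector stands in for the RWGenMat and RWMathVec types so that the block compiles on its own.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the library's matrix and vector types.
typedef std::vector<double> Vector;
typedef std::vector<Vector> Matrix;      // Matrix[i] is row i
typedef std::vector<bool>   BoolVector;  // stand-in for RWMathVec<bool>

class LogisticDevianceEval {
public:
    LogisticDevianceEval() {}

    double operator()(const Matrix& x, const BoolVector& y,
                      const Vector& params) const
    {
        assert(x.size() == y.size());

        double logLik = 0.0;
        for (std::size_t i = 0; i < x.size(); ++i) {
            assert(x[i].size() == params.size());

            // Linear predictor for observation i.
            double eta = 0.0;
            for (std::size_t j = 0; j < params.size(); ++j)
                eta += x[i][j] * params[j];

            // With p = 1 / (1 + exp(-eta)):
            //   log(p)     = -log(1 + exp(-eta))  when y[i] is true
            //   log(1 - p) = -log(1 + exp(eta))   when y[i] is false
            const double z = y[i] ? -eta : eta;
            logLik -= std::log1p(std::exp(z));
        }
        return -2.0 * logLik;
    }
};
```

Smaller values indicate a better fit. The sketch does not guard against overflow in std::exp for very large linear predictors.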
The following example shows how you might implement a function object that produces Mallows' Cp statistic for subsets in linear regression. Note that you must take extra steps to run the example in a multithreaded environment, because access to the static variable fullModelMSE is not thread-safe.
 
#include <rw/math/genmat.h>
#include <rw/math/mathvec.h>
 
class RWLinMallowsCpEval {
public:
    RWLinMallowsCpEval() {}

    double operator()( const RWGenMat<double>& xdata,
                       const RWMathVec<double>& ydata,
                       const RWMathVec<double>& params ) const
    {
        const size_t numObs = ydata.length();
        const size_t numCoeffs = params.length();

        RWMathVec<double> predictions = product(xdata, params);
        RWMathVec<double> errors = ydata - predictions;

        // Sum of squared errors for this subset of predictors.
        double SSE = 0.0;
        for ( size_t i = 0; i < numObs; i++ )
            SSE += errors(i)*errors(i);

        // Mallows' Cp: SSE / (full-model MSE) + 2p - n.
        return SSE / fullModelMSE + 2.0*numCoeffs - numObs;
    }

    // Mean squared error of the full model; set before running selection.
    static double fullModelMSE;
};
 
// Static initialization is needed for some linkers.
double RWLinMallowsCpEval::fullModelMSE = 0.0;
 
#include <iostream>
#include <rw/analytics/linregress.h>
#include <rw/analytics/lranova.h>
#include <rw/analytics/lnrmodsel.h>

// This function (implementation not provided) reads in some data.
// The outputs are passed by reference so that the caller sees the data.
extern void getDataFromFile(const char* fileName,
                            RWGenMat<double>& predMat,
                            RWMathVec<double>& obsVec);

int main() {
    RWGenMat<double> predictorMatrix;
    RWMathVec<double> observationVector;
    getDataFromFile("regdata", predictorMatrix, observationVector);

    // Fit the full model and record its mean squared error for Cp.
    RWLinearRegression lr(predictorMatrix, observationVector);
    RWLinearRegressionANOVA anova(lr);
    RWLinMallowsCpEval::fullModelMSE = anova.meanSquareResidual();

    RWLinRegModelSelector<RWLinMallowsCpEval> cpsel(lr, rwForwardSelection);
    std::cout << "Selected variable subset according to Mallows Cp: "
              << cpsel.selectedParamIndices() << std::endl;
    return 0;
}
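The thread-safety caveat about fullModelMSE concerns unsynchronized reads and writes of a static variable. One possible way to address it, sketched here under the assumption of a C++11 compiler, is to hold the value in a std::atomic<double>. The class name CpCriterion and its cp() helper are hypothetical. cp() takes an already computed SSE so that the sketch compiles without the library headers.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

class CpCriterion {
public:
    CpCriterion() {}

    // Mallows' Cp from a subset's SSE: SSE / (full-model MSE) + 2p - n.
    double cp(double sse, std::size_t numCoeffs, std::size_t numObs) const
    {
        // Atomic read: safe even if another thread is storing a new value.
        const double mse = fullModelMSE.load();
        assert(mse > 0.0);  // must be set before the first evaluation
        return sse / mse
             + 2.0 * static_cast<double>(numCoeffs)
             - static_cast<double>(numObs);
    }

    // Shared by all instances, as in the example above.
    static std::atomic<double> fullModelMSE;
};

std::atomic<double> CpCriterion::fullModelMSE(0.0);
```

This makes individual reads and writes well defined, but all threads still share a single value. If different threads run selections on different data sets, each would need its own criterion type or some other way to keep the full-model MSE separate.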