Overall Efficiency
The following comparative example demonstrates the efficiency you can achieve with the Essential Math Module. Consider the Fortran program given below:
      double precision a(8100), b(8100), c(8100)
      double precision second, start, dt

      Niter = 1000

      do 500 N=500,4000,500
        do 100 i=1,N
          a(i) = 3
          b(i) = 5
100     continue

c Second returns elapsed time
        start = second()
        do 200 iter=1, Niter
          do 200 i=1, N
            c(i) = a(i) * b(i)
200     continue

        dt = second() - start

        write(6,2000) N, dt, dt/Niter
2000    format(i6, 5x, f8.5, 12x, f8.5)
500   continue
      end
Here is the C++ program, using the Essential Math Module:
 
#include <rw/math/mathvec.h>
#include <iostream>

using std::cout;
using std::endl;

double second(); // Returns elapsed time

int main() {
    int Niter = 1000;

    for(int N=500; N<=4000; N+=500){
        RWMathVec<double> a(N, 3);
        RWMathVec<double> b(N, 5);

        // Time Niter element-wise multiplications of two length-N vectors
        double start = second();
        for(int iter=0; iter<Niter; iter++){
            RWMathVec<double> c = a*b;
        }
        double dt = second() - start;

        cout << N << " " << dt << " " << dt/Niter << endl;
    }
    return 0;
}
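The listing declares second() but does not define it. As a minimal sketch, assuming a C++11 compiler and that "elapsed time" means wall-clock seconds since the first call, it could be written with std::chrono as shown here. This definition is an illustration only and is not part of the Essential Math Module.

```cpp
#include <chrono>

// Hypothetical definition of second(): returns the number of seconds
// elapsed since the first call, measured with a monotonic clock.
double second() {
    using clock = std::chrono::steady_clock;
    // The origin is fixed on the first call and reused afterwards.
    static const clock::time_point origin = clock::now();
    return std::chrono::duration<double>(clock::now() - origin).count();
}
```

A monotonic clock is used so that the difference between two readings cannot go negative if the system time is adjusted during the run.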
Notice that the C++ program is much simpler. Also, the Fortran program uses static memory: the arrays are declared with a fixed size (8100 elements here) and their storage is laid out at compile time. This greatly simplifies the job for the compiler, since the addresses of the arrays are known at compile time, but greatly complicates the job for the programmer, since you must know the maximum size of your arrays in advance. The Essential Math Module version uses dynamic memory, which is obtained at run time.
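The difference between the two storage strategies can be sketched in plain C++. The names kMaxSize, fixed_storage, and make_filled are hypothetical and exist only for this illustration; std::vector stands in for any container that obtains its memory at run time.

```cpp
#include <cstddef>
#include <vector>

// Static storage: the capacity is fixed when the program is compiled.
const std::size_t kMaxSize = 8100;      // same bound as the Fortran arrays
static double fixed_buffer[kMaxSize];

// Returns the static buffer if n elements fit in it, otherwise nullptr.
double* fixed_storage(std::size_t n) {
    return n <= kMaxSize ? fixed_buffer : nullptr;
}

// Dynamic storage: the length is chosen at run time, so no upper bound
// has to be known in advance.
std::vector<double> make_filled(std::size_t n, double value) {
    return std::vector<double>(n, value);
}
```

With the static buffer, a request for 9000 elements cannot be satisfied without recompiling; the dynamic version simply allocates what is asked for.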
Despite the use of dynamic memory and built-in slices, the Essential Math Module version is faster than the Fortran version. Because C++ allows all of the code for the various binary and unary operators to be localized in one spot, it becomes easy to write optimizations in the critical spots, and this is what we have done. With Fortran, you are at the mercy of the compiler.
In this example, the critical code resides inside the binary multiply operator:
 
template <class T>
RWMathVec<T> operator*(const RWMathVec<T>&,
                       const RWMathVec<T>&);
This operator requires that three addresses be juggled: the vector operands and the return value. Two of these, the operands, can potentially be slices (that is, their stride may not be 1), which requires that their strides be held as well. Note that other operators, such as:
 
RWMathVec<double>& RWMathVec<double>::operator+=(
                       const RWMathVec<double>&);
involve only two addresses, because the left-hand operand also receives the result, and so are much easier to optimize.
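To make the address counting concrete, here is a rough sketch of what such kernels might look like on raw pointers. It is not the library's implementation: multiply_strided and add_assign_strided are hypothetical names, the element type is fixed to double, and the result of the multiply is assumed to be contiguous.

```cpp
#include <cstddef>

// Hypothetical three-address kernel: c[i] = a[i*a_stride] * b[i*b_stride].
// Two operand pointers with their strides, plus the result pointer.
void multiply_strided(const double* a, std::ptrdiff_t a_stride,
                      const double* b, std::ptrdiff_t b_stride,
                      double* c, std::size_t n) {
    if (a_stride == 1 && b_stride == 1) {
        // Fast path: neither operand is a slice, so everything is contiguous.
        for (std::size_t i = 0; i < n; ++i) c[i] = a[i] * b[i];
        return;
    }
    // General path: each operand is indexed through its own stride.
    for (std::size_t i = 0; i < n; ++i) {
        const std::ptrdiff_t k = static_cast<std::ptrdiff_t>(i);
        c[i] = a[k * a_stride] * b[k * b_stride];
    }
}

// Hypothetical two-address kernel: lhs[i*lhs_stride] += rhs[i*rhs_stride].
// The left-hand side is both an operand and the destination.
void add_assign_strided(double* lhs, std::ptrdiff_t lhs_stride,
                        const double* rhs, std::ptrdiff_t rhs_stride,
                        std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        const std::ptrdiff_t k = static_cast<std::ptrdiff_t>(i);
        lhs[k * lhs_stride] += rhs[k * rhs_stride];
    }
}
```

The multiply has to track three pointers and two strides inside its loop, while the in-place add tracks only two pointers, which leaves the compiler and the hand optimizer with less state to keep in registers.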