RWalib Introduction
RWalib is a C library of array operations with automatic thread control (ATC). The operations are those that would form a basis for an array-based language like PV-WAVE, accommodating n-dimensional arrays in six real and two complex data-types. As a computational engine for PV-WAVE, RWalib has no error checking or memory management of its own, so output arrays are allocated by the caller.
Note: RWalib is supported only on 64-bit Linux and 64-bit Windows.
Following C convention, multidimensional arrays in RWalib are stored as flat arrays and referenced in row-major order. Those more comfortable with column-major referencing, e.g., users of MATLAB or PV-WAVE, can use RWalib with column-major data provided they remember to reverse the dimension order, e.g., an (m,n,p,q) array becomes a (q,p,n,m) array.
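The dimension-reversal rule can be checked with a few lines of plain C. The sketch below is not part of RWalib; the helper names row_major_offset and col_major_offset are hypothetical and exist only to show that element (i,j,k) of a column-major (m,n,p) array sits at the same flat offset as element (k,j,i) of a row-major (p,n,m) array.

```c
#include <assert.h>
#include <stddef.h>

/* Flat offset of element (i, j, k) in a row-major array with
   dimensions (d0, d1, d2): the last index varies fastest. */
static size_t row_major_offset(size_t i, size_t j, size_t k,
                               size_t d0, size_t d1, size_t d2)
{
    assert(i < d0 && j < d1 && k < d2);
    return (i * d1 + j) * d2 + k;
}

/* Flat offset of element (i, j, k) in a column-major array with
   dimensions (d0, d1, d2): the first index varies fastest. */
static size_t col_major_offset(size_t i, size_t j, size_t k,
                               size_t d0, size_t d1, size_t d2)
{
    assert(i < d0 && j < d1 && k < d2);
    return i + d0 * (j + d1 * k);
}
```

For any valid indices, col_major_offset(i, j, k, m, n, p) equals row_major_offset(k, j, i, p, n, m), so a column-major buffer can be handed to row-major code unchanged as long as the dimensions are listed in reverse.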
Each array operation has separate code for serial and OpenMP-parallel execution. The serial version is faster on a single thread and should be used whenever the operation is called from parallel code or is too small to benefit from multiple threads. If RWalib is called from a section of parallel code, the toggle alibParallel() should be used at the beginning and end of that section to disable and then re-enable parallel execution in RWalib. If parallel execution is enabled, RWalib automatically chooses between serial and parallel execution. By default, the decision is based on whether the number of parallel loop iterations exceeds a number called the global parallelization threshold, or GPT. The GPT defaults to 1000, but this number can be overridden by alibset(), a routine for runtime control over all performance parameters, including cache sizes and the maximum number of threads for parallel operations. Performance parameters are initialized by alibinit() and do not affect the calling routines.
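The default decision described above amounts to a single comparison. The following is a minimal sketch of that logic under stated assumptions, not RWalib source: the type thread_policy and the functions default_policy and use_parallel_code are hypothetical names, and the strict "greater than" test is an interpretation of "exceeds".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the library's internal state;
   these are not RWalib identifiers. */
typedef struct {
    bool parallel_enabled; /* off while the caller is inside parallel code */
    long threshold;        /* plays the role of the GPT */
} thread_policy;

/* Defaults taken from the text: parallel execution on, GPT = 1000. */
static thread_policy default_policy(void)
{
    thread_policy p;
    p.parallel_enabled = true;
    p.threshold = 1000;
    return p;
}

/* Returns true when the parallel code path would be chosen:
   parallel execution is enabled and the iteration count
   exceeds the threshold. */
static bool use_parallel_code(const thread_policy *p, long n_iterations)
{
    assert(n_iterations >= 0);
    return p->parallel_enabled && n_iterations > p->threshold;
}
```

With the default policy, an operation of 1000 iterations stays serial and one of 1001 iterations goes parallel; clearing parallel_enabled forces the serial path regardless of size.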
ATC represents a significant refinement to the default method of thread control. Instead of the GPT, a parallelization threshold exists for each array operation on each data-type, and these thresholds automatically determine whether a particular instance of an operation is large enough to use its parallel code. Moreover, instead of a fixed number of threads for all parallel operations, the number of threads may vary automatically so that it is optimal for each operation, regardless of its size and data-type. With ATC in effect, alibParallel() can still be used to toggle parallel execution, and alibset() can even be used to toggle between ATC and default thread control, but this should be unnecessary. ATC requires platform-specific tuning, and since RWalib is part of PV-WAVE, tuning is done in PV-WAVE with an easy-to-use routine named ompTune (graphical validation can be done with another easy-to-use routine called ompTspeedups). The result of tuning is a file, and once the file is loaded with alibset(), all RWalib performance parameters are loaded, and ATC is in effect unless suspended by the user.
RWalib has one include file, alib.h, with prototypes for all RWalib functions and with structure definitions, COMPLEX and DCOMPLEX, for single and double precision complex numbers. For a complex number z of either precision, its real and imaginary parts are z.r and z.i. Two other data-types are defined: wvlong is defined as a long except on Windows where it is defined as an INT64, and UCHAR is an unsigned char, though on Windows this definition already exists. alib.h defines other structures, but only SYS_CACHE is of interest to the user. It is used to inform RWalib about data-cache sizes and is initialized with default values by alibinit(). Its wvlong fields, line, l1, l2, l3, and l4, are the number of bytes in a cache line and in the L1, L2, L3, and L4 caches, but the l4 field can be ignored if the host does not have an L4 data-cache.
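To make the field names concrete, the sketch below uses local stand-ins rather than alib.h itself. Everything prefixed with demo_ is hypothetical: the float type of the r and i fields, the use of long long in place of INT64 on Windows, and the helper functions are assumptions for illustration, and real code should include alib.h and use COMPLEX, wvlong, and SYS_CACHE directly.

```c
#include <assert.h>

/* Local stand-ins for illustration only; the real definitions
   come from alib.h. */
#ifdef _WIN32
typedef long long demo_wvlong;  /* the text says INT64 on Windows */
#else
typedef long demo_wvlong;       /* a long everywhere else */
#endif

/* Single-precision complex number with parts z.r and z.i
   (assumed field type: float). */
typedef struct { float r, i; } demo_complex;

/* Cache description with the field names given in the text;
   all sizes are in bytes. */
typedef struct {
    demo_wvlong line, l1, l2, l3, l4;
} demo_sys_cache;

/* Complex product a * b written out through the r and i fields. */
static demo_complex demo_cmul(demo_complex a, demo_complex b)
{
    demo_complex z;
    z.r = a.r * b.r - a.i * b.i;
    z.i = a.r * b.i + a.i * b.r;
    return z;
}

/* Number of cache lines that fit in the L1 data-cache. */
static demo_wvlong demo_lines_in_l1(const demo_sys_cache *c)
{
    assert(c->line > 0);
    return c->l1 / c->line;
}
```

For example, multiplying (1 + 2i) by (3 + 4i) with demo_cmul gives r = -5 and i = 10, and a cache description with line = 64 and l1 = 32768 holds 512 lines in L1. Those cache sizes are example values, not RWalib defaults.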
Each chapter in this manual details a set of routines of similar functionality. Each routine has one C function for each of its supported data-types, and these function names differ only in the last character, which designates the data-type. For example, the eight functions for element-wise subtraction are:
alibSubtb — for type UCHAR
alibSubts — for type short
alibSubti — for type int
alibSubtl — for type wvlong
alibSubtf — for type float
alibSubtd — for type double
alibSubtc — for type COMPLEX
alibSubtz — for type DCOMPLEX
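The suffix convention in the list above is regular enough to express as a lookup. The sketch below is an illustration only and does not call RWalib; type_for_suffix and typed_name are hypothetical helpers that map a suffix character to the type named in the list and append it to a generic routine name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Maps a function-name suffix to the data-type it designates,
   following the list above. Returns NULL for an unknown suffix. */
static const char *type_for_suffix(char suffix)
{
    switch (suffix) {
    case 'b': return "UCHAR";
    case 's': return "short";
    case 'i': return "int";
    case 'l': return "wvlong";
    case 'f': return "float";
    case 'd': return "double";
    case 'c': return "COMPLEX";
    case 'z': return "DCOMPLEX";
    default:  return NULL;
    }
}

/* Writes generic + suffix (e.g. "alibSubt" + 'd') into out.
   Returns 0 on success, -1 if the suffix is unknown or the
   buffer is too small. */
static int typed_name(const char *generic, char suffix,
                      char *out, size_t out_size)
{
    int n;
    assert(generic != NULL && out != NULL);
    if (type_for_suffix(suffix) == NULL)
        return -1;
    n = snprintf(out, out_size, "%s%c", generic, suffix);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

Calling typed_name("alibSubt", 'd', buf, sizeof buf) fills buf with "alibSubtd", the double-precision variant, while an unlisted suffix such as 'x' is rejected.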
Each RWalib function name begins with the prefix "alib". The documentation for each routine begins with the generic routine name (alibSubt, for example), followed by a statement of its purpose, the prototypes for each data-type, a description of each input parameter, a description of each output parameter, and at least one example (usually several for multidimensional operations). For each RWalib routine, the documentation includes the name of its PV-WAVE API. Note, however, that for multidimensional operations, the order of dimensions is reversed in the PV-WAVE API since PV-WAVE is column-major.
Note: In the PV-WAVE installation, the RWalib binaries are found in <RW_DIR>/wave/bin/bin.*, and alib.h is found in <RW_DIR>/wave/src/priv.
Version 2017.0
Copyright © 2017, Rogue Wave Software, Inc. All Rights Reserved.