!CACHE

This system variable tells PV-WAVE the data-cache sizes on the host. It is typically set in a PV-WAVE startup file. Alternatively, the values can be provided as inputs to the performance tuner OMPTUNE, in which case !CACHE is loaded with them when the tuning file is loaded, typically from a startup file.

The default values for each tag are:

** Structure !CACHE, 5 tags, 40 length:
  LINE LONG                         64
  L1   LONG                      32768
  L2   LONG                     262144
  L3   LONG                    8388608
  L4   LONG                   67108864

Tag LINE is the number of bytes in a data-cache line. Tags L1, L2, and L3 are the sizes in bytes of the smallest, second smallest, and third smallest data caches, respectively. Tag L4 is currently unused.

For optimal performance of cache-blocked algorithms, these values should match the cache sizes on the host. If the defaults in !CACHE do not match, change them to the correct values, most conveniently in a PV-WAVE startup file. For example, to change the L1 and L2 tags, add the following lines to your startup file:

!CACHE.L1 = 65536
!CACHE.L2 = 524288
SET_OMP

A call to SET_OMP is always required for any changes to !CACHE to take effect.

Note:

If you prefer not to set !CACHE directly, you can provide the data-cache size values as inputs to OMPTUNE, where they are saved to the tuning file. When the tuning file is loaded, which can be done from a startup file, !CACHE is loaded with the saved values.

There are several ways to determine the cache sizes on the host. The recommended way is to get the values directly from the chip vendor's website. The operating system may also report them; on Linux, for example, the following commands can be used:

getconf LEVEL1_DCACHE_LINESIZE
getconf LEVEL1_DCACHE_SIZE
getconf LEVEL2_CACHE_SIZE
getconf LEVEL3_CACHE_SIZE
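
The commands above can be combined into a small script that prints startup-file lines ready to paste. The following is a minimal sketch and not part of PV-WAVE: the function names cache_value and emit_cache_settings are hypothetical, and the script assumes that a missing, non-numeric, or zero getconf result means the OS did not report the value, in which case the documented default is kept. On Linux, getconf reports these sizes in bytes, which is the unit !CACHE expects.

```shell
#!/bin/sh
# Sketch: print PV-WAVE startup lines for !CACHE from Linux getconf values.
# Assumption: an empty, non-numeric, or zero result means "not reported",
# so the documented default for that tag is used instead.

# cache_value NAME DEFAULT
# Print the getconf value for NAME if it is a positive integer,
# otherwise print DEFAULT.
cache_value() {
    value=$(getconf "$1" 2>/dev/null) || value=0
    case "$value" in
        ''|*[!0-9]*) value=0 ;;   # empty or contains a non-digit
    esac
    if [ "$value" -gt 0 ]; then
        printf '%s\n' "$value"
    else
        printf '%s\n' "$2"
    fi
}

# emit_cache_settings
# Write the startup-file lines to standard output. SET_OMP comes last
# because it is required for changes to !CACHE to take effect.
emit_cache_settings() {
    printf '!CACHE.LINE = %s\n' "$(cache_value LEVEL1_DCACHE_LINESIZE 64)"
    printf '!CACHE.L1 = %s\n' "$(cache_value LEVEL1_DCACHE_SIZE 32768)"
    printf '!CACHE.L2 = %s\n' "$(cache_value LEVEL2_CACHE_SIZE 262144)"
    printf '!CACHE.L3 = %s\n' "$(cache_value LEVEL3_CACHE_SIZE 8388608)"
    printf 'SET_OMP\n'
}

emit_cache_settings
```

Check the printed values against the chip vendor's figures before adding them to your startup file, since some systems (virtual machines, for example) may report incomplete cache information.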

In principle, cache-blocking could be disabled by setting all !CACHE tags to the same large value (e.g., LONG_MAX). This is not recommended, since performance is generally much better when the tags are set accurately, or even to reasonable approximations.