PV-WAVE Implementation of HDF5

The PV‑WAVE implementation of the HDF5 library follows the HDF5 API as documented by The HDF Group, with a few minor exceptions.

Initializing the HDF5 Module

Before you can use any of the PV‑WAVE HDF5 routines, you must initialize the HDF5 module. To do this, type:

WAVE> @hdf5_startup
   PV-WAVE:HDF5 Module Initialized
WAVE>

After the module is initialized, you can use any of the PV‑WAVE HDF5 routines.

Tip: Include a call to HDF5_STARTUP in your PV‑WAVE system startup file, or your personal PV‑WAVE startup file.

Initializing Common Block Variables

All of the data types (H5T_NATIVE_INT, and so on) and flag arguments (H5F_ACC_RDWR, and so on) for the HDF5 routines are used in the PV‑WAVE HDF5 routines exactly as they are shown in the HDF5 Reference Manual.

The values for these variables are stored in PV‑WAVE common blocks and initialized when the module is loaded.

Note:

The HDF5 common blocks must be included in any PV-WAVE procedures and functions that use the PV-WAVE HDF5 interface. To do this, place the line:

@hdf5_common

at the beginning of your PV-WAVE routines.

Exceptions

This section describes the PV‑WAVE exceptions to the HDF5 API.

General PV‑WAVE Exceptions to the HDF5 API

The following are PV-WAVE exceptions to the HDF5 API:

HDF5 1.8 modified the APIs for some HDF5 1.6 routines. When looking at the HDF5 1.8 documentation, you might see two versions of the routines:

<routine_name>1 

and:

<routine_name>2

The routines with a '1' appended to the name follow the HDF5 1.6 API. The routines with a '2' appended follow the HDF5 1.8 API.

In all cases, the PV-WAVE implementation uses the HDF5 1.6 API definition as documented in the <routine_name>1 section for that routine.

Note:

For these cases, call the routine using only the generic name, omitting the '1'.

Where you would use 'NULL' in C, use a zero ('0') in PV‑WAVE.

Specific PV‑WAVE Exceptions to the HDF5 API

Refer to the following topics for specific PV‑WAVE exceptions:

HOFFSET macro: The second argument must be a string.

Usage: HOFFSET(struct_name, "struct_element")

When creating a compound data type, use the routine H5wave_struct_sizeof() instead of sizeof() for the H5Tcreate() call.

Example:

new_id = H5Tcreate(H5T_COMPOUND, H5wave_struct_sizeof(wave_struct_instance))

H5Eprint takes no arguments.

Usage: H5Eprint

H5Eset_auto turns error printing on or off via the /On and /Off keywords. It takes no other arguments and prints only to stderr.

Usage: H5Eset_auto, [/On | /Off]

H5Pset_fill_value: The second argument is the data type class, not the data type itself.

Usage: ret = H5Pset_fill_value(plist_id, [H5T_INTEGER | H5T_FLOAT], value)

The only accepted values are H5T_INTEGER and H5T_FLOAT. The appropriate 'NATIVE' type of each class is used (FLOAT <-> DOUBLE, SHORT <-> INT <-> LONG) according to your dataset.

H5Pget_buffer: Takes only a plist id and returns only the buffer size in bytes.

Usage: size = H5Pget_buffer(plist_id)

H5Pset_buffer: Takes only plist id and size arguments.

Usage: ret = H5Pset_buffer(plist_id, size_in_bytes)

H5Gget_objinfo: Returns the documented structure, except that the time element is represented by a PV-WAVE Date/Time variable.

Usage: ret = H5Gget_objinfo(loc_id, "name", [0 | 1], output_var)
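
As a hedged illustration of two of the exceptions above, the following sketch sets a fill value by data type class and switches error printing off around a call that may fail. The routine name and file name are hypothetical. The constants H5P_DATASET_CREATE, H5F_ACC_RDONLY, and H5P_DEFAULT are assumed to be provided by the common blocks under their HDF5 Reference Manual names, and a negative return value on failure is assumed to follow the C convention.

```pvwave
PRO exceptions_sketch            ; hypothetical routine name
@hdf5_common

; Fill value: pass the data type class (H5T_FLOAT), not H5T_NATIVE_FLOAT.
plist_id = H5Pcreate(H5P_DATASET_CREATE)
ret = H5Pset_fill_value(plist_id, H5T_FLOAT, -999.0)
ret = H5Pclose(plist_id)

; Error printing: switch it off while probing for a file, then restore it.
H5Eset_auto, /Off
file_id = H5Fopen("maybe_missing.h5", H5F_ACC_RDONLY, H5P_DEFAULT)
H5Eset_auto, /On

; Assumption: a negative identifier signals failure, as in the C API.
IF file_id LT 0 THEN BEGIN
   PRINT, 'Could not open maybe_missing.h5'
ENDIF ELSE BEGIN
   ret = H5Fclose(file_id)
ENDELSE

END
```

Note that H5Eset_auto is called with procedure syntax, while the other routines are called as functions.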

General Notes

Use H5T_NATIVE_* data types wherever possible; only standard data types for which HDF5 provides conversion support can be read. Custom data conversions are not supported. Modified floating-point formats will most likely result in a file that cannot be read by PV‑WAVE.

All PV‑WAVE HDF5 routines, with the exception of H5Eprint and H5Eset_auto, are PV‑WAVE functions and require the PV‑WAVE function-call syntax: return_value = FUNCTION(arg1, arg2, ...).

When reading data into PV-WAVE from an HDF5 file, PV-WAVE uses metadata in the file to determine the correct data type for the resulting PV‑WAVE variable. When writing out PV-WAVE variables to an HDF5 file, it is important to properly describe the PV-WAVE data to the HDF5 library. PV-WAVE and the HDF5 library essentially hand raw memory buffers back and forth. The definitions you provide to HDF5 via the HDF5 API determine how these buffers are interpreted. This means that simple mistakes, like telling HDF5 that a PV-WAVE integer variable is an H5T_NATIVE_INT (it is a SHORT) can result in data corruption or segmentation violations.

Comparison of PV-WAVE and HDF5 data types shows the HDF5 types to use:

Comparison of PV-WAVE and HDF5 data types

PV-WAVE Type    HDF5 Type
------------    -----------------
BYTE            H5T_NATIVE_CHAR
INT             H5T_NATIVE_SHORT
INT32           H5T_NATIVE_INT32
LONG            H5T_NATIVE_LONG
FLOAT           H5T_NATIVE_FLOAT
DOUBLE          H5T_NATIVE_DOUBLE
STRING          H5T_C_S1

Note:

PV-WAVE Structures: Structures are described by defining the individual elements using the primitive types in Comparison of PV-WAVE and HDF5 data types.

64-Bit Platforms: You can use the type correlations in Comparison of PV-WAVE and HDF5 data types for all platforms except 64-bit Windows. On 64-bit Windows you must describe PV-WAVE LONGs as the HDF5 type H5T_NATIVE_INT64. This description works on all 64-bit platforms but fails on 32-bit platforms. H5T_NATIVE_LONG can be used safely on all platforms if you do not need to support 64-bit Windows.
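
The type matching described above can be sketched as follows for a one-dimensional PV-WAVE INT array. This is a minimal sketch, not a definitive implementation: the routine, file, and dataset names are hypothetical, the calls follow the HDF5 1.6 signatures, H5F_ACC_TRUNC is assumed to be available from the common blocks, and passing the dimensions as a PV-WAVE LONG array is an assumption.

```pvwave
PRO write_int_sketch             ; hypothetical routine name
@hdf5_common

data = INDGEN(10)                ; PV-WAVE INT, so describe it as H5T_NATIVE_SHORT
dims = [10L]                     ; assumption: dimensions are passed as a LONG array

file_id  = H5Fcreate("ints.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT)
space_id = H5Screate_simple(1, dims, 0)      ; 0 takes the place of NULL in C
dset_id  = H5Dcreate(file_id, "ints", H5T_NATIVE_SHORT, space_id, H5P_DEFAULT)

; The memory type passed here must match the PV-WAVE variable exactly.
ret = H5Dwrite(dset_id, H5T_NATIVE_SHORT, H5S_ALL, H5S_ALL, H5P_DEFAULT, data)

ret = H5Dclose(dset_id)
ret = H5Sclose(space_id)
ret = H5Fclose(file_id)

END
```

Had data been created with LINDGEN(10) instead, the memory type would be H5T_NATIVE_LONG (or H5T_NATIVE_INT64 on 64-bit Windows).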

Remember that PV‑WAVE arrays are described in column-major order, while C describes arrays in row-major order. The underlying memory buffers are identical, since PV‑WAVE is written in C. Therefore, to write a 2x3x4 PV‑WAVE array to an HDF5 file, describe the array to HDF5 as 4x3x2. Conversely, a 3x5 array read from an HDF5 file appears in PV‑WAVE as a 5x3 array. This convention stems from PV‑WAVE's FORTRAN roots. See the example routines for specific instances.
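
The dimension reversal can be sketched as shown here, using SIZE to derive the HDF5 dimensions from the PV-WAVE array. The routine, file, and dataset names are hypothetical, H5F_ACC_TRUNC is assumed to be available from the common blocks, and passing the dimensions as a LONG array is an assumption.

```pvwave
PRO write_3d_sketch              ; hypothetical routine name
@hdf5_common

data  = FINDGEN(2, 3, 4)         ; PV-WAVE FLOAT array dimensioned 2x3x4
s     = SIZE(data)               ; s(0) = number of dimensions, then each dimension
ndims = s(0)

; Index the dimensions from last to first: [2, 3, 4] becomes [4, 3, 2].
h5_dims = s(ndims - LINDGEN(ndims))

file_id  = H5Fcreate("cube.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT)
space_id = H5Screate_simple(ndims, h5_dims, 0)
dset_id  = H5Dcreate(file_id, "cube", H5T_NATIVE_FLOAT, space_id, H5P_DEFAULT)
ret = H5Dwrite(dset_id, H5T_NATIVE_FLOAT, H5S_ALL, H5S_ALL, H5P_DEFAULT, data)

ret = H5Dclose(dset_id)
ret = H5Sclose(space_id)
ret = H5Fclose(file_id)

END
```

A C program reading cube.h5 would see a 4x3x2 dataset, and reading it back into PV-WAVE would again yield a 2x3x4 array.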

When declaring a structure using the H5Tinsert routine, you should declare the structure elements in the same order in which they appear in the structure itself. Sometimes, the HDF5 structure utilities do not accurately describe the structure's memory when elements are inserted out of memory order.
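
A minimal sketch of declaring a compound type in memory order follows. The structure name sample_t, its tags, and the routine name are hypothetical. Passing the structure variable as the first argument to HOFFSET is an assumption based on the usage line given earlier; check the example routines for the exact form.

```pvwave
PRO compound_sketch              ; hypothetical routine name
@hdf5_common

; Hypothetical structure: a DOUBLE followed by a FLOAT.
rec = {sample_t, t:0.0d, v:0.0}

; Use H5wave_struct_sizeof() where C code would use sizeof().
type_id = H5Tcreate(H5T_COMPOUND, H5wave_struct_sizeof(rec))

; Insert the members in the same order as the tags appear in the structure.
ret = H5Tinsert(type_id, "t", HOFFSET(rec, "t"), H5T_NATIVE_DOUBLE)
ret = H5Tinsert(type_id, "v", HOFFSET(rec, "v"), H5T_NATIVE_FLOAT)

ret = H5Tclose(type_id)

END
```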

When writing out strings to an HDF5 file, you should use the H5T_VARIABLE string type whenever possible. Examples that use this string type can be found in the files h5_variable_string.pro and h5_vlen_of_structs.pro in the <RW_DIR>/hdf5-1_8/examples directory.

Unsupported Routines/Data Types

The HDF5 reference API is not supported.

PV‑WAVE does not support any HDF5 routines that allow users to specify/register their own functions via function pointer arguments. These include conversion routines, filters, memory managers, iteration routines, and so on.

Variable-length strings are far easier to work with than the old method of turning them into fixed-length byte arrays. Fixed-length strings required a great deal of processing that variable-length strings bypass. For this reason, any string contained within a VLEN data type must be a variable-length string. Fixed-length strings are not supported within VLENs at any level.

Consecutively nested H5T_ARRAY types are not supported. You can still have an array of something that has arrays in it, just not arrays of arrays. In general, you should specify multi-dimensional datasets via the H5Screate_simple() routine instead of using the H5T_ARRAY data type, as they accomplish the same thing and provide greater flexibility.

Opaque data types are not supported.

Example PV‑WAVE Code

A number of example PV-WAVE procedures are in the directory <RW_DIR>/hdf5-1_8/examples. In many cases these are direct C to PV‑WAVE translations of the examples provided by HDF5. The comments note any special changes or issues that you should be aware of, many of which are not noted in other documentation. We strongly recommend that you examine the examples closely.

VLEN Data Type

The HDF5 VLEN data type is fully supported. VLENs are represented in PV‑WAVE with the named structure h5_wave_vlen. This structure is defined when you initialize the HDF5 OPI. You must use this named structure for any VLENs you wish to write. During read operations, these structures are created automatically.

For example, to define a simple, two-element VLEN variable in PV‑WAVE:

vlen_var = {h5_wave_vlen, len:2l, data:LISTARR(2)}

Note:

The len tag indicates the number of elements in the list array and must be a PV‑WAVE LONG value.

You would then assign the data for this variable:

vlen_var.data(0) = data0
vlen_var.data(1) = data1

This structure may also be embedded in other structures, to any degree:

struct_var = {struct_var_t, Tag1:23.0d, Tag2:{h5_wave_vlen, len:5l, data:LISTARR(5)}, Tag3:22l}

Some examples are provided with this release in the hdf5-1_8/examples directory.
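
The steps above can be wrapped in a small helper that builds a VLEN variable from any PV-WAVE vector. This is a sketch only: make_vlen is a hypothetical name, not part of the PV-WAVE HDF5 interface, and it assumes the HDF5 module has already been initialized.

```pvwave
FUNCTION make_vlen, values       ; hypothetical helper name

n = N_ELEMENTS(values)

; len must be a PV-WAVE LONG; data holds one list element per value.
vlen_var = {h5_wave_vlen, len:LONG(n), data:LISTARR(n)}
FOR i = 0L, n - 1 DO vlen_var.data(i) = values(i)

RETURN, vlen_var

END
```

For example, vlen_var = make_vlen([1.5, 2.5, 3.5]) produces a three-element VLEN variable.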

The procedure to read data structures containing VLENs is the same as that for non-VLEN data.

For example:

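FUNCTION readv

; Bring the HDF5 constants (H5F_ACC_RDONLY, H5P_DEFAULT, ...) into scope.
@hdf5_common
FILENAME = "testv.h5"
DSET = "dset"

; Open the file and dataset, then query the dataset for its data type.
file_id = H5Fopen(FILENAME, H5F_ACC_RDONLY, H5P_DEFAULT)
dset_id = H5Dopen(file_id, DSET)
dtype_id = H5Dget_type(dset_id)

; Read the whole dataset; Sname names the structures that are returned.
ret = H5Dread(dset_id, dtype_id, H5S_ALL, H5S_ALL, $
   H5P_DEFAULT, output, Sname=["h5base","struct2","struct3"])

; Release the identifiers in reverse order of creation.
ret = H5Tclose(dtype_id)
ret = H5Dclose(dset_id)
ret = H5Fclose(file_id)

RETURN, output

END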

With minor modifications, this simple PV-WAVE routine will read any dataset in an HDF5 file.

This routine is also included in the hdf5-1_8/examples directory.

Note:

The Sname keyword in the call to H5Dread() is discussed in the following sections.

Named Structures

All PV-WAVE structures returned by H5DREAD are named structures. Each name is based on the string(s) provided to H5DREAD via the Sname keyword. If the Sname keyword is not provided, the string "hdf5_struct_N" is used, where N is an incrementing integer. Note that if you wish to concatenate identical datasets from separate calls to H5DREAD into a single PV-WAVE array, you must provide structure names, because PV-WAVE treats differently named structures as distinct data types.

If your data contains multiple HDF5 COMPOUND elements, you can pass an array of structure names to H5DREAD via the Sname keyword. These names are assigned to the PV-WAVE structures in the order in which they appear in the dataset. You may pass in more names than there are structures in the data. If you pass in fewer names, they are used until they run out, and then the first name in the name array is reused as a base name with _N appended to it, where N is an incrementing integer.

For multi-dimensional datasets, you need only provide PV-WAVE structure names for a single data point. The PV-WAVE structure names are identical for all subsequent points in your data.

Within each data point, no comparison to existing structures in that data point is made. This means that while one structure may have a series of identical structures nested within it, each PV-WAVE structure is created with a different structure name. The exception is structures which are elements of the same VLEN. Structures contained within a single VLEN are all assigned the same PV-WAVE structure name during a read operation, since each element of any particular VLEN must be of the same type.

For example, if the file contains this data type:

structure1
   tag - VLEN (2 items)
          structure2
          structure2
   tag - numeric
   tag - structure3
   tag - structure3

An Sname array of ['name1','name2','name3','name4'] would produce:

structure1 - name1
   tag - VLEN (2 items)
          structure2 - name2
          structure2 - name2
   tag - numeric
   tag - structure3 - name3
   tag - structure3 - name4

The name assignments above are used for each point in the dataset.

If you allow PV-WAVE to name the structures for you, repeated calls to H5DREAD can lead to a proliferation of PV-WAVE structure definitions in your PV-WAVE session. This will eventually exhaust your system's resources. See the DELSTRUCT command for information about removing unused structure definitions from your PV-WAVE session.

If you use the same Sname string array for multiple calls to H5DREAD in a single PV-WAVE session, it is your responsibility to ensure that all structures using the same name are identical to those already defined in your PV-WAVE session. That is, the structural hierarchy of datasets for which you are using the same PV-WAVE structure names must be identical.
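
As a hedged sketch of reusing structure names, the routines below read the same dataset from two files with one Sname array so that the results can be concatenated. The routine names, file names, dataset name, and structure name rec_t are hypothetical. The sketch assumes both datasets are one-dimensional and have an identical compound layout.

```pvwave
FUNCTION read_named, filename, dset, names   ; hypothetical helper name
@hdf5_common

file_id  = H5Fopen(filename, H5F_ACC_RDONLY, H5P_DEFAULT)
dset_id  = H5Dopen(file_id, dset)
dtype_id = H5Dget_type(dset_id)

; The same names on every call yield the same PV-WAVE structure type.
ret = H5Dread(dset_id, dtype_id, H5S_ALL, H5S_ALL, $
   H5P_DEFAULT, output, Sname=names)

ret = H5Tclose(dtype_id)
ret = H5Dclose(dset_id)
ret = H5Fclose(file_id)

RETURN, output
END

PRO concat_sketch                ; hypothetical routine name

names = ["rec_t"]                ; hypothetical structure name, reused for both reads
run1 = read_named("run1.h5", "dset", names)
run2 = read_named("run2.h5", "dset", names)

; Concatenation works only because both arrays share the rec_t structure type.
both = [run1, run2]
HELP, both

END
```

Without the Sname keyword, the two reads would return differently named structures, and the concatenation would fail.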

Supported HDF5 Routines

Aside from the exceptions noted above, usage for each routine matches that of the HDF5 API as documented in the HDF5 manuals.

Attribute Interface

H5ACLOSE
H5ACREATE
H5ADELETE
H5AGET_NAME
H5AGET_NUM_ATTRS
H5AGET_SPACE
H5AGET_TYPE
H5AOPEN_IDX
H5AOPEN_NAME
H5AREAD
H5AWRITE

Dataset Interface

H5DCLOSE
H5DCREATE
H5DEXTEND
H5DGET_CREATE_PLIST
H5DGET_SPACE
H5DGET_STORAGE_SIZE
H5DGET_TYPE
H5DOPEN
H5DREAD
H5DWRITE

Error Interface

H5EPRINT
H5ESET_AUTO

File Interface

H5FCLOSE
H5FCREATE
H5FFLUSH
H5FGET_ACCESS_PLIST
H5FGET_CREATE_PLIST
H5FIS_HDF5
H5FMOUNT
H5FOPEN
H5FREOPEN
H5FUNMOUNT

Group Interface

H5GCLOSE
H5GCREATE
H5GGET_COMMENT
H5GGET_LINKVAL
H5GGET_NUM_OBJS
H5GGET_OBJINFO
H5GGET_OBJNAME_BY_IDX
H5GGET_OBJTYPE_BY_IDX
H5GLINK
H5GMOVE
H5GOPEN
H5GSET_COMMENT
H5GUNLINK

Identifier Interface

H5IGET_NAME
H5IGET_TYPE

Property List Interface

H5PCLOSE
H5PCOPY
H5PCREATE
H5PEQUAL
H5PGET_ALIGNMENT
H5PGET_BTREE_RATIOS
H5PGET_CACHE
H5PGET_CHUNK
H5PGET_CLASS
H5PGET_DRIVER
H5PGET_EXTERNAL
H5PGET_EXTERNAL_COUNT
H5PGET_FAPL_CORE
H5PGET_FAPL_FAMILY
H5PGET_FILL_VALUE
H5PGET_ISTORE_K
H5PGET_LAYOUT
H5PGET_META_BLOCK_SIZE
H5PGET_PRESERVE
H5PGET_SIEVE_BUF_SIZE
H5PGET_SIZES
H5PGET_SMALL_DATA_BLOCK_SIZE
H5PGET_SYM_K
H5PGET_USERBLOCK
H5PGET_VERSION
H5PSET_ALIGNMENT
H5PSET_BTREE_RATIOS
H5PSET_CACHE
H5PSET_CHUNK
H5PSET_DEFLATE
H5PSET_EXTERNAL
H5PSET_FAPL_CORE
H5PSET_FAPL_FAMILY
H5PSET_FAPL_SEC2
H5PSET_FAPL_SPLIT
H5PSET_FAPL_STDIO
H5PSET_FILL_VALUE
H5PSET_ISTORE_K
H5PSET_LAYOUT
H5PSET_META_BLOCK_SIZE
H5PSET_SIEVE_BUF_SIZE
H5PSET_SIZES
H5PSET_SMALL_DATA_BLOCK_SIZE
H5PSET_SYM_K
H5PSET_USERBLOCK

Dataspace Interface

H5SCLOSE
H5SCOPY
H5SCREATE
H5SCREATE_SIMPLE
H5SEXTENT_COPY
H5SGET_SELECT_BOUNDS
H5SGET_SELECT_ELEM_NPOINTS
H5SGET_SELECT_ELEM_POINTLIST
H5SGET_SELECT_HYPER_BLOCKLIST
H5SGET_SELECT_HYPER_NBLOCKS
H5SGET_SELECT_NPOINTS
H5SGET_SIMPLE_EXTENT_DIMS
H5SGET_SIMPLE_EXTENT_NDIMS
H5SGET_SIMPLE_EXTENT_NPOINTS
H5SGET_SIMPLE_EXTENT_TYPE
H5SIS_SIMPLE
H5SOFFSET_SIMPLE
H5SSELECT_ALL
H5SSELECT_ELEMENTS
H5SSELECT_HYPERSLAB
H5SSELECT_NONE
H5SSELECT_VALID
H5SSET_EXTENT_NONE
H5SSET_EXTENT_SIMPLE

Data Type Interface

H5TARRAY_CREATE
H5TCLOSE
H5TCOMMIT
H5TCOMMITTED
H5TCOPY
H5TCREATE
H5TENUM_CREATE
H5TENUM_INSERT
H5TENUM_NAMEOF
H5TENUM_VALUEOF
H5TEQUAL
H5TGET_ARRAY_DIMS
H5TGET_ARRAY_NDIMS
H5TGET_CLASS
H5TGET_CSET
H5TGET_EBIAS
H5TGET_FIELDS
H5TGET_INPAD
H5TGET_MEMBER_CLASS
H5TGET_MEMBER_INDEX
H5TGET_MEMBER_NAME
H5TGET_MEMBER_OFFSET
H5TGET_MEMBER_TYPE
H5TGET_MEMBER_VALUE
H5TGET_NMEMBERS
H5TGET_NORM
H5TGET_OFFSET
H5TGET_ORDER
H5TGET_PAD
H5TGET_PRECISION
H5TGET_SIGN
H5TGET_SIZE
H5TGET_STRPAD
H5TGET_SUPER
H5TINSERT
H5TIS_VARIABLE_STR
H5TLOCK
H5TOPEN
H5TSET_CSET
H5TSET_EBIAS
H5TSET_FIELDS
H5TSET_INPAD
H5TSET_NORM
H5TSET_OFFSET
H5TSET_ORDER
H5TSET_PAD
H5TSET_PRECISION
H5TSET_SIGN
H5TSET_SIZE
H5TSET_STRPAD
H5TVLEN_CREATE

General Functions

H5CHECK_VERSION
H5CLOSE
H5GET_LIBVERSION