UNSUPERVISED_NOMINAL_FILTER Function
Converts nominal data into a series of binary encoded columns for input to a neural network. Optionally, it can also reverse the binary encoding, accepting a series of binary encoded columns and returning a single column of nominal classes.
Usage
result = UNSUPERVISED_NOMINAL_FILTER(x)
Input Parameters
x—A one- or two-dimensional array, depending upon whether encoding or decoding is requested. If encoding is requested, x is an array of length n_obs, where n_obs is the number of observations, containing the categories for a nominal variable numbered from 1 to N_classes, where N_classes is the number of classes in x. If decoding is requested, x is an array of size n_obs by N_classes whose columns contain only zeros and ones. These columns are interpreted as the binary encoded representation of a single nominal variable.
Returned Value
result—An array, z, containing the encoded or decoded values of x, depending upon whether encoding or decoding is requested. When encoding, z is a two-dimensional array of size n_obs by N_classes. When decoding, z is a one-dimensional array of length n_obs. In both cases the dimensionality of the result is the reverse of that of the input x.
Input Keywords
Decode—If set and nonzero, the binary encoding of x is reversed and the result is an array of nominal values. The values in each column of x should be zeros and ones. The values in the ith column of x are associated with the ith class of the nominal variable.
Output Keywords
N_classes—If encoding, the number of classes in x is returned via this keyword. It is ignored when decoding.
Discussion
UNSUPERVISED_NOMINAL_FILTER is designed to either encode or decode nominal variables using a simple binary mapping.
Binary Encoding
By default, x is an input array to which a binary filter is applied. Binary encoding creates one column in the output matrix z for each category in x, and each column contains only zeros and ones. A value of one in row i of a column indicates that observation i belongs to that category, and a value of zero indicates that it does not.
For example, if x = {2, 1, 3, 4, 2, 4} then n_classes = 4, and:
z =
           0           1           0           0
           1           0           0           0
           0           0           1           0
           0           0           0           1
           0           1           0           0
           0           0           0           1
Notice that the number of columns in z is equal to the number of distinct classes in x. The number of rows in z is equal to the length of x.
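The mapping described above can be sketched in a few lines of Python. This is only an illustration of the encoding rule, not the IMSL routine itself: the helper name binary_encode is hypothetical, and the sketch assumes every class from 1 to n_classes occurs in x, so that the largest label equals the number of classes.

```python
def binary_encode(x):
    """Hypothetical helper: binary (one-hot) encode labels numbered 1..n_classes."""
    if any(label < 1 for label in x):
        raise ValueError("class labels must be numbered from 1")
    # Assumption: every class 1..n_classes occurs in x, so the largest
    # label equals the number of classes.
    n_classes = max(x)
    # Row i has a one in column x[i] (counting columns from 1), zeros elsewhere.
    z = [[1 if label == k else 0 for k in range(1, n_classes + 1)]
         for label in x]
    return z, n_classes


z, n_classes = binary_encode([2, 1, 3, 4, 2, 4])
print("n_classes =", n_classes)
for row in z:
    print(row)
```

For the six-element input used here, the sketch produces a 6 by 4 matrix whose first row is [0, 1, 0, 0], since the first observation belongs to class 2.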
Binary Decoding
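If the Decode keyword is set, binary decoding takes the binary encoded columns of x and returns the class associated with each row in z. For example, if x is the encoded matrix for {2, 1, 3, 4, 2, 4} from Binary Encoding:
x =
           0           1           0           0
           1           0           0           0
           0           0           1           0
           0           0           0           1
           0           1           0           0
           0           0           0           1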
then z would be returned as z = {2, 1, 3, 4, 2, 4}. Notice this is the same as the original array because classes are numbered sequentially from 1 to n_classes. This ensures that the ith column of x is associated with the ith class in the output array.
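The reverse mapping can be sketched the same way in Python. Again this is an illustration rather than the IMSL routine: the helper name binary_decode is hypothetical, and rejecting rows that do not contain exactly one 1 is an assumption of the sketch, since this page does not say how such rows are handled.

```python
def binary_decode(z):
    """Hypothetical helper: map each binary encoded row back to its class."""
    labels = []
    for row in z:
        # Assumption of this sketch: a valid row holds exactly one 1.
        if any(v not in (0, 1) for v in row) or sum(row) != 1:
            raise ValueError("each row must contain exactly one 1")
        # The ith column (counting from 1) corresponds to class i.
        labels.append(row.index(1) + 1)
    return labels


encoded = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
]
labels = binary_decode(encoded)
print(labels)  # [2, 1, 3, 4, 2, 4]
```

The position of the single 1 in each row gives the class number directly, which is why the decoded labels match the array that was originally encoded.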
Example
This example encodes a nominal variable with three classes and then decodes the result to recover the original data. When the Decode keyword is not set, the input X is a 1-D array and the return value is a 2-D array of size n_obs by n_classes. When the Decode keyword is set, the input is a 2-D array and the return value is a 1-D array of length n_obs.
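; n_classes is overwritten on output by the N_classes keyword
n_classes = 1
x = [3, 3, 1, 2, 2, 1, 2]
; Encode: z is a 7 by 3 array of zeros and ones
z = UNSUPERVISED_NOMINAL_FILTER(x, N_classes=n_classes)
; Decode: x2 should reproduce x
x2 = UNSUPERVISED_NOMINAL_FILTER(z, /Decode)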
; print outputs
PRINT, "n_classes = ", n_classes, Format = '(A19, I5)'
PRINT, ""
PM, LONG(x), Title = "X"
PRINT, ""
PM, z, Title = "Z"
PRINT, ""
PM, x2, Title = "Unfiltering Result"
Output
       n_classes =     3
 
X
           3
           3
           1
           2
           2
           1
           2
 
Z
           0           0           1
           0           0           1
           1           0           0
           0           1           0
           0           1           0
           1           0           0
           0           1           0
 
Unfiltering Result
           3
           3
           1
           2
           2
           1
           2