Image Processing: A Brief Overview
Scientists and engineers use many techniques to alter or process digital images, including point operations, filtering, and image transforms. The fields of computer vision and pattern recognition often overlap with digital image processing (IP), relying on techniques such as segmentation, classification, and difference analysis. All of these operations are built upon a basic set of IP routines, which are discussed in this chapter.
Point Operations
Point operations are a method of image processing in which each pixel in the output image depends only on the corresponding pixel in the input image. In general, point operations are mathematical and/or logical operations performed on a single image, or between two images of equal size, on a point-by-point basis.
Algebraic and Logical Operations
Algebraic, or mathematical, operations used in image processing include addition, subtraction, multiplication, and, sometimes, the ratio of two images. Logical operations such as AND, OR, and exclusive-OR are also used to process images. These mathematical and logical operations are performed between two images of equal size, or between an image and a scalar value.
Differences and similarities between two images can be enhanced and examined using algebraic operations. For instance, multiplying every pixel value in an image by two can increase the overall contrast; image subtraction, on the other hand, is a way to reveal image differences.
Algebraic and logical operations such as these can be performed using the IPMATH Function on page 369 in the Image Processing Toolkit.
Algebraic operations are also used for dynamic range scaling or shifting; however, you must keep the pixel-value range in mind, because mathematical image operations can produce values outside the displayable range. For example, when one image is subtracted from another, negative pixel values may result. Since a negative value’s display color is typically undefined, the negative values in the result must be either clipped or scaled. Multiplying two images together can likewise produce values for which no corresponding color exists.
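The subtraction case can be sketched as follows. This is an illustrative example written in Python with NumPy rather than in PV‑WAVE (within the toolkit, the IPMATH function performs such operations); random arrays stand in for real image data:

import numpy as np

# Two same-size 8-bit images; random data stands in for real images
rng = np.random.default_rng(0)
img_a = rng.integers(0, 256, size=(256, 256), dtype=np.uint8)
img_b = rng.integers(0, 256, size=(256, 256), dtype=np.uint8)

# Subtract in a signed type so negative differences are not silently wrapped
diff = img_a.astype(np.int16) - img_b.astype(np.int16)

# Option 1: clip negative values to zero
clipped = np.clip(diff, 0, 255).astype(np.uint8)

# Option 2: scale the full signed range back into 0..255
# (assumes the difference image is not constant)
span = diff.max() - diff.min()
scaled = ((diff - diff.min()) * 255.0 / span).astype(np.uint8)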
Thresholding
Thresholding is another point operation. To perform binary thresholding, you first select a range of pixel values and a logical operator to form a conditional equation. Any pixels that evaluate to “true” under the conditional are set to white (or black), and all other pixels are set to black (or white).
Multilevel thresholding is similar to binary thresholding except that many conditional equations and output levels are defined for a single input image. For example, you may wish to set all pixels between 10 and 20 to black, all pixels between 50 and 100 to medium gray, all pixels between 120 and 220 to white, and leave all other pixels unchanged.
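The multilevel example above can be written directly with array masks. This is an illustrative Python/NumPy fragment, not toolkit code; the toolkit’s own thresholding routines are listed below:

import numpy as np

rng = np.random.default_rng(1)
img = rng.integers(0, 256, size=(128, 128), dtype=np.uint8)

out = img.copy()
out[(img >= 10) & (img <= 20)] = 0      # black
out[(img >= 50) & (img <= 100)] = 128   # medium gray
out[(img >= 120) & (img <= 220)] = 255  # white
# all remaining pixels keep their original values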
The routines provided in the Image Processing Toolkit to perform thresholding operations are the DENSITY_SLICE Function on page 290, THRESH_ADAP Function on page 451, and THRESHOLD Function on page 454.
Histogram Operations
Another point operation used frequently in image processing is the histogram operation. A histogram is a plot of the number of pixels at each pixel value in an image. It is possible to improve the overall appearance of an image by modifying its histogram. Two commonly used techniques for doing so are histogram equalization and histogram stretching; both modify the overall contrast and brightness of an image.
Two functions perform these operations in the Image Processing Toolkit: the HIST_STATS Function on page 335 and the IPHISTOGRAM Function on page 366, in addition to the HIST_EQUAL function found in PV‑WAVE.
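Both techniques reduce to simple graylevel mappings. The following Python/NumPy sketch is illustrative only (it is not the HIST_EQUAL or IPHISTOGRAM implementation): a linear histogram stretch, and an equalization driven by the cumulative histogram:

import numpy as np

def hist_stretch(img, lo=0, hi=255):
    # Linearly map the image's min..max range onto lo..hi
    # (assumes a non-constant image)
    imin, imax = int(img.min()), int(img.max())
    return ((img.astype(float) - imin) * (hi - lo) / (imax - imin) + lo).astype(np.uint8)

def hist_equalize(img):
    # Map each graylevel through the normalized cumulative histogram
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    lut = np.round(255.0 * cdf / cdf[-1]).astype(np.uint8)
    return lut[img]

img = np.random.default_rng(1).integers(0, 128, size=(64, 64)).astype(np.uint8)
stretched = hist_stretch(img)   # uses the full 0..255 range
equalized = hist_equalize(img)  # flattens the histogram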
Filtering
Besides point operations, filtering is another commonly used image processing operation. Filtering is used either to remove unwanted information from an image or to enhance the information already present. Filters fall into several categories, such as linear, nonlinear, and adaptive filters.
The Image Processing Toolkit routines that perform filtering operations are listed in Table 1-1: Filtering Routines along with the filter category to which each belongs. For your convenience, some associated PV‑WAVE filter routines are also listed.
 
Table 1-1: Filtering Routines

Filter Category                    Image Processing Toolkit Routines   PV‑WAVE Routines
Linear Filter - Spatial Domain                                         ROBERTS, SOBEL
Linear Filter - Frequency Domain
Nonlinear Filter                   FILT_NONLIN
Adaptive Filter                    FILT_DWMTM, FILT_MMSE
Linear Filters
Linear filters are defined by a filter kernel, which is itself a small image. Filter kernels, also called windows, are usually 3-by-3, 5-by-5, or 7-by-7 pixels and contain the values that mathematically define the characteristics of the linear transform. Two broad categories of linear filtering operators for digital IP are spatial domain and frequency domain operators. The spatial domain refers to the original image plane itself, whereas the frequency domain representation is obtained by taking a fast Fourier transform (FFT) of the image.
The linear filters category can be divided into four subcategories: lowpass, highpass, bandpass, and bandstop filters. Lowpass filters smooth or blur images, while highpass filters enhance image edges by eliminating low-frequency components. Bandstop filters are used to remove periodic noise, and bandpass filters are useful in image enhancement.
Spatial Domain
Linear filtering in the spatial domain is performed by convolving the image with a filter kernel. Convolution involves passing the filter kernel over the entire input image: each output pixel is computed from the input pixels covered by the kernel and is assigned at the location of the kernel’s center pixel. Output values at the edges of the image are ambiguous where part of the kernel hangs off the image edge. Typical methods for handling these undefined output pixels are to copy the edge pixels of the input image directly to the output image, or to extend the boundary of the input image by the size of the filter kernel before the convolution is performed.
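A convolution of this kind can be sketched in a few lines. The example below is illustrative Python using SciPy rather than a toolkit routine; the mode='nearest' option replicates edge pixels, one of the edge-handling strategies described above:

import numpy as np
from scipy import ndimage

rng = np.random.default_rng(2)
img = rng.random((256, 256))

# 3-by-3 lowpass (smoothing) kernel; the coefficients sum to 1
kernel = np.full((3, 3), 1.0 / 9.0)

# Convolve; 'nearest' extends the image boundary so every output pixel is defined
smoothed = ndimage.convolve(img, kernel, mode='nearest')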
The PV‑WAVE Image Processing Toolkit comes with many spatial filter kernels which are located in the following directory:
(UNIX) ip-1_0/data/kernel/*.ker
(WIN) ip-1_0\data\kernel\*.ker
Typical spatial filters include the Gaussian filter, gradient masks, highpass spatial filters, Laplacian, lowpass spatial filters, and the Roberts and Sobel filters. See the table for the list of Image Processing Toolkit functions and associated PV‑WAVE routines for linear filtering in the spatial domain.
Frequency Domain
Spatial frequency filters are most often used for image restoration and enhancement. Restoration algorithms remove degradation or noise that has corrupted the image; the Wiener filter and circularly symmetric filters are good examples of this filtering technique.
Filtering in the frequency domain is performed by multiplying the frequency-domain image with a frequency-domain filter. The product of the image and the filter is then transformed back into the spatial domain with an inverse FFT. Because it avoids direct convolution, this frequency-domain technique is often used for filters with large kernels.
See the table for the list of Image Processing Toolkit functions and associated PV‑WAVE routines for linear filtering in the frequency domain.
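The multiply-and-invert procedure can be sketched as follows with an ideal circularly symmetric lowpass filter. This is an illustrative Python/NumPy example, not toolkit code, and the cutoff radius of 30 is an arbitrary choice:

import numpy as np

rng = np.random.default_rng(3)
img = rng.random((256, 256))

# Forward FFT, with the zero-frequency component shifted to the center
F = np.fft.fftshift(np.fft.fft2(img))

# Ideal circularly symmetric lowpass filter with cutoff radius 30
rows, cols = img.shape
y, x = np.ogrid[:rows, :cols]
radius = np.hypot(y - rows / 2, x - cols / 2)
H = (radius <= 30).astype(float)

# Multiply in the frequency domain, then invert the transform
filtered = np.real(np.fft.ifft2(np.fft.ifftshift(F * H)))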
Nonlinear Filters
Nonlinear filters are used to remove so-called salt-and-pepper noise and Gaussian noise, as well as to detect edges in an image.
Nonlinear filters operate by passing a small window over an image and computing each output pixel from a given nonlinear function of the input pixels under that window. Typical window sizes are 3-by-3, 5-by-5, or 7-by-7 pixels.
The FILT_NONLIN Function in the Image Processing Toolkit offers the nine nonlinear filters shown in Table 1-2: Nonlinear Filters, selected via function keywords:
 
Table 1-2: Nonlinear Filters (FILT_NONLIN Function)

Nonlinear Filter       Function Keyword   Typical Image Processing Usage
Alpha-Trimmed Mean     Atmeanf            Removing salt-and-pepper and Gaussian noise
Contra-Harmonic Mean   Chmeanf            Removing Gaussian noise while preserving edge features
Geometric Mean         Gmeanf             Removing Gaussian noise
Maximum                Maxf               Removing outlying low or negative values
Minimum                Minf               Removing outlying high values
Mode                   Modef              Removing noise
Range                  Rangef             Edge detection
Rank                   Rankf              Removing salt-and-pepper noise
Yp Mean                Ypmeanf            Removing Gaussian noise while preserving edge features
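Several of these filters are easy to sketch with standard rank-order operations. The following illustrative Python/SciPy fragment (not the FILT_NONLIN implementation) shows median, rank, and range filtering over a 3-by-3 window:

import numpy as np
from scipy import ndimage

rng = np.random.default_rng(4)
img = rng.integers(0, 256, size=(128, 128), dtype=np.uint8)

# Median filter (the middle rank): classic salt-and-pepper noise removal
median = ndimage.median_filter(img, size=3)

# General rank filter: rank=0 is the window minimum, rank=8 the maximum
rank2 = ndimage.rank_filter(img, rank=2, size=3)

# Range filter for edge detection: local maximum minus local minimum
range_img = (ndimage.maximum_filter(img, size=3).astype(np.int16)
             - ndimage.minimum_filter(img, size=3))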
The characteristics of the image information can influence the relative success of nonlinear filters that are not adaptive. Because adaptive filters alter their filtering characteristics based on local image content, they often perform better than their non-adaptive counterparts.
Adaptive Filters
Adaptive filters are particularly useful for noise reduction. By contrast, non-adaptive filters require a priori knowledge of the image noise characteristics to achieve comparable results.
The FILT_DWMTM Function on page 298 and FILT_MMSE Function on page 305 provide the adaptive filtering operations in the PV‑WAVE Image Processing Toolkit.
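One common local-statistics formulation of a minimum mean-square error (MMSE) filter is sketched below. This is an illustrative Python/SciPy version of the general technique, not the FILT_MMSE implementation, and it assumes the noise variance is supplied by the caller:

import numpy as np
from scipy import ndimage

def mmse_filter(img, noise_var, size=5):
    # Local mean and variance over a size-by-size window
    img = img.astype(float)
    mean = ndimage.uniform_filter(img, size)
    sq_mean = ndimage.uniform_filter(img * img, size)
    var = sq_mean - mean * mean
    # Gain near 0 in flat regions (strong smoothing), near 1 at edges (preservation)
    gain = np.clip((var - noise_var) / np.maximum(var, 1e-12), 0.0, 1.0)
    return mean + gain * (img - mean)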
Morphological Image Processing
Image preprocessing for pattern recognition and image analysis applications often involves morphological operations. Morphological image processing routines alter or otherwise act upon shapes within an image. Erosion and dilation are the two fundamental morphological operators; they erode or dilate, respectively, objects in an image.
The morphological opening operation is erosion followed by dilation, whereas the morphological closing operation is the opposite: dilation followed by erosion. Opening is a useful processing tool for smoothing contours, eliminating narrow extensions, and breaking thin links. Closing, on the other hand, is used to smooth contours, link narrow regions, and fill small gaps or holes.
In addition to the opening and closing morphological operations, the hit-or-miss transform is another morphological operation used primarily for shape definition. This transform is particularly useful in pattern recognition.
The PV‑WAVE Image Processing Toolkit includes seven morphological routines to perform morphological image processing, which complement the ERODE and DILATE routines already provided in PV‑WAVE. The seven Image Processing Toolkit routines are the DIST_MAP Function on page 292, HIT_MISS Function on page 337, MORPH_CLOSE Function on page 402, MORPH_OPEN Function on page 403, MORPH_OUTLINE Function on page 405, SKELETONIZE Function on page 446, and TOP_HAT Function on page 456.
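The opening and closing compositions can be sketched directly. The following illustrative Python/SciPy example (not toolkit code) builds opening explicitly from erosion and dilation, and shows the equivalent library call:

import numpy as np
from scipy import ndimage

# Binary image: one solid object plus an isolated noise pixel
img = np.zeros((64, 64), dtype=bool)
img[20:40, 20:40] = True
img[5, 5] = True

structure = np.ones((3, 3), dtype=bool)

# Opening = erosion followed by dilation; it removes the isolated speckle
opened = ndimage.binary_dilation(
    ndimage.binary_erosion(img, structure), structure)
assert (opened == ndimage.binary_opening(img, structure)).all()

# Closing = dilation followed by erosion; it fills small gaps and holes
closed = ndimage.binary_closing(img, structure)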
Mensuration
For digital images, mensuration refers to the quantification, or measurement, of object features within an image. Mensuration is useful in classification and object recognition, and is an image processing technique widely employed in the field of medicine for tumor identification and surgical planning. Common measures include the area, average graylevel, standard deviation, centroid, circularity, and perimeter of a single object.
Another measure is the principal axes of a region. An object’s principal axes are computed from the eigenvectors of the covariance matrix of its pixel coordinates; this is known as the Hotelling transform. Because the principal-axis representation of an object is insensitive to rotation, it is often used for target recognition and tracking.
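A sketch of this computation: gather the coordinates of an object’s pixels, form their covariance matrix, and take its eigenvectors. Illustrative Python/NumPy only, with a synthetic slanted bar standing in for a segmented object:

import numpy as np

# Synthetic object: a slanted bar in a binary image
img = np.zeros((100, 100), dtype=bool)
for i in range(60):
    img[20 + i, 20 + i // 2 : 26 + i // 2] = True

# Coordinates of the object's pixels, as (x, y) pairs
ys, xs = np.nonzero(img)
pts = np.column_stack((xs, ys)).astype(float)

# Centroid, then covariance of the centered coordinates
centroid = pts.mean(axis=0)
cov = np.cov((pts - centroid).T)

# Eigenvectors of the covariance matrix are the principal axes
eigvals, eigvecs = np.linalg.eigh(cov)
major_axis = eigvecs[:, np.argmax(eigvals)]  # direction of greatest extent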
Representation and Description
Image representation and description operations are used to preprocess images for pattern recognition and classification.
Texture
Texture analysis can be extremely important in describing regions in an image. Quantitative texture descriptors such as smoothness, coarseness, and regularity are often used in image classification and pattern recognition. Texture values are typically based on regional statistical properties, such as the moments of the regional histogram. Routines providing these types of statistical texture measurements are the GLCM Function on page 326, GLCM_STATS Function on page 327, GLRL Function on page 329, GLRL_STATS Function on page 330, and the HIST_STATS Function on page 335.
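The gray-level co-occurrence idea behind these measures can be sketched compactly. The fragment below is an illustrative Python/NumPy version (not the GLCM function itself): it counts graylevel pairs at a one-pixel horizontal offset and derives two common texture statistics:

import numpy as np

rng = np.random.default_rng(5)
img = rng.integers(0, 8, size=(64, 64))  # image quantized to 8 graylevels

levels = 8
glcm = np.zeros((levels, levels))

# Count co-occurrences of graylevel pairs one pixel apart horizontally
np.add.at(glcm, (img[:, :-1].ravel(), img[:, 1:].ravel()), 1)
glcm /= glcm.sum()  # normalize to joint probabilities

# Example texture statistics derived from the GLCM
ii, jj = np.indices((levels, levels))
contrast = np.sum((ii - jj) ** 2 * glcm)
energy = np.sum(glcm ** 2)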
Spectral analysis can also be used to describe texture. On a textural scale ranging from rough to smooth, for instance, a region with high spatial frequencies may be defined as rough, while a low spatial frequency area would be smooth. The POLAR_FFT Function on page 425 is used to implement spectral texture analysis in the Image Processing Toolkit.
Correlation
The correlation between two images or between an image and a template is often used in template or prototype matching for pattern recognition. Correlation can also be used to facilitate automatic registration between images. The IPCORRELATE Function on page 359 performs correlation between an image and a template.
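Template matching by correlation can be sketched as follows. This is an illustrative Python/SciPy example, not the IPCORRELATE implementation; a patch cut from the image serves as the template, and the correlation peak marks where it matches:

import numpy as np
from scipy import signal

rng = np.random.default_rng(6)
img = rng.random((128, 128))
template = img[40:56, 60:76].copy()  # 16-by-16 patch cut from the image

# Zero-mean cross-correlation; the peak marks the best match
t = template - template.mean()
corr = signal.correlate2d(img - img.mean(), t, mode='valid')
peak = np.unravel_index(np.argmax(corr), corr.shape)
# peak should fall at (40, 60), the template's true upper-left corner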
Image Transforms
There are numerous transforms that can be applied to any image. The two most common transforms are the fast Fourier transform (FFT) and its inverse. The FFT converts an image from the spatial domain to the spatial frequency domain. Many other transforms exist, however, and are useful for various applications.
The discrete cosine transform (DCT) is used in image compression. The Hough transform is useful in contour linking and identification of geometric shapes and lines in an image. It maps data from a Cartesian coordinate space into a polar parameter space. The Slant transform uses sawtooth waveforms as a basis set and reveals connectivity.
The principal components transform (PCT), also referred to as the Hotelling transform or the Karhunen-Loève transform, is useful for image compression and de-correlation and is widely used in remote sensing applications. The PCT is applied to the covariance matrix of the different spectral bands of a remote sensing image, and its output is a set of component images with minimal correlation between them. The first, maximum-variance component image combines most of the information present in the total spectral bands of the original image.
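The band-decorrelation step can be sketched with an eigen-decomposition. This is illustrative Python/NumPy only (not the PCT or IPCT implementation), using four synthetic, deliberately correlated bands:

import numpy as np

rng = np.random.default_rng(7)
# Four correlated spectral bands, each 64-by-64
base = rng.random((64, 64))
bands = np.stack([base + 0.1 * rng.random((64, 64)) for _ in range(4)])

# Treat each pixel as a 4-element spectral vector
X = bands.reshape(4, -1)
X = X - X.mean(axis=1, keepdims=True)

# Eigen-decomposition of the band-to-band covariance matrix
eigvals, eigvecs = np.linalg.eigh(np.cov(X))
order = np.argsort(eigvals)[::-1]  # components by decreasing variance

# Project; the first component image carries most of the variance
pct = (eigvecs[:, order].T @ X).reshape(4, 64, 64)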
Image transforms provided in the Image Processing Toolkit include the DCT Function on page 288, HAAR Function on page 333, HOUGH Function on page 339, IPCT Function on page 364, IPWAVELET Function on page 386, PCT Function on page 422, RADON Function on page 429, and SLANT Function on page 449; in addition to the FFT function in PV‑WAVE, and the FFTCOMP function in PV‑WAVE IMSL Mathematics.
Geometric Transforms
Geometric transforms such as image rotation, scaling, and warping are also important in many applications. In particular, multi-modal data can be registered to a common coordinate system through the use of geometric transforms. Geometric transforms modify the spatial relationships between image pixels; they are sometimes referred to as “rubber-sheet transformations” because of their likeness to stretching an image that has been transferred onto a sheet of rubber.
When a geometric transform is applied to an image, pixels in the input image don’t always map directly to a position in the output image. For this reason, some form of graylevel interpolation is necessary to determine the value of each pixel in the output image. Several methods of interpolation exist; among them are the bilinear and nearest neighbor methods.
In the nearest neighbor method, each value in the output image is taken to be the value of the closest matching pixel in the input image. The choice of interpolation method depends on the application for which the image is being processed. For example, nearest neighbor interpolation is often used if the image is to be classified, because it introduces no new graylevels.
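The difference between the two schemes is easy to demonstrate. This is illustrative Python/SciPy, not the toolkit’s IPSCALE: a 2x zoom with order=0 gives nearest neighbor interpolation, and order=1 gives bilinear:

import numpy as np
from scipy import ndimage

rng = np.random.default_rng(8)
img = rng.integers(0, 256, size=(64, 64)).astype(float)

nearest = ndimage.zoom(img, 2, order=0)   # nearest neighbor
bilinear = ndimage.zoom(img, 2, order=1)  # bilinear

# Nearest neighbor introduces no new graylevels, so class labels survive
assert set(np.unique(nearest)) <= set(np.unique(img))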
In the Image Processing Toolkit, the IPSCALE Function on page 378 performs geometric operations. There is also interactive image warping capability in the Image Tool (WzIPImage), as well as the following geometric operation routines found in PV‑WAVE: CONGRID, REBIN, ROT, and ROTATE.
Color Image Processing
There are two general categories of color image processing: full color and pseudo-color processing. Full color images are acquired with a sensor that detects the full visible range, such as a television camera. Pseudo-color images are artificially formed from images that are representative of spectral information outside the visible spectrum.
Color Models
The most common color models are the red, green, blue (RGB) model and the hue, saturation, value (HSV) model, whereas color printers use the cyan, magenta, yellow (CMY) model. The choice of a color model depends on the origin of the image information, the sensor used to obtain it, and the desired results of the image processing application. Additional information about color models can be found in the PV‑WAVE User’s Guide.
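Conversions between these models are simple per-pixel formulas. The fragment below is an illustrative Python example using the standard library’s colorsys module, not a toolkit routine:

import colorsys

# Convert one RGB pixel (components in 0..1) to HSV and back
r, g, b = 0.8, 0.4, 0.2
h, s, v = colorsys.rgb_to_hsv(r, g, b)
r2, g2, b2 = colorsys.hsv_to_rgb(h, s, v)  # round trip recovers r, g, b

# CMY, used by printers, is simply the complement of RGB
c, m, y = 1 - r, 1 - g, 1 - b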
Density Slicing
Intensity or density slicing is a form of pseudo-color image processing. In density slicing, planes parallel to the zero-amplitude plane of the image slice the pixel values into a discrete set of ranges smaller than the original dynamic range. The plane locations are chosen by the user, and the pixel values between each pair of planes are mapped to a different color. In the Image Processing Toolkit, the DENSITY_SLICE Function on page 290 performs density slicing on an image.
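A density slice reduces to binning pixel values and looking up a color per bin. This is an illustrative Python/NumPy sketch (not the DENSITY_SLICE implementation); the plane locations and colors here are arbitrary choices:

import numpy as np

rng = np.random.default_rng(9)
img = rng.integers(0, 256, size=(64, 64))

# User-chosen plane locations dividing 0..255 into four slices
planes = [64, 128, 192]

# One pseudo-color (RGB) per slice
colors = np.array([[0, 0, 255],    # blue
                   [0, 255, 0],    # green
                   [255, 255, 0],  # yellow
                   [255, 0, 0]],   # red
                  dtype=np.uint8)

# Map every pixel to the color of the slice its value falls in
pseudo = colors[np.digitize(img, planes)]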
Classification and Segmentation
Segmentation and classification are closely related, and sometimes confused. Segmentation is used to identify regions of common characteristics in an image. For example, a medical image can be segmented into areas of soft tissue, bone, and air; or a satellite image can be segmented into regions of vegetation, ground clutter, and water. Thresholding, edge detection, and region growing are commonly used segmentation techniques.
Classification is a step beyond segmentation, in which particular substances or objects within an image are identified and labeled. Classification usually involves determining the number of separate classes contained within the image. This can be performed by the user, in which case it is termed supervised classification; or it can be performed automatically, which is known as unsupervised classification.