Introduction
The routines in this chapter are used to test for goodness of fit and randomness. The goodness-of-fit tests are described in Conover (1980). There are two goodness-of-fit tests for general distributions: a Kolmogorov-Smirnov test and a chi-squared test. For both tests, the user supplies the hypothesized cumulative distribution function.
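To illustrate what supplying a hypothesized cumulative distribution function means, the following is a minimal Python sketch of the one-sample Kolmogorov-Smirnov statistic. It is not the interface of the routines in this chapter: the names ks_statistic, standard_normal_cdf, and uniform_cdf are hypothetical, and only the statistic is computed, not its p-value.

```python
import math


def ks_statistic(sample, cdf):
    """Return the one-sample Kolmogorov-Smirnov statistic D.

    `cdf` is the user-supplied hypothesized cumulative distribution
    function; it must accept a float and return a value in [0, 1].
    (Hypothetical helper, for illustration only.)
    """
    xs = sorted(sample)
    n = len(xs)
    if n == 0:
        raise ValueError("sample must contain at least one observation")
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x,
        # so the distance is checked on both sides of the jump.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def standard_normal_cdf(x):
    """Hypothesized CDF: standard normal, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def uniform_cdf(x):
    """Hypothesized CDF: uniform on [0, 1]."""
    return min(1.0, max(0.0, x))


if __name__ == "__main__":
    data = [-1.2, -0.4, 0.1, 0.3, 0.9, 1.7]
    print("D against the standard normal:", ks_statistic(data, standard_normal_cdf))
```

Any callable can be passed as the hypothesized distribution, which is the design point: the test itself makes no assumption about the distribution family.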
Three functions are provided for testing whether a set of data came from a normal distribution: the Shapiro-Wilk test, the Lilliefors test, and the chi-squared test. When the sample contains fewer than 5,000 observations, the Shapiro-Wilk test provides an accurate estimate of its p-value. The Lilliefors test is also popular, but it provides accurate p-value estimates only for p-values between 0.01 and 0.1: p-values below 0.01 are always returned as 0.01, and p-values above 0.1 are returned as 0.5. The general version of the chi-squared test is also available for the normal distribution.
The tests for randomness are often used to evaluate the adequacy of pseudorandom number generators. These tests are discussed in Knuth (1981).
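As an illustration of how such a test can be applied to a generator, the following Python sketch computes a chi-squared statistic on non-overlapping pairs of values, in the spirit of the serial test discussed by Knuth. It is not necessarily how any routine in this chapter is implemented; the function name serial_pairs_statistic and the choice of Python's random module as the generator under test are assumptions made for the example.

```python
import random


def serial_pairs_statistic(values, cells):
    """Chi-squared statistic for non-overlapping pairs of values in [0, 1).

    The unit square is divided into cells x cells equal boxes and each
    pair (values[2j], values[2j + 1]) is counted in its box. Under the
    hypothesis of independent uniform values, every box is equally likely.
    Returns (statistic, degrees_of_freedom). Hypothetical helper.
    """
    if cells < 1:
        raise ValueError("cells must be at least 1")
    n_pairs = len(values) // 2
    if n_pairs == 0:
        raise ValueError("at least two values are required")
    counts = [[0] * cells for _ in range(cells)]
    for j in range(n_pairs):
        # min() guards against a value that is exactly 1.0.
        row = min(int(values[2 * j] * cells), cells - 1)
        col = min(int(values[2 * j + 1] * cells), cells - 1)
        counts[row][col] += 1
    expected = n_pairs / (cells * cells)
    stat = sum((c - expected) ** 2 / expected for row in counts for c in row)
    return stat, cells * cells - 1


if __name__ == "__main__":
    rng = random.Random(12345)  # generator under test (assumed for the example)
    draws = [rng.random() for _ in range(20000)]
    stat, df = serial_pairs_statistic(draws, cells=10)
    print("chi-squared statistic:", stat, "degrees of freedom:", df)
```

A statistic that is very large relative to its degrees of freedom suggests that consecutive values are not behaving like independent uniform draws.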
The Kolmogorov-Smirnov routines in this chapter compute exact probabilities for small to moderate sample sizes. The chi-squared goodness-of-fit test may be used with discrete as well as continuous distributions.
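For the discrete case, the chi-squared statistic compares observed category counts with the counts expected under the hypothesized probabilities. The Python sketch below shows the arithmetic only; the name chi_squared_statistic and its dictionary-based interface are hypothetical and do not describe the routine in this chapter, and no p-value is computed.

```python
from collections import Counter


def chi_squared_statistic(observations, probabilities):
    """Pearson chi-squared statistic for a discrete distribution.

    `probabilities` maps each possible outcome to its hypothesized
    probability. Returns (statistic, degrees_of_freedom), where the
    degrees of freedom assume no parameters were estimated from the data.
    (Hypothetical helper, for illustration only.)
    """
    n = len(observations)
    if n == 0:
        raise ValueError("at least one observation is required")
    counts = Counter(observations)
    unknown = set(counts) - set(probabilities)
    if unknown:
        raise ValueError(f"outcomes with no hypothesized probability: {unknown}")
    stat = 0.0
    for outcome, p in probabilities.items():
        expected = n * p
        stat += (counts.get(outcome, 0) - expected) ** 2 / expected
    return stat, len(probabilities) - 1


if __name__ == "__main__":
    # 40 coin tosses tested against a fair coin.
    tosses = ["H"] * 30 + ["T"] * 10
    print(chi_squared_statistic(tosses, {"H": 0.5, "T": 0.5}))
```

For a continuous distribution the same arithmetic applies once the range is divided into cells, with each cell's probability obtained from differences of the hypothesized cumulative distribution function.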
The Kolmogorov-Smirnov and chi-squared goodness-of-fit test routines allow for missing values (NaN, not a number) in the input data. The routines that test for randomness do not allow for missing values.
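The text does not state how missing values are treated internally. The Python sketch below assumes, purely for illustration, that the goodness-of-fit routines exclude NaN observations before computing the statistic, and that data passed to a randomness test must be checked for NaN beforehand. The helper names drop_missing and require_complete are hypothetical.

```python
import math


def drop_missing(values):
    """Return (observations without NaN, number of NaN values removed).

    Mirrors the assumed treatment of missing values in the
    goodness-of-fit tests: NaN entries are excluded from the sample.
    """
    kept = [v for v in values if not math.isnan(v)]
    return kept, len(values) - len(kept)


def require_complete(values):
    """Raise ValueError if any value is NaN; otherwise return a list copy.

    Intended as a guard before calling a randomness test, which does
    not allow missing values.
    """
    if any(math.isnan(v) for v in values):
        raise ValueError("randomness tests do not allow missing values (NaN)")
    return list(values)


if __name__ == "__main__":
    raw = [0.12, float("nan"), 0.87, 0.45, float("nan")]
    sample, n_missing = drop_missing(raw)
    print("usable observations:", sample, "missing:", n_missing)
```

Removing values from a sequence changes which observations are adjacent, so dropping NaN entries before a randomness test is generally not a safe substitute for complete data.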