NORM2SAMP Function
Computes statistics for mean and variance inferences using samples from two independent normal populations.
Usage
result = NORM2SAMP(x1, x2)
Input Parameters
x1—One-dimensional array containing the first sample.
x2—One-dimensional array containing the second sample.
Returned Value
result—Difference in means: the mean of the first sample minus the mean of the second sample.
Input Keywords
Double—If present and nonzero, double precision is used.
Conf_Mean—Confidence level for two-sided interval estimate of the mean of x1 minus the mean of x2, in percent. Keyword Conf_Mean must be between 0.0 and 100.0 and is often 90.0, 95.0, or 99.0. For a one-sided confidence interval with confidence level c (at least 50 percent), set Conf_Mean = 100.0 – 2.0 × (100.0 – c). Default: Conf_Mean = 95.0
T_Test_Null_Hyp—Null hypothesis value for the t test. Default: T_Test_Null_Hyp = 0.0
Conf_Var—Confidence level for inference on variances. Under the assumption of equal variances, the pooled variance is used to obtain a two-sided Conf_Var percent confidence interval for the common variance if Ci_Comm_Var is specified. Without making the assumption of equal variances, the ratio of the variances is of interest. A two-sided Conf_Var percent confidence interval for the ratio of the variance of the first sample to that of the second sample is computed and is returned if Ci_Ratio_Var is specified. The confidence intervals are symmetric in probability. Default: Conf_Var = 95.0
Chi_Sq_Null_Hyp—Null hypothesis value for the chi-squared test. Default: Chi_Sq_Null_Hyp = 1.0
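The one-sided conversion described for Conf_Mean can be sketched as follows. This is a minimal illustration, not part of the original manual: the variable names c, conf, and ci are arbitrary, and the data are the class scores from Example 1. With c = 95.0 the formula gives Conf_Mean = 90.0, so each limit of the returned two-sided 90-percent interval serves as a one-sided 95-percent bound.

```
; Sketch only: c, conf, and ci are illustrative names.
x1 = [72, 75, 77, 80, 104, 110, 125]
x2 = [111, 118, 128, 138, 140, 150, 163, 164, 169]
c = 95.0                             ; desired one-sided confidence level
conf = 100.0 - 2.0 * (100.0 - c)     ; two-sided level to request (90.0)
diff = NORM2SAMP(x1, x2, Conf_Mean = conf, Ci_Diff_Eq_Var = ci)
; ci(0) is a one-sided lower bound and ci(1) a one-sided upper bound
; for the mean of x1 minus the mean of x2, each at level c.
PRINT, 'one-sided lower bound = ', ci(0)
PRINT, 'one-sided upper bound = ', ci(1)
```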
Output Keywords
Mean_X1—Named variable into which the mean of the first sample is stored.
Mean_X2—Named variable into which the mean of the second sample is stored.
Ci_Diff_Eq_Var—Named variable into which the two-element array containing the lower confidence limit and the upper limit for the mean of the first population minus the mean of the second, assuming equal variances, is stored.
Ci_Diff_Ne_Var—Named variable into which the two-element array containing the lower confidence limit and the upper limit for the mean of the first population minus the mean of the second, assuming unequal variances, is stored.
T_Test_Eq_Var—Named variable into which the three-element array containing statistics associated with a t test for μ1 – μ2 = d, where d is the null hypothesis value, is stored. (See the description of T_Test_Null_Hyp.) The first element contains the degrees of freedom, the second element contains the t value, and the third element contains the probability of a larger t in absolute value, assuming the null hypothesis is true. This test assumes equal variances.
T_Test_Ne_Var—Named variable into which the three-element array containing statistics associated with a t test for μ1 – μ2 = d, where d is the null hypothesis value, is stored. (See the description of T_Test_Null_Hyp.) The first element contains the degrees of freedom for Satterthwaite’s approximation, the second element contains the t value, and the third element contains the probability of a larger t in absolute value, assuming the null hypothesis is true. This test does not assume equal variances.
Pooled_Var—Named variable into which the pooled variance for the two samples is stored.
Ci_Comm_Var—Named variable into which the two-element array containing the lower confidence limit and the upper confidence limit for the common (or pooled) variance is stored.
Chi_Sq_Test—Named variable into which the three-element array containing statistics associated with the chi-squared test for σ² = σ0², where σ² is the common (or pooled) variance and σ0² is the null hypothesis value, is stored. (See the description of Chi_Sq_Null_Hyp.) The first element contains the degrees of freedom, the second element contains the chi-squared value, and the third element contains the probability of a larger chi-squared value (the p-value). This test assumes equal variances.
Stdev_X1—Named variable into which the standard deviation of the first sample is stored.
Stdev_X2—Named variable into which the standard deviation of the second sample is stored.
Ci_Ratio_Var—Named variable into which the two-element array containing the approximate lower confidence limit and the approximate upper confidence limit for the ratio of the variance of the first population to the second is stored.
F_Test—Named variable into which the four-element array containing statistics associated with the F test for equality of variances is stored. The first element contains the degrees of freedom for the numerator, the second element contains the degrees of freedom for the denominator, the third element contains the F test value, and the fourth element contains the probability of a larger F value (the p-value), assuming the null hypothesis (H0: σ1² = σ2²) is true.
Discussion
Function NORM2SAMP computes statistics for making inferences about the means and variances of two normal populations, using independent samples in x1 and x2. For inferences concerning parameters of a single normal population, see NORM1SAMP Function.
Let μ1 and σ1² be the mean and variance of the first population, and let μ2 and σ2² be the corresponding quantities of the second population. The function computes test statistics and confidence intervals for the difference in means, the equality of variances, and the pooled variance.
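The means and variances for the two samples are as follows:
\[ \bar{x}_1 = \frac{1}{n_1}\sum_{i=1}^{n_1} x_{1i}, \qquad \bar{x}_2 = \frac{1}{n_2}\sum_{i=1}^{n_2} x_{2i} \]
and:
\[ s_1^2 = \frac{1}{n_1 - 1}\sum_{i=1}^{n_1} \left(x_{1i} - \bar{x}_1\right)^2, \qquad s_2^2 = \frac{1}{n_2 - 1}\sum_{i=1}^{n_2} \left(x_{2i} - \bar{x}_2\right)^2 \]
where n1 and n2 are the sizes of the first and second samples.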
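The sample statistics can be cross-checked by hand. The following is a minimal sketch, assuming the usual definitions (sample mean, and sample variance with divisor n – 1) and using only the general-purpose functions N_ELEMENTS, TOTAL, and SQRT; the variable names are illustrative and the data are the class scores from Example 1.

```
; Sketch only: variable names are illustrative.
x1 = [72, 75, 77, 80, 104, 110, 125]
x2 = [111, 118, 128, 138, 140, 150, 163, 164, 169]
n1 = N_ELEMENTS(x1)
xbar1 = TOTAL(x1) / n1                       ; sample mean of x1
s1sq = TOTAL((x1 - xbar1)^2) / (n1 - 1)      ; sample variance of x1
; Request the same quantities from NORM2SAMP for comparison.
diff = NORM2SAMP(x1, x2, Mean_X1 = m1, Stdev_X1 = sd1)
PRINT, 'mean of x1 by hand, from keyword:  ', xbar1, m1
PRINT, 'stdev of x1 by hand, from keyword: ', SQRT(s1sq), sd1
```

The same steps apply to x2 with Mean_X2 and Stdev_X2.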
Inferences about the Means
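The test that the difference in means equals a certain value d (the null hypothesis value given by T_Test_Null_Hyp) depends on whether or not the variances of the two populations can be considered equal. If the variances are equal and T_Test_Null_Hyp equals zero, the test is the two-sample t test, which is equivalent to an analysis-of-variance test. The pooled variance for the difference-in-means test is as follows:
\[ s^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2} \]
where s1² and s2² are the sample variances and n1 and n2 are the sample sizes. The t statistic, which has n1 + n2 – 2 degrees of freedom, is as follows:
\[ t = \frac{\bar{x}_1 - \bar{x}_2 - d}{s\,\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \]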
Also, the confidence interval for the difference in means can be obtained by specifying Ci_Diff_Eq_Var.
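As a check on the equal-variance formulas, the pooled variance and t statistic can be computed directly and compared with the values NORM2SAMP returns. This is a sketch under the standard pooled two-sample definitions, with illustrative variable names and d = 0.0 (the default for T_Test_Null_Hyp). For the class-score data, the results can be compared with the pooled variance and t statistic reported in Example 2.

```
; Sketch only: variable names are illustrative.
x1 = [72, 75, 77, 80, 104, 110, 125]
x2 = [111, 118, 128, 138, 140, 150, 163, 164, 169]
n1 = N_ELEMENTS(x1)
n2 = N_ELEMENTS(x2)
d = 0.0                                      ; null hypothesis value
xbar1 = TOTAL(x1) / n1
xbar2 = TOTAL(x2) / n2
s1sq = TOTAL((x1 - xbar1)^2) / (n1 - 1)
s2sq = TOTAL((x2 - xbar2)^2) / (n2 - 1)
; Pooled variance and equal-variance t statistic.
sp_hand = ((n1 - 1) * s1sq + (n2 - 1) * s2sq) / (n1 + n2 - 2)
t_hand = (xbar1 - xbar2 - d) / SQRT(sp_hand * (1.0 / n1 + 1.0 / n2))
; Values computed by NORM2SAMP for comparison.
diff = NORM2SAMP(x1, x2, Pooled_Var = sp, T_Test_Eq_Var = t)
PRINT, 'pooled variance by hand, from keyword: ', sp_hand, sp
PRINT, 't statistic by hand, from keyword:     ', t_hand, t(1)
```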
If the population variances are not equal, the ordinary t statistic does not have a t distribution, and several approximate tests for the equality of means have been proposed. (See, for example, Anderson and Bancroft 1952, and Kendall and Stuart 1979.) One of the earliest tests devised for this situation is the Fisher-Behrens test, based on Fisher’s concept of fiducial probability. The procedure used if T_Test_Ne_Var and/or Ci_Diff_Ne_Var are specified is Satterthwaite’s procedure, as suggested by H.F. Smith and modified by F.E. Satterthwaite (Anderson and Bancroft 1952, p. 83).
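The test statistic is:
\[ t' = \frac{\bar{x}_1 - \bar{x}_2 - d}{s_d} \]
where
\[ s_d = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \]
and s1², s2² are the sample variances and n1, n2 the sample sizes. Under the null hypothesis of μ1 – μ2 = d, this quantity has an approximate t distribution with degrees of freedom given by the following equation:
\[ \mathrm{df} = \frac{s_d^4}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}} \]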
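The unequal-variance results can be requested through T_Test_Ne_Var and Ci_Diff_Ne_Var. The sketch below also recomputes the approximate degrees of freedom by hand, assuming the standard Satterthwaite formula; the variable names are illustrative and the data are the class scores from Example 1.

```
; Sketch only: variable names are illustrative.
x1 = [72, 75, 77, 80, 104, 110, 125]
x2 = [111, 118, 128, 138, 140, 150, 163, 164, 169]
diff = NORM2SAMP(x1, x2, T_Test_Ne_Var = t_ne, Ci_Diff_Ne_Var = ci_ne, $
   Stdev_X1 = sd1, Stdev_X2 = sd2)
n1 = N_ELEMENTS(x1)
n2 = N_ELEMENTS(x2)
v1 = sd1^2 / n1                              ; s1^2 / n1
v2 = sd2^2 / n2                              ; s2^2 / n2
t_hand = diff / SQRT(v1 + v2)                ; t' with d = 0.0
df_hand = (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
PRINT, 'df by hand, from keyword: ', df_hand, t_ne(0)
PRINT, 't'' by hand, from keyword: ', t_hand, t_ne(1)
PRINT, 'p-value:                  ', t_ne(2)
PRINT, 'CI for difference:        ', ci_ne(0), ci_ne(1)
```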
Inferences about the Variances
The F statistic for testing the equality of variances is given by:
F = s²max / s²min,
where s²max is the larger and s²min the smaller of s1² and s2². If the variances are equal, this quantity has an F distribution with n1 – 1 and n2 – 1 degrees of freedom, where, in this expression, n1 is the sample size corresponding to s²max and n2 is the sample size corresponding to s²min.
Generally, it is not recommended that the results of the F test be used to decide whether to use the regular t test or the modified t' on a single set of data. The modified t' (Satterthwaite’s procedure) is the more conservative approach to use if there is doubt about the equality of the variances.
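The variance-related outputs can be requested together. The following sketch uses only keywords described above (F_Test, Ci_Ratio_Var, Chi_Sq_Test, Ci_Comm_Var, Stdev_X1, Stdev_X2) and recomputes the F statistic from the sample standard deviations with the general-purpose MAX and MIN functions; the variable names are illustrative and the data are the class scores from Example 1.

```
; Sketch only: variable names are illustrative.
x1 = [72, 75, 77, 80, 104, 110, 125]
x2 = [111, 118, 128, 138, 140, 150, 163, 164, 169]
diff = NORM2SAMP(x1, x2, Stdev_X1 = sd1, Stdev_X2 = sd2, $
   F_Test = f, Ci_Ratio_Var = ci_ratio, $
   Chi_Sq_Test = chi, Ci_Comm_Var = ci_comm)
; F = s2max / s2min, computed from the two sample variances.
v = [sd1^2, sd2^2]
f_hand = MAX(v) / MIN(v)
PRINT, 'F by hand, from keyword:  ', f_hand, f(2)
PRINT, 'F test df (num, denom):   ', f(0), f(1)
PRINT, 'F test p-value:           ', f(3)
PRINT, 'CI for variance ratio:    ', ci_ratio(0), ci_ratio(1)
PRINT, 'CI for common variance:   ', ci_comm(0), ci_comm(1)
PRINT, 'chi-squared df, value, p: ', chi(0), chi(1), chi(2)
```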
Example 1
This example, taken from Conover and Iman (1983, p. 294), involves scores on arithmetic tests of two grade-school classes. The question is whether a group taught by an experimental method has a higher mean score. Only the difference in means is output. The data are shown in Table 2-2: Class Scores.
x1 = [72, 75, 77, 80, 104, 110, 125]
x2 = [111, 118, 128, 138, 140, 150, 163, 164, 169]
PRINT, 'difference of means = ', NORM2SAMP(x1, x2)
; PV-WAVE prints: difference of means = -50.4762
Example 2
The same data are used for this example as for Example 1. Here, the results of the t test are output. The variances of the two populations are assumed to be equal. It is seen from the output that there is strong reason to believe that the two means are different (t value of –4.804). Since the two-sided 95-percent confidence interval for μ1 – μ2 does not include zero, the null hypothesis μ1 = μ2 would be rejected at the 0.05 significance level. (The closeness of the values of the sample variances provides some qualitative substantiation of the assumption of equal variances.) First, define a procedure to print the results.
PRO print_results, diff, sp, ci, t
   ; Print the difference of means, the pooled variance, the
   ; confidence interval, and the equal-variance t-test statistics.
   PM, diff, Title = 'Difference of Means: '
   PM, sp, Title = 'Pooled Variance: '
   PM, 'CI for Difference of Means is (', ci(0), ',', ci(1), ')'
   PM, ' '
   PM, 't-test for Equal Variances:'
   PM, t(0), Title = 'Degrees of Freedom:'
   PM, t(1), Title = 't statistic: '
   PM, t(2), Title = 'P-Value:'
END
; Class scores for the two groups.
x1 = [72, 75, 77, 80, 104, 110, 125]
x2 = [111, 118, 128, 138, 140, 150, 163, 164, 169]
; Request the pooled variance, the equal-variance confidence
; interval, and the equal-variance t test.
diff = NORM2SAMP(x1, x2, Pooled_Var = sp, $
   Ci_Diff_Eq_Var = ci, T_Test_Eq_Var = t)
print_results, diff, sp, ci, t
This results in the following output:
Difference of Means:
-50.4762
Pooled Variance:
434.633
CI for Difference of Means is
( -73.0100, -27.9424)
t-test for Equal Variances:
Degrees of Freedom:
14.0000
t statistic:
-4.80436
P-Value:
0.000280258