RANDOM_SAMPLE Function
Generates a simple pseudorandom sample from a finite population.
Usage
result = RANDOM_SAMPLE(nsamp, population)
Input Parameters
nsamp—The sample size desired.
population—A one- or two-dimensional array containing the population to be sampled. If either of the keywords First_Call or Additional_Call is specified, population contains a different part of the population on each invocation; otherwise, population contains the entire population.
Returned Value
result—An nsamp by nvar array containing the sample, where nvar is the number of columns in the argument population.
Input Keywords
Double—If present and nonzero, double precision is used.
First_Call—If present and nonzero, then this is the first invocation with this data; additional calls to RANDOM_SAMPLE may be made to add to the population. Additional calls should be made using the keyword Additional_Call. Keywords Index and Npop are required if First_Call is set. See Example 2.
Additional_Call—If present and nonzero, then this is an additional invocation of RANDOM_SAMPLE, and the sample is updated using the subpopulation in population. Keywords Index, Npop, and Sample are required if Additional_Call is set. It is not necessary to know the number of items in the population in advance. Npop accumulates the population size across calls and should not be changed between calls to RANDOM_SAMPLE. See Example 2.
Input/Output Keywords
Index—A one-dimensional array of length nsamp containing the indices of the sample in the population. Output if keyword First_Call is used. Input/Output if keyword Additional_Call is used.
Npop—The number of items in the population. Output if keyword First_Call is used. Input/Output if keyword Additional_Call is used.
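Sample—An array of size nsamp by nvar containing the sample. Required if keyword Additional_Call is used. On the first additional call, pass the result returned by the invocation that used keyword First_Call; on later calls, pass the result of the previous call, as in Example 2.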
Discussion
Routine RANDOM_SAMPLE generates a pseudorandom sample from a given population, without replacement, using an algorithm due to McLeod and Bellhouse (1983).
The first nsamp items in the population are included in the sample. Then, for each successive item from the population, that item replaces a randomly chosen item in the sample with probability nsamp/k, where k is the number of population items encountered so far, including the current item.
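The update rule described above can be sketched outside PV-WAVE. The Python code below is illustrative only: it is not the RANDOM_SAMPLE implementation, the name reservoir_sample is hypothetical, and it draws from Python's random module rather than the PV-WAVE generator, so it will not reproduce the values printed in the examples. It assumes zero-based indices and a population of at least nsamp items.

```python
import random

def reservoir_sample(population, nsamp, rng=None):
    """Illustrative sampling without replacement (hypothetical helper).

    Returns (sample, index), where index[j] is the zero-based position in
    `population` of the item stored in sample[j].
    """
    if rng is None:
        rng = random.Random()
    sample, index = [], []
    for k, item in enumerate(population, start=1):  # k = items seen so far
        if k <= nsamp:
            # The first nsamp items fill the sample directly.
            sample.append(item)
            index.append(k - 1)
        elif rng.random() < nsamp / k:
            # With probability nsamp / k, overwrite a randomly chosen slot.
            slot = rng.randrange(nsamp)
            sample[slot] = item
            index[slot] = k - 1
    if len(sample) < nsamp:
        raise ValueError("population has fewer than nsamp items")
    return sample, index

# Usage: draw 5 items from a made-up population of 176 integers.
sample, index = reservoir_sample(range(100, 276), 5, random.Random(123457))
```

After the k-th item has been processed, each item seen so far is in the sample with probability nsamp/k, which is why a single pass suffices and the population size need not be known beforehand.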
Example 1
In this example, RANDOM_SAMPLE is used to generate a sample of size 5 from a population stored in the matrix pop.
RANDOMOPT, Set = 123457
pop = STATDATA(2)
samp = RANDOM_SAMPLE(5, pop)
PM, samp
; PV-WAVE prints the following:
; 1764.00 36.4000
; 1828.00 62.5000
; 1923.00 5.80000
; 1773.00 34.8000
; 1769.00 106.100
Example 2
Routine RANDOM_SAMPLE is now used to generate a sample of size 5 from the same population as in Example 1, except that the data are passed to RANDOM_SAMPLE one observation at a time. RANDOM_SAMPLE can be used this way to sample from a file on disk or tape. Note that the number of records need not be known in advance.
RANDOMOPT, Set = 123457
pop = STATDATA(2)
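; First call: pass the first observation; Index and Npop are returned.
samp = RANDOM_SAMPLE(5, pop(0, *), /First_Call, Index = ii, $
   Npop = np)
; Additional calls: pass each remaining observation with the current state.
FOR i = 1L, 175 DO samp = RANDOM_SAMPLE(5, pop(i, *), $
   /Additional_Call, Index = ii, Npop = np, Sample = samp)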
PM, samp
; PV-WAVE prints the following:
; 1764.00 36.4000
; 1828.00 62.5000
; 1923.00 5.80000
; 1773.00 34.8000
; 1769.00 106.100
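The call pattern in Example 2 amounts to carrying three pieces of state between calls: the current sample, the indices of its items, and the running population count. The Python sketch below mimics that bookkeeping under stated assumptions. The functions first_call and additional_call are hypothetical stand-ins rather than PV-WAVE routines, the records are synthetic rather than the STATDATA(2) values, indices are assumed zero-based, and the random stream differs from PV-WAVE's, so the sample drawn will not match the output above.

```python
import random

def additional_call(chunk, nsamp, sample, index, npop, rng):
    """Fold one chunk of observations into the running sample (hypothetical).

    `sample`, `index`, and `npop` play the roles of the Sample, Index, and
    Npop keywords: they are passed in, updated, and handed back.
    """
    sample, index = list(sample), list(index)
    for item in chunk:
        npop += 1  # running count of population items seen so far
        if len(sample) < nsamp:
            # Still filling the sample with the first nsamp items.
            sample.append(item)
            index.append(npop - 1)
        elif rng.random() < nsamp / npop:
            # Replace a randomly chosen slot with probability nsamp / npop.
            slot = rng.randrange(nsamp)
            sample[slot] = item
            index[slot] = npop - 1
    return sample, index, npop

def first_call(chunk, nsamp, rng):
    """Start a new sample from the first chunk (hypothetical)."""
    return additional_call(chunk, nsamp, [], [], 0, rng)

rng = random.Random(123457)
records = [(i, 0.5 * i) for i in range(176)]  # synthetic two-column records

# One observation per call, as in Example 2.
sample, index, npop = first_call(records[:1], 5, rng)
for rec in records[1:]:
    sample, index, npop = additional_call([rec], 5, sample, index, npop, rng)
```

Because the count is accumulated inside the update, the caller never supplies the total number of records, which mirrors the instruction to leave Npop unchanged between calls.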