Stepwise Selection
Stepwise selection has been proposed as a technique that combines advantages of forward and backward selection. At any point in the search, a single predictor variable may be added or deleted. Commonly, the starting subset is the empty set. When we think in terms of a predictor variable subset's bit representation, stepwise selection allows one bit in the representation to be flipped at any point in the search. Therefore, since a subset's bit representation has N bits, each subset in the search graph for stepwise selection has N neighbors. If no bit is flipped more than once, at most N^2 subsets are evaluated before stepwise selection ends. There is, however, no guarantee that each bit is flipped at most once.
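The search described above can be sketched in Python. This is a minimal greedy variant, assuming a caller-supplied scoring function where higher is better; the function name and stopping rule (stop when no single bit flip improves the score) are illustrative choices, not part of the original description.

```python
from itertools import count

def stepwise_select(n_features, score, max_steps=None):
    """Greedy stepwise selection over bit-vector subsets.

    Starting from the empty subset, repeatedly flip the single bit
    (add or delete one predictor) that most improves `score`; stop
    when no flip yields an improvement.
    """
    current = [0] * n_features            # empty starting subset
    best_score = score(current)
    steps = count() if max_steps is None else range(max_steps)
    for _ in steps:
        best_flip, best_flip_score = None, best_score
        for i in range(n_features):       # each subset has n_features neighbors
            current[i] ^= 1               # flip bit i: add or delete predictor i
            s = score(current)
            current[i] ^= 1               # undo the flip
            if s > best_flip_score:
                best_flip, best_flip_score = i, s
        if best_flip is None:             # no single flip improves: search ends
            break
        current[best_flip] ^= 1           # commit the best flip
        best_score = best_flip_score
    return current, best_score

# Toy scoring function (purely illustrative): reward selecting
# predictors 0 and 2, penalize subset size.
target = {0, 2}
def toy_score(bits):
    return sum(bits[i] for i in target) - 0.5 * sum(bits)

subset, s = stepwise_select(4, toy_score)
print(subset, s)   # selects exactly predictors 0 and 2
```

Each iteration evaluates all N neighbors of the current subset, so if no bit is ever flipped twice the search performs at most N steps and at most N^2 subset evaluations, matching the bound stated above.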
No strong theoretical results exist for comparing the effectiveness of stepwise selection against forward or backward selection. Because stepwise selection evaluates more subsets than the other two techniques, in practice it tends to produce better subsets (A. J. Miller, Subset Selection in Regression, Chapman & Hall, 1990). Of course, the price stepwise selection pays for finding better subsets is increased computation: usually more subsets must be evaluated before the search ends.