Using the nextToken() Method
The nextToken() interface returns the next token even when that token is empty, so the string being tokenized can contain empty fields of data. To detect the end of tokenization with this interface, call the done() method on the tokenizer. For example, this code extracts all tokens from a string using the nextToken() method:
 
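// Conversion context used when the char literals below are converted to RWUString
RWUConversionContext ascii("ascii");

RWUString text("John,Doe;,,33,175;");
RWUString delimiters(",;");   // each character in this string is a delimiter
RWUString next;
RWUTokenizer tok(text);

// done() reports when the whole string has been tokenized
while (!tok.done()) {
    next = tok.nextToken(delimiters);   // may be an empty string
    // Process the token
}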
The following tokens are extracted by this code:
 
John
Doe
(empty token)
(empty token)
33
175
Note that the comma and semicolon characters act as delimiters, and are specified using an RWUString.
In this case, two empty tokens are extracted by nextToken(). If the function call operator tokenizing interface had been used instead, the empty tokens would not be returned.
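The difference between keeping and dropping empty tokens can be sketched with the C++ standard library alone. This is an illustrative analog, not the RWUTokenizer implementation: SimpleTokenizer, tokenize(), and the skipEmpty flag are hypothetical names, the sketch works on std::string rather than RWUString, and its handling of a trailing delimiter is an assumption chosen to match the token list shown above.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in with done()/nextToken() semantics.
// It works on std::string and is NOT the RWUTokenizer implementation.
class SimpleTokenizer {
public:
    explicit SimpleTokenizer(const std::string& text) : text_(text), pos_(0) {}

    // True once every character of the input has been consumed.
    bool done() const { return pos_ >= text_.size(); }

    // Returns the text up to the next delimiter character; may be empty.
    std::string nextToken(const std::string& delimiters) {
        std::string::size_type end = text_.find_first_of(delimiters, pos_);
        if (end == std::string::npos) {
            end = text_.size();   // no delimiter left: take the rest
        }
        std::string token = text_.substr(pos_, end - pos_);
        pos_ = (end < text_.size()) ? end + 1 : end;   // step past the delimiter
        return token;
    }

private:
    std::string text_;
    std::string::size_type pos_;
};

// Collects all tokens. With skipEmpty set, empty tokens are dropped, which
// approximates the behavior described for the function call operator interface.
std::vector<std::string> tokenize(const std::string& text,
                                  const std::string& delimiters,
                                  bool skipEmpty) {
    std::vector<std::string> tokens;
    SimpleTokenizer tok(text);
    while (!tok.done()) {
        std::string next = tok.nextToken(delimiters);
        if (skipEmpty && next.empty()) {
            continue;
        }
        tokens.push_back(next);
    }
    return tokens;
}
```

Calling tokenize("John,Doe;,,33,175;", ",;", false) in this sketch yields six tokens, two of them empty, while passing true for skipEmpty yields only John, Doe, 33, and 175.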
The code below tokenizes a string using a regular expression delimiter and the nextToken() interface:
 
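// Conversion context used when the char literals below are converted to RWUString
RWUConversionContext ascii("ascii");
RWUString text("John, Doe, 33,175;");
// Delimiter: a comma or semicolon with optional surrounding whitespace
RWURegularExpression delimiters(RWCString("[{Zs}]*[,;][{Zs}]*"));
RWUString next;
RWUTokenizer tok(text);

// done() reports when the whole string has been tokenized
while (!tok.done()) {
    next = tok.nextToken(delimiters);
    // Process the token
}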
The following tokens are extracted by this code:
 
John
Doe
33
175
The RWURegularExpression delimiter expression
 
RWURegularExpression delimiters(RWCString("[{Zs}]*[,;][{Zs}]*"));
matches zero or more whitespace characters, followed by either a comma or a semicolon, followed by zero or more whitespace characters. Because the surrounding whitespace is part of the delimiter match, it does not appear in the extracted tokens. (See Regular Expression String Searching for more information on RWURegularExpression.)
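The same idea, splitting on a delimiter pattern that absorbs the surrounding whitespace, can be sketched with std::regex from the C++ standard library. This is an analog for illustration only: it does not use RWURegularExpression, splitOnRegex() is a hypothetical helper, and the ECMAScript pattern "[ ]*[,;][ ]*" is an assumed approximation that treats only the plain space character as whitespace.

```cpp
#include <regex>
#include <string>
#include <vector>

// Splits text on every match of the delimiter pattern (std::regex, ECMAScript
// syntax). Illustrative analog only; it does not use RWURegularExpression.
std::vector<std::string> splitOnRegex(const std::string& text,
                                      const std::string& delimiterPattern) {
    const std::regex delimiters(delimiterPattern);
    std::vector<std::string> tokens;
    // Submatch index -1 selects the text between matches, not the matches.
    std::sregex_token_iterator it(text.begin(), text.end(), delimiters, -1);
    const std::sregex_token_iterator end;
    for (; it != end; ++it) {
        tokens.push_back(it->str());
    }
    return tokens;
}
```

With the sample text, splitOnRegex("John, Doe, 33,175;", "[ ]*[,;][ ]*") yields John, Doe, 33, and 175, with no leading or trailing spaces in any token.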