©Copyright 1999 Rogue Wave Software

String Operations

In the following sections, we'll examine the standard library operations used to create and manipulate strings.

Declaration and Initialization of string

The simplest form of declaration for a string simply names a new variable, or names a variable along with the initial value for the string. This form was used extensively in the example graph program given in 9 (Example, Graphs). A copy constructor also permits a string to be declared that takes its value from a previously defined string.

    string s1;
    string s2 ("a string");
    string s3 = "initial value";
    string s4 (s3);
 

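In these simple cases the capacity is initially at least as large as the number of characters being stored. The standard string class has no constructor that sets the capacity directly; to set aside room in advance, call reserve() after constructing the string. (A constructor call of the form string(p, n) copies the first n characters of the character array p, so n must not exceed the length of that array.) Yet another form initializes the string with repeated copies of a single character value.

    string s6 ("small value");     // holds 11 characters
    s6.reserve (100);              // can now hold at least 100
    string s7 (10, '\n');          // holds ten newline characters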
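
As a small illustration of the repeated-character constructor, the sketch below underlines a title with dashes. The function name banner is hypothetical and chosen only for this example.

```cpp
#include <string>

// Hypothetical helper: returns the title followed by a newline and a
// row of dashes of the same length, built with the (count, char) constructor.
std::string banner(const std::string& title) {
    std::string underline(title.size(), '-');   // title.size() copies of '-'
    return title + "\n" + underline;
}
```

Calling banner("Hi") would yield the two-line value "Hi" followed by "--".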
 
Initializing from Iterators

Finally, like all the container classes in the standard library, a string can be initialized using a pair of iterators. The sequence being denoted by the iterators must have the appropriate type of elements.

   string s8 (aList.begin(), aList.end());
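
The snippet above assumes a container named aList already exists. A minimal self-contained sketch, assuming aList is a list of char values, might look like this; the function name from_list is hypothetical.

```cpp
#include <list>
#include <string>

// Hypothetical helper: builds a string from the characters held in a list,
// using the iterator-pair constructor.
std::string from_list(const std::list<char>& chars) {
    return std::string(chars.begin(), chars.end());
}
```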
 

Resetting Size and Capacity

As with the vector data type, the current size of a string is yielded by the size() member function, while the current capacity is returned by capacity(). The latter can be changed by a call on the reserve() member function, which (if necessary) adjusts the capacity so that the string can hold at least as many elements as specified by the argument. The member function max_size() returns the maximum string size that can be allocated. Usually this value is limited only by the amount of available memory.

    cout << s6.size() << endl;
    cout << s6.capacity() << endl;
    s6.reserve(200);                    // capacity is now at least 200
    cout << s6.capacity() << endl;
    cout << s6.max_size() << endl;
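
One common use of reserve() is to request the final capacity once before a series of appends. The sketch below shows the idea; the function name repeat is hypothetical, and whether the reservation actually saves reallocations depends on the library implementation.

```cpp
#include <string>

// Hypothetical helper: returns `unit` repeated `times` times.
// reserve() asks for the full capacity up front, so the appends that
// follow should not need to grow the buffer again.
std::string repeat(const std::string& unit, std::string::size_type times) {
    std::string result;
    result.reserve(unit.size() * times);
    for (std::string::size_type i = 0; i < times; ++i)
        result += unit;
    return result;
}
```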
 

The member function length() is simply a synonym for size(). The member function resize() changes the size of a string, either truncating characters from the end or inserting new characters. The optional second argument for resize() can be used to specify the character inserted into the newly created character positions.

    s7.resize(15, '\t');                // add tab characters at end
    cout << s7.length() << endl;        // size should now be 15
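
Because resize() both truncates and extends, it can force a string to a fixed width in one call. The following is a minimal sketch; the function name fit_width is hypothetical.

```cpp
#include <string>

// Hypothetical helper: returns a copy of s that is exactly `width`
// characters long, truncated at the end or padded with `fill`.
std::string fit_width(std::string s, std::string::size_type width, char fill) {
    s.resize(width, fill);
    return s;
}
```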
 

The member function empty() returns true if the string contains no characters, and states the intent more directly than testing the length against a zero constant.

    if (s7.empty())
        cout << "string is empty" << endl;
 

Assignment, Append and Swap

A string variable can be assigned the value of either another string, a literal C-style character array, or an individual character.

    s1 = s2;
    s2 = "a new value";
    s3 = 'x';
 

The operator += can also be used with any of these three forms of argument, and specifies that the value on the right hand side should be appended to the end of the current string value.

   s3 += "yz";                   // s3 is now xyz
 

The more general assign() and append() member functions let you specify a subset of the right hand side to be assigned to or appended to the receiver. When the right hand side is a string, two integer arguments, pos and n, indicate that the n characters beginning at position pos should be used. When the right hand side is a C-style character array, a single integer argument n indicates that only the first n characters should be assigned or appended.

    s4.assign (s2, 0, 3);        // assign first three characters
    s4.append (s5, 2, 3);        // append characters 2, 3 and 4
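
The two calls can be combined to build a value from pieces of two other strings. The sketch below is illustrative only; the function name head_plus_slice and its parameter names are hypothetical.

```cpp
#include <string>

// Hypothetical helper: takes the first n characters of a, then appends
// len characters of b starting at position pos.
// Both calls throw std::out_of_range if the start position exceeds the
// source string's size; a count that runs past the end is clamped.
std::string head_plus_slice(const std::string& a, std::string::size_type n,
                            const std::string& b, std::string::size_type pos,
                            std::string::size_type len) {
    std::string result;
    result.assign(a, 0, n);       // first n characters of a
    result.append(b, pos, len);   // characters pos .. pos+len-1 of b
    return result;
}
```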
 

The addition operator + is used to form the catenation of two strings. The + operator creates a copy of the left argument, then appends the right argument to this value.

   cout << (s2 + s3) << endl;   // output catenation of s2 and s3
 

As with all the containers in the standard library, the contents of two strings can be exchanged using the swap() member function.

   s5.swap (s4);                // exchange s4 and s5
 

Character Access

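An individual character from a string can be accessed or assigned using the subscript operator. The member function at() provides the same access but checks the index, throwing an out_of_range exception if the position is not less than size(); the subscript operator performs no such check.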

    cout << s4[2] << endl;        // output position 2 of s4
    s4[2] = 'x';                  // change position 2
    cout << s4.at(2) << endl;     // output updated value
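
The range check performed by at() can be used to guard against a bad index. The following is a minimal sketch; the function name char_or is hypothetical.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: returns the character at index i, or `fallback`
// when i is not a valid position. at() throws std::out_of_range for an
// invalid index, which is caught here.
char char_or(const std::string& s, std::string::size_type i, char fallback) {
    try {
        return s.at(i);
    } catch (const std::out_of_range&) {
        return fallback;
    }
}
```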
 

The member function c_str() returns a pointer to a null-terminated character array whose elements are the same as those contained in the string. This lets you use strings with functions that require a pointer to a conventional C-style character array. The resulting pointer is declared as constant, which means that you cannot use c_str() to modify the string. In addition, the value returned by c_str() may no longer be valid after any operation that modifies the string, in particular one that can cause reallocation (such as append() or insert()). The member function data() returns a pointer to the underlying character buffer; since C++11 that buffer is also null-terminated.

    char d[256];
    strncpy(d, s4.c_str(), sizeof(d) - 1);  // copy at most 255 characters of s4
    d[sizeof(d) - 1] = '\0';                // make sure d is null terminated
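
One consequence of handing c_str() to a C function is that the C function stops at the first null character, while the string itself may hold more. The sketch below illustrates this; the function name c_length is hypothetical.

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: length of s as seen by a C function.
// strlen() stops at the first null character, so the result can be
// smaller than s.size() when s contains embedded nulls.
std::string::size_type c_length(const std::string& s) {
    return std::strlen(s.c_str());
}
```

For a string such as std::string("ab\0cd", 5), size() reports 5 while c_length() reports 2.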
 

Iterators

The member functions begin() and end() return beginning and ending random-access iterators for the string. The values denoted by the iterators will be individual string elements. The functions rbegin() and rend() return reverse iterators, which traverse the string from the last character to the first.
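
As a brief illustration, the iterators returned by rbegin() and rend() can be passed to the iterator-pair constructor to produce a reversed copy. The function name reversed is hypothetical.

```cpp
#include <string>

// Hypothetical helper: returns the characters of s in reverse order by
// constructing a new string from the reverse iterator range.
std::string reversed(const std::string& s) {
    return std::string(s.rbegin(), s.rend());
}
```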

Invalidating Iterators

Insertion, Removal and Replacement

The string member functions insert() and erase() are similar to the vector functions insert() and erase(). Like the vector versions, they can take iterators as arguments, and specify the insertion or removal of the ranges specified by the arguments. The function replace() is a combination of erase and insert, in effect replacing the specified range with new values.

    s2.insert(s2.begin()+2, aList.begin(), aList.end());
    s2.erase(s2.begin()+3, s2.begin()+5);
    s2.replace(s2.begin()+3, s2.begin()+6, s3.begin(), s3.end());
 

In addition, the functions also have non-iterator implementations. The insert() member function takes as arguments a position and a string, and inserts the string at the given position, in front of the character currently found there. The erase() function takes two integer arguments, a position and a length, and removes the characters specified. And the replace() function takes two similar integer arguments as well as a string and an optional length, and replaces the indicated range with the string (or an initial portion of a string, if the length has been explicitly specified).

    s3.insert (3, "abc");      // insert abc at position 3
    s3.erase (4, 2);           // remove positions 4 and 5
    s3.replace (4, 2, "pqr");  // replace positions 4 and 5 with pqr
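
To make the effect of the position-based forms concrete, the sketch below applies the three calls in sequence to one value. The function name edit_demo and the inserted text "XYZ" are hypothetical; the argument is assumed to hold at least three characters, since each call throws std::out_of_range when its position lies beyond the end of the string.

```cpp
#include <string>

// Hypothetical helper: applies insert, erase and replace in turn.
// Precondition (assumed): s.size() >= 3.
std::string edit_demo(std::string s) {
    s.insert(3, "XYZ");       // "abcdefgh" becomes "abcXYZdefgh"
    s.erase(4, 2);            // drop positions 4 and 5: "abcXdefgh"
    s.replace(4, 2, "pqr");   // positions 4 and 5 become pqr: "abcXpqrfgh"
    return s;
}
```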
 

Copy and Substring

The member function copy() copies a substring of the receiver into a character array supplied as the first argument. The second argument gives the maximum number of characters to copy, and an optional third argument gives the starting position (zero if omitted). The function returns the number of characters actually copied, and does not append a null terminator to the target.

    char buf[16];
    string::size_type n = s3.copy (buf, 3, 2);  // copy positions 2 to 4 of s3
    buf[n] = '\0';                              // copy() adds no terminator
 

The member function substr() returns a string that represents a portion of the current string. The range is specified by either an initial position, or a position and a length.

    cout << s4.substr(3) << endl;       // output 3 to end
    cout << s4.substr(3, 2) << endl;    // output positions 3 and 4
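
Note that substr() throws std::out_of_range when the starting position is greater than the size of the string, while a length that runs past the end is simply clamped. A minimal sketch of a wrapper that returns an empty string instead of throwing follows; the function name safe_substr is hypothetical.

```cpp
#include <string>

// Hypothetical helper: like substr(pos, len), but yields an empty string
// rather than throwing when pos lies beyond the end of s.
std::string safe_substr(const std::string& s, std::string::size_type pos,
                        std::string::size_type len) {
    if (pos > s.size())
        return std::string();
    return s.substr(pos, len);   // len is clamped to the available characters
}
```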

String Comparisons

Comparing Strings

The member function compare() is used to perform a lexical comparison between the receiver and an argument string. Optional arguments restrict the comparison to a substring of the receiver, given as a starting position and a length, and optionally to a substring of the argument string as well. See Chapter 13 (Lexical Comparison) for a description of lexical ordering. The function returns a negative value if the receiver is lexicographically smaller than the argument, a zero value if they are equal and a positive value if the receiver is larger than the argument.

The relational and equality operators (<, <=, ==, !=, >= and >) are all defined in terms of the compare() member function. Comparisons can be made either between two strings, or between a string and an ordinary C-style character string.
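
Since compare() promises only the sign of its result, not a particular magnitude, code that needs a fixed value can normalize it. The following is a minimal sketch; the function name compare_sign is hypothetical.

```cpp
#include <string>

// Hypothetical helper: reduces the result of compare() to -1, 0 or 1.
int compare_sign(const std::string& a, const std::string& b) {
    int r = a.compare(b);
    return (r > 0) - (r < 0);
}
```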

Searching Operations

The member function find() determines the first occurrence of the argument string in the current string. An optional integer argument lets you specify the starting position for the search. (Remember that string index positions begin at zero.) If the function can locate such a match, it returns the starting index of the match in the current string. Otherwise, it returns the constant string::npos, a value outside the range of legal subscripts for the string. The function rfind() is similar, but scans the string from the end, moving backwards; its optional argument gives the last position at which a match may begin.

    s1 = "mississippi";
    cout << s1.find("ss") << endl;              // returns 2
    cout << s1.find("ss", 3) << endl;           // returns 5
    cout << s1.rfind("ss") << endl;             // returns 5
    cout << s1.rfind("ss", 4) << endl;          // returns 2
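
A typical loop built on find() tests the result against string::npos and restarts the search just past the previous match. The sketch below combines this with replace() to substitute every occurrence of one string by another; the function name replace_all is hypothetical.

```cpp
#include <string>

// Hypothetical helper: returns text with every occurrence of `from`
// replaced by `to`. An empty `from` is left alone, since it would
// otherwise match at every position.
std::string replace_all(std::string text, const std::string& from,
                        const std::string& to) {
    if (from.empty())
        return text;
    std::string::size_type pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos) {
        text.replace(pos, from.size(), to);
        pos += to.size();   // continue after the inserted text
    }
    return text;
}
```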
 

The functions find_first_of(), find_last_of(), find_first_not_of(), and find_last_not_of() treat the argument string as a set of characters. As with find() and rfind(), an optional integer argument specifies the position at which the search begins. These functions find the first (or last) character that is either present in (or absent from) the argument set. The position of the given character, if located, is returned. If no such character exists, the value string::npos, which lies outside the range of any legal subscript, is returned.

    i = s2.find_first_of ("aeiou");            // find first vowel
    j = s2.find_first_not_of ("aeiou", i);     // next non-vowel
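
The two calls above leave open what happens when either search fails. A minimal sketch that extracts the first run of vowels and handles both failure cases follows; the function name first_vowel_run is hypothetical, and only lowercase vowels are considered.

```cpp
#include <string>

// Hypothetical helper: returns the first maximal run of lowercase vowels
// in s, or an empty string if s contains none.
std::string first_vowel_run(const std::string& s) {
    const std::string vowels("aeiou");
    std::string::size_type i = s.find_first_of(vowels);         // first vowel
    if (i == std::string::npos)
        return std::string();                                   // no vowel at all
    std::string::size_type j = s.find_first_not_of(vowels, i);  // next non-vowel
    if (j == std::string::npos)
        return s.substr(i);                                     // run reaches the end
    return s.substr(i, j - i);
}
```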
 
