Top of document
©Copyright 1999 Rogue Wave Software

set and multiset Operations

Sets and Bags

The member functions provided by the set and multiset data types will shortly be described in more detail. Note that while member functions provide basic operations, the utility of these data structures is greatly extended through the use of the generic algorithms described in Chapters 13 and 14.

Declaration and Initialization of Set

A set is a template data structure, specialized by the type of the elements it contains, and the operator used to compare keys. The latter argument is optional, and, if it is not provided, the less than operator for the key type will be assumed. The element type can be a primitive language type (such as integer or double), a pointer type, or a user-defined type. The element type must recognize both the equality testing operator (operator ==) and the less than comparison operator (operator <).

Initializing Sets with Iterators

Sets can be declared with no initial elements, or they can be initialized from another container by providing a pair of iterators. An optional argument in both cases is an alternative comparison function; this value overrides the value provided by the template parameter. This mechanism is useful if a program contains two or more sets with the same values but different orderings, as it prevents more than one copy of the set member function from being instantiated. The copy constructor can be used to form a new set that is a clone, or copy, of an existing set.

   set <int> set_one;
 
    set <int, greater<int> > set_two;
    set <int> set_three(greater<int>());
 
    set <gadget, less<gadget> > gset;
    set <gadget> gset(less<gadget>());
 
    set <int> set_four (aList.begin(), aList.end());
    set <int> set_five 
       (aList.begin(), aList.end(), greater<int>());
 
    set <int> set_six (set_four);                // copy constructor
 

A set can be assigned to another set, and two sets can exchange their values using the swap() operation (in a manner analogous to other standard library containers).

   set_one = set_five;
    set_six.swap(set_two);
 

Type Definitions

The classes set and multiset include a number of type definitions. The most common use for these is in a declaration statement. For example, an iterator for a set of integers can be declared in the following fashion:

   set<int>::iterator location;
 

In addition to iterator, the following types are defined:

value_type
The type associated with the elements the set maintains.
const_iterator
An iterator that does not allow modification of the underlying sequence.
reverse_iterator
An iterator that moves in a backward direction.
const_reverse_iterator
A combination constant and reverse iterator.
reference
A reference to an underlying element.
const_reference
A reference to an underlying element that will not permit modification.
size_type
An unsigned integer type, used to refer to the size of containers.
value_compare
A function that can be used to compare two elements.
difference_type
A signed integer type, used to describe the distance between iterators.

Insertion

The Pair Data Type

Unlike a list or vector, there is only one way to add a new element to a set. A value can be inserted into a set or a multiset using the insert() member function. With a multiset, the function returns an iterator that denotes the value just inserted. Insert operations into a set return a pair of values, in which the first field contains an iterator, and the second field contains a boolean value that is true if the element was inserted, and false otherwise. Recall that in a set, an element will not be inserted if it matches an element already contained in the collection.

   set_one.insert (18);
 
    if (set_one.insert(18).second)
       cout << "element was inserted" << endl;
    else
       cout << "element was not inserted " << endl;
 

Insertions of several elements from another container can also be performed using an iterator pair:

   set_one.insert (set_three.begin(), set_three.end());
 

The pair data structure is a tuple of values. The first value is accessed through the field name first, while the second is, naturally, named second. A function named make_pair() simplifies the task of producing an instance of class pair.

template <class T1, class T2>
 struct pair {
    T1 first;
    T2 second;
    pair (const T1 & x, const T2 & y) : first(x), second(y) { }
 };
 
 template <class T1, class T2>
 inline pair<T1, T2> make_pair(const T1& x, const T2& y)
    { return pair<T1, T2>(x, y); }
 

In determining the equivalence of keys, for example, to determine if the key portion of a new element matches any existing key, the comparison function for keys is used, and not the equivalence (==) operator. Two keys are deemed equivalent if the comparison function used to order key values yields false in both directions. That is, if Compare(key1, key2) is false, and if Compare(key2, key1) is false, then key1 and key2 are considered equivalent.

Removal of Elements from a Set

Values are removed from a set using the member function erase(). The argument can be either a specific value, an iterator that denotes a single value, or a pair of iterators that denote a range of values. When the first form is used on a multiset, all arguments matching the argument value are removed, and the return value indicates the number of elements that have been erased.

   // erase element equal to 4
    set_three.erase(4);
 
    // erase element five   
    set<int>::iterator five = set_three.find(5);
    set_three.erase(five);
    
    // erase all values between seven and eleven
    set<int>::iterator seven = set_three.find(7);
    set<int>::iterator eleven = set_three.find(11);
    set_three.erase (seven, eleven);
    

If the underlying element type provides a destructor, then the destructor will be invoked prior to removing the element from the collection.

Searching and Counting

The member function size() will yield the number of elements held by a container. The member function empty() will return a boolean true value if the container is empty, and is generally faster than testing the size against zero.

The member function find() takes an element value, and returns an iterator denoting the location of the value in the set if it is present, or a value matching the end-of-set (the value yielded by the function end()) if it is not. If a multiset contains more than one matching element, the value returned can be any appropriate value.

   set<int>::iterator five = set_three.find(5);
    if (five != set_three.end())
        cout << "set contains a five" << endl;
 

The member functions lower_bound() and upper_bound() are most useful with multisets, as with sets they simply mimic the function find(). The member function lower_bound() yields the first entry that matches the argument key, while the member function upper_bound() returns the first value past the last entry matching the argument. Finally, the member function equal_range() returns a pair of iterators, holding the lower and upper bounds.

The member function count() returns the number of elements that match the argument. For a set this value is either zero or one, whereas for a multiset it can be any nonnegative value. Since a non-zero integer value is treated as true, the count() function can be used to test for inclusion of an element, if all that is desired is to determine whether or not the element is present in the set. The alternative, using find(), requires testing the result returned by find() against the end-of-collection iterator.

   if (set_three.count(5))
       cout << "set contains a five" << endl;
 

Iterators

No Iterator Invalidation

The member functions begin() and end() produce iterators for both sets and multisets. The iterators produced by these functions are constant to ensure that the ordering relation for the set is not inadvertently or intentionally destroyed by assigning a new value to a set element. Elements are generated by the iterators in sequence, ordered by the comparison operator provided when the set was declared. The member functions rbegin() and rend() produce iterators that yield the elements in reverse order.

Set Operations

The traditional set operations of subset test, set union, set intersection, and set difference are not provided as member functions, but are instead implemented as generic algorithms that will work with any ordered structure. These functions are described in more detail in Chapter 14 (Set Operations). The following summary describes how these functions can be used with the set and multiset container classes.

Subset test

The function includes() can be used to determine if one set is a subset of another; that is, if all elements from the first are contained in the second. In the case of multisets the number of matching elements in the second set must exceed the number of elements in the first. The four arguments are a pair of iterators representing the (presumably) smaller set, and a pair of iterators representing the (potentially) larger set.

   if (includes(set_one.begin(), set_one.end(),
       set_two.begin(), set_two.end()))
          cout << "set_one is a subset of set_two" << endl;
 

The less than operator (operator <) will be used for the comparison of elements, regardless of the operator used in the declaration of the set. Where this is inappropriate, an alternative version of the includes() function is provided. This form takes a fifth argument, which is the comparison function used to order the elements in the two sets.

Set Union or Intersection

The function set_union() can be used to construct a union of two sets. The two sets are specified by iterator pairs, and the union is copied into an output iterator that is supplied as a fifth argument. To form the result as a set, an insert iterator must be used to form the output iterator. (See Chapter 2: Insert Iterators for a discussion of insert iterators.) If the desired outcome is a union of one set with another, then a temporary set can be constructed, and the results swapped with the argument set prior to deletion of the temporary set.

   // union two sets, copying result into a vector
    vector<int> v_one (set_one.size() + set_two.size());
 
    set_union(set_one.begin(), set_one.end(), 
       set_two.begin(), set_two.end(), v_one.begin());
 
    // form union in place
    set<int> temp_set;
    set_union(set_one.begin(), set_one.end(), 
       set_two.begin(), set_two.end(), 
       inserter(temp_set, temp_set.begin()));
    set_one.swap(temp_set);  // temp_set will be deleted
 

The function set_intersection() is similar, and forms the intersection of the two sets.

As with the includes() function, the less than operator (operator <) is used to compare elements in the two argument sets, regardless of the operator provided in the declaration of the sets. Should this be inappropriate, alternative versions of both the set_union() or set_intersection() functions permit the comparison operator used to form the set to be given as a sixth argument.

The operation of taking the union of two multisets should be distinguished from the operation of merging two sets. Imagine that one argument set contains three instances of the element 7, and the second set contains two instances of the same value. The union will contain only three such values, while the merge will contain five. To form the merge, the function merge() can be used (see Chapter 14: Merge Ordered Sequences). The arguments to this function exactly match those of the set_union() function.

Set Difference

There are two forms of set difference. A simple set difference represents the elements in the first set that are not contained in the second. A symmetric set difference is the union of the elements in the first set that are not contained in the second, with the elements in the second that are not contained in the first. These two values are constructed by the functions set_difference() and set_symmetric_difference(), respectively. The use of these functions is similar to the use of the set_union() function described earlier.

Other Generic Algorithms

Because sets are ordered and have constant iterators, a number of the generic functions described in Chapters 13 and 14 either are not applicable to sets or are not particularly useful. However, the following table gives a few of the functions that can be used in conjunction with the set data type.

Purpose
Name
Chapter / Section
Copy one sequence into another
copy
13 (Copy One Sequence into Another Sequence)
Find an element that matches a condition
find_if
13 (Find an Element Satisfying a Condition)
Find a subsequence within a set
search
13 (Find a Subsequence within a Sequence)
Count number of elements that satisfy condition
count_if
13 (Count the Number of Elements...)
Reduce set to a single value
accumulate
13 (Reduce Sequence to a Single Value)
Execute function on each element
for_each
13 (Apply a Function to All Elements in a Collection)

Top of document