*Learning objectives*

The methods join(), intersect(), and diff() |

We turn to the set operations intersection, union,
difference, and symmetric difference now. These also are
supported by the class `set`.

Let us imagine that we run a big marriage bureau. The
general isolation and loneliness of the modern person in the
information society guarantees a large clientele for us. New
customers join every year (because their willingness simply to
address someone nice at the bus stop or in the café
decreases more and more thanks to SMS) and some sign off
(because we have found a compatible partner for them, or because
they are simply sick and tired of the other sex). The customer
numbers of the customers at the end of the years 2001 and 2002
are stored in the files `Customers01.txt` and
`Customers02.txt`, respectively.

The file `Customers01.txt` looks for example
like this:

7174 1234 5557 8181 9999 2788 5696

The file `Customers02.txt` like this:

1234 8181 2788 5696 2335 8912

We want to know now which customers we had within the two years altogether, which customers we attended to both in the first and in the second year, which customers we newly acquired in the year 2002, and which ones we lost. For this we must unite the sets of the customer numbers, intersect and subtract them from each other, respectively. This is what following program does.

#include <LEDA/set.h> #include <fstream> #include <iostream> using leda::set; using std::ifstream; using std::cout; using std::endl; int main() { ifstream is01("Data/Customers01.txt"); ifstream is02("Data/Customers02.txt"); set<int> S, T; int customer_id; while(is01 >> customer_id) T.insert(customer_id); while(is02 >> customer_id) S.insert(customer_id); set<int> All_customers = S.join(T); set<int> Both_years_customers = S.intersect(T); set<int> New_customers = S.diff(T); set<int> Signedoff_customers = T.diff(S); cout << "Altogether we had the following customers:\n"; forall(customer_id, All_customers) cout << customer_id << " "; cout << endl; cout << "The following were customers in both years:\n"; forall(customer_id, Both_years_customers) cout << customer_id << " "; cout << endl; cout << "The following customers are new:\n"; forall(customer_id, New_customers) cout << customer_id << " "; cout << endl; cout << "The following customers have signed off:\n"; forall(customer_id, Signedoff_customers) cout << customer_id << " "; cout << endl; }

The program outputs the following:

Altogether we had the following customers: 1234 2335 2788 5557 5696 7174 8181 8912 9999 The following were customers in both years: 1234 2788 5696 8181 The following customers are new: 2335 8912 The following customers have signed off: 5557 7174 9999

The method

S.join(T);

unites two sets `S` and `T` and
returns the union set.

In contrast, the method

S.intersect(T);

intersects two sets `S` and `T` and
returns the difference set.

The method

S.diff(T);

calculates the set difference of two sets `S` and
`T`, that is, it returns a set consisting of all elements
which are contained in `S`, but not in
`T`.

Furthermore, there is the method

S.symdiff(T);

It calculates the *symmetric
difference* of two sets and returns it. This is the set of
all the elements of both sets which are not contained in the
intersection.

With the overloaded comparison operators two sets can be tested for equality, inequality, proper and improper inclusion. For example, the expression

S < T;

tests whether `S` is a proper subset of `T`.

Finally, there are the methods
`size()`,
`empty()`, and
`clear()` with the same meaning as in
the case of stacks and queues.

The class `set` is implemented by
randomized search trees.

The operations `insert()`,
`del()`, and
`member()` take time O(log(n)), if the
set contains n elements. `empty()` and
`size()` take time O(n), and
`clear()` takes constant time. If
`S` and `T` are two sets
with n and m elements, respectively, then
`S.join(T)` and
`S.diff(T)` take time
O(m·log(n+m)) and
`S.intersect(T)` takes time
O(n·log(n+m)).

Furthermore, there are some operators of the class
`set` we did not mention. These are
described on the corresponding manual page.