Learning objectives
| Sorting lists |
| Removing duplicates from lists |
| Outputting lists |
| Merging lists |
In principle, there can be four reasons to sort a collection of objects:
A problem definition of the latter
kind is referred to as a unification
problem. Let us imagine for example
that we have a list of names in which many names occur
repeatedly. This could be entries in a visitor's book of
a web site, in which the same people make their highly
qualified comments again and again. We want to know now which
different people have left an entry. We
have stored the list of names in a file
SomeNames.txt. It could
look as follows:
Doris Böhm Christian Uhrig Doris Böhm Mick Jagger Christian Uhrig
To find out only the different
names therein, we read in all names into a list L, sort the list by a
call of
L.sort();
and remove all duplicates by
L.unique();
from the sorted version:
#include <LEDA/list.h>
#include <LEDA/string.h>
#include <iostream>
using leda::list;
using leda::string;
int main()
{
list<string> L;
string s;
while(std::cin) {
s.read_line();
L.append(s);
}
L.sort();
L.unique();
L.print('\n');
}
We get our input by redirecting standard input.
$ NameUniq < Data/SomeNames.txt
Doris Böhm
Christian Uhrig
Mick Jagger$![]() | Note |
|---|---|
Actually, the two blank lines at the beginning of the
above output should not appear. The first one is due to the fact
that |
The method sort() arranges the
linking information of the containers in a sorted
order. Precondition is that the attribute type T
be linearly ordered by a comparison function
int compare(const T&, const T&);
Such a function is already predefined for strings. We will learn how to sort self-defined types in Section 2.3.5. We will get to know more about linearly ordered types in general in Section 2.8.3.
The method unique() removes
duplicates from a list. Here it is precondition that the list
be sorted in accordance with the order defined on the
attribute type.
With a call of
L.print();
a list can be output in an easy and comfortable way. A separator, which is output between the single values, can be passed to the method. It can even be told by specification of an output stream where it shall direct the output to. Its complete signature is
print(ostream& O, char space = ' ');
There are also other sorting methods:
L.merge_sort(); L.bucket_sort(f);
are stable sorting
algorithms, that is, if container A has the same contents
as container B, and A precedes B in the sequence of the containers
before the sorting, then the same still holds after the
sorting. merge_sort() is faster than
sort() if the list contains long, pre-sorted
intervals.
bucket_sort() needs a function
f() as argument that “throws the
individual elements into numbered buckets”. This means it
must return an integer number for every element of the attribute
type T, which should originate from a not too large
interval [l..r]. (The return value is the number of the bucket in
which the element will land.) Then it arranges the containers in
an order of increasing values of f. To be
able to use bucket_sort() efficiently,
one should know the exact algorithm; it can be looked up for example in
LEDA, A Platform for Combinatorial and Geometric Computing.
The method merge() is useful in
some algorithms. It merges two already pre-sorted lists into a
sorted list. In a call
L.merge(M);
the containers of M are inserted into
L so that L remains sorted.
M is emptied!