5.2.1. How to start

Learning objectives

 The class `graph` Creating nodes and edges The classes `node_array` and `edge_array` Using basic graph algorithms Topological sorting with `TOPSORT()` Iterating over all nodes and edges

Every author who starts to describe a connected topic complex, such as the LEDA system itself, must face the following, extremely important question, which is often hard to answer: "Oh my God, where shall I begin?" The problem is that the individual topics are dependent on each other: To understand what is said in section Y, the reader must have read section X first. Every good author takes care to arrange the sections so that the reader can read through the complete works from the front to behind once and must never already know on page x what stands written only later on page y. So the author should get by without forward references; then the reader gets by on one single pass through the text. Many otherwise outstanding works lack exactly this quality, what makes them difficult to understand for the non-adept.

How can the sections be arranged in that way? First of all, it makes sense to model the problem definition as a graph: For every section a node is introduced. If the contents of section X are a precondition for the understanding of the contents of section Y, then an edge (X, Y) from X to Y is drawn. The graph arising this way is called a dependency graph. Figure 5.4 shows such a dependency graph for a topic complex consisting of 8 sections.

Of course, we should begin our work with a section to which no edges are incident, that is, with a node v whose in-degree is 0. Only such a node does not have other nodes as prerequisite for the understanding. But what if there is no such node? Then we have lost; it is impossible to guide the reader through the work in one single pass: In every section he has already to know the contents of another section. If, in contrast, such a node v exists, we can start with v. We give it the number 1.

But how do we arrange the remaining nodes? Since we have already recognized the section v as the first of a good order, we can remove v from the set of the sections still to be arranged, that is, the node v can be removed from the graph. By this, the in-degree of all nodes previously adjacent to v is decreased by 1. Now the same question as before arises: With which node do we carry on? Again, we search for a node w with in-degree 0. If there is none, we have lost. Otherwise, we give it the number 2, remove it from the graph, and continue in this way until we either are left with an empty graph (and have given all nodes a number) or have stopped at a non-empty graph in which at least one edge leads to every node.

In case we have been able to assign a number N(v) to every node v, we have won: We arrange the sections of our work in the order given by the N(v)'s. This numbering N has a special characteristic: Whenever there is an edge (u,v) in the original graph, the number N(u) is smaller than the number N(v). A numbering with such a characteristic is called a topological sorting.

Exactly when can such a topological sorting be found? As one may easily suspect, the existence of such a sorting has to do with the existence of cycles in a graph. Exercise 48 asks to work out that a topological sorting exists if and only if a graph does not have any cycles.

A first program

Sorting a graph topologically is a basic graph algorithm problem. Of course, LEDA has a corresponding built-in function. As a first program, we want to build up the graph from Figure 5.4 by means of the structure that LEDA provides for this, and calculate a topological sorting.

LEDA's class for managing graphs is called, how wonder, `graph`. It is it an extremely powerful class, whose methods we will get to know one after the other.

A graph is created by

`graph G;`

First of all, to build up the structure of the graph, nodes must be inserted. This is done by a call of

`node v = G.new_node();`

This method returns an item of type `node`, which is a dependent item type. The corresponding node can be accessed later only via such an item; for this purpose, the returned item can be saved in a variable of type `node`. The node can be deleted from the graph at a later time via this item by a call of

`G.del_node(v);`

This removes also all leaving and incident edges of this node.

An edge can be inserted between two nodes `v1` and `v2` by

`edge e = G.new_edge(v1, v2);`

Also here, a dependent item is returned; it is of type `edge`. Later accesses to the edge just inserted are possible only via such an item. For example, the edge can be deleted from the graph by

`G.del_edge(e);`

With the methods `new_node()` and `new_edge()`, the combinatorial structure of a given graph can be constructed successively. The graph, however, still has no information associated! But it could be desirable to give every node a name, for example, the name of a corresponding section in a Pulitzer-Price suspicious tutorial on LEDA. And after all, the number of a topological sorting shall be stored with every node, too! Here a way to associate some information to a node is necessary. For this purpose, LEDA has ready the class `node_array` (among other classes for the same purpose). This class is used, like an ordinary `array`, via the subscript operator `[]`. Here the key indices are always of type `node`. Correspondingly, there is the class `edge_array`, with which information can be associated to the edges of a graph.

Note There are only very few problem definitions that can be solved with the class `graph` alone. As we have said, this class merely serves for the coding of the combinatorial structure of a graph. In most cases, additional classes, such as `node_array` or `edge_array`, are needed to hold additional information.

On a graph already constructed, a `node_array` for the numbering of a topological sorting is defined as follows:

`node_array<int> top_num(G);`

(Node arrays will be explained in detail in Section 5.2.2.)

The LEDA function `TOPSORT()` for calculating a topological sorting is used as follows:

`bool has_top_sort = TOPSORT(G, top_num);`

It returns `false` if the graph is cyclic and therefore does not have a topological sorting. (We will get to know a possible implementation of this function in Section 5.2.7.) The following program first builds up the above graph and then calls `TOPSORT()` to calculate a topological sorting.

Important If you work under Unix and would like to try out the programs of this chapter, you must keep in mind the following: Under Unix, the class `graph` and all other classes and algorithms introduced in this chapter (apart from the class `GraphWin` discussed in Section 5.2.5) are contained in the LEDA library `libG`. For this library to be linked in correctly, the programmer must take into account what was said in Section 1.3.1 about linking under Unix: The order `-lG -lL -lm` must be adhered to.For `GraphWin` the libraries `libW` and `libP` additionally have to be linked in. The link order must be `-lW -lP -lG -lL -lm`.

Here is the program:

Filename: AuthorsProblem.C
LEDA users as of version 5.0: Note that header files are now partitioned into modules (subfolders)!
```#include <LEDA/graph.h>
#include <LEDA/node_array.h>
#include <LEDA/basic_graph_alg.h>
#include <LEDA/string.h>

using leda::graph;
using leda::node;
using leda::node_array;
using leda::TOPSORT;
using leda::string;
using std::cout;

int main()
{
graph G;

node v0 = G.new_node();
node v1 = G.new_node();
node v2 = G.new_node();
node v3 = G.new_node();
node v4 = G.new_node();
node v5 = G.new_node();
node v6 = G.new_node();
node v7 = G.new_node();

G.new_edge(v0, v1);
G.new_edge(v0, v3);
G.new_edge(v1, v2);
G.new_edge(v2, v3);
G.new_edge(v4, v5);
G.new_edge(v5, v0);
G.new_edge(v5, v2);
G.new_edge(v6, v1);
G.new_edge(v6, v4);
G.new_edge(v7, v2);
G.new_edge(v7, v3);
G.new_edge(v7, v6);
G.new_edge(v0, v1);

node_array<string> name(G);
name[v0] = "A";
name[v1] = "B";
name[v2] = "C";
name[v3] = "D";
name[v4] = "E";
name[v5] = "F";
name[v6] = "G";
name[v7] = "H";

cout << "This graph has " << G.number_of_nodes() << " nodes and ";
cout                      << G.number_of_edges() << " edges.\n";

node_array<int> top_num(G);

if(!TOPSORT(G, top_num)) {
cout << "G has no topological sorting!\n";
}
else {
node v;

cout << "The following is a topological sorting:\n";
forall_nodes(v,G)
cout << name[v] << " " << top_num[v] << "\n";

G.sort_nodes(top_num);
cout << "Nodes sorted topologically:\n";
forall_nodes(v,G)
cout << name[v] << " " << top_num[v] << "\n";
}
}
```

The output of the program reads:

```This graph has 8 nodes and 13 edges.
The following is a topological sorting:
A 5
B 6
C 7
D 8
E 3
F 4
G 2
H 1
Nodes sorted topologically:
H 1
G 2
E 3
F 4
A 5
B 6
C 7
D 8```

So as a clever author, one could put the sections in the order H, G, E, F, A, B, C, D. This would burden no material not yet covered on the reader, nowhere. So he would never have to skim forwards then! With this method, a reader-friendly order of the sections can always be found, provided that the dependency graph does not have any cycles.[40]

To be able to use the function `TOPSORT()` or other basic graph algorithms, the header file `basic_graph_alg.h` must be included.

Since we do not use the edge items returned by `new_edge()` any further in the above program, we do not store them temporarily.

A call of

```G.number_of_nodes();
G.number_of_edges();```

returns the current number of nodes and edges of graph.

All nodes of a graph can be traversed by means of the macro

`forall_nodes(v, G) {...}`

It successively returns items `v` that refer to the individual nodes of a graph.

Accordingly, the macro

`forall_edges(e, G) {...}`

iterates over all edges. We will learn the details on how to iterate over graphs in Section 5.2.4.

In the internal representation of a `graph`, the nodes and edges lie in a certain order; this order is traversed by `forall_nodes` and `forall_edges`. The method

`G.sort_nodes(numbering)`

reorders the nodes internally by the order that has been passed in the node array `numbering`. In the previous program, we made use of this method to output the nodes in the order of their topological sorting. Corresponding to `sort_nodes()`, there is the method `sort_edges()` for sorting edges.

Implementation and further information

Graphs are implemented by two doubly linked, circular lists, one for the nodes and one for the edges; we will go into the details later. Most operations on a `graph` take constant time; if not, the required time mostly is obvious from the semantics of the operation.

Of course, the definite reference for the class `graph` is the corresponding manual page. This page contains a whole wealth of information - the interface of `graph` is exceptionally great. We will get to know the most important concepts and methods in the next sections step by step.

Exercises

 Exercise 48. Work out why the following holds true: A directed graph has a topological sorting if and only if it does not contain a cycle.Work out in addition why the method introduced in this section discovers such a sorting if there is one. Why does a topological sorting not have to be unique? Exercise 49. A graph is called planar if it can be drawn into the plane such that its edges do not intersect in any point.In an old children's game, 3 houses H1, H2 and H3 shall be connected to a waterworks W, an electric power plant E, and a thermal power station H by lines such that the lines do not cross, see Figure 5.5. This graph is called K3,3. We will get to know its significance for drawing graphs in more detail in Section 5.2.6.Build up this graph with the class `graph` and use the built-in function `Is_Planar()` to test whether this graph is planar. To use this function, the header file `graph_misc.h` has to be included.

[40] What is natural for a computer scientist and needs hardly any explanation to him, namely to arrange the individual sections in the order of a topological sorting when writing scientific texts, seems not be well known in other disciplines. So the author got the following feedback on the application of topological sorting introduced here from a physicist well-known to him who has been working in research for over 10 years and already has published numerous scientific papers: “Well, I'm flabbergasted. Since more then 10 years I've been wondering under great efforts again and again how I should arrange my sections. It was completely unclear to me that this can be done so easily with the help of a graph. You should publish this. This could be pioneering for the publication practice in all disciplines, included the arts.