*Learning objectives*

The classes `node_array` and `edge_array` |

The class `GRAPH` |

The classes `node_map` and `edge_map` |

Breadth first search with `BFS()` |

As we already mentioned, it is necessary in most problem to store additional information with the nodes and/or the edges in order to access this information later starting from a certain node or a certain edge. Therefore LEDA offers several ways to do this, which we will got to know in the following.

We already introduced the classes * node_array*
and

`edge_array`

The definitions

node_array<I> A(G); node_array<I> B(G, x);

create two node arrays `A`

and `B`

that associate a value of type `I`

to every node of the
*already constructed* graph
`G`

. In the first case, a default value of type
`I`

is assigned to every node; in the second case, the
explicitly denoted value `x`

is assigned.

The value `v`

that is associated to a
node is accessed via the subscript operator, which, as always,
returns a reference to the corresponding value:

A[v] = y;

Node arrays and edge arrays offer a very flexible way to
associate information to the nodes and edges of a graph: On a
given graph, one can define *arbitrarily
many* such arrays *at an arbitrary moment of
the execution*. Admittedly, the graph G has to be
completely constructed at the time of the definition of these
arrays. The node array is *valid* only for the nodes
that are contained in G at this moment:

Warning | |
---|---|

If additional nodes (or edges) are added to a graph that a node array (or an edge array) is already defined on, the index set of this array is not augmented by these nodes (or edges)! A sequence of statements such as the following creates an error message: node_array<int> dist(G); node v = G.new_node(); dist[v] = 42;
Here, the use of a |

In some cases, one knows that the graph G will never contain more than n nodes and that this n is not very large. Then the constructor

node_array<I> A(G, n, x);

of `node_array`

may be worthwhile:
It allocates the space for n nodes altogether in the
node array *a priori*, and associated the value x to every node. If m
is the number of nodes in G at the time of this definition,
n >= m must hold, and only n-m nodes
altogether may be added to G without the node array becoming
invalid.

As already said, arbitrarily many node arrays and edge arrays can be defined on a graph G. Instead of defining many different node arrays and edge arrays, the following alternative, which is applicable if the number of the arrays to be defined is known a priori, is more efficient: Provided that n_slots node arrays and e_slots edge arrays are needed, the constructor

graph G(n_slots, e_slots);

of the class `graph`

creates a graph in which the
internal structures for the representation of the nodes and edges
contain space for additional entries (so-called “slots”)
from n_slots node arrays and e_slots edge arrays. (We will not delve
into the internal representation and the way of indirection here.) To
use these additional entries for a certain node array
`A`

, this array has to be created as follows:

node_array<I> A; A.use_node_data(G, x);

This assigns one of the slots to `A`

and initializes
all entries of the array to `x`

. If no slot is
available, `A`

is created as an ordinary node
array.

The next possibility for associating information to nodes
or edges is offered by the class `GRAPH`

.
This class is suited in particular for *dynamic graphs*, that is, for
graphs in which nodes and edges are added or deleted often and
at unpredictable points in time.

The class * GRAPH* is
derived, in the sense of C++, from the class

`graph`

; it realizes so-called `I1`

and a value of a certain information type `I2`

is
associated to every node and every edge, respectively.For example, a definition

GRAPH<string,int> G;

defines `G`

as a parameterized graph in which a value
of type `string`

is associated to every node and
a value of type `int`

is associated to every edge.

A new node is created in `G`

with

G.new_node(s);

With this, the information (here: the string) `s`

is
associated to the new node. Accordingly, an additional information
`i`

can be passed also to the method
`new_edge()`

; `i`

is
associated to the newly created edge:

G.new_edge(v1, v2, i);

The value associated to a node (an edge) can be accessed via a suitable node item
`v`

(a suitable edge item `e`

)
with the subscript operator:

node v; edge e; //... v and e get their values ... G[v] = "LOVE"; G[e] = 42;

This data structure is *fully dynamic*, that is,
information can be associated to the nodes and edges without
restrictions, even if the graph grows or shrinks
heavily. Furthermore, accessing the information via a node item
or an edge item is a little more efficient than accessing it
with a node array or an edge array. Only
*one* piece of information, however, can be
stored per node and per edge in a `GRAPH`

:

Note | |
---|---|

If one wants to associate several variables to a node or to an edge,
one can use a composite type as a user-defined type
parameter. Another possibility is to use one or several
additional |

As a working example we want to implement our search
“From LOVE to PAIN” from Section “A working example: Breadth first search in a graph” once again. This time, however, we do
not want to write the algorithm for breadth first search by
ourselves, of course: As any other fundamental graph
algorithm, also this one is built into LEDA. It is available by
the function * BFS()*.

What do we have to do to use this built-in function
`BFS()`

properly in our problem? At first,
we have to build the graph *explicitly* that
the function has to search in. We did not do this in Section “A working example: Breadth first search in a graph”; there, we did not have a data
structure for graphs yet, and we created the graph only
*implicitly* during the program run by
traversing it with breadth first search.

Again we read our list of 4-letter words from a file and
create a node for every word read. This node has to know which
word it carries. To associate this information, we use a
`GRAPH<string,int>`

. Here the type
`int`

of the edge information is not important - we
dot not associate information to the edges, but we have to
declare an information type; therefore we set all edge values to
0.

After we have created a node a for the word A, we look up
whether there are nodes in the graph constructed so far that
have to be neighbors to a. We find these nodes with the same
method as in Section “A working example: Breadth first search in a graph”: We create
all possibly adjacent words B by replacing the letters of A, and
look up whether B already occurs in the graph. When we find such
a node, we have to insert the corresponding edges to establish
the neighborship relations. For every edge, the reverse edge has
to be inserted, because the derivation relation of the word game
is symmetric: If the word B is derivable from the word A, then
also A is derivable from B. We find the node b that corresponds
to the word B in the graph already constructed as follows: For
every word, we have to memorize the node corresponding to this
word; this is easily done with a
`d_array<string,node>`

. This
`d_array`

serves also for the foregoing
test whether a certain word B occurs at all as a node in the
graph.

After the graph `G`

has been completely
constructed, we let the user insert two words
`from`

and `to`

, for which a
derivation is to be searched. A suitable call

BFS(G, start, dist, pred);

executes a breadth first search in `G`

, starting from
the node `start`

. The distance information that is
calculated in this call for every visited node `v`

is
stored in the node array `dist`

; the edge via that
`v`

is reached in the search, and which therefore lies
on a shortest path from `start`

to
`v`

, is stored in the node array
`pred`

. We use this node array
`pred`

at the end in order to walk along a shortest
path from the node `to`

back to the start node
`from`

.

#include <LEDA/graph.h> #include <LEDA/d_array.h> #include <LEDA/list.h> #include <LEDA/basic_graph_alg.h> #include <fstream> #include <iostream> using leda::GRAPH; using leda::node; using leda::edge; using leda::node_array; using leda::d_array; using leda::string; using leda::list; using std::cout; using std::cin; int main() { std::ifstream is("Data/4LetterWords.txt"); GRAPH<string,int> G; d_array <string,node> D; string s; // Read all words. Each word becomes a node of G while (is >> s) { node v = G.new_node(s); D[s] = v; // Which node belongs to which word? // Find all adjacent words and record adjacency in G // Step through all 4 letters of s for(int i = 0; i < 4; i++) { string s_adjacent = s; // Step through all letters of the alphabet for(char c = 'A' ; c <= 'Z'; c++) { if(c == s[i]) continue; // Candidate adjacent word: replace letter i s_adjacent[i] = c; if(D.defined(s_adjacent)) { // Adjacent word found; record adjacency by edges G.new_edge(D[s], D[s_adjacent], 0); G.new_edge(D[s_adjacent], D[s], 0); // backward edge } } } } // Let user input search words string from, to; cout << "Start with word: "; cin >> from; cout << "End with word: "; cin >> to; if(!D.defined(from) || !D.defined(to)) { cout << "At least one of the words is not a part of the word list!\n"; exit(1); } // Call Breadth First Search on G node_array<int> dist(G,-1); node_array<edge> pred(G); BFS(G, D[from], dist, pred); // Output distance and path from <from> to <to> int distance = dist[D[to]]; cout << "The distance from " << from << " to " << to; cout << " is " << distance << "\n"; if(distance == -1) { cout << "There is no way from " << from << " to " << to << ".\n"; } else { cout << "Here is a way to reach it:\n"; node v = D[to]; node start = D[from]; list<node> L; L.push(v); // Output nodes in reverse order later while(v != start) { v = G.opposite(v, pred[v]); // v = G.source(pred[v]); is also possible here L.push(v); } int dcount = 0; forall(v, L) cout << dcount++ << " " << G.inf(v) << "\n"; } }

The shortest path from LISP to JAVA reads as follows:

Start with word:End with word:`LISP`

The distance from LISP to JAVA is 5 Here is a way to reach it: 0 LISP 1 LIMP 2 LAMP 3 LAMA 4 LAVA 5 JAVA`JAVA`

So short? Yes, after all, all Turing complete programming languages are equal...

The function `BFS()`

requires a node
array `node_array<int>`

that is valid
on `G`

and whose values are all initialized to
-1. Furthermore, it returns a list of the nodes that have been
reached in the search; this return value is not used here.

We get the way from `to`

back to
`from`

by following the references in
`pred`

. For an edge and a node that is the
source (the target) of the edge, the method
`opposite()`

returns the opposite node,
which is then the target (the source) of the edge; we will
describe this function in more detail in Section 5.2.4.

To print the derivation steps in the correct order, we
collect all words on the way back in a list and print this list
in reversed order. (This is done implicitly here, by appending
every word *in front of* the list and, at
the end, outputting this list from the front to behind.)

For a node_item or an edge_item `it`

, the method

G.inf(it);

returns a copy of the value
associated to the corresponding node or edge. In contrast to the
operator `[]`

, the value cannot be
changed by this.

There is a third possibility to provide graphs with
information: Node maps and edge maps are a fully dynamic
alternative to node arrays and edge arrays. They are offered by
LEDA in the form of the classes * node_map*
and

`edge_map`

The definitions

node_map<I> A(G); node_map<I> B(G,x);

define two node maps on the graph `G`

. In the first
definition, the default value of type `I`

is associated to
every node (in the node map `A`

); in the second
definition, the explicitly denoted value `x`

is
associated (in the node map `B`

). Edge
maps are defined accordingly.

As with node arrays and edge arrays, the information
associated to a node or an edge is accessed with the subscript
operator `[]`

.

Node maps and edge maps use hashing to associate information to nodes and edges. The definition of a node map and an edge map takes constant time (in contrast to O(n) and O(m) for node arrays and edge arrays on a graph with n nodes and m edges, respectively), and accessing the information associated to a node or an edge has constant costs (as the expectation value).

As with node arrays and edge arrays, if the number of node maps and edge maps to be defined is known a priori, there is the possibility to use one of slots for the node values created in the constructor of the graph:

graph G(n_slots, e_slots);

Such a slot can be used by the following definitions:

node_map<I> A; A.use_node_data(G, x);

By the way, the classes `node_array`

and
`node_map`

share these slots and act exactly the
same when using one of these slots with
`use_node_data()`

.

When shall one use which of the possibilities presented in this section? Table 5.1 tries to give some hints. Therein, n denotes the number of nodes in a graph. The table shows only the possibilities to associate information to a node; for the edges the corresponding statements hold with the corresponding types.

The main criterion is the question whether the graph is
dynamic and whether it therefore will change
*after* the definition of the respective
structure, or whether it is static. Only in the latter case, node
arrays and edge arrays are an option.

If the graph is static, the question is important to how many nodes and edges information shall be associated. If these are only a few, a node map or an edge map takes only little disk space and should therefore be preferred. If, in contrast, information is to be stored for more than half of the nodes or edges, node maps and edge maps are worthwhile.

**Table 5.1. 5 possibilities to associate information to a graph**

`node_array` | `G(n_slots,e_slots)` with `A.use_node_data(G,x)` | `node_array<I> A(graph G, int n, I x)` | `GRAPH` | `node_map` | |

Fully dynamic? | no | no | only for n - |V| further calls of `new_node()` | yes | yes |

Number of possible definitions on a given graph | arbitrarily many | n_slots | arbitrarily many | 1 | arbitrarily many |

Access time | O(1), fast | O(1), very fast to fast | O(1), fast | O(1), very fast | O(1) expected |

Time for initialization | O(n) | O(n · n_slots) | O(n) | O(n) | O(1) |

The manual pages for
`node_array`

,
`edge_array`

,
`GRAPH`

,
`node_map`

, and
`edge_map`

contain further information about the respective classes.