Learning objectives
The classes node_array and edge_array |
The class GRAPH |
The classes node_map and edge_map |
Breadth first search with BFS() |
As we already mentioned, it is necessary in most problem to store additional information with the nodes and/or the edges in order to access this information later starting from a certain node or a certain edge. Therefore LEDA offers several ways to do this, which we will got to know in the following.
We already introduced the classes node_array
and edge_array
and their usage briefly in Section 5.2.1. These classes allow us
to associate additional information to the nodes and edges of a
graph. In the following, we will talk about node arrays only;
all that will be said holds accordingly for edge arrays.
The definitions
node_array<I> A(G); node_array<I> B(G, x);
create two node arrays A and B
that associate a value of type I to every node of the
already constructed graph
G. In the first case, a default value of type
I is assigned to every node; in the second case, the
explicitly denoted value x is assigned.
The value v that is associated to a
node is accessed via the subscript operator, which, as always,
returns a reference to the corresponding value:
A[v] = y;
Node arrays and edge arrays offer a very flexible way to associate information to the nodes and edges of a graph: On a given graph, one can define arbitrarily many such arrays at an arbitrary moment of the execution. Admittedly, the graph G has to be completely constructed at the time of the definition of these arrays. The node array is valid only for the nodes that are contained in G at this moment:
![]() | Warning |
|---|---|
If additional nodes (or edges) are added to a graph that a node array (or an edge array) is already defined on, the index set of this array is not augmented by these nodes (or edges)! A sequence of statements such as the following creates an error message: node_array<int> dist(G); node v = G.new_node(); dist[v] = 42;
Here, the use of a |
In some cases, one knows that the graph G will never contain more than n nodes and that this n is not very large. Then the constructor
node_array<I> A(G, n, x);
of node_array may be worthwhile:
It allocates the space for n nodes altogether in the
node array a priori, and associated the value x to every node. If m
is the number of nodes in G at the time of this definition,
n >= m must hold, and only n-m nodes
altogether may be added to G without the node array becoming
invalid.
As already said, arbitrarily many node arrays and edge arrays can be defined on a graph G. Instead of defining many different node arrays and edge arrays, the following alternative, which is applicable if the number of the arrays to be defined is known a priori, is more efficient: Provided that n_slots node arrays and e_slots edge arrays are needed, the constructor
graph G(n_slots, e_slots);
of the class graph creates a graph in which the
internal structures for the representation of the nodes and edges
contain space for additional entries (so-called “slots”)
from n_slots node arrays and e_slots edge arrays. (We will not delve
into the internal representation and the way of indirection here.) To
use these additional entries for a certain node array
A, this array has to be created as follows:
node_array<I> A; A.use_node_data(G, x);
This assigns one of the slots to A and initializes
all entries of the array to x. If no slot is
available, A is created as an ordinary node
array.
The next possibility for associating information to nodes
or edges is offered by the class GRAPH.
This class is suited in particular for dynamic graphs, that is, for
graphs in which nodes and edges are added or deleted often and
at unpredictable points in time.
The class GRAPH is
derived, in the sense of C++, from the class
graph; it realizes so-called parameterized
graphs. Here “parameterized” means that a value of a
certain information type I1 and a value of a certain information type I2 is
associated to every node and every edge, respectively.
For example, a definition
GRAPH<string,int> G;
defines G as a parameterized graph in which a value
of type string is associated to every node and
a value of type int is associated to every edge.
A new node is created in G with
G.new_node(s);
With this, the information (here: the string) s is
associated to the new node. Accordingly, an additional information
i can be passed also to the method
new_edge(); i is
associated to the newly created edge:
G.new_edge(v1, v2, i);
The value associated to a node (an edge) can be accessed via a suitable node item
v (a suitable edge item e)
with the subscript operator:
node v; edge e; //... v and e get their values ... G[v] = "LOVE"; G[e] = 42;
This data structure is fully dynamic, that is,
information can be associated to the nodes and edges without
restrictions, even if the graph grows or shrinks
heavily. Furthermore, accessing the information via a node item
or an edge item is a little more efficient than accessing it
with a node array or an edge array. Only
one piece of information, however, can be
stored per node and per edge in a GRAPH:
![]() | Note |
|---|---|
If one wants to associate several variables to a node or to an edge,
one can use a composite type as a user-defined type
parameter. Another possibility is to use one or several
additional |
As a working example we want to implement our search
“From LOVE to PAIN” from Section “A working example: Breadth first search in a graph” once again. This time, however, we do
not want to write the algorithm for breadth first search by
ourselves, of course: As any other fundamental graph
algorithm, also this one is built into LEDA. It is available by
the function BFS().
What do we have to do to use this built-in function
BFS() properly in our problem? At first,
we have to build the graph explicitly that
the function has to search in. We did not do this in Section “A working example: Breadth first search in a graph”; there, we did not have a data
structure for graphs yet, and we created the graph only
implicitly during the program run by
traversing it with breadth first search.
Again we read our list of 4-letter words from a file and
create a node for every word read. This node has to know which
word it carries. To associate this information, we use a
GRAPH<string,int>. Here the type
int of the edge information is not important - we
dot not associate information to the edges, but we have to
declare an information type; therefore we set all edge values to
0.
After we have created a node a for the word A, we look up
whether there are nodes in the graph constructed so far that
have to be neighbors to a. We find these nodes with the same
method as in Section “A working example: Breadth first search in a graph”: We create
all possibly adjacent words B by replacing the letters of A, and
look up whether B already occurs in the graph. When we find such
a node, we have to insert the corresponding edges to establish
the neighborship relations. For every edge, the reverse edge has
to be inserted, because the derivation relation of the word game
is symmetric: If the word B is derivable from the word A, then
also A is derivable from B. We find the node b that corresponds
to the word B in the graph already constructed as follows: For
every word, we have to memorize the node corresponding to this
word; this is easily done with a
d_array<string,node>. This
d_array serves also for the foregoing
test whether a certain word B occurs at all as a node in the
graph.
After the graph G has been completely
constructed, we let the user insert two words
from and to, for which a
derivation is to be searched. A suitable call
BFS(G, start, dist, pred);
executes a breadth first search in G, starting from
the node start. The distance information that is
calculated in this call for every visited node v is
stored in the node array dist; the edge via that
v is reached in the search, and which therefore lies
on a shortest path from start to
v, is stored in the node array
pred. We use this node array
pred at the end in order to walk along a shortest
path from the node to back to the start node
from.
#include <LEDA/graph.h>
#include <LEDA/d_array.h>
#include <LEDA/list.h>
#include <LEDA/basic_graph_alg.h>
#include <fstream>
#include <iostream>
using leda::GRAPH;
using leda::node;
using leda::edge;
using leda::node_array;
using leda::d_array;
using leda::string;
using leda::list;
using std::cout;
using std::cin;
int main()
{
std::ifstream is("Data/4LetterWords.txt");
GRAPH<string,int> G;
d_array <string,node> D;
string s;
// Read all words. Each word becomes a node of G
while (is >> s) {
node v = G.new_node(s);
D[s] = v; // Which node belongs to which word?
// Find all adjacent words and record adjacency in G
// Step through all 4 letters of s
for(int i = 0; i < 4; i++) {
string s_adjacent = s;
// Step through all letters of the alphabet
for(char c = 'A' ; c <= 'Z'; c++) {
if(c == s[i]) continue;
// Candidate adjacent word: replace letter i
s_adjacent[i] = c;
if(D.defined(s_adjacent)) {
// Adjacent word found; record adjacency by edges
G.new_edge(D[s], D[s_adjacent], 0);
G.new_edge(D[s_adjacent], D[s], 0); // backward edge
}
}
}
}
// Let user input search words
string from, to;
cout << "Start with word: ";
cin >> from;
cout << "End with word: ";
cin >> to;
if(!D.defined(from) || !D.defined(to)) {
cout << "At least one of the words is not a part of the word list!\n";
exit(1);
}
// Call Breadth First Search on G
node_array<int> dist(G,-1);
node_array<edge> pred(G);
BFS(G, D[from], dist, pred);
// Output distance and path from <from> to <to>
int distance = dist[D[to]];
cout << "The distance from " << from << " to " << to;
cout << " is " << distance << "\n";
if(distance == -1) {
cout << "There is no way from " << from << " to " << to << ".\n";
}
else {
cout << "Here is a way to reach it:\n";
node v = D[to];
node start = D[from];
list<node> L;
L.push(v); // Output nodes in reverse order later
while(v != start) {
v = G.opposite(v, pred[v]);
// v = G.source(pred[v]); is also possible here
L.push(v);
}
int dcount = 0;
forall(v, L)
cout << dcount++ << " " << G.inf(v) << "\n";
}
}
The shortest path from LISP to JAVA reads as follows:
Start with word:LISPEnd with word:JAVAThe distance from LISP to JAVA is 5 Here is a way to reach it: 0 LISP 1 LIMP 2 LAMP 3 LAMA 4 LAVA 5 JAVA
So short? Yes, after all, all Turing complete programming languages are equal...
The function BFS() requires a node
array node_array<int> that is valid
on G and whose values are all initialized to
-1. Furthermore, it returns a list of the nodes that have been
reached in the search; this return value is not used here.
We get the way from to back to
from by following the references in
pred. For an edge and a node that is the
source (the target) of the edge, the method
opposite() returns the opposite node,
which is then the target (the source) of the edge; we will
describe this function in more detail in Section 5.2.4.
To print the derivation steps in the correct order, we collect all words on the way back in a list and print this list in reversed order. (This is done implicitly here, by appending every word in front of the list and, at the end, outputting this list from the front to behind.)
For a node_item or an edge_item it, the method
G.inf(it);
returns a copy of the value
associated to the corresponding node or edge. In contrast to the
operator [], the value cannot be
changed by this.
There is a third possibility to provide graphs with
information: Node maps and edge maps are a fully dynamic
alternative to node arrays and edge arrays. They are offered by
LEDA in the form of the classes node_map
and edge_map.
The definitions
node_map<I> A(G); node_map<I> B(G,x);
define two node maps on the graph G. In the first
definition, the default value of type I is associated to
every node (in the node map A); in the second
definition, the explicitly denoted value x is
associated (in the node map B). Edge
maps are defined accordingly.
As with node arrays and edge arrays, the information
associated to a node or an edge is accessed with the subscript
operator [].
Node maps and edge maps use hashing to associate information to nodes and edges. The definition of a node map and an edge map takes constant time (in contrast to O(n) and O(m) for node arrays and edge arrays on a graph with n nodes and m edges, respectively), and accessing the information associated to a node or an edge has constant costs (as the expectation value).
As with node arrays and edge arrays, if the number of node maps and edge maps to be defined is known a priori, there is the possibility to use one of slots for the node values created in the constructor of the graph:
graph G(n_slots, e_slots);
Such a slot can be used by the following definitions:
node_map<I> A; A.use_node_data(G, x);
By the way, the classes node_array and
node_map share these slots and act exactly the
same when using one of these slots with
use_node_data().
When shall one use which of the possibilities presented in this section? Table 5.1 tries to give some hints. Therein, n denotes the number of nodes in a graph. The table shows only the possibilities to associate information to a node; for the edges the corresponding statements hold with the corresponding types.
The main criterion is the question whether the graph is dynamic and whether it therefore will change after the definition of the respective structure, or whether it is static. Only in the latter case, node arrays and edge arrays are an option.
If the graph is static, the question is important to how many nodes and edges information shall be associated. If these are only a few, a node map or an edge map takes only little disk space and should therefore be preferred. If, in contrast, information is to be stored for more than half of the nodes or edges, node maps and edge maps are worthwhile.
Table 5.1. 5 possibilities to associate information to a graph
node_array | G(n_slots,e_slots) with A.use_node_data(G,x) | node_array<I> A(graph G, int n, I x) | GRAPH | node_map | |
| Fully dynamic? | no | no | only for n - |V| further calls of new_node() | yes | yes |
| Number of possible definitions on a given graph | arbitrarily many | n_slots | arbitrarily many | 1 | arbitrarily many |
| Access time | O(1), fast | O(1), very fast to fast | O(1), fast | O(1), very fast | O(1) expected |
| Time for initialization | O(n) | O(n · n_slots) | O(n) | O(n) | O(1) |
The manual pages for
node_array,
edge_array,
GRAPH,
node_map, and
edge_map
contain further information about the respective classes.