2.1.1. The LEDA set of rules, handle types, and the categorization of the LEDA types

Learning objectives

LEDA rules in general
The LEDA rule “Definition with initialization by copying
Handle scheme and reference counting
string as a handle-like type
The categorization of the LEDA types

The LEDA set of rules

In the last program the strings s and t are defined with initialization by copying. A syntactically equivalent form for the initialization of t is the definition

string t = "LEDA";

This possibility of definition with initalization by copying exists for every LEDA type. This is a firm rule, which is always valid, namely the LEDA ruleDefinition with initialization by copying”.

This rule originates from a set of rules which must be adhered to when programming with LEDA in order to write syntactically and semantically correct and efficient LEDA programs. The rule stating that definition with initialization by copying is possible for every LEDA type is one of them. We will get to know these rules by and by. The impatient reader can obtain a summary in Appendix B, The golden LEDA rules.

Handle types and handle-like types

But what exactly is a copy of a string? What do statements like

t = s;

or

string t(s);

really do? As experience shows, in programs working with many string variables many of these variables have the same value, i.e., they refer to the same character string. It would be a great waste of memory if every variable denoted an object of its own, i.e. a separate area in memory. LEDA always tries to offer an implementation of a data type as clever as possible, that is as memory and/or space saving as possible. It therefore works with a handle scheme and with reference counting: If the string variable s is assigned to the variable t by the statement t = s or if t initialized with a copy of s, then not the individual characters are copied from the physical memory location of s to the memory location of t, but rather s and t are made to point to the same physical memory location. In this spirit, they are “handles” by which the actual memory location is accessed. The actual string, i.e. the sequence of the characters in memory, only exists once and LEDA counts how many handles refer to it.

Not until a string is modified, for example by an access by [], the two handles are actually “decoupled”, that is, new storage is requested and a copy is made. In this regard strings, concerning their implementation, are very similar to pointers, but thanks to LEDA everything happens automatically in the background; the user does not have to worry about the difficulties usually occurring when using pointers and dynamic memory allocation.

Some types of LEDA follow this handle scheme with reference counting, for example the type point for points in the plane and the type integer for arbitrarily large integers. Accordingly, these types are called handle types.

By various reasons pertaining the implementation, which we do not want to dwell on here, the type string is not a real handle type. According to this, the class string is only referred to as a handle-like type.

Categorization of the LEDA data types

With regard to the way how an object of a LEDA type is copied and with regard to the question how such an object is built up from partial objects or how it must occur in a comprehensive object, all data types of LEDA fit in a system which is illustrated in Appendix A, Categorization of the LEDA data types. We will get to know this categorization by and by. Therein the type string is subclassified as a handle-like type under the item types, which in turn represent a subclass of the primitive types.

Bypassing the handle scheme

Unlike the assignment operator, the operator +=, of whom we have taken advantage in the program for palindrome testing to reverse a string, always makes a copy of a string without us noticing this. Thereby, when reversing a string as we have implemented it until now, for each and every character of the original string appended to the reversed string, a copy of a string is made in defiance of the handle scheme. An unnecessary waste of memory! So how can we reverse a (long) string memory-efficiently? The following program contains two tricks. The first one is a trick to form arbitrarily long palindromes; it has to be ascribed to the English language. The second one is a programming trick which avoids the copy rage of +=.

Filename: PalindromeGenerator.C
#include <LEDA/string.h>
#include <iostream>

using leda::string;
using std::cout;

string reverse_efficient(const string& s) {
  string result = s;

  int i = 0;
  int j = result.length() - 1;

  while(i < j) {
    char tmp = result[i];
    result[i] = result[j];
    result[j] = tmp;
    i++; j--;
  }
  
  return result;
}


int main() {
  string s;
  
  cin >> s;

  string s_quoted = "'" + s + "'";
  
  string r = reverse_efficient(s_quoted);

  string outstring = s_quoted + " sides reversed is " + r;

  cout << outstring << "\n";
  cout << reverse_efficient(outstring) << "\n";
}

The output of the program with the input Joachim is

'Joachim' sides reversed is 'mihcaoJ'
'Joachim' si desrever sedis 'mihcaoJ'

and the first trick should be obvious thereby.

The efficiency of the function reverse_efficient() is due to the fact that altogether, only one single copy is made, namely in that moment result[0] is write-accessed for the first time. The swapping of the characters happens by two indices i and j, inside this copy. There i runs from front to rear and j from rear to front until they meet in the middle. In each step the characters at positions i and j are swapped.

Implementation and further information

Strings are implemented by character arrays of C++.

If the string s consists of n characters and the string t of m characters, the following holds: All operations containing a search for the substring t in the string s take time O(n·m). (We will get to know the meaning of the O-Notation in Section 2.2.3.) Accessing an individual character by [] takes constant time. All other operations take time O(n).

The class string possesses yet some other methods. Furthermore, the methods introduced above can partly be invoked with different or additional arguments. We do not want to introduce all of these possibilities here. We refer to the corresponding manual page of the class string instead.

Tips for the use of the class string

  • In order to create strings from values of other types like double the constructor

    string s(const char* format, ...);

    can be used. Like the C function scanf() it awaits a format string and after that a list of the objects to be turned into a string.

  • The class string should in general be preferred to C++ strings of type char[].

  • In case of many time critical input/output operations the type char[] may be more efficient.

  • Whether the LEDA class leda::string or the homonymous class std::string of the C++ standard library is used, is a question of taste and habit. However, as it is described in Section “The namespace leda”, in both cases one has to pay attention to the correct namespace, otherwise one does not know which class one actually uses!

Exercises

Exercise 1.

Turn an int value and a double value into a string without using functions of the C or C++ standard library.

Exercise 2.

Write a function that turns a string which only consists of numbers into an int value. Then write a function which tests whether there is a character occurring more than once in a string.

Calculate the MAXNONREPMIRP number with the help of the functions above and the function for reversing a string. This is the largest (MAX) prime number which is also a prime when read backwards (MIRP) and in which no number occurs more then once (NONREP).




Imprint-Dataprotection