2.1. Strings (class string)

Learning objectives

The class string
Creating, copying, modifying, and concatenating strings
Accessing substrings

In the beginning was the Word and the Word was with God and the Word was God.” It is already a sciptural wisdom that words are tremendously important. We therefore start with the data type of LEDA which is useful for storing and processing words and sentences, the type string, one of the simplest data types of LEDA.

A string is a sequence of several (maybe null) characters. The LEDA type string is therefore closely related to the C++ type char[], however, it has a number of advantages over the latter type, which we will soon get to know.

First of all we want to use this type in order to play a bit with words. Many further examples of this tutorial are games with words and sentences. With them complex facts can often be represented in a vivid and amusing way.

EVA, CAN I STAB BATS IN A CAVE? Pardon? The author is so angry about the complexity of the language C++ and would like to agonize innocent animals?[4] No, he has formed a sentence which reads the same forward as backward! Such a sentence or word is called a palindrome. (Blanks, punctuation marks and case are neglected when building palindromes, so that the definitions of word palindromes and sentence palindromes are equivalent.) Some other famous palindromes are “Doc, note: I dissent a fatness. I diet on cod.” and “A man, a plan, a canal - Panama!”.

At first let us write a LEDA program which reads a string from standard input and tests whether this string is a palindrome. For this purpose the program creates the reversed string from the input string, that is the string with reversed order of the individual characters, and compares it with the original one. To reverse the string, it successively appends all characters from behind to a second string. This program already reveals a lot of qualities of LEDA in general and of LEDA strings in particular.

Filename: PalindromeTest.C
#include <LEDA/string.h>
#include <iostream>

using leda::string;
using std::cout;

int main() {
  string s; // default initialization

  s.read_line();

  string r; 

  for(int i = s.length() - 1 ; i >= 0 ; i--) 
    r += s[i];   // inefficient, see below!

  if(s == r) // fast comparison of handle-like type
    cout << s << " is a palindrome\n";
  else
    cout << s << " is not a palindrome\n";
}

Here the string s is first defined with default initialization, that is, the variable s is set to the default value of the data type[5] in the moment of the definition. This is typically the “simplest” value of the type; concerning strings, this is the empty string, that is the string consisting of null characters.

Strings offer a large number of operations.[6] By a call of the method

s.read_line();

a string s can be read from standard input. Thereby, the line separator \n serves as the end marker of a line.

The above program appends the respectively next character of the string s to the string r by using the operator +=. In general a string s can be appended to the string r by

r += s;

In doing so, the operator += returns a reference to r.

The individual characters of a string are accessed by the subscript operator []:

char& c = s[i];

It returns a reference to the i-th character so that this character can also be modified by operator [].

The comparison operator

s == r;

compares two strings s and r.

In comparison with the type char[] of C++ it stands out that we do not have to take care of storage allocation here. The string r becomes longer and longer by the operator += in a magical way.

Further operations on strings

Let us have a look at some further basic operations of the class string. We create a string, access parts of it and replace them by other strings or completely delete them. Finally, we concatenate a couple of strings.

Filename: StringBasics.C
#include <LEDA/string.h>
#include <iostream>

using leda::string;
using std::cout;
using std::endl;

int main()
{
  string s = "The LEDA Tutorial"; // LEDA rule 1

  cout << s.head(3) << endl; 
  cout << s(4,8) << endl;    // substring
  cout << s.tail(8) << endl; 

  string t("LEDA"); // equivalent to t = "LEDA";

  cout << "LEDA is a substring starting on position " << s.pos(t) << endl;

  s = s.insert("wonderful ",4); 
  cout << s << endl;
  
  s = s.replace("LEDA Tutorial","soupstone");
  cout << s << endl;

  s = s.replace_all("soupstone","soupstone by Dr. Hook");
  cout << s << endl;

  s = s.del("soupstone by ");
  cout << s << endl;

  string u;
  u = s;  // let u and s reference the same memory
  u = u + " sings '" + t + " " + t + " " + t +"'" ;
  cout << u << endl;
}

The output of the program is

The
LEDA
Tutorial
LEDA is a substring starting on position 4
The wonderful LEDA Tutorial
The wonderful soupstone
The wonderful soupstone by Dr. Hook
The wonderful Dr. Hook
The wonderful Dr. Hook sings 'LEDA LEDA LEDA'

The method

s.pos(t);

searches in a string s for the substring t and returns the position at which t occurs for the first time in s and -1 if the substring t does not occur, respectively. The count starts at 0, that is, the first character of a string s is the character s[0].

The methods

s.head(i);
s.tail(i);

return the first and the last i characters, respectively, of the string s (in a new object of type string).

With the overloaded function operator

s(n,m);

the substring starting from the character at position n (exclusively) and ending with the character at position m (inclusively) can be extracted.

The method

s.insert(i, t);

inserts a string t at position i into s.

The method

s.replace(t, u);

replaces the fist occurrence of the substring t by the string u.

In contrast,

s.replace_all(t, u);

replaces all occurrences of t by u.

By

s.del(t);

the first occurrence of t in s deleted, that is, it is replaced by the empty string.

The operator

s + t;

concatenates two strings, that is, it returns a string which consists of all characters of s followed by all characters of t.

Finally, the assignment operator

s = t;

assigns a copy of the string t to a string s.

In this process none of these methods and operators changes the original string; in every case a modified string is returned.



[4] Yes, C++ sometimes is frustrating. The author has tried in this tutorial to use the basic concepts of this mighty programming language only. The main focus herein lies on the description of the types and algorithms of LEDA; therefore, most of the programs consist of a single main().

[5] If there is one. The built-in types like char, int, or double have no default value, nor have some types of LEDA.

[6] Admittedly LEDA's strings are not as powerful as the corresponding data types of languages like Perl and AWK; these languages are specialized in string processing. But LEDA's strings do bear comparison with the homonymous class string of the C++ standard library; both classes offer nearly the same functionality.




Imprint-Dataprotection