2.4.1. Stacks with an unbounded number of elements (class stack)

Learning objectives

The class stack
Error and exception handling

Anyone out there who ever operated a pocket calculator of Hewlett-Packard before? As a rule, only two types of answers come from the ones who have done so before: Either “Oh God! Terribly complicated” or “Brilliant! With that I calculate every expression, no matter complex it be, with a few keystrokes only!” What is this due to?

These pocket calculators do not expect the expressions to be evaluated in the so-called infix notation to which we are used since elementary school. Infix notation means that in an arithmetic expression the operators are written down in between the operands. This notation has the rules that multiplication and division operators have precedence over the addition and subtraction operators, and that parentheses must be used to influence the evaluation order. The expression

3 * ( ( 1 + 2 ) * 3 + 5)

for example is in infix notation and has the value 42, a number playing a big role in computer science.[22]

Pocket calculators of Hewlett-Packard, however, expect the expressions to be in the so-called postfix notation, also named reverse Polish notation (RPN) after its inventor, the Polish logician Jan Lukasiewicz. There, operands always precede their operators and parentheses can completely be renounced. In postfix notation the expression equivalent to the one above reads

3 1 2 + 3 * 5 + *

The value of such an expression turns out from a simple procedure working with a stack: The expression is read from left to right. If a number is read, it is pushed on the stack. If an operator is read, the two topmost numbers are popped from the stack, arithmetically joined according to the operator just read, and the result is pushed on the stack again. The final result, then, is the one and only number which is left on top of the stack at the end.

A program which processes its input this way with the help of a stack is also referred to as a push down automaton. Figure 2.18 shows how a push down automaton for postfix notation evaluates the expression 8 + ( 1 + 2 ) * 4.

Figure 2.18. A stack for the evaluation of arithmetic expressions

A stack for the evaluation of arithmetic expressions

Here the expression 8+(1+2)*4 is evaluated (in reverse Polish notation) with the help of a stack.

The benefit of formulating an expression in postfix notation is due to the fact that parentheses and precedence rules can be completely renounced. The expressions are shorter than their infix equivalents and can be evaluated by a program (namely with the help of a stack) with less overhead. Once one has got used to this notation, one appreciates very much not to have to pay attention to parenthesis hierarchies any more when evaluating complex expressions. One is then very fast with a corresponding pocket calculator. One must first develop a feeling, though, how a large expression can be split into partial expressions.[23]

Let us now write a LEDA program which takes advantage of the class stack to evaluate arithmetic expressions, which so implements the above push down automaton.

We read the expression we want to evaluate from standard input. We must split it first into its tokens, that is its minimal syntactic units. In this case, these tokens are decimal numbers and the individual operator characters. (We ignore the problem of the unary minus here.) In order to additionally demonstrate a clean modularization, we hive off this partial task in a translation unit of its own. The header file Tokenizer.h looks as follows:

Filename: Tokenizer.h
namespace Tokenizer {

enum token_kind { 
  NUMBER, 
  ADD = '+', SUB = '-', MUL = '*', DIV = '/',
  END_OF_INPUT
};

extern token_kind current_token;
extern double current_number;

void get_next_token();

}

The function get_next_token() reads the respectively next token from the input and records the kind of the token in the variable current_token. If it is a NUMBER, then the function additionally holds the value of this number in the variable current_number.

Our main program uses a stack<double> to store the numbers occurring in the expressions. It repeatedly invokes the function get_next_token(). If this function returns the token NUMBER, the program puts this number onto the stack by

S.push(current_number);

If a token representing an operator is returned, the program takes the two topmost numbers from the stack by

number = S.pop();

Then it joins these numbers arithmetically according to the operator, and pushes the result of the arithmetic operation on the stack again.

The token END_OF_INPUT finally shows that all characters of the input were read. Then only one single number should be left on the stack; it is the result of the evaluation. This number can be determined by top():

S.top();

returns the topmost element of the stack without deleting it from the stack.

Filename: RPNEvaluator.C
#include <LEDA/stack.h>
#include "Tokenizer.h"
#include <iostream>

using leda::stack;
using namespace Tokenizer;

int main()
{
  
  stack<double> S;
  get_next_token();

  while(current_token != END_OF_INPUT) {
    if(current_token == NUMBER) 
      S.push(current_number);
    else {
      double num1 = S.pop();
      double num2 = S.pop();
      switch(current_token) {
      case ADD:
        S.push(num2 + num1);
        break;
      case SUB:
        S.push(num2 - num1);
        break;
      case MUL:
        S.push(num2 * num1);
        break;
      case DIV:
        S.push(num2 / num1);
        break;
      }
    }
    get_next_token();
  }

  std::cout << S.top() << "\n";
}

The LEDA class stack has a particularly simple interface. Apart from the methods push(), pop(), and top() used above, there are the following methods:

S.empty();

tests whether the stack is empty.

S.size();

returns the number of elements on the stack. And finally

S.clear();

deletes all elements and empties the stack.

To complete the picture we want to list the implementation Tokenizer.C. It determines the next token with the help of the known stream functions from the C++ standard library.

Filename: Tokenizer.C
#include "Tokenizer.h"
#include <iostream>

namespace Tokenizer {

token_kind current_token;
double current_number;

void get_next_token() {
  char c=0;
  std::cin >> c;
  
  switch(c) {

  case 0:
    current_token = END_OF_INPUT;
    return;

  case '+': case '-': case '*': case '/':
    current_token = token_kind(c);
    return;

  case '0':  case '1':  case '2':  case '3':
  case '4':  case '5':  case '6':  case '7':
  case '8':  case '9':
  case '.':
    std::cin.putback(c);
    std::cin >> current_number;
    current_token = NUMBER;
    return;
  }
}

}

Now we can pass our original expression to the push down automaton for evaluating:

3 1 2 + 3 * 5 + *
42

Error handling

What happens if we forget an operator in the input, for example the number 5 in the above expression? Sometime, the push down automaton then will try to pop an element from an empty stack. However, this is forbidden according to the precondition of the operation pop()! See what happens:

3 1 2 + 3 * + *
 
LEDA ERROR HANDLER
        slist: pop on empty list.
 
 
Abort

The program stops with an error message of the internal LEDA error handler. This handler checks the preconditions of some operations as we have seen in Section “Preconditions”. It particularly checks whether a stack is not empty at the time of a pop(). (Stacks are implemented with the help of the class slist; the error message therefore originates from slist).

Often it makes sense to catch and handle errors of this type. Here, for example, it would be good style to draw the attention of the user to the fact that his expression contains too few operands.

To not mix the error handling with the implementation of the procedure and to not cover the look on the essentials up thereby, we want to enclose the code for accessing the stack with a try-block and catch and handle possible errors in an accompanying catch-block. This looks as follows:

Filename: RPNEvaluatorErrorsHandeled.C
#include <LEDA/stack.h>
#include <LEDA/error.h>
#include "Tokenizer.h"
#include <iostream>

using leda::stack;
using leda::set_error_handler;
using leda::leda_exception;

using namespace Tokenizer;

int main()
{
  set_error_handler(leda::exception_error_handler);

  stack<double> S;
  get_next_token();

  while(current_token != END_OF_INPUT) {
    if(current_token == NUMBER) 
      S.push(current_number);
    else 
      try {
        double num1 = S.pop();
        double num2 = S.pop();
	
        switch(current_token) {
        case ADD:
          S.push(num2 + num1);
          break;
        case SUB:
          S.push(num2 - num1);
          break;
        case MUL:
          S.push(num2 * num1);
          break;
        case DIV:
          S.push(num2 / num1);
          break;
        }
      }
      catch(leda_exception e) {
        std::cerr << e.get_msg() << std::endl;
        std::cerr << e.get_num() << std::endl;
        std::cerr << "Too few numbers in expression." << std::endl;
        exit(1);
      }
    get_next_token();	    
  }

  if(S.size() != 1 ) {
    std::cout << "Too few operators in expression." << std::endl;
    exit(1);
  }
  else std::cout << S.pop() << std::endl;
}

First, we must inform the LEDA error handler that it shall throw an exception on the occurrence of an error, which we then will catch. This is carried out by the statement

set_error_handler(leda::exception_error_handler);

The exception thrown by LEDA is of the type leda_exception. An object of this type contains an error number corresponding to the exception and an error message. These contents can be accessed by the methods

e.get_num();
e.get_msg();

Thereby, if a catch possibly must catch several exceptions at the same position, the type of the exception can be found out. This makes a handling adapted to the respective exception possible.[24]

Let us now test the above insufficient input once again:

3 1 2 + 3 * + *
slist: pop on empty list.
1
Too few numbers in expression.

So we receive exact information now about what we have done wrong.

[Note] Note

All exceptions have the error number 1 in the current version 5.0 of LEDA as it also appears in the above output. However, this shall be improved soon according to the persons in charge.

In the above program, we have also caught the case that our expression contains too few operators. This does not manifest itself by an exception thrown by a method of the class stack, though, but by the fact that more than one number is contained in the stack at the end. We check this by size(). An example input for this case creates the following output:

1 1
Too few operators in expression.

So this is not the way to add up 1 and 1...

Implementation and further information

Stacks are implemented by singly linked linear lists. All operations take constant time except for clear() which takes time O(n) if the stack contains n elements.

More about stacks can be found on the corresponding manual page. The possibilities of the error handling in LEDA are also described on a manual page of their own.

Exercises

Exercise 16.

A fully parenthesized arithmetic expression is an arithmetic expression in which every sub-expression, consisting of two operands and an operator, is surrounded by parentheses, like for example

((( 3 + (2 + 5)) * ( 6 + 8 )) - ( 6 * 2 ))

Evaluate such fully parenthesized arithmetic expressions with a program which uses two stacks, one for operands and one for operators.

Then refine the program so that it can also evaluate partially parenthesized arithmetic expressions. To be more precise, it shall be in the position to evaluate all expressions which are created according to the following rules:

  1. An expression starts with a (unary) plus or minus or none of them; a term follows, followed by an arbitrary number of others terms (also none) which are separated by a plus or a minus.

  2. A term is a factor, followed by an arbitrary number of other factors (also none) which are separated by a multiplication or a division sign.

  3. A factor is a non-negative integer number or a parenthesized expression.

Expressed with the help of a context-free grammar, the set of (incompletely parenthesized) arithmetic expressions is defined as follows:

expression ::= ( '+' | '-' )? term (( '+' | '-' ) term)*              
term ::= factor (( '*' | '/' ) factor)*                             
factor ::= ( '0' | '1' |...| '9' )+ |  ( expression )

The usual precedence rules (multiplication and division have precedence over addition and subtraction) shall be followed. So the same value as in the case of the previous, fully parenthesized expression shall be output for the expression

( 3 + 2 + 5 ) * ( 6 + 8 ) - 6 * 2



[22] The deeper significance of the expression value 42 for computer science is described in [5].

[23] To be more precise, one must master a form of recursion intuitively to arrange the original infix expression in postfix notation. One can practice this. It is known, for example, that Apollo astronauts calculated critical course correction maneuvers in the seventies with their HP 65.

[24] Here it must be admitted that this exception scheme is not as hierarchized as we are used to it for example from a large GUI library. This is due to the fact that exception handling in programs which mainly implement combinatorial algorithms does by far not play such a big role as in programs with an extensive user interaction.




Imprint-Dataprotection