6.1. Integers of arbitrary length (class integer)

Learning objectives

The class integer

C++ provides the types short, int, and long for holding and computing integer numbers. All these types come in signed (that is, a number may be negative) and unsigned (that is a number is 0 or positive) form.

Let us see the ubiquitous type int at work in the computation of the Fibonacci numbers. These are integer numbers defined by a simple rule: The first two Fibonacci numbers are equal to 1, each later Fibonacci number is the sum of its two predecessors. In formulae:

F0 = 1, F1 = 1

Fn = Fn-1 + Fn-2

Fibonacci numbers are often found in nature. For example, they appear in the pedigree of bees and in the floret spirals of sunflowers, see [7].

A simple C++ program for computing the Fibonacci numbers is the following:

Filename: FibonacciNumbersInt.C
LEDA users as of version 5.0: Note that header files are now partitioned into modules (subfolders)!
#include <iostream>

using std::cout;
using std::endl;

int main() 
{
  int a = 1, b = 1;

  cout << "Fib(0) = " << a << endl;
  cout << "Fib(1) = " << b << endl;

  for(int i = 2; ; i++) { 
    int sum = a + b;
    cout << "Fib(" << i << ") = " << sum << endl;
    a = b;
    b = sum;
  }
}

The output of this program on the author's machine is the following:

Fib(0) = 1
Fib(1) = 1
Fib(2) = 2
Fib(3) = 3
Fib(4) = 5
Fib(5) = 8
Fib(6) = 13
...
Fib(44) = 1134903170
Fib(45) = 1836311903
Fib(46) = -1323752223
Fib(47) = 512559680
...

Oops, what's that? F46 is a negative number? No, of course not. What has happened here is a wrap-around, caused by the boundedness of the type int: This type has only a finite domain (and so have short and long and the unsigned forms). The details vary from machine to machine, but in general, the following holds: If w is the word size of the machine (with w=32 or w=64 being usual values on today's workstations), then the type unsigned int uses w bits, that is, its domain is the interval [0..2w-1]. In contrast, the domain of type int is the interval [MININT..MAXINT], where MININT and MAXINT are predefined constants.[43] On most machines, MININT is defined as -2w-1 and MAXINT as 2w-1-1, and so they are on the author's machine. This means that the largest number of type int on the author's machine (which has w=32) is the number 2,147,483,647. That is why the sum of F44 and F45 does not fall into the interval of int on the author's machine, so a wrap-around occurs in the computation of F46, hitting a number from the negative half of the interval [MININT..MAXINT]. See Exercise 74 for details.

In situations like these, it would be fine to have an integer type that can deal with integers of arbitrary length, that is, whose operations never cause the resulting numbers to wrap around. And of course, LEDA provides such a type, which is called - what a surprise - integer. Here is the same program as above, now using LEDA's type integer instead of C++'s int:

Filename: FibonacciNumbers.C
LEDA users as of version 5.0: Note that header files are now partitioned into modules (subfolders)!
#include <LEDA/integer.h>

using leda::integer;
using std::cout;
using std::endl;

int main() 
{
  integer a = 1, b = 1;

  cout << "Fib(0) = " << a << endl;
  cout << "Fib(1) = " << b << endl;

  for(int i = 2; ; i++) { 
    integer sum = a + b;
    cout << "Fib(" << i << ") = " << sum << endl;
    a = b;
    b = sum;
  }
}

The output of this program for some interesting parts is shown in the following:

...
Fib(44) = 1134903170
Fib(45) = 1836311903
Fib(46) = 2971215073
Fib(47) = 4807526976
Fib(48) = 7778742049
...
Fib(270) = 193270471243015279782059101964580241188515112465021394429
Fib(271) = 312718191492907860985910767785256677811449301165198482789
Fib(272) = 505988662735923140767969869749836918999964413630219877218
...

As we can see from the program, it is very easy to use the type integer: All we have to do is to write integer where we would normally write int. The LEDA type integer supports all the well-known arithmetic operations +, -, *, /, %, +=, -=, *=, /=, %=, - (unary), ++, --, <<, and >>, the common bitwise operations &&, &&=, |, |=, and ~, as well as the comparison operators <, <=, >, >=, ==, and !=. This means that we can use integers in all expressions arbitrarily intermixed with ordinary C++ ints.

The operations of integer never wrap-around and always yield the exact result. (Of course, the number of digits in an integer is bounded by the size of the overall memory of our machine. So if we let our integers grow larger and larger, we will eventually end up with an out-of-memory error.)

Moreover, the class integer provides arithmetic operations such as computing the square root, the greatest common divisor, and the dual logarithm.

Furthermore, the class integer overloads the usual stream operators for input and output.

integers can be created from C++ numbers of type int, unsigned int, long, unsigned long, and double. Sometimes it is necessary to create an integer from a decimal number that is too long to fit into a long or double, so that it cannot be specified as an integer or float literal in the source code. Then the following constructor comes in handy:

integer( const char* s );

It creates an integer from a decimal notation as given in the string s. For example, to create an integer that represents the first 20 decimal digits of Euler's number, we could use the following constructor:

integer e( "2718281828459045235360287" );

Here is another classic program that very soon leads to integer numbers too large to fit into the type int: computing factorials. Remember that n!, the n-th factorial, is defined as the product of all integer numbers from 1 to n. In formulae:

0! = 1

n! = 1 · 2 · ... · n

In combinatorics, n! is interpreted as the number of possibilities to put n distinguishable things into a row. There is also a nice recursive formulation for n!:

0! = 1

n! = n · (n-1)!

The next program uses this recursion and LEDA's class integer to compute factorials of arbitrary length.

Filename: Factorials.C
LEDA users as of version 5.0: Note that header files are now partitioned into modules (subfolders)!
#include <LEDA/integer.h>

using leda::integer;
using std::cout;
using std::endl;


integer factorial(const integer& n)
{
  if( n == 0 )
    return 1;
  else
    return n * factorial( n - 1 );
}


int main() 
{
  for( int i = 0; ; i++ ) 
    cout << i << "! = " << factorial( i ) << endl;
}

The output is

0! = 1
1! = 1
2! = 2
3! = 6
4! = 24
5! = 120
6! = 720
7! = 5040
8! = 40320
9! = 362880
10! = 3628800
11! = 39916800
12! = 479001600
13! = 6227020800
14! = 87178291200
15! = 1307674368000
16! = 20922789888000
17! = 355687428096000
18! = 6402373705728000
19! = 121645100408832000
20! = 2432902008176640000
21! = 51090942171709440000
22! = 1124000727777607680000
23! = 25852016738884976640000
24! = 620448401733239439360000
25! = 15511210043330985984000000
26! = 403291461126605635584000000
27! = 10888869450418352160768000000
28! = 304888344611713860501504000000
29! = 8841761993739701954543616000000
30! = 265252859812191058636308480000000
31! = 8222838654177922817725562880000000
32! = 263130836933693530167218012160000000
33! = 8683317618811886495518194401280000000
34! = 295232799039604140847618609643520000000
35! = 10333147966386144929666651337523200000000
36! = 371993326789901217467999448150835200000000
37! = 13763753091226345046315979581580902400000000
38! = 523022617466601111760007224100074291200000000
39! = 20397882081197443358640281739902897356800000000
40! = 815915283247897734345611269596115894272000000000

Wow - that really gets big! And how fast!

Implementation and tips

Objects of type integer are implemented by a vector of unsigned longs. The sign and the size are stored in extra member variables.

The arithmetic operations are implemented very efficiently. Some of the inner loops are even hand-coded in assembler. The running time of addition and subtraction is O(n) (with n being the numbers of machine words that are used to represent the numbers); the running time of multiplication and division is O(nlog 3).

LEDA's integers are about 30-50 times slower that ints. So they should only be used if exact integer arithmetic is needed and if a wrap-around cannot be afforded and if it cannot be guaranteed that the numbers will not fall outside of int's domain (for example, by an a-priori analysis). See also Exercise 76 for this.

Further information

More information about the class integer can be found on its manual page.

Exercises

Exercise 74.

Read about the representation and arithmetics of integers in one's complement and two's complement in a book about computer architecture, for example, in [16]. Then explain the integer output -1323752223 in the Fibonacci numbers program that uses ints.

Exercise 75.

Verify by an experiment that the running time of LEDA's integers for multiplication indeed is O(nlog 3): Construct a number with n machine words (for example, by repeated squaring), then multiply it 10 times by itself and measure the time used for these multiplications. Repeat the experiment with a number having n+1 machine words, and compute the times' ratio: It should converge to O(nlog 3).

Make an analogous experiment for division.

Exercise 76.

At first sight, the definition of the Fibonacci numbers as given above suggests a recursive function for computing Fn: To compute Fn, first compute Fn-1 and Fn-2 recursively, then add the results. This, however, is not a good way to compute Fn. (Why?)

Nevertheless, use this recursive formulation to compute F35, first with C++'s type int, then with LEDA's integer. Compare the running times. What conclusion do you draw for programs that deal only with “small” integers (that is, integers that fall into int's domain)?

Exercise 77.

Write a function that efficiently computes the i-th power of an integer.

Hint: Examine the bit pattern of the exponent and use repeated squaring of the base.



[43] Under Unix™, these constants are defined in the system header file limits.h.




Imprint-Dataprotection