BigInt - Multiprecision Integers

Welcome to the home of BigInt, my class for handling arbitrarily large integers.  BigInt has been around for a few years now and has proved quite popular.  But apparently some Johnny-come-lately outfit called Microsoft is trying to muscle in on the market for large integer arithmetic with their new System.Numerics.BigInteger type.  I say Bring 'em on.

A special welcome to those who are using BigInt to solve problems on Project Euler!  If you find it useful please let me know - I would love to hear from you.  (One problem that responds well to the BigInt treatment is #56).

Here's the C# file that you need:

Choosing the target audience

I didn't appreciate at first the impact of the storage base on the end product.  The trouble is that converting between binary and decimal is tedious and slow.  Think about it - adding another most-significant decimal digit to a number can change the value of any bit in its binary representation except bit zero.  So if possible you should choose a base and stick with it.  Binary has the advantage of native speed and efficiency, but decimal has the advantage of being more human-friendly.  If you were doing something like SSL where humans don't need to see the numbers then you would choose binary.  But BigInt's target audience is people who want to input lots of big base 10 numbers and see lots of big base 10 answers, so I went with base 10 storage.  As you will see, this makes BigInt pretty slow for things like RSA encryption.

Storage

To store a BigInt, I break the number up into "chunks" of three decimal digits.  Chunk zero holds the three least significant decimal digits, chunk one holds the next three digits, and so on.  Obviously, each chunk has 1,000 possible values (0-999).  Each chunk is held in a block of ten bits.  Now, ten bits can actually hold 1,024 different values, so there is about 2% wastage in this method, but that's a pretty good trade for the speed and simplicity of storage.  Of course, the computer works in multiples of eight bits, so some shifting and masking is required to manipulate the chunks.  The chunks are divided into "blocks" of 8 chunks, each of which occupies 10 x 8 = 80 bits, which is 10 bytes or 5 shorts.  An array of bit masks tells us which bits of which bytes within the block hold the chunk we want.

In addition to the bit array, there is a Sign property to indicate positive (1) or negative (-1).  That two's-complement stuff just gives me a headache.

I decided to make BigInts immutable, like .NET strings.  Once you have created a BigInt (through construction, addition, or whatever), you can't change its value.

Addition (and subtraction)

Adding a + b (a,b >= 0) was pretty simple - just add a.chunk[i] + b.chunk[i] + carry to get answer.chunk[i].  O(N).  Then I tried to add support for signs and it got a little tricky.  In the end, two extra steps were required for the addition of two numbers with opposite signs (a.k.a subtraction)...

First, the number with larger magnitude (absolute value) has to be the positive one (so that intermediate steps can borrow if needs be). If the larger number is the negative one then we compute -((-a) + (-b)) instead.  For example, if asked to compute 11 + (-99) we would actually compute -(-11 + 99).  Unary minus is of course a very fast operation since we just flip the sign indicator.

Second, you have to borrow 1000 when the positive number's chunk is less than the negative number's chunk (plus carry).  In this case the carry for the next chunk becomes -1.

Multiplication

Armed with addition, (naive) multiplication is easy.  Easier than addition really, because you don't have to worry about signs or borrowing on the intermediate steps.  Just multiply every chunk in a by every chunk in b (with shifts) and add up the results.  The sign of the result is a.Sign * b.Sign.  The only trouble is, this is an O(N2 ) operation.  Thanks to http://www.tc.umn.edu/~ringx004/article-main.html for pointing out the faster Divide and Conquer method, which proved easier to implement than it first appeared.  The BigInt class now defines a threshold number of chunks, above which the D&C algorithm is used and below which the naive algorithm is used.

Div and mod

This was the tough one: how to divide a by b and get the (integer) quotient and remainder?  I just wanted to implement a naive division that essential 'undoes' the steps of naive multiplication, but for a while I couldn't see how to unpick all those additions.  Eventually I realised something: if you divide the first few digits of a by the first few digits of b, you will get an estimate for the first few digits of the quotient that will be high by at most 1.  Picking two numbers at random...

a = 64,745,023,423,004,376,034,552
b = 798,963,467,349,673,448

To start calculating a/b, BigInt takes the most significant chunk of each number (64 and 798).  Since 64 < 798, it appends the next chunk of a to make 67745.  Then its guess for the first few digits of the quotient is 67745 / 798 = 81.  There is a chance that this is one too high, so the next step is to check by calculating (81000 * b), which is 64716040855323549288000.  Since this is less than a, we know that 81 is correct.  Now we subtract (81000 * b) from a to get the remainder and go round again.  Keep going until the remainder is less than b and there you have it.

Raising to a power

The latest version of BigInt usurps the ^ operator, which is supposed to mean XOR in C#, and uses it as the raise-to-a-power operator. So (BigInt)10^100 == BigInt.Power(10, 100) == 10 to the power of 100 == one googol.  I chose this operator because it's often used for this purpose in other applications, and because there isn't an obvious alternative. 

The one drawback is that the precedence isn't quite right, because the power operation binds tighter than xor.  You would expect (BigInt)2 ^ 3 + 4 to yield 12 (8 + 4), but in fact it yields 128 (2^7).  So by all means use the ^ operator, but remember to use plenty of parentheses!

BigInt in action

Armed with all of this, what can we do?  In RSA with BigInts you can see yet another crude RSA encryption system, this time using BigInt to crunch (almost) realistically big messages and keys.

Homer's Last Theorem

Believe it or not, we can use BigInt in the context of a Simpsons episode.  In the Homer3 segment of Treehouse of Horror VI, Homer enters the hypothetical "z" axis of the Third Dimension.  Here in the background floats the "equation" 178212 + 184112 = 192212.  If this statement were true it would disprove Fermat's Last Theorem.  But in fact it is not true, it is just a "Fermat near miss".  Since the nine most significant digits on the right and left hand sides are equal, it would look true when calculated on a typical (non-BigInt) calculator.  The following code searches sums of powers of twelve of the numbers between 1,000 and 2,000 and finds a couple of these near misses, including Homer's.  It also shows the true values in these sums, which are obviously nowhere near equal.

const int ARRAY_SIZE = 2000;

BigInt[] powers = new BigInt[ARRAY_SIZE + 1];

 

for (int x = 1; x < ARRAY_SIZE; x++)

{

    powers[x] = (BigInt)x ^ 12;

}

 

string[] frontDigits = new string[ARRAY_SIZE + 1];

for (int i = 500; i < ARRAY_SIZE; i++)

{

    frontDigits[i] = powers[i].ToString().Substring(0, 9);

}

 

for (int i = 1000; i < ARRAY_SIZE; i++)

{

    for (int j = i + 1; j < ARRAY_SIZE; j++)

    {

        BigInt fermatMissTest = powers[i] + powers[j];

        string missTestDigits = fermatMissTest.ToString().Substring(0, 9);

        for (int k = i; k <= ARRAY_SIZE; k++)

        {

            if (frontDigits[k] == missTestDigits)

            {

                Console.WriteLine("{0}^12 + {1}^12 \"=\" {2}^12", i, j, k);

                Console.WriteLine("  {0}^12 + {1}^12 = {2}", i, j, fermatMissTest);

                Console.WriteLine("            {0}^12 = {1}\n", k, powers[k]);

            }

        }

    }

}

Here is the output:

1363^12 + 1808^12 "=" 1813^12
  1363^12 + 1808^12 = 1261170053084770027529532240400562560817
  1813^12 = 1261170051398003589253823949865506481681

1782^12 + 1841^12 "=" 1922^12
  1782^12 + 1841^12 = 2541210258614589176288669958142428526657
  1922^12 = 2541210259314801410819278649643651567616

©2012 Carl Johansen