performance how pi - What is the fastest way to get the value of π?





11 Answers

I really like this program, because it approximates π by looking at its own area.

IOCCC 1988 : westley.c

#define _ -F<00||--F-OO--;
int F=00,OO=00;main(){F_OO();printf("%1.3f\n",4.*-F/OO/OO);}F_OO()
{
            _-_-_-_
       _-_-_-_-_-_-_-_-_
    _-_-_-_-_-_-_-_-_-_-_-_
  _-_-_-_-_-_-_-_-_-_-_-_-_-_
 _-_-_-_-_-_-_-_-_-_-_-_-_-_-_
 _-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
 _-_-_-_-_-_-_-_-_-_-_-_-_-_-_
 _-_-_-_-_-_-_-_-_-_-_-_-_-_-_
  _-_-_-_-_-_-_-_-_-_-_-_-_-_
    _-_-_-_-_-_-_-_-_-_-_-_
        _-_-_-_-_-_-_-_
            _-_-_-_
}
calculated so many

I'm looking for the fastest way to obtain the value of π, as a personal challenge. More specifically, I'm using ways that don't involve using #define constants like M_PI, or hard-coding the number in.

The program below tests the various ways I know of. The inline assembly version is, in theory, the fastest option, though clearly not portable. I've included it as a baseline to compare against the other versions. In my tests, with built-ins, the 4 * atan(1) version is fastest on GCC 4.2, because it auto-folds the atan(1) into a constant. With -fno-builtin specified, the atan2(0, -1) version is fastest.

Here's the main testing program (pitimes.c):

#include <math.h>
#include <stdio.h>
#include <time.h>

#define ITERS 10000000
#define TESTWITH(x) {                                                       \
    diff = 0.0;                                                             \
    time1 = clock();                                                        \
    for (i = 0; i < ITERS; ++i)                                             \
        diff += (x) - M_PI;                                                 \
    time2 = clock();                                                        \
    printf("%s\t=> %e, time => %f\n", #x, diff, diffclock(time2, time1));   \
}

static inline double
diffclock(clock_t time1, clock_t time0)
{
    return (double) (time1 - time0) / CLOCKS_PER_SEC;
}

int
main()
{
    int i;
    clock_t time1, time2;
    double diff;

    /* Warmup. The atan2 case catches GCC's atan folding (which would
     * optimise the ``4 * atan(1) - M_PI'' to a no-op), if -fno-builtin
     * is not used. */
    TESTWITH(4 * atan(1))
    TESTWITH(4 * atan2(1, 1))

#if defined(__GNUC__) && (defined(__i386__) || defined(__amd64__))
    extern double fldpi();
    TESTWITH(fldpi())
#endif

    /* Actual tests start here. */
    TESTWITH(atan2(0, -1))
    TESTWITH(acos(-1))
    TESTWITH(2 * asin(1))
    TESTWITH(4 * atan2(1, 1))
    TESTWITH(4 * atan(1))

    return 0;
}

And the inline assembly stuff (fldpi.c) that will only work for x86 and x64 systems:

double
fldpi()
{
    double pi;
    asm("fldpi" : "=t" (pi));
    return pi;
}

And a build script that builds all the configurations I'm testing (build.sh):

#!/bin/sh
gcc -O3 -Wall -c           -m32 -o fldpi-32.o fldpi.c
gcc -O3 -Wall -c           -m64 -o fldpi-64.o fldpi.c

gcc -O3 -Wall -ffast-math  -m32 -o pitimes1-32 pitimes.c fldpi-32.o
gcc -O3 -Wall              -m32 -o pitimes2-32 pitimes.c fldpi-32.o -lm
gcc -O3 -Wall -fno-builtin -m32 -o pitimes3-32 pitimes.c fldpi-32.o -lm
gcc -O3 -Wall -ffast-math  -m64 -o pitimes1-64 pitimes.c fldpi-64.o -lm
gcc -O3 -Wall              -m64 -o pitimes2-64 pitimes.c fldpi-64.o -lm
gcc -O3 -Wall -fno-builtin -m64 -o pitimes3-64 pitimes.c fldpi-64.o -lm

Apart from testing between various compiler flags (I've compared 32-bit against 64-bit too, because the optimisations are different), I've also tried switching the order of the tests around. But still, the atan2(0, -1) version still comes out on top every time.




In the interests of completeness, a C++ template version, which for an optimised build will compute PI at compile time and will inline to a single value.

#include <iostream>

template<int I>
struct sign
{
    enum {value = (I % 2) == 0 ? 1 : -1};
};

template<int I, int J>
struct pi_calc
{
    inline static double value ()
    {
        return (pi_calc<I-1, J>::value () + pi_calc<I-1, J+1>::value ()) / 2.0;
    }
};

template<int J>
struct pi_calc<0, J>
{
    inline static double value ()
    {
        return (sign<J>::value * 4.0) / (2.0 * J + 1.0) + pi_calc<0, J-1>::value ();
    }
};


template<>
struct pi_calc<0, 0>
{
    inline static double value ()
    {
        return 4.0;
    }
};

template<int I>
struct pi
{
    inline static double value ()
    {
        return pi_calc<I, I>::value ();
    }
};

int main ()
{
    std::cout.precision (12);

    const double pi_value = pi<10>::value ();

    std::cout << "pi ~ " << pi_value << std::endl;

    return 0;
}

Note for I > 10, optimised builds can be slow, likewise for non-optimised runs. For 12 iterations I believe there are around 80k calls to value() (in the absence of memoisation).




The following answers precisely how to do this in the fastest possible way -- with the least computing effort. Even if you don't like the answer, you have to admit that it is indeed the fastest way to get the value of PI.

The FASTEST way to get the value of Pi is:

  1. chose your favorite programming language
  2. load it's Math library
  3. and find that Pi is already defined there!! ready to use it..

in case you don't have a Math library at hand..

the SECOND FASTEST way (more universal solution) is:

look up Pi on the Internet, e.g. here:

http://www.eveandersson.com/pi/digits/1000000 (1 million digits .. what's your floating point precision? )

or here:

http://3.141592653589793238462643383279502884197169399375105820974944592.com/

or here:

http://en.wikipedia.org/wiki/Pi

It's really fast to find the digits you need for whatever precision arithmetic you would like to use, and by defining a constant, you can make sure that you don't waste precious CPU time.

Not only is this a partly humorous answer, but in reality, if anybody would go ahead and compute the value of Pi in a real application .. that would be a pretty big waste of CPU time, wouldn't it? At least I don't see a real application for trying to re-compute this.

Dear Moderator: please note that the OP asked: "Fastest Way to get the value of PI"




Instead of defining pi as a constant, I always use acos(-1).







If by fastest you mean fastest to type in the code, here's the golfscript solution:

;''6666,-2%{2+.2/@*\/10.3??2*+}*`1000<~\;



Pi is exactly 3! [Prof. Frink (Simpsons)]

Joke, but here's one in C# (.NET-Framework required).

using System;
using System.Text;

class Program {
    static void Main(string[] args) {
        int Digits = 100;

        BigNumber x = new BigNumber(Digits);
        BigNumber y = new BigNumber(Digits);
        x.ArcTan(16, 5);
        y.ArcTan(4, 239);
        x.Subtract(y);
        string pi = x.ToString();
        Console.WriteLine(pi);
    }
}

public class BigNumber {
    private UInt32[] number;
    private int size;
    private int maxDigits;

    public BigNumber(int maxDigits) {
        this.maxDigits = maxDigits;
        this.size = (int)Math.Ceiling((float)maxDigits * 0.104) + 2;
        number = new UInt32[size];
    }
    public BigNumber(int maxDigits, UInt32 intPart)
        : this(maxDigits) {
        number[0] = intPart;
        for (int i = 1; i < size; i++) {
            number[i] = 0;
        }
    }
    private void VerifySameSize(BigNumber value) {
        if (Object.ReferenceEquals(this, value))
            throw new Exception("BigNumbers cannot operate on themselves");
        if (value.size != this.size)
            throw new Exception("BigNumbers must have the same size");
    }

    public void Add(BigNumber value) {
        VerifySameSize(value);

        int index = size - 1;
        while (index >= 0 && value.number[index] == 0)
            index--;

        UInt32 carry = 0;
        while (index >= 0) {
            UInt64 result = (UInt64)number[index] +
                            value.number[index] + carry;
            number[index] = (UInt32)result;
            if (result >= 0x100000000U)
                carry = 1;
            else
                carry = 0;
            index--;
        }
    }
    public void Subtract(BigNumber value) {
        VerifySameSize(value);

        int index = size - 1;
        while (index >= 0 && value.number[index] == 0)
            index--;

        UInt32 borrow = 0;
        while (index >= 0) {
            UInt64 result = 0x100000000U + (UInt64)number[index] -
                            value.number[index] - borrow;
            number[index] = (UInt32)result;
            if (result >= 0x100000000U)
                borrow = 0;
            else
                borrow = 1;
            index--;
        }
    }
    public void Multiply(UInt32 value) {
        int index = size - 1;
        while (index >= 0 && number[index] == 0)
            index--;

        UInt32 carry = 0;
        while (index >= 0) {
            UInt64 result = (UInt64)number[index] * value + carry;
            number[index] = (UInt32)result;
            carry = (UInt32)(result >> 32);
            index--;
        }
    }
    public void Divide(UInt32 value) {
        int index = 0;
        while (index < size && number[index] == 0)
            index++;

        UInt32 carry = 0;
        while (index < size) {
            UInt64 result = number[index] + ((UInt64)carry << 32);
            number[index] = (UInt32)(result / (UInt64)value);
            carry = (UInt32)(result % (UInt64)value);
            index++;
        }
    }
    public void Assign(BigNumber value) {
        VerifySameSize(value);
        for (int i = 0; i < size; i++) {
            number[i] = value.number[i];
        }
    }

    public override string ToString() {
        BigNumber temp = new BigNumber(maxDigits);
        temp.Assign(this);

        StringBuilder sb = new StringBuilder();
        sb.Append(temp.number[0]);
        sb.Append(System.Globalization.CultureInfo.CurrentCulture.NumberFormat.CurrencyDecimalSeparator);

        int digitCount = 0;
        while (digitCount < maxDigits) {
            temp.number[0] = 0;
            temp.Multiply(100000);
            sb.AppendFormat("{0:D5}", temp.number[0]);
            digitCount += 5;
        }

        return sb.ToString();
    }
    public bool IsZero() {
        foreach (UInt32 item in number) {
            if (item != 0)
                return false;
        }
        return true;
    }

    public void ArcTan(UInt32 multiplicand, UInt32 reciprocal) {
        BigNumber X = new BigNumber(maxDigits, multiplicand);
        X.Divide(reciprocal);
        reciprocal *= reciprocal;

        this.Assign(X);

        BigNumber term = new BigNumber(maxDigits);
        UInt32 divisor = 1;
        bool subtractTerm = true;
        while (true) {
            X.Divide(reciprocal);
            term.Assign(X);
            divisor += 2;
            term.Divide(divisor);
            if (term.IsZero())
                break;

            if (subtractTerm)
                this.Subtract(term);
            else
                this.Add(term);
            subtractTerm = !subtractTerm;
        }
    }
}



With doubles:

4.0 * (4.0 * Math.Atan(0.2) - Math.Atan(1.0 / 239.0))

This will be accurate up to 14 decimal places, enough to fill a double (the inaccuracy is probably because the rest of the decimals in the arc tangents are truncated).

Also Seth, it's 3.141592653589793238463, not 64.




This version (in Delphi) is nothing special, but it is at least faster than the version Nick Hodge posted on his blog :). On my machine, it takes about 16 seconds to do a billion iterations, giving a value of 3.1415926525879 (the accurate part is in bold).

program calcpi;

{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  start, finish: TDateTime;

function CalculatePi(iterations: integer): double;
var
  numerator, denominator, i: integer;
  sum: double;
begin
  {
  PI may be approximated with this formula:
  4 * (1 - 1/3 + 1/5 - 1/7 + 1/9 - 1/11 .......)
  //}
  numerator := 1;
  denominator := 1;
  sum := 0;
  for i := 1 to iterations do begin
    sum := sum + (numerator/denominator);
    denominator := denominator + 2;
    numerator := -numerator;
  end;
  Result := 4 * sum;
end;

begin
  try
    start := Now;
    WriteLn(FloatToStr(CalculatePi(StrToInt(ParamStr(1)))));
    finish := Now;
    WriteLn('Seconds:' + FormatDateTime('hh:mm:ss.zz',finish-start));
  except
    on E:Exception do
      Writeln(E.Classname, ': ', E.Message);
  end;
end.



If you want to compute an approximation of the value of π (for some reason), you should try a binary extraction algorithm. Bellard's improvement of BBP gives does PI in O(N^2).


If you want to obtain an approximation of the value of π to do calculations, then:

PI = 3.141592654

Granted, that's only an approximation, and not entirely accurate. It's off by a little more than 0.00000000004102. (four ten-trillionths, about 4/10,000,000,000).


If you want to do math with π, then get yourself a pencil and paper or a computer algebra package, and use π's exact value, π.

If you really want a formula, this one is fun:

π = -i ln(-1)




Calculating π from circle area :-)

<input id="range" type="range" min="10" max="960" value="10" step="50" oninput="calcPi()">
<br>
<div id="cont"></div>

<script>
function generateCircle(width) {
    var c = width/2;
    var delta = 1.0;
    var str = "";
    var xCount = 0;
    for (var x=0; x <= width; x++) {
        for (var y = 0; y <= width; y++) {
            var d = Math.sqrt((x-c)*(x-c) + (y-c)*(y-c));
            if (d > (width-1)/2) {
                str += '.';
            }
            else {
                xCount++;
                str += 'o';
            }
            str += "&nbsp;" 
        }
        str += "\n";
    }
    var pi = (xCount * 4) / (width * width);
    return [str, pi];
}

function calcPi() {
    var e = document.getElementById("cont");
    var width = document.getElementById("range").value;
    e.innerHTML = "<h4>Generating circle...</h4>";
    setTimeout(function() {
        var circ = generateCircle(width);
        e.innerHTML  = "<pre>" + "π = " + circ[1].toFixed(2) + "\n" + circ[0] +"</pre>";
    }, 200);
}
calcPi();
</script>



Related