[performance] What is the fastest way to get the value of π?


I really like this program, which approximates pi by looking at its own area :-)

IOCCC 1988 : westley.c

#define _ -F<00||--F-OO--;
int F=00,OO=00;main(){F_OO();printf("%1.3f\n",4.*-F/OO/OO);}F_OO()

Solutions are welcome in any language. :-) I'm looking for the fastest way to obtain the value of π, as a personal challenge. More specifically I'm using ways that don't involve using #defined constants like M_PI, or hard-coding the number in.

The program below tests the various ways I know of. The inline assembly version is, in theory, the fastest option, though clearly not portable; I've included it as a baseline to compare the other versions against. In my tests, with built-ins, the 4 * atan(1) version is fastest on GCC 4.2, because it auto-folds the atan(1) into a constant. With -fno-builtin specified, the atan2(0, -1) version is fastest.

Here's the main testing program (pitimes.c):

#include <math.h>
#include <stdio.h>
#include <time.h>

#define ITERS 10000000
#define TESTWITH(x) {                                                       \
    diff = 0.0;                                                             \
    time1 = clock();                                                        \
    for (i = 0; i < ITERS; ++i)                                             \
        diff += (x) - M_PI;                                                 \
    time2 = clock();                                                        \
    printf("%s\t=> %e, time => %f\n", #x, diff, diffclock(time2, time1));   \

static inline double
diffclock(clock_t time1, clock_t time0)
    return (double) (time1 - time0) / CLOCKS_PER_SEC;

    int i;
    clock_t time1, time2;
    double diff;

    /* Warmup. The atan2 case catches GCC's atan folding (which would
     * optimise the ``4 * atan(1) - M_PI'' to a no-op), if -fno-builtin
     * is not used. */
    TESTWITH(4 * atan(1))
    TESTWITH(4 * atan2(1, 1))

#if defined(__GNUC__) && (defined(__i386__) || defined(__amd64__))
    extern double fldpi();

    /* Actual tests start here. */
    TESTWITH(atan2(0, -1))
    TESTWITH(2 * asin(1))
    TESTWITH(4 * atan2(1, 1))
    TESTWITH(4 * atan(1))

    return 0;

And the inline assembly stuff (fldpi.c), noting that it will only work for x86 and x64 systems:

    double pi;
    asm("fldpi" : "=t" (pi));
    return pi;

And a build script that builds all the configurations I'm testing (build.sh):

gcc -O3 -Wall -c           -m32 -o fldpi-32.o fldpi.c
gcc -O3 -Wall -c           -m64 -o fldpi-64.o fldpi.c

gcc -O3 -Wall -ffast-math  -m32 -o pitimes1-32 pitimes.c fldpi-32.o
gcc -O3 -Wall              -m32 -o pitimes2-32 pitimes.c fldpi-32.o -lm
gcc -O3 -Wall -fno-builtin -m32 -o pitimes3-32 pitimes.c fldpi-32.o -lm
gcc -O3 -Wall -ffast-math  -m64 -o pitimes1-64 pitimes.c fldpi-64.o -lm
gcc -O3 -Wall              -m64 -o pitimes2-64 pitimes.c fldpi-64.o -lm
gcc -O3 -Wall -fno-builtin -m64 -o pitimes3-64 pitimes.c fldpi-64.o -lm

Apart from testing between various compiler flags (I've compared 32-bit against 64-bit too, because the optimisations are different), I've also tried switching the order of the tests around. The atan2(0, -1) version still comes out top every time, though.

If by fastest you mean fastest to type in the code, here's the golfscript solution:


In the interests of completeness, a C++ template version, which for an optimised build will compute PI at compile time and will inline to a single value.

#include <iostream>

template<int I>
struct sign
    enum {value = (I % 2) == 0 ? 1 : -1};

template<int I, int J>
struct pi_calc
    inline static double value ()
        return (pi_calc<I-1, J>::value () + pi_calc<I-1, J+1>::value ()) / 2.0;

template<int J>
struct pi_calc<0, J>
    inline static double value ()
        return (sign<J>::value * 4.0) / (2.0 * J + 1.0) + pi_calc<0, J-1>::value ();

struct pi_calc<0, 0>
    inline static double value ()
        return 4.0;

template<int I>
struct pi
    inline static double value ()
        return pi_calc<I, I>::value ();

int main ()
    std::cout.precision (12);

    const double pi_value = pi<10>::value ();

    std::cout << "pi ~ " << pi_value << std::endl;

    return 0;

Note for I > 10, optimised builds can be slow, likewise for non-optimised runs. For 12 iterations I believe there are around 80k calls to value() (in the absence of memoisation).

If you want to compute an approximation of the value of π (for some reason), you should try a binary extraction algorithm. Bellard's improvement of BBP gives does PI in O(N^2).

If you want to obtain an approximation of the value of π to do calculations, then:

PI = 3.141592654

Granted, that's only an approximation, and not entirely accurate. It's off by a little more than 0.00000000004102. (four ten-trillionths, about 4/10,000,000,000).

If you want to do math with π, then get yourself a pencil and paper or a computer algebra package, and use π's exact value, π.

If you really want a formula, this one is fun:

π = -i ln(-1)

Instead of defining pi as a constant, I always use acos(-1).

The following answers precisely how to do this in the fastest possible way -- with the least computing effort. Even if you don't like the answer, you have to admit that it is indeed the fastest way to get the value of PI.

The FASTEST way to get the value of Pi is:

  1. chose your favorite programming language
  2. load it's Math library
  3. and find that Pi is already defined there!! ready to use it..

in case you don't have a Math library at hand..

the SECOND FASTEST way (more universal solution) is:

look up Pi on the Internet, e.g. here:

http://www.eveandersson.com/pi/digits/1000000 (1 million digits .. what's your floating point precision? )

or here:


or here:


It's really fast to find the digits you need for whatever precision arithmetic you would like to use, and by defining a constant, you can make sure that you don't waste precious CPU time.

Not only is this a partly humorous answer, but in reality, if anybody would go ahead and compute the value of Pi in a real application .. that would be a pretty big waste of CPU time, wouldn't it? At least I don't see a real application for trying to re-compute this.

Dear Moderator: please note that the OP asked: "Fastest Way to get the value of PI"

If you are willing to use an approximation, 355 / 113 is good for 6 decimal digits, and has the added advantage of being usable with integer expressions. That's not as important these days, as "floating point math co-processor" ceased to have any meaning, but it was quite important once.

This version (in Delphi) is nothing special, but it is at least faster than the version Nick Hodge posted on his blog :). On my machine, it takes about 16 seconds to do a billion iterations, giving a value of 3.1415926525879 (the accurate part is in bold).

program calcpi;



  start, finish: TDateTime;

function CalculatePi(iterations: integer): double;
  numerator, denominator, i: integer;
  sum: double;
  PI may be approximated with this formula:
  4 * (1 - 1/3 + 1/5 - 1/7 + 1/9 - 1/11 .......)
  numerator := 1;
  denominator := 1;
  sum := 0;
  for i := 1 to iterations do begin
    sum := sum + (numerator/denominator);
    denominator := denominator + 2;
    numerator := -numerator;
  Result := 4 * sum;

    start := Now;
    finish := Now;
    WriteLn('Seconds:' + FormatDateTime('hh:mm:ss.zz',finish-start));
    on E:Exception do
      Writeln(E.Classname, ': ', E.Message);

With doubles:

4.0 * (4.0 * Math.Atan(0.2) - Math.Atan(1.0 / 239.0))

This will be accurate up to 14 decimal places, enough to fill a double (the inaccuracy is probably because the rest of the decimals in the arc tangents are truncated).

Also Seth, it's 3.141592653589793238463, not 64.

Calculating π from circle area :-)

<input id="range" type="range" min="10" max="960" value="10" step="50" oninput="calcPi()">
<div id="cont"></div>

function generateCircle(width) {
    var c = width/2;
    var delta = 1.0;
    var str = "";
    var xCount = 0;
    for (var x=0; x <= width; x++) {
        for (var y = 0; y <= width; y++) {
            var d = Math.sqrt((x-c)*(x-c) + (y-c)*(y-c));
            if (d > (width-1)/2) {
                str += '.';
            else {
                str += 'o';
            str += "&nbsp;" 
        str += "\n";
    var pi = (xCount * 4) / (width * width);
    return [str, pi];

function calcPi() {
    var e = document.getElementById("cont");
    var width = document.getElementById("range").value;
    e.innerHTML = "<h4>Generating circle...</h4>";
    setTimeout(function() {
        var circ = generateCircle(width);
        e.innerHTML  = "<pre>" + "π = " + circ[1].toFixed(2) + "\n" + circ[0] +"</pre>";
    }, 200);