# [performance] What is the fastest way to get the value of π?

I really like this program, which approximates pi by looking at its own area :-)

IOCCC 1988 : westley.c

`#define _ -F<00||--F-OO--; int F=00,OO=00;main(){F_OO();printf("%1.3f\n",4.*-F/OO/OO);}F_OO() { _-_-_-_ _-_-_-_-_-_-_-_-_ _-_-_-_-_-_-_-_-_-_-_-_ _-_-_-_-_-_-_-_-_-_-_-_-_-_ _-_-_-_-_-_-_-_-_-_-_-_-_-_-_ _-_-_-_-_-_-_-_-_-_-_-_-_-_-_ _-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_ _-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_ _-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_ _-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_ _-_-_-_-_-_-_-_-_-_-_-_-_-_-_ _-_-_-_-_-_-_-_-_-_-_-_-_-_-_ _-_-_-_-_-_-_-_-_-_-_-_-_-_ _-_-_-_-_-_-_-_-_-_-_-_ _-_-_-_-_-_-_-_ _-_-_-_ }`

Solutions are welcome in any language. :-) I'm looking for the fastest way to obtain the value of π, as a personal challenge. More specifically I'm using ways that don't involve using `#define`

d constants like `M_PI`

, or hard-coding the number in.

The program below tests the various ways I know of. The inline assembly version is, in theory, the fastest option, though clearly not portable; I've included it as a baseline to compare the other versions against. In my tests, with built-ins, the `4 * atan(1)`

version is fastest on GCC 4.2, because it auto-folds the `atan(1)`

into a constant. With `-fno-builtin`

specified, the `atan2(0, -1)`

version is fastest.

Here's the main testing program (`pitimes.c`

):

```
#include <math.h>
#include <stdio.h>
#include <time.h>
#define ITERS 10000000
#define TESTWITH(x) { \
diff = 0.0; \
time1 = clock(); \
for (i = 0; i < ITERS; ++i) \
diff += (x) - M_PI; \
time2 = clock(); \
printf("%s\t=> %e, time => %f\n", #x, diff, diffclock(time2, time1)); \
}
static inline double
diffclock(clock_t time1, clock_t time0)
{
return (double) (time1 - time0) / CLOCKS_PER_SEC;
}
int
main()
{
int i;
clock_t time1, time2;
double diff;
/* Warmup. The atan2 case catches GCC's atan folding (which would
* optimise the ``4 * atan(1) - M_PI'' to a no-op), if -fno-builtin
* is not used. */
TESTWITH(4 * atan(1))
TESTWITH(4 * atan2(1, 1))
#if defined(__GNUC__) && (defined(__i386__) || defined(__amd64__))
extern double fldpi();
TESTWITH(fldpi())
#endif
/* Actual tests start here. */
TESTWITH(atan2(0, -1))
TESTWITH(acos(-1))
TESTWITH(2 * asin(1))
TESTWITH(4 * atan2(1, 1))
TESTWITH(4 * atan(1))
return 0;
}
```

And the inline assembly stuff (`fldpi.c`

), noting that it will only work for x86 and x64 systems:

```
double
fldpi()
{
double pi;
asm("fldpi" : "=t" (pi));
return pi;
}
```

And a build script that builds all the configurations I'm testing (`build.sh`

):

```
#!/bin/sh
gcc -O3 -Wall -c -m32 -o fldpi-32.o fldpi.c
gcc -O3 -Wall -c -m64 -o fldpi-64.o fldpi.c
gcc -O3 -Wall -ffast-math -m32 -o pitimes1-32 pitimes.c fldpi-32.o
gcc -O3 -Wall -m32 -o pitimes2-32 pitimes.c fldpi-32.o -lm
gcc -O3 -Wall -fno-builtin -m32 -o pitimes3-32 pitimes.c fldpi-32.o -lm
gcc -O3 -Wall -ffast-math -m64 -o pitimes1-64 pitimes.c fldpi-64.o -lm
gcc -O3 -Wall -m64 -o pitimes2-64 pitimes.c fldpi-64.o -lm
gcc -O3 -Wall -fno-builtin -m64 -o pitimes3-64 pitimes.c fldpi-64.o -lm
```

Apart from testing between various compiler flags (I've compared 32-bit against 64-bit too, because the optimisations are different), I've also tried switching the order of the tests around. The `atan2(0, -1)`

version still comes out top every time, though.

Instead of defining pi as a constant, I always use `acos(-1)`

.

If this article is true, then the algorithm that Bellard has created could be one of the speediest available. He has created pi to 2.7 TRILLION digits using a DESKTOP PC!

...and he has published his work here

Good work Bellard, You are a pioneer!

If you are willing to use an approximation, `355 / 113`

is good for 6 decimal digits, and has the added advantage of being usable with integer expressions. That's not as important these days, as "floating point math co-processor" ceased to have any meaning, but it was quite important once.

The following answers **precisely how to do this in the fastest possible way -- with the least computing effort**. Even if you don't like the answer, you have to admit that it is indeed the fastest way to **get** the value of PI.

The **FASTEST way** to get the value of Pi is:

- chose your favorite programming language
- load it's Math library
- and find that Pi is already defined there!! ready to use it..

in case you don't have a Math library at hand..

the **SECOND FASTEST** way (more universal solution) is:

look up Pi on the Internet, e.g. here:

http://www.eveandersson.com/pi/digits/1000000 (1 million digits .. what's your floating point precision? )

or here:

http://3.141592653589793238462643383279502884197169399375105820974944592.com/

or here:

http://en.wikipedia.org/wiki/Pi

It's really fast to find the digits you need for whatever precision arithmetic you would like to use, and by defining a constant, you can make sure that you don't waste precious CPU time.

Not only is this a partly humorous answer, but in reality, if anybody would go ahead and compute the value of Pi in a real application .. that would be a pretty big waste of CPU time, wouldn't it? At least I don't see a real application for trying to re-compute this.

Dear Moderator: please note that the OP asked: **"Fastest Way** to get the value of PI"

If you want to *compute* an approximation of the value of π (for some reason), you should try a binary extraction algorithm. Bellard's improvement of BBP gives does PI in O(N^2).

If you want to *obtain* an approximation of the value of π to do calculations, then:

```
PI = 3.141592654
```

Granted, that's only an approximation, and not entirely accurate. It's off by a little more than 0.00000000004102. (four ten-trillionths, about ^{4}/_{10,000,000,000}).

If you want to do *math* with π, then get yourself a pencil and paper or a computer algebra package, and use π's exact value, π.

If you really want a formula, this one is fun:

## π = -*i* ln(-1)

This version (in Delphi) is nothing special, but it is at least faster than the version Nick Hodge posted on his blog :). On my machine, it takes about 16 seconds to do a billion iterations, giving a value of **3.14159265**25879 (the accurate part is in bold).

```
program calcpi;
{$APPTYPE CONSOLE}
uses
SysUtils;
var
start, finish: TDateTime;
function CalculatePi(iterations: integer): double;
var
numerator, denominator, i: integer;
sum: double;
begin
{
PI may be approximated with this formula:
4 * (1 - 1/3 + 1/5 - 1/7 + 1/9 - 1/11 .......)
//}
numerator := 1;
denominator := 1;
sum := 0;
for i := 1 to iterations do begin
sum := sum + (numerator/denominator);
denominator := denominator + 2;
numerator := -numerator;
end;
Result := 4 * sum;
end;
begin
try
start := Now;
WriteLn(FloatToStr(CalculatePi(StrToInt(ParamStr(1)))));
finish := Now;
WriteLn('Seconds:' + FormatDateTime('hh:mm:ss.zz',finish-start));
except
on E:Exception do
Writeln(E.Classname, ': ', E.Message);
end;
end.
```

In the interests of completeness, a C++ template version, which for an optimised build will compute PI at compile time and will inline to a single value.

```
#include <iostream>
template<int I>
struct sign
{
enum {value = (I % 2) == 0 ? 1 : -1};
};
template<int I, int J>
struct pi_calc
{
inline static double value ()
{
return (pi_calc<I-1, J>::value () + pi_calc<I-1, J+1>::value ()) / 2.0;
}
};
template<int J>
struct pi_calc<0, J>
{
inline static double value ()
{
return (sign<J>::value * 4.0) / (2.0 * J + 1.0) + pi_calc<0, J-1>::value ();
}
};
template<>
struct pi_calc<0, 0>
{
inline static double value ()
{
return 4.0;
}
};
template<int I>
struct pi
{
inline static double value ()
{
return pi_calc<I, I>::value ();
}
};
int main ()
{
std::cout.precision (12);
const double pi_value = pi<10>::value ();
std::cout << "pi ~ " << pi_value << std::endl;
return 0;
}
```

Note for I > 10, optimised builds can be slow, likewise for non-optimised runs. For 12 iterations I believe there are around 80k calls to value() (in the absence of memoisation).

With doubles:

```
4.0 * (4.0 * Math.Atan(0.2) - Math.Atan(1.0 / 239.0))
```

This will be accurate up to 14 decimal places, enough to fill a double (the inaccuracy is probably because the rest of the decimals in the arc tangents are truncated).

Also Seth, it's 3.14159265358979323846**3**, not 64.

If by fastest you mean fastest to type in the code, here's the golfscript solution:

```
;''6666,-2%{2+.2/@*\/10.3??2*+}*`1000<~\;
```

Calculating π from circle area :-)

```
<input id="range" type="range" min="10" max="960" value="10" step="50" oninput="calcPi()">
<br>
<div id="cont"></div>
<script>
function generateCircle(width) {
var c = width/2;
var delta = 1.0;
var str = "";
var xCount = 0;
for (var x=0; x <= width; x++) {
for (var y = 0; y <= width; y++) {
var d = Math.sqrt((x-c)*(x-c) + (y-c)*(y-c));
if (d > (width-1)/2) {
str += '.';
}
else {
xCount++;
str += 'o';
}
str += " "
}
str += "\n";
}
var pi = (xCount * 4) / (width * width);
return [str, pi];
}
function calcPi() {
var e = document.getElementById("cont");
var width = document.getElementById("range").value;
e.innerHTML = "<h4>Generating circle...</h4>";
setTimeout(function() {
var circ = generateCircle(width);
e.innerHTML = "<pre>" + "π = " + circ[1].toFixed(2) + "\n" + circ[0] +"</pre>";
}, 200);
}
calcPi();
</script>
```