math chapter 11 - Understanding “randomness”

14 Answers

I guess both methods are as random although my gutfeel would say that rand() * rand() is less random because it would seed more zeroes. As soon as one rand() is 0, the total becomes 0

answer key stats

I can't get my head around this, which is more random?



rand() * rand()

I´m finding it a real brain teaser, could you help me out?


Intuitively I know that the mathematical answer will be that they are equally random, but I can't help but think that if you "run the random number algorithm" twice when you multiply the two together you'll create something more random than just doing it once.

Oversimplification to illustrate a point.

Assume your random function only outputs 0 or 1.

random() is one of (0,1), but random()*random() is one of (0,0,0,1)

You can clearly see that the chances to get a 0 in the second case are in no way equal to those to get a 1.

When I first posted this answer I wanted to keep it as short as possible so that a person reading it will understand from a glance the difference between random() and random()*random(), but I can't keep myself from answering the original ad litteram question:

Which is more random?

Being that random(), random()*random(), random()+random(), (random()+1)/2 or any other combination that doesn't lead to a fixed result have the same source of entropy (or the same initial state in the case of pseudorandom generators), the answer would be that they are equally random (The difference is in their distribution). A perfect example we can look at is the game of Craps. The number you get would be random(1,6)+random(1,6) and we all know that getting 7 has the highest chance, but that doesn't mean the outcome of rolling two dice is more or less random than the outcome of rolling one.

The obligatory xkcd ...

Some things about "randomness" are counter-intuitive.

Assuming flat distribution of rand(), the following will get you non-flat distributions:

  • high bias: sqrt(rand(range^2))
  • bias peaking in the middle: (rand(range) + rand(range))/2
  • low:bias: range - sqrt(rand(range^2))

There are lots of other ways to create specific bias curves. I did a quick test of rand() * rand() and it gets you a very non-linear distribution.

Most rand() implementations have some period. I.e. after some enormous number of calls the sequence repeats. The sequence of outputs of rand() * rand() repeats in half the time, so it is "less random" in that sense.

Also, without careful construction, performing arithmetic on random values tends to cause less randomness. A poster above cited "rand() + rand() + rand() ..." (k times, say) which will in fact tend to k times the mean value of the range of values rand() returns. (It's a random walk with steps symmetric about that mean.)

Assume for concreteness that your rand() function returns a uniformly distributed random real number in the range [0,1). (Yes, this example allows infinite precision. This won't change the outcome.) You didn't pick a particular language and different languages may do different things, but the following analysis holds with modifications for any non-perverse implementation of rand(). The product rand() * rand() is also in the range [0,1) but is no longer uniformly distributed. In fact, the product is as likely to be in the interval [0,1/4) as in the interval [1/4,1). More multiplication will skew the result even further toward zero. This makes the outcome more predictable. In broad strokes, more predictable == less random.

Pretty much any sequence of operations on uniformly random input will be nonuniformly random, leading to increased predictability. With care, one can overcome this property, but then it would have been easier to generate a uniformly distributed random number in the range you actually wanted rather than wasting time with arithmetic.

The accepted answer is quite lovely, but there's another way to answer your question. PachydermPuncher's answer already takes this alternative approach, and I'm just going to expand it out a little.

The easiest way to think about information theory is in terms of the smallest unit of information, a single bit.

In the C standard library, rand() returns an integer in the range 0 to RAND_MAX, a limit that may be defined differently depending on the platform. Suppose RAND_MAX happens to be defined as 2^n - 1 where n is some integer (this happens to be the case in Microsoft's implementation, where n is 15). Then we would say that a good implementation would return n bits of information.

Imagine that rand() constructs random numbers by flipping a coin to find the value of one bit, and then repeating until it has a batch of 15 bits. Then the bits are independent (the value of any one bit does not influence the likelihood of other bits in the same batch have a certain value). So each bit considered independently is like a random number between 0 and 1 inclusive, and is "evenly distributed" over that range (as likely to be 0 as 1).

The independence of the bits ensures that the numbers represented by batches of bits will also be evenly distributed over their range. This is intuitively obvious: if there are 15 bits, the allowed range is zero to 2^15 - 1 = 32767. Every number in that range is a unique pattern of bits, such as:


and if the bits are independent then no pattern is more likely to occur than any other pattern. So all possible numbers in the range are equally likely. And so the reverse is true: if rand() produces evenly distributed integers, then those numbers are made of independent bits.

So think of rand() as a production line for making bits, which just happens to serve them up in batches of arbitrary size. If you don't like the size, break the batches up into individual bits, and then put them back together in whatever quantities you like (though if you need a particular range that is not a power of 2, you need to shrink your numbers, and by far the easiest way to do that is to convert to floating point).

Returning to your original suggestion, suppose you want to go from batches of 15 to batches of 30, ask rand() for the first number, bit-shift it by 15 places, then add another rand() to it. That is a way to combine two calls to rand() without disturbing an even distribution. It works simply because there is no overlap between the locations where you place the bits of information.

This is very different to "stretching" the range of rand() by multiplying by a constant. For example, if you wanted to double the range of rand() you could multiply by two - but now you'd only ever get even numbers, and never odd numbers! That's not exactly a smooth distribution and might be a serious problem depending on the application, e.g. a roulette-like game supposedly allowing odd/even bets. (By thinking in terms of bits, you'd avoid that mistake intuitively, because you'd realise that multiplying by two is the same as shifting the bits to the left (greater significance) by one place and filling in the gap with zero. So obviously the amount of information is the same - it just moved a little.)

Such gaps in number ranges can't be griped about in floating point number applications, because floating point ranges inherently have gaps in them that simply cannot be represented at all: an infinite number of missing real numbers exist in the gap between each two representable floating point numbers! So we just have to learn to live with gaps anyway.

As others have warned, intuition is risky in this area, especially because mathematicians can't resist the allure of real numbers, which are horribly confusing things full of gnarly infinities and apparent paradoxes.

But at least if you think it terms of bits, your intuition might get you a little further. Bits are really easy - even computers can understand them.

Consider you have a simple coin flip problem where even is considered heads and odd is considered tails. The logical implementation is:

rand() mod 2

Over a large enough distribution, the number of even numbers should equal the number of odd numbers.

Now consider a slight tweak:

rand() * rand() mod 2

If one of the results is even, then the entire result should be even. Consider the 4 possible outcomes (even * even = even, even * odd = even, odd * even = even, odd * odd = odd). Now, over a large enough distribution, the answer should be even 75% of the time.

I'd bet heads if I were you.

This comment is really more of an explanation of why you shouldn't implement a custom random function based on your method than a discussion on the mathematical properties of randomness.

It's not exactly obvious, but rand() is typically more random than rand()*rand(). What's important is that this isn't actually very important for most uses.

But firstly, they produce different distributions. This is not a problem if that is what you want, but it does matter. If you need a particular distribution, then ignore the whole “which is more random” question. So why is rand() more random?

The core of why rand() is more random (under the assumption that it is producing floating-point random numbers with the range [0..1], which is very common) is that when you multiply two FP numbers together with lots of information in the mantissa, you get some loss of information off the end; there's just not enough bit in an IEEE double-precision float to hold all the information that was in two IEEE double-precision floats uniformly randomly selected from [0..1], and those extra bits of information are lost. Of course, it doesn't matter that much since you (probably) weren't going to use that information, but the loss is real. It also doesn't really matter which distribution you produce (i.e., which operation you use to do the combination). Each of those random numbers has (at best) 52 bits of random information – that's how much an IEEE double can hold – and if you combine two or more into one, you're still limited to having at most 52 bits of random information.

Most uses of random numbers don't use even close to as much randomness as is actually available in the random source. Get a good PRNG and don't worry too much about it. (The level of “goodness” depends on what you're doing with it; you have to be careful when doing Monte Carlo simulation or cryptography, but otherwise you can probably use the standard PRNG as that's usually much quicker.)

Multiplying numbers would end up in a smaller solution range depending on your computer architecture.

If the display of your computer shows 16 digits rand() would be say 0.1234567890123 multiplied by a second rand(), 0.1234567890123, would give 0.0152415 something you'd definitely find fewer solutions if you'd repeat the experiment 10^14 times.

  1. There is no such thing as more random. It is either random or not. Random means "hard to predict". It does not mean non-deterministic. Both random() and random() * random() are equally random if random() is random. Distribution is irrelevant as far as randomness goes. If a non-uniform distribution occurs, it just means that some values are more likely than others; they are still unpredictable.

  2. Since pseudo-randomness is involved, the numbers are very much deterministic. However, pseudo-randomness is often sufficient in probability models and simulations. It is pretty well known that making a pseudo-random number generator complicated only makes it difficult to analyze. It is unlikely to improve randomness; it often causes it to fail statistical tests.

  3. The desired properties of the random numbers are important: repeatability and reproducibility, statistical randomness, (usually) uniformly distributed, and a large period are a few.

  4. Concerning transformations on random numbers: As someone said, the sum of two or more uniformly distributed results in a normal distribution. This is the additive central limit theorem. It applies regardless of the source distribution as long as all distributions are independent and identical. The multiplicative central limit theorem says the product of two or more independent and indentically distributed random variables is lognormal. The graph someone else created looks exponential, but it is really lognormal. So random() * random() is lognormally distributed (although it may not be independent since numbers are pulled from the same stream). This may be desirable in some applications. However, it is usually better to generate one random number and transform it to a lognormally-distributed number. Random() * random() may be difficult to analyze.

For more information, consult my book at The book is under construction, but the relevant material is there. Note that chapter and section numbers may change over time. Chapter 8 (probability theory) -- sections 8.3.1 and 8.3.3, chapter 10 (random numbers).

Actually, when you think about it rand() * rand() is less random than rand(). Here's why.

Essentially, there are the same number of odd numbers as even numbers. And saying that 0.04325 is odd, and like 0.388 is even, and 0.4 is even, and 0.15 is odd,

That means that rand() has a equal chance of being an even or odd decimal.

On the other hand, rand() * rand() has it's odds stacked a bit differently. Lets say:

double a = rand();
double b = rand();
double c = a * b;

a and b both have a 50% precent chance of being even or odd. Knowing that

  • even * even = even
  • even * odd = even
  • odd * odd = odd
  • odd * even = even

means that there a 75% chance that c is even, while only a 25% chance it's odd, making the value of rand() * rand() more predictable than rand(), therefore less random.

Assuming that rand() returns a number between [0, 1) it is obvious that rand() * rand() will be biased toward 0. This is because multiplying x by a number between [0, 1) will result in a number smaller than x. Here is the distribution of 10000 more random numbers:

google.charts.load("current", { packages: ["corechart"] });

function drawChart() {
  var i;
  var randomNumbers = [];
  for (i = 0; i < 10000; i++) {
    randomNumbers.push(Math.random() * Math.random());
  var chart = new google.visualization.Histogram(document.getElementById("chart-1"));
  var data = new google.visualization.DataTable();
  data.addColumn("number", "Value");
  randomNumbers.forEach(function(randomNumber) {
  chart.draw(data, {
    title: randomNumbers.length + " rand() * rand() values between [0, 1)",
    legend: { position: "none" }
<script src=""></script>

<div id="chart-1" style="height: 500px">Generating chart...</div>

If rand() returns an integer between [x, y] then you have the following distribution. Notice the number of odd vs even values:

google.charts.load("current", { packages: ["corechart"] });
document.querySelector("#draw-chart").addEventListener("click", drawChart);

function randomInt(min, max) {
  return Math.floor(Math.random() * (max - min + 1)) + min;

function drawChart() {
  var min = Number(document.querySelector("#rand-min").value);
  var max = Number(document.querySelector("#rand-max").value);
  if (min >= max) {
  var i;
  var randomNumbers = [];
  for (i = 0; i < 10000; i++) {
    randomNumbers.push(randomInt(min, max) * randomInt(min, max));
  var chart = new google.visualization.Histogram(document.getElementById("chart-1"));
  var data = new google.visualization.DataTable();
  data.addColumn("number", "Value");
  randomNumbers.forEach(function(randomNumber) {
  chart.draw(data, {
    title: randomNumbers.length + " rand() * rand() values between [" + min + ", " + max + "]",
    legend: { position: "none" },
    histogram: { bucketSize: 1 }
<script src=""></script>

<input type="number" id="rand-min" value="0" min="0" max="10">
<input type="number" id="rand-max" value="9" min="0" max="10">
<input type="button" id="draw-chart" value="Apply">

<div id="chart-1" style="height: 500px">Generating chart...</div>

It's easy to show that the sum of the two random numbers is not necessarily random. Imagine you have a 6 sided die and roll. Each number has a 1/6 chance of appearing. Now say you had 2 dice and summed the result. The distribution of those sums is not 1/12. Why? Because certain numbers appear more than others. There are multiple partitions of them. For example the number 2 is the sum of 1+1 only but 7 can be formed by 3+4 or 4+3 or 5+2 etc... so it has a larger chance of coming up.

Therefore, applying a transform, in this case addition on a random function does not make it more random, or necessarily preserve randomness. In the case of the dice above, the distribution is skewed to 7 and therefore less random.

The answer would be it depends, hopefully the rand()*rand() would be more random than rand(), but as:

  • both answers depends on the bit size of your value
  • that in most of the cases you generate depending on a pseudo-random algorithm (which is mostly a number generator that depends on your computer clock, and not that much random).
  • make your code more readable (and not invoke some random voodoo god of random with this kind of mantra).

Well, if you check any of these above I suggest you go for the simple "rand()". Because your code would be more readable (wouldn't ask yourself why you did write this, for ...well... more than 2 sec), easy to maintain (if you want to replace you rand function with a super_rand).

If you want a better random, I would recommend you to stream it from any source that provide enough noise (radio static), and then a simple rand() should be enough.