pointers with example - What are the differences between a pointer variable and a reference variable in C++?



15 Answers

What's a C++ reference (for C programmers)

A reference can be thought of as a constant pointer (not to be confused with a pointer to a constant value!) with automatic indirection, ie the compiler will apply the * operator for you.

All references must be initialized with a non-null value or compilation will fail. It's neither possible to get the address of a reference - the address operator will return the address of the referenced value instead - nor is it possible to do arithmetics on references.

C programmers might dislike C++ references as it will no longer be obvious when indirection happens or if an argument gets passed by value or by pointer without looking at function signatures.

C++ programmers might dislike using pointers as they are considered unsafe - although references aren't really any safer than constant pointers except in the most trivial cases - lack the convenience of automatic indirection and carry a different semantic connotation.

Consider the following statement from the C++ FAQ:

Even though a reference is often implemented using an address in the underlying assembly language, please do not think of a reference as a funny looking pointer to an object. A reference is the object. It is not a pointer to the object, nor a copy of the object. It is the object.

But if a reference really were the object, how could there be dangling references? In unmanaged languages, it's impossible for references to be any 'safer' than pointers - there generally just isn't a way to reliably alias values across scope boundaries!

Why I consider C++ references useful

Coming from a C background, C++ references may look like a somewhat silly concept, but one should still use them instead of pointers where possible: Automatic indirection is convenient, and references become especially useful when dealing with RAII - but not because of any perceived safety advantage, but rather because they make writing idiomatic code less awkward.

RAII is one of the central concepts of C++, but it interacts non-trivially with copying semantics. Passing objects by reference avoids these issues as no copying is involved. If references were not present in the language, you'd have to use pointers instead, which are more cumbersome to use, thus violating the language design principle that the best-practice solution should be easier than the alternatives.

normal vs performance

I know references are syntactic sugar, so code is easier to read and write.

But what are the differences?


Summary from answers and links below:

  1. A pointer can be re-assigned any number of times while a reference cannot be re-assigned after binding.
  2. Pointers can point nowhere (NULL), whereas a reference always refers to an object.
  3. You can't take the address of a reference like you can with pointers.
  4. There's no "reference arithmetic" (but you can take the address of an object pointed by a reference and do pointer arithmetic on it as in &obj + 5).

To clarify a misconception:

The C++ standard is very careful to avoid dictating how a compiler may implement references, but every C++ compiler implements references as pointers. That is, a declaration such as:

int &ri = i;

if it's not optimized away entirely, allocates the same amount of storage as a pointer, and places the address of i into that storage.

So, a pointer and a reference both use the same amount of memory.

As a general rule,

  • Use references in function parameters and return types to provide useful and self-documenting interfaces.
  • Use pointers for implementing algorithms and data structures.

Interesting read:




Contrary to popular opinion, it is possible to have a reference that is NULL.

int * p = NULL;
int & r = *p;
r = 1;  // crash! (if you're lucky)

Granted, it is much harder to do with a reference - but if you manage it, you'll tear your hair out trying to find it. References are not inherently safe in C++!

Technically this is an invalid reference, not a null reference. C++ doesn't support null references as a concept as you might find in other languages. There are other kinds of invalid references as well. Any invalid reference raises the spectre of undefined behavior, just as using an invalid pointer would.

The actual error is in the dereferencing of the NULL pointer, prior to the assignment to a reference. But I'm not aware of any compilers that will generate any errors on that condition - the error propagates to a point further along in the code. That's what makes this problem so insidious. Most of the time, if you dereference a NULL pointer, you crash right at that spot and it doesn't take much debugging to figure it out.

My example above is short and contrived. Here's a more real-world example.

class MyClass
{
    ...
    virtual void DoSomething(int,int,int,int,int);
};

void Foo(const MyClass & bar)
{
    ...
    bar.DoSomething(i1,i2,i3,i4,i5);  // crash occurs here due to memory access violation - obvious why?
}

MyClass * GetInstance()
{
    if (somecondition)
        return NULL;
    ...
}

MyClass * p = GetInstance();
Foo(*p);

I want to reiterate that the only way to get a null reference is through malformed code, and once you have it you're getting undefined behavior. It never makes sense to check for a null reference; for example you can try if(&bar==NULL)... but the compiler might optimize the statement out of existence! A valid reference can never be NULL so from the compiler's view the comparison is always false, and it is free to eliminate the if clause as dead code - this is the essence of undefined behavior.

The proper way to stay out of trouble is to avoid dereferencing a NULL pointer to create a reference. Here's an automated way to accomplish this.

template<typename T>
T& deref(T* p)
{
    if (p == NULL)
        throw std::invalid_argument(std::string("NULL reference"));
    return *p;
}

MyClass * p = GetInstance();
Foo(deref(p));

For an older look at this problem from someone with better writing skills, see Null References from Jim Hyslop and Herb Sutter.

For another example of the dangers of dereferencing a null pointer see Exposing undefined behavior when trying to port code to another platform by Raymond Chen.




Apart from syntactic sugar, a reference is a const pointer (not pointer to a const). You must establish what it refers to when you declare the reference variable, and you cannot change it later.

Update: now that I think about it some more, there is an important difference.

A const pointer's target can be replaced by taking its address and using a const cast.

A reference's target cannot be replaced in any way short of UB.

This should permit the compiler to do more optimization on a reference.




References are very similar to pointers, but they are specifically crafted to be helpful to optimizing compilers.

  • References are designed such that it is substantially easier for the compiler to trace which reference aliases which variables. Two major features are very important: no "reference arithmetic" and no reassigning of references. These allow the compiler to figure out which references alias which variables at compile time.
  • References are allowed to refer to variables which do not have memory addresses, such as those the compiler chooses to put into registers. If you take the address of a local variable, it is very hard for the compiler to put it in a register.

As an example:

void maybeModify(int& x); // may modify x in some way

void hurtTheCompilersOptimizer(short size, int array[])
{
    // This function is designed to do something particularly troublesome
    // for optimizers. It will constantly call maybeModify on array[0] while
    // adding array[1] to array[2]..array[size-1]. There's no real reason to
    // do this, other than to demonstrate the power of references.
    for (int i = 2; i < (int)size; i++) {
        maybeModify(array[0]);
        array[i] += array[1];
    }
}

An optimizing compiler may realize that we are accessing a[0] and a[1] quite a bunch. It would love to optimize the algorithm to:

void hurtTheCompilersOptimizer(short size, int array[])
{
    // Do the same thing as above, but instead of accessing array[1]
    // all the time, access it once and store the result in a register,
    // which is much faster to do arithmetic with.
    register int a0 = a[0];
    register int a1 = a[1]; // access a[1] once
    for (int i = 2; i < (int)size; i++) {
        maybeModify(a0); // Give maybeModify a reference to a register
        array[i] += a1;  // Use the saved register value over and over
    }
    a[0] = a0; // Store the modified a[0] back into the array
}

To make such an optimization, it needs to prove that nothing can change array[1] during the call. This is rather easy to do. i is never less than 2, so array[i] can never refer to array[1]. maybeModify() is given a0 as a reference (aliasing array[0]). Because there is no "reference" arithmetic, the compiler just has to prove that maybeModify never gets the address of x, and it has proven that nothing changes array[1].

It also has to prove that there are no ways a future call could read/write a[0] while we have a temporary register copy of it in a0. This is often trivial to prove, because in many cases it is obvious that the reference is never stored in a permanent structure like a class instance.

Now do the same thing with pointers

void maybeModify(int* x); // May modify x in some way

void hurtTheCompilersOptimizer(short size, int array[])
{
    // Same operation, only now with pointers, making the
    // optimization trickier.
    for (int i = 2; i < (int)size; i++) {
        maybeModify(&(array[0]));
        array[i] += array[1];
    }
}

The behavior is the same; only now it is much harder to prove that maybeModify does not ever modify array[1], because we already gave it a pointer; the cat is out of the bag. Now it has to do the much more difficult proof: a static analysis of maybeModify to prove it never writes to &x + 1. It also has to prove that it never saves off a pointer that can refer to array[0], which is just as tricky.

Modern compilers are getting better and better at static analysis, but it is always nice to help them out and use references.

Of course, barring such clever optimizations, compilers will indeed turn references into pointers when needed.

EDIT: Five years after posting this answer, I found an actual technical difference where references are different than just a different way of looking at the same addressing concept. References can modify the lifespan of temporary objects in a way that pointers cannot.

F createF(int argument);

void extending()
{
    const F& ref = createF(5);
    std::cout << ref.getArgument() << std::endl;
};

Normally temporary objects such as the one created by the call to createF(5) are destroyed at the end of the expression. However, by binding that object to a reference, ref, C++ will extend the lifespan of that temporary object until ref goes out of scope.




While both references and pointers are used to indirectly access another value, there are two important differences between references and pointers. The first is that a reference always refers to an object: It is an error to define a reference without initializing it. The behavior of assignment is the second important difference: Assigning to a reference changes the object to which the reference is bound; it does not rebind the reference to another object. Once initialized, a reference always refers to the same underlying object.

Consider these two program fragments. In the first, we assign one pointer to another:

int ival = 1024, ival2 = 2048;
int *pi = &ival, *pi2 = &ival2;
pi = pi2;    // pi now points to ival2

After the assignment, ival, the object addressed by pi remains unchanged. The assignment changes the value of pi, making it point to a different object. Now consider a similar program that assigns two references:

int &ri = ival, &ri2 = ival2;
ri = ri2;    // assigns ival2 to ival

This assignment changes ival, the value referenced by ri, and not the reference itself. After the assignment, the two references still refer to their original objects, and the value of those objects is now the same as well.




A reference is an alias for another variable whereas a pointer holds the memory address of a variable. References are generally used as function parameters so that the passed object is not the copy but the object itself.

    void fun(int &a, int &b); // A common usage of references.
    int a = 0;
    int &b = a; // b is an alias for a. Not so common to use. 



This is based on the tutorial. What is written makes it more clear:

>>> The address that locates a variable within memory is
    what we call a reference to that variable. (5th paragraph at page 63)

>>> The variable that stores the reference to another
    variable is what we call a pointer. (3rd paragraph at page 64)

Simply to remember that,

>>> reference stands for memory location
>>> pointer is a reference container (Maybe because we will use it for
several times, it is better to remember that reference.)

What's more, as we can refer to almost any pointer tutorial, a pointer is an object that is supported by pointer arithmetic which makes pointer similar to an array.

Look at the following statement,

int Tom(0);
int & alias_Tom = Tom;

alias_Tom can be understood as an alias of a variable (different with typedef, which is alias of a type) Tom. It is also OK to forget the terminology of such statement is to create a reference of Tom.




A reference to a pointer is possible in C++, but the reverse is not possible means a pointer to a reference isn't possible. A reference to a pointer provides a cleaner syntax to modify the pointer. Look at this example:

#include<iostream>
using namespace std;

void swap(char * &str1, char * &str2)
{
  char *temp = str1;
  str1 = str2;
  str2 = temp;
}

int main()
{
  char *str1 = "Hi";
  char *str2 = "Hello";
  swap(str1, str2);
  cout<<"str1 is "<<str1<<endl;
  cout<<"str2 is "<<str2<<endl;
  return 0;
}

And consider the C version of the above program. In C you have to use pointer to pointer (multiple indirection), and it leads to confusion and the program may look complicated.

#include<stdio.h>
/* Swaps strings by swapping pointers */
void swap1(char **str1_ptr, char **str2_ptr)
{
  char *temp = *str1_ptr;
  *str1_ptr = *str2_ptr;
  *str2_ptr = temp;
}

int main()
{
  char *str1 = "Hi";
  char *str2 = "Hello";
  swap1(&str1, &str2);
  printf("str1 is %s, str2 is %s", str1, str2);
  return 0;
}

Visit the following for more information about reference to pointer:

As I said, a pointer to a reference isn't possible. Try the following program:

#include <iostream>
using namespace std;

int main()
{
   int x = 10;
   int *ptr = &x;
   int &*ptr1 = ptr;
}



I use references unless I need either of these:

  • Null pointers can be used as a sentinel value, often a cheap way to avoid function overloading or use of a bool.

  • You can do arithmetic on a pointer. For example, p += offset;




Another difference is that you can have pointers to a void type (and it means pointer to anything) but references to void are forbidden.

int a;
void * p = &a; // ok
void & p = a;  //  forbidden

I can't say I'm really happy with this particular difference. I would much prefer it would be allowed with the meaning reference to anything with an address and otherwise the same behavior for references. It would allow to define some equivalents of C library functions like memcpy using references.




Also, a reference that is a parameter to a function that is inlined may be handled differently than a pointer.

void increment(int *ptrint) { (*ptrint)++; }
void increment(int &refint) { refint++; }
void incptrtest()
{
    int testptr=0;
    increment(&testptr);
}
void increftest()
{
    int testref=0;
    increment(testref);
}

Many compilers when inlining the pointer version one will actually force a write to memory (we are taking the address explicitly). However, they will leave the reference in a register which is more optimal.

Of course, for functions that are not inlined the pointer and reference generate the same code and it's always better to pass intrinsics by value than by reference if they are not modified and returned by the function.




Another interesting use of references is to supply a default argument of a user-defined type:

class UDT
{
public:
   UDT() : val_d(33) {};
   UDT(int val) : val_d(val) {};
   virtual ~UDT() {};
private:
   int val_d;
};

class UDT_Derived : public UDT
{
public:
   UDT_Derived() : UDT() {};
   virtual ~UDT_Derived() {};
};

class Behavior
{
public:
   Behavior(
      const UDT &udt = UDT()
   )  {};
};

int main()
{
   Behavior b; // take default

   UDT u(88);
   Behavior c(u);

   UDT_Derived ud;
   Behavior d(ud);

   return 1;
}

The default flavor uses the 'bind const reference to a temporary' aspect of references.




Maybe some metaphors will help; In the context of your desktop screenspace -

  • A reference requires you to specify an actual window.
  • A pointer requires the location of a piece of space on screen that you assure it will contain zero or more instances of that window type.



The difference is that non-constant pointer variable(not to be confused with a pointer to constant) may be changed at some time during program execution, requires pointer semantics to be used(&,*) operators, while references can be set upon initialization only(that's why you can set them in constructor initializer list only, but not somehow else) and use ordinary value accessing semantics. Basically references were introduced to allow support for operators overloading as I had read in some very old book. As somebody stated in this thread - pointer can be set to 0 or whatever value you want. 0(NULL, nullptr) means that the pointer is initialized with nothing. It is an error to dereference null pointer. But actually the pointer may contain a value that doesn't point to some correct memory location. References in their turn try not to allow a user to initialize a reference to something that cannot be referenced due to the fact that you always provide rvalue of correct type to it. Although there are a lot of ways to make reference variable be initialized to a wrong memory location - it is better for you not to dig this deep into details. On machine level both pointer and reference work uniformly - via pointers. Let's say in essential references are syntactic sugar. rvalue references are different to this - they are naturally stack/heap objects.




I always decide by this rule from C++ Core Guidelines:

Prefer T* over T& when "no argument" is a valid option




Related