What are the differences between a pointer variable and a reference variable in C++?


I know references are syntactic sugar, so code is easier to read and write.

But what are the differences?


Summary from answers and links below:

  1. A pointer can be re-assigned any number of times while a reference cannot be re-seated after binding.
  2. Pointers can point nowhere (NULL), whereas reference always refer to an object.
  3. You can't take the address of a reference like you can with pointers.
  4. There's no "reference arithmetics" (but you can take the address of an object pointed by a reference and do pointer arithmetics on it as in &obj + 5).

To clarify a misconception:

The C++ standard is very careful to avoid dictating how a compiler must implement references, but every C++ compiler implements references as pointers. That is, a declaration such as:

int &ri = i;

if it's not optimized away entirely, allocates the same amount of storage as a pointer, and places the address of i into that storage.

So, a pointer and a reference both occupy the same amount of memory.

As a general rule,

  • Use references in function parameters and return types to define useful and self-documenting interfaces.
  • Use pointers to implement algorithms and data structures.

Interesting read:



Answers



  1. A pointer can be re-assigned:

    int x = 5;
    int y = 6;
    int *p;
    p =  &x;
    p = &y;
    *p = 10;
    assert(x == 5);
    assert(y == 10);

    A reference cannot, and must be assigned at initialization:

    int x = 5;
    int y = 6;
    int &r = x;
  2. A pointer has its own memory address and size on the stack (4 bytes on x86), whereas a reference shares the same memory address (with the original variable) but also takes up some space on the stack. Since a reference has the same address as the original variable itself, it is safe to think of a reference as another name for the same variable. Note: What a pointer points to can be on the stack or heap. Ditto a reference. My claim in this statement is not that a pointer must point to the stack. A pointer is just a variable that holds a memory address. This variable is on the stack. Since a reference has its own space on the stack, and since the address is the same as the variable it references. More on stack vs heap. This implies that there is a real address of a reference that the compiler will not tell you.

    int x = 0;
    int &r = x;
    int *p = &x;
    int *p2 = &r;
    assert(p == p2);
  3. You can have pointers to pointers to pointers offering extra levels of indirection. Whereas references only offer one level of indirection.

    int x = 0;
    int y = 0;
    int *p = &x;
    int *q = &y;
    int **pp = &p;
    pp = &q;//*pp = q
    **pp = 4;
    assert(y == 4);
    assert(x == 0);
  4. Pointer can be assigned nullptr directly, whereas reference cannot. If you try hard enough, and you know how, you can make the address of a reference nullptr. Likewise, if you try hard enough you can have a reference to a pointer, and then that reference can contain nullptr.

    int *p = nullptr;
    int &r = nullptr; <--- compiling error
  5. Pointers can iterate over an array, you can use ++ to go to the next item that a pointer is pointing to, and + 4 to go to the 5th element. This is no matter what size the object is that the pointer points to.

  6. A pointer needs to be dereferenced with * to access the memory location it points to, whereas a reference can be used directly. A pointer to a class/struct uses -> to access it's members whereas a reference uses a ..

  7. A pointer is a variable that holds a memory address. Regardless of how a reference is implemented, a reference has the same memory address as the item it references.

  8. References cannot be stuffed into an array, whereas pointers can be (Mentioned by user @litb)

  9. Const references can be bound to temporaries. Pointers cannot (not without some indirection):

    const int &x = int(12); //legal C++
    int *y = &int(12); //illegal to dereference a temporary.

    This makes const& safer for use in argument lists and so forth.




What's a C++ reference (for C programmers)

A reference can be thought of as a constant pointer (not to be confused with a pointer to a constant value!) with automatic indirection, ie the compiler will apply the * operator for you.

All references must be initialized with a non-null value or compilation will fail. It's neither possible to get the address of a reference - the address operator will return the address of the referenced value instead - nor is it possible to do arithmetics on references.

C programmers might dislike C++ references as it will no longer be obvious when indirection happens or if an argument gets passed by value or by pointer without looking at function signatures.

C++ programmers might dislike using pointers as they are considered unsafe - although references aren't really any safer than constant pointers except in the most trivial cases - lack the convenience of automatic indirection and carry a different semantic connotation.

Consider the following statement from the C++ FAQ:

Even though a reference is often implemented using an address in the underlying assembly language, please do not think of a reference as a funny looking pointer to an object. A reference is the object. It is not a pointer to the object, nor a copy of the object. It is the object.

But if a reference really were the object, how could there be dangling references? In unmanaged languages, it's impossible for references to be any 'safer' than pointers - there generally just isn't a way to reliably alias values across scope boundaries!

Why I consider C++ references useful

Coming from a C background, C++ references may look like a somewhat silly concept, but one should still use them instead of pointers where possible: Automatic indirection is convenient, and references become especially useful when dealing with RAII - but not because of any perceived safety advantage, but rather because they make writing idiomatic code less awkward.

RAII is one of the central concepts of C++, but it interacts non-trivially with copying semantics. Passing objects by reference avoids these issues as no copying is involved. If references were not present in the language, you'd have to use pointers instead, which are more cumbersome to use, thus violating the language design principle that the best-practice solution should be easier than the alternatives.




If you want to be really pedantic, there is one thing you can do with a reference that you can't do with a pointer: extend the lifetime of a temporary object. In C++ if you bind a const reference to a temporary object, the lifetime of that object becomes the lifetime of the reference.

std::string s1 = "123";
std::string s2 = "456";

std::string s3_copy = s1 + s2;
const std::string& s3_reference = s1 + s2;

In this example s3_copy copies the temporary object that is a result of the concatenation. Whereas s3_reference in essence becomes the temporary object. It's really a reference to a temporary object that now has the same lifetime as the reference.

If you try this without the const it should fail to compile. You cannot bind a non-const reference to a temporary object, nor can you take its address for that matter.




how does the ampersand(&) sign work in c++?

To start, note that

this

is a special pointer ( == memory address) to the class its in. First, an object is instantiated:

CDummy a;

Next, a pointer is instantiated:

CDummy *b;

Next, the memory address of a is assigned to the pointer b:

b = &a;

Next, the method CDummy::isitme(CDummy &param) is called:

b->isitme(a);

A test is evaluated inside this method:

if (&param == this) // do something

Here's the tricky part. param is an object of type CDummy, but &param is the memory address of param. So the memory address of param is tested against another memory address called "this". If you copy the memory address of the object this method is called from into the argument of this method, this will result in true.

This kind of evaluation is usually done when overloading the copy constructor

MyClass& MyClass::operator=(const MyClass &other) {
    // if a programmer tries to copy the same object into itself, protect
    // from this behavior via this route
    if (&other == this) return *this;
    else {
        // otherwise truly copy other into this
    }
}

Also note the usage of *this, where this is being dereferenced. That is, instead of returning the memory address, return the object located at that memory address.




The & has more the one meanings:

1) take the address of a variable

int x;
void* p = &x;
//p will now point to x, as &x is the address of x

2) pass an argument by reference to a function

void foo(CDummy& x);
//you pass x by reference
//if you modify x inside the function, the change will be applied to the original variable
//a copy is not created for x, the original one is used
//this is preffered for passing large objects
//to prevent changes, pass by const reference:
void fooconst(const CDummy& x);

3) declare a reference variable

int k = 0;
int& r = k;
//r is a reference to k
r = 3;
assert( k == 3 );

4) bitwise and operator

int a = 3 & 1; // a = 1

n) others???




Well the CDummy& param that is declared as a parameter of the function CDummy::isitme is actually a reference which is "like" a pointer, but different. The important thing to note about references is that inside functions where they are passed as parameters, you really have a reference to the instance of the type, not "just" a pointer to it. So, on the line with the comment, the '&' is functioning just like in C, it is getting the address of the argument passed in, and comparing it to this which is, of course, a pointer to the instance of the class the method is being called on.




Pass by pointer & Pass by reference

A reference is semantically the following:

T& <=> *(T * const)

const T& <=> *(T const * const)

T&& <=> [no C equivalent] (C++11)

As with other answers, the following from the C++ FAQ is the one-line answer: references when possible, pointers when needed.

An advantage over pointers is that you need explicit casting in order to pass NULL. It's still possible, though. Of the compilers I've tested, none emit a warning for the following:

int* p() {
    return 0;
}
void x(int& y) {
  y = 1;
}
int main() {
   x(*p());
}



In fact, most compilers emit the same code for both functions calls, because references are generally implemented using pointers.

Following this logic, when an argument of (non-const) reference type is used in the function body, the generated code will just silently operate on the address of the argument and it will dereference it. In addition, when a call to such a function is encountered, the compiler will generate code that passes the address of the arguments instead of copying their value.

Basically, references and pointers are not very different from an implementation point of view, the main (and very important) difference is in the philosophy: a reference is the object itself, just with a different name.

References have a couple more advantages compared to pointers (e. g. they can't be NULL, so they are safer to use). Consequently, if you can use C++, then passing by reference is generally considered more elegant and it should be preferred. However, in C, there's no passing by reference, so if you want to write C code (or, horribile dictu, code that compiles with both a C and a C++ compiler, albeit that's not a good idea), you'll have to restrict yourself to using pointers.




Pass by pointer is the only way you could pass "by reference" in C, so you still see it used quite a bit.

The NULL pointer is a handy convention for saying a parameter is unused or not valid, so use a pointer in that case.

References can't be updated once they're set, so use a pointer if you ever need to reassign it.

Prefer a reference in every case where there isn't a good reason not to. Make it const if you can.




How is a reference different from a pointer in implementation?

Most references are implemented using a pointer variable i.e. a reference usually takes up one word of memory. However, a reference that is used purely locally can - and often is - eliminated by the optimizer. For example:

  struct S { int a, int b[100]; };  
  void do_something(const vector<S>& v)
  {
    for (int i=0; i<v.size(); ++i) {
        int*& p = v[i].b;
          for (int j=0; j<100; ++j) cout <<p[j];
  }

In this case, p needs not be stored in memory (maybe it just exists in a register, maybe it disappears into the instructions).




A reference can be thought of as an implicitly de-referenced constant pointer (note this). Once a reference, always a reference. It allows for ease of writing code. Unless of course, you bring in move semantics and r-value references. The standard doesn't mandate how references should be implemented just as it does not mandate how pointers should be implemented. Most of the time though, pointers are synonymous with addresses of objects.




Use references when you can, and pointers when you need to. Reasons you'd need to use a pointer:

  1. There might not be an object for it to point at (null pointer, no null references).
  2. You might need to refer to different objects during its lifetime.
  3. You might need to refer to a whole array of objects (but std::vector is usually better).

There are, however, cases where the usage of the two doesn't really overlap at all, and you simply can't substitute one for the other. For one obvious example, consider the following code:

template <class T, size_t N>
size_t size(T(&matrix)[N]) {
    return N;
}

This allows you to find the size of an array:

int array1[some_size];
int array2[some_other_size];

size_t n = size(array1);  // retrieves `some_size`
size_t m = size(array2);  // retrieves `some_other_size`

...but it simply won't compile if you try to pass a pointer to it:

int *x = new int[some_size];

size_t n = size(x);    // won't compile

At very best it seems to make little sense to write the code to receive a pointer when part of its point is to reject being passed a pointer.