Is Java “pass-by-reference” or “pass-by-value”?


Answers

I just noticed you referenced my article.

The Java Spec says that everything in Java is pass-by-value. There is no such thing as "pass-by-reference" in Java.

The key to understanding this is that something like

Dog myDog;

is not a Dog; it's actually a pointer to a Dog.

What that means, is when you have

Dog myDog = new Dog("Rover");
foo(myDog);

you're essentially passing the address of the created Dog object to the foo method.

(I say essentially because Java pointers aren't direct addresses, but it's easiest to think of them that way)

Suppose the Dog object resides at memory address 42. This means we pass 42 to the method.

if the Method were defined as

public void foo(Dog someDog) {
    someDog.setName("Max");     // AAA
    someDog = new Dog("Fifi");  // BBB
    someDog.setName("Rowlf");   // CCC
}

let's look at what's happening.

  • the parameter someDog is set to the value 42
  • at line "AAA"
    • someDog is followed to the Dog it points to (the Dog object at address 42)
    • that Dog (the one at address 42) is asked to change his name to Max
  • at line "BBB"
    • a new Dog is created. Let's say he's at address 74
    • we assign the parameter someDog to 74
  • at line "CCC"
    • someDog is followed to the Dog it points to (the Dog object at address 74)
    • that Dog (the one at address 74) is asked to change his name to Rowlf
  • then, we return

Now let's think about what happens outside the method:

Did myDog change?

There's the key.

Keeping in mind that myDog is a pointer, and not an actual Dog, the answer is NO. myDog still has the value 42; it's still pointing to the original Dog (but note that because of line "AAA", its name is now "Max" - still the same Dog; myDog's value has not changed.)

It's perfectly valid to follow an address and change what's at the end of it; that does not change the variable, however.

Java works exactly like C. You can assign a pointer, pass the pointer to a method, follow the pointer in the method and change the data that was pointed to. However, you cannot change where that pointer points.

In C++, Ada, Pascal and other languages that support pass-by-reference, you can actually change the variable that was passed.

If Java had pass-by-reference semantics, the foo method we defined above would have changed where myDog was pointing when it assigned someDog on line BBB.

Think of reference parameters as being aliases for the variable passed in. When that alias is assigned, so is the variable that was passed in.

Question

I always thought Java was pass-by-reference, however I've seen a couple of blog posts (For example, this blog) that claim it isn't. I don't think I understand the distinction they're making.

What is the explanation?




Java copies the reference by value. So if you change it to something else (e.g, using new) the reference does not change outside the method. For native types, it is always pass by value.




Just to show the contrast, compare the following C++ and Java snippets:

In C++: Note: Bad code - memory leaks! But it demonstrates the point.

void cppMethod(int val, int &ref, Dog obj, Dog &objRef, Dog *objPtr, Dog *&objPtrRef)
{
    val = 7; // Modifies the copy
    ref = 7; // Modifies the original variable
    obj.SetName("obj"); // Modifies the copy of Dog passed
    objRef.SetName("objRef"); // Modifies the original Dog passed
    objPtr->SetName("objPtr"); // Modifies the original Dog pointed to 
                               // by the copy of the pointer passed.
    objPtr = new Dog("newObjPtr");  // Modifies the copy of the pointer, 
                                   // leaving the original object alone.
    objPtrRef->SetName("objRefPtr"); // Modifies the original Dog pointed to 
                                    // by the original pointer passed. 
    objPtrRef = new Dog("newObjPtrRef"); // Modifies the original pointer passed
}

int main()
{
    int a = 0;
    int b = 0;
    Dog d0 = Dog("d0");
    Dog d1 = Dog("d1");
    Dog *d2 = new Dog("d2");
    Dog *d3 = new Dog("d3");
    cppMethod(a, b, d0, d1, d2, d3);
    // a is still set to 0
    // b is now set to 7
    // d0 still have name "d0"
    // d1 now has name "objRef"
    // d2 now has name "objPtr"
    // d3 now has name "newObjPtrRef"
}

In Java,

public static void javaMethod(int val, Dog objPtr)
{
   val = 7; // Modifies the copy
   objPtr.SetName("objPtr") // Modifies the original Dog pointed to 
                            // by the copy of the pointer passed.
   objPtr = new Dog("newObjPtr");  // Modifies the copy of the pointer, 
                                  // leaving the original object alone.
}

public static void main()
{
    int a = 0;
    Dog d0 = new Dog("d0");
    javaMethod(a, d0);
    // a is still set to 0
    // d0 now has name "objPtr"
}

Java only has the two types of passing: by value for built-in types, and by value of the pointer for object types.




I always think of it as "pass by copy". It is a copy of the value be it primitive or reference. If it is a primitive it is a copy of the bits that are the value and if it is an Object it is a copy of the reference.

public class PassByCopy{
    public static void changeName(Dog d){
        d.name = "Fido";
    }
    public static void main(String[] args){
        Dog d = new Dog("Maxx");
        System.out.println("name= "+ d.name);
        changeName(d);
        System.out.println("name= "+ d.name);
    }
}
class Dog{
    public String name;
    public Dog(String s){
        this.name = s;
    }
}

output of java PassByCopy:

name= Maxx
name= Fido

Primitive wrapper classes and Strings are immutable so any example using those types will not work the same as other types/objects.




This will give you some insights of how Java really works to the point that in your next discussion about Java passing by reference or passing by value you'll just smile :-)

Step one please erase from your mind that word that starts with 'p' "_ _ _ _ _ _ _", especially if you come from other programming languages. Java and 'p' cannot be written in the same book, forum, or even txt.

Step two remember that when you pass an Object into a method you're passing the Object reference and not the Object itself.

  • Student: Master, does this mean that Java is pass-by-reference?
  • Master: Grasshopper, No.

Now think of what an Object's reference/variable does/is:

  1. A variable holds the bits that tell the JVM how to get to the referenced Object in memory (Heap).
  2. When passing arguments to a method you ARE NOT passing the reference variable, but a copy of the bits in the reference variable. Something like this: 3bad086a. 3bad086a represents a way to get to the passed object.
  3. So you're just passing 3bad086a that it's the value of the reference.
  4. You're passing the value of the reference and not the reference itself (and not the object).
  5. This value is actually COPIED and given to the method.

In the following (please don't try to compile/execute this...):

1. Person person;
2. person = new Person("Tom");
3. changeName(person);
4.
5. //I didn't use Person person below as an argument to be nice
6. static void changeName(Person anotherReferenceToTheSamePersonObject) {
7.     anotherReferenceToTheSamePersonObject.setName("Jerry");
8. }

What happens?

  • The variable person is created in line #1 and it's null at the beginning.
  • A new Person Object is created in line #2, stored in memory, and the variable person is given the reference to the Person object. That is, its address. Let's say 3bad086a.
  • The variable person holding the address of the Object is passed to the function in line #3.
  • In line #4 you can listen to the sound of silence
  • Check the comment on line #5
  • A method local variable -anotherReferenceToTheSamePersonObject- is created and then comes the magic in line #6:
    • The variable/reference person is copied bit-by-bit and passed to anotherReferenceToTheSamePersonObject inside the function.
    • No new instances of Person are created.
    • Both "person" and "anotherReferenceToTheSamePersonObject" hold the same value of 3bad086a.
    • Don't try this but person==anotherReferenceToTheSamePersonObject would be true.
    • Both variables have IDENTICAL COPIES of the reference and they both refer to the same Person Object, the SAME Object on the Heap and NOT A COPY.

A picture is worth a thousand words:

Note that the anotherReferenceToTheSamePersonObject arrows is directed towards the Object and not towards the variable person!

If you didn't get it then just trust me and remember that it's better to say that Java is pass by value. Well, pass by reference value. Oh well, even better is pass-by-copy-of-the-variable-value! ;)

Now feel free to hate me but note that given this there is no difference between passing primitive data types and Objects when talking about method arguments.

You always pass a copy of the bits of the value of the reference!

  • If it's a primitive data type these bits will contain the value of the primitive data type itself.
  • If it's an Object the bits will contain the value of the address that tells the JVM how to get to the Object.

Java is pass-by-value because inside a method you can modify the referenced Object as much as you want but no matter how hard you try you'll never be able to modify the passed variable that will keep referencing (not p _ _ _ _ _ _ _) the same Object no matter what!


The changeName function above will never be able to modify the actual content (the bit values) of the passed reference. In other word changeName cannot make Person person refer to another Object.


Of course you can cut it short and just say that Java is pass-by-value!




The crux of the matter is that the word reference in the expression "pass by reference" means something completely different from the usual meaning of the word reference in Java.

Usually in Java reference means a a reference to an object. But the technical terms pass by reference/value from programming language theory is talking about a reference to the memory cell holding the variable, which is something completely different.




A reference is always a value when represented, no matter what language you use.

Getting an outside of the box view, let's look at Assembly or some low level memory management. At the CPU level a reference to anything immediately becomes a value if it gets written to memory or to one of the CPU registers. (That is why pointer is a good definition. It is a value, which has a purpose at the same time).

Data in memory has a Location and at that location there is a value (byte,word, whatever). In Assembly we have a convenient solution to give a Name to certain Location (aka variable), but when compiling the code, the assembler simply replaces Name with the designated location just like your browser replaces domain names with IP addresses.

Down to the core it is technically impossible to pass a reference to anything in any language without representing it (when it immediately becomes a value).

Lets say we have a variable Foo, its Location is at the 47th byte in memory and its Value is 5. We have another variable Ref2Foo which is at 223rd byte in memory, and its value will be 47. This Ref2Foo might be a technical variable, not explicitly created by the program. If you just look at 5 and 47 without any other information, you will see just two Values. If you use them as references then to reach to 5 we have to travel:

(Name)[Location] -> [Value at the Location]
---------------------
(Ref2Foo)[223]  -> 47
(Foo)[47]       -> 5

This is how jump-tables work.

If we want to call a method/function/procedure with Foo's value, there are a few possible way to pass the variable to the method, depending on the language and its several method invocation modes:

  1. 5 gets copied to one of the CPU registers (ie. EAX).
  2. 5 gets PUSHd to the stack.
  3. 47 gets copied to one of the CPU registers
  4. 47 PUSHd to the stack.
  5. 223 gets copied to one of the CPU registers.
  6. 223 gets PUSHd to the stack.

In every cases above a value - a copy of an existing value - has been created, it is now upto the receiving method to handle it. When you write "Foo" inside the method, it is either read out from EAX, or automatically dereferenced, or double dereferenced, the process depends on how the language works and/or what the type of Foo dictates. This is hidden from the developer until she circumvents the dereferencing process. So a reference is a value when represented, because a reference is a value that has to be processed (at language level).

Now we have passed Foo to the method:

  • in case 1. and 2. if you change Foo (Foo = 9) it only affects local scope as you have a copy of the Value. From inside the method we cannot even determine where in memory the original Foo was located.
  • in case 3. and 4. if you use default language constructs and change Foo (Foo = 11), it could change Foo globally (depends on the language, ie. Java or like Pascal's procedure findMin(x, y, z: integer;var m: integer);). However if the language allows you to circumvent the dereference process, you can change 47, say to 49. At that point Foo seems to have been changed if you read it, because you have changed the local pointer to it. And if you were to modify this Foo inside the method (Foo = 12) you will probably FUBAR the execution of the program (aka. segfault) because you will write to a different memory than expected, you can even modify an area that is destined to hold executable program and writing to it will modify running code (Foo is now not at 47). BUT Foo's value of 47 did not change globally, only the one inside the method, because 47 was also a copy to the method.
  • in case 5. and 6. if you modify 223 inside the method it creates the same mayhem as in 3. or 4. (a pointer, pointing to a now bad value, that is again used as a pointer) but this is still a local problem, as 223 was copied. However if you are able to dereference Ref2Foo (that is 223), reach to and modify the pointed value 47, say, to 49, it will affect Foo globally, because in this case the methods got a copy of 223 but the referenced 47 exists only once, and changing that to 49 will lead every Ref2Foo double-dereferencing to a wrong value.

Nitpicking on insignificant details, even languages that do pass-by-reference will pass values to functions, but those functions know that they have to use it for dereferencing purposes. This pass-the-reference-as-value is just hidden from the programmer because it is practically useless and the terminology is only pass-by-reference.

Strict pass-by-value is also useless, it would mean that a 100 Mbyte array should have to be copied every time we call a method with the array as argument, therefore Java cannot be stricly pass-by-value. Every language would pass a reference to this huge array (as a value) and either employs copy-on-write mechanism if that array can be changed locally inside the method or allows the method (as Java does) to modify the array globally (from the caller's view) and a few languages allows to modify the Value of the reference itself.

So in short and in Java's own terminology, Java is pass-by-value where value can be: either a real value or a value that is a representation of a reference.




The distinction, or perhaps just the way I remember as I used to be under the same impression as the original poster is this: Java is always pass by value. All objects( in Java, anything except for primitives) in Java are references. These references are passed by value.




You can never pass by reference in Java, and one of the ways that is obvious is when you want to return more than one value from a method call. Consider the following bit of code in C++:

void getValues(int& arg1, int& arg2) {
    arg1 = 1;
    arg2 = 2;
}
void caller() {
    int x;
    int y;
    getValues(x, y);
    cout << "Result: " << x << " " << y << endl;
}

Sometimes you want to use the same pattern in Java, but you can't; at least not directly. Instead you could do something like this:

void getValues(int[] arg1, int[] arg2) {
    arg1[0] = 1;
    arg2[0] = 2;
}
void caller() {
    int[] x = new int[1];
    int[] y = new int[1];
    getValues(x, y);
    System.out.println("Result: " + x[0] + " " + y[0]);
}

As was explained in previous answers, in Java you're passing a pointer to the array as a value into getValues. That is enough, because the method then modifies the array element, and by convention you're expecting element 0 to contain the return value. Obviously you can do this in other ways, such as structuring your code so this isn't necessary, or constructing a class that can contain the return value or allow it to be set. But the simple pattern available to you in C++ above is not available in Java.




In Java only references are passed and are passed by value:

Java arguments are all passed by value (the reference is copied when used by the method) :

In the case of primitive types, Java behaviour is simple: The value is copied in another instance of the primitive type.

In case of Objects, this is the same: Object variables are pointers (buckets) holding only Object’s address that was created using the "new" keyword, and are copied like primitive types.

The behaviour can appear different from primitive types: Because the copied object-variable contains the same address (to the same Object) Object's content/members might still be modified within a method and later access outside, giving the illusion that the (containing) Object itself was passed by reference.

"String" Objects appear to be a perfect counter-example to the urban legend saying that "Objects are passed by reference":

In effect, within a method you will never be able, to update the value of a String passed as argument:

A String Object, holds characters by an array declared final that can't be modified. Only the address of the Object might be replaced by another using "new". Using "new" to update the variable, will not let the Object be accessed from outside, since the variable was initially passed by value and copied.




Java is always pass by value, not pass by reference

First of all, we need to understand what pass by value and pass by reference are.

Pass by value means that you are making a copy in memory of the actual parameter's value that is passed in. This is a copy of the contents of the actual parameter.

Pass by reference (also called pass by address) means that a copy of the address of the actual parameter is stored.

Sometimes Java can give the illusion of pass by reference. Let's see how it works by using the example below:

public class PassByValue {
    public static void main(String[] args) {
        Test t = new Test();
        t.name = "initialvalue";
        new PassByValue().changeValue(t);
        System.out.println(t.name);
    }

    public void changeValue(Test f) {
        f.name = "changevalue";
    }
}

class Test {
    String name;
}

The output of this program is:

changevalue

Let's understand step by step:

Test t = new Test();

As we all know it will create an object in the heap and return the reference value back to t. For example, suppose the value of t is 0x100234 (we don't know the actual JVM internal value, this is just an example) .

new PassByValue().changeValue(t);

When passing reference t to the function it will not directly pass the actual reference value of object test, but it will create a copy of t and then pass it to the function. Since it is passing by value, it passes a copy of the variable rather than the actual reference of it. Since we said the value of t was 0x100234, both t and f will have the same value and hence they will point to the same object.

If you change anything in the function using reference f it will modify the existing contents of the object. That is why we got the output changevalue, which is updated in the function.

To understand this more clearly, consider the following example:

public class PassByValue {
    public static void main(String[] args) {
        Test t = new Test();
        t.name = "initialvalue";
        new PassByValue().changeRefence(t);
        System.out.println(t.name);
    }

    public void changeRefence(Test f) {
        f = null;
    }
}

class Test {
    String name;
}

Will this throw a NullPointerException? No, because it only passes a copy of the reference. In the case of passing by reference, it could have thrown a NullPointerException, as seen below:

Hopefully this will help.




To make a long story short, Java objects have some very peculiar properties.

In general, Java has primitive types (int, bool, char, double, etc) that are passed directly by value. Then Java has objects (everything that derives from java.lang.Object). Objects are actually always handled through a reference (a reference being a pointer that you can't touch). That means that in effect, objects are passed by reference, as the references are normally not interesting. It does however mean that you cannot change which object is pointed to as the reference itself is passed by value.

Does this sound strange and confusing? Let's consider how C implements pass by reference and pass by value. In C, the default convention is pass by value. void foo(int x) passes an int by value. void foo(int *x) is a function that does not want an int a, but a pointer to an int: foo(&a). One would use this with the & operator to pass a variable address.

Take this to C++, and we have references. References are basically (in this context) syntactic sugar that hide the pointer part of the equation: void foo(int &x) is called by foo(a), where the compiler itself knows that it is a reference and the address of the non-reference a should be passed. In Java, all variables referring to objects are actually of reference type, in effect forcing call by reference for most intends and purposes without the fine grained control (and complexity) afforded by, for example, C++.




As far as I know, Java only knows call by value. This means for primitive datatypes you will work with an copy and for objects you will work with an copy of the reference to the objects. However I think there are some pitfalls; for example, this will not work:

public static void swap(StringBuffer s1, StringBuffer s2) {
    StringBuffer temp = s1;
    s1 = s2;
    s2 = temp;
}


public static void main(String[] args) {
    StringBuffer s1 = new StringBuffer("Hello");
    StringBuffer s2 = new StringBuffer("World");
    swap(s1, s2);
    System.out.println(s1);
    System.out.println(s2);
}

This will populate Hello World and not World Hello because in the swap function you use copys which have no impact on the references in the main. But if your objects are not immutable you can change it for example:

public static void appendWorld(StringBuffer s1) {
    s1.append(" World");
}

public static void main(String[] args) {
    StringBuffer s = new StringBuffer("Hello");
    appendWorld(s);
    System.out.println(s);
}

This will populate Hello World on the command line. If you change StringBuffer into String it will produce just Hello because String is immutable. For example:

public static void appendWorld(String s){
    s = s+" World";
}

public static void main(String[] args) {
    String s = new String("Hello");
    appendWorld(s);
    System.out.println(s);
}

However you could make a wrapper for String like this which would make it able to use it with Strings:

class StringWrapper {
    public String value;

    public StringWrapper(String value) {
        this.value = value;
    }
}

public static void appendWorld(StringWrapper s){
    s.value = s.value +" World";
}

public static void main(String[] args) {
    StringWrapper s = new StringWrapper("Hello");
    appendWorld(s);
    System.out.println(s.value);
}

edit: i believe this is also the reason to use StringBuffer when it comes to "adding" two Strings because you can modifie the original object which u can't with immutable objects like String is.




I can't believe that nobody mentioned Barbara Liskov yet. When she designed CLU in 1974, she ran into this same terminology problem, and she invented the term call by sharing (also known as call by object-sharing and call by object) for this specific case of "call by value where the value is a reference".




Java passes references by value.

So you can't change the reference that gets passed in.






Links