questions - simple memory leak example in java




Creating a memory leak with Java (20)

I just had an interview, and I was asked to create a memory leak with Java. Needless to say I felt pretty dumb having no clue on how to even start creating one.

What would an example be?


Static field holding object reference [esp final field]

class MemorableClass {
    static final ArrayList list = new ArrayList(100);
}

Calling String.intern() on lengthy String

String str=readString(); // read lengthy string any source db,textbox/jsp etc..
// This will place the string in memory pool from which you can't remove
str.intern();

(Unclosed) open streams ( file , network etc... )

try {
    BufferedReader br = new BufferedReader(new FileReader(inputFile));
    ...
    ...
} catch (Exception e) {
    e.printStacktrace();
}

Unclosed connections

try {
    Connection conn = ConnectionFactory.getConnection();
    ...
    ...
} catch (Exception e) {
    e.printStacktrace();
}

Areas that are unreachable from JVM's garbage collector, such as memory allocated through native methods

In web applications, some objects are stored in application scope until the application is explicitly stopped or removed.

getServletContext().setAttribute("SOME_MAP", map);

Incorrect or inappropriate JVM options, such as the noclassgc option on IBM JDK that prevents unused class garbage collection

See IBM jdk settings.


A common example of this in GUI code is when creating a widget/component and adding a listener to some static/application scoped object and then not removing the listener when the widget is destroyed. Not only do you get a memory leak, but also a performance hit as when whatever you are listening to fires events, all your old listeners are called too.


Any time you keep references around to objects that you no longer need you have a memory leak. See Handling memory leaks in Java programs for examples of how memory leaks manifest themselves in Java and what you can do about it.


As a lot of people have suggested, Resource Leaks are fairly easy to cause - like the JDBC examples. Actual Memory leaks are a bit harder - especially if you aren't relying on broken bits of the JVM to do it for you...

The ideas of creating objects that have a very large footprint and then not being able to access them aren't real memory leaks either. If nothing can access it then it will be garbage collected, and if something can access it then it's not a leak...

One way that used to work though - and I don't know if it still does - is to have a three-deep circular chain. As in Object A has a reference to Object B, Object B has a reference to Object C and Object C has a reference to Object A. The GC was clever enough to know that a two deep chain - as in A <--> B - can safely be collected if A and B aren't accessible by anything else, but couldn't handle the three-way chain...


Create a static Map and keep adding hard references to it. Those will never be GC'd.

public class Leaker {
    private static final Map<String, Object> CACHE = new HashMap<String, Object>();

    // Keep adding until failure.
    public static void addToCache(String key, Object value) { Leaker.CACHE.put(key, value); }
}

Everyone always forgets the native code route. Here's a simple formula for a leak:

  1. Declare native method.
  2. In native method, call malloc. Don't call free.
  3. Call the native method.

Remember, memory allocations in native code come from the JVM heap.


Here's a simple/sinister one via http://wiki.eclipse.org/Performance_Bloopers#String.substring.28.29.

public class StringLeaker
{
    private final String muchSmallerString;

    public StringLeaker()
    {
        // Imagine the whole Declaration of Independence here
        String veryLongString = "We hold these truths to be self-evident...";

        // The substring here maintains a reference to the internal char[]
        // representation of the original string.
        this.muchSmallerString = veryLongString.substring(0, 1);
    }
}

Because the substring refers to the internal representation of the original, much longer string, the original stays in memory. Thus, as long as you have a StringLeaker in play, you have the whole original string in memory, too, even though you might think you're just holding on to a single-character string.

The way to avoid storing an unwanted reference to the original string is to do something like this:

...
this.muchSmallerString = new String(veryLongString.substring(0, 1));
...

For added badness, you might also .intern() the substring:

...
this.muchSmallerString = veryLongString.substring(0, 1).intern();
...

Doing so will keep both the original long string and the derived substring in memory even after the StringLeaker instance has been discarded.


I came across a more subtle kind of resource leak recently. We open resources via class loader's getResourceAsStream and it happened that the input stream handles were not closed.

Uhm, you might say, what an idiot.

Well, what makes this interesting is: this way, you can leak heap memory of the underlying process, rather than from JVM's heap.

All you need is a jar file with a file inside which will be referenced from Java code. The bigger the jar file, the quicker memory gets allocated.

You can easily create such a jar with the following class:

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class BigJarCreator {
    public static void main(String[] args) throws IOException {
        ZipOutputStream zos = new ZipOutputStream(new FileOutputStream(new File("big.jar")));
        zos.putNextEntry(new ZipEntry("resource.txt"));
        zos.write("not too much in here".getBytes());
        zos.closeEntry();
        zos.putNextEntry(new ZipEntry("largeFile.out"));
        for (int i=0 ; i<10000000 ; i++) {
            zos.write((int) (Math.round(Math.random()*100)+20));
        }
        zos.closeEntry();
        zos.close();
    }
}

Just paste into a file named BigJarCreator.java, compile and run it from command line:

javac BigJarCreator.java
java -cp . BigJarCreator

Et voilà: you find a jar archive in your current working directory with two files inside.

Let's create a second class:

public class MemLeak {
    public static void main(String[] args) throws InterruptedException {
        int ITERATIONS=100000;
        for (int i=0 ; i<ITERATIONS ; i++) {
            MemLeak.class.getClassLoader().getResourceAsStream("resource.txt");
        }
        System.out.println("finished creation of streams, now waiting to be killed");

        Thread.sleep(Long.MAX_VALUE);
    }

}

This class basically does nothing, but create unreferenced InputStream objects. Those objects will be garbage collected immediately and thus, do not contribute to heap size. It is important for our example to load an existing resource from a jar file, and size does matter here!

If you're doubtful, try to compile and start the class above, but make sure to chose a decent heap size (2 MB):

javac MemLeak.java
java -Xmx2m -classpath .:big.jar MemLeak

You will not encounter an OOM error here, as no references are kept, the application will keep running no matter how large you chose ITERATIONS in the above example. The memory consumption of your process (visible in top (RES/RSS) or process explorer) grows unless the application gets to the wait command. In the setup above, it will allocate around 150 MB in memory.

If you want the application to play safe, close the input stream right where it's created:

MemLeak.class.getClassLoader().getResourceAsStream("resource.txt").close();

and your process will not exceed 35 MB, independent of the iteration count.

Quite simple and surprising.


I don't think anyone has said this yet: you can resurrect an object by overriding the finalize() method such that finalize() stores a reference of this somewhere. The garbage collector will only be called once on the object so after that the object will never destroyed.


I have had a nice "memory leak" in relation to PermGen and XML parsing once. The XML parser we used (I can't remember which one it was) did a String.intern() on tag names, to make comparison faster. One of our customers had the great idea to store data values not in XML attributes or text, but as tagnames, so we had a document like:

<data>
   <1>bla</1>
   <2>foo</>
   ...
</data>

In fact, they did not use numbers but longer textual IDs (around 20 characters), which were unique and came in at a rate of 10-15 million a day. That makes 200 MB of rubbish a day, which is never needed again, and never GCed (since it is in PermGen). We had permgen set to 512 MB, so it took around two days for the out-of-memory exception (OOME) to arrive...


I think that a valid example could be using ThreadLocal variables in an environment where threads are pooled.

For instance, using ThreadLocal variables in Servlets to communicate with other web components, having the threads being created by the container and maintaining the idle ones in a pool. ThreadLocal variables, if not correctly cleaned up, will live there until, possibly, the same web component overwrites their values.

Of course, once identified, the problem can be solved easily.


I thought it was interesting that no one used the internal class examples. If you have an internal class; it inherently maintains a reference to the containing class. Of course it is not technically a memory leak because Java WILL eventually clean it up; but this can cause classes to hang around longer than anticipated.

public class Example1 {
  public Example2 getNewExample2() {
    return this.new Example2();
  }
  public class Example2 {
    public Example2() {}
  }
}

Now if you call Example1 and get an Example2 discarding Example1, you will inherently still have a link to an Example1 object.

public class Referencer {
  public static Example2 GetAnExample2() {
    Example1 ex = new Example1();
    return ex.getNewExample2();
  }

  public static void main(String[] args) {
    Example2 ex = Referencer.GetAnExample2();
    // As long as ex is reachable; Example1 will always remain in memory.
  }
}

I've also heard a rumor that if you have a variable that exists for longer than a specific amount of time; Java assumes that it will always exist and will actually never try to clean it up if cannot be reached in code anymore. But that is completely unverified.


Most examples here are "too complex". They are edge cases. With these examples, the programmer made a mistake (like don't redefining equals/hashcode), or has been bitten by a corner case of the JVM/JAVA (load of class with static...). I think that's not the type of example an interviewer want or even the most common case.

But there are really simpler cases for memory leaks. The garbage collector only frees what is no longer referenced. We as Java developers don't care about memory. We allocate it when needed and let it be freed automatically. Fine.

But any long-lived application tend to have shared state. It can be anything, statics, singletons... Often non-trivial applications tend to make complex objects graphs. Just forgetting to set a reference to null or more often forgetting to remove one object from a collection is enough to make a memory leak.

Of course all sort of listeners (like UI listeners), caches, or any long-lived shared state tend to produce memory leak if not properly handled. What shall be understood is that this is not a Java corner case, or a problem with the garbage collector. It is a design problem. We design that we add a listener to a long-lived object, but we don't remove the listener when no longer needed. We cache objects, but we have no strategy to remove them from the cache.

We maybe have a complex graph that store the previous state that is needed by a computation. But the previous state is itself linked to the state before and so on.

Like we have to close SQL connections or files. We need to set proper references to null and remove elements from the collection. We shall have proper caching strategies (maximum memory size, number of elements, or timers). All objects that allow a listener to be notified must provide both a addListener and removeListener method. And when these notifiers are no longer used, they must clear their listener list.

A memory leak is indeed truly possible and is perfectly predictable. No need for special language features or corner cases. Memory leaks are either an indicator that something is maybe missing or even of design problems.


Probably one of the simplest examples of a potential memory leak, and how to avoid it, is the implementation of ArrayList.remove(int):

public E remove(int index) {
    RangeCheck(index);

    modCount++;
    E oldValue = (E) elementData[index];

    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index + 1, elementData, index,
                numMoved);
    elementData[--size] = null; // (!) Let gc do its work

    return oldValue;
}

If you were implementing it yourself, would you have thought to clear the array element that is no longer used (elementData[--size] = null)? That reference might keep a huge object alive ...


The answer depends entirely on what the interviewer thought they were asking.

Is it possible in practice to make Java leak? Of course it is, and there are plenty of examples in the other answers.

But there are multiple meta-questions that may have been being asked?

  • Is a theoretically "perfect" Java implementation vulnerable to leaks?
  • Does the candidate understand the difference between theory and reality?
  • Does the candidate understand how garbage collection works?
  • Or how garbage collection is supposed to work in an ideal case?
  • Do they know they can call other languages through native interfaces?
  • Do they know to leak memory in those other languages?
  • Does the candidate even know what memory management is, and what is going on behind the scene in Java?

I'm reading your meta-question as "What's an answer I could have used in this interview situation". And hence, I'm going to focus on interview skills instead of Java. I believe your more likely to repeat the situation of not knowing the answer to a question in an interview than you are to be in a place of needing to know how to make Java leak. So, hopefully, this will help.

One of the most important skills you can develop for interviewing is learning to actively listen to the questions and working with the interviewer to extract their intent. Not only does this let you answer their question the way they want, but also shows that you have some vital communication skills. And when it comes down to a choice between many equally talented developers, I'll hire the one who listens, thinks, and understands before they respond every time.


The following is a pretty pointless example, if you do not understand JDBC. Or at least how JDBC expects a developer to close Connection, Statement and ResultSet instances before discarding them or losing references to them, instead of relying on the implementation of finalize.

void doWork()
{
   try
   {
       Connection conn = ConnectionFactory.getConnection();
       PreparedStatement stmt = conn.preparedStatement("some query"); // executes a valid query
       ResultSet rs = stmt.executeQuery();
       while(rs.hasNext())
       {
          ... process the result set
       }
   }
   catch(SQLException sqlEx)
   {
       log(sqlEx);
   }
}

The problem with the above is that the Connection object is not closed, and hence the physical connection will remain open, until the garbage collector comes around and sees that it is unreachable. GC will invoke the finalize method, but there are JDBC drivers that do not implement the finalize, at least not in the same way that Connection.close is implemented. The resulting behavior is that while memory will be reclaimed due to unreachable objects being collected, resources (including memory) associated with the Connection object might simply not be reclaimed.

In such an event where the Connection's finalize method does not clean up everything, one might actually find that the physical connection to the database server will last several garbage collection cycles, until the database server eventually figures out that the connection is not alive (if it does), and should be closed.

Even if the JDBC driver were to implement finalize, it is possible for exceptions to be thrown during finalization. The resulting behavior is that any memory associated with the now "dormant" object will not be reclaimed, as finalize is guaranteed to be invoked only once.

The above scenario of encountering exceptions during object finalization is related to another other scenario that could possibly lead to a memory leak - object resurrection. Object resurrection is often done intentionally by creating a strong reference to the object from being finalized, from another object. When object resurrection is misused it will lead to a memory leak in combination with other sources of memory leaks.

There are plenty more examples that you can conjure up - like

  • Managing a List instance where you are only adding to the list and not deleting from it (although you should be getting rid of elements you no longer need), or
  • Opening Sockets or Files, but not closing them when they are no longer needed (similar to the above example involving the Connection class).
  • Not unloading Singletons when bringing down a Java EE application. Apparently, the Classloader that loaded the singleton class will retain a reference to the class, and hence the singleton instance will never be collected. When a new instance of the application is deployed, a new class loader is usually created, and the former class loader will continue to exist due to the singleton.

The interviewer was probably looking for a circular reference like the code below (which incidentally only leak memory in very old JVMs that used reference counting, which isn't the case any more). But it's a pretty vague question, so it's a prime opportunity to show off your understanding of JVM memory management.

class A {
    B bRef;
}

class B {
    A aRef;
}

public class Main {
    public static void main(String args[]) {
        A myA = new A();
        B myB = new B();
        myA.bRef = myB;
        myB.aRef = myA;
        myA=null;
        myB=null;
        /* at this point, there is no access to the myA and myB objects, */
        /* even though both objects still have active references. */
    } /* main */
}

Then you can explain that with reference counting, the above code would leak memory. But most modern JVMs don't use reference counting any longer, most use a sweep garbage collector, which will in fact collect this memory.

Next you might explain creating an Object that has an underlying native resource, like this:

public class Main {
    public static void main(String args[]) {
        Socket s = new Socket(InetAddress.getByName("google.com"),80);
        s=null;
        /* at this point, because you didn't close the socket properly, */
        /* you have a leak of a native descriptor, which uses memory. */
    }
}

Then you can explain this is technically a memory leak, but really the leak is caused by native code in the JVM allocating underlying native resources, which weren't freed by your Java code.

At the end of the day, with a modern JVM, you need to write some Java code that allocates a native resource outside the normal scope of the JVM's awareness.


Threads are not collected until they terminate. They serve as roots of garbage collection. They are one of the few objects that won't be reclaimed simply by forgetting about them or clearing references to them.

Consider: the basic pattern to terminate a worker thread is to set some condition variable seen by the thread. The thread can check the variable periodically and use that as a signal to terminate. If the variable is not declared volatile, then the change to the variable might not be seen by the thread, so it won't know to terminate. Or imagine if some threads want to update a shared object, but deadlock while trying to lock on it.

If you only have a handful of threads these bugs will probably be obvious because your program will stop working properly. If you have a thread pool that creates more threads as needed, then the obsolete/stuck threads might not be noticed, and will accumulate indefinitely, causing a memory leak. Threads are likely to use other data in your application, so will also prevent anything they directly reference from ever being collected.

As a toy example:

static void leakMe(final Object object) {
    new Thread() {
        public void run() {
            Object o = object;
            for (;;) {
                try {
                    sleep(Long.MAX_VALUE);
                } catch (InterruptedException e) {}
            }
        }
    }.start();
}

Call System.gc() all you like, but the object passed to leakMe will never die.

(*edited*)


You are able to make memory leak with sun.misc.Unsafe class. In fact this service class is used in different standard classes (for example in java.nio classes). You can't create instance of this class directly, but you may use reflection to do that.

Code doesn't compile in Eclipse IDE - compile it using command javac (during compilation you'll get warnings)

import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import sun.misc.Unsafe;


public class TestUnsafe {

    public static void main(String[] args) throws Exception{
        Class unsafeClass = Class.forName("sun.misc.Unsafe");
        Field f = unsafeClass.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        Unsafe unsafe = (Unsafe) f.get(null);
        System.out.print("4..3..2..1...");
        try
        {
            for(;;)
                unsafe.allocateMemory(1024*1024);
        } catch(Error e) {
            System.out.println("Boom :)");
            e.printStackTrace();
        }
    }

}

You can create a moving memory leak by creating a new instance of a class in that class's finalize method. Bonus points if the finalizer creates multiple instances. Here's a simple program that leaks the entire heap in sometime between a few seconds and a few minutes depending on your heap size:

class Leakee {
    public void check() {
        if (depth > 2) {
            Leaker.done();
        }
    }
    private int depth;
    public Leakee(int d) {
        depth = d;
    }
    protected void finalize() {
        new Leakee(depth + 1).check();
        new Leakee(depth + 1).check();
    }
}

public class Leaker {
    private static boolean makeMore = true;
    public static void done() {
        makeMore = false;
    }
    public static void main(String[] args) throws InterruptedException {
        // make a bunch of them until the garbage collector gets active
        while (makeMore) {
            new Leakee(0).check();
        }
        // sit back and watch the finalizers chew through memory
        while (true) {
            Thread.sleep(1000);
            System.out.println("memory=" +
                    Runtime.getRuntime().freeMemory() + " / " +
                    Runtime.getRuntime().totalMemory());
        }
    }
}




memory-leaks