Creating a memory leak with Java



Answers

Static field holding an object reference [esp. final field]

class MemorableClass {
    static final ArrayList list = new ArrayList(100);
}

Calling String.intern() on a lengthy String

String str = readString(); // read a lengthy string from any source: db, textbox/JSP, etc.
// This places the string in the string pool, from which you can't remove it
str.intern();
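
A minimal sketch of what intern() actually does, in case it helps: strings built at runtime are distinct objects, while intern() returns one canonical copy from the pool. InternDemo and buildAtRuntime are names made up for this illustration. As far as I know, HotSpot has kept interned strings on the normal heap since Java 7, so how long they stay around depends on the JVM.

```java
class InternDemo {
    /** Builds a string at runtime so that it is not a compile-time constant. */
    static String buildAtRuntime(String prefix, int n) {
        return new StringBuilder(prefix).append(n).toString();
    }

    public static void main(String[] args) {
        String a = buildAtRuntime("key-", 42);
        String b = buildAtRuntime("key-", 42);
        System.out.println(a == b);                   // false: two distinct objects
        System.out.println(a.intern() == b.intern()); // true: one canonical copy in the pool
    }
}
```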

(Unclosed) open streams (file, network, etc.)

try {
    BufferedReader br = new BufferedReader(new FileReader(inputFile));
    ...
    ...
} catch (Exception e) {
    e.printStackTrace();
}
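
For contrast, here is a sketch of the non-leaking version using try-with-resources (Java 7+), which closes the reader even when an exception is thrown. FirstLineReader and readFirstLine are hypothetical names, and the temp file exists only to make the example runnable.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class FirstLineReader {
    /** Reads the first line of a file; the reader is closed even if readLine() throws. */
    static String readFirstLine(Path inputFile) throws IOException {
        try (BufferedReader br = Files.newBufferedReader(inputFile, StandardCharsets.UTF_8)) {
            return br.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("leak-demo", ".txt");
        try {
            Files.write(tmp, "hello\nworld\n".getBytes(StandardCharsets.UTF_8));
            System.out.println(readFirstLine(tmp)); // hello
        } finally {
            Files.delete(tmp);
        }
    }
}
```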

Unclosed connections

try {
    Connection conn = ConnectionFactory.getConnection();
    ...
    ...
} catch (Exception e) {
    e.printStackTrace();
}

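Areas that are unreachable from the JVM's garbage collector, such as memory allocated through native methods

In web applications, some objects are stored in application scope until the application is explicitly stopped or removed.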

getServletContext().setAttribute("SOME_MAP", map);

Incorrect or inappropriate JVM options, such as the noclassgc option on the IBM JDK, which prevents garbage collection of unused classes

See IBM JDK settings.

Question

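I just had an interview, and I was asked to create a memory leak with Java. Needless to say, I felt pretty dumb, having no clue how to even start creating one.

What would an example be?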




The interviewer might have been looking for a circular reference solution:

    public static void main(String[] args) {
        while (true) {
            Element first = new Element();
            first.next = new Element();
            first.next.next = first;
        }
    }

This is a classic problem with reference counting garbage collectors. You would then politely explain that JVMs use a much more sophisticated algorithm that doesn't have this limitation.

-Wes Tarle




I have had a nice "memory leak" in relation to PermGen and XML parsing once. The XML parser we used (I can't remember which one it was) did a String.intern() on tag names, to make comparison faster. One of our customers had the great idea to store data values not in XML attributes or text, but as tag names, so we had a document like:

<data>
   <1>bla</1>
   <2>foo</2>
   ...
</data>

In fact, they did not use numbers but longer textual IDs (around 20 characters), which were unique and came in at a rate of 10-15 million a day. That makes 200 MB of rubbish a day, which is never needed again, and never GCed (since it is in PermGen). We had permgen set to 512 MB, so it took around two days for the out-of-memory exception (OOME) to arrive...




Here's a simple/sinister one via http://wiki.eclipse.org/Performance_Bloopers#String.substring.28.29 .

public class StringLeaker
{
    private final String muchSmallerString;

    public StringLeaker()
    {
        // Imagine the whole Declaration of Independence here
        String veryLongString = "We hold these truths to be self-evident...";

        // The substring here maintains a reference to the internal char[]
        // representation of the original string.
        this.muchSmallerString = veryLongString.substring(0, 1);
    }
}

Because the substring refers to the internal representation of the original, much longer string, the original stays in memory. Thus, as long as you have a StringLeaker in play, you have the whole original string in memory, too, even though you might think you're just holding on to a single-character string. (This applies to JDKs before Java 7 update 6; since then substring() copies the characters it needs.)

The way to avoid storing an unwanted reference to the original string is to do something like this:

...
this.muchSmallerString = new String(veryLongString.substring(0, 1));
...

For added badness, you might also .intern() the substring:

...
this.muchSmallerString = veryLongString.substring(0, 1).intern();
...

Doing so will keep both the original long string and the derived substring in memory even after the StringLeaker instance has been discarded.




A common example of this in GUI code is when creating a widget/component and adding a listener to some static/application scoped object and then not removing the listener when the widget is destroyed. Not only do you get a memory leak, but also a performance hit as when whatever you are listening to fires events, all your old listeners are called too.
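
A toy sketch of that listener scenario, without any real GUI toolkit: EventBus and Widget are hypothetical stand-ins for the application-scoped object and the component. The widget that is never disposed stays reachable through the bus and keeps receiving events.

```java
import java.util.ArrayList;
import java.util.List;

class ListenerLeakDemo {
    /** Hypothetical application-scoped event source; it holds strong references to listeners. */
    static final class EventBus {
        private final List<Runnable> listeners = new ArrayList<>();
        void addListener(Runnable l) { listeners.add(l); }
        void removeListener(Runnable l) { listeners.remove(l); }
        int listenerCount() { return listeners.size(); }
        void fire() { for (Runnable l : new ArrayList<>(listeners)) l.run(); }
    }

    /** Hypothetical widget that registers itself and must unregister when disposed. */
    static final class Widget {
        private final EventBus bus;
        private final Runnable onEvent = this::refresh;
        int refreshCount;

        Widget(EventBus bus) { this.bus = bus; bus.addListener(onEvent); }
        void refresh() { refreshCount++; }
        /** Without this call the bus keeps the widget reachable and keeps calling it. */
        void dispose() { bus.removeListener(onEvent); }
    }

    public static void main(String[] args) {
        EventBus bus = new EventBus();
        Widget leaked = new Widget(bus);   // never disposed
        Widget cleaned = new Widget(bus);
        cleaned.dispose();
        bus.fire();
        System.out.println(bus.listenerCount());  // 1
        System.out.println(leaked.refreshCount);  // 1
        System.out.println(cleaned.refreshCount); // 0
    }
}
```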




Create a static Map and keep adding hard references to it. Those will never be GC'd.

public class Leaker {
    private static final Map<String, Object> CACHE = new HashMap<String, Object>();

    // Keep adding until failure.
    public static void addToCache(String key, Object value) { Leaker.CACHE.put(key, value); }
}
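
If you want the opposite of the Leaker above, one common mitigation is to bound the cache. This is only a sketch, assuming least-recently-used eviction is acceptable; BoundedCache is a made-up name built on LinkedHashMap's removeEldestEntry hook.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true gives least-recently-used ordering
        this.maxEntries = maxEntries;
    }

    /** Called by LinkedHashMap after each insertion; returning true evicts the eldest entry. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        Map<String, Object> cache = new BoundedCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.put("c", 3); // evicts "a"
        System.out.println(cache.keySet()); // [b, c]
    }
}
```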



The interviewer was probably looking for a circular reference like the code below (which incidentally only leaks memory in very old JVMs that used reference counting, which isn't the case any more). But it's a pretty vague question, so it's a prime opportunity to show off your understanding of JVM memory management.

class A {
    B bRef;
}

class B {
    A aRef;
}

public class Main {
    public static void main(String args[]) {
        A myA = new A();
        B myB = new B();
        myA.bRef = myB;
        myB.aRef = myA;
        myA=null;
        myB=null;
        /* at this point, there is no access to the myA and myB objects, */
        /* even though both objects still have active references. */
    } /* main */
}

Then you can explain that with reference counting, the above code would leak memory. But most modern JVMs don't use reference counting any longer; most use a tracing (mark-and-sweep style) garbage collector, which will in fact collect this memory.

Next you might explain creating an Object that has an underlying native resource, like this:

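import java.io.IOException;
import java.net.InetAddress;
import java.net.Socket;

public class Main {
    public static void main(String args[]) throws IOException {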
        Socket s = new Socket(InetAddress.getByName("google.com"),80);
        s=null;
        /* at this point, because you didn't close the socket properly, */
        /* you have a leak of a native descriptor, which uses memory. */
    }
}

Then you can explain this is technically a memory leak, but really the leak is caused by native code in the JVM allocating underlying native resources, which weren't freed by your Java code.
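
For comparison, a sketch of releasing the descriptor deterministically with try-with-resources. To stay runnable offline it connects to a throwaway ServerSocket on the loopback interface instead of google.com; SocketCloseDemo and connectAndClose are hypothetical names.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class SocketCloseDemo {
    /** Connects to the given local port and returns the socket after the try block has closed it. */
    static Socket connectAndClose(int port) throws IOException {
        Socket escaped;
        try (Socket s = new Socket(InetAddress.getLoopbackAddress(), port)) {
            escaped = s; // kept only so the caller can inspect its state
        }
        return escaped;
    }

    public static void main(String[] args) throws IOException {
        // Port 0 asks the OS for any free port; the server exists only to accept the connection.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Socket s = connectAndClose(server.getLocalPort());
            System.out.println(s.isClosed()); // true
        }
    }
}
```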

At the end of the day, with a modern JVM, you need to write some Java code that allocates a native resource outside the normal scope of the JVM's awareness.




I thought it was interesting that no one used the inner class example. If you have an inner class, it inherently maintains a reference to the containing class instance. Of course it is not technically a memory leak, because Java WILL eventually clean it up, but this can cause objects to hang around longer than anticipated.

public class Example1 {
  public Example2 getNewExample2() {
    return this.new Example2();
  }
  public class Example2 {
    public Example2() {}
  }
}

Now if you create an Example1, get an Example2 from it, and discard the Example1, you will inherently still have a link to the Example1 object.

public class Referencer {
  public static Example1.Example2 GetAnExample2() {
    Example1 ex = new Example1();
    return ex.getNewExample2();
  }

  public static void main(String[] args) {
    Example1.Example2 ex = Referencer.GetAnExample2();
    // As long as ex is reachable, the Example1 instance will remain in memory.
  }
}
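
A small sketch of the usual way around this, assuming the nested object does not really need its outer instance: make it a static nested class and copy in only what it needs. Outer, Inner and Detached are made-up names for illustration.

```java
class Outer {
    final byte[] payload;

    Outer(int size) { this.payload = new byte[size]; }

    /** Inner (non-static) class: every instance carries a reference to its Outer. */
    class Inner {
        int outerPayloadLength() { return Outer.this.payload.length; }
    }

    /** Static nested class: no hidden reference, so it copies only what it needs. */
    static class Detached {
        final int payloadLength;
        Detached(int payloadLength) { this.payloadLength = payloadLength; }
    }

    Inner newInner() { return new Inner(); }
    Detached newDetached() { return new Detached(payload.length); }

    public static void main(String[] args) {
        Inner inner = new Outer(1024).newInner();          // this Outer and its array stay reachable
        Detached detached = new Outer(1024).newDetached(); // this Outer is garbage right away
        System.out.println(inner.outerPayloadLength());    // 1024
        System.out.println(detached.payloadLength);        // 1024
    }
}
```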

I've also heard a rumor that if you have a variable that exists for longer than a specific amount of time, Java assumes that it will always exist and will actually never try to clean it up, even if it cannot be reached in code anymore. But that is completely unverified.




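Below are some non-obvious cases where Java leaks, besides the standard cases of forgotten listeners, static references, bogus/modifiable keys in hashmaps, or just threads stuck without any chance to end their life cycle.

  • File.deleteOnExit() - always leaks the string. If the string is a substring, the leak is even worse (the underlying char[] is leaked too). In Java 7 substring also copies the char[], so the latter no longer applies. Daniel, no need for votes, though.

I'll concentrate on threads, mostly to show the danger of unmanaged threads; I don't wish to even touch Swing.

  • Runtime.addShutdownHook and not removing the hook... and then, even with removeShutdownHook, due to a bug in the ThreadGroup class regarding unstarted threads it may not get collected, effectively leaking the ThreadGroup. JGroup has the leak in GossipRouter.

  • Creating, but not starting, a Thread goes into the same category as above.

  • Creating a thread inherits the ContextClassLoader and AccessControlContext, plus the ThreadGroup and any InheritableThreadLocal. All those references are potential leaks, along with all the classes loaded by the classloader and all static references, and so on. The effect is especially visible with the entire j.u.c.Executor framework, which features a super simple ThreadFactory interface, yet most developers have no clue of the lurking danger. Also, a lot of libraries start threads upon request (way too many industry-popular libraries).

  • ThreadLocal caches; those are evil in many cases. I am sure everyone has seen quite a few simple caches based on ThreadLocal. Well, the bad news: if the thread keeps going longer than the expected life of the context ClassLoader, it is a pure nice little leak. Do not use ThreadLocal caches unless really needed.

  • Calling ThreadGroup.destroy() when the ThreadGroup has no threads itself but still keeps child ThreadGroups. A bad leak that will prevent the ThreadGroup from being removed from its parent, while all the children become un-enumerable.

  • Using WeakHashMap where the value (in)directly references the key. This one is hard to find without a heap dump. It applies to all extended Weak/SoftReference classes that might keep a hard reference back to the guarded object.

  • Using java.net.URL with the HTTP(S) protocol and loading the resource from(!). This one is special: the KeepAliveCache creates a new thread in the system ThreadGroup, which leaks the current thread's context classloader. The thread is created upon the first request when no live thread exists, so either you get lucky or you leak. The leak is already fixed in Java 7, and the code that creates the thread properly removes the context classloader. There are a few more cases (like ImageFetcher, also fixed) of creating similar threads.

  • Using InflaterInputStream, passing new java.util.zip.Inflater() in the constructor (PNGImageDecoder, for instance), and not calling end() on the inflater. Well, if you pass it to the constructor with just new, there is no chance... And yes, calling close() on the stream does not close the inflater if it was manually passed as a constructor parameter. This is not a true leak, since it will be released by the finalizer... when it deems it necessary. Until that moment it eats native memory so badly that it can cause the Linux oom_killer to kill the process with impunity. The main issue is that finalization in Java is very unreliable, and G1 made it worse until 7.0.2. Moral of the story: release native resources as soon as you can; the finalizer is just too poor.

  • The same case with java.util.zip.Deflater. This one is far worse, since Deflater is memory hungry in Java, i.e. it always uses 15 bits (the max) and memory level 8 (9 is the max), allocating several hundred KB of native memory. Fortunately, Deflater is not widely used, and to my knowledge the JDK contains no misuses. Always call end() if you manually create a Deflater or Inflater. The best part of the last two: you can't find them via the normal profiling tools available.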
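
For the last two bullets, a minimal sketch of the "always call end()" advice, assuming you create the Deflater/Inflater yourself. ZipEndDemo, compress and decompress are hypothetical helper names; the point is only the finally block.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

class ZipEndDemo {
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            while (!deflater.finished()) {
                int n = deflater.deflate(buffer);
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // releases the native zlib state now instead of waiting for the GC
        }
    }

    static byte[] decompress(byte[] compressed) throws DataFormatException {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new DataFormatException("truncated input or preset dictionary required");
                }
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) throws DataFormatException {
        byte[] original = "leaks are evil! leaks are evil! leaks are evil!".getBytes(StandardCharsets.UTF_8);
        byte[] roundTrip = decompress(compress(original));
        System.out.println(Arrays.equals(original, roundTrip)); // true
    }
}
```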

(I can add some more time wasters I have encountered, upon request.)

Good luck and stay safe; leaks are evil!
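
To make the ThreadLocal cache bullet above concrete, here is a sketch assuming a pooled thread that outlives the task that filled the cache. CACHE, useCache and cacheStillHeld are made-up names; the single-thread executor just guarantees both tasks run on the same thread.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ThreadLocalCleanupDemo {
    // Hypothetical per-thread cache slot.
    static final ThreadLocal<byte[]> CACHE = new ThreadLocal<>();

    /** Runs a task on the pool that fills the cache and optionally clears it afterwards. */
    static void useCache(ExecutorService pool, boolean cleanUp) throws Exception {
        pool.submit(() -> {
            try {
                CACHE.set(new byte[1024]);
                // ... work with the cached buffer ...
            } finally {
                if (cleanUp) CACHE.remove();
            }
        }).get();
    }

    /** Reports whether the pooled thread still holds a cached value. */
    static boolean cacheStillHeld(ExecutorService pool) throws Exception {
        return pool.submit(() -> CACHE.get() != null).get();
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            useCache(pool, false);
            System.out.println(cacheStillHeld(pool)); // true: the value outlived its task
            useCache(pool, true);
            System.out.println(cacheStillHeld(pool)); // false: remove() released it
        } finally {
            pool.shutdown();
        }
    }
}
```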




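You can create a memory leak with the sun.misc.Unsafe class. In fact, this service class is used in various standard classes (for example, in the java.nio classes). You can't create an instance of this class directly, but you can use reflection to do that.

The code doesn't compile in the Eclipse IDE - compile it with the javac command (you will get warnings during compilation).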

import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import sun.misc.Unsafe;


public class TestUnsafe {

    public static void main(String[] args) throws Exception{
        Class unsafeClass = Class.forName("sun.misc.Unsafe");
        Field f = unsafeClass.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        Unsafe unsafe = (Unsafe) f.get(null);
        System.out.print("4..3..2..1...");
        try
        {
            for(;;)
                unsafe.allocateMemory(1024*1024);
        } catch(Error e) {
            System.out.println("Boom :)");
            e.printStackTrace();
        }
    }

}



I don't think anyone has said this yet: you can resurrect an object by overriding the finalize() method so that finalize() stores a reference to this somewhere. The JVM calls finalize() at most once per object, so after that the object will never be finalized again and stays alive for as long as that stored reference exists.




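The answer depends entirely on what the interviewer thought they were asking.

Is it possible in practice to make Java leak? Of course it is, and there are plenty of examples in the other answers.

But there are multiple meta-questions that may have been asked:

  • Is a theoretically "perfect" Java implementation vulnerable to leaks?
  • Does the candidate understand the difference between theory and reality?
  • Does the candidate understand how garbage collection works?
  • Or how garbage collection is supposed to work in an ideal case?
  • Do they know they can call other languages through native interfaces?
  • Do they know how to leak memory in those other languages?
  • Does the candidate even know what memory management is, and what is going on behind the scenes in Java?

I'm reading your meta-question as "What's an answer I could have used in this interview situation?" Hence, I'm going to focus on interview skills instead of Java. I believe you're more likely to repeat the situation of not knowing the answer to an interview question than you are to need to know how to make Java leak. So, hopefully, this will help.

One of the most important skills you can develop for interviewing is learning to actively listen to the questions and to work with the interviewer to extract their intent. Not only does this let you answer their question the way they want, it also shows that you have some vital communication skills. And when it comes down to a choice between many equally talented developers, I'll hire the one who listens, thinks, and understands before responding, every time.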




Threads are not collected until they terminate. They serve as roots of garbage collection. They are one of the few objects that won't be reclaimed simply by forgetting about them or clearing references to them.

Consider: the basic pattern to terminate a worker thread is to set some condition variable seen by the thread. The thread can check the variable periodically and use that as a signal to terminate. If the variable is not declared volatile, then the change to the variable might not be seen by the thread, so it won't know to terminate. Or imagine that some threads want to update a shared object, but deadlock while trying to lock on it.

If you only have a handful of threads these bugs will probably be obvious because your program will stop working properly. If you have a thread pool that creates more threads as needed, then the obsolete/stuck threads might not be noticed, and will accumulate indefinitely, causing a memory leak. Threads are likely to use other data in your application, so will also prevent anything they directly reference from ever being collected.

As a toy example:

static void leakMe(final Object object) {
    new Thread() {
        public void run() {
            Object o = object;
            for (;;) {
                try {
                    sleep(Long.MAX_VALUE);
                } catch (InterruptedException e) {}
            }
        }
    }.start();
}

Call System.gc() all you like, but the object passed to leakMe will never die.
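
The termination pattern described earlier in this answer (a flag the worker checks periodically) could be sketched like this. StoppableWorker and requestStop are hypothetical names, and the sleep stands in for real work.

```java
class StoppableWorker implements Runnable {
    // volatile guarantees the worker thread sees the write made by another thread.
    private volatile boolean stopRequested;

    void requestStop() { stopRequested = true; }

    @Override
    public void run() {
        while (!stopRequested) {
            try {
                Thread.sleep(10); // placeholder for one unit of real work
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return; // treat interruption as a stop request too
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        StoppableWorker worker = new StoppableWorker();
        Thread t = new Thread(worker, "worker");
        t.start();
        worker.requestStop();
        t.join(5_000);
        System.out.println(t.isAlive()); // false
    }
}
```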





The implementation of ArrayList.remove(int) is probably one of the simplest examples of a potential memory leak, and of how to avoid it.

public E remove(int index) {
    RangeCheck(index);

    modCount++;
    E oldValue = (E) elementData[index];

    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index + 1, elementData, index,
                numMoved);
    elementData[--size] = null; // (!) Let gc do its work

    return oldValue;
}

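If you were implementing it yourself, would you have thought to clear the array element that is no longer used ( elementData[--size] = null )? That reference might keep a huge object alive...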
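
The same idea in a hand-written container, as a sketch: a hypothetical SimpleStack whose pop() nulls out the vacated slot. hasStaleReferences() is a helper added only so the effect can be checked.

```java
import java.util.Arrays;

class SimpleStack<E> {
    private Object[] elements = new Object[4];
    private int size;

    void push(E e) {
        if (size == elements.length) {
            elements = Arrays.copyOf(elements, size * 2);
        }
        elements[size++] = e;
    }

    @SuppressWarnings("unchecked")
    E pop() {
        if (size == 0) throw new IllegalStateException("stack is empty");
        E top = (E) elements[--size];
        elements[size] = null; // drop the stale reference so the GC can reclaim the object
        return top;
    }

    /** Check helper: true if the backing array still references anything beyond the live elements. */
    boolean hasStaleReferences() {
        for (int i = size; i < elements.length; i++) {
            if (elements[i] != null) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        SimpleStack<byte[]> stack = new SimpleStack<>();
        stack.push(new byte[1024]);
        stack.pop();
        System.out.println(stack.hasStaleReferences()); // false
    }
}
```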




As a lot of people have suggested, Resource Leaks are fairly easy to cause - like the JDBC examples. Actual Memory leaks are a bit harder - especially if you aren't relying on broken bits of the JVM to do it for you...

The ideas of creating objects that have a very large footprint and then not being able to access them aren't real memory leaks either. If nothing can access it then it will be garbage collected, and if something can access it then it's not a leak...

One way that used to work though - and I don't know if it still does - is to have a three-deep circular chain. As in, Object A has a reference to Object B, Object B has a reference to Object C, and Object C has a reference to Object A. The GC was clever enough to know that a two-deep chain - as in A <--> B - can safely be collected if A and B aren't accessible by anything else, but couldn't handle the three-way chain...



