GDB corrupted stack frame - How to debug?



Answers

If the situation is fairly simple, Chris Dodd's answer is the best one. It does look like it jumped through a NULL pointer.

However, it is possible the program shot itself in the foot, knee, neck, and eye before crashing: it may have overwritten the stack, corrupted the frame pointer, and committed other evils. If so, then unraveling the hash is not likely to show you potatoes and meat.

A more efficient approach is to run the program under the debugger and step over functions until the program crashes. Once the crashing function is identified, start again, step into that function, and determine which of the functions it calls causes the crash. Repeat until you find the single offending line of code. 75% of the time, the fix will then be obvious.

In the other 25% of situations, the so-called offending line of code is a red herring. It will be reacting to (invalid) conditions set up many lines before, maybe thousands of lines before. If that is the case, the best course depends on many factors, mostly your understanding of the code and your experience with it:

  • Perhaps setting a debugger watchpoint or inserting diagnostic printfs on critical variables will lead to the necessary "Aha!"
  • Maybe changing test conditions with different inputs will provide more insight than debugging.
  • Maybe a second pair of eyes will force you to check your assumptions or gather overlooked evidence.
  • Sometimes, all it takes is going to dinner and thinking about the gathered evidence.
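
The diagnostic-printf idea from the first bullet could look something like the C sketch below. It is only an illustration, not the asker's code: `struct guarded_buf`, `GUARD_MAGIC`, `guard_init`, `guard_check`, and `GUARD_CHECK` are all hypothetical names. The sketch brackets a buffer with known sentinel values and reports, with file and line, the first checkpoint at which a sentinel has changed, which helps narrow down where an overwrite happens.

```c
#include <stdio.h>
#include <string.h>

#define GUARD_MAGIC 0xDEADBEEFu  /* arbitrary sentinel value (assumption) */

/* Hypothetical buffer bracketed by two sentinels. */
struct guarded_buf {
    unsigned int head;
    char data[16];
    unsigned int tail;
};

/* Set both sentinels and clear the payload. */
static void guard_init(struct guarded_buf *b)
{
    b->head = GUARD_MAGIC;
    memset(b->data, 0, sizeof b->data);
    b->tail = GUARD_MAGIC;
}

/* Returns 1 if both sentinels are intact; otherwise reports to stderr
 * and returns 0. */
static int guard_check(const struct guarded_buf *b, const char *file, int line)
{
    if (b->head == GUARD_MAGIC && b->tail == GUARD_MAGIC)
        return 1;
    fprintf(stderr, "%s:%d: guard corrupted (head=%#x tail=%#x)\n",
            file, line, b->head, b->tail);
    return 0;
}

/* Records the call site automatically. */
#define GUARD_CHECK(b) guard_check((b), __FILE__, __LINE__)
```

Placing `GUARD_CHECK(&buf);` between suspect calls shows which call damaged the sentinels. A GDB watchpoint on the same field (for example `watch buf.tail`) gives similar information without recompiling.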

Good luck!

Question

I have the following stack trace. Is it possible to make out anything useful from this for debugging?

Program received signal SIGSEGV, Segmentation fault.
0x00000002 in ?? ()
(gdb) bt
#0  0x00000002 in ?? ()
#1  0x00000001 in ?? ()
#2  0xbffff284 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) 

Where should I start looking in the code when I get a segmentation fault and the stack trace is not very useful?

NOTE: If I post the code, then the SO experts will give me the answer. I want to take the guidance from SO and find the answer myself, so I'm not posting the code here. Apologies.




Look at some of your other registers to see if one of them has the stack pointer cached in it. From there, you might be able to retrieve a stack. Also, if this is embedded, quite often the stack is defined at a very particular address. Using that, you can also sometimes get a decent stack. This all assumes that when you jumped to hyperspace, your program didn't puke all over memory along the way...
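
Once a plausible stack pointer turns up, one way to recover some call history by hand is to dump the words at that address and pick out the ones that point into the program's code, since those are candidates for saved return addresses. Below is a rough C sketch of that filtering step. It assumes 32-bit words (as the addresses in the question suggest), that the words have been copied out of the debugger (for example from GDB's `x/64xw` on the address), and that the bounds of the text segment are known. `find_code_pointers`, `text_start`, and `text_end` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Scan `n` stack words and record the indices of those that fall inside
 * [text_start, text_end). Such words are only *candidates* for return
 * addresses: stale values and function pointers match as well.
 * Returns the number of indices written to `out` (at most `out_cap`).
 */
static size_t find_code_pointers(const uint32_t *words, size_t n,
                                 uint32_t text_start, uint32_t text_end,
                                 size_t *out, size_t out_cap)
{
    size_t found = 0;
    for (size_t i = 0; i < n && found < out_cap; i++) {
        if (words[i] >= text_start && words[i] < text_end)
            out[found++] = i;
    }
    return found;
}
```

Each candidate can then be checked in GDB with `info symbol <address>` to see which function, if any, it falls in. If the program really did scribble over memory, the candidates may be garbage, so treat them as hints rather than a reliable backtrace.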





