c++ vector - STL Alternative


My experience is that well-designed STL code runs slowly in debug builds because the optimizer is turned off. STL containers emit a lot of calls to constructors and operator=, which (if they are lightweight) get inlined or removed in release builds.
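To make those hidden calls visible, here is a minimal sketch that counts them. Tracked and copies_for_vector_copy are hypothetical names made up for this illustration, and the code assumes C++11 or later. Copying a vector costs one copy-constructor call per element. With optimization on and a trivial element type those calls usually disappear, but in a debug build each one is a real function call.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical element type that counts how often it is copied.
struct Tracked {
    static std::size_t copies;
    int value = 0;

    Tracked() = default;
    Tracked(const Tracked& other) : value(other.value) { ++copies; }
    Tracked& operator=(const Tracked& other) {
        value = other.value;
        ++copies;
        return *this;
    }
};
std::size_t Tracked::copies = 0;

// Copies a vector of n elements and returns how many element copies that cost.
std::size_t copies_for_vector_copy(std::size_t n) {
    std::vector<Tracked> source(n);           // n default constructions, no copies
    Tracked::copies = 0;                      // start counting from here
    std::vector<Tracked> duplicate = source;  // one copy-constructor call per element
    assert(duplicate.size() == n);
    return Tracked::copies;
}
```

Calling copies_for_vector_copy(100) reports 100 copy-constructor calls for a single line of source code.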

Also, Visual C++ 2005 and up has checked iterators enabled for the STL in both release and debug builds. It is a huge performance hog for STL-heavy software. It can be disabled by defining _SECURE_SCL=0 for all your compilation units. Please note that having a different _SECURE_SCL setting in different compilation units will almost certainly lead to disaster. (In Visual C++ 2010 and later, _SECURE_SCL is superseded by _ITERATOR_DEBUG_LEVEL.)

You could create a third build configuration with checking turned off and use that to debug with reasonable performance. I recommend keeping a debug configuration with checking on though, since it's very helpful for catching erroneous array indices and stuff like that.

I really hate using STL containers because they make the debug version of my code run really slowly. What do other people use instead of STL that has reasonable performance for debug builds?

I'm a game programmer and this has been a problem on many of the projects I've worked on. It's pretty hard to get 60 fps when you use STL containers for everything.

I use MSVC for most of my work.

Another crucial difference between debug and release is how local variables are stored. Conceptually, local variables are allocated storage in a function's stack frame. The symbol file generated by the compiler tells the debugger the offset of the variable in the stack frame, so the debugger can show it to you. The debugger peeks at the memory location to do this.

However, this means that every time a local variable is changed, the generated code for that source line has to write the value back to the correct location on the stack. This is very inefficient because of the extra memory traffic.

In a release build the compiler may assign a local variable to a register for a portion of a function. In some cases it may not assign stack storage for it at all (the more registers a machine has the easier this is to do).

However, the debugger doesn't know how registers map to local variables for a particular point in the code (I'm not aware of any symbol format that includes this information), so it can't show it to you accurately as it doesn't know where to go looking for it.

Another optimization would be function inlining. In optimized builds the compiler may replace a call to foo() with the actual code for foo everywhere it is used because the function is small enough. However, when you try to set a breakpoint on foo() the debugger wants to know the address of the instructions for foo(), and there is no longer a simple answer to this -- there may be thousands of copies of the foo() code bytes spread over your program. A debug build will guarantee that there is somewhere for you to put the breakpoint.

std::vector is only as good as new. It simply handles the underlying memory allocation for you. There are a couple of things you can do, assuming you don't want to write a whole new new handler.

Pre-allocate vectors with reserve() or resize() if you know what their eventual size will be. This stops wasteful memory copies as they grow.
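As a rough illustration of the difference, the sketch below counts how often the capacity changes while elements are appended. count_reallocations is a hypothetical helper written for this example, not a library function. Every capacity change means a new allocation plus a copy or move of all existing elements.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Appends n ints and returns how many times the vector had to grow its buffer.
std::size_t count_reallocations(std::size_t n, bool reserve_first) {
    std::vector<int> v;
    if (reserve_first) {
        v.reserve(n);  // one allocation up front
    }

    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last_capacity) {  // the buffer was replaced
            ++reallocations;
            last_capacity = v.capacity();
        }
    }
    assert(v.size() == n);
    return reallocations;
}
```

With reserve_first set to true the function returns 0. Without it, the count depends on the growth factor of your standard library.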

If you are going to be using the vector again with the same size, it's better to keep it and refill it than to delete it and recreate it.
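A minimal sketch of that reuse pattern, with made-up function names: clear() sets the size back to zero but keeps the allocated capacity, so a refill of the same size needs no new allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Refills `scratch` with the squares of 0..n-1, reusing the storage it already owns.
void fill_squares(std::vector<int>& scratch, std::size_t n) {
    scratch.clear();  // size becomes 0, capacity is kept
    for (std::size_t i = 0; i < n; ++i) {
        scratch.push_back(static_cast<int>(i * i));
    }
    assert(scratch.size() == n);
}

// Hypothetical check: does a second, equally sized refill reuse the same buffer?
bool second_fill_reuses_buffer(std::size_t n) {
    std::vector<int> scratch;
    fill_squares(scratch, n);
    const int* first_buffer = scratch.data();
    fill_squares(scratch, n);  // fits in the existing capacity
    return scratch.data() == first_buffer;
}
```

In a game loop this typically means keeping the vector as a member or a long-lived local and calling clear() once per frame, rather than constructing a fresh vector every frame.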

Generally on embedded targets if you know the memory requirements it's best to statically allocate all the memory at the start and divide it up yourself - it's not like another user is going to want some.
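One way to sketch static allocation is a fixed-capacity container that never touches the heap. FixedVector is a hypothetical class written for this illustration, not a standard or Arduino type, and it assumes T is default-constructible and copy-assignable.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed-capacity container: all storage lives inside the object.
template <typename T, std::size_t Capacity>
class FixedVector {
public:
    // Returns false instead of allocating when the buffer is full.
    bool push_back(const T& value) {
        if (size_ == Capacity) {
            return false;
        }
        data_[size_++] = value;
        return true;
    }

    void clear() { size_ = 0; }
    std::size_t size() const { return size_; }

    T& operator[](std::size_t i) {
        assert(i < size_);
        return data_[i];
    }
    const T& operator[](std::size_t i) const {
        assert(i < size_);
        return data_[i];
    }

private:
    T data_[Capacity] = {};  // storage is part of the object, no new/delete
    std::size_t size_ = 0;
};
```

Declared as a global or static object, for example a FixedVector<int, 64>, its memory is reserved when the program is built, so there is nothing to fragment at run time. The cost is that you have to pick the capacity up front.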

If your vector will be reallocated many times then yes, it can cause memory fragmentation. The simplest way to avoid that would be using std::vector::reserve() if you more or less know how big your array can grow.

You can also consider using std::deque instead of vector. It allocates in small blocks rather than one large contiguous array, so it is much less likely to cause this kind of memory fragmentation.

Here is a topic that may be interesting to you: what-is-memory-fragmentation.

Why does a C/C++ program often have optimization turned off in debug mode?

Without any optimization on, the flow through your code is linear. If you are on line 5 and single step, you step to line 6. With optimization on, you can get instruction re-ordering, loop unrolling and all sorts of optimizations.
For example:

void foo() {
1:  int i;
2:  for(i = 0; i < 2; )
3:    i++;
4:  return;
}

In this example, without optimization, you could single step through the code and hit lines 1, 2, 3, 2, 3, 2, 4

With optimization on, you might get an execution path that looks like: 2, 3, 3, 4 or even just 4! (The function does nothing after all...)

Bottom line, debugging code with optimization enabled can be a royal pain! Especially if you have large functions.

Note that turning on optimization changes the code! In certain environments (safety-critical systems), this is unacceptable and the code being debugged has to be the code shipped. You have to debug with optimization on in that case.

While the optimized and non-optimized code should be "functionally" equivalent, under certain circumstances, the behavior will change.
Here is a simplistic example:

    int* ptr = (int*)0xdeadbeef;  // some address of a memory-mapped I/O device
    *ptr = 0;   // set up hardware device
    while(*ptr == 1) {    // loop until hardware device is done
       // do something
    }

With optimization off, this is straightforward, and you kinda know what to expect. However, if you turn optimization on, a couple of things might happen:

  • The compiler might optimize the while block away (we init to 0, it'll never be 1)
  • Instead of accessing memory, the pointed-to value might be kept in a register, so no I/O update is ever seen
  • memory access might be cached (not necessarily compiler optimization related)

In all these cases, the behavior would be drastically different and most likely wrong.
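The usual remedy for the first two cases, which the example above does not show, is to declare the pointed-to object volatile so the compiler must perform every read and write. Below is a minimal sketch. reset_and_wait and its max_polls parameter are hypothetical names added here so the function can be exercised against an ordinary variable instead of real hardware.

```cpp
#include <cassert>
#include <cstdint>

// Resets a device register, then polls it while it reads 1.
// Returns the number of polls performed (at most max_polls).
int reset_and_wait(volatile std::uint32_t* reg, int max_polls) {
    assert(reg != nullptr);
    *reg = 0;  // volatile: this write may not be dropped or reordered away
    int polls = 0;
    while (*reg == 1 && polls < max_polls) {  // volatile: re-read on every pass
        ++polls;
    }
    return polls;
}
```

Note that volatile only constrains the compiler. It does nothing about hardware caching (the third case), which has to be handled by mapping the device region as uncached or by whatever mechanism your platform provides.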

Optimizing code is an automated process that improves the runtime performance of the code while preserving semantics. This process can remove intermediate results which are unnecessary to complete an expression or function evaluation, but may be of interest to you when debugging. Similarly, optimizations can alter the apparent control flow so that things may happen in a slightly different order than what appears in the source code. This is done to skip unnecessary or redundant calculations. This rejiggering of code can mess with the mapping between source code line numbers and object code addresses, making it hard for a debugger to follow the flow of control as you wrote it.

Debugging in unoptimized mode allows you to see everything you've written as you've written it without the optimizer removing or reordering things.

Once you are happy that your program is working correctly you can turn on optimizations to get improved performance. Even though optimizers are pretty trustworthy these days, it's still a good idea to build a good quality test suite to ensure that your program runs identically (from a functional point of view, not considering performance) in both optimized and unoptimized mode.

Should I worry about memory fragmentation with std::vector?

The answer to your worries may be std::deque. It gives you a similar interface to that of std::vector, but works better with fragmented memory, since it allocates several small arrays instead of a large one. It is actually less efficient than std::vector in some aspects, but for your case it may be a good trade-off.
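One visible consequence of that block-wise layout: appending to a std::deque never moves the elements that are already there, whereas a growing vector has to copy everything into a bigger buffer. Here is a small sketch of that property (front_stays_put is a made-up name for this example):

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Returns true if the first element keeps its address while `extra` more are appended.
bool front_stays_put(std::size_t extra) {
    std::deque<int> d;
    d.push_back(42);
    const int* first = &d.front();

    for (std::size_t i = 0; i < extra; ++i) {
        d.push_back(static_cast<int>(i));  // adds blocks as needed, never relocates old elements
    }
    assert(d.size() == extra + 1);
    return first == &d.front() && *first == 42;
}
```

The trade-off is that a deque's elements are not contiguous, so you cannot hand its contents to an API that expects a plain array, and indexing is slightly more expensive than with a vector.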

If you're running Visual Studio you may want to consider the following:

#define _SECURE_SCL 0

That's just for iterators. What type of STL operations are you performing? You may want to look at optimizing your memory operations, i.e., using resize() to insert several elements at once instead of using pop/push to insert elements one at a time.
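A sketch of what inserting several elements at once could look like, assuming the new elements come from a plain array. append_block is a hypothetical helper, not part of the STL.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Appends `count` ints from `src` to `dst` with a single resize instead of `count` push_back calls.
void append_block(std::vector<int>& dst, const int* src, std::size_t count) {
    const std::size_t old_size = dst.size();
    dst.resize(old_size + count);                        // grow once
    std::copy(src, src + count, dst.data() + old_size);  // then fill the new tail
    assert(dst.size() == old_size + count);
}
```

The built-in range form of insert, dst.insert(dst.end(), src, src + count), achieves the same thing and is worth trying first. Either way the container grows once rather than being checked and possibly reallocated for every element.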

You're declaring a virtual function and not defining it:

virtual void calculateCredits();

Either define it or declare it as:

virtual void calculateCredits() = 0;

Or simply:

virtual void calculateCredits() { };

Read more about vftable: http://en.wikipedia.org/wiki/Virtual_method_table

c++ performance stl debug-build