c++ containers comparison - In which scenario do I use a particular STL container?

4 Answers

Here is a flowchart I created, inspired by David Moore's version (see above) and mostly up to date with the new standard (C++11). It is only my personal take and certainly open to dispute, but I figured it could be valuable to this discussion:

[Image: container selection flowchart]

I've been reading up on STL containers in my book on C++, specifically the section on the STL and its containers. Now I do understand that each of them has its own specific properties, and I'm close to memorizing all of them... But what I do not yet grasp is in which scenario each of them is used.

What is the explanation? Example code is much preferred.

Look at Effective STL by Scott Meyers. It's good at explaining how to use the STL.

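If you want to store a known or unknown number of objects and you're never going to delete any, then a vector is what you want. It's the default replacement for a C array, and it works like one, but it grows on demand instead of overflowing a fixed-size buffer (operator[] is still unchecked; at() checks bounds). You can also allocate its capacity beforehand with reserve(); that does not change size(), for which you'd use resize().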
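A minimal sketch of the vector case, to show the difference between capacity and size. The helper name `collectSquares` is made up for illustration:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: builds a vector holding the first n squares.
std::vector<int> collectSquares(std::size_t n) {
    std::vector<int> squares;
    squares.reserve(n);  // allocates room for n elements; size() is still 0 here
    for (std::size_t i = 0; i < n; ++i) {
        // No reallocation happens in this loop, because capacity() >= n.
        squares.push_back(static_cast<int>(i * i));
    }
    return squares;
}
```

Calling `collectSquares(5)` yields a vector of five elements that can be indexed like an array, e.g. `v[3]`.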

If you want to store an undetermined number of objects, but you'll be adding them and deleting them, then you probably want a list...because you can delete an element without moving any following elements - unlike vector. It takes more memory than a vector, though, and you can't randomly access an element by index; you have to walk the list to reach it.
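As a rough illustration of deleting from a list while walking it, here is a small sketch. The function name `eraseOdd` is an assumption for the example, not something from the answer:

```cpp
#include <list>

// Hypothetical helper: removes every odd value from the list in place.
void eraseOdd(std::list<int>& values) {
    for (auto it = values.begin(); it != values.end(); ) {
        if (*it % 2 != 0) {
            // erase() unlinks one node in constant time and returns the
            // iterator to the next element; no other element is moved.
            it = values.erase(it);
        } else {
            ++it;
        }
    }
}
```

Iterators to the remaining elements stay valid after each erase, which is not the case for vector.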

If you want to take a bunch of elements and find only the unique values of those elements, reading them all into a set will do it, and it will sort them for you as well.
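A short sketch of that idea, assuming plain `int` elements and a made-up helper name `uniqueSorted`:

```cpp
#include <set>
#include <vector>

// Hypothetical helper: returns the distinct values of input in ascending order.
std::vector<int> uniqueSorted(const std::vector<int>& input) {
    // Constructing the set from a range silently drops duplicates
    // and keeps the remaining values ordered by operator<.
    std::set<int> seen(input.begin(), input.end());
    return std::vector<int>(seen.begin(), seen.end());
}
```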

If you have a lot of key-value pairs, and you want to sort them by key, then a map is useful...but it will only hold one value per key. If you need more than one value per key, you could have a vector/list as your value in the map, or use a multimap.
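The difference between one value per key and several values per key can be sketched as follows. The names and the sample data (`singleValued`, `multiValued`, `valuesFor`, "alice", "bob") are invented for illustration:

```cpp
#include <map>
#include <string>
#include <vector>

// map: assigning to an existing key overwrites the previous value.
std::map<std::string, int> singleValued() {
    std::map<std::string, int> m;
    m["alice"] = 101;
    m["alice"] = 102;  // replaces 101
    m["bob"] = 200;
    return m;
}

// multimap: every insertion is kept, even when the key already exists.
std::multimap<std::string, int> multiValued() {
    std::multimap<std::string, int> m;
    m.emplace("alice", 101);
    m.emplace("alice", 102);
    m.emplace("bob", 200);
    return m;
}

// Collects all values stored under key in a multimap.
std::vector<int> valuesFor(const std::multimap<std::string, int>& m,
                           const std::string& key) {
    std::vector<int> out;
    auto range = m.equal_range(key);  // [first, second) spans all matches
    for (auto it = range.first; it != range.second; ++it) {
        out.push_back(it->second);
    }
    return out;
}
```

Both containers iterate in key order, so "alice" comes before "bob" in either one.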

It wasn't in the original STL, but it arrived with the TR1 update and has been standard since C++11: if you have a lot of key-value pairs that you're going to look up by key, and you don't care about their order, you might want to use a hash table, which is std::unordered_map (tr1::unordered_map under TR1). I've used it with Visual C++ 7.1, where it was called stdext::hash_map. Its lookup is O(1) on average (O(n) in the worst case) instead of the O(log n) lookup of map.
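A minimal sketch of a hash-based lookup table, assuming a C++11 compiler and using a made-up word-counting helper `countWords`:

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: counts how often each word occurs.
std::unordered_map<std::string, int> countWords(
        const std::vector<std::string>& words) {
    std::unordered_map<std::string, int> counts;
    for (const std::string& word : words) {
        // operator[] inserts a zero-initialized count on first sight,
        // then the increment runs in average constant time.
        ++counts[word];
    }
    return counts;
}
```

Iterating over the result visits the words in no particular order; if ordered output matters, map is the better fit.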

It all depends on what you want to store and what you want to do with the container. Here are some (very non-exhaustive) examples for the container classes that I tend to use most:

vector: Compact layout with little or no memory overhead per contained object. Efficient to iterate over. Appending is cheap on average, but insert and erase anywhere except the end are O(n), and any reallocation copies or moves every element, which can be expensive for complex objects. Cheap to find a contained object by index, e.g. myVector[10]. Use where you would have used an array in C. Good where you have a lot of simple objects (e.g. int). Don't forget to use reserve() before adding a lot of objects to the container.

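list: Small memory overhead per contained object (two pointers per node, plus a separate allocation for each). Iteration is linear, though less cache-friendly than with vector. Append, insert and erase are cheap once you have an iterator to the position. Use where you would have used a linked list in C.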

set (and multiset): Significant memory overhead per contained object. Use where you need to find out quickly if that container contains a given object, or merge containers efficiently.
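The two uses named for set, membership tests and merging, might look like this. The helper names `hasValue` and `mergedSets` are assumptions for the sketch:

```cpp
#include <set>

// Hypothetical helper: true if value is stored in s (O(log n) lookup).
bool hasValue(const std::set<int>& s, int value) {
    return s.find(value) != s.end();
}

// Hypothetical helper: returns the union of a and b.
std::set<int> mergedSets(const std::set<int>& a, const std::set<int>& b) {
    std::set<int> result(a);
    result.insert(b.begin(), b.end());  // values already present are skipped
    return result;
}
```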

map (and multimap): Significant memory overhead per contained object. Use where you want to store key-value pairs and look up values by key quickly.

The flow chart on the cheat sheet suggested by zdan provides a more exhaustive guide.

One lesson I've learned is: Try to wrap it in a class, since changing the container type one fine day can yield big surprises.

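#include <vector>
class CollectionOfFoo {
    std::vector<Foo*> foos;  // std::vector stands in for whichever container you pick
public:
    void add(Foo* foo) { foos.push_back(foo); }  // ... plus other delegate methods as needed
};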

It doesn't cost much up front, and saves time in debugging when you want to break whenever somebody does operation x on this structure.
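A fuller, compilable sketch of such a wrapper follows. The `Foo` struct, the `findById` method and the `Container` alias are hypothetical additions for illustration; the pointers are non-owning, as in the fragment above:

```cpp
#include <cstddef>
#include <vector>

struct Foo { int id; };  // hypothetical element type

class CollectionOfFoo {
public:
    // Callers only see these methods, never the container itself,
    // so each one is also a convenient place for a breakpoint.
    void add(Foo* foo) { foos_.push_back(foo); }
    std::size_t size() const { return foos_.size(); }

    // Returns the first element with the given id, or nullptr if none.
    Foo* findById(int id) const {
        for (Foo* foo : foos_) {
            if (foo->id == id) return foo;
        }
        return nullptr;
    }

private:
    // Ideally the only line to touch when swapping the container,
    // e.g. for std::list<Foo*> or std::deque<Foo*>.
    using Container = std::vector<Foo*>;
    Container foos_;
};
```

Because the methods used here (`push_back`, `size`, range-based iteration) exist on vector, list and deque alike, changing the alias should not affect any calling code.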

Coming to selecting the perfect data structure for a job:

Each data structure provides some operations, each with its own time complexity:

O(1), O(lg N), O (N), etc.

You essentially have to take a best guess at which operations will be performed most often, and use a data structure that makes those operations cheapest, ideally O(1).

Simple, isn't it (-:
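To make the "dominant operation" rule concrete, here is a small sketch assuming the most frequent operation is a membership test. The function names are invented for the example:

```cpp
#include <algorithm>
#include <unordered_set>
#include <vector>

// Membership test on a vector: a linear scan, O(N).
bool inVector(const std::vector<int>& values, int x) {
    return std::find(values.begin(), values.end(), x) != values.end();
}

// Membership test on a hash set: O(1) on average.
bool inHashSet(const std::unordered_set<int>& values, int x) {
    return values.count(x) != 0;
}
```

Both functions give the same answer; only the cost differs. For small N the vector scan may still win in practice, so it is worth measuring before committing.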