c++ example implementation - What really is a deque in STL?





3 Answers

deque = double ended queue

A container which can grow in either direction.

Deque is typically implemented as a vector of vectors (a list of vectors can't give constant time random access). While the size of the secondary vectors is implementation dependent, a common algorithm is to use a constant size in bytes.

dequeue code vs

I was looking at STL containers and trying to figure what they really are (i.e. the data structure used), and the deque stopped me: I thought at first that it was a double linked list, which would allow insertion and deletion from both ends in constant time, but I am troubled by the promise made by the operator [] to be done in constant time. In a linked list, arbitrary access should be O(n), right?

And if it's a dynamic array, how can it add elements in constant time? It should be mentioned that reallocation may happen, and that O(1) is an amortized cost, like for a vector.

So I wonder what is this structure that allows arbitrary access in constant time, and at the same time never needs to be moved to a new bigger place.




(This is an answer I've given in another thread. Essentially I'm arguing that even fairly naive implementations, using a single vector, conform to the requirements of "constant non-amortized push_{front,back}". You might be surprised, and think this is impossible, but I have found other relevant quotes in the standard that define the context in a surprising way. Please bear with me; if I have made a mistake in this answer, it would be very helpful to identify which things I have said correctly and where my logic has broken down. )

In this answer, I am not trying to identify a good implementation, I'm merely trying to help us to interpret the complexity requirements in the C++ standard. I'm quoting from N3242, which is, according to Wikipedia, the latest freely available C++11 standardization document. (It appears to be organized differently from the final standard, and hence I won't quote the exact page numbers. Of course, these rules might have changed in the final standard, but I don't think that has happened.)

A deque<T> could be implemented correctly by using a vector<T*>. All the elements are copied onto the heap and the pointers stored in a vector. (More on the vector later).

Why T* instead of T? Because the standard requires that

"An insertion at either end of the deque invalidates all the iterators to the deque, but has no effect on the validity of references to elements of the deque."

(my emphasis). The T* helps to satisfy that. It also helps us to satisfy this:

"Inserting a single element either at the beginning or end of a deque always ..... causes a single call to a constructor of T."

Now for the (controversial) bit. Why use a vector to store the T*? It gives us random access, which is a good start. Let's forget about the complexity of vector for a moment and build up to this carefully:

The standard talks about "the number of operations on the contained objects.". For deque::push_front this is clearly 1 because exactly one T object is constructed and zero of the existing T objects are read or scanned in any way. This number, 1, is clearly a constant and is independent of the number of objects currently in the deque. This allows us to say that:

'For our deque::push_front, the number of operations on the contained objects (the Ts) is fixed and is independent of the number of objects already in the deque.'

Of course, the number of operations on the T* will not be so well-behaved. When the vector<T*> grows too big, it'll be realloced and many T*s will be copied around. So yes, the number of operations on the T* will vary wildly, but the number of operations on T will not be affected.

Why do we care about this distinction between counting operations on T and counting operations on T*? It's because the standard says:

All of the complexity requirements in this clause are stated solely in terms of the number of operations on the contained objects.

For the deque, the contained objects are the T, not the T*, meaning we can ignore any operation which copies (or reallocs) a T*.

I haven't said much about how a vector would behave in a deque. Perhaps we would interpret it as a circular buffer (with the vector always taking up its maximum capacity(), and then realloc everything into a bigger buffer when the vector is full. The details don't matter.

In the last few paragraphs, we have analyzed deque::push_front and the relationship between the number of objects in the deque already and the number of operations performed by push_front on contained T-objects. And we found they were independent of each other. As the standard mandates that complexity is in terms of operations-on-T, then we can say this has constant complexity.

Yes, the Operations-On-T*-Complexity is amortized (due to the vector), but we're only interested in the Operations-On-T-Complexity and this is constant (non-amortized).

The complexity of vector::push_back or vector::push_front is irrelevant in this implementation; those considerations involve operations on T* and hence are irrelevant. If the standard was referring to the 'conventional' theoretical notion of complexity, then they wouldn't have explicitly restricted themselves to the "number of operations on the contained objects". Am I overinterpreting that sentence?




From overview, you can think deque as a double-ended queue

The datas in deque are stored by chuncks of fixed size vector, which are

pointered by a map(which is also a chunk of vector, but its size may change)

The main part code of the deque iterator is as below:

/*
buff_size is the length of the chunk
*/
template <class T, size_t buff_size>
struct __deque_iterator{
    typedef __deque_iterator<T, buff_size>              iterator;
    typedef T**                                         map_pointer;

    // pointer to the chunk
    T* cur;       
    T* first;     // the begin of the chunk
    T* last;      // the end of the chunk

    //because the pointer may skip to other chunk
    //so this pointer to the map
    map_pointer node;    // pointer to the map
}

The main part code of the deque is as below:

/*
buff_size is the length of the chunk
*/
template<typename T, size_t buff_size = 0>
class deque{
    public:
        typedef T              value_type;
        typedef T&            reference;
        typedef T*            pointer;
        typedef __deque_iterator<T, buff_size> iterator;

        typedef size_t        size_type;
        typedef ptrdiff_t     difference_type;

    protected:
        typedef pointer*      map_pointer;

        // allocate memory for the chunk 
        typedef allocator<value_type> dataAllocator;

        // allocate memory for map 
        typedef allocator<pointer>    mapAllocator;

    private:
        //data members

        iterator start;
        iterator finish;

        map_pointer map;
        size_type   map_size;
}

Below i will give you the core code of deque, mainly about three parts:

  1. iterator

  2. How to construct a deque

1. iterator(__deque_iterator)

The main problem of iterator is, when ++, -- iterator, it may skip to other chunk(if it pointer to edge of chunk). For example, there are three data chunks: chunk 1,chunk 2,chunk 3.

The pointer1 pointers to the begin of chunk 2, when operator --pointer it will pointer to the end of chunk 1, so as to the pointer2.

Below I will give the main function of __deque_iterator:

Firstly, skip to any chunk:

void set_node(map_pointer new_node){
    node = new_node;
    first = *new_node;
    last = first + chunk_size();
}

Note that, the chunk_size() function which compute the chunk size, you can think of it returns 8 for simplify here.

operator* get the data in the chunk

reference operator*()const{
    return *cur;
}

operator++, --

// prefix forms of increment

self& operator++(){
    ++cur;
    if (cur == last){      //if it reach the end of the chunk
        set_node(node + 1);//skip to the next chunk
        cur = first;
    }
    return *this;
}

// postfix forms of increment
self operator++(int){
    self tmp = *this;
    ++*this;//invoke prefix ++
    return tmp;
}
self& operator--(){
    if(cur == first){      // if it pointer to the begin of the chunk
        set_node(node - 1);//skip to the prev chunk
        cur = last;
    }
    --cur;
    return *this;
}

self operator--(int){
    self tmp = *this;
    --*this;
    return tmp;
}
iterator skip n steps / random access
self& operator+=(difference_type n){ // n can be postive or negative
    difference_type offset = n + (cur - first);
    if(offset >=0 && offset < difference_type(buffer_size())){
        // in the same chunk
        cur += n;
    }else{//not in the same chunk
        difference_type node_offset;
        if (offset > 0){
            node_offset = offset / difference_type(chunk_size());
        }else{
            node_offset = -((-offset - 1) / difference_type(chunk_size())) - 1 ;
        }
        // skip to the new chunk
        set_node(node + node_offset);
        // set new cur
        cur = first + (offset - node_offset * chunk_size());
    }

    return *this;
}

// skip n steps
self operator+(difference_type n)const{
    self tmp = *this;
    return tmp+= n; //reuse  operator +=
}

self& operator-=(difference_type n){
    return *this += -n; //reuse operator +=
}

self operator-(difference_type n)const{
    self tmp = *this;
    return tmp -= n; //reuse operator +=
}

// random access (iterator can skip n steps)
// invoke operator + ,operator *
reference operator[](difference_type n)const{
    return *(*this + n);
}

2. How to construct a deque

common function of deque

iterator begin(){return start;}
iterator end(){return finish;}

reference front(){
    //invoke __deque_iterator operator*
    // return start's member *cur
    return *start;
}

reference back(){
    // cna't use *finish
    iterator tmp = finish;
    --tmp; 
    return *tmp; //return finish's  *cur
}

reference operator[](size_type n){
    //random access, use __deque_iterator operator[]
    return start[n];
}


template<typename T, size_t buff_size>
deque<T, buff_size>::deque(size_t n, const value_type& value){
    fill_initialize(n, value);
}

template<typename T, size_t buff_size>
void deque<T, buff_size>::fill_initialize(size_t n, const value_type& value){
    // allocate memory for map and chunk
    // initialize pointer
    create_map_and_nodes(n);

    // initialize value for the chunks
    for (map_pointer cur = start.node; cur < finish.node; ++cur) {
        initialized_fill_n(*cur, chunk_size(), value);
    }

    // the end chunk may have space node, which don't need have initialize value
    initialized_fill_n(finish.first, finish.cur - finish.first, value);
}

template<typename T, size_t buff_size>
void deque<T, buff_size>::create_map_and_nodes(size_t num_elements){
    // the needed map node = (elements nums / chunk length) + 1
    size_type num_nodes = num_elements / chunk_size() + 1;

    // map node num。min num is  8 ,max num is "needed size + 2"
    map_size = std::max(8, num_nodes + 2);
    // allocate map array
    map = mapAllocator::allocate(map_size);

    // tmp_start,tmp_finish poniters to the center range of map
    map_pointer tmp_start  = map + (map_size - num_nodes) / 2;
    map_pointer tmp_finish = tmp_start + num_nodes - 1;

    // allocate memory for the chunk pointered by map node
    for (map_pointer cur = tmp_start; cur <= tmp_finish; ++cur) {
        *cur = dataAllocator::allocate(chunk_size());
    }

    // set start and end iterator
    start.set_node(tmp_start);
    start.cur = start.first;

    finish.set_node(tmp_finish);
    finish.cur = finish.first + num_elements % chunk_size();
}

Let's assume i_deque has 20 int elements 0~19 whose chunk size is 8, and now push_back 3 elements to i_deque:

i_deque.push_back(1);
i_deque.push_back(2);
i_deque.push_back(3);

It's internal structure like below:

Then push_back again, it will invoke allocate new chunk:

push_back(3)

If we push_front, it will allocate new chunk before the prev start

Note when push_back element into deque, if all the maps and chunks are filled, it will cause allocate new map, and adjust chunks.But the above code may be enough for you to understand deque.




Related


Tags

c++   stl   deque