c# in - Why is Dictionary preferred over Hashtable?




safe thread (16)

In most programming languages, dictionaries are preferred over hashtables. What are the reasons behind that?


Answers

Collections & Generics are useful for handling group of objects. In .NET, all the collections objects comes under the interface IEnumerable, which in turn has ArrayList(Index-Value)) & HashTable(Key-Value). After .NET framework 2.0, ArrayList & HashTable were replaced with List & Dictionary. Now, the Arraylist & HashTable are no more used in nowadays projects.

Coming to the difference between HashTable & Dictionary, Dictionary is generic where as Hastable is not Generic. We can add any type of object to HashTable, but while retrieving we need to cast it to the required type. So, it is not type safe. But to dictionary, while declaring itself we can specify the type of key and value, so there is no need to cast while retrieving.

Let's look at an example:

HashTable

class HashTableProgram
{
    static void Main(string[] args)
    {
        Hashtable ht = new Hashtable();
        ht.Add(1, "One");
        ht.Add(2, "Two");
        ht.Add(3, "Three");
        foreach (DictionaryEntry de in ht)
        {
            int Key = (int)de.Key; //Casting
            string value = de.Value.ToString(); //Casting
            Console.WriteLine(Key + " " + value);
        }

    }
}

Dictionary,

class DictionaryProgram
{
    static void Main(string[] args)
    {
        Dictionary<int, string> dt = new Dictionary<int, string>();
        dt.Add(1, "One");
        dt.Add(2, "Two");
        dt.Add(3, "Three");
        foreach (KeyValuePair<int, String> kv in dt)
        {
            Console.WriteLine(kv.Key + " " + kv.Value);
        }
    }
}

HashTable:

Key/value will be converted into an object (boxing) type while storing into the heap.

Key/value needs to be converted into the desired type while reading from the heap.

These operations are very costly. We need to avoid boxing/unboxing as much as possible.

Dictionary : Generic variant of HashTable.

No boxing/unboxing. No conversions required.


People are saying that a Dictionary is the same as a hash table.

This is not necessarily true. A hash table is an implementation of a dictionary. A typical one at that, and it may be the default one in .NET, but it's not by definition the only one.

You could equally well implement a dictionary with a linked list or a search tree, it just wouldn't be as efficient (for some metric of efficient).


Since .NET Framework 3.5 there is also a HashSet<T> which provides all the pros of the Dictionary<TKey, TValue> if you need only the keys and no values.

So if you use a Dictionary<MyType, object> and always set the value to null to simulate the type safe hash table you should maybe consider switching to the HashSet<T>.


The Hashtable is a loosely-typed data structure, so you can add keys and values of any type to the Hashtable. The Dictionary class is a type-safe Hashtable implementation, and the keys and values are strongly typed. When creating a Dictionary instance, you must specify the data types for both the key and value.


For what it's worth, a Dictionary is (conceptually) a hash table.

If you meant "why do we use the Dictionary<TKey, TValue> class instead of the Hashtable class?", then it's an easy answer: Dictionary<TKey, TValue> is a generic type, Hashtable is not. That means you get type safety with Dictionary<TKey, TValue>, because you can't insert any random object into it, and you don't have to cast the values you take out.

Interestingly, the Dictionary<TKey, TValue> implementation in the .NET Framework is based on the Hashtable, as you can tell from this comment in its source code:

The generic Dictionary was copied from Hashtable's source

Source


Another important difference is that Hashtable is thread safe. Hashtable has built-in multiple reader/single writer (MR/SW) thread safety which means Hashtable allows ONE writer together with multiple readers without locking.

In the case of Dictionary there is no thread safety; if you need thread safety you must implement your own synchronization.

To elaborate further:

Hashtable provides some thread-safety through the Synchronized property, which returns a thread-safe wrapper around the collection. The wrapper works by locking the entire collection on every add or remove operation. Therefore, each thread that is attempting to access the collection must wait for its turn to take the one lock. This is not scalable and can cause significant performance degradation for large collections. Also, the design is not completely protected from race conditions.

The .NET Framework 2.0 collection classes like List<T>, Dictionary<TKey, TValue>, etc. do not provide any thread synchronization; user code must provide all synchronization when items are added or removed on multiple threads concurrently

If you need type safety as well thread safety, use concurrent collections classes in the .NET Framework. Further reading here.

An additional difference is that when we add the multiple entries in Dictionary, the order in which the entries are added is maintained. When we retrieve the items from Dictionary we will get the records in the same order we have inserted them. Whereas Hashtable doesn't preserve the insertion order.


According to what I see by using .NET Reflector:

[Serializable, ComVisible(true)]
public abstract class DictionaryBase : IDictionary, ICollection, IEnumerable
{
    // Fields
    private Hashtable hashtable;

    // Methods
    protected DictionaryBase();
    public void Clear();
.
.
.
}
Take note of these lines
// Fields
private Hashtable hashtable;

So we can be sure that DictionaryBase uses a HashTable internally.


In .NET, the difference between Dictionary<,> and HashTable is primarily that the former is a generic type, so you get all the benefits of generics in terms of static type checking (and reduced boxing, but this isn't as big as people tend to think in terms of performance - there is a definite memory cost to boxing, though).


Dictionary <<<>>> Hashtable differences:

  • Generic <<<>>> Non-Generic
  • Needs own thread synchronization <<<>>> Offers thread safe version through Synchronized() method
  • Enumerated item: KeyValuePair <<<>>> Enumerated item: DictionaryEntry
  • Newer (> .NET 2.0) <<<>>> Older (since .NET 1.0)
  • is in System.Collections.Generic <<<>>> is in System.Collections
  • Request to non-existing key throws exception <<<>>> Request to non-existing key returns null
  • potentially a bit faster for value types <<<>>> bit slower (needs boxing/unboxing) for value types

Dictionary / Hashtable similarities:

  • Both are internally hashtables == fast access to many-item data according to key
  • Both need immutable and unique keys
  • Keys of both need own GetHashCode() method

Similar .NET collections (candidates to use instead of Dictionary and Hashtable):

  • ConcurrentDictionary - thread safe (can be safely accessed from several threads concurrently)
  • HybridDictionary - optimized performance (for few items and also for many items)
  • OrderedDictionary - values can be accessed via int index (by order in which items were added)
  • SortedDictionary - items automatically sorted
  • StringDictionary - strongly typed and optimized for strings

Dictionary<> is a generic type and so it's type safe.

You can insert any value type in HashTable and this may sometimes throw an exception. But Dictionary<int> will only accept integer values and similarly Dictionary<string> will only accept strings.

So, it is better to use Dictionary<> instead of HashTable.


One more difference that I can figure out is:

We can not use Dictionary<KT,VT> (generics) with web services. The reason is no web service standard supports the generics standard.


The Extensive Examination of Data Structures Using C# article on MSDN states that there is also a difference in the collision resolution strategy:

The Hashtable class uses a technique referred to as rehashing.

Rehashing works as follows: there is a set of hash different functions, H1 ... Hn, and when inserting or retrieving an item from the hash table, initially the H1 hash function is used. If this leads to a collision, H2 is tried instead, and onwards up to Hn if needed.

The Dictionary uses a technique referred to as chaining.

With rehashing, in the event of a collision the hash is recomputed, and the new slot corresponding to a hash is tried. With chaining, however, a secondary data structure is utilized to hold any collisions. Specifically, each slot in the Dictionary has an array of elements that map to that bucket. In the event of a collision, the colliding element is prepended to the bucket's list.


Notice that MSDN says: "Dictionary<(Of <(TKey, TValue>)>) class is implemented as a hash table", not "Dictionary<(Of <(TKey, TValue>)>) class is implemented as a HashTable"

Dictionary is NOT implemented as a HashTable, but it is implemented following the concept of a hash table. The implementation is unrelated to the HashTable class because of the use of Generics, although internally Microsoft could have used the same code and replaced the symbols of type Object with TKey and TValue.

In .NET 1.0 Generics did not exist; this is where the HashTable and ArrayList originally began.


Dictionary:

  • It returns/throws Exception if we try to find a key which does not exist.

  • It is faster than a Hashtable because there is no boxing and unboxing.

  • Only public static members are thread safe.

  • Dictionary is a generic type which means we can use it with any data type (When creating, must specify the data types for both keys and values).

    Example: Dictionary<string, string> <NameOfDictionaryVar> = new Dictionary<string, string>();

  • Dictionay is a type-safe implementation of Hashtable, Keys and Values are strongly typed.

Hashtable:

  • It returns null if we try to find a key which does not exist.

  • It is slower than dictionary because it requires boxing and unboxing.

  • All the members in a Hashtable are thread safe,

  • Hashtable is not a generic type,

  • Hashtable is loosely-typed data structure, we can add keys and values of any type.


maybe something like this:

foreach (var keyvaluepair in dict)
{
    if(Object.ReferenceEquals(keyvaluepair.Value, searchedObject))
    {
        //dict.Remove(keyvaluepair.Key);
        break;
    }
}






c# .net vb.net data-structures