Handling Hash Collisions

What happens when there is a collision in a hash table, and how is it handled?

A hash collision occurs when two or more keys in a hash table produce the same hash value, resulting in them being mapped to the same index in the underlying array. Collisions are inevitable in hash tables because the number of possible keys is typically much larger than the size of the array.

When a collision occurs, there are different strategies to handle it and store the colliding elements. The two most common strategies are:

Separate Chaining:

  • In separate chaining, each slot of the hash table is a linked list or another collection that can store multiple elements.
  • When a collision occurs, the colliding elements are stored in the same linked list at the corresponding index.
  • During lookup, the hash function is applied to the key to find the appropriate index, and then the linked list at that index is traversed to find the desired element.

Open Addressing:

  • In open addressing, when a collision occurs, the hash table probes for the next empty slot in the array to store the colliding element.
  • The probing can be done using different techniques, such as linear probing, quadratic probing, or double hashing.
  • During lookup, the hash function is applied to the key, and the resulting index is probed until the desired element is found or an empty slot is encountered.

Here's an example of handling collisions using separate chaining:

#include <iostream>
#include <list>
#include <vector>

class HashTable {
 private:
  std::vector<std::list<int>> table;
  int size;

 public:
  HashTable(int tableSize)
    : size(tableSize) { table.resize(size); }

  void insert(int key) {
    int index = hashFunction(key);
    table[index].push_back(key);  
  }

  bool search(int key) {
    int index = hashFunction(key);
    auto& list = table[index];
    return std::find(
      list.begin(), list.end(), key
    ) != list.end();
  }

 private:
  int hashFunction(int key) {
    return key % size; }
};

int main() {
  HashTable hashTable(10);

  // Inserting elements
  hashTable.insert(5);
  hashTable.insert(15);  // Collision with 5
  hashTable.insert(25);  // Collision with 5

  std::cout
    << "Search for 15: "
    << (hashTable.search(15)
      ? "Found" : "Not Found")
    << '\n';
  std::cout
    << "Search for 20: "
    << (hashTable.search(20)
      ? "Found" : "Not Found")
    << '\n';
}
Search for 15: Found
Search for 20: Not Found

In this example, when a collision occurs (e.g., keys 5, 15, and 25 produce the same hash value), the colliding elements are stored in a linked list at the corresponding index. During search, the linked list is traversed to find the desired element.

Handling collisions effectively is crucial for maintaining the performance and efficiency of hash tables. The choice of collision resolution strategy depends on factors such as the expected number of elements, the distribution of keys, and the desired trade-offs between memory usage and lookup time.

Hashing and std::hash

This lesson provides an in-depth look at hashing in C++, including std::hash, collision strategies, and usage in hash-based containers.

Questions & Answers

Answers are generated by AI models and may not have been reviewed. Be mindful when running any code on your device.

When to Use Hash Tables
In what scenarios are hash tables a good choice compared to other data structures?
Writing Custom Hash Functions
How can I write a custom hash function for my own data type?
std::unordered_map vs std::map
What are the differences between std::unordered_map and std::map in C++?
Hash Table Load Factor
What is the load factor of a hash table, and how does it affect performance?
Hash Table vs Binary Search Tree
When should I use a hash table (std::unordered_map) versus a binary search tree (std::map)?
Or Ask your Own Question
Get an immediate answer to your specific question using our AI assistant