Hash Maps using std::unordered_map

Handling Hash Collisions in std::unordered_map

How does std::unordered_map handle hash collisions?

Abstract art representing computer programming

std::unordered_map handles hash collisions using a technique called chaining. When multiple keys hash to the same bucket, the container stores them in a linked list or a similar data structure within that bucket. This allows multiple elements to coexist in the same bucket.

When a collision occurs, the container uses the equality comparison function (operator== or a custom comparator) to find the desired element within the bucket.

Here's an example that demonstrates hash collisions:

#include <iostream>
#include <string>
#include <unordered_map>

struct Person {
  std::string name;
  int age;

  bool operator==(const Person& other) const {
    return name == other.name && age == other.age;
  }
};

namespace std {
template <>
struct hash<Person> {
  size_t operator()(const Person& person) const {
    return std::hash<std::string>{}(person.name);
  }
};
}

int main() {
  std::unordered_map<Person, int> peopleMap;

  peopleMap[Person{"Alice", 25}] = 1;
  peopleMap[Person{"Bob", 30}] = 2;

  // Collision with "Alice" key
  peopleMap[Person{"Alice", 30}] = 3;

  for (const auto& [person, id] : peopleMap) {
    std::cout
      << "Name: " << person.name
      << ", Age: " << person.age
      << ", ID: " << id << '\n';
  }

  std::cout << "Bucket count: "
    << peopleMap.bucket_count() << '\n';  
  std::cout << "Load factor: "
    << peopleMap.load_factor() << '\n';    
}
Name: Alice, Age: 30, ID: 3
Name: Alice, Age: 25, ID: 1
Name: Bob, Age: 30, ID: 2
Bucket count: 8
Load factor: 0.375

In this example, the Person type uses only the name member for hashing. As a result, Person{"Alice", 25} and Person{"Alice", 30} collide because they have the same name.

The container resolves the collision by storing both elements in the same bucket. When accessing elements, it uses the equality comparison function to distinguish between the colliding elements.

The bucket_count() function returns the number of buckets in the container, and load_factor() returns the average number of elements per bucket. A higher load factor indicates more collisions.

To minimize collisions and improve performance, you can:

  • Use a good hash function that distributes keys evenly across buckets.
  • Increase the number of buckets by calling rehash() with a higher bucket count.
  • Adjust the maximum load factor using max_load_factor() to trigger automatic rehashing when the load factor exceeds a certain threshold.
// Increase bucket count to 10
peopleMap.rehash(10);

// Set maximum load factor to 0.75
peopleMap.max_load_factor(0.75);

By understanding how std::unordered_map handles collisions using chaining and taking steps to minimize collisions, you can optimize the performance of your unordered map and ensure efficient key-value lookups.

Answers to questions are automatically generated and may not have been reviewed.

Free, Unlimited Access

Professional C++

Comprehensive course covering advanced concepts, and how to use them on large-scale projects.

Screenshot from Warhammer: Total War
Screenshot from Tomb Raider
Screenshot from Jedi: Fallen Order
Contact|Privacy Policy|Terms of Use
Copyright © 2024 - All Rights Reserved