Collision Resolution with Hashing

While many hash functions exist, new programmers need clarification about which one to choose. There’s no formula available for choosing the right hash function. Clustering or collision is the most common problem in hash functions and must be addressed appropriately.

Collision Resolution Techniques:

When one or more hash values compete with a single hash table slot, collisions occur. To resolve this, the next available empty slot is assigned to the current hash value. The most common methods are open addressing, chaining, probabilistic hashing, perfect hashing and coalesced hashing techniques.

Let’s understand them in more detail:

a) Chaining:

This technique implements a linked list and is the most popular out of all the collision resolution techniques. Below is an example of a chaining process.

Since one slot here has 3 elements – {50, 85, 92}, a linked list is assigned to include the other 2 items {85, 92}. When you use the chaining technique, the insertion or deletion of items with the hash table is fairly simple and high-performing. Likewise, a chain hash table inherits the pros and cons of a linked list. Alternatively, chaining can use dynamic arrays instead of linked lists.

b) Open Addressing:

This technique depends on space usage and can be done with linear or quadratic probing techniques. As the name says, this technique tries to find an available slot to store the record. It can be done in one of the 3 ways –

Linear probing – Here, the next probe interval is fixed to 1. It supports the best caching but miserably fails at clustering.
Quadratic probing – the probe distance is calculated based on the quadratic equation. This is considerably a better option as it balances clustering and caching.
Double hashing – Here, the probing interval is fixed for each record by a second hashing function. This technique has poor cache performance although it does not have any clustering issues.

Below are some of the hashing techniques that can help in resolving collision.

c) Probabilistic hashing:

This is memory-based hashing that implements caching. When a collision occurs, either the old record is replaced by the new or the new record may be dropped. Although this scenario has a risk of losing data, it is still preferred due to its ease of implementation and high performance.

d) Perfect hashing:

When the slots are uniquely mapped, the chances of collision are minimal. However, it can be done where there is a lot of spare memory.

e) Coalesced hashing:

This technique is a combo of open address and chaining methods. A chain of items is are stored in the table when there is a collision. The next available table space is used to store the items to prevent collision.

To ace your interview, you can explore related topics. Learn about other data structures through the following blogs on our website:

These provide insights into the various types of data structures and will give you better understanding.

Pave the route to your dream job by preparing for your interview with questions from our Job Interview Questions Book Series These provide a comprehensive list of questions for your interview regardless of your area of expertise. These books include:

You can find them on our website!

Back to blog