Data Structures for Sets

A set is a group of unique, yet similar items related to each other where order doesn’t matter. For example, a set of balls as shown in the picture is a real-world example off a set.

Below represents a set of balls, but none of them are the same. We can also arrange them in any order. We cannot extend the set to add new items as they are finite. ball={football, baseball, basketball, cricket ball, tennis ball}

How are sets implemented?

Set is an abstract data type that uses a List, an Associative array or a Bit array for its implementation. We can implement a simple data representation of similar items using a Linear list. However, if you are representing Boolean results like True or False, a Bit array comes in handy as it requires very little storage space. Hash tables and binary trees are used to depict search algorithms and Associative arrays are very useful in those scenarios.

A null set contains no elements. Null sets are used when we are not sure if the elements will be populated as a result of a runtime operation. The values are dynamically populated in a null set.

How are sets implemented?

A set supports the following operations:

Subset:
smaller set of the larger set. In the below example, A is a Subset of a larger portion of B. A subset returns 1 if the first set has some elements that are a subset of B, else returns 0.

Here, A is a subset of B and includes A={4,2,1}

Union:

Returns a unique set of elements of both Set A and B. For example, pick all the musicians who play a band, sing a chorus, or sing both, without including the duplicates.

Intersection:

When you need a set of elements common to both sets, you can create an intersection. Below is a representation of the intersection of sets A and B.

Types of sets:

A Mutable Set:
Sets can be mutable if you want to create them during runtime. The size of the set can be dynamically defined and elements can be added or removed during runtime. You can create, add, delete, or check the capacity when working with a mutable set.

An Immutable Set:

When you are sure of the set size and the elements, immutable sets are best. These are static, so you cannot alter the set. You can check the size if empty, build or read the element of the immutable set.

A Disjoint Set:
This set is a collection of subsets that do not overlap with each other.

A very good way to represent these singleton sets is using a linked list. Elements can be added to the set by pointing to other sets. Undirected graphs are another way to use a disjoint set. Disjoint set supports find, union and subset operations.

A Sparse Set:

This is similar to the Bit set with the only difference being that the elements are indices of a larger array. As managing a very large array is a big hassle, a sparse set lets you easily break them into multiple sets and reference the parent set.
Below is an example of a sparse set.

When you use a sparse set, the entire set is split into multiple coarse-grained arrays and each array stores the indices. This makes search faster and simpler. A sparse set supports Union and intersection operations.

To ace your interview, you can explore related topics. Learn about other data structures through the following blogs on our website:

These provide insights into the various types of data structures and will give you better understanding.

Get one step closer to your dream job!

by preparing for your interview with questions from our Job Interview Questions Book Series. These provide a comprehensive list of questions for your interview regardless of your area of expertise. These books include:

You can find them on our website!

Back to blog