EasyCustom Data Structures

Design HashSet

Explanation & Solution

Description

Design a HashSet without using any built-in hash table libraries.

Implement a function that processes an array of operations and an array of arguments:

  • "MyHashSet" — Initializes the HashSet object.
  • "add" — Inserts the value key into the HashSet.
  • "remove" — Removes the value key from the HashSet. If key does not exist, do nothing.
  • "contains" — Returns true if the value key exists in the HashSet, false otherwise.
Input:operations = ["MyHashSet","add","add","contains","contains","add","contains","remove","contains"], args = [[],[1],[2],[1],[3],[2],[2],[2],[2]]
0
MyHashSet
1
add
2
add
3
contains
4
contains
5
add
6
contains
7
remove
8
contains
Output:[null,null,null,true,false,null,true,null,false]
0
null
1
null
2
null
3
true
4
false
5
null
6
true
7
null
8
false

Explanation: Add 1 and 2. Contains(1) is true. Contains(3) is false. Add 2 again (no-op). Contains(2) is true. Remove 2. Contains(2) is false.

Constraints

  • 0 <= key <= 10^6
  • At most 10^4 calls will be made to add, remove, and contains

Approach

Custom Data Structures pattern

1. Choose a Hash Function

  • Use modulo hashing: hash(key) = key % SIZE
  • Choose SIZE = 1000 to balance between memory and chain length
  • This maps each key to a bucket index from 0 to 999

2. Initialize Buckets

  • Create an array of SIZE empty arrays (chaining approach)
  • Each bucket is a linked list (array) of keys that hash to that bucket

3. Implement Add

  • Compute the bucket index using the hash function
  • Check if the key already exists in that bucket
  • If not, push the key into the bucket

4. Implement Remove

  • Compute the bucket index
  • Find the key's position using indexOf
  • If found, remove it using splice

5. Implement Contains

  • Compute the bucket index
  • Return whether the key exists in that bucket using includes

Key Insight

  • Chaining handles hash collisions by storing multiple keys per bucket
  • With a good hash function and reasonable bucket count, average operations are O(1)
Time
O(n/SIZE) average per operation where n = number of keys
Space
O(n + SIZE)

Solution Code