Introduction
Welcome to Chapter 10! So far, we’ve explored linear data structures like arrays, linked lists, stacks, and queues, each offering unique strengths and weaknesses regarding storage and access patterns. While arrays give us O(1) access by index and linked lists excel at O(1) insertions/deletions at specific points, searching for an arbitrary value in both often requires an O(N) scan.
But what if you need to find information instantly based on a unique identifier, like looking up a word in a dictionary or a phone number by name? This is where Hash Maps (also known as Hash Tables or Dictionaries) and Hash Sets shine! These powerful data structures are designed for blazing-fast lookups, insertions, and deletions, often achieving an average time complexity of O(1) – that’s constant time, no matter how much data you have!
In this chapter, we’ll unravel the magic behind hash maps and sets. We’ll learn how they work by understanding the concept of “hashing,” delve into the clever ways they handle collisions, build our own basic versions in TypeScript, and explore their countless real-world applications, from caching to database indexing. Get ready to add some serious speed to your data management toolkit!
Core Concepts: The Magic of Hashing
At the heart of both hash maps and hash sets lies a fundamental idea: hashing. Let’s break down this concept step by step.
What is a Hash Map?
Imagine you have a phone book, but instead of being sorted alphabetically, it’s organized by a unique “code” for each person. To find someone’s number, you don’t flip through pages; you instantly calculate their code and go directly to that page. That’s essentially a Hash Map!
A Hash Map stores data as key-value pairs. Each key is unique and maps to a value. The beauty of a hash map is its ability to retrieve a value associated with a key almost instantaneously.
Think of it like this:
- Key: The unique identifier (e.g., a person’s name).
- Value: The data associated with that key (e.g., their phone number).
In JavaScript/TypeScript, you’re already familiar with objects {} which are essentially a form of hash map (though with specific limitations for keys). The built-in Map object is a more robust, modern implementation.
How Does Hashing Work? The Core Idea
The “magic code” we talked about earlier is generated by a hash function.
The Hash Function:
- A hash function takes a
keyas input (it could be a string, a number, or even an object) and returns a fixed-size integer, called a hash value or hash code. - This hash value is then used as an index into an underlying array (often called the “buckets” or “slots”).
- Goal: A good hash function should distribute keys evenly across the array, and consistently return the same hash value for the same key.
Let’s visualize this:
- A hash function takes a
- Buckets (or Slots):
- The hash map uses an internal array, where each element of the array is called a “bucket” or “slot.”
- When you insert a key-value pair, the hash function tells you which bucket to put it in.
- When you want to retrieve a value, the hash function tells you which bucket to look in.
Collision Resolution: What Happens When Keys Collide?
What if two different keys produce the same hash value? This is called a collision, and it’s a common challenge in hashing. A perfect hash function (one that never produces collisions) is rare and often impossible for arbitrary data. So, hash maps need strategies to resolve them.
Two primary strategies are:
Chaining:
- This is the most common approach. Each bucket in the underlying array doesn’t just hold one item; it holds a list (like a linked list or another array) of key-value pairs.
- If two keys hash to the same bucket, both key-value pairs are simply added to that bucket’s list.
- When searching, you go to the bucket and then iterate through the list within that bucket to find the specific key.
flowchart TD A[Key1] –> B{Hash Function} B –> C1[Hash Value: 5] D[Buckets Array] C1 –> E1[Bucket 5] E1 –> F1["[Key1, Value1]"] A2[Key2] –> B B –> C2[Hash Value: 5] C2 –> E1 E1 –> F2["[Key2, Value2]"] F1 –>|\1| F2 style E1 fill:#f9f,stroke:#333,stroke-width:2px
* **Why it works:** Even if there's a collision, you only have to search a *small* list within that bucket, not the entire map.
2. **Open Addressing (Probing):**
* Instead of storing a list in each bucket, each bucket holds at most one key-value pair.
* If a collision occurs (the intended bucket is already occupied), the algorithm "probes" for the *next available* empty slot using a defined strategy.
* Common probing strategies include:
* **Linear Probing:** Check the next slot, then the next, and so on (e.g., `index + 1`, `index + 2`). Can lead to "clustering" where occupied slots group together.
* **Quadratic Probing:** Check `index + 1^2`, `index + 2^2`, `index + 3^2`, etc.
* **Double Hashing:** Use a second hash function to determine the step size for probing.
* **Complexity:** More complex to implement correctly than chaining, especially for deletion.
We'll focus on chaining for our implementation as it's conceptually simpler for a basic understanding.
### Load Factor and Rehashing
The **load factor** of a hash map is a crucial concept for performance. It's calculated as:
`Load Factor = (Number of stored items) / (Number of buckets)`
* If the load factor gets too high (e.g., above 0.7 or 0.75), it means many buckets are getting full, leading to more collisions and longer lists within buckets (for chaining). This degrades performance.
* To maintain O(1) average time complexity, when the load factor exceeds a certain threshold, the hash map performs a **rehashing** operation. This involves:
1. Creating a new, larger array of buckets (e.g., double the size).
2. Re-inserting *all* existing key-value pairs into the new, larger array using the hash function.
* Rehashing is an O(N) operation, but it happens infrequently, ensuring that most individual operations remain O(1) on average.
### What is a Hash Set?
A **Hash Set** is very similar to a Hash Map, but with a simpler purpose: storing a collection of **unique values**. It doesn't store key-value pairs; it just stores values.
* Think of it as a hash map where the "key" and the "value" are the same thing (or the value is just a placeholder like `true` or `null`).
* It leverages hashing to ensure that each element stored is unique, and to provide O(1) average time complexity for checking if a value exists (membership testing), adding a value, or removing a value.
In TypeScript, the built-in `Set` object provides this functionality.
### Big-O Complexity of Hash Maps and Sets
This is where hash maps and sets truly shine!
* **Average Case:**
* **Insertion (add/set):** O(1)
* **Deletion (delete):** O(1)
* **Lookup (get/has):** O(1)
* **Why O(1)?:** A good hash function distributes keys evenly, so most operations involve a single hash calculation and then directly accessing a bucket. If there's a collision, the list within the bucket is usually very short.
* **Worst Case:**
* **Insertion, Deletion, Lookup:** O(N)
* **Why O(N)?:** This happens in rare scenarios, primarily if:
* All keys hash to the *same* bucket (a terrible hash function or malicious input). In this case, you end up with a single linked list of all N items, degrading performance to that of a linked list.
* The load factor is extremely high, leading to very long collision chains.
* Modern built-in `Map` and `Set` implementations are highly optimized to minimize these worst-case scenarios.
* **Space Complexity:**
* O(N), where N is the number of key-value pairs (or values in a set). This is because we need to store all the items.
### When to Choose Hash Maps/Sets?
* When you need lightning-fast lookups, insertions, and deletions based on a key or value.
* When the order of elements doesn't matter.
* When you need to quickly check for the presence of an item (Sets).
* When you need to count frequencies or group items by a unique identifier.
## Step-by-Step Implementation: Building a Basic `MyHashMap` in TypeScript
Let's get our hands dirty and build a simplified `MyHashMap` class in TypeScript. We'll use the chaining method for collision resolution.
We'll start with the bare bones and add functionality incrementally.
**1. Project Setup:**
Make sure you have a TypeScript project set up. If not, quickly create one:
```bash
mkdir my-hashmap-project
cd my-hashmap-project
npm init -y
npm install typescript @types/node --save-dev
npx tsc --init # Creates tsconfig.json
Now, create a file named src/MyHashMap.ts.
2. Basic Structure and Item Type:
First, we need a way to represent a key-value pair within our buckets.
// src/MyHashMap.ts
// Define a type for our key-value pairs stored in buckets
type KeyValuePair<K, V> = [K, V];
// Our HashMap class
class MyHashMap<K, V> {
// The internal array of buckets. Each bucket will be an array of KeyValuePair.
private buckets: KeyValuePair<K, V>[][];
private capacity: number; // The size of our buckets array
private size: number; // The number of actual key-value pairs stored
constructor(initialCapacity: number = 10) {
if (initialCapacity < 1) {
throw new Error("Capacity must be at least 1.");
}
this.capacity = initialCapacity;
// Initialize buckets with empty arrays for chaining
this.buckets = Array.from({ length: this.capacity }, () => []);
this.size = 0;
}
// We'll add methods here
}
KeyValuePair<K, V>: This generic type represents a[key, value]tuple. Using generics<K, V>means our hash map can work with any key typeKand any value typeV. How cool is that?buckets: This is the core of our hash map – an array where each element is another array. This inner array will holdKeyValuePairs that hash to the same index (our chaining strategy).capacity: The total number of buckets we have.size: The actual number of key-value pairs currently stored in our hash map. This will be useful for tracking the load factor later.constructor: Initializes ourbucketsarray with empty inner arrays, ready to store items.
3. The hash Function:
This is the brain of our hash map. For simplicity, we’ll implement a basic string hashing function. For non-string keys, we’ll convert them to strings.
// ... (inside MyHashMap class)
// A simple hash function that takes a key and returns an index
private hash(key: K): number {
// Convert the key to a string to ensure consistent hashing
const keyString = String(key);
let hashValue = 0;
for (let i = 0; i < keyString.length; i++) {
// A common simple hashing technique: sum character codes
hashValue = (hashValue + keyString.charCodeAt(i)) % this.capacity;
}
return hashValue;
}
hash(key: K): Takes any key, converts it to a string, and then sums up the ASCII (or Unicode) character codes.% this.capacity: The modulo operator is crucial here! It ensures that thehashValuealways falls within the valid index range of ourbucketsarray (from0tothis.capacity - 1).
4. Implementing set(key, value) - Adding or Updating Items:
This method will add a new key-value pair or update the value if the key already exists.
// ... (inside MyHashMap class)
set(key: K, value: V): void {
const index = this.hash(key);
const bucket = this.buckets[index];
// Check if the key already exists in the bucket (collision or update)
for (let i = 0; i < bucket.length; i++) {
if (bucket[i][0] === key) {
// Key found, update its value
bucket[i][1] = value;
return; // We're done
}
}
// Key not found, add a new key-value pair to the bucket
bucket.push([key, value]);
this.size++;
// Optional: Check load factor and rehash if necessary
// (We'll skip rehashing for this basic implementation to keep it simple,
// but it's crucial for real-world performance!)
// if (this.size / this.capacity > 0.7) {
// this.rehash();
// }
}
- We calculate the
indexusing ourhashfunction. - We get the
bucket(the inner array) at thatindex. - We then iterate through the
bucketto see if thekeyalready exists. If it does, we update itsvalueand return. - If the
keyis not found after checking the entirebucket, it means it’s a new entry, so wepushthe[key, value]pair into thebucketand incrementthis.size.
5. Implementing get(key) - Retrieving Values:
This method allows us to retrieve the value associated with a given key.
// ... (inside MyHashMap class)
get(key: K): V | undefined {
const index = this.hash(key);
const bucket = this.buckets[index];
// Iterate through the bucket to find the key
for (let i = 0; i < bucket.length; i++) {
if (bucket[i][0] === key) {
return bucket[i][1]; // Key found, return its value
}
}
return undefined; // Key not found
}
- Again, calculate the
index. - Access the
bucket. - Iterate through the
bucket’s items. If thekeymatches, return the correspondingvalue. - If the loop finishes without finding the key, return
undefined.
6. Implementing has(key) - Checking for Key Existence:
This is similar to get, but just returns a boolean.
// ... (inside MyHashMap class)
has(key: K): boolean {
const index = this.hash(key);
const bucket = this.buckets[index];
for (let i = 0; i < bucket.length; i++) {
if (bucket[i][0] === key) {
return true; // Key found
}
}
return false; // Key not found
}
7. Implementing delete(key) - Removing Items:
This method removes a key-value pair from the hash map.
// ... (inside MyHashMap class)
delete(key: K): boolean {
const index = this.hash(key);
const bucket = this.buckets[index];
for (let i = 0; i < bucket.length; i++) {
if (bucket[i][0] === key) {
// Key found, remove it from the bucket using splice
bucket.splice(i, 1);
this.size--;
return true; // Successfully deleted
}
}
return false; // Key not found
}
// Helper to get the current number of items
getSize(): number {
return this.size;
}
} // End of MyHashMap class
- Find the
bucketusing the hashindex. - Iterate and find the key.
- If found, use
splice(i, 1)to remove that item from thebucketarray. - Decrement
this.sizeand returntrue. - If not found, return
false.
8. Testing Our MyHashMap:
Create a new file src/app.ts (or index.ts) to test our implementation.
// src/app.ts
import { MyHashMap } from './MyHashMap'; // Assuming MyHashMap is in MyHashMap.ts
console.log("--- Testing MyHashMap ---");
const myMap = new MyHashMap<string, number>(5); // Small capacity for demonstration
console.log(`Initial size: ${myMap.getSize()}`); // Expected: 0
myMap.set("apple", 10);
myMap.set("banana", 20);
myMap.set("cherry", 30);
myMap.set("date", 40);
myMap.set("elderberry", 50);
console.log(`Size after additions: ${myMap.getSize()}`); // Expected: 5
console.log(`Value of apple: ${myMap.get("apple")}`); // Expected: 10
console.log(`Value of banana: ${myMap.get("banana")}`); // Expected: 20
console.log(`Value of grape (not in map): ${myMap.get("grape")}`); // Expected: undefined
console.log(`Has cherry? ${myMap.has("cherry")}`); // Expected: true
console.log(`Has fig? ${myMap.has("fig")}`); // Expected: false
// Test update
myMap.set("apple", 100);
console.log(`Updated value of apple: ${myMap.get("apple")}`); // Expected: 100
// Test deletion
console.log(`Deleting banana: ${myMap.delete("banana")}`); // Expected: true
console.log(`Value of banana after deletion: ${myMap.get("banana")}`); // Expected: undefined
console.log(`Size after deletion: ${myMap.getSize()}`); // Expected: 4
console.log(`Deleting non-existent key 'fig': ${myMap.delete("fig")}`); // Expected: false
// Demonstrate collision (with a very simple hash function, this is easy)
// Let's assume 'fa' and 'ga' might collide with our simple hash function and small capacity
// For example, if capacity is 5:
// hash('date') = (100+97+116+101) % 5 = 414 % 5 = 4
// hash('fig') = (102+105+103) % 5 = 310 % 5 = 0
// hash('gin') = (103+105+110) % 5 = 318 % 5 = 3
// hash('jam') = (106+97+109) % 5 = 312 % 5 = 2
// Let's manually create a collision for demonstration
// If 'apple' hashes to 0, and we add 'apricot', which also hashes to 0 with our simple function,
// they will be in the same bucket.
// 'apple': 97+112+112+108+101 = 530. 530 % 5 = 0
// 'apricot': 97+112+114+105+99+111+116 = 754. 754 % 5 = 4
// Okay, my simple hash function doesn't make collisions easily predictable without manual calculation.
// Let's just trust our chaining logic works if a collision *were* to occur.
// The key is that `get` and `delete` correctly iterate the bucket.
// Using the built-in Map for comparison (always prefer built-in for production)
console.log("\n--- Testing Built-in Map ---");
const builtinMap = new Map<string, string>();
builtinMap.set("name", "Alice");
builtinMap.set("age", "30");
console.log(`Built-in Map name: ${builtinMap.get("name")}`); // Expected: Alice
console.log(`Built-in Map has age: ${builtinMap.has("age")}`); // Expected: true
builtinMap.delete("age");
console.log(`Built-in Map has age after delete: ${builtinMap.has("age")}`); // Expected: false
To run this:
npx tsc # Compile TypeScript
node dist/app.js # Run the compiled JavaScript
This basic MyHashMap demonstrates the core principles of hashing and collision resolution through chaining. In a production environment, you would always use the built-in Map object in JavaScript/TypeScript, as it’s highly optimized with sophisticated hash functions and rehashing mechanisms.
Mini-Challenge: Implementing MyHashSet
Now that you’ve built a MyHashMap, let’s apply that knowledge to create a MyHashSet.
Challenge: Implement a MyHashSet<T> class that uses the same underlying hashing principles as MyHashMap. It should provide the following methods:
constructor(initialCapacity: number)add(value: T): boolean- Adds a value to the set. Returnstrueif the value was added (it was new),falseotherwise (it already existed).has(value: T): boolean- Checks if a value is present in the set.delete(value: T): boolean- Removes a value from the set. Returnstrueif the value was removed,falseotherwise.getSize(): number- Returns the number of unique values in the set.
Hint: Think about how a Set is fundamentally a Map where you only care about the keys. What could you use as the “value” in your underlying KeyValuePair for a Set? Perhaps a simple boolean true or null?
What to Observe/Learn: Notice the strong similarities between the implementation of a hash map and a hash set. The core logic of hashing, finding the bucket, and iterating within the bucket remains largely the same.
Feel free to open src/MyHashSet.ts and give it a try!
Click for Solution (after you've tried it!)
// src/MyHashSet.ts
// Define a type for our key-value pairs stored in buckets
// For a Set, the "value" part is just a placeholder, we only care about the key.
type SetItem<T> = [T, true]; // We'll use 'true' as a dummy value
export class MyHashSet<T> {
private buckets: SetItem<T>[][];
private capacity: number;
private size: number;
constructor(initialCapacity: number = 10) {
if (initialCapacity < 1) {
throw new Error("Capacity must be at least 1.");
}
this.capacity = initialCapacity;
this.buckets = Array.from({ length: this.capacity }, () => []);
this.size = 0;
}
private hash(value: T): number {
const valueString = String(value);
let hashValue = 0;
for (let i = 0; i < valueString.length; i++) {
hashValue = (hashValue + valueString.charCodeAt(i)) % this.capacity;
}
return hashValue;
}
add(value: T): boolean {
const index = this.hash(value);
const bucket = this.buckets[index];
// Check if the value already exists
for (let i = 0; i < bucket.length; i++) {
if (bucket[i][0] === value) {
return false; // Value already exists, not added
}
}
// Value not found, add a new item
bucket.push([value, true]); // Store the value with a dummy 'true'
this.size++;
return true; // Value successfully added
}
has(value: T): boolean {
const index = this.hash(value);
const bucket = this.buckets[index];
for (let i = 0; i < bucket.length; i++) {
if (bucket[i][0] === value) {
return true; // Value found
}
}
return false; // Value not found
}
delete(value: T): boolean {
const index = this.hash(value);
const bucket = this.buckets[index];
for (let i = 0; i < bucket.length; i++) {
if (bucket[i][0] === value) {
bucket.splice(i, 1);
this.size--;
return true; // Successfully deleted
}
}
return false; // Value not found
}
getSize(): number {
return this.size;
}
}
// Add to src/app.ts to test:
/*
import { MyHashSet } from './MyHashSet';
console.log("\n--- Testing MyHashSet ---");
const mySet = new MyHashSet<string>(5);
console.log(`Initial set size: ${mySet.getSize()}`); // Expected: 0
console.log(`Adding apple: ${mySet.add("apple")}`); // Expected: true
console.log(`Adding banana: ${mySet.add("banana")}`); // Expected: true
console.log(`Adding apple again: ${mySet.add("apple")}`); // Expected: false (duplicate)
console.log(`Set size after additions: ${mySet.getSize()}`); // Expected: 2 (apple, banana)
console.log(`Has banana? ${mySet.has("banana")}`); // Expected: true
console.log(`Has cherry? ${mySet.has("cherry")}`); // Expected: false
console.log(`Deleting banana: ${mySet.delete("banana")}`); // Expected: true
console.log(`Has banana after deletion? ${mySet.has("banana")}`); // Expected: false
console.log(`Set size after deletion: ${mySet.getSize()}`); // Expected: 1
console.log(`Deleting non-existent fig: ${mySet.delete("fig")}`); // Expected: false
// Using the built-in Set for comparison
console.log("\n--- Testing Built-in Set ---");
const builtinSet = new Set<number>();
builtinSet.add(10);
builtinSet.add(20);
builtinSet.add(10); // Duplicate, ignored
console.log(`Built-in Set has 10: ${builtinSet.has(10)}`); // Expected: true
console.log(`Built-in Set size: ${builtinSet.size}`); // Expected: 2
builtinSet.delete(20);
console.log(`Built-in Set has 20 after delete: ${builtinSet.has(20)}`); // Expected: false
*/
Real-World Use Cases of Hash Maps and Sets
Hash Maps and Sets are ubiquitous in software engineering due to their phenomenal average-case performance. You interact with them daily without even knowing it!
Hash Map Use Cases:
Caching:
- How: Store frequently accessed data (e.g., database query results, API responses) in a hash map where the key is the query/request and the value is the result.
- Why: When a request comes in, check the cache first (O(1) lookup). If found, return instantly; otherwise, compute/fetch and then store in cache. This dramatically speeds up applications by avoiding expensive re-computations or network calls.
- Examples: Browser caches, server-side data caches (Redis, Memcached often use hash table-like structures), memoization in dynamic programming.
Database Indexing:
- How: Databases often use hash tables (or B-trees, which we’ll cover later) to create indexes on columns. The key is the indexed column’s value, and the value is a pointer to the actual row(s) in the database.
- Why: When you query
SELECT * FROM Users WHERE email = '[email protected]', the database can use a hash index on theemailcolumn to find Alice’s record in O(1) time, instead of scanning millions of rows.
Counting Frequencies:
- How: To count occurrences of words in a document or elements in a list, iterate through the items. Use each item as a key in a hash map, and its count as the value.
- Why: O(1) average time to increment a count or add a new item, making it very efficient for large datasets.
- Examples: Word frequency analysis, finding the most common element in an array.
Symbol Tables in Compilers/Interpreters:
- How: Compilers use hash maps to store information about variables, functions, and classes (their types, scopes, memory locations) as they parse source code.
- Why: Fast lookups are essential to quickly resolve identifiers during compilation.
Configuration Storage:
- How: Storing application settings (e.g.,
theme: "dark",logLevel: "debug") where each setting name is a key and its value is the configuration. - Why: Simple, direct, and fast access to any configuration parameter.
- How: Storing application settings (e.g.,
Hash Set Use Cases:
Detecting Duplicates:
- How: To check if an array contains duplicates, iterate through it and add each element to a hash set. If
addreturnsfalse(meaning the element already existed), you’ve found a duplicate. - Why: O(1) average time to check for existence, making it very efficient for large lists.
- Examples: Validating unique usernames, finding duplicate files.
- How: To check if an array contains duplicates, iterate through it and add each element to a hash set. If
Membership Testing:
- How: Quickly check if an item is part of a larger collection.
- Why: O(1) average time
has()operation. - Examples: Checking if a user is in an allowed list, validating input against a set of valid options.
Tracking Visited Nodes in Graph Algorithms:
- How: In algorithms like Breadth-First Search (BFS) or Depth-First Search (DFS) (which we’ll cover later), a hash set is often used to keep track of nodes that have already been visited to prevent infinite loops and redundant processing.
- Why: O(1) average time to mark a node as visited and check if a node has been visited.
Unique Element Collection:
- How: If you need a collection of items where each item must be unique, a hash set is the perfect choice.
- Examples: Collecting all unique tags from a blog post, gathering all distinct IP addresses that connected to a server.
Common Pitfalls & Troubleshooting
While hash maps and sets are incredibly powerful, there are a few common traps to watch out for:
Poor Hash Function:
- Pitfall: If your hash function is bad (e.g., always returns
0, or returns0for 80% of inputs), all your items will end up in the same few buckets. This effectively degrades your O(1) average time complexity to O(N) because every operation then requires traversing a long list within that single bucket. - Troubleshooting: Design hash functions that distribute keys as evenly as possible across the buckets. For custom objects, ensure you’re hashing based on their unique identifiers. For production, trust the built-in
MapandSetwhich use highly optimized hash functions.
- Pitfall: If your hash function is bad (e.g., always returns
Ignoring Load Factor (for custom implementations):
- Pitfall: If you don’t implement rehashing, and your hash map keeps growing, the buckets will become increasingly full, leading to more collisions and longer chains. This will cause your average-case performance to degrade towards O(N).
- Troubleshooting: For custom hash map/set implementations, always monitor the load factor (
size / capacity) and rehash (create a larger underlying array and re-insert all items) when it exceeds a certain threshold (e.g., 0.7 or 0.75).
Mutable Keys:
- Pitfall: If you use an object as a key in a hash map, and then mutate that object after it’s been inserted, its hash value might change. If you then try to
getordeletethat key, the hash function will produce a different index, and you won’t be able to find the original entry. - Troubleshooting: Keys in hash maps should ideally be immutable. If you must use objects as keys, ensure they are not modified after insertion, or use a string representation of the object (e.g.,
JSON.stringify) as the actual key (though this has its own complexities). JavaScript’s built-inMapcan use objects as keys, but it relies on reference equality, not structural equality.
- Pitfall: If you use an object as a key in a hash map, and then mutate that object after it’s been inserted, its hash value might change. If you then try to
Using
Object{}vs.Mapfor Key-Value Pairs:- Pitfall: In JavaScript/TypeScript, plain objects
{}can behave like hash maps. However,Objectkeys are always converted to strings. This meansobj[1]andobj['1']refer to the same key. You also can’t use objects as keys directly; they’ll be converted to"[object Object]". This can lead to unexpected behavior and collisions. - Troubleshooting: For true hash map behavior where keys can be of any type (including numbers, booleans, and even objects themselves), and for better performance and API, always prefer the built-in
Mapobject in modern JavaScript/TypeScript. Similarly, for unique collections, use the built-inSet.
- Pitfall: In JavaScript/TypeScript, plain objects
Summary
Phew! We’ve covered a lot about Hash Maps and Sets. Let’s recap the key takeaways:
- Hash Maps (aka Hash Tables, Dictionaries) store data as unique key-value pairs, while Hash Sets store unique values.
- Their core principle is hashing, where a
hash functionconverts a key/value into anindexfor an underlying array ofbuckets. - Collision resolution is essential: Chaining (storing a list of items in each bucket) and Open Addressing (probing for the next empty slot) are common strategies.
- Load factor helps manage performance; if it gets too high, rehashing (creating a larger array and re-inserting all items) is performed.
- The average time complexity for insertion, deletion, and lookup in hash maps and sets is a phenomenal O(1), making them incredibly fast for these operations. The worst case is O(N).
- They are widely used in caching, database indexing, frequency counting, duplicate detection, and membership testing.
- For practical applications in TypeScript, always prefer the built-in
MapandSetobjects as they are highly optimized. - Be mindful of issues like poor hash functions, high load factors, and mutable keys, especially when considering custom implementations.
You now have a solid understanding of how these powerful data structures work and why they are indispensable in modern software engineering.
What’s Next?
In the next chapter, we’ll shift our focus to Trees, starting with the fundamentals of tree structures. Trees offer a different way to organize data, particularly useful when you need ordered data, hierarchical relationships, or more efficient range queries than hash maps can provide. Get ready to explore branching paths!
References
This page is AI-assisted and reviewed. It references official documentation and recognized resources where relevant.