You think dictionary lookup is always O(1). That’s only half the story.
Today, you understand how dictionaries actually work.
Today’s Goal
By the end of today, you will:
- Understand how dictionaries store data
- Learn how hashing works
- Understand collisions and their impact
- Know when O(1) breaks down
The Illusion
user = {
"name": "John",
"age": 30
}
You think:
This is just key-value storage
Reality:
This is a highly optimized hash table
Dictionary Internal Model
key -> hash(key) -> index -> slot -> value
Dictionary does NOT scan all keys. It jumps directly using hash.
Lookup Process
user["name"]
Steps:
- Compute hash of key
- Map hash to index
- Check slot
- Return value
Why It’s Fast
Because:
- No iteration
- Direct addressing via hash
Average case:
O(1)
Collisions (Reality Check)
Two keys can produce same index.
hash(key1) == hash(key2)
What Happens on Collision
Python uses:
- open addressing
- probing strategy
index -> next slot -> next slot -> ...
Until empty slot or match is found.
Impact of Collisions
Worst case:
O(n)
Dictionary degrades to list-like behavior.
Hash Function
hash("name")
Produces an integer used to locate slot.
Important Rule
Dictionary keys must be:
- immutable
- hashable
Invalid Key Example
{[1,2]: "value"}
Error:
TypeError: unhashable type
Why Mutable Objects Fail
Because:
- their hash can change
- dictionary would lose track of location
Resizing Behavior
When dictionary grows:
- allocate bigger table
- rehash all keys
- redistribute entries
Hidden Cost
Resizing is expensive. But happens rarely.
Insertion Order (Python 3.7+)
Dictionaries preserve insertion order.
d = {"a":1, "b":2}
Order is maintained.
Memory Layout Insight
Dictionary stores:
- keys
- values
- hash table index
More memory than list.
Real Example
users = {}
for i in range(1_000_000):
users[i] = i
Fast insertions due to hashing.
Performance Comparison
# list lookup
999999 in list(range(1_000_000))
# dict lookup
999999 in {i:i for i in range(1_000_000)}
Dictionary is significantly faster.
Your Task
- Compare list vs dict lookup
- Create many keys to force collisions (conceptually)
- Measure performance difference
Common Mistakes
- Assuming dict is always O(1)
- Using mutable keys
- Ignoring memory overhead
Think Deeper
- What happens during collision resolution?
- Why must keys be immutable?
- When can dictionary performance degrade?
Subtle Insight (CRITICAL)
Dictionaries are fast because of hashing, but hashing is not free.
Tomorrow
Functions & Call Stack — how execution flows
Rule
- O(1) is average, not guaranteed
- Understand collisions
- Respect memory cost
See you in Day 5.