You think dictionary lookup is always O(1). That’s only half the story.

Today, you understand how dictionaries actually work.


Today’s Goal

By the end of today, you will:

  • Understand how dictionaries store data
  • Learn how hashing works
  • Understand collisions and their impact
  • Know when O(1) breaks down

The Illusion

user = {
    "name": "John",
    "age": 30
}

You think:

This is just key-value storage

Reality:

This is a highly optimized hash table


Dictionary Internal Model

key -> hash(key) -> index -> slot -> value

Dictionary does NOT scan all keys. It jumps directly using hash.


Lookup Process

user["name"]

Steps:

  1. Compute hash of key
  2. Map hash to index
  3. Check slot
  4. Return value

Why It’s Fast

Because:

  • No iteration
  • Direct addressing via hash

Average case:

O(1)

Collisions (Reality Check)

Two keys can produce same index.

hash(key1) == hash(key2)

What Happens on Collision

Python uses:

  • open addressing
  • probing strategy
index -> next slot -> next slot -> ...

Until empty slot or match is found.


Impact of Collisions

Worst case:

O(n)

Dictionary degrades to list-like behavior.


Hash Function

hash("name")

Produces an integer used to locate slot.


Important Rule

Dictionary keys must be:

  • immutable
  • hashable

Invalid Key Example

{[1,2]: "value"}

Error:

TypeError: unhashable type

Why Mutable Objects Fail

Because:

  • their hash can change
  • dictionary would lose track of location

Resizing Behavior

When dictionary grows:

  1. allocate bigger table
  2. rehash all keys
  3. redistribute entries

Hidden Cost

Resizing is expensive. But happens rarely.


Insertion Order (Python 3.7+)

Dictionaries preserve insertion order.

d = {"a":1, "b":2}

Order is maintained.


Memory Layout Insight

Dictionary stores:

  • keys
  • values
  • hash table index

More memory than list.


Real Example

users = {}

for i in range(1_000_000):
    users[i] = i

Fast insertions due to hashing.


Performance Comparison

# list lookup
999999 in list(range(1_000_000))

# dict lookup
999999 in {i:i for i in range(1_000_000)}

Dictionary is significantly faster.


Your Task

  • Compare list vs dict lookup
  • Create many keys to force collisions (conceptually)
  • Measure performance difference

Common Mistakes

  • Assuming dict is always O(1)
  • Using mutable keys
  • Ignoring memory overhead

Think Deeper

  1. What happens during collision resolution?
  2. Why must keys be immutable?
  3. When can dictionary performance degrade?

Subtle Insight (CRITICAL)

Dictionaries are fast because of hashing, but hashing is not free.


Tomorrow

Functions & Call Stack — how execution flows


Rule

  • O(1) is average, not guaranteed
  • Understand collisions
  • Respect memory cost

See you in Day 5.