You think a list is just a collection of items. It is not.

It is a memory-managed structure with real trade-offs.

Today, you stop using data structures blindly.


Today’s Goal

By the end of today, you will:

  • Understand how lists, tuples, and sets work internally
  • Learn time and space complexity
  • Choose the right structure based on behavior and cost

The Illusion

data = [1, 2, 3]

You think this stores values.

Reality:

[ ptr | ptr | ptr ]  -> references to objects

A list stores references, not values.


Lists — Dynamic Arrays

Internal Model

list object
  -> pointer to array
array
  -> [ref, ref, ref]
  • contiguous array of references
  • over-allocated for growth

Resizing Behavior

When capacity is full:

  1. allocate bigger array
  2. copy references
  3. append new item
lst = []
for i in range(1_000_000):
    lst.append(i)

Most appends are cheap, some are expensive.


Complexity (List)

OperationComplexity
AccessO(1)
AppendO(1)*
InsertO(n)
SearchO(n)

*amortized


Hidden Cost

lst.insert(0, 10)

Shifts all elements → O(n)


Tuple — Immutable Array

t = (1, 2, 3)
  • fixed size
  • immutable
  • no resizing
  • more memory efficient than list

Why Tuple Is Faster

  • no mutation checks
  • predictable layout
  • better cache locality

Set — Hash Table

s = {1, 2, 3}

Internal Model

hash(key) -> index -> bucket -> value

Lookup Process

2 in s

Steps:

  1. compute hash
  2. map to index
  3. compare value

No full scan.


Complexity (Set)

OperationComplexity
LookupO(1) avg
InsertO(1) avg
DeleteO(1) avg

Collisions

If many keys map to same index:

performance -> degrades toward O(n)

Compare Structures

FeatureListTupleSet
OrderYesYesNo
MutableYesNoYes
LookupO(n)O(n)O(1)
MemoryHighLowHigh

Real Example

lst = [1,2,2,3,3]
unique = list(set(lst))

Memory Behavior

  • list → dynamic + extra capacity
  • tuple → compact
  • set → hash table overhead

Why This Matters

Wrong structure leads to:

  • slow performance
  • high memory usage
  • poor scalability

Performance Example

data = list(range(1_000_000))
lookup = set(data)

999999 in data
999999 in lookup

Huge difference at scale.


Your Task

  • compare list vs set lookup time
  • measure memory using sys.getsizeof
  • test append vs insert

Common Mistakes

  • using list everywhere
  • ignoring insertion cost
  • not using set for lookup

Think Deeper

  1. why is set faster?
  2. why is tuple memory efficient?
  3. when does resizing hurt performance?

Final Insight

Data structures are trade-offs between speed, memory, and behavior.


Tomorrow

Dictionaries — hashing, collisions, real O(1)


Rule

  • choose based on cost
  • not habit

See you in Day 4.