1️⃣ What is process linking (at the lowest level)?

Linking = shared fate

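Two processes become linked either atomically at spawn time:

spawn_link(fn -> work() end)

or later, when a running process links itself to an existing pid:

Process.link(pid)

Either way, they form a bidirectional failure relationship.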

Rule (very important)

If one linked process exits abnormally (any reason other than :normal), the other is terminated with the same reason
(unless it is trapping exits)

Case A: Parent and child are linked (the default with spawn_link and OTP start_link functions; plain spawn does not link)

parent ─── linked ─── child

Scenario 1: Child crashes

child crashes → exit signal sent → parent crashes

Scenario 2: Parent crashes

parent crashes → exit signal sent → child crashes

📌 Linking is symmetric

This is intentional:

No orphan processes
Fail fast
Fail loudly
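A minimal sketch of the shared-fate rule, using only the standard library. The module name LinkDemo and the :boom exit reason are made up for illustration. An unlinked but monitored "parent" spawns a linked child that exits abnormally, and the monitor reports why the parent went down.

```elixir
defmodule LinkDemo do
  # Returns the exit reason of a parent whose linked child exited with :boom.
  def run do
    {parent, ref} =
      spawn_monitor(fn ->
        spawn_link(fn -> exit(:boom) end)

        # Block forever: only the exit signal from the child can end this process.
        receive do
          :never -> :ok
        end
      end)

    receive do
      {:DOWN, ^ref, :process, ^parent, reason} -> reason
    after
      1_000 -> :timeout
    end
  end
end
```

LinkDemo.run() returns :boom: the child's exit reason travelled over the link and took the parent down with it.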

2️⃣ What is exit trapping?

Normally, an exit signal with any reason other than :normal kills the receiving process.

But a process can say:

Process.flag(:trap_exit, true)

Now exit signals become messages instead of fatal errors (the one exception is Process.exit(pid, :kill), which cannot be trapped).

With trapping enabled

Child crashes → parent receives a message

{:EXIT, child_pid, reason}

instead of dying.
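A small sketch of the same crash with trapping turned on. TrapDemo and the {:trapped, reason} message are hypothetical names used only for this example. The trapping process survives the child's exit and forwards the reason it received.

```elixir
defmodule TrapDemo do
  # Returns the reason carried by the {:EXIT, pid, reason} message.
  def run do
    caller = self()

    spawn(fn ->
      Process.flag(:trap_exit, true)
      child = spawn_link(fn -> exit(:boom) end)

      # The exit signal arrives as an ordinary message instead of killing us.
      receive do
        {:EXIT, ^child, reason} -> send(caller, {:trapped, reason})
      end
    end)

    receive do
      {:trapped, reason} -> reason
    after
      1_000 -> :timeout
    end
  end
end
```

TrapDemo.run() returns :boom, and the trapping process was still alive to send it.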

Important distinctions

Scenario	Without trap	With trap
Linked process crashes	You crash	You get {:EXIT, pid, reason}
You crash	Linked process crashes (unless it traps exits)	Linked process crashes (unless it traps exits)
Linked process exits normally (`:normal`)	Ignored	You get {:EXIT, pid, :normal}

📌 Trapping is asymmetric — only the trapping process is protected.
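The :normal case is easy to get wrong, so here is a hedged sketch that checks it directly. NormalExitDemo and its result tuples are invented for this example. A linked child returns normally; a process that is not trapping sees nothing, while a trapping process receives an :EXIT message with reason :normal.

```elixir
defmodule NormalExitDemo do
  # trap? selects whether the observing process traps exits.
  def run(trap?) do
    caller = self()

    spawn(fn ->
      Process.flag(:trap_exit, trap?)
      child = spawn_link(fn -> :ok end)

      result =
        receive do
          {:EXIT, ^child, reason} -> {:exit_message, reason}
        after
          200 -> :no_message
        end

      send(caller, {:result, result})
    end)

    receive do
      {:result, result} -> result
    after
      1_000 -> :timeout
    end
  end
end
```

NormalExitDemo.run(false) returns :no_message and NormalExitDemo.run(true) returns {:exit_message, :normal}.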

Why trapping exists

To allow:

Supervisors
Fault monitors
Restart logic
Controlled recovery

3️⃣ Why you should almost never manually trap exits

Because OTP already solved this with Supervisors.

Manual trapping is:

Easy to get wrong
Hard to reason about
Rarely needed in application code

4️⃣ How Supervisors actually work (important)

A supervisor is just a process that:

Traps exits
Links to children
Restarts them based on a strategy

That’s it.

Supervisor crash behavior

Supervisor does NOT die when child crashes

Because it has:

Process.flag(:trap_exit, true)

So instead of dying, it receives:

{:EXIT, child_pid, reason}

and decides what to do.
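To make the mechanism concrete, here is a toy restart loop built from nothing but trap_exit and spawn_link. It is a sketch of the idea, not how OTP's Supervisor is implemented; ToySupervisor, max_restarts and the :too_many_restarts reason are assumptions made for this example.

```elixir
defmodule ToySupervisor do
  # Runs child_fun in a linked child and restarts it after an abnormal exit,
  # at most max_restarts times.
  def start(child_fun, max_restarts) do
    spawn(fn ->
      Process.flag(:trap_exit, true)
      loop(child_fun, spawn_link(child_fun), max_restarts)
    end)
  end

  defp loop(child_fun, child, restarts_left) do
    receive do
      # A clean finish needs no restart.
      {:EXIT, ^child, :normal} ->
        :ok

      {:EXIT, ^child, _reason} when restarts_left > 0 ->
        loop(child_fun, spawn_link(child_fun), restarts_left - 1)

      # Out of restarts: give up and exit, so that our own parent can react.
      {:EXIT, ^child, reason} ->
        exit({:too_many_restarts, reason})
    end
  end
end
```

With max_restarts set to 2, a child that always crashes is started three times in total (the first start plus two restarts) before the toy supervisor itself exits.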

5️⃣ Supervisor restart strategies (key concept)

`:one_for_one` (most common)

child A crashes → restart only child A

`:one_for_all`

child A crashes → terminate all children → restart all

`:rest_for_one`

child A crashes → restart A + all started after A

📌 Supervisor never restarts itself
Its parent supervisor does.
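The three strategies can be compared with a real Supervisor. The sketch below is illustrative only: StrategyDemo, the child ids :a, :b, :c and the polling helper are assumptions. It starts three Agents in order, kills the middle one, and reports which children came back with a new pid.

```elixir
defmodule StrategyDemo do
  # Returns the ids of the children that were restarted after :b is killed.
  def restarted_after_killing_b(strategy) do
    children =
      for id <- [:a, :b, :c] do
        Supervisor.child_spec({Agent, fn -> id end}, id: id)
      end

    {:ok, sup} = Supervisor.start_link(children, strategy: strategy)
    before = pids(sup)

    Process.exit(before.b, :kill)
    later = wait_for_new_pid(sup, :b, before.b)

    Supervisor.stop(sup)
    for id <- [:a, :b, :c], before[id] != later[id], do: id
  end

  defp pids(sup) do
    for {id, pid, _type, _modules} <- Supervisor.which_children(sup), into: %{}, do: {id, pid}
  end

  # Polls until the supervisor has replaced the killed child.
  defp wait_for_new_pid(sup, id, old_pid) do
    current = pids(sup)

    if is_pid(current[id]) and current[id] != old_pid do
      current
    else
      Process.sleep(10)
      wait_for_new_pid(sup, id, old_pid)
    end
  end
end
```

StrategyDemo.restarted_after_killing_b(:one_for_one) returns [:b], :rest_for_one returns [:b, :c], and :one_for_all returns [:a, :b, :c]. The supervisor will log a crash report for the killed child, which is expected.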

6️⃣ What happens if the parent (supervisor) crashes?

Let’s say:

RootSupervisor
  └── AppSupervisor
        ├── Worker A
        └── Worker B

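AppSupervisor crashes (for example, because its children exceeded the restart intensity, by default 3 restarts within 5 seconds)

Worker A and Worker B are terminated (by AppSupervisor during shutdown, or through their links if it dies abruptly)
RootSupervisor receives the exit signal as a message
RootSupervisor restarts AppSupervisor
AppSupervisor starts fresh
AppSupervisor starts new children (A, B)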

📌 Everything below is restarted
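A rough sketch of that tree, assuming plain Agents as the workers. TreeDemo, the ids :app_sup, :a and :b, and the polling helper are all hypothetical. It kills the middle supervisor and compares pids before and after.

```elixir
defmodule TreeDemo do
  # Builds root -> :app_sup -> workers :a and :b, kills :app_sup,
  # and returns {pids_before, pids_after}.
  def run do
    worker = fn id -> %{id: id, start: {Agent, :start_link, [fn -> id end]}} end

    app_sup = %{
      id: :app_sup,
      type: :supervisor,
      start: {Supervisor, :start_link, [[worker.(:a), worker.(:b)], [strategy: :one_for_one]]}
    }

    {:ok, root} = Supervisor.start_link([app_sup], strategy: :one_for_one)
    old = snapshot(root)

    Process.exit(old.app_sup, :kill)
    new = wait_for_restart(root, old.app_sup)

    Supervisor.stop(root)
    {old, new}
  end

  defp app_sup_pid(root) do
    [{:app_sup, pid, :supervisor, _modules}] = Supervisor.which_children(root)
    pid
  end

  defp snapshot(root) do
    app_sup = app_sup_pid(root)
    children = Supervisor.which_children(app_sup)
    workers = for {id, pid, _type, _modules} <- children, into: %{}, do: {id, pid}
    Map.put(workers, :app_sup, app_sup)
  end

  # Polls the root until it reports a replacement for the killed supervisor.
  defp wait_for_restart(root, old_pid) do
    pid = app_sup_pid(root)

    if is_pid(pid) and pid != old_pid do
      snapshot(root)
    else
      Process.sleep(10)
      wait_for_restart(root, old_pid)
    end
  end
end
```

In the returned pair, :app_sup, :a and :b all map to different pids before and after: the workers went down with their supervisor and were started again by its replacement.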

7️⃣ What happens to process state on restart?

This is CRITICAL

Process state is LOST on crash

When a process crashes:

Heap is destroyed
Mailbox is destroyed
State is gone

Restart = new process, new PID.
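A short sketch that makes the loss visible, assuming a supervised Agent as the stateful process. StateLossDemo and the keys of its result map are invented for this example.

```elixir
defmodule StateLossDemo do
  # Puts 5 into a supervised Agent, kills it, and reads the restarted one.
  def run do
    {:ok, sup} = Supervisor.start_link([{Agent, fn -> 0 end}], strategy: :one_for_one)

    old_pid = child_pid(sup)
    Agent.update(old_pid, &(&1 + 5))
    before_crash = Agent.get(old_pid, & &1)

    Process.exit(old_pid, :kill)
    new_pid = wait_for_restart(sup, old_pid)
    after_restart = Agent.get(new_pid, & &1)

    Supervisor.stop(sup)
    %{before_crash: before_crash, after_restart: after_restart, same_pid: new_pid == old_pid}
  end

  defp child_pid(sup) do
    [{_id, pid, _type, _modules}] = Supervisor.which_children(sup)
    pid
  end

  # Polls until the supervisor reports a replacement process.
  defp wait_for_restart(sup, old_pid) do
    pid = child_pid(sup)

    if is_pid(pid) and pid != old_pid do
      pid
    else
      Process.sleep(10)
      wait_for_restart(sup, old_pid)
    end
  end
end
```

StateLossDemo.run() returns %{before_crash: 5, after_restart: 0, same_pid: false}: the replacement ran its start function again and knows nothing about the 5.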

So how does Elixir handle state?

Option 1: Rebuild from source of truth (most common)

def init(_opts) do
  # load_from_db/0 stands in for whatever reads your source of truth
  state = load_from_db()
  {:ok, state}
end

Examples:

Database
ETS
External service
File
Cache
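As one possible shape for Option 1, the sketch below uses a plain file as the source of truth. FileBackedCounter, its increment/1 function and the file format are assumptions for illustration; a real system would more likely read from a database.

```elixir
defmodule FileBackedCounter do
  use GenServer

  # path is a hypothetical file that outlives the process.
  def start_link(path), do: GenServer.start_link(__MODULE__, path)
  def increment(pid), do: GenServer.call(pid, :increment)

  @impl true
  def init(path) do
    # Rebuild state from the source of truth on every (re)start.
    count =
      case File.read(path) do
        {:ok, contents} -> String.to_integer(contents)
        {:error, :enoent} -> 0
      end

    {:ok, %{path: path, count: count}}
  end

  @impl true
  def handle_call(:increment, _from, state) do
    count = state.count + 1
    # Persist before replying so a later restart sees this value.
    File.write!(state.path, Integer.to_string(count))
    {:reply, count, %{state | count: count}}
  end
end
```

If one instance is incremented twice and stopped, a second instance started with the same path picks up at 2, so its next increment returns 3.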

Option 2: Externalize state

Instead of holding state in memory:

ETS tables (owned by a separate, long-lived process, since a table is deleted when its owner dies)
Mnesia
Redis
Postgres

Workers become stateless coordinators.
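A minimal sketch of externalized state with ETS, assuming the table is owned by the long-lived calling process rather than by the workers. EtsStateDemo, the :demo_counts table and the :jobs key are made-up names.

```elixir
defmodule EtsStateDemo do
  # Three short-lived workers each bump a counter and then crash.
  def run do
    # The caller owns the table, so it survives the workers.
    table = :ets.new(:demo_counts, [:set, :public])

    for _ <- 1..3 do
      {pid, ref} =
        spawn_monitor(fn ->
          :ets.update_counter(table, :jobs, 1, {:jobs, 0})
          exit(:boom)
        end)

      # Wait for this worker to die before starting the next one.
      receive do
        {:DOWN, ^ref, :process, ^pid, :boom} -> :ok
      end
    end

    result = :ets.lookup(table, :jobs)
    :ets.delete(table)
    result
  end
end
```

EtsStateDemo.run() returns [jobs: 3]: every worker crashed, yet the count they built up is intact because none of them owned it.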

Option 3: Event sourcing (advanced)

Persist every event
Replay on restart
Rebuild state deterministically
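The replay step can be sketched as a fold over the stored events. EventSourcedAccount and the :deposited and :withdrawn event shapes are hypothetical; persistence is left out, so the event list stands in for whatever log was saved.

```elixir
defmodule EventSourcedAccount do
  # Applies a single event to the current balance.
  def apply_event({:deposited, amount}, balance), do: balance + amount
  def apply_event({:withdrawn, amount}, balance), do: balance - amount

  # Rebuilds the balance from scratch by replaying events in order.
  def replay(events), do: Enum.reduce(events, 0, &apply_event/2)
end
```

EventSourcedAccount.replay([{:deposited, 100}, {:withdrawn, 30}, {:deposited, 5}]) returns 75, and it returns 75 every time, which is what makes the rebuilt state deterministic.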

Used in:

Financial systems
Workflow engines
CRMs
Distributed systems

8️⃣ What happens to children when parent restarts?

Situation	Result
Child crashes	Supervisor decides
Parent crashes	Children die
Parent restarts	Children restart
Child state	Lost
Parent state	Lost

📌 This is why state must be recoverable

9️⃣ Why this model is actually powerful

Instead of:

Locks
Try/catch everywhere
Defensive programming
Zombie threads

Elixir says:

Let it crash → restart cleanly → recover state

This is how:

WhatsApp (built on Erlang, whose VM Elixir also runs on) runs millions of processes
Telecom systems run for years
Fault isolation stays sane

🔑 Final mental model (memorize this)

Links propagate crashes
Trapping converts crashes into messages
Supervisors trap exits
Supervisors restart children
Parents don’t restart themselves
Restarted processes lose state
State must be external or reconstructable

One-sentence summary

Linked processes share fate; trapping converts exits to messages; supervisors trap exits and restart children based on strategy, but restarted processes are brand new and must rebuild any state they previously held.

Blog

How Process Linking and Supervision Really Work in Elixir
