This guide is designed for anyone who understands basic programming (variables, functions) but is new to concurrency. We are going to move from the “Why” to the “How,” ending with working code you can run right now.
1. Prerequisites: The Mindset Shift
Before we code, you must understand one thing: Elixir processes are NOT Operating System (OS) threads.
- OS Threads: Heavy, cost Megabytes of RAM, managed by Windows/macOS/Linux. You can usually run a few thousand.
- Elixir Processes: Tiny, cost ~2.6 Kilobytes of RAM each on a 64-bit system, managed by the Erlang VM (BEAM). With enough RAM (and a high enough VM process limit), you can run millions on a single laptop.
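To check these numbers on your own machine, here is a minimal sketch you can paste into iex. It assumes nothing beyond the standard library; the exact byte count varies with the Erlang/OTP version and CPU architecture.

```elixir
# Spawn a process that simply waits for a :stop message.
pid = spawn(fn ->
  receive do
    :stop -> :ok
  end
end)

# Process.info/2 reports the memory (in bytes) the process currently uses,
# including its stack, heap, and internal bookkeeping.
{:memory, bytes} = Process.info(pid, :memory)
IO.puts("A freshly spawned process uses about #{bytes} bytes")

# The VM caps how many processes may exist at once (configurable at startup).
process_limit = :erlang.system_info(:process_limit)
IO.puts("Process limit on this VM: #{process_limit}")

# Let the waiting process finish.
send(pid, :stop)
```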
2. The Anatomy of a Process
Every process in Elixir has three things:
- A PID (Process Identifier): A unique address (like a phone number).
- An Isolated Memory Space: It cannot see or touch the variables of another process.
- A Mailbox: A queue where messages from other processes wait to be processed.
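The three parts can be seen in a few lines of iex. This is only an illustrative sketch: the message shapes (`{:child_count, pid, value}`, `:first`, `:second`) are made up for the example, and the mailbox count assumes your shell's mailbox was empty beforehand.

```elixir
parent = self()
count = 0

# 1. PID: spawn/1 returns the address of the new process.
child = spawn(fn ->
  # 2. Isolation: the closure received its own copy of `count`.
  #    Rebinding it here never changes the parent's variable.
  count = count + 1
  send(parent, {:child_count, self(), count})
end)

child_count =
  receive do
    {:child_count, ^child, value} -> value
  after
    1_000 -> :timeout
  end

IO.puts("Child saw #{child_count}, parent still has #{count}")

# 3. Mailbox: messages queue up until the process calls receive.
send(self(), :first)
send(self(), :second)
{:message_queue_len, waiting} = Process.info(self(), :message_queue_len)
IO.puts("#{waiting} messages waiting in my mailbox")

# Drain the two messages so they do not linger in the mailbox.
receive do
  :first -> :ok
end

receive do
  :second -> :ok
end
```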
3. Preemptive Scheduling: The “Fairness” Engine
In many languages, a “heavy” loop can freeze the entire application. In Elixir, the Scheduler prevents this.
How it works:
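The BEAM scheduler gives every process a budget of “Reductions” per turn (roughly one reduction per function call). The exact budget is a VM implementation detail: 2,000 in older Erlang/OTP releases, 4,000 in recent ones.
- Step 1: Process A starts running.
- Step 2: Once Process A has used up its reduction budget, the Scheduler says “Time’s up!”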
- Step 3: Process A is paused (preempted), put at the back of the line, and Process B starts.
Result: Your app stays responsive even if one process is doing massive math or is stuck in an infinite loop.
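A rough way to watch preemption happen: occupy every scheduler with an endless loop, then check that a tiny process still gets CPU time. `BusyLoop` is a hypothetical module invented for this sketch, and the 2-second timeout is an arbitrary safety margin.

```elixir
defmodule BusyLoop do
  # Calls itself forever. Each call costs a reduction, so the scheduler
  # can still preempt the process running it.
  def spin, do: spin()
end

# Start one busy process per scheduler so every core has a hog on it.
hogs = for _ <- 1..System.schedulers_online(), do: spawn(&BusyLoop.spin/0)

# A small process that only needs a moment of CPU time.
parent = self()
spawn(fn -> send(parent, :still_responsive) end)

result =
  receive do
    :still_responsive -> :ok
  after
    2_000 -> :timeout
  end

IO.puts("Small process got its turn: #{result}")

# Clean up so the hogs stop burning CPU.
Enum.each(hogs, &Process.exit(&1, :kill))
```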
4. Hands-on Tutorial: The “Secret Agent” System
We are going to build a system where one process (the “Agent”) waits for secret codes.
Step A: The Script
Create a file named secret_agent.exs or simply paste this into your iex terminal.
Elixir
defmodule SecretAgent do
  @doc """
  This function is the 'loop'. It stays alive, waiting for messages.
  """
  def listen do
    # The 'receive' block checks the process mailbox
    receive do
      {:whisper, message} ->
        IO.puts("Agent [#{inspect(self())}] received secret: #{message}")
        listen() # Tail-call recursion: keep the agent alive!
      {:set_priority, level} ->
        # Levels: :low, :normal, :high (:max is reserved for the runtime itself)
        Process.flag(:priority, level)
        IO.puts("Agent priority changed to: #{level}")
        listen()
      :terminate ->
        IO.puts("Agent signing off. Over and out.")
        # We don't call listen() here, so the process ends.
    end
  end
end
Step B: Running the Code
Follow these commands in your terminal (iex):
Elixir
# 1. Spawn the process (This starts the agent in the background)
agent_pid = spawn(fn -> SecretAgent.listen() end)
# 2. Check if it's alive
IO.puts("Is agent alive? #{Process.alive?(agent_pid)}")
# 3. Send a message (The 'send' function is: send(destination, message))
send(agent_pid, {:whisper, "The eagle has landed."})
# 4. Change its priority
# This tells the BEAM scheduler to give this process more/less CPU time.
send(agent_pid, {:set_priority, :high})
# 5. Ask the process to stop (it exits normally because it stops looping)
send(agent_pid, :terminate)
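The agent above only listens. A common next step is letting it answer back, which means the sender includes its own PID in the message. The sketch below uses a hypothetical `EchoAgent` module and made-up message shapes (`{:ask, from, question}` and `{:answer, text}`); it is one possible pattern, not the only one.

```elixir
defmodule EchoAgent do
  def listen do
    receive do
      {:ask, from, question} ->
        # Reply to whoever asked, then keep listening.
        send(from, {:answer, String.upcase(question)})
        listen()

      :terminate ->
        :ok
    end
  end
end

pid = spawn(&EchoAgent.listen/0)

# Include self() so the agent knows where to send the reply.
send(pid, {:ask, self(), "status?"})

answer =
  receive do
    {:answer, text} -> text
  after
    1_000 -> :no_reply
  end

IO.puts("Agent answered: #{answer}")
send(pid, :terminate)
```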
5. Advanced Concept: Priority Levels
Elixir processes aren’t just equal; some are “more equal” than others.
| Priority | Description |
| --- | --- |
| :low | Interleaved with :normal processes but picked less often. Suits background work such as log cleanup. |
| :normal | The default. Most of your code lives here. |
| :high | Runs ahead of :normal and :low work, and can starve them while it is runnable. Use sparingly and keep the work short. |
| :max | Reserved for internal use by the Erlang runtime. Do not use it in application code. |
How to check priority: Inside any process, you can run Process.info(self(), :priority) to see how important the scheduler thinks you are. It returns a tuple such as {:priority, :normal}.
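A short sketch of reading and changing priority with `Process.info/2` and `Process.flag/2`. It changes the priority of the current process and then restores the previous value, so it is safe to paste into iex.

```elixir
# Read the current priority of this process.
{:priority, initial} = Process.info(self(), :priority)
IO.puts("Started at priority: #{initial}")

# Process.flag/2 sets the new value and returns the previous one.
previous = Process.flag(:priority, :low)
{:priority, current} = Process.info(self(), :priority)
IO.puts("Now running at priority: #{current}")

# Restore the original priority.
Process.flag(:priority, previous)
{:priority, restored} = Process.info(self(), :priority)
```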
6. Fault Tolerance: “Let it Crash”
In Elixir, we don’t use try/catch for everything. Instead, we link processes.
Elixir
# spawn_link connects the current process to the new one.
# If the new one crashes, the current one dies too (unless it traps exits).
spawn_link(fn -> raise "BOOM!" end)
To handle this properly in production, we use Supervisors. A Supervisor is a process that does nothing but watch other processes. If one crashes, the Supervisor catches the “exit signal” and restarts it instantly.
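To see the “exit signal” a Supervisor reacts to, a process can trap exits, which turns the signal into an ordinary message. This is a simplified sketch of the mechanism, not how you would write a real Supervisor; the `:boom` reason is arbitrary.

```elixir
# Turn exit signals from linked processes into {:EXIT, pid, reason} messages.
previous = Process.flag(:trap_exit, true)

# The linked process exits abnormally right away.
pid = spawn_link(fn -> exit(:boom) end)

reason =
  receive do
    {:EXIT, ^pid, exit_reason} -> exit_reason
  after
    1_000 -> :timeout
  end

IO.puts("Linked process exited with reason: #{inspect(reason)}")

# Restore the previous setting so the shell behaves as before.
Process.flag(:trap_exit, previous)
```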
Summary for the Beginner
- Spawn to create a worker.
- Send/Receive to communicate (no shared variables!).
- Preemption ensures one bad process doesn’t lag the whole system.
- Priority helps the scheduler decide who goes first.
- Links help us build systems that can recover from errors automatically.
Level Up
This is the “Level Up” moment for every Elixir developer. While spawn, send, and receive are the building blocks, in the real world, we use GenServer (Generic Server).
Think of a GenServer as a professional Secret Agent who has a standardized handbook, a private office, and a managed schedule.
1. What is a GenServer?
In our previous example, we had to manually write a loop (listen()) and handle every message. A GenServer is a behavior that handles all that “boilerplate” code for you. It provides:
- State Management: It remembers things (like a score in a game).
- Synchronous Calls: You can wait for a response (Call).
- Asynchronous Casts: You can send a message and move on (Cast).
2. The Project: A Global Counter
We are going to build a Counter that runs in its own process. Any other process in your app can ask it to increment or give the current total.
The Code (counter.ex)
Copy this into a file or your terminal:
Elixir
defmodule GlobalCounter do
  use GenServer # This pulls in all the magic!
  # --- Client API (Functions other processes call) ---
  def start_link(initial_value \\ 0) do
    # Starts the process and links it to the caller
    GenServer.start_link(__MODULE__, initial_value, name: :my_counter)
  end
  def increment do
    # 'Cast' is fire-and-forget (async)
    GenServer.cast(:my_counter, :inc)
  end
  def get_value do
    # 'Call' waits for a return value (sync)
    GenServer.call(:my_counter, :get)
  end
  # --- Server Callbacks (The logic running inside the process) ---
  @impl true
  def init(count) do
    # Sets the initial state of the process
    {:ok, count}
  end
  @impl true
  def handle_cast(:inc, current_count) do
    # We return the NEW state: current_count + 1
    {:noreply, current_count + 1}
  end
  @impl true
  def handle_call(:get, _from, current_count) do
    # Return format: {:reply, response_to_user, new_state}
    {:reply, current_count, current_count}
  end
end
3. Breaking Down the Magic
The “State”
In the init function, we set the state to the initial value passed to start_link (0 by default). This value is tucked away in the process's own private memory. No other process can change it except by sending a message that matches handle_cast(:inc, ...).
The “Scheduler” & Priority
Even though this is a GenServer, it is still a Process.
- If 10,000 people try to increment at the exact same time, the BEAM Scheduler ensures the Counter process doesn’t hog the CPU.
- It processes one message at a time from its mailbox, ensuring the math is always correct (no “race conditions”).
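The “one message at a time” guarantee can be checked with a quick experiment: 1,000 concurrent tasks all increment the same server, and no update is lost. `SafeTally` is a hypothetical module for this sketch; it uses a synchronous call for incrementing so each task knows its increment was applied before it finishes.

```elixir
defmodule SafeTally do
  use GenServer

  @impl true
  def init(start), do: {:ok, start}

  @impl true
  def handle_call(:inc, _from, n), do: {:reply, :ok, n + 1}

  def handle_call(:get, _from, n), do: {:reply, n, n}
end

{:ok, tally} = GenServer.start_link(SafeTally, 0)

# 1,000 tasks run concurrently, each asking the server to add one.
1..1_000
|> Enum.map(fn _ -> Task.async(fn -> GenServer.call(tally, :inc) end) end)
|> Enum.each(&Task.await/1)

# The server handled the requests one by one, so nothing was lost.
total = GenServer.call(tally, :get)
IO.puts("Total after 1,000 concurrent increments: #{total}")

GenServer.stop(tally)
```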
Preemption
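If your handle_call contained a massive loop that took 5 seconds, the BEAM would preempt (pause) this GenServer each time it used up its reduction budget (a few thousand reductions) to let your web server or database connection run. The rest of your app never feels “stuck,” although callers waiting on this particular GenServer still have to wait, and GenServer.call gives up after 5 seconds by default.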
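Preemption keeps the rest of the system responsive, but a caller of a slow GenServer can still run out of patience. The sketch below uses a hypothetical `SlowServer` and deliberately short timings (a 500 ms handler, a 100 ms call timeout) to show what the caller sees when `GenServer.call/3` times out.

```elixir
defmodule SlowServer do
  use GenServer

  @impl true
  def init(:ok), do: {:ok, nil}

  @impl true
  def handle_call(:slow, _from, state) do
    # Simulate a long-running request.
    Process.sleep(500)
    {:reply, :done, state}
  end
end

# start/2 (not start_link/2) so the caller is not linked to the server.
{:ok, pid} = GenServer.start(SlowServer, :ok)

outcome =
  try do
    # The third argument is the timeout in milliseconds (default: 5000).
    GenServer.call(pid, :slow, 100)
  catch
    :exit, {:timeout, _details} -> :caller_timed_out
  end

IO.puts("Outcome of the slow call: #{outcome}")

# The server itself is unaffected; it finishes its work and can be stopped.
GenServer.stop(pid)
```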
4. Testing it in IEx
Run these commands to see it in action:
Elixir
# 1. Start the counter
GlobalCounter.start_link(10)
# 2. Increment it a few times
GlobalCounter.increment()
GlobalCounter.increment()
# 3. Get the value
IO.puts "The count is: #{GlobalCounter.get_value()}"
# Output: The count is: 12
5. Why is this “The Best” Way?
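- Fault Tolerance: If this Counter crashes, a Supervisor can restart it automatically. The new process starts from its initial state, not the state it had when it crashed, so anything that must survive a crash has to be stored elsewhere.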
- Concurrency: You can have 1,000 different counters running at once, all independent.
- Name Registration: Because we registered it under the name :my_counter, we don’t even need to know its PID to talk to it!
Your Next Step
You now know how to create, talk to, and manage processes.
Supervisors
This is the final “boss level” of Elixir reliability. In many languages, if a thread or a service crashes, the app is broken until a human restarts it. In Elixir, we use Supervisors to build “Self-Healing” systems.
A Supervisor is a specialized process that has one job: watch other processes (its “children”) and restart them if they die.
1. The Strategy: “Let it Crash”
In Elixir, we don’t try to catch every possible error with massive try/catch blocks. Instead, we:
- Isolate the dangerous code in a process.
- Link it to a Supervisor.
- If it crashes, the Supervisor notices the “exit signal” and starts a fresh copy of that process.
2. Coding the Supervisor
We will use the GlobalCounter we built in the previous step. We want to make sure that even if someone sends a “poison” message to our counter and it crashes, the counter comes back online immediately.
The Code (application.ex)
Elixir
defmodule MySystem.Supervisor do
  use Supervisor # Tells Elixir this module is a manager
  def start_link(_opts) do
    # This starts the Supervisor process itself
    Supervisor.start_link(__MODULE__, :ok, name: :my_supervisor)
  end
  @impl true
  def init(:ok) do
    # Define the 'children' this supervisor should watch
    children = [
      # We tell it to start the GlobalCounter with an initial value of 0
      {GlobalCounter, 0}
    ]
    # Strategy: :one_for_one means if a child dies, only restart that child.
    Supervisor.init(children, strategy: :one_for_one)
  end
end
3. The “Self-Healing” Magic in Action
Let’s see what happens when we intentionally break the system.
Step 1: Start the Supervisor
Run this in iex:
Elixir
# This starts the supervisor, which automatically starts the GlobalCounter
MySystem.Supervisor.start_link([])
# Check the value (should be 0)
IO.puts "Initial Value: #{GlobalCounter.get_value()}"
Step 2: “Kill” the Counter
In Elixir, you can manually kill a process to see how the supervisor reacts.
Elixir
# Find the PID of our counter
pid = Process.whereis(:my_counter)
IO.puts "Counter PID before crash: #{inspect(pid)}"
# Brute force kill the process
Process.exit(pid, :kill)
# Give the supervisor a moment to restart the child, then check again!
Process.sleep(100)
new_pid = Process.whereis(:my_counter)
IO.puts "Counter PID after crash: #{inspect(new_pid)}"
What just happened?
- You killed the process.
- The Supervisor received a signal saying, “One of my children just died!”
- Because the strategy is :one_for_one, the Supervisor immediately looked at its blueprint and spawned a brand new GlobalCounter.
- The system is back online before a human could even notice a problem.
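The whole kill-and-restart cycle can also be run as one self-contained script. `DemoWorker` is a hypothetical stand-in for the counter so the sketch does not depend on earlier code. Note that the restarted worker comes back with its initial value, not the value it held before the crash. Expect the supervisor to log an error report when the child is killed.

```elixir
defmodule DemoWorker do
  use GenServer

  def start_link(initial), do: GenServer.start_link(__MODULE__, initial, name: __MODULE__)

  @impl true
  def init(initial), do: {:ok, initial}

  @impl true
  def handle_cast(:inc, n), do: {:noreply, n + 1}

  @impl true
  def handle_call(:get, _from, n), do: {:reply, n, n}
end

{:ok, sup} = Supervisor.start_link([{DemoWorker, 0}], strategy: :one_for_one)

GenServer.cast(DemoWorker, :inc)
before_crash = GenServer.call(DemoWorker, :get)

old_pid = Process.whereis(DemoWorker)
Process.exit(old_pid, :kill)

# Poll until the supervisor has registered a replacement process.
wait_for_restart = fn wait ->
  case Process.whereis(DemoWorker) do
    pid when is_pid(pid) and pid != old_pid ->
      pid

    _ ->
      Process.sleep(10)
      wait.(wait)
  end
end

new_pid = wait_for_restart.(wait_for_restart)
after_restart = GenServer.call(DemoWorker, :get)

IO.puts("Before crash: #{before_crash}, after restart: #{after_restart}")
IO.puts("PID changed: #{inspect(old_pid)} -> #{inspect(new_pid)}")

Supervisor.stop(sup)
```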
4. Preemptive Scheduling + Supervision
This is where the power of Elixir truly shines:
- Preemptive: If a process is stuck in a loop, the scheduler pauses it so the Supervisor still has CPU time to react when one of its children exits.
- Priority: You can set the Supervisor to :high priority so that the “Manager” stays responsive even if the “Workers” are overwhelmed, though this is rarely necessary and high-priority work can starve everything else.
- Isolation: The crash of the GlobalCounter didn’t stop the Supervisor. It didn’t stop your Web Server. It only affected that one tiny piece of memory.
Summary: The Elixir Way
- Processes: Independent workers with their own mailboxes.
- Preemptive Scheduler: A fair boss that makes everyone take turns.
- GenServer: A professional way to structure these workers.
- Supervisors: The safety net that restarts workers when they fail.
You’ve just learned the core architecture of systems like WhatsApp and Discord.
Next up is a “Stress Test”: we spawn a very large number of processes at once to see how your computer handles it. It’s a great way to see the “Lightweight” nature of Elixir in action.
Stress Test
This is the “aha!” moment for most developers. In languages like Java or Python, spawning 100,000 threads would likely crash your computer or consume gigabytes of RAM.
In Elixir, we’re going to spawn 1,000,000 processes, have them all perform a small task, and watch how the Preemptive Scheduler handles it without breaking a sweat.
1. The Setup: The “Million Process” Challenge
We want to see two things:
- How fast Elixir can create these processes.
- How the scheduler ensures the system remains responsive even under extreme load.
The Code
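Copy and paste this directly into your iex terminal. The VM caps the number of simultaneous processes (check with :erlang.system_info(:process_limit)); if your limit is not comfortably above one million, start the shell with iex --erl "+P 2000000" first, otherwise spawn will fail partway through.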
Elixir
# Define a task that simulates a tiny bit of work
work = fn ->
  # Each process will wait for 5 seconds and then just disappear
  Process.sleep(5000)
end
# Measure the time it takes to spawn 1,000,000 processes
{time_micros, _} = :timer.tc(fn ->
  for _ <- 1..1_000_000 do
    spawn(work)
  end
end)
IO.puts "Spawned 1 million processes in #{time_micros / 1_000_000} seconds."
What to Look For:
- Speed: On a modern laptop, this typically takes no more than a few seconds.
- Memory: Open your Activity Monitor (macOS) or Task Manager (Windows). You’ll see the RAM usage spike, but it won’t crash as long as you have a few gigabytes free. Each process only takes about 2.6 KB of memory, so one million of them need roughly 2.6 GB in total.
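If you would rather measure the per-process cost than read it off a system monitor, here is a scaled-down sketch that compares the VM's own process memory counter before and after spawning. The figure of 10,000 processes is an arbitrary choice that stays well under any default process limit; the result is an estimate and will vary between runs and OTP versions.

```elixir
count = 10_000

# Total bytes currently allocated for all processes in this VM.
before_spawn = :erlang.memory(:processes)

# Each process just waits for a :stop message, so it stays alive while we measure.
pids =
  for _ <- 1..count do
    spawn(fn ->
      receive do
        :stop -> :ok
      end
    end)
  end

after_spawn = :erlang.memory(:processes)
per_process = div(after_spawn - before_spawn, count)

IO.puts("Roughly #{per_process} bytes per process")

# Clean up: tell every process to finish.
Enum.each(pids, &send(&1, :stop))
```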
2. Why didn’t the computer freeze?
This is where Preemptive Scheduling and Reductions come back into play.
Even though you just dumped 1,000,000 tasks onto the CPU:
- The Scheduler divides and conquers: The BEAM typically starts one scheduler per CPU core. If you have 8 cores, you have 8 “bosses” handing out work.
- No Hogs allowed: Because of the reduction limit (a few thousand reductions per turn), even if one of those 1,000,000 processes tried to calculate Pi to a billion digits, the scheduler would pause it after a very short time slice to let the others (and your terminal) stay active.
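Both points can be inspected from iex. The sketch below prints the number of schedulers and then samples the reduction counter of a busy process twice. `Spinner` is a hypothetical module for the example, and the 50 ms pauses are arbitrary.

```elixir
IO.puts("Schedulers online: #{System.schedulers_online()}")

defmodule Spinner do
  # An endless loop; every call is counted as a reduction.
  def spin, do: spin()
end

pid = spawn(&Spinner.spin/0)

# Sample the process's reduction counter twice, a short time apart.
Process.sleep(50)
{:reductions, r1} = Process.info(pid, :reductions)
Process.sleep(50)
{:reductions, r2} = Process.info(pid, :reductions)

IO.puts("Reductions used between samples: #{r2 - r1}")

# Stop the busy process so it does not keep a core occupied.
Process.exit(pid, :kill)
```

The fact that the shell can sleep, wake up, and query the busy process at all is preemption at work.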
3. The “Priority” Stress Test
Let’s make it harder. We’ll spawn a “Heavy” process and a “High Priority” process to see who wins.
Elixir
# 1. A 'Heavy' process that loops forever (Low Priority)
heavy = spawn(fn ->
  Process.flag(:priority, :low)
  # Infinite loop: the function keeps calling itself
  loop = fn f -> f.(f) end
  loop.(loop)
end)
# 2. A 'Fast' process that needs to respond quickly (High Priority)
# Start the clock before spawning so we also measure how long the task waits to run
start_time = System.monotonic_time(:millisecond)
spawn(fn ->
  Process.flag(:priority, :high)
  # Even with the heavy process running, this should finish almost instantly
  IO.puts "High Priority task finished in #{System.monotonic_time(:millisecond) - start_time}ms"
end)
# 3. Clean up: stop the heavy process so it does not burn a CPU core forever
Process.exit(heavy, :kill)
The Result: Even though the “Heavy” process is trying to eat 100% of a CPU core, your “High Priority” task will likely finish in 0ms to 1ms. On a multi-core machine, part of the reason is simply that another scheduler is free to run it. On a busy scheduler, runnable :high processes are picked from the “Run Queue” before :normal and :low ones.
4. Real-World Comparison
| Feature | OS Threads (Java/C++) | Elixir Processes |
| --- | --- | --- |
| Creation Time | Slow (Milliseconds) | Ultra-Fast (Microseconds) |
| Memory Cost | ~1MB – 2MB each | ~2.6KB each |
| Context Switching | Expensive (Involves the OS Kernel) | Cheap (Handled by the BEAM) |
| Fault Tolerance | One thread crash can kill the process | Totally isolated; “Let it Crash” |
Summary of Your Journey
You’ve gone from the basics to the core building blocks of fault-tolerant Elixir systems:
- Isolation: You saw that processes don’t share memory.
- Communication: You used send and receive (The Actor Model).
- Efficiency: You spawned a million workers without melting your CPU.
- Reliability: You used GenServers and Supervisors to build a self-healing system.
This is the secret sauce behind Discord (handling billions of messages) and WhatsApp (handling millions of simultaneous connections).