This article is based on content created by kojix2 (a human) alternately calling DeepWiki and ChatGPT, but kojix2 (a human) has reviewed, edited, and proofread the entire text. The article was translated from Japanese to English using Claude. If you find any mistakes, please comment. Thank you.
Crystal’s parallel processing is based on a hybrid model that primarily uses Fiber (cooperative and lightweight) and utilizes Thread (OS threads) when necessary.
ExecutionContext, which has been rapidly developed since around 2024-2025, provides a new abstraction layer for safely spreading Fibers across multiple threads.
This article organizes the latest parallel execution model in Crystal.
Building with Parallel Execution Enabled
As of November 19, 2025, you need to use the following two flags:
- `-Dpreview_mt`: Enables parallel execution of Fibers
- `-Dexecution_context`: Enables the use of ExecutionContext

```
crystal build -Dpreview_mt -Dexecution_context program.cr
```
Although Crystal's parallel execution is still marked as a preview feature, it has been available for over six years and works without issues in many cases.
Overview of Crystal’s Concurrency and Parallelism
Crystal has five major execution models:
| Model | Execution Unit | Characteristics |
|---|---|---|
| Fiber (default) | Fiber (lightweight thread) | Cooperative, automatic switching on I/O, lightweight |
| ExecutionContext::Concurrent | Fiber group | Sequential execution on 1 thread (concurrent) |
| ExecutionContext::Parallel | Fiber group | Execution on multiple threads (parallel) |
| ExecutionContext::Isolated | 1 Fiber + 1 dedicated thread | For GUI loops and blocking FFI calls |
| Thread | OS thread | For handling low-level operations |
The standard design is as follows:
- Use Fiber as the basis
- Use ExecutionContext only where parallelism is needed
Cooperative Scheduling of Fiber and I/O
Fiber is a cooperative execution model that has existed in Crystal for a long time. By default (when parallel execution is disabled), a Fiber switch occurs only when one of the following is triggered:

- I/O operations
- `sleep`
- `Channel#receive` / `Channel#send`
- `Fiber.yield`

(In each case, `Fiber.suspend` is called and the Fiber is suspended.)
The basic approach in Crystal is to put I/O-bound processing on Fibers.
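For example, a minimal sketch (the URLs are placeholders) that runs several I/O-bound requests concurrently on Fibers:

```crystal
require "http/client"

channel = Channel(String).new

# Each request runs in its own Fiber; switching happens
# automatically whenever a Fiber blocks on I/O.
%w[/ /about /contact].each do |path|
  spawn do
    response = HTTP::Client.get("https://example.com#{path}")
    channel.send("#{path}: #{response.status_code}")
  end
end

3.times { puts channel.receive }
```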
Each Fiber has its own stack memory. The stack has a virtual size of 8 MiB, but this address space is only reserved; actual physical memory usage starts at 4 KiB.
What is a “Stack” in Crystal?
When reading Crystal documentation, you’ll encounter the word “stack.” Note that this differs from the general meaning of “stack” – it refers to a “memory region that behaves like a stack,” which is actually memory allocated from the OS heap.
What is placed on the stack:

- Value types: `Struct`, `Tuple`, `StaticArray`, etc.
- Primitive types: `Int32`, `Float64`, `Bool`, `Char`, etc.
- Pointers to reference types: `Array`, `Hash`, etc. (the reference-type objects themselves are placed on the heap, but the pointers to them are placed on the stack)
Values placed on the stack are not directly targeted by GC, but they are scanned during GC execution to prevent heap objects referenced by stack variables from being mistakenly collected.
As described later, the key point is that when captured by a closure such as `spawn do ... end`, these value types are exceptionally placed on the heap and become accessible from other threads.
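A minimal sketch of this behavior: `counter` below is an `Int32` (a value type), but because the `spawn` block captures it, the compiler moves it into a heap-allocated closure structure shared by both fibers:

```crystal
counter = 0 # Int32 value type; normally stack-allocated

spawn do
  # The closure captures `counter`, so it lives on the heap
  # and this Fiber mutates the same storage the main fiber sees.
  counter += 1
end

Fiber.yield  # give the spawned fiber a chance to run
puts counter # => 1
```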
Background Knowledge: Thread / Scheduler / Fiber
In Crystal, each thread has its own Crystal::Scheduler that manages the fibers to be executed.
Main Thread Creation and Initialization
The main thread is automatically created by the OS when the program starts. Subsequently, when `Thread.current` is called, a `Thread` object for the main thread is created. The stack address of the main thread is obtained with the `stack_address` method; this is the actual thread stack allocated by the OS when the process starts.
Main Fiber Creation
When the `Thread` object is initialized, the main Fiber is created at the same time. The main Fiber uses a special constructor, `Fiber.new(stack : Void*, thread)`, to utilize the OS thread stack. Unlike normal Fibers, `makecontext` is not called; it uses the already running context.
Lazy Initialization of Scheduler
The main thread's scheduler is initialized when `Thread#scheduler` is called. The scheduler has:

- `@event_loop`: Platform-specific event loop
- `@stack_pool`: Fiber stack reuse pool
- `@runnable`: Queue of runnable fibers
- `@main`: Thread's main fiber
Default Thread Configuration
Without using ExecutionContext and preview_mt, only the main thread exists. The main thread has its own Crystal::Scheduler instance, which manages all fibers.
Stack Allocation for New Fibers
When a new Fiber is created, stack memory is obtained from `Fiber::StackPool`. When a Fiber terminates, its stack is returned to the pool through `StackPool.release` for reuse by the next Fiber. Stack allocation reserves 8 MiB of virtual address space, and only the bottom page of the stack (4 KiB) is committed to physical memory. When the stack grows and reaches a guard page, that page's guard status is removed and a new guard page is committed; this continues until the reserved pages run out.
Parallel Execution with ExecutionContext
ExecutionContext is a “virtual thread group” that executes Fibers together.
ExecutionContext::Concurrent
This is the same concurrent execution as traditional Fibers. It’s safe and easy to handle.
```crystal
ctx = Fiber::ExecutionContext::Concurrent.new("workers")
```
- Only one Fiber executes at a time within the context
- Therefore, access contention on shared variables doesn't occur (though using `Mutex`/`Atomic` is still recommended as an extra safety measure)
Suitable when parallelization is unnecessary but you want to use Fibers.
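A minimal usage sketch (the context and fiber names are arbitrary):

```crystal
ctx = Fiber::ExecutionContext::Concurrent.new("workers")
done = Channel(Nil).new

3.times do |i|
  ctx.spawn do
    # All of these fibers share one thread and run one at a time
    puts "fiber #{i} running in the workers context"
    done.send(nil)
  end
end

3.times { done.receive }
```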
ExecutionContext::Parallel
Parallel execution on multiple threads.
```crystal
ctx = Fiber::ExecutionContext::Parallel.new("workers", 8)
```
Changing parallel size during execution:
```crystal
ctx.resize(count)
```
- Each thread runs its own scheduler
  - The scheduler is an instance of the `Fiber::ExecutionContext::Parallel::Scheduler` class and is responsible for executing individual Fibers. It has a local queue of runnable Fibers and searches for and executes Fibers in its main loop (`run_loop`).
- Fibers within the context are moved to and executed on arbitrary threads
  - When a Fiber moves between threads, only the execution context (registers and stack pointer) actually moves. The Fiber's stack memory (heap from the OS perspective) does not move; this region stays fixed for the Fiber's lifetime. When a Fiber resumes on a new thread, the saved stack pointer is loaded and still points to the original stack memory region.
- Due to parallelism, `Atomic`/`Mutex` is mandatory for shared mutable state.
  - Local variables and instance variables (pointers) captured by the closure that spawns the Fiber are placed in a closure data structure allocated on the heap, and that pointer moves with the Fiber. This means that value-type local variables (like `StaticArray`) that would normally be allocated on the stack are exceptionally allocated on the heap.
Parallel is the central feature of Crystal’s goal of “safe and fast parallel execution.”
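As a small sketch of CPU-bound work spread across threads (the context name and workload are arbitrary examples):

```crystal
ctx = Fiber::ExecutionContext::Parallel.new("cpu", 4)
wg = WaitGroup.new(4)
results = Array(Int64).new(4, 0_i64)

4.times do |i|
  ctx.spawn do
    # CPU-bound loop; up to four of these run truly in parallel
    sum = 0_i64
    (1_i64..10_000_000_i64).each { |n| sum += n }
    results[i] = sum # each fiber writes only its own index
  ensure
    wg.done
  end
end

wg.wait
p results
```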
ExecutionContext::Isolated
1 Fiber = 1 dedicated thread
```crystal
gui = Fiber::ExecutionContext::Isolated.new("GUI") do
  Gtk.main
end

gui.wait
```
- A single Fiber monopolizes an OS thread
- Safe to use blocking I/O (e.g., GUI event loops, blocking FFI calls)
- Cannot spawn additional fibers within the context (they are forced onto the default context)

Suitable for the main loops of GUI applications and for FFI calls to C functions that block on I/O.
Default Fiber Without Using ExecutionContext
When ExecutionContext is not specified, Fibers execute in the default ExecutionContext (Fiber::ExecutionContext.default). The default ExecutionContext is Parallel, but since the initial parallelism is set to 1, it behaves the same as Concurrent.
```crystal
Fiber::ExecutionContext.default.size # => 1
```
Basic Patterns of Channel and WaitGroup
Crystal’s parallel processing is based on a Channel + WaitGroup pattern similar to Go.
Producer-Consumer (Parallel)
```crystal
consumers = Fiber::ExecutionContext::Parallel.new("consumers", 8)
channel = Channel(Int32).new(64)
wg = WaitGroup.new(32)
result = Atomic.new(0)

32.times do
  consumers.spawn do
    while value = channel.receive?
      result.add(value)
    end
  ensure
    wg.done
  end
end

1024.times { |i| channel.send(i) }
channel.close
wg.wait

p result.get # => 523776
```
- Communication via Channel
- Synchronization via WaitGroup
- Safe updates of shared state via Atomic
This is the basic form of parallel execution in Crystal: 32 consumer Fibers running in parallel atomically add the 1024 integer values (0–1023) received from the channel, producing their sum (523776).
Protection of Shared Variables in Concurrent
Concurrent contexts execute serially, so contention doesn't occur, but Crystal officially states that using `Atomic` / `Mutex` is still preferable.
Atomic / Mutex / SpinLock
Atomic
A variable whose value can be read and written safely even when accessed simultaneously from multiple threads; the basic synchronization primitive for preventing race conditions.
- Directly mapped to LLVM atomic instructions
- `compare_and_set`, `add`, `sub`, `get`, `set`
- Same memory orders as C/C++: Acquire / Release / Relaxed, etc.
Value types such as structs (`Struct`) and `StaticArray` cannot be used with `Atomic`.
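A brief sketch of the basic operations (the values are arbitrary):

```crystal
counter = Atomic.new(0)

counter.add(5) # atomic add; returns the old value (0)
counter.sub(2) # atomic subtract
counter.set(10)

# compare_and_set returns {old_value, success}
old, swapped = counter.compare_and_set(10, 42)
puts counter.get # => 42
puts swapped     # => true
```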
Mutex
A lock that protects code regions (critical sections) that must not be executed simultaneously by multiple Fibers, ensuring that only one Fiber executes them at a time.
- Fiber-safe
- Three modes: Checked / Reentrant / Unchecked
- Re-entry prohibited by default (safe)
```crystal
mutex = Mutex.new
shared_array = [] of Int32

10.times do |i|
  spawn do
    mutex.synchronize do
      # Only one Fiber executes at a time within this block
      shared_array << i
      sleep 0.001.seconds
    end
  end
end

sleep 1.second
puts shared_array.size # => 10
```
Example of manually locking/unlocking:
```crystal
mutex = Mutex.new
counter = 0

10.times do
  spawn do
    mutex.lock
    begin
      counter += 1
      sleep 0.001.seconds
    ensure
      mutex.unlock # Always unlock
    end
  end
end

sleep 1.second
puts counter # => 10
```
SpinLock
A lightweight lock specialized for very short-lived critical sections. It keeps consuming CPU while waiting (spinning), so it is unsuitable for locks held for long periods.
- For very short critical sections
- Only effective with preview_mt / win32
SpinLock is used in implementations such as `Crystal::Scheduler`, `Crystal::ThreadLocalValue`, `Crystal::Once`, `Mutex`, `WaitGroup`, `EventLoop::Polling`, and `Fiber::StackPool`. There are almost no scenarios where users would use SpinLock directly in their own code.
Areas to Be Careful About in the Standard Library
The following are areas in the Crystal standard library that may not guarantee complete thread safety and require caution.
What Qualifies as a Shared Variable Subject to Contention?
While we’ve used the term “shared variable,” Crystal doesn’t have user-accessible global variables, so the most typical shared variable is a class variable.
- Class variables: Always shared variables (determined by variable type)
- Instance variables and local variables: Determined by whether they are referenced from multiple Fibers or threads when spawned
If captured by spawn, local variables can also become shared variables.
ENV
- The safety of Unix’s getenv/setenv/unsetenv is environment-dependent
- Parallel modification is not recommended
This is also discussed in the Crystal Forum:
https://forum.crystal-lang.org/t/eliminate-environment-modifications/8533/29
Class Variables
In Crystal, you can use the @[ThreadLocal] annotation to make class variables thread-local.
```crystal
class Foo
  @[ThreadLocal]
  @@var = 123

  def self.var
    @@var
  end
end
```
In this case, each thread has an independent copy of `@@var`, so changing the value in one thread doesn't affect other threads.

Class variables without `@[ThreadLocal]` are shared. In this case, you need to use `Atomic` / `Mutex` for parallel updates.
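For example, a minimal sketch of a shared class variable guarded with `Atomic` (the `Counter` class is a made-up illustration):

```crystal
class Counter
  # Shared across all threads; all updates go through Atomic
  @@count = Atomic.new(0)

  def self.increment
    @@count.add(1)
  end

  def self.count
    @@count.get
  end
end
```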
IO (File, Socket, STDOUT/ERR)
Safety may not be guaranteed when simultaneously operating on the same IO from multiple threads.
Logger
Logger also uses IO internally. Writing to the same Logger from multiple threads may not be safe.
Report Any Issues You Find
Crystal is a programming language with far fewer users than languages like Python and Java, so user reports are especially valuable. It's important to keep improving the language and libraries by actively reporting bugs to the Crystal Forum and GitHub issues.
Cases Where Thread Should Be Used
Thread directly represents the OS’s native thread. It can be used when low-level control is needed.
There are almost no cases where you should use Thread directly without using ExecutionContext.
It may be an option in cases such as:
- You want to parallelize compute-intensive tasks
- An FFI call blocks and cannot suspend the Fiber (though if the FFI function is CPU-intensive, blocking is arguably the desired behavior)
- A C library requires thread-local initialization
Using Thread::Channel enables safe communication between threads.
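`Thread` is a low-level, mostly internal API, so treat the following as a rough sketch; it assumes `Thread.new` and `Thread#join`, which the runtime itself uses:

```crystal
thread = Thread.new do
  # Runs on its own OS thread, outside any fiber scheduler
  sum = 0_i64
  (1_i64..100_000_000_i64).each { |n| sum += n }
  puts sum
end

thread.join # block until the OS thread finishes
```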
FFI (C Library Calls) and Parallel Execution
Since C libraries are not necessarily thread-safe, patterns like these are considered safe:

- Wrap calls with `Mutex` (sketched below)
- Isolate in an `ExecutionContext::Isolated` context
- Dedicated `Thread` + `Thread::Channel`
- Use thread-local state
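As a sketch of the `Mutex` pattern (the `LibMyC` binding and its function are hypothetical stand-ins for a real, non-thread-safe C library):

```crystal
# Hypothetical binding to a non-thread-safe C library
lib LibMyC
  fun my_c_process(x : Int32) : Int32
end

MY_C_LOCK = Mutex.new

def safe_process(x : Int32) : Int32
  # Serialize every call into the C library
  MY_C_LOCK.synchronize do
    LibMyC.my_c_process(x)
  end
end
```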
Summary
Crystal's parallel execution is currently in the midst of major evolution. In addition to Fiber, long used for concurrent execution of I/O-bound processing, ExecutionContext::Parallel now enables full-fledged parallel processing. Using Atomic / Mutex / Channel / WaitGroup, you can build safe parallel programs much as in Go. ExecutionContext::Isolated is effective for GUI / FFI, and Thread can be used in special cases where OS threads need to be handled directly. Note that parts of the standard library remain ambiguous regarding thread safety.
Practical Guidelines for Parallel Execution in Crystal
- Leave I/O to `Fiber`
  - No special action is needed, as Crystal's I/O model is tightly integrated with `Fiber`.
- Use Parallel or Thread for CPU-bound tasks
  - `ExecutionContext::Parallel` is the first choice.
- Protect shared state with `Atomic` or `Mutex`
  - Treat gray zones like `ENV` and `Logger` conservatively.
- Test explicitly using `-Dpreview_mt` and `-Dexecution_context`
This concludes the article. Thank you for reading to the end.
