Table of Contents
- Goroutines
- Mutexes
- WaitGroups
- Condition Variables
Goroutines
Concurrency is at the heart of modern systems programming, and Go offers some of the most intuitive and powerful primitives to support it. This article provides a detailed exploration of Go's concurrency tools, ranging from goroutines and closures to the Go runtime scheduler, accompanied by concise examples, best practices, and clarifications.
In traditional operating systems, a process is an independent program with its own memory, file descriptors, and at least one thread. A thread, in turn, is an execution context with a call stack and processor state. Threads are usually heavyweight; they require significant resources to create and manage.
By contrast, Go introduces goroutines, which are managed entirely by the Go runtime, not the OS. These are extremely lightweight and designed to scale massively.
A goroutine is an execution context that is managed by the Go runtime (as opposed to a thread that is managed by the operating system).
To launch a goroutine, the syntax is straightforward:
go func() {
fmt.Println("Running inside a goroutine")
}()
This code launches an anonymous function in a separate goroutine. The main function continues executing immediately, while the goroutine runs concurrently.
The function running as a goroutine can take parameters, but any return values are discarded; there is no direct way for the caller to receive them.
Note: Parameters passed to a goroutine are evaluated when the go statement executes, not when the goroutine eventually runs, which means they retain the expected value.
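A minimal illustration of this evaluation order (the variable `x` is just for demonstration):

x := 1
go fmt.Println(x) // x is evaluated at the go statement: prints 1
x = 2             // this later write does not affect the goroutine

Contrast this with a closure that reads `x` from the enclosing scope, which could observe the later write.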
When `go` is called, the runtime allocates a new, small stack for the goroutine (traditionally starting at 2KB; Go 1.19+ adjusts the initial size based on historical averages). This stack grows dynamically as needed, avoiding over-allocation and making goroutines extremely memory-efficient compared to OS threads, which often preallocate megabytes of stack space.
Additionally, goroutines don't have priorities. Unlike OS threads, where higher-priority threads can preempt lower ones, Go's scheduler treats all goroutines equally, though newer runtime versions may boost goroutines that have waited a long time in order to prevent starvation.
The Go scheduler multiplexes goroutines onto a small pool of operating system threads, aiming to get more work done on each thread rather than doing less work on many threads.
Let’s consider a simple example:
package main

import (
    "fmt"
    "time"
)

func f() {
    fmt.Println("Hello from goroutine")
}

func main() {
    go f()
    fmt.Println("Hello from main")
    time.Sleep(100 * time.Millisecond)
}
Possible outputs:
- `Hello from main` followed by `Hello from goroutine`
- `Hello from goroutine` followed by `Hello from main`
- Only `Hello from main` (if `f` didn't get to run within 100ms)
Why? Because the `go f()` line spawns a new concurrent task, but the main function doesn't wait for it unless explicitly instructed; here we use `Sleep` as a crude delay.
When main returns and the program exits, all running goroutines terminate abruptly, mid-function, without a chance to perform any cleanup.
Go supports closures: functions that capture variables from the surrounding scope. While powerful, closures can lead to subtle bugs in concurrent programs.
❌ Problematic Code:
for _, v := range []string{"a", "b", "c"} {
go func() {
fmt.Println(v)
}()
}
Before Go 1.22, this may unexpectedly print c three times. Why? Because each closure captures the same variable `v`, which changes during the loop; by the time a goroutine runs, `v` might already be "c". (Since Go 1.22, each loop iteration gets a fresh variable, so this particular bug no longer occurs, but the pattern is still worth understanding.)
Each anonymous function is a closure. We are running three goroutines, each with a closure that captures the `v` variable from the enclosing scope.
✅ Safer Alternatives
Option 1: Copy in loop:
for _, v := range []string{"a", "b", "c"} {
value := v
go func() {
fmt.Println(value)
}()
}
Option 2: Pass as parameter:
for _, v := range []string{"a", "b", "c"} {
go func(value string) {
fmt.Println(value)
}(v)
}
Both solutions ensure each goroutine receives its own copy of the value. Go normally allocates variables on the stack, which is fast and temporary. But if the compiler detects that a variable might be used after the declaring function returns, it allocates it on the heap, a slower but persistent memory region; otherwise, a closure still running after the function returned would be reading a stale stack location. Closures are a common reason for variables escaping to the heap, and the Go compiler uses escape analysis to detect this behavior.
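You can ask the compiler to report its escape-analysis decisions with `go build -gcflags='-m'`. A minimal sketch (the function name `escapes` is just for illustration):

// Build with: go build -gcflags='-m'
// The compiler reports something like "moved to heap: x".
func escapes() *int {
    x := 42
    return &x // x outlives this function, so it escapes to the heap
}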
A frequent concern among Go developers is how to safely stop a running goroutine. Unlike some languages that offer direct APIs to terminate threads, Go intentionally avoids providing a built-in mechanism to kill goroutines. This design encourages safer, controlled concurrency. There is no magic function that will terminate or pause a goroutine.
Instead, goroutines must be structured to exit on their own when they receive a signal, often via a channel or a `context.Context`. Here's a simple approach using a `done` channel:
done := make(chan struct{})
go func() {
for {
select {
case <-done:
fmt.Println("Goroutine exiting...")
return
default:
// Work...
}
}
}()
// ...
done <- struct{}{} // Send signal to stop...
This allows the goroutine to exit gracefully, releasing resources and avoiding hard shutdowns. Once goroutines are created, you have to be mindful of how to terminate them responsibly.
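For comparison, here is a minimal sketch of the same pattern using `context.Context`, which also propagates cancellation through call chains:

ctx, cancel := context.WithCancel(context.Background())
go func() {
    for {
        select {
        case <-ctx.Done():
            fmt.Println("Goroutine exiting...")
            return
        default:
            // Work...
        }
    }
}()
// ...
cancel() // Signal the goroutine to stop

One advantage of `context.Context` is that cancelling a parent context also cancels every context derived from it.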
In Go, if something goes wrong inside a goroutine, like an unexpected nil pointer dereference, it may trigger a panic. A panic unwinds the function call stack, looking for a `recover()` call in a deferred function. If none is found, it prints a stack trace and terminates the program; an unrecovered panic in any goroutine brings down the whole program, not just that goroutine.
This is not a recommended way to stop a goroutine. Panics are for exceptional circumstances and should be handled carefully:
go func() {
defer func() {
if r := recover(); r != nil {
fmt.Println("Recovered from panic:", r)
}
}()
panic("something went wrong")
}()
The Go runtime uses an M:N scheduler, which means it maps M goroutines to N operating system threads. This approach allows Go to scale efficiently across multiple cores without overwhelming the OS with too many threads.
The Go runtime starts several goroutines when a program starts. Exactly how many depends on the implementation and may change between versions. However, there is at least one for the garbage collector and another for the main goroutine.
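You can observe this yourself with `runtime.NumGoroutine()`; the exact number printed varies by Go version:

package main

import (
    "fmt"
    "runtime"
)

func main() {
    // Prints the number of goroutines currently alive.
    fmt.Println("goroutines:", runtime.NumGoroutine())
}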
How It Works
- When a goroutine becomes runnable, the scheduler places it in a queue.
- When an OS thread is available, the scheduler assigns it a goroutine to execute.
- If a goroutine blocks (e.g., on I/O or a channel), the OS thread is either reused or a new one is created.
Go limits the number of OS threads that can execute Go code simultaneously using the `GOMAXPROCS` setting (threads blocked in system calls don't count toward this limit):
runtime.GOMAXPROCS(4) // Allow up to 4 OS threads to execute Go code at once
This allows the developer to tune performance based on available CPU cores; since Go 1.5, the default equals the number of logical CPUs.
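Passing a value below 1 queries the current setting without changing it, which is handy for diagnostics:

package main

import (
    "fmt"
    "runtime"
)

func main() {
    // GOMAXPROCS(0) returns the current value without modifying it.
    fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0))
    fmt.Println("NumCPU:", runtime.NumCPU())
}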
Blocking I/O is handled differently depending on whether it’s synchronous or asynchronous:
- Synchronous I/O (e.g., file read): the thread running the goroutine is also blocked. The runtime may create a new thread to continue scheduling other goroutines.
- Asynchronous I/O (e.g., network operations): Go uses a component called a netpoller.
Instead of blocking an OS thread on the system call, only the goroutine is parked, and the netpoller waits for readiness events on its behalf.
The netpoller listens for I/O readiness and wakes up the relevant goroutines when the data is available. This enables Go to efficiently scale to thousands of concurrent network connections without using thousands of threads.
Mutexes
A mutex (mutual exclusion) ensures that only one goroutine can execute a piece of code at a time, preventing race conditions when accessing shared resources.
💡 Core Rule: Only one goroutine at a time should access the critical section protected by a mutex.
Below is a simple example that locks access to a shared resource. Using `defer` ensures the mutex is always unlocked, even if the function returns early or panics.
var mu sync.Mutex
func safeOperation() {
mu.Lock()
defer mu.Unlock()
// Code here is protected: only one goroutine can execute it at a time
}
This guarantees that no two goroutines can run the critical section simultaneously. Let’s walk through a more realistic example: creating a cache that safely stores values retrieved from a database. Multiple goroutines may access the cache concurrently, so we must synchronize access using a mutex.
type Cache struct {
mu sync.Mutex
store map[string]*Data
}
func (c *Cache) Get(id string) (Data, bool) {
// First, lock and check if the value is already in the cache
c.mu.Lock()
data, found := c.store[id]
c.mu.Unlock()
if found {
if data == nil {
return Data{}, false
}
return *data, true
}
// Simulate a slow database call
result, ok := fetchData(id)
// Lock again to safely write to the cache
c.mu.Lock()
defer c.mu.Unlock()
// Double-check in case another goroutine already inserted the data
if existing, exists := c.store[id]; exists {
return *existing, true
}
if !ok {
c.store[id] = nil // Mark as "not found"
return Data{}, false
}
c.store[id] = &result
return result, true
}
Explanation:
- The `mu` mutex protects the `store` map from concurrent access.
- The first lock is used to check for an existing value.
- The second lock (after fetching) is used to store the result, with `defer` ensuring safe release.
- A double-check after the slow operation avoids overwriting if another goroutine already added the value.
Avoid copying structs that contain a mutex; copying breaks mutual exclusion. A method with a value receiver copies the entire struct, including the mutex, so each goroutine ends up working with a different lock. Instead, always use pointer receivers when your struct contains a `sync.Mutex`:
func (c Cache) Get(id string) (Data, bool) { ... } // BAD: receiver by value
A mutex doesn't remember who locked it; Go mutexes are not reentrant. If a goroutine tries to lock the same mutex twice without unlocking it, the program deadlocks.
var mu sync.Mutex
func doWork() {
mu.Lock()
defer mu.Unlock()
// Safe work here
}
func nestedCall() {
mu.Lock()
defer mu.Unlock()
doWork() // DEADLOCK: doWork tries to lock again
}
Move the business logic into a separate function that doesn’t handle locking:
func doWorkUnlocked() {
// Work logic without locking
}
func doWork() {
mu.Lock()
defer mu.Unlock()
doWorkUnlocked()
}
func nestedCall() {
mu.Lock()
defer mu.Unlock()
doWorkUnlocked()
}
This pattern avoids recursive locks and makes the code easier to test and reuse.
Sometimes you can use a buffered channel as a lightweight mutex replacement:
var lockChan = make(chan struct{}, 1)
func Lock() {
lockChan <- struct{}{}
}
func Unlock() {
select {
case <-lockChan:
default:
}
}
This mimics a mutex by sending a token into the channel when locking and removing it when unlocking. It's simple and effective in specific scenarios, but note one semantic difference: this `Unlock` is silently a no-op when the lock isn't held, whereas `sync.Mutex` treats unlocking an unlocked mutex as a fatal error.
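A quick usage sketch, guarding a shared counter (the `counter` variable is hypothetical):

var counter int

func increment() {
    Lock()
    defer Unlock()
    counter++ // protected: only one goroutine holds the token at a time
}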
If you expect lots of concurrent reads and few writes, `sync.RWMutex` is more efficient. It allows multiple readers but only one writer.
type Cache struct {
mu sync.RWMutex
store map[string]*Data
}
func (c *Cache) Get(id string) (Data, bool) {
// Allow multiple readers
c.mu.RLock()
data, found := c.store[id]
c.mu.RUnlock()
if found {
if data == nil {
return Data{}, false
}
return *data, true
}
// Slow operation without lock
result, ok := fetchData(id)
// Lock for writing
c.mu.Lock()
defer c.mu.Unlock()
// Check again to avoid race
if existing, exists := c.store[id]; exists {
return *existing, true
}
if !ok {
c.store[id] = nil
return Data{}, false
}
c.store[id] = &result
return result, true
}
Explanation:
- `RLock()` acquires shared read access: multiple goroutines can read simultaneously (it still blocks while a writer holds the lock).
- `Lock()` ensures exclusive write access: no reader or other writer can proceed while it is held.
- The structure and logic are similar to the earlier mutex example, but optimized for read-heavy workloads.
WaitGroups
In Go, when working with multiple goroutines that must finish before continuing, a common approach is to use a `sync.WaitGroup`. This construct behaves like a counter that can be incremented and decremented in a thread-safe way. The core idea is simple: increase the counter before starting a goroutine and decrease it once the goroutine is done. When the counter reaches zero, any code waiting on it is unblocked.
The most common usage pattern involves looping over several tasks, incrementing the counter with `Add(1)` for each goroutine before it's launched, and then calling `Done()` at the end of the goroutine. To ensure `Done()` is always called, especially if the goroutine exits early due to an error or panic, it's typically placed inside a `defer` statement.
var wg sync.WaitGroup
for i := 0; i < 10; i++ {
wg.Add(1)
go func() {
defer wg.Done()
// Do some work here
}()
}
wg.Wait()
In this example, the main goroutine waits with `Wait()` until all ten other goroutines call `Done()`, ensuring that the work is fully completed before continuing. This pattern is especially useful when coordinating concurrent calls in a service that aggregates results.
Consider a case where two different services need to be called in parallel, and their results used together. Instead of waiting for each sequentially, you can launch both at once and collect the results after both are done:
func runInParallel() (A, B) {
var wg sync.WaitGroup
var resultA A
var resultB B
wg.Add(1)
go func() {
defer wg.Done()
resultA = callServiceA()
}()
wg.Add(1)
go func() {
defer wg.Done()
resultB = callServiceB()
}()
wg.Wait()
return resultA, resultB
}
This design saves time and allows you to continue only once both calls are complete. It's important to remember that `Add()` must be called before the goroutine starts. If you mistakenly call `Add(1)` inside the goroutine, there's a risk that `Wait()` is already executing, leading to a race where the program continues too early or waits forever.
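For contrast, here is a sketch of the racy version; `Wait()` may observe a zero counter before any goroutine has had a chance to run:

for i := 0; i < 10; i++ {
    go func() {
        wg.Add(1) // BUG: may execute after Wait() has already returned
        defer wg.Done()
        // Do some work here
    }()
}
wg.Wait() // may return immediately, before the goroutines even start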
One of the subtle bugs with `WaitGroup` comes from forgetting to call `Done()`, or calling it in a place where it might be skipped. The safest way to handle this is to always use `defer` at the top of the goroutine. This ensures that even if the logic changes over time, the decrement will still happen properly.
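A small sketch of why the deferred `Done()` matters, assuming a task that may panic (`mayPanic` is a hypothetical function):

wg.Add(1)
go func() {
    defer wg.Done() // deferred first, so it runs last: even after a recovered panic
    defer func() {
        if r := recover(); r != nil {
            fmt.Println("Recovered:", r)
        }
    }()
    mayPanic()
}()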
Sometimes, when using `WaitGroup` together with channels, you might run into a deadlock scenario. For example, if multiple goroutines send values to a channel but no goroutine is ready to receive them, all of them will block at the send operation. If at the same time you're waiting on the `WaitGroup`, the program will hang indefinitely.
Take this example:
package main

import (
    "fmt"
    "sync"
)

func main() {
    ch := make(chan int)
    var wg sync.WaitGroup
    for i := 0; i < 5; i++ {
        wg.Add(1)
        go func(val int) {
            defer wg.Done()
            ch <- val // blocks forever: nobody is receiving
        }(i)
    }
    wg.Wait() // never returns
    close(ch)
    for v := range ch {
        fmt.Println(v)
    }
}
Here, all goroutines try to send values into the channel, but since there's no receiver running, they get stuck. As a result, `Done()` is never called, and `Wait()` blocks forever. The problem is not with `WaitGroup` itself, but with the lack of coordination between sending and receiving.
To fix this, one approach is to launch a separate goroutine that reads from the channel before waiting for completion. That way, senders are unblocked, `Done()` is called, and the main goroutine can eventually proceed.
go func() {
for value := range ch {
fmt.Println(value)
}
}()
wg.Wait()
close(ch)
Now, each goroutine sends to an open channel and is guaranteed to be received, ensuring the system moves forward. Another way to address the same issue is to reverse the coordination: wait in a background goroutine and close the channel after all tasks complete.
go func() {
wg.Wait()
close(ch)
}()
for value := range ch {
fmt.Println(value)
}
This variation moves the blocking `Wait()` out of the main goroutine and closes the channel safely once all the work is done. The main goroutine then continuously reads from the channel until it is closed, avoiding deadlock entirely.
In both solutions, the essential idea is to ensure that sending and receiving are balanced and that goroutines are allowed to finish without being indefinitely blocked on an operation. When channels and `WaitGroup` are used together, understanding the order of operations is crucial. With careful structuring, `WaitGroup` becomes a robust way to coordinate concurrent tasks and maintain control over execution flow.
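In this specific example, a third option is a buffered channel sized for the number of sends, so the senders never block in the first place; this only works when the number of pending sends is bounded and known up front:

ch := make(chan int, 5) // room for all five sends; senders complete without a receiver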
Condition Variables
In Go, condition variables are not as widely used as channels for synchronizing goroutines. However, they still have practical value, especially when working in shared-memory contexts where complex coordination is required. While Go encourages using channels, condition variables offer a more traditional mechanism found in many other languages, such as Java, and are useful for scenarios like signaling changes in state between goroutines.
A classic example is the producer-consumer scenario. Producers generate data, and consumers process it. If the queue is full, producers must wait. If the queue is empty, consumers must pause until new items arrive. Although this can be solved elegantly with channels, using a condition variable illustrates a lower-level synchronization model based on explicit signaling.
The first step is building a circular queue with limited capacity. It should allow enqueuing and dequeuing, returning success or failure based on space availability.
type Queue struct {
items []int
start, end int
count int
}
func NewQueue(capacity int) *Queue {
return &Queue{
items: make([]int, capacity),
start: 0,
end: -1,
count: 0,
}
}
func (q *Queue) Enqueue(value int) bool {
if q.count == len(q.items) {
return false
}
q.end = (q.end + 1) % len(q.items)
q.items[q.end] = value
q.count++
return true
}
func (q *Queue) Dequeue() (int, bool) {
if q.count == 0 {
return 0, false
}
value := q.items[q.start]
q.start = (q.start + 1) % len(q.items)
q.count--
return value, true
}
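A quick sanity check of the queue's FIFO and capacity behavior:

q := NewQueue(2)
fmt.Println(q.Enqueue(1)) // true
fmt.Println(q.Enqueue(2)) // true
fmt.Println(q.Enqueue(3)) // false: the queue is full
v, ok := q.Dequeue()
fmt.Println(v, ok) // 1 true: values come out in insertion order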
With the queue in place, a lock and two condition variables are required. One is used to pause producers when the queue is full, and the other allows consumers to wait when the queue is empty. Both conditions share the same mutex, ensuring changes to the queue are always made within a protected section.
lock := sync.Mutex{}
condFull := sync.NewCond(&lock)
condEmpty := sync.NewCond(&lock)
queue := NewQueue(10)
A producer repeatedly generates random values and tries to place them in the queue. If the queue is full, it pauses using the condition variable. Once a consumer removes an item and signals, the producer continues.
producer := func() {
for {
value := rand.Int()
lock.Lock()
for !queue.Enqueue(value) {
condFull.Wait()
}
lock.Unlock()
condEmpty.Signal()
time.Sleep(time.Millisecond * time.Duration(rand.Intn(1000)))
}
}
The loop around `Enqueue` ensures that the condition is always re-checked after the producer is awakened, since the queue's state might have changed while it was sleeping. The `Wait()` call automatically releases the mutex while the goroutine sleeps and reacquires it on wake-up.
Consumers follow a nearly identical structure. They attempt to retrieve data from the queue. If nothing is available, they wait. Once a producer adds a new item, it signals the consumer side to resume.
consumer := func() {
for {
lock.Lock()
var value int
for {
var ok bool
if value, ok = queue.Dequeue(); !ok {
fmt.Println("Queue is empty")
condEmpty.Wait()
continue
}
break
}
lock.Unlock()
condFull.Signal()
time.Sleep(time.Millisecond * time.Duration(rand.Intn(1000)))
fmt.Println("Consumed: ", value)
}
}
This symmetry between producer and consumer ensures fairness and efficiency. Every time the queue transitions from full to not-full or from empty to not-empty, the relevant goroutines are notified. Because these transitions aren't atomic with `Wait()` resuming, the condition must always be checked again in a loop. This pattern prevents race conditions and unexpected states from occurring.
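For reference, the canonical shape of a condition-variable wait looks like this (`condition` stands for whatever predicate you're waiting on):

lock.Lock()
for !condition() {
    cond.Wait() // releases lock while waiting, reacquires it before returning
}
// ... safely use the shared state ...
lock.Unlock()

If several waiters must wake at once (for example, during shutdown), use `Broadcast()` instead of `Signal()`.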
Finally, launching multiple producers and consumers creates a concurrent environment where condition variables shine. Their coordination maintains a smooth flow of production and consumption without overfilling or starving the queue.
for i := 0; i < 10; i++ {
go producer()
}
for i := 0; i < 10; i++ {
go consumer()
}
select {}
Experimenting with different numbers of producers and consumers reveals the balancing act. More producers lead to more frequent “Queue is full” messages, while more consumers result in “Queue is empty” being printed more often. Regardless of the ratio, condition variables ensure that neither group overwhelms the system, allowing for controlled, synchronized concurrency.