The Lock-Free Gauntlet: Implementing a Concurrent Linked List in C++
So, you think writing a linked list is easy? Try doing it without locks.
I recently set myself a slightly masochistic challenge: implement a thread-safe linked list without using std::mutex. That means diving headfirst into the world of std::atomic, Compare-And-Swap (CAS) loops, and the absolute nightmare that is C++ memory ordering.
Here is the code, the pain, and the glory.
The Atomic Core
The heart of any lock-free structure is the Node. In a standard list, next is just a pointer. In a lock-free list, it’s a battlefield.
template <typename T>
struct Node {
T data;
std::atomic<Node<T>*> next;
};
We wrap next in std::atomic because multiple threads will be racing to update it simultaneously. If two threads try to insert a node after Head at the exact same nanosecond, one must succeed and the other must retry. This is where the CAS loop comes in.
The CAS Loop: Spin to Win
The “Compare-And-Swap” (CAS) operation is the primitive that makes lock-free programming possible. In C++, it’s atomic_compare_exchange_weak.
Here is my implementation of push:
void push(T data) {
auto node = new Node<T>{};
node->data = data;
auto next = head.load(std::memory_order_relaxed);
// Compare and Swap to atomically insert.
do {
node->next = next;
} while (!std::atomic_compare_exchange_weak_explicit(
&head,
&next,
node,
std::memory_order_release,
std::memory_order_relaxed));
}
Breaking It Down
- Load the Current Head: We grab a snapshot of
head. - Speculative Link: We point our new
node->nextto this snapshot. - The Gamble: We ask the CPU: “Is
headstill pointing to what I think it is? If yes, point it tonode. If no, tell me whatheadis now.” - Loop: If we failed (because another thread snuck in and changed
head), we update our explicitnextvariable and try again.
It’s a spinlock, but for a single instruction. Efficient? Yes. Painful to debug? Absolutely.
A Note on Memory Ordering
You’ll notice std::memory_order_release and std::memory_order_relaxed.
- Relaxed: “I don’t care about order, just give me the value.” We use this for loading the initial
headbecause if we are wrong, the CAS will fail anyway. - Release: “Make sure all my previous writes (like
node->data = data) are visible to the thread that acquires this pointer.”
If you mess this up, a consumer thread might see the new Node pointer but read garbage data from node->data. This is why lock-free programming is reserved for those who enjoy reading the C++ standard at 3 AM.
The Verdict
writing a lock-free linked list is a rite of passage. It forces you to understand the hardware underneath your code. It’s faster than a mutex in highly contended scenarios, but the cognitive overhead is massive.
Is it worth it? For a high-frequency trading system, yes. For your average CRUD app? Just use a std::mutex and go home.
Check out the full source code in my Systems Playground.