Threads & Condition Variables
Overview
In this lab you will implement three short concurrency primitives in C
using pthread mutexes and condition variables:
- A cyclic (reusable) barrier.
- A bounded buffer (producer/consumer).
- A simulated byte-pool allocator that demonstrates the “covering conditions” idea from lecture.
Each part is a single C file. The autograder runs hidden stress tests
against your code; any deadlock fails the test by alarm timeout. You
can also run ThreadSanitizer locally with make tsan – it’s
the easiest way to find data races in your own code.
Each part is graded on two things: correctness (the bulk of the
points) and propagating errors from pthread_*_init (4 points
per part). pthread_mutex_init and pthread_cond_init can fail with
EAGAIN, ENOMEM, or EINVAL when the system is short on
resources. Each *_init function in this lab is declared to return 0
on success and non-zero on failure – check the return code of every
pthread_*_init call you make and propagate any failure to your
caller.
There is no report and no written questions. You submit a tarball; the autograder does the grading.
Setup
Download the starter tarball with the following code:
cd ~/cs224
wget https://cs224.cs.vassar.edu/labs/cv_lab.tar.gz
tar xzf cv_lab.tar.gz
cd cv_lab
make sanity # builds and runs three smoke tests; should pass
# *only* once you start filling in code
In the starter code, you only edit the three .c files.
The headers, the Makefile, and the tests are fixed; the autograder
uses the exact headers in your tarball.
cv_lab/
barrier.h barrier.c
bbuf.h bbuf.c
allocator.h allocator.c
tests/sanity_*.c
Makefile
README.md
Part 1 – Cyclic barrier (35 pts: 31 correctness, 4 error check)
What is a barrier?
A barrier is a synchronization point that holds every
participating thread until all of them have arrived. Imagine a group
hike where everyone has agreed to wait at the next trailhead before
continuing: nobody starts the next leg until the slowest hiker has
reached the meeting point. In a multithreaded program, threads
“arrive” at the barrier by calling barrier_wait, and none of them
returns from that call until all n participants have called it. The
moment the last thread arrives, every thread is released and they all
continue on together.
Barriers show up everywhere parallel work is broken into phases
that each thread contributes to. Iterative simulations (a fluid
solver advancing one timestep at a time, a cellular-automaton update
pass, a graph-coloring sweep) need every thread done with phase k
before any thread can safely read its neighbors’ results in phase
k+1. Numerical libraries like OpenMP and MPI provide barriers as a
core primitive for exactly this reason. You can think of a barrier as
the multi-thread generalization of pthread_join – “wait until
everyone’s done with this step” – except that the threads don’t exit
afterwards; they reuse themselves for the next step.
That reuse is what makes a cyclic (also called reusable) barrier important. A single-shot barrier handles one round and then its data structure is exhausted. A cyclic barrier is one you can call from inside a loop:
for (int phase = 0; phase < N_PHASES; phase++) {
do_my_share_of_phase(phase);
barrier_wait(&bar); // wait for everyone to finish this phase
}
Pthreads itself provides pthread_barrier_t (with
pthread_barrier_init, pthread_barrier_wait, and
pthread_barrier_destroy); in production code you would just use
that. We’re implementing our own from scratch in this lab to see how
mutexes and condition variables compose into a higher-level
synchronization primitive.
Your job in this part is to write a cyclic barrier_wait.
Specification
Implement barrier_init, barrier_wait, and barrier_destroy in
barrier.c. The header barrier.h defines the type and prototypes;
do not modify it.
You must use only barrier->mutex and barrier->cond. No
spinning, no sleep, no atomics, no global state.
Why barrier_t has a phase field
A first attempt at a cyclic barrier might track only the arrival
count: increment on entry, broadcast and reset to zero when the count
reaches num_threads, and have everyone else wait while the count
isn’t zero. This works for the first round but breaks under repeated
use. The trouble is that the count’s “zero” state is transient. As
soon as the trip-count is reached, one of the released threads loops
around, calls barrier_wait again, and bumps the count back off zero
for the next phase – before the previous phase’s slower waiters
have all reacquired the mutex and re-checked the predicate. Those
slower waiters then see a non-zero count, decide they were spuriously
woken, and go back to sleep. Nothing will ever signal them again.
phase exists to give each round of the barrier a stable identifier
that the next round doesn’t erase. A waiter snapshots the phase
number on entry and waits until the phase changes, not until the
count hits some particular value. The phase only ever moves forward,
so a slow waiter can always tell whether its round has ended – even
if a faster thread has already begun the next one. Use the phase
field in your wait predicate.
The autograder runs eight threads through five thousand phases with
a deadlock alarm. Between every two consecutive barrier_wait calls
it checks an invariant: every thread must be at the same phase
number. Any lost wakeup or missed synchronization will fail.
Part 2 – Bounded buffer (45 pts: 41 correctness, 4 error check)
Implement bbuf_init, bbuf_put, bbuf_get, and bbuf_destroy in
bbuf.c. bbuf.h defines a buffer of BBUF_CAPACITY (= 2) integer
slots with one mutex and two condition variables: not_empty
(consumers wait here when the buffer is empty) and not_full
(producers wait here when the buffer is full).
Lecture walks through three broken versions before arriving at a correct one. The bugs are worth keeping in mind as you write your implementation:
- Using
if (count == ...)instead ofwhile– broken because a thread returning fromcond_waitis not guaranteed that the condition still holds. The fix is to recheck the predicate in awhileloop. - Using a single condition variable with
while– still broken with multiple producers and multiple consumers. A consumer’s signal can wake a fellow consumer instead of a waiting producer; that consumer sees an empty buffer and goes back to sleep, and eventually everyone is asleep. - Using two condition variables (
not_emptyandnot_full) – correct. This is what you must implement.
Items must come out of bbuf_get in the same order they went into
bbuf_put (FIFO). No item may be lost or returned twice.
The autograder runs:
- a single-producer / single-consumer FIFO test with thousands of items, and
- a four-producer / four-consumer stress test with thousands of
items per producer. Each item is encoded
(producer_id << 16) | seq, and the test fails if any(pid, seq)pair is consumed zero or more than one times.
Both run with a deadlock alarm. The single-condition-variable bug reliably deadlocks under this stress test.
Part 3 – Covering conditions (20 pts: 16 correctness, 4 error check)
Implement alloc_init, alloc_acquire, alloc_release, and
alloc_destroy in allocator.c. This is a simulated allocator:
it tracks bytes_left in a fixed-size pool but does not allocate
real memory. alloc_acquire(a, n) blocks until at least n bytes are
free, then deducts n; alloc_release(a, n) adds n back to the
pool and wakes waiters that may now proceed.
This is the “covering conditions” scenario from lecture. Multiple
threads can be blocked at the same time with different requested
sizes, and a release that frees X bytes might satisfy some
waiters and not others. pthread_cond_signal wakes only one waiter;
if the woken waiter can’t proceed and goes back to sleep, other
waiters that could proceed are stranded. The fix is the appropriate
broadcast call. Pick the right primitive.
You must use only a->mutex and a->cond (one condition
variable, exactly as the lecture example).
The autograder runs:
- a stress test with many threads and varied request sizes against a small pool, and
- a deterministic trap: the pool is held empty, two waiters of
different sizes are parked on the cond var, then exactly enough
bytes are released to satisfy the smaller waiter only. Under
cond_signalthe wrong waiter is woken, deadlocks, and alarm fires.
Submitting
From the cv_lab/ directory:
make submission # creates submission.tar.gz with your three .c files
Upload submission.tar.gz to the Threads and Condition Variables
assignment on Gradescope. The autograder prints a per-part score.
You may submit unlimited times. Only your last submission counts.
If a part is incomplete, leave the TODO stubs in place; you’ll get 0 for that part but the autograder will still run the others.
Tips
- Run
make sanityearly and often. The sanity tests are not the autograder, but if they fail or hang, the autograder definitely will. make tsanrebuilds the sanity tests with ThreadSanitizer. Any “data race” warning here is a real bug – fix it before submitting, even if the sanity test still passes.- If you see a sanity run hang for more than a few seconds, you
have a deadlock. Ctrl-C, recheck the slide for the relevant
broken version, and look for a missing wait predicate, a missing
while, or signaling the wrong condition variable. - Pthread reference:
man pthread_mutex_init,man pthread_cond_wait,man pthread_cond_signal,man pthread_cond_broadcast.