Python Concurrency: Building and Evaluating AI-Assisted Solutions
Concurrency is where Python gets fast and where it gets dangerous. In this hands-on lab, you take a slow, sequential data pipeline and rebuild it as correct, high-performance concurrent code. You will overlap I/O-bound work with threads, parallelize CPU-bound work with processes and overlap waits with asyncio, then measure the real payoff of each change. You will reproduce and fix a race condition and a deadlock using locks and queues and learn to pick the right tool for CPU-bound versus I/O-bound tasks. Finally, you will put an AI coding assistant to the test by reviewing and repairing AI-generated concurrency code, building the judgement to evaluate any concurrent solution for safety, readability and performance.
Challenge
Introduction
Concurrency is where Python gets fast and where it gets dangerous. A program that does one thing at a time waits for every slow step before it starts the next, so a job that spends most of its life waiting on the network or grinding through a calculation runs far slower than the hardware allows. In this lab, you take a slow, sequential media-catalog job and rebuild it as correct, high-performance concurrent code. You overlap I/O-bound waits with threads, parallelize CPU-bound work with processes, overlap waits with asyncio, measure the real payoff of each change, fix a race condition and a deadlock and finally review and repair a concurrency snippet written by an AI assistant.
You join Globomantics as a back-end developer on the team that owns the nightly media-catalog enrichment job. Each catalog entry needs a simulated metadata lookup, which is an I/O-bound wait, and a checksum computation, which is a CPU-bound calculation. The workspace is pre-configured so you focus only on the concurrency.
- concurrency/jobs.py holds the read-only support module. It provides CATALOG, the list of catalog entries, io_fetch(entry), which simulates a network lookup, cpu_hash(entry), which runs a CPU-heavy hashing loop, and process_job(entry), which combines both. You do not edit this file.
- concurrency/sequential.py, concurrency/executors.py, concurrency/performance.py, concurrency/sync.py, concurrency/async_pipeline.py, and concurrency/ai_review.py are stubs you fill in, one concept per file.
- concurrency/ai_snippet.py holds an AI-generated snippet with an unsafe shared-state bug that you repair in the final task.
- run.py is a driver script that imports the modules and prints results so you can run the pipeline end to end. You do not edit it.
- .env holds a LAB_API_KEY= placeholder used by the AI-review task.
Each task can be validated individually by clicking on the Validate button next to it.
If you get stuck, every task has a Task Solution section you can expand to reveal the answer. This can be found under the FEEDBACK/CHECKS section of every task.
info> The solutions/ folder at the top of the workspace contains the final state of every task.
A failed task will list one or more failed checks under its Checks section, each with a specific message describing what went wrong.
The starting point of the lab is a directory named "catalog-pipeline". The current directory of the built-in Terminal will be set to the catalog-pipeline/ directory. Packages openai and python-dotenv are already installed with pip3.
info> Note: Before running the AI-review task later in the lab, copy the API key from the top of the Code Lab menu and paste it into the .env file as the value of LAB_API_KEY.
You can run the pipeline at any time with python3 run.py in the Terminal.
Click on the Next step arrow to get started.
Challenge
Establishing a sequential baseline
A sequential program does one thing at a time. Each step runs to completion before the next one starts, so when a step has to wait, everything behind it waits too. That is the simplest way to run the catalog job and it is also the slowest. Before speeding anything up you need a baseline, a single honest number that says how long the job takes today. Every concurrency change later in the lab is measured against that number, so a baseline is what turns "this feels faster" into "this is two times faster".
A baseline is captured with a wall-clock measurement. You read the clock once before the work, run the work and read the clock again. The difference is the elapsed time you actually waited.
For example, here is a sequential loop over an unrelated list of city temperature readings:

    import time

    def summarize(readings):
        results = []
        for reading in readings:
            results.append(reading["city"].upper())
        return results

    def time_summarize(readings):
        start = time.perf_counter()
        summarize(readings)
        return time.perf_counter() - start

time.perf_counter returns a high-resolution clock value in seconds. Subtracting the start value from the end value gives the elapsed seconds as a float.
You now have a working, timed baseline. run_sequential processes the whole catalog one job at a time and measure_sequential reports how many seconds that takes.
That single number hides two very different kinds of work. The metadata lookup in io_fetch is I/O-bound. It spends almost all of its time waiting on something external and the processor sits idle during the wait. The hashing loop in cpu_hash is CPU-bound. It keeps the processor busy from start to finish with no waiting at all. The two respond to concurrency in opposite ways and telling them apart is the key to picking the right tool for each.
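One way to tell the two kinds of work apart is to compare wall-clock time with CPU time for a single call. The sketch below is only an illustration: fake_io_wait and fake_cpu_work are hypothetical stand-ins, not the lab's io_fetch and cpu_hash, and the delay and round counts are arbitrary assumptions.

```python
import hashlib
import time


def fake_io_wait(entry, delay=0.05):
    """Hypothetical I/O-bound step: sleeps to mimic waiting on a network reply."""
    time.sleep(delay)  # the processor is idle for the whole wait
    return f"metadata-for-{entry}"


def fake_cpu_work(entry, rounds=20_000):
    """Hypothetical CPU-bound step: re-hashes a value many times with no waiting."""
    digest = entry.encode("utf-8")
    for _ in range(rounds):
        digest = hashlib.sha256(digest).digest()
    return digest.hex()


def wall_and_cpu_seconds(func, *args):
    """Return (wall-clock seconds, CPU seconds) spent inside one call."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    func(*args)
    return time.perf_counter() - wall_start, time.process_time() - cpu_start


io_wall, io_cpu = wall_and_cpu_seconds(fake_io_wait, "track-1")
cpu_wall, cpu_cpu = wall_and_cpu_seconds(fake_cpu_work, "track-1")
print(f"I/O-bound: wall={io_wall:.3f}s cpu={io_cpu:.3f}s")
print(f"CPU-bound: wall={cpu_wall:.3f}s cpu={cpu_cpu:.3f}s")
```

For the sleeping call, CPU time should stay far below wall-clock time. For the hashing call, the two figures should be close, because the processor is busy the whole time.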
Challenge
Choosing and using executors
Python runs your bytecode under a global interpreter lock, the GIL. Only one thread executes Python code at a time, so threads do not make pure computation any faster. They still help when the work is waiting, because a thread blocked on a network read releases the lock and lets another thread run. That is why threads suit I/O-bound work. For CPU-bound work you need real parallelism, and that means separate processes. Each process has its own interpreter and its own GIL, so several of them crunch numbers at the same time on different cores.
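To see blocked threads overlapping, here is a minimal sketch using plain threading.Thread objects. The blocked_wait worker and the chosen delay are hypothetical, and time.sleep stands in for a network read that releases the lock while it waits.

```python
import threading
import time


def blocked_wait(delay, finished, index):
    """Hypothetical worker: sleeps (releasing the GIL) and records that it finished."""
    time.sleep(delay)
    finished[index] = True  # each thread writes only to its own slot


def run_waits_on_threads(count=4, delay=0.2):
    """Start `count` waiting threads; return (elapsed seconds, finished flags)."""
    finished = [False] * count
    threads = [
        threading.Thread(target=blocked_wait, args=(delay, finished, i))
        for i in range(count)
    ]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # wait for every worker before stopping the clock
    return time.perf_counter() - start, finished


elapsed, finished = run_waits_on_threads()
print(f"4 waits of 0.2s took {elapsed:.2f}s on threads")
```

Run one after another, four waits of 0.2 seconds would need about 0.8 seconds. On threads the waits overlap, so the total should land much closer to a single wait.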
The concurrent.futures module gives both kinds of worker the same interface. You create an executor, hand it a function and an iterable of inputs with executor.map, and it returns the results in input order. A ThreadPoolExecutor runs the work on threads and a ProcessPoolExecutor runs it on processes. Swapping one for the other is often a one-line change.
For example, here is an executor squaring a list of numbers:

    from concurrent.futures import ThreadPoolExecutor

    def square(n):
        return n * n

    def square_all(numbers):
        with ThreadPoolExecutor() as executor:
            return list(executor.map(square, numbers))

The with block shuts the pool down for you once every result is back.
You now have two faster paths through the same catalog. run_threaded overlaps the metadata waits on a pool of threads, which is the right move for I/O-bound work. run_multiprocess spreads the hashing across separate processes, which is the right move for CPU-bound work that the GIL would otherwise serialize. The recommended_executor selector captures that decision in code, so the choice between threads and processes becomes a single named rule instead of a guess.
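A selector of this kind might look like the following sketch. It is not the lab's recommended_executor: the name pick_executor, the "io" and "cpu" labels and the shout task are all hypothetical, chosen only to show the idea of returning an executor class for a workload kind.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def pick_executor(workload):
    """Hypothetical selector: map a workload kind to an executor class.

    "io" -> threads (waits overlap), "cpu" -> processes (sidesteps the GIL).
    """
    if workload == "io":
        return ThreadPoolExecutor
    if workload == "cpu":
        return ProcessPoolExecutor
    raise ValueError(f"unknown workload kind: {workload!r}")


def shout(text):
    """Trivial stand-in task so the example has something to map over."""
    return text.upper()


# Only the thread pool is instantiated here; the process pool is just returned.
executor_class = pick_executor("io")
with executor_class(max_workers=2) as executor:
    shouted = list(executor.map(shout, ["intro", "outro"]))
print(shouted)
```

When you do instantiate a ProcessPoolExecutor in a script, create it under an if __name__ == "__main__": guard, because platforms that start workers by spawning re-import the main module.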
Challenge
Measuring the payoff
Concurrency is only worth its complexity when it actually saves time, so you need a number that says how much faster the new code is. That number is the speedup. You measure how long the old way takes, measure how long the new way takes and divide the first by the second. A speedup of 3.0 means the concurrent run finished in a third of the time. A speedup near 1.0 means the two runs took about the same, so all the extra machinery bought you nothing.
A bare speedup figure is not yet a decision. Threads, processes and event loops all add overhead, and on small or already-fast workloads that overhead can swallow the gain. Teams usually set a minimum speedup they will accept before they keep concurrent code in the project. If the measured speedup clears that bar the change earns its place, otherwise the simpler sequential version wins.
For example, suppose a report that rendered in twelve seconds now renders in four:

    def render_speedup(old_seconds, new_seconds):
        return old_seconds / new_seconds

    ratio = render_speedup(12.0, 4.0)
    keep_it = ratio >= 2.0

Here ratio is 3.0 and keep_it is True, because three clears the team's bar of two.
You can now put a number on a concurrency change instead of guessing at it. speedup reports how many times faster the concurrent run was and is_worth_concurrency turns that ratio into a clear yes or no against a threshold you choose. Together they keep concurrency honest, because a change that only shaves off a sliver of time no longer survives the comparison.
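As a rough sketch of how two such helpers could fit together, consider the functions below. The signatures, the default threshold of 1.5 and the timings are assumptions made for illustration; the lab's own speedup and is_worth_concurrency may be defined differently.

```python
def speedup(baseline_seconds, concurrent_seconds):
    """Return how many times faster the concurrent run was than the baseline."""
    if concurrent_seconds <= 0:
        raise ValueError("concurrent_seconds must be positive")
    return baseline_seconds / concurrent_seconds


def is_worth_concurrency(ratio, threshold=1.5):
    """Return True when the measured speedup meets or beats the threshold."""
    return ratio >= threshold


# Assumed timings for illustration only, not measurements from the lab.
ratio = speedup(8.0, 2.5)
print(f"speedup: {ratio:.1f}x, keep it: {is_worth_concurrency(ratio)}")
print(f"marginal gain kept: {is_worth_concurrency(speedup(8.0, 7.6))}")
```

Guarding against a zero or negative duration keeps a mis-measured run from producing a division error or a meaningless ratio.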
Challenge
Synchronizing shared state
When several threads share one variable, updating it is not as safe as it looks. A statement like total += 1 reads the value, adds one, and writes it back. If two threads run those three sub-steps at overlapping moments, one update can quietly overwrite the other and the increment is lost. That is a race condition and it grows more likely the more threads you add.
A threading.Lock protects against this by marking a critical section that only one thread may enter at a time. A thread acquires the lock, does its work and releases it, so the read, the add and the write happen as one indivisible step. A with statement acquires the lock on the way in and releases it on the way out.
For example, here is a vote tally guarded by a lock:

    import threading

    tally = 0
    tally_lock = threading.Lock()

    def add_vote():
        global tally
        with tally_lock:
            tally += 1

A queue.Queue solves a related problem. When worker threads each produce a result, they need a safe place to hand it off. A queue is thread-safe on its own, so each worker calls put to add an item and the collector calls get to take one, with no extra lock of your own.
Locks bring one new hazard. If one thread holds lock A and waits for lock B while another thread holds lock B and waits for lock A, neither can move and the program hangs forever. That is a deadlock. The cure is to always acquire several locks in the same order everywhere, so two threads can never each be waiting on a lock the other already holds.
Your concurrent workers can now share state without corrupting it. A Lock turns an unsafe counter += 1 into an atomic critical section, so no update is ever lost. A queue.Queue moves results from many worker threads into one collector without a lock of your own. And acquiring multiple locks in a consistent order keeps two threads from each waiting on what the other holds, which is what turns a hang into a clean finish. These are the building blocks that make shared-memory concurrency correct rather than merely fast.
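The queue hand-off and the consistent lock order can both be sketched in a few lines. Everything named here (collect_lengths, Account, transfer) is hypothetical and unrelated to the lab's files, and ordering locks by account name is just one possible fixed order.

```python
import queue
import threading


def collect_lengths(words):
    """Each worker puts one result on a shared queue; the caller collects them."""
    results = queue.Queue()

    def worker(word):
        results.put((word, len(word)))  # Queue.put is thread-safe on its own

    threads = [threading.Thread(target=worker, args=(w,)) for w in words]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    # All workers are done, so exactly len(words) items are waiting.
    return dict(results.get() for _ in words)


class Account:
    """Hypothetical shared object guarded by its own lock."""

    def __init__(self, name, balance):
        self.name = name
        self.balance = balance
        self.lock = threading.Lock()


def transfer(source, target, amount):
    """Lock both accounts in a fixed order (by name) so opposite transfers cannot deadlock."""
    if source is target:
        return  # nothing to move, and locking the same Lock twice would hang
    first, second = sorted((source, target), key=lambda account: account.name)
    with first.lock:
        with second.lock:
            source.balance -= amount
            target.balance += amount


a, b = Account("a", 100), Account("b", 100)
movers = [
    threading.Thread(target=transfer, args=(a, b, 10)),
    threading.Thread(target=transfer, args=(b, a, 25)),
]
for mover in movers:
    mover.start()
for mover in movers:
    mover.join()
print(collect_lengths(["lock", "queue"]), a.balance, b.balance)
```

Because both transfers lock account "a" before account "b", neither thread can hold one lock while waiting on the other in the opposite order.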
Challenge
Asynchronous concurrency
Python's asyncio overlaps waiting on a single thread by switching between tasks. A function defined with async def is a coroutine. Calling it just returns a coroutine object that runs only when awaited.
Inside a coroutine, await hands control back to the event loop during a slow operation, so the loop can run other tasks in the meantime. await asyncio.sleep(seconds) is the async stand-in for a blocking wait.
For example, here is a coroutine that pretends to read a sensor and a second one that reads several at once:

    import asyncio

    async def read_sensor(name):
        await asyncio.sleep(0.1)
        return f"reading-{name}"

    async def read_all(names):
        readings = await asyncio.gather(*(read_sensor(n) for n in names))
        return list(readings)

asyncio.gather schedules every coroutine and waits for them together, so their waits overlap and results come back in order. To run a top-level coroutine from ordinary code, hand it to asyncio.run.
You now have a third concurrency tool alongside threads and processes. With asyncio, a single thread overlaps many waits by switching between coroutines whenever one pauses on an await. A coroutine marks where it can yield control, and asyncio.gather runs a whole batch of them at once so their waits overlap rather than running one after another. For I/O-bound work like the metadata lookups, this gives you the overlap of threads without the cost of running more than one thread.
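Here is a small sketch of driving a batch of coroutines from ordinary code with asyncio.run and timing the result. The fetch_label and fetch_all names and the 0.2 second delay are hypothetical choices for illustration.

```python
import asyncio
import time


async def fetch_label(name, delay):
    """Hypothetical coroutine: waits without blocking, then returns a label."""
    await asyncio.sleep(delay)
    return f"label-{name}"


async def fetch_all(names, delay=0.2):
    """Run one coroutine per name together and return results in input order."""
    return list(await asyncio.gather(*(fetch_label(n, delay) for n in names)))


start = time.perf_counter()
labels = asyncio.run(fetch_all(["a", "b", "c"]))  # entry point from sync code
elapsed = time.perf_counter() - start
print(labels, f"{elapsed:.2f}s")
```

Three waits of 0.2 seconds run back to back would take about 0.6 seconds. With gather, the elapsed time should stay close to a single wait.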
Challenge
Evaluating AI-assisted concurrency solutions
An AI assistant will happily write concurrent code that looks reasonable but hides a bug: a race condition, a missing lock, or a blocking call inside an async function. These stay invisible until the code runs under load and produces a corrupted total or a hung process.
Reading for safety means asking a few questions. Is every piece of shared mutable state guarded by a lock? Does any coroutine block instead of awaiting? Can the result collection drop or duplicate work?
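The second question is easy to check with a timing comparison. The sketch below contrasts a coroutine that blocks with one that awaits. Both functions are hypothetical examples of the pattern, not code from the lab's ai_snippet.py.

```python
import asyncio
import time


async def blocking_lookup(delay):
    """Buggy pattern: time.sleep blocks the event loop, so nothing else can run."""
    time.sleep(delay)


async def awaiting_lookup(delay):
    """Repaired pattern: await hands control back to the loop during the wait."""
    await asyncio.sleep(delay)


async def timed_batch(lookup, count=3, delay=0.1):
    """Run `count` lookups with gather and return the elapsed seconds."""
    start = time.perf_counter()
    await asyncio.gather(*(lookup(delay) for _ in range(count)))
    return time.perf_counter() - start


blocking_seconds = asyncio.run(timed_batch(blocking_lookup))
awaiting_seconds = asyncio.run(timed_batch(awaiting_lookup))
print(f"blocking: {blocking_seconds:.2f}s, awaiting: {awaiting_seconds:.2f}s")
```

The blocking version looks concurrent because it uses gather, yet its waits still run one after another. A gap like this between the two timings is a quick signal that a coroutine is blocking the loop.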
info> Note: Before running the following tasks, copy the API key from the top of the Code Lab menu and paste it into the .env file as the value of LAB_API_KEY.
A model can give you a second opinion on the code. The request goes through the lab's endpoint using the LAB_API_KEY in .env. The client, base URL and gpt-4o-mini model are already wired up in concurrency/ai_review.py, so you write only the request.
For example, here is the shape of a chat request that asks the model to explain an unrelated concept and then reads the reply text:

    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "user", "content": "Explain in one sentence what a regular expression is."}
        ],
    )
    answer = response.choices[0].message.content

A model's review is a starting point, not a verdict. You still judge what it says and apply the fix yourself.
You have taken a slow, sequential pipeline and turned it into correct, fast concurrent code. Run the finished pipeline at any time with python3 run.py in the Terminal to see the catalog process through the sequential baseline, the thread pool, the process pool, and the async path, with the timing for each.
Across the lab you measured a sequential baseline, overlapped I/O-bound waits with a ThreadPoolExecutor, parallelized CPU-bound work with a ProcessPoolExecutor and overlapped many waits on one thread with asyncio. You quantified each change with a speedup figure, then made shared state safe with a Lock, moved results between workers through a queue.Queue and removed a deadlock by ordering lock acquisition. Finally you asked a model to review concurrent code and repaired the unsafe shared-state bug it left behind.
The judgement you built here carries to any concurrent solution. Match the tool to the work: threads and async for waiting, processes for computing. Guard every piece of shared mutable state. Measure before you trust a speedup. To go further, explore asyncio queues for producer-consumer pipelines, the concurrent.futures as_completed API for streaming results as they finish and Python's free-threading build for true thread parallelism without the GIL.
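As a starting point for the as_completed suggestion, here is a minimal sketch of handling results in the order they finish. The slow_echo task, the stream_results helper and the delays are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def slow_echo(label, delay):
    """Hypothetical task: wait `delay` seconds, then return the label."""
    time.sleep(delay)
    return label


def stream_results(jobs):
    """Yield each result as soon as its future finishes, not in submission order."""
    with ThreadPoolExecutor(max_workers=max(1, len(jobs))) as executor:
        futures = [executor.submit(slow_echo, label, delay) for label, delay in jobs]
        for future in as_completed(futures):
            yield future.result()


jobs = [("slow", 0.3), ("fast", 0.05), ("medium", 0.15)]
finished_order = list(stream_results(jobs))
print(finished_order)
```

Unlike executor.map, which hands results back in input order, as_completed lets you act on the quickest job first while slower ones are still running.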