Libraries: If you want this lab, consider one of these libraries.
Core Tech

Python Debugging: Independent and AI-Assisted Problem Solving

In this Code Lab, you'll debug intentionally broken Python code using both manual analysis and AI assistance. You'll identify errors independently, use AI agents to refine your debugging approach, implement fixes, and evaluate solutions for correctness. When finished, you'll have practical skills for leveraging AI in your debugging workflow.

Get started Contact sales

Lab Info

Level

Intermediate

Last updated

Jul 18, 2026

Duration

32m

Challenge

Introduction
Welcome to the Python Debugging: Independent and AI-Assisted Problem Solving Code Lab. In this hands-on lab, you build a small debugging workflow and use it to repair a legacy module that carries three different kinds of bug: an edge case that crashes on empty input, a type issue that crashes when a value arrives as the wrong type, and a logic error that quietly returns the wrong answer without raising anything. You first analyze the failures on your own, then bring in an AI agent to refine and compare debugging approaches, apply the fixes, and evaluate the result for correctness, readability, and reliability.

About the tools and concepts
**Independent analysis** is the part of debugging you do yourself. You run the code on an input that triggers the failure, read what it does, and reason about the root cause before reaching for any help. Some bugs announce themselves with an exception, and some do not: a logic error runs to completion and simply returns the wrong value, so part of analyzing code independently is noticing when a result is wrong even though nothing crashed.
AI-assisted debugging adds a model to that loop. Instead of pasting an entire file into a chat window, you send the model only the failing functions and a short summary of what went wrong. Keeping the request focused is what keeps each call small, which keeps token usage low and the model's attention on the actual problem. You then compare the model's suggested approach against your own diagnosis rather than accepting it blindly.

The openai SDK reaches the lab model through one configured client, so every call uses the same model, endpoint, and key. The pdb debugger ships with Python and is available in the environment for stepping through code by hand. The workflow itself uses only plain Python: dictionaries to carry outcomes, an f-string to assemble the prompt, and simple comparisons to evaluate the result.

### Prerequisites
Before starting this lab, you should have:
- Fundamental Python programming and syntax
- Familiarity with common Python errors and exceptions
- Basic knowledge of debugging concepts (breakpoints, print debugging)
- Understanding of function calls and code execution flow
- Experience reading and interpreting error messages The lab environment is ready to use. Run python3 --version from inside the workspace folder at any time to confirm the runtime. The stack is Python 3.12 with the openai SDK and pytest 8.x installed, and pdb available from the standard library. You validate each task by running its check with python3.
Background

The Scenario

You are a Python developer at CarvedRock Solutions, responsible for maintaining legacy scripts that still carry bugs. Your team pairs traditional debugging with AI-assisted problem solving. Your task is to analyze the broken code independently to identify the errors and their root causes, work with an AI agent to compare debugging strategies and refine your approach while keeping each request small to control token usage, implement the fixes with proper handling, and validate the result for accuracy and code quality.

The legacy module ships with three intentional bugs of different kinds, so you practice spotting more than one failure mode: a crash on an empty list, a crash on a value that arrives as a string, and a miscount that never raises at all.

The Application Structure

Key files in the lab environment
- `workspace/src/config.py`: shared constants, the prompt size limit and the error category table, read from one place - `workspace/src/llm_client.py`: the single shared client the lab AI is reached through - `workspace/src/ai_debugger.py`: the call helper, the prompt builder, the fix request, and the approach comparison that talk to the model - `workspace/src/analysis.py`: the harness that reproduces a failure, the classifier that names a raised error, and the diagnoser that also catches silent logic errors - `workspace/src/legacy_orders.py`: the legacy CarvedRock module with three bugs to repair - `workspace/src/evaluator.py`: the check that scores the repaired functions on correctness, readability, and reliability - `workspace/src/logger.py`: the shared stage logger - `workspace/data/cases.py`: the sample order data, the failing input, and the original buggy snapshot - `workspace/run_debugger.py`: the end-to-end runner that drives the whole workflow Complete the tasks in order. Each task builds on the previous one.
Run the full workflow from the **workspace** directory at any point with:
```
python3 run_debugger.py
```
The runner makes a call to the AI, so before you run it, export your lab key into the shell:
```
export LAB_API_KEY=<the key shown at the top of the lab pane>
```
info> If you get stuck, you can refer to the provided solution code for each task, available in the solutions folder.
Challenge

Setting Up the AI Client and the Call Helper

Pointing One Client at the Lab Model

Every AI call in this lab should travel through the same place: one client, one model, one endpoint. Centralizing that in llm_client.py means you can see exactly how the model is reached, and every later step that needs the model reuses the same configured client rather than building its own. The model name and the endpoint live here as named values, so changing either is a single edit. The key itself is read from the LAB_API_KEY environment variable, which is why your first step is to put that key into the shell.

Reaching the Model Through One Helper

A debugging workflow should not repeat the request plumbing every time it wants an answer. One helper, call_llm, takes a prompt, sends it as a single user message through the shared client, and returns just the model's text. The Chat Completions interface expects a list of messages, each a dictionary with a role and content, and it returns an object whose choices carry the model's replies. Because every later step calls this one function, the message shape and the response handling are defined once and reused everywhere.
Challenge

Analyzing the Broken Code Independently

Reproducing a Failure Without Crashing the Workflow

Before you fix anything, you need to see each failure for yourself. Running a broken function directly would stop the program at the first exception, so instead you wrap the call in a small harness that runs the function, catches whatever it raises, and reports the outcome as plain data. That outcome is a dictionary recording whether the call raised, the name of the error if it did, and the value it returned if it did not. Capturing the result on a clean run matters here, because one of the bugs in this module does not raise at all, and the only way to catch it is to compare the value it returned against what you expected.

Naming the Root Cause, Even When Nothing Raises

An exception name on its own is a clue, not a diagnosis. Turning ZeroDivisionError into edge case: empty or zero-length input is what lets you talk about a bug in terms of cause rather than symptom, and the category table in config.py holds those readable labels. But some bugs never raise. A logic error runs cleanly and returns the wrong value, so the diagnoser needs a second path: when nothing was raised yet the result does not match what you expected, the cause is a logic error. Handling both paths is what makes your independent analysis cover all three bug types in this module.
Challenge

Refining the Fix with the AI Agent and Comparing Approaches

Sending Only What the Model Needs

A model cannot help faster by reading more of your codebase. The opposite is true: the more you send, the more token budget you spend and the more the signal gets diluted. The prompt builder keeps the request tight by trimming the source to the failing functions and pairing it with a one-line summary of what went wrong. That focused prompt is both cheaper and clearer than pasting a whole module, and the MAX_SOURCE_CHARS limit in config.py is what bounds how much source goes out.

Requesting a Fix and Comparing It to Your Own Analysis

Asking the model for a fix is only half of AI-assisted debugging. The other half is judgment: you compare the model's approach against the diagnosis you reached on your own, rather than pasting its answer in unread. First the fix request runs the focused prompt through the call helper and strips the Markdown fences the model usually wraps code in. Then the comparison records your independent diagnosis next to a simple check of whether the model's fix actually addresses the edge case you identified, giving you a side-by-side view of where the two approaches agree.
Challenge

Implementing the Fixes and Evaluating the Solution

Repairing Three Different Bugs

Your analysis and the model agree on the causes, so now you apply the fixes. Each of the three functions fails in a different way, and each fix is a single, deliberate edit. The empty-list crash is an edge case, fixed with a guard that returns 0.0 when there are no orders. The string-percent crash is a type issue, fixed by converting the value to a number before the arithmetic. The miscount is a logic error, fixed by correcting the comparison so a value equal to the threshold is included. Fixing all three is what turns your diagnosis into working, reliable code.

Evaluating for Correctness, Readability, and Reliability

A fix is only trustworthy when it holds up on more than one measure. Correctness means the functions return the right answers across every sample case, including the inputs that first exposed the bugs. Reliability means the input that used to crash now runs without raising. Readability means the edge case is handled in a clear, explicit way rather than with an obscure trick, which you check by confirming the guard is present in the source. The evaluator computes all three and folds them into one report, so the runner can print an honest summary instead of a single pass or fail.
Challenge

Run the Full Workflow
Now that every task is complete, run the end-to-end workflow to watch the broken module get diagnosed, fixed, and scored.
1. Confirm the runtime is available:
  
  python3 --version
2. If you opened a new terminal since Task 1, export your lab key again so the runner can reach the model:
  
  export LAB_API_KEY=<the key shown at the top of the lab pane>
3. Run the workflow from the workspace directory:
  
  python3 run_debugger.py
4. Watch the log stream print an [INIT] line as the module loads, an [ANALYZE] line listing your independent diagnosis of all three bugs, an [AI] line as the fix is requested, a [COMPARE] line showing whether the model addressed the edge case, a [VERIFY] line showing the empty-order input no longer raises, an [EVALUATE] line reporting correctness, reliability, and readability, and a [DONE] line reporting how long the run took.
5. Confirm the [VERIFY] line reports raises: False with a result of 0.0, and the [EVALUATE] line reports all_clear: True.
Expected Result: Every layer you built is visible in one run: your analysis names all three root causes, the AI agent returns a focused fix you compared against your own, the repaired functions handle the inputs that first crashed, and the evaluator confirms correctness, reliability, and readability before the workflow reports a clean result.
Challenge

Conclusion
Congratulations on completing the Python Debugging: Independent and AI-Assisted Problem Solving lab. You built a debugging workflow end to end: a shared client and call helper for the lab AI, a harness that reproduces failures and diagnoses both crashes and silent logic errors on your own, a focused prompt and fix request that keep each model call small, a comparison that weighs the model's approach against yours, and an evaluator that scores the repair on three dimensions. These are the habits that make AI a useful partner in debugging rather than a replacement for understanding the bug.

What You Have Accomplished
1. Configured the LLM API client: exported the key and pointed one shared client at the lab model.
2. Built the shared call helper: sent a prompt as a single user message and returned just the model's text.
3. Reproduced the failures safely: ran functions in a harness that reports the error or the value as plain data.
4. Classified and diagnosed root causes: named both raised errors and silent logic errors.
5. Built a token-efficient debug prompt: sent only the failing functions and a short summary.
6. Requested the AI fix and compared approaches: cleaned the reply and weighed it against your own diagnosis.
7. Implemented three fixes: repaired an edge case, a type issue, and a logic error.
8. Evaluated the repair: scored correctness, readability, and reliability into one verdict.
Key Takeaways
- Some bugs raise and some do not, so independent analysis means checking returned values, not only catching exceptions.
- An exception name is a symptom, and mapping it to a root-cause category is what turns a clue into a diagnosis.
- Sending the model only the failing functions and a short summary keeps each call small, which controls token usage and sharpens the answer.
- AI-assisted debugging is strongest when you compare the model's approach against your own rather than accepting it unread.
- A fix earns trust on more than one measure: correctness, reliability, and readability together, not correctness alone.
Experiment Before You Go

You still have time in the lab environment. Try these explorations:
- Add a new entry to ERROR_CATEGORIES in config.py, then call classify with that name and watch the new category come back.
- Lower MAX_SOURCE_CHARS in config.py and print the prompt from build_prompt to see how the focused source shrinks.
- Add a new case to EVAL_CASES in data/cases.py and rerun the workflow to watch the evaluator count it.
- Open legacy_orders.py under pdb with python3 -m pdb -c continue run_debugger.py and step through the repaired functions.
- Revert one of the three fixes, rerun the workflow, and watch the evaluator report which dimension the regression breaks.

About the author

Angel Sayani

Angel Sayani is a Certified Artificial Intelligence Expert®, CEO of IntellChromatics, author of two books in cybersecurity and IT certifications, world record holder, and a well-known cybersecurity and digital forensics expert.

Real skill practice before real-world application

Hands-on Labs are real environments created by industry experts to help you learn. These environments help you gain knowledge and experience, practice without compromising your system, test without risk, destroy without fear, and let you learn from your mistakes. Hands-on Labs: practice your skills before delivering in the real world.

Learn by doing

Engage hands-on with the tools and technologies you’re learning. You pick the skill, we provide the credentials and environment.

Follow your guide

All labs have detailed instructions and objectives, guiding you through the learning process and ensuring you understand every step.

Turn time into mastery

On average, you retain 75% more of your learning if you take time to practice. Hands-on labs set you up for success to make those skills stick.

Python Debugging: Independent and AI-Assisted Problem Solving

Lab Info

Table of Contents

Introduction

Background

The Scenario

The Application Structure

Setting Up the AI Client and the Call Helper

Pointing One Client at the Lab Model

Reaching the Model Through One Helper

Analyzing the Broken Code Independently

Reproducing a Failure Without Crashing the Workflow

Naming the Root Cause, Even When Nothing Raises

Refining the Fix with the AI Agent and Comparing Approaches

Sending Only What the Model Needs

Requesting a Fix and Comparing It to Your Own Analysis

Implementing the Fixes and Evaluating the Solution

Repairing Three Different Bugs

Evaluating for Correctness, Readability, and Reliability

Run the Full Workflow

Conclusion

What You Have Accomplished

Key Takeaways

Experiment Before You Go

About the author

Real skill practice before real-world application

Learn by doing

Follow your guide

Turn time into mastery

Get started with Pluralsight