
Guided: Building Multi-Agent Systems with AutoGen
Learn the fundamentals of building multi-agent systems with Microsoft's AutoGen framework. In this hands-on lab, you'll create a two-agent customer support chatbot that can answer user queries and escalate complex issues to a human, all orchestrated by AI.

Challenge: Overview
Introduction
Welcome to Multi-Agent Systems with AutoGen.
In this lab, you’ll build a two-agent customer support bot in a Flask app using Microsoft’s AutoGen, backed by Azure OpenAI. You’ll wire up three capabilities: a quick-answer QA path, a summarizer tool (always 3–5 bullets + a TL;DR), and a Support → Escalation flow that routes complex or risky requests to a simulated human tier with a clean handoff.
By the end, the app will:
- Answer direct questions via a Simple QA path
- Summarize long text into 3–5 bullets with a one-sentence TL;DR
- Route support requests through Support → Escalation and produce human-ready handoffs
Outcomes
- Content & Learning: Turn long tickets, chats, or transcripts into crisp bullets + TL;DR
- Productivity: Auto-triage support requests; generate ready-to-send escalation notes
- Compliance & QA: Encode escalation rules (PII, legal, payments, outages) into governed prompts
- Pipelines: Reuse the summarizer and escalation logic as building blocks in larger RAG/agent workflows
Mental Model
- Agent: A specialized teammate with a role (system message) and optional tools (e.g., Support vs. Escalation).
- Tool: A capability the app can call (e.g., a summarizer that always returns 3–5 bullets + TL;DR).
- Router (orchestrator): A traffic controller that sends input to Support first, watches for the [ESCALATE] flag, and, if needed, asks Escalation to draft the human handoff (see the sketch after this list).
- Dialogue: A structured conversation between agents that yields one user-facing response (or a handoff).
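To make the router idea concrete before you build it, here is a conceptual sketch of the decision in plain Python. The real orchestration you write in app/router.py follows the same shape; the ask() helper here is purely illustrative.

# Conceptual sketch only; the real version is built step by step in app/router.py.
ESCALATE_TOKEN = "[ESCALATE]"

def ask(agent, text: str) -> str:
    # Single-turn call; mirrors the agent.generate_reply(...) call used later in the lab.
    return (agent.generate_reply(messages=[{"role": "user", "content": text}]) or "").strip()

def route(user_input: str, support, escalation) -> str:
    reply = ask(support, user_input)          # Support (the frontline agent) answers first
    if ESCALATE_TOKEN in reply:               # the router watches for the flag
        return ask(escalation, user_input)    # Escalation drafts the human handoff
    return reply                              # otherwise: one user-facing response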
Repo Layout (teaching-first split)
workspace/
├── app/
│   ├── base.py              # provided; env wiring, Azure/AutoGen config, Simple QA adapter, Agent adapter
│   ├── agents.py            # Step 1: define Support & Escalation agents + ESCALATE_TOKEN
│   ├── router.py            # Step 2: orchestration (Support → optional Escalation, returns final string)
│   ├── tools.py             # Step 3: implement "smart_summarizer" (3–5 bullets + TL;DR)
│   └── templates/
│       └── index.html       # provided UI; no changes needed
├── flask_app.py             # provided; handles key, routes, and modes
└── .env                     # created automatically when you paste your API key in the UI
Getting Started
- Launch the app in your Web Browser tab and paste your Azure OpenAI API key into the field at the top. The app saves it to .env and uses sensible defaults for endpoint, version, and model.
- Use the dropdown to try each mode:
  - Simple QA — lightweight answers, clear and concise
  - Summarize text — converts long passages into 3–5 bullets, ending with a one-sentence TL;DR
  - Agent — Support → Escalation: Support responds first; if the request is risky/complex, it emits [ESCALATE], and Escalation drafts a clean human handoff
- Build it step by step (teaching flow):
  Step 1 — Agents: Open app/agents.py. Author system messages for SupportAgent and EscalationAgent; set/confirm the ESCALATE_TOKEN contract.
  Step 2 — Router: Open app/router.py. Orchestrate the flow: call Support → detect ESCALATE_TOKEN → (optional) call Escalation to produce the handoff that starts with “Escalated to human: …”.
  Step 3 — Tool: In app/tools.py, implement smart_summarizer so it always returns 3–5 bullets + TL;DR.
  (Provided) Base: app/base.py already wires environment defaults, the Simple QA adapter, and the Agent-mode adapter; no edits needed.
You’ll finish with a page that answers, summarizes, and escalates, with governed outputs that feel production-ready and a file layout that’s easy to teach and extend.
Note: This lab experience was developed by the Pluralsight team using Forge, an internally developed AI tool utilizing Gemini technology. All sections were verified by human experts for accuracy prior to publication. For issue reporting, please contact us.
Challenge: Agents
Support & Escalation Agents
In this step you’ll define two AutoGen agents with clear roles—think frontline and handoff. You’ll author a SupportAgent that answers FAQs concisely and an EscalationAgent that produces a clean, human-ready handoff. Routing (when to escalate) happens in the router step.
Key Concepts
- System messages = job descriptions. Put tone, scope, and rules here so outputs are consistent and auditable.
- Trigger token contract (single source of truth). Define ESCALATE_TOKEN and have Support emit [ESCALATE] plus a short reason for risky/complex/PII/policy/account-access cases. The router imports the same constant to detect escalations (see the short example after this list).
- Deterministic handoff. Escalation output must start with Escalated to human: <one-sentence summary> for easy reading and logging.
- Config reuse. Both agents share the same config from azure_autogen_config() in app/base.py.
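Here is what the token contract looks like in practice. The reply strings are illustrative only; the detection line mirrors what you will write in the router step.

# Illustrative strings only; Support's real replies come from the model.
ESCALATE_TOKEN = "[ESCALATE]"

escalating_reply = "[ESCALATE] Refund requires account verification."
normal_reply = "You can reset your password from the account settings page."

for reply in (escalating_reply, normal_reply):
    if ESCALATE_TOKEN in reply:                              # router-side detection (Step 2)
        reason = reply.split(ESCALATE_TOKEN, 1)[-1].strip()
        print("escalate:", reason)
    else:
        print("answer directly:", reply)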
Your Task
- Create the shared token at the top of the file: ESCALATE_TOKEN = "[ESCALATE]"
- Create two AssistantAgents using the same Azure config:
  - SupportAgent — concise Tier-1 answers; when escalation is needed, emit ESCALATE_TOKEN + one short reason. When not escalating, never mention the token.
  - EscalationAgent — produces a brief human handoff that begins with Escalated to human: <one-sentence summary> and asks only for essentials.
- Return both agents from create_support_and_escalation_agents() -> Tuple[AssistantAgent, AssistantAgent].
Starter Code (edit app/agents.py)

# app/agents.py
from typing import Tuple

from autogen import AssistantAgent

from app.base import azure_autogen_config

# Single source of truth for the trigger token (router imports this)
ESCALATE_TOKEN = "[ESCALATE]"


def create_support_and_escalation_agents() -> Tuple[AssistantAgent, AssistantAgent]:
    """
    Return (support_agent, escalation_agent), both using the same Azure config.

    TODOs:
    - SupportAgent system message:
      • Concise Tier-1 answers.
      • If complex/risky/PII/policy/account access needed → emit ESCALATE_TOKEN + one-sentence reason.
      • Otherwise DO NOT mention or describe the token.
    - EscalationAgent system message:
      • Produce a handoff beginning with 'Escalated to human: <one-sentence summary>'.
      • Ask only for essential details needed by a human.
    """
    cfg = azure_autogen_config()

    # TODO: SupportAgent
    support = AssistantAgent(
        name="SupportAgent",
        system_message=(
            "TODO: Concise Tier-1 answers. If complex/risky/PII/policy/account access is needed, "
            f"emit {ESCALATE_TOKEN} plus a short reason. "
            "If you are not escalating, do not mention the token."
        ),
        llm_config=cfg,
    )

    # TODO: EscalationAgent
    escalation = AssistantAgent(
        name="EscalationAgent",
        system_message=(
            "TODO: Acknowledge escalation and produce a brief handoff starting with "
            "'Escalated to human: <one-sentence summary>'. Ask only for essentials."
        ),
        llm_config=cfg,
    )

    return support, escalation
Code (Solved)
# app/agents.py
from typing import Tuple

from autogen import AssistantAgent

from app.base import azure_autogen_config

# Single source of truth for the trigger token (router imports this)
ESCALATE_TOKEN = "[ESCALATE]"


def create_support_and_escalation_agents() -> Tuple[AssistantAgent, AssistantAgent]:
    """Return (support_agent, escalation_agent), both using the same Azure config."""
    cfg = azure_autogen_config()

    support = AssistantAgent(
        name="SupportAgent",
        system_message=(
            "You are a Tier-1 Customer Support Agent. "
            "Answer common questions concisely and helpfully. "
            "If the request is complex, risky, requires account access or PII, or needs policy exceptions, "
            f"reply with the token {ESCALATE_TOKEN} followed by one short sentence explaining why. "
            "If you are not escalating, do not mention or describe the token."
        ),
        llm_config=cfg,
    )

    escalation = AssistantAgent(
        name="EscalationAgent",
        system_message=(
            "You handle escalations to a human. When invoked, acknowledge the escalation, "
            "ask only for essential details, and provide a final handoff that begins with: "
            "Escalated to human: <one-sentence summary>\n"
            "Be brief, professional, and actionable."
        ),
        llm_config=cfg,
    )

    return support, escalation
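As a quick sanity check once your system messages are written, you can import the factory from a Python shell. This sketch assumes you run it from the workspace root with a valid Azure OpenAI configuration already in place.

# Quick sanity check (requires the Azure config that base.py wires up).
from app.agents import create_support_and_escalation_agents, ESCALATE_TOKEN

support, escalation = create_support_and_escalation_agents()
print(support.name, escalation.name, ESCALATE_TOKEN)   # SupportAgent EscalationAgent [ESCALATE]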
Next Up
Implement the router in app/router.py:
- Call SupportAgent first and capture its reply.
- If the reply contains ESCALATE_TOKEN, extract the short reason and prompt EscalationAgent to produce the handoff that starts with Escalated to human: <one-sentence summary>.
- Return a single string to the UI (either Support’s reply or the handoff).
Challenge: Router
Orchestration (Support → Escalation)
This step wires the conversation flow. The router sends a user message to SupportAgent first. If Support includes the shared ESCALATE_TOKEN (e.g., [ESCALATE]) with a short reason, the router builds a brief prompt and asks EscalationAgent to produce a clean, human-ready handoff. The router always returns one final string for the UI.
Key Concepts
- Deterministic trigger. Look for the literal token imported from app/agents.py (ESCALATE_TOKEN). This keeps routing simple and testable.
- Reason extraction. Parse the text after the token as a short reason and include it in the escalation prompt.
- Consistent handoff. Escalation output must start with Escalated to human: + one-sentence summary for easy reading and logging.
- Single-responsibility helpers. Small functions keep things teachable and easy to unit-test: _support_reply → _parse_escalation → _escalate.
Your Task
- Call Support first (_support_reply) and capture its single-turn reply.
- Detect the token with _parse_escalation; if present, extract the short reason.
- Ask EscalationAgent via _escalate to produce the handoff that begins with Escalated to human: <one-sentence summary>.
- Return a single string — Support’s answer or the human handoff.
Note: With the simplified layout, the router lives in app/router.py and imports agents directly from app/agents.py.
Starter Code — app/router.py

# app/router.py
from __future__ import annotations

from typing import Tuple

from app.agents import create_support_and_escalation_agents, ESCALATE_TOKEN


def _support_reply(support, user_input: str) -> str:
    """Single-turn call to SupportAgent."""
    # TODO: Call support.generate_reply(...) with the user_input and return trimmed text.
    # messages format: [{"role": "user", "content": user_input}]
    raise NotImplementedError


def _parse_escalation(text: str) -> Tuple[bool, str]:
    """
    Detect the escalation token and extract a brief reason.
    Returns (should_escalate, reason).
    """
    # TODO:
    # - If ESCALATE_TOKEN not in text: return (False, "")
    # - Else return (True, <text after token>.strip() or "No reason provided.")
    raise NotImplementedError


def _escalate(escalation, user_input: str, reason: str) -> str:
    """Ask EscalationAgent for the human-ready handoff."""
    # TODO: Build a short prompt that includes:
    #   - The original user_input
    #   - The extracted reason
    #   - The instruction that the response must start with:
    #     "Escalated to human: <one-sentence summary>"
    # Then call escalation.generate_reply(...) and return trimmed text.
    raise NotImplementedError


def run_support_flow(user_input: str) -> str:
    """
    Support first; escalate only on the token.
    Returns a single display string for the UI.
    """
    # TODO:
    # - Create agents via create_support_and_escalation_agents()
    # - Get Support reply
    # - Parse for escalation; if escalate, call _escalate and return result
    # - Otherwise return the Support reply
    raise NotImplementedError
Code (Solved)
# app/router.py
from __future__ import annotations

from typing import Tuple

from app.agents import create_support_and_escalation_agents, ESCALATE_TOKEN


def _support_reply(support, user_input: str) -> str:
    """Single-turn call to SupportAgent."""
    reply = support.generate_reply(messages=[{"role": "user", "content": user_input}]) or ""
    return reply.strip()


def _parse_escalation(text: str) -> Tuple[bool, str]:
    """
    Detect the escalation token and extract a brief reason.
    Returns (should_escalate, reason).
    """
    if ESCALATE_TOKEN not in text:
        return False, ""
    reason = text.split(ESCALATE_TOKEN, 1)[-1].strip() or "No reason provided."
    return True, reason


def _escalate(escalation, user_input: str, reason: str) -> str:
    """Ask EscalationAgent for the human-ready handoff."""
    prompt = (
        "A Tier-1 agent decided to escalate this conversation.\n"
        f"User message:\n{user_input}\n\n"
        f"Escalation reason: {reason}\n\n"
        "Produce a final handoff that starts with:\n"
        "Escalated to human: <one-sentence summary>\n"
        "Ask only for essential details if needed."
    )
    reply = escalation.generate_reply(messages=[{"role": "user", "content": prompt}]) or ""
    return reply.strip()


def run_support_flow(user_input: str) -> str:
    """
    Support first; escalate only on the token.
    Returns a single display string for the UI.
    """
    support, escalation = create_support_and_escalation_agents()
    s_txt = _support_reply(support, user_input)
    should_escalate, reason = _parse_escalation(s_txt)
    if should_escalate:
        return _escalate(escalation, user_input, reason)
    return s_txt
How to Try It
- Non-escalation example: “How do I reset my password?” → Support replies concisely → router returns Support’s text.
- Escalation example: “Please change my billing plan and refund the last charge.” → Support emits [ESCALATE] refund requires account verification → router prompts Escalation → final output starts with Escalated to human: Customer requests a plan change and refund …
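You can also exercise the router from a Python shell instead of the UI. This sketch assumes you run it from the workspace root and that your API key and Azure defaults are already saved via the UI (.env), which the provided base.py wires into the agents.

# Optional shell check: relies on the same Azure config the UI writes to .env.
from app.router import run_support_flow

print(run_support_flow("How do I reset my password?"))                                 # expect a direct answer
print(run_support_flow("Please change my billing plan and refund the last charge."))   # expect "Escalated to human: ..."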
Consider This
- Pre-checks: Add a lightweight keyword pre-filter (e.g., refund, chargeback, legal, outage, MFA) to escalate sooner (see the sketch after this list).
- Clarifying question: Insert a brief follow-up before escalation to improve handoff quality.
- Structured reason: Have Support emit a tiny block (YAML/JSON) and update _parse_escalation.
- Analytics: Log should_escalate, reason, and timestamps for QA and dashboards.
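A minimal sketch of the Pre-checks and Analytics ideas. The helper names (_keyword_prefilter, _log_routing) and the keyword list are illustrative additions, not part of the provided lab code.

# Hypothetical additions to app/router.py; names and keywords are illustrative only.
import logging
import time
from typing import Tuple

RISK_KEYWORDS = ("refund", "chargeback", "legal", "outage", "mfa")

def _keyword_prefilter(user_input: str) -> Tuple[bool, str]:
    """Escalate immediately on obvious high-risk terms, before any model call."""
    hits = [kw for kw in RISK_KEYWORDS if kw in user_input.lower()]
    if hits:
        return True, f"Matched risk keywords: {', '.join(hits)}"
    return False, ""

def _log_routing(should_escalate: bool, reason: str) -> None:
    """Minimal analytics hook: record the routing decision with a timestamp."""
    logging.info("routing escalate=%s reason=%r ts=%s", should_escalate, reason, time.time())

You would call _keyword_prefilter at the top of run_support_flow and fall through to the normal Support call when it returns (False, "").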
Next Up
Build the Summarizer Tool in app/tools.py: implement smart_summarizer so Summarize text always returns 3–5 bullets + a TL;DR with a predictable shape.
Challenge: Tools
Summarizer Tool
In this step you’ll build a reusable tool the app can call—think “press a button to summarize.” Under the hood it uses an AutoGen AssistantAgent, but to the app it’s just a callable that returns text. When the UI runs Summarize text, it looks up smart_summarizer and calls .run(text).
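For a sense of the calling side, here is a rough sketch of that lookup-and-run pattern; the provided flask_app.py does the real lookup, and its exact wiring may differ.

# Illustrative caller sketch; the provided flask_app.py handles this in the real app.
from app.tools import build_tools

tools = build_tools(None)  # the llm argument is unused in this lab
summarizer = next(t for t in tools if t.name == "smart_summarizer")
print(summarizer.run("Paste a long support transcript here..."))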
Your task
- Build an AutoGen AssistantAgent with a strong system message that enforces:
  - 3–5 bullets, facts only
  - Include source names if present
  - Finish with TL;DR: (one sentence)
- Define summarize_text(text: str) -> str that:
  - Sends the raw text as a user message
  - Returns a string reply (trimmed; no dicts/JSON)
- Wrap it in _SimpleTool:
  - name="smart_summarizer"
  - Clear description
  - _runner=summarize_text
- Export build_tools(...) -> list that returns only this tool for now.
Starter code — edit app/tools.py

# app/tools.py
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, List

from app.base import azure_autogen_config

# Guarded import so the app still renders in starter mode
try:
    from autogen import AssistantAgent  # type: ignore
except Exception:
    class AssistantAgent:  # fallback stub
        def __init__(self, *_, **__):
            pass

        def generate_reply(self, *_, **__):
            return "• Placeholder bullet\n• Add more bullets here\n\nTL;DR: Placeholder."


@dataclass
class _SimpleTool:
    name: str
    description: str
    _runner: Callable[[str], str]

    def run(self, text: str) -> str:
        return self._runner(text)


def build_tools(_unused_llm) -> List[_SimpleTool]:
    """
    Starter tool setup for summarization.

    TODOs:
    - Create an AutoGen AssistantAgent with a strong system_message for summarization.
    - Write summarize_text(text: str) -> str that calls agent.generate_reply(...) and returns a string.
    - Wrap summarize_text in a _SimpleTool named 'smart_summarizer' with a clear description.
    - Return a list containing this tool.
    """
    # TODO(1): Build the summarizer agent
    agent = AssistantAgent(
        name="SummarizerAgent",
        system_message=(
            "TODO: Output 3–5 concise, fact-only bullets. "
            "If source names appear, include them. "
            "After the bullets, write one sentence that begins with 'TL;DR:'."
        ),
        llm_config=azure_autogen_config(),
    )

    # TODO(2): Define the runner
    def summarize_text(text: str) -> str:
        text = (text or "").strip()
        if not text:
            return "• Please paste some text to summarize.\n\nTL;DR: No content provided."
        prompt = (
            "Summarize the following content into 3–5 concise bullets, "
            "then add a one-sentence TL;DR line starting with 'TL;DR:'.\n\n"
            f"{text}"
        )
        try:
            reply = agent.generate_reply(messages=[{'role': 'user', 'content': prompt}]) or ""
            out = (reply or "").strip()
            if not out:
                out = "• (no bullets)\n\nTL;DR: Summary unavailable."
        except Exception:
            out = ("• This is a placeholder summary while your setup finishes.\n"
                   "• Replace the TODO system message when ready.\n\nTL;DR: Placeholder output.")
        # Optional guardrail to keep shape predictable
        if "TL;DR:" not in out:
            out += "\n\nTL;DR: Summary unavailable."
        return out

    # TODO(3): Wrap as a tool
    summarizer_tool = _SimpleTool(
        name="smart_summarizer",
        description="Summarize into 3–5 concise bullets and finish with 'TL;DR: ...'.",
        _runner=summarize_text,
    )
    return [summarizer_tool]
Consider this
- Return strings only. Avoid dicts/JSON—keeps callers simple.
- Be explicit in the system message. Vague rules → inconsistent shape.
- Guardrail helps. Ensuring a TL;DR: line keeps the UI predictable.
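If you want a stricter shape guarantee than the TL;DR check in the provided code, a hypothetical post-processing helper might look like this; the name and rules are illustrative, not part of the lab files.

# Hypothetical guardrail; not part of the provided lab code.
def enforce_summary_shape(out: str) -> str:
    """Cap at five bullets and guarantee a TL;DR line."""
    lines = [ln for ln in out.splitlines() if ln.strip()]
    bullets = [ln for ln in lines if ln.lstrip().startswith(("•", "-", "*"))][:5]
    if not bullets:
        bullets = ["• (no bullets)"]
    tldr = next((ln for ln in lines if "TL;DR:" in ln), "TL;DR: Summary unavailable.")
    return "\n".join(bullets) + "\n\n" + tldr

You could call it on out just before summarize_text returns.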
Code (Solved)
# app/tools.py
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, List

from app.base import azure_autogen_config

# Guarded import so the app still renders if autogen isn't installed yet
try:
    from autogen import AssistantAgent  # type: ignore
except Exception:
    class AssistantAgent:  # fallback stub
        def __init__(self, *_, **__):
            pass

        def generate_reply(self, *_, **__):
            return "• Placeholder bullet\n• Add more bullets here\n\nTL;DR: Placeholder."


@dataclass
class _SimpleTool:
    name: str
    description: str
    _runner: Callable[[str], str]

    def run(self, text: str) -> str:
        return self._runner(text)


def build_tools(_unused_llm) -> List[_SimpleTool]:
    """Return the lab's single tool: smart_summarizer."""
    agent = AssistantAgent(
        name="SummarizerAgent",
        system_message=(
            "You are an expert technical summarizer.\n"
            "- Output exactly 3–5 concise bullets with concrete facts.\n"
            "- If source names appear in the text, include them in bullets.\n"
            "- Avoid fluff and opinions.\n"
            "- After the bullets, write exactly one sentence that begins with 'TL;DR:'."
        ),
        llm_config=azure_autogen_config(),
    )

    def summarize_text(text: str) -> str:
        text = (text or "").strip()
        if not text:
            return "• Please paste some text to summarize.\n\nTL;DR: No content provided."
        prompt = (
            "Summarize the following content into 3–5 concise bullets with concrete facts. "
            "Include source names if present. After the bullets, add exactly one sentence starting with 'TL;DR:'.\n\n"
            f"{text}"
        )
        try:
            reply = agent.generate_reply(messages=[{'role': 'user', 'content': prompt}]) or ""
            out = (reply or "").strip()
        except Exception:
            out = ("• This is a placeholder summary while your setup finishes.\n"
                   "• Replace the TODO prompts when ready.\n\nTL;DR: Placeholder output.")
        if not out:
            out = "• (no bullets)\n\nTL;DR: Summary unavailable."
        if "TL;DR:" not in out:
            out += "\n\nTL;DR: Summary unavailable."
        return out

    summarizer_tool = _SimpleTool(
        name="smart_summarizer",
        description="Summarize into 3–5 concise bullets and finish with 'TL;DR: ...'.",
        _runner=summarize_text,
    )
    return [summarizer_tool]
Next up
Finish and try it: Use the UI to run all three modes:
- Simple QA — e.g., “What are your support hours?”
- Summarize text — paste a long paragraph; verify 3–5 bullets + TL;DR.
- Agent — try a normal question (no escalation) and a support request like “Please refund my last invoice and switch me to annual billing” to observe the full Support → Escalation flow.
Challenge: Summary
Try It: QA vs Summarize vs Agent
Follow these steps to see how each mode behaves differently.
Step 1 — QA Mode
- Select Simple QA from the mode dropdown in the UI.
- Paste: “What are your support hours?”
- Click Run.
What you should see: A short, direct answer in one or two sentences. No tools or routing are involved.
Step 2 — Summarize Mode
- Switch to Summarize text mode.
- Paste this sample customer support text:
Sample Support Transcript
Customer: Hi, I was charged twice for my monthly subscription.
Agent: Sorry about that! Can you confirm the last 4 digits of your payment card?
Customer: Sure, it’s 1234.
Agent: Thanks. I’ve submitted a request to refund the duplicate charge. You should see the refund in 5–7 business days.
Customer: Great, can you also switch me to annual billing so this doesn’t happen again?
Agent: Yes, I can do that right now. You’ll be billed annually starting next cycle.
- Click Run.
What you should see:
- 3–5 concise bullet points summarizing the transcript
- A single TL;DR: line with a one-sentence takeaway
- Always the same format, thanks to your summarizer tool’s system message
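For reference, output of the intended shape might look like this; it is illustrative only, and your model’s wording will differ:

• Customer was charged twice for the monthly subscription (card ending 1234).
• Agent submitted a refund for the duplicate charge, expected in 5–7 business days.
• Customer switched to annual billing starting next cycle.

TL;DR: The duplicate charge is being refunded and the customer moved to annual billing.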
Step 3 — Agent Mode
- Switch to Agent mode.
- Paste the same support text from Step 2.
- Click Run.
What you should see: Because the input involves refunds, billing changes, and last-4 digits, your agent will likely trigger escalation. Expect a handoff that begins with:
Escalated to human: <one-sentence summary>
(If you want a non-escalation demo instead, try: “Summarize our pricing tiers at a high level.”)
Key Takeaways
- QA Mode: Perfect for quick, direct questions.
- Summarize Mode: Always outputs 3–5 bullets + TL;DR — ideal for consistent summaries of longer text.
- Agent Mode: Dynamically decides how to respond; with real support requests it will often escalate per your policy.