
Guided: Building Multi-Agent Systems with AutoGen
Learn the fundamentals of building multi-agent systems with Microsoft's AutoGen framework. In this hands-on lab, you'll create a two-agent customer support chatbot that can answer user queries and escalate complex issues to a human, all orchestrated by AI.

Challenge: Overview
Introduction
Welcome to Multi-Agent Systems with AutoGen.
In this lab, you’ll build a two-agent customer support bot in a Flask app using Microsoft’s AutoGen, backed by Azure OpenAI. You’ll wire up three capabilities: a quick-answer QA path, a summarizer tool (always 3–5 bullets + a TL;DR), and a Support → Escalation flow that routes complex or risky requests to a simulated human tier with a clean handoff.
By the end, the app will:
- Answer direct questions via a Simple QA path
- Summarize long text into 3–5 bullets with a one-sentence TL;DR
- Route support requests through Support → Escalation and produce human-ready handoffs
Outcomes
- Content & Learning: Turn long tickets, chats, or transcripts into crisp bullets + TL;DR
- Productivity: Auto-triage support requests; generate ready-to-send escalation notes
- Compliance & QA: Encode escalation rules (PII, legal, payments, outages) into governed prompts
- Pipelines: Reuse the summarizer and escalation logic as building blocks in larger RAG/agent workflows
Mental Model
- Agent: A specialized teammate with a role (system message) and optional tools (e.g., Support vs. Escalation).
- Tool: A capability the app can call (e.g., a summarizer that always returns 3–5 bullets + TL;DR).
- Router (orchestrator): A traffic controller that sends input to Support first, watches for the [ESCALATE] flag, and, if needed, asks Escalation to draft the human handoff (see the sketch after this list).
- Dialogue: A structured conversation between agents that yields one user-facing response (or a handoff).
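To make the router idea concrete before you build it, here is a conceptual sketch of the decision in plain Python. The real orchestration you write in app/router.py follows the same shape; the ask() helper here is purely illustrative.

# Conceptual sketch only; the real version is built step by step in app/router.py.
ESCALATE_TOKEN = "[ESCALATE]"

def ask(agent, text: str) -> str:
    # Single-turn call; mirrors the agent.generate_reply(...) call used later in the lab.
    return (agent.generate_reply(messages=[{"role": "user", "content": text}]) or "").strip()

def route(user_input: str, support, escalation) -> str:
    reply = ask(support, user_input)          # Support (the frontline agent) answers first
    if ESCALATE_TOKEN in reply:               # the router watches for the flag
        return ask(escalation, user_input)    # Escalation drafts the human handoff
    return reply                              # otherwise: one user-facing response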
Repo Layout (teaching-first split)
workspace/
├── app/
│   ├── base.py              # provided; env wiring, Azure/AutoGen config, Simple QA adapter, Agent adapter
│   ├── agents.py            # Step 1: define Support & Escalation agents + ESCALATE_TOKEN
│   ├── router.py            # Step 2: orchestration (Support → optional Escalation, returns final string)
│   ├── tools.py             # Step 3: implement "smart_summarizer" (3–5 bullets + TL;DR)
│   └── templates/
│       └── index.html       # provided UI; no changes needed
├── flask_app.py             # provided; handles key, routes, and modes
└── .env                     # created automatically when you paste your API key in the UI
Getting Started
- Launch the app in your Web Browser tab and paste your Azure OpenAI API key into the field at the top. The app saves it to .env and uses sensible defaults for endpoint, version, and model.
- Use the dropdown to try each mode:
  - Simple QA — lightweight answers, clear and concise
  - Summarize text — converts long passages into 3–5 bullets, ending with a one-sentence TL;DR
  - Agent — Support → Escalation: Support responds first; if the request is risky/complex, it emits [ESCALATE], and Escalation drafts a clean human handoff
- Build it step by step (teaching flow):
  Step 1 — Agents: Open app/agents.py. Author system messages for SupportAgent and EscalationAgent; set/confirm the ESCALATE_TOKEN contract.
  Step 2 — Router: Open app/router.py. Orchestrate the flow: call Support → detect ESCALATE_TOKEN → (optional) call Escalation to produce the handoff that starts with “Escalated to human: …”.
  Step 3 — Tool: In app/tools.py, implement smart_summarizer so it always returns 3–5 bullets + TL;DR.
  (Provided) Base: app/base.py already wires environment defaults, the Simple QA adapter, and the Agent-mode adapter; no edits needed.
You’ll finish with a page that answers, summarizes, and escalates, with governed outputs that feel production-ready and a file layout that’s easy to teach and extend.
Note: This lab experience was developed by the Pluralsight team using Forge, an internally developed AI tool utilizing Gemini technology. All sections were verified by human experts for accuracy prior to publication. For issue reporting, please contact us.
Challenge: Agents
Support & Escalation Agents
In this step you’ll define two AutoGen agents with clear roles—think frontline and handoff. You’ll author a SupportAgent that answers FAQs concisely and an EscalationAgent that produces a clean, human-ready handoff. Routing (when to escalate) happens in the router step.
Key Concepts
- System messages = job descriptions. Put tone, scope, and rules here so outputs are consistent and auditable.
- Trigger token contract (single source of truth). Define ESCALATE_TOKEN and have Support emit [ESCALATE] plus a short reason for risky/complex/PII/policy/account-access cases. The router imports the same constant to detect escalations (see the short example after this list).
- Deterministic handoff. Escalation output must start with Escalated to human: <one-sentence summary> for easy reading and logging.
- Config reuse. Both agents share the same config from azure_autogen_config() in app/base.py.
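Here is what the token contract looks like in practice. The reply strings are illustrative only; the detection line mirrors what you will write in the router step.

# Illustrative strings only; Support's real replies come from the model.
ESCALATE_TOKEN = "[ESCALATE]"

escalating_reply = "[ESCALATE] Refund requires account verification."
normal_reply = "You can reset your password from the account settings page."

for reply in (escalating_reply, normal_reply):
    if ESCALATE_TOKEN in reply:                              # router-side detection (Step 2)
        reason = reply.split(ESCALATE_TOKEN, 1)[-1].strip()
        print("escalate:", reason)
    else:
        print("answer directly:", reply)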
Your Task
- Create the shared token at the top of the file: ESCALATE_TOKEN = "[ESCALATE]"
- Create two AssistantAgents using the same Azure config:
  - SupportAgent — concise Tier-1 answers; when escalation is needed, emit ESCALATE_TOKEN + one short reason. When not escalating, never mention the token.
  - EscalationAgent — produces a brief human handoff that begins with Escalated to human: <one-sentence summary> and asks only for essentials.
- Return both agents from create_support_and_escalation_agents() -> Tuple[AssistantAgent, AssistantAgent].
Starter Code (edit app/agents.py)

# app/agents.py
from typing import Tuple

from autogen import AssistantAgent

from app.base import azure_autogen_config

# Single source of truth for the trigger token (router imports this)
ESCALATE_TOKEN = "[ESCALATE]"


def create_support_and_escalation_agents() -> Tuple[AssistantAgent, AssistantAgent]:
    """
    Return (support_agent, escalation_agent), both using the same Azure config.

    TODOs:
    - SupportAgent system message:
      • Concise Tier-1 answers.
      • If complex/risky/PII/policy/account access needed → emit ESCALATE_TOKEN + one-sentence reason.
      • Otherwise DO NOT mention or describe the token.
    - EscalationAgent system message:
      • Produce a handoff beginning with 'Escalated to human: <one-sentence summary>'.
      • Ask only for essential details needed by a human.
    """
    cfg = azure_autogen_config()

    # TODO: SupportAgent
    support = AssistantAgent(
        name="SupportAgent",
        system_message=(
            "TODO: Concise Tier-1 answers. If complex/risky/PII/policy/account access is needed, "
            f"emit {ESCALATE_TOKEN} plus a short reason. "
            "If you are not escalating, do not mention the token."
        ),
        llm_config=cfg,
    )

    # TODO: EscalationAgent
    escalation = AssistantAgent(
        name="EscalationAgent",
        system_message=(
            "TODO: Acknowledge escalation and produce a brief handoff starting with "
            "'Escalated to human: <one-sentence summary>'. Ask only for essentials."
        ),
        llm_config=cfg,
    )

    return support, escalation
Code (Solved)
# app/agents.py
from typing import Tuple

from autogen import AssistantAgent

from app.base import azure_autogen_config

# Single source of truth for the trigger token (router imports this)
ESCALATE_TOKEN = "[ESCALATE]"


def create_support_and_escalation_agents() -> Tuple[AssistantAgent, AssistantAgent]:
    """Return (support_agent, escalation_agent), both using the same Azure config."""
    cfg = azure_autogen_config()

    support = AssistantAgent(
        name="SupportAgent",
        system_message=(
            "You are a Tier-1 Customer Support Agent. "
            "Answer common questions concisely and helpfully. "
            "If the request is complex, risky, requires account access or PII, or needs policy exceptions, "
            f"reply with the token {ESCALATE_TOKEN} followed by one short sentence explaining why. "
            "If you are not escalating, do not mention or describe the token."
        ),
        llm_config=cfg,
    )

    escalation = AssistantAgent(
        name="EscalationAgent",
        system_message=(
            "You handle escalations to a human. When invoked, acknowledge the escalation, "
            "ask only for essential details, and provide a final handoff that begins with: "
            "Escalated to human: <one-sentence summary>\n"
            "Be brief, professional, and actionable."
        ),
        llm_config=cfg,
    )

    return support, escalation
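As a quick sanity check once your system messages are written, you can import the factory from a Python shell. This sketch assumes you run it from the workspace root with a valid Azure OpenAI configuration already in place.

# Quick sanity check (requires the Azure config that base.py wires up).
from app.agents import create_support_and_escalation_agents, ESCALATE_TOKEN

support, escalation = create_support_and_escalation_agents()
print(support.name, escalation.name, ESCALATE_TOKEN)   # SupportAgent EscalationAgent [ESCALATE]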
Next Up
Implement the router in app/router.py:
- Call SupportAgent first and capture its reply.
- If the reply contains ESCALATE_TOKEN, extract the short reason and prompt EscalationAgent to produce the handoff that starts with Escalated to human: <one-sentence summary>.
- Return a single string to the UI (either Support’s reply or the handoff).
Challenge: Router
Orchestration (Support → Escalation)
This step wires the conversation flow. The router sends a user message to SupportAgent first. If Support includes the shared ESCALATE_TOKEN (e.g., [ESCALATE]) with a short reason, the router builds a brief prompt and asks EscalationAgent to produce a clean, human-ready handoff. The router always returns one final string for the UI.
Key Concepts
- Deterministic trigger. Look for the literal token imported from app/agents.py (ESCALATE_TOKEN). This keeps routing simple and testable.
- Reason extraction. Parse the text after the token as a short reason and include it in the escalation prompt.
- Consistent handoff. Escalation output must start with Escalated to human: + one-sentence summary for easy reading and logging.
- Single-responsibility helpers. Small functions keep things teachable and easy to unit-test: _support_reply → _parse_escalation → _escalate.
Your Task
- Call Support first (_support_reply) and capture its single-turn reply.
- Detect the token with _parse_escalation; if present, extract the short reason.
- Ask EscalationAgent via _escalate to produce the handoff that begins with Escalated to human: <one-sentence summary>.
- Return a single string — Support’s answer or the human handoff.
Note: With the simplified layout, the router lives in app/router.py and imports agents directly from app/agents.py.
Starter Code — app/router.py

# app/router.py
from __future__ import annotations

from typing import Tuple

from app.agents import create_support_and_escalation_agents, ESCALATE_TOKEN


def _support_reply(support, user_input: str) -> str:
    """Single-turn call to SupportAgent."""
    # TODO: Call support.generate_reply(...) with the user_input and return trimmed text.
    # messages format: [{"role": "user", "content": user_input}]
    raise NotImplementedError


def _parse_escalation(text: str) -> Tuple[bool, str]:
    """
    Detect the escalation token and extract a brief reason.
    Returns (should_escalate, reason).
    """
    # TODO:
    # - If ESCALATE_TOKEN not in text: return (False, "")
    # - Else return (True, <text after token>.strip() or "No reason provided.")
    raise NotImplementedError


def _escalate(escalation, user_input: str, reason: str) -> str:
    """Ask EscalationAgent for the human-ready handoff."""
    # TODO: Build a short prompt that includes:
    #   - The original user_input
    #   - The extracted reason
    #   - The instruction that the response must start with:
    #     "Escalated to human: <one-sentence summary>"
    # Then call escalation.generate_reply(...) and return trimmed text.
    raise NotImplementedError


def run_support_flow(user_input: str) -> str:
    """
    Support first; escalate only on the token.
    Returns a single display string for the UI.
    """
    # TODO:
    # - Create agents via create_support_and_escalation_agents()
    # - Get Support reply
    # - Parse for escalation; if escalate, call _escalate and return result
    # - Otherwise return the Support reply
    raise NotImplementedError
Code (Solved)
# app/router.py
from __future__ import annotations

from typing import Tuple

from app.agents import create_support_and_escalation_agents, ESCALATE_TOKEN


def _support_reply(support, user_input: str) -> str:
    """Single-turn call to SupportAgent."""
    reply = support.generate_reply(messages=[{"role": "user", "content": user_input}]) or ""
    return reply.strip()


def _parse_escalation(text: str) -> Tuple[bool, str]:
    """
    Detect the escalation token and extract a brief reason.
    Returns (should_escalate, reason).
    """
    if ESCALATE_TOKEN not in text:
        return False, ""
    reason = text.split(ESCALATE_TOKEN, 1)[-1].strip() or "No reason provided."
    return True, reason


def _escalate(escalation, user_input: str, reason: str) -> str:
    """Ask EscalationAgent for the human-ready handoff."""
    prompt = (
        "A Tier-1 agent decided to escalate this conversation.\n"
        f"User message:\n{user_input}\n\n"
        f"Escalation reason: {reason}\n\n"
        "Produce a final handoff that starts with:\n"
        "Escalated to human: <one-sentence summary>\n"
        "Ask only for essential details if needed."
    )
    reply = escalation.generate_reply(messages=[{"role": "user", "content": prompt}]) or ""
    return reply.strip()


def run_support_flow(user_input: str) -> str:
    """
    Support first; escalate only on the token.
    Returns a single display string for the UI.
    """
    support, escalation = create_support_and_escalation_agents()
    s_txt = _support_reply(support, user_input)
    should_escalate, reason = _parse_escalation(s_txt)
    if should_escalate:
        return _escalate(escalation, user_input, reason)
    return s_txt
How to Try It
- Non-escalation example: “How do I reset my password?” → Support replies concisely → router returns Support’s text.
- Escalation example: “Please change my billing plan and refund the last charge.” → Support emits [ESCALATE] refund requires account verification → router prompts Escalation → final output starts with Escalated to human: Customer requests a plan change and refund …
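You can also exercise the router from a Python shell instead of the UI. This sketch assumes you run it from the workspace root and that your API key and Azure defaults are already saved via the UI (.env), which the provided base.py wires into the agents.

# Optional shell check: relies on the same Azure config the UI writes to .env.
from app.router import run_support_flow

print(run_support_flow("How do I reset my password?"))                                 # expect a direct answer
print(run_support_flow("Please change my billing plan and refund the last charge."))   # expect "Escalated to human: ..."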
Consider This
- Pre-checks: Add a lightweight keyword pre-filter (e.g., refund, chargeback, legal, outage, MFA) to escalate sooner (see the sketch after this list).
- Clarifying question: Insert a brief follow-up before escalation to improve handoff quality.
- Structured reason: Have Support emit a tiny block (YAML/JSON) and update _parse_escalation.
- Analytics: Log should_escalate, reason, and timestamps for QA and dashboards.
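A minimal sketch of the Pre-checks and Analytics ideas. The helper names (_keyword_prefilter, _log_routing) and the keyword list are illustrative additions, not part of the provided lab code.

# Hypothetical additions to app/router.py; names and keywords are illustrative only.
import logging
import time
from typing import Tuple

RISK_KEYWORDS = ("refund", "chargeback", "legal", "outage", "mfa")

def _keyword_prefilter(user_input: str) -> Tuple[bool, str]:
    """Escalate immediately on obvious high-risk terms, before any model call."""
    hits = [kw for kw in RISK_KEYWORDS if kw in user_input.lower()]
    if hits:
        return True, f"Matched risk keywords: {', '.join(hits)}"
    return False, ""

def _log_routing(should_escalate: bool, reason: str) -> None:
    """Minimal analytics hook: record the routing decision with a timestamp."""
    logging.info("routing escalate=%s reason=%r ts=%s", should_escalate, reason, time.time())

You would call _keyword_prefilter at the top of run_support_flow and fall through to the normal Support call when it returns (False, "").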
Next Up
Build the Summarizer Tool in app/tools.py: implement smart_summarizer so Summarize text always returns 3–5 bullets + a TL;DR with a predictable shape.
Challenge: Tools
Summarizer Tool
In this step you’ll build a reusable tool the app can call—think “press a button to summarize.” Under the hood it uses an AutoGen AssistantAgent, but to the app it’s just a callable that returns text. When the UI runs Summarize text, it looks up smart_summarizer and calls .run(text).
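For a sense of the calling side, here is a rough sketch of that lookup-and-run pattern; the provided flask_app.py does the real lookup, and its exact wiring may differ.

# Illustrative caller sketch; the provided flask_app.py handles this in the real app.
from app.tools import build_tools

tools = build_tools(None)  # the llm argument is unused in this lab
summarizer = next(t for t in tools if t.name == "smart_summarizer")
print(summarizer.run("Paste a long support transcript here..."))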
Your task
- Build an AutoGen AssistantAgent with a strong system message that enforces:
  - 3–5 bullets, facts only
  - Include source names if present
  - Finish with TL;DR: (one sentence)
- Define summarize_text(text: str) -> str that:
  - Sends the raw text as a user message
  - Returns a string reply (trimmed; no dicts/JSON)
- Wrap it in _SimpleTool:
  - name="smart_summarizer"
  - Clear description
  - _runner=summarize_text
- Export build_tools(...) -> list that returns only this tool for now.
Starter code — edit app/tools.py

# app/tools.py
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, List

from app.base import azure_autogen_config

# Guarded import so the app still renders in starter mode
try:
    from autogen import AssistantAgent  # type: ignore
except Exception:
    class AssistantAgent:  # fallback stub
        def __init__(self, *_, **__):
            pass

        def generate_reply(self, *_, **__):
            return "• Placeholder bullet\n• Add more bullets here\n\nTL;DR: Placeholder."


@dataclass
class _SimpleTool:
    name: str
    description: str
    _runner: Callable[[str], str]

    def run(self, text: str) -> str:
        return self._runner(text)


def build_tools(_unused_llm) -> List[_SimpleTool]:
    """
    Starter tool setup for summarization.

    TODOs:
    - Create an AutoGen AssistantAgent with a strong system_message for summarization.
    - Write summarize_text(text: str) -> str that calls agent.generate_reply(...) and returns a string.
    - Wrap summarize_text in a _SimpleTool named 'smart_summarizer' with a clear description.
    - Return a list containing this tool.
    """
    # TODO(1): Build the summarizer agent
    agent = AssistantAgent(
        name="SummarizerAgent",
        system_message=(
            "TODO: Output 3–5 concise, fact-only bullets. "
            "If source names appear, include them. "
            "After the bullets, write one sentence that begins with 'TL;DR:'."
        ),
        llm_config=azure_autogen_config(),
    )

    # TODO(2): Define the runner
    def summarize_text(text: str) -> str:
        text = (text or "").strip()
        if not text:
            return "• Please paste some text to summarize.\n\nTL;DR: No content provided."
        prompt = (
            "Summarize the following content into 3–5 concise bullets, "
            "then add a one-sentence TL;DR line starting with 'TL;DR:'.\n\n"
            f"{text}"
        )
        try:
            reply = agent.generate_reply(messages=[{'role': 'user', 'content': prompt}]) or ""
            out = (reply or "").strip()
            if not out:
                out = "• (no bullets)\n\nTL;DR: Summary unavailable."
        except Exception:
            out = ("• This is a placeholder summary while your setup finishes.\n"
                   "• Replace the TODO system message when ready.\n\nTL;DR: Placeholder output.")
        # Optional guardrail to keep shape predictable
        if "TL;DR:" not in out:
            out += "\n\nTL;DR: Summary unavailable."
        return out

    # TODO(3): Wrap as a tool
    summarizer_tool = _SimpleTool(
        name="smart_summarizer",
        description="Summarize into 3–5 concise bullets and finish with 'TL;DR: ...'.",
        _runner=summarize_text,
    )
    return [summarizer_tool]
Consider this
- Return strings only. Avoid dicts/JSON—keeps callers simple.
- Be explicit in the system message. Vague rules → inconsistent shape.
- Guardrail helps. Ensuring a TL;DR: line keeps the UI predictable.
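If you want a stricter shape guarantee than the TL;DR check in the provided code, a hypothetical post-processing helper might look like this; the name and rules are illustrative, not part of the lab files.

# Hypothetical guardrail; not part of the provided lab code.
def enforce_summary_shape(out: str) -> str:
    """Cap at five bullets and guarantee a TL;DR line."""
    lines = [ln for ln in out.splitlines() if ln.strip()]
    bullets = [ln for ln in lines if ln.lstrip().startswith(("•", "-", "*"))][:5]
    if not bullets:
        bullets = ["• (no bullets)"]
    tldr = next((ln for ln in lines if "TL;DR:" in ln), "TL;DR: Summary unavailable.")
    return "\n".join(bullets) + "\n\n" + tldr

You could call it on out just before summarize_text returns.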
Code (Solved)
# app/tools.py
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, List

from app.base import azure_autogen_config

# Guarded import so the app still renders if autogen isn't installed yet
try:
    from autogen import AssistantAgent  # type: ignore
except Exception:
    class AssistantAgent:  # fallback stub
        def __init__(self, *_, **__):
            pass

        def generate_reply(self, *_, **__):
            return "• Placeholder bullet\n• Add more bullets here\n\nTL;DR: Placeholder."


@dataclass
class _SimpleTool:
    name: str
    description: str
    _runner: Callable[[str], str]

    def run(self, text: str) -> str:
        return self._runner(text)


def build_tools(_unused_llm) -> List[_SimpleTool]:
    """Return the lab's single tool: smart_summarizer."""
    agent = AssistantAgent(
        name="SummarizerAgent",
        system_message=(
            "You are an expert technical summarizer.\n"
            "- Output exactly 3–5 concise bullets with concrete facts.\n"
            "- If source names appear in the text, include them in bullets.\n"
            "- Avoid fluff and opinions.\n"
            "- After the bullets, write exactly one sentence that begins with 'TL;DR:'."
        ),
        llm_config=azure_autogen_config(),
    )

    def summarize_text(text: str) -> str:
        text = (text or "").strip()
        if not text:
            return "• Please paste some text to summarize.\n\nTL;DR: No content provided."
        prompt = (
            "Summarize the following content into 3–5 concise bullets with concrete facts. "
            "Include source names if present. After the bullets, add exactly one sentence starting with 'TL;DR:'.\n\n"
            f"{text}"
        )
        try:
            reply = agent.generate_reply(messages=[{'role': 'user', 'content': prompt}]) or ""
            out = (reply or "").strip()
        except Exception:
            out = ("• This is a placeholder summary while your setup finishes.\n"
                   "• Replace the TODO prompts when ready.\n\nTL;DR: Placeholder output.")
        if not out:
            out = "• (no bullets)\n\nTL;DR: Summary unavailable."
        if "TL;DR:" not in out:
            out += "\n\nTL;DR: Summary unavailable."
        return out

    summarizer_tool = _SimpleTool(
        name="smart_summarizer",
        description="Summarize into 3–5 concise bullets and finish with 'TL;DR: ...'.",
        _runner=summarize_text,
    )
    return [summarizer_tool]
Next up
Finish and try it: Use the UI to run all three modes:
- Simple QA — e.g., “What are your support hours?”
- Summarize text — paste a long paragraph; verify 3–5 bullets + TL;DR.
- Agent — try a normal question (no escalation) and a support request like “Please refund my last invoice and switch me to annual billing” to observe the full Support → Escalation flow.
Challenge: Summary
Try It: QA vs Summarize vs Agent
Follow these steps to see how each mode behaves differently.
Step 1 — QA Mode
- Select Simple QA from the mode dropdown in the UI.
- Paste: “What are your support hours?”
- Click Run.
What you should see: A short, direct answer in one or two sentences. No tools or routing are involved.
Step 2 — Summarize Mode
- Switch to Summarize text mode.
- Paste this sample customer support text:
Sample Support Transcript
Customer: Hi, I was charged twice for my monthly subscription.
Agent: Sorry about that! Can you confirm the last 4 digits of your payment card?
Customer: Sure, it’s 1234.
Agent: Thanks. I’ve submitted a request to refund the duplicate charge. You should see the refund in 5–7 business days.
Customer: Great, can you also switch me to annual billing so this doesn’t happen again?
Agent: Yes, I can do that right now. You’ll be billed annually starting next cycle.
- Click Run.
What you should see:
- 3–5 concise bullet points summarizing the transcript
- A single TL;DR: line with a one-sentence takeaway
- Always the same format, thanks to your summarizer tool’s system message
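For reference, output of the intended shape might look like this; it is illustrative only, and your model’s wording will differ:

• Customer was charged twice for the monthly subscription (card ending 1234).
• Agent submitted a refund for the duplicate charge, expected in 5–7 business days.
• Customer switched to annual billing starting next cycle.

TL;DR: The duplicate charge is being refunded and the customer moved to annual billing.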
Step 3 — Agent Mode
- Switch to Agent mode.
- Paste the same support text from Step 2.
- Click Run.
What you should see: Because the input involves refunds, billing changes, and last-4 digits, your agent will likely trigger escalation. Expect a handoff that begins with:
Escalated to human: <one-sentence summary>
(If you want a non-escalation demo instead, try: “Summarize our pricing tiers at a high level.”)
Key Takeaways
- QA Mode: Perfect for quick, direct questions.
- Summarize Mode: Always outputs 3–5 bullets + TL;DR — ideal for consistent summaries of longer text.
- Agent Mode: Dynamically decides how to respond; with real support requests it will often escalate per your policy.