
Guided: Building Multi-Agent Systems with AutoGen

Learn the fundamentals of building multi-agent systems with Microsoft's AutoGen framework. In this hands-on lab, you'll create a two-agent customer support chatbot that can answer user queries and escalate complex issues to a human, all orchestrated by AI.

Lab Info
Level
Intermediate
Last updated
Oct 14, 2025
Duration
30m

Table of Contents
  1. Challenge

    Overview

    Introduction

    Welcome to Multi-Agent Systems with AutoGen.

    In this lab, you’ll build a two-agent customer support bot in a Flask app using Microsoft’s AutoGen, backed by Azure OpenAI. You’ll wire up three capabilities: a quick-answer QA path, a summarizer tool (always 3–5 bullets + a TL;DR), and a Support → Escalation flow that routes complex or risky requests to a simulated human tier with a clean handoff.

    By the end, the app will:

    • Answer direct questions via a Simple QA path
    • Summarize long text into 3–5 bullets with a one-sentence TL;DR
    • Route support requests through Support → Escalation and produce human-ready handoffs

    Outcomes

    • Content & Learning: Turn long tickets, chats, or transcripts into crisp bullets + TL;DR
    • Productivity: Auto-triage support requests; generate ready-to-send escalation notes
    • Compliance & QA: Encode escalation rules (PII, legal, payments, outages) into governed prompts
    • Pipelines: Reuse the summarizer and escalation logic as building blocks in larger RAG/agent workflows

    Mental Model

    • Agent: A specialized teammate with a role (system message) and optional tools (e.g., Support vs. Escalation).
    • Tool: A capability the app can call (e.g., a summarizer that always returns 3–5 bullets + TL;DR).
    • Router (orchestrator): A traffic controller that sends input to Support first, watches for the [ESCALATE] flag, and—if needed—asks Escalation to draft the human handoff.
    • Dialogue: A structured conversation between agents that yields one user-facing response (or a handoff).
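    The mental model above can be sketched in a few lines of plain Python, with stub functions standing in for the LLM-backed agents (the stub names and their canned behavior are illustrative, not the lab's API):

```python
# Routing sketch: plain functions stand in for the LLM-backed agents.
ESCALATE_TOKEN = "[ESCALATE]"

def support_stub(message: str) -> str:
    # A real SupportAgent calls the LLM; this stub escalates on "refund".
    if "refund" in message.lower():
        return f"{ESCALATE_TOKEN} Refund requires account verification."
    return "You can reset your password under Settings > Security."

def escalation_stub(message: str, reason: str) -> str:
    # A real EscalationAgent drafts a richer handoff from the full context.
    return f"Escalated to human: {reason}"

def route(message: str) -> str:
    # Support first; hand off only when the token appears.
    reply = support_stub(message)
    if ESCALATE_TOKEN in reply:
        reason = reply.split(ESCALATE_TOKEN, 1)[-1].strip()
        return escalation_stub(message, reason)
    return reply
```

    The real router you build in Step 2 follows this exact shape, just with AutoGen agents in place of the stubs.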

    Repo Layout (teaching-first split)

    workspace/
    └── app/
        ├── base.py               # provided; env wiring, Azure/AutoGen config, Simple QA adapter, Agent adapter
        ├── agents.py             # Step 1: define Support & Escalation agents + ESCALATE_TOKEN
        ├── router.py             # Step 2: orchestration (Support → optional Escalation, returns final string)
        ├── tools.py              # Step 3: implement "smart_summarizer" (3–5 bullets + TL;DR)
        └── templates/
            └── index.html        # provided UI; no changes needed
    flask_app.py                  # provided; handles key, routes, and modes
    .env                          # created automatically when you paste your API key in the UI
    

    Getting Started

    1. Launch the app in your Web Browser tab and paste your Azure OpenAI API key into the field at the top. The app saves it to .env and uses sensible defaults for endpoint, version, and model.

    2. Use the dropdown to try each mode:

      • Simple QA — lightweight answers, clear and concise
      • Summarize text — converts long passages into 3–5 bullets, ending with a one-sentence TL;DR
      • Agent — Support → Escalation: Support responds first; if the request is risky/complex, it emits [ESCALATE], and Escalation drafts a clean human handoff
    3. Build it step by step (teaching flow):

    Step 1 — Agents

    Open app/agents.py. Author system messages for SupportAgent and EscalationAgent; set/confirm the ESCALATE_TOKEN contract.

    Step 2 — Router

    Open app/router.py. Orchestrate the flow: call Support → detect ESCALATE_TOKEN → (optional) call Escalation to produce the handoff that starts with “Escalated to human: …”.

    Step 3 — Tool

    In app/tools.py, implement smart_summarizer so it always returns 3–5 bullets + TL;DR.

    (Provided) Base

    app/base.py already wires environment defaults, the Simple QA adapter, and the Agent-mode adapter—no edits needed.

    You’ll finish with a page that answers, summarizes, and escalates—with governed outputs that feel production-ready and a file layout that’s easy to teach and extend.


    Note: This lab experience was developed by the Pluralsight team using Forge, an internally developed AI tool utilizing Gemini technology. All sections were verified by human experts for accuracy prior to publication. For issue reporting, please contact us.

  2. Challenge

    Agents

    Support & Escalation Agents

    In this step you’ll define two AutoGen agents with clear roles—think frontline and handoff. You’ll author a SupportAgent that answers FAQs concisely and an EscalationAgent that produces a clean, human-ready handoff. Routing (when to escalate) happens in the router step.


    Key Concepts

    • System messages = job descriptions. Put tone, scope, and rules here so outputs are consistent and auditable.

    • Trigger token contract (single source of truth). Define ESCALATE_TOKEN and have Support emit [ESCALATE] + a short reason for risky/complex/PII/policy/account-access cases. The router imports the same constant to detect escalations.

    • Deterministic handoff. Escalation output must start with Escalated to human: <one-sentence summary> for easy reading and logging.

    • Config reuse. Both agents share the same config from azure_autogen_config() in app/base.py.
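    As a concrete example, here is what a compliant escalating reply looks like, plus a minimal contract check (the sample text and the check function are our own illustration, not provided lab code):

```python
ESCALATE_TOKEN = "[ESCALATE]"

# What a compliant Support escalation reply looks like under the contract.
sample_reply = (
    f"{ESCALATE_TOKEN} Customer requests a refund, which requires "
    "account verification."
)

def follows_contract(reply: str) -> bool:
    # Escalating replies carry the token plus a non-empty reason.
    return ESCALATE_TOKEN in reply and bool(
        reply.split(ESCALATE_TOKEN, 1)[-1].strip()
    )
```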


    Your Task

    1. Create the shared token at the top of the file: ESCALATE_TOKEN = "[ESCALATE]"

    2. Create two AssistantAgents using the same Azure config:

      • SupportAgent — concise Tier-1 answers; when escalation is needed, emit ESCALATE_TOKEN + one short reason. When not escalating, never mention the token.

      • EscalationAgent — produces a brief human handoff that begins with Escalated to human: <one-sentence summary> and asks only for essentials.

    3. Return both agents from create_support_and_escalation_agents() -> Tuple[AssistantAgent, AssistantAgent].


    Starter Code (edit app/agents.py)

    # app/agents.py
    from typing import Tuple
    from autogen import AssistantAgent
    from app.base import azure_autogen_config
    
    # Single source of truth for the trigger token (router imports this)
    ESCALATE_TOKEN = "[ESCALATE]"
    
    
    def create_support_and_escalation_agents() -> Tuple[AssistantAgent, AssistantAgent]:
        """
        Return (support_agent, escalation_agent), both using the same Azure config.
    
        TODOs:
        - SupportAgent system message:
          • Concise Tier-1 answers.
          • If complex/risky/PII/policy/account access needed → emit ESCALATE_TOKEN + one-sentence reason.
          • Otherwise DO NOT mention or describe the token.
        - EscalationAgent system message:
          • Produce a handoff beginning with 'Escalated to human: <one-sentence summary>'.
          • Ask only for essential details needed by a human.
        """
        cfg = azure_autogen_config()
    
        # TODO: SupportAgent
        support = AssistantAgent(
            name="SupportAgent",
            system_message=(
                "TODO: Concise Tier-1 answers. If complex/risky/PII/policy/account access is needed, "
                f"emit {ESCALATE_TOKEN} plus a short reason. "
                "If you are not escalating, do not mention the token."
            ),
            llm_config=cfg,
        )
    
        # TODO: EscalationAgent
        escalation = AssistantAgent(
            name="EscalationAgent",
            system_message=(
                "TODO: Acknowledge escalation and produce a brief handoff starting with "
                "'Escalated to human: <one-sentence summary>'. Ask only for essentials."
            ),
            llm_config=cfg,
        )
    
        return support, escalation
    

    Code (Solved) — Click to expand
    # app/agents.py
    from typing import Tuple
    from autogen import AssistantAgent
    from app.base import azure_autogen_config
    
    # Single source of truth for the trigger token (router imports this)
    ESCALATE_TOKEN = "[ESCALATE]"
    
    
    def create_support_and_escalation_agents() -> Tuple[AssistantAgent, AssistantAgent]:
        """
        Return (support_agent, escalation_agent), both using the same Azure config.
        """
        cfg = azure_autogen_config()
    
        support = AssistantAgent(
            name="SupportAgent",
            system_message=(
                "You are a Tier-1 Customer Support Agent. "
                "Answer common questions concisely and helpfully. "
                "If the request is complex, risky, requires account access or PII, or needs policy exceptions, "
                f"reply with the token {ESCALATE_TOKEN} followed by one short sentence explaining why. "
                "If you are not escalating, do not mention or describe the token."
            ),
            llm_config=cfg,
        )
    
        escalation = AssistantAgent(
            name="EscalationAgent",
            system_message=(
                "You handle escalations to a human. When invoked, acknowledge the escalation, "
                "ask only for essential details, and provide a final handoff that begins with: "
                "Escalated to human: <one-sentence summary>\n"
                "Be brief, professional, and actionable."
            ),
            llm_config=cfg,
        )
    
        return support, escalation
    

    Next Up

    Implement the router in app/router.py:

    • Call SupportAgent first and capture its reply.
    • If the reply contains ESCALATE_TOKEN, extract the short reason and prompt EscalationAgent to produce the handoff that starts with Escalated to human: <one-sentence summary>.
    • Return a single string to the UI (either Support’s reply or the handoff).

  3. Challenge

    Router

    Orchestration (Support → Escalation)

    This step wires the conversation flow. The router sends a user message to SupportAgent first. If Support includes the shared ESCALATE_TOKEN (e.g., [ESCALATE]) with a short reason, the router builds a brief prompt and asks EscalationAgent to produce a clean, human-ready handoff. The router always returns one final string for the UI.


    Key Concepts

    • Deterministic trigger. Look for the literal token imported from app/agents.py (ESCALATE_TOKEN). This keeps routing simple and testable.

    • Reason extraction. Parse the text after the token as a short reason and include it in the escalation prompt.

    • Consistent handoff. Escalation output must start with Escalated to human: + one-sentence summary for easy reading and logging.

    • Single-responsibility helpers. Small functions keep things teachable and easy to unit-test: _support_reply → _parse_escalation → _escalate


    Your Task

    1. Call Support first (_support_reply) and capture its single-turn reply.
    2. Detect the token with _parse_escalation; if present, extract the short reason.
    3. Ask EscalationAgent via _escalate to produce the handoff that begins with Escalated to human: <one-sentence summary>.
    4. Return a single string — Support’s answer or the human handoff.

    Note: With the simplified layout, the router lives in app/router.py and imports agents directly from app/agents.py.


    Starter Code — app/router.py

    # app/router.py
    from __future__ import annotations
    from typing import Tuple
    from app.agents import create_support_and_escalation_agents, ESCALATE_TOKEN
    
    
    def _support_reply(support, user_input: str) -> str:
        """Single-turn call to SupportAgent."""
        # TODO: Call support.generate_reply(...) with the user_input and return trimmed text.
        # messages format: [{"role": "user", "content": user_input}]
        raise NotImplementedError
    
    
    def _parse_escalation(text: str) -> Tuple[bool, str]:
        """
        Detect the escalation token and extract a brief reason.
        Returns (should_escalate, reason).
        """
        # TODO:
        # - If ESCALATE_TOKEN not in text: return (False, "")
        # - Else return (True, <text after token>.strip() or "No reason provided.")
        raise NotImplementedError
    
    
    def _escalate(escalation, user_input: str, reason: str) -> str:
        """Ask EscalationAgent for the human-ready handoff."""
        # TODO: Build a short prompt that includes:
        # - The original user_input
        # - The extracted reason
        # - The instruction that the response must start with:
        #   "Escalated to human: <one-sentence summary>"
        # Then call escalation.generate_reply(...) and return trimmed text.
        raise NotImplementedError
    
    
    def run_support_flow(user_input: str) -> str:
        """
        Support first; escalate only on the token.
        Returns a single display string for the UI.
        """
        # TODO:
        # - Create agents via create_support_and_escalation_agents()
        # - Get Support reply
        # - Parse for escalation; if escalate, call _escalate and return result
        # - Otherwise return the Support reply
        raise NotImplementedError
    

    Code (Solved) — Click to expand
    # app/router.py
    from __future__ import annotations
    from typing import Tuple
    from app.agents import create_support_and_escalation_agents, ESCALATE_TOKEN
    
    
    def _support_reply(support, user_input: str) -> str:
        """Single-turn call to SupportAgent."""
        reply = support.generate_reply(messages=[{"role": "user", "content": user_input}]) or ""
        return reply.strip()
    
    
    def _parse_escalation(text: str) -> Tuple[bool, str]:
        """
        Detect the escalation token and extract a brief reason.
        Returns (should_escalate, reason).
        """
        if ESCALATE_TOKEN not in text:
            return False, ""
        reason = text.split(ESCALATE_TOKEN, 1)[-1].strip() or "No reason provided."
        return True, reason
    
    
    def _escalate(escalation, user_input: str, reason: str) -> str:
        """Ask EscalationAgent for the human-ready handoff."""
        prompt = (
            "A Tier-1 agent decided to escalate this conversation.\n"
            f"User message:\n{user_input}\n\n"
            f"Escalation reason: {reason}\n\n"
            "Produce a final handoff that starts with:\n"
            "Escalated to human: <one-sentence summary>\n"
            "Ask only for essential details if needed."
        )
        reply = escalation.generate_reply(messages=[{"role": "user", "content": prompt}]) or ""
        return reply.strip()
    
    
    def run_support_flow(user_input: str) -> str:
        """
        Support first; escalate only on the token.
        Returns a single display string for the UI.
        """
        support, escalation = create_support_and_escalation_agents()
    
        s_txt = _support_reply(support, user_input)
        should_escalate, reason = _parse_escalation(s_txt)
    
        if should_escalate:
            return _escalate(escalation, user_input, reason)
    
        return s_txt
    

    How to Try It

    • Non-escalation example: “How do I reset my password?” → Support replies concisely → router returns Support’s text.

    • Escalation example: “Please change my billing plan and refund the last charge.” → Support emits [ESCALATE] refund requires account verification → router prompts Escalation → final output starts with Escalated to human: Customer requests a plan change and refund …


    Consider This

    • Pre-checks: Add a lightweight keyword pre-filter (e.g., refund, chargeback, legal, outage, MFA) to escalate sooner.
    • Clarifying question: Insert a brief follow-up before escalation to improve handoff quality.
    • Structured reason: Have Support emit a tiny block (YAML/JSON) and update _parse_escalation.
    • Analytics: Log should_escalate, reason, and timestamps for QA and dashboards.
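    The pre-check idea can be sketched as a tiny keyword filter run before calling Support (the keyword list and function name here are our own illustration):

```python
# Illustrative pre-filter: escalate immediately on high-risk keywords.
ESCALATION_KEYWORDS = ("refund", "chargeback", "legal", "outage", "mfa")

def needs_immediate_escalation(user_input: str) -> bool:
    text = user_input.lower()
    return any(keyword in text for keyword in ESCALATION_KEYWORDS)
```

    run_support_flow could check this first and skip the Support call entirely for obvious escalations, saving one LLM round trip.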

    Next Up

    Build the Summarizer Tool in app/tools.py: implement smart_summarizer so Summarize text always returns 3–5 bullets + a TL;DR with a predictable shape.


  4. Challenge

    Tools

    Summarizer Tool

    In this step you’ll build a reusable tool the app can call—think “press a button to summarize.” Under the hood it uses an AutoGen AssistantAgent, but to the app it’s just a callable that returns text. When the UI runs Summarize text, it looks up smart_summarizer and calls .run(text).
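    That lookup-and-call pattern might look like the following sketch, with a stub standing in for the real tool (the StubTool class and dispatch helper are illustrative, not the app's actual code):

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class StubTool:
    # Mirrors the _SimpleTool shape: a name plus a run(text) -> str method.
    name: str
    runner: Callable[[str], str]

    def run(self, text: str) -> str:
        return self.runner(text)

def run_tool_by_name(tools: List[StubTool], name: str, text: str) -> str:
    # Dispatch to a tool by name, as the UI does for "smart_summarizer".
    for tool in tools:
        if tool.name == name:
            return tool.run(text)
    raise KeyError(f"Unknown tool: {name}")
```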


    Your task

    1. Build an AutoGen AssistantAgent with a strong system message that enforces:

      • 3–5 bullets, facts only
      • Include source names if present
      • Finish with TL;DR: (one sentence)
    2. Define summarize_text(text: str) -> str that:

      • Sends the raw text as a user message
      • Returns a string reply (trimmed; no dicts/JSON)
    3. Wrap it in _SimpleTool:

      • name="smart_summarizer"
      • Clear description
      • _runner=summarize_text
    4. Export build_tools(...) -> list that returns only this tool for now.


    Starter code — edit app/tools.py

    # app/tools.py
    from __future__ import annotations
    from dataclasses import dataclass
    from typing import Callable, List
    from app.base import azure_autogen_config
    
    # Guarded import so the app still renders in starter mode
    try:
        from autogen import AssistantAgent  # type: ignore
    except Exception:
        class AssistantAgent:  # fallback stub
            def __init__(self, *_, **__): pass
            def generate_reply(self, *_, **__):
                return "• Placeholder bullet\n• Add more bullets here\n\nTL;DR: Placeholder."
    
    @dataclass
    class _SimpleTool:
        name: str
        description: str
        _runner: Callable[[str], str]
        def run(self, text: str) -> str:
            return self._runner(text)
    
    def build_tools(_unused_llm) -> List[_SimpleTool]:
        """
        Starter tool setup for summarization.
    
        TODOs:
        - Create an AutoGen AssistantAgent with a strong system_message for summarization.
        - Write summarize_text(text: str) -> str that calls agent.generate_reply(...) and returns a string.
        - Wrap summarize_text in a _SimpleTool named 'smart_summarizer' with a clear description.
        - Return a list containing this tool.
        """
    
        # TODO(1): Build the summarizer agent
        agent = AssistantAgent(
            name="SummarizerAgent",
            system_message=(
                "TODO: Output 3–5 concise, fact-only bullets. "
                "If source names appear, include them. "
                "After the bullets, write one sentence that begins with 'TL;DR:'."
            ),
            llm_config=azure_autogen_config(),
        )
    
        # TODO(2): Define the runner
        def summarize_text(text: str) -> str:
            text = (text or "").strip()
            if not text:
                return "• Please paste some text to summarize.\n\nTL;DR: No content provided."
            prompt = (
                "Summarize the following content into 3–5 concise bullets, "
                "then add a one-sentence TL;DR line starting with 'TL;DR:'.\n\n"
                f"{text}"
            )
            try:
                reply = agent.generate_reply(messages=[{'role': 'user', 'content': prompt}]) or ""
                out = (reply or "").strip()
                if not out:
                    out = "• (no bullets)\n\nTL;DR: Summary unavailable."
            except Exception:
                out = ("• This is a placeholder summary while your setup finishes.\n"
                       "• Replace the TODO system message when ready.\n\nTL;DR: Placeholder output.")
            # Optional guardrail to keep shape predictable
            if "TL;DR:" not in out:
                out += "\n\nTL;DR: Summary unavailable."
            return out
    
        # TODO(3): Wrap as a tool
        summarizer_tool = _SimpleTool(
            name="smart_summarizer",
            description="Summarize into 3–5 concise bullets and finish with 'TL;DR: ...'.",
            _runner=summarize_text,
        )
    
        return [summarizer_tool]
    

    Consider this

    • Return strings only. Avoid dicts/JSON—keeps callers simple.
    • Be explicit in the system message. Vague rules → inconsistent shape.
    • Guardrail helps. Ensuring a TL;DR: line keeps the UI predictable.
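    The guardrail bullet can be factored into a small helper, which also makes it easy to unit-test (the helper name is ours; the starter inlines this logic in summarize_text):

```python
def ensure_tldr(summary: str) -> str:
    # Guarantee the output shape: non-empty bullets plus a TL;DR line.
    out = (summary or "").strip()
    if not out:
        out = "• (no bullets)"
    if "TL;DR:" not in out:
        out += "\n\nTL;DR: Summary unavailable."
    return out
```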

    Code (Solved) — Click to expand
    # app/tools.py
    from __future__ import annotations
    from dataclasses import dataclass
    from typing import Callable, List
    from app.base import azure_autogen_config
    
    # Guarded import so the app still renders if autogen isn't installed yet
    try:
        from autogen import AssistantAgent  # type: ignore
    except Exception:
        class AssistantAgent:  # fallback stub
            def __init__(self, *_, **__): pass
            def generate_reply(self, *_, **__):
                return "• Placeholder bullet\n• Add more bullets here\n\nTL;DR: Placeholder."
    
    @dataclass
    class _SimpleTool:
        name: str
        description: str
        _runner: Callable[[str], str]
        def run(self, text: str) -> str:
            return self._runner(text)
    
    def build_tools(_unused_llm) -> List[_SimpleTool]:
        """Return the lab's single tool: smart_summarizer."""
        agent = AssistantAgent(
            name="SummarizerAgent",
            system_message=(
                "You are an expert technical summarizer.\n"
                "- Output exactly 3–5 concise bullets with concrete facts.\n"
                "- If source names appear in the text, include them in bullets.\n"
                "- Avoid fluff and opinions.\n"
                "- After the bullets, write exactly one sentence that begins with 'TL;DR:'."
            ),
            llm_config=azure_autogen_config(),
        )
    
        def summarize_text(text: str) -> str:
            text = (text or "").strip()
            if not text:
                return "• Please paste some text to summarize.\n\nTL;DR: No content provided."
            prompt = (
                "Summarize the following content into 3–5 concise bullets with concrete facts. "
                "Include source names if present. After the bullets, add exactly one sentence starting with 'TL;DR:'.\n\n"
                f"{text}"
            )
            try:
                reply = agent.generate_reply(messages=[{'role': 'user', 'content': prompt}]) or ""
                out = (reply or "").strip()
            except Exception:
                out = ("• This is a placeholder summary while your setup finishes.\n"
                       "• Replace the TODO prompts when ready.\n\nTL;DR: Placeholder output.")
    
            if not out:
                out = "• (no bullets)\n\nTL;DR: Summary unavailable."
            if "TL;DR:" not in out:
                out += "\n\nTL;DR: Summary unavailable."
            return out
    
        summarizer_tool = _SimpleTool(
            name="smart_summarizer",
            description="Summarize into 3–5 concise bullets and finish with 'TL;DR: ...'.",
            _runner=summarize_text,
        )
        return [summarizer_tool]
    

    Next up

    Finish and try it: Use the UI to run all three modes:

    • Simple QA — e.g., “What are your support hours?”
    • Summarize text — paste a long paragraph; verify 3–5 bullets + TL;DR.
    • Agent — try a normal question (no escalation) and a support request like “Please refund my last invoice and switch me to annual billing” to observe the full Support → Escalation flow.
  5. Challenge

    Summary

    Try It: QA vs Summarize vs Agent

    Follow these steps to see how each mode behaves differently.


    Step 1 — QA Mode

    1. Select Simple QA from the mode dropdown in the UI.
    2. Paste: “What are your support hours?”
    3. Click Run.

    What you should see: A short, direct answer in one or two sentences. No tools or routing are involved.


    Step 2 — Summarize Mode

    1. Switch to Summarize text mode.
    2. Paste this sample customer support text:

    Sample Support Transcript
    Customer: Hi, I was charged twice for my monthly subscription.
    Agent: Sorry about that! Can you confirm the last 4 digits of your payment card?
    Customer: Sure, it’s 1234.
    Agent: Thanks. I’ve submitted a request to refund the duplicate charge. You should see the refund in 5–7 business days.
    Customer: Great, can you also switch me to annual billing so this doesn’t happen again?
    Agent: Yes, I can do that right now. You’ll be billed annually starting next cycle.

    3. Click Run.

    What you should see:

    • 3–5 concise bullet points summarizing the transcript
    • A single TL;DR: line with a one-sentence takeaway
    • Always the same format, thanks to your summarizer tool’s system message

    Step 3 — Agent Mode

    1. Switch to Agent mode.
    2. Paste the same support text from Step 2.
    3. Click Run.

    What you should see: Because the input involves refunds, billing changes, and last-4 digits, your agent will likely trigger escalation. Expect a handoff that begins with: Escalated to human: <one-sentence summary> (If you want a non-escalation demo instead, try: “Summarize our pricing tiers at a high level.”)


    Key Takeaways

    • QA Mode: Perfect for quick, direct questions.
    • Summarize Mode: Always outputs 3–5 bullets + TL;DR — ideal for consistent summaries of longer text.
    • Agent Mode: Dynamically decides how to respond; with real support requests it will often escalate per your policy.

About the author

Danny Sullivan is a former special education teacher and professional baseball player who moved into software development in 2014. He’s experienced with the Ruby, Python, and JavaScript ecosystems, but enjoys Ruby most for its user-friendliness and rapid prototyping capabilities.
