
Lab: Design and Deploy a Research Assistant with LangChain

Build and deploy a LangChain-based research assistant that combines web search and arXiv-derived document retrieval. Add safeguards, expose it with LangServe, and monitor execution quality and cost with LangSmith traces.

Lab Info
Last updated: Mar 27, 2026
Duration: 45m

Table of Contents
  1. Challenge

    ### Step 1: Introduction

    Welcome to the LangChain Research Assistant Code Lab!

    In this hands-on lab, you'll build a research assistant for analysts who track fast-moving transformer topics. You'll expose research tools with LangChain, guide the assistant through bounded multi-step investigation, deploy the workflow through LangServe, and review its behavior with monitoring and tracing outputs.

    Background

    You are part of an AI team that supports an internal research desk following fast-moving machine learning topics. The team already maintains a curated arXiv-derived research corpus, but analysts still spend too much time cross-checking live updates, internal notes, and recent model announcements before they can answer practical research questions. As part of the engineering team, you've been tasked with building a proof-of-concept research assistant that can combine local retrieval with recent update lookup, then return a grounded, structured answer through an API.

    Familiarizing Yourself with the Program Structure

    The lab environment includes the following key files:

    • research_tools.py: Defines the LangChain tools used for retrieval, recent-update lookup, and calculator logic
    • research_assistant.py: Implements prompt construction, tool calling, bounded research flow, and final synthesis
    • serve_assistant.py: Exposes the assistant through LangServe and generates a monitoring summary

    The lab also includes provided support code and prepared data:

    • support/: Provided helpers for retrieval loading, local embeddings, search simulation, request and response schemas, query analysis, tracing, and model selection
    • data/arxiv_corpus.json: A prepared local research corpus derived from real arXiv abstracts
    • data/arxiv_chunks.json, data/arxiv_vector_index.faiss, data/arxiv_vector_index.pkl, data/arxiv_vector_manifest.json: Prepared retrieval artifacts for the local vector store
    • data/live_search_results.json: Prepared recent-update records used by the search tool
    • data/sample_queries.json: A sample query set used for monitoring and observability checks
    • data_prep/: Reference corpus-preparation scripts for abstract collection, chunking, and vector-index preparation

    You may also notice models-bge-small-en-v1.5-onnx-q.part.* files in the workspace. Those are setup artifacts used to reconstruct the local embedding-model cache during environment startup, so they are not part of the lab work and can be ignored while you complete the tasks.

    The environment uses Python 3.10 with LangChain, FastAPI, LangServe, and LangSmith-ready tracing hooks. All dependencies are pre-installed in the lab environment.

    All commands in this lab assume your working directory is /home/ps-user/workspace.

    Understanding the Research Assistant Environment

    Before you build the assistant, you need to understand what is already provided and what remains your responsibility. The lab is intentionally scoped so you focus on LangChain application logic rather than unrelated infrastructure. The corpus has already been prepared. The retrieval layer, recent-update dataset, query-analysis helper, and request and response schemas are also provided. You are not asked to parse PDFs, build an indexing pipeline from scratch, create a crawler, or design a complex agent graph.

    What Is Pre-prepared for You

    For this lab, the local research corpus was prepared in advance so you can focus on LangChain tool use, orchestration, and deployment.

    For this lab, the recent-update search source is also simulated with prepared records so the assistant behavior stays reproducible.

    That means:

    • data/arxiv_corpus.json is a pre-prepared local research corpus built from real arXiv abstracts and indexed into a local vector store
    • data/live_search_results.json is a pre-prepared recent-update dataset used to simulate a search provider
    • support/retrieval_client.py and support/search_client.py are provided interfaces over those prepared datasets

    This lab keeps the retrieval setup simple by indexing one abstract per paper. If your own corpus includes full papers or longer documents, you can adjust the chunking and indexing strategy to fit that content and your retrieval goals.

    Reference corpus-preparation scripts are also included in data_prep/. They are not used by the lab runtime, but they are there if you want to inspect or extend the abstract collection, chunking, and vector-index preparation workflow.

    Your task is to wrap these provided interfaces as LangChain tools, not to build the corpus pipeline or search provider itself.

    Instead, your work centers on four practical concerns:

    Tooling: The assistant needs a small set of tools it can call deliberately. In this lab, those tools include local document retrieval, recent update search, and one calculator tool.

    Research Flow: The assistant should explain its plan, gather evidence, and only then synthesize a response. Multi-step behavior is useful for comparison queries and requests that require both local grounding and recent information.

    Safeguards: The assistant should not run unbounded. It needs clarification behavior for vague questions, limits on loops and tool calls, and a graceful way to return partial answers if some evidence is missing.

    Deployment and Observability: The final workflow should run through an API, return a consistent JSON response, and produce enough trace and monitoring information to analyze reliability and cost.

    Important Note 1: The automated checks in this lab use prepared data and a deterministic local chat model from support/models.py so your progress can be evaluated reproducibly. In your own environment outside the platform, if OPENAI_API_KEY is set, the completed assistant can run with ChatOpenAI instead. That means the live assistant can produce different wording, plan text, source ordering, and latency even when the application logic is correct.

    Important Note 2: Complete tasks in order. Each task builds on the previous one. Use the Feedback/Checks panel frequently to catch errors early.

    info > If you get stuck on a task, you can find solution files with the completed code for each task in the solution folder of your file tree.

  2. Challenge

    ### Step 2: Build the Research Assistant Foundation

    Research assistants are only as useful as the tools they can call and the boundaries those tools enforce. A good tool layer gives the model clear choices and produces normalized outputs that later synthesis code can trust. In this lab, your assistant works with two research sources and one calculator tool:

    Local Retrieval: This tool searches the prepared arXiv-style corpus for grounded background material. It is useful when the analyst needs durable research context, core methods, or papers related to a transformer topic.

    Recent Update Search: This tool looks up current-style updates from a prepared recent-results dataset. It is useful when the analyst needs the latest benchmark movement, product announcements, or public discussion around a topic.

    Calculator Tool: This tool handles arithmetic for research planning tasks, such as comparing window sizes or simple ratios, without relying on the language model to calculate mentally.

    The key design idea is that each tool should have a narrow purpose, predictable inputs, and a consistent output shape. This makes tool selection easier for the model and keeps the downstream assistant code simpler. In this lab, the function docstring attached to each @tool becomes the tool description the model sees when it decides which tool to call, so write it as part of the tool design rather than as a placeholder. Now that the assistant can retrieve local research material, add the remaining tools so the next step can plan and compare evidence across multiple sources.

    The corpus and search inputs for this lab are already prepared. In this step, you are wrapping the provided interfaces as LangChain tools rather than building the corpus pipeline, ranking logic, or search provider.
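    The design idea above can be sketched in plain Python. This is an illustrative stand-in, not the lab's actual code: in the lab you would use LangChain's `@tool` decorator, and the tool names and return shapes below are hypothetical.

```python
# Plain-Python sketch: each tool is a narrow function whose docstring
# doubles as the description the model reads when choosing a tool.
TOOLS = {}

def register_tool(fn):
    """Register a function as a tool, with its docstring as the description."""
    TOOLS[fn.__name__] = {"fn": fn, "description": fn.__doc__.strip()}
    return fn

@register_tool
def local_retrieval(query: str) -> list:
    """Search the prepared arXiv-derived corpus for grounded background material."""
    return [{"source": "local", "query": query}]  # stand-in for the real client

@register_tool
def calculator(a: float, b: float) -> float:
    """Return the ratio a / b, e.g. for comparing context-window sizes."""
    return a / b
```

    Because each tool has a narrow purpose and a predictable output shape, tool selection stays easy for the model: `TOOLS["calculator"]["fn"](8192, 2048)` returns `4.0`, and the description string is exactly what a model would see when deciding between tools.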

  3. Challenge

    ### Step 3: Implement Multi-step Research and Synthesis Logic

    A research assistant should not jump directly from the user question to the final answer. It should first decide how to approach the problem, then gather evidence, then synthesize a grounded response. In this lab, that process begins with a prompt that teaches the model how to behave, followed by a single-step tool-calling function, and then a bounded multi-step loop for more complex research questions.

    ### Understanding Tool Calling

    In LangChain, the model can request a tool by returning a tool call that names the tool and supplies arguments. Your code is responsible for turning that request into an actual tool invocation, capturing the result, and appending the tool result back into the message history. This step is essential: without it, the model can ask for tools, but the assistant never gathers any evidence.

    ### Understanding Multi-step Research and Synthesis

    A single tool call is often not enough for a practical research task. The assistant still needs a final synthesis step that turns gathered evidence into one grounded answer. In this lab, that means normalizing the collected tool outputs into source records, then asking the chat model to write the final answer from that evidence. Now that the assistant can turn gathered evidence into a grounded response, extend the research flow so comparison-style questions can gather evidence across more than one step before that final synthesis call.
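    The tool-calling handshake and the bounded loop can be sketched together in plain Python. This is an illustrative stand-in, not the lab's code: a real LangChain implementation would read `response.tool_calls` from the chat model and append `ToolMessage` objects, and the `model_step` callable and dict shapes here are hypothetical.

```python
# Plain-Python sketch: execute model-requested tool calls inside a
# bounded loop, then hand the gathered evidence to a synthesis step.
MAX_STEPS = 3  # illustrative cap; the lab's actual limit may differ

def bounded_research(query, model_step, tools):
    """Gather evidence for up to MAX_STEPS tool calls, then return it."""
    messages = [{"role": "user", "content": query}]
    evidence = []
    for _ in range(MAX_STEPS):
        step = model_step(messages)  # model decides: call a tool or stop
        if step["action"] == "done":
            break
        result = tools[step["tool"]](**step["args"])
        # Append the tool result so the model can see the evidence;
        # without this, the model asks for tools but never learns anything.
        messages.append({"role": "tool", "name": step["tool"], "content": result})
        evidence.append(result)
    # A real synthesis step would normalize `evidence` into source
    # records and ask the chat model for one grounded answer.
    return evidence

# Toy driver: the "model" requests two local lookups, then stops.
plan = iter([
    {"action": "call", "tool": "local", "args": {"query": "RoPE scaling"}},
    {"action": "call", "tool": "local", "args": {"query": "FlashAttention"}},
    {"action": "done"},
])
evidence = bounded_research(
    "Compare two long-context methods",
    lambda messages: next(plan),
    {"local": lambda query: {"source": "local", "query": query}},
)
```

    The cap on loop iterations is what keeps a comparison query from running unbounded, which is the same boundary the safeguards step makes explicit.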

  4. Challenge

    ### Step 4: Add Safeguards for Reliability and Cost

    An unconstrained assistant can become expensive, confusing, or brittle. For a research workflow, it is better to be explicit about boundaries. If the query is too vague, the assistant should ask for clarification instead of pretending it knows what the analyst meant. If a tool returns nothing, the assistant should surface that fact and continue carefully. If the loop runs too long, the assistant should stop and return the best partial answer it can support.

    In this lab, the safeguards are visible in both the response payload and the monitoring outputs. That makes them easier to reason about and easier to verify later in traces.
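    The boundary behaviors described above can be sketched in plain Python. The vagueness check, thresholds, and field names below are illustrative assumptions, not the lab's actual contract.

```python
# Plain-Python sketch of explicit safeguard boundaries: clarify vague
# queries, cap tool calls, and degrade gracefully to a partial answer.
def answer_with_safeguards(query, gather_evidence, max_tool_calls=4):
    """Return a structured result that makes safeguard decisions visible."""
    if len(query.split()) < 3:  # crude vagueness check, for illustration only
        return {"status": "needs_clarification",
                "message": "Could you say which topic or time range you mean?"}
    evidence, calls = [], 0
    for item in gather_evidence(query):
        if calls >= max_tool_calls:
            # Stop and return the best partial answer we can support.
            return {"status": "partial", "sources": evidence,
                    "warnings": ["tool-call limit reached"]}
        calls += 1
        if item:  # surface empty tool results instead of hiding them
            evidence.append(item)
    warnings = [] if evidence else ["no evidence found"]
    return {"status": "ok", "sources": evidence, "warnings": warnings}
```

    Because the safeguard decisions land in the response payload (`status`, `warnings`), they stay visible in traces and monitoring outputs rather than being silent internal behavior.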

  5. Challenge

    ### Step 5: Deploy and Observe the Assistant

    Once the assistant logic is in place, you expose it through an API so other applications can call it. In this lab, LangServe wraps the runnable and gives you a practical REST surface for invoking the assistant. A deployed assistant also needs observability. That means you should be able to review traces, summarize response time and tool usage, and identify common failure modes from a set of representative research queries. When you run the completed app in your own environment, you can also inspect the generated API documentation and schema through FastAPI's /docs and /openapi.json surfaces.
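    The kind of monitoring summary this step describes can be sketched in plain Python. The trace-record shape below is an assumption for illustration; the lab's actual monitoring report fields may differ.

```python
# Sketch: aggregate response time and tool usage from per-request
# trace records so common failure modes are easy to spot.
from collections import Counter
from statistics import mean

def summarize_runs(runs):
    """Aggregate latency, tool-usage, and error stats from trace records."""
    return {
        "requests": len(runs),
        "avg_latency_ms": round(mean(r["latency_ms"] for r in runs), 1),
        "tool_usage": dict(Counter(t for r in runs for t in r["tools"])),
        "error_rate": sum(r.get("error", False) for r in runs) / len(runs),
    }

runs = [
    {"latency_ms": 420, "tools": ["local_retrieval"], "error": False},
    {"latency_ms": 900, "tools": ["local_retrieval", "recent_search"], "error": False},
    {"latency_ms": 150, "tools": [], "error": True},
]
summary = summarize_runs(runs)
```

    A summary like this, computed over a representative query set, is enough to compare deterministic test runs against live runs where latency and token usage vary.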

    Understanding Deterministic Tests Versus Live Runs

    The automated checks use a deterministic local model and prepared data so your progress can be evaluated consistently. When you run the deployed assistant live with provider access, several things can look different:

    • the research_plan text may be longer or more conversational
    • the exact wording of the final answer can change from run to run
    • the source ordering can vary
    • live runs will reflect actual model latency and token usage instead of only deterministic local behavior

    Those differences are expected. The test suite is verifying the structure and behavior of your LangChain application. The live run is where you observe how the completed assistant behaves with external services.

    The same distinction applies to search. In the task work, you wrap the provided LocalSearchClient so the assistant has a stable, testable search-style tool. In a live deployment, the same LangChain tool pattern could wrap a real search API.

    The localhost API runs inside the lab machine and does not require public web access.

    ### Inspect a Live Trace in LangSmith

    After the monitoring report is working, you can inspect a live run in LangSmith from your own environment.

    This part is meant for a separate live setup with internet access and working provider credentials, NOT for the lab environment itself.

    In this project, support/tracing.py reads the tracing environment variables and writes the monitoring summary report. The response payload and monitoring report both use trace_enabled as a quick signal that tracing was enabled for that run.
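    As a rough sketch of the pattern described here, a tracing helper can derive `trace_enabled` from the environment. This is an assumption about the shape of the check, not the actual contents of support/tracing.py.

```python
# Sketch: report whether LangSmith tracing is switched on for this run,
# based on the LANGSMITH_TRACING environment variable.
import os

def tracing_enabled() -> bool:
    """True when LANGSMITH_TRACING is set to a truthy value."""
    return os.getenv("LANGSMITH_TRACING", "").strip().lower() in ("1", "true")
```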

    Set the following variables in the same terminal where you will start serve_assistant.py, or add them to your shell profile before opening a new terminal:

    export OPENAI_API_KEY="your-openai-key"
    export LANGSMITH_API_KEY="your-langsmith-key"
    export LANGSMITH_TRACING=true
    export LANGSMITH_PROJECT="your-project-name"
    

    If you need to set the LangSmith region explicitly, use:

    export LANGSMITH_ENDPOINT="https://api.smith.langchain.com"
    

    for North America, or:

    export LANGSMITH_ENDPOINT="https://eu.api.smith.langchain.com"
    

    for Europe.

    Then start the assistant:

    cd /home/ps-user/workspace/code
    python3 serve_assistant.py
    

    In another terminal, send one request:

    curl -s -X POST http://127.0.0.1:8000/research-assistant/invoke \
      -H "Content-Type: application/json" \
      --data '{"query":"What are the latest long-context transformer trends?"}'
    

    In the JSON response, trace_enabled should be true. In LangSmith, you should see a new trace in the your-project-name project for that assistant request, with the assistant/model call and any tool activity captured as child runs.

  6. Challenge

    ### Step 6: Conclusion

    In this lab, you have:

    • Built the Research Assistant Foundation: Wrapped the provided retrieval, search, and calculator capabilities as LangChain tools with bounded, normalized behavior.
    • Implemented Multi-step Research and Synthesis Logic: Taught the assistant to explain its plan, execute tool calls, gather evidence across steps, and return one grounded response.
    • Added Safeguards for Reliability and Cost: Enforced clarification-first handling, bounded loops, response warnings, and graceful degradation when evidence was incomplete.
    • Deployed the Assistant Through LangServe: Exposed the assistant through a practical API endpoint with a consistent JSON contract.
    • Observed and Analyzed Runs: Generated a monitoring report and prepared the assistant for live trace inspection in LangSmith.

    The automated tests in this lab validate your implementation with deterministic local behavior so the checks stay reproducible. The live API runs and LangSmith traces show how the completed assistant behaves with real provider access, where wording, source ordering, latency, and token usage can vary. Together, those two views give you both reliable implementation feedback and realistic runtime validation.

About the author

Nicolae has been a Software Engineer since 2013, focusing on Java and web stacks. Nicolae holds a degree in Computer Science and enjoys teaching, traveling and motorsports.
