Guided: Using Regular Expressions in Python

Unlock the power of regular expressions in Python with this hands-on lab designed to build your confidence and capability in text manipulation and data extraction. Whether you're parsing log files, validating user input, or cleaning messy datasets, mastering regex can streamline your workflow and enhance your code's precision. This lab walks you through the core syntax, including essential symbols like \d, \w, and \s, as well as anchors and quantifiers for targeted pattern matching. You'll gain practical experience using Python's match(), search(), and findall() functions to locate data, and learn how to validate common formats such as emails and dates. Finally, you'll manipulate text using powerful substitution and splitting techniques—essential tools for real-world tasks like analyzing CSV-style strings or cleaning logs. Perfect for developers, data analysts, and anyone looking to boost their pattern recognition skills, this lab makes regex approachable, applicable, and immediately useful.

Lab Info
Level: Beginner
Last updated: Sep 24, 2025
Duration: 2h 20m

Table of Contents
  1. Challenge

    Introduction

    Welcome to the Guided: Using Regular Expressions in Python Lab

    In this lab, you will be provided with an environment and step-by-step instructions to help you:

    • Create proper regular expressions with core regex syntax and patterns
    • Perform text search and extraction with Python's re module
    • Manipulate text through substitution and splitting with Python's re module
    • Apply regex for real world data validation and extraction

    Prerequisites

    You should have a basic understanding of Python, including how to write methods and instantiate variables. No prior experience with regular expressions is required.

    Throughout the lab, you will run Python commands in the Terminal window to test your task implementations. All commands should be run from the workspace directory and follow this structure:

    python3 regex_utils.py step<step_number> task<task_number> <text_to_perform_regex_on> <prefix_or_suffix_if_applicable>
    

    Tip: If you need assistance at any point, you can refer to the /solution directory. It contains subdirectories for each of the steps with example implementations.


  2. Challenge

    Text Search and Extraction

    Regular Expression Syntax

    A regular expression (regex or regexp) is a sequence of characters that defines a search pattern. It is used to match, locate, and manipulate text. Think of it as a search tool that can describe very complex text patterns.

    You typically use regular expressions in:

    • Searching for specific text within a document or a string.
    • Validating data inputs (like checking if an email address is properly formatted).
    • Replacing parts of text.
    • Splitting text into parts based on patterns.
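The four uses listed above can be sketched with Python's re module. This is a minimal illustration; the sample string and the patterns are made up for the example and are not taken from the lab tasks.

```python
import re

# Hypothetical sample text for illustration only.
text = "Order 66 shipped on 2025-09-24 to alice@example.com"

# 1. Searching: find the first run of digits anywhere in the string.
first_number = re.search(r"\d+", text).group()

# 2. Validating: check that a whole string has the shape YYYY-MM-DD.
is_date = re.fullmatch(r"\d{4}-\d{2}-\d{2}", "2025-09-24") is not None

# 3. Replacing: mask every digit with "#".
masked = re.sub(r"\d", "#", text)

# 4. Splitting: break the text apart on runs of whitespace.
words = re.split(r"\s+", text)

print(first_number)  # 66
print(is_date)       # True
print(masked)        # Order ## shipped on ####-##-## to alice@example.com
print(words)
```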

    Regular expressions follow a specific syntax built from symbols, anchors, quantifiers, special characters, lookaheads, and lookbehinds.

    Regular Expression Syntax Table Quick Reference
    | Symbol | Category | Meaning / Description |
    |---|---|---|
    | . | Wildcard | Matches any single character except newline |
    | \d | Character class | Matches any digit ([0-9] for ASCII text; on str patterns it also matches other Unicode digits unless re.ASCII is set) |
    | \D | Character class | Matches any non-digit |
    | \w | Character class | Matches any word character (letters, digits, underscore) |
    | \W | Character class | Matches any non-word character |
    | \s | Character class | Matches any whitespace (space, tab, newline) |
    | \S | Character class | Matches any non-whitespace character |
    | [...] | Character set | Matches any one character inside brackets |
    | [^...] | Negated set | Matches any character not inside brackets |
    | \| | Alternation | OR operator; matches either the left or right pattern |
    | () | Grouping | Groups expressions, enables capturing or combining parts |
    | (?:...) | Non-capturing group | Groups pattern but doesn't capture it |
    | (?P<name>...) | Named group | Captures a group with a name |
    | \b | Anchor | Word boundary (between word and non-word character) |
    | \B | Anchor | Not a word boundary |
    | ^ | Anchor | Matches the start of the string (or line with multiline flag) |
    | $ | Anchor | Matches the end of the string (or line with multiline flag) |
    | * | Quantifier | Matches 0 or more repetitions |
    | + | Quantifier | Matches 1 or more repetitions |
    | ? | Quantifier | Matches 0 or 1 (makes preceding token optional) |
    | {n} | Quantifier | Matches exactly n repetitions |
    | {n,} | Quantifier | Matches n or more repetitions |
    | {n,m} | Quantifier | Matches between n and m repetitions |
    | ? after quant. | Lazy modifier | Makes quantifier non-greedy (match as little as possible) |
    | (?=...) | Lookahead (positive) | Match if followed by pattern (doesn't include it in result) |
    | (?!...) | Lookahead (negative) | Match if not followed by pattern |
    | (?<=...) | Lookbehind (positive) | Match if preceded by pattern |
    | (?<!...) | Lookbehind (negative) | Match if not preceded by pattern |
    | \ | Escape | Escapes a special character (e.g., \. matches a literal dot) |
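A few of the syntax elements from the quick reference can be tried out directly. The snippet below is a small sketch; the sample strings are invented for illustration.

```python
import re

# Hypothetical sample text for illustration only.
sample = "cat bat rat 42 cats price: $19.99"

# Character set + word boundaries: whole words "cat", "bat", or "rat".
at_words = re.findall(r"\b[cbr]at\b", sample)

# Quantifiers + escaped dot: a number with exactly two decimal places.
decimals = re.findall(r"\d+\.\d{2}", sample)

# Positive lookbehind: digits that directly follow a literal dollar sign.
after_dollar = re.findall(r"(?<=\$)\d+", sample)

# Greedy vs lazy quantifiers on the same input.
html = "<b>bold</b><i>italic</i>"
greedy = re.findall(r"<.+>", html)
lazy = re.findall(r"<.+?>", html)

# Anchors + alternation inside a non-capturing group.
is_yes_or_no = re.search(r"^(?:yes|no)$", "yes") is not None

print(at_words)      # ['cat', 'bat', 'rat']
print(decimals)      # ['19.99']
print(after_dollar)  # ['19']
print(greedy)        # ['<b>bold</b><i>italic</i>']
print(lazy)          # ['<b>', '</b>', '<i>', '</i>']
print(is_yes_or_no)  # True
```

Note how "cats" is skipped by the first pattern: the trailing \b requires a word boundary right after "at", and the "s" prevents one.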


    Python's re Module

    Python's re module contains several helpful methods used for completing text search, extraction, and manipulation with the help of regex.

    `re` Module Methods Quick Reference
    | Method | Purpose | Parameters | Returns | Notes |
    |---|---|---|---|---|
    | match() | Match pattern at the start of string | pattern, string, flags=0 | Match object or None | Good for "does this string start with..." |
    | search() | Search anywhere in string | pattern, string, flags=0 | Match object or None | Finds first occurrence |
    | fullmatch() | Match the entire string | pattern, string, flags=0 | Match object or None | Use for strict validation |
    | findall() | Find all non-overlapping matches | pattern, string, flags=0 | List of strings or tuples | Use finditer() for match objects instead |
    | finditer() | Iterate over all matches as objects | pattern, string, flags=0 | Iterator of Match objects | Useful for position info, grouping, etc. |
    | sub() | Replace pattern with replacement string | pattern, repl, string, count=0, flags=0 | New string | Use for find-and-replace |
    | subn() | Like sub(), but also returns count | pattern, repl, string, count=0, flags=0 | Tuple: (string, count) | Great for auditing replacements |
    | split() | Split string by pattern | pattern, string, maxsplit=0, flags=0 | List of strings | Splits on a pattern rather than a fixed separator like str.split() |
    | compile() | Compile pattern for reuse | pattern, flags=0 | Compiled pattern object | Convenient when the same pattern is used repeatedly |
    | escape() | Escape special regex chars in input | string | Escaped string | Use when inserting user input into regex safely |
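The differences between the search-oriented functions are easiest to see side by side. The following is a minimal sketch using an invented log string; it is not one of the lab's task inputs.

```python
import re

# Hypothetical sample text for illustration only.
log = "ERROR 404 at 10:15, ERROR 500 at 10:20"

# match() only succeeds at the very start of the string.
starts_with_error = re.match(r"ERROR", log) is not None
starts_with_code = re.match(r"\d{3}", log) is not None

# search() scans forward and returns the first match anywhere.
first_code = re.search(r"\d{3}", log).group()

# fullmatch() requires the pattern to consume the entire string.
exact = re.fullmatch(r"\d{3}", "404") is not None
with_trailing_space = re.fullmatch(r"\d{3}", "404 ") is not None

# findall() returns strings for zero or one group, tuples for several groups.
codes = re.findall(r"ERROR (\d{3})", log)
pairs = re.findall(r"ERROR (\d{3}) at (\d{2}:\d{2})", log)

# finditer() yields match objects, so positions are available.
positions = [m.start() for m in re.finditer(r"ERROR", log)]

# compile() builds a reusable pattern object with the same methods.
code_pattern = re.compile(r"\d{3}")
compiled_codes = code_pattern.findall(log)

# escape() makes special characters literal: "a.b" no longer matches "axb".
dot_is_literal = re.search(re.escape("a.b"), "axb") is None

print(starts_with_error, starts_with_code)  # True False
print(first_code)                           # 404
print(exact, with_trailing_space)           # True False
print(codes)                                # ['404', '500']
print(pairs)                                # [('404', '10:15'), ('500', '10:20')]
print(positions)                            # [0, 20]
```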


    Text Search and Extraction with Regular Expressions

    In the upcoming tasks, you will use Python's re module to search for pieces of text that match what you are looking for. This will require writing regular expressions with core regular expression syntax such as symbols, anchors, and quantifiers.

    Tip: In Python, regex patterns are usually written as raw strings by prefixing them with r, like r"\d+", so that backslashes are treated correctly.
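To see why the raw-string prefix matters, compare the same pattern written with and without it. In an ordinary Python string, "\b" is the backspace character, so the regex engine never sees a word-boundary token. The strings below are illustrative only.

```python
import re

plain = "\bcat\b"   # Python turns each \b into a backspace character
raw = r"\bcat\b"    # backslashes reach the regex engine unchanged

print(len(plain))  # 5: backspace, c, a, t, backspace
print(len(raw))    # 7: \, b, c, a, t, \, b

raw_matches = re.search(raw, "a cat sat") is not None
plain_matches = re.search(plain, "a cat sat") is not None
print(raw_matches)    # True: "cat" sits between word boundaries
print(plain_matches)  # False: the text contains no backspace characters

# Doubling every backslash is the non-raw equivalent.
same_pattern = "\\d+" == r"\d+"
print(same_pattern)   # True
```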


  3. Challenge

    Text Manipulation

    Match Objects

    A match object is a special object returned by some re module methods. It provides useful information about the match, such as the matched text, its position in the original string, and any captured groups.

    Match Object Methods & Properties Quick Reference

    | Property / Method | Description |
    |---|---|
    | .group() | Returns the entire match (or a specific group if passed an index or name) |
    | .groups() | Returns a tuple of all captured groups (named groups included) |
    | .groupdict() | Returns a dictionary of all named capturing groups |
    | .start() | Returns the start index of the match |
    | .end() | Returns the end index (1 past the last character) of the match |
    | .span() | Returns a tuple (start, end) representing the range of the match |
    | .pos | The starting position of the search within the string |
    | .endpos | The ending position (limit) of the search |
    | .re | The regular expression object used for the match |
    | .string | The original string passed to re.search() or similar |
    | .lastgroup | The name of the last matched capturing group (None if it has no name) |
    | .lastindex | The index of the last matched capturing group (by number) |
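The match object attributes can be explored on a single search result. This is a small sketch; the contact string and the group names first, last, and email are invented for the example.

```python
import re

# Hypothetical sample text for illustration only.
text = "Contact: Ada Lovelace <ada@example.com>"
m = re.search(r"(?P<first>\w+) (?P<last>\w+) <(?P<email>[^>]+)>", text)

print(m.group())          # Ada Lovelace <ada@example.com>
print(m.group(1))         # Ada
print(m.group("email"))   # ada@example.com
print(m.groups())         # ('Ada', 'Lovelace', 'ada@example.com')
print(m.groupdict())      # {'first': 'Ada', 'last': 'Lovelace', 'email': 'ada@example.com'}
print(m.span())           # (9, 39)
print(m.start("email"))   # 23
print(m.lastgroup)        # email
print(m.lastindex)        # 3

# Slicing the original string with span() reproduces the matched text.
start, end = m.span()
same_text = m.string[start:end] == m.group()
print(same_text)          # True
```

Note that groups() returns every capturing group in order, whether or not it has a name, while groupdict() returns only the named ones.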

    Text Manipulation with Regular Expressions

    In the upcoming tasks, you will use Python’s re module to manipulate text by substituting specific patterns with new text and splitting text based on defined patterns. You may also work with match objects to extract additional information about matches.
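Substitution and splitting can be sketched as follows. The inputs are invented for illustration and are not the lab's task data.

```python
import re

# split() on a comma, semicolon, or pipe, absorbing surrounding whitespace.
messy = "apple,  banana;cherry | date"
fruits = re.split(r"\s*[,;|]\s*", messy)

# maxsplit limits how many splits are made.
key, value = re.split(r"\s*=\s*", "name = a = b", maxsplit=1)

# sub() with backreferences: reorder YYYY-MM-DD into DD/MM/YYYY.
reordered = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", "Released 2025-09-24")

# sub() with a function: the callable receives each match object.
titled = re.sub(r"\b[a-z]", lambda m: m.group().upper(), "hello big world")

# subn() also reports how many replacements were made.
collapsed, count = re.subn(r"\s{2,}", " ", "too   many    spaces")

print(fruits)            # ['apple', 'banana', 'cherry', 'date']
print(key, "|", value)   # name | a = b
print(reordered)         # Released 24/09/2025
print(titled)            # Hello Big World
print(collapsed, count)  # too many spaces 2
```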


  4. Challenge

    Real World Examples

    In this step, you will apply what you've learned about core regex syntax, Python’s re module, and match objects to solve real-world problems using regular expressions.
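As one illustration of combining syntax, the re module, and match objects, the sketch below parses log lines with named groups. The log format, the LOG_LINE pattern, and the parse_line helper are all assumptions made for this example; the lab's own tasks may use different formats.

```python
import re

# Assumed log format: "YYYY-MM-DD HH:MM:SS [LEVEL] message".
LOG_LINE = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}) "
    r"\[(?P<level>[A-Z]+)\] "
    r"(?P<message>.*)"
)

def parse_line(line):
    """Return a dict of fields, or None if the line does not fit the assumed format."""
    m = LOG_LINE.fullmatch(line)
    return m.groupdict() if m else None

# Hypothetical log lines for illustration only.
lines = [
    "2025-09-24 10:15:00 [ERROR] Disk full on /dev/sda1",
    "2025-09-24 10:16:30 [INFO] Cleanup finished",
    "not a log line",
]

parsed = [parse_line(line) for line in lines]
errors = [p["message"] for p in parsed if p and p["level"] == "ERROR"]

print(parsed[1]["time"])  # 10:16:30
print(parsed[2])          # None
print(errors)             # ['Disk full on /dev/sda1']
```

Using fullmatch() rather than search() means a line with unexpected leading text is rejected instead of being partially parsed.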

    Real-world regex skills are powerful for validating, extracting, and cleaning data across countless applications!
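Validation, extraction, and cleaning might look like the sketch below. The EMAIL and DATE patterns are deliberately simplified assumptions: real email addresses allow far more variety, and the date pattern checks shape and ranges only, so it accepts impossible dates such as 2025-02-30.

```python
import re

# Simplified, illustrative patterns (assumptions, not complete validators).
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
DATE = re.compile(r"(\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])")

def is_email(value):
    """True if the whole string fits the simplified email shape."""
    return EMAIL.fullmatch(value) is not None

def is_date(value):
    """True if the whole string looks like YYYY-MM-DD with plausible ranges."""
    return DATE.fullmatch(value) is not None

# Validating: fullmatch() rejects partial matches.
print(is_email("ada@example.com"))  # True
print(is_email("ada@example"))      # False: no dot-separated domain part
print(is_date("2025-09-24"))        # True
print(is_date("2025-13-01"))        # False: month 13 is out of range

# Extracting: findall() pulls every address out of free text.
text = "Write to ada@example.com or grace.hopper@navy.example.org today."
found = EMAIL.findall(text)
print(found)

# Cleaning: split a CSV-style row and drop stray whitespace around fields.
row = " 42 ,Ada ,  London "
fields = re.split(r"\s*,\s*", row.strip())
print(fields)  # ['42', 'Ada', 'London']
```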

