Course info
Oct 31, 2011
3h 29m

Using regular expressions in .NET to process text.

About the author

Dan is an independent consultant, author, and speaker. He likes data: pointy data, rectangular data, even data just lying around on the floor. He is a co-author of the book "A Developer's Guide to SQL Server 2005". His articles have been published in MSDN Magazine and SQL Server Magazine, and he has spoken at WinDev, at Microsoft events, and to various developer groups.

Section Introduction Transcripts

Matching the Whole String
Hello, I'm Dan Sullivan from Pluralsight and I'll be presenting this module on using regular expressions to match a whole string. In this video, we are going to expand on how regular expressions find matches in a subject string. In the previous video, we saw that one of the key operational features of regular expressions is that they always match the leftmost sequence of matching characters they can find in a string, even when the pattern can be found more than once in the subject. In this video, we will see that regular expressions can be used iteratively to find more than one match of a pattern in a subject. We've already seen some of the special characters in regular expressions: parentheses for grouping, and the Kleene star. In this video, we will be looking at more special characters, the features they provide, and how to use them as literal characters. It's pretty common to say "I want to match any capital letter", and we can do that using alternation. Character classes are a more efficient way to do that kind of thing, and we will be looking at them. The regular expression engine does its matching one character at a time. Backtracking is something the regular expression engine has to do when it gets partway through a match and fails. It backtracks to an earlier point in the match and sees if there is another way it can interpret the pattern to produce a successful match. Backtracking can have a big impact on performance and resource usage, and we will be looking at how it works and the things you can do to minimize its impact. So let's get started.
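The ideas in this introduction can be sketched in a few lines of C#. This is a minimal illustration rather than code from the course: the subject strings and patterns are made up, and the atomic group at the end is only one possible way to limit backtracking, not necessarily the one the module teaches.

```csharp
using System;
using System.Text.RegularExpressions;

string subject = "cat hat Cat bat";

// A single Match call returns only the leftmost match in the subject.
Match first = Regex.Match(subject, "[a-z]at");
Console.WriteLine($"first: {first.Value} at index {first.Index}");

// NextMatch resumes the scan after the previous match,
// so a loop visits every match in turn.
int matchCount = 0;
for (Match m = first; m.Success; m = m.NextMatch())
{
    matchCount++;
    Console.WriteLine($"match {matchCount}: {m.Value} at index {m.Index}");
}

// Alternation and a character class accept the same single letters here;
// the class states the intent more compactly.
bool viaAlternation = Regex.IsMatch("B", "^(A|B|C)$");
bool viaClass = Regex.IsMatch("B", "^[A-C]$");

// A backslash turns a special character such as * into a literal.
string literalStar = Regex.Match("2*3", @"\*").Value;

// Greedy .* first consumes the rest of the subject, then backtracks one
// character at a time until the trailing c can match.
string backtracked = Regex.Match("abcbd", "a.*c").Value;

// An atomic group (?>...) never gives characters back, so the engine
// cannot backtrack into it and this pattern finds no match.
bool atomicMatches = Regex.IsMatch("abcbd", "a(?>.*)c");
Console.WriteLine($"{backtracked} / atomic group matched: {atomicMatches}");
```

The atomic group trades a match for predictability: it is useful only when giving characters back could never lead to a match you want.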

Regex Groups and Captures
A group is a part of a regular expression pattern and also a part of the subject the pattern matches. In this video, we will be looking at groups in a pattern and how they capture text from a subject. Regular expressions can be hard to read and understand, just because of their format. So is a C# program if it is written as a single line of text. We will be looking at regular expression features that let us format our pattern and make it easier to read and understand, much like we format our C# programs for the same reason. And lastly, we will be looking at making back references to groups. This lets us, in effect, dynamically adapt a regular expression pattern to the content of the subject it is matching. So let's get started.
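As a rough sketch of these three ideas (groups and their captures, formatted patterns, and back references), assuming sample subjects and group names (`year`, `month`, `day`, `word`) that are invented here for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

// IgnorePatternWhitespace lets a pattern span several lines and carry
// # comments, much like formatted source code.
string datePattern = @"
    (?<year>\d{4})    # four-digit year
    -
    (?<month>\d{2})   # two-digit month
    -
    (?<day>\d{2})     # two-digit day
";
Match date = Regex.Match("Released 2011-10-31.", datePattern,
                         RegexOptions.IgnorePatternWhitespace);
string year = date.Groups["year"].Value;
string month = date.Groups["month"].Value;
string day = date.Groups["day"].Value;
Console.WriteLine($"{year} / {month} / {day}");

// A group inside a repetition captures once per iteration. Group.Value
// holds the last capture; Group.Captures keeps all of them.
Match digits = Regex.Match("123", @"(\d)+");
string lastCapture = digits.Groups[1].Value;
int captureCount = digits.Groups[1].Captures.Count;
Console.WriteLine($"last capture {lastCapture}, {captureCount} captures");

// A back reference (\k<word>) must match the same text the named group
// captured, so the pattern adapts to the subject: here it finds a
// doubled word.
Match doubled = Regex.Match("Paris in the the spring",
                            @"\b(?<word>\w+)\s+\k<word>\b");
string doubledText = doubled.Value;
string repeatedWord = doubled.Groups["word"].Value;
Console.WriteLine($"doubled word: {repeatedWord}");
```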

Fine Tuning Regular Expression Matches
Hello, I'm Dan Sullivan from Pluralsight, and I'll be presenting this video on fine tuning the matches made by regular expressions.

Conditional Expressions
Hello, I'm Dan Sullivan and I'll be presenting this video on conditionals, that is, adapting the content of a regular expression based on the content of the subject the regular expression is matching. A regular expression can, of course, contain some alternatives. The alternatives we have seen so far act like an or statement, in that they are successful matches if any of the alternatives they enumerate match the subject. The first two kinds of conditionals we will look at use alternatives, but in a different way than we have seen so far. These conditionals have a predicate, that is, a test, and just two alternatives. The predicate tests a part of the subject the regular expression is trying to match. Based on the result of the test, just one of the alternatives is chosen to match the subject. You can think of a conditional expression as a way to dynamically adapt a regular expression to the content of a subject. The first of the conditionals we will look at is called a lookaround, because it looks around a particular point in the subject to see what immediately precedes or follows that point. Based on what it sees, it selects one of two alternatives to use to match what follows the point. The if conditional, which, by the way, does not use the word if, is a general purpose if statement. The predicate associated with the if statement can be either a regular expression or the name of a group in the overall regular expression. The predicate is true if the regular expression found a match or if the group used made a capture. Based on the result of the predicate, one of the two alternatives is used. The last conditional we will look at is not technically a regular expression, in that it is not an alternation, concatenation, or repetition rule. Nonetheless, it is a very useful enhancement to regular expressions and is called a balancing group. It makes it possible for a regular expression to deal with hierarchies. So let's get started.
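The constructs named here can be sketched as follows. The patterns and subjects are assumptions chosen for illustration, not examples from the course; the group name `open` is likewise hypothetical. The syntax used is the .NET form: `(?=...)` and `(?<=...)` for lookaround, `(?(test)yes|no)` for the if conditional, and `(?<-name>...)` for a balancing group.

```csharp
using System;
using System.Text.RegularExpressions;

// Lookahead: match digits only when a % follows, without consuming the %.
string percent = Regex.Match("50% of 200", @"\d+(?=%)").Value;

// Lookbehind: match digits only when a $ immediately precedes them.
string dollars = Regex.Match("cost $25", @"(?<=\$)\d+").Value;

// If conditional with a regular expression as the predicate: when the next
// character is a digit, require three digits; otherwise require letters.
var digitsOrLetters = new Regex(@"^(?(?=\d)\d{3}|[a-z]+)$");
bool allDigits = digitsOrLetters.IsMatch("123");
bool allLetters = digitsOrLetters.IsMatch("abc");
bool mixed = digitsOrLetters.IsMatch("12a");

// If conditional with a group name as the predicate: a closing parenthesis
// is required only if the group named open captured an opening one.
var areaCode = new Regex(@"^(?<open>\()?\d{3}(?(open)\))$");
bool bare = areaCode.IsMatch("555");
bool wrapped = areaCode.IsMatch("(555)");
bool unclosed = areaCode.IsMatch("(555");

// Balancing group: each ( pushes a capture onto open, each ) pops one.
// The final conditional fails the match if any ( is left unclosed.
var balanced = new Regex(@"^(?:[^()]|(?<open>\()|(?<-open>\)))*(?(open)(?!))$");
bool nestedOk = balanced.IsMatch("(a(b)c)");
bool missingClose = balanced.IsMatch("(a(b)c");
bool wrongOrder = balanced.IsMatch("a)b(");

Console.WriteLine($"{percent} {dollars} {nestedOk} {missingClose} {wrongOrder}");
```

In the balancing pattern, `(?!)` is an empty negative lookahead that can never succeed, which is a common way to force a failure inside a conditional.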

Applying Regex Class
Hello, I'm Dan Sullivan and I'll be presenting this video on using the programming features of the Regex class. In classic regular expressions, all of the features of the regular expression engine are accessed through the regular expression syntax. The Regex class has a number of methods associated with it. Some of these methods duplicate features that are available via the syntax, but other features can be accessed only through these methods. There are a number of regular expression modifiers that can be applied through Regex methods as well as through the syntax. Regular expressions can be compiled, and when they are, matching is typically faster. Many of the methods have both static and instance versions. When a regular expression is used just to validate the format of the subject, you can skip the overhead of saving the captures that are made in the process of doing the match. Backtracking can have a big impact on performance. Sometimes reversing the direction in which the regular expression engine scans can reduce that impact. Comparing strings from different sources can be problematic, because the culture a string comes from can affect the character codes used to represent upper and lower case. The Regex class supports ECMAScript (ECMA) regular expressions. The Regex class supports both the type of replacements classic regular expressions do and replacements generated by code that you write. The Regex class supports splitting strings based on a delimiter that is defined by a regular expression. Some characters have special meaning when used in a pattern. To disable the special meaning, you have to escape these characters. The Regex class has special methods to escape and unescape a pattern.
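A short tour of the Regex class features listed above might look like the sketch below. All subjects and patterns are invented for illustration, and mapping each sentence of the introduction to a particular `RegexOptions` value (for example, "skip saving captures" to `ExplicitCapture`) is an assumption, not something the transcript states.

```csharp
using System;
using System.Text.RegularExpressions;

// The same modifier applied through an option and through inline syntax.
bool viaOption = Regex.IsMatch("Regex", "^regex$", RegexOptions.IgnoreCase);
bool viaSyntax = Regex.IsMatch("Regex", "(?i)^regex$");

// An instance can be compiled once and reused. ExplicitCapture stops
// unnamed parentheses from saving captures, which suits pure validation.
var phone = new Regex(@"^(\d{3})-(\d{4})$",
                      RegexOptions.Compiled | RegexOptions.ExplicitCapture);
bool phoneOk = phone.IsMatch("555-1234");
int groupCount = phone.Match("555-1234").Groups.Count; // only group 0 remains

// RightToLeft starts scanning at the end of the subject.
string lastNumber = Regex.Match("a1 b22 c333", @"\d+",
                                RegexOptions.RightToLeft).Value;

// CultureInvariant keeps case-insensitive matching independent of the
// current culture's casing rules.
bool invariant = Regex.IsMatch("FILE", "file",
    RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

// Under ECMAScript rules \d means [0-9]; by default it also matches other
// Unicode decimal digits such as U+0663 (Arabic-Indic digit three).
bool unicodeDigit = Regex.IsMatch("\u0663", @"\d");
bool ecmaDigit = Regex.IsMatch("\u0663", @"\d", RegexOptions.ECMAScript);

// Replacement with a substitution string, and with code (a MatchEvaluator).
string reordered = Regex.Replace("2011-10-31",
                                 @"(\d{4})-(\d{2})-(\d{2})", "$2/$3/$1");
string doubledNumbers = Regex.Replace("3 apples and 5 pears", @"\d+",
    m => (int.Parse(m.Value) * 2).ToString());

// Splitting on a delimiter defined by a pattern.
string[] parts = Regex.Split("a, b;c ,d", @"\s*[,;]\s*");

// Escape makes special characters literal; Unescape reverses it.
string escaped = Regex.Escape("1+1=2?");
string unescaped = Regex.Unescape(escaped);

Console.WriteLine($"{lastNumber} | {reordered} | {doubledNumbers} | {escaped}");
```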