Chris Behrens

Streamlining LINQ Code with Let

  • Nov 29, 2018
  • 6 Min read
  • 2,130 Views
C#
LINQ

The Problem This Solves

Sometimes in our queries, we must transform column values with the same expression over and over again. As we compose the query, we feel the urge to copy and paste, which we have been trained to recognize as the wrong impulse. LINQ offers us a way to name such an expression once and then refer to it elsewhere in the query without having to copy and paste.

Consider the following query, which selects the customers whose email address is not at the "email.com" domain and returns the customer count for each remaining domain:

```csharp
from c in customers
where c.Email.Substring(c.Email.IndexOf("@") + 1) != "email.com"
group c by c.Email.Substring(c.Email.IndexOf("@") + 1) into g
select new { Total = g.Count(), Domain = g.Key };
```

We're using our domain string-parsing logic in two places: once to filter out the undesired domain and once to group our customers together. Duplicated code is vulnerable to defects, because a fix applied to one copy is easily missed in the other, and is best refactored to a common block. The let statement in LINQ allows us to do just that.
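
As a minimal sketch of where this is heading, the domain query can be written with the parsing expression named once. The `Customer` class, the `DomainCounts` wrapper, and the sample addresses are assumptions made up for illustration; only the `Email` property comes from the query above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the customer type; only Email is used by the query.
public class Customer
{
    public string Email { get; set; } = "";
}

public static class DomainCounts
{
    // Counts customers per email domain, skipping the "email.com" domain.
    public static List<KeyValuePair<string, int>> Count(IEnumerable<Customer> customers)
    {
        var query =
            from c in customers
            // Parse the domain once and give it a name.
            let domain = c.Email.Substring(c.Email.IndexOf("@") + 1)
            where domain != "email.com"
            group c by domain into g
            select new KeyValuePair<string, int>(g.Key, g.Count());

        return query.ToList();
    }

    public static void Main()
    {
        // Made-up sample data.
        var customers = new List<Customer>
        {
            new Customer { Email = "ann@email.com" },
            new Customer { Email = "bob@example.org" },
            new Customer { Email = "cat@example.org" },
            new Customer { Email = "dan@sample.net" }
        };

        foreach (var pair in Count(customers))
        {
            Console.WriteLine($"{pair.Key}: {pair.Value}");
        }
    }
}
```

With this sample data, the two "example.org" customers are grouped together and the "email.com" customer is excluded. The parsing logic now lives in one place, so a change to how the domain is extracted only has to be made once.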

Our Scenario

We've been tasked with writing a new query that returns the set of employees that belong to one of two Locations, and then orders them by location. The resulting query looks like this:

```csharp
from e in employees
where e.GroupCode.ToUpper().Substring(0, 2) == "AA"
    || e.GroupCode.ToUpper().Substring(0, 2) == "BB"
orderby e.GroupCode.ToUpper().Substring(0, 2)
select e;
```

This repeating clause:

```csharp
e.GroupCode.ToUpper().Substring(0, 2)
```

represents Location, the building where the employee is physically located. It would be nice if this were broken out separately from the other fields but, unfortunately, we must do that ourselves in code. Looking at this, we consider that it would be nice to streamline and centralize this code.

After a little research, we find that this is what the _let_ keyword in LINQ is for. Let allows you to define a variable which contains the expression you want and use it as if it were any other field. We can take our query from before and streamline it to look like this:

```csharp
from e in employees
let location = e.GroupCode.ToUpper().Substring(0, 2)
where location == "AA"
    || location == "BB"
orderby location
select e;
```

Now, the location variable holds the result of the GroupCode string manipulation expression for each employee and, when we refer to it on subsequent lines, the query behaves as if the expression were written there. This query returns the same results as the previous one.
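
For readers who prefer method syntax, it can help to see roughly what let does behind the scenes: each element is projected into an intermediate object that carries both the original item and the computed value. The sketch below puts the let query next to an approximate hand-written equivalent. The `Employee` class, its `Name` property, the `LocationQueries` wrapper, and the sample group codes are all assumptions for illustration; only `GroupCode` appears in the queries above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical employee type; Name is added only to make the output readable.
public class Employee
{
    public string Name { get; set; } = "";
    public string GroupCode { get; set; } = "";
}

public static class LocationQueries
{
    // Query syntax with let, as in the streamlined query.
    public static List<Employee> WithLet(IEnumerable<Employee> employees)
    {
        return (from e in employees
                let location = e.GroupCode.ToUpper().Substring(0, 2)
                where location == "AA" || location == "BB"
                orderby location
                select e).ToList();
    }

    // Approximate method-syntax equivalent: the first Select carries the
    // computed location alongside each employee.
    public static List<Employee> WithSelect(IEnumerable<Employee> employees)
    {
        return employees
            .Select(e => new { e, location = e.GroupCode.ToUpper().Substring(0, 2) })
            .Where(x => x.location == "AA" || x.location == "BB")
            .OrderBy(x => x.location)
            .Select(x => x.e)
            .ToList();
    }

    public static void Main()
    {
        // Made-up sample data.
        var employees = new List<Employee>
        {
            new Employee { Name = "Dana", GroupCode = "bb-104" },
            new Employee { Name = "Eli", GroupCode = "CC-220" },
            new Employee { Name = "Fay", GroupCode = "AA-017" }
        };

        foreach (var e in WithLet(employees))
        {
            Console.WriteLine($"{e.Name} ({e.GroupCode})");
        }
    }
}
```

With this sample data, both methods return Fay followed by Dana. In LINQ to Objects, the location expression is evaluated once per employee and then reused by the where and orderby clauses.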

Chained Lets

Let statements can also be chained, with each one naming a piece of the where condition. Consider this list of animals:

```csharp
var animals = new List<Animal>
{
    new Animal {Name = "Whale", Class = "Mammal", Location = "Ocean"},
    new Animal {Name = "Bear", Class = "Mammal", Location = "Forest"},
    new Animal {Name = "Hawk", Class = "Bird", Location = "Forest"},
    new Animal {Name = "Tuna", Class = "Fish", Location = "Ocean"}
};
```

Let's say we want to return a list of animals that:

  1. Aren't scary
  2. Are among our favorite animals
  3. Live in the forest

The following query will return those results:

```csharp
from a in animals
where !(new List<string> { "Bear", "Shark" }.Contains(a.Name))
    && new List<string> { "Hawk", "Shark" }.Contains(a.Name)
    && a.Location == "Forest"
select a;
```

This is a little unclear because the purpose of the two lists above ("Bear" and "Shark" in one, "Hawk" and "Shark" in the other) is ambiguous. The developer may have known which list held the favorites and which held the scary animals (and what the relationship between those two lists was) at the time of writing but, looking at it a week later, it's difficult to know.

Put in plain English, the above query says:

Give me the animals from the animal list such that this list: "Bear", "Shark" doesn't contain the name, and this list: "Hawk", "Shark" does contain the name, and the location is "Forest".

There's no duplication here for the let statement to resolve, but we _can_ make everything much clearer by using it:

```csharp
var results = from a in animals
              let favoriteAnimals = new List<string> { "Hawk", "Shark" }
              let scaryAnimals = new List<string> { "Bear", "Shark" }
              let isFavorite = favoriteAnimals.Contains(a.Name)
              let isScary = scaryAnimals.Contains(a.Name)
              let livesInTheWoods = a.Location == "Forest"
              where (!isScary && isFavorite && livesInTheWoods)
              select a;
```

Here, we define our lists, favoriteAnimals and scaryAnimals, inline in the query. We then create two boolean variables that reflect whether a given animal’s name is in each list. Then we add a livesInTheWoods variable that checks the Location, and we combine these three conditions in the where clause.
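
To try the chained query end to end, something like the following self-contained program should work. The `Animal` class is not shown in the article, so its shape here is inferred from the object initializers; the `AnimalQueries` wrapper and its method names are made up for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of Animal, inferred from the initializers in the article.
public class Animal
{
    public string Name { get; set; } = "";
    public string Class { get; set; } = "";
    public string Location { get; set; } = "";
}

public static class AnimalQueries
{
    // The sample list from the article.
    public static List<Animal> SampleAnimals()
    {
        return new List<Animal>
        {
            new Animal { Name = "Whale", Class = "Mammal", Location = "Ocean" },
            new Animal { Name = "Bear", Class = "Mammal", Location = "Forest" },
            new Animal { Name = "Hawk", Class = "Bird", Location = "Forest" },
            new Animal { Name = "Tuna", Class = "Fish", Location = "Ocean" }
        };
    }

    // Names of favorite animals that are not scary and live in the forest.
    public static List<string> FriendlyForestFavorites(IEnumerable<Animal> animals)
    {
        var results = from a in animals
                      let favoriteAnimals = new List<string> { "Hawk", "Shark" }
                      let scaryAnimals = new List<string> { "Bear", "Shark" }
                      let isFavorite = favoriteAnimals.Contains(a.Name)
                      let isScary = scaryAnimals.Contains(a.Name)
                      let livesInTheWoods = a.Location == "Forest"
                      where !isScary && isFavorite && livesInTheWoods
                      select a.Name;

        return results.ToList();
    }

    public static void Main()
    {
        // Prints "Hawk" for the sample list.
        Console.WriteLine(string.Join(", ", FriendlyForestFavorites(SampleAnimals())));
    }
}
```

One design point to be aware of: in LINQ to Objects, each let expression is evaluated for every element, so the two lists here are rebuilt once per animal. For a four-item list that cost is negligible; for larger inputs, declaring the lists as ordinary variables before the query is a reasonable alternative that keeps the same named conditions.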

The result of both of these queries is Hawk, my only favorite animal that I'm not scared of and that lives in the forest. Both of these queries are entirely correct but the second reflects the meaning of the sets far more clearly. If, later on, I decide I'm not actually afraid of sharks anymore, it's perfectly clear that I need to remove that element from the scaryAnimals list. Compare that to the previous query:

```csharp
from a in animals
where !(new List<string> { "Bear", "Shark" }.Contains(a.Name))
    && new List<string> { "Hawk", "Shark" }.Contains(a.Name)
    && a.Location == "Forest"
select a;
```

If I'm not scared of sharks anymore, which list does that need to be removed from? The first or the second?

It's important to consider that this let approach resulted in a much longer query. But shorter is not always better and less code is not always better than more. Sometimes it's worth a few extra lines of code if the trade-off is a significant increase in clarity.