Dynamically Build LINQ Expressions

LINQ stands for Language Integrated Query and is one of my favorite .NET and C# technologies. Using LINQ, developers can write queries directly in strongly typed code. LINQ provides a standard language and syntax that is consistent across data sources.

The Basics

Consider this LINQ query (you can paste this into a console application and run it yourself):

using System;
using System.Linq;

public class Program
{
   public static void Main()
   {
      var someNumbers = new int[]{4, 8, 15, 16, 23, 42};
      var query =
         from num in someNumbers
            where num > 10
            orderby num descending
            select num.ToString();
      Console.WriteLine(string.Join('-', query.ToArray()));
      // 42-23-16-15
   }
}

Because someNumbers is an IEnumerable<int>, the query is parsed by LINQ to Objects. The same query syntax can be used with a tool like Entity Framework Core to generate T-SQL that is run against a relational database. LINQ can be written using one of two syntaxes: query syntax (shown above) or method syntax. The two syntaxes are semantically identical and the one you use depends on your preference. The same query above can be written using method syntax like this:

var secondQuery = someNumbers.Where(n => n > 10)
    .OrderByDescending(n => n)
    .Select(n => n.ToString());

Every LINQ query has three phases:

A data source is set up, known as a provider, for the query to act against. For example, the code shown so far uses the built-in LINQ to Objects provider. Your EF Core projects use the EF Core provider that maps to your database.
The query is defined and turned into an expression tree. I’ll cover expressions more in a moment.
The query is executed, and data is returned.

Step 3 is important because LINQ uses what is called deferred execution. In the example above, secondQuery defines an expression tree but doesn’t yet return any data. In fact, nothing actually happens until you start to iterate the data. This is important because it allows providers to manage resources by only delivering the data requested. For example, let’s assume you want to find a specific string using secondQuery, so you do something like this:

var found = false;
foreach(var item in secondQuery.AsEnumerable())
{
    if (item == "23")
    {
        found = true;
        break;
    }
}

A provider can handle the enumerator so that it feeds data one element at a time. If you find the value on the third iteration, it is possible that only three items were actually returned from the database. On the other hand, when you use the .ToList() extension method, all data is immediately fetched to populate the list.

The Challenge

For my role as PM for .NET Data, I frequently speak to customers to understand their needs. Recently, I had a discussion with a customer who wants to use a third-party control in their website to build business rules. More specifically, the business rules are “predicates” or a set of conditions that resolve to true or false. The tool can generate the rules in JSON or SQL format. SQL is tempting to pass along to the database, but their requirement is to apply the predicates to in-memory objects as a filter on the server as well. They are considering a tool that translates SQL to expressions (it’s called Dynamic LINQ if you’re interested). I suggested that the JSON format probably is fine, because it can be parsed into a LINQ expression that is run against objects in memory or easily applied to an Entity Framework Core collection to run against the database.

The spike I wrote only deals with the default JSON produced by the tool:

{
  "condition": "and",
  "rules": [
    {
      "label": "Category",
      "field": "Category",
      "operator": "in",
      "type": "string",
      "value": [
        "Clothing"
      ]
    },
    {
      "condition": "or",
      "rules": [
        {
          "label": "TransactionType",
          "field": "TransactionType",
          "operator": "equal",
          "type": "boolean",
          "value": "income"
        },
        {
          "label": "PaymentMode",
          "field": "PaymentMode",
          "operator": "equal",
          "type": "string",
          "value": "Cash"
        }
      ]
    },
    {
      "label": "Amount",
      "field": "Amount",
      "operator": "equal",
      "type": "number",
      "value": 10
    }
  ]
}

The structure is simple: there is an AND or OR condition that contains a set of rules that are either comparisons, or a nested condition. My goal was twofold: learn more about LINQ expressions to better inform my understanding of EF Core and related technologies, and provide a simple example to show how the JSON can be used without relying on a third party tool.

#linq #expressions

The Basics

The Challenge

blog.jeremylikness.com

Dynamically Build LINQ Expressions