Deep Heresy in C#

A while back I wrote a heretical post about extensions for nullable reference types in C#. It was ok. But since writing it I have lived and learned. I've delved deeper. Darker. Seen things no C# developer ever wants to see.

The Horror

In the darkest depths of C# there lurks a pattern. A thing of dreams and nightmares. Capable of shattering reality as we know it.

Dormant, in the bowels of C#, lies support for the Monad (don't worry, I won't go into it). A monster it is not, but a tool from a different age.
When I emerged from the depths, the world was ... different. I'd changed. No longer a naïve neophyte, I sought to change the world.

The Past

The last post I made on the subject used the traditional map and flatMap methods you see in more popular programming languages that have a functional flair - Scala, Swift, Rust - to extend C# 8's new nullable reference types. It's fine, but there's a better way. A way built right into the guts of C# itself.

If we rename the FlatMap method from my original post to its LINQ equivalent, SelectMany, and also implement the SelectMany overload that takes a "selector", C# will enable LINQ query expressions when we're working with our nullable types. SelectMany can be a method on a class or an extension method. No interface implementation is required, because the compiler binds query expressions purely by method name and shape. It will "just work".

And since I changed the name of FlatMap to SelectMany, I decided to change Map to Select for consistency with IEnumerable and other types that may make use of this pattern.

The Future

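In the following example, I decided to focus on working with reference types, but if you're using C# 8, you can implement the methods on structs too. In C# 9 things have changed a little: T? is allowed on an unconstrained type parameter, so we might not need to declare our generics as reference or value types anymore. One caveat is that for a value type an unconstrained T? means T itself rather than Nullable<T>, so a single method may not cover both cases the way you'd hope.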

This code is the exact same code as in my last post, the only difference being the name change and the added SelectMany overload:

using System;

public static class NullableReferenceExtensions {

    public static U? Select<T, U>(this T? x, Func<T, U> fn)
        where T : class
        where U : class {

        return x switch {
            null => null,
            T it => fn(it)
        };

    }

    public static U? SelectMany<T, U>(this T? x, Func<T, U?> fn)
        where T : class
        where U : class {

        return x switch {
            null => null,
            T it => fn(it)
        };
    }

    public static V? SelectMany<T, U, V>(this T? x, Func<T, U?> fn, Func<T, U, V?> selector)
        where T : class
        where U : class
        where V : class {

        return x.SelectMany(y => fn(y).SelectMany(z => selector(y, z)));
    }
}

There's nothing special about the selector in the overloaded SelectMany. It's just a function that takes each of the intermediate results and uses them as arguments. It's as simple as the signature (a, b) -> c.
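
To see where the selector comes from, it may help to put a query next to roughly what the compiler rewrites it to. This is only a sketch: MaybeSketch is a compact stand-in for the extension methods above, and Label and Desugaring are hypothetical names made up for the illustration.

```csharp
#nullable enable
using System;

// Compact stand-ins for the extension methods above (same shapes, shorter bodies).
public static class MaybeSketch {
    public static U? Select<T, U>(this T? x, Func<T, U> fn)
        where T : class where U : class
        => x is null ? null : fn(x);

    public static V? SelectMany<T, U, V>(this T? x, Func<T, U?> fn, Func<T, U, V?> selector)
        where T : class where U : class where V : class {
        if (x is null) return null;
        var y = fn(x);
        return y is null ? null : selector(x, y);
    }
}

// Hypothetical reference type, used only for this illustration.
public sealed class Label {
    public string Text { get; }
    public Label(string text) { Text = text; }
}

public static class Desugaring {
    // Two `from` clauses in query syntax...
    public static Label? Query(Label? a, Label? b) =>
        from x in a
        from y in b
        select new Label(x.Text + y.Text);

    // ...and the method call the compiler turns them into:
    // the second `from` becomes fn, the `select` becomes the selector.
    public static Label? Explicit(Label? a, Label? b) =>
        a.SelectMany(x => b, (x, y) => new Label(x.Text + y.Text));

    // A single `from` followed by `select` only needs Select.
    public static Label? QueryOne(Label? a) =>
        from x in a
        select new Label(x.Text + "!");

    public static Label? ExplicitOne(Label? a) =>
        a.Select(x => new Label(x.Text + "!"));
}
```

Each pair of methods should behave identically, which is why the name and shape of SelectMany matter and nothing else does.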

I'll use a simple point class as the data type we're going to work with.

class Point {
    public int X { get; }
    public int Y { get; }

    public Point(int x, int y) {
        this.X = x;
        this.Y = y;
    }
}

Using xUnit to test, we'll create a few points, and make one of them null. We'll use the null point in different places in the query expression to make sure that if there's a null anywhere in the pipeline, the result is null.

[Fact]
public void Query_Test() {
    Point? p1 = new Point(1, 2);
    Point? nullPoint = null;
    Point? p3 = new Point(4, 5);
    Point? p4 = new Point(10, 20);

    var result1 = (from x in nullPoint
                   from y in p1
                   select new { First = x, Second = y });

    var result2 = (from x in p1
                   from y in p3
                   from z in p4
                   select new { First = x, Second = y, Third = z });

    var result3 = (from x in p1
                   from y in p3
                   from z in nullPoint
                   select new { First = x, Second = y, Third = z });

    Assert.Null(result1);
    Assert.Null(result3);
    Assert.Equal(1, result2.First.X);
    Assert.Equal(4, result2.Second.X);
    Assert.Equal(10, result2.Third.X);
}

Both result1 and result3 contain a null point. Each one returns null. It doesn't matter where in the pipeline a null occurs. If there is any null at all, it is carried through the pipeline and returned as the result.

result2 doesn't have any null values, so the pipeline returns the anonymous type we create using select.
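
For the struct case mentioned earlier, a version for Nullable<T> might look something like the sketch below. It is not from the original post: NullableValueExtensions and NullableValueDemo are hypothetical names, and I've assumed the selector returns a plain V rather than V? to keep type inference simple for value types.

```csharp
using System;

// Hypothetical counterpart to NullableReferenceExtensions for Nullable<T> value types.
public static class NullableValueExtensions {
    public static U? Select<T, U>(this T? x, Func<T, U> fn)
        where T : struct where U : struct
        => x.HasValue ? fn(x.Value) : (U?)null;

    public static U? SelectMany<T, U>(this T? x, Func<T, U?> fn)
        where T : struct where U : struct
        => x.HasValue ? fn(x.Value) : null;

    // Assumption: the selector returns V, not V?, unlike the reference-type version.
    public static V? SelectMany<T, U, V>(this T? x, Func<T, U?> fn, Func<T, U, V> selector)
        where T : struct where U : struct where V : struct
        => x.SelectMany(a => fn(a).Select(b => selector(a, b)));
}

public static class NullableValueDemo {
    // Adds two nullable ints; the result is null if either operand is null.
    public static int? Add(int? left, int? right) =>
        from a in left
        from b in right
        select a + b;
}
```

Calling NullableValueDemo.Add(2, 3) gives 5, while a null on either side gives null, mirroring the behaviour of the reference-type tests.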

What Query Expressions Buy Us

It might not be immediately clear why I think query expressions are so important. Chaining Select or SelectMany works just fine. But there are times where you only want to work on the next piece of data if you know the last one isn't null. If you just use regular method chaining, it ends up looking like this:

p1.SelectMany(first => {
    return p3.SelectMany(second => {
        return p4.SelectMany(third => {
            return new { First = first, Second = second, Third = third };
        });
    });
});

compared to the query expression:

from first in p1
from second in p3
from third in p4
select new { First = first, Second = second, Third = third };

I'll take the query expression every time.
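
The difference gets bigger when each step depends on the one before it. Here is a rough sketch of that situation, assuming a hypothetical Employee type with an optional Manager; MaybeChain is again just a compact stand-in for the selector overload from earlier.

```csharp
#nullable enable
using System;

// Compact stand-in for the selector overload shown earlier.
public static class MaybeChain {
    public static V? SelectMany<T, U, V>(this T? x, Func<T, U?> fn, Func<T, U, V?> selector)
        where T : class where U : class where V : class {
        if (x is null) return null;
        var y = fn(x);
        return y is null ? null : selector(x, y);
    }
}

// Hypothetical type: an employee who may or may not have a manager.
public sealed class Employee {
    public string Name { get; }
    public Employee? Manager { get; }

    public Employee(string name, Employee? manager = null) {
        Name = name;
        Manager = manager;
    }
}

public static class OrgChart {
    // Name of the manager's manager, or null if any link in the chain is missing.
    // Each `from` reads a value produced by the previous one.
    public static string? SkipLevelName(Employee? start) =>
        from e in start
        from m in e.Manager
        from mm in m.Manager
        select mm.Name;
}
```

Each `from` only runs if everything before it was non-null, so the query reads top to bottom without a single explicit null check.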

Thank you, C# compiler magic.