ploeh blog

Porting song recommendations to Haskell

Monday, 21 April 2025 10:19:00 UTC

An F# code base translated to Haskell.

This article is part of a larger article series that examines variations of how to take on a non-trivial problem using functional architecture. In a previous article we established a baseline C# code base. Future articles are going to use that C# code base as a starting point for refactored code. On the other hand, I also want to demonstrate what such solutions may look like in languages like F# or Haskell. In this article, you'll see how to port the baseline to Haskell. To be honest, I first ported the C# code to F#, and then used the F# code as a guide to implement equivalent Haskell code.

If you're following along in the Git repositories, this is a repository separate from the .NET repositories. The code shown here is from its master branch.

If you don't care about Haskell, you can always go back to the table of contents in the 'root' article and proceed to the next topic that interests you.

Data structures #

When working with statically typed functional languages like Haskell, it often makes most sense to start by declaring data structures.

data User = User
  { userName :: String
  , userScrobbleCount :: Int }
  deriving (Show, Eq)

This is much like an F# or C# record declaration, and this one echoes the corresponding types in F# and C#. The most significant difference is that here, a user's total count of scrobbles is called userScrobbleCount rather than TotalScrobbleCount. The motivation behind that variation is that Haskell data 'getters' are actually top-level functions, so it's usually a good idea to prefix them with the name of the data structure they work on. Since the data structure is called User, both 'getter' functions get the user prefix.

I found userTotalScrobbleCount a bit too verbose to my tastes, so I dropped the Total part. Whether or not that's appropriate remains to be seen. Naming in programming is always hard, and there's a risk that you don't get it right the first time around. Unless you're publishing a reusable library, however, the option to rename it later remains.

The other two data structures are quite similar:

data Song = Song
  { songId :: Int
  , songHasVerifiedArtist :: Bool
  , songRating :: Word8 }
  deriving (Show, Eq)
 
data Scrobble = Scrobble
  { scrobbledSong :: Song
  , scrobbleCount :: Int }
  deriving (Show, Eq)

I thought that scrobbledSong was more descriptive than scrobbleSong, so I allowed myself that little deviation from the idiomatic naming convention. It didn't cause any problems, but I'm still not sure if that was a good decision.

How does one translate a C# interface to Haskell? Although type classes aren't quite the same as C# or Java interfaces, this language feature is close enough that I can use it in that role. I don't consider such a type class idiomatic in Haskell, but as an counterpart to the C# interface, it works well enough.

class SongService a where
  getTopListeners :: a -> Int -> IO [User]
  getTopScrobbles :: a -> String -> IO [Scrobble]

Any instance of the SongService class supports queries for top listeners of a particular song, as well as for top scrobbles for a user.

To reiterate, I don't intend to keep this type class around if I can help it, but for didactic reasons, it'll remain in some of the future refactorings, so that you can contrast and compare the Haskell code to its C# and F# peers.

Test Double #

To support tests, I needed a Test Double, so I defined the following Fake service, which is nothing but a deterministic in-memory instance. The type itself is just a wrapper of two maps.

data FakeSongService = FakeSongService
  { fakeSongs :: Map Int Song
  , fakeUsers :: Map String (Map Int Int) }
  deriving (Show, Eq)

Like the equivalent C# class, fakeSongs is a map from song ID to Song, while fakeUsers is a bit more complex. It's a map keyed on user name, but the value is another map. The keys of that inner map are song IDs, while the values are the number of times each song was scrobbled by that user.

The FakeSongService data structure is a SongService instance by explicit implementation:

instance SongService FakeSongService where
  getTopListeners srvc sid = do
    return $
      uncurry User <$>
      Map.toList (sum <$> Map.filter (Map.member sid) (fakeUsers srvc))
  getTopScrobbles srvc userName = do
    return $
      fmap (\(sid, c) -> Scrobble (fakeSongs srvc ! sid) c) $
      Map.toList $
      Map.findWithDefault Map.empty userName (fakeUsers srvc)

In order to find all the top listeners of a song, it finds all the fakeUsers who have the song ID (sid) in their inner map, sum all of those users' scrobble counts together and creates User values from that data.

To find the top scrobbles of a user, the instance finds the user in the fakeUsers map, looks each of that user's scrobbled song up in fakeSongs, and creates Scrobble values from that information.

Finally, test code needs a way to add data to a FakeSongService value, which this test-specific helper function accomplishes:

scrobble userName s c (FakeSongService ss us) =
  let sid = songId s
      ss' = Map.insertWith (\_ _ -> s) sid s ss
      us' = Map.insertWith (Map.unionWith (+)) userName (Map.singleton sid c) us
  in FakeSongService ss' us'

Given a user name, a song, a scrobble count, and a FakeSongService, this function returns a new FakeSongService value with the new data added to the data already there.

QuickCheck Arbitraries #

In the F# test code I used FsCheck to get good coverage of the code. For Haskell, I'll use QuickCheck.

Porting the ideas from the F# tests, I define a QuickCheck generator for user names:

alphaNum :: Gen Char
alphaNum = elements (['a'..'z'] ++ ['A'..'Z'] ++ ['0'..'9'])
 
userName :: Gen String
userName = do
  len <- choose (1, 19)
  first <- elements $ ['a'..'z'] ++ ['A'..'Z']
  rest <- vectorOf len alphaNum
  return $ first : rest

It's not that the algorithm only works if usernames are alphanumeric strings that start with a letter and are no longer than twenty characters, but whenever a property is falsified, I'd rather look at a user name like "Yvj0D1I" or "tyD9P1eOqwMMa1Q6u" (which are already bad enough), than something with line breaks and unprintable characters.

Working with QuickCheck, it's often useful to wrap types from the System Under Test in test-specific Arbitrary wrappers:

newtype ValidUserName = ValidUserName { getUserName :: String } deriving (Show, Eq)
 
instance Arbitrary ValidUserName where
  arbitrary = ValidUserName <$> userName

I also defined a (simpler) Arbitrary instance for Song called AnySong.

A few properties #

With FakeSongService in place, I proceeded to add the test code, starting from the top of the F# test code, and translating each as faithfully as possible. The first one is an Ice Breaker Test that only verifies that the System Under Test exists and doesn't crash when called.

testProperty "No data" $ \ (ValidUserName un) -> ioProperty $ do
  actual <- getRecommendations emptyService un
  return $ null actual

As I've done since at least 2019, it seems, I've inlined test cases as anonymous functions; this time as QuickCheck properties. This one just creates a FakeSongService that contains no data, and asks for recommendations. The expected result is that actual is empty (null), since there's nothing to recommend.

A slightly more involved property adds some data to the service before requesting recommendations:

testProperty "One user, some songs" $ \
  (ValidUserName user)
  (fmap getSong -> songs)
  -> monadicIO $ do
  scrobbleCounts <- pick $ vectorOf (length songs) $ choose (1, 100)
  let scrobbles = zip songs scrobbleCounts
  let srvc = foldr (uncurry (scrobble user)) emptyService scrobbles
 
  actual <- run $ getRecommendations srvc user
 
  assertWith (null actual) "Should be empty"

A couple of things are worthy of note. First, the property uses a view pattern to project a list of songs from a list of Arbitraries, where getSong is the 'getter' that belongs to the AnySong newtype wrapper.

I find view patterns quite useful as a declarative way to 'promote' a single Arbitrary instance to a list. In a third property, I take it a step further:

(fmap getUserName -> NonEmpty users)

This not only turns the singular ValidUserName wrapper into a list, but by projecting it into NonEmpty, the test declares that users is a non-empty list. QuickCheck picks all that up and generates values accordingly.

If you're interested in seeing this more advanced view pattern in context, you may consult the Git repository.

Secondly, the "One user, some songs" test runs in monadicIO, which I didn't know existed before I wrote these tests. Together with pick, run, and assertWith, monadicIO is defined in Test.QuickCheck.Monadic. It enables you to write properties that run in IO, which these properties need to do, because getRecommendations is IO-bound.

There's one more QuickCheck property in the code base, but it mostly repeats techniques already shown here. See the Git repository for all the details, if necessary.

Examples #

In addition to the properties, I also ported the F# examples; that is, 'normal' unit tests. Here's one of them:

"One verified recommendation" ~: do
  let srvc =
        scrobble "ana" (Song 2 True 5) 9_9990 $
        scrobble "ana" (Song 1 False 5) 10 $
        scrobble "cat" (Song 1 False 6) 10 emptyService
 
  actual <- getRecommendations srvc "cat"
 
  [Song 2 True 5] @=? actual

This one is straightforward, but as I already discussed when characterizing the original code, some of the examples essentially document quirks in the implementation. Here's the relevant test, translated to Haskell:

"Only top-rated songs" ~: do
  -- Scale ratings to keep them less than or equal to 10.
  let srvc =
        foldr (\i -> scrobble "hyle" (Song i True (toEnum i `div` 2)) 500) emptyService [1..20]
 
  actual <- getRecommendations srvc "hyle"
 
  assertBool "Should not be empty" (not $ null actual)
  -- Since there's only one user, but with 20 songs, the implementation
  -- loops over the same songs 20 times, so 400 songs in total (with
  -- duplicates). Ordering on rating, only the top-rated 200 remains, that
  -- is, those rated 5-10. Note that this is a Characterization Test, so not
  -- necessarily reflective of how a real recommendation system should work.
  assertBool "Should have 5+ rating" (all ((>= 5) . songRating) actual)

This test creates twenty scrobbles for one user: One with a zero rating, two with rating 1, two with rating 2, and so on, up to a single song with rating 10.

The implementation of GetRecommendationsAsync uses these twenty songs to find 'other users' who have these top songs as well. In this case, there's only one user, so for every of those twenty songs, you get the same twenty songs, for a total of 400.

There are more unit tests than these. You can see them in the Git repository.

Implementation #

The most direct translation of the C# and F# 'reference implementation' that I could think of was this:

getRecommendations srvc un = do
  -- 1. Get user's own top scrobbles
  -- 2. Get other users who listened to the same songs
  -- 3. Get top scrobbles of those users
  -- 4. Aggregate the songs into recommendations

  -- Impure
  scrobbles <- getTopScrobbles srvc un
 
  -- Pure
  let scrobblesSnapshot = take 100 $ sortOn (Down . scrobbleCount) scrobbles
 
  recommendationCandidates <- newIORef []
  forM_ scrobblesSnapshot $ \scrobble -> do
    -- Impure
    otherListeners <- getTopListeners srvc $ songId $ scrobbledSong scrobble
 
    -- Pure
    let otherListenersSnapshot =
          take 20 $
          sortOn (Down . userScrobbleCount) $
          filter ((10_000 <=) . userScrobbleCount) otherListeners
 
    forM_ otherListenersSnapshot $ \otherListener -> do
      -- Impure
      otherScrobbles <- getTopScrobbles srvc $ userName otherListener
 
      -- Pure
      let otherScrobblesSnapshot =
            take 10 $
            sortOn (Down . songRating . scrobbledSong) $
            filter (songHasVerifiedArtist . scrobbledSong) otherScrobbles
 
      forM_ otherScrobblesSnapshot $ \otherScrobble -> do
        let song = scrobbledSong otherScrobble
        modifyIORef recommendationCandidates (song :)
 
  recommendations <- readIORef recommendationCandidates
  -- Pure
  return $ take 200 $ sortOn (Down . songRating) recommendations

In order to mirror the original implementation as closely as possible, I declare recommendationCandidates as an IORef so that I can incrementally add to it as the action goes through its nested loops. Notice the modifyIORef towards the end of the code listing, which adds a single song to the list.

Once all the looping is done, the action uses readIORef to pull the recommendations out of the IORef.

As you can see, I also ported the comments from the original C# code.

I don't consider this idiomatic Haskell code, but the goal in this article was to mirror the C# code as closely as possible. Once I start refactoring, you'll see some more idiomatic implementations.

Conclusion #

Together with the previous two articles in this article series, this establishes a baseline from which I can refactor the code. While we might consider the original C# code idiomatic, this port to Haskell isn't. It is, on the other hand, similar enough to both its C# and F# peers that we can compare and contrast all three.

Particularly two design choices make this Haskell implementation less than idiomatic. One is the use of IORef to update a list of songs. The other is using a type class to model an external dependency.

As I cover various alternative architectures in this article series, you'll see how to get rid of both.

Next: Song recommendations as an Impureim Sandwich.

Porting song recommendations to F#

Monday, 14 April 2025 08:54:00 UTC

A C# code base translated to F#.

This article is part of a larger article series that examines variations of how to take on a non-trivial problem using functional architecture. In the previous article we established a baseline C# code base. Future articles are going to use that C# code base as a starting point for refactored code. On the other hand, I also want to demonstrate what such solutions may look like in languages like F# or Haskell. In this article, you'll see how to port the C# baseline to F#.

The code shown in this article is from the fsharp-port branch of the accompanying Git repository.

Data structures #

We may start by defining the required data structures. All are going to be records.

type User = { UserName : string; TotalScrobbleCount : int }

Just like the equivalent C# code, a User is just a string and an int.

When creating new values, record syntax can sometimes be awkward, so I also define a curried function to create User values:

let user userName totalScrobbleCount =
    { UserName = userName; TotalScrobbleCount = totalScrobbleCount }

Likewise, I define Song and Scrobble in the same way:

type Song = { Id : int; IsVerifiedArtist : bool; Rating : byte }
let song id isVerfiedArtist rating =
    { Id = id; IsVerifiedArtist = isVerfiedArtist; Rating = rating }
 
type Scrobble = { Song : Song; ScrobbleCount : int }
let scrobble song scrobbleCount = { Song = song; ScrobbleCount = scrobbleCount }

To be honest, I only use those curried functions sparingly, so they're somewhat redundant. Perhaps I should consider getting rid of them. For now, however, they stay.

Since I'm moving all the code to F#, I also have to translate the interface.

type SongService =
    abstract GetTopListenersAsync : songId : int -> Task<IReadOnlyCollection<User>>
    abstract GetTopScrobblesAsync : userName : string -> Task<IReadOnlyCollection<Scrobble>>

The syntax is different from C#, but otherwise, this is the same interface.

Implementation #

Those are all the supporting types required to implement the RecommendationsProvider. This is the most direct translation of the C# code that I could think of:

type RecommendationsProvider (songService : SongService) =
    member _.GetRecommendationsAsync userName = task {
        // 1. Get user's own top scrobbles
        // 2. Get other users who listened to the same songs
        // 3. Get top scrobbles of those users
        // 4. Aggregate the songs into recommendations
 
        // Impure
        let! scrobbles = songService.GetTopScrobblesAsync userName
 
        // Pure
        let scrobblesSnapshot =
            scrobbles
            |> Seq.sortByDescending (fun s -> s.ScrobbleCount)
            |> Seq.truncate 100
            |> Seq.toList
 
        let recommendationCandidates = ResizeArray ()
        for scrobble in scrobblesSnapshot do
            // Impure
            let! otherListeners =
                songService.GetTopListenersAsync scrobble.Song.Id
 
            // Pure
            let otherListenersSnapshot =
                otherListeners
                |> Seq.filter (fun u -> u.TotalScrobbleCount >= 10_000)
                |> Seq.sortByDescending (fun u -> u.TotalScrobbleCount)
                |> Seq.truncate 20
                |> Seq.toList
 
            for otherListener in otherListenersSnapshot do
                // Impure
                let! otherScrobbles =
                    songService.GetTopScrobblesAsync otherListener.UserName
 
                // Pure
                let otherScrobblesSnapshot =
                    otherScrobbles
                    |> Seq.filter (fun s -> s.Song.IsVerifiedArtist)
                    |> Seq.sortByDescending (fun s -> s.Song.Rating)
                    |> Seq.truncate 10
                    |> Seq.toList
 
                otherScrobblesSnapshot
                |> List.map (fun s -> s.Song)
                |> recommendationCandidates.AddRange
 
        // Pure
        let recommendations =
            recommendationCandidates
            |> Seq.sortByDescending (fun s -> s.Rating)
            |> Seq.truncate 200
            |> Seq.toList
            :> IReadOnlyCollection<_>
 
        return recommendations }

As you can tell, I've kept the comments from the original, too.

Test Double #

In the previous article, I'd written the Fake SongService in C#. Since, in this article, I'm translating everything to F#, I need to translate the Fake, too.

type FakeSongService () =
    let songs = ConcurrentDictionary<int, Song> ()
    let users = ConcurrentDictionary<string, ConcurrentDictionary<int, int>> ()
 
    interface SongService with
        member _.GetTopListenersAsync songId =
            let listeners =
                users
                |> Seq.filter (fun kvp -> kvp.Value.ContainsKey songId)
                |> Seq.map (fun kvp -> user kvp.Key (Seq.sum kvp.Value.Values))
                |> Seq.toList
 
            Task.FromResult listeners
 
        member _.GetTopScrobblesAsync userName =
            let scrobbles =
                users.GetOrAdd(userName, ConcurrentDictionary<_, _> ())
                |> Seq.map (fun kvp -> scrobble songs[kvp.Key] kvp.Value)
                |> Seq.toList
 
            Task.FromResult scrobbles
 
    member _.Scrobble (userName, song : Song, scrobbleCount) =
        let addScrobbles (scrobbles : ConcurrentDictionary<_, _>) =
            scrobbles.AddOrUpdate (
                song.Id,
                scrobbleCount,
                fun _ oldCount -> oldCount + scrobbleCount)
            |> ignore
            scrobbles
 
        users.AddOrUpdate (
            userName,
            ConcurrentDictionary<_, _>
                [ KeyValuePair.Create (song.Id, scrobbleCount) ],
            fun _ scrobbles -> addScrobbles scrobbles)
        |> ignore
        
        songs.AddOrUpdate (song.Id, song, fun _ _ -> song) |> ignore

Apart from the code shown here, only minor changes were required for the tests, such as using those curried creation functions instead of constructors, a cast to SongService, and a few other non-behavioural things like that. All tests still pass, so I consider this a faithful translation of the C# code base.

Conclusion #

This article does more groundwork. Since it may be illuminating to see one problem represented in more than one programming language, I present it in both C#, F#, and Haskell. The next article does exactly that: Translates this F# code to Haskell. Once all three bases are established, we can start introducing solution variations.

If you don't care about the Haskell examples, you can always go back to the first article in this article series and use the table of contents to jump to the next C# example.

Next: Porting song recommendations to Haskell.

Characterising song recommendations

Thursday, 10 April 2025 08:05:00 UTC

Using characterisation tests and mutation testing to describe existing behaviour. An example.

This article is part of an article series that presents multiple design alternatives for a given problem that I call the song recommendations problem. In short, the problem is to recommend songs to a user based on a vast repository of scrobbles. The problem was originally proposed by Oleksii Holub as a an example of a problem that may not be a good fit for functional programming (FP).

As I've outlined in the introductory article, I'd like to use the opportunity to demonstrate alternative FP designs. Before I can do that, however, I need a working example of Oleksii Holub's code example, as well as a trustworthy test suite. That's what this article is about.

The code in this article mostly come from the master branch of the .NET repository that accompanies this article series, although some of the code is taken from intermediate commits on that branch.

Inferred details #

The original article only shows code, but doesn't link to an existing code base. While I suppose I could have asked Oleksii Holub if he had a copy he would share, the existence of such a code base isn't a given. In any case, inferring an entire code base from a comprehensive snippet is an interesting exercise in its own right.

The first step was to copy the example code into a code base. Initially it didn't compile because of some missing dependencies that I had to infer. It was only three Value Objects and an interface:

public sealed record Song(int Id, bool IsVerifiedArtist, byte Rating);

public sealed record Scrobble(Song Song, int ScrobbleCount);

public sealed record User(string UserName, int TotalScrobbleCount);

public interface SongService
{
    Task<IReadOnlyCollection<User>> GetTopListenersAsync(int songId);
    Task<IReadOnlyCollection<Scrobble>> GetTopScrobblesAsync(string userName);
}

These type declarations are straightforward, but still warrant a few comments. First, Song, Scrobble, and User are C# records, which is a more recent addition to the language. If you're reading along here, but using another C-based language, or an older version of C#, you can implement such immutable Value Objects with normal language constructs; it just takes more code, instead of the one-liner syntax. F# developers will, of course, be familiar with the concept of records, and Haskell also has them.

Another remark about the above type declarations is that while SongService is an interface, it has no I prefix. This is syntactically legal, but not idiomatic in C#. I've used the name from the original code sample verbatim, so that's the immediate explanation. It's possible that Oleksii Holub intended the type to be a base class, but for various reasons I prefer interfaces, although in this particular example I don't think it would have made much of a difference. I'm only pointing it out because there's a risk that it might confuse some readers who are used to the C# naming convention. Java programmers, on the other hand, should feel at home.

As far as I remember, the only other change I had to make to the code in order to make it compile was to give the RecommendationsProvider class a constructor:

public RecommendationsProvider(SongService songService)
{
    _songService = songService;
}

This is just basic Constructor Injection, and while I find the underscore prefix redundant, I've kept it in order to stay as faithful to the original example as possible.

At this point the code compiles.

Test Double #

The goal of this article series is to present several alternative designs that implement the same behaviour. This means that as I refactor the code, I need to know that I didn't break existing functionality.

"to refactor, the essential precondition is [...] solid tests"

Refactoring, Martin Fowler, 1999

Currently, I have no tests, so I'll have to add some. Since RecommendationsProvider makes heavy use of its injected SongService, tests must supply that dependency in order to do meaningful work. Since Stubs and Mocks break encapsulation I instead favour state-based testing with Fake Objects.

After some experimentation, I arrived at this FakeSongService:

public sealed class FakeSongService : SongService
{
    private readonly ConcurrentDictionary<int, Song> songs;
    private readonly ConcurrentDictionary<string, ConcurrentDictionary<int, int>> users;
 
    public FakeSongService()
    {
        songs = new ConcurrentDictionary<int, Song>();
        users = new ConcurrentDictionary<string, ConcurrentDictionary<int, int>>();
    }
 
    public Task<IReadOnlyCollection<User>> GetTopListenersAsync(int songId)
    {
        var listeners =
            from kvp in users
            where kvp.Value.ContainsKey(songId)
            select new User(kvp.Key, kvp.Value.Values.Sum());
 
        return Task.FromResult<IReadOnlyCollection<User>>(listeners.ToList());
    }
 
    public Task<IReadOnlyCollection<Scrobble>> GetTopScrobblesAsync(
        string userName)
    {
        var scrobbles = users
            .GetOrAdd(userName, new ConcurrentDictionary<int, int>())
            .Select(kvp => new Scrobble(songs[kvp.Key], kvp.Value));
 
        return Task.FromResult<IReadOnlyCollection<Scrobble>>(scrobbles.ToList());
    }
 
    public void Scrobble(string userName, Song song, int scrobbleCount)
    {
        users.AddOrUpdate(
            userName,
            new ConcurrentDictionary<int, int>(
                new[] { KeyValuePair.Create(song.Id, scrobbleCount) }),
            (_, scrobbles) => AddScrobbles(scrobbles, song, scrobbleCount));
 
        songs.AddOrUpdate(song.Id, song, (_, _) => song);
    }
 
    private static ConcurrentDictionary<int, int> AddScrobbles(
        ConcurrentDictionary<int, int> scrobbles,
        Song song,
        int scrobbleCount)
    {
        scrobbles.AddOrUpdate(
            song.Id,
            scrobbleCount,
            (_, oldCount) => oldCount + scrobbleCount);
        return scrobbles;
    }
}

If you're wondering about the use of concurrent dictionaries, I use them because it made it easier to write the implementation, and not because I need the implementation to be thread-safe. In fact, I'm fairly sure that it's not thread-safe. That's not an issue. The tests aren't going to use shared mutable state.

The GetTopListenersAsync and GetTopScrobblesAsync methods implement the interface, and the Scrobble method (here, scrobble is a verb: to scrobble) is a back door that enables tests to populate the FakeSongService.

Icebreaker Test #

While the 'production code' is in C#, I decided to write the tests in F# for two reasons.

The first reason was that I wanted to be able to present the various FP designs in both C# and F#. Writing the tests in F# would make it easier to replace the C# code base with an F# alternative.

The second reason was that I wanted to leverage a property-based testing framework's ability to produce many randomly-generated test cases. I considered this important to build confidence that my tests weren't just a few specific examples that wouldn't catch errors when I made changes. Since the RecommendationsProvider API is asynchronous, the only .NET property-based framework I knew of that can run Task-valued properties is FsCheck. While it's possible to use FsCheck from C#, the F# API is more powerful.

In order to get started, however, I first wrote an Icebreaker Test without FsCheck:

[<Fact>]
let ``No data`` () = task {
    let srvc = FakeSongService ()
    let sut = RecommendationsProvider srvc
    let! actual = sut.GetRecommendationsAsync "foo"
    Assert.Empty actual }

This is both a trivial case and an edge case, but clearly, if there's no data in the SongService, the RecommendationsProvider can't recommend any songs.

As I usually do with Characterisation Tests, I temporarily sabotage the System Under Test so that the test fails. This is to ensure that I didn't write a tautological assertion. Once I've seen the test fail for the appropriate reason, I undo the sabotage and check in the code.

Refactor to property #

In the above No data test, the specific input value "foo" is irrelevant. It might as well have been any other string, so why not make it a property?

In this particular case, the userName could be any string, but it might be appropriate to write a custom generator that produces 'realistic' user names. Just to make things simple, I'm going to assume that user names are between one and twenty characters and assembled from alphanumeric characters, and that the fist character must be a letter:

module Gen =
    let alphaNumeric = Gen.elements (['a'..'z'] @ ['A'..'Z'] @ ['0'..'9'])
 
    let userName = gen {
        let! length = Gen.choose (1, 19)
        let! firstLetter = Gen.elements <| ['a'..'z'] @ ['A'..'Z']
        let! rest = alphaNumeric |> Gen.listOfLength length
        return firstLetter :: rest |> List.toArray |> String }

Strictly speaking, as long as user names are distinct, the code ought to work, so this generator may be more conservative than necessary. Why am I constraining the generator? For two reasons: First, when FsCheck finds a counter-example, it displays the values that caused the property to fail. A twenty-character alphanumeric string is easier to relate to than some arbitrary string with line breaks and unprintable characters. The second reason is that I'm later going to measure memory load for some of the alternatives, and I wanted data to have realistic size. If user names are chosen by humans, they're unlikely to be longer than twenty characters on average (I've decided).

I can now rewrite the above No data test as an FsCheck property:

[<Property>]
let ``No data`` () =
    Gen.userName |> Arb.fromGen |> Prop.forAll <| fun userName ->
    task {
        let srvc = FakeSongService ()
        let sut = RecommendationsProvider srvc
 
        let! actual = sut.GetRecommendationsAsync userName
 
        Assert.Empty actual } :> Task

You may think that this is overkill just to be able to supply random user names to the GetRecommendationsAsync method. In isolation, I'd be inclined to agree, but this edit was an occasion to get my FsCheck infrastructure in place. I can now use that to add more properties.

Full coverage #

The cyclomatic complexity of the GetRecommendationsAsync method is only 3, so it doesn't require many tests to attain full code coverage. Not that 100% code coverage should be a goal in itself, but when adding tests to an untested code base, it can be useful as an indicator of confidence. Despite its low cyclomatic complexity, the method, with all of its filtering and sorting, is actually quite involved. 100% coverage strikes me as a low bar.

The above No data test exercises one of the three branches. At most two more tests are required to attain full coverage. I'll just show the simplest of them here.

The next test case requires a new FsCheck generator, in addition to Gen.userName already shown.

let song = ArbMap.generate ArbMap.defaults |> Gen.map Song

As a fairly simple one-liner, this seems close to the Fairbairn threshold, but I think that giving this generator a name makes the test easier to read.

[<Property>]
let ``One user, some songs`` () =
    gen {
        let! user = Gen.userName
        let! songs = Gen.arrayOf Gen.song
        let! scrobbleCounts =
            Gen.choose (1, 100) |> Gen.arrayOfLength songs.Length
        return (user, Array.zip songs scrobbleCounts) }
    |> Arb.fromGen |> Prop.forAll <| fun (user, scrobbles) ->
    task {
        let srvc = FakeSongService ()
        scrobbles |> Array.iter (fun (s, c) -> srvc.Scrobble (user, s, c))
        let sut = RecommendationsProvider srvc
 
        let! actual = sut.GetRecommendationsAsync user
 
        Assert.Empty actual } :> Task

This test creates scrobbles for a single user and adds them to the Fake data store. It uses TIE-fighter syntax to connect the generators to the test body.

Since all the scrobble counts are generated between 1 and 100, none of them are greater than or equal to 10,000 and thus the test expects no recommendations.

You may think that I'm cheating - after all, why didn't I choose another range for the scrobble count? To be honest, I was still in an exploratory phase, trying to figure out how to express the tests, and as a first step, I was aiming for full code coverage. Even though this test's assertion is weak, it does exercise another branch of the GetRecommendationsAsync method.

I had to write only one more test to fully cover the System Under Test. That method is more complicated, so I'll spare you the details. If you're interested, you may consider consulting the example source code repository.

Mutation testing #

While I don't think that code coverage is useful as a target measure, it can be illuminating as a tool. In this case, knowing that I've now attained full coverage tells me that I need to resort to other techniques if I want another goal to aim for.

I chose mutation testing as my new technique. The GetRecommendationsAsync method makes heavy use of LINQ methods such as OrderByDescending, Take, and Where. Stryker for .NET knows about LINQ, so among all the automated sabotage is does, it tries to see what happens if it removes e.g. Where or Take.

Although I find the Stryker jargon childish, I set myself the goal to see if I could 'kill mutants' to a degree that I'd at least get a green rating.

I found that I could edge closer to that goal by a combination of appending assertions (thus strengthening postconditions) and adding tests. While I sometimes find it easier to define properties than examples, at other times, it's the other way around. In this case, I found it easier to add single examples, like this one:

[<Fact>]
let ``One verified recommendation`` () = task {
    let srvc = FakeSongService ()
    srvc.Scrobble ("cat", Song (1, false, 6uy),     10)
    srvc.Scrobble ("ana", Song (1, false, 5uy),     10)
    srvc.Scrobble ("ana", Song (2,  true, 5uy), 9_9990)
    let sut = RecommendationsProvider srvc
 
    let! actual = sut.GetRecommendationsAsync "cat"
 
    Assert.Equal<Song> ([ Song (2, true, 5uy) ], actual) } :> Task

It adds three scrobbles to the data store, but only one of them is verified (which is what the true value indicates), so this is the only recommendation the test expects to see.

Notice that although song number 2 'only' has 9,9990 plays, the user ana has exactly 10,000 plays in all, so barely makes the cut. By carefully adding five examples like this one, I was able to 'kill all mutants'.

In all, I have eight tests; three FsCheck properties and five normal xUnit.net facts.

All tests work exclusively by supplying direct and indirect input to the System Under Test (SUT), and verify the return value of GetRecommendationsAsync. No Mocks or Stubs have opinions about how the SUT interacts with the injected SongService. This gives me confidence that the tests constitute a trustworthy regression test suite, and that they're still sufficiently decoupled from implementation details to enable me to completely rewrite the SUT.

Quirks #

When you add tests to an existing code base, you may discover edge cases that the original programmer overlooked. The GetRecommendationsAsync method is only a code example, so I don't blame Oleksii Holub for some casual coding, but it turns out that the code has some quirks.

For example, there's no deduplication, so I had to apologise in my test code:

[<Fact>]
let ``Only top-rated songs`` () = task {
    let srvc = FakeSongService ()
    // Scale ratings to keep them less than or equal to 10.
    [1..20] |> List.iter (fun i ->
        srvc.Scrobble ("hyle", Song (i, true, byte i / 2uy), 500))
    let sut = RecommendationsProvider srvc
 
    let! actual = sut.GetRecommendationsAsync "hyle"
 
    Assert.NotEmpty actual
    // Since there's only one user, but with 20 songs, the implementation loops
    // over the same songs 20 times, so 400 songs in total (with duplicates).
    // Ordering on rating, only the top-rated 200 remains, that is, those rated
    // 5-10. Note that this is a Characterization Test, so not necessarily
    // reflective of how a real recommendation system should work.
    Assert.All (actual, fun s -> Assert.True (5uy <= s.Rating)) } :> Task

This test creates twenty scrobbles for one user: One with a zero rating, two with rating 1, two with rating 2, and so on, up to a single song with rating 10.

You might protest that this is because my FakeSongService implementation is too unsophisticated. Obviously, it should not return the 'original' user's songs! Do, however, consider the implied signature of the GetTopListenersAsync method:

Task<IReadOnlyCollection<User>> GetTopListenersAsync(int songId);

The method only accepts a songId as input, and if we assume that the service is stateless, it doesn't know who the 'original' user is.

Should I fix the quirks? In a real system, it might be appropriate, but in this context I find it better to keep the them. Real systems often have quirks in the shape of legacy business rules and the like, so I only find it realistic that the system may exhibit some weird behaviour. The goal of this set of articles isn't to refactor this particular system, but rather to showcase alternative designs for a system sufficiently complicated to warrant refactorings. Simplifying the code might defeat that purpose.

As shown, I have an automated test that requires me to keep that behaviour. I think I'm in a good position to make sweeping changes to the code.

Conclusion #

As Martin Fowler writes, an essential precondition for refactoring is a trustworthy test suite. On a daily basis, millions of developers prove him wrong by deploying untested changes to production. There are other ways to make changes, including manual testing, A/B testing, testing in production, etc. Some of them may even work in some contexts.

In contrast to such real-world concerns, I don't have a production system with real users. Neither do I have a product owner or a department of manual testers. The best I can do is to add enough Characterisation Tests that I feel confident that I've described the behaviour, rather than the implementation, in enough detail to hold it in place. A Software Vise, as Michael Feathers calls it in Working Effectively with Legacy Code.

Most systems in 'the real world' have too few automated tests. Adding tests to a legacy code base is a difficult discipline, so I found it worthwhile to document this work before embarking on the actual design changes promised by this article series. Now that this is out of the way, we can proceed.

The next two articles do more groundwork to establish equivalent code bases in F# and Haskell. If you only care about the C# examples, you can go back to the first article in this article series and use the table of contents to jump to the next C# example.

Next: Porting song recommendations to F#.

Alternative ways to design with functional programming

Monday, 07 April 2025 18:27:00 UTC

A catalogue of FP solutions to a sample problem.

If you're past the novice stage of learning programming, you will have noticed that there's usually more than one way to solve a particular problem. Sometimes one way is better than alternatives, but often, there's no single clearly superior option.

It's a cliche that you should use the right tool for the job. For programmers, our most important tools are the techniques, patterns, algorithms, and data structures we know, rather than the IDEs we use. You can only choose the right tool for a job if you have more than one to choose from. Again, for programmers this implies knowing of more than one way to solve a problem. This is the reason I find doing katas so valuable.

Instead of a kata, in the series that this article commences I'll take a look at an example problem and in turn propose multiple alternative solutions. The problem I'll tackle is bigger than a typical kata, but you can think of this article series as a spirit companion to Thirteen ways of looking at a turtle by Scott Wlaschin.

Recommendations #

The problem I'll tackle was described by Oleksii Holub in 2020, and I've been considering how to engage with it ever since.

Oleksii Holub presents it as the second of two examples in an article named Pure-Impure Segregation Principle. In a nutshell, the problem is to identify song recommendations for a user, sourced from a vast repository of scrobbles.

The first code example in the article is fine as well, but it's not as rich a source of problems, so I don't plan to talk about it in this context.

Oleksii Holub's article does mention my article Dependency rejection as well as the Impureim Sandwich pattern.

It's my experience that the Impureim Sandwich is surprisingly often applicable, despite its seemingly obvious limitations. More than once, people have responded that it doesn't work in general.

I've never claimed that the Impureim Sandwich is a general-purpose solution to everything, only that it surprisingly often fits, once you massage the problem a bit:

Refactoring registration flow to functional architecture
A conditional sandwich example
Functional file system
A restaurant sandwich
Song recommendations as a C# Impureim Sandwich
Song recommendations as an F# Impureim Sandwich
Song recommendations as a Haskell Impureim Sandwich

I have, however, solicited examples that challenge the pattern, and occasionally readers supply examples, for which I'm thankful. I'm trying to get a sense for just how widely applicable the Impureim Sandwich pattern is, and finding its limitations is an important part of that work.

The song recommendations problem is the most elusive one I've seen so far, so I'm grateful to Oleksii Holub for the example.

In the articles in this series, I'll present various alternative solutions to that problem. To make things as clear as I can, I don't do this because I think that the code shown in the original article is bad. Quite the contrary, I'd probably write it like that myself.

I offer the alternatives to teach. Only by knowing of more than one way of solving the problem can you pick the right tool for the job. It may turn out that the right design is the one already suggested by Oleksii Holub, but if you change circumstances, perhaps another design is better. Ultimately, I hope that the reader can extrapolate from this problem to other problems that he or she may encounter.

The way much online discourse is conducted today, I wish that I didn't have to emphasise the following: Someone may choose to read Oleksii Holub's article as a rebuttal of my ideas about functional architecture and the Impureim Sandwich. I don't read it that way, but rather as honest pursuit of intellectual inquiry. I engage with it in the same spirit, grateful for the rich example that it offers.

Code examples #

I'll show code in the languages with which I'm most comfortable: C#, F#, and Haskell. I'll attempt to write the C# code in such a way that programmers of Java, TypeScript, or similar languages can also read along. On the other hand, I'm not going to explain F# or Haskell, but I'll write the articles so that you can skip the F# or Haskell articles and still learn from the C# articles.

While I don't expect the majority of my readers to know Haskell, I find it an indispensable tool when evaluating whether or not a design is functional. F# is a good didactic bridge between C# and Haskell.

The code is available upon request against a small support donation of 10 USD (or more). If you're one of my regular supporters, you have my gratitude and can get the code without further donation. Also, on his blog, Oleksii Holub asks you to support Ukraine against the aggressor. If you can document that you've donated at least 10 USD to one of the charities listed there, on or after the publication of this article, I'll be happy to send you the code as well. In both cases, please write to me.

I've used Git branches for the various alternatives. In each article, I'll do my best to remember to write which branch corresponds to the article.

Articles #

This article series will present multiple alternatives in more than one programming language. I find it most natural to group the articles according to design first, and language second.

While you can view this list as a catalogue of functional programming designs, I'm under no illusion that it's complete.

Characterising song recommendations
Porting song recommendations to F#
Porting song recommendations to Haskell
Song recommendations as an Impureim Sandwich
- Song recommendations as a C# Impureim Sandwich
  - Song recommendations proof-of-concept memory measurements
- Song recommendations as an F# Impureim Sandwich
- Song recommendations as a Haskell Impureim Sandwich
Song recommendations from combinators
- Song recommendations from C# combinators
- Song recommendations from F# combinators
- Song recommendations from Haskell combinators
Song recommendations with pipes and filters
- Song recommendations with Reactive Extensions for .NET
- Song recommendations with F# agents
Song recommendations with free monads
- Song recommendations with Haskell free monads
- Song recommendations with F# free monads
- Song recommendations with C# free monads

Some of the design alternatives will require detours to other interesting topics. While I'll do my best to write to enable you to skip the F# and Haskell content, few articles on this blog are self-contained. I do expect the reader to follow links when necessary, but if I've failed to explain anything to your satisfaction, please leave a comment.

Conclusion #

This article series examines multiple alternative designs to the song recommendations example presented by Oleksii Holub. The original example has interleaved impurities and is therefore not really functional, even though it looks 'functional' on the surface, due to its heavy use of filters and projections.

That example may leave some readers with the impression that there are some problems that, due to size or other 'real-world' constraints, are impossible to solve with functional programming. The present catalogue of design alternatives is my attempt to disabuse readers of that notion.

Next: Characterising song recommendations.

Ports and fat adapters

Tuesday, 01 April 2025 13:16:00 UTC

Is it worth it having a separate use-case layer?

When I occasionally post something about Ports and Adapters (also known as hexagonal architecture), a few reactions seem to indicate that I'm 'doing it wrong'. I apologize for the use of weasel words, but I don't intend to put particular people on the spot. Everyone has been nice and polite about it, and it's possible that I've misunderstood the criticism. Even so, a few comments have left me with the impression that there's an elephant in the room that I should address.

In short, I usually don't abstract application behaviour from frameworks. I don't create 'application layers', 'use-case classes', 'mediators', or similar. This is a deliberate architecture decision.

In this article, I'll use a motivating example to describe the reasoning behind such a decision.

Humble Objects #

A software architect should consider how the choice of particular technologies impact the development and future sustainability of a solution. It's often a good idea to consider whether it makes sense to decouple application code from frameworks and particular external dependencies. For example, should you hide database access behind an abstraction? Should you decouple the Domain Model from the web framework you use?

This isn't always the right decision, but in the following, I'll assume that this is the case.

When you apply the Dependency Inversion Principle (DIP) you let the application code define the abstractions it needs. If it needs to persist data, it may define a Repository interface. If it needs to send notifications, it may define a 'notification gateway' abstraction. Actual code that, say, communicates with a relational database is an Adapter. It translates the application interface into database SDK code.

I've been over this ground already, but to take an example from the sample code that accompanies Code That Fits in Your Head, here's a single method from the SqlReservationsRepository Adapter:

public async Task Create(int restaurantId, Reservation reservation)
{
    if (reservation is null)
        throw new ArgumentNullException(nameof(reservation));
 
    using var conn = new SqlConnection(ConnectionString);
    using var cmd = new SqlCommand(createReservationSql, conn);
    cmd.Parameters.AddWithValue("@Id", reservation.Id);
    cmd.Parameters.AddWithValue("@RestaurantId", restaurantId);
    cmd.Parameters.AddWithValue("@At", reservation.At);
    cmd.Parameters.AddWithValue("@Name", reservation.Name.ToString());
    cmd.Parameters.AddWithValue("@Email", reservation.Email.ToString());
    cmd.Parameters.AddWithValue("@Quantity", reservation.Quantity);
 
    await conn.OpenAsync().ConfigureAwait(false);
    await cmd.ExecuteNonQueryAsync().ConfigureAwait(false);
}

This is one method of a class named SqlReservationsRepository, which is an Adapter that makes ADO.NET code look like the application-specific IReservationsRepository interface.

Such Adapters are often as 'thin' as possible. One dimension of measurement is to look at the cyclomatic complexity, where the ideal is 1, the lowest possible score. The code shown here has a complexity measure of 2 because of the null guard, which exists because of a static analysis rule.

In test parlance, we call such thin Adapters Humble Objects. Or, to paraphrase what Kris Jenkins said at the GOTO Copenhagen 2024 conference, separate code into parts that are

hard to test, but easy to get right
hard to get right, but easy to test.

You can do the same when sending email, querying a weather service, raising events on pub-sub infrastructure, getting the current date, etc. This isolates your application code from implementation details, such as particular database servers, SDKs, network protocols, and so on.

Shouldn't you be doing the same on the receiving end?

Fat Adapters #

In his article on Hexagonal Architecture Alistair Cockburn acknowledges a certain asymmetry. Some ports are activated by the application. Those are the ones already examined. An application reads from and writes to a database. An application sends emails. An application gets the current date.

Other ports, on the other hand, drive the application. According to Tomas Petricek's distinction between frameworks and libraries (that I also use), this kind of behaviour characterizes a framework. Examples include web frameworks such as ASP.NET, Express.js, Django, or UI frameworks like Angular, WPF, and so on.

While I usually do shield my Domain Model from framework code, I tend to write 'fat' Adapters. As far as I can tell, this is what some people have taken issue with.

Here's an example:

[HttpPost("restaurants/{restaurantId}/reservations")]
public async Task<ActionResult> Post(int restaurantId, ReservationDto dto)
{
    if (dto is null)
        throw new ArgumentNullException(nameof(dto));
 
    Reservation? candidate1 = dto.TryParse();
    dto.Id = Guid.NewGuid().ToString("N");
    Reservation? candidate2 = dto.TryParse();
    Reservation? reservation = candidate1 ?? candidate2;
    if (reservation is null)
        return new BadRequestResult(/* Describe the errors here */);
 
    var restaurant = await RestaurantDatabase.GetRestaurant(restaurantId)
        .ConfigureAwait(false);
    if (restaurant is null)
        return new NotFoundResult();
 
    using var scope = new TransactionScope(TransactionScopeAsyncFlowOption.Enabled);
 
    var reservations = await Repository
        .ReadReservations(restaurant.Id, reservation.At)
        .ConfigureAwait(false);
    var now = Clock.GetCurrentDateTime();
    if (!restaurant.MaitreD.WillAccept(now, reservations, reservation))
        return NoTables500InternalServerError();
 
    await Repository.Create(restaurant.Id, reservation).ConfigureAwait(false);
 
    scope.Complete();
 
    return Reservation201Created(restaurant.Id, reservation);
}

This is (still) code originating from the example code base that accompanies Code That Fits in Your Head, although I've here used the variation from Coalescing DTOs. I've also inlined the TryCreate helper method, so that the entire use-case flow is visible as a single method.

In a sense, we may consider this an Adapter, too. This Post action method is part of a Controller class that handles incoming HTTP requests. It is, however, not that class that deals with the HTTP protocol. Neither does it parse the request body, or checks headers, etc. The ASP.NET framework takes care of that.

By following certain naming conventions, adorning the method with an [HttpPost] attribute, and returning ActionResult, this method plays by the rules of the ASP.NET framework. Even if it doesn't implement any particular interface, or inherits from some ASP.NET base class, it clearly 'adapts' to the ASP.NET framework.

It does that by attempting to parse and validate input, look up data in data sources, and in general checking preconditions before delegating work to the Domain Model - which happens in the call to MaitreD.WillAccept.

This is where some people seem to get uncomfortable. If this is an Adapter, it's a 'fat' one. In this particular example, the cyclomatic complexity is 6. Not really a Humble Object.

Shouldn't there be some kind of 'use-case' model?

Use-case Model API #

I deliberately avoid 'use-case' model, 'mediators', or whatever other name people tend to use. I'll try to explain why by going through the exercise of actually extracting such a model. My point, in short, is that I find it not worth the effort.

In the following, I'll call such a model a 'use-case', since this is one piece of terminology that I tend to run into. You may also call it an 'Application Model' or something else.

The 'problem' that I apparently am meant to solve is that most of the code in the above Post method is tightly coupled to ASP.NET. If we want to decouple this code, how do we go about it?

It's possible that my imagination fails me, but the best I can think of is some kind of 'use-case model' that models the 'make a reservation' use case. Perhaps we should name it MakeReservationUseCase. Should it be some kind of Command object? It could be, but I think that this is awkward, because it also needs to communicate with various dependencies, such as RestaurantDatabase, Repository, and Clock. A long-lived service object that can wrap around those dependencies seems like a better option, but then we need a method on that object.

public sealed class MakeReservationUseCase
{
    // What to call this method? What to return? I hate this already.
    public object MakeReservation(/* What to receive here? */)
    {
        throw new NotImplementedException();
    }
}

What do we call such a method? Had this been a true Command object, the single parameterless method would be called Execute, but since I'm planning to work with a stateless service, the method should take arguments. I played with options such as Try, Do, or Go, so that you'd have MakeReservationUseCase.Try and so on. Still, I thought this bordered on 'cute' or 'clever' code, and at the very least not particularly idiomatic C#. So I settled for MakeReservation, but now we have MakeReservationUseCase.MakeReservation, which is clearly redundant. I don't like the direction this design is going.

The next question is what parameters this method should take?

Considering the above Post method, couldn't we pass the dto object on to the use-case model? Technically, we could, but consider this: The ReservationDto object's raison d'être is to support reception and transmission of JSON objects. As I've already covered in an earlier article series, serialization formats are inexplicably coupled to the boundaries of a system.

Imagine that we wanted to decompose the code base into smaller constituent projects. If so, the use-case model should be independent of the ASP.NET-based code. Does it seem reasonable, then, to define the use-case API in terms of a serialization format?

I don't think that's the right design. Perhaps, instead, we should 'explode' the Data Transfer Object (DTO) into its primitive constituents?

public object MakeReservation(
    int restaurantId,
    string? id,
    string? at,
    string? email,
    string? name,
    int quantity)

I'm not too happy about this, either. Six parameters is pushing it, and this is even only an example. What if you need to pass more data than that? What if you need to pass a collection? What if each element in that collection contains another collection?

Introduce Parameter Object, you say?

Given that this is the way we want to go (in this demonstration), this seems as though it's the only good option, but that means that we'd have to define another reservation object. Not the (JSON) DTO that arrives at the boundary. Not the Reservation Domain Model, because the data has yet to be validated. A third reservation class. I don't even know what to call such a class...

So I'll leave those six parameters as above, while pointing out that no matter what we do, there seems to be a problem.

Return type woes #

What should the MakeReservation method return?

The code in the above Post method returns various ActionResult objects that indicate success or various failures. This isn't an option if we want to decouple MakeReservationUseCase from ASP.NET. How may we instead communicate one of four results?

Many object-oriented programmers might suggest throwing custom exceptions, and that's a real option. If nothing else, it'd be idiomatic in a language like C#. This would enable us to declare the return type as Reservation, but we would also have to define three custom exception types.

There are some advantages to such a design, but it effectively boils down to using exceptions for flow control.

Is there a way to model heterogeneous, mutually exclusive values? Another object-oriented stable is to introduce a type hierarchy. You could have four different classes that implement the same interface, or inherit from the same base class. If we go in this direction, then what behaviour should we define for this type? What do all four objects have in common? The only thing that they have in common is that we need to convert them to ActionResult.

We can't, however, have a method like ToActionResult() that converts the object to ActionResult, because that would couple the API to ASP.NET.

You could, of course, use downcasts to check the type of the return value, but if you do that, you might as well leave the method as shown above. If you plan on dynamic type checks and casts, the only base class you need is object.

Visitor #

If only there was a way to return heterogeneous, mutually exclusive data structures. If only C# had sum types...

Fortunately, while C# doesn't have sum types, it is possible to achieve the same goal. Use a Visitor as a sum type.

You could start with a type like this:

public sealed class MakeReservationResult
{
    public T Accept<T>(IMakeReservationVisitor<T> visitor)
    {
        // Implementation to follow...
    }
}

As usual with the Visitor design pattern, you'll have to inspect the Visitor interface to learn about the alternatives that it supports:

public interface IMakeReservationVisitor<T>
{
    T Success(Reservation reservation);
    T InvalidInput(string message);
    T NoSuchRestaurant();
    T NoTablesAvailable();
}

This enables us to communicate that there's exactly four possible outcomes in a way that doesn't depend on ASP.NET.

The 'only' remaining work on the MakeReservationResult class is to implement the Accept method. Are you ready? Okay, here we go:

public sealed class MakeReservationResult
{
    private readonly IMakeReservationResult imp;
 
    private MakeReservationResult(IMakeReservationResult imp)
    {
        this.imp = imp;
    }
 
    public static MakeReservationResult Success(Reservation reservation)
    {
        return new MakeReservationResult(new SuccessResult(reservation));
    }
 
    public static MakeReservationResult InvalidInput(string message)
    {
        return new MakeReservationResult(new InvalidInputResult(message));
    }
 
    public static MakeReservationResult NoSuchRestaurant()
    {
        return new MakeReservationResult(new NoSuchRestaurantResult());
    }
 
    public static MakeReservationResult NoTablesAvailable()
    {
        return new MakeReservationResult(new NoTablesAvailableResult());
    }
 
    public T Accept<T>(IMakeReservationVisitor<T> visitor)
    {
        return this.imp.Accept(visitor);
    }
 
    private interface IMakeReservationResult
    {
        T Accept<T>(IMakeReservationVisitor<T> visitor);
    }
 
    private sealed class SuccessResult : IMakeReservationResult
    {
        private readonly Reservation reservation;
 
        public SuccessResult(Reservation reservation)
        {
            this.reservation = reservation;
        }
        
        public T Accept<T>(IMakeReservationVisitor<T> visitor)
        {
            return visitor.Success(reservation);
        }
    }
 
    private sealed class InvalidInputResult : IMakeReservationResult
    {
        private readonly string message;
 
        public InvalidInputResult(string message)
        {
            this.message = message;
        }
 
        public T Accept<T>(IMakeReservationVisitor<T> visitor)
        {
            return visitor.InvalidInput(message);
        }
    }
 
    private sealed class NoSuchRestaurantResult : IMakeReservationResult
    {
        public T Accept<T>(IMakeReservationVisitor<T> visitor)
        {
            return visitor.NoSuchRestaurant();
        }
    }
 
    private sealed class NoTablesAvailableResult : IMakeReservationResult
    {
        public T Accept<T>(IMakeReservationVisitor<T> visitor)
        {
            return visitor.NoTablesAvailable();
        }
    }
}

That's a lot of boilerplate code, but it's so automatable that there are programming languages that can do this for you. On .NET, it's called F#, and all of that would be a single line of code.

Use Case implementation #

Implementing MakeReservation is now easy, since it mostly involves moving code from the Controller to the MakeReservationUseCase class, and changing it so that it returns the appropriate MakeReservationResult objects instead of ActionResult objects.

public sealed class MakeReservationUseCase
{
    public MakeReservationUseCase(
        IClock clock,
        IRestaurantDatabase restaurantDatabase,
        IReservationsRepository repository)
    {
        Clock = clock;
        RestaurantDatabase = restaurantDatabase;
        Repository = repository;
    }
 
    public IClock Clock { get; }
    public IRestaurantDatabase RestaurantDatabase { get; }
    public IReservationsRepository Repository { get; }
 
    public async Task<MakeReservationResult> MakeReservation(
        int restaurantId,
        string? id,
        string? at,
        string? email,
        string? name,
        int quantity)
    {
        if (!Guid.TryParse(id, out var rid))
            rid = Guid.NewGuid();
        if (!DateTime.TryParse(at, out var rat))
            return MakeReservationResult.InvalidInput("Invalid date.");
        if (email is null)
            return MakeReservationResult.InvalidInput("Invalid email.");
        if (quantity < 1)
            return MakeReservationResult.InvalidInput("Invalid quantity.");
        var reservation = new Reservation(
            rid,
            rat,
            new Email(email),
            new Name(name ?? ""),
            quantity);
 
        var restaurant = await RestaurantDatabase.GetRestaurant(restaurantId).ConfigureAwait(false);
        if (restaurant is null)
            return MakeReservationResult.NoSuchRestaurant();
 
        using var scope = new TransactionScope(TransactionScopeAsyncFlowOption.Enabled);
 
        var reservations = await Repository.ReadReservations(restaurant.Id, reservation.At)
            .ConfigureAwait(false);
        var now = Clock.GetCurrentDateTime();
        if (!restaurant.MaitreD.WillAccept(now, reservations, reservation))
            return MakeReservationResult.NoTablesAvailable();
 
        await Repository.Create(restaurant.Id, reservation).ConfigureAwait(false);
 
        scope.Complete();
 
        return MakeReservationResult.Success(reservation);
    }
}

I had to re-implement input validation, because the TryParse method is defined on ReservationDto, and the Use-case Model shouldn't be coupled to that class. Still, you could argue that if I'd immediately implemented the use-case architecture, I would never had had the parser defined on the DTO.

Decoupled Controller #

The Controller method may now delegate implementation to a MakeReservationUseCase object:

[HttpPost("restaurants/{restaurantId}/reservations")]
public async Task<ActionResult> Post(int restaurantId, ReservationDto dto)
{
    if (dto is null)
        throw new ArgumentNullException(nameof(dto));
 
    var result = await makeReservationUseCase.MakeReservation(
        restaurantId,
        dto.Id,
        dto.At,
        dto.Email,
        dto.Name,
        dto.Quantity).ConfigureAwait(false);
    return result.Accept(new PostReservationVisitor(restaurantId));
}

While that looks nice and slim, it's not all, because you also need to define the PostReservationVisitor class:

private class PostReservationVisitor : IMakeReservationVisitor<ActionResult>
{
    private readonly int restaurantId;
 
    public PostReservationVisitor(int restaurantId)
    {
        this.restaurantId = restaurantId;
    }
 
    public ActionResult Success(Reservation reservation)
    {
        return Reservation201Created(restaurantId, reservation);
    }
 
    public ActionResult InvalidInput(string message)
    {
        return new BadRequestObjectResult(message);
    }
 
    public ActionResult NoSuchRestaurant()
    {
        return new NotFoundResult();
    }
 
    public ActionResult NoTablesAvailable()
    {
        return NoTables500InternalServerError();
    }
}

Notice that this implementation has to receive the restaurantId value through its constructor, since this piece of data isn't part of the IMakeReservationVisitor API. If only we could have handled all that pattern matching with a simple closure...

Well, you can. You could have used Church encoding instead of the Visitor pattern, but many programmers find that less idiomatic, or not sufficiently object-oriented.

Be that as it may, the Controller now plays the role of an Adapter between the ASP.NET framework and the framework-neutral Use-case Model. Is it all worth it?

Reflection #

Where does that put us? It's certainly possible to decouple the Use-case Model from the specific framework, but at what cost?

In this example, I had to introduce two new classes and one interface, as well as four private implementation classes and a private interface.

And that was to support just one use case. If I want to implement a query (HTTP GET), I would need to go through similar motions, but with slight variations. And again for updates (HTTP PUT). And again for deletes. And again for the next resource, such as the restaurant calendar, daily schedule, management of tenants, and so on.

The cost seems rather substantial to me. Do the benefits outweigh them? What are the benefits?

Well, you now have a technology-neutral application model. You could, conceivably, tear out ASP.NET and replace it with, oh... ServiceStack. Perhaps. Theoretically. I haven't tried.

This strikes me as an argument similar to insisting that hiding database access behind an interface enables us to replace SQL Server with a document database. That rarely happens, and is not really why we do that.

So to be fair, decoupling also protects us from changes in libraries and frameworks. It makes it easier to modify one part of the system without having to worry (too much) about other parts. It makes it easier to subject subsystems to automated testing.

Does the above refactoring improve testability? Not really. MakeReservationUseCase may be testable, but so was the original Controller. The entire code base for Code That Fits in Your Head was developed with test-driven development (TDD). The Controllers are mostly covered by self-hosted integration tests, and bolstered with a few unit tests that directly call their action methods.

Another argument for a decoupled Use-case Model is that it might enable you to transplant the entire application to a new context. Since it doesn't depend on ASP.NET, you could reuse it in an Android or iPhone app. Or a batch job. Or an AI-assisted chat bot. Right? Couldn't you?

I'd be surprised if that were the case. Every application type has its own style of user interaction, and they tend to be incompatible. The user-interface flow of a web application is fundamentally different from a rich native app.

In short, I consider the notion of a technology-neutral Use-case Model to be a distraction. That's why I usually don't bother.

Conclusion #

I usually implement Controllers, message handlers, application entry points, and so on as fairly 'fat Adapters' over particular frameworks. I do that because I expect the particular application or user interaction to be intimately tied to that kind of application. This doesn't mean that I just throw everything in there.

Fat Adapters should still be covered by automated tests. They should still show appropriate decoupling. Usually, I treat each as an Impureim Sandwich. All impure actions happen in the Fat Adapter, and everything else is done by a pure function. Granted, however, this kind of architecture comes much more natural when you are working in a programming language that supports it.

C# doesn't really, but you can make it work. And that work, contrary to modelling use cases as classes, is, in my experience, well worth the effort.

Comments

Thomas Skovsende #

I do not really strongly disagree, but what would you do in the case where you had multiple entry points for your use cases? Ie. I have a http endpoint that creates a reservation, but I also need to be able to listen to a messagebus where reservations can come in. A lot of the logic is the same in both cases.

2025-04-09 09:17 UTC

Mark Seemann #

Thomas, thank you for writing. I'll give you two answers, because I believe that the specific answer doesn't generalize. On the other hand, the specific answer also has merit.

In the particular case, it seems as though an 'obvious' architecture would be to have the HTTP endpoint do minimal work required to parse the incoming data, and then put it on the message bus together with all the other messages. I could imagine situations where that would be appropriate, but I can also image some edge(?) cases where that still couldn't work.

As a general answer, however, having some common function or object that handles both cases could make sense. That's pretty much the architecture I spend this article discouraging. That said, as I tried to qualify, I usually don't run into situations where such an architecture is warranted. Case in point, I haven't run into a scenario like the one you describe. Other people, however, also wrote to tell me that they have two endpoints, such as both gRPC and HTTP, or both SOAP and REST, so while I, personally, haven't seen this much, it clearly does happen.

In short, I don't mind that kind of architecture when it addresses an actual problem. Often, though, that kind of requirement isn't present, and in that case, this kind of 'use-case model' architecture shouldn't be the default.

The reason I also gave you the specific answer is that I often get the impression that people seek general, universal solutions, and this could make them miss elegant shortcuts.

2025-04-13 15:17 UTC

Phased breaking changes

Monday, 17 March 2025 14:02:00 UTC

Giving advance warning before breaking client code.

I was recently listening to Jimmy Bogard on .NET Rocks! talking about 14 versions of Automapper. It made me reminisce on how I dealt with versioning of AutFixture, in the approximately ten years I helmed that project.

Jimmy has done open source longer than I have, and it sounds as though he's found a way that works for him. When I led AutoFixture, I did things a bit differently, which I'll outline in this article. In no way do I mean to imply that that way was better than Jimmy's. It may, however, strike a chord with a reader or two, so I present it in the hope that some readers may find the following ideas useful.

Scope #

This article is about versioning a code base. Typically, a code base contains 'modules' of a kind, and client code that relies on those modules. In object-oriented programming, modules are often called classes, but in general, what matters in this context is that some kind of API exists.

The distinction between API and client code is most clear if you're maintaining a reusable library, and you don't know the client developers, but even internal application code has APIs and client code. The following may still be relevant if you're working in a code base together with colleagues.

This article discusses code-level APIs. Examples include C# code that other .NET code can call, but may also apply to Java objects callable from Clojure, Haskell code callable by other Haskell code, etc. It does not discuss versioning of REST APIs or other kinds of online services. I have, in the past, discussed versioning in such a context, and refer you, among other articles, to REST implies Content Negotiation and Retiring old service versions.

Additionally, some of the techniques outlined here are specific to .NET, or even C#. If, as I suspect, JavaScript or other languages don't have those features, then these techniques don't apply. They're hardly universal.

Semantic versioning #

The first few years of AutoFixture, I didn't use a systematic versioning scheme. That changed when I encountered Semantic Versioning: In 2011 I changed AutoFixture versioning to Semantic Versioning. This forced me to think explicitly about breaking changes.

As an aside, in recent years I've encountered the notion that Semantic Versioning is somehow defunct. This is often based on the observation that Semantic Version 2.0.0 was published in 2013. Surely, if no further development has taken place, it's been abandoned by its maintainer? This may or may not be the case. Does it matter?

The original author, Tom Preston-Werner, may have lost interest in Semantic Versioning. Or perhaps it's simply done. Regardless of the underlying reasons, I find Semantic Versioning useful as it is. The fact that it hasn't changed since 2013 may be an indication that it's stable. After all, it's not a piece of software. It's a specification that helps you think about versioning, and in my opinion, it does an excellent job of that.

As I already stated, once I started using Semantic Versioning I began to think explicitly about breaking changes.

Advance warning #

Chapter 10 in Code That Fits in Your Head is about making changes to existing code bases. Unless you're working on a solo project with no other programmers, changes you make impact other people. If you can, avoid breaking other people's code. The chapter discusses some techniques for that, but also briefly covers how to introduce breaking changes. Some of that chapter is based on my experience with AutoFixture.

If your language has a way to retire an API, use it. In Java you can use the @Deprecated annotation, and in C# the equivalent [Obsolete] attribute. In C#, any client code that uses a method with the [Obsolete] attribute will now emit a compiler warning.

By default, this will only be a warning, and there's certainly a risk that people don't look at those. On the other hand, if you follow my advice from Code That Fits in Your Head, you should treat warnings as errors. If you do, however, those warnings emitted by [Obsolete] attributes will prevent your code from compiling. Or, if you're the one who just adorned a method with that attribute, you should understand that you may just have inconvenienced someone else.

Therefore, whenever you add such an attribute, do also add a message that tells client developers how to move on from the API that you've just retired. As an example, here's an (ASP.NET) method that handles GET requests for calendar resources:

[Obsolete("Use Get method with restaurant ID.")]
[HttpGet("calendar/{year}/{month}")]
public ActionResult LegacyGet(int year, int month)

To be honest, that message may be a bit on the terse side, but the point is that there's another method on the same class that takes an additional restaurantId. While I'm clearly not perfect, and should have written a more detailed message, the point is that you should make it as easy as possible for client developers to deal with the problem that you've just given them. My rules for exception messages also apply here.

It's been more than a decade, but as I remember it, in the AutoFixture code base, I kept a list of APIs that I intended to deprecate at the next major revision. In other words, there were methods I considered fair use in a particular major version, but that I planned to phase out over multiple revisions. There were, however, a few methods that I immediately adorned with the [Obsolete] attribute, because I realized that they created problems for people.

The plan, then, was to take it up a notch when releasing a new major version. To be honest, though, I never got to execute the final steps of the plan.

Escalation #

By default, the [Obsolete] attribute emits a warning, but by supplying true as a second parameter, you can turn the warning into a compiler error.

[Obsolete("Use Get method with restaurant ID.", true)]
[HttpGet("calendar/{year}/{month}")]
public ActionResult LegacyGet(int year, int month)

You could argue that for people who treat warnings as errors, even a warning is a breaking change, but there can be no discussion that when you flip that bit, this is certainly a breaking change.

Thus, you should only escalate to this level when you publish a new major release.

Code already compiled against previous versions of your deprecated code may still work, but that's it. Code isn't going to compile against an API deprecated like that.

That's the reason it's important to give client developers ample warning.

With AutoFixture, I personally never got to that point, because I'm not sure that I arrived at this deprecation strategy until major version 3, which then had a run from early 2013 to late 2017. In other words, the library had a run of 4½ years without breaking changes. And when major version 4 rolled around, I'd left the project.

Even after setting the error flag to true, code already compiled against earlier versions may still be able to run against newer binaries. Thus, you still need to keep the deprecated API around for a while longer. Completely removing a deprecated method should only happen in yet another major version release.

Conclusion #

To summarize, deprecating an API could be considered a breaking change. If you take that position, imagine that your current Semantic Version is 2.44.2. Deprecating a method would then required that you release version 3.0.0.

In any case, you make some more changes to your code, reaching version 3.5.12. For various reasons, you decide to release version 4.0.0, in which you can also turn the error flag on. EVen so, the deprecated API remains in the library.

Only in version 5.0.0 can you entirely delete it.

Depending on how often you change major versions, this whole process may take years. I find that appropriate.

Appeal to aithority

Monday, 10 March 2025 14:40:00 UTC

No, it's not a typo.

A few months ago, I was listening to a semi-serious programme from the Danish public service radio. This is a weekly programme about language that I always listen to as a podcast. The host is the backbone of the show, but in addition to new guests each week, he's flanked by a regular expert who is highly qualified to answer questions about etymology, grammar, semantics, etc.

In the episode I'm describing, the expert got a question that a listener had previously emailed. To answer, (s)he started like this (and I'm paraphrasing): I don't actually know the answer to this question, so I did what everyone does these days, when they don't know the answer: I asked ChatGPT.

(S)he then proceeded to read aloud what ChatGPT had answered, and concluded with some remarks along the lines that that answer sounded quite plausible.

If I used ten to twenty hours of my time re-listening to every episode from the past few months, I could find the particular episode, link to it, transcribe the exact words, and translate them to English to the best of my abilities. I am, however, not going to do that. First, I'm not inclined to use that much time writing an essay on which I make no income. Second, my aim is not to point fingers at anyone in particular, so I'm being deliberately vague. As you may have noticed, I've even masked the person's sex. Not because I don't remember, but to avoid singling out anyone.

The expert in question is a regular of the programme, and I've heard him or her give good and knowledgeable answers to many tricky questions. As far as I could tell, this particular question was unanswerable, along the lines of why is 'table' called 'table' rather than 'griungth'?

The correct answer would have been I don't know, and I don't think anyone else does.

Being a veteran of the programme, (s)he must have realized on beforehand that this wouldn't be good radio, and instead decided to keep it light-hearted.

I get that, and I wouldn't be writing about it now if it doesn't look like an example of an increasing trend.

People are using large language models (LLMs) to advocate for their positions.

Appeal to authority #

Appeal to authority is no new technique in discourse.

"You may also, should it be necessary, not only twist your authorities, but actually falsify them, or quote something which you have invented entirely yourself. As a rule, your opponent has no books at hand, and could not use them if he had."

The Art of Being Right, Arthur Schopenhauer, 1831

This seems similar to how people have begun using so-called artificial intelligence (AI) to do their arguing for them. We may, instead, call this appeal to aithority.

Epistemological cul-de-sac #

We've all seen plenty of examples of LLMs being wrong. I'm not going to tire you with any of those here, but I did outline my experience with GitHub Copilot in 2022. While these technologies may have made some advances since then, they still make basic mistakes.

Not only that. They're also non-deterministic. Ask a system a question once, and you get one answer. Ask the same question later, and you may get a variation of the same answer, or perhaps even a contradictory answer. If someone exhibits an answer they got from an LLM as an argument in their favour, consider that they may have been asking it five or six times before they received an answer they liked.

Finally, you can easily ask leading questions. Even if someone shows you a screen shot of a chat with an LLM, they may have clipped prior instructions that nudge the system towards a particular bias.

I've seen people post screen shots that an LLM claims that F# is a better programming language than C#. While I'm sympathetic to that claim, that's not an argument. Just like how you feel about something isn't an argument.

This phenomenon seems to be a new trend. People use answers from LLMs as evidence that they are right. I consider this an epistemological dead end.

Real authority #

Regular readers of this blog may have noticed that I often go to great lengths to track down appropriate sources to cite. I do this for several reasons. One is simply out of respect for the people who figured out things before us. Another reason is to strengthen my own arguments.

It may seem that I, too, appeal to authority. Indeed, I do. When not used in in the way Schopenhauer describes, citing authority is a necessary epistemological shortcut. If someone who knows much about a particular subject has reached a conclusion based on his or her work, we may (tentatively) accept the conclusion without going through all the same work. As Carl Sagan said, "If you wish to make an apple pie from scratch, you must first invent the universe." You can't do all basic research by yourself. At some point, you'll have to take expert assertions at face value, because you don't have the time, the education, or the money to build your own Large Hadron Collider.

Don't blindly accept an argument on the only grounds that someone famous said something, but on the other hand, there's no reason to dismiss out of hand what Albert Einstein had to say about gravity, just because you've heard that you shouldn't accept an argument based on appeal to authority.

Conclusion #

I'm concerned that people increasingly seem to resort to LLMs to argue a case. The LLMs says this, so it must be right.

Sometimes, people will follow up their arguments with of course, it's just an AI, but... and then proceed to unfold their preferred argument. Even if this seems as though the person is making a 'real' argument, starting from an LLM answer establishes a baseline to a discussion. It still lends an aura of truth to something that may be false.

Reactive monad

Monday, 03 March 2025 09:30:00 UTC

IObservable<T> is (also) a monad.

This article is an instalment in an article series about monads. While the previous articles showed, in great detail, how to turn various classes into monads, this article mostly serves as a place-holder. The purpose is only to point out that you don't have to create all monads yourself. Sometimes, they come as part of a reusable library.

Rx define such libraries, and IObservable<T> forms a monad. Reactive Extensions for .NET define a SelectMany method for IObservable<T>, so if source is an IObservable<int>, you can translate it to IObservable<char> like this:

IObservable<char> dest = source.SelectMany(i => Observable.Repeat('x', i));

Since the SelectMany method is, indeed, called SelectMany and has the signature

public static IObservable<TResult> SelectMany<TSource, TResult>(
    this IObservable<TSource> source,
    Func<TSource, IObservable<TResult>> selector)

you can also use C#'s query syntax:

IObservable<char> dest = from i in source
                         from x in Observable.Repeat('x', i)
                         select x;

In both of the above examples, I've explicitly declared the type of dest instead of using the var keyword. There's no practical reason to do this; I only did it to make the type clear to you.

Left identity #

As I've already written time and again, a few test cases don't prove that any of the monad laws hold, but they can help illustrate what they imply. For example, here's an illustration of the left-identity law, written as a parametrized xUnit.net test:

[Theory]
[InlineData(1)]
[InlineData(2)]
[InlineData(3)]
public async Task LeftIdentity(int a)
{
    IObservable<char> h(int i) => Observable.Repeat('x', i);
 
    IList<char>  left = await Observable.Return(a).SelectMany(h).ToList();
    IList<char> right = await h(a).ToList();
 
    Assert.Equal(left, right);
}

Not only does the System.Reactive library define monadic bind in the form of SelectMany, but also return, with the aptly named Observable.Return function. .NET APIs often forget to do so explicitly, which means that I often have to go hunting for it, or guessing what the developers may have called it. Not here; thank you, Rx team.

Right identity #

In the same spirit, we may write another test to illustrate the right-identity law:

[Theory]
[InlineData("foo")]
[InlineData("bar")]
[InlineData("baz")]
public async Task RightIdentity(string a)
{
    IObservable<char> f(string s) => s.ToObservable();
 
    IObservable<char> m = f(a);
    IList<char>  left = await m.SelectMany(Observable.Return).ToList();
    IList<char> right = await m.ToList();
 
    Assert.Equal(left, right);
}

In both this and the previous test, you can see that the test has to await the observables in order to verify that the resulting collections are identical. Clearly, if you're instead dealing with infinite streams of data, you can't rely on such a simplifying assumption. For the general case, you must instead turn to other (proof) techniques to convince yourself that the laws hold. That's not my agenda here, so I'll skip that part.

Associativity #

Finally, we may illustrate the associativity law like this:

[Theory]
[InlineData("foo")]
[InlineData("123")]
[InlineData("4t2")]
public async Task Associativity(string a)
{
    IObservable<char> f(string s) => s.ToObservable();
    IObservable<byte> g(char c)
    {
        if (byte.TryParse(c.ToString(), out var b))
            return Observable.Return(b);
        else
            return Observable.Empty<byte>();
    }
    IObservable<bool> h(byte b) => Observable.Repeat(b % 2 == 0, b);
 
    IObservable<char> m = f(a);
    IList<bool>  left = await m.SelectMany(g).SelectMany(h).ToList();
    IList<bool> right = await m.SelectMany(x => g(x).SelectMany(h)).ToList();
 
    Assert.Equal(left, right);
}

This test composes three observable-producing functions in two different ways, to verify that they produce the same values.

The first function, f, simply turns a string into an observable stream. The string "foo" becomes the stream of characters 'f', 'o', 'o', and so on.

The next function, g, tries to parse the incoming character as a number. I've chosen byte as the data type, since there's no reason trying to parse a value that can, at best, be one of the digits 0 to 9 into a full 32-bit integer. A byte is already too large. If the character can be parsed, it's published as a byte value; if not, an empty stream of data is returned. For example, the character stream 'f', 'o', 'o' results in three empty streams, whereas the stream 4, t, 2 produces one singleton stream containing the byte 4, followed by an empty stream, followed again by a stream containing the single number 2.

The third and final function, h, turns a number into a stream of Boolean values; true if the number is even, and false if it's odd. The number of values is equal to the number itself. Thus, when composed together, "123" becomes the stream false, true, true, false, false, false. This is true for both the left and the right list, even though they're results of two different compositions.

Conclusion #

The point of this article is mostly that monads are commonplace. While you may discover them in your own code, they may also come in a reusable library. If you already know C# LINQ based off IEnumerable<T>, parts of Rx will be easy for you to learn. After all, it's the same abstraction, and familiar abstractions make code readable.

Next: The IO monad.

Easier encapsulation with static types

Monday, 24 February 2025 14:05:00 UTC

A metaphor.

While I'm still struggling with the notion that dynamically typed languages may have compelling advantages, I keep coming back to the benefits of statically typed languages. One such benefit is how it enables the communication of contracts, as I recently discussed in Encapsulating rod-cutting.

As usual, I base my treatment of encapsulation on my reading of Bertrand Meyer's seminal Object-Oriented Software Construction. A major aspect of encapsulation is the explicit establishment of contracts. What is expected of client code before it can invoke an operation (preconditions)? What is guaranteed to be true after the operation completes (postconditions)? And what is always true of a particular data structure (invariants)?

Contracts constitute the practical part of encapsulation. A contract can give you a rough sense of how well-encapsulated an API is: The more statements you can utter about the contract, the better encapsulation. You may even be able to take all those assertions about the contract and implement them as property-based tests. In other words, if you can think of many properties to write as tests, the API in question probably has good encapsulation. If, on the other hand, you can't think of a single precondition, postcondition, or invariant, this may indicate that encapsulation is lacking.

Contracts are the practical part of encapsulation. The overall notion provides guidance of how to achieve encapsulation. Specific contracts describe what is possible, and how to successfully interact with an API. Clearly, the what and how.

They don't, however, explain why encapsulation is valuable.

Why encapsulate? #

Successful code bases are big. Such a code base rarely fits in your head in its entirety. And the situation is only exacerbated by multiple programmers working concurrently on the code. Even if you knew most of the code base by heart, your team members are changing it, and you aren't going to be aware of all the modifications.

Encapsulation offers a solution to this problem. Instead of knowing every detail of the entire code base, encapsulation should enable you to interact with an API (originally, an object) without knowing all the implementation details. This is the raison d'être of contracts. Ideally, knowing the contract and the purpose of an object and its methods should be enough.

Imagine that you've designed an API with a strong contract. Is your work done? Not yet. Somehow, you'll have to communicate the contract to all present and future client developers.

How do you convey a contract to potential users? I can think of a few ways. Good names are important, but only skin-deep. You can also publish documentation, or use the type system. The following metaphor explores those two alternatives.

Doing a puzzle #

When I was a boy, I had a puzzle called Das verflixte Hunde-Spiel, which roughly translates to the confounded dog game. I've previously described the game and an algorithm for solving it, but that's not my concern here. Rather, I'd like to discuss how one might abstract the information carried by each tile.

A picture of the box of the puzzle, together with the tiles spread out in unordered fashion.

As the picture suggests, the game consists of nine square tiles, each with two dog heads and two tails. The objective of the puzzle is to lay all nine tiles in a three-by-three grid such that all the heads fit the opposing tails. The dogs come in four different colour patterns, and each head must fit a tail of the same pattern.

It turns out that there are umpteen variations of this kind of puzzle. This one has cartoon dogs, but you can find similar games with frogs, cola bottles, playing card suits, trains, ladybirds, fast food, flowers, baseball players, owls, etc. This suggests that a generalization may exist. Perhaps an abstraction, even.

"Abstraction is the elimination of the irrelevant and the amplification of the essential"

Robert C. Martin, Designing Object-Oriented C++ Applications Using The Booch Method, ch. 00

How to eliminate the irrelevant and amplify the essential of a tile?

To recapitulate, a single tile looks like this:

One of the tiles of the Hunde-Spiel.

In a sense, we may regard most of the information on such a tile as 'implementation details'. In a code metaphor, imagine looking at a tile like this as being equivalent to looking at the source code of a method or function (i.e. API). That's not the essence we need to correctly assemble the puzzle.

Imagine that you have to lay down the tiles according to a known solution. Since you already know the solution, this task only involves locating and placing each of the nine tiles. In this case, there are only nine tiles, each with four possible rotations, so if you already know what you're looking for, that is, of course, a tractable endeavour.

Now imagine that you'd like to undertake putting together the tiles without having to navigate by the full information content of each tile. In programming, we often need to do this. We have to identify objects that are likely to perform some subtask for us, and we have to figure out how to interact with such an object to achieve our goals. Preferably, we'd like to be able to do this without having to read all the source code of the candidate object. Encapsulation promises that this should be possible.

The backside of the tiles #

If we want to eliminate the irrelevant, we have to hide the information on each tile. As a first step, consider what happens if we flip the tiles around so that we only see their backs.

An empty square, symbolizing the backside of a tile.

Obviously, each backside is entirely devoid of information, which means that we're now flying blind. Even if we know how to solve the puzzle, our only recourse is to blindly pick and rotate each of the nine tiles. As the previous article calculated, when picking at random, the odds of arriving at any valid solution is 1 to 5,945,425,920. Not surprisingly, total absence of information doesn't work.

We already knew that, because, while we want to eliminate the irrelevant, we also must amplify the essential. Thus, we need to figure out what that might be.

Perhaps we could write the essential information on the back of each tile. In the metaphor, this would correspond to writing documentation for an API.

Documentation #

To continue the metaphor, I asked various acquaintances to each 'document' a title. I deliberately only gave them the instruction that they should enable me to assemble the puzzle based on what was on the back of each tile. Some asked for additional directions, as to format, etc., but I refused to give any. People document code in various different ways, and I wanted to capture similar variation. Let's review some of the documentation I received.

The back of a tile, with written documentation and some arrows.

Since I asked around among acquaintances, all respondents were Danes, and some chose to write the documentation in Danish, as is the case with this one.

Unless you have an explicit, enforced policy, you might run into a similar situation in software documentation. I've seen more than one example of software documentation written in Danish, simply because the programmer who wrote it didn't consider anything else than his or her native language. I'm sure most Europeans have similar experiences.

The text on the tile says, from the top and clockwise:

light brown dog/light snout/dark ears
dark brown, white/snout
orange tail/brown spots on/back
orange tail/orange back

Notice the disregard for capitalization rules or punctuation, a tendency among programmers that I've commented upon in Code That Fits in Your Head.

In addition to the text, the back of the above tile also includes six arrows. Four of them ought to be self-evident, but can you figure out what the two larger arrows indicate?

It turns out that the person writing this piece of documentation eventually realized that the description should be mirrored, because it was on the backside of the tile. To be fair to that person, I'd asked everyone to write with a permanent marker or pen, so correcting a mistake involved either a 'hack' like the two arrows, or starting over from scratch.

Let's look at some more 'documentation'. Another tile looks like this:

The back of a tile, with some fairly cryptic symbols in the corners.

At first glance, I thought those symbols were Greek letters, but once you look at it, you start to realize what's going on. In the upper right corner, you see a stylized back and tail. Likewise, the lower left corner has a stylized face in the form of a smiley. The lines then indicate that the sides indicated by a corner has a head or tail.

Additionally, each side is encoded with a letter. I'll leave it as an exercise for the reader to figure out what G and B indicate, but also notice the two examples of a modified R. The one to the right indicates red with spots, and the other one uses the minus symbol to indicate red without spots.

On the one hand, this example does an admirable job of eliminating the irrelevant, but you may also find that it errs on the side of terseness. At the very least, it demands of the reader that he or she is already intimately familiar with the overall problem domain. You have to know the game well enough to be able to figure out that R- probably means red without spots.

Had this been software documentation, we might have been less than happy with this level of information. It may meet formal requirements, but is perhaps too idiosyncratic or esoteric.

Be that as it may, it's also possible to err on the other side.

The back of a tile, this time with an almost one-to-one replica of the picture on the front.

In this example, the person writing the documentation essentially copied and described every detail on the front of the tile. Having no colours available, the person instead chose to describe in words the colour of each dog. Metaphorically speaking, the documentation replicates the implementation. It doesn't eliminate any irrelevant detail, and thereby it also fails to amplify the essential.

Here's another interesting example:

The back of a tile, with text along all four sides.

The text is in Danish. From the top clockwise, it says:

dark brown dog with blue collar
light brown dog with red collar
brown dog with small spots on back
Brown dog with big spots on back

Notice how the person writing this were aware that a tile has no natural up or down. Instead, each side is described with letters facing up when that side is up. You have to rotate the documentation in order to read all four sides. You may find that impractical, but I actually consider that to represent something essential about each tile. To me, this is positive.

Even so, an important piece of information is missing. It doesn't say which sides have heads, and which ones have tails.

Finally, here's one that, in my eyes, amplifies the essential and eliminates the irrelevant:

The back of a tile, with text along all four sides.

Like the previous example, you have to rotate the documentation in order to read all four sides, but the text is much terser. If you ask me, Grey head, Burnt umber tail, Brown tail, and Spotted head amplifies the essential and eliminates everything else.

Notice, however, how inconsistent the documentation is. People chose various different ways in their attempt to communicate what they found important. Some people inadvertently left out essential information. Other people provided too much information. Some people never came through, so in a few cases, documentation was entirely absent. And finally, I've hinted at this already, most people forgot to 'mirror' the information, but a few did remember.

All of this has direct counterparts in software documentation. The level of detail you get from documentation varies greatly, and often, the information that I actually care about is absent: Can I call this method with a negative number? Does the input string need to be formatted in a particular way? Does the method ever return null? Which exceptions may it throw?

I'm not against documentation, but it has limitations. As far as I can tell, though, that's your only option if you're working in a dynamically typed language.

Static types with limited expression #

Can you think of a way to constrain which puzzle pieces fit together with other pieces?

That's how jigsaw puzzles work. As a first attempt, we may try to cut out out the pieces like this:

An anonymous jigsaw puzzle piece

This does help some, because when presented with the subtask of having to locate and find the next piece, at least we can't rotate the next piece in four different positions. Instead, we have only two options. Perhaps we'll choose to lay down the next piece like this:

Two anonymous jigsaw pieces fit together

You may also decide to rotate the right piece ninety degrees clockwise, but those are your only two rotation options.

We may decide to encode the shape of the pieces so that, say, the tabs indicate heads and the indentations indicate tails. This, at least, prevents us from joining head with head, or tail with tail.

This strikes me as an apt metaphor for C, or how many programmers use the type systems of C# or Java. It does prevent some easily preventable mistakes, but the types still don't carry enough information to enable you to identify exactly the pieces you need.

More expressive static types #

Static type systems come in various forms, and some are more expressive than others. To be honest, C#'s type system does come with good expressive powers, although it tends to require much ceremony. As far as I can tell, Java's type system is on par with C#. Let's assume that we either take the time to jump through the hoops that make these type systems expressive, or that we're using a language with a more expressive type system.

In the puzzle metaphor, we may decide to give a tile this shape:

A jigsaw puzzle piece with four distinct tab and indentation shapes.

Such a shape encodes all the information that we need, because each tab or indentation has a unique shape. We may not even have to remember exactly what a square indentation represents. If we're presented with the above tile and asked to lay down a compatible tile, we have to find one with a square tab.

Two jigsaw puzzle pieces with distinct tab and indentation shapes, arranged so that they fit together.

Encoding the essential information into tile shapes enables us to not only prevent mistakes, but identify the correct composition of all the tiles.

Completed puzzle with nine distinctly shaped pieces.

For years, I've thought about static types as shapes of objects or functions. For practical purposes, static types can't express everything an operation may do, but I find it useful to use a good type system to my advantage.

Code examples #

You may find this a nice metaphor, and still fail to see how it translates to actual code. I'm not going to go into details here, but rather point to existing articles that give some introductions.

One place to start is to refactor from primitive obsession to domain models. Just wrapping a string or an integer in a predicative type helps communicate the purpose and constraints of a data type. Consider a constructor like this:

public Reservation(
    Guid id,
    DateTime at,
    Email email,
    Name name,
    NaturalNumber quantity)

While hardly sophisticated, it already communicates much information about preconditions for creating a Reservation object. Some of the constituent types (Guid and DateTime) are built in, so likely well-known to any developer working on a relevant code base. If you're wondering whether you can create a reservation with a negative quantity, the types already answer that.

Languages with native support for sum types let you easily model mutually exclusive, heterogeneous closed type hierarchies, as shown in this example:

type PaymentService = { Name : string; Action : string }
 
type PaymentType =
| Individual of PaymentService
| Parent of PaymentService
| Child of originalTransactionKey : string * paymentService : PaymentService

And if your language doesn't natively support sum types, you can emulate them with the Visitor design pattern.

You can, in fact, do some quite sophisticated tricks even with .NET's type system.

Conclusion #

People often argue about static types with the assumption that their main use is to prevent mistakes. They can help do that, too, but I also find static types an excellent communication medium. The benefits of using a static type system to model contracts is that, when a type system is already part of a language, it's a consistent, formalized framework for communication. Instead of inconsistent and idiosyncratic documentation, you can embed much information about a contract in the types of an API.

And indeed, not only can the types help communicate pre- and postconditions. The type checker also prevents errors.

A sufficiently sophisticated type system carries more information that most people realize. When I write Haskell code, I often need to look up a function that I need. Contrary to other languages, I don't try to search for a function by guessing what name it might have. Rather, the Hoogle search engine enables you to search for a function by type.

Types are shapes, and shapes are like outlines of objects. Used well, they enable you to eliminate the irrelevant, and amplify the essential information about an API.

In defence of multiple WiP

Monday, 17 February 2025 08:52:00 UTC

Programming isn't like factory work.

I was recently stuck on a programming problem. Specifically, part two of an Advent of Code puzzle, if you must know. As is my routine, I went for a run, which always helps to get unstuck. During the few hours away from the keyboard, I'd had a new idea. When I returned to the computer, I had my new algorithm implemented in about an hour, and it calculated the correct result in sub-second time.

I'm not writing this to brag. On the contrary, I suck at Advent of Code (which is a major reason that I do it). The point is rather that programming is fundamentally non-linear in effort. Not only are some algorithms orders of magnitudes faster than other algorithms, but it's also the case that the amount of time you put into solving a problem doesn't always correlate with the outcome.

Sometimes, the most productive way to solve a problem is to let it rest and go do something else.

One-piece flow #

Doesn't this conflict with the ideal of one-piece flow? That is, that you should only have one piece of work in progress (WiP).

Yes, it does.

It's not that I don't understand basic queue theory, haven't read The Goal, or that I'm unaware of the compelling explanations given by, among other people, Henrik Kniberg. I do, myself, lean (pun intended) towards lean software development.

I only offer the following as a counterpoint to other voices. As I've described before, when I seem to disagree with the mainstream view on certain topics, the explanation may rather be that I'm concerned with a different problem than other people are. If your problem is a dysfunctional organization where everyone have dozens of tasks in progress, nothing ever gets done because it's considered more important to start new work items than completing ongoing work, where 'utilization' is at 100% because of 'efficiency', then yes, I'd also recommend limiting WiP.

The idea in one-piece flow is that you're only working on one thing at a time.

One-piece flow illustrated as a series of boxes in a row.

Perhaps you can divide the task into subtasks, but you're only supposed to start something new when you're done with the current job. Compared to the alternative of starting a lot concurrent tasks in order to deal with wait times in the system, I agree with the argument that this is often better. One-piece flow often prompts you to take a good, hard look at where and how delays occur in your process.

Even so, I find it ironic that most of 'the Lean squad' is so busy blaming Taylorism for everything that's wrong with many software development organizations, only to go advocate for another management style rooted in factory work.

Programming isn't manufacturing.

Urgent or important #

As Eisenhower quoted an unnamed college president:

"I have two kinds of problems, the urgent and the important. The urgent are not important, and the important are never urgent."

It's hard to overstate how liberating it can be to ignore the urgent and focus on the important. Over decades, I keep returning to the realization that you often reach the best solutions to software problems by letting them stew.

I'm sure I've already told the following story elsewhere, but it bears repeating. Back in 2009 I started an open-source project called AutoFixture and also managed to convince my then-employer, Safewhere, to use it in our code base.

Maintaining or evolving AutoFixture wasn't my job, though. It was a work-related hobby, so nothing related to it was urgent. When in the office, I worked on Safewhere code, but biking back and forth between home and work, I thought about AutoFixture problems. Often, these problems would be informed by how we used it in Safewhere. My point is that the problems I was thinking about were real problems that I'd encountered in my day job, not just something I'd dreamt up for fun.

I was mostly thinking about API designs. Given that this was ideally a general-purpose open-source project, I didn't want to solve narrow problems with specific solutions. I wanted to find general designs that would address not only the immediate concerns, but also other problems that I had yet to discover.

Many an evening I spent trying out an idea I'd had on my bicycle. Often, it turned out that the idea wouldn't work. While that might be, in one sense, dismaying, on the other hand, it only meant that I'd learned about yet another way that didn't work.

Because there was no pressure to release a new version of AutoFixture, I could take the time to get it right. (After a fashion. You may disagree with the notion that AutoFixture is well-designed. I designed its APIs to the best of my abilities during the decade I lead the project. And when I discovered property-based testing, I passed on the reins.)

Get there earlier by starting later #

There's a 1944 science fiction short story by A. E. van Vogt called Far Centaurus that I'm now going to spoil.

In it, four astronauts embark on a 500-year journey to Alpha Centauri, using suspended animation. When they arrive, they discover that the system is long settled, from Earth.

During their 500 years en route, humans invented faster space travel. Even though later generations started later, they arrived earlier. They discovered a better way to get from a to b.

Compared to one-piece flow, we may illustrate this metaphor like this:

A row of boxes above another row of thinner boxes that are more spread out, but indicates an earlier finish.

When presented with a problem, we don't start working on it right away. Or, we do, but the work we do is thinking rather than typing. We may even do some prototyping at that stage, but if no good solution presents itself, we put away the problem for a while.

We may return to the problem from time to time, and what may happen is that we realize that there's a much better, quicker way of accomplishing the goal than we first believed (as, again, recently happened to me). Once we have that realization, we may initiate the work, and it it may even turn out that we're done earlier than if we'd immediately started hacking at the problem.

By starting later, we've learned more. Like much knowledge work, software development is a profoundly non-linear endeavour. You may find a new way of doing things that are orders of magnitudes faster than what you originally had in mind. Not only in terms of big-O notation, but also in terms of implementation effort.

When doing Advent of Code, I've repeatedly been struck how the more efficient algorithm is often also simpler to implement.

Multiple WiP #

As the above figure suggests, you're probably not going to spend all your time thinking or doing. The figure has plenty of air in between the activities.

This may seem wasteful to efficiency nerds, but again: Knowledge work isn't factory work.

You can't think by command. If you've ever tried meditating, you'll know just how hard it is to empty your mind, or in any way control what goes on in your head. Focus on your breath. Indeed, and a few minutes later you snap out of a reverie about what to make for dinner, only to discover that you were able to focus on your breath for all of ten seconds.

As I already alluded to in the introduction, I regularly exercise during the work day. I also go grocery shopping, or do other chores. I've consistently found that I solve all hard problems when I'm away from the computer, not while I'm at it. I think Rich Hickey calls it hammock-driven development.

When presented with an interesting problem, I usually can't help thinking about it. What often happens, however, is that I'm mulling over multiple interesting problems during my day.

Same diagram as above, but now with more boxes representing thinking activities interleaved among each other.

You could say that I actually have multiple pieces of work in progress. Some of them lie dormant for a long time, only to occasionally pop up and be put away again. Even so, I've had problems that I'd essentially given up on, only to resurface later when I'd learned a sufficient amount of new things. At that time, then, I sometimes realize that what I previously thought was impossible is actually quite simple.

It's amazing what you can accomplish when you focus on the things that are important, rather than the things that are urgent.

One size doesn't fit all #

How do I know that this will always work? How can I be sure that an orders-of-magnitude insight will occur if I just wait long enough?

There are no guarantees. My point is rather that this happens with surprising regularity. To me, at least.

Your software organization may include tasks that represent truly menial work. Yet, if you have too much of that, why haven't you automated it away?

Still, I'm not going to tell anyone how to run their development team. I'm only pointing out a weakness with the common one-piece narrative: It treats work as mostly a result of effort, and as if it were somehow interchangeable with other development tasks.

Most crucially, it models the amount of time required to complete a task as being independent of time: Whether you start a job today or in a month, it'll take x days to complete.

What if, instead, the effort was an function of time (as well as other factors)? The later you start, the simpler the work might be.

This of course doesn't happen automatically. Even if I have all my good ideas away from the keyboard, I still spend quite a bit of time at the keyboard. You need to work enough with a problem before inspiration can strike.

I'd recommend more slack time, more walks in the park, more grocery shopping, more doing the dishes.

Conclusion #

Programming is knowledge work. We may even consider it creative work. And while you can nurture creativity, you can't force it.

I find it useful to have multiple things going on at the same time, because concurrent tasks often cross-pollinate. What I learn from engaging with one task may produce a significant insight into another, otherwise unrelated problem. The lack of urgency, the lack of deadlines, foster this approach to problem-solving.

But I'm not going to tell you how to run your software development process. If you want to treat it as an assembly line, that's your decision.

You'll probably get work done anyway. Months of work can save days of thinking.

Page 1 of 78

"Our team wholeheartedly endorses Mark. His expert service provides tremendous value."
Hire me!

ploeh blog danish software design

Data structures #

Test Double #

QuickCheck Arbitraries #

A few properties #

Examples #

Implementation #

Conclusion #

Data structures #

Implementation #

Test Double #

Conclusion #

Inferred details #

Test Double #

Icebreaker Test #

Refactor to property #

Full coverage #

Mutation testing #

Quirks #

Conclusion #

Recommendations #

Code examples #

Articles #

Conclusion #

Humble Objects #

Fat Adapters #

Use-case Model API #

Return type woes #

Visitor #

Use Case implementation #

Decoupled Controller #

Reflection #

Conclusion #

Comments

Scope #

Semantic versioning #

Advance warning #

Escalation #

Conclusion #

Appeal to authority #

Epistemological cul-de-sac #

Real authority #

Conclusion #

Left identity #

Right identity #

Associativity #

Conclusion #

Why encapsulate? #

Doing a puzzle #

The backside of the tiles #

Documentation #

Static types with limited expression #

More expressive static types #

Code examples #

Conclusion #

One-piece flow #

Urgent or important #

Get there earlier by starting later #

Multiple WiP #

One size doesn't fit all #

Conclusion #