What's the Frequency, Kenneth? (F#)

Online forums, primitive obsession, F# coding, and premature optimization in Functional Programming

Most coders recognize that premature optimization is a horrible thing to do, but we keep doing it anyway. Since we live in two worlds, and premature optimization happens differently in each, here is an example from Imperative Land, followed by one from Functional Land.

Imperative Premature Optimization

A tweet from my friend Tim Ottinger, probably the nicest tech coach I know and easily one of the ten most capable.

What Tim's talking about here is code smells and DRY, or Don't Repeat Yourself, a fundamental piece of how structured programming works. When simple tools are used over and over again in the same way, we should make a better tool to do the work of both. If you keep typing in the same code over and over, the code bloats out. You make copying mistakes, and the codebase becomes harder to maintain.

It's also a code smell called Primitive Obsession, which is where you always focus on using primitive types instead of creating your own. It happens a lot for functional coders entering the OOP world.
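
As a rough illustration of the smell, here is a small F# sketch. Every name in it is hypothetical and made up for this post; none of it comes from Tim's tweet. It compares passing bare ints around with wrapping each one in its own tiny type.

```fsharp
// Hypothetical example: all names are invented for illustration.

// Primitive-obsessed version: both arguments are plain ints,
// so nothing stops a caller from swapping them by accident.
let describeRaw (roomNumber: int) (goldPieces: int) =
    sprintf "Room %d holds %d gold" roomNumber goldPieces

// Domain-typed version: single-case unions give each int its own type.
type RoomNumber = RoomNumber of int
type Gold = Gold of int

let describeTyped (RoomNumber room) (Gold gold) =
    sprintf "Room %d holds %d gold" room gold

// describeRaw 50 3 compiles even if the caller meant room 3 with 50 gold.
// describeTyped (Gold 50) (RoomNumber 3) is rejected by the compiler.
let typed = describeTyped (RoomNumber 3) (Gold 50)
```

The wrapped version is safer, but notice that naming RoomNumber and Gold is already a small statement about the domain, which is exactly the nuance discussed next.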

What Tim is saying is true and good, but Twitter being Twitter, he's probably unable to add the nuance necessary, so I'll provide it:

If you want the Domain in your code, i.e., where you group code into pieces that you can talk about with your business partner, great! However, you must keep in mind that regrouping code through DRY, SOLID, or whatnot is actually a business decision; you're just making it at the code level.

Business should tell us what they want the code to do. If we find duplication in that work, then by axiom the business has somehow told us to do the same thing twice. That's unusual. Businesses don't like to waste energy either. We need to check with them to make sure that it's actually the same thing and we're not missing something important. Analysis "flows" from business to code, but it goes the other way too. Good code should make us think about the business differently.

Of course, the reply most coders will give: most of the time this kind of DRY happens because of architectural concerns. That doesn't make the nuance go away. In this case you're doing the same thing, only with the architecture. In short, if I'm abstracting code that uses X and Y out into its own class or structure because I'm repeatedly using it to do the same thing, that's fine. It's trivial to show that it's the same thing if X and Y are primitives. That's because there can be no other context. But after I create that class, or change X or Y from a primitive to their own class, then I am changing the way I (and everybody else) reason about the problem, only this time without thinking through the reasons I grouped them together in the first place. I'm subtly telling myself and future coders how to think about the architecture. In both cases, it's object slicing, only in analysis instead of code.

When as a coder I group and name code, I'm not just making a decision about the code; I'm making a decision about everything else too. Cleaning up is an easy thing to show with primitives, but it can become a cognitive mess with more complex types. As programmers, we tend to make bigger and more subtle ontological messes while trying to clean up obviously technically smelly code. We're dragging code into business context where it might not belong. That lesson was a tough one for me to learn.

Functional Premature Optimization

If in the imperative world we take primitives and group them into abstractions faster than we should, the opposite happens in the functional world: we take abstractions and use them as primitives when the concept is not required as part of the problem. The concepts, structs, and classes are just baggage we brought into the problem, and we quickly lose track of the real problem.

Here's a guy bringing a bunch of context to a programming forum and asking for help. There's a dungeon, he's got a list of rooms, he's writing a game, he has a lot of variable names in place, he's already got some sample code, and so forth.

In short, here's a guy who already has a ton of analysis concepts in his head and wants help reasoning about them functionally. Where's my tool? Will a fold work? In imperative coding, we either hold on to the primitives too long when they should be a business class, or we create business classes for the sake of cleanup without having the right conversation. We're trying to do analysis through the code. In FP, we tend to completely ignore the underlying primitive/domain problem and instead ask what sort of functional hammer we need to pound on this domain concept nail.

The highest-voted comment says:

"...The Option is probably over complicating things. Also, fromChamberToChamber for inner loop and outer name seems confusing. So:..."

It then provides this code:

let fromChamberToChamber (source:int list) =
    let rec loop (source:int list) (previousInt: int) (outcome:(int*int) list) =
        match source with
        | [] -> outcome
        | head::tail -> loop tail head ((previousInt, head)::outcome)
    match source with
    | [] | [_] -> []
    | head::tail -> loop tail head [] |> List.rev

The commenter is then nice enough to point out that there are primitive types in this code, ints, and he makes it all generic.

let fromChamberToChamber source =
    let folder ((previousIntOpt, outcome): int option * (int*int) list) (value: int) =
        // your code here
    source
    |> List.fold folder (None, [])
    |> snd // chuck that first bit of state
    |> List.rev
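
The folder body is left as an exercise in the comment above. Here is one way it could be filled in. This is my sketch, not the commenter's actual answer; the name pairUp is made up, and I have dropped the int annotations so that the function really is generic.

```fsharp
// A possible completion, not the commenter's code. `pairUp` is a made-up name.
let pairUp source =
    // State: (previous element, if any; pairs collected so far, newest first).
    let folder (previousOpt, outcome) value =
        match previousOpt with
        | None -> (Some value, outcome) // first element: nothing to pair with yet
        | Some previous -> (Some value, (previous, value) :: outcome)
    source
    |> List.fold folder (None, [])
    |> snd      // drop the "previous element" half of the state
    |> List.rev // pairs were consed on newest first, so restore input order

let pairs = pairUp [1; 2; 3; 4]
```

With that input, pairs holds each element next to its successor, and lists with fewer than two elements come back empty.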

The poster said "thanks!" and went on his way.

I was going to let it go until I saw this question in the thread:

Wait a minute. Here's a restatement of the problem without any business context and the desired solution. This problem is not context-dependent at all. We're dragging business context into code where it might not belong.

You don't bring a knife to a gunfight, so why are we bringing all of these concepts into code where we don't need them? Here's my answer:

// Given a sequence named "animals"
Seq.zip animals (animals |> Seq.skip 1)

Vector go in, pairs come out. None of that other stuff matters. You could call this "premature de-optimization", because the other answers were more wordy, but the other answers were making the same mistake the imperative guys were: mixing up analysis and code without realizing it.
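
For anyone who wants to try it, here is that one-liner as a small script. The contents of animals are my own sample data, since the thread only named the sequence. FSharp.Core also ships Seq.pairwise, which gives the same pairs for this input; I include it as an alternative worth knowing about, not as what anybody in the thread used.

```fsharp
// Sample data is an assumption; only the name "animals" comes from the thread.
let animals = seq ["cat"; "dog"; "emu"; "fox"]

// The one-liner: zip the sequence with itself shifted by one element.
let zipped = Seq.zip animals (animals |> Seq.skip 1) |> List.ofSeq

// Built-in alternative that pairs each element with its successor.
let paired = animals |> Seq.pairwise |> List.ofSeq
```

Both give cat with dog, dog with emu, and emu with fox. Neither version needs to know what an animal, a chamber, or a dungeon is.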

The only difference in premature optimization between imperative and functional programming is that in imperative premature optimization one creates and manages analysis classes too quickly. The functional version creates and manages complex function constructs using analysis objects that aren't necessary.

Put differently, in imperative land, we name and group more than we should without doing the associated work. We're creating things in analysis that don't belong there. In functional land, we don't take enough names and groups out of the analysis. We're using analysis concepts where they're not needed. Imperative premature optimization makes the mistake of cleaning things up right away without thought about the larger picture. Functional premature optimization thinks about the larger picture and never appropriately cleans up. It never comes back to genericize the function, instead leaving business concepts strewn about in code where they don't belong. Both make things really suck over time.

In both cases we're trying to be "cooler" by adding code and groupings or keeping them around where it's not necessary. We're trying to use hammers to pound nails they're not built to pound. But hell, it feels good.

Since it happens in opposite ways, we fight it in opposite ways. In OOP, we fight it by continuing conversations with the Product Owner. Is this grouping the same in both cases? Is it a real dupe? Does it affect other parts of the code if we re-arrange like this? In simple cases and simple architectural code the answer will be "no". That's why it's taught this way. We tend to teach simple and straightforward stuff, not how to think. In FP, we fight it by evolving the code as much as we can, trying to reject all context. This is where the first guy was going when he started generalizing his answer. In both cases the goal is to pass the acceptance test with the smallest and simplest change to the code that's possible. We just suck at it. Instead, we dream up problems that may never exist and optimize for them.

These over-optimizations are at heart analysis errors, not coding errors. The code might be working perfectly. We might even be reaching the initial goals we have for the code. There's no problem in the tools or job; the problem is how we think about the job. This will usually take a while to cause pain, but when the pain starts, by then it's intrinsic to the work we've created. The only way out is to rebuild.

I went back this morning and double-checked my answer. I think maybe the guy was trying to make a game loop. I don't know. What I do know is that my code passes the acceptance test. If the acceptance test changes I'll add more code and complexity, but I won't add it until I'm forced to. It's called YAGNI, or You Ain't Gonna Need It. It is easy to say, easy to repeat as a mantra, difficult to actually practice. Brains don't work like computers, and the programmer and the code have a back-and-forth relationship. It's not just one-way. We need the code to optimize our thinking as much or more than we need our thinking to optimize the code.[1]

I believe as you eliminate smells in functional code, you end up where I did, whereas if you eliminate smells in OO code you just continue to add complexity. That's a topic for another day, though.

Note: the headline, "What's the Frequency, Kenneth" is based on an awful event, the 1986 assault on news anchor Dan Rather on the streets of New York by a deranged attacker. It's not meant to make light of that event. I used the associated R.E.M. song title rather than the actual words spoken. The point is that very bad things in system creation can seem to come from nowhere; they make no sense on their own. Instead, they're the products of problems that have nothing at all to do with what we're currently doing. These problems started a long time ago and our pain is not related to anything we'd ever suspect. Yet it's not just random, either. We can do things to fix them, but we have to recognize the drivers of the problem and start early.