Files and belongings, why all the indirection?

mandag den 20. januar 2025

This might be addressing dogma, or what is occasionally dressed up as good practice. I am not blaming anyone. It seems most of the world is reading the same information anyway, so I cannot resist. I have come to play with a thought that challenges how some code is probably written.

I am in no position to dictate how your code is written or to what extent your own self-established beliefs have been formed, so this is merely a play of thought. But do take into consideration the "self-established" part of that argument - the why. Why are you writing code that way? Most often, I believe, in a professional context, it's to get things out the door. It would be naive, however, not to accept that we are all, of course, being influenced by whatever is around us, and by what influences those things, and so forth.

Validating your needs, discarding your guesses

I have talked at length about how I believe validation should occur before writing any code. Others might refer to this as, "Write code that does exactly what you need, make it look good later, and in the end make it fast." The validation of writing code is like the validation of a business idea - if you do not have a single customer, why build the thing? The balance between "does exactly" and "right and fast" is of some importance. Perhaps, if you're the only customer, you don't need to worry too much about the future. Unless, of course, what you're building is meant to portray contextual excellence and self-esteem somehow - then go all in and show the world how great you are. As a programmer, I have been down that road again and again to challenge my own wits (and to sleep at night), but I am fully aware that my code is not of higher importance than how others contribute.

Anyways, the thing about validation is that it could just as easily be applied to writing code. Of course it can. Because why write code that is not needed? Is it because you are predicting the future? Is it because you are arrogant? Is it because you don't want to take the time to contribute to existing code that does nearly the same thing? I am well aware that this argument could almost be considered a religious one, and since context and setting are of high importance, I can only guess why. This has been discussed since the dawn of source code - why are you writing code that solves the future? Sure, you're right, now I am just shoveling more into that bucket, and for that, please excuse me.

And then again, sometimes you do not need to apply validation before you write code; instead, the validation is about something else. Imagine if someone is economically waiting for the outcome of where you are applying your energy - then the code must be written, and the validation might be more about whether that code is sustainable enough for the future. Wait, why? You don't know the future, so apply validation. Like I said, I am in no position to tell you anything that you have already applied your own thorough thoughts to.

Why all the files? Where is the actual belonging?

I am going to be a bit provocative and make a guess: most programmers, for a lot of their time, are not working on the higher complexities of things but instead find themselves in a position where abstractions make things easier. You are probably not writing your own security transport layer, your own relational mapper, message bus, and so on. You are most likely using these tools to maintain a productive pace. Your validation will speak to your needs - it's part of preparing work. If you cannot prepare, I believe you diminish your chances of success.

I am stating these things for myself, to underline, for my remembrance, that writing code you do not need to write is not a very good idea. This is also why I have emphasized the validation part of writing code, building software, and perhaps even business ventures. Validation can be difficult, but if you cannot apply the principle of "write code that does exactly the thing you need," chances are that you are already in too deep. Ask yourself: point to the validation you did. Point to the preparation of your work.

Before I wrote this, I initially set out to give myself an example of code that could hide the most important parts, because I have come to realize that when I look at code I have written and that is running in production, it could have been written differently. As it turns out, that code is rarely touched after that point. Rarely is not the same as never, but I have come to realize that grouping things even tighter together could be an easier way to work while figuring out if "writing this code is exactly what I need."

What this has manifested into is sometimes a codebase with fewer files. It is definitely a codebase where seams are tighter but perhaps structured differently for reuse. It has the same number of types and objects but, to me, offers an easier navigational approach to understanding where things belong. It doesn't violate OOP either, which I often find myself applying.

Some code is not touched much after it has been written - so why apply too much indirection to that code? And even if it gets touched often, why the indirection?

It is very important to me, for the reader to understand, that I am not advocating for this as an approach you should adopt - nor even as an approach I would consistently follow. But I do hope this can make you think just a little more about how you approach the code you're writing and, in the end, the software you are responsible for as its caretaker. Do not take what I am writing at face value - I am validating my own chain of thoughts here.

The codebase of an API integration

I am going to lay out two different styles, one of which I find more commonly used than the other. You can determine for yourself what you think, but don't blame the messenger - add your own validation.

Codebase 1:

AirBnbApi
  - Constants
    - FormattingConstants.cs
    - UrlConstants.cs
  - Extensions
    - StringExtensions.cs
    - DateTimeExtensions.cs
  - Helpers
    - DateTimeJsonConverter.cs
    - RequestUrlHelper.cs
  - Interfaces
    - IAirbnbClientService.cs
  - Models
    - Price.cs
  - AirbnbClientService.cs

This is a small codebase. From the naming of the files, you can infer somewhat that this is about Airbnb and prices. However, there are a whole lot of other files that the programmer might look at and think, "These are global types used to ease the actual API communication and modeling." How the API works internally is not of importance in this context.

Why have all these files if they are really only serving the purpose of the inner workings of AirbnbClientService.cs? The reason I choose to mention AirbnbClientService is implicit - I know from looking at codebases like this that this is most likely the type that orchestrates the other types found in the codebase.

Personally, I dislike names such as Extensions, Helpers, Utilities, Common, and other similar global identifiers for things that, in my mind, should be kept closer to where those extensions, helpers, utilities, or common items are actually used. When I say "belong," I mean where they are first put to use. That can change over time, of course, but it doesn't change when we are in the state of "write code for exactly what you need."

So, in this codebase, I could argue that if the code inside FormattingConstants.cs has formatting methods closely related to AirbnbClientService.cs, those methods could just as easily live inside that type.

Let's say that the AirbnbClientService.cs file is structured as follows:

public class AirbnbClientService : IAirbnbClientService

Now, we want to expand that type so the code inside FormattingConstants.cs will be part of it as well. An easy approach is to add a new top-level class.

Since this code is about an API for prices at Airbnb, we can call the top-level class PriceApi and then adjust the code accordingly. I would also add the content of the interface into that file simply because that is where I believe it belongs the most. You could argue that having the interface in its own file makes the reading experience better, but I believe the tradeoff of having it as part of where it is used is worth it.

The effects of grouping code together are not only technical. They also make it easier for me to understand its possible heritage.

public interface IPricesHttpClient
{
    Task<Price> GetPricesForPeriodRangeAsync(DateTime start, DateTime end);
}

public sealed class PriceApi : IPricesHttpClient
{
    ....

    public async Task<Price> GetPricesForPeriodRangeAsync(DateTime start, DateTime end)...

    public sealed class FormattingConstants
    {
        public const string BaseUrl = "https://url.to/airbnb/prices/api";

        public const string PeriodRangeQueryString = "&start={0}?end={1}";
    }

    ...
}

So I merged three files into one and made a new top-level type. I will continue doing this until all code that is specifically tailored to the PriceApi type is in that specific type. In the end, that means all code belonging to PriceApi is in that type.

public interface IPricesHttpClient
{
    Task<Price> GetPricesForPeriodRangeAsync(DateTime start, DateTime end);
}

public sealed class PriceApi : IPricesHttpClient
{
    ....

    public async Task<Price> GetPricesForPeriodRangeAsync(DateTime start, DateTime end)...

    public sealed class FormattingConstants
    {
        public const string BaseUrl = "https://url.to/airbnb/prices/api";

        public const string PeriodRangeQueryString = "&start={0}?end={1}";
    }

    public sealed class Price 
    {
        ....
    }

    public sealed class RequestUrlHelper 
    {
        ...
    }

    ...
}

This will leave me with a PriceApi type that encapsulates all its code within that type. Not only that, it makes it easier to know where a potential piece of code belongs.

Imagine writing some code, before the change, that would use the Price model inside the Models folder. You would not know, just by writing the code, where that model belongs.

With the change I just made, this is less ambiguous since it is now explicitly part of the PriceApi type. You would need to access the type by specifying PriceApi.Price rather than just Models.Price. Where does Models.Price belong? Now it belongs to PriceApi, and I would argue that is where it should be.
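
A hypothetical call site after the change could look like the snippet below; the parameterless constructor and the date arguments are just assumptions for the sake of the example:

// Hypothetical call site: the nested type spells out which API the model belongs to.
var pricesClient = new PriceApi();
PriceApi.Price price = await pricesClient.GetPricesForPeriodRangeAsync(
    start: DateTime.Today, end: DateTime.Today.AddDays(7));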

The most obvious counterargument, of course, is: "What if I want to use the Price model in some place other than PriceApi?" Essentially, this is a question about how to reuse Price. I cannot answer that easily because I am not programming for the future - I am programming for what I know at this moment in time.

That decision would depend on future requirements. You might argue to pull Price out to avoid violating DRY, but that would only make sense if the other piece of code needed Price exactly as it is. If it is not exactly as-is, it's not a violation of DRY, and perhaps a lookalike type would be better suited to the code it is really part of.

With the changes completed, the structure and files of the codebase now look like this. I haven't shown you all the changes since that is not the important part.

AirBnb
  - PriceApi.cs
  - DateTimeJsonConverter.cs

I believe two files are easier to comprehend than nine - not to mention the folders. I have left the DateTimeJsonConverter.cs file to make the reader think about why. Why is that not part of PriceApi.cs? Just as simply, why were there nine files? Why?

Now, what happens if we introduce another Airbnb API we wish to communicate with? Let's say I want to communicate with Airbnb's "Search" API.

public interface ISearchHttpClient
{
    Task<Search> GetSearchForPeriodRangeAsync(string search, DateTime start, DateTime end);
}

public sealed class SearchApi : ISearchHttpClient
{
    ....

    public async Task<Search> GetSearchForPeriodRangeAsync(string search, DateTime start, DateTime end)...

    public sealed class FormattingConstants
    {
        public const string BaseUrl = "https://url.to/airbnb/search/api";

        public const string SearchQueryString = "&q={0}?start={1}?end={2}";
    }

    public sealed class Search 
    {
        ....
    }

    public sealed class RequestUrlHelper 
    {
        ...
    }

    ...
}

Then we can start to ask ourselves what is actually up for reuse between the two API clients, PriceApi and SearchApi.

Since we already have a PriceApi, we can extract the things that are truly duplicated. It might look something like this: there are no folders with odd names, which, to me personally, add more indirection than explicitness. Most of the code even sits at the same level, except for the types that belong to their respective APIs, which are placed in the same files. A sketch of the shared pieces follows the listing below.

AirBnb
  - PriceApi.cs
  - SearchApi.cs
  - DateTimeJsonConverter.cs
  - AirbnbHttpClientFacade.cs
  - FormattingConstants.cs
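
To make the extraction concrete, here is a hedged sketch of what the shared pieces might contain - the members of FormattingConstants and AirbnbHttpClientFacade below are illustrative assumptions, not code lifted from the original codebase:

using System.Net.Http;
using System.Net.Http.Json;
using System.Threading.Tasks;

// Hypothetical sketch of the extracted, shared code. Member names are assumptions.
public static class FormattingConstants
{
    // Formatting shared by both API clients, e.g. the date format the API expects.
    public const string ApiDateFormat = "yyyy-MM-dd";
}

public sealed class AirbnbHttpClientFacade
{
    private readonly HttpClient httpClient;

    public AirbnbHttpClientFacade(HttpClient httpClient) => this.httpClient = httpClient;

    // The request/response plumbing that PriceApi and SearchApi would otherwise duplicate.
    public async Task<T?> GetAsync<T>(string requestUrl)
    {
        using var response = await httpClient.GetAsync(requestUrl);
        response.EnsureSuccessStatusCode();
        return await response.Content.ReadFromJsonAsync<T>();
    }
}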

Conclusion

This exploration is not about prescribing a one-size-fits-all approach to organizing codebases but rather about challenging the patterns we often adopt without much reflection. By questioning the need for perhaps excessive indirection and fragmentation, I merely aim to add to a conversation about code validation and its implications for maintainability and clarity.

The central theme here is to prioritize writing code that does not serve speculative future-proofing. This approach might foster a balance between explicitness and simplicity by applying a narrower structure to the code.

But ultimately, every choice in a codebase reflects a tradeoff, whether it's fewer files or more modularity, tighter encapsulation or reusability. My belief is that the answer lies in the context of the validation and the moment in time I am in. This might encourage you to revisit some of your coding habits, validating your decisions against the needs of the present rather than the uncertainties of the future.

Not finding e-mail signatures with ML.Net.

fredag den 3. januar 2025

This is a piece on me trying to work with ML.Net, a machine learning library from Microsoft, training a model for predicting what is an email signature and what is email body content.

I hoped that I would end up with a trained model that could predict with greater certainty what an email signature looks like in any email I would serve to the model.

I am not a trained data scientist and I have no background in statistics either, but as a human-programmable problem-solver, I believed it would be both a fun task and a fine approach to learning about both the ML.Net library as well as machine learning in general.

I started off totally blank but read the documentation on ML.Net and began writing some code. It struck me how well the library is put together; being a programmer, it felt like I could start fairly easily and not dive too heavily into the underlying subjects. I did some research into the different machine learning algorithms due to my passion for knowledge, but it did make my head spin a bit, which I definitely also expected, since nothing comes for free or without proper preparation.

I will perhaps gradually learn more in the time to come, but it does feel like a branch of computer engineering that deserves its own dedicated space of knowledge.

The feeling of using a trained model can be overwhelmingly positive, and what I believe is the average human's impatience with searching for answers themselves makes models seem like both genuine progress and perhaps a bit of magic. But if you really want to know what is going on, as with everything in life, the journey is long and answers don't arise without hard work.

As you will find, I did not succeed in my endeavor, whether because I am not good enough at labeling my data or, just as plausibly, because the library I chose is either not mature enough or I used it incorrectly.

It becomes even more humorous since the email messages I had available were in "emlx" files, and therefore I had to write a parser before I could even start with the actual machine learning code. Emlx files are Apple Mail's proprietary format, and even though they resemble RFC 5322 and in turn RFC 2822, they act as a superset in some aspects. A typical email formatted as emlx could look like this:

5028      
Return-Path: <kate@xzy.com>
X-Original-To: www@vvv.com
Delivered-To: xxx@xxx.com
Received: from localhost (localhost [127.0.0.1])
	by xxx.wannafind.dk (Postfix) with ESMTP id C282E2710010
	for <xxx@xxx.com>
X-Virus-Scanned-By: Wannafind A/S
X-Spam-Flag: NO
X-Spam-Score: -0.449
X-Spam-Level: 
X-Spam-Status: No, score=-0.449 tagged_above=-999 required=3 tests=[AWL=0.249,
	DKIM_SIGNED=0.1, DKIM_VALID=-0.1, HTML_MESSAGE=0.001,
	RCVD_IN_DNSWL_LOW=-0.7, URIBL_BLOCKED=0.001] autolearn=unavailable
Received: from mail26.wannafind.dk ([127.0.0.1])
	by localhost (mail26.wannafind.dk [127.0.0.1]) (amavisd-new, port 10024)
	with ESMTP id Q1wWb6BNZMRl for <xxx@xxx.com>;
	Fri, 17 Mar 2017 22:07:15 +0100 (CET)
Received: from mail-qt0-f182.google.com (mail-qt0-f182.google.com [209.85.216.182])
	by mail26.wannafind.dk (Postfix) with ESMTP id CF0952710001
	for <xxx@xxx.com>; Fri, 17 Mar 2017 22:07:15 +0100 (CET)
Received: by mail-qt0-f182.google.com with SMTP id n21so72649531qta.1
        for <xxx@xxx.com>; Fri, 17 Mar 2017 14:07:15 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=swellsailing-com.20150623.gappssmtp.com; s=20150623;
        h=mime-version:from:date:message-id:subject:to;
        bh=PAhF8i05o9wx4Ma9+vOYJAusyA2JndcqLmzT4Q3MKAU=;
        b=UKtfvvYoLf5dG59oJ46upzV1C/tmVCnm04CFaILxSx4ZHqBcZkafCUopnNwnlvJHdW
         9ucIZV+4C80+IOI2aIv3cS3NG968qGkBn3mdl8Y6N/lfe3i8GHLtYaA0ci+wTft5VDun
         TdJXacCe8P4eRhkUlkYcWOfn3NWWv21gJccveBakoyUYXjxejWy3Dz+wLMqRU993XeUL
         ffIG455Zwzky8rZ7AQzV4CoGa+TXxOlOcdZZCOLqXlLgSM/QD6pXCYktnuNm43ekK5r5
         JCdPN+Xz1J1cRoXpdo14PfCXMVX5WMFK/MvAGDpfOBP7nCH1wCVa0AVCDymZZVp02la3
         ++/w==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20161025;
        h=x-gm-message-state:mime-version:from:date:message-id:subject:to;
        bh=PAhF8i05o9wx4Ma9+vOYJAusyA2JndcqLmzT4Q3MKAU=;
        b=aZydM85qslAlbt2d1eu7DMrz6oaqrnQ6NEhKga59CFFZWL63Jfyz8cWEGjw7cVMIbC
         11yNazwr5tez0n/NuuC6wxxWUabUKx8dNXWdCEkHBP5gsd19ktRg/jIS2i+LUB7mO1Nb
         89XZo7E40ukf7GkGsGfox1nG4VbVR7NlJphr+IwlJErzAR/sw+OK778HgWvbS1rLTwqo
         a+/4EtPazKYl2QbkGtO/M0tpcbi05XZMIxEIz+PCAQOsCpiijERJGubhp0Vo+6aq7OsQ
         yOJ/aw7bIEtp7fhnOARyHD++He0LAhbemz/OiZxguB06E6u015AEHt8jrGNAcfb6TxI7
         ILbA==
X-Gm-Message-State:
X-Received: 
MIME-Version: 1.0
Received: 
From: Kate <kate@xzy.com>
Date: Fri, 17 Mar 2017 17:06:34 -0400
Message-ID: <CABvh=O_K-0rtWDVAjspNPR6UijEnu0ga4xEXn+04hMoaAyYL4A@mail.gmail.com>
Subject: schedule 3/18 - 19
To: Andrew <andrew@xxx.com>, Colin <colin@xxx.com>
Content-Type: multipart/alternative; boundary=001a113d0ce6dfa2e0054af38f25

--001a113d0ce6dfa2e0054af38f25
Content-Type: text/plain; charset=UTF-8

hi hi
your charts are clear

have a lovely weekend!
xo

Kate Moshbure
Swell Sailing Caribien <http://swellsailing.com>
Orlando Street, Suite 112
O.345.124.1433
C.918.112.4322
@swellsailing <http://instagram.com/swellsailing>

Cancellation Policy: Confirmed bookings cancelled less than one calendar week
from the depature date are subject to a 50% cancellation fee. Confirmed bookings
cancelled within 3 calendar days or less from the depature date require 100%
payment.

--001a113d0ce6dfa2e0054af38f25
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

<div dir=3D"ltr">hi hi<div>your charts are clear</div>

--001a113d0ce6dfa2e0054af38f25--
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
	<key>date-last-viewed</key>
	<integer>1490088118</integer>
	<key>date-received</key>
	<integer>1489784836</integer>
	<key>flags</key>
	<integer>8590195717</integer>
	<key>remote-id</key>
	<string>2151</string>
</dict>
</plist>

As I write this, I am looking at the code for extracting the actual content of the email. Since you cannot control the output, all kinds of things can appear in such a message. If you have ever parsed HTML documents for validation, you know a bit of what I am talking about. You would also be aware that removing and trimming whitespace and new lines is crucial. I am merely trying to emphasize that inspecting computer formats can be difficult.
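
For reference, here is a hedged sketch of how one might split an emlx file into its parts, based only on the structure shown above (a leading byte count, then the RFC 2822 message, then the Apple plist) - it is not the parser I actually wrote:

using System;
using System.IO;
using System.Text;

public static class EmlxSplitter
{
    // Splits an emlx file into the raw RFC 2822 message and the trailing plist.
    public static (string Message, string Plist) Split(string path)
    {
        var bytes = File.ReadAllBytes(path);

        // The first line of an emlx file is the byte count of the message that follows.
        int firstNewline = Array.IndexOf(bytes, (byte)'\n');
        int messageLength = int.Parse(Encoding.ASCII.GetString(bytes, 0, firstNewline).Trim());

        string message = Encoding.UTF8.GetString(bytes, firstNewline + 1, messageLength);

        // Whatever remains is the XML plist with Apple Mail's metadata (flags, dates, remote-id).
        int plistStart = firstNewline + 1 + messageLength;
        string plist = Encoding.UTF8.GetString(bytes, plistStart, bytes.Length - plistStart);

        return (message, plist);
    }
}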

Ok, so after running through some 30GB of emlx files, I ended up with a 335MB CSV file containing email content.

A content blob in the CSV could look like this (the CSV columns are there too):

date,from,to,subject,body
"2017-03-17T15:37:52.0000000+01:00","ddd@ggg.com","xxx@eee.com","Re: Avffen","Kære Krølle
Tusind tak for begge mails med flotte skibe.
Jeg vil gemme skibene i vores arkiv og så vender jeg tilbage når vi har nogle spændende nye sejladser :-)
Ønsker dig en dejlig weekend.
 LOVE
 Johhny Money
 Project Manager 
 Sugar Sailing ApS
 Hannesgade 16 - 1st floor
 1208 Copenhagen K
 Denmark
 Office: +45 1234 1234
 sugarsailing.com <
 FOLLOW US ON INSTAGRAM <"

Now, this message is actually decently formatted. It has left me with a proper message body and a well-established signature too, which is what I came for in the first place.

However, it is not uncommon for some of the content to contain gibberish and strange formatting like > < + ? and so on. What I found out later is that your data must be absolutely as clean as possible. The more solid your data foundation, the more solid your model becomes. Cleaning and preparing data is definitely not a job to take lightly; it takes time and multiple iterations. And when you have lots of data, it takes even more time.

If you are reading on and believe your favorite AI chatbot could do this in a jiffy, be my guest and try. Please let me know how well it went.

The next task for me from here is to label email bodies and signatures, somehow building a file for later use that can tell our model builder in ML.Net what is signature text and what is actual email text. So, I wrote another program to help me with that. It allows me to mark what text in an email is a signature and what is not. It would be persisted in a new CSV file for later use and have a format similar to this:

Text  |   Label 

"Best Regards", "positive"
"Med venlig hilsen", "positive"
"Tak fordi du vendte hurtigt tilbage", "negative"

As you might have noted already, one of these is a Danish signature "Med venlig hilsen," and I knew it might cause me trouble - I wasn't certain, though - since I mixed Danish and English signatures, as well as email content. But hey, why wouldn't a static algorithm be able to handle both languages?

Of course, in real life, a signature rarely looks like any of the above. There might be images, GIFs, HTML, long corporate marketing phrases, and sometimes... nothing at all.

A more realistic signature looks a lot more like this:

Med venlig hilsen / Best Regards Peter Jensen Firma ApS office: +45 22 11 22 11 cell: +45 11 22 22 11 www.jajaja.dk < instagram < Store Kongensgade 12  2.sal 1264 København K Danmark", Positive

Before I continue, let me just recap for the sake of my own memory and try to make it clear what I learned from a few days of programming these things.

Extracting emlx files from Apple Mail is not difficult, but reading a macOS-formatted disk can be tricky on Windows. I used HFSExplorer.

Emlx is closely related to multiple RFCs and is quite okay to work with for tasks such as determining what is a "From" field and what is an "HtmlBody." However, the DTD of http://www.apple.com/DTDs/PropertyList-1.0.dtd is not documented anywhere but can represent XML like this:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
	<key>date-last-viewed</key>
	<integer>1490088118</integer>
	<key>date-received</key>
	<integer>1489784836</integer>
	<key>flags</key>
	<integer>8590195717</integer>
	<key>remote-id</key>
	<string>2151</string>
</dict>
</plist>
  1. Any actual textual extraction from a computer format, such as HTML, CSS, or emlx, is tricky. Parsers of that kind are, to my knowledge, impossible to get right with RegEx. I relearned this while writing this software, since I used RegEx for extracting content.
  2. I didn't anticipate or prepare for a cleaning pipeline around emails to have this many steps. This was not because I didn't respect the complexity, but simply because I didn't prepare beforehand.
  3. The cleaning pipeline consists of:
    1. Raw emlx format extractions
    2. Content extractions from raw email format
    3. Labeling signatures from actual communicative content

To sum it up, I now have a 335MB CSV file with email content, cleaned reasonably well. I also have a file with positive and negative examples of what a signature could look like and what is definitely not a signature. This file is based on a subset of the file with email content.

As far as I understand from reading about basic machine learning concepts, this leaves me with the option of training a model using the positive and negative file. The model is supposed to learn from those entries and "teach itself" how to determine what an email signature looks like.

Now, if you are wondering why I am doing this, it is because if I succeed in extracting signatures from the actual content, I plan to take on a different task: trying to classify the actual email content. To what extent, I haven't decided yet. One could imagine a machine learning model capable of identifying historical tendencies in emails. That model could potentially focus on sales patterns, negotiation strategies, or other interesting insights. But that is for future exploration.

As I mentioned earlier, ML.Net seems to be an easy enough library to work with. The challenge is that it packages functionality for which I have little cognitive reference, which, of course, is not ML.Net's fault. This is the nature of such abstractions. They may be too easy to use for the uninitiated, and when you really need to tweak the parameters, you cannot avoid learning the underlying details. I believe I have fallen into that trap with this model.
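
The snippets below assume a small input class along these lines - the exact class I used is not shown here, so treat this as a plausible reconstruction:

// Plausible reconstruction of the input row type used below: one line of email
// text and a boolean label (true when the line belongs to a signature).
public class EmailText
{
    public string Text { get; set; } = string.Empty;
    public bool Label { get; set; }
}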

var mlContext = new MLContext();

var posNegFilePath = "posneg.csv";

IEnumerable<EmailText> data = File.ReadAllLines(posNegFilePath)
    .Skip(1) // header of csv
    .Select(line =>
    {
        var columns = line.Split(',');
        return new EmailText
        {
            Text = columns[0].Trim(),
            Label = columns[1].Trim().Equals("Positive", 
                StringComparison.OrdinalIgnoreCase)
        };
    });

var dataView = mlContext.Data.LoadFromEnumerable(data);

...

From here on out, there was some back and forth on how to use the mlContext to get as high a predictive accuracy as possible.

var pipeline = mlContext.Transforms.Text.FeaturizeText("Features", nameof(EmailText.Text))
    .Append(mlContext.BinaryClassification.Trainers.SdcaLogisticRegression(
        labelColumnName: nameof(EmailText.Label),
        featureColumnName: "Features"));

var signatureModel = pipeline.Fit(dataView);

I am really not sure if TF-IDF is the best option for what I am trying to do here, since, to my knowledge at least, words that occur commonly are given lower (IDF) weights. But this is shaky ground for me, as it feels like I am configuring a black box rather than truly understanding what I am doing, one abstraction at a time.

var predictions = signatureModel.Transform(dataView);
var metrics = mlContext.BinaryClassification.Evaluate(predictions, labelColumnName: nameof(EmailText.Label));

Console.WriteLine($"Accuracy: {metrics.Accuracy:P2}");

Luckily, the model is easy enough to train and evaluate, and neither process takes long. However, I find the results questionable. How can I achieve an accuracy above 90% from just 1,000 entries in the underlying file with positives and negatives? Could it be because I am evaluating the model on the same data that I used to train it?

Train or Run
train

Accuracy: 94.00%
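
If the suspicion about evaluating on the training data is right, one way to check would be to hold out a test set before fitting. A hedged sketch, reusing the dataView and pipeline from above:

// Hold out 20% of the labeled data and evaluate on rows the trainer never saw.
var split = mlContext.Data.TrainTestSplit(dataView, testFraction: 0.2);

var heldOutModel = pipeline.Fit(split.TrainSet);
var heldOutMetrics = mlContext.BinaryClassification.Evaluate(
    heldOutModel.Transform(split.TestSet),
    labelColumnName: nameof(EmailText.Label));

Console.WriteLine($"Held-out accuracy: {heldOutMetrics.Accuracy:P2}");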

When I run and use the model, rather than training it, it does not yield any results I am confident in. I could argue that this is somehow an accumulated positive bias, but then again, not really. I might be wrong, but "Denmark" should not have a score of 99.71%, and "Follow us on Instagram" should perhaps have a higher score?

[Signature Detected]: Ønsker dig en dejlig weekend. (Probability: 96.05%)
[Signature Detected]:  LOVE (Probability: 96.73%)
[Signature Detected]:  Robert Tobiassen (Probability: 93.27%)
[Signature Detected]:  Project Manager  (Probability: 98.21%)
[Signature Detected]:  ZigZag ApS (Probability: 96.91%)
[Signature Detected]:  Johnny Madsensgade 6 - 1st floor (Probability: 92.83%)
[Signature Detected]:  1208 Copenhagen K (Probability: 99.69%)
[Signature Detected]:  Denmark (Probability: 99.71%)
[Signature Detected]:  Office: +45 3344 7722 (Probability: 99.15%)
[Signature Detected]:  zigzag.com < (Probability: 98.41%)
[Signature Detected]:  FOLLOW US ON INSTAGRAM < (Probability: 81.16%)

Look at this one

[Signature Detected]: How old are they? Are they full time?  (Probability: 98.99%)
[Signature Detected]: Thank you (Probability: 99.86%)

It does seem like a bad model to me, and it also seems that my data might not be good enough. So, I need to revisit the cleaning pipeline and labeling of data. While I am at it, I think I will use only English.
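
For completeness, the per-line scores shown above come from running individual lines through the trained model, roughly like the hedged sketch below; the SignaturePrediction class and the emailLines variable are my naming for this example, not the exact program I ran.

// Requires using Microsoft.ML.Data for the [ColumnName] attribute.
public class SignaturePrediction
{
    [ColumnName("PredictedLabel")]
    public bool IsSignature { get; set; }

    public float Probability { get; set; }
}

var engine = mlContext.Model.CreatePredictionEngine<EmailText, SignaturePrediction>(signatureModel);

foreach (var line in emailLines) // emailLines: the individual lines of an email body (assumed)
{
    var prediction = engine.Predict(new EmailText { Text = line });

    if (prediction.IsSignature)
        Console.WriteLine($"[Signature Detected]: {line} (Probability: {prediction.Probability:P2})");
}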

A failed attempt so far, but I will try to give it another go in the not-too-distant future.

A Software Challenge that in turn is an Organizational Challenge

tirsdag den 10. december 2024

This is an exercise where there is no right or wrong answer. It is up to you to determine how the establishment of communication between two pieces of software should be put in place. You have no organizational power, so you cannot change how management has laid out the organization. You are part of a product team that develops a software product and uses other teams' products and data.

Today you have

  1. ImageTexts (a Git repository hosted on GitHub).
  2. ImageGen (an executable program).

Today, ImageGen is an executable program that pulls file data from ImageTexts and works its magic from there.

The only thing ImageGen knows about ImageTexts is the URI of the Git repository. ImageTexts implements a file data format that ImageGen also knows about - it could just as well have been a binary format like PNG or JPEG. However, they do not share any code paths. They are completely separate on a technical level aside from knowing the same file format.

This separation exists partly because they deliver two different value streams to the business, and the organisation is unaware of how this might cause trouble:

  1. ImageGen is a product, a piece of software, created by your product team.
  2. ImageTexts is essentially a file store (really just a Git repository at its core) implemented by another product (pick a name!) that is managed by a different team. Marketing staff use this product to write the texts that generate AI images. Marketing, of course, does not know anything about the ImageTexts Git repository. Nor should they.

ImageGen generates and saves images by pulling the files and formats persisted by ImageTexts. For simplicity, we can assume ImageGen saves its created images locally on disk.

The thing is, this works fine. Pulling information is easy and has worked well for a long time. Creating images is also straightforward. Disk is blazing fast, and everyone has been happy.

Your new Problem

But one day, someone in marketing - the people who write the cool texts that ImageGen turns into images - says:

"Wait, why do we have to wait 10 to 20 minutes for our images to show up? Why can't I get the AI image as soon as I save the text that it should generate from?"

Faster results they want!

You know this delay might be because ImageGen is pulling data from the ImageTexts Git repository at fixed intervals. The execution frequency seems to be too low for marketing's needs. They're busy people, man!
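
For context, the current mechanism is essentially a polling loop. A hedged sketch of what it could look like - the class name, the git invocation, and the interval are purely illustrative assumptions:

using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch of today's pull mechanism inside ImageGen.
public sealed class ImageTextsPoller
{
    private static readonly TimeSpan Interval = TimeSpan.FromMinutes(10); // the delay marketing notices

    public async Task RunAsync(string localRepositoryPath, CancellationToken cancellationToken)
    {
        while (!cancellationToken.IsCancellationRequested)
        {
            // Pull the latest texts from the ImageTexts Git repository.
            using var git = Process.Start(new ProcessStartInfo("git", "pull") { WorkingDirectory = localRepositoryPath });
            git?.WaitForExit();

            // ...detect new or changed text files and generate images from them here...

            await Task.Delay(Interval, cancellationToken);
        }
    }
}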

It also sounds like marketing would like to use some kind of push mechanism so that ImageGen "wakes up" immediately when a file is saved to the ImageTexts git repo.

Push Mechanism and Events

This problem sounds familiar: some event happens (a file is saved in ImageTexts), and something else should respond to it (ImageGen generates an image).

Conceptually, it could look a bit like any other event mechanism:

Action -> Event -> Action -> Event
                          \
                           -> Event

You are well aware of the technical side, having done this for many different products. You believe that implementing a push mechanism would not be a problem, technically.

For example, the product that marketing uses could push events about new files in the ImageTexts repo to a queue, bus, or publisher. Then, on the other side, ImageGen could start listening to those events.
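
A minimal sketch of what the receiving side could look like, assuming the simplest possible push channel (a GitHub push webhook straight into an HTTP endpoint hosted by ImageGen); the endpoint route and the generation call are purely illustrative:

using System;
using System.Text.Json;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;

var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();

// GitHub (or the product wrapping ImageTexts) calls this endpoint on every push.
app.MapPost("/webhooks/imagetexts-push", async (HttpRequest request) =>
{
    using var payload = await JsonDocument.ParseAsync(request.Body);

    // The push payload lists commits with the paths that were added or modified.
    foreach (var commit in payload.RootElement.GetProperty("commits").EnumerateArray())
    {
        foreach (var path in commit.GetProperty("added").EnumerateArray())
        {
            // Illustrative only: this is where ImageGen would kick off image generation.
            Console.WriteLine($"Would generate an image for {path.GetString()}");
        }
    }

    return Results.Ok();
});

app.Run();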

There are other things to be aware of.

You are not part of the team responsible for the ImageTexts repository. This means you would likely need to create a ticket asking for the new functionality. They don't accept pull requests from others. But how busy is that team? Can they prioritize it? And marketing is on your back every week about this.

If the team cannot prioritize the request, are there alternative approaches that don't require changes on their side?

As you think a little about this, new questions arise:

Who should be responsible for running and maintaining the queue, bus, or publisher?

If something breaks in this proposed push-based mechanism, who will troubleshoot it? The ImageTexts team? The ImageGen team? Someone else?

You also wonder...

Can we just increase the execution frequency of the current pull mechanism in ImageGen? Maybe this simple solution will be sufficient for marketing's needs?

Who is to determine the golden path of how important this is, and who is in charge? Do I call on some kind of leadership?

Remember a few of the things you are dealing with:

  1. Two systems, decoupled and unaware of each other at a technical level.
  2. Two different product teams: one for ImageGen and another for ImageTexts.
  3. No shared packages or code - only a URI and a file format, implemented separately in each system.

If you are reading this as a programmer, you might already have arrived at a technical solution. In that case, have you thought about what strain the solution will put on the rest of the organisation?

Have you considered how large the organisation is? How slow or fast it is? Have you considered how communication flows and who is in charge? Even though this might resemble the company you work for today, it is not. Respect that and understand that context when you start inspecting for solutions.

Like I stated in the beginning, use this as an exercise. There is no right or wrong answer. You might be able to learn from it, or you might have too large an ego or too simple answers for contexts you are unaware of.

But given these constraints, what would you do?

A German Dog Puzzle, Recursion and Backtracking

onsdag den 4. december 2024

At times, programming feels like something entirely different from building software within the boundaries of monetary expectations and some external pseudo-demands. The paper before you, filled with attempts at sketching algorithms, the allure of simply understanding structures, can be captivating - and yet, the frustrations can feel completely boundless. It can cripple your ability to step back, consult books, learn from the struggles of others, and try anew. To linger into the early hours of the morning, poring over details, branching out, circling back, starting over. What a horrible mess it can be.

When the children are asleep, conversations seem distant, and your mind swirls with possibilities - both the challenge at hand and the one waiting just beyond it - I find myself drifting more quickly to the window. I squeeze it open, letting the late evening's cool quietly embrace me, pulling me into a mental space filled with sweet digital memories and unresolved puzzles.

Dogs still bark in the hills. The moon hangs large over the cypress lining the avenue, reminding me of my place within my own surroundings. I tell myself, "I must program this to learn." To confront my shortcomings. To be in conflict. To conquer. To feel the satisfaction - not in the solution itself, but in the unraveling of understanding something. The act of completing something feels good. To have created a result feels meaningful.

The heavy yellow light has returned. Half-moon and half-streetlamp. I feel half-knowing and half-powerless. The humble honesty of my poor self, the echo in the clatter of my keyboard, while Ferry Corsten's upbeat rhythms weave into my ears, carrying me into some mellow trance of presence and, ultimately, exhaustion. Another night like this.

I've always found recursion to be equal parts amusing and frustrating. When I first encountered it in 2001, it was in connection with a tree structure I was tasked with printing on a screen - something hierarchical. Perhaps it was something as labyrinthine and finite as an organizational diagram for a government office.

I remember it vividly. I spent hours agonizing over why my brain couldn't grasp why a simpler iterative loop wouldn't suffice. I couldn't solve it. I hadn't even heard of recursion before, and while it's technically possible to traverse a tree structure without a self-calling function, my mind couldn't comprehend how.

I couldn't even begin to conceive of the idea that a function could call itself to arrive at a final result. What do you mean, calling yourself? Are you kidding me? But after all, I didn't come from a computer science background - as a kid I programmed some C64, but I was also out a lot, looking for skateboarding spots.

Being sunk into the leather chair, Lis Wessberg playing softly in the speakers, a glass of dark rum before me, I now open the dense references with joy, almost bliss, eager to sink myself into these topics. Not like the old days, when speed was necessary to survive a job, but now, it's for the sheer pleasure of understanding - slowly.

Because how does one calculate the "child" of a "child" of a "parent"? Equal parts art, craft, timing, and logical reasoning. Correctness is enough for me, I am not pedantic. Even after 25 years of programming, I know I'm not particularly skilled in the grand scheme of this. But I get around.

A tree structure. My goal back then was to produce a result like this:

Frontpage
    Articles
        Archive
        By Author
            Nuller
            Niller
        Latest
            Yesterday
                Morning
                Afternoon
                Night
About
Contact
    Secretary
    Leadership

And the data I had available was structured as follows:

key     child     title
 1      null    Frontpage
 2      null      About 
 3      null     Contact
 8       1       Articles
 11      8       Archive
 12      8       By Author
 13      12      Niller
 17      12      Nuller

And so on... A good match for the use of recursion - a sketch of such a recursive printer follows the trace below.

Processing item: Key=1, Child=, Title=Frontpage
- Frontpage
    Processing item: Key=8, Child=1, Title=Articles
    - Articles
        Processing item: Key=11, Child=8, Title=Archive
        - Archive
------ Exiting recursion: parentKey=11, level=3
        Processing item: Key=12, Child=8, Title=By Author
        - By Author
            Processing item: Key=13, Child=12, Title=Niller
            - Niller
-------- Exiting recursion: parentKey=13, level=4
            Processing item: Key=17, Child=12, Title=Nuller
            - Nuller
-------- Exiting recursion: parentKey=17, level=4
------ Exiting recursion: parentKey=12, level=3
---- Exiting recursion: parentKey=8, level=2
-- Exiting recursion: parentKey=1, level=1
Processing item: Key=2, Child=, Title=About
- About
-- Exiting recursion: parentKey=2, level=1
Processing item: Key=3, Child=, Title=Contact
- Contact
-- Exiting recursion: parentKey=3, level=1
 Exiting recursion: parentKey=, level=0
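
For reference, here is a minimal sketch (not the code that produced the trace above) of the kind of recursive printer involved, assuming rows shaped like the earlier table: a key, a reference to the parent key in the "child" column, and a title.

using System;
using System.Collections.Generic;
using System.Linq;

public record MenuItem(int Key, int? Child, string Title);

public static class TreePrinter
{
    // Print every item whose "Child" column points at the current parent,
    // then recurse into that item to print its own children one level deeper.
    public static void Print(IReadOnlyList<MenuItem> items, int? parentKey = null, int level = 0)
    {
        foreach (var item in items.Where(i => i.Child == parentKey))
        {
            Console.WriteLine($"{new string(' ', level * 4)}{item.Title}");
            Print(items, item.Key, level + 1);
        }
    }
}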

Throughout my career, I have often encountered cases where similar data is modeled by adding yet another column, which at first glance seems to make things "easier" than resorting to recursion. A column where the tree is precomputed for each node. This opens up the possibility of traversing the tree linearly.

key     child     title        precomp
 1      null      Frontpage    1
 2      null      About        2
 3      null      Contact      3
 8       1        Articles     1, 8
 11      8        Archive      1, 8, 11
 12      8        By Author    1, 8, 12
 13      12       Niller       1, 8, 12, 13
 17      12       Nuller       1, 8, 12, 17
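
With a precomputed path per row, the same output can be produced without recursion. A hedged sketch, assuming a row type that carries the precomp column as a string:

using System;
using System.Collections.Generic;
using System.Linq;

public record PathedItem(int Key, string Title, string Precomp);

public static class LinearTreePrinter
{
    public static void Print(IEnumerable<PathedItem> items)
    {
        // Order rows by their path, comparing the key segments numerically,
        // then indent each row by the number of ancestors in its path.
        var ordered = items.OrderBy(
            i => i.Precomp.Split(',').Select(int.Parse).ToArray(),
            Comparer<int[]>.Create((a, b) =>
            {
                for (int k = 0; k < Math.Min(a.Length, b.Length); k++)
                    if (a[k] != b[k]) return a[k].CompareTo(b[k]);
                return a.Length.CompareTo(b.Length);
            }));

        foreach (var item in ordered)
        {
            int depth = item.Precomp.Split(',').Length - 1;
            Console.WriteLine($"{new string(' ', depth * 4)}{item.Title}");
        }
    }
}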

I suddenly remembered that the last time I used recursion was for a solution I devised for the "A Man with a Three Path Challenge." In that scenario, I ended up using recursion to partition data.

Yet, as if a wrench was suddenly thrown into the gears of my old OM636, I stumbled upon an article by Mark Seemann about the delightful little game Das Verflixte Hundespiel. And there I was - trapped. Let me clarify: it's not as if I don't understand which battles are worth fighting. But this challenge? It blindsided me. And it was a great read too. Bollocks.

Das Verflixte Hundespiel

For hours each evening, over the course of a week, I found myself returning to this puzzle. A week when, by all measures, I had no energy to write a single line of code. And yet, there I sat. Paralyzed by the backtracking nature of a challenge I had convinced myself I must solve.

What makes this puzzle particularly challenging, in my opinion, isn't just the number of possible permutations. It's the combination mechanism (rotation), the potential need for memory to track which pieces have been tried and where, and - crucially - the twist recursion introduces: the requirement to "backtrack" to every prior execution.

The Haskell code I had in front of me offered little assistance - I simply don't read Haskell. But I did understand the rules of the game, rules I came to realize are logical yet deliberately designed to be difficult to solve.

My first attempt was, quite honestly, more about getting something to compile than it was about crafting a solution. I needed to grasp the mechanics: how a piece in the game, when it doesn't fit, forces you to backtrack to the previous position on the board, try a different combination, and continue this process until all possible paths lead to a positive outcome.

Very quickly, my implementation became ridiculously unwieldy and obscure. I realized I had gone too far. The solution didn't need to be this convoluted. As always when I program, I start over at least a few times.

Here's an excerpt from that first iteration:

public bool Solve(PuzzleSetup setup, PuzzleGridRowColumn topRow, int column)
{
    // we are at the first tile, we must start from the beginning
    if (column == 0)
    {
        setup.ReleaseUnvailableTile(topRow.PuzzleRow[0]);
        var mostLeftTile = setup.PickAnyTilePseudoRandom();
        topRow.PlaceTileInColumn(mostLeftTile, 0);
        setup.AddUnvailableTile(mostLeftTile);

        //since we might get in here by stepping back, we release
        setup.ReleaseTriedTiles();

        ++column;
    }

    foreach (var item in setup.AvailableTiles.ToList())
    {
        // if we have reached index 3 we are out of bounds and know we have succeeded
        if (column == 3)
            return true;

        var tryTile = setup.PickAnyTilePseudoRandom();
        var tryTileRotations = setup.Rotations(tryTile);

        //remember the tile that did not work so it's not being tried again.
        //and then release it when we find a match.
        var tileMatch = topRow.IfLeftTileMatchRightTilePlaceTile(topRow, column, tryTileRotations);

        // no match
        if (!tileMatch)
        {
            // if we have exhausted available tiles, release, step back and retry on that column index
            if (setup.AvailableTiles.Count == 0)
            {
                setup.ReleaseTriedTiles();

                // step back
                if (column > 0)
                {
                    --column;
                }
            }

            Solve(setup, topRow, column);
        }
        else
        {
            //match, make the tile unvailable
            setup.AddUnvailableTile(tryTile);

            //next column
            ++column;
        }
    }

    return false;
}

And so, I had at least four or five iterations that ended up looking rather peculiar.

Ken Shirriff explains the challenge in great detail on his blog post about this very puzzle. I also delved into some of Stanford's CS106B material, which featured a visualization of the solution - a striking depiction of where the combinations are tested and how the algorithm backsteps. Let me just say, the material available on Stanford's computer science pages is absolutely phenomenal.

recursive rotating backtracking

In the meantime, as I explored what turned out to be an incredibly elegant algorithm, I managed to solve a few smaller challenges. I failed miserably at using TDD though - a method I almost can't code without - and it's been ages since that last happened. I think the issue was that I didn't have a clear enough picture of how I wanted this to look in the end. As I mentioned to myself earlier, my coding stamina was far from strong that week. I was mentally elsewhere for most of it, and when my focus is scattered, I can't produce anything meaningful.

public class PuzzleSolver
{
    private const int gridSize = 9;
    private bool[] usedTiles;

    public PuzzleTile[,] PuzzleBoard;
    public PuzzleTile[] Tiles { get; set; }

    public PuzzleSolver(PuzzleTile[] puzzleTiles)
    {
        Tiles = puzzleTiles;
        PuzzleBoard = new PuzzleTile[3, 3];
        usedTiles = new bool[9];
    }

    public bool Solve()
    {
        ResetBoard();

        return PlaceTile(0, 0);
    }

    private void ResetBoard()
    {
        // no Array.Fill for [,]
        for (int row = 0; row < 3; row++)
            for (int col = 0; col < 3; col++)
                PuzzleBoard[row, col] = PuzzleTile.Empty;


        Array.Fill(usedTiles, false);
    }


    private int executionCounter = 0;
    private int NextExecution()
    {
        return executionCounter++;
    }

    private int NextRow(int column, int row)
    {
        return column == 2 ? row + 1 : row;
    }

    private int NextColumn(int column)
    {
        return (column + 1) % 3; //3 is max column size
    }

    private bool PlaceTile(int row, int column)
    {
        if (row == 3) // base case, solved!
        {
            return true;
        }
        
        int nextRow = NextRow(column, row);
        int nextColumn = NextColumn(column);

        for (int i = 0; i < Tiles.Length; i++)
        {
            if (usedTiles[i])
            {
                continue;
            }

            foreach (var rotatedTile in TileRotator.Rotations(Tiles[i]))
            {
                if (IsPlacementValid(rotatedTile, row, column))
                {
                    PuzzleBoard[row, column] = rotatedTile;
                    usedTiles[i] = true;

                    // next tile
                    if (PlaceTile(nextRow, nextColumn))
                        return true;

                    // if we cannot place the tile, "undo"/backstep tile at position
                    PuzzleBoard[row, column] = PuzzleTile.Empty;
                    usedTiles[i] = false;
                }
            }
        }

        // no valid tile found. Backtracking.
        return false;
    }


    private bool IsPlacementValid(PuzzleTile tile, int row, int col)
    {
        // Check top
        if (row > 0 && PuzzleBoard[row - 1, col] != null)
        {
            if (!TileMatches.TopBottomMatch(PuzzleBoard[row - 1, col], tile))
                return false;
        }

        // Check left
        if (col > 0 && PuzzleBoard[row, col - 1] != null)
        {
            if (!TileMatches.LeftRightMatch(PuzzleBoard[row, col - 1], tile))
                return false;
        }

        return true;
    }
}

public static class TileRotator
{
    public static PuzzleTile Rotate(PuzzleTile incoming)
    {
        return new PuzzleTile(incoming.left, incoming.top, incoming.right, incoming.bottom);
    }

    public static IEnumerable<PuzzleTile> Rotations(PuzzleTile current)
    {
        var rotations = new List<PuzzleTile>();

        for (int i = 0; i < 4; i++)
        {
            rotations.Add(current);
            current = Rotate(current);
        }

        return rotations;
    }
}

public static class TileMatches
{
    public static readonly HashSet<string> matches;

    static TileMatches()
    {
        matches = new HashSet<string>
        {
            "BHBT",
            "GHGT",
            "SHST",
            "UHUT"
        };
    }

    public static bool LeftRightMatch(PuzzleTile left, PuzzleTile right)
        => matches.Contains(string.Concat(left.right, right.left));

    public static bool TopBottomMatch(PuzzleTile top, PuzzleTile bottom)
        => matches.Contains(string.Concat(top.bottom, bottom.top));
}

public static class TileShuffler
{
    public static PuzzleTile[] Shuffle(List<PuzzleTile> array)
    {
        return array.OrderBy(_ => Guid.NewGuid()).ToArray();
    }
}

public record PuzzleTile(string top, string right, string bottom, string left)
{
    public static PuzzleTile Empty => new PuzzleTile(string.Empty, string.Empty, string.Empty, string.Empty);

    public override string ToString()
    {
        return $"Top: {top} Right: {right} Bottom: {bottom} Left: {left}";
    }
}

And so, I emerged from that week, not with a pristine solution, but with a deeper understanding of the problem - and myself. It reminded me of why I code in the first place. It's not about the cleanest implementation or even solving the puzzle; it's about the process - the way challenges stretch the mind and reveal its limits, the quiet joy of inching closer to clarity, even through failure.

Sometimes, the real value isn't in finding the perfect answer but in the effort and persistence it takes to get there. The week left me with a respect for the elegance of recursion, the complexity of backtracking, and the satisfaction of tackling something that genuinely stretched my limits.

I happily run the PuzzleSolver by initializing the 9 tiles. One thing I probably do not need is the while loop, but I am actually not confident about that.

public class Program
{
    public static void PrintSolution(PuzzleTile[,] Grid)
    {
        var output = new StringBuilder();

        //rows
        for (int i = 0; i < 3; i++) 
        {
            // first row
            for (int j = 0; j < 3; j++)
            {
                output.Append($"   {Grid[i, j].top}   ");
                output.Append(" ");
            }
            output.AppendLine();

            // second row
            for (int j = 0; j < 3; j++)
            {
                output.Append($"{Grid[i, j].left}    {Grid[i, j].right}");
                output.Append(" ");
            }
            output.AppendLine();

            // third row
            for (int j = 0; j < 3; j++)
            {
                output.Append($"   {Grid[i, j].bottom}   ");
                output.Append(" ");
            }

            //spacers
            output.AppendLine();
        }

        Console.WriteLine(output.ToString());
    }

    public static void Main()
    {
        var gamePieces = new List<PuzzleTile>
        {
            new PuzzleTile("BH", "GH", "UT", "SH"), // Brown head, grey head, umber tail, spotted tail
            new PuzzleTile("BH", "SH", "BT", "UT"), // Brown head, spotted head, brown tail, umber tail
            new PuzzleTile("BH", "SH", "GT", "UT"), // Brown head, spotted head, grey tail, umber tail (duplicate)
            new PuzzleTile("BH", "SH", "GT", "UT"), // Brown head, spotted head, grey tail, umber tail (duplicate)
            new PuzzleTile("BH", "UH", "ST", "GT"), // Brown head, umber head, spotted tail, grey tail
            new PuzzleTile("GH", "BH", "ST", "UT"), // Grey head, brown head, spotted tail, umber tail
            new PuzzleTile("GH", "SH", "BT", "UT"), // Grey head, spotted head, brown tail, umber tail
            new PuzzleTile("GH", "UH", "BT", "ST"), // Grey head, umber head, brown tail, spotted tail
            new PuzzleTile("GH", "UH", "GT", "ST")  // Grey head, umber head, grey tail, spotted tail
        };

        bool solutionFound = false;

        while (!solutionFound)
        {
            var shuffledPieces = TileShuffler.Shuffle(gamePieces);
            var solver = new PuzzleSolver(shuffledPieces);

            Console.WriteLine("Solve the Puzzle...");
            solutionFound = solver.Solve();

            if (solutionFound)
            {
                PrintSolution(solver.PuzzleBoard);
            }
            else
            {
                Console.WriteLine("Back to square (□) 1...");
            }

            Console.WriteLine(Environment.NewLine);
        }
    }
}

The old Drum of When Scrum Works

mandag den 18. november 2024

When Scrum works, organizations and teams, I've observed, can become accountable and predictable.

This will sound like the beat of an old drum - nonetheless, I believe it's an old drum worth beating. This is yet another opinionated piece delving into why, as a field, we still struggle with Scrum - a process that, on paper, is quite approachable but demands professionalism, time, understanding, and respect to truly succeed.

Over the past years, Agile - and particularly Scrum - has been criticized for being tedious and difficult. I come from a different school of thought, one that might seem a bit too "square" for the pop-culture-inspired "young guns". I do not mean that in any harsh way; I myself was a "young gun" plowing away. I was raised in software development with some notion of real agility, but my own context and setting for many years was in environments of small teams and limited resources - until I tried the powers of larger organizations and teams.

Different approaches work in different contexts and settings, and this distinction is key. Process can be beneficial, but process for the sake of process is irrelevant and often counterproductive in certain circumstances.

Blaming the process, in my view, is a mistake - usually stemming from a misunderstanding of the context and setting. This, of course, is not the fault of the process.

Hanlon's Razor reminds us: "Never attribute to malice what can adequately be explained by incompetence."

Programmers are often creative beings, driven by a desire to unlock their potential through code. In that light, a process like Scrum might feel restrictive, or even stifling. It may feel like it slows progress, though often it doesn't. The tension comes from misaligned expectations and a lack of solidarity within the given context and setting.

Rather than asking how a process is failing, we should ask why the process is needed. In some contexts, it may not be. But context and setting should inform that decision, not blind faith or frustration.

Scrum has been sold as an Agile framework for sustaining flow and improving output, but it has never claimed to guarantee correct outcomes. That distinction - between output and outcome - is critical. Scrum won't save you if you don't know your goals.

Over my years in programming, I've seen this misplaced hope time and time again: a belief that process alone will deliver good outcomes, as if it could somehow walk on water.

A better approach is to ask why a process like Scrum is crucial for a particular organization or team. Software development is notoriously unpredictable (seriously, read the books!), but unpredictability doesn't mean we shouldn't try. Predictability is a shared responsibility - not just among developers but also for those setting business goals.

Scrum isn't perfect for every scenario. It's not great for validating hypotheses or scenarios requiring rapid iteration. It's too ceremonial for that. But in the right contexts - especially those demanding stability, predictability, and accountability - it excels.

Professionalism and respect are non-negotiable.

There are no shortcuts if the desired outcome is valuable. Sometimes, value means discarding an idea early because you've identified it as infeasible within your context and setting.

I believe Scrum is a fine process. Having worked through a wide range of issues with it, I've learned that the core problems often come down to maturity - or lack thereof - and an overload of noise.

Large egos often complicate things. Just as architects can dictate from ivory towers, claiming the map is the territory, colleagues might whisper (or shout) that Scrum isn't working. Some might claim Scrum isn't "truly Agile." I've even done so myself at times - but only when it's clear that someone has misunderstood or misapplied the process.

Scrum is a well-documented, robust framework.

When I've seen it work brilliantly, it's been transformative - like a seafarer discovering new continents. In these cases, teams understood the context and setting that made Scrum both helpful and necessary. Conversely, I've also seen Scrum fail miserably, often because it was implemented without consideration of those same factors.

Agility for individuals vs. teams

Individual agility and team agility are not the same. Teams are not just collections of individuals. A team of five agile individuals is not necessarily an agile team. Scrum is a tool for teams, not individuals. It emphasizes stability, predictability, and accountability within a collective, which may feel stifling for individual contributors who value creative freedom above all else.

This tension often stems from broader social constructs. In a world that prizes individuality, true teamwork - with no egos - can be a hard sell.

That said, many organizations have embraced Scrum as a default process, often misunderstanding its purpose. Scrum doesn't promise "swiftness" or ease of execution - it promises a framework. It supports teams by providing structure, but success requires professionalism and respect.

Introducing Scrum is a big deal.

Its ceremonies and roles are deliberate, and if the process isn't followed properly, it's not Scrum's fault. The expectation that Scrum is simple or intuitive is misguided. Implementing it requires significant effort and discipline.

In every organization, there are underlying factors - politics, pressure, dishonesty, incentives, communication barriers - that can derail even the best intentions. Before choosing a process, these factors need to be acknowledged and addressed.

Scrum is not complex in itself, but getting people to work together within its framework is. Don't be fooled by thin books or easy certifications. "The word is not the thing," as Korzybski famously said.

Beyond understanding Scrum, people need to respect it. Retrospectives, for example, are a vital space for actionable discussions. But if those discussions don't lead to tangible change, retrospectives lose their value.

Scrum is not agility - it's accountability.

In my opinion, Scrum works best when teams are in a stable, norming state - not forming or storming. For teams still finding their footing, lighter-weight processes are often more appropriate.

Yes, Scrum can feel tedious, but that's part of its DNA. It's designed to foster accountability and predictability. For teams focusing solely on business capabilities and neglecting technical growth, Scrum doesn't dictate that imbalance. Professional developers should know when to prioritize technical improvements.

The problem with Scrum isn't Scrum - it's expectations.

If leadership doesn't support it, Scrum won't work. If your work items are ambiguous, Scrum won't save you. If your Scrum Master fails to follow through on retrospectives, Scrum will falter. And so on.

Scrum is difficult to get right, but the process itself is rarely to blame for failures.

Balancing Readability, Testability, and Structure: Refactoring a small type with John Carmack’s Email in mind

mandag den 28. oktober 2024

This is an article about refactoring the characteristics and behaviors of a type, based on some interesting arguments written in an email by programmer John Carmack in 2007.

The email.

He introduces an interesting idea that makes me reflect on when and why it might be necessary to disregard conventional wisdom surrounding the downsizing and decomposition of methods/functions, often aimed at improving readability, code navigation, and mental cognition. While I definitely believe that smaller pieces of code can enhance readability, there is also a cost associated with it, particularly around maintaining the same navigational context and minimizing context switching.

It's a worthwhile question whether breaking down methods/functions into smaller ones is the right approach from the outset.

John's email from 2007, at least as I interpret it, suggests something contrary to what I have been taught over the years—that is, not to lump everything into one method or function. Larger methods, of course, can blur readability, become brittle in testing, and violate the Single Responsibility Principle (SRP). The exercise of "3 strikes, then add the first abstraction" comes to mind. I no longer apply this upfront because my sense of when to decompose a method is somewhat ingrained in how I approach programming in the first place. So, I could say that I decompose right away, but that wouldn't be entirely accurate because I frequently revisit and refactor, either following the Boy Scout Rule or based on my cognitive state at that particular time. I write a method, decompose it when it becomes too long, and continuously refine it without giving it much thought. I do, however, approach decomposition with more consideration when it involves a public API.

I mention "first-time abstraction" because I don't believe the initial abstraction attempt is always correct, even after reviewing something three times and determining it should be abstracted. I might need the abstraction, but does that mean it will be perfect the first time? In my experience, no.

This has become part of my own practice: applying a hypothesis around a potential abstraction but also keeping it as narrow as possible due to the likelihood of revisiting the same piece of code in the future. This isn't universally applicable—magic numbers and strings, for example, are clear candidates for abstraction early on.
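
As a throwaway illustration of the kind of early, low-risk abstraction I mean - the names here are made up for the example:

public static class MarkdownDefaults
{
    // A magic string pulled out once, instead of repeating ".md"/".markdown" wherever it is needed.
    public static readonly string[] ValidFileExtensions = [".md", ".markdown"];
}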

Anyway, as I reflect on this while writing and reviewing some code, I see no reason to split up this function or module/class as a means of improving readability. Readability is a crucial aspect of programming, but I also worry about applying too much abstraction since it can extend beyond readability concerns. And honestly, I'm not sure I entirely believe that.

When I navigate through code—going to references, implementations, scrolling up and down—I often lose context. But maybe if the decomposition isn't scattered across different types and files, it could be effective. This is how I've programmed for years.

It has been a long time since I've used a different approach in any object-oriented programming language, though earlier I wrote a lot of interpreted code. Robert C. Martin's advice, “Gather together the things that change for the same reasons,” still resonates. It's solid guidance.

John Carmack's perspective adds another layer, one that I believe significantly impacts a programmer's cognitive load when navigating and reading source code: “Step into every function to try and walk through the complete code coverage.” Next time you're working in a codebase, try following all the paths your code takes during execution and debugging.

We can debate readability and composition all year. I'm not here to impose any dogma or dictate the best approach—you can decide that for yourself. I'm simply experimenting with styles. Some, as John suggests, might be less prone to introducing defects.

In his email, John outlines three types of compositions for a function, or a module if you prefer to extend the concept.

------- style A:
 
void MinorFunction1( void ) {
}
 
void MinorFunction2( void ) {
}
 
void MinorFunction3( void ) {
}
 
void MajorFunction( void ) {
        MinorFunction1();
        MinorFunction2();
        MinorFunction3();
}
 
--------- style B:
 
void MajorFunction( void ) {
        MinorFunction1();
        MinorFunction2();
        MinorFunction3();
}
 
void MinorFunction1( void ) {
}
 
void MinorFunction2( void ) {
}
 
void MinorFunction3( void ) {
}
 
 
---------- style C:
 
void MajorFunction( void ) {
        // MinorFunction1
 
        // MinorFunction2
 
        // MinorFunction3 
}

He concludes with these thoughts:

“Inlining code quickly runs into conflict with modularity and object-oriented programming (OOP) protections, and good judgment must be applied. The whole point of modularity is to hide details, while I advocate for increased awareness of those details. Practical factors like increased multiple checkouts of source files and including more local data in the master precompiled header, forcing more full rebuilds, must also be weighed. Currently, I lean towards using heavyweight objects as the reasonable breakpoint for combining code and reducing the use of medium-sized helper objects while keeping any very lightweight objects purely functional if they must exist at all.”

The key points I take away are that inlining quickly runs into conflict with modularity and OOP protections; that modularity exists to hide details, while Carmack argues for more awareness of those details; that practical build concerns have to be weighed; and that he currently leans towards heavyweight objects as the breakpoint for combining code, with fewer medium-sized helper objects and any very lightweight objects kept purely functional, if they must exist at all.

Finally, John gives this advice:

“If a function is only called from a single place, consider inlining it.”

I don't intend to diminish John's insights—he's a far more experienced programmer than I am—but I find his thoughts and arguments compelling enough to explore further. Since it's a style I haven't used in a while, I'm curious to give it a try.

The first thing that comes to mind when I read, “If a function is only called from a single place, consider inlining it,” is that every function starts by being called from just one place. If not, you might already be ahead of yourself.

I'll begin with an inlined version of an originally non-inlined method and present the full type. One could write a type like this with significant inlining involved.

public class DownloadMarkdownFileService : IDownloadMarkdownFile
{
	private readonly string[] ValidFileExtensions = [".md", ".markdown"];
	private readonly List<DownloadMarkdownExecption> downloadExceptions = [];
	private readonly List<MarkdownFile> markdownDownloads = [];
	private readonly DownloadMarkdownFileServiceResult downloadAsyncResult = new();

	private readonly HttpClient httpClient;

	public DownloadMarkdownFileService(HttpClient httpClient)
	{
		this.httpClient = httpClient;
	}

	public async Task<DownloadMarkdownFileServiceResult> DownloadAsync
		(IEnumerable<Uri> uris, CancellationToken cancellationToken = default)
	{
		foreach (var uri in uris)
		{
			if (uri == null) continue;

			var fileName = Path.GetFileName(uri.AbsoluteUri);
			var extension = fileName.Contains('.') ? Path.GetExtension(fileName) : string.Empty;

			if (!ValidFileExtensions.Contains(extension)) continue;

			var markdownFile = new MarkdownFile(uri);

			try
			{
				var result = await httpClient.GetAsync
					(markdownFile.Path, cancellationToken);

				if (!result.IsSuccessStatusCode)
				{
					throw new HttpRequestException
						($"Could not download file at {markdownFile.Path}", 
						null, 
						result.StatusCode);
				}

				markdownFile.Contents =
					await result.Content.ReadAsStringAsync();

				markdownDownloads.Add(markdownFile);
			}
			catch (HttpRequestException hre)
			{
				downloadExceptions.Add
					(new DownloadMarkdownExecption($"{hre.Message}", hre));
			}
			catch (Exception e)
			{
				downloadExceptions.Add
					(new DownloadMarkdownExecption($"{e.Message}", e));
			}
		}

		downloadAsyncResult.DownloadExceptions = downloadExceptions;
		downloadAsyncResult.MarkdownFiles = markdownDownloads;

		return downloadAsyncResult;
	}
}

Anyone can program in their own preferred style; I'm not here to critique but rather to experiment with different approaches and explore their potential.

When I look at this code, I feel a bit mixed. It accomplishes its purpose, and its structure and length are reasonable.

However, I find it difficult to fully accept this method because it goes against some principles I consider important, and I suspect it may pose challenges when testing. Although readability isn't particularly problematic here, I feel the method does more than is ideal for a single function to handle.

The inlined elements within this method could also be seen as problematic. For instance, the method performs multiple tasks, limits extension options, and might even conflict with the Dependency Inversion Principle, as it returns a concrete type rather than an interface.

Here's what the method currently does: it iterates the URIs, skips null entries, extracts the file name and extension, filters out anything that isn't a markdown extension, downloads each remaining file over HTTP, treats non-success status codes as errors, stores the downloaded contents on a MarkdownFile, collects any exceptions along the way, and finally returns both collections in a single result object.

While I don't find the method's readability poor—it narrates a clear story, has some error-handling decisions, and a logical structure—testing it could be challenging. This is mainly due to the intertwined implementation details. For instance, the URI extension validation mechanism is tightly integrated within the method, making it less flexible and more difficult to modify or test independently. This isn't about rigid adherence to any single approach; rather, it's just an observation. Our goal here is to explore if and when inlining code in this way is effective for certain parts of our codebase.
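
To make that testing concern concrete: because the type takes an HttpClient directly, a test has to stub the transport. A minimal xUnit sketch, assuming the first, fully inlined version of the type shown above - the stub handler and test names are my own:

public class StubHttpMessageHandler : HttpMessageHandler
{
    private readonly string content;

    public StubHttpMessageHandler(string content) => this.content = content;

    // Every request gets a 200 OK with the canned content.
    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken) =>
        Task.FromResult(new HttpResponseMessage(HttpStatusCode.OK)
        {
            Content = new StringContent(content)
        });
}

public class DownloadMarkdownFileServiceTests
{
    [Fact]
    public async Task Downloads_Only_Markdown_Uris()
    {
        var httpClient = new HttpClient(new StubHttpMessageHandler("# hello"));
        var sut = new DownloadMarkdownFileService(httpClient);

        var result = await sut.DownloadAsync(new[]
        {
            new Uri("https://some.dk/domain/file.md"),
            new Uri("https://some.dk/domain/file.jpg")
        });

        Assert.Single(result.MarkdownFiles);
    }
}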

In a more structured OOP environment, the URI extension might make an excellent candidate for a pure function. And this aligns with what I understand John is suggesting in his email, where he states: "If a function is only called from a single place, consider inlining it." He also notes, "If the work is close to purely functional...try to make it completely functional."

Since this function is called only once, inlining seems appropriate. So, let's look for other potential candidates where this might also apply.

One could write a pure function as shown below and encapsulate it in a lightweight type. While this approach might counter the purpose of inlining, it's worth noting. So far, I've identified one candidate for inlining, but there could be more.

public string UriPathExtension(Uri uri) =>
	uri is null ? string.Empty : Path.GetExtension(uri.AbsoluteUri);
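
As an aside, a pure function like this is trivially testable on its own. A small sketch, assuming it sits on a lightweight helper type I'm calling UriPathHelper here (my name, not from the original code):

public class UriPathHelper
{
    public string UriPathExtension(Uri uri) =>
        uri is null ? string.Empty : Path.GetExtension(uri.AbsoluteUri);
}

public class UriPathHelperTests
{
    [Theory]
    [InlineData("https://some.dk/domain/file.md", ".md")]
    [InlineData("https://some.dk/domain/file", "")]
    public void Returns_The_Extension_Or_Empty(string uri, string expected)
    {
        var helper = new UriPathHelper();
        Assert.Equal(expected, helper.UriPathExtension(new Uri(uri)));
    }
}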

The next point John makes in his list is: "If a function is called from multiple places, see if it is possible to arrange for the work to be done in a single place, perhaps with flags, and inline that."

This is an interesting suggestion. Suppose I have two types of tasks I need to carry out. I want to download markdown files, but I also need to download image files, like JPGs and PNGs. I might initially have two distinct services, each calling the pure function I outlined in UriPathExtension. John's advice here is to consolidate that work in a single place and then inline the function again.

Since my original method returns a DownloadMarkdownFileServiceResult, I would likely need to refactor it to a more general DownloadFileServiceResult. From there, the adjustments could start to cascade. With this type, I could handle both tasks while keeping potential extractions inlined. (Please note, I haven't fully implemented this code, so it's not build-ready due to the incomplete refactoring.)

In this experiment, I'm focusing on readability, testability, and the shifts in mindset required when programming in an OOP context.

public class DownloadMarkdownFileService : IDownloadFiles
{
	private readonly string[] ValidMarkdownFileExtensions = [".md", ".markdown"];
	private readonly string[] ValidPictureFileExtensions = [".jpg", ".png"];

	private readonly List<DownloadMarkdownExecption> downloadExceptions = [];
	private readonly List<DownloadableFile> downloadedFiles = [];
	private readonly DownloadMarkdownFileServiceResult downloadAsyncResult = new();

	private readonly HttpClient httpClient;

	public DownloadMarkdownFileService(HttpClient httpClient)
	{
		this.httpClient = httpClient;
	}

	public async Task<DownloadMarkdownFileServiceResult> DownloadAsync
		(IEnumerable<Uri> uris, CancellationToken cancellationToken = default)
	{
		foreach (var uri in uris)
		{
			if (uri == null) continue;

			var fileName = Path.GetFileName(uri.AbsoluteUri);
			var extension = fileName.Contains('.') ? Path.GetExtension(fileName) : string.Empty;

			if (!ValidMarkdownFileExtensions.Contains(extension) &&
				!ValidPictureFileExtensions.Contains(extension)) continue;

			var file = new DownloadableFile(uri);

			try
			{
				var result = await httpClient.GetAsync
					(file.Path, cancellationToken);

				if (!result.IsSuccessStatusCode)
				{
					throw new HttpRequestException
						($"Could not download file at {file.Path}", 
						null, 
						result.StatusCode);
				}

				//Markdown
				if (ValidMarkdownFileExtensions.Contains(extension))
				{
					//markdown 
					file.Contents =
						await result.Content.ReadAsStringAsync();
				}

				//Pictures
				if (ValidPictureFileExtensions.Contains(extension))
				{
					//pictures
					file.FileStream =
						await result.Content.ReadAsStreamAsync();
				}

				downloadedFiles.Add(file);
			}
			catch (HttpRequestException hre)
			{
				downloadExceptions.Add
					(new DownloadMarkdownExecption($"{hre.Message}", hre));
			}
			catch (Exception e)
			{
				downloadExceptions.Add
					(new DownloadMarkdownExecption($"{e.Message}", e));
			}
		}

		downloadAsyncResult.DownloadExceptions = downloadExceptions;
		downloadAsyncResult.DownloadableFiles = downloadedFiles;

		return downloadAsyncResult;
	}
}

How does it read? How does it test? I could reiterate what I mentioned with the first method described earlier: the more varied tasks I incorporate around the same domain concept, the more this method seems to expand.

I realize I'm bringing up testing considerations here, and perhaps I shouldn't, but each time I add more inlined code, I have an instinctive hesitation. This latest method won't necessarily become easier to test or to use. One small challenge among many is that now I need to assess the file type in the result as well. But having all the code in one place certainly has its benefits. The trade-offs, however, are evident.

"good judgment must be applied" - John Carmack

For readability, this setup isn't quite working for me. While I appreciate the reduced cognitive load from having all the code in one place, the trade-offs seem too significant. One might argue that this type is too small to be a meaningful experiment, which is fair — but is any type truly too small? Either way, the bottom line is that I don't find this approach appealing, so I'll go back, keeping in mind that "good judgment must be applied," and refactor the first method posted to explore a few alternative styles I might actually use.

My initial refactoring approach still leans towards the principles outlined in Carmack's email, which is perfectly fine. I'll decompose the type as I proceed.

public class DownloadMarkdownFileService : IDownloadMarkdownFile
{
    private readonly HttpClient httpClient;
    private readonly string[] validFileExtensions = { ".md", ".markdown" };

    public DownloadMarkdownFileService(HttpClient httpClient)
    {
        this.httpClient = httpClient;
    }

    public async Task<DownloadMarkdownFileServiceResult> DownloadAsync(IEnumerable<Uri> uris, CancellationToken cancellationToken = default)
    {
        var downloadExceptions = new List<DownloadMarkdownExecption>();
        var markdownFiles = new List<MarkdownFile>();

        foreach (var uri in uris)
        {
            if (uri == null) continue;

            var fileName = Path.GetFileName(uri.AbsoluteUri);
            var extension = fileName.Contains('.') ? Path.GetExtension(fileName) : string.Empty;

            if (!validFileExtensions.Contains(extension)) continue;

            try
            {
                var response = await httpClient.GetAsync(uri, cancellationToken);

                if (!response.IsSuccessStatusCode)
                {
                    throw new HttpRequestException($"Could not download file at {uri}", null, response.StatusCode);
                }

                var contents = await response.Content.ReadAsStringAsync();
                markdownFiles.Add(new MarkdownFile(uri) { Contents = contents });
            }
            catch (HttpRequestException hre)
            {
                downloadExceptions.Add(new DownloadMarkdownExecption(hre.Message, hre));
            }
            catch (Exception e)
            {
                downloadExceptions.Add(new DownloadMarkdownExecption(e.Message, e));
            }
        }

        return new DownloadMarkdownFileServiceResult
        {
            DownloadExceptions = downloadExceptions,
            MarkdownFiles = markdownFiles
        };
    }
}

The second approach leans a little more towards SOLID - not much, though, since I have only really extracted a new interface, IFileValidator.

public interface IFileValidator
{
    bool IsValid(string fileName);
}

public class FileValidator : IFileValidator
{
    private readonly string[] _validFileExtensions = { ".md", ".markdown" };

    public bool IsValid(string fileName)
    {
        var extension = Path.GetExtension(fileName);
        return !string.IsNullOrEmpty(extension) && _validFileExtensions.Contains(extension);
    }
}

public class DownloadMarkdownFileService : IDownloadMarkdownFile
{
    private readonly IFileValidator fileValidator;
    private readonly HttpClient httpClient;

    public DownloadMarkdownFileService(IFileValidator fileValidator, HttpClient httpClient)
    {
        this.httpClient = httpClient;
        this.fileValidator = fileValidator;
    }

    public async Task<DownloadMarkdownFileServiceResult> DownloadAsync(IEnumerable<Uri> uris, CancellationToken cancellationToken = default)
    {
        var downloadExceptions = new List<DownloadMarkdownException>();
        var markdownDownloads = new List<MarkdownFile>();

        foreach (var uri in uris)
        {
            if (uri == null) continue;

            var fileName = Path.GetFileName(uri.AbsoluteUri);
            if (!fileValidator.IsValid(fileName)) continue;

            var markdownFile = new MarkdownFile(uri);

            try
            {
                var result = await httpClient.GetAsync(markdownFile.Path, cancellationToken);

                if (!result.IsSuccessStatusCode)
                {
                    throw new HttpRequestException($"Could not download file at {markdownFile.Path}", null, result.StatusCode);
                }

                markdownFile.Contents = await result.Content.ReadAsStringAsync();
                markdownDownloads.Add(markdownFile);
            }
            catch (HttpRequestException hre)
            {
                downloadExceptions.Add(new DownloadMarkdownException(hre.Message, hre));
            }
            catch (Exception e)
            {
                downloadExceptions.Add(new DownloadMarkdownException(e.Message, e));
            }
        }

        return new DownloadMarkdownFileServiceResult
        {
            DownloadExceptions = downloadExceptions,
            MarkdownFiles = markdownDownloads
        };
    }
}
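
With the validation now behind IFileValidator, that part can be exercised in isolation. A small sketch of what such a test might look like:

public class FileValidatorTests
{
    [Theory]
    [InlineData("file.md", true)]
    [InlineData("file.markdown", true)]
    [InlineData("file.jpg", false)]
    [InlineData("file", false)]
    public void Validates_Markdown_File_Names(string fileName, bool expected)
    {
        var sut = new FileValidator();
        Assert.Equal(expected, sut.IsValid(fileName));
    }
}

The service itself can then be tested with a stubbed IFileValidator, so the extension rules no longer have to be exercised through the HTTP flow.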

As I go along, please remember that I am still looking first for readability and, secondly, for testability.

The next refactoring I present moves towards even more SOLID, and even more decomposition. Does it read well? Does it test well?

public interface IFileValidator
{
	bool IsValid(string fileName);
}

public class FileValidator : IFileValidator
{
	private readonly string[] _validFileExtensions = { ".md", ".markdown" };

	public bool IsValid(string fileName)
	{
		var extension = Path.GetExtension(fileName);
		return !string.IsNullOrEmpty(extension) && _validFileExtensions.Contains(extension);
	}
}

public class DownloadMarkdownFileService : IDownloadMarkdownFile
{
	private readonly IFileValidator fileValidator;
	private readonly HttpClient httpClient;

	public DownloadMarkdownFileService(IFileValidator fileValidator, HttpClient httpClient)
	{
		this.httpClient = httpClient;
		this.fileValidator = fileValidator;
	}

	public async Task<DownloadMarkdownFileServiceResult> DownloadAsync(
		IEnumerable<Uri> uris, 
		CancellationToken cancellationToken = default)
	{
		var downloadExceptions = new List<DownloadMarkdownException>();
		var markdownFiles = new List<MarkdownFile>();

		foreach (var uri in uris)
		{
			if (uri == null) continue;

			var (success, markdownFile, exception) = await TryDownloadFileAsync(uri, cancellationToken);

			if (success && markdownFile != null)
			{
				markdownFiles.Add(markdownFile);
			}
			
			if (exception != null)
			{
				downloadExceptions.Add(exception);
			}
		}

		return CreateResult(markdownFiles, downloadExceptions);
	}

	private async Task<(bool Success, MarkdownFile? MarkdownFile, DownloadMarkdownException? Exception)> 
		TryDownloadFileAsync(Uri uri, CancellationToken cancellationToken)
	{
		if (!IsValidFile(uri))
		{
			return (false, null, null);
		}

		var markdownFile = new MarkdownFile(uri);

		try
		{
			var content = await DownloadFileContentAsync(markdownFile, cancellationToken);
			markdownFile.Contents = content;
			return (true, markdownFile, null);
		}
		catch (HttpRequestException hre)
		{
			return (false, null, new DownloadMarkdownException(hre.Message, hre));
		}
		catch (Exception e)
		{
			return (false, null, new DownloadMarkdownException(e.Message, e));
		}
	}

	private bool IsValidFile(Uri uri)
	{
		var fileName = Path.GetFileName(uri.AbsoluteUri);
		return fileValidator.IsValid(fileName);
	}

	private async Task<string> DownloadFileContentAsync(
		MarkdownFile markdownFile, 
		CancellationToken cancellationToken)
	{
		var response = await httpClient.GetAsync(markdownFile.Path, cancellationToken);

		if (!response.IsSuccessStatusCode)
		{
			throw new HttpRequestException($"Could not download file at " +
				$"{markdownFile.Path}", null, response.StatusCode);
		}

		return await response.Content.ReadAsStringAsync();
	}

	private DownloadMarkdownFileServiceResult CreateResult(
		IEnumerable<MarkdownFile> markdownFiles, 
		IEnumerable<DownloadMarkdownException> exceptions)
	{
		return new DownloadMarkdownFileServiceResult
		{
			MarkdownFiles = markdownFiles.ToList(),
			DownloadExceptions = exceptions.ToList()
		};
	}
}

Now I'm approaching something that's highly decomposed compared to the first type I presented above. For me, it's not as readable, though that may be personal preference.

The key point here is that the benefits from this decomposition are less about readability and more about testability. How code reads is often more a matter of context. We could agree that, in isolation, each of these smaller methods is easier to read than the earlier types, but without the context around them, they only convey a single aspect of the process.

When methods do only one thing, they're either meant to be used by other types or are perhaps too small and should be consolidated into the types where they naturally fit. I believe that's in line with what Carmack mentioned. So, in a way, we're back to square one: writing code that balances readability, testability, and structure is challenging.

I could continue with a few more examples of what this type might look like. One example, and a language feature I rarely see used, is the use of local functions. I find them particularly appealing when working with functions needed only within a single type. This will be the last refactoring I present. It's been enjoyable to explore and demonstrate John Carmack's ideas, and exercises like this are always insightful.

public class DownloadMarkdownFileService : IDownloadMarkdownFile
{
	private readonly string[] validFileExtensions = { ".md", ".markdown" };
	private readonly List<DownloadMarkdownExecption> downloadExceptions = new();
	private readonly List<MarkdownFile> markdownDownloads = new();
	private readonly DownloadMarkdownFileServiceResult downloadAsyncResult = new();

	private readonly HttpClient httpClient;

	public DownloadMarkdownFileService(HttpClient httpClient)
	{
		this.httpClient = httpClient;
	}

	public async Task<DownloadMarkdownFileServiceResult> DownloadAsync(
		IEnumerable<Uri> uris, CancellationToken cancellationToken = default)
	{
		foreach (var uri in uris)
		{
			if (uri == null) continue;

			var fileName = Path.GetFileName(uri.AbsoluteUri);
			var extension = GetFileExtension(fileName);

			if (!IsValidExtension(extension)) continue;

			var markdownFile = new MarkdownFile(uri);

			try
			{
				await DownloadFileAsync(markdownFile, cancellationToken);
				markdownDownloads.Add(markdownFile);
			}
			catch (HttpRequestException hre)
			{
				downloadExceptions.Add(new DownloadMarkdownExecption($"{hre.Message}", hre));
			}
			catch (Exception e)
			{
				downloadExceptions.Add(new DownloadMarkdownExecption($"{e.Message}", e));
			}
		}

		return BuildDownloadResult();

		string GetFileExtension(string fileName) =>
			fileName.Contains('.') ? Path.GetExtension(fileName) : string.Empty;

		bool IsValidExtension(string extension) =>
			validFileExtensions.Contains(extension);

		async Task DownloadFileAsync(MarkdownFile file, CancellationToken token)
		{
			var response = await httpClient.GetAsync(file.Path, token);

			if (!response.IsSuccessStatusCode)
			{
				throw new HttpRequestException(
					$"Could not download file at {file.Path}", null, response.StatusCode);
			}

			file.Contents = await response.Content.ReadAsStringAsync();
		}

		DownloadMarkdownFileServiceResult BuildDownloadResult()
		{
			downloadAsyncResult.DownloadExceptions = downloadExceptions;
			downloadAsyncResult.MarkdownFiles = markdownDownloads;
			return downloadAsyncResult;
		}
	}
}

Refactoring the Mental Bin: An Extension in the way

mandag den 7. oktober 2024

This is an article that is a little bit obnoxious, really. I draw up some opinions here that I don't always comply with initially when writing code, though I would like to comply with them at all times. But I guess that's another story.

I was listening to Jazzmatazz Vol. 1 and reading some really simple code.

I remembered a talk by a former professional friend who had made a statement about arrays and performance. Another professional friend had once said, "stupid code is fast code." I was left pondering these two statements while reading the code.

Code readability is subjective. How someone else perceives code is different from how I do, and therefore, it's of course very important that, when working on a team, such things are handled by using some kind of tooling.

You should, of course, not be "allowed" to use one style of code while your colleague uses a different one; that will most likely lead to confusion and frustration. Program correctness, on the other hand, is something you can objectively measure — "if your specifications are contradictory, life is very easy, for then you know that no program will satisfy them, so, make "no program"; if your specifications are ambiguous, the greater the ambiguity, the easier the specifications are to satisfy (if the specifications are absolutely ambiguous, every program will satisfy them!).".

Having consistent, readable code alone should lower the cognitive load while reading that code. It's about context switching, they say. We all know how it feels when we enter a module with neatly organized methods, naming conventions, sizing, and documentation—and then switch to a different module that doesn’t fit into the mental model you were just in. You then need to switch gears and figure out what's going on.

Think about this. Imagine having three programmers. One writes abstract classes more often than the second programmer, who writes more interfaces. And the last programmer writes and uses concrete types. Will this be, or is it even, a readability issue?

How programmers use a language will most likely evolve over time. It will most likely evolve as they learn more things. So a codebase will likely also change over time due to that evolution. How should that be controlled?

Reading code is more difficult than writing it. That’s the same reason programmers should be very much aware of how they present their code to others in written form. If you're working on a team, at least have tooling set up around structural semantics. Personally, I like code that reads into the lowest common denominator; call it naive, stupid, or the opposite of clever. I really don't care about how clever you are, I also need to understand it.

Reading code should be like drinking a good glass of wine while eating a good pasta dish and listening to some jazz tunes — something a little less hip-hop-ish than Guru’s Jazzmatazz. It should be comfortable and make you feel safe and welcomed.

So, I was reading some code, and this might seem like a bland example, but it should be easy for someone to read and understand.

Return this result: given a collection of URIs, return only the URIs whose file extension marks them as markdown files (.md or .markdown).

Go ahead and write a program that does that.

Such a type or method could easily be a utility type/method. Some might even call it a helper. An extension method is yet another, more formally communicated, candidate. There is nothing wrong with any of them, but personally I sometimes forget to consider whether such a type is simply too easy to reach for, at the cost of responsibility and better encapsulation.

So I can have a difficult time with these sorts of types, since they tend to grow in the number of files, the number of methods, and the number of things that are just considered an extension, helper, or utility.

I often fall into my own traps. This code is part of my site generator Rubin.Static, and I have done exactly what I just said I didn’t like to do. I have more extensions in there than I like.

But I can also explain why, since I am quite aware of how my mental juggling works with this. An Extension/Utility/Helper directory, or even (worse) a project of such kind, is such a great mental bin to offload into whenever something does not fit right into the directory you're standing in.

And it underlines and shows me how I go about programming. It happens in iterations, and sometimes I'm simply not good enough to capture or be aware of everything. Then, luckily, I can come back and iterate again.

At the time of writing, I have five files in the Extensions directory. It’s the directory in that small project that holds the most files. It sort of left me with a "there is something off" feeling, but it also tells the story of not believing in my own standards for everything I do or don’t do. Heck, I could have left it alone and just pumped more extensions in there over time. But no.

I can’t remember why I wrote the initial method like I did, but as I just stated, "Extensions/Helper/Utilities" is such a great bin to put things into, right?

I have experienced this many times in my career, in many codebases, and I have come to believe that it can stem from not fully collecting or gathering the code where it belongs. The responsibility and encapsulation of these types should be given thought, I think.

Take these methods as they are; they are part of the namespace Rubin.Markdown.Extensions.

public static class UriExtensions
{
    public static IEnumerable<Uri> ByValidFileExtensions(this IEnumerable<Uri> uris, string[] validExtensions)
    {
        foreach (var uri in uris)
        {
            for (var i = 0; i < validExtensions.Length; i++)
            {
                if (validExtensions[i] == uri.GetFileExtension())
                {
                    yield return uri;
                    break;
                }
            }
        }
    }

    public static string GetFileExtension(this Uri uri)
    {
        if (uri == null)
        {
            return string.Empty;
        }
        
        string path = uri.AbsoluteUri;
        string fileName = Path.GetFileName(path);
        
        if (fileName.Contains('.'))
        {
            return Path.GetExtension(fileName);
        }
        
        return string.Empty;
    }
}

It works, and there’s nothing particularly wrong with it. We can always argue about the nested loops, but for such a short method, it’s not really an issue. The GetFileExtension method can also be narrowed down somewhat, sure.
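
To show what "narrowed down somewhat" could look like, here is a more compact take on the same two methods. The behavior should be equivalent, since Path.GetExtension already returns an empty string when there is no extension:

public static class UriExtensions
{
    public static IEnumerable<Uri> ByValidFileExtensions(this IEnumerable<Uri> uris, string[] validExtensions) =>
        uris.Where(uri => validExtensions.Contains(uri.GetFileExtension()));

    public static string GetFileExtension(this Uri uri) =>
        uri is null ? string.Empty : Path.GetExtension(Path.GetFileName(uri.AbsoluteUri));
}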

[Fact]
public void When_Uris_Should_Be_Markdown_Return_Only_Markdown_Uris()
{
    var uris = new List<Uri>
    {
        new Uri("https://some.dk/domain/file.md"),
        new Uri("https://some.dk/domain/file.markdown"),
        new Uri("https://some.dk/domain/file1.markdown"),
        new Uri("https://some.dk/domain/file"),
        new Uri("https://some.dk/domain/file.jpg")
    };
    
    var validUris = uris.ByValidFileExtensions(ValidMarkdownFileExtensions.ValidFileExtensions);
    
    Assert.True(validUris.Count() == 3);
}

Just as a testament, the test passes.

Let me restate the opening, rather loose, program specification again: given a collection of URIs, return only the URIs that point to markdown files.

In the original implementation, I used an extension method, which is arguably a fine way to extend functionality to a type. I am not sure I fully agree. The particular extension methods here add functionality to .NET’s Uri type along with IEnumerable<Uri>, and I guess that’s a working approach. But that wasn’t my concern.

My main concern is that I program these extensions (utilities or helpers) without having given any explicit thought to whether they are the right fit or whether their actual belonging is correct. This relates to what I wrote earlier about using extensions as a mental bin, an offload of functionality that might actually be more closely related to something else. So I need to ask myself why I made extensions rather than something else that might be more closely connected to the code that actually uses those extensions.

Now, if I were to refactor this and decrease the number of extensions in the directory I just mentioned, how could I do that?

I will start by looking at where this extension code belongs. As an experiment, I will conclude for now that the particular extensions do not belong as an extension, helper, or utility, but something else.

I'm backtracking the references from the extension file in Visual Studio, and I see that I have a reference inside the file DownloadMarkdownFileService.

foreach (var uri in uris.ByValidFileExtensions(ValidMarkdownFileExtensions.ValidFileExtensions))

DownloadMarkdownFileService contains business rules; its purpose is to only download markdown files and convert their contents into a known type called MarkdownFile. So DownloadMarkdownFileService is really a fine candidate for the extension rather than my original approach.

The first thing to note is that DownloadMarkdownFileService is at the lowest level of abstraction. It's deepest in any call chain I have written myself; there are no calls to other types I have programmed myself. So I can also say that it’s a lower type, a fundamental type. I could also argue that having extensions so low down the chain can be bad since any user of the code could simply alter the extension, e.g., changing the valid file extensions, and then everything else would fail. Not good.

A second thing to consider is that someone else could later alter the extension method, perhaps causing a breaking change at runtime, which could also affect the DownloadMarkdownFileService.

I understand that one could mitigate these issues with tests, and one should. But it doesn’t solve the issue around responsibility, I think. The code not only belongs in a closer relationship with the DownloadMarkdownFileService, but it also makes that type "stronger" and more solid.

That has resulted in adding some code to the DownloadMarkdownFileService and its test pendant, but also deleting three files: two files in the extensions folder and one file in the test project.

The previous foreach loop, which looked like this:

foreach (var uri in uris.ByValidFileExtensions(ValidMarkdownFileExtensions.ValidFileExtensions))

I changed to look like this:

foreach (var markdownFile in ValidMarkdownUris(uris))

Instead of working on the basis of Uri, I made a few changes around the DownloadMarkdownFileService type, such as adding a new Path property to the MarkdownFile type.

Instead of working more than absolutely needed on a Uri, I now work on the MarkdownFile type. Perhaps this is even more positive encapsulation.
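
For context, this is roughly the shape I imagine the MarkdownFile type has after adding the Path property - my own sketch, not the actual type from the repository:

public class MarkdownFile
{
    public MarkdownFile(Uri path)
    {
        Path = path;
    }

    // The URI the file is downloaded from; added as part of this refactoring.
    public Uri Path { get; }

    public string Contents { get; set; } = string.Empty;
}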

This is what the DownloadMarkdownFileService type looked like before I refactored it toward better encapsulation and responsibility. I have left out unimportant code for now.

public class DownloadMarkdownFileService : IDownloadMarkdownFile
{
    ...
    ...

    public async Task<List<MarkdownFile>> DownloadAsync(IEnumerable<Uri> uris)
    {
        var markdownDownloads = new List<MarkdownFile>();

        foreach (var uri in uris.ByValidFileExtensions(ValidMarkdownFileExtensions.ValidFileExtensions))
        {
            ...
        }

        return markdownDownloads;
    }

    private async Task<MarkdownFile> DownloadAsync(Uri uri)
    {
        var result = await httpClient.GetAsync(uri);

        ...

        MarkdownFile markdownFile = new();
        markdownFile.Contents = await result.Content.ReadAsStringAsync();

        return markdownFile;
    }
}

Now it looks like this:

public class DownloadMarkdownFileService : IDownloadMarkdownFile
{
    private readonly string[] ValidFileExtensions = { ".md", ".markdown" };

    ...
    ...

    public async Task<List<MarkdownFile>> DownloadAsync(IEnumerable<Uri> uris)
    {
        var markdownDownloads = new List<MarkdownFile>();

        foreach (var markdownFile in ValidMarkdownUris(uris))
        {
            ...
        }

        return markdownDownloads;
    }

    private async Task<MarkdownFile> DownloadAsync(MarkdownFile markdownFile)
    {
        var result = await httpClient.GetAsync(markdownFile.Path);
        
        ...

        markdownFile.Contents = await result.Content.ReadAsStringAsync();

        return markdownFile;
    }

    private IEnumerable<MarkdownFile> ValidMarkdownUris(IEnumerable<Uri> uris)
    {
        foreach (var uri in uris)
        {
            if (ValidFileExtensions.Contains(UriPathExtension(uri)))
            {
                yield return new MarkdownFile(uri);
            }
        }
    }

    private string UriPathExtension(Uri uri)
    {
        if (uri == null) return string.Empty;

        var fileName = Path.GetFileName(uri.AbsoluteUri);

        if (fileName.Contains('.'))
        {
            return Path.GetExtension(fileName);
        }

        return string.Empty;
    }
}

I am happy with the result. I didn't sacrifice much, and I like the approach better than the original. The point is not the code per se, but that I should always apply thought to what belongs in an extension type (or helper and utility) and what is supposed to be closer to the actual type I am working on. In this case it was the Uri, but it made better sense to also adjust the surrounding types.

You can find the whole commit and changes here.

Write. Push. Publish. Separating the concerns.

tirsdag den 17. september 2024

For the local-first, feature-less, and admittedly boring static site generator that runs this blog, I have had a few thoughts around a different model for materializing response output. You can always read about the initial design thoughts for Rubin.Static, I still do sometimes.

Today, when I write an article or post, I type the text in my favorite text editor as Markdown. When I’m finished writing, I push that Markdown file to a repository on GitHub.

Now, I have a file persisted in what I call my content repository. You could also call it my write store.

That write-push workflow is completely independent from publishing and updating the actual site, and that was an initial design decision. I wanted to keep writing content and pushing it to whatever write store I chose, independent of whoever or whatever wanted to utilize that content later on.

Write. Push. Publish?

This article you are reading is not being materialized from data upon your request. It was put together before the request, hence the "static" part. It is a file on disk — a static HTML file.

When I want to update this site, I run a program locally on my machine called Rubin.Static. I execute a console program by running .\run.ps1, and from there I wait until the program completes its task.

This is where my fraudulent programming mind starts playing tricks on me. I start thinking about parallelism for HTTP requests, wasted clock cycles when generating files whose content hasn't even been touched, and why even download files that haven't changed? And so on. Overengineering is the root of... and all that jazz. But I have fallen for the traps before, so these days I do not have to let the "creative department on the top floor" take over completely.

This approach around "write -> push" leaves the actual site eventually consistent. Even though I may have content in the write store, which is set as published, the site will not reflect that until the static files are generated and the site is updated. We could choose to call this a manual event, but it was no accident, I designed it this way on purpose.

How is a post or article generated?

For every Markdown file downloaded, Rubin.Static parses it into a strongly typed model and generates a static HTML page from that model.

It does this via the Razor templating engine, with some Razor views and a Layout view. But it’s all within a console application, without a web server being fired up. There are some bounded dependencies.

At the end of an execution, I can navigate to Rubin.Static's output directory and find the site ready to run in a browser. It’s just HTML, CSS, and a little external JavaScript for code highlighting.

Where data is written to should not matter. Hence the "reading" of data is almost entirely ignorant of where it was written. That results in not using the same store for writes and reads. The write store is a GitHub repository. The read store is a directory of web-associated files.

This is an example of separation of concerns. I like this approach, and every time I use it, I smile. The whole notion of writing anything should be totally separated from any other concerns. And it is.

Today, when I execute Rubin.Static and add a new Markdown file (post/article), the Index page must be updated to reflect that. The Index page lists the contents of N new posts/articles.

There are also category pages generated if a post or article has a category "attached" to it.

So, with the Index page and Category pages, multiple static HTML pages are generated whenever a new post or article is committed to the repository (or removed, renamed, updated, etc.). However, today Rubin.Static cannot generate Category or Index pages without scanning every single Markdown file.

If I were to take this to an extreme and let my "creative department" go wild, I would experiment with event-driven file generation.

If I were to event-drive this, it would require a completely different page generation mechanism. I would most likely need to persist some interim data between receiving an event (such as a new Markdown file or a deleted one) and generating the static pages.
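
If I were to sketch it, it might be a small event record plus a handler that keeps some interim data and only points out the pages that need regenerating. These names are purely hypothetical; nothing here exists in Rubin.Static:

public record MarkdownFileAdded(Uri Uri, IReadOnlyList<string> Categories);

public class MarkdownFileAddedHandler
{
    // In-memory stand-in for whatever interim store the real design would need.
    private readonly List<MarkdownFileAdded> received = [];

    // Returns the static pages that would have to be regenerated for this one event,
    // instead of rescanning every Markdown file on each run.
    public IReadOnlyCollection<string> Handle(MarkdownFileAdded @event)
    {
        received.Add(@event);

        var slug = Path.GetFileNameWithoutExtension(@event.Uri.AbsoluteUri);

        var pages = new List<string> { $"{slug}.html", "index.html" };
        pages.AddRange(@event.Categories.Select(category => $"{category}.html"));

        return pages;
    }
}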

How do you materialize views from data?

To make it crystal clear, the following Rubin.Static-supported Markdown file would generate a static HTML file called this-is-the-slug-for-some-post.html and also contribute to generating software.html, code.html, and index.html:

[//]: # "title: Some post"
[//]: # "slug: this-is-the-slug-for-some-post"
[//]: # "pubDate: 14/6/2024 12:01"
[//]: # "lastModified: 17/6/2024 10:20"
[//]: # "excerpt: Whatever you want!"
[//]: # "categories: software, code"
[//]: # "isPublished: true" \

Content goes here!

I would bet the most common way online applications materialize their views is based on collecting data "just in time." With the data at hand, we can then materialize and render it for the user.

I have no issues with this approach. At all. The power of querying a relational datastore is incredibly useful. But since I’m already heading down the event-driven path in my mind, I imagine a different approach. This wouldn’t necessarily be the preferred option, but event-driven design could allow my read model to be eventually consistent based on events already received.

In theory, this sounds good, but in practice, I find that designs and architectures around eventual consistency are often evangelized into a too-simple yet too-broad approach. But who am I to judge without any context.

How might events be applied in Rubin.Static?

Now, the tricky part - because we don't have all the data like we used to. An event would only tell me that a single Markdown file was added, so that one file would be parsed on its own, the interim data updated, and only the affected pages regenerated from it: the post itself, the Index page, and any Category pages it belongs to.

The same process would apply for deleting or updating a Markdown file. Broad strokes, okay.

And suddenly, I’ve come up with an event-based update mechanism for the read model. It’s completely over-engineered and far too complex for where these designs apply correctly. But that's just me. I might try it as an experiment, but it will not end up in the main branch — it’s just too much.

With Rubin.Static, there is no relational data. I initially decided to forego that complexity, opting for Markdown as a low-common-denominator format. I wanted to write software without the need for a database.

So, there are only Markdown files to parse, and my own format parser to apply during the process. It’s all rather sequential. This has resulted in parsing and generating HTML pages, but I’ve already explained that part.

When Technology Overshadows the Social in Software Development

mandag den 9. september 2024

A piece of thought with little to back it up.

In the world of individualism around software engineering and programming, one sometimes needs what cyclists call "an exercise adjustment". It often results in one's own arrogance being taken down a notch.

Technology often takes center stage: so-called new frameworks, never-ending discussions around paradigms, and tools that promise to optimize, streamline, and solve complex problems. I get a bit disappointed in the field at times, hoping for higher standards, and I would also be happy for a bit more pressure on social well-being in a professional setting. I am not talking about perks or office hours; I am talking about power and social dynamics.

Lately I have watched four real human beings working together for weeks to develop an algorithm. Sitting in the same room. Churning. Talking. Discussing. Sharing knowledge. Using social skills. Increasing cognitive safety for domain understanding. In my book, that is teamwork. Funnily enough, others around this group have expressed their interest, wanting to be part of such a tightly knit setting in the future.

In our pursuit of technical solutions, we often overlook the social aspects that are crucial to building successful teams and organizations. Why do you think this happens? And what are the consequences for the humans alongside you?

As a programmer, I frequently find myself wondering why the social aspects of our field are so hard to come by — even when practiced, they are seldom practiced for long stretches at a time, and in my experience they are often replaced by an individualistic approach to power and decision-making. I've seen plenty of times how teams operate on unvalidated information, following a single person's "good idea", often driven by some sort of "HIPO", seniority, or status. This dynamic is compounded by the way many companies organize themselves, creating an environment that does not support healthy social activities for teamwork, collaboration, and feedback.

The Allure of Technology

It's common for both programmers and managers to attempt to solve organizational issues by implementing new technologies. Look at how loose the saying from Conway has become, as if it were a technological issue to fix how communication flows through an organisation.

There might be a tendency to believe that technology can drive change and resolve deeper, more fundamental collaboration problems. But this reinforces a theory I've long held: we misunderstand the role that technology plays within our organizations.

A well-intentioned but purely technological approach can easily overlook the social structures needed to foster healthy teams. We become trapped in a mindset that assumes technological design will naturally create the right conditions for better collaboration and outcomes. This is a short-sighted solution that focuses only on the tools and not on how people actually work together.

Who Holds the Power?

Power dynamics in an organization play a crucial role in how decisions are made and which solutions are chosen. I recently encountered a manager in a ten-person team who believed that there was no time for code reviews. This might seem like both an easy fix and a shit-for-brains stance, but in real life you encounter organisations and people that hold power they will not democratize. Being afraid of losing that grip on power is a real thing in some humans.

This is also a good example of how one person’s decision can shape an entire team's working processes. Code reviews are not just a technical step toward better code quality; they are also an opportunity for learning, feedback, and collaboration. When social mechanisms like these are removed from the process, you lose the relational dynamics that can strengthen a team.

If we up the ante and instead make a deeper-rooted decision, one that could have a more profound effect on a team, then we simply complicate matters even further.

I have no negative opinions towards, e.g., DDD or event sourcing, but imagine the power dynamics of making such a choice without understanding the underlying organisational conflicts it could also add. The higher the complexity, the more social stability within your organisation is probably needed.

This is where we often see a conflict between technology and social structures. Many of us in the tech world are influenced by a bias that leads us to believe technical solutions are always the answer. Jeff Hodges’ article, Notes on Distributed Systems for Young Bloods, encapsulates this well: “Our background, education, and experience bias us towards a technical solution even when a social solution would be more efficient, and more pleasing. Let’s try to correct for that.”

Technology and Social Dynamics: Two Sides of the Same Coin

I believe the biggest challenge in modern software development is not about choosing technologies or paradigms. It lies in creating the right conditions for teams to thrive both socially and technologically. That's different from the technological direction. We can't ignore the social dynamics that arise within teams, organizations, and leadership. Technology is not a replacement for relationships, communication, and trust. And it should be neither the bearer of those nor a parameter for successful teamwork.

When organizations attempt to improve their outcomes by implementing new technologies without considering how those technologies will be socially integrated, they risk creating more chaos than clarity. The social architecture of an organization is just as important as its technical architecture. Ignoring this leads us to listen to those who talk the loudest and the most, rather than establishing the best insights through information, measurements, and teamwork.

Personally, I would much rather be part of a reflective and social organization than a technology-driven one. This is where I have experienced teams building the most sustainable, resilient, and high-quality work, while at the same time being the best place to be as a human.

Three Page Path Challenge - fifth iteration

mandag den 2. september 2024

This is the sixth article in a series about a man with a three-page path challenge, an interview question that can come up for programmers.

Make sure to read the problem scope in the first article in this series.

Also, read about the first iteration, the second iteration, the third iteration, and the fourth iteration.

I had to question myself around the matter of explicitness and defensiveness in the code I had written. Arguments should be clear, but I had found some of what I had written to be too implicit. This is in relation to correctness but also somewhat about the future and how we can prepare for it. I used to say I am not concerned with the future when writing code, because predicting it is self-evidently impossible. Live requirements are most often a sign to be read as "stop."

If we guess about exact needs, the correctness could turn into incorrectness. It's a guess. I still fall into the trap, guessing my own ability, sometimes taking it for something higher than it really is. That will be a strain on any human mind—guessing, hoping to be right, but not with a high possibility of certainty.

Anyways, to boil it all the way down, it should be as obvious as possible to a programmer how an API works when one uses it. So when you allow your fellow programmer to pass in a type as an argument which is not explicitly telling, you are not making it obvious. And you might be guessing, even though it might be unconscious.

Here, I am training my abilities

For a constructor to accept an IEnumerable<T> but then convert that type inside the constructor, e.g., to an IImmutableList<T>, closes off the possibility of updating the exact type later. It hides the fact - e.g., if the exact type behind the IEnumerable<T> is a List<T> - that one could otherwise manipulate that collection after initializing. That would not be possible in the same capacity with an IImmutableList<T>.

Think about it for a bit. Should my LogEntry collection be left open for manipulation later in the execution chain? Why not? Why yes? Am I the one to determine that? Is it a business rule? Do I know the future? Which type is more open than the other? Will your code be easier to close or to open?
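
To make the mechanics explicit, here is a sketch of the constructor shape that the first test below exercises - the field name is my own, and I'm assuming the enumerable is stored as-is, which is exactly what lets a caller's List<T> keep influencing the analyzer:

public class PathPatternsAnalyzer
{
    public readonly IEnumerable<LogEntry> logEntries;
    private readonly int partitionSize;

    // Storing the IEnumerable<LogEntry> by reference means any later additions to the
    // caller's underlying List<LogEntry> are visible the next time logEntries is enumerated.
    public PathPatternsAnalyzer(IEnumerable<LogEntry> logEntries, int partitionSize)
    {
        this.logEntries = logEntries;
        this.partitionSize = partitionSize;
    }
}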

If I did this public PathPatternsAnalyzer(IEnumerable<LogEntry> logEntries, int partitionSize) and wrote a test like this, it would fail.

[Fact]
public void Constructor_Parameter_Can_Update_During_Execution()
{
    //Arrange
    var listOfLogEntries = Log_Entries_No_Common_Paths().ToList();
    var count = listOfLogEntries.Count;
    var sut = new PathPatternsAnalyzer(listOfLogEntries, 3);

    //Act
    // will add to the collection and affect the sut
    listOfLogEntries.Add(new LogEntry(1, "path", 100));
    
    //Assert
    Assert.True(count == sut.logEntries.Count());
}

Yet, changing the constructor argument to an IImmutableList<T> closes off that possibility to change the collection during execution. What is expected, and how do I decide?

public PathPatternsAnalyzer(IImmutableList<LogEntry> logEntries, int partitionSize)

While this test would then succeed:

[Fact]
public void Constructor_Parameter_Cannot_Update_During_Execution()
{
    //Arrange
    var listOfLogEntries = Log_Entries_No_Common_Paths().ToList();
    var count = listOfLogEntries.Count;
    var sut = new PathPatternsAnalyzer(listOfLogEntries.ToImmutableList(), 3);
    
    //Act
    // collection in sut will not be affected
    listOfLogEntries.Add(new LogEntry(1, "path", 100));
    
    //Assert
    Assert.True(count == sut.logEntries.Count());
}

I know it sounds trite, and you might read this and think, "why is this idiot not just using whatever type? Why all this?" That's a valid point, but you might be forgetting that this is a challenge in programming, and asking yourself these questions is exactly how you become a better programmer. The reflections made can be useful. The exploration of types, patterns, and options.

I want to be explicit now, and therefore I will accept only IImmutableList<T> because it first and foremost signals to the programmer what to expect: that this constructor argument is immutable. I also believe that it's easier to understand since it is more predictable.

Bear in mind that it was never the goal to create something immutable. It happened during the challenge, and it happened because I could take the liberty to try it out while also finding it more correct at that moment.

You can find the code for this fifth iteration here, and I have made a small Runner program where you can follow the execution chain from.