Experiment 02

June 20, 2022 10 minute read

The next few posts take a step back to examine the benefits of creating reproducible software. We will explore:

the continuously reproducible mindset (this post)
foundational tools for reproducibility (Exp 03)
creating a continuously reproducible .NET project (Exp 04)

Introduction

Scientific experiments must be repeatable and reproducible to be considered scientific. Reproducibility in software is optional - software that works but is not reproducible is still successful software. I hope to convince you that the overhead required to create reproducible software is low compared to the benefits that it provides future developers, even if the only future developer is you.

Defining reproducible software

It is useful to clarify our definition of reproducibility within the context of software development. Let P₀ represent a stable, compiling build of a codebase that results in a correct program. The reproducibility test for P₀ is as follows:

Does the code/documentation for P₀ contain sufficient information to reproduce the correct program from a clean environment? (Yes/No)

Next, let P₁ represent the code (in the new environment) after it has undergone a substantial change that modified the build environment. We can reapply the reproducibility test to P₁. The number of times that the code passes the reproducibility test can be defined as its reproducibility level [0..N].

It may be useful to name a few of these levels.

Irreproducible - Reproducibility level 0; P₀ failed the reproducibility test.
One-time reproducible - Reproducibility level 1; P₀ passed the reproducibility test, but P₁ failed.
Continuously reproducible - Reproducibility level 2+; if P₀ and P₁ both pass the reproducibility test, it indicates that the code is written in a way that supports reproducibility for future generations of the code.
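
The levels above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and it assumes that the level counts consecutive passes starting at P₀ and stops at the first failure, which is one reading of the definition.

```python
from typing import Sequence

# Names for the levels listed above; anything at level 2 or higher
# is treated as continuously reproducible.
LEVEL_NAMES = {0: "irreproducible", 1: "one-time reproducible"}


def reproducibility_level(results: Sequence[bool]) -> int:
    """Count consecutive passes of the reproducibility test.

    results[0] is the outcome for P0, results[1] for P1, and so on.
    Assumption: counting stops at the first failed test.
    """
    level = 0
    for passed in results:
        if not passed:
            break
        level += 1
    return level


def level_name(level: int) -> str:
    """Map a numeric level to the names used in this post."""
    return LEVEL_NAMES.get(level, "continuously reproducible")
```

For example, a codebase where P₀ passed and P₁ failed would be described by `level_name(reproducibility_level([True, False]))`.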

The continuously reproducible mindset

How many times have you pulled a project from GitHub only to have it fail to compile?

It works on my machine ¯\_(ツ)_/¯

We can reduce this problem by expanding our mindset to strive for continuously reproducible code. The key to creating continuously reproducible code is to create a simple workflow that rebuilds the project from a clean environment (preferably on Windows, Linux, and macOS). This allows you to isolate undocumented side effects that can occur in your local development environment (e.g. relying on a tool available locally that is not installed during the build process).
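
As a rough local approximation of that workflow, the sketch below copies a project into an empty temporary directory and runs the build there with a stripped-down set of environment variables. The function name and the choice to keep only PATH are my assumptions for illustration. Because PATH is kept, locally installed tools are still visible, so this only catches stray files and environment settings; a fresh CI runner or container on each platform is a closer match to a truly clean environment.

```python
import os
import shutil
import subprocess
import tempfile
from pathlib import Path
from typing import Sequence


def builds_from_clean_copy(source_dir: str, build_cmd: Sequence[str]) -> bool:
    """Copy the project to an empty temp directory and run the build there.

    Returns True if the build command exits with status 0.
    """
    with tempfile.TemporaryDirectory() as tmp:
        workdir = Path(tmp) / "checkout"
        shutil.copytree(source_dir, workdir)
        # Assumption: pass through only PATH so that other variables from
        # the developer's shell cannot influence the build.
        env = {"PATH": os.environ.get("PATH", "")}
        result = subprocess.run(list(build_cmd), cwd=workdir, env=env)
        return result.returncode == 0
```

A call such as `builds_from_clean_copy(".", ["dotnet", "build"])` would then act as a quick reproducibility check before pushing, assuming the build tool itself is on PATH.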

Continuously reproducible code balances the need to solve the current problem with the need to redeploy the codebase to new systems. If this doesn’t seem worthwhile, then it might be helpful to imagine that your code (P₀) will be extended by a different developer in a substantial way (P₁) before it is returned to you for another round of development (P₂). The time spent during the initial phase of development to create a build process that is easy to replicate across platforms (and modify as needed) will pay off in the long run.

But what if you are the only developer that will ever use this code? I have found the continuously reproducible mindset to be helpful in my personal projects for tracking down build-related problems and ensuring that my code works even after long pauses in active development.

Measuring the longevity of a build

If a specific build passes the reproducibility test, then its longevity can be measured. Longevity is the length of time between the first time and the last time the build passes the reproducibility test. All builds eventually fail because some dependency of the build process will fail (including the language itself - .NET Framework 3.5 was released in November 2007 but it is no longer available from Microsoft).
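
One way to make this measurement concrete is to keep a log of dated test results for a build and take the span between the earliest and latest pass. The sketch below assumes such a log exists as (date, passed) pairs; the function name and the log format are hypothetical.

```python
from datetime import date, timedelta
from typing import Iterable, Optional, Tuple


def longevity(checks: Iterable[Tuple[date, bool]]) -> Optional[timedelta]:
    """Return the time between the first and last passing test.

    Each item in checks is (day the test ran, whether it passed).
    Returns None if the build never passed, since longevity is only
    defined for builds that pass at least once.
    """
    passing = [day for day, passed in checks if passed]
    if not passing:
        return None
    return max(passing) - min(passing)
```

Rerunning the reproducibility test on a schedule and appending the result to the log would keep this number up to date until the build finally fails.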

Longevity is measured for a single build configuration. Namely, P₀ will have a certain longevity, but P₁’s longevity may be shorter or longer depending on the changes made to its build configuration. Although true longevity can only be calculated after the build fails, developers can make conscious decisions to maximize the expected longevity of their code:

prefer dependencies that offer long term support (e.g. choose .NET 6 LTS even after .NET 7 is released) ¹
prefer dependencies that minimize the number of transitive dependencies
specify dependencies using pinned version constraints ²
if using Docker, build from official base images
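
The pinned-version guideline is easy to check automatically. The sketch below flags dependencies whose constraint is not an exact version. It assumes that "pinned" means a plain three-part version such as 6.0.5; that convention and the package names in the usage note are made up for illustration, and each real package manager has its own syntax for exact versions and ranges.

```python
import re
from typing import Dict, List

# Assumption: a pinned constraint is exactly MAJOR.MINOR.PATCH,
# with no wildcards, ranges, or comparison operators.
PINNED = re.compile(r"\d+\.\d+\.\d+")


def unpinned(dependencies: Dict[str, str]) -> List[str]:
    """Return the names of dependencies whose constraint is not pinned."""
    return sorted(
        name
        for name, constraint in dependencies.items()
        if not PINNED.fullmatch(constraint)
    )
```

Given `{"LibA": "1.2.3", "LibB": "1.*", "LibC": ">=2.0.0"}`, the function reports LibB and LibC as candidates for pinning.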

Conclusion

Reproducibility is relatively easy with modern software development tools. The crux of the problem is giving future developers the ability to:

easily recreate the initial development environment across multiple platforms/architectures
continue to make changes to the code that do not break this process

In the next few posts I will describe my approach to reproducibility and demonstrate how to apply it to an existing codebase.

Footnotes

¹ Microsoft patches .NET LTS releases for 3 years while current releases are only patched for 18 months.
² While it may seem counterintuitive to limit the available versions of your dependencies, it improves control over the automatic dependency resolver. This is in line with the continuously reproducible mindset, and future developers are always free to update the version if they encounter a conflict.
