Integrating continuously with dependencies - without breaking the build
Continuous Integration (CI) is perhaps one of the most underrated and misunderstood techniques brought to us by the Extreme Programming movement. In its essence, continuous integration requires you to, well… continuously integrate. That means integrating all changes to an application, or even to multiple cooperating applications, frequently, so that you get feedback about integration issues early and often. Changes tend to be smaller and issues easier to resolve this way, or as Martin Fowler put it: frequency reduces difficulty. If this sounds new to you, you might want to read this article by James Shore before proceeding.
The problem with dependencies
I recently came across this tweet by @RealGeneKim that pretty much sums up my experience when it comes to upgrading dependencies:
A troubling observation from @ndm: “the longer you wait to upgrade versions, more difficult the upgrade becomes”— Gene Kim (@RealGeneKim) April 6, 2018
Resonates with an app I “know of” that is on 6 yo Ruby Sinatra. To enable upgrading certain gems, I had to write rspec tests, when docs on 6yo rspec no longer exist!
For a long time it has been considered good practice to keep dependencies “stable”, meaning their versions should never change. And even when they do change, they should never break any existing code. While this sounds like a reasonable approach, the drawbacks are obvious once you consider the medium- to long-term implications for security and innovation.
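In a Ruby project, such a “stable” dependency is an exact version pin in the Gemfile. A minimal sketch (the gem name and version are illustrative):

```ruby
# Gemfile: pinning an exact version means the build never changes
# under you, but it also never picks up bug and security fixes.
gem "sinatra", "= 1.4.8"
```

Bundler’s pessimistic operator (e.g. "~> 1.4.8") is a middle ground that admits patch releases while still blocking bigger jumps.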
At some point this way of operating will leave you without vendor support for bug and security fixes, and vulnerable to all kinds of nasty attacks. It was this lack of basic patch hygiene that caught up with Equifax and caused one of the biggest data leaks in history. And even software promising Long Term Support (LTS), a rarity among open source libraries, has been found insecure in many cases, as vendors can only afford to backport the most important fixes to their LTS versions.
Another issue with this approach is that you cannot use new or improved features of your dependencies unless you first invest time exploring what would break and migrating to the latest version. If you have been tracking an early version of a dependency, this may come with lots of breaking changes, as APIs have changed in incompatible ways in the meantime. Instead of taking the intended route of slowly migrating away from deprecated APIs over time, you may be forced to drag all your code along to the latest API version in one big batch.
It’s not like it’s hard to always use the latest versions of dependencies in your build. Most modern dependency managers and build systems offer an option to do just that. But this option comes with another, even worse pain. Imagine this: you come to work in the morning only to find that most of the tests that worked fine yesterday are failing. You spend your morning debugging and cursing whoever pushed a breaking change to one of the libraries you use, just a few days before your important deadline. At this point you vow never to track the latest versions of dependencies again, because it always breaks at the worst moment and causes a lot of unplanned work.
There’s got to be a better way
There’s actually an approach that lets you track the latest versions of your dependencies without sacrificing either build stability or the feedback of continuous integration. It’s inspired by refactoring techniques such as Branch by Abstraction and Parallel Change, and it may cost you a little bit of duplication (though depending on your build system this may be minimal). What if you were to create two artifacts that go through your build pipeline: one tracking the last versions of your dependencies known to work fine, and one always tracking the bleeding edge versions?
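Sticking with the Ruby example, the two artifacts could be driven by two dependency manifests. A sketch, with hypothetical file names and versions:

```ruby
# Gemfile.stable: the delivery pipeline builds against exact,
# known-good pins and is never broken by upstream releases.
gem "sinatra", "= 2.0.1"
gem "rack",    "= 2.0.4"

# Gemfile.edge: the feedback pipeline leaves versions unconstrained,
# so every build resolves to whatever is newest.
gem "sinatra"
gem "rack"
```

With Bundler, the edge pipeline can select the second manifest via the BUNDLE_GEMFILE environment variable; other build systems usually offer a similar switch.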
While the “stable” pipeline is used to deliver your software and will never be broken by upstream changes, the bleeding edge pipeline provides feedback about the safety of upgrading and the magnitude of the integration pains ahead. For this pipeline you only monitor the number of compile errors, deprecation warnings, and failing tests, and get a feeling for the integration work ahead. If there are none, it’s probably safe to move to the latest versions. If there are issues, you might even start fixing some of these already, e.g. by removing calls to deprecated methods that will turn into compile errors in the bleeding edge pipeline.
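To make that monitoring concrete, here is a minimal Ruby sketch that summarizes a bleeding edge build log. The warning and failure line formats are assumptions modeled on typical Ruby tooling output, not any specific CI system:

```ruby
# Hypothetical sketch: boil a bleeding edge build log down to two
# numbers that indicate how much integration work lies ahead.
def edge_report(log)
  {
    deprecations: log.scan(/deprecat/i).size,     # count deprecation mentions
    failures:     log[/(\d+) failures/, 1].to_i   # pull the failure count
  }
end

log = <<~LOG
  warning: Object#foo is deprecated; use Object#bar instead
  warning: constant Baz is deprecated
  12 examples, 3 failures
LOG

edge_report(log)  # => { deprecations: 2, failures: 3 }
```

When both numbers are zero, bumping the pins in the stable pipeline is probably a non-event; rising numbers are an early warning to schedule migration work.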
But make sure you don’t fall back into bad old habits. Keep an eye on your bleeding edge pipeline and set up notifications for version bumps. You could, for example, scrape the build output for “Downloading …” lines to find out about new versions. And keep continuously upgrading the versions in your stable pipeline as soon as you have made sure it’s safe. Do some exploratory testing if you’re uncertain. But keep patching those libraries!
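Such a scrape could look like the following Ruby sketch. The “Downloading name version” log format and the known-versions hash are assumptions; adapt the regular expression to whatever your build tool actually prints:

```ruby
# Hypothetical sketch: find dependencies whose downloaded version
# differs from the last one we recorded as known-good.
def new_versions(log, known)
  log.scan(/^Downloading (\S+) (\S+)$/).reject do |name, version|
    known[name] == version
  end
end

log = <<~LOG
  Downloading sinatra 2.0.1
  Downloading rack 2.0.4
LOG

known = { "sinatra" => "2.0.1", "rack" => "2.0.3" }
new_versions(log, known)  # => [["rack", "2.0.4"]]
```

Feeding the result into a chat or email notification closes the loop: a version bump spotted in the edge pipeline becomes a prompt to upgrade the pins in the stable one.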