A Deep Dive Into Faster Release Management & Its Benefits
Warning, there will be a tiny bit of mathematics involved but it will mostly be words & pictures, so bear with me.
A pain everyone on the commercial side of a company is surely familiar with is the dreaded Microsoft Word "Track Changes" feature. You know, that horrible thing where everyone makes their own edits to a single Word document, the edits never quite line up, they clash with each other, and some poor soul has to somehow merge them all together at the end?
It's hell, literal hell, and it never works. Inevitably, two people's changes both get merged into the final version of the document and they don't line up. You end up with a sentence that... this often causes the... not to quite work.... spinach.
So what do you do? Well, the logical thing is to try to merge all the current changes into a new master document as often as possible to reduce the likelihood of conflicts. Now interestingly, this process is remarkably similar to the process of creating software. The only difference is that when you are writing software you are doing it across a "document" with - say - 1 million lines and 30 different editors, and you can only merge all the changes at the time of release.
How bugs happen
So you remember all those times you've seen real Word documents where parts didn't gel because different people's changes conflicted? Well, that happens in code too and it's one of the most common causes of bugs.
Every line of code in your code-base can interact with every other line of code. They can modify each other's behaviour, change the meaning of words, and loop the reader back to an earlier line and create an infinite spiral of doom. So when you have 30 people all making changes to your "document" at the same time, and merging all their changes at once at the end, you often find someone's code (usually several someones) causes another's to do weird, bug-like things.
A visual guide
The image above shows a situation in which two separate changes are made to a codebase within a single release. You can see from the picture that from these two code changes, there are two potential points of conflict:
- Change 1 breaks Change 2 (To make life easier, let's abbreviate this to "C1 X C2")
- Change 2 breaks Change 1 (C2 X C1)
So far, pretty simple - 2 changes, 2 opportunities for a bug. Sounds totally manageable. Let's introduce our next change.
Uh-oh... something's gone horribly wrong.
We added one extra change, but suddenly our number of potential breaking points has gone from 2 to 6.
1. C1 X C2
2. C1 X C3
3. C2 X C1
4. C2 X C3
5. C3 X C1
6. C3 X C2
So this isn't looking great... but surely it's just a slight speed bump, it won't keep getting worse. Let's introduce another change.
Wow! We had to introduce a whole new dimension of movement to keep track of all these conflict opportunities and our number of potential bugs has jumped from 6 to 12! So by now, you're starting to see the problem. The number of potential bugs grows with the square of the number of changes, which is quadratic growth rather than linear growth. In this simple pairwise model, it's the number of changes (n) squared, minus the number of changes (n*n - n).
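The counts so far (2, 6 and 12) can be reproduced by listing every ordered pair of changes. This is a minimal sketch, assuming each "Ca X Cb" is simply an ordered pair of two different changes; the function name `conflict_pairs` is made up for illustration.

```python
from itertools import permutations


def conflict_pairs(change_names):
    """Return every ordered pair (a, b), read as 'a breaks b'."""
    return list(permutations(change_names, 2))


for n in (2, 3, 4):
    names = [f"C{i}" for i in range(1, n + 1)]
    pairs = conflict_pairs(names)
    # The enumerated count should match the closed form n*n - n.
    assert len(pairs) == n * n - n
    print(n, len(pairs), ", ".join(f"{a} X {b}" for a, b in pairs))
```

For three changes this lists the same six pairs as the numbered list above, in the same order.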
So let's imagine we have 30 developers making an average of 2 "completed" changes each week on a monthly release cycle (1 release every 4 weeks).
- Weekly changes: 30 * 2 = 60
- Monthly changes: 60 * 4 = 240
- Potential bugs: 240 * 240 - 240 = 57,360
57,000+ potential bugs, how does the software EVER work?
Well, luckily developers are smart people and they've built all kinds of tools and processes to ensure that these points of conflict don't break the software. If you ever hear developers talking about a thing called "git" or discussing why it's so important "to have time to write automated tests", these 57,360 opportunities for a bug to be created every month are the reason.
So let's assume that your developers have got their distributed version control system (git) down to an art and let's assume that they're given the time to write all those automated tests they keep talking about. These two processes could probably catch 99.99% of those potential bugs so we're fine, right?
57,360 potential bugs / 10,000 (that's your 99.99% safety net) = 5.736.
5.736 bugs a release - that sounds about right, yeah?
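The monthly arithmetic can be written out as a few lines of Python. This is only a sketch of the numbers in this article; the constant and function names are invented for illustration, and the 99.99% catch rate is the assumption made above, not a measured figure.

```python
DEVELOPERS = 30
CHANGES_PER_DEV_PER_WEEK = 2
WEEKS_PER_RELEASE = 4         # monthly release cycle
SLIP_THROUGH_ONE_IN = 10_000  # 99.99% caught, so 1 in 10,000 gets through


def potential_bugs(changes):
    """Ordered pairs of distinct changes: n*n - n."""
    return changes * changes - changes


weekly_changes = DEVELOPERS * CHANGES_PER_DEV_PER_WEEK
release_changes = weekly_changes * WEEKS_PER_RELEASE
potential = potential_bugs(release_changes)
expected_in_production = potential / SLIP_THROUGH_ONE_IN

print(f"Weekly changes:      {weekly_changes}")
print(f"Changes per release: {release_changes}")
print(f"Potential bugs:      {potential:,}")
print(f"Bugs in production:  {expected_in_production:.3f}")
```

Changing `WEEKS_PER_RELEASE` is enough to try other release cycles.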
So what's this all got to do with faster releases?
Well, you see, maths is kinda cool when it comes to squared numbers. Every time you release, you merge all those changes back into the "master document" and you essentially "reset the counter" back to 0. Now that doesn't mean much at first, because quadratic growth is hard to picture, so let's run through the maths again. Except this time we'll double our release speed.
30 developers making 2 completed changes each week, except this time it's a fortnightly release cycle (1 release every 2 weeks).
- Weekly changes: 30 * 2 = 60
- Fortnightly changes: 60 * 2 = 120
- Potential bugs: 120 * 120 - 120 = 14,280
- Bugs in production: 14,280 / 10,000 = 1.428
That's a quarter the number of bugs per release. Now yes, we are releasing twice as often so the total number of bugs per month is closer to 2.9, but that's still half as many bugs per month from doing nothing differently other than releasing twice as often.
And you can get this benefit again if you release even faster. Let's do a weekly release cycle:
- Weekly changes: 30 * 2 = 60
- Potential bugs: 60 * 60 - 60 = 3,540
- Bugs in production: 3,540 / 10,000 = 0.354 per release (or roughly 1.4 per month)
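To compare the three release cycles side by side, the same calculation can be run once per cadence. A rough sketch, assuming a 4-week month and the 1-in-10,000 slip-through rate used above; the function names are hypothetical.

```python
WEEKLY_CHANGES = 30 * 2       # 30 developers, 2 completed changes each
WEEKS_PER_MONTH = 4
SLIP_THROUGH_ONE_IN = 10_000  # the 99.99% safety net


def bugs_per_release(weeks_per_release):
    """Expected bugs reaching production in one release."""
    n = WEEKLY_CHANGES * weeks_per_release
    return (n * n - n) / SLIP_THROUGH_ONE_IN


def bugs_per_month(weeks_per_release):
    """Expected bugs reaching production across a 4-week month."""
    releases = WEEKS_PER_MONTH / weeks_per_release
    return bugs_per_release(weeks_per_release) * releases


for label, weeks in (("monthly", 4), ("fortnightly", 2), ("weekly", 1)):
    print(
        f"{label:<12} per release: {bugs_per_release(weeks):.3f}"
        f"   per month: {bugs_per_month(weeks):.3f}"
    )
```

The per-month column is the fair comparison, since it accounts for releasing more often.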
It's all in the numbers
There you have it, mathematical proof that releasing more often doesn't just reduce the chance of bugs in each release, it also reduces your likelihood of bugs overall. For every 50% reduction in time-to-release, you effectively reduce your likelihood of a bug in production overall by roughly 50%.
Monthly releases to weekly releases = roughly 75% reduction in likely bugs in production (from about 5.7 to about 1.4 per month).
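Why halving the release time roughly halves the monthly bug count can be seen from a short derivation. The symbols are labels introduced for this sketch only: r is changes per week (60 here), T is weeks per release, m is the fraction of potential bugs that slip through (1/10,000 here), and B(T) is the expected number of bugs in production per 4-week month.

```latex
\begin{align*}
n &= rT \\
\text{potential bugs per release} &= n^2 - n = rT(rT - 1) \\
B(T) &= \frac{4}{T} \cdot m \cdot rT(rT - 1) = 4mr(rT - 1) \\
\frac{B(T/2)}{B(T)} &= \frac{rT/2 - 1}{rT - 1} \approx \frac{1}{2}
  \quad \text{when } rT \gg 1
\end{align*}
```

With r = 60 and m = 1/10,000 this gives B(T) = 0.024(60T - 1), so B(4) = 5.736, B(2) = 2.856 and B(1) = 1.416. The monthly bug count is close to proportional to the release interval.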
Notes on the math
This is a drastic oversimplification of the realities of certain situations.
For starters, I don't even begin to consider the effect on the "potential conflicts" number of two independent changes combining to break a third change, and beyond (too much maths... yuck). However, the key thing to know is that if that were taken into account it would actually make the benefits of faster releases MORE prominent (potentially by an extra order of magnitude), and not less.
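One way to get a feel for that last point is to count "two changes combine to break a third" opportunities as well. The sketch below is an assumption-heavy illustration, not part of the model above: it assumes one opportunity for every unordered pair of changes acting on each distinct third change, and it ignores the safety net entirely. The function names are invented.

```python
from math import comb

WEEKLY_CHANGES = 30 * 2
WEEKS_PER_MONTH = 4


def pairwise(n):
    """The article's count: one change breaks another (n*n - n)."""
    return n * n - n


def pair_on_third(n):
    """ASSUMED count: each of n changes can be broken by any
    unordered pair chosen from the other n - 1 changes."""
    return n * comb(n - 1, 2)


def monthly_total(count_fn, weeks_per_release):
    """Opportunities summed over every release in a 4-week month."""
    n = WEEKLY_CHANGES * weeks_per_release
    releases_per_month = WEEKS_PER_MONTH // weeks_per_release
    return count_fn(n) * releases_per_month


for name, fn in (("pairwise", pairwise), ("pair-on-third", pair_on_third)):
    monthly = monthly_total(fn, 4)
    weekly = monthly_total(fn, 1)
    print(
        f"{name:<14} monthly cycle: {monthly:>9,}"
        f"   weekly cycle: {weekly:>9,}"
        f"   reduction: {1 - weekly / monthly:.1%}"
    )
```

Under this assumed counting, moving from monthly to weekly releases cuts the three-way opportunities by a larger share than the pairwise ones, which points in the same direction as the claim above.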