Code Rot Is a Process Decision

Inevitably, code degrades as it scales. This is only a problem if your process robs you of the power to refactor.

Daniel Niland
Better Programming

Photo by Giuseppe CUZZOCREA on Unsplash

I’ve had the great fortune to work for an organization that made a good-faith effort to empower its teams. And I’ve worked at other places that didn’t.

In my role as a dev lead, therefore, I’ve seen firsthand the difference between the two: empowered and… what’s the opposite of empowered? Not exactly disempowered, more… unpowered.

“Unpowered” because there’s no ‘oomph’ to it. It constantly needs to be prodded and pushed. It won’t move by itself.

What do I mean by that? Let’s talk about the way unpowered organizations affect technical work.

Oh boy. So many topics to choose from — ineffective use of resources; poor documentation; a lack of meaningful ways to improve the process from below.

But I’m a dev lead. I’m coming at this from an engineering perspective. Most books and articles on the empowered process don’t look too deeply into how it directly improves the engineering process.

I’m here to rectify this situation. But to understand how empowerment propels us forward, we must first understand how a lack of empowerment holds us back.

Let’s look at code rot.

Technical Debt: Someone Else’s Problem

I’m not satisfied with the concept of ‘technical debt.’

The first time I heard the term many years ago, I thought it was very clever. It’s a great way to convey the idea that there is a cost for decisions that trade long-term stability for short-term gains.

But does this concept prevent these short-term decisions from actually being made? Not that I’ve ever seen.

As far as I can tell, the most common understanding of technical debt is that it’s an act of neglect by too-hurried or too-inexperienced developers.

I mean, this type of code damage does indeed happen. Engineers not allowed to spend time ‘getting the code right’ will produce bad, poorly crafted code. Inexperienced engineers, if left unsupervised, can rip through a codebase like a tornado, leaving devastation behind them.

The common understanding of technical debt, however, places the onus squarely on the shoulders of the engineers. The real constraint is time, but the desire is to treat time as a fixed value and look for some other scapegoat. If only the engineers were better, if only we had some of those 10x developers we’re always hearing about, if only they worked harder, then we could sidestep this whole technical debt problem.

Therefore there’s nothing to stop management from saying, ahh, we’re gonna need you to go ahead and come in on Saturday, m’kay? Job done. High five!

So we need another term. Something visceral enough to shock the suits.

Thus, code rot.

Genuine code rot can’t be swept under the rug; it is the product of a bad process. It is the direct result of poor management. Only organizational change can fix it.

Here’s how it works:

Code Rot Step 1: Real Decisions are Made By The Real Players

A roadmap is basically a financial artifact. It allows companies to apportion budgets over the short and long term. As such, it is the literal manifestation of real power in the organization. Decisions about what goes onto the roadmap are fiercely political.

There’s a lot of ‘ego’-measuring going on here. The more say you have over the roadmap, the more money you control, and the further you can pull out your measuring tape.

And you maintain your spot in the room by completing projects on time and on budget. Do those completed projects need to work well? Theoretically, yes. In practice, timely completion is usually a good enough proxy for success; the rest can be damage-controlled.

The last thing upper management could imagine is giving up control over the roadmap. Such an idea just doesn’t compute for them. Any problems in the organization must be due to something, anything else.

Code Rot Step 2: Enter Program Management

So when things inevitably get complicated, upper management doubles down.

The most common way to ensure that projects on the roadmap are completed on time and on budget is to create a team with special extra-judicial powers: program management. This group is an extension of the financial arm of the company. Thus, they answer directly to upper management.

I’ve never met a program manager who isn’t laser-focused on dates. They will grill you with all the zeal of the Spanish Inquisition, stopping at nothing to get you to commit to a date, no matter how hypothetical.

They then enter that date into their scheduling algorithms. I don’t know what they see in these mystical mirrors of bent reality, but it isn’t pretty. They are nearly always in crisis, and they are tired of your excuses.

And the very last thing they want to hear, ever, is that your team needs to postpone a project to clean up ‘technical debt.’

Code Rot Step 3: Project Plan vs. Reality, the Smack-down

Projects are created to add various features to the product. Given the financial nature of the roadmap, these projects are usually tied, directly or indirectly, to funding. That means they need to be costed out before they are started, which in turn means the general scope of work must be ‘known’ beforehand.

Most project plans, though not all, are smart enough to stay away from implementation details. The plans were probably hashed out by various managers, architects, and even people on the project team. Once this plan is set and the project is green-lit, engineers are expected to follow the project bounds and go at it.

But software engineering projects are like jumping out of a plane; you can’t know what it’s like until you actually do it.

If the product as a whole were shown on a diagram, then the domain of each project would look something like this:

[Figure: a mapping of roadmap projects to the product domain, with lots of gray areas left over.]
Whenever work is apportioned by projects, there are plenty of gray areas without coverage.

Notice that each project’s limited nature leaves many gray areas where no one is immediately responsible for the code’s upkeep. Even if a team is nominally tasked with maintaining a specific domain, if they are project-driven, they will have a disincentive to work on those parts that aren’t specifically called out in the current project plan. That’s not the way to make the program manager happy.

Code Rot Step 4: Gray Areas Lead to Lack of Ownership

While building features, it is very common to discover gaps in systems outside your ‘responsibility.’ Services you thought should be there aren’t. APIs don’t quite do what you need. Shared code doesn’t handle your use case.

So what’s a team constantly under time pressure to do?

You can look for an owner and ask them to take care of it for you. In a project-led organization, however, the services or code in question often have no real owner. The team that created the code may have moved on to other things or may have disbanded. Even if there is a clear owner, the team is usually busy with their own current projects. They’ll politely put your request ‘on the backlog’ — another way of saying, yeah right, buddy.

Your team, then, has no real alternative but to work around these issues. You can build a new service from scratch. You can wrap the service in a facade that inserts your use case. You can add your bit of logic to the shared code.

If you’re conscientious (which I know you are), you’ll treat this seriously… or as seriously as you can. You’ll write good code — that does the minimum to unblock your team. You’ll test it — against your use case. You’ll make sure you don’t disturb whatever was there before — by extending base classes.
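
To make that workaround concrete, here is a minimal Python sketch. Everything in it is hypothetical: `PriceFormatter` stands in for a piece of shared code with no clear owner, and `GiftCardPriceFormatter` is the kind of subclass a time-pressed team might add to cover only its own use case.

```python
class PriceFormatter:
    """Hypothetical shared code with no clear owner."""

    def format(self, amount_cents: int, currency: str) -> str:
        return f"{amount_cents / 100:.2f} {currency}"


class GiftCardPriceFormatter(PriceFormatter):
    """Hypothetical workaround: extend the base class, touch nothing else."""

    def format(self, amount_cents: int, currency: str) -> str:
        # Only our own use case is handled here; every other input
        # falls through to the unowned base behavior.
        if currency == "GIFT":
            return f"{amount_cents // 100} gift credits"
        return super().format(amount_cents, currency)


formatter = GiftCardPriceFormatter()
print(formatter.format(1250, "USD"))
print(formatter.format(500, "GIFT"))
```

Nothing here is wrong, exactly. The subclass does the minimum, is tested against one use case, and leaves the base class undisturbed. It also quietly adds a second place where formatting rules live, and nobody is responsible for keeping the two in sync.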

But do you really think this code will be as good as if it were created by a team that is unambiguously responsible for the long-term care of their product?

And here’s the kicker. Over the years and all those project iterations, even the code your own team created becomes external to your current projects — it becomes part of the gray area.

Code Rot Step 5: Bad Code Breeds

Any experienced software engineer knows that bits of half-baked code are common. Often they happen because when we first write the code, we don’t really understand how it will be used. Any team worth its salt will revisit old decisions and refactor any creaky code to a higher standard once usage patterns are clear.

But when no one owns the domain in which the code resides, it’s difficult to understand both the intention and current usage. That random service. That crazy facade. That extended bit of code. Who’s calling it, and why? There’s likely no documentation.

This makes it risky and time-consuming to do all the things most good teams do to keep their code tidy and easy to work with. And who has time for that anyway? The program manager is breathing down your neck to finish the next feature.

But broken windows lead to ruined houses. That service that started as a one-off may become a dumping ground for random functionality. If one team needed it for something, chances are other teams have similar (but not identical) needs. If there’s no ownership, there’s no real design. If you’re unlucky, you get the worst-case scenario: look at the service sideways, and it brings down your whole system.

That facade may get repackaged such that everyone forgets what lies behind it. Then along comes another team that changes the base code, and bam, half your services start throwing strange errors.
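
Here is a small Python sketch of how that can play out. All the names are invented for illustration: `LegacyUserService` is an unowned service, `UserFacade` is the wrapper that quietly depends on its output format, and `ChangedUserService` is what the service looks like after another team "cleans it up."

```python
class LegacyUserService:
    """Hypothetical unowned service. Returns names as 'Last, First'."""

    def lookup(self, user_id: int) -> str:
        return {1: "Lovelace, Ada"}.get(user_id, "")


class UserFacade:
    """Hypothetical facade that silently relies on the 'Last, First' format."""

    def __init__(self, service: LegacyUserService) -> None:
        self._service = service

    def first_name(self, user_id: int) -> str:
        raw = self._service.lookup(user_id)
        # Undocumented assumption: the name always contains ", ".
        return raw.split(", ")[1]


class ChangedUserService(LegacyUserService):
    """Another team changes the format to 'First Last'."""

    def lookup(self, user_id: int) -> str:
        return {1: "Ada Lovelace"}.get(user_id, "")


print(UserFacade(LegacyUserService()).first_name(1))

try:
    UserFacade(ChangedUserService()).first_name(1)
except IndexError as exc:
    # The failure shows up in the facade, far from the change that caused it.
    print("facade broke:", exc)
```

The team that changed the service did nothing unreasonable. The facade's assumption was never written down, so there was nothing to warn them.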

That inherited class may, in turn, be inherited by other classes, freezing together unrelated features, requiring funky hard-to-understand state, causing strange race conditions.
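
A minimal Python sketch of features getting frozen together through inheritance might look like this. The classes are hypothetical, and the example only shows the coupling, not the race conditions: Team B inherits from Team A's workaround to reuse its plumbing and picks up Team A's auditing behavior whether it wants it or not.

```python
class Exporter:
    """Hypothetical shared base class."""

    def __init__(self) -> None:
        self.rows: list[dict] = []

    def add(self, row: dict) -> None:
        self.rows.append(row)

    def export(self) -> list:
        return list(self.rows)


class AuditedExporter(Exporter):
    """Team A's workaround: stamp every row with an audit flag."""

    def add(self, row: dict) -> None:
        row["audited"] = True  # mutates the caller's dict as a side effect
        super().add(row)


class CompactExporter(AuditedExporter):
    """Team B's workaround: export only the column names of each row."""

    def export(self) -> list:
        # sorted(row) returns the dict's keys in sorted order.
        return [sorted(row) for row in self.rows]


exporter = CompactExporter()
exporter.add({"id": 1})
print(exporter.export())  # the 'audited' column appears uninvited
```

Team B never asked for an `audited` column, but removing it now means understanding Team A's class, and Team A may no longer exist.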

As the system grows, the gray areas become larger and larger in relation to the current project work, eventually dominating the codebase.

The possibilities for failure due to neglect are endless. As Tolstoy said: All well-built codebases are alike, but every coupled codebase is coupled in its own way.

Code Rot End Game: Collapse

An untended garden will be overwhelmed by weeds.

You can hire the best developers, you can sprinkle in agile coaches, you can even ask everyone to come work on Saturday, but nonetheless, a codebase without real ownership will rot. Every time.

And at scale, rot eventually leads to collapse. Every time.

At that point, your software either stops growing (all resources go into keeping the system limping along), or you start over with the next version.

Upper management might get upset at the incompetence of their engineers, but what can they do?

Look in a mirror? Probably not.

No. We’ll tell them this time will be different. We’ll tell them everyone has learned their lesson.

And if you’re given the wonderful reprieve of working on a new version, it usually takes a few years before the code rot returns.

But you’ll probably be gone by then anyway, am I right?

If you’ve never known anything else, it’s hard to imagine an alternative to the unpowered way of doing things.

In my next essay, I’ll describe how ownership actually works on an empowered team.
