Mistakes engineers make in large established codebases

(seangoedecke.com)

Comments

peterldowns 7 January 2025
I agree that consistency is important — but what about when the existing codebase is already inconsistent? Even worse, what if the existing codebase is both inconsistent and the "right way to do things" is undocumented? That's much closer to what I've experienced when joining companies with lots of existing code.

In this scenario, I've found that the only productive way forward is to do the best job you can, in your own isolated code, and share loudly and frequently why you're doing things your new different way. Write your code to be re-used and shared. Write docs for it. Explain why it's the correct approach. Ask for feedback from the wider engineering org (although don't block on it if they're not directly involved with your work.) You'll quickly find out if other engineers agree that your approach is better. If it's actually better, others will start following your lead. If it's not, you'll be able to adjust.

Of course, when working in the existing code, try to be as locally consistent as possible with the surrounding code, even if it's terrible. I like to think of this as "getting in and out" as quickly as possible.

If you encounter particularly sticky/unhelpful/reticent team members, it can help to remind them that (a) the existing code is worse than what you're writing, (b) there is no documented pattern that you're breaking, (c) your work is an experiment and you will later revise it. Often asking them to simply document the convention that you are supposedly breaking is enough to get them to go away, since they won't bother to spend the effort.

mjr00 7 January 2025
> The other reason is that you cannot split up a large established codebase without first understanding it. I have seen large codebases successfully split up, but I have never seen that done by a team that wasn’t already fluent at shipping features inside the large codebase. You simply cannot redesign any non-trivial project (i.e. a project that makes real money) from first-principles.

This resonates. At one former company, there was a clear divide between the people working on the "legacy monolith" in PHP and the "scalable microservices" in Scala/Go. One new Scala team was tasked with extracting permissions management from the monolith into a separate service. Was estimated to take 6-9 months. 18 months later, project was canned without delivering anything. The team was starting from scratch and had no experience working with the current monolith permissions model and could not get it successfully integrated. Every time an integration was attempted they found a new edge case that was totally incompatible with the nice, "clean" model they had created with the new service.

pablobaz 7 January 2025
In my experience with very large codebases, a common problem is devs trying to improve random things.

This is well intentioned. But in a large old codebase finding things to improve is trivial - there are thousands of them. Finding and judging which things to improve that will actually have a real positive impact is the real skill.

The terminal case of this is developers who, in the midst of another task, try to improve one little bit, but pulling on that thread leads to them attempting bigger and bigger fixes that are never completed.

Knowing what to fix and when to stop is invaluable.

Animats 8 January 2025
> Single-digit million lines of code (~5M, let’s say)

> Somewhere between 100 and 1000 engineers working on the same codebase

> The first working version of the codebase is at least ten years old

That's 5,000 to 50,000 lines of code per engineer. Not understaffed. A worse problem is when you have that much code, but fewer people. Too few people for there to be someone who understands each part, and the original authors are long gone. Doing anything requires reverse engineering something. Learning the code base is time-consuming. It may be a year before someone new is productive.

Such a job may be a bad career move. You can spend a decade learning a one-off system, gaining skills useless in any other environment. Then it's hard to change jobs. Your resume has none of the current buzzwords. This helps the employer to keep salaries down.

lmm 8 January 2025
I've worked in codebases like this and disagree. Consistency isn't the most important thing, making your little corner of the codebase nicer than the rest of it is fine, actually, and dependencies are great - especially as they're the easiest way to delete code (the article is right about the importance of that). What's sometimes called the "lava layer anti-pattern" is actually a perfectly good way of working that tends to result in better systems than trying to maintain consistency. As Wall says, the three cardinal virtues of a programmer are laziness, impatience, and hubris; if you don't believe you can make this system better then why would you even be working on it?

Also if the system was actually capable of maintaining consistency then it would never have got that large in the first place. No-one's actual business problem takes 5M lines of code to describe, those 5M lines are mostly copy-paste "patterns" and repeated attempts to reimplement the same thing.

IvyMike 7 January 2025
The "cardinal mistake is inconsistency" point is 100% true. We used to call the guiding philosophy of working in these codebases "When in Rome".

dav 8 January 2025
I have three maxims that basically power all my decisions as an engineer:

1. The three C’s: Clarity always, Consistency with determination, Concision when prudent.

2. Keep the pain in the right place.

3. Fight entropy!

So in the context of the main example in this article, I would say you can try to improve clarity by e.g. wrapping the existing auth code in something that looks nicer in the context of your new endpoint but try very hard to stay consistent for all the great reasons the article gives.
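A minimal sketch of that kind of wrapper, in Python. Everything here is hypothetical: `legacy_authenticate` stands in for whatever awkward helper the codebase already has, and `current_user` is an illustrative name. The point is only that the wrapper changes the call site, while every auth decision (including bot users) still goes through the existing code.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical stand-in for the existing, awkward-to-call legacy auth helper.
def legacy_authenticate(headers: dict, allow_bots: int, realm: str) -> Optional[dict]:
    token = headers.get("X-Auth-Token")
    if token == "user-token":
        return {"uid": 1, "kind": "human"}
    if token == "bot-token" and allow_bots == 1:
        return {"uid": 2, "kind": "bot"}
    return None


@dataclass(frozen=True)
class CurrentUser:
    user_id: int
    is_bot: bool


def current_user(headers: dict) -> Optional[CurrentUser]:
    """Thin wrapper: a clearer call site, with all decisions left to the legacy helper."""
    record = legacy_authenticate(headers, allow_bots=1, realm="default")
    if record is None:
        return None
    return CurrentUser(user_id=record["uid"], is_bot=record["kind"] == "bot")
```

A new endpoint would call `current_user(request_headers)` and never touch the legacy signature directly, so it stays consistent with the rest of the codebase by construction.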

protonbob 7 January 2025
I don't have a real critique because I don't have that many years in a codebase the size of OP (just 2). But I struggle with the advice to not try and make a clean section of the code base that doesn't depend on the rest of the application.

Isn't part of good engineering trying to reduce your dependencies, even on yourself? In a later part of the post, OP says to be careful tweaking existing code, because it can have unforeseen consequences. Isn't this the problem that having deep vertical slices of functionality tries to solve? High cohesion in that related code is grouped together, and low coupling in that you can add new code to your feature or modify it without worrying about breaking everyone else's code.

Does this high cohesion and low coupling just not really work at the scale that OP is talking about?

maxwellg 7 January 2025
I never really understood putting consistency on a pedestal. It's certainly nice when everything operates exactly the same way - but consistency for consistency's sake is awful to work in too. If a team realizes that logging library B is better than library A, but NEVER switches from A to B because of consistency concerns, then in two years they'll still all be using inferior tools and writing worse code. Similarly, if a team DOES decide to switch from A to B, they probably shouldn't spend months rewriting all previous code to use the new tool. It's ok for multiple established patterns to live in the same codebase, so long as everyone has an understanding of what the "correct" pattern should be for all new code.

Kon5ole 8 January 2025
Sometimes the right approach is to keep the consistency. Other times, that approach is either impossible or catastrophic.

IMO software development is so diverse and complex that universal truths are very very rare.

But to us programmers, anything that promises to simplify the neverending complexity is very tempting. We want to believe!

So we're often the equivalent of Mike Tyson reading a book by Tiger Woods as we look down a half-pipe for the first time. We've won before and read books by other winners, now we're surely ready for anything!

Which leads to relational data stored in couchDB, datalayers reimplemented as microservices, simple static sites hosted in kubernetes clusters, spending more time rewriting tests than new features, and so on.

IMO, most advice in software development should be presented as "here's a thing that might work sometimes".

hnanon98791 8 January 2025
> Single-digit million lines of code (~5M, let’s say)

> Somewhere between 100 and 1000 engineers working on the same codebase

> The first working version of the codebase is at least ten years old

> The cardinal mistake is inconsistency

Funnily enough, the author notes exactly why consistency is impossible in such a project and then proceeds to call inconsistency the cardinal mistake.

You cannot be consistent in a project of that size and scope. Full stop. Half those engineers will statistically be below average and constantly dragging the codebase towards their skill level each time they make a change. Technology changes a lot in ten years, people like to use new language features and frameworks.

And the final nail in the coffin: the limits of human cognition. To be consistent you must keep the standards in working memory. Do you think this is possible when the entire project is over a million LOC? Don't be silly.

There's a reason why big projects will always be big balls of mud. Embrace it. http://www.laputan.org/mud/

0xbadcafebee 7 January 2025
"Coding defensively" is perhaps the understatement of the year. Good software architecture is, in my opinion, the single most powerful tool you have to keep things from becoming an unmanageable mess.

If I could give one "defensive coding" tip, it would be for seniors doing the design to put in road blocks and make examples that prevent components from falling for common traps (interdependency, highly variant state, complex conditions, backwards-incompatibility, tight coupling, large scope, inter-dependent models, etc) so that humans don't have to remember to avoid those things. Make a list of things your team should never do and make them have a conversation with a senior if they want to do it anyway. Road blocks are good when they're blocking the way to the bad.
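One way such a road block could look in practice is a small check, run in CI, that fails when a module imports something on the team's "never do this without a conversation" list. The sketch below uses Python's standard `ast` module; the forbidden package name and the function name are made up for illustration.

```python
import ast

# Hypothetical list of packages that should not be imported without a design discussion.
FORBIDDEN_PACKAGES = {"billing.internal"}


def forbidden_imports(source: str, forbidden: set[str]) -> list[str]:
    """Return the imported module names in `source` that fall under a forbidden package."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        for name in names:
            if any(name == pkg or name.startswith(pkg + ".") for pkg in forbidden):
                found.append(name)
    return found
```

A CI step could run this over every changed file and fail the build with a message pointing at whoever owns the rule, so nobody has to remember the rule themselves.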

Starting with good design leads to continuing to follow good design. Not starting with good design leads to years of pain. Spend a lot more time on the design than you think you should.

forty 7 January 2025
I have been working for many years (almost since the beginning) on a code base that is now 14 years old and well over 1M LoC of TypeScript (for Node.js). We are only 20-30 engineers working on it, rather than the 100-1000 suggested in the article. And I can say I couldn't agree more with the article.

If you have to work on such projects, there are two things to keep in mind: consistency and integration tests.

zzbzq 8 January 2025
Wrong, wrong. Opposite of everything he said. All his examples are backwards. The article is basically inverting the Single Responsibility Principle.

First of all, consistency does not matter at all, ever. That's his main thesis so it's already wrong. Furthermore, all his examples are backwards. If you didn't know the existence of "bot" users, you probably don't want your new auth mechanism to support them. Otherwise, the "nasty surprise" is the inverse of what he said: not that you find you don't support bot users, but you find out that you do.

Build stuff that does exactly what you want it to do, nothing more. This means doing the opposite of what he said. Do not re-use legacy code with overloaded meanings.

adamc 7 January 2025
I really liked this: "as a general rule, large established codebases produce 90% of the value."

People see the ugliness -- because solving real problems, especially if business practices are involved, is often very messy -- but that's where the value is.

o_nate 7 January 2025
A big part of this advice boils down to the old adage: "Don't remove a fence if you don't know why it was put there." In other words, when making changes, make sure you preserve every behavior of the old code, even things that seem unnecessary or counter-intuitive.

sirmarksalot 7 January 2025
Consistency is often helpful, but you also need to be wary of cargo culting. For example, you see a server back end that uses an ORM model and you figure you'll implement your new feature using the same patterns you see there. Then a month later the author of the original code you cribbed comes by and asks you, "just out of curiosity, why did you feel the need to create five new database tables for your feature?"

I know, that's a pretty specific "hypothetical," but that experience taught me that copying for the sake of consistency only works if you actually understand what it is you're copying. And I was also lucky that the senior engineer was nice about it.

ram_rar 7 January 2025
> The other reason is that you cannot split up a large established codebase without first understanding it. I have seen large codebases successfully split up, but I have never seen that done by a team that wasn’t already fluent at shipping features inside the large codebase

This doesn't resonate with me. I've worked with multiple large code bases (5M+), and splitting the codebase is usually a reflection of org structure and bifurcation of domain within eng orgs. While it may seem convoluted at first, it's certainly doable and gets easier as you progress along. Also, code migrations of this magnitude are usually carried out by core platform-oriented teams that rarely ship customer-facing features.

jongjong 7 January 2025
I'm now working on a codebase which is quite large (13 micro-services required to run the main product); all containerized to run on Kubernetes. The learning curve was quite steep but luckily, I was familiar with most of the tech so that made it easier (I guess that's why they hired me). The project has been around for over 10 years so it has a lot of legacy code and different repos have different code styles, engine versions and compatibility requirements.

The biggest challenge is that it used to be maintained by a large team and now there are just 2 developers. Also, the dev environment isn't fully automated so it takes like 20 minutes just to launch all the services locally for development. The pace of work means that automating this hasn't been a priority.

It's a weird experience working on such a project because I know for a fact that it would be possible to create the entire project from scratch using only 1 to 3 services max and we would get much better performance, reliability, maintainability etc... But the company wouldn't be willing to foot the cost of a refactor so we have to move at a steady snail's pace. The slow pace is because of the point mentioned in the article; the systems are all intertwined and you need to understand how they integrate with one another in order to make any change.

It's very common that something works locally but doesn't work when deployed to staging because things are complicated on the infrastructure side with firewall rules, integration with third-party services, build process, etc... Also, because there are so many repos with different coding styles and build requirements, it's hard to keep track of everything because some bug fixes or features I implement touch on like 4 different repos at the same time and because deployment isn't fully automated, it creates a lot of room for error... Common issues include forgetting to push one's changes or forgetting to make a PR on one of the repos. Or sometimes the PR for one of the repos was merged but not deployed... Or there was a config or build issue with one of the repos that was missed because it contained some code which did not meet the compatibility requirements of that repo...

praptak 8 January 2025
Another post from the same author puts this in an interesting context: https://www.seangoedecke.com/glue-work-considered-harmful/ (follow up: https://www.seangoedecke.com/cynicism/)

Keeping the code base tidy is glue work, so you should only do enough of it to ship features. So maybe these are not "mistakes" but rather tactical choices made by politically smart engineers focused on shipping features.

nadis 13 January 2025
A little buried since the overarching focus is on consistency, but I found these two paragraphs from the blog post really relevant as well:

"You need to develop a good sense of how the service is used in practice (i.e. by users). Which endpoints are hit the most often? Which endpoints are the most crucial (i.e. are used by paying customers and cannot gracefully degrade)? What latency guarantees must the service obey, and what code gets run in the hot paths? One common large-codebase mistake is to make a “tiny tweak” that is unexpectedly in the hot path for a crucial flow, and thus causes a big problem.

You can’t rely on your ability to test the code in development like you can in a small project. Any large project accumulates state over time (for instance, how many kinds of user do you think GMail supports?) At a certain point, you can’t test every combination of states, even with automation. Instead, you have to test the crucial paths, code defensively, and rely on slow rollouts and monitoring to catch problems."
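To make the "slow rollouts" part concrete, here is a minimal sketch of a deterministic percentage rollout in Python. The function name, the feature key, and the choice of SHA-256 bucketing are assumptions for illustration, not something the article prescribes; real systems usually get this from a feature-flag service.

```python
import hashlib


def in_rollout(user_id: str, feature: str, percent: int) -> bool:
    """Deterministically place a user in one of 100 buckets for a given feature.

    The same user always lands in the same bucket, so raising `percent`
    only ever adds users to the rollout and never removes them.
    """
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent
```

A risky "tiny tweak" can then be guarded with something like `if in_rollout(user_id, "new-auth-path", 5): ...` and widened step by step while monitoring the crucial paths.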

ge96 7 January 2025
I'm just thinking about this time at a previous job, I was reviewing a PR and they decided to just find/replace every variable and switch from snake to camel case. I was like "why are you guys doing this, not part of the job". There was some back and forth on that. This is a place where PRs weren't about reviews but just a process to follow, ask someone to approve/not expect feedback.

edit: job = ticket task

jongjong 7 January 2025
I once worked on a large project in the past where it took 3 days to rename a field in an HTTP response because of how many services and tests were affected. Just getting that through QA was a huge challenge.

Working in a large dev team, focusing on a small feature and having a separate product manager and QA team makes it easier to handle the scale though. Development is very slow but predictable. In my case, the company had low expectations and management knew it would take several months to implement a simple form inside a modal with a couple of tabs and a submit button. They hired contractors (myself included), paying top dollar to do this; for them, the ability to move at a snail's pace was worth it if it provided a strong guarantee that the project would eventually get done. I guess companies above a certain size have a certain expectation of project failure or cancellation so they're not too fussed about timelines or costs.

It's shocking coming from a startup environment where the failure tolerance is 0 and there is huge pressure to deliver on time.

crabbone 7 January 2025
OP has some particular type of project in mind, where what they say probably makes sense. Not all large codebases are like that.

For example, it could be a lot of individual small projects all sitting on some common framework. Just as an example: I've seen a catering business that had an associated Web site service which worked as follows. There was a small framework that dealt with billing and navigation etc. issues, and a Web site that was developed per customer (a couple hundred shops). These individual sites constituted the bulk of the project, but outside of the calls to the framework shared nothing between them, were developed by different teams, added and removed based on customer wishes etc. So, consistency wasn't a requirement in this scheme.

Similar things happen with gaming portals, where the division is between some underlying (and relatively small) framework and a bunch of games that are provided through it, which are often developed by teams that don't have to talk to each other. But, to the user, it's still a single product.

LAC-Tech 8 January 2025
Debates about the technical merits aside...the alternative to allowing people to build their own nice little corner of a legacy codebase is not a bunch of devs building a consistent codebase. It's devs not wanting to touch the codebase at all.

Working on an old shitty codebase is one thing. Being told you have to add to the shit is soul crushing.

nitwit005 7 January 2025
Except, the old stuff will be effectively untestable, and they'll demand near perfect coverage for your changes.

Also, there will be four incomplete refactorings, and people will insist on it matching the latest refactoring attempt. Which will then turn out to be impossible, as it's too unfinished.

dktoao 8 January 2025
This is good advice, but only if it has been followed from the beginning and consistently throughout the development of the original code. It is applicable to large organizations with lots of resources who hire professional developers and have a lot of people who are familiar with the code that are active in code reviews and have some minimum form of documentation / agreement on what the logic flow in the code should look like (the article does not claim otherwise). But I would warn those who work at the other 80% of companies that this advice is nearly useless, and YMMV trying to follow it. The one thing that I think is universally good advice is to try and aggressively remove code whenever possible.

physicsguy 8 January 2025
A big problem I come across is also half-assing the improvements.

Take as an example - for some reason you need to update an internal auth middleware library, or a queue library - say there is a bug, or a flaw in design that means it doesn't behave as expected in some circumstances. All of your services use it.

So someone comes along, patches it, makes the upgrade process difficult / non-trivial, patches the one service they're working on, and then leaves every other caller alone. Maybe they make people aware, maybe they write a ticket saying "update other services", but they don't push to roll out the fixed version in the things they have a responsibility for.

luisgvv 8 January 2025
My only advice is "if it ain't broke don't fix it". And if you're going to improve something, make sure it's something small and local, ideally further from the "core logic" of the business.

KronisLV 8 January 2025
> If they use some specific set of helpers, you should also use that helper (even if it’s ugly, hard to integrate with, or seems like overkill for your use case). You must resist the urge to make your little corner of the codebase nicer than the rest of it.

This reads like an admission of established/legacy codebases somewhat sucking to work with, in addition to there being a ceiling for how quickly you can iterate, if you do care about consistency.

I don't think that the article is wrong, merely felt like pointing that out - building a new service/codebase that doesn't rely on 10-year-old practices or code touched by dozens of developers will often be far more pleasant, especially when the established solution doesn't always have the best DX (like docs that tell you about the 10 abstraction layers needed to get data from an incoming API call through the database and back to the user, and enough tests).

Plus, the more you couple things, the harder it will be to actually change anything, if you don't have enough of the aforementioned test coverage - if I change how auth/DB logic/business rules are processed due to the need for some refactoring to enable new functionality, it might either go well or break in hundreds of places, or worse yet, break in just a few untested places that aren't obvious yet, but might start misbehaving and lead to greater problems down the road. That coupling will turn your hair gray.

valzam 9 January 2025
During my last job I formulated what for me are the 2 unquestionable metrics to care about when trying to build long-term maintainable systems:

- Consistency (fully agree with the article here)

- Control

Control to me means that you have to work extremely hard to lose the ability to change the parts you care about. For example:

- Do not leak libraries and frameworks far into your business logic. At some point you want to introduce a new capability but say the class/type you re-used from a library makes it really awkward. Now you are faced with a huge refactor. The more logic, the purer and simpler the code should be. Ideally stdlib only.

- Do not build magic, globally shared test harnesses. Helpers yes, but if you give up control over the environment a test runs in / setting up fixtures, test data etc. you will run into a world of pain due to dependencies between tests and especially the test data.

- Do not let libraries dictate your application architecture. E.g. I always separate the web framework layer (controllers, views etc.) from the service and data layers.

- Consistency plays a major part here. If you introduce 3 libraries to do the same thing you have basically given up control over those dependencies and refactors in the future will be much harder.
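As a rough illustration of the first and third points, here is a Python sketch that keeps the service layer on plain standard-library types and confines web concerns to a thin adapter. The `Order` type, `OrderService`, and the request-as-dict shape are all invented for the example; a real web framework would pass its own request object to the adapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    order_id: int
    total_cents: int


class OrderService:
    """Business logic: plain Python only, no web framework types leak in here."""

    def __init__(self, orders: dict[int, Order]):
        self._orders = orders

    def apply_discount(self, order_id: int, percent: int) -> Order:
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        order = self._orders[order_id]
        discounted = Order(order.order_id, order.total_cents * (100 - percent) // 100)
        self._orders[order_id] = discounted
        return discounted


def discount_handler(service: OrderService, request: dict) -> tuple[int, dict]:
    """Web-layer adapter: turns a (hypothetical) request dict into a service call."""
    try:
        order = service.apply_discount(int(request["order_id"]), int(request["percent"]))
    except (KeyError, ValueError):
        return 400, {"error": "bad request"}
    return 200, {"order_id": order.order_id, "total_cents": order.total_cents}
```

Swapping the web framework then means rewriting `discount_handler`, while `OrderService` and its tests stay untouched.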

BlackFly 8 January 2025
> Because it protects you from nasty surprises, it slows down the codebase’s progression into a mess, and it allows you to take advantage of future improvements.

The codebase is already a nasty surprise for people coming in from the outside with experience or people that are aware of current best practices or outside cultures, therefore, the codebase is already a mess and you cannot take advantage of future improvements without a big bang since that would be inconsistent.

How to keep your code evolving in time and constantly feeling like it is something you want to maintain and add features to is difficult. But constantly rewriting the world when you discover a newer slight improvement will grind your development to a halt quickly. Never implementing that slight improvement incrementally will also slowly rot your feelings and your desire to maintain the code. Absolute consistency is the opposite of evolution: never allowing experimentation; no failed experiments mean no successes either. Sure, too much experimentation is equally disastrous, but abstinence is the other extreme and is not moderation.

PaulDavisThe1st 7 January 2025
> You can’t practice it beforehand (no, open source does not give you the same experience).

This is ridiculous. Even if you want to ignore the kernel, there are plenty of "large established codebases" in the open source world that are at least 20 years old. Firefox, various *office projects, hell, even my own cross-platform DAW Ardour is now 25 years old and is represented by 1.3M lines of code at this point in time.

You absolutely can practice it on open source. What you can't practice dealing with is the corporate BS that will typically surround such codebases. Which is not to say that the large established codebases in the open source world are BS free, but it's different BS.

lifeisstillgood 8 January 2025
All these are part of a different mistake - lack of common culture among the coders. It’s really really hard, and often antithetical to the politics of the org, but having a common area (email, actual physical meetings) between the “tech leads” - so that’s one in 8 to one in twenty devs, is vital.

Sharing code, ideas, good and bad etc is possible - but it requires deliberate effort

perdomon 12 January 2025
https://jimmyhmiller.github.io/ugliest-beautiful-codebase

Here's an alternate take on how a lack of consistency and a tendency to build microservices can make for an enjoyable work environment (and an admittedly ugly codebase).

Jean-Papoulos 8 January 2025
I'm all for consistency. But imagine having a codebase where most operations point back to an in-memory dataset representation of a database table, and every time you change your position in this dataset (the only "correct" way of accessing that table's data), it updates the UI accordingly.

New feature where you compare 2 records? Too bad, the UI is going to show them both then go back to the first one in an epileptic spasm.

Sometimes things are just bad enough that keeping it consistent would mean producing things that will make clients call saying it's a bug. "No sorry, it's a feature actually".

dang 8 January 2025
Small prior discussion:

Mistakes engineers make in large established codebases - https://news.ycombinator.com/item?id=42570490 - Jan 2025 (3 comments)

quick_brown_fox 8 January 2025
I mostly agree; however, I experienced a different challenge precisely because of consistency:

I used to work within the Chromium codebase (on the order of tens of millions of LOC) and the parts I worked in were generally in line with Google's style guide, i.e. consistent and of decent quality. The challenge was to identify legacy patterns that shouldn't be imitated or cargo-culted for the sake of consistency.

In practice that meant having an up to date knowledge of coding standards in order to not perpetuate anti-patterns in the name of consistency.

daitangio 8 January 2025
I agree that consistency is important, and also this is the real problem. There is no perfect architecture. Needs evolve. So consistency is a force, but architecture evolution (pushed by new features, for example) is an opposite force.

Balancing the two is not easy, and often if you do not have time, you are forced to drop your strong principles.

Let me give a simple example.

Imagine a Struts2 GUI. One day your boss asks you to upgrade it to fancy AJAX. It is possible, for sure, but it can require a lot of effort, and finding the right solution is not easy.

gwbas1c 7 January 2025
Unit tests, exhaustive regression tests, and automated tests are the best way to prevent regressions.

Time spent writing good unit tests today allows you to make riskier changes tomorrow; good unit tests de-risk refactors.

cess11 8 January 2025
Very good advice. An implication of being reluctant to introduce dependencies is that you should remove dependencies if you can. Perhaps different parts of the system are using different PDF-generation libraries, or some clever person introduced Drools at some point but you might as well convert those rules to plain old Java.

Tooling is important too. IDEs are great, but one should also use standalone static analysis, grepping tools like ripgrep and ast-grep, robust deterministic code generation, things like that.

riwsky 8 January 2025
Sounds like common law—one of the biggest, oldest "codebases" there is.

To quote Wikipedia:

> Common law is deeply rooted in stare decisis ("to stand by things decided"), where courts follow precedents established by previous decisions.[5] When a similar case has been resolved, courts typically align their reasoning with the precedent set in that decision.[5] However, in a "case of first impression" with no precedent or clear legislative guidance, judges are empowered to resolve the issue and establish new precedent.

z3t4 8 January 2025
If you have no idea what the usage is you are already doomed. However, tests help: whenever there is a regression you should write a test so that the same thing won't regress again.

Writing code is not like in real life where herd mentality usually saves your life. Go ahead and improve the code, what helps is tests... but also at least logging errors, and throwing errors. Tests and errors go hand in hand. Errors are not your enemy, errors help you improve the program.
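A small Python sketch of how those pieces could fit together: the function logs and raises on bad input instead of guessing, and a regression test pins a previously broken case. The function, the logger name, and the "1,000" bug are all hypothetical.

```python
import logging

logger = logging.getLogger("orders")


def parse_quantity(raw: str) -> int:
    """Parse a quantity field, logging and raising on bad input instead of guessing."""
    text = raw.strip().replace(",", "")
    try:
        quantity = int(text)
    except ValueError:
        logger.error("invalid quantity %r", raw)
        raise
    if quantity < 0:
        logger.error("negative quantity %r", raw)
        raise ValueError("quantity must be non-negative")
    return quantity


def test_quantity_with_thousands_separator():
    # Regression test for a hypothetical past bug where "1,000" was rejected.
    assert parse_quantity("1,000") == 1000
```

Once the test exists, the same input cannot silently break again, and any new failure shows up both in the error log and as an exception.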

rf15 8 January 2025
> Single-digit million lines of code (~5M, let’s say)

as someone working on a 60M codebase, we have very different understandings of the word "large". My team is leaning more towards "understand the existing code, but also try to write maintainable and readable code". Everything looks like a mess built by a thousand different minds, some of them better and a lot of them worse, so keeping consistency would just drag the project deeper into hell.

hoten 7 January 2025
I love how the first example is "use the common interfaces for new code". If only! That assumes there _is_ a common interface for doing a common task, and things aren't just a copy-paste of similar code and tweaked to fit the use case.

So the only tweak I'd make here, is that if you are tempted to copy a bit of code that is already in 100 places, but with maybe 1% of a change - please, for the love of god, make a common function and parameterize out the differences. Pick a dozen or so instances throughout the codebase and replace it with your new function, validating the abstraction. So begins the slow work of improving an old code base created by undisciplined hands.
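
A minimal Python sketch of that "common function, parameterize the differences" step. Every name here (`ReportOptions`, `render_rows`, the two wrappers) is hypothetical and only stands in for whatever the copy-pasted code actually does; the point is that the 1% difference becomes an explicit option, and old call sites shrink to thin wrappers that can be migrated a few at a time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportOptions:
    """The small set of differences between the formerly copy-pasted variants."""
    separator: str = ","
    include_header: bool = True


def render_rows(rows: list[dict[str, object]],
                options: ReportOptions = ReportOptions()) -> str:
    """Shared replacement for the near-identical copies (hypothetical example)."""
    if not rows:
        return ""
    columns = list(rows[0].keys())
    lines = []
    if options.include_header:
        lines.append(options.separator.join(columns))
    for row in rows:
        lines.append(options.separator.join(str(row[c]) for c in columns))
    return "\n".join(lines)


# Former call sites become thin wrappers around the shared function.
def render_csv(rows: list[dict[str, object]]) -> str:
    return render_rows(rows)


def render_tsv_without_header(rows: list[dict[str, object]]) -> str:
    return render_rows(rows, ReportOptions(separator="\t", include_header=False))
```

Replacing a dozen real call sites with wrappers like these is what validates the abstraction: if a call site needs an option the dataclass does not have, that is a sign the shared function is not general enough yet.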

Oh, and make sure you have regression tests. The stupider the better. For a given input, snapshot the output. If that changes, audit the change. If the program only has user input, consider capturing it and playing it back, and if the program has no data as output, consider snapshotting the frames that have been rendered.
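
The "for a given input, snapshot the output" idea can be sketched in a few lines of Python. This is an illustration under assumptions, not any particular snapshot-testing library: `check_snapshot` and `legacy_price_summary` are hypothetical names, and the snapshot is just a text file on disk.

```python
from pathlib import Path


def check_snapshot(name: str, actual: str, snapshot_dir: Path) -> bool:
    """Return True if `actual` matches the stored snapshot.

    On the first run there is nothing to compare against, so the output is
    recorded and the check passes. Afterwards, any difference returns False,
    which is the cue to audit the change by hand.
    """
    snapshot_dir.mkdir(parents=True, exist_ok=True)
    snapshot_file = snapshot_dir / f"{name}.snap"
    if not snapshot_file.exists():
        snapshot_file.write_text(actual, encoding="utf-8")
        return True
    return snapshot_file.read_text(encoding="utf-8") == actual


def legacy_price_summary(prices: list[float]) -> str:
    """Hypothetical stand-in for old code whose exact output we want to pin down."""
    return f"count={len(prices)} total={sum(prices):.2f}"
```

In a test, one would call `check_snapshot("summary", legacy_price_summary([...]), Path("snapshots"))` and fail on `False`. Accepting an intended change is then just deleting the `.snap` file and re-running.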

_fizz_buzz_ 8 January 2025
> no, open source does not give you the same experience

Why not? There are open source projects that are many years old with millions of lines of code and many developers.

edanm 8 January 2025
Fantastic article.

One small point - consistency is a pretty good rule in small codebases too, for similar reasons. Less critical, maybe, but if your small codebase has a standard way of handling e.g. Auth, then you don't want to implement auth differently, for similar reasons (unified testing, there might be specialized code in the auth that handles edge cases you're not aware of, etc.)

lr4444lr 7 January 2025
> Single-digit million lines of code (~5M, let’s say)

> Somewhere between 100 and 1000 engineers working on the same codebase

> The first working version of the codebase is at least ten years old

All of these things, or any of them?

In any event, though I agree with him about the importance of consistency, I think he's way off base about why and where it's important. You might as well never improve anything with the mentality he brings here.

gwbas1c 7 January 2025
One thing I did was implement a code formatter, and enforce it in CI.

"dotnet format" can do wonders, and solved most serious inconsistency issues.

ozim 8 January 2025
The problem with consistency is that people miss the forest for the trees.

So lots of nitpicking on irrelevant stuff - keep files under 50 lines - that is silly consistency of little minds.

Fortunately, the author of the post writes from experience, with architectural examples, so I can say that it is a good article.

tonymet 8 January 2025
Or you can help migrate them to Karenina Microservices: each service can be dysfunctional in its own way.
ReflectedImage 8 January 2025
"as a general rule, large established codebases produce 90% of the value."

This is only until your new upstart competitor comes along, rewrites your codebase from scratch and runs you out of the market with higher development velocity (more features).

roxyrox 8 January 2025
Have you ever tried Sourcegraph to handle such consistency issues? (We are trying to do the same at Anyshift for Terraform code.) For me, the issue will only be exacerbated by gen AI and the era of "big code" that's ahead.
mrkeen 7 January 2025
There was only one mistake that the article felt like giving a header to: "The cardinal mistake is inconsistency"

The instinct to keep doing things the wrong way because they were done the wrong way previously is strong enough across the industry without this article.

I love to

> take advantage of future improvements.

However, newer and better ways of doing things are almost invariably inconsistent with the established way of doing things. They are dutifully rejected during code review.

My current example of me being inconsistent with our current, large, established codebase:

Every "unit test" we have hits an actual database (just like https://youtu.be/G08FxxwPjXE?t=2238). And I'm not having it. For the module I'm currently writing, I'm sticking the reads behind a goddamn interface so that I can have actual unit tests that will run without me spinning up and waiting for a database.
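
As a rough illustration of "sticking the reads behind an interface", here is a small Python sketch. The comment does not say which language or schema is involved, so `UserReader`, `InMemoryUserReader`, and `greeting_for` are all hypothetical; the production implementation would wrap the real database, while the test double below needs none.

```python
from typing import Optional, Protocol


class UserReader(Protocol):
    """The only read the module needs (hypothetical name and shape)."""

    def get_email(self, user_id: int) -> Optional[str]: ...


class InMemoryUserReader:
    """Test double: satisfies UserReader without touching a database."""

    def __init__(self, emails: dict[int, str]) -> None:
        self._emails = emails

    def get_email(self, user_id: int) -> Optional[str]:
        return self._emails.get(user_id)


def greeting_for(user_id: int, reader: UserReader) -> str:
    """Module logic under test; it depends on the interface, not a connection."""
    email = reader.get_email(user_id)
    if email is None:
        return "Hello, guest"
    return f"Hello, {email.split('@')[0]}"
```

A unit test can then construct `InMemoryUserReader({1: "ada@example.com"})` and call `greeting_for` directly, which runs in milliseconds and needs no database to be spun up.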

trgn 7 January 2025
Great article, has much wisdom. If you're contributing to a large code base, you need to be knee deep. It's against your developer instinct, but it's constant pruning and polishing. It is the only way.
sema4hacker 8 January 2025
I once did some contract work for a development group at Apple in the late 90's working on a product not yet released. It was the first time I was exposed to a large codebase that could be updated by any of a large number of programmers at any time. While investigating a bug, it was scary to constantly see large, complicated routines headed with one-line comments from 20 or 30 people logging the changes they had made. There would be no consistent style of code, no consistent quality, no consistent formatting, no real sense of ownership, a real free-for-all. The system not only immediately projected a sense of hopelessness, but also indicated that any attempts at improvement would quickly be clobbered by future sloppy changes.
kazinator 8 January 2025
This is reminiscent of the stupid things people have tried to fix email.

If I just do this simple thing in my mail client. ... or server ... mail security and spam and whatever else will be solved.

kbruce 9 January 2025
One common mistake: not playing Factorio and understanding how to scale, maintain, and debug a large factory with friends >>

(this is a half-joke... iykyk)

jheriko 8 January 2025
there is a balance to be had here. oftentimes people make their own corner of the code because they are afraid, or their superiors are, of the scope of work which is actually about 3 hours of consistent work with good discipline and not the 17 years they imagine.

millions of lines of code itself is a code smell. some of the absolute worst code i have to work with comes from industry standard crapware that is just filled with lots of do nothing bug factories. you gotta get rid of them if you want to make it more stable and more reliable.

however... i often see the problem, and it's not "don't do this obvious strategy to improve qol", it's "don't use that bullshit you read a HN article about last week"

i suspect this is one of those.

oglop 8 January 2025
Coders aren’t engineers, they are just bit bureaucrats. Everything in this post isn’t engineering, it’s autism.
dalton_zk 8 January 2025
Interesting point of view
fedeb95 8 January 2025
this should be tattooed on every newly hired CTO's arm.
hnthrow90348765 7 January 2025
To be fair, the previous engineers got paid to write the legacy mess and were employed for a long time if there's a lot of it.

Where is the incentive to go the extra mile here? Do you eventually put up with enough legacy mess, pay your dues, then graduate to the clean and modern code bases? Because I don't see a compelling reason you should accept a job or stay in a code base that's a legacy mess and take on this extra burden.

t43562 8 January 2025
Big codebases develop badly/well because of the established company culture.

Culture tends to flow from the top. If it's very expedient at the top then the attitude to code will be too.

You get stuck in the "can't do anything better because I cannot upgrade from g++-4.3 because there's no time or money to fix anything, we just work on features" situation. Work grinds to a near halt because of the difficulties imposed by tech debt. The people above don't care because they feel they're flogging a nearly-dead horse anyhow or they're just inappropriately secure about its market position. Your efforts to do more than minor improvements are going to be a waste.

Even in permissive environments one has to be practical - it's better to have a small improvement that is applied consistently everywhere than a big one which affects only one corner. It has to materially help more than just you personally otherwise it's a pain in the backside for others to understand and work with when they come to do so. IMO this is where you need some level of consensus - perhaps not rigid convention following but at least getting other people who will support you. 2 people are wildly more powerful and convincing than 1.

The senior programmers are both good and bad - they do know more than you and they're not always wrong and yet if you're proposing some huge change then you very likely haven't thought it out fully. You probably know how great it is in one situation but not what all the other implications are. Perhaps nobody does. The compiler upgrade is fine except that on windows it will force the retirement of win2k as a supported platform .... and you have no idea if there's that 1 customer that pays millions of dollars to have support on that ancient platform. So INFORMATION is your friend in this case and you need to have it to convince people. In the Internet world I suppose the equivalent question is about IE5 support or whatever.

You have to introduce ideas gradually so people can get used to them and perhaps even accept defeat for a while until people have thought more about it.

It does happen that people eventually forget who an idea came from and you need to resist the urge to remind them it was you. This almost never does you a favour. It's sad but it reduces the political threat that they feel from you and lets them accept it. One has to remember that the idea might not ultimately have come from you either - you might have read it in a book perhaps or online.

At the end, if your idea cannot be applied in some case or people try to use it and have trouble, are you going to help them out of the problem? This is another issue. Once you introduce change be prepared to support it.

In other words, I have no good answers - I've really revolutionised an aspect of one big system (an operating system) which promptly got cancelled after we built the final batch of products on it :-D. In other cases I've only been able to make improvements here and there, in areas where others didn't care too much.

The culture from the top has a huge influence that you cannot really counter fully - only within your own team sometimes or your own department and you have to be very careful about confronting it head on.

So this is why startups work of course - because they allow change to happen :-)

tsarchitect 8 January 2025
OP has identified a universal norm: the "Law of Large Established Codebases (LLEC)" states that codebases with "Single-digit million lines of code, Somewhere between 100 and 1000 engineers, first working version of the codebase is at least ten years old" tend to naturally dissipate, increasing the entropy of the system, with inconsistency being one of its characteristics.

OP also states that in order to 'successfully' split a LEC you need to first understand it. He doesn't define what 'understanding the codebase' means but if you're 'fluent' enough you can be successful. My team is very fluent in successfully deploying our microfrontend without 'understanding' the monstrolith of the application.

I would even go further and make the law a bit more general: any codebase will be both in a consistent and inconsistent state. If you use a framework, library, or go vanilla, the consistency would be the boilerplate, autogenerated code, and conventional patterns of the framework/library/programming language. But inconsistency naturally crops up because not all libraries follow the same patterns, not all devs understand the conventional patterns, and frameworks don't cover all use cases (entropy increases after all). Point being, being consistent is how we 'fight' against entropy, and inconsistency is a manifestation of increasing entropy. But there is nothing that states that all 'consistent' methods are the same, just that consistency exists and can be identified, but not that the identified consistency is the same 'consistency'. And taking a snapshot of the whole, you will always find consistent & inconsistent coexisting.