
Upvoted to spread the immense wisdom in this post. But golly I must say the line can get blurry quickly.

> Would anything of value be lost if this or that chunk of it was removed?

In early stage projects I’ve seen this mentality backfire occasionally because it’s tough to estimate future value, especially for code and data.

For example, one time in a greenfield project I created the initial SQL schema, which had some extra metadata columns, essentially storing tags for posts. The next week, a more senior engineer removed all those columns and associated code, citing the YAGNI principle ("you aren't gonna need it"). He was technically correct: there was no requirement for it on our roadmap yet.
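
To give a concrete flavor, it was something along these lines (a hypothetical sketch, not the actual schema; Postgres flavor):

    -- hypothetical sketch, not the actual schema (Postgres flavor):
    -- a posts table with a couple of speculative metadata columns
    CREATE TABLE posts (
        id         BIGSERIAL PRIMARY KEY,
        body       TEXT NOT NULL,
        created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
        -- the "extra" columns that were later removed as YAGNI
        tags       TEXT[],  -- e.g. '{news,launch}'
        metadata   JSONB    -- free-form per-post metadata
    );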

But the original work had taken me maybe an hour. And the cost of keeping the data around was approximately zero. It seemed he didn’t consider that.

Guess who needed the columns to build a feature a year later? Yeah, me. So I found myself repeating the work, with the added overhead of prod DB migrations etc. now that the product had users.

I guess my point is, sometimes it’s also wise to consider the opposite:

Would anything of value be gained if this or that chunk of it was removed?

In this article the evidence is clearly affirmative, but in my case, well, it wasn't so clear-cut.



I'm sympathetic to your situation, but it's possible that the senior was still right to remove it at the time, even if you were right that the product would eventually need it.

If I recall correctly they have a measure at SpaceX that captures this idea: the ratio of features added back a second time to total features removed. If every removed feature was added back, i.e. 100% 'feature recidivism' (if you grant some wordsmithing liberty), then obviously you're cutting features too often. 70% is too much, even 30%. But critically, 0% feature recidivism is bad too, because it means you're not trying hard enough to remove unneeded features and you will accumulate bloat as a result. I guess you'd want this ratio to run higher early in a product's lifecycle and eventually asymptote down to a low non-zero percentage as the product matures.

From this perspective the exact set of features required to make the best product is unknown in the present, so it's fine to take a stochastic approach to removing features to make sure you cut the unneeded ones. And if you need to add some back, that's fine, but it shouldn't cast doubt on the decision to remove them in the first place unless it's happening too often.

Alternatively you could spend 6 months in meetings agonizing over hypotheticals and endlessly quibbling over proxies for unarticulated priors instead of just trying both in the real world and seeing what happens...


At one time Musk stated that SpaceX data suggested that needing to add back 15% of what was removed was a useful target. He suggested that some of the biggest problems came from failure to keep requirements simple enough, because smart people kept adding them and offered the most credible and seemingly well-reasoned bad ideas.


Thanks, I couldn't remember the exact number. 15% seems like a reasonable R&D overhead to pay to reduce inefficiencies in the product. But I suspect the optimal number changes depending on the product's lifecycle stage.


He also made it mandatory that all requirements had to be tied to a name. So there was no question as to why something was there.


How would you measure / calculate something like that? It seems like adding some amount back, but not too much, is the right place to be, but putting a number on it is just arrogance.


Either accepting or dismissing the number without understanding its purpose or source can also be arrogance, but I agree that throwing a number out without any additional data is of limited, but not zero, usefulness.

When I want to know more about a number, I sometimes test the assumption that an order of magnitude more or less (1.5%, 150%) is well outside the bounds of usefulness, trying to get a sense of what range the number lives within.


No, dismissing something that is claimed without evidence is not arrogance, but common sense.


I think we're getting hung up on the concept of dismissing. To question skeptically, to ask if there is evidence or useful context, to seek to learn more: that is different from dismissing.

The 5 Step Design Process emphasizes making requirements "less dumb," deleting unnecessary parts or processes, simplifying and optimizing design, accelerating cycle time, and automating only when necessary.

Musk suggests that if you're not adding requirements back at least 10-15% of the time, you're not deleting enough initially. The percentage was initially an estimate based on experience, and for several years now has been based on estimates from manufacturing practice.


> How would you measure / calculate something like that?

SpaceX probably has stricter processes than your average IT shop, so it isn't hard to calculate stats like that. Then when you have the number, you tune it until you are happy, and that number becomes your target. They arrived at 15%. This process is no different from test coverage targets etc.; it's just a useful tool, not arrogance.
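
For example, if removals were tracked in something like a hypothetical removed_features table (one row per removed feature, re_added_at set if it later came back), the stat is one query away. A sketch in Postgres syntax:

    -- hypothetical tracking table: one row per removed feature,
    -- with re_added_at filled in if the feature later came back
    SELECT count(*) FILTER (WHERE re_added_at IS NOT NULL)::numeric
           / NULLIF(count(*), 0) AS recidivism_ratio
    FROM removed_features
    WHERE removed_at > now() - interval '1 year';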


I have no clue what you are saying. "They did it somehow"? How? Maybe they did not measure it at all and Elon just imagined it. How can we tell the difference?


Yeah, not sure how you'd measure this apart from asking people to tag feature re-adds. And all that will happen is that people will decide that something is actually a new feature rather than a re-add because the threshold has already been hit this quarter. Literally adding work for no benefit.


Maybe it's just down to the way the comment was written and it actually played out differently, but the only thing I'd be a bit miffed about is someone more senior just coming in and nuking everything because YAGNI, like the senior guy who approves a more junior engineer's code and then spends their time rewriting it all after the fact.

Taking that situation as read, the bare minimum I'd like from someone in a senior position is to:

a) invite the original committer to roll it back, providing the context (there isn't a requirement, ain't gonna need it, nobody asked for it, whatever). This might still create some tension, but nowhere near as much as having someone higher up the food chain take a fairly simple task into their own hands.

b) question why the team isn't on the same page on the requirements such that this code got merged and presumably deployed.

You don't have to be a micromanager to have your finger on the pulse of your team and the surrounding organisational context. And since part of being a senior is passing down knowledge to the more junior people on the team, there are easy teaching moments here.


There are a lot of cultural factors that could change my perspective here, but this is a reasonable criticism of the senior guy's approach.


The main issue with adding something you might need in the future is that people leave, people forget, and then, a year later, there's some metadata column but no one remembers whether it was ever used for something. Can we use it? Should we delete it? Someone remembers Knight Capital and the spectacular failure that came from reusing an old field. So it's always safer to keep the existing field, and then you end up with metadata and metadata_1. The next year no one remembers why there are two metadata fields and everyone is confused about which one to use.


Easy solution: Add a comment in your schema-creation SQL script explaining what the purpose of the column is. Or some other equivalent documentation. Stuff like that should be documented in any case.
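
In Postgres, for instance, the note can live on the column itself, so it shows up in tooling (names here are hypothetical):

    -- hypothetical example: attach the rationale to the column itself
    COMMENT ON COLUMN posts.metadata IS
        'Reserved for the planned tagging feature; not read by application code yet';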


So, every db column gets an "in_use" boolean assigned? It gets reset every year, set again on the first query that uses the column, and the column auto-purged after a year and a day. Self-pruning database...


This would break if you need something after two or three years. It happens.

My point is: it's relatively easy to tell if something is used. Usually a quick search will find a place where it's referenced. It's much harder to confirm with 100% certainty that some field is not used at all. You'd have to search through everything; column names can be constructed by concatenating strings, even in a strongly typed language there could be reflection, and there could be scripts manually run in special cases... It's a hard problem. So everyone leaves "unknown" fields alone.
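
A contrived Postgres example of the kind of thing that defeats a text search, since the column name never appears literally anywhere:

    -- contrived example: the string 'metadata' never appears whole,
    -- so grepping the codebase for the column name finds nothing
    DO $$
    DECLARE
        col text := 'meta' || 'data';
    BEGIN
        EXECUTE format('SELECT %I FROM posts LIMIT 1', col);
    END $$;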


Only three years? I have many clients who want 7-10 years of historical data at the tap of a button... which they rarely if ever use :)


In most cases though, anticipating requirements results in building things you don't need. And if you do need them, you usually end up needing a very different form of them. The worst codebase I have worked on is one that was designed with some complex future use in mind. In your example as well, the codebase only required the columns a year later. So I think removing all chunks of code that anticipate a future need sets the right precedent, even if you end up needing some of it eventually.


Thanks for this balanced (and IMO necessary) reminder.

It's all too easy to get caught either doing way too much defensive/speculative stuff, or way too little.

One person's "premature optimisation" is another person's "I've seen roughly this pattern before, I'll add the thing I wished I had last time", and AFAICT there's no _reliably_ helpful way of distinguishing which is correct.


Would you have remembered to write this comment if the fields had never been added back?

In the case you describe, there are three possible outcomes, in broad categories:

1) The fields do turn out to be useful, in exactly the way you implemented them first.

2) The feature is implemented, but using a different set of fields or implementation.

3) The feature is not implemented.

Even if we assign equal probability to all the options, creating them in the beginning only results in a win 1/3 of the time.

How much extra mental effort would have been spent making sure that all the other features implemented in the meantime worked correctly with the metadata columns if they had not been removed?

Of course, you turned out to be correct in this case, and that shows you certainly had excellent insight into and understanding of the project. But whether a decision was right or wrong should be judged on the information available at the time, not with full hindsight.


What was the DB migration required for those particular columns?

I presume this was not the only time such a migration needed to be done to add columns. Is it possible that as new features emerge, new columns and migrations will be needed anyway, and one migration more or less would make little difference in the grand scheme?


Well, in your case it wasn't clear cut, but YAGNI is still a good default approach; I'd suspect it was maybe even the right call here.

First of all, it wasn't guaranteed that this feature of yours would ever come. Even though in this case it did, you could probably add it back without too much effort; sure, maybe a bit more than otherwise, but on a large project it's hard to guess the future. And what if someone else had taken that task? They might not even have recognized why those columns were there and could have just reimplemented it anyway.

Also, a year is a long time, and who knows how many times it would have caused additional work and confusion.

> A: hey why this thing here, it doesn't do anything / never actually used?

> B: I dunno, Alex added it because 'one day we might need it' (eyeroll), will get very touchy if you try to remove it, and will explain how he/she can predict the future and how we will definitely need this feature.

> A: and this thing over there?

> B: Same... just move on, please, I can't keep re-discussing these things every month...

And if your team hadn't applied YAGNI, you would have 1 feature that was needed a year later, and probably around 20 that were never needed yet caused a maintenance burden for years down the road.

I dunno, YAGNI is one of the most valuable principles in software development, in my opinion.


How about writing it down instead of relying on tribal knowledge?


Writing what down? A bullet list of 30 items saying "we added something that currently doesn't do anything and isn't used, but in case you need something two years down the road, it might be close to what you need, so don't delete any of these unused items"? YAGNI is much simpler.


The thing is that such features can also become a burden quickly. E.g. people know it is not used, so nobody bothers to write tests or do the other things that would usually be done, and those omissions might come back to haunt you once it is used; boom, a ready source of catastrophic failure. Also, when you implement it later you can shape it without having to rework a ton of code.

Don't get me wrong, I understand your point here — and if the feature is something that you know for sure will be needed later on, adding it in from the start instead of bolting it on later is absolutely the smart choice. You wouldn't pour a concrete foundation for a building after you built two floors — especially if you know from the start it is going to be a skyscraper you are building.

In fact my experience with software development is that good foundations make everything easier. The question is just whether the thing we're talking about is truly part of the foundation or rather some ornament that can easily be added in later.


I get what you mean. However, other comments raised some valid points too. It's just a few hours of work. I think what matters a lot more is whether these kinds of anticipated features amount to more than a constant gain (in your case, setting aside the fact that this senior engineer rolled back your tactful additions, the gain would be the time it takes to run the prod DB migrations, since that is what differs between your early and late implementation). More often than not an architectural choice will impact n features orthogonally with respect to other potential features. If it takes you 2 days of work to implement that particular transversal feature, the time it will take you to implement it across all "base" features will be 2n days of work. Stuff like class inheritance will let you factor that down to, say, 2 days + 10n minutes.

I witnessed disasters because of situations like this, where stuff that would have taken me days (or even hours) to implement took 6 months. The main reason is that another team was tasked with doing it, and they didn't know that part of the code base well. Because everything had to be super "simple" ("easy" would fit better), with no class hierarchy, everything super flat, each base feature taking up thousands of lines across several microservices in the hope that anybody could take over, it was a slow, dreadful, soul-crushing job. Members of the previous team said it would have taken them 2 weeks (because they had long experience with the spaghetti dish). I re-implemented the program at home on weekends (it took 3 months to code): the required change would have taken me an hour to code, plus a few days of integration to change those n base features (but only because I had a leetcode-complex, very expressive architecture, and knew the domain very well). It took the new team 6 months. 6 months! And I think they only managed to implement one feature, not all of them.

Result: disgruntled senior employees quit and were replaced with juniors. Three months later, overtime was distributed across the tech department, killing the laid-back 4-day-week culture that had been put in place to attract talent; the employees unionized, more quit, and about a year later, after failing to hire back these productive elements, the COO was fired.


My takeaway here is if you hire domain experts, then trust them when they say you are going to need something.


The big difference here is something the user sees versus something only the developers see, as in your example. For users I think the answer is almost always that less is better. For developers too, but there's a bit more leeway.


> But the original work had taken me maybe an hour. And the cost of keeping the data around was approximately zero. It seemed he didn’t consider that.

I guess this is very illuminating: you have to predict the cost of a YAGNI addition before doing it. A low-cost YAGNI feature might actually serendipitously become useful.

I feel this is the same principle as random side projects, not done out of necessity but out of curiosity and/or exploration.


Wouldn't using a VCS help in this case? You could go back in time to before the column was removed and copy-paste the code you already wrote, maybe changing some things (as some things have likely changed since you wrote the code the first time).


No, because you still have to deal with the DB migration. Which, to be fair, should be fairly easy, since it's just adding empty columns.
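
Something like this, assuming Postgres and the hypothetical column names from upthread. Adding nullable columns with no default is a metadata-only change on modern Postgres, so it's cheap even on a big table:

    -- re-adding the removed columns; nullable and without defaults,
    -- so no table rewrite is needed on modern Postgres
    ALTER TABLE posts
        ADD COLUMN tags     TEXT[],
        ADD COLUMN metadata JSONB;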


In this case you were right, and the removal of the additional fields turned out to be a mistake. More often, though, I see this going the other way: you end up carrying around tables and fields that aren't needed and just muddle the picture. Even worse is "Well, we added this field, but we're not using it, so let's just stuff something else in there." It can get really ugly really fast.

The article is a little different, because yes, they do need the calculator: people should know what your service costs up front. If they want to do away with the calculator, then they should restructure their pricing plans so that it is no longer required. That's just not an IT problem.


I think when you have a mindset of removing the useless (which your columns were at the time), you have to be prepared to sometimes add things back in. Yes, it is painful, but it is not a signal that the process didn't work; having to add things back occasionally is expected.

We cannot perfectly predict the future, so when removing things there will always be false positives. The only way to get a false positive rate of zero is to never remove anything at all.

The real question is what level of false positives we find acceptable. If you can only cite this one case where it was painful, then I'd say that's evidence your colleague's approach worked very well!

(I think Elon Musk recommends a deletion false positive rate of 10% as appropriate in general, but it will vary by industry.)


> the original work had taken me maybe an hour. And the cost of keeping the data around was approximately zero. It seemed he didn’t consider that.

You're missing the time other developers will spend trying to figure out why that code is there. When investigating a bug, upgrading, or refactoring, people will encounter that code and need to spend time and mental energy figuring it out.

Recently, I've been modernizing a few projects to run in containers. This work involves reading a lot of code, refactoring, and fixing bugs. Dead code—either code that is no longer called or code that never was—is one of the things that most wastes my time and energy. Figuring out why it's there and wondering if changing or deleting it will affect anything is just tiresome.

Answering "Why" is usually the hardest question. It becomes especially challenging when the original developer is no longer with the team.



