For sure. One of the things I learned from the Lean folks was to look for inventory; it's one of the 7 Wastes. [1] In physical manufacturing, it's pretty obvious, because it's physical stuff sitting around on the journey to becoming actually useful.
With software it can be harder to notice because you don't have to make room for it. But in essence it's the same deal; it's anything we have paid to create that isn't yet delivering value to the people we serve. Plans, design docs, and any unshipped code.
There are a lot of reasons to avoid inventory, but a big one is that until something is in use, we haven't closed a feedback loop. Software inventory embodies untested hypotheses. E.g., a product manager thinks users will use X. A designer thinks Y will improve an interface for new users. A developer thinks Z will make for cleaner, faster code.
Both large pull requests and stacked pull requests increase inventory. In the case of incorrect hypotheses, they also increase rework. I could believe that for a well-performing team stacked PRs are better than equally-sized single big PRs, in that they could reduce inventory and cycle time. But like you, I think I'd rather just go for frequent, unstacked, small PRs.
I have stacked diffs sometimes when the rework is large. I want to make sure that I know the full story around the change I’m making, and stacking forces me to think about that upfront. What refactoring was needed? Was it actually needed? What new path do I carve out in the code, and how do features interplay? Broken tests with good coverage tell me if I made a foundational mistake. Even if I decide to throw away the work because it was a dead end (rare), the team will have learned something from me explaining what didn’t work out. More often, I’ll need to go back and clean up. But I save significant reviewer time by doing that before putting up random PRs one at a time that are not well understood. With the exception of very simple work, stacked reviews generally save significant time. You get reviews of the non-objectionable PRs. Coworkers can see a bigger picture, if that’s helpful for understanding the context of a change that’s still coming into shape. It actually reduces merge conflicts because, for example, you can land a refactor that everyone agrees needs to be done. Then your conflict space is smaller.
Small PRs don’t need it, of course, but complex features benefit from shaking things out earlier. Commits of more than 100 lines are really hard to review (lots of anecdotal evidence and empirical research). If you’re not reviewing small commit by small commit, the reviews easily miss things. A single PR that’s 800 lines adds review time to go commit by commit. If you can merge the non-objectionable stuff, the reviewee gets a sense of forward progress and fewer merge conflicts (e.g., someone lands a refactor before some simple change of yours, vs. your simple change landing first, which makes the conflict the problem of the person refactoring, where it belongs).
It sounds like you're talking about a bunch of different cases, and I'm having trouble untangling them.
If there's a simple refactoring everybody agrees is good whether or not your overall goal ends up making sense, then yes, by all means merge that. But that doesn't require stacking unless your review process is slow. In which case I still think the right solution is to speed up review, not to stack.
For the cases where we don't know the full story, my first thought is that we never know the full story. So there I try to instead find the smallest unit of work that everybody agrees is a step forward.
When that's not possible, where the unit of work still seems pretty large, instead of breaking that up into a bunch of stacked diffs that shouldn't be merged until we really understand something (which to me sounds like a large PR in disguise), I think a better option is a spike, where we intentionally do a quick, throwaway version of some change as a way of learning about the change. Instead of trying to do good code along the way to good understanding, we just go for the understanding. Once we have thrown out that scratch code, we then go back with our new knowledge for a proper PR.
The premise of stacked diffs seems to be that we won't learn anything significant from reviewing or deploying code. (If we did learn something valuable, then the things stacked on top could be up for a lot of rework or might be thrown out altogether.) I think that has a lot of bad effects, but one of the biggest for me is that the bigger the inventory of code (whether in one big PR or a stacked set of smaller ones), the more a reviewer will feel obliged to say, "LGTM" and let it go, because they know there's not much point in saying, "Actually, I think this whole thing could be better approached by X."
So like you I'm entirely for small, reviewable lumps that are easily merged. I just think they should be then reviewed quickly, so that stacking or agglomerating is unnecessary.
> unless your review process is slow. In which case I still think the right solution is to speed up review, not to stack.
This feels like a critically important point to your position but there's no actionable advice provided on how to achieve that. FWIW I've found stacked PRs do speed up reviews because stacking pipelines the work. Pipelining keeps bubbles (in this case, time spent waiting on review) from forming.
Simpler parts of the work get eyeballs from more junior engineers who feel more comfortable approving smaller / simpler PRs (& other people feel confident in that). Trickier stuff is left to the smaller pool of people who have the appropriate context / skillset. If you're waiting on landing 1 PR at a time, then you're serializing the review flow which means your total time on the PR is "time spent writing + time spent reviewing 100% of the code". If you pipeline your stacked diff, then potentially you could get ~80% of the code reviewed & landed by the time you finish the more complex pieces. Then you're left with "time spent writing + time spent reviewing 20% of the code". Additionally, by putting up the commits early, you're letting other people fit smaller reviews into their schedule more easily vs "here's a PR with 5000 lines of code changes" which is a monstrosity to review (i.e. quickly runs into mental fatigue issues / quality of the reviews can easily degrade, especially if commit hygiene wasn't practiced).
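To make that arithmetic concrete, here's a minimal sketch in Python. The function name and the hour figures are hypothetical, not measurements from any real team, and the model assumes early reviews never force rework on the diffs stacked above them, which is exactly the assumption being disputed in this thread.

```python
def total_hours(write_hours: float, review_hours: float, pct_reviewed_early: int) -> float:
    """Hours until all the code is landed.

    pct_reviewed_early is the percentage (0-100) of the review work that
    overlaps with writing time. 0 models one PR put up at the very end;
    80 models the "~80% reviewed & landed" pipelined case.
    Assumption: no early review invalidates later work (no rework term).
    """
    if not 0 <= pct_reviewed_early <= 100:
        raise ValueError("pct_reviewed_early must be between 0 and 100")
    # Only the review work that did not overlap with writing is left
    # on the critical path once the writing is finished.
    remaining_review = review_hours * (100 - pct_reviewed_early) / 100
    return write_hours + remaining_review


if __name__ == "__main__":
    WRITE, REVIEW = 10, 5  # hypothetical hours, for illustration only
    print(total_hours(WRITE, REVIEW, 0))   # serialized: 15.0
    print(total_hours(WRITE, REVIEW, 80))  # pipelined:  11.0
```

With these made-up numbers the pipelined case finishes 4 hours sooner. Adding a rework term for early feedback that invalidates stacked work would shrink or erase that gap.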
Have you actually tried stacked PRs with a proper review process & good commit hygiene? This is one of those "try it before you knock it" things.
Again, I think you're tangling together cases. And also tangling arguments. You can't argue from theoretical gains in an imagined optimal case and then say the only argument that matters is the empirical one.
If your point is that in some existing organizations stacked pull requests are better for a specific engineer's experience than doing all the related code in a big blob, I certainly believe you.
Similarly, there are manufacturing shops where reducing inventory in line with Lean approaches doesn't work out of the gate, because there are other organizational problems/constraints that have to be dealt with first. For example, you might need a large buffer of component X at stage Y of a manufacturing process because upstream quality issues mean that a smaller buffer would cause frequent stalls at stage Y. First you have to fix the upstream issue before you can cut stocks there.
So are stacked pull requests the optimal choice for some specific person on some specific occasion? Sure! I'll take your word that's the case for you. What I'm saying is that I think they're an indicator that there is some systemic problem that could be resolved so stacked pull requests and giant pull requests both become unnecessary.
> One of the things I learned from the Lean folks was to look for inventory; it's one of the 7 Wastes. [1] In physical manufacturing, it's pretty obvious, because it's physical stuff sitting around on the journey to becoming actually useful.
I guess this is not true anymore post covid outbreak? Pretty sure a lot of companies would kill to have inventory of their raw materials right now...
It is still true. The lean analysis of waste splits things into "necessary waste" and "pure waste". For a particular place and moment in time, there will be some waste that you can't remove without harming production significantly. That's necessary waste. The goal in the long term is to reduce total waste by finding ways to make some bit of necessary waste unnecessary.
It's true that pandemic supply chain issues have changed the level of necessary waste in a lot of supply chains. But that doesn't make inventory good. Often production halts not because everything is missing, but because of a shortfall of just one input. A company might mistakenly react by stocking up on everything, but that still won't solve the shortfall of the critical component.
So should they just stock up on the critical component? Go get a year's backlog of that? If everybody does that, that will cause a shortfall all on its own, as when everybody did panic buying of toilet paper in 2020. And then when supply chains straighten out, the stockpile is back to being unnecessary waste. So I don't think there are any simple answers there.
[1] e.g. https://kanbanize.com/lean-management/value-waste/7-wastes-o...