
I've been doing software of all kinds for a long long time. I've never, ever, been in a position where I was concerned about the speed of my package manager.

Compiler, yes. Linker, sure. Package downloader? No.



When you're in a big company, the problem really starts showing. A service can have 100+ dependencies, many in private repos, and once you start adding and modifying dependencies, the package manager has to resolve versions across all of them to build the lock file, which can be really slow.

Cloud dev environments can also take several minutes to set up.


Many of these package managers get invoked countless times per day (e.g., in CI to prepare an environment and run tests, while spinning up new dev/AI agent environments, etc).


Is the package manager a significant amount of time compared to setting up containers, running tests etc? (Genuine question, I’m on holiday and can’t look up real stats for myself right now)


Anecdotally, unless I'm doing something really dumb in my Dockerfile (recently I found a recursive `chown` that was taking 20m+ to finish, grr), installing dependencies is the longest step of the build. It's also the most failure prone (due to transient network issues).


Yes, but if your CI isn't terrible, you have the dependencies cached, so that subsequent runs are almost instant, and more importantly, you don't have a hard dependency on a third-party service.

The reason for speeding up bundler isn't CI, it's newcomer experience. `bundle install` is the overwhelming majority of the duration of `rails new`.


> Yes, but if your CI isn't terrible, you have the dependencies cached, so that subsequent runs are almost instant, and more importantly, you don't have a hard dependency on a third-party service.

I’d wager the majority of CI setups fit your bill of “terrible”. No provider offers OOTB caching in my experience, and I’ve worked with multiple in-house providers as well as Jenkins, TeamCity, GHA, and Buildkite.


GHA with the `setup-ruby` action will cache gems.
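
For illustration, a minimal workflow sketch (the Ruby version and test command are placeholders; `bundler-cache: true` is what runs `bundle install` and caches the gems, keyed on Gemfile.lock):

    jobs:
      test:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          - uses: ruby/setup-ruby@v1
            with:
              ruby-version: '3.3'       # placeholder version
              bundler-cache: true       # install and cache gems
          - run: bundle exec rake test  # placeholder test command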

Buildkite can be used in tons of different ways, but it's common to use it with docker and build a docker image with a layer dedicated to the gems (e.g. COPY Gemfile Gemfile.lock; RUN bundle install), effectively caching dependencies.
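
Roughly, that layer-caching pattern looks like this (a sketch only; the base image and app layout are illustrative):

    FROM ruby:3.3-slim
    WORKDIR /app
    # Copy only the dependency manifests first, so this layer is reused
    # as long as Gemfile/Gemfile.lock don't change.
    COPY Gemfile Gemfile.lock ./
    RUN bundle install
    # Copying the rest of the app afterwards means code changes
    # don't invalidate the cached 'bundle install' layer.
    COPY . .
    CMD ["bundle", "exec", "rails", "server"]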


> GHA with the `setup-ruby` action will cache gems.

Caching is a great word - it only means what we want it to mean. My experience with GHA default caches is that it’s absolutely dog slow.

> Buildkite can be used in tons of different ways, but it's common to use it with docker and build a docker image with a layer dedicated to the gems (e.g. COPY Gemfile Gemfile.lock; RUN bundle install), effectively caching dependencies.

The only way Docker caching works is if you have a persistent host, and that’s certainly not most setups. It can be done, but if you have that, running in Docker doesn’t gain you much at all; you’d see the same caching speedup if you just ran it on the host machine directly.


> My experience with GHA default caches is that it’s absolutely dog slow.

GHA is definitely far from the best, but it works, e.g. 1.4 seconds to restore 27 dependencies: https://github.com/redis-rb/redis-client/actions/runs/205191...

> The only way docker caching works is if you have a persistent host.

You can pull the cache when the build host spawns, but yes, if you want to build efficiently, you can't use ephemeral builders.

But overall that discussion isn't very interesting because Buildkite is more a kit to build a CI than a CI, so it's on you to figure out caching.

So I'll just reiterate my main point: a CI system must provide a workable caching mechanism if it wants to be both snappy and reliable.

I've worked for over a decade on one of the biggest Rails applications in existence, and restoring the 800ish gems from cache was a matter of a handful of seconds. And when rubygems.org had to yank a critical gem for copyright reasons [0], we continued building and shipping without disruption while other companies with bad CIs were all sitting ducks for multiple days.

[0] https://github.com/rails/marcel/issues/23


> So I'll just reiterate my main point: a CI system must provide a workable caching mechanism if it wants to be both snappy and reliable.

The problem is that none of the providers really do this out of the box. GHA kind of does it, but unless you run the runners yourself you’re still pulling it from somewhere remote.

> I've worked for over a decade on one of the biggest Rails application in existence, and restoring the 800ish gems from cache was a matter of a handful of seconds.

I kind of suspected - the vast majority of orgs don’t have a team of people who can run this kind of system. Most places with 10-20 devs (which was roughly the size of the team that ran the builds at our last org) have some sort of script running on cheap-as-hell runners, and they’re not running mirrors or baking base images on dependency changes.


> none of the providers really do this out of the box

CircleCI does. And I'm sure many others.


> My experience with GHA default caches is that it’s absolutely dog slow.

For reference, oven-sh/setup-bun opted to install dependencies from scratch over using GHA caching since the latter was somehow slower.

https://github.com/oven-sh/setup-bun/issues/14#issuecomment-...


This is what I came to say. We pre-cache dependencies into an approved baseline image. And we cache approved and scanned dependencies locally with Nexus and Lifecycle.


There is no situation where toolchain or workflow improvements should be scoffed at.


It can be harder to justify in private tooling, where you might only have a few dozen or a few hundred devs saving those seconds on each invocation.

But in public tooling, where the benefit is across tens of thousands or more? It's basically always worth it.


Obviously effort vs. reward comes in here, but if you have 20 devs and you save 5 seconds per run, you potentially save a context switch on every tool invocation.


This is true, but I think the other side of it is that in most shops there is lower hanging fruit than 5 seconds per tool run, especially if it's not the tool that's in the build / debug / test loop but rather the workspace setup / packaging / lockfile management tool.

Like, I switched my team's docker builds to Depot and we immediately halved our CI costs and shed like 60% of the build time, because it's a persistent worker node that doesn't have to re-download everything every time. I have no association with them, just a happy customer; I'm only mentioning it to illustrate how many more gains are typically on the table before a few seconds here and there are the next thing to seriously put effort into.


Must be nice not to use Python!


I agree for my own projects, where the code and libraries are not enormous. The speed and size gains aren’t going to matter enough.


try conda

took an hour to install 30 dependencies



