maccard's comments | Hacker News

> All encryption is end-to-end, if you’re not picky about the ends.

This is a great quote.


> The real annoying thing about Opus 4.5 is that it's impossible to tell most people "Opus 4.5 is an order of magnitude better than coding LLMs released just months before it" without sounding like an AI hype booster clickbaiting, but it's the counterintuitive truth. To my continual personal frustration.

The problem is that these increases in model performance are like the boy who cried wolf. There's only so many times you can say "this model is so much better, and does X/Y/Z more/less" and have it _still_ not be good enough for general use.


I’m far more in the anti-AI camp than pro-LLM, but I gave Claude the HTML of our Jira ticket and told it we had a Jenkins pipeline from which we wanted to update specific fields on the ticket using Python. Claude correctly figured out how we were calling Python scripts from Jenkins, grabbed a library, and one-shotted the solution in about 45 seconds. I then asked it to add a post step to the pipeline to do something else, which it did, and managed to get it perfectly right.
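
For flavour, the shape of the script (I've swapped in raw requests calls and made up the field ID, env var names, and issue key - the real one used a Jira library Claude picked):

    import os
    import requests

    # Credentials come from the environment, mirroring how our other
    # Jenkins-invoked scripts read theirs.
    JIRA_URL = os.environ["JIRA_URL"]  # e.g. https://jira.example.com
    AUTH = (os.environ["JIRA_USER"], os.environ["JIRA_TOKEN"])

    def update_field(issue_key: str, field_id: str, value: str) -> None:
        """Set a single field on a Jira issue via the REST API."""
        resp = requests.put(
            f"{JIRA_URL}/rest/api/2/issue/{issue_key}",
            json={"fields": {field_id: value}},
            auth=AUTH,
            timeout=30,
        )
        resp.raise_for_status()

    if __name__ == "__main__":
        # customfield_12345 stands in for the real field ID
        update_field("PROJ-123", "customfield_12345", os.environ["BUILD_URL"])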

It would probably have been 2-3 hours of screwing around figuring out issue fields, Python libraries, etc. - low-priority work for my team, but it was causing issues for another team who were struggling with some missing information. We never would have tasked this out, written a ticket for it, and prioritised it in normal development, but this way it just got done.

I’ve had this experience about 20 times this year for various “little” things that are attention sinks but not hard work - that’s actually quite valuable to us.


> It would probably have been 2-3 hours of screwing around figuring out issue fields

How do you know AI did the right thing then? Why would this take you 2-3 hours? If you’re using AI to speed up your understanding, that makes sense - I do that all the time and find it enormously useful.

But it sounds like you’re letting AI do the thinking and just checking the final result. This is fine for throwaway work, but if you have to put your name behind it that’s pretty risky, since you don’t actually understand why AI did what it did.


> How do you know AI did the right thing then?

Because I tested it, and I read the code. It was only like 40 lines of Python.

> Why would this take you 2-3 hours?

It's multiple systems that I am a _user_ of, not a professional developer of. I know how to use Jira, but I can't tell you offhand how to update specific fields using Python - and then repeat that for Jenkins, Perforce, and Slack. Getting credentials in is another thing (Claude saw how credentials were being read in other scripts and mirrored that).

> This is fine for throwaway work, but if you have to put your name behind it that’s pretty risky, since you don’t actually understand why AI did what it did.

As I said above, it's 30 lines of code. I did put my name behind it; it's been running on our codebase on every single check-in for 6 months and has failed 0 times in that time (we have a separate report, reviewed in a weekly meeting, for issues missed by this process). Again, this isn't some massive complicated system - it's just gluing together 3-4 APIs in a tiny script, in 1/10 of the time it would have taken me. Worst-case scenario, it does exactly what we had before - nothing.


Hah, even the concept of putting your name behind something is so great. It's kind of the ultimate protest against LLMs and social media, isn't it?

I've used it for minor shit like that, but then I go back and look at the code it wrote with all its stupid meandering comments and I realize half the code is like this:

const somecolor = '#ff2222'; /* oh wait, the user asked for it to be yellow. Let's change the code below to increase the green and red */

/* hold on, I made somecolor a const. I should either rewrite it as a var or wait, even better maybe a scoped variable! */

hah. Sorry, I'm just making this shit up, but okay. I don't hire coders because I just write it myself. If I did, I would assign them all kinds of annoying small projects. But how the fuck would I deal with it if they were this bad?

If it did save me time, would I want that going into my codebase?


I've not found it to be bad for smaller things, but I've found that once you start iterating, it quickly devolves into absolute nonsense like what you described.

> If it did save me time, would I want that going into my codebase?

Depends - and that's the judgement call. I've managed outsourcers in the pre-LLM days who, if you leave them unattended, will spew out unimaginable amounts of pure and utter garbage - just as bad as looping an AI agent on "that's great, please make it more verbose and add more design patterns". I don't use it for anything I don't want to, but for the many things that just require writing some code that's in the way of the problem you actually want to solve, it's been a boon for me.


> Which goes to show that Ctrl-C profiling is often enough to solve a simple problem, and it’s usually much easier than learning how to use a profiler and how to properly read its output

As the article says, this is a low-frequency sampling profiler, which means it comes with all the caveats of a sampling profiler and of interpreting its output. As a very crude tool, sure, but it's not an excuse not to learn to use a profiler. Perf, Instruments, and UIforETW are simple enough that anyone who can follow the instructions in this blog post can pick up the basics in the same length of time.
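
For comparison, the perf version of the same trick is two commands (the binary name is made up):

    # sample at 99 Hz, recording call stacks, while the program runs
    perf record -F 99 -g ./myapp
    # browse where the samples landed
    perf report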


I just went and checked my university - they’re still teaching C++ to first-year uni students in 2025

Full 4K - very few, but lots are running adaptive resolutions at >2K and at 120Hz

There is no situation where toolchain improvements or workflow improvements should be scoffed at.

It can be harder to justify in private tooling, where you might only have a few dozen or a few hundred devs saving those seconds per invocation.

But in public tooling, where the benefit is across tens of thousands or more? It's basically always worth it.


Obviously effort vs. reward comes in here, but if you have 20 devs and you save 5 seconds per run, you possibly save a context switch on every tool invocation.
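
Back of the envelope (the invocation count is a guess): 20 devs x 30 runs/day x 5 s = 3,000 s, or ~50 minutes a day - before you even count the cost of the broken flow.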

This is true, but I think the other side of it is that in most shops there is lower hanging fruit than 5 seconds per tool run, especially if it's not the tool that's in the build / debug / test loop but rather the workspace setup / packaging / lockfile management tool.

Like, I switched my team's Docker builds to Depot and we immediately halved our CI costs and shed like 60% of the build time, because it's a persistent worker node that doesn't have to re-download everything every time. I have no association with them, just a happy customer; I only mention it to illustrate how many more gains are typically on the table before a few seconds here and there are the next thing to seriously put effort into.


> Yes, but if your CI isn't terrible, you have the dependencies cached, so that subsequent runs are almost instant, and more importantly, you don't have a hard dependency on a third-party service.

I’d wager the majority of CI usage fits your bill of “terrible”. No provider offers OOTB caching in my experience, and I’ve worked with multiple: in-house systems, Jenkins, TeamCity, GHA, Buildkite.


GHA with the `setup-ruby` action will cache gems.
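
e.g. (the checkout step is the usual boilerplate):

    steps:
      - uses: actions/checkout@v4
      - uses: ruby/setup-ruby@v1
        with:
          bundler-cache: true  # runs bundle install and caches gems automatically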

Buildkite can be used in tons of different ways, but it's common to use it with Docker and build an image with a layer dedicated to the gems (e.g. COPY Gemfile Gemfile.lock ./; RUN bundle install), effectively caching dependencies.
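
Something like this, with the base image made up:

    FROM ruby:3.3
    WORKDIR /app
    # Lockfiles first: this layer stays cached until they change.
    COPY Gemfile Gemfile.lock ./
    RUN bundle install
    # App code last, so ordinary edits don't bust the gem layer.
    COPY . .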


> GHA with the `setup-ruby` action will cache gems.

Caching is a great word - it only means what we want it to mean. My experience with GHA's default caching is that it’s absolutely dog-slow.

> Buildkite can be used in tons of different ways, but it's common to use it with Docker and build an image with a layer dedicated to the gems (e.g. COPY Gemfile Gemfile.lock ./; RUN bundle install), effectively caching dependencies.

The only way Docker caching works is if you have a persistent host, and that’s certainly not most setups. It can be done, but if you have a persistent host, running in Docker doesn’t gain you much at all - you’d see the same caching speed-up if you just ran the build on the host machine directly.


> My experience with GHA's default caching is that it’s absolutely dog-slow.

GHA is definitely far from the best, but it works - e.g. 1.4 seconds to restore 27 dependencies: https://github.com/redis-rb/redis-client/actions/runs/205191...

> The only way Docker caching works is if you have a persistent host.

You can pull the cache when the build host spawns, but yes, if you want to build efficiently, you can't use ephemeral builders.
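
e.g. with BuildKit's registry cache (the registry ref is made up):

    docker buildx build \
      --cache-from type=registry,ref=reg.example.com/app:cache \
      --cache-to type=registry,ref=reg.example.com/app:cache,mode=max \
      -t reg.example.com/app:latest --push .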

But overall that discussion isn't very interesting because Buildkite is more a kit to build a CI than a CI, so it's on you to figure out caching.

So I'll just reiterate my main point: a CI system must provide a workable caching mechanism if it wants to be both snappy and reliable.

I've worked for over a decade on one of the biggest Rails applications in existence, and restoring the 800-ish gems from cache was a matter of a handful of seconds. And when rubygems.org had to yank a critical gem for copyright reasons [0], we continued building and shipping without disruption while other companies with bad CIs were dead in the water for multiple days.

[0] https://github.com/rails/marcel/issues/23


> So I'll just reiterate my main point: a CI system must provide a workable caching mechanism if it wants to be both snappy and reliable.

The problem is that none of the providers really do this out of the box. GHA kind of does it, but unless you run the runners yourself, you’re still pulling the cache from somewhere remote.

> I've worked for over a decade on one of the biggest Rails applications in existence, and restoring the 800-ish gems from cache was a matter of a handful of seconds.

I kind of suspected as much - the vast majority of orgs don’t have a team of people who can run that kind of system. Most places with 10-20 devs (roughly the size of the team that ran the builds at my last org) have some sort of script running on cheap-as-hell runners, and they’re not running mirrors or baking base images on dependency changes.


> none of the providers really do this out of the box

CircleCI does. And I'm sure many others.


> My experience with GHA's default caching is that it’s absolutely dog-slow.

For reference, oven-sh/setup-bun opted to install dependencies from scratch over using GHA caching since the latter was somehow slower.

https://github.com/oven-sh/setup-bun/issues/14#issuecomment-...


This is what I came to say. We pre-cache dependencies into an approved baseline image, and we cache approved and scanned dependencies locally with Nexus and Lifecycle.

> Can I get a laptop to sleep after closing the lid yet?

> on windows all of this just works

Disagree on the sleep one - my work laptop doesn’t go to sleep properly. The only laptop I’ve ever used that behaves as expected with sleep is a MacBook.


That's funny - my work MBP won't go to sleep properly, lol. I often come back after the weekend to find a dead laptop.

Then you have a significant outlier experience for that platform.

It’s more than fine for people to dislike Apple products, but this is simply not an area where other platforms have them beat.


Not sure why you're insinuating that I dislike Apple products. My personal MacBook Air doesn't have this issue, and most of my household is on Apple.

I'm also seeing results for "macbook pro doesn't go to sleep when lid closed", so other people see this problem too. You can't really claim that other platforms have them beat here if there isn't data to support the claim.


> Not sure why you're insinuating that I dislike Apple products.

Your comment was written in a manner that echoes the same anti-Apple bias that's frequently found on HN. If that's not you, then it's just a misread on my part.

> You can't really claim that other platforms have them beat here if there isn't data to support the claim.

I can, because by and large those are still anecdotal experiences posted online. The deeper integration of OS and hardware, thanks to Apple controlling the entire chain, has made sleep mostly a non-issue; it's typically a misbehaving application that prevents it. There are valid reasons an app might need to do that, so it's not like macOS is going to prevent it - but if sleep's not working right on macOS, it's typically user error.

This is different from Linux (and Windows, to a lesser degree) where you have a crazy amount of moving parts with drivers/hardware/resources/etc.
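
If sleep is being blocked, pmset will name the culprit:

    $ pmset -g assertions
    # look for PreventUserIdleSystemSleep entries; each one lists the
    # pid and name of the process holding the assertion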


Macs do sleep well, when they manage to sleep. Sometimes macOS takes issue with certain programs; the last stack I used at work had a ~50/50 chance of inhibiting sleep when it was spun up.

All in all, I've given up on sleep entirely and default to suspend/hibernate now.


A buggy program preventing sleep is a bug in that program, not a mark on the overall support and reliability of sleep functionality in macOS.

There are valid reasons why a program might need to block sleep, so it's not like macOS is going to hard-prevent it if a program does this. Most programs should not be doing that though.
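
The sanctioned way for a program to do this is a power assertion; from the shell, caffeinate wraps the same API (the script name is made up):

    # hold off idle sleep only while the job runs
    caffeinate -i ./long_job.sh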


In my band, we sell digital lossless albums on Bandcamp for just that reason.
