Is this level of fear typical or reasonable? If so, why doesn’t Anthropic / AI code gen providers offer this type of service? Hard to believe Anthropic is not secure in some sense — like what if Claude Code is already inside some container-like thing?
Is it actually true that Claude cannot bust out of the container?
Just a month ago, an AI coding agent deleted all the files on someone's computer and there was a little discussion of it here on HN. Support's response was basically "yeah, this happens sometimes".
HN thread (flagged, probably because it was a link to some crappy website that restates things from social media with no substantive content of its own): https://news.ycombinator.com/item?id=44262383
It's worth noting that the default settings of Cursor do prevent this by asking you to confirm every command that is run. And when you get tired of that five minutes in and switch to auto-approving, there is still protection against deleting files outside the work directory. The story above is about someone who disabled all the safeguards because they were inconvenient, and then bad things happened.
It is a good example of "bad things can happen", but when talking about whether we need additional safeguards, the lessons are less clear. And while I'm not as familiar with the safeguards of Claude Code, I'm assured it also has some by default.
If the container route (with those types of privileges) is being suggested from a security point of view, then you might as well run these processes in a VM and call it a day :/
I haven't found that to be the case. I have used Claude Code both inside a container and on the host machine, and it has been fine. Any command that could cause changes to your system you MUST approve when using it in agent mode.
You can either approve each command manually, or let it run commands autonomously. If you let it run commands on its own, you run the risk of it doing something stupid (mainly deleting files).
You also have MCP tools running on your machine, which might have security issues.
I have personally never seen Claude (or any AI agent, actually) do anything that could not be fixed with git. I run it 24/7 in full permission-bypass mode and hardly think about it.
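For anyone nervous about that, the recovery path is just the boring git commands. A rough sketch, assuming the damage stayed inside a repo whose files had been committed at some point:

    # throw away uncommitted changes to tracked files
    git restore .

    # delete files/dirs the agent created but never committed (dry run first)
    git clean -nd
    git clean -fd

    # if a commit or branch got clobbered, dig the old tip out of the reflog
    git reflog
    git reset --hard <good-sha-from-reflog>

Anything outside a repo, or never committed, is obviously not covered, which is exactly the argument for some extra isolation.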
Anyone with more than one toolbox knows that fear isn't required. Containers are about more than security, including stuff like organization and portability.
> If so, why doesn’t Anthropic / AI code gen providers offer this type of service?
Well perhaps I'm too much the cynic, but I'm sure you can imagine why a lack of portability and reproducibility are things that are pretty good for vendors. A lack of transparency also puts the zealots for "100x!", and vendors, and many other people in a natural conspiracy together, and while it benefits them to drum up FOMO it makes everyone else burn time/money trying to figure out how much of the hype is real. People who are new to the industry get leverage by claiming all existing knowledge does not matter, workers who are experienced but looking to pivot into a new specialization in a tough job market benefit from making unverifiable claims, vendors make a quick buck while businesses buy-to-try and forget to cancel the contract, etc etc.
> Is it actually true that Claude cannot bust out of the container?
Escaping containers is something a lot of people in operations and security spent a lot of time thinking about long before agents and AI. Container escape is possible and deadly serious, but not really in this domain: all your banks and utility providers are probably running Kubernetes, so compared to that, who cares about maybe leaking source or destroying data on local dev machines or on platforms trying to facilitate low-code apps? AI does change things slightly because people will run Ollama/MCP/IDEs on the host, and that's arguably new surface area to worry about. Sharing sockets and files for inter-agent comms is going to be routine even if everyone says it's bad practice. But of course you could containerize those things too, add a queue, containerize unit tests, etc.
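To make that concrete, here's a minimal sketch of running an agent in a throwaway container so the only host path it can touch is the project directory. This assumes Docker; the node:22 image and bash are placeholders, swap in whatever your agent actually needs:

    # One-off sandbox: only the current project dir is mounted, all Linux
    # capabilities dropped, container discarded on exit. Network stays on
    # because the agent still needs to reach its API.
    docker run --rm -it \
      --cap-drop ALL \
      -v "$PWD":/work \
      -w /work \
      node:22 \
      bash

It won't stop a determined kernel exploit, but it does turn "rm -rf'd my home directory" into "rm -rf'd a bind mount of one repo".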
Of course. Also with regular customer projects. Even without AI--but of course having an idiot be able to execute commands on your PC makes the risk higher.
> If so, why doesn’t Anthropic / AI code gen providers offer this type of service?
Why? Separate the concerns. Isolation is a concern that depends on my own risk appetite. I do not want something else deciding on my behalf what's inside the container and what's outside. That said, they do have devcontainer support (like the article says).
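For reference, a devcontainer is just a file checked into the repo, so the "what's inside" decision stays with you. A minimal sketch of one way to set it up (my own guess at a reasonable config, not Anthropic's reference devcontainer; double-check the image tag and package name before relying on them):

    mkdir -p .devcontainer
    cat > .devcontainer/devcontainer.json <<'EOF'
    {
      "name": "agent-sandbox",
      "image": "mcr.microsoft.com/devcontainers/javascript-node:22",
      "postCreateCommand": "npm install -g @anthropic-ai/claude-code"
    }
    EOF

Then any editor or CLI that speaks the Dev Containers spec builds and enters that environment, and the agent never sees the host filesystem beyond the workspace mount.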
>Hard to believe Anthropic is not secure in some sense — like what if Claude Code is already inside some container-like thing?
It's a Node.js program. It does ask you about every command it's going to execute before it does it, though.
>Is it actually true that Claude cannot bust out of the container?
There are (sporadic) container escape exploits--but escaping is much harder than if there were no container at all.
You can also use a qemu vm. Good luck escaping that.
Or an extra user account--I'm thinking of doing that next.
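Both of those are cheap to set up. A rough sketch of each (user name and disk image are placeholders, and the user-account route is isolation by file permissions, not a hardened sandbox):

    # extra user: the agent can only write to /home/agent
    sudo useradd -m -s /bin/bash agent
    sudo -iu agent

    # qemu VM: assumes you already have a disk image, e.g. built from an
    # Ubuntu cloud image, with a serial console enabled for -nographic
    qemu-system-x86_64 \
      -enable-kvm -m 4G -smp 4 \
      -drive file=agent-vm.qcow2,if=virtio \
      -nic user \
      -nographic

The VM gives you the strongest boundary; the extra user is the lazy middle ground between that and running everything as yourself.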
It's sway with eye-candy. (Or, if you don't know about sway, the Wayland version of i3 — also with eye-candy).
If you're still not sure, you're simply not the target audience. These things almost always require further learning and customization, so some level of gate-keeping from the start is helpful.
Yep this - you’re not the target audience. It’s a UI desktop layer for Linux, made for the enthusiast.
But also, OP posted a link that is not mobile friendly (at least on Safari) and doesn't say much about what this is. The main page does a fine job though (for those who care, I guess).
I clicked 3 different links failing at answering that question for myself and then I stopped caring, though not enough to pass up one-upping your comment.
What is your goal? If d1, d2, d3, etc. is the dataset over which you're trying to optimize, then the goal is to find some best-performing d_i. In this case, you're not evaluating. You're optimizing. Your acquisition function even says so: https://rentruewang.github.io/bocoel/research/
And in general if you have an LLM that performs really well on one d_i then who cares. The goal in LLM evaluation is to find a good performing LLM overall.
Finally, it feels like your Abstract and other snippets were written by an LLM.
I disagree that the goal in "evaluation is to find a good performing LLM overall". The goal in evaluation is to understand the performance of an LLM (on average). This approach is actually more about finding "areas" where the LLM behaves well and where it does not (via the Gaussian process approximation). That is indeed an important problem to look at. Often you just run an LLM evaluation on thousands of samples, some of them similar, and you don't learn anything new from the sample "what time is it, please" over "what time is it".
If instead you can reduce the number of samples to look at and automatically find "clusters" and their performance, you get a win. It won't be the "average performance number", but it will (hopefully) give you an understanding of which things work how well in the LLM.
The main drawback of this (as far as I can tell after a short glimpse at it) is the embedding itself. This will only work well if distance in the embedding space really correlates with performance. However, we know from adversarial attacks that even small changes in the embedding space can result in vastly different results.