I've been reviewing Agent sandboxing solutions recently and it occurred to me there is a gaping vector for persistent exploits for tools that let the agent write to the project directory. Like this one does.
I had originally thought this would be OK, as we could review everything in the git diff. But it later occurred to me that there are all kinds of files the agent could write to that I'd end up executing, as the developer, outside the sandbox: every .pyc file, for instance, files in .venv, and .git hook files.
ChatGPT[1] confirms the underlying exploit vectors and also that there isn't much discussion of them in the context of agent sandboxing tools.
My conclusion from that is the only truly safe sandboxing technique would be one that transfers files from the sandbox to the dev's machine through some kind of git patch or similar. I.e. the file can only transfer if it's in version control and, therefore presumably, has been reviewed by the dev before transfer outside the sandbox.
I'd really like to see people talking more about this. The solution isn't that hard: keep CWD as an overlay and transfer in-container modified files through a proxy of some kind that filters out any file not in git, and maybe some that are in git but are known to be potentially dangerous (bin files). Obviously, there would need to be some kind of configuration option here.
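For illustration, here is a minimal Python sketch of the kind of filter described above. Everything in it is an assumption made for the example: `tracked` stands in for the output of `git ls-files` on the host, and the deny lists are illustrative defaults, not any existing tool's policy.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Illustrative deny lists (assumptions, not taken from any real tool):
# files the developer may end up executing outside the sandbox.
DENY_PATTERNS = ("*.pyc", "*.so", "*.exe", "*.bin")
DENY_DIRS = (".git", ".venv", "__pycache__")


def is_transferable(path: str, tracked: set[str]) -> bool:
    """Return True if a sandbox-modified file may be copied to the host."""
    pure = PurePosixPath(path)
    parts = pure.parts
    if pure.is_absolute() or ".." in parts:
        return False  # never let a path escape the project directory
    if path not in tracked:
        return False  # untracked files never leave the sandbox
    if any(part in DENY_DIRS for part in parts):
        return False  # tracked, but inside a directory we refuse to touch
    return not any(fnmatch(parts[-1], pat) for pat in DENY_PATTERNS)


def filter_changes(changed: list[str], tracked: set[str]) -> list[str]:
    """Keep only the changed paths that pass the transfer policy."""
    return [p for p in changed if is_transferable(p, tracked)]
```

Calling `filter_changes(changed_paths, tracked_paths)` would then yield only the files that are both tracked and outside the deny lists. The configuration option mentioned above would amount to making the two deny lists user-editable.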
It's a good point. Maybe I should add an option to make certain directories read-only even under the current working directory, so that you can make .git/ read-only without moving it out of the project directory.
You can already make CWD an overlay with "jai -D". The tricky part is how to merge the changes back into your main working directory.
This is the problem yoloAI (see below comment) is built around. The merge step is `yoloai diff` / `yoloai apply`: the agent works against a copy of your project inside the container, you review the diff, you decide what lands.
jai's -D flag captures the right data; the missing piece is surfacing it ergonomically. yoloAI uses git for the diff/apply so it already feels natural to a dev.
One thing that's not fully solved yet: your point about .git/hooks and .venv being write vectors even within the project dir. They're filtered from the diff surface but the agent can still write them during the session. A read-only flag for those paths (what you're considering adding to jai) would be a cleaner fix.
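Until such a read-only flag exists, one stopgap is to detect writes after the fact. The Python sketch below is an assumption-laden illustration, not something jai or yoloAI is known to do: it hashes every file under a hypothetical list of protected paths before and after a session and reports what changed. It only detects tampering; a read-only mount would prevent it.

```python
import hashlib
from pathlib import Path

# Hypothetical defaults; a real tool would make these configurable.
PROTECTED = (".git/hooks", ".venv")


def snapshot(root: Path, protected=PROTECTED) -> dict[str, str]:
    """Map each file under the protected paths to its SHA-256 digest."""
    digests = {}
    for rel in protected:
        base = root / rel
        if not base.exists():
            continue
        for f in sorted(base.rglob("*")):
            if f.is_file():
                key = f.relative_to(root).as_posix()
                digests[key] = hashlib.sha256(f.read_bytes()).hexdigest()
    return digests


def tampered(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """List protected files that were added, removed, or modified."""
    paths = before.keys() | after.keys()
    return sorted(p for p in paths if before.get(p) != after.get(p))
```

Take one `snapshot` before handing the directory to the agent and another when the session ends; a non-empty result from `tampered` means something under those paths needs a look before you run anything.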
I've already shipped this and use it myself every day. I'm the author of yoloAI (https://github.com/kstenerud/yoloai), which is built around exactly this model.
The agent runs inside a Docker container or containerd VM (or a seatbelt container or Tart VM on macOS), against a full copy of your project directory. When it's done, `yoloai diff` gives you a unified diff of everything it changed. `yoloai apply` lands it. `yoloai reset` throws it away so you can make the agent try again. The copy lives in the sandbox, so your working tree is untouched until you explicitly say so.
The merge step turned out to be straightforward: just use git under the hood. The harder parts were: (a) making it fast enough that the copy doesn't add annoying startup overhead, (b) handling the .pyc/.venv/.git/hooks concern you raised (they're excluded from the diff surface by default), and (c) credential injection so the agent can actually reach its API without you mounting your whole home dir.
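To make the review step concrete, here is a rough Python sketch of diffing a sandbox copy against the original tree. It is not yoloAI's implementation: the comment above says that uses git under the hood, whereas this swaps in the standard library's `difflib` to stay self-contained. The function name and parameters are hypothetical, and it assumes text files.

```python
import difflib
from pathlib import Path


def review_diff(original: Path, sandbox_copy: Path, rel_paths: list[str]) -> str:
    """Build a unified diff of the listed files, original vs. sandbox copy."""
    chunks = []
    for rel in rel_paths:
        old_file, new_file = original / rel, sandbox_copy / rel
        # A file missing on one side is treated as empty (added or deleted).
        old = old_file.read_text().splitlines(keepends=True) if old_file.exists() else []
        new = new_file.read_text().splitlines(keepends=True) if new_file.exists() else []
        chunks.extend(
            difflib.unified_diff(old, new, fromfile=f"a/{rel}", tofile=f"b/{rel}")
        )
    return "".join(chunks)
```

The point of the design is that nothing reaches the working tree until a human has read this output. Which paths go into `rel_paths` is exactly where the .pyc/.venv/.git/hooks exclusions would apply.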
Leveraging existing tech is where it's at. Each does one thing and does it well. Network isolation is done via iptables in Docker, for example.
Still early/beta but it's working. Happy to compare notes if you're building something similar.
I don't follow why you'd run uncommitted, unreviewed code outside of the sandbox you use (by sandbox I mean something as secure as a VM). My mental model is more that you no longer compile or run code outside of the sandbox: it contains everything, and when a change is ready you ship it after a proper review.
The way I'd do it right now:
* git worktree to have a specific folder with a specific branch to which the agent has access (with the .git in another folder)
* have some proper review before moving the commits there into another branch, committing from outside the sandbox
* run code from this review-protected branch if needed
Ideally, within the sandbox, the agent can go nuts to run tests, do visual inspections e.g. with web dev, maybe run a demo for me to see.
But neither of the previous HN submissions reached the front page. The benefit of this article is that it got to the front page and so raised awareness.
Creating a new URL with effectively the same info but further removed from the primary source is not good HN etiquette.
Plus this is just content marketing for the AI security startup who posted it. They've added nothing, but get a link to their product on the front page ¯\_(ツ)_/¯
Unfortunately it's kind of random what makes it to the front page. If HN had a mechanism to ensure only primary sources make it, automatically replacing secondary sources that somehow rank highly, I'd be all for that, but we don't have that.
>Creating a new URL with effectively the same info but further removed from the primary source is not good HN etiquette.
I'm going to respectfully disagree with all the above and thank the submitter for this article. It is sufficiently different from the primary source and did add new information (meta commentary) that I like. The title is also catchier, which may explain its rise to the front page (because more of us recognize "GitHub" than "Cline").
The original source is fine but it gets deep into the weeds of the various config files. That's all wonderful but that actually isn't what I need.
On the other hand, this thread's article is more meta commentary of generalized lessons, more "case study" or "executive briefing" style. That's the right level for me at the moment.
If I were a hacker trying to re-create this exploit, or coding a monitoring tool that tries to prevent these kinds of attacks, I would prefer the original article's very detailed info.
On the other hand, if I just want some highlights that raise my awareness of "AI tricking AI", this article that's a level removed from the original is better for that purpose. Sometimes the derived article is better because it presents information in a different way for a different purpose/audience. A "second chance pool" doesn't help a lot of us because it still doesn't change the article to the shorter, meta-commentary type of article that we prefer.
The thread's article consolidated several sources into a digestible format and had the etiquette of citations that linked back to the primary source URLs.
> Plus this is just content marketing for the AI security startup who posted it. They've added nothing, but get a link to their product on the front page ¯\_(ツ)_/¯
This. I want to support the original researchers' websites, and discussions that link to them, rather than an AI startup that re-reports the same thing and ends up on the front page.
Today I realized that I inherently trust .ai domains less than other domains. It always feels like you have to mentally prepare for a higher likelihood of being conned.
Level 12 is a software consulting and custom development agency. We have mid and senior level positions. Our job descriptions are written by developers for developers. No HR fluff here, we want you to know what you are really getting into:
- We have a commitment to transparency and offer a "no surprises experience" throughout the interview and hiring process. We value candor...as evidenced by the length of our job description. :)
- Our CEO prefers the title CED, Chief Executive Developer. Engineering and operational concerns don't take a back seat when potential sales come to the front door.
- You will be working remote with a team that is all working remote and has been for years. Let's make the best use of our time by not commuting.
- We practice and preach sound development practices. You are likely to learn and grow as a developer while working here.
- You are committed to automated testing of all the software you write (our apps typically have 92%+ test coverage).
- We have a no-drama office policy. We value and cultivate enjoyable working relationships among team members. No jerks allowed!
- We emphasize work/life balance and adopt policies that make sure our people don't get burnt out. For instance, our PTO/Vacation policies are designed so that you actually use them.
- If you apply, we guarantee that we will give you a response, whether "yay" or "nay". No black holes here!
- Hiring status & updates: see my most recent post to our hiring google group. Also, subscribe there if you'd like to be updated in the future: https://groups.google.com/g/level12-hiring
We will be hiring in the next 2-6 weeks. More details are in a recent post in the linked Google group. And we'll get the website updated to avoid further confusion.
I think this is FUD. Haven't GLP-1s been around for 20+ years? The recent surge in popularity was due to new formulas that could be injected once per week instead of daily.
The most noticeable jump has been at 10mg, but I started noticing it from the beginning, going from 2.5 to 5 to 7.5 over a 6-month period. Something about 10 just hit different, though I'm only on my 2nd month of it now. Less anxiety about things that hit me before, and my mood has improved overall.
Yeah, I'm just confused why someone would go from a completely deterministic dependency management system back to a dice-rolling one, especially now that LLMs exist and all the top-tier ones are excellent at the Nix language.
Because I myself am never going to anything else ever again, unless it's a derivative of the same idea, because it's the only one that makes sense
After troubleshooting a couple issues with the GitHub Actions Linux admin team, and their decision to not address either issue, I'm highly skeptical of investing more in GitHub Actions:
- Ubuntu useradd command causes 30s+ hang [1]
- Ubuntu: sudo -u some-user unexpectedly ends up with environment variables for the runner [2]
They told you why it takes so long, no? The runners come by default with loads of programming languages installed, like Rust, Haskell, Node, Python, .NET etc., so it sets all that up on each user add.
I would also question why you're adding users on an ephemeral runner.
> I would also question why you're adding users on an ephemeral runner.
We use runners for things that aren't quite "CI for software source code" and that do some "weird" stuff.
For instance, we require that new developer system setup be automated - so we have a set of scripts to do that, and a CI runner that runs on those scripts.
Fair enough if you've some development environment automation and you want the CI to run it as well so CI is consistent with local development.
Don't know exactly what you're doing, but others (myself included) are using Mise or Nix on a per-project basis to automate the development environment setup, and that works well on GitHub Actions.
But I don't think useradd taking 30s on GitHub Actions is a bug or something they need to fix; they've explained why. Unsure about the sudo issues, did not read it carefully.
> Fair enough if you've some development environment automation and you want the CI to run it as well so CI is consistent with local development.
Oh we don't even run it in applications' CI, the environment automation is an entirely separate CI workflow. The intention isn't consistency between dev/CI, the environment automation CI effectively just serves to ensure that the automations actually run without error, and adds some explicit responsibility for anyone who's adding a new dependency.
> But I don't think useradd taking 30s on GitHub Actions is a bug or something they need to fix; they've explained why. Unsure about the sudo issues, did not read it carefully.
Yeah, agreed. Tangential, our dev setup CI is fairly slow, which tends to be fine - it runs a couple orders of magnitude less frequently than our app CI.
> That said, the framing feels a bit too poetic for engineering.
I wholeheartedly disagree but I tend to believe that's going to be highly dependent on what type of developer a person is. One who leans towards the craftsmanship side or one who leans towards the deliverables side. It will also be impacted by the type of development they are exposed to. Are they in an environment where they can even have a "lump of clay" moment or is all their time spent on systems that are too old/archaic/complex/whatever to ever really absorb the essence of the problem the code is addressing?
The OP's quote is exactly how I feel about software. I often don't know exactly what I'm going to build. I start with a general idea and it morphs towards excellence by the iteration. My idea changes, and is sharpened, as it repeatedly runs into reality. And by that I mean, it's sharpened as I write and refactor the code.
I personally don't have the same ability to do that with code review because the amount of time I spend reviewing/absorbing the solution isn't sufficient to really get to know the problem space or the code.
I mean you must care a little bit right? Why publish it and share it here otherwise? :) Maybe you're looking for people to just review and learn from the code, rather than use it in their projects?
> From license terms you can see that any independent developer and small teams could use it without any issues
Right, until they cannot, and that choice won't be made from their own agency, and most people will try to avoid ending up there, hence not using the project in the first place.
Not saying "it's doomed to have zero users", but you'll probably find it slightly strange when people seemingly would have perfect use for your project, yet find other options anyways.
> And yes I do not want it to be "free stuff" for big corporations. I just do not know any existing license that can define such terms.
Guess BSL would fit you, but yeah, if you want any sort of restrictions, what you want is something else than Free and Open Source Software, and that's fine of course, just be aware it'll be a hard sell to developers used to FOSS. Again, a fine choice to be making and understandable.
1: https://chatgpt.com/share/69c3ec10-0e40-832a-b905-31736d8a34...