isehgal 2 hours ago [-]
we've run into this problem. when you're running 4-5 Codex/Claude Code sessions in parallel across worktrees, the port collision sucks. have to check out
markamo 8 hours ago [-]
This solves the parallel runtime problem well.
The adjacent problem I’ve been focused on is what happens after the agent finishes in its isolated environment: how do you review what it actually changed before accepting the result?
I’m interested in diff/commit/rollback at the filesystem level, so you can selectively keep some changes and discard others.
Different problem, but they compose naturally.
fabianlindfors 24 hours ago [-]
We have been trying to solve the same problem (and a bunch of other ones) with https://specific.dev as well. We’ve tried to stay away from Docker as much as we can though because of the still pretty bad experience on Mac.
Our approach is having our CLI handle port assignments (and pass any connection details/ports along as env vars) and that way being able to spin up “isolated” copies of the local dev environment. Has the added benefit of us being able to deploy the same config straight to production and switch in production database connections strings and anything else needed.
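A minimal sketch of the "CLI assigns ports and passes them as env vars" idea, not Specific's actual implementation. The function names and the `<SERVICE>_PORT` naming scheme are assumptions made for illustration.

```python
import socket
from contextlib import ExitStack


def allocate_ports(services):
    """Return {service: port} using OS-assigned free TCP ports.

    All sockets stay open until every port has been chosen, so the ports
    returned by one call are guaranteed to be distinct from each other.
    """
    ports = {}
    with ExitStack() as stack:
        for name in services:
            sock = stack.enter_context(
                socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            )
            sock.bind(("127.0.0.1", 0))  # port 0 = let the OS pick a free port
            ports[name] = sock.getsockname()[1]
    return ports


def env_for_instance(services):
    """Build env vars such as POSTGRES_PORT for one isolated copy of the stack."""
    return {
        f"{name.upper()}_PORT": str(port)
        for name, port in allocate_ports(services).items()
    }


if __name__ == "__main__":
    # Two parallel copies of the same (hypothetical) stack.
    for copy in ("worktree-a", "worktree-b"):
        print(copy, env_for_instance(["postgres", "redis", "web"]))
```

Note the sockets are closed before the services start, so another process could still grab a port in between; a real tool would need to handle that race.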
jsunderland323 24 hours ago [-]
We started with an approach like that but I think our grounding principle has been that you shouldn't have to modify your docker-compose to get parallelized local development. I think we want to layer onto your existing setup, not make you re-write your stack around us.
I haven't really had a bad experience with Docker on Mac. But is the idea that you basically just build your service on top of specific.dev's provided services (postgres and redis), those run bare-metal locally, and then you can deploy to specific.dev's hosted solution?
fabianlindfors 23 hours ago [-]
Yes, exactly. Probably two different focuses between us, we are more focused on providing the full environment to build productively with coding agents, from local dev all the way to prod. The key thing for us is that the agent can write code, build infrastructure and test the entire system autonomously locally, and then deploying to production should be dead simple.
A bit of a different approach from the classic use case of docker-compose that is often orthogonal to the production infrastructure in some sense.
One thing I've used to great success though is taking an existing project or example docker-compose and simply asking the coding agent to translate it to Specific's IaC. Works a treat, especially as the coding agent can read all the code at the same time and connect it all together.
(also it looks like we were in the same batch!)
jsunderland323 23 hours ago [-]
I could definitely see that being useful for folks who are Docker-fearful or just less infra literate in general.
I think we're focused on the other end of the spectrum. Folks who like docker and have a good docker setup but want to have parallel runtimes. Anyway, best of luck!
fabianlindfors 14 hours ago [-]
Same to you!
chrisweekly 19 hours ago [-]
> "We’ve tried to stay away from Docker as much as we can though because of the still pretty bad experience on Mac."
This seems to be a pretty common perspective, but isn't it mostly about Docker Desktop? Orbstack solved my complaints, and I'm genuinely curious if I'm missing something significant (which is def possible).
fabianlindfors 9 hours ago [-]
Orbstack is definitely much better but far from native speeds in my experience. From our perspective of wanting all users to have a good experience, we also can't really point folks towards Orbstack as a "solution" to make the local dev experience great.
pdimitar 7 hours ago [-]
To put things into perspective: we have an integration test suite that takes:
- 30 minutes with Colima on Mac;
- 20 minutes with OrbStack on Mac;
- 13 minutes on a weaker CPU (Ryzen 5500U) on a native Linux laptop;
- 14 minutes on a Ryzen 5600X and a virtualized Debian inside Windows 10 WSL2.
Pretty stark differences. Granted our test suite is mostly I/O bound but that really tells you something about the VM overhead on a Mac and the lack of an actual kernel-native containerization support on macOS.
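To make the gap concrete, here is a small sketch that turns the four timings above into slowdown ratios against the native Linux run. The numbers come from the comment; the labels and helper name are just for illustration, and the machines differ, so this is not a like-for-like benchmark.

```python
# Integration test suite durations in minutes, as reported above.
timings_min = {
    "Colima on Mac": 30,
    "OrbStack on Mac": 20,
    "Native Linux (Ryzen 5500U)": 13,
    "WSL2 Debian (Ryzen 5600X)": 14,
}


def slowdown_vs(baseline, timings):
    """Return each setup's duration as a multiple of the baseline's duration."""
    base = timings[baseline]
    return {name: round(t / base, 2) for name, t in timings.items()}


ratios = slowdown_vs("Native Linux (Ryzen 5500U)", timings_min)

if __name__ == "__main__":
    for name, ratio in ratios.items():
        print(f"{name}: {ratio:.2f}x")
```

So Colima comes out at roughly 2.3x and OrbStack at roughly 1.5x the native Linux time, while WSL2 stays within about 8%.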
jsunderland323 18 hours ago [-]
I think this was a common perspective from early Docker days with regard to local bind mounts (before Docker switched from VirtualBox to HyperKit on macOS). I do use OrbStack and have noticed faster build times with it, but I haven't really noticed any difference in runtime performance between OrbStack and Docker Desktop.
pdimitar 7 hours ago [-]
There are differences, but I think most people's code does not expose macOS' suboptimal containerization performance is all. Check my comment sibling to yours. We have noticed very observable differences.
Until Apple adds kernel-level containerization support (likely: never), this difference in performance will continue to exist.
That being said, Orbstack really is the best on macOS. Docker Desktop is only slightly slower but much worse as a UX. Colima I appreciate for its full headless nature but it's severely behind in performance, sadly.
rakeshd 16 hours ago [-]
This is really cool, been feeling this pain with worktrees for a while.
Curious about the hot strategy: when you do umount -l /workspace + mount --bind + mount --make-rshared inside the DinD container, lazy unmount means a running file watcher can still hold open fds to the old worktree while the new bind is already live. Have you hit cases where it keeps writing to stale paths after the switch? Or does it just naturally recover once the watcher picks up the inotify events from the new mount?
jsunderland323 16 hours ago [-]
I have waited 12 hours for someone to ask this! You are my hero.
So the name "hot" is a bit misleading. The containers don't actually stay alive through the switch. What happens is we do the umount -l /workspace, mount --bind, mount --make-rshared sequence first, and then we run docker compose up --force-recreate. Force-recreate skips compose down (which would tear down the network, named volumes, everything) and just swaps the container processes in place. The old containers and their file watchers are killed and new ones start up.
By the time the new container processes start, /workspace already points at the new worktree so all their file handles are fresh and correct. There's no window where a watcher could be writing to stale paths because the old processes are just gone.
I was pretty afraid of this at first too but it turns out the force-recreate sidesteps the whole problem.
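The ordering described above can be sketched as a command list: remount first, then recreate. This is only an illustration of the sequence, not Coasts' actual code. The worktree path is hypothetical, and the `-d` flag is an assumption that the thread does not mention.

```python
import shlex
import subprocess


def hot_switch_commands(new_worktree, mount_point="/workspace"):
    """Commands for the 'hot' switch, in the order given in the thread."""
    return [
        # Lazily detach the old worktree from the mount point.
        ["umount", "-l", mount_point],
        # Bind the new worktree onto the same mount point.
        ["mount", "--bind", new_worktree, mount_point],
        # Make the mount (and its submounts) shared so it propagates.
        ["mount", "--make-rshared", mount_point],
        # Replace the container processes without a full `compose down`.
        # `-d` (detached) is an assumption, not stated in the thread.
        ["docker", "compose", "up", "-d", "--force-recreate"],
    ]


def run(commands, dry_run=True):
    """Print the commands, or execute them in order and stop on the first failure."""
    for cmd in commands:
        if dry_run:
            print(shlex.join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run only: actually running this needs root inside the DinD container.
    run(hot_switch_commands("/worktrees/feature-x"))
```

Because the recreate step comes last, every new container process opens its files after `/workspace` already points at the new worktree.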
oelmgren 1 days ago [-]
This is pretty cool, have personally felt this limitation many a time.
Basically been relying on spinning up cursor / niteshift / devin workflows since they have their own containers but this could be interesting to keep it all on your main machine.
jsunderland323 1 days ago [-]
Thanks!
Yeah, I think there's a ton of great remote solutions right now. I think worktrees make the local stuff tricky but hopefully Coasts can help you out.
Let me know how it goes!
dbla 1 days ago [-]
This looks really cool and I've definitely been feeling this pain. I've been building out a solution for myself on top of docker. What are the advantages of using coasts over docker?
jsunderland323 1 days ago [-]
Hey thanks! To be clear it does use docker. It's a docker-in-docker solution.
I think there are quite a few things:
1) You need a control plane to manage the host-side ports. Docker alone cannot do that, so you're either going to write a docker-compose for your development environment where you hard code dynamic ports into a special docker-compose or you're going to end up writing your own custom control plane.
2) You can preserve your regular Docker setup without needing to alter it around dynamic ports and parallelized runtimes. I like this a lot because I want to know that my docker-compose is an approximation of production.
3) Docker basically leaves you with one type of strategy... docker compose up and docker compose down. With coasts you can decide on different strategies when you switch worktrees on a per service basis.
4) This is sort of back to point 2, but more often than not you want to do things like have some shared services or volumes across parallelized runtimes, Coasts makes that trivial (You can also have multiple coast configs so you can easily create a coast type that has isolated volumes). If you go the pure docker route, you are going to end up having multiple docker-composes for different scenarios that are easily abstracted by coasts.
5) The UI you get out of the box for keeping track of your assigned worktrees is super useful.
6) There's a lot of built in optimizations around switching worktrees in the inner bind mount that you'll have to manually code up yourself.
7) I think the ergonomics are just way better. I know that's kind of a vibesey answer but it was sort of the impetus for making Coasts in the first place.
8) There's a lot of stuff around secrets management that I think Coasts does particularly well but can get cumbersome if you're hand-rolling a docker solution.
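To illustrate point 1, here is a rough sketch of what the smallest possible host-side port control plane might look like if you hand-rolled it. This is not how Coasts implements it; the class, the port range, and the instance names are all hypothetical.

```python
class PortRegistry:
    """Tracks which host port each (instance, service) pair has been given."""

    def __init__(self, start=20000, end=20100):
        self._free = list(range(start, end))  # host ports still available
        self._assigned = {}                   # (instance, service) -> port

    def assign(self, instance, service):
        """Return a stable host port for this service in this instance."""
        key = (instance, service)
        if key not in self._assigned:
            if not self._free:
                raise RuntimeError("port range exhausted")
            self._assigned[key] = self._free.pop(0)
        return self._assigned[key]

    def release(self, instance):
        """Free every port held by an instance, e.g. when its worktree is removed."""
        for key in [k for k in self._assigned if k[0] == instance]:
            self._free.append(self._assigned.pop(key))


if __name__ == "__main__":
    registry = PortRegistry()
    for instance in ("worktree-a", "worktree-b"):
        print(instance, "web ->", registry.assign(instance, "web"))
```

Even this toy version needs persistence, cleanup, and collision checks against ports already in use on the host before it is usable, which is the hand-rolled work the comment is pointing at.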
dbla 18 hours ago [-]
Thank you for the detailed info! I will check it out
lukebaze 13 hours ago [-]
[dead]
magic_hamster 1 days ago [-]
> docker-in-docker solution
Goodbye Mac users.
jsunderland323 1 days ago [-]
Why do you say that?
It works fine on mac (that's what we developed it on) and it's not nearly as much overhead as I was initially expecting. There's probably some added latency from virtual box but it hasn't been noticeable in our usage.
jsunderland323 1 days ago [-]
HN questions we know are coming our way:
1) Could you run an agent in the coast?
You could... sort of. We started out with this in mind. We wanted to get Claude Max plans to work so we built a way to inject OAuth secrets from the host into the containerized host... unfortunately because the Coast runtime doesn't match the host machine the OAuth token is created on, Anthropic rapidly invalidates the OAuth tokens. This would really only work for TUIs/CLIs and you'd almost certainly have to bring a usage key (at least for Anthropic). You would also need to figure out how to get a browser runtime into the containerized host if you wanted things like playwright to work for your agent.
There's so many good host-side solutions for sandboxing. Coasts is not a sandboxing tool and we don't try to be. We should play well with all host-side sandboxing solutions though.
2) Why DinD and why not mount namespaces with unshare / nsenter?
Yes, DinD is heavy. A core principle of our design was to run the user's docker-compose unmodified. We wanted the full docker api inside the running containerized host. Raw mount namespaces can't provide image caches, network namespaces, and build layers without running against the host daemon or reimplementing Docker itself.
In practice, I've seen about 200 MB of overhead with each containerized host running DinD. We have a Podman runtime in the works, which may cut that down some. But the bulk of utilization comes from the services you're running and how you decide to optimize your containerized hosts and docker stack. We have a concept of "shared-services". For example, if you don't need isolated postgres or redis, you can declare those services as shared in your Coastfile, and they'll run once on the host Docker daemon instead of being duplicated inside each containerized host. Coasts will route to them.
adrq 18 hours ago [-]
How reliably do agents stick to the 'coast exec' boundary in practice? Especially when they spawn subagents that may or may not inherit the instructions.
jsunderland323 18 hours ago [-]
Actually pretty reliably but you do need to explicitly call out the skill. I usually start agent threads with /coasts or in codex $coasts. Once it’s in the conversation they stick to it though.
One cool thing we do is we have the docs and semantic search of our docs baked into the CLI, so if the agents get lost they can usually figure things out kind of quickly by searching the docs via the cli.
Also we have a little section in our agent.md and claude.md; I’m not sure how well it works without that.
Pakvothe 23 hours ago [-]
This is interesting for MCP server deployment. Right now most MCP servers run as local stdio processes. Containerizing them would solve the security and isolation concerns that come up every time someone installs a third-party MCP server.
Would love to see this support stdio-to-HTTP bridging so local MCP servers can be exposed as remote ones without rewriting them.
jsunderland323 23 hours ago [-]
There are a couple of ways you can go about MCP within coasts (it also depends on what the MCP does). You can install the MCP service host-side (something like playwright), in which case everything should just work out of the box for you.
Alternatively, you can set up the Coast to install MCP services in the containers. There are some cases around specific logging or db MCPs where this might make sense.
>Would love to see this support stdio-to-HTTP bridging so local MCP servers can be exposed as remote ones without rewriting them.
Are you saying if you exposed the MCP service in the Coast and hosted it remotely you could expose back the MCP service remotely? That's actually a sort of interesting idea. Right now, the agents basically need to exec the mcp calls if they are running host-side and need to call an inner mcp. I hadn't considered the case of proxying stdio to HTTP. I'll think about how best to implement that!
cyanydeez 23 hours ago [-]
Isn't the primary security concern with third-party MCP servers the actual injected context and not whatever sandbox the MCP server is in? It doesn't really matter if the MCP can't do something to its host; it's that it can manipulate the context to whatever ends it deems fit, which then is intractable in whatever LLM is calling it.
I'm really struggling to understand what people's security concepts are with LLMs.
n1tro_lab 11 hours ago [-]
Containerization protects the host. It doesn't protect the model. A sandboxed MCP server can still return data that manipulates the agent into misusing tools it legitimately has access to. Different threat, different layer.
sReinwald 19 hours ago [-]
Third-party MCP servers create at least two different security problems. One is prompt/context injection through the tool output. The other is the much more conventional risk of executing untrusted code with transitive dependencies on your machine (which is how the recent litellm compromise was discovered).
Containerization only helps with the second one, not the first, but that still matters. If you’re going to run random third-party MCP servers, isolating them from your host and any sensitive local data is still an obvious improvement over no isolation.
TZubiri 20 hours ago [-]
There's this naïve approach to security that obsesses with building walls, because walls are secure and nothing gets through.
Apparently a lot of people get nerd sniped into building impenetrable 10-meter-thick steel walls instead of thinking about the doors and the windows.
smcleod 1 days ago [-]
Does it support native macOS containers?
jsunderland323 1 days ago [-]
It does not. It works through Docker Desktop, Orb Stack, or Colima on macOS.
mike_d 22 hours ago [-]
Just FYI you might want to reconsider your branding. Using the term "Coast Guard" in pretty much any capacity without written authorization is a felony.
jsunderland323 22 hours ago [-]
Interesting, I was not aware.
Well fortunately it's the name of a local observability ui and not the actual product. We'll change it if it becomes a problem.
oskapt 19 hours ago [-]
Super not true. Unless they're actively _impersonating_ a Coast Guard officer and acting overtly in that purported role, there's no crime. Simply having a thing called "coast guard" doesn't run afoul of anything. (18 USC §§ 912/913).
TheProductAgent 8 hours ago [-]
[dead]
aplomb1026 22 hours ago [-]
[dead]
thestack_ai 16 hours ago [-]
[dead]
edinetdb 20 hours ago [-]
[dead]
MeetRickAI 1 days ago [-]
[flagged]
microbuilderco 13 hours ago [-]
[dead]
imta71770 1 days ago [-]
[flagged]
syntheticmind 18 hours ago [-]
[dead]
syntheticmind 1 days ago [-]
[flagged]
Copperline-Labs 1 days ago [-]
[flagged]
jsunderland323 1 days ago [-]
So technically you could use Coasts to sandbox but our default approach is actually not sandboxed at all. The agents still run host-side so unless you're sandboxing the agent host-side, you're not sandboxed. With coasts you're basically running exec commands against the coast container to extract runtime information.
>One thing I've been thinking about with agent infrastructure: the auth model gets complex fast when agents need to call external APIs on behalf of users. Per-key rate limiting and usage tracking at the edge (rather than in the container) has worked well for me. Curious how you’re handling the credential passing to containerized agents.
The way we handle secrets is that at build-time we allow you to run scripts that can extract secrets and env vars host-side. The secrets get stored in a sqlite table (not baked into the coast image). When you start a coast, it injects those secrets. You can decide how the secrets should appear: either as env vars, or written to the write layer. You're then able to trigger a re-injection of the secrets, so you can extract all the secrets again host-side and have them injected into all running coasts. This is useful because you don't have to rebuild and re-run just to update secrets.
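A rough sketch of that extract, store, and inject flow using only the Python standard library. The table schema, the `mode` column, and the function names are assumptions for illustration and do not reflect Coasts' real storage layout.

```python
import sqlite3
import tempfile
from pathlib import Path


def store_secrets(conn, secrets):
    """Save (name, value, mode) rows; mode is 'env' or 'file' (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS secrets "
        "(name TEXT PRIMARY KEY, value TEXT, mode TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO secrets (name, value, mode) VALUES (?, ?, ?)",
        secrets,
    )
    conn.commit()


def inject(conn, file_root):
    """Return env-style secrets as a dict and write file-style secrets under file_root."""
    env = {}
    rows = conn.execute("SELECT name, value, mode FROM secrets").fetchall()
    for name, value, mode in rows:
        if mode == "env":
            env[name] = value
        else:
            (Path(file_root) / name).write_text(value)
    return env


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    with tempfile.TemporaryDirectory() as root:
        store_secrets(conn, [("API_KEY", "abc", "env"), ("tls.pem", "cert", "file")])
        print(inject(conn, root))
        # Re-extraction: update the stored value and inject again, no rebuild needed.
        store_secrets(conn, [("API_KEY", "rotated", "env")])
        print(inject(conn, root))
```

Keeping the secrets in a table outside the image is what makes the second `inject` call possible without rebuilding anything.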
wokgr3t4 19 hours ago [-]
[dead]
Sim-In-Silico 1 days ago [-]
[flagged]
jsunderland323 1 days ago [-]
>One thing I'm curious about: how do you handle state drift when agents are working on the same service across different worktrees? For example, if two agents are both making schema changes to a shared database service, do you have any coordination primitives, or is that left to the orchestration layer above? In my experience the runtime isolation is the easy part - the hard part is when agents need to share state (like a test database) without stepping on each other.
Great question! You can configure multiple coasts, so you could have a coast running with isolated dbs/state and also a shared version (you can either share the volume amongst the running coasts or move your db to run host-side as a singleton). So it's sort of left to the orchestration layer: you put rules in your md file about when to use each. There are trade-offs to each scenario. I've been using isolated dbs for integration tests, but then for UI things I end up going with shared services.
>Re: For example, if two agents are both making schema changes to a shared database service
Obviously things can still go wrong here in the shared scenario, but it's worked fine for us and I haven't hit anything so far. It's just like having developers introducing schema changes across feature branches.
>Also, the per-service strategy config (none/hot/restart/rebuild) seems like the right abstraction. Most of the overhead in switching worktrees comes from unnecessary full restarts of services that don't actually care about the code change.
Totally, at first switching worktrees for our 1m+ loc repo was like 2 minutes. Then we introduced the hot/none strategies and got it down to like 8s. This is by far one of the best features we have.