dev_to, April 25, 2026


The Open-Source Local Sandbox Agents, MCP Servers, and Unknown Apps Actually Need


nilbox sandbox ai-agents mcp security


There's a conversation developers keep having right now, and it's the same conversation in three different disguises.

- "How do I run this AI agent without it nuking my repo?"
- "How do I try this MCP server without handing a stranger's code my shell?"
- "How do I check out this random GitHub project without installing half of it into my laptop?"

Three questions, one answer. All three are untrusted code running as you, on your machine, with your files and your tokens. That's the same trust class. It deserves the same response: a local sandbox you can install in one click. Not a cloud API you pipe prompts into. Not a closed-source "trust us" runtime. A sandbox whose boundary is on disk, whose source is on GitHub, and whose kill switch is a window you close.

And as far as we can tell, nothing of that exact shape existed until now. nilbox is, to our knowledge, the first cross-platform GUI sandbox for AI agents, MCP servers, and untrusted apps: one installer for Windows, one for macOS, one for Linux, the same VM and the same boundary inside each. The source lives in the open on GitHub, so the boundary is something you can read rather than something you have to take on faith.

Agents, MCP servers, and unknown apps collapse to one problem: untrusted code running on your host as you. Cloud sandboxes don't fit desktop workflows; closed-source sandboxes don't fit the trust model you're trying to establish in the first place. The right shape is local + one-click: VM-grade isolation on your own machine, no round-trip to somebody else's cluster, with a readable source you can audit if you want to. nilbox ships that: to our knowledge, the first cross-platform GUI sandbox of this kind shipping real installers on Windows, macOS, and Linux. Debian-based VM, Zero Token boundary so the real API key never enters the sandbox, default-deny egress. Source is up at github.com/rednakta/nilbox for transparency.

Stop thinking of these as three separate problems.
They're one problem with three surfaces.

AI agents. The agent reads a web page, decides what to execute, and runs it. The "decision" is a language model's token stream. That means every external input (a README, a PDF, an HTML page, the output of a tool call) is a potential instruction. Prompt injection is not a rare exploit; it's how untrusted text is supposed to work against a model that was trained on "helpfully follow instructions." The agent is one injected sentence away from `cat ~/.ssh/id_rsa` or `curl -X POST` with your secrets.

MCP servers. MCP is great. It's also a protocol for letting an agent call code somebody else wrote and you didn't read. Two independent risks compound here:

- The MCP server itself is a binary you just ran. If it's hostile, it's already inside your agent's trust boundary the moment you start it.
- The responses an MCP server returns are text the agent will treat as tool output and often, downstream, as context for the next model call. A malicious response is a prompt injection carrier with extra steps.

So MCP isn't a safer category than agents. It's an amplifier: more third-party code paths, more injection surfaces, more tokens in play.

Unknown apps. The oldest version of the problem. The `curl | bash` install for a CLI you want to evaluate. The GitHub repo a coworker forwarded. The binary in a Slack DM. The npm package whose name you half-remember. You want to try it without installing it: without writing it into your `PATH`, your dotfiles, your keychain, your browser session.

The threat shape is the same across all three. Untrusted code, your credentials, your network, your home directory. Same trust class, same answer.

:::warning[The framing trap] and less to get wrong.

The "local" word is doing real work in the sentence above. Drop it and the sandbox stops being the right tool for desktop AI workloads, for reasons that have less to do with security and more to do with how a developer's environment actually works.

The dev environment is the work. Your editor, your terminal, your services on localhost, your shell aliases, your git checkout, the `node_modules` you spent eleven minutes resolving: that's what the agent should be touching. Cloud sandboxes ask you to ship a snapshot somewhere else, run the agent there, and reconcile the result back. A local sandbox just runs alongside you. The agent reads the repo you're already reading, edits the files you can see in your editor, and its work shows up as `git diff` lines you can review before they leave your branch.

Portability. A laptop is the developer's actual environment, full stop. The plane, the cafe, the captive-portal hotel wifi, the corporate VPN that won't let HTTPS out to certain hosts, the off-network box you log into from a different country: wherever the laptop goes, the work goes. A local sandbox goes with it. A cloud sandbox needs network reachability, an active account, and someone else's uptime.

Ownership of side-effects. When a local agent writes a file, the file is on your disk. When it edits a config, you `git diff` it before committing. When the experiment goes nowhere, you `git stash` and walk away. No remote session to clean up, no detached state on a server, no sync conflict between a cloud copy and a local copy. The agent's work is just work in your repo, treated like work you'd have done yourself.

The cloud-sandbox category has its place: hosted code interpreters, backend agent platforms, anything where the sandbox is part of a product you're shipping. That's not this post. This post is about the sandbox you need, sitting between your laptop and the things you don't trust yet.

(A small aside on the source side of things: nilbox's boundary proxy, VM image, and store manifest are all in a public GitHub repo. Not as a marketing pitch, just as transparency. If you're going to trust a security boundary, being able to read it beats taking someone's word for it.)

Four things.
If any of them is missing, the sandbox is incomplete.

1. Kernel-level isolation. Not just namespaces. A container escape is a host compromise, and LLM output is the exact kind of untrusted code that historically finds those bugs. VM-grade (hypervisor, microVM, whatever you want to call it) is the minimum.
2. Token leak prevention. The real API key must not enter the sandbox. If it does, prompt injection and malicious packages both win: the kernel boundary doesn't protect a credential the process is authorized to read.
3. Default-deny egress. The sandbox should reach the LLM provider you actually use and not much else. An agent that can POST anywhere on the internet is one tool call away from exfiltration, regardless of how isolated the process itself is.
4. Covers all three workloads. Agent loops, MCP servers, and ad-hoc unknown apps have to run in the same environment, under the same boundary. If MCP servers require their own isolation mechanism, you'll skip it.

A fifth, softer requirement: one-click install on the OS you actually use. Security tools nobody runs are not security tools. If installing the sandbox is a multi-evening adventure in WSL, Docker daemons, or hypervisor kernel modules, your teammates will just run the agent on the host and hope. Hope is not a threat model.

nilbox is built exactly for this shape: a local sandbox for agents, MCP servers, and unknown apps, with the source kept open in the same repo.

The sandbox itself is a Debian-based VM called Linux for nilbox. One-click install on macOS, Windows, and Linux: no WSL gymnastics, no Docker daemon, no "please enable virtualization in your BIOS" side-quest. The desktop app handles hypervisor setup, disk provisioning, and the shell handoff. When the window is open, the sandbox is running; when it's closed, it isn't.

To our knowledge this is the first sandbox of this shape that ships a real desktop GUI on all three platforms rather than an API or a CLI.
Docker has a desktop app but isn't kernel-isolated; VMware and VirtualBox are cross-platform but not purpose-built for agents; cloud sandbox APIs are purpose-built but neither local nor GUI. Source is up at github.com/rednakta/nilbox if you'd rather read the boundary than take our word for it.

Zero Token Architecture is the second layer. The agent inside the sandbox never sees the real API key. You hand it a placeholder (literally `OPEN_API_TOKEN=OPEN_API_TOKEN`) and a boundary proxy substitutes the real token outside the sandbox, right before the outbound call leaves your machine.

If the sandbox leaks its environment (prompt injection, a malicious dependency, a curious `env` tool call), what escapes is a string that equals its own variable name. You can't call an LLM with it, you can't charge anybody's account with it, you can't even prove which vendor it was for. The full argument lives in the Zero Token Architecture write-up.

MCP servers run inside the same sandbox as the agent. That's the whole point of picking a single boundary: MCP isn't a separate trust domain, it's more code in the already-untrusted pile. When the agent talks to the MCP server, both are inside Linux for nilbox; when either talks to the outside world, both hit the same boundary proxy and the same egress policy.

Unknown apps work the same way. Install the app into the sandbox via the store or a shell session. Try it, poke it, let it install things in its own home directory. If it turns out to be hostile, the blast radius is a Debian VM on a disk image you can delete. Your host `~/.ssh`, your keychain, your browser cookies: never in scope.

That's the full picture: one VM, one boundary, three workloads.

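To make the boundary rules above more concrete, here is a minimal Python sketch of what a host-side proxy could do with each outbound request: refuse any host that isn't allow-listed, then swap the placeholder for the real key. This is an illustration under stated assumptions, not nilbox's actual code. `ALLOWED_HOSTS`, `EgressDenied`, `rewrite_outbound`, the `api.example-llm.test` host, and the `sk-real-secret` key are all hypothetical names invented for the example; only the `OPEN_API_TOKEN` placeholder comes from the text.

```python
from urllib.parse import urlsplit

# Hypothetical values for illustration; not nilbox's real configuration.
ALLOWED_HOSTS = {"api.example-llm.test"}
PLACEHOLDER = "OPEN_API_TOKEN"  # the only "key" the sandbox ever sees


class EgressDenied(Exception):
    """Raised when a request targets a host outside the allow-list."""


def rewrite_outbound(url, headers, real_token):
    """Return the headers to send from the host side of the boundary.

    Default-deny: a host that is not explicitly allow-listed is refused
    before any credential is touched.
    """
    host = urlsplit(url).hostname
    if host not in ALLOWED_HOSTS:
        raise EgressDenied(f"egress to {host!r} is not allow-listed")
    # Substitute the real key only here, outside the sandbox.
    return {
        name: value.replace(PLACEHOLDER, real_token)
        for name, value in headers.items()
    }


# What the agent inside the sandbox builds: a header holding the placeholder.
sandbox_headers = {"Authorization": f"Bearer {PLACEHOLDER}"}

# What leaves the machine after the proxy has rewritten it.
sent = rewrite_outbound(
    "https://api.example-llm.test/v1/chat", sandbox_headers, "sk-real-secret"
)
```

In this sketch the sandbox-side dictionary is never mutated, so anything that dumps the agent's environment or headers still sees only the placeholder. A request to any other host raises `EgressDenied` before the real key is read, which is the ordering that matters: the allow-list check comes first, the substitution second.
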
| | Kernel isolation | Token leak prevention | Egress allow-list | Fits agent + MCP + unknown app | One-click desktop GUI |
|---|---|---|---|---|---|
| Raw VM | ✓ | ✗ | ✗ (manual) | ✓ | ✗ |
| Docker container | Partial | ✗ | ✗ (manual) | Mostly | ✗ |
| Cloud sandbox API | ✓ | ✗ | Varies | Agent-only, usually | ✗ |
| nilbox | ✓ (VM) | ✓ | ✓ | ✓ | ✓ |

If you're wondering how these four break down in more detail, the sandbox comparison post walks through each category and where it holds up.

One sandbox. Three workloads. Local, so it fits your desktop workflow and your file tree. Default-secure, so the agent inside doesn't have the real API key, can't POST to arbitrary hosts, and can't reach out of the VM into your home directory. The source sits on GitHub if you ever want to verify any of that for yourself.

If you've been running agents, MCP servers, or sketchy binaries directly on your host because the "real" solution felt like too much setup: this is the setup. It's a window you open on your laptop.

github.com/rednakta/nilbox