dev_to, March 14, 2026

Built a Visual Workbench Because Managing Claude Code Skills Was Driving Me Crazy


It started with a folder full of markdown files.

I'd been using Claude Code daily for months. It became my go-to coding partner pretty quickly. Early on, I discovered Skills: markdown files with YAML frontmatter that you drop into ~/.claude/skills/ to teach Claude how you want things done. Write a SKILL.md, describe when it should trigger, add your instructions, and Claude suddenly knows your deployment pipeline, your coding standards, your project's weird quirks.

I went deep. I wrote skills for everything. Code review guidelines. Database migration patterns. Component scaffolding. API endpoint boilerplate. Test generation strategies. Each one made Claude Code sharper, more tuned to how I actually work.

Then the problems started. 😅

Five skills in a folder? Totally fine. Thirty skills spread across multiple projects, each with slightly different versions? Absolute nightmare.

I kept hitting the same walls. I'd edit a skill's YAML frontmatter, deploy it, then discover a typo broke the trigger pattern. No validation anywhere. I'd copy a skill between projects, tweak it, then completely forget which version was current. No version history. I wanted to test whether a skill actually produced the output I expected before shipping it. No testing sandbox.

Sharing was the worst part. A teammate would ask for my code review skill. I'd send the file over. They'd ask which model it was tuned for. I couldn't remember. They'd deploy it, get so-so results, and write off skills entirely.

I was spending more time managing skills than writing code. I was using an AI agent to be more productive, but the tooling around that agent kept dragging me back. 🙃

So I did what any developer does when the tooling falls short: built my own.

The idea was straightforward. A visual editor where I can see YAML frontmatter and markdown instructions side by side. Real-time validation so I catch errors before deployment.
A way to test a skill against actual models with streaming responses, so I can tweak the instructions until the output matches what I want. And when it's ready, one-click deploy instead of manually copying files around.

That project became uberSKILLS ⚡ an open-source visual workbench for designing, testing, and deploying agent skills.

The first version was rough. A Next.js app with a basic editor and a deploy button that wrote files to ~/.claude/skills/. But even that bare-bones version saved me hours. No more YAML syntax errors. No more blind deployments. No more wondering if a skill would actually work.

This is where things got interesting.

While I was building uberSKILLS for Claude Code, the agent ecosystem blew up. Cursor shipped their rules system. GitHub Copilot added custom instructions. Windsurf launched with its own skill format. Gemini CLI showed up with agent configuration. Codex, OpenCode, Antigravity... suddenly there were eight major code agents, all supporting some form of persistent instructions.

The problem I'd solved for Claude Code? It existed everywhere. Every agent had its own directory structure, its own conventions, its own deployment path. Developers using multiple agents were maintaining duplicate sets of instructions with zero shared tooling. 😩

So uberSKILLS grew. Today it deploys to eight agents 🎯:

- Claude Code
- Cursor
- GitHub Copilot
- Windsurf
- Gemini CLI
- Codex
- OpenCode
- Antigravity

Write your skill once, pick your targets, deploy everywhere. The skill format is standardized (YAML frontmatter for metadata and triggers, markdown body for instructions) and the engine handles translation to each agent's expected structure.

This matters more than it might sound. If you've spent time crafting a detailed code review prompt that works great with Claude, you should be able to use that same work with Copilot or Cursor without rewriting anything. Your prompt engineering expertise should be portable. 🔄

Three steps: create, test, deploy.

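
Before the three steps, it may help to make the standardized format described above concrete. The sketch below splits a skill file into YAML frontmatter and a markdown body, then reports problems the way a real-time validator might. It is only an illustration, not uberSKILLS' actual parser: the `parseSkill` helper, the `ParsedSkill` shape, and the choice of `name` and `description` as required fields are assumptions, and the flat `key: value` parsing stands in for a full YAML parser.

```typescript
// Hypothetical sketch: not uberSKILLS' real parser or schema.
interface ParsedSkill {
  metadata: Record<string, string>;
  body: string;
  errors: string[];
}

// Assumed required fields, loosely based on the metadata the article mentions.
const REQUIRED_FIELDS = ["name", "description"];

function parseSkill(source: string): ParsedSkill {
  const errors: string[] = [];
  const metadata: Record<string, string> = {};

  // Frontmatter is the block between a leading '---' line and the next '---' line.
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?([\s\S]*)$/.exec(source);
  if (match === null) {
    return { metadata, body: source, errors: ["missing YAML frontmatter block"] };
  }
  const rawMeta = match[1] ?? "";
  const body = match[2] ?? "";

  // Only flat "key: value" pairs are handled here; nested YAML is out of scope.
  rawMeta.split(/\r?\n/).forEach((line, index) => {
    const trimmed = line.trim();
    if (trimmed === "" || trimmed.startsWith("#")) return;
    const colon = line.indexOf(":");
    if (colon === -1) {
      errors.push(`line ${index + 1}: expected "key: value"`);
      return;
    }
    metadata[line.slice(0, colon).trim()] = line.slice(colon + 1).trim();
  });

  for (const field of REQUIRED_FIELDS) {
    if (!metadata[field]) errors.push(`missing required field: ${field}`);
  }
  return { metadata, body: body.trim(), errors };
}

// Usage: a tiny skill file checked before it is written anywhere.
const example = [
  "---",
  "name: code-review",
  "description: Review pull requests against team conventions",
  "---",
  "Check naming, error handling, and test coverage.",
].join("\n");

const result = parseSkill(example);
console.log(result.errors.length === 0 ? "valid" : result.errors);
```

Collecting errors in a list instead of throwing on the first one is a deliberate choice here: an editor can then show every problem at once while you type.
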
You can go manual with the structured editor and fill in metadata fields (name, description, trigger patterns, tags, model preferences). Or open the AI chat, describe what you want in plain language, and let it generate a complete skill for you. The AI creation flow has a live preview panel so you can watch the SKILL.md update as you refine your description through conversation.

This is where uberSKILLS really pays for itself. The multi-model sandbox lets you pick any model available through OpenRouter (Claude, GPT, Gemini, Llama, dozens more) and run your skill against it with streaming responses. You see output in real time, plus metrics: token counts, latency, time to first token. Tweak the instructions, test again, compare outputs across models, and actually feel confident a skill works before it touches a real project. Every test run gets saved too, so you can track how instruction changes affect output quality over time. 📊

One click. Pick your target agents from a dropdown, hit deploy, and uberSKILLS writes the files to the correct directory for each agent. Status updates to "deployed" so you can see at a glance what's live and what's still in draft.

Beyond those three steps: there's a skills library with search, status filtering, and sorting. Version history tracks every edit so you can roll back any revision. Import and export lets you pull skills from zip files or directories, and share them with your team. Settings panel covers API key management, theme preferences, and data backup. 📦

For the curious, here's the stack: Turborepo monorepo with pnpm. Next.js 15 on the App Router with React 19, shadcn/ui, and Tailwind CSS v4. SQLite through Drizzle ORM for the database, so no external database server needed. Everything runs locally. AI integration uses the Vercel AI SDK with the OpenRouter provider for multi-model support.

SQLite was a deliberate choice. uberSKILLS is local-first. Your skills, test history, API keys... all of it stays on your machine.
The API key gets encrypted with AES-256-GCM before storage. No cloud dependency, no account to create, no data leaving your laptop. 🔒

Getting started is one command:

npx @uberskillsdev/uberskills

It creates a ~/.uberskills/data/ directory, sets up the database, runs migrations, generates an encryption secret, and launches at localhost:3000. No Docker, no cloning, no configuration ceremony. ✨

I talk to developers every week who use Claude Code or Copilot and have never written a single skill. They're leaving a ton of productivity on the table.

A well-written skill turns a general-purpose agent into a specialist. Without skills, you repeat the same context in every conversation. With skills, that context loads automatically based on trigger patterns. Your agent already knows your database conventions, your error handling patterns, your test philosophy, your deployment checklist... before you type a word.

The developers getting the most out of code agents are the ones who invest time teaching them. Skills are how you do the teaching. uberSKILLS is how you manage all that teaching without going crazy. 🧠

uberSKILLS is open source under MIT and free forever. The roadmap includes a community skill marketplace where developers can share and discover skills, collaborative editing for teams, and deeper integrations as new agents keep showing up.

The agent ecosystem moves fast. New agents ship every month, existing ones pick up new capabilities every week. But one thing stays consistent: developers who customize their agents outperform those who don't. A proper workbench for that customization isn't optional anymore. It's infrastructure.

If you're still managing agent skills by hand-editing markdown files and copying them between directories, try uberSKILLS. Your future self, the one who isn't debugging YAML indentation at midnight, will appreciate it. 😄

GitHub: github.com/uberskillsdev/uberskills
npm: npx @uberskillsdev/uberskills
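
For anyone curious what the AES-256-GCM step mentioned earlier can look like, here is a minimal sketch using Node's built-in `crypto` module. It is not taken from the uberSKILLS codebase: the `encryptSecret` and `decryptSecret` helpers, the in-memory key, and the storage layout (IV, then auth tag, then ciphertext, base64-encoded) are all assumptions made for illustration.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Hypothetical helpers; the storage layout below is an assumption, not uberSKILLS' format.
const ALGORITHM = "aes-256-gcm";
const IV_BYTES = 12; // 96-bit nonce, the size commonly recommended for GCM
const TAG_BYTES = 16; // Node's default GCM authentication tag length

function encryptSecret(plaintext: string, key: Buffer): string {
  // A fresh random IV per message; reusing an IV with the same key breaks GCM.
  const iv = randomBytes(IV_BYTES);
  const cipher = createCipheriv(ALGORITHM, key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  // Store IV and auth tag alongside the ciphertext so decryption is self-contained.
  return Buffer.concat([iv, cipher.getAuthTag(), ciphertext]).toString("base64");
}

function decryptSecret(payload: string, key: Buffer): string {
  const raw = Buffer.from(payload, "base64");
  const iv = raw.subarray(0, IV_BYTES);
  const tag = raw.subarray(IV_BYTES, IV_BYTES + TAG_BYTES);
  const ciphertext = raw.subarray(IV_BYTES + TAG_BYTES);
  const decipher = createDecipheriv(ALGORITHM, key, iv);
  decipher.setAuthTag(tag);
  // final() throws if the tag does not verify (wrong key or tampered data).
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}

// Usage: a 32-byte key stands in for the generated encryption secret.
const key = randomBytes(32);
const stored = encryptSecret("sk-example-api-key", key);
console.log(decryptSecret(stored, key) === "sk-example-api-key");
```

GCM is an authenticated mode, so a wrong key or a modified payload fails loudly at decryption instead of yielding garbage, which is a useful property for a secret kept in a local database.
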