dev_to March 16, 2026


Unified Authentication for OAuth2 and API Keys via Edge Token Normalization

oauth2, api-keys, authentication, zero-interaction-automation, token-exchange


Recently, I was building a developer-facing API and ran into a problem I couldn’t find a clean answer to anywhere. I needed to support long-running, fully automated, user-delegated access with no browser and no human in the loop, and OAuth2 had no clear answer. I landed on implementing API keys alongside OAuth2, but that decision has real implications for the authentication architecture, and I wanted to share it to hopefully save others from taking this long journey.

OAuth2 is, by most measures, the best authorization framework we have at internet scale. It standardizes how applications handle authentication across client types, enables SSO across your platform, defines how public clients should behave, and lets your teams avoid implementing their own auth logic from scratch. Compared to what came before it, it is a major step forward. It was designed to be a framework, not a strict protocol. That flexibility was intentional, so every major identity provider could adapt it to their systems. This looseness left many things unspecified. The IETF OAuth working group has since produced more than 30 RFCs and extensions (PKCE, DPoP, and PAR among them) to fill the gaps, and the list continues to grow.

One of those gaps is non-interactive user-delegated access. There are existing options, but none quite fit. The client credentials flow is truly headless and works well for machine-to-machine scenarios. The identity it issues, though, belongs to the application, not the user. There is no user in the picture. If you need to know which person triggered an action, client credentials cannot tell you.

The Device Authorization Grant gets you closer. A user approves access once from a browser or secondary device, and from that point forward, the client can operate headlessly using refresh tokens. For fully unattended automation, it breaks down. Token rotation, expiry, and revocation mean a human eventually has to show up again. It is headless at runtime, but not headless forever.
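The rotation fragility is easier to see in code. Here is a deliberately toy sketch of an unattended client living on rotated refresh tokens; the names are invented, and real authorization servers may offer a reuse grace window, which narrows but does not close this gap:

```python
class RefreshChainBroken(Exception):
    """Only a human, re-running the interactive grant, can recover."""

def simulate_unattended_client(deliveries):
    """Sketch of a rotating-refresh-token loop. Each refresh returns a
    NEW refresh token and invalidates the old one; `deliveries` stands
    in for whether the token endpoint's response reached the client."""
    refresh_token = "rt-0"
    for i, delivered in enumerate(deliveries, start=1):
        rotated = f"rt-{i}"  # server rotates regardless of delivery
        if not delivered:
            # Network failure or crashed process: the client still holds
            # the old token, which the server just invalidated.
            raise RefreshChainBroken(f"lost rotation to {rotated}")
        refresh_token = rotated  # happy path: stay headless
    return refresh_token

print(simulate_unattended_client([True, True, True]))  # rt-3
```

As long as every rotation lands, the loop runs forever without a browser; a single lost response ends it.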
When a rotation event is missed due to a network failure or a crashed process, the entire refresh chain is invalidated, and a human has to re-authorize. For automation that needs to run unattended indefinitely, that is the specific failure mode that kills it.

The gap is the intersection: long-running, fully automated, user-delegated access with no human in the loop and no browser. That specific combination is what OAuth2 does not answer cleanly. Emerging token exchange patterns and agent-specific delegation models are moving in this direction, but the friction for true zero-interaction automation remains.

The gap is not theoretical. Nearly every major developer-facing platform maintains long-lived, user-scoped credential paths alongside its OAuth flows, not as legacy holdovers, but as deliberate choices for a class of access that OAuth alone does not cleanly serve. While RFC 8693 (OAuth 2.0 Token Exchange) defines mechanics that could support exchanging external JWTs or even custom subject_token types (such as API keys) for internal tokens, production identity providers in practice still require significant custom extensions, or outright separate paths, for reliable external JWT exchange and especially for API key exchange. Major implementations do not yet deliver this use case in a plug-and-play way without either fighting upstream limitations or reintroducing complexity downstream.

The challenge is not the API key itself. It is what happens to your authentication architecture once you introduce one. Two authentication schemes mean your API server has to handle both at the point of entry: validate a signed JWT from your IDP on one path, look up an API key on another. If your architecture stops there (one API server, no internal service calls, no plans to grow), this is not your problem. Close the tab, go ship something.
But if your services talk to other services, and those services need to know who is calling, keep reading, because this is where it gets interesting.

The problem is what happens next. Your API server has verified the request and knows who is making the call. Now it needs to talk to an internal service, and that service needs to know who is calling. You cannot pass the credential forward, because that service would need to implement the same dual validation, which would put you back in the same problem one layer deeper. So you extract the identity and pass it along in some form: a header, a forwarded value, whatever convention your stack has settled on.

Here is where the real cost shows up. You took a signed JWT with cryptographic guarantees and a verifiable chain of trust, and reduced it to a plain string that any service in your stack could have fabricated. The signature is gone. The guarantee is gone. The internal services receiving that request no longer operate on a verified credential; they operate on trust in your infrastructure. That is a meaningful step down in your security posture, and it compounds with every service hop in your call graph.

The natural next question is whether you can sidestep the dual authentication problem by issuing a single, long-lived credential. One credential type, one validation path, no normalization required. If the problem is having two schemes, eliminate one of them. It is a reasonable thought. The issue is what happens when you need to revoke access. An API key is just a database record: delete it or mark it invalid, and it stops working. A self-contained JWT with a long expiry is still cryptographically valid regardless of what you want. The only mitigation is a blocklist, an external store of invalidated tokens that all services have to check on every request.
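That blocklist mitigation fits in a few lines. In this sketch an in-process dict stands in for the shared store, which in practice would be Redis or similar so that every service sees the same revocation state:

```python
import time

# Stand-in for the shared external store that every service would
# have to consult on every request after signature validation.
REVOKED_JTIS: dict[str, float] = {}  # jti -> exp of the revoked token

def revoke(jti: str, token_exp: float) -> None:
    """Blocklist a token id until the token would have expired anyway."""
    REVOKED_JTIS[jti] = token_exp

def is_blocked(jti: str) -> bool:
    """Per-request check, run AFTER the JWT signature verifies."""
    exp = REVOKED_JTIS.get(jti)
    if exp is None:
        return False
    if exp < time.time():  # token expired on its own; prune the entry
        del REVOKED_JTIS[jti]
        return False
    return True

# A long-lived JWT stays cryptographically valid until exp, so
# revocation is only as fresh as this lookup, and the lookup is
# exactly what self-contained JWTs were supposed to let you skip.
revoke("jti-abc", time.time() + 3600)
print(is_blocked("jti-abc"))  # True
print(is_blocked("jti-xyz"))  # False
```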
That works, but you have reintroduced the database lookup you were trying to avoid with JWTs, plus the operational burden of keeping that list consistent across your stack. Most IDPs also cap access token lifetimes and do not support long-lived JWTs out of the box. What looked like a simplification turns out to be a harder problem than the one you started with.

Opaque tokens are another option worth addressing directly. Some IDPs issue them by default, and on the surface, they seem to sidestep the JWT revocation problem, since the token is just a reference that stays under the authorization server’s control. The trade-off is that every service that receives an opaque token must call the authorization server to validate it, typically via token introspection. That is a per-request network dependency on your IDP for every internal service hop, which adds latency, creates an availability coupling you do not want inside your service mesh, and scales poorly as your call graph grows. Opaque tokens are a reasonable choice at the edge. They are a poor fit for internal service-to-service communication.

The solution is normalization. Whatever credential arrives at the edge, an external access token or an API key, the gateway performs lightweight structural validation of the incoming credential: confirming the token type, format, and expiry. If that passes, it checks its cache. On a hit, it returns the cached internal token immediately. On a miss, it invokes the appropriate exchange flow on an internal OAuth2 authorization server, which handles identity resolution, token issuance, and claim normalization. Your downstream services never see the original credential. They receive a consistent internal token every time, from an issuer they trust, regardless of how the caller authenticated.

This sits in the critical path of every request, so performance is a legitimate concern. It can be addressed through caching. The internal token issued by the authorization layer is cached against the lifecycle of the incoming credential.
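The gateway's decision sequence can be sketched as follows. The names are illustrative, and `exchange_for_internal_token` stands in for the call to the internal OAuth2 authorization server; a real implementation would key the cache on a credential hash and bound each entry by the incoming credential's remaining lifetime, where here a flat TTL keeps the sketch short:

```python
import time

class TokenGateway:
    """Sketch of edge normalization: structural check, cache lookup,
    exchange on miss."""

    def __init__(self, exchange_fn, internal_ttl: float = 300.0):
        self._exchange = exchange_fn  # call to the internal auth server
        self._cache: dict[str, tuple[str, float]] = {}  # cred -> (token, exp)
        self._ttl = internal_ttl

    def _structurally_valid(self, credential: str) -> bool:
        # Lightweight checks only: scheme, shape, nothing cryptographic.
        scheme, _, value = credential.partition(" ")
        return scheme in ("Bearer", "Token") and bool(value)

    def resolve(self, credential: str) -> str:
        if not self._structurally_valid(credential):
            raise PermissionError("malformed credential")
        hit = self._cache.get(credential)
        if hit is not None and hit[1] > time.time():
            return hit[0]                      # cache hit: no exchange
        internal = self._exchange(credential)  # cache miss: one exchange
        self._cache[credential] = (internal, time.time() + self._ttl)
        return internal

calls = []
def exchange_for_internal_token(credential: str) -> str:
    calls.append(credential)  # pretend auth-server round-trip
    return f"internal-token-for:{credential.split(' ', 1)[1]}"

gw = TokenGateway(exchange_for_internal_token)
gw.resolve("Token ak_live_123")
gw.resolve("Token ak_live_123")  # served from cache
print(len(calls))  # 1: only the miss reached the auth server
```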
Layer one and layer two caches for token lookups mean real work happens only on cache misses. A cache miss adds the cost of a token exchange against a co-located auth server, which is negligible for most traffic patterns. After that, the cached internal token serves subsequent requests without any additional round-trips for the duration of its TTL. This compares favorably to the opaque token model, in which every internal service hop requires a round-trip back to the authorization server.

The internal token TTL is configurable and does not have to mirror the external credential’s remaining lifetime. Shorter windows tighten the revocation exposure at the cost of more frequent exchanges. Longer windows reduce the auth server load but increase the gap between a revocation event and enforcement. This is the same sliding scale you navigate when configuring access token lifetimes on any OAuth2 server. Neither end is wrong; it is an operational choice based on your threat model and traffic patterns.

Revocation handling depends on your implementation choices. Short TTLs on the cached internal token naturally limit exposure windows, down to the five-to-fifteen-minute range for high-security use cases. For faster enforcement, connect revocation events from your identity service to Redis pub/sub, key expiration notifications, or a lightweight revocation signal channel to trigger active cache invalidation. This keeps the pattern flexible: simple TTL-only for low-friction deployments, or event-driven invalidation for tighter security SLAs.

Claim enrichment is the other major benefit of this architecture. Because the authorization layer controls the internal token’s claim structure, you can normalize claim data across IDPs, add entitlements, unify user identifiers, or inject any context your internal services need. None of that is possible when credentials pass through unchanged. The deeper value is insulation.
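As a rough sketch of that exchange-time enrichment, where the IDP claim shapes, the entitlement lookup, and the internal claim names are all invented for illustration:

```python
def lookup_entitlements(user_id: str) -> list[str]:
    """Stand-in for an entitlements service or database."""
    return ["reports:read"] if user_id == "41" else []

def normalize_claims(idp: str, external: dict) -> dict:
    """Map IDP-specific claims into one internal claim contract."""
    if idp == "okta":  # hypothetical shape: user id in "uid"
        user_id, email = external["uid"], external["email"]
    elif idp == "auth0":  # hypothetical shape: "auth0|<id>" subjects
        user_id, email = external["sub"].removeprefix("auth0|"), external["email"]
    else:
        raise ValueError(f"unknown IDP: {idp}")
    return {
        "sub": f"user:{user_id}",  # unified identifier scheme
        "email": email,
        "entitlements": lookup_entitlements(user_id),  # enrichment
        "idp": idp,  # provenance, if internal services care
    }

internal_claims = normalize_claims("auth0", {"sub": "auth0|41",
                                             "email": "a@example.com"})
print(internal_claims["sub"])  # user:41
```

Downstream services consume `sub` and `entitlements` without ever knowing which provider shape the claims started in.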
The gateway establishes a hard boundary between two trust domains: the external world, with its IDPs, credential types, and lifecycle variability, and the internal world, operating on a stable token format that you own and control. Onboarding a second IDP or user pool gets absorbed at that boundary. Nothing downstream needs to be updated. Your internal services evolve against a contract you define, not against the shifting surface of your external identity providers. That separation of responsibility is what makes this pattern durable as systems grow.

To intercept traffic before it reaches your services, you need a proxy layer. Service meshes provide exactly that extension point. This pattern is an Envoy capability at its core: any service mesh that runs Envoy as its data plane supports the same ext_authz extension point natively. The implementation here uses Istio, but Consul is another example. If you are not running a service mesh at all, a standalone Envoy or NGINX deployment with an external auth filter works the same way.

Here is how the full flow works in practice with Istio as the proxy layer. Istio’s ext_authz filter intercepts inbound requests at the proxy layer and delegates the auth decision to an external service before forwarding to the destination. That service can approve, deny, or modify the request, including rewriting headers. The full flow:

1. A caller sends a request with either a bearer token or an API key.
2. Istio intercepts the request and forwards it to the token gateway via ext_authz (gRPC).
3. The gateway performs basic validation of the incoming credential, checking expiry and structure, then looks it up in cache.
4. If a cached internal token exists, the gateway returns it immediately with an Allow decision and rewrites the Authorization header.
5. On a cache miss, the gateway forwards the credential to the authorization layer for exchange, which issues a new internal token. The new token is cached against the lifecycle of the incoming credential, and the Authorization header is rewritten.
6. Istio forwards the modified request to the destination service.
7. The destination service validates the internal token against the authorization layer’s JWKS endpoint.

If basic validation fails at step 3, the gateway returns a Deny response. The request never reaches the destination.

The gateway handles two credential types:

- Authorization: Bearer, where the token is a JWT issued by an external IDP (Okta, Auth0, Keycloak, or any OIDC-compliant provider)
- Authorization: Token, where the value is an API key issued by the gateway itself, permanently tied to a user identity at creation time (X-API-Key is a common alternative; the header scheme is configurable)

Both arrive with different shapes and different trust models. Both leave as the same thing: a consistent internal token issued by the internal OAuth2 authorization server, validated by your services against its JWKS endpoint. How each credential type gets resolved is handled by the auth server through two dedicated exchange flows. Those flows are the subject of part two. API key issuance and the user-identity binding are handled by an internal identity service, which also provides the validation endpoint that the gateway calls at exchange time. That service is also covered in part two.

So what does a downstream service actually receive? A bearer token. From an internal issuer they trust. With claims they can use for their own authorization decisions. That is the whole contract. Whether the original caller authenticated via PKCE in a browser, via a CLI using stored credentials, via an API key in a script, or via an access token from a completely different IDP, every service in your stack sees the same thing every time. One token format. One trust relationship. The credential complexity lives at the boundary, and nowhere else.

A reference implementation is available at github.com/mberwanger/token-gateway.
It covers the ext_authz integration and the credential normalization layer described here. In that implementation, the gateway delegates token issuance to a purpose-built OAuth2 authorization server sitting behind it. That server is the subject of part two, where we will walk through building it with two custom grant types: one for external token exchange and one for API key exchange.
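Stripped of gRPC and protobuf, the contract the gateway implements toward the proxy reduces to a single check function: approve and rewrite, or deny. The sketch below is a simplified stand-in, not Envoy's actual CheckRequest/CheckResponse API, and `fake_resolver` takes the place of the cache-and-exchange logic described above:

```python
from dataclasses import dataclass

@dataclass
class Decision:
    """Simplified stand-in for an ext_authz check response: the real
    Envoy API speaks protobuf over gRPC."""
    allowed: bool
    headers: dict[str, str]  # header rewrites applied on Allow

def check(request_headers: dict[str, str], resolve_internal_token) -> Decision:
    """Approve, deny, or modify: rewrite Authorization on success."""
    credential = request_headers.get("authorization", "")
    try:
        internal = resolve_internal_token(credential)
    except PermissionError:
        return Decision(allowed=False, headers={})
    # The destination service only ever sees the internal token.
    return Decision(allowed=True,
                    headers={"authorization": f"Bearer {internal}"})

def fake_resolver(credential: str) -> str:
    """Stand-in for the gateway's validate/cache/exchange pipeline."""
    if credential == "Token ak_live_123":
        return "internal.jwt.value"
    raise PermissionError("rejected at the edge")

ok = check({"authorization": "Token ak_live_123"}, fake_resolver)
print(ok.headers["authorization"])  # Bearer internal.jwt.value
denied = check({}, fake_resolver)
print(denied.allowed)  # False
```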