The Zero-Trust Auth Stack Architects Nod At: Heimdall + OpenFGA

How the “gateway forwardAuth → external authorizer → internal JWT → thin services” pattern works, built with Heimdall as the PEP and OpenFGA as the Zanzibar-style PDP.

Updated Jun 17, 2026

TL;DR

  • There is a boringly correct way to do authorization in multi-tenant SaaS, and most serious shops converge on it: a gateway enforces, an external service decides, and downstream services stay thin and trusting.
  • In Zero-Trust language (NIST SP 800-207) this is the PEP/PDP split: the Policy Enforcement Point is the muscle at the door, the Policy Decision Point is the brain that returns allow/deny.
  • The flow is: API gateway forwardAuth → external authorizer → authorizer asks the decision engine → authorizer mints a short-lived internal JWT → thin downstream service trusts that JWT.
  • This article builds it with Heimdall as the enforcement pipeline (authenticate → authorize → enrich → mint token) and OpenFGA as the Zanzibar-style ReBAC decision engine.
  • Heimdall and OpenFGA are deliberate, modern picks. The most popular equivalents are Envoy/Istio + OPA and oauth2-proxy + SpiceDB. The shape of the architecture is identical; only the components swap.
  • The payoff: authorization logic lives in one place, downstream services stop carrying if user.role == ... spaghetti, and you get a clean audit trail and a fail-closed boundary.
  • This is a topology guide, not a permission-modeling guide. If you want to learn how to design OpenFGA models, see the companion article on OpenFGA for granular permissions.

What You Will Learn Here

  • The vocabulary architects use: Zero Trust, PEP vs PDP, ReBAC, Zanzibar
  • Why the “gateway → external authorizer → internal JWT → thin services” pattern keeps showing up
  • How Heimdall’s pipeline (authenticators, authorizers, contextualizers, finalizers) maps cleanly onto the PEP role
  • How OpenFGA acts as the PDP, and the difference between Check and ListObjects
  • A complete, runnable example: gateway, Heimdall, OpenFGA, a thin upstream, and the rules that tie them together
  • Why downstream services should trust a freshly minted internal token, not the original login token
  • Honest alternatives (Envoy/Istio + OPA, SpiceDB, oauth2-proxy) and when they fit better
  • The production concerns that separate a demo from a deployment

“Yep, That’s How You’re Supposed to Do It”

Here is a fun test. Describe this to any platform or security architect:

“The API gateway does a forwardAuth to an external authorizer. The authorizer validates the caller, asks a Zanzibar-style decision engine whether this action is allowed, and if so mints a short-lived internal JWT. The downstream services are thin and just trust that internal token.”

You will get a nod and some version of “yep, that’s how you’re supposed to do it.”

That reaction is not because the design is clever. It is because it is convergent. Google (BeyondCorp / Identity-Aware Proxy), Auth0/Okta (FGA), and most mature multi-tenant SaaS independently land on the same shape, because the alternatives age badly:

  • Auth logic copy-pasted into every service drifts out of sync.
  • Login tokens passed straight through give every service the user’s full blast radius.
  • Role checks buried in business code are impossible to audit or change centrally.

The rest of this article is about that convergent shape, and how to build a credible version of it with Heimdall and OpenFGA.

The Vocabulary, Briefly

You do not need a PhD in Zero Trust, but four terms unlock every architecture diagram in this space.

| Term | Plain meaning |
| --- | --- |
| Zero Trust | No request is trusted because of where it came from. Every request is authenticated and authorized. (NIST SP 800-207) |
| PEP (Policy Enforcement Point) | The gatekeeper in the data path. Intercepts every request, enforces the verdict. It does not decide. |
| PDP (Policy Decision Point) | The brain. Given subject + action + resource + context, returns allow / deny. It does not sit in the data path. |
| ReBAC (Relationship-Based Access Control) | Permissions derived from relationships (user:anne is reader of document:1234). The model popularized by Google’s Zanzibar paper. |

The single most important idea in NIST SP 800-207 is the separation of decision from enforcement. The thing that intercepts traffic (PEP) is not the thing that decides (PDP). That split is the entire point, because it lets you centralize policy while distributing enforcement.

        decision  ── PDP (the brain)   ──  "allow / deny"
                          ^
                          | asks
                          |
        enforcement ── PEP (the muscle) ── sits in the request path

Map this onto our tools:

  • Heimdall = PEP (plus a token-minting step we will get to).
  • OpenFGA = PDP (specifically a ReBAC decision engine).
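
The split can be sketched in a few lines of Python. This is a toy illustration, not how Heimdall or OpenFGA are implemented: the `Question`, `Pdp`, and `Pep` names are hypothetical, and the PDP here is just an in-memory set of tuples.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Tuple


@dataclass(frozen=True)
class Question:
    subject: str   # e.g. "user:anne"
    relation: str  # e.g. "reader"
    resource: str  # e.g. "document:1234"


class Pdp:
    """Hypothetical decision point: owns the facts, answers allow/deny."""

    def __init__(self, tuples: Iterable[Tuple[str, str, str]]):
        self._tuples = set(tuples)

    def decide(self, q: Question) -> bool:
        return (q.subject, q.relation, q.resource) in self._tuples


class Pep:
    """Hypothetical enforcement point: sits in the request path, never decides."""

    def __init__(self, decide: Callable[[Question], bool]):
        self._decide = decide

    def handle(self, q: Question, upstream: Callable[[], str]) -> Tuple[int, str]:
        if not self._decide(q):
            return 403, "forbidden"  # the upstream is never called
        return 200, upstream()


pdp = Pdp([("user:anne", "reader", "document:1234")])
pep = Pep(pdp.decide)
read = Question("user:anne", "reader", "document:1234")
delete = Question("user:anne", "owner", "document:1234")
print(pep.handle(read, lambda: "document body"))
print(pep.handle(delete, lambda: "deleted"))
```

The PEP only holds a reference to a `decide` callable, so the decision logic can change (or move to another process) without touching the enforcement code.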

The Reference Architecture

Here is the whole pattern in one diagram. Keep this picture in your head for the rest of the article.

                        (1) request + original token
   Client ───────────────────────────────────────────────► API Gateway
                                                            (Traefik / Caddy / NGINX / Envoy)
                                                                 │
                                                       (2) forwardAuth: "is this allowed?"
                                                                 ▼
   ┌──────────────────────────────────────────────────────────────────────────┐
   │  Heimdall  (PEP + token broker)                                            │
   │                                                                            │
   │   authenticate ──► authorize ──► (enrich) ──► finalize                     │
   │   verify login     ask the PDP    add context   MINT internal JWT          │
   │      token            │                                                    │
   └───────────────────────┼────────────────────────────────────────────────────┘
                           │ (3) Check: user:anne reader document:1234 ?
                           ▼
                   ┌───────────────┐
                   │   OpenFGA     │  (PDP / ReBAC decision engine)
                   │   model+tuples│  ──► { "allowed": true }
                   └───────────────┘

                  (4) 200 OK + Authorization: Bearer <internal JWT>
                                                                 ▼
                                                            API Gateway
                                                                 │
                  (5) original request, now carrying the internal JWT
                                                                 ▼
                                                      Thin Upstream Service
                                                      (just verifies the JWT signature)

Read it as five moves:

  1. The client hits the gateway with its original login token.
  2. The gateway pauses and asks Heimdall (forwardAuth) whether to allow the request.
  3. Heimdall validates the token, then asks OpenFGA the precise relationship question.
  4. If OpenFGA says yes, Heimdall mints a new, short-lived internal JWT and returns 200 OK with it.
  5. The gateway forwards the original request to the upstream, now carrying the internal JWT. The upstream only has to verify a signature.

The upstream service contains zero authorization logic. That is the whole win.
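
The gateway’s side of moves 2, 4, and 5 is a small contract, sketched below in Python. The `gateway`, `fake_authorizer`, and `fake_upstream` functions and the token strings are hypothetical stand-ins; real gateways implement this as a forwardAuth or ext_authz middleware rather than application code.

```python
from typing import Callable, Dict, Tuple

Request = dict  # {"method": ..., "path": ..., "headers": {...}}
AuthResult = Tuple[int, Dict[str, str], str]  # status, response headers, body


def gateway(
    request: Request,
    forward_auth: Callable[[Request], AuthResult],
    upstream: Callable[[Request], Tuple[int, str]],
) -> Tuple[int, str]:
    # Move 2: pause and ask the external authorizer.
    status, auth_headers, body = forward_auth(request)
    if not 200 <= status < 300:
        return status, body  # the upstream never sees the request
    # Moves 4 and 5: swap the login token for the internal one, then forward.
    headers = dict(request["headers"])
    headers["Authorization"] = auth_headers["Authorization"]
    return upstream({**request, "headers": headers})


def fake_authorizer(request: Request) -> AuthResult:
    # Stand-in for Heimdall: accepts one known login token.
    if request["headers"].get("Authorization") == "Bearer login-anne":
        return 200, {"Authorization": "Bearer internal-anne"}, ""
    return 401, {}, "unauthorized"


def fake_upstream(request: Request) -> Tuple[int, str]:
    return 200, "upstream saw " + request["headers"]["Authorization"]


ok = {"method": "GET", "path": "/document/1234",
      "headers": {"Authorization": "Bearer login-anne"}}
bad = {"method": "GET", "path": "/document/1234", "headers": {}}
print(gateway(ok, fake_authorizer, fake_upstream))
print(gateway(bad, fake_authorizer, fake_upstream))
```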

Why a Gateway Pipeline + External Authorizer

Before the tools, the why. Three forces push every growing system toward this design.

1. Stop scattering if statements. The first version of any app checks permissions inline:

if (user.role !== "admin" && doc.ownerId !== user.id) {
  throw new ForbiddenError();
}

Multiply that by 200 endpoints across 12 services and you have an unauditable mess. Centralizing the decision means one place to reason about, test, and change.

2. Fail closed at the edge. Zero Trust is deny by default: if the decision point is unreachable, the enforcement point must deny. A gateway-level PEP gives you exactly one choke point to guarantee that, instead of hoping every service got its fallback right.

3. Shrink the blast radius. If services receive the raw login token, a compromised or buggy service can replay the user’s full identity anywhere. If instead each service receives a narrow, short-lived internal token minted for this hop, the damage from any single service is bounded.

The pattern that satisfies all three is: enforce at a gateway, decide in a dedicated service, hand downstream services a fresh minimal token.
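
Fail-closed behaviour is easy to state and easy to get wrong, so here is a minimal Python sketch of the rule. The `is_allowed` helper and the `ask_pdp` callable are hypothetical; the only assumption about the PDP is that it answers with a mapping containing an `allowed` field, as the OpenFGA Check response does.

```python
from collections.abc import Mapping
from typing import Callable


def is_allowed(ask_pdp: Callable[[dict, float], object],
               question: dict,
               timeout_s: float = 0.2) -> bool:
    """Return True only on an explicit, well-formed allow from the PDP."""
    try:
        payload = ask_pdp(question, timeout_s)
    except Exception:
        # Timeout, connection error, bad JSON: all of them mean deny.
        return False
    # Anything other than the literal boolean True is treated as deny.
    return isinstance(payload, Mapping) and payload.get("allowed") is True


def healthy_pdp(question: dict, timeout_s: float) -> dict:
    return {"allowed": question["relation"] == "reader"}


def unreachable_pdp(question: dict, timeout_s: float) -> dict:
    raise TimeoutError("PDP did not answer in time")


q = {"user": "user:anne", "relation": "reader", "object": "document:1234"}
print(is_allowed(healthy_pdp, q))
print(is_allowed(unreachable_pdp, q))
```

The design choice is that "allow" is the only outcome that needs positive evidence; every other path, including ones nobody anticipated, falls through to deny.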

Heimdall as the PEP + Token Broker

Heimdall describes itself as a cloud-native Identity-Aware Proxy and Access Control Decision service. For our purposes it is the enforcement pipeline: it authenticates, authorizes, optionally enriches, and finalizes each request.

It runs in Decision Operation Mode (serve decision) when sitting behind another gateway. The gateway uses a forwardAuth-style middleware: if Heimdall returns a 2XX, the gateway forwards the original request to the upstream; otherwise it returns Heimdall’s error to the client.

Heimdall’s pipeline is built from four mechanism types, which line up beautifully with the Zero-Trust roles:

| Heimdall mechanism | Job | Role in the pattern |
| --- | --- | --- |
| Authenticator | Verify credentials, establish the subject (e.g. validate a JWT against a JWKS endpoint) | “Who is this?” |
| Authorizer | Decide whether the subject may do this (CEL expression, OPA, or OpenFGA) | Calls the PDP |
| Contextualizer | Enrich the subject with extra data from another service | Add context for the decision or the upstream |
| Finalizer | Produce the output for the upstream, typically a signed JWT | The token-broker step |

You define these once in a mechanisms catalogue, then reference them per route in rules. A trimmed catalogue:

mechanisms:
  authenticators:
    - id: jwt_auth
      type: jwt
      config:
        jwks_endpoint: http://idp:8080/.well-known/jwks
        assertions:
          issuers:
            - demo_issuer

  authorizers:
    - id: openfga_check
      type: remote
      config:
        endpoint: http://openfga:8080/stores/{{ .Values.store_id }}/check
        payload: |
          {
            "authorization_model_id": "{{ .Values.model_id }}",
            "tuple_key": {
              "user": "user:{{ .Subject.ID }}",
              "relation": "{{ .Values.relation }}",
              "object": "{{ .Values.object }}"
            }
          }
        expressions:
          - expression: |
              Payload.allowed == true

  finalizers:
    - id: create_jwt
      type: jwt
      config:
        signer:
          secret:
            source: jwt_key_source
        ttl: 5m

Three things to notice:

  • The jwt authenticator turns the incoming login token into a verified Subject.
  • The remote authorizer is just an HTTP call to OpenFGA’s /check endpoint; the rule will fill in the relation and object. The decision is “is Payload.allowed == true?”
  • The jwt finalizer mints a brand-new token, signed with Heimdall’s own key, with a short TTL. This is the internal JWT. The upstream verifies it against Heimdall’s JWKS endpoint.

That last point is the architectural heart: Heimdall does not forward the user’s login token. It issues a new one scoped to this hop.
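
To make the minting step concrete, here is a standard-library Python sketch of what “issue a new, short-lived token” means. It is not Heimdall’s implementation: Heimdall signs with the asymmetric key from its key store and publishes it via JWKS, whereas this sketch assumes a shared HMAC key (HS256) purely to stay dependency-free. The `mint_internal_jwt` name and the demo key are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def b64url(data: bytes) -> str:
    # JWTs use URL-safe base64 without padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def mint_internal_jwt(subject: str, key: bytes, ttl_s: int = 300,
                      now: Optional[int] = None) -> str:
    issued_at = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {
        "iss": "heimdall",          # the upstream trusts exactly this issuer
        "sub": subject,             # carried over from the verified login token
        "iat": issued_at,
        "exp": issued_at + ttl_s,   # short-lived: five minutes by default
    }
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url(signature)


token = mint_internal_jwt("anne", b"demo-key-not-for-production")
print(token.count("."))  # a compact JWT has three dot-separated parts
```

Note that nothing from the login token is copied except the subject; the internal token has its own issuer, its own lifetime, and its own signature.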

OpenFGA as the PDP

OpenFGA is an open-source authorization engine inspired by Google’s Zanzibar. It stores an authorization model (the types and relations that are possible) plus relationship tuples (the facts), and answers permission questions.

For this pattern you care about two query shapes:

  • Check — “Does user:anne have relation reader on document:1234?” Returns { "allowed": true }. This is the call the authorizer makes for a single protected action.
  • ListObjects — “Which documents can user:anne read?” Returns a list. This is what a contextualizer uses to power list endpoints or to stuff the allowed set into the minted token.

A minimal model for a document API:

model
  schema 1.1

type user

type document
  relations
    define reader: [user]
    define writer: [user]
    define owner: [user]

And a fact:

user:anne   reader   document:1234

OpenFGA also speaks the AuthZEN standard (an OpenID working-group spec for authorization interoperability). That matters because it lets an AuthZEN-aware gateway treat OpenFGA as a drop-in PDP via /access/v1/evaluation. For Heimdall today, the native /check endpoint shown above is the straightforward path.

One PDP, two access patterns: Check guards a single action; ListObjects powers “what can I see?” Never use ListObjects results as the security boundary for a single action — re-Check on access.
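
The difference between the two query shapes, and the reason to re-Check, can be shown with a toy in-memory store in Python. `TupleStore` is a hypothetical stand-in for the PDP, not the OpenFGA SDK, and it ignores everything a real engine does beyond direct tuple lookup.

```python
from typing import Iterable, List, Tuple


class TupleStore:
    """Toy relationship store supporting the two query shapes."""

    def __init__(self, tuples: Iterable[Tuple[str, str, str]]):
        self._tuples = set(tuples)

    def check(self, user: str, relation: str, obj: str) -> bool:
        # Check: one precise yes/no question.
        return (user, relation, obj) in self._tuples

    def list_objects(self, user: str, relation: str, object_type: str) -> List[str]:
        # ListObjects: enumerate everything of a type the user relates to.
        prefix = object_type + ":"
        return sorted(o for (u, r, o) in self._tuples
                      if u == user and r == relation and o.startswith(prefix))

    def delete(self, user: str, relation: str, obj: str) -> None:
        self._tuples.discard((user, relation, obj))


store = TupleStore([
    ("user:anne", "reader", "document:1234"),
    ("user:anne", "reader", "document:7"),
])

visible = store.list_objects("user:anne", "reader", "document")  # e.g. cached in a token
store.delete("user:anne", "reader", "document:7")                # access revoked later

# The cached list is now stale, so the single-action guard must be a fresh Check.
print("document:7" in visible, store.check("user:anne", "reader", "document:7"))
```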

End-to-End: A Working Example

Let’s assemble the whole thing. The setup has four containers: a gateway-style proxy (here Heimdall itself runs in proxy mode for brevity, but the rule is identical behind Traefik/Caddy), an upstream echo service, OpenFGA, and a tiny IdP exposing a JWKS endpoint. This mirrors Heimdall’s official OpenFGA guide.

# docker-compose.yaml
services:
  heimdall:
    image: dadrus/heimdall:dev
    ports:
      - "9090:4456"
    volumes:
      - ./heimdall-config.yaml:/etc/heimdall/config.yaml:ro
      - ./rules:/etc/heimdall/rules:ro
      - ./signer.pem:/etc/heimdall/signer.pem:ro
    command: serve proxy -c /etc/heimdall/config.yaml --insecure

  upstream:
    image: traefik/whoami:latest
    command:
      - --port=8081

  openfga:
    image: openfga/openfga:latest
    command: run
    ports:
      - "8080:8080"

  idp:
    image: nginx:1.25.4
    volumes:
      - ./idp.nginx:/etc/nginx/nginx.conf:ro
      - ./jwks.json:/var/www/nginx/jwks.json:ro

The Heimdall config wires the mechanisms together and points the finalizer at a signing key:

# heimdall-config.yaml
secret_management:
  jwt_key_source:
    type: pem
    config:
      path: /etc/heimdall/signer.pem

mechanisms:
  authenticators:
    - id: jwt_auth
      type: jwt
      config:
        jwks_endpoint: http://idp:8080/.well-known/jwks
        assertions:
          issuers:
            - demo_issuer
  authorizers:
    - id: openfga_check
      type: remote
      config:
        endpoint: http://openfga:8080/stores/{{ .Values.store_id }}/check
        payload: |
          {
            "authorization_model_id": "{{ .Values.model_id }}",
            "tuple_key": {
              "user": "user:{{ .Subject.ID }}",
              "relation": "{{ .Values.relation }}",
              "object": "{{ .Values.object }}"
            }
          }
        expressions:
          - expression: |
              Payload.allowed == true
  finalizers:
    - id: create_jwt
      type: jwt
      config:
        signer:
          secret:
            source: jwt_key_source
        ttl: 5m

providers:
  file_system:
    src: /etc/heimdall/rules
    watch: true

Now the rule — the per-route policy that says which relation and object to check, depending on the HTTP method:

# rules/demo.yaml
version: "1beta1"
rules:
  - id: access_document
    match:
      routes:
        - path: /document/:id
      methods: [GET, POST, DELETE]
    forward_to:
      host: upstream:8081
    execute:
      - authenticator: jwt_auth
      - authorizer: openfga_check
        config:
          values:
            store_id: 01HSXG2XSZJMQG99EVXB4QQX8P
            model_id: 01HSXG7TBQEJ7GBPKQR2VYH24G
            relation: >
              {{- if eq .Request.Method "GET" -}} reader
              {{- else if eq .Request.Method "POST" -}} writer
              {{- else if eq .Request.Method "DELETE" -}} owner
              {{- else -}} unknown
              {{- end -}}
            object: >
              document:{{- .Request.URL.Captures.id -}}
      - finalizer: create_jwt

This compact rule is doing a lot:

  • The method maps to a relation: GET needs reader, POST needs writer, DELETE needs owner. This is policy expressed declaratively, not buried in service code.
  • The object is derived from the URL: document:1234.
  • On success, the finalizer mints the internal JWT and the request flows to upstream:8081.
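
For readers who find templates hard to read, the same mapping is sketched below as plain Python. It mirrors the rule’s logic only; `build_check_body`, `RELATION_BY_METHOD`, and the `model-1` identifier are hypothetical names for illustration, and the returned dictionary has the same shape as the Check payload in the catalogue.

```python
import re
from typing import Optional

# Same policy as the rule template: HTTP method -> required relation.
RELATION_BY_METHOD = {"GET": "reader", "POST": "writer", "DELETE": "owner"}

# Same route as the rule: /document/:id with a single path segment as the id.
ROUTE = re.compile(r"^/document/(?P<id>[^/]+)$")


def build_check_body(method: str, path: str, subject_id: str,
                     model_id: str) -> Optional[dict]:
    match = ROUTE.match(path)
    if match is None:
        return None  # the rule does not apply to this path
    return {
        "authorization_model_id": model_id,
        "tuple_key": {
            "user": f"user:{subject_id}",
            # Unmapped methods ask for a relation no one holds, so they are denied.
            "relation": RELATION_BY_METHOD.get(method, "unknown"),
            "object": f"document:{match.group('id')}",
        },
    }


print(build_check_body("DELETE", "/document/1234", "anne", "model-1"))
```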

Create the store, model, and a tuple so anne can read document 1234:

# 1. create the store
curl -X POST http://127.0.0.1:8080/stores \
  -H "content-type: application/json" \
  -d '{"name": "FGA Demo Store"}'

# 2. create the authorization model (returns an authorization_model_id)
curl -X POST http://127.0.0.1:8080/stores/<store_id>/authorization-models \
  -H "content-type: application/json" \
  -d '{"schema_version":"1.1","type_definitions":[
        {"type":"user"},
        {"type":"document","relations":{
          "reader":{"this":{}},"writer":{"this":{}},"owner":{"this":{}}},
         "metadata":{"relations":{
          "reader":{"directly_related_user_types":[{"type":"user"}]},
          "writer":{"directly_related_user_types":[{"type":"user"}]},
          "owner":{"directly_related_user_types":[{"type":"user"}]}}}}]}'

# 3. grant anne the reader relationship on document:1234
curl -X POST http://127.0.0.1:8080/stores/<store_id>/write \
  -H "content-type: application/json" \
  -d '{"authorization_model_id":"<model_id>",
       "writes":{"tuple_keys":[
         {"user":"user:anne","relation":"reader","object":"document:1234"}]}}'

Now a request from anne (a JWT with sub: anne) to GET /document/1234 succeeds:

curl -X GET -H "Authorization: Bearer <anne-login-jwt>" \
  127.0.0.1:9090/document/1234

The upstream echoes the request — and crucially, the Authorization header it receives is not Anne’s login token. It is a fresh JWT issued by Heimdall ("iss": "heimdall"), with sub: anne and a short expiry. A DELETE /document/1234 from Anne fails, because she is only a reader, not an owner. The decision happened in OpenFGA; the enforcement happened at Heimdall; the upstream never had to know any of it.

The Internal JWT Handshake (and Why It Matters)

This is the step people skip, and it is the most important one.

The naive version of a gateway just forwards the original login token to every service:

Client ──[login JWT]──► Gateway ──[same login JWT]──► Service A
                                 ──[same login JWT]──► Service B

The problem: every service now holds a credential with the user’s full authority. Service B, which only needs to render a profile, is now holding a token it could replay against Service A’s admin endpoints. That is the opposite of least privilege.

The correct version mints a new internal token at the boundary:

Client ──[login JWT]──► Gateway+Heimdall
                            │  verify login JWT
                            │  ask PDP
                            │  MINT internal JWT (iss: heimdall, ttl: 5m, sub: anne)

                        ──[internal JWT]──► Thin Service (verify signature, done)

Why this is better:

  • Short-lived. A 5-minute token that leaks is far less dangerous than a long-lived login session.
  • Scoped and shaped for the upstream. Heimdall can put exactly the claims the service needs (and the contextualizer can even embed the OpenFGA ListObjects result), and nothing more.
  • Decoupled identity providers. The upstream verifies one issuer — Heimdall — regardless of whether users log in via Okta, Auth0, Keycloak, or three of them at once. Swapping IdPs becomes a Heimdall config change, not a fleet-wide migration.
  • A real trust boundary. The upstream’s rule is dead simple: trust a valid signature from Heimdall’s JWKS, reject everything else.
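
The upstream’s “dead simple” rule still has three checks in it: signature, issuer, and expiry. The Python sketch below shows them using only the standard library. As an assumption made for brevity, it verifies an HS256 token with a shared key; a real upstream would verify an asymmetric signature against Heimdall’s JWKS, normally through a maintained JWT library. `verify_internal_jwt` and `sign_for_demo` are hypothetical names.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_for_demo(claims: dict, key: bytes) -> str:
    """Demo-only signer so the example has a token to verify."""
    head = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode("utf-8"))
    body = _b64url(json.dumps(claims).encode("utf-8"))
    sig = hmac.new(key, f"{head}.{body}".encode("ascii"), hashlib.sha256).digest()
    return f"{head}.{body}.{_b64url(sig)}"


def verify_internal_jwt(token: str, key: bytes, expected_issuer: str = "heimdall",
                        now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if the token is valid, otherwise None (reject)."""
    current = time.time() if now is None else now
    try:
        head_b64, body_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(head_b64))
        claims = json.loads(_b64url_decode(body_b64))
        signature = _b64url_decode(sig_b64)
    except (ValueError, TypeError):
        return None  # malformed token
    if not isinstance(header, dict) or not isinstance(claims, dict):
        return None
    if header.get("alg") != "HS256":
        return None  # never let the token choose the algorithm
    expected = hmac.new(key, f"{head_b64}.{body_b64}".encode("ascii"),
                        hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None  # bad signature
    if claims.get("iss") != expected_issuer:
        return None  # wrong issuer
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= current:
        return None  # missing or expired
    return claims


key = b"demo-key-not-for-production"
good = sign_for_demo({"iss": "heimdall", "sub": "anne", "exp": 2000}, key)
print(verify_internal_jwt(good, key, now=1000))
print(verify_internal_jwt(good, key, now=3000))
```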

This is not a Heimdall invention. It is exactly what Google’s Identity-Aware Proxy does with its x-goog-iap-jwt-assertion signed header: IAP strips client-supplied identity headers and injects a cryptographically signed assertion that the backend verifies. BeyondCorp’s whole premise is that the backend trusts the proxy’s signed claim, not the network. Heimdall + OpenFGA is the self-hostable version of that idea.

Honest Alternatives

Heimdall and OpenFGA are good, modern picks — but they are not the only way to draw this diagram. The architecture is what matters; the boxes are swappable. Here are the popular substitutes.

| Role | This article | Most popular alternative | Also seen |
| --- | --- | --- | --- |
| Gateway / PEP | Heimdall (behind any proxy) | Envoy / Istio ext_authz | NGINX, Traefik, Kong, oauth2-proxy |
| Decision engine / PDP | OpenFGA (ReBAC) | OPA (policy-as-code, Rego) | SpiceDB, Ory Keto, Cedar |
| ReBAC specifically | OpenFGA | SpiceDB (AuthZed) | Ory Keto |

A few notes so you can choose well:

  • Envoy/Istio + OPA is the default in heavy Kubernetes/service-mesh shops. Envoy’s ext_authz filter is the PEP, OPA is the PDP. OPA’s Rego is extremely flexible but is policy-as-code (attribute/rule logic), not relationship-native — modeling deep ReBAC graphs in Rego is possible but awkward.
  • OPA vs OpenFGA is the key fork: choose OPA when your decisions are attribute/condition logic (“is the request from a managed device during business hours?”); choose OpenFGA/SpiceDB when your decisions are relationship graphs (“can this user edit this doc inherited from this folder in this org?”). Many mature systems run both.
  • SpiceDB vs OpenFGA: both are Zanzibar-inspired engines. SpiceDB is developed by AuthZed; OpenFGA is a CNCF project that originated at Auth0/Okta. They are more alike than different; pick on ecosystem fit and operational preference.
  • oauth2-proxy is the lightweight crowd-favorite for “just put SSO in front of this app.” It handles authentication well but is not a full decision pipeline; it does not mint scoped internal tokens or call a ReBAC PDP the way Heimdall does.
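
The OPA-versus-OpenFGA fork is easier to see side by side. The Python sketch below contrasts an attribute-style decision with a relationship-style one that inherits through a folder and an org. It is a toy: `attribute_policy`, `has_relation`, `GRANTS`, and `PARENT` are hypothetical, and real engines express these in Rego and in an authorization model respectively.

```python
from typing import Dict, Optional, Set, Tuple


# Attribute-style decision (the kind of rule OPA is a natural fit for):
# the answer depends only on properties of the request context.
def attribute_policy(ctx: dict) -> bool:
    return bool(ctx.get("managed_device")) and 9 <= ctx.get("hour", -1) < 17


# Relationship-style decision (the kind of question ReBAC engines answer):
# the answer depends on walking a graph of stored relationships.
GRANTS: Set[Tuple[str, str, str]] = {("user:anne", "editor", "folder:reports")}
PARENT: Dict[str, str] = {
    "document:q3": "folder:reports",   # the document lives in the folder
    "folder:reports": "org:acme",      # the folder belongs to the org
}


def has_relation(user: str, relation: str, obj: str) -> bool:
    seen: Set[str] = set()
    current: Optional[str] = obj
    while current is not None and current not in seen:  # guard against cycles
        seen.add(current)
        if (user, relation, current) in GRANTS:
            return True
        current = PARENT.get(current)  # inherit from the parent, if any
    return False


print(attribute_policy({"managed_device": True, "hour": 10}))
print(has_relation("user:anne", "editor", "document:q3"))
```

Anne was never granted anything on `document:q3` directly; the allow comes from her relationship to the containing folder, which is the part that is awkward to express as pure attribute logic.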

The reason to reach for Heimdall + OpenFGA over the most-popular pairing is usually: you want the clean pipeline + token-minting model without adopting a whole service mesh, and you want relationship-native authorization from day one.

Production Concerns

The demo runs in a weekend. Production needs discipline at the seams where identity, decisions, and tokens meet.

| Concern | What to get right |
| --- | --- |
| Fail closed | If OpenFGA is unreachable, the authorizer must fail and the gateway must deny. Verify this explicitly; never let a PDP timeout become an implicit allow. |
| Latency | Every request now makes a decision call. Use Heimdall’s caching, OpenFGA’s Check performance, and keep the PDP close (same cluster/zone). Budget single-digit milliseconds. |
| Consistency | After a permission change (tuple write), reads may briefly be stale. For sensitive flows, use OpenFGA’s higher-consistency options and design for read-after-write where it matters. |
| Key management | Heimdall’s signing key is now a crown jewel. Rotate it, publish via JWKS so upstreams pick up new keys, and keep TTLs short so rotation is painless. |
| Audit | Log decisions at the PDP and enforcement at the PEP: who, what relation, what object, allowed/denied. This is your accountability trail. |
| Token TTL | Short enough to limit replay (minutes), long enough to avoid re-minting on every call. Cache minted tokens keyed on subject + claims. |
| Trusted proxies | If Heimdall sits behind Traefik/Caddy, configure trusted proxy IPs so it correctly reads X-Forwarded-*. Do not leave this at 0.0.0.0/0 in production. |
| Upstream verification | Thin does not mean naive: each upstream must still verify the internal JWT’s signature, issuer, and expiry. A disabled check is not a trust boundary. |
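
The “cache minted tokens keyed on subject + claims” advice can be sketched as a small TTL cache in Python. `TtlCache` is a hypothetical helper, not a Heimdall API, and the 240-second cache lifetime is an assumed value chosen to stay below a 5-minute token TTL so a cached token is never handed out after it has expired.

```python
import time
from typing import Callable, Dict, Hashable, Tuple


class TtlCache:
    """Tiny time-bounded cache; entries are re-created once they expire."""

    def __init__(self, ttl_s: float, clock: Callable[[], float] = time.monotonic):
        self._ttl_s = ttl_s
        self._clock = clock
        self._entries: Dict[Hashable, Tuple[float, str]] = {}

    def get_or_create(self, key: Hashable, factory: Callable[[], str]) -> str:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]                     # still fresh: reuse
        value = factory()                       # expired or missing: mint again
        self._entries[key] = (now + self._ttl_s, value)
        return value


minted = []


def mint() -> str:
    # Stand-in for the expensive signing step.
    minted.append(1)
    return f"token-{len(minted)}"


fake_now = [0.0]
cache = TtlCache(ttl_s=240, clock=lambda: fake_now[0])
# Key on subject plus the exact claim set, so different claims never share a token.
cache_key = ("anne", frozenset({"scope": "read"}.items()))

first = cache.get_or_create(cache_key, mint)
fake_now[0] = 100.0
second = cache.get_or_create(cache_key, mint)
fake_now[0] = 240.0
third = cache.get_or_create(cache_key, mint)
print(first, second, third)
```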

Gaps: What This Stack Does Not Solve

Be honest with yourself about the edges of the pattern.

  • It is not authentication. Heimdall validates tokens, but you still need an IdP (Okta, Auth0, Keycloak) to issue login tokens in the first place.
  • It does not enforce data boundaries. OpenFGA says whether an action is allowed; your database still needs tenant filters, row-level security, and narrow roles. Treat the PDP and the database as two independent locks (defense in depth).
  • ListObjects is not a content filter for huge tenants. At scale you still need search indexes and pagination; do not turn every page load into an enormous authorization query.
  • It does not model the permissions for you. Getting the OpenFGA model right (inheritance, groups, conditions) is its own discipline — covered in the companion modeling article.
  • It does not handle high-risk execution. “Allowed” is not “safe to auto-execute.” Money movement, production deploys, and data exports may still warrant human approval on top of the allow decision.
  • The gateway becomes a critical path. You have centralized enforcement, which is the goal — but now the gateway + PDP availability is your availability. Plan redundancy accordingly.
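
The “two independent locks” point deserves a concrete shape. In the Python sketch below, the data access function applies its own tenant filter even when the PDP has already said yes. The schema, the `load_document` function, and the tenant names are hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3
from typing import Optional


def load_document(conn: sqlite3.Connection, tenant_id: str, doc_id: str,
                  pdp_allowed: bool) -> Optional[str]:
    # Lock 1: the authorization decision made upstream of this call.
    if not pdp_allowed:
        return None
    # Lock 2: the query itself is scoped to the caller's tenant, so a wrong
    # or stale allow still cannot read another tenant's row.
    row = conn.execute(
        "SELECT body FROM documents WHERE id = ? AND tenant_id = ?",
        (doc_id, tenant_id),
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents (id TEXT PRIMARY KEY, tenant_id TEXT NOT NULL, body TEXT NOT NULL)"
)
conn.execute("INSERT INTO documents VALUES ('1234', 'acme', 'Q3 report')")

print(load_document(conn, "acme", "1234", pdp_allowed=True))
print(load_document(conn, "globex", "1234", pdp_allowed=True))
```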

Final Mental Model

If you remember one thing, make it the division of labor:

IdP / OAuth        proves who you are            (authentication)
Gateway (PEP)      intercepts and enforces       (Heimdall, in the path)
Decision engine    decides allow/deny            (OpenFGA, the PDP brain)
Token broker       mints a scoped internal JWT   (Heimdall finalizer)
Thin service       verifies signature, executes  (no auth logic)
Database / RLS     enforces the data blast radius (independent lock)
Audit log          explains the decision later   (accountability)

The pattern is not exciting, and that is exactly why architects approve of it. It puts authorization in one auditable place, hands each service the smallest credential it needs, and fails closed at a single, well-understood boundary. Heimdall and OpenFGA are a clean, self-hostable way to build it — and if you later swap in Envoy + OPA or oauth2-proxy + SpiceDB, the diagram on the wall does not change. That stability is the point.
