Agentic JIT · Pod Identity

Kubernetes workload IAM.
without permanent SA-to-role bindings, without the one-hour cached-STS ghost credential window, scoped to the pod lifecycle, revoked on preStop

EKS Pod Identity, IRSA, GKE Workload Identity, AKS workload identity — all bind at the Service Account level, and the SDK in the pod caches the STS token for an hour. Every pod sharing the SA inherits the role. Even “removing” the binding leaves a ghost-credential window. Agentic JIT for Pod Identity flips it: register the SA → role binding as a Cloudanix Agent, the pod elevates at startup, the preStop hook revokes — and Cloudanix invalidates cached STS tokens so the window collapses to seconds.

✓ EKS Pod Identity · EKS IRSA · GKE Workload Identity · AKS workload identity · ECS Task Roles
✓ Deployment · DaemonSet · StatefulSet · ReplicaSet · Job · CronJob · standalone Pod
✓ SA-scoped guardrails · 1-SA-per-workload policy · pod-lifecycle audit
Register New Agent Pod Identity
Name payments-api-agent
Use case Build Workflow Cloud Workload Pod Identity Other
Cluster prod-eks-use1 · us-east-1
Namespace payments
Service Account payments-api
IAM Role / Policies S3ReadOnly KMSDecrypt
deployment.yaml · payments-api · lifecycle hooks
elevate at start · revoke on preStop
cloudanix-jit-k8s · helm chart v2.1
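spec:
  serviceAccountName: payments-api
  initContainers:
    - name: jit-init                  # container name is illustrative
      image: cloudanix/jit-init
      args: ['elevate', '--ttl', 'pod-lifecycle']
  containers:
    - name: payments-api              # your existing application container
      lifecycle:
        preStop:
          exec:
            command: ['cloudanix-jit', 'revoke']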
kubectl logs · payments-api-7c4b9 · live
pod payments-api-7c4b9 · sa payments-api
[INIT]    elevate · SA payments-api → PaymentsReadRole · ✓
[MAIN]    aws s3 ls s3://pmt-receipts/ · 142 keys
[PRESTOP] revoke · assoc deleted · cached STS killed
credentials live for the pod lifecycle · revoked in seconds, not after the 1-hour STS hangover
The problem

Workload identity is binding-at-SA-level. That's the quiet blast radius.

EKS Pod Identity is a real improvement over IRSA: no OIDC dance, simple trust policy, no static creds in the pod. But like IRSA, it binds at the Service Account level, and the SDK inside the pod caches its STS token for ~1 hour. So “forever access” and a “1-hour ghost-credential window after you remove the binding” are the operating defaults.

📦

Every pod sharing the SA inherits the role

Pod Identity / IRSA / Workload Identity all bind at the Service Account, not the pod. One shared app-sa in a busy namespace ends up wired to S3, KMS, RDS, Secrets Manager — permanently, for every pod that mounts it. New pod deployed? Inherits everything. Compromised image? Inherits everything.

The 1-hour STS hangover

When you delete a Pod Identity Association (or equivalent), the binding is gone — but the AWS SDK already cached its STS token inside the pod. That token is good for the rest of its TTL (typically ~1 hour) regardless of what happened upstream. Your “revoke” isn't real for an hour.
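The arithmetic behind that window can be sketched in a few lines. This is a minimal illustration, assuming a credential issued with a fixed TTL and a binding deleted at some later moment. The function name `ghost_window` and the sample timestamps are hypothetical and not part of any AWS or Cloudanix API.

```python
from datetime import datetime, timedelta, timezone


def ghost_window(issued_at: datetime, ttl: timedelta, binding_deleted_at: datetime) -> timedelta:
    """Return how long an already-issued credential stays usable after the binding is deleted.

    Assumption: the credential remains valid until issued_at + ttl regardless
    of what happens to the binding, which is the behaviour described above.
    """
    expires_at = issued_at + ttl
    remaining = expires_at - binding_deleted_at
    # A credential that has already expired leaves no window at all.
    return max(remaining, timedelta(0))


if __name__ == "__main__":
    issued = datetime(2024, 1, 1, 10, 0, tzinfo=timezone.utc)    # hypothetical issue time
    deleted = datetime(2024, 1, 1, 10, 12, tzinfo=timezone.utc)  # binding removed 12 minutes later
    print(ghost_window(issued, timedelta(hours=1), deleted))      # 0:48:00
```

With a one-hour TTL, a binding removed 12 minutes after the token was issued still leaves 48 minutes of usable credential.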

🤝

Multi-tenant SAs at scale

Best practice is 1 SA per workload. In practice, namespaces end up with default SAs that ten Deployments use, plus three CronJobs and an experimental Job from a dev branch. Nobody enforces the boundary. Roles grow accordingly.
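One way to see how far a namespace has drifted from 1-SA-per-workload is to group workloads by the Service Account they mount. The sketch below works on a hand-written inventory; the tuples, workload names and the function `shared_service_accounts` are illustrative assumptions, and in practice the inventory would come from your cluster's API.

```python
from collections import defaultdict

# Hypothetical inventory: (namespace, workload kind, workload name, service account).
WORKLOADS = [
    ("payments", "Deployment", "payments-api", "payments-api"),
    ("payments", "Deployment", "ledger", "default"),
    ("payments", "CronJob", "nightly-report", "default"),
    ("payments", "Job", "dev-experiment", "default"),
]


def shared_service_accounts(workloads):
    """Map (namespace, service account) to its workloads, keeping only SAs used by more than one."""
    by_sa = defaultdict(list)
    for namespace, kind, name, sa in workloads:
        by_sa[(namespace, sa)].append(f"{kind}/{name}")
    return {key: users for key, users in by_sa.items() if len(users) > 1}


if __name__ == "__main__":
    for (namespace, sa), users in shared_service_accounts(WORKLOADS).items():
        print(f"{namespace}/{sa} is shared by {len(users)} workloads: {', '.join(users)}")
```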

🧾

Audit ends at the SA name

CloudTrail says arn:aws:sts::...:assumed-role/PaymentsRole/payments-api deleted the bucket. Which Deployment? Which ReplicaSet? Which container image? Which restart attempt? The SA can't tell you. Auditors and SREs both want the chain.
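To make the gap concrete, here is a small parser for an assumed-role ARN. It shows that the identifier carries only an account, a role name and a session name, with nothing about the pod, image or restart. The account ID in the example is a placeholder, and the helper names are hypothetical.

```python
from typing import NamedTuple


class AssumedRole(NamedTuple):
    account_id: str
    role_name: str
    session_name: str


def parse_assumed_role_arn(arn: str) -> AssumedRole:
    """Split arn:<partition>:sts::<account>:assumed-role/<role>/<session> into its parts."""
    prefix, _, resource = arn.rpartition(":")
    parts = prefix.split(":")
    if len(parts) != 5 or parts[0] != "arn" or parts[2] != "sts":
        raise ValueError(f"not an STS ARN: {arn!r}")
    kind, _, rest = resource.partition("/")
    role_name, sep, session_name = rest.partition("/")
    if kind != "assumed-role" or not sep or not role_name or not session_name:
        raise ValueError(f"not an assumed-role ARN: {arn!r}")
    return AssumedRole(parts[4], role_name, session_name)


if __name__ == "__main__":
    # 123456789012 is a placeholder account ID, not taken from the text above.
    print(parse_assumed_role_arn("arn:aws:sts::123456789012:assumed-role/PaymentsRole/payments-api"))
```

Everything beyond those three fields has to come from a separate record that ties the session to a specific pod.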

How it works

Register the SA. Add two lifecycle hooks. Stop carrying the 1-hour STS hangover.

The integration surface is a Cloudanix init container and a one-line preStop hook. Your application image and your CI/CD chart are unchanged.

  1. Register the SA → role binding as an Agent

    In the Cloudanix console, pick use case Pod Identity. Provide cluster, namespace, Service Account, and the IAM Role / policies this agent is allowed to grant. The 1-SA-per-workload rule is enforced here — you can't reuse a registered SA across unrelated workloads by accident.

  2. Add the JIT init container to your pod spec

    One block in your Deployment / StatefulSet / Job manifest — or render it via the Cloudanix Helm chart. The init container calls elevate, waits for the Pod Identity Association to be live, and exits.

  3. Add a preStop hook that calls revoke

    lifecycle.preStop.exec.command: ['cloudanix-jit', 'revoke']. Kubernetes runs the preStop hook before it sends SIGTERM to the container, so revoke fires as termination begins, ahead of the application's own shutdown handling. Cloudanix deletes the Pod Identity Association and immediately invalidates the SDK's cached STS token, so the 1-hour ghost-credential window collapses to seconds.

  4. The application code is unchanged

    Your container keeps using boto3 / aws-sdk-go / azure-identity / google-cloud-storage the way it always did. Pod Identity, IRSA, GKE Workload Identity, AKS workload identity — whichever model your cluster uses — Cloudanix manages the binding beneath it.

  5. Audit links action → agent → pod → SA

    Every elevation logs cluster · namespace · SA · pod · image · node · workload type. CloudTrail entries thread back to a specific pod restart, not just the SA name. Lands in your S3 / Blob / GCS, kept as long as your retention policy says.
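The elevate-wait-revoke sequence from steps 2 and 3 can be sketched against an in-memory stand-in. Everything here is an assumption made for illustration: `FakeControlPlane`, `elevate` and `revoke` are hypothetical names, and the real init container and preStop command are not shown on this page.

```python
import time


class FakeControlPlane:
    """In-memory stand-in for whatever API creates and deletes the SA-to-role binding."""

    def __init__(self, polls_until_live: int = 2):
        self._bindings = {}  # (namespace, sa) -> polls remaining until the binding is live
        self._polls_until_live = polls_until_live

    def create_binding(self, namespace: str, sa: str) -> None:
        self._bindings[(namespace, sa)] = self._polls_until_live

    def is_live(self, namespace: str, sa: str) -> bool:
        remaining = self._bindings.get((namespace, sa))
        if remaining is None:
            return False
        if remaining > 0:
            # Simulate propagation delay: each poll brings the binding closer to live.
            self._bindings[(namespace, sa)] = remaining - 1
            return False
        return True

    def delete_binding(self, namespace: str, sa: str) -> None:
        self._bindings.pop((namespace, sa), None)


def elevate(cp, namespace: str, sa: str, timeout_s: float = 5.0, interval_s: float = 0.01) -> None:
    """Init-container step: create the binding, then block until it is live."""
    cp.create_binding(namespace, sa)
    deadline = time.monotonic() + timeout_s
    while not cp.is_live(namespace, sa):
        if time.monotonic() >= deadline:
            raise TimeoutError(f"binding for {namespace}/{sa} never became live")
        time.sleep(interval_s)


def revoke(cp, namespace: str, sa: str) -> None:
    """preStop step: remove the binding; safe to call more than once."""
    cp.delete_binding(namespace, sa)


if __name__ == "__main__":
    cp = FakeControlPlane()
    elevate(cp, "payments", "payments-api")
    print("live:", cp.is_live("payments", "payments-api"))
    revoke(cp, "payments", "payments-api")
    print("live:", cp.is_live("payments", "payments-api"))
```

Blocking in the init step means the application container never starts without credentials, and an idempotent revoke tolerates a hook that fires twice.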

Workload identity, every flavour

EKS, GKE, AKS, ECS — same lifecycle contract.

Each cluster runtime has its own binding model. The primitive we lay over the top (elevate · pod-lifecycle · revoke) is the same.
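One way to picture a shared primitive over different binding models is a small interface that each runtime implements. The sketch below is an assumption about shape only: `BindingBackend`, `RecordingBackend` and `run_workload` are hypothetical names, and the toy backend records calls instead of talking to any cloud.

```python
from typing import Protocol


class BindingBackend(Protocol):
    """What a runtime-specific backend would need to provide (hypothetical interface)."""

    def bind(self, namespace: str, service_account: str, target: str) -> None: ...

    def unbind(self, namespace: str, service_account: str) -> None: ...


class RecordingBackend:
    """Toy backend that only records calls, standing in for EKS, GKE, AKS or ECS logic."""

    def __init__(self, runtime: str):
        self.runtime = runtime
        self.calls = []

    def bind(self, namespace, service_account, target):
        self.calls.append(("bind", namespace, service_account, target))

    def unbind(self, namespace, service_account):
        self.calls.append(("unbind", namespace, service_account))


def run_workload(backend: BindingBackend, namespace: str, service_account: str, target: str, work) -> None:
    """Elevate, run for the workload's lifetime, revoke: the same sequence for every backend."""
    backend.bind(namespace, service_account, target)
    try:
        work()
    finally:
        # Revoke even if the workload raises, mirroring a hook that runs on termination.
        backend.unbind(namespace, service_account)


if __name__ == "__main__":
    gke = RecordingBackend("gke")
    run_workload(gke, "ml", "ml-inference", "ml-inference@proj.iam", lambda: None)
    print(gke.calls)
```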

EKS Pod Identity · payments-api · S3 reads
payments-api-agent (Deployment): Pod starts, calls elevate. Cloudanix creates the PodIdentityAssociation. Pod runs for hours, reads S3, writes metrics. preStop calls revoke: assoc deleted, cached STS killed.
⚠ SA-scope: payments-api · namespace payments · 1-per-workload
✓ Enforced by Cloudanix boundary policy
Flow: EKS pod (startup) → elevate → Cloudanix JIT → create-association (SA → PaymentsReadRole)
Scope granted: cluster: prod-eks · ns: payments · sa: payments-api · policies: S3ReadOnly + KMSDecrypt · ttl: pod-lifecycle
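spec:
  serviceAccountName: payments-api
  initContainers:
    - { name: jit-init, image: cloudanix/jit-init, args: ['elevate'] }   # name is illustrative
  containers:
    - name: payments-api        # your existing application container
      lifecycle:
        preStop:
          exec: { command: ['cloudanix-jit', 'revoke'] }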
✓ Stamped agent → pod → SA → action
⛔ Immediate revoke · cached STS invalidated
EKS IRSA (legacy) · analytics-batch · nightly job
analytics-batch-agent (CronJob): Pre-Pod-Identity cluster. Nightly batch reads a warehouse export bucket, writes back to a parquet-output bucket. Cloudanix manages the IAM role trust + SA annotation for the lifetime of each Job pod.
Flow: EKS CronJob → elevate (IRSA) → Cloudanix JIT → OIDC trust + SA annot. → AnalyticsBatchRole
Scope granted: cluster: legacy-eks · ns: analytics · sa: analytics-batch · ttl: job-lifecycle
◎ elevate · agent: analytics-batch-agent
✓ SA annotated with IRSA role · trust policy updated
✓ Job pod assumes role via OIDC · STS issued
◎ Job completes · revoke fires from cleanup container
✓ SA annotation removed · role trust narrowed
✓ Stamped agent → Job → SA → run-id
⛔ Revoke + cached-STS invalidate at job end
GKE Workload Identity · ml-inference · GCS
ml-inference-agent (Deployment): Inference pods stream models from a GCS bucket. GKE Workload Identity wires the KSA to a GSA via annotation. Cloudanix manages the annotation lifecycle: bound at pod start, unbound on preStop.
Flow: GKE pod (startup) → elevate → Cloudanix JIT → set KSA annotation (KSA → GSA)
Scope granted: cluster: prod-gke · ns: ml · ksa: ml-inference · gsa: ml-inference@proj.iam · ttl: pod-lifecycle
◎ elevate · agent: ml-inference-agent
✓ KSA annotated · iam.gke.io/gcp-service-account
✓ GSA binding live · Workload Identity Pool active
◎ preStop fires · annotation cleared · binding revoked
✓ Cached token invalidated · next call would 401
✓ Stamped cluster → ns → KSA → GSA → action
⛔ Annotation cleared + tokens invalidated
AKS workload identity / ECS Task Role · cross-cloud
data-sync-agent (multi-runtime): Same workload runs as a Deployment on AKS for Azure-side reads and as a Fargate task on ECS for AWS-side writes. Cloudanix manages workload identity / task role bindings on both sides with one agent definition.
Flow: AKS pod & ECS task → elevate → Cloudanix JIT → federated identity → Azure MI + AWS Task Role
Scope granted: aks: ns/data-sync · ecs: cluster prod · ttl: workload-lifecycle · multi-cloud agent
◎ elevate · agent: data-sync-agent · multi-runtime
✓ AKS workload identity bound · ns/data-sync
✓ ECS task role granted DataSyncWrite policy
◎ Both workloads call revoke independently
✓ Tokens invalidated on both clouds
✓ Stamped per-runtime, joined under one agent
⛔ Independent per-cloud revoke · symmetric
Workload kinds we attach to: Deployment · StatefulSet · DaemonSet · ReplicaSet · Job · CronJob · standalone Pod · Argo Rollouts · Knative Service · KEDA-scaled Job · ECS Task / Fargate · …any pod-shaped thing that mounts an SA
The model

App teams write Kubernetes. Security writes the boundary.

What app & platform teams get
  • Application code is unchanged: boto3 / aws-sdk-go / google-cloud-* work the way they always have
  • One Helm chart adds the JIT init container and preStop hook — or two blocks of YAML
  • Same model across EKS Pod Identity, IRSA, GKE Workload Identity, AKS workload identity — one mental model for multi-cluster fleets
  • Stop debugging “why does this pod 403?” at 3am — the agent boundary is in code review
two YAML blocks. zero SDK changes.
What security keeps
  • Zero standing SA → role bindings — the binding exists only while a registered pod is running
  • 1-SA-per-workload enforced at agent registration — namespaces stop accumulating default SAs with five jobs
  • Cached STS tokens get invalidated on revoke — no 1-hour ghost-credential window after a pod terminates
  • Audit links action → agent → pod → SA → workload type → image → node
Governance & audit stay in your account.
Pod-lifecycle audit

Action → agent → pod → SA → workload kind. The full chain.

Every elevation lands as a correlated timeline. When CloudTrail says s3:DeleteObject from PaymentsRole/payments-api, you click through to the pod (payments-api-7c4b9), the Deployment, the container image, the node it ran on, and the exact second the preStop revoke fired.

pod elevation timeline · elev-3a91c4d2 · payments-api-agent
[10:14:02]  ELEVATE      payments-api-agent · SA payments-api · pod 7c4b9
[10:14:02]  POLICY-CHECK boundary OK · 1-SA-per-workload satisfied
[10:14:03]  EKS EVENT    CreatePodIdentityAssociation · ns payments
[10:14:18]  ACTION       s3:ListObjectsV2 · pmt-receipts
[10:14:41]  ACTION       kms:Decrypt · key/pmt-data
[16:32:11]  PRESTOP      preStop hook invoked · pod terminating
[16:32:11]  REVOKE       DeletePodIdentityAssociation + cached STS invalidated
[16:32:12]  AUDIT        written to s3://your-bucket/jit-agents/
agent: payments-api-agent · pod: payments-api-7c4b9 · workload: Deployment · ttl: pod-lifecycle (6h 18m)
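The pod-lifecycle duration quoted for this elevation can be recomputed from the timeline itself. The sketch below parses a few of the lines shown above and subtracts the ELEVATE time from the REVOKE time. The parser is a hypothetical helper that assumes single-word event kinds and a single calendar day, so rows such as "EKS EVENT" are left out of the sample.

```python
from datetime import datetime, timedelta

TIMELINE = """\
[10:14:02]  ELEVATE      payments-api-agent · SA payments-api · pod 7c4b9
[10:14:18]  ACTION       s3:ListObjectsV2 · pmt-receipts
[10:14:41]  ACTION       kms:Decrypt · key/pmt-data
[16:32:11]  REVOKE       DeletePodIdentityAssociation + cached STS invalidated
"""


def parse_line(line: str):
    """Return (time, event kind, detail) for one '[HH:MM:SS]  KIND  detail' line."""
    stamp, rest = line.split("]", 1)
    kind, detail = rest.split(None, 1)
    return datetime.strptime(stamp.lstrip("["), "%H:%M:%S"), kind, detail.strip()


def elevation_lifetime(timeline: str) -> timedelta:
    """Time between the first ELEVATE and the last REVOKE in the timeline."""
    events = [parse_line(line) for line in timeline.splitlines() if line.strip()]
    start = next(t for t, kind, _ in events if kind == "ELEVATE")
    end = next(t for t, kind, _ in reversed(events) if kind == "REVOKE")
    return end - start


if __name__ == "__main__":
    print(elevation_lifetime(TIMELINE))  # 6:18:09
```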
  • Pod-lifecycle resolved, end to end

    Init container elevate → PodIdentityAssociation created → every API call the pod made → preStop revoke → cached STS invalidated → assoc deleted. One row in the audit log per pod restart.

  • 🧾

    Evidence-grade for SOC 2 · ISO · PCI

    Quarterly: every elevation by every workload, with pod name, image digest, node, workload kind, cluster, and second-level revoke timestamps. Forensics get the pod restart history alongside.

  • 🔗

    Sits in your data plane

    Audit writes to your S3 / Azure Blob / GCS, not ours. Send your auditor a signed URL scoped to one elevation or a whole quarter. The link expires. Nothing copied, nothing leaked.
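A quarterly evidence pull amounts to filtering elevation records by when they started. The sketch below assumes a simple record shape; the `Elevation` fields, the second agent's pod name and the dates are illustrative and do not reflect the actual audit schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Elevation:
    """One audit row; field names are illustrative, not the real schema."""
    agent: str
    pod: str
    elevated_at: datetime
    revoked_at: datetime


def quarter_of(moment: datetime) -> tuple:
    """Return (year, quarter) with quarters numbered 1 to 4."""
    return moment.year, (moment.month - 1) // 3 + 1


def evidence_for_quarter(records, year: int, quarter: int):
    """Select elevations that started in the given quarter, oldest first."""
    rows = [r for r in records if quarter_of(r.elevated_at) == (year, quarter)]
    return sorted(rows, key=lambda r: r.elevated_at)


if __name__ == "__main__":
    utc = timezone.utc
    records = [
        Elevation("payments-api-agent", "payments-api-7c4b9",
                  datetime(2024, 5, 2, 10, 14, 2, tzinfo=utc),
                  datetime(2024, 5, 2, 16, 32, 11, tzinfo=utc)),
        Elevation("analytics-batch-agent", "analytics-batch-x1",
                  datetime(2024, 8, 1, 1, 0, 0, tzinfo=utc),
                  datetime(2024, 8, 1, 1, 20, 0, tzinfo=utc)),
    ]
    for row in evidence_for_quarter(records, 2024, 2):
        print(row.agent, row.pod, row.revoked_at - row.elevated_at)
```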

Plugs in · doesn't replace

Your cluster. Your cloud IAM. Our JIT in between.

EKS stays EKS. GKE stays GKE. The Pod Identity / IRSA / Workload Identity primitives keep working the way they always have. Cloudanix manages the SA ↔ role binding lifecycle — created at pod start, deleted at preStop, cached tokens invalidated.

Workload identity systems

EKS · Pod Identity
EKS · IRSA
GKE · Workload Identity
AKS · workload identity
ECS · Task Role
Fargate · Task Role
k8s SA · any cluster · raw API

Workload kinds

Deployment
StatefulSet
DaemonSet
ReplicaSet
Job
CronJob
standalone Pod
Argo Rollouts
KEDA · scaled-job
Knative Service
What you get

Workload identity that lives only as long as the workload.

Zero standing SA → role bindings

The binding doesn't exist until a registered pod starts. kubectl describe sa on a quiet cluster says no annotation — no orphan IAM permissions to inherit.

1-hour STS hangover → seconds

On revoke we don't just delete the binding; we also invalidate the SDK's cached STS token. Credentials copied out of a terminated pod cannot be replayed from another machine.

🧱

1-SA-per-workload, enforced

The agent registration is the boundary — an SA registered to one workload cannot be silently reused by another. Best practice becomes the default, not a wiki page.
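A registration-time check of this kind can be pictured as a registry keyed by cluster, namespace and Service Account. This is a minimal sketch under assumed names (`AgentRegistry`, `ServiceAccountAlreadyRegistered`); it is not the Cloudanix implementation.

```python
class ServiceAccountAlreadyRegistered(Exception):
    """Raised when an SA is already owned by a different workload."""


class AgentRegistry:
    """Allows each (cluster, namespace, service account) to belong to exactly one workload."""

    def __init__(self):
        self._owner = {}  # (cluster, namespace, sa) -> workload name

    def register(self, cluster: str, namespace: str, sa: str, workload: str) -> None:
        key = (cluster, namespace, sa)
        # setdefault records the first owner and returns the existing one afterwards.
        owner = self._owner.setdefault(key, workload)
        if owner != workload:
            raise ServiceAccountAlreadyRegistered(
                f"{namespace}/{sa} on {cluster} is already registered to {owner}"
            )


if __name__ == "__main__":
    registry = AgentRegistry()
    registry.register("prod-eks-use1", "payments", "payments-api", "Deployment/payments-api")
    try:
        registry.register("prod-eks-use1", "payments", "payments-api", "CronJob/nightly-report")
    except ServiceAccountAlreadyRegistered as err:
        print("rejected:", err)
```

Re-registering the same workload is accepted, so redeploys stay idempotent while a second workload is refused.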

📦

One model · EKS, GKE, AKS, ECS

EKS Pod Identity, IRSA, GKE Workload Identity, AKS workload identity, ECS Task Roles — one Helm chart, one agent definition, identical lifecycle semantics.

🪶

Application code is unchanged

boto3 / aws-sdk-go / google-cloud-* / azure-identity work exactly the way they did. Init & preStop do all the lifecycle.

🔁

Survives restarts & eviction

Pod evicted or preempted? The replacement pod calls elevate again. The old pod's tokens are invalidated when its preStop hook runs, just before SIGTERM is delivered. No carrying credentials into the next replica.

🧾

Audit: agent → pod → SA → action

CloudTrail entries thread back to this restart of this pod, running this image, on this node. No more “the SA did it.”

🛡

Blast radius shrinks to the pod

Container image breach? Inherits permissions only for the lifetime of that pod. Restart = new elevation under the same boundary. No SA-wide permanent leak.

🌐

One JIT engine · humans & non-humans

The same Cloudanix that gives an engineer JIT into the AWS console manages your pods' IAM lifecycle. One policy model, one audit pipeline, one team of operators.
