Security should never be an afterthought. It should be considered at design time, implemented as the product is built, and remain part of the platform from day one. Advait Patel, Senior SRE at Broadcom, brings this philosophy to every infrastructure decision. A founding member of OWASP’s AI Vulnerability Scoring System (AIVSS) and co-lead for the CSA AI Control Metrics working group, Advait balances performance, security, and scalability across cloud environments. In this episode, he shares how IAM architecture must evolve for 2026, why traditional SOC KPIs fail with AI, and what autonomous agents mean for production security.
What are the real challenges with infrastructure security and compliance today?
As an SRE, Advait is the person closest to production. That proximity creates both responsibility and difficulty when it comes to compliance:
- Compliance is not a checklist: Treating it as a document with items A, B, C, and D to complete will not work in 2026. Systems are too dynamic, too automated, and too interconnected for static checklists.
- Build controls from day one: Visibility, log collection, change tracking, and access auditing must be in place from the start. If you wait until auditors arrive next week, you have already lost.
- Engineer confidence follows control: When guardrails and visibility are in place from day one, engineers feel confident about their product and confident about passing any compliance audit.
The reframe: if you treat compliance as a goal and an action item integrated into your engineering workflow, it becomes achievable. If you treat it as a separate annual exercise, it will always feel impossible. This aligns with how secure software development lifecycle practices embed security into the engineering culture rather than bolting it on after the fact.
How should IAM programs be architected in 2026?
Advait is clear that IAM in 2026 must prioritize developer experience alongside security:
- Security should not be a blocker: If a developer needs access and has to file a ticket, wait for IT security review, get manager approval, and then wait days for provisioning, they will resent security. You lose trust and lose champions.
- Just-in-time access is the answer: When a developer needs access, predefined controls review the request and grant access in seconds or minutes, not days. The workflow: request → automated policy check → approval → grant → auto-revoke.
- Start from the bottom, not the top: Instead of granting admin privileges and revoking unused permissions later, start with zero access and add permissions as they are needed. Assign roles by team function: DB teams get database service access, not infrastructure access.
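The just-in-time flow above (request, automated policy check, grant, auto-revoke) can be sketched in a few lines of Python. This is a minimal illustration, not a description of any specific IAM product: the `TEAM_ALLOWED_SERVICES` mapping, the service names, the `MAX_GRANT` limit, and the function names are all hypothetical, and the approval step is folded into the automated policy check.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical mapping of team function to the services it may request.
TEAM_ALLOWED_SERVICES = {
    "database": {"db-primary", "db-replica"},
    "platform": {"k8s-cluster", "load-balancer"},
}

MAX_GRANT = timedelta(hours=4)  # assumed upper bound on any single grant


@dataclass
class Grant:
    user: str
    service: str
    expires_at: datetime

    def is_active(self, now: datetime) -> bool:
        return now < self.expires_at


def request_access(user, team, service, duration, now):
    """Return a time-boxed Grant if policy allows it, else None (default deny)."""
    allowed = TEAM_ALLOWED_SERVICES.get(team, set())
    if service not in allowed:
        return None  # team function does not cover this service
    if duration <= timedelta(0) or duration > MAX_GRANT:
        return None  # no open-ended or oversized grants
    return Grant(user=user, service=service, expires_at=now + duration)


def revoke_expired(grants, now):
    """Auto-revoke step: keep only grants that are still active."""
    return [g for g in grants if g.is_active(now)]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    grant = request_access("alice", "database", "db-primary", timedelta(hours=1), now)
    print(grant)
    print(request_access("alice", "database", "k8s-cluster", timedelta(hours=1), now))
```

The sketch starts from zero access: anything not explicitly allowed for the team is denied, and every grant carries an expiry, so nothing becomes standing privilege.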
The anti-patterns to avoid:
- Granting admin access and auditing downward: This used to be common in startups but creates unmanageable privilege sprawl.
- Manual access workflows that take days: Developers will find workarounds, creating shadow access patterns that are worse than the original risk.
- One-size-fits-all IAM: What works for 100 engineers will not work for 10,000. The architecture must fit your organization’s size and needs.
With AI, you can manage permissions dynamically: checking guardrails, managing access controls, and ensuring that when someone needs access, they get it without waiting an unreasonable amount of time. This connects directly to how JIT access eliminates standing privilege while maintaining engineering velocity.
What KPIs should security leaders use for AI in SOC operations?
Traditional KPIs (mean time to detect, mean time to respond, mean time to resolve) are not enough in 2026. Advait recommends four additional dimensions:
- Signal quality: Is AI improving existing workflows? Are recommendations accurate rather than voluminous? 100 accurate recommendations beat 10,000 false positives every time.
- Analyst efficiency: With AI in the loop, do engineers resolve problems faster than they did before? If AI is not saving time, there is no point in using it.
- Decision quality: Quality over quantity. AI should recommend with high accuracy so engineers can act confidently. If you still need to verify every AI output, the tool is adding work rather than reducing it.
- Automation safety: Are AI-driven automations correct? How often do they produce the right outcome? Engineers cannot monitor 100 things simultaneously, so AI must verify the correctness of its own automation.
The principle: always quality over quantity. An AI agent giving 10,000 recommendations that are mostly false positives is worse than one giving 100 that are 99.99% accurate. Those 100 save time and lead in the right direction.
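One way to put numbers on signal quality and analyst efficiency is to track precision (the share of recommendations that analysts confirm as correct) and the review time spent per confirmed finding. The sketch below is only an illustration: the class name, the batch sizes, and the five-minutes-per-item review cost are assumptions chosen to mirror the 100-versus-10,000 comparison, not figures from the episode.

```python
from dataclasses import dataclass


@dataclass
class RecommendationBatch:
    total: int            # recommendations the AI produced
    true_positives: int   # recommendations analysts confirmed as correct

    @property
    def precision(self) -> float:
        """Signal quality: fraction of recommendations that were correct."""
        return self.true_positives / self.total if self.total else 0.0

    def triage_minutes_per_true_positive(self, minutes_per_item: float) -> float:
        """Analyst time spent per confirmed finding, if every item is reviewed."""
        if self.true_positives == 0:
            return float("inf")
        return self.total * minutes_per_item / self.true_positives


# Illustrative numbers only.
noisy = RecommendationBatch(total=10_000, true_positives=500)
focused = RecommendationBatch(total=100, true_positives=99)

if __name__ == "__main__":
    for name, batch in (("noisy", noisy), ("focused", focused)):
        cost = batch.triage_minutes_per_true_positive(minutes_per_item=5)
        print(f"{name}: precision={batch.precision:.2%}, minutes per finding={cost:.1f}")
```

Under these assumed numbers, the noisy agent surfaces more confirmed findings in absolute terms, but each one costs roughly twenty times more analyst time, which is the quality-over-quantity trade-off described above.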
How are AI agents changing SRE and security operations?
AI agents are becoming essential for SRE teams, but they require careful boundaries:
Where agents help:
- Tracking changes and collecting logs
- Pattern matching in production environments, especially during alert investigation
- Creating incident runbooks based on historical patterns
- Root cause analysis by correlating multiple signals
Where agents create risk:
- You lose visibility into what happens behind the workflow. When you click A through F manually, you know what happened at each step. When an agent runs the workflow, you may not.
- If the initial data (step zero) is false or poisoned, the entire automation chain produces wrong results, and the runbook created from it will perpetuate the error.
- Agents operating with production API keys without human oversight can make irreversible changes.
The guidance: start with low-risk tasks where you are comfortable with the results even if you do not have full visibility. If you are even 0.01% in doubt about autonomous AI in your production environment, that doubt is worth respecting.
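A simple way to act on this guidance is to put a gate in front of the agent: low-risk actions run automatically, anything else waits for a human decision, and every step is written to an audit log so the workflow stays visible. The sketch below assumes a hypothetical `AgentGate` class and an illustrative set of low-risk action names; which actions count as low risk is a decision each organization has to make for itself.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Assumed risk tier; the action names are illustrative.
LOW_RISK_ACTIONS = {"collect_logs", "summarize_alert", "draft_runbook"}


@dataclass
class AgentGate:
    approver: Callable[[str], bool]            # human-in-the-loop callback
    audit_log: List[str] = field(default_factory=list)

    def run(self, action: str, execute: Callable[[], str]) -> str:
        """Run low-risk actions directly; ask a human before anything else."""
        if action in LOW_RISK_ACTIONS:
            decision = "auto-approved"
        elif self.approver(action):
            decision = "human-approved"
        else:
            # Denied actions are never executed, but are still recorded.
            self.audit_log.append(f"{action}: denied")
            return "denied"
        result = execute()
        self.audit_log.append(f"{action}: {decision} -> {result}")
        return result


if __name__ == "__main__":
    gate = AgentGate(approver=lambda action: False)  # stand-in for a real approval prompt
    gate.run("collect_logs", lambda: "ok")
    gate.run("delete_volume", lambda: "deleted")
    print("\n".join(gate.audit_log))
```

Because the gate records denied and approved actions alike, the audit log gives back some of the step-by-step visibility that is lost when an agent runs the workflow on your behalf.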
What does the AI versus AI arms race mean for defenders?
Attackers are using AI too. They use it to inject malicious traffic, find new attack vectors, create patterns that bypass behavioral detection, and make defender models produce faulty recommendations.
The specific risk: if an attacker poisons your detection data at step zero, your AI agent treats the entire chain as legitimate. When you create a runbook based on that poisoned data and feed it back to your agent, the agent will perpetuate the false conclusion. This cascading trust failure is the core danger of AI versus AI in security operations.
The defense: never trust AI with critical production decisions without human verification. Trust but verify applies even more strongly to AI-driven workflows than to human ones. The anomaly detection layer must itself be monitored for poisoning and drift. Organizations investing in agentic AI security must address this cascading trust problem.
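Monitoring the detection layer for drift can start with something very simple. The sketch below compares recent detection counts against a trusted baseline and flags a large shift in the mean for human review. It is a basic mean-shift check under stated assumptions, not a complete poisoning detector: the function names, the sample counts, and the threshold of three standard deviations are all hypothetical.

```python
from statistics import mean, stdev


def drift_score(baseline, recent):
    """How many baseline standard deviations the recent mean has shifted."""
    mu, sigma = mean(baseline), stdev(baseline)  # baseline needs at least 2 points
    if sigma == 0:
        return 0.0 if mean(recent) == mu else float("inf")
    return abs(mean(recent) - mu) / sigma


def needs_human_review(baseline, recent, threshold=3.0):
    """Flag the detector for review when its output shifts past the threshold."""
    return drift_score(baseline, recent) > threshold


if __name__ == "__main__":
    baseline_alerts = [10, 12, 11, 9, 10, 11, 12, 10]  # assumed alerts per hour
    suppressed = [2, 1, 3, 2]                          # sudden drop in detections
    print(needs_human_review(baseline_alerts, suppressed))
```

A sudden drop in alerts is treated here the same as a sudden spike, on the assumption that poisoned data may suppress detections as well as flood them. A flag from this check only routes the question to a person; it does not decide anything on its own.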
Where is AI headed in cloud security?
Advait ends on a positive note. AI is headed in a good direction for security operations:
- More workflows and automations that genuinely help rather than add noise
- Better experience for security engineers, customers, and products
- NIST’s work on the intersection of AI and cybersecurity signals that institutional support is building
- Once people treat AI as a friendly partner rather than a threat, innovation accelerates
The fundamental principle that applies to every new technology (cloud, containers, Kubernetes, AI): it must solve the core problems of security, reliability, cost-effectiveness, and visibility. If a new tool or trend does not address these fundamentals, it is introducing new problems rather than solving existing ones.
The recommendation for practitioners: stop reading and start building. Build something, break something, learn something. Certificates and courses are step one, but the real knowledge comes from hands-on experience with the systems you are trying to secure.