Articles on human-in-the-loop systems, AI agent safety, and secure automation for engineering teams.
AI agents inherit the permissions of whoever runs them, then act on behalf of external instructions. This is the confused deputy problem — and it's why access control alone doesn't protect you.
Prompt injection weaponizes ordinary external data — documents, emails, web pages — to hijack agent behavior. Unlike jailbreaking, the attacker doesn't need access to your system. They just need your agent to read something.
AI agents don't need to be malicious to abuse tools. Legitimate permissions, used in unexpected sequences, produce the same damage. This is the tool abuse problem — and why perimeter access control doesn't solve it.
AI agents with OAuth tokens, webhook access, and SaaS integrations inherit every attack surface those services carry. Here's what happens when your agent becomes the weakest link in the integration chain.
AI agents execute faster than compliance teams can audit. The result is a growing gap between what your security policy says and what your agents actually do. Here's how to close it.
AI agents execute commands that look reversible but aren't. This post breaks down why rollback fails, what makes recovery hard, and why command authorization at the shell layer is your real defense.
A malicious agent doesn't need to stay running. It just needs to ensure it can come back. Here are the five persistence mechanisms agents can establish — using capabilities they already have, in ways that look completely authorized.
You gave the agent a task. It gave itself a mandate. Autonomy creep is how AI agents gradually assume authority they were never granted — and why your authorization model needs to account for it.
Traditional network segmentation controls traffic by IP and port. AI agents route traffic by intent — and your firewall has no idea what they're trying to do.
AI agents need tokens to operate. Those tokens can be stolen, replayed, and abused — even short-lived ones. The real threat isn't credential theft; it's token laundering, where the agent uses its own credentials on an attacker's behalf.
Agents that remember context across sessions are more capable — and more exploitable. Memory poisoning turns your agent's learning capability into an attack vector that's hard to detect and harder to trace.
When an AI agent misfires, how bad can it get? The answer depends almost entirely on architectural choices made before the incident — not on what you do after. Here's what determines blast radius and the controls that actually bound it.
AI agents accumulate permissions organically — sudo for convenience, credential reuse across contexts, IAM role assumption chains. It doesn't look like an attack. It looks like getting things done. Here's why that's a problem and what traditional PrivEsc detection misses entirely.
LLM safety filters can be bypassed — that's been demonstrated repeatedly. If your only protection against a rogue agent action is the model refusing, you have a single point of failure. Here's what jailbreaking actually looks like for agentic systems and why shell-layer enforcement is the control that matters.
We want autonomous agents because they save time. But autonomy is exactly what makes them dangerous. Capability and risk scale together — broader action space, faster execution, cross-system reach. Here's why the most capable agents need the most oversight, and what that oversight actually looks like.
AI agents act fast, span multiple systems, and leave fragmented traces. Standard APM tools were built for services, not autonomous decision-makers. Here's where the gaps are and what proper agent observability actually requires.
Regulated industries built compliance controls around human actors. AI agents break the assumptions those controls depend on — not dramatically, but quietly. Here's what HIPAA, SOC 2, and PCI-DSS actually require, where agents create friction, and what you can realistically do about it.
DLP tools inspect packets and match patterns. AI agents exfiltrate through legitimate channels — authorized API calls, approved cloud sync, benign-looking operations. Here's the attack path DLP will never see, and why command authorization is the right defense layer.
AI agents routinely end up with API keys, database passwords, and tokens in their context window. The context window is not a vault — it's readable, logged, transmitted to model providers. Here's how credentials leak and what the right architecture looks like.
AI agents speed up operations in steady state. But when something goes wrong, mean time to recovery often expands — because agents leave no context, log poorly, and act fast across multiple systems before anyone notices.
AI agents make incremental config changes that individually look harmless. Over time, they add up to significant infrastructure drift — and nobody notices.
When multiple AI agents share credentials or session tokens, attribution collapses. Here's how to keep your audit trail meaningful when you can't tell who actually ran something.
Traditional secrets management assumes a human logs in, does work, and logs out. AI agents don't log out. Here's why Vault, AWS Secrets Manager, and key rotation fail to solve credential sprawl for autonomous systems — and what actually helps.
AI agents that install packages, fetch scripts, or call external APIs introduce supply chain risk at runtime. Here's how to govern the execution layer when you can't audit every dependency.
AI agents move across your infrastructure using legitimate credentials and approved tools. Traditional detection misses it entirely. Here's how the attack works and how to stop it.
Containers isolate processes, not decisions. Here's what actually needs to be sandboxed when you're running AI agents in production.
An audit trail isn't just a compliance checkbox. For AI agents, it's the difference between a recoverable incident and a mystery you can't explain.
AI agents have legitimate credentials, run trusted processes, and access real systems. That's exactly what makes them indistinguishable from insider threats when things go wrong.
AI agents don't break policies in big dramatic moments. They erode them gradually — one approved exception at a time. Here's how drift happens and how to catch it early.
When one AI agent can spawn, instruct, or delegate to other agents, your approval queue, audit trail, and kill switch just got a lot more complicated.
AI agents don't come with a pause button by default. Here's how to design effective stop mechanisms — three layers, real procedures — before you need one.
You don't need a 200-page policy document. Here are the six layers that actually matter — and why most teams skip the wrong things.
Speed is a risk multiplier. At 50 commands per minute, a bad assumption isn't a mistake — it's a cascade. How to govern agent velocity with risk-tiered rate limits and checkpoint gates.
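The risk-tiered rate limits mentioned above can be sketched in a few lines. This is an illustrative Python sketch, not the product's implementation; the tier names and per-minute budgets are hypothetical and would come from your own policy.

```python
import time
from collections import deque

# Hypothetical tiers and per-minute command budgets.
TIER_LIMITS = {"low": 50, "medium": 10, "high": 1}

class RiskTieredLimiter:
    def __init__(self, limits=TIER_LIMITS, window=60.0):
        self.limits = limits
        self.window = window  # sliding window, in seconds
        self.history = {tier: deque() for tier in limits}

    def allow(self, tier, now=None):
        """Return True if a command in this tier may run now."""
        now = time.monotonic() if now is None else now
        q = self.history[tier]
        # Drop events that fell outside the sliding window.
        while q and now - q[0] > self.window:
            q.popleft()
        if len(q) >= self.limits[tier]:
            return False  # over budget: hold for a checkpoint gate
        q.append(now)
        return True

limiter = RiskTieredLimiter()
print(limiter.allow("high", now=0.0))  # first high-risk command passes
print(limiter.allow("high", now=1.0))  # second within the window is held
```

A command that comes back `False` is not dropped; it queues for a human checkpoint, which is what turns a cascade into a pause.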
Zero trust principles applied to AI agents: why implicit trust in autonomous systems is dangerous, and how to enforce verify-before-execute at every layer — from credentials to command content.
Traditional observability misses what matters for AI agents. Here's the telemetry stack you actually need: from command intent to approval latency to anomaly signals.
AI agents will eventually run a command that breaks something. The question isn't if — it's how fast you can recover. Practical rollback strategies for teams running AI agents in production.
When an AI agent delegates to another agent, your approval controls may not follow. Trust escalation, audit fragmentation, prompt injection across agent boundaries — and what a sound trust model looks like.
Static analysis runs before execution. Pre-flight policies evaluate intent, not effect. Runtime is the only place where you can block with certainty — and most teams skip it entirely. The three-layer model for AI agent safety.
Not malice, just misjudgment plus autonomy. A blast-radius breakdown by command type, a concrete incident walkthrough (git push --force origin main), the reversibility principle, and a 5-minute recovery playbook.
One reviewer approving rm -rf on prod at 3 am is not an approval process. It's a single point of failure. Three multi-party models — AllOf, AnyOf, MinRole — and how to implement them for SOX, PCI-DSS, and ISO 27001.
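The three multi-party models named above can be sketched as simple predicates over a set of approvals. The AllOf/AnyOf/MinRole names follow the post; everything else here (role ranks, quorum size, record shape) is an illustrative assumption, not a real API.

```python
# Hypothetical role ranking used by the MinRole check.
ROLE_RANK = {"engineer": 1, "lead": 2, "director": 3}

def all_of(required, approvals):
    """AllOf: every named reviewer must have approved."""
    return set(required) <= {a["reviewer"] for a in approvals}

def any_of(required, approvals, quorum=2):
    """AnyOf: at least `quorum` distinct reviewers from the pool approved."""
    got = {a["reviewer"] for a in approvals} & set(required)
    return len(got) >= quorum

def min_role(min_role_name, approvals):
    """MinRole: at least one approval at or above the given role."""
    floor = ROLE_RANK[min_role_name]
    return any(ROLE_RANK[a["role"]] >= floor for a in approvals)

approvals = [
    {"reviewer": "alice", "role": "lead"},
    {"reviewer": "bob", "role": "engineer"},
]
print(all_of(["alice", "bob"], approvals))           # True
print(any_of(["alice", "carol", "dan"], approvals))  # False: only alice
print(min_role("lead", approvals))                   # True: alice is a lead
```

The point of the exercise: none of these checks can be satisfied by one reviewer at 3 am.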
A step-by-step tutorial: install expacti, connect a LangChain agent, approve commands from the reviewer dashboard, whitelist patterns, and ship to production — all in 15 minutes.
Four trust principles — least privilege, reversibility, explicit approval gates, full auditability — and the "trust budget" concept. Why safety prompts aren't security controls, and what to use instead.
Server logs capture output. Expacti captures intent — the command before it runs, the human decision, the context. Here's why that distinction matters for SOC 2, incident response, and everything in between.
Traditional server logs capture output, not intent. AI agents make this worse — they act autonomously, so logs show commands but not the reasoning chain. What SOC 2 CC6.1 actually requires and why your logs fall short.
The journey from SSH bastion host to AI agent firewall. Why logging what happened is no longer enough when autonomous agents execute 100 commands per second, and what comes next.
Five fictional-but-plausible AI agent failures that illustrate why autonomous systems need human-in-the-loop approval gates. The pattern: small mistakes, large blast radius.
A technical deep-dive into expacti-sshd: PTY-level command interception, bidirectional bridging with tokio::select!, the auth_none trick, and lessons from testing async SSH in Rust.
A post-mortem analysis of how AI agents fail in production — trust escalation, blast radius, and three failure scenarios that show why whitelist-based approval is the last line of defense.
The most common objection to approval workflows is latency. Here's how to design them so they move fast — without sacrificing the safety you added them for. Whitelisting, risk-gated timeouts, Slack-native approval, and the psychology of fast review UIs.
OPA, Kyverno, and policy-as-code tools are excellent for static artifacts. But AI agents generate commands dynamically at runtime — where no static analysis can reach. Here's why runtime approval is the missing layer in your DevSecOps stack.
OPA, Kyverno, and policy-as-code tools are valuable — but they evaluate declarative configurations, not AI agent decisions made at 2am. The argument for runtime approval as a complementary layer.
We ran every production deployment through a human approval gate for 30 days. 847 commands reviewed, 91% whitelist hit rate by day 30, 6 incidents prevented, 8.4s average approval time. Here's the honest account.
From rm -rf with variable paths to eval with dynamic input: the shell commands where AI agents cause the most damage, and the approval-gate patterns that prevent it.
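The dangerous-command classes above can be flagged with a small pattern list. This is a deliberately minimal Python sketch of the idea; a production gate would use a real shell parser and a much larger rule set, not three regexes.

```python
import re

# Illustrative high-risk patterns only (assumptions, not a complete list).
HIGH_RISK = [
    re.compile(r"\brm\s+(-\w*\s+)*.*\$\{?\w+"),  # rm with a variable path
    re.compile(r"\beval\s"),                      # eval with dynamic input
    re.compile(r"--force\b"),                     # force flags (e.g. git push)
]

def needs_approval(command: str) -> bool:
    """Route commands matching a high-risk pattern to human review."""
    return any(p.search(command) for p in HIGH_RISK)

print(needs_approval("rm -rf $BUILD_DIR"))             # True
print(needs_approval("git push --force origin main"))  # True
print(needs_approval("ls -la /tmp"))                   # False
```

Note what the first pattern catches: it is not `rm` itself that is flagged, but `rm` pointed at a variable whose value the reviewer cannot see in the command text.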
A technical deep-dive into arc-swap-based rule storage, first-match-wins semantics, TTL expiry, risk scoring across 14 command categories, and three things we'd do differently. With Rust code examples.
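The first-match-wins and TTL semantics described above can be sketched independently of the storage layer. This Python sketch shows only the lookup semantics; the post describes a Rust implementation built on arc-swap, and the prefix-matching rule shape here is an assumption for illustration.

```python
import time

class RuleStore:
    def __init__(self):
        # Ordered list of (pattern, decision, expires_at or None).
        self.rules = []

    def add(self, pattern, decision, ttl=None, now=None):
        now = time.monotonic() if now is None else now
        expires = None if ttl is None else now + ttl
        self.rules.append((pattern, decision, expires))

    def match(self, command, now=None):
        """Return the decision of the first live rule whose pattern is a
        prefix of the command, or None if nothing matches."""
        now = time.monotonic() if now is None else now
        for pattern, decision, expires in self.rules:
            if expires is not None and now > expires:
                continue  # expired rule: skipped, never matches
            if command.startswith(pattern):
                return decision  # first match wins, later rules ignored
        return None

store = RuleStore()
store.add("git status", "allow", ttl=60, now=0.0)  # specific, temporary
store.add("git", "review", now=0.0)                # broad, permanent
print(store.match("git status", now=10.0))  # "allow": first rule wins
print(store.match("git status", now=90.0))  # "review": first rule expired
```

First-match-wins is what makes rule order meaningful: the specific temporary allow shadows the broad review rule only while its TTL is live.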
Read-only access sounds safe. But AI agents don't need to write files to cause real damage. Why least privilege for agents requires rethinking what "privilege" actually means.
When AI agents spawn other AI agents, the human-in-the-loop disappears. Here's why multi-agent architectures need explicit approval gates at every delegation boundary.
Not every AI action needs human review — but some absolutely do. Here's a practical trust spectrum for deciding when to let agents run freely and when to require explicit approval.
Economics solved this problem a century ago with contracts, audits, and constrained authority. AI agents face the same challenge — with higher stakes and fewer guardrails. Here's the lesson we keep forgetting.
Model Context Protocol gives AI models direct, structured access to tools that modify your systems. Here's why you need a human approval gate between MCP and production — and how to add one in minutes.
Input sanitization and prompt hardening slow attackers down. They don't stop them. Why a human approval gate is the only defense that survives a sophisticated prompt injection attack — and how to build one.
GitHub's built-in environment protection rules gate the job — not individual commands. Here's how to add per-command human approval to your deployment workflows, with practical YAML examples for migrations, regional rollouts, and on-call routing.
AI coding agents are good at writing code. They're terrible at knowing when to stop. A practical framework — with code examples in Python, TypeScript, and Go — for safely wiring Claude, Copilot, or Codex to your production systems.
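The core wiring pattern behind the approval-gate posts above fits in a dozen lines: pause before execution, not after. This is a hypothetical sketch, not any SDK's real API; the function names, the auto-allow list, and the stdin-based approval channel are all assumptions.

```python
import subprocess

def prompt_approval(command: str) -> bool:
    """Stand-in for a real approval channel (dashboard, Slack, CLI).
    Here it simply asks on stdin."""
    answer = input(f"Agent wants to run: {command!r}. Approve? [y/N] ")
    return answer.strip().lower() == "y"

def run_with_gate(command, approve=prompt_approval,
                  auto_allow=("git status", "ls")):
    """Run pre-approved commands immediately; pause everything else
    until a human decides. Blocked commands never reach the shell."""
    approved = command in auto_allow or approve(command)
    if not approved:
        return None
    return subprocess.run(command, shell=True,
                          capture_output=True, text=True)

# The approval function is injectable, so the gate is easy to test
# and easy to swap for a real reviewer channel.
result = run_with_gate("echo hi", approve=lambda c: True)
print(result.stdout)  # prints the command's output: hi
```

The design choice worth copying is that denial returns before `subprocess.run`: the gate sits between the agent's decision and the shell, not between the shell and the logs.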
Zero-trust was designed for human users, but AI agents need it even more. Here's how to apply zero-trust principles to agent infrastructure before your first incident.
Principle of least privilege is the oldest rule in security. Your AI agent is probably violating it right now. Here's a practical framework for scoping agent permissions across four dimensions — before something goes wrong.
When your AI agent runs commands on production, you need more than server logs. Here's what a real audit trail looks like — and why it matters for SOC 2, ISO 27001, and basic incident response.
We keep giving AI agents more autonomy without seriously thinking through what breaks when they're wrong. Here are the six failure modes nobody talks about — and one principle that prevents all of them.
We shipped SDKs for seven languages this week. Here's how to wire any AI agent to pause and require human approval before executing real-world actions — with code samples in Python, TypeScript, Go, and more.
Most teams think a static whitelist of approved commands is enough. It isn't. The same command can be safe at 2pm and catastrophic at 2am. Here's why context is the missing layer.
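The 2pm/2am point above is easy to make concrete: the decision function takes context, not just the command. Everything here is an illustrative assumption (the risky-command prefixes, the business-hours window, the on-call override), sketched in Python.

```python
from datetime import datetime, time as dtime

BUSINESS_HOURS = (dtime(9, 0), dtime(18, 0))  # hypothetical window

def decide(command: str, when: datetime,
           on_call_confirmed: bool = False) -> str:
    """Return allow / require_approval / block for the same command,
    depending on when it runs and who is awake to review it."""
    risky = command.startswith(("kubectl delete", "terraform apply"))
    in_hours = BUSINESS_HOURS[0] <= when.time() <= BUSINESS_HOURS[1]
    if not risky:
        return "allow"
    if in_hours:
        return "require_approval"  # reviewers are available
    # Risky command outside business hours: block unless on-call confirms.
    return "require_approval" if on_call_confirmed else "block"

noon = datetime(2025, 1, 15, 14, 0)
night = datetime(2025, 1, 15, 2, 0)
print(decide("kubectl get pods", night))  # allow
print(decide("terraform apply", noon))    # require_approval
print(decide("terraform apply", night))   # block
```

A static whitelist collapses all three outcomes into one; threading context into the decision is what keeps the 2am case from inheriting the 2pm answer.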
AI agents need credentials to work. They also tend to expose them in logs, prompts, and command histories. Here's how to close the leak surfaces that most teams miss.
AI agents accumulate permissions the same way humans do — one reasonable exception at a time. Here's how to recognize and reverse permission creep before it becomes a liability.
Human oversight of AI agents only works if humans actually pay attention. Here's how approval fatigue undermines your safety controls — and how to design around it.
AI agents need database access to do useful work. Here's how to give them what they need without handing over the keys to the kingdom — read replicas, scoped credentials, approval gates, and audit logs.
Your AI agents run commands in production. Your SOC 2 auditor will ask who approved them and what the audit trail looks like. Here's what they actually want to see — and the evidence you need to produce.
AI coding agents can write, run, and deploy code — all in one shot. That's the point. It's also the risk. Here's how to keep vibe coding sessions from turning into production incidents.
From SSH handshake to command execution: a technical deep-dive into how Expacti intercepts, scores, and gates every command in a live terminal session.
AI coding agents are executing shell commands at machine speed. Here's why "just review the logs afterward" is the wrong mental model — and what to do instead.
Read more →