I focus on strict tool allowlists for production LLM agents. At Advanced Professional Accounting Services, we limited agents to read-only finance APIs during an automation pilot. A prompt injection tried to trigger a file export tool, and it failed cleanly because that tool was not on the allowlist. Our security team flagged the attempt using denied tool call logs tied to request IDs. We saw zero data access and no side effects, which was relieving. That log trail was the proof the control worked. Defenses with clear signals scale better than complex filters.
I trust a strict tool allowlist with a policy gate that validates every tool call against a permitted action map, and I put sensitive actions behind human approval. I knew it worked when the logs showed repeated policy denials for out-of-scope tool calls and our data access and network egress logs showed zero successful attempts tied to the same trace.
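A minimal sketch of what such a policy gate could look like, assuming a Python agent runtime. The tool names, the `PERMITTED_ACTIONS` map, the `check_tool_call` function, and the log format are all hypothetical and chosen for illustration; this is not the configuration we ran in the pilot.

```python
import logging
from dataclasses import dataclass
from typing import Callable, Dict, Optional

logger = logging.getLogger("tool_policy")

# Hypothetical permitted action map: tool name -> needs human approval?
# Anything not listed here is denied by default.
PERMITTED_ACTIONS: Dict[str, bool] = {
    "get_invoice": False,          # read-only, no approval needed
    "list_ledger_entries": False,  # read-only, no approval needed
    "issue_refund": True,          # sensitive, requires human approval
}


@dataclass(frozen=True)
class Decision:
    allowed: bool
    reason: str


def check_tool_call(
    request_id: str,
    tool: str,
    approver: Optional[Callable[[str, str], bool]] = None,
) -> Decision:
    """Validate one tool call against the permitted action map."""
    if tool not in PERMITTED_ACTIONS:
        # Denials are logged with the request ID so they can be
        # correlated with data access and egress logs for the same trace.
        logger.warning(
            "policy_denied request_id=%s tool=%s reason=not_in_allowlist",
            request_id, tool,
        )
        return Decision(False, "not_in_allowlist")

    if PERMITTED_ACTIONS[tool]:
        # Sensitive action: deny unless an approver callback says yes.
        if approver is None or not approver(request_id, tool):
            logger.warning(
                "policy_denied request_id=%s tool=%s reason=approval_required",
                request_id, tool,
            )
            return Decision(False, "approval_required")

    return Decision(True, "allowed")
```

In this sketch, a call such as `check_tool_call("req-42", "export_file")` is denied with reason `not_in_allowlist` and leaves a warning line carrying the request ID. The gate is deny-by-default, so a tool the model invents or is tricked into calling never reaches execution.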