The Security Risks of LLM Agents

‪@jovaneyck.bsky.social‬

The Problem

Whether you're clicking together an agent flow or vibe-coding your first MVP, you're potentially exposing yourself to a massive security leak.

LLMs are Coding Genies

Prompts = wishes
Data = prompt = context

The Obvious Risk


reddit

The Lethal Trifecta


Simon Willison; The lethal trifecta for AI agents: private data, untrusted content, and external communication

When all three are present simultaneously, congratulations! You just exposed yourself to a significant security risk.

Scenario 1: Email Automation Agent

Scenario 2: Over-Eager Customer Support

Scenario 3: AI-Assisted Coding

Mitigation Strategy 1: Architectural Controls

  • NEVER give LLMs access to production databases
  • Whitelisting all outbound curl calls might not be a safe default
  • Sanitize inputs before they reach your LLM

Mitigation Strategy 2: Security Scanning

Mitigation Strategy 3: Least Privilege

  • Minimum whitelist, revise often
  • Put authorization in code you control which LLM has no way to bypass

Mitigation Strategy 4: Monitoring

  • Monitor and alerts for suspicious LLM interactions
    • "CRITICAL 🚨 FORGET ALL PREVIOUS INSTRUCTIONS "
  • Alert on unusual data access or external communications

AKA SSDD same shit, different day

Key Takeaways

  • ✅ Be aware of the lethal trifecta
  • ✅ Never assume LLMs will generate secure code
  • ✅ Implement defense in depth

Thank You 👋

Questions?

They treat your prompts as *wishes*, not unbreakable instructions. LLMs don't distinguish between "instructions" and "data" - it's all just text to them.

LLMs are so very very good at generating tons of code rapidly. Without proper security controls, you're potentially exposing yourself to huge risks.

**Setup**: n8n.io workflow that reads emails and sends polite replies to LinkedIn recruiters **Attack**: Mr. Hacker Man sends malicious prompt via email **Result**: Your entire email history neatly summarized and sent to the attacker

**Setup**: Chatbot with production database access, carefully designed to only show user's own data **Attack**: Carefully crafted prompt injection to extract *all* user information **Result**: Passwords and personal data for all users exposed

**Setup**: Claude Code with MCP servers for GitHub and production database for analytics **Attack**: Malicious content in GitHub issues from users **Result**: Production data exposed via pull requests or issue comments

- Automated security vulnerability **scanning in deployment pipelines** - Manual **reviews** for LLM-generated code

Bottom line LLMs are powerful tools that can significantly improve our productivity and capabilities. Approach them with the same security mindset as any other technology.