reef-prompt-guard
Detect and filter prompt injection attacks in untrusted input.
Setup & Installation
Install command
clawhub install staybased/reef-prompt-guardIf the CLI is not installed:
Install command
npx clawhub@latest install staybased/reef-prompt-guardOr install with OpenClaw CLI:
Install command
openclaw skills install staybased/reef-prompt-guardor paste the repo link into your assistant's chat
Install command
https://github.com/openclaw/skills/tree/main/skills/staybased/reef-prompt-guardWhat This Skill Does
Scans untrusted text for prompt injection before it reaches an LLM. Applies context-aware scoring multipliers based on input source, with stricter thresholds for high-risk origins like web scrapes or email. Covers injection, jailbreaks, exfiltration, privilege escalation, and hidden instruction techniques.
Source-aware scoring lets you apply stricter thresholds for high-risk inputs like web scrapes without writing separate filter logic for each channel.
When to Use It
- Filtering email bodies before LLM summarization
- Screening Discord bot messages for jailbreak attempts
- Validating web-scraped content before passing to an agent
- Blocking injection in sub-agent output pipelines
- Protecting API request handlers from malicious user prompts
Example Workflow
Here's how your AI assistant might use this skill in practice.
User asks: Filtering email bodies before LLM summarization
- 1Filtering email bodies before LLM summarization
- 2Screening Discord bot messages for jailbreak attempts
- 3Validating web-scraped content before passing to an agent
- 4Blocking injection in sub-agent output pipelines
- 5Protecting API request handlers from malicious user prompts
Detect and filter prompt injection attacks in untrusted input.
Security Audits
These signals reflect official OpenClaw status values. A Suspicious status means the skill should be used with extra caution.