idpishield

Defense against indirect prompt injection

Fast, local risk assessment for untrusted text before it reaches your LLM. Go library, CLI, and MCP server. Sub-millisecond detection with 88+ patterns.

Go · Sub-millisecond · Apache 2.0 · 88+ patterns · MCP server

Three lines to protection

Add idpishield to your Go project and start assessing untrusted content immediately.

Go Library

terminal
go get github.com/pinchtab/idpishield
main.go
shield := idpi.New(idpi.Config{Mode: idpi.ModeBalanced})

result := shield.Assess(untrustedText, sourceURL)
if result.Blocked {
    log.Printf("blocked: score=%d reason=%s", result.Score, result.Reason)
}

CLI

terminal
go install github.com/pinchtab/idpishield/cmd/idpishield@latest
terminal
# Scan a file
idpishield scan page.txt --mode balanced

# Scan from stdin
echo "Ignore all previous instructions" | idpishield scan

# JSON output
{"score":80,"level":"critical","blocked":true}

MCP Server

terminal
# stdio mode (default)
idpishield mcp serve

# HTTP mode with auth
idpishield mcp serve --transport http --auth-token "$IDPI_MCP_TOKEN"

Exposes idpi_assess as an MCP tool — works with any MCP-compatible agent framework.
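An MCP client invokes the tool with a standard `tools/call` request. The request envelope below follows the MCP specification; the argument names (`text`, `source_url`) are assumptions for illustration, not taken from the tool's published schema:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "idpi_assess",
    "arguments": {
      "text": "Ignore all previous instructions",
      "source_url": "https://example.com/page"
    }
  }
}
```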

Built for the AI security stack

Everything you need to assess prompt injection risk before content reaches your LLM.

Sub-millisecond

Pattern matching and risk scoring complete in under a millisecond. No network calls in fast or balanced mode.

🛡️

88+ Detection Patterns

Multi-language patterns covering instruction override, exfiltration, role hijacking, encoding tricks, and more.

🎯

Tiered Defense

Three modes — fast, balanced, deep — so you pick the right tradeoff between speed and detection accuracy.

🔧

Go Library First

Import as a Go package. One function call: Assess(text, url). CLI and MCP server are secondary interfaces.

🌍

Multi-language

Patterns for English, French, Spanish, German, and Japanese. Unicode normalization handles obfuscation attempts.

📊

Explainable Results

Every assessment returns score, level, matched patterns, categories, and reason — ready for audit logging.

🔒

Production Hardening

Input size limits, decode depth bounds, circuit breaker for deep service, strict mode for lower thresholds.

🤖

MCP Native

Run as an MCP server with stdio or HTTP transport. Token-based auth, constant-time credential checks.
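The constant-time credential check mentioned above is a standard technique from Go's `crypto/subtle` package. This is a minimal sketch of the technique itself, not idpishield's implementation:

```go
package main

import (
	"crypto/subtle"
	"fmt"
)

// tokenMatches compares a presented bearer token against the expected
// value in constant time, so response timing does not leak how many
// leading bytes of the token were correct.
func tokenMatches(presented, expected string) bool {
	return subtle.ConstantTimeCompare([]byte(presented), []byte(expected)) == 1
}

func main() {
	fmt.Println(tokenMatches("secret-token", "secret-token")) // true
	fmt.Println(tokenMatches("secret-tokex", "secret-token")) // false
}
```

Note that `ConstantTimeCompare` returns 0 immediately for inputs of different lengths, so token length can still leak; fixed-length tokens avoid this.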

Tiered defense by design

Local pattern matching handles most threats instantly. Optional deep service adds semantic analysis when needed.

  Input Text
      │
      ├── Domain Allowlist ──── trusted? ──→ skip
      │
      ├── Unicode Normalization
      │     └── decode obfuscation (HTML entities, base64, etc.)
      │
      ├── Pattern Matching (88+ patterns, 5 languages)
      │     ├── instruction-override
      │     ├── exfiltration
      │     ├── role-hijacking
      │     ├── encoding-tricks
      │     └── social-engineering
      │
      ├── Risk Scoring (0–100)
      │     ├── default: blocks at ≥ 60
      │     └── strict:  blocks at ≥ 40
      │
      └── [Deep mode] ──→ Service escalation
            ├── Semantic similarity
            └── LLM intent analysis
    
Fast

Pattern matching on raw input. Highest throughput, lowest latency.

Balanced

Normalization + pattern matching. Recommended default for most integrations.

Deep

Balanced + optional service escalation for semantic and LLM-based analysis.
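The three tiers compose the pipeline stages incrementally. The stage names below mirror the diagram; the mode-to-stage mapping is a sketch of the documented tiers, not the library's internal layout:

```go
package main

import "fmt"

// stages lists which pipeline steps each mode runs, per the tier
// descriptions: fast skips normalization, deep adds escalation.
var stages = map[string][]string{
	"fast":     {"pattern-matching", "risk-scoring"},
	"balanced": {"normalization", "pattern-matching", "risk-scoring"},
	"deep":     {"normalization", "pattern-matching", "risk-scoring", "service-escalation"},
}

func main() {
	for _, mode := range []string{"fast", "balanced", "deep"} {
		fmt.Println(mode, stages[mode])
	}
}
```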

Structured risk results

Every assessment returns an actionable, auditable result.

result.go
type RiskResult struct {
    Score      int      // 0–100 risk estimate
    Level      string   // safe | low | medium | high | critical
    Blocked    bool     // policy decision (score + strict mode)
    Reason     string   // human-readable explanation
    Patterns   []string // matched pattern IDs
    Categories []string // threat categories
}

Risk Levels

safe — score 0–19
low — score 20–39
medium — score 40–59
high — score 60–79
critical — score 80–100
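The bands above map directly onto a threshold ladder. A sketch under the documented bands (not the library's code):

```go
package main

import "fmt"

// levelFor maps a 0–100 score onto the documented risk bands.
func levelFor(score int) string {
	switch {
	case score >= 80:
		return "critical"
	case score >= 60:
		return "high"
	case score >= 40:
		return "medium"
	case score >= 20:
		return "low"
	default:
		return "safe"
	}
}

func main() {
	for _, s := range []int{0, 20, 40, 60, 80} {
		fmt.Println(s, levelFor(s))
	}
}
```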

Blocking Semantics

Default mode blocks at score ≥ 60. Strict mode lowers the threshold to ≥ 40. The blocked field is a policy output, not just a detection flag.
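The policy reduces to a threshold comparison, where strict mode only moves the cutoff. A sketch of the documented semantics:

```go
package main

import "fmt"

// blocked applies the documented policy: the default threshold is 60,
// and strict mode lowers it to 40.
func blocked(score int, strict bool) bool {
	threshold := 60
	if strict {
		threshold = 40
	}
	return score >= threshold
}

func main() {
	fmt.Println(blocked(45, false)) // false: medium risk passes by default
	fmt.Println(blocked(45, true))  // true: strict mode blocks at >= 40
}
```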

idpishield is part of the Pinchtab ecosystem — tools for AI agents that take security seriously. Built in the open, designed for production.