Inside the Black Box: What Leaked AI System Prompts Reveal About How Your Favorite Tools Actually Think

A deep-dive into the most comprehensive collection of leaked system prompts from Cursor, Manus, Windsurf, Devin, v0, and 30+ other AI tools — revealing their core architectures, tool designs, and agent philosophies.

Part 1: Foundations — The Mental Model

Every AI tool you use daily has a hidden constitution: a system prompt that defines its personality, capabilities, restrictions, and the exact tools it can call. These prompts are the real product — more so than the models themselves.

The GitHub repository x1xhlol/system-prompts-and-models-of-ai-tools is the most comprehensive public collection of leaked system prompts and tool definitions for over 30 major AI products. With 30,000+ lines of raw prompts, this is effectively a museum of how the modern AI industry builds agents.

Think of it this way: if the LLM (GPT-4, Claude, Gemini) is the engine, the system prompt is the driver. Reading these leaks is like seeing behind the wheel for the first time.

The mental model:

```
User Query
    ↓
[System Prompt]  defines identity, tools, rules, constraints
    ↓
[LLM Backbone]  Claude, GPT-4, etc.
    ↓
[Tool Calls]  shell, browser, editor, deploy
    ↓
[Result]
```

Part 2: The Investigation — What’s in the Repo

The repository is organized by product, each folder containing prompt files (.txt) and tool schemas (.json). Here’s a map of what’s included:

| Category | Products |
| --- | --- |
| Coding Agents | Cursor, Devin AI, Windsurf, Augment Code, Junie, Kiro, Trae, VSCode Agent |
| Autonomous Agents | Manus, Replit Agent, Emergent, Leap.new |
| Gen AI Builders | v0 (Vercel), Lovable, Same.dev, Orchids.app |
| AI Assistants | Anthropic Claude (multiple versions), Perplexity, NotionAI, Cluely |
| Mobile / IDE | Xcode AI, Qoder, CodeBuddy, Poke |

Each folder typically contains:

  • A Prompt.txt — the actual system prompt injected before every conversation
  • A Tools.json — the full list of callable tools with TypeScript-like signatures
  • Sometimes versioned snapshots (e.g., Prompt Wave 11.txt, Agent Prompt 2.0.txt)

This means you can literally diff how a product’s instructions evolved over time.
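Diffing two snapshots takes nothing more than the standard library. Here is a minimal sketch using Python's `difflib` — the file names and prompt snippets are invented stand-ins, not the repo's actual contents:

```python
import difflib

# Hypothetical snippets standing in for two snapshots of a product's
# system prompt (e.g., an old and a new versioned prompt file).
v1 = """You are an AI coding assistant.
Always rewrite the full file when editing.
"""
v2 = """You are an AI coding assistant.
Only output changed code; mark unchanged sections with a placeholder.
Iterate with one tool call per step.
"""

diff_text = "".join(difflib.unified_diff(
    v1.splitlines(keepends=True),
    v2.splitlines(keepends=True),
    fromfile="Agent Prompt v1.0.txt",
    tofile="Agent Prompt 2.0.txt",
))
print(diff_text)
```

Running this prints a unified diff showing exactly which instructions were added or dropped between versions — the same exercise you can do against the real prompt files in the repo.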


Part 3: The Diagnosis — What These Prompts Actually Reveal

🧠 Manus: The Most Transparent Agent Architecture

Manus’s leaked Agent loop.txt is a masterclass in agentic design. It reveals a multi-module architecture:

```
Event Stream → Planner Module → Knowledge Module → Datasource Module → Executor
```

The agent loop is explicit:

```
1. Analyze Events: Understand user needs through the event stream
2. Select Tools: Choose the next tool call based on current state
3. Wait for Execution: Tool runs in sandbox, result added to event stream
4. Iterate: Repeat with ONE tool call per iteration
5. Submit Results: Send deliverables via message tools
6. Enter Standby
```

Key insight: Manus separates planning from execution explicitly. The Planner module provides numbered pseudocode steps as part of the event stream, and the agent must complete every planned step. This is why Manus feels so methodical.
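The loop above can be sketched in a few lines of Python. This is an illustrative toy, not Manus's actual implementation — the tool names, plan format, and event tuples are all assumptions:

```python
# Minimal sketch of a Manus-style loop: one tool call per iteration,
# every result appended to a shared event stream before the next step.

def run_agent(user_query, plan, tools):
    """plan: the Planner module's ordered steps; tools: dict of callables."""
    events = [("user", user_query)]
    for tool_name, args in plan:           # complete EVERY planned step
        result = tools[tool_name](**args)  # ONE tool call, run in "sandbox"
        events.append((tool_name, result)) # observe before deciding the next step
    events.append(("message", "deliverables sent"))  # submit, then standby
    return events

# Toy tools standing in for shell/browser/editor calls.
tools = {
    "search": lambda query: f"results for {query!r}",
    "write_file": lambda path, text: f"wrote {len(text)} bytes to {path}",
}
plan = [
    ("search", {"query": "site structure"}),
    ("write_file", {"path": "report.md", "text": "# Report"}),
]
events = run_agent("research and write a report", plan, tools)
print(events[-1])
```

The key property is that the loop never batches tool calls: each iteration acts once, records the observation, and only then continues.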

Interesting rules from the prompt:

  • Default language: English, but adapts to user’s language
  • “Avoid using pure lists and bullet points format in any language” — Manus is instructed to write in prose
  • Capable of deploying services and exposing ports publicly

✂️ Cursor: Surgical Precision in Code Editing

Cursor’s system prompt reveals a philosophy of minimal, targeted edits. The prompt instructs the model to:

  • Never output unchanged code — always use markers like // ... existing code ...
  • Default to a “lazy edit” mode: only write the parts of the file that change
  • Use explicit <CHANGE> annotations to mark modified lines

This explains why Cursor’s edits feel surgical compared to tools that rewrite the entire file. The system prompt literally forbids unnecessary rewrites.
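A hypothetical lazy edit following these conventions might look like this (the function name and the change itself are invented; the marker syntax is the one described above):

```js
function handleSubmit(data) {
  // ... existing code ...
  validate(data); // <CHANGE> add validation before saving
  // ... existing code ...
}
```

The model emits only the changed line plus markers; the editor is responsible for splicing it back into the full file.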


🤖 v0 (Vercel): The Full-Stack React Renderer

v0’s prompt introduces a concept called CodeProject — a special block that groups React component files and renders them in the browser. The prompt encodes specific knowledge of:

  • Writing to files using ```lang file="path/to/file" syntax
  • Using kebab-case for filenames
  • Including taskNameActive and taskNameComplete metadata for UI feedback

The prompt even covers how to use // ... existing code ... markers. v0 is doing the same “lazy edit” strategy as Cursor, but for React/Next.js specifically.
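As a sketch, a v0 response writing a single component might look like this — the component and path are hypothetical, but the `file` attribute and kebab-case filename follow the rules described above:

````
```tsx file="components/hello-card.tsx"
export function HelloCard({ name }: { name: string }) {
  return <div className="rounded-lg border p-4">Hello, {name}!</div>
}
```
````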


🔍 Devin AI: Evidence-Based Software Engineering

Devin’s leaked prompt is philosophically different from the others. It’s designed as a code archaeology tool that answers questions about a codebase:

```
INSTRUCTIONS:
- DO NOT MAKE UP ANSWERS
- Cite EVERY SINGLE SENTENCE with <cite repo="..." path="..." start="..." end="..." />
- Citations should span at most 5 lines of code
- End every answer with a "Notes" section
```

Devin is explicitly instructed to be a skeptic — if it doesn’t know something, it says so. Every claim must be backed by file-level evidence with line numbers. This is extraordinary for an AI tool — it’s essentially a peer-reviewed engineering assistant.

Devin’s prompt even includes:

  • Support for Mermaid diagrams (no colors — “they make text hard to read”)
  • A rule to cite only salient lines, never entire functions
  • Instructions to adapt output language to the user’s
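A hypothetical answer following these rules might read as follows — the repo name, path, and line numbers are invented for illustration, but the `<cite>` shape is the one from the prompt:

```
The retry logic wraps every outbound request in an exponential backoff
loop. <cite repo="acme/api" path="src/http/client.py" start="42" end="46" />

Notes:
- Only the HTTP client was inspected; the worker queue may retry independently.
```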

🌊 Windsurf: Full TypeScript Tool API

Windsurf’s leaked Tools Wave 11.txt exposes its entire tool API as TypeScript type definitions. This is one of the most technically detailed leaks in the repository:

```ts
type capture_browser_screenshot = (_: {
  PageId: string;
  toolSummary?: string;  // "2-5 word summary of what this tool is doing"
}) => any;

type codebase_search = (_: {
  Query: string;
  TargetDirectories: string[];
  toolSummary?: string;
}) => any;

type deploy_web_app = (_: {
  Framework: "nextjs" | "sveltekit" | "remix" | ...;
  ProjectId: string;
  ProjectPath: string;
  Subdomain: string;
}) => any;
```

The toolSummary parameter on every tool is fascinating — Windsurf is instructed to briefly describe what it’s doing in every tool call. This is how the “Windsurf is doing X” status bar messages are generated.
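A rough sketch of how such a parameter could drive a status line — this plumbing is speculative, since the leak shows only the tool schema, not the client code:

```python
# Speculative sketch: a dispatcher that surfaces each tool call's
# toolSummary as a user-facing status message before executing the tool.

def dispatch(tool_call, registry, on_status):
    summary = tool_call.get("toolSummary")
    if summary:                              # the "2-5 word summary"
        on_status(f"Windsurf is {summary}")  # e.g., a status-bar update
    fn = registry[tool_call["name"]]
    args = {k: v for k, v in tool_call.items()
            if k not in ("name", "toolSummary")}
    return fn(**args)

statuses = []
registry = {
    "codebase_search": lambda Query, TargetDirectories: f"hits for {Query!r}",
}
result = dispatch(
    {"name": "codebase_search", "Query": "auth middleware",
     "TargetDirectories": ["src/"], "toolSummary": "searching the codebase"},
    registry, statuses.append,
)
print(statuses[0])
```

Because the summary travels inside the tool call itself, the UI never has to guess what the agent is doing — the model narrates its own actions.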


🔬 Anthropic Claude: Multiple Persona Versions

The Anthropic folder contains multiple versions of Claude’s agent prompt across time:

  • Agent Prompt v1.0.txt, v1.2.txt, 2.0.txt
  • Sonnet 4.5 Prompt.txt
  • Claude Code 2.0.txt
  • Chat Prompt.txt
  • Tools Wave 11.txt

This reveals how Claude’s instructions evolved as Anthropic expanded from a chat assistant to a full coding agent. The Agent Tools v1.0.json shows the original toolset, while Tools Wave 11.txt is significantly larger.


Part 4: The Resolution — What This Means for Developers

1. You Can Learn Prompt Engineering from the Best

These are production-grade system prompts written by teams at Anthropic, Cognition (Devin), Codeium (Windsurf), and Vercel. Reading them teaches you:

  • How to define tool schemas that models actually follow
  • How to structure agent loops for reliability
  • How to constrain LLM behavior with explicit rules
  • How to version prompts as products evolve

2. Security Warning for AI Startups

The repo includes a direct warning: if you’re building an AI product, your system prompt is a high-value attack surface. ZeroLeaks — linked in the repo’s README — offers prompt extraction audits.

If your system prompt contains API keys, internal URLs, or proprietary logic, this is a real risk.

3. Building Better Agents

Key patterns from the best prompts in this repo:

Pattern 1: Separate planning from execution

```
# Bad: Let the model figure it out as it goes
# Good: Manus-style explicit pseudocode planning, then execute step by step
```

Pattern 2: One tool call per reasoning step

```
# All top agents: iterate with ONE tool at a time, observe, then decide next step
```

Pattern 3: Cite your sources

```
# Devin-style: every claim backed by file + line number evidence
```

Pattern 4: Lazy edits, not rewrites

```
# Cursor/v0 style: only output changed code, use markers for unchanged sections
```

4. Compare Tool Philosophies

| Tool | Core Philosophy |
| --- | --- |
| Manus | Deliberate, event-driven, one-step-at-a-time |
| Cursor | Surgical precision, minimal output, code-first |
| Devin | Evidence-based, citation-driven, skeptical |
| v0 | Full-stack aware, component-centric, UI-first |
| Windsurf | Verbose tool API, status transparency |

Final Mental Model

```
System Prompt Anatomy (Universal Pattern)
├── Identity      → "You are X, built by Y"
├── Capabilities  → What tasks the agent can do
├── Tools         → Typed API for taking actions
├── Rules         → Constraints on behavior
├── Agent Loop    → How to iterate towards a goal
└── Output Format → How to structure responses
```
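To make the pattern concrete, here is a toy prompt builder following that anatomy — the section contents are invented placeholders, not any real product's prompt:

```python
# Toy builder assembling a system prompt from the universal sections above.
SECTIONS = ["Identity", "Capabilities", "Tools", "Rules",
            "Agent Loop", "Output Format"]

def build_system_prompt(parts: dict) -> str:
    """Join the six sections in order; refuse to ship an incomplete prompt."""
    missing = [s for s in SECTIONS if s not in parts]
    if missing:
        raise ValueError(f"missing sections: {missing}")
    return "\n\n".join(f"# {name}\n{parts[name]}" for name in SECTIONS)

prompt = build_system_prompt({
    "Identity": "You are DemoAgent, built by Example Corp.",
    "Capabilities": "You can read, edit, and run code.",
    "Tools": "read_file(path), write_file(path, text), run(cmd)",
    "Rules": "Make minimal edits. One tool call per step.",
    "Agent Loop": "Plan, act, observe, repeat until done.",
    "Output Format": "Summarize changes as a short list.",
})
print(prompt.splitlines()[0])
```

Treating the prompt as a structured artifact like this — rather than one long string — is what makes the versioning and diffing seen throughout the repo possible.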

The x1xhlol/system-prompts-and-models-of-ai-tools repository is more than a curiosity — it’s a reference architecture for how the AI industry is building the next generation of software agents. Whether you’re building your own AI product or just want to understand why your coding assistant behaves the way it does, this repo is an invaluable window into the black box.

The repo has 30,000+ lines of raw AI intelligence. Drop a star if you find it useful.

Made with ~~laziness~~ love 🦥
