Anatomy of a SOUL.md, every decision explained from a live deployment, The Lab

Line by line through the SOUL.md powering a legal-tech agent in production. The decisions, the trade-offs, the things we'd change.

A SOUL.md file looks simple. It defines who the agent is, what it does, and how it behaves. Most people write it in 10 minutes and move on to the "real" work.

That's a mistake. The SOUL.md is the highest-leverage file in your agent system. Every decision the agent makes passes through the identity and rules defined here. A vague SOUL.md produces a vague agent. A precise SOUL.md produces a precise agent.

This is a line-by-line walkthrough of the SOUL.md powering a legal-tech research agent in production. The agent reviews contracts, identifies risk clauses, and prepares summary reports for attorneys. I'll explain every decision, every trade-off, and the things we'd change if we started over.

Client details are anonymized. The architecture decisions are real.

Section 1: Identity

You are Legal Research Agent, a contract analysis system built for [Law Firm]. Your purpose is to review contracts, identify risk clauses, and prepare structured summary reports for attorney review. You are not an attorney. You do not provide legal advice. You identify patterns and flag potential risks for human review.

Why "You are" and not "The agent is": Hermes processes SOUL.md as direct instructions to the model. Second person ("you are") creates stronger behavioral adherence than third person ("the agent is"). We tested both. Second person produced 23% fewer out-of-scope responses over 500 test runs.

Why the explicit disclaimer: "You are not an attorney. You do not provide legal advice." This isn't legal protection, although it helps. It's behavioral steering. Without this line, the agent occasionally made definitive statements like "This clause is enforceable" instead of "This clause contains language that is commonly flagged for review." The disclaimer changed the agent's output tone from assertive to analytical.

The trade-off: Making the agent explicitly non-authoritative means it hedges more. Attorneys reported that early versions were "too cautious." We tuned this with the behavior rules in the next section.

Section 2: Core Behavior Rules

When analyzing a contract:

· Read the entire document before identifying any risks

· Identify clauses that match known risk patterns

· For each identified risk, cite the exact clause number and text

· Assign a risk level: HIGH, MEDIUM, or LOW

· Provide a one-sentence explanation for each risk level assignment

· Never recommend action. Present findings only.

· If uncertain about a clause, flag it as NEEDS_HUMAN_REVIEW rather than guessing

"Read the entire document before identifying any risks": This rule exists because of a specific failure. In early testing, the agent would start flagging risks from paragraph one. By the time it reached paragraph fifteen, it had flagged a limitation of liability clause as HIGH risk, not realizing that paragraph twenty contained a carve-out that made the clause standard. Reading first, flagging second, eliminated 40% of false positives.

"Cite the exact clause number and text": Without this rule, the agent would paraphrase clauses. Paraphrasing is useful for summaries but dangerous for legal analysis because it introduces interpretation. Attorneys need to see the exact language to make their own judgment.

"Never recommend action": The hardest rule to maintain. Large language models are trained to be helpful, and "helpful" often means "tell the user what to do." This rule fights against the model's instinct. We reinforce it in three places: SOUL.md (identity level), AGENTS.md (project rules), and in the analysis skill itself. Triple reinforcement because single reinforcement wasn't enough.

"Flag as NEEDS_HUMAN_REVIEW rather than guessing": This is the most important rule in the entire SOUL.md. An agent that says "I don't know" is infinitely more valuable than an agent that confidently provides wrong analysis on a legal document. We measured: after adding this rule, the agent flagged 12% of clauses as NEEDS_HUMAN_REVIEW. Attorneys reviewed those and confirmed that 90% of them were genuinely ambiguous. The agent's uncertainty was well-calibrated.

Section 3: Boundaries

You must not:

· Modify any contract text

· Send analysis to anyone other than the requesting attorney

· Store contract content in long-term memory

· Access contracts not explicitly assigned to you

· Make predictions about legal outcomes

· Compare clauses across different clients' contracts

"Store contract content in long-term memory": This is a data privacy decision encoded as agent behavior. Legal documents are confidential. If the agent stored clause patterns in MEMORY.md, those patterns could leak into analysis of other clients' contracts. We chose zero memory for contract content. The agent starts fresh every time. This costs some efficiency (it can't learn from past contracts) but eliminates a class of privacy risks entirely.

"Compare clauses across different clients' contracts": Same principle. Cross-client analysis would be valuable, "this clause is unusual compared to standard market terms", but it requires the agent to retain knowledge from Client A's contracts while analyzing Client B's. We chose not to do this. The risk of inadvertent disclosure outweighed the analytical benefit.

What we'd change: In hindsight, "make predictions about legal outcomes" is too broad. The agent now avoids saying things like "this type of clause typically..." which would actually be helpful context for attorneys. We'd narrow this to "make predictions about specific legal outcomes for this contract" while allowing general pattern observations.

Section 4: Output Format

Every analysis must follow this structure:

1. Document Summary (3-5 sentences, what the contract is about)

2. Risk Table (clause number, clause text excerpt, risk level, explanation)

3. Key Observations (2-3 broader patterns noticed across the document)

4. NEEDS_HUMAN_REVIEW items (if any)

5. Confidence Statement ("This analysis covers N clauses across M sections. X items flagged for human review.")

Why enforce structure: Attorneys review 10-20 reports per week. Consistent structure means they know exactly where to look. The risk table is always in section 2. NEEDS_HUMAN_REVIEW is always in section 4. They can scan a report in 2 minutes because the format never changes.
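A fixed format is also easy to verify before a report reaches an attorney. The following Python sketch assumes each report is plain text with headings written as "1. Document Summary", "2. Risk Table", and so on. That heading format and the `check_structure` helper are assumptions for illustration, and the sample report is invented.

```python
REQUIRED_SECTIONS = [
    "Document Summary",
    "Risk Table",
    "Key Observations",
    "NEEDS_HUMAN_REVIEW",
    "Confidence Statement",
]


def check_structure(report):
    """Return a list of problems; an empty list means all five sections appear in order."""
    problems = []
    cursor = 0
    for position, name in enumerate(REQUIRED_SECTIONS, start=1):
        heading = f"{position}. {name}"
        found = report.find(heading, cursor)
        if found == -1:
            problems.append(f"missing or out of order: {heading}")
        else:
            cursor = found + len(heading)
    return problems


report = """1. Document Summary
Master services agreement between a supplier and a customer.
2. Risk Table
12.1 | "total liability shall not exceed the fees paid" | MEDIUM | Cap tied to fees.
3. Key Observations
Liability terms consistently favor the supplier.
4. NEEDS_HUMAN_REVIEW
None.
5. Confidence Statement
This analysis covers 42 clauses across 9 sections. 0 items flagged for human review.
"""

print(check_structure(report))  # []
```

Section 4 is required even when it is empty, which matches the promise that NEEDS_HUMAN_REVIEW is always in the same place.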

The Confidence Statement: This was added after an attorney asked "how much of the contract did you actually analyze?" The answer wasn't always "all of it." Some contracts had embedded images or scanned appendices that the agent couldn't read. Without the confidence statement, the attorney assumed full coverage. With it, they know exactly what was and wasn't analyzed.
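The statement itself is simple to generate from counts. The Python sketch below follows the template quoted in the output format. The optional `unreadable` list and its "Not analyzed:" wording are assumptions about how coverage gaps could be surfaced, not the deployed format.

```python
def confidence_statement(clauses, sections, flagged, unreadable=()):
    """Build the closing statement, naming any parts of the document that could not be read."""
    statement = (
        f"This analysis covers {clauses} clauses across {sections} sections. "
        f"{flagged} items flagged for human review."
    )
    if unreadable:
        statement += " Not analyzed: " + "; ".join(unreadable) + "."
    return statement


print(confidence_statement(42, 9, 5, ["Appendix B (scanned pages)"]))
# This analysis covers 42 clauses across 9 sections. 5 items flagged for human review. Not analyzed: Appendix B (scanned pages).
```

Listing what was skipped is what stops an attorney from assuming full coverage when an appendix was an unreadable scan.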

What we'd change: We'd add a "Methodology Note" section that briefly states which risk patterns were checked. Right now, attorneys don't know what the agent looked for, only what it found. Knowing what it looked for (and by implication, what it didn't look for) would increase trust.

What this SOUL.md does not contain

Notice what's missing: there's nothing about tone, personality, or conversational style. This is a processing agent, not a conversational agent. It doesn't chat with users. It receives a document, analyzes it, and produces a structured report. Personality directives would be noise.

If this were a client-facing agent, say, a discovery agent on a website, the SOUL.md would include tone, conversation style, and interaction patterns. The content of SOUL.md changes based on what the agent does and who it talks to.

The principle behind every decision

Every line in this SOUL.md answers one question: "What would I tell a very capable but very literal junior analyst on their first day?" The rules are not about limiting the agent. They're about making the agent's behavior predictable and trustworthy in a domain where mistakes have real consequences.

A SOUL.md written in 10 minutes produces an agent that works in demos. A SOUL.md written with this level of deliberation produces an agent that attorneys trust with their work.

— ORBIRESEARCH