Find the friction before the customer files the ticket.
In B2B SaaS, most churn is decided in the first weeks, quietly. The customer does not complain; they just stop showing up, and by the time Customer Success notices, the decision is already made. Our client, a B2B SaaS platform with more than 200 customers, saw 30% of new signups become inactive within the first 14 days. We built a three-agent fleet that watches how new customers actually use the product, recognizes friction as it forms, and tells the CS team who needs what before anyone asks. The customer never sees it. They just experience a product where help arrives strangely on time.
Hermes runtime · 3-agent fleet · Product event streams · Friction detection · CS-facing recommendations
The client's onboarding looked healthy in aggregate and leaked in the particulars. Some customers activated fast, others stalled on the same three steps everyone stalls on, and a few went silent entirely. The data to see all of this existed: every click, every configuration step, and every abandoned flow was in the event stream. What did not exist was anyone with time to watch it per customer, per day, and translate patterns into action while the customer was still reachable.
CS worked reactively instead: tickets, check-in calls on a calendar cadence, and gut feel. The calendar does not know which customer is struggling today. The event stream does.
I · Behavior Monitor
Watches the product event stream per new account across the first 14 days: login frequency, activation steps completed, feature adoption sequence, time spent per session, error encounters, help doc visits, and support ticket submissions. Maintains a living picture of where each customer is on the onboarding path.
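A minimal sketch of what "a living picture" per account could look like, assuming each product event arrives as a dictionary with a `type`, a timestamp `ts`, and an optional `step`. The `AccountSnapshot` shape, the event type names, and `apply_event` are hypothetical illustrations, not the client's actual schema.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class AccountSnapshot:
    """Running picture of one new account during onboarding (hypothetical shape)."""
    account_id: str
    logins: int = 0
    errors: int = 0
    help_doc_visits: int = 0
    tickets: int = 0
    steps_completed: List[str] = field(default_factory=list)
    last_seen: Optional[datetime] = None


def apply_event(snapshot: AccountSnapshot, event: dict) -> AccountSnapshot:
    """Fold a single product event into the account's snapshot."""
    kind = event["type"]
    if kind == "login":
        snapshot.logins += 1
    elif kind == "error":
        snapshot.errors += 1
    elif kind == "help_doc_view":
        snapshot.help_doc_visits += 1
    elif kind == "ticket_submitted":
        snapshot.tickets += 1
    elif kind == "step_completed" and event["step"] not in snapshot.steps_completed:
        snapshot.steps_completed.append(event["step"])
    # Any event counts as activity, so keep the most recent timestamp.
    ts = event["ts"]
    if snapshot.last_seen is None or ts > snapshot.last_seen:
        snapshot.last_seen = ts
    return snapshot
```

Folding events one at a time keeps the snapshot current without re-reading the full stream, which is one plausible way to track many accounts daily.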
II · Friction Analyst
Compares each account's trajectory against a healthy onboarding baseline derived from the client's most successful customers, and against the patterns of accounts that churned. Flags stalls, drop-offs, and silence, with the evidence attached.
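One way to picture the baseline comparison, as a sketch under stated assumptions: the baseline is reduced to "the day by which healthy accounts usually complete each step", and a stall is any step still open past that day plus a tolerance. The step names, the `tolerance_days` default, and `find_stalls` are hypothetical; the real comparison also draws on churned-account patterns, which this sketch leaves out.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Stall:
    step: str
    expected_by_day: int
    days_overdue: int


def find_stalls(completed: Dict[str, int], baseline: Dict[str, int],
                account_age_days: int, tolerance_days: int = 2) -> List[Stall]:
    """Return baseline steps the account has not completed and is late on.

    completed: step -> day of onboarding on which the account finished it.
    baseline:  step -> day by which healthy accounts typically finish it.
    """
    stalls = []
    for step, expected_day in baseline.items():
        if step in completed:
            continue
        overdue = account_age_days - (expected_day + tolerance_days)
        if overdue > 0:
            # The evidence travels with the flag: which step, expected when, how late.
            stalls.append(Stall(step, expected_day, overdue))
    return stalls
```

Returning the expected day and the days overdue alongside the step name mirrors the idea that a flag carries its evidence.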
III · Recommendation Writer
Turns each flag into a concrete, personalized recommendation for the CS team: which customer, what friction, what to offer, and the suggested message. CS decides and sends.
> observe
Every new account's product events feed the Behavior Monitor from day one. No surveys, no asking, just what the customer actually does.
> compare
Trajectories are compared against known-good and known-churned patterns. A customer three days silent after a failed configuration step looks exactly like what it is.
> flag
Friction becomes a flag with evidence: the account, the stall, the pattern it matches, the risk level.
> recommend
Each flag becomes a CS-ready recommendation: the customer, the friction, the fix to offer, a suggested message in the client's tone. Recommendations are surprisingly specific. Instead of "this customer might be struggling," CS gets "this customer has attempted the Zapier integration 4 times and failed at the OAuth step each time; suggested action: send the OAuth troubleshooting walkthrough video."
> act
The CS team reviews, edits, and reaches out. The system never contacts a customer; it makes the humans precise.
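To make the flag-to-recommendation handoff concrete, here is a deterministic stand-in for the recommend step, using the Zapier example from above. It is a sketch only: the `Flag` and `Recommendation` fields, the `PLAYBOOK` lookup, and the fallback action are assumptions for illustration, and the real Recommendation Writer is an agent, not a lookup table.

```python
from dataclasses import dataclass


@dataclass
class Flag:
    account: str
    feature: str
    failed_step: str
    attempts: int
    risk: str  # hypothetical levels; the real ones are defined in the contract


@dataclass
class Recommendation:
    account: str
    friction: str
    suggested_action: str
    requires_human_review: bool = True  # CS decides and sends, always


# Hypothetical playbook: (feature, failed step) -> the fix to offer.
PLAYBOOK = {
    ("Zapier integration", "OAuth"): "send the OAuth troubleshooting walkthrough video",
}


def write_recommendation(flag: Flag) -> Recommendation:
    """Turn one flag into a CS-ready recommendation record."""
    friction = (f"attempted the {flag.feature} {flag.attempts} times "
                f"and failed at the {flag.failed_step} step each time")
    # Assumed fallback when no playbook entry matches.
    action = PLAYBOOK.get((flag.feature, flag.failed_step),
                          "schedule a short check-in call")
    return Recommendation(flag.account, friction, action)
```

The record has no send method by design; in this sketch the only thing the system can do with a recommendation is hand it to a person.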
Layer I · Visual Architecture
One diagram: events in, three agents in sequence, recommendations out to CS. The customer sits outside the system boundary, deliberately.
Layer II · Contracts
What counts as friction, what risk levels mean, and what a recommendation may propose, written with the CS lead before implementation.
Layer III · Technical Diagrams
Event ingestion, trajectory comparison, flag thresholds, and recommendation formats, all specified in advance.
Layer IV · Implementation
Hermes runtime on a dedicated, isolated instance; Supabase for trajectories, flags, and recommendation history; Slack delivery to the CS channel, treated as a non-critical channel.
False alarms
A quiet week is not always churn; holidays and procurement cycles exist. Silence is evaluated against the account's own rhythm, not a global constant.
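A sketch of what "the account's own rhythm" could mean in practice, assuming rhythm is the median gap between the account's past active days. The `multiplier` and `min_gap_days` defaults and the function name are hypothetical tuning choices, not figures from the project.

```python
from datetime import date
from statistics import median
from typing import List


def is_unusually_silent(active_days: List[date], today: date,
                        multiplier: float = 3.0, min_gap_days: int = 2) -> bool:
    """Compare the current silence with this account's typical gap between visits."""
    days = sorted(set(active_days))
    if len(days) < 3:
        # Too little history to know the rhythm; leave the call to other signals.
        return False
    gaps = [(later - earlier).days for earlier, later in zip(days, days[1:])]
    # Floor the typical gap so daily users are not flagged after one quiet weekend.
    typical_gap = max(median(gaps), min_gap_days)
    current_gap = (today - days[-1]).days
    return current_gap > multiplier * typical_gap
```

Under these assumptions, an account that logged in daily and then vanished for eight days is flagged, while an account that has only ever shown up weekly is not flagged for the same eight days.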
Alert fatigue
CS trusts the flag because flags are rationed: evidence thresholds, deduplication per account, and risk levels that mean something.
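Rationing can be sketched as a small gate in front of delivery, assuming two rules: a flag needs a minimum amount of evidence, and the same friction on the same account is not repeated within a cooldown window. `FlagGate`, its defaults, and the notion of an evidence count are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict, Tuple


class FlagGate:
    """Decides whether a flag is worth CS attention (illustrative rules only)."""

    def __init__(self, min_evidence: int = 2,
                 cooldown: timedelta = timedelta(days=3)):
        self.min_evidence = min_evidence
        self.cooldown = cooldown
        self._last_sent: Dict[Tuple[str, str], datetime] = {}

    def should_send(self, account: str, friction: str,
                    evidence_count: int, now: datetime) -> bool:
        # Evidence threshold: one weak signal is not a flag.
        if evidence_count < self.min_evidence:
            return False
        # Deduplication per account and friction type within the cooldown.
        key = (account, friction)
        last = self._last_sent.get(key)
        if last is not None and now - last < self.cooldown:
            return False
        self._last_sent[key] = now
        return True
```

The gate only records a send time when it lets a flag through, so suppressed duplicates do not extend the cooldown.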
Pattern overfitting
Success patterns are recalibrated as the product changes; last quarter's healthy trajectory is not gospel. The baseline is recalibrated automatically every month from the latest top-performing customer cohort.
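A minimal sketch of a monthly recalibration, assuming the baseline is expressed as a typical completion day per onboarding step and is rebuilt from the latest top-performing cohort. How that cohort is selected is not shown here, and the step names and the choice of the median are assumptions.

```python
import math
from statistics import median
from typing import Dict, List


def recalibrate_baseline(cohort: List[Dict[str, int]]) -> Dict[str, int]:
    """Rebuild the baseline from a cohort of successful accounts.

    cohort: one dict per account, mapping step -> day on which it was completed.
    Returns step -> median completion day, rounded up to a whole day.
    """
    days_by_step: Dict[str, List[int]] = {}
    for account in cohort:
        for step, day in account.items():
            days_by_step.setdefault(step, []).append(day)
    # Rounding up errs on the side of giving accounts slightly more time.
    return {step: math.ceil(median(days)) for step, days in days_by_step.items()}
```

Because the baseline is recomputed from whatever steps the cohort actually completed, a step added to the product last month appears in the baseline as soon as successful accounts start completing it.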
Privacy creep
The system reads product usage events only. The boundary is architectural, and it is part of what the client can tell their customers with a straight face.
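An architectural boundary of this kind is often enforced as an allowlist at ingestion. The sketch below assumes events are dictionaries and that only a fixed set of usage event types and fields is admitted; the specific type names and field names are hypothetical.

```python
from typing import Optional

# Hypothetical allowlist: product usage event types only.
ALLOWED_EVENT_TYPES = frozenset({
    "login", "step_completed", "feature_used", "error",
    "help_doc_view", "ticket_submitted",
})
# Only these fields survive ingestion; everything else is dropped.
ALLOWED_FIELDS = ("account_id", "type", "step", "ts")


def admit(event: dict) -> Optional[dict]:
    """Return a stripped copy of a usage event, or None if it is out of scope."""
    if event.get("type") not in ALLOWED_EVENT_TYPES:
        return None
    return {key: event[key] for key in ALLOWED_FIELDS if key in event}
```

Filtering at the entry point means the agents downstream never hold data outside the allowlist, which keeps the privacy claim checkable in one place.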
3
Agents in the fleet
6 weeks
From kickoff to staging
staging
Current status, production rollout in phases
day one
Coverage of every new account from signup
72%
Friction Analyst accuracy in staging
100%
Recommendations reviewed by a human before any outreach
The system is in staging with the client's CS team; production figures will be published once the rollout completes.
The event stream already knows. Every churned account had left tracks weeks in advance. The engineering problem was never detection in principle; it was watching every account every day, which is exactly the kind of vigilance humans are bad at and agents are built for.
Recommendations beat alerts. An alert says something is wrong and creates work. A recommendation says here is the customer, the friction, and the message, and removes work. The CS team's adoption followed that difference precisely.
Staging is where trust is built. Running shadow-mode next to the CS team's own judgment, and being right, is what turns a tool into a colleague. We ship the confidence before we ship the autonomy.
Your churn is announcing itself in week one. Is anyone watching?
> ../book_a_call.sh