POST · 03 CLUSTER / AI AGENTS WALKTHROUGH · 13 MIN READ · APRIL 2026

Building your first revenue agent: a technical walkthrough.

This is the post we'd have wanted when we shipped our first revenue agent. No abstraction soup, no vendor marketing. It goes from a blank repo to a production system that researches accounts, drafts first-touch emails, logs outcomes to the CRM, and runs for under $40/mo all-in at 1,000 drafts. Steal it.

The goal

A background worker that, every morning: pulls accounts flagged as "ready for outreach," runs a structured research pass on each, drafts a first-touch email in the founder's voice, and drops the draft into a queue for approval. On approval, the email sends; on any reply, a downstream loop records the outcome. One agent, one loop, one job.

Not a toy. This exact shape runs inside three clients today.

The stack

Minimal, boring, durable. Python 3.12, Postgres 16, a Claude or GPT-4 class model over the Anthropic or OpenAI API, Resend for email, a tiny Next.js dashboard for the approval queue, and a single Temporal worker to orchestrate the loop. No LangChain, no agent framework, no vector DB for this first one. You can add those later if you need them; most agents don't.

$ tree agent/

agent/
├── prompts/
│   ├── research.md
│   └── write.md
├── tools/
│   ├── enrich.py        # third-party data pulls
│   ├── crm.py           # read + write HubSpot
│   └── send.py          # Resend wrapper
├── loop.py              # the orchestrator
├── schema.sql           # 4 tables: accounts, briefs, drafts, outcomes
└── main.py              # Temporal worker entrypoint
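
To make the shape of loop.py concrete, here is a minimal sketch of one morning pass in plain Python. It leaves Temporal out on purpose, and the four callables (fetch_ready, research, write, enqueue) are hypothetical stand-ins for the tools and prompts in the tree above, not names from our codebase.

```python
from typing import Callable, Iterable


def run_morning_loop(
    fetch_ready: Callable[[], Iterable[dict]],
    research: Callable[[dict], dict],
    write: Callable[[dict], str],
    enqueue: Callable[[dict], None],
) -> int:
    """Research and draft for every ready account; return how many drafts were queued."""
    queued = 0
    for account in fetch_ready():
        brief = research(account)   # structured research pass
        body = write(brief)         # first-touch email in the founder's voice
        # Nothing is sent here: the draft only enters the approval queue.
        enqueue({"account_id": account["id"], "body": body, "status": "pending"})
        queued += 1
    return queued
```

Passing the steps in as functions keeps the loop testable with fakes; in a Temporal worker each of them would likely become an activity.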

Step 01 · The schema

Four tables. accounts (id, domain, signals jsonb, last_seen), briefs (account_id, summary, angles jsonb, created_at), drafts (account_id, body, status, approved_by, sent_at), outcomes (account_id, event, payload jsonb, at). That's the whole persistence surface.
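
One way to write those four tables down is sketched below, as a DDL string a Python migration script could apply. The column names come from the list above; every type, constraint, and the set of status values is our assumption for illustration.

```python
import re

# Column names are from the post; types and constraints are assumptions.
SCHEMA_SQL = """
CREATE TABLE accounts (
    id         BIGSERIAL PRIMARY KEY,
    domain     TEXT NOT NULL UNIQUE,
    signals    JSONB NOT NULL,
    last_seen  TIMESTAMPTZ
);
CREATE TABLE briefs (
    account_id BIGINT NOT NULL REFERENCES accounts (id),
    summary    TEXT NOT NULL,
    angles     JSONB NOT NULL,
    created_at TIMESTAMPTZ NOT NULL
);
CREATE TABLE drafts (
    account_id  BIGINT NOT NULL REFERENCES accounts (id),
    body        TEXT NOT NULL,
    status      TEXT NOT NULL,  -- pending | approved | killed
    approved_by TEXT,
    sent_at     TIMESTAMPTZ
);
CREATE TABLE outcomes (
    account_id BIGINT NOT NULL REFERENCES accounts (id),
    event      TEXT NOT NULL,
    payload    JSONB NOT NULL,
    at         TIMESTAMPTZ NOT NULL
);
"""

# Quick sanity check that the persistence surface is still four tables.
TABLES = re.findall(r"CREATE TABLE (\w+)", SCHEMA_SQL)
```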

Resist the urge to add more. The compounding is in how much each row knows, not how many tables you have.

Step 02 · The research prompt

A single markdown file that takes signals and a company domain and returns a structured JSON brief: what the company does, three plausible business reasons they'd need you, one piece of verifiable timing evidence. Three hundred tokens in, six hundred tokens out. The magic is in the instruction to cite the source URL for any claim — otherwise the LLM hallucinates a funding round that never happened and your outbound dies of embarrassment.
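
The citation rule is easy to enforce in code before a brief is ever stored. The sketch below assumes a JSON shape with hypothetical field names (reasons, timing_evidence, source_url); your prompt may name them differently.

```python
import json
from urllib.parse import urlparse


def _has_source(claim: dict) -> bool:
    """True when the claim carries an http(s) source URL."""
    url = urlparse(str(claim.get("source_url", "")))
    return url.scheme in ("http", "https") and bool(url.netloc)


def validate_brief(raw: str) -> dict:
    """Parse the model's JSON and reject any brief with an unsourced claim."""
    brief = json.loads(raw)
    reasons = brief.get("reasons", [])
    if len(reasons) != 3:
        raise ValueError("expected exactly three reasons")
    # The timing evidence is a claim too, so it is checked alongside the reasons.
    claims = list(reasons) + [brief.get("timing_evidence") or {}]
    if not all(isinstance(c, dict) and _has_source(c) for c in claims):
        raise ValueError("every claim needs a source_url")
    return brief
```

A brief that fails validation can simply be retried or dropped. Note this only checks that a URL is present; it does not confirm the page supports the claim.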

Step 03 · The writer prompt

The writer takes the brief and emits a draft. This prompt is 80% voice and 20% structure. Feed it 10–20 real examples of the founder's best previous cold emails, labeled with reply rates. Ask it to match the cadence, the hedges, the length. The most common mistake is asking for "concise." What you want is consistency with the corpus, which usually runs longer than "concise" implies.

Output is four things: subject line, opener that references the brief, body that offers a crisp reason to meet, and signoff. Nothing else. No "I hope this finds you well."
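
As a rough illustration of how the voice examples and the four-part output spec can be assembled, here is a small prompt builder. The function name, the wording of the instructions, and the choice to list the best-performing examples first are all assumptions on our part.

```python
def build_writer_prompt(brief_summary: str, examples: list[tuple[str, float]]) -> str:
    """Assemble the writer prompt: labeled voice examples, then the brief and output spec."""
    # Highest reply rate first (an assumed ordering, not a requirement).
    ranked = sorted(examples, key=lambda ex: ex[1], reverse=True)
    shots = "\n\n".join(
        f"Example (reply rate {rate:.0%}):\n{text}" for text, rate in ranked
    )
    return (
        "Match the cadence, hedges, and length of these emails.\n\n"
        f"{shots}\n\n"
        f"Brief:\n{brief_summary}\n\n"
        "Return exactly four parts: subject line, opener that references the brief, "
        "body with one reason to meet, signoff."
    )
```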

IMG / PLACEHOLDER · Writer prompt output example: a real brief → draft run from the system, with both halves side by side.

Step 04 · The approval loop

This is the one step most guides skip. Your first agent must not send anything outward without a human touching it. The interface is dead simple: a Next.js page that lists drafts with status=pending, side by side with the brief they came from, plus three buttons: approve, edit, kill.

Edits rewrite the drafts.body field. Approvals flip status to approved and flag a send. Kills flip to killed with an optional reason. That reason, over hundreds of runs, becomes the single most valuable training signal you have.
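
The three buttons boil down to a tiny state machine. The sketch below is one possible version, with two assumptions worth flagging: an edited draft stays pending until someone approves it, and the kill reason is written as an outcomes row, since the drafts table as listed has no reason column.

```python
from typing import Optional


def review(
    draft: dict,
    action: str,
    reviewer: str,
    new_body: Optional[str] = None,
    reason: Optional[str] = None,
) -> tuple[dict, Optional[dict]]:
    """Apply approve / edit / kill to a pending draft.

    Returns the updated draft and, for kills, an outcomes row carrying the reason.
    """
    if draft["status"] != "pending":
        raise ValueError("only pending drafts can be reviewed")
    updated = dict(draft)  # leave the caller's copy untouched
    if action == "edit":
        if not new_body:
            raise ValueError("edit needs a new body")
        updated["body"] = new_body  # assumed: still pending until approved
        return updated, None
    if action == "approve":
        updated.update(status="approved", approved_by=reviewer)
        return updated, None
    if action == "kill":
        updated["status"] = "killed"
        outcome = {
            "account_id": draft["account_id"],
            "event": "killed",
            "payload": {"reason": reason, "by": reviewer},
        }
        return updated, outcome
    raise ValueError(f"unknown action: {action}")
```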

Step 05 · The send

Resend or any transactional API. Critical detail: always record the full sent payload (subject, body, to, from, message_id) to the outcomes table the instant you ship it. If the send fails downstream you need to diff the intent against what actually landed.
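
Here is a sketch of that record-on-send habit. send_email and insert_outcome are hypothetical callables: the first stands in for a wrapper in tools/send.py that returns the provider's message ID, the second for an insert into the outcomes table.

```python
from datetime import datetime, timezone
from typing import Callable


def send_and_record(
    draft: dict,
    subject: str,
    to: str,
    sender: str,
    send_email: Callable[[dict], str],
    insert_outcome: Callable[[dict], None],
) -> dict:
    """Send an approved draft and immediately log the full payload as an outcome."""
    intent = {"subject": subject, "body": draft["body"], "to": to, "from": sender}
    message_id = send_email(intent)  # hypothetical wrapper; returns the message ID
    row = {
        "account_id": draft["account_id"],
        "event": "sent",
        "payload": {**intent, "message_id": message_id},
        "at": datetime.now(timezone.utc).isoformat(),
    }
    insert_outcome(row)
    return row
```

Keeping the intent dict separate from the provider response is what makes the later diff possible.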

Step 06 · The reply loop

A second worker, pointed at a catch-all inbox, polls for inbound mail. Every inbound message that threads back to a sent message writes an outcomes row with the thread body, a classification (positive, negative, ooo, unsubscribe), and a timestamp. The classification is another small LLM call, so don't overthink it.
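
A minimal sketch of the threading step, using Python's standard email module. It assumes single-part plain-text replies and assumes the In-Reply-To header matches a message ID you stored at send time; classify is a hypothetical stand-in for the small LLM call.

```python
import email
from datetime import datetime, timezone
from typing import Callable, Optional

LABELS = {"positive", "negative", "ooo", "unsubscribe"}


def outcome_from_inbound(
    raw_message: str,
    sent_by_message_id: dict[str, int],
    classify: Callable[[str], str],
) -> Optional[dict]:
    """Build an outcomes row for an inbound reply, or None if it is not a reply to us."""
    msg = email.message_from_string(raw_message)
    parent = (msg["In-Reply-To"] or "").strip()
    account_id = sent_by_message_id.get(parent)
    if account_id is None:
        return None  # does not thread back to anything we sent
    body = msg.get_payload()  # a str for single-part messages (assumed here)
    label = classify(body)
    if label not in LABELS:
        raise ValueError(f"unexpected classification: {label}")
    return {
        "account_id": account_id,
        "event": "reply",
        "payload": {"thread_body": body, "classification": label},
        "at": datetime.now(timezone.utc).isoformat(),
    }
```

Rejecting labels outside the four allowed values keeps a chatty model from polluting the outcomes table.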

Step 07 · The compounding query

This is the reason the whole exercise matters. Once outcomes exist, a nightly query pulls every (brief, draft, outcome) triple and asks: which brief shapes produced positive replies? Which opener patterns did? Which send windows? The answers feed two prompt-level updates, one to the research prompt and one to the writer prompt, nudging each toward what worked.

In month one this query is noise. In month three it's the reason your reply rate quietly doubles while your token spend stays flat.
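
The nightly aggregation can be as simple as a grouped positive-reply rate. In the sketch below each triple is assumed to be flattened into one dict, and keys such as angle or opener_pattern are hypothetical labels you would derive from the brief and draft.

```python
from collections import defaultdict


def positive_rate_by(triples: list[dict], key: str) -> dict[str, float]:
    """Share of positive replies per bucket of `key` (e.g. angle, opener_pattern)."""
    sent: dict[str, int] = defaultdict(int)
    positive: dict[str, int] = defaultdict(int)
    for t in triples:
        bucket = t[key]
        sent[bucket] += 1
        if t["classification"] == "positive":
            positive[bucket] += 1
    return {bucket: positive[bucket] / sent[bucket] for bucket in sent}
```

With only a handful of rows per bucket the rates swing wildly, which is why the month-one output reads as noise.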

Costs, real numbers

Per 100 outbound drafts produced: around $1.40 in inference (Claude Sonnet or GPT-4o-mini class), $0.60 in third-party enrichment, $0.20 in email send, plus a flat $12/mo of Temporal and Postgres. Under $40/mo at 1,000 drafts.
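
The arithmetic behind that ceiling, as a small calculator you can rerun with your own volumes. The figures are the ones quoted above; the function name is ours.

```python
# Per-100-draft costs from the post, in cents to keep the arithmetic exact.
INFERENCE_CENTS = 140
ENRICHMENT_CENTS = 60
SEND_CENTS = 20
FLAT_MONTHLY_CENTS = 1200  # Temporal + Postgres


def monthly_cost(drafts: int) -> float:
    """Dollars per month for a given draft volume."""
    per_100 = INFERENCE_CENTS + ENRICHMENT_CENTS + SEND_CENTS  # 220 cents
    return round((per_100 * drafts / 100 + FLAT_MONTHLY_CENTS) / 100, 2)


print(monthly_cost(1000))  # 34.0
```

At 1,000 drafts that is $22 of variable cost plus the $12 flat fee, of which inference itself is only $14.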

In week one, expect to spend about 20 hours. By the end of month one, that drops to under 3 hours per week reviewing the queue. Review time keeps shrinking from there as the prompts improve.

The traps we hit so you don't

Trap one: too many tools. We tried a vector DB before we needed it, and burned two weeks on an index that solved no real problem. Add it later if you need retrieval over a large corpus; skip it for a first agent.

Trap two: prompt-only thinking. In our experience the writer prompt accounts for roughly 30% of output quality. The other 70% comes from the quality of the brief. Invest in research before you invest in writing.

Trap three: auto-send too early. If you remove the human approval step before you've shipped ~200 reviewed sends, the agent drifts into a generic voice, and you won't notice until the reply rate collapses.

IMG / PLACEHOLDER · Month-three reply rate chart: weekly reply rate and cost-per-meeting booked over 12 weeks.

Where to go from here

Once this agent is stable — meaning it runs for a month without you babysitting it — the natural next steps are: add a memory layer so past conversations shape new drafts, add a research agent that ingests signals rather than waiting for flags, and split the writer into subject-line and body sub-prompts for independent learning. Each is a week of work on top of a working foundation.

Your first revenue agent should be boring. One job, one loop, four tables. The compounding comes from the outcomes table growing every week, not from how exotic the stack looks in the architecture diagram.
