Finance agents.

Where agents earn their keep in finance — reconciliation, research synthesis, KYC review — and the hard rails (audit, determinism, regulator-readable trails) they must carry. Finance is a domain where the worst case is not a bad UX day; it is a regulator-mandated breach notice and a multi-quarter remediation. The same constraints that make finance look "boring" — auditability, determinism, segregated duties — are exactly the constraints your agent must internalize before it touches anything that posts a journal entry.

STEP 1

Where finance agents actually earn their keep today.

Three jobs reliably justify themselves:

Reconciliation — matching transactions across systems (bank statement vs ledger, sub-ledger vs GL, broker confirms vs internal trade blotter). The agent surfaces the exceptions and proposes adjustments; a human approves the booking.
Research synthesis — analyst-style note generation over filings, earnings transcripts, broker research. Output is a draft for a human analyst, not a published view.
KYC / onboarding review — document extraction, sanctions-list comparison, beneficial-ownership unwinding, flagging anomalies for human investigation. The agent compresses an analyst's hour into minutes; the analyst still signs off.

The pattern is the same across all three: the agent does the volume work and presents a structured exception or draft; a human owns the consequential decision. This is also the autonomy posture you can defend to a regulator the morning after something goes wrong.
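As a rough illustration of the reconciliation pattern, the sketch below matches bank and ledger lines and returns only the exceptions for a human to review. The names `Txn`, `ReconException`, and `reconcile` are hypothetical, and the sketch assumes each transaction carries a unique shared reference; real reconciliations usually need fuzzier matching on date, amount, and counterparty.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Txn:
    ref: str         # assumed unique shared reference, e.g. a payment ID
    amount: Decimal  # Decimal rather than float so amounts compare exactly


@dataclass(frozen=True)
class ReconException:
    ref: str
    kind: str  # "missing_in_bank", "missing_in_ledger", or "amount_mismatch"
    bank_amount: Optional[Decimal]
    ledger_amount: Optional[Decimal]


def reconcile(bank: List[Txn], ledger: List[Txn]) -> List[ReconException]:
    """Return exceptions only. Nothing here posts an adjustment."""
    bank_by_ref: Dict[str, Decimal] = {t.ref: t.amount for t in bank}
    ledger_by_ref: Dict[str, Decimal] = {t.ref: t.amount for t in ledger}

    exceptions: List[ReconException] = []
    # Sort the references so the exception list is stable across runs.
    for ref in sorted(bank_by_ref.keys() | ledger_by_ref.keys()):
        bank_amt = bank_by_ref.get(ref)
        ledger_amt = ledger_by_ref.get(ref)
        if bank_amt is None:
            kind = "missing_in_bank"
        elif ledger_amt is None:
            kind = "missing_in_ledger"
        elif bank_amt != ledger_amt:
            kind = "amount_mismatch"
        else:
            continue  # matched: nothing for a human to look at
        exceptions.append(ReconException(ref, kind, bank_amt, ledger_amt))
    return exceptions
```

The function deliberately has no write path: proposing an adjustment and booking it stay separate steps, with the booking left to a human.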

STEP 2

Where finance agents don't belong.

Three places to stay out of, regardless of how clever the demo looked:

High-frequency or latency-sensitive trading. Deterministic, microsecond-bounded systems beat an LLM on latency and predictability, which are the things that matter there.
Fiduciary advice without a human in the loop. Personalized investment recommendations, suitability assessments, and discretionary asset-management decisions sit inside fiduciary regimes that assume a licensed human is making them. An agent can draft; a human owns the recommendation.
Final-mile booking of customer-impacting transactions. Trade execution, payment release, account closure — anything that moves money or status — should be a tool call the agent proposes and a human authorizes, until you have years of audit-grade evidence saying otherwise.

An agent that gives personalized fiduciary advice without a licensed human signing off is not a clever product feature — it is a regulatory event waiting to happen. Treat fiduciary surfaces as gated by default and design the human-in-the-loop in before you ship, not after a complaint.
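One way to make "the agent proposes, a human authorizes" hard to bypass is to put the check in the execution path itself. The sketch below is a minimal illustration under that assumption; `Proposal`, `approve`, `execute`, and `ApprovalRequired` are hypothetical names, and a real system would also authenticate the approver and persist the approval.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class ApprovalRequired(Exception):
    """Raised when a consequential action is executed without human sign-off."""


@dataclass
class Proposal:
    action: str                         # e.g. "release_payment"
    params: dict
    proposed_by: str                    # agent identity
    approved_by: Optional[str] = None   # human identity, set only on approval


def approve(proposal: Proposal, approver_id: str) -> Proposal:
    # Keep duties segregated: the proposer can never approve its own proposal.
    if approver_id == proposal.proposed_by:
        raise ValueError("approver must differ from the proposing identity")
    proposal.approved_by = approver_id
    return proposal


def execute(proposal: Proposal, tool: Callable[[dict], str]) -> str:
    # The tool is only reachable through this gate.
    if proposal.approved_by is None:
        raise ApprovalRequired(f"{proposal.action} has no human approval")
    return tool(proposal.params)
```

Keeping `proposed_by` and `approved_by` as separate fields means the approver's identity is recorded alongside, not merged into, the agent's.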

STEP 3

The non-negotiable rails: audit, determinism, regulator-readable trails.

Three constraints carry across every finance use case, and you do not get to negotiate them away:

Full audit trail. Every prompt, every tool call, every retrieved document, every model version, every decision — captured with timestamps and tied to the agent identity that took the action. The auditor's question is "show me everything that influenced this booking"; your system must answer it cold. See audit trails.
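Determinism where you can get it. Frozen model versions per release; temperature pinned (usually zero) for any classification or extraction; structured outputs with strict schemas; prompts and cache behavior held stable so that the same inputs produce the same outputs across runs as far as the serving stack allows. Pinning temperature narrows variance but does not always eliminate it, so record outputs alongside inputs. Variance in outputs is variance in your books.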
Regulator-readable explanation. Not the raw chain of thought, which is a persuasive narrative rather than verified causality, but a structured "why we did X": the retrieved evidence, the rule that fired, the threshold that was crossed, the human who approved. A regulator should be able to follow the explanation from its structured fields alone, without parsing free-form prose.
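The three rails above can be pictured as a single record per decision. The sketch below is an assumption about shape, not a standard: `AuditRecord`, `Explanation`, and `fingerprint` are hypothetical names, and the SHA-256 fingerprint over canonical JSON is an added illustration of how a stored record could later be checked for changes.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Tuple


@dataclass(frozen=True)
class Explanation:
    evidence_ids: Tuple[str, ...]  # the retrieved evidence
    rule: str                      # the rule that fired
    threshold: str                 # the threshold that was crossed
    approver_id: str               # the human who approved


@dataclass(frozen=True)
class AuditRecord:
    agent_id: str
    model_version: str             # frozen per release
    temperature: float             # pinned, usually 0.0
    prompt: str
    tool_calls: Tuple[str, ...]
    retrieved_doc_ids: Tuple[str, ...]
    decision: str
    explanation: Explanation
    timestamp: str                 # ISO 8601, UTC


def now_utc() -> str:
    return datetime.now(timezone.utc).isoformat()


def fingerprint(record: AuditRecord) -> str:
    # Canonical JSON (sorted keys, fixed separators) so equal records
    # always serialize, and therefore hash, identically.
    canonical = json.dumps(asdict(record), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Storing the fingerprint next to the record gives an auditor a cheap check that what they are reading is what was written at decision time.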

STEP 4

Know the regulators in your jurisdiction; name them honestly.

The regulator names you might owe an answer to depend on what you do and where:

US broker-dealers answer to FINRA and the SEC; the relevant supervision regime predates agents but applies wholesale to anything that influences customer-facing decisions.
UK financial services answer to the FCA, with conduct rules that put the firm — not the model vendor — on the hook for outcomes.
Singapore sits under MAS; EU firms additionally face the operational-resilience and AI-specific regimes layered on top of the existing financial-services rulebook.

Don't cite specific directive numbers or dollar amounts unless you have verified them — a wrong citation in a finance entry is worse than no citation. The honest framing is "regulators in most jurisdictions require X" plus a real example regulator name, and a pointer to the regulatory landscape essay for the broader map.

STEP 5

The minimum bar to ship.

Before a finance agent reaches a customer-facing or books-of-record surface, you owe yourself answers to:

Can the audit log reconstruct any individual decision end-to-end, including the model version and the retrieved evidence?
Is there a documented human approval on every consequential output, with the approver's identity preserved separately from the agent's identity?
If the model is replaced, are the deterministic surfaces (extraction, classification, structured output) reproducible against the previous version's recorded inputs?
Does the explanation log read as evidence a compliance officer can defend, or as a marketing description of the agent's reasoning?
Is there a kill switch that halts the agent for a customer, a tenant, and globally, and has it been drilled in the last quarter?

If any answer is "not yet," the answer to "can we ship?" is also "not yet." Finance is one of the few domains where shipping the rails late is structurally more expensive than building them first.
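The kill-switch question in the checklist implies three scopes that must all be checked before the agent acts. A minimal in-memory sketch follows; `KillSwitch` and its methods are hypothetical names, it assumes customer IDs are unique across tenants, and a production version would need shared, durable state rather than a Python object.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class KillSwitch:
    global_halt: bool = False
    halted_tenants: Set[str] = field(default_factory=set)
    halted_customers: Set[str] = field(default_factory=set)

    def halt_global(self) -> None:
        self.global_halt = True

    def halt_tenant(self, tenant_id: str) -> None:
        self.halted_tenants.add(tenant_id)

    def halt_customer(self, customer_id: str) -> None:
        self.halted_customers.add(customer_id)

    def is_allowed(self, tenant_id: str, customer_id: str) -> bool:
        # Any one halted scope is enough to stop the agent.
        return not (
            self.global_halt
            or tenant_id in self.halted_tenants
            or customer_id in self.halted_customers
        )
```

A drill then amounts to flipping each scope in a test environment and confirming that `is_allowed` returns `False` before every tool call the agent would have made.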