Outbound voice agents.
Agents that make the call instead of answering it — pacing, abandonment, identity disclosure, and the regulatory landmines that turn a clever demo into a fine. An outbound voice agent is structurally a different product from the inbound voice agents covered elsewhere in this section: the latency budget, the rapport-building moves, and the regulatory regime all shift the moment your system is the one dialing.
Outbound is not inbound, even with the same speech stack.
Inbound voice agents start a turn in a known context: the caller dialed in for a reason and has already framed the conversation. The latency budget is generous, the user is leaning in, and the failure modes are mostly about not breaking the relationship the caller already chose to start.
Outbound voice agents flip every one of those assumptions. The callee did not ask for the call; the first three seconds decide whether they hang up; latency has to feel natural without the cover of "you're already in our IVR"; and the most important conversational move is not solving the problem but establishing that the call is legitimate. The speech stack is the same, but the failure modes are not, and neither is the product.
Pacing and abandonment — the math you cannot ignore.
Predictive-dialer math applies to AI-driven outbound just as it did to human call-center outbound. If the system dials more numbers than it has live agents (or agent capacity) to handle, some answered calls get dropped, or "abandoned", and the callee hears a click. Major jurisdictions cap the abandonment rate at a low single-digit percentage (the US telemarketing rules, for example, use 3 percent of live-answered calls, measured per campaign), and the cap applies whether the "agent" is a person or a model.
Practically: never dial more concurrent calls than your model + downstream capacity can actually pick up and speak on within the regulated answer window. Treat abandonment rate as a first-class metric, monitor it per campaign, and tie a kill-switch to the threshold. A demo that "just dials harder to test scale" is the demo that produces the campaign-ending compliance event.
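The monitoring-plus-kill-switch idea can be sketched in a few lines. This is a minimal illustration, not a production dialer: the class name, the 3 percent default, and the `min_sample` warm-up are all assumptions, and the real threshold and measurement window must come from the rules that apply to your campaign.

```python
from dataclasses import dataclass


@dataclass
class AbandonmentMonitor:
    """Tracks abandoned vs. live-answered calls for one campaign (hypothetical helper)."""

    max_rate: float = 0.03  # assumed cap; set it from the rules that apply to you
    min_sample: int = 100   # assumed warm-up so the first few calls cannot trip the switch
    answered: int = 0       # calls answered by a live person
    abandoned: int = 0      # answered calls the agent failed to pick up in time
    halted: bool = False

    def rate(self) -> float:
        """Abandoned calls as a fraction of live-answered calls."""
        return self.abandoned / self.answered if self.answered else 0.0

    def record_answered(self, agent_connected: bool) -> None:
        """Record one live-answered call and re-evaluate the kill-switch."""
        self.answered += 1
        if not agent_connected:
            self.abandoned += 1
        if self.answered >= self.min_sample and self.rate() > self.max_rate:
            self.halted = True  # latched: stays tripped until a human resets it

    def may_dial(self, in_flight: int, free_agent_slots: int) -> bool:
        """Allow a new dial only if spare capacity exists and the switch is not tripped."""
        return not self.halted and in_flight < free_agent_slots
```

The switch is latched on purpose: once a campaign crosses the threshold, resuming should be a human decision rather than something the dialer does on its own when the rate drifts back down.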
Identity disclosure: you must say "this is an AI" — clearly, early, in a real voice.
In the United States, the Telephone Consumer Protection Act (TCPA) and FCC rules on prerecorded and AI-generated voice require that the calling entity identify itself, and recent FCC guidance has treated AI-generated voice on outbound calls as falling within the existing artificial-voice rules. The EU and UK regimes have parallel requirements rooted in transparency and consent. The right design assumption is: identity disclosure is mandatory in every jurisdiction you operate in.
"Identify as AI" must be a sentence the callee actually hears and understands, not a legal nicety buried in fast speech at the end. It should be something the callee can comprehend in one pass, in the first few seconds of the call, in plain language ("Hi, this is an AI assistant calling on behalf of Acme. Is now a good time?"). A disclosure that the model races through during a barge-in is not a disclosure.
AI disclosure is not a marketing choice. Treat it as a regulatory requirement in every jurisdiction you operate in. Build it into the prompt as a non-skippable opening, write the eval to fail any call that does not include it, and treat a missed disclosure as a P0 incident, not a polish item.
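A per-call disclosure eval could look something like the sketch below. The `Utterance` shape, the five-second deadline, and the phrase pattern are all assumptions made for illustration; the pattern should mirror whatever wording your prompt actually mandates.

```python
import re
from dataclasses import dataclass


@dataclass
class Utterance:
    speaker: str    # "agent" or "callee"
    start_s: float  # seconds since the call connected
    text: str


# Assumed phrasing; extend it to match the disclosure wording you require.
AI_DISCLOSURE = re.compile(
    r"\b(an ai|artificial intelligence|automated|virtual)\s+(assistant|agent|system)\b",
    re.IGNORECASE,
)


def disclosure_passes(transcript: list[Utterance], deadline_s: float = 5.0) -> bool:
    """Pass only if the agent's FIRST utterance starts in time and discloses AI."""
    for u in transcript:
        if u.speaker == "agent":
            return u.start_s <= deadline_s and bool(AI_DISCLOSURE.search(u.text))
    return False  # the agent never spoke: fail closed
```

A text match only shows that the words were produced. It says nothing about speaking rate or whether the callee was talking over them, so a fuller eval would also check audio timing and barge-in overlap.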
Consent, recording, and the controls callees expect.
Three controls are non-negotiable on outbound:
- Do-not-call honoring. Federal DNC, state DNC, and internal company DNC lists must be checked before the dial — not after. The list filter is the same upstream fail-closed gate the GTM playbook talks about, just for phone numbers instead of email addresses.
- Time-of-day restrictions. Calling hours are regulated by the callee's local time zone, not the dialer's. A correctly localized restriction must resolve the callee's time zone before each dial. The area code is only a first approximation, because numbers are ported and mobile callees move; modern carrier metadata is more reliable where it is available.
- In-call opt-out. The callee can say "stop calling me" mid-conversation; the agent must detect that intent, confirm it verbally, and route the number to the internal DNC immediately. A model that argues, asks "are you sure?", or proceeds to its script is producing exactly the consumer-protection complaint the rules were written for.
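The three controls above can be sketched as a fail-closed pre-dial gate plus an in-call opt-out handler. Everything named here is hypothetical: the 08:00 to 21:00 window is an assumed placeholder, the `dnc` set stands in for the merged federal, state, and internal lists, and the keyword pattern stands in for a real intent classifier.

```python
import re
from datetime import datetime, time
from typing import Optional
from zoneinfo import ZoneInfo

# Assumed window; permitted calling hours vary by jurisdiction.
EARLIEST, LATEST = time(8, 0), time(21, 0)


def may_dial(number: str, callee_tz: Optional[str], dnc: set[str], now_utc: datetime) -> bool:
    """Fail-closed pre-dial gate: any doubt means no dial."""
    if now_utc.tzinfo is None:
        raise ValueError("now_utc must be timezone-aware")
    if number in dnc:
        return False
    if callee_tz is None:  # unknown time zone: do not guess
        return False
    local = now_utc.astimezone(ZoneInfo(callee_tz)).time()
    return EARLIEST <= local < LATEST


# Stand-in for an intent classifier; keyword matching alone will miss paraphrases.
OPT_OUT = re.compile(
    r"\b(stop calling|do not call|don't call|remove me|take me off)\b",
    re.IGNORECASE,
)


def handle_callee_turn(number: str, text: str, internal_dnc: set[str]) -> Optional[str]:
    """On opt-out intent, add the number to the internal DNC and return a confirmation line."""
    if OPT_OUT.search(text):
        internal_dnc.add(number)  # record first, then confirm; never argue or re-ask
        return "Understood. I've added this number to our do-not-call list. Goodbye."
    return None
```

Returning the confirmation line from the handler, rather than letting the model improvise one, keeps the opt-out path deterministic: the number is recorded before anything else is said.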
Call recording rules also vary: some US states require one-party consent, others require two-party (all-party) consent, and EU jurisdictions generally require explicit consent. Configure per-jurisdiction, default to the strictest, and log the consent decision next to the recording metadata.
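One way to express "configure per-jurisdiction, default to the strictest" is an ordered enum and a lookup that treats unknown jurisdictions as the strictest case. The table entries below are illustrative placeholders rather than legal advice, and whether the calling side's own consent satisfies a one-party rule is an assumption to confirm with counsel.

```python
from dataclasses import dataclass
from enum import IntEnum


class Consent(IntEnum):
    ONE_PARTY = 1  # one participant's consent suffices
    ALL_PARTY = 2  # every participant must consent ("two-party")
    EXPLICIT = 3   # affirmative opt-in captured on the call


# Illustrative entries only; populate from counsel-reviewed sources.
RULES: dict[str, Consent] = {
    "US-NY": Consent.ONE_PARTY,
    "US-CA": Consent.ALL_PARTY,
    "DE": Consent.EXPLICIT,
}


def required_consent(jurisdictions: list[str]) -> Consent:
    """Strictest rule across all parties; unknown or empty input gets the strictest."""
    return max(
        (RULES.get(j, Consent.EXPLICIT) for j in jurisdictions),
        default=Consent.EXPLICIT,
    )


@dataclass
class RecordingDecision:
    call_id: str
    required: Consent
    callee_agreed: bool
    record: bool


def decide_recording(call_id: str, jurisdictions: list[str], callee_agreed: bool) -> RecordingDecision:
    """Return a decision object meant to be logged next to the recording metadata."""
    required = required_consent(jurisdictions)
    # Assumption: the calling side's consent satisfies a one-party rule;
    # anything stricter needs the callee's agreement.
    record = required == Consent.ONE_PARTY or callee_agreed
    return RecordingDecision(call_id, required, callee_agreed, record)
```

Passing both the caller's and the callee's jurisdictions into `required_consent` means a cross-border call automatically picks up the stricter of the two rules.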
The regulatory landmines that turn a demo into a fine.
Three landmines reliably produce expensive incidents:
- TCPA statutory damages. The TCPA gives callees a private right of action with per-call statutory damages ($500 per violation, up to $1,500 if the violation is willful or knowing), and state laws add their own claims on top (California's CIPA-related claims, Washington's WADAD-style rules, others). The damages stack per call; a high-volume campaign that misses disclosure can produce six- to seven-figure class-action exposure.
- FCC artificial-voice rules. AI-generated voice on outbound calls falls within the FCC's existing artificial-voice framework; the FCC's February 2024 declaratory ruling confirmed that. Failing to comply is not just a private suit; it is a federal regulatory event.
- State-level augmentations. California, Washington, Florida, and others have their own additions — required disclosures, stricter time windows, in-state plaintiff-friendly courts. A national campaign needs per-state configuration; "the US rules" is not a single regime.
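To make "the damages stack per call" concrete, here is a back-of-the-envelope calculation. It assumes the TCPA's per-violation figures of $500, increasable to $1,500 for willful or knowing violations, and ignores state-level claims entirely. The function name and the call count are hypothetical; treat the result as an order-of-magnitude illustration, not a legal estimate.

```python
# Assumed figures: 47 U.S.C. § 227(b)(3) provides $500 per violation, which a
# court may increase up to $1,500 for willful or knowing violations.
BASE_PER_CALL = 500
WILLFUL_PER_CALL = 1_500


def tcpa_exposure(violating_calls: int) -> tuple[int, int]:
    """Return (base, willful-ceiling) statutory exposure in dollars."""
    return violating_calls * BASE_PER_CALL, violating_calls * WILLFUL_PER_CALL


base, ceiling = tcpa_exposure(2_000)
print(f"${base:,} to ${ceiling:,}")  # prints "$1,000,000 to $3,000,000"
```

Two thousand non-compliant calls is a small campaign, and it already lands in seven figures before any state-level claim is added.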
Before any outbound voice agent reaches a real PSTN number, confirm that:
- identity disclosure is verified by an eval on every call;
- abandonment rate is monitored, with a kill-switch tied to the threshold;
- DNC lists are checked upstream and updated daily;
- in-call opt-out is detected and honored;
- recording consent is configured per jurisdiction;
- audit logs are sufficient to reconstruct any call after the fact.
Of everything in this section, outbound voice is the surface where the gap between a clever capability and a regulatory event is smallest. Build the rails before the campaign, not after the complaint.
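For the audit-log requirement, one possible shape is a flat per-call record written as a single JSON line. The field names below are hypothetical and should be adapted to whatever your dialer and evals actually emit; the point is that each pre-launch control leaves a trace on every call.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class CallAuditRecord:
    call_id: str
    campaign_id: str
    number: str
    dialed_at_utc: str            # ISO 8601 timestamp of the dial
    dnc_list_version: str         # which DNC snapshot the pre-dial check used
    callee_tz: str                # time zone used for the calling-hours check
    disclosure_eval_passed: bool
    recording_consent: str        # e.g. "one_party", "all_party", "explicit", "none"
    opted_out: bool
    outcome: str                  # e.g. "completed", "abandoned", "no_answer"
    transcript_uri: str = ""


def to_log_line(rec: CallAuditRecord) -> str:
    """Serialize to one JSON line with stable key order, suitable for append-only logs."""
    return json.dumps(asdict(rec), sort_keys=True)
```

Recording the DNC list version and the resolved time zone alongside the outcome lets a later reviewer answer "why was this number dialed at this time?" without re-running the pipeline.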