Lifecycle
You send a request
POST /v1/calls with a phone number (to), a prompt telling the AI what to do, optional context with background info, and a returns schema describing the data you want out of the call.
CallingBox dials the number
The call goes out right away. If the line is busy, the call goes to voicemail, or nobody picks up, CallingBox detects that and updates the status. You’re only billed for time on calls that someone actually answers.
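To make the request from the first step concrete, here is a minimal sketch in Python that assembles the POST /v1/calls request without sending it. The field names to, prompt, context, and returns come from this page. The base URL, the phone number format, the use of a plain string for context, and the helper name build_create_call_request are assumptions for illustration, and authentication is left out because it isn't described here.

```python
import json
import urllib.request


def build_create_call_request(base_url, to, prompt, context=None, returns=None):
    """Assemble (but do not send) a create-call request.

    base_url is a placeholder: this page does not state the API host.
    """
    payload = {"to": to, "prompt": prompt}
    # context and returns are optional, so only include them when provided.
    if context is not None:
        payload["context"] = context
    if returns is not None:
        payload["returns"] = returns
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/calls",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical values throughout; the host is deliberately non-resolvable.
req = build_create_call_request(
    "https://api.example.invalid",
    to="+15555550100",
    prompt="Confirm the appointment for tomorrow.",
    context="Customer booked a 3pm slot.",
    returns={"confirmed": "boolean"},
)
print(req.get_method(), req.full_url)
```

Sending the request would be a matter of adding whatever credentials your account requires and passing req to urllib.request.urlopen.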
The AI has the conversation
Once the person picks up, a real-time voice loop kicks in:
- Speech-to-text transcribes what the person says
- The language model generates a response using your prompt and context
- Text-to-speech speaks the reply
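The three steps above can be pictured as a simple turn loop. This is only a conceptual sketch, not CallingBox's implementation: run_voice_loop and the three callables passed into it are hypothetical stand-ins for the speech-to-text, language model, and text-to-speech stages, and a real-time system would stream audio rather than process whole turns.

```python
def run_voice_loop(audio_turns, transcribe, generate_reply, synthesize,
                   prompt, context=""):
    """Run one listen -> think -> speak cycle per incoming audio turn."""
    transcript = []  # (speaker, text) pairs, in order
    spoken = []      # synthesized audio for each AI reply
    for audio in audio_turns:
        # 1. Speech-to-text transcribes what the person says.
        heard = transcribe(audio)
        transcript.append(("person", heard))
        # 2. The language model responds using the prompt and context.
        reply = generate_reply(prompt, context, transcript)
        transcript.append(("ai", reply))
        # 3. Text-to-speech speaks the reply.
        spoken.append(synthesize(reply))
    return transcript, spoken


# Toy stand-ins so the sketch runs without any audio or model dependencies.
def fake_transcribe(audio):
    return audio.decode("utf-8")


def fake_generate_reply(prompt, context, transcript):
    return "Got it: " + transcript[-1][1]


def fake_synthesize(text):
    return text.encode("utf-8")


transcript, spoken = run_voice_loop(
    [b"yes, tomorrow works"],
    fake_transcribe, fake_generate_reply, fake_synthesize,
    prompt="Confirm the appointment.",
)
print(transcript)
```

The accumulated transcript is the piece that matters afterwards, since extraction reads through it once the call ends.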
Data gets extracted
After the call ends, if you set a returns schema, CallingBox reads through the transcript and pulls out the fields you asked for. If you asked for {"confirmed": "boolean"}, you’d get back something like {"confirmed": true}.
When nobody answers
CallingBox detects voicemail, busy signals, and no-answer. The status updates to no_answer, busy, or failed. Voicemail is reported as status: no_answer with answered_by: machine so the dashboard labels it “Voicemail”. By default, CallingBox hangs up as soon as voicemail is detected; pass voicemail_action: "leave_message" on the create-call request to keep the line open. No extraction runs, no charge. More detail in Statuses and failures.
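On the client side, the last two steps could be handled roughly as follows. This is a sketch under assumptions: the exact shape of the call object isn't shown on this page, so the code only relies on the status and answered_by fields named above. The helper names, the labels other than “Voicemail”, and the "string" and "number" type names are hypothetical; only "boolean" appears in the example here.

```python
# Statuses this page lists for calls where nobody answers.
UNANSWERED_STATUSES = {"no_answer", "busy", "failed"}

# "boolean" comes from the example on this page; the other two type names
# are assumptions added for illustration.
TYPE_CHECKS = {
    "boolean": lambda v: isinstance(v, bool),
    "string": lambda v: isinstance(v, str),
    # bool is a subclass of int in Python, so exclude it explicitly.
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
}


def is_unanswered(call):
    """True when no extraction ran and nothing was charged."""
    return call.get("status") in UNANSWERED_STATUSES


def outcome_label(call):
    """Human-readable label; voicemail is no_answer + answered_by: machine."""
    status = call.get("status")
    if status == "no_answer" and call.get("answered_by") == "machine":
        return "Voicemail"
    labels = {"no_answer": "No answer", "busy": "Busy", "failed": "Failed"}
    return labels.get(status, status)


def mismatched_fields(returns, data):
    """List fields that are missing or whose value has the wrong type."""
    bad = []
    for field, type_name in returns.items():
        check = TYPE_CHECKS.get(type_name)
        if field not in data or (check is not None and not check(data[field])):
            bad.append(field)
    return bad


print(outcome_label({"status": "no_answer", "answered_by": "machine"}))
print(mismatched_fields({"confirmed": "boolean"}, {"confirmed": True}))
```

Checking is_unanswered before looking for extracted data avoids treating a voicemail or busy signal as a call that simply returned nothing.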
Calls
All the request fields and how to use them.
Structured results
Define a returns schema and get typed JSON back.
Delivery
Polling vs webhooks for getting results.